Product Stack, May 2010 (updated)
Enterprise Approach
  • Semantic Enterprise based on semantic Web, linked data
  • Leverage existing assets
      • Data, records and instances
      • Taxonomies, structure and schema
  • Layer semantics on to existing systems
  • Develop incrementally
      • Add sophistication, scope over time
      • Keep risks low
  • Integrate with public and Web data (“open world”)
Linked Data
“Linked Data is a set of best practices for publishing and deploying instance and class data using the RDF data model, naming the data objects using uniform resource identifiers (URIs), thereby exposing the data for access via the HTTP protocol, while emphasizing data interconnections, interrelationships and context useful to both humans and machine agents.”
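The definition above can be made concrete with a minimal sketch: instance data and class data expressed as RDF triples whose subjects and predicates are HTTP URIs, serialized as N-Triples. The `example.org` URIs and the `Company` / `supplierOf` terms are assumptions for illustration only; `rdf:type` and `rdfs:label` are the standard W3C terms.

```python
# Minimal sketch of linked data as RDF triples named with HTTP URIs.
# All example.org URIs are hypothetical.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def iri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value):
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) terms, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


acme = "http://example.org/id/company/acme"
triples = [
    # Class data: what kind of thing the instance is.
    (iri(acme), iri(RDF_TYPE), iri("http://example.org/ontology/Company")),
    # Instance data: a human-readable label.
    (iri(acme), iri(RDFS_LABEL), literal("Acme Corp.")),
    # Interconnection: a link to another named data object.
    (iri(acme), iri("http://example.org/ontology/supplierOf"),
     iri("http://example.org/id/company/globex")),
]

print(to_ntriples(triples))
```

Because every name is a URI, a human or a machine agent can in principle follow it over HTTP to find more data about the same object.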
Layers and Current Products
Current Products
  • structWSF: the pivotal product; Web services middleware that provides distributed data access and federation
  • conStruct: Drupal-based structured data linkage to structWSF
  • irON: spreadsheet, JSON and XML authoring and conversion framework
  • UMBEL: reference set of linking subjects and basis for domain vocabularies
  • scones: an ontology- and entity-driven information extraction and tagging system
Fit of Current Products within Layers
Existing Assets Layer
Existing Assets
These are the materials that need to be federated, made interoperable, and given a common semantics:
  • structured data / databases
  • semi-structured data (XML, Web pages)
  • unstructured data (text)
Preserving Existing Assets
  • Relational databases (RDBMSs)
  • Distributed structured assets
      • spreadsheets
      • lightweight datastores
  • Web pages and Web sites
  • Existing documents and text
  • Web databases and APIs
  • Other databases (RDF, OO, etc.)
Access/Conversion Layer
Conversion
  • Provides in-place access to existing information
  • Translates existing formats and structures to RDF
  • Extracts structured information from unstructured text
  • Aids creation of interoperable datasets
  • Geared almost entirely to records, instances or entities (that is, basic data)
Conversion Methods
  • Relational DBs: RDB2RDF
  • RDFizers
  • Information Extraction
  • New Dataset Authoring
  • Direct Use (already in RDF)
Relational DB Conversion
  • Simple mappings of instance records to RDF
  • Methodologies are well proven if kept to the instance level
  • RDB schemas inform the interoperable layer (“ontologies”)
  • Relational datastores are left in place
  • Record data is obtained via the access layer (structWSF)
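A simple instance-level mapping of the kind described above might look like the following sketch: each row becomes one subject URI, and each non-key column becomes one predicate. This is an illustration only, not the RDB2RDF tooling the slides refer to; the `example.org` namespace, the URI pattern, and the `customer` table are assumptions, and an in-memory SQLite database stands in for a relational store that would normally be left in place.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace for minted URIs


def rows_to_triples(conn, table, key):
    """Map each row to triples: one subject per row, one predicate per column.

    The table and key names are interpolated into the SQL, so they must be
    trusted identifiers (here they are constants in the example below).
    """
    cur = conn.execute(f"SELECT * FROM {table} ORDER BY {key}")
    columns = [description[0] for description in cur.description]
    triples = []
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key]}"
        for column, value in record.items():
            if column == key or value is None:
                continue  # the key is in the subject URI; NULLs yield no triple
            triples.append((subject, f"{BASE}{table}#{column}", str(value)))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Ada', 'London')")
conn.execute("INSERT INTO customer VALUES (2, 'Grace', NULL)")

for triple in rows_to_triples(conn, "customer", "id"):
    print(triple)
```

Keeping the mapping at the record level means the source schema is only read, never altered, which is consistent with leaving the relational datastore in place.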
RDFizers
  • General serialization or data format conversions to RDF
  • Mostly applied to:
      • Standard data formats and data structs
      • Web content APIs
      • Some legacy content
  • Sometimes some minor ontology or schema mapping
  • Embodies all conversion steps to linked data
  • We have access to more than 100 existing formats
RDFizers – Listing 1
  • URN handlers (in addition to IRI and URI): DOI, LSID, OAI
  • RDF serialization formats: irON, N3, RDF/XML, Turtle
  • Languages and ontologies: AB Meta, Annotea, APML, AtomOWL, Bibliographic Ontology, Creative Commons, EXIF, FOAF, GeoNames, GoodRelations, Java, Javadoc, MARC/MODS, Meta Standards, Music Ontology, Natural Language, Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), Open Geospatial, OWL, SIOC, SIOCT, SKOS, UMBEL, vCard, XML, others
  • (X)HTML pages, embedded Microformats and GRDDL * (see note below): DC, eRDF, geoURL, Google Base, hAudio, hCalendar, hCard, hListing, hResume, hReview, HR-XML, Ning, RDFa, relLicense, SVG, XBRL, XFN, xFolk, XR-XML, XSLT
  • Syndication formats: Atom, OPML, OCS, RSS 1.1, RSS 2.0, XBEL (for bookmarks)
  • REST-style Web service APIs: Alchemy, Amazon, Apple, Best Buy, Calais, CNet, CrunchBase, Del.icio.us, Digg, Discogs, Disqus, eBay, Facebook, Flickr, Freebase (MQL), FriendFeed, Garmin, Get Satisfaction, Google, Google Apps, Hoover's, HTTP (raw), ISBN DB, Last.fm, Library Thing, Magnolia, Meetup, MusicBrainz, New York Times, New York Times Campaign Finance (NYTCF), New York Times tags
RDFizers – Listing 2
  • REST-style Web service APIs (continued): Open Library, Open Social, Open Street, OpenLink (facets), O'Reilly, Picasa, Radio Pop (BBC), Rhapsody, Salesforce, Slideshare, Slidy, Technorati, Tesco, They Work For You, Twine, Twitter, Weather, Wikipedia, World Bank, Yahoo! BOSS, Yahoo! Finance, Yahoo! Maps, Yahoo! Weather, Yelp, YouTube, Zemanta, Zillow
  • Files (multitude of file formats and MIME types, including): audio (general), BibJSON, BibTEX and others, BitTorrent, commON, CSV, Fink, Flat files, irJSON, irXML, JPEG, JSON, images, MS Office, OpenOffice, Open Document Format, Palm, RDF123, video, XLS, etc.
  • Metadata extractors: CRW, DEB, EXIF, OCW, RPM, XMP
  • Email formats: EMail, Outlook, RFC822
  • Version control and related systems: Bugzilla, Jira, POM, Subversion
  • Other Web service frameworks: BPEL, WSDL, XBRL, XBEL
  • Data exchange formats: iCalendar, LDIF, vCalendar, vCard
  • Relational databases and related: D2RQ, D2RMAP, RDF Views, Virtuoso VADs, OpenLink license files
  • Third party metadata extraction frameworks: Aperture, Spotlight
  • Miscellaneous and other related converters: MPEG-7/CS -> OWL, Random XSD -> OWL
* GRDDL (Gleaning Resource Descriptions from Dialects of Languages) accommodates a wide variety of dialects and can be combined with arbitrary transformation mechanisms (though currently mostly based on XSLTs).
scones
Information Extraction
  • scones (Subject Concept Or Named EntitieS) is our IE tagger
  • Information extraction is applied to input Web pages and unstructured text
  • May be applied after structure extraction (often, at minimum, defluffing)
  • Settable “window” for snippet (from a number of bracketing terms to the full document)
  • Extraction is performed for both:
      • Entities (per Wikipedia and enterprise dictionaries)
      • Subject concepts (per UMBEL and domain ontologies)
  • Presently in prototype
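The general idea of dictionary-driven tagging with a settable snippet window can be sketched as below. This is not the scones implementation (which the slides describe only as a prototype); the `tag` function, the token-matching rule, and the `example.org` URIs are assumptions chosen to illustrate the two dictionaries (entities and subject concepts) and the bracketing-terms window.

```python
import re


def tag(text, dictionary, window=3):
    """Return (term, uri, snippet) for each dictionary term found in text.

    `window` is the number of bracketing tokens kept on each side of a match.
    Matching is a case-insensitive comparison of whole tokens.
    """
    tokens = re.findall(r"\w+", text)
    lowered = [token.lower() for token in tokens]
    hits = []
    for term, uri in dictionary.items():
        words = term.lower().split()
        n = len(words)
        for i in range(len(tokens) - n + 1):
            if lowered[i:i + n] == words:
                start = max(0, i - window)
                snippet = " ".join(tokens[start:i + n + window])
                hits.append((term, uri, snippet))
    return hits


# Hypothetical dictionaries: one for named entities, one for subject concepts.
entities = {"Iowa City": "http://example.org/entity/Iowa_City"}
concepts = {"database": "http://example.org/concept/Database"}

text = "The team in Iowa City federates each relational database with RDF."
hits = tag(text, {**entities, **concepts}, window=2)
for hit in hits:
    print(hit)
```

Widening `window` moves the snippet from a few bracketing terms toward the full document, which is the trade-off the slide describes.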
(Named) Entities
  • The places, events, people, objects, and specific things of the real world
  • Literally millions of notable instances
  • Each belongs to one or more subject concept(s)
  • Currently, the predominant basis for linked data
  • Public sources include Wikipedia, Freebase and others
  • Can be readily mixed-and-matched with private entities
Creating New Entity Dictionaries
Triangulating Information Extraction
irON – instance record and Object Notation
irON Dataset Authoring Framework
  • Simple authoring and dataset creation
  • irON includes an abstract notation and vocabulary for instance records
  • Serializations available for:
      • XML (irXML)
      • JSON (irJSON)
      • CSV/spreadsheets (commON)
  • Notations for:
      • Instance records
      • Schema
      • Datasets and metadata
      • Linkages to other schema
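The point that one abstract record notation can sit behind several serializations can be illustrated with a small sketch. The layouts below are hypothetical and are not actual irJSON or commON syntax; the attribute names (`id`, `type`, `prefLabel`) are likewise assumptions. The sketch only shows that a spreadsheet-style source and a JSON source can yield the same instance records.

```python
import csv
import io
import json


def records_from_csv(text):
    """Read spreadsheet-style rows: the header row names the attributes."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def records_from_json(text):
    """Read a JSON document holding a list of instance records."""
    return json.loads(text)["records"]


# Hypothetical layouts for the same single instance record.
csv_text = "id,type,prefLabel\nacme,Company,Acme Corp.\n"
json_text = '{"records": [{"id": "acme", "type": "Company", "prefLabel": "Acme Corp."}]}'

print(records_from_csv(csv_text))
print(records_from_json(json_text))
```

Authors can then work in whichever form suits them (a spreadsheet for bulk entry, JSON for programmatic use) while downstream tools see one record structure.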
Three irON Serializations
  • irXML
  • irJSON
  • commON
More-or-less Interchangeable Formats
structWSF
structWSF
  • Generally RESTful Web services middleware
  • Uniform, distributed access point
  • Provides the interoperability architecture
  • Based on canonical RDF data model
  • Dataset access orientation
  • Standard tools and services:
      • User permissions and access
      • CRUD (create, read, update, delete)
      • Browse
      • Full-text, faceted search
      • Import / export
      • Many others
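A dataset-oriented access point that checks permissions before each CRUD operation can be sketched in a few lines. This is an in-memory illustration of the pattern only; it does not reproduce structWSF's actual endpoints or parameters, and the `DatasetStore` class, its method names, and the user and dataset names are all hypothetical.

```python
class DatasetStore:
    """In-memory stand-in for dataset-oriented CRUD behind one access point."""

    def __init__(self):
        self._datasets = {}     # dataset name -> {record id -> record dict}
        self._permissions = {}  # (user, dataset) -> set of allowed operations

    def grant(self, user, dataset, operations):
        """Allow a user to perform the given operations on one dataset."""
        self._permissions.setdefault((user, dataset), set()).update(operations)

    def _check(self, user, dataset, operation):
        if operation not in self._permissions.get((user, dataset), set()):
            raise PermissionError(f"{user} may not {operation} in {dataset}")

    def create(self, user, dataset, record_id, record):
        self._check(user, dataset, "create")
        self._datasets.setdefault(dataset, {})[record_id] = dict(record)

    def read(self, user, dataset, record_id):
        self._check(user, dataset, "read")
        return dict(self._datasets[dataset][record_id])

    def update(self, user, dataset, record_id, changes):
        self._check(user, dataset, "update")
        self._datasets[dataset][record_id].update(changes)

    def delete(self, user, dataset, record_id):
        self._check(user, dataset, "delete")
        del self._datasets[dataset][record_id]


store = DatasetStore()
store.grant("editor", "customers", {"create", "read", "update"})
store.grant("guest", "customers", {"read"})
store.create("editor", "customers", "acme", {"label": "Acme"})
store.update("editor", "customers", "acme", {"city": "Iowa City"})
print(store.read("guest", "customers", "acme"))
```

Scoping permissions to the (user, dataset) pair is one way to realize the "dataset access orientation" the slide mentions: every tool goes through the same check, whichever underlying store holds the records.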
RDF and Data Federation Model
Advantages of a Canonical Model
  • All tools can be driven from a single data format basis
  • Single converters can link in other hubs of data forms
  • “Round-tripping” through the canonical form can bring consistency and cleanliness to input data
  • RDF is well-suited as the canonical form:
      • Structured data
      • Semi-structured data
      • Unstructured data (after IE)
      • Simple-to-complex data structures
      • Logic and inferencing
      • Suitable to all input data formats
      • Many serializations possible
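The hub arrangement can be sketched as one reader and one writer per format, with every conversion passing through a canonical list of triples. With N formats this needs 2N functions rather than the N × (N − 1) needed for direct pairwise converters. The two toy formats below (a pipe-delimited line format and a nested JSON object) and all function names are assumptions for illustration; the JSON writer keeps only one value per predicate, which is a simplification.

```python
import json


def from_lines(text):
    """Parse 'subject|predicate|object' lines into sorted, trimmed triples."""
    triples = []
    for line in text.splitlines():
        if line.strip():
            s, p, o = (part.strip() for part in line.split("|"))
            triples.append((s, p, o))
    return sorted(triples)


def to_lines(triples):
    return "\n".join("|".join(triple) for triple in triples)


def from_json(text):
    """Parse {subject: {predicate: object}} into sorted triples."""
    data = json.loads(text)
    return sorted((s, p, str(o)) for s, props in data.items() for p, o in props.items())


def to_json(triples):
    data = {}
    for s, p, o in triples:
        data.setdefault(s, {})[p] = o  # simplification: one value per predicate
    return json.dumps(data, sort_keys=True)


READERS = {"lines": from_lines, "json": from_json}
WRITERS = {"lines": to_lines, "json": to_json}


def convert(text, source, target):
    """Convert between any two registered formats via the canonical triples."""
    return WRITERS[target](READERS[source](text))


messy = " b | name | Bob \n\na|name|Ann"
print(convert(messy, "lines", "lines"))  # round trip: trimmed and sorted
print(convert(messy, "lines", "json"))
```

The round trip through the canonical form is where the cleanup happens: stray whitespace, blank lines and ordering differences in the input disappear without any format-specific cleaning code.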
A Collaborative, Distributed Network
Flexible User Access Permissions
Access, APIs and Endpoints
  • The resulting linked data may be exposed as:
      • APIs
      • Web services
      • SPARQL endpoints
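For the SPARQL endpoint option, a client request is typically an HTTP GET with the query text URL-encoded in a `query` parameter, as defined by the W3C SPARQL Protocol. The sketch below only builds such a URL and makes no network call; the endpoint address is a hypothetical placeholder, not a real service.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint address


def sparql_get_url(endpoint, query):
    """Build a SPARQL Protocol GET URL carrying the query in `query=`."""
    return f"{endpoint}?{urlencode({'query': query})}"


query = (
    "SELECT ?s ?label WHERE { "
    "?s <http://www.w3.org/2000/01/rdf-schema#label> ?label } LIMIT 10"
)
url = sparql_get_url(ENDPOINT, query)
print(url)

# Decoding the URL again recovers the original query text.
print(parse_qs(urlsplit(url).query)["query"][0])
```

Any HTTP client can then fetch the URL; which result formats are returned depends on the endpoint and on the `Accept` header sent.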
Ontologies Layer
Ontologies
  • Ontologies provide the basis for:
      • Interoperating
      • Reconciling semantics
  • Multiple ontologies may be used at any time
  • Both enterprise (internal) and external ontologies
  • Best built incrementally, with participation
  • Easily modified: OK to test and experiment
Ontologies
  • The structural relationships of concepts within a domain
  • Generally class- (or set-) oriented
  • Analogous to relational database schema, only with controlled vocabularies and exact semantics
  • Sets the structure of how to organize the actual data (“instances”) in the domain
  • Semantics and mapping techniques allow disparate ontologies to be inter-related
  • Can infer or reason over the structure
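Reasoning over class structure can be illustrated with the simplest case: following subclass links transitively so that an instance typed with a narrow class is also found under its broader classes. This is a sketch of that one inference, not a full reasoner; the class names, instance names and dictionary layout are assumptions.

```python
def superclasses(cls, subclass_of):
    """Return all direct and inferred superclasses of cls (transitive closure)."""
    seen = set()
    stack = [cls]
    while stack:
        for parent in subclass_of.get(stack.pop(), ()):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


def instances_of(cls, types, subclass_of):
    """Return instances typed as cls directly or via any subclass of cls."""
    return sorted(
        instance
        for instance, asserted in types.items()
        if asserted == cls or cls in superclasses(asserted, subclass_of)
    )


# Hypothetical class structure (the ontology) and instance typing (the data).
subclass_of = {"Manager": ["Employee"], "Employee": ["Person"]}
types = {"ann": "Manager", "bob": "Employee", "acme": "Company"}

print(superclasses("Manager", subclass_of))
print(instances_of("Person", types, subclass_of))
```

Nothing in the instance data says that `ann` is a `Person`; that fact comes entirely from the structure, which is what separates an ontology from a plain schema lookup.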
Migrating Structure to the Ontology Layer
Ontologies Layer
irON
irON Record Vocabulary
  • irON also provides the standard instance record vocabulary for all federated records
  • Each record source has its own attributes
  • But irON provides common descriptors:
      • Useful for interoperating
      • Unique, Web-accessible identifiers
      • Standard descriptions and labels
      • Conventions for “driving” user interfaces and tools
UMBEL
  • UMBEL (Upper Mapping and Binding Exchange Layer)
  • 20,000 defined reference points in information space
  • Means to assert what a given chunk of content is about
  • Enables similar content to be aggregated
  • Places content in context with other content
  • Aggregation points for tying in instances and entities
  • Derived from, and a subset of, the Cyc knowledge base
  • Vocabulary basis for domain-specific subject ontologies
Notable Ontologies and Vocabularies
Management Layer
Management/Federation Layer
  • Management/Federation Layer handles:
      • Ontology mapping, management
      • Queries and retrievals
      • All Web services
      • Imports and exports
      • Inferencing and logic
      • Ontology creation and expansion
  • Works off of many RDF datastores
  • Has efficient, full-text indexing with faceting
  • Interface to the system is structWSF
  • Can plug into many options at the Applications Layer (so far, only Drupal with conStruct SCS is deployed)
Web-oriented Architecture
Applications Layer
conStruct SCS
conStruct Browse Screen
conStruct Capabilities
  • Based on Drupal
  • Single-click (cloud) deployment
  • Theming
  • User and group access and management
  • Data display templates
  • General content management system (CMS)
  • Publishing RDF
  • Open source
Re-cap
Summary
  • Incremental, low-risk approach to the semantic enterprise
  • Maximum leverage and re-use of existing information assets
  • Conversion and federation of all available data forms
  • Excellent uses for:
      • Business intelligence
      • Knowledge management
      • Master data management modernization
      • Taxonomy modernization
      • Enterprise content integration
  • All baseline products are open source
Contacts & Information
  • Michael K. Bergman, CEO; 319.621.5225; blog: www.mkbergman.com
  • Steve Ardire, Senior Advisor
  • Frédérick Giasson, CTO; blog: fgiasson.com/blog
Web Sites
  • structureddynamics.com
  • umbel.org
  • umbel.structureddynamics.com (UMBEL Web services)
  • citizen-dan.org (community indicator systems)
  • openstructs.org (open source distros + documentation)
  • constructscs.com (Drupal structured data system)
 

Structured Dynamics' Semantic Technologies Product Stack


Editor's Notes

  • #17, #18: At present, though the number is constantly increasing, Zitgist's existing conversion services recognize nearly 100 formats. GRDDL (Gleaning Resource Descriptions from Dialects of Languages) is a W3C markup format for getting RDF data out of XML and XHTML documents using explicitly associated transformation algorithms, typically represented in XSLT. GRDDL accommodates a wide variety of dialects and can be combined with arbitrary transformation mechanisms (though currently mostly based on XSLTs).
  • #23: More here also, use the candidate properties content to get the extract to the SC context (??? more about the “aboutness”) contextual UMBEL metadata on the fly