October 22, 2014 Fogbeam Labs
Semantic Integration with Apache Jena and Apache Stanbol
All Things Open
Raleigh, NC
Oct. 22, 2014

Overview
● Theory (~10 mins)
● Application Examples (~10 mins)
● Technical Details (~25 mins)

What do we mean by “Semantic Integration”?
● Integration, generally
  ● Letting things “talk to each other” so they can act as a cohesive whole
● Uses the Semantic Web technology stack
  ● Data integration using RDF, well-known vocabularies, as well as in-house vocabularies and ontologies
● Relationship to EAI, MDM, etc.?

Uses Semantic Web technology to do what, exactly?
● Work with knowledge, not labels
● Express metadata about “things”
  ● And the relationships between those “things” and their characteristics
● Reason about those “things” in order to:
  ● Find contextually relevant information
  ● Search with greater precision
  ● Generate new knowledge
  ● ???

Knowledge?
● What's the difference between “Data”, “Information”, “Knowledge”, etc.?
● Different ways of talking about this.
● DIKW Pyramid is a popular model
  ● http://en.wikipedia.org/wiki/DIKW_Pyramid

Knowledge?
● For our purposes today...
  ● Unambiguous Identifiers
  ● Ontology
  ● Type / Class information
  ● Relationships

Working With Knowledge instead of Labels
● Backing up – what do we mean by “Semantic” anyway?
● Is “Java”:
  ● An island in Indonesia
  ● A slang word for coffee
  ● A programming language invented by Sun Microsystems
● Using URIs as labels
● In order to talk about “the semantics of Java” we have to know unambiguously which “java” we are talking about.
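
The “Java” ambiguity can be sketched in a few lines of plain Python. This is only an illustration: the `example.org` URIs and the `candidates` helper are hypothetical, and a real system would reuse URIs from a published dataset rather than invent them.

```python
# Three different "things" share the label "Java"; each gets its own URI.
# All example.org URIs below are made up for illustration.
LABELS = {
    "http://example.org/place/Java": "Java",
    "http://example.org/beverage/Coffee": "Java",
    "http://example.org/language/Java": "Java",
}

def candidates(label):
    """Return every URI whose label matches, sorted for stable output."""
    wanted = label.lower()
    return sorted(uri for uri, text in LABELS.items() if text.lower() == wanted)

# The bare label is ambiguous: it matches several resources.
matches = candidates("java")
print(len(matches), "resources are labelled 'java'")

# A URI, by contrast, names exactly one resource.
for uri in matches:
    print(uri)
```

The label alone cannot tell the three apart, while each URI can be used as the subject of statements without any ambiguity.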

Ontology
● The attributes / properties of a Thing
● Set membership of a Thing
  ● rdfs:Class
● Relationships between Things
  ● dc:relation
  ● dc:subject
  ● rdfs:subClassOf
  ● skos:narrower, skos:broader

Data Table Slide
id    color  size    manufacturer
2345  Blue   Large   Acme
2378  Red    Small   Cullet
3421  Green  Medium  Acme

Data as Triples
subject   predicate          object
uid:2345  rdf:type           owl:Thing
uid:2378  rdf:type           owl:Thing
uid:3421  rdf:type           owl:Thing
uid:2345  pref:color         “Blue”
uid:2378  pref:color         “Red”
uid:3421  pref:color         “Green”
uid:2345  pref:size          “Large”
uid:2378  pref:size          “Small”
uid:3421  pref:size          “Medium”
uid:2345  pref:manufacturer  uid:9998
uid:2378  pref:manufacturer  uid:9997
uid:3421  pref:manufacturer  uid:9998
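
The mapping from table rows to triples can be sketched as follows. This is a plain-Python illustration, not Jena code; the `row_to_triples` helper and the `MANUFACTURER_IDS` lookup are assumptions introduced here, with the identifiers taken from the two slides above.

```python
# Rows from the data table slide.
ROWS = [
    {"id": "2345", "color": "Blue", "size": "Large", "manufacturer": "Acme"},
    {"id": "2378", "color": "Red", "size": "Small", "manufacturer": "Cullet"},
    {"id": "3421", "color": "Green", "size": "Medium", "manufacturer": "Acme"},
]

# Manufacturer identifiers as they appear on the triples slide.
MANUFACTURER_IDS = {"Acme": "uid:9998", "Cullet": "uid:9997"}

def row_to_triples(row):
    """Turn one table row into (subject, predicate, object) tuples."""
    subject = "uid:" + row["id"]
    return [
        (subject, "rdf:type", "owl:Thing"),
        (subject, "pref:color", '"%s"' % row["color"]),
        (subject, "pref:size", '"%s"' % row["size"]),
        # The manufacturer becomes a link to another resource, not a string.
        (subject, "pref:manufacturer", MANUFACTURER_IDS[row["manufacturer"]]),
    ]

triples = [t for row in ROWS for t in row_to_triples(row)]
for s, p, o in triples:
    print(s, p, o)
```

Each cell becomes one statement about the row's subject, and the manufacturer column turns into a reference to a second resource that can carry its own properties.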

Types & Relationships
● RDF/S
  ● superclass / subclass relationships for Classes
  ● superclass / subclass relationships for Properties
  ● domain / range relationship between Properties and Classes
● OWL
  ● class equivalence
  ● entity equivalence
  ● class disjointness
● SKOS
  ● narrower / broader relationship between Concepts
  ● ordered collections

But
● But... we're not here for a course on Epistemology or Metaphysics...

Synonyms
● Smart Data
● Semantic Data
● Knowledge

Semantic Integration Layer
[Diagram; labels follow]
● Enterprise Applications (ERP, SFA, CRM, etc.)
● Document Repositories (DMS, Wikis, Blogs, Forums, etc.)
● “Big Data” (Data Warehouses, Data Lakes, etc.)
● Internet of Things, M2M, Sensor Data, etc.
● “Open Data” (SEC filings, EPA data, building permits, etc.)
● Stanbol
● Jena
● Users

But wait, there's more...
● From relational database to Semantic Web -> R2RML
● D2RQ
  ● http://d2rq.org
● ANY23 – Anything to Triples
  ● http://any23.apache.org
● OpenRefine, Tika, JSoup, Boilerpipe, ...
● Potentially, anything that might be part of a normal ETL workflow

So, what is the Semantic Web?
An evolving extension of the World Wide Web in which the semantics of information and services on the web is defined, making it possible for the web to understand and satisfy the requests of people and machines to use the web content.
Sir Tim Berners-Lee's vision of the Web as a universal medium for data, information, and knowledge exchange.
...prospective future possibilities that are yet to be implemented or realized.
A set of design principles, collaborative working groups, and a variety of enabling technologies.

What is the Semantic Web? (continued)
“... supposed to make data located anywhere on the Web accessible and understandable, both to people and to machines.”
(Explorer's Guide to the Semantic Web, p. 3)
“... more a vision than a technology.”
(Explorer's Guide to the Semantic Web, p. 3)
“... a fluid, evolving, informally defined concept rather than an integrated, working system.”
(Explorer's Guide to the Semantic Web, p. 3)

The “Semantic Web Layer Cake”

RDF – Resource Description Framework
● Resources unambiguously named using URIs
● Everything is a triple... ex: “the shoe is red” would be the triple with subject = “shoe”, predicate (or property) = “color”, and object (or value) = “red”
● Serialization formats include XML (known as RDF/XML) and developer-friendly serialization formats including N3, Turtle, and JSON-LD
● Models statements as “triples”: Subject, Predicate, Object
[Diagram labels: Subject, Property, Value]
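
As a rough illustration of the “the shoe is red” example, the sketch below writes that one statement as a line of N-Triples, one of the simplest RDF serializations. The `example.org` URIs and the `to_ntriples` helper are hypothetical, and the helper escapes only backslashes and double quotes, so it is not a complete serializer.

```python
def to_ntriples(subject, predicate, value):
    """Serialize one triple with a plain string literal as an N-Triples line.

    Only backslash and double-quote escaping is handled in this sketch.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '<%s> <%s> "%s" .' % (subject, predicate, escaped)

line = to_ntriples(
    "http://example.org/item/shoe",    # hypothetical subject URI
    "http://example.org/vocab/color",  # hypothetical predicate URI
    "red",                             # literal value
)
print(line)
```

The subject and predicate are URIs, so they are unambiguous; only the value “red” is a plain literal.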

Reasoning over data
● OWL / SKOS / etc.
● Ability to access “Inferred” triples
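
To show what an “inferred” triple is, here is a small plain-Python sketch that applies two standard RDFS entailment rules (subclass transitivity and type propagation) until nothing new appears. It is not how Jena's reasoners are implemented, and the `ex:` class names are hypothetical.

```python
# Asserted facts; the ex: classes are invented for this illustration.
ASSERTED = {
    ("ex:Sneaker", "rdfs:subClassOf", "ex:Shoe"),
    ("ex:Shoe", "rdfs:subClassOf", "ex:Product"),
    ("uid:2345", "rdf:type", "ex:Sneaker"),
}

def infer(triples):
    """Apply two RDFS rules repeatedly until no new triples appear."""
    closed = set(triples)
    while True:
        new = set()
        sub = [(s, o) for s, p, o in closed if p == "rdfs:subClassOf"]
        for a, b in sub:
            # rdfs11: rdfs:subClassOf is transitive.
            for c, d in sub:
                if b == c:
                    new.add((a, "rdfs:subClassOf", d))
            # rdfs9: an instance of a subclass is an instance of the superclass.
            for s, p, o in closed:
                if p == "rdf:type" and o == a:
                    new.add((s, "rdf:type", b))
        if new <= closed:
            return closed
        closed |= new

# Triples that were never stated but follow from the asserted ones.
inferred = infer(ASSERTED) - ASSERTED
for triple in sorted(inferred):
    print(triple)
```

A query for all products would now find `uid:2345`, even though no one ever stated that it is a product.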

Common Vocabularies
● FOAF
● SKOS
● DOAP
● Dublin Core
● Etc.

Querying with SPARQL
● Basic queries
● Using inferred triples
● Federated Queries
● DBpedia example
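
A basic SPARQL query is essentially a set of triple patterns joined on shared variables. The sketch below imitates that idea in plain Python over the triples from the earlier slide; it is a toy matcher with hypothetical helper names (`match`, `query`), not a SPARQL engine such as ARQ.

```python
TRIPLES = [
    ("uid:2345", "pref:color", '"Blue"'),
    ("uid:2378", "pref:color", '"Red"'),
    ("uid:3421", "pref:color", '"Green"'),
    ("uid:2345", "pref:manufacturer", "uid:9998"),
    ("uid:2378", "pref:manufacturer", "uid:9997"),
    ("uid:3421", "pref:manufacturer", "uid:9998"),
]

def match(pattern, triple, binding):
    """Extend binding so that pattern matches triple, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result

def query(patterns, triples):
    """Evaluate a list of triple patterns as a join over shared variables."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            for extended in [match(pattern, triple, binding)]
            if extended is not None
        ]
    return bindings

# Roughly: SELECT ?item ?color WHERE {
#   ?item pref:manufacturer uid:9998 . ?item pref:color ?color }
rows = query(
    [("?item", "pref:manufacturer", "uid:9998"), ("?item", "pref:color", "?color")],
    TRIPLES,
)
for row in rows:
    print(row["?item"], row["?color"])
```

The same pattern list run over a store that also contains inferred triples would return the inferred matches too, which is what “using inferred triples” amounts to from the query side.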

Semantic Integration in the Enterprise
● Knowledge Management
● Collaboration
● BPM
● Business Intelligence
● Predictive Analytics

Apache Jena
● RDF API
● Triplestore (TDB)
● SPARQL Execution Engine (ARQ)
● OWL Reasoner
● SPARQL endpoint (Fuseki)
● Inference API
  ● Use built-in reasoners
  ● Or define your own inference rules
● http://jena.apache.org

Apache Stanbol
● A “RESTful Semantic Processing Engine”
● Use cases
  ● Content Enhancement
  ● see:
    – http://stanbol.apache.org/docs/trunk/scenarios.html
● ContentHub, EntityHub, etc.
● Quoddy scenario demo
● http://stanbol.apache.org
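
Because Stanbol is driven over REST, content enhancement comes down to posting text and reading back RDF. The sketch below only builds such a request and does not send it. It assumes a Stanbol instance running locally on port 8080 with the enhancer exposed at `/enhancer`; the base URL, the sample sentence, and the `build_enhancer_request` helper are assumptions for illustration.

```python
import urllib.request

def build_enhancer_request(text, base_url="http://localhost:8080"):
    """Build (but do not send) a POST to an assumed Stanbol enhancer endpoint."""
    return urllib.request.Request(
        base_url + "/enhancer",
        data=text.encode("utf-8"),
        # Plain text goes in; Turtle-serialized enhancements are requested back.
        headers={"Content-Type": "text/plain", "Accept": "text/turtle"},
        method="POST",
    )

req = build_enhancer_request("Paris is the capital of France.")
print(req.get_method(), req.full_url)

# To actually send it against a running instance (not executed here):
#     with urllib.request.urlopen(req) as resp:
#         print(resp.read().decode("utf-8"))
```

The response, if the instance is configured as assumed, would be RDF describing the entities found in the text, which can then be loaded into a triplestore such as Jena TDB.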

Not AI, but...
● Newer reasoners can utilize new techniques, including Bayesian inference, any sort of machine learning models, cognitive models, new NLP techniques, etc.
● Same for Stanbol extraction – you can write your own extractors, and new extractors will be coming down the pipe.
