How Linked Data Can Speed Information Discovery

Alex Meadows, CSpring
Bubba Puryear, Syngenta

Agenda
• Linked Data Overview
• Case Study: Linked Data At Syngenta
• Q&A

BI Team:
"We don’t know your data; it’s going to take us some time."
-or-
"We have so many other projects, we’re not sure when we can get to this request."

Business:
"We’re not sure what we want, but can’t we have it all?"
-or-
"Here are our requirements. When can we have this completed?"

New source: weeks to months
Existing source: days to weeks

What is Linked Data?
• Term coined in 2006 by Tim Berners-Lee
• Provides a vocabulary for every data set
• Vocabularies can be combined
• Highly structured, in triple format

Triples
[Diagram: three example nodes with their types: Pale Ale (Beer), Mark (Person), Mt. Carmel Brewing Co. (Brewer)]
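
As a minimal sketch of the idea, a triple is a (subject, predicate, object) statement. The Python below models the three nodes from this slide as plain tuples. The `is_a` predicate and the specific links between Mark, the brewery, and the beer are assumptions made for illustration; the slide itself only names the nodes and their types.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


triples = [
    # Type statements taken from the slide's node labels.
    # The predicate name "is_a" is a hypothetical choice for this sketch.
    Triple("Pale Ale", "is_a", "Beer"),
    Triple("Mark", "is_a", "Person"),
    Triple("Mt. Carmel Brewing Co.", "is_a", "Brewer"),
    # Assumed relationships between the nodes (not stated on the slide).
    Triple("Mark", "owner_of", "Mt. Carmel Brewing Co."),
    Triple("Mark", "brews", "Pale Ale"),
]


def objects_of(subject: str, predicate: str) -> List[str]:
    """Return every object linked to `subject` by `predicate`."""
    return [t.obj for t in triples if t.subject == subject and t.predicate == predicate]


print(objects_of("Mark", "is_a"))
print(objects_of("Mark", "brews"))
```

Because every fact has the same three-part shape, data from different sources can be appended to the same list without a schema change.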

Option 1: Virtualization
New source: hours to a week
Existing source: hours to days

Ontop
• Mapping layer between SQL and SPARQL
• Integrates with many tools (Protégé, Sesame, etc.)
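
A rough sketch of what a SQL-to-triples mapping layer does conceptually: the rows stay in the relational database and are exposed as triples only when requested. This is not Ontop's API or mapping syntax. The `brewery` table, its columns, and the `beer:ownedBy` predicate are hypothetical, and a real mapping layer would rewrite a SPARQL query into SQL rather than scan every row.

```python
import sqlite3
from typing import Iterator, Tuple

# Hypothetical mapping from column name to predicate (not Ontop's mapping syntax).
COLUMN_TO_PREDICATE = {"name": "beer:hasName", "owner": "beer:ownedBy"}


def rows_as_triples(conn: sqlite3.Connection) -> Iterator[Tuple[str, str, str]]:
    """Yield (subject, predicate, object) triples from the brewery table on demand."""
    cursor = conn.execute("SELECT id, name, owner FROM brewery ORDER BY id")
    for brewery_id, name, owner in cursor:
        subject = f"brewery/{brewery_id}"
        yield (subject, COLUMN_TO_PREDICATE["name"], name)
        yield (subject, COLUMN_TO_PREDICATE["owner"], owner)


# In-memory database standing in for an existing relational source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE brewery (id INTEGER PRIMARY KEY, name TEXT, owner TEXT)")
conn.execute(
    "INSERT INTO brewery (name, owner) VALUES (?, ?)",
    ("Mt. Carmel Brewing Co.", "Mark"),
)

virtual_triples = list(rows_as_triples(conn))
for triple in virtual_triples:
    print(triple)
```

Nothing is copied out of the source database here, which is the property that makes the virtualized option quick to stand up for a new source.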

Option 2: Lift and Format
New source: days to weeks
Existing source: hours to days

SPARQL
PREFIX beer: <http://my.beer.vocab/1.0/>
SELECT ?breweryName
WHERE {
  ?brewery beer:hasName ?breweryName .
  ?person beer:owner_of ?brewery .
  ?person beer:first_name "Mark" .
}

PREFIX beer: <http://my.beer.vocab/1.0/>
SELECT ?beertype
WHERE {
  ?beer beer:isOfType ?beertype .
  ?person beer:brews ?beer .
  ?person beer:first_name "Mark" .
}

<beer:isOfType rdf:resource="beer:PaleAle"/>
<beer:isOfType rdf:resource="beer:Lager"/>
<beer:hasName>Mt. Carmel Brewing Company</beer:hasName>
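
To make the matching behind a SPARQL WHERE clause concrete, here is a small sketch that evaluates the three triple patterns of the brewery-name query against an in-memory list of triples. It is a toy join written for illustration, not how a production SPARQL engine works, and the sample data and the `ex:mark` and `ex:mtcarmel` identifiers are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Sample data; the identifiers ex:mark and ex:mtcarmel are hypothetical.
DATA: List[Triple] = [
    ("ex:mtcarmel", "beer:hasName", "Mt. Carmel Brewing Company"),
    ("ex:mark", "beer:owner_of", "ex:mtcarmel"),
    ("ex:mark", "beer:first_name", "Mark"),
]


def match(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` equals `triple`, or return None on conflict."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Variable: bind it, or check it against an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            # Constant term that does not match the data.
            return None
    return result


def solve(patterns: List[Triple], data: List[Triple]) -> List[Binding]:
    """Evaluate the patterns one at a time, keeping only consistent bindings."""
    bindings: List[Binding] = [{}]
    for pattern in patterns:
        next_bindings: List[Binding] = []
        for binding in bindings:
            for triple in data:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


query = [
    ("?brewery", "beer:hasName", "?breweryName"),
    ("?person", "beer:owner_of", "?brewery"),
    ("?person", "beer:first_name", "Mark"),
]
names = [b["?breweryName"] for b in solve(query, DATA)]
print(names)
```

Shared variables such as `?brewery` and `?person` are what join the patterns together: a binding survives only if every pattern agrees on the value.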

Case Study: Linked Data At Syngenta

Syngenta
Syngenta is a leading agriculture company helping
to improve global food security by enabling millions
of farmers to make better use of available
resources.
We have two primary lines of business: Seeds and
Agricultural Chemicals.
We have a huge commitment to internal R&D and
that is where our linked data initiatives are.

Linked Data at Syngenta
• Concept Store
  Enables Syngenta applications to consume and publish linked data controlled vocabularies (reference terms and relationships)
• ENVision Tool
  Enables trial placements and weightings that best represent target markets
• MINT Data
  Makes genetic identity & inventory data available for discovery, analysis and R&D-driven proofs of concept

What we accomplished
• In a 3-day hackathon we:
  • Mapped about 60% of MINT’s model from 2 databases to RDF
  • Built a virtualized RDF triple store
  • Created a data-discovery / browsing user interface

MINT Data
[Architecture diagram. Labeled components:]
• MINT Browser
• Semantic Wiki
• SPARQL
• RDF Repository Broker (Open-Sesame)
• Repository Configuration (Identity, Material)
• MINT Ontology (Identity, Material)
• Ontology & Mapping Designer
• Ontologist
• RDBMS-RDF Mapper: MINT Material RDBMS, JDBC, R2RML Mapping (Material)
• RDBMS-RDF Mapper: MINT Identity RDBMS, JDBC, R2RML Mapping (Identity)

MINT Class Model
• The MINT ontology was created within Protégé as shown here

Next Steps
• Moving from the virtualized layer to an actual physical triple store implementation
• Partnering with our benefits tracking team to get accurate metrics on MINT adoption and value
• Linking to additional data sources to provide dashboard KPIs and analytics for our R&D seeds pipeline

About Alex…
• Principal Consultant, CSpring
• https://www.linkedin.com/in/alexmeadows
• Twitter, GitHub as OpenDataAlex
• Alex has spent the last ten years working in various industries to help businesses unlock the information hidden in their data sets. He specializes in open source business intelligence solutions from data warehousing to dashboards, analytics, and beyond. His latest area of research has been on linked data (also known as triple stores). Alex has a Master’s in Business Intelligence from Saint Joseph’s University in Pennsylvania and a Bachelor’s in Business Administration from Chowan University in North Carolina.

About Bubba…
• Team Leader, R&D IS, Syngenta
• https://www.linkedin.com/in/bubbapuryear
• I’ve held roles as a software engineer, architect and manager across multiple industries. For the last 13 years I’ve worked in the life sciences industry supporting Research & Development. I’m currently the program architect / technical lead for a standardization program within Syngenta bringing Track & Trace compliance to R&D’s material operations. Many of Syngenta’s R&D product decisions for our Seeds line of business are founded on data associated with plant material identity. I have a Bachelor’s degree in Computer Science from Rose-Hulman Institute of Technology.
