Pelorus: A Semantic Web Application Platform
2010 Semantic Technology Conference

                       Michael Grove
             Director of Software Development
                    Clark & Parsia, LLC.
                   mike@clarkparsia.com
  http://clarkparsia.com -- http://www.twitter.com/candp
Who are we?
Clark & Parsia is a Semantic software startup founded in
2005
Offices in DC and Cambridge, MA
Software products for end-user and OEM use
Provides software development and integration services
Specializing in Semantic Web, web services, and
advanced AI technologies for federal and enterprise
customers.
Where do we start?
No, literally, where do we start?
Enterprises increasingly want to use semweb tech to
manage information
   Lack of in-house SemWeb expertise
So what's the first step in these cases?
   It's hard to get a project off the ground without
   expertise
   In many cases, you just want to get a prototype
   running ASAP to evaluate the approach
An integrated platform to rapidly prototype and assess
semweb tech, which also scales to production, is crucial
The Pelorus Platform
Pelorus Platform aims to ease this situation
It's a standards-based application development stack
geared toward enterprise information integration via RDF,
SPARQL and OWL.
     Provides a collection of software designed to take you
     from ontology (or data) to application
     Based on years of customer engagements learning
     what parts are the same for everyone, and what parts
     are customized by everyone--and facilitating both.
Minimal or no human-in-the-loop steps are required to get
a barebones application running
     From there, it's just UI customization
Ingredients
PelletServer
   RESTful server-side component powered by Pellet
   Provides:
       Reasoning
       Semantic Search
       Integrity constraints
       Query services
       Machine Learning ... and Planning too!
Semantic ETL
   Toolkit for transforming existing data into RDF
       Support for most common formats, XML, CSV,
       Excel, relational, etc.
       Conversion driven from domain ontology
More Ingredients
Annex - A linked data server
   Publishes your RDF as linked data
   Works in-place against any RDF database
      No files to parse and directory structure to fill out
   JavaScript module and pluggable template API for
   rendering resources
   CRUD workflow support for maintaining your data
More Ingredients
Machine Learning Suite
   Bootstrap ontologies from existing data
   Provides capabilities for learning ETL transformations
   from existing data, decreasing by-hand mapping
   burden
   Automatically create Pelorus models for browsing
   Analysis support, clustering, classification, and more.
Pelorus
   Faceted browsing via SPARQL for RDF data.
So What Now?
Intent of Platform is to take either your existing data, or an
existing ontology, as input and provide as output a
working skeleton application.
    This is the Staples Easy button for the Semantic Web
    Some minimal configuration and UI customization may be
    required
The goal is to Just Add Data and get back a working,
full-service, modern app that's optimized for data integration
and analysis.
Getting Started
Legacy data in a series of databases, XML files, etc.
    This is a maintenance nightmare
    How do you search this data, analyze it, or verify its
    correctness?
If we could get the data out of these legacy formats and
integrate them, then we could do something useful...
1. Integrate Legacy Data
 Ontology Bootstrapping via ML
    We can learn the basic ontology from our existing data
    Feed data to a ML process that will produce our
    ontology
 Semantic ETL
    Using our ontology, and some additional ML, we can
    generate mappings from the source data to the
    ontology
    Automatically convert our legacy data into RDF
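
As a rough illustration of the Semantic ETL step, the sketch below converts a small CSV table into N-Triples using a column-to-property mapping. This is not the Pelorus toolkit itself: the `http://example.org/` namespace, the `MAPPING` table, and the function names are all hypothetical, and the mapping is written by hand here rather than learned. It also assumes the id column contains values that are safe to embed in an IRI.

```python
import csv
import io

# Hypothetical namespace and column-to-property mapping, invented for illustration.
BASE = "http://example.org/"
MAPPING = {
    "name": BASE + "ontology#name",
    "team": BASE + "ontology#team",
}
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value):
    """Escape the characters that N-Triples forbids inside a quoted literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_ntriples(text, id_column, rdf_class):
    """Convert CSV text into a list of N-Triples lines using MAPPING."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{BASE}resource/{row[id_column]}>"
        lines.append(f"{subject} <{RDF_TYPE}> <{rdf_class}> .")
        for column, prop in MAPPING.items():
            value = row.get(column)
            if value:  # skip empty cells rather than emit empty literals
                lines.append(f'{subject} <{prop}> "{escape_literal(value)}" .')
    return lines


SAMPLE = "id,name,team\np1,Hank Aaron,Braves\np2,Willie Mays,\n"
triples = csv_to_ntriples(SAMPLE, "id", BASE + "ontology#Player")
for line in triples:
    print(line)
```

In a real pipeline the mapping would come from the domain ontology (and, per the slide, from ML-assisted mapping generation) instead of a hard-coded dictionary.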
2. Publish Integrated Data
Now that we have RDF, we'd like to publish it as Linked
Data
   Annex Linked Data server takes any RDF database
   and exposes its contents as Linked Data.
       Customizable template framework
       JavaScript API to access original RDF database
We'd also like to maintain our data
   Using Empire, we can generate Java beans to
   represent our domain ontology.
   Annex provides generic CRUD templates driven from
   standard Java beans, using JPA as a persistence
   mechanism.
By virtue of simply having RDF in a database, we've got
publication as Linked Data, and maintenance via simple
CRUD pages for free.
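
Linked Data servers commonly decide which representation of a resource to return (an HTML page or an RDF serialization) by inspecting the HTTP Accept header. The slides do not say how Annex handles this, so the following is only a generic sketch of that negotiation step; the `SUPPORTED` list and both function names are assumptions made for illustration.

```python
# Media types this hypothetical server can produce, in order of preference.
SUPPORTED = ["text/html", "text/turtle", "application/rdf+xml"]


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default weight when no q parameter is given
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        entries.append((media_type, q))
    return entries


def negotiate(header, supported=SUPPORTED):
    """Return the supported media type the client prefers, or None."""
    accepted = parse_accept(header)
    best, best_q = None, 0.0
    for candidate in supported:
        q, specificity = 0.0, -1
        for media_type, weight in accepted:
            if media_type == candidate:
                rank = 2  # exact match
            elif media_type == candidate.split("/")[0] + "/*":
                rank = 1  # type wildcard, e.g. text/*
            elif media_type == "*/*":
                rank = 0  # full wildcard
            else:
                continue
            if rank > specificity:  # the most specific matching range wins
                specificity, q = rank, weight
        if q > best_q:  # ties keep the earlier (server-preferred) type
            best, best_q = candidate, q
    return best


print(negotiate("text/turtle"))
print(negotiate("text/html,application/xhtml+xml,*/*;q=0.8"))
```

A browser-style header selects the HTML view, while an RDF-aware client asking for `text/turtle` gets the data itself from the same URI.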
3. Browse & Search & Query
We've published our RDF, but clicking around pages
looking for a particular resource is not ideal
Having a simple interface to browse the data would be
great.
Pelorus is served via Annex
   Facet model is generated dynamically via more ML
   Uses the same JavaScript template framework for custom
   display of RDF content.
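
To make "faceted browsing via SPARQL" more concrete, here is one way a facet's value counts could be expressed as a query. This is a sketch, not Pelorus's actual query generation: the function name and the example IRIs are hypothetical, the aggregate syntax assumes a SPARQL 1.1 endpoint, and the IRIs are assumed to come from a trusted facet model, since they are inserted into the query without validation.

```python
def facet_count_query(facet_property, selections=None):
    """Build a SPARQL query counting resources per value of one facet.

    selections maps property IRIs to the IRI values already chosen in
    other facets; each one becomes an extra triple pattern that narrows
    the set of matching resources.
    """
    patterns = [f"?s <{facet_property}> ?value ."]
    for prop, value in sorted((selections or {}).items()):
        patterns.append(f"?s <{prop}> <{value}> .")
    body = "\n  ".join(patterns)
    return (
        "SELECT ?value (COUNT(DISTINCT ?s) AS ?count)\n"
        "WHERE {\n"
        f"  {body}\n"
        "}\n"
        "GROUP BY ?value\n"
        "ORDER BY DESC(?count)"
    )


# Hypothetical IRIs: count players per team, restricted to one league.
query = facet_count_query(
    "http://example.org/ontology#team",
    {"http://example.org/ontology#league": "http://example.org/resource/NL"},
)
print(query)
```

Each facet shown in the browser would issue one such query, and selecting a value adds it to `selections` for every other facet.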
4. Analyze & Plan & Act
 We can use OWL reasoning via Pellet to learn new things
 about the data; for example:
    which products should we sell to which customers?
    which products should we sell to which prospects?
    why do we make these recommendations?
 We can use Machine Learning to learn new things, too:
    which customers are like others? (similarity)
    which groups do our customers fall into? (clustering)
    which employees are liaisons between parts of the
    company? (social network analysis)
    which employees are most likely to retire in the next
    year? (classification)
 We can use Automated Planning to:
    build actionable plans/workflows based on these
    analyses
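
The idea of "learning new things" through reasoning can be shown with a toy example. Pellet is a full OWL DL reasoner and does far more than this; the sketch below applies only one simple rule (an instance of a class is also an instance of its superclasses) until no new facts appear. The prefixed names such as `ex:GoldCustomer` and the function name are made up for illustration.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def infer_types(triples):
    """Return the input triples plus rdf:type facts implied by rdfs:subClassOf."""
    facts = set(triples)
    changed = True
    while changed:  # repeat until a pass adds nothing new (a fixpoint)
        changed = False
        subclass = [(s, o) for s, p, o in facts if p == SUBCLASS_OF]
        typed = [(s, o) for s, p, o in facts if p == RDF_TYPE]
        for instance, cls in typed:
            for sub, sup in subclass:
                new = (instance, RDF_TYPE, sup)
                if sub == cls and new not in facts:
                    facts.add(new)
                    changed = True
    return facts


# Hypothetical data: one customer and a two-level class hierarchy.
data = [
    ("ex:acme", RDF_TYPE, "ex:GoldCustomer"),
    ("ex:GoldCustomer", SUBCLASS_OF, "ex:Customer"),
    ("ex:Customer", SUBCLASS_OF, "ex:Organization"),
]
inferred = infer_types(data) - set(data)
for triple in sorted(inferred):
    print(triple)
```

The two inferred triples were never stated in the source data, which is the sense in which reasoning surfaces new facts for later analysis and planning.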
Interlude: Pelorus Demos

http://pelorus.clarkparsia.com/ -- American baseball

http://nasa.clarkparsia.com/ -- NASA Space Program

http://datagov.clarkparsia.com/ -- data.gov data catalog
What's the point?
Getting to step 4 (and beyond) is the point; that's where
the real ROI lives...
   You want to get there sooner & cheaper
   But many times steps 1-3 are a hurdle
       If you've got limited time and/or budget to prove
       value in step 4, you don't want to waste it on the
       drudgery of getting off the ground
   This is the key to semantic technology's value
   proposition
Questions?
