24-06-2010




      Web 3.0, Semantics & Enterprise Computing
      Session III: Enterprising Semantics




                                             Nagaraju Pappu
                                               June 2010
           www.canopusconsulting.com




     Describing the Semantic Web to Nontechnical
     Users
          Labeling data on the Web so that both humans
           and machines can use it more effectively
          Associating meaning with data in a form that machines
           can understand, so as to achieve far more
           automation and off-load more work to
           machines
          Exploiting a common vocabulary and richer
           modeling of the subject area for much better
           integration of data

© Canopus Consulting








     Semantics and Enterprise Computing
          Domain ontologies and their uses:
               Banking, healthcare, product catalogues
               Online, targeted advertisement
               Commoditized B2B infrastructure
          Example Domain Ontologies:








     Enterprise Intranets
               Data-to-information-to-content transformation
                      Unlocking the huge amounts of unstructured information and
                       connecting that information
                      From users/employees to communities
               From role-based access control to modeling
                membership in goal-oriented communities and groups
               Intelligent agents as a backbone for large-scale,
                dynamic interchange of information across
                applications and systems
               Enterprise performance management
                      ITIL CMDB and SLA management are immediate candidates for
                       going “onto-logical”






     Agenda

     •      Tools
     •      Data / vocabularies
     •      Collateral
     •      Community
     •      Pointers for getting going



               From:
               Semantic Web: Technologies and Applications for the Real-World
               Author: Amit Sheth
               Wright State University
               http://knoesis.wright.edu






     Enablers and Techniques
          Ontology: agreement on a common
           vocabulary and domain knowledge; schema +
           knowledge base
          Semantic annotation (metadata extraction):
           manual, semi-automatic (automatic with
           human verification), or automatic
          Reasoning/computation: semantics-enabled
           search, integration, complex queries, analysis
           (paths, subgraphs), pattern finding, mining,
           hypothesis validation, discovery, visualization
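
     The "analysis (paths, subgraphs)" item can be illustrated with a small
     sketch. It assumes the knowledge base is a list of
     (subject, relationship, object) triples; the entity and relationship
     names below are hypothetical, and breadth-first search is just one
     simple way to find a connecting path.

```python
from collections import deque

# Hypothetical knowledge base: (subject, relationship, object) triples.
TRIPLES = [
    ("alice", "works_for", "acme"),
    ("acme", "located_in", "bangalore"),
    ("bob", "works_for", "acme"),
]


def find_path(triples, start, goal):
    """Breadth-first search for a shortest chain of relationships."""
    edges = {}
    for subject, relationship, obj in triples:
        edges.setdefault(subject, []).append((relationship, obj))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relationship, obj in edges.get(node, []):
            if obj not in seen:
                seen.add(obj)
                queue.append((obj, path + [(node, relationship, obj)]))
    return None  # no directed path connects the two entities


for step in find_path(TRIPLES, "alice", "bangalore"):
    print(step)
```

     The returned list of triples is the explanation of how the two
     entities are connected, which is what a path query is expected to surface.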





     A Typical Enterprise Semantic Web Application
     Lifecycle
               Build ontology
                      Build schema (model-level representation)
                      Populate with knowledge base (people, locations, organizations, events)

               Automatic semantic annotation (extract semantic
                metadata)
                      Any type of document, multiple sources of documents
                      Metadata can be stored with or separately from documents

               Applications:
                      Search (ranked list of documents of interest, i.e. semantic search),
                       integrate/portal, summarize/explain, analyze, make decisions
                      Reasoning techniques: graph analysis, inferencing

           Types of content/documents, use of standards,
             scalability, performance
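
     The annotation step of this lifecycle can be sketched as a lookup of
     knowledge-base entries in document text. This is a minimal
     illustration under stated assumptions: the knowledge base is a plain
     dictionary, matching is exact string matching, and the record layout
     (doc, entity, type, start) is hypothetical. Production annotators do
     far more, such as disambiguation and variant handling.

```python
import re

# Tiny knowledge base: entity name -> class from the ontology schema.
KNOWLEDGE_BASE = {
    "Amit Sheth": "Person",
    "Wright State University": "Organization",
}


def annotate(doc_id, text, knowledge_base):
    """Return metadata records kept separately from the document text."""
    annotations = []
    for name, entity_type in knowledge_base.items():
        for match in re.finditer(re.escape(name), text):
            annotations.append(
                {
                    "doc": doc_id,
                    "entity": name,
                    "type": entity_type,
                    "start": match.start(),
                }
            )
    # Order by position so the metadata reads in document order.
    return sorted(annotations, key=lambda record: record["start"])


TEXT = "Slides adapted from Amit Sheth, Wright State University."
for record in annotate("doc-1", TEXT, KNOWLEDGE_BASE):
    print(record)
```

     Because the records carry the document id and offset, they can be
     stored alongside the document or in a separate metadata store, as the
     slide notes.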




     Semagix Freedom Architecture: for building
     ontology-driven information systems

     [Figure: architecture diagram not reproduced. Caption: Managing Semantic Content on the Web. © Semagix, Inc.]




     Building an Ontology

         Three broad approaches:
         Option 1: social process/manual: takes many years and committees
                          Can be based on a metadata standard

         Option 2: automatic taxonomy generation (statistical clustering/NLP):
          limitations/problems with quality, dependence on corpus, naming

         Option 3: descriptional component (schema) designed by domain
          experts; description base (assertional component, extension) populated by
          automated processes

         Option 2 is being investigated in several research projects.

         Option 3 is currently supported by technologies such as Semagix Freedom.






       Ontology Examples
           Time, Space
           Gene Ontology, Glycomics
           Pharma Drug, Treatment-Diagnosis
           Repertoire Management
           Equity Markets
           Anti-money Laundering, Financial Risk, Terrorism




           Can be Public, Government, Limited Availability, Commercial








     Ontology Language/Representation Spectrum

     (Diagram reconstructed as a list, from strongest to weakest semantics)

     Logical Theory: "is disjoint subclass of, with transitivity property"
          Modal Logic; First Order Logic; Description Logic; DAML+OIL, OWL, UML
     Conceptual Model: "is subclass of" (semantic interoperability)
          RDFS, XTM; Extended ER
     Thesaurus: "has narrower meaning than" (structural interoperability)
          ER; DB Schemas, XML Schema
     Taxonomy: "is sub-classification of" (syntactic interoperability)
          Relational Model; XML
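
     The step from "is subclass of" to "is disjoint subclass of with
     transitivity property" can be made concrete with a short sketch. It
     assumes a hypothetical banking class hierarchy stored as a plain
     dictionary; the class names and the helper functions are
     illustrative, not taken from any ontology language or reasoner.

```python
# Hypothetical schema: class -> direct superclasses.
SUBCLASS_OF = {
    "SavingsAccount": ["Account"],
    "Account": ["FinancialProduct"],
    "Loan": ["FinancialProduct"],
}

# Pairs of classes declared disjoint (no instance may belong to both).
DISJOINT = {frozenset(["Account", "Loan"])}


def superclasses(cls, subclass_of):
    """All direct and inherited superclasses, using transitivity."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def is_consistent(asserted_classes, subclass_of, disjoint_pairs):
    """False if the inferred classes of one instance include a disjoint pair."""
    inferred = set(asserted_classes)
    for cls in asserted_classes:
        inferred |= superclasses(cls, subclass_of)
    return not any(pair <= inferred for pair in disjoint_pairs)


print(sorted(superclasses("SavingsAccount", SUBCLASS_OF)))
print(is_consistent(["SavingsAccount", "Loan"], SUBCLASS_OF, DISJOINT))
```

     A taxonomy or thesaurus could not flag the second case as an error;
     detecting it needs both the transitive subclass relation and the
     disjointness axiom.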







     Large-Scale Systems Already in Use

          UMLS: a high-level schema of the biomedical
           domain
               136 classes and 49 relationships
               Synonyms of all relationships, found using variant lookup
                (tools from NLM)
               T147 variants: effect, induce, etiology, cause,
                effecting, induced

          MeSH
               Terms already asserted as instances of one or more
                classes in UMLS
          PubMed
               Abstracts annotated with one or more MeSH terms
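
     The idea of mapping relationship synonyms back to a single
     relationship can be sketched as a reverse lookup table. The T147
     variant list is the one shown on the slide; the plural-stripping
     rule and the function names are assumptions made for illustration
     and are not how the NLM variant-lookup tools work.

```python
import re

# Variant list for T147 as shown on the slide.
VARIANTS = {
    "T147": ["effect", "induce", "etiology", "cause", "effecting", "induced"],
}


def build_lookup(variants):
    """Invert {relationship: [words]} into {word: relationship}."""
    return {word: rel for rel, words in variants.items() for word in words}


def relationships_in(sentence, lookup):
    """Return (token, relationship id) for every token with a known variant."""
    hits = []
    for token in re.findall(r"[a-z]+", sentence.lower()):
        candidates = [token]
        if token.endswith("s"):
            candidates.append(token[:-1])  # crude handling of "induces"
        for candidate in candidates:
            if candidate in lookup:
                hits.append((token, lookup[candidate]))
                break
    return hits


SENTENCE = (
    "An excessive endogenous or exogenous stimulation by estrogen "
    "induces adenomatous hyperplasia of the endometrium"
)
print(relationships_in(SENTENCE, build_lookup(VARIANTS)))
```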







     Example PubMed abstract (for the domain
     expert)

     [Figure: example PubMed record with two labeled regions, "Abstract" and "Classification/Annotation"]








      Method – Parse Sentences in PubMed




     Tools: SS-Tagger (University of Tokyo), SS-Parser (University of Tokyo)

     (TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) (JJ
     exogenous) ) (NN stimulation) ) (PP (IN by) (NP (NN estrogen) ) ) ) (VP (VBZ
     induces) (NP (NP (JJ adenomatous) (NN hyperplasia) ) (PP (IN of) (NP (DT
     the) (NN endometrium) ) ) ) ) ) )



     REmBRANDTS: REtrieval, BRowsing, Analytics and kNowledge Discovery over Text using Semantics
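
     The bracketed parser output above is easy to load into a tree for
     further processing. The sketch below is a small reader for that
     notation written for this illustration; it is not part of SS-Parser,
     and the function names are hypothetical.

```python
import re

# Parser output copied from the slide.
PARSE = (
    "(TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) (JJ "
    "exogenous) ) (NN stimulation) ) (PP (IN by) (NP (NN estrogen) ) ) ) (VP (VBZ "
    "induces) (NP (NP (JJ adenomatous) (NN hyperplasia) ) (PP (IN of) (NP (DT "
    "the) (NN endometrium) ) ) ) ) ) )"
)


def parse_tree(text):
    """Turn a bracketed parse into nested lists: [label, child, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    position = 0

    def read():
        nonlocal position
        token = tokens[position]
        position += 1
        if token != "(":
            return token  # a label or a word
        node = []
        while tokens[position] != ")":
            node.append(read())
        position += 1  # consume the closing bracket
        return node

    return read()


def leaves(node):
    """Collect the words of a subtree, skipping the label at index 0."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words


tree = parse_tree(PARSE)
print(tree[0])
print(" ".join(leaves(tree)))
```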





      Method – Identify entities and Relationships in
      Parse Tree
      [Figure: annotated parse tree not reproduced. Legend: modifiers, modified entities, composite entities]
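
     A rough sketch of this step follows, using the parsed sentence from
     the previous slide. It assumes a simple sentence of the form
     S -> NP VP with a single VBZ verb, treats JJ children as modifiers
     and NN children as the modified entity, and takes the whole noun
     phrase as the composite entity. These rules and the function names
     are assumptions for illustration, not the method used in the
     original system.

```python
import re

PARSE = (
    "(TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) (JJ "
    "exogenous) ) (NN stimulation) ) (PP (IN by) (NP (NN estrogen) ) ) ) (VP (VBZ "
    "induces) (NP (NP (JJ adenomatous) (NN hyperplasia) ) (PP (IN of) (NP (DT "
    "the) (NN endometrium) ) ) ) ) ) )"
)


def parse_tree(text):
    """Bracketed parse -> nested lists of the form [label, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    position = 0

    def read():
        nonlocal position
        token = tokens[position]
        position += 1
        if token != "(":
            return token
        node = []
        while tokens[position] != ")":
            node.append(read())
        position += 1
        return node

    return read()


def words(node):
    if isinstance(node, str):
        return [node]
    return [word for child in node[1:] for word in words(child)]


def children(node, label):
    return [c for c in node[1:] if isinstance(c, list) and c[0] == label]


def extract_relationship(tree):
    """Return (subject phrase, verb, object phrase) for an S -> NP VP tree."""
    sentence = children(tree, "S")[0]
    subject_np = children(sentence, "NP")[0]
    verb_phrase = children(sentence, "VP")[0]
    verb = children(verb_phrase, "VBZ")[0]
    object_np = children(verb_phrase, "NP")[0]
    return " ".join(words(subject_np)), words(verb)[0], " ".join(words(object_np))


def split_noun_phrase(flat_np):
    """Split a flat NP into (modifiers, modified entities)."""
    modifiers = [c[1] for c in flat_np[1:] if c[0] == "JJ"]
    heads = [c[1] for c in flat_np[1:] if c[0].startswith("NN")]
    return modifiers, heads


tree = parse_tree(PARSE)
print(extract_relationship(tree))
object_np = children(children(children(tree, "S")[0], "VP")[0], "NP")[0]
print(split_noun_phrase(children(object_np, "NP")[0]))
```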








     SEMANTIC WEB IN PRACTICE








     Semantic Web Infrastructure
          Triple stores
          RDFizers
          Ontology Editors / Reasoning Systems
          Application Frameworks





     Triple Stores

               •       3Store                  • Oracle RDF Data Model
               •       Aduna                   • Profium Metadata Server
               •       AllegroGraph            • RDF Gateway
               •       Boca                    • RDFStore
               •       Joseki                  • Sesame
               •       Kowari                  • Virtuoso
               •       Mulgara                 • YARS

            There are many others available too…
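
     What a triple store does at its core can be shown with an in-memory
     sketch: store (subject, predicate, object) statements and answer
     pattern queries in which any position may be left open. The class
     and the "vendor" predicate are hypothetical; none of the products
     listed above is implemented this way, and real stores add indexing,
     persistence, and a query language such as SPARQL.

```python
class TripleStore:
    """Minimal in-memory store for (subject, predicate, object) triples."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            triple
            for triple in self._triples
            if all(want is None or want == got for want, got in zip(pattern, triple))
        )


store = TripleStore()
store.add("Boca", "vendor", "IBM")
store.add("Virtuoso", "vendor", "OpenLink")
store.add("RDF Data Model", "vendor", "Oracle")

print(store.match(predicate="vendor", obj="IBM"))
print(store.match(subject="Virtuoso"))
```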

     Boca, IBM

     [Figure not reproduced. Source: IBM]




     RDF Data Model, Oracle
          Object-relational implementation
          A set of triples forms an RDF/OWL graph (model)
          Optimized storage structure: repeated values are stored only once
          Can handle multiple lexical forms of the same value
          Incremental load and bulk load
          SPARQL-like graph patterns embedded in SQL queries
          Native inferencing for RDF, RDFS, and user-defined rules
          Support for OWL and semantic operators planned for the next release
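
     "Repeated values stored only once" describes a common storage idea:
     keep each distinct value in a dictionary and store triples as small
     integer ids. The sketch below shows that general technique only; it
     is an assumption-laden illustration with hypothetical names and does
     not describe Oracle's actual storage layout.

```python
class ValueDictionary:
    """Assigns one integer id per distinct value."""

    def __init__(self):
        self._ids = {}
        self._values = []

    def encode(self, value):
        if value not in self._ids:
            self._ids[value] = len(self._values)
            self._values.append(value)
        return self._ids[value]

    def decode(self, value_id):
        return self._values[value_id]

    def __len__(self):
        return len(self._values)


def encode_triples(triples, dictionary):
    """Replace every term of every triple with its dictionary id."""
    return [tuple(dictionary.encode(term) for term in triple) for triple in triples]


TRIPLES = [
    ("doc1", "type", "Document"),
    ("doc2", "type", "Document"),
]
dictionary = ValueDictionary()
encoded = encode_triples(TRIPLES, dictionary)
print(encoded)
print(len(dictionary), "distinct values for", 3 * len(TRIPLES), "terms")
```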








     Virtuoso, OpenLink
     •     Hybrid data server that combines SQL, RDF, XML, and full-text
           data management

     •     Includes a virtual/federated DBMS layer that enables transparent
           access to data from third-party SQL, RDF, XML, and Web Services sources

     •     Produces RDF instance data in physical and virtual forms from local
           or third-party data sources

     •     Provides full support for the SPARQL query language against
           physical and virtual RDF graphs

     •     Query optimizer is specifically tuned for high-performance data
           access across all realms

     •     Includes built-in middleware for producing RDF instance data
           on the fly from non-RDF data sources (e.g. (X)HTML, Microformats,
           Web Services, binary files)






     Adapting SQL Databases

     [Figure not reproduced. Source: Tim Berners-Lee]




     Mapping Relational to RDF

     •      D2RQ
     •      SquirrelRDF
     •      DartGrid
     •      SPASQL
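
     The general idea behind these mappers can be sketched with the
     standard-library sqlite3 module: each row becomes a resource, each
     column becomes a property, and each cell becomes a triple. The
     namespace, table, and URI scheme below are hypothetical, and this
     convention is a simplification, not the mapping language of D2RQ or
     of any other tool listed here.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace


def table_to_triples(connection, table, key_column):
    """Map every non-null cell of a table to a (subject, property, value) triple."""
    cursor = connection.execute(
        f'SELECT * FROM "{table}" ORDER BY "{key_column}"'
    )
    columns = [description[0] for description in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key_column]}"
        for column, value in record.items():
            if column != key_column and value is not None:
                triples.append((subject, f"{BASE}{table}#{column}", value))
    return triples


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)"
)
connection.execute("INSERT INTO employee VALUES (1, 'Asha', 'Finance')")
connection.execute("INSERT INTO employee VALUES (2, 'Ravi', NULL)")

triples = table_to_triples(connection, "employee", "id")
for triple in triples:
    print(triple)
```

     NULL cells produce no triple, which reflects a basic difference
     between the two models: RDF simply omits a statement that a
     relational row would record as missing.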
                                      [Figure not reproduced. Source: DartGrid]




     RDFizers

               •       Relational -> RDF       • Java -> RDF
               •       XML -> RDF              • Weather -> RDF
               •       Excel -> RDF            • Palm -> RDF
               •       JPEG -> RDF             • Outlook -> RDF
               •       BibTeX -> RDF           • Flickr -> RDF

     A directory of RDFizers is provided at:
     http://simile.mit.edu/wiki/RDFizers








     Ontology Editors and Environments

          Protégé, SWOOP, GrOWL, TopBraid, Ontotrack, SemanticWorks, ...

     [Figure not reproduced. Source: Ian Horrocks]




     Reasoning Systems

          CEL, Pellet, KAON2




     Semantic Web Tools
          RDF programming environments for 14+ languages
               C, C++, C# and .NET, Haskell, Java, JavaScript, Lisp, Obj-C,
               PHP, Perl, Prolog, Python, Ruby, Tcl/Tk
          Selection of online validators
               BBN OWL Validator, OWL Consistency Checker,
               WonderWeb OWL-DL Validator, RDF Validator,
               RDF/XML & N3 Validator, ConsVISor OWL Consistency
               Checker
          SPARQL endpoints
               SPARQLer, SPARQLette, XML Army Knife, OpenLink
               Virtuoso
          Semantic Web crawlers
               Swoogle, SWSE, Zitgist





     Semantic Web Tools (continued)
          RDF browsers
               BrowseRDF, /facet, Longwell, mSpace, Siderean Software, Exhibit
          Semantic Web browsers
               DISCO, ObjectViewer, OpenLink RDF Browser, Tabulator Browser,
                Haystack
          Labeling
               Adobe XMP
          Information extraction
               Amilcare, Language and Computing
          Visualization
               IsaViz, Prefuse, Tom Sawyer, RDF-Gravity
          Relationship analytics
               Cogito
          Content management
               Profium Semantic Information Router
          Information integration
               Ontoprise, Software AG, @Semantics, webMethods, Revelytix, Ontology
                Works

                         Over 500 tools are now available




     Lists of Tools

     •      http://sites.wiwiss.fu-berlin.de/suhl/bizer/toolkits/index.htm
     •      http://esw.w3.org/topic/SemanticWebTools
     •      http://www.mkbergman.com/?p=291
     •      http://planetrdf.com/guide/
     •      http://www.sekt-project.org/resources/sekt_components.html








     How to get RDF Data?

     •      Write your own RDF in your preferred syntax
     •      Add RDF to XML directly (in its own namespace), e.g. in SVG
     •      Use intelligent scrapers or wrappers to extract structure from a
           Web page and then generate RDF automatically (e.g. via an XSLT
           script)
     •      Formalize the scraper approach with GRDDL
     •      Use RDFa, which extends (X)HTML by defining general attributes to add
           metadata to any element
     •      Create a bridge to relational databases
     •      Use bridges from other data sources
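
     The RDFa route can be illustrated with the standard-library HTML
     parser. The reader below is a deliberately simplified sketch: it
     understands only the about, property, and content attributes, treats
     the most recent about value as the subject, and is not a conforming
     RDFa processor. The page fragment and the example.org URI are
     hypothetical.

```python
from html.parser import HTMLParser


class SimpleRDFaReader(HTMLParser):
    """Collects (subject, property, value) triples from a few RDFa attributes."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.subject = ""   # most recent `about` value
        self._open = None   # (tag, property) still waiting for element text
        self._text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "about" in attributes:
            self.subject = attributes["about"]
        prop = attributes.get("property")
        if prop is None:
            return
        if "content" in attributes:
            # Value given explicitly in the attribute.
            self.triples.append((self.subject, prop, attributes["content"]))
        else:
            # Value is the element's text; collect it until the end tag.
            self._open = (tag, prop)
            self._text = []

    def handle_data(self, data):
        if self._open is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open[0]:
            value = "".join(self._text).strip()
            self.triples.append((self.subject, self._open[1], value))
            self._open = None


PAGE = """
<div about="http://example.org/talk">
  <h1 property="dc:title">Enterprising Semantics</h1>
  <span property="dc:creator">Nagaraju Pappu</span>
  <meta property="dc:date" content="2010-06"/>
</div>
"""

reader = SimpleRDFaReader()
reader.feed(PAGE)
for triple in reader.triples:
    print(triple)
```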








     RDF Data
     •     Annotea Bookmark File         •   BIND
     •     DBLP                          •   BrainPharm
     •     dbpedia                       •   Entrez Gene
     •     dbtune                        •   HIVSDB
     •     Geonames                      •   KEGG
     •     MusicBrainz                   •   NeuroNames
     •     RDF Book Mashup               •   Reactome
     •     Revyu                         •   SenseLab
     •     US Census Data                •   SWAN publication & hypothesis
     •     WordNet                       •   UniProt







     Vocabularies
     •      eClassOwl: eBusiness ontology for products and services
     •      Gene Ontology: describes genes and gene products
     •      BioPAX: for biological pathway data
     •      SKOS Core: describes knowledge systems, thesauri,
           glossaries
     •      Dublin Core: about information resources, digital
           libraries, with extensions for rights, permissions, digital
           rights management
     •      FOAF: about people and their organizations
     •      DOAP: descriptions of software projects
     •      Music Ontology: describes CDs, music tracks, etc.
     •      SIOC: Semantically-Interlinked Online Communities
                                                                 Source: Ivan Herman
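
     Reusing a shared vocabulary mostly means using its published
     property and class URIs in your own data. As a small sketch, the
     code below describes a person with FOAF terms and prints N-Triples.
     The FOAF namespace and the terms Person, name, and knows are part of
     FOAF; the example.org URIs, the person names, and the helper
     functions are hypothetical.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def iri(value):
    """Format an IRI for N-Triples."""
    return f"<{value}>"


def literal(text):
    """Format a plain string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def describe_person(person_iri, name, knows=()):
    """Return N-Triples lines describing one person with FOAF terms."""
    lines = [
        f"{iri(person_iri)} {iri(RDF_TYPE)} {iri(FOAF + 'Person')} .",
        f"{iri(person_iri)} {iri(FOAF + 'name')} {literal(name)} .",
    ]
    for other in knows:
        lines.append(f"{iri(person_iri)} {iri(FOAF + 'knows')} {iri(other)} .")
    return lines


lines = describe_person(
    "http://example.org/people/alice",
    "Alice",
    knows=["http://example.org/people/bob"],
)
print("\n".join(lines))
```

     Because the properties are FOAF's rather than locally invented ones,
     any FOAF-aware tool can interpret the output without a custom mapping.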







     Collateral

     •      Much good information at W3C
               http://www.w3.org/2001/sw/
     •      New FAQ on the Semantic Web
               http://www.w3.org/2001/sw/SW-FAQ
     •      Semantic Web Case Studies and Use Cases
               http://www.w3.org/2001/sw/sweo/public/UseCases
     •      List of Semantic Web books
               http://esw.w3.org/topic/SwBooks
     •      Dave Beckett’s Resources
     •      PlanetRDF, a blog aggregator on Semantic Web topics








     Public Fora at W3C

     •      Semantic Web Interest Group
               A forum for developers with an archived mailing
                list, and a constant IRC presence on
                freenode.net#swig
     •     Semantic Web for Health Care & Life Sciences: SW-HCLS
     •     Semantic Web Deployment Working Group
               Archives of working group are public
     •      Semantic Web Education and Outreach IG
               Community Projects
                      Whitelisting Email Senders with FOAF
                      Linking Open Data on the Semantic Web
                      Knowee Contact Organizer
                      POWDER Browser Extension




     Pointers for Getting Going

     •      Use robust URIs
     •      Reuse existing data and ontologies
     •      A little semantics goes a long way
     •      Model the real world rather than data artifacts
     •      Build upon your infrastructure incrementally








     Books
          Programming the Semantic Web, O'Reilly Media Inc.
               Authors: Toby Segaran, Colin Evans & Jamie Taylor

          Social Networks and the Semantic Web, Springer
               Author: Peter Mika

          Semantic Web for the Working Ontologist, Morgan Kaufmann
           Publishers
               Authors: Dean Allemang, James Hendler

          Introduction to the Semantic Web and Semantic Web Services,
           Chapman & Hall
               Author: Liyang Yu

          Semantic Web-Based Information Systems: State-of-the-Art
           Applications, Cybertech Publishing
               Authors: Amit Sheth, Miltiadis Lytras





     Summary

     •     Many Semantic Web tools are available
     •     Data and vocabularies are increasingly being made
           available in RDF/OWL
     •     Many books, tutorials, and overviews are available to
           help you get going
     •     Several public fora exist for community activities




© Canopus Consulting




                                                                                19

More Related Content

PDF
Model-Driven Software Development with Semantic Web Technologies
PPTX
Semantic Web powering Enterprise and Web Applications
PDF
20120419 linkedopendataandteamsciencemcguinnesschicago
PPT
Metadata in general and Dublin Core in specific; some experiences
PPT
SCOReD-UniTEN 2010 Managing Personal Knowledge
PPTX
Semantics empowered Physical-Cyber-Social Systems for EarthCube
PDF
Wipro web3.0 seminar-brochure
PDF
Web3.0 seminar wipro-session1-pursuitofmeaning
Model-Driven Software Development with Semantic Web Technologies
Semantic Web powering Enterprise and Web Applications
20120419 linkedopendataandteamsciencemcguinnesschicago
Metadata in general and Dublin Core in specific; some experiences
SCOReD-UniTEN 2010 Managing Personal Knowledge
Semantics empowered Physical-Cyber-Social Systems for EarthCube
Wipro web3.0 seminar-brochure
Web3.0 seminar wipro-session1-pursuitofmeaning

Similar to Web3.0 seminar wipro-session4-enterprisingsemantics (20)

PDF
Web3.0 seminar wipro-session2-logicalontological
PDF
Astitva jneyatva-abhideyatva
PDF
Veda Semantic Technology
PDF
Semantic Technology: State of the arts and Trends
PDF
Introduction to the Semantic Web
PDF
20120411 travelalliancemcguinnessfinal
ODT
Riding The Semantic Wave
PDF
Semtech2006
PPTX
Intro to the Semantic Web Landscape - 2011
PPTX
Poster Semantic Web - Abhijit Chandrasen Manepatil
 
PPT
Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...
PPT
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
PDF
Semantic Search for Enterprise 2.0
PDF
Tutorial kcc-2011
PPT
Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...
PDF
Information Quality in the Web Era
PPT
Semantically-aware Networks and Services for Training and Knowledge Managemen...
PDF
A Controlled Natural Language Interface for Semantic MediaWiki
PDF
Mit press a semantic web primer - 2004 !! - (by laxxuss)
PPTX
Semantic web
Web3.0 seminar wipro-session2-logicalontological
Astitva jneyatva-abhideyatva
Veda Semantic Technology
Semantic Technology: State of the arts and Trends
Introduction to the Semantic Web
20120411 travelalliancemcguinnessfinal
Riding The Semantic Wave
Semtech2006
Intro to the Semantic Web Landscape - 2011
Poster Semantic Web - Abhijit Chandrasen Manepatil
 
Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Search for Enterprise 2.0
Tutorial kcc-2011
Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...
Information Quality in the Web Era
Semantically-aware Networks and Services for Training and Knowledge Managemen...
A Controlled Natural Language Interface for Semantic MediaWiki
Mit press a semantic web primer - 2004 !! - (by laxxuss)
Semantic web
Ad

Recently uploaded (20)

PDF
Encapsulation theory and applications.pdf
PDF
NewMind AI Weekly Chronicles - August'25 Week I
PDF
Review of recent advances in non-invasive hemoglobin estimation
PDF
Agricultural_Statistics_at_a_Glance_2022_0.pdf
PDF
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
PDF
Reach Out and Touch Someone: Haptics and Empathic Computing
PDF
Bridging biosciences and deep learning for revolutionary discoveries: a compr...
PDF
Approach and Philosophy of On baking technology
PDF
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
PDF
The Rise and Fall of 3GPP – Time for a Sabbatical?
PDF
Unlocking AI with Model Context Protocol (MCP)
PDF
Modernizing your data center with Dell and AMD
PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PPTX
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
PDF
cuic standard and advanced reporting.pdf
PPT
Teaching material agriculture food technology
PDF
CIFDAQ's Market Insight: SEC Turns Pro Crypto
PPTX
Digital-Transformation-Roadmap-for-Companies.pptx
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
PPTX
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
Encapsulation theory and applications.pdf
NewMind AI Weekly Chronicles - August'25 Week I
Review of recent advances in non-invasive hemoglobin estimation
Agricultural_Statistics_at_a_Glance_2022_0.pdf
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
Reach Out and Touch Someone: Haptics and Empathic Computing
Bridging biosciences and deep learning for revolutionary discoveries: a compr...
Approach and Philosophy of On baking technology
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
The Rise and Fall of 3GPP – Time for a Sabbatical?
Unlocking AI with Model Context Protocol (MCP)
Modernizing your data center with Dell and AMD
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
cuic standard and advanced reporting.pdf
Teaching material agriculture food technology
CIFDAQ's Market Insight: SEC Turns Pro Crypto
Digital-Transformation-Roadmap-for-Companies.pptx
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
Ad

Web3.0 seminar wipro-session4-enterprisingsemantics

  • 1. 24-06-2010 Web 3.0, Semantics & Session III – Enterprise Computing Enterprising Semantics Nagaraju Pappu June 2010 www.canopusconsulting.com Describing Semantic Web to Nontechnical Users  Labeling data on Web so that both humans and machines can more effectively use them  Associating meaning to data that machines 2 can understand so as to achieve lot more automation and off-load more work to machines  Exploiting common vocabulary and richer modeling of subject area for much better integration of data © Canopus Consulting 1
  • 2. 24-06-2010 3 © Canopus Consulting Semantics and Enterprise Computing  Domain Ontologies and their uses:  Banking, Healthcare, Product catalogues  Online, targetted advertisement  Commoditized B2B infrastructure 4  Example Domain Ontologies: © Canopus Consulting 2
  • 3. 24-06-2010 Enterprise Intra-Nets  From Data-Information-Content transformation  Unlocking the Huge amounts of unstructured information and connecting the information  From Users/Employees to Communities  From Role Based Access Control to Modeling 5 Membership into Communities and groups that are goal oriented  Intelligent Agents as a backbone for large scale, dynamic interchange of information across applications and systems  Enterprise Performance Management  ITIL CMDB, SLA Management are immediate candidates for going “onto-logical” © Canopus Consulting Agenda • Tools • Data / vocabularies • Collateral • Community 6 • Pointers for getting going From: Semantic Web: Technologies and Applications for the Real-World Authors: Amit Seth Wright State University http://knoesis.wright.edu © Canopus Consulting 3
  • 4. 24-06-2010 Enablers and Techniques  Ontology: Agreement with Common Vocabulary & Domain Knowledge; Schema + Knowledge base  Semantic Annotation (meatadata Extraction): 7 Manual, Semi-automatic (automatic with human verification), Automatic  Reasoning/computation: semantics enabled search, integration, complex queries, analysis (paths, subgraph), pattern finding, mining, hypothesis validation, discovery, visualization © Canopus Consulting A Typical Enterprise SW Application Lifecycle  Build Ontology  Build Schema (model level representation  Populate with Knowledgebase (people, location, organizations, events)  Automatic Semantic Annotation (Extract Semantic 8 Metadata)  Any type of document, multiple sources of documents  Metadata can be stored with or sparely from documents  Applications:  search (ranked list of documents of interest (semantic search), integrate/portal, summarize/explain, analyze, make decisions  Reasoning techniques: graph analysis, inferencing Types of content/documents, Use of standards, Scalability, Performance © Canopus Consulting 4
  • 5. 24-06-2010 Semagix Freedom Architecture: for building ontology-driven information system 9 © Semagix, Inc. Managing Semantic Content on the Web © Canopus Consulting Building ontology  Three broad approaches:  Option 1: social process/manual: many years, committees  Can be based on metadata standard  Option 2: automatic taxonomy generation (statistical clustering/NLP): 10 limitation/problems on quality, dependence on corpus, naming  Option 3: Descriptional component (schema) designed by domain experts; Description base (assertional component, extension) by automated processes  Option 2 is being investigated in several research projects;  Option 3 is currently supported by technologies such as Semagix Freedom © Canopus Consulting 5
  • 6. 24-06-2010 Ontology Examples  Time, Space  Gene Ontology, Glycomics  Pharma Drug, Treatment-Diagnosis  Repertoire Management  Equity Markets  Anti-money Laundering, Financial Risk, Terrorism 11  Can be Public, Government, Limited Availability, Commercial © Canopus Consulting Ontology Language/ Representation Spectrum Modal Logic First Order Logic Logical Theory Is Disjoint Subclass of Description Logic with transitivity DAML+OIL, OWL, UML property Conceptual Model 12 RDFS, XTM Is Subclass of Extended ER Semantic Interoperability Thesaurus ER Has Narrower Meaning Than DB Schemas, XML Schema Structural Interoperability Taxonomy Relational Model Is Sub-Classification of XML Syntactic Interoperability © Canopus Consulting 6
  • 7. 24-06-2010 Large Scale Systems already in use  UMLS – A high level schema of the biomedical domain T147—effect T147—induce  136 classes and 49 relationships T147—etiology  Synonyms of all relationship – using variant lookup T147—cause T147—effecting 13 (tools from NLM) T147—induced  MeSH  Terms already asserted as instance of one or more classes in UMLS  PubMed  Abstracts annotated with one or more MeSH terms © Canopus Consulting Example PubMed abstract (for the domain expert) Abstract 14 Classification/Annotation © Canopus Consulting 7
  • 8. 24-06-2010 Method – Parse Sentences in PubMed 15 SS-Tagger (University of Tokyo) SS-Parser (University of Tokyo) (TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) (JJ exogenous) ) (NN stimulation) ) (PP (IN by) (NP (NN estrogen) ) ) ) (VP (VBZ induces) (NP (NP (JJ adenomatous) (NN hyperplasia) ) (PP (IN of) (NP (DT the) (NN endometrium) ) ) ) ) ) ) REmBRANDTS: REtrieval, BRowsing, Analytics and kNowledge Discovery over Text using Semantics © Canopus Consulting Method – Identify entities and Relationships in Parse Tree Modifiers Modified entities Composite Entities 16 © Canopus Consulting 8
  • 9. 24-06-2010 17 SEMANTIC WEB IN PRACTICE © Canopus Consulting Semantic Web Infrastructure  Triple stores  RDFizers  Ontology Editors / Reasoning Systems  Application Frameworks 19 © Canopus Consulting 9
  • 10. 24-06-2010 Triple Stores • Oracle RDF Data Model • 3Store • Profium Metadata Server • Aduna • RDF Gateway • AllegroGraph • RDFStore 20 • Boca • Sesame • Joseki • Virtuoso • Kowari • YARS • Mulgara There are many others available too… © Canopus Consulting Boca, IBM 21 Source: IBM © Canopus Consulting 10
  • 11. 24-06-2010 RDF Data Model, Oracle  Object-relational implementation  Set of triples form an RDF/OWL graph (model)  Optimized storage structure: repeated values stored only once  Can handle multiple lexical forms of the same value  Incremental load and bulk load 22  SPARQL-like graph pattern embedded in SQL query  Native inferencing for RDF, RDFS & user-defined rules  Support for OWL and Semantic Operators in the next release © Canopus Consulting Virtuoso, OpenLink • Hybrid Data Server that combines SQL, RDF, XML, and Full Text Data Management • Includes a Virtual / Federated DBMS Layer that enables transparent access to data from 3rd party SQL, RDF, XML, and Web Services • Produces RDF Instance Data in Physical and Virtual forms from local 23 or 3rd party data sources • Provides full support for the SPARQL Query Language against Physical and Virtual RDF Graphs • Query Optimizer is specifically tuned for high-performance data access across all realms • Includes in-built middleware for producing RDF instance data on- the-fly from non RDF Data Sources (e.g. (X)HTML, Microformats, Web Services, Binary Files) © Canopus Consulting 11
  • 12. Adapting SQL Databases
      (Figure. Source: Tim Berners-Lee)
    Mapping Relational to RDF
      • D2RQ
      • SquirrelRDF
      • DartGrid
      • SPASQL
      (Figure. Source: DartGrid)
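The general idea behind mapping relational data to RDF can be sketched as follows: a table becomes a class, a row becomes a subject, and each column becomes a predicate. This is a minimal illustration using Python's built-in `sqlite3`, not the mapping language or API of D2RQ, SquirrelRDF, DartGrid, or SPASQL; the `BASE` namespace, the `product` table, and the function name are assumptions.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace for generated URIs


def rows_to_triples(conn, table, key):
    """Yield (subject, predicate, object) strings for every row of a table."""
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"<{BASE}{table}/{record[key]}>"
        # "a" is the Turtle shorthand for rdf:type.
        yield (subject, "a", f"<{BASE}{table}>")
        for col, value in record.items():
            if col != key and value is not None:
                # Plain literals only; quoting/escaping is omitted in this sketch.
                yield (subject, f"<{BASE}{table}#{col}>", f'"{value}"')


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.execute("INSERT INTO product VALUES (1, 'Widget', 9.5)")

triples = list(rows_to_triples(conn, "product", "id"))
for s, p, o in triples:
    print(s, p, o, ".")
```

Tools in this space typically expose such a mapping as a virtual graph that is queried on demand, rather than dumping every row up front as this sketch does.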
  • 13. RDFizers
      • Relational -> RDF
      • Java -> RDF
      • XML -> RDF
      • Weather -> RDF
      • Excel -> RDF
      • Palm -> RDF
      • JPEG -> RDF
      • Outlook -> RDF
      • BibTEX -> RDF
      • Flickr -> RDF
      A directory of RDFizers is provided at: http://simile.mit.edu/wiki/RDFizers
    Ontology Editors and Environments
      • Protégé, SWOOP, GrOWL, TopBraid, Ontotrack, SemanticWorks, ...
      (Figure. Source: Ian Horrocks)
  • 14. Reasoning Systems
      • CEL
      • Pellet
      • KAON2
    Semantic Web Tools
      • RDF programming environment for 14+ languages
          • C, C++, C# and .Net, Haskell, Java, Javascript, Lisp, Obj-C, PHP, Perl, Prolog, Python, Ruby, Tcl/Tk
      • Selection of on-line validators
          • BBN OWL Validator, OWL Consistency Checker, WonderWeb OWL-DL Validator, RDF Validator, RDF/XML & N3 Validator, ConsVISor OWL Consistency Checker
      • SPARQL Endpoints
          • SPARQLer, SPARQLette, XML Army Knife, OpenLink Virtuoso
      • Semantic Web Crawlers
          • Swoogle, SWSE, Zitgist
  • 15. Semantic Web Tools
      • RDF Browsers
          • BrowseRDF, /facet, Longwell, mSpace, Siderean Software, Exhibit
      • Semantic Web Browsers
          • DISCO, ObjectViewer, OpenLink RDF Browser, Tabulator Browser, Haystack
      • Labeling
          • Adobe XMP
      • Information Extraction
          • Amilcare, Language and Computing
      • Visualization
          • IsaViz, Perfuse, Tom Sawyer, RDF-Gravity
      • Relationship Analytics
          • Cogito
      • Content Management
          • Profium Semantic Information Router
      • Information Integration
          • Ontoprise, Software AG, @Semantics, webMethods, Revelytix, Ontology Works
      Over 500 tools are now available
    Lists of Tools
      • http://sites.wiwiss.fu-berlin.de/suhl/bizer/toolkits/index.htm
      • http://esw.w3.org/topic/SemanticWebTools
      • http://www.mkbergman.com/?p=291
      • http://planetrdf.com/guide/
      • http://www.sekt-project.org/resources/sekt_components.html
  • 16. How to get RDF Data?
      • Write your own RDF in your preferred syntax
      • Add RDF to XML directly (in its own namespace), e.g. in SVG
      • Use intelligent scrapers or wrappers to extract structure from a Web page and then generate RDF automatically (e.g. via an XSLT script)
      • Formalize the scraper approach with GRDDL
      • RDFa extends (X)HTML by defining general attributes to add metadata to any element
      • Create a bridge to relational databases
      • Use bridges from other data sources
    RDF Data
      • Annotea Bookmark
      • BIND File
      • BrainPharm
      • DBLP
      • Entrez Gene
      • dbpedia
      • HIVSDB
      • dbtune
      • KEGG
      • Geonames
      • NeuroNames
      • MusicBrainz
      • RDF Book Mashup
      • Reactome
      • Revyu
      • SenseLab
      • US Census Data
      • SWAN publication & hypothesis
      • WordNet
      • UniProt
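The RDFa route listed above, where attributes on ordinary (X)HTML elements carry the metadata, can be illustrated with a deliberately small sketch. It handles only the `about`, `property`, and `content` attributes and ignores nesting, prefixes, and the rest of the RDFa processing rules, so it is not a conformant RDFa processor. The class name `TinyRDFaExtractor` and the `example.org` URI are hypothetical; the title, author, and date are taken from this deck's cover slide.

```python
from html.parser import HTMLParser


class TinyRDFaExtractor(HTMLParser):
    """Collect (subject, property, value) triples from a few RDFa attributes."""

    def __init__(self):
        super().__init__()
        self.subject = ""    # most recent `about` value seen
        self.pending = None  # property waiting for the element's text
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self.subject = attrs["about"]
        if "property" in attrs:
            if "content" in attrs:
                # Value given explicitly in the `content` attribute.
                self.triples.append(
                    (self.subject, attrs["property"], attrs["content"]))
            else:
                # Value is the element's text, delivered via handle_data.
                self.pending = attrs["property"]

    def handle_data(self, data):
        if self.pending and data.strip():
            self.triples.append((self.subject, self.pending, data.strip()))
            self.pending = None


PAGE = """
<div about="http://example.org/talk">
  <h1 property="dc:title">Enterprising Semantics</h1>
  <span property="dc:creator">Nagaraju Pappu</span>
  <meta property="dc:date" content="2010-06-24"/>
</div>
"""

extractor = TinyRDFaExtractor()
extractor.feed(PAGE)
for triple in extractor.triples:
    print(triple)
```

The same page remains ordinary HTML for a browser, which is the point of the RDFa approach: one document serves both human readers and machines.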
  • 17. Vocabularies
      • eClassOwl: eBusiness ontology for products and services
      • Gene Ontology: describes genes and gene products
      • BioPAX: for biological pathway data
      • SKOS core: describes knowledge systems, thesauri, glossaries
      • Dublin Core: about information resources, digital libraries, with extensions for rights, permissions, digital rights management
      • FOAF: about people and their organizations
      • DOAP: on the descriptions of software products
      • Music Ontology: describes CDs, music tracks, etc.
      • SIOC: for Semantically-Interlinked Online Communities
      Source: Ivan Herman
    Collateral
      • Much good information at W3C
          • http://www.w3.org/2001/sw/
      • New FAQ on the Semantic Web
          • http://www.w3.org/2001/sw/SW-FAQ
      • Semantic Web Case Studies and Use Cases
          • http://www.w3.org/2001/sw/sweo/public/UseCases
      • List of Semantic Web books
          • http://esw.w3.org/topic/SwBooks
      • Dave Beckett’s Resources
      • PlanetRDF: a blog aggregator on Semantic Web topics
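Reusing shared vocabularies such as FOAF and Dublin Core mostly means using their published term URIs in your own triples. The sketch below writes a few statements in N-Triples syntax using the FOAF and Dublin Core element namespaces. The two `example.org` resource URIs and the helper names are hypothetical; the name and title come from this deck's cover slide.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
DC = "http://purl.org/dc/elements/1.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def iri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (s, p, o) terms as one N-Triples statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


author = iri("http://example.org/people/nagaraju-pappu")        # hypothetical URI
deck = iri("http://example.org/slides/enterprising-semantics")  # hypothetical URI

triples = [
    (author, iri(RDF_TYPE), iri(FOAF + "Person")),
    (author, iri(FOAF + "name"), literal("Nagaraju Pappu")),
    (deck, iri(DC + "title"), literal("Enterprising Semantics")),
    (deck, iri(DC + "creator"), author),
]

output = to_ntriples(triples)
print(output)
```

Because the predicates are shared terms rather than local column names, any FOAF-aware or Dublin Core-aware tool can interpret these statements without a custom mapping.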
  • 18. Public Fora at W3C
      • Semantic Web Interest Group
          • A forum for developers with an archived mailing list, and a constant IRC presence on freenode.net#swig
      • Semantic Web for Health Care & Life Sciences: SW-HCLS
      • Semantic Web Deployment Working Group
          • Archives of working group are public
      • Semantic Web Education and Outreach IG
          • Community Projects
              • Whitelisting Email Senders with FOAF
              • Linking Open Data on the Semantic Web
              • Knowee Contact Organizer
              • POWDER Browser Extension
    Pointers for Getting Going
      • Use robust URIs
      • Reuse existing data and ontologies
      • A little semantics goes a long way
      • Model the real world rather than data artifacts
      • Build upon your infrastructure incrementally
  • 19. Books
      • Programming The Semantic Web, O'Reilly Media Inc
          • Authors: Toby Segaran, Colin Evans & Jamie Taylor
      • Social Networks and the Semantic Web, Springer
          • Author: Peter Mika
      • Semantic Web for the Working Ontologist, Morgan Kaufmann Publishers
          • Authors: Dean Allemang, James Hendler
      • Introduction to the Semantic Web and Semantic Web Services, Chapman & Hall
          • Author: Liyang Yu
      • Semantic Web-Based Information Systems: State-of-the-Art Applications, Cybertech Publishing
          • Amit Sheth, Miltiadis Lytras
    Summary
      • Many Semantic Web tools are available
      • Data and vocabularies are increasingly being made available in RDF/OWL
      • Many books, tutorials and overviews are available to help you get going
      • Several public fora exist for community activities