Digital Enterprise Research Institute                                                               www.deri.ie




                Lessons and requirements from
                     a decade of deployed
                      Semantic Web apps
                          Benjamin Heitmann, Richard Cyganiak,
                               Conor Hayes, Stefan Decker
                                       Funded by Science Foundation Ireland under
                                           Grant No. SFI/08/CE/I1380 (Líon-2)
© Copyright 2011 Digital Enterprise Research Institute. All rights reserved.




                                                                               Enabling Networked Knowledge
Input for this workshop



    LEDP workshop CfP calls for:
        requirements
        patterns
        gaps in Linked Data standards + guidelines

    Where should this input come from?




The Semantic Web: a decade is a long time




                 2001                                            2011
Choice of methodology?



    Goal:
        patterns, requirements and gaps regarding LD
    Data:
        10 years of Semantic Web research

    Which scientific approach fits?
        Empirical software engineering

    Full IEEE transactions journal paper:
        http://tinyurl.com/semweblessons

Overview




    Empirical survey
        Architecture: arch. pattern
        LD standards: gaps
        Software Eng. Process: shortcomings
    Software engineering solutions

Empirical survey



    Sources: 124 apps total
        Semantic Web Challenge (ISWC), 2003-2009: 101 apps
        Scripting for SemWeb Challenge (ESWC), 2006-2009: 23 apps
        includes industry & research apps
    Checklist (12 questions)
    Data collection:
        1. own analysis of paper
        2. validation by email

Empirical survey results




    widespread support for SemWeb specific features
    clear difference to database-driven apps
    big uptake of Linked Data principles and eco-system
    integration requires human intervention
    top 3 standards: RDF, OWL, SPARQL
    top 3 vocabularies: FOAF, DC, SIOC




Conceptual architecture




    Conceptual architecture:
        describes major design elements of a system (+ relations)
        domain specific (e.g. the Semantic Web)
        provides architectural pattern
        documents community consensus




Components of conceptual architecture


    starting point:      decouple + specialise:
    RDF data handling    Graph access layer (100%), RDF store (88%), Graph query language service (77%)
    Data integration     Data homogenisation service (74%), Data discovery service (30%)
    User interface       Graph-based navigation interface (91%), Structured data authoring interface (29%)

LD gaps: publishing/consuming




    all applications consume RDF
    73% import API, 69% export API
    but: incompatible implementations
    LD principles in 2006 led to consolidation

    embedding RDF:
        web for humans vs. web for machines
        2008: introduction of RDFa


LD gaps: beyond open data



    writing/changing/updating RDF data is difficult
    71% of apps do not support data changes

    Writing to remote RDF store:
        draft status in 2011: SPARQL Update
    Restricting access (read/write):
        no standards
        no interoperability
        closest ideas (?): R/W design note, WebID
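
Writing to a remote RDF store can be illustrated with a request built for SPARQL Update. The following is a minimal sketch, not something taken from the surveyed applications: the endpoint URL and the example.org resource are hypothetical, and the request follows the form-encoded `update=` style of the SPARQL 1.1 Protocol. The request is only constructed, never sent, and nothing here addresses access restriction, for which the slide notes that no standard exists.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request

# Hypothetical endpoint; not taken from the slides.
ENDPOINT = "http://example.org/sparql/update"

def build_update_request(subject, predicate, literal):
    """Build (but do not send) a POST request that inserts one triple.

    The literal is inserted verbatim; a real client would have to escape
    quotes and backslashes before building the update string.
    """
    update = 'INSERT DATA { <%s> <%s> "%s" . }' % (subject, predicate, literal)
    body = urlencode({"update": update}).encode("utf-8")
    return Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

req = build_update_request(
    "http://example.org/people/alice",
    "http://xmlns.com/foaf/0.1/name",
    "Alice",
)
print(req.get_method())
# Decode the form body again to show the update string that would be sent.
print(parse_qs(req.data.decode("utf-8"))["update"][0])
```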


Software Eng. process shortcomings (1)


    Integrating noisy RDF data:
        60% semi-automatic integration
        this involves human intervention
        only 20% use automatic heuristics
            major part of Semantic Web specific code
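
The difference between automatic heuristics and semi-automatic integration can be sketched as follows. This is an illustrative example, not a method reported in the survey: the similarity measure (`difflib.SequenceMatcher` from the Python standard library), both thresholds, and the sample labels are assumptions. Pairs above the upper threshold are merged automatically, and borderline pairs are queued for a human decision.

```python
from difflib import SequenceMatcher

# Assumed thresholds, chosen only for illustration.
AUTO_MERGE = 0.95
NEEDS_REVIEW = 0.60

def similarity(a, b):
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def classify(pairs):
    """Sort candidate label pairs into merge / review / distinct buckets."""
    result = {"merge": [], "review": [], "distinct": []}
    for a, b in pairs:
        score = similarity(a, b)
        if score >= AUTO_MERGE:
            result["merge"].append((a, b))      # automatic heuristic decides
        elif score >= NEEDS_REVIEW:
            result["review"].append((a, b))     # human intervention needed
        else:
            result["distinct"].append((a, b))
    return result

candidates = [
    ("Semantic Web Challenge", "semantic web challenge"),
    ("Semantic Web Challenge", "SemWeb Challenge"),
    ("Semantic Web Challenge", "Linked Data"),
]
buckets = classify(candidates)
for name, items in buckets.items():
    print(name, items)
```

With these sample labels, only the pair that differs in case is merged without review; the abbreviated label ends up in the review queue.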


    Distribution of application logic:
        multiple components and standards
        queries (41%), rules (52%) or formal vocabularies
        hard to maintain




Software Eng. process shortcomings (2)



    Mismatch of data models between components
        graph versus relational or object oriented (90%)
        overhead in communication
        inconsistent round-trip conversion
        3-way ORM needed?

    [Figure: three data models: graph-based, relational, object oriented]
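
The inconsistent round-trip conversion listed on this slide can be shown with a small example. This sketch assumes a naive mapper that turns the triples about one subject into a dictionary with one value per property, as a simple object model might; the mapper and the sample data are hypothetical and not taken from any surveyed application.

```python
def triples_to_object(triples, subject):
    """Naive mapping: one attribute per predicate."""
    obj = {}
    for s, p, o in triples:
        if s == subject:
            obj[p] = o  # a second value for the same predicate replaces the first
    return obj

def object_to_triples(obj, subject):
    """Map the object back to a set of triples."""
    return {(subject, p, o) for p, o in obj.items()}

# A graph may hold several values for one property of the same subject.
graph = {
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:knows", "ex:carol"),
}

round_trip = object_to_triples(triples_to_object(graph, "ex:alice"), "ex:alice")
lost = graph - round_trip
print(len(graph), len(round_trip), len(lost))  # prints "3 2 1"
```

One of the two `foaf:knows` triples is lost on the way back, because the object model in this sketch has no place for a multi-valued property.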


Software Eng. solutions (1)




    More guidelines, best practices and design patterns:
        current examples:
            – Linked Data principles and publishing guidelines
            – guidelines for naming of URIs
            – Linked Data patterns collection
        result: more interoperability, more coherent Web of Data




Software Eng. solutions (2)




    More software libraries (beyond RDF storage!)
        guidelines can be hardcoded in reusable libraries
        good libraries can make complicated guidelines easy to use (see HTTP, SSL, SMTP and DNS lookups)
        current examples:
            – any23, d2r server, Semantic Web Client Library



Software Eng. solutions (3)


    More software factories:
        create complete applications
        requires patterns + libraries
        or: “opinionated software”

        components can be customised for domain
        interface, homogenisation and data discovery usually made from scratch

    https://developers.facebook.com/docs/beta/opengraph/tutorial/

Summary




    Empirical survey
        Architecture: arch. pattern
        LD standards: gaps
        Software Eng. Process: shortcomings
    Software engineering solutions

    Full article: http://tinyurl.com/semweblessons

Appendix: threats to validity



    Representativeness:
        only complete applications were part of the challenges (not tools or libraries)
        apps needed to use real-world data
        submission of a paper describing the app was required
        the challenges extend over multiple years, which allows trends to be seen
    Number of authors who verified checklist (65%):
        academic email addresses expire quickly
        we manually tried to find new email addresses
    No source code was used:
        source code was not required for challenges due to e.g. IP issues

Table: Impl. details




                         2003         2004         2005         2006         2007         2008         2009         overall

    Programming          Java 60%     Java 56%     Java 66%     Java 10%     Java 50%     Java 43%     Java 46%     Java 48%
    languages            C 20%        JS 12%                    JS 15%       PHP 25%      PHP 21%      JS 23%       PHP 19%
                                                                PHP 26%                                PHP 23%      JS 13%

    RDF libraries        -            Jena 18%     -            RAP 15%      Sesame 33%   Sesame 17%   Sesame 23%   Sesame 19%
                                      Sesame 12%                RDFLib 10%   Jena 8%      ARC 17%                   Jena 9%
                                      Lucene 18%                                          Jena 13%

    SemWeb standards     RDF 100%     RDF 87%      RDF 66%      RDF 89%      RDF 100%     RDF 100%     RDF 100%     RDF 96%
                         OWL 30%      RDFS 37%     OWL 66%      OWL 42%      SPARQL 50%   SPARQL 17%   SPARQL 69%   OWL 43%
                                      OWL 37%      RDFS 50%     SPARQL 15%   OWL 41%      OWL 10%      OWL 46%      SPARQL 41%

    Schemas/             RSS 20%      DC 12%       -            FOAF 26%     FOAF 41%     FOAF 30%     FOAF 34%     FOAF 27%
    vocabularies/        FOAF 20%     SWRC 12%                  RSS 15%      DC 20%       DC 21%       DC 15%       DC 13%
    ontologies           DC 20%                                 Bibtex 10%   SIOC 20%     DBpedia 13%  SKOS 15%     SIOC 7%




Tables: Data integration and other properties



    Data integration         2003   2004   2005   2006   2007   2008   2009
    manual                   30%    13%    0%     16%    9%     5%     4%
    semi-automatic           70%    31%    100%   47%    58%    65%    61%
    automatic                0%     25%    0%     11%    13%    4%     19%
    not needed               0%     31%    0%     26%    20%    26%    16%

    Other properties         2003   2004   2005   2006   2007   2008   2009
    Data creation            20%    37%    50%    52%    37%    52%    76%
    Data import              70%    50%    83%    52%    70%    86%    73%
    Data export              70%    56%    83%    68%    79%    86%    73%
    Inferencing              60%    68%    83%    57%    79%    52%    42%
    Decentralised sources    90%    75%    100%   57%    41%    95%    96%
    Multiple owners          90%    93%    100%   89%    83%    91%    88%
    Heterogeneous formats    90%    87%    100%   89%    87%    78%    88%
    Data updates             90%    75%    83%    78%    45%    73%    50%
    Linked Data principles   0%     0%     0%     5%     25%    26%    65%


Table: architectural analysis




    year    number of      graph     RDF      graph-based   data             graph query   structured data   data
            applications   access    store    navigation    homogenisation   language      authoring         discovery
                           layer              interface     service          service       interface         service
    2003    10             100%      80%      90%           90%              80%           20%               50%
    2004    16             100%      94%      100%          50%              88%           38%               25%
    2005    6              100%      100%     100%          83%              83%           33%               33%
    2006    19             100%      95%      89%           63%              68%           37%               16%
    2007    24             100%      92%      96%           88%              88%           33%               54%
    2008    23             100%      87%      83%           70%              78%           26%               30%
    2009    26             100%      77%      88%           80%              65%           19%               15%
    total   124            100%      88%      91%           74%              77%           29%               30%
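
The totals row can be cross-checked against the yearly rows: each total should equal the average of the yearly percentages, weighted by the number of applications per year. The sketch below copies its values from the table; the variable and function names are illustrative, and the component order is graph access layer, RDF store, graph-based navigation interface, data homogenisation service, graph query language service, structured data authoring interface, data discovery service.

```python
# (year, number of applications, percentage per component), copied from the table.
ROWS = [
    (2003, 10, [100, 80, 90, 90, 80, 20, 50]),
    (2004, 16, [100, 94, 100, 50, 88, 38, 25]),
    (2005, 6, [100, 100, 100, 83, 83, 33, 33]),
    (2006, 19, [100, 95, 89, 63, 68, 37, 16]),
    (2007, 24, [100, 92, 96, 88, 88, 33, 54]),
    (2008, 23, [100, 87, 83, 70, 78, 26, 30]),
    (2009, 26, [100, 77, 88, 80, 65, 19, 15]),
]
# The "total" row of the table, without the application count.
TOTALS = [100, 88, 91, 74, 77, 29, 30]

def weighted_totals(rows):
    """Average each column, weighting every year by its number of applications."""
    n_apps = sum(n for _, n, _ in rows)
    n_cols = len(rows[0][2])
    return [
        round(sum(n * pcts[i] for _, n, pcts in rows) / n_apps)
        for i in range(n_cols)
    ]

total_apps = sum(n for _, n, _ in ROWS)
computed = weighted_totals(ROWS)
print(total_apps, computed)
```

The recomputed values agree with the totals row after rounding to whole percentages, and the yearly application counts add up to 124.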





More Related Content

PDF
Enabling Case-Based Reasoning on the Web of Data (How to create a Web of Exp...
PDF
Implementing Semantic Web applications: reference architecture and challenges
PDF
An architecture for privacy-enabled user profile portability on the Web of Data
KEY
What your hairstyle says about your political preferences, and why you should...
PDF
Transitioning web application frameworks towards the Semantic Web (master the...
PDF
RDFa: putting RDF on the Web
PDF
Presentation of current research: distributed architecture for recommendation...
PPT
Linked Open Data
Enabling Case-Based Reasoning on the Web of Data (How to create a Web of Exp...
Implementing Semantic Web applications: reference architecture and challenges
An architecture for privacy-enabled user profile portability on the Web of Data
What your hairstyle says about your political preferences, and why you should...
Transitioning web application frameworks towards the Semantic Web (master the...
RDFa: putting RDF on the Web
Presentation of current research: distributed architecture for recommendation...
Linked Open Data

What's hot (20)

PDF
Swap2010 agave
PDF
Semantic Desktop
ODP
Interlinking Personal Semantic Data on the Semantic Desktop and the Web of Data
PPTX
Turning social disputes into knowledge representations DERI reading group 201...
PPTX
LEAD - Learning Design – Design For Learning -project presentation
DOC
Itgs scheme 2011-2012
PPTX
Leveraging Matching Dependencies for Guided User Feedback in Linked Data Appl...
PDF
Internet Science
PPTX
One-stop shop for software development information
PPTX
Twitter and research impact
PDF
A PLATFORM FOR LEARNING INTERNET OF THINGS
PDF
Federating Distributed Social Data to Build an Interlinked Online Information...
PPTX
Making sense out of disagreement, University of Limerick Interaction Design C...
PDF
6 - Making Information Pay 2011 -- SOLOMON, MADI (Pearson)
PDF
Service Integration - A Web of Things Perspective
PDF
23625509 internetworking-technologies
ODP
Knowledge management on the desktop
PPTX
IUI 2010: An Informal Summary of the International Conference on Intelligent ...
 
PPTX
PDF
Dagstuhl 2010 - Kalman Graffi - Alternative, more promising IT Paradigms for ...
Swap2010 agave
Semantic Desktop
Interlinking Personal Semantic Data on the Semantic Desktop and the Web of Data
Turning social disputes into knowledge representations DERI reading group 201...
LEAD - Learning Design – Design For Learning -project presentation
Itgs scheme 2011-2012
Leveraging Matching Dependencies for Guided User Feedback in Linked Data Appl...
Internet Science
One-stop shop for software development information
Twitter and research impact
A PLATFORM FOR LEARNING INTERNET OF THINGS
Federating Distributed Social Data to Build an Interlinked Online Information...
Making sense out of disagreement, University of Limerick Interaction Design C...
6 - Making Information Pay 2011 -- SOLOMON, MADI (Pearson)
Service Integration - A Web of Things Perspective
23625509 internetworking-technologies
Knowledge management on the desktop
IUI 2010: An Informal Summary of the International Conference on Intelligent ...
 
Dagstuhl 2010 - Kalman Graffi - Alternative, more promising IT Paradigms for ...
Ad

Similar to Lessons and requirements from a decade of deployed Semantic Web apps (20)

PPTX
Approximate Semantic Matching of Heterogeneous Events
PDF
Approximate Semantic Matching of Heterogeneous Events
PPTX
Building Optimisation using Scenario Modeling and Linked Data
PDF
Web Science: Motivation, Goals and Contributions
PDF
Lsms SUPSI DTI ISIN
PPTX
IT Academy WorkShop 14/05/2012
PDF
From research to business: the Web of linked data
PDF
System of Systems Information Interoperability using a Linked Dataspace
PPT
Taming digital traces for informal learning dhaval
PDF
SharePoint Saturdays_ECM_SCN20_Webinar
PDF
Leveraging existing Web Frameworks for a SIOC explorer (Scripting for the Sem...
PDF
Making Conversations Visible
PPTX
Towards Expertise Modelling for Routing Data Cleaning Tasks within a Communit...
KEY
Data-Intensive Research
PDF
Big Data Beyond Hadoop*: Research Directions for the Future
PPTX
Cisco Presentation 1
PDF
1109 siit jfriedrich v02
PPT
Cloud computingjun28
PPT
Cloud computingjun28
PDF
Mining Development Repositories to Study the Impact of Collaboration on Softw...
Approximate Semantic Matching of Heterogeneous Events
Approximate Semantic Matching of Heterogeneous Events
Building Optimisation using Scenario Modeling and Linked Data
Web Science: Motivation, Goals and Contributions
Lsms SUPSI DTI ISIN
IT Academy WorkShop 14/05/2012
From research to business: the Web of linked data
System of Systems Information Interoperability using a Linked Dataspace
Taming digital traces for informal learning dhaval
SharePoint Saturdays_ECM_SCN20_Webinar
Leveraging existing Web Frameworks for a SIOC explorer (Scripting for the Sem...
Making Conversations Visible
Towards Expertise Modelling for Routing Data Cleaning Tasks within a Communit...
Data-Intensive Research
Big Data Beyond Hadoop*: Research Directions for the Future
Cisco Presentation 1
1109 siit jfriedrich v02
Cloud computingjun28
Cloud computingjun28
Mining Development Repositories to Study the Impact of Collaboration on Softw...
Ad

Recently uploaded (20)

PPTX
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
PPTX
school management -TNTEU- B.Ed., Semester II Unit 1.pptx
PDF
Pre independence Education in Inndia.pdf
PPTX
Pharmacology of Heart Failure /Pharmacotherapy of CHF
PPTX
Renaissance Architecture: A Journey from Faith to Humanism
PPTX
Final Presentation General Medicine 03-08-2024.pptx
PDF
FourierSeries-QuestionsWithAnswers(Part-A).pdf
PDF
Abdominal Access Techniques with Prof. Dr. R K Mishra
PDF
O5-L3 Freight Transport Ops (International) V1.pdf
PDF
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
PDF
TR - Agricultural Crops Production NC III.pdf
PPTX
Pharma ospi slides which help in ospi learning
PPTX
Cell Structure & Organelles in detailed.
PDF
Supply Chain Operations Speaking Notes -ICLT Program
PPTX
Cell Types and Its function , kingdom of life
PPTX
GDM (1) (1).pptx small presentation for students
PDF
Computing-Curriculum for Schools in Ghana
PDF
Insiders guide to clinical Medicine.pdf
PDF
VCE English Exam - Section C Student Revision Booklet
PDF
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
school management -TNTEU- B.Ed., Semester II Unit 1.pptx
Pre independence Education in Inndia.pdf
Pharmacology of Heart Failure /Pharmacotherapy of CHF
Renaissance Architecture: A Journey from Faith to Humanism
Final Presentation General Medicine 03-08-2024.pptx
FourierSeries-QuestionsWithAnswers(Part-A).pdf
Abdominal Access Techniques with Prof. Dr. R K Mishra
O5-L3 Freight Transport Ops (International) V1.pdf
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
TR - Agricultural Crops Production NC III.pdf
Pharma ospi slides which help in ospi learning
Cell Structure & Organelles in detailed.
Supply Chain Operations Speaking Notes -ICLT Program
Cell Types and Its function , kingdom of life
GDM (1) (1).pptx small presentation for students
Computing-Curriculum for Schools in Ghana
Insiders guide to clinical Medicine.pdf
VCE English Exam - Section C Student Revision Booklet
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx

Lessons and requirements from a decade of deployed Semantic Web apps

  • 1. Digital Enterprise Research Institute www.deri.ie Lessons and requirements from a decade of deployed Semantic Web apps Benjamin Heitmann, Richard Cyganiak, Conor Hayes, Stefan Decker Funded by Science Foundation Ireland under Grant No. SFI/08/CE/I1380 (Líon-2) © Copyright 2011 Digital Enterprise Research Institute. All rights reserved. Enabling Networked Knowledge
  • 2. Input for this workshop Digital Enterprise Research Institute www.deri.ie  LEDP workshop CfP calls for:  requirements  patterns  gaps in Linked Data standards + guidelines  Where should this input come from ? Enabling Networked Knowledge Benjamin Heitmann, slide: 2/17
  • 3. The Semantic Web: a decade is a long time Digital Enterprise Research Institute www.deri.ie 2001 2011 Enabling Networked Knowledge Benjamin Heitmann, slide: 3/17
  • 4. Choice of methodology? Digital Enterprise Research Institute www.deri.ie  Goal:  patterns, requirements and gaps regarding LD  Data:  10 years of Semantic Web research  Which scientific approach fits ?  Empirical software engineering  Full IEEE transactions journal paper: http://tinyurl.com/semweblessons Enabling Networked Knowledge Benjamin Heitmann, slide: 4/17
  • 5. Overview
      - Empirical survey
      - Architecture: arch. pattern
      - LD standards: gaps
      - Software Eng. Process: shortcomings
      - Software engineering solutions
  • 6. Empirical survey
      - Sources: 124 apps total
          - Semantic Web Challenge (ISWC): 2003-2009, 101 apps
          - Scripting for SemWeb Challenge (ESWC): 2006-2009, 23 apps
          - includes industry & research apps
      - Checklist (12 questions)
      - Data collection:
          1. own analysis of paper
          2. validation by email
  • 7. Empirical survey results
      - widespread support for SemWeb specific features
      - clear difference to database-driven apps
      - big uptake of Linked Data principles and ecosystem
      - integration requires human intervention
      - top 3 standards: RDF, OWL, SPARQL
      - top 3 vocabularies: FOAF, DC, SIOC
  • 8. Conceptual architecture
      - Conceptual architecture:
          - describes major design elements of a system (+ relations)
          - domain specific (e.g. the Semantic Web)
          - provides architectural pattern
          - documents community consensus
  • 9. Components of conceptual architecture
      - starting point: decouple + specialise
      - RDF data handling: graph access layer (100%), RDF store (88%), graph query language service (77%)
      - Data integration: data homogenisation service (74%), data discovery service (30%)
      - User interface: graph-based navigation interface (91%), structured data authoring interface (29%)
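
The component shares above can be held in a small data structure to see which parts of the architecture nearly every surveyed app implements. This is only an illustrative sketch: the `Component` class, the helper names and the 50% cut-off are assumptions, not part of the survey.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    share: int  # percentage of the 124 surveyed apps implementing it


# Shares as reported on the slide.
COMPONENTS = [
    Component("graph access layer", 100),
    Component("RDF store", 88),
    Component("graph query language service", 77),
    Component("data homogenisation service", 74),
    Component("data discovery service", 30),
    Component("graph-based navigation interface", 91),
    Component("structured data authoring interface", 29),
]


def by_adoption(components):
    """Return components sorted from most to least widely implemented."""
    return sorted(components, key=lambda c: c.share, reverse=True)


def widely_implemented(components, threshold=50):
    """Names of components above `threshold` percent (the cut-off is an assumption)."""
    return [c.name for c in by_adoption(components) if c.share > threshold]


if __name__ == "__main__":
    for name in widely_implemented(COMPONENTS):
        print(name)
```

With this cut-off, data discovery and structured data authoring are the two components that fall below the line.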
  • 10. LD gaps: publishing/consuming
      - all applications consume RDF
          - 73% import API, 69% export API
          - but: incompatible implementations
          - LD principles in 2006 led to consolidation
      - embedding RDF:
          - web for humans vs. web for machines
          - 2008: introduction of RDFa
  • 11. LD gaps: beyond open data
      - writing/changing/updating RDF data is difficult
          - 71% of apps do not support data changes
      - Writing to remote RDF store:
          - draft status in 2011: SPARQL Update
      - Restricting access (read/write):
          - no standards
          - no interoperability
          - closest ideas (?): R/W design note, WebID
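
To make "writing to a remote RDF store" concrete, here is a minimal sketch of how a SPARQL Update request can be prepared with the Python standard library. It assumes the store accepts updates by direct HTTP POST with the `application/sparql-update` media type, as defined in the later SPARQL 1.1 Protocol. The endpoint URL and the example resources are hypothetical, and the request is only built, not sent.

```python
from urllib.request import Request


def build_update_request(endpoint: str, subject: str, predicate: str, obj: str) -> Request:
    """Prepare (but do not send) a SPARQL Update request that inserts one triple."""
    update = f"INSERT DATA {{ <{subject}> <{predicate}> <{obj}> . }}"
    return Request(
        endpoint,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )


req = build_update_request(
    "http://example.org/sparql/update",  # hypothetical endpoint
    "http://example.org/alice",          # hypothetical subject
    "http://xmlns.com/foaf/0.1/knows",
    "http://example.org/bob",            # hypothetical object
)

if __name__ == "__main__":
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

The sketch deliberately leaves out authentication and access control, which is the gap the slide points to: there was no standard way to restrict who may send such a request.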
  • 12. Software Eng. process shortcomings (1)
      - Integrating noisy RDF data:
          - 60% semi-automatic integration
          - this involves human intervention
          - only 20% use automatic heuristics
          - major part of Semantic Web specific code
      - Distribution of application logic:
          - multiple components and standards
          - queries (41%), rules (52%) or formal vocabularies
          - hard to maintain
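
A rough sketch of what "semi-automatic integration" can look like in code: predicates with a known mapping are rewritten automatically, and everything else is queued for a human to review. The mapping table, the `example.org` predicates and the function name are hypothetical; real applications in the survey used their own heuristics.

```python
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"

# Hypothetical mapping table: source predicate -> target vocabulary term.
KNOWN_MAPPINGS = {
    "http://example.org/schema/fullName": FOAF_NAME,
    "http://example.org/schema/heading": DC_TITLE,
}


def homogenise(triples, mappings=KNOWN_MAPPINGS):
    """Split triples into automatically mapped ones and ones needing human review."""
    targets = set(mappings.values())
    mapped, needs_review = [], []
    for s, p, o in triples:
        if p in targets:
            mapped.append((s, p, o))            # already uses the target vocabulary
        elif p in mappings:
            mapped.append((s, mappings[p], o))  # automatic rewrite
        else:
            needs_review.append((s, p, o))      # the human-intervention step
    return mapped, needs_review


if __name__ == "__main__":
    sample = [
        ("ex:a", "http://example.org/schema/fullName", "Alice"),
        ("ex:a", "http://example.org/schema/shoeSize", "38"),
    ]
    auto, manual = homogenise(sample)
    print(len(auto), "mapped automatically;", len(manual), "left for review")
```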
  • 13. Software Eng. process shortcomings (2)
      - Mismatch of data models between components
          - graph versus relational or object oriented (90%)
          - overhead in communication
          - inconsistent round-trip conversion
          - 3-way ORM needed?
      - [Diagram: graph-based, object oriented, relational]
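
The "inconsistent round-trip conversion" point can be illustrated with a deliberately naive mapping between a graph and an object. In a graph, a predicate may have several values for the same subject; a mapping that stores one attribute per predicate silently drops all but one. The functions and the `ex:`/`foaf:` shorthand names are illustrative assumptions only.

```python
def triples_to_object(triples, subject):
    """Naive graph -> object mapping: one attribute per predicate."""
    obj = {"id": subject}
    for s, p, o in sorted(triples):  # sorted for a deterministic result
        if s == subject:
            obj[p] = o  # a later value silently overwrites an earlier one
    return obj


def object_to_triples(obj):
    """Object -> graph mapping: one triple per attribute."""
    return {(obj["id"], p, o) for p, o in obj.items() if p != "id"}


graph = {
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:knows", "ex:carol"),
}

round_trip = object_to_triples(triples_to_object(graph, "ex:alice"))
lost = graph - round_trip

if __name__ == "__main__":
    print("lost in round trip:", lost)
```

Here one of the two `foaf:knows` statements does not survive the round trip, which is the kind of inconsistency a mapping layer between graph, object and relational models has to handle explicitly.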
  • 14. Software Eng. solutions (1)
      - More guidelines, best practices and design patterns:
          - current examples:
              – Linked Data principles and publishing guidelines
              – guidelines for naming of URIs
              – Linked Data patterns collection
          - result: more interoperability, more coherent Web of Data
  • 15. Software Eng. solutions (2)
      - More software libraries (beyond RDF storage!)
          - guidelines can be hardcoded in reusable libraries
          - good libraries can make complicated guidelines easy to use (see HTTP, SSL, SMTP and DNS lookups)
          - current examples:
              – Any23, D2R Server, Semantic Web Client Library
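
As a small illustration of a guideline "hardcoded" in a reusable helper, the sketch below prepares an HTTP request for dereferencing a Linked Data URI: it strips the fragment identifier (which is never sent to the server) and asks for RDF through content negotiation. The function name and the exact `Accept` preference list are assumptions, not taken from any of the libraries named on the slide, and no network call is made.

```python
from urllib.parse import urldefrag
from urllib.request import Request

# Assumed preference order for RDF serialisations; adjust to the parser at hand.
RDF_ACCEPT = "text/turtle, application/rdf+xml;q=0.9, */*;q=0.1"


def linked_data_request(uri: str) -> Request:
    """Prepare (but do not send) a request that dereferences a Linked Data URI."""
    document_uri, _fragment = urldefrag(uri)  # drop "#..." before the HTTP lookup
    return Request(document_uri, headers={"Accept": RDF_ACCEPT})


if __name__ == "__main__":
    req = linked_data_request("http://example.org/people#alice")  # hypothetical URI
    print(req.full_url, req.get_header("Accept"))
```

A caller only passes a URI; the details of the guideline stay inside the helper, which is the point the slide makes about libraries.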
  • 16. Software Eng. solutions (3)
      - More software factories:
          - create complete applications
          - requires patterns + libraries
          - or: “opinionated software”
          - components can be customised for domain
      - Interface, homogenisation and data discovery usually made from scratch
      - https://developers.facebook.com/docs/beta/opengraph/tutorial/
  • 17. Summary
      - Empirical survey
      - Architecture: arch. pattern
      - LD standards: gaps
      - Software Eng. Process: shortcomings
      - Software engineering solutions
      - Full article: http://tinyurl.com/semweblessons
  • 18. Appendix: threats to validity
      - Representativeness:
          - only complete applications were part of the challenges (not tools or libraries)
          - apps needed to use real-world data
          - submission of a paper describing the app was required
          - the challenges extend over multiple years, which allows trends to be seen
      - Number of authors who verified checklist (65%):
          - academic email addresses expire quickly
          - we manually tried to find new email addresses
      - no source code was used:
          - source code was not required for challenges due to e.g. IP issues
  • 19. Table: Impl. details 2003 2004 2005 2006 2007 2008 2009 overall Programming Java 10% Java 46% Java 48% Java 60% Java 56% Java 50% Java 43% languages Java 66% JS 15% JS 23% PHP 19% C 20% JS 12% PHP 25% PHP 21% PHP 26% PHP 23% JS 13% Jena 18% RAP 15% Sesame 17% RDF libraries — — Sesame 33% Sesame 19% Sesame 12% RDFLib ARC 17% Sesame 23% Jena 8% Jena 9% Lucene 18% 10% Jena 13% RDF 89% RDF 100% RDF 100% RDF 100% RDF 96% RDF 87% RDF 66% SemWeb standards RDF 100% OWL 42% SPARQL SPARQL SPARQL OWL 43% RDFS 37% OWL 66% OWL 30% SPARQL 50% 17% 69% SPARQL OWL 37% RDFS 50% 15% OWL 41% OWL 10% OWL 46% 41% Schemas/ FOAF 30% RSS 20% FOAF 26% FOAF 41% FOAF 34% FOAF 27% vocabularies/ DC 12% — DC 21% FOAF 20% RSS 15% DC 20% DC 15% DC 13% ontologies SWRC 12% DBpedia DC 20% Bibtex 10% SIOC 20% SKOS 15% SIOC 7% 13%
  • 20. Tables: Data integration and other properties

    Data integration         2003  2004  2005  2006  2007  2008  2009
    manual                    30%   13%    0%   16%    9%    5%    4%
    semi-automatic            70%   31%  100%   47%   58%   65%   61%
    automatic                  0%   25%    0%   11%   13%    4%   19%
    not needed                 0%   31%    0%   26%   20%   26%   16%

    Property                 2003  2004  2005  2006  2007  2008  2009
    Data creation             20%   37%   50%   52%   37%   52%   76%
    Data import               70%   50%   83%   52%   70%   86%   73%
    Data export               70%   56%   83%   68%   79%   86%   73%
    Inferencing               60%   68%   83%   57%   79%   52%   42%
    Decentralised sources     90%   75%  100%   57%   41%   95%   96%
    Multiple owners           90%   93%  100%   89%   83%   91%   88%
    Heterogeneous formats     90%   87%  100%   89%   87%   78%   88%
    Data updates              90%   75%   83%   78%   45%   73%   50%
    Linked Data principles     0%    0%    0%    5%   25%   26%   65%
  • 21. Table: architectural analysis

    Component                            2003  2004  2005  2006  2007  2008  2009  total
    number of applications                 10    16     6    19    24    23    26    124
    graph access layer                   100%  100%  100%  100%  100%  100%  100%   100%
    RDF store                             80%   94%  100%   95%   92%   87%   77%    88%
    graph-based navigation interface      90%  100%  100%   89%   96%   83%   88%    91%
    data homogenisation service           90%   50%   83%   63%   88%   70%   80%    74%
    graph query language service          80%   88%   83%   68%   88%   78%   65%    77%
    structured data authoring interface   20%   38%   33%   37%   33%   26%   19%    29%
    data discovery service                50%   25%   33%   16%   54%   30%   15%    30%
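
The "total" row appears to be the per-year percentages weighted by the number of applications in each year. The sketch below checks this for two components. It assumes that the second and third percentage columns of the flattened table are the RDF store and the graph-based navigation interface, which is inferred from the totals on slide 9 rather than stated in the table itself.

```python
# Number of surveyed applications per year, 2003-2009 (from the table).
APPS = [10, 16, 6, 19, 24, 23, 26]

# Per-year shares in percent; the column assignment is an inference (see above).
RDF_STORE = [80, 94, 100, 95, 92, 87, 77]
NAVIGATION = [90, 100, 100, 89, 96, 83, 88]


def weighted_total(shares, counts=APPS):
    """Average of yearly percentages, weighted by applications per year, rounded."""
    weighted_sum = sum(share * n for share, n in zip(shares, counts))
    return round(weighted_sum / sum(counts))


if __name__ == "__main__":
    print(sum(APPS))                   # 124 applications in total
    print(weighted_total(RDF_STORE))   # 88, matching the "total" row
    print(weighted_total(NAVIGATION))  # 91, matching the "total" row
```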