An introduction to SDshare
2011-03-15
Lars Marius Garshol, <larsga@bouvet.no>
http://twitter.com/larsga
Overview of SDshare

SDshare
- A protocol for tracking changes in a semantic datastore
  - essentially allows clients to keep track of all changes, for replication purposes
- Supports both Topic Maps and RDF
- Based on Atom
- Highly RESTful
- A CEN specification

Basic workings
[Diagram: a Server, a Client, and four Fragments between them]
- Server publishes fragments representing changes in datastore
- Client pulls these in, updates local copy of dataset
- There is, however, more to it than just this

What more is needed?
- Support for more than one dataset per server
  - this means: more than one fragment stream
- How do clients get started?
  - a change feed is nice once you've got a copy of the dataset, but how do you get a copy?
- What if you miss out on some changes and need to restart?
  - must be a way to reset local copy
- The protocol supports all this

Two new concepts
- Collection
  - essentially a dataset inside the server
  - exact meaning is not defined in spec
  - will generally be a topic map (TMs) or a graph (RDF)
- Snapshot
  - a complete copy of a collection at some point in time

Feeds in the server
[Diagram labels: Overview feed, Collection feeds, Snapshot feed, Fragment feed, Snapshot, Fragment]

An overview feed
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:sdshare="http://www.egovpt.org/sdshare">
  <title>SDshare feeds from localhost</title>
  <updated>2011-03-15T18:55:38Z</updated>
  <author>
    <name>Ontopia SDshare server</name>
  </author>
  <id>http://localhost:8080/sdshare/</id>
  <link href="http://localhost:8080/sdshare/"></link>
  <entry>
    <title>beer.xtm</title>
    <updated>2011-03-15T18:55:38Z</updated>
    <id>http://localhost:8080/sdshare/beer.xtm</id>
    <link href="collection.jsp?topicmap=beer.xtm"
          type="application/atom+xml"
          rel="http://www.egovpt.org/sdshare/collectionfeed"></link>
  </entry>
  <entry>
    <title>metadata.xtm</title>
    <updated>2011-03-15T18:55:38Z</updated>
    <id>http://localhost:8080/sdshare/metadata.xtm</id>
    <link href="collection.jsp?topicmap=metadata.xtm"
          type="application/atom+xml"
          rel="http://www.egovpt.org/sdshare/collectionfeed"></link>
  </entry>
</feed>
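
A client could discover the collection feeds from an overview feed roughly as sketched below. This is only an illustration using Python's standard library: the function name collection_feeds and the trimmed OVERVIEW sample are assumptions made for this example, not part of the spec or of Ontopia.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

ATOM = "{http://www.w3.org/2005/Atom}"
COLLECTION_REL = "http://www.egovpt.org/sdshare/collectionfeed"

# Trimmed copy of the overview feed shown on the slide.
OVERVIEW = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>SDshare feeds from localhost</title>
  <entry>
    <title>beer.xtm</title>
    <link href="collection.jsp?topicmap=beer.xtm" type="application/atom+xml"
          rel="http://www.egovpt.org/sdshare/collectionfeed"/>
  </entry>
  <entry>
    <title>metadata.xtm</title>
    <link href="collection.jsp?topicmap=metadata.xtm" type="application/atom+xml"
          rel="http://www.egovpt.org/sdshare/collectionfeed"/>
  </entry>
</feed>"""


def collection_feeds(feed_xml, base_url):
    """Map each collection title to the absolute URL of its collection feed."""
    root = ET.fromstring(feed_xml)
    result = {}
    for entry in root.findall(ATOM + "entry"):
        title = entry.findtext(ATOM + "title")
        for link in entry.findall(ATOM + "link"):
            # Only follow links marked as collection feeds.
            if link.get("rel") == COLLECTION_REL:
                # hrefs in the example are relative to the overview feed URL.
                result[title] = urljoin(base_url, link.get("href"))
    return result


feeds = collection_feeds(OVERVIEW, "http://localhost:8080/sdshare/")
for title, url in feeds.items():
    print(title, url)
```

Only the overview feed URL is supplied by hand; every other URL is taken from the feed itself.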

The snapshot feed
- A list of links to snapshots of the entire dataset (collection)
- The spec doesn't say anything about how and when snapshots are produced
- It's up to implementations to decide how they want to do this
- It makes sense, though, to always have a snapshot for the current state of the dataset

Example snapshot feed
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:sdshare="http://www.egovpt.org/sdshare">
  <title>Snapshots feed for beer.xtm</title>
  <updated>2011-03-15T19:12:34Z</updated>
  <author>
    <name>Ontopia SDshare server</name>
  </author>
  <id>file:/Users/larsga/data/topicmaps/beer.xtm/snapshots</id>
  <sdshare:ServerSrcLocatorPrefix>file:/Users/larsga/data/topicmaps/beer.xtm</sdshare:ServerSrcLocatorPrefix>
  <entry>
    <title>Snapshot of beer.xtm</title>
    <updated>2011-03-15T19:12:34Z</updated>
    <id>file:/Users/larsga/data/topicmaps/beer.xtm/snapshot/0</id>
    <link href="snapshot.jsp?topicmap=beer.xtm"
          type="application/x-tm+xml; version=1.0"
          rel="alternate"></link>
  </entry>
</feed>

The fragment feed
- For every change in the topic map, there is one fragment
  - the granularity of changes is not defined by the spec
  - it could be per transaction, or per topic changed
- The fragment is basically a link to a URL that produces a part of the dataset

An example fragment feed
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:sdshare="http://www.egovpt.org/sdshare">
  <title>Fragments feed for beer.xtm</title>
  <updated>2011-03-15T19:21:20Z</updated>
  <author>
    <name>Ontopia SDshare server</name>
  </author>
  <id>file:/Users/larsga/data/topicmaps/beer.xtm/fragments</id>
  <sdshare:ServerSrcLocatorPrefix>file:/Users/larsga/data/topicmaps/beer.xtm</sdshare:ServerSrcLocatorPrefix>
  <entry>
    <title>Topic with object ID 4521</title>
    <updated>2011-03-15T19:20:03Z</updated>
    <id>file:/Users/larsga/data/topicmaps/beer.xtm/4521/1300216803730</id>
    <link href="fragment.jsp?topicmap=beer.xtm&amp;topic=4521&amp;syntax=rdf"
          type="application/rdf+xml" rel="alternate"/>
    <link href="fragment.jsp?topicmap=beer.xtm&amp;topic=4521&amp;syntax=xtm"
          type="application/x-tm+xml; version=1.0" rel="alternate"/>
    <sdshare:TopicSI>http://psi.example.org/12</sdshare:TopicSI>
  </entry>
</feed>
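
To make the structure of the fragment feed concrete, here is a small sketch of how a client might read it: pick up the source prefix, then for each entry collect the timestamp, the TopicSI values, and the link for the syntax it prefers. The function read_fragment_feed and the returned dictionary layout are hypothetical, chosen only for this example.

```python
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "sd": "http://www.egovpt.org/sdshare",
}

# Trimmed copy of the fragment feed shown on the slide.
FRAGMENT_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:sdshare="http://www.egovpt.org/sdshare">
  <sdshare:ServerSrcLocatorPrefix>file:/Users/larsga/data/topicmaps/beer.xtm</sdshare:ServerSrcLocatorPrefix>
  <entry>
    <updated>2011-03-15T19:20:03Z</updated>
    <link href="fragment.jsp?topicmap=beer.xtm&amp;topic=4521&amp;syntax=rdf"
          type="application/rdf+xml" rel="alternate"/>
    <link href="fragment.jsp?topicmap=beer.xtm&amp;topic=4521&amp;syntax=xtm"
          type="application/x-tm+xml; version=1.0" rel="alternate"/>
    <sdshare:TopicSI>http://psi.example.org/12</sdshare:TopicSI>
  </entry>
</feed>"""


def read_fragment_feed(feed_xml, wanted_type):
    """Return (prefix, fragments); each fragment is a dict describing one entry."""
    root = ET.fromstring(feed_xml)
    prefix = root.findtext("sd:ServerSrcLocatorPrefix", namespaces=NS)
    fragments = []
    for entry in root.findall("atom:entry", NS):
        href = None
        for link in entry.findall("atom:link", NS):
            # The type attribute may carry parameters such as "; version=1.0".
            if link.get("type", "").startswith(wanted_type):
                href = link.get("href")
        fragments.append({
            "updated": entry.findtext("atom:updated", namespaces=NS),
            "topic_sis": [e.text for e in entry.findall("sd:TopicSI", NS)],
            "href": href,
        })
    return prefix, fragments


prefix, fragments = read_fragment_feed(FRAGMENT_FEED, "application/x-tm+xml")
print(prefix)
print(fragments[0])
```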

What is a fragment?
- Essentially, a piece of a topic map
  - that is, a complete XTM file that contains only part of a bigger topic map
  - typically, most of the topic references will point to topics not in the XTM file
- Downloading more fragments will yield a bigger subset of the topic map
  - the automatic merging in Topic Maps will cause the fragments to match up
- Exactly the same applies in RDF

An example fragment
<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="id4521">
    <instanceOf>
      <subjectIndicatorRef xlink:href="http://psi.garshol.priv.no/beer/pub"></subjectIndicatorRef>
    </instanceOf>
    <subjectIdentity>
      <subjectIndicatorRef xlink:href="http://psi.example.org/12"></subjectIndicatorRef>
      <topicRef xlink:href="file:/Users/larsga/data/topicmaps/beer.xtm#id2662"></topicRef>
    </subjectIdentity>
    <baseName>
      <baseNameString>Amundsen Bryggeri og Spiseri</baseNameString>
    </baseName>
    <occurrence>
      <instanceOf>
        <subjectIndicatorRef xlink:href="http://psi.ontopia.net/ontology/latitude"></subjectIndicatorRef>
      </instanceOf>
      <resourceData>59.913816</resourceData>
    </occurrence>
    ...
  </topic>
  ...
</topicMap>

Applying a fragment
- The feed contains a URI prefix
  - this is used to create item identifiers tagging statements with their origin
- For each TopicSI find that topic, then
  - for each statement, remove matching item identifier
  - if statement now has no item identifiers, delete it
- Merge in the received fragment
  - then tag all statements in it with matching item identifier
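
The steps above can be sketched on a toy data model. This is a deliberate simplification, not how a Topic Maps engine stores data: a statement is assumed to be a (topic SI, property, value) tuple, the store maps each statement to the set of source prefixes that have asserted it (standing in for the item identifiers), and the name apply_fragment is invented for this example.

```python
def apply_fragment(store, prefix, topic_sis, fragment):
    """Apply one fragment from the source identified by prefix.

    store:     dict mapping statement tuple -> set of source prefixes (origin tags)
    topic_sis: subject identifiers named by the TopicSI elements of the feed entry
    fragment:  iterable of statement tuples received from the source
    """
    # Step 1: for each named topic, drop this source's tag from its statements
    # and delete statements that no source vouches for any more.
    for stmt in list(store):
        if stmt[0] in topic_sis:
            store[stmt].discard(prefix)
            if not store[stmt]:
                del store[stmt]
    # Step 2: merge in the fragment and tag every statement with the source.
    for stmt in fragment:
        store.setdefault(stmt, set()).add(prefix)
    return store


PUB = "http://psi.example.org/12"
store = {
    (PUB, "name", "Old name"): {"source-a"},
    (PUB, "latitude", "59.913816"): {"source-a", "source-b"},
}
fragment = [
    (PUB, "name", "Amundsen Bryggeri og Spiseri"),
    (PUB, "latitude", "59.913816"),
]

apply_fragment(store, "source-a", {PUB}, fragment)
once = {stmt: set(tags) for stmt, tags in store.items()}

# Applying the same fragment again leaves the store unchanged.
apply_fragment(store, "source-a", {PUB}, fragment)
print(store == once)
```

In this model the old name disappears because only source-a asserted it, while the latitude statement keeps its source-b tag, which is what lets data from several sources coexist in one store.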

Properties of the protocol
- HATEOAS
  - uses hypertext principles
  - only endpoint is that of the overview feed
  - all other URLs available via hypertext
- Applying a fragment is idempotent
  - i.e. the result is the same, no matter how many times you do it
- Loose binding
  - very loose binding between server and client
- Supports federation of data
  - client can safely merge data from different sources

SDshare push
- In normal SDshare, data receivers connect to the data source
  - basically, they poll the source with GET requests
- However, the receiver is not always allowed to make connections to the source
  - SDshare push is designed for this situation
- Solution is a slightly modified protocol
  - source POSTs Atom feeds with inline fragments to recipient
  - this flips the server/client relationship
- Not part of the spec; unofficial Ontopia extension
Uses of SDshare

Example use case #1
[Diagram labels: Frontend, Database, Ontopia, DB2TM, JDBC, Portal]

Example use case #1
[Diagram labels: Service #1, Frontend, Database, Ontopia, DB2TM, SDshare, Ontopia, SDshare, Service #3, Portal, ESB]

NRK/Skole today
[Diagram labels: Production environment, Editorial server, MediaDB, Prod #1, Prod #2, DB2TM, Export, JDBC, JDBC, nrk-grep.xtm, Import, DB server 1, DB server 2, Database, Firewall, Server]

NRK/Skole with SDshare push
[Diagram labels: Production environment, SDshare PUSH, Editorial server, MediaDB, Prod #1, Prod #2, DB2TM, JDBC, JDBC, DB server 1, DB server 2, Database, Firewall, Server]

Hafslund
[Diagram labels: ERP, GIS, CRM, ..., UMIC, Search engine, Archive]

Hafslund architecture
- The beauty of this architecture is that SDshare insulates the different systems from one another
- More input systems can be added without hassle
- Any component can be replaced without affecting the others
- Essentially, a plug-and-play architecture

A Hafslund problem
- There are too many duplicates in the data
  - duplicates within each system
  - also duplication across systems
- How to get rid of the duplicates?
  - unrealistic to expect cleanup across systems
- So, we build a deduplicator
  - and plug it in...

DuKe plugged in
[Diagram labels: ERP, GIS, CRM, ..., UMIC, Search engine, Dupe Killer, Archive]
Implementations

Current implementations
- Web3
  - both client and server
- Ontopia
  - ditto + SDshare push
- Isidorus
  - don't know
- Atomico
  - server framework only; no actual implementation

Ontopia SDshare server
- Event tracker
  - taps into event API where it listens for changes
  - maintains in-memory list of changes
  - writes all changes to disk as well
  - removes duplicate changes and discards old changes
- Web application based on tracker
  - JSP pages producing feeds and fragments
  - one fragment per changed topic, sorted by time
  - only a single snapshot of current state of TM

Ontopia SDshare client
- Web UI for management
- Pluggable frontends
- Pluggable backends
- Combine at will
- Frontends
  - Ontopia: event listener
  - SDshare: polls Atom feeds
- Backends
  - Ontopia: applies changes to Ontopia locally
  - SPARQL: writes changes to RDF repo via SPARUL
  - push: pushes changes over SDshare push
[Diagram labels: Web UI, Ontopia events, Core logic, Ontopia backend, SPARQL Update, SDshare client, SDshare push]
Web UI to client
Problems with the spec

What if many fragments?
- The size of the fragments feed grows enormous
  - expensive if polled frequently
- Paging might be one solution
  - basically, end of feed contains pointer to more
- "since" parameter might be another
  - allows client to say "only show me changes since ..."
- Probably need both in practice
- http://projects.topicmapslab.de/issues/3675
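
As an illustration of how paging and a "since" parameter could combine on the server side, consider the sketch below. Neither feature is defined by the spec; the function fragments_page, its parameters, and the offset-based paging scheme are assumptions made purely for this example.

```python
from datetime import datetime, timezone


def parse_ts(text):
    """Parse an Atom timestamp of the form used in the example feeds."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def fragments_page(entries, since=None, page_size=2, offset=0):
    """Return (page, next_offset); next_offset is None when there are no more pages."""
    # "since": keep only entries strictly newer than the given timestamp.
    selected = [
        e for e in entries
        if since is None or parse_ts(e["updated"]) > parse_ts(since)
    ]
    selected.sort(key=lambda e: parse_ts(e["updated"]))  # oldest first
    # Paging: hand out one slice and tell the client where the next one starts.
    page = selected[offset:offset + page_size]
    more = offset + page_size < len(selected)
    return page, (offset + page_size if more else None)


ENTRIES = [
    {"id": "a", "updated": "2011-03-15T19:20:03Z"},
    {"id": "b", "updated": "2011-03-15T19:25:00Z"},
    {"id": "c", "updated": "2011-03-15T19:30:00Z"},
]

page, next_offset = fragments_page(ENTRIES, since="2011-03-15T19:20:03Z", page_size=1)
print([e["id"] for e in page], next_offset)
```

A client that remembers the newest "updated" value it has applied can pass it as since on the next poll, so frequent polling stays cheap even when the full feed is large.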

Ordering of fragments
- Should the spec require that fragments be ordered?
  - not really necessary if all fragment URIs return current state (instead of state at time fragment entry was created)

RDF fragment algorithm
- The one given in the spec makes no sense
- Relies on Topic Maps constructs not found in RDF
- Really no way to make use of it
- http://projects.topicmapslab.de/issues/4013

Our interpretation
- Server prefix is URI of RDF named graph
- Fragment algorithm therefore becomes
  - delete all statements about changed resources
  - then add all statements in fragment
- Means each source gets a different graph
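
This interpretation can be sketched with plain Python sets standing in for an RDF store. The sketch assumes that "statements about a resource" means triples with that resource as subject; the function apply_rdf_fragment, the graph URIs, and the ex: names are made up for the example.

```python
def apply_rdf_fragment(graphs, graph_uri, changed_resources, fragment_triples):
    """Apply a fragment to the named graph belonging to one source.

    graphs: dict mapping named graph URI -> set of (subject, predicate, object)
    """
    graph = graphs.setdefault(graph_uri, set())
    # Delete all statements about the changed resources (subject position only,
    # which is an assumption of this sketch).
    stale = {t for t in graph if t[0] in changed_resources}
    graph.difference_update(stale)
    # Then add all statements in the fragment.
    graph.update(fragment_triples)
    return graphs


graphs = {
    "http://example.org/source-a": {
        ("ex:pub12", "ex:name", "Old name"),
        ("ex:pub13", "ex:name", "Other pub"),
    },
    "http://example.org/source-b": {
        ("ex:pub12", "ex:rating", "5"),
    },
}

apply_rdf_fragment(
    graphs,
    "http://example.org/source-a",
    {"ex:pub12"},
    [("ex:pub12", "ex:name", "Amundsen Bryggeri og Spiseri")],
)
print(sorted(graphs["http://example.org/source-a"]))
```

Because each source writes only to its own graph, the statement from source-b about the same resource is left alone.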

TopicSL/TopicII
- Currently, topics can only be identified by subject identifier
  - but not all topics have one
- Solution
  - add elements for subject locators and item identifiers
- http://projects.topicmapslab.de/issues/3667

Paging of snapshots?
- What if the snapshot is vast?
  - clients probably won't be able to download and store the entire thing in one go
- Could we page the snapshot into fragments?
- Or is there some other solution?
- http://projects.topicmapslab.de/issues/4307

How to tell if the fragment feed is complete?
- When reading the fragment feed, how can we tell if there are older fragments that have been discarded?
  - and how can we tell which fragment was the newest to be thrown away?
- Without this there's no way to know for certain if you've lost fragments if the feed stops before the newest fragment you've got
  - and if you're using "since" it will always stop before the newest fragment...
- Make new sdshare:foo element on feed level for this information?
- http://projects.topicmapslab.de/issues/4308

Blank nodes are not supported
- What to do?
- http://projects.topicmapslab.de/issues/4306

More information
- SDshare spec
  - http://www.egovpt.org/fg/CWA_Part_1b
- SDshare issue tracker
  - http://projects.topicmapslab.de/projects/sdshare
- SDshare use cases
  - http://www.garshol.priv.no/blog/215.html
