Tutorial – Semantic Digital Libraries – Introduction – Sebastian R. Kruk, Stefan Decker, Bernhard Haslhofer, Predrag Knežević, Sandy Payette, Dean Krafft
Tutorial overview. Who we are: Sebastian R. Kruk, DERI Galway, Ireland; Stefan Decker, DERI Galway, Ireland; Bernhard Haslhofer, University of Vienna, Austria; Predrag Knežević, Fraunhofer IPSI, Germany; Sandy Payette, Cornell University, USA; Dean Krafft, NSDL, USA. In the next 3.5 hours we want to: give you a brief introduction to the Semantic Web and show how it relates to digital libraries; present existing semantic digital library systems; discuss the current problems and future directions of semantic digital libraries and get feedback from you. After this tutorial you will know what a semantic digital library system is, and you will know the existing solutions in varying degrees of detail.
Tutorial Schedule
Time         Session
1:30 - 2:15  Introduction to Semantic Digital Libraries
2:15 - 3:00  Existing Solutions - JeromeDL
3:00 - 3:15  Coffee break
3:15 - 4:30  Existing Semantic Digital Libraries solutions (BRICKS, FEDORA, SIMILE)
4:30 - 5:00  Conclusions, discussion & future of SemDL
The Semantic Web – A Brief Introduction Current Web vs. Semantic Web? An extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.  [Tim Berners-Lee] Current Web was designed for humans, and there is little information usable for machines Was the Web meant to be more? Objects with well defined attributes as opposed to untyped hyperlinks between Internet resources A  network of relationships  amongst named objects, yielding unified information management tasks What do you mean by “Semantic”? the  semantics  of something is the  meaning  of something Semantic Web is able to describe things in a way that computers can understand
Outline Introduction to Semantic Web Semantic Digital Libraries
The Semantic Web – A Brief Introduction Where are we in the “Semantic Web layer cake”? You Are Here!
The Semantic Web – A Brief Introduction The challenge for the Semantic Web The Semantic Web can’t work all by itself For example, it is not very likely that you will be able to sell your car just by putting your RDF file on the Web Need society-scale applications: Semantic Web agents and/or services, consumers and processors for semantic data, more advanced collaborative applications
The Semantic Web – What is RDF? Describing things on the Semantic Web. RDF (Resource Description Framework) is a data format for describing information and resources, and the fundamental data model for the Semantic Web. Using RDF, we can describe relationships between things, like “A is a part of B” or “Y is a member of Z”, and their properties (size, weight, age, price…) in a machine-understandable format where each thing has a URI. The RDF graph-based model delivers straightforward machine processing. Putting information into RDF files makes it possible for “scutters” or RDF crawlers to search, discover, pick up, collect, analyse and process information from the Web.
The Semantic Web – What is RDF? A simple RDF example. Statement: “Stefan Decker is the creator of the resource (web page) http://www.stefandecker.org”. Structure: Resource (subject): http://www.stefandecker.org; Property (predicate): http://purl.org/dc/elements/1.1/creator; Value (object): “Stefan Decker”. Directed graph: http://www.stefandecker.org → dc:creator → Stefan Decker
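The statement on this slide can be written down as a plain (subject, predicate, object) tuple. The sketch below is a minimal illustration, not part of the tutorial: the helper name `to_ntriples` is hypothetical, and it only handles a URI subject, a URI predicate and a plain-literal object.

```python
# Minimal sketch: one RDF statement as a (subject, predicate, object) tuple.
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"


def to_ntriples(subject, predicate, literal):
    """Serialize one statement with a plain-literal object as an N-Triples line.

    Hypothetical helper for illustration; only backslashes and double
    quotes in the literal are escaped.
    """
    escaped = literal.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject}> <{predicate}> "{escaped}" .'


statement = ("http://www.stefandecker.org", DC_CREATOR, "Stefan Decker")
line = to_ntriples(*statement)
print(line)
```

The printed line is the same statement in N-Triples, one of the simplest RDF serializations.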
The Semantic Web – How can RDF help us? RDF lets us identify objects and establish relationships. To express a new relationship, just add a new RDF statement. To integrate information from different sources, copy all the RDF data together. RDF allows many points of view.
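Treating an RDF graph as a set of triples makes both points concrete: integration is a set union, and a new relationship is one more element. This is a sketch under that simplification; the `ex:` identifiers and the two "libraries" are made up for illustration.

```python
# Two hypothetical sources describing overlapping resources.
library_a = {
    ("ex:painting1", "dc:title", "Irises"),
    ("ex:painting1", "dc:creator", "ex:vanGogh"),
}
library_b = {
    ("ex:painting1", "dc:creator", "ex:vanGogh"),  # same statement, known to both
    ("ex:vanGogh", "foaf:name", "Vincent van Gogh"),
}

# Integrating the sources: copy all statements together (duplicates collapse).
merged = library_a | library_b

# Expressing a new relationship: just add one more statement.
merged.add(("ex:painting1", "dc:subject", "irises"))

print(len(merged))
```

No schema migration is needed in either step, which is the practical appeal of the triple model for metadata integration.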
The Semantic Web – Ontologies and Schemata. What is an Ontology? “An ontology is a specification of a conceptualization.” (Tom Gruber, 1993) Ontologies are social contracts: agreed, explicit semantics; understandable to outsiders; (often) derived in a community process. Ontology markup and representation languages: RDF and RDF Schema; OWL; others: DAML+OIL, EER, UML, Topic Maps, MOF, XML Schemas
The Semantic Web – RDF Schema. Defines a small vocabulary for RDF: Class, subClassOf, type; Property, subPropertyOf; domain, range. This vocabulary can be used to define other vocabularies for your application domain. [Figure: classes Person, Student and Researcher linked by subClassOf; instances Jeen and Frank linked to classes by type; property hasSuperVisor with domain and range]
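A small sketch of what subClassOf buys: once Student is declared a subclass of Person, every Student is also a Person. The class and instance names come from the slide's figure, but the exact arrangement (Jeen is a Student, Frank is a Researcher) is an assumption, and the function names are hypothetical.

```python
# Assumed arrangement of the slide's example vocabulary.
SUBCLASS_OF = {"Student": {"Person"}, "Researcher": {"Person"}}
TYPES = {"Jeen": {"Student"}, "Frank": {"Researcher"}}


def superclasses(cls, subclass_of):
    """Return every class reachable from cls via subClassOf (transitively)."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(resource, types, subclass_of):
    """Direct types of a resource plus all types implied by subClassOf."""
    result = set(types.get(resource, ()))
    for cls in types.get(resource, ()):
        result |= superclasses(cls, subclass_of)
    return result


print(sorted(inferred_types("Jeen", TYPES, SUBCLASS_OF)))
```

The domain and range declarations support a similar inference (the subject of hasSuperVisor gets the domain class as a type); that rule is left out here to keep the sketch short.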
The Semantic Web – OWL. OWL – The Web Ontology Language provides a vocabulary for defining classes, their properties and the relationships among classes. It is based on Description Logics and is a W3C Recommendation. “Owl took Christopher Robin’s notice from Rabbit and looked at it nervously. He could spell his own name WOL, and he could spell Tuesday so that you knew it wasn’t Wednesday, and he could read quite comfortably when you weren’t looking over his shoulder and saying "Well?" all the time...” [Figure: classes Animal, Herbivore, Carnivore and Omnivore, with an owl:disjointWith relation]
The Semantic Web –  Applications Semantic Web cannot be and is not only a set of recommendations Semantic Web is  becoming reality by applications  that support it and are based on it Enabling technologies: RDF Storages: Sesame, Jena, YARS Reasoners: KAON, Racer  Editors: Protege, SWOOP, MarcOnt Portal End-User applications: Semantic wikis: Makna, SemperWiki Semantic blogs Semantic digital libraries
Outline Introduction to Semantic Web Semantic Digital Libraries
What is a Semantic Digital Library? Semantic digital libraries integrate  information based on different metadata, e.g.: resources, user profiles, bookmarks, taxonomies  –  high quality semantics = highly and meaningfully connected information provide  interoperability  with other systems (not only digital libraries) on either metadata or communication level or both –  RDF as common denominator between digital libraries and other services delivering more robust,  user friendly and adaptable search and browsing  interfaces empowered by semantics
Evolution of Libraries. Social Semantic Digital Library: involves the community in sharing knowledge. Semantic Digital Library: accessible by machines, not only with machines. Digital Library: online, easy searching with a full-text index. Library: organized collection.
Different Kinds of Libraries (Evolution of Libraries) Classic libraries Scientific libraries Digital libraries Semantic libraries
How are Semantic Digital Libraries different? Semantic digital libraries extend digital libraries by describing and exposing their resources in a machine-‘understandable’ way. Resources can be: contents, digital artefacts; organization of objects (e.g. collections); users, user communities; controlled vocabularies, thesauri, taxonomies. They expose the semantics of their metadata in terms of an ontology defined using a formal language, and deliver mediation services for communication with other systems.
Semantic Web Technologies for Digital Libraries? Metadata is the key concept. The Web does not have metadata: the idea of a Semantic Web is nice but difficult to implement. Many digital libraries do have metadata in place: we simply must make it available in a machine-understandable format. The Semantic Web provides the format: RDF
Semantic Web Technologies for Digital Libraries? Knowledge in bibliographic records. Digital libraries already have controlled vocabularies, taxonomies or even ontologies in place; the challenge is to model this knowledge in a machine-understandable way. The Semantic Web provides ontology languages: RDF Schema, OWL, SKOS
A Sample Bibliographic Record (terms taken from controlled vocabularies)
Classification: Paintings
Object/Work type: paintings
Title: Irises
Creation-Date: 1889, earliest: 1889, latest: 1889
Subject-Matter: irises, nature, soil, etc.
Current Location-Repository Name: J. Paul Getty Museum
Creation-Creator/Role: Vincent van Gogh; painter: Gogh, Vincent van (Dutch painter, 1853-1890)
Copyright 2000 The J. Paul Getty Trust & College Art Association, Inc.
Knowledge Organization Systems: tools that present the organized interpretation of knowledge structures; semantic tools that capture the meaning of words and other symbols as well as (semantic) relations between symbols and concepts; they organize information and promote knowledge management. Examples: classification and categorization schemata (organize materials at a general level); subject headings (provide more detailed access); authority files (control variant versions of key information such as geographic names and personal names); highly structured vocabularies, such as thesauri; less traditional schemes, such as semantic networks and ontologies
Taxonomy of Knowledge Organization Systems Term Lists  Authority files ( FOAF ) Glossaries  Dictionaries  Gazetteers  Classifications and Categories ( DMoz ) Subject headings Classification schemes Taxonomies  Categorization Schemes.  Relationship Lists Thesauri ( WordNet, MeSH ) Semantic networks Ontologies   (Hodge, 2000)
Simple Knowledge Organization System (SKOS): describes the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, 'folksonomies' and other types of controlled vocabulary. Core concepts: narrower and broader; isSubjectOf and subject; isPrimarySubjectOf and primarySubject; member and Collection; memberList and OrderedCollection; related and semanticRelation; note, definition; altLabel and prefLabel; symbol and altSymbol
Benefits of Semantic Digital Libraries  Problems of today’s libraries  rapidly growing islands of highly organized information How to find things in a growing information space? is it enough to have a full-text index (à la Google)? typical “end-users” versus “expert users” converging digital library systems e.g. uniform access to Europe’s digital libraries and cultural heritage
Benefits of Semantic Digital Libraries. The two main benefits of Semantic Digital Libraries: (1) new search paradigms for the information space: ontology-based search / facet search, community-enabled browsing; (2) providing interoperability on the data level: integrating metadata from various heterogeneous sources, interconnecting different digital library systems
Searching the Sample Bibliographic Record
Full-text search: “Paintings” AND “Van Gogh” AND “flowers” → no result
Semantic query: if the knowledge that “irises” are “flowers” is modeled in an ontology (e.g. subclass-hierarchy), we can query for all “Paintings” by “Van Gogh” with subject “flowers” and also retrieve the picture with subject “irises”
Classification: Paintings
Object/Work type: paintings
Title: Irises
Creation-Date: 1889, earliest: 1889, latest: 1889
Subject-Matter: irises, nature, soil, etc.
Current Location-Repository Name: J. Paul Getty Museum
Creation-Creator/Role: Vincent van Gogh; painter: Gogh, Vincent van (Dutch painter, 1853-1890)
Copyright 2000 The J. Paul Getty Trust & College Art Association, Inc.
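The difference between the two searches can be sketched in a few lines. The record below mirrors the sample above; the `BROADER` hierarchy (irises → flowers → plants) and all function names are assumptions made for illustration, standing in for a real ontology or thesaurus.

```python
# Hypothetical broader-term hierarchy standing in for an ontology.
BROADER = {"irises": "flowers", "flowers": "plants"}

RECORDS = [
    {
        "title": "Irises",
        "creator": "Vincent van Gogh",
        "type": "paintings",
        "subjects": {"irises", "nature", "soil"},
    },
]


def expand_subjects(subjects, broader):
    """Add every broader term of each subject (stops on already-seen terms)."""
    expanded = set(subjects)
    for term in subjects:
        while term in broader and broader[term] not in expanded:
            term = broader[term]
            expanded.add(term)
    return expanded


def fulltext_search(records, subject):
    """Match only terms that literally appear in the record."""
    return [r["title"] for r in records if subject in r["subjects"]]


def semantic_search(records, subject, broader):
    """Match terms in the record or any of their broader terms."""
    return [
        r["title"]
        for r in records
        if subject in expand_subjects(r["subjects"], broader)
    ]


print(fulltext_search(RECORDS, "flowers"))
print(semantic_search(RECORDS, "flowers", BROADER))
```

Here the record is expanded at query time; a production system might instead expand the query or materialize the inferred subjects in the index.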
Semantic Digital Libraries and Existing DL Systems. How to handle the legacy (meta-)data problem: lift existing (meta-)data to a semantic level, either with simple solutions like MARC21 → DublinCore, or with complex ontologies like the MarcOnt Ontology, which captures concepts from different standards. Legacy libraries expose their metadata via well-established protocols, so the metadata can be imported into semantic DLs. Semantic DLs can play the role of integration champions in the information retrieval process in heterogeneous networks: OAI-PMH, Z39.50, Dienst
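The "simple solution" amounts to a lookup table from MARC21 field tags to Dublin Core elements. The mapping below is a deliberately simplified illustration, not an official crosswalk: real crosswalks work on subfields and indicators and cover many more tags. The function name and sample record are hypothetical.

```python
# Simplified, illustrative tag-to-element mapping (not an official crosswalk).
MARC_TO_DC = {
    "245": "title",      # title statement
    "100": "creator",    # main entry, personal name
    "650": "subject",    # topical subject
    "260": "publisher",  # publication information
}


def lift_to_dc(marc_fields):
    """Map (tag, value) pairs to a dict of Dublin Core element -> values."""
    dc = {}
    for tag, value in marc_fields:
        element = MARC_TO_DC.get(tag)
        if element is None:
            continue  # unmapped fields are dropped: the lossy part of a crosswalk
        dc.setdefault(element, []).append(value)
    return dc


record = [
    ("245", "Zeitschrift für Kunstgeschichte"),
    ("260", "Deutscher Kunstverlag"),
    ("310", "Kwart."),  # publication frequency: no counterpart in this mapping
]
print(lift_to_dc(record))
```

The dropped field shows why a flat mapping loses information, which is the gap a richer ontology such as MarcOnt is meant to close.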
Application Areas for Semantic Web Technologies. Thesauri & Controlled Vocabularies: qualified DublinCore; DMoz, DDC-based taxonomies; SKOS, WordNet and other thesauri. Schema Mappings / Crosswalks: MarcOnt Ontology, which aims to cover concepts from MARC21, BibTeX and DublinCore; MarcOnt Mediation Services, an open mediation framework between common legacy metadata standards. Metadata Integration: RDF as a common data model for integrating metadata from various autonomous and heterogeneous data sources; OWL for modeling the data source’s semantics; SPARQL as a common query language
Semantic DL as Evolving Knowledge Space. In state-of-the-art digital libraries users are consumers: they retrieve contents based on available bibliographic records. Recent trends: user communities (Connotea, Flickr). In semantic digital libraries users are contributors as well: tagging (Web 2.0), Social Semantic Collaborative Filtering, annotations. Semantic digital libraries enforce the transition from static information to a dynamic (collaborative) knowledge space
Existing Semantic Digital Library Systems. JeromeDL: a social semantic digital library that makes use of Semantic Web and Social Networking technologies to enhance both interoperability and usability. BRICKS: aims at establishing the organizational and technological foundations for a digital library network in order to share knowledge and resources in the cultural heritage domain. FEDORA: delivers a flexible service-oriented architecture for managing and delivering content in the form of digital objects. SIMILE: extends and leverages DSpace, seeking to enhance interoperability among digital assets, schemata, metadata, and services
Tutorial – Semantic Digital Libraries – Existing Semantic Digital Libraries Solutions – Sebastian R. Kruk, Stefan Decker, Predrag Knežević, Bernhard Haslhofer, Sandy Payette, Dean Krafft
Tutorial 7 – Semantic Digital Libraries -  Existing Semantic Digital Libraries Solutions  – JeromeDL Sebastian R. Kruk
Outline JeromeDL - Motivation and Overview JeromeDL - Architecture and Ontologies JeromeDL - Semantic Services JeromeDL - Social Services JeromeDL - Semantics in Use
JeromeDL – Introduction. Joint effort of DERI, National University of Ireland, Galway and Gdansk University of Technology (GUT). Distributed under the BSD open source license. A digital library built on Semantic Web technologies to answer requirements from librarians, scientists and everyone.
JeromeDL – Motivations. Use cases. Librarians: support for rich metadata (MARC21) in uploading resources, accessing bibliographic information and searching; persistent identifiers. Scientists: easy publishing (designed as an institute/university digital library); creating hierarchical networks of digital libraries; support for accessing, sharing and searching using bibliography metadata (BibTeX). Everyone: simple search (incl. natural language queries); community-aware information sharing and browsing; support for internationalization
JeromeDL – Motivations. Support for different kinds of bibliographic metadata, like DublinCore, BibTeX and MARC21, at the same time. Making use of existing rich sources of bibliographic descriptions (like MARC21) created by humans. Supporting users and communities: users have control over their profile information; community-aware profiles are integrated with bibliographic descriptions; support for community-generated knowledge. Delivering communication between instances: P2P mode for searching and user authentication; hierarchical mode for browsing
Outline JeromeDL - Motivation and Overview JeromeDL - Architecture and Ontologies JeromeDL - Semantic Services JeromeDL - Social Services JeromeDL - Semantics in Use
JeromeDL – Architecture Resources and annotations repository Middleware: query processing community space resources management User interface agents: Communication to the outside world Administrative interface
Bibliographic Description in JeromeDL
Dublin Core (RDF/XML):
<?xml version="1.0" encoding="UTF-8"?>
<rdf:Description rdf:about="http://...id=828374765">
  <dc:title>JeromeDL - Adding Semantic Web Technologies to DLs</dc:title>
  <dc:creator>Sebastian Kruk</dc:creator>
  <dc:description>In recent years...</dc:description>
</rdf:Description>
MARC21:
01450cas 922004331i 450000100...019c19329999gw  qr|p|  ||||0  |0ger |  a0044-2992 9a200412140219bVLOADc200404071525dvkulc200310071018dvbjc200303101205dkopumky200209211341zVLOAD  aGD U/MPcGD  U/MPdGD U/MFdGD U/KKsdWR O/EJ0 ager1 aZ. Kunstgesch. 0aZeitschrift für Kunstgeschichte00aZeitschrift für Kunstgeschichte.18aZfK  aMünchen ;aBerlin :bDeutscher Kunstverlag,c1932-.  c26-29 cm.  aKwart.0 a1 Bd. (Juni 1932)-.  aOpis na podst.: LCC.  aW 1932 założycielami czasopisma byli Wilhelm Waetzoldt i Ernst Gall....
BibTeX:
@InProceedings{jeromedexa2005,
  author = "Sebastian Ryszard Kruk and ...",
  title = "{JeromeDL - Adding Semantic ...}",
  booktitle = "{In Proceedings to DEXA 2005}",
  year = 2005}
These all can be represented in RDF
Structure ontology in JeromeDL
Bibliographic (MarcOnt) Ontology in JeromeDL
Community-aware (FOAFRealm) ontology
Ontologies in JeromeDL
Metadata and Services in JeromeDL
Outline JeromeDL - Motivation and Overview JeromeDL - Architecture and Ontologies JeromeDL - Semantic Services JeromeDL - Social Services JeromeDL - Semantics in Use
MarcOnt Initiative – Overview. Motivation: provide a set of tools for collaborative ontology development. MarcOnt Initiative goals: create a framework for collaborative ontology improvement (E-learning); provide domain experts with tools to share their knowledge; offer tools for data mediation between different data formats
MarcOnt Portal and MarcOnt Ontology. MarcOnt Ontology: central point of the MarcOnt Initiative; translation and mediation format; continuous collaborative ontology improvement; knowledge from the domain experts. MarcOnt Portal (source of knowledge): suggestions; annotations; versioning; ontology editor
MarcOnt Mediation Services for Legacy Metadata Format translation RDF Translator Format co-operation MarcOnt Mediation Services
Outline JeromeDL - Motivation and Overview JeromeDL - Architecture and Ontologies JeromeDL - Semantic Services JeromeDL - Social Services JeromeDL - Semantics in Use
Social Services in JeromeDL. Involve users in sharing knowledge: blogs (comments and discussions about documents and resources); tagging (collaborative classification); wikis (collaboratively edited additional descriptions, such as summaries and interesting facts). Preserve knowledge for future use: users can learn from the experience of others instantly. Recommend new, interesting resources based on users’ profiles
FOAF – Describing Social Networks. FOAF stands for Friend-of-a-Friend. It defines properties for a person (who does not have to be a person, but can be any “agent”). A FOAF file does not have to contain only one person. A network of people can be built with foaf:knows links. FOAF can be easily extended to meet requirements, as in the case of FOAFRealm for identity management…
Identity management with FOAFRealm Identity defined with extended FOAF metadata Policies expressed by social networking  Distance between owner and requester Friendship level between owner and requester, calculated across digraph of social network Support for single registration and sign on Distributed identity management with HyperCuP (“D-FOAF”) FOAFRealm is currently implemented as a plugin for Tomcat (Realm/Valve implementation), with PHP and .NET versions coming soon
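The "distance between owner and requester" policy can be sketched as a breadth-first search over foaf:knows links. This is an illustration of the idea only, not FOAFRealm's actual algorithm: the graph, the names and the `max_distance` parameter are assumptions, and the friendship-level weighting mentioned above is left out.

```python
from collections import deque

# Hypothetical foaf:knows digraph: person -> people they know.
KNOWS = {
    "alice": ["bob"],
    "bob": ["carol"],
    "carol": ["dave"],
    "dave": [],
}


def distance(knows, owner, requester):
    """Fewest foaf:knows hops from owner to requester, or None if unreachable."""
    if owner == requester:
        return 0
    seen = {owner}
    queue = deque([(owner, 0)])
    while queue:
        person, hops = queue.popleft()
        for friend in knows.get(person, ()):
            if friend == requester:
                return hops + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return None


def may_access(knows, owner, requester, max_distance):
    """Grant access when the requester is within max_distance hops of the owner."""
    hops = distance(knows, owner, requester)
    return hops is not None and hops <= max_distance


print(may_access(KNOWS, "alice", "carol", 2))
print(may_access(KNOWS, "alice", "dave", 2))
```

Because the graph is directed, `distance(KNOWS, "dave", "alice")` is `None`: being known by someone does not imply knowing them back.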
What is Social Semantic Collaborative Filtering? Goal: to enhance individual bookmarks with shared knowledge within a community. Users annotate catalogues of bookmarks with semantic information taken from DMoz or WordNet vocabularies. Catalogues can include (transclusion) friends' catalogues. Access to catalogues can be restricted with social networking-based policies. SSCF delivers: community-oriented, semantically-rich taxonomies; information about a user's interests; flows of expertise from the domain expert; recommendations based on users' previous actions; support for SIOC metadata
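Transclusion of catalogues can be pictured as a recursive lookup with a guard against circular includes. The data layout, catalogue names and URLs below are invented for this sketch and do not reflect how SSCF stores catalogues; access policies are omitted.

```python
# Hypothetical catalogues: each has its own bookmarks and included catalogues.
CATALOGUES = {
    "alice/semweb": {
        "bookmarks": ["http://example.org/rdf-primer"],
        "includes": ["bob/dl"],
    },
    "bob/dl": {
        "bookmarks": ["http://example.org/dl-survey"],
        "includes": ["alice/semweb"],  # circular include, handled below
    },
}


def resolve(catalogues, name, _seen=None):
    """Return a catalogue's bookmarks followed by those of included catalogues."""
    seen = set() if _seen is None else _seen
    if name in seen or name not in catalogues:
        return []  # already visited (cycle) or unknown catalogue
    seen.add(name)
    entry = catalogues[name]
    result = list(entry["bookmarks"])
    for included in entry["includes"]:
        result.extend(resolve(catalogues, included, seen))
    return result


print(resolve(CATALOGUES, "alice/semweb"))
```

The included catalogue is resolved by reference at read time, so a friend's later additions show up without copying anything.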
Example of Social Semantic Collaborative Filtering [Figure: example graph using the properties foaf:knows, xfoaf:include and xfoaf:bookmark]
Social Networks in Digital Libraries [Figure: graph connecting a Resource, an xfoaf:Annotation and an xfoaf:Directory with users (user_C, user_D) and creators (creator_A, creator_B) through the properties foaf:knows, marcont:hasCreator, xfoaf:owns, xfoaf:linksTo and xfoaf:isIn]
Support for online communities in SSCF
Outline JeromeDL - Motivation and Overview JeromeDL - Architecture and Ontologies JeromeDL - Semantic Services JeromeDL - Social Services JeromeDL - Semantics in Use
JeromeDL – Delivering Semantic Content Providing semantic annotations during uploading process: open module for handling any taxonomies keywords based on WordNet and free tagging defining structure of resources in the JeromeDL ontology Lifting legacy metadata to MarcOnt ontology Community maintained annotations social semantic collaborative filtering semantic descriptions based on the FOAF metadata
Annotating Library Resources
JeromeDL – Semantic Information In Use. Searching: keyword-based search with semantic query expansion; semantic search (direct RDF querying, natural language templates). Browsing: Exhibit, MultiBeeBrowse. Sharing: Social Semantic Collaborative Filtering, Semantically Interlinked Online Communities. Heterogeneous communication: Bibster, A9, OAI-PMH
Exposing Semantic Annotations
Filtering Resources in JeromeDL
Sharing Knowledge with SSCF
Information Retrieval in JeromeDL [Figure: retrieval architecture; labels: Fulltext Index, Structure Repository, MarcOnt Repository, FOAFRealm Repository, Resources’ Content, RDF Repositories, Secure Snapshot, (typed) keywords, RDF & NL Query, OpenSearch, RSS, collaborative filtering, types translation, semantic query expansion, local interface, distributed interface]
Networks of Digital Libraries. ELP (Extensible Library Protocol) implementation: communication within the JeromeDL network; adapters for communication with other networks. D-FOAF integration (distributed user profile management): single sign-on and single registration within the D-FOAF network. HyperCuP integration (scalable P2P network). Independent ELP network entry point: http://search.jeromedl.org/
Tutorial – Semantic Digital Libraries -  Existing Semantic Digital Libraries Solutions  –  BRICKS Predrag Knežević Fraunhofer IPSI Institute Germany Bernhard Haslhofer  University of Vienna Austria
Outline BRICKS Overview BRICKS Components BRICKS Applications
What is BRICKS? A software infrastructure for building digital library networks: transparent access to distributed resources; multilinguality; easy installation & maintenance. A set of end-user applications: network & content management; Web 2.0 tagging/annotations; domain-specific applications. A business model: open source, platform independent; low-cost infrastructure; user communities → sustainability
BRICKS Architecture. A decentralized P2P network: avoids central coordination; highly scalable, increased reliability; minimized maintenance costs. Each P2P node is a set of SOA components: Web Service interface; platform independent; flexible composition. Components for: storing, accessing and protecting digital objects; (semantic) search & browsing; P2P communication
Accessing Data
A Look into a BNode
Outline BRICKS Overview BRICKS Components BRICKS Applications
Collection Manager Single access point for all content and metadata related operations (local and remote) Physical Collection Similar to folder/directory hierarchy in a file system Bound to a single BNode Each digital content object belongs to exactly one collection Logical Collection Virtual folder for organizing content items independent of their physical location  Links to content items from various physical collections on different BNodes A content item might belong to many of them Stored Query similar to database views
Content Manager Two ways to handle content in BRICKS Stored locally at site of a member party, accessed via URL Stored within BRICKS Based on Java Content Repository (JCR) Provides a meta-content model Re-use of existing content models Use standard models
Metadata Manager. Metadata descriptions → RDF: suitable for any application scenario; express relationships between objects; react to changes without changing the model. Schema definitions → OWL: no fixed schema; extensible (e.g. application profiles); semantic concepts instead of schematic structures. SPARQL: metadata queries over ontology concepts; queries for graph patterns
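"Queries for graph patterns" means finding variable bindings that make a set of triple patterns match the stored graph. The toy matcher below illustrates that idea only; it is not how BRICKS or any SPARQL engine is implemented, and the data and names are made up. It corresponds roughly to `SELECT ?p WHERE { ?p rdf:type ex:Painting . ?p dc:creator ex:vanGogh }`.

```python
# Hypothetical metadata graph as a set of triples.
TRIPLES = {
    ("ex:irises", "rdf:type", "ex:Painting"),
    ("ex:irises", "dc:creator", "ex:vanGogh"),
    ("ex:sunflowers", "rdf:type", "ex:Painting"),
    ("ex:sunflowers", "dc:creator", "ex:vanGogh"),
    ("ex:guernica", "rdf:type", "ex:Painting"),
    ("ex:guernica", "dc:creator", "ex:picasso"),
}


def match(pattern, triple, binding):
    """Extend binding so that pattern equals triple; return None on mismatch."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):  # variable: bind it, or check the earlier binding
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:  # constant: must match exactly
            return None
    return extended


def query(triples, patterns):
    """Return all bindings that satisfy every pattern (a basic graph pattern)."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


results = query(
    TRIPLES,
    [("?p", "rdf:type", "ex:Painting"), ("?p", "dc:creator", "ex:vanGogh")],
)
paintings = sorted(b["?p"] for b in results)
print(paintings)
```

Sharing the variable `?p` between the two patterns is what joins them; a real engine would add indexes and join ordering on top of the same principle.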
Security Manager Transparently invoked by the Framework any service call is checked Context-aware policies based on RBAC (via XACML rules) supporting Roles, Groups, at DLObject level Permission declaration through Javadoc @tags Federated identity is managed through an adapted version of OpenSAML Reputation-based Trust calculation integrated Web-based GUI for security configuration
Digital Rights Management DRM Component Support for licenses based on  MPEG-21 REL license declaration standard Generic API for the integration of commercial DRM systems Watermarking Open-source watermarking tool for images Other tools can be integrated BRICKS Store web application for commercial content Creative Commons support for other content in BRICKS
Outline BRICKS Overview BRICKS Components BRICKS Applications
Application: BRICKS Workspace  What does it demonstrate? A web application (thin client) accessing BRICKS Foundation services Web 2.0 image annotations Reference application Primary customers General end-users (citizens) Application developers Technology Struts based interface to the BCH
Application: BRICKS Desktop  What does it demonstrate? A rich client application accessing BRICKS foundation services Direct access to the BCHN Primary customers Expert end-users (researchers, educators) Application developers Technology Eclipse based rich client interface
Application: Annotation Tool What does it demonstrate? Tool which allows end-users to annotate images Creation of annotation threads Supervised Annotations Primary customers End-users Institutions with large image collections Technology Web Application
Application: Online Exhibition Authoring Tool. What does it demonstrate? Creating and publishing online exhibitions using content that is available in the BRICKS network. Primary customers? Expert end-users (curators). Technology: Web application
Application: Archeological Finds Identifier. What does it demonstrate? A web application for comparing findings (e.g. ancient coins) with objects in reference collections; application of a complex domain ontology (CIDOC-CRM); map visualization of GIS metadata. Primary customers? Museum curators, archaeologists, students, amateurs. Technology: Struts-based interface
References
BRICKS Community Web Site: http://www.brickscommunity.org/ (main contact: silvia.boi@metaware.it)
Related (de-facto) standards:
Resource Description Framework (RDF): http://www.w3.org/TR/rdf-primer/
OWL Web Ontology Language (OWL): http://www.w3.org/TR/owl-guide/
SPARQL: http://www.w3.org/TR/rdf-sparql-query/
Java Content Repository (JCR): http://www.jcp.org/en/jsr/detail?id=170
Tools and Libraries:
Jackrabbit: http://jackrabbit.apache.org/
Jena Semantic Web Framework: http://jena.sourceforge.net/
Tutorial – Semantic Digital Libraries -  Existing Semantic Digital Libraries Solutions  –  Fedora Sandy Payette Director, Fedora Project Cornell University Dean Krafft,  PI, NSDL Cornell University
Outline Fedora NSDL - National Science Digital Library
Fedora Semantic Digital Libraries enable … Scholarly and Scientific   Workbenches “ Web 2.0” Collaborative Repositories Museum   Exhibits   with   Lesson   Plans Linking   Data   and   Publications blog and wiki
The Fedora Project. Fedora: Flexible Extensible Digital Object Repository Architecture. History: Cornell Research (1997-2002): DARPA and NSF-funded research and reference implementations; distributed, interoperable repositories (experiments with CNRI). Open Source Project (2002-present): Andrew W. Mellon Foundation (2002-2009); joint development by Cornell University and University of Virginia; transitioning into a non-profit organization (Fedora Commons 501c3)
Fedora - Technology Integration Semantic Repository Enterprise Preservation Information Networks Contextualization Relationships Query Inference Workflow Messaging Transactions Replication Digital Objects Manage Access Versioning Storage Integrity Check Monitoring Alerting Migration
Motivations:  Fedora and Semantic Technologies A natural model for exposing repository as network of objects Object-to-object relationships Relationships to external entities Query the graph; traversal to discover related stuff Indexing based on generalizable data model Graph-based data model is a common reduction Avoid fixed schema problems and metadata mud wrestling  Extensible enrichment of object descriptions Keep overlaying statements from multiple ontologies Organic evolution Powerful queries and inference for repository management Transitive relationships among objects Dependency analysis;  Detection/Extraction of sub-graphs Provenance of disseminations
RDF in the Fedora Digital Object Model
Digital Objects contain their RDF assertions Assert relationships from Fedora base ontology Collection – member Whole – part Equivalence Description Of More… Assert relationships/properties from community ontologies isAnnotationOf isRecommendedBy isCertifiedBy More ….
Example: Digital Objects with “compositional semantics”
Use Case: scholarly objects and annotation in the humanities [Figure: museum and library objects, commercial web content and scholarly objects (e.g. URI-100, URI-55), linked by relationships such as xx:recommends and yy:certifies]
3 Objects – 3 RDF “Relationships” Datastreams
<rdf:RDF>
  <rdf:Description rdf:about="info:fedora/uva:pid-11">
    <ais:annotationOf rdf:resource="info:fedora/uva:pid-3"/>
  </rdf:Description>
</rdf:RDF>
<rdf:RDF>
  <rdf:Description rdf:about="info:fedora/uva:pid-3">
    <uva:hasPartLetter rdf:resource="info:fedora/uva:pid-2"/>
    <uva:hasPartDiagram rdf:resource="info:fedora/uva:pid-1"/>
  </rdf:Description>
</rdf:RDF>
<rdf:RDF>
  <rdf:Description rdf:about="info:fedora/uva:pid-10">
    <ais:providesContextFor rdf:resource="info:fedora/uva:pid-3"/>
  </rdf:Description>
</rdf:RDF>
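Each relationships datastream boils down to a handful of triples. The sketch below extracts them with Python's standard `xml.etree.ElementTree`; it only handles the flat rdf:Description / rdf:resource shape used on this slide, not RDF/XML in general. The `uva:` namespace URI is a placeholder chosen for the example, since the slide does not show the namespace declarations.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The uva: namespace URI is a placeholder, not taken from the slides.
RELS_EXT = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:uva="http://example.org/uva#">
  <rdf:Description rdf:about="info:fedora/uva:pid-3">
    <uva:hasPartLetter rdf:resource="info:fedora/uva:pid-2"/>
    <uva:hasPartDiagram rdf:resource="info:fedora/uva:pid-1"/>
  </rdf:Description>
</rdf:RDF>
"""


def triples_from_rdfxml(text):
    """Extract (subject, predicate, object) triples from flat RDF/XML."""
    root = ET.fromstring(text)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree reports tags as "{namespace}local"; join into one URI.
            namespace, local = prop.tag[1:].split("}")
            obj = prop.get(f"{{{RDF_NS}}}resource")
            triples.append((subject, namespace + local, obj))
    return triples


triples = triples_from_rdfxml(RELS_EXT)
for triple in triples:
    print(triple)
```

Triples extracted this way are the kind of statements a graph index over the repository would hold.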
Fedora RDF-based Resource Index (RI). NOT the core object store: the RI is a graph-based index of the repository. Automatic, incremental indexing into a triplestore. Search/query the repository via the Fedora RI Query Interface. [Figure: Digital Object Store (RDF datastream, Fedora object properties, DC datastream) and the RDF Index of Repository]
RI Graph - view 1 (abbreviated) …
RI Graph - view 2 (abbreviated) …
RI Implementation: The Triplestore Challenge Scalability Few triplestores perform well  for 100M+ triples Kowari – we tested to 180M triples MPTStore – we tested to 250M triples Performance Jena - easy to get out of memory Sesame Native - slow for complex queries  Kowari  Fast queries and full-featured query language (iTQL) Instability and corruption problems MPTStore Very fast for SPO queries (limited support for complex queries) Add/modify significantly faster than Kowari Mulgara Fork of Kowari; complex queries; models; inference Major bug fixes to fix stability and corruption problems XA2 transactions Claims support for billions of triples
Outline FEDORA  NSDL - National Science Digital Library
Demo Use Case: Object-Centered Sociality
What is NSDL committed to? NSDL 2.0 as a platform for developing digital library tools Support for communities across the full range of science, technology, engineering and mathematics research, learning and education The library as a shared, collaborative, contributory space Supporting the creation of context around library resources to enhance discovery, use, and understanding
NSDL Semantic Digital Library repository requirements Supports storing both content and metadata Allows arbitrary relationships among resource and metadata objects: organization, annotation, citation Accessible through web service architecture of remixable data sources and transformations
NSDL Data Repository (NDR) Implemented in Fedora 2.2 with MPTStore and journalling Moderately large: 4.7 million digital objects, 250 million RDF triples D.O.s: resources, metadata, agents, metadata providers, aggregators A REST API to allow authenticated access by other applications In production at nsdl.org
NSDL as Semantic Digital Library :  collaboration, context, and contribution The NDR and services provide the platform, but we still need the applications  Solution 1: Leverage the existing successful models: blogs, wikis, bookmarking/tagging Solution 2: Leverage the existing software: WordPress, MediaWiki, Connotea, Sakai Solution 3: Engage with partners and the broader community to build applications to the platform
Expert Voices The NSDL Blogosphere, live at http://expertvoices.nsdl.org Topic-based discussions (e.g. forensics) linked to related library resources A way for NSDL community members to become NSDL contributors: of resources, questions, reviews, annotations, metadata WordPress-based multi-user multi-blog application (open source, plug-in architecture) Owner controls publication of entries as NSDL resources and visibility of comments Entries can contain linked references to NSDL resources, references to URLs that should become resources, and new resource metadata
 
 
 
OurNSDL:  NDR-integrated Wiki Community of approved contributors (e.g. teachers, librarians, scientists) are granted edit access on OurNSDL wiki New resources and metadata are created as wiki pages and reflected into the NDR Non-wiki-based NDR resources and metadata are displayed as read-only wiki pages, subject to comment and linking, with links reflected back into RDF relationships in NDR User and project pages organize NDR resources, again reflected back into repository as RDF Now implementing MediaWiki extensions; beta release expected 2Q07
NDR Entry for Soft Matter Wiki Wiki Entry New Metadata New Audience MD Referenced New Resource 1 Referenced Existing Resource 2 Annotates Metadata for Metadata for Member of Metadata Provider Metadata Provider Existing Collection Soft Matter Wiki Member of Inferred relationship between resources
 
 
NSDL 2.0 Ecosystem … Protocol: OAI-PMH HTTP REST NDR API STEM Collections Search Service Archive Service Fedora-based   NDR
NSDL 2.0 and the Semantic Web NSDL 2.0 applications situate resources in context, aiding both discovery and use Users become contributors, adding new resources, ratings, annotations, and organizational structure – frequently as a side effect of using the library Fedora-based semantic web technology organizes resources, ties context to content, maintains provenance, enables discovery, empowers the user, and powers the library
Tutorial – Semantic Digital Libraries -  Comparison and the Future  -  Sebastian R. Kruk, Stefan Decker Bernhard Haslhofer, Predrag Kneževic, Sandy Payette
System Features Comparison: General Properties
Property | JeromeDL | BRICKS | Fedora
OS Support | Any | Any | Any
Hardware Requirements | 500MB RAM, min 128MB HD | 500MB RAM, min 100MB HD | Depends
Software Requirements | Java 1.5, Tomcat 5.5, Sesame | Java 1.4/1.5 | Java 1.5, Tomcat, Kowari/Mulgara or MPTStore
Current Stage | Research, stable version 2.0.1 | Second Prototype | Production Version 2.2
No. Installations | 12+ | ~8 | ~50 monitored; large # of downloads unmonitored
Support Model | Open Source | Open Source | Open Source
System Features Comparison: Architectural Aspects
Property | JeromeDL | BRICKS | Fedora
Distribution | Distributed searching (P2P), aggregated browsing (hierarchical) | Fully decentralized (P2P) | Objects as surrogates for distributed content; federation via search services; Alvis P2P
Architecture Granularity | Low (main building blocks) | High (many components) | Moderate (core repository service with configurable modules; loosely coupled services)
DB Support | Any Sesame-compliant backend | H2, HSQL, Postgres, MySQL, Oracle, SQLServer | MySQL, Postgres, Oracle, McKoi
System Features Comparison Content & Metadata Aspects JeromeDL BRICKS Fedora Content Types All All All Content Models Any Any Metadata Schema MarcOnt + extensions Any OWL- Any Query types Full-text, Field-Search, Ontology-based, NL Query Templates Full-text, Field-Search, Ontology-based Field Search, Ontology-based (iTQL, RDQL, SPO), Full-Text (Lucene or Zebra backed service)
System Features Comparison Security & DRM Aspects JeromeDL BRICKS Fedora Security Model FOAFRealm RBAC XACML Policy Granularity Resource, Degrees of separation Component, Method, Object Object, Datastream, Dissemination method DRM Model Fair use DRM under development MPEG-21 REL DRM Enabling Tool Support Watermarking
System Features Comparison Semantic Aspects & Community Features JeromeDL BRICKS Fedora Reasoning Recommendation engine based on Prolog Configurable inference engine Holding pattern; look to Mulgara; Tagging Free tagging, WordNet-based Annotation middleware/apps (e.g., NSDL/NDR; PLoSONE/Topaz) Taxonomies Any (JOnto) Any Knowledge Sharing SSCF component middleware/apps (e.g., NSDL/NDR; PLoSONE/Topaz) Communities SIOC and FOAF compliance
The future - Social Semantic Digital Libraries Why are current (semantic) digital libraries not enough? digital libraries should serve average people, not only librarians they concentrate on delivering content/information, not on knowledge sharing within a community of users digital libraries have lost the human part of their predecessors
The future - Social Semantic Digital Libraries What could be the solution? make users/readers involved in the content annotation process allow users/readers to share their knowledge within a community provide better communication between users in and across communities
The future - Social Semantic Digital Libraries What is Web 2.0? The Web where “ordinary” users can meet, collaborate, and share using whatever is newly popular on the Web (tagged content, social bookmarking, AJAX, etc.) The term Web 2.0 was made popular by Tim O’Reilly: http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html Popular examples include: Bebo, del.icio.us, digg, Flickr, Google Maps, Skype, Technorati, Wikipedia…
The future - Social Semantic Digital Libraries (3) Web 2.0 focuses include: The Web as a platform for social and collaborative exchange Reusable community contributions Subscriptions to information, news, data flows, services Mass-publishing using web-based social software Social software for communication and collaboration: IM, IRC, Forums, Blogs, Wikis, Social Network Services, Social Bookmarks, MMOGs…
Social Semantic Information Spaces
Comparing Web 1.0 / Web 2.0 / Semantic Web 2.0
Web 1.0 | Web 2.0 | Semantic Web 2.0
Buddy Lists, Address Books | Online Social Networks | Semantic Social Networks
- | - | Semantic Social Information Spaces
CiteSeer, Project Gutenberg | Google Scholar, Book Search | Social Semantic Digital Libraries
Message Boards | Community Portals | Semantic Forums and Community Portals
Personal Websites | Blogs | Semantic Blogs
Altavista, Google | Google Personalised, DumbFind | Semantic Search
Content Management Systems | Wikis | Semantic Wikis
Geo, Time, and Machine Tagging Geo-tagging for resources with a specific geographical location Time-tagging – community driven process of assigning auxiliary multimedia content Machine-tagging – ability to mix structured annotations into tags ROI-tagging: Regions of interest ERP game Asynchronous version with annealing of annotations for less frequently visited libraries
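Machine-tagging, mixing structured annotations into otherwise free-form tags, can be illustrated with a small parser. The slide does not specify a syntax, so this sketch assumes the Flickr-style `namespace:predicate=value` convention; the function names and sample tags are hypothetical.

```python
import re

# Assumed syntax: namespace:predicate=value (Flickr-style machine tags).
MACHINE_TAG = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.+)$")


def parse_tag(tag):
    """Return a dict for a machine tag, or None for an ordinary free-text tag."""
    m = MACHINE_TAG.match(tag.strip())
    if m is None:
        return None
    namespace, predicate, value = m.groups()
    return {"namespace": namespace, "predicate": predicate, "value": value}


def split_tags(tags):
    """Separate plain tags from structured machine tags."""
    plain, structured = [], []
    for tag in tags:
        parsed = parse_tag(tag)
        if parsed is None:
            plain.append(tag)
        else:
            structured.append(parsed)
    return plain, structured


plain, structured = split_tags(["library", "geo:lat=53.27", "geo:long=-9.05", "semantic web"])
print(plain)
print(structured)
```

A geo-tag then becomes a special case of a machine tag: the structured part can be turned into RDF statements about the resource while the plain tags stay as free tagging.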
SDL in eLearning One of the potential sources for future e-Learning systems On the border between formal (libraries) and informal (communities) learning sources Semantic interoperability with Learning Management Systems Improve knowledge creation, delivery and sharing
SDL in Future Museums Museums have physical objects Should bind digital annotations with physical objects Real-virtual tours Start with a real, guided tour Ubiquitous browsing through context information Locate other exhibitions in the vicinity Share your knowledge and experience with others, leave bread-crumbs for others Get the most out of the exhibition during your visit
Discussion – Feedback The Librarian from Unseen University  in Ankh-Morpork  (formerly Dr. Horace Worblehat)

Tutorial on Semantic Digital Libraries (WWW'2007)

  • 1. Tutorial – Semantic Digital Libraries - Introduction - Sebastian R. Kruk , Stefan Decker , Bernhard Haslhofer, Predrag Kneževic , Sandy Payette, Dean Krafft
  • 2. Tutorial overview Who we are Sebastian R. Kruk, DERI Galway – Ireland Stefan Decker, DERI Galway – Ireland Bernhard Haslhofer, University of Vienna - Austria Predrag Knezevic, Fraunhofer IPSI – Germany Sandy Payette, Cornell University, USA Dean Krafft, NSDL, USA In the next 3,5 hours we want to give you a brief introduction to the Semantic Web, and show how SW is related to digital libraries present existing semantic digital library systems discuss the current problems and future directions of semantic digital libraries and get feedback from you After this tutorial you will know what is the semantic digital library system existing solutions in various degrees of detail
  • 3. Tutorial Schedule: 1:30 - 2:15 Introduction to Semantic Digital Libraries; 2:15 - 3:00 Existing Solutions - JeromeDL; 3:00 - 3:15 Coffee break; 3:15 - 4:30 Existing Semantic Digital Libraries solutions (BRICKS, FEDORA, SIMILE); 4:30 - 5:00 Conclusions, discussion & future of SemDL
  • 4. The Semantic Web – A Brief Introduction Current Web vs. Semantic Web? An extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation. [Tim Berners-Lee] Current Web was designed for humans, and there is little information usable for machines Was the Web meant to be more? Objects with well defined attributes as opposed to untyped hyperlinks between Internet resources A network of relationships amongst named objects, yielding unified information management tasks What do you mean by “Semantic”? the semantics of something is the meaning of something Semantic Web is able to describe things in a way that computers can understand
  • 5. Outline Introduction to Semantic Web Semantic Digital Libraries
  • 6. The Semantic Web – A Brief Introduction Where are we in the “Semantic Web layer cake”? You Are Here!
  • 7. The Semantic Web – A Brief Introduction The challenge for the Semantic Web The Semantic Web can’t work all by itself For example, it is not very likely that you will be able to sell your car just by putting your RDF file on the Web Need society-scale applications: Semantic Web agents and/or services, consumers and processors for semantic data, more advanced collaborative applications
  • 8. The Semantic Web – What is RDF? Describing things on the Semantic Web RDF (Resource Description Framework) a data format for describing information and resources, the fundamental data model for the Semantic Web Using RDF, we can describe relationships between things like: A is a part of B or Y is a member of Z and their properties (size, weight, age, price …) in a machine-understandable format where each thing has a RDF graph-based model delivers straightforward machine processing Putting information into RDF files makes it possible for “scutters” or RDF crawlers to search, discover, pick up, collect, analyse and process information from the Web
  • 9. The Semantic Web – What is RDF? A simple RDF example Statement: “Stefan Decker is the creator of the resource (web page) http://www.stefandecker.org” Structure: Resource (subject) http://www.stefandecker.org Property (predicate) http://purl.org/dc/elements/1.1/creator Value (object) “Stefan Decker” Directed graph: http://www.stefandecker.org dc:creator Stefan Decker
  • 10. The Semantic Web – How RDF can help us? How can RDF help us? identify objects establish relationships express a new relationship: just add a new RDF statement integrate information from different sources: copy all the RDF data together RDF allows many points of view
  • 11. The Semantic Web – Ontologies and Schemata What is an Ontology? “An ontology is a specification of a conceptualization.” Tom Gruber, 1993 Ontologies are social contracts Agreed, explicit semantics Understandable to outsiders (Often) derived in a community process Ontology markup and representation languages: RDF and RDF Schema OWL Other: DAML+OIL, EER, UML, Topic Maps, MOF, XML Schemas
  • 12. Defines small vocabulary for RDF: Class, subClassOf, type Property, subPropertyOf domain, range Vocabulary can be used to define other vocabularies for your application domain The Semantic Web – RDF Schema Person Student Researcher subClassOf subClassOf Jeen type hasSuperVisor domain range Frank type hasSuperVisor
  • 13. OWL – The Web Ontology Language Owl took Christopher Robin’s notice from Rabbit and looked at it nervously. He could spell his own name WOL, and he could spell Tuesday so that you knew it wasn’t Wednesday, and he could read quite comfortably when you weren’t looking over his shoulder and saying "Well?" all the time... provides a vocabulary for defining classes, their properties and their relationships among classes. The Semantic Web – OWL owl:disjointWith s s s s Animal Herbivore Carnivore Omnivore Based on Description Logics OWL is a W3C Recommendation
  • 14. The Semantic Web – Applications Semantic Web cannot be and is not only a set of recommendations Semantic Web is becoming reality by applications that support it and are based on it Enabling technologies: RDF Storages: Sesame, Jena, YARS Reasoners: KAON, Racer Editors: Protege, SWOOP, MarcOnt Portal End-User applications: Semantic wikis: Makna, SemperWiki Semantic blogs Semantic digital libraries
  • 15. Outline Introduction to Semantic Web Semantic Digital Libraries
  • 16. What is a Semantic Digital Library? Semantic digital libraries integrate information based on different metadata, e.g.: resources, user profiles, bookmarks, taxonomies – high quality semantics = highly and meaningfully connected information provide interoperability with other systems (not only digital libraries) on either metadata or communication level or both – RDF as common denominator between digital libraries and other services delivering more robust, user friendly and adaptable search and browsing interfaces empowered by semantics
  • 17. Evolution of Libraries Social Semantic Digital Library Involves the community into sharing knowledge Semantic Digital Library Accessible by  machines, not only with machines Digital Library Online, easy searching with a full-text index Library Organized collection
  • 18. Different Kind of Libraries (Evolution of Libraries) Classic libraries Scientific libraries Digital libraries Semantic libraries
  • 19. How are Semantic Digital Libraries different? Semantic digital libraries extend digital libraries by describing and exposing its resources in a machine ‘understandable’ way resources can be contents, digital artefacts organization of objects (e.g. collections) users, user communities controlled vocabularies, thesauri, taxonomies expose the semantics of their metadata in terms of an ontology defined using a formal language deliver mediation services for communication with other systems
  • 20. Semantic Web Technologies for Digital Libraries? Metadata is the key concept the Web does not have metadata the idea of a Semantic Web is nice but difficult to implement many digital libraries do have metadata in place we simply must make them available in a machine understandable format the Semantic Web provides the format: RDF
  • 21. Semantic Web Technologies for Digital Libraries? Knowledge in bibliographic records Digital Libraries already have controlled vocabularies, taxonomies or even ontologies in place the challenge is to model this knowledge in a machine understandable way the Semantic Web provides ontology languages: RDF Schema OWL SKOS
  • 22. A Sample Bibliographic Record Copyright 2000 The J. Paul Getty Trust & College Art Association, Inc . Terms taken from Controlled Vocabularies Vincent van Gogh; painter: Gogh, Vincent van (Dutch painter, 1853-1890) Creation-Creator/Role J. Paul Getty Museum Current Location-Repository Name irises , nature , soil , etc. Subject-Matter 1889, earliest: 1889, latest: 1889 Creation-Date Irises Title paintings Object/Work type Paintings Classification
  • 23. Knowledge Organization Systems tools that present the organized interpretation of knowledge structures semantic tools - meaning of words and other symbols as well as (semantic) relations between symbols and concept organize information and promote knowledge management Examples: classification and categorization schemata (organize materials at a general level) subject headings (provide more detailed access) authority files (control variant versions of key information such as geographic names and personal names) highly structured vocabularies, such as thesauri traditional schemes, such as semantic networks and ontologies
  • 24. Taxonomy of Knowledge Organization Systems Term Lists Authority files ( FOAF ) Glossaries Dictionaries Gazetteers Classifications and Categories ( DMoz ) Subject headings Classification schemes Taxonomies Categorization Schemes. Relationship Lists Thesauri ( WordNet, MeSH ) Semantic networks Ontologies (Hodge, 2000)
  • 25. Simple Knowledge Organization Systems (SKOS) basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, 'folksonomies', other types of controlled vocabulary core concepts: narrower and broader isSubjectOf and subject; isPrimarySubjectOf and primarySubject member and Collection; memberList and OrderedCollection related and semanticRelation note, definition; altLabel and prefLabel; symbol and altSymbol
  • 26. Benefits of Semantic Digital Libraries Problems of today’s libraries rapidly growing islands of highly organized information How to find things in a growing information space? is it enough to have a full-text index (à la Google)? typical “end-users” versus “expert users” converging digital library systems e.g. uniform access to Europe’s digital libraries and cultural heritage
  • 27. Benefits of Semantic Digital Libraries The two main benefits of Semantic Digital Libraries new search paradigms for the information space Ontology-based search / facet search Community-enabled browsing providing interoperability on the data level integrating metadata from various heterogeneous sources Interconnecting different digital library systems
  • 28. Searching the Sample Bibliographic Record Full-text search “Paintings” AND “Van Gogh” AND “flowers”: no result Semantic query if the knowledge that “irises” are “flowers” is modeled in an ontology (e.g. subclass-hierarchy) we can query for all “Paintings” by “Van Gogh” with subject “flowers” and retrieve also the picture with subject “irises” Copyright 2000 The J. Paul Getty Trust & College Art Association, Inc. Vincent van Gogh; painter: Gogh, Vincent van (Dutch painter, 1853-1890) Creation-Creator/Role J. Paul Getty Museum Current Location-Repository Name irises, nature, soil, etc. Subject-Matter 1889, earliest: 1889, latest: 1889 Creation-Date Irises Title paintings Object/Work type Paintings Classification
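The difference between the two searches on this slide can be sketched in a few lines. This is an illustration only, not how any of the systems in this tutorial implement search: the `SUBCLASS_OF` hierarchy is a hypothetical stand-in for an ontology, the record is a simplified version of the Getty sample, and all function names are made up for the example.

```python
# Hypothetical subclass hierarchy standing in for an ontology.
SUBCLASS_OF = {"irises": "flowers", "sunflowers": "flowers", "flowers": "plants"}

# Simplified form of the sample bibliographic record.
RECORDS = [
    {
        "title": "Irises",
        "creator": "Vincent van Gogh",
        "type": "paintings",
        "subjects": ["irises", "nature", "soil"],
    }
]


def full_text_search(records, *keywords):
    """Match only records whose text literally contains every keyword."""
    hits = []
    for record in records:
        text = " ".join(
            [record["title"], record["creator"], record["type"]] + record["subjects"]
        ).lower()
        if all(keyword.lower() in text for keyword in keywords):
            hits.append(record)
    return hits


def narrower_terms(term):
    """Return the term plus every term that is transitively a subclass of it."""
    result = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in SUBCLASS_OF.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


def semantic_search(records, creator, work_type, subject):
    """Match records whose subjects include the term or any narrower term."""
    wanted = narrower_terms(subject)
    return [
        record
        for record in records
        if record["creator"] == creator
        and record["type"] == work_type
        and wanted.intersection(record["subjects"])
    ]


literal_hits = full_text_search(RECORDS, "paintings", "van gogh", "flowers")
semantic_hits = semantic_search(RECORDS, "Vincent van Gogh", "paintings", "flowers")
print(literal_hits)
print([record["title"] for record in semantic_hits])
```

The full-text search fails because the word "flowers" never appears in the record, while the semantic search expands "flowers" to include "irises" before matching.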
  • 29. Semantic Digital Libraries and Existing DL Systems how to handle the legacy (meta-)data problem lifting existing (meta-)data to a semantic level simple solutions like MARC21 to DublinCore complex ontologies like MarcOnt Ontology for capturing concepts from different standards legacy libraries expose their metadata via well established protocols - the metadata can be imported into semantic DLs semantic DLs can play the role of integration champions in the information retrieval process in heterogeneous networks: OAI-PMH Z39.50 Dienst
  • 30. Application Areas for Semantic Web Technologies Thesauri & Controlled Vocabularies qualified DublinCore DMoz, DDC-based taxonomies SKOS, WordNet and other thesauri Schema Mappings / Crosswalks MarcOnt Ontology – aims to cover concepts from MARC21, BibTeX and DublinCore MarcOnt Mediation Services – an open mediation framework between common legacy metadata standards Metadata Integration RDF as a common data model for integrating metadata from various autonomous and heterogeneous data sources OWL for modeling the data source’s semantics SPARQL as a common query language
  • 31. Semantic DL as Evolving Knowledge Space In state-of-the-art digital libraries users are consumers Retrieve contents based on available bibliographic records Recent trends: user communities Connotea Flickr In Semantic digital libraries users are contributors as well Tagging (Web 2.0) Social Semantic Collaborative Filtering Annotations Semantic Digital libraries enforce the transition from a static information space to a dynamic (collaborative) knowledge space
  • 32. Existing Semantic Digital Library Systems JeromeDL a social semantic digital library makes use of Semantic Web and Social Networking technologies to enhance both interoperability and usability BRICKS aims at establishing the organizational and technological foundations for a digital library network in order to share knowledge and resources in the cultural heritage domain. FEDORA delivers a flexible service-oriented architecture for managing and delivering content in the form of digital objects SIMILE extends and leverages DSpace, seeking to enhance interoperability among digital assets, schemata, metadata, and services
  • 33. Tutorial – Semantic Digital Libraries - Existing Semantic Digital Libraries Solutions – Sebastian R. Kruk, Stefan Decker, Predrag Knežević, Bernhard Haslhofer, Sandy Payette, Dean Krafft
  • 34. Existing Semantic Digital Library Systems JeromeDL a social semantic digital library makes use of Semantic Web and Social Networking technologies to enhance both interoperability and usability BRICKS aims at establishing the organizational and technological foundations for a digital library network in order to share knowledge and resources in the cultural heritage domain. FEDORA delivers a flexible service-oriented architecture for managing and delivering content in the form of digital objects SIMILE extends and leverages DSpace, seeking to enhance interoperability among digital assets, schemata, metadata, and services
  • 35. Tutorial 7 – Semantic Digital Libraries - Existing Semantic Digital Libraries Solutions – JeromeDL Sebastian R. Kruk
  • 36. Outline JeromeDL - Motivation and Overview JeromeDL - Architecture and Ontologies JeromeDL - Semantic Services JeromeDL - Social Services JeromeDL - Semantics in Use
  • 37. JeromeDL - Introduction Joint effort of DERI, National University of Ireland, Galway and Gdansk University of Technology (GUT) Distributed under BSD Open Source license Digital library built on semantic web technologies to answer requirements from: librarians, scientists and everyone.
  • 38. JeromeDL – Motivations Use Cases Librarians: support for rich metadata (MARC21) in uploading resources, accessing bibliographic information and searching persistent identifiers Scientists: easy publishing (designed as an institute/university digital library) creating hierarchical networks of digital libraries support for accessing, sharing and searching using bibliography metadata (BibTeX) Everyone: simple search (incl. natural language queries) community-aware information sharing and browsing, support for internationalization
  • 39. JeromeDL - Motivations Support for different kinds of bibliographic medatata, like: DublinCore , BibTeX and MARC21 at the same time. Making use of existing rich sources of bibliographic descriptions (like MARC21) created by human. Supporting users and communities: user s ha ve control over their profile information ; community-aware profiles are integrated with bibliographic descriptions support for community generated knowledge Delivering communication between instances: P2P mode for searching and users authentication Hierarchical mode for browsing
  • 41. JeromeDL – Architecture Resources and annotations repository Middleware: query processing community space resources management User interface agents: Communication to the outside world Administrative interface
  • 42. Bibliographic Description in JeromeDL RDF/XML: <?xml version="1.0" encoding="UTF-8"?> <rdf:Description rdf:about="http://...id=828374765"> <dc:title>JeromeDL - Adding Semantic Web Technologies to DLs</dc:title> <dc:creator>Sebastian Kruk</dc:creator> <dc:description>In recent years...</dc:description> </rdf:Description> MARC21: 01450cas 922004331i 450000100...019c19329999gw qr|p| ||||0 |0ger | a0044-2992 9a200412140219bVLOADc200404071525dvkulc200310071018dvbjc200303101205dkopumky200209211341zVLOAD aGD U/MPcGD U/MPdGD U/MFdGD U/KKsdWR O/EJ0 ager1 aZ. Kunstgesch. 0aZeitschrift für Kunstgeschichte00aZeitschrift für Kunstgeschichte.18aZfK aMünchen ;aBerlin :bDeutscher Kunstverlag,c1932-. c26-29 cm. aKwart.0 a1 Bd. (Juni 1932)-. aOpis na podst.: LCC. aW 1932 założycielami czasopisma byli Wilhelm Waetzoldt i Ernst Gall.... BibTeX: @InProceedings{jeromedexa2005, author = "Sebastian Ryszard Kruk and ... ", title = "{JeromeDL - Adding Semantic ...}", booktitle = "{In Proceedings to DEXA 2005}", year = 2005} All of these can be represented in RDF.
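As a rough illustration of how a BibTeX-style record could be lifted into an RDF description like the one on this slide, here is a minimal sketch using only the Python standard library. It is not the MarcOnt mediation service: the `BIBTEX_TO_DC` mapping, the function name and the `example.org` URI are assumptions made for the example, and only Dublin Core elements are emitted.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical mapping from BibTeX field names to Dublin Core elements.
BIBTEX_TO_DC = {"title": "title", "author": "creator", "year": "date"}


def bibtex_to_rdf(resource_uri, fields):
    """Build an rdf:RDF element describing one resource with DC properties."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": resource_uri}
    )
    for key, value in fields.items():
        dc_name = BIBTEX_TO_DC.get(key)
        if dc_name is None:
            continue  # field not covered by this toy mapping
        ET.SubElement(desc, f"{{{DC_NS}}}{dc_name}").text = str(value)
    return root


# Illustrative record; the URI is a placeholder, not a JeromeDL identifier.
entry = {
    "author": "Sebastian Kruk",
    "title": "JeromeDL - Adding Semantic Web Technologies to DLs",
    "year": 2005,
    "booktitle": "DEXA 2005",
}
rdf = bibtex_to_rdf("http://example.org/resource/1", entry)
print(ET.tostring(rdf, encoding="unicode"))
```

Fields without a mapping (here `booktitle`) are silently skipped; a real mediation layer would need a richer target vocabulary to avoid losing them.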
  • 47. Metadata and Services in JeromeDL
  • 49. MarcOnt Initiative – Overview Motivation: Provide set of tools for collaborative ontology development MarcOnt Initiative goals: Create a framework for collaborative ontology improvement (E-learning) Provide domain experts with tools to share their knowledge Offer tools for data mediation between different data formats
  • 50. MarcOnt Portal and MarcOnt Ontology MarcOnt Ontology: Central point of MarcOnt Initiative Translation and mediation format Continuous collaborative ontology improvement Knowledge from the domain experts MarcOnt Portal (source of knowledge): Suggestions Annotations Versioning Ontology editor
  • 51. MarcOnt Mediation Services for Legacy Metadata Format translation RDF Translator Format co-operation MarcOnt Mediation Services
  • 53. Social Services in JeromeDL Involve users into sharing knowledge Blogs – comments and discussions about documents and resources Tagging – collaborative classification Wikis – collaboratively edited additional descriptions, such as summaries and interesting facts Preserve knowledge for future use Users can learn from experience of others instantly Recommend new, interesting resources based on users’ profiles
  • 54. FOAF - Describing Social Networks FOAF stands for Friend-of-a-Friend Defines properties for a person (but it does not have to be a person, it can be an “agent”) A file does not have to describe only one person Can build a network of people with foaf:knows links FOAF can be easily extended to meet requirements, as in the case of FOAFRealm for identity management…
  • 55. Identity management with FOAFRealm Identity defined with extended FOAF metadata Policies expressed by social networking Distance between owner and requester Friendship level between owner and requester, calculated across digraph of social network Support for single registration and sign on Distributed identity management with HyperCuP (“D-FOAF”) FOAFRealm is currently implemented as a plugin for Tomcat (Realm/Valve implementation), with PHP and .NET versions coming soon
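The "distance between owner and requester" policy can be pictured as a shortest-path question on the directed foaf:knows graph. The sketch below is only an illustration under that assumption: the function names, the sample people and the plain hop-count threshold are invented for the example, and the friendship-level weighting mentioned on the slide is not modelled.

```python
from collections import deque


def separation(knows, owner, requester):
    """Return the number of foaf:knows hops from owner to requester, or None."""
    if owner == requester:
        return 0
    seen = {owner}
    queue = deque([(owner, 0)])
    while queue:
        person, dist = queue.popleft()
        for friend in knows.get(person, ()):
            if friend == requester:
                return dist + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, dist + 1))
    return None  # requester is not reachable from owner


def may_access(knows, owner, requester, max_distance):
    """Grant access only if requester is within max_distance hops of owner."""
    dist = separation(knows, owner, requester)
    return dist is not None and dist <= max_distance


# Directed edges: alice knows bob, bob knows carol, carol knows dave.
knows = {"alice": ["bob"], "bob": ["carol"], "carol": ["dave"], "dave": []}
print(may_access(knows, "alice", "carol", 2))
print(may_access(knows, "alice", "dave", 2))
```

Because the graph is directed, `separation(knows, "dave", "alice")` finds no path even though alice can reach dave.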
  • 56. What is Social Semantic Collaborative Filtering? Goal: to enhance individual bookmarks with shared knowledge within a community Users annotate catalogues of bookmarks with semantic information taken from DMoz or WordNet vocabularies Catalogues can include (transclusion) friends' catalogues Access to catalogues can be restricted with social networking-based policies SSCF delivers: Community-oriented, semantically-rich taxonomies Information about a user's interest Flows of expertise from the domain expert Recommendations based on users' previous actions Support for SIOC metadata
  • 57. Example of Social Semantic Collaborative Filtering foaf:knows xfoaf:include xfoaf:bookmark
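Transclusion of a friend's catalogue can be sketched as a recursive union over "include" links. The data layout and names below are hypothetical (they are not the xfoaf vocabulary itself), and the access policies from the slide are left out; the point is only that included catalogues contribute their bookmarks and that mutual inclusion must not loop forever.

```python
def resolve_bookmarks(catalogues, name, seen=None):
    """Collect a catalogue's bookmarks plus those of every included catalogue."""
    seen = set() if seen is None else seen
    if name in seen:
        return set()  # already expanded: this breaks inclusion cycles
    seen.add(name)
    catalogue = catalogues[name]
    found = set(catalogue["bookmarks"])
    for included in catalogue["includes"]:
        found |= resolve_bookmarks(catalogues, included, seen)
    return found


# Two users who include each other's catalogues (placeholder URLs).
catalogues = {
    "anna/semweb": {
        "bookmarks": {"http://example.org/rdf-primer"},
        "includes": ["ben/libraries"],
    },
    "ben/libraries": {
        "bookmarks": {"http://example.org/marc21"},
        "includes": ["anna/semweb"],
    },
}
print(sorted(resolve_bookmarks(catalogues, "anna/semweb")))
```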
  • 58. Social Networks in Digital Libraries Resource xfoaf:Annotation user_C creator_B foaf:knows marcont:hasCreator creator_A foaf:knows foaf:knows xfoaf:Directory user_D xfoaf:owns xfoaf:linksTo xfoaf:isIn
  • 59. Support for online communities in SSCF
  • 61. JeromeDL – Delivering Semantic Content Providing semantic annotations during uploading process: open module for handling any taxonomies keywords based on WordNet and free tagging defining structure of resources in the JeromeDL ontology Lifting legacy metadata to MarcOnt ontology Community maintained annotations social semantic collaborative filtering semantic descriptions based on the FOAF metadata
  • 63. JeromeDL – Semantic Information In Use Searching: Keyword-based search with semantic query expansion Semantic search: Direct RDF querying Natural language templates Browsing: Exhibit, MultiBeeBrowse Sharing: Social Semantic Collaborative Filtering, Semantically Interlinked Online Communities Heterogeneous communication: Bibster, A9, OAI-PMH
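Keyword search with semantic query expansion can be illustrated by widening the query with related terms before matching. This is a toy sketch, not JeromeDL's algorithm: the `related` table stands in for whatever vocabulary (for example WordNet or a taxonomy) supplies the related terms, and the documents are invented.

```python
def expand_query(terms, related):
    """Return the query terms followed by their related terms, without duplicates."""
    expanded = []
    for term in terms:
        for candidate in [term, *related.get(term, [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def search(documents, terms, related):
    """Return ids of documents sharing at least one keyword with the expanded query."""
    wanted = set(expand_query(terms, related))
    return [doc_id for doc_id, keywords in documents.items() if wanted & set(keywords)]


# Hypothetical vocabulary and keyword index.
related = {"library": ["archive", "repository"]}
documents = {
    "d1": ["repository", "rdf"],
    "d2": ["museum"],
    "d3": ["library"],
}
print(expand_query(["library"], related))
print(search(documents, ["library"], related))
```

Without expansion only `d3` would match; with it, `d1` is found through the related term "repository".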
  • 67. Information Retrieval in JeromeDL Fulltext Index Structure Repository MarcOnt Repository Resources’ Content FOAFRealm Repository (typed) keywords RDF & NL Query OpenSearch RSS collaborative filtering types translation semantic query expansion RDF Repositories Secure Snapshot local interface distributed interface
  • 68. Networks of Digital Libraries ELP (Extensible Library Protocol) implementation communication within JeromeDL network adapters for communication with other networks D-FOAF integration (distributed user profile management) single sign on and single registration within D-FOAF network HyperCuP integration (scalable P2P network) Independent ELP network entry point: http://search.jeromedl.org/
  • 69. Tutorial – Semantic Digital Libraries - Existing Semantic Digital Libraries Solutions – BRICKS Predrag Knežević Fraunhofer IPSI Institute Germany Bernhard Haslhofer University of Vienna Austria
  • 70. Outline BRICKS Overview BRICKS Components BRICKS Applications
  • 71. What is BRICKS? A software infrastructure for building digital library networks Transparent access to distributed resources Multilinguality Easy installation & maintenance A set of end-user applications Network & content management Web 2.0 tagging/annotations Domain specific applications A business model Open source, platform independent Low cost infrastructure User communities → sustainability
  • 72. BRICKS Architecture A decentralized P2P network Avoid central coordination Highly scalable, increased reliability Minimized maintenance costs Each P2P Node is a set of SOA components Web Service interface Platform independent Flexible composition Components for Storing, accessing and protecting digital objects (Semantic) search & browsing P2P communication
  • 74. A Look into a BNode
  • 76. Collection Manager Single access point for all content and metadata related operations (local and remote) Physical Collection Similar to folder/directory hierarchy in a file system Bound to a single BNode Each digital content object belongs to exactly one collection Logical Collection Virtual folder for organizing content items independent of their physical location Links to content items from various physical collections on different BNodes A content item might belong to many of them Stored Query similar to database views
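The difference between physical and logical collections can be made concrete with a small in-memory model. This is a sketch of the constraints stated on the slide, not the BRICKS API: all class, method and BNode names are assumptions, and stored queries are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalCollection:
    name: str
    bnode: str  # a physical collection is bound to a single BNode
    items: set = field(default_factory=set)


@dataclass
class LogicalCollection:
    name: str
    links: set = field(default_factory=set)  # item ids, wherever they are stored


class CollectionManager:
    def __init__(self):
        self.physical = {}
        self.logical = {}
        self.home = {}  # item id -> name of its one physical collection

    def add_physical(self, name, bnode):
        self.physical[name] = PhysicalCollection(name, bnode)

    def add_logical(self, name):
        self.logical[name] = LogicalCollection(name)

    def store(self, item_id, physical_name):
        # Each content object belongs to exactly one physical collection.
        if item_id in self.home:
            raise ValueError(f"{item_id} already belongs to {self.home[item_id]}")
        self.physical[physical_name].items.add(item_id)
        self.home[item_id] = physical_name

    def link(self, item_id, logical_name):
        # Logical collections only hold links; an item may appear in many.
        if item_id not in self.home:
            raise KeyError(item_id)
        self.logical[logical_name].links.add(item_id)

    def locate(self, item_id):
        """Return the BNode that physically holds the item."""
        return self.physical[self.home[item_id]].bnode

    def logical_memberships(self, item_id):
        return sorted(n for n, c in self.logical.items() if item_id in c.links)


manager = CollectionManager()
manager.add_physical("coins", "bnode-vienna")
manager.add_physical("letters", "bnode-darmstadt")
manager.store("coin-1", "coins")
manager.store("letter-7", "letters")
manager.add_logical("roman-era")
manager.add_logical("favourites")
manager.link("coin-1", "roman-era")
manager.link("coin-1", "favourites")
manager.link("letter-7", "favourites")
print(manager.locate("coin-1"), manager.logical_memberships("coin-1"))
```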
  • 77. Content Manager Two ways to handle content in BRICKS Stored locally at site of a member party, accessed via URL Stored within BRICKS Based on Java Content Repository (JCR) Provides a meta-content model Re-use of existing content models Use standard models
  • 78. Metadata Manager Metadata descriptions → RDF Suitable for any application scenario Express relationships between objects React to changes without changing the model Schema definitions → OWL No fixed schema Extensible (e.g. Application profiles) Semantic concepts instead of schematic structures SPARQL Metadata queries over ontology concepts Queries for graph patterns
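"Queries for graph patterns" means finding variable bindings that make every triple pattern in the query match at once. The sketch below imitates that idea over an in-memory list of triples; it is not SPARQL and not the BRICKS Metadata Manager, and the `ex:` identifiers are invented for the example.

```python
def match_pattern(triples, patterns, binding=None):
    """Yield variable bindings that satisfy every triple pattern in order."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        matched = True
        for want, got in zip(first, triple):
            if want.startswith("?"):
                # A variable must keep the value it was bound to earlier.
                if candidate.setdefault(want, got) != got:
                    matched = False
                    break
            elif want != got:
                matched = False
                break
        if matched:
            yield from match_pattern(triples, rest, candidate)


# Hypothetical metadata statements about two digital objects.
triples = [
    ("ex:coin1", "rdf:type", "ex:Image"),
    ("ex:coin1", "dc:creator", "ex:museumA"),
    ("ex:letter7", "rdf:type", "ex:Text"),
    ("ex:letter7", "dc:creator", "ex:museumA"),
]
# Roughly: SELECT ?obj ?who WHERE { ?obj rdf:type ex:Image . ?obj dc:creator ?who }
patterns = [("?obj", "rdf:type", "ex:Image"), ("?obj", "dc:creator", "?who")]
print(list(match_pattern(triples, patterns)))
```

The shared variable `?obj` is what joins the two patterns, so only the image's creator is returned.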
  • 79. Security Manager Transparently invoked by the Framework any service call is checked Context-aware policies based on RBAC (via XACML rules) supporting Roles, Groups, at DLObject level Permission declaration through Javadoc @tags Federated identity is managed through an adapted version of OpenSAML Reputation-based Trust calculation integrated Web-based GUI for security configuration
  • 80. Digital Rights Management DRM Component Support for licenses based on MPEG-21 REL license declaration standard Generic API for the integration of commercial DRM systems Watermarking Open-source watermarking tool for images Other tools can be integrated BRICKS Store web application for commercial content Creative Commons support for other content in BRICKS
  • 82. Application: BRICKS Workspace What does it demonstrate? A web application (thin client) accessing BRICKS Foundation services Web 2.0 image annotations Reference application Primary customers General end-users (citizens) Application developers Technology Struts based interface to the BCH
  • 83. Application: BRICKS Desktop What does it demonstrate? A rich client application accessing BRICKS foundation services Direct access to the BCHN Primary customers Expert end-users (researchers, educators) Application developers Technology Eclipse based rich client interface
  • 84. Application: Annotation Tool What does it demonstrate? Tool which allows end-users to annotate images Creation of annotation threads Supervised Annotations Primary customers End-users Institutions with large image collections Technology Web Application
  • 85. Application: Online Exhibition Authoring Tool What does it demonstrate? Creating and publishing online exhibitions using content that is available in the BRICKS network Primary customers? Expert end-users (curators) Technology Web Application
  • 86. Application: Archeological Finds Identifier What does it demonstrate? A web application for comparing findings (e.g. ancient coins) with objects in reference collections Application of complex domain ontology (CIDOC-CRM) Map visualization of GIS-Metadata Primary customers? Museum curators, archaeologists, students, amateurs Technology Struts based interface
  • 87. References BRICKS Community Web Site http://www.brickscommunity.org/ Main Contact: silvia.boi@metaware.it Related (de-facto) standards Resource Description Framework (RDF) http://www.w3.org/TR/rdf-primer/ OWL Web Ontology Language (OWL) http://www.w3.org/TR/owl-guide/ SPARQL http://www.w3.org/TR/rdf-sparql-query/ Java Content Repository (JCR) http://www.jcp.org/en/jsr/detail?id=170 Tools and Libraries Jackrabbit http://jackrabbit.apache.org/ Jena Semantic Web Framework http://jena.sourceforge.net/
  • 88. Tutorial – Semantic Digital Libraries - Existing Semantic Digital Libraries Solutions – Fedora Sandy Payette Director, Fedora Project Cornell University Dean Krafft, PI, NSDL Cornell University
  • 89. Outline Fedora NSDL - National Science Digital Library
  • 90. Fedora Semantic Digital Libraries enable … Scholarly and Scientific Workbenches “Web 2.0” Collaborative Repositories Museum Exhibits with Lesson Plans Linking Data and Publications blog and wiki
  • 91. The Fedora Project Fedora: Flexible Extensible Digital Object Repository Architecture History Cornell Research (1997-2002) DARPA and NSF-funded research and reference implementations Distributed, Interoperable Repositories (experiments with CNRI) Open Source Project (2002-present) Andrew W. Mellon Foundation (2002-2009) Joint development by Cornell University and University of Virginia Transitioning into non-profit organization (Fedora Commons 501c3)
  • 92. Fedora - Technology Integration Semantic Repository Enterprise Preservation Information Networks Contextualization Relationships Query Inference Workflow Messaging Transactions Replication Digital Objects Manage Access Versioning Storage Integrity Check Monitoring Alerting Migration
  • 93. Motivations: Fedora and Semantic Technologies A natural model for exposing repository as network of objects Object-to-object relationships Relationships to external entities Query the graph; traversal to discover related stuff Indexing based on generalizable data model Graph-based data model is a common reduction Avoid fixed schema problems and metadata mud wrestling Extensible enrichment of object descriptions Keep overlaying statements from multiple ontologies Organic evolution Powerful queries and inference for repository management Transitive relationships among objects Dependency analysis; Detection/Extraction of sub-graphs Provenance of disseminations
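"Transitive relationships among objects" and "traversal to discover related stuff" can be illustrated by following part-of style predicates until nothing new turns up. This sketch works on a plain list of triples and is not a Fedora API; the identifiers follow the `info:fedora/uva:pid-N` style used elsewhere in this deck, and the nested part `pid-0` is a hypothetical addition for the example.

```python
def reachable(triples, start, predicates):
    """Return every object reachable from start by following the given predicates."""
    found, stack = set(), [start]
    while stack:
        node = stack.pop()
        for subject, predicate, obj in triples:
            if subject == node and predicate in predicates and obj not in found:
                found.add(obj)
                stack.append(obj)
    return found


triples = [
    ("info:fedora/uva:pid-3", "uva:hasPartLetter", "info:fedora/uva:pid-2"),
    ("info:fedora/uva:pid-3", "uva:hasPartDiagram", "info:fedora/uva:pid-1"),
    # Hypothetical: the diagram itself has a nested part.
    ("info:fedora/uva:pid-1", "uva:hasPartDiagram", "info:fedora/uva:pid-0"),
    ("info:fedora/uva:pid-11", "ais:annotationOf", "info:fedora/uva:pid-3"),
]
parts = reachable(
    triples, "info:fedora/uva:pid-3", {"uva:hasPartLetter", "uva:hasPartDiagram"}
)
print(sorted(parts))
```

The same traversal with other predicate sets would support the dependency analysis and sub-graph extraction the slide mentions.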
  • 94. RDF in the Fedora Digital Object Model
  • 95. Digital Objects contain their RDF assertions Assert relationships from Fedora base ontology Collection – member Whole – part Equivalence Description Of More… Assert relationships/properties from community ontologies isAnnotationOf isRecommendedBy isCertifiedBy More ….
  • 96. Example: Digital Objects with “compositional semantics”
  • 97. Use Case: scholarly objects and annotation in the humanities museum and library objects commercial web content scholarly objects URI-100 xx:recommends URI-55 yy:certifies
  • 98. 3 Objects – 3 RDF “Relationships” Datastreams <rdf:Description rdf:about="info:fedora/uva:pid-11"> <ais:annotationOf rdf:resource="info:fedora/uva:pid-3"/> </rdf:Description> </rdf:RDF> <rdf:Description rdf:about="info:fedora/uva:pid-3"> <uva:hasPartLetter rdf:resource="info:fedora/uva:pid-2"/> <uva:hasPartDiagram rdf:resource="info:fedora/uva:pid-1"/> </rdf:Description> </rdf:RDF> <rdf:Description rdf:about="info:fedora/uva:pid-10"> <ais:providesContextFor rdf:resource="info:fedora/uva:pid-3"/> </rdf:Description> </rdf:RDF>
  • 99. NOT the core object store - RI is a graph-based index of the repository Automatic, incremental indexing into triplestore Search/query the repository via Fedora RI Query Interface Fedora RDF-based Resource Index (RI) RDF Index of Repository RDF datastream Fedora object properties DC datastream Digital Object Store
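The idea of an index that sits beside the object store and is updated incrementally can be sketched as follows: each object contributes a set of triples, re-indexing an object replaces only its own contribution, and queries use subject/predicate/object wildcards. This is an in-memory toy under those assumptions, not the Fedora Resource Index or its query interface; the class and method names are invented.

```python
class ResourceIndex:
    def __init__(self):
        self._by_object = {}  # pid -> set of triples contributed by that object

    def reindex(self, pid, triples):
        """Replace whatever the object contributed before with its current triples."""
        self._by_object[pid] = set(triples)

    def remove(self, pid):
        """Drop an object's triples, e.g. when the object is purged."""
        self._by_object.pop(pid, None)

    def query(self, s=None, p=None, o=None):
        """Return matching triples; None acts as a wildcard."""
        return sorted(
            t
            for contributed in self._by_object.values()
            for t in contributed
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


index = ResourceIndex()
index.reindex(
    "uva:pid-11",
    [("info:fedora/uva:pid-11", "ais:annotationOf", "info:fedora/uva:pid-3")],
)
index.reindex(
    "uva:pid-10",
    [("info:fedora/uva:pid-10", "ais:providesContextFor", "info:fedora/uva:pid-3")],
)
# Everything that points at pid-3, whatever the predicate.
print(index.query(o="info:fedora/uva:pid-3"))
```

Keeping triples grouped by contributing object is one simple way to make updates incremental: a changed datastream touches only that object's entry.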
  • 100. RI Graph - view 1 (abbreviated) …
  • 101. RI Graph - view 2 (abbreviated) …
  • 102. RI Implementation: The Triplestore Challenge Scalability Few triplestores perform well for 100M+ triples Kowari – we tested to 180M triples MPTStore – we tested to 250M triples Performance Jena - easy to run out of memory Sesame Native - slow for complex queries Kowari Fast queries and full-featured query language (iTQL) Instability and corruption problems MPTStore Very fast for SPO queries (limited support for complex queries) Add/modify significantly faster than Kowari Mulgara Fork of Kowari; complex queries; models; inference Major bug fixes for stability and corruption problems XA2 transactions Claims support for billions of triples
  • 104. Demo Use Case: Object-Centered Sociality
  • 105. What is NSDL committed to? NSDL 2.0 as a platform for developing digital library tools Support for communities across the full range of science, technology, engineering and mathematics research, learning and education The library as a shared, collaborative, contributory space Supporting the creation of context around library resources to enhance discovery, use, and understanding
  • 106. NSDL Semantic Digital Library repository requirements Supports storing both content and metadata Allows arbitrary relationships among resource and metadata objects: organization, annotation, citation Accessible through web service architecture of remixable data sources and transformations
  • 107. NSDL Data Repository (NDR) Implemented in Fedora 2.2 with MPTStore and journalling Moderately large: 4.7 million digital objects, 250 million RDF triples D.O.s: resources, metadata, agents, metadata providers, aggregators A REST API to allow authenticated access by other applications In production at nsdl.org
  • 108. NSDL as Semantic Digital Library : collaboration, context, and contribution The NDR and services provide the platform, but we still need the applications Solution 1: Leverage the existing successful models: blogs, wikis, bookmarking/tagging Solution 2: Leverage the existing software: WordPress, MediaWiki, Connotea, Sakai Solution 3: Engage with partners and the broader community to build applications to the platform
  • 109. Expert Voices The NSDL Blogosphere, live at http://expertvoices.nsdl.org Topic-based discussions (e.g. forensics) linked to related library resources A way for NSDL community members to become NSDL contributors: of resources, questions, reviews, annotations, metadata WordPress-based multi-user multi-blog application (open source, plug-in architecture) Owner controls publication of entries as NSDL resources and visibility of comments Entries can contain linked references to NSDL resources, references to URLs that should become resources, and new resource metadata
  • 113. OurNSDL: NDR-integrated Wiki Community of approved contributors (e.g. teachers, librarians, scientists) are granted edit access on OurNSDL wiki New resources and metadata are created as wiki pages and reflected into the NDR Non-wiki-based NDR resources and metadata are displayed as read-only wiki pages, subject to comment and linking, with links reflected back into RDF relationships in NDR User and project pages organize NDR resources, again reflected back into repository as RDF Now implementing MediaWiki extensions; beta release expected 2Q07
  • 114. NDR Entry for Soft Matter Wiki Wiki Entry New Metadata New Audience MD Referenced New Resource 1 Referenced Existing Resource 2 Annotates Metadata for Metadata for Member of Metadata Provider Metadata Provider Existing Collection Soft Matter Wiki Member of Inferred relationship between resources
  • 117. NSDL 2.0 Ecosystem … Protocol: OAI-PMH HTTP REST NDR API STEM Collections Search Service Archive Service Fedora-based NDR
  • 118. NSDL 2.0 and the Semantic Web NSDL 2.0 applications situate resources in context, aiding both discovery and use Users become contributors, adding new resources, ratings, annotations, and organizational structure – frequently as a side effect of using the library Fedora-based semantic web technology organizes resources, ties context to content, maintains provenance, enables discovery, empowers the user, and powers the library
  • 119. Tutorial – Semantic Digital Libraries - Comparison and the Future - Sebastian R. Kruk, Stefan Decker Bernhard Haslhofer, Predrag Kneževic, Sandy Payette
  • 120. System Features Comparison: General Properties (values listed as JeromeDL | BRICKS | Fedora). OS Support: Any | Any | Any. Hardware Requirements: 500MB RAM, min 128MB HD | 500MB RAM, min 100MB HD | Depends. Software Requirements: Java 1.5, Tomcat 5.5, Sesame | Java 1.4/1.5 | Java 1.5, Tomcat, Kowari/Mulgara or MPTStore. Current Stage: Research, Stable version 2.0.1 | Second Prototype | Production Version 2.2. No. Installations: 12+ | ~8 | ~50 monitored; large # of downloads unmonitored. Support Model: Open Source | Open Source | Open Source.
  • 121. System Features Comparison: Architectural Aspects (values listed as JeromeDL | BRICKS | Fedora). Distribution: Distributed searching (P2P), aggregated browsing (hierarchical) | Fully decentralized (P2P) | Objects as surrogates for distributed content; federation via search services; Alvis P2P. Architecture Granularity: Low (main building blocks) | High (many components) | Moderate (core repository service with configurable modules; loosely coupled services). DB Support: Any Sesame-compliant backend | H2, HSQL, Postgres, MySQL, Oracle, SQLServer | MySQL, Postgres, Oracle, McKoi.
  • 122. System Features Comparison Content & Metadata Aspects JeromeDL BRICKS Fedora Content Types All All All Content Models Any Any Metadata Schema MarcOnt + extensions Any OWL- Any Query types Full-text, Field-Search, Ontology-based, NL Query Templates Full-text, Field-Search, Ontology-based Field Search, Ontology-based (itql, rdql, spo), Full-Text (Lucene or Zebra backed service)
  • 123. System Features Comparison Security & DRM Aspects JeromeDL BRICKS Fedora Security Model FOAFRealm RBAC XACML Policy Granularity Resource, Degrees of separation Component, Method, Object Object, Datastream, Dissemination method DRM Model Fair use DRM under development MPEG-21 REL DRM Enabling Tool Support Watermarking
  • 124. System Features Comparison Semantic Aspects & Community Features JeromeDL BRICKS Fedora Reasoning Recommendation engine based on Prolog Configurable inference engine Holding pattern; look to Mulgara; Tagging Free tagging, WordNet-based Annotation middleware/apps (e.g., NSDL/NDR; PLoSONE/Topaz) Taxonomies Any (JOnto) Any Knowledge Sharing SSCF component middleware/apps (e.g., NSDL/NDR; PLoSONE/Topaz) Communities SIOC and FOAF compliance
  • 125. The future - Social Semantic Digital Libraries Why are current (semantic) digital libraries not enough? Digital libraries should not be for librarians only but for average people They concentrate on delivering content/information, not on knowledge sharing within a community of users Digital libraries have lost the human part of their predecessors
  • 126. The future - Social Semantic Digital Libraries What could be the solution? make users/readers involved in the content annotation process allow users/readers to share their knowledge within a community provide better communication between users in and across communities
  • 127. The future - Social Semantic Digital Libraries What is Web 2.0? The Web where “ordinary” users can meet, collaborate, and share using whatever is newly popular on the Web (tagged content, social bookmarking, AJAX, etc.) The term Web 2.0 was made popular by Tim O’Reilly: http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html Popular examples include: Bebo, del.icio.us, digg, Flickr, Google Maps, Skype, Technorati, Wikipedia…
  • 128. The future - Social Semantic Digital Libraries (3) Web 2.0 focuses include: The Web as a platform for social and collaborative exchange Reusable community contributions Subscriptions to information, news, data flows, services Mass-publishing using web-based social software Social software for communication and collaboration: IM, IRC, Forums, Blogs, Wikis, Social Network Services, Social Bookmarks, MMOGs…
  • 130. Comparing Web 1.0 / Web 2.0 / Semantic Web 2.0 (values listed as Web 1.0 | Web 2.0 | Semantic Web 2.0). Buddy Lists, Address Books | Online Social Networks | Semantic Social Networks. - | - | Semantic Social Information Spaces. CiteSeer, Project Gutenberg | Google Scholar, Book Search | Social Semantic Digital Libraries. Message Boards | Community Portals | Semantic Forums and Community Portals. Personal Websites | Blogs | Semantic Blogs. Altavista, Google | Google Personalised, DumbFind | Semantic Search. Content Management Systems | Wikis | Semantic Wikis.
  • 131. Geo, Time, and Machine Tagging Geo-tagging for resources with a specific geographical location Time-tagging – community driven process of assigning auxiliary multimedia content Machine-tagging – ability to mix structured annotations into tags ROI-tagging: Regions of interest ERP game Asynchronous version with annealing of annotations for less frequently visited libraries
  • 132. SDL in eLearning One of the potential sources of future e-Learning systems On the verge between formal (libraries) and informal (communities) learning sources Semantic interoperability with Learning Management Systems Improve knowledge creation, delivery and sharing
  • 133. SDL in Future Museums Museums have physical objects Should bind digital annotations with physical objects Real-virtual tours Start with real, guided tour Ubiquitous browse through context information Locate other exhibitions in the vicinity Share your knowledge and experience with others, leave bread-crumbs for others Get the most out of the exhibition during your visit
  • 134. Discussion – Feedback The Librarian from Unseen University in Ankh-Morpork (formerly Dr. Horace Worblehat)