The Mint Mapping tool
The MoRe aggregator
Vassilis Tzouvaras, Dimitris Gavrilis
National Technical University of Athens
Digital Curation Unit - IMIS, Athena Research Center
LoCloud is funded by the
European Commission's ICT Policy Support Programme
Cultural Heritage Content
• Diversity of cultural heritage content
– Numerous metadata schemas to annotate content (LIDO, CIDOC-CRM, EAD, METS)
• Massive digitization and annotation activities are in
progress
• Need for interoperability
MINT Mapping Tool
• Lets users map their own metadata schemas to reference domain models
• Follows a typical web-based architecture
• Developed for ATHENA; currently used for EUScreen, CARARE, Judaica, ECLAP, DCA and Linked Heritage
MINT 2 – What’s new?
• The back end was rebuilt for better performance
– The file size limit for imports was raised
• The front end was updated
– New interface
– Workflow is integrated into the UI
– Easier browsing of input and target schemas
MORe Overall Architecture
Registry
Apache Cassandra cluster
Fedora-commons
Temporary storage
Vocabulary services
Storage
JMS logging
Messaging
Core services
Enrichment service management
Entity matching / NLP
Geocoding / Historic place names
REST
External enrichment services
Publish service management
OAI-PMH
RDF Store
Elasticsearch
Archive
Cloud architecture
• De-centralized
• Scalable
• Four cloud environments
– Storage
– Monitoring & logging
– Core services deployment
– Enrichment services deployment
Distributed
• Enrichment services run in:
– Austria
– Spain
– Greece
– Lithuania
– Slovenia
– Norway
• Scalability can be facilitated through a virtualization
infrastructure
Workflow
(Workflow diagram)
• Sources and tools: OAI-PMH, LoCloud Collections, Wikimedia, MINT, Omeka
• Actions: Harvest, Ingest, Transform, Enrich, Publish, Validate, Index, Delete, Reject
• Publication targets: OAI-PMH, Archive, RDF Store, Solr
Intermediate Schemas
Dublin Core
LIDO
CARARE
EAD
ESE
EDM
Dublin Core
LIDO
CARARE
EAD
ESE
EDM
OMEKA-XML
OGD
Core services
• Harvesting
• Validation
• Ingestion
• Transformation
• Enrichment
• Previewing
• Publishing
Core services: Harvesting
• Harvests content from metadata sources
– OAI-PMH repository
– MINT
– LoCloud Collections
– Wikimedia
• Multiple schemas are supported
– OAI_DC
– CARARE
– CARARE 2.0
– LIDO
– EAD
– EDM
– ESE
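Harvesting from an OAI-PMH repository can be sketched as below. This is a minimal illustration, not MoRe's harvester: the `verb`, `metadataPrefix` and `resumptionToken` arguments come from the OAI-PMH protocol, while the base URL, the function names and the sample response are made up for the example.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it is sent without metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (record identifiers, resumption token or None) for one page."""
    root = ET.fromstring(xml_text)
    ids = [e.text for e in root.findall(".//oai:header/oai:identifier", OAI_NS)]
    token = root.find(".//oai:resumptionToken", OAI_NS)
    return ids, (token.text if token is not None and token.text else None)


# Hypothetical response page, shortened to the parts the parser reads.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""
```

A harvester would request the first page with a metadata prefix such as `oai_dc`, then keep requesting with the returned token until no token comes back.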
Core services: Validation
• Validates incoming information packages
• Executes validation schemes
• Validation micro-services
– Structure
– Schema
– Linking
– Schematron rules
• Flexible
• How it is used in MoRe:
– Pre-validation
– Post-validation
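The idea of a validation scheme that runs several validation micro-services can be sketched as follows. Everything here is hypothetical: the function names, the required element names, and the choice to use a lighter scheme for pre-validation than for post-validation are assumptions for illustration, and the required-element check is only a stand-in for real schema or Schematron validation.

```python
import xml.etree.ElementTree as ET


def check_structure(xml_text):
    """Structure check: the package must be well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return []
    except ET.ParseError as exc:
        return [f"structure: {exc}"]


def check_required(xml_text, required=("title", "identifier")):
    """Stand-in for a schema rule: required element names must be present."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return []  # structure problems are reported by check_structure
    # Strip any namespace prefix of the form {uri}name before comparing.
    present = {el.tag.split("}")[-1] for el in root.iter()}
    return [f"schema: missing <{name}>" for name in required if name not in present]


def validate(xml_text, scheme):
    """Run every micro-service in the scheme and collect the messages."""
    errors = []
    for service in scheme:
        errors.extend(service(xml_text))
    return errors


# Assumed split: a light scheme before processing, a stricter one after.
PRE_VALIDATION = [check_structure]
POST_VALIDATION = [check_structure, check_required]
```

Keeping each check as a small function with the same signature is what makes a scheme flexible: a scheme is just a list, so checks can be added, removed or reordered per schema.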
Core services: Ingestion
• Ingest content into storage
• Uses storage layer API
• Pluggable drivers for attaching different technologies / repositories
– Apache Cassandra
– Filesystem-based
– Fedora-commons
• Versioning support
• Complex digital object support
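A storage layer with pluggable drivers might look roughly like the sketch below. The interface and class names are invented for illustration and are not MoRe's storage layer API; an in-memory driver and a filesystem-based driver stand in for back ends such as Apache Cassandra or Fedora-commons.

```python
from abc import ABC, abstractmethod
from pathlib import Path
import tempfile


class StorageDriver(ABC):
    """Hypothetical driver interface that every back end implements."""

    @abstractmethod
    def put(self, object_id: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, object_id: str) -> bytes: ...


class MemoryDriver(StorageDriver):
    """Keeps objects in a dictionary; useful for tests."""

    def __init__(self):
        self._items = {}

    def put(self, object_id, data):
        self._items[object_id] = data

    def get(self, object_id):
        return self._items[object_id]


class FilesystemDriver(StorageDriver):
    """Stores each object as one file under a root directory."""

    def __init__(self, root):
        self._root = Path(root)
        self._root.mkdir(parents=True, exist_ok=True)

    def put(self, object_id, data):
        (self._root / object_id).write_bytes(data)

    def get(self, object_id):
        return (self._root / object_id).read_bytes()


def ingest(driver: StorageDriver, object_id: str, data: bytes) -> None:
    """Ingest code depends only on the driver interface, not on the back end."""
    driver.put(object_id, data)
```

Because `ingest` only sees the interface, swapping the back end is a configuration choice rather than a code change.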
Content Model
• Digital objects comprise data streams
• Each data stream can hold any kind of information
– XML/RDF, Image, Video, Documents, etc.
• Each different representation of an information object is stored as a different data stream
• Each curation action generates a new version
– Transformation, Enrichment
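The content model can be illustrated with two small data classes. This is a sketch under assumptions: the class and field names are hypothetical, and versions are tracked per data stream here, although the slide does not say whether MoRe versions the stream or the whole object.

```python
from dataclasses import dataclass, field


@dataclass
class DataStream:
    """One representation of an information object (e.g. LIDO or EDM XML)."""
    name: str
    versions: list = field(default_factory=list)  # (action, content) tuples

    def add_version(self, action, content):
        """Record the result of a curation action; return the version number."""
        self.versions.append((action, content))
        return len(self.versions)

    @property
    def latest(self):
        """Content of the most recent version."""
        return self.versions[-1][1]


@dataclass
class DigitalObject:
    """A digital object is a set of named data streams."""
    object_id: str
    streams: dict = field(default_factory=dict)

    def stream(self, name):
        """Return the named stream, creating it on first use."""
        return self.streams.setdefault(name, DataStream(name))


obj = DigitalObject("obj-1")
obj.stream("LIDO").add_version("harvest", "<lido/>")
obj.stream("EDM").add_version("transformation", "<rdf v='1'/>")
obj.stream("EDM").add_version("enrichment", "<rdf v='2'/>")
```

After these calls the object holds two streams, and the EDM stream keeps both the transformed and the enriched version, so an earlier state stays available.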
Core services: Transformation
• Transforms entire information packages into the Europeana Data Model (EDM), or any other schema
• Multiple transformation routines
– Per schema
– Per project
– Per provider
• Users can attach a rights statement
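One way to organise transformation routines per schema, per project and per provider is a lookup that prefers the most specific match. The precedence order (provider, then project, then schema) and all names below are assumptions made for this sketch; the slides do not state how MoRe chooses between routines.

```python
# Hypothetical registry: (scope, name) -> routine identifier.
ROUTINES = {
    ("schema", "LIDO"): "lido-to-edm",
    ("project", "project-a"): "project-a-to-edm",
    ("provider", "museum-x"): "museum-x-to-edm",
}


def pick_routine(schema, project=None, provider=None, routines=ROUTINES):
    """Return the most specific routine registered for this package.

    Assumed precedence: provider, then project, then schema.
    """
    for key in (("provider", provider), ("project", project), ("schema", schema)):
        if key in routines:
            return routines[key]
    raise LookupError(f"no transformation routine for schema {schema!r}")
```

With this registry, a LIDO package from an unknown provider falls back to the schema-level routine, while a package from `museum-x` gets its provider-specific one.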
Core services: Enrichment
• The generic enrichment service facilitates the execution of the enrichment micro-services
– Hides the complexity from the user by using enrichment plans
– Provides seamless integration with the UI of MoRe
• Virtual Enrichment driver
– Allows developers/creative industries to create their own enrichment services and declare/use them within MoRe
Core services: Previewing
• Preview the XML record information for all data streams
• Preview the record in HTML (using the Europeana style sheet)
Core services: Publishing
• Publish transformed / enriched information
– Internal OAI-PMH provider
– XML export
– Publish directly to RDF repositories (Sesame, Virtuoso)
– Solr index server
Enrichment micro-services
• Thematic
– Thesauri collections
– Vocabulary matching
– Background links
• Spatial
– Geo normalization
– Geo coding
– Reverse geo-coding
– Historic place names
• Other
– Language identification
SKOS Thesauri
GeoNames
DBpedia
Wikipedia
Enrichment Plan
• Enrichment micro-services are used within enrichment workflows:
– Enrichment plans
• Each enrichment plan applies to a specific schema
• Each enrichment plan executes enrichment micro-services in a specific order
Enrichment plans (diagram): Language identification, Vocabulary matching, Geo-normalization, Geo-coding
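An enrichment plan as an ordered list of micro-services bound to one schema can be sketched like this. The four functions are toy stand-ins named after the services in the diagram; their logic, the vocabulary URL and the tiny gazetteer are invented for illustration.

```python
def identify_language(record):
    # Toy stand-in: a real service would inspect the text.
    return {**record, "language": record.get("language", "und")}


def match_vocabulary(record):
    # Hypothetical vocabulary with a made-up concept URL.
    vocab = {"amphora": "http://example.org/vocab/amphora"}
    links = [vocab[t] for t in record.get("subjects", []) if t in vocab]
    return {**record, "subject_links": links}


def normalize_geo(record):
    # Trim whitespace and normalise capitalisation of the place name.
    return {**record, "place": record.get("place", "").strip().title()}


def geocode(record):
    # Hypothetical gazetteer; relies on the normalised place name.
    gazetteer = {"Athens": (37.98, 23.73)}
    return {**record, "coordinates": gazetteer.get(record.get("place"))}


# One plan per schema; the list order is the execution order.
PLANS = {"EDM": [identify_language, match_vocabulary, normalize_geo, geocode]}


def run_plan(schema, record, plans=PLANS):
    """Apply the plan's micro-services to the record, in order."""
    for service in plans[schema]:
        record = service(record)
    return record
```

The order matters in this sketch: geocoding only finds "Athens" because normalization has already cleaned up the raw place string.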
Enrichment Plan
• Each enrichment plan defines run-time parameters for specific services
– Content based
Enrichment plans (diagram): Language identification, Vocabulary matching, Geo-normalization, Geo-coding
Add subject collection A only if term X or Y is matched
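The content-based rule on this slide ("add subject collection A only if term X or Y is matched") can be expressed as run-time parameters attached to one step of a plan. The step layout, the parameter names and the record fields are hypothetical.

```python
def add_subject_collection(record, collection, trigger_terms):
    """Add the collection only when one of the trigger terms was matched."""
    matched = set(record.get("matched_terms", []))
    if matched & set(trigger_terms):
        collections = record.get("subject_collections", []) + [collection]
        return {**record, "subject_collections": collections}
    return record


# A plan step pairs a service with its run-time parameters.
PLAN_STEP = {
    "service": add_subject_collection,
    "params": {"collection": "A", "trigger_terms": ["X", "Y"]},
}


def run_step(step, record):
    """Call the step's service with the parameters defined in the plan."""
    return step["service"](record, **step["params"])
```

Keeping the condition in the plan's parameters, rather than in the service code, lets the same service be reused with different collections and trigger terms.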
Dashboard
Packages organization
Package overview
Package lifecycle overview
Preview
Metadata completeness & statistics
Enrichment services overview
Direct access to 27 thesauri
Create & (re)use subject collections
Thank you
tzouvaras@image.ntua.gr
d.gavrilis@dcu.gr






