1 / 20 20 / 03 / 2013
Towards Preservation of semantically
enriched Architectural Knowledge
Stefan Dietze (L3S Research Center, Leibniz University Hanover, DE)
Stefan Dietze, Jakob Beetz, Ujwal Gadiraju, Georgios
Katsimpras, Raoul Wessel, René Berndt
2 / 20
Goals and Challenges (1/2)
Goal
- Methods and tools for sustainable long-term preservation of architectural knowledge
Challenges
- Diversity of data and interoperability: from low-level point clouds and legacy 3D models up to enriched Building Information Models (BIM), higher-level semantics and Web data / knowledge
- Diverse stakeholders: architects, building operators, urban planners, archivists, …
- Building, model and data evolution: document temporal evolution to prevent information loss
3 / 20
Goals and Challenges (2/2)
Challenges
- "Semantic" enrichment of architectural knowledge: exploiting Web data and knowledge to enrich low-level architectural data
- Inconsistent vocabularies: adopting state-of-the-art Linked Data (LD) vocabularies and schemas towards sustainability
- Long-term readability / renderability of architectural models: addressing digital decay (e.g. due to deprecated file formats) and model evolution
[Figure labels: Architectural Archives; Architectural Web Data]
4 / 20
Consortium
UBO: Universität Bonn
- Technical Coordinator
- WP4/WP5: change management, shape recognition
Fraunhofer Austria
- WP2: system specification & integration
TUE, Department of the Built Environment, Eindhoven University of Technology
- WP3: semantics & metadata
CITA, Center for Information Technology and Architecture Copenhagen
- WP7: data, evaluation, test
Luleå University of Technology
- WP8: dissemination/exploitation
Catenda, SME
- User perspective, market requirements, evaluation
LUH: German National Library of Science and Technology (TIB) & L3S Research Center Hannover
- Coordinator
- WP3: semantic enrichment
- WP6 leader: long-term preservation
5 / 20
Why interlinking & semantic enrichment?
A very simplistic view of the urban planning/architectural lifecycle today: 1. research, 2. design, 3. monitoring (over time)
[Figure labels: policies, traffic, history, environment, infrastructure]
DURAARK approach: exploiting Web data to help architects and urban planners answer questions like:
- What's the legal, social and environmental context of a structure (sustainability policies etc.)?
- How did buildings and their contexts (traffic, surroundings, usage and functionality, popularity, etc.) evolve over time?
- How did an architectural change impact surrounding traffic/environment? (examples: bridges, airports)
- How did an architectural change impact the popularity and attractiveness of a building?
- …
6 / 20
Architectural Data Preservation
- 3D Models
- Point Clouds
- Building Information Models (BIM) = structured "Building Model Metadata"
7 / 20
Architectural Data Preservation
SDA Scope
- Semantic enrichment of low-level architectural models (gradual process)
- Interlinking of related models/data (across different abstraction levels, model types, datasets and repositories, …)
- Preservation & temporal analysis: tracking the evolution of models, buildings and related data
8 / 20
Example: GDR’s People’s Palace - static vs evolving data/links
Social & Semantic Web for enrichment
9 / 20
Semantic enrichment – schema/knowledge types
Challenges
- Selection of suitable datasets from a wealth of diverse datasets
- Preservation: dealing with the evolution of distributed datasets (i.e. the semantics & context of the structure/models)
10 / 20
Enrichment & Preservation
Data selection: too little information about too many datasets
- Lack of reliable dataset metadata combined with wide diversity (e.g. DBpedia vs traffic statistics for London vs …):
  - Spatial and temporal coverage?
  - Dynamics? (evolution, frequency of changes, …)
  - Resource types & topics? (policy documents vs traffic statistics)
  - Currentness, availability, provenance, …
- LOD cloud: more than 300 datasets
- DataHub: more than 6000 datasets
[Figure labels: metadata; http://datahub.io/dataset/transport-data-gov-uk; 329,527,661 triples]
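The metadata aspects questioned above can be collected in one record per dataset. The following is a minimal sketch, not the project's actual data model: the class `DatasetProfile`, its field names and the helper `missing_aspects` are all hypothetical and only mirror the aspects listed on the slide.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class DatasetProfile:
    """Metadata aspects named on the slide; all field names are hypothetical."""
    name: str
    spatial_coverage: Optional[str] = None    # e.g. "London"
    temporal_coverage: Optional[str] = None   # e.g. "1986-1989"
    change_frequency: Optional[str] = None    # dynamics, e.g. "static" or "daily"
    resource_types: list = field(default_factory=list)
    topics: list = field(default_factory=list)
    provenance: Optional[str] = None


def missing_aspects(profile: DatasetProfile) -> list:
    """Return the names of profile aspects that are still unknown or empty."""
    return [f.name for f in fields(profile)
            if f.name != "name" and not getattr(profile, f.name)]


# Illustrative record: only two aspects are known for this dataset.
traffic = DatasetProfile(name="traffic-stats", spatial_coverage="London",
                         topics=["traffic statistics"])
print(missing_aspects(traffic))
```

Listing the unknown aspects per dataset makes the "too little information" problem measurable before any dataset is selected for enrichment.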
11 / 20
Enrichment & Preservation
Data preservation: handling evolution of distributed data
- Preservation needs to address the evolution of distributed datasets and the semantics of links
- In RDF graphs (such as the LOD Cloud), "all" nodes are connected:
  - Which datasets to preserve (only direct links or also more distant neighbours)? (semantic relatedness, see [ESWC2013])
  - Propagation of changes in the LOD graph => measuring the relevance of changes for specific entities
[Figure: example graph with nodes <dura:GDR Peoples Palace>, <dbp:Berlin(east)>, <dbp:Berlin> and <geoLatLong:52/13>]
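The question of preserving only direct links or also more distant neighbours can be made concrete with a hop-limited breadth-first search. This is a sketch under assumptions: the function `neighbours_within` is hypothetical, and the edges between the entities from the figure are invented for illustration, since the slide does not state them. Plain hop distance is used here as a simple stand-in, not the semantic relatedness measure cited on the slide.

```python
from collections import deque


def neighbours_within(graph, start, max_hops):
    """Breadth-first search: map each node reachable within max_hops to its hop distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in graph.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    del distance[start]  # the start entity is not its own neighbour
    return distance


# Hypothetical links between the entities shown in the figure.
links = {
    "dura:GDR_Peoples_Palace": ["dbp:Berlin(east)", "geoLatLong:52/13"],
    "dbp:Berlin(east)": ["dbp:Berlin"],
    "dbp:Berlin": [],
}
print(neighbours_within(links, "dura:GDR_Peoples_Palace", 1))
print(neighbours_within(links, "dura:GDR_Peoples_Palace", 2))
```

With a limit of one hop only the directly linked resources are candidates for preservation; raising the limit pulls in more of the graph, which is exactly the trade-off the slide raises.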
12 / 20
Enrichment & Preservation
Data preservation: handling evolution of distributed data (continued)
- Preservation strategies depend on dataset dynamics:
  - Simple linking (archiving) for static datasets (e.g. statistics over past periods in data.gov.uk)
  - Recurring link computation and graph archival for dynamic datasets (frequency?)
[Figure: the example graph extended with Traffic statistics (1986-1989), Traffic statistics (2013-…) and Energy efficiency policies]
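The dependence of the preservation strategy on dataset dynamics could be sketched as a simple decision function. Everything here is an assumption made for illustration: the names `PreservationPlan` and `plan_for`, the use of an observed change rate as input, and the rule of re-computing links about once per expected change but at most daily. The slide itself leaves the frequency open.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass
class PreservationPlan:
    strategy: str                  # "archive-once" or "recurring"
    interval: Optional[timedelta]  # None when no re-computation is needed


def plan_for(changes_per_year: float) -> PreservationPlan:
    """Pick a strategy from an (assumed) observed change rate of a dataset."""
    if changes_per_year <= 0:
        # Static dataset: link and archive once.
        return PreservationPlan("archive-once", None)
    # Assumption: re-compute links roughly once per expected change,
    # but never more often than daily.
    days = max(1.0, 365.0 / changes_per_year)
    return PreservationPlan("recurring", timedelta(days=days))


print(plan_for(0))     # e.g. traffic statistics for a closed past period
print(plan_for(365))   # e.g. a dataset that changes about once a day
```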
13 / 20
Web Data Curation for Building-related Data
Approach: dataset profiling
- Enrichment & preservation = intertwined process!
- Dataset selection & cataloging: via DataHub.io (similar to LOD cloud)
- Dataset profiling: metadata about dataset dynamics, size, types, topics, evolution, temporal/spatial coverage etc. => Data observatory (see also [ESWC2013], [ISWC2013])
- Vocabulary curation (expert-based)
DURAARK Data Observatory
Automated processing to generate:
- Descriptive Dataset Profiles
- Data Interlinking & Correlation
[Figure labels: describes; http://datahub.io/group/linked-building-data; Endpoint Retrieval & Graph Extraction; Schema Extraction and Mapping; Sample Graph Extraction (per dataset); NER & NED (per resource); Interlinking & Co-Resolution (cross-dataset); Profiling (topics, coverage, dynamics, …)]
14 / 20
Towards a Web Data "Observatory"
Dataset profiling: processing workflow
- Endpoint Retrieval & Graph Extraction
- Schema Extraction and Mapping
- Sample Graph Extraction (per dataset)
- NER & NED (per resource)
- Interlinking & Co-Resolution (cross-dataset)
- Profiling (topics, coverage, dynamics, …)
Dataset Catalog/Index
Links/Cross-references
Dataset metadata (RDF/VoID):
- Schema mappings (types, properties)
- Entities & categories
- Topic relevance scores
- Availability, currentness data (tbc)
Goals:
- RDF catalog of datasets
- Tracking the evolution of datasets according to, e.g., topics, dynamics, spatial coverage, accessibility
- Links and coreferences => unified view on data => Linked Building Data Graph
- Infrastructure & APIs for federated queries
[Figure labels: rdfs:label "…ECB…" ?; dbpedia:European_Central_Bank; dbpedia:England-Wales-Cricket-Board; dbpedia:Finance; dbpedia:Sports; dbpedia:Frankfurt]
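A dataset catalog entry expressed in RDF/VoID might look like the output of the sketch below. The function `void_description` and the example URIs are hypothetical; the VoID and Dublin Core terms used (`void:Dataset`, `void:triples`, `void:sparqlEndpoint`, `dcterms:title`, `dcterms:subject`) are standard vocabulary terms. Topic relevance scores are left out because the slide does not say how they are represented.

```python
def void_description(uri, title, triples, endpoint, topics=()):
    """Return a minimal VoID description of one dataset as Turtle text.

    Assumes `title` contains no double quotes or backslashes (no escaping is done).
    """
    props = [
        "a void:Dataset",
        f'dcterms:title "{title}"',
        f"void:triples {int(triples)}",
        f"void:sparqlEndpoint <{endpoint}>",
    ]
    if topics:
        props.append("dcterms:subject " + " , ".join(f"<{t}>" for t in topics))
    prefixes = (
        "@prefix void: <http://rdfs.org/ns/void#> .\n"
        "@prefix dcterms: <http://purl.org/dc/terms/> .\n\n"
    )
    return prefixes + f"<{uri}> " + " ;\n    ".join(props) + " .\n"


# Hypothetical dataset; the example.org URIs are placeholders, not real endpoints.
turtle = void_description(
    uri="http://example.org/dataset/traffic",
    title="Traffic statistics",
    triples=1000,
    endpoint="http://example.org/sparql",
    topics=["http://dbpedia.org/resource/Traffic"],
)
print(turtle)
```

Building the text by hand keeps the sketch dependency-free; a real catalog would more likely use an RDF library so that literals are escaped correctly.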
15 / 20
Web Data Observatory – ongoing work
Pipeline
- Demo categories!
http://data.linkededucation.org/linkedup/categories-explorer
16 / 20
Vocabulary Curation & Data Interlinking
- Using dataset profiles for semi-automated data interlinking:
  - Manual alignment of schemas & vocabularies into a unified RDF graph
  - Automated interlinking (and preservation) techniques
  - Preservation metadata (PREMIS RDF?)
- Expert-based curation of building-related vocabularies:
  - BuildingSmartDD (http://www.buildingsmart.org/standards/ifd)
  - OMNIClass, UNIClass
  - SFB-NL (http://nl-sfb.bk.tudelft.nl)
  - CROW Library for infrastructural objects (http://www.gww-ob.nl/)
  - …
17 / 20
Conclusions
Summary
- "Data Observatory" as generic platform and domain-specific instantiation (profiling building-related dataset aspects in DURAARK)
- Preservation/linking strategies for SDA based on dataset profiles (e.g. dynamics, relevance)
Outlook
- Dataset selection: populating DataHub group
- Schema and vocabulary curation and alignment
- Dataset profiling: establishing LDO, considering range of metadata aspects
- Building SDA: data interlinking & dataset preservation
[Figure labels: SDA Scope; DURAARK Data Observatory; ongoing work; future work]
18 / 20
Thank you!
http://purl.org/dietze | @stefandietze
http://www.duraark.eu

