Dealing with Data Diversity in a 
Smart City Data Hub 
Mathieu d'Aquin - @mdaquin 
slideshare.net/mdaquin 
Knowledge Media Institute, The Open University
Diversity 
where a penguin is a dataset
Why should we care about 
diversity? 
Because diversity is good, and what 
makes data diverse is not the same as 
what makes it more or less relevant
Why should we care about 
diversity? 
Because it is hard to manage 
How many species of 
penguins/animals/things? 
How many biologists to classify them? 
and that's purely static... unlike species, new data 
appear all the time...
Why should we care about 
diversity? 
The 
Eskimo language 
has 255 different 
words for 
"visiting linguist" 
Because we might have a lot of it, or 
what we need to manage is very 
granular
Data diversity in a Smart City 
Example of the MK:Smart project in 
Milton Keynes, UK (mksmart.org)
Data diversity in a Smart City 
Partners in the MK:Smart project
Data diversity in a Smart City 
Areas of the MK:Smart project
Data diversity in a Smart City 
MK Data Hub - Where diversity is handled
A concrete example 
Wifi-based presence sensors
A concrete example 
Wifi-based presence sensors 
10-12 sensors can cover a reasonably large enclosed area (here, the refectory 
of the Open University).
A concrete example 
Wifi-based presence sensors 
Use triangulation to find the location of wifi-enabled devices.
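The slides do not say how the location is computed. As a minimal sketch, assuming each of three sensors can estimate its distance to a device (for instance from signal strength), the position can be recovered by intersecting three circles. The function name `trilaterate`, the 2D coordinates and the sample values are illustrative assumptions, not part of the MK:Smart system.

```python
import math


def trilaterate(sensors, distances):
    """Estimate a device's (x, y) from three sensor positions and distances.

    Illustrative only: assumes exact distances and a flat 2D floor plan.
    """
    (x1, y1), (x2, y2), (x3, y3) = sensors
    d1, d2, d3 = distances
    # Subtracting the circle equation of sensor 1 from those of sensors 2
    # and 3 removes the quadratic terms and leaves two linear equations.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("sensors are collinear; position is ambiguous")
    # Solve the 2x2 system with Cramer's rule.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


# Made-up layout: three sensors, with distances computed for a device
# placed at (3, 4).
sensors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
distances = [5.0, math.sqrt(65.0), math.sqrt(45.0)]
print(trilaterate(sensors, distances))
```

With noisy distance estimates and 10-12 sensors, a least-squares fit over all sensors would be the more natural choice; the three-sensor case only shows the geometry.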
A concrete example 
Wifi-based presence sensors 
Basic statistical analysis to extract patterns of usage of the facility
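One simple form such an analysis could take is counting how many distinct devices are seen in each hour of the day. The sketch below assumes observations arrive as (device identifier, timestamp) pairs; that record shape, the function name `devices_per_hour` and the sample values are hypothetical and are not taken from the project.

```python
from collections import defaultdict
from datetime import datetime


def devices_per_hour(observations):
    """Count distinct devices observed in each hour of the day."""
    seen = defaultdict(set)
    for device_id, timestamp in observations:
        seen[timestamp.hour].add(device_id)
    # Return hours in ascending order with the number of distinct devices.
    return {hour: len(devices) for hour, devices in sorted(seen.items())}


# Made-up observations for illustration only.
observations = [
    ("device-a", datetime(2015, 1, 1, 12, 5)),
    ("device-b", datetime(2015, 1, 1, 12, 40)),
    ("device-a", datetime(2015, 1, 1, 12, 50)),  # same device, same hour
    ("device-c", datetime(2015, 1, 1, 13, 10)),
]
print(devices_per_hour(observations))
```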
A concrete example: Diversity
How do we usually deal with 
this? 
For data heterogeneity, 
we use alignments, mappings, links, etc. 
Example: The LinkedUp Catalogue of datasets 
for education includes mappings between 
the vocabularies of different datasets 
data.linkededucation.org/linkedup/catalogue/
What about diversity at the 
policy level?
More structured representation 
VoID and DC to represent datasets, PROV-O for basic provenance.
More structured representation 
ODRL for the structured representation of policies and rights
More structured representation 
With the tools to deal with it
More structured representation 
And the processes
Reasoning on the way policy-information 
propagates 
Requires an appropriate representation of dataflows
DataNode 
http://purl.org/datanode/ns/ 
An ontology of relationships between data artifacts (DataNodes).
DataNode 
Captures the essence of dataflows rather than the process, as a basis for 
meta-information propagation.
Propagating meta-information 
across dataflows 
Examples of rules: 
Duties such as attribution propagate over relations of derivation, but 
not necessarily over others. 
Permissions such as the right to redistribute, however, do not 
propagate over relations of derivation, except in specific cases (e.g. 
copies). 
Prohibitions such as preventing commercial exploitation propagate over 
derivations.
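The three example rules can be sketched as a small rule table applied along the edges of a dataflow graph. This is an illustration under stated assumptions, not the project's reasoner: the relation names `derivedFrom` and `copyOf` are placeholders rather than actual DataNode terms, the action names are not ODRL identifiers, and treating a copy as a derivation over which duties and prohibitions also propagate is an assumption.

```python
from collections import deque

# Illustrative rule table: does a statement of this kind follow this relation?
# Any (kind, relation) pair not listed is treated as "does not propagate".
PROPAGATES = {
    ("duty", "derivedFrom"): True,
    ("duty", "copyOf"): True,          # assumption: a copy is a derivation
    ("permission", "derivedFrom"): False,
    ("permission", "copyOf"): True,    # the "specific case" from the rules
    ("prohibition", "derivedFrom"): True,
    ("prohibition", "copyOf"): True,   # assumption: a copy is a derivation
}


def propagate(policies, edges):
    """Push policy statements along dataflow edges until nothing changes.

    policies: {node: set of (kind, action)}
    edges: list of (target, relation, source), read "target <relation> source"
    """
    result = {node: set(stmts) for node, stmts in policies.items()}
    queue = deque(result)
    while queue:
        source = queue.popleft()
        for target, relation, origin in edges:
            if origin != source:
                continue
            inherited = {
                stmt for stmt in result[source]
                if PROPAGATES.get((stmt[0], relation), False)
            }
            current = result.setdefault(target, set())
            if not inherited <= current:
                current |= inherited
                queue.append(target)  # revisit nodes downstream of target
    return result


# Made-up dataflow: raw sensor data, a derived statistic, and a mirror copy.
policies = {
    "sensor_log": {
        ("duty", "attribute"),
        ("permission", "redistribute"),
        ("prohibition", "commercial_use"),
    }
}
edges = [
    ("hourly_stats", "derivedFrom", "sensor_log"),
    ("mirror", "copyOf", "sensor_log"),
]
for node, stmts in propagate(policies, edges).items():
    print(node, sorted(stmts))
```

In this sketch the derived statistic inherits the attribution duty and the commercial-use prohibition but not the right to redistribute, while the copy inherits all three.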
Discussion/future 
A lot of the semantics for Smart Cities work focuses on data heterogeneity. 
There is a need to look at data diversity at the meta-information level 
(here we focus on policy related information). 
How to manage, catalogue, keep track of and manipulate a large 
number of datasets with diverse rights, access, validity, scope? 
How do we help users/developers in exploring and exploiting this 
diversity...
Discussion/future 
Master of Datasets
Discussion/future 
Need for a clear, semantic (i.e. ontological) foundation for describing 
and defining data artefacts. 
DataNode is a step towards defining their relationships. Vocabularies 
such as ODRL and VoID focus on specific aspects. 
More is needed to formally represent the fundamental descriptors of 
data (scope, validity, policy, ...)
Thanks! 
Mathieu d'Aquin Alessandro Adamou Enrico Daga 
Shuangyan Liu Keerthi Thomas Enrico Motta

