BIG DATA EUROPE
H2020 CSA (2015-17)
22.9.2016
Integrating Big Data, Software & Communities for Addressing
Europe’s Societal Challenges
SC4 Workshop
Big Data in Marketing
10-oct.-16 www.big-data-europe.eu
Big Data in Intelligence
Big Data Europe (CSA: 2015-17)
• Show societal value of Big Data: 7 Domains
• Lower barrier for using big data technologies
  o Required effort and resources
  o Limited data science skills
• Help establish cross-lingual/organizational/domain Data Value Chains
Consortium
Stakeholder Engagement & Activities
[Timeline figure: M6, M12, M18, M24, M30]
ARCHITECTURE & COMPONENTS
The three Big Data "V"s
Variety is often neglected
Source: Gesellschaft für Informatik
Adding a Semantic Layer to Data Lakes
Manufacturing, Marketing, Sales, Support, Accounting
Semantic Data Lake
• central place for model, schema and data historization
• combination of scale-out (cost reduction) and semantics (increased control & flexibility)
• grows incrementally (pay-as-you-go)
Inbound Data Sources
Outbound and Consumption
Inbound Raw Data Store
Data Lake (order of magnitude cheaper scalable data store)
Knowledge Graph for Relationship Definition and Meta Data
Frontend to Access Relationship and KPI Definition / Documentation
Frontend to Access (ad hoc) Reports
Outbound Data Delivery to Target Systems
JSON-LD, CSVW, R2RML, XML2RDF
© eccenca.com https://www.eccenca.com/en/products-corporate-memory.html
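The slide names CSVW and R2RML as ways to lift tabular data into the knowledge graph. The following is only a minimal sketch of that idea, not the eccenca product or either standard's actual mapping language: it turns CSV rows into N-Triples lines with plain Python. The base IRI `http://example.org/`, the `prop/` naming scheme and the sample data are assumptions made up for illustration, and every value is emitted as a plain string literal (datatypes are out of scope here).

```python
import csv
import io

# Hypothetical base IRI, chosen only for this illustration.
BASE = "http://example.org/"
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


def row_to_triples(row, key_column, entity_type):
    """Turn one CSV row (a dict) into (subject, predicate, object) triples."""
    subject = f"<{BASE}{entity_type}/{row[key_column]}>"
    triples = [(subject, RDF_TYPE, f"<{BASE}{entity_type}>")]
    for column, value in row.items():
        if column == key_column or value == "":
            continue  # key is already in the subject IRI; empty cells yield no triple
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        triples.append((subject, f"<{BASE}prop/{column}>", f'"{escaped}"'))
    return triples


def csv_to_ntriples(text, key_column, entity_type):
    """Serialize every row of a CSV document as N-Triples lines."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        for s, p, o in row_to_triples(row, key_column, entity_type):
            lines.append(f"{s} {p} {o} .")
    return "\n".join(lines)


# Made-up sample data standing in for a "Sales" source system.
SALES_CSV = "order_id,customer,amount\n1001,ACME,250.00\n1002,Globex,\n"
ntriples = csv_to_ntriples(SALES_CSV, "order_id", "Order")
print(ntriples)
```

In a real deployment the column-to-property mapping would live in a declarative mapping document rather than in code, so that it can be versioned alongside the schema.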
Current State of Platform Architecture
BDE SANSA Stack
• Distributed Machine Learning (ML) algorithms that work out of the box on RDF data and make use of its structure / semantics
• Examples:
  o Tensor Factorization for e.g. KB completion
  o Spatiotemporal analytics
  o Anomaly prediction
  o Clustering
  o Association rules
  o Decision trees
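To make "association rules on RDF data" concrete, here is a small single-machine sketch. It does not use the SANSA API (which is not shown in these slides) and is not distributed; it only illustrates how the graph structure can feed a rule miner by treating the set of predicates attached to each subject as a transaction. The triples, predicate names and thresholds are invented for illustration.

```python
from collections import defaultdict
from itertools import permutations

# Toy triples (subject, predicate, object); all names are made up.
TRIPLES = [
    ("drug:A", "hasTarget", "protein:P1"),
    ("drug:A", "hasPathway", "pathway:W1"),
    ("drug:B", "hasTarget", "protein:P2"),
    ("drug:B", "hasPathway", "pathway:W2"),
    ("drug:C", "hasTarget", "protein:P1"),
    ("drug:D", "hasIndication", "disease:D1"),
]


def predicate_rules(triples, min_support=0.25, min_confidence=0.6):
    """Mine rules 'subject has predicate p => subject has predicate q'."""
    preds_by_subject = defaultdict(set)
    for s, p, _ in triples:
        preds_by_subject[s].add(p)
    n = len(preds_by_subject)

    count = defaultdict(int)       # subjects carrying predicate p
    pair_count = defaultdict(int)  # subjects carrying both p and q
    for preds in preds_by_subject.values():
        for p in preds:
            count[p] += 1
        for p, q in permutations(sorted(preds), 2):
            pair_count[(p, q)] += 1

    rules = []
    for (p, q), both in pair_count.items():
        support = both / n             # share of all subjects with p and q
        confidence = both / count[p]   # share of p-subjects that also have q
        if support >= min_support and confidence >= min_confidence:
            rules.append((p, q, support, confidence))
    return sorted(rules)


rules = predicate_rules(TRIPLES)
for p, q, support, confidence in rules:
    print(f"{p} => {q}  support={support:.2f} confidence={confidence:.2f}")
```

On this toy graph every subject with `hasPathway` also has `hasTarget`, so that rule has confidence 1.0, while the reverse rule is weaker because one drug has a target but no pathway.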
Platform components
• Search/indexing: Apache Solr
• Data acquisition: Apache Flume
• Message passing: Apache Kafka
• Data storage: Hue, Apache Cassandra, ScyllaDB, Apache Hive, PostGIS
• Data processing: Apache Spark, Apache Flink
• Semantic components: Strabon, Sextant, GeoTriples, Silk, SEMAGROW, LIMES, 4Store, OpenLink Virtuoso
BDE vs Hadoop distributions
Feature | Hortonworks | Cloudera | MapR | Bigtop | BDE
File system | HDFS | HDFS | NFS | HDFS | HDFS
Installation | Native | Native | Native | Native | Lightweight virtualization
Plug & play components (no rigid schema) | no | no | no | no | yes
High availability | Single failure recovery (YARN) | Single failure recovery (YARN) | Self-healing, multiple failure recovery | Single failure recovery (YARN) | Multiple failure recovery
Cost | Commercial | Commercial | Commercial | Free | Free
Scaling | Freemium | Freemium | Freemium | Free | Free
Addition of custom components | Not easy | No | No | No | Yes
Integration testing | yes | yes | yes | yes | --
Operating systems | Linux | Linux | Linux | Linux | All
Management tool | Ambari | Cloudera Manager | MapR Control System | - | Docker Swarm UI + custom
7 SC-PILOT INSTANTIATIONS
Pilots: Overview
• SC1: Health & Pharm.
• SC2: Food & Agr.
• SC3: Energy
• SC4: Transport
• SC5: Climate
• SC6: Social Sciences
• SC7: Security
SC1: Life Sciences & Health
Big Data Focus area: Large-scale heterogeneous pharma-research data linking & integration
Selected Key Data assets: ACD Labs / ChemSpider, ChEBI, ChEMBL, ConceptWiki, DrugBank, ENZYME, Gene Ontology, GO Annotation, SwissProt, WikiPathways
Pilot 1: Replicate Open PHACTS functionality on the BDE infrastructure using Open Source solutions
Reasons:
• Deployment possible in-house
• Apply to other domains (e.g. Agriculture)
• Using extra BDE functionalities (e.g. logging, analysis)
SC2: Food & Agriculture
AGINFRA
Big Data Focus area: Large-scale distributed agricultural data integration
Selected Key Data assets: INFOODS, AQUASTAT, Green Learning Network (GLN), Agricultural Bibliography Network (ABN), AgroVoc, AquaMaps, Fishbase
Pilot focus area: Viticulture
Viticulture (from the Latin word for vine) is the science, production, and study of grapes. It deals with the series of events that occur in the vineyard.
SC3: Energy
Pilot focus area: System monitoring in energy production units.
Big Data Focus area: Real-time turbine monitoring stream processing and analytics
Selected Key Data assets: European Energy Exchange Data, smart meter sensor data, gas/fuels market/price data, consumption statistics, stratigraphic model data (geology, geophysics)
SC4: Transport
The Fraunhofer Society is a German research organization with 67 institutes spread throughout Germany, each focusing on different fields of applied science.
The Centre for Research and Technology-Hellas (CERTH), founded in 2000, is one of the leading research centres in Greece. CERTH includes the Hellenic Institute of Transport (HIT): Land, Sea and Air Transportation as well as Sustainable Mobility services.
ERTICO - ITS Europe is a partnership of around 100 companies and institutions involved in the production of Intelligent Transport Systems (ITS).
IAIS
SC4 Pilot Focus Area
Info-mobility based on Mobility Pattern Identification
Pilot 4: Multisource data collection for the provision of accurate info-mobility and advanced transport planning service in Thessaloniki, Greece
SC4: Twitter data in Thessaloniki
SC4: Floating Car Data
Real-time traffic conditions information based on a combination of traffic modeling and real-time measurements (traffic flow and speed)
>1,200 vehicles (one taxi fleet)
• Circulating 16-24 hours/day
• Pulse every 100 m or 10 s
• 500-2,500 pulses/minute
[Figure: Speeds along a 2 km stretch]
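A minimal sketch of how floating car pulses could be turned into per-segment speed estimates, under stated assumptions: the record layout `(vehicle_id, timestamp_s, segment_id, speed_kmh)`, the segment identifiers and the 60-second window are all hypothetical and not taken from the pilot. The last line is plain arithmetic on the slide's figures: if all 1,200 vehicles reported only on the 10-second timer, that would give 7,200 pulses per minute, so the quoted 500-2,500 pulses/minute sits well below that rate.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical pulse records: (vehicle_id, timestamp_s, segment_id, speed_kmh).
PULSES = [
    ("taxi-1", 0, "seg-A", 32.0),
    ("taxi-2", 4, "seg-A", 28.0),
    ("taxi-1", 10, "seg-B", 45.0),
    ("taxi-3", 12, "seg-A", 30.0),
    ("taxi-2", 65, "seg-B", 15.0),
]


def mean_speed_per_segment(pulses, window_s=60):
    """Group pulses into fixed time windows and average the speed per segment."""
    buckets = defaultdict(list)
    for _vehicle, ts, segment, speed in pulses:
        buckets[(ts // window_s, segment)].append(speed)
    return {key: mean(speeds) for key, speeds in buckets.items()}


speeds = mean_speed_per_segment(PULSES)
for (window, segment), speed in sorted(speeds.items()):
    print(f"window {window}, {segment}: {speed:.1f} km/h")

# Pulse rate if every one of 1,200 vehicles reported once every 10 seconds.
timer_only_pulses_per_minute = 1200 * (60 // 10)
print(timer_only_pulses_per_minute)
```

Averaging over fixed windows is the simplest possible estimator; a production system would more likely combine these measurements with the traffic model mentioned on the slide.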
SC5: Climate
Pilot focus area: Supporting data-intensive climate research
Big Data Focus area: Enormous simulation time. Extremely complicated computing model.
Selected Key Data assets: European Grid Infrastructure (EGI). Access to several data centres hosted at CNRS-Lyon, NCSR-D Athens, INFN-Milan, NIKhEF-Amsterdam.
SC6: Social Sciences
Pilot focus area: Citizens' budget spending at municipal level
Big Data Focus area: Statistical and research data linking & integration
Selected Key Data assets: Federated social sciences data catalogs, statistical data from public data portals and statistical offices (e.g. EuroStats, UNESCO,
SC7: Security
Pilot focus area: Gaining insight into man-made surface changes triggered by automatic detection, news, or social media information
Big Data Focus area: Image data analysis
Selected Key Data assets: Earth Observation data (e.g. Very High Resolution Satellite Imagery acquired from commercial providers and governmental systems) and collateral data for supporting CFSP/CSDP missions and operations
Pilot 7: Ingestion of remote sensing images and social sensing data to detect and verify man-made changes on the Earth's surface for security applications
• Evacuation route planning
• Monitoring of critical infrastructures
• Border security
Reasons:
• Satellite image data is HUGE and computationally intensive to compare
• Smart ‘focus’ algorithms are needed to prioritize the analysis jobs
Questions & Contacts
WEB: www.big-data-europe.eu
EMAIL: info@big-data-europe.eu
#BigDataEurope
PROJECT COORDINATION
Prof. Sören Auer, auer © cs.uni-bonn · de (Fraunhofer IAIS)
Dr. Simon Scerri (Deputy), scerri © cs.uni-bonn · de (Fraunhofer IAIS)
EIS Department/Group, Fraunhofer IAIS & CS Department Uni-Bonn, Bonn, Germany
Fraunhofer IAIS: Leads Fraunhofer Big Data Alliance


SC4 Workshop 2: Soren Auer BDE project Overview


Editor's Notes

  • #5: Project objectives: We address each of the 7 Societal Challenge domains, with a domain representative for each and a pilot instantiation of the BDE platform in progress for each. One of the obstacles to Big Data opportunities is the lack of data science skills; our aim is to provide out-of-the-box technology that requires little training to use and apply. BDE technology can be applied in multiple domains and in different phases within Data Value Chains, working with different data providers and addressing multiple objectives (as opposed to current solutions, which tend to be very specific to one data source or domain and address one objective).
  • #6: 9 of 16 partners: sole or joint domain representatives of the 7 SC domains (COORDINATION ROLE). The other 7 of 16 partners: technical support (SUPPORT ROLE). Fraunhofer coordinates the project.
  • #7: Centered around the 7 SC communities, we follow a number of iterations, each having five steps: (1) engaging with stakeholders, (2) collecting feedback and translating it into requirements, (3) identifying domain-specific data assets, (4) deploying prototypes as 7 pilot use cases, and (5) evaluating them with the community. Currently we are in the first iteration at step 4 (pilot conceptualisation and deployment).
  • #9: http://www.gi.de/nc/service/informatiklexikon/detailansicht/article/big-data.html
  • #10: A Data Lake is a storage repository for big-data-scale raw data in its original formats. It takes a late-binding approach to schema (“Let us decide when we need it”) and uses a scale-out architecture on commodity infrastructure, mostly with HDFS/Hadoop/Spark, which gives a huge cost advantage: about a factor of 10 compared to data warehouses. Semantic Data Lake = Data Lake + Knowledge Graph. Management of structure (vocabularies/schemas, KPI trees, metadata, …) on top of the Data Lake is performed in a knowledge graph, a complex data fabric representing all kinds of things and how they relate to each other. A knowledge graph is unique regarding flexibility, multiple views and metadata capabilities. It is based on the Resource Description Framework (RDF) standard and Linked Data principles.
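The late-binding idea in this note can be sketched in a few lines, assuming a toy setup: raw records stay in the lake exactly as delivered, and a small set of triples (standing in for the knowledge graph) says which source field maps to which shared concept. The mapping is applied only when data is read. The source names, field names, the `mapsTo` predicate and the concepts are all hypothetical, and a real knowledge graph would be stored as RDF rather than as Python tuples.

```python
import json

# Raw records are kept in their original form (here: JSON strings per source).
RAW_LAKE = {
    "sales": ['{"cust": "ACME", "amt": 250.0}', '{"cust": "Globex", "amt": 90.5}'],
    "support": ['{"customer_name": "ACME", "tickets": 3}'],
}

# Toy knowledge graph as (subject, predicate, object) triples; names are made up.
GRAPH = {
    ("sales.cust", "mapsTo", "Customer"),
    ("sales.amt", "mapsTo", "Revenue"),
    ("support.customer_name", "mapsTo", "Customer"),
    ("support.tickets", "mapsTo", "TicketCount"),
}


def fields_for(concept):
    """Ask the graph which source fields carry a given concept."""
    return {s for s, p, o in GRAPH if p == "mapsTo" and o == concept}


def read_with_schema(source):
    """Apply the field-to-concept mapping only at read time (late binding)."""
    mapping = {
        s.split(".", 1)[1]: o
        for s, p, o in GRAPH
        if p == "mapsTo" and s.startswith(source + ".")
    }
    for line in RAW_LAKE[source]:
        record = json.loads(line)
        yield {mapping.get(key, key): value for key, value in record.items()}


unified = [row for src in sorted(RAW_LAKE) for row in read_with_schema(src)]
print(fields_for("Customer"))
print(unified)
```

Because the mapping lives in the graph and not in the stored data, adding a new source or renaming a concept changes only the triples, which is one way to read the "grows incrementally (pay-as-you-go)" claim.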