Benchmarking of distributed linked data streaming systems
This project has received funding from the European Union's H2020 research and innovation action program under grant agreement number 688227.
The project runtime is December 2015 until November 2018.
The HOBBIT project
Pavel Smirnov
AGT International
Stream Reasoning Workshop
January 17, 2018
Overview
• The HOBBIT project
• DEBS challenges
• Available benchmarks overview
• Summary
Goal
To abolish the barriers in the adoption and deployment of Big Linked Data by European companies by:
• The deployment of benchmarks on data that reflects reality within realistic settings.
• The provision of corresponding industry-relevant key performance indicators (KPIs).
• The computation of comparable results on standardized hardware.
• The institution of an independent and thus bias-free organization to conduct regular benchmarks and provide the European industry with up-to-date performance results.
Deliverables:
• The benchmarking platform (the HOBBIT platform)
• The set of benchmarks with KPIs
• Benchmarking association
The HOBBIT project. Overview
http://project-hobbit.eu
The HOBBIT platform. Business logic
Customer
Requires ranking of alternative
solutions by some KPI
Solution provider (vendor)
(e.g. databases, streaming platforms, ML frameworks)
The HOBBIT platform
(online or local instance)
Provides:
1. Automatic benchmark executions
2. Leaderboards (online or private)
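As a rough illustration of what a leaderboard does with benchmark results, the sketch below ranks systems by a single KPI. The function name `leaderboard`, the system names, and all KPI values are made up for this example; none of them come from the HOBBIT platform.

```python
def leaderboard(results, kpi, higher_is_better=True):
    """Rank system names by one KPI; ties keep their input order."""
    ranked = sorted(
        results.items(),
        key=lambda item: item[1][kpi],
        reverse=higher_is_better,
    )
    return [name for name, _ in ranked]


# Made-up KPI values for three hypothetical systems.
results = {
    "system-a": {"throughput": 1200.0, "latency_ms": 35.0},
    "system-b": {"throughput": 950.0, "latency_ms": 12.0},
    "system-c": {"throughput": 1500.0, "latency_ms": 48.0},
}

by_throughput = leaderboard(results, "throughput")
by_latency = leaderboard(results, "latency_ms", higher_is_better=False)
```

The same results give different rankings depending on the KPI chosen, which is why the customer has to pick the KPI first.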
Main advantages:
1. Streaming fashion
2. Docker virtualization
3. RDF-enabled
Submit benchmarks
Submit systems
http://github.com/hobbit-project/platform
The HOBBIT platform. Architecture
The data pipeline:
1. Send raw/initial data (optional)
2. Send raw tuples
3.1 Send tasks (task = {tuple, id})
3.2 Send expected results per task
4. Send actual results per task
5. Send the “expected-actual” pairs
6. Send KPIs back to the controller
7. Send KPIs back to the platform
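The pipeline steps above can be sketched in a single process, without RabbitMQ or Docker. This is only an illustration of how expected and actual results are joined on the task id and reduced to KPIs. All names here (`Task`, `generate_tasks`, `run_system`, and so on) are hypothetical and are not the HOBBIT API, and the choice of accuracy and mean latency as KPIs is an assumption.

```python
from dataclasses import dataclass
from statistics import mean
import time


@dataclass
class Task:
    task_id: int
    payload: str  # the raw tuple wrapped into the task


def generate_tasks(tuples):
    """Steps 2 and 3.1: wrap each raw tuple into a task with an id."""
    return [Task(task_id=i, payload=t) for i, t in enumerate(tuples)]


def expected_results(tasks, oracle):
    """Step 3.2: the benchmark records the expected result per task id."""
    return {task.task_id: oracle(task.payload) for task in tasks}


def run_system(tasks, system):
    """Step 4: the system under test returns an actual result per task id."""
    actual, latencies = {}, {}
    for task in tasks:
        started = time.perf_counter()
        actual[task.task_id] = system(task.payload)
        latencies[task.task_id] = time.perf_counter() - started
    return actual, latencies


def pair_results(expected, actual):
    """Step 5: join expected and actual results on the task id."""
    return [(expected[i], actual.get(i)) for i in sorted(expected)]


def compute_kpis(pairs, latencies):
    """Step 6: reduce the pairs to KPIs (assumed here: accuracy, mean latency)."""
    correct = sum(1 for exp, act in pairs if exp == act)
    return {
        "accuracy": correct / len(pairs),
        "mean_latency_s": mean(latencies.values()),
    }


tasks = generate_tasks(["a", "b", "c", "d"])
expected = expected_results(tasks, oracle=str.upper)
# Hypothetical system under test that answers one task incorrectly.
actual, latencies = run_system(tasks, system=lambda s: "X" if s == "c" else s.upper())
kpis = compute_kpis(pair_results(expected, actual), latencies)
```

Keeping the task id on every message is what lets the evaluation step pair results even when the system answers out of order.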
Benchmark (customer’s application)
System components
(black box for customers)
Platform components
The online platform:
http://master.project-hobbit.eu/
Cluster: 6 nodes, each with 2 × 64-bit Intel Xeon E5-2630 v3 (8 cores, 2.4 GHz, HT, 20 MB cache per processor), 256 GB RAM, 1 Gb Ethernet
Nodes (benchmark/system): 3/3
https://github.com/hobbit-project/platform/wiki/Overview
The HOBBIT platform. Technologies
Platform communication channel (RabbitMQ only)
Data transportation channel (app-specific)
Platform-side:
1. Java
2. RabbitMQ
3. Docker+Swarm
4. GitLab
5. Redis
6. Virtuoso (RDF)
7. NodeJS
8. KeyCloak
App-side (defaults):
1. Java
2. RabbitMQ
Application side Platform side
(RabbitMQ, Kafka, Netty, Akka…)
Design and upload to HOBBIT
Create a project at
https://git.project-hobbit.eu
Create an account at
https://master.project-hobbit.eu
Clone and extend the basic code:
https://github.com/hobbit-project/java-sdk-example
Design components using the manuals:
Run tests locally as pure java code
Update ttl-files for your project
Upload
Design (alternative using the JAVA SDK)
Develop a benchmark component in Java
Develop a component in Java
Develop a system adapter
Develop a system adapter in Java
Create docker files using details (manual)
Design (the standard HOBBIT way)
Debug Docker images by running tests
Find your benchmark or system at
https://master.project-hobbit.eu
Build images (manual)
Configure remote project details
Upload docker images to
https://git.project-hobbit.eu
- Lots of understanding and manual work
- Impossible to debug locally *
- Upload non-tested images *
- No logs from the online platform, only GUI *
+ Clone and extend standard classes with your logic
+ Test and debug your code from the IDE
+ Build Docker images on demand from the IDE
+ Run your images from the IDE, check all internal logs
+ Upload fully tested images
* Unless you have a local HOBBIT deployment
Example: single benchmark run
http://master.project-hobbit.eu/
Example: challenges & leaderboards
http://master.project-hobbit.eu/
Challenges: DEBS GC 2017
DEBS Grand Challenge 2017 successfully completed
Anomaly detection for injection molding machines over RDF-streams.
• 14 teams registered
• 7 teams passed the correctness check
• 2 teams were awarded (main and audience award)
The StreaML Open Challenge is open; prize: 500 €
The main result:
For the first time we can objectively quantify the performance of
a distributed stream processing pipeline running analytics algorithms
https://project-hobbit.eu/challenges/debs-grand-challenge/
https://project-hobbit.eu/open-challenges/streaml-open-challenge/
The anomaly detector:
1. Start
2. Find cluster centers over W time units
3. After at least W time units, train the Markov model over the last W time units
4. Apply the Markov model for anomaly detection
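The stages of the anomaly detector (cluster, train a Markov model, apply it) can be sketched as follows. The slide names only the stages, so the details are assumptions made for illustration: 1-D k-means for the cluster centers, a first-order transition matrix as the Markov model, and a probability threshold over the last few transitions as the anomaly test. The function and parameter names are hypothetical.

```python
def cluster_centers(window, k, iterations=10):
    """1-D k-means; initial centers are the first k distinct values."""
    centers = []
    for v in window:
        if v not in centers:
            centers.append(v)
        if len(centers) == k:
            break
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in window:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # An empty group keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


def to_states(window, centers):
    """Map each value to the index of its nearest cluster center."""
    return [min(range(len(centers)), key=lambda i: abs(v - centers[i])) for v in window]


def train_markov(states, n_states):
    """Estimate transition probabilities from consecutive state pairs."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])
    return probs


def is_anomaly(states, probs, n_transitions, threshold):
    """Flag the sequence if its last n_transitions are jointly improbable."""
    recent = states[-(n_transitions + 1):]
    p = 1.0
    for a, b in zip(recent, recent[1:]):
        p *= probs[a][b]
    return p < threshold


# Made-up sensor readings for one window of W = 10 time units.
window = [1.0, 1.1, 0.9, 1.0, 1.1, 1.0, 0.9, 1.0, 5.0, 1.0]
centers = cluster_centers(window, k=2)
states = to_states(window, centers)
probs = train_markov(states, n_states=len(centers))
flagged = is_anomaly(states, probs, n_transitions=2, threshold=0.2)
```

In this toy window the jump to 5.0 and back is a rare transition, so the sequence ending in it falls below the threshold, while the earlier steady readings do not.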
Challenges: DEBS GC 2018
DEBS Grand Challenge 2018 has just started
https://project-hobbit.eu/challenges/debs2018-grand-challenge/
Prediction of arrival times and ports on marine traffic data.
Prize: 1000 € + publication in the DEBS proceedings (the conference will be held in New Zealand)
DEBS Grand Challenge 2017
• Synthetically generated data
• Predefined algorithms
• True RDF-streaming benchmark
• Focus: correctness check, throughput, latency
DEBS Grand Challenge 2018
• Real annotated data
• No predefined approach
• True ML benchmark
• Focus: prediction accuracy, performance
Available benchmarks overview
Versioning Benchmark
• Benchmark for assessing the ability of versioning systems to efficiently manage evolving datasets and queries
Data Storage Benchmark
• Benchmark for RDF data storage solutions against an interactive workload in a real-world scenario, using various dataset sizes
Linking Benchmark
• Benchmark for assessing the performance of instance matching tools that implement string-based approaches
Faceted Browsing Benchmark
• Benchmark for systems that support browsing through linked data by iterative transitions performed by an intelligent user
ODIN Benchmark
• Benchmark for data extraction solutions for structured data
• Simulates the ingestion, storage, and retrieval of streams of RDF data
Spatial Benchmark
• Benchmark for systems that deal with the topological relations proposed in the state-of-the-art DE-9IM model
Question Answering Benchmark
• Benchmark for ranking question answering systems based on their performance and accuracy
GERBIL Benchmark
• Benchmark for entity annotation and disambiguation tools
• 9 annotators, 11 RDF datasets
Stream Machine Learning Benchmark
• Benchmark for assessing the performance of anomaly detection for injection molding machines over RDF streams
Stream Machine Learning Benchmark v2
• Benchmark for assessing the accuracy of prediction over a stream of marine traffic data
http://github.com/hobbit-project
Summary
The HOBBIT platform
• Ability to benchmark heterogeneous distributed systems in a streaming fashion
• A set of benchmarks to compare relevant Linked Data technologies and solutions
• We apply the HOBBIT platform to rank machine-learning pipelines over RDF streams
• The platform may serve as a basis for benchmarking stream-reasoning solutions
Q&A
Thank you for your attention!
psmirnov@agtinternational.com
http://twitter.com/smirnp
http://twitter.com/AGTIntl
