GPSInsights: Towards an efficient
framework for storing and mining
massive real-time vehicle location data
Linh-Truong Hoang, Duy-Khanh Bui, Viet-Trung Tran
Hanoi University of Science and Technology
1	
  
Agenda
•  Motivation
•  System architecture
•  Scalable map-matching
•  Experimentation
•  Conclusion
2	
  
Global Navigation Satellite
System (GNSS)
•  Autonomous geo-spatial positioning
– position
– velocity
– time
•  "Great" points about GNSS
– Free
– Real-time
– No local infrastructure required
3	
  
GNSS as part of Intelligent
Transport Systems (ITS)
•  "Precious" data for real-time traffic
management
– traffic dashboard
– speed control
– traffic jams monitoring
4	
  
Need for collecting and mining massive GNSS data in REAL-TIME
GNSS data characteristics
•  Real-time
– reported every second
•  Massive in volume
– from millions of cars
•  "Bad" data
•  Needs to be processed within the
digital map topology
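
The slide does not say what counts as "bad" data. As an illustration only, the sketch below assumes it means fixes with out-of-range coordinates or implausible speeds. The record fields, the function names, and the 200 km/h threshold are all hypothetical and not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpsRecord:
    """One GNSS report; the field names are assumptions, not the authors' schema."""
    vehicle_id: str
    timestamp: int    # seconds since the epoch
    lat: float        # degrees
    lon: float        # degrees
    speed_kmh: float


def is_plausible(rec, max_speed_kmh=200.0):
    """True if coordinates are in range and the speed is non-negative and below the cap."""
    return (
        -90.0 <= rec.lat <= 90.0
        and -180.0 <= rec.lon <= 180.0
        and 0.0 <= rec.speed_kmh <= max_speed_kmh
    )


def drop_bad_records(records, max_speed_kmh=200.0):
    """Keep only the records that pass the plausibility check, preserving order."""
    return [r for r in records if is_plausible(r, max_speed_kmh)]
```

A check like this is stateless per record, so it could run as a first stage before map-matching. NaN coordinates fail the range comparisons and are dropped as well.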
5	
  
GNSS data exhibits the 5 Vs of big data
6	
  
SYSTEM ARCHITECTURE
Store massive GNSS data
Real-time mining 

7	
  
8	
  
Elasticity
High-throughput
Fault-tolerance
Scalable
First-class spatio-temporal API
High-throughput
Fault-tolerance
Online processing
Scalable
Fault-tolerance
Leverage open-source components
9	
  
Apache Spark processing
•  Resilient Distributed Dataset (RDD)
– In-memory, backed by persistent storage (HDFS)
– Fault tolerance through lineage
– Supports interactive and iterative analysis
10	
  
Spark Streaming
11	
  
Apache Storm
12	
  
MongoDB with geo-indexing
13	
  
GeoMesa: Accumulo + geo-indexing
14	
  
SCALABLE MAP-MATCHING
ALGORITHM
15	
  
Map-matching
•  Online vs. Offline

•  OSM map
16	
  
Algorithm
•  OSM map format
•  Filling intermediate points
– Millions more points 
– Massive data 
– but simple calculations
•  real-time, scalable
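
The "filling intermediate points" step can be pictured as densifying each road segment, so that finding the nearest point approximates finding the nearest road. The sketch below is a minimal illustration, assuming a road is a list of (lat, lon) nodes and that points are interpolated linearly at a fixed spacing. The function names and the 10 m default spacing are hypothetical and not taken from the slides.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def fill_segment(start, end, spacing_m):
    """Return start, evenly spaced intermediate points, and end.

    Linear interpolation in lat/lon is an approximation that is
    reasonable for short urban segments.
    """
    length = haversine_m(start, end)
    pieces = max(1, math.ceil(length / spacing_m))
    points = []
    for i in range(pieces + 1):
        t = i / pieces
        points.append((start[0] + (end[0] - start[0]) * t,
                       start[1] + (end[1] - start[1]) * t))
    return points


def fill_way(nodes, spacing_m=10.0):
    """Densify a polyline, emitting each shared node only once."""
    if not nodes:
        return []
    points = [nodes[0]]
    for start, end in zip(nodes, nodes[1:]):
        points.extend(fill_segment(start, end, spacing_m)[1:])
    return points
```

Each segment is processed independently of the others, which fits the slide's point that the data is massive but the calculations are simple.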
17	
  
K-d tree for closest neighbours
•  Runs on Apache Spark/Storm
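
The slides do not show the k-d tree implementation. The sketch below is a minimal pure-Python version of a nearest-neighbour lookup, assuming 2-D points in a planar projection and squared Euclidean distance. The names `build_kdtree` and `nearest` are illustrative only.

```python
from typing import NamedTuple, Optional, Tuple

Point = Tuple[float, float]


class Node(NamedTuple):
    point: Point
    axis: int                 # 0 splits on x, 1 splits on y
    left: Optional["Node"]
    right: Optional["Node"]


def build_kdtree(points, depth=0):
    """Recursively split on the median, alternating between the x and y axes."""
    if not points:
        return None
    axis = depth % 2
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(
        point=ordered[mid],
        axis=axis,
        left=build_kdtree(ordered[:mid], depth + 1),
        right=build_kdtree(ordered[mid + 1:], depth + 1),
    )


def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def nearest(node, target, best=None):
    """Return the stored point closest to target, or None for an empty tree."""
    if node is None:
        return best
    if best is None or sq_dist(node.point, target) < sq_dist(best, target):
        best = node.point
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, target, best)
    # Search the far side only if the splitting plane is closer than the best match.
    if diff ** 2 < sq_dist(best, target):
        best = nearest(far, target, best)
    return best
```

In a map-matching setting the stored points would be the densified road points and the query would be a GPS fix. How the tree is distributed across Spark or Storm workers is not specified in the slides.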
18	
  
EXPERIMENTATION
19	
  
Experiment setup
•  12 million GPS records collected in
March 2014 by vehicles equipped with
GPS receivers
•  4-node cluster
– 8-core Intel Xeon 2.6 GHz CPU, 32 GB memory
20	
  
Map-matching completion time
21	
  
Latency
22	
  
"Scalability"
23	
  
Demonstration
24	
  
Real-time traffic monitoring
25	
  
Real-time shortest path
26	
  
Conclusion
•  GPSInsights: Scalable framework for storing
and mining massive location data
– built on open-source scalable components
– scalable storage + real-time mining 
– Pluggable components
– Demonstration with scalable map-matching
algorithm
•  Future work
– Advanced map-matching algorithms
– Traffic jam prediction
27	
  
Current state of the art
•  PostGIS
– Spatial object management
on top of PostgreSQL
– Small data sizes only
– No mining support
28	
  

Paper@Soict2015: GPSInsights: towards a scalable framework for mining massive transportation data