hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and its Spark Extension
Alibaba Cloud Intelligence TST
• SimpleFeature
• Geometry
• SimpleFeatureType
• WKT (Well-known text)
• HBase Ganos is a new-generation spatio-temporal data engine (SDE) built on GeoMesa
and the Ali-HBase storage platform.
• Enables large-scale geospatial analytics on cloud and distributed computing systems.
• Supports data analysis with Apache Spark, using HBase Ganos as the backend datastore.
• Supports hot and cold data separation.
GeoHash uses interval halving on latitude and longitude to build up a bit-string of alternating
dimensions.
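The interval halving described above can be sketched in a few lines of Python. This is a minimal illustration of the standard public GeoHash scheme (longitude refined first, base-32 output), not the encoder used inside HBase Ganos or GeoMesa; the function names `geohash_bits` and `geohash` are hypothetical.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard GeoHash alphabet


def geohash_bits(lat, lon, nbits):
    """Build the bit-string by halving the lon/lat intervals alternately."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    bits = []
    for i in range(nbits):
        if i % 2 == 0:
            # Even positions refine longitude.
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits.append("1")
                lon_lo = mid
            else:
                bits.append("0")
                lon_hi = mid
        else:
            # Odd positions refine latitude.
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits.append("1")
                lat_lo = mid
            else:
                bits.append("0")
                lat_hi = mid
    return "".join(bits)


def geohash(lat, lon, precision=6):
    """Encode every 5 bits of the bit-string as one base-32 character."""
    bits = geohash_bits(lat, lon, precision * 5)
    return "".join(BASE32[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5))
```

Points that share a long GeoHash prefix lie in the same cell, which is what makes the bit-string usable as a sortable row-key component.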
a: Define the query region
b: Partition the space
c: Calculate the query ranges
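The three steps can be sketched as follows, assuming a fixed-resolution grid and a Z-order (bit-interleaved) curve. This is a simplification for illustration only: the slides do not show how Ganos plans ranges, production planners usually decompose the region recursively instead of enumerating every cell, and the names `interleave`, `cell_index`, and `query_ranges` are hypothetical.

```python
def interleave(x, y, bits):
    """Z-order code: alternate the bits of x and y, x first."""
    code = 0
    for i in range(bits - 1, -1, -1):
        code = (code << 1) | ((x >> i) & 1)
        code = (code << 1) | ((y >> i) & 1)
    return code


def cell_index(value, lo, hi, bits):
    """Map a coordinate to a cell number in [0, 2**bits - 1]."""
    n = 1 << bits
    idx = int((value - lo) / (hi - lo) * n)
    return min(max(idx, 0), n - 1)


def query_ranges(min_lon, min_lat, max_lon, max_lat, bits=4):
    # a: the query region is the bounding box passed in.
    # b: partition the space into a 2**bits x 2**bits grid and find
    #    the cells that the box overlaps.
    x0 = cell_index(min_lon, -180.0, 180.0, bits)
    x1 = cell_index(max_lon, -180.0, 180.0, bits)
    y0 = cell_index(min_lat, -90.0, 90.0, bits)
    y1 = cell_index(max_lat, -90.0, 90.0, bits)
    codes = sorted(
        interleave(x, y, bits)
        for x in range(x0, x1 + 1)
        for y in range(y0, y1 + 1)
    )
    # c: merge consecutive codes into inclusive (start, end) scan ranges.
    ranges = []
    for c in codes:
        if ranges and c == ranges[-1][1] + 1:
            ranges[-1][1] = c
        else:
            ranges.append([c, c])
    return [tuple(r) for r in ranges]
```

Each returned pair would correspond to one contiguous key range to scan; fewer, longer ranges mean fewer seeks at the cost of reading more cells that fall outside the box.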
Testing Environment: Huabei-2
• Master node: 2 CPU, 4 GB (hbase.n1.medium)
• Core nodes: 3 nodes, 4 CPU, 8 GB (hbase.sn1.2xlarge)
• Writing threads: 10
• Batch size: 1000
• Reading threads: 10
• GeoMesa Spark allows jobs to run on Apache Spark using
data stored in HBase Ganos.
• The library supports creating Spark RDDs and DataFrames from HBase Ganos
and writing Spark RDDs and DataFrames back to it.
Global Index:
• Grid
• RTree
• QuadTree
• KDBTree
Local Index:
• QuadTree
• RTree
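The slide only names the index types. As one illustration of the global level, the sketch below assumes that a global Grid index assigns each point to a partition by grid cell, so a bounding-box query only needs to visit the partitions whose cells overlap the box. That reading is an assumption, not something the slides state, and `grid_cell`, `partition`, and `bbox_query` are hypothetical names. A local index (QuadTree or RTree) would replace the linear scan inside each partition.

```python
from collections import defaultdict


def grid_cell(lon, lat, cell_deg):
    """Return the (column, row) of the grid cell containing the point."""
    return (int((lon + 180.0) // cell_deg), int((lat + 90.0) // cell_deg))


def partition(points, cell_deg=10.0):
    """Global step: group (lon, lat) points by grid cell."""
    parts = defaultdict(list)
    for lon, lat in points:
        parts[grid_cell(lon, lat, cell_deg)].append((lon, lat))
    return dict(parts)


def bbox_query(parts, min_lon, min_lat, max_lon, max_lat, cell_deg=10.0):
    """Visit only the partitions whose cells overlap the query box."""
    cx0, cy0 = grid_cell(min_lon, min_lat, cell_deg)
    cx1, cy1 = grid_cell(max_lon, max_lat, cell_deg)
    hits = []
    for cx in range(cx0, cx1 + 1):
        for cy in range(cy0, cy1 + 1):
            # Linear scan stands in for a per-partition local index.
            for lon, lat in parts.get((cx, cy), []):
                if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
                    hits.append((lon, lat))
    return sorted(hits)
```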
SELECT ship_id, ST_PointToTrajectory(ship_id, dtg, geom) AS traj
FROM point
GROUP BY point.ship_id

SELECT * FROM aispoint
WHERE st_contains(st_makeBBOX(114.00000, 22.00000, 115.00000, 23.00000), geom)
AND dtg BETWEEN cast('2018-09-08T01:00:00Z' AS timestamp) AND cast('2018-09-13T01:00:00Z' AS timestamp)
Amount of Points: 985,800,104
1. Spatial Query
• Time: 53 ms
• Result count: 15
2. Id and time query
• Time: 145 ms
• Result count: 7