Tsinghua University: Two Exemplary Applications in China

Big Machine Data - Two Exemplary Applications in China
Jianmin Wang
Tsinghua University
Beijing, China

Agenda
• Background
• Two Exemplary Applications in China

Who are we?
• Institute for Data Science, Tsinghua University
• Founded in April 2014
• Missions & Status Quo
– Recruiting world-class researchers and engineers from industry and academia
– Long-term dedication to system research and industry practice
– Leading China’s big data strategy, especially for industrial big data

Big Data Landscape
[Figure: three sources of big data, numbered 1 to 3 in the diagram: people generated, computer generated, and machine generated]

Machine Generated Data
• Broadly exist
– Industrial business
– Agriculture
– Utility
– Military
– Smarter City
– Logistics
– Smart devices
– Science research
• Data rate: 24*7, up to a million data points/s, and millions of devices
• Data type: mostly time series, temporal sequences, spatial-temporal data, and array data
• Data usage: real-time processing; from monitoring to content-, shape-, and signal-based query and analysis

• Industrial businesses have entered the era of “big data”, but the real challenge is how to extract value from the data.
• Machine-generated data is the core of industrial big data
Big Machine data is beyond 3Vs

Our research spans the big data lifecycle
1 Storage
2 Preprocessing
3 Access & Exploration
4 Modeling & Analytics

1 Industrial Sensor Data Management:
Cassandra at China Sany Group
2 Climate Data Management:
Cassandra at China Meteorological Administration
© 2015. All Rights Reserved.

China Sany Group
More than 200K active engineering machines in more than 150 countries
SANY Group is a global company in the construction machinery industry.
In 2011, SANY became the only Chinese company in the construction machinery industry to be listed among the world’s top 500 companies.

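Pipeline of Industrial Sensor Data Processing
[Figure: data pipeline from machine to users, with four numbered stages (1 to 4) labeled execute, collect, decision, transfer]
• On the machine: Sany Motion Controller (SYMC), Sany Industrial Display (SYLD), Sany Mobile Terminal (SYMT), product main control cabinet
• Transport: vehicle operating-condition data in SCP-protocol packets, via a wireless base station, wireless to wired, over the Internet to a designated IP and port
• Consumers: rapid-response engineers, documentation engineers, ..., user computers, service staff, owners
The data records the operational statuses of the machines
• 5000 kinds of sensors
• 50 billion records per year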

Technology Roadmap in Sany Group
• 2008: start managing sensor data
• 2010: 60k machines
• 2012: 80k machines; can only keep 6 months of data online
• 2014: >100k machines; all data online
• 2020: >500K machines; >10K users
Database migration: SQL Server ➡ Oracle, then Oracle ➡ Cassandra
Why Cassandra?
• Cost performance
• Scalability
• P2P architecture
Operation   Worst case   Average case
Write       30%          2x
Query       22.6%        10x
Software Stack of Sensor Data Management
• Collect: Storm
• Store: wide-row table (layout below)
• Analyze: Map/Reduce
row key   sensor1 (cf1)                          sensor2 (cf2)
          received time1   received time2   …    received time1   received time2   …
device1   value            value            …    value            value            …
device2   value            value            …    value            value            …

Structured Storage
[Figure: structured storage (machine, gather time, sensors ...) mapped to Cassandra storage]
Schema Design – Row and Column
• Use sensor as Column Family (CF)
• In each Column Family (CF)
– Use the machine as the row key
– Use the gather time as the column name
– Use the sensor reading as the column value
– Columns of each row are kept sorted
– The number of columns can grow freely
[Figure: column families sensor1, sensor2, ..., sensorN (~5000 sensors), each holding rows keyed by machine with gather-time columns]
5000+ column families
Cassandra v1.2, CQL2 (not CQL3)

Why 5000+ Column Families?
• Cassandra V1.* does not support compound primary keys or clustering keys
• Without them, programming is more complex
• The row key or column name has to be split manually, in one of two ways:
Row Key                  Column Name
machine_id               sensor_id : gather_time
sensor_id : machine_id   gather_time
• All the data in one SSTable belongs to a specific CF
• When querying a specified sensor, we need not scan unnecessary data
Cassandra v1.2, CQL2 (not CQL3)
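The two manual key layouts listed above (machine_id as the row key, or sensor_id : machine_id as the row key) can be sketched in a few lines. This is only an illustration in plain Python, not Cassandra client code: the helper names, the ":" separator handling, and the 13-digit zero padding (so that string order matches time order) are assumptions made for the example.

```python
SEP = ":"  # assumed separator, mirroring the "sensor_id : gather_time" notation


def layout_by_machine(machine_id, sensor_id, gather_time_ms):
    """Layout 1: row key = machine_id, column name = sensor_id:gather_time."""
    return machine_id, f"{sensor_id}{SEP}{gather_time_ms:013d}"


def layout_by_sensor(machine_id, sensor_id, gather_time_ms):
    """Layout 2: row key = sensor_id:machine_id, column name = gather_time."""
    return f"{sensor_id}{SEP}{machine_id}", f"{gather_time_ms:013d}"


def split_column(column_name):
    """Undo layout 1: recover (sensor_id, gather_time_ms) from a column name."""
    sensor_id, gather_time = column_name.split(SEP, 1)
    return sensor_id, int(gather_time)
```

With one column family per sensor, neither composite key is needed: the sensor is implied by the CF, the machine is the row key, and the gather time is the column name.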

Challenge 1 – Creating Schema Hang
• Problem
– Create 5000+ CFs in batch
– Creation cost increases dramatically
[Figure: time cost (ms, axis from 0 to 15000) of creating each CF versus CF serial number (1 to 541); annotations: "Create 1 CF: 0.1s" and "Create 1 CF: 10s"]
• Root Cause
– Protocol conflict
• Between the Gossip protocol and the request propagation mechanism
– Message overhead
• May transfer the whole schema instead of only the changed part
Memory used by Gossip
Node   ReceiveSchema message memory cost   SendSchema message memory cost   Total
N1     4.465G                              4.236G                           8.70G
N2     4.308G                              4.907G                           9.21G
N3     4.236G                              4.024G                           8.26G
N4     4.808G                              4.387G                           9.19G
N5     6.111G                              6.373G                           12.48G
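The message overhead can be illustrated with simple counting. The sketch below assumes, hypothetically, that every CF creation is sent to each peer and that message size is proportional to the number of CF definitions it carries; the function name and the cost unit are illustrative and are not measurements from the Sany cluster.

```python
def total_definitions_sent(num_cfs, peers, full_schema):
    """Count CF definitions shipped while creating num_cfs CFs one by one.

    full_schema=True models resending the whole schema on every change;
    full_schema=False models sending only the newly created CF.
    """
    total = 0
    for existing in range(1, num_cfs + 1):
        per_message = existing if full_schema else 1
        total += per_message * peers
    return total


whole = total_definitions_sent(5000, peers=4, full_schema=True)
delta = total_definitions_sent(5000, peers=4, full_schema=False)
print(whole, delta, whole / delta)
```

Under these assumptions the whole-schema variant grows quadratically with the number of CFs, while the delta variant grows linearly, which is consistent with the rising per-CF creation time reported on the slide.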

Challenge 1 – Creating Schema Hang
• Solution: Adaptive Lazy Gossip
– Gossip takes effect only when:
• Propagation messages are lost or time out
• Nodes recover from a failure
– Creation time cost stays constant
[Figure: Adaptive Lazy Gossip. A schema change is propagated directly to the metadata of the other nodes (steps 1 to 4), while the SCHEMA entry of the gossip state (LOAD, STATUS, SCHEMA, VERSION, ...) is gossiped with a delay of t seconds]
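The rule for when gossip takes effect can be written as a small decision function. This is a minimal sketch of the two conditions listed above, not the actual patch applied to Cassandra; the function and parameter names are hypothetical.

```python
def needs_schema_gossip(acked, waited_s, timeout_s, peer_just_recovered):
    """Decide whether schema state should fall back to gossip for one peer.

    acked: the peer acknowledged the directly propagated schema change.
    waited_s / timeout_s: time since the change was sent, and the allowed delay.
    peer_just_recovered: the peer rejoined after a failure and may be stale.
    """
    if peer_just_recovered:
        return True   # a recovered node may have missed earlier changes
    if acked:
        return False  # direct propagation succeeded, no gossip needed
    return waited_s >= timeout_s  # message lost or timed out
```

In the normal case (change acknowledged, no failure) schema state is never gossiped, which is why the creation cost no longer grows with the number of CFs.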

Challenge 2 – Balancing Consistency & Throughput
• Production environment
– Sany production: 5-node cluster, 2x4 cores, 64GB memory
• Problem
– Throughput = 200K data points/sec
– 75% of the data is written successfully to only one replica, while the other replicas are stale (inconsistent)
• Cassandra is NOT very consistent
• Big obstacle for query operations
– Repair is required, but it is very slow
[Figure: experiment on Amazon EC2 (5 nodes, 2 cores, 8GB); rywc = read-your-writes consistency]
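The trade-off between consistency and throughput follows from the replica overlap rule used by Dynamo-style stores: a read is guaranteed to see the latest acknowledged write only if the read set and the write set must share at least one replica. The sketch below states that rule; the replication factor of 3 in the usage lines is an assumption for illustration, since the slides do not give the production setting.

```python
def read_sees_latest_write(replication_factor, write_acks, read_replicas):
    """True if any read set must intersect any acknowledged write set.

    With N replicas, W write acknowledgements, and R replicas read,
    overlap is guaranteed exactly when W + R > N.
    """
    return write_acks + read_replicas > replication_factor


# Assumed replication factor of 3 (not stated in the slides).
print(read_sees_latest_write(3, write_acks=1, read_replicas=1))
print(read_sees_latest_write(3, write_acks=2, read_replicas=2))
```

Writing with a single acknowledgement maximizes write throughput, but the other replicas may stay stale until repair, which matches the problem described above.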

Challenge 2 – Balancing Consistency & Throughput
• Root Cause of slow Repair
– Too many column families (5000+)
– Too many ranges in the consistent hashing ring
• 256 virtual nodes (VN) per physical node
• Too many Merkle trees (ranges x CFs)
• Experience and Suggestions
– Repair CFs and ranges one by one
• Do not repair the whole keyspace (all CFs) at once
– Repair the important CFs first
– Perform repair under light workload
[Figure: example ring with 5 physical nodes, 2 VNs each, 10 ranges in total. For each range and each CF, a Merkle tree is created and compared between two nodes]
Challenge 3 – Heterogeneous Nodes
• Problem
– How to assign the data partitions in a heterogeneous cluster?
• Experiment Study
– Deploy a heterogeneous cluster
• 2 powerful servers and 8 PCs
– Throughput performance
• Heterogeneous cluster vs. a cluster with only the 2 powerful servers
Assign the positions of the nodes (i.e. tokens) in the ring according to their computing capacities
• Root Cause
– The replica mechanism complicates the imbalance problem
• Each node’s configuration may affect other nodes’ performance
– The Virtual Node (VN) mechanism cannot fit all scenarios
• Too many VNs make the lookup table too big and slow down repair
• The maximum number of VNs on a physical node is 1536 (restricted by the Cassandra source code)
[Figure: the capacity of N1 is the worst, and E is short, but N1 is responsible for many data records]
... to the cluster:
• N5 finishes the operation quickly
• But N5 has to wait for N1, which is slow

• Solution
– Initialize the cluster properly
• Use Quadratic Programming (QP) to find the best positions of the (virtual) servers
• Has been deployed at China Sany Group successfully
– Scaling out the cluster
• Use a dynamic algorithm to find the best position for the newly added server
Scaling out: Xiangdong Huang, Jianmin Wang et al. Optimizing Data Partition for Scaling out NoSQL Cluster. Concurrency and Computation: Practice and Experience (Early View)
[Figure: scaling out, finding the best position for the new server]
Optimize:
1. the order of the nodes in the ring
2. the range length of the ring
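To make the idea of capacity-aware token placement concrete, here is a deliberately simplified sketch. It is not the QP formulation or the dynamic algorithm referenced above: it only sizes each node's range in proportion to its capacity and ignores replication, which the slides identify as the complicating factor. The ring size, node names, and capacity values are assumptions for illustration.

```python
RING_SIZE = 2 ** 64  # assumed token space size, for illustration only


def proportional_tokens(capacities, ring_size=RING_SIZE):
    """Return {node: token}; each node owns the range that ends at its token."""
    total = sum(capacities.values())
    tokens, cumulative = {}, 0
    for node, capacity in capacities.items():
        cumulative += capacity
        tokens[node] = cumulative * ring_size // total
    return tokens


def range_lengths(tokens):
    """Return {node: length of the range owned by that node}, starting at 0."""
    lengths, previous = {}, 0
    for node, token in sorted(tokens.items(), key=lambda item: item[1]):
        lengths[node] = token - previous
        previous = token
    return lengths


# Hypothetical capacities: two strong servers and two PCs.
tokens = proportional_tokens({"server1": 4, "server2": 4, "pc1": 1, "pc2": 1}, 1000)
print(range_lengths(tokens))
```

The two quantities this sketch fixes by hand, node order and range length, are the ones the slide lists as optimization variables.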

Datasets & Results in China Sany Group
• 5000+ column families for sensor data
• 100K+ engineering machines
• Historical data loaded
– From 2012.4 to now
• Data size
– Tens of billions of operational status records
– Several billion GPS records
• Write throughput
– 5 nodes (2*4 CPU cores, 64GB memory, 9TB disk)
– 20K TPS as regular workload, 200K at peak

Industrial Big Data Platform: More Requirements
Beyond Sany Applications
• Even higher throughput
– High-frequency sensors
– High-volume sensors
– 10+ M data points/second
• Native time-series query
– Time- and value-based query
– Richer set of analytical queries
– Response in under 1 second
• Synchronization
– Edge synchronization
– Compression, out-of-order, retransmission
• Adaptive deep compression
– Different data, different algorithms
– Transparent to query
– Deep compression of historical data
• Moving object support
– Spatial-temporal index
– Trajectory-based queries

Industrial Data Analysis Pipeline
[Figure: pipeline over specified operational status data of three kinds (Boolean, status, and analogue values), with the stages basic indicator, baseline, and variance, leading to common features, specific features, and outliers; numeric labels in the figure: 1.046 billion, 8030, 1.046 billion]
Basic indicators, each with a baseline and a variance:
• General: count, frequency, ..
• Analogue: average, variance, extremum, …
• Boolean: times, duration, …
• States: change times, duration, …
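A few of the basic indicators named above can be sketched with the Python standard library. The exact definitions used in the pipeline are not given in the slides, so the choices here are assumptions: the baseline is taken as the mean of an indicator over a group of machines, the variance as the population variance around it, and Boolean samples are (timestamp in seconds, value) pairs sorted by time.

```python
from statistics import mean, pvariance


def analogue_indicators(values):
    """Average, variance, and extremum of an analogue signal."""
    return {
        "average": mean(values),
        "variance": pvariance(values),
        "extremum": (min(values), max(values)),
    }


def boolean_indicators(samples):
    """Times (number of switch-ons) and total on-duration of a Boolean signal."""
    times, duration = 0, 0
    prev_t, prev_on = None, False
    for t, on in samples:
        if prev_on:
            duration += t - prev_t  # signal was on since the previous sample
        if on and not prev_on:
            times += 1              # rising edge
        prev_t, prev_on = t, on
    return {"times": times, "duration": duration}


def baseline_and_variance(indicator_values):
    """Assumed baseline (mean) and variance of one indicator across machines."""
    return mean(indicator_values), pvariance(indicator_values)
```

An outlier check would then compare one machine's indicator against the baseline, scaled by the variance; the threshold for that comparison is not specified in the slides.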

Industry Practice – Value-Added Analytics
• Driver profile
• Hydraulic oil temperature analysis
• Temporal parameter analysis for vehicle start
• Parameter correlation
• Spatial analysis for failure
• Key components anomaly detection
Application areas: Service, Quality Control, R&D

Big Data Application 1: Concrete Pump Truck Tip-over Detection
Concrete pump truck tip-over is mainly caused by insufficient support from the leg cylinders, and it is a major production safety issue.
[Figure: horizontal inclination angle]

Quickly spot and prevent dangerous operation through group behavior analysis of concrete pump trucks
[Figure: overall distribution of the horizontal (X-axis) and vertical (Y-axis) inclination angles of concrete pumps, with unstable instances and idle instances marked]
[Figure: inclination angle distribution of an individual concrete pump, with the inclination angle vibration level filter]
Data-driven detection of anomalies and potential accidents
• Idle instances: unplugging operation leads to malfunctioning
• Unstable instances: early degradation pattern of the cylinder
• Typical instances: stable oscillation
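A vibration-level filter over the inclination angle could look like the sketch below. It is an assumed, simplified stand-in for the filter mentioned in the slides: the vibration level is taken as the population standard deviation of the angle samples, and both thresholds are hypothetical values chosen only to make the example concrete.

```python
from statistics import pstdev


def classify_by_vibration(angles_deg, idle_below=0.05, unstable_above=1.0):
    """Label one machine's inclination-angle samples by their vibration level.

    idle_below and unstable_above are hypothetical thresholds in degrees.
    """
    level = pstdev(angles_deg)
    if level < idle_below:
        return "idle"      # almost no oscillation
    if level > unstable_above:
        return "unstable"  # oscillation far above the typical level
    return "typical"       # stable oscillation


print(classify_by_vibration([0.0, 0.4, 0.0, 0.4]))
```

In a group analysis the thresholds would be derived from the distribution across the whole fleet rather than fixed by hand.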

Fault Diagnosis
• Via time series pattern analysis and spatial correlation, the master cylinder leakage problem was found to be highly correlated with a high-speed rail construction project.
• Investigation proved that the salt-spray environment and the water quality along the coast caused the corrosion of the cylinder’s potted component.
[Figure: Hangzhou-Shenzhen high-speed rail; salt-spray corrosion environment]

Spare Components Demand Forecasting
• Traditional approach is based on marketers’ experience
• New approach
– Combine real-time data from machines, sales history, vehicle holdings, environment, GDP, etc.
• Result
– Inactive spare part inventory reduced by half
[Figure: spare part demand (units, axis from 0 to 250) per ten-day period (early, middle, and late part of each month) from 2012/10 to 2013/6. Series: actual demand; results of multi-region collaborative spare components prediction based on matrix factorization; actual stock prepared by the company]
The predicted result fits the actual demand better
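The slides name matrix factorization as the basis of the multi-region prediction but give no details. The sketch below shows only the generic technique: a partially observed demand matrix (for example, regions by periods) is approximated by the product of two low-rank factors fitted with stochastic gradient descent, and missing cells are then predicted from the factors. The rank, learning rate, regularization, and the toy data are all assumptions, and none of it reproduces the model used at Sany.

```python
import random


def factorize(observed, n_rows, n_cols, rank=2, steps=2000, lr=0.01, reg=0.01, seed=0):
    """Fit P (n_rows x rank) and Q (n_cols x rank) so that P[i].Q[j] ~ observed[(i, j)]."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_cols)]
    for _ in range(steps):
        for (i, j), value in observed.items():
            err = value - predict(P, Q, i, j)
            for k in range(rank):
                p, q = P[i][k], Q[j][k]
                P[i][k] += lr * (err * q - reg * p)
                Q[j][k] += lr * (err * p - reg * q)
    return P, Q


def predict(P, Q, i, j):
    """Predicted value of cell (i, j) from the two factors."""
    return sum(p * q for p, q in zip(P[i], Q[j]))


def mse(observed, P, Q):
    """Mean squared error over the observed cells."""
    return sum((v - predict(P, Q, i, j)) ** 2 for (i, j), v in observed.items()) / len(observed)


# Toy data: 3 regions x 3 periods, with cell (2, 2) unobserved.
observed = {(i, j): float((i + 1) * (j + 1)) for i in range(3) for j in range(3) if (i, j) != (2, 2)}
P, Q = factorize(observed, 3, 3)
print(mse(observed, P, Q), predict(P, Q, 2, 2))
```

The collaborative aspect comes from sharing the period factors Q across all regions, so demand observed in one region informs the prediction for another.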

1 Industrial Data Management:
Cassandra at China Sany Group
2 Climate Data Management:
Cassandra at China Meteorological Administration

Characteristics of Climate Data
• Data kinds: model, ground, aerological, satellite, radar, lightning, typhoon
• Model data, e.g. T639
– Variables: wind field, temperature field, humidity, rainfall, snowfall, …
– Levels (e.g. for temperature): 800Pa, 850Pa, 900Pa, …
– Time and ageing: 8AM 3h, 8AM 6h, …, 8PM 3h, 8PM 6h

Challenge in Meteorological Application: Pattern Data
• Hierarchical pattern data + flat others
• A highly efficient data delivery system for end users
– Support millions of small files
– Access data fast
– Scan data in various orders
• Performance requirement
– Get ~1MB data in 50ms
– 600 concurrent clients
[Figure: directory tree of pattern data with depths d1 to d5: root / ➡ pattern (T639, ...) ➡ variable (wind, temper, ...) ➡ level (800, 850, 900) ➡ time (2014.2.18.08, 2014.2.18.20, 2014.2.19.08) ➡ ageing (3, 6, 9, ...); leaf files t1 to t7]

Why Cassandra?
• Scalability
• Fast data reads and writes
• Some columns are sorted
– Easy to scan data sequentially
• Time-based compaction (Cassandra v2.0 and later) for time series
key                      3h     6h     9h     …
T639/temperature/800Pa   file   file   file   …
1. Get the data where key=‘T639…/800Pa’
2. Retrieve the data before 6h, or retrieve the data after 6h
key                      3h     6h     9h     12h    …   3h     6h     9h
T639/temperature/800Pa   file   file   file   file   …   file   file   file
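The benefit of sorted columns for the "before 6h / after 6h" retrieval can be shown with an in-memory stand-in for one wide row. This is a sketch in plain Python, not the Cassandra API: the class and method names are made up for the example, and columns are kept sorted with the standard `bisect` module so that a range read becomes a contiguous slice.

```python
from bisect import bisect_left, bisect_right


class WideRow:
    """One row whose columns are kept sorted by column name (here: ageing in hours)."""

    def __init__(self):
        self._names = []   # sorted column names
        self._values = {}  # column name -> stored value

    def put(self, name, value):
        if name not in self._values:
            self._names.insert(bisect_left(self._names, name), name)
        self._values[name] = value

    def before(self, name):
        """Columns with a name strictly smaller than `name`, in order."""
        end = bisect_left(self._names, name)
        return [(n, self._values[n]) for n in self._names[:end]]

    def after(self, name):
        """Columns with a name strictly greater than `name`, in order."""
        start = bisect_right(self._names, name)
        return [(n, self._values[n]) for n in self._names[start:]]


row = WideRow()  # stands for the row with key 'T639/temperature/800Pa'
for hours in (9, 3, 6, 12):
    row.put(hours, f"file-{hours}h")
print(row.before(6), row.after(6))
```

Because the columns are already ordered, both queries touch only the requested part of the row instead of scanning all of it.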

Solution – Schema Design for Pattern Data
• Data items
– 5-tuple: (pattern, variable, level, time, ageing)
– Pattern and variable are unordered
– Level, time, ageing are ordered
[Figure: data space with the axes time, level, and ageing; each (pattern, variable) pair maps to a ColumnFamily, and the ordered dimensions map to the row key and the columns; shown next to the directory tree of pattern data]

Performance Results
• 10 servers: 2*4 CPU cores, 64GB memory, 9TB SAS disk
• Store 7 kinds of model data
– 16TB per day
• Get data quickly
– 100 times faster than the older system
