ApproxIoT
Approximate Analytics for Edge Computing
https://ApproxIoT.github.io/ApproxIoT/
Zhenyu Wen, Do Le Quoc,
Pramod Bhatotia, Ruichuan Chen, Myungjin Lee
Modern online services
Stream aggregator -> stream analytics system -> useful information
Processing streaming data from different sources
Modern online services
Tension: low latency vs. efficient resource utilization
Approach: approximate computing
Approximate computing
Many applications: approximate output is good enough!
Example: live taxi heatmap, where a fraction of the data is sufficient for the application
Approximate computing
Idea: To achieve low latency, compute over a subset of data items instead of the entire dataset
Approximate computing (sampling) -> analyze -> approximate output ± error bound
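The idea on this slide can be illustrated with a minimal sketch: estimate a mean from a simple random sample and report it with an error bound. The function name `approximate_mean`, the 95% confidence level (z = 1.96), and the normal approximation are assumptions made for illustration; the slides do not say how ApproxIoT computes its error bound.

```python
import math
import random
import statistics


def approximate_mean(items, sample_size, z=1.96, rng=None):
    """Estimate the mean of `items` from a simple random sample.

    Returns (estimate, error_bound). The bound is the half-width of an
    approximate 95% confidence interval (normal approximation, no
    finite-population correction); both choices are assumptions.
    """
    rng = rng or random.Random()
    sample = rng.sample(items, sample_size)
    estimate = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(sample_size)
    return estimate, z * std_err


# Hypothetical dataset: 100,000 values cycling through 0..99.
data = [float(i % 100) for i in range(100_000)]
estimate, bound = approximate_mean(data, sample_size=1_000, rng=random.Random(42))
print(f"approximate mean = {estimate:.2f} ± {bound:.2f}")
```

Only 1% of the items are touched here, which is where the latency saving comes from; the price is the ± term in the output.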
State-of-the-art system
StreamApprox [Middleware’17]
Data streams S1, S2, …, Sn -> stream aggregator -> StreamApprox (cloud datacenter) -> approximate output ± error bound
Limitations:
• It wastes bandwidth
• It utilizes only cloud datacenter resources
Edge computing
Components: source of data, edge node (local processing), gateway, cloud
Allows data to be processed at the edge node before it’s sent to the cloud
Opportunities:
• Providing more computing resources
• Saving bandwidth
Edge infrastructure
Source: https://peering.google.com/#/infrastructure
Azure IoT Edge
Watson IoT
AWS IoT
Problem statement
To build a stream analytics system
• By utilizing the cloud and edge computing resources
• By leveraging approximate computing
Design goals
• Efficiency: Efficient utilization of computing resources
• Adaptability: Adaptive execution based on the available resources
• Transparency: No code changes or resource management required from the user
Outline
• Motivation
• Design
• Implementation
• Evaluation
ApproxIoT: Overview
Sources: S1, …, Si, …, Sn, …, Sm
Layers: edge nodes, regional edge, continental node, central node (cloud)
Query -> ApproxIoT -> approximate output ± error bound
ApproxIoT employs sampling in the distributed environment of edge + cloud
Naïve algorithm
Simple random sampling (SRS)
Query -> SRS -> approximate output ± error bound
Problem: sub-streams are sampled unfairly; some are overlooked, which gives low accuracy
Background: Stratified sampling
Advantage: The sub-streams are sampled fairly
Disadvantage: Requires knowledge of each sub-stream's size
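A minimal sketch of stratified sampling makes the trade-off concrete. The helper name `stratified_sample`, the `(substream_id, value)` item layout, and the proportional allocation with at least one item per stratum are assumptions chosen for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(items, fraction, rng=None):
    """Sample each sub-stream (stratum) separately.

    `items` is a list of (substream_id, value) pairs that is fully
    available up front, so every stratum size is known.
    """
    rng = rng or random.Random()
    strata = defaultdict(list)
    for substream_id, value in items:
        strata[substream_id].append(value)

    sample = {}
    for substream_id, values in strata.items():
        # The per-stratum sample size is derived from the stratum size,
        # which is why the sizes must be known beforehand.
        k = max(1, round(fraction * len(values)))
        sample[substream_id] = rng.sample(values, k)
    return sample


# Hypothetical input: one large and one small sub-stream.
items = [("S1", v) for v in range(1000)] + [("S2", v) for v in range(10)]
sample = stratified_sample(items, fraction=0.1, rng=random.Random(0))
print({sid: len(vals) for sid, vals in sample.items()})
```

The small sub-stream S2 is always represented, but the code cannot start sampling until it has seen every item, which is impractical for an unbounded stream.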
Background: Reservoir sampling
Example: size of reservoir = 4
• The first 4 items fill the reservoir
• The 5th item: with probability 4/5, a randomly chosen item in the reservoir is replaced by the 5th item
• The 6th item: with probability 4/6, a randomly chosen item in the reservoir is replaced by the 6th item
Advantage:
• No pre-knowledge required of sub-stream size
Disadvantages:
• The sub-streams are sampled unfairly
• Difficult to run on multiple nodes
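The reservoir sampling steps on these slides correspond to the classic single-pass algorithm, sketched below. The function name `reservoir_sample` and the example stream of ten integers are illustrative assumptions.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream, start=1):
        if i <= k:
            # The first k items fill the reservoir.
            reservoir.append(item)
        else:
            # j is uniform in [0, i), so j < k happens with probability k/i.
            # In that case the i-th item replaces the uniformly chosen slot j.
            j = rng.randrange(i)
            if j < k:
                reservoir[j] = item
    return reservoir


# Size of reservoir = 4, as on the slide; the stream itself is hypothetical.
reservoir = reservoir_sample(range(1, 11), k=4, rng=random.Random(7))
print(reservoir)
```

The stream length never appears in the code, which is the advantage listed above. The single shared reservoir is also why a small sub-stream mixed into the same stream can be crowded out.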
ApproxIoT sampling algorithm
Weighted hierarchical sampling (WHS)
Combining stratified and reservoir sampling
Easy to parallelize, requires no synchronization between sub-streams
Weight: W = C/N, if C > N
        W = 1,   if C <= N
where C is the number of items received from a sub-stream and N is the reservoir size
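The weight rule can be written as a small function. This is a sketch only: the name `whs_weight` is hypothetical, and exact fractions are used so the results can be compared directly with the values on the slides.

```python
from fractions import Fraction


def whs_weight(c, n):
    """Weight of a sub-stream with c items received and reservoir size n.

    If the reservoir overflows (c > n), each kept item stands for c/n
    original items; otherwise every item is kept and the weight is 1.
    """
    if n <= 0:
        raise ValueError("reservoir size must be positive")
    return Fraction(c, n) if c > n else Fraction(1)


print(whs_weight(6, 4))  # C = 6, N = 4: the reservoir overflows
print(whs_weight(3, 4))  # C = 3, N = 4: all items fit
```

For C = 6 and N = 4 the function returns 6/4 in lowest terms.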
WHS example
Reservoir size N=4, with initial weight W=1
Sub-streams with C <= 4: W=1
Sub-stream with C=6: W=6/4
WHS on edge nodes
Reservoir size equals 2
Regional edge WHS: incoming weights W=1, W=1, W=1 become W=6/2=3, W=4/2=2, W=1
Continental node WHS: incoming weights W=4, W=1, W=3 become W=4*5/2=10, W=1*3/2=3/2, W=3
New weight = carried weight * current weight
Easy to parallelize, requires no synchronization between computing nodes
Layers: edge nodes, regional edge, continental node, central node (cloud)
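One WHS step at a single node can be sketched as follows, using the continental-node numbers from this slide (reservoir size 2, carried weights 4, 1 and 3). The function name `whs_node`, the dictionary layout, the item labels, and the use of `random.sample` as a stand-in for per-sub-stream reservoir sampling are all assumptions. The size of the third sub-stream is not given on the slide and is assumed to be 2.

```python
import random
from fractions import Fraction


def whs_node(substreams, n, rng=None):
    """Sample every sub-stream independently and update its weight.

    `substreams` maps a sub-stream id to (carried_weight, items).
    Returns the same structure after sampling with reservoir size n.
    """
    rng = rng or random.Random()
    out = {}
    for sid, (carried, items) in substreams.items():
        c = len(items)
        if c > n:
            # Stand-in for reservoir sampling over one interval.
            sampled = rng.sample(items, n)
            weight = carried * Fraction(c, n)  # carried weight * current weight
        else:
            sampled = list(items)
            weight = carried                   # current weight is 1
        out[sid] = (weight, sampled)
    return out


incoming = {
    "a": (Fraction(4), ["a1", "a2", "a3", "a4", "a5"]),  # C = 5
    "b": (Fraction(1), ["b1", "b2", "b3"]),              # C = 3
    "c": (Fraction(3), ["c1", "c2"]),                    # C = 2 (assumed)
}
outgoing = whs_node(incoming, n=2, rng=random.Random(1))
for sid, (weight, sampled) in outgoing.items():
    print(sid, weight, sampled)
```

Each loop iteration reads only its own sub-stream, so the sub-streams (and the nodes that run this step) need no coordination with one another.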
ApproxIoT in the cloud
Reservoir size equals 1
The weights are carried: incoming weights W=4/3, W=1, W=1 become W=4/3*6/1=8, W=1*4/1=4, W=1*2/1=2
Query (sum) over the WHS sample
Approximate output: 8*(item) + 4*(item) + 2*(item) ± error bound
Implementation
Data streams S1, S2, …, Sn -> edge nodes -> sampled data stream -> cloud datacenter
Components: Kafka cluster (stream pub/sub), Kafka Streams
See the paper for more details
Experimental setup
• Evaluation questions
  • Accuracy vs. sample size
  • Throughput vs. sample size
• Testbed: 25 nodes
  • 15 nodes for ApproxIoT deployment
  • 10 nodes for Kafka cluster
• Datasets:
  • Synthetic: Poisson and Gaussian distributions
  • Real: Brasov pollution and New York Taxi Ride
See the paper for more results!
Accuracy vs. sample size
[Figure: accuracy loss (%) vs. sampling fraction (%) for SRS and ApproxIoT; lower is better]
ApproxIoT: ~2600X higher accuracy over SRS
The average is 0.035%
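The slides do not define accuracy loss. A common definition, assumed here, is the relative error of the approximate result against the exact result, expressed as a percentage. The function name `accuracy_loss` and the example numbers are hypothetical.

```python
def accuracy_loss(approx, exact):
    """Relative error of an approximate result, in percent (assumed definition)."""
    if exact == 0:
        raise ValueError("exact result must be non-zero")
    return 100.0 * abs(approx - exact) / abs(exact)


# Hypothetical query results, not taken from the evaluation.
print(accuracy_loss(approx=99.0, exact=100.0))
```

Under this definition, a lower value means the sampled result is closer to the result of native execution over all items.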
Throughput vs. sample size
[Figure: throughput (k items/s) vs. sampling fraction (%) for native execution, SRS, and ApproxIoT; higher is better]
• ApproxIoT has low overhead compared to the native execution
• ApproxIoT has similar throughput as SRS
Conclusion
ApproxIoT: Approximate analytics for edge computing
Efficiency: Efficient computing and bandwidth resource utilization
Adaptability: Adaptive execution based on the available resources
Transparency: Requires no code changes or resource management from the user
Thank you!
More details on the project website:
https://ApproxIoT.github.io/ApproxIoT/
