IOT, PYTHON, AND ML:
From Chips and Bits to Data Science
Jeff Fischer
Data-Ken Research
jeff@data-ken.org
https://data-ken.org
Sunnyvale, California, USA
PyData Seattle July 6, 2017
Agenda
¨ Project overview
¨ Hardware
¨ Data capture
¨ Data analysis
¨ Player
¨ Parting thoughts
© 2016, 2017 Jeff Fischer
Why Python for IoT?
¨ High-level, easy to prototype ideas and explore options
¨ Runs on embedded devices
¨ Python data analysis ecosystem
¤ Array and matrix processing
¤ High-level data analysis tools
¤ Numerical analysis routines
¤ Machine learning
Raspberry Pi
• Linux “workstation”
• Can run CPython and the full data science stack
• Not battery friendly
ESP8266
• System-on-a-chip with 32-bit CPU, WiFi, I/O
• Low power consumption
• Only 96K data memory!
• MicroPython to the rescue
Project Motivation
¨ First thought about smart thermostat, but too dangerous
¨ Lighting is “safe”
¨ If out of town for the weekend, don’t want to leave the house dark
¨ Timers are flakey and predictable
¨ Would like a self-contained solution
¨ “Wouldn’t it be cool to use machine learning?”
Lighting Replay Application
Data Capture: Lux Sensors; ESP8266 remote nodes + Raspberry Pi
→ Captured sensor data →
Analysis and Machine Learning: Offline analysis using Jupyter, Pandas, HMMlearn
→ HMM state machines →
Player Application: Simple script using HMMlearn and Phue to control Philips Hue lights
→ Smart Lights
Hardware
ESP8266
• TSL2591 lux sensor breakout board
• Lithium Ion Polymer Battery 3.7v 350mAh
• MicroUSB to USB cable
• ½ Size breadboard
• Adafruit Feather HUZZAH ESP8266 breakout board
ESP8266: Wiring Diagram
Labeled connections: SDA, SCL, GND, 3V
Raspberry Pi
• Raspberry Pi 2
• Breakout cable “Pi Cobbler Plus”
• Solderless Breadboards
• Resistor
• LED
• TSL2591 lux sensor breakout board
Raspberry Pi: Wiring Diagram
• Resistor (10k)
• LED: Anode (long lead), Cathode (short lead)
• Labeled connections: GND, 3.3V, SDA, SCL, GPIO 0
Data Capture
Lighting Replay Application: Capture
Front Bedroom Sensor Node: Lux Sensor → ESP8266
Back Bedroom Sensor Node: Lux Sensor → ESP8266
Sensor nodes → MQTT → Raspberry Pi (Dining Room)
Raspberry Pi (Dining Room): MQTT Broker, Data Capture App, Lux Sensor
Data Capture App → Flat Files
Event-driven IoT Code Can Be Ugly
import time

def sample_and_process(sensor, mqtt_writer, xducer, completion_cb, error_cb):
    try:
        sample = sensor.sample()
    except StopIteration:
        # End of stream: flush the transducer, then disconnect
        final_event = xducer.complete()
        if final_event:
            mqtt_writer.send(final_event,
                             lambda: mqtt_writer.disconnect(lambda: completion_cb(False), error_cb),
                             error_cb)
        else:
            mqtt_writer.disconnect(lambda: completion_cb(False), error_cb)
        return
    except Exception as e:
        error_cb(e)
        # "lambda: pass" is not valid Python; use a no-op lambda instead
        mqtt_writer.disconnect(lambda: None, error_cb)
        return
    event = SensorEvent(sensor_id=sensor.sensor_id, ts=time.time(), val=sample)
    csv_writer(event)
    median_event = xducer.step(event)
    if median_event:
        mqtt_writer.send(median_event,
                         lambda: completion_cb(True), error_cb)
    else:
        completion_cb(True)

def loop():
    def completion_cb(more):
        if more:
            event_loop.call_later(0.5, loop)
        else:
            print("all done, no more callbacks to schedule")
            event_loop.stop()
    def error_cb(e):
        print("Got error: %s" % e)
        event_loop.stop()
    event_loop.call_soon(lambda: sample_and_process(sensor, mqtt_writer, transducer, completion_cb, error_cb))
Problems
1. Callback hell
2. Connecting of event streams intermixed with handling of runtime situations: normal flow, error, and end-of-stream conditions.
3. Low-level scheduling
4. async/await helps, but not much
My Solution: ThingFlow
¨ What is ThingFlow?
¤ A Domain Specific Language for IoT event processing
¤ Runs on Python3 and MicroPython
¨ Co-creator
¤ Rupak Majumdar, Scientific Director at Max Planck Institute for Software Systems
¨ Why did we create ThingFlow?
¤ IoT event processing code can be very convoluted
¤ No standardization of sensors, adapters, and transformations
¤ Different frameworks for microcontrollers, edge processing, analytics
Simple ThingFlow Example
o Periodically sample a light sensor
o Write the sensed value to a local file
o Every 5 samples, send the moving average to MQTT Broker
Graphical Representation
Event Scheduler → Lux Sensor
Lux Sensor → Write to File
Lux Sensor → Moving Avg → Send to MQTT
Code
sensor.connect(file_writer('file'))
sensor.transduce(MovingAvg(5)).connect(mqtt_writer)
scheduler.schedule_periodic(sensor, 5)
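To make the data flow concrete without relying on ThingFlow itself, here is a minimal plain-Python sketch of the same three steps. The class `MovingAvgEvery` and the function `run` are hypothetical names invented for this illustration, and the two lists merely stand in for the local file and the MQTT broker.

```python
from collections import deque


class MovingAvgEvery:
    """Hypothetical transducer: emits the mean of the last n samples, once every n samples."""

    def __init__(self, n):
        self.n = n
        self.window = deque(maxlen=n)
        self.count = 0

    def step(self, value):
        self.window.append(value)
        self.count += 1
        if self.count % self.n == 0:
            return sum(self.window) / len(self.window)
        return None  # nothing to emit for this sample


def run(samples, n=5):
    file_log = []    # stands in for the local file
    published = []   # stands in for the MQTT broker
    xducer = MovingAvgEvery(n)
    for value in samples:
        file_log.append(value)        # every sample is written locally
        avg = xducer.step(value)
        if avg is not None:
            published.append(avg)     # only the averages are sent on
    return file_log, published
```

Every sample reaches the "file", while only one value per five samples reaches the "broker", which is the bandwidth saving the example is after.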
ESP8266 ThingFlow Code
from thingflow import Scheduler, SensorAsOutputThing
from tsl2591 import Tsl2591  # https://github.com/jfischer/micropython-tsl2591
from mqtt_writer import MQTTWriter
from wifi import wifi_connect
import os
# Params to set
WIFI_SID= …
WIFI_PW= …
SENSOR_ID="front-room"
BROKER='192.168.11.153'
wifi_connect(WIFI_SID, WIFI_PW)
sensor = SensorAsOutputThing(Tsl2591())
writer = MQTTWriter(SENSOR_ID, BROKER, 1883,
                    'remote-sensors')
sched = Scheduler()
# Sample at 60 second intervals. The MQTT writer is connected to the lux sensor.
sched.schedule_sensor(sensor, SENSOR_ID, 60, writer)
sched.run_forever()
See https://github.com/mpi-sws-rse/thingflow-examples/blob/master/lighting_replay_app/capture/esp8266_main.py
Raspberry Pi Code
Lux Sensor
MQTT Adapter → Map to UTF8 → Parse JSON → Map to events → Dispatch
CSV File Writer (front room)
CSV File Writer (back room)
CSV File Writer (dining room)
https://github.com/mpi-sws-rse/thingflow-examples/blob/master/lighting_replay_app/capture/sensor_capture.py
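The decode, parse, map, and dispatch stages above can be sketched in plain Python as follows. This is an illustration only, not the code from the linked script: the payload layout (a JSON list of sensor id, timestamp, value) is an assumption, and `to_event` and `dispatch` are hypothetical helper names. The `SensorEvent` fields follow the ones used earlier in the deck.

```python
import csv
import io
import json
from collections import namedtuple

SensorEvent = namedtuple("SensorEvent", ["sensor_id", "ts", "val"])


def to_event(payload):
    text = payload.decode("utf-8")    # map raw bytes to UTF-8 text
    fields = json.loads(text)         # parse JSON
    # Assumed payload layout: [sensor_id, timestamp, value]
    return SensorEvent(fields[0], fields[1], fields[2])


def dispatch(payloads, writers):
    """Route each event to the CSV writer registered for its sensor id."""
    for payload in payloads:
        event = to_event(payload)
        writer = writers.get(event.sensor_id)
        if writer is not None:        # ignore sensors we have no writer for
            writer.writerow([event.ts, event.val])


# Usage: in-memory buffers stand in for the per-room CSV files.
buffers = {"front-room": io.StringIO(), "back-room": io.StringIO()}
writers = {room: csv.writer(buf) for room, buf in buffers.items()}
dispatch([b'["front-room", 1499356800, 12.5]',
          b'["back-room", 1499356860, 3.0]'], writers)
```

Keeping one writer per sensor id means a new room only needs a new dictionary entry, not a new branch in the processing code.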
Data Analysis
Lighting Replay Application: Analysis
Raspberry Pi (Dining Room): Flat Files
→ file copy →
Laptop: Jupyter Notebook → HMM definitions
Preprocessing the Data
(ThingFlow running in a Jupyter Notebook)
CSV File Reader → Fill in missing times → Sliding Mean → Round values → Output Event Count
Side outputs: Pandas Writer (raw series), Pandas Writer (smoothed series), Capture NaN Indexes
(reader.fill_in_missing_times()
    .passthrough(raw_series_writer)
    .transduce(SensorSlidingMeanPassNaNs(5))
    .select(round_event_val)
    .passthrough(smoothed_series_writer)
    .passthrough(capture_nan_indexes)
    .output_count())
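For readers without ThingFlow at hand, the two central preprocessing steps can be approximated in plain Python. This is a sketch under stated assumptions, not the behavior of the ThingFlow operators: timestamps are assumed to be integer seconds on an exact 60-second grid, the sliding mean is assumed to skip NaN inputs and keep its history across gaps, and both function names are hypothetical.

```python
import math


def fill_in_missing_times(events, period=60):
    """events: list of (ts, val) sorted by ts. Inserts (ts, NaN) for every missing slot."""
    filled = []
    prev_ts = None
    for ts, val in events:
        if prev_ts is not None:
            t = prev_ts + period
            while t < ts:                      # one NaN event per missing sample
                filled.append((t, math.nan))
                t += period
        filled.append((ts, val))
        prev_ts = ts
    return filled


def sliding_mean_pass_nans(events, window=5):
    """Rounded mean of the last `window` real values; NaN events pass through unchanged."""
    history = []
    out = []
    for ts, val in events:
        if math.isnan(val):
            out.append((ts, val))              # keep the gap visible downstream
            continue
        history = (history + [val])[-window:]
        out.append((ts, round(sum(history) / len(history))))
    return out
```

Passing the NaNs through, rather than interpolating, keeps the data gaps explicit so that later steps can decide how to treat them.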
Data Processing: Raw Data
[Plot: front room, last day, with data gaps marked]
Data Processing: Smoothed Data
[Plot: front room, last day]
Data Processing: K-Means Clustering
[Plot: front room, last day]
Data Processing: Mapping to on-off values
[Plot: front room, last day]
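The clustering and on-off mapping steps might look roughly like the sketch below. It is a simplification under loud assumptions: the slides do not state the number of clusters or the mapping rule, so this uses a hand-written one-dimensional k-means with two clusters and treats the brightest cluster as "on". The function names are hypothetical.

```python
def nearest(value, centroids):
    """Index of the centroid closest to value."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))


def kmeans_1d(values, k=2, iterations=20):
    """Lloyd's algorithm on scalar lux readings (requires k >= 2). Returns sorted centroids."""
    lo, hi = min(values), max(values)
    # Start with centroids spread evenly across the observed range
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            clusters[nearest(v, centroids)].append(v)
        # Recompute each centroid; keep the old one if its cluster is empty
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)


def to_on_off(values, centroids):
    """Assumed mapping: 1 ("on") if nearest to the brightest centroid, else 0 ("off")."""
    brightest = len(centroids) - 1
    return [1 if nearest(v, centroids) == brightest else 0 for v in values]
```

Reducing the lux series to a binary on/off sequence is what turns roughly 25,000 samples into a few hundred transitions.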
Applying “Machine Learning”
¨ Apply supervised learning to create predictions for the light
¤ Regression => predict light value
¤ Classification => Light “on” or “off”
¤ Features = time of day; time relative to sunrise, sunset; history
¨ Challenges
¤ Transitions more important than individual samples (200 vs. 25,000)
¤ Different class sizes: light is mostly off
¤ Really a random process
¨ Solution: Hidden Markov Models
Hidden Markov Models (HMMs)
¨ Markov process
¤ State machine with a probability associated with each outgoing transition
¤ Transition probabilities depend only on the current state, not on history
¨ Hidden Markov Model
¤ The states are not visible to the observer, only the outputs (“emissions”).
¨ In a machine learning context:
¤ (Sequence of emissions, # states) => inferred HMM
¨ The hmmlearn library will do this for us.
¤ https://github.com/hmmlearn/hmmlearn
¨ But, no way to account for time of day, etc.
[Figure: Example Markov process (from Wikipedia)]
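To illustrate the definitions above, the following sketch generates emissions from a small, hand-specified HMM. It covers only the generative direction; inferring the model from observed emissions is the part left to hmmlearn. The function name and the probability tables in the usage lines are made up for the illustration.

```python
import random


def sample_hmm(start_probs, trans_probs, emit_probs, n, seed=None):
    """Generate n emissions. The visited states stay hidden from the caller."""
    rng = random.Random(seed)
    states = list(range(len(start_probs)))
    state = rng.choices(states, weights=start_probs)[0]
    emissions = []
    for _ in range(n):
        symbols = list(emit_probs[state].keys())
        weights = list(emit_probs[state].values())
        emissions.append(rng.choices(symbols, weights=weights)[0])
        # The next state depends only on the current state (Markov property)
        state = rng.choices(states, weights=trans_probs[state])[0]
    return emissions


# Usage: two hidden states with illustrative, made-up probabilities.
start = [0.9, 0.1]
trans = [[0.95, 0.05],
         [0.10, 0.90]]
emit = [{"off": 0.99, "on": 0.01},
        {"off": 0.05, "on": 0.95}]
sequence = sample_hmm(start, trans, emit, n=10, seed=1)
```

An observer sees only `sequence`, a list of "on" and "off" values, which is exactly the situation with the recorded light data.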
Slicing Data into Time-based “Zones”
Zone boundaries along the day: Sunrise | 30 minutes before sunset | Max(sunset+60m, 9:30 pm)
Zone numbers along the timeline: 0 1 2 3 0
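A possible way to assign a timestamp to a zone, given the day's sunrise and sunset, is sketched below. The three boundaries come from the slide; where zone 3 ends is not stated there, so this sketch assumes it runs until midnight, after which zone 0 resumes. `zone_boundaries` and `zone_for` are hypothetical names.

```python
from datetime import datetime, timedelta


def zone_boundaries(sunrise, sunset):
    """Return the three boundaries named on the slide, for the day of `sunset`."""
    nine_thirty = sunset.replace(hour=21, minute=30, second=0, microsecond=0)
    evening_end = max(sunset + timedelta(minutes=60), nine_thirty)
    return sunrise, sunset - timedelta(minutes=30), evening_end


def zone_for(ts, sunrise, sunset):
    """Zone number (0-3) for a timestamp on the same day as sunrise and sunset."""
    b1, b2, b3 = zone_boundaries(sunrise, sunset)
    if ts < b1:
        return 0   # before sunrise
    if ts < b2:
        return 1   # daytime
    if ts < b3:
        return 2   # around and after sunset
    return 3       # late evening (assumed to last until midnight)
```

Tying the boundaries to sunrise and sunset, rather than to fixed clock times, lets the same zone definitions work in both summer and winter.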
HMM Training and Prediction Process
Training
1. Build a list of sample subsequences for each zone
2. Guess a number of states (e.g. 5)
3. For each zone, create an HMM and call fit() with the subsequences
Prediction
For each zone of a given day:
¤ Run the associated HMM to generate N samples for an N-minute zone duration
¤ Associate a computed timestamp with each sample
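The prediction step for a single zone could be sketched as below. The argument `sample_fn` is a hypothetical stand-in for whatever draws N samples from the zone's fitted HMM; the sketch only shows how one sample per minute is paired with a computed timestamp.

```python
from datetime import datetime, timedelta


def predict_zone(sample_fn, zone_start, zone_end):
    """Return [(timestamp, sample), ...] with one sample per minute of the zone."""
    minutes = int((zone_end - zone_start).total_seconds() // 60)
    samples = sample_fn(minutes)   # N samples for an N-minute zone
    return [(zone_start + timedelta(minutes=i), s)
            for i, s in enumerate(samples)]
```

Running this for each zone of a day and concatenating the results would give a full day of predicted on/off values.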
HMM Predicted Data
[Plots: front room, one week predicted data; front room, one day predicted data]
Replaying the Lights
Lighting Replay Application: Replay
Raspberry Pi (Dining Room): HMM definitions → Player Script
Player Script → HTTP → WiFi Router and Switch → Philips Hue Bridge
Philips Hue Bridge → ZigBee → Front Room Smart Light, Back Room Smart Light
Logic of the Replay Script
¨ Use the phue library to control the lights
¨ Reuse the time-based zone logic and HMMs from the analysis
¨ Pseudo-code:
Initial testing of lights
while True:
    compute predicted values for rest of day
    organize predictions into a time-sorted list of on/off events
    for each event:
        sleep until event time
        send control message for event
    wait until next day
https://github.com/mpi-sws-rse/thingflow-examples/blob/master/lighting_replay_app/player/lux_player.py
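The inner part of that loop can be sketched in plain Python as follows. This is not the linked player script: `on_off_events` and `replay` are hypothetical names, and the `sleep_until` and `send` callbacks are placeholders for the real waiting logic and for the call that would switch a Hue light.

```python
def on_off_events(predictions):
    """Collapse per-minute (timestamp, 0/1) predictions into time-sorted state changes."""
    events = []
    last = None
    for ts, val in sorted(predictions):
        if val != last:            # keep only the transitions
            events.append((ts, val))
            last = val
    return events


def replay(events, sleep_until, send):
    """Walk the event list, waiting for each event time before sending it."""
    for ts, val in events:
        sleep_until(ts)
        send(bool(val))            # True = light on, False = light off
```

Injecting the two callbacks keeps the scheduling logic testable without real lights or real waiting.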
Parting Thoughts
Lessons Learned
¨ End-to-end projects great for learning
¨ Machine learning involves trial-and-error
¨ Visualization is key
¨ Python ecosystem is great for both runtime IoT and offline analytics
Thank You
Contact Me
Email: jeff@data-ken.org
Twitter: @fischer_jeff
Website and blog: https://data-ken.org
More Information
ThingFlow: https://thingflow.io
Examples (including lighting replay app): https://github.com/mpi-sws-rse/thingflow-examples
Hardware tutorial: http://micropython-iot-hackathon.readthedocs.io/en/latest/
