1© Cloudera, Inc. All rights reserved.
Harnessing Data within Hadoop
in the Connected Brewery:
Kafka, Spark Streaming, and
Kudu
Jason Hubbard
Jason.hubbard@cloudera.com
Cloudera
2© Cloudera, Inc. All rights reserved.
Internet of Things (IoT)
IoT Markets - 2020
• $1.7 Trillion in value
• 20% annual growth
• 30 Billion things
• 250 Million connected vehicles
Source: IDC & Gartner estimates
3© Cloudera, Inc. All rights reserved.
IoT Will Drive An Explosion of Data…
Data expected to explode to 44 ZB by 2020 (44 Trillion GB!)
80% of data will be unstructured
Source: IDC
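As a quick check on the unit conversion quoted on this slide, assuming decimal (SI) prefixes, where 1 ZB is 10^21 bytes and 1 GB is 10^9 bytes:

```latex
\[
44\ \text{ZB} = 44 \times 10^{21}\ \text{B}
= \frac{44 \times 10^{21}}{10^{9}}\ \text{GB}
= 44 \times 10^{12}\ \text{GB}
\]
```

That is 44 trillion GB, matching the figure above.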
4© Cloudera, Inc. All rights reserved.
Value is maximized when data is combined with other sources
Value of Data is multiplied when you combine and correlate it with other data from relevant sources
40%: Improvement in value that can be unlocked by combining data from multiple IoT applications and sources
Interoperability would significantly improve performance by combining sensor data from different machines and systems to provide decision makers with an integrated view of performance
SOURCE: McKinsey Global Institute analysis
5© Cloudera, Inc. All rights reserved.
The IoT Ecosystem
Consumer
Industrial
IoT Gateway
Data Center
Data Analytics
Sensors/ Things
Data Characteristics
• Un-structured
• Intermittent
• Volume & Variety
Gateway
• Data Routing
• Edge-Processing
• Edge-Storage
Sensors/ Things
• To grow by 50X
• Drop in prices by 70% in last 5 years
Data Storage, Processing & Analytics
IoT Data Characteristics
• More processing in the cloud
• Analytics on the cloud
IoT Data Analytics
• Key to Value Creation
• Combine data from multiple sources & types
• Drive business insights
IoT Data Characteristics
• Distributed Data
Processing
• Cloud & On-Premise
Cloud
6© Cloudera, Inc. All rights reserved.
IoT Attributes
• Low-powered devices, possibly battery powered
• Highly distributed
• Gateway/controller, possibly on a mesh network
• Compact messages
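To make "compact messages" concrete, here is a minimal sketch of a fixed-width binary encoding for one sensor reading. The field layout (sensor id, timestamp, temperature in hundredths of a degree) is a hypothetical choice for illustration; it is not taken from the talk or from any of the protocols it mentions.

```python
import json
import struct

# Hypothetical layout: unsigned 16-bit sensor id, unsigned 32-bit epoch
# seconds, signed 16-bit temperature in hundredths of a degree Celsius.
# "<" selects little-endian with no padding, so each reading is 8 bytes.
READING = struct.Struct("<HIh")


def encode_reading(sensor_id: int, timestamp: int, temp_c: float) -> bytes:
    """Pack one reading into a fixed 8-byte message."""
    return READING.pack(sensor_id, timestamp, round(temp_c * 100))


def decode_reading(payload: bytes) -> tuple:
    """Unpack a message into (sensor_id, timestamp, temp_c)."""
    sensor_id, timestamp, centi = READING.unpack(payload)
    return sensor_id, timestamp, centi / 100


if __name__ == "__main__":
    msg = encode_reading(7, 1_500_000_000, 18.25)
    as_json = json.dumps(
        {"sensor_id": 7, "timestamp": 1_500_000_000, "temp_c": 18.25}
    )
    print(len(msg), "bytes packed vs", len(as_json), "bytes as JSON")
```

The trade-off is that a fixed layout has no room for schema changes, so a real deployment would likely add a version field.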
7© Cloudera, Inc. All rights reserved.
IoT Challenges
• Multiple protocols (Z-Wave, Zigbee, Thread, etc.)
• Distributed, low-power devices may mean data arrives from multiple locations
• Devices may power off to save battery or move away from the controller, so late data must be handled
• Calibration between devices may be limited
• Very fast and bursty traffic
• Low bandwidth last mile
8© Cloudera, Inc. All rights reserved.
Use Cases
• Yes, Contrived
• But a good excuse to:
• Brew Beer
• Buy more sensors and microprocessors
• Sorry Wife
9© Cloudera, Inc. All rights reserved.
Use Case - Calibration
• Sensors need to be calibrated continually
• Calibration takes resources and downtime
• Instead, use historical raw data
• Calibrate on known values
• For temperature sensors, use the boiling point and triple point
• A temperature sensor is typically linear between these points
• Fit a curve instead
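The calibration idea above can be sketched in a few lines. This is an illustration under stated assumptions, not the method used in the demo: the reference values assume water (triple point 0.01 °C, boiling point roughly 100 °C at standard pressure), the function names are hypothetical, and a straight-line least-squares fit stands in for "fit a curve"; a higher-order fit would follow the same pattern.

```python
def two_point_calibration(raw_low, raw_high, ref_low=0.01, ref_high=100.0):
    """Build a linear correction from two known reference temperatures.

    raw_low / raw_high are the sensor's raw readings at the reference
    points; ref_low / ref_high default to assumed values for water.
    """
    if raw_high == raw_low:
        raise ValueError("raw readings at the two reference points must differ")
    gain = (ref_high - ref_low) / (raw_high - raw_low)
    offset = ref_low - gain * raw_low

    def calibrate(raw):
        return gain * raw + offset

    return calibrate


def fit_line(raw_values, ref_values):
    """Least-squares fit of ref = gain * raw + offset over historical pairs."""
    n = len(raw_values)
    if n < 2 or n != len(ref_values):
        raise ValueError("need at least two (raw, reference) pairs")
    mean_raw = sum(raw_values) / n
    mean_ref = sum(ref_values) / n
    sxx = sum((x - mean_raw) ** 2 for x in raw_values)
    if sxx == 0:
        raise ValueError("raw values must not all be identical")
    sxy = sum(
        (x - mean_raw) * (y - mean_ref) for x, y in zip(raw_values, ref_values)
    )
    gain = sxy / sxx
    return gain, mean_ref - gain * mean_raw


if __name__ == "__main__":
    calibrate = two_point_calibration(raw_low=0.4, raw_high=98.7)
    print(calibrate(50.0))
```

With only two reference points the line is exact at both; using many historical pairs in `fit_line` averages out noise in any single reading.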
10© Cloudera, Inc. All rights reserved.
Use Case - Optimize Models
• A Kalman filter estimates a variable in the presence of noise
• Need to know the accuracy of the sensor
• Usually published by the manufacturer, but generalized
• Accuracy can degrade over time
• PID Controller
• 3 parameters control performance
• Parameters differ for each application
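Both techniques on this slide are small enough to sketch. The code below is a textbook scalar Kalman filter and a basic PID controller, written for illustration and not taken from the demo. The parameter `meas_var` is where the sensor accuracy mentioned above enters, and `kp`, `ki`, `kd` are the three PID parameters; all names and default values are assumptions.

```python
def kalman_1d(measurements, meas_var, process_var=1e-5, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a slowly varying value.

    meas_var is the sensor noise variance (the "accuracy of the sensor");
    process_var models how much the true value may drift between samples.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + process_var        # predict: uncertainty grows
        k = p / (p + meas_var)     # Kalman gain: trust in the new reading
        x = x + k * (z - x)        # update the estimate toward the reading
        p = (1 - k) * p            # uncertainty shrinks after the update
        estimates.append(x)
    return estimates


class PID:
    """Basic PID controller; kp, ki and kd are tuned per application."""

    def __init__(self, kp, ki, kd, setpoint):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self._integral = 0.0
        self._prev_error = None

    def update(self, measurement, dt):
        """Return the control output for one sample taken dt seconds apart."""
        if dt <= 0:
            raise ValueError("dt must be positive")
        error = self.setpoint - measurement
        self._integral += error * dt
        if self._prev_error is None:
            derivative = 0.0       # no previous sample to difference against
        else:
            derivative = (error - self._prev_error) / dt
        self._prev_error = error
        return self.kp * error + self.ki * self._integral + self.kd * derivative


if __name__ == "__main__":
    print(kalman_1d([18.2, 18.6, 18.1, 18.4], meas_var=0.25, x0=18.0)[-1])
    print(PID(kp=2.0, ki=0.1, kd=0.0, setpoint=65.0).update(60.0, dt=1.0))
```

If `meas_var` is set from a generalized datasheet value while the real sensor has degraded, the filter will trust readings too much, which is why estimating it from historical data is attractive.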
11© Cloudera, Inc. All rights reserved.
Use Case - Predictive Maintenance
• No, not just for heavy machinery
• Sensors fail too
• Can save money by not replacing too early
• More importantly, schedule downtime
• Better model with more data: the same sensors in the same application across many factories
12© Cloudera, Inc. All rights reserved.
Technologies
• Apache Kafka
• Messaging Framework – Scalable, Fault Tolerant
• Pub/Sub
• Retains Data
• Apache Spark
• General Purpose Distributed Processing Framework
• Multiple Components including Streaming
• Streaming continually processes data
• Apache Kudu
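The "Pub/Sub" and "Retains Data" points can be illustrated with a toy model: because the log keeps messages after they are read, each consumer tracks its own offset and a new consumer can replay from the start. The classes below are hypothetical in-memory stand-ins, not the Kafka client API.

```python
class RetainedLog:
    """Toy append-only log standing in for one topic partition."""

    def __init__(self):
        self._messages = []

    def publish(self, message):
        """Append a message and return its offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, offset, max_messages=100):
        """Return up to max_messages starting at offset; nothing is deleted."""
        return self._messages[offset:offset + max_messages]


class Consumer:
    """Each consumer remembers its own position in the log."""

    def __init__(self, log):
        self._log = log
        self.offset = 0

    def poll(self):
        batch = self._log.read(self.offset)
        self.offset += len(batch)
        return batch


if __name__ == "__main__":
    log = RetainedLog()
    for reading in ("t=18.2", "t=18.4"):
        log.publish(reading)
    streaming = Consumer(log)
    print(streaming.poll())
    log.publish("t=18.9")
    replay = Consumer(log)  # a late-joining consumer still sees everything
    print(replay.poll(), streaming.poll())
```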
13© Cloudera, Inc. All rights reserved.
Kudu for IoT
Why it matters
14© Cloudera, Inc. All rights reserved.
Kudu use cases
Kudu is best for use cases requiring a simultaneous combination of
sequential and random reads and writes
• Machine data analytics
• Example: IoT, Connected Cars, Network threat detection
• Workload: Inserts, scans, lookups
• Time series
• Examples: Streaming market data, fraud detection / prevention, risk monitoring
• Workload: Inserts, updates, scans, lookups
• Online reporting
• Example: Operational data store (ODS)
• Workload: Inserts, updates, scans, lookups
15© Cloudera, Inc. All rights reserved.
How would we build the Analytics System Today?
• HDFS Excels at:
• Full table scans
• Ad-hoc analytics
Diagram: Sensors → Kafka / Pub-sub → Events → Ingest → HDFS partitions (Today's Partition, Yesterday's Partition, Historic Data) → Analyst
Ingest steps: 1. Have we accumulated enough data? 2. Flush into HDFS
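The two ingest steps in this diagram (accumulate, then flush) can be sketched as a small buffer. This is an illustrative stand-in only: `BufferedIngest` and its `sink` callback are hypothetical names, and the sink here is any callable rather than a real HDFS writer.

```python
class BufferedIngest:
    """Accumulate events and hand them to a sink once enough have arrived."""

    def __init__(self, flush_threshold, sink):
        self.flush_threshold = flush_threshold
        self.sink = sink           # callable that receives a list of events
        self._buffer = []

    @property
    def pending(self):
        """Number of events not yet visible downstream."""
        return len(self._buffer)

    def add(self, event):
        self._buffer.append(event)
        # Step 1: have we accumulated enough data?
        if len(self._buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Step 2: write the batch out as one unit and start a new buffer.
        if self._buffer:
            self.sink(self._buffer)
            self._buffer = []


if __name__ == "__main__":
    written = []
    ingest = BufferedIngest(flush_threshold=3, sink=written.append)
    for reading in range(7):
        ingest.add(reading)
    print(written, "pending:", ingest.pending)
```

The `pending` count is the cost of this design: events sitting in the buffer are invisible to the analyst until the next flush.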
16© Cloudera, Inc. All rights reserved.
Handling Late Arriving Data
Diagram: HDFS (Storage) with date partitions /cars/01-13/, /cars/01-14/, /cars/01-15/
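To see why late data is awkward with this layout, the sketch below routes events to date partitions by event time and reports which older partitions would have to be reopened. The `/cars/MM-DD/` path shape follows the slide; the function names, the year, and the sample events are assumptions for illustration.

```python
from collections import defaultdict
from datetime import date, datetime


def partition_path(event_time: datetime, root: str = "/cars") -> str:
    """Map an event timestamp to its date partition, e.g. /cars/01-13/."""
    return f"{root}/{event_time:%m-%d}/"


def route_events(events):
    """Group (event_time, payload) pairs by the partition they belong to."""
    partitions = defaultdict(list)
    for event_time, payload in events:
        partitions[partition_path(event_time)].append(payload)
    return dict(partitions)


def late_partitions(events, today: date):
    """Return partitions for earlier days that just received new events."""
    return sorted({partition_path(t) for t, _ in events if t.date() < today})


if __name__ == "__main__":
    batch = [
        (datetime(2017, 1, 15, 9, 0), "on time"),
        (datetime(2017, 1, 13, 23, 59), "arrived two days late"),
    ]
    print(route_events(batch))
    print(late_partitions(batch, today=date(2017, 1, 15)))
```

Every path returned by `late_partitions` is a partition that was already written and now needs to be rewritten or compacted.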
17© Cloudera, Inc. All rights reserved.
Hybrid big data analytics pipeline
Before Kudu
Diagram: Sensors → Kafka / Pub-sub → Events → Consumer → HBase (Random Reads) → Snapshot & Convert to Parquet → HDFS (Storage) → Analytics → Analyst, with a step to compact late arriving data
18© Cloudera, Inc. All rights reserved.
Hybrid big data analytics pipeline
After Kudu
Diagram: Sensors → Kafka / Pub-sub → Events → Consumer → Kudu → Random Reads, Analytics → Analyst
Kudu supports a simultaneous combination of sequential and random reads and writes
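The practical effect of one store that handles both access patterns can be sketched with a keyed table: late or corrected readings are ordinary upserts on the primary key, and the same table serves time-range scans and single-row lookups. `KeyedTable` is a hypothetical in-memory stand-in, not the Kudu client API, and the key choice (sensor id plus event time) is an assumption.

```python
class KeyedTable:
    """In-memory stand-in for a table keyed by (sensor_id, event_time)."""

    def __init__(self):
        self._rows = {}

    def upsert(self, sensor_id, event_time, value):
        """Insert a new row or overwrite the row with the same key."""
        self._rows[(sensor_id, event_time)] = value

    def lookup(self, sensor_id, event_time):
        """Random read of a single row; returns None if the key is absent."""
        return self._rows.get((sensor_id, event_time))

    def scan(self, start_time, end_time):
        """Sequential read: rows with start_time <= event_time < end_time."""
        selected = [
            (t, s, v)
            for (s, t), v in self._rows.items()
            if start_time <= t < end_time
        ]
        return sorted(selected)


if __name__ == "__main__":
    table = KeyedTable()
    table.upsert("fermenter-1", 10, 18.0)
    table.upsert("fermenter-1", 30, 18.4)
    table.upsert("fermenter-1", 20, 18.2)  # late arrival, no partition rewrite
    table.upsert("fermenter-1", 10, 18.1)  # correction to an earlier reading
    print(table.scan(0, 25))
    print(table.lookup("fermenter-1", 30))
```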
19© Cloudera, Inc. All rights reserved.
What Kudu is *NOT*
• Not a SQL interface itself
• It’s just the storage layer
• Not an application that runs on HDFS
• It’s an alternative, native Hadoop storage engine
• Not a replacement for HDFS or HBase
• Select the right storage for the right use case
20© Cloudera, Inc. All rights reserved.
Kudu Trade-Offs (vs HBase)
• Random updates will be slower
• HBase model allows random updates without incurring a disk seek
• Kudu requires a key lookup before update, Bloom lookup before insert
• Single-row reads may be slower
• Columnar design is optimized for scans
• Future: may introduce “column groups” for applications where single-row
access is more important
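The "Bloom lookup before insert" point is easier to follow with a generic example. A Bloom filter can say a key is definitely absent, which lets an insert skip the expensive key lookup in the common case. The sketch below is a textbook Bloom filter with hypothetical names and sizes; it does not describe Kudu's actual implementation.

```python
import hashlib


class BloomFilter:
    """Set membership test with no false negatives and few false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self._bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str):
        for pos in self._positions(key):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        return all(
            self._bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def insert_row(table: dict, bloom: BloomFilter, key: str, row) -> bool:
    """Insert only if the key is new; return True if the row was stored."""
    # The dict membership test stands in for the full key lookup, and it
    # only runs when the Bloom filter cannot rule the key out.
    if bloom.might_contain(key) and key in table:
        return False
    table[key] = row
    bloom.add(key)
    return True


if __name__ == "__main__":
    table, bloom = {}, BloomFilter()
    print(insert_row(table, bloom, "sensor-7@10", 18.2))
    print(insert_row(table, bloom, "sensor-7@10", 99.9))
```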
21© Cloudera, Inc. All rights reserved.
Demo

IoT Connected Brewery


Editor's Notes

  • #3: Let's start by taking a look at the market potential for IoT. Billions of devices, including everything from cars, homes, airplanes, parking meters, factories, oil rigs, and heavy machinery to wearables, will be connected to the internet and, more importantly, will be interconnected, enabling businesses to work smarter, faster, and more profitably. We are talking about significant growth and some big numbers here. By 2020, we are talking anywhere from 30 to 50 billion connected things, depending on who you talk to. There will be around a quarter billion connected cars on the roads. And it is estimated that IoT will generate ~1.7 trillion US dollars in value in the next 4-5 years, with an approximate growth rate of 20% YoY.
  • #4: Data is the key to IoT, and to the ability to gain insights from it. IoT isn't just about the things or connecting these objects to the Internet; IoT is really going to be all about the data. With 30 billion things connected, IoT will drive an explosion of data. The amount of data on the planet is set to grow 10-fold to around 44 ZB. If you are wondering how much that is, it is about 44 trillion GB of data. Not only that: over 80% of that data is going to be unstructured or semi-structured. So the question then becomes: how can you effectively manage, store, process, analyze, and drive insights from all of the data that IoT is going to generate?
  • #5: Data coming in from just one sensor has value, but limited value. The real value of this data is realized by combining it with data from other IoT sensors or with internal and external data. For example, it's good to know through sensors that the brake pads in your car need to be replaced, but auto manufacturers are taking it to the next level. They want to combine that data with other data about the customer, including the make and model of the car, where the customer lives, and how he or she likes to shop, and then send a targeted offer: a brake pad change at your favorite body shop, with a coupon for 15% off the replacement service. McKinsey estimates that situations in which two or more IoT systems must work together can account for about 40 percent of the total value that can be unlocked by the Internet of Things. For example, interoperability would significantly improve performance by combining sensor data from different machines and systems to provide decision makers with an integrated view of performance across an entire factory or oil rig. While most use cases involve an immediate response (e.g., when a sensor detects a water leak), the bigger value may be in analyzing historical data or combining it with other data sets.
  • #6: MEMS sensor prices have dropped 30-70% in the past five years (McKinsey research). Data types are diverse, from intermittent sensor readings of temperature and pressure to real-time location data or live video streams for video analytics. Given the flexible, scalable nature of cloud-based infrastructure and the fact that machine data often originates off premises, we expect a lot of IoT data to be stored and processed in the cloud. The ideal IoT data platform can be deployed either on premises or in a public, hybrid, or private cloud environment. It should be possible to administer the platform via both a web-based interface and API calls.
    The gateway collects, aggregates, and optionally processes the data generated by the devices. It can also accept and route commands sent from the backend to the respective device. The gateway is responsible for authenticating and authorizing the devices to participate in the workflow, and it ensures secure communication between the devices and the centralized command center. It is capable of dealing with multiple protocols and data formats.
    Response to edge analytics: Having access to all of your data is important, but with access comes responsibility. We are not saying that every bit of the data generated by every sensor needs to make its way back to the data center. For some data it might make sense to collect, store, interpret, and respond locally. But organizations need a strategy about which data needs to be collected at the atomic level, which data needs to be rolled up and aggregated, and which data needs to be used to run the business. You will have to decide what happens at the edge, at the core, and perhaps in between.
    For example, rather than send all sensor data to a central location, an edge device or software solution may send a summary of the data or trigger an automatic alert based on a threshold-level status change. However, there are a few things to keep in mind: 1) you need to ensure you are not building up hundreds of different data silos that leave you without a centralized, comprehensive view of the business, which is a huge step backwards from both a business and IT perspective; and 2) security and governance: do you want sensitive customer data sitting in thousands of edge sensors or gateways, significantly increasing your risk of a breach?
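The edge behaviour described in this note (send a summary, or alert on a threshold-level status change) can be sketched as two small functions. The function names, the choice of summary statistics, and the sample readings are assumptions for illustration.

```python
from statistics import mean


def summarize(readings):
    """Reduce a window of raw readings to a compact summary for upstream."""
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
    }


def threshold_alerts(readings, threshold):
    """Return (index, value) for each crossing from at-or-below to above.

    Alerting only on the status change avoids repeating the alert for
    every reading while the value stays above the threshold.
    """
    alerts = []
    above = False
    for index, value in enumerate(readings):
        now_above = value > threshold
        if now_above and not above:
            alerts.append((index, value))
        above = now_above
    return alerts


if __name__ == "__main__":
    window = [18.0, 18.5, 22.1, 22.4, 19.0, 23.0]
    print(summarize(window))
    print(threshold_alerts(window, threshold=21.0))
```

Here six readings become one summary record plus two alerts, which is the bandwidth saving the note refers to; the raw readings stay at the edge unless the strategy says otherwise.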