Azure Streaming Analytics: A comprehensive Guide.
Enriched Data Movement
Automation
Dashboarding
Enables customers to get real-time insights when time to action is critical
• Real-time Fraud Detection
• Streaming ETL
• Predictive Maintenance
• Call Center Analytics
• IT Infrastructure and Network Monitoring
• Customer Behavior Prediction
• Log Analytics
• Real-time Cross Sell Offers
• Fleet monitoring and Connected Cars
• Real-time Patient Monitoring
• Smart Grid
• Real-time Marketing
Programmer Productivity:
Data Manipulation: SELECT, FROM, WHERE, HAVING, GROUP BY, CASE WHEN THEN ELSE, INNER/LEFT OUTER JOIN, UNION, CROSS/OUTER APPLY, CAST, INTO, ORDER BY ASC, DESC
Scaling Extensions: WITH, PARTITION BY, OVER
Date and Time: DateName, DatePart, Day, Month, Year, DateDiff, DateTimeFromParts, DateAdd
Windowing Extensions: TumblingWindow, HoppingWindow, SlidingWindow
Aggregation: SUM, COUNT, AVG, MIN, MAX, STDEV, STDEVP, VAR, VARP, TopOne
String: Len, Concat, CharIndex, Substring, Lower, Upper, PatIndex
Temporal: Lag, IsFirst, Last, CollectTop
Mathematical: ABS, CEILING, EXP, FLOOR, POWER, SIGN, SQUARE, SQRT
Geospatial (preview): CreatePoint, CreatePolygon, CreateLineString, ST_DISTANCE, ST_WITHIN, ST_OVERLAPS, ST_INTERSECTS
Declarative SQL-like language to describe transformations
• Filters (“Where”)
• Projections (“Select”)
• Time-window and property-based aggregates (“Group By”)
• Time-shifted joins (specifying time bounds within which the joining events must occur)
• and all combinations thereof
1,915 lines of code with Apache Storm
// Excerpt only. StreamingApplication, DAG and populateDAG are
// DataTorrent / Apache Apex API types.
@ApplicationAnnotation(name="WordCountDemo")
public class Application implements StreamingApplication
{
  // Sample input file read by the word-count input operator.
  protected String fileName = "com/datatorrent/demos/wordcount/samplefile.txt";
  private Locality locality = null;

  @Override
  public void populateDAG(DAG dag, Configuration conf)
  {
    locality = Locality.CONTAINER_LOCAL;
    // Input operator: emits the words read from the file.
    WordCountInputOperator input = dag.addOperator("wordinput", new WordCountInputOperator());
    input.setFileName(fileName);
    // Counting operator: tracks occurrences of each unique word.
    UniqueCounter<String> wordCount = dag.addOperator("count", new UniqueCounter<String>());
    dag.addStream("wordinput-count", input.outputPort, wordCount.data).setLocality(locality);
    // Output operator: prints the counts to the console.
    ConsoleOutputOperator consoleOperator = dag.addOperator("console", new ConsoleOutputOperator());
    dag.addStream("count-console", wordCount.count, consoleOperator.input);
  }
}
3 lines of SQL in Azure Stream Analytics
SELECT Avg(Purchase), Score, Count(*)
FROM GameDataStream
GROUP BY TumblingWindow(minute, 5), Score
Intelligent Edge and Cloud
Serverless and low TCO
Easy to get started
Diagram: end-to-end streaming pipeline
• Event production: Applications; Devices & Gateways (Stream Analytics on IoT Edge)
• Event Queuing & Stream Ingestion: IoT Hub; Event Hubs; Blob Storage (streaming ingress and reference data)
• Stream Analytics: Azure Stream Analytics; Machine Learning
• Storage & Batch Analysis: Archiving for long term storage/batch analytics
• Presentation & Action: Real-time dashboard; Automation to kick-off workflows
Enterprise grade SLA
There are two distinct types of Inputs
• Data Streams:
  • IoT Hub
  • Event Hub
  • Azure Blob storage
• Reference data:
  • Azure Blob storage
Data Outputs Supported
• Azure Data Lake Store
• SQL Database
• Blob storage
• Event Hub
• Power BI
• Table Storage
• Service Bus Queues
• Service Bus Topics
• Azure Cosmos DB
• Azure Functions
Every event that flows through the system has a timestamp
ASA supports:
Arrival Time - Event timestamps based on arrival time (input adapter clock, e.g., Event Hubs)
SELECT * FROM EntryStream
App Time - Event timestamps based on a timestamp field in the actual event tuple
SELECT * FROM EntryStream TIMESTAMP BY EntryTime
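To make the difference between arrival time and application time concrete, here is a minimal Python sketch (an illustration, not ASA code). The field name EntryTime comes from the query above; the arrival_time field, the helper name event_timestamp and the sample values are assumptions made for this example.

```python
from datetime import datetime

def event_timestamp(event, timestamp_by=None):
    """Pick the timestamp that windowing would use for one event.

    timestamp_by=None stands in for arrival time; passing a field name
    stands in for TIMESTAMP BY <field> (application time).
    """
    if timestamp_by is None:
        # "arrival_time" is a hypothetical field standing in for the
        # input adapter clock.
        return event["arrival_time"]
    return event[timestamp_by]

# Hypothetical event: recorded at the toll booth 4 seconds before it arrived.
event = {
    "TollId": 1,
    "EntryTime": datetime(2018, 9, 24, 12, 0, 8),
    "arrival_time": datetime(2018, 9, 24, 12, 0, 12),
}

print(event_timestamp(event, timestamp_by="EntryTime"))  # application time
print(event_timestamp(event))                            # arrival time
```

With 10-second windows, the two choices would place this sample event in different windows, which is why the choice of timestamp matters for every windowed aggregate.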
Output at the end of each window
Windows are fixed length
Used in a GROUP BY clause
Figure: events with values 1, 5, 4, 2, 6, 8, 6, 4 occur at times t1 to t6 along a time axis divided into Window 1, Window 2 and Window 3; an aggregate function (Sum) emits the output events 18 and 14.
SELECT TollId, Count(*)
FROM EntryStream TIMESTAMP BY EntryTime
GROUP BY TollId, TumblingWindow(second, 10)
Figure: A 10-second Tumbling Window (time axis from 0 to 30 s, marked every 5 s)
Every 10 seconds give me the count of vehicles entering each toll booth over the last 10 seconds
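The tumbling-window count can be mimicked offline with a short Python sketch. This is an illustration under stated assumptions, not how ASA is implemented: the function name tumbling_counts and the sample events are hypothetical, timestamps are plain seconds, and each window is assumed to cover (end - size, end], so an event at exactly t = 10 counts toward the window ending at 10.

```python
from collections import Counter
import math

def tumbling_counts(events, size):
    """Count events per (key, window_end) for fixed, non-overlapping windows.

    events: iterable of (key, timestamp_seconds).
    Assumption: a window covers (end - size, end].
    """
    counts = Counter()
    for key, ts in events:
        # Round the timestamp up to the next multiple of the window size.
        window_end = math.ceil(ts / size) * size
        counts[(key, window_end)] += 1
    return dict(counts)

# Hypothetical (TollId, EntryTime in seconds) pairs.
events = [(1, 2), (1, 7), (2, 9), (1, 12), (2, 18), (2, 20)]
print(tumbling_counts(events, size=10))
```

Each event lands in exactly one window, which is the defining property of a tumbling window.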
Every 5 seconds give me the count of vehicles entering each toll booth over the last 10 seconds
Figure: A 10-second Hopping Window with a 5-second hop (time axis from 0 to 30 s, marked every 5 s)
SELECT TollId, Count(*)
FROM EntryStream TIMESTAMP BY EntryTime
GROUP BY TollId, HoppingWindow(second, 10, 5)
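A hopping window differs from a tumbling window in that windows overlap, so one event can be counted more than once. The Python sketch below illustrates this under the same assumptions as before (hypothetical function name hopping_counts, timestamps in seconds, windows covering (end - size, end]); it is not ASA's implementation.

```python
from collections import Counter
import math

def hopping_counts(events, size, hop):
    """Count events per (key, window_end) for overlapping windows.

    A new window ends every `hop` seconds and covers the previous
    `size` seconds. events: iterable of (key, timestamp_seconds).
    """
    counts = Counter()
    for key, ts in events:
        # First window end at or after the event that is a multiple of hop.
        end = math.ceil(ts / hop) * hop
        # The event stays in later windows until the window start passes it.
        while end - size < ts:
            counts[(key, end)] += 1
            end += hop
    return dict(counts)

# Hypothetical (TollId, EntryTime in seconds) pairs.
events = [(1, 2), (1, 7), (1, 12)]
print(hopping_counts(events, size=10, hop=5))
```

With a 10-second window and a 5-second hop, every event contributes to two consecutive windows.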
SELECT TollId, Count(*)
FROM EntryStream TIMESTAMP BY EntryTime
GROUP BY TollId, SlidingWindow(second, 20)
HAVING Count(*) > 10
Figure: A 20-second Sliding Window (time axis from 0 to 50 s, marked every 10 s; events 1 and 5 are shown entering and exiting the window)
Find all toll booths that have served more than 10 vehicles in the last 20 seconds
An output is generated whenever an event enters or leaves the window
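The sliding-window query with a HAVING filter can be approximated in Python as follows. This is a simplified sketch, not ASA's implementation: the function name sliding_alerts is hypothetical, timestamps are seconds, events must arrive sorted by time, and the window is only evaluated when an event enters it, which is enough here because a count can only rise above the threshold at that moment.

```python
from collections import defaultdict, deque

def sliding_alerts(events, size, threshold):
    """Return (key, timestamp, count) whenever a key's count over the
    last `size` seconds exceeds `threshold`.

    events: iterable of (key, timestamp_seconds), sorted by timestamp.
    """
    recent = defaultdict(deque)  # per-key timestamps still inside the window
    alerts = []
    for key, ts in events:
        window = recent[key]
        window.append(ts)
        # Drop events that have left the window (ts - size, ts].
        while window[0] <= ts - size:
            window.popleft()
        if len(window) > threshold:
            alerts.append((key, ts, len(window)))
    return alerts

# Hypothetical data with a small threshold so the effect is visible.
events = [(1, 0), (1, 5), (1, 8), (1, 30), (2, 31)]
print(sliding_alerts(events, size=20, threshold=2))
```

The query on the slide corresponds to size=20 and threshold=10; the smaller threshold is used only to keep the sample data short.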
Perform real-time scoring on streaming data
Anomaly Detection and Sentiment Analysis are common use cases
Function calls from the query
Azure ML can publish web endpoints for operationalized ML models
Azure Stream Analytics binds custom function names to such web endpoints
SELECT text, sentiment(text) AS score
FROM myStream
in public preview
in private preview
It is recommended to have at least 50 events in each window for best results.
Diagram labels: IoT Hub, Stream Analytics
Scenarios where you might find JavaScript user-defined functions useful:
• Parsing and manipulating strings with regular expression functions, for example, Regexp_Replace() and Regexp_Extract()
• Decoding and encoding data, for example, binary-to-hex conversion
• Performing mathematical computations with JavaScript Math functions
• Performing array operations like sort, join, find, and fill
Here are some things that you cannot do with a JavaScript user-defined function in Stream Analytics:
• Call out external REST endpoints, for example, performing reverse IP lookup or pulling reference data from an external source
• Perform custom event format serialization or deserialization on inputs/outputs
• Create custom aggregates
Provided by Rob Klause
The ASA query
Diagram: IoT Edge device at the Factory / Customer Site running the modules tempSensor, sqlFunction, sql, Alert and AvgtoCloud; cloud side: IoT Hub, Blob Storage
{
  "routes": {
    "TelemetryTosqlFunction": "FROM /messages/modules/tempSensor/outputs/temperatureOutput INTO BrokeredEndpoint(\"/modules/sqlFunction/inputs/input1\")",
    "TelemetryToAsa": "FROM /messages/modules/tempSensor/* INTO BrokeredEndpoint(\"/modules/Alert/inputs/temperature\")",
    "AlertsToReset": "FROM /messages/modules/Alert/* INTO BrokeredEndpoint(\"/modules/tempSensor/inputs/control\")",
    "AlertsToCloud": "FROM /messages/modules/Alert/* INTO $upstream",
    "TelemetryToAsaAvg": "FROM /messages/modules/tempSensor/* INTO BrokeredEndpoint(\"/modules/AvgtoCloud/inputs/EdgeStream\")",
    "AvgToCloud": "FROM /messages/modules/AvgtoCloud/* INTO $upstream"
  }
}
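The routes above describe a small message graph between modules. The following Python sketch extracts that graph from the route strings. It is an illustration only: the helper name parse_route and both regular expressions are assumptions written for this example, and they cover just the route shapes shown on the slide (for instance, routes with a WHERE clause are rejected).

```python
import re

# Matches "FROM <source> INTO <sink>" as used in the routes above.
ROUTE_PATTERN = re.compile(r'^FROM\s+(?P<source>\S+)\s+INTO\s+(?P<sink>.+)$')
# Pulls the module name out of a path such as /modules/Alert/inputs/...
MODULE_PATTERN = re.compile(r'/modules/(?P<module>[^/]+)/')

def parse_route(route):
    """Return (source_module, sink) for one route string."""
    match = ROUTE_PATTERN.match(route)
    if match is None:
        raise ValueError(f"unrecognized route: {route!r}")
    source = MODULE_PATTERN.search(match.group("source"))
    source_module = source.group("module") if source else "*"
    sink_text = match.group("sink")
    if sink_text == "$upstream":
        return source_module, "$upstream"  # message goes to the cloud
    sink = MODULE_PATTERN.search(sink_text)
    if sink is None:
        raise ValueError(f"unrecognized sink: {sink_text!r}")
    return source_module, sink.group("module")

routes = {
    "TelemetryToAsa": 'FROM /messages/modules/tempSensor/* INTO BrokeredEndpoint("/modules/Alert/inputs/temperature")',
    "AlertsToReset": 'FROM /messages/modules/Alert/* INTO BrokeredEndpoint("/modules/tempSensor/inputs/control")',
    "AlertsToCloud": "FROM /messages/modules/Alert/* INTO $upstream",
}
for name, route in routes.items():
    print(name, parse_route(route))
```

Read this way, tempSensor feeds the Alert module, alerts are sent both back to tempSensor (as a control message) and upstream to the cloud.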
Diagram labels: Stream Analytics, Data Factory, Blob Storage, Event Hub, Time Series Insights
Feature | Status | Remarks
SQL Parallel Write | GA | WW rollout in 1 week
Blob O/P partitioning by custom date-time | Public preview | WW rollout in 1 week
C# UDF on IoT Edge | Public preview | Available now
Live testing in Visual Studio | Public preview | Available now
User defined custom repartition count | Public preview | Available now
New built-in ML models for A/D – Edge and Cloud | Private preview | Access granted upon sign-up
Custom de-serializers on IoT Edge | Private preview | Access granted upon sign-up
MSI Authentication for egress to ADLS Gen1 | Private preview | Access granted upon sign-up
Supports inline learning and real-time scoring
Easily invoked with simple function calls within query language
Types of Anomalies Detected:
Spikes
Dips
Slow positive trend
Slow negative trend
Bi-Level change
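The deck does not describe how the built-in models work, so the following Python sketch is not ASA's algorithm. It only illustrates the general idea of learning from recent history and scoring each new value, for the spike and dip cases, using a rolling z-score. The function name spikes_and_dips, the 50-event history and the threshold of 3 standard deviations are all assumptions for this example.

```python
from collections import deque
from statistics import mean, pstdev

def spikes_and_dips(values, history_size=50, z_threshold=3.0):
    """Flag values far from the mean of the preceding `history_size` values.

    Returns a list of (index, value, "spike" | "dip").
    """
    history = deque(maxlen=history_size)
    flagged = []
    for index, value in enumerate(values):
        # Score only once enough history has been collected.
        if len(history) == history_size:
            mu = mean(history)
            sigma = pstdev(history)
            if sigma > 0 and abs(value - mu) / sigma > z_threshold:
                flagged.append((index, value, "spike" if value > mu else "dip"))
        # Learn from every value, including flagged ones.
        history.append(value)
    return flagged

# Hypothetical readings: a steady signal followed by one outlier.
readings = [10, 11] * 25 + [100]
print(spikes_and_dips(readings))
```

Slow trends and bi-level changes need a different kind of model and are not covered by this sketch.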
• Faster iterative testing
• Show results in real time
• View Job metrics
• Time policies support
Public Preview
Partition egress to Blob storage by
1) Any input field
2) Custom date and time formats
Gain more fine-grained control over data written to Blob storage for dashboarding and reporting
Better alignment with Hive conventions for blob output to be consumed by HDInsight and Azure Databricks.
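As an illustration of what partitioning by an input field plus a custom date format can look like, the Python sketch below builds a Hive-style key=value folder path for one event. The layout, the function name blob_path and the field names are assumptions for this example; the sketch does not reproduce the path pattern syntax that ASA itself accepts.

```python
from datetime import datetime

def blob_path(event, partition_field, timestamp_field):
    """Build a Hive-style folder path (key=value segments) for one event."""
    ts = event[timestamp_field]
    return (
        f"{partition_field}={event[partition_field]}/"
        f"year={ts:%Y}/month={ts:%m}/day={ts:%d}"
    )

# Hypothetical event.
event = {"DeviceID": "dev42", "EntryTime": datetime(2018, 9, 24, 12, 0)}
print(blob_path(event, "DeviceID", "EntryTime"))
```

Folder names in this key=value form are what Hive-compatible engines use to discover partitions, which is the alignment the slide refers to.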
To achieve fully parallel topologies, ASA will transition SQL ‘writes’ from serial to parallel operations for SQL DB and SQL Data Warehouse
4x-5x improvement in write throughput
Allows for batch size customization to achieve higher throughput
For example, this feature enabled a customer building a connected car scenario to scale up from 150K events/min to 500K events/min
MSI-based authentication will enable egress to Azure Data Lake Storage.
Key benefits over existing AAD (Azure Active Directory) based authentication:
• Job deployment automation (through PowerShell etc.)
• Long-running production jobs
• Consistency with other services
Enables better performance tuning
Key Scenarios
• When upstream partition count can’t be changed
• Partitioned processing is needed to scale out to larger processing load
• Fixed number of output partitions
SELECT *
INTO [output]
FROM [input]
PARTITION BY DeviceID INTO 10
Public Preview
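Conceptually, repartitioning by DeviceID into 10 partitions means every event with the same DeviceID is routed to the same one of 10 buckets, regardless of how many partitions the input had. The Python sketch below illustrates that idea with a hash-and-modulo scheme. How ASA actually assigns keys to partitions is not stated in the deck, so the scheme, the function names partition_for and repartition, and the sample events are assumptions.

```python
import zlib

def partition_for(key, partition_count=10):
    """Map a key to a partition index in [0, partition_count)."""
    # crc32 is stable across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(str(key).encode("utf-8")) % partition_count

def repartition(events, key_field, partition_count=10):
    """Group events into `partition_count` lists by hashing `key_field`."""
    partitions = [[] for _ in range(partition_count)]
    for event in events:
        index = partition_for(event[key_field], partition_count)
        partitions[index].append(event)
    return partitions

# Hypothetical events from 25 devices.
events = [{"DeviceID": f"dev{i}", "value": i} for i in range(25)]
for index, bucket in enumerate(repartition(events, "DeviceID")):
    print(index, [e["DeviceID"] for e in bucket])
```

Because the partition count is chosen in the query rather than inherited from the input, it can be tuned to the processing load or to a fixed number of output partitions.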
Thank You!
© 2018 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to
changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the
date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
