DATA SATURDAY #10
Sofia, Oct 09th
Deep Dive Time Series Anomaly
Detection with different Azure
data services
Marco Parenzan
Marco Parenzan
• Senior Solution Architect @ beanTech
• 1nn0va Community Lead (Pordenone)
• Microsoft Azure MVP
• Profiles
o LinkedIn: https://www.linkedin.com/in/marcoparenzan/
o Slideshare: https://www.slideshare.net/marco.parenzan
o GitHub: https://github.com/marcoparenzan
This is the journey of…
• …a .NET developer…
• …or an IoT developer…
• …a one-man band (sometimes)…
• …facing typical data science world topics…
• …that wants to use .NET everywhere!
Thanks to our Sponsors
A typical scenario
Scenario
• In an industrial fridge, you monitor temperatures not to check the
temperature «per se», but to check the health of the plant
From real industrial fridges
Without any specific request...
what is IoT all about?
Efficiency, Anomalies
Batch, Streaming
[Diagram: Devices, Events, IoT Hub (Ingest), Storage Account]
The batch point of view...
How can we implement
processing?
[Diagram: Ingest and Process stages, with Devices, Events, Azure IoT Hub-related services, a "?" placeholder, and a Storage Account]
We explore some of them,
probably the most Microsoft and Azure oriented
But….
I’m not a data scientist!
Or a BI Analyst!
I’m a .NET Developer!
Make me think and work as a Data
Scientist!
A (very little) bit of theory
for Anomaly Detection
in Time Series
Threshold anomalies?
• Threshold alarms are not enough
o An anomaly is not just «over a threshold for
some time»...
o A condenser or evaporator that has difficulty starting
o Opening a door is also an anomaly, and must be
told apart from a fault
o Peaks that occur too many times are another case:
you need to count them
• You can consider each of these
events as an anomaly that alters the
temperature you measure in
different parts of the fridge
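The two threshold-style checks named above («over a threshold for some time» and counting peaks) can be sketched as follows. This is a minimal illustration, not code from the talk: the function names, the 7.0 °C threshold, and the sample readings are all assumptions.

```python
from typing import List, Tuple

def sustained_over_threshold(values: List[float], threshold: float,
                             min_run: int) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs (end exclusive) of runs that stay
    above the threshold for at least min_run consecutive samples."""
    runs = []
    start = None
    for i, v in enumerate(values):
        if v > threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i))
            start = None
    # Close a run that is still open at the end of the series.
    if start is not None and len(values) - start >= min_run:
        runs.append((start, len(values)))
    return runs

def count_peaks(values: List[float], threshold: float) -> int:
    """Count upward crossings of the threshold (each excursion counts once)."""
    count = 0
    above = False
    for v in values:
        if v > threshold and not above:
            count += 1
        above = v > threshold
    return count

# Hypothetical fridge temperatures in °C, one sample per minute.
temps = [4.0, 4.1, 7.5, 4.0, 4.2, 7.8, 8.1, 8.3, 8.0, 4.1, 7.6, 4.0]
alarms = sustained_over_threshold(temps, threshold=7.0, min_run=3)
peaks = count_peaks(temps, threshold=7.0)
print(alarms, peaks)
```

The short excursions (a door opening, for example) never trigger the sustained alarm, yet they still show up in the peak count, which is why a single threshold rule cannot separate the cases listed above.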
Anomaly Detection
• Anomaly detection is the process of identifying unexpected items or events in
data sets, which differ from the norm.
• Anomaly detection is often applied to unlabeled data; this is known as
unsupervised anomaly detection.
• Anomaly is not just a matter of time and scalar values. It can also be a matter
of visual anomalies!
Time Series
o Trend
• A general direction in which something is
developing or changing. A trend can be
upward (uptrend) or downward (downtrend).
The increase or decrease does not have to be
consistently in the same direction
in a given period.
o Seasonality
• Predictable pattern that recurs or repeats
over regular intervals. Seasonality is often
observed within a year or less.
o Irregular fluctuation
• These are variations that occur due to
sudden causes and are unpredictable. For
example the rise in prices of food due to war,
flood, earthquakes, farmers striking etc.
• Definition
o A time series is a sequence of data points recorded
in time order, often taken at successive equally
spaced points in time.
• Examples
o Stock prices, Sales demand, website traffic, daily
temperatures, quarterly sales
• Time series is different from
regression analysis because of its
time-dependent nature.
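To make the three components concrete, here is a minimal sketch of a naive additive decomposition (value = trend + seasonality + irregular part). It is an illustration under stated assumptions, not the method used in the talk: the trend is a simple centered moving average, the seasonal profile is the per-phase mean of the detrended values, and the synthetic series and the names `moving_average` and `decompose` are hypothetical.

```python
from statistics import mean

def moving_average(values, window):
    """Centered moving average; near the edges it uses the samples available."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(mean(values[lo:hi]))
    return out

def decompose(values, period):
    """Split a series into trend, seasonal, and irregular (residual) parts."""
    trend = moving_average(values, period)
    detrended = [v - t for v, t in zip(values, trend)]
    # Average all samples that share the same position within the period.
    profile = [mean(detrended[p::period]) for p in range(period)]
    seasonal = [profile[i % period] for i in range(len(values))]
    residual = [d - s for d, s in zip(detrended, seasonal)]
    return trend, seasonal, residual

# Synthetic series: slow upward trend + repeating pattern + one sudden jump.
period = 4
pattern = [1.0, -1.0, 0.0, 0.0]
values = [0.1 * i + pattern[i % period] for i in range(24)]
values[10] += 5.0  # the "irregular fluctuation"

trend, seasonal, residual = decompose(values, period)
most_irregular = max(range(len(values)), key=lambda i: abs(residual[i]))
print(most_irregular)
```

Once trend and seasonality are removed, the sudden jump dominates the residual, which is the part an anomaly detector ultimately looks at.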
Anomaly Detection in Time Series
• In time series data, an anomaly or outlier is a data point that does not
follow the common collective trend or the seasonal or cyclic pattern of the
data, and is significantly distinct from the rest of the data. By significant,
most data scientists mean statistically significant: in other words, the
statistical properties of the data point are not in alignment with the rest
of the series.
• Anomaly detection has two basic assumptions:
o Anomalies only occur very rarely in the data.
o Their features differ from the normal instances significantly.
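One simple way to express "statistical properties not in alignment with the rest of the series" is a rolling z-score: compare each point with the mean and standard deviation of the points just before it. This is a minimal sketch of that idea, not the algorithm used later in the talk; the window size, the threshold of 3, and the sample readings are assumptions.

```python
from statistics import mean, stdev

def rolling_zscore_anomalies(values, window, z_threshold=3.0):
    """Return indices whose value deviates from the preceding `window`
    samples by more than z_threshold standard deviations."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            continue  # flat history: z-score undefined, skipped in this sketch
        z = (values[i] - mu) / sigma
        if abs(z) > z_threshold:
            anomalies.append(i)
    return anomalies

# Hypothetical fridge temperatures in °C with one spike at index 8.
temps = [4.0, 4.1, 3.9, 4.0, 4.2, 4.1, 3.9, 4.0, 9.0, 4.1, 4.0, 3.9]
flagged = rolling_zscore_anomalies(temps, window=6)
print(flagged)
```

The sketch leans on both assumptions above: the spike is rare, so it barely affects the history of earlier points, and it differs strongly from normal readings. Note that after the spike enters the window it inflates the standard deviation, which is one reason more robust methods are used in practice.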
Classic ML Process
Helping non-data-scientist developers (all!)
• Unsupervised Machine
Learning: no labelling
• Automated Training Set for
Anomaly Detection Algorithms
• the algorithm automatically
generates a simulated training set
based on your input data
• Auto(mated) ML: finds the best
tuning for you, across parameters
and algorithms
https://docs.microsoft.com/en-us/azure/machine-learning/algorithm-cheat-sheet
Spectral Residual CNN (SrCnn)
• To monitor the time-series continuously and alert for potential incidents on time
• The algorithm first computes the Fourier Transform of the original data. Then it computes
the spectral residual of the log amplitude of the transformed signal before applying the
Inverse Fourier Transform to map the sequence back from the frequency to the time domain.
This sequence is called the saliency map. The anomaly score is then computed as the relative
difference between the saliency map values and their moving averages. If the score is above
a threshold, the value at a specific timestep is flagged as an outlier.
• The SR algorithm has several parameters. To obtain a model with good performance, tune
windowSize and threshold first; these are the most important parameters for SR. Then search
for an appropriate judgementWindowSize, which must be no larger than windowSize. For the
remaining parameters, you can use the default values directly.
• Time-Series Anomaly Detection Service at Microsoft [https://arxiv.org/pdf/1906.03821.pdf]
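The Spectral Residual steps described above (Fourier Transform, residual of the log amplitude, inverse transform to a saliency map, score relative to a moving average) can be sketched in plain Python. This is a simplified illustration, not ML.NET's implementation: it covers only the SR part (no CNN), uses a naive O(n²) DFT, and the names `avg_filter` and `score_window`, the threshold of 3.0, and the synthetic series are assumptions.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive discrete Fourier transform (fine for short series)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out

def trailing_mean(values, window):
    """Mean of the current value and up to window-1 preceding values."""
    out, acc = [], 0.0
    for i, v in enumerate(values):
        acc += v
        if i >= window:
            acc -= values[i - window]
        out.append(acc / min(i + 1, window))
    return out

def spectral_residual_scores(series, avg_filter=3, score_window=21):
    eps = 1e-8
    spectrum = dft(series)
    amplitude = [abs(c) for c in spectrum]
    log_amp = [math.log(a + eps) for a in amplitude]
    # Spectral residual: log amplitude minus its local average.
    residual = [l - m for l, m in zip(log_amp, trailing_mean(log_amp, avg_filter))]
    # Keep the original phase, replace the amplitude with exp(residual).
    adjusted = [c / (a + eps) * math.exp(r)
                for c, a, r in zip(spectrum, amplitude, residual)]
    saliency = [abs(c) for c in dft(adjusted, inverse=True)]
    # Score: relative difference between saliency and its moving average.
    local = trailing_mean(saliency, score_window)
    return [(s - m) / (m + eps) for s, m in zip(saliency, local)]

# Synthetic signal: four full 24-sample cycles plus one spike at index 60.
series = [math.sin(2 * math.pi * t / 24) for t in range(96)]
series[60] += 5.0

scores = spectral_residual_scores(series)
flagged = [i for i, s in enumerate(scores) if s > 3.0]
top = scores.index(max(scores))
print(top, flagged)
```

The regular cycle concentrates in a few frequency bins and is mostly flattened by the residual step, so the spike stands out in the saliency map. Real series rarely contain a whole number of cycles, so edge effects need extra care; the paper handles this by extending the series with estimated points.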
Data Science for the .NET developer
Data Science and AI for the .NET developer
• ML.NET is first and foremost a framework that you can use to
create your own custom ML models. This custom approach
contrasts with “pre-built AI,” where you use pre-designed general
AI services from the cloud (like many of the offerings from Azure
Cognitive Services). This can work great for many scenarios, but
it might not always fit your specific business needs due to the
nature of the machine learning problem or to the deployment
context (cloud vs. on-premises).
• ML.NET enables developers to use their existing .NET skills to
easily integrate machine learning into almost any .NET
application. This means that if C# (or F# or VB) is your
programming language of choice, you no longer have to learn a
new programming language, like Python or R, in order to
develop your own ML models and infuse custom machine
learning into your .NET apps.
ML.NET Components
Anomaly Detection
Some tools required
• .NET 5 + WPF + ML.NET
• Mandatory: the platform where we run the experiments
• XPlot.Plotly (soon you will understand why I use this) https://fslab.org/XPlot/
• XPlot is a cross-platform data visualization package for the F# programming language
powered by popular JavaScript charting libraries Plotly and Google Charts. The library
provides a complete mapping for the configuration options of the underlying libraries and so
you get a nice F# interface that gives you access to the full power of Plotly and Google
Charts. The XPlot library can be used interactively from F# Interactive, but charts can equally
easily be embedded in F# applications and in HTML reports.
• WebView2 https://docs.microsoft.com/en-us/microsoft-edge/webview2/gettingstarted/wpf
• The Microsoft Edge WebView2 control enables you to embed web technologies (HTML, CSS,
and JavaScript) in your native apps. The WebView2 control uses Microsoft Edge (Chromium)
as the rendering engine to display the web content in native apps. With WebView2, you may
embed web code in different parts of your native app, or build all of the native app within a
single WebView2 instance.
Experimenting with .NET and WPF
How Data Scientists work
Batch vs. Notebooks
• Batch
o Work on slow data stored in a data lake
o Submit a complete app in one single deploy
o Receive the entire output
• Notebook
o «sketching» the code
o Write/delete/rewrite continuously
o Run cell by cell (but also all at once), interactively
• In a world of Mathematica
Jupyter
• Evolution and generalization of the seminal role of Mathematica
• In web standards way
o Web (HTTP+Markdown)
o Python adoption (ipynb)
• Written mostly in Python, with a JavaScript front end
• Languages plug in as kernels: Python (IPython) is one kernel
for Jupyter
Python!
• Simple to start (that's why C# is pythonizing…)
• “Open Source”
• TensorFlow, Scikit-learn, Keras, Pandas, PyTorch
• Remember one thing:
o Often behind a Data Science framework there is a native library and Python binds that library
o Spark is written in Scala and runs on the JVM, and there is a bridge for Python to Spark (PySpark)
o Jupyter talks to each language through a kernel, and Python is one of those kernels
Spark Unifies:
 Batch Processing
 Interactive SQL
 Real-time processing
 Machine Learning
 Deep Learning
 Graph Processing
A unified, open source, parallel data processing framework for Big Data analytics
[Diagram: Apache Spark stack. Spark Core Engine (on YARN) with Spark SQL (batch processing), Spark Structured Streaming / Spark Streaming (stream processing), Spark MLlib (machine learning), and GraphX (graph computation)]
http://spark.apache.org
.NET Interactive and Jupyter
and Visual Studio Code
• .NET Interactive gives C# and F# kernels to Jupyter
• .NET Interactive gives all tools to create your hosting application
independently from Jupyter
• In Visual Studio Code, you have two different notebooks (looking similar but
developed in parallel by different teams)
o .NET Interactive Notebook (by the .NET Interactive Team) that can also run Python
o Jupyter Notebook (by the Azure Data Studio Team – probably) that can also run C# and F#
• There is a little confusion on that
• .NET Interactive has a strong C#/F# Kernel...
o ...but a less mature infrastructure (compared to Jupyter)
.NET for Apache Spark 1.1.1
• .NET bindings (C# and F#) to Spark
o Written on the Spark interop layer, designed to
provide high performance bindings to multiple
languages
• Re-use knowledge, skills, code you
have as a .NET developer
o Compliant with .NET Standard
• You can use .NET for Apache
Spark anywhere you write .NET
code
• Original project: Mobius
o https://github.com/microsoft/Mobius
Experimenting .NET Interactive
The Azure toolbox
.NET (5) hosting in Azure
[Diagram: paths from existing on-premises .NET web apps to Azure. Cloud Infrastructure-Ready: monolithic / N-tier architectures on IaaS (Infrastructure as a Service), with VMs and a relational database. Cloud-Optimized: monolithic / N-tier architectures on PaaS, with managed services and Windows Containers. Cloud-Native: microservices and serverless architectures on PaaS for containerized microservices, plus serverless computing and managed services.]
Functions everywhere
[Diagram: Azure Functions layers (Code, App delivery, Platform, OS) across Azure, App Service on Azure Stack, on-premises Windows, and non-Azure hosts. Building blocks: Azure Functions host runtime, Azure Functions Core Tools, and the Azure Functions base, .NET, and Node Docker images.]
Logic Apps
o Visually design workflows in the cloud
o Express logic through powerful control flow
o Connect disparate functions and APIs
o Utilize declarative definition to work with CI/CD
Experimenting with
Azure Functions
Anomaly Detection
As A Service in Azure
Azure Cognitive Services
• Cognitive Services brings AI within reach of every developer, without
requiring machine-learning expertise. All it takes is an API call to embed the
ability to see, hear, speak, search, understand, and accelerate
decision-making in your apps. It enables developers of all skill levels to
easily add AI capabilities to their apps.
• Five areas:
• Decision
• Language
• Speech
• Vision
• Web search
Decision services:
• Anomaly Detector: identify potential problems early on.
• Content Moderator: detect potentially offensive or unwanted content.
• Metrics Advisor (preview): monitor metrics and diagnose issues.
• Personalizer: create rich, personalized experiences for every user.
Anomaly Detector
• Through an API, Anomaly Detector ingests time-series data of all types and
selects the best-fitting detection model for your data to ensure high accuracy.
Customize the service to detect any level of anomaly and deploy it where you
need it most, from the cloud to the intelligent edge with containers. Azure
is the only major cloud provider that offers anomaly detection as an AI
service.
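As a rough idea of what "through an API" means here, the sketch below builds the JSON body for a batch detection call. It is based on the v1.0 REST API as I understand it (a `series` of `timestamp`/`value` points plus `granularity` and `sensitivity`, posted to `/anomalydetector/v1.0/timeseries/entire/detect` with an `Ocp-Apim-Subscription-Key` header); check the current service documentation before relying on it. The helper name, the sample values, and the resource name in the comment are hypothetical, and no network call is made.

```python
import json
from datetime import datetime, timedelta, timezone

# Step between points for each granularity used in this sketch.
STEPS = {
    "minutely": timedelta(minutes=1),
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
}

def build_detect_request(values, start, granularity="hourly", sensitivity=95):
    """Build the request body for a batch ("entire series") detection call."""
    step = STEPS[granularity]
    series = [
        {
            "timestamp": (start + i * step).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "value": v,
        }
        for i, v in enumerate(values)
    ]
    return {"series": series, "granularity": granularity, "sensitivity": sensitivity}

# Hypothetical hourly fridge temperatures in °C (the service expects at least 12 points).
temps = [4.0, 4.1, 3.9, 4.0, 4.2, 4.1, 3.9, 4.0, 9.0, 4.1, 4.0, 3.9]
body = build_detect_request(temps, datetime(2021, 10, 9, tzinfo=timezone.utc))
payload = json.dumps(body)

# The payload would then be POSTed to (resource name is a placeholder):
#   https://<your-resource>.cognitiveservices.azure.com/anomalydetector/v1.0/timeseries/entire/detect
# with headers "Ocp-Apim-Subscription-Key: <key>" and "Content-Type: application/json".
print(payload[:80])
```

The response is expected to carry one anomaly flag per input point, so the calling code only needs to line the flags up with the timestamps it sent.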
Experimenting
with Anomaly Detector
Anomaly Detection
with Big Data
Fully managed big data analytics service
• Fully managed for efficiency
o Focus on insights, not the infrastructure, for fast time to value
o No infrastructure to manage; provision the service, choose the SKU for your
workload, and create a database.
• Optimized for streaming data
o Get near-instant insights from fast-flowing data
o Scales linearly up to 200 MB per second per node with highly performant,
low-latency ingestion.
• Designed for data exploration
o Run ad-hoc queries using the intuitive query language
o Returns results from 1 billion records in under 1 second without modifying
the data or metadata
Multi-temperature data processing paths
• Hot
o seconds freshness, days retention
o in-memory aggregated data
o pre-defined standing queries
o split-second query performance
o data viewing
o technologies: in-memory cube, stream analytics, …
• Warm
o minutes freshness, months retention
o raw data
o ad-hoc queries
o seconds-to-minutes query performance
o data exploration
o technologies: column store, indexing, …
• Cold
o hours freshness, years retention
o raw data
o programmatic batch processing
o minutes-to-hours query performance
o data manipulation
o technologies: distributed file system, map reduce, …
Experimenting with
Azure Data Explorer
Upcoming Events
jsTalks (Bulgaria), 2021
Nov 19-20
http://jstalks.net/
Azure MVP Unplugged, 5th Edition
October 21
Global AI Back Together Bulgaria 2021
October 19

Deep Dive Time Series Anomaly Detection in Azure with dotnet

  • 1. DATA SATURDAY #10 Sofia, Oct 09th Deep Dive Time Series Anomaly Detection with different Azure data services Marco Parenzan
  • 2. Marco Parenzan • Senion Solution Architect @ beanTech • 1nn0va Community Lead (Pordenone) • Microsoft Azure MVP • Profiles o Linkedin: https://www.linkedin.com/in/marcoparenzan/ o Slideshare: https://www.slideshare.net/marco.parenzan o GitHub: https://github.com/marcoparenzan
  • 3. This is the journey of… • …a .NET developer… • …or an IoT developer… • …a one-man band (sometimes )… • …facing typical data science world topics… • …that wants to use .NET everywhere!
  • 4. Thanks to our Sponsors
  • 6. Scenario • In an industrial fridge, you monitor temperatures to check not the temperature «per se», but to check the healthy of the plant From real industrial fridges 
  • 7. With no any specific request... what is IoT all about? Efficiency Anomalies Batch Streaming
  • 9. How can we implement processing? Ingest Process Storage Account Azure IoT Hub-Related Services Devices Events ? We explore some of them, probably the most Microsoft and Azure oriented
  • 11. I’m not a data scientist! Or a BI Analyst!
  • 12. I’m a .NET Developer!
  • 13. Make me think and work as a Data Scientist!
  • 14. A (very little) bit of theory for Anomaly Detection in Time Series
  • 15. Threshold anomalies? • Threshold alarms are not enough o Anomalies cannot be just «over a threshold for some time»... o Condenser or Evaporator with difficulties starting o Distinguish from Opening a door (that is also an anomaly) o Or also counting the number of times that there are peaks (too many times) • You can considering each of these events as anomalies that alter the temperature you measure in different part of the fridge
  • 16. Anomaly Detection • Anomaly detection is the process of identifying unexpected items or events in data sets, which differ from the norm. • And anomaly detection is often applied on unlabeled data which is known as unsupervised anomaly detection. • Anomaly is not just a matter of time and scalar values. It can also be a matter of visual anomalies!
  • 17. Time Series o Trend • is a general direction in which something is developing or changing. A trend can be upward(uptrend) or downward(downtrend). It is not always necessary that the increase or decrease is consistently in the same direction in a given period. o Seasonality • Predictable pattern that recurs or repeats over regular intervals. Seasonality is often observed within a year or less. o Irregular fluctuation • These are variations that occur due to sudden causes and are unpredictable. For example the rise in prices of food due to war, flood, earthquakes, farmers striking etc. • Definition o Time series is a sequence of data points recorded in time order, often taken at successive equally paced points in time. • Examples o Stock prices, Sales demand, website traffic, daily temperatures, quarterly sales • Time series is different from regression analysis because of its time-dependent nature.
  • 18. Anomaly Detection in Time Series • In time series data, an anomaly or outlier can be termed as a data point which is not following the common collective trend or seasonal or cyclic pattern of the entire data and is significantly distinct from rest of the data. By significant, most data scientists mean statistical significance, which in order words, signify that the statistical properties of the data point is not in alignment with the rest of the series. • Anomaly detection has two basic assumptions: o Anomalies only occur very rarely in the data. o Their features differ from the normal instances significantly.
  • 20. Helping no-data scientits developers (all! ) • Unsupervised Machine LearningNo labelling • Automated Training Set for Anomaly Detection Algorithms • the algorithms automatically generates a simulated training set based non your input data • Auto(mated) MLfind the best tuning for you with parameters and algorithms https://docs.microsoft.com/en-us/azure/machine-learning/algorithm-cheat-sheet
  • 21. Spectrum Residual Cnn (SrCnn) • To monitor the time-series continuously and alert for potential incidents on time • The algorithm first computes the Fourier Transform of the original data. Then it computes the spectral residual of the log amplitude of the transformed signal before applying the Inverse Fourier Transform to map the sequence back from the frequency to the time domain. This sequence is called the saliency map. The anomaly score is then computed as the relative difference between the saliency map values and their moving averages. If the score is above a threshold, the value at a specific timestep is flagged as an outlier. • There are several parameters for SR algorithm. To obtain a model with good performance, we suggest to tune windowSize and threshold at first, these are the most important parameters to SR. Then you could search for an appropriate judgementWindowSize which is no larger than windowSize. And for the remaining parameters, you could use the default value directly. • Time-Series Anomaly Detection Service at Microsoft [https://arxiv.org/pdf/1906.03821.pdf]
  • 22. Data Science for the .NET developer
  • 23. Data Science and AI for the .NET developer • ML.NET is first and foremost a framework that you can use to create your own custom ML models. This custom approach contrasts with “pre-built AI,” where you use pre-designed general AI services from the cloud (like many of the offerings from Azure Cognitive Services). This can work great for many scenarios, but it might not always fit your specific business needs due to the nature of the machine learning problem or to the deployment context (cloud vs. on-premises). • ML.NET enables developers to use their existing .NET skills to easily integrate machine learning into almost any .NET application. This means that if C# (or F# or VB) is your programming language of choice, you no longer have to learn a new programming language, like Python or R, in order to develop your own ML models and infuse custom machine learning into your .NET apps.
  • 25. Some tools required • .NET 5 + WPF + ML.NET • Mandatory , the platform where we try to make experiments • Xplot.Ploty (soon you will understand I use this) https://fslab.org/XPlot/ • XPlot is a cross-platform data visualization package for the F# programming language powered by popular JavaScript charting libraries Plotly and Google Charts. The library provides a complete mapping for the configuration options of the underlying libraries and so you get a nice F# interface that gives you access to the full power of Plotly and Google Charts. The XPlot library can be used interactively from F# Interactive, but charts can equally easy be embedded in F# applications and in HTML reports. • WebView2 https://docs.microsoft.com/en-us/microsoft-edge/webview2/gettingstarted/wpf • The Microsoft Edge WebView2 control enables you to embed web technologies (HTML, CSS, and JavaScript) in your native apps. The WebView2 control uses Microsoft Edge (Chromium) as the rendering engine to display the web content in native apps. With WebView2, you may embed web code in different parts of your native app. Build all of the native app within a single WebView instance.
  • 28. Batch vs. Notebooks • Batch o Work on slow data stored into a Datalake o Submit a complete app in one single deploy o Receive the entire output • Notebook o «sketching» the code o Write/delete/rewrite continuously o Run cell by cell (but also all at once) interactive • In a world of Mathematica
  • 29. Jupyter • Evolution and generalization of the seminal role of Mathematica • In web standards way o Web (HTTP+Markdown) o Python adoption (ipynb) • Written in Java • Python has an interop bridge...not native (if ever important)Python is a kernel for Jupyter
  • 30. Python! • Simple to start (that why C# is pythonizing…) • “Open Source” • TensorFlow, Scikit-learn, Keras, Pandas, PyTorch • Remember one thing: o Often behind a Data Science framework there is a native library and Python binds that library o Spark is written in Java and there is a bridge for Python to Spark o Jupyter is written in Java and there is a bridge (kernel) for Python
  • 31. Spark Unifies:  Batch Processing  Interactive SQL  Real-time processing  Machine Learning  Deep Learning  Graph Processing An unified, open source, parallel, data processing framework for Big Data Analytics Spark Core Engine Spark SQL Batch processing Spark Structured Streaming Stream processing Spark MLlib Machine Learning Yarn Spark MLlib Machine Learning Spark Streaming Stream processing GraphX Graph Computation http://spark.apache.org Apache Spark
  • 32. .NET Interactive and Jupyter and Visual Studio Code • .NET Interactive gives C# and F# kernels to Jupyter • .NET Interactive gives all tools to create your hosting application independently from Jupyter • In Visual Studio Code, you have two different notebooks (looking similar but developed in parallel by different teams) o .NET Interactive Notebook (by the .NET Interactive Team) that can run also Python o Jupyter Notebook (by the Azure Data Studio Team – probably) that can run also C# and F# • There is a little confusion on that  • .NET Interactive has a strong C#/F# Kernel... o ...a less mature infrastructure (compared to Jupiter)
  • 33. .NET for Apache Spark 1.1.1 • .NET bindings (C# and F#) to Spark o Written on the Spark interop layer, designed to provide high performance bindings to multiple languages • Re-use the knowledge, skills, and code you have as a .NET developer o Compliant with .NET Standard • You can use .NET for Apache Spark anywhere you write .NET code • Original project: Mobius o https://github.com/microsoft/Mobius
  • 36. .NET (5) hosting in Azure • Existing apps: .NET web apps (on-premises), monolithic / N-tier architectures, relational database • Cloud Infrastructure-Ready: monolithic / N-tier architectures on VMs, IaaS (Infrastructure as a Service) • Cloud-Optimized: monolithic / N-tier architectures, managed services, PaaS and Windows Containers • Cloud-Native: microservices and serverless architectures, PaaS for containerized microservices + serverless computing + managed services
  • 37. Functions everywhere • Dimensions: Platform, App delivery, OS • Hosts: on-premises, App Service on Azure Stack, Windows, non-Azure hosts • Delivered as code on top of: o Azure Functions host runtime o Azure Functions Core Tools o Azure Functions base Docker image o Azure Functions .NET Docker image o Azure Functions Node Docker image
  • 38. Logic Apps o Visually design workflows in the cloud o Express logic through powerful control flow o Connect disparate functions and APIs o Utilize declarative definition to work with CI/CD
  • 40. Anomaly Detection As A Service in Azure
  • 41. Azure Cognitive Services • Cognitive Services brings AI within reach of every developer, without requiring machine-learning expertise. All it takes is an API call to embed the ability to see, hear, speak, search, understand, and accelerate decision-making into your apps. Enable developers of all skill levels to easily add AI capabilities to their apps. • Five areas: o Decision o Language o Speech o Vision o Web search • Decision services: o Anomaly Detector: identify potential problems early on. o Content Moderator: detect potentially offensive or unwanted content. o Metrics Advisor (preview): monitor metrics and diagnose issues. o Personalizer: create rich, personalized experiences for every user.
  • 42. Anomaly Detector • Through an API, Anomaly Detector ingests time-series data of all types and selects the best-fitting detection model for your data to ensure high accuracy. Customize the service to detect any level of anomaly and deploy it where you need it most, from the cloud to the intelligent edge with containers. Microsoft positions Azure as the only major cloud provider that offers anomaly detection as an AI service.
  • 45. Fully managed big data analytics service • Fully managed for efficiency: focus on insights, not the infrastructure, for fast time to value o No infrastructure to manage; provision the service, choose the SKU for your workload, and create a database. • Optimized for streaming data: get near-instant insights from fast-flowing data o Scale linearly up to 200 MB per second per node with highly performant, low latency ingestion. • Designed for data exploration o Run ad-hoc queries using the intuitive query language o Returns results from 1 billion records in under 1 second without modifying the data or metadata
  • 46. Multi-temperature data processing paths • Hot o seconds freshness, days retention o in-mem aggregated data o pre-defined standing queries o split-seconds query performance o data viewing o technologies: in-mem cube, stream analytics, … • Warm o minutes freshness, months retention o raw data o ad-hoc queries o seconds-minutes query perf o data exploration o technologies: column store, indexing, … • Cold o hours freshness, years retention o raw data o programmatic batch processing o minutes-hours query perf o data manipulation o technologies: distributed file system, map reduce, …
  • 48. Upcoming Events • jsTalks (Bulgaria), 2021 Nov 19-20, http://jstalks.net/ • Azure MVP Unplugged, 5th Edition, October 21 • Global AI Back Together Bulgaria 2021, October 19
  • 49. Thanks to our Sponsors
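
Slide 42 presents Anomaly Detector as something you reach "through an API". The sketch below only builds such a request, without sending it, to show roughly what a batch detection call carries. The URL path, the `series` / `granularity` / `sensitivity` field names and the `Ocp-Apim-Subscription-Key` header follow the v1.0 REST API as recalled here and should be checked against the current service documentation; the endpoint, the key and the fridge readings are hypothetical placeholders.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

# Assumption: path of the v1.0 "entire series" (batch) detection operation.
DETECT_PATH = "/anomalydetector/v1.0/timeseries/entire/detect"


def build_detect_request(endpoint, api_key, readings, start, sensitivity=95):
    """Build (but do not send) a POST request for hourly readings."""
    series = [
        {
            "timestamp": (start + timedelta(hours=i)).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "value": value,
        }
        for i, value in enumerate(readings)
    ]
    body = {"series": series, "granularity": "hourly", "sensitivity": sensitivity}
    return urllib.request.Request(
        endpoint.rstrip("/") + DETECT_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": api_key,
        },
        method="POST",
    )


# Hypothetical fridge temperatures (°C): 24 hourly points with one spike.
readings = [4.0] * 11 + [9.5] + [4.0] * 12
req = build_detect_request(
    "https://example-resource.cognitiveservices.azure.com",  # placeholder endpoint
    "<your-key>",  # placeholder key
    readings,
    start=datetime(2021, 10, 9, tzinfo=timezone.utc),
)
payload = json.loads(req.data)
print(req.get_method(), req.full_url)
print(len(payload["series"]), "points, spike sent as", payload["series"][11])
```

Sending the request (for example with `urllib.request.urlopen(req)`) needs a real resource and key. The response is expected to report, point by point, whether each value is considered anomalous, but its exact shape is best read from the service documentation rather than from this sketch.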

Editor's Notes

  • #17: https://towardsdatascience.com/effective-approaches-for-time-series-anomaly-detection-9485b40077f1 https://towardsdatascience.com/anomaly-detection-for-dummies-15f148e559c1
  • #18: https://towardsdatascience.com/time-series-analysis-for-beginners-8a200552e332
  • #19: Anomaly detection is the process of identifying unexpected items or events in data sets, which differ from the norm. And anomaly detection is often applied on unlabeled data which is known as unsupervised anomaly detection. https://towardsdatascience.com/effective-approaches-for-time-series-anomaly-detection-9485b40077f1
  • #22: The Spectral Residual outlier detector is based on the paper Time-Series Anomaly Detection Service at Microsoft and is suitable for unsupervised online anomaly detection in univariate time series data. The algorithm first computes the Fourier Transform of the original data. Then it computes the spectral residual of the log amplitude of the transformed signal before applying the Inverse Fourier Transform to map the sequence back from the frequency to the time domain. This sequence is called the saliency map. The anomaly score is then computed as the relative difference between the saliency map values and their moving averages. If the score is above a threshold, the value at a specific timestep is flagged as an outlier. For more details, please check out the paper.
  • #37: What’s next? Modernize applications with .NET Core Today we focused on Cloud-optimized .NET Framework apps. However, many applications will benefit from modern architecture built on .NET Core – a much faster, modular, cross-platform, open source .NET. Websites can be modernized with ASP.NET Core to bring in better security, compliance, and much better performance than ASP.NET on .NET Framework. .NET Core also provides code patterns for building resilient, high-performance microservices on Linux and Windows.
  • #39: Build 2015
  • #46: What is ADX exactly? It is a fully managed big data analytics service based on an analytical database. Analytical databases are optimized to query and run advanced analytics on large volumes of data with extremely low response times. Modern analytical databases are generally distributed, scalable, and fault-tolerant. They are columnar databases that work with compressed formats and rely on an intelligent software infrastructure that blends in-memory and disk caching technologies.
  • #47: Hot means instant results from a continuous data flow. The warm path means an analytical approach: not immediate, but full of raw data to be modeled. Cold doesn't mean unreachable. So the question is: in a multi-temperature situation, is there any Azure service that can be the answer to all three data paths?
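
Note #22 describes the Spectral Residual detector as a sequence of steps: Fourier Transform, spectral residual of the log amplitude, inverse transform to get the saliency map, then a score relative to a moving average. A minimal pure-Python sketch of those steps follows. It is an illustration under simplifying assumptions, not the implementation used by any Microsoft service or library: it uses a naive O(n²) DFT, trailing moving averages, and skips the series extension step from the paper. The window sizes, the 2.0 threshold and the sample data are arbitrary choices for the example.

```python
import cmath
import math

EPS = 1e-8  # avoids log(0) and division by zero


def dft(values, inverse=False):
    """Naive discrete Fourier transform (fine for short series)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(
            v * cmath.exp(sign * 2j * math.pi * k * t / n)
            for t, v in enumerate(values)
        )
        out.append(acc / n if inverse else acc)
    return out


def moving_average(values, window):
    """Trailing moving average; the window is shorter at the start."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def spectral_residual_scores(series, freq_window=3, score_window=5):
    # 1. Fourier Transform of the original data.
    spectrum = dft(series)
    log_amplitude = [math.log(abs(c) + EPS) for c in spectrum]
    phase = [cmath.phase(c) for c in spectrum]
    # 2. Spectral residual: log amplitude minus its smoothed version.
    smoothed = moving_average(log_amplitude, freq_window)
    residual = [a - s for a, s in zip(log_amplitude, smoothed)]
    # 3. Inverse transform back to the time domain: the saliency map.
    rebuilt = [cmath.rect(math.exp(r), p) for r, p in zip(residual, phase)]
    saliency = [abs(c) for c in dft(rebuilt, inverse=True)]
    # 4. Score: relative difference from the saliency moving average.
    baseline = moving_average(saliency, score_window)
    return [(s - b) / (b + EPS) for s, b in zip(saliency, baseline)]


# Hypothetical fridge temperatures (°C): flat at 4.0 with one spike.
series = [4.0] * 32
series[20] = 9.0
scores = spectral_residual_scores(series)
flagged = [i for i, score in enumerate(scores) if score > 2.0]
print(flagged)
```

On this toy series only the spike at index 20 crosses the threshold. Real sensor data is noisier, so the window sizes and the threshold would need tuning against known good and bad periods.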