Cutting Edge Computer Vision for Everyone

GLOBAL AI BOOTCAMP IS POWERED BY:

SPEAKER BIO
• Solution Architect @
• Microsoft Azure MVP
• External Expert Eurostars-Eureka, Horizon Europe
• External Expert InnoFund Denmark, RIF Cyprus
• Business Interests
o Web Development, SOA, Integration
o IoT, Machine Learning
o Security & Performance Optimization
• Contact
o ivelin.andreev@kongsbergdigital.com
o www.linkedin.com/in/ivelin
o www.slideshare.net/ivoandreev

Upcoming Events
JS Experts
March 29, 2023 @Sofia Tech Park
Tickets (Eventbrite)
Submit Session (Sessionize)
Global Azure
May 13, 2023 @Sofia Tech Park
Tickets (Eventbrite)
Submit Session (Sessionize)

Agenda
• Real-life Scenario Context
• How Computer Vision Works
• Computer Vision, Image Classification, Object Detection
• ML .NET
• Custom Vision
• Azure ML Service

Typical Computer Vision Tasks
• Analyze Structured Documents
• Read Text (OCR)
• Detect Objects
• Analyze Images
• Classify Images
• Detect Faces

A Digital Image
• Image – a 2D matrix of pixels, each holding a value (colour intensity)
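
As a minimal sketch of the idea above, a grayscale image can be held as a plain 2D list in which every entry is one pixel's intensity. The 0 to 255 value range and the helper name `pixel` are assumptions made for this illustration.

```python
# A tiny 4x4 grayscale "image": each entry is one pixel's intensity
# (assumed range: 0 = black, 255 = white).
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]

height = len(image)      # number of rows
width = len(image[0])    # number of columns


def pixel(img, row, col):
    """Return the intensity stored at (row, col)."""
    return img[row][col]


# A colour image adds a third dimension: one (R, G, B) triple per pixel.
colour_pixel = (255, 128, 0)
```

The left half of this image is dark and the right half is bright, so it contains exactly one vertical edge.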

How AI for Computer Vision Works
• Human vision
o The eye perceives light
o Light is converted to electrochemical signals
o Neural networks in the brain are activated (thinking, memories, feelings)
o Low-level patterns are recognized (nose, eyes, ears)
o These are combined into higher-order patterns (animal, fish, male/female)
o Classification – labeling the subject in words (dog, cat, trout)
• Computer Vision
o How do you do that for a computer?
o What are ears, and how do you describe them?
o How do they look from different angles?
• Deep neural networks (DNNs) are used mainly as classifiers
• Feature extraction = pattern recognition
David Hubel (CA) and Torsten Wiesel (SE), 1950; Nobel Prize 1981, "Mammal visual system development"

Neural Network Structure
• Nodes, organized in layers, with weighted connections
o Directed acyclic graph
• Layers
o Input (1), Output (1)
o Shallow - 1 hidden layer
o Deep – multiple hidden layers
• Artificial Neuron Model

Artificial Neuron Activation
• Calculates weighted sum of inputs
• Adds bias (function shift)
• Decides whether it should be activated
Natural Questions
• Why do we have so many activation functions?
• Why do some work better than others?
• Which one should you use?
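
The three steps of neuron activation listed on this slide (weighted sum, bias, activation decision) can be sketched as follows. The function names and the simple step activation are illustrative assumptions, not taken from any particular library.

```python
def neuron(inputs, weights, bias, activation):
    """Compute one artificial neuron's output."""
    # 1. Weighted sum of the inputs
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    # 2. Add the bias, which shifts the activation function
    z = weighted_sum + bias
    # 3. Let the activation function decide the output signal
    return activation(z)


def step(z):
    """Simplest possible activation: fire (1.0) only for positive input."""
    return 1.0 if z > 0 else 0.0


# Hypothetical inputs and weights, chosen only for illustration.
output = neuron([1.0, 0.5], [0.4, -0.2], bias=0.1, activation=step)
```

Swapping `step` for another function changes how the neuron responds, which is why the choice of activation function matters.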

Activation Functions
• Goal
• Convert input -> output signal
• Output signal is input to next layer
• Approximate target function faster
• Samples
• ReLu, PReLu – good to start with
• TanH and Sigmoid – outdated
• Softmax – output layer, classification
• Linear func – output layer, regression
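
For reference, the activation functions named above can be written in a few lines each. This is a plain-Python sketch using the standard definitions. In a real PReLU the slope `alpha` is learned during training, so the fixed default of 0.25 here is an assumption for illustration only.

```python
import math


def relu(z):
    """ReLU: pass positive values, clamp negatives to zero."""
    return max(0.0, z)


def prelu(z, alpha=0.25):
    """PReLU: like ReLU, but negatives keep a small slope alpha."""
    return z if z > 0 else alpha * z


def sigmoid(z):
    """Sigmoid: squash any value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """TanH: squash any value into (-1, 1)."""
    return math.tanh(z)


def softmax(scores):
    """Softmax: turn a list of scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(z):
    """Linear: identity, used for regression outputs."""
    return z
```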

How does CNN work?
• Convolution
• Non-Linearity (e.g. PReLU)
• Pooling (Downsample)
• Fully connected (Classify)
• Dropout (Overfitting prevention)
[Figure: convolution and pooling, with an edge-detection filter]
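
The first three CNN stages listed on this slide can be sketched in plain Python. The 6x6 image and the vertical edge kernel are made-up values for illustration. As in most CNN libraries, the "convolution" here slides the kernel without flipping it (strictly a cross-correlation).

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu_map(feature_map):
    """Non-linearity: clamp negative responses to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Downsample by keeping the maximum of each size x size block."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


# Dark left half, bright right half: one vertical edge in the middle.
image = [[0, 0, 0, 10, 10, 10] for _ in range(6)]

# Hypothetical vertical edge-detection filter.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

features = relu_map(convolve2d(image, vertical_edge))
pooled = max_pool(features)
```

The filter responds strongly only where the brightness changes from left to right, and pooling keeps that response while shrinking the feature map.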

Compare
• Custom Vision
o Complexity: Low
o Modeling: Iterations
o Training: Internal
o Inference: Web Endpoint, Export
o Pricing: Prediction transactions - €2 / 1’000 transactions; Training - €10 / compute hour; Image Storage - €0.70 per 1’000 images
• AZ ML Service
o Complexity: High
o Modeling: Automated, Designer
o Training: AKS
o Inference: Web Endpoint, Export
o Pricing: AZ Resources – 2x D4 VM (€0.18/h) + AZ Blob, ACR, App Insights, KeyVault
• ML.NET
o Complexity: Low
o Modeling: VS Extension Wizard
o Training: Local
o Inference: Function App, Export (ONNX)
o Pricing: Inference API hosting (AZ app service, ACI, AKS)

Step 1: Obtain Training Data
• Open datasets for ML and computer vision projects
• Create PoC before real data are available
• Test different concepts
• Find datasets for training
o Google Dataset Search
o Kaggle
o DataSetList.com

Step 2: ML.NET – Task Selection
• Download and install ML.NET VS 2022 extension
• Set up model training
1. Scenario (e.g. Classification)
2. Training environment
o Local CPU
o Local GPU
o Azure
• Locally trained ML model in ML.NET format

Step 3: ML.NET - Training Data
• Add training data to model
o Supervised machine learning – telling the model the class of each input image
o Organize images into one subfolder per class
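
The folder-per-class convention above can be read into (image path, label) pairs with a few lines of code. This is a generic sketch of the convention, not the ML.NET loader itself. The function name and the accepted extensions are assumptions.

```python
from pathlib import Path

# Assumed set of image extensions for this sketch.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def load_labeled_paths(root):
    """Return (image_path, label) pairs, using each subfolder name as the label."""
    samples = []
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # ignore loose files in the root folder
        for file in sorted(class_dir.iterdir()):
            if file.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((file, class_dir.name))
    return samples
```

With a layout such as `data/cat/...` and `data/dog/...`, calling `load_labeled_paths("data")` yields every image paired with `"cat"` or `"dog"`.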

Step 4: ML.NET - Training Algorithm
• Algorithm is automatically selected
o Based on the selected scenario
• “Start Training” magic button
• ML.NET uses under the hood:
o TensorFlow
o ONNX
o Infer.NET
• Classification model uses TF.NET under the hood
• TF.NET loads a pretrained model (Transfer Learning) – faster training, better performance

Step 5: ML.NET – Model Evaluation
• Training picks the best model
• Result KPIs
o Accuracy – share of correct predictions
o AUC – area under the ROC curve; how well the model separates the classes
o AUC-PR (Precision-Recall) – for imbalanced classes
o F1 – balance of precision and recall
• Different KPIs for different types of tasks
https://learn.microsoft.com/en-us/dotnet/machine-learning/resources/metrics
• Try the model – manually upload an image
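
To make the KPIs on this slide concrete, here is a minimal sketch that computes accuracy, precision, recall and F1 for a binary problem using their standard definitions. It is not ML.NET's own evaluation code, and the function name is an assumption.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels and predictions, for illustration only.
metrics = binary_metrics([1, 1, 0, 0], [1, 0, 0, 0])
```

On imbalanced data, accuracy alone can look good while recall is poor, which is why F1 and AUC-PR are listed alongside it.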

Step 6: ML.NET – Consume Model
• Sample Code to consume the model
o Console App (local ONNX model)
o Web API
o Notebook
• Deploy model
o Azure Function
o Web API

Step 7: Improve the Model
• Additional Data
o The more data, the better the model will learn
o Beware of overfitting
• More data for the same good features is OK
• More data for insignificant features is NOT OK (e.g. too many yellow apples)
• Data augmentation
o Preprocess images (orientation, cropping, contrast)
• Train longer
• Hyperparameter tuning (depending on the algorithm used)
• Cross-validation (makes the model more robust)
• Model architecture
o Train with another architecture (pre-trained model)
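
Data augmentation, mentioned above, creates extra training samples by transforming the images you already have. The sketch below shows three simple transformations on a grayscale image stored as a 2D list. The function names, the 0 to 255 value range and the contrast formula are assumptions for illustration.

```python
def flip_horizontal(img):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in img]


def crop(img, top, left, height, width):
    """Cut out a height x width window starting at (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def adjust_contrast(img, factor, midpoint=128):
    """Stretch (factor > 1) or compress (factor < 1) values around the midpoint."""
    return [
        [min(255, max(0, round(midpoint + factor * (v - midpoint))))
         for v in row]
        for row in img
    ]


def augment(img):
    """Return the original image plus a few transformed variants."""
    h, w = len(img), len(img[0])
    return [
        img,
        flip_horizontal(img),
        crop(img, 0, 0, h - 1, w - 1),
        adjust_contrast(img, 1.5),
    ]
```

Each variant keeps the original label, so one labeled image yields several training samples.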

Custom Vision
• Part of Cognitive Services
• Azure Resource Dependency
o Custom Vision Training
o Custom Vision Prediction
• Pricing Tiers
o Free (2 projects, 1000 images, 1h/month training, 10’000 predictions)
o Standard Terms
• Up to 100 projects
• Training €10/hr
• Image Storage €0.7 / 1’000 images
• Predictions €2 / 1’000 transactions
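
Using the Standard tier prices quoted on this slide, a rough monthly cost can be estimated with simple arithmetic. The function name and the example workload are assumptions. Current prices should be checked on the Azure pricing page.

```python
# Prices as quoted on the slide (EUR); they may have changed since.
TRAINING_EUR_PER_HOUR = 10.0
STORAGE_EUR_PER_1000_IMAGES = 0.70
PREDICTION_EUR_PER_1000 = 2.0


def estimate_monthly_cost(training_hours, stored_images, predictions):
    """Estimate the monthly Custom Vision cost in EUR."""
    training = training_hours * TRAINING_EUR_PER_HOUR
    storage = stored_images / 1000 * STORAGE_EUR_PER_1000_IMAGES
    inference = predictions / 1000 * PREDICTION_EUR_PER_1000
    return training + storage + inference


# Hypothetical workload: 2 training hours, 5,000 images, 20,000 predictions.
cost = estimate_monthly_cost(2, 5000, 20000)
```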

Step 2: Custom Vision – Task Selection
• Two types of projects
o Classification – tag the whole image
o Object detection – find location and tag in the image
• Classification types
o Multilabel – multiple tags per image
o Multiclass – single tag per image
• Domains
o Predefined types of tasks used to optimize the model (e.g. by using appropriate filters and CNN architectures)
o General / General A1 / General A2 / Food / Landmarks / Retail
o Compact domains – optimized for export and use on edge devices (less accurate, but smaller)
o Note: exported models are not guaranteed to behave exactly like the cloud-hosted models

Step 2: Custom Vision - Multilabel vs Multiclass
• Multilabel
o Each tag has its own probability, up to 100% each
• Multiclass
o Probabilities across all tags sum to 100%
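
One common way to obtain these two behaviours is to apply a softmax across all tags for multiclass and an independent sigmoid per tag for multilabel. The sketch below illustrates that general technique. Whether Custom Vision does exactly this internally is not stated in the slides, and the scores are made up.

```python
import math


def multiclass_probabilities(scores):
    """Softmax: tags compete, so the probabilities sum to 1 (100%)."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multilabel_probabilities(scores):
    """Sigmoid per tag: each probability is independent, up to 1 (100%) each."""
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]


# Hypothetical raw model scores for three tags.
scores = [2.0, 1.5, -1.0]
single_tag = multiclass_probabilities(scores)
many_tags = multilabel_probabilities(scores)
```

With these scores the multilabel output gives the first two tags a high probability at the same time, which is impossible in the multiclass case.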

Step 3/4: Custom Vision – Training Data/Training
• Images
o Upload in bulk
o Label during upload
• Multilabel classification
o Tag images with multiple tags
• Training parameters
o No parameters are available for customization during training
o Algorithm is determined automatically based on the domain

Step 5/6: Custom Vision – Evaluation / Consume
• Evaluation
o A model iteration is created for each training run (up to the last 10 are kept)
o Overall KPIs
o Performance per tag
• Predict / Inference
o View history of previous predictions
o Manually select and compare training iterations
o Ability to review labels and add the reviewed images to the training set

Step 7: Custom Vision – Deploy
• Deployment
o Publish specific training iteration
o Export model to file (Compact)
o Both image URL and image file supported
o Host model (Prediction API)
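
Once an iteration is published, the hosted model is called over HTTP. The sketch below only builds the request for classifying an image by URL and does not send it. The route, the `Prediction-Key` header and the `Url` body field follow the v3.0 Prediction REST API as best recalled here and should be verified against the "Prediction URL" dialog in the Custom Vision portal. The endpoint, project ID, iteration name and key are placeholders.

```python
import json
import urllib.request


def build_prediction_request(endpoint, project_id, published_name,
                             prediction_key, image_url):
    """Build (but do not send) a classify-by-URL request."""
    # Assumed v3.0 route; copy the exact URL from the portal before use.
    url = (f"{endpoint.rstrip('/')}/customvision/v3.0/Prediction/"
           f"{project_id}/classify/iterations/{published_name}/url")
    body = json.dumps({"Url": image_url}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Prediction-Key": prediction_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values, for illustration only.
request = build_prediction_request(
    "https://example.cognitiveservices.azure.com/",
    "my-project-id",
    "Iteration1",
    "my-prediction-key",
    "https://example.com/apple.jpg",
)
# To send it: urllib.request.urlopen(request) and parse the JSON response.
```

For an image file instead of a URL, the portal shows a second address that takes the raw bytes with an `application/octet-stream` content type.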

Azure ML Service
• Cloud ML as a Service with advanced AutoML features (Wizard)
• Start by selecting a dataset

Step 1: Azure ML Studio – Training Data
• Data Labeling
o Upload images to AZ Blob (.jpg, .jpeg, .png, .tiff, etc.)
o Option 1: ML assisted (a model pre-labels the images)
o Option 2: Users manually assign the respective tag
• Create Dataset
o Build as an export of the labeled images

Step 1: Azure ML Studio – Bulk Data Labeling
• Data Preparation Effort
o 20-100 labels
o Min 50, Recommended 200 images per label
• Optimize labeling process
o Cluster images in folders based on labels
o Prepare input image data in JSONL (JSON Lines) format.
o Each line describes one image
o Uploaded as a new dataset
{"image_url":"azureml://subscriptions/<my-subscription-id>/resourcegroups/<my-resource-group>/workspaces/<my-workspace>/datastores/<my-datastore>/paths/<path_to_image>","label":"class_name"}
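
A JSONL file like the one described above can be generated from a list of (image path, label) pairs. This is a minimal sketch: the function name is an assumption, and `datastore_prefix` stands for the `azureml://.../paths` part of the URL, which has to be filled in with your own subscription, resource group, workspace and datastore.

```python
import json


def write_jsonl(samples, datastore_prefix, output_path):
    """Write one JSON object per line: {"image_url": ..., "label": ...}."""
    with open(output_path, "w", encoding="utf-8") as out:
        for relative_path, label in samples:
            record = {
                "image_url": f"{datastore_prefix}/{relative_path}",
                "label": label,
            }
            # json.dumps never emits raw newlines, so each record stays on one line.
            out.write(json.dumps(record) + "\n")


# Placeholder prefix, mirroring the example above.
PREFIX = ("azureml://subscriptions/<my-subscription-id>/resourcegroups/"
          "<my-resource-group>/workspaces/<my-workspace>/datastores/"
          "<my-datastore>/paths")

# Hypothetical samples, e.g. derived from folders named after the labels.
SAMPLES = [("cat/a.jpg", "cat"), ("dog/b.jpg", "dog")]
```

Calling `write_jsonl(SAMPLES, PREFIX, "labels.jsonl")` produces a two-line file that can be uploaded as a new dataset.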

Step 2: Azure ML Studio - Task Selection
• Select type of task
• Select target column (to predict) from the dataset

Step 4: Azure ML Studio - Training
• Data Preparation Effort
o Provide model hyperparameters
o Algorithm is determined by problem
o Hyperparameters are algorithm specific
o Sweep for values in parameter space
• Training
o Several pipelines that train in parallel
o AutoML experiments with different algorithms and parameters
o Each iteration calculates a training score
o The model with the best score is selected
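
The sweep described above boils down to trying combinations from a parameter space, scoring each one, and keeping the best. The sketch below shows an exhaustive grid sweep run sequentially. It is not the Azure AutoML implementation, which runs pipelines in parallel and supports other sampling strategies. `toy_score` is a hypothetical stand-in for real training.

```python
import itertools


def sweep(parameter_space, train_and_score):
    """Try every combination in the space and return the best (params, score)."""
    names = list(parameter_space)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(parameter_space[n] for n in names)):
        params = dict(zip(names, values))
        score = train_and_score(params)  # one "iteration" with its score
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Hypothetical scoring function; peaks at learning_rate=0.01, epochs=15."""
    return (1.0
            - abs(params["learning_rate"] - 0.01) * 10
            - abs(params["epochs"] - 15) / 100)


space = {"learning_rate": [0.001, 0.01, 0.1], "epochs": [5, 15, 30]}
best_params, best_score = sweep(space, toy_score)
```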

Step 5: Azure ML Studio - Evaluation
• Validation options
o Auto - 20% of training data used for validation (default)
o Train-validation split - adjustable percentage of the training data.
o User-validation data – using a different dataset for validation.
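
A train-validation split can be sketched as shuffling the samples and holding back a fraction for validation. The 20% default mirrors the "Auto" option above. The function name and the fixed seed are assumptions for reproducibility in this illustration.

```python
import random


def train_validation_split(samples, validation_fraction=0.2, seed=42):
    """Shuffle the samples and hold back a fraction for validation."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n_validation = round(len(shuffled) * validation_fraction)
    validation = shuffled[:n_validation]
    training = shuffled[n_validation:]
    return training, validation


training, validation = train_validation_split(range(100))
```

The validation samples are never used for training, so the score on them indicates how the model handles unseen images.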

Step 7: Azure ML Studio - Deploy
• Models created with AutoML can be deployed to ACI or AKS
• Automatically created endpoint accessible via HTTP

Step 4: Azure ML Designer - Training
• Visual drag-and-drop interface to train and deploy models
• Replaces ML Studio Classic
• ML designer training pipeline
• Allows customization and tuning of the model
• Advanced processing of data
o Convert to Image Directory – converts the image dataset to the standardized “Image Directory” data format
o Image Transformation – preprocessing of images: Resize, Crop, Pad, Color jitter, Grayscale, etc.

Step 7: Azure ML Designer - Deploy
• ML Designer realtime pipeline

Takeaways
• Computer Vision Training
o Microsoft Azure AI Fundamentals: Explore Computer Vision
• Convolutional Neural Nets in Plain English
o https://hackernoon.com/learning-ai-if-you-suck-at-math-p5-deep-learning-and-convolutional-neural-nets-in-plain-english-cda79679bbe3
o https://towardsdatascience.com/understanding-convolutional-neural-networks-cnns-81dffc813a69
• Activation Functions
o https://medium.com/towards-data-science/activation-functions-and-its-types-which-is-better-a9a5310cc8f
• Azure ML Designer
o https://learn.microsoft.com/en-us/azure/machine-learning/concept-designer
• Platform Tools
o https://ml.azure.com/
o https://www.customvision.ai/
