In Collab
with
Microsoft Azure AI Fundamentals: Computer
Vision
Sreya E P
Agenda
•Fundamentals of Computer Vision
•Azure AI Vision
•Fundamentals of Facial Recognition
•Facial Analysis
•Azure Face Services
•Responsible AI Use
•Optical Character Recognition
•Azure AI Vision OCR Engine
•Vision Studio
•Demo
•Conclusion
Fundamentals of Computer Vision
Computer vision is one of the core areas of artificial intelligence (AI), and
focuses on creating solutions that enable AI applications to "see" the
world and make sense of it.
Azure AI Vision
• While you can train your own machine learning models for
computer vision, the architecture for computer vision models
can be complex; and you require significant volumes of
training images and compute power to perform the training
process.
• Microsoft's Azure AI Vision service provides prebuilt and
customizable computer vision models that are based on the
Florence foundation model and provide various powerful
capabilities.
Analyzing images with the Azure AI Vision service
Azure AI Vision supports multiple image analysis
capabilities, including:
•Optical character recognition (OCR) - extracting
text from images.
•Generating captions and descriptions of images.
•Detection of thousands of common objects in
images.
•Tagging visual features in images
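As a rough illustration of how several of these capabilities can be requested in one call, the sketch below builds (but does not send) an Image Analysis REST request. The route `computervision/imageanalysis:analyze`, the `2023-10-01` API version, the feature names, and the placeholder endpoint, key, and image URL are assumptions drawn from the publicly documented REST API; verify them against the current Azure AI Vision reference before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_analyze_request(endpoint, key, image_url,
                          features=("caption", "tags", "objects", "read")):
    """Build (without sending) a POST request asking for several visual features."""
    # Assumed API version and feature names; check the service reference.
    query = urlencode({"api-version": "2023-10-01", "features": ",".join(features)})
    url = f"{endpoint.rstrip('/')}/computervision/imageanalysis:analyze?{query}"
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return Request(url, data=body, headers=headers, method="POST")


# Placeholder values: replace with your own resource endpoint and key.
request = build_analyze_request(
    "https://example-resource.cognitiveservices.azure.com",
    "<your-key>",
    "https://example.com/street.jpg",
)
print(request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without an Azure subscription.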
Fundamentals of Facial Recognition
Introduction
Face detection and analysis is an area of artificial intelligence (AI) which uses algorithms to locate and analyze human faces
in images or video content.
There are many applications for face detection, analysis, and recognition. For example,
•Security - facial recognition can be used in building security applications, and it is increasingly used in smartphone
operating systems for unlocking devices.
•Social media - facial recognition can be used to automatically tag known friends in photographs.
•Intelligent monitoring - for example, an automobile might include a system that monitors the driver's face to determine
if the driver is looking at the road, looking at a mobile device, or shows signs of tiredness.
•Advertising - analyzing faces in an image can help direct advertisements to an appropriate demographic audience.
•Missing persons - using public camera systems, facial recognition can be used to identify whether a missing person is in the
image frame.
•Identity validation - useful at port-of-entry kiosks where a person holds a special entry permit.
Understand facial analysis
Face detection involves identifying regions of
an image that contain a human face, typically
by returning bounding box coordinates that
form a rectangle around the face
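A bounding box is usually expressed as a top-left corner plus a width and height. The minimal sketch below shows one way to represent such rectangles and pick the largest detected face; the `FaceRectangle` class, the `largest_face` helper, and the pixel values are illustrative assumptions, not part of any Azure SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaceRectangle:
    """Hypothetical container for one detected face, in pixels."""
    left: int
    top: int
    width: int
    height: int

    def corners(self):
        """Return (left, top, right, bottom) pixel coordinates."""
        return (self.left, self.top, self.left + self.width, self.top + self.height)

    def area(self):
        return self.width * self.height


def largest_face(rectangles):
    """Pick the rectangle covering the most pixels, or None if no face was found."""
    return max(rectangles, key=FaceRectangle.area, default=None)


# Made-up detections for illustration.
detected = [FaceRectangle(40, 60, 100, 120), FaceRectangle(300, 80, 50, 50)]
main_face = largest_face(detected)
print(main_face.corners())
```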
Understand facial analysis
Beyond detection, facial analysis uses machine learning
models trained to return other information about a
face, such as the locations of facial landmarks
(nose, eyes, eyebrows, lips, and others).
Understand facial analysis
A further application of facial analysis is to train
a machine learning model to identify known
individuals from their facial features. This is
known as facial recognition, and uses multiple
images of an individual to train the model. This
trains the model so that it can detect those
individuals in new images on which it wasn't
trained.
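To make the idea of matching a new image against known individuals concrete, here is a minimal sketch that compares numeric feature vectors with cosine similarity. This is a commonly used stand-in technique, not a description of how Azure AI Face works internally; the vectors, names, and the 0.8 threshold are made-up assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(face_vector, known_people, threshold=0.8):
    """Return the best-matching name, or None when no match clears the threshold."""
    best_name, best_score = None, threshold
    for name, vectors in known_people.items():
        for vector in vectors:
            score = cosine_similarity(face_vector, vector)
            if score >= best_score:
                best_name, best_score = name, score
    return best_name


# Several reference vectors per person, mirroring "multiple images of an individual".
known = {
    "alice": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "bob": [[0.1, 0.9, 0.2]],
}
match = identify([0.85, 0.15, 0.05], known)
no_match = identify([0.0, 0.0, 1.0], known)
print(match, no_match)
```

Returning `None` below the threshold matters in practice: a face that resembles nobody in the known set should not be forced onto the nearest name.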
Get started with facial analysis on Azure
Microsoft Azure provides multiple Azure AI services that you can use to detect and analyze faces, including:
•Azure AI Vision, which offers face detection and some basic face analysis, such as returning the bounding box
coordinates around a detected face.
•Azure AI Video Indexer, which you can use to detect and identify faces in a video.
•Azure AI Face, which offers pre-built algorithms that can detect, recognize, and analyze faces.
Of these, Face offers the widest range of facial analysis capabilities.
Azure AI Face service
The Azure AI Face service can return the rectangle coordinates for any human faces that are found in an image, as
well as a series of related attributes:
• Accessories: indicates whether the given face has accessories. This attribute returns possible accessories
including headwear, glasses, and mask, with confidence score between zero and one for each accessory.
• Blur: how blurred the face is, which can be an indication of how likely the face is to be the main focus of the
image.
• Exposure: whether the face is underexposed or overexposed. This applies to the face in the image
and not the overall image exposure.
• Glasses: whether or not the person is wearing glasses.
• Head pose: the face's orientation in a 3D space.
• Mask: indicates whether the face is wearing a mask.
• Noise: refers to visual noise in the image. If you have taken a photo with a high ISO setting in a dark environment,
you may have noticed this noise: the image looks grainy or full of tiny dots that make it less
clear.
• Occlusion: determines if there might be objects blocking the face in the image.
• Quality For Recognition: a rating of high, medium, or low that reflects if the image is of sufficient quality to
attempt face recognition on.
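The attributes above typically arrive as a JSON object per detected face. The sketch below filters a sample response down to faces whose quality rating is good enough for recognition; the field names (`faceRectangle`, `faceAttributes`, `qualityForRecognition`, `glasses`) and the sample values are assumptions modeled on the documented detection response, so confirm them against the Face API reference.

```python
# Assumed shape of a detection response; values are invented for illustration.
sample_faces = [
    {
        "faceRectangle": {"left": 40, "top": 60, "width": 100, "height": 120},
        "faceAttributes": {"qualityForRecognition": "high", "glasses": "NoGlasses"},
    },
    {
        "faceRectangle": {"left": 300, "top": 80, "width": 50, "height": 50},
        "faceAttributes": {"qualityForRecognition": "low", "glasses": "Sunglasses"},
    },
    {
        "faceRectangle": {"left": 500, "top": 90, "width": 70, "height": 80},
        "faceAttributes": {"qualityForRecognition": "medium", "glasses": "NoGlasses"},
    },
]

QUALITY_RANK = {"low": 0, "medium": 1, "high": 2}


def usable_for_recognition(faces, minimum="medium"):
    """Keep faces whose quality rating is at least `minimum`."""
    floor = QUALITY_RANK[minimum]
    kept = []
    for face in faces:
        rating = face.get("faceAttributes", {}).get("qualityForRecognition", "low")
        # Unknown or missing ratings are treated as "low" to stay conservative.
        if QUALITY_RANK.get(rating.lower(), 0) >= floor:
            kept.append(face)
    return kept


good = usable_for_recognition(sample_faces)
print([face["faceRectangle"]["left"] for face in good])
```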
Responsible AI Use
Anyone can use the Face service to:
• Detect the location of faces in an image.
• Determine if a person is wearing glasses.
• Determine if there's occlusion, blur, noise, or over/under exposure for any of the faces.
• Return the head pose coordinates for each face in an image.
The Limited Access policy requires customers to submit an intake form to access additional Azure AI Face service capabilities
including:
•Face verification: the ability to compare faces for similarity.
•Face identification: the ability to identify named individuals in an image.
•Liveness detection: the ability to detect and mitigate instances of recurring content and/or behaviors that indicate a violation
of policies (e.g., determining whether the input video stream is real or fake).
Fundamentals of Optical Character Recognition
Introduction
OCR, or Optical Character Recognition, is a technology that converts different
types of documents, such as scanned paper documents, PDFs, or images
captured by a digital camera, into editable and searchable data.
Key Points About OCR
1. Functionality:
   1. OCR scans text characters in images and converts them into
      machine-encoded text. This includes recognizing printed text,
      handwritten text, or other textual content within images.
2. Applications:
   1. Digitizing Documents: Converting physical paper documents into
      digital formats, making them easier to store, search, and share.
   2. Text Extraction: Extracting text from images and PDFs to use in
      other applications, such as databases, word processors, and
      spreadsheets.
   3. Automation: Automating data entry processes, reducing manual
      input errors, and increasing efficiency.
Get started with Azure AI Vision
• The ability for computer systems to
process written and printed text is an
area of AI where computer
vision intersects with natural language
processing.
• Vision capabilities are needed to "read"
the text, and then natural language
processing capabilities make sense of it.
• OCR is the foundation of processing text
in images and uses machine learning
models that are trained to recognize
individual shapes as letters, numerals,
punctuation, or other elements of text.
Azure AI Vision's OCR Engine
• The Azure AI Vision service can extract machine-readable text from images. Azure AI Vision's Read API is the
OCR engine that powers text extraction from images, PDFs, and TIFF files. OCR for images is optimized for general,
non-document images, which makes it easier to embed OCR in your user experience scenarios.
• The Read API, also known as the Read OCR engine, uses the latest recognition models and is optimized for images
that have a significant amount of text or considerable visual noise. It automatically determines the proper
recognition model to use, taking into consideration the number of lines of text, images that include text, and
handwriting.
• The OCR engine takes in an image file and identifies bounding boxes, or coordinates, where items are located within
an image. In OCR, the model identifies bounding boxes around anything that appears to be text in the image.
Azure AI Vision's OCR Engine
Calling the Read API returns results
arranged into the following
hierarchy:
•Pages - One for each page of
text, including information about
the page size and orientation.
•Lines - The lines of text on a
page.
•Words - The words in a line of
text, including the bounding box
coordinates and text itself.
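The pages, lines, and words hierarchy can be walked with nested loops. The sketch below does this over a hand-written sample; the key names (`pages`, `lines`, `words`, `boundingBox`) and the eight-number polygon layout are assumptions chosen to mirror the hierarchy described above, and the real response field names may differ by API version.

```python
# Hand-written sample shaped like the hierarchy above (hypothetical key names).
sample_result = {
    "pages": [
        {
            "width": 600, "height": 400, "angle": 0.0,
            "lines": [
                {
                    "text": "Hello world",
                    "words": [
                        {"text": "Hello", "boundingBox": [10, 10, 60, 10, 60, 30, 10, 30]},
                        {"text": "world", "boundingBox": [70, 10, 130, 12, 130, 32, 70, 30]},
                    ],
                },
            ],
        },
    ],
}


def to_rect(polygon):
    """Collapse an [x1, y1, x2, y2, ...] polygon into (min_x, min_y, max_x, max_y)."""
    xs, ys = polygon[0::2], polygon[1::2]
    return (min(xs), min(ys), max(xs), max(ys))


def iter_words(result):
    """Yield (page_number, line_text, word_text, rect) for every word."""
    for page_number, page in enumerate(result.get("pages", []), start=1):
        for line in page.get("lines", []):
            for word in line.get("words", []):
                yield page_number, line["text"], word["text"], to_rect(word["boundingBox"])


words = list(iter_words(sample_result))
for page_number, line_text, word_text, rect in words:
    print(page_number, word_text, rect)
```

Collapsing the polygon to an axis-aligned rectangle loses rotation information, which is acceptable for highlighting text but not for precise cropping of skewed scans.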
Get started with Vision Studio on Azure
To use the Azure AI Vision service you must first create a resource for it in your Azure subscription. You can use
either of the following resource types:
•Azure AI Vision: A specific resource for vision services. Use this resource type if you don't intend to use any
other AI services, or if you want to track utilization and costs for your AI Vision resource separately.
•Azure AI services: A general resource that includes Azure AI Vision along with many other Azure AI services
such as Azure AI Language, Azure AI Speech, and others. Use this resource type if you plan to use multiple
Azure AI services and want to simplify administration and development.
Once you've created a resource, there are several ways to use Azure AI Vision's Read API:
•Vision Studio
•REST API
•Software Development Kits (SDKs): Python, C#, JavaScript
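When the Read API is called over REST, the documented v3.2 flow is asynchronous: a POST submits the image and returns an `Operation-Location` URL, which the client then polls with GET until the status is `succeeded` or `failed`. The sketch below shows only the polling loop, with the HTTP call replaced by an injected function so it runs offline; `wait_for_read_result`, its parameters, and the canned responses are illustrative assumptions.

```python
import time


def wait_for_read_result(fetch_status, max_attempts=10, delay_seconds=1.0,
                         sleep=time.sleep):
    """Poll `fetch_status()` until the operation succeeds, fails, or times out."""
    for _ in range(max_attempts):
        payload = fetch_status()
        status = payload.get("status", "").lower()
        if status == "succeeded":
            return payload
        if status == "failed":
            raise RuntimeError("Read operation failed")
        # Any other status (for example not started or running): wait and retry.
        sleep(delay_seconds)
    raise TimeoutError("Read operation did not finish in time")


# Stand-in for repeated GET requests against the Operation-Location URL.
canned = iter([
    {"status": "notStarted"},
    {"status": "running"},
    {"status": "succeeded", "analyzeResult": {"readResults": []}},
])
result = wait_for_read_result(lambda: next(canned), sleep=lambda seconds: None)
print(result["status"])
```

In real use, `fetch_status` would issue the GET with the subscription key header and return the parsed JSON body. The SDKs generally wrap this submit-and-poll pattern for you.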
Demo
Conclusion

