18. Application Example – Photo OCR
The Photo OCR Problem
Machine Learning Pipeline: A system with many
stages/components, several of which may use machine learning.
Sliding Window: PEDESTRIAN DETECTION
To detect pedestrians, a small window (frame) is slid across the
whole image.
The scan is repeated with windows of several sizes. Each captured
patch is resized to one fixed size and then passed to a neural
network, which decides whether it contains a pedestrian.
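The scan described above can be sketched as follows. This is a minimal, dependency-free illustration, not the course's implementation: the image is a plain list of rows, `classifier` is a hypothetical stand-in for the trained neural network, and the function names, step size, and nearest-neighbour resizing are all assumptions made for the example.

```python
def sliding_windows(height, width, win_h, win_w, step):
    """Yield the (top, left) corner of every win_h x win_w window that fits."""
    for top in range(0, height - win_h + 1, step):
        for left in range(0, width - win_w + 1, step):
            yield top, left


def crop(image, top, left, h, w):
    """Cut an h x w patch out of a row-major list-of-lists image."""
    return [row[left:left + w] for row in image[top:top + h]]


def resize_nearest(patch, out_h, out_w):
    """Resize a patch to out_h x out_w with nearest-neighbour sampling."""
    in_h, in_w = len(patch), len(patch[0])
    return [
        [patch[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def detect(image, classifier, window_sizes, out_size, step_fraction=0.25):
    """Scan the image at several window sizes; return the windows the classifier accepts."""
    height, width = len(image), len(image[0])
    out_h, out_w = out_size
    hits = []
    for win_h, win_w in window_sizes:
        # Step is a fraction of the window width (an assumed, tunable choice).
        step = max(1, int(win_w * step_fraction))
        for top, left in sliding_windows(height, width, win_h, win_w, step):
            patch = resize_nearest(crop(image, top, left, win_h, win_w), out_h, out_w)
            if classifier(patch):  # hypothetical stand-in for the neural network
                hits.append((top, left, win_h, win_w))
    return hits
```

A toy run might use an 8x8 image with one bright 4x2 block and a classifier that simply checks the mean brightness of the patch; a real system would replace that check with the trained network and would usually rely on an image library for cropping and resizing.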
SLIDING WINDOW DETECTION
For the text detection example:
The white regions show where text is detected.
The grey regions show where text is possible but the algorithm
has lower confidence.
➢In the expansion step (on the right), any pixel that lies within 5
pixels of a white pixel is also made white.
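A minimal sketch of the expansion step, assuming the detector output has already been thresholded into a binary mask (1 = white, 0 = black). The function name `expand` and the reading of "within 5 pixels" as a square neighbourhood are assumptions for illustration.

```python
def expand(mask, radius=5):
    """Return a new mask in which every pixel within `radius` pixels
    of a white pixel is white.

    "Within radius" is interpreted as a square neighbourhood
    (Chebyshev distance), which is an assumption of this sketch.
    """
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if mask[r][c]:
                # Paint the whole neighbourhood of this white pixel, clipped to the image.
                for rr in range(max(0, r - radius), min(height, r + radius + 1)):
                    for cc in range(max(0, c - radius), min(width, c + radius + 1)):
                        out[rr][cc] = 1
    return out
```

The effect is that nearby white fragments grow into each other, so the separate responses along a line of text merge into one connected region.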
Next, we keep only those white boxes whose aspect ratio is
plausible for text.
We then cut these regions out of the image and pass them to the
later stages of the pipeline.
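The filtering and cropping steps could look roughly like this. It is a sketch under stated assumptions: white regions are found as 4-connected components of a binary mask, "suitable for text" is approximated as "clearly wider than tall", and the `min_ratio` threshold and all function names are hypothetical.

```python
from collections import deque


def bounding_boxes(mask):
    """Return (top, left, height, width) for each 4-connected white region."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unseen white pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, bottom, left, right = r, r, c, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom - top + 1, right - left + 1))
    return boxes


def text_like(boxes, min_ratio=2.0):
    """Keep boxes whose width/height ratio is at least min_ratio (assumed threshold)."""
    return [box for box in boxes if box[3] / box[2] >= min_ratio]


def crop_regions(image, boxes):
    """Cut each (top, left, height, width) box out of the image."""
    return [[row[l:l + w] for row in image[t:t + h]] for t, l, h, w in boxes]
```

With a mask containing one wide 2x6 region and one tall 3x1 region, `text_like(bounding_boxes(mask))` keeps only the wide one, and `crop_regions` returns that patch for the next stage.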
Artificial Data Synthesis: amplifying the training set
Synthetic data is created by rendering letters in different fonts
and placing them on different backgrounds.
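A toy sketch of this idea, using only the standard library. Everything here is an assumption made for illustration: the two hand-drawn 5x5 bitmaps stand in for real fonts, the noisy grey patches stand in for real backgrounds, and `synthesize` is a hypothetical helper. A real pipeline would render actual font files with an image library.

```python
import random

# Two hand-drawn 5x5 bitmaps of the letter "T", standing in for two fonts
# (hypothetical data, not taken from any real font).
GLYPHS = {
    "T": [
        ["#####",
         "..#..",
         "..#..",
         "..#..",
         "..#.."],
        ["#####",
         "#.#.#",
         "..#..",
         "..#..",
         ".###."],
    ],
}


def random_background(size, rng):
    """Noisy grey patch: a random base level plus per-pixel jitter, clipped to [0, 1]."""
    base = rng.uniform(0.2, 0.8)
    return [
        [min(1.0, max(0.0, base + rng.uniform(-0.1, 0.1))) for _ in range(size)]
        for _ in range(size)
    ]


def synthesize(letter, rng, size=9):
    """Return (image, label): one glyph variant at a random offset on a random background."""
    glyph = rng.choice(GLYPHS[letter])
    image = random_background(size, rng)
    g_h, g_w = len(glyph), len(glyph[0])
    top = rng.randint(0, size - g_h)
    left = rng.randint(0, size - g_w)
    ink = rng.choice([0.0, 1.0])  # dark text or light text
    for r in range(g_h):
        for c in range(g_w):
            if glyph[r][c] == "#":
                image[top + r][left + c] = ink
    return image, letter


rng = random.Random(0)  # fixed seed so the toy dataset is reproducible
dataset = [synthesize("T", rng) for _ in range(100)]
```

Because the label is known at generation time, every synthetic image comes with a correct label for free, which is what makes this a cheap way to enlarge the training set.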
Ceiling Analysis: What part of the pipeline to work on next?
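The arithmetic behind ceiling analysis can be sketched as below. The idea is to replace each stage's output with ground truth, one stage at a time in pipeline order, and record how much the overall accuracy rises. The stage names and all accuracy figures here are illustrative assumptions, not measurements from these notes.

```python
def ceiling_analysis(baseline, with_perfect_stage):
    """Return [(stage, gain)]: the accuracy gained by making each stage perfect.

    `with_perfect_stage` lists (stage, overall accuracy) in pipeline order,
    each measured with that stage and all earlier stages replaced by
    ground-truth output.
    """
    gains = []
    previous = baseline
    for stage, accuracy in with_perfect_stage:
        gains.append((stage, round(accuracy - previous, 4)))
        previous = accuracy
    return gains


# Illustrative numbers only (assumed for this example).
baseline = 0.72
measurements = [
    ("text detection", 0.89),
    ("character segmentation", 0.90),
    ("character recognition", 1.00),
]
gains = ceiling_analysis(baseline, measurements)
best_stage = max(gains, key=lambda g: g[1])[0]
```

With these assumed figures, perfecting the first stage would add 17 points of accuracy while perfecting the second would add only 1, so effort is best spent on the first stage.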
Machine Learning by Stanford University on Coursera. Certificate
earned on Friday, April 12, 2019 9:43 AM GMT
coursera.org/verify/4VW5AT4B38TZ

