© Vigen Sahakyan 2016
Content Based Image Retrieval by
Deep Learning
Agenda
● Goals
● What is CBIR?
● What is Deep Learning?
● AutoEncoder
● Tool description
Goals
● We want to build an image search system, based on machine learning
techniques, that searches by image content. Such a system has many
applications in public safety, the military, medical diagnosis, etc.
● The modern web holds millions or even billions of unlabeled images but
only a few thousand labeled ones. The problem is how to use the power of
this unlabeled data in our system.
● In this presentation we describe our CBIR system, which extracts
meaningful information from unlabeled data using a widely used
Deep Learning technique called the AutoEncoder.
What is CBIR?
● Content Based Image Retrieval (CBIR) is the process of searching for
images that are similar to a given image.
● "Content-based" means that the search analyzes the contents of the image
rather than the metadata such as keywords, tags, or descriptions associated
with the image.
● It is one of the open problems in Computer Vision.
● It has many applications in fields such as public safety, the military,
medical diagnosis, robotics, etc.
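As a rough illustration of searching by content, the sketch below ranks stored images by the cosine similarity of their feature vectors to a query vector. The function name `rank_by_similarity`, the choice of cosine similarity, and the toy vectors are assumptions made for illustration; the slides do not say how similarity is measured.

```python
import numpy as np

def rank_by_similarity(query, database):
    """Return database row indices sorted from most to least similar to query.

    query: 1-D feature vector; database: 2-D array, one feature vector per row.
    Cosine similarity is an assumed choice of similarity measure.
    """
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    scores = db @ q                      # cosine similarity of each row to the query
    return np.argsort(-scores)           # highest score first

# Toy feature vectors (hypothetical): row 1 points in nearly the same direction as the query.
database = np.array([[1.0, 0.0, 0.0],
                     [0.9, 0.1, 0.0],
                     [0.0, 1.0, 0.0]])
query = np.array([0.8, 0.2, 0.0])
order = rank_by_similarity(query, database)
```

Here `order` lists the stored images from best to worst match. In a full system the feature vectors would come from the image content itself rather than from tags or keywords.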
What is Deep Learning?
1. Deep learning is a branch of machine learning based on a set of algorithms that attempt to model
high-level abstractions in data by using multiple processing layers.
2. It is used in machine learning to discover high-level features automatically.
3. With Deep Learning we can extract high-level features such as shape, texture, and contrast from image
datasets (the images do not need to be labeled).
4. There are many Deep Learning algorithms, such as Convolutional and Recursive Neural
Networks, Deep Belief Networks, and Restricted Boltzmann Machines. In this work we
used an AutoEncoder.
5. It has many applications in fields such as computer vision, search engines, speech
recognition, and artificial intelligence.
AutoEncoder
● The aim of an autoencoder is to learn a representation (encoding) for a set of data,
typically for the purpose of dimensionality reduction.
● Recently, the autoencoder concept has become more widely used for learning
generative models of data.
● The AutoEncoder is also a Neural Network. The difference is that it is
trained with unsupervised learning: the target output is the input vector
itself, so no labels are needed. The difference between the output vector
and the input vector is the error used for backpropagation. The network
tries to learn a compact code (the encoded value) in its hidden layer.
● Input ≈ Decode(Encode(Input))
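The idea of training a network to reproduce its own input can be sketched in a few lines of NumPy. This is a minimal sketch, not the authors' implementation: the single hidden layer, the squared-error loss, the learning rate, and the toy one-hot data are all assumptions chosen to keep the example small.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_autoencoder(X, n_hidden, epochs=200, lr=0.5, seed=0):
    """Train a one-hidden-layer sigmoid autoencoder with per-sample SGD.

    X: array of shape (n_samples, n_inputs) with values in [0, 1].
    Returns the parameters and the mean reconstruction error per epoch.
    """
    rng = np.random.default_rng(seed)
    n_inputs = X.shape[1]
    W1 = rng.normal(0.0, 0.1, (n_inputs, n_hidden))   # encoder weights
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_inputs))   # decoder weights
    b2 = np.zeros(n_inputs)
    history = []
    for _ in range(epochs):
        total = 0.0
        for i in rng.permutation(len(X)):
            x = X[i]
            h = sigmoid(x @ W1 + b1)        # Encode(Input)
            y = sigmoid(h @ W2 + b2)        # Decode(Encode(Input))
            err = y - x                     # the target is the input itself
            total += 0.5 * np.sum(err ** 2)
            # Backpropagate the reconstruction error through both layers.
            d_out = err * y * (1.0 - y)
            d_hid = (d_out @ W2.T) * h * (1.0 - h)
            W2 -= lr * np.outer(h, d_out)
            b2 -= lr * d_out
            W1 -= lr * np.outer(x, d_hid)
            b1 -= lr * d_hid
        history.append(total / len(X))
    return (W1, b1, W2, b2), history

# Toy data (hypothetical): 8-dimensional one-hot vectors compressed to 3 hidden units.
X = np.eye(8)
params, history = train_autoencoder(X, n_hidden=3)
```

The list `history` records how the reconstruction error changes over the epochs. No labels appear anywhere in the training loop, which is what makes the procedure unsupervised.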
Tool description
1. First, the web service receives a raw image (.jpg, .png, etc.) and passes it to
the preprocessing step.
2. Preprocess the raw image:
a. Resize the image to the model's input size.
b. Generate a grayscale representation of the resized image.
3. Generate a row vector from the pixels of the preprocessed image.
4. Call the Normalization module.
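The preprocessing steps above can be sketched with NumPy alone. The sketch assumes the image has already been decoded into an RGB array, uses nearest-neighbour resizing and standard luminance weights, and picks 28x28 as the model size because that matches MNIST; none of these choices is stated in the slides, and the function name `preprocess` is hypothetical.

```python
import numpy as np

def preprocess(image, size=(28, 28)):
    """Resize an RGB image, convert it to grayscale, and flatten it to a row vector.

    image: uint8 array of shape (height, width, 3), already decoded from .jpg/.png.
    size: target (rows, cols); 28x28 is an assumed model size.
    """
    rows, cols = size
    h, w = image.shape[:2]
    # Nearest-neighbour resize: pick one source pixel for each target position.
    r_idx = np.arange(rows) * h // rows
    c_idx = np.arange(cols) * w // cols
    resized = image[r_idx][:, c_idx]
    # Luminance-weighted grayscale (ITU-R BT.601 weights).
    gray = resized[..., :3] @ np.array([0.299, 0.587, 0.114])
    return gray.reshape(1, -1)           # row vector of shape (1, rows * cols)

# Hypothetical decoded image: 56x56 RGB with random pixel values.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(56, 56, 3), dtype=np.uint8)
row = preprocess(image)
```

A production service would more likely use an image library with proper interpolation for the resize; nearest-neighbour sampling is used here only to keep the example dependency-free.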
Tool description
Every neuron applies the sigmoid function to its value, so it is useful to
normalize the inputs: this helps training converge faster and lowers the
error rate.
1. We apply Min-Max normalization to the input values using the following
formula: $z_i = \frac{x_i - \min(x)}{\max(x) - \min(x)}$
2. In our case pixel values range from 0 to 255, so $z_i = x_i / 255$.
3. Call the Encoding module.
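A minimal sketch of the Min-Max formula follows. The function name `min_max_normalize` and its optional `lo`/`hi` arguments are illustrative assumptions; passing `lo=0, hi=255` reproduces the fixed pixel-range case, while omitting them uses the minimum and maximum of the data itself.

```python
import numpy as np

def min_max_normalize(x, lo=None, hi=None):
    """Scale x to [0, 1] using z = (x - lo) / (hi - lo).

    lo and hi default to the data minimum and maximum.
    """
    x = np.asarray(x, dtype=np.float64)
    lo = x.min() if lo is None else lo
    hi = x.max() if hi is None else hi
    if hi == lo:                         # constant input: avoid division by zero
        return np.zeros_like(x)
    return (x - lo) / (hi - lo)

pixels = np.array([0, 51, 102, 255])
z = min_max_normalize(pixels, lo=0, hi=255)   # same as pixels / 255
```

With a fixed range of 0 to 255 every image is scaled the same way, whereas the data-derived range would stretch a low-contrast image to fill the whole interval.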
Tool description
We pretrained our AutoEncoder model with stochastic gradient descent on a
dataset of 60,000 unlabeled images of handwritten digits. During training the
AutoEncoder learned many high-level features of those images.
1. We feed the normalized row vector to the AutoEncoder and get a more
compact feature vector (each element represents the probability that a given
high-level feature is present in the image).
2. We pass the compact vector to the Classifier module. (There is no need to
normalize this vector, because passing through the sigmoid function has
already put its values in the range 0 to 1.)
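The encoding step, and the reason its output needs no further normalization, can be sketched as follows. The weights here are random stand-ins rather than a trained model, and the sizes (784 inputs, 64 hidden units) are assumptions; the slides do not give the hidden-layer size.

```python
import numpy as np

def encode(x, W, b):
    """Map a normalized row vector to the hidden-layer code: sigmoid(x W + b)."""
    return 1.0 / (1.0 + np.exp(-(x @ W + b)))

# Hypothetical, untrained parameters: 784 inputs (28x28 pixels) -> 64 hidden units.
rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, (784, 64))
b = np.zeros(64)
x = rng.random((1, 784))                 # stands in for a normalized image row vector
code = encode(x, W, b)                   # compact feature vector of shape (1, 64)
```

Because the sigmoid maps every value into the open interval (0, 1), the compact vector is already on the same scale as the normalized input.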
Tool description
We pretrained our Neural Network classifier on several thousand labeled
examples that had been passed through the AutoEncoder.
1. We feed the row vector encoded by the AutoEncoder to the classifier
and call the Result retrieval module to determine the
result class from the output layer.
Tool description
Each node in the output layer holds the probability that its class is the
correct output.
1. If the probability of one of the output classes is greater than the
threshold (0.5), that class is taken as the result class.
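The threshold rule can be sketched as a small function. The name `result_class` is hypothetical, and returning `None` when no class clears the threshold, as well as taking the most probable class when several do, are assumptions; the slides only state the 0.5 threshold.

```python
import numpy as np

def result_class(probabilities, threshold=0.5):
    """Return the index of the most probable class if it exceeds threshold, else None."""
    probabilities = np.asarray(probabilities)
    best = int(np.argmax(probabilities))
    return best if probabilities[best] > threshold else None

confident = result_class([0.02, 0.91, 0.07])   # class 1 clears the threshold
uncertain = result_class([0.40, 0.35, 0.25])   # no class clears the threshold
```

Returning no class for low-confidence outputs lets the caller report that the image was not recognized instead of forcing a guess.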
Result
We tested our algorithm on the MNIST handwritten digit dataset and
compared it with the results of a couple of well-known papers.

Algorithm                    MNIST
Our algorithm                95%
Yann LeCun algorithm         95.3%
Aurelio Ranzato algorithm    99%
