Information from
Pixels
Dave Snowdon
@davesnowdon
https://github.com/davesnowdon/ljc-information-from-pixels
http://www.slideshare.net/DaveSnowdon1/information-from-pixels
Summary
• Why? What?
• Range operations and colour spaces
• Kernels & convolution
• Object detection
• Contours
• Conclusion
Why me?
• Social robotics developer
• Social robots need to handle unstructured
environments
• Vision is the most versatile way of sensing the
environment
Most general purpose
sensor
Machine vision
• Tracking movement: Dyson 360, Google Tango
• Recognising people, biometric security
• Recognising medication
• Image search
• …
Why this is hard
https://adeshpande3.github.io/adeshpande3.github.io/A-Beginner's-Guide-To-Understanding-Convolutional-Neural-Networks/
Why this is hard
• Colour reproduction, lighting & white balance
• Perspective & rotation effects
• Noise
• Different scales
Rotations & perspective
OpenCV
The good news
• Open source
• Tried and tested
• Large collections of algorithms
• Language bindings for C, Python & Java
• Runs on pretty much anything (Linux, Mac,
Windows, Android, iOS, Raspberry Pi)
The less good news
• Native code
• Java API is a bit clunky
• Not much structure
• Not the new shiny
Range operations &
colour spaces
RGB
https://en.wikipedia.org/wiki/RGB_color_model#/media/File:RGB_color_solid_cube.png
HSV
https://upload.wikimedia.org/wikipedia/commons/a/a0/Hsl-hsv_models.svg
L*a*b* / CIELAB
https://gurus.pyimagesearch.com/wp-content/uploads/2015/03/color_spaces_lab_axis.jpg
Blob detection
Get an image
• From a Java image
• From video / webcam
org.opencv.videoio.VideoCapture
• From file
import org.opencv.core.Mat;
import org.opencv.imgcodecs.Imgcodecs;
Mat image = Imgcodecs.imread(filename);
org.opencv.core.Mat
new Mat(numRows, numColumns, CvType.CV_8UC3);
• Dense multi-dimensional matrix
• Variants with int, double, byte values
• Implements basic matrix operations
B G R B G R B G R B G R
B G R B G R B G R B G R
B G R B G R B G R B G R
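The interleaved B G R layout above can be sketched in plain Java. This is an illustration only, not OpenCV's own storage code: the class `BgrLayout` and its `index` helper are hypothetical names, and the sketch assumes a continuous 8-bit, 3-channel buffer with no row padding.

```java
final class BgrLayout {
    static final int CHANNELS = 3; // stored in the order B, G, R

    /** Index of a channel (0=B, 1=G, 2=R) of pixel (row, col) in an interleaved buffer. */
    static int index(int row, int col, int channel, int numColumns) {
        return (row * numColumns + col) * CHANNELS + channel;
    }

    public static void main(String[] args) {
        int rows = 3, cols = 4;
        byte[] pixels = new byte[rows * cols * CHANNELS];
        // set pixel (1, 2) to pure red: B=0, G=0, R=255
        pixels[index(1, 2, 2, cols)] = (byte) 255;
        // Java bytes are signed, so mask to recover the 0..255 value
        int red = pixels[index(1, 2, 2, cols)] & 0xFF;
        System.out.println("red = " + red);
    }
}
```

The masking with `0xFF` matters whenever pixel bytes are read back in Java, because `byte` is signed.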
Blur the image
Imgproc.GaussianBlur(image, result,
new Size(kernelSize, kernelSize),
0.0);
Convert to HSV
Imgproc.cvtColor(input, hsv,
Imgproc.COLOR_BGR2HSV);
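To make the conversion less of a black box, here is a minimal plain-Java sketch of the usual BGR to HSV formula for a single 8-bit pixel. It assumes OpenCV's documented 8-bit convention (hue halved to fit 0..179, saturation and value scaled to 0..255); the class name `BgrToHsv` is hypothetical and the rounding may differ slightly from OpenCV's optimised implementation.

```java
final class BgrToHsv {
    /** Converts one 8-bit BGR pixel to {H, S, V} with H in 0..179 and S, V in 0..255. */
    static int[] convert(int b, int g, int r) {
        int max = Math.max(r, Math.max(g, b));
        int min = Math.min(r, Math.min(g, b));
        int delta = max - min;

        double hue; // in degrees, 0..360
        if (delta == 0) {
            hue = 0.0; // grey: hue is undefined, use 0
        } else if (max == r) {
            hue = 60.0 * (g - b) / delta;
        } else if (max == g) {
            hue = 120.0 + 60.0 * (b - r) / delta;
        } else {
            hue = 240.0 + 60.0 * (r - g) / delta;
        }
        if (hue < 0) {
            hue += 360.0;
        }

        int h = (int) Math.round(hue / 2.0) % 180; // halve so the hue fits in a byte
        int s = max == 0 ? 0 : (int) Math.round(255.0 * delta / max);
        return new int[] {h, s, max};
    }
}
```

Because hue is a single channel, a colour such as "green" becomes a narrow hue interval, which is what makes the range test on the next slide practical.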
Select only pixels in range
Core.inRange(image, low, high, result);
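What the range test does per pixel can be sketched in plain Java. This is an illustrative re-implementation, not OpenCV code: `ColourRange` is a hypothetical class, pixels are held as `int[]` triples rather than a `Mat`, and the bounds are inclusive, matching the documented behaviour of `inRange`.

```java
final class ColourRange {
    /** Returns a mask with 255 where every channel lies in [low, high], otherwise 0. */
    static int[] inRange(int[][] pixels, int[] low, int[] high) {
        int[] mask = new int[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            boolean inside = true;
            for (int c = 0; c < low.length; c++) {
                if (pixels[i][c] < low[c] || pixels[i][c] > high[c]) {
                    inside = false; // one channel out of range rejects the pixel
                    break;
                }
            }
            mask[i] = inside ? 255 : 0;
        }
        return mask;
    }
}
```

With HSV input, `low` and `high` would typically bracket a hue interval while leaving saturation and value fairly wide.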
Erode & Dilate
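final Mat se = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE,
new Size(kernelSize, kernelSize));
// erode to remove small specks
Imgproc.erode(image, result, se, new Point(-1, -1), numIterations);
// dilate the eroded result to restore large areas and close gaps
Imgproc.dilate(result, result, se, new Point(-1, -1), numIterations);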
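The effect of eroding and then dilating a binary mask can be sketched in plain Java. This is a simplified illustration under stated assumptions: a fixed 3x3 square structuring element (the slide uses an ellipse of configurable size), a single iteration, and out-of-bounds neighbours ignored. `Morphology` is a hypothetical class name.

```java
final class Morphology {
    /** 3x3 erosion (erode=true) or dilation (erode=false) of a binary mask. */
    static int[][] apply(int[][] mask, boolean erode) {
        int rows = mask.length, cols = mask[0].length;
        int[][] out = new int[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                boolean all = true, any = false;
                for (int dr = -1; dr <= 1; dr++) {
                    for (int dc = -1; dc <= 1; dc++) {
                        int rr = r + dr, cc = c + dc;
                        if (rr < 0 || rr >= rows || cc < 0 || cc >= cols) {
                            continue; // neighbours outside the image are ignored
                        }
                        if (mask[rr][cc] > 0) {
                            any = true;
                        } else {
                            all = false;
                        }
                    }
                }
                // erosion keeps a pixel only if ALL neighbours are set,
                // dilation sets it if ANY neighbour is set
                out[r][c] = (erode ? all : any) ? 255 : 0;
            }
        }
        return out;
    }
}
```

Calling `apply(apply(mask, true), false)` removes isolated specks while restoring larger blobs to roughly their original size.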
Find contours
Imgproc.findContours(image, contours, new Mat(),
Imgproc.RETR_EXTERNAL,
Imgproc.CHAIN_APPROX_SIMPLE);
Find largest contour
contours.stream()
.max((c1, c2) -> Double.compare(
Imgproc.contourArea(c1), Imgproc.contourArea(c2)))
.get();
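The same "largest by area" selection can be sketched without OpenCV. Here the area of a closed polygon comes from the shoelace formula, standing in for `contourArea`; `LargestPolygon` and its methods are hypothetical names, and polygons are plain arrays of {x, y} vertices rather than `MatOfPoint`.

```java
import java.util.Comparator;
import java.util.List;

final class LargestPolygon {
    /** Area of a closed polygon given as {x, y} vertices, via the shoelace formula. */
    static double area(double[][] pts) {
        double twiceArea = 0.0;
        for (int i = 0; i < pts.length; i++) {
            double[] p = pts[i];
            double[] q = pts[(i + 1) % pts.length]; // wrap around to close the polygon
            twiceArea += p[0] * q[1] - q[0] * p[1];
        }
        return Math.abs(twiceArea) / 2.0;
    }

    /** Returns the polygon with the largest area, or null if the list is empty. */
    static double[][] largest(List<double[][]> polygons) {
        return polygons.stream()
                .max(Comparator.comparingDouble(LargestPolygon::area))
                .orElse(null);
    }
}
```

Returning `null` (or keeping the `Optional`) for an empty list avoids the exception that a bare `.get()` throws when no contours were found.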
Draw contour (for demo)
Imgproc.circle(image, centre, 5, CENTRE_COLOUR, 2);
Imgproc.drawContours(image,
Arrays.asList(contour), 0, OUTLINE_COLOUR, 2);
Output image
• Don’t always need to
• Grab region of interest
Mat roi = mat.submat(rect);
• Convert to java image
BufferedImage javaImage = Util.matrixToImage(mat);
Util.displayImage(command, javaImage);
• Write to file
Imgcodecs.imwrite(filename, mat);
Built-in blob detection
• OpenCV has built-in blob detection:
SimpleBlobDetector
• Blob detection by colour may not work
• Blog post: https://www.learnopencv.com/blob-detection-using-opencv-python-c/
Kernels & Convolution
Convolution
//developer.apple.com/library/mac/documentation/Performance/Conceptual/vImage/ConvolutionOperations/ConvolutionOperations
Example kernels
Gaussian
Example kernels
Laplacian
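Applying a kernel to an image can be sketched in a few lines of plain Java. This is an illustration, not OpenCV's implementation: `Convolve` is a hypothetical class, borders are handled by clamping (OpenCV defaults to a reflected border), and the kernel is applied without flipping, which gives the same result for symmetric kernels such as the two shown. The 3x3 Gaussian and Laplacian values are the commonly used textbook ones.

```java
final class Convolve {
    static final double[][] GAUSSIAN_3X3 = {
        {1 / 16.0, 2 / 16.0, 1 / 16.0},
        {2 / 16.0, 4 / 16.0, 2 / 16.0},
        {1 / 16.0, 2 / 16.0, 1 / 16.0}};

    static final double[][] LAPLACIAN = {{0, 1, 0}, {1, -4, 1}, {0, 1, 0}};

    /** Applies a square, odd-sized kernel; pixels beyond the border are clamped to the edge. */
    static double[][] filter(double[][] image, double[][] kernel) {
        int rows = image.length, cols = image[0].length, half = kernel.length / 2;
        double[][] out = new double[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                double sum = 0.0;
                for (int kr = -half; kr <= half; kr++) {
                    for (int kc = -half; kc <= half; kc++) {
                        int rr = Math.min(Math.max(r + kr, 0), rows - 1);
                        int cc = Math.min(Math.max(c + kc, 0), cols - 1);
                        sum += image[rr][cc] * kernel[kr + half][kc + half];
                    }
                }
                out[r][c] = sum;
            }
        }
        return out;
    }
}
```

The Gaussian weights sum to 1, so flat regions keep their brightness, while the Laplacian weights sum to 0, so flat regions map to zero and only changes in intensity produce a response.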
Detecting blurred
images
Detecting blurred images
• Want to discard images that are unlikely to be of
use
• The more blurred an image is the fewer sharp
edges will be found
• What happens to the Laplacian of an image as it’s
blurred…
Input image
Grayscale + Laplacian
3x3 Gaussian kernel
5x5 Gaussian kernel
7x7 Gaussian kernel
13x13 kernel
19x19 kernel
Variance of the Laplacian
Code
// apply laplacian to grayscale copy of image
Imgproc.Laplacian(gray, laplacian, CvType.CV_64F);
// determine variance
MatOfDouble mean = new MatOfDouble();
MatOfDouble stddev = new MatOfDouble();
Core.meanStdDev(laplacian, mean, stddev);
double sd = stddev.toList().get(0);
double var = sd * sd;
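The variance step, and the decision it feeds, can be sketched in plain Java. `BlurCheck` is a hypothetical class and the threshold is an assumption that has to be tuned per camera and scene; the slides do not give a value. The variance here is the population variance, which is what squaring the standard deviation from `meanStdDev` yields.

```java
final class BlurCheck {
    /** Population variance of all values in a filter response. */
    static double variance(double[][] response) {
        double sum = 0.0, sumOfSquares = 0.0;
        int n = 0;
        for (double[] row : response) {
            for (double v : row) {
                sum += v;
                sumOfSquares += v * v;
                n++;
            }
        }
        double mean = sum / n;
        return sumOfSquares / n - mean * mean;
    }

    /** Treats an image as blurred when its Laplacian varies less than the threshold. */
    static boolean isBlurred(double[][] laplacian, double threshold) {
        return variance(laplacian) < threshold;
    }
}
```

A sharp image has strong positive and negative Laplacian responses around edges and therefore a high variance; blurring flattens those responses towards zero.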
Line following
Detect the line
What we want to do
Kernel to detect vertical lines
-1 2 -1
Mat kernel = new Mat(1, 3, CvType.CV_64F);
double[] kernel_values = {-1.0, 2.0, -1.0};
kernel.put(0, 0, kernel_values);
Convolve image with kernel
Imgproc.filter2D(gray, convolved, -1, kernel);
Threshold
Imgproc.threshold(convolved, thresh, 45.0, 255, Imgproc.
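The kernel and threshold steps can be combined in a plain-Java sketch that works on a single image row. Assumptions to note: `LineDetector` is a hypothetical class, the threshold is treated as a simple binary one (the threshold type on the slide is cut off), and the first and last columns are left at 0 rather than using a border rule.

```java
final class LineDetector {
    /** Convolves one row with [-1, 2, -1] and thresholds the response to 0 or 255. */
    static int[] detect(int[] row, double threshold) {
        int[] out = new int[row.length];
        for (int c = 1; c < row.length - 1; c++) {
            // bright pixel with darker neighbours on both sides gives a large positive response
            double response = -row[c - 1] + 2.0 * row[c] - row[c + 1];
            out[c] = response > threshold ? 255 : 0;
        }
        return out;
    }
}
```

For example, `detect(new int[] {10, 10, 200, 10, 10}, 45.0)` marks only the middle column, while a uniformly lit row produces no response at all.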
Result
Object detection
Sliding window
http://www.pyimagesearch.com/2015/03/23/sliding-windows-for-object-detection-with-python-and-opencv/
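A sliding window is just a pair of nested loops over window positions. The sketch below is an illustration with hypothetical names (`SlidingWindow`, `positions`); the window size and step are parameters chosen by the caller, and a real detector would run a classifier on each position and repeat the scan at several image scales.

```java
import java.util.ArrayList;
import java.util.List;

final class SlidingWindow {
    /** Returns the {x, y} top-left corner of every window position that fits in the image. */
    static List<int[]> positions(int imageW, int imageH, int winW, int winH, int step) {
        List<int[]> result = new ArrayList<>();
        for (int y = 0; y + winH <= imageH; y += step) {
            for (int x = 0; x + winW <= imageW; x += step) {
                result.add(new int[] {x, y});
            }
        }
        return result;
    }
}
```

Even a small step over a modest image produces thousands of windows, which is why the per-window cost of the detector matters so much.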
Haar features
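A Haar-like feature is a difference of rectangle sums, and an integral image (the technique used in the Viola-Jones paper) makes each rectangle sum cost four array lookups. The following plain-Java sketch uses hypothetical names (`HaarFeature`, `twoRectHorizontal`) and assumes a two-rectangle feature whose left half is the white region and right half the black region.

```java
final class HaarFeature {
    /** Integral image with an extra zero row and column: ii[r][c] = sum of img[0..r-1][0..c-1]. */
    static long[][] integral(int[][] img) {
        int rows = img.length, cols = img[0].length;
        long[][] ii = new long[rows + 1][cols + 1];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                ii[r + 1][c + 1] = img[r][c] + ii[r][c + 1] + ii[r + 1][c] - ii[r][c];
            }
        }
        return ii;
    }

    /** Sum of the pixels in a rectangle, using four lookups. */
    static long rectSum(long[][] ii, int top, int left, int height, int width) {
        int bottom = top + height, right = left + width;
        return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left];
    }

    /** Two-rectangle feature: sum under the black (right) half minus sum under the white (left) half. */
    static long twoRectHorizontal(long[][] ii, int top, int left, int height, int width) {
        int half = width / 2;
        long white = rectSum(ii, top, left, height, half);
        long black = rectSum(ii, top, left + half, height, half);
        return black - white;
    }
}
```

The integral image is computed once per frame, after which any feature at any position or size can be evaluated in constant time.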
Boosting
• Evaluate all features on every training example
• For each feature find the best threshold which
distinguishes positive from negative
• Select features with minimum error rate
• Final classifier is weighted sum of these weak
classifiers
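The "best threshold per feature" step can be sketched as a brute-force decision stump in plain Java. This is a simplified illustration, not OpenCV's training code: `WeakClassifier` is a hypothetical class, every observed feature value is tried as a candidate threshold, and a polarity of +1 or -1 decides which side of the threshold counts as positive.

```java
final class WeakClassifier {
    /**
     * Finds the threshold and polarity with the lowest weighted error for one feature.
     * Returns {threshold, polarity, error}; a sample is predicted positive when
     * polarity * value < polarity * threshold.
     */
    static double[] bestThreshold(double[] values, boolean[] positive, double[] weights) {
        double bestError = Double.MAX_VALUE, bestThreshold = 0.0, bestPolarity = 1.0;
        for (double candidate : values) {
            for (double polarity : new double[] {1.0, -1.0}) {
                double error = 0.0;
                for (int i = 0; i < values.length; i++) {
                    boolean predicted = polarity * values[i] < polarity * candidate;
                    if (predicted != positive[i]) {
                        error += weights[i]; // misclassified samples contribute their weight
                    }
                }
                if (error < bestError) {
                    bestError = error;
                    bestThreshold = candidate;
                    bestPolarity = polarity;
                }
            }
        }
        return new double[] {bestThreshold, bestPolarity, bestError};
    }
}
```

Boosting repeats this for every feature, keeps the feature with the lowest error, then increases the weights of the samples it got wrong before the next round.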
Cascade
• Hugely expensive to compute all features on every
window location
• Group features into different stages with smaller
number of features
• Only proceed to next stage when previous stage
passes
• In Viola-Jones paper as few as 10 features out of
6000 might be evaluated per window
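The early-rejection idea behind the cascade can be sketched in plain Java. The names `Cascade`, `Stage` and `accepts` are hypothetical, and each stage is reduced to a scoring function plus a pass threshold; in a real cascade the score would be the weighted sum of that stage's weak classifiers.

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

final class Cascade {
    /** One stage: a scoring function over the window plus the minimum score needed to pass. */
    static final class Stage {
        final ToDoubleFunction<int[][]> score;
        final double threshold;

        Stage(ToDoubleFunction<int[][]> score, double threshold) {
            this.score = score;
            this.threshold = threshold;
        }
    }

    /** Returns true only if the window passes every stage, stopping at the first failure. */
    static boolean accepts(List<Stage> stages, int[][] window) {
        for (Stage stage : stages) {
            if (stage.score.applyAsDouble(window) < stage.threshold) {
                return false; // rejected early: later, more expensive stages never run
            }
        }
        return true;
    }
}
```

Since most windows in an image contain no object, putting the cheapest stages first means most windows are discarded after only a handful of feature evaluations.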
Pre-trained classifiers
• front face
• profile face
• Full body
• Upper body
• Lower body
• Left & right eyes (one classifier each for left & right)
• Smile
• Front cat face
• Russian license plate
Using a classifier
// create classifier object from XML definition
final CascadeClassifier faceClassifier =
new CascadeClassifier(classifierFilename);
// apply classifier to get list of matching regions
final MatOfRect mor = new MatOfRect();
faceClassifier.detectMultiScale(image, mor);
List<Rect> result = mor.toList();
Front face detection
Training your own classifier
How to train
• Create sample vectors from text files listing +ve & -ve images
• opencv_createsamples -info positives.txt -num 68 -w 60 -h 98 -vec nao.vec
• Train
• Haar: opencv_traincascade -data classifier -vec samples.vec -bg negatives.txt -numStages 20 -minHitRate 0.999 -maxFalseAlarmRate 0.5 -numPos 1000 -numNeg 600 -w 60 -h 98 -mode ALL -precalcValBufSize 1024 -precalcIdxBufSize 1024
• LBP: opencv_traincascade -data classifier.lbp -vec samples.vec -bg negatives.txt -numStages 20 -minHitRate 0.999 -maxFalseAlarmRate 0.5 -numPos 1000 -numNeg 600 -w 60 -h 98 -featureType LBP -precalcValBufSize 1024 -precalcIdxBufSize 1024
Training docs & tutorials
• http://docs.opencv.org/trunk/dc/d88/tutorial_traincascade.html
• http://coding-robin.de/2013/07/22/train-your-own-opencv-haar-classifier.html
Results
More uses for
contours
Detecting geometric shapes
Find contours
// use Canny edge detector on blurred grayscale image
Imgproc.Canny(blurred, edges, 75, 200);
// find contours in the edge image
Imgproc.findContours(edges, contours, new Mat(),
Imgproc.RETR_EXTERNAL,
Imgproc.CHAIN_APPROX_SIMPLE);
Type conversion
// need to convert the contour from a MatOfPoint to MatOfPoint2f
final MatOfPoint2f m2f = new MatOfPoint2f();
m2f.fromList(contour.toList());
Approximate shapes
// approximate contour polygon with 1% or less difference in perimeter
double perimeter = Imgproc.arcLength(m2f, true);
MatOfPoint2f approx = new MatOfPoint2f();
Imgproc.approxPolyDP(m2f, approx, 0.01 * perimeter,
true);
// check number of line segments
int numSides = approx.toList().size();
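Once the number of sides is known, turning it into a shape label is a simple lookup. The sketch below is plain Java with hypothetical names (`ShapeNamer`, `perimeter`, `name`); the rule that anything with more than six sides is treated as circle-like is an assumption for illustration, not something the slides specify.

```java
final class ShapeNamer {
    /** Perimeter of a closed polygon given as {x, y} vertices. */
    static double perimeter(double[][] pts) {
        double total = 0.0;
        for (int i = 0; i < pts.length; i++) {
            double[] p = pts[i];
            double[] q = pts[(i + 1) % pts.length]; // wrap around to close the polygon
            total += Math.hypot(q[0] - p[0], q[1] - p[1]);
        }
        return total;
    }

    /** Maps the vertex count of an approximated contour to a rough shape name. */
    static String name(int numSides) {
        switch (numSides) {
            case 3: return "triangle";
            case 4: return "quadrilateral";
            case 5: return "pentagon";
            case 6: return "hexagon";
            default: return numSides > 6 ? "circle-like" : "unknown";
        }
    }
}
```

The tolerance passed to the approximation scales with the perimeter, so a contour with perimeter 12 and the 1% factor used above would be simplified with a tolerance of 0.12 pixels.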
More information
• OpenCV docs: http://docs.opencv.org/3.1.0/
• Useful blogs:
• http://www.pyimagesearch.com
• https://www.learnopencv.com
• https://opencv-java-tutorials.readthedocs.io/en/latest/
• Code for examples:
https://github.com/davesnowdon/ljc-information-from-pixels
Summary
• Colour spaces: RGB, HSV, L*a*b*
• masking images using colour ranges
• Finding outline of objects using contours
• Convolution
• Using cascade classifiers to detect objects

Editor's Notes

  • #3: The aim is to give a basic introduction to machine vision and some of its basic techniques using Java. Not going to talk about deep learning: there are plenty of other introductions to that. Not going to talk about photogrammetry: it is a very specialised subject.
  • #6: Google, Apple and Facebook use machine vision to recognise faces and match photos in photo albums
  • #14: - Additive model: red, green & blue light added together to produce different colours - creating colours is unintuitive - device dependent, no guarantee colours will look the same on different devices - sRGB is a standard colour space produced by HP & Microsoft in 1996 to allow consistent reproduction - should still use colour calibration for accurate results
  • #15: - defines a colour using values for hue, saturation & value (brightness) - often easier than RGB when creating colours - device dependent
  • #16: - distances in other colour spaces don't correspond to how perceptually different colours look to humans - device independent (often used when converting from RGB to CMYK)
  • #17: - Most machine vision operates on grayscale images - Sometimes colour can be useful if you know the colour and aren't able to train an object detector
  • #19: http://docs.opencv.org/2.4/modules/core/doc/basic_structures.html#mat
  • #20: Blurring the image smooths it out and helps remove noise. Using a sigma of zero causes it to be computed from the kernel width & height.
  • #23:
    // erode to remove small specks
    // A foreground pixel in the input image will be kept only if ALL pixels inside the structuring element are > 0.
    // Otherwise, the pixels are set to 0 (i.e. background).
    final Mat se = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, new Size(kernelSize, kernelSize));
    Imgproc.erode(image, result, se, new Point(-1, -1), numIterations);
    // dilate to restore large areas and remove gaps
    // Dilations, just like erosions, also use structuring elements: a centre pixel p of the structuring element
    // is set to white if ANY pixel in the structuring element is > 0.
    Imgproc.dilate(result, result, se, new Point(-1, -1), numIterations);
  • #30: You may have heard the term convolutional neural networks in conjunction with deep learning. It’s a similar idea: convolutional layers apply a kernel or filter to the layer’s input. The difference is that the filter is learned as part of the training process. In CNNs a convolutional layer is typically followed by a pooling layer which allows a fixed output vector size and a degree of position independence.
  • #42: The vertical scale is a log scale, so the drop-off after even modest blurring is substantial
  • #44: - line following at speed a classic robot competition - variants with obstacles/hurdles
  • #52: Finding an object in an image done by sliding a window across the image checking whether the area under the window is an object of interest. Differences in scale handled by using a larger window and scaling down to the detector window size.
  • #53: Each feature is a single value obtained by subtracting sum of pixels under white rectangle from sum of pixels under black rectangle.
  • #59: Need a lot of images. I had 70 positive images and 1000 negative examples