Machine Learning
Support Vector Machines
Sjoerd Maessen
•  AFOL
•  Works at E-sites Breda
•  Stock market enthusiast
Titanic
The chance of survival
| PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22 | 1 | 0 | A/5 21171 | 7.25 | | S |
| 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26 | 0 | 0 | STON/O2. 3101282 | 7.925 | | S |
| 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35 | 1 | 0 | 113803 | 53.1 | C123 | S |
| 5 | 0 | 3 | Allen, Mr. William Henry | male | 35 | 0 | 0 | 373450 | 8.05 | | S |
| 6 | 0 | 3 | Moran, Mr. James | male | | 0 | 0 | 330877 | 8.4583 | | Q |
| 7 | 0 | 1 | McCarthy, Mr. Timothy J | male | 54 | 0 | 0 | 17463 | 51.8625 | E46 | S |
| 8 | 0 | 3 | Palsson, Master. Gosta Leonard | male | 2 | 3 | 1 | 349909 | 21.075 | | S |
| 9 | 1 | 3 | Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) | female | 27 | 0 | 2 | 347742 | 11.1333 | | S |
| 10 | 1 | 2 | Nasser, Mrs. Nicholas (Adele Achem) | female | 14 | 1 | 0 | 237736 | 30.0708 | | C |
| 11 | 1 | 3 | Sandstrom, Miss. Marguerite Rut | female | 4 | 1 | 1 | PP 9549 | 16.7 | G6 | S |
| 12 | 1 | 1 | Bonnell, Miss. Elizabeth | female | 58 | 0 | 0 | 113783 | 26.55 | C103 | S |
| 13 | 0 | 3 | Saundercock, Mr. William Henry | male | 20 | 0 | 0 | A/5. 2151 | 8.05 | | S |
| 14 | 0 | 3 | Andersson, Mr. Anders Johan | male | 39 | 1 | 5 | 347082 | 31.275 | | S |
Challenge accepted
“Could you take siblings into account?”
Sure…
“Oh and the number of parents and children aboard…”
Of course!
“Could you add a ‘simple if’ for age as well?”
“Great! We are almost there!
But…”
“Field of study that gives computers
the ability to learn without being
explicitly programmed”
Arthur Lee Samuel
https://personality-insights-livedemo.mybluemix.net/
Alright!
Let's become data scientists!
A comparison
Traditional programming: input + program => output
Machine learning: input + output => program, then new input + program => new output
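The contrast above can be sketched in a few lines. This is a minimal illustration, not something shown in the talk: the age rule, the example data, and the names `hand_written_rule` and `learn_threshold` are all made up. The second function derives its rule from example inputs and outputs instead of having it typed in.

```python
# Traditional programming: a human writes the rule (the "program").
def hand_written_rule(age):
    # Hypothetical rule picked by a programmer, not taken from the talk.
    return 1 if age < 16 else 0


# Machine learning: the rule is derived from example inputs and outputs.
def learn_threshold(ages, labels):
    """Return the age threshold that classifies the most examples correctly.

    The learned "program" is: predict 1 when age < threshold, else 0.
    """
    best_threshold, best_correct = None, -1
    for threshold in sorted(set(ages)):
        correct = sum((1 if a < threshold else 0) == y for a, y in zip(ages, labels))
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


# Made-up training examples: input (age) and known output (label).
ages = [2, 4, 14, 20, 22, 35, 39, 54]
labels = [1, 1, 1, 0, 0, 0, 0, 0]

threshold = learn_threshold(ages, labels)


def learned_rule(age):
    """The generated "program", applied to new input."""
    return 1 if age < threshold else 0
```

New input then goes through `learned_rule`, for example `learned_rule(10)` or `learned_rule(30)`.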
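Classification vs regression
Classification (output is a class label)
| Input 1 | Input 2 | Output |
|---|---|---|
| 0.98 | 68 | 0 |
| 0.76 | 42 | 0 |
| 1.23 | 78 | 1 |
| 1.91 | 109 | 1 |
Regression (output is a continuous value)
| Input 1 | Input 2 | Output |
|---|---|---|
| 0.98 | 68 | 0.23 |
| 0.76 | 42 | 0.15 |
| 1.23 | 78 | 4.74 |
| 1.91 | 109 | 7.98 |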
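To make the difference concrete, here is a small sketch using the rows from the two tables above. It deliberately uses a 1-nearest-neighbour lookup instead of an SVM, purely to keep the example short; the query point is a made-up new input. The only thing it is meant to show is the type of the answer: a label in one case, a number in the other.

```python
import math

# Rows from the two tables: (two input features, output).
classification_rows = [((0.98, 68), 0), ((0.76, 42), 0), ((1.23, 78), 1), ((1.91, 109), 1)]
regression_rows = [((0.98, 68), 0.23), ((0.76, 42), 0.15), ((1.23, 78), 4.74), ((1.91, 109), 7.98)]


def nearest_output(rows, query):
    """Return the output of the training row closest to `query` (1-nearest neighbour)."""
    _, output = min(rows, key=lambda row: math.dist(row[0], query))
    return output


query = (1.10, 75)  # hypothetical new input

predicted_class = nearest_output(classification_rows, query)  # a discrete label
predicted_value = nearest_output(regression_rows, query)      # a continuous value
```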
Support Vector Machine
•  Automatically creates a “program” or model
•  Inputs are ‘features’
•  Model represents a space
•  New input fits somewhere
Support Vectors
•  Optimal hyperplane
•  Linear classifier
•  Maximum margin
•  Classification
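The terms above can be illustrated on a toy dataset. In this sketch the four points are made up, and the hyperplane (`w`, `b`) is worked out by hand for them rather than trained; a solver such as LIBSVM would normally find it. The code only shows what the pieces mean: the sign of the decision value gives the class, the margin width is 2 / ||w||, and the support vectors are the points lying exactly on the margin.

```python
import math

# Toy, linearly separable data (made up for illustration): (features, label).
points = [((3.0, 3.0), +1), ((4.0, 3.0), +1), ((1.0, 1.0), -1), ((0.0, 1.0), -1)]

# Maximum-margin hyperplane for this toy set, worked out by hand: x1 + x2 = 4.
w = (0.5, 0.5)
b = -2.0


def decision_value(x):
    """Signed score w.x + b; its sign is the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(x):
    return 1 if decision_value(x) >= 0 else -1


# Distance between the two margin lines on either side of the hyperplane.
margin_width = 2.0 / math.hypot(*w)

# Support vectors are the training points on the margin: y * (w.x + b) == 1.
support_vectors = [x for x, y in points if math.isclose(y * decision_value(x), 1.0)]
```

Only `(3.0, 3.0)` and `(1.0, 1.0)` end up as support vectors here; moving the other two points further away would not change the hyperplane.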
Linearly separable dataset
Non-linear decision boundary
The kernel trick
A whole new dimension
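A minimal sketch of the idea, with made-up data. The first part lifts 1-D points into 2-D so that a straight line can separate them. The second part checks the actual "trick" for a degree-2 polynomial kernel: the kernel value equals the dot product of explicitly lifted vectors, so the lifted vectors never have to be built. The function names and the cut-off 2.5 are illustrative assumptions.

```python
import math

# 1-D points that no single threshold can separate: class +1 sits between class -1.
samples = [(-3.0, -1), (-2.0, -1), (-1.0, +1), (0.0, +1), (1.0, +1), (2.0, -1), (3.0, -1)]


def lift(x):
    """Map a 1-D point into 2-D by adding x squared as a second coordinate."""
    return (x, x * x)


def classify_lifted(x):
    """In the lifted space the horizontal line x^2 = 2.5 separates the classes."""
    return +1 if lift(x)[1] < 2.5 else -1


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def explicit_map(v):
    """Explicit feature map whose dot product equals (a.b)^2 for 2-D inputs."""
    x1, x2 = v
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def polynomial_kernel(a, b):
    """Same value as dot(explicit_map(a), explicit_map(b)), computed in the original space."""
    return dot(a, b) ** 2


u, v = (1.0, 2.0), (3.0, 4.0)
same = math.isclose(polynomial_kernel(u, v), dot(explicit_map(u), explicit_map(v)))
```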
Choosing a kernel
•  No kernel or linear kernel
•  Gaussian kernel, also known as the radial basis function (RBF) kernel
•  Polynomial kernel
•  Sigmoid kernel
•  …
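For reference, the kernels listed above can be written as plain functions. This is a sketch using the common textbook definitions; the parameter names `gamma`, `coef0` and `degree` and their default values are assumptions for illustration, not settings from the talk.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def linear_kernel(a, b):
    """Plain dot product, equivalent to using no kernel."""
    return dot(a, b)


def gaussian_kernel(a, b, gamma=0.5):
    """exp(-gamma * ||a - b||^2): 1.0 for identical points, near 0 for distant ones."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * squared_distance)


def polynomial_kernel(a, b, degree=2, gamma=1.0, coef0=0.0):
    """(gamma * a.b + coef0) ** degree."""
    return (gamma * dot(a, b) + coef0) ** degree


def sigmoid_kernel(a, b, gamma=1.0, coef0=0.0):
    """tanh(gamma * a.b + coef0)."""
    return math.tanh(gamma * dot(a, b) + coef0)
```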
Choosing a kernel
Rule of thumb
N = number of features
M = number of training examples
•  N much bigger than M => linear kernel
•  N small, M intermediate => Gaussian kernel
Spam detection
•  N = 10,000 (bad words, # of URLs, …)
•  M = 250 (sample mails)
=> linear kernel
Validation of housing prices
•  N = 1-1000 (# of rooms, m³, location, …)
•  M = 100,000 (# of transactions)
=> Gaussian kernel
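The rule of thumb and the two examples can be written as a small helper. The talk gives no exact numbers for "much bigger" or "small", so the thresholds below (`much_bigger=10`, `small_n=1000`) are assumptions chosen only so that both examples come out as stated; treat it as a starting point, not a decision procedure.

```python
def suggest_kernel(n_features, m_examples, much_bigger=10, small_n=1000):
    """Apply the rule of thumb with illustrative, assumed thresholds.

    n_features is N, m_examples is M.
    """
    # "N much bigger than M": assumed here to mean at least `much_bigger` times M.
    if n_features >= much_bigger * m_examples:
        return "linear"
    # "N small": assumed here to mean at most `small_n` features.
    # Simplification: M is not checked in this branch.
    if n_features <= small_n:
        return "gaussian"
    # The rule of thumb says nothing about the remaining cases.
    return "undecided"


spam = suggest_kernel(n_features=10_000, m_examples=250)
housing = suggest_kernel(n_features=1_000, m_examples=100_000)
```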
Features
It’s all about preparation
Features
•  Representation of raw data
•  The hardest part
•  Raw data
•  Pre-processing
•  Feature scaling
•  Feature extraction
•  Association discovery
OCR – Pre-processing
•  De-skew
•  Despeckle
•  Convert to black & white
•  Zoning
•  Character segmentation
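As an example of one of these steps, converting to black & white can be sketched as a simple threshold on grayscale values. The threshold of 128 and the tiny image are made up for illustration; real OCR pipelines usually pick the threshold adaptively.

```python
def to_black_and_white(gray_image, threshold=128):
    """Return a grid of 1 (black) and 0 (white) from grayscale values in 0..255."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray_image]


# Made-up 3 x 4 grayscale scan of a thin vertical stroke.
gray = [
    [250, 240, 30, 245],
    [255, 20, 25, 250],
    [245, 235, 15, 240],
]

binary = to_black_and_white(gray)
```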
OCR – Feature extraction
OCR – Feature scaling
How to scale the number of black pixels?
22 / 56 => 0.39286
Scale between 0 - 1
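The calculation above as a short sketch: the number of black pixels divided by the total number of pixels always lands between 0 and 1. The 7 x 8 grid is an assumption made only to reach 56 pixels; the helper name `black_pixel_ratio` is hypothetical.

```python
def black_pixel_ratio(binary_image):
    """Fraction of pixels that are black (1) in a grid of 0s and 1s."""
    total = sum(len(row) for row in binary_image)
    black = sum(sum(row) for row in binary_image)
    return black / total


# Build an assumed 7 x 8 zone (56 pixels) with 22 black pixels.
flat = [1] * 22 + [0] * 34
zone = [flat[i:i + 8] for i in range(0, 56, 8)]

ratio = black_pixel_ratio(zone)
```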
Real life
OCR – Training the model
Training file
•  Labels
•  Features
| Label | % Black Pixels | X1 % | X2 % | X3 % | … |
|---|---|---|---|---|---|
| 0 | 0.33 | 0.546 | 0.840 | | |
| 1 | 0.78 | 0.123 | 0.567 | 0.347 | |
| 1 | 0.75 | 0.512 | 0.543 | | |
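One way to write such a training file is the LIBSVM text format, where each line is a label followed by `index:value` pairs. The talk does not show its exact file, so this is a sketch under that assumption; `to_libsvm_line` is a hypothetical helper. Indices start at 1 and must appear in ascending order.

```python
def to_libsvm_line(label, features):
    """Format one example as '<label> <index>:<value> ...' with 1-based indices."""
    parts = [str(label)]
    for index, value in enumerate(features, start=1):
        # Sparse format: a feature left out of the line is read as 0.
        if value is not None:
            parts.append(f"{index}:{value}")
    return " ".join(parts)


# Rows from the table above: (label, feature values).
rows = [
    (0, [0.33, 0.546, 0.840]),
    (1, [0.78, 0.123, 0.567, 0.347]),
    (1, [0.75, 0.512, 0.543]),
]

training_file = "\n".join(to_libsvm_line(label, features) for label, features in rows)
```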
Alice in Wonderland
Down the rabbit hole
Basic text features
•  Average word length
•  Character frequency
•  …
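Both features are easy to compute. A minimal sketch, with hypothetical function names and a simple punctuation-stripping rule that is an assumption rather than the talk's method:

```python
from collections import Counter


def average_word_length(text):
    """Mean number of characters per word, ignoring surrounding punctuation."""
    words = [w.strip(".,;:!?\"'()") for w in text.split()]
    words = [w for w in words if w]
    return sum(len(w) for w in words) / len(words)


def char_frequencies(text):
    """Relative frequency of each letter, case-insensitive, letters only."""
    letters = [c for c in text.lower() if c.isalpha()]
    counts = Counter(letters)
    return {c: n / len(letters) for c, n in counts.items()}


avg = average_word_length("Down the rabbit hole.")
freq = char_frequencies("Abba")
```

The resulting numbers already lie on comparable scales, which makes them convenient as feature columns in a training file.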
Training the model
Predicting the unknown
Testing the unknown
Input
“Thank you for contacting us. This is an automated response confirming the
receipt of your ticket. Our team will get back to you as soon as possible. When
replying, please make sure that the ticket ID is kept in the subject so that we
can track your replies.”
Output
This is an English text
Testing the unknown
Input
“Hierbij bevestigen wij de ontvangst en verwerking van uw e-mail met
ticketnummer PCL-98124-735. Uw vraag wordt opgepakt door één van onze
engineers. Wij streven ernaar spoedig een oplossing aan u terug te kunnen
koppelen.”
Output
This is a Dutch text
Titanic
The chance of survival
Feature extraction
•  Title (Mrs, Miss, Mr, Jonkheer, Capt, …)
•  Passenger class
•  Sex
•  Age
•  Siblings/spouses
•  Parents/children
•  Cabin
•  Port of embarkation
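A sketch of how a few of these features could be pulled out of a raw row. The title is taken from the `Name` column as the text between the comma and the first period; the numeric encoding of sex (female = 1, male = 0) and the helper names are assumptions for illustration, not the talk's actual code.

```python
import re


def extract_title(name):
    """Return the title between the comma and the first period, e.g. 'Mr'."""
    match = re.search(r",\s*([^.]+)\.", name)
    return match.group(1).strip() if match else None


def encode_sex(sex):
    """Assumed encoding: female -> 1, anything else -> 0."""
    return 1 if sex == "female" else 0


def extract_features(row):
    """Build a small feature dictionary from one passenger row."""
    return {
        "title": extract_title(row["Name"]),
        "pclass": row["Pclass"],
        "sex": encode_sex(row["Sex"]),
        "sibsp": row["SibSp"],
        "parch": row["Parch"],
    }


# First passenger from the table.
passenger = {"Name": "Braund, Mr. Owen Harris", "Pclass": 3, "Sex": "male", "SibSp": 1, "Parch": 0}
features = extract_features(passenger)
```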
Creating the training file
Preprocessing and scaling
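One common way to scale a numeric column such as age is min-max scaling, which maps the smallest value to 0 and the largest to 1. The talk does not say which scaling method it used, so this is a sketch under that assumption.

```python
def min_max_scale(values):
    """Rescale values linearly so that min -> 0.0 and max -> 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Ages of the first four passengers in the table.
scaled_ages = min_max_scale([22, 38, 26, 35])
```

Without scaling, a wide-ranging feature such as fare would dominate the distance calculations over a narrow one such as passenger class.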
Filling in blanks
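Some rows have no age (passenger 6 in the table, for example). A simple way to fill such blanks is to substitute the median of the known values. The talk does not state how its blanks were filled, so the median here is an assumption, one of several reasonable choices.

```python
from statistics import median

# Ages of the 14 passengers in the table; None marks the missing value.
ages = [22, 38, 26, 35, 35, None, 54, 2, 27, 14, 4, 58, 20, 39]


def fill_missing(values):
    """Replace None entries with the median of the known values."""
    known = [v for v in values if v is not None]
    fallback = median(known)
    return [fallback if v is None else v for v in values]


filled_ages = fill_missing(ages)
```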
Magic!
Result: 83.26% accuracy
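Accuracy is the share of predictions that match the known outcome. A minimal sketch with made-up predictions; it does not reproduce the figure above, it only shows how such a number is computed.

```python
def accuracy(predicted, actual):
    """Fraction of predictions equal to the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up predictions and true labels for four passengers.
score = accuracy([0, 1, 1, 1], [0, 1, 1, 0])
```

To be meaningful, the score should be measured on passengers the model did not see during training.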
Common issues
•  Feature numbering
•  Training data <> real world
•  Overfitting
•  Feature selection
•  Multiclass classification
Next step?
Learn R, Python,…
Resources
•  https://www.csie.ntu.edu.tw/~cjlin/libsvm/
•  http://php.net/manual/en/book.svm.php
•  https://www.kaggle.com/
•  http://scikit-learn.org/stable/
•  https://packagist.org/packages/sjoerdmaessen/machinelearning
@sjoerdmaessen
linkedin.com/in/sjoerdmaessen
https://joind.in/talk/d921d
