by Ilya Kuzovkin
ilya.kuzovkin@gmail.com
Mooncascade ML Camp
2016
Machine Learning
ESSENTIAL CONCEPTS
ONE MACHINE LEARNING USE CASE
Introduction to Machine Learning @ Mooncascade ML Camp
Can we ask a computer to
create those patterns
automatically?
Yes
How?
A data sample (instance):
Raw data: a 28 px × 28 px image (784 pixels in total)
Class (label): “7”
How to represent it in a machine-readable form?
Feature extraction
Feature vector: (0, 0, 0, …, 28, 65, 128, 255, 101, 38, …, 0, 0, 0)
Dataset:
(0, 0, 0, …, 28, 65, 128, 255, 101, 38, …, 0, 0, 0)  “7”
(0, 0, 0, …, 13, 48, 102, 0, 46, 255, …, 0, 0, 0)  “2”
(0, 0, 0, …, 17, 34, 12, 43, 122, 70, …, 0, 7, 0)  “8”
(0, 0, 0, …, 98, 21, 255, 255, 231, 140, …, 0, 0, 0)  “2”
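The feature-extraction step above can be sketched in a few lines of NumPy. This is only an illustration: the image here is random noise standing in for a real 28 x 28 digit (an assumption, since the original image is not available), and the variable names are hypothetical.

```python
import numpy as np

# Hypothetical stand-in for one 28 x 28 grayscale digit image
# (random values in 0..255; a real image would be loaded from disk).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)
label = "7"

# Feature extraction: unroll the pixel grid row by row into one vector.
feature_vector = image.reshape(-1)
print(feature_vector.shape)  # (784,)

# A dataset is a stack of feature vectors plus one label per row.
images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
X = images.reshape(len(images), -1)   # shape (4, 784)
y = np.array(["7", "2", "8", "2"])    # labels as on the slide
```

Row-by-row unrolling means that pixel (row r, column c) ends up at index 28 * r + c of the feature vector.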
The data is in the right format — what’s next?
• C4.5
• Random forests
• Bayesian networks
• Hidden Markov models
• Artificial neural network
• Data clustering
• Expectation-maximization algorithm
• Self-organizing map
• Radial basis function network
• Vector Quantization
• Generative topographic map
• Information bottleneck method
• IBSEAD
• Apriori algorithm
• Eclat algorithm
• FP-growth algorithm
• Single-linkage clustering
• Conceptual clustering
• K-means algorithm
• Fuzzy clustering
• Temporal difference learning
• Q-learning
• Learning Automata
• AODE
• Artificial neural network
• Backpropagation
• Naive Bayes classifier
• Bayesian network
• Bayesian knowledge base
• Case-based reasoning
• Decision trees
• Inductive logic programming
• Gaussian process regression
• Gene expression programming
• Group method of data handling (GMDH)
• Learning Automata
• Learning Vector Quantization
• Logistic Model Tree
• Decision tree
• Decision graphs
• Lazy learning
• Monte Carlo Method
• SARSA
• Instance-based learning
• Nearest Neighbor Algorithm
• Analogical modeling
• Probably approximately correct learning (PACL)
• Symbolic machine learning algorithms
• Subsymbolic machine learning algorithms
• Support vector machines
• Random Forest
• Ensembles of classifiers
• Bootstrap aggregating (bagging)
• Boosting (meta-algorithm)
• Ordinal classification
• Regression analysis
• Information fuzzy networks (IFN)
• Linear classifiers
• Fisher's linear discriminant
• Logistic regression
• Naive Bayes classifier
• Perceptron
• Support vector machines
• Quadratic classifiers
• k-nearest neighbor
• Boosting
Pick an algorithm
DECISION TREE
(0, …, 28, 65, …, 207, 101, 0, 0)
(0, …, 19, 34, …, 254, 54, 0, 0)
(0, …, 87, 59, …, 240, 52, 4, 0)
(0, …, 87, 52, …, 240, 19, 3, 0)
vs.
(0, …, 28, 64, …, 102, 101, 0, 0)
(0, …, 19, 23, …, 105, 54, 0, 0)
(0, …, 87, 74, …, 121, 51, 7, 0)
(0, …, 87, 112, …, 239, 52, 4, 0)
[Figure: tree that first splits on PIXEL #417 (>200 / <200), then on PIXEL #123 (<100 / >100)]
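A tree like the one on the slide can be learned automatically. The sketch below is one way to try it with scikit-learn, under two assumptions that are not from the slides: scikit-learn's bundled 8 x 8 digits (64 pixels) stand in for the 28 x 28 images, and the pair "2" vs. "7" is an arbitrary choice of two classes.

```python
from sklearn.datasets import load_digits
from sklearn.tree import DecisionTreeClassifier

# Assumption: 8 x 8 digits (64 pixels) stand in for the 28 x 28 images
# of the slides; "2" vs "7" is an arbitrary pair of classes.
digits = load_digits()
mask = (digits.target == 2) | (digits.target == 7)
X, y = digits.data[mask], digits.target[mask]

tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# Node 0 is the root: the tree tests one pixel against a threshold,
# just like "PIXEL #417 > 200" on the slide.
root_pixel = tree.tree_.feature[0]
root_threshold = tree.tree_.threshold[0]
print(f"root test: pixel #{root_pixel} <= {root_threshold:.1f}")
print("number of leaves:", tree.get_n_leaves())
```

Each internal node holds one such pixel test; samples that pass go to one child, the rest to the other, until a leaf assigns a class.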
ACCURACY
Confusion matrix (axes: true class, predicted class)
acc = (correctly classified) / (total number of samples)
Beware of an imbalanced dataset!
Consider the following model: “Always predict 2”
Accuracy 0.9
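The imbalance trap can be reproduced with a toy example. The label counts below (90 samples of class 2, 10 of class 7) are hypothetical, chosen only so that the "always predict 2" model reaches the 0.9 accuracy quoted on the slide.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical imbalanced labels: 90 samples of class 2, 10 of class 7.
y_true = np.array([2] * 90 + [7] * 10)
# The "always predict 2" model from the slide.
y_pred = np.full_like(y_true, 2)

acc = accuracy_score(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred, labels=[2, 7])
print(acc)  # 0.9
print(cm)   # rows: true class, columns: predicted class
```

The confusion matrix exposes what the single accuracy number hides: every sample of class 7 lands in the "predicted 2" column.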
DECISION TREE
“You said 100%
accurate?! Every 10th
digit your system
detects is wrong!”
Angry client
We’ve trained our system on the data the client gave us, but it has
never seen the new data the client applied it to.
And in real life it never will…
OVERFITTING
Simulate the real-life situation — split the dataset
Underfitting!
“Too stupid”
OK
Overfitting!
“Too smart”
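Splitting the dataset can be sketched as follows. This assumes scikit-learn's bundled 8 x 8 digits as a stand-in for the client's data and an 80/20 split; both are illustrative choices, not taken from the slides.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: the bundled 8 x 8 digits stand in for the client's data.
X, y = load_digits(return_X_y=True)

# Hold out 20% that the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
train_acc = tree.score(X_train, y_train)
test_acc = tree.score(X_test, y_test)
print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")
```

A large gap between the two numbers is the overfitting signal: the unrestricted tree fits the training part far better than the held-out part.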
OVERFITTING
Our current decision tree has too much capacity:
it has simply memorized all of the data.
Let’s make it less complex.
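One way to make a tree less complex is to require a minimum number of samples in every leaf. The sketch below assumes that the deck's "msl" refers to scikit-learn's `min_samples_leaf` parameter, and again uses the bundled 8 x 8 digits as stand-in data.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Unrestricted tree vs. a tree whose leaves must hold at least 15 samples.
# Assumption: "msl" in the slides means min_samples_leaf.
full = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
small = DecisionTreeClassifier(min_samples_leaf=15,
                               random_state=0).fit(X_train, y_train)

for name, model in (("full", full), ("msl=15", small)):
    print(name,
          "leaves:", model.get_n_leaves(),
          "train:", round(model.score(X_train, y_train), 3),
          "test:", round(model.score(X_test, y_test), 3))
```

The restricted tree can no longer carve out a leaf for every single training sample, so it has fewer leaves and cannot simply memorize the training set.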
You probably did not notice, but we are overfitting again :(
THE WHOLE DATASET
TRAINING SET 60%: fit various models and parameter combinations on this subset
VALIDATION SET 20%:
• Evaluate the models created with different parameters
• Estimate overfitting
TEST SET 20%: use only once to get the final performance estimate
[Figure: five TRA / VALI pairs]
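The 60/20/20 scheme can be sketched with two consecutive splits. The candidate `min_samples_leaf` values and the stand-in dataset below are assumptions made for illustration only.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)

# First cut off 20% as the test set, then split the remaining 80%
# into 60% training and 20% validation (0.25 * 0.8 = 0.2).
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=0, stratify=y_rest)

# Fit on TRAINING, compare parameter values on VALIDATION.
best_msl, best_val_acc = None, -1.0
for msl in (1, 5, 15, 50):  # hypothetical candidate values
    model = DecisionTreeClassifier(min_samples_leaf=msl, random_state=0)
    val_acc = model.fit(X_train, y_train).score(X_val, y_val)
    if val_acc > best_val_acc:
        best_msl, best_val_acc = msl, val_acc

# Touch the TEST set exactly once, with the chosen parameter.
final = DecisionTreeClassifier(min_samples_leaf=best_msl, random_state=0)
test_acc = final.fit(X_train, y_train).score(X_test, y_test)
print(best_msl, round(best_val_acc, 3), round(test_acc, 3))
```

Because the validation set was used to choose `best_msl`, its score is slightly optimistic; the test score is the one to report.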
CROSS-VALIDATION
THE WHOLE DATASET: TRAINING SET 60%, VALIDATION SET 20%
What if we got an overly optimistic validation set?
TRAINING SET 80%
Fix the parameter value you need to evaluate, say msl=15
[Figure: TRAINING | VAL splits of the 80% training set, repeat 10 times]
Take the average validation score over 10 runs:
it is a more stable estimate.
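The ten repetitions can be run with scikit-learn's `cross_val_score`. This is a sketch under assumptions: "msl" is taken to mean `min_samples_leaf`, the ten runs are implemented as 10-fold cross-validation, and the bundled digits stand in for the real data.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)

# Keep 20% aside as TEST; cross-validate on the 80% TRAINING part only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Fix the parameter value to evaluate (msl=15 as on the slide).
model = DecisionTreeClassifier(min_samples_leaf=15, random_state=0)

# 10 folds: each fold serves once as VAL while the other 9 are TRAINING.
scores = cross_val_score(model, X_train, y_train, cv=10)
print("per-fold accuracy:", scores.round(3))
print("mean validation accuracy:", round(scores.mean(), 3))
```

The spread of the per-fold scores shows how much a single validation split could have misled us.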
MACHINE LEARNING PIPELINE
Take raw data
Extract features
Split into TRAINING and TEST
Pick an algorithm and parameters
Train on the TRAINING data
Evaluate on the TRAINING data with CV
(Try out different algorithms and parameters)
Fix the best parameters
Train on the whole TRAINING
Evaluate on TEST
Report final performance to the client
“So it is ~87%…erm… Could you do better?”
Yes
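The whole pipeline fits in a short script when `GridSearchCV` handles the "try, evaluate with CV, fix the best, retrain" loop. The parameter grid and the stand-in dataset are assumptions; the ~87% figure from the slide is not expected to be reproduced on this data.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Take raw data and extract features (here: ready-made pixel vectors).
X, y = load_digits(return_X_y=True)

# Split into TRAINING and TEST.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Pick an algorithm and a (hypothetical) grid of parameters to try.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"min_samples_leaf": [1, 5, 15, 50]},
    cv=10)

# Evaluates every candidate with CV on TRAINING, fixes the best one,
# and retrains it on the whole TRAINING set (refit=True is the default).
search.fit(X_train, y_train)
print("best parameters:", search.best_params_)
print("CV accuracy:", round(search.best_score_, 3))

# Evaluate on TEST once; this is the number to report.
test_acc = search.score(X_test, y_test)
print("test accuracy:", round(test_acc, 3))
```

Swapping in another algorithm only changes the estimator and its parameter grid; the rest of the pipeline stays the same.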
Pick another algorithm
RANDOM FOREST
Decision tree: pick best out of all features
Random forest: pick best out of random subset of features
pick best out of another random subset of features
pick best out of yet another random subset of features
[Figure: an instance is passed to the trees of the forest, which output a class]
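A forest can be compared with a single tree in a few lines. In scikit-learn's `RandomForestClassifier` the random feature subset is drawn at each split and its size is set by `max_features`; the number of trees and the stand-in dataset below are assumptions for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Single tree: every split considers all 64 features.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Forest: 100 trees, each split considers a random subset of
# sqrt(64) = 8 features; the trees' predictions are combined.
forest = RandomForestClassifier(
    n_estimators=100, max_features="sqrt", random_state=0)
forest.fit(X_train, y_train)

tree_acc = tree.score(X_test, y_test)
forest_acc = forest.score(X_test, y_test)
print("single tree:", round(tree_acc, 3))
print("forest:     ", round(forest_acc, 3))
```

For clarity this sketch compares the two models directly on a held-out split; in the full pipeline the comparison would be done with cross-validation on the training data, leaving the test set for the final report.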
Happy client
ALL OTHER USE CASES
Sound → Frequency components → Genre
Text → Bag of words → Topic
Image → Pixel values → Cat or dog
Video → Frame pixels → Walking or running
Database records → Census data → Average salary
Biometric data → … → Dead or alive
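As with pixels, every use case needs a step that turns raw data into a fixed-length feature vector. A minimal bag-of-words sketch for text, using only the standard library, might look like this; the vocabulary and the sentence are made up for illustration.

```python
from collections import Counter

def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

# Hypothetical vocabulary and document.
vocabulary = ["machine", "learning", "forest", "tree"]
vec = bag_of_words(
    "Machine learning with a random forest a forest of tree models",
    vocabulary)
print(vec)  # [1, 1, 2, 1]
```

The resulting vector plays the same role as the 784 pixel values: one row of the dataset, ready for any of the algorithms above.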
HANDS-ON SESSION
http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html
Introduction to Machine Learning @ Mooncascade ML Camp

More Related Content

PPTX
Wrangle 2016: (Lightning Talk) FizzBuzz in TensorFlow
PDF
The Ring programming language version 1.5.2 book - Part 66 of 181
PDF
Recentrer l'intelligence artificielle sur les connaissances
PDF
Wrangling data the tidy way with the tidyverse
TXT
Moddefaults mac
PDF
第7回 大規模データを用いたデータフレーム操作実習(1)
PDF
Belfast JUG, SIMD (Vectorial) Operations
PDF
CSS Algorithms - v3.6.1 @ Strange Loop
Wrangle 2016: (Lightning Talk) FizzBuzz in TensorFlow
The Ring programming language version 1.5.2 book - Part 66 of 181
Recentrer l'intelligence artificielle sur les connaissances
Wrangling data the tidy way with the tidyverse
Moddefaults mac
第7回 大規模データを用いたデータフレーム操作実習(1)
Belfast JUG, SIMD (Vectorial) Operations
CSS Algorithms - v3.6.1 @ Strange Loop

Viewers also liked (20)

PDF
Mastering the game of Go with deep neural networks and tree search (article o...
PPTX
#48 Machine learning
PDF
Machine Learning for Understanding and Managing Ecosystems
PDF
Demystifying Machine Learning - How to give your business superpowers.
DOCX
Actividad 02
PDF
Machine Learning and Data Mining: 03 Data Representation
PPTX
A Beginner's Guide to Machine Learning with Scikit-Learn
PDF
Machine learning the next revolution or just another hype
PDF
Введение в архитектуры нейронных сетей / HighLoad++ 2016
PDF
A Nontechnical Introduction to Machine Learning
PPTX
Introduction to Machine Learning
PPTX
Machine Learning - Challenges, Learnings & Opportunities
PPTX
Machine Learning in Pathology Diagnostics with Simagis Live
PDF
A brief history of machine learning
PDF
Neural Turing Machines
PPTX
Machine Learning and Search -State of Search 2016
PPTX
MLaaS - Machine Learning as a Service
PDF
Focus Junior - 14 Maggio 2016
PDF
Natural Language Processing with Python
PDF
Introduction Machine Learning by MyLittleAdventure
Mastering the game of Go with deep neural networks and tree search (article o...
#48 Machine learning
Machine Learning for Understanding and Managing Ecosystems
Demystifying Machine Learning - How to give your business superpowers.
Actividad 02
Machine Learning and Data Mining: 03 Data Representation
A Beginner's Guide to Machine Learning with Scikit-Learn
Machine learning the next revolution or just another hype
Введение в архитектуры нейронных сетей / HighLoad++ 2016
A Nontechnical Introduction to Machine Learning
Introduction to Machine Learning
Machine Learning - Challenges, Learnings & Opportunities
Machine Learning in Pathology Diagnostics with Simagis Live
A brief history of machine learning
Neural Turing Machines
Machine Learning and Search -State of Search 2016
MLaaS - Machine Learning as a Service
Focus Junior - 14 Maggio 2016
Natural Language Processing with Python
Introduction Machine Learning by MyLittleAdventure
Ad

Similar to Introduction to Machine Learning @ Mooncascade ML Camp (20)

PDF
Research overview Oct. 2018
PDF
Lorentz workshop - 2018
PDF
Machine learning for_finance
PDF
Machine Learning : why we should know and how it works
PDF
Introducing Reactive Machine Learning
PDF
Detecting Misleading Headlines in Online News: Hands-on Experiences on Attent...
PDF
Bigger Data v Better Math
PDF
Workshop - Introduction to Machine Learning with R
PPTX
Machine Learning in a Flash (Extended Edition 2): An Introduction to Neural N...
PPTX
Ltc completed slides
PDF
GANS Project for Image idetification.pdf
PPTX
An introduction to Deep Learning with Apache MXNet (November 2017)
PDF
Deep learning
PDF
Comparing Machine Learning Algorithms in Text Mining
KEY
Numpy Talk at SIAM
PDF
4. Classification.pdf
PDF
TAO Fayan_Report on Top 10 data mining algorithms applications with R
PPTX
visualisasi data praktik pakai excel, py
PDF
Barga Data Science lecture 7
PDF
20181106 arie van_deursen_testday2018
Research overview Oct. 2018
Lorentz workshop - 2018
Machine learning for_finance
Machine Learning : why we should know and how it works
Introducing Reactive Machine Learning
Detecting Misleading Headlines in Online News: Hands-on Experiences on Attent...
Bigger Data v Better Math
Workshop - Introduction to Machine Learning with R
Machine Learning in a Flash (Extended Edition 2): An Introduction to Neural N...
Ltc completed slides
GANS Project for Image idetification.pdf
An introduction to Deep Learning with Apache MXNet (November 2017)
Deep learning
Comparing Machine Learning Algorithms in Text Mining
Numpy Talk at SIAM
4. Classification.pdf
TAO Fayan_Report on Top 10 data mining algorithms applications with R
visualisasi data praktik pakai excel, py
Barga Data Science lecture 7
20181106 arie van_deursen_testday2018
Ad

More from Ilya Kuzovkin (14)

PDF
Understanding Information Processing in Human Brain by Interpreting Machine L...
PDF
The Brain and the Modern AI: Drastic Differences and Curious Similarities
PDF
The First Day at the Deep learning Zoo
PDF
Intuitive Intro to Gödel's Incompleteness Theorem
PDF
Paper overview: "Deep Residual Learning for Image Recognition"
PDF
Deep Learning: Theory, History, State of the Art & Practical Tools
PDF
Article overview: Unsupervised Learning of Visual Structure Using Predictive ...
PDF
Article overview: Deep Neural Networks Reveal a Gradient in the Complexity of...
PDF
NIPS2014 Article Overview: Do Deep Nets Really Need to be Deep?
PDF
Neuroimaging: Intracortical, fMRI, EEG
PDF
Article Overview "Reach and grasp by people with tetraplegia using a neurally...
PDF
Introduction to Computing on GPU
PDF
Soft Introduction to Brain-Computer Interfaces and Machine Learning
PDF
Ilya Kuzovkin - Adaptive Interactive Learning for Brain-Computer Interfaces
Understanding Information Processing in Human Brain by Interpreting Machine L...
The Brain and the Modern AI: Drastic Differences and Curious Similarities
The First Day at the Deep learning Zoo
Intuitive Intro to Gödel's Incompleteness Theorem
Paper overview: "Deep Residual Learning for Image Recognition"
Deep Learning: Theory, History, State of the Art & Practical Tools
Article overview: Unsupervised Learning of Visual Structure Using Predictive ...
Article overview: Deep Neural Networks Reveal a Gradient in the Complexity of...
NIPS2014 Article Overview: Do Deep Nets Really Need to be Deep?
Neuroimaging: Intracortical, fMRI, EEG
Article Overview "Reach and grasp by people with tetraplegia using a neurally...
Introduction to Computing on GPU
Soft Introduction to Brain-Computer Interfaces and Machine Learning
Ilya Kuzovkin - Adaptive Interactive Learning for Brain-Computer Interfaces

Recently uploaded (20)

PDF
Encapsulation_ Review paper, used for researhc scholars
PPT
“AI and Expert System Decision Support & Business Intelligence Systems”
PPTX
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
PPTX
Spectroscopy.pptx food analysis technology
PPTX
Big Data Technologies - Introduction.pptx
PDF
Encapsulation theory and applications.pdf
PDF
Dropbox Q2 2025 Financial Results & Investor Presentation
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PPTX
sap open course for s4hana steps from ECC to s4
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PPTX
ACSFv1EN-58255 AWS Academy Cloud Security Foundations.pptx
PDF
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
PDF
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
PDF
KodekX | Application Modernization Development
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
PPTX
Cloud computing and distributed systems.
PDF
Unlocking AI with Model Context Protocol (MCP)
PDF
Agricultural_Statistics_at_a_Glance_2022_0.pdf
PDF
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
Encapsulation_ Review paper, used for researhc scholars
“AI and Expert System Decision Support & Business Intelligence Systems”
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
Spectroscopy.pptx food analysis technology
Big Data Technologies - Introduction.pptx
Encapsulation theory and applications.pdf
Dropbox Q2 2025 Financial Results & Investor Presentation
Advanced methodologies resolving dimensionality complications for autism neur...
sap open course for s4hana steps from ECC to s4
Diabetes mellitus diagnosis method based random forest with bat algorithm
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
ACSFv1EN-58255 AWS Academy Cloud Security Foundations.pptx
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
KodekX | Application Modernization Development
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
Cloud computing and distributed systems.
Unlocking AI with Model Context Protocol (MCP)
Agricultural_Statistics_at_a_Glance_2022_0.pdf
Profit Center Accounting in SAP S/4HANA, S4F28 Col11

Introduction to Machine Learning @ Mooncascade ML Camp

  • 1. by Ilya Kuzovkin ilya.kuzovkin@gmail.com Mooncascade ML Camp 2016 Machine Learning ESSENTIAL CONCEPTS
  • 10. Can we ask a computer to create those patterns automatically?
  • 11. Can we ask a computer to create those patterns automatically? Yes
  • 12. Can we ask a computer to create those patterns automatically? Yes How?
  • 14. Instance Raw data Class (label) A data sample: “7”
  • 15. Instance Raw data Class (label) A data sample: “7” How to represent it in a machine-readable form?
  • 16. Instance Raw data Class (label) A data sample: “7” How to represent it in a machine-readable form? Feature extraction
  • 17. Instance Raw data Class (label) A data sample: “7” How to represent it in a machine-readable form? Feature extraction 28px 28 px
  • 18. Instance Raw data Class (label) A data sample: “7” 28px 28 px 784 pixels in total Feature vector (0, 0, 0, …, 28, 65, 128, 255, 101, 38,… 0, 0, 0) How to represent it in a machine-readable form? Feature extraction
  • 19. Instance Raw data Class (label) A data sample: “7” 28px 28 px 784 pixels in total Feature vector (0, 0, 0, …, 28, 65, 128, 255, 101, 38,… 0, 0, 0) How to represent it in a machine-readable form? Feature extraction (0, 0, 0, …, 28, 65, 128, 255, 101, 38,… 0, 0, 0) (0, 0, 0, …, 13, 48, 102, 0, 46, 255,… 0, 0, 0) (0, 0, 0, …, 17, 34, 12, 43, 122, 70,… 0, 7, 0) (0, 0, 0, …, 98, 21, 255, 255, 231, 140,… 0, 0, 0) “7” “2” “8” “2”
  • 20. Instance Raw data Class (label) A data sample: “7” 28px 28 px 784 pixels in total Feature vector (0, 0, 0, …, 28, 65, 128, 255, 101, 38,… 0, 0, 0) How to represent it in a machine-readable form? Feature extraction (0, 0, 0, …, 28, 65, 128, 255, 101, 38,… 0, 0, 0) (0, 0, 0, …, 13, 48, 102, 0, 46, 255,… 0, 0, 0) (0, 0, 0, …, 17, 34, 12, 43, 122, 70,… 0, 7, 0) Dataset (0, 0, 0, …, 98, 21, 255, 255, 231, 140,… 0, 0, 0) “7” “2” “8” “2”
  • 21. The data is in the right format — what’s next?
  • 22. The data is in the right format — what’s next? • C4.5 • Random forests • Bayesian networks • Hidden Markov models • Artificial neural network • Data clustering • Expectation-maximization algorithm • Self-organizing map • Radial basis function network • Vector Quantization • Generative topographic map • Information bottleneck method • IBSEAD • Apriori algorithm • Eclat algorithm • FP-growth algorithm • Single-linkage clustering • Conceptual clustering • K-means algorithm • Fuzzy clustering • Temporal difference learning • Q-learning • Learning Automata • AODE • Artificial neural network • Backpropagation • Naive Bayes classifier • Bayesian network • Bayesian knowledge base • Case-based reasoning • Decision trees • Inductive logic programming • Gaussian process regression • Gene expression programming • Group method of data handling (GMDH) • Learning Automata • Learning Vector Quantization • Logistic Model Tree • Decision tree • Decision graphs • Lazy learning • Monte Carlo Method • SARSA • Instance-based learning • Nearest Neighbor Algorithm • Analogical modeling • Probably approximately correct learning (PACL) • Symbolic machine learning algorithms • Subsymbolic machine learning algorithms • Support vector machines • Random Forest • Ensembles of classifiers • Bootstrap aggregating (bagging) • Boosting (meta-algorithm) • Ordinal classification • Regression analysis • Information fuzzy networks (IFN) • Linear classifiers • Fisher's linear discriminant • Logistic regression • Naive Bayes classifier • Perceptron • Support vector machines • Quadratic classifiers • k-nearest neighbor • Boosting Pick an algorithm
  • 23. The data is in the right format — what’s next? • C4.5 • Random forests • Bayesian networks • Hidden Markov models • Artificial neural network • Data clustering • Expectation-maximization algorithm • Self-organizing map • Radial basis function network • Vector Quantization • Generative topographic map • Information bottleneck method • IBSEAD • Apriori algorithm • Eclat algorithm • FP-growth algorithm • Single-linkage clustering • Conceptual clustering • K-means algorithm • Fuzzy clustering • Temporal difference learning • Q-learning • Learning Automata • AODE • Artificial neural network • Backpropagation • Naive Bayes classifier • Bayesian network • Bayesian knowledge base • Case-based reasoning • Decision trees • Inductive logic programming • Gaussian process regression • Gene expression programming • Group method of data handling (GMDH) • Learning Automata • Learning Vector Quantization • Logistic Model Tree • Decision tree • Decision graphs • Lazy learning • Monte Carlo Method • SARSA • Instance-based learning • Nearest Neighbor Algorithm • Analogical modeling • Probably approximately correct learning (PACL) • Symbolic machine learning algorithms • Subsymbolic machine learning algorithms • Support vector machines • Random Forest • Ensembles of classifiers • Bootstrap aggregating (bagging) • Boosting (meta-algorithm) • Ordinal classification • Regression analysis • Information fuzzy networks (IFN) • Linear classifiers • Fisher's linear discriminant • Logistic regression • Naive Bayes classifier • Perceptron • Support vector machines • Quadratic classifiers • k-nearest neighbor • Boosting Pick an algorithm
  • 32. DECISION TREE One class vs. another. Feature vectors: (0, …, 28, 65, …, 207, 101, 0, 0) (0, …, 19, 34, …, 254, 54, 0, 0) (0, …, 87, 59, …, 240, 52, 4, 0) (0, …, 87, 52, …, 240, 19, 3, 0) (0, …, 28, 64, …, 102, 101, 0, 0) (0, …, 19, 23, …, 105, 54, 0, 0) (0, …, 87, 74, …, 121, 51, 7, 0) (0, …, 87, 112, …, 239, 52, 4, 0). Tree nodes: PIXEL #417 (>200 or <200); PIXEL #123 (<100 or >100).
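A decision tree like the one on this slide can be read as nested threshold tests. The sketch below is only one possible reading, under several assumptions that the transcript does not confirm: that the second visible value in each vector is pixel #123 and the third is pixel #417, that the first four vectors belong to one class and the last four to the other, and that the PIXEL #123 test sits under the ">200" branch. The class names `A` and `B` are hypothetical placeholders.

```python
# Hypothetical class names; the slide shows images rather than names.
CLASS_A, CLASS_B = "A", "B"

def predict(pixels):
    """Classify one sample given a mapping {pixel index: intensity}."""
    if pixels[417] > 200:          # root test on PIXEL #417
        if pixels[123] < 100:      # second test on PIXEL #123 (assumed branch)
            return CLASS_A
        return CLASS_B
    return CLASS_B

# (pixel #123, pixel #417) pairs as read off the slide's vectors (assumed mapping).
group_a = [(65, 207), (34, 254), (59, 240), (52, 240)]
group_b = [(64, 102), (23, 105), (74, 121), (112, 239)]

predictions_a = [predict({123: p123, 417: p417}) for p123, p417 in group_a]
predictions_b = [predict({123: p123, 417: p417}) for p123, p417 in group_b]
print(predictions_a, predictions_b)
```

Under these assumptions the two tests are enough to separate the eight vectors, which is why the last vector (239 at pixel #417) needs the second test on pixel #123.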
  • 40. ACCURACY acc = (correctly classified) / (total number of samples). Confusion matrix axes: true class, predicted class. Beware of an imbalanced dataset! Consider the following model: “Always predict 2”. Accuracy: 0.9
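The imbalance warning can be checked with a few lines of code. This is a minimal sketch on a made-up dataset (90 samples of class 2 and 10 of class 7, an assumption chosen so that the numbers match the slide's 0.9); the row and column order of the confusion matrix is a convention picked here, not taken from the slide.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

# Assumed imbalanced toy dataset: 90% of the samples are the digit 2.
true_labels = [2] * 90 + [7] * 10
always_two = [2] * len(true_labels)   # the "Always predict 2" model

acc = accuracy(true_labels, always_two)
cm = confusion_matrix(true_labels, always_two, classes=[2, 7])
print(acc, cm)
```

The accuracy looks good, yet the second row of the matrix shows that every 7 was misclassified, which a single accuracy number hides.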
  • 43. DECISION TREE Angry client: “You said 100% accurate?! Every 10th digit your system detects is wrong!” We trained our system on the data the client gave us, but our system had never seen the new data the client applied it to. And in real life it never will…
  • 47. OVERFITTING Simulate the real-life situation: split the dataset, train on one part, and evaluate on the part the system has never seen
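Splitting the dataset can be sketched as a shuffle followed by a cut. The helper name `train_test_split`, the 20% test share, and the toy samples are assumptions made for illustration.

```python
import random

def train_test_split(samples, test_percent=20, seed=0):
    """Shuffle a copy of the samples and hold out test_percent of them."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = len(shuffled) * test_percent // 100
    return shuffled[n_test:], shuffled[:n_test]

# Toy (feature vector, label) pairs standing in for the digit dataset.
dataset = [((i, i % 7), i % 2) for i in range(100)]
train, test = train_test_split(dataset)
print(len(train), len(test))
```

The model is fitted on `train` only; `test` plays the role of the client's new data.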
  • 49. OVERFITTING Underfitting! (“Too stupid”) | OK | Overfitting! (“Too smart”). Our current decision tree has too much capacity: it has simply memorized all of the data. Let’s make it less complex.
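One way to see "too much capacity" is to grow a tree with and without a limit on leaf size. The sketch below is a toy tree on a single numeric feature, not the tree from the talk; the parameter name `msl` is assumed here to mean a minimum number of samples per leaf, and the noisy dataset is made up.

```python
import random

def majority(labels):
    return max(set(labels), key=labels.count)

def build_tree(points, msl):
    """Grow a tree on (x, label) pairs; every leaf keeps at least msl points."""
    points = sorted(points)
    labels = [label for _, label in points]
    best = None  # (training errors after the split, split position)
    if len(set(labels)) > 1:
        for i in range(msl, len(points) - msl + 1):
            if points[i - 1][0] == points[i][0]:
                continue  # equal x values cannot be separated
            left, right = labels[:i], labels[i:]
            errors = (len(left) - left.count(majority(left))
                      + len(right) - right.count(majority(right)))
            if best is None or errors < best[0]:
                best = (errors, i)
    if best is None:
        return ("leaf", majority(labels))
    i = best[1]
    threshold = points[i - 1][0]  # x <= threshold goes to the left child
    return ("split", threshold,
            build_tree(points[:i], msl), build_tree(points[i:], msl))

def predict(tree, x):
    while tree[0] == "split":
        _, threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree[1]

def accuracy(tree, points):
    return sum(predict(tree, x) == label for x, label in points) / len(points)

def count_leaves(tree):
    if tree[0] == "leaf":
        return 1
    return count_leaves(tree[2]) + count_leaves(tree[3])

# Made-up data: label is 1 when x > 500, with 15% of the labels flipped.
rng = random.Random(0)
data = []
for x in rng.sample(range(1000), 200):
    label = int(x > 500)
    data.append((x, 1 - label if rng.random() < 0.15 else label))
train, held_out = data[:100], data[100:]

full_tree = build_tree(train, msl=1)     # free to memorize every point
small_tree = build_tree(train, msl=15)   # reduced capacity
for name, tree in (("full", full_tree), ("small", small_tree)):
    print(name, count_leaves(tree), accuracy(tree, train), accuracy(tree, held_out))
```

The unrestricted tree reaches perfect training accuracy by memorizing the flipped labels as well; comparing its held-out accuracy with that of the smaller tree is the check the slide is hinting at.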
  • 53. You probably did not notice, but we are overfitting again :(
  • 62. THE WHOLE DATASET is split into TRAINING SET 60%, VALIDATION SET 20%, and TEST SET 20%. TRAINING SET: fit various models and parameter combinations on this subset. VALIDATION SET: evaluate the models created with different parameters; estimate overfitting. TEST SET: use only once to get the final performance estimate. [Figures: TRAINING vs. VALIDATION]
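The three roles can be sketched in code. This example swaps in a tiny k-nearest-neighbour classifier on one numeric feature purely because it is short; the talk itself uses a decision tree. The data, the candidate values of `k`, and all function names are assumptions.

```python
import random

def three_way_split(samples, seed=0):
    """Shuffle and split into 60% training, 20% validation, 20% test."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 60 // 100
    n_val = len(shuffled) * 20 // 100
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

def knn_predict(train, x, k):
    """Majority label among the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)

def accuracy(train, k, evaluation):
    hits = sum(knn_predict(train, x, k) == label for x, label in evaluation)
    return hits / len(evaluation)

# Made-up data: label is 1 when x > 0.5, with 10% of the labels flipped.
rng = random.Random(1)
dataset = []
for _ in range(200):
    x = rng.random()
    label = int(x > 0.5)
    dataset.append((x, 1 - label if rng.random() < 0.1 else label))

train, validation, test = three_way_split(dataset)

# Model selection looks at the validation set only.
candidate_ks = [1, 5, 15]
validation_scores = {k: accuracy(train, k, validation) for k in candidate_ks}
best_k = max(candidate_ks, key=validation_scores.get)

# The gap between training and validation score is a rough overfitting estimate.
overfit_gap = accuracy(train, best_k, train) - validation_scores[best_k]

# The test set is touched exactly once, at the very end.
final_score = accuracy(train, best_k, test)
print(best_k, overfit_gap, final_score)
```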
  • 70. CROSS-VALIDATION THE WHOLE DATASET: TRAINING SET 60%, VALIDATION SET 20%. What if we got an overly optimistic validation set? Merge the two into one TRAINING SET 80%. Fix the parameter value you need to evaluate, say msl=15. Split this training set into TRAINING and VAL parts, using a different part as VAL each time; repeat 10 times. Take the average validation score over the 10 runs: it is a more stable estimate.
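The repeat-and-average procedure can be sketched as 10-fold cross-validation. Assumptions in this example: `msl` is read as a minimum number of samples per leaf, the model is a one-split tree (a stump) on a single feature rather than the tree from the talk, folds are taken as every 10th sample instead of contiguous blocks, and the data is made up.

```python
import random

def majority(labels):
    return max(set(labels), key=labels.count)

def fit_stump(points, msl):
    """One-split tree on (x, label) pairs; each side keeps >= msl points (msl >= 1)."""
    points = sorted(points)
    labels = [label for _, label in points]
    best = None  # (errors, threshold, left label, right label)
    for i in range(msl, len(points) - msl + 1):
        left, right = labels[:i], labels[i:]
        left_label, right_label = majority(left), majority(right)
        errors = (len(left) - left.count(left_label)
                  + len(right) - right.count(right_label))
        if best is None or errors < best[0]:
            best = (errors, points[i - 1][0], left_label, right_label)
    if best is None:  # too few points for a split: predict the majority label
        return (float("inf"), majority(labels), majority(labels))
    return best[1:]

def stump_accuracy(model, points):
    threshold, left_label, right_label = model
    hits = sum((left_label if x <= threshold else right_label) == label
               for x, label in points)
    return hits / len(points)

def cross_validate(samples, fit, score, n_folds=10):
    """Return the mean validation score and the per-fold scores."""
    scores = []
    for fold in range(n_folds):
        validation = samples[fold::n_folds]  # every n_folds-th sample
        training = [s for i, s in enumerate(samples) if i % n_folds != fold]
        scores.append(score(fit(training), validation))
    return sum(scores) / n_folds, scores

# Made-up training set: label is 1 when x > 500, with 15% of the labels flipped.
rng = random.Random(0)
training_set = []
for x in rng.sample(range(1000), 200):
    label = int(x > 500)
    training_set.append((x, 1 - label if rng.random() < 0.15 else label))

mean_score, fold_scores = cross_validate(
    training_set, fit=lambda part: fit_stump(part, msl=15), score=stump_accuracy)
print(mean_score, fold_scores)
```

The spread of `fold_scores` shows how much a single validation split could have misled us; the mean is the number to compare across parameter values.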
  • 76. MACHINE LEARNING PIPELINE Take raw data → Extract features → Split into TRAINING and TEST → Pick an algorithm and parameters → Train on the TRAINING data → Evaluate on the TRAINING data with CV → (loop: try out different algorithms and parameters) → Fix the best parameters → Train on the whole TRAINING → Evaluate on TEST → Report final performance to the client. Client: “So it is ~87%…erm… Could you do better?” Yes
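The pipeline steps can be strung together in one short script. This is a sketch under stated assumptions: the raw data is a set of made-up 2×2 "images", the algorithm is a small k-nearest-neighbour classifier (chosen for brevity, not the talk's decision tree), and cross-validation uses 5 interleaved folds.

```python
import random

def extract_features(raw_image):
    """Flatten a 2-D grid of pixel intensities into one feature vector."""
    return [pixel for row in raw_image for pixel in row]

def knn_predict(train, features, k):
    def squared_distance(sample):
        return sum((a - b) ** 2 for a, b in zip(sample[0], features))
    votes = [label for _, label in sorted(train, key=squared_distance)[:k]]
    return max(set(votes), key=votes.count)

def accuracy(train, k, evaluation):
    hits = sum(knn_predict(train, f, k) == label for f, label in evaluation)
    return hits / len(evaluation)

def cv_score(training, k, n_folds=5):
    """Mean validation accuracy over n_folds splits of the training data."""
    scores = []
    for fold in range(n_folds):
        validation = training[fold::n_folds]
        fit_part = [s for i, s in enumerate(training) if i % n_folds != fold]
        scores.append(accuracy(fit_part, k, validation))
    return sum(scores) / n_folds

# 1. Take raw data (made up): class 1 has a bright top row, class 0 a bright bottom row.
rng = random.Random(0)
def make_raw_sample(label):
    top, bottom = (200, 50) if label == 1 else (50, 200)
    image = [[top + rng.randint(-50, 50) for _ in range(2)],
             [bottom + rng.randint(-50, 50) for _ in range(2)]]
    return image, label
raw_data = [make_raw_sample(i % 2) for i in range(100)]

# 2. Extract features.
samples = [(extract_features(image), label) for image, label in raw_data]

# 3. Split into TRAINING and TEST.
rng.shuffle(samples)
n_test = len(samples) * 20 // 100
test, training = samples[:n_test], samples[n_test:]

# 4-6. Try different parameters, scoring each with CV on TRAINING only.
cv_scores = {k: cv_score(training, k) for k in (1, 3, 5)}

# 7. Fix the best parameters.
best_k = max(cv_scores, key=cv_scores.get)

# 8-9. "Train" on the whole TRAINING set (k-NN just stores it), evaluate on TEST once.
final_accuracy = accuracy(training, best_k, test)
print(cv_scores, best_k, final_accuracy)
```

Only `final_accuracy` is the number to report to the client; the CV scores were used for choosing `k` and are therefore optimistic.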
  • 77. • C4.5 • Random forests • Bayesian networks • Hidden Markov models • Artificial neural network • Data clustering • Expectation-maximization algorithm • Self-organizing map • Radial basis function network • Vector Quantization • Generative topographic map • Information bottleneck method • IBSEAD • Apriori algorithm • Eclat algorithm • FP-growth algorithm • Single-linkage clustering • Conceptual clustering • K-means algorithm • Fuzzy clustering • Temporal difference learning • Q-learning • Learning Automata • AODE • Artificial neural network • Backpropagation • Naive Bayes classifier • Bayesian network • Bayesian knowledge base • Case-based reasoning • Decision trees • Inductive logic programming • Gaussian process regression • Gene expression programming • Group method of data handling (GMDH) • Learning Automata • Learning Vector Quantization • Logistic Model Tree • Decision tree • Decision graphs • Lazy learning • Monte Carlo Method • SARSA • Instance-based learning • Nearest Neighbor Algorithm • Analogical modeling • Probably approximately correct learning (PACL) • Symbolic machine learning algorithms • Subsymbolic machine learning algorithms • Support vector machines • Random Forest • Ensembles of classifiers • Bootstrap aggregating (bagging) • Boosting (meta-algorithm) • Ordinal classification • Regression analysis • Information fuzzy networks (IFN) • Linear classifiers • Fisher's linear discriminant • Logistic regression • Naive Bayes classifier • Perceptron • Support vector machines • Quadratic classifiers • k-nearest neighbor • Boosting Pick another algorithm
  • 81. RANDOM FOREST Decision tree: at each split, pick the best out of all features. Random forest: at each split, pick the best out of a random subset of features.
  • 84. RANDOM FOREST Pick the best out of another random subset of features; then pick the best out of yet another random subset of features.
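The difference between the two split rules can be sketched as follows. Only the split-selection step is shown; a full random forest would also train many trees on resampled data and let them vote. The error measure (misclassified count), the four toy samples, and all function names are assumptions.

```python
import random

def split_error(samples, feature, threshold):
    """Misclassified count if each side of the split predicts its majority label."""
    errors = 0
    for side in ([y for x, y in samples if x[feature] <= threshold],
                 [y for x, y in samples if x[feature] > threshold]):
        if side:
            errors += len(side) - max(side.count(c) for c in set(side))
    return errors

def best_split(samples, candidate_features):
    """Return (error, feature, threshold) of the best split among the candidates."""
    best = None
    for feature in candidate_features:
        for x, _ in samples:
            candidate = (split_error(samples, feature, x[feature]),
                         feature, x[feature])
            if best is None or candidate < best:
                best = candidate
    return best

def tree_split(samples):
    """Decision tree: consider all features."""
    return best_split(samples, range(len(samples[0][0])))

def forest_split(samples, rng, n_candidates):
    """Random forest: consider only a random subset of the features."""
    n_features = len(samples[0][0])
    return best_split(samples, rng.sample(range(n_features), n_candidates))

# Toy data with four features; only feature 2 separates the classes cleanly.
samples = [([5, 1, 10, 3], "A"), ([6, 9, 20, 1], "A"),
           ([5, 2, 80, 2], "B"), ([7, 3, 90, 3], "B")]

tree_choice = tree_split(samples)
forest_choice = forest_split(samples, random.Random(0), n_candidates=2)
print(tree_choice, forest_choice)
```

When the random subset misses the most informative feature, the forest settles for a weaker split; repeating this across splits is what makes the individual trees differ from each other.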
  • 94. ALL OTHER USE CASES
  • 95. Raw data → features → prediction. Sound → Frequency components → Genre. Text → Bag of words → Topic. Image → Pixel values → Cat or dog. Video → Frame pixels → Walking or running. Database records → Biometric data / Census data → Average salary / … / Dead or alive.
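As one example of turning another kind of raw data into a feature vector, here is a minimal bag-of-words sketch for text. The two sentences, the whitespace tokenization, and the function name `bag_of_words` are assumptions for illustration.

```python
from collections import Counter

def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

documents = ["the cat sat on the mat", "the dog chased the cat"]

# The vocabulary fixes which word each position of the vector stands for.
vocabulary = sorted({word for doc in documents for word in doc.lower().split()})
vectors = [bag_of_words(doc, vocabulary) for doc in documents]
print(vocabulary)
print(vectors)
```

Each document becomes a fixed-length vector of counts, so the same algorithms used on pixel vectors can be applied to text.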