2. Types of Learning
• Supervised (inductive) learning
– Given: training data + desired outputs (labels)
• Unsupervised learning
– Given: training data (without desired outputs)
• Semi-supervised learning
– Given: training data + a few desired outputs
• Reinforcement learning
– Rewards from sequence of actions
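A minimal sketch (with hypothetical toy data) contrasting what the learner is given in each of the four settings above:

```python
# Sketch of the four learning settings; all values are illustrative.
import numpy as np

# Supervised: inputs paired with desired outputs (labels)
X_sup = np.array([[1.0], [2.0], [3.0]])
y_sup = np.array([2.1, 3.9, 6.2])            # one label per example

# Unsupervised: inputs only, no labels
X_unsup = np.array([[1.0], [2.0], [3.0]])

# Semi-supervised: inputs, but labels for only a few of them
y_partial = np.array([2.1, np.nan, np.nan])  # NaN marks a missing label

# Reinforcement: no labels; feedback arrives as rewards for a
# sequence of (state, action) pairs
trajectory = [(("s0", "a0"), 0.0), (("s1", "a1"), 1.0)]
```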
7. Supervised Learning: Regression
• Given (x1, y1), (x2, y2), ..., (xn, yn)
• Learn a function f(x) to predict y given x
– y is real-valued == regression
[Figure: September Arctic Sea Ice Extent (1,000,000 sq km) vs. Year, 1970–2020]
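A minimal sketch of regression in this setting, using illustrative (year, extent) values rather than the real measurements, and ordinary least squares as the learner:

```python
# Fit f(x) = w*x + b by least squares; data values are made up.
import numpy as np

years = np.array([1980, 1990, 2000, 2010, 2020], dtype=float)
extent = np.array([7.8, 6.2, 6.3, 4.9, 3.9])    # million sq km (illustrative)

w, b = np.polyfit(years, extent, deg=1)         # least-squares line
predict = lambda x: w * x + b                   # the learned f(x)

print(f"predicted extent in 2030: {predict(2030):.2f} million sq km")
```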
11. Supervised Learning: Classification
• Given (x1, y1), (x2, y2), ..., (xn, yn)
• Learn a function f(x) to predict y given x
– y is categorical == classification
[Figure: Breast Cancer (Malignant / Benign): labels 1 (Malignant) and 0 (Benign) plotted against Tumor Size, with a decision threshold on tumor size separating “Predict Benign” from “Predict Malignant”]
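A minimal sketch of this one-dimensional classifier, assuming illustrative tumor sizes and a simple threshold search (the slide does not specify the learning algorithm):

```python
# Learn a threshold on tumor size separating benign (0) from
# malignant (1); all data values are illustrative.
import numpy as np

size = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5])
label = np.array([0,   0,   0,   0,   1,   1,   1,   1])

# Try each midpoint between consecutive sizes; keep the threshold
# with the fewest training errors.
candidates = (size[:-1] + size[1:]) / 2
errors = [np.sum((size > t).astype(int) != label) for t in candidates]
threshold = candidates[int(np.argmin(errors))]

f = lambda x: int(x > threshold)   # predict 1 (malignant) above the threshold
print(threshold, f(2.2), f(4.2))   # e.g. 2.75, 0 (benign), 1 (malignant)
```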
14. Supervised Learning
• x can be multi-dimensional
– Each dimension corresponds to an attribute
[Figure: Tumor Size vs. Age scatter plot; additional attributes include Clump Thickness, Uniformity of Cell Size, Uniformity of Cell Shape, …]
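A short sketch of a multi-dimensional x, using the attribute names from the slide with illustrative values:

```python
# Each example is a feature vector; each dimension is one attribute.
import numpy as np

attributes = ["clump_thickness", "cell_size_uniformity", "cell_shape_uniformity"]

# One example = one row; values are illustrative (e.g. on a 1-10 scale)
X = np.array([
    [5, 1, 1],   # patient 1
    [8, 7, 9],   # patient 2
])
y = np.array([0, 1])   # 0 = benign, 1 = malignant

# f now maps a whole vector to a label, e.g. f(X[i]) -> y[i]
print(dict(zip(attributes, X[1])))   # attribute -> value for patient 2
```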
20. Designing a Learning System
• Choose the training experience
• Choose exactly what is to be learned
– i.e. the target function
• Choose how to represent the target function
• Choose a learning algorithm to infer the target
function from the experience
[Diagram: Environment/Experience → Learner → Knowledge → Performance Element, with training data going to the Learner and testing data to the Performance Element]
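A minimal sketch mapping the four design choices onto code, assuming a linear representation and least-squares learning purely for illustration:

```python
import numpy as np

# 1. Training experience: labeled examples (illustrative values)
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 4.1, 5.9, 8.2])

# 2. Target function: f(x) that predicts y from x
# 3. Representation: a linear function f(x) = w*x + b
# 4. Learning algorithm: least squares to infer w, b from the experience
A = np.hstack([X, np.ones((len(X), 1))])
(w, b), *_ = np.linalg.lstsq(A, y, rcond=None)

knowledge = lambda x: w * x + b   # the learned "knowledge"
print(knowledge(5.0))             # the performance element applies it to new data
```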
21. Training vs. Test Distribution
• We generally assume that the training and
test examples are independently drawn from
the same overall distribution of data
– We call this “i.i.d.”, which stands for “independent
and identically distributed”
• If examples are not independent, this calls for
collective classification
• If the test distribution differs from the training
distribution, this calls for transfer learning
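A small sketch of the i.i.d. assumption: both splits below are drawn independently from the same (illustrative) distribution, so test error tracks training error:

```python
import numpy as np

rng = np.random.default_rng(0)

def draw(n):
    # Same underlying distribution for both splits: y = 3x + noise
    x = rng.uniform(0, 1, n)
    return x, 3 * x + rng.normal(0, 0.1, n)

x_train, y_train = draw(100)   # training sample
x_test, y_test = draw(100)     # independent sample, same distribution

w, b = np.polyfit(x_train, y_train, deg=1)
mse = lambda x, y: np.mean((w * x + b - y) ** 2)
print(mse(x_train, y_train), mse(x_test, y_test))  # similar, as i.i.d. predicts
```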
22. Various Function Representations
• Numerical functions
– Linear regression
– Neural networks
– Support vector machines
• Symbolic functions
– Decision trees
– Rules in propositional logic
– Rules in first-order predicate logic
• Instance-based functions
– Nearest-neighbor
– Case-based
• Probabilistic Graphical Models
– Naïve Bayes
– Bayesian networks
– Hidden Markov Models (HMMs)
– Probabilistic Context-Free Grammars (PCFGs)
– Markov networks
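To make one representation from the list concrete, a minimal instance-based sketch (1-nearest-neighbor) on illustrative data:

```python
import numpy as np

X_train = np.array([[1.0, 1.0], [2.0, 2.0], [8.0, 8.0], [9.0, 9.0]])
y_train = np.array([0, 0, 1, 1])

def predict(x):
    # The "function" is just the stored instances plus a distance
    # metric: return the label of the closest training example.
    dists = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argmin(dists)]

print(predict(np.array([1.5, 1.2])))   # 0
print(predict(np.array([8.5, 8.7])))   # 1
```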
26. History of Machine Learning
• 1950s:
– Samuel’s checker player
– Selfridge’s Pandemonium
• 1960s:
– Neural networks: Perceptron
– Pattern recognition
– Learning in the limit theory
– Minsky and Papert prove limitations of the Perceptron
• 1970s:
– Symbolic concept induction
– Winston’s arch learner
– Expert systems and the knowledge acquisition bottleneck
– Quinlan’s ID3
– Michalski’s AQ and soybean diagnosis
– Scientific discovery with BACON
– Mathematical discovery with AM
27. History of Machine Learning (cont.)
• 1980s:
– Advanced decision tree and rule learning
– Explanation-based Learning (EBL)
– Learning in planning and problem solving
– Utility problem
– Analogy
– Cognitive architectures
– Resurgence of neural networks (connectionism, backpropagation)
– Valiant’s PAC Learning Theory
– Focus on experimental methodology
• 1990s:
– Data mining
– Adaptive software agents and web applications
– Text learning
– Reinforcement learning (RL)
– Inductive Logic Programming (ILP)
– Ensembles: Bagging, Boosting, and Stacking
– Bayes Net learning
28. History of Machine Learning (cont.)
• 2000s:
– Support vector machines & kernel methods
– Graphical models
– Statistical relational learning
– Transfer learning
– Sequence labeling
– Collective classification and structured outputs
– Computer Systems Applications (Compilers, Debugging, Graphics, Security)
– E-mail management
– Personalized assistants that learn
– Learning in robotics and vision
• 2010s:
– Deep learning systems
– Learning for big data
– Bayesian methods
– Multi-task & lifelong learning
– Applications to vision, speech, social networks, learning to read, etc.
– ???