Hichem Felouat - hichemfel@gmail.com - Algeria 1
The Fundamentals
of
Machine Learning
Hichem Felouat
hichemfel@gmail.com
What Is Artificial Intelligence?
Artificial intelligence (AI) is an area of computer science
that emphasizes the creation of intelligent machines that work
and react like humans.
• AI is an interdisciplinary science with multiple approaches.
• AI has become an essential part of the technology industry.
Subdomains of Artificial Intelligence
What Is Machine Learning?
• Machine Learning is the science (and
art) of programming computers so
they can learn from data.
• Machine Learning is the field of
study that gives computers the ability
to learn without being explicitly
programmed. —Arthur Samuel, 1959
What Does Learning Mean?
A computer program is said to
learn from experience E with
respect to some task T and some
performance measure P, if its
performance on T, as measured by
P, improves with experience E. —
Tom Mitchell, 1997
Timeline of Machine Learning
Why Use Machine Learning?
The traditional approach: if the problem is not trivial, your program will
likely become a long list of complex rules that is hard to maintain.
Why Use Machine Learning?
Machine Learning approach. The program is much shorter, easier to
maintain, and most likely more accurate.
Why Use Machine Learning?
Machine Learning can help humans learn.
Why Use Machine Learning?
AI Index 2019 Annual Report.
Applications of Machine Learning
Machine learning is currently the preferred approach in the following
domains:
1) Speech analysis: e.g., speech recognition, synthesis.
2) Computer vision: e.g., object recognition/detection.
3) Robotics: e.g., position/map estimation.
4) Bio-informatics: e.g., sequence alignment, genetic analysis.
5) E-commerce: e.g., automatic trading, fraud detection.
6) Financial analysis: e.g., portfolio allocation, credits.
7) Medicine: e.g., diagnosis, therapy design.
8) Web: e.g., content management, social networks.
Applications of Machine Learning
To summarize, Machine Learning is great for:
• Problems for which existing solutions require a lot of hand-tuning or
long lists of rules: one Machine Learning algorithm can often simplify
code and perform better.
• Complex problems for which there is no good solution at all using a
traditional approach: the best Machine Learning techniques can find a
solution.
How to get started with ML
1) Mathematics: statistics, probability, and
linear algebra (NumPy, SciPy).
2) Programming: data structures, OOP, and
parallel programming (Python).
3) Databases: SQL and NoSQL.
4) ML algorithms: regression, classification,
and clustering.
5) ML tools: scikit-learn, TensorFlow, and
Keras.
Machine Learning Vocabulary 1
1) Examples: Items or instances of data used for learning or evaluation. In our
spam problem, these examples correspond to the collection of email
messages we will use for learning and testing.
2) Training sample: Examples used to train a learning algorithm. In our spam
problem, the training sample consists of a set of email examples along with
their associated labels.
3) Labels: Values or categories assigned to examples. In classification
problems, examples are assigned specific categories, for instance, the spam
and non-spam categories in our binary classification problem. In regression,
items are assigned real-valued labels.
Machine Learning Vocabulary 2
4) Features: The set of attributes, often represented as a vector, associated with an example. In
the case of email messages, some relevant features may include the length of the message,
the name of the sender, various characteristics of the header, the presence of certain
keywords in the body of the message, and so on.
5) Test sample: Examples used to evaluate the performance of a learning algorithm. The test
sample is separate from the training and validation data and is not made available in the
learning stage. In the spam problem, the test sample consists of a collection of email
examples for which the learning algorithm must predict labels based on features. These
predictions are then compared with the labels of the test sample to measure the performance
of the algorithm.
6) Loss function: A function that measures the difference, or loss, between a
predicted label and a true label.
Types of Machine Learning Systems
There are so many different types of Machine Learning systems that it is
useful to classify them in broad categories based on:
• Whether or not they are trained with human supervision (supervised,
unsupervised, semi-supervised, and reinforcement learning).
• Whether or not they can learn incrementally on the fly (online versus
batch learning).
• Whether they work by simply comparing new data points to known data
points, or instead detect patterns in the training data and build a
predictive model, much like scientists do (instance-based versus
model-based learning).
Types of Machine Learning Systems
Supervised learning:
In supervised learning, the training data you feed to the
algorithm includes the desired solutions, called labels.
• When the label y is real-valued, we talk about regression.
• When the label y is discrete, we talk about classification.
Types of Machine Learning Systems
A labeled training set for supervised learning.
Types of Machine Learning Systems
Here are some of the most important supervised
learning algorithms:
• k-Nearest Neighbors
• Linear Regression
• Logistic Regression
• Support Vector Machines (SVMs)
• Decision Trees and Random Forests
• Neural networks
Types of Machine Learning Systems
Unsupervised Learning:
In unsupervised learning, as you might guess, the training data is
unlabeled. The system tries to learn without a teacher.
No labels are given to the learning algorithm, leaving it on its own to
explore or find structure in the data.
Types of Machine Learning Systems
An unlabeled training set for unsupervised learning.
Types of Machine Learning Systems
Here are some of the most important families of unsupervised
learning algorithms:
• Clustering
• Visualization and dimensionality reduction
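As an illustration of clustering on unlabeled data, here is a minimal sketch of k-means restricted to one-dimensional points. The slides do not name a specific clustering algorithm, so k-means, the helper name `k_means_1d`, and the toy data are all assumptions made for this example.

```python
def k_means_1d(points, centroids, n_iter=10):
    """Cluster 1-D points around the given initial centroids (illustrative helper)."""
    centroids = list(centroids)
    clusters = []
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


points = [1.0, 1.2, 0.8, 8.0, 8.3, 7.7]   # no labels are provided
centroids, clusters = k_means_1d(points, centroids=[0.0, 10.0])
print(centroids, clusters)
```

No label is ever passed to the function: the two groups emerge only from the distances between points.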
Types of Machine Learning Systems
Semi-Supervised Learning:
Some algorithms can deal with partially labeled training data,
usually a lot of unlabeled data and a little bit of labeled data. This
is called semi-supervised learning.
Most semi-supervised learning algorithms are combinations of
unsupervised and supervised algorithms.
Types of Machine Learning Systems
Reinforcement Learning:
• The learning system, called an agent in this context, can observe the
environment, select and perform actions, and get rewards in return (or
penalties in the form of negative rewards).
• It must then learn by itself what the best strategy, called a policy, is to get
the most reward over time.
• A policy defines what action the agent should choose when it is in a given
situation.
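The agent, reward, and policy vocabulary above can be made concrete with a small sketch. It uses tabular Q-learning, which the slides do not mention, on a made-up four-cell corridor; the environment, the reward values, and every name in the code are assumptions chosen for illustration.

```python
import random

N_STATES = 4          # cells 0..3; cell 3 is the goal (assumed toy environment)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Environment: return (next_state, reward, done) for one action."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    if next_state == N_STATES - 1:
        return next_state, 1.0, True       # reward for reaching the goal
    return next_state, -0.01, False        # small penalty (negative reward) per move


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action]
    for _ in range(episodes):
        state = 0
        for _ in range(50):                     # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)    # explore
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])  # exploit
            next_state, reward, done = step(state, action)
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# The learned policy: the best-valued action in each non-terminal state.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

Here the policy is simply a lookup from situation (cell) to action, learned from rewards alone without any labeled examples.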
Types of Machine Learning Systems
Batch learning:
In batch learning, the system is incapable of learning
incrementally: it must be trained using all the available
data. This will generally take a lot of time and computing
resources, so it is typically done offline. First, the system is
trained, and then it is launched into production and runs
without learning anymore; it just applies what it has learned.
This is called offline learning.
Types of Machine Learning Systems
Online learning:
In online learning, you train the system incrementally by
feeding it data instances sequentially, either individually
or in small groups called mini-batches. Each learning step is
fast and cheap, so the system can learn about new data
on the fly, as it arrives.
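A minimal sketch of one such incremental learning step, assuming a one-parameter model y ≈ w · x updated by stochastic gradient descent. The model, the learning rate, and the simulated stream are assumptions for illustration, not something the slides specify.

```python
def online_update(w, x, y, learning_rate=0.05):
    """One cheap learning step on a single instance for the model y ≈ w * x."""
    error = w * x - y
    # Gradient step on the per-instance loss 0.5 * (w * x - y) ** 2.
    return w - learning_rate * error * x


def data_stream(n):
    """Simulated stream: instances arrive one by one and follow y = 2x."""
    for i in range(n):
        x = float(i % 3 + 1)
        yield x, 2.0 * x


w = 0.0
for x, y in data_stream(200):
    w = online_update(w, x, y)   # learn from the instance, then discard it
print(w)
```

The model never sees the whole dataset at once; each instance is used for one update and can then be thrown away, which is what keeps every step fast.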
Instance-Based VS Model-Based Learning
One more way to categorize Machine Learning systems is by how
they generalize. Most Machine Learning tasks are about making
predictions. This means that given a number of training examples,
the system needs to be able to generalize to examples it has never
seen before.
Having a good performance measure on the training data is good,
but insufficient; the true goal is to perform well on new instances.
There are two main approaches to generalization: instance-based
learning and model-based learning.
Instance-Based VS Model-Based Learning
Instance-based learning:
The system learns the examples by heart, then generalizes to new
cases using a similarity measure.
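A k-nearest-neighbors classifier is a common example of instance-based learning. The sketch below assumes Euclidean distance as the similarity measure and uses made-up two-feature email data; neither choice comes from the slides.

```python
from collections import Counter
import math


def knn_predict(training_set, query, k=3):
    """Classify `query` by majority vote among its k most similar training examples."""
    # Similarity measure (assumed here): a smaller Euclidean distance means more similar.
    by_distance = sorted(training_set, key=lambda ex: math.dist(ex[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# The "model" is just the memorized examples: (features, label) pairs.
training_set = [
    ((1.0, 1.0), "ham"), ((1.5, 2.0), "ham"), ((2.0, 1.5), "ham"),
    ((6.0, 6.5), "spam"), ((7.0, 6.0), "spam"), ((6.5, 7.0), "spam"),
]
print(knn_predict(training_set, (6.2, 6.1)))
```

There is no training phase beyond storing the examples; all the work happens at prediction time.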
Instance-Based VS Model-Based Learning
Model-based learning:
Build a model of these examples, then use that model to make
predictions.
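By contrast, a model-based learner summarizes the examples in a few parameters. As a sketch, assume the model is a straight line fitted by ordinary least squares; the data points and function names are illustrative only.

```python
def fit_line(xs, ys):
    """Fit y ≈ slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return slope, mean_y - slope * mean_x


def predict(x, slope, intercept):
    """Use the fitted model on a new case; the training data is no longer needed."""
    return slope * x + intercept


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]
slope, intercept = fit_line(xs, ys)
print(predict(5.0, slope, intercept))
```

Once `slope` and `intercept` are known, predictions no longer depend on the stored examples, unlike the instance-based approach.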
Loss Function
The loss function computes the error for a single training
example, while the cost function is the average of the losses
over the entire training set.
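The distinction can be sketched in a few lines, assuming the squared error as the loss (the slides do not fix a particular loss function):

```python
def squared_loss(y_pred, y_true):
    """Loss: the error on a single example."""
    return (y_pred - y_true) ** 2


def cost(predictions, targets, loss=squared_loss):
    """Cost: the average of the per-example losses over the whole set."""
    losses = [loss(p, t) for p, t in zip(predictions, targets)]
    return sum(losses) / len(losses)


predictions = [2.5, 0.0, 2.0]
targets = [3.0, -0.5, 2.0]
print(cost(predictions, targets))
```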
Machine Learning Vocabulary 3
• Hyperparameters: configuration variables that are external to the model
and whose values cannot be estimated from data. That is to say, they cannot
be learned directly from the data in standard model training. They are almost
always specified by the machine learning engineer prior to training.
• Regression: this is the problem of predicting a real value for each item.
Examples of regression include prediction of stock values or that of variations
of economic variables.
• Classification: this is the problem of assigning a category to each item.
• Clustering: this is the problem of partitioning a set of items into
homogeneous subsets.
In Summary
1) You studied the data.
2) You selected a model.
3) You trained it on the training data.
4) Finally, you applied the model to make predictions
on new cases.
Main Challenges of Machine Learning
In short, since your main task is to select
a learning algorithm and train it on some
data, the two things that can go wrong are
“bad data” and “bad algorithm”.
Main Challenges of Machine Learning
1- Database
1) Insufficient Quantity of Training Data:
Most Machine Learning algorithms need a lot of data to work
properly. Even for very simple
problems you typically need thousands of examples, and
for complex problems such as image or speech
recognition you may need millions of examples (unless
you can reuse parts of an existing model).
Main Challenges of Machine Learning
1- Database
2) Non-representative Training Data:
In order to generalize well, it is crucial that your training data be representative of
the new cases you want to generalize to. This is true whether you use instance-
based learning or model-based learning.
Main Challenges of Machine Learning
1- Database
3) Poor-Quality Data:
If your training data is full of errors, outliers, and noise (e.g., due to poor-quality
measurements), it will be harder for the system to detect the underlying patterns, so your
system is less likely to perform well. It is often well worth the effort to spend time cleaning up
your training data. The truth is, most data scientists spend a significant part of their time
doing just that. For example:
1) If some instances are clearly outliers, it may help to simply discard them or try to fix
the errors manually.
2) If some instances are missing a few features (e.g., 5% of your customers did not
specify their age), you must decide whether you want to ignore this attribute altogether,
ignore these instances, fill in the missing values (e.g., with the median age), or train
one model with the feature and one model without it, and so on.
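One of the options above, filling in missing values with the median, can be sketched as follows. Representing a missing age as `None` is an assumption of this example.

```python
from statistics import median


def fill_missing_with_median(values):
    """Replace None entries with the median of the known values."""
    known = [v for v in values if v is not None]
    fill = median(known)
    return [fill if v is None else v for v in values]


ages = [25, None, 40, 31, None, 58]   # two customers did not specify their age
print(fill_missing_with_median(ages))
```

The median is computed only from the known values. In a real project the fill value should be computed on the training set and then reused on new data.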
Main Challenges of Machine Learning
1- Database
4) Irrelevant Features:
Your system will only be capable of learning if the training data contains enough
relevant features and not too many irrelevant ones. A critical part of the success
of a Machine Learning project is coming up with a good set of features to train on.
This process, called feature engineering, involves:
1) Feature selection: selecting the most useful features to train on among
existing features.
2) Feature extraction: combining existing features to produce a more useful
one (dimensionality reduction algorithms can help).
3) Creating new features by gathering new data.
Main Challenges of Machine Learning
2- Algorithm
1) Overfitting the Training Data:
Overfitting happens when a model learns the detail and noise in the training
data to the extent that it negatively impacts the performance of the model on
new data. This means that the noise or random fluctuations in the training data
are picked up and learned as concepts by the model. The problem is that these
concepts do not apply to new data and negatively impact the model's ability to
generalize.
The model performs well on the training data, but it does not
generalize well.
Main Challenges of Machine Learning
2- Algorithm
2) Underfitting the Training Data:
Underfitting is the opposite of overfitting: it occurs
when your model is too simple to learn the
underlying structure of the data.
How to Avoid Underfitting and Overfitting
Underfitting:
• Use a more complex model
• Add more features
• Train longer
Overfitting:
• Use validation
• Perform regularization
• Get more data
• Remove or add some features
Common Classification Model Evaluation Metrics: Confusion Matrix
The confusion matrix is used to describe the performance of a
classification model on a set of test data for which true values are known.
Common Classification Model Evaluation Metrics: Main Metrics
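The figures for these slides are not preserved in the transcript, so the following is only a sketch of a binary confusion matrix and of the metrics most commonly derived from it. Accuracy, precision, recall, and F1 are assumed to be the "main metrics" meant here, and the labels are made up.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def main_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 computed from the four counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # known true values of the test set
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # what the classifier predicted
counts = confusion_counts(y_true, y_pred)
print(counts, main_metrics(*counts))
```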
Common Regression Model Evaluation Metrics:
• Mean Absolute Error (MAE)
• Mean Square Error (MSE)
• Mean Absolute Percentage Error (MAPE)
• Mean Percentage Error (MPE)
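The formulas on these slides are not preserved, so the sketch below uses the usual textbook definitions of the four regression metrics named above. The sign convention (true value minus prediction) and the use of percentages are assumptions.

```python
def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_percentage_error(y_true, y_pred):
    # Undefined when a true value is zero.
    return 100 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_percentage_error(y_true, y_pred):
    # Signed errors can cancel out, so this measures bias rather than accuracy.
    return 100 * sum((t - p) / t for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]
print(mean_absolute_error(y_true, y_pred), mean_squared_error(y_true, y_pred))
```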
Testing and Validating
It is common to use 80% of the data for training and hold out 20% for
testing.
If the training error is low (i.e., your model makes few mistakes on the training
set) but the generalization error is high, it means that your model is overfitting the
training data.
A common solution to this problem is to have a second holdout set called the
validation set. You train multiple models with various hyperparameters using the
training set, you select the model and hyperparameters that perform best on the
validation set, and when you’re happy with your model you run a single final test
against the test set to get an estimate of the generalization error.
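A minimal sketch of such a split, holding out 20% of the data for testing as described above. Taking a further 20% for validation and the fixed seed are assumptions made for this example.

```python
import random


def train_val_test_split(examples, val_ratio=0.2, test_ratio=0.2, seed=42):
    """Shuffle the examples, then cut them into training, validation, and test sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    n_val = round(len(shuffled) * val_ratio)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


examples = list(range(100))
train, val, test = train_val_test_split(examples)
print(len(train), len(val), len(test))
```

The test set is cut off first and should be touched only once, for the final estimate of the generalization error.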
Testing and Validating: Cross-Validation
Cross-validation (CV): the training set is split into
complementary subsets, and each model is trained against
a different combination of these subsets and validated
against the remaining parts. Once the model type and
hyperparameters have been selected, a final model is
trained using these hyperparameters on the full training set,
and the generalization error is measured on the test set.
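The splitting into complementary subsets can be sketched as a k-fold index generator. The function name is an illustrative choice, and the sketch does not shuffle the data, which a real setup usually would.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        training = [i for i in range(n_samples) if i < start or i >= start + size]
        yield training, validation
        start += size


folds = list(k_fold_indices(n_samples=10, k=5))
for training, validation in folds:
    print(training, validation)
```

Each example is used for validation exactly once and for training in the other k − 1 folds.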
Boosting
Boosting refers to any ensemble method that can combine
several weak learners into a strong learner. The general
idea of most boosting methods is to train predictors
sequentially, each trying to correct its predecessor. There
are many boosting methods available, but by far the most
popular are AdaBoost (Adaptive Boosting) and Gradient
Boosting.
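The sequential idea can be sketched with a small AdaBoost implementation. It assumes one-dimensional inputs, labels in {+1, −1}, and threshold "stumps" as weak learners; the toy dataset and all helper names are assumptions for illustration.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak learner: predict `polarity` left of the threshold, the opposite right of it."""
    return polarity if x < threshold else -polarity


def best_stump(xs, ys, weights):
    """Pick the threshold and polarity with the lowest weighted error."""
    best = None
    values = sorted(set(xs))
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
    for threshold in thresholds:
        for polarity in (1, -1):
            error = sum(
                w for x, y, w in zip(xs, ys, weights)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, n_rounds=3):
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []                                   # (alpha, threshold, polarity)
    for _ in range(n_rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)   # avoid division by zero
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Raise the weights of misclassified instances, lower the others.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def ensemble_predict(ensemble, x):
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1


xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]   # no single stump separates these
model = adaboost(xs, ys, n_rounds=3)
print([ensemble_predict(model, x) for x in xs])
```

Each new stump is trained on reweighted data that emphasizes the mistakes of its predecessors, and the final prediction is a weighted vote of all stumps.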
AdaBoost sequential training with instance weight updates
Voting Classifiers
A voting classifier is a meta-classifier that combines similar or
conceptually different machine learning classifiers for classification via majority
or plurality voting. (For simplicity, we will refer to both majority and plurality voting
as majority voting.)
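A minimal sketch of hard (majority) voting follows. The three rule-based "classifiers" are stand-ins invented for this example; in practice they would be trained models.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of classifiers."""
    return Counter(predictions).most_common(1)[0][0]


def voting_predict(classifiers, x):
    return majority_vote([clf(x) for clf in classifiers])


# Three toy classifiers that look at different aspects of an email.
def by_length(email):
    return "spam" if len(email) > 40 else "ham"


def by_keyword(email):
    return "spam" if "free" in email.lower() else "ham"


def by_exclamations(email):
    return "spam" if email.count("!") >= 2 else "ham"


classifiers = [by_length, by_keyword, by_exclamations]
print(voting_predict(classifiers, "Free prize!! Click now"))
```

In the example, the length rule votes "ham" but is outvoted by the other two classifiers.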
Dimensionality Reduction
Many Machine Learning problems involve thousands or even millions of features
for each training instance. Not only does this make training extremely slow, but it
can also make it much harder to find a good solution. This problem is often
referred to as the curse of dimensionality.
Principal Component Analysis
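As a rough sketch of Principal Component Analysis, the code below reduces two-dimensional points to one coordinate along the direction of largest variance. Power iteration and the fixed starting vector are simplifications assumed for this example; real implementations typically use an SVD-based routine.

```python
import math


def first_principal_component(points, n_iter=100):
    """Return (unit vector of largest variance, mean) for 2-D points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 covariance matrix.
    sxx = sum(x * x for x, _ in centered) / n
    sxy = sum(x * y for x, y in centered) / n
    syy = sum(y * y for _, y in centered) / n
    # Power iteration: multiply by the covariance matrix and renormalize.
    # The start vector (1, 1) is an assumption; it fails only if it is
    # exactly orthogonal to the principal direction.
    vx, vy = 1.0, 1.0
    for _ in range(n_iter):
        vx, vy = sxx * vx + sxy * vy, sxy * vx + syy * vy
        norm = math.hypot(vx, vy)
        vx, vy = vx / norm, vy / norm
    return (vx, vy), (mean_x, mean_y)


def project(point, component, mean):
    """Reduce a 2-D point to a single coordinate along the component."""
    return (point[0] - mean[0]) * component[0] + (point[1] - mean[1]) * component[1]


points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
component, mean = first_principal_component(points)
reduced = [project(p, component, mean) for p in points]
print(component, reduced)
```

Because the toy points lie on a line, one coordinate preserves all of their variance; with real data some information is lost in exchange for fewer features.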
Hyperparameter Tuning
Hyperparameter tuning works by running multiple trials in a
single training job. Each trial is a complete execution of your training
application with values for your chosen hyperparameters, set within
limits you specify. The AI Platform training service keeps track of the
results of each trial and makes adjustments for subsequent trials.
When the job is finished, you can get a summary of all the trials along
with the most effective configuration of values according to the
criteria you specify.
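The trial loop can be sketched locally as a plain grid search. This is not how the AI Platform service chooses its trials, since that service adjusts later trials based on earlier results. The one-parameter ridge model, the candidate values, and the data are all assumptions for illustration.

```python
def train_ridge(xs, ys, lam):
    """Fit y ≈ w * x with an L2 penalty; `lam` is the hyperparameter being tuned."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def validation_error(w, xs, ys):
    """The criterion used to compare trials: mean squared error on held-out data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def tune(candidates, train, val):
    """Run one complete training trial per candidate value and keep the best."""
    trials = []
    for lam in candidates:
        w = train_ridge(train[0], train[1], lam)
        trials.append((validation_error(w, val[0], val[1]), lam))
    best_error, best_lam = min(trials)
    return best_lam, trials


train = ([1.0, 2.0], [3.0, 5.0])
val = ([3.0], [6.0])
best_lam, trials = tune([0.0, 1.0, 1.5, 5.0], train, val)
print(best_lam, trials)
```

The returned `trials` list plays the role of the summary of all trials, and `best_lam` is the most effective configuration under the chosen criterion.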
Steps to Build a Machine Learning System
1. Data collection.
2. Improving data quality (data preprocessing).
3. Feature engineering (feature extraction and
selection, dimensionality reduction).
4. Splitting data into training and evaluation sets.
5. Algorithm selection.
6. Training.
7. Evaluation + Hyperparameter tuning.
8. Testing.
9. Deployment.
Deep Learning
Deep Learning is a subfield of machine learning
concerned with algorithms, called artificial neural networks,
that are inspired by the structure and function of the brain.
Deep Learning VS Machine Learning
Feature extraction
Feature engineering is, however, a tedious process for several
reasons: it takes a lot of time and requires expert knowledge.
For learning-based applications, a lot of time is spent adjusting the
features.
Extracted features often lack a structural representation reflecting
the abstraction levels in the problem at hand.
Representation learning
Deep Learning aims to learn representations automatically
from large sets of labeled data:
• The machine is fed raw data.
• Representations are discovered automatically.
Deep learning models
Several DL models have been proposed:
• Autoencoders (AEs)
• Deep belief networks (DBNs)
• Convolutional neural networks (CNNs)
• Recurrent neural networks (RNNs)
• Generative adversarial networks (GANs), etc.
Convolutional neural networks (CNNs)
Thank you for your
attention

The fundamentals of Machine Learning

  • 1. Hichem Felouat - hichemfel@gmail.com - Algeria 1 The Fundamentals of Machine Learning Hichem Felouat hichemfel@gmail.com
  • 2. 2Hichem Felouat - hichemfel@gmail.com - Algeria What Is Artificial Intelligence? Artificial intelligence (AI) is an area of computer science that emphasizes the creation of intelligent machines that work and react like humans. • AI is an interdisciplinary science with multiple approaches. • AI has become an essential part of the technology industry.
  • 3. Subdomains of Artificial Intelligence 3Hichem Felouat - hichemfel@gmail.com - Algeria
  • 4. 4Hichem Felouat - hichemfel@gmail.com - Algeria What Is Machine Learning? • Machine Learning is the science (and art) of programming computers so they can learn from data. • Machine Learning is the field of study that gives computers the ability to learn without being explicitly programmed. —Arthur Samuel, 1959
  • 5. Hichem Felouat - hichemfel@gmail.com - Algeria 5 What Does Learning Mean? A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E. — Tom Mitchell, 1997
  • 6. Hichem Felouat - hichemfel@gmail.com - Algeria 6 Timeline of Machine Learning
  • 7. Hichem Felouat - hichemfel@gmail.com - Algeria 7 Why Use Machine Learning? The traditional approach. If the problem is not trivial, your program will likely become a long list of complex rules pretty hard to maintain.
  • 8. Hichem Felouat - hichemfel@gmail.com - Algeria 8 Why Use Machine Learning? Machine Learning approach. The program is much shorter, easier to maintain, and most likely more accurate.
  • 9. Hichem Felouat - hichemfel@gmail.com - Algeria 9 Why Use Machine Learning? Machine Learning can help humans learn.
  • 10. Hichem Felouat - hichemfel@gmail.com - Algeria 10 Why Use Machine Learning? AI Index 2019 Annual Report.
  • 11. Hichem Felouat - hichemfel@gmail.com - Algeria 11 Applications of Machine Learning Machine learning is currently the preferred approach in the following domains: 1) Speech analysis: e.g., speech recognition, synthesis. 2) Computer vision: e.g., object recognition/detection. 3) Robotics: e.g., position/map estimation. 4) Bio-informatics: e.g., sequence alignment, genetic analysis. 5) E-commerce: e.g., automatic trading, fraud detection. 6) Financial analysis: e.g., portfolio allocation, credits. 7) Medicine: e.g., diagnosis, therapy conception. 8) Web: e.g., Content management, social networks, etc.
  • 12. Hichem Felouat - hichemfel@gmail.com - Algeria 12 Applications of Machine Learning To summarize, Machine Learning is great for: • Problems for which existing solutions require a lot of hand-tuning or long lists of rules: one Machine Learning algorithm can often simplify code and perform better. • Complex problems for which there is no good solution at all using a traditional approach: the best Machine Learning techniques can find a solution.
  • 13. Hichem Felouat - hichemfel@gmail.com - Algeria 13 How to get started with ML 1) Mathematics: statistics, probability, and linear algebra.(NumPy, SciPy) 2) Programming: data structures, OOP, and parallel programming. (Python) 3) Databases: SQL and NOSQL. 4) ML algorithms: regression, classification, and clustering. 5) ML Tools: Scikt learn, TensorFlow and Keras.
  • 14. Hichem Felouat - hichemfel@gmail.com - Algeria 14 How to get started with ML
  • 15. Hichem Felouat - hichemfel@gmail.com - Algeria 15 Machine Learning Vocabulary 1 1) Examples: Items or instances of data used for learning or evaluation. In our spam problem, these examples correspond to the collection of email messages we will use for learning and testing. 2) Training sample: Examples used to train a learning algorithm. In our spam problem, the training sample consists of a set of email examples along with their associated labels. 3) Labels: Values or categories assigned to examples. In classification problems, examples are assigned specific categories, for instance, the spam and non-spam categories in our binary classification problem. In regression, items are assigned real-valued labels.
  • 16. Hichem Felouat - hichemfel@gmail.com - Algeria 16 Machine Learning Vocabulary 2 4) Features: The set of attributes, often represented as a vector, associated with an example. In the case of email messages, some relevant features may include the length of the message, the name of the sender, various characteristics of the header, the presence of certain keywords in the body of the message, and so on. 5) Test sample: Examples used to evaluate the performance of a learning algorithm. The test sample is separate from the training and validation data and is not made available in the learning stage. In the spam problem, the test sample consists of a collection of email examples for which the learning algorithm must predict labels based on features. These predictions are then compared with the labels of the test sample to measure the performance of the algorithm. 6) Loss function: A function that measures the difference, or loss, between a predicted label and a true label.
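The vocabulary above can be made concrete with a small sketch of the spam problem. The keyword list, the two sample emails, and the function names are assumptions chosen only for illustration; a zero-one loss is used here as one simple example of a loss function for classification.

```python
# Hypothetical keyword list, chosen only for illustration.
SPAM_KEYWORDS = ("free", "winner", "prize")

def extract_features(body):
    """Turn the body of one email (an example) into a feature vector."""
    text = body.lower()
    # First feature: message length. Then one 0/1 flag per keyword.
    return [len(text)] + [1 if word in text else 0 for word in SPAM_KEYWORDS]

def zero_one_loss(predicted_label, true_label):
    """Loss for one example: 0 if the prediction is right, 1 otherwise."""
    return 0 if predicted_label == true_label else 1

# A tiny labeled sample: (example, label) pairs.
training_sample = [
    ("You are a WINNER, claim your free prize", "spam"),
    ("Meeting moved to 3pm", "non-spam"),
]
features = [extract_features(body) for body, _ in training_sample]
```

Each email becomes a vector of four numbers, and the loss compares a predicted label with the true one.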
  • 17. Hichem Felouat - hichemfel@gmail.com - Algeria 17 Types of Machine Learning Systems There are so many different types of Machine Learning systems that it is useful to classify them in broad categories based on: • Whether or not they are trained with human supervision (supervised, unsupervised, semi-supervised, and Reinforcement Learning). • Whether or not they can learn incrementally on the fly (online versus batch learning). • Whether they work by simply comparing new data points to known data points, or instead detect patterns in the training data and build a predictive model, much like scientists do (instance-based versus model-based learning).
  • 18. Hichem Felouat - hichemfel@gmail.com - Algeria 18 Types of Machine Learning Systems
  • 19. Hichem Felouat - hichemfel@gmail.com - Algeria 19 Types of Machine Learning Systems Supervised learning: In supervised learning, the training data you feed to the algorithm includes the desired solutions, called labels. • When the label y is real-valued, we talk about regression. • When the label y is discrete, we talk about classification.
  • 20. Hichem Felouat - hichemfel@gmail.com - Algeria 20 Types of Machine Learning Systems A labeled training set for supervised learning.
  • 21. Hichem Felouat - hichemfel@gmail.com - Algeria 21 Types of Machine Learning Systems Here are some of the most important supervised learning algorithms: • k-Nearest Neighbors • Linear Regression • Logistic Regression • Support Vector Machines (SVMs) • Decision Trees and Random Forests • Neural networks*
  • 22. Hichem Felouat - hichemfel@gmail.com - Algeria 22 Types of Machine Learning Systems Unsupervised Learning: In unsupervised learning, as you might guess, the training data is unlabeled. The system tries to learn without a teacher. No labels are given to the learning algorithm, leaving it on its own to explore or find structure in the data.
  • 23. Hichem Felouat - hichemfel@gmail.com - Algeria 23 Types of Machine Learning Systems An unlabeled training set for unsupervised learning.
  • 24. Hichem Felouat - hichemfel@gmail.com - Algeria 24 Types of Machine Learning Systems Here are some of the most important unsupervised learning algorithms: • Clustering • Visualization and dimensionality reduction
  • 25. Hichem Felouat - hichemfel@gmail.com - Algeria 25 Types of Machine Learning Systems Semi-Supervised Learning : Some algorithms can deal with partially labeled training data, usually a lot of unlabeled data and a little bit of labeled data. This is called semi-supervised learning. Most semi-supervised learning algorithms are combinations of unsupervised and supervised algorithms.
  • 26. Hichem Felouat - hichemfel@gmail.com - Algeria 26 Types of Machine Learning Systems Reinforcement Learning: • The learning system, called an agent in this context, can observe the environment, select and perform actions, and get rewards in return (or penalties in the form of negative rewards). • It must then learn by itself what the best strategy, called a policy, is to get the most reward over time. • A policy defines what action the agent should choose when it is in a given situation.
  • 27. Hichem Felouat - hichemfel@gmail.com - Algeria 27 Types of Machine Learning Systems Reinforcement Learning
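The agent, reward, and policy described above can be sketched with tabular Q-learning. Q-learning is only one of many reinforcement learning methods and is not named in the slides; the five-cell corridor environment, the reward of +1 at the goal, and all parameter values are assumptions made for this illustration.

```python
import random

N_STATES = 5                      # corridor cells 0..4; cell 4 is the goal
ACTIONS = ("left", "right")

def step(state, action):
    """Environment: move one cell; reward +1 only when the goal is reached."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

def greedy(q_row, rng):
    """Pick the best-valued action, breaking ties at random."""
    best = max(q_row.values())
    return rng.choice([a for a in ACTIONS if q_row[a] == best])

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):      # cap the episode length
            explore = rng.random() < epsilon
            action = rng.choice(ACTIONS) if explore else greedy(q[state], rng)
            nxt, reward = step(state, action)
            # Move the estimate toward reward + discounted best next value.
            target = reward + gamma * max(q[nxt].values())
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if state == N_STATES - 1:
                break
    return q

q_table = train()
# The learned policy: the preferred action in each non-goal cell.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

After training, the policy tells the agent which action to take in each cell of the corridor.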
  • 28. Hichem Felouat - hichemfel@gmail.com - Algeria 28 Types of Machine Learning Systems Batch learning: In batch learning, the system is incapable of learning incrementally: it must be trained using all the available data. This will generally take a lot of time and computing resources, so it is typically done offline. First, the system is trained, and then it is launched into production and runs without learning anymore; it just applies what it has learned. This is called offline learning.
  • 29. Hichem Felouat - hichemfel@gmail.com - Algeria 29 Types of Machine Learning Systems Online learning: In online learning, you train the system incrementally by feeding it data instances sequentially, either individually or in small groups called mini-batches. Each learning step is fast and cheap, so the system can learn about new data on the fly, as it arrives.
  • 30. Hichem Felouat - hichemfel@gmail.com - Algeria 30 Types of Machine Learning Systems Online learning
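As a minimal sketch of online learning, the loop below updates a one-variable linear model with one gradient step per mini-batch. The data stream is a hypothetical generator that produces noise-free points on the line y = 2x + 1; the learning rate and batch size are illustrative assumptions.

```python
def sgd_step(w, b, batch, lr=0.1):
    """One cheap learning step on a mini-batch of (x, y) pairs."""
    n = len(batch)
    # Gradients of the mean squared error of the model y = w * x + b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in batch) / n
    return w - lr * grad_w, b - lr * grad_b

def stream(n_batches, batch_size=4):
    """Hypothetical data source: mini-batches of points on y = 2x + 1."""
    for i in range(n_batches):
        xs = [((i * batch_size + j) % 10) / 10 for j in range(batch_size)]
        yield [(x, 2 * x + 1) for x in xs]

w, b = 0.0, 0.0
for batch in stream(2000):
    w, b = sgd_step(w, b, batch)  # the model is updated as data arrives
```

The model never sees the whole dataset at once; each mini-batch can be discarded after its update.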
  • 31. Hichem Felouat - hichemfel@gmail.com - Algeria 31 Instance-Based VS Model-Based Learning One more way to categorize Machine Learning systems is by how they generalize. Most Machine Learning tasks are about making predictions. This means that given a number of training examples, the system needs to be able to generalize to examples it has never seen before. Having a good performance measure on the training data is good, but insufficient; the true goal is to perform well on new instances. There are two main approaches to generalization: instance-based learning and model-based learning.
  • 32. Hichem Felouat - hichemfel@gmail.com - Algeria 32 Instance-Based VS Model-Based Learning Instance-based learning: The system learns the examples by heart, then generalizes to new cases using a similarity measure.
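A minimal sketch of instance-based learning is k-Nearest Neighbors: the training examples are simply stored, and a new case gets the majority label of its closest stored examples. Euclidean distance is used as the similarity measure here; the data points and the function name are illustrative assumptions.

```python
import math
from collections import Counter

def knn_predict(training_examples, query, k=3):
    """Classify `query` by majority label among its k nearest stored examples."""
    by_distance = sorted(training_examples,
                         key=lambda ex: math.dist(ex[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# The "learned by heart" examples: (feature vector, label) pairs.
memorized = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
             ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B")]

label_near_a = knn_predict(memorized, (1.1, 1.0))
label_near_b = knn_predict(memorized, (5.0, 5.2))
```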
  • 33. Hichem Felouat - hichemfel@gmail.com - Algeria 33 Instance-Based VS Model-Based Learning Model-based learning: Build a model of these examples, then use that model to make predictions.
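By contrast, a model-based learner summarizes the examples with a few parameters. The sketch below fits a straight line by ordinary least squares and then predicts with that line alone; the data values and function names are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Use only the fitted parameters, not the training examples."""
    slope, intercept = model
    return slope * x + intercept

model = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
prediction = predict(model, 5)
```

Once the two parameters are fitted, the training examples are no longer needed to make predictions.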
  • 34. Hichem Felouat - hichemfel@gmail.com - Algeria 34 Loss Function The loss function computes the error for a single training example, while the cost function is the average of the losses over the entire training set.
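The distinction can be sketched in a few lines. Squared error is used as the per-example loss purely as an assumption for illustration; the slide does not fix a particular loss.

```python
def squared_loss(y_pred, y_true):
    """Loss: the error on a single example."""
    return (y_pred - y_true) ** 2

def cost(predictions, targets):
    """Cost: the average of the per-example losses over the whole set."""
    losses = [squared_loss(p, t) for p, t in zip(predictions, targets)]
    return sum(losses) / len(losses)

predictions = [2.5, 0.0, 2.0]
targets = [3.0, -0.5, 2.0]
total_cost = cost(predictions, targets)   # per-example losses: 0.25, 0.25, 0.0
```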
  • 35. Hichem Felouat - hichemfel@gmail.com - Algeria 35 Machine Learning Vocabulary 3 • Hyperparameters: configuration variables that are external to the model and whose values cannot be estimated from data. That is to say, they cannot be learned directly from the data in standard model training. They are almost always specified by the machine learning engineer prior to training. • Regression: the problem of predicting a real value for each item. Examples of regression include predicting stock values or variations of economic variables. • Classification: the problem of assigning a category to each item. • Clustering: the problem of partitioning a set of items into homogeneous subsets.
  • 36. Hichem Felouat - hichemfel@gmail.com - Algeria 36 In Summary 1) You studied the data. 2) You selected a model. 3) You trained it on the training data. 4) Finally, you applied the model to make predictions on new cases.
  • 37. Hichem Felouat - hichemfel@gmail.com - Algeria 37 Main Challenges of Machine Learning In short, since your main task is to select a learning algorithm and train it on some data, the two things that can go wrong are “bad data” and “bad algorithm”.
  • 38. Hichem Felouat - hichemfel@gmail.com - Algeria 38 Main Challenges of Machine Learning 1- Database
  • 39. Hichem Felouat - hichemfel@gmail.com - Algeria 39 Main Challenges of Machine Learning 1- Database 1) Insufficient Quantity of Training Data: It takes a lot of data for most Machine Learning algorithms to work properly. Even for very simple problems you typically need thousands of examples, and for complex problems such as image or speech recognition you may need millions of examples (unless you can reuse parts of an existing model).
  • 40. Hichem Felouat - hichemfel@gmail.com - Algeria 40 Main Challenges of Machine Learning 1- Database 2) Non-representative Training Data: In order to generalize well, it is crucial that your training data be representative of the new cases you want to generalize to. This is true whether you use instance-based learning or model-based learning.
  • 41. Hichem Felouat - hichemfel@gmail.com - Algeria 41 Main Challenges of Machine Learning 1- Database 3) Poor-Quality Data: If your training data is full of errors, outliers, and noise (e.g., due to poor quality measurements), it will make it harder for the system to detect the underlying patterns, so your system is less likely to perform well. It is often well worth the effort to spend time cleaning up your training data. The truth is, most data scientists spend a significant part of their time doing just that. For example: 1) If some instances are clearly outliers, it may help to simply discard them or try to fix the errors manually. 2) If some instances are missing a few features (e.g., 5% of your customers did not specify their age), you must decide whether you want to ignore this attribute altogether, ignore these instances, fill in the missing values (e.g., with the median age), or train one model with the feature and one model without it, and so on.
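One of the options listed above, filling in missing values with the median, can be sketched as follows. The customer records and the function name are hypothetical, and missing values are assumed to be stored as `None`.

```python
from statistics import median

def fill_missing_with_median(rows, key):
    """Return copies of `rows` where a missing `key` gets the median value."""
    known = [row[key] for row in rows if row[key] is not None]
    fill_value = median(known)
    return [dict(row, **{key: fill_value}) if row[key] is None else dict(row)
            for row in rows]

# Hypothetical customer records; one customer did not specify an age.
customers = [{"age": 25}, {"age": None}, {"age": 40}, {"age": 31}]
cleaned = fill_missing_with_median(customers, "age")
```

The original records are left untouched, so the other options (dropping the attribute or the instances) remain available for comparison.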
  • 42. Hichem Felouat - hichemfel@gmail.com - Algeria 42 Main Challenges of Machine Learning 1- Database 4) Irrelevant Features: Your system will only be capable of learning if the training data contains enough relevant features and not too many irrelevant ones. A critical part of the success of a Machine Learning project is coming up with a good set of features to train on. This process, called feature engineering, involves: 1) Feature selection: selecting the most useful features to train on among existing features. 2) Feature extraction: combining existing features to produce a more useful one (dimensionality reduction algorithms can help). 3) Creating new features by gathering new data.
  • 43. Hichem Felouat - hichemfel@gmail.com - Algeria 43 Main Challenges of Machine Learning 2- Algorithm 1) Overfitting the Training Data: Overfitting happens when a model learns the detail and noise in the training data to the extent that it negatively impacts the performance of the model on new data. This means that the noise or random fluctuations in the training data are picked up and learned as concepts by the model. The problem is that these concepts do not apply to new data and negatively impact the model's ability to generalize. The model performs well on the training data, but it does not generalize well.
  • 44. Hichem Felouat - hichemfel@gmail.com - Algeria 44 Main Challenges of Machine Learning 2- Algorithm 2) Underfitting the Training Data: Underfitting is the opposite of overfitting: it occurs when your model is too simple to learn the underlying structure of the data.
  • 45. Hichem Felouat - hichemfel@gmail.com - Algeria 45 Main Challenges of Machine Learning 2- Algorithm
  • 46. Hichem Felouat - hichemfel@gmail.com - Algeria 46 Main Challenges of Machine Learning 2- Algorithm
  • 47. Hichem Felouat - hichemfel@gmail.com - Algeria 47 How to Avoid Underfitting and Overfitting Underfitting: • Increase model complexity • Add more features • Train longer Overfitting: • Use validation (e.g., cross-validation) • Perform regularization • Get more data • Remove/Add some features
  • 48. Hichem Felouat - hichemfel@gmail.com - Algeria 48 Common Classification Model Evaluation Metrics : Confusion Matrix The confusion matrix is used to describe the performance of a classification model on a set of test data for which true values are known.
  • 49. Hichem Felouat - hichemfel@gmail.com - Algeria 49 Common Classification Model Evaluation metrics : Main Metrics
  • 50. Hichem Felouat - hichemfel@gmail.com - Algeria 50 Common Classification Model Evaluation metrics : Main Metrics
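As a sketch of how a confusion matrix leads to the main metrics, the code below counts the four cells for a binary problem and derives accuracy, precision, recall, and F1. That these four are the metrics meant by the slide titles is an assumption; the labels and predictions are made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def main_metrics(tp, fp, fn, tn):
    """Derive common metrics from the confusion-matrix cells."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
metrics = main_metrics(*counts)
```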
  • 51. Hichem Felouat - hichemfel@gmail.com - Algeria 51 Common Regression Model Evaluation metrics : Mean Absolute Error
  • 52. Hichem Felouat - hichemfel@gmail.com - Algeria 52 Common Regression Model Evaluation metrics : Mean Square Error
  • 53. Hichem Felouat - hichemfel@gmail.com - Algeria 53 Common Regression Model Evaluation metrics : Mean Absolute Percentage Error
  • 54. Hichem Felouat - hichemfel@gmail.com - Algeria 54 Common Regression Model Evaluation metrics : Mean Percentage Error
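The four regression metrics named above can be sketched with their usual textbook definitions. The sign convention for the percentage errors (actual minus predicted, divided by actual) is an assumption, as are the sample values; both percentage metrics are undefined when a true value is zero.

```python
def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_absolute_percentage_error(y_true, y_pred):
    """MAPE in percent; requires every true value to be non-zero."""
    return 100 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_percentage_error(y_true, y_pred):
    """MPE in percent; signed, so over- and under-estimates can cancel."""
    return 100 * sum((t - p) / t for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [100, 200, 400]
y_pred = [110, 190, 400]
mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
mape = mean_absolute_percentage_error(y_true, y_pred)
mpe = mean_percentage_error(y_true, y_pred)
```

Because MPE keeps the sign of each error, it indicates bias (systematic over- or under-prediction) rather than overall accuracy.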
  • 55. Hichem Felouat - hichemfel@gmail.com - Algeria 55 Testing and Validating It is common to use 80% of the data for training and hold out 20% for testing. If the training error is low (i.e., your model makes few mistakes on the training set) but the generalization error is high, it means that your model is overfitting the training data. A common solution to this problem is to have a second holdout set called the validation set. You train multiple models with various hyperparameters using the training set, you select the model and hyperparameters that perform best on the validation set, and when you’re happy with your model you run a single final test against the test set to get an estimate of the generalization error.
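A minimal sketch of the three-way split follows. The 60/20/20 proportions are an assumption: they correspond to holding out 20% for testing and then setting aside a further 20% of the data as the validation set.

```python
import random

def split_dataset(examples, train_fraction=0.6, validation_fraction=0.2, seed=0):
    """Shuffle once, then cut into training, validation, and test parts."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    n_val = round(len(shuffled) * validation_fraction)
    training = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # held out until the final check
    return training, validation, test

training, validation, test = split_dataset(range(100))
```

Models are compared on `validation`; `test` is used once, at the end, to estimate the generalization error.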
  • 56. Hichem Felouat - hichemfel@gmail.com - Algeria 56 Testing and Validating : Cross-Validation Cross-Validation (CV): the training set is split into complementary subsets, and each model is trained against a different combination of these subsets and validated against the remaining parts. Once the model type and hyperparameters have been selected, a final model is trained using these hyperparameters on the full training set, and the generalization error is measured on the test set.
  • 57. Hichem Felouat - hichemfel@gmail.com - Algeria 57 Testing and Validating : Cross-Validation
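The splitting scheme behind cross-validation can be sketched as k-fold index generation: each fold serves once as the validation part while the remaining folds form the training part. The function name and the choice of 10 examples and 3 folds are illustrative assumptions.

```python
def k_fold_indices(n_examples, k):
    """Yield (training_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_examples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_examples // k + (1 if i < n_examples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        yield training, validation
        start += size

folds = list(k_fold_indices(10, 3))
```

In practice the indices are usually shuffled first; that step is left out here to keep the folds easy to inspect.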
  • 58. Hichem Felouat - hichemfel@gmail.com - Algeria 58 Boosting Boosting refers to any Ensemble method that can combine several weak learners into a strong learner. The general idea of most boosting methods is to train predictors sequentially, each trying to correct its predecessor. There are many boosting methods available, but by far the most popular are AdaBoost (Adaptive Boosting) and Gradient Boosting.
  • 59. Hichem Felouat - hichemfel@gmail.com - Algeria 59 Boosting AdaBoost sequential training with instance weight updates
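The instance weight update can be sketched for a single boosting round. This follows the classic discrete AdaBoost formulation; other variants (for example with a learning rate) differ in detail, and the weights and the misclassification pattern below are made up for illustration.

```python
import math

def adaboost_reweight(weights, misclassified):
    """One round: compute the predictor weight, then re-weight the instances.

    Assumes the weighted error is strictly between 0 and 1.
    """
    error = sum(w for w, wrong in zip(weights, misclassified) if wrong)
    alpha = 0.5 * math.log((1 - error) / error)   # predictor's say in the vote
    updated = [w * math.exp(alpha if wrong else -alpha)
               for w, wrong in zip(weights, misclassified)]
    total = sum(updated)
    return alpha, [w / total for w in updated]    # normalize to sum to 1

weights = [0.2] * 5
misclassified = [False, False, True, False, False]
alpha, new_weights = adaboost_reweight(weights, misclassified)
```

The misclassified instance gains weight, so the next predictor in the sequence concentrates on it.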
  • 60. Hichem Felouat - hichemfel@gmail.com - Algeria 60 Voting Classifiers A voting classifier is a meta-classifier that combines similar or conceptually different machine learning classifiers for classification via majority or plurality voting. (For simplicity, we will refer to both majority and plurality voting as majority voting.)
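A minimal sketch of hard (majority) voting follows. The three threshold rules stand in for trained classifiers and are purely hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by the largest number of classifiers."""
    return Counter(predictions).most_common(1)[0][0]

def voting_predict(classifiers, x):
    """Ask every classifier for a label, then take the majority."""
    return majority_vote([clf(x) for clf in classifiers])

# Hypothetical stand-ins for three trained classifiers.
classifiers = [
    lambda x: "positive" if x > 0 else "negative",
    lambda x: "positive" if x > 1 else "negative",
    lambda x: "positive" if x > 2 else "negative",
]

label_high = voting_predict(classifiers, 1.5)   # two of three vote "positive"
label_low = voting_predict(classifiers, 0.5)    # two of three vote "negative"
```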
  • 61. Hichem Felouat - hichemfel@gmail.com - Algeria 61 Dimensionality Reduction Many Machine Learning problems involve thousands or even millions of features for each training instance. Not only does this make training extremely slow, but it can also make it much harder to find a good solution. This problem is often referred to as the curse of dimensionality. Example technique: Principal Component Analysis (PCA).
  • 62. Hichem Felouat - hichemfel@gmail.com - Algeria 62 Hyperparameter Tuning Hyperparameter Tuning : works by running multiple trials in a single training job. Each trial is a complete execution of your training application with values for your chosen hyperparameters, set within limits you specify. The AI Platform training service keeps track of the results of each trial and makes adjustments for subsequent trials. When the job is finished, you can get a summary of all the trials along with the most effective configuration of values according to the criteria you specify.
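The idea of running several trials and keeping the best configuration can be sketched with a plain grid search. Unlike the service described above, this sketch does not adapt later trials based on earlier results; `run_trial` is a hypothetical stand-in for a full training run, and the search space is made up.

```python
from itertools import product

def run_trial(learning_rate, depth):
    """Hypothetical stand-in for a training run; returns a validation score."""
    return -(learning_rate - 0.1) ** 2 - 0.01 * (depth - 4) ** 2

# Limits for each hyperparameter, specified before tuning starts.
search_space = {"learning_rate": [0.01, 0.1, 1.0], "depth": [2, 4, 8]}

trials = []
for values in product(*search_space.values()):
    params = dict(zip(search_space, values))
    trials.append((run_trial(**params), params))   # record every trial

# Summary: the most effective configuration according to the score.
best_score, best_params = max(trials, key=lambda trial: trial[0])
```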
  • 63. Hichem Felouat - hichemfel@gmail.com - Algeria 63 Steps to Build a Machine Learning System 1. Data collection. 2. Improving data quality (data preprocessing). 3. Feature engineering (feature extraction and selection, dimensionality reduction). 4. Splitting data into training and evaluation sets. 5. Algorithm selection. 6. Training. 7. Evaluation + Hyperparameter tuning. 8. Testing. 9. Deployment
  • 64. Hichem Felouat - hichemfel@gmail.com - Algeria 64 Deep Learning Deep Learning is a subfield of machine learning concerned with algorithms, called artificial neural networks, that are inspired by the structure and function of the brain.
  • 65. Hichem Felouat - hichemfel@gmail.com - Algeria 65 Deep Learning VS Machine Learning
  • 66. Hichem Felouat - hichemfel@gmail.com - Algeria 66 Feature extraction Feature engineering is, however, a tedious process for several reasons: it takes a lot of time and requires expert knowledge. For learning-based applications, a lot of time is spent adjusting the features. Extracted features often lack a structural representation reflecting abstraction levels in the problem at hand.
  • 67. Hichem Felouat - hichemfel@gmail.com - Algeria 67 Representation learning Deep Learning aims at automatically learning representations from large sets of labeled data: • The machine is fed raw data. • Representations are discovered automatically.
  • 68. Hichem Felouat - hichemfel@gmail.com - Algeria 68 Deep learning models Several DL models have been proposed: • Autoencoders (AEs) • Deep belief networks (DBNs) • Convolutional neural networks (CNNs) • Recurrent neural networks (RNNs) • Generative adversarial networks (GANs), etc.
  • 69. Hichem Felouat - hichemfel@gmail.com - Algeria 69 Convolutional neural networks (CNNs)
  • 70. Hichem Felouat - hichemfel@gmail.com - Algeria 70 Convolutional neural networks (CNNs)
  • 71. Hichem Felouat - hichemfel@gmail.com - Algeria 71 Convolutional neural networks (CNNs)
  • 72. Hichem Felouat - hichemfel@gmail.com - Algeria 72 Convolutional neural networks (CNNs)
  • 73. Hichem Felouat - hichemfel@gmail.com - Algeria 73 Convolutional neural networks (CNNs)
  • 74. Hichem Felouat - hichemfel@gmail.com - Algeria 74 Thank you for your attention