Role of Machine Learning in
Telecommunication
Dr. Mohamad Abou Taam
WHAT IS MACHINE LEARNING?
Machine learning is a subfield of computer science
that studies and develops algorithms that can learn
from data without being explicitly programmed
Computer Science > Artificial Intelligence > Machine Learning > Deep Learning (each field contains the next)
Machine learning algorithms can detect patterns in
data and use them to predict future data
Machine learning – how is it different?
Traditional software: applying given rules to data
Rules + Data → Answers / Actions
Machine learning
Data → Rules / Model
1. Model design, training and testing (model building, feature engineering)
Historical Data → Machine Learning → Model
2. Model application (model scoring)
New Data → Model → Predictions
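The two phases above can be sketched in a few lines: a model is first fitted to historical data and then applied to new data. This is only an illustration under stated assumptions: the slides do not name a model, so a one-variable least-squares line is used as a stand-in, and the function names `fit_line` and `predict` and all data values are hypothetical.

```python
# Phase 1: build a model from historical (X, Y) pairs.
# Phase 2: apply the fitted model to new inputs.
# The model (a straight line) and the data values are made up for illustration.

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, xs):
    """Score new inputs with a fitted (slope, intercept) model."""
    slope, intercept = model
    return [slope * x + intercept for x in xs]

historical_x = [1.0, 2.0, 3.0, 4.0, 5.0]
historical_y = [2.1, 3.9, 6.2, 7.8, 10.1]

model = fit_line(historical_x, historical_y)   # 1: model training
predictions = predict(model, [6.0, 7.0])       # 2: model scoring
print(model, predictions)
```

The fitted model is the only thing carried from phase 1 to phase 2, which is why it can be trained once and reused on any amount of new data.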
TRIAD OF ALGORITHMS, DATA AND TRAINING
Machine learning: Data + Algorithms + Training
"Learning" is the process of estimating an
unknown dependency or structure of a system
(building a model) from a limited number of
observations (data points), together with the
ability to generalize it to previously unseen data
THE "CENTRAL DOGMA" OF STATISTICS
Machine learning == statistical learning
Population → Sample: sampling principle (probability)
Sample: descriptive statistics; learning happens on the sample
Sample → Population: inference (inferential statistics)
• Sample should be representative of
population
• Generalization – extrapolation to entire
population
• Watch for population drift!
THREE TYPES OF MACHINE LEARNING
Reinforcement
Learning
The goal is to optimise actions in a way
that maximises cumulative reward. No
explicitly labeled data is given, but
"reward" and "punishment" signals are
provided.
Unsupervised
Learning
The goal is to learn patterns and
structure in data given only inputs X
(no output Y information is given at all).
X – input data / independent variable
Supervised
Learning
The goal is to learn a mapping from
given inputs X to outputs Y, given a
labeled set of input-output (X-Y) pairs.
X – input data / independent variable
Y – response / dependent variable
MACHINE LEARNING METHODS
SUPERVISED LEARNING: REGRESSION
Response variable Y – real valued
[Figures: univariate regression (Sales against TV) and multivariate regression (Income against Years of Education and Seniority)]
SUPERVISED LEARNING: CLASSIFICATION
Response variable Y – categorical (binary or multiclass)
REGRESSION AND CLASSIFICATION ARE SIMILAR
Regression
Predict a numeric variable
Classification
Predict a binary (or categorical) outcome
[Figures: regression, Y plotted against X; classification, probability of event plotted against X]
Data are 1s and 0s – event either happens or doesn't happen
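One way to see the similarity in code: a classifier can compute a linear score, as a regression does, and then squash it into a probability of the event. The sketch below is a minimal one-feature logistic regression fitted by gradient descent. The slides do not name a specific classifier, so the choice of logistic regression, the toy data, the learning rate and the number of steps are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    """Map any real-valued score to a probability in (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # numerically stable branch for negative scores
    return e / (1.0 + e)

def fit_logistic(xs, ys, lr=0.5, steps=500):
    """Fit P(event | x) = sigmoid(w * x + b) by gradient descent on log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b

# Toy data (made up): the event (1) becomes more likely as x grows.
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(xs, ys)
for x in (-2.0, 0.0, 2.0):
    print(f"x={x:+.1f}  P(event)={sigmoid(w * x + b):.2f}")
```

The training targets are only 0s and 1s, yet the fitted model returns a value between 0 and 1 for every x, which is then read as the probability of the event.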
MODEL OVERFITTING
Regression
• Too simple: predictions will have high "bias", from inadequate assumptions
• Too complex: predictions will have high "variance", driven by noise in the training data
• Just right: model complexity is appropriate given the noise
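The three regimes can be reproduced with a small experiment. The sketch below swaps in a k-nearest-neighbour regressor, which is not a method named in the slides, because its complexity is controlled by a single number k. The synthetic data, the noise level and the values of k are assumptions chosen for illustration.

```python
import random

def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k

def mse(train_x, train_y, xs, ys, k):
    """Mean squared error of k-NN predictions on the points (xs, ys)."""
    return sum((knn_predict(train_x, train_y, x, k) - y) ** 2
               for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)

def sample(n):
    """Noisy observations of y = x**2 (synthetic data for illustration)."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return xs, [x * x + rng.gauss(0.0, 0.1) for x in xs]

train_x, train_y = sample(40)
test_x, test_y = sample(40)

# k = 40 averages every training point (too simple), k = 1 memorises the
# training data (too complex), k = 5 sits in between.
for k in (40, 5, 1):
    train_error = mse(train_x, train_y, train_x, train_y, k)
    test_error = mse(train_x, train_y, test_x, test_y, k)
    print(f"k={k:2d}  train MSE={train_error:.4f}  test MSE={test_error:.4f}")
```

With k = 1 the training error is exactly zero because each training point is its own nearest neighbour, while the test error stays above zero. That gap between training and test error is the signature of a model that has fitted the noise.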
MODEL OVERFITTING
Classification
[Figure: two classes separated by an overfit boundary and by a "just right" boundary]
PREDICTION ACCURACY VS EXPLAINABILITY
Trade-off: model explainability ↔ prediction accuracy
White box models
Model properties
• Interpretable by design
• Easy to explain
• Quick to run
• Limited tuning needed
Algorithm examples
• Linear / logistic regression
• Decision trees
Black box models
Model properties
• Lots of work to get insights
• Better predictive performance
• Potential for overfitting
• Often a lot of tuning required
Algorithm examples
• Random forests
• Gradient boosting
• Neural networks
• Deep learning
REGRESSION
Modeling
REGRESSION
Quality metrics
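REGRESSION EVALUATION
Quality metrics
Standard quality metrics, for $n$ observations with actual values $y_i$, predictions $\hat{y}_i$ and mean actual value $\bar{y}$:
Mean absolute error: $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} |y_i - \hat{y}_i|$
Mean squared error: $\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
Root mean squared error: $\mathrm{RMSE} = \sqrt{\mathrm{MSE}}$
R-squared: $R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$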
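The four metrics can be computed directly from actual and predicted values. The sketch below uses their standard textbook definitions. The function name `regression_metrics` and the sample numbers are hypothetical and not taken from the slides.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R-squared for paired actual/predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    ss_res = sum(r * r for r in residuals)            # residual sum of squares
    mse = ss_res / n
    rmse = math.sqrt(mse)
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}

# Made-up actual and predicted values.
metrics = regression_metrics([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0])
print(metrics)
```

For these sample values the MAE is 0.5 and the MSE is 0.375. MAE and RMSE are in the same units as Y, while R-squared is unitless and compares the model against always predicting the mean.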
CLASSIFICATION
Classification
CLASSIFICATION EVALUATION
Quality metrics
                        Actual: Yes (or 1)      Actual: No (or 0)
Predicted: Yes (or 1)   True positives (TP)     False positives (FP)
Predicted: No (or 0)    False negatives (FN)    True negatives (TN)
True positive = Predict event and event happens
True negative = Predict event does not happen, nothing
happens
False positive = Predict event and event does not happen
(false alarm)
False negative = Fail to predict event that does happen
(missed alarm)
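The four cells can be counted directly from actual and predicted labels. In the sketch below, the function name `confusion_counts` and the label lists are hypothetical. Accuracy, precision and recall are common summaries built from the four counts and are added here as an assumption, since the slide lists only the counts themselves.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, FN, TN for binary labels coded as 1 (event) and 0."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            counts["TP"] += 1      # predicted event, event happened
        elif p == 1 and a == 0:
            counts["FP"] += 1      # false alarm
        elif p == 0 and a == 1:
            counts["FN"] += 1      # missed alarm
        else:
            counts["TN"] += 1      # predicted no event, nothing happened
    return counts

# Made-up labels for eight observations.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
c = confusion_counts(actual, predicted)

# Common summary metrics derived from the four counts (not listed on the slide).
accuracy = (c["TP"] + c["TN"]) / len(actual)
precision = c["TP"] / (c["TP"] + c["FP"])
recall = c["TP"] / (c["TP"] + c["FN"])
print(c, accuracy, precision, recall)
```

Which error type matters more depends on the application: a false alarm and a missed alarm rarely cost the same.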
TRAINING AND TESTING
Train-test split
Training set
• 70%-90% of the data
• Used to build the model
Test set
• 10%-30% of the data
• Used to check the performance
of the model on unseen data
Train & Test split
• Measure algorithm performance on both
train and test sets!
• Performance will usually be worse on the
test set than on the training set
• Algorithm hyperparameter tuning can be
used to improve test set performance
• Avoid overfitting!
• Actual performance of the algorithm in
production is unlikely to be better than on
the test set!
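A random split can be sketched with the standard library alone. The helper name `train_test_split`, the 80/20 ratio (inside the ranges quoted above) and the made-up rows are assumptions for illustration, and the helper is not a library function.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows, then hold out the last `test_fraction` of them."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[:-n_test], shuffled[-n_test:]

rows = [(x, 2 * x + 1) for x in range(50)]   # made-up (X, Y) pairs
train_rows, test_rows = train_test_split(rows)
print(len(train_rows), len(test_rows))
```

Shuffling before splitting matters when the rows are stored in a meaningful order, for example sorted by date or by outcome, because otherwise the test set would not resemble the training set.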
TRAINING AND TESTING
Cross-validation
• Makes best use of the data
• Data split into N "folds" at random
• N models built. For each model, N-1 folds
are used for training and one is used for
testing
• Evaluation criteria averaged across folds
• Allows use of e.g. 90% training data / 10%
test data splits for 10-fold cross-validation
• More data for training increases predictive
power
• Reduces the chance of getting
lucky/unlucky just due to the way a single
train/test split is done
• Consumes more time and computing
resources
[Figure: 5-fold cross-validation, with results averaged across the folds]
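The procedure can be sketched as follows. The helper names `k_fold_indices` and `cross_validate` are hypothetical, and the "model" is a deliberately trivial stand-in that always predicts the mean of the training targets, so that the fold mechanics stay in focus.

```python
import random

def k_fold_indices(n_rows, n_folds, seed=0):
    """Randomly assign row indices to n_folds folds of near-equal size."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]

def cross_validate(xs, ys, n_folds, fit, score):
    """Train on N-1 folds, test on the held-out fold, average the scores."""
    folds = k_fold_indices(len(xs), n_folds)
    scores = []
    for held_out in folds:
        test_idx = set(held_out)
        train = [(xs[i], ys[i]) for i in range(len(xs)) if i not in test_idx]
        test = [(xs[i], ys[i]) for i in held_out]
        model = fit(train)
        scores.append(score(model, test))
    return sum(scores) / n_folds, scores

# Stand-in model for illustration: always predict the mean training target.
def fit_mean(train):
    return sum(y for _, y in train) / len(train)

def mse_score(model, test):
    return sum((y - model) ** 2 for _, y in test) / len(test)

xs = list(range(20))
ys = [3.0 * x + 1.0 for x in xs]   # made-up data
average, per_fold = cross_validate(xs, ys, 5, fit_mean, mse_score)
print(average, per_fold)
```

Every row is used for testing exactly once and for training N-1 times, which is where the extra computing cost comes from: N models are fitted instead of one.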
TYPICAL SUPERVISED LEARNING PIPELINE
[Figure: model training and testing, followed by model application; a regression model outputs a predicted value]
A SUPERVISED MACHINE LEARNING WORKFLOW
Phases: Prepare data → Model and predict → Impact business
Steps:
• Define problem and potential solution
• Get the data
• Understand the data
• Clean the data
• Feature engineering
• Build and test model
• Understand the model
• What does it mean for the business? What are we going to change?
• Productionise
• Iterate
• Ongoing monitoring and improvements

