Practical ML
Antonio Pitasi
Samuele Sabella
2019 - Polo Fibonacci
Before we get started, we need some theory
Machine learning
● Practice: We define machine learning as a set of methods that can
automatically detect patterns in data, and then use the uncovered patterns to
predict future data, or to perform other kinds of decision making under
uncertainty
- Machine Learning: A Probabilistic Perspective, Kevin P. Murphy
● Theory: How does learning performance vary with the number of training
examples presented? Which learning algorithms are most appropriate for
various types of learning tasks?
- Machine Learning, Tom Mitchell
ML is not only artificial neural networks
● Lots of mathematical models
○ Hidden Markov models
○ Support Vector Machines
○ Decision trees
○ Boltzmann machines, Deep belief networks, Deep Boltzmann machines
● There are many neural network models...
○ Shallow networks, Deep neural networks
○ CNN (YOLO, AlexNet, GoogLeNet)
○ Echo state networks, Deep echo state networks
○ RNN, LSTM, GRU
Machine learning categories
● Supervised: the goal is to learn a mapping from input to output given a
dataset of labeled pairs called the training set (e.g. the Iris Data Set [2])
● Unsupervised: we have only a set of data points and the
goal is to find interesting patterns in the data
Example: young American males who buy diapers also have a predisposition
to buy beer (original story [3])
How does it work?
● Dataset of examples to learn from
● A model to learn from that data (e.g. neural net)
○ With some parameters to tune
○ With some hyperparameters to choose (neurons, layers, ...)
● Target function (loss) to minimize
[Figure: the values [0.3, 0.2, 0.1], the model output "cat: 0.9, lobster: 0.1", and the feedback "You are wrong!"]
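A minimal sketch of how a loss can say "you are wrong", assuming a cross-entropy loss and the two classes from the figure. The loss choice, the function name `cross_entropy`, and the class order are assumptions for illustration, not something the slides specify.

```python
import math

def cross_entropy(predicted, true_index):
    """Negative log-probability the model assigned to the correct class."""
    return -math.log(predicted[true_index])

# Hypothetical model output over the classes ["cat", "lobster"].
prediction = [0.9, 0.1]

# If the picture really is a cat, the confident answer is cheap.
loss_if_cat = cross_entropy(prediction, 0)
# If it is actually a lobster, the same confident answer is expensive.
loss_if_lobster = cross_entropy(prediction, 1)

print(round(loss_if_cat, 3), round(loss_if_lobster, 3))
```

The larger the loss, the more the parameters have to move; training is the search for parameters that make this number small on the training set.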
What is usually done
● Validation phase: compare different models and configurations
○ Which model to choose
○ Model hyper-parameters
● Test phase: loss, accuracy, recall, precision…
● We skip all this for the sake of simplicity
OR?
Note: train/validation/test on different data
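The note about using different data for each phase can be sketched as a shuffled index split. The 60/20/20 proportions, the helper name `split_indices`, and the dataset size of 150 are illustrative assumptions.

```python
import numpy as np

def split_indices(n_examples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle example indices and cut them into three disjoint groups."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_examples)
    n_test = int(n_examples * test_fraction)
    n_val = int(n_examples * val_fraction)
    test_idx = order[:n_test]
    val_idx = order[n_test:n_test + n_val]
    train_idx = order[n_test + n_val:]
    return train_idx, val_idx, test_idx

# 150 examples: fit on train, compare models on validation, report on test.
train_idx, val_idx, test_idx = split_indices(150)
print(len(train_idx), len(val_idx), len(test_idx))
```

Keeping the test indices untouched until the very end is what makes the final accuracy an honest estimate.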
Models: feed-forward neural networks
virginica if argmax(net(input))==0 else setosa
[Figure label: non-linear function]
Note: a non-linear activation function is what allows an ANN to be a universal approximator [4]
Stack neurons in layers
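Why the non-linearity matters can be checked numerically: without it, two stacked layers are just one linear map. This is a small sketch with random weights; the shapes and the choice of tanh as the non-linear function are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
V = rng.normal(size=(3, 4))   # first layer: 4 inputs -> 3 hidden units
w = rng.normal(size=3)        # second layer: 3 hidden units -> 1 output
x = rng.normal(size=4)        # one input example

# Without a non-linearity, two layers equal a single linear layer (w @ V).
two_layers = w @ (V @ x)
one_layer = (w @ V) @ x
linear_collapses = bool(np.isclose(two_layers, one_layer))

# With a non-linear function between the layers, that shortcut is no longer available.
nonlinear_output = w @ np.tanh(V @ x)

print(linear_collapses, two_layers, nonlinear_output)
```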
Models: feed-forward neural networks
[Figure label: non-linear function; the weights are marked with *]
*The optimizer will tune these weights to minimize the loss
function over batches of examples (btw we will use Adam [5])
Stack neurons in layers
virginica if argmax(net(input))==0 else setosa
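What "tuning the weights over batches of examples" means can be sketched with plain mini-batch gradient descent on a one-neuron model. Note that this swaps in vanilla gradient descent for simplicity; it is not Adam, and the toy data, learning rate, and batch size are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: the label is 1 when the two features sum to a positive number.
X = rng.normal(size=(200, 2))
y = (X.sum(axis=1) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, b):
    """Mean binary cross-entropy over the whole dataset."""
    p = np.clip(sigmoid(X @ w + b), 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

w, b = np.zeros(2), 0.0
initial_loss = loss(w, b)
learning_rate, batch_size = 0.5, 20

for epoch in range(50):
    order = rng.permutation(len(X))          # reshuffle every epoch
    for start in range(0, len(X), batch_size):
        batch = order[start:start + batch_size]
        error = sigmoid(X[batch] @ w + b) - y[batch]
        # Step against the gradient of the loss on this batch only.
        w -= learning_rate * X[batch].T @ error / len(batch)
        b -= learning_rate * error.mean()

final_loss = loss(w, b)
accuracy = np.mean((sigmoid(X @ w + b) > 0.5) == (y == 1.0))
print(initial_loss, final_loss, accuracy)
```

Adam follows the same loop but adapts the step size per weight using running averages of the gradients.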
A lot more stuff to know but for us...
UNDERSTANDING
MACHINE LEARNING
import keras
- Interactive
- Collaborative
- Python, R, Julia, Scala, ...
Features:
● Easy to build a neural network
● Easy to build a neural network wrong
Keep an eye on:
● Accuracy
● Fitting
● Performance
Problem 1
Points classification
Generating the dataset
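A possible way to generate such a dataset, assuming the task is to separate points clustered at the center from a band surrounding them. The radii (0 to 1 for the center, 2 to 3 for the band), the sample counts, and the helper name `make_points` are assumptions; the workshop notebook may use different values.

```python
import numpy as np

def make_points(n_per_class, seed=0):
    """Return 2D points and labels: 0 = central cluster, 1 = surrounding band."""
    rng = np.random.default_rng(seed)
    angles = rng.uniform(0.0, 2.0 * np.pi, size=2 * n_per_class)
    # Class 0 gets radii in [0, 1), class 1 gets radii in [2, 3).
    radii = np.concatenate([rng.uniform(0.0, 1.0, n_per_class),
                            rng.uniform(2.0, 3.0, n_per_class)])
    points = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
    labels = np.concatenate([np.zeros(n_per_class, dtype=int),
                             np.ones(n_per_class, dtype=int)])
    return points, labels

# Separate draws for training and testing, so the test points are unseen.
x_train, y_train = make_points(500, seed=0)
x_test, y_test = make_points(100, seed=1)
print(x_train.shape, x_test.shape)
```

No straight line separates the two classes, which is why this toy problem needs a non-linear model.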
Plotting
Our model
● For non-linearity: rectified linear unit (ReLU)
● We use a softmax function in the
output layer to represent a
probability distribution
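The two functions named on this slide are short enough to write by hand. This is a NumPy sketch for intuition; in Keras they are normally selected by name as layer activations (`'relu'`, `'softmax'`) rather than implemented manually.

```python
import numpy as np

def relu(z):
    """Rectified linear unit: negative values become 0, positive values pass through."""
    return np.maximum(z, 0.0)

def softmax(z):
    """Turn each row of scores into positive numbers that sum to 1."""
    shifted = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

hidden_example = relu(np.array([-2.0, 0.0, 3.0]))
probs_example = softmax(np.array([[2.0, 1.0],
                                  [0.0, 0.0]]))
print(hidden_example)
print(probs_example)
```

Because every softmax row sums to 1, the output layer can be read as a probability for each class, and `argmax` picks the predicted one.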
Let’s code!
https://colab.research.google.com
https://ml.anto.pt
Me after
training a neural
network
Back to theory - Convolving Lenna
● Given a function f, a convolution g with a kernel w is given by a very complex
formula with a very simple meaning: “adding each element of the image to its
local neighbors, weighted by the kernel” (Wikipedia)
Note: check the manual implementation at http://bit.ly/ml-dummies-lenna
References: [11]
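The "very simple meaning" can be sketched as two nested loops. This is not the linked manual implementation: the tiny synthetic image stands in for Lenna, the function name `convolve2d` is made up here, and the kernel is the horizontal Sobel kernel as an example.

```python
import numpy as np

def convolve2d(image, kernel):
    """'Valid' 2D convolution: flip the kernel, slide it, sum the weighted neighbors."""
    flipped = kernel[::-1, ::-1]
    kh, kw = flipped.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            output[i, j] = np.sum(patch * flipped)
    return output

# Sobel kernel for intensity changes along the horizontal axis.
sobel_x = np.array([[1.0, 0.0, -1.0],
                    [2.0, 0.0, -2.0],
                    [1.0, 0.0, -1.0]])

# Stand-in image: dark left half, bright right half (one vertical edge).
image = np.zeros((5, 6))
image[:, 3:] = 1.0

edges = convolve2d(image, sobel_x)
print(edges)
```

The output is non-zero only where the window straddles the edge. Deep learning libraries typically skip the kernel flip (cross-correlation), which makes no practical difference when the kernel is learned.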
Convolution arithmetic
● Zero-padding: deals with border pixels by adding zeros around the input (can preserve the size)
● Pooling: helps the network become approximately invariant to small transformations of the input (translations,
rotations...) [7]
[Animations: no padding, no strides · padding=same, no strides · max pooling, 2x2 strides]
GIF credits - https://github.com/vdumoulin/conv_arithmetic
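The three animated cases can be checked with the usual output-size rule, floor((input + 2·padding − kernel) / stride) + 1. The sizes below (5x5 input, 3x3 kernel, 4x4 feature map) are assumed for illustration, and `max_pool_2x2` is a hand-written sketch rather than a library call.

```python
import numpy as np

def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Output length along one axis of a convolution or pooling window."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

no_padding = conv_output_size(5, 3)                # the output shrinks
same_padding = conv_output_size(5, 3, padding=1)   # the size is preserved
pooled_size = conv_output_size(4, 2, stride=2)     # 2x2 pooling halves the size

def max_pool_2x2(feature_map):
    """Keep the maximum of every non-overlapping 2x2 block."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % 2, :w - w % 2]  # drop an odd last row/column
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

pooled = max_pool_2x2(np.arange(16).reshape(4, 4))
print(no_padding, same_padding, pooled_size)
print(pooled)
```

Because each pooled value only reports the strongest response in its block, shifting the input by a pixel often leaves the pooled output unchanged.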
Dealing with multiple input channels
References: [12]
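With several input channels (for example RGB), a filter holds one kernel per channel and the per-channel results are summed into a single output map. This is a sketch with made-up shapes; like most deep learning libraries it does not flip the kernel, and the name `conv_multichannel` is hypothetical.

```python
import numpy as np

def conv_multichannel(image, kernel):
    """image: (H, W, C), kernel: (kh, kw, C) -> one (H-kh+1, W-kw+1) feature map."""
    kh, kw, channels = kernel.shape
    assert image.shape[2] == channels, "kernel needs one slice per input channel"
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the window by the kernel and sum over height, width AND channels.
            output[i, j] = np.sum(image[i:i + kh, j:j + kw, :] * kernel)
    return output

rgb = np.ones((4, 4, 3))          # toy 4x4 image with 3 channels
kernel = np.ones((2, 2, 3))       # one 2x2 kernel per channel
feature_map = conv_multichannel(rgb, kernel)
print(feature_map.shape, feature_map[0, 0])
```

Stacking several such filters gives an output with one channel per filter, which becomes the multi-channel input of the next layer.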
GoogLeNet on ImageNet - Feature visualization
[Figure: feature visualization of the 1st conv. layer, layer 3a, and layer 4d]
Stack multiple filters and learn kernels dynamically (hierarchy of features)
References: [8,9,10]
Problem 2
Digit recognition
References
[1] Pattern Recognition in a Bucket: https://link.springer.com/chapter/10.1007/978-3-540-39432-7_63
[2] Iris dataset: https://archive.ics.uci.edu/ml/datasets/iris
[3] Beer and diapers: http://www.dssresources.com/newsletters/66.php
[4] Multilayer feedforward networks are universal approximators: http://cognitivemedium.com/magic_paper/assets/Hornik.pdf
[5] Adam: A Method for Stochastic Optimization: https://arxiv.org/abs/1412.6980
[6] MNIST dataset: http://yann.lecun.com/exdb/mnist/
[7] Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016: http://www.deeplearningbook.org/
[8] Feature visualization: https://distill.pub/2017/feature-visualization/
[9] Going deeper with convolutions: https://arxiv.org/pdf/1409.4842.pdf
[10] ImageNet: A large-scale hierarchical image database: http://www.image-net.org/papers/imagenet_cvpr09.pdf
[11] Culture, Communication, and an Information Age Madonna: http://www.lenna.org/pcs_mirror/may_june01.pdf
[12] Intuitively Understanding Convolutions for Deep Learning: https://towardsdatascience.com/intuitively-understanding-convolutions-for-deep-learning-1f6f42faee1
Books
[Figure: books arranged by difficulty]
Antonio Pitasi
Software Engineer, Nextworks
https://anto.pt
Samuele Sabella
https://github.com/samuelesabella

Editor's Notes

  • #4: The explanation goes that when fathers are sent out on an errand to buy diapers, they often purchase a six-pack of their favorite beer as a reward.
  • #9, #10: It is important that g be non-linear, otherwise the whole model collapses into a large linear regression model of the form y = w^T (V x). One can show that an MLP is a universal approximator, meaning it can model any suitably smooth function, given enough hidden units, to any desired level of accuracy (Hornik 1991).
  • #11: First slide by Pitasi
  • #13: Jupyter! An open-source notebook for combining code and its output
  • #14: Not just code and its output: images, Markdown, LaTeX, links
  • #15: Strengths of notebooks: sharing and collaboration. It also supports languages other than Python: Jupyter currently supports 40 languages
  • #16: For the workshop we will use Google's online notebooks: Colab(orative)
  • #17: Recap: we will use Google Colab notebooks, but we will also use some libraries. NumPy performs mathematical operations efficiently (e.g. on matrices, vectors, ...). Then there is SciPy, which implements further algorithms and mathematical utilities. pandas implements data structures such as tables and time series. Matplotlib is the fundamental library for building plots. TensorFlow is the library created by Google for machine learning. Keras is a high-level library that uses TensorFlow to provide fairly simple functions for doing machine learning
  • #21, #22: Our problem: distinguish the points clustered at the center from the band that surrounds them. We start by generating some of these points to use as the training set, then we generate more to use as the test set, and finally we see how our network behaves
  • #26: It is important that g be non-linear, otherwise the whole model collapses into a large linear regression model of the form y = w^T (V x). One can show that an MLP is a universal approximator, meaning it can model any suitably smooth function, given enough hidden units, to any desired level of accuracy (Hornik 1991).
  • #29: The Sobel operator (the one in the picture) returns local intensity changes along one axis. In other words, the kernel activates (maximum dot product) when placed over a patch of the image with strong changes along that axis (an edge). A specific convolution kernel can therefore be used to detect features in an image, for example edges.