The mathematics of AI (machine learning mathematically)
What is machine learning?
▶ Task Identify handwritten digits
▶ We can see this as a function in the following way:
▶ Convert the pictures into grayscale values, e.g. a 28 × 28 grid of numbers
▶ Flatten the result into a vector, e.g. a 28 × 28 grid ↦ a vector with 28² = 784 entries
▶ The output is a vector with 10 entries (one entry per digit)
▶ We thus have a function R^784 → R^10 (sketched in code below)
Input example: a picture of a handwritten digit; output example: the corresponding label.
Task – rephrased
We have a function R^784 → R^10
How can we describe this function?
Machine/deep learning (today’s topic) tries to answer questions of this type by letting a computer detect patterns in data
Crucial In ML the computer performs tasks without explicit instructions
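A minimal code sketch of this viewpoint (PyTorch; the random weight matrix below is just a placeholder for the unknown function, not a trained model):

```python
import torch

# One handwritten digit as a 28 × 28 grid of grayscale values (random here).
image = torch.rand(28, 28)

# Flatten the grid into a vector with 28 * 28 = 784 entries.
x = image.reshape(784)

# A digit classifier is then some function R^784 -> R^10; as a stand-in we
# use a random 10 × 784 matrix, i.e. an untrained linear map.
W = torch.randn(10, 784)
y = W @ x

print(x.shape, y.shape)  # torch.Size([784]) torch.Size([10])
```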
▶ Idea Approximate the unknown function R^784 → R^10
▶ Neural network = a piecewise linear approximation (matrices + PL maps)
▶ The matrices = a bunch of numbers (weights) and offsets (biases)
▶ The PL maps = usually ReLU, i.e. x ↦ max(0, x) applied entrywise
▶ Machine learning mantra Forward → Loss → Backward → Forward → Loss → Backward → ...
▶ Forward = calculate an approximation (start with random parameters)
▶ Loss = compare to real data
▶ Backward = adjust the approximation
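To make the mantra concrete, here is a toy one-parameter example in plain Python (not from the slides): we try to learn the unknown function y = 3x by repeating forward, loss, backward with a hand-computed gradient.

```python
import random

data = [(x, 3.0 * x) for x in range(-5, 6)]  # made-up training data for y = 3x
a = random.uniform(-1.0, 1.0)                # start with a random parameter
lr = 0.01                                    # step size for each adjustment

for step in range(100):
    # Forward: calculate the current approximation a * x on the data.
    preds = [(x, a * x, y) for x, y in data]
    # Loss: compare to the real data via the mean squared error.
    loss = sum((p - y) ** 2 for _, p, y in preds) / len(preds)
    # Backward: adjust the parameter along the gradient of the loss.
    grad = sum(2 * (p - y) * x for x, p, y in preds) / len(preds)
    a -= lr * grad

print(a)  # close to 3 after a few steps
```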
What is a neural network (nn)?
▶ NN = a directed graph as above
▶ The task of a nn is to approximate an unknown function
▶ It consists of neurons = entries of vectors, and weights = entries of matrices
Example
Here we have two matrices, a 3-by-1 and a 2-by-3 matrix,
\begin{pmatrix} w \\ x \\ y \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix},
plus two bias terms, as in y = matrix · x + bias
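As a sketch, the same two layers in PyTorch; reading the shapes off the slide, the network maps R^1 → R^3 → R^2 (the input size 1 is inferred from the 3-by-1 weight matrix):

```python
import torch.nn as nn

# Two affine layers y = matrix · x + bias with the shapes from the slide:
# a 3-by-1 weight matrix (w, x, y)^T and a 2-by-3 weight matrix (a_ij),
# each with its own bias vector (the "two bias terms").
layer1 = nn.Linear(in_features=1, out_features=3)  # weight: 3 × 1, bias: 3 entries
layer2 = nn.Linear(in_features=3, out_features=2)  # weight: 2 × 3, bias: 2 entries

print(layer1.weight.shape, layer2.weight.shape)  # torch.Size([3, 1]) torch.Size([2, 3])
```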
Example
Here we have four matrices (plus four biases), whose composition gives a map
R^3 → R^4 → R^3 → R^3 → R^2
Actually...
we need nonlinear maps as well, say ReLU applied componentwise
Here we have four matrices, whose composition gives a map
R^3 --ReLU∘matrix--> R^4 --ReLU∘matrix--> R^3 --ReLU∘matrix--> R^3 --ReLU∘matrix--> R^2
But ignore that for now
ReLU doesn’t learn anything; it just brings in nonlinearity
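A sketch of this four-layer network in PyTorch; following the slide literally, every arrow is ReLU after an affine map (in practice the last ReLU is often dropped):

```python
import torch
import torch.nn as nn

# The map R^3 -> R^4 -> R^3 -> R^3 -> R^2, each arrow = ReLU ∘ (matrix + bias).
net = nn.Sequential(
    nn.Linear(3, 4), nn.ReLU(),
    nn.Linear(4, 3), nn.ReLU(),
    nn.Linear(3, 3), nn.ReLU(),
    nn.Linear(3, 2), nn.ReLU(),
)

x = torch.randn(3)    # a point in R^3
print(net(x).shape)   # torch.Size([2]), a point in R^2
```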
For the small two-layer example from before, the weights and biases can be collected layerwise as
\begin{pmatrix} a^1_{11} & b^1_1 \\ a^1_{12} & b^1_2 \\ a^1_{13} & b^1_3 \end{pmatrix}, \quad \begin{pmatrix} a^2_{11} & a^2_{12} & a^2_{13} & b^2_1 \\ a^2_{21} & a^2_{22} & a^2_{23} & b^2_2 \end{pmatrix}
▶ The a^k_{ij} and b^k_i are the parameters of our nn
▶ k = number of the layer
▶ Deep = many layers = better approximation
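Counting the parameters is a one-liner; for the four-layer sketch above (assuming the `net` from the previous snippet) one gets (3·4+4) + (4·3+3) + (3·3+3) + (3·2+2) = 51, and the same line works for much bigger models:

```python
# All a^k_ij and b^k_i of the network, layer by layer and in total.
for name, p in net.named_parameters():
    print(name, tuple(p.shape), p.numel())
print(sum(p.numel() for p in net.parameters()))  # 51 for the sketch above
```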
The point
Many layers → many parameters
These are good for approximating real world problems
Examples
ResNet-152 with 152 layers (used in image classification)
VGG-19 with 19 layers (used in image classification)
GoogLeNet with 22 layers (used in face detection)
Side fact
Gaming has improved AI!?
A GPU can do e.g. matrix multiplications faster than a CPU, and lots of nn run on GPUs
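In PyTorch the move to a GPU is one line (a sketch, assuming the `net` from before and an available CUDA device):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
net = net.to(device)                 # copy all weights to the GPU
x = torch.randn(3, device=device)    # inputs must live on the same device
print(net(x).device)
```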
How learning works
▶ Supervised learning Create a dataset with answers, e.g. pictures of handwritten digits plus their labels
▶ There are other forms of learning, e.g. unsupervised, which I skip
▶ Split the data into ≈80% training and ≈20% testing data (sketched in code below)
Idea to keep in mind
How to train students?
There are lectures, exercises etc.: the training data
There is a final exam: the testing data
Depending on their performance, we let them out into the wild
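One way to get such a split in PyTorch (a sketch; MNIST also ships with its own predefined test set, and the download path "./data" is just an example):

```python
from torch.utils.data import random_split
from torchvision import datasets, transforms

# 60000 labelled pictures of handwritten digits.
dataset = datasets.MNIST("./data", train=True, download=True,
                         transform=transforms.ToTensor())

n_train = int(0.8 * len(dataset))                      # ≈80% training data
train_set, test_set = random_split(dataset, [n_train, len(dataset) - n_train])
print(len(train_set), len(test_set))                   # 48000 12000
```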
How learning works
▶ Forward Run the nn = function on the training data
▶ Loss Calculate the difference “results − answers” (⇒ loss function)
▶ Backward Change the parameters, trying to minimize the loss function
▶ Repeat
Forward
Boils down to a bunch of matrix multiplications, followed by the nonlinear activation, e.g. ReLU
Loss
The difference between real values and predictions
Task Minimize the loss function
Backward
This is running gradient descent on the loss function
Slogan Adjust parameters following the direction of steepest descent
And what makes it even better: you can try it yourself
My favorite tool is PyTorch, but there are other libraries as well
Let us see how!
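A minimal end-to-end sketch in PyTorch; the layer sizes, learning rate and number of epochs below are illustrative choices, not prescriptions from the slides:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Data: pictures of handwritten digits plus their labels.
transform = transforms.ToTensor()
train_set = datasets.MNIST("./data", train=True, download=True, transform=transform)
test_set = datasets.MNIST("./data", train=False, download=True, transform=transform)
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)
test_loader = DataLoader(test_set, batch_size=256)

# The network: a map R^784 -> R^10 built from matrices, biases and ReLU.
net = nn.Sequential(
    nn.Flatten(),                     # 28 × 28 picture -> vector with 784 entries
    nn.Linear(784, 128), nn.ReLU(),
    nn.Linear(128, 10),               # 10 scores, one per digit
)
loss_fn = nn.CrossEntropyLoss()                         # the loss function
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)   # gradient descent

for epoch in range(3):
    for images, labels in train_loader:
        outputs = net(images)             # Forward
        loss = loss_fn(outputs, labels)   # Loss
        optimizer.zero_grad()
        loss.backward()                   # Backward: gradients of the loss
        optimizer.step()                  # adjust the parameters

# Testing: the "final exam" on data the network has never seen.
correct = 0
with torch.no_grad():
    for images, labels in test_loader:
        correct += (net(images).argmax(dim=1) == labels).sum().item()
print("test accuracy:", correct / len(test_set))
```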
There is still much to do...
Thanks for your attention!
The mathematics of AI. Or: Learning = forward, loss, backward. April 2024