Advanced statistical methods for linear
regression
Agenda
1. Reminder
2. Batch learning
3. Fit
4. Residuals
5. Estimator properties
6. Visualizations
7. Fisher tests
8. Regularization
9. Multiple output
Linear regression | Definition
Data model
Definition
Y = b^T X + epsilon
X - features (regressors) | Matrix (d, n)
b - unknown parameters | Vector (d, 1)
Y - target (answer) | Vector (1, n)
epsilon - error | Vector (1, n)
Definition
Normal equation for linear regression (MSE case)
sklearn.linear_model.LinearRegression
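A minimal sketch of the normal equation, assuming the layout from the Definition slide (X of shape (d, n), Y of shape (1, n)), for which the MSE solution is b = (X X^T)^{-1} X Y^T. The synthetic data, the seed and all variable names are illustrative assumptions, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 3, 200
X = rng.normal(size=(d, n))                  # features, one column per object
b_true = np.array([[1.0], [-2.0], [0.5]])    # (d, 1)
eps = 0.1 * rng.normal(size=(1, n))          # error term
Y = b_true.T @ X + eps                       # (1, n)

# Normal equation: solve (X X^T) b = X Y^T instead of inverting explicitly
b_hat = np.linalg.solve(X @ X.T, X @ Y.T)    # (d, 1)

# Cross-check with a generic least-squares solver on the (n, d) layout
b_lstsq, *_ = np.linalg.lstsq(X.T, Y.T, rcond=None)
print(b_hat.ravel())
```

sklearn.linear_model.LinearRegression expects the (n, d) layout, so with these arrays one would pass X.T and Y.ravel(); it also fits an intercept by default, which this sketch does not.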
Linear Regression | Batch learning
GD
Batches
Possible upgrades:
- use subsamples
- use second-order optimization procedures
Criticism:
- Normal equation!!!
- Useless and naive
- Good idea for non-linear regression
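The batch idea can be sketched as mini-batch gradient descent on the MSE loss. This is an illustration only: the function name, learning rate, batch size and synthetic data are assumptions, and the (n, d) layout is used for convenience.

```python
import numpy as np

def minibatch_gd(X, y, lr=0.05, batch_size=32, epochs=50, seed=0):
    """Fit y ~ X @ b by mini-batch gradient descent on the MSE loss.

    X has shape (n, d), y has shape (n,).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    b = np.zeros(d)
    for _ in range(epochs):
        order = rng.permutation(n)                   # reshuffle every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            residual = X[idx] @ b - y[idx]
            grad = 2.0 * X[idx].T @ residual / len(idx)
            b -= lr * grad
    return b

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 5))
b_true = np.arange(1.0, 6.0)
y = X @ b_true + 0.1 * rng.normal(size=1000)

b_gd = minibatch_gd(X, y)
b_exact = np.linalg.solve(X.T @ X, X.T @ y)          # normal equation, for comparison
print(np.abs(b_gd - b_exact).max())
```

On a problem this small the normal equation is both exact and faster, which is the criticism listed on the slide; the loop only becomes interesting when X no longer fits comfortably in memory or the model is non-linear.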
GD (d ~ 10 000)
Batches
What if:
dim(x) = (n, d),
n = 100 000,
d = 10 000
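A back-of-the-envelope estimate of what these sizes mean for the normal equation, assuming dense float64 storage and leading-order operation counts (both assumptions, not figures from the slides):

```python
n, d = 100_000, 10_000
bytes_per_float = 8                        # float64

x_bytes = n * d * bytes_per_float          # the design matrix X
gram_bytes = d * d * bytes_per_float       # the d x d matrix X^T X
gram_flops = 2 * n * d * d                 # forming X^T X, leading order
solve_flops = d ** 3 // 3                  # Cholesky factorization, leading order

x_gb = x_bytes / 1e9
gram_gb = gram_bytes / 1e9
print(f"X: {x_gb} GB, X^T X: {gram_gb} GB")
print(f"forming X^T X: {gram_flops:.1e} flops, factorizing: {solve_flops:.1e} flops")
```

Under these assumptions X alone takes about 8 GB, so processing it in batches is a memory question as much as a speed question.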
Big scale (d ~ 10 000, n ~ 100 000, X and y are rvs)
Batches
Classic: fit time 140 s - 170 s, loss 0.074. Slow, but train quality is the best.
Iterative: fit time 49 s - 60 s, loss 0.085. Faster, but less accurate on train data.
Iterative fit with keras
Batches
[Code screenshots: classic way, iterative way]
Linear regression | Cascade of models
Cascade
Cars’ rotation angles (POC)
Problem:
- small client’s dataset (~1200 train, 400 test)
- one car per image
Model pipeline:
- Encoder
- SVD
- Ridge
Requirements:
- Near real time
- Model portable to C++
Cascade
Cars’ rotation angles (encoder)
Variants:
- Encoders from Tensorflow hub
- Custom autoencoder
- Variational autoencoder
Custom autoencoder:
- 600k parameters
- 20 min fit on CPU
Cascade
Cars’ rotation angles (SVD + Ridge)
SVD:
- Decrease dimension
Ridge:
- Regularize
- Output is an angle
Pipeline: image (64, 64, 3) -> autoencoder [tensorflow] (_, 64*3) -> SVD [sklearn] (_, 64) -> Ridge [sklearn] (_, 1)
Characteristics
- Quick to train and to tune hyperparameters
- Cross-validation is quick
- Unstable without SVD
- TF - sklearn bottleneck:
  - Autoencoder uses GPU
  - SVD and Ridge use CPU
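The SVD + Ridge head of the cascade could look roughly like the sketch below. It assumes sklearn's TruncatedSVD stands in for the "SVD" step (the slides do not say which class was used) and replaces the encoder outputs with synthetic low-rank features; the data, alpha and the number of folds are illustrative.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
# Hypothetical stand-in for encoder outputs: 1200 objects, 64*3 features
latent = rng.normal(size=(1200, 20))
mixing = rng.normal(size=(20, 64 * 3))
Z = latent @ mixing + 0.01 * rng.normal(size=(1200, 64 * 3))
angle = latent @ rng.normal(size=20) + 0.1 * rng.normal(size=1200)

head = make_pipeline(
    TruncatedSVD(n_components=64, random_state=0),   # (_, 192) -> (_, 64)
    Ridge(alpha=1.0),                                # (_, 64)  -> (_, 1)
)
scores = cross_val_score(head, Z, angle, cv=5, scoring="r2")
head.fit(Z, angle)
pred = head.predict(Z[:3])
print(scores.mean())
```

Because both steps are cheap linear algebra, cross-validating this head over alpha or n_components stays fast even when the encoder itself is expensive.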
Cascade
Inference optimization
SVD and Ridge are linear models:
- SVD is defined by a projection matrix, so dimension reduction is a matrix operation.
- The Ridge model is described by a matrix multiplication.
We can add them as layers to the Encoder.
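A NumPy sketch of why the two steps fold into one layer: if the SVD step is z -> z P and the Ridge step is r -> r w + c, the composition is z -> z (P w) + c. The projection here comes from a plain uncentered SVD and the Ridge weights from the closed form with an unpenalized intercept; both choices, and the synthetic data, are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 192))                  # hypothetical encoder outputs
y = Z[:, 0] - 2.0 * Z[:, 1] + 0.1 * rng.normal(size=500)

# SVD step: project onto the top-64 right singular vectors
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
P = Vt[:64].T                                    # projection matrix, (192, 64)
Z_red = Z @ P                                    # (500, 64)

# Ridge step in closed form, intercept left unpenalized
alpha = 1.0
mu, y_mean = Z_red.mean(axis=0), y.mean()
Zc = Z_red - mu
w = np.linalg.solve(Zc.T @ Zc + alpha * np.eye(64), Zc.T @ (y - y_mean))
c = y_mean - mu @ w

# Two-stage prediction versus a single fused affine map
two_stage = (Z @ P) @ w + c
W_fused = P @ w                                  # (192,), weights of one dense layer
fused = Z @ W_fused + c
print(np.abs(two_stage - fused).max())
```

W_fused and c are exactly what a single dense output layer with 192 inputs and one unit would need as kernel and bias.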
Cascade
Bigger encoder
Characteristics
- Quick to train and to tune hyperparameters
- Cross-validation is quick
- Unstable without SVD
- TF - sklearn bottleneck:
  - Autoencoder uses GPU
  - SVD and Ridge use CPU
1. The cascade now runs on the GPU
2. No additional C++ code is required for SVD and Ridge
3. Future work will involve only TF and C++ :)
Linear regression | Bias-Variance decomposition
Bias Variance tradeoff
Bias-Variance decomposition
link
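The formula on this slide did not survive extraction. For orientation, the standard form of the decomposition is given here, assuming y = f(x) + ε with E[ε] = 0 and Var(ε) = σ², and an estimate \hat f fitted on a random training sample D; the notation is an assumption and may differ from the slide.

```latex
\mathbb{E}_{D,\varepsilon}\big[(y - \hat f(x))^2\big]
  = \underbrace{\big(\mathbb{E}_D[\hat f(x)] - f(x)\big)^2}_{\text{Bias}^2}
  + \underbrace{\mathbb{E}_D\big[(\hat f(x) - \mathbb{E}_D[\hat f(x)])^2\big]}_{\text{Variance}}
  + \underbrace{\sigma^2}_{\text{Irreducible error}}
```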
Bias Variance tradeoff
BV for linear function
link
[Formulas: expected error decomposed into Bias^2, Variance and irreducible error, for a general f and for the case where f is linear]
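A small Monte Carlo check of the linear case, assuming a fixed design, Gaussian noise and a single query point x0 (all illustrative choices): when f is linear, the OLS prediction is unbiased, so only the variance and the irreducible error remain.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, sigma, trials = 3, 50, 0.5, 2000
b_true = np.array([1.0, -1.0, 2.0])
x0 = np.array([0.5, 0.5, 0.5])               # fixed query point
X = rng.normal(size=(n, d))                  # design kept fixed across trials

preds = np.empty(trials)
for t in range(trials):
    y = X @ b_true + sigma * rng.normal(size=n)     # fresh noise each trial
    b_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    preds[t] = x0 @ b_hat

bias_sq = (preds.mean() - x0 @ b_true) ** 2
variance = preds.var()
# Theoretical variance of the OLS prediction at x0 for a fixed design
variance_theory = sigma**2 * x0 @ np.linalg.solve(X.T @ X, x0)
print(bias_sq, variance, variance_theory)
```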
Linear regression | Mixture model
(Mixture of experts, MoE, Bias minimization)
Mixture model
Mixture or various concentrations
x_i are observed, the component labels k are hidden
Mixture model
Gaussian mixture model
EM algorithm
sklearn.mixture.GaussianMixture
Bilmes
[Figure: scatter plot with axes x1 and x2]
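A short usage sketch for sklearn.mixture.GaussianMixture, which fits the mixture with EM. The two synthetic clusters and the choice of two components are assumptions for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[-3.0, 0.0], scale=0.5, size=(300, 2)),
    rng.normal(loc=[3.0, 1.0], scale=0.5, size=(300, 2)),
])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gmm.predict(X)           # hard assignment of the hidden label k
resp = gmm.predict_proba(X)       # posterior P(k | x_i) for every object
print(gmm.weights_, gmm.means_)
```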
Mixture model
What if we have the following data?
[Figures: Y against X; y against X1 and X2]
Mixture model
How to fit this (GMM + linreg)
Step 1. Fit GMM on X
Step 2. Fit regression
1. Use the GMM to obtain clusters for X
2. Fit a stratified linear regression model using the clusters and X
In terms of MLE (maximum likelihood estimation; cross-entropy), this approach is not the best one. Its consistency properties are unknown.
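The two-step recipe can be sketched as follows, reading "stratified" as one LinearRegression per GMM cluster (an interpretation, not something the slides spell out). The data, the helper name predict and the number of components are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two regimes separated in X, each with its own linear law (synthetic data)
x_a = rng.normal(-3.0, 0.7, size=(300, 1))
x_b = rng.normal(3.0, 0.7, size=(300, 1))
X = np.vstack([x_a, x_b])
y = np.concatenate([2.0 * x_a[:, 0] + 1.0, -1.0 * x_b[:, 0] + 4.0])
y += 0.1 * rng.normal(size=600)

# Step 1: fit GMM on X only and take hard cluster labels
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
clusters = gmm.predict(X)

# Step 2: one linear regression per cluster
models = {k: LinearRegression().fit(X[clusters == k], y[clusters == k])
          for k in np.unique(clusters)}

def predict(X_new):
    """Route each object to its cluster's regression model."""
    k_new = gmm.predict(X_new)
    out = np.empty(len(X_new))
    for k, model in models.items():
        mask = k_new == k
        if mask.any():
            out[mask] = model.predict(X_new[mask])
    return out

slopes = sorted(float(m.coef_[0]) for m in models.values())
print(slopes, predict(np.array([[-3.0], [3.0]])))
```

The clustering step never sees y, which is one way to see why this is not the maximum likelihood fit of a joint model.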
Mixture model
Regression mixture
[Figure: y against X1 and X2]
All these parameters are unknown :)
sklearn doesn’t have this model
Mixture model
Regression mixture (solution)
[Figure: y against X1 and X2]
Faria (EM), link (SGD, NR)
Implement it yourself:
- use the EM algorithm
- or optimize the likelihood with tf
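One possible EM implementation, assuming the model p(y | x) = sum_k pi_k N(y | x b_k, sigma_k^2) with mixing weights that do not depend on x. The slide's exact parameterization is not recoverable, so this is a sketch under that assumption; the function name, the random soft initialization and the synthetic data are all hypothetical. EM can stop in a local optimum, so several restarts are advisable in practice.

```python
import numpy as np

def em_regression_mixture(X, y, n_components=2, n_iter=200, seed=0):
    """EM for a mixture of linear regressions with Gaussian noise."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    resp = rng.dirichlet(np.ones(n_components), size=n)   # soft start, (n, K)
    B = np.zeros((n_components, d))
    sigma2 = np.ones(n_components)
    pi = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # M-step: weighted least squares, noise variance and weight per component
        for k in range(n_components):
            w = resp[:, k]
            Xw = X * w[:, None]
            B[k] = np.linalg.solve(Xw.T @ X + 1e-8 * np.eye(d), Xw.T @ y)
            res_k = y - X @ B[k]
            sigma2[k] = max((w * res_k**2).sum() / (w.sum() + 1e-12), 1e-8)
            pi[k] = max(w.mean(), 1e-12)
        # E-step: posterior responsibility of each component for each object
        res = y[:, None] - X @ B.T                          # (n, K)
        log_p = np.log(pi) - 0.5 * np.log(2 * np.pi * sigma2) - 0.5 * res**2 / sigma2
        log_p -= log_p.max(axis=1, keepdims=True)           # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
    return B, sigma2, pi, resp

rng = np.random.default_rng(1)
n = 600
x = rng.uniform(-3.0, 3.0, size=n)
X = np.column_stack([np.ones(n), x])            # intercept and slope columns
k_true = rng.integers(0, 2, size=n)             # hidden component labels
B_true = np.array([[1.0, 2.0], [4.0, -1.0]])
y = np.einsum("ij,ij->i", X, B_true[k_true]) + 0.2 * rng.normal(size=n)

B, sigma2, pi, resp = em_regression_mixture(X, y)
print(B, sigma2, pi)
```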
Mixture model
Regression mixture (solution)
[Figure: fitted regression mixture, Y against X]
Linear regression | Ensemble (variance minimization)
Ensemble
Linear model ensemble (bagging)
An ensemble of linear models is a linear model too
optimization!
If the models' outputs are independent :)
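A quick numerical illustration of the first claim, assuming bagging means averaging OLS models fitted on bootstrap resamples (data and sizes are illustrative): averaging the predictions of the linear models equals one linear model with averaged coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, n_models = 200, 4, 25
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, 0.0, -2.0, 0.5]) + 0.3 * rng.normal(size=n)

coefs = []
for _ in range(n_models):
    idx = rng.integers(0, n, size=n)                 # bootstrap sample with replacement
    b, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    coefs.append(b)
coefs = np.array(coefs)                              # (n_models, d)

X_new = rng.normal(size=(10, d))
ensemble_pred = np.mean([X_new @ b for b in coefs], axis=0)   # average of predictions
single_pred = X_new @ coefs.mean(axis=0)                      # one model, averaged weights
print(np.abs(ensemble_pred - single_pred).max())
```

At inference time the whole ensemble therefore costs no more than a single linear model.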
If the models' outputs are dependent:
- Models in the ensemble should come from different paradigms (parametric/nonparametric)
- Features should be completely different for different models
for one model
The best output:
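Why dependence matters can be seen from the standard variance formula for an average of m predictors, assuming each has variance σ² and every pair has correlation ρ (these symbols are assumptions; the slide's own formulas were lost in extraction):

```latex
\operatorname{Var}\!\left(\frac{1}{m}\sum_{j=1}^{m} \hat f_j(x)\right)
  = \rho\,\sigma^2 + \frac{1-\rho}{m}\,\sigma^2
```

With ρ = 0 the variance falls as σ²/m; with ρ close to 1 adding models barely helps, which motivates making the models as different as possible.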
Ensemble
Linear model ensemble (stacking)
Leave-one-out
w minimizes error term variance (on train), but
This means that for a linear model the stack is well validated, but not the best in the train sense
(it might prevent overfitting for a linear model…)
[Formula labels: leave-one-out, test_loss estimator, train_loss]
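A sketch of stacking on leave-one-out predictions, assuming two base OLS models on disjoint feature subsets and stacking weights w obtained by least squares (the base models, subsets and data are illustrative assumptions). It also compares the train loss of one base model with its leave-one-out loss.

```python
import numpy as np

def loo_predictions(X, y):
    """Prediction for each object from an OLS model fitted without that object."""
    n = len(y)
    out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        b, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        out[i] = X[i] @ b
    return out

rng = np.random.default_rng(0)
n = 120
X = rng.normal(size=(n, 6))
y = X @ np.array([1.0, -1.0, 0.5, 0.0, 2.0, 0.0]) + 0.5 * rng.normal(size=n)

# Two base linear models on different feature subsets
subsets = [[0, 1, 2], [3, 4, 5]]
P_loo = np.column_stack([loo_predictions(X[:, s], y) for s in subsets])

# Stacking weights w: least squares of y on the leave-one-out predictions
w, *_ = np.linalg.lstsq(P_loo, y, rcond=None)

# Train loss versus leave-one-out loss of the first base model
b0, *_ = np.linalg.lstsq(X[:, subsets[0]], y, rcond=None)
train_loss = np.mean((y - X[:, subsets[0]] @ b0) ** 2)
loo_loss = np.mean((y - P_loo[:, 0]) ** 2)
print(w, train_loss, loo_loss)
```

For OLS the leave-one-out residual equals the train residual divided by 1 - h_ii (h_ii being the leverage), so the leave-one-out loss can never be smaller than the train loss.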
Linear regression | Jackknife estimators
(why stacking works)
Jackknife
Jackknife
Then
link
Jackknife
Jackknife application
V_n is a consistent estimator of the asymptotic
covariance matrix of the MSE estimator
J. Shao, Mathematical Statistics, ch 5.
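A sketch of a jackknife covariance estimate for the OLS coefficients, assuming the common form V = (n-1)/n * sum_i (b_(-i) - mean)(b_(-i) - mean)^T over leave-one-out estimates b_(-i). The slide's exact definition of V_n was lost in extraction, so this form, the synthetic data and the comparison with the classical estimate are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, sigma = 300, 3, 0.5
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + sigma * rng.normal(size=n)

# Leave-one-out OLS estimates b_(-i)
loo = np.empty((n, d))
for i in range(n):
    keep = np.arange(n) != i
    loo[i] = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]

# Jackknife covariance estimate of the OLS estimator
centered = loo - loo.mean(axis=0)
V_jack = (n - 1) / n * centered.T @ centered

# Classical estimate s^2 (X^T X)^{-1}, for comparison
b_hat = np.linalg.lstsq(X, y, rcond=None)[0]
s2 = np.sum((y - X @ b_hat) ** 2) / (n - d)
V_classic = s2 * np.linalg.inv(X.T @ X)
print(np.diag(V_jack), np.diag(V_classic))
```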
Jackknife
Jackknife
By the theorem, the stacking loss is a consistent test-loss estimator.
The i-th error term is estimated by a model fitted on the sample without the i-th object.
well, here should be a 10-page article with theorems …, link
GlobalLogic Machine Learning Webinar “Advanced Statistical Methods for Linear Regression”