Machine Learning Interviews – 
Day 5 
Arpit Agarwal
Practical Considerations for Selecting 
Learning Algorithms 
• Large Number of Samples in Dataset? 
– Don’t worry about variance, but worry about 
computational time 
– Can use Random Forests, kernel-SVM with SGD, 
SMO, Logistic Regression with SGD, Perceptron 
– Don’t use k-NN 
– The problem with SVM is that it has too many 
parameters, so computational time is large
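To make "Logistic Regression with SGD" concrete, here is a minimal sketch in plain Python. Each update touches one sample, which is why the cost per step does not grow with the dataset size. The function names, the toy data, and the defaults for lr, epochs, l2 and seed are illustrative assumptions, not values from the slides.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logreg_sgd(X, y, lr=0.1, epochs=50, l2=0.0, seed=0):
    """Fit weights w and bias b by per-sample gradient steps on log-loss.

    X: list of feature lists, y: list of 0/1 labels.
    lr, epochs, l2 and seed are illustrative defaults, not tuned values.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in a fresh random order each epoch
        for i in order:
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            err = p - y[i]  # gradient of the log-loss w.r.t. the logit
            w = [wj - lr * (err * xj + l2 * wj) for wj, xj in zip(w, X[i])]
            b -= lr * err
    return w, b


def predict_proba(w, b, x):
    """Estimated probability that x belongs to class 1."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Toy data (hypothetical): the label is 1 when the single feature is positive.
X = [[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]]
y = [0, 0, 0, 1, 1, 1]
w, b = train_logreg_sgd(X, y)
```

Setting l2 to a small positive value adds the regularization that the small-dataset advice calls for.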
• Very Small Dataset 
– Worry about variance of your classifier 
– Naïve Bayes works very well with a small amount of 
data 
– Can use SVM with linear kernel with 
regularization, Logistic Regression with 
regularization 
– Don’t use decision trees
• Very low Dimensionality? 
– Worry about high bias 
– Need to use powerful kernel methods like Kernel SVM or 
Kernel LR, provided we have enough data 
– Useful to collect more features 
• Very Large Dimensionality? 
– Don’t worry about high bias, worry about 
computational time 
– SVM with linear kernel, Random forests can be used 
– Can’t use Decision Trees
• Want probability estimates? 
– Logistic Regression is good (or SVM with Platt scaling) 
– Might not want to use Random Forests or kNN unless 
you have a large amount of data 
• Working with Text Data? 
– Naïve Bayes works very well 
• Want to constantly update your model with new 
data? 
– Difficult to use Random Forests 
– Can use Logistic Regression, Perceptron, kNN
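The "SVM with Platt scaling" option above can be sketched as follows. Platt scaling maps a raw classifier score f to a probability 1 / (1 + exp(A*f + B)) by fitting the two scalars A and B on held-out scores. This sketch fits them with plain gradient descent on the log-loss and omits refinements of the original method such as target smoothing. The scores, labels, lr and iters are made-up illustrative values.

```python
import math


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_platt(scores, labels, lr=0.1, iters=2000):
    """Fit A, B in P(y=1 | f) = 1 / (1 + exp(A*f + B)) by gradient descent."""
    A, B = 0.0, 0.0
    n = len(scores)
    for _ in range(iters):
        grad_A = grad_B = 0.0
        for f, t in zip(scores, labels):
            p = _sigmoid(-(A * f + B))  # current calibrated probability
            grad_A += (t - p) * f       # d(log-loss)/dA for this sample
            grad_B += (t - p)           # d(log-loss)/dB for this sample
        A -= lr * grad_A / n
        B -= lr * grad_B / n
    return A, B


def platt_proba(A, B, score):
    """Calibrated probability for a raw decision score."""
    return _sigmoid(-(A * score + B))


# Hypothetical held-out SVM decision scores and their true 0/1 labels.
scores = [-2.0, -1.0, -0.5, 0.3, 0.4, 1.0, 1.5, 2.5]
labels = [0, 0, 1, 0, 1, 1, 1, 1]
A, B = fit_platt(scores, labels)
```

When larger scores mean class 1, the fitted A comes out negative, so the calibrated probability increases with the score.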
• Categorical Attributes? 
– Can work with Decision Trees and Random Forests 
• Don’t want any parameter tuning? 
– Use Naïve Bayes, Random Forests 
– Don’t use SVM 
• Can afford a long training time but want fast 
prediction? 
– Use SVM, Neural Networks 
– Don’t use kNN
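The prediction-time point can be illustrated with a short sketch: brute-force kNN has no training phase, so each prediction scans the whole training set, whereas a trained linear model needs one dot product. The toy data and the hand-picked weights w and b below are hypothetical, chosen only to make the contrast visible.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Brute-force k-NN: every prediction computes a distance to every sample."""
    dists = sorted((math.dist(xi, x), yi) for xi, yi in zip(train_X, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def linear_predict(w, b, x):
    """A trained linear model: one dot product, independent of training-set size."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else 0


# Toy training set (hypothetical): two well-separated clusters.
train_X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = [0, 0, 0, 1, 1, 1]

# Hand-picked separating line x1 + x2 = 6, standing in for a trained model.
w, b = [1.0, 1.0], -6.0
```

Tree-based indexes can speed up kNN lookups in low dimensions, but the basic trade-off stays the same: kNN defers all its work to prediction time.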
• The underlying data is too complex? 
– SVM with powerful kernel, Neural Networks 
• Want to parallelize your algorithm? 
– Random Forests are fairly easy to parallelize; SVM can 
also be parallelized
Linear Regression 
• On Board
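The board derivation is not part of these slides. As a reminder of the standard result only, here is a minimal sketch of ordinary least squares for a single feature, using the textbook closed form slope = cov(x, y) / var(x). The function name and the data are illustrative assumptions.

```python
def fit_simple_ols(xs, ys):
    """Closed-form least-squares line y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Points lying exactly on y = 2x + 1 (made-up data).
slope, intercept = fit_simple_ols([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```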
Perceptron 
• On Board
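The board content is not in these slides. A minimal sketch of the classic perceptron update rule, assuming labels in {-1, +1}, looks like this. The function name, the epoch limit and the toy data are illustrative.

```python
def train_perceptron(X, y, epochs=10):
    """Classic perceptron; labels must be -1 or +1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            activation = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * activation <= 0:  # misclassified, or exactly on the boundary
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                b += yi
                mistakes += 1
        if mistakes == 0:  # a full pass with no mistakes: converged
            break
    return w, b


def perceptron_predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1


# Linearly separable toy data (hypothetical).
X = [[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]]
y = [1, 1, -1, -1]
w, b = train_perceptron(X, y)
```

Because each update uses one sample at a time, the same loop body can be applied to newly arriving data, which is why the perceptron appears in the list of models that are easy to update.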
SVD 
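• Any real m x n matrix A (with m ≥ n) can be decomposed as $A = U D V^T$; the singular values are unique, although U and V in general are not 
• U is m x n and column orthonormal ($U^T U = I$) 
• D is n x n and diagonal: $D = \mathrm{diag}(\sigma_1, \ldots, \sigma_n)$ 
– $\sigma_i$ are called the singular values of A 
– It is assumed that $\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_n \ge 0$ 
• V is n x n and orthonormal ($V V^T = V^T V = I$)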
SVD 
• If m = n, then: 
• U is n x n and orthonormal ($U^T U = U U^T = I$) 
• D is n x n and diagonal 
• V is n x n and orthonormal ($V V^T = V^T V = I$)
SVD 
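• The columns of U are eigenvectors of $A A^T$ 
• The columns of V are eigenvectors of $A^T A$ 
• Compare the eigendecomposition of a (diagonalizable) square matrix: $A = P \Lambda P^{-1}$ 
• If $\lambda_i$ is an eigenvalue of $A^T A$ (or $A A^T$), then $\lambda_i = \sigma_i^2$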
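These eigenvector statements follow in one line each once the decomposition is written as $A = U D V^T$ with $U^T U = I$, $V^T V = I$ and $D$ diagonal. A short worked derivation:

```latex
\begin{align*}
A^T A &= (U D V^T)^T (U D V^T) = V D\, U^T U\, D V^T = V D^2 V^T
  &&\Rightarrow\; A^T A\, v_i = \sigma_i^2\, v_i \\
A A^T &= (U D V^T)(U D V^T)^T = U D\, V^T V\, D U^T = U D^2 U^T
  &&\Rightarrow\; A A^T\, u_i = \sigma_i^2\, u_i
\end{align*}
```

When m > n, $A A^T$ is m x m and has m - n further eigenvalues, all equal to zero, whose eigenvectors are not among the columns of the m x n matrix U.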
$U = (u_1 \; u_2 \; \ldots \; u_n)$, $V = (v_1 \; v_2 \; \ldots \; v_n)$, $D = \mathrm{diag}(\sigma_1, \ldots, \sigma_n)$
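The column notation can be checked numerically. This sketch uses power iteration on $A^T A$ (a standard technique, not something the slides prescribe) to recover the largest singular value with its vectors $u_1$ and $v_1$, then forms the leading rank-1 term $\sigma_1 u_1 v_1^T$. The helper names and the 2 x 2 example matrix are illustrative assumptions.

```python
import math


def transpose(M):
    return [list(col) for col in zip(*M)]


def matvec(M, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]


def matmul(P, Q):
    Qt = transpose(Q)
    return [[sum(a * b for a, b in zip(row, col)) for col in Qt] for row in P]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def top_singular_triplet(A, iters=200):
    """Power iteration on A^T A for the largest singular value and its vectors."""
    AtA = matmul(transpose(A), A)
    # Start vector; assumed not orthogonal to the leading eigenvector.
    v = [1.0] + [0.0] * (len(AtA) - 1)
    for _ in range(iters):
        w = matvec(AtA, v)
        n = norm(w)
        v = [x / n for x in w]
    # Rayleigh quotient gives the eigenvalue of A^T A, i.e. sigma^2.
    eigenvalue = sum(a * b for a, b in zip(v, matvec(AtA, v)))
    sigma = math.sqrt(eigenvalue)
    u = [x / sigma for x in matvec(A, v)]  # u_1 = A v_1 / sigma_1
    return sigma, u, v


A = [[3.0, 0.0], [4.0, 5.0]]
sigma1, u1, v1 = top_singular_triplet(A)

# Leading rank-1 term of the decomposition: sigma_1 * u_1 * v_1^T.
A1 = [[sigma1 * ui * vj for vj in v1] for ui in u1]
```

For this matrix, $A^T A$ has eigenvalues 45 and 5, so the largest singular value is $\sqrt{45}$.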
Relation with PCA 
• On board
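The board derivation is not in these slides. The standard relation, stated here only as a reminder, assumes a data matrix $X$ with $N$ rows (samples) and centered columns, with SVD $X = U D V^T$:

```latex
\begin{align*}
C &= \frac{1}{N-1} X^T X = V \left( \frac{D^2}{N-1} \right) V^T \\
X V &= U D
\end{align*}
```

Under that assumption, the principal directions are the columns of $V$, the variance along direction $v_i$ is $\sigma_i^2 / (N-1)$, and the projected data (the principal component scores) are the columns of $U D$.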
Disclaimer: This crash course was just to aid your 
preparation, not to replace it.
All the very best for your placements!

Editor's Notes

  • #14: Rank L approximation
  • #16: Don’t think that these slides are enough