Machine Learning with Spark and Cassandra - Model Selection Tests
Series
Machine Learning with Spark and Cassandra
● Environment Setup
● Data Pre-processing
● Testing
● Cross-Validation
● Model Selection Tests
● Deployment
What are model selection tests?
Overview
● Used to compare the relative performance of two different machine learning algorithms
○ Only gives information within a specific domain, based on the data used for tests
● Similar to statistical significance tests used in scientific research
○ Checking whether performance differences are due to model skill or random chance
○ Null hypothesis is that any observed difference is due to random chance
● Requires a specific shared measure of model skill
○ Cannot compare classification vs regression models
○ Cannot compare one model's accuracy to another model's F1 score
● Different tests make different statistical assumptions
Types of tests
Wilcoxon signed-rank test
● A non-parametric alternative to the paired Student's t-test, useful with a small number of samples
● Use k-fold cross validation to generate k scores for each model
● Feed those two sets of k scores into the Wilcoxon signed-rank test
○ Usually described as a procedure rather than as a single formula
○ Compute the difference between the two models' scores on each fold and rank the differences by absolute
value. Then reattach the sign of each difference to its rank and sum the signed ranks.
○ The result is a p-value. As in scientific studies, if p < 0.05 we reject the null hypothesis.
■ p < 0.05 means that, if the two models really performed equally well, a difference at least this large
would show up less than 5% of the time. It is not a 95% probability that the difference is real.
● Models must be trained and tested using exactly the same cross-validation folds
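
The ranking procedure above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the deck's own code: the function name `wilcoxon_signed_rank` and the ten fold accuracies are made up for the example, zero differences are dropped, tied differences share an average rank, and the p-value is computed exactly by enumerating all sign patterns, which is only practical for small k. In practice a library routine such as `scipy.stats.wilcoxon` would normally be used instead.

```python
from itertools import product


def wilcoxon_signed_rank(scores_a, scores_b):
    """Exact two-sided Wilcoxon signed-rank test on paired fold scores.

    Returns (w_plus, p_value), where w_plus is the sum of ranks of the
    positive differences.
    """
    # Per-fold differences; rounding guards against float noise when detecting ties.
    diffs = [round(a - b, 10) for a, b in zip(scores_a, scores_b)]
    diffs = [d for d in diffs if d != 0]  # zero differences carry no sign
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0

    # Rank absolute differences from smallest (1) to largest (n), averaging ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        average_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = average_rank
        start = end + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Under the null hypothesis each of the 2^n sign patterns is equally likely,
    # so count the patterns whose rank sum is at least as far from its expectation.
    expected = sum(ranks) / 2
    observed = abs(w_plus - expected)
    as_extreme = sum(
        1
        for signs in product((False, True), repeat=n)
        if abs(sum(r for r, s in zip(ranks, signs) if s) - expected) >= observed
    )
    return w_plus, as_extreme / 2 ** n


# Hypothetical accuracies from the same 10 cross-validation folds.
scores_a = [0.85, 0.88, 0.84, 0.90, 0.86, 0.87, 0.83, 0.89, 0.85, 0.88]
scores_b = [0.81, 0.86, 0.85, 0.85, 0.80, 0.84, 0.80, 0.82, 0.77, 0.79]

w_plus, p_value = wilcoxon_signed_rank(scores_a, scores_b)
print(f"W+ = {w_plus}, p = {p_value:.4f}")
```

Both score lists must come from the same folds in the same order, since the test works on the per-fold differences.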
McNemar’s test
● Checks how closely the predictions made by two models agree
● Build a contingency table
○ Similar to a confusion matrix, but rather than class predictions its categories are based on whether each
model successfully predicted the actual value
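○ The two disagreement cells, b (only the first model is correct) and c (only the second model is correct), give
the statistic x^2 = (b - c)^2 / (b + c), which is then used to calculate the p-value
● Works best if b and c are large
○ Variations exist for situations where b and c are small (for example an exact binomial test or a continuity correction)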
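
As a rough sketch of how the contingency counts turn into a p-value, the following standard-library Python builds b and c from two prediction lists and applies either the continuity-corrected chi-squared version or an exact binomial version. The function name `mcnemar_test`, the cutoff of 25 for switching to the exact test (a common rule of thumb, not something stated in the slides), and the sample data are all assumptions made for illustration.

```python
import math


def mcnemar_test(y_true, pred_a, pred_b, exact_below=25):
    """McNemar's test on the disagreements between two classifiers.

    Returns (b, c, p_value), where b counts samples only model A got right
    and c counts samples only model B got right.
    """
    b = sum(1 for y, pa, pb in zip(y_true, pred_a, pred_b) if pa == y and pb != y)
    c = sum(1 for y, pa, pb in zip(y_true, pred_a, pred_b) if pa != y and pb == y)
    n = b + c
    if n == 0:
        return b, c, 1.0  # the models never disagree on correctness

    if n < exact_below:
        # Exact two-sided binomial test: under the null, b ~ Binomial(n, 0.5).
        k = min(b, c)
        p_value = min(1.0, 2 * sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n)
        return b, c, p_value

    # Chi-squared statistic with continuity correction, 1 degree of freedom.
    statistic = (abs(b - c) - 1) ** 2 / n
    # Survival function of chi-squared with 1 dof: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return b, c, p_value


# Hypothetical binary task with 100 test samples, all with true label 1:
# 40 both right, 30 only A right, 10 only B right, 20 both wrong.
y_true = [1] * 100
pred_a = [1] * 40 + [1] * 30 + [0] * 10 + [0] * 20
pred_b = [1] * 40 + [0] * 30 + [1] * 10 + [0] * 20

b, c, p_value = mcnemar_test(y_true, pred_a, pred_b)
print(f"b = {b}, c = {c}, p = {p_value:.4f}")
```

Only the disagreement cells enter the test; the samples both models get right or both get wrong do not affect the result.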
5x2CV paired t-test
● A variation of the paired t-test; like the signed-rank test, it works on paired scores from the two models
● Take a random 50/50 split of the data, train both models on one half and test them on the other to get the
score difference DiffA, then swap the halves to get DiffB
○ Repeat five times and calculate the mean and variance of the two differences in each repetition
○ Calculate the t statistic, then use t to calculate the p-value
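
A minimal sketch of the arithmetic, assuming the statistic as defined by Dietterich (1998): the first repetition's DiffA divided by the square root of the average of the five per-repetition variances, compared against a t distribution with 5 degrees of freedom. The function names and the ten score differences below are hypothetical, and the p-value helper uses a closed-form expression that is valid only for 5 degrees of freedom. Training and scoring the models is left out so the example stays self-contained.

```python
import math


def five_by_two_cv_t(diffs):
    """t statistic of the 5x2CV paired t-test.

    diffs: five (DiffA, DiffB) pairs, one pair per repetition, each value
    being score(model A) - score(model B) on one half of the data.
    """
    if len(diffs) != 5:
        raise ValueError("expected five (DiffA, DiffB) pairs")
    variances = []
    for diff_a, diff_b in diffs:
        mean = (diff_a + diff_b) / 2
        variances.append((diff_a - mean) ** 2 + (diff_b - mean) ** 2)
    denominator = math.sqrt(sum(variances) / 5)
    if denominator == 0:
        raise ValueError("all variances are zero; the statistic is undefined")
    # Numerator is the very first difference (repetition 1, DiffA).
    return diffs[0][0] / denominator


def two_sided_p_t5(t):
    """Two-sided p-value for Student's t with exactly 5 degrees of freedom."""
    theta = math.atan(abs(t) / math.sqrt(5))
    s, c = math.sin(theta), math.cos(theta)
    # P(|T| <= t) for 5 dof in closed form.
    inside = (2 / math.pi) * (theta + s * (c + (2 / 3) * c ** 3))
    return 1 - inside


# Hypothetical accuracy differences from five repetitions of 2-fold CV.
diffs = [(0.04, 0.02), (0.03, 0.05), (0.02, 0.04), (0.05, 0.03), (0.03, 0.03)]

t_stat = five_by_two_cv_t(diffs)
p_value = two_sided_p_t5(t_stat)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```

Because the numerator uses only one of the ten differences, the result depends on which split happens to come first.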
5x2CV combined F test
● A variation of the 5x2CV paired t-test
● Rather than relying on a single one of the ten score differences between model a and model b, all ten
differences are combined and then we estimate mean and variance for each repetition as before
● Then calculate the F statistic and use it to calculate the p-value
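
The combined statistic can be sketched as follows, assuming Alpaydin's (1999) definition: the sum of all ten squared differences divided by twice the sum of the five variances, compared against an F distribution with 10 and 5 degrees of freedom. The function name and data are hypothetical. To stay within the standard library the sketch compares against the 5% critical value (about 4.74) instead of computing a p-value; a p-value would need the F distribution's survival function, for example `scipy.stats.f.sf(f_stat, 10, 5)`.

```python
F_CRITICAL_10_5 = 4.74  # upper 5% point of the F distribution with (10, 5) dof


def five_by_two_cv_f(diffs):
    """F statistic of the combined 5x2CV F test.

    diffs: five (DiffA, DiffB) pairs, one pair per repetition, each value
    being score(model A) - score(model B) on one half of the data.
    """
    if len(diffs) != 5:
        raise ValueError("expected five (DiffA, DiffB) pairs")
    squared_sum = 0.0
    variance_sum = 0.0
    for diff_a, diff_b in diffs:
        mean = (diff_a + diff_b) / 2
        squared_sum += diff_a ** 2 + diff_b ** 2
        variance_sum += (diff_a - mean) ** 2 + (diff_b - mean) ** 2
    if variance_sum == 0:
        raise ValueError("all variances are zero; the statistic is undefined")
    return squared_sum / (2 * variance_sum)


# Hypothetical accuracy differences from five repetitions of 2-fold CV.
diffs = [(0.04, 0.02), (0.03, 0.05), (0.02, 0.04), (0.05, 0.03), (0.03, 0.03)]

f_stat = five_by_two_cv_f(diffs)
reject_null = f_stat > F_CRITICAL_10_5
print(f"F = {f_stat:.3f}, reject null at 5%: {reject_null}")
```

Unlike the paired t-test version, reordering the repetitions leaves this statistic unchanged, since every difference contributes to the numerator.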
Any Questions?
Strategy: Scalable Fast Data
Architecture: Cassandra, Spark, Kafka
Engineering: Node, Python, JVM, CLR
Operations: Cloud, Container
Rescue: Downtime!! I need help.
www.anant.us | solutions@anant.us | (855) 262-6826
3 Washington Circle, NW | Suite 301 | Washington, DC 20037
