Andrew Clark, Principal Machine Learning Auditor
Capital One
The Machine Learning Audit
Massachusetts Bay Company
 First record of auditors in the New World
 London venture capitalists backed colony
efforts in the New World
 Massachusetts Bay Company employed 8
auditors.
Brooks, Rebecca Beatrice. "History of the Massachusetts Bay Colony." History of Massachusetts. January 5, 2015. Accessed
October 26, 2017. http://historyofmassachusetts.org/history-of-the-massachusetts-bay-colony/.
Overview
 What is Machine Learning?
 Why is it important?
 Why do we need machine learning audits?
 What exactly is a machine learning audit?
 What would a machine learning audit entail?
 Full-length example using the CRISP-DMA
framework
Kong, Qingkai. "Machine Learning 1 - What is machine learning and real world
example." Qingkai's Blog (web log), October 4, 2016. Accessed February 21, 2017.
http://qingkaikong.blogspot.com/2016/10/machine-learning-1-what-is-machine.html?showComment=1484689212391#c4748865641151946089.
What is Machine Learning?
 A computer recognizing patterns without having to be explicitly programmed.
The Machine Learning Audit. MIS ITAC 2017 Keynote
Why is Machine Learning important?
Disrupting business. For example, ML-powered businesses disrupted Blockbuster, taxis, etc.
Revolutionizing existing business models: predictive maintenance in manufacturing,
retailing, credit card fraud detection, loan underwriting.
One of the key technologies driving economic growth.
One of the most talked about but least understood topics in modern discourse, e.g.
“Facebook shuts down robots after they invent their own language” (The Telegraph,
August 1, 2017) and “Elon Musk: regulate AI to combat 'existential threat' before it's
too late” (The Guardian, July 17, 2017).
Sensational stories are clickbait.
What Machine Learning is not:
 Magic
 Going to take your job (for the majority of professionals)
 Always the best tool for the job
What do all these buzz words mean?
Machine Learning based artificial intelligent Big Data
spewing Deep Learning Neural Network touting
Cognitive Computing Virtual Reality Natural Language
Processing Chat Bot.
Why do we need machine learning audits?
 With algorithms increasingly dictating our lives, how do we know that they are
operating as intended?
 e.g. Weapons of Math Destruction by Cathy O'Neil
 Some believe the EU General Data Protection Regulation (GDPR) provides a “Right to
Explanation”, although this is not explicitly stated and is untested in the courts.
What exactly is a machine learning audit?
Examination of the purpose, process, execution, and monitoring of a
machine learning model ‘in the wild’.
As assurance professionals, how do we know that the model is doing what
it should be doing? What is the risk to the business?
Data science is a new discipline, without the formal rigor and mature
processes that exist in other disciplines. Statistics is a profession that has
been around for years, yet its peer review process still has many issues,
and its models aren’t as complicated!
What would a machine learning audit entail?
 Understand the business use case.
 Model integration into existing architecture.
 Potential regulatory or risk constraints
 “Data Sciencey stuff” – e.g.
 How was the test data obtained?
 How was the data cleaned?
 How was the feature engineering conducted?
 How was the specific algorithm decided upon?
 Are there correction cascades?
 How was the model evaluated?
 What was the process to prevent overfitting, etc.?
 Is the model accomplishing what the business wanted it to accomplish?
Introducing the CRISP-DMA framework
Framework written by yours truly that extends the industry-standard data mining
framework, CRISP-DM, to auditing machine learning implementations.
Leverages the six existing, iterative steps of the CRISP-DM model:
Business Understanding
Data Understanding
Data Preparation
Modeling
Evaluation
Deployment
Business Understanding
 What is the goal of the algorithm?
 Have models been used in this use case before?
 What attributes, e.g. temperature, humidity, etc., have been identified by the business as key factors for deriving
the desired decision in the given use case?
 Are there any regulatory constraints or considerations of which to be aware?
Data Understanding
 What dataset[s] were used to train the model?
 What dataset[s] are used for production prediction?
 Where did the dataset[s] identified in the first two questions originate, e.g. web-scraped data, log files, relational databases?
 Are all of the input variables in the same format, e.g. miles versus kilometers?
 Have the correlations and covariances been examined?
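The unit-format and correlation questions above lend themselves to a quick programmatic check. The following is a minimal sketch using only the Python standard library; the column names and readings are hypothetical, made up purely for illustration, and are not taken from the talk.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def miles_to_km(miles):
    """Convert miles to kilometers (1 mile = 1.609344 km)."""
    return miles * 1.609344

# Hypothetical distances recorded in mixed units; normalize before modeling.
distances = [(10.0, "km"), (5.0, "mi"), (2.5, "km")]
distances_km = [v if unit == "km" else miles_to_km(v) for v, unit in distances]

# Hypothetical sensor readings (illustrative values only).
temperature_c = [20.0, 22.0, 24.0, 26.0, 28.0]
humidity_pct = [80.0, 75.0, 70.0, 65.0, 60.0]

r = pearson(temperature_c, humidity_pct)
print("distances in km:", distances_km)
print("temperature/humidity correlation:", r)
```

A correlation close to +1 or -1 between two inputs is worth raising with the model owner, since strongly correlated inputs can make model weights harder to interpret.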
Data Preparation
 How was the data cleaned?
 If supervised learning was used, how was the training dataset created?
 Were standard software development techniques used for the ETL process for
production models?
 How was the data scaled?
 How were the variables selected? Was an automated variable selection technique
utilized?
 What process was used to separate the data into train and test sets? Was care taken to
avoid peeking at the test set?
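One concrete way an auditor can check for peeking is to confirm that scaling parameters were computed from the training set only and then reused on the test set. The sketch below illustrates that ordering with the standard library; the helper names, the 25% test fraction, the seed, and the data are all assumptions made for illustration.

```python
import random
import statistics

def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and hold out a fraction as the test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_scaler(values):
    """Learn the mean and standard deviation from the training values only."""
    return statistics.mean(values), statistics.pstdev(values)

def scale(values, mean, std):
    """Standardize values with previously learned parameters."""
    return [(v - mean) / std for v in values]

readings = [float(v) for v in range(10, 30)]  # 20 hypothetical readings

# Split first, then fit the scaler on the training portion only.
train, test = train_test_split(readings)
mean, std = fit_scaler(train)
train_scaled = scale(train, mean, std)
test_scaled = scale(test, mean, std)  # reuses training statistics

print(len(train), "training rows,", len(test), "test rows")
```

If the mean and standard deviation had been computed on the full dataset before splitting, information from the test set would have leaked into training.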
Modeling
What was the thought process behind choosing algorithm[s] for
the model?
What steps were used to guard against overfitting?
What process was used to optimize the chosen algorithm?
Was the algorithm coded from scratch or was a standard library
used? If a library was used, what are its license terms?
What type of version control was utilized?
Evaluation
 What metrics were used to evaluate the model?
 What process and metrics are in place to monitor the continued accuracy and stability
of the model?
 Create a mock dataset that covers all of the relevant assumptions and run it
through the algorithm to test that it is operating as intended.
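The mock-dataset test can be as simple as a table of inputs with the outcome the business expects for each. In the sketch below, `predict_rain` is a hypothetical stand-in for the model under audit (a made-up rule, not the model from the talk), and the thresholds and cases are assumptions chosen only to show the pattern.

```python
def predict_rain(humidity_pct, pressure_hpa):
    """Hypothetical stand-in for the audited model: 1 = rain, 0 = no rain."""
    return 1 if humidity_pct >= 85 and pressure_hpa <= 1005 else 0

# Each mock case encodes one business assumption: (humidity, pressure, expected).
mock_cases = [
    (95.0, 990.0, 1),    # humid and low pressure: rain expected
    (85.0, 1005.0, 1),   # boundary values still count as rain
    (84.9, 1005.0, 0),   # just below the humidity threshold
    (30.0, 1030.0, 0),   # dry and high pressure
    (95.0, 1030.0, 0),   # humid but high pressure
]

# Collect every case where the model disagrees with the expectation.
failures = [
    case for case in mock_cases
    if predict_rain(case[0], case[1]) != case[2]
]

print("failed cases:", failures)
```

In a real audit the stand-in function would be replaced by a call to the deployed model, and any failed case becomes a finding to discuss with the model owner.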
Deployment
 How was the model moved to production? Was it rewritten by the engineering team,
or does it rely on an API, etc.? If it was rewritten, a code review for accuracy should be
performed.
 Is the model accomplishing what the business wanted it to accomplish?
Raspberry Pie Machine Learning Weather Prediction - A simple
example
Architecture Diagram
Raspberry Pi readings and actual weather
Aggregate readings to one average reading every thirty minutes
Aggregation cont.
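The aggregation step can be sketched as follows. This is a minimal standard-library illustration rather than the code shown on the slides; the timestamps, values, and function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def half_hour_bucket(ts):
    """Round a timestamp down to the start of its thirty-minute window."""
    return ts.replace(minute=0 if ts.minute < 30 else 30, second=0, microsecond=0)

def aggregate(readings):
    """Average all readings that fall into the same thirty-minute window."""
    buckets = defaultdict(list)
    for ts, value in readings:
        buckets[half_hour_bucket(ts)].append(value)
    return {bucket: sum(vals) / len(vals) for bucket, vals in sorted(buckets.items())}

# Hypothetical temperature readings (timestamp, degrees Celsius).
readings = [
    (datetime(2017, 10, 1, 9, 5), 20.0),
    (datetime(2017, 10, 1, 9, 20), 22.0),
    (datetime(2017, 10, 1, 9, 40), 25.0),
]

averages = aggregate(readings)
for bucket, avg in averages.items():
    print(bucket.isoformat(), avg)
```

An auditor reviewing this step would want to know how windows with no readings, or with only one reading, are handled.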
Convert the status to 1 if the status is rain or thunderstorm, 0 otherwise
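A minimal sketch of the label conversion is shown below. The exact status strings and their capitalization are assumptions. The speaker notes also mention shifting the actual weather status by thirty minutes; the direction used here (pairing each reading with the status of the following interval) is an assumption made for illustration.

```python
RAIN_STATUSES = {"rain", "thunderstorm"}

def to_label(status):
    """Return 1 for rain or thunderstorm, 0 for any other status."""
    return 1 if status.strip().lower() in RAIN_STATUSES else 0

# Hypothetical status per thirty-minute interval, in time order.
statuses = ["Clear", "Clouds", "Rain", "Thunderstorm", "Clear"]
labels = [to_label(s) for s in statuses]

# Hypothetical feature rows (humidity only) for the same intervals.
features = [[55.0], [70.0], [90.0], [95.0], [60.0]]

# Pair each interval's features with the NEXT interval's label, so a model
# trained on these pairs predicts the weather thirty minutes ahead.
pairs = list(zip(features[:-1], labels[1:]))

print(labels)
print(pairs)
```

The last interval is dropped because no following status exists for it yet.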
Split the data into training and test sets
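For time-ordered sensor data, one option an auditor may want to ask about is a chronological split, where the most recent rows are held out for testing. This is an alternative to a random split and is not necessarily what the talk's example used; the function name, the 20% fraction, and the rows are hypothetical.

```python
def chronological_split(rows, test_fraction=0.2):
    """Hold out the most recent rows as the test set (rows must be time-ordered)."""
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[:-n_test], rows[-n_test:]

# Hypothetical rows of (interval_index, label), already sorted by time.
rows = [(i, i % 2) for i in range(10)]

train, test = chronological_split(rows)
print("train intervals:", [r[0] for r in train])
print("test intervals:", [r[0] for r in test])
```

Holding out the latest data means the model is never evaluated on intervals that precede its training data.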
View model accuracy
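Accuracy on its own can mislead when rain is rare, which is why the speaker notes call for explaining the difference between AUC and accuracy. The sketch below computes both from scratch on made-up labels and scores; the data and the 0.5 threshold are assumptions.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Share of observations whose thresholded prediction matches the label."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)

def auc(y_true, y_prob):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [p for p, t in zip(y_prob, y_true) if t == 1]
    neg = [p for p, t in zip(y_prob, y_true) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical test set: 8 dry intervals, 2 rainy ones.
y_true = [0] * 8 + [1] * 2

# A "model" that always predicts a low chance of rain.
always_dry = [0.1] * 10
print(accuracy(y_true, always_dry))  # 0.8
print(auc(y_true, always_dry))       # 0.5

# A model whose scores rank most rainy intervals above dry ones.
ranked = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4, 0.2, 0.35, 0.9]
print(accuracy(y_true, ranked), auc(y_true, ranked))
```

The always-dry scores reach 80% accuracy while carrying no ranking information at all, which is exactly what an AUC of 0.5 signals.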
Examine model weights
Test the model by manually passing in observations
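Examining the weights and passing in observations by hand can be illustrated together. The speaker notes mention scaling the data and fitting a basic logistic regression; the sketch below is a from-scratch stand-in for a library implementation, with an L2 penalty, trained by batch gradient descent. The feature meanings, data, learning rate, epoch count, and penalty strength are all assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000, l2=0.01):
    """Batch gradient descent for logistic regression with an L2 penalty on the weights."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    """Predicted probability of rain for one observation."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical, already-scaled features: [humidity, pressure]; label 1 = rain.
X = [[1.2, -1.0], [0.8, -0.7], [1.0, -1.3], [-0.9, 1.1], [-1.1, 0.8], [-1.0, 1.0]]
y = [1, 1, 1, 0, 0, 0]

w, b = fit_logistic(X, y)
print("weights:", w, "intercept:", b)

# Manually constructed observations: humid/low pressure, then dry/high pressure.
p_wet = predict_proba(w, b, [1.5, -1.5])
p_dry = predict_proba(w, b, [-1.5, 1.5])
print(p_wet, p_dry)
```

The signs of the weights are the first thing to check: here a positive humidity weight and a negative pressure weight match the intuition that humid, low-pressure conditions favor rain. Manual observations then confirm that the predictions move in the expected direction.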
Conclusion and Recap
 What machine learning is.
 Why machine learning is important.
 Why we need machine learning audits.
 What constitutes a machine learning audit.
 What a machine learning audit entails.
 Overview of the CRISP-DMA framework.
 Simple end-to-end machine learning audit example using the CRISP-DMA framework.
Thank you!
 Email: andrewtaylorclark@gmail.com
 GitHub: aclarkData
 Blog: https://aclarkdata.github.io/
 LinkedIn: www.linkedin.com/in/andrew-clark-b326b767
THANK YOU!
Please Remember To Fill Out Your
Session Evaluation Forms!



Editor's Notes

  • #3: https://www.ua.edu/news/2003/08/founding-fathers-were-among-first-auditors/ http://historyofmassachusetts.org/history-of-the-massachusetts-bay-colony/ http://www.accountingin.com/accounting-historians-journal/volume-10-number-1/a-historical-perspective-on-the-auditors-role-the-early-experience-of-the-american-railroads/ https://partners-network.com/2012/09/20/accounting-history/
  • #5: Basically, statistics on steroids. I recently read an article where the author referred to machine learning as “statistics on a mac”. Well, that isn’t completely accurate, but the basics behind machine learning are not as “revolutionary” as one may think; they are the culmination of a “perfect storm” of applied statistics, ingenious mathematics, Moore’s law, distributed computing, cheap data storage, and the rise of the Silicon Valley firm. AI, of which machine learning is a subset, will not, as Elon Musk famously postulates, pose an existential threat to human existence, and will not replace the need for human workers. Machines cannot generalize learned processes to completely new areas, as humans can (cite), cannot reason, and no matter what anyone, IBM, harrumph, might tell you, machines will never actually “think”, have a conscience, empathy, curiosity, invention, or any of the truly human traits. This, in fact, means that AI will make human employees more important, not less. Certain jobs that do not require more than a very narrow range of movement or thought (think factory line jobs, possibly driving jobs (the jury is still out on this one)) will be automated, but this will provide more and more opportunities for human jobs, ones that require empathy, compassion, relationships, etc. Additionally, the need for skilled tech workers will increase as well. There is work going on to automate repetitive aspects of programming, but this only allows more time for creativity and innovation.
  • #8: False: http://www.snopes.com/facebook-ai-developed-own-language/ Facebook: http://www.telegraph.co.uk/technology/2017/08/01/facebook-shuts-robots-invent-language/ Musk: https://www.theguardian.com/technology/2017/jul/17/elon-musk-regulation-ai-combat-existential-threat-tesla-spacex-ceo ML-powered businesses disrupted Blockbuster, taxis, etc. One might argue that customer-centric businesses actually caused the disruption; however, I believe the correct lesson to take away from Blockbuster and traditional taxi companies is that they were disrupted by “companies that saw a way to use new technology to cater better to customers’ needs and wants”. It is both, not an either-or scenario. Techies prefer the first definition (after all, the tool is always the answer. Go to any computer science or data science program in the country, or better yet, any meetup or forum, and you will find almost exclusively discussions about the tool, not the process or how to actually use the tool in the real world), while business consultants like to say it is all about the “customer experience”. Many times, “new, shiny objects” are not ready for game time. For example, data science programs focus almost exclusively on modeling, giving students standard, pristine datasets. Even when they claim a dataset is “real world”, they just slightly jumble a real dataset. The real world doesn’t have a standard definition for “y”, or the outcome, or of what is right or wrong, and the data almost always includes serious problems. I would say the majority of the time working in data science is spent dealing with datasets, be it text, web, or relational, where nobody has a clue why the data is there, what happened during the last implementation that was botched and created bad data in the system, etc. The real “data science” is not about the fanciest new algorithm, but business concerns, wrangling data, feature engineering, culture changes, model deployment, and a bit of modeling dropped in.
  • #15: It starts and ends right here. As data scientists and machine learning experts, we are excited and love talking about the tools and algorithmic implementations. This, however, means nothing outside of an academic setting; in the “real world” it is all for naught if it cannot be applied to optimizing and solving business problems.
  • #21: https://github.com/aclarkData/RaspberryPieExperiments Super scientific setup
  • #25: Shift back the actual status of the weather by thirty minutes to allow the model to be trained to predict what the weather
  • #26: Shift back the actual status of the weather by thirty minutes to allow the model to be trained to predict what the weather
  • #28: Scale the data and fit a basic Logistic Regression Model http://www.chioka.in/differences-between-l1-and-l2-as-loss-function-and-regularization/
  • #29: Have an explanation of the difference between AUC and accuracy
  • #33: Make a public version of the repo and link here