Model governance in the age of data science & AI

Model Governance
in the age of Data Science and AI
2018 Copyright QuantUniversity LLC.
Presented By:
Sri Krishnamurthy, CFA, CAP
sri@quantuniversity.com
www.quantuniversity.com
10/23/2018
PRMIA Seminar
Suffolk University
Boston

2
About us:
• Data Science, Quant Finance and
Model Governance Advisory
• Technologies using MATLAB, Python
and R
• Programs
▫ Analytics Certificate Program
▫ Fintech programs
• Platform

3
AlphaPilot Drone AI Challenge
• Your challenge is to design an artificial intelligence and machine
learning (AI/ML) framework capable of flying a drone through
several professional drone racing courses without human
intervention or navigational pre-programming.

4
Market impact at the speed of light!

6
And sentiment drives markets

7
Agenda
• Model Risk and Model Governance
• AI/ML and the opportunity
• Model Governance challenges and opportunities

10
Elements of Model Risk Management

13
Model Verification vs Validation
Model Verification is defined as:
“The process of determining that a model or simulation implementation and its
associated data accurately represent the developer’s conceptual description and
specifications”.
Model Validation is defined as:
“The process of determining the degree to which a model or simulation and its
associated data are an accurate representation of the real world from the
perspective of the intended uses of the model”.
Ref: DoD Modeling and Simulation (M&S) Verification, Validation, and
Accreditation (VV&A), DoD Instruction 5000.61, December 9, 2009.
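
The two definitions can be contrasted in a minimal sketch. The toy compounding model, the hand-computed cases, the "observed" outcomes and the 1% tolerance are all hypothetical, chosen only to show that a model can pass verification and still fail validation.

```python
import math

def future_value(principal, rate, years):
    """Toy model under test: annual compounding (hypothetical example)."""
    return principal * (1.0 + rate) ** years

def verify(model):
    """Verification: does the implementation match its specification?
    The spec is FV = P * (1 + r)^n, so we check hand-computed cases."""
    cases = [
        ((100.0, 0.10, 1), 110.0),
        ((100.0, 0.10, 2), 121.0),
        ((50.0, 0.0, 5), 50.0),
    ]
    return all(math.isclose(model(*args), expected, rel_tol=1e-9)
               for args, expected in cases)

def validate(model, observations, tolerance):
    """Validation: does the model represent the real world well enough
    for its intended use? Compare predictions with observed outcomes."""
    errors = [abs(model(p, r, n) - seen) / seen
              for (p, r, n, seen) in observations]
    worst = max(errors)
    return worst <= tolerance, worst

# Made-up observed outcomes (principal, rate, years, observed value).
observed = [
    (100.0, 0.05, 1, 104.8),
    (100.0, 0.05, 2, 110.9),
    (100.0, 0.05, 3, 114.0),
]

verified = verify(future_value)
validated, worst_error = validate(future_value, observed, tolerance=0.01)
```

Here `verified` is true because the code matches its specification, while `validated` is false because the third observation differs from the prediction by more than the assumed tolerance.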

14
The new reality!
The drivers in the markets are changing!

15
Machine Learning & AI in finance – A paradigm shift
Quant:
• Stochastic Models
• Factor Models
• Optimization
• Risk Factors
• P/Q Quants
• Derivative pricing
• Trading Strategies
• Simulations
• Distribution fitting
Data Scientist:
• Real-time analytics
• Predictive analytics
• Machine Learning
• RPA
• NLP
• Deep Learning
• Computer Vision
• Graph Analytics
• Chatbots
• Sentiment Analysis
• Alternative Data

16
The Virtuous Circle of Machine Learning and AI
• Smart Algorithms
• Hardware
• Data

17
The Rise of Big Data and Data Science
Image Source: http://www.ibmbigdatahub.com/sites/default/files/infographic_file/4-Vs-of-big-data.jpg

18
Smarter Algorithms
Parallel and Distributed Computing Frameworks
Deep Learning Frameworks
1. Our labeled datasets were thousands of times too
small.
2. Our computers were millions of times too slow.
3. We initialized the weights in a stupid way.
4. We used the wrong type of non-linearity.
- Geoff Hinton
“Capital One was able to determine fraudulent credit
card applications in 100 milliseconds”*
* http://go.databricks.com/hubfs/pdfs/Databricks-for-FinTech-170306.pdf

20
The Machine Learning Process
• Data cleansing
• Feature Engineering
• Training and Testing
• Model building
• Model selection
• Model Deployment
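
The stages above can be sketched end to end in a few lines. This is an illustrative toy, not a production workflow: the synthetic data, the two candidate models and all function names are assumptions, and feature engineering is skipped because the single raw feature is used as-is.

```python
import random

def cleanse(rows):
    """Data cleansing: drop records with missing values."""
    return [(x, y) for x, y in rows if x is not None and y is not None]

def split(rows, test_fraction, seed):
    """Training and testing: shuffle with a fixed seed, hold out a test set."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]

def fit_mean(train):
    """Candidate 1: always predict the training mean."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_line(train):
    """Candidate 2: ordinary least squares on a single feature."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    sxx = sum((x - mx) ** 2 for x, _ in train)
    sxy = sum((x - mx) * (y - my) for x, y in train)
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)

def mse(model, rows):
    """Mean squared error of a model on a set of (x, y) rows."""
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)

# Synthetic data: y = 2x + 1 plus noise, with two incomplete records.
rng = random.Random(0)
raw = [(float(i), 2.0 * i + 1.0 + rng.gauss(0, 1)) for i in range(40)]
raw += [(None, 3.0), (5.0, None)]

clean = cleanse(raw)
train, test = split(clean, test_fraction=0.25, seed=42)
candidates = {"mean": fit_mean(train), "line": fit_line(train)}   # model building
scores = {name: mse(model, test) for name, model in candidates.items()}
selected = min(scores, key=scores.get)   # model selection on held-out error
deployed_model = candidates[selected]    # "deployment" is just a handle here
```

Selecting on held-out error rather than training error is the design choice that matters; in practice a separate validation set would be kept so the test set is not reused for selection.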

21
NLP pipeline
Stage 1: Data Ingestion from Edgar
Stage 2: Pre-Processing
Stage 3: Invoking APIs to label data
Stage 4: Compare APIs
Stage 5: Build a new model for sentiment Analysis
External REST APIs
• Amazon Comprehend API
• Google API
• Watson API
• Azure API
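
The "Compare APIs" stage can be sketched as an agreement check across labelers. The labels below are hard-coded and hypothetical; no real API is called, and the service names are placeholders rather than the vendors listed above.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sentiment labels returned by four external services for the
# same five sentences. Real API calls are deliberately omitted.
labels = {
    "service_a": ["pos", "neg", "neu", "pos", "neg"],
    "service_b": ["pos", "neg", "pos", "pos", "neg"],
    "service_c": ["pos", "neu", "neu", "pos", "neg"],
    "service_d": ["neg", "neg", "neu", "pos", "neg"],
}

def agreement(a, b):
    """Fraction of sentences on which two labelers agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Pairwise agreement between every pair of services.
pairwise = {
    (m, n): agreement(labels[m], labels[n])
    for m, n in combinations(sorted(labels), 2)
}

def majority_vote(votes):
    """Return the most common label, or None on a tie for first place."""
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]

# One consensus label per sentence, usable as a training label downstream.
consensus = [majority_vote(votes) for votes in zip(*labels.values())]
```

Sentences where the services disagree, or where the vote ties, are natural candidates for manual review before the labels are used to train a new model.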

23
Processes are chaotic
Planning
Reality

24
The reproducibility challenge
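
One common way to approach reproducibility is to record everything a run depends on. The sketch below is an assumption about what such a record could contain (a data fingerprint, the random seed and the parameters); the function names and the toy "experiment" are hypothetical.

```python
import hashlib
import json
import random

def fingerprint(rows):
    """Stable SHA-256 hash of the input data, so a run can name its inputs."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def run_experiment(rows, seed, sample_size):
    """Toy experiment: mean of a seeded random sample, plus a run manifest."""
    rng = random.Random(seed)          # local generator, no global state
    sample = rng.sample(rows, sample_size)
    return {
        "data_sha256": fingerprint(rows),
        "seed": seed,
        "sample_size": sample_size,
        "result": sum(sample) / sample_size,
    }

data = list(range(100))
first = run_experiment(data, seed=123, sample_size=10)
second = run_experiment(data, seed=123, sample_size=10)     # identical rerun
changed = run_experiment(data + [100], seed=123, sample_size=10)
```

Two runs with the same data, seed and parameters produce the same manifest, while a change in the data shows up as a different fingerprint.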

25
Data Engineering vs Data Science
Engineering/IT
• Scaling
• Structuring
• Design of Experiments
• Data Parallel/Task Parallel
Quants/Data Scientists
• New Algorithms
• Try new methods
• Effect of Parameters and
Hyper Parameters

27
1. Machine learning is not a generic solution to all problems
Claim:
• Machine learning is good for fraud
detection, looking for arbitrage
opportunities and trade execution
Caution:
• Beware of imbalanced class problems
• A model that gives 99% accuracy may still
not be good enough
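
The 99% accuracy caution can be made concrete with a small sketch. The data are made up: 1,000 transactions of which 10 are fraudulent, a trivial model that never flags fraud, and a hypothetical detector that catches 8 frauds at the cost of 30 false alarms.

```python
def accuracy(y_true, y_pred):
    """Share of all predictions that are correct."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred):
    """Share of actual positives (frauds) that were caught."""
    positives = sum(y_true)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return hits / positives if positives else 0.0

def precision(y_true, y_pred):
    """Share of flagged cases that were actual positives."""
    flagged = sum(y_pred)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return hits / flagged if flagged else 0.0

# 1 = fraud, 0 = legitimate; 1% of the cases are fraud.
y_true = [1] * 10 + [0] * 990

# Trivial model: predicts "legitimate" for everything.
always_legit = [0] * 1000

# Hypothetical detector: catches 8 of 10 frauds, raises 30 false alarms.
detector = [1] * 8 + [0] * 2 + [1] * 30 + [0] * 960

summary = {
    name: (accuracy(y_true, pred), recall(y_true, pred), precision(y_true, pred))
    for name, pred in {"always_legit": always_legit, "detector": detector}.items()
}
```

The trivial model reaches 99% accuracy with zero recall, while the detector has lower accuracy but actually finds most of the fraud, which is why accuracy alone is a poor guide on imbalanced classes.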

28
2. A prototype model is not your production model
Claim:
• Our models work on all the
datasets we have tested on
Caution:
• Do we have enough data?
• How do we handle bias in
datasets?
• Beware of overfitting
• Historical Analysis is not
Prediction
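
The overfitting caution can be illustrated by comparing in-sample and out-of-sample error. This is a sketch on synthetic data: the data-generating process, the seed and both models (a nearest-neighbor memorizer and a least-squares line) are assumptions made for illustration.

```python
import random

rng = random.Random(7)

def sample(n):
    """Synthetic data: y = 0.5 * x + Gaussian noise (made up for illustration)."""
    return [(x, 0.5 * x + rng.gauss(0, 1))
            for x in (rng.uniform(0, 10) for _ in range(n))]

def fit_memorizer(train):
    """Overfit model: return the y of the nearest training point (1-NN)."""
    def predict(x):
        return min(train, key=lambda row: abs(row[0] - x))[1]
    return predict

def fit_line(train):
    """Simple model: ordinary least squares on a single feature."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    sxx = sum((x - mx) ** 2 for x, _ in train)
    sxy = sum((x - mx) * (y - my) for x, y in train)
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)

def mse(model, rows):
    """Mean squared error of a model on a set of (x, y) rows."""
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)

train, test = sample(300), sample(300)
memorizer, line = fit_memorizer(train), fit_line(train)

# (in-sample error, out-of-sample error) for each model.
report = {
    "memorizer": (mse(memorizer, train), mse(memorizer, test)),
    "line": (mse(line, train), mse(line, test)),
}
```

The memorizer is perfect on the data it has seen and worse than the simple line on new data, which is the gap between historical analysis and prediction.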

29
AI and Machine Learning in Production
https://www.itnews.com.au/news/hsbc-societe-generale-run-into-ais-production-problems-477966
Kirsty Roth from HSBC:
“It’s been somewhat easy - in a funny way - to
get going using sample data, [but] then you hit
the real problems,” Roth said.
“I think our early track record on PoCs or pilots
hides a little bit the underlying issues.”
Matt Davey from Societe Generale:
“We’ve done quite a bit of work with RPA
recently and I have to say we’ve been a bit
disillusioned with that experience,”
“the PoC is the easy bit: it’s how you get that
into production and shift the balance”

30
3. We are just getting started!
Claim:
• It works. We don’t know how!
Caution:
• Lots of heuristics; still not a proven
science
• Interpretability and auditability of
models are important
• Beware of black boxes; transparency in
the codebase is paramount with the
proliferation of open-source tools
• Skilled data scientists who are
knowledgeable about algorithms and
their appropriate usage are key to
successful adoption

31
4. Choose the right metrics for evaluation
Claim:
• Machine Learning models are
more accurate than
traditional models
Caution:
• Is accuracy the right metric?
• How do we evaluate the
model? RMSE or R²?
• How does the model behave
in different regimes?
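
The regime question can be sketched by computing RMSE and R² separately per regime and comparing them with the pooled figures. The two regimes and all numbers below are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up (actual, predicted) values for two market regimes.
regimes = {
    "calm": ([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.1, 3.9]),
    "stressed": ([10.0, -8.0, 12.0, -6.0], [4.0, -2.0, 5.0, -1.0]),
}

per_regime = {
    name: {"rmse": rmse(t, p), "r2": r_squared(t, p)}
    for name, (t, p) in regimes.items()
}

# Pooled metrics over all observations, ignoring the regime label.
all_true = [v for t, _ in regimes.values() for v in t]
all_pred = [v for _, p in regimes.values() for v in p]
pooled = {"rmse": rmse(all_true, all_pred), "r2": r_squared(all_true, all_pred)}
```

The pooled RMSE sits between the two regime values, so a single headline number understates the error in the stressed regime and overstates it in the calm one.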

32
5. Are we there yet?
Claim:
• Machine Learning and AI will replace
humans in most applications
Caution:
• Beware of the hype!
• Just because it worked sometimes
doesn’t mean that the organization can
be on autopilot
• Will we have true AI or Augmented
Intelligence?
• Model risk and robust risk
management is paramount to the
success of the organization.
• We are just getting started!
https://www.bloomberg.com/news/articles/2017-10-20/automation-starts-to-sweep-wall-street-with-tons-of-glitches

33
Summary
1. A clearly defined Model Verification and Validation framework
applicable to your organization is required.
2. Define replicability, interpretability and auditability requirements
upfront.
3. Distinguish process automation, machine learning and
autonomous decision making using AI.
4. Machine learning is not magic; hire the right talent prior to
deploying models into production.
5. Model lifecycle management shouldn’t be an afterthought.
6. Define and address risks evolving from the adoption of new
processes.

34
QuantUniversity’s Model Risk related whitepapers published in the Wilmott Magazine
Email me at sri@quantuniversity.com for a copy

35
www.QuSandbox.com
Model
Analytics
Studio
QuResearchHub
QuSandbox
Prototype, Iterate and tune Standardize workflows
Productionize and share

36
www.analyticscertificate.com/MachineLearning
Analytics for a cause scholarships for students
November 7-8, 2018

Sri Krishnamurthy, CFA, CAP
Founder and Chief Data Scientist
sri@quantuniversity.com
srikrishnamurthy
www.QuantUniversity.com
www.analyticscertificate.com
www.qusandbox.com
Information, data and drawings embodied in this presentation are strictly a property of QuantUniversity LLC. and shall not be
distributed or used in any other publication without the prior written consent of QuantUniversity LLC.
37

Sri Krishnamurthy
Founder and CEO
• Founder of QuantUniversity LLC. and
www.analyticscertificate.com
• Advisory and Consultancy for Financial Analytics
• Prior experience at MathWorks, Citigroup and
Endeca, and with 25+ financial services and energy
customers.
• Regular Columnist for the Wilmott Magazine
• Author of forthcoming book
“Financial Modeling: A case study approach”
published by Wiley
• Chartered Financial Analyst and Certified Analytics
Professional
• Teaches Analytics in the Babson College MBA
program and at Northeastern University, Boston
38
