Investing in AI-Driven
Startups
Yeshiva University
Angel and Venture Capital 101
April 9, 2018
Dr. Roy Lowrance, roy.lowrance@gmail.com, +1 347 255 2544
● CEO and Founder, Applied Data Science, LLC
○ Design and build machine learning systems
○ Advise machine learning-driven startups
● Co-founder, 7Chord
● Advisor, Blumberg Capital
● Former
○ Co-Designer and Managing Director, NYU Center for Data Science
○ Partner, The Boston Consulting Group
○ CTO, Capital One
○ CTO, Reuters
● PhD, Computer Science, NYU
● MBA, Harvard Graduate School of Business Administration
● BA Mathematics, Vanderbilt University
2
100 AI startups that have raised $11.7B
Source: www.cbinsights.com, accessed 2018-03-31.
3
Agenda
● Overview of AI, machine learning, and big data
● Life cycle of AI projects
● Sustainable competitive advantage for AI-based startups
4
Definitions
● Big Data: lots of computers
● Advanced Analytics
○ More than a spreadsheet, or
○ Requiring calculus, or
○ Requiring statistics
● Machine learning
○ A kind of advanced analytics
○ Most common use case
■ The algorithm is not obvious
■ Write and run a program that “learns” the algorithm
● Artificial Intelligence (AI)
○ A kind of advanced analytics
○ A computer performs a task that normally requires human intelligence
○ Often, the technology is advanced machine learning
5
Supervised machine learning
1. Gather training data: spreadsheet-like
○ The first row contains the names of the features and the name of the target
○ Subsequent rows contain numbers, the values of the features and target variable
○ Each row is one labeled sample
○ All entries are numbers (not text, images, or words)
2. Select model and optimization criterion
3. Train the model
4. Predict using the trained model
5. Decide using the prediction
6
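As a minimal sketch of the spreadsheet-like layout described above, the following Python reads a small table whose first row names the features and the target. The column names and values (square_feet, bedrooms, price) are invented for illustration and do not come from the talk.

```python
import csv
import io

# Hypothetical training data: the header row names two features and one target.
RAW = """square_feet,bedrooms,price
850,2,310000
1200,3,450000
1500,3,520000
"""

def load_training_data(text, target_name):
    """Split a numeric CSV table into feature rows and target values."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    target_index = header.index(target_name)
    feature_names = [name for name in header if name != target_name]
    features, targets = [], []
    for row in reader:
        values = [float(cell) for cell in row]  # every entry must be a number
        targets.append(values[target_index])
        features.append([v for i, v in enumerate(values) if i != target_index])
    return feature_names, features, targets

feature_names, features, targets = load_training_data(RAW, "price")
print(feature_names)
print(features[0], targets[0])
```

Each element of `features` paired with the matching element of `targets` is one labeled sample.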
Supervised machine learning
1. Gather training data
2. Select model and optimization criterion
○ Assume a functional form for function f such that
■ Features = (feature 1, feature 2, …, feature N)
■ Parameters = (parameter 1, parameter 2, …, parameter D)
■ Target = f(features, parameters)
○ Define a loss function to be minimized by the learning process
■ Measure how unhappy you are
■ For each data row, calculate row loss from the error f(row features, parameters) - row target, for example its square or absolute value
■ Total loss = mean of the row losses, or median, or ...
3. Train the model
4. Predict using the trained model
5. Decide using the prediction
7
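The model and loss definitions above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the speaker's code: the functional form is assumed to be linear, and the row loss is assumed to be the squared error, chosen because raw signed differences would let positive and negative errors cancel in the mean. The data are made up.

```python
def f(features, parameters):
    """Assumed functional form: intercept plus a weighted sum of the features."""
    intercept, *weights = parameters
    return intercept + sum(w * x for w, x in zip(weights, features))

def row_loss(features, target, parameters):
    """Squared error for one labeled sample (one possible choice of loss)."""
    return (f(features, parameters) - target) ** 2

def total_loss(rows, targets, parameters):
    """Mean of the row losses over the training data."""
    losses = [row_loss(x, y, parameters) for x, y in zip(rows, targets)]
    return sum(losses) / len(losses)

# Made-up data generated by target = 1 + 2 * feature, so (1, 2) fits exactly.
rows = [[0.0], [1.0], [2.0], [3.0]]
targets = [1.0, 3.0, 5.0, 7.0]

print(total_loss(rows, targets, [1.0, 2.0]))  # parameters that match the data
print(total_loss(rows, targets, [0.0, 2.0]))  # intercept off by one
```

Swapping the mean for a median, or the square for an absolute value, changes what "best parameters" means, which is why the criterion is chosen before training.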
Supervised machine learning
1. Gather training data
2. Select model and optimization criterion
3. Train: find best parameters that approximately minimize the total loss
○ Often uses calculus-based procedure
○ Can require lots of computing time
○ Can require lots of training samples
4. Predict using the trained model
5. Decide using the prediction
8
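One common calculus-based training procedure is gradient descent. The sketch below assumes a one-feature linear model and a mean-squared-error loss. The learning rate, step count, and data are illustrative choices, not values from the talk.

```python
def train(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ~ a + b * x by gradient descent on mean squared error."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [a + b * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error with respect to a and b.
        grad_a = 2 * sum(errors) / n
        grad_b = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        # Step downhill, against the gradient.
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b

# Made-up data generated by y = 1 + 2 * x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = train(xs, ys)
print(round(a, 3), round(b, 3))
```

Every step touches every training row, which is one reason training can require a lot of computing time as the data grow.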
Supervised machine learning
1. Gather training data
2. Select model and optimization criterion
3. Train the model
4. Predict using the trained model
○ Given new features
○ Calculate prediction = f(new features, best parameters)
○ Often runs very quickly
5. Decide using the prediction
9
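Prediction is a single evaluation of f, which is why it often runs quickly. The sketch below assumes the same kind of linear form for f. The parameter values stand in for the output of an earlier training run and are hypothetical.

```python
def predict(new_features, best_parameters):
    """Evaluate the assumed linear form f(new features, best parameters)."""
    intercept, *weights = best_parameters
    return intercept + sum(w * x for w, x in zip(weights, new_features))

best_parameters = [1.0, 2.0]  # hypothetical result of training
prediction = predict([10.0], best_parameters)
print(prediction)
```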
Supervised machine learning
1. Gather training data
2. Select model and optimization criterion
3. Train the model
4. Predict using the trained model
5. Decide using the prediction
○ Use ...
■ Predicted value
■ Understanding of the business
○ … to design and implement a computer program that ...
○ ...makes a business decision
For advice on machine learning, see Pedro Domingos, “A Few Useful Things to Know About Machine Learning”, 2012.
10
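To illustrate how a predicted value and an understanding of the business combine into a decision, here is a hedged sketch of a lending rule. The scenario, the function name `decide`, and all dollar amounts are assumptions made for this example.

```python
def decide(probability_of_default, profit_if_repaid, loss_if_default):
    """Approve a loan only when its expected value is positive."""
    expected_value = ((1 - probability_of_default) * profit_if_repaid
                      - probability_of_default * loss_if_default)
    decision = "approve" if expected_value > 0 else "decline"
    return decision, expected_value

# The model supplies the probability; the business supplies the economics.
print(decide(0.05, profit_if_repaid=100.0, loss_if_default=1000.0))
print(decide(0.20, profit_if_repaid=100.0, loss_if_default=1000.0))
```

The same prediction can lead to different decisions when the costs of being wrong change, so the decision rule belongs to the business, not to the model.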
What can go wrong?
1. Gather training data
a. Features not well correlated with target
b. Features are mutually dependent
c. Target not derived from features
d. Training data not representative of operational environment
e. Training data not legal or ethical to use
f. Training data encode undesired bias
2. Select model and optimization criterion
3. Train the model
4. Predict using the trained model
5. Decide using the prediction
11
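Two of the data problems listed above (features weakly correlated with the target, and features that depend on one another) can be screened with a simple correlation check. The sketch below computes the Pearson correlation by hand. The feature names and numbers are made up, and correlation only detects linear relationships.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical columns from a training table.
feature_a = [1.0, 2.0, 3.0, 4.0, 5.0]
feature_b = [2.0, 4.0, 6.0, 8.0, 10.0]   # just feature_a doubled
feature_c = [1.0, -1.0, 0.0, -1.0, 1.0]  # carries little signal
target = [1.1, 1.9, 3.2, 3.9, 5.1]

print("a vs target:", pearson(feature_a, target))  # informative feature
print("c vs target:", pearson(feature_c, target))  # weakly related to target
print("a vs b:", pearson(feature_a, feature_b))    # mutually dependent pair
```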
What can go wrong?
1. Gather training data
2. Select model and optimization criterion
a. Model cannot represent the “true” function f well enough
i. Omitted features
ii. Wrong functional form
b. Wrong optimization criterion
3. Train the model
4. Predict using the trained model
5. Decide using the prediction
12
What can go wrong?
1. Gather training data
2. Select model and optimization criterion
3. Train the model
a. Best parameters not close to best possible parameters (underfitting)
b. Best parameters encode the noise in the training data, not the systematic patterns (overfitting)
4. Predict using the trained model
5. Decide using the prediction
13
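A common way to spot both problems is to compare the loss on the training data with the loss on data held out from training. The sketch below uses three deliberately simple models on made-up data: one that ignores the feature, a straight line, and one that memorizes the training rows. The model choices and numbers are assumptions for illustration.

```python
def mse(model, xs, ys):
    """Mean squared error of a model over a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_mean(xs, ys):
    """Too simple: ignores the feature and always predicts the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Least-squares straight line, which matches how the data were made."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def fit_nearest(xs, ys):
    """Too flexible: memorizes every training point, noise included."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

# Made-up data: target = 2 * feature, with alternating noise of +1 and -1
# added to the training targets only.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [1.0, 1.0, 5.0, 5.0, 9.0, 9.0]
held_x = [0.4, 1.4, 2.4, 3.4, 4.4]
held_y = [0.8, 2.8, 4.8, 6.8, 8.8]

results = {}
for name, fit in [("mean", fit_mean), ("line", fit_line), ("nearest", fit_nearest)]:
    model = fit(train_x, train_y)
    results[name] = (mse(model, train_x, train_y), mse(model, held_x, held_y))
    print(name, "train:", results[name][0], "held-out:", results[name][1])
```

The memorizing model has zero training loss yet does worse than the line on held-out data, the signature of overfitting. The mean-only model does poorly on both, the signature of underfitting.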
What can go wrong?
1. Gather training data
2. Select model and optimization criterion
3. Train the model
4. Predict using the trained model
a. Training used data that becomes available only after the prediction must be made (data leakage)
b. Cannot cost-effectively marshal the features
5. Decide using the prediction
14
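One guard against the first problem above is to build every feature strictly from records dated before the moment of prediction. The sketch below assumes hypothetical daily records and a simple average as the feature.

```python
from datetime import date

# Hypothetical (day, value) records.
events = [
    (date(2018, 3, 1), 100.0),
    (date(2018, 3, 2), 110.0),
    (date(2018, 3, 3), 90.0),
    (date(2018, 3, 4), 500.0),  # not yet known when predicting for March 4
]

def known_before(events, prediction_date):
    """Keep only values recorded strictly before the prediction date."""
    return [value for day, value in events if day < prediction_date]

def average_feature(events, prediction_date):
    """Average of the values that were available at prediction time."""
    values = known_before(events, prediction_date)
    return sum(values) / len(values)

safe = average_feature(events, date(2018, 3, 4))  # uses March 1 to 3 only
leaky = sum(v for _, v in events) / len(events)   # also uses the March 4 value
print(safe, leaky)
```

A model trained on the leaky feature can look accurate in testing and then fail in operation, because the March 4 value does not exist when the prediction has to be made.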
What can go wrong?
1. Gather training data
2. Select model and optimization criterion
3. Train the model
4. Predict using the trained model
5. Decide using the prediction
a. Bad decisions
b. Correct decisions on average, but
i. Decisions have highly variable errors
ii. Decisions have low business value
c. Correct decisions at first, then they stop working
15
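Decisions that are correct at first and then stop working can be caught by monitoring prediction errors over time. The sketch below compares recent mean absolute error with an earlier baseline. The window sizes, the tolerance of 1.5, and the error values are assumptions chosen for illustration.

```python
def drift_alert(errors, baseline_window, recent_window, tolerance=1.5):
    """Return True when recent errors are much larger than baseline errors."""
    baseline = errors[:baseline_window]
    recent = errors[-recent_window:]
    baseline_mae = sum(abs(e) for e in baseline) / len(baseline)
    recent_mae = sum(abs(e) for e in recent) / len(recent)
    return recent_mae > tolerance * baseline_mae

# Hypothetical prediction errors, oldest first.
degrading = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 3.0, -4.0, 5.0, -4.0]
stable = [1.0, -1.0] * 5

print(drift_alert(degrading, baseline_window=6, recent_window=4))
print(drift_alert(stable, baseline_window=6, recent_window=4))
```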
Agenda
● Overview of AI, machine learning, and big data
● Life cycle of AI projects
● Sustainable competitive advantage for AI-based startups
16
Software development life cycle
1. Assess needs
2. Write specifications
3. Design, develop, test software
4. Implement system
5. Support operations
6. Evaluate performance
17
Cross-industry standard process for data mining (CRISP-DM)
1. Business understanding
2. Data understanding
3. Data preparation
4. Modeling
5. Evaluation
6. Deployment
Source: Wikipedia at “cross-industry standard process for data mining”, accessed 2018-04-01.
18
Key skill sets
● Data Engineer
○ Role: Build data sets used by data scientists
○ Education: computer science
● Data Scientist
○ Role: Use data to build predictive models
○ McKinsey says there may be 140,000 to 190,000 empty positions in the US in 2018
○ Education: emerging BA and MS computer science and data science degrees
● Translator
○ Use understanding of what is possible technically and what the business needs to
■ Specify predictions that can be made using data that can become available
■ Redesign repetitive operational processes to change decisions based in part on the
predictions
○ McKinsey says there may be 1,500,000 empty positions in the US in 2018
○ Education: emerging BA and MBA business degrees
Source for open position estimates: https://www.mckinsey.com/business-functions/digital-mckinsey/our-insights/big-data-the-next-frontier-for-innovation, accessed 2018-03-31.
19
Agenda
● Overview of AI, machine learning, and big data
● Life cycle of AI projects
● Sustainable competitive advantage for AI-based startups
20
Competitive advantages in AI/machine learning
● AI is only a part: better predictions alone are useless until they are
○ Used to inform correct-enough decisions ...
○ … that are incorporated into business processes
● Most machine learning projects can use off-the-shelf learning algorithms
○ See Python’s Scikit-learn
○ See Kevin Murphy, Machine Learning
○ Exception: companies at Internet scale that are processing data that are not easy to put into spreadsheet-like form
■ Images, videos, natural language
■ Sounds, voice, music
● More relevant data generally lead to more accurate predictions
● About 90% of the work in a machine learning project is gathering, cleaning, and aligning the data so that it contains correct, informative features
● Finding the decisions to improve is as hard as gathering the data
21
Many AI-driven startups enable superior decisions
● Find processes in customer segments with high opportunity costs from
improvable decisions
● Build unique resources and capabilities that are difficult to copy
○ Data scientists are in short supply
○ AI translators are in shorter supply
○ Relevant data not yet gathered
○ Informative features not yet fully understood
● Be early to market with
○ An improved prediction that enables an improved decision; AND
○ Improved decisions
● Develop sustainable advantages
○ Better understand economics of correct and incorrect decisions
○ Have more relevant data including data from customers
22
Becoming an AI Translator
● Representative Courses
○ YU, Sy Syms, UG
■ IDS 2030 Business Analytics and Programming
■ IDS 2160 Decision Models
○ YU, Katz School, MS
■ Visual Storytelling
■ Data Product Design
○ UC Berkeley, Data 8: Foundations of Data Science (online, free)
● Executive Ed
○ NYU, Stern School, MS Business Analytics
○ Lowrance and Provost, custom translator group courses
● Teach yourself
○ Provost and Fawcett, “Data Science for Business”
○ James, Witten, Hastie, and Tibshirani, “An Introduction to Statistical Learning”
23
EXTRAS
24
AI/machine learning repeatable process roles
1. Data Curators: collect, clean, index, store, adjust, deliver to next stage
2. Feature Analysts: transform raw data into informative signals
3. Strategists: convert informative signals into business decisions
4. Backtesters: assess how the strategy would have performed historically
5. Deployers: move training and prediction and decisions into operations
6. Product supervisors: monitor decision quality and diagnose problems
a. Simulated production: pretend that decisions were made
b. Initial deployment: test on a small fraction of possibilities
c. Rollout: insert into all processes
d. Decommission: upgrade predictions and/or decisions
Source: Marcos Lopez de Prado, Advances in Financial Machine Learning, 2018.
25
Overview
● Many startups leverage Artificial Intelligence (AI)
● In addition to the usual risks of investing in startups, these startups face risk
induced by their technology
● We seek to explain how to manage certain of those investment risks:
○ Not understanding AI well enough to assess effectively the sustainable competitive advantage of the company
○ Not able to assess effectively the extent to which the team has the right skills and knowledge
○ Not anticipating the life cycle of AI-based innovations
26
Two use cases, given a desired computational result:
● When it is clear how to determine the result: use traditional software
○ Example: Calculation of sales tax in New York City stores
○ Example: Schedule workload in a factory
○ Example: Rendering a web page in a browser
● When it is not clear how to determine the result: “learn” the program
○ Example: Determining whether a user of a computer system is who she claims to be
○ Example: Deciding what items to place in a Facebook news feed
○ Example: Finding all of the faces in an image
27
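The contrast between the two use cases can be sketched as follows. In the first function the rule is written down directly. In the second, a cut point is chosen from labeled examples. The tax rate of 8.875% is an assumed combined rate used for illustration, and the "typing rhythm score" feature and its values are hypothetical.

```python
def sales_tax(price, rate=0.08875):
    """Traditional software: the rule is known, so we code it directly."""
    return round(price * rate, 2)

def learn_threshold(values, labels):
    """Learned program: pick the cut point that misclassifies the fewest examples."""
    best_cut, best_errors = None, len(values) + 1
    for cut in sorted(set(values)):
        # Predict True when the value is at or above the cut.
        errors = sum((v >= cut) != label for v, label in zip(values, labels))
        if errors < best_errors:
            best_cut, best_errors = cut, errors
    return best_cut

# Hypothetical typing rhythm scores; the label says whether the user was genuine.
scores = [0.2, 0.3, 0.4, 0.7, 0.8, 0.9]
genuine = [False, False, False, True, True, True]

print(sales_tax(200.0))
print(learn_threshold(scores, genuine))
```

Nobody told the second function where the boundary lies; it was recovered from the labeled examples, which is the sense in which the program is "learned".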
Features and labels
● Example: Determining whether a user of a computer system is who she
claims to be
○ Features:
■ User identity
■ Time and place where logged on
■ Applications accessed
■ Keyboard and mouse usage
■ ...
○ Labels: whether the user identity was correct
● Example: Deciding what items to place in a Facebook news feed
● Example: Finding all of the faces in an image
28
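To make the user-identity example concrete, here is a sketch of how one login event might be turned into a numeric feature row and a label. All field names and values are hypothetical, and real systems would use many more signals.

```python
from dataclasses import dataclass

@dataclass
class LoginEvent:
    hour_of_day: int
    usual_location: bool
    applications_opened: int
    keystrokes_per_minute: float
    identity_was_correct: bool  # the label, known only after the fact

def to_row(event):
    """Convert one event into (numeric features, numeric label)."""
    features = [
        float(event.hour_of_day),
        1.0 if event.usual_location else 0.0,
        float(event.applications_opened),
        event.keystrokes_per_minute,
    ]
    label = 1.0 if event.identity_was_correct else 0.0
    return features, label

event = LoginEvent(hour_of_day=3, usual_location=False, applications_opened=12,
                   keystrokes_per_minute=410.0, identity_was_correct=False)
features, label = to_row(event)
print(features, label)
```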
Features and labels
● Example: Determining whether a user of a computer system is who she
claims to be
● Example: Deciding what items to place in a Facebook news feed
○ Features
■ User identity, political affiliation, what items user has liked, ...
■ Potential items: their identity, who posted the item, relationship of user to the poster, …
■ ...
○ Labels: Whether a human expert would have included that item for that user
● Example: Finding all of the faces in an image
29
Features and labels
● Example: Determining whether a user of a computer system is who she
claims to be
● Example: Deciding what items to place in a Facebook news feed
● Example: Finding all of the faces in an image
○ Features:
■ JPEGs (pixels and RGB colors in each)
■ ...
○ Labels: circles around faces, drawn by a person
30
Definition: business strategy
● Where to compete
○ Products
○ Markets
● How to compete
○ Value chain
31
Michael Porter’s generic business strategies
● Cost leadership: appeal to cost or price-sensitive customers
○ Have lowest prices in target segments; OR
○ Have lowest price-to-value ratio
○ Requires having lower costs than competitors
○ Requires having larger volumes than competitors
● Differentiation
○ Appropriate when
■ Target customer segment is not price sensitive
■ Market is competitive or saturated
■ Customers have needs that are underserved
■ Firm has unique resources and capabilities that are difficult to copy
● Focus
○ Target a few segments with specialized needs
○ Within these, offer either low prices or differentiation
Source: Wikipedia at “Porter’s generic strategies”, accessed 2018-03-31.
32
Software development life cycle (SDLC)
33
