D7 MarkPlus - Machine Learning Algorithm.pdf

Machine Learning
Algorithm
Novan Parmonangan Simanjuntak
Head of Machine Learning and AI Strategy
novanps
novan.p.simanjuntak@glair.ai
glair glair.ai hi@glair.ai
Contact Us

OUTLINE
01 Intro to Machine Learning
02 Machine Learning Workflow

01 Intro to Machine Learning

AI, ML & DL
● ARTIFICIAL INTELLIGENCE (AI): A program that can sense, reason, act, and adapt
● MACHINE LEARNING (ML): Algorithms that learn from data
● DEEP LEARNING (DL): Artificial neural networks (inspired by the brain) that learn from data
Source: towardsdatascience.com

Neural Network
[Figure: neural network diagram. Source: towardsdatascience.com]

Why ML: Example
Output/Label:
Do I want to go outside on the weekend, given the data? (Answer: Yes or No)
Input:
● Weather: Rainy, Sunny
● Distance to destination (km)
● Invited by friends: Yes, No
Rules:
1. Initially, score = 0
2. If the weather is rainy, then score = score + 10; else if the weather is sunny, then score = score - 10
3. score = score + distance
4. If invited by friends, then score = score - 10; else if not invited by friends, then score = score + 10
5. If score < 10 then Yes, else No
Data:
No | Weather | Distance (km) | Invited by Friends | Score | Go on Weekend? (Answer)
1 | Rainy | 8 | Yes | 10 + 8 - 10 = 8 | Yes
2 | Sunny | 20 | No | -10 + 20 + 10 = 20 | No
3 | Rainy | 5 | No | 10 + 5 + 10 = 25 | No
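
The hand-written rules above can be sketched as ordinary code. This is a minimal illustration of the traditional-programming approach; the function names `weekend_score` and `go_on_weekend` are made up for this sketch and do not come from the slides.

```python
def weekend_score(weather: str, distance_km: float, invited: bool) -> float:
    """Apply rules 1-4 from the slide and return the score."""
    score = 0.0                      # Rule 1: start at zero
    if weather == "Rainy":           # Rule 2: weather adjustment
        score += 10
    elif weather == "Sunny":
        score -= 10
    score += distance_km             # Rule 3: add the distance
    score += -10 if invited else 10  # Rule 4: invitation adjustment
    return score


def go_on_weekend(weather: str, distance_km: float, invited: bool) -> str:
    """Rule 5: answer Yes when the score is below 10, otherwise No."""
    return "Yes" if weekend_score(weather, distance_km, invited) < 10 else "No"


# The three rows from the data table.
rows = [("Rainy", 8, True), ("Sunny", 20, False), ("Rainy", 5, False)]
for weather, distance_km, invited in rows:
    print(weather, distance_km, invited,
          weekend_score(weather, distance_km, invited),
          go_on_weekend(weather, distance_km, invited))
```

Every threshold and adjustment here was chosen by a person, which is the point of the example: the rules are known in advance.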

Why ML: Example
Let’s model it using AI (Model: Artificial Neural Network)

Why ML: Example
Let's model it using an Artificial Neural Network
[Figure: neural network diagram with a ">= 10" threshold on the score]
Example:
● Input: Weather = Rainy, Distance = 8, Invited by Friends = Yes
● Score = 1 * 10 + 0 * -10 + 8 * 1 + 1 * -10 + 0 * 10 = 8
● Score < 10, so the Output is Yes
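
The score calculation above is a weighted sum, which is what a single artificial neuron computes. A minimal sketch follows, assuming the weights implied by the example (10, -10, 1, -10, 10) and a threshold of 10; the input names such as `rainy` and `invited_yes` are hypothetical labels chosen for this sketch.

```python
# Weights taken from the worked example; the key names are illustrative only.
WEIGHTS = {
    "rainy": 10,
    "sunny": -10,
    "distance_km": 1,
    "invited_yes": -10,
    "invited_no": 10,
}
THRESHOLD = 10


def encode(weather: str, distance_km: float, invited: bool) -> dict:
    """Turn the raw input into numeric features (one-hot for the categories)."""
    return {
        "rainy": 1 if weather == "Rainy" else 0,
        "sunny": 1 if weather == "Sunny" else 0,
        "distance_km": distance_km,
        "invited_yes": 1 if invited else 0,
        "invited_no": 0 if invited else 1,
    }


def neuron_score(inputs: dict) -> float:
    """Weighted sum of the inputs: sum(weight_i * input_i)."""
    return sum(WEIGHTS[name] * value for name, value in inputs.items())


def predict(weather: str, distance_km: float, invited: bool) -> str:
    """Answer Yes when the weighted sum stays below the threshold."""
    score = neuron_score(encode(weather, distance_km, invited))
    return "Yes" if score < THRESHOLD else "No"


print(neuron_score(encode("Rainy", 8, True)), predict("Rainy", 8, True))
```

Here the weights are still set by hand. In machine learning, these are the numbers that training would find from data.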

Why ML?: A New Paradigm
● Traditional Programming: Input + Rules → Output
● Machine Learning: Input + Output → Rules

Why ML?
No | Weather | Distance (km) | Invited by Friends | Go on Weekend? (Answer)
1 | Rainy | 8 | Yes | Yes
2 | Sunny | 20 | No | No
3 | Rainy | 5 | No | No
ML: Find the Rules

Optimization: Finding Rules
● Determine a loss function
● Update the rules/weights to minimize the loss
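
One common way to carry out these two steps is gradient descent on a mean squared error loss. The slides do not name a specific optimizer, so this is an assumed illustration: a single weight `w` in the model y = w * x is learned from made-up data where the true rule is y = 2x.

```python
def mse_loss(w, xs, ys):
    """Loss function: mean squared error of the predictions w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_gradient(w, xs, ys):
    """Derivative of the loss with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def fit_weight(xs, ys, learning_rate=0.01, steps=500):
    """Repeatedly move w a small step against the gradient to reduce the loss."""
    w = 0.0
    for _ in range(steps):
        w -= learning_rate * mse_gradient(w, xs, ys)
    return w


# Hypothetical data generated by the rule y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = fit_weight(xs, ys)
print(round(w, 3), round(mse_loss(w, xs, ys), 6))
```

The learning rate and step count are arbitrary choices for this toy data; real problems usually need them tuned.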

Why ML?
If you know the rules, do not use ML.
If you do not, use ML.

AI Approach
Learning Type
● Supervised Learning
● Unsupervised Learning
● Reinforcement Learning

Supervised Learning
● Input and output are given
● Find a rule/function f that maps a set of points X (input/predictor) to a set of labels Y (output), based on the given data (x_i, y_i).
● Categorized into:
○ Classification (categorical labels)
■ Credit Scoring
■ Fraud Detection
■ Recommendation System
■ Object Recognition
■ Spam Filtering
○ Regression (numerical labels)
■ Home Price Prediction
■ Stock Market Prediction
■ Demand Forecasting
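
As a small illustration of learning f from labeled pairs (x_i, y_i), here is a sketch of a 1-nearest-neighbor classifier. It is only one of many supervised methods and is not named in the slides; the two numeric features and the "ham"/"spam" labels are hypothetical.

```python
import math


def predict_1nn(train_x, train_y, query):
    """Return the label of the training point closest to `query`."""
    nearest = min(range(len(train_x)),
                  key=lambda i: math.dist(train_x[i], query))
    return train_y[nearest]


# Hypothetical labeled data: each x_i is a pair of numeric features.
train_x = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 7.5)]
train_y = ["ham", "ham", "spam", "spam"]

print(predict_1nn(train_x, train_y, (2.0, 1.0)))
print(predict_1nn(train_x, train_y, (7.0, 9.0)))
```

Because the labels are categories, this is classification. Replacing the labels with numbers and averaging the nearest neighbors would turn the same idea into regression.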

Unsupervised Learning
● Only Input is given
● Find hidden patterns or underlying
structure in the given data
Example: Document Clustering (source: search.carrotsearch.com/pertamina)
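
Clustering can be sketched with k-means, a widely used unsupervised method. The slides do not say which algorithm the document clustering demo uses, so this is an assumed stand-in on made-up 2-D points; note that no labels are supplied.

```python
import math


def kmeans(points, k, iterations=10):
    """Plain k-means. Initial centroids are the first k points (a simplification)."""
    centroids = list(points[:k])
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its members.
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
              for p in points]
    return centroids, labels


# Hypothetical unlabeled data with two visible groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.2), (0.9, 1.1), (9.1, 8.9)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

The algorithm recovers the two groups from the positions alone, which is the "hidden structure" the slide refers to.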

Reinforcement Learning in Action

02 Machine Learning Workflow

Workflow: Business Problem → ML Problem Framing → Data Ingestion

Data Ingestion (step 01)
● Moving data from varied sources into storage where an organization can access, use, and analyze it.
● The destination is typically a data warehouse, data mart, or database.

Workflow: Business Problem → ML Problem Framing → Data Ingestion → Data Preparation

Data Preparation (step 02)
● Make the data easier for ML algorithms to interpret.
● Categories:
○ Data cleaning:
■ Garbage In, Garbage Out
■ Identify incomplete, incorrect, inaccurate, or irrelevant parts of the data, then replace, modify, or delete the dirty data
○ Data encoding, normalization, resampling
○ Data splitting (training, validation, test)
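
Three of the preparation steps above (encoding, normalization, splitting) can be sketched with the standard library alone. The helper names and the 70/15/15 split ratio are assumptions for illustration, not values from the slides.

```python
import random


def min_max_normalize(values):
    """Rescale numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode categories as 0/1 vectors, one column per sorted category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def split(rows, train=0.7, validation=0.15, seed=0):
    """Shuffle, then cut into training, validation, and test sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = round(len(rows) * train)
    n_val = round(len(rows) * validation)
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]


print(min_max_normalize([5, 8, 20]))
print(one_hot(["Rainy", "Sunny", "Rainy"]))
train_rows, val_rows, test_rows = split(range(20))
print(len(train_rows), len(val_rows), len(test_rows))
```

Fixing the seed keeps the split reproducible, so later evaluation results can be compared across runs.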

Workflow: Business Problem → ML Problem Framing → Data Ingestion → Data Preparation → EDA

Exploratory Data Analysis (EDA) (step 03)
● Analyze data to summarize its main characteristics, often using statistical techniques or data visualization.
● Objectives:
○ Suggest hypotheses
○ Assess assumptions
○ Support the selection of appropriate tools, techniques, and features

Workflow: Business Problem → ML Problem Framing → Data Ingestion → Data Preparation → EDA → Feature Engineering

Feature Engineering (step 04)
● Feature engineering uses domain knowledge to extract new features from raw data
● Objectives:
○ Improve the performance of machine learning models.
● Examples:
○ Grouping operations over a window (e.g., average)
○ Binning
○ Log transformation
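
The three example techniques can be sketched as small functions. The bin edges and labels ("near", "medium", "far") are hypothetical, and `log1p` (log of 1 + x) is used here as one common variant of the log transformation because it handles zero.

```python
import math


def rolling_mean(values, window):
    """Average over a sliding window; yields len(values) - window + 1 numbers."""
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]


def bin_value(value, edges, labels):
    """Map a number to a bin label; `edges` ascending, one more label than edges."""
    for edge, label in zip(edges, labels):
        if value < edge:
            return label
    return labels[-1]


def log_transform(values):
    """Compress large values with log(1 + x)."""
    return [math.log1p(v) for v in values]


print(rolling_mean([1, 2, 3, 4, 5], window=3))
print(bin_value(8, edges=[5, 15], labels=["near", "medium", "far"]))
print(log_transform([0, 9, 99]))
```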

Workflow: Business Problem → ML Problem Framing → Data Ingestion → Data Preparation → EDA → Feature Engineering → Model Training and Tuning

Model Training and Tuning (step 05)
● Choose which model to try
● A simpler model is better
● Iteratively move to a more complex model and more features if needed
● Use a baseline model
● Maximize model performance using hyperparameter tuning
● Consider constraints:
○ cost, explainability, and speed

Workflow: Business Problem → ML Problem Framing → Data Ingestion → Data Preparation → EDA → Feature Engineering → Model Training and Tuning → Model Evaluation

Model Evaluation (step 06)
● Evaluate the model with the test data
● Pick suitable metrics for the problem.
○ There must be both a business metric and an ML metric for the problem
● If the business metric is achieved, continue with model deployment; if not, iterate on model creation with data augmentation (adding more data) or feature augmentation (adding other features)
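
For a Yes/No classifier, typical ML metrics are accuracy, precision, and recall. The slides do not name specific metrics, so these are assumed examples, computed here on hypothetical test labels and predictions.

```python
def confusion_counts(y_true, y_pred, positive="Yes"):
    """Count true/false positives and negatives for one positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn


def evaluate(y_true, y_pred, positive="Yes"):
    """Return accuracy, precision, and recall (0.0 when a ratio is undefined)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical test-set labels and model predictions.
y_true = ["Yes", "No", "No", "Yes"]
y_pred = ["Yes", "No", "Yes", "No"]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

Which of these matters most depends on the business metric: for example, a fraud detector may care more about recall than accuracy.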

Workflow: Business Problem → ML Problem Framing → Data Ingestion → Data Preparation → EDA → Feature Engineering → Model Training and Tuning → Model Evaluation → Are Business Goals Met?
● Yes: Model Deployment
● No: iterate again with Data Augmentation or Feature Augmentation

Model Deployment (step 07)
● Model Serving/Inference
● Considerations:
○ How to wrap the prediction code as a production-ready service?
○ Which API/protocol to use?
○ Scalability, throughput, latency
○ Deployments
■ Model versioning
■ Choose an appropriate deployment strategy (e.g., A/B testing)

Workflow: Business Problem → ML Problem Framing → Data Ingestion → Data Preparation → EDA → Feature Engineering → Model Training and Tuning → Model Evaluation → Are Business Goals Met?
● Yes: Model Deployment → Predictions → Model Monitoring
● No: iterate again with Data Augmentation

Model Monitoring (step 08)
● Maintain model performance
● Things to be monitored:
○ Service health
○ Data quality & integrity
○ Data & target drift
○ Bias/fairness
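
Data drift can be checked in many ways. One simple heuristic, assumed here purely for illustration, compares the mean of a feature in recent production data with its mean in the training data, measured in training standard deviations. The threshold of 2.0 and the sample values are hypothetical.

```python
import statistics


def mean_shift(reference, current):
    """Absolute difference in means, in units of the reference standard deviation."""
    ref_std = statistics.stdev(reference)
    if ref_std == 0:
        raise ValueError("reference data has zero variance")
    return abs(statistics.mean(current) - statistics.mean(reference)) / ref_std


def has_drifted(reference, current, threshold=2.0):
    """Flag drift when the mean has moved more than `threshold` deviations."""
    return mean_shift(reference, current) > threshold


# Hypothetical 'distance' feature: training data vs. two production samples.
training = [8, 10, 12, 9, 11]
production_ok = [9, 10, 11]
production_shifted = [20, 22, 19, 21]

print(has_drifted(training, production_ok))
print(has_drifted(training, production_shifted))
```

A check like this only covers one feature and one statistic; production monitoring usually tracks distributions, data quality, and service health together.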

Thank You!
