How to combine Db2 on Z, IBM Db2 Analytics
Accelerator and IBM Machine Learning on z/OS for
Credit Scoring
A live Demo !
March 12th - 16th, 2018
Guillaume ARNOULD
arnould@fr.ibm.com
Db2 Update Days 2018
© 2017 IBM Corporation
IBM Confidential
Agenda
Analytics on IBM z Systems
• Machine Learning Basics
• IBM Db2 Analytics Accelerator for Machine Learning
• IBM Machine Learning for z/OS + Demo
Machine learning is everywhere,
influencing nearly everything we do…
Netflix personalized
movie recommendations
Waze personalized
driving experience
7 out of 10 financial
customers would take
recommendations from a
robot advisor
Machine Learning Basics
▪ Identifies patterns in historical data
▪ Builds/trains behavioral models from
patterns
▪ Makes recommendations
The Machine Learning Workflow: Perception
Machine Learning 101: Types of machine learning
• Classification
– Data points are labeled and are being used to predict a category
– Two-class vs multi-class
– Example:
• Fraud detection (fraud vs non-fraud)
• Spam email detection (spam vs non-spam)
• Regression
– When a value is being predicted
– Example:
• Stock prices prediction
• Clustering
– Data points are not labeled.
– Goal is to group data into clusters to better organize the data
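The three task types above differ mainly in what they return: a category, a number, or a grouping. A minimal sketch in plain Python, assuming tiny one-dimensional data; the function names and the simple algorithms (nearest class mean, least-squares line, two-means) are illustrative choices and are not taken from the deck.

```python
from statistics import mean

# Classification: labeled points are used to predict a category.
# Here the "model" is simply the mean value of each class.
def fit_classifier(values, labels):
    classes = sorted(set(labels))
    return {c: mean(v for v, l in zip(values, labels) if l == c) for c in classes}

def classify(centroids, value):
    # Predict the class whose mean is closest to the new value.
    return min(centroids, key=lambda c: abs(centroids[c] - value))

# Regression: a numeric value is predicted (least-squares line y = a*x + b).
def fit_line(xs, ys):
    mx, my = mean(xs), mean(ys)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Clustering: no labels; split 1-D points into two groups (two-means).
# Assumes at least two distinct values and rounds >= 1.
def two_means(points, rounds=10):
    lo, hi = min(points), max(points)  # initial cluster centers
    for _ in range(rounds):
        left = [p for p in points if abs(p - lo) <= abs(p - hi)]
        right = [p for p in points if abs(p - lo) > abs(p - hi)]
        lo, hi = mean(left), mean(right)
    return left, right

centroids = fit_classifier([12.0, 15.0, 900.0, 950.0],
                           ["non-fraud", "non-fraud", "fraud", "fraud"])
print(classify(centroids, 800.0))                   # a category
print(fit_line([1.0, 2.0, 3.0, 4.0], [10.0, 12.0, 14.0, 16.0]))  # slope and intercept
print(two_means([1.0, 2.0, 3.0, 10.0, 11.0, 12.0]))  # two groups
```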
Machine Learning 101: Supervised Learning
• A feature is a piece of information that might be useful for prediction
– Example, predict the probability of a customer buying a product
• Labeled data is the desired output data
– Example, 1.0 representing a customer has bought a product; 0.0 representing NOT
GENDER   AGE      MARITAL_STATUS  PROFESSION   CUSTOMER_ID    LABEL
F        24       Married         Retail       4003           1.0
M        43       Married         Trades       4004           1.0
F        43       Unspecified     Hospitality  4005           0.0
F        43       Unspecified     Sales        4006           1.0
M        28       Single          Trades       4007           1.0
Feature  Feature  Feature         Feature      NOT a feature  Label
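The split between features, identifier and label on this slide can be sketched in plain Python. The rows are the sample rows from the slide; the helper name `split_features_and_label` is hypothetical.

```python
# Sample rows from the slide, one dict per customer.
rows = [
    {"GENDER": "F", "AGE": 24, "MARITAL_STATUS": "Married", "PROFESSION": "Retail", "CUSTOMER_ID": 4003, "LABEL": 1.0},
    {"GENDER": "M", "AGE": 43, "MARITAL_STATUS": "Married", "PROFESSION": "Trades", "CUSTOMER_ID": 4004, "LABEL": 1.0},
    {"GENDER": "F", "AGE": 43, "MARITAL_STATUS": "Unspecified", "PROFESSION": "Hospitality", "CUSTOMER_ID": 4005, "LABEL": 0.0},
    {"GENDER": "F", "AGE": 43, "MARITAL_STATUS": "Unspecified", "PROFESSION": "Sales", "CUSTOMER_ID": 4006, "LABEL": 1.0},
    {"GENDER": "M", "AGE": 28, "MARITAL_STATUS": "Single", "PROFESSION": "Trades", "CUSTOMER_ID": 4007, "LABEL": 1.0},
]

# CUSTOMER_ID is deliberately left out: it identifies a row but carries
# no information that helps predict the label.
FEATURES = ["GENDER", "AGE", "MARITAL_STATUS", "PROFESSION"]

def split_features_and_label(rows, feature_names=FEATURES, label_name="LABEL"):
    """Return (X, y): one feature list per row, and the matching labels."""
    X = [[row[name] for name in feature_names] for row in rows]
    y = [row[label_name] for row in rows]
    return X, y

X, y = split_features_and_label(rows)
print(X[0], y[0])
```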
Machine Learning 101: a TrainOps (DevOps) Story
[Diagram. Dev (Data Science Experience), training a model: labeled examples → feature engineering → training → model. The model is deployed to Ops (operational system) for scoring: new data → feature engineering → scoring with the model → predicted data.]
Fraud Detection Example: a 2-Step Approach
Step 1 – Build the predictive model
Identify predictive models/patterns found in historical data
▪ Data: demographics, account activity, transactions, channel usage, service queries, renewals, …
▪ Analyses: segments, profiles, scoring models, ...
Step 2 – Execute the predictive model
Use those predictive models with variables to score transactions & identify the best possible future outcomes
▪ Scoring, then a scoring-based decision
Practical scoring approaches
▪ Off-line: Batch Scoring
▪ On-line: External scoring function
▪ On-line: Within a transaction, in real time
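The two steps can be sketched in plain Python. This is an illustration only: the "model" is the observed fraud rate per channel, and the channel names, the 0.5 threshold and the function names are assumptions, not part of the deck.

```python
from collections import defaultdict

def build_model(history):
    """Step 1: learn the observed fraud rate per channel from labeled history."""
    totals, frauds = defaultdict(int), defaultdict(int)
    for channel, is_fraud in history:
        totals[channel] += 1
        frauds[channel] += int(is_fraud)
    return {channel: frauds[channel] / totals[channel] for channel in totals}

def score(model, channel, default=0.5):
    """Step 2, on-line: score one transaction; unseen channels get a neutral default."""
    return model.get(channel, default)

def score_batch(model, channels):
    """Step 2, off-line: score a whole batch with the same model."""
    return [score(model, channel) for channel in channels]

def decide(score_value, threshold=0.5):
    """Scoring-based decision (the threshold is a hypothetical business choice)."""
    return "review" if score_value >= threshold else "accept"

history = [("web", True), ("web", True), ("web", False),
           ("branch", False), ("branch", False),
           ("atm", True), ("atm", False)]
model = build_model(history)
print(decide(score(model, "web")), decide(score(model, "branch")))
print(score_batch(model, ["web", "branch", "atm"]))
```

The same `model` object serves both the batch and the single-transaction path, which is the point of separating model building from model execution.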
The Machine Learning Workflow: Reality
Agenda
Analytics on IBM z Systems
• Machine Learning Basics
• IBM Db2 Analytics Accelerator for Machine Learning
• IBM Machine Learning for z/OS + Demo
Version 5.1 of Db2 Analytics Accelerator opens up a new dimension of Analytical
Processing by introducing In-Database Analytics and Accelerator-Only Tables.
➢ In-Database Analytics capabilities enable acceleration of predictive analytics applications. This enables
SPSS/Netezza Analytics Data Mining and In-Database Modeling to be processed within the Accelerator.
➢ Accelerator-only tables can benefit statistics and analytics tools that use temporary data for reports. The high
velocity of execution enables these tools to quickly gather all required data.
➢ Accelerator-only tables enable acceleration of Data Transformations implemented via SQL statements.
Storing interim results in accelerator-only tables enables subsequent queries or data transformations to
process all relevant data on the accelerator with high speed
Introducing Accelerator-Only Table type in Db2 for z/OS
Creation (DDL) and access remain through Db2 for z/OS in all cases
Non-accelerator Db2 table
• Data in Db2 only
Accelerator-shadow table
• Data in Db2 and the Accelerator
Accelerator-archived table / partition
• Empty read-only partition in Db2
• Partition data is in Accelerator only
Accelerator-only table (AOT)
• “Proxy table” in Db2
• Data is in Accelerator only
[Diagram: Tables 1 to 4 illustrating the four table types]
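Since creation goes through Db2 DDL, an accelerator-only table is declared with an ordinary `CREATE TABLE` statement that names the accelerator. The sketch below only builds the statement text; it assumes the `IN ACCELERATOR` clause of Db2 for z/OS `CREATE TABLE`, and the table, column and accelerator names are hypothetical. Check the Db2 SQL reference for the full syntax before use.

```python
def aot_ddl(table, columns, accelerator):
    """Build CREATE TABLE text for an accelerator-only table (AOT).

    Assumption: the IN ACCELERATOR clause marks the table as accelerator-only,
    so Db2 keeps only the catalog ("proxy") entry and the data lives on the
    accelerator.
    """
    column_list = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return f"CREATE TABLE {table} ({column_list}) IN ACCELERATOR {accelerator}"

# Hypothetical work table for one data scientist.
statement = aot_ddl(
    "JOHN.TXN_WORK",
    [("CUSTOMER_ID", "INTEGER"), ("TOTAL_AMOUNT", "DECIMAL(15, 2)")],
    "ACCEL1",
)
print(statement)
```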
Data scientist work area
Using Accelerator-only tables for ad-hoc analysis
[Diagram: customer transactions and customer data from the transaction processing systems (OLTP) are available for transactional and analytical processing. Data scientists John and Bob each have their own work database, whose work area consists of AOTs.]
ETL on a different Platform (Traditional Approach)
Database Transformation – Common Usage
[Diagram: customer data and customer transactions are copied (table data via FTP) from the transaction processing systems (OLTP) to a distributed DBMS on a Unix server, where ETL logic builds the customer transaction summary and history and the customer summary mart used for analytics.]
Disadvantages:
▪ Process-driven movement of large amounts of data
▪ Aged data for analytics/reporting, depending on the performance of the data movement and transformation process
The Secret Weapon
In-Database Transformation
Using Accelerator-only tables and ELT Logic in the Accelerator
Advantages:
• Simpler to manage
• Better performance and reduced latency
[Diagram: customer transactions and customer data from the transaction processing systems (OLTP) serve both transactional and analytical processing. ELT logic in the Accelerator builds the customer transaction summary and history AOTs and the customer summary mart AOTs used for analytics.]
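The ELT pattern keeps the transformation inside the database: an `INSERT ... SELECT` writes the summary table without moving rows to another platform. A minimal sketch using Python's built-in `sqlite3` as a stand-in for Db2 and the Accelerator; table and column names are hypothetical, and on Db2 for z/OS the target would be an accelerator-only table.

```python
import sqlite3

# In-memory SQLite database standing in for Db2 + Accelerator.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_txn (customer_id INTEGER, amount REAL);
    INSERT INTO customer_txn VALUES (4003, 20.0), (4003, 30.0), (4004, 100.0);

    -- Target of the transformation (an AOT in the real setup).
    CREATE TABLE customer_summary (customer_id INTEGER, txn_count INTEGER, total REAL);

    -- ELT step: the transformation runs as SQL where the data already is.
    INSERT INTO customer_summary
        SELECT customer_id, COUNT(*), SUM(amount)
        FROM customer_txn
        GROUP BY customer_id;
""")

summary = conn.execute(
    "SELECT customer_id, txn_count, total FROM customer_summary ORDER BY customer_id"
).fetchall()
print(summary)
```

Subsequent queries or further transformations can read `customer_summary` directly, which mirrors how interim results stored in AOTs stay on the accelerator.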
ETL using Infosphere Information Server
Balanced Optimization Using IDAA : Optimization
Run-Time Comparisons – Benefits of Running inside IDAA
In-Database Analytics – Technical basics
▪ A set of stored procedures from the IBM Netezza In-Database Analytics Package (INZA) is available on the Accelerator,
including:
▪ Decision Tree
▪ Regression Tree
▪ Naive Bayes
▪ K-means Clustering and TwoStep Clustering
▪ Stored procedures use accelerator-shadow tables or accelerator-only tables as input and create accelerator-only
tables as output
▪ Db2 for z/OS contains stored procedure wrappers to enable invocation of the stored procedures from Db2 for z/OS
analytical applications or from SPSS Modeler 17.1
▪ Actual stored procedure execution takes place on Accelerator
In-Database Analytics
Data Preparation (using AOTs) and SPSS Modeling in the Accelerator
Advantages:
• Allows fast model refreshes
• Ensures adequate scoring
• Better performance and reduced latency
• Scoring outside the accelerator with SPSS Modeler Server or Zementis
[Diagram: customer transactions and customer data from the transaction processing systems (OLTP, with embedded scoring) serve transactional and analytical processing. In the Accelerator, customer transaction data prep AOTs feed the modeling step driven by SPSS Modeler, which produces the model.]
« Behind the scenes»: Exploiting In-Database Modeling using Accelerator
« Behind the scenes»: Execution Results
« Behind the scenes»: Looking at the Accelerator using DataStudio
In-Database Analytics
Using Accelerator-Only Tables, ELT Logic and Modeling in the Accelerator
[Diagram: customer transactions and customer data from the transaction processing systems (OLTP) serve transactional and analytical processing. In the Accelerator, ELT logic builds the customer transaction summary and history AOTs and the customer summary mart AOTs; customer transaction data prep AOTs feed the modeling step driven by SPSS Modeler, which produces the model.]
Agenda
Analytics on IBM z Systems
• Machine Learning Basics
• IBM Db2 Analytics Accelerator for Machine Learning
• IBM Machine Learning for z/OS + Demo
Understand the 2 Different Steps
Decision
Management /
Rules
application
Scoring against
an existing model
5
4
data synchronisation
IBM Db2 Analytics Accelerator
TRANSACTION
Rule & Model Execution Rule & Model Creation
In Database
Transformation
2a
(accelerated)
Real Time / Predictive
Analytics / Reporting
2
3
Merging non Db2
for zOS data and
Db2 data
1
Model & Rule
Updates
zData / zApps
IBM ML Training runtimes
Leverage a wide range of platforms to meet the varying needs of
enterprise architectures
IBM ML authoring (DSX)
(Cloud, MacOSX, Windows, Linux, Linux on System z)
Cloud Distributed Power z/OS
Model Repository
Spark Anaconda Deep Learning
IBM Machine Learning
IBM ML Deployments
Leverage a wide range of platforms to meet the varying needs of
enterprise architectures
IBM ML authoring (DSX)
(Cloud, MacOSX, Windows, Linux, Linux on System z)
Cloud Distributed Power z/OS
Model Repository
IBM Machine Learning
• Auto-modeling
– Cognitive assistant for data scientists (CADS)
• Select the best algorithm with the best performance from a set of candidates
– Hyperparameter optimization (HPO)
• Select the hyperparameter with the best performance from a set of candidates given a specific algorithm
– CADS and HPO use the performance of models on small data sets to predict performance on large data sets.
They use machine learning to facilitate machine learning.
• Visualization
– Data scientists use visualization tools to help them understand data distribution. Brunel is one of the tools
commonly used by data scientists
– Brunel is designed as a layer on top of low-level visualization technology that does not require programming
to design compelling interactive visualizations.
Model Creation – Built-in Libraries
%%brunel data('churndata') bar
x(AGE) y(#count) label(NEGTWEETS)
color(CHURN_LABEL)
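The idea behind hyperparameter optimization, picking the best-performing value from a set of candidates for one given algorithm, can be sketched as an exhaustive search on a small sample. This is not how CADS or HPO are implemented; the threshold rule, the sample data and the function names are assumptions for illustration.

```python
def accuracy(threshold, samples):
    """Fraction of (value, label) pairs that the rule 'value >= threshold' gets right."""
    hits = sum((value >= threshold) == label for value, label in samples)
    return hits / len(samples)

def select_hyperparameter(candidates, samples):
    """Return the candidate threshold with the highest accuracy on the samples."""
    return max(candidates, key=lambda threshold: accuracy(threshold, samples))

# Small labeled sample: (churn score, customer actually churned).
samples = [(0.1, False), (0.3, False), (0.4, False),
           (0.6, True), (0.7, True), (0.9, True)]
candidates = [0.2, 0.5, 0.8]

best = select_hyperparameter(candidates, samples)
print(best, accuracy(best, samples))
```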
• The Jupyter Notebook
is an open-source web
application that allows
you to create and share
documents that contain
live code, equations,
visualizations and
explanatory text.
Model Creation – Integrated Jupyter Notebook
Cell for code snippet
Interactive execution
• Visual model builder
is a wizard guiding
users to create a
model step by step
• No programming
skill is required
Model Training – Visual Model Builder
• The Predictive Model Markup Language (PMML) is an XML-based predictive model interchange
format.
• Many vendors can export their models to PMML format, including SPSS, R and SAS
• IBM Machine Learning for z/OS supports scoring for PMML models that conform to the PMML standard
• Support for PMML extensions is not guaranteed
PMML Model Support
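Because PMML is plain XML, a model file can be inspected with any XML parser. A minimal sketch using Python's standard library to list the fields declared in a PMML `DataDictionary`; the document below is a hypothetical fragment (a complete PMML file also contains a model element), and the field names are borrowed from the earlier sample table.

```python
import xml.etree.ElementTree as ET

# Hypothetical PMML 4.3 fragment: header and data dictionary only.
PMML_DOC = """<PMML version="4.3" xmlns="http://www.dmg.org/PMML-4_3">
  <Header description="hypothetical credit scoring model"/>
  <DataDictionary numberOfFields="3">
    <DataField name="AGE" optype="continuous" dataType="integer"/>
    <DataField name="PROFESSION" optype="categorical" dataType="string"/>
    <DataField name="LABEL" optype="categorical" dataType="double"/>
  </DataDictionary>
</PMML>"""

# PMML elements live in a versioned XML namespace.
NS = {"pmml": "http://www.dmg.org/PMML-4_3"}

def data_fields(pmml_text):
    """Return (name, dataType) for every DataField in the DataDictionary."""
    root = ET.fromstring(pmml_text)
    fields = root.findall("pmml:DataDictionary/pmml:DataField", NS)
    return [(field.get("name"), field.get("dataType")) for field in fields]

print(data_fields(PMML_DOC))
```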
• Models are managed in a central repository in Db2 for z/OS
• Leverage the high availability of Db2 and z
Model Management – Saving Model
Db2 for
z/OS
V10 or
above
Models and
metadata
of models
ML libraries / services to
persist models
Notebook
Visual
model
builder
ML
services
PMML
Model
• Model deployment is the process of moving a model into the production environment to serve a business
need – single-click deployment
• Models are deployed as REST interfaces
• Runtime performance monitoring for scoring services
Model Deployment
CICS WAS Mobile
DFHJSON
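A model deployed as a REST interface is called by sending the input record as JSON. The sketch below only builds the HTTP request and does not send it; the URL and the payload layout are hypothetical, so the actual scoring endpoint and its request format should be taken from the product documentation.

```python
import json
import urllib.request

def build_scoring_request(url, record):
    """Build (but do not send) a JSON POST request for one record to score."""
    body = json.dumps(record).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical endpoint and input record.
request = build_scoring_request(
    "https://mlz.example.com/scoring/loan-model",
    {"GENDER": "M", "AGE": 43, "MARITAL_STATUS": "Married", "PROFESSION": "Trades"},
)
print(request.get_method(), request.full_url)
# To call a real service: urllib.request.urlopen(request), then parse the JSON reply.
```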
Continuous Performance Monitoring
Day 1 Day 2 Day 3 Day 4 Day 5 …
New inserted feedback data
Feedback
Dataset
GENDER AGE MARITAL_STATUS PROFESSION …
GENDER AGE MARITAL_STATUS PROFESSION INSERT_TIME
Evaluated feedback data
Training
Dataset
Evaluate Evaluate Evaluate Evaluate Evaluate
Users add new labeled data as feedback
data to feedback dataset
Evaluation tasks are scheduled to monitor
performance of a model with the new
inserted feedback data
Re-train?
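The evaluation loop can be sketched as follows: compute the model's accuracy on each day's feedback data and raise the re-train question when the latest value drops too low. The tuple layout, the 0.8 threshold and the function names are assumptions, not the product's evaluation logic.

```python
def daily_accuracy(feedback):
    """feedback: list of (day, predicted_label, actual_label) tuples."""
    per_day = {}
    for day, predicted, actual in feedback:
        hits, total = per_day.get(day, (0, 0))
        per_day[day] = (hits + (predicted == actual), total + 1)
    return {day: hits / total for day, (hits, total) in per_day.items()}

def needs_retraining(accuracy_by_day, threshold=0.8):
    """True when the most recent day's accuracy falls below the threshold."""
    latest_day = max(accuracy_by_day)
    return accuracy_by_day[latest_day] < threshold

# Hypothetical feedback: accuracy degrades from day 1 to day 3.
feedback = [
    (1, 1.0, 1.0), (1, 0.0, 0.0),
    (2, 1.0, 1.0), (2, 1.0, 0.0),
    (3, 0.0, 1.0), (3, 1.0, 0.0),
]
by_day = daily_accuracy(feedback)
print(by_day, needs_retraining(by_day))
```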
Highlights deployments whose performance is degrading
Continuous Performance Monitoring (cont.)
Components for a Machine Learning Implementation
Build / Train
Model Evaluate
Scoring
Service
Monitor
Success
Business
Process /
Application
Historical Data
New Data
Training / Learning
Application
Scoring
Ingest Deploy
Feedback
IBM Machine Learning for z/OS Architecture
➢ Move Machine Learning
capability to the
platform where the
most valuable data
resides
➢ Integrate real-time
predictive analytics
with transactions
➢ Leverage z/OS superior
reliability, availability
and security
The former Loan Application
• A customer is eligible for a loan according to several criteria such as the amount
of the loan, the yearly income of the borrower, and the duration of the loan.
• The decision logic is embedded in multiple loan approval applications
• The branch application running on z/OS (Cobol/CICS)
• The Internet application running on JEE server
Branch Application
(Cobol/CICS)
Decision logic
Internet Application
(JEE)
validation eligibility
validation eligibility
Batch Scoring
Application
scoring
▪ Scoring is not real time
▪ Disruption in the predictive models lifecycle management (Data scientist & IT)
▪ Rules written in software code cannot be read by business people
▪ Hard coded rules are difficult to change
▪ Rules intertwined within applications cannot be reused by other systems
The NEW Loan Application
• Objectives
• The SBSLoan application illustrates how Decision Management helps lenders make an online
decision for loan approval.
• Functional Description
• Manage loan approval through
• a set of Business Rules
• An In-Transaction, In-Database real-time Scoring
Real-time
scoring
validation
eligibility
scoring
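How business rules and a real-time score combine into one decision can be sketched as below. This is not the SBSLoan implementation: the field names, the 35% debt ratio, the 0.6 minimum score and the function names are hypothetical, and in the real application the score would come from the scoring service rather than being passed in by hand.

```python
def validate(loan):
    """Validation rule: reject obviously invalid input."""
    return loan["amount"] > 0 and loan["duration_months"] > 0 and loan["yearly_income"] > 0

def eligible(loan, max_debt_ratio=0.35):
    """Eligibility rule: yearly repayment must stay under a share of yearly income."""
    yearly_repayment = loan["amount"] / loan["duration_months"] * 12
    return yearly_repayment <= max_debt_ratio * loan["yearly_income"]

def decide(loan, score, min_score=0.6):
    """Apply validation, then eligibility, then the model score."""
    if not validate(loan):
        return "rejected: invalid data"
    if not eligible(loan):
        return "rejected: not eligible"
    return "approved" if score >= min_score else "rejected: low score"

loan = {"amount": 12000, "duration_months": 24, "yearly_income": 40000}
print(decide(loan, score=0.8))
print(decide(loan, score=0.4))
print(decide({**loan, "amount": 120000}, score=0.8))
```

Keeping the rules in separate functions reflects the point made on the previous slide: rules that are not intertwined with application code can be changed and reused independently of the model.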
The NEW Loan Application – Demonstration overall picture
z/OS
Rule Execution Server
on WAS for z/OS
zRES
on CICS
validation
eligibility
Decision Center
Repository
Business User
Database Server
Db2 z/OS
Production
Data
Historical
Data
Scoring Services
Liberty on z/OS
Scoring
REST/JSON
Machine Learning /
Spark on z/OS
IBM Machine Learning UI
Jupyter Notebook / Visual
Model Builder /
Model Management /
Model Deployment /
Monitoring
Ingestion lib
Training lib
Db2 JDBC driver
CADS/HPO lib
validation
eligibility
CICS
Cobol App
Runtime
Deployment
Development / Training
Live Demonstration
(Relatively) New IBM Redbooks publication: SG24-8314-00
• Analytics on z Systems environment
• Warehouse concepts
• Logical data warehouse
• Transformation patterns
• Accelerator-only tables
• Concepts and architecture
• Use cases enabled by accelerator-only
tables and in-database-analytics
• Multi-step reporting
• Using QMF to store query results and
importing tables
• Accelerating IBM Campaign
processing
• In-database transformations
• Accelerator and accelerator-only table
usage within DataStage
• Accelerator-only tables supporting data
scientists ad-hoc analysis
• Integration of more data sources and
archiving for analytics
• In-database analytics
http://www.redbooks.ibm.com/Redbooks.nsf/RedbookAbstracts/sg248314.html?Open
New IBM Redbooks publication to come very soon ….
IBM Machine Learning on z/OS
Proof of Concept on IBM Machine Learning for z/OS
Help with testing IBM Machine Learning in a real environment.

Questions our clients ask themselves
• How can we benefit from the IBM Machine Learning solution?
• How do we identify the best use cases to get started with?
• How do we implement IBM Machine Learning within our environment?
• How do we ingest / model our data within Machine Learning?
• How do we deploy new models to be used in our current applications?
• How do we manage the IBM Machine Learning installation?

The benefits for our clients
• See immediate benefits from IBM Machine Learning in identifying new patterns within their data.
• Get help to identify, implement, ingest, train, and model the most relevant data for their use cases.
• Experiment with the IBM Machine Learning solution in a dedicated PoC environment.
• Assistance to install the IBM Machine Learning solution in their existing production environment.
• Customer team members enabled to leverage IBM Machine Learning benefits.

Participants
• Client Data Scientists
• Client Database Administrators
• Client Technical Team
• IBM Local Team

Duration and key activities
Preparation (2 days)
▪ 1 questionnaire to be filled in by the client
▪ 1-hour conference call
▪ Review of prerequisites in the client target environment
Enablement (1 day)
• Teaching on new IBM Machine Learning for z/OS benefits and features
• Hands-on labs on IBM sandboxes
Setup (3 days to 2 weeks)
• Preparation of the IBM Machine Learning environment in the Montpellier PoC dedicated environment
• Implementation of the IBM Machine Learning environment at the customer location
PoC (1 month)
• Remote assistance during the PoC

Contacts
Guillaume Arnould
Analytics on zSystems
Certified IT Specialist
IBM Client Center Montpellier
arnould@fr.ibm.com
Thank You for attending the Db2 Update Days 2018 !!!

More Related Content

PDF
Ibm db2update2019 machine learning and db2 ai
PDF
Analytics on z Systems Focus on Real Time - Hélène Lyon
 
PDF
Native Stored Procedures with data studio
PPTX
Altair Pbs Works Overview 10 1 Kiew
PPTX
How to Leverage Mainframe Data with Hadoop: Bridging the Gap Between Big Iron...
PPTX
2018 08-13-ib ms-latest-buzz-share-final
PDF
Data Virtualization Manager for z/OS
PDF
Gse 2009 Cmolaro Final02 1
Ibm db2update2019 machine learning and db2 ai
Analytics on z Systems Focus on Real Time - Hélène Lyon
 
Native Stored Procedures with data studio
Altair Pbs Works Overview 10 1 Kiew
How to Leverage Mainframe Data with Hadoop: Bridging the Gap Between Big Iron...
2018 08-13-ib ms-latest-buzz-share-final
Data Virtualization Manager for z/OS
Gse 2009 Cmolaro Final02 1

What's hot (18)

PDF
Ibm db2 update2019 intro ending
PPS
Assembler & z/OS Internals Syllabus
PDF
The NRB Group mainframe day 2021 - Containerisation on Z - Paul Pilotto - Seb...
 
PDF
2014 01-23-eranea-apalia-private-cloud
PDF
The NRB Group mainframe day 2021 - Application Modernisation On Z - Sebastien...
 
PPTX
Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization
PDF
Productionizing Spark ML Pipelines with the Portable Format for Analytics wit...
PDF
IMS integration 2017
DOC
Klausing, Patrick Resume Consultant2
PPTX
IBM i at the heart of Cognitive Solutions
PDF
Lift Your Legacy UNIX Applications & Databases into the Cloud
PDF
Modernizing Your IMS Environment Without an Application Rewrite Series Part 2...
PDF
S200516 copy-data-management-ist2020-v2001c
PPS
Systemz Security Overview (for non-Mainframe folks)
PDF
Z13 update
PDF
IBM Storage at Fiserv Forum 2018
PPTX
IBM i at the eart of cognitive solutions
PPTX
DNUG 2015 - Going Cloud - warum und wie (IS11)
Ibm db2 update2019 intro ending
Assembler & z/OS Internals Syllabus
The NRB Group mainframe day 2021 - Containerisation on Z - Paul Pilotto - Seb...
 
2014 01-23-eranea-apalia-private-cloud
The NRB Group mainframe day 2021 - Application Modernisation On Z - Sebastien...
 
Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization
Productionizing Spark ML Pipelines with the Portable Format for Analytics wit...
IMS integration 2017
Klausing, Patrick Resume Consultant2
IBM i at the heart of Cognitive Solutions
Lift Your Legacy UNIX Applications & Databases into the Cloud
Modernizing Your IMS Environment Without an Application Rewrite Series Part 2...
S200516 copy-data-management-ist2020-v2001c
Systemz Security Overview (for non-Mainframe folks)
Z13 update
IBM Storage at Fiserv Forum 2018
IBM i at the eart of cognitive solutions
DNUG 2015 - Going Cloud - warum und wie (IS11)
Ad

Similar to How to combine Db2 on Z, IBM Db2 Analytics Accelerator and IBM Machine Learning on z/OS for Credit Scoring (20)

PPT
PDF
Nrb Mainframe Day z Data and AI - Leif Pedersen
 
PDF
Analytics on system z final
PDF
Machine Learning for z/OS
PDF
Analytics with IMS Assets - 2017
PDF
NRB - BE MAINFRAME DAY 2017 - Data spark and the data federation
 
PDF
NRB - LUXEMBOURG MAINFRAME DAY 2017 - Data Spark and the Data Federation
 
PPTX
L’architettura di Classe Enterprise di Nuova Generazione
PDF
Ingesting Data at Blazing Speed Using Apache Orc
PPT
IBMHadoopofferingTechline-Systems2015
PDF
How to Revamp your Legacy Applications For More Agility and Better Service - ...
 
PDF
Db2 for z os trends
PDF
Ibm db2 big sql
PDF
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
PDF
Dev ops for z
PDF
IBM Z for the Digital Enterprise - DevOps for Z
PDF
Enterprise analytics journey from Helene Lyon
PDF
IBM Z for the Digital Enterprise - IBM Z Open Data Analytics
PDF
DB2 Real-Time Analytics Meeting Wayne, PA 2015 - IDAA & DB2 Tools Update
PDF
Ibm db2update2019 icp4 data
Nrb Mainframe Day z Data and AI - Leif Pedersen
 
Analytics on system z final
Machine Learning for z/OS
Analytics with IMS Assets - 2017
NRB - BE MAINFRAME DAY 2017 - Data spark and the data federation
 
NRB - LUXEMBOURG MAINFRAME DAY 2017 - Data Spark and the Data Federation
 
L’architettura di Classe Enterprise di Nuova Generazione
Ingesting Data at Blazing Speed Using Apache Orc
IBMHadoopofferingTechline-Systems2015
How to Revamp your Legacy Applications For More Agility and Better Service - ...
 
Db2 for z os trends
Ibm db2 big sql
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
Dev ops for z
IBM Z for the Digital Enterprise - DevOps for Z
Enterprise analytics journey from Helene Lyon
IBM Z for the Digital Enterprise - IBM Z Open Data Analytics
DB2 Real-Time Analytics Meeting Wayne, PA 2015 - IDAA & DB2 Tools Update
Ibm db2update2019 icp4 data
Ad

Recently uploaded (20)

PPTX
Introduction to machine learning and Linear Models
PDF
annual-report-2024-2025 original latest.
PPTX
DISORDERS OF THE LIVER, GALLBLADDER AND PANCREASE (1).pptx
PPTX
ALIMENTARY AND BILIARY CONDITIONS 3-1.pptx
PDF
Business Analytics and business intelligence.pdf
PPTX
IB Computer Science - Internal Assessment.pptx
PPTX
mbdjdhjjodule 5-1 rhfhhfjtjjhafbrhfnfbbfnb
PPTX
MODULE 8 - DISASTER risk PREPAREDNESS.pptx
PPTX
AI Strategy room jwfjksfksfjsjsjsjsjfsjfsj
PDF
Mega Projects Data Mega Projects Data
PDF
Lecture1 pattern recognition............
PPTX
Market Analysis -202507- Wind-Solar+Hybrid+Street+Lights+for+the+North+Amer...
PDF
.pdf is not working space design for the following data for the following dat...
PPTX
Microsoft-Fabric-Unifying-Analytics-for-the-Modern-Enterprise Solution.pptx
PPTX
SAP 2 completion done . PRESENTATION.pptx
PPTX
1_Introduction to advance data techniques.pptx
PDF
[EN] Industrial Machine Downtime Prediction
PPTX
Introduction to Firewall Analytics - Interfirewall and Transfirewall.pptx
PPTX
STUDY DESIGN details- Lt Col Maksud (21).pptx
PDF
Introduction to Data Science and Data Analysis
Introduction to machine learning and Linear Models
annual-report-2024-2025 original latest.
DISORDERS OF THE LIVER, GALLBLADDER AND PANCREASE (1).pptx
ALIMENTARY AND BILIARY CONDITIONS 3-1.pptx
Business Analytics and business intelligence.pdf
IB Computer Science - Internal Assessment.pptx
mbdjdhjjodule 5-1 rhfhhfjtjjhafbrhfnfbbfnb
MODULE 8 - DISASTER risk PREPAREDNESS.pptx
AI Strategy room jwfjksfksfjsjsjsjsjfsjfsj
Mega Projects Data Mega Projects Data
Lecture1 pattern recognition............
Market Analysis -202507- Wind-Solar+Hybrid+Street+Lights+for+the+North+Amer...
.pdf is not working space design for the following data for the following dat...
Microsoft-Fabric-Unifying-Analytics-for-the-Modern-Enterprise Solution.pptx
SAP 2 completion done . PRESENTATION.pptx
1_Introduction to advance data techniques.pptx
[EN] Industrial Machine Downtime Prediction
Introduction to Firewall Analytics - Interfirewall and Transfirewall.pptx
STUDY DESIGN details- Lt Col Maksud (21).pptx
Introduction to Data Science and Data Analysis

How to combine Db2 on Z, IBM Db2 Analytics Accelerator and IBM Machine Learning on z/OS for Credit Scoring

  • 1. How to combine Db2 on Z, IBM Db2 Analytics Accelerator and IBM Machine Learning on z/OS for Credit Scoring A live Demo ! March 12th - 16th , 2018 Guillaume ARNOULD arnould@fr.ibm.com Db2 Update Days 2018
  • 2. © 2017 IBM Corporation IBM Confidential Agenda Analytics on IBM z Systems • Machine Learning Basics • IBM Db2 Analytics Accelerator for Machine Learning • IBM Machine Learning for z/OS + Demo
  • 3. © 2017 IBM Corporation IBM Confidential Machine learning is everywhere, influencing nearly everything we do… Netflix personalized movie recommendations Waze personalized driving experience 7 out of 10 financial customers would take recommendations from a robot advisor Machine Learning Basics ▪ Identifies patterns in historical data ▪ Builds/trains behavioral models from patterns ▪ Makes recommendations
  • 4. © 2017 IBM Corporation IBM Confidential The Machine Learning Workflow: Perception
  • 5. © 2017 IBM Corporation IBM Confidential Machine Learning 101: Types of machine learning • Classification – Data points are labeled and are being used to predict a category – Two-class vs multi-class – Example: • Fraud detection (fraud vs non-fraud) • Spam email detection (spam vs non-spam) • Regression – When a value is being predicted – Example: • Stock prices prediction • Clustering – Data points are not labeled. – Goal is to group data into clusters to better organize the data
  • 6. © 2017 IBM Corporation IBM Confidential Machine Learning 101: Supervised Learning • A feature is a piece of information that might be useful for prediction – Example, predict the probability of a customer buying a product • Labeled data is the desired output data – Example, 1.0 representing a customer has bought a product; 0.0 representing NOT GENDE R AGE MARITAL_STATUS PROFESSIO N CUSTOMER_ID LABEL F 24 Married Retail 4003 1.0 M 43 Married Trades 4004 1.0 F 43 Unspecified Hospitality 4005 0.0 F 43 Unspecified Sales 4006 1.0 M 28 Single Trades 4007 1.0 Feature Feature Feature Feature NOT a feature Label
  • 7. © 2017 IBM Corporation IBM Confidential Training a model Feature Engineering Feature Engineering Scoring Labeled examples Training Scoring New data Model Model Predicted data Deploy Data Science Experience Operational system Dev Ops Machine Learning 101 : a TrainOps (DevOps) Story
  • 8. © 2017 IBM Corporation IBM Confidential 8 Fraud Detection Example : a 2 Steps Approach Analyses Segments Profiles Scoring models ... Scoring Scoring based decisionData: Demographics Account activity Transactions Channel usage Service queries Renewals … Identify predictive models/patterns found in historical data Use those predictive models with variables to score transactions & identify the best possible future outcomes Practical scoring approaches ▪Off-line: Batch Scoring ▪On-line: External scoring function ▪On-line: Within a transaction, in real time Step 1 – Build the predictive model Step 2 – Execute the predictive model Model
  • 9. © 2017 IBM Corporation IBM Confidential The Machine Learning Workflow: Reality
  • 10. © 2017 IBM Corporation IBM Confidential Agenda Analytics on IBM z Systems • Machine Learning Basics • IBM Db2 Analytics Accelerator for Machine Learning • IBM Machine Learning for z/OS + Demo
  • 11. © 2017 IBM Corporation IBM Confidential Version 5.1 of Db2 Analytics Accelerator opens up a new dimension of Analytical Processing by introducing In-Database Analytics and Accelerator-Only Tables. ➢ In-Database Analytics capabilities enable acceleration of predictive analytics applications. This enables SPSS/Netezza Analytics Data Mining and In-Database Modeling to be processed within the Accelerator. ➢ Accelerator-only tables can benefit statistics and analytics tools that use temporary data for reports. The high velocity of execution enables these tools to quickly gather all required data. ➢ Accelerator-only tables enable acceleration of Data Transformations implemented via SQL statements. Storing interim results in accelerator-only tables enables subsequent queries or data transformations to process all relevant data on the accelerator with high speed
  • 12. © 2017 IBM Corporation IBM Confidential Introducing Accelerator-Only Table type in Db2 for z/OS Creation (DDL) and access remains through Db2 for z/OS in all cases Non-accelerator Db2 table • Data in Db2 only Accelerator-shadow table • Data in Db2 and the Accelerator Accelerator-archived table / partition • Empty read-only partition in Db2 • Partition data is in Accelerator only Accelerator-only table (AOT) • “Proxy table” in Db2 • Data is in Accelerator only Table 1 Table 4 Table 3 Table 2Table 2 Table 4 Table 3
  • 13. © 2017 IBM Corporation IBM Confidential Data scientist work area Using Accelerator-only tables for ad-hoc analysis Transaction Processing Systems (OLTP) Data for transactional and analytical processing Customer Transactions Customer Data Customer Transactions Customer Data Work database John Work Area AOTs Work database Bob Work Area AOTs Data Scientist John Data Scientist Bob
  • 14. © 2017 IBM Corporation IBM Confidential 14 ETL on a different Platform (Traditional Approach) Database Transformation – Common Usage Customer data Customer Transactions Transaction Processing Systems (OLTP) Customer Transaction Summary and History Customer data Customer Transactions Customer Summary Mart Distributed DBMS ETL logic Copy Table Data (FTP) Disadvantages: ▪ Process driven movement of large amounts of data ▪ Aged data for analytics/reporting depending on performance of data movement and transformation process Analytics Unix Server The Secret Weapon
  • 15. © 2017 IBM Corporation IBM Confidential In-Database Transformation Using Accelerator-only tables and ELT Logic in the Accelerator Transaction Processing Systems (OLTP) Analytics Advantages: • Simpler to manage • Better performance and reduced latency Data for transactional and analytical processing Customer Transactions Customer Data Customer Transaction Summary and History AOTs Customer Summary Mart AOTs Customer Transactions Customer Data ELT logic
  • 16. © 2017 IBM Corporation IBM Confidential 16 ETL using Infosphere Information Server
  • 17. © 2017 IBM Corporation IBM Confidential 17 Balanced Optimization Using IDAA : Optimization
  • 18. © 2017 IBM Corporation IBM Confidential 18 Run-Time Comparisons – Benefits of Running inside IDAA
  • 19. © 2017 IBM Corporation IBM Confidential In-Database Analytics – Technical basics ▪ Set of stored procedures of IBM Netezza In-Database Analytics Package (INZA) are available on the Accelerator, including: ▪ Decision Tree ▪ Regression Tree ▪ Naive Bayes ▪ K-means Clustering and TwoStep Clustering ▪ Stored procedures use accelerator-shadow tables or accelerator-only tables as input and create accelerator-only tables as output ▪ Db2 for z/OS contains stored procedure wrappers to enable invocation of the stored procedures from Db2 for z/OS analytical applications or from SPSS Modeler 17.1 ▪ Actual stored procedure execution takes place on Accelerator
  • 20. In-Database Analytics: data preparation (using AOTs) and SPSS modeling in the Accelerator. Advantages: • Allows fast model refreshes • Ensures adequate scoring • Better performance and reduced latency • Scoring outside the Accelerator with SPSS Modeler Server or Zementis [Diagram: customer transactions and customer data from the transaction processing systems (OLTP) with embedded scoring serve both transactional and analytical processing; customer transaction data preparation AOTs feed modeling with SPSS Modeler, which produces the models.]
  • 21. «Behind the scenes»: Exploiting In-Database Modeling using the Accelerator
  • 22. «Behind the scenes»: Execution Results
  • 23. «Behind the scenes»: Looking at the Accelerator using Data Studio
  • 24. In-Database Analytics: using accelerator-only tables, ELT logic and modeling in the Accelerator. [Diagram: customer transactions and customer data from the transaction processing systems (OLTP) serve both transactional and analytical processing; ELT logic builds the customer transaction summary and history AOTs and the customer summary mart AOTs; customer transaction data preparation AOTs feed modeling with SPSS Modeler, which produces the models.]
  • 25. Agenda • Analytics on IBM z Systems • Machine Learning Basics • IBM Db2 Analytics Accelerator for Machine Learning • IBM Machine Learning for z/OS + Demo
  • 26. Understand the 2 Different Steps. [Diagram labels: rule and model creation; rule and model execution within the transaction; decision management / rules application; scoring against an existing model; data synchronisation with the IBM Db2 Analytics Accelerator; in-database transformation (accelerated); real-time / predictive analytics / reporting; merging non-Db2 for z/OS data and Db2 data; model and rule updates; z data; z apps.]
  • 27. IBM ML Training runtimes: leverage a wide range of platforms to meet the varying needs of enterprise architectures. [Diagram labels: IBM ML authoring (DSX) on Cloud, Mac OS X, Windows, Linux, Linux on System z; runtimes on Cloud, Distributed, Power, z/OS; Model Repository; Spark; Anaconda; Deep Learning; IBM Machine Learning.]
  • 28. IBM ML Deployments: leverage a wide range of platforms to meet the varying needs of enterprise architectures. [Diagram labels: IBM ML authoring (DSX) on Cloud, Mac OS X, Windows, Linux, Linux on System z; deployments on Cloud, Distributed, Power, z/OS; Model Repository; IBM Machine Learning.]
  • 29. Model Creation – Built-in Libraries • Auto-modeling – Cognitive assistant for data scientists (CADS) • Selects the algorithm with the best performance from a set of candidates – Hyperparameter optimization (HPO) • Selects the hyperparameter setting with the best performance from a set of candidates, given a specific algorithm – CADS and HPO use the performance of models on small data sets to predict their performance on large data sets. They use machine learning to facilitate machine learning. • Visualization – Data scientists use visualization tools to help them understand the data distribution. Brunel is one of the tools commonly used by data scientists – Brunel is designed as a layer on top of low-level visualization technology and does not require programming to design compelling interactive visualizations. Example: %%brunel data('churndata') bar x(AGE) y(#count) label(NEGTWEETS) color(CHURN_LABEL)
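The HPO idea on this slide (pick the best-performing hyperparameter setting from a set of candidates for a fixed algorithm) can be sketched as follows. This is a deliberately simplified illustration, not IBM's implementation: it scores every candidate exhaustively on a small validation set, whereas the slide says CADS and HPO predict large-data performance from small data sets. The "algorithm" is a hypothetical threshold classifier, and all names and data are made up.

```python
def accuracy(threshold, samples):
    """Fraction of (score, label) pairs classified correctly at this threshold."""
    hits = sum((score >= threshold) == bool(label) for score, label in samples)
    return hits / len(samples)


def select_hyperparameter(candidates, samples):
    """Return the candidate with the highest accuracy (first one wins ties)."""
    return max(candidates, key=lambda c: accuracy(c, samples))


# Hypothetical validation data: (model score, true label).
validation = [(0.2, 0), (0.4, 0), (0.55, 1), (0.7, 1), (0.9, 1)]

# Candidate thresholds play the role of the hyperparameter settings.
best = select_hyperparameter([0.3, 0.5, 0.8], validation)
```

Selecting the best algorithm from a set of candidates (the CADS bullet) has the same shape: the candidates are algorithms instead of settings, and the scoring function trains and evaluates each one.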
  • 30. Model Creation – Integrated Jupyter Notebook • The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text. [Screenshot labels: cell for code snippet; interactive execution]
  • 31. Model Training – Visual Model Builder • The visual model builder is a wizard that guides users through creating a model step by step • No programming skills are required
  • 32. PMML Model Support • The Predictive Model Markup Language (PMML) is an XML-based predictive model interchange format. • Many vendors can export their models to PMML format, including SPSS, R and SAS • IBM Machine Learning for z/OS supports scoring for PMML models that conform to the PMML standard • Support for PMML extensions is not guaranteed
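Because PMML is plain XML, a model file can be inspected with any XML parser. The sketch below reads the data dictionary of a trimmed, hypothetical PMML 4.3 fragment using Python's standard library. The fragment is not a complete model (a real file also contains a model element such as a tree or regression), and the field names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical PMML fragment: header and data dictionary only.
PMML_DOC = """<PMML xmlns="http://www.dmg.org/PMML-4_3" version="4.3">
  <Header description="hypothetical credit model"/>
  <DataDictionary numberOfFields="2">
    <DataField name="AGE" optype="continuous" dataType="double"/>
    <DataField name="LABEL" optype="categorical" dataType="string"/>
  </DataDictionary>
</PMML>"""

# PMML elements live in a versioned namespace.
NS = {"pmml": "http://www.dmg.org/PMML-4_3"}


def list_data_fields(pmml_text):
    """Return (name, dataType) for every DataField in the data dictionary."""
    root = ET.fromstring(pmml_text)
    fields = root.findall("pmml:DataDictionary/pmml:DataField", NS)
    return [(f.get("name"), f.get("dataType")) for f in fields]


fields = list_data_fields(PMML_DOC)
```

A check like this is one way to see which input fields a scoring service will expect before an exported model is imported.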
  • 33. Model Management – Saving Model • Models are managed in a central repository in Db2 for z/OS • Leverages the high availability of Db2 and z [Diagram: the notebook, the visual model builder, ML services and PMML models use the ML libraries / services to persist models; models and model metadata are stored in Db2 for z/OS V10 or above.]
  • 34. Model Deployment • Model deployment is the process of moving a model into the production environment to serve a business need – single-click deployment • Models are deployed as REST interfaces • Runtime performance monitoring for scoring services [Diagram labels: CICS, WAS, Mobile, DFHJSON]
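Since deployed models are exposed as REST interfaces, a client scores a record by posting it to the deployment's endpoint. The sketch below only builds such a request with Python's standard library and does not send it. The URL and the JSON payload shape are assumptions made for illustration (the field names follow the example table earlier in the deck); the actual request format of the scoring service is not given in the slides.

```python
import json
import urllib.request


def build_scoring_request(url, record):
    """Build (but do not send) a JSON POST request for one record to score."""
    body = json.dumps(record).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and record.
req = build_scoring_request(
    "https://mlz.example.com/scoring/loan-model",
    {"GENDER": "F", "AGE": 24, "MARITAL_STATUS": "Married", "PROFESSION": "Retail"},
)
# Sending would be: urllib.request.urlopen(req), against a real deployment.
```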
  • 35. Continuous Performance Monitoring • Users add new labeled data as feedback data to the feedback dataset • Evaluation tasks are scheduled to monitor the performance of a model with the newly inserted feedback data [Diagram: the training dataset and the feedback dataset show columns GENDER, AGE, MARITAL_STATUS, PROFESSION, … and INSERT_TIME; newly inserted feedback data is evaluated each day (Day 1, Day 2, Day 3, Day 4, Day 5, …) and becomes evaluated feedback data, leading to the question: re-train?]
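The monitoring loop on this slide can be sketched in a few lines: each day's newly inserted feedback batch is evaluated against the deployed model, and a day is flagged for possible re-training when accuracy drops below a threshold. The model, the data and the 0.8 threshold are all hypothetical; the slides do not state which metric or threshold the product uses.

```python
def evaluate(model, batch):
    """Accuracy of the model on one batch of (features, label) feedback rows."""
    correct = sum(model(features) == label for features, label in batch)
    return correct / len(batch)


def monitor(model, daily_batches, min_accuracy=0.8):
    """Return (day, accuracy, retrain_flag) for each day's feedback batch."""
    report = []
    for day, batch in enumerate(daily_batches, start=1):
        acc = evaluate(model, batch)
        report.append((day, acc, acc < min_accuracy))
    return report


def toy_model(features):
    """Hypothetical model: predicts a purchase (1.0) for customers aged 30+."""
    return 1.0 if features["AGE"] >= 30 else 0.0


# Hypothetical feedback data: one batch per day.
batches = [
    [({"AGE": 43}, 1.0), ({"AGE": 24}, 0.0)],  # day 1: model still fits
    [({"AGE": 43}, 0.0), ({"AGE": 28}, 1.0)],  # day 2: behaviour has shifted
]
report = monitor(toy_model, batches)
```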
  • 36. Continuous Performance Monitoring (cont.): highlights deployments whose performance is degrading
  • 37. Components for a Machine Learning Implementation. [Diagram labels: Historical Data; Ingest; Build / Train Model; Training / Learning; Evaluate; Deploy; Scoring Service; Application Scoring; New Data; Business Process / Application; Monitor Success; Feedback]
  • 38. IBM Machine Learning for z/OS Architecture ➢ Move Machine Learning capability to the platform where the most valuable data resides ➢ Integrate real-time predictive analytics with transactions ➢ Leverage the superior reliability, availability and security of z/OS
  • 39. The former Loan Application • A customer is eligible for a loan according to several criteria, such as the amount of the loan, the yearly income of the borrower, and the duration of the loan. • The decision logic is embedded in multiple loan approval applications: • The branch application running on z/OS (COBOL/CICS) • The Internet application running on a JEE server [Diagram: the branch application (COBOL/CICS) and the Internet application (JEE) each embed their own validation and eligibility decision logic; a separate batch scoring application performs the scoring.] ▪ Scoring is not real time ▪ Disruption in the predictive model lifecycle management (data scientist and IT) ▪ Rules written in software code cannot be read by business people ▪ Hard-coded rules are difficult to change ▪ Rules intertwined within applications cannot be reused by other systems
  • 40. The NEW Loan Application • Objectives • The SBSLoan application illustrates how Decision Management helps lenders make an online decision on loan approval. • Functional Description • Manage loan approval through • a set of business rules • an in-transaction, in-database real-time scoring [Diagram labels: real-time scoring; validation, eligibility, scoring]
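The combination of business rules and real-time scoring described here can be sketched as a small decision function. The three steps (validation, eligibility, scoring) and the criteria (loan amount, yearly income, duration) come from the slides; the concrete rule, the 35% repayment ratio and the 0.6 score cut-off are hypothetical, and in the demo the rules live in a rule engine rather than in application code.

```python
def validate(amount, yearly_income, duration_months):
    """Validation rule: all inputs must be positive."""
    return amount > 0 and yearly_income > 0 and duration_months > 0


def is_eligible(amount, yearly_income, duration_months, score,
                max_income_ratio=0.35, min_score=0.6):
    """Eligibility rule combining a repayment ratio with the model score."""
    yearly_repayment = amount / duration_months * 12
    affordable = yearly_repayment <= max_income_ratio * yearly_income
    return affordable and score >= min_score


def decide(amount, yearly_income, duration_months, score):
    """Run validation, then eligibility (which uses the score)."""
    if not validate(amount, yearly_income, duration_months):
        return "rejected: invalid request"
    if not is_eligible(amount, yearly_income, duration_months, score):
        return "rejected: not eligible"
    return "approved"


# The score would come from the real-time scoring service; here it is given.
good = decide(12000, 40000, 24, score=0.8)
risky = decide(12000, 40000, 24, score=0.3)
invalid = decide(0, 40000, 24, score=0.9)
```

Keeping the rules and the score as separate inputs mirrors the slide's point: either can be updated without rewriting the calling application.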
  • 41. The NEW Loan Application – Demonstration overall picture. [Diagram labels: z/OS; Rule Execution Server on WAS for z/OS; zRES on CICS; validation, eligibility; Decision Center; Repository; Business User; Database Server Db2 z/OS; Production Data; Historical Data; Scoring Services; Liberty on z/OS; Scoring REST/JSON; Machine Learning / Spark on z/OS; IBM Machine Learning UI (Jupyter Notebook / Visual Model Builder / Model Management / Model Deployment / Monitoring); Ingestion lib; Training lib; Db2 JDBC driver; CADS/HPO lib; CICS COBOL App; phases: Runtime, Deployment, Development / Training]
  • 42. Live Demonstration
  • 44. (Relatively) New IBM Redbooks publication: SG24-8314-00 • Analytics on z Systems environment • Warehouse concepts • Logical data warehouse • Transformation patterns • Accelerator-only tables • Concepts and architecture • Use cases enabled by accelerator-only tables and in-database analytics • Multi-step reporting • Using QMF to store query results and importing tables • Accelerating IBM Campaign processing • In-database transformations • Accelerator and accelerator-only table usage within DataStage • Accelerator-only tables supporting data scientists' ad-hoc analysis • Integration of more data sources and archiving for analytics • In-database analytics http://www.redbooks.ibm.com/Redbooks.nsf/RedbookAbstracts/sg248314.html?Open
  • 45. New IBM Redbooks publication to come very soon … • Analytics on z Systems environment • Warehouse concepts • Logical data warehouse • Transformation patterns • Accelerator-only tables • Concepts and architecture • Use cases enabled by accelerator-only tables and in-database analytics • Multi-step reporting • Using QMF to store query results and importing tables • Accelerating IBM Campaign processing • In-database transformations • Accelerator and accelerator-only table usage within DataStage • Accelerator-only tables supporting data scientists' ad-hoc analysis • Integration of more data sources and archiving for analytics • In-database analytics http://www.redbooks.ibm.com/Redbooks.nsf/RedbookAbstracts/sg248314.html?Open IBM Machine Learning on z/OS
  • 46. Proof of Concept on IBM Machine Learning for z/OS: help with testing IBM Machine Learning in a real environment. Questions our clients ask themselves: • How can we benefit from the IBM Machine Learning solution? • How do we identify the best use cases to get started with? • How do we implement IBM Machine Learning within our environment? • How do we ingest and model our data within Machine Learning? • How do we deploy new models to be used in our current applications? • How do we manage the IBM Machine Learning installation? The benefits for our clients: • See immediate benefits from IBM Machine Learning in identifying new patterns within their data. • Get help to identify, implement, ingest, train and model the most relevant data for their use cases. • Experiment with the IBM Machine Learning solution in a dedicated PoC environment. • Assistance to install the IBM Machine Learning solution in their existing production environment. • Customer team members enabled to leverage IBM Machine Learning benefits. Participants: • Client Data Scientists • Client Database Administrators • Client Technical Team • IBM Local Team. Phases: Preparation, Enablement, Setup, PoC. Duration: 2 days, 1 day, 3 days to 2 weeks; PoC: 1 month. Key activities: ▪ 1 questionnaire to be filled in by the client ▪ 1-hour conference call ▪ Review of prerequisites in the client target environment • Teaching on new IBM Machine Learning for z/OS benefits and features • Hands-on labs on IBM sandboxes • Preparation of the IBM Machine Learning environment in the Montpellier dedicated PoC environment • Implementation of the IBM Machine Learning environment at the customer location • Remote assistance during the PoC. Contacts: Guillaume Arnould, Analytics on zSystems Certified IT Specialist, IBM Client Center Montpellier, arnould@fr.ibm.com
  • 47. Thank you for attending the Db2 Update Days 2018!