PMML Overview

PMML (Predictive Model Markup Language)

PMML is an XML-based language used to define statistical and data mining models and to share them between compliant applications. It is a mature standard developed by the DMG (Data Mining Group) to avoid proprietary issues and incompatibilities and to deploy models.

PMML defines a standard not only for representing data-mining models, but also for data handling and data transformations (pre- and post-processing).

PMML allows for a clear separation of tasks: model development vs. model deployment. As a consequence, scientists can focus on building the best model. PMML eliminates the need for custom model deployment code, which helps with scalability and reliability.
PMML: Matured and Supported by Industry

Data Mining Group (http://www.dmg.org)
• Mature standard
• Current version: 4.0 (just released)
• Active group and constant enhancements
• Vendor-independent consortium

Industry supporters
• Major players: IBM, Oracle, SAP, Microsoft
• Analytics: KXEN, SAS, Salford, SPSS, Zementis
• BI: Microstrategy, Teradata, Tibco
• Open source: KNIME, R, Rapid-I
• Others: Equifax, FICO, Open Data Group, Visa, Pervasive
PMML Components
• A Data Dictionary defines all the raw data fields (including missing value strategy and outlier treatment).
• Several Data Transformations strategies allow for intelligent extraction of feature detectors from raw data ("data massaging").
• A comprehensive list of Data-Mining Models offers power and flexibility.
• Post-processing of results allows for tailored decisions.
• Model Explanation allows for performance evaluation.
PMML Files
1) Pre-Processing: the PMML elements Transformations, Mining Schema and Functions allow for effective pre-processing.
2) Models: PMML allows several predictive modeling techniques to be fully expressed.
3) Post-Processing: scaling of model outputs can be performed with the PMML element Targets.
PMML: Data Pre-Processing
• Data Dictionary: Allows for the explicit specification of valid, invalid and missing values.
• Mining Schema: Used to define the appropriate treatment to be applied to missing and invalid values.
• Transformations: Allow for variable discretization, normalization, and mapping, with handling of missing and default values.
• Built-in Functions: Arithmetic expressions and handling of dates, times and strings. Also used for implementing IF-THEN-ELSE logic and Boolean operations.
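To make the Mining Schema's role concrete, here is a minimal Python sketch of missing-value replacement. The field names (age, income, risk) and the replacement values are hypothetical; MiningField and its missingValueReplacement attribute are taken from the PMML specification, and the fragment is deliberately abbreviated rather than a complete PMML document.

```python
import xml.etree.ElementTree as ET

# Hypothetical MiningSchema fragment (no namespace, for brevity).
MINING_SCHEMA = """
<MiningSchema>
  <MiningField name="age" missingValueReplacement="40"/>
  <MiningField name="income" missingValueReplacement="0"/>
  <MiningField name="risk" usageType="predicted"/>
</MiningSchema>
"""


def load_replacements(xml_text):
    """Map each field name to its declared replacement value, if any."""
    root = ET.fromstring(xml_text)
    return {
        field.get("name"): float(field.get("missingValueReplacement"))
        for field in root.iter("MiningField")
        if field.get("missingValueReplacement") is not None
    }


def prepare_record(record, replacements):
    """Return a copy of the record with missing inputs filled in."""
    cleaned = dict(record)
    for name, value in replacements.items():
        # Assumption: "missing" means absent from the record or None.
        if cleaned.get(name) is None:
            cleaned[name] = value
    return cleaned


replacements = load_replacements(MINING_SCHEMA)
print(prepare_record({"age": None, "income": 52000.0}, replacements))
```

A real scoring engine would also honor the Data Dictionary's declarations of valid and invalid values; this sketch covers only the replacement step.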
Data Pre-Processing: PMML Example
Arbitrary Piecewise Linear Function

This PMML code implements:
Var_b := interpolate(Var_a, ((100,0), (200,1), (800,3), (900,4)))

See http://www.dmg.org/v3-2/Transformations.html and look for the element NormContinuous.
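The piecewise linear mapping can be sketched in Python as follows. The DerivedField fragment is an assumed reconstruction built from the four (orig, norm) pairs given above, using the NormContinuous and LinearNorm elements from the PMML specification; it is not the slide's original listing. As a simplification, the sketch rejects inputs outside the 100 to 900 range instead of applying PMML's outlier options.

```python
import xml.etree.ElementTree as ET

# Assumed fragment: the four knots come from the interpolate(...) expression.
DERIVED_FIELD = """
<DerivedField name="Var_b" optype="continuous" dataType="double">
  <NormContinuous field="Var_a">
    <LinearNorm orig="100" norm="0"/>
    <LinearNorm orig="200" norm="1"/>
    <LinearNorm orig="800" norm="3"/>
    <LinearNorm orig="900" norm="4"/>
  </NormContinuous>
</DerivedField>
"""


def load_knots(xml_text):
    """Read the (orig, norm) pairs in document order."""
    root = ET.fromstring(xml_text)
    return [
        (float(node.get("orig")), float(node.get("norm")))
        for node in root.iter("LinearNorm")
    ]


def interpolate(x, knots):
    """Linearly interpolate x between the two surrounding knots."""
    if not knots[0][0] <= x <= knots[-1][0]:
        # Simplification: outlier handling is left out of this sketch.
        raise ValueError("input outside the range covered by the knots")
    for (x0, y0), (x1, y1) in zip(knots, knots[1:]):
        if x0 <= x <= x1:
            return y0 + (x - x0) * (y1 - y0) / (x1 - x0)


knots = load_knots(DERIVED_FIELD)
print(interpolate(500, knots))  # halfway between (200, 1) and (800, 3)
```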
Modeling Elements: Easy Expression of Predictive Models

PMML allows for several predictive modeling techniques to be expressed directly. Supported techniques which have their own elements are:
• Regression and General Regression
• Neural Networks
• Support Vector Machines
• Decision Trees
• Naïve Bayes
• Clustering
• Sequences
• Rule Sets
• Association Rules
• Time-Series (as of PMML 4.0)
• Text Models
• Support for Multiple Models
Modeling Elements: PMML Example for Neural Network
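As a rough illustration of how a neural network is laid out in PMML, the Python sketch below evaluates a tiny, hypothetical network. NeuralNetwork, NeuralLayer, Neuron and Con are element names from the PMML specification, but the fragment is abbreviated (it omits MiningSchema, NeuralInputs and NeuralOutputs), and the ids, weights and biases are invented for the example.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical two-layer network; "x1" and "x2" stand in for input ids.
NETWORK = """
<NeuralNetwork functionName="regression" activationFunction="logistic">
  <NeuralLayer>
    <Neuron id="h1" bias="0.0">
      <Con from="x1" weight="1.0"/>
      <Con from="x2" weight="-1.0"/>
    </Neuron>
  </NeuralLayer>
  <NeuralLayer activationFunction="identity">
    <Neuron id="out" bias="0.5">
      <Con from="h1" weight="2.0"/>
    </Neuron>
  </NeuralLayer>
</NeuralNetwork>
"""

# Only the two activation functions used by the fragment are covered.
ACTIVATIONS = {
    "logistic": lambda z: 1.0 / (1.0 + math.exp(-z)),
    "identity": lambda z: z,
}


def forward(xml_text, inputs):
    """Propagate inputs layer by layer; return the last neuron's output."""
    net = ET.fromstring(xml_text)
    default_name = net.get("activationFunction")
    values = dict(inputs)
    last_id = None
    for layer in net.iter("NeuralLayer"):
        # A layer may override the network-wide activation function.
        activate = ACTIVATIONS[layer.get("activationFunction", default_name)]
        for neuron in layer.iter("Neuron"):
            z = float(neuron.get("bias", "0"))
            for con in neuron.iter("Con"):
                z += float(con.get("weight")) * values[con.get("from")]
            last_id = neuron.get("id")
            values[last_id] = activate(z)
    return values[last_id]


print(forward(NETWORK, {"x1": 1.0, "x2": 1.0}))
```

Each Con element names its source by id, so the whole network is a set of named nodes and weighted links rather than weight matrices.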
Data Post-Processing: PMML Example

The PMML code below implements score post-processing. It uses the PMML element Targets to check boundaries (min and max) and to rescale (rescaleConstant and rescaleFactor) the original score generated by the model.

See http://www.dmg.org/v3-2/Targets.html
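A minimal Python sketch of this kind of post-processing follows. The Targets fragment and all of its values are illustrative assumptions, not the ones from the slide. The order of operations (clip to min and max, multiply by rescaleFactor, then add rescaleConstant) follows my reading of the PMML Targets specification and is worth checking against the linked page.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: field name and numbers are made up.
TARGETS = """
<Targets>
  <Target field="score" min="0" max="1"
          rescaleFactor="100" rescaleConstant="10"/>
</Targets>
"""


def load_target(xml_text):
    """Read the four numeric post-processing attributes of the Target."""
    target = ET.fromstring(xml_text).find("Target")
    names = ("min", "max", "rescaleFactor", "rescaleConstant")
    return {name: float(target.get(name)) for name in names}


def post_process(raw_score, params):
    """Clip the raw score to [min, max], then rescale it."""
    clipped = min(max(raw_score, params["min"]), params["max"])
    return clipped * params["rescaleFactor"] + params["rescaleConstant"]


params = load_target(TARGETS)
print(post_process(0.5, params))  # in range: 0.5 * 100 + 10
print(post_process(1.7, params))  # clipped to 1 before rescaling
```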
One Standard, One Process
[Diagram: Applications, Service Providers, External Vendors, Divisions]
PMML = Easy Model Deployment
[Diagram: Model Building, PMML, Model Deployment]
PMML - Zementis Contributions
• ADAPA: A decision engine that deploys models expressed in PMML and executes them in real time. Now available as a service on the Amazon Cloud.
• PMML Converter: Validates, converts, and corrects old and new PMML code. Available at the DMG website and at http://www.zementis.com/pmml.htm.
• Contributing Member of the DMG: Submitted several proposals for PMML 4.0 and already working with other members on PMML 4.1.
• Code contributor for the R PMML package (available on CRAN).
• PMML Articles: R Journal and SIGKDD Explorations Newsletter. Available for download at http://www.zementis.com/manual.htm
• PMML Blogs: Several blogs on PMML topics (http://adapasupport.zementis.com and http://www.predictive-analytics.info).
Thank You!

E-mail: [email_address]

U.S.A. Headquarters
6125 Cornerstone Court East, Suite 250
San Diego, CA, 92121
Tel: +1 619 330-0780
Fax: +1 858 535-0227

Asia Office
19/F., Unit A, Ho Lee Commercial Building
38-44 D’Aguilar Street
Central, Hong Kong (S.A.R.)
Tel: +852 2868-0878
Fax: +852 2845-6027
