Predictive Analytics & IBM SPSS Modeler
What is Predictive Analytics?
• A set of business intelligence technologies that
uncovers relationships and patterns within large
volumes of data that can be used to predict
behaviour and events.
• Predictive Analytics is forward looking, using past
events to anticipate the future.
• Gives your organisation the knowledge to predict
and the power to act.
The Predictive Analytics Pillars
Predictive Customer Analytics: Acquire, Grow, Retain
• Up-sell/cross-sell
• Market basket analysis
• Churn prevention
• Customer segmentation
• Customer loyalty
Predictive Operational Analytics: Manage, Maintain, Maximize
• Predictive maintenance
• Assortment planning
• Condition monitoring
• Reverse logistics
• Allocation management
Predictive Threat & Risk Analytics: Monitor, Detect, Control
• Claims fraud
• Credit-card fraud
• Minimize inventory loss
• Assess network outage risk
• National security
IBM SPSS Modeler
• Comprehensive predictive analytics platform
• Improve outcomes through predictive intelligence
• Deployment options from the desktop through to integration within operational systems
• Provides a range of advanced analytics: text analytics, entity analytics, social network analysis, decision management and optimization
IBM SPSS Modeler Editions
• IBM SPSS Modeler Gold
– Build and deploy predictive models directly into business processes
– Achieved with Decision Management, which combines predictive analytics with rules, scoring and optimization, and ties into the organisation's processes and operational systems to deliver recommended actions at the point of impact (POI)
– Includes: SPSS Modeler Premium (client & server) + Collaboration & Deployment Services + Analytical Decision Management
• IBM SPSS Modeler Premium
– Range of advanced algorithms and capabilities, including text analytics, entity analytics and social network analytics, to address a multitude of business problems and analytic requirements on almost any type of data
– Includes: SPSS Modeler Professional + Text Analytics + Entity Analytics + Social Network Analysis
• IBM SPSS Modeler Professional
– Range of advanced algorithms, data manipulation & automated modeling and preparation techniques to
build predictive models and uncover hidden patterns in structured data.
• IBM SPSS Modeler Personal
– Design and build predictive models from your desktop. SPSS Modeler Personal helps you solve business
problems faster by revealing patterns and trends in your structured data, for deeper insights into your
customers or constituents.
Data Mining Methodology
Cross-Industry Standard Process for Data Mining (CRISP-DM)
Phases and Tasks in CRISP-DM
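The diagram for this slide did not survive extraction. As a reminder, the CRISP-DM reference model names six phases, and the speaker notes describe them as fitting together in a cycle. The sketch below lists the phases and models only the outer cycle as a wrap-around; the `Phase` enum and the `next_phase` helper are illustrative names, and the real process also allows moving back and forth between phases.

```python
from enum import Enum


class Phase(Enum):
    """The six CRISP-DM phases, in their usual order."""
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


def next_phase(phase: Phase) -> Phase:
    """Return the following phase, wrapping from deployment back to the start."""
    members = list(Phase)
    return members[(members.index(phase) + 1) % len(members)]


if __name__ == "__main__":
    for phase in Phase:
        print(phase.name, "->", next_phase(phase).name)
```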
Data Mining Techniques
• Association
• Classification
• Segmentation
Classification and Prediction
• Used to predict a result:
– Will a customer buy or leave?
– Does a transaction fit a known pattern of fraud?
– Expected inventory levels
– Forecast number of widget purchases
• Techniques included
– Decision Trees, Bayesian Networks, Neural Networks, Decision List, Statistical Models, Time Series, Self-Learning Response Models, Support Vector Machines, Nearest Neighbor Models
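To make the "will a customer leave?" question concrete, here is a minimal sketch of the simplest possible decision tree: a single split (a "stump") on one numeric field. It is plain Python rather than anything from SPSS Modeler, and the field (`months_inactive`), the data and the `fit_stump` helper are all invented for illustration.

```python
def fit_stump(values, labels):
    """Pick the threshold t that minimises errors for the rule 'value > t => churn'.

    Assumes at least two distinct values; candidate thresholds are the
    midpoints between neighbouring distinct values.
    """
    distinct = sorted(set(values))
    candidates = [(lo + hi) / 2 for lo, hi in zip(distinct, distinct[1:])]

    def errors(threshold):
        # Count records where the rule's prediction disagrees with the label.
        return sum((v > threshold) != y for v, y in zip(values, labels))

    return min(candidates, key=errors)


# Hypothetical training data: months since last purchase, and whether the customer churned.
months_inactive = [1, 2, 2, 3, 6, 7, 8, 9]
churned = [False, False, False, False, True, True, True, True]

threshold = fit_stump(months_inactive, churned)


def predict_churn(months):
    """Score a new customer with the learned rule."""
    return months > threshold


if __name__ == "__main__":
    print("learned threshold:", threshold)
    print("customer inactive for 5 months churns?", predict_churn(5))
```

A real decision tree repeats this search across many fields and splits recursively; the stump only shows the core idea of learning a rule from labelled history.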
Segmentation
• Helps group records into clusters or identify unusual cases:
– Identify new patterns of fraud
– Identify groups of interest in your customer base
– Identify data segments that are unusual
• Techniques included
– Kohonen
– K-Means
– TwoStep
– Anomaly Detection
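As an illustration of how a clustering technique such as K-Means groups records without a target field, the following is a small plain-Python sketch of the standard assign-and-update loop. The customer points, the choice of the first `k` points as starting centroids and the `kmeans` function itself are assumptions made for the example, not SPSS Modeler's implementation.

```python
import math


def kmeans(points, k, iterations=20):
    """Basic K-Means: assign each point to its nearest centroid, then recompute centroids."""
    centroids = list(points[:k])  # assumption: deterministic start from the first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        for c in range(k):
            members = [p for p, label in zip(points, new_labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
    return labels, centroids


# Hypothetical customers described by (visits per month, average basket size).
customers = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
labels, centroids = kmeans(customers, k=2)

if __name__ == "__main__":
    for customer, label in zip(customers, labels):
        print(customer, "-> segment", label)
```

Note that there is no "right answer" column in the input; the segments are judged by whether they describe useful groupings.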
Association
• Helps find relationships and rules that determine an outcome given a set of conditions:
– Find associations quickly in larger data sets
– Customers who bought product X also bought Y and Z (market basket)
• Techniques included
– Apriori
– CARMA
– Sequence Model
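The market basket idea can be sketched with two standard measures: support (how often items appear together) and confidence (how often the right-hand item appears given the left-hand one). The code below is a simplified pairwise count in plain Python, not the Apriori or CARMA algorithm itself; the baskets, thresholds and `pair_rules` name are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.3, min_confidence=0.7):
    """Return rules (lhs, rhs, support, confidence) for item pairs above both thresholds."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in set(basket))
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(set(basket)), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical transactions.
baskets = [
    ["bread", "butter", "jam"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["butter", "jam"],
    ["bread", "butter", "milk"],
]
rules = pair_rules(baskets)

if __name__ == "__main__":
    for lhs, rhs, support, confidence in rules:
        print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

Unlike a decision tree, any item can appear on either side of a rule, which is why both directions of each pair are checked.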
Automated Modeling
• Choose from three automated modeling nodes, depending on the needs of your analysis:
– Build models with a number of different methods in a single modeling run, then rank them to compare their performance.
• Techniques included
– Auto Classifier
– Auto Numeric
– Auto Cluster
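The "build several, then rank" idea can be sketched in a few lines. This is only an analogy to the Auto Classifier node: the candidates here are fixed hand-written rules rather than trained models, they are scored on the same toy data they are shown, and every name (`rank_models`, the field names, the rules) is made up for the example.

```python
def rank_models(candidates, rows, labels):
    """Score each candidate by accuracy and return (accuracy, name) pairs, best first."""
    scored = []
    for name, predict in candidates.items():
        correct = sum(predict(row) == label for row, label in zip(rows, labels))
        scored.append((correct / len(rows), name))
    # Sort by accuracy descending, then by name so ties are ordered predictably.
    return sorted(scored, key=lambda item: (-item[0], item[1]))


# Hypothetical customer records and churn outcomes.
rows = [
    {"tenure": 3, "complaints": 3},
    {"tenure": 6, "complaints": 0},
    {"tenure": 24, "complaints": 2},
    {"tenure": 36, "complaints": 0},
    {"tenure": 10, "complaints": 1},
    {"tenure": 48, "complaints": 0},
]
churned = [True, True, False, False, True, False]

# Three stand-in "models": each maps a record to a churn prediction.
candidates = {
    "always_stay": lambda row: False,
    "short_tenure": lambda row: row["tenure"] < 12,
    "many_complaints": lambda row: row["complaints"] >= 2,
}

ranking = rank_models(candidates, rows, churned)

if __name__ == "__main__":
    for accuracy, name in ranking:
        print(f"{name}: accuracy={accuracy:.2f}")
```

In practice the comparison would use held-out data and whichever ranking criterion suits the analysis; accuracy is used here only because it is the easiest to read.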
IBM SPSS Modeler Demo
Additional Slides
Entity Analytics
• Detect non-obvious relationships, resolve entities and
find threats & vulnerabilities that are hiding in your
data.
• Incremental context accumulator for detecting like and
related entities:
– Large, sparse and disparate collections of data
– New and old data
– Small and big data environments
• Perform analytics on events, people, things,
transactions and relationships.
• Find answers to questions like: who is who; who knows who; who does what; whose name is who?
Text Analytics
• IBM® SPSS® Modeler Text Analytics offers
powerful text analytic capabilities, which use
advanced linguistic technologies and Natural
Language Processing (NLP) to rapidly process a
large variety of unstructured text data and, from
this text, extract and organize the key concepts.
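As a very rough stand-in for concept extraction, the sketch below counts the most frequent non-trivial words across a few free-text notes. It is plain Python with a tiny hand-made stop-word list; it does not use linguistic resources or NLP the way SPSS Modeler Text Analytics is described as doing, and the notes and the `top_concepts` name are invented for illustration.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list, far smaller than a real one.
STOPWORDS = {"a", "about", "after", "and", "is", "of", "on", "the", "to"}


def top_concepts(texts, n=3):
    """Return the n most frequent words across texts, ignoring stop words and short tokens."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS and len(w) > 2)
    return counts.most_common(n)


# Hypothetical call-center notes.
notes = [
    "Customer called about a billing error on the invoice",
    "Billing dispute: customer wants a refund",
    "Refund issued after billing error",
]

if __name__ == "__main__":
    for word, count in top_concepts(notes):
        print(word, count)
```

Even this crude count surfaces "billing" as the dominant theme, which hints at why adding unstructured text to a model's inputs can be informative.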
Getting a More Accurate Picture: All the Data Matters
THE MORE DATA YOU HAVE, THE BETTER YOUR PREDICTIONS CAN BE
Behavioral data
- Orders
- Transactions
- Payment history
- Usage history
Descriptive data
- Attributes
- Characteristics
- Self-declared info
- (Geo)demographics
Attitudinal data
- Opinions
- Preferences
- Needs & Desires
- Survey results
- Social Network Data
Interaction data
- E-Mail / chat transcripts
- Call center notes
- Web Click-streams
- In person dialogues
“Traditional” data: behavioral and descriptive
High-value, dynamic data (a source of competitive differentiation): attitudinal and interaction
Is this one person or two?
Record 1:
Bill Smith
123 Main Street
(800) 555-1212
SSN: 444-33-2222
DOB: 8/7/84
Applicant: Today

Record 2:
William R Smith
123 S Main Avenue
(100) 111-1234
DL: 90909091
DOB: 7/8/84
Arrested: Feb 2013
Entity Analytics automatically detects when multiple
entities are the same despite having been described
differently.
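One way to picture how two differently described records might be judged the same person is to collect independent pieces of matching evidence. The sketch below does this for the two records on the slide using only the Python standard library. The nickname table, the 0.6 address-similarity cut-off and the "day and month swapped" date check are all assumptions chosen for the example; they are not a description of how IBM's Entity Analytics works.

```python
from difflib import SequenceMatcher

# Assumption: a tiny illustrative nickname table.
NICKNAMES = {"bill": "william", "bob": "robert"}


def normalise_name(name):
    """Lower-case, drop single-letter initials and expand known nicknames."""
    tokens = [t for t in name.lower().split() if len(t) > 1]
    return " ".join(NICKNAMES.get(t, t) for t in tokens)


def dob_matches(a, b):
    """True if the dates are identical or differ only by swapped day and month."""
    pa, pb = a.split("/"), b.split("/")
    return pa == pb or (pa[2] == pb[2] and pa[0] == pb[1] and pa[1] == pb[0])


def match_evidence(r1, r2, address_cutoff=0.6):
    """List the fields on which the two records plausibly agree."""
    evidence = []
    if normalise_name(r1["name"]) == normalise_name(r2["name"]):
        evidence.append("name")
    similarity = SequenceMatcher(None, r1["address"].lower(), r2["address"].lower()).ratio()
    if similarity >= address_cutoff:
        evidence.append("address")
    if dob_matches(r1["dob"], r2["dob"]):
        evidence.append("dob")
    if r1["phone"] == r2["phone"]:
        evidence.append("phone")
    return evidence


record_1 = {"name": "Bill Smith", "address": "123 Main Street",
            "phone": "(800) 555-1212", "dob": "8/7/84"}
record_2 = {"name": "William R Smith", "address": "123 S Main Avenue",
            "phone": "(100) 111-1234", "dob": "7/8/84"}

evidence = match_evidence(record_1, record_2)

if __name__ == "__main__":
    print("agreeing fields:", evidence)
```

No single field is conclusive here: the phone numbers differ and the identifiers (SSN versus DL) cannot be compared at all, so it is the accumulation of partial agreements that suggests one person.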
Text Analytics within IBM SPSS Modeler
Text Analytics Extracts Concepts and Patterns from Text
Text Analytics Identifies the Context/Sentiment of the Text


Predictive Maintenance- From fixing to predicting problems


Editor's Notes

  • #3: Being able to predict is one thing; knowing how to act on those predictions is a whole different story, and it is the key differentiator that will give your business a competitive edge. Predictive analytics can be as simple as:
    – Who are my best prospects? Based on who I have sold to in the past.
    – What offers should I send to which customers? Based on which customer took which offers in the past.
    – What job offers should I make to which applicants? Based on which applicant took which offers in the past.
    – How much money will a given customer spend with me next year? Based on how this customer's characteristics compare with those of other customers, and on how much they have spent with me in the past.
    – What starting salary will I be paying a given employee? Based on how this employee's characteristics compare with those of other employees, and on how much I have paid employees with these characteristics in the past.
    – How much of a product will sell next week, month, and year? Based on how much of that item or similar items has sold in the past.
    – Which claims or procurements are likely to be fraudulent? Based on the prior history of known fraud and the identification of unusual activities.
  • #4: Predictive Customer Analytics
    – Acquire: Attract and acquire the ideal customer, those that will be profitable across their customer lifetime, through segmentation and targeted deployment of offers.
    – Acquire (employees): Attract and acquire the ideal employee, those that will add value to my organization across their employment, through segmentation and targeted deployment of job offers.
    – Grow: Grow customer lifetime value by identifying individual customers' propensity to buy and personalizing up-sell and cross-sell offers.
    – Grow (employees): Grow my employee lifetime by identifying individual employees' need for further training and propensity to take up further career development programs, and personalizing training offerings.
    – Retain: Retain customers at risk of defecting by identifying attributes that lead to customer churn and proactively intervening with the optimal offer to make them stay.
    – Retain (employees): Retain high-value employees at risk of leaving by identifying attributes that lead to employee attrition and proactively intervening with the optimal retention strategies (bonuses, incentives) to make them stay.
    Examples:
    – Marketing optimisation for targeted promotions: identify the best customers for highly targeted marketing programs.
    – Cross- and up-sell: maximize the lifetime value of customers through personalized up-sell and cross-sell efforts.
    – Market basket analysis (shopping cart, cross-selling): find rules that identify products that, when purchased, predict additional purchases. Determine which products and services are likely to be purchased together.
    – Churn reduction: predict which customers are at risk of leaving, and why, so you can take action to retain them.
    – Customer segmentation for direct marketing: classify customers into groups with distinct usage or need patterns.
    – Customer loyalty: measure customer sentiment and spot emerging trends in social media and surveys to increase loyalty, using all formats of text and open-source Web data (such as blogs, e-mails and open-ended response questions on surveys).
    – Customer loyalty and retention (CRM): predict who is likely to pay back a loan or be eligible for a loan application (credit risk scoring); identify customers who are likely to cancel their policies, subscriptions, or accounts; predict who is likely to renew a contract for mobile phone service; predict the effects of complaints on loyalty.
    Predictive Operational Analytics
    – Manage your operations: control virtual and physical assets.
    – Maintain your infrastructure: set optimal maintenance schedules to remove downtime, and be alerted to imminent failures.
    – Maximize capital efficiency: ensure you are allocating your people and cash in the most efficient manner, in the context of your business processes.
    Examples:
    – Predictive maintenance: by examining all repair, usage and downtime data, your company can predict with a high degree of accuracy which machines are most likely to fail or need service. Typical tasks include identifying factors that lead to defects in a manufacturing process, identifying factors that lead to pipes leaking in a building, predicting when a spare part is going to fail within a vehicle, and identifying factors that lead to a machine failing within a factory.
    – Assortment planning (which products to stock): ensures that buyers purchase the right inventory and retailers maximize sales.
    – Condition monitoring: proactive machine and equipment monitoring to avoid unplanned downtime. Combine multiple data sources associated with a piece of equipment, apply predictive analytics to highlight possible problems, and use interpretive expertise to confirm the problem and identify a solution.
    – Take the next best action: offer an employee an incentive to stay, divert an insurance claim to a special investigating unit, or decide what to price energy at a particular time.
    – Reverse logistics: all operations related to the reuse of products and materials. It is "the process of planning, implementing, and controlling the efficient, cost effective flow of raw materials, in-process inventory, finished goods and related information from the point of consumption to the point of origin for the purpose of recapturing value or proper disposal".
    – Allocation management: allocate your resources (cash, people, etc.) in the most efficient and effective manner.
    Predictive Threat and Risk Analytics
    – Monitor your environment: include a wide variety of data across multiple sources.
    – Detect suspicious activity: identify threats, information breaches, crime and fraud.
    – Control outcomes: deliver the best response to reduce exposure or loss and maximize the impact of any action taken.
    Examples:
    – Eliminate insider risk: companies and government agencies must answer to stringent regulatory requirements and protect intellectual capital from competitors or subversive political entities. Consumers are typically most concerned with the potential for identity theft and other privacy violations.
    – Manage liquidity risk: predict and respond to future liquidity exposures and risks, run stress tests against those exposures and ultimately recommend counter-balancing capabilities.
    – Detect and prevent fraud: forecast what is likely to happen in the future, as well as detect events as they happen, for example by developing models to detect fraudulent phone or credit-card activity.
    – Minimise inventory loss: help retailers identify products, store conditions, personnel or customers who may be linked to stock shrinkage.
    – Assess network outages: help telecommunications companies or large retailers improve network asset management outcomes by predicting which network will fail next, and the impact this will have on operations and customer experience.
    – Protect national borders: help agencies identify which containers entering a port could contain unwanted or dangerous materials, which passengers on an airline should be investigated more thoroughly, or predict the risk level associated with vehicles at land crossings.
    – Keep communities safer (predictive policing): police departments and other public safety agencies can use predictive analytics to combine data from disparate sources (incident reports, crime tips, calls for service) and make the best use of the people and information at hand to monitor, measure and predict crime and crime trends. This includes identifying minor crimes that are likely to escalate into violence and deploying police resources based on “hot spots”.
    Other examples:
    – Retail analytics: predicting good and poor sales prospects.
    – Web analytics: predicting the next page browsed on a website.
    – Predictive analytics in healthcare: predicting whether a heart attack is likely to recur among those with cardiac disease, and predicting whether patients will react to a certain treatment or drug.
  • #6: Social network analysis
    – Group analysis: identify the groups in the data and who their leaders and followers are.
    – Diffusion analysis: use existing churn information to determine who the current churners are and what will influence them to leave.
    High-performance data mining
    – Enables the user to easily import, manage and analyze their data in a user-friendly graphical interface.
    – Allows the user to import from a variety of sources including databases (SQL Server, Oracle), flat files, IBM Cognos, Excel and XML.
    – Allows the user to easily construct graphs and charts, tables and cross-tabulations, build forecasting models and decision trees, perform a number of advanced statistical functions, and export them to the desired format.
    Benefits
    – Maximize analyst productivity: easy to learn; you don't have to be a programmer or an exceptionally skilled analyst.
    – Sophisticated automation drives rapid ROI: automatically create, evaluate and deploy predictive models; automate data preparation.
    – Performance and scalability: best-in-class database pushback for data transformations and data mining algorithms; multithreading, clustering and use of embedded algorithms.
  • #9: In fact, process is so important in the predictive analytics community that in 1996 several industry players created an industry-standard methodology called the Cross-Industry Standard Process for Data Mining (CRISP-DM). Although only 15% of our survey respondents follow CRISP-DM, it embodies a common-sense approach that is mirrored in other methodologies. (See Figure 7.) “Many people, including myself, adhere to CRISP-DM without knowing it,” says
  • #10: The general CRISP-DM process model includes six phases that address the main issues in data mining. The six phases fit together in a cyclical process. These six phases cover the full data mining process, including how to incorporate data mining into one’s larger business practices. These phases are listed in the figure below.
  • #12: Let’s explore these techniques further in our exercises. First, we will create an association model to learn what products are being purchased together. Second, we will create a segmentation (or clustering) model to put people with similar characteristics into groups. Third, we will create a classification model to predict customer churn. Then we are going to extend that model by adding unstructured data to our stream and look at its impact on the results.
  • #14: Segment Employees: Doers, Semi-skilled, Specialists, Criticals.
    Segmentation models divide the data into segments, or clusters, of records that have similar patterns of input fields. As they are only interested in the input fields, segmentation models have no concept of output or target fields. Examples of segmentation models are Kohonen networks, K-Means clustering, two-step clustering and anomaly detection.
    Segmentation models (also known as "clustering models") are useful in cases where the specific result is unknown (for example, when identifying new patterns of fraud, or when identifying groups of interest in your customer base). Clustering models focus on identifying groups of similar records and labeling the records according to the group to which they belong. This is done without the benefit of prior knowledge about the groups and their characteristics, and it distinguishes clustering models from the other modeling techniques in that there is no predefined output or target field for the model to predict.
    There are no right or wrong answers for these models. Their value is determined by their ability to capture interesting groupings in the data and provide useful descriptions of those groupings. Clustering models are often used to create clusters or segments that are then used as inputs in subsequent analyses (for example, by segmenting potential customers into homogeneous subgroups).
  • #15: Association models find patterns in your data where one or more entities (such as events, purchases, or attributes) are associated with one or more other entities. The models construct rule sets that define these relationships. Here the fields within the data can act as both inputs and targets. You could find these associations manually, but association rule algorithms do so much more quickly, and can explore more complex patterns. Apriori and Carma models are examples of the use of such algorithms. One other type of association model is a sequence detection model, which finds sequential patterns in time-structured data. Association models are most useful when predicting multiple outcomes—for example, customers who bought product X also bought Y and Z. Association models associate a particular conclusion (such as the decision to buy something) with a set of conditions. The advantage of association rule algorithms over the more standard decision tree algorithms (C5.0 and C&RT) is that associations can exist between any of the attributes. A decision tree algorithm will build rules with only a single conclusion, whereas association algorithms attempt to find many rules, each of which may have a different conclusion.
  • #16: The Auto Classifier node creates and compares a number of different models for binary outcomes (yes or no, churn or do not churn, and so on), allowing you to choose the best approach for a given analysis. A number of modeling algorithms are supported, making it possible to select the methods you want to use, the specific options for each, and the criteria for comparing the results. The node generates a set of models based on the specified options and ranks the best candidates according to the criteria you specify. The Auto Numeric node estimates and compares models for continuous numeric range outcomes using a number of different methods. The node works in the same manner as the Auto Classifier node, allowing you to choose the algorithms to use and to experiment with multiple combinations of options in a single modeling pass. Supported algorithms include neural networks, C&R Tree, CHAID, linear regression, generalized linear regression, and support vector machines (SVM). Models can be compared based on correlation, relative error, or number of variables used. The Auto Cluster node estimates and compares clustering models, which identify groups of records that have similar characteristics. The node works in the same manner as other automated modeling nodes, allowing you to experiment with multiple combinations of options in a single modeling pass. Models can be compared using basic measures with which to attempt to filter and rank the usefulness of the cluster models, and provide a measure based on the importance of particular fields. The best models are saved in a single composite model nugget, enabling you to browse and compare them, and to choose which models to use in scoring.
  • #21: The more data you have, the better your predictions can be. Most clients have traditional data stored in datamarts or operational data stores. This data includes basic behavioral data like orders and payments, but usually also includes descriptive data like demographics and other “dimensions” commonly found in BI environments. The high-value data we are seeing clients add into the mix is attitudinal data and interaction data. This non-traditional data includes the opinions and preferences of customers, often in the form of open-ended survey results or even feedback gathered in social media. Coupling this with real-time interaction data, like verbatim chat transcripts and web click-streams, hones the insights provided from the data significantly.