TDWI // tdwi.org © 2012 by TDWI
Q&A: Predictive Analytics Hot and
Getting Hotter
The increased interest in predictive analytics points to its power and possibilities, says
analyst Fern Halper. Vendors are eagerly jumping on board, and open source is becoming
increasingly important.
By Linda Briggs
Originally published in TDWI’s BI This Week newsletter, May 2, 2012.
“There has been such an uptick in predictive analytics ... even though the technology has been
around for decades,” observes analyst Fern Halper, with Hurwitz & Associates. Although she used
the technology at Bell Labs back in the ’80s, Halper says, businesses today are beginning to
understand the value of analyzing data in advanced ways—including the economic returns
possible. “The technology has become top of mind with a lot of companies,” she says.
Halper is a partner at the consulting, research, and analyst firm Hurwitz & Associates. She has
over 20 years of experience in data analysis, business analysis, and strategy development, and has
held key positions at AT&T Bell Laboratories and Lucent Technologies. At Bell Labs, Halper
spent eight years leading the development of approaches and systems to analyze marketing and
operational data.
She is also the author of numerous articles on data mining and information technology, and an
adjunct professor at Bentley College, where she teaches courses in information systems and
business. She blogs about data and analytics at http://fbhalper.wordpress.com/.
In this interview, part one of two, she talks about the growing interest—and changes—she sees in
the predictive analytics market.
BI This Week: What is your definition of predictive analytics? Is it the same as advanced
analytics?
Fern Halper: I usually define predictive analytics as an advanced analytics technique. At Hurwitz
& Associates, we define predictive analytics as a statistical or data mining solution consisting of
algorithms and techniques that can be used on both structured and unstructured data, together or
individually, to determine future outcomes. Predictive analytics can be deployed for prediction,
optimization, forecasting, simulation, and many other uses.
I was looking at a TDWI report on advanced analytics; your definition in some ways sounded
similar to my definition of advanced analytics. It’s text analytics. It is predictive analytics—it’s
just very advanced algorithms, and I have a different definition for it.
Explain the Hurwitz Victory Index project you recently completed.
The Victory Index is a new assessment tool that we developed at Hurwitz & Associates. It
analyzes vendors across four different dimensions: vision, viability, validity, and value.
What we’re trying to do with the index is take a holistic view of the value and benefit of predictive
analytics. We’re not just looking at the technology—the technical capabilities of the technology—
but also its ability to provide value to customers.
We used a weighted algorithm that has 40 different attributes across four different dimensions.
The first two dimensions, vision and viability, are all about the market perspective. The vision is
the strength of the company strategy, and the viability is the strength and vitality of the company.
The other two dimensions, validity and value, are more of a customer-product perspective.
Validity is the strength of the product that the company delivers to its customers; the value is the
advantage the technology provides. For validity and value, we use primarily data from customer
surveys about how customers feel about different vendors that are part of the Victory Index.
Viability and vision [rely more on] secondary sources.
We also used social media analysis as part of the Victory Index. We looked at what people were
saying about different products on blogs, tweets, and so forth. ... We used all different sources of
data and really tried to make the index as comprehensive as possible.
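As a rough illustration of the kind of weighted scoring Halper describes, the sketch below averages attribute scores within each of the four dimensions and then combines the dimension averages using weights. The equal weights, the 1-to-5 scale, the function name, and the sample vendor are all assumptions made for illustration; the interview does not disclose the actual attributes or weighting used in the Victory Index.

```python
from statistics import mean

# Hypothetical weights: the interview names the four dimensions
# but does not say how they are weighted.
DIMENSION_WEIGHTS = {
    "vision": 0.25,
    "viability": 0.25,
    "validity": 0.25,
    "value": 0.25,
}


def weighted_index_score(attribute_scores, weights=DIMENSION_WEIGHTS):
    """Average the attribute scores in each dimension, then take a
    weighted average of those dimension scores."""
    total_weight = sum(weights.values())
    weighted_sum = sum(
        weights[dim] * mean(attribute_scores[dim]) for dim in weights
    )
    return weighted_sum / total_weight


# Made-up vendor with two attributes per dimension (assumed 1-5 scale).
example_vendor = {
    "vision": [4.0, 3.0],
    "viability": [5.0, 4.0],
    "validity": [3.0, 3.0],
    "value": [4.0, 4.0],
}

score = weighted_index_score(example_vendor)
print(score)
```

Averaging within a dimension first keeps a dimension with many attributes from dominating the total, which is one plausible reason to score by dimension rather than summing all 40 attributes directly.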
Do clients come to you for a better understanding of how predictive analytics can add value
to their company and data?
Some companies are asking that, but what’s interesting is that there has been such an uptick in
predictive analytics. People are asking about it even though the technology has been around for
decades. I used it back in the ‘80s when I was at Bell Labs—all different types of predictive
analytics techniques.
Now, though, people are beginning to understand the value, and there are economic imperatives
around really understanding your customers. The technology has become top of mind with a lot of
companies.
The companies I’ve surveyed and talked to are looking at predictive analytics in terms of
advanced analytics. I also asked about things [such as] text analytics and analyzing data streams in
terms of big data analytics.
Trying to find patterns in data was one of the top use cases for advanced analytics, which is [very
much] about the predictive model. People are using predictive analytics to try to find patterns in
data.
The top two drivers are to remain competitive and to better understand customer behavior.
© 2012 by TDWI (The Data Warehousing Institute™), a division of 1105 Media, Inc. Visit tdwi.org.
That echoes what we’re seeing at TDWI, that predictive analytics is gaining tremendous
traction.
In the same study, I was very interested in understanding, as a side issue, companies’
understanding of predictive analytics. Who is actually using this technology? As a data monitor
from way back at Bell Labs, I wanted to understand who companies thought were actually going
to be using these products. Was it a statistician or mathematician, [someone who could] really
understand what this was all about? These models can be very complex, and if you don’t know
what you’re doing, you can really not know what you’re doing and come out with results that you
may think mean one thing but actually mean something else.
What did you find in terms of who is using predictive analytics?
Two things of interest there: One is that there’s definitely a shift by companies to have business
users work with these advanced technologies, predictive analytics included. The majority of users
of predictive analytics [before] would be mathematicians, statisticians, and quantitative types of
people. For companies planning to use it, many of the end users that they think are using the tools
are actual business analysts. They’re not necessarily trained statisticians.
However, of the companies I talked to for the Victory Index, 98 percent said you have to have
training on these tools. I completely agree with that.
Now, of course, there’s a thrust from vendors to make tools easier to use and to automate certain
functions, so you don’t have to be a statistician to build basic models—a business user could do
some of it.
Are these tools really becoming easy enough for business users to use effectively?
It’s interesting. The big vendors have tried to make their tools simple enough so that a business
user could actually have a set of data, and the tool automates some of the data preprocessing and
then suggests models based on the data—models that someone could actually use.
On the one hand, these tools are—in some sense—becoming easy enough that a business user
could use them. It depends on the user. I [worked in] predictive analytics for a long time, and I
was not a trained statistician. I was trained in a quantitative field, so I thought quantitatively and I
also understood the business.
I do think that they’re making the tools easy enough in some respects for business users, especially
marketers. However, I would recommend that users be trained. Even if it seems easy to use, you
could get yourself in trouble. One of the things that I was taught when I was doing data analysis
was, “Look at the data, look at the data, look at the data.” ... Don’t just start throwing a bunch of
algorithms at the data. What’s the data telling you? Explore the data in the first place, then
generate your hypotheses, and then run your analysis.
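The "look at the data" advice can be sketched as a quick profile of a single column before any model is run. This is only a minimal example: the `summarize` helper, the `amount` field, and the sample records are hypothetical, and a real exploration would go well beyond a handful of summary statistics.

```python
from statistics import mean, median


def summarize(rows, column):
    """Profile one numeric column: row count, missing values, range,
    mean, and median. Assumes at least one non-missing value."""
    values = [r[column] for r in rows if r.get(column) is not None]
    return {
        "n": len(rows),
        "missing": len(rows) - len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }


# Made-up records: one missing value and one unusually large amount.
sample = [
    {"amount": 120.0},
    {"amount": 80.0},
    {"amount": None},
    {"amount": 4000.0},
]

profile = summarize(sample, "amount")
print(profile)
```

Here the mean sits far above the median, the sort of signal (a possible outlier or data-entry error) worth understanding before forming hypotheses or fitting anything.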
A lot of attention is being paid to unstructured versus structured data these days. How is
predictive analytics being used on unstructured data? What kinds of issues come up?
One of the most popular use cases I’ve seen is insurance fraud. You have structured data for and
about an insurance claim, for example—that has the name of the person and the date of the
incident, and so forth. It’s really in the unstructured data, the actual verbiage of the claim, that you
could actually find some very useful types of information regarding potential fraud. Companies
are marrying the structured and the unstructured data and using text analytics to go through all the
claims data. They then pull out important themes, entities, and concepts [that might indicate
fraud], then link that with the structured data to get a better lift on the model.
Analytics is also being used in telecom, in customer care centers, in warranty analysis, for
example, to understand what problems customers are having.
There are cases where you can marry the structured and unstructured data together because you
have a common key, in some ways—a customer makes it easier to do that.
Then, of course, you have the whole area of unstructured data analysis in social media, which is
another place where companies are using analytics. They may not be able to marry unstructured
data together with structured data, but they’re using it to get deeper insights.
SAS, for example, has a way to take unstructured data and pull out entities, concepts, and different
aspects of insight that they can get from the unstructured data. They can then predict, for example,
what the buzz is going to be around a certain product. However, as they are the first to say,
this is not for the faint of heart.
You’re blending structured and unstructured data?
Yes, marrying the structured and the unstructured. You’re essentially making the unstructured
structured when you’re running text analytics over it. Say I have a bunch of text somewhere and
I’m running text analytics over it. I’m pulling out different pieces of information from it, then I’m
putting that together with my structured data.
In the insurance case, for example: Someone files a workman’s claim, and it turns out that they
were called by their supervisor four times and written up for not doing the job to the best of their
ability. There are things that you couldn’t get from the structured data. Put them together, run your
model, and get much better lift.
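A toy sketch of the "marrying" step might look like the following: a flag is derived from each claim's free text, joined to the structured record on a common key, and then checked for lift. Everything here is an assumption for illustration, including the `claim_id` key, the keyword list, and the sample claims; simple keyword matching stands in for real text analytics, which would extract themes, entities, and concepts.

```python
# Hypothetical structured claim records, keyed by claim_id.
claims = {
    101: {"claimant": "A", "fraud": True},
    102: {"claimant": "B", "fraud": False},
    103: {"claimant": "C", "fraud": False},
    104: {"claimant": "D", "fraud": True},
}

# Hypothetical unstructured claim notes, sharing the same key.
notes = {
    101: "Employee was written up by supervisor twice before the incident.",
    102: "Slipped on wet floor; coworker witnessed the fall.",
    103: "Routine strain injury reported the same day.",
    104: "Supervisor called four times about performance; written up.",
}

# Assumed phrases of interest; a naive stand-in for text analytics.
KEYWORDS = ("written up", "supervisor called")


def text_flag(note):
    """Return True if the note mentions any phrase of interest."""
    lowered = note.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def merge_on_key(structured, unstructured):
    """Attach the text-derived flag to each structured record."""
    merged = []
    for claim_id, record in structured.items():
        row = dict(record, claim_id=claim_id)
        row["text_flag"] = text_flag(unstructured.get(claim_id, ""))
        merged.append(row)
    return merged


def lift(rows):
    """Fraud rate among flagged claims divided by the overall fraud rate."""
    overall = sum(r["fraud"] for r in rows) / len(rows)
    flagged = [r for r in rows if r["text_flag"]]
    if not flagged or overall == 0:
        return None
    return (sum(r["fraud"] for r in flagged) / len(flagged)) / overall


rows = merge_on_key(claims, notes)
print(lift(rows))
```

In practice the text-derived fields would be fed into the predictive model as additional inputs alongside the structured ones; the lift calculation here only shows why a feature drawn from the claim text can be worth the effort.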
It’s still in the early stages, but I’ve heard people talking about analytics in financial services,
telco, various manufacturing industries, and even marketing.
You mentioned data analysis and social media. What is happening there?
That’s a good question. Basically, it’s a good example of what I’ve seen happening in predictive
analytics.
There are probably 250 social media analysis products out there. First, you have to separate the
listening posts that don’t really analyze anything. Then, there’s the social media analytics from
vendors that are much more sophisticated at using natural language processing, doing sentiment
analysis, and all of that.
Beyond that, there are a handful of companies that I would call pure-play social media analysis
companies. They do it really well. Then there are business analytics companies that also are doing
social media analysis, such as SAS and IBM.
SAS is really the only example I could point to at this point that is actually showing something
that’s predictive in terms of predicting buzz about a certain brand based on what’s happened in the
past. That’s one use case.
Going back to your Victory Index research, did you uncover anything surprising?
I’ve been living and breathing analytics for so long that there was nothing incredibly surprising.
But… the lengths to which companies are going to make predictive analytics easier to use were
interesting to me, because [all the vendors] were talking about it.
Also, we can’t finish the discussion without talking about how open source models are becoming
more prevalent—that was another interesting finding.
What did you find out about open source and predictive analytics?
Open source is becoming increasingly important because it enables a wide community to engage
in innovation, and it’s offered at academic institutions. [The open source predictive analytics
language] R, for example, is becoming really popular. What’s happening is that there is an
ecosystem of vendors sprouting around these open source solutions to make them easier to use.
R, for example, basically [requires] command-level sorts of input. It’s not intuitive, it’s not easy to
use, and it’s also not that scalable. Vendors are popping up—Revolution Analytics is one—to try
to make R easier to use. Even more established vendors are incorporating open source like R.
Vendors are wrapping around those platforms.
Linda L. Briggs writes about technology in corporate, education, and government markets. She is
based in San Diego.
About TDWI
TDWI, a division of 1105 Media, Inc., is the premier provider of in-depth, high-quality education and research
in the business intelligence and data warehousing industry. TDWI is dedicated to educating business and
information technology professionals about the best practices, strategies, techniques, and tools required to
successfully design, build, maintain, and enhance business intelligence and data warehousing solutions.
TDWI also fosters the advancement of business intelligence and data warehousing research and contributes
to knowledge transfer and the professional development of its members. TDWI offers a worldwide
membership program, five major educational conferences, topical educational seminars, role-based training,
onsite courses, certification, solution provider partnerships, an awards program for best practices, live
Webinars, resourceful publications, an in-depth research program, and a comprehensive Web site, tdwi.org.

Predictive analytics: hot and getting hotter

  • 1. TDWI // tdwi.org © 2012 by TDWI 1 Q&A: Predictive Analytics Hot and Getting Hotter The increased interest in predictive analytics points to its power and possibilities, says analyst Fern Halper. Vendors are eagerly jumping on board, and open source is becoming increasingly important. By Linda Briggs Originally published in TDWI’s BI This Week newsletter, May 2, 2012. “There has been such an uptick in predictive analytics ... even though the technology has been around for decades,” observes analyst Fern Halper, with Hurwitz & Associates. Although she used the technology at Bell Labs back in the 80s, Halper says, businesses today are beginning to understand the value of analyzing data in advanced ways—including the economic returns possible. “The technology has become top of mind with a lot of companies,” she says. Halper is a partner at the consulting, research, and analyst firm Hurwitz & Associates. She has over 20 years of experience in data analysis, business analysis, and strategy development, and has held key positions at AT&T Bell Laboratories and Lucent Technologies. At Bell Labs, Halper spent eight years leading the development of approaches and systems to analyze marketing and operational data. She is also the author of numerous articles on data mining and information technology, and an adjunct professor at Bentley College, where she teaches courses in information systems and business. She blogs about data and analytics at http://fbhalper.wordpress.com/. In this interview, part one of two, she talks about the growing interest—and changes—she sees in the predictive analytics market. BI This Week: What is your definition of predictive analytics? Is it the same as advanced analytics? Predictive analytics can be deployed for prediction, optimization, forecasting, simulation, and many other uses. Fern Halper: I usually define predictive analytics as an advanced analytics technique. 
At Hurwitz and Associates, we define predictive analytics as a statistical or data mining solution consisting of algorithms and techniques that can be used on both structured and unstructured data, together or individually, to determine future outcomes. Predictive analytics can be deployed for prediction, optimization, forecasting, simulation, and many other uses.
  • 2. TDWI // tdwi.org Q&A: Predictive Analytics Hot and Getting Hotter I was looking at a TDWI report on advanced analytics; your definition in some ways sounded similar to my definition of advanced analytics. It’s text analytics. It is predictive analytics—it’s just very advanced algorithms, and I have a different definition for it. Explain the Hurwitz Victory Index project you recently completed. The Victory Index is a new assessment tool that we developed at Hurwitz & Associates. It analyzes vendors across four different dimensions: vision, viability, validity, and value. What we’re trying to do with the index is take a holistic view of the value and benefit of predictive analytics. What we’re trying to do with the index is take a holistic view of the value and benefit of predictive analytics. We’re not just looking at the technology—the technical capabilities of the technology— but also its ability to provide value to customers. We used a weighted algorithm that has 40 different attributes across four different dimensions. The first two dimensions, vision and viability, are all about the market perspective. The vision is the strength of the company strategy, and the viability is the strength and vitality of the company. The other two dimensions, validity and value, are more of a customer-product perspective. Validity is the strength of the product that the company delivers to its customers; the value is the advantage the technology provides. For validity and value, we use primarily data from customer surveys about how customers feel about different vendors that are part of the Victory Index. Viability and vision [rely more on] secondary sources. We also used social media analysis as part of the Victory Index. We looked at what people were saying about different products on blogs, tweets, and so forth. ... We used all different sources of data and really tried to make the index as comprehensive as possible. 
Do clients come to you for a better understanding of how predictive analytics can add value to their company and data? Some companies are asking that, but what’s interesting is that there has been such an uptick in predictive analytics. People are asking about it even though the technology has been around for decades. I used it back in the ‘80s when I was at Bell Labs—all different types of predictive analytics techniques. There are economic imperatives around really understanding your customers. Now, though, people are beginning to understand the value, and there are economic imperatives around really understanding your customers. The technology has become top of mind with a lot of companies. The companies I’ve surveyed and talked to are looking at predictive analytics in terms of advanced analytics. I also asked about things [such as] text analytics and analyzing data streams in terms of big data analytics. Trying to find patterns in data was one of the top use cases for advanced analytics, which is [very much] about the predictive model. People are using predictive analytics to try to find patterns in data. The top two drivers are to remain competitive and to better understand customer behavior. © 2012 by TDWI (The Data Warehousing InstituteTM), a division of 1105 Media, Inc. Visit tdwi.org. 2
  • 3. TDWI // tdwi.org Q&A: Predictive Analytics Hot and Getting Hotter That echoes what we’re seeing at TDWI, that predictive analytics is gaining tremendous traction. In the same study, I was very interested in understanding, as a side issue, companies’ understanding of predictive analytics. Who is actually using this technology? As a data monitor from way back at Bell Labs, I wanted to understand who companies thought were actually going to be using these products. Was it a statistician or mathematician, [someone who could] really understand what this was all about? These models can be very complex, and if you don’t know what you’re doing, you can really not know what you’re doing and come out with results that you may think mean one thing but actually mean something else. What did you find in terms of who is using predictive analytics? Two things of interest there: One is that there’s definitely a shift by companies to have business users work with these advanced technologies, predictive analytics included. The majority of users of predictive analytics [before] would be mathematicians, statisticians, and quantitative types of people. For companies planning to use it, many of the end users that they think are using the tools are actual business analysts. They’re not necessarily trained statisticians. However, of the companies I talked to for the Victory Index, 98 percent said you have to have training on these tools. I completely agree with that. There’s a thrust from vendors to make tools easier to use, so you don’t have to be a statistician to build basic models. Now, of course, there’s a thrust from vendors to make tools easier to use and to automate certain functions, so you don’t have to be a statistician to build basic models—a business user could do some of it. Are these tools really becoming easy enough for business users to use effectively? It’s interesting. 
The big vendors have tried to make their tools simple enough so that a business user could actually have a set of data, and the tool automates some of the data preprocessing and then suggests models based on the data—models that someone could actually use. On the one hand, these tools are—in some sense—becoming easy enough that a business user could use them. It depends on the user. I [worked in] predictive analytics for a long time, and I was not a trained statistician. I was trained in a quantitative field, so I thought quantitatively and I also understood the business. I do think that they’re making the tools easy enough in some respects for business users, especially marketers. However, I would recommend that users be trained. Even if it seems easy to use, you could get yourself in trouble. One of the things that I was taught when I was doing data analysis was, “Look at the data, look at the data, look at the data.” ... Don’t just start throwing a bunch of algorithms at the data. What’s the data telling you? Explore the data in the first place, then generate your hypotheses, and then run your analysis. A lot of attention is being paid to unstructured versus structured data these days. How is predictive analytics being used on unstructured data? What kinds of issues come up? One of the most popular use cases I’ve seen is insurance fraud. You have structured data for and about an insurance claim, for example—that has the name of the person and the date of the incident, and so forth. It’s really in the unstructured data, the actual verbiage of the claim, that you © 2012 by TDWI (The Data Warehousing InstituteTM), a division of 1105 Media, Inc. Visit tdwi.org. 3
  • 4. could actually find some very useful types of information regarding potential fraud. Companies are marrying the structured and the unstructured data and using text analytics to go through all the claims data. They then pull out important themes, entities, and concepts [that might indicate fraud], then link that with the structured data to get a better lift on the model.

Analytics is also being used in telecom, in customer care centers, and in warranty analysis, for example, to understand what problems customers are having. There are cases where you can marry the structured and unstructured data together because you have a common key, in some ways; a customer makes it easier to do that.

Then, of course, you have the whole area of unstructured data analysis in social media, which is another place where companies are using analytics. They may not be able to marry unstructured data together with structured data, but they’re using it to get deeper insights. SAS, for example, has a way to take unstructured data and pull out entities, concepts, and different aspects of insight that they can get from the unstructured data. They can then predict, for example, what the buzz is going to be around a certain product. However, as they are the first to say, this is not for the faint of heart.

You’re blending structured and unstructured data?

Yes, marrying the structured and the unstructured. You’re essentially making the unstructured structured when you’re running text analytics over it. Say I have a bunch of text somewhere and I’m running text analytics over it. I’m pulling out different pieces of information from it, then I’m putting that together with my structured data.
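To make the idea of "making the unstructured structured" and joining on a common key concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the claim records, the notes, the theme names, and the keyword patterns are made up, and simple keyword matching is only a toy stand-in for real text analytics, not a description of how any vendor's product works.

```python
import re

# Hypothetical structured claim records, keyed by claim_id (made-up data).
structured_claims = {
    "C1": {"claimant": "A. Smith", "amount": 12000},
    "C2": {"claimant": "B. Jones", "amount": 3000},
    "C3": {"claimant": "C. Lee", "amount": 8000},
}

# Hypothetical free-text claim notes, keyed by the same claim_id.
# C3 deliberately has no notes.
claim_notes = {
    "C1": "Claimant was written up by supervisor twice before the incident.",
    "C2": "Slipped on wet floor; coworker witnessed the fall.",
}

# Toy stand-in for text analytics: keyword patterns mapped to theme names.
THEMES = {
    "disciplinary_history": re.compile(r"written up|reprimand", re.IGNORECASE),
    "has_witness": re.compile(r"witness", re.IGNORECASE),
}


def extract_themes(text):
    """Turn free text into structured 0/1 flags, one per theme."""
    return {name: int(bool(pattern.search(text))) for name, pattern in THEMES.items()}


def marry(structured, notes):
    """Join structured fields and text-derived flags on the shared key."""
    rows = []
    for claim_id, fields in structured.items():
        # Claims without notes get all-zero flags.
        flags = extract_themes(notes.get(claim_id, ""))
        rows.append({"claim_id": claim_id, **fields, **flags})
    return rows


model_rows = marry(structured_claims, claim_notes)
for row in model_rows:
    print(row)
```

Each resulting row combines the original structured fields with the text-derived flags, so it could be fed to whatever model is being built. The join only works here because both sources share the `claim_id` key, which is the point made above about needing a common key.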
In the insurance case, for example: Someone files a workman’s claim, and it turns out that they were called by their supervisor four times and written up for not doing the job to the best of their ability. There are things that you couldn’t get from the structured data. Put them together, run your model, and get much better lift.

It’s still in the early stages, but I’ve heard people talking about analytics in financial services, telco, various manufacturing industries, and even marketing.

You mentioned data analysis and social media. What is happening there?

That’s a good question. Basically, it’s a good example of what I’ve seen happening in predictive analytics. There are probably 250 social media analysis products out there. First, you have to separate the listening posts that don’t really analyze anything. Then, there’s the social media analytics from vendors that are much more sophisticated at using natural language processing, doing sentiment analysis, and all of that. Beyond that, there are a handful of companies that I would call pure-play social media analysis companies. They do it really well. Then there are business analytics companies that also are doing social media analysis, such as SAS and IBM.
  • 5. SAS is really the only example I could point to at this point that is actually showing something that’s predictive in terms of predicting buzz about a certain brand based on what’s happened in the past. That’s one use case.

Going back to your Victory Index research, did you uncover anything surprising?

I’ve been living and breathing analytics for so long that there was nothing incredibly surprising. But… the lengths to which companies are going to make predictive analytics easier to use were interesting to me, because [all the vendors] were talking about it. Also, we can’t finish the discussion without talking about how open source models are becoming more prevalent; that was another interesting finding.

What did you find out about open source and predictive analytics?

Open source is becoming increasingly important because it enables a wide community to engage in innovation, and it’s offered at academic institutions. [The open source predictive analytics language] R, for example, is becoming really popular.

What’s happening is that there is an ecosystem of vendors sprouting around these open source solutions to make them easier to use. R, for example, basically [requires] command-level sorts of input. It’s not intuitive, it’s not easy to use, and it’s also not that scalable. Vendors are popping up (Revolution Analytics is one) to try to make R easier to use. Even more established vendors are incorporating open source like R. Vendors are wrapping around those platforms.

Linda L. Briggs writes about technology in corporate, education, and government markets. She is based in San Diego.
About TDWI

TDWI, a division of 1105 Media, Inc., is the premier provider of in-depth, high-quality education and research in the business intelligence and data warehousing industry. TDWI is dedicated to educating business and information technology professionals about the best practices, strategies, techniques, and tools required to successfully design, build, maintain, and enhance business intelligence and data warehousing solutions. TDWI also fosters the advancement of business intelligence and data warehousing research and contributes to knowledge transfer and the professional development of its members. TDWI offers a worldwide membership program, five major educational conferences, topical educational seminars, role-based training, onsite courses, certification, solution provider partnerships, an awards program for best practices, live Webinars, resourceful publications, an in-depth research program, and a comprehensive Web site, tdwi.org.

© 2012 by TDWI (The Data Warehousing Institute™), a division of 1105 Media, Inc. Visit tdwi.org.