F I N A N C I A L
Advisors
THE BUSINESS ANALYST
AND BEYOND
DATA MINING
BUSINESS ANALYST
Business Analyst and Data Mining
Understanding what data mining does, and the advantages of the different
data mining tools and techniques, helps the Business Analyst recommend
improvements to business processes and uncover efficiency leakages within
the existing business.
Most Business Analysts are aware of data mining and what it can do for an
organization, for example:
• reduce the cost of acquiring data and
• improve the success rate of discovering oil and gas.
However, whether you are a beginner or a seasoned Business Analyst, you
need to understand what data mining does, which data mining tools and
techniques are available to improve the audit activities a Business Analyst
must perform, and how data mining can help improve business operations
across the board.
DATA MINING
IMPROVE
• BUSINESS PROCESSES
• AUDIT CAPABILITY
IMPORTANCE OF DATA MINING
KNOWLEDGE DISCOVERY AND DATA MINING
Data Mining is often considered to be a blend of:
• statistics,
• AI (artificial intelligence), and
• database research.
It was not commonly recognized as a field of interest for statisticians, and was
even considered by some "a dirty word" in statistics. Due to its applied importance,
however, the field has emerged as a rapidly growing and major area (also in statistics)
where important theoretical advances are being made.
GOAL OF DATA MINING
ABILITY TO PREDICT
Data Mining is an analytic process designed to explore usually large amounts of data
(typically business or market related, and often called "big data") in search of consistent
patterns and/or systematic relationships between variables, and then to validate the
findings by applying the detected patterns to new subsets of data.
The ultimate goal of data mining is prediction. Predictive data mining is the most
common type of data mining and the one with the most direct business applications.
DATA MINING
A 3 STAGE PROCESS
The process of data mining
consists of three stages:
(1) the initial exploration,
(2) model building or pattern
identification with validation/
verification, and
(3) deployment (i.e., the application
of the model to new data in
order to generate predictions).
DATA MINING
WHAT IS IT?
Stage 1: Exploration.
This stage usually starts with data preparation, which may involve cleaning data, data
transformations, selecting subsets of records and - in the case of data sets with large
numbers of variables ("fields") - performing some preliminary feature selection
operations to bring the number of variables to a manageable range (depending on the
statistical methods that are being considered). Then, depending on the nature of the
analytic problem, this first stage of the process of data mining may involve anything
from a simple choice of straightforward predictors for a regression model to
elaborate exploratory analyses using a wide variety of graphical and statistical
methods (Exploratory Data Analysis, EDA) in order to identify the most relevant
variables and determine the complexity and/or the general nature of models that can
be taken into account in the next stage.
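As a rough illustration of the exploration stage, the sketch below cleans a small set of records and then performs a preliminary feature selection by ranking candidate variables on the absolute value of their Pearson correlation with the target. The record layout, the field names (porosity, depth, crew_size, flow) and the choice of correlation as the ranking criterion are all assumptions made for this example; a real project may well use other criteria.

```python
from math import sqrt

def clean(records, fields):
    """Keep only records where every listed field holds a numeric value."""
    return [r for r in records
            if all(isinstance(r.get(f), (int, float)) for f in fields)]

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def select_features(records, candidates, target, keep):
    """Rank candidate fields by |correlation| with the target; keep the top ones."""
    ys = [r[target] for r in records]
    scored = [(abs(pearson([r[f] for r in records], ys)), f) for f in candidates]
    scored.sort(reverse=True)
    return [f for _, f in scored[:keep]]

# Hypothetical well records: names and values are invented for illustration.
raw = [
    {"porosity": 0.10, "depth": 2100, "crew_size": 7, "flow": 110},
    {"porosity": 0.15, "depth": 2300, "crew_size": 5, "flow": 160},
    {"porosity": None, "depth": 2500, "crew_size": 6, "flow": 150},  # removed by clean()
    {"porosity": 0.20, "depth": 1900, "crew_size": 6, "flow": 205},
    {"porosity": 0.25, "depth": 2600, "crew_size": 7, "flow": 260},
    {"porosity": 0.30, "depth": 2200, "crew_size": 5, "flow": 300},
]
fields = ["porosity", "depth", "crew_size", "flow"]

cleaned = clean(raw, fields)
selected = select_features(cleaned, ["porosity", "depth", "crew_size"], "flow", keep=1)
print(len(cleaned), selected)
```

In this toy data set the record with the missing porosity is dropped, and porosity is the variable retained for the model-building stage.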
DATA MINING
WHAT IS IT?
Stage 2: Model building and validation.
This stage involves considering various models and choosing the best one based on
their predictive performance (i.e., explaining the variability in question and producing
stable results across samples). This may sound like a simple operation, but in fact, it
sometimes involves a very elaborate process. There are a variety of techniques
developed to achieve that goal - many of which are based on so-called "competitive
evaluation of models," that is, applying different models to the same data set and then
comparing their performance to choose the best.
These techniques - which are often considered the core of predictive data mining -
include:
• bagging (voting, averaging),
• boosting,
• stacking (stacked generalizations), and
• meta-learning.
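The "competitive evaluation of models" described above, together with bagging by averaging, can be sketched in a few lines. The example below is only a minimal illustration: the synthetic data, the three candidate models, the 30/10 train/holdout split and the use of mean squared error as the performance measure are assumptions for this sketch, not part of the original material. Boosting, stacking and meta-learning are not shown.

```python
import random

def fit_mean(train):
    """Baseline model: always predict the mean of the training targets."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_line(train):
    """Least-squares straight line y = a + b*x."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    sxx = sum((x - mx) ** 2 for x, _ in train)
    b = sum((x - mx) * (y - my) for x, y in train) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x

def fit_bagged(train, fit, rounds=25, rng=None):
    """Bagging: fit one model per bootstrap resample, then average predictions."""
    rng = rng or random.Random(0)
    models = [fit([rng.choice(train) for _ in train]) for _ in range(rounds)]
    return lambda x: sum(m(x) for m in models) / len(models)

def mse(model, data):
    """Mean squared prediction error on a data set."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

# Synthetic data (an assumption for illustration): y = 3x + 2 plus noise.
rng = random.Random(42)
data = [(x, 3 * x + 2 + rng.gauss(0, 1)) for x in range(40)]
rng.shuffle(data)
train, holdout = data[:30], data[30:]

# Competitive evaluation: same training data, same holdout, best score wins.
candidates = {
    "mean": fit_mean(train),
    "line": fit_line(train),
    "bagged_line": fit_bagged(train, fit_line, rng=rng),
}
scores = {name: mse(model, holdout) for name, model in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

Scoring every candidate on records that were held back from training is what makes the comparison meaningful: it checks that results are stable across samples rather than merely fitted to one sample.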
DATA MINING
WHAT IS IT?
Stage 3: Deployment.
This final stage involves taking the model selected as best in the previous stage and
applying it to new data in order to generate predictions or estimates of the expected
outcome.
The concept of Data Mining is becoming increasingly popular as a business
information management tool, where it is expected to reveal knowledge structures that
can guide decisions in conditions of limited certainty. Recently, there has been
increased interest in developing new analytic techniques specifically designed to
address the issues relevant to business Data Mining (e.g., Classification Trees), but
Data Mining is still based on the conceptual principles of statistics, including the
traditional Exploratory Data Analysis (EDA) and modeling, and it shares with them
both some components of its general approaches and specific techniques.
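To make the deployment step concrete, here is a minimal sketch of storing a selected model and applying it to new records. It assumes, purely for illustration, that the chosen model is a linear one whose coefficients can be serialized as JSON; the field names, coefficient values and well identifiers are hypothetical.

```python
import json

# Coefficients of a linear model chosen in the previous stage.
# Field names and numbers are hypothetical, for illustration only.
selected_model = {"intercept": 10.0,
                  "weights": {"porosity": 900.0, "thickness": 2.0}}

def save_model(model):
    """Serialize the model so a scoring system can load it later."""
    return json.dumps(model, sort_keys=True)

def load_model(text):
    """Restore a model previously written by save_model()."""
    return json.loads(text)

def predict(model, record):
    """Apply the stored linear model to one new record."""
    return model["intercept"] + sum(
        weight * record[name] for name, weight in model["weights"].items()
    )

stored = save_model(selected_model)   # done once, after model selection
deployed = load_model(stored)         # done wherever predictions are needed

# New data that the model has never seen.
new_records = [
    {"well": "A-1", "porosity": 0.10, "thickness": 15.0},
    {"well": "A-2", "porosity": 0.20, "thickness": 5.0},
]
predictions = {r["well"]: predict(deployed, r) for r in new_records}
print(predictions)
```

Separating the stored model from the scoring code means the same prediction routine can keep running while the model itself is replaced after a later round of model building.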
DATA MINING
WHAT IS IT?
Stage 3: Deployment.
However, an important general difference in focus and purpose between Data
Mining and the traditional Exploratory Data Analysis (EDA) is that Data Mining is
more oriented towards applications than towards the basic nature of the underlying
phenomena. In other words, Data Mining is relatively less concerned with identifying
the specific relations between the involved variables. For example, uncovering the
nature of the underlying functions or the specific types of interactive, multivariate
dependencies between variables is not the main goal of Data Mining. Instead, the
focus is on producing a solution that can generate useful predictions. Therefore, Data
Mining accepts, among other things, a "black box" approach to data exploration or
knowledge discovery and uses not only the traditional Exploratory Data Analysis
(EDA) techniques, but also such techniques as Neural Networks, which can generate
valid predictions but are not capable of identifying the specific nature of the
interrelations between the variables on which the predictions are based.
DATA MINING
WHAT IS IT?
Data mining automates the detection of relevant patterns in a database, using defined
approaches and algorithms to examine current and historical data, which can then be
analyzed to predict future trends.
Because data mining tools predict future trends and behaviors by searching
databases for hidden patterns, they allow organizations to make proactive,
knowledge-driven decisions and answer questions that were previously too
time-consuming to resolve manually.
Changes in data mining techniques have enabled organizations to collect, analyze,
and access data in new ways. The first change occurred in the area of basic data
collection. As companies started collecting and saving basic data in computers, they
were able to start answering detailed questions more quickly and easily.
AMOUNT OF DATA
BEING ANALYZED
The vast amounts of data that companies collect require data mining
programs to investigate data trends and process large volumes of data
quickly.
Workers determine the outcome of the data analysis through the parameters
they choose, which is how the analysis adds value to business strategies and initiatives.
It is important to note that without these parameters, the data mining program
will generate all permutations or combinations irrespective of their relevance.
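The effect of a parameter on the output can be shown with a small co-occurrence count. In the sketch below, the function reports every pair of items that appears together at least once unless a minimum support threshold is supplied. The threshold name min_support, the transaction data and the choice of pair counting as the mining task are assumptions made for this illustration.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(transactions, min_support=0.0):
    """Return item pairs whose share of transactions is at least min_support."""
    counts = Counter()
    for items in transactions:
        # Count each unordered pair once per transaction.
        counts.update(combinations(sorted(set(items)), 2))
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

# Hypothetical transactions, invented for illustration.
transactions = [
    ["loan", "insurance", "card"],
    ["loan", "insurance"],
    ["loan", "card"],
    ["insurance", "savings"],
    ["loan", "insurance", "savings"],
]

everything = frequent_pairs(transactions)                 # no parameter: all observed pairs
relevant = frequent_pairs(transactions, min_support=0.5)  # only well-supported pairs
print(len(everything), relevant)
```

With five products the unfiltered list is still short, but the number of possible combinations grows rapidly with the number of items, which is why the choice of parameters largely decides whether the output is usable.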
IDENTIFY KEY PARAMETERS
TRANSFORM DATA INTO VALUABLE INFORMATION
The Business Analyst needs to pay attention to the selection of key parameters.
Data mining programs cannot tell the difference between a
relevant and an irrelevant data correlation.
The Business Analyst needs to review the results of mining exercises to ensure
that they provide the needed information from the data.
For example, knowing that people who default on loans usually give a false
address might be relevant, whereas knowing they have blonde hair might be
irrelevant.
The Business Analyst should also monitor whether sensible and rational decisions
are made on the basis of data mining exercises, especially where the results of such
exercises are used as input for other processes or systems in the workplace.
SECURITY ASPECTS
YOUR BUSINESS ADVANTAGE OF INTEREST TO OTHERS
The Business Analyst needs to consider the different security aspects of data
mining programs and processes.
A data mining exercise might reveal important information that could be
exploited by an outsider who hacks into the organization's computer system
and captures information from a data mining tool.
DATA MINING
Importance of Data Mining
Using data mining to understand and extrapolate data and
information can improve the chances of success in E&P programs,
improve audit reactions to potential business changes, and ensure
that risks are managed in a more timely and proactive fashion.
Business Analysts can use data mining tools to model "what-if"
situations and demonstrate real and probable effects to
management, such as combining real-world and business
information to show the effects of a security breach or the impact
of drilling dry holes or missing a production opportunity.
If data mining can be used by one part of the organization to
influence business direction for profit, why can't Business Analysts
use the same tools and techniques to reduce risks and increase audit
benefits?
REDUCING RISKS AND INCREASING AUDIT BENEFITS
IMPROVE
• BUSINESS PROCESSES
• AUDIT CAPABILITY
SPATIAL DATA MINING
Spatial understanding
Spatial data mining is the application of data mining techniques to
spatial data. It uses the same functions as general data mining,
with the end objective of revealing and creating knowledge about
geographic trends and patterns.
Spatial data mining is the process of discovering interesting and
previously unknown, but potentially useful, patterns from large
spatial datasets. Extracting interesting and useful patterns from
spatial datasets is more difficult than extracting the corresponding
patterns from traditional numeric and categorical data due to the
complexity of spatial data types, spatial relationships, and spatial
autocorrelation.
Spatial data mining can be carried out through location prediction, spatial
outlier detection, and co-location data mining.
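Of the three tasks just listed, spatial outlier detection is the easiest to sketch. The example below flags a location whose measured value differs strongly from the average of its nearest neighbours. The neighbourhood size k, the threshold of two standard deviations, and the 3 x 3 grid of hypothetical measurements are assumptions for this illustration; it is one simple variant, not a reference method.

```python
from math import hypot

def spatial_outliers(points, k=3, threshold=2.0):
    """Flag points whose value differs strongly from their k nearest neighbours.

    points: list of (x, y, value) tuples. A point is flagged when its difference
    from the neighbourhood mean lies more than `threshold` standard deviations
    away from the average difference over all points.
    """
    diffs = []
    for i, (x, y, value) in enumerate(points):
        others = [p for j, p in enumerate(points) if j != i]
        nearest = sorted(others, key=lambda p: hypot(x - p[0], y - p[1]))[:k]
        neighbour_mean = sum(p[2] for p in nearest) / len(nearest)
        diffs.append(value - neighbour_mean)
    mean_d = sum(diffs) / len(diffs)
    sd = (sum((d - mean_d) ** 2 for d in diffs) / len(diffs)) ** 0.5
    if sd == 0:
        return []  # every location agrees with its neighbourhood
    return [points[i] for i, d in enumerate(diffs)
            if abs(d - mean_d) / sd > threshold]

# Hypothetical measurements on a 3 x 3 grid; only the centre value is unusual.
wells = [(x, y, 50.0 if (x, y) == (1, 1) else 10.0)
         for x in range(3) for y in range(3)]
outliers = spatial_outliers(wells, k=3, threshold=2.0)
print(outliers)
```

The comparison is made against spatial neighbours rather than against the whole data set, which is what distinguishes a spatial outlier from an ordinary one: a value can be unremarkable overall and still be unusual for its location.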
ADD GEOGRAPHIC DISTRIBUTION
GEOGRAPHIC DATA MINING
The goal is to discover oil and gas at minimum risk and cost
Techniques for spatial data warehousing (SDW), spatial data mining
(SDM), and spatial visualization (SVis) have been developed.
In addition, there has been a rise in the use of knowledge discovery
techniques due to the increasing collection and storage of data on
spatiotemporal processes and mobile objects.
Geographic data mining and knowledge discovery is a young discipline
with many challenging research problems. This area represents an
important direction in the development of a new generation of spatial
analysis tools for data-rich environments. It is important to motivate
researchers to develop new methods and applications in this emerging
field.
KNOWLEDGE DISCOVERY
KNOWLEDGE DISCOVERY FROM DATABASE
Create value added to business
Knowledge discovery from databases (KDD) is a response to the enormous
volumes of data being collected and stored in operational and scientific
databases. Continuing improvements in information technology (IT) and its
widespread adoption for process monitoring and control in many domains are creating
a wealth of new data. There is often much more information in these
databases than the “shallow” information being extracted by traditional
analytical and query techniques. KDD leverages investments in IT by
searching for deeply hidden information that can be turned into
knowledge for strategic decision-making and for answering fundamental
business questions. KDD is better known through the more popular
term “data mining.” However, data mining is only one component (albeit
a central component) of the larger KDD process. Data mining involves
distilling data into information or facts about the mini-world described
by the database. KDD is the higher-level process of obtaining
information through data mining and distilling this information into
knowledge (ideas and beliefs about the mini-world) through
interpretation of information and integration with existing knowledge.
KDD
DATA MINING
TECHNIQUES
SPATIAL KNOWLEDGE DISCOVERY
Create value added to business
Spatial knowledge discovery (SKD) is a very important special case of
KDD.
It draws on spatial data mining techniques and on the relationship between SKD and
spatial visualization (SVis), an increasingly active research domain
integrating scientific visualization and cartography.
Seeking patterns and trends in hydrocarbon systems (such as porosities
and other geological parameters over space and time) benefits from
projecting the data into an information space whose spatial dimensions
are non-metric.
SKD
SKD PORTALS
CONNECTIVITY
• Customer Portals provide richer information, from real-time service status to
estimated restoration times
• Analytics identify failure trends so you can more effectively target your
investment and corrective measures
• Field Operations connect crews with comprehensive information and tools
for faster fault location, restoration and updating of your enterprise records
• Task-oriented Solutions create new workflow-specific capabilities that
leverage spatial information and processes
SPATIAL BUSINESS INTELLIGENCE
PROVIDE THE NECESSARY INSIGHT FROM THE DATA
Being able to synthesize data into information and present it in an
understandable manner is important for business. Business intelligence (BI)
environments must be able to visualize spatial data, and they depend on
the following two elements:
Visualization. Interrelationships in spatial data are often readily understood
when visually presented. Regardless of how abstract or complex the data, effective
visualization exposes the insight in the information.
4W. Standard BI systems handle the Who, What and When, but the Where is
vastly underexploited as a way to reveal trends or outliers.
Spatial BI transforms data into human understanding and actionable insight.
WE THINK VISUALLY.
WE NEED TO SEE OUR LOGIC.
SPATIAL BUSINESS INTELLIGENCE
MI + BI = SBI (MAP INTELLIGENCE + BUSINESS INTELLIGENCE = SPATIAL BUSINESS INTELLIGENCE)

More Related Content

PPTX
Regression and correlation
PDF
Data Analytics and Big Data on IoT
PPTX
Introduction to Business Data Analytics
PDF
SAS/MIT/Sloan Data Analytics
PPTX
Data analysis
PPTX
Business analytics and data mining
PDF
Introduction to data analytics
PDF
Data Analytics For Beginners | Introduction To Data Analytics | Data Analytic...
Regression and correlation
Data Analytics and Big Data on IoT
Introduction to Business Data Analytics
SAS/MIT/Sloan Data Analytics
Data analysis
Business analytics and data mining
Introduction to data analytics
Data Analytics For Beginners | Introduction To Data Analytics | Data Analytic...

What's hot (15)

PPT
Lobsters, Wine and Market Research
PPTX
Data analytics
PPTX
Data mining (prefinals)
PDF
ForresterPredictiveWave
PDF
Foundational Methodology for Data Science
PDF
Predictive Modelling
PPTX
What is Data analytics and it's importance ?
PDF
Big data overview
PDF
PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014
PDF
Data Mining for Big Data-Murat Yazıcı
PPTX
Classes of Model
PDF
Business Analytics and Optimization Introduction (part 2)
DOCX
Predictive analytics - The cure for business myopia
PPT
Data analysis for effective decision making
PDF
xv-whitepaper-workforce
Lobsters, Wine and Market Research
Data analytics
Data mining (prefinals)
ForresterPredictiveWave
Foundational Methodology for Data Science
Predictive Modelling
What is Data analytics and it's importance ?
Big data overview
PoT - probeer de mogelijkheden van datamining zelf uit 30-10-2014
Data Mining for Big Data-Murat Yazıcı
Classes of Model
Business Analytics and Optimization Introduction (part 2)
Predictive analytics - The cure for business myopia
Data analysis for effective decision making
xv-whitepaper-workforce
Ad

Similar to data analysis-mining (20)

PPTX
Moh.Abd-Ellatif_DataAnalysis1.pptx
PDF
This is where data analytics enters as a critical field.pdf
PDF
Top 30 Data Analyst Interview Questions.pdf
PPTX
Big Data Analytics information And Tools
PDF
Data Analysis Methods 101 - Turning Raw Data Into Actionable Insights
PDF
Data Analyst Interview Questions & Answers
PPTX
Unit 1 pptx.pptx
PDF
What is Data Mining? Key Concepts Explained
PPTX
Chapter 3: Data Analysis or Interpretation of Data
PPTX
Data Mining for Business Analytics in PGCM
PDF
Study of Data Mining Methods and its Applications
PDF
How To Transform Your Analytics Maturity Model Levels, Technologies, and Appl...
PDF
Machine Learning for Business - Eight Best Practices for Getting Started
PDF
what is ..how to process types and methods involved in data analysis
PPTX
Data mining
PPTX
Data mining
PPTX
Data Science and Analytics Lesson 1.pptx
PPTX
Introduction to data analytics - Intro to Data Analytics
PPTX
DATA ANALYSIS THE PRESENTATION BRIEFLY DONE
PPTX
Data Analytics Introduction.pptx
Moh.Abd-Ellatif_DataAnalysis1.pptx
This is where data analytics enters as a critical field.pdf
Top 30 Data Analyst Interview Questions.pdf
Big Data Analytics information And Tools
Data Analysis Methods 101 - Turning Raw Data Into Actionable Insights
Data Analyst Interview Questions & Answers
Unit 1 pptx.pptx
What is Data Mining? Key Concepts Explained
Chapter 3: Data Analysis or Interpretation of Data
Data Mining for Business Analytics in PGCM
Study of Data Mining Methods and its Applications
How To Transform Your Analytics Maturity Model Levels, Technologies, and Appl...
Machine Learning for Business - Eight Best Practices for Getting Started
what is ..how to process types and methods involved in data analysis
Data mining
Data mining
Data Science and Analytics Lesson 1.pptx
Introduction to data analytics - Intro to Data Analytics
DATA ANALYSIS THE PRESENTATION BRIEFLY DONE
Data Analytics Introduction.pptx
Ad

More from Stig-Arne Kristoffersen (20)

PDF
Storre behov for att effektivt rekrytera
PDF
Distans for-imot
PDF
SKL behover dig
PDF
Jobbsokning under pandemin
PDF
Hockey vanskapsprogram
PDF
Kultur och fritid
PDF
Varumärke inom Svensk hockey
PDF
Lågutbildade, arbetslösa mindre delaktiga i informationssamhället
PDF
Digital mogenhet - nödvändigt för alla företag
PPTX
Mining and artificial intelligence - a new paradigm growing!
PDF
Matchning av dem långt från arbetsmarknaden
PDF
s AI s - seismic Artificial Intelligence system
PDF
Transform unstructured e&p information
PDF
Den passiva arbetssökaren
PDF
Hitta varandra med Algoritmer som fungerar för alla!
PDF
Hitta varandra i arbetsmarknaden!
PDF
Arbetsförmedling och Rekrytering - en samverkan
PDF
Vatten från olika källor i Västra Götaland
PDF
Bättre match mellan jobbsökare och arbetsgivare
PDF
Vilken riktning tar rekryteringen i närmaste framtid?
Storre behov for att effektivt rekrytera
Distans for-imot
SKL behover dig
Jobbsokning under pandemin
Hockey vanskapsprogram
Kultur och fritid
Varumärke inom Svensk hockey
Lågutbildade, arbetslösa mindre delaktiga i informationssamhället
Digital mogenhet - nödvändigt för alla företag
Mining and artificial intelligence - a new paradigm growing!
Matchning av dem långt från arbetsmarknaden
s AI s - seismic Artificial Intelligence system
Transform unstructured e&p information
Den passiva arbetssökaren
Hitta varandra med Algoritmer som fungerar för alla!
Hitta varandra i arbetsmarknaden!
Arbetsförmedling och Rekrytering - en samverkan
Vatten från olika källor i Västra Götaland
Bättre match mellan jobbsökare och arbetsgivare
Vilken riktning tar rekryteringen i närmaste framtid?

data analysis-mining

  • 1. F I N A N C I A L Advisors THE BUSINESS ANALYST AND BEYOND DATA MINING
  • 2. F I N A N C I A L Advisors BUSINESS ANALYST Business Analyst and Data Mining Understanding the advantages of using different data mining tools and techniques, knowing what data mining does, can help the Business Analyst provide recommendations that improve business processes and discover efficiency leakages within the existing business. Most Business Analyst’s are aware of data mining and what it can do for an organization. • reduce the cost of acquiring data and • improve the success rate of discovering oil and gas. However, whether you are a beginner or a seasoned Business Analyst , there is a need to understand what data mining does and what are the different data mining tools and techniques available to improve the audit activities a Business Analyst must do. How can it assist improving the business operations across the board. DATA MINING IMPROVE • BUSINESS PROCESSES • AUDIT CAPABILITY
  • 3. F I N A N C I A L Advisors IMPORTANCE OF DATA MINING KNOWLEDGE DISCOVERY AND DATA MINING Data Mining is often considered to be: • a blend of statistics, • AI (artificial intelligence), and • data base research Which was not commonly recognized as a field of interest for statisticians, and was even considered by some "a dirty word in Statistics. Due to its applied importance, however, the field emerges as a rapidly growing and major area (also in statistics) where important theoretical advances are being made.
  • 4. F I N A N C I A L Advisors GOAL OF DATA MINING ABILITY TO PREDICT Data Mining is an analytic process designed to explore usually large amounts of data, typically business or market related, known as "big data” in search of consistent patterns and/or systematic relationships between variables, and then to validate the findings by applying the detected patterns to new subsets of data. The ultimate goal of data mining is prediction - and predictive data mining is the most common type of data mining and one that has the most direct business applications.
  • 5. F I N A N C I A L Advisors DATA MINING A 3 STAGE PROCESS The process of data mining consists of three stages: (1)  the initial exploration, (2) model building or pattern identification with validation/ verification, and (3) deployment (i.e. the application of the model to new data in order to generate predictions).
  • 6. F I N A N C I A L Advisors DATA MINING WHAT IS IT? Stage 1- Exploration. This stage usually starts with data preparation which may involve cleaning data, data transformations, selecting subsets of records and - in case of data sets with large numbers of variables ("fields") - performing some preliminary feature selection operations to bring the number of variables to a manageable range (depending on the statistical methods which are being considered). Then, depending on the nature of the analytic problem, this first stage of the process of data mining may involve anywhere between a simple choice of straightforward predictors for a regression model, to elaborate exploratory analyses using a wide variety of graphical and statistical methods (Exploratory Data Analysis (EDA)) in order to identify the most relevant variables and determine the complexity and/or the general nature of models that can be taken into account in the next stage.
  • 7. F I N A N C I A L Advisors DATA MINING WHAT IS IT? Stage 2: Model building and validation. This stage involves considering various models and choosing the best one based on their predictive performance (i.e., explaining the variability in question and producing stable results across samples). This may sound like a simple operation, but in fact, it sometimes involves a very elaborate process. There are a variety of techniques developed to achieve that goal - many of which are based on so-called "competitive evaluation of models," that is, applying different models to the same data set and then comparing their performance to choose the best. These techniques - which are often considered the core of predictive data mining - include: • bagging (Voting, Averaging), • boosting, • stacking (Stacked Generalizations), and • meta-Learning.
  • 8. F I N A N C I A L Advisors DATA MINING WHAT IS IT? Stage 3: Deployment. That final stage involves using the model selected as best in the previous stage and applying it to new data in order to generate predictions or estimates of the expected outcome. The concept of Data Mining is becoming increasingly popular as a business information management tool where it is expected to reveal knowledge structures that can guide decisions in conditions of limited certainty. Recently, there has been increased interest in developing new analytic techniques specifically designed to address the issues relevant to business Data Mining (e.g. Classification Trees), but Data Mining is still based on the conceptual principles of statistics including the traditional Exploratory Data Analysis (EDA) and modeling and it shares with them both some components of its general approaches and specific techniques.
  • 9. F I N A N C I A L Advisors DATA MINING WHAT IS IT? Stage 3: Deployment. However, an important general difference in the focus and purpose between Data Mining and the traditional Exploratory Data Analysis (EDA) is that Data Mining is more oriented towards applications than the basic nature of the underlying phenomena. In other words, Data Mining is relatively less concerned with identifying the specific relations between the involved variables. For example, uncovering the nature of the underlying functions or the specific types of interactive, multivariate dependencies between variables are not the main goal of Data Mining. Instead, the focus is on producing a solution that can generate useful predictions. Therefore, Data Mining accepts among others a "black box" approach to data exploration or knowledge discovery and uses not only the traditional Exploratory Data Analysis (EDA) techniques, but also such techniques as Neural Networks, which can generate valid predictions but are not capable of identifying the specific nature of the interrelations between the variables on which the predictions are based.
  • 10. F I N A N C I A L Advisors DATA MINING WHAT IS IT? Data mining automates the detection of relevant patterns in a database, using defined approaches and algorithms to look into current and historical data that can then be analyzed to predict future trends. Because data mining tools predict future trends and behaviors by reading through databases for hidden patterns, they allow organizations to make proactive, knowledge-driven decisions and answer questions that were previously too time- consuming to resolve in a manual manner. Changes in data mining techniques, have enabled organizations to collect, analyze, and access data in new ways. The first change occurred in the area of basic data collection. As companies started collecting and saving basic data in computers, they were able to start answering detailed questions quicker and with more ease.
  • 11. F I N A N C I A L Advisors AMOUNT OF DATA BEING ANALYZED Vast amounts of data that companies collect, require use of data mining programs to investigate data trends and process large volumes of data quickly. Workers can determine the outcome of the data analysis by the parameters chosen, thus providing additional value to business strategies and initiatives. It is important to note that without these parameters, the data mining program will generate all permutations or combinations irrespective of their relevance.
  • 12. F I N A N C I A L Advisors IDENTIFY KEY PARAMETERS TRANSFORM DATA INTO VALUABLE INFORMATION The business Analyst need to pay attention to selection of key parameters. Data mining programs lack the ability to recognize the difference between a relevant and an irrelevant data correlation. The business analyst need to review the results of mining exercises to ensure results provide needed information from the data. For example, knowing that people who default on loans usually give a false address might be relevant, whereas knowing they have blonde hair might be irrelevant. Monitor whether sensible and rational decisions are made on the basis of data mining exercises, especially where the results of such exercises are used as input for other processes or systems at workplace.
  • 13. F I N A N C I A L Advisors SECURITY ASPECTS YOUR BUSINESS ADVANTAGE OF INTEREST TO OTHERS The business Analyst need to consider the different security aspects of data mining programs and processes. A data mining exercise might reveal important information that could be exploited by an outsider who hacks into the organization's computer system and uses captured information from a data mining tool.
  • 14. F I N A N C I A L Advisors DATA MINING Importance of Data Mining Using data mining to understand and extrapolate data and information can reduce the chances of success in E&P programs, improve audit reactions to potential business changes, and ensure that risks are managed in a more timely and proactive fashion. Business Analysts can use data mining tools to model "what-if" situations and demonstrate real and probable effects to management, such as combining real-world and business information to show the effects of a security breach and the impact of drilling dry holes or miss a production opportunity. If data mining can be used by one part of the organization to influence business direction for profit, why can't Business Analysts use the same tools and techniques to reduce risks and increase audit benefits? REDUCING RISKS AND INCREASE AUDIT BENEFITS IMPROVE • BUSINESS PROCESSES • AUDIT CAPABILITY
  • 15. SPATIAL DATA MINING: SPATIAL UNDERSTANDING, ADDING GEOGRAPHIC DISTRIBUTION. Spatial data mining is the application of data mining techniques to spatial data. It performs the same functions as general data mining, with the end objective of revealing and creating knowledge about geographic trends and patterns. In other words, it is the process of discovering interesting, previously unknown, and potentially useful patterns in large spatial datasets. Extracting such patterns from spatial datasets is more difficult than extracting the corresponding patterns from traditional numeric and categorical data because of the complexity of spatial data types, spatial relationships, and spatial autocorrelation. Typical spatial data mining tasks include location prediction, spatial outlier detection, and co-location mining.
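Spatial outlier detection, one of the tasks named above, can be sketched as follows. This is one simple approach among many, chosen here for illustration: each site's value is compared with the mean of its `k` nearest neighbours, and sites whose difference is unusually large are flagged. The function name, the parameters `k` and `threshold`, and the grid of sample values are assumptions, not material from the source.

```python
import math
import statistics


def spatial_outliers(points, k=3, threshold=2.0):
    """Flag sites whose value differs strongly from nearby sites.

    points: list of (x, y, value) tuples.
    Returns the indices of points whose neighbour difference lies more
    than `threshold` standard deviations from the mean difference.
    """
    diffs = []
    for i, (xi, yi, vi) in enumerate(points):
        # Sort all other points by Euclidean distance to point i.
        others = sorted(
            (math.hypot(xi - xj, yi - yj), vj)
            for j, (xj, yj, vj) in enumerate(points)
            if j != i
        )
        neighbour_mean = statistics.fmean(v for _, v in others[:k])
        diffs.append(vi - neighbour_mean)

    mu = statistics.fmean(diffs)
    sigma = statistics.pstdev(diffs)
    if sigma == 0:
        return []  # every site matches its neighbours equally well
    return [i for i, d in enumerate(diffs) if abs(d - mu) / sigma > threshold]


if __name__ == "__main__":
    # Hypothetical 3 x 3 grid of measurements with one anomalous centre value.
    grid = [(x, y, 50.0 if (x, y) == (1, 1) else 10.0)
            for x in range(3) for y in range(3)]
    print(spatial_outliers(grid))
```

Comparing against neighbours rather than against the global mean is what makes the check spatial: a value can be ordinary for the dataset as a whole and still be anomalous for its location.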
  • 16. GEOGRAPHIC DATA MINING AND KNOWLEDGE DISCOVERY. The goal is to discover oil and gas at minimum risk and cost. Techniques for spatial data warehousing (SDW), spatial data mining (SDM), and spatial visualization (SVis) have been developed. In addition, the use of knowledge discovery techniques has risen with the increasing collection and storage of data on spatiotemporal processes and mobile objects. Geographic data mining and knowledge discovery is a young discipline with many challenging research problems. It represents an important direction in the development of a new generation of spatial analysis tools for data-rich environments, and it is important to motivate researchers to develop new methods and applications in this emerging field.
  • 17. KNOWLEDGE DISCOVERY FROM DATABASES (KDD): CREATE ADDED VALUE FOR THE BUSINESS. KDD is a response to the enormous volumes of data being collected and stored in operational and scientific databases. Continuing improvements in information technology (IT) and its widespread adoption for process monitoring and control in many domains are creating a wealth of new data. There is often much more information in these databases than the “shallow” information extracted by traditional analytical and query techniques. KDD leverages investments in IT by searching for deeply hidden information that can be turned into knowledge for strategic decision-making and for answering fundamental business questions. KDD is better known through the more popular term “data mining.” However, data mining is only one component (albeit a central one) of the larger KDD process. Data mining distills data into information or facts about the mini-world described by the database. KDD is the higher-level process of obtaining information through data mining and distilling that information into knowledge (ideas and beliefs about the mini-world) through interpretation and integration with existing knowledge.
  • 18. DATA MINING TECHNIQUES
  • 19. SPATIAL KNOWLEDGE DISCOVERY (SKD): CREATE ADDED VALUE FOR THE BUSINESS. Spatial knowledge discovery (SKD) is a very important special case of KDD. It draws on spatial data mining techniques and is closely related to spatial visualization (SVis), an increasingly active research domain that integrates scientific visualization and cartography. Seeking patterns and trends in hydrocarbon systems (such as porosities and other geological parameters over space and time) benefits from projecting the data into an information space whose spatial dimensions are non-metric.
  • 20. SKD PORTALS: CONNECTIVITY
    • Customer Portals provide richer information, from real-time service status to estimated restoration times.
    • Analytics identify failure trends so you can more effectively target your investment and corrective measures.
    • Field Operations connect crews with comprehensive information and tools for faster fault location, restoration, and updating of your enterprise records.
    • Task-oriented Solutions create new workflow-specific capabilities that leverage spatial information and processes.
  • 21. SPATIAL BUSINESS INTELLIGENCE: PROVIDE THE NECESSARY INSIGHT FROM THE DATA. Being able to synthesize data into information and present it in an understandable manner is important for business. Business intelligence (BI) environments must be able to visualize spatial data, and they depend on two elements. Visualization: interrelationships in spatial data are often readily understood when presented visually; regardless of how abstract or complex the data, effective visualization exposes the insight in the information. The 4 Ws: standard BI systems handle the Who, What, and When, but the Where remains vastly underexploited as a way to reveal trends or outliers. Spatial BI transforms data into human understanding and actionable insight.
  • 22. SPATIAL BUSINESS INTELLIGENCE. We think visually. We need to see our logic. MI + BI = SBI (Map Intelligence + Business Intelligence = Spatial Business Intelligence).