1. Data mining
Data mining is the process of discovering patterns in large data sets involving methods at
the intersection of machine learning, statistics, and database systems.
Steps involved in the data mining process
• Identifying the source information.
• Picking the data points that need to be analyzed.
• Extracting the relevant information from the data.
• Identifying the key values from the extracted data set.
• Interpreting and reporting the results.
2. What is regression?
Regression is a measure of the relation between the mean value of one variable (e.g. output)
and corresponding values of other variables (e.g. time and cost).
Regression analysis
In statistical modeling, regression analysis is a set of statistical processes for estimating
the relationships among variables.
3. Any one analytics technique with example.
Five analytics techniques that MBA students learn and are likely to apply in their
future work:
1. Descriptive analytics.
2. Predictive analytics/data mining and forecasting.
3. Optimization for resource allocation.
4. Simulation/risk management.
5. Analytics and Big Data.
4. What is logistic regression?
Logistic regression is a statistical method for analyzing a dataset in which there are one
or more independent variables that determine an outcome. The outcome is measured with
a dichotomous variable (in which there are only two possible outcomes).
Binomial or binary logistic regression deals with situations in which the observed
outcome for a dependent variable can have only two possible types, "0" and "1" (which
may represent, for example, "dead" vs. "alive" or "win" vs. "loss"). ... Ordinal logistic
regression deals with dependent variables that are ordered.
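As a minimal sketch of the binary case: the model passes a linear combination of the independent variables through the logistic (sigmoid) function, which yields a probability between 0 and 1 that the outcome is "1". The intercept, coefficient, and the 0.5 cut-off below are hypothetical values chosen only for illustration; they are not fitted to any data.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(x, intercept, coefficient):
    """Probability that the outcome is "1" for a single predictor value x."""
    return sigmoid(intercept + coefficient * x)

# Hypothetical coefficients, assumed for illustration only (not estimated from data).
INTERCEPT = -4.0
COEFFICIENT = 0.8

for x in (2, 5, 8):
    p = predict_probability(x, INTERCEPT, COEFFICIENT)
    label = 1 if p >= 0.5 else 0  # assumed 0.5 decision threshold
    print(f"x={x}: P(outcome=1)={p:.3f}, predicted class={label}")
```

In practice the coefficients would be estimated from data (typically by maximum likelihood) rather than set by hand.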
5. Simple regression analysis & multiple linear regression
In simple linear regression, we predict scores on one variable from the scores on a
second variable. The variable we are predicting is called the criterion variable and is
referred to as Y. When there is only one predictor variable, the prediction method is
called simple regression.
Multiple regression is an extension of simple linear regression. It is used when we
want to predict the value of a variable based on the value of two or more other
variables. The variable we want to predict is called the dependent variable (or
sometimes, the outcome, target or criterion variable).
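A minimal sketch of simple linear regression, assuming ordinary least squares as the fitting method (the notes do not name one). The function name and the toy data are made up for illustration; the data lie exactly on the line y = 1 + 2x so the result is easy to check by hand.

```python
def fit_simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Toy data (hypothetical): predictor x and criterion variable y.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

intercept, slope = fit_simple_regression(x, y)
print(f"Y = {intercept:.2f} + {slope:.2f} * X")
print(f"Predicted Y at X = 6: {intercept + slope * 6:.2f}")
```

Multiple regression follows the same idea with two or more predictors, each getting its own coefficient; that case is usually solved with matrix methods and is not shown here.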
6. Descriptive analytics? Different data and scales of measurement
Descriptive statistics are brief descriptive coefficients that summarize a given data
set, which can be either a representation of the entire population or a sample of
it. Descriptive statistics are broken down into measures of central tendency and
measures of variability, or spread.
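The two families of descriptive statistics can be sketched with Python's standard `statistics` module. The list of scores is invented for illustration.

```python
import statistics

# Hypothetical sample of seven scores.
scores = [4, 8, 6, 5, 3, 8, 9]

# Measures of central tendency.
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of variability (spread).
value_range = max(scores) - min(scores)
sample_variance = statistics.variance(scores)  # divides by n - 1
sample_sd = statistics.stdev(scores)           # square root of the sample variance

print(f"mean={mean:.2f}, median={median}, mode={mode}")
print(f"range={value_range}, variance={sample_variance:.2f}, sd={sample_sd:.2f}")
```

`variance` and `stdev` treat the data as a sample; for a full population the same module provides `pvariance` and `pstdev`.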
Nominal: Nominal data have no order and thus only give names or labels to various
categories.
Ordinal: Ordinal data have order, but the interval between measurements is not
meaningful.
Interval: Interval data have meaningful intervals between measurements, but there is
no true starting point (zero).
Ratio: Ratio data are the highest level of measurement. Ratios between
measurements as well as intervals are meaningful because there is a true starting point
(zero).
7. Cluster Analysis
Cluster analysis or clustering is the task of grouping a set of objects in such a way that
objects in the same group are more similar to each other than to those in other groups.
8. Data Analytics & Use of Data Mining
Data analysis is a process of inspecting, cleansing, transforming, and modeling data
with the goal of discovering useful information, suggesting conclusions, and
supporting decision-making.
9. Steps of Cluster Analysis
Two-step clustering can handle scale and ordinal data in the same model, and it
automatically selects the number of clusters. The hierarchical cluster analysis follows
three basic steps: 1) calculate the distances, 2) link the clusters, and 3) choose a
solution by selecting the right number of clusters.
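The three hierarchical steps can be sketched as follows. This is a simplified illustration, not a reference implementation: it assumes one-dimensional data, absolute difference as the distance, single linkage (distance between the closest members) as the linking rule, and a number of clusters supplied by the caller rather than chosen from a dendrogram.

```python
def single_linkage(points, n_clusters):
    """Agglomerative clustering of 1-D points down to n_clusters groups."""
    # Every point starts as its own cluster.
    clusters = [[p] for p in points]

    def cluster_distance(a, b):
        # Step 1: calculate distances (here: absolute difference,
        # taking the closest pair of members, i.e. single linkage).
        return min(abs(p - q) for p in a for q in b)

    # Step 3: the caller-chosen number of clusters decides when to stop.
    while len(clusters) > n_clusters:
        # Step 2: link the two closest clusters.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical data with three visibly separate groups.
data = [1.0, 1.5, 2.0, 10.0, 10.5, 25.0]
print(single_linkage(data, 3))
```

Other linkage rules (complete, average, Ward) change only how `cluster_distance` is defined.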
10. Association rules with an example
Association rule mining is a procedure for finding frequent patterns,
correlations, associations, or causal structures in data sets held in various kinds of
databases, such as relational databases, transactional databases, and other forms
of data repositories.
11. Factor Analysis
Factor analysis is a statistical method used to describe variability among observed,
correlated variables in terms of a potentially lower number of unobserved variables
called factors.
12. Explain Modeling process
Business process modeling (BPM) in business process management and systems
engineering is the activity of representing processes of an enterprise, so that the
current process may be analysed, improved, and automated. ... Alternatively,
the process model can be derived directly from events' logs using process mining
tools.
13. Types of Variables
14. Market Basket Analysis
Market Basket Analysis is a modelling technique based upon the theory that if you
buy a certain group of items, you are more (or less) likely to buy another group of
items. For example, if you are in an English pub and you buy a pint of beer and don't
buy a bar meal, you are more likely to buy crisps.
For investors, the market basket is the principal idea behind index funds, which are
essentially a broad sample of stocks, bonds or other securities in the market; this
provides investors with a benchmark against which to compare their investment
returns.
15. Generating Candidate Rules?
Association rule mining first finds all sets of items (itemsets) whose support is greater
than the minimum support, and then uses these large itemsets to generate the desired rules
whose confidence is greater than the minimum confidence. The lift of a rule X -> Y is the
ratio of the observed support to that expected if X and Y were independent. A typical and
widely used application of association rules is market basket analysis.
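Support, confidence, and lift can be computed directly from a list of transactions. The sketch below uses an invented set of six pub transactions; the function names are illustrative and not taken from any library.

```python
# Hypothetical transactions, one set of items per customer.
transactions = [
    {"beer", "crisps"},
    {"beer", "crisps", "nuts"},
    {"meal", "wine"},
    {"crisps", "nuts"},
    {"beer", "crisps"},
    {"meal"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(x, y, transactions):
    """Of the transactions containing X, the fraction that also contain Y."""
    return support(set(x) | set(y), transactions) / support(x, transactions)

def lift(x, y, transactions):
    """Observed support of X and Y together, relative to independence."""
    joint = support(set(x) | set(y), transactions)
    return joint / (support(x, transactions) * support(y, transactions))

x, y = {"beer"}, {"crisps"}
print(f"support = {support(x | y, transactions):.2f}")
print(f"confidence = {confidence(x, y, transactions):.2f}")
print(f"lift = {lift(x, y, transactions):.2f}")
```

A lift above 1 suggests the two itemsets appear together more often than independence would predict; a lift below 1 suggests the opposite.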
How to Generate Candidates?
Step 1: self-joining
Step 2: pruning (before counting its support)
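The two steps above can be sketched as follows, assuming they refer to Apriori-style candidate generation (the notes do not name the algorithm). Under that assumption, self-joining merges two frequent itemsets of size k-1 that share their first k-2 items, and pruning drops any candidate that has an infrequent subset of size k-1. The function name and the example itemsets are illustrative.

```python
from itertools import combinations

def generate_candidates(frequent_itemsets, k):
    """Build size-k candidate itemsets from frequent itemsets of size k-1."""
    frequent = {tuple(sorted(s)) for s in frequent_itemsets}
    ordered = sorted(frequent)

    # Step 1: self-joining. Join pairs that agree on their first k-2 items.
    joined = set()
    for a, b in combinations(ordered, 2):
        if a[:k - 2] == b[:k - 2]:
            joined.add(tuple(sorted(set(a) | set(b))))

    # Step 2: pruning (before any support counting). A candidate survives
    # only if every one of its (k-1)-item subsets is itself frequent.
    candidates = []
    for cand in sorted(joined):
        if all(sub in frequent for sub in combinations(cand, k - 1)):
            candidates.append(cand)
    return candidates

# Hypothetical frequent 3-itemsets.
frequent_3 = [
    ("A", "B", "C"),
    ("A", "B", "D"),
    ("A", "C", "D"),
    ("A", "C", "E"),
    ("B", "C", "D"),
]
print(generate_candidates(frequent_3, 4))
```

In this example the join produces two candidates, ABCD and ACDE; ACDE is pruned because its subset ADE is not among the frequent 3-itemsets, so only ABCD goes on to support counting.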
16. Selecting Strong Rule & Lift Ratio
Lift is the ratio of the target response to the average response. For example, suppose a
population has an average response rate of 5%, but a certain model (or rule) has identified
a segment with a response rate of 20%. That segment then has a lift of 20% / 5% = 4.
17. Explanatory vs. Predictive Modeling
When building multivariate statistical models, researchers need to be clear as to
whether their goals are explanatory or predictive. Explanatory research aims to
identify risk (or protective) factors that are causally related to an outcome. ...
Unfortunately, researchers often conflate the two, which leads to errors.

Exam Short Preparation on Data Analytics
