BUSINESS ANALYTICS FOUNDATION WITH R
TOOLS
Lesson 4 - Predictive Modeling Techniques
Part 2
Copyright 2016,Beamsync, All rights reserved.
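COEFFICIENT OF DETERMINATION R²:
• A measure of goodness of fit: how well does your model fit the data?
R² = 0: the model explains none of the variation (no linear relationship)
R² = 1: the model explains all of the variation (perfect linear fit)
• R² always lies between 0 and 1. The values -1 (negative linear relationship) and +1 (positive linear relationship) apply to the correlation coefficient r, not to R²; in simple linear regression R² = r².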
HOW GOOD IS THE MODEL?
• The R² value tells us how well the model explains the data: it is the proportion of the variation in the dependent variable that the model accounts for.
• The differences between the observed values and the values predicted by the model are the residuals, or error term.
• Suppose we have a case in which the R² value is 0.74. This means that 74% of the variance in the values of the dependent variable is explained by the model; the remaining 26% is not explained and is attributed to the residual, or error term.
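The interpretation above can be made concrete with a small sketch. It is written in Python rather than R, and it computes R² as 1 - SS_res / SS_tot. The function name `r_squared` is chosen here for illustration. The data are the age and glucose values from the worked regression example in this deck, and the line coefficients (65.14 and 0.385) are the rounded least-squares estimates for that data.

```python
def r_squared(y_actual, y_predicted):
    """Return R^2 = 1 - SS_res / SS_tot for paired actual and predicted values."""
    mean_y = sum(y_actual) / len(y_actual)
    # Total variation of the observations around their mean.
    ss_tot = sum((y - mean_y) ** 2 for y in y_actual)
    # Variation left unexplained by the model (sum of squared residuals).
    ss_res = sum((y - p) ** 2 for y, p in zip(y_actual, y_predicted))
    return 1.0 - ss_res / ss_tot


# Age (X) and glucose level (Y) for the six subjects in this deck.
ages = [43, 21, 25, 42, 57, 59]
glucose = [99, 65, 79, 75, 87, 81]

# Rounded least-squares line for this data (assumed here, not recomputed).
predicted = [65.14 + 0.385 * x for x in ages]

r2 = r_squared(glucose, predicted)
print(f"R^2 = {r2:.2f}")  # prints R^2 = 0.28
```

Under these assumptions, the fitted line explains roughly 28% of the variance in glucose level, and the remaining 72% or so stays in the residuals.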
HOW TO FIND LINEAR REGRESSION EQUATION
SUBJECT   AGE (X)   GLUCOSE LEVEL (Y)   XY      X²      Y²
1         43        99                  4257    1849    9801
2         21        65                  1365    441     4225
3         25        79                  1975    625     6241
4         42        75                  3150    1764    5625
5         57        87                  4959    3249    7569
6         59        81                  4779    3481    6561
Σ         247       486                 20485   11409   40022
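b = (nΣXY - ΣXΣY) / (nΣX² - (ΣX)²) = (6 × 20485 - 247 × 486) / (6 × 11409 - 247²) = 2868 / 7445 ≈ 0.385
a = (ΣY - bΣX) / n = (486 - 0.385 × 247) / 6 ≈ 65.14
Y = a + bX => Y = 65.14 + 0.385X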
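The slope and intercept can be reproduced from the column totals of the table. The sketch below is in Python rather than R and uses the standard least-squares formulas for simple linear regression. The function name `least_squares_from_sums` is hypothetical.

```python
def least_squares_from_sums(n, sum_x, sum_y, sum_xy, sum_x2):
    """Return (intercept, slope) of the least-squares line from column totals."""
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope


# Totals taken from the Σ row of the table (n = 6 subjects).
intercept, slope = least_squares_from_sums(
    n=6, sum_x=247, sum_y=486, sum_xy=20485, sum_x2=11409
)
print(f"Y = {intercept:.2f} + {slope:.3f}X")  # prints Y = 65.14 + 0.385X
```

Working from the totals rather than the raw rows mirrors the hand calculation the table is set up for. The ΣY² column is not needed for the line itself; it is used when computing the correlation coefficient.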
LOGISTIC REGRESSION
• It is a statistical method for analyzing datasets in which one or more independent variables determine an outcome.
• In this type of regression the dependent variable is binary, with the data coded as 1 for TRUE and 0 for FALSE (a dichotomous characteristic).
• The goal of logistic regression is to find the best-fitting model to describe the relationship between the dichotomous characteristic and a set of independent variables.
• Logistic regression generates the coefficients of a formula to predict a logit transformation of the probability of presence of the characteristic of interest:
logit(p) = β0 + β1 x1 + β2 x2 + β3 x3 + ... + βn xn
where p is the probability of presence of the characteristic of interest.
• The logit transformation is defined as the logged odds:
odds = p / (1 - p) and logit(p) = ln(p / (1 - p))
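The odds and logit definitions above translate directly into code. The following Python sketch (not R) also shows the inverse transformation, which turns a value of β0 + β1 x1 + ... + βn xn back into a probability. All function names, and the coefficients in the usage lines, are assumptions made for illustration.

```python
import math


def odds(p):
    """Odds of an event with probability p (0 < p < 1)."""
    return p / (1 - p)


def logit(p):
    """Log-odds: ln(p / (1 - p))."""
    return math.log(odds(p))


def inverse_logit(z):
    """Map a log-odds value z back to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-z))


def predict_probability(intercept, coefficients, features):
    """Probability implied by logit(p) = b0 + b1*x1 + ... + bn*xn."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return inverse_logit(z)


print(odds(0.8))    # about 4: the event is four times as likely as not
print(logit(0.5))   # 0.0: even odds
# Hypothetical coefficients, for illustration only.
print(predict_probability(-1.0, [0.5, 0.25], [2.0, 4.0]))
```

Modeling the logit rather than p itself keeps predicted probabilities between 0 and 1 for any values of the independent variables.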
METHOD TO DEVELOP A LOGISTIC MODEL
Data → Logistic Regression Model, through the following steps:
• Observation-performance windows
• Data preparation, data treatment, data hygiene
• Derived variables identification
• Fine and coarse classing
• Logistic modeling and diagnostics
LINEAR REGRESSION VS LOGISTIC REGRESSION
• Linear regression is mainly used to establish a relationship between a dependent variable and one or more independent variables. It helps in estimating the impact of an independent variable on the dependent variable.
• Example: using linear regression, the relationship between temperature (T) and ice cream sales (I) is found to be
I = 2T + 4000
• This equation says that for every 1-degree rise in temperature, sales increase by 2 ice creams; 4000 is the baseline level of sales when T = 0.
• Logistic regression helps in finding the probability of an event, where the event is captured in binary format, i.e. 0 or 1.
• Example: to know whether customers will buy a product or not, run a logistic regression on the data. The dependent variable would be a binary variable.
• In terms of graphical representation, linear regression gives a straight line once the values are plotted on a graph, whereas logistic regression gives an S-shaped curve.
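The contrast between the two outputs can be sketched in Python (not R). The linear model uses the ice cream equation from the slide. The logistic model's predictor (income) and its coefficients (-5.0 and 0.1) are hypothetical values chosen only to show the S-shaped response.

```python
import math


def ice_cream_sales(temperature):
    """Linear model from the slide: I = 2T + 4000."""
    return 2 * temperature + 4000


def purchase_probability(income, intercept=-5.0, coefficient=0.1):
    """Logistic model with hypothetical coefficients (illustration only)."""
    z = intercept + coefficient * income
    return 1 / (1 + math.exp(-z))


# Linear output: each extra degree adds the same amount (2) to sales.
for t in (20, 21, 30):
    print(t, ice_cream_sales(t))

# Logistic output: stays between 0 and 1 and flattens at both ends.
for income in (10, 50, 90):
    print(income, round(purchase_probability(income), 3))
```

The linear prediction is unbounded and changes by a constant step, while the logistic prediction is a probability, which is why it suits a binary dependent variable.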
CLUSTER ANALYSIS
• It groups data objects based on the information found in the data that describes the objects and their relationships.
• The goal of this procedure is that the objects in a group are similar to one another and different from the objects in other groups.
• The greater the similarity within a group and the greater the difference between groups, the more distinct the clustering.
• Cluster analysis provides a way for users to discover potential relationships and construct systematic structures in large numbers of variables and observations.
• In a good clustering, intra-cluster distance is minimized and inter-cluster distance is maximized.
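The two distances can be measured on a toy example. This Python sketch (not R) uses two made-up clusters of 2-D points. The Euclidean metric, the centroid-based definition of inter-cluster distance, and all names are assumptions for illustration, since the slide does not fix a particular definition.

```python
import math
from itertools import combinations


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def mean_intra_cluster_distance(points):
    """Average Euclidean distance over all pairs of points in one cluster."""
    pairs = list(combinations(points, 2))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


def inter_cluster_distance(cluster_a, cluster_b):
    """Euclidean distance between the centroids of two clusters."""
    return math.dist(centroid(cluster_a), centroid(cluster_b))


# Hypothetical data: two tight, well-separated groups.
cluster_a = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5)]
cluster_b = [(8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]

intra_a = mean_intra_cluster_distance(cluster_a)
intra_b = mean_intra_cluster_distance(cluster_b)
inter_ab = inter_cluster_distance(cluster_a, cluster_b)
print(f"intra A = {intra_a:.2f}, intra B = {intra_b:.2f}, inter = {inter_ab:.2f}")
```

Here the distance between the clusters is much larger than the distances within either one, which is the pattern a distinct clustering should show.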
Thank You
Beamsync provides business analytics training in Bangalore, along with
certification. If you are looking to build a career in analytics, schedule your
training here: http://beamsync.com/business-analytics-training-bangalore/
