Quantitative Analysis for Business
Lecture 3
July 12th, 2010
Saksarun (Jay) Mativachranon
Error in regression model
Assumptions of the Regression Model
If we make certain assumptions about the errors in a regression model, we can perform statistical tests to determine if the model is useful:
  • Errors are independent
  • Errors are normally distributed
  • Errors have a mean of zero
  • Errors have a constant variance
A plot of the residuals (errors) will often highlight any glaring violations of these assumptions.
Residual Plots (error plotted against X)
  • Figure 4.4A: a random plot of residuals
  • Figure 4.4B: nonconstant error variance
  • Figure 4.4C: nonlinear relationship
Analysis of variance
Analysis of Variance (ANOVA)
A statistical procedure for analyzing the total variability of a data set.
  • Sum of squares total (SST): measures the total variation in the dependent variable
  • Sum of squares of regression (SSR): measures the variation in the dependent variable explained by the independent variable
  • Sum of squares of errors (SSE): measures the unexplained variation
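The three sums of squares can be sketched in a few lines of Python. This is a minimal illustration rather than the lecture's own calculation: the five (x, y) pairs are made-up data, and `sums_of_squares` is a hypothetical helper that fits a single-variable line by ordinary least squares before splitting the total variation.

```python
def sums_of_squares(x, y):
    """Fit y = b0 + b1*x by least squares and return (SST, SSR, SSE)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Least-squares slope and intercept for one independent variable.
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation
    return sst, ssr, sse


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
sst, ssr, sse = sums_of_squares(x, y)
print(round(sst, 4), round(ssr, 4), round(sse, 4))
```

For these made-up numbers SST = 6, SSR = 3.6 and SSE = 2.4, so the total variation splits into an explained and an unexplained part (SST = SSR + SSE).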
Estimating the Variance
Errors are assumed to have a constant variance ($\sigma^2$), but we usually don't know this. It can be estimated using the mean squared error (MSE), $s^2$:
$$s^2 = \text{MSE} = \frac{\text{SSE}}{n - k - 1}$$
where
  n = number of observations in the sample
  k = number of independent variables
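The Standard Error of Estimate
The standard error of estimate (SEE), also called the standard error of the regression, measures the uncertainty in the relationship between the independent and dependent variables. It is the square root of the MSE: $\text{SEE} = s = \sqrt{\text{MSE}}$.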
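As a rough sketch, the two estimates can be computed as follows, assuming the usual definitions MSE = SSE / (n – k – 1) and SEE = √MSE. The function name `mse_and_see` and the input values are hypothetical and are not taken from the Company A data.

```python
import math


def mse_and_see(sse, n, k):
    """Return (MSE, SEE) given SSE, sample size n and k independent variables."""
    df_error = n - k - 1
    if df_error <= 0:
        raise ValueError("need n > k + 1 observations")
    mse = sse / df_error   # estimate of the error variance
    see = math.sqrt(mse)   # standard error of estimate
    return mse, see


# Hypothetical inputs: SSE = 2.4 from n = 5 observations and k = 1 variable.
mse, see = mse_and_see(sse=2.4, n=5, k=1)
print(f"MSE = {mse:.4f}, SEE = {see:.4f}")
```

Dividing by n – k – 1 rather than n accounts for the k slope parameters and the intercept that were estimated from the same data.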
The F Statistic
The F-test assesses how well a set of independent variables, as a group, explains the variation in the dependent variable:
$$F = \frac{\text{MSR}}{\text{MSE}} = \frac{\text{SSR}/k}{\text{SSE}/(n - k - 1)}$$
where
  MSR = mean regression sum of squares
  MSE = mean squared error
  k = the number of slope parameters (k = 1 for simple linear regression)
  n = number of observations
F-statistic in Linear Regression
For linear regression, the hypotheses for the validity of the model are:
  H0: b1 = 0
  Ha: b1 ≠ 0
To determine if b1 is statistically significant, the calculated F-statistic is compared with the critical F-value, Fc, at the appropriate level of significance.
The degrees of freedom (df) for the numerator and denominator with one independent variable are:
  df numerator = k = 1
  df denominator = n – k – 1 = n – 2
Decision for the F-test: reject H0 if F > Fc.
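The test statistic and the decision rule can be sketched like this, assuming F = MSR / MSE with MSR = SSR / k and MSE = SSE / (n – k – 1). The helper names and the sums of squares are hypothetical. The critical value is typed in by hand from a standard F table rather than computed.

```python
def f_test_inputs(ssr, sse, n, k):
    """Return (F, df_numerator, df_denominator) for a regression F-test."""
    df_num = k
    df_den = n - k - 1
    msr = ssr / df_num   # mean regression sum of squares
    mse = sse / df_den   # mean squared error
    return msr / mse, df_num, df_den


def reject_h0(f_calculated, f_critical):
    """Decision rule: reject H0 if F > Fc."""
    return f_calculated > f_critical


# Hypothetical sums of squares for n = 5 observations and k = 1 variable.
f_value, df1, df2 = f_test_inputs(ssr=3.6, sse=2.4, n=5, k=1)
# 10.13 is the usual table value for alpha = 0.05 with df (1, 3).
print(f_value, df1, df2, reject_h0(f_value, f_critical=10.13))
```

With these made-up inputs F is about 4.5, which is below the critical value, so H0 would not be rejected.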
Company A Data
Company A Example
Estimating the Variance for Company A
We can estimate the standard deviation, s, as the square root of the MSE. This is also called the standard error of the estimate or the standard deviation of the regression.

Testing the Model for Significance
When the sample size is too small, you can get good values for MSE and r² even if there is no relationship between the variables. Testing the model for significance helps determine if the values are meaningful. We do this by performing a statistical hypothesis test.
Testing the Model for Significance
We start with the general linear model:
$$Y = \beta_0 + \beta_1 X + \epsilon$$
  • If $\beta_1 = 0$, the null hypothesis is that there is no relationship between X and Y
  • The alternate hypothesis is that there is a linear relationship ($\beta_1 \neq 0$)
  • If the null hypothesis can be rejected, the data provide statistically significant evidence of a relationship
  • We use the F statistic for this test

The F statistic is based on the MSE and MSR:
$$\text{MSR} = \frac{\text{SSR}}{k}$$
where k = number of independent variables in the model. The F statistic is
$$F = \frac{\text{MSR}}{\text{MSE}}$$
This describes an F distribution with
  degrees of freedom for the numerator = df1 = k
  degrees of freedom for the denominator = df2 = n – k – 1

If there is very little error, the MSE would be small and the F-statistic would be large, indicating the model is useful. If the F-statistic is large, the observed significance level (p-value) will be low, indicating it is unlikely this would have occurred by chance. So when the F-value is large, we can reject the null hypothesis and accept that there is a linear relationship between X and Y and that the values of the MSE and r² are meaningful.
Steps in a Hypothesis Test
Step 1. Specify null and alternative hypotheses
Step 2. Select the level of significance ($\alpha$). Common values are 0.01 and 0.05
Step 3. Calculate the value of the test statistic using the formula $F = \text{MSR}/\text{MSE}$
Step 4. Make a decision using one of the following methods:
  • Reject the null hypothesis if the test statistic is greater than the F-value from the table. Otherwise, do not reject the null hypothesis.
  • Reject the null hypothesis if the observed significance level, or p-value, is less than the level of significance ($\alpha$). Otherwise, do not reject the null hypothesis.
Company A
Step 1. H0: $\beta_1 = 0$ (no linear relationship between X and Y)
        H1: $\beta_1 \neq 0$ (linear relationship exists between X and Y)
Step 2. Select $\alpha$ = 0.05
Step 3. Calculate the value of the test statistic
Step 4. Reject the null hypothesis if the test statistic is greater than the F-value
  df1 = k = 1
  df2 = n – k – 1 = 6 – 1 – 1 = 4
The value of F associated with a 5% level of significance and with degrees of freedom 1 and 4 is
  F0.05,1,4 = 7.71
  Fcalculated = 9.09
Reject H0 because 9.09 > 7.71.
(Figure: F distribution for Company A; the 0.05 rejection region begins at F = 7.71, and the calculated value 9.09 falls inside it.)
We can conclude there is a statistically significant relationship between X and Y.
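The Company A decision can be replayed with the figures quoted on the slides (n = 6, k = 1, Fcalculated = 9.09, critical value 7.71). The last step is an added consistency check that is not on the slides: it assumes the standard identities r² = SSR/SST and F = (SSR/k)/(SSE/df2), which together give r² = kF / (kF + df2).

```python
# Values quoted on the Company A slides.
n, k = 6, 1
alpha = 0.05
f_calculated = 9.09
f_critical = 7.71   # F(0.05; 1, 4) from the table

df1 = k
df2 = n - k - 1
decision = "reject H0" if f_calculated > f_critical else "do not reject H0"

# Consistency check (an assumption, not on the slides):
# r^2 = SSR/SST and F = (SSR/k)/(SSE/df2) imply r^2 = k*F / (k*F + df2).
r_squared = k * f_calculated / (k * f_calculated + df2)

print(df1, df2, decision, round(r_squared, 2))
```

The implied r² of about 0.69 agrees with the value reported for the Company A model.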
The r² value of 0.69 means about 69% of the variability in sales (Y) is explained by man-hours (X).

Limitations of Regression Analysis
  • Linear relationships can change over time. This is referred to as parameter instability
  • Even if the model is accurate, its usefulness will be limited if other market participants are also aware of and act on this model
  • If the assumptions do not hold, the interpretation and tests of hypotheses may not be valid
Using software for regression

Using Software for Regression
The correlation coefficient is called Multiple R in Excel.

Analysis of Variance (ANOVA) Table
When software is used to develop a regression model, an ANOVA table is typically created that shows the observed significance level (p-value) for the calculated F value. This can be compared to the level of significance ($\alpha$) to make a decision (Table 4.4).

ANOVA for Company A
P(F > 9.0909) = 0.0394
Because this probability is less than 0.05, we reject the null hypothesis of no linear relationship and conclude there is a linear relationship between X and Y.
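The p-value in the ANOVA output can be checked by hand for this particular case. The sketch below is a special-case shortcut and not what Excel does internally: it relies on the fact that an F variable with (1, 4) degrees of freedom is the square of a t variable with 4 degrees of freedom, whose CDF has a closed form. The function name `p_value_f_1_4` is hypothetical. For other degrees of freedom a statistics library would normally be used.

```python
import math


def p_value_f_1_4(f_value):
    """Upper-tail probability P(F > f_value) for an F(1, 4) distribution.

    Special case only: F(1, 4) is the square of a t variable with 4 degrees
    of freedom, whose CDF has a closed form.
    """
    if f_value < 0:
        raise ValueError("F cannot be negative")
    s = math.sqrt(f_value / (f_value + 4.0))
    return 1.0 - 1.5 * s * (1.0 - s * s / 3.0)


alpha = 0.05
p = p_value_f_1_4(9.0909)
print(round(p, 4), "reject H0" if p < alpha else "do not reject H0")
```

The result rounds to 0.0394, matching the ANOVA table, and the same function evaluated at the critical value 7.71 returns approximately 0.05, which ties the p-value method to the table method.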