SIMPLE AND MULTIPLE REGRESSION Chris Stiff [email_address]
LEARNING OBJECTIVES In this lecture you will learn: What simple and multiple regression mean The rationale behind these forms of analysis How to conduct simple bivariate and multiple regression analyses using SPSS How to interpret the results of a regression analysis
REGRESSION What is regression? Regression is similar to correlation in the sense that both assess the relationship between two variables Regression is used to predict values of an outcome variable (y) from one or more predictor variables (x) Predictors must either be continuous or categorical with ONLY two categories
SIMPLE REGRESSION Simple regression involves a single predictor variable and an outcome variable Examines changes in an outcome variable from a predictor variable  Other names:  Outcome = dependent, endogenous or criterion variable. Predictor = independent, exogenous or explanatory variable.
SIMPLE REGRESSION The relationship between two variables can be expressed mathematically by the slope of the line of best fit. Usually expressed as Y = a + bX, i.e., Outcome = Intercept + (Coefficient x Predictor)
SIMPLE REGRESSION Where: Y = Outcome (e.g., amount of stupid behaviour) a = Intercept/constant (the expected amount of stupid behaviour if nothing is drunk) b = Unit increment in the outcome that is explained by a unit increase in the predictor – the line's gradient X = Predictor (e.g., amount of alcohol drunk)
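As a rough illustration of the equation outside SPSS, here is a minimal Python/NumPy sketch on made-up numbers (the data and variable names are purely illustrative, not the lecture's dataset):

```python
# Minimal sketch of the Y = a + bX idea in Python/NumPy.
# The alcohol/behaviour numbers are made up purely for illustration.
import numpy as np

alcohol = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)     # predictor X
behaviour = np.array([1, 2, 2, 4, 5, 5, 7, 8, 8, 10], dtype=float)  # outcome Y

b, a = np.polyfit(alcohol, behaviour, deg=1)   # slope b, then intercept a
print(f"Stupid behaviour = {a:.2f} + {b:.2f} x alcohol")
```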
LINE OF BEST FIT [scatterplot of stupid behaviour against amount of alcohol, with the fitted line]
LINE OF BEST FIT – POOR EXAMPLE [scatterplot of stupid behaviour against number of pairs of socks – no sensible line fits]
SIMPLE REGRESSION USING SPSS Analyze → Regression → Linear
SPSS OUTPUT
SPSS OUTPUT R  = correlation between amount drunk and stupid behaviour R square  = proportion of variance in outcome (behaviour) accounted for by the predictor (amount drunk) Adjusted R square  = takes into account the sample size and the number of predictor variables
THE R² R² increases with the inclusion of more predictor variables in a regression model Commonly reported The adjusted R², however, only increases when the new predictor(s) improve the model more than would be expected by chance The adjusted R² will always be equal to, or less than, R² Particularly useful during the variable-selection stage of model building
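Outside SPSS, the same two quantities can be read straight off a fitted model; a hedged sketch using statsmodels on simulated data (the sample size and effect are assumptions for illustration):

```python
# Hedged sketch: R-square vs adjusted R-square with statsmodels on simulated data
# (the data are assumptions, not the lecture's file).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
alcohol = rng.uniform(0, 10, size=20)
behaviour = 1 + 0.8 * alcohol + rng.normal(0, 1.5, size=20)

model = sm.OLS(behaviour, sm.add_constant(alcohol)).fit()
print("R square:         ", round(model.rsquared, 3))
print("Adjusted R square:", round(model.rsquared_adj, 3))   # <= R square by definition
```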
SPSS OUTPUT
SPSS OUTPUT Beta = standardised regression coefficient; shows the number of standard deviations the outcome variable changes for a one standard deviation increase in the predictor variable, with all other predictors held constant
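To see what "standardised" means in practice, the sketch below z-scores both made-up variables and refits; in simple regression the standardised slope is just Pearson's r (the data are illustrative, not the lecture's):

```python
# Sketch of what "standardised beta" means: z-score both variables and refit.
# In simple regression the standardised slope equals Pearson's r.
import numpy as np

alcohol = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)
behaviour = np.array([1, 2, 2, 4, 5, 5, 7, 8, 8, 10], dtype=float)

z = lambda v: (v - v.mean()) / v.std(ddof=1)          # convert to z-scores
beta, _ = np.polyfit(z(alcohol), z(behaviour), deg=1)  # standardised slope
r = np.corrcoef(alcohol, behaviour)[0, 1]              # Pearson r
print(round(beta, 3), round(r, 3))                     # the two values match
```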
REPORTING THE RESULTS OF SIMPLE REGRESSION β = .74, t(18) = 4.74, p < .001, R² = .56 Beta value; t value and associated df and p; R square
GENERATING DF AND T df = n – p – 1 Where n is the number of observations and p is the number of predictors (the extra 1 accounts for the constant). For example, 20 participants and one predictor gives df = 20 – 1 – 1 = 18, as in the report above. NB This is for regression; df can be calculated differently for other tests!
ASSUMPTIONS OF SIMPLE REGRESSION Outcome variable should be measured at interval level When plotted the data should have a linear trend
SUMMARY OF SIMPLE REGRESSION Used to predict the outcome variable from a predictor variable Used when one predictor variable and one outcome variable The relationship must be linear
MULTIPLE REGRESSION Multiple regression is used when there is more than one predictor variable Two major uses of multiple regression: Prediction Causal analysis
USES OF MULTIPLE REGRESSION Multiple regression can be used to examine the following: How well a set of variables predicts an outcome Which variable in a set of variables is the best predictor of the outcome Whether a predictor variable still predicts the outcome when another variable is controlled for.
MULTIPLE REGRESSION – EXAMPLE What might predict exam performance? [Diagram: attendance at lectures, books read and motivation as predictors of exam performance (grade)]
MULTIPLE REGRESSION USING SPSS Analyze → Regression → Linear
MULTIPLE REGRESSION: SPSS OUTPUT
MULTIPLE REGRESSION: SPSS OUTPUT
MULTIPLE REGRESSION: SPSS OUTPUT For overall model:  F(2, 42) = 12.153, p<.001
MULTIPLE REGRESSION: SPSS OUTPUT Number of books read is a significant predictor, b = .33, t(42) = 2.24, p < .05 Lectures attended is a significant predictor, b = .36, t(42) = 2.41, p < .05
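For readers who want to see where these numbers come from outside SPSS, here is a hedged statsmodels sketch on simulated data (the DataFrame, its column names and the 45-case sample are assumptions, chosen only so the residual df is 42):

```python
# Hedged sketch of this kind of analysis in Python (statsmodels);
# the DataFrame below is simulated and its column names are assumptions.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 45                                              # residual df = 45 - 2 - 1 = 42
df = pd.DataFrame({
    "lectures": rng.integers(0, 21, n),
    "books": rng.integers(0, 11, n),
})
df["grade"] = 30 + 1.2 * df["lectures"] + 2.0 * df["books"] + rng.normal(0, 8, n)

model = smf.ols("grade ~ lectures + books", data=df).fit()
print(f"F({int(model.df_model)}, {int(model.df_resid)}) = {model.fvalue:.2f}, p = {model.f_pvalue:.4f}")
print(model.tvalues.round(2))                       # per-predictor t values
print(model.pvalues.round(3))                       # per-predictor p values
```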
MAJOR TYPES OF MULTIPLE REGRESSION There are different types of multiple regression: Standard multiple regression (Enter) and hierarchical multiple regression (Block entry) – theory-based model building; Sequential multiple regression (Forward, Backward, Stepwise) – statistical model building
STANDARD MULTIPLE REGRESSION Most common method.  All the predictor variables are entered into the analysis simultaneously (i.e., enter) Used to examine how much: An outcome variable is explained by a set of predictor variables as a group Variance in the outcome variable is explained by a single predictor (unique contribution).
EXAMPLE The different methods of regression and their associated outputs will be illustrated using: Outcome variable – essay mark; Predictor variables – number of lectures attended (out of 20), motivation of the student (on a scale from 0–100), number of course books read (from 0–10) [Diagram: attendance at lectures, books read and motivation as predictors of exam performance (grade)]
ENTER OUTPUT
ENTER OUTPUT R square = proportion of variance in outcome accounted for by the predictor variables  Adjusted R square = takes into account the sample size and the number of predictor variables
ENTER OUTPUT
ENTER OUTPUT Beta = standardised regression coefficient; shows the degree to which each predictor variable predicts the outcome variable, with all other predictors held constant
HIERARCHICAL MULTIPLE REGRESSION aka sequential regression Predictor variables are entered in a prearranged order of steps (i.e., block entry) Can examine how much variance is accounted for by a predictor when others are already in the model
Don’t forget  to choose the  r-square change  option from the  Statistics  menu
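The R-square change that SPSS reports for block entry can also be reproduced by fitting the two nested models yourself; a hedged sketch on simulated data (the column names, block order and effect sizes are assumptions for illustration):

```python
# Hedged sketch of block entry / R-square change outside SPSS, on simulated data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 45
df = pd.DataFrame({
    "lectures": rng.integers(0, 21, n),
    "books": rng.integers(0, 11, n),
    "motivation": rng.integers(0, 101, n),
})
df["grade"] = (30 + 1.2 * df["lectures"] + 2.0 * df["books"]
               + 0.1 * df["motivation"] + rng.normal(0, 8, n))

block1 = smf.ols("grade ~ lectures", data=df).fit()                       # block 1
block2 = smf.ols("grade ~ lectures + books + motivation", data=df).fit()  # block 2

r2_change = block2.rsquared - block1.rsquared
f_change, p_change, df_diff = block2.compare_f_test(block1)               # F-change test
print(round(r2_change, 3), round(f_change, 2), round(p_change, 3))
```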
BLOCK ENTRY OUTPUT
BLOCK ENTRY OUTPUT NB – this will be in one long line in the output!
BLOCK ENTRY OUTPUT
BLOCK ENTRY OUTPUT
STATISTICAL MULTIPLE REGRESSION aka sequential techniques
STATISTICAL MULTIPLE REGRESSION aka sequential techniques Relies on SPSS selecting which predictor variables to include in a model Three types: Forward selection Backward selection Stepwise selection
Forward – starts with no variables in the model, tries them all, includes the best predictor, repeats Backward – starts with ALL variables, removes the lowest contributor, repeats Stepwise – a combination: starts as Forward, but checks that all included variables are still making a contribution after each iteration (like Backward)
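A bare-bones illustration of the forward idea (not SPSS's exact entry/removal criteria) might look like the following, where the DataFrame and its column names are assumptions:

```python
# Minimal sketch of forward selection: add the remaining predictor with the
# smallest p value until none is significant at the chosen alpha.
import statsmodels.formula.api as smf

def forward_select(data, outcome, candidates, alpha=0.05):
    chosen = []
    remaining = list(candidates)
    while remaining:
        # p value of each candidate when added to the current model
        pvals = {}
        for c in remaining:
            formula = outcome + " ~ " + " + ".join(chosen + [c])
            pvals[c] = smf.ols(formula, data=data).fit().pvalues[c]
        best = min(pvals, key=pvals.get)
        if pvals[best] >= alpha:
            break                                  # nothing left earns its place
        chosen.append(best)
        remaining.remove(best)
    return chosen

# e.g. forward_select(df, "grade", ["lectures", "books", "motivation"])
```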
SUMMARY OF MODEL SELECTION TECHNIQUES Theory based Enter – all predictors entered together (standard) Block entry – predictors entered in groups (hierarchical) Statistical based Forward – variables are entered into the model based on their statistical significance Backward – variables are removed from the model based on their statistical significance Stepwise – variables are moved in and out of the model based on their statistical significance
ASSUMPTIONS OF REGRESSION Linearity The relationship between the dependent variable and the predictors must be linear check: violations assessed using a scatter-plot Independence Values on the outcome variable must be independent, i.e., each value comes from a different participant Homoscedasticity At each level of the predictor variable the variance of the residual terms should be equal (i.e., all data points should be roughly equally close to the line of best fit) Can indicate whether all the data are drawn from the same sample Normality Residuals/errors should be normally distributed check: violations assessed using histograms (e.g., outliers) Multicollinearity Predictor variables should not be highly correlated
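Several of these checks can also be run informally in Python; a hedged sketch continuing from the block-entry simulation above (its `block2` model and `df` DataFrame are assumptions carried over from that sketch, not the lecture's data):

```python
# Hedged sketch of routine assumption checks, reusing `block2` and `df`
# from the block-entry sketch above.
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Linearity / homoscedasticity: residuals vs fitted values should show no pattern
plt.scatter(block2.fittedvalues, block2.resid)
plt.axhline(0, color="grey")
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.show()

# Normality: histogram of residuals should look roughly normal, without extreme outliers
plt.hist(block2.resid, bins=15)
plt.show()

# Multicollinearity: variance inflation factors (values around 10 or more are a common worry)
X = sm.add_constant(df[["lectures", "books", "motivation"]])
for i, name in enumerate(X.columns):
    if name != "const":
        print(name, round(variance_inflation_factor(X.values, i), 2))
```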
OTHER IMPORTANT ISSUES Regression in this case is for continuous/interval predictors, or categorical predictors with ONLY two categories More than two categories are possible (via dummy coding) Outcome must be continuous/interval Sample size Multiple regression needs a relatively large sample size some authors suggest using between 10 and 20 participants per predictor variable others argue there should be 50 cases more than the number of predictors to be sure that one is not capitalising on chance effects
OUTCOMES So – what is regression? This lecture has: introduced the different types of regression detailed how to conduct and interpret regression using SPSS described the underlying assumptions of regression outlined the data types and sample sizes needed for regression outlined the major limitations of a regression analysis
REFERENCES Allison, P. D. (1999). Multiple regression: A primer. Thousand Oaks: Pine Forge Press. Clark-Carter, D. (2004). Quantitative psychological research: A student's handbook. Hove: Psychology Press. Coolican, H. (2004). Research methods and statistics in psychology (4th ed.). Oxon: Hodder Arnold. George, D., & Mallery, P. (2005). SPSS for Windows step by step (5th ed.). Boston: Pearson. Field, A. (2002). Discovering statistics using SPSS for Windows. London: Sage. Pallant, J. (2002). SPSS survival manual. Buckingham: Open University Press. http://www.statsoft.com/textbook/stmulreg.html#aassumption