
MULTIPLE REGRESSION
An illustrated tutorial and introduction to multiple linear regression analysis using SPSS, SAS, or Stata. Suitable for introductory graduate-level study.
The 2014 edition is a major update to the 2012 edition.
Below is the table of contents.
MULTIPLE REGRESSION
Overview 13
Data examples in this volume 16
Key Terms and Concepts 17
  OLS estimation 17
  The regression equation 18
  Dependent variable 20
  Independent variables 21
  Dummy variables 21
  Interaction effects 22
    Interactions 22
    Centering 23
    Significance of interaction effects 23
    Interaction terms with categorical dummies 24
    Plotting interactions through simple slope analysis 24
    Separate regressions 27
  Predicted values 28
    SPSS 28
    SAS 28
    Stata 29
    Adjusted predicted values 30
  Residuals 31
  Centering 31
OLS regression in SPSS 32
  Example 32
  SPSS input 32
  SPSS Output 33
    The regression coefficient, b 33
    Interpreting b for dummy variables 34
    Confidence limits on b 35
    Beta weights 35
    Zero-order, partial, and part correlations 36
    R2 and the "Model Summary" table 39
    The Anova table 40
    Tolerance and VIF collinearity statistics 40
  SPSS plots 41
    SPSS "Plots" dialog 41
    Plot of standardized residuals against standardized predicted values 43
    Histogram of standardized residuals 44
    Normal probability (P-P) plot 45
OLS regression in SAS 46
  Example 46
  SAS input 47
  SAS output 48
    The regression coefficient, b 48
    Interpreting b for dummy variables 49
    Confidence limits on b 49
    Beta weights 50
    Zero-order, partial, and part correlation 52
    R-squared and the Anova table 53
    Tolerance and VIF collinearity statistics 54
  SAS Plots 55
    SAS plotting options 55
    Plot of residuals against predicted values 57
    Histogram and kernel density plot of standardized residuals 58
    Normal probability (P-P) plot 59
    Normal quantile-quantile (Q-Q) plot 60
    Other SAS plots 61
OLS regression in Stata 64
  Example 64
  Stata input 65
  Stata output 66
    The regression coefficient, b 66
    Interpreting b coefficients 67
    Confidence limits on b 68
    Beta weights 68
    R-squared and the Anova table 68
    Zero-order, partial, and part correlation 69
    Tolerance and VIF collinearity statistics 69
    Other Stata postestimation output 70
  Stata Plots 71
    Stata plotting options 71
    Plot of standardized residuals against standardized predicted values 71
    Histogram of standardized residuals 73
    Normal probability (P-P) plot 74
    Margin plots 75
Robust regression 75
  Overview 75
  When to use robust regression 76
  Robust regression in SPSS 76
    Overview 76
    SPSS input 77
    SPSS output 77
  Robust regression in SAS 78
    SAS input 78
    SAS output 80
  Robust regression in Stata 81
    Stata input 81
    Stata output 81
Hierarchical multiple regression 82
  Overview 82
  Examples 83
Difference in differences regression 83
  Overview 83
  The parallel trend assumption 84
  Example data 85
  Data setup 86
  The model 86
  Difference modeling in SPSS 89
    SPSS input 89
    Should the dependent variable be linear or logarithmic? 90
    SPSS output 92
  Difference modeling in SAS 94
    SAS input 94
    Should the dependent variable be linear or logarithmic? 95
    SAS output 96
  Difference modeling in Stata 98
    Stata input 98
    Should the dependent variable be linear or logarithmic? 98
    Stata output 99
Panel data regression 101
  Overview 101
  Types of panel data regression 101
  Software for panel data regression 103
Stepwise Multiple Regression 103
  Overview 103
  Forward, backward, and stepwise regression 103
  Warning 104
  Other problems of stepwise regression 105
  Dummy variables in stepwise regression 105
  Example 106
  Stepwise regression in SPSS 106
    Overview 106
    SPSS input 106
    SPSS Output 107
  Stepwise regression in SAS 108
    Overview 108
    Example 109
    SAS input 109
    SAS output 110
  Stepwise regression in Stata 111
    Overview 111
    Example 113
    Stata input 113
    Stata output 113
Model selection regression in SPSS 114
  Overview 114
  Example 116
  The "Fields" tab 116
  The "Build Options" tab 118
  The "Model Options" tab 123
  Model Viewer 124
    Default output 124
    Model Viewer interface 125
    Model Viewer: Automatic Data Preparation Table 127
    Model Viewer: Model Building Summary window 128
    Model Viewer: Coefficients window 129
    Coefficient importance 132
    Model Viewer: Effects window 133
    Model Viewer: Predicted by Observed window 136
    Model Viewer: Estimated Means window 137
    Model Viewer: Residuals window 139
    Model Viewer: Outliers window 141
  Example 2: A boosted ensemble model 142
    Model Viewer: Model Summary window 142
    Model Viewer: Predictor Frequency window 142
  Model Viewer: Saving and printing output 143
Model selection regression in SAS 146
  Overview 146
  Selection criteria 146
  Example 147
  PROC REG/SELECTION method 147
    SAS input 147
    SAS output 148
    Summary 152
  PROC GLMSELECT method 152
    Overview 152
    SAS input 152
    SAS output 153
Quantile Regression 158
  Overview 158
    Introduction 158
    Pseudo-R2 159
    Standard errors and coefficient significance 160
    Interpreting quantile regression coefficients 161
    Conditional vs. unconditional quantile regression 162
    Generalized quantile regression (GQR and IV-GQR) 164
    Quantile Regression for Panel Data (QRPD) 166
    Example 167
  Quantile regression in SPSS 168
    Overview 168
    Heteroscedasticity test 170
    SPSS input 171
    SPSS output 174
  Quantile regression in SAS 178
    Overview 178
    Heteroscedasticity test 178
    SAS input 179
    SAS output 180
  Model selection quantile regression in SAS 186
    Overview 186
    Example 186
    SAS input 187
    SAS output 188
  Quantile regression in Stata 193
    Overview 193
    Heteroscedasticity test 193
    Stata input 194
    Stata output 195
More about significance tests in regression models 199
  Overview 199
  Complex samples 200
  F test 201
    Overall test of the model and R2 201
    Small samples 202
    The partial F test 202
  t-tests 203
    One- vs. two-tailed t-tests 204
    t-tests for dummy variables 205
  Confidence limits and standard errors 205
    Confidence intervals and prediction intervals 205
    The confidence interval of the regression coefficient 206
    The confidence interval of y, the dependent variable 206
    The prediction interval of y, the dependent variable 209
    Standard error of estimate (SEE) / root mean square error (MSE) 210
    Standard error and mean standard error of predicted values [SE(Pred) and MSEP] 211
More about effect size measures in multiple regression 215
  Beta weights 215
    Overview 215
    Using beta weights in model comparisons 216
    Significance of beta 217
    Unique v. joint effects 217
    Standardization and comparability of variables 217
    Beta weights over 1.0 218
    Labeling: b and beta 218
  Correlation 218
    Zero-order correlation, r 218
    Semipartial (part) correlation 220
    Partial correlation squared 220
  R-squared 221
    Overview 221
    Warning regarding R-square differences between samples 222
    Adding variables 222
    Adjusted R-square 222
    R-squared difference tests 224
  Level importance 232
  The intercept 233
Residual analysis and diagnostics 234
  Outliers, influence, leverage, and distance 234
    Overview 234
    Outliers and residuals 235
    Leverage and distance 235
    Influence 235
    Example 236
  Types of residuals and plots 236
    Unstandardized residuals 236
    Standardized residuals 236
    Standardized deleted residuals 237
    Studentized residuals 237
    Studentized deleted residuals 238
    Which type of residual to use? 238
  Types of residual plots 238
    Residual plots 238
    Partial regression plots 239
    Partial residual plots 239
    Error histograms 240
    Normal probability-probability (P-P) plots 240
    Normal quantile-quantile (Q-Q) plots 240
  Coefficients flagging unusual observations 244
    Different outlier status definitions 244
    Leverage values 244
    Cook's distance 245
    Mahalanobis distance 246
    DFFITS 247
    Standardized DFFITS 247
    DFBETA 248
    Standardized DFBETA 249
    Covariance ratio (COVRATIO) 250
  What to do about outliers 250
  Residual analysis in SPSS 252
    Obtaining residuals-related statistics 252
    Saving residuals to file 253
    Listing cases with the largest residuals 256
    Checking for serial independence 257
    Checking for homoscedastic error 257
    Checking for nonlinearity 262
    Checking for normally distributed error 265
    Checking for outliers 271
  Residual analysis in SAS 281
    Obtaining residuals-related statistics 281
    Saving residuals to file 284
    Scatterplots of bivariate relationships 286
    Listing cases with the largest residuals 287
    Checking for serial independence 288
    Checking for homoscedastic error 289
    Checking for nonlinearity 289
    Checking for normally distributed error 291
    Checking for outliers 295
    Other ODS plot options 304
  Residual analysis in Stata 307
    Obtaining residuals-related statistics 307
    Saving residuals to file 307
    Listing cases with the largest residuals 308
    Checking for serial independence 308
    Checking for homoscedastic error 309
    Checking for nonlinearity 312
    Checking for normally distributed error 313
    Checking for outliers 318
    Additional plotting options in Stata 327
Multicollinearity 327
  Types of multicollinearity 328
  The correlation matrix 328
  Tolerance 328
  VIF (variance-inflation factor) 329
  Condition index values and variance proportions 330
  Checking multicollinearity in SPSS 331
    VIF and tolerance 331
    Condition index and variance proportions 332
  Checking multicollinearity in SAS 332
    SAS syntax 332
    VIF and tolerance 332
    Condition index and variance proportions 333
  Checking multicollinearity in Stata 333
    VIF and tolerance 333
    Condition index and variance proportions 334
Assumptions 335
  Proper specification of the model 335
    Spuriousness 336
    Suppression 336
    Ramsey's RESET test for misspecification 337
  Proper specification of the research question 340
  Appropriate modeling of control variables 340
  Population error is assumed to be uncorrelated with each of the independent variables 341
  Nonrecursivity 341
  No overfitting 342
  Absence of perfect multicollinearity 342
  Absence of high partial multicollinearity 343
  Linearity 345
    Nonlinear transformations 345
    Nonlinear link functions 345
  Data level 346
  Multivariate normality 347
  Normally distributed residuals (error) 348
  Homoscedasticity 348
    Robust standard errors 349
    Robust regression 350
  Outliers 350
  Reliability 351
  Additivity 351
  Independent observations (absence of autocorrelation) 352
    Overview 352
    Graphical test of serial independence 353
    The Durbin-Watson coefficient 357
  Mean population error of zero 362
  Random sampling 362
  Validity 363
  Other data requirements 363
Frequently Asked Questions 364
  How do I report regression results? 364
  What is the logic behind the calculation of regression coefficients in multiple regression? 367
  How large a sample size do I need to do multiple regression? 367
  Can R-squared be interpreted as the percent of the cases explained? 368
  When may ordinal data be used in regression? 368
  When testing for interactions, is there a strategy alternative to adding multiplicative interaction terms to the equation and testing for R2 increments? 370
  How do margin plots reveal interaction effects? 371
    The regress command 371
    The margins command 372
    The marginsplot command 373
  How do I code dummy variables in regression? 375
  What is "attenuation" in the context of regression? 379
  Is multicollinearity only relevant if there are significant findings? 379
  What can be done to handle multicollinearity? 380
  What can be done to handle autocorrelation? 381
  How does stepwise multiple regression relate to multicollinearity? 382
  What are forward inclusion and backward elimination in stepwise regression? 382
  Should I keep dropping nonsignificant independent variables one at a time until only significant ones remain in my model? 382
  What are different types of sums of squares used in F tests? 383
  Can regression be used in place of Anova for analysis of categorical independents affecting an interval dependent? 385
  Does regression analysis require uncorrelated independent variables? 385
  How can you test the significance of the difference between two R-squareds? 385
  How do I compare b coefficients after I compute a model with the same variables for two subgroups of my sample? 386
  How do I compare regression results obtained for one group of subjects to results obtained in another group, assuming the same variables were used in each regression model? 386
  What do I do if I have censored, truncated, or sample-selected data? 387
  What do I do if I am measuring the same independent variable at both the individual and group level? 387
  What is a "relative effects" regression model? 388
  How do I test to see what effect a quadratic or other nonlinear term makes in my regression model? 389
  What is "smoothing" in regression and how does it relate to dealing with nonlinearities in OLS regression? 389
  What is nonparametric regression for nonlinear relationships? 391
  What is Poisson regression? 394
  SPSS questions 394
    What is the command syntax for linear regression in SPSS? 394
    How do I standardize variables in SPSS? 396
    How do I create interaction variables in SPSS? 396
    All I want is a simple scatterplot with a regression line. Why won't SPSS give it to me? 397
    What is categorical regression in SPSS? 398
Acknowledgments 399
Bibliography 400
Page count: 410