
DISCRIMINANT FUNCTION ANALYSIS
Overview
Discriminant function analysis, also known as discriminant analysis or simply DA, is used to classify cases into the values of a categorical dependent, usually a dichotomy. If discriminant function analysis is effective for a set of data, the classification table of correct and incorrect estimates will yield a high percentage correct. Discriminant function analysis is found in SPSS under Analyze>Classify>Discriminant. If the specified grouping variable has two categories, the procedure is considered "discriminant analysis" (DA). If there are more than two categories the procedure is considered "multiple discriminant analysis" (MDA).
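As an illustration of what a classification table summarizes, the following is a minimal sketch and not the SPSS procedure itself. It assumes a hypothetical toy data set with two predictors and two well-separated groups, uses Fisher's two-group linear discriminant rule, and places the cutoff midway between the group centroids (which presumes equal prior probabilities). All data values and function names are invented for the example.

```python
# Hypothetical toy data: two predictors (x1, x2) measured on two groups.
group0 = [(2.0, 3.1), (1.5, 2.4), (2.2, 2.9), (1.8, 3.5), (2.6, 3.0), (1.2, 2.2)]
group1 = [(3.9, 4.8), (4.4, 5.1), (3.6, 4.1), (4.8, 5.6), (4.1, 4.4), (3.3, 5.0)]


def mean_vector(rows):
    """Mean of each of the two predictors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter_matrix(rows, mean):
    """2x2 sums of squares and cross-products about the given mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def discriminant_score(row, weights):
    """Linear combination of the predictors (no constant term)."""
    return weights[0] * row[0] + weights[1] * row[1]


def fit_two_group_discriminant(g0, g1):
    """Fisher's rule: weights = W^-1 (mean1 - mean0), W = pooled within-group scatter."""
    m0, m1 = mean_vector(g0), mean_vector(g1)
    s0, s1 = scatter_matrix(g0, m0), scatter_matrix(g1, m1)
    w = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = w[0][0] * w[1][1] - w[0][1] * w[1][0]
    inv = [[w[1][1] / det, -w[0][1] / det], [-w[1][0] / det, w[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    weights = [inv[i][0] * diff[0] + inv[i][1] * diff[1] for i in range(2)]
    # Assumption: equal priors, so the cutoff is midway between the centroids' scores.
    cutoff = (discriminant_score(m0, weights) + discriminant_score(m1, weights)) / 2
    return weights, cutoff


weights, cutoff = fit_two_group_discriminant(group0, group1)

# classification_table[actual][predicted] counts cases.
classification_table = [[0, 0], [0, 0]]
for actual, rows in enumerate((group0, group1)):
    for row in rows:
        predicted = 1 if discriminant_score(row, weights) > cutoff else 0
        classification_table[actual][predicted] += 1

n_cases = len(group0) + len(group1)
percent_correct = 100.0 * (classification_table[0][0] + classification_table[1][1]) / n_cases
print("Classification table (rows = actual, columns = predicted):", classification_table)
print("Percent correct:", percent_correct)
```

The diagonal of the table holds the correctly classified cases. The toy groups are deliberately far apart, so the percentage correct is high; real data rarely separate this cleanly, and a table computed on the same cases used to fit the function tends to be optimistic.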
Multiple discriminant analysis (MDA) is a cousin of multivariate analysis of variance (MANOVA), sharing many of the same assumptions and tests. MDA is sometimes also called discriminant factor analysis or canonical discriminant analysis.
Binary and multinomial logistic regression, treated in a separate Statistical Associates "Blue Book" volume, are often used in place of DA and MDA respectively. If the assumptions of discriminant analysis are met, however, discriminant analysis has greater power than logistic regression: there is less chance of Type II errors (accepting a false null hypothesis). If the data violate the assumptions of discriminant analysis, outlined below, then logistic regression may be preferred because it usually involves fewer violations of assumptions (independent variables needn't be normally distributed, linearly related, or have equal within-group variances), is robust, handles categorical as well as continuous variables, and has coefficients which many find easier to interpret. Logistic regression is preferred when data are not normal in distribution or group sizes are very unequal.
There are several purposes for DA and/or MDA:
To classify cases into groups using a discriminant prediction equation.
To test theory by observing whether cases are classified as predicted.
To investigate differences between or among groups.
To determine the most parsimonious way to distinguish among groups.
To determine the percent of variance in the dependent variable explained by the independents.
To determine the percent of variance in the dependent variable explained by the independents over and above the variance accounted for by control variables, using sequential discriminant analysis.
To assess the relative importance of the independent variables in classifying the dependent variable.
To discard variables which are little related to group distinctions.
To infer the meaning of MDA dimensions which distinguish groups, based on discriminant loadings.
Discriminant analysis has two basic steps: (1) an F test (Wilks' lambda) is used to test if the discriminant model as a whole is significant, and (2) if the F test shows significance, then the individual independent variables are assessed to see which differ significantly in mean by group, and these are used to classify the dependent variable.
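The two steps can be sketched numerically. The example below is only an illustration under stated assumptions: the data are hypothetical, there are two groups and two predictors, Wilks' lambda is computed as the ratio of the determinants of the within-group and total sums-of-squares-and-cross-products matrices, and the F conversion shown is the exact form for the two-group case. The variable and function names are invented for the example; a p-value would require an F distribution routine, which is not included here.

```python
# Hypothetical toy data: two predictors (x1, x2) measured on two groups.
group0 = [(2.0, 3.1), (1.5, 2.4), (2.2, 2.9), (1.8, 3.5), (2.6, 3.0), (1.2, 2.2)]
group1 = [(3.9, 4.8), (4.4, 5.1), (3.6, 4.1), (4.8, 5.6), (4.1, 4.4), (3.3, 5.0)]


def scatter(rows):
    """2x2 sums of squares and cross-products about the mean of `rows`."""
    n = len(rows)
    mean = [sum(r[j] for r in rows) / n for j in range(2)]
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


s0, s1 = scatter(group0), scatter(group1)
within = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]  # pooled within-group
total = scatter(group0 + group1)                                      # about the grand mean

n_cases = len(group0) + len(group1)
n_predictors = 2

# Step 1: overall test. Lambda near 0 means the groups are well separated;
# lambda near 1 means the group centroids barely differ.
wilks_lambda = det2(within) / det2(total)
# Exact F for two groups, with (n_predictors, n_cases - n_predictors - 1) df.
f_model = ((1 - wilks_lambda) / wilks_lambda) * (
    (n_cases - n_predictors - 1) / n_predictors
)

# Step 2: one-variable-at-a-time tests of equality of group means.
# For a single variable, lambda is within SS / total SS, with (1, n_cases - 2) df.
univariate = []
for j, name in enumerate(("x1", "x2")):
    lam_j = within[j][j] / total[j][j]
    f_j = ((1 - lam_j) / lam_j) * (n_cases - 2)
    univariate.append((name, lam_j, f_j))

print("Wilks' lambda:", wilks_lambda, "F:", f_model)
for name, lam_j, f_j in univariate:
    print(name, "lambda:", lam_j, "F:", f_j)
```

Each F value would then be compared with the F distribution at the stated degrees of freedom. With more than two groups the conversion from lambda to F is no longer this simple, so the sketch should not be extended to MDA as written.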
Discriminant analysis shares all the usual assumptions of correlation, requiring linear and homoscedastic relationships and untruncated interval or near-interval data. Like multiple regression and most statistical procedures, DA also assumes proper model specification (inclusion of all important independents and exclusion of causally extraneous but correlated variables). DA also assumes the dependent variable is a true dichotomy, since data which are forced into dichotomous coding are truncated, attenuating correlation.
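The attenuation that results from forcing a continuous variable into a dichotomy can be seen in a small simulation. This is a sketch under assumptions that are not in the text: simulated normal data with a population correlation of about 0.6, a fixed random seed, and a split at the sample median. The names used are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# x and y are continuous; y = 0.6*x + noise gives a population correlation near 0.6.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.6 * xi + 0.8 * rng.gauss(0, 1) for xi in x]

# Force y into a dichotomy by splitting at its sample median.
median_y = sorted(y)[n // 2]
y_forced = [1 if yi > median_y else 0 for yi in y]

r_continuous = pearson(x, y)
r_forced = pearson(x, y_forced)

print("Correlation with continuous y:", r_continuous)
print("Correlation with dichotomized y:", r_forced)
```

The correlation with the dichotomized variable comes out lower than the correlation with the original continuous variable, because the split discards the information about how far each case lies from the cut point.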
The full content is now available from Statistical Associates Publishers.
Below is the table of contents.
DISCRIMINANT FUNCTION ANALYSIS
Table of Contents
Overview 6
Key Terms and Concepts 7
  Variables 7
  Discriminant functions 7
  Pairwise group comparisons 8
  Output statistics 8
Examples 9
SPSS user interface 9
  The "Statistics" button 10
  The "Classify" button 10
  The "Save" button 13
  The "Bootstrap" button 13
  The "Method" button 14
SPSS Statistical output for two-group DA 16
  The "Analysis Case Processing Summary" table 16
  The "Group Statistics" table 16
  The "Tests of Equality of Group Means" table 16
  The "Pooled Within-Group Matrices" and "Covariance Matrices" tables 18
  The "Box's Test of Equality of Covariance Matrices" tables 18
  The "Eigenvalues" table 19
  The "Wilks' Lambda" table 21
  The "Standardized Canonical Discriminant Function Coefficients" table 21
  The "Structure Matrix" table 23
  The "Canonical Discriminant Functions Coefficients" table 23
  The "Functions at Group Centroids" table 24
  The "Classification Processing Summary" table 24
  The "Prior Probabilities for Groups" table 25
  The "Classification Function Coefficients" table 25
  The "Casewise Statistics" table 26
  Separate-groups graphs of canonical discriminant functions 27
  The "Classification Results" table 27
SPSS Statistical output for three-group MDA 28
  Overview and example 28
  MDA and DA similarities 28
  The "Eigenvalues" table 29
  The "Wilks' Lambda" table 29
  The "Structure Matrix" table 30
  The "Territorial Map" 31
  Combined-groups plot 34
  Separate-groups plots 34
SPSS Statistical output for stepwise discriminant analysis 35
  Overview 35
  Example 35
  Stepwise discriminant analysis in SPSS 36
Assumptions 41
  Proper specification 41
  True categorical dependent variables 41
  Independence 41
  No lopsided splits 41
  Adequate sample size 41
  Interval data 42
  Variance 42
  Random error 42
  Homogeneity of variances (homoscedasticity) 42
  Homogeneity of covariances/correlations 42
  Absence of perfect multicollinearity 43
  Low multicollinearity of the independents 43
  Linearity 43
  Additivity 43
  Multivariate normality 43
Frequently Asked Questions 44
  Isn't discriminant analysis the same as cluster analysis? 44
  When does the discriminant function have no constant term? 44
  How important is it that the assumptions of homogeneity of variances and of multivariate normal distribution be met? 44
  In DA, how can you assess the relative importance of the discriminating variables? 44
  Dummy variables 45
  In DA, how can you assess the importance of a set of discriminating variables over and above a set of control variables? (What is sequential discriminant analysis?) 45
  What is the maximum likelihood estimation method in discriminant analysis (logistic discriminant function analysis)? 45
  What are Fisher's linear discriminant functions? 46
  I have heard DA is related to MANCOVA. How so? 46
  How does MDA work? 46
  How can I tell if MDA worked? 46
  For any given MDA example, how many discriminant functions will there be, and how can I tell if each is significant? 47
  What are Mahalanobis distances? 47
  How are the multiple discriminant scores on a single case interpreted in MDA? 47
  Likewise in MDA, there are multiple standardized discriminant coefficients, one set for each discriminant function. In dichotomous DA, the ratio of the standardized discriminant coefficients is the ratio of the importance of the independent variables. But how are the multiple sets of standardized coefficients interpreted in MDA? 48
  Are the multiple discriminant functions the same as factors in principal-components factor analysis? 48
  What is the syntax for discriminant analysis in SPSS? 48
Bibliography 50
Page count: 52