
Garson, G. D. (2012). Log-Linear Analysis. Asheboro, NC: Statistical Associates Publishers.

ASIN number (e-book counterpart to ISBN): B00B0P10I0.
© 2012 by G. David Garson and Statistical Associates Publishers. Worldwide rights reserved in all languages and on all media. Permission is not granted to copy, distribute, or post e-books or passwords.



Also called multiway frequency analysis (MFA), log-linear analysis is a special case of the generalized linear model (an extension of the general linear model, GLM, which includes regression and ANOVA models) suited to dichotomous and categorical variables. It is a method of analyzing the distribution of cases in a table when all the variables of interest are categorical. Usually there is no "dependent variable" as in regression, though the special case of logit log-linear analysis, discussed below, can handle dependent variables. Ordinarily, however, what is predicted is not a variable but the distribution of values in the table formed by the categorical variables. The table is not limited to the usual two-way table but may be of any order (any number of categorical variables).

Thus log-linear analysis deals with association of categorical or grouped variables, looking at all levels of possible main and interaction effects, comparing this saturated model with reduced models. The primary purpose is to find the most parsimonious model which can account for cell frequencies in the table being analyzed. While log-linear analysis is a non-dependent procedure for accounting for the distribution of cases in a crosstabulation of categorical variables, it is closely related to such dependent procedures as logit and logistic, probit, and tobit regression.
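
The contrast between a saturated and a reduced model can be made concrete for a two-way table. The notation below is a common textbook convention and an assumption here, not taken from the text above: m_ij is the expected count in row i and column j of a table formed by factors A and B, and the lambda terms are the overall, main, and interaction effects.

```latex
\begin{align}
\log m_{ij} &= \lambda + \lambda^{A}_{i} + \lambda^{B}_{j} + \lambda^{AB}_{ij}
  && \text{(saturated model)} \\
\log m_{ij} &= \lambda + \lambda^{A}_{i} + \lambda^{B}_{j}
  && \text{(reduced model: independence)}
\end{align}
```

The saturated model has as many parameters as cells and so reproduces the observed counts exactly. The reduced model drops the interaction term and is preferred when it does not fit significantly worse.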

Log-linear analysis is different from logistic regression in three ways:

1.	The cell counts are assumed to follow a Poisson distribution, whereas logistic regression assumes a binomial or multinomial distribution.
2.	The link function is the natural log of the dependent variable, not the logit of the dependent as in logistic regression. (A logit is the natural log of the odds, which is the probability the dependent equals a given value [usually 1, indicating an event has occurred or a trait is present] divided by the probability it does not).
3.	Predictions are estimates of the cell counts in a contingency table, not the logit of y. That is, the cell count is the dependent variable in log-linear analysis. 
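
The difference between the two link functions can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the 2x2 counts are made up, and the names `counts`, `log_counts`, and `logit_of_y` are hypothetical. It shows the log of the cell counts (what log-linear analysis models) beside the logit of an outcome within each group (what logistic regression models), and checks that the two views imply the same log odds ratio.

```python
import math

# Hypothetical 2x2 table of counts (made-up numbers, for illustration only).
# Keys are (group, outcome): group A = 0 or 1, outcome Y = 0 or 1.
counts = {(0, 0): 40, (0, 1): 10,
          (1, 0): 20, (1, 1): 30}

# Log-linear view: the quantity modeled is the natural log of each cell count.
log_counts = {cell: math.log(n) for cell, n in counts.items()}


def logit_of_y(group):
    """Logit of P(Y = 1) within one group: ln(p / (1 - p))."""
    n1 = counts[(group, 1)]
    n0 = counts[(group, 0)]
    p = n1 / (n0 + n1)
    return math.log(p / (1 - p))


# Logistic view: the quantity modeled is the logit of Y within each group.
logits = {g: logit_of_y(g) for g in (0, 1)}

# The difference in logits between groups equals the log odds ratio
# computed directly from the log cell counts.
log_odds_ratio = (log_counts[(1, 1)] - log_counts[(1, 0)]
                  - log_counts[(0, 1)] + log_counts[(0, 0)])
logit_difference = logits[1] - logits[0]
```

In this sketch `logit_difference` and `log_odds_ratio` agree up to floating-point rounding, which is one way to see why the two procedures are closely related even though they model different quantities.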

Log-linear methods also differ from multiple regression by substituting maximum likelihood estimation of a link function of the dependent for regression's use of least squares estimation of the raw dependent variable itself. The link function transforms the dependent variable and it is this transform, not the raw variable, which is linearly related to the predictor side of the model.

There are several possible purposes for undertaking log-linear modeling. The primary one is to determine the most parsimonious model which is not significantly different from the saturated model, that is, the model that fully but trivially accounts for the cell frequencies of a table. Log-linear analysis is also used to determine if variables are related, to predict the expected frequencies (table cell values) of a dependent variable, to understand the relative importance of different independent variables in predicting a dependent, and to confirm models using a goodness of fit test (the likelihood ratio). Residual analysis can also determine where the model is working best and worst. Often researchers will use hierarchical log-linear analysis (in SPSS, the Model Selection option under Log-linear) for exploratory modeling, then use general log-linear analysis for confirmatory modeling.
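
The likelihood ratio test and residual analysis mentioned above can be sketched for the simplest case, the independence model in a two-way table. This is an illustrative sketch, not the SPSS procedure: the function name `independence_fit` and the observed counts are hypothetical, and the code assumes no row or column total is zero.

```python
import math


def independence_fit(table):
    """Fit the independence model to a two-way table of observed counts.

    Returns (expected, g2, df, residuals), where g2 is the likelihood
    ratio statistic comparing the independence model with the saturated
    model and residuals are Pearson residuals.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # G2 = 2 * sum(obs * ln(obs / exp)); zero observed cells contribute 0.
    g2 = 2.0 * sum(
        o * math.log(o / e)
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
        if o > 0
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)

    # Pearson residuals indicate which cells the model fits worst.
    residuals = [[(o - e) / math.sqrt(e) for o, e in zip(obs_row, exp_row)]
                 for obs_row, exp_row in zip(table, expected)]
    return expected, g2, df, residuals


# Made-up observed counts, for illustration only.
observed = [[40, 10],
            [20, 30]]
expected, g2, df, residuals = independence_fit(observed)
```

With one degree of freedom, the 0.05 critical value of chi-square is 3.841. A `g2` above that value means the independence model differs significantly from the saturated model, so the interaction term cannot be dropped for these counts.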

SPSS supports these related procedures, among others:

  1. Generalized linear modeling. Generalized linear modeling (GZLM), discussed in a separate Statistical Associates "Blue Book" volume, represents a more recent approach for analyzing categorical dependents and independents, thus constituting a different method for implementing log-linear analysis, as well as models for logit, probit, Poisson regression on cell count data, and others.
  2. Hierarchical log-linear analysis (HILOG). Select Analyze, Log-linear, Model Selection. HILOG is often used for automatic selection of the best hierarchical model.
  3. General log-linear analysis (GENLOG). Select Analyze, Log-linear, General. GENLOG is often used to refine the best hierarchical model to be more parsimonious by dropping terms.
  4. Logit loglinear analysis and logit regression. Used when there are one or more dependent variables.

In summary, traditional approaches to categorical data relied on chi-square and other measures of significance to establish if a relationship existed in a table, then employed any of a wide variety of measures of association to come up with a number, usually between 0 and 1, indicating how strong the relationship was. Log-linear methods are similar in function but have the advantage of making it far easier to analyze multi-way tables (more than two categorical variables) and to understand just which values of which variables and which interaction effects are contributing the most to the relationship. For simple two-variable tables, traditional approaches may still be preferred, but for multivariate analysis of three or more categorical variables, log-linear analysis is preferred.
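
The advantage for multi-way tables can be illustrated with a small sketch of the conditional independence model, which has closed-form expected counts. Everything here is an assumption made for illustration: the 2x2x2 counts are invented so that A and B are exactly independent within each level of C, and `conditional_independence_g2` is a hypothetical helper, not an SPSS or SAS function.

```python
import math

# Hypothetical 2x2x2 table (made-up counts): table[k][i][j] is the count
# for level k of C, level i of A, and level j of B.
table = [
    [[30, 10], [15, 5]],   # C = 0
    [[8, 12], [16, 24]],   # C = 1
]


def conditional_independence_g2(layers):
    """Return (G2, df) for the model in which A and B are independent
    within each level of the layer variable."""
    g2 = 0.0
    df = 0
    for layer in layers:
        row_totals = [sum(row) for row in layer]
        col_totals = [sum(col) for col in zip(*layer)]
        n_k = sum(row_totals)
        for i, row in enumerate(layer):
            for j, obs in enumerate(row):
                # Expected count within this layer under independence.
                exp = row_totals[i] * col_totals[j] / n_k
                if obs > 0:
                    g2 += 2.0 * obs * math.log(obs / exp)
        df += (len(row_totals) - 1) * (len(col_totals) - 1)
    return g2, df


# Fit within levels of C (the three-way analysis).
conditional_g2, conditional_df = conditional_independence_g2(table)

# Collapse over C and treat the two-way A-by-B table as a single layer.
collapsed = [[table[0][i][j] + table[1][i][j] for j in range(2)]
             for i in range(2)]
marginal_g2, marginal_df = conditional_independence_g2([collapsed])
```

For these invented counts the conditional model fits exactly, while the collapsed two-way table shows a nonzero `marginal_g2`. Whatever association appears in the two-way table is therefore accounted for by C, which a two-variable crosstabulation alone would not reveal.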

    The full content is now available from Statistical Associates Publishers.

    Below is the unformatted table of contents.

    Log-linear Analysis
    Table of Contents
    Overview	8
    Key Concepts and Terms	10
    Types of log-linear analysis	10
    General log-linear analysis	10
    Hierarchical log-linear analysis	11
    Types of variables	11
    Factors	12
    Covariates	12
    Cell structure variables/cell weight variables	12
    Contrast variables	12
    Types of models	12
    Saturated models and effects	12
    Parsimonious models	14
    The complete independence model	15
    The one factor independence model	15
    The conditional independence model	16
    The homogenous association model	18
    The symmetry model	19
    The conditional symmetry model	19
    General log-linear modeling: SPSS user interface	20
    The "Model" button	21
    The "Options" button	23
    The "Save" button	24
    General log-linear analysis compared to crosstabulation (SPSS)	24
    Log-linear effects as categorical control variables in crosstabulation	24
    General log-linear analysis of the crosstab example	26
    Goodness of fit in log-linear analysis	28
    Types of goodness of fit measures	28
    Likelihood ratio	28
    Pearson chi-square	29
    Factor list warning	29
    A simple goodness of fit example	29
    General log-linear analysis using SPSS	30
    Overview	30
    Example	31
    The saturated model	32
    The independence model	34
    Model dropping the highest level of interaction	36
    The conditional independence model	37
    General log-linear analysis using SAS	39
    Example	39
    SAS syntax	39
    SAS output for the saturated model	41
    SAS output for the independence model	41
    SAS output for the homogenous association model	42
    SAS output for the conditional independence model	43
    Residual analysis	45
    Overview	45
    Residuals depend on the model	45
    Residuals of the most parsimonious model	46
    Adjusted residuals plots	47
    Normal probability (Q-Q) plots	48
    Deviance residual plots	50
    Normal probability (Q-Q) plots for deviance	51
    Parameter estimates and odds ratios	51
    Overview	51
    Parameter estimates	52
    Standardized parameter estimates (Z scores)	54
    Model equations in log-linear analysis	54
    Predicted frequencies	55
    Odds ratios	57
    Example	57
    Hierarchical log-linear analysis	61
    Overview	61
    The SPSS user interface for hierarchical linear modeling	61
    The initial "Model Selection Loglinear Analysis" dialog	61
    The "Model" button dialog	62
    The "Options" button dialog	63
    Statistical output for hierarchical log-linear analysis in SPSS	64
    The "Cell Counts and Residuals" table	64
    The "Step Summary" table	65
    The "Goodness of Fit Tests" table	67
    The "Parameter Estimates" table	68
    "Tests of K-Way and Higher-Order Effects" table	70
    The "Partial Associations" table	71
    Ordinal log-linear models	73
    Overview	73
    Linear-by-linear association models	73
    Linear-by-linear modeling in SPSS	73
    Example	73
    Data setup	74
    Statistical output for the linear-by-linear ordinal model	75
    Row-effects models	76
    Overview	76
    Data setup	76
    Statistical output for the row-effects ordinal model	77
    Column-effects models	78
    Logit log-linear models and logit regression	79
    Overview	79
    Example	79
    The SPSS user interface for logit log-linear analysis	79
    The main logit log-linear user interface	79
    The "Model" button dialog	81
    The "Options" button dialog	83
    The "Save" button dialog	84
    Logit log-linear statistical output in SPSS	84
    Model	84
    The "Goodness-of-fit Tests" table	84
    The "Analysis of Dispersion" and "Measure of Association" tables	85
    The "Parameter Estimates" table	86
    The "Cell Counts and Residuals" table	88
    Conditional logit regression models	89
    Matched pairs or panel data	89
    Conditional logit regression in SPSS	90
    Choice models	90
    Statistical output for conditional logit regression in SPSS	91
    Assumptions of log-linear models	91
    Not assumed	91
    Well-populated tables	91
    Small models with few variables	92
    Adequate sample size	92
    No zero cells	92
    No important outliers	93
    Normally distributed residuals	93
    No binned interval-level data	93
    Evenly distributed categories	93
    Independence	93
    Data distribution assumptions	94
    Appropriate dispersion	95
    Absence of endogenous regressors	95
    Frequently Asked Questions	95
    Why not just use regression with dichotomous dependents?	95
    Why not just use crosstabulation and ordinal measures of association rather than ordinal log-linear analysis?	96
    What computer packages implement log-linear analysis?	96
    What are second-order and partial odds ratios?	96
    What are structural zeros and sampling zeros in the SPSS "Data Information" table?	97
    Since logit and probit generally lead to the same statistical conclusions, when is one better than the other?	97
    Do I really need to do multinomial logit (multinomial logistic regression) or multinomial probit? Could I just apply M different logit or probit models for a variable with M levels?	98
    What if my variables are multiple-response type?	98
    Explain "partial odds".	98
    Explain coding in saturated vs. nonsaturated models.	98
    What is log-linear analysis with latent variables?	99
    Bibliography	99
    Pagecount: 103