Discriminant analysis is useful for
situations where you want to build a predictive model of group membership based
on observed characteristics of each case. The procedure generates a discriminant function (or, for more than two groups, a set
of discriminant functions) based on linear
combinations of the predictor variables that provide the best discrimination
between the groups. The functions are generated from a sample of cases for
which group membership is known; the functions can then be applied to new cases
with measurements for the predictor variables but unknown group membership.
Note: The grouping variable can have
more than two values. The codes for the grouping variable must be integers,
however, and you need to specify their minimum and maximum values. Cases with
values outside of these bounds are excluded from the analysis.
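The exclusion rule in the note can be sketched in a few lines of Python. This is only an illustration, not the procedure's actual implementation: the function name `select_cases`, the field name `zone`, and the sample records are all hypothetical, and treating a non-integer code as excluded is an assumption made for the sketch.

```python
def select_cases(cases, group_key, min_code, max_code):
    """Keep cases whose group code is an integer within [min_code, max_code]."""
    kept = []
    for case in cases:
        code = case.get(group_key)
        # Assumption: non-integer codes are dropped along with out-of-range ones.
        if isinstance(code, int) and min_code <= code <= max_code:
            kept.append(case)
    return kept


# Hypothetical records: zone 1 = tropical, zone 2 = temperate.
cases = [
    {"country": "A", "zone": 1},
    {"country": "B", "zone": 2},
    {"country": "C", "zone": 3},  # outside the bounds 1..2, so excluded
]
selected = select_cases(cases, "zone", 1, 2)
```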
Example. On
average, people in temperate zone countries consume more calories per day than
those in the tropics, and a greater proportion of the people in the temperate
zones are city dwellers. A researcher wants to combine this information in a
function and determine how well it discriminates between the two
groups of countries. The researcher thinks that population size and economic
information may also be important. Discriminant
analysis allows you to estimate coefficients of the linear discriminant
function, which looks like the right side of a multiple linear regression
equation. That is, using coefficients a, b, c, and d,
the function is:
D = a * calories + b * urban + c * population + d * gross domestic product per capita
If these variables are useful for discriminating between the two
climate zones, the values of D will differ for the
temperate and tropical countries. If you use a stepwise variable selection
method, you may find that you do not need to include all four variables in the
function.
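As a rough illustration of how such coefficients can be estimated for two groups, the sketch below applies the classical Fisher approach: the coefficient vector is the inverse of the pooled within-groups covariance matrix times the difference of the group means. It uses only two predictors (calories and percent urban) so that the matrix inverse can be written out by hand. The data values are invented for the example, all function names are hypothetical, and the resulting coefficients are not scaled the way the procedure's output would be.

```python
def mean_vector(rows):
    """Column means of a list of equal-length tuples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def scatter(rows, mean):
    """Sum of squares and cross-products of deviations from the mean."""
    p = len(mean)
    s = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [r[j] - mean[j] for j in range(p)]
        for i in range(p):
            for j in range(p):
                s[i][j] += d[i] * d[j]
    return s


def fisher_coefficients(group1, group2):
    """Two-predictor Fisher coefficients: inverse(pooled covariance) * mean difference."""
    m1, m2 = mean_vector(group1), mean_vector(group2)
    s1, s2 = scatter(group1, m1), scatter(group2, m2)
    dof = len(group1) + len(group2) - 2
    # Pooled within-groups covariance matrix (2 x 2).
    w = [[(s1[i][j] + s2[i][j]) / dof for j in range(2)] for i in range(2)]
    det = w[0][0] * w[1][1] - w[0][1] * w[1][0]
    inv = [[w[1][1] / det, -w[0][1] / det],
           [-w[1][0] / det, w[0][0] / det]]
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


def score(coeffs, row):
    """Discriminant score D for one case (no constant term in this sketch)."""
    return sum(c * x for c, x in zip(coeffs, row))


# Invented (calories per day, percent urban) values, for illustration only.
temperate = [(3200, 75), (3400, 80), (3000, 70), (3300, 72)]
tropical = [(2300, 40), (2500, 45), (2200, 30), (2400, 50)]

coeffs = fisher_coefficients(temperate, tropical)
mean_d_temperate = sum(score(coeffs, r) for r in temperate) / len(temperate)
mean_d_tropical = sum(score(coeffs, r) for r in tropical) / len(tropical)
```

If the predictors separate the groups, the mean of D in one group sits clearly apart from the mean in the other, which is what the comparison of `mean_d_temperate` and `mean_d_tropical` checks.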
Statistics.
For each variable: means, standard deviations, univariate
ANOVA. For each analysis: Box's M, within-groups
correlation matrix, within-groups covariance matrix, separate-groups covariance
matrix, total covariance matrix. For each canonical discriminant function: eigenvalue,
percentage of variance, canonical correlation, Wilks'
lambda, chi-square. For each step: prior probabilities, Fisher's
function coefficients, unstandardized function coefficients, Wilks' lambda for
each canonical function.
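Several of the per-function statistics listed above follow directly from the eigenvalues, using the standard textbook relationships: percentage of variance is each eigenvalue's share of the total, the canonical correlation is the square root of eigenvalue / (1 + eigenvalue), and Wilks' lambda for a function and all later ones is the product of 1 / (1 + eigenvalue) over those functions. The sketch below assumes the eigenvalues are already available. The function name `canonical_summary` and the input values are hypothetical, and the chi-square statistic is left out.

```python
import math


def canonical_summary(eigenvalues):
    """Derive per-function statistics from canonical discriminant eigenvalues."""
    total = sum(eigenvalues)
    rows = []
    for k, ev in enumerate(eigenvalues):
        # Wilks' lambda testing function k together with all later functions.
        wilks = 1.0
        for later in eigenvalues[k:]:
            wilks *= 1.0 / (1.0 + later)
        rows.append({
            "eigenvalue": ev,
            "pct_variance": 100.0 * ev / total,
            "canonical_corr": math.sqrt(ev / (1.0 + ev)),
            "wilks_lambda": wilks,
        })
    return rows


# Hypothetical eigenvalues for a three-group analysis (two functions).
summary = canonical_summary([3.0, 1.0])
```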