Discriminant Analysis

Like regression, but used when the dependent variable is categorical

http://stats.idre.ucla.edu/spss/output/discriminant-analysis/

Statistics associated with discriminant analysis
 * Canonical Correlation
   * The extent of association between the discriminant scores and the groups
 * Centroid
   * Mean value of the discriminant scores for a particular group
 * Classification Matrix
   * Contains the number of correctly classified and misclassified cases
 * Discriminant Function Coefficients
   * Unstandardised multipliers of the variables, used when the variables are in their original units of measurement
 * Discriminant Scores
   * Unstandardised coefficients multiplied by the variable values. The products are summed and added to the constant term.
 * Eigenvalues
   * The ratio of the between-group sum of squares to the within-group sum of squares
   * Shows how much discriminating ability a function has
   * The larger the better
 * F Values and their Significance
   * Calculated from a one-way ANOVA, with the grouping variable as the categorical independent variable and each predictor in turn as the dependent variable
 * Group means and their standard deviations
   * Computed for each predictor for each group
 * Pooled within-group correlation matrix
   * Computed by averaging the separate covariance matrices of all the groups
 * Standardised discriminant function coefficients
   * Multipliers used when the variables have been standardised to a mean of 0 and a variance of 1
 * Structure coefficients
   * Simple correlations between the predictors and the discriminant function
 * Total correlation matrix
   * Correlations computed with all cases treated as if they were a single sample
 * Wilks' Lambda
   * The ratio of the within-group sum of squares to the total sum of squares
   * A large value (near 1) means that there is little difference between the group means
   * A small value (near 0) means that the group means differ
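
As a rough illustration of how the sums of squares relate to the eigenvalue, Wilks' Lambda, the canonical correlation and the F value, here is a minimal sketch for the simplest case: one predictor and two groups. With a single predictor the discriminant function is just a rescaling of that predictor, so the ratios can be taken directly from its sums of squares. The data and variable names are made up for the example.

```python
from math import sqrt

# Hypothetical data: one predictor measured in two groups.
groups = {"A": [2.0, 4.0, 6.0], "B": [8.0, 10.0, 12.0]}


def mean(xs):
    return sum(xs) / len(xs)


values = [x for g in groups.values() for x in g]
n, k = len(values), len(groups)
grand_mean = mean(values)

# Sums of squares: within groups, between groups, and total.
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
ss_total = sum((x - grand_mean) ** 2 for x in values)

eigenvalue = ss_between / ss_within            # larger = better discrimination
wilks_lambda = ss_within / ss_total            # near 0 = group means differ
canonical_corr = sqrt(ss_between / ss_total)   # association of scores with groups
f_value = (ss_between / (k - 1)) / (ss_within / (n - k))  # one-way ANOVA F

print(f"eigenvalue={eigenvalue:.3f}  lambda={wilks_lambda:.3f}  "
      f"canonical r={canonical_corr:.3f}  F={f_value:.2f}")
```

In this one-function case the quantities are tied together: Wilks' Lambda equals 1 / (1 + eigenvalue), and the squared canonical correlation equals 1 minus Wilks' Lambda.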

Steps in Discriminant Analysis
 * Formulate the problem
   * The estimation/analysis sample is used to estimate the discriminant function
   * The holdout/validation sample is set aside for later validation
 * Estimate the discriminant function coefficients
 * Determine the significance of the discriminant function
 * Interpret the results
 * Assess the validity of the discriminant analysis
   * Use the holdout/validation sample
   * Find the hit ratio, the percentage of cases correctly classified
   * Classification accuracy should be at least 25% greater than that obtained by chance
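
The validation step can be sketched as follows, starting from a classification matrix for the holdout sample. The counts are invented, and two assumptions are made: the groups are of equal size, so chance accuracy is taken as 1 divided by the number of groups, and the 25% guideline is read as an improvement relative to chance (1.25 times chance accuracy).

```python
# Hypothetical classification matrix for a holdout sample:
# rows are actual groups, columns are predicted groups.
classification_matrix = [
    [12, 3],   # actual group 1: 12 correct, 3 misclassified
    [4, 11],   # actual group 2: 4 misclassified, 11 correct
]

n_groups = len(classification_matrix)
total = sum(sum(row) for row in classification_matrix)
correct = sum(classification_matrix[i][i] for i in range(n_groups))

# Hit ratio: proportion of holdout cases classified correctly.
hit_ratio = correct / total

# Assumption: equal group sizes, so chance accuracy is 1 / number of groups.
chance = 1 / n_groups
# Assumption: "25% better than chance" is read as 1.25 times chance accuracy.
required = 1.25 * chance
meets_guideline = hit_ratio >= required

print(f"hit ratio = {hit_ratio:.1%}, chance = {chance:.1%}, "
      f"required = {required:.1%}, meets guideline: {meets_guideline}")
```

With unequal group sizes, chance accuracy would need a different definition, so the `chance` line is the part most likely to change in practice.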

Stepwise Discriminant Analysis
 * Forward - An F ratio is calculated for each variable not yet in the model. Variables that meet the cutoff are added one at a time, starting with the highest F ratio.
 * Backward - All variables are entered and their F ratios are calculated. Variables that do not meet the cutoff are then removed one at a time.
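
The forward procedure can be sketched in plain Python as below. This is one common way of doing it, not necessarily what a given statistics package does: the F-to-enter for each candidate is a partial F based on the change in Wilks' Lambda, given the variables already selected. The function names, the cutoff of 4.0 and the data are all hypothetical, and the sketch does not guard against singular matrices or perfectly separated groups.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [list(row) for row in matrix]
    sign = 1.0
    for col in range(len(m)):
        pivot = max(range(col, len(m)), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign
        for r in range(col + 1, len(m)):
            factor = m[r][col] / m[col][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    product = sign
    for i in range(len(m)):
        product *= m[i][i]
    return product


def sscp(rows, cols):
    """Sums of squares and cross products of the chosen columns about their means."""
    means = [sum(r[c] for r in rows) / len(rows) for c in cols]
    return [[sum((r[a] - ma) * (r[b] - mb) for r in rows)
             for b, mb in zip(cols, means)]
            for a, ma in zip(cols, means)]


def wilks_lambda(groups, cols):
    """det(within-group SSCP) / det(total SSCP); 1.0 when no variable is selected."""
    if not cols:
        return 1.0
    within = [[0.0] * len(cols) for _ in cols]
    for rows in groups.values():
        for i, row in enumerate(sscp(rows, cols)):
            for j, value in enumerate(row):
                within[i][j] += value
    all_rows = [r for rows in groups.values() for r in rows]
    return det(within) / det(sscp(all_rows, cols))


def forward_stepwise(groups, n_vars, f_to_enter=4.0):
    """Add the variable with the largest partial F until none meets the cutoff."""
    n = sum(len(rows) for rows in groups.values())
    k = len(groups)
    selected = []
    while len(selected) < n_vars:
        base = wilks_lambda(groups, selected)
        best_var, best_f = None, 0.0
        for v in range(n_vars):
            if v in selected:
                continue
            # Partial lambda: how much v improves on the variables already in.
            partial = wilks_lambda(groups, selected + [v]) / base
            f = (n - k - len(selected)) / (k - 1) * (1 - partial) / partial
            if f > best_f:
                best_var, best_f = v, f
        if best_var is None or best_f < f_to_enter:
            break
        selected.append(best_var)
    return selected


# Hypothetical data: variable 0 separates the groups, variable 1 does not.
groups = {
    "A": [(1, 4), (2, 1), (3, 5), (4, 1), (5, 4)],
    "B": [(6, 4), (7, 1), (8, 5), (9, 1), (10, 4)],
}
selected = forward_stepwise(groups, n_vars=2)
print(selected)
```

With no variables selected yet, the partial F reduces to the ordinary one-way ANOVA F for each variable. Backward elimination would run the same comparison in reverse, starting from all variables and removing the one with the smallest partial F while it stays below a removal cutoff.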