*DISCLAIM,A
                                 IMPORTANT:
     Always consider WATSTAT's recommendations as a STARTING POINT and NOT
 THE FINAL WORD: they are merely intended to serve as guides to further study
 and consultation.  WATSTAT can only recommend what is USUALLY appropriate,
 given the specifications you provide.  Other unspecified factors may
 override those that WATSTAT considers.  Moreover, it would be unwise to ignore
 such "non-statistical" factors as: what procedures make the most theoretical
 sense; what procedures are established and expected in your field; and what
 procedures you and your readers will be able to interpret.
*RAND,A
     NOTE: Since you specified Random Sampling or Random Assignment, it is
 legitimate to use INFERENTIAL STATISTICS (Significance Tests & Confidence
 Limits) as well as DESCRIPTIVE STATISTICS.  But when you use Inferential
 statistics, you must still report important Descriptive statistics, such as
 means & standard deviations, percentages, or correlation coefficients.
*NONRAND,A
     NOTE: Since you have a non-random sample, NO INFERENTIAL STATISTICS
 (such as Significance Tests or Confidence limits) are appropriate.  Hence,
 WATSTAT will recommend only DESCRIPTIVE STATISTICS.
*WHAT_DES,A
     Report all Descriptive statistics needed to characterize your sample
 (e.g., demographics) and, depending upon your analytical focus, report those
 that most clearly show: 1) the magnitude of sub-sample differences; 2) the
 strength & direction of associations; or 3) the characteristics of a single
 variable's distribution, e.g., its "average," "dispersion," and "shape."
     In deciding what Descriptive statistics to report, ask yourself: "What
 information will a reader need to REPLICATE my analysis or to COMPARE my
 results to those of others?"
*D-UNI-NOM,A
    Summarize the distribution with a percentage table and point out the
 Modal and sparse categories.  Optionally, present percentages graphically
 in a bar or pie chart.
*D-NOM-SMALL,A
    CAUTION: Due to your small sample size, each case counts for more than 1%
 and a seemingly large between-category % difference could be due to very few
 cases.  Take this into account in deciding whether percentage differences
 reflect important substantive differences in the cases you're describing.
*D-UNI-RANK,A
    If your data are inherently in the form of ranks, sample size determines
 all the key descriptive statistics and there is no need to report them.  You
 should report the number of ties and the ranks on which most ties occur.
    If you have an Ordinal variable (not originally in ranks) the Median is
 the appropriate "average" and the Quartile Deviation the appropriate index
 of "dispersion."   Usually, it is also appropriate to report some additional
 Percentiles to give a more complete picture of the variable's distribution,
 for example, the 25th & 75th Percentiles, or the upper and lower Deciles.
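    A minimal sketch of the Median and Quartile Deviation, assuming the
 Ordinal categories are coded as integers.  The function name and the data
 are illustrative only, and the "inclusive" percentile method is just one of
 several conventions in use.

```python
import statistics

def median_and_quartile_deviation(values):
    """Return (median, quartile deviation) for numerically coded ordinal data."""
    # quantiles(n=4) returns the 25th, 50th and 75th percentiles.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    # Quartile Deviation = half the distance between the quartiles.
    return median, (q3 - q1) / 2

ratings = [1, 2, 2, 3, 3, 3, 4, 4, 5]   # made-up ordinal codes
median, qd = median_and_quartile_deviation(ratings)
print(median, qd)
```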
*D-UNI-PART,A
    If your Ordinal categories allow, compute the Median and Quartile Devia-
 tion to index the "average" and "degree of dispersion," respectively.  If
 data are inherently grouped and if it is inappropriate to compute the Median
 exactly, report the category it falls in and its approximate location in the
 category.  Summarize the distribution with a percentage table and point out
 the Modal and sparse categories.  Optionally, present percentages graphically
 in a bar or pie chart.
*D-UNI-INT,A
    If your data are dichotomized, report the cut-point that divides the
 categories and the percentage (or proportion) of cases in each category.
    If your data are continuous or grouped into 3 or more categories, use the
 Mean and Standard Deviation to index the "average" and "dispersion" of the
 distribution.  If the distribution is highly skewed or if there are some
 extreme values that could make the Mean a "misleading average," report the
 Median instead of, or in addition to, the Mean.  Whether or not the data are
 skewed, it is usually wise to report some key Percentiles to provide a more
 complete picture of the distribution, for example, the 25th & 75th Percent-
 iles, or the upper and lower Deciles.
    If the data are grouped, a Percentage Table or equivalent graphic (e.g.,
 a bar chart) is usually appropriate.  If you don't use a percentage table
 with grouped data, consider reporting where the Mode falls and which, if
 any, categories are exceptionally sparse.
    If the data are continuous and if it is important to describe the shape
 of the distribution, consider grouping the data and using procedures noted
 in the preceding paragraph.  Alternatively, you could present the data in a
 Frequency Polygon (line chart) or in an Ogive (a line chart that shows the
 cumulative frequency distribution).
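    The statistics named above can be computed with a few lines of code.
 This is a sketch only: the function name and the data are made up, and the
 Standard Deviation shown is the sample version (n - 1 denominator).

```python
import statistics

def describe_interval(values):
    """Return the mean, SD, median and key percentiles of a list of numbers."""
    # n=4 gives the quartiles; n=10 gives the nine decile cut-points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),      # sample SD (n - 1 denominator)
        "median": statistics.median(values),
        "p25": q1,
        "p75": q3,
        "lower_decile": deciles[0],
        "upper_decile": deciles[-1],
    }

# Made-up scores with one extreme value that pulls the mean upward.
scores = [2, 4, 4, 5, 7, 9, 10, 12, 15, 40]
summary = describe_interval(scores)
for name, value in summary.items():
    print(name, value)
```

    With these scores the Mean lies above the Median, which is the kind of
 case where reporting the Median as well is worth considering.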
*D-COMP1-NOM,A
    Percentage tables are usually the best for comparing Nominal distribu-
 tions across sub-samples.  Use Percentage Differences to index the magnitude
 of sub-sample differences, and point out the Modal and sparse categories for
 each sub-sample.  Optionally, present percentages graphically in bar charts.
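    As an illustration, a percentage table and a Percentage Difference can
 be built as follows.  The sub-sample labels, category labels and counts are
 hypothetical.

```python
from collections import Counter

def percentage_table(pairs):
    """pairs: iterable of (sub_sample, category).
    Returns {sub_sample: {category: percent}}; each sub-sample sums to 100."""
    counts = {}
    for group, category in pairs:
        counts.setdefault(group, Counter())[category] += 1
    categories = sorted({c for ctr in counts.values() for c in ctr})
    table = {}
    for group, ctr in counts.items():
        total = sum(ctr.values())
        table[group] = {c: 100.0 * ctr[c] / total for c in categories}
    return table

# Hypothetical data: 50 urban and 50 rural cases answering yes/no.
data = ([("urban", "yes")] * 30 + [("urban", "no")] * 20
        + [("rural", "yes")] * 10 + [("rural", "no")] * 40)
table = percentage_table(data)

# Percentage Difference between sub-samples on the "yes" category.
diff_yes = table["urban"]["yes"] - table["rural"]["yes"]
print(table)
print(diff_yes)
```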
*D-COMP2-NOM,A
    Percentage tables are usually the best for comparing Nominal distribu-
 tions across sub-samples.  Use Percentage Differences to index the magnitude
 of sub-sample differences, and point out the Modal and sparse categories for
 each sub-sample.  Multivariate percentage tables are appropriate for showing
 differences across two or more Independent (Comparison) variables, especial-
 ly when there are important Interaction (Specification) effects.  However,
 such tables are more difficult to read, so it is usually advisable to break
 them into a set of bivariate Partial Tables.  Standardized Percentage Tables
 can be used to adjust for one or more Comparison variables without showing
 them directly in the tables, but standardization can only be used for Com-
 parison variables that do not Interact with others.  As an alternative to
 tables, consider presenting percentages graphically in bar charts.
*D-COMP-RANK,A
    If your Dependent variable is inherently in the form of ranks, your best
 option is probably to compare Mean Ranks across sub-samples.  However, keep
 in mind that Mean Ranks are not the same as means computed on Interval data,
 so the absolute size of sub-sample differences is not meaningful: focus only
 on "greater-than" and "less-than" relationships between Mean Ranks of your
 sub-samples.  Unless ties are rare, report the number of ties and the ranks
 on which most ties occur.
    If your Ordinal Dependent variable is not ranked, the Median is the
 appropriate "average" and the Quartile Deviation the appropriate index of
 "dispersion."   Compare Medians across sub-samples, and search for possible
 "interaction effects" between Comparison variables.  Focus on the RELATIVE
 SIZE of sub-sample Medians (i.e., "greater-than" & "less-than" relations),
 because the absolute magnitude of Ordinal-scale Medians is not meaningful.
 Usually, it is also appropriate to report some additional Percentiles (e.g.,
 the 25th & 75th Percentiles or the highest & lowest Deciles) to give a more
 complete picture of each sub-sample distribution.
*D-COMP-PART,A
     The best way to assess differences on a "Partially Ordered" variable
 depends on whether you're able to compute sub-sample Medians.
     If your data allow you to determine Medians exactly, report the Medians
 for all sub-samples and focus on the RELATIVE SIZE of sub-sample Medians
 (i.e., "greater-than" & "less-than" relations), since the absolute magnitude
 of Ordinal-scale Medians is not meaningful.  If you have two or more Compar-
 ison Variables, search for possible "interactions" between these variables.
     If the grouping of data doesn't allow you to compute Medians, you won't
 be able to compare sub-sample "averages" in a way that takes full advantage
 of the Dependent variable's Ordinal properties.  The best approach in this
 case is to present the data in Percentage Tables, which assume only Nominal
 measurement.  (Optionally, present percentages graphically in bar charts.)
 Use % Differences to index the magnitude of sub-sample differences and point
 out the Modal and sparse categories for each sub-sample.  Since you should
 be able to specify the CATEGORIES THAT CONTAIN THE MEDIAN for the various
 sub-samples, you can also base comparisons on the APPROXIMATE location of
 Medians; since categories are ordered, you should also be able to interpret
 an approximate difference in Medians as evidence that one sub-sample has a
 higher "average" than another.
*D-COMP1-INT,A
     With Interval Dependent Variables it is usually appropriate to base
 sub-sample comparisons on Means.  Report all sub-sample Means and Standard
 Deviations.
*D-COMP2-INT,A
     If you have two or more Comparison Variables, search for possible inter-
 actions.  If you have one or more Interval-Level Independent variables that
 you wish to control ("hold constant"), Analysis of Covariance procedures can
 be used to adjust sub-sample Means for such variables.
*D-COMP-DICH,A
     Percentage tables are usually best for comparing Dichotomous Dependent
 variables across sub-samples, but it may be appropriate to use Rates or
 Proportions rather than %'s, especially if the Dependent variable represents
 a relatively rare occurrence, such as a disease or mortality outcome.  [Note
 that Rates & Proportions may be analyzed and tabulated in much the same way
 as Percentages, although they are expressed on different scales.]
    Use % Differences [or Rate or Proportion Differences] to index the magni-
 tude of sub-sample differences, and point out the Modal and sparse catego-
 ries for the various sub-samples.  Multivariate tables are appropriate for
 showing differences across two or more Independent (Comparison) variables,
 especially when important Interaction (Specification) effects are present.
 However, such tables are more difficult to read, so it may be advisable to
 break them into a set of bivariate Partial Tables.  "Standardized Partial
 Percentage Tables" can be used to adjust for one or more Independent vari-
 ables without showing them directly in the tables, but standardization can
 only be used for Independent variables that do not Interact with others.
 Instead of tables, consider presenting Percentages [or Rates or Proportions]
 in graphic charts.
*D-COMP-OTHER2,A
     Except for Interval Dependent Variables, there is no procedure designed
 to handle simultaneous sub-sample comparisons for 2 or more Dependent vari-
 ables.  Your only option is to run a separate analysis for each Dependent
 variable.  To get recommendations appropriate for these separate analyses,
 return to WATSTAT's Choice Boxes and select an Option other than "2 or More
 Dependent Variables" in Box 4.
*D-BIVAR-NOM/NOM,A
    If the two Nominal variables are dichotomized, use the Phi Coefficient
 as a measure of association.  If either or both of your Nominal variables
 has 3 or more categories, use Cramer's V, which is the same as Phi except
 that it adjusts for the number of categories.
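    A sketch of both coefficients, computed from a table of cell counts.
 The function names and the counts are illustrative.  Phi is shown in its
 signed 2x2 form; Cramer's V is computed from chi-square and is never
 negative.

```python
import math

def phi_coefficient(table):
    """Signed Phi for a 2x2 table [[a, b], [c, d]] of counts."""
    (a, b), (c, d) = table
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

def cramers_v(table):
    """Cramer's V for an r x c table of counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Adjust for table size: k is the smaller of the row and column counts.
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))

counts = [[30, 20],
          [10, 40]]            # made-up 2x2 cross-tabulation
phi = phi_coefficient(counts)
v = cramers_v(counts)
print(round(phi, 4), round(v, 4))
```

    For a 2x2 table the two values agree in absolute size, which is one way
 to check a computation.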
*D-BIVAR-NOM/RANK,A
    There is no statistic specifically designed to measure the association
 between a Nominal Dependent variable and an Ordinal Independent variable.
 Your only choice is to break the Ordinal variable into categories and treat
 it as Nominal.  If you dichotomize it, select a cut-point as close to the
 Median as possible; if you break it into 3 or more categories, select cut-
 points that yield approximately equal frequencies across categories.  Once
 the Ordinal variable is categorized, the appropriate statistics are those
 for two Nominal variables.
    If the two Nominal variables are dichotomized, use the Phi Coefficient
 as a measure of association.  If either or both of your Nominal variables
 has 3 or more categories, use Cramer's V, which is the same as Phi except
 that it adjusts for the number of categories.
*D-BIVAR-NOM/PART,A
    There is no statistic specifically designed to measure the association
 between a Nominal Dependent variable and an Independent variable that is
 cast in the form of Ordinal categories.  Your only choice is to treat the
 Ordinal variable as if it were a set of Nominal categories, and the only
 appropriate statistics are those for two Nominal variables.
    If the two Nominal variables are dichotomized, use the Phi Coefficient
 as a measure of association.  If either or both of your Nominal variables
 has 3 or more categories, use Cramer's V, which is the same as Phi except
 that it adjusts for the number of categories.
*D-BIVAR-NOM/INT,A
    There is no statistic specifically designed to measure the association
 between a Nominal Dependent variable and an Interval Independent variable,
 so you have two OPTIONS: 1) break the Interval variable into categories and
 treat it as Nominal, or 2) dichotomize the Dependent variable and treat it
 as Interval.
   If you choose OPTION 1, break the Independent variable into categories
 that contain approximately equal numbers of cases.  Once this is done, the
 appropriate statistics are those for two Nominal variables.
    If the two Nominal variables are dichotomized, use the Phi Coefficient as
 a measure of association.  If either or both of your Nominal variables has
 3 or more categories, use Cramer's V, which is the same as Phi except that
 it adjusts for the number of categories.
   If you choose OPTION 2, dichotomize the Dependent variable as close as
 possible to the Median unless there is theoretical justification for using
 another "high vs. low" cut-point.  The dichotomized Dependent variable may
 now be assigned arbitrary scores of 0 for "low" and 1 for "high" and may,
 within limits, be treated as an Interval scale.  Once this is done, you can
 use the Linear Correlation Coefficient (Pearson's r and r-squared) to index
 the strength and direction of the relationship.  But if your problem calls
 for regression statistics, Linear Regression may not be appropriate: with a
 dichotomous Dependent variable some predicted (Y') scores may have impossi-
 ble values (less than 0 or greater than 1).  If these impossible values are
 numerous or if they will cause problems in interpreting your results, use
 Logistic Regression instead.
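    The "impossible values" problem can be seen in a small example.  This
 sketch fits an ordinary least-squares line to a 0/1 Dependent variable and
 counts the predicted scores that fall outside the 0-to-1 range.  The data
 and names are made up for illustration.

```python
def fit_line(x, y):
    """Least-squares Y-Intercept and Regression Coefficient for y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

x = [1, 2, 3, 4, 5, 6, 7, 8]      # Interval Independent variable
y = [0, 0, 0, 0, 1, 1, 1, 1]      # dichotomy scored 0 = "low", 1 = "high"

intercept, slope = fit_line(x, y)
predicted = [intercept + slope * xi for xi in x]
# Predicted scores below 0 or above 1 cannot be read as proportions.
impossible = [p for p in predicted if p < 0 or p > 1]
print(len(impossible), "of", len(predicted), "predicted scores are impossible")
```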
*D-BIVAR-RANK/NOM,A
    There is no statistic specifically designed to measure the association
 between an Ordinal Dependent variable and a Nominal Independent variable.
 Your only choice is to break the Ordinal variable into categories and treat
 it as Nominal.  If you dichotomize it, select a cut-point as close to the
 Median as possible; if you break it into 3 or more categories, select cut-
 points that yield approximately equal frequencies across categories.  Once
 the Ordinal variable is categorized, the appropriate statistics are those
 for two Nominal variables.
    If the two Nominal variables are dichotomized, use the Phi Coefficient as
 a measure of association.  If either or both of your Nominal variables has
 3 or more categories, use Cramer's V, which is the same as Phi except that
 it adjusts for the number of categories.
*D-BIVAR-RANK/RANK,A
   If both variables are in the form of ranks, you can proceed to compute one
 of the measures of association noted below.  Otherwise, you must transform
 them to ranks before proceeding.
   Spearman's Rho is the best known measure of association for two Ordinal
 variables and, because it is simply the Linear Correlation Coefficient
 (Pearson's r) applied to ranks, it is often interpreted as an approximate
 index of linear correlation.  The "correction for ties" should be applied
 to Rho, but it has little effect if fewer than 30% of the cases are tied.
   In some fields the preferred statistic is Kendall's Tau, which, unlike
 Spearman's Rho, does not involve any arithmetical operations that assume
 an underlying Interval Scale.  This statistic is sometimes referred to as
 "Tau-A" to distinguish it from modified forms ("Tau-B" and "Tau-C") that are
 applied to "ordered contingency tables."  The computing formulas for Tau-A
 found in most texts incorporate a correction for tied ranks.
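    A sketch of both statistics follows.  Rho is computed as Pearson's r on
 ranks, with tied values given the mean of their ranks.  The Tau-A shown is
 the plain definition, in which tied pairs simply contribute zero; it does
 not apply any further correction for ties.  All names and data are
 illustrative.

```python
import math

def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1          # positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's Rho: Pearson's r applied to the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

def kendall_tau_a(x, y):
    """Tau-A: (concordant - discordant pairs) / total number of pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)     # +1 concordant, -1 discordant
    return s / (n * (n - 1) / 2)

x = [1, 2, 3, 4, 5]          # made-up ranks on two variables
y = [2, 1, 4, 3, 5]
rho = spearman_rho(x, y)
tau = kendall_tau_a(x, y)
print(round(rho, 3), round(tau, 3))
```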
*D-BIVAR-RANK/PART,A
    There is no statistic specifically designed to measure the association
 between a "true" Ordinal Dependent variable and a "partially ordered" Inde-
 pendent variable.  Your best choice is to break the Dependent variable into
 ordered categories and treat both variables as "partially ordered."  Prior
 to computations, copy the data into a contingency table in which rows are
 categories of the Dependent variable and columns are categories of the
 Independent variable.  Use one of the following measures of association:
    The best statistic for most ordered contingency tables is a modified form
 of Kendall's Tau: use Tau-B if the number of rows in the table equals the
 number of columns; use Tau-C if the table is not "square."
*D-BIVAR-RANK/INT,A
    There is no statistic specifically designed to measure the association
 between an Ordinal Dependent variable and an Interval Independent variable.
 If you can't assume that the Dependent variable is Interval, you'll have to
 "downgrade" the Independent variable and treat it as an Ordinal scale.  If
 you can transform it to ranks, do so, and apply one of the measures of
 association recommended below.  [If it is so grouped that it can only be
 transformed into a set of ordered categories, go back thru WATSTAT's Choice
 Boxes and pick Option 3, "Ordered Categories," as the Level of Measurement
 for the Independent variable.]
   Spearman's Rho is the best known measure of association for two Ordinal
 variables and, because it is simply the Linear Correlation Coefficient
 (Pearson's r) applied to ranks, it is often interpreted as an approximate
 index of linear correlation.  The "correction for ties" should be applied to
 Rho, but it has little effect if fewer than 30% of the cases are tied.
   In some fields the preferred statistic is Kendall's Tau, which, unlike
 Spearman's Rho, does not involve any arithmetical operations that assume
 an underlying Interval Scale.  This statistic is sometimes referred to as
 "Tau-A" to distinguish it from modified forms ("Tau-B" and "Tau-C") that are
 applied to "ordered contingency tables."  The computing formulas for Tau-A
 found in most texts incorporate a correction for tied ranks.
*D-BIVAR-PART/NOM,A
    There is no statistic specifically designed to measure the association
 between a set of ordered categories and a Nominal Independent variable, and
 your only option is to "downgrade" the Dependent variable to the Nominal
 level.  For two Nominal variables the following recommendations apply.
    If the two Nominal variables are dichotomized, use the Phi Coefficient as
 a measure of association.  If either or both of your Nominal variables has
 3 or more categories, use Cramer's V, which is the same as Phi except that
 it adjusts for the number of categories.
*D-BIVAR-PART/RANK,A
    There is no statistic specifically designed to measure the association
 between a "partially ordered" Dependent variable and a "true" Ordinal Inde-
 pendent variable.  Your best choice is to break the Independent variable
 into ordered categories and treat both variables as "partially ordered."
 Prior to computations, copy the data into a contingency table in which rows
 are categories of the Dependent variable and columns are categories of the
 Independent variable.  Use one of the following measures of association:
    The best statistic for most ordered contingency tables is a modified form
 of Kendall's Tau: use Tau-B if the number of rows in the table equals the
 number of columns; use Tau-C if the table is not "square."
*D-BIVAR-PART/PART,A
    Prior to computations, copy the data into a contingency table in which
 rows are categories of the Dependent variable and columns are categories of
 the Independent variable.  Use one of the following measures of association:
    The best statistic for most ordered contingency tables is a modified form
 of Kendall's Tau: use Tau-B if the number of rows in the table equals the
 number of columns; use Tau-C if the table is not "square."
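    A sketch of Tau-B and Tau-C computed from an ordered contingency table
 (a list of rows of counts, with rows and columns both in ascending order).
 The formulas are the standard textbook ones; the function names and counts
 are illustrative.

```python
import math

def concordant_discordant(table):
    """Count concordant and discordant pairs of cases in an ordered table."""
    rows, cols = len(table), len(table[0])
    c = d = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):
                for l in range(cols):
                    if l > j:                 # higher on both variables
                        c += table[i][j] * table[k][l]
                    elif l < j:               # higher on one, lower on other
                        d += table[i][j] * table[k][l]
    return c, d

def tau_b(table):
    """Kendall's Tau-B, usually used for "square" tables."""
    c, d = concordant_discordant(table)
    n = sum(map(sum, table))
    n0 = n * (n - 1) / 2                      # all possible pairs
    row_ties = sum(t * (t - 1) / 2 for t in map(sum, table))
    col_ties = sum(t * (t - 1) / 2 for t in map(sum, zip(*table)))
    return (c - d) / math.sqrt((n0 - row_ties) * (n0 - col_ties))

def tau_c(table):
    """Kendall's (Stuart's) Tau-C, usually used for non-square tables."""
    c, d = concordant_discordant(table)
    n = sum(map(sum, table))
    m = min(len(table), len(table[0]))
    return 2 * m * (c - d) / (n * n * (m - 1))

square = [[30, 20],
          [10, 40]]             # made-up 2x2 ordered table
oblong = [[20, 10, 5],
          [5, 10, 20]]          # made-up 2x3 ordered table
print(round(tau_b(square), 4), round(tau_c(oblong), 4))
```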
*D-BIVAR-PART/INT,A
    There is no statistic specifically designed to measure the association
 between a "partially ordered" Dependent variable and an Interval Independent
 variable.  The best alternative is to break the Independent variable into
 ordered categories and treat both variables as "partially ordered."  Prior
 to your computations, copy the data into a contingency table in which rows
 are categories of the Dependent variable and columns are categories of the
 Independent variable.  Then use one of the following indices of association:
    The best statistic for most ordered contingency tables is a modified form
 of Kendall's Tau: use Tau-B if the number of rows in the table equals the
 number of columns; use Tau-C if the table is not "square."
*D-BIVAR-INT/NOM,A
    The preferred measure of association for an Interval Dependent variable
 and a Nominal Independent variable is the Correlation Ratio (Eta).  The Eta
 statistic indexes the strength of a relationship of any form, including
 non-monotonic (e.g., U-shaped).  Eta-Squared is commonly reported instead of
 Eta, since it has a more meaningful interpretation: it measures the propor-
 tion of variance in the Dependent variable explained by the categories of
 the Independent variable.
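    Eta-Squared is the between-category sum of squares divided by the total
 sum of squares.  The sketch below uses made-up scores arranged in a U-shaped
 pattern; the function name is illustrative.

```python
import math

def eta_squared(groups):
    """groups: dict mapping each Nominal category to a list of scores."""
    all_scores = [v for scores in groups.values() for v in scores]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((v - grand_mean) ** 2 for v in all_scores)
    # Between-category variation: category means around the grand mean,
    # weighted by the number of cases in each category.
    ss_between = sum(
        len(scores) * (sum(scores) / len(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    return ss_between / ss_total

# U-shaped pattern: high scores in the outer categories, low in the middle.
groups = {"low": [8, 9, 10], "mid": [1, 2, 3], "high": [8, 9, 10]}
eta2 = eta_squared(groups)
eta = math.sqrt(eta2)
print(round(eta2, 4), round(eta, 4))
```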
*D-BIVAR-INT/RANK,A
    There is no statistic specifically designed to measure the association
 between an Interval Dependent variable and an Ordinal Independent variable.
 If you can't assume that the Independent variable is Interval, you'll have to
 "downgrade" the Dependent variable and treat it as an Ordinal scale.  If
 you can transform it to ranks, do so, and apply one of the measures of
 association recommended below.  [If it is so grouped that it can only be
 transformed into a set of ordered categories, go back thru WATSTAT's Choice
 Boxes and pick Option 3, "Ordered Categories," as the Level of Measurement
 for the Dependent variable.]
   Spearman's Rho is the best known measure of association for two Ordinal
 variables and, because it is simply the Linear Correlation Coefficient
 (Pearson's r) applied to ranks, it is often interpreted as an approximate
 index of linear correlation.  The "correction for ties" should be applied
 to Rho, but it has little effect if fewer than 30% of the cases are tied.
   In some fields the preferred statistic is Kendall's Tau, which, unlike
 Spearman's Rho, does not involve any arithmetical operations that assume
 an underlying Interval Scale.  This statistic is sometimes referred to as
 "Tau-A" to distinguish it from modified forms ("Tau-B" and "Tau-C") that are
 applied to "ordered contingency tables."  The computing formulas for Tau-A
 found in most texts incorporate a correction for tied ranks.
*D-BIVAR-INT/PART,A
    There is no statistic specifically designed to measure the association
 between an Interval Dependent variable and a "partially ordered" Independent
 variable, so you have 2 OPTIONS: 1) "downgrade" the Dependent variable by
 breaking it into ordered categories, or 2) "downgrade" the Independent vari-
 able to a Nominal scale.  OPTION 2 is the best choice if you're interested
 mainly in the strength of the relationship, but since the Independent
 variable is assumed to be merely Nominal, you won't be able to determine the
 direction (+/-) of the relationship.
    If you choose OPTION 1, you should break the Dependent variable into cat-
 egories that contain approximately equal numbers of cases.  Copy the data
 into a contingency table in which rows are categories of the Dependent vari-
 able and columns are categories of the Independent variable.  Then compute
 one of the following indices recommended for ordered contingency tables.
    The best statistic for most ordered contingency tables is a modified form
 of Kendall's Tau: use Tau-B if the number of rows in the table equals the
 number of columns; use Tau-C if the table is not "square."
    If you choose OPTION 2, every category of the Independent variable MUST
 contain at least 2 cases (preferably more), so you might have to collapse
 some sparse categories.  However, categories should not be collapsed without
 restraint: it is also desirable to have as many categories as possible.
    The preferred measure of association for an Interval Dependent variable
 and a Nominal Independent variable is the Correlation Ratio (Eta).  The Eta
 statistic indexes the strength of a relationship of any form, including
 non-monotonic (e.g., U-shaped).  The square of the Eta (Eta-Squared) is
 commonly reported instead of Eta, since it has a more meaningful interpret-
 ation: it measures the proportion of variance in the Dependent variable
 explained by the categories of the Independent variable.
*D-BIVAR-INT/INT,A
    In most situations the preferred index of association for two Interval
 variables is the Linear Correlation Coefficient, also called Pearson's r.
 The square of the r statistic, known as the Coefficient of Determination, is
 often reported along with r, because it measures the proportion of variance
 in one variable explained by the other.
    If you're interested in predicting or estimating scores on the Dependent
 variable from those on the Independent variable, you should compute the
 Linear Regression statistics: the Regression Coefficient, the Y-Intercept,
 and the Standard Error of Estimate.
    If you suspect that the relationship departs markedly from linearity, so
 that Pearson's r underestimates its "true" strength, you can use the Correl-
 ation Ratio (Eta) instead.  This will require breaking the Independent vari-
 able into a set of categories, preferably in such a way that 5 or more cases
 fall in each category.  Eta indexes the strength of a relationship of any
 form, including those which are non-monotonic (e.g., U-shaped).  Eta-squared
 is commonly reported instead of Eta, because it has a more meaningful inter-
 pretation: it measures the proportion of variance in the Dependent variable
 explained by the categories of the Independent variable.
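    A sketch of the linear statistics mentioned in this section: Pearson's
 r, the Coefficient of Determination, the Regression Coefficient, the
 Y-Intercept and the Standard Error of Estimate.  The function name and data
 are illustrative, and the Standard Error of Estimate is taken here with an
 n - 2 denominator, which is an assumption; some texts divide by n.

```python
import math

def linear_summary(x, y):
    """Correlation and simple regression statistics for y predicted from x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx                       # Regression Coefficient (b)
    intercept = my - slope * mx             # Y-Intercept (a)
    ss_residual = syy - slope * sxy         # unexplained sum of squares
    see = math.sqrt(ss_residual / (n - 2))  # Standard Error of Estimate
    return {"r": r, "r_squared": r * r, "slope": slope,
            "intercept": intercept, "std_error_estimate": see}

x = [1, 2, 3, 4, 5]            # made-up Independent variable scores
y = [2, 4, 5, 4, 5]            # made-up Dependent variable scores
stats = linear_summary(x, y)
for name, value in stats.items():
    print(name, round(value, 4))
```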
*D-BIVAR-DICH/NOM,A
    Even if your dichotomous Dependent variable is Ordinal or Interval, it is
 probably best to treat it as Nominal, like your Independent variable, and
 use a measure of association for two Nominal variables.
    If the two Nominal variables are dichotomized, use the Phi Coefficient as
 a measure of association.  If either or both of your Nominal variables has
 3 or more categories, use Cramer's V, which is the same as Phi except that
 it adjusts for the number of categories.
*D-BIVAR-DICH/RANK,A
    There is no statistic specifically designed to measure the association
 between a dichotomous Dependent variable and an Ordinal Independent vari-
 able.  You'll first have to break the Independent variable into categories
 and then you'll have 2 OPTIONS: 1) assume the Dependent variable is Ordinal
 and use a measure of association for two "partially ordered" variables, or
 2) assume that both variables are merely Nominal and use a measure for two
 Nominal variables.  Option 1 is usually preferable, but choose Option 2 if
 it makes no sense to treat the dichotomous Dependent variable as Ordinal.
    If you choose Option 1, copy the data into an ordered contingency table
 and compute one of the following:
    The best statistic for most ordered contingency tables is a modified form
 of Kendall's Tau: use Tau-B if the number of rows in the table equals the
 number of columns; use Tau-C if the table is not "square."
    If you choose Option 2, copy the data into a contingency table, making no
 assumption about the order of rows & columns.  Then use one of the following
 measures appropriate for two Nominal scales:
    If the two Nominal variables are dichotomized, use the Phi Coefficient as
 a measure of association.  If either or both of your Nominal variables has
 3 or more categories, use Cramer's V, which is the same as Phi except that
 it adjusts for the number of categories.
*D-BIVAR-DICH/PART,A
    With a dichotomous Dependent variable and a "partially ordered" independ-
 ent variable, you have 2 OPTIONS: 1) assume the Dependent variable is also
 Ordinal and use a measure of association for two "partially ordered" vari-
 ables, or 2) assume the Independent variable is only Nominal and use a meas-
 ure of association for two Nominal variables.  Option 1 is usually better.
    If you choose Option 1, copy the data into an ordered contingency table
 and compute one of the following:
    The best statistic for most ordered contingency tables is a modified form
 of Kendall's Tau: use Tau-B if the number of rows in the table equals the
 number of columns; use Tau-C if the table is not "square."
    If you choose Option 2, copy the data into a contingency table, making no
 assumption about the order of rows & columns.  Then use one of the following
 measures appropriate for two Nominal scales:
    If the two Nominal variables are dichotomized, use the Phi Coefficient as
 a  measure of association.  If either or both of your Nominal variables has
 3 or more categories, use Cramer's V, which is the same as Phi except that
 it adjusts for the number of categories.
*D-BIVAR-DICH/INT,A
   With a dichotomous Dependent variable and an Interval Independent vari-
 able, you have 2 OPTIONS: 1) assume that the dichotomy is an Interval vari-
 able, or 2) "downgrade" the Independent variable to the Nominal level.  For
 Option 1, which is usually preferable, you'd use a measure of association
 for two Interval variables.  For Option 2, you'd first break the Independent
 variable into categories and use a measure of association for two Nominal
 variables.
   If you choose OPTION 1, assign arbitrary scores of 0 (low) and 1 (high)
 to categories of the Dependent variable.  Then use the Linear Correlation
 Coefficient (Pearson's r and r-squared) to measure the strength and direc-
 tion (+/-) of the relationship.  If you're mainly interested in predicting
 Dependent variable scores from those on the Independent variable, compute
 regression statistics (Regression Coefficient, Y-Intercept, & Standard Error
 of Estimate).  But note that Linear Regression may not be appropriate: with
 a dichotomous Dependent variable, some scores predicted from the regression
 equation (Y' = a+bX) may have impossible values (i.e., less than 0 or greater
 than 1).  If there are many impossible values or if they will cause problems
 in interpreting your results, use Logistic Regression instead.
    If you take OPTION 2, divide the Independent variable into categories
 that contain about the same number of cases and use one of the following:
    If the two Nominal variables are dichotomized, use the Phi Coefficient as
 a measure of association.  If either or both of your Nominal variables has
 3 or more categories, use Cramer's V, which is the same as Phi except that
 it adjusts for the number of categories.
*D-MUL-SMALL-INT,A
 WARNING: The SAMPLE SIZE you specified may be TOO SMALL to support the type
 of multivariate procedure(s) WATSTAT recommended.  As a practical rule of
 thumb you should have a minimum of about 10 cases for each variable in such
 procedures.  To meet this criterion you may have to drop some variables from
 the analysis.  If you can't drop enough to approach the 10-case-per-variable
 criterion, you shouldn't use the above procedure(s).
*D-MUL-SMALL-NOM,A
 WARNING: The SAMPLE SIZE you specified may be TOO SMALL to use Multivariate
 Procedures for Nominal Variables, of the sort recommended.  Computations for
 such methods are based on cross-tabulations, and as the number of variables
 (& categories) increases, cell frequencies can become too sparse to support
 the analysis.  You may need to drop some variables from the analysis and/or
 collapse variables into fewer categories.
*D-MUL-1DEP-NOM/NOM,A
    The recommended procedure (and the only one available) for measuring the
 association between a Nominal-level Dependent and a set of Nominal independ-
 ent variables is Log-Linear Analysis.  In most cases, this procedure will
 require the use of a computer and many popular statistical software packages
 can run it.  A good deal of statistical sophistication is required to apply
 it and to interpret its results.  Log-Linear Analysis may not be widely used
 in your field and, if not, the task of reporting your results will be some-
 what more difficult.  The use of Log-linear Analysis is also limited by the
 substantial sample size it usually requires.
    However, no alternative procedure is applicable unless you're willing to
 dichotomize the Dependent variable (so it can be scored 0/1 and treated as
 Interval) and to transform all the Independent variables and also treat them
 as Interval.  The latter step would involve either: 1) dichotomizing each
 Independent variable and assigning "0" & "1" scores to its categories; or
 2) creating a set of "dummy variables" (each scored 0/1) to represent its
 categories.  After these transformations, you can apply either Logistic
 Regression or Discriminant Analysis.  For more info about these procedures,
 return to WATSTAT's Choice Boxes and specify "Dichotomous" for the depen-
 dent (Box 5) variable & "Interval" for the Independent (Box 6) variables.
*D-MUL-1DEP-NOM/INT,A
    The only procedure designed to assess the association between a Nominal
 Dependent & a set of Interval Independent variables is Discriminant Analysis.
 This procedure does not produce a single index (analogous to a correlation
 coefficient), but instead yields a set of prediction equations, called
 "Discriminant Functions," the interpretation of which requires a good deal
 of statistical expertise.  Computations must be done by computer and most
 statistical software packages include Discriminant Analysis routines.
    Interpretation of results is considerably simpler if the Dependent vari-
 able is dichotomized, but if this is done, Logistic Regression and Multiple
 Correlation/Regression would also be applicable and perhaps preferable.
*D-MUL-1DEP-NOM/MIXIO,A
    There is no procedure available to measure association between a Nominal
 Dependent variable and Independent variables with "mixed" levels of measure-
 ment, so you'll need to transform one or more Independent variables to make
 them all either Nominal or Interval.  In the former case, you'd simply break
 your Interval or Ordinal variables into categories and proceed as if they
 were Nominal.  In the latter, you'd transform each Ordinal or Nominal inde-
 pendent variable to Interval by either: 1) dichotomizing it and assigning
 scores of "0" and "1" to its categories; or 2) breaking it into categories
 and creating a set of "dummy variables" (each scored 0/1) to represent its
 categories.
    If all Independent variables are Nominal, Log-Linear Analysis may be
 used.  For more info about Log-Linear Analysis, return to WATSTAT's Choice
 Boxes and specify "Nominal" measurement for both the Dependent (Box 5) and
 the Independent (Box 6) variables.
    If all Independent variables are Interval (including dichotomies and
 dummy variables), you can use Discriminant Analysis.  For more info about
 Discriminant Analysis, return to WATSTAT's Choice Boxes and specify
 "Nominal" for the Dependent (Box 5) and "Interval" for the Independent
 (Box 6) variables.
*D-MUL-1DEP-NOM/ORD,A
    There is no procedure available to measure association between a Nominal
 Dependent variable and Ordinal Independent variables.  Your best alternative
 is to categorize the Ordinal variables and treat them as Nominal; then you
 can use Log-Linear Analysis.  For more information on Log-Linear Analysis,
 return to WATSTAT's Choice Boxes and specify "Nominal" measurement for both
 the Dependent (Box 5) and the Independent (Box 6) variables.
*D-MUL-1DEP-ORD/ALL,A
    There is no multivariate procedure designed to measure the association
 between an Ordinal Dependent variable and a set of 2 or more Independent
 variables.  However, if you transform the Dependent variable (and perhaps
 the Independent variables) a number of alternatives may be applicable.
 You have 2 basic OPTIONS: 1) dichotomize the Dependent variable and treat
 it as Interval, or 2) break the Dependent variable into 2 or more categories
 and treat it as Nominal.  OPTION 1 is preferable as long as it makes sense
 to dichotomize the Dependent variable.
    If you take OPTION 1, you can use either Multiple Regression/Correlation
 or Logistic Regression, BUT to do so all your Independent variables must
 also be Interval or Dichotomies (i.e., Nominal and Ordinal Independent vari-
 ables must be dichotomized or represented as sets of "dummy variables").
 For more info about Multiple Regression/Correlation, return to WATSTAT's
 Choice Boxes and choose "Interval" measurement for both the Dependent vari-
 able (Box 5) and the Independent (Box 6) variable.  For more information on
 Logistic Regression, specify "Dichotomy" (Box 5) and "Interval" (Box 6).
    With OPTION 2, you can use either Discriminant Analysis or Log-Linear
 Analysis.  To use Discriminant Analysis, all Independent variables must be
 Interval (i.e., Nominal & Ordinal Independent variables must be dichotomized
 or represented as sets of "dummy variables").  With Log-Linear Analysis, all
 Independent variables must be Nominal (i.e., Ordinal & Interval variables
 must be represented as sets of 2 or more Nominal categories).  For more info
 about Discriminant Analysis, return to WATSTAT's Choice Boxes and specify
 "Nominal" for the Dependent (Box 5) and "Interval" for the Independent
 variables.  For more info about Log-Linear Analysis, specify "Nominal" for
 both Dependent (Box 5) and Independent (Box 6) variables.
*D-MUL-1DEP-INT/INT,A
    If your Dependent variable is Interval and all your Independent variables
 are also Interval (or dichotomies) your best choice is Multiple Regression/
 Correlation.  Use the Multiple Correlation statistics (R and R-Squared) to
 index the strength of the relation between the Dependent variable and all
 the Independent variables jointly.  Use the Regression Coefficients (b)
 to index the effect of each Independent variable and use the Standard Error
 of Estimate to index the precision with which the set of Independent vari-
 ables predicts (estimates) scores on the Dependent variable.
*D-MUL-1DEP-INT/OTHER,A
    There is no multivariate procedure designed to relate an Interval depend-
 ent variable with Nominal or Ordinal Independent variables.  However, after
 some simple transformations, you can treat Nominal and Ordinal variables as
 if they were Interval and use Multiple Correlation/Regression procedures.
    Dichotomous Independent variables (scored 1/0) can be treated as Interval
 in these procedures and you can dichotomize whenever it makes sense to treat
 a Nominal variable as "present" vs. "absent" (1 vs. 0) or an Ordinal vari-
 able as "high" vs. "low" (1 vs. 0).  However, it is often desirable to pre-
 serve a more detailed representation of Nominal & Ordinal variables: this
 can be done by dividing them into categories and using a SET of dichotomous
 variables, called "dummy variables," to represent the categories.
    Use the Multiple Correlation statistics (R and R-Squared) to index the
 strength of the relation between the Dependent variable and all the indepen-
 dent variables operating jointly.  Use the Regression Coefficients (b-values)
 to index the effect of each Independent variable and use the Standard Error
 of Estimate to index the precision with which the set of Independent vari-
 ables predicts (estimates) scores on the Dependent variable.
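     The dummy-variable transformation can be sketched in a few lines of
 Python.  This is only an illustration: the function name and the data are
 hypothetical, and leaving out one "reference" category (so that k categories
 yield k-1 dummy variables) is the usual convention when the regression
 equation includes an intercept.

```python
def dummy_code(values, reference):
    """Return one 0/1 variable per category, omitting the reference category."""
    categories = sorted(set(values) - {reference})
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}


# Hypothetical Nominal variable with three categories.
region = ["north", "south", "west", "south"]
dummies = dummy_code(region, reference="north")
for name, scores in dummies.items():
    print(name, scores)
```

 Each resulting 0/1 variable can then be entered as an Independent variable
 in Multiple Correlation/Regression.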
*D-MUL-1DEP-DICH/NOM,A
    Log-Linear Analysis is specifically designed to assess association
 between a Nominal Dependent variable and a set of Nominal Independent vari-
 ables.  The fact that your Dependent variable is dichotomous presents no
 problems, as long as it makes sense to treat it as a Nominal variable.
*D-MUL-1DEP-DICH/ORD,A
    There is no procedure designed to measure association between a dichoto-
 mous Dependent variable and Ordinal Independent variables.  Your best alter-
 native is to categorize the Ordinal variables and treat them as Nominal;
 then you can use Log-Linear Analysis.  For more information about Log-Linear
 Analysis, return to WATSTAT's Choice Boxes and specify "Nominal" measurement
 for both Dependent (Box 5) and Independent (Box 6) variables.
*D-MUL-1DEP-DICH/INT,A
    Several multivariate procedures are potentially applicable if the depen-
 dent variable is a dichotomy and all the Independent variables are Interval.
 In order of preference, the available options include: Logistic Regression,
 Discriminant Analysis, & Multiple Correlation/Regression.  Logistic Regress-
 ion is almost certain to be applicable.  Discriminant Analysis is a good
 alternative when category frequencies on the Dependent variable approach a
 50%/50% split, but should not be used when the split is more extreme than
 80%/20%.  Multiple Correlation/Regression is less generally applicable when
 the Dependent variable is a dichotomy: although the Dependent variable is
 scored 0 and 1 (for "low" & "high") some predicted (Y') scores may attain
 impossible values (less than 0 or greater than 1).  If there are many impos-
 sible values, or if such values will cause problems in interpreting your
 results, Multiple (Linear) Correlation/Regression should NOT be used.
*D-MUL-1DEP-DICH/MIXON,A
    There is no procedure designed to measure association between a dichoto-
 mous Dependent variable and "mixed" Ordinal/Nominal Independent variables.
 Your best alternative is to categorize the Ordinal variables and treat them
 as Nominal; then you can use Log-Linear Analysis, which assumes that all the
 Independent variables are Nominal.  For more info about Log-Linear Analysis,
 return to WATSTAT's Choice Boxes and specify "Nominal" measurement for both
 Dependent (Box 5) and Independent (Box 6) variables.
*D-MUL-1DEP-DICH/MIXIO,A
    There is no procedure designed to measure association between a dichoto-
 mous Dependent variable and Independent variables with "mixed" measurement
 levels, so you'll need to transform one or more Independent variables to
 make them ALL either Nominal or Interval.  In the former case, you'd simply
 break any Interval or Ordinal variables into categories and proceed as if
 they were Nominal.  In the latter, you'd transform each Ordinal or Nominal
 Independent variable to Interval by either: 1) dichotomizing it and assign-
 ing scores of "0" and "1" to its categories; or 2) breaking it into catego-
 ries and creating a set of "dummy variables" (each scored 0/1) to represent
 the categories.
    If all Independent variables can be treated as Nominal, you can use
 Log-Linear Analysis.  For more info about Log-Linear Analysis, return to
 WATSTAT's Choice Boxes and specify "Nominal" measurement for both Dependent
 (Box 5) and Independent (Box 6) variables.
    If all Independent variables are Interval (including dichotomies and
 dummy variables), you can use Logistic Regression or Discriminant Analysis.
 For more info about these procedures, return to WATSTAT's Choice Boxes and
 specify "Dichotomy" for the Dependent (Box 5) variable and "Interval" for
 the Independent (Box 6) variables.
*D-MUL-2DEP-INT/INT,A
    Several multivariate procedures are potentially applicable when all your
 variables are Interval and you're dealing with 2 or more Dependent variables
 simultaneously. They include: Canonical Correlation; measures of association
 derived from MANOVA; and various Structural Equation Modelling procedures,
 e.g., LISREL and EQS.  All these assume advanced statistical training and
 must be performed by computer.  Moreover, so much additional information is
 needed to choose from these alternatives that WATSTAT cannot recommend a
 "best" procedure here.
*D-MUL-2DEP-INT/NOTINT,A
    Several multivariate procedures are potentially applicable when you're
 dealing with 2 or more Dependent variables simultaneously. They include:
 Canonical Correlation, measures of association derived from MANOVA, and
 various procedures for Structural Equation Modelling (e.g., LISREL and EQS).
 However, all require advanced statistical training and must be performed by
 computer.  Further, all assume Interval measurement for ALL variables, so
 you won't be able to use them unless you drop "lower-level" variables or
 transform them to sets of dummy variables.  Finally, so much additional
 information is needed to choose from these alternatives that WATSTAT can't
 recommend a "best" procedure here.
*D-MUL-2DEP-NOTINT,A
    Several multivariate procedures are potentially applicable when you're
 dealing with 2 or more Dependent variables simultaneously. They include:
 Canonical Correlation, measures of association derived from MANOVA, and
 various procedures for Structural Equation Modelling (e.g., LISREL and EQS).
 However, all require advanced statistical training and must be performed by
 computer.  Further, all assume Interval measurement for ALL variables in the
 analysis, so you probably won't be able to use them.  Finally, so much addi-
 tional information is needed to choose from these alternatives that WATSTAT
 can't recommend a "best" procedure here.
*D-MUL-NODEP-INT,A
    Factor Analysis is recommended for assessing relationships among several
 Interval-level variables when there is no Dependent variable identified.
 [Dichotomous variables, scored 0/1, may also be Factor Analyzed.]
    There are many types of Factor Analysis and selecting the appropriate
 type is too complicated for WATSTAT to handle: you'll need to consult a
 specialized text on Factor Analysis.  Computations require a computer, and
 most popular statistical packages offer a variety of Factor Analysis proce-
 dures.  [The manuals for some of these packages are good sources of advice
 on which type of Factor Analysis to apply.]
*D-MUL-NODEP-RANK,A
     Kendall's Coefficient of Concordance (Kendall's W) is designed to assess
 relationships among 3 or more Ordinal variables when there is no Dependent
 variable identified.  All variables must be transformed to RANKS if they are
 not inherently in rank form.  The interpretation of Kendall's W is facili-
 tated by its linear relationship to "Average Rho," i.e., the mean rank-order
 correlation (Spearman's Rho) between all possible pairs of variables.
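     The link between Kendall's W and "Average Rho" can be checked with a
 short Python sketch.  It assumes untied ranks and m variables that each rank
 the same n cases; the function names and the data are hypothetical.  The
 linear relationship is: Average Rho = (m*W - 1) / (m - 1).

```python
def kendalls_w(rankings):
    """rankings: one list of ranks (1..n, no ties) per variable."""
    m = len(rankings)
    n = len(rankings[0])
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def spearman_rho(r1, r2):
    """Spearman's Rho for two sets of untied ranks."""
    n = len(r1)
    d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def average_rho(rankings):
    """Mean Rho over all possible pairs of variables."""
    m = len(rankings)
    pairs = [(i, j) for i in range(m) for j in range(i + 1, m)]
    return sum(spearman_rho(rankings[i], rankings[j]) for i, j in pairs) / len(pairs)


# Hypothetical data: 3 variables, each ranking the same 4 cases.
ranks = [[1, 2, 3, 4], [1, 3, 2, 4], [2, 1, 3, 4]]
w = kendalls_w(ranks)
print(w, average_rho(ranks), (len(ranks) * w - 1) / (len(ranks) - 1))
```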
*D-MUL-NODEP-NOTINT,A
    Factor Analysis is the only widely-used procedure designed to assess
 relationships among several variables when there is no Dependent variable
 identified.  Unfortunately, this procedure assumes that all variables are
 Interval, so you can't use it for your "lower level" variables.  However,
 dichotomies (scored 0/1) may be treated as Interval here, so if you can
 dichotomize your "lower level" variables, you can apply Factor Analysis.
*S-UNI-NOM,A
     Assuming only Nominal Measurement, the Chi-Square Goodness-of-Fit Test
 may be used to test whether it's likely that your RANDOM SAMPLE came from a
 POPULATION with an hypothesized proportion of cases in its various catego-
 ries.  You specify the Population proportions (P) in the Null Hypothesis and
 multiply each P by Sample Size to obtain EXPECTED FREQUENCIES for the test.
 Within limits, you may specify any set of P's derived from theory or prior
 knowledge of a relevant population.
     If your variable is Dichotomous, the Binomial Test is preferable to the
 Chi-Square Goodness-of-Fit, especially when sample size is small.  Use Exact
 Binomial Tables for small sample sizes and the Normal Approximation (z-Test)
 for larger (>25) samples.
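     The two computations described above can be sketched in Python as
 follows.  The function names and data are hypothetical.  The Goodness-of-Fit
 function returns only the Chi-Square statistic, to be referred to a
 Chi-Square table with (number of categories - 1) degrees of freedom.  The
 Binomial function uses one common two-sided definition: it sums the
 probabilities of all outcomes no more likely than the one observed.

```python
import math


def expected_frequencies(proportions, n):
    """Multiply each hypothesized Population proportion P by Sample Size."""
    return [p * n for p in proportions]


def chi_square_gof(observed, proportions):
    """Chi-Square Goodness-of-Fit statistic from raw observed frequencies."""
    expected = expected_frequencies(proportions, sum(observed))
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def binomial_two_sided_p(successes, n, p=0.5):
    """Exact two-sided Binomial p-value for a Dichotomous variable."""
    pmf = [math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    cutoff = pmf[successes] * (1 + 1e-9)  # tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= cutoff))


# Hypothetical data: 3 categories with hypothesized P's of .25, .50, .25.
print(chi_square_gof([30, 50, 20], [0.25, 0.50, 0.25]))
# Hypothetical dichotomy: 9 "successes" in 10 cases, Null Hypothesis P = .5.
print(binomial_two_sided_p(9, 10))
```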
*S-UNI-RANK,A
     In the special situation where "scores" or Ranks represent a SEQUENCE of
 cases, the so-called "Test for Runs Up and Down" can be used to test for a
 TREND, i.e., a tendency for scores to increase or decrease over a sequence.
     If data are NOT SEQUENCED and NOT RANKED, your best alternative is to
 categorize the data and to apply a test designed for "Partially Ordered"
 data (One-Sample Kolmogorov-Smirnov Test) or Nominal data (Chi-Square
 Goodness-of-Fit Test).  There is no Univariate test for UNSEQUENCED RANKS.
*S-UNI-PART,A
     The Kolmogorov-Smirnov One-Sample Test is recommended for a Categorized
 Ordinal ("Partially Ordered") variable.  It tests the Null Hypothesis that
 the random sample was drawn from a Population with some specified Proportion
 of cases in the various categories: you specify these Proportions based on
 theory or prior information about the Population.
*S-UNI-INT,A
     Use the One-Sample t-Test to determine whether it is likely that your
 sample was DRAWN FROM A POPULATION WITH A KNOWN (or guessed) MEAN, which
 you specify in the Null Hypothesis.  Besides requiring INTERVAL MEASUREMENT,
 valid application of this test assumes the sample was drawn from a NORMALLY
 DISTRIBUTED POPULATION.  Check to see that your data adequately meet these
 assumptions: most intro. texts explain conditions under which they may be
 relaxed.
     If you're interested in estimating the MEAN of the POPULATION from which
 your RANDOM SAMPLE was drawn, compute CONFIDENCE LIMITS FOR THE MEAN.
     If you're interested in the SHAPE of your variable's distribution, use
 the Chi-Square Goodness-of-Fit Test to see if it's likely that your SAMPLE
 was drawn from a POPULATION with an hypothesized proportion of cases in its
 various categories.  You specify the Population Proportions (P) in the NULL
 Hypothesis and multiply each P by Sample N to get EXPECTED FREQUENCIES for
 the test.  Within limits, you may hypothesize any set of P's derived from
 theory or prior knowledge of a population.  If you get the P's from a table
 of the Normal Distribution, you can use the Chi-Square Goodness-of-Fit Test
 to see whether it's likely that your sample came from a NORMALLY DISTRIBUTED
 POPULATION.
*S-2SAMPLE-INT,A
     Use Student's t-Test to compare TWO SUB-SAMPLE MEANS on an INTERVAL
 DEPENDENT VARIABLE, where RANDOM SAMPLING or RANDOM ASSIGNMENT of cases has
 yielded INDEPENDENT SUB-SAMPLES.  Valid application of this test assumes:
 1) that sub-samples were drawn from two NORMALLY DISTRIBUTED POPULATIONS, &
 2) that the two parent POPULATIONS have EQUAL VARIANCES.  Check to see that
 your data approximate these assumptions: most intro. texts list conditions
 under which these assumptions may be relaxed.  A special form of the t-test
 is available in cases where population variances are unequal.
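     As a sketch only, the Python below computes the t statistic and degrees
 of freedom in two ways: the pooled-variance form, which assumes equal
 population variances, and Welch's form, which is assumed here to be the
 "special form" meant for unequal variances.  The function names and data
 are hypothetical, and p-values must still be obtained from a t table or a
 statistics package.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Student's t with pooled variance; returns (t, degrees of freedom)."""
    n1, n2 = len(a), len(b)
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def welch_t(a, b):
    """Welch's t for unequal variances; returns (t, approximate df)."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a) / n1, variance(b) / n2
    t = (mean(a) - mean(b)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical sub-samples with visibly different spreads.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
print(pooled_t(group_a, group_b))
print(welch_t(group_a, group_b))
```

 With equal sub-sample sizes the two t values coincide and only the degrees
 of freedom differ.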
*S-2MATCH-INT,A
     Use the Matched-Pairs t-Test to compare TWO SUB-SAMPLE MEANS on an
 INTERVAL DEPENDENT VARIABLE, where RANDOM SAMPLING or RANDOM ASSIGNMENT has
 yielded MATCHED (dependent) SUB-SAMPLES.  Valid application of this test
 assumes that sub-samples were drawn from 2 NORMALLY DISTRIBUTED POPULATIONS.
 Check to see that your data approximate this assumption: most intro. texts
 list conditions under which it may be relaxed.
*ARCSINE,A
     A number of tests are available for comparing 2 dichotomous sub-samples,
 in cases where RANDOM SAMPLING OR RANDOM ASSIGNMENT has yielded INDEPENDENT
 SUB-SAMPLES.  (They are listed in order of preference.) The Arcsine Test is
 the preferred alternative, especially if sample size is small.  A Chi-Square
 Contingency Test, with data cast in a 2-by-2 table, gives similar results
 when sample size is large.  For smaller samples, Fisher's Exact may be used.
 Special forms of the z-test and t-test, which test for DIFFERENCES IN PRO-
 PORTIONS, are also applicable.  Consult a statistics text for the assump-
 tions underlying each of these tests.
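     WATSTAT does not give a formula for the Arcsine Test, so the Python
 sketch below rests on an assumption: it uses the form commonly given in
 textbooks, in which each proportion p is transformed to 2*arcsin(sqrt(p))
 and the difference is divided by sqrt(1/n1 + 1/n2) to give a z statistic.
 The function name and the data are hypothetical.

```python
import math
from statistics import NormalDist


def arcsine_test(x1, n1, x2, n2):
    """Return (z, two-sided p) for two independent sample proportions."""
    phi1 = 2 * math.asin(math.sqrt(x1 / n1))
    phi2 = 2 * math.asin(math.sqrt(x2 / n2))
    z = (phi1 - phi2) / math.sqrt(1 / n1 + 1 / n2)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical data: 30 of 50 cases vs. 20 of 50 cases in the "1" category.
z, p = arcsine_test(30, 50, 20, 50)
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```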
*FISHER-EXACT,A
     Fisher's Exact Test is usually the best alternative for detecting a
 difference between INDEPENDENT SUB-SAMPLES when sample size is very small
 and data can be cast in a 2-by-2 contingency table.  Fisher's Exact Test is
 also used as an alternative to the Chi-Square Contingency Test when sample
 size is too small to apply the latter: in such cases it is used to test for
 the significance of an ASSOCIATION BETWEEN 2 DICHOTOMOUS NOMINAL VARIABLES.
     Although it is not widely known, Fisher's Exact Test can be extended to
 tables larger than 2-by-2: the only problem is finding a computer program
 that calculates p-values for larger tables.
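     For the 2-by-2 case, the exact p-value can be sketched in Python from
 the hypergeometric distribution.  The function name and data are
 hypothetical, and the two-sided p-value here follows one common convention:
 it sums the probabilities of all tables (with the same margins) that are no
 more likely than the observed table.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided exact p-value for the 2-by-2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability of x cases in the top-left cell, margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    low, high = max(0, col1 - row2), min(row1, col1)
    cutoff = prob(a) * (1 + 1e-9)  # tolerance for floating-point ties
    return min(1.0, sum(prob(x) for x in range(low, high + 1) if prob(x) <= cutoff))


# Hypothetical very small sample cast in a 2-by-2 table.
print(fisher_exact_two_sided(3, 1, 1, 3))
```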
*MCNEMAR,A
     The McNemar Test is designed to compare a DICHOTOMOUS DEPENDENT VARIABLE
 across 2 MATCHED SUB-SAMPLES.  The Dependent variable may be inherently
 dichotomous or transformed to a dichotomy especially for the test. There is
 NO TEST designed to compare a Dependent variable with 3 or more categories
 across Matched Sub-Samples.
     The McNemar Test assumes only Nominal Measurement, but if an Ordinal
 Dependent variable is dichotomized at the Overall Median, it can be used as
 a test for differences between Medians for MATCHED SAMPLES.
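     The McNemar Test uses only the "discordant" pairs, i.e., matched pairs
 whose two members fall in different categories of the Dependent variable.
 The Python sketch below is an illustration with hypothetical names and data:
 b and c are the two discordant counts, the Chi-Square form (1 degree of
 freedom, no continuity correction) is shown alongside an exact Binomial
 version that may be preferable when b + c is small.

```python
from math import comb


def mcnemar(b, c):
    """Return (chi-square, exact two-sided p) from discordant counts b and c."""
    n = b + c
    chi2 = (b - c) ** 2 / n
    smaller = min(b, c)
    tail = sum(comb(n, i) for i in range(smaller + 1)) / 2 ** n
    return chi2, min(1.0, 2 * tail)


# Hypothetical data: 15 pairs changed one way, 5 pairs changed the other way.
chi2, p_exact = mcnemar(15, 5)
print(f"chi-square = {chi2:.2f}, exact p = {p_exact:.4f}")
```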
*MEDIAN-TEST,A
     The Median Test is designed to compare 2 INDEPENDENT SUB-SAMPLES when
 the DEPENDENT VARIABLE is ORDINAL and when it is feasible to determine the
 OVERALL MEDIAN OF THE TOTAL SAMPLE.  Although tests based on ranks are
 preferable, the Median Test is a good alternative when data are "Partially
 Ordered" or when sample size is so large that it is infeasible to rank the
 data.  The Median Test is really a "transformation" rather than a distinct
 test: data are cast in a 2-by-2 contingency table by breaking the Dependent
 variable at the overall Median; then either the Chi-Square Contingency Test
 or Fisher's Exact Test is applied, depending on sample size.
     The Median Test can also be applied when there are 3 or More INDEPENDENT
 SUB-SAMPLES.  In this case, the Dependent variable is again Dichotomized at
 the OVERALL MEDIAN, but data are cast in a 2-by-k contingency table, where
 k is the number of sub-samples.  Then the Chi-Square Contingency Test is
 applied.
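     The "transformation" step can be sketched in Python as below.  The
 function name and data are hypothetical, and counting scores equal to the
 overall Median as "not above" is an assumed convention (texts differ on how
 to handle such scores).  The function accepts 2 or more sub-samples and
 returns a 2-by-k table of raw frequencies.

```python
from statistics import median


def median_test_table(*groups):
    """Return (overall median, [above counts, not-above counts]) per group."""
    overall = median([score for group in groups for score in group])
    above = [sum(1 for score in group if score > overall) for group in groups]
    not_above = [len(group) - count for group, count in zip(groups, above)]
    return overall, [above, not_above]


# Hypothetical Ordinal scores for two independent sub-samples.
sample_1 = [1, 2, 3, 4, 5]
sample_2 = [6, 7, 8, 9, 10, 11]
overall, table = median_test_table(sample_1, sample_2)
print(overall, table)
```

 The resulting table would then be submitted to the Chi-Square Contingency
 Test or, for a small 2-by-2 table, to Fisher's Exact Test.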
*WILCOX-MATCH,A
     The appropriate test for a difference between TWO MATCHED SUB-SAMPLES,
 when the ORDINAL DEPENDENT VARIABLE is scored as RANKS, is the Wilcoxon
 Matched-Pairs Test [sometimes called the Matched-Pairs Signed-Ranks Test].
*WILCOX-RSUM,A
     Two tests, the Wilcoxon Rank-Sum Test and the Mann-Whitney U-Test, can
 be applied to test for a difference between TWO INDEPENDENT SUB-SAMPLES,
 when the ORDINAL DEPENDENT VARIABLE is scored as RANKS.  These are really
 two forms of the same test and yield exactly the same p-values.  Although
 the Mann-Whitney is more widely used, the Wilcoxon Rank-Sum Test is much
 easier to compute and interpret and, therefore, preferable.  [Don't confuse
 this Rank-Sum Test with Wilcoxon's Matched-Pairs Test, which is used for
 DEPENDENT SUB-SAMPLES.]
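     The equivalence of the two tests can be seen in a short Python sketch
 (hypothetical function name and data).  The scores of both sub-samples are
 ranked together, with tied scores given the average of their ranks; W is the
 sum of ranks in the first sub-sample, and the Mann-Whitney U for that
 sub-sample is simply W - n1(n1 + 1)/2.

```python
def rank_sum_and_u(a, b):
    """Return (Wilcoxon rank sum W for sample a, Mann-Whitney U for sample a)."""
    combined = sorted(list(a) + list(b))
    rank_of = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        # Positions i+1 .. j share one score: assign their average rank.
        rank_of[combined[i]] = (i + 1 + j) / 2
        i = j
    w = sum(rank_of[score] for score in a)
    u = w - len(a) * (len(a) + 1) / 2
    return w, u


# Hypothetical scores for two independent sub-samples.
print(rank_sum_and_u([1, 3, 5], [2, 4, 6, 8]))
```

 Because U differs from W only by a constant that depends on sub-sample
 size, the two statistics lead to the same p-value.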
*ONEWAY,A
     The appropriate significance test for differences between Means of three
 or more INDEPENDENT SUB-SAMPLES is the so-called "ONE-WAY ANOVA F-TEST."
 This is an "overall" test: it detects differences between pairs or combina-
 tions of sub-samples, but it can't specify which sub-samples differ.  Thus,
 it must be followed by more specific tests, called CONTRASTS, to pinpoint
 which sub-samples differ.  Besides assuming INDEPENDENT SUB-SAMPLES and
 INTERVAL MEASUREMENT, this F-Test assumes that sub-samples were drawn from
 NORMALLY DISTRIBUTED POPULATIONS that have EQUAL VARIANCES.  Check to see
 that your data approximate all these assumptions: most intro. texts specify
 conditions under which they may be relaxed.  Consult a specialized text on
 Analysis of Variance (ANOVA) for help in selecting a test for CONTRASTS
 following the overall F-Test.  [Usually, the Duncan Multiple-Range Test is
 best for Contrasts between PAIRS of sub-samples and the Scheffe Test best
 for Contrasts between GROUPS of sub-samples, but there are many other alter-
 natives that may be preferable in your case.]
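     As an illustration of the overall F-Test only (not of the follow-up
 CONTRASTS), the Python sketch below computes the One-Way ANOVA F-Ratio and
 its degrees of freedom.  The function name and data are hypothetical; the
 p-value must still be obtained from an F table or a statistics package.

```python
from statistics import mean


def one_way_f(*groups):
    """Return (F, df between, df within) for 2 or more independent groups."""
    k = len(groups)
    n = sum(len(group) for group in groups)
    grand_mean = mean([score for group in groups for score in group])
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)
    f_ratio = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f_ratio, k - 1, n - k


# Hypothetical scores for three independent sub-samples.
print(one_way_f([1, 2, 3], [2, 3, 4], [6, 7, 8]))
```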
*TWOWAY,A
     The best significance test for differences between Means of 3 or more
 MATCHED SUB-SAMPLES is the ANALYSIS OF VARIANCE F-TEST FOR RANDOMIZED BLOCKS,
 which is sometimes loosely called "TWO-WAY" ANOVA.  In this design, "Blocks"
 may be individual cases or sets of matched cases, which are represented in
 all the sub-samples.  Blocks are used to "control" extraneous between-case
 variation.  When individual cases appear in all the sub-samples, the design
 is referred to as a RANDOMIZED BLOCKS DESIGN WITH REPEATED MEASURES.
     The F-Test is an "overall" test: it detects differences between pairs or
 combinations of sub-samples, but it can't specify which sub-samples differ.
 Thus, it must be followed by more specific tests, called CONTRASTS, to pin-
 point which sub-samples differ.  Besides assuming INTERVAL MEASUREMENT, this
 F-Test assumes that sub-samples were drawn from NORMALLY DISTRIBUTED POPULA-
 TIONS that have EQUAL VARIANCES.  Check to see that your data approximate
 all these assumptions.  Specialized texts on Analysis of Variance (ANOVA)
 usually contain extensive explanations of underlying assumptions and also
 offer help in selecting a test for CONTRASTS following the overall F-Test.
*CR-FACTORIAL,A
     ANALYSIS OF VARIANCE with a COMPLETELY RANDOMIZED FACTORIAL (CRF) design
 is the best alternative when you have: 1) an INTERVAL DEPENDENT VARIABLE,
 2) TWO OR MORE COMPARISON VARIABLES, and 3) NO MATCHING of cases across
 sub-samples of any Comparison Variable. [The last condition implies that
 each case appears in the analysis one and only one time.]
     The CRF design yields an F-Test for each Comparison Variable and also
 for INTERACTION EFFECTS due to sets of these variables.  The F-Tests are
 "overall" tests: they detect differences between pairs or combinations of
 sub-samples, but don't specify which sub-samples differ.  Thus, they must
 be followed by more specific tests, called CONTRASTS, to pinpoint which
 sub-samples differ.  Besides INTERVAL MEASUREMENT, the F-Tests assume that
 the sub-samples were drawn from NORMALLY DISTRIBUTED POPULATIONS that have
 EQUAL VARIANCES.  Check to see that your data approximate all these assump-
 tions.  Specialized texts on Analysis of Variance usually contain extensive
 explanations of underlying assumptions and the conditions under which they
 may be relaxed.  Only a few offer help in selecting the most appropriate
 test for CONTRASTS in CRF Designs.
*RB-FACTORIAL,A
     ANALYSIS OF VARIANCE with a RANDOMIZED BLOCKS FACTORIAL (RBF) design is
 the best alternative if you have: 1) an INTERVAL DEPENDENT VARIABLE, 2) TWO
 OR MORE COMPARISON VARIABLES, and 3) MATCHED CASES or OBSERVATIONS across
 sub-samples of one or more Comparison Variables.  In this design, "Blocks"
 may be individual cases or sets of matched cases, which are represented in
 all the sub-samples of a Comparison Variable.  Blocks are used to "control"
 extraneous between-case variation.  When individual cases appear in all the
 sub-samples of any Comparison Variable, the design is referred to as a
 RANDOMIZED BLOCKS FACTORIAL DESIGN WITH REPEATED MEASURES.  When the Blocks
 are split into "Sub-Blocks" on one or more "Blocking Variables" the design
 is referred to as a SPLIT-PLOT DESIGN.
     The RBF design yields an F-Test for each Comparison Variable and also
 for INTERACTION EFFECTS due to sets of these variables.  The F-Tests are
 "overall" tests: they detect differences between pairs or combinations of
 sub-samples, but don't specify which sub-samples differ.  Thus, they must
 be followed by more specific tests, called CONTRASTS, to pinpoint which of
 the sub-samples differ.  Besides INTERVAL MEASUREMENT, the F-Tests assume
 that sub-samples were drawn from NORMALLY DISTRIBUTED POPULATIONS that have
 EQUAL VARIANCES.  Check to see that your data approximate all these assump-
 tions.  Specialized texts on Analysis of Variance usually contain extensive
 explanations of underlying assumptions and the conditions under which they
 may be relaxed.  Only a few offer help in selecting the most appropriate
 test for CONTRASTS in RBF or Split-Plot Designs.
*ANOVA/REGN,A
 [Traditional ANOVA computations for the above design require EQUAL FREQUEN-
 CIES in all the cells created when the sample is split by 2 or more Compar-
 ison Variables.  If cell frequencies are unequal, F-Ratios can be obtained
 through Multiple Regression procedures, of which ANOVA is a special case.
 Most computer programs use Multiple Regression for all ANOVA problems, but
 hide this fact by reporting results in a conventional ANOVA Summary Table.]
*ANCOVA,A
     If you have one or more Independent variables that you wish to "control"
 or "adjust for" without building them in as Comparison Variables, you can
 apply ANALYSIS OF COVARIANCE (ANCOVA) procedures.  ANCOVA is an extension of
 ANOVA in which the effects of one or more INTERVAL-LEVEL INDEPENDENT VARI-
 ABLES are "partialled out," through Multiple Regression procedures, before
 F-Ratios are computed for the major Comparison Variables.  Normally, vari-
 ables are selected for such adjustment because they create "extraneous"
 variation in the Dependent Variable and can't be eliminated physically.
 ANCOVA usually requires a computer and most popular statistical packages
 can perform it.  To use ANCOVA, you must meet all the assumptions of ANOVA
 and Multiple Regression, plus some additional ones unique to this procedure.
 Specialized texts on Analysis of Variance usually explain all these assump-
 tions and the conditions under which they may be relaxed.
*MANOVA,A
     MULTIVARIATE ANALYSIS OF VARIANCE (MANOVA) is an extension of ANOVA
 designed to handle two or more INTERVAL-LEVEL DEPENDENT VARIABLES simulta-
 neously.  The application of MANOVA and the interpretation of its results
 require advanced statistical training.  If you lack such expertise, and if
 your theory demands MANOVA, it would be wise to seek help from a statistical
 consultant before attempting to apply it.  It may be wiser yet to choose a
 procedure that can be applied in separate analyses for each Dependent vari-
 able.  If the latter alternative is feasible, WATSTAT may be able to offer
 more help: return to the Choice Boxes and select "Multivariate with ONE
 Dependent Variable" in Box 4.
*CHI-LOGIST,A
     Significance tests associated with Logistic Regression PARALLEL those
 used with Linear Multiple Regression: there are tests for overall fit of
 the equation as well as for individual Regression Coefficients.  However,
 as Logistic Regression is based on a different equation-fitting criterion,
 neither the tests nor their interpretations are IDENTICAL to their Linear
 counterparts.  Logistic Regression also has its own set of assumptions and
 limitations, which you'll need to consider.
*CHI-COMP-NOM,A
     Use the Chi-Square Contingency Test to determine whether it is likely
 that your RANDOM SAMPLE was drawn from a set of Sub-Populations (correspond-
 ing to your Sub-Samples) that have the same proportion of cases in the
 various categories of the Dependent Variable.  [Chi-Square must be computed
 on RAW FREQUENCIES: don't make the common beginner's error of computing it
 from a table of Percentages or Proportions.]
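     The point about RAW FREQUENCIES can be illustrated with a minimal Python
 sketch.  The function name and the counts below are hypothetical and are
 not part of WATSTAT.  Because the statistic depends on the actual number of
 cases, the same table expressed as percentages yields a different (and
 invalid) Chi-Square value.

```python
def chi_square_contingency(table):
    """Return (chi_square, degrees_of_freedom) for a table of raw counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the hypothesis of equal proportions.
            expected = row_totals[i] * col_totals[j] / n
            chi_sq += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi_sq, df


# Hypothetical data: rows are Sub-Samples, columns are categories.
counts = [[30, 20], [20, 30]]            # raw frequencies, N = 100
percents = [[60.0, 40.0], [40.0, 60.0]]  # the same table as row percentages

chi_sq, df = chi_square_contingency(counts)
wrong_chi_sq, _ = chi_square_contingency(percents)
print(chi_sq, df, wrong_chi_sq)
```

     With only 50 cases per Sub-Sample, the percentage table behaves as if
 there were 100 cases per Sub-Sample, so the statistic is inflated.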
*CHI-PHI,A
     The appropriate significance test for the Phi Coefficient or Cramer's V
 is the Chi-square Contingency Test.  Fisher's Exact Test may be used as a
 test for Phi if sample size is too small for the Chi-Square Test.
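     The Chi-Square test serves both coefficients because each is a simple
 function of Chi-Square and the number of cases.  A minimal Python sketch
 follows; the function names and input values are hypothetical and chosen
 only for illustration.

```python
import math


def phi_coefficient(chi_sq, n):
    """Phi for a 2 x 2 table: square root of Chi-Square divided by N."""
    return math.sqrt(chi_sq / n)


def cramers_v(chi_sq, n, n_rows, n_cols):
    """Cramer's V: rescales Chi-Square by N and the smaller table dimension."""
    smaller = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi_sq / (n * smaller))


# Hypothetical values: Chi-Square = 4.0 from a 2 x 2 table with N = 100,
# and Chi-Square = 12.0 from a 3 x 4 table with N = 150.
phi = phi_coefficient(4.0, 100)
v = cramers_v(12.0, 150, 3, 4)
print(phi, v)
```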
*TTEST-BIV-R,A
     A special t-Test or F-Test is used to test for the significance of the
 Correlation Coefficient (r) or the Regression Coefficient (b).  In the bi-
 variate case, t and F Tests yield exactly the same p-values and tests for
 r and b are equivalent.  Besides requiring INTERVAL MEASUREMENT, these tests
 assume BIVARIATE NORMALITY.  Check to see that your data approximate this
 assumption: most intro. texts list conditions under which it may be relaxed.
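     The equivalence of the t and F Tests in the bivariate case can be seen
 in a short Python sketch.  It uses the usual textbook formula for t with
 N - 2 degrees of freedom; the function name and the values of r and N are
 hypothetical.

```python
import math


def t_for_r(r, n):
    """t statistic for a correlation r from n cases, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Hypothetical result: r = 0.6 from 18 cases.
r, n = 0.6, 18
t = t_for_r(r, n)

# The F statistic (1 and n - 2 df) computed directly from r squared.
f = (r * r) * (n - 2) / (1.0 - r * r)
print(t, f)
```

     Here F equals t squared, which is why the two tests give identical
 p-values.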
*TTEST-RHO,A
     A special t-Test is used to test for the significance of Spearman's Rho.
 The computing formula for this test is the same as that used for the Linear
 Correlation Coefficient (r) except that Rho replaces r in the computations.
*ZTEST-TAU,A
     The significance test for Kendall's Tau uses a z-statistic, which is
 referred to a table of the Standard Normal Distribution to obtain p-values.
 For sample sizes less than 10, exact tables are available and should be used
 instead of the Normal approximation.
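     A minimal Python sketch of the Normal approximation follows.  It uses
 the standard variance formula for Tau under the Null Hypothesis and assumes
 NO TIED RANKS; the function names and the values of Tau and N are
 hypothetical.

```python
import math


def z_for_tau(tau, n):
    """z statistic for Kendall's Tau from n cases (no ties assumed)."""
    variance = 2.0 * (2 * n + 5) / (9.0 * n * (n - 1))
    return tau / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided p-value from the Standard Normal Distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical result: Tau = 0.4 from 20 cases.
z = z_for_tau(0.4, 20)
p = two_sided_p(z)
print(z, p)
```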
*FTEST-ETA,A
     The significance test used for the Correlation Ratio (Eta) is the F-Test
 obtained from a ONE-WAY ANALYSIS OF VARIANCE.
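     The link between Eta and the One-Way ANOVA F-Test can be sketched in
 Python.  The function name and the scores are hypothetical; the sketch
 computes F from the sums of squares and again from Eta-Squared alone.

```python
def one_way_anova(groups):
    """Return (F, eta_squared, df_between, df_within) for lists of scores."""
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    ss_within = ss_total - ss_between
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / ss_total
    return f_ratio, eta_squared, df_between, df_within


# Hypothetical scores for three Sub-Samples.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, eta_sq, df_b, df_w = one_way_anova(groups)

# The same F, recovered from Eta-Squared and the degrees of freedom.
f_from_eta = (eta_sq / df_b) / ((1.0 - eta_sq) / df_w)
print(f_ratio, eta_sq, f_from_eta)
```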
*FTEST-MULTR,A
     An F-Test is used to test for the significance of the Multiple Correla-
 tion Coefficient.  A special t-Test or F-Test (yielding identical p-values)
 is used to test the significance of each Regression Coefficient in the equa-
 tion.  F-Tests for "R-Square Change" can be used to test whether a set of
 two or more Independent Variables contributes significantly to the fit of
 the equation.  Valid application of these tests rests on many stringent
 assumptions: consult a Multiple Regression/Correlation text for information
 about these assumptions and check to see that your data meet them.
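     The overall F-Test and the "R-Square Change" F-Test can be sketched in
 Python using the usual textbook formulas.  The function names and all
 numerical values below are hypothetical; N is the number of cases and k the
 number of Independent Variables.

```python
def f_for_r_squared(r_sq, n, k):
    """Overall F for R-Square with k predictors: df = (k, n - k - 1)."""
    return (r_sq / k) / ((1.0 - r_sq) / (n - k - 1))


def f_for_r_squared_change(r_sq_full, r_sq_reduced, n, k_full, m):
    """F for adding m predictors to reach k_full: df = (m, n - k_full - 1)."""
    gain_per_predictor = (r_sq_full - r_sq_reduced) / m
    unexplained_per_df = (1.0 - r_sq_full) / (n - k_full - 1)
    return gain_per_predictor / unexplained_per_df


# Hypothetical results: 45 cases, R-Square rises from 0.40 (2 predictors)
# to 0.50 when 2 more predictors are added (4 in all).
f_overall = f_for_r_squared(0.50, 45, 4)
f_change = f_for_r_squared_change(0.50, 0.40, 45, 4, 2)
print(f_overall, f_change)
```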
*S-LOG-LIN,A
     Several significance tests are usually applied in a Log-Linear Analysis,
 all of which are referred to the Chi-Square Distribution to obtain p-values.
 In addition to a test for overall fit of a Log-Linear Model (analogous to a
 test for R-Squared in Regression), tests are usually made for MAIN EFFECTS
 and INTERACTION EFFECTS (analogous to F-Tests in Analysis of Variance).
*S-DISCRIM,A
     Several F-Tests are usually applied in a Discriminant Analysis, includ-
 ing: a test for fit of each discriminant function, tests for the contribu-
 tion of each Discriminant Function Coefficient, and tests for differences
 between groups.  Computer programs also use significance tests as criteria
 for including variables and for terminating the analysis.  [The validity of
 these criteria, like that of ALL significance tests, rests on the assumption
 of Random Sampling.]
*S-FACTOR-ANAL,A
     Numerous tests can be applied in Factor Analysis, including tests for
 Factor Loadings, Correlations between Factors, and the Number of Factors.
 When the focus is on description, as it is in so-called "Exploratory Factor
 Analysis," there is usually no need for any tests.  However, significance
 tests become central when the Factor Analysis is used to address theoretical
 hypotheses, as in "Confirmatory Factor Analysis."
*S-KENDALL-W,A
     The significance test for Kendall's W uses exact tables when sample
 size and the number of variables are small.  Otherwise, a Chi-Square stat-
 istic is used.  The Null Hypothesis tested is that the sample was drawn
 from a population in which the variables are mutually Independent.
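     A minimal Python sketch of the Chi-Square approximation follows.  It
 uses the standard formulas for W and for Chi-Square = m(n - 1)W with n - 1
 degrees of freedom, where m is the number of variables and n the number of
 cases, and it assumes NO TIED RANKS.  The function name and the rankings
 are hypothetical.

```python
def kendalls_w(rankings):
    """Return (W, chi_square, df); one list of ranks per variable."""
    m = len(rankings)        # number of variables
    n = len(rankings[0])     # number of cases ranked on each variable
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2.0
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    w = 12.0 * s / (m ** 2 * (n ** 3 - n))
    chi_sq = m * (n - 1) * w
    return w, chi_sq, n - 1


# Hypothetical ranks of 4 cases on each of 3 variables.
rankings = [[1, 2, 3, 4],
            [2, 1, 3, 4],
            [1, 3, 2, 4]]
w, chi_sq, df = kendalls_w(rankings)
print(w, chi_sq, df)
```

     A sample this small would normally be referred to exact tables; it is
 used here only to keep the arithmetic easy to follow.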
*S-COCHRANQ,A
     Cochran's Q Test is designed to compare a DICHOTOMOUS DEPENDENT VARIABLE
 across 3 or more MATCHED SUB-SAMPLES.  The Dependent variable may be inher-
 ently dichotomous or transformed to a dichotomy especially for the Q-test.
 There is NO TEST designed to compare a Dependent variable with 3 or more
 categories across Matched Sub-Samples.
     Cochran's Q Test assumes only Nominal Measurement, but if an Ordinal
 Dependent variable is dichotomized at the OVERALL MEDIAN, it can be used to
 test the Null Hypothesis that Matched Sub-Samples were RANDOMLY drawn from
 Populations with the same Median.
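     The Q statistic can be sketched in Python using its standard computing
 formula; it is referred to the Chi-Square Distribution with k - 1 degrees
 of freedom, where k is the number of Matched Sub-Samples.  The function
 name and the 0/1 scores are hypothetical.

```python
def cochrans_q(rows):
    """Return (Q, df); each row holds one matched set's 0/1 scores."""
    k = len(rows[0])                               # Matched Sub-Samples
    col_totals = [sum(col) for col in zip(*rows)]  # successes per Sub-Sample
    row_totals = [sum(row) for row in rows]        # successes per matched set
    total = sum(row_totals)
    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - total ** 2)
    denominator = k * total - sum(r * r for r in row_totals)
    return numerator / denominator, k - 1


# Hypothetical data: 6 matched sets scored 0 or 1 under 3 conditions.
rows = [[1, 1, 0],
        [1, 0, 0],
        [1, 1, 1],
        [1, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
q, df = cochrans_q(rows)
print(q, df)
```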
*KRUSKAL,A
     The Kruskal-Wallis Test is designed to compare an ORDINAL DEPENDENT
 VARIABLE across 3 or more INDEPENDENT SUB-SAMPLES.  If the Dependent vari-
 able is not inherently Ranked, it must be transformed to Ranks for the test.
 The Kruskal-Wallis is an analogue of One-Way ANOVA and uses a Chi-Square
 test statistic in place of the ANOVA F-Test.
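     The transformation to Ranks and the test statistic (usually called H)
 can be sketched in Python.  The function names and scores are hypothetical.
 Tied scores receive the average of the ranks they occupy, but the usual
 correction of H for ties is omitted to keep the sketch short.

```python
def midranks(values):
    """Rank values from 1 upward; ties share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for pos in range(i, j + 1):
            ranks[order[pos]] = average
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return (H, df) for lists of scores; no tie correction applied."""
    pooled = [x for g in groups for x in g]
    ranks = midranks(pooled)
    n_total = len(pooled)
    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)
    return h, len(groups) - 1


# Hypothetical scores for three Independent Sub-Samples.
h, df = kruskal_wallis_h([[12, 15, 18], [20, 22, 25], [28, 30, 33]])
print(h, df)
```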
*FRIEDMAN,A
     The Friedman Test is designed to compare an ORDINAL DEPENDENT VARIABLE
 across 3 or more MATCHED SUB-SAMPLES.  If the Dependent variable is not
 inherently Ranked, it must be transformed to Ranks for the test.  This test
 is an analogue of "Two-Way ANOVA" (Randomized Blocks ANOVA) and uses a
 Chi-Square test statistic in place of the ANOVA F-Test.
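     A minimal Python sketch of the Friedman statistic follows.  Scores are
 ranked WITHIN each matched set, and the statistic is referred to the
 Chi-Square Distribution with k - 1 degrees of freedom.  The function name
 and scores are hypothetical, and the sketch assumes NO TIES within a
 matched set.

```python
def friedman_chi_square(blocks):
    """Return (chi_square, df); each block is one matched set of k scores."""
    n = len(blocks)
    k = len(blocks[0])
    rank_sums = [0.0] * k
    for row in blocks:
        # Rank the k scores within this matched set (1 = lowest).
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    sum_sq = sum(r * r for r in rank_sums)
    chi_sq = 12.0 / (n * k * (k + 1)) * sum_sq - 3.0 * n * (k + 1)
    return chi_sq, k - 1


# Hypothetical data: 4 matched sets measured under 3 conditions.
blocks = [[10, 12, 15],
          [8, 9, 14],
          [11, 13, 16],
          [7, 10, 12]]
chi_sq, df = friedman_chi_square(blocks)
print(chi_sq, df)
```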
*S-COMP2-RANK,A
     There is no well-known significance test for Ordinal data that can
 handle 2 or more Independent (Comparison) Variables in a single analysis.
 That is, there are no Ordinal-Level analogues to Factorial ANOVA, Analysis
 of Covariance, etc., which are used with Interval Dependent Variables.
*S-COMP2-DICH,A
     There is no test designed to compare a DICHOTOMOUS DEPENDENT VARIABLE
 across SUB-SAMPLES created by 2 or more Independent (Comparison) variables.
 However, if it's appropriate to shift the Analytical Focus from "Sub-Sample
 Comparison" to "Association," a number of alternatives are open.  Among
 these are Logistic Regression and Discriminant Analysis.  If your Analytical
 Focus can be changed in this way -- if it MAKES SENSE to cast your research
 questions in terms of Association -- return to WATSTAT's Choice Boxes and
 select "No Sub-Sample Comparisons" in Box 2 and "Describe Association" in
 Box 3.  WATSTAT's Report will then give you more information about Logistic
 Regression and Discriminant Analysis.
*S-COMP2-NOM-IND,A
     There is no test designed to compare a NOMINAL DEPENDENT VARIABLE across
 SUB-SAMPLES created by 2 or more Independent (Comparison) variables.
     If it's appropriate to change your Analytical Focus from "Sub-Sample
 Comparison" to "Association," a number of alternatives are open, namely,
 Log-Linear Analysis, Logistic Regression, and Discriminant Analysis.  If it
 MAKES SENSE to re-cast your research questions in terms of Association,
 return to WATSTAT's Choice Boxes and select "No Sub-Sample Comparisons" in
 Box 2 and "Describe Association" in Box 3.  WATSTAT's Report will then give
 you more information about the above alternatives.  [All these alternatives
 require advanced statistical training: a wise novice will seek expert help.]
*S-COMP2-NOM-MATCH,A
     There is NO MULTIVARIATE TEST designed to compare a NOMINAL DEPENDENT
 VARIABLE across MATCHED SUB-SAMPLES created by 2 or more Comparison vari-
 ables.  If you haven't yet collected the data, consider ways to achieve an
 Interval-Level measure of the Dependent variable.  If the data are already
 collected, and if it's appropriate and feasible to dichotomize the Dependent
 variable, you may be able to use ANOVA F-Tests.  [This will also require a
 so-called ARCSINE TRANSFORMATION before ANOVA can be applied to a Dichotomous
 Dependent variable.]  If either of these options is viable in your case,
 return to WATSTAT's Choice Boxes and select "Interval" in Box 5.
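     One common form of the Arcsine Transformation is sketched below in
 Python.  It is applied to a PROPORTION (e.g., the proportion of cases in one
 category of the dichotomy within a cell).  Texts differ on details such as
 the multiplier of 2, so treat this as an assumption and not as the specific
 form WATSTAT has in mind; the function name is hypothetical.

```python
import math


def arcsine_transform(p):
    """Return 2 * arcsin(sqrt(p)) for a proportion p between 0 and 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return 2.0 * math.asin(math.sqrt(p))


# Hypothetical cell proportions.
for p in (0.10, 0.50, 0.90):
    print(p, arcsine_transform(p))
```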
*COPYRIGHT,A
 COPYRIGHT 1991 BY HAWKEYE SOFTWORKS, 300 GOLFVIEW AVE., IOWA CITY, IA, 52246
