
THE CHI-SQUARE
DISTRIBUTION AND THE ANALYSIS
OF FREQUENCIES
CHAPTER OVERVIEW
This chapter explores techniques that are commonly used in the analysis of
count or frequency data. Uses of the chi-square distribution, which was
mentioned briefly in Chapter 6, are discussed and illustrated in greater detail.
Additionally, statistical techniques often used in epidemiological studies are
introduced and demonstrated by means of examples.
TOPICS
INTRODUCTION
THE MATHEMATICAL PROPERTIES OF THE CHI-SQUARE DISTRIBUTION
TESTS OF GOODNESS-OF-FIT
TESTS OF INDEPENDENCE
TESTS OF HOMOGENEITY
THE FISHER EXACT TEST
RELATIVE RISK, ODDS RATIO, AND THE MANTEL-HAENSZEL STATISTIC
SUMMARY
LEARNING OUTCOMES
After studying this chapter, the student will
1. understand the mathematical properties of the chi-square distribution.
2. be able to use the chi-square distribution for goodness-of-fit tests.
3. be able to construct and use contingency tables to test independence
and homogeneity.
4. be able to apply Fisher’s exact test for 2 × 2 tables.
5. understand how to calculate and interpret the epidemiological concepts of relative
risk, odds ratios, and the Mantel-Haenszel statistic.
12.1
INTRODUCTION
In the chapters on estimation and hypothesis testing, brief mention is made of the chi-
square distribution in the construction of confidence intervals for, and the testing of,
hypotheses concerning a population variance. This distribution, which is one of the most
widely used distributions in statistical applications, has many other uses. Some of the more
common ones are presented in this chapter along with a more complete description of the
distribution itself, which follows in the next section.
The chi-square distribution is the most frequently employed statistical technique for
the analysis of count or frequency data. For example, we may know for a sample of
hospitalized patients how many are male and how many are female. For the same sample
we may also know how many have private insurance coverage, how many have Medicare
insurance, and how many are on Medicaid assistance. We may wish to know, for the
population from which the sample was drawn, if the type of insurance coverage differs
according to gender. For another sample of patients, we may have frequencies for each
diagnostic category represented and for each geographic area represented. We might want
to know if, in the population from which the sample was drawn, there is a relationship
between area of residence and diagnosis. We will learn how to use chi-square analysis to
answer these types of questions.
There are other statistical techniques that may be used to analyze frequency data in
an effort to answer other types of questions. In this chapter we will also learn about these
techniques.
12.2
THE MATHEMATICAL PROPERTIES
OF THE CHI-SQUARE DISTRIBUTION
The chi-square distribution may be derived from normal distributions. Suppose that from a
normally distributed random variable $Y$ with mean $\mu$ and variance $\sigma^2$ we randomly and
independently select samples of size $n = 1$. Each value selected may be transformed to the
standard normal variable $z$ by the familiar formula
$$z_i = \frac{y_i - \mu}{\sigma} \tag{12.2.1}$$
Each value of $z$ may be squared to obtain $z^2$. When we investigate the sampling distribution
of $z^2$, we find that it follows a chi-square distribution with 1 degree of freedom.
That is,
$$\chi^2_{(1)} = \left(\frac{y - \mu}{\sigma}\right)^2 = z^2$$
Now suppose that we randomly and independently select samples of size $n = 2$ from the
normally distributed population of $Y$ values. Within each sample we may transform each
value of $y$ to the standard normal variable $z$ and square as before. If the resulting values of $z^2$
for each sample are added, we may designate this sum by
$$\chi^2_{(2)} = \left(\frac{y_1 - \mu}{\sigma}\right)^2 + \left(\frac{y_2 - \mu}{\sigma}\right)^2 = z_1^2 + z_2^2$$
since it follows the chi-square distribution with 2 degrees of freedom, the number of
independent squared terms that are added together.
The procedure may be repeated for any sample size $n$. The sum of the resulting $z^2$
values in each case will be distributed as chi-square with $n$ degrees of freedom. In general,
then,
$$\chi^2_{(n)} = z_1^2 + z_2^2 + \cdots + z_n^2 \tag{12.2.2}$$
follows the chi-square distribution with n degrees of freedom. The mathematical form of
the chi-square distribution is as follows:
$$f(u) = \frac{1}{\left(\frac{k}{2} - 1\right)!}\,\frac{1}{2^{k/2}}\,u^{(k/2) - 1}\,e^{-(u/2)}, \qquad u > 0 \tag{12.2.3}$$
where $e$ is the irrational number 2.71828 . . . and $k$ is the number of degrees of freedom.
The variate $u$ is usually designated by the Greek letter chi ($\chi$) and, hence, the distribution is
called the chi-square distribution. As we pointed out in Chapter 6, the chi-square
distribution has been tabulated in Appendix Table F. Further use of the table is demonstrated
as the need arises in succeeding sections.
The mean and variance of the chi-square distribution are $k$ and $2k$, respectively. The
modal value of the distribution is $k - 2$ for values of $k$ greater than or equal to 2 and is zero
for $k = 1$.
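The density in Equation 12.2.3 and the stated mode can be checked numerically. The following is a minimal sketch, not a library-quality routine: it assumes that the factorial $(k/2 - 1)!$ is evaluated as the gamma function $\Gamma(k/2)$ when $k$ is odd, it uses a simple trapezoid rule on a grid that stops at an arbitrary upper limit, and the function names `chi_square_pdf` and `summarize` are illustrative only. It is intended for $k \geq 3$, where the density is finite at $u = 0$.

```python
import math

def chi_square_pdf(u, k):
    """Density of Equation 12.2.3, with (k/2 - 1)! evaluated as gamma(k/2)."""
    if u < 0:
        return 0.0
    return u ** (k / 2 - 1) * math.exp(-u / 2) / (math.gamma(k / 2) * 2 ** (k / 2))

def summarize(k, upper=80.0, step=0.01):
    """Trapezoid-rule area and mean, plus the grid point with the largest density."""
    n = int(round(upper / step))
    grid = [i * step for i in range(n + 1)]
    dens = [chi_square_pdf(u, k) for u in grid]
    area = 0.0
    mean = 0.0
    for i in range(n):
        area += 0.5 * (dens[i] + dens[i + 1]) * step
        mean += 0.5 * (grid[i] * dens[i] + grid[i + 1] * dens[i + 1]) * step
    mode = grid[max(range(n + 1), key=dens.__getitem__)]
    return area, mean, mode

area, mean, mode = summarize(k=5)
# For k = 5 the area should be close to 1, the mean close to k = 5,
# and the mode close to k - 2 = 3.
print(round(area, 3), round(mean, 3), round(mode, 2))
```
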
The shapes of the chi-square distributions for several values of $k$ are shown in Figure
6.9.1. We observe in this figure that the shapes for $k = 1$ and $k = 2$ are quite different from
the general shape of the distribution for $k > 2$. We also see from this figure that chi-square
assumes values between 0 and infinity. It cannot take on negative values, since it is the sum
of values that have been squared. A final characteristic of the chi-square distribution worth
noting is that the sum of two or more independent chi-square variables also follows a
chi-square distribution.
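The construction of chi-square as a sum of squared standard normal values, and the additive property just mentioned, can be illustrated by simulation. The sketch below is an informal check rather than a proof; the number of draws, the seeds, and the helper name `simulate_chi_square` are arbitrary choices made for the illustration.

```python
import random
import statistics

def simulate_chi_square(k, draws=20000, seed=12):
    """Each draw is the sum of k independent squared standard normal values."""
    rng = random.Random(seed)
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k)) for _ in range(draws)]

# Sum of 5 squared z values: mean should be near k = 5, variance near 2k = 10.
values = simulate_chi_square(k=5)
print(statistics.fmean(values), statistics.pvariance(values))

# Additive property: chi-square(2) + chi-square(3), drawn independently,
# should behave like chi-square(5).
part_a = simulate_chi_square(k=2, seed=1)
part_b = simulate_chi_square(k=3, seed=2)
summed = [a + b for a, b in zip(part_a, part_b)]
print(statistics.fmean(summed), statistics.pvariance(summed))
```

With a finite number of draws the sample mean and variance only approximate $k$ and $2k$; the agreement improves as the number of draws grows.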
Types of Chi-Square Tests As already noted, we make use of the chi-square
distribution in this chapter in testing hypotheses where the data available for analysis are
in the form of frequencies. These hypothesis testing procedures are discussed under the
topics of tests of goodness-of-fit, tests of independence, and tests of homogeneity. We will
discover that, in a sense, all of the chi-square tests that we employ may be thought of as
goodness-of-fit tests, in that they test the goodness-of-fit of observed frequencies to
frequencies that one would expect if the data were generated under some particular theory
or hypothesis. We, however, reserve the phrase “goodness-of-fit” for use in a more
restricted sense. We use it to refer to a comparison of a sample distribution to some theoretical
distribution that it is assumed describes the population from which the sample came. The
justification of our use of the distribution in these situations is due to Karl Pearson (1), who
showed that the chi-square distribution may be used as a test of the agreement between
observation and hypothesis whenever the data are in the form of frequencies. An extensive
treatment of the chi-square distribution is to be found in the book by Lancaster (2). Nikulin
and Greenwood (3) offer practical advice for conducting chi-square tests.
Observed Versus Expected Frequencies The chi-square statistic is most
appropriate for use with categorical variables, such as marital status, whose values are
the categories married, single, widowed, and divorced. The quantitative data used in
the computation of the test statistic are the frequencies associated with each category of the
one or more variables under study. There are two sets of frequencies with which we are
concerned, observed frequencies and expected frequencies. The observed frequencies
are the number of subjects or objects in our sample that fall into the various categories of
the variable of interest. For example, if we have a sample of 100 hospital patients, we may
observe that 50 are married, 30 are single, 15 are widowed, and 5 are divorced. Expected
frequencies are the number of subjects or objects in our sample that we would expect to
observe if some null hypothesis about the variable is true. For example, our null hypothesis
might be that the four categories of marital status are equally represented in the population
from which we drew our sample. In that case we would expect our sample to contain 25
married, 25 single, 25 widowed, and 25 divorced patients.
The Chi-Square Test Statistic The test statistic for the chi-square tests we
discuss in this chapter is
$$X^2 = \sum \left[\frac{(O_i - E_i)^2}{E_i}\right] \tag{12.2.4}$$
When the null hypothesis is true, $X^2$ is distributed approximately as $\chi^2$ with $k - r$
degrees of freedom. In determining the degrees of freedom, $k$ is equal to the number of
groups for which observed and expected frequencies are available, and $r$ is the number of
restrictions or constraints imposed on the given comparison. A restriction is imposed when
we force the sum of the expected frequencies to equal the sum of the observed frequencies,
and an additional restriction is imposed for each parameter that is estimated from the
sample.
In Equation 12.2.4, $O_i$ is the observed frequency for the $i$th category of the variable of
interest, and $E_i$ is the expected frequency (given that $H_0$ is true) for the $i$th category.
The quantity $X^2$ is a measure of the extent to which, in a given situation, pairs of
observed and expected frequencies agree. As we will see, the nature of $X^2$ is such that when
there is close agreement between observed and expected frequencies it is small, and when
the agreement is poor it is large. Consequently, only a sufficiently large value of $X^2$ will
cause rejection of the null hypothesis.
If there is perfect agreement between the observed frequencies and the frequencies
that one would expect, given that $H_0$ is true, the term $O_i - E_i$ in Equation 12.2.4 will be
equal to zero for each pair of observed and expected frequencies. Such a result would yield
a value of $X^2$ equal to zero, and we would be unable to reject $H_0$.
When there is disagreement between observed frequencies and the frequencies one
would expect given that $H_0$ is true, at least one of the $O_i - E_i$ terms in Equation 12.2.4 will
be a nonzero number. In general, the poorer the agreement between the $O_i$ and the $E_i$, the
greater or the more frequent will be these nonzero values. As noted previously, if the
agreement between the $O_i$ and the $E_i$ is sufficiently poor (resulting in a sufficiently large $X^2$
value), we will be able to reject $H_0$.
When there is disagreement between a pair of observed and expected frequencies, the
difference may be either positive or negative, depending on which of the two frequencies is
the larger. Since the measure of agreement, $X^2$, is a sum of component quantities whose
magnitudes depend on the difference $O_i - E_i$, positive and negative differences must be
given equal weight. This is achieved by squaring each $O_i - E_i$ difference. Dividing the
squared differences by the appropriate expected frequency converts the quantity to a term
that is measured in original units. Adding these individual $(O_i - E_i)^2/E_i$ terms yields $X^2$, a
summary statistic that reflects the extent of the overall agreement between observed and
expected frequencies.
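As an illustration of Equation 12.2.4, the marital-status frequencies quoted earlier (50 married, 30 single, 15 widowed, and 5 divorced among 100 patients, against 25 expected in each category) can be run through the formula. This is only a sketch; the function name `chi_square_statistic` is an illustrative choice and not part of any package.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all categories (Equation 12.2.4)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [50, 30, 15, 5]  # married, single, widowed, divorced
# Null hypothesis of equal representation: 100 / 4 = 25 expected per category.
expected = [sum(observed) / len(observed)] * len(observed)

# Individual components: 25.0, 1.0, 4.0, 16.0
components = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
x2 = chi_square_statistic(observed, expected)
print(components, x2)  # the components sum to 46.0
```

The large total reflects the poor agreement between the observed frequencies and the frequencies expected under equal representation.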
The Decision Rule The quantity $\sum[(O_i - E_i)^2/E_i]$ will be small if the observed
and expected frequencies are close together and will be large if the differences are large.
The computed value of $X^2$ is compared with the tabulated value of $\chi^2$ with $k - r$
degrees of freedom. The decision rule, then, is: Reject $H_0$ if $X^2$ is greater than or equal to the
tabulated $\chi^2$ for the chosen value of $\alpha$.
Small Expected Frequencies Frequently in applications of the chi-square test
the expected frequency for one or more categories will be small, perhaps much less than 1.
In the literature the point is frequently made that the approximation of $X^2$ to $\chi^2$ is not
strictly valid when some of the expected frequencies are small. There is disagreement
among writers, however, over what size expected frequencies are allowable before making
some adjustment or abandoning $\chi^2$ in favor of some alternative test. Some writers,
especially the earlier ones, suggest lower limits of 10, whereas others suggest that all
expected frequencies should be no less than 5. Cochran (4,5) suggests that for
goodness-of-fit tests of unimodal distributions (such as the normal), the minimum expected
frequency can be as low as 1. If, in practice, one encounters one or more expected
frequencies less than 1, adjacent categories may be combined to achieve the suggested
minimum. Combining reduces the number of categories and, therefore, the number of
degrees of freedom. Cochran’s suggestions appear to have been followed extensively by
practitioners in recent years.
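The combining of adjacent categories can be sketched in code. The routine below is one simple way to do it, under stated assumptions: it scans the categories in order, accumulates neighbors until the pooled expected frequency reaches the chosen minimum, and folds any leftover at the end into the last pooled group. The helper name `pool_small_categories` and the frequencies are hypothetical and serve only to illustrate the idea.

```python
def pool_small_categories(observed, expected, minimum=1.0):
    """Merge adjacent categories until every pooled expected frequency >= minimum."""
    pooled_obs, pooled_exp = [], []
    acc_obs = acc_exp = 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= minimum:
            pooled_obs.append(acc_obs)
            pooled_exp.append(acc_exp)
            acc_obs = acc_exp = 0.0
    # A leftover group at the end is folded into the last pooled group.
    if acc_exp > 0.0 or acc_obs > 0.0:
        if pooled_exp:
            pooled_obs[-1] += acc_obs
            pooled_exp[-1] += acc_exp
        else:
            pooled_obs.append(acc_obs)
            pooled_exp.append(acc_exp)
    return pooled_obs, pooled_exp

# Hypothetical frequencies: the first and last expected values are below 1.
observed = [0, 1, 3, 8, 5, 2, 1]
expected = [0.4, 1.4, 3.8, 6.9, 4.6, 2.2, 0.7]
pooled_obs, pooled_exp = pool_small_categories(observed, expected)
print(pooled_obs)       # seven categories reduce to five
print(len(pooled_exp))  # the count used when computing degrees of freedom
```

Because two categories disappear in this illustration, the degrees of freedom are computed from five groups rather than seven.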
12.3
TESTS OF GOODNESS-OF-FIT
As we have pointed out, a goodness-of-fit test is appropriate when one wishes to decide if
an observed distribution of frequencies is incompatible with some preconceived or
hypothesized distribution.

We may, for example, wish to determine whether or not a sample of observed values
of some random variable is compatible with the hypothesis that it was drawn from a
population of values that is normally distributed. The procedure for reaching a decision
consists of placing the values into mutually exclusive categories or class intervals and
noting the frequency of occurrence of values in each category. We then make use of our
knowledge of normal distributions to determine the frequencies for each category that one
could expect if the sample had come from a normal distribution. If the discrepancy is of
such magnitude that it could have come about due to chance, we conclude that the sample
may have come from a normal distribution. In a similar manner, tests of goodness-of-fit
may be carried out in cases where the hypothesized distribution is the binomial, the
Poisson, or any other distribution. Let us illustrate in more detail with some examples of
tests of hypotheses of goodness-of-fit.
EXAMPLE 12.3.1 The Normal Distribution
Cranor and Christensen (A-1) conducted a study to assess short-term clinical, economic,
and humanistic outcomes of pharmaceutical care services for patients with diabetes in
community pharmacies. For 47 of the subjects in the study, cholesterol levels are
summarized in Table 12.3.1.
We wish to know whether these data provide sufficient evidence to indicate that the
sample did not come from a normally distributed population. Let $\alpha = .05$.
Solution:
1. Data. See Table 12.3.1.
2. Assumptions. We assume that the sample available for analysis is a
simple random sample.
TABLE 12.3.1 Cholesterol Levels as Described in Example 12.3.1

Cholesterol Level (mg/dl)    Number of Subjects
100.0-124.9                           1
125.0-149.9                           3
150.0-174.9                           8
175.0-199.9                          18
200.0-224.9                           6
225.0-249.9                           4
250.0-274.9                           4
275.0-299.9                           3
Source: Data provided courtesy of Carole W. Cranor, and
Dale B. Christensen, “The Asheville Project: Short-Term
Outcomes of a Community Pharmacy Diabetes Care
Program,” Journal of the American Pharmaceutical
Association, 43 (2003), 149-159.

3. Hypotheses.
   $H_0$: In the population from which the sample was drawn, cholesterol
   levels are normally distributed.
   $H_A$: The sampled population is not normally distributed.
4. Test statistic. The test statistic is
   $$X^2 = \sum_{i=1}^{k} \left[\frac{(O_i - E_i)^2}{E_i}\right]$$
5. Distribution of test statistic. If $H_0$ is true, the test statistic is distributed
   approximately as chi-square with $k - r$ degrees of freedom. The values
   of $k$ and $r$ will be determined later.
6. Decision rule. We will reject $H_0$ if the computed value of $X^2$ is equal to
   or greater than the critical value of chi-square.
7. Calculation of test statistic. Since the mean and variance of the
   hypothesized distribution are not specified, the sample data must be
   used to estimate them. These parameters, or their estimates, will be
   needed to compute the frequency that would be expected in each class
   interval when the null hypothesis is true. The mean and standard
   deviation computed from the grouped data of Table 12.3.1 are
   $$\bar{x} = 198.67 \qquad s = 41.31$$
As the next step in the analysis, we must obtain for each class
interval the frequency of occurrence of values that we would expect when
the null hypothesis is true, that is, if the sample were, in fact, drawn from
a normally distributed population of values. To do this, we first determine
the expected relative frequency of occurrence of values for each class
interval and then multiply these expected relative frequencies by the total
number of values to obtain the expected number of values for each
interval.
The Expected Relative Frequencies
It will be recalled from our study of the normal distribution that the relative frequency of
occurrence of values equal to or less than some specified value, say, $x_0$, of the normally
distributed random variable $X$ is equivalent to the area under the curve and to the left of $x_0$
as represented by the shaded area in Figure 12.3.1. We obtain the numerical value of this
area by converting $x_0$ to a standard normal deviate by the formula $z_0 = (x_0 - \mu)/\sigma$ and
finding the appropriate value in Appendix Table D. We use this procedure to obtain the
expected relative frequencies corresponding to each of the class intervals in Table 12.3.1.
We estimate $\mu$ and $\sigma$ with $\bar{x}$ and $s$ as computed from the grouped sample data. The first step
consists of obtaining $z$ values corresponding to the lower limit of each class interval. The
area between two successive $z$ values will give the expected relative frequency of
occurrence of values for the corresponding class interval.

FIGURE 12.3.1 A normal distribution showing the relative frequency of occurrence of values
less than or equal to $x_0$. The shaded area represents the relative frequency of occurrence of values
equal to or less than $x_0$.
For example, to obtain the expected relative frequency of occurrence of values in the
interval 100.0 to 124.9 we proceed as follows:
The $z$ value corresponding to $X = 100.0$ is
$$z = \frac{100.0 - 198.67}{41.31} = -2.39$$
The $z$ value corresponding to $X = 125.0$ is
$$z = \frac{125.0 - 198.67}{41.31} = -1.78$$
In Appendix Table D we find that the area to the left of $-2.39$ is .0084, and the area to
the left of $-1.78$ is .0375. The area between $-1.78$ and $-2.39$ is equal to
$.0375 - .0084 = .0291$, which is equal to the expected relative frequency of occurrence
of cholesterol levels within the interval 100.0 to 124.9. This tells us that if the null
hypothesis is true, that is, if the cholesterol levels are normally distributed, we should
expect 2.91 percent of the values in our sample to be between 100.0 and 124.9. When we
multiply our total sample size, 47, by .0291 we find the expected frequency for the interval
to be 1.4. Similar calculations will give the expected frequencies for the other intervals as
shown in Table 12.3.2.
TABLE 12.3.2 Class Intervals and Expected Frequencies for Example 12.3.1

                     z = (x_i - x̄)/s      Expected
                     at Lower Limit        Relative      Expected
Class Interval       of Interval           Frequency     Frequency
< 100                                      .0084           .4
100.0-124.9          -2.39                 .0291          1.4
125.0-149.9          -1.78                 .0815          3.8
150.0-174.9          -1.18                 .1653          7.8
175.0-199.9           -.57                 .2277         10.7
200.0-224.9            .03                 .2269         10.7
225.0-249.9            .64                 .1536          7.2
250.0-274.9           1.24                 .0753          3.5
275.0-299.9           1.85                 .0251          1.2
300.0 and greater     2.45                 .0071           .3

The first two expected frequencies (.4 and 1.4) are combined to give 1.8, and the
last two (1.2 and .3) are combined to give 1.5.
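The expected frequencies for Example 12.3.1 can be approximated in code. This is a minimal sketch under two assumptions: the normal cumulative area is computed from the error function instead of being read from Appendix Table D, and each $z$ value is rounded to two decimals to imitate a table lookup. The helper names `normal_cdf` and `expected_frequencies` are illustrative.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_frequencies(lower_limits, mean, sd, n):
    """Expected counts for (-inf, L1), [L1, L2), ..., [Lm, +inf)."""
    # z values are rounded to two decimals to imitate reading a printed table.
    z_values = [round((x - mean) / sd, 2) for x in lower_limits]
    cumulative = [0.0] + [normal_cdf(z) for z in z_values] + [1.0]
    return [n * (cumulative[i + 1] - cumulative[i]) for i in range(len(cumulative) - 1)]

lower_limits = [100.0, 125.0, 150.0, 175.0, 200.0, 225.0, 250.0, 275.0, 300.0]
expected = expected_frequencies(lower_limits, mean=198.67, sd=41.31, n=47)

# Ten intervals: "< 100", the eight class intervals, and "300.0 and greater".
for e in expected:
    print(round(e, 1))
```

The printed values should agree with the expected-frequency column of Table 12.3.2 to about one decimal place; small differences can arise because the table works from four-decimal areas.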

Comparing Observed and Expected Frequencies
We are now interested in examining the magnitudes of the discrepancies between the
observed frequencies and the expected frequencies, since we note that the two sets of
frequencies do not agree. We know that even if our sample were drawn from a normal
distribution of values, sampling variability alone would make it highly unlikely that the
observed and expected frequencies would agree perfectly. We wonder, then, if the
discrepancies between the observed and expected frequencies are small enough that we
feel it reasonable that they could have occurred by chance alone, when the null hypothesis
is true. If they are of this magnitude, we will be unwilling to reject the null hypothesis that
the sample came from a normally distributed population.
If the discrepancies are so large that it does not seem reasonable that they could have
occurred by chance alone when the null hypothesis is true, we will want to reject the null
hypothesis. The criterion against which we judge whether the discrepancies are “large” or
“small” is provided by the chi-square distribution.
The observed and expected frequencies along with each value of $(O_i - E_i)^2/E_i$ are
shown in Table 12.3.3. The first entry in the last column, for example, is computed from
$(1 - 1.8)^2/1.8 = .356$. The other values of $(O_i - E_i)^2/E_i$ are computed in a similar
manner.
From Table 12.3.3 we see that $X^2 = \sum[(O_i - E_i)^2/E_i] = 10.566$. The appropriate
degrees of freedom are 8 (the number of groups or class intervals) $- 3$ (for the three
restrictions: making $\sum E_i = \sum O_i$, and estimating $\mu$ and $\sigma$ from the sample data) $= 5$.
8. Statistical decision. When we compare $X^2 = 10.566$ with values of $\chi^2$ in
Appendix Table F, we see that it is less than $\chi^2_{.95} = 11.070$, so that, at the
.05 level of significance, we cannot reject the null hypothesis that the
sample came from a normally distributed population.
TABLE 12.3.3 Observed and Expected Frequencies and $(O_i - E_i)^2/E_i$ for Example 12.3.1

                     Observed      Expected
                     Frequency     Frequency
Class Interval       (O_i)         (E_i)         (O_i - E_i)^2/E_i
< 100                  0             .4
100.0-124.9            1            1.4
  (combined)           1            1.8             .356
125.0-149.9            3            3.8             .168
150.0-174.9            8            7.8             .005
175.0-199.9           18           10.7            4.980
200.0-224.9            6           10.7            2.064
225.0-249.9            4            7.2            1.422
250.0-274.9            4            3.5             .071
275.0-299.9            3            1.2
300.0 and greater      0             .3
  (combined)           3            1.5            1.500
Total                 47           47             10.566

9. Conclusion. We conclude that in the sampled population, cholesterol
levels may follow a normal distribution.
10. p value. Since $11.070 > 10.566 > 9.236$, $.05 < p < .10$. In other words,
the probability of obtaining a value of $X^2$ as large as 10.566, when the null
hypothesis is true, is between .05 and .10. Thus we conclude that such an
event is not sufficiently rare to reject the null hypothesis that the data come
from a normal distribution.
&
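The calculation and decision steps of Example 12.3.1 can be retraced with a short script. This is a sketch under stated assumptions: the observed and expected frequencies are the combined values of Table 12.3.3, and the upper-tail probability is computed with a hand-written helper, `chi_square_sf`, based on the recurrence $Q_{k+2}(x) = Q_k(x) + (x/2)^{k/2} e^{-x/2} / \Gamma(k/2 + 1)$ instead of being read from Appendix Table F. The helper is illustrative, not a routine from any statistical package.

```python
import math

def chi_square_sf(x, df):
    """Upper-tail probability P(chi-square with df degrees of freedom >= x)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        q, k = math.exp(-x / 2), 2              # exact tail area for 2 df
    else:
        q, k = math.erfc(math.sqrt(x / 2)), 1   # exact tail area for 1 df
    while k < df:
        # Step from k to k + 2 degrees of freedom.
        q += (x / 2) ** (k / 2) * math.exp(-x / 2) / math.gamma(k / 2 + 1)
        k += 2
    return q

# Combined categories from Table 12.3.3.
observed = [1, 3, 8, 18, 6, 4, 4, 3]
expected = [1.8, 3.8, 7.8, 10.7, 10.7, 7.2, 3.5, 1.5]

x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 3   # restrictions: equal totals, estimated mean, estimated sd
p_value = chi_square_sf(x2, df)

print(round(x2, 2), df, round(p_value, 3))
print("reject H0" if p_value <= 0.05 else "cannot reject H0")
```

The statistic computed from unrounded components differs from 10.566 only in the third decimal, and the resulting probability falls between .05 and .10, in line with the bounds obtained from the table.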
Sometimes the parameters are specified in the null hypothesis. It should be noted
that had the mean and variance of the population been specified as part of the null
hypothesis in Example 12.3.1, we would not have had to estimate them from the sample
and our degrees of freedom would have been $8 - 1 = 7$.
Alternatives Although one frequently encounters in the literature the use of chi-
square to test for normality, it is not the most appropriate test to use when the hypothesized
distribution is continuous. The Kolmogorov-Smirnov test, described in Chapter 13, was
especially designed for goodness-of-fit tests involving continuous distributions.
EXAMPLE 12.3.2 The Binomial Distribution
In a study designed to determine patient acceptance of a new pain reliever, 100 physicians
each selected a sample of 25 patients to participate in the study. Each patient, after trying
the new pain reliever for a specified period of time, was asked whether it was preferable to
the pain reliever used regularly in the past.
The results of the study are shown in Table 12.3.4.
TABLE 12.3.4 Results of Study Described in Example 12.3.2

Number of Patients       Number of Doctors     Total Number of Patients
Out of 25 Preferring     Reporting this        Preferring New Pain
New Pain Reliever        Number                Reliever by Doctor
0                          5                      0
1                          6                      6
2                          8                     16
3                         10                     30
4                         10                     40
5                         15                     75
6                         17                    102
7                         10                     70
8                         10                     80
9                          9                     81
10 or more                 0                      0
Total                    100                    500

We are interested in determining whether or not these data are compatible with the
hypothesis that they were drawn from a population that follows a binomial distribution.
Again, we employ a chi-square goodness-of-fit test.
Solution: Since the binomial parameter, $p$, is not specified, it must be estimated from
the sample data. A total of 500 patients out of the 2500 patients participating
in the study said they preferred the new pain reliever, so that our point
estimate of $p$ is $\hat{p} = 500/2500 = .20$. The expected relative frequencies can
be obtained by evaluating the binomial function
$$f(x) = {}_{25}C_x\,(.2)^x (.8)^{25 - x}$$
for $x = 0, 1, \ldots, 25$. For example, to find the probability that out of a sample
of 25 patients none would prefer the new pain reliever, when in the total
population the true proportion preferring the new pain reliever is .2, we would
evaluate
$$f(0) = {}_{25}C_0\,(.2)^0 (.8)^{25 - 0}$$
This can be done most easily by consulting Appendix Table B, where we see
that $P(X = 0) = .0038$. The relative frequency of occurrence of samples of
size 25 in which no patients prefer the new pain reliever is .0038. To obtain
the corresponding expected frequency, we multiply .0038 by 100 to get .38.
Similar calculations yield the remaining expected frequencies, which, along
with the observed frequencies, are shown in Table 12.3.5. We see in this table
that the first expected frequency is less than 1, so that we follow Cochran’s
suggestion and combine this group with the second group. When we do this,
all the expected frequencies are greater than 1.

TABLE 12.3.5 Calculations for Example 12.3.2

Number of Patients       Number of Doctors
Out of 25 Preferring     Reporting This Number        Expected Relative     Expected
New Pain Reliever        (Observed Frequency, O_i)    Frequency             Frequency E_i
0                          5                          .0038                   .38
1                          6                          .0236                  2.36
  (combined)              11                                                 2.74
2                          8                          .0708                  7.08
3                         10                          .1358                 13.58
4                         10                          .1867                 18.67
5                         15                          .1960                 19.60
6                         17                          .1633                 16.33
7                         10                          .1109                 11.09
8                         10                          .0623                  6.23
9                          9                          .0295                  2.95
10 or more                 0                          .0173                  1.73
Total                    100                         1.0000                100.00
From the data, we compute
$$X^2 = \frac{(11 - 2.74)^2}{2.74} + \frac{(8 - 7.08)^2}{7.08} + \cdots + \frac{(0 - 1.73)^2}{1.73} = 47.624$$
The appropriate degrees of freedom are 10 (the number of groups left
after combining the first two) less 2, or 8. One degree of freedom is lost
because we force the total of the expected frequencies to equal the total
observed frequencies, and one degree of freedom is sacrificed because we
estimated p from the sample data.
We compare our computed $X^2$ with the tabulated $\chi^2$ with 8 degrees of
freedom and find that it is significant at the .005 level of significance; that is,
$p < .005$. We reject the null hypothesis that the data came from a binomial
distribution.
&
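The binomial goodness-of-fit calculation of Example 12.3.2 can be sketched as follows. The assumptions are these: the binomial probabilities are computed exactly with `math.comb` instead of being read to four decimals from Appendix Table B, so the statistic may differ from 47.624 in the first or second decimal; and the tail probability uses the closed form that holds only for an even number of degrees of freedom. The helper name `chi_square_sf_even` is illustrative.

```python
import math

n_trials, n_doctors = 25, 100
# Doctors reporting 0, 1, ..., 9 and "10 or more" patients preferring the new drug.
observed = [5, 6, 8, 10, 10, 15, 17, 10, 10, 9, 0]

# Estimate p from the sample; the "10 or more" group is empty and adds nothing.
total_preferring = sum(x * o for x, o in enumerate(observed))
p_hat = total_preferring / (n_trials * n_doctors)

probs = [math.comb(n_trials, x) * p_hat ** x * (1 - p_hat) ** (n_trials - x)
         for x in range(10)]
probs.append(1.0 - sum(probs))            # probability of 10 or more
expected = [n_doctors * p for p in probs]

# Combine the first two groups, since the first expected frequency is below 1.
obs_pooled = [observed[0] + observed[1]] + observed[2:]
exp_pooled = [expected[0] + expected[1]] + expected[2:]

x2 = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))
df = len(obs_pooled) - 2                  # equal totals, plus one estimated parameter

def chi_square_sf_even(x, df):
    """Upper-tail chi-square probability; valid only when df is even."""
    half = x / 2
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))

p_value = chi_square_sf_even(x2, df)
print(p_hat, round(x2, 2), df, p_value < 0.005)
```
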
EXAMPLE 12.3.3 The Poisson Distribution
A hospital administrator wishes to test the null hypothesis that emergency admissions
follow a Poisson distribution with $\lambda = 3$. Suppose that over a period of 90 days the numbers
of emergency admissions were as shown in Table 12.3.6.
TABLE 12.3.6 Number of Emergency Admissions to a Hospital During a 90-Day Period

       Emergency          Emergency          Emergency          Emergency
Day    Admissions  Day    Admissions  Day    Admissions  Day    Admissions
 1         2        24        5        47        4        70        3
 2         3        25        3        48        2        71        5
 3         4        26        2        49        2        72        4
 4         5        27        4        50        3        73        1
 5         3        28        4        51        4        74        1
 6         2        29        3        52        2        75        6
 7         3        30        5        53        3        76        3
 8         0        31        1        54        1        77        3
 9         1        32        3        55        2        78        5
10         0        33        2        56        3        79        2
11         1        34        4        57        2        80        1
12         0        35        2        58        5        81        7
13         6        36        5        59        2        82        7
14         4        37        0        60        7        83        1
15         4        38        6        61        8        84        5
16         4        39        4        62        3        85        1
17         3        40        4        63        1        86        4
18         4        41        5        64        3        87        4
19         3        42        1        65        1        88        9
20         3        43        3        66        0        89        2
21         3        44        1        67        3        90        3
22         4        45        2        68        2
23         3        46        3        69        1
The data of Table 12.3.6 are summarized in Table 12.3.7.
TABLE 12.3.7 Summary of Data Presented in Table 12.3.6

Number of Emergency       Number of Days This Number of
Admissions in a Day       Emergency Admissions Occurred
0                           5
1                          14
2                          15
3                          23
4                          16
5                           9
6                           3
7                           3
8                           1
9                           1
10 or more                  0
Total                      90

Solution: To obtain the expected frequencies we first obtain the expected relative
frequencies by evaluating the Poisson function given by Equation 4.4.1 for
each entry in the left-hand column of Table 12.3.7. For example, the first
expected relative frequency is obtained by evaluating
$$f(0) = \frac{e^{-3}\,3^0}{0!}$$
We may use Appendix Table C to find this and all the other expected relative
frequencies that we need. Each of the expected relative frequencies
is multiplied by 90 to obtain the corresponding expected frequencies.
These values along with the observed and expected frequencies and the
components of $X^2$, $(O_i - E_i)^2/E_i$, are displayed in Table 12.3.8, in which we
see that
$$X^2 = \sum \left[\frac{(O_i - E_i)^2}{E_i}\right] = \frac{(5 - 4.50)^2}{4.50} + \cdots + \frac{(2 - 1.08)^2}{1.08} = 3.664$$

TABLE 12.3.8 Observed and Expected Frequencies and Components of $X^2$ for Example 12.3.3

Number of        Number of Days      Expected
Emergency        this Number         Relative      Expected
Admissions       Occurred, O_i       Frequency     Frequency     (O_i - E_i)^2/E_i
0                  5                 .050            4.50           .056
1                 14                 .149           13.41           .026
2                 15                 .224           20.16          1.321
3                 23                 .224           20.16           .400
4                 16                 .168           15.12           .051
5                  9                 .101            9.09           .001
6                  3                 .050            4.50           .500
7                  3                 .022            1.98           .525
8                  1                 .008             .72
9                  1                 .003             .27
10 or more         0                 .001             .09
  (combined)       2                                 1.08           .784
Total             90                1.000           90.00          3.664
We also note that the last three expected frequencies are less than 1, so that
they must be combined to avoid having any expected frequencies less than 1.
This means that we have only nine effective categories for computing degrees
of freedom. Since the parameter, $\lambda$, was specified in the null hypothesis, we
do not lose a degree of freedom for reasons of estimation, so that the
appropriate degrees of freedom are $9 - 1 = 8$. By consulting Appendix
Table F, we find that the critical value of $\chi^2$ for 8 degrees of freedom and
$\alpha = .05$ is 15.507, so that we cannot reject the null hypothesis at the .05 level,
or for that matter any reasonable level, of significance ($p > .10$). We
conclude, therefore, that emergency admissions at this hospital may follow
a Poisson distribution with $\lambda = 3$. At least the observed data do not cast any
doubt on that hypothesis.
If the parameter $\lambda$ has to be estimated from sample data, the estimate is
obtained by multiplying each value $x$ by its frequency, summing these
products, and dividing the total by the sum of the frequencies.
&
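The Poisson calculations of Example 12.3.3 can be sketched in a few lines. Two assumptions are built in: the Poisson probabilities are rounded to three decimals to imitate Appendix Table C, and the small expected frequencies are assumed to sit only in the upper tail, so that pooling everything from the first small category onward is adequate. The last lines also apply the estimation rule for $\lambda$ described above, purely as an illustration, since $\lambda = 3$ is specified in the null hypothesis.

```python
import math

lam, n_days = 3.0, 90
# Days with 0, 1, ..., 9 and "10 or more" emergency admissions (Table 12.3.7).
observed = [5, 14, 15, 23, 16, 9, 3, 3, 1, 1, 0]

# Expected relative frequencies, rounded to three decimals as in a printed table.
probs = [round(math.exp(-lam) * lam ** x / math.factorial(x), 3) for x in range(10)]
probs.append(1.0 - sum(probs))            # 10 or more
expected = [n_days * p for p in probs]

# Pool the upper tail, starting at the first expected frequency below 1.
first_small = next(i for i, e in enumerate(expected) if e < 1.0)
obs_pooled = observed[:first_small] + [sum(observed[first_small:])]
exp_pooled = expected[:first_small] + [sum(expected[first_small:])]

x2 = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))
df = len(obs_pooled) - 1                  # lambda was specified, not estimated

critical_value = 15.507                   # chi-square, 8 df, alpha = .05 (from the text)
print(round(x2, 3), df, "reject H0" if x2 >= critical_value else "cannot reject H0")

# Estimate of lambda if it had not been specified: sum of x * frequency / total days.
# The empty "10 or more" group contributes nothing to the sum.
lam_hat = sum(x * o for x, o in enumerate(observed)) / sum(observed)
print(round(lam_hat, 3))
```

Had the estimate been used in place of the specified value, one further degree of freedom would be lost for the estimated parameter.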

EXAMPLE 12.3.4 The Uniform Distribution
The flu season in southern Nevada for 2005-2006 ran from December to April, the
coldest months of the year. The Southern Nevada Health District reported the numbers
of vaccine-preventable influenza cases shown in Table 12.3.9. We are interested in
knowing whether the numbers of flu cases in the district are equally distributed among
the five flu season months. That is, we wish to know if flu cases follow a uniform
distribution.
Solution:
1. Data. See Table 12.3.9.
2. Assumptions. We assume that the reported cases of flu constitute a
   simple random sample of cases of flu that occurred in the district.
3. Hypotheses.
   $H_0$: Flu cases in southern Nevada are uniformly distributed over the five
   flu season months.
   $H_A$: Flu cases in southern Nevada are not uniformly distributed over the
   five flu season months.
   Let $\alpha = .01$.
4. Test statistic. The test statistic is
   $$X^2 = \sum \frac{(O_i - E_i)^2}{E_i}$$
5. Distribution of test statistic. If $H_0$ is true, $X^2$ is distributed approximately
   as $\chi^2$ with $(5 - 1) = 4$ degrees of freedom.
6. Decision rule. Reject $H_0$ if the computed value of $X^2$ is equal to or
   greater than 13.277.
TABLE 12.3.9 Reported Vaccine-Preventable Influenza Cases from Southern Nevada,
December 2005-April 2006

                    Number of Reported
Month               Cases of Influenza
December 2005          62
January 2006           84
February 2006          17
March 2006             16
April 2006             21
Total                 200
epidemiology/disease_statistics.htm.

[Bar chart: "Chart of Observed and Expected Values"; expected and observed counts (vertical axis, 0 to 90) for categories 1 through 5]

Chi-Square Goodness-of-Fit Test for Observed Counts in Variable: C1

                        Test                     Contribution
Category   Observed     Proportion   Expected    to Chi-Sq
1            62         0.2            40         12.100
2            84         0.2            40         48.400
3            17         0.2            40         13.225
4            16         0.2            40         14.400
5            21         0.2            40          9.025

N      DF    Chi-Sq    P-Value
200    4     97.15     0.000

FIGURE 12.3.2 MINITAB output for Example 12.3.4.
7. Calculation of test statistic. If the null hypothesis is true, we would
expect to observe $200/5 = 40$ cases per month. Figure 12.3.2 shows the
computer printout obtained from MINITAB. The bar graph shows the
observed and expected frequencies per month. The chi-square table
provides the observed frequencies, the expected frequencies based on a
uniform distribution, and the individual chi-square contribution for each
test value.
8. Statistical decision. Since 97.15, the computed value of $X^2$, is greater
than 13.277, we reject, based on these data, the null hypothesis of a
uniform distribution of flu cases during the flu season in southern
Nevada.
9. Conclusion. We conclude that the occurrence of flu cases does not
follow a uniform distribution.
10. p value. From the MINITAB output we see that $p = .000$ (i.e., $p < .001$).
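The calculation in Example 12.3.4 can be checked with a short Python sketch. The function names `chi_square_gof` and `chi2_upper_tail_even_df` are hypothetical, chosen for illustration; the p value uses the standard closed form for the chi-square upper tail, which holds only when the degrees of freedom are even. It is offered as a check on the MINITAB output, not as the procedure MINITAB itself uses.

```python
import math

def chi_square_gof(observed, proportions):
    """Return (X2, df, expected) for a goodness-of-fit test with
    fully specified category proportions (no estimated parameters)."""
    n = sum(observed)
    expected = [n * p for p in proportions]
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return x2, len(observed) - 1, expected

def chi2_upper_tail_even_df(x, df):
    """P(chi-square with df degrees of freedom >= x); closed form for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form requires even degrees of freedom")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))

# Observed counts from Table 12.3.9, December 2005 through April 2006
observed = [62, 84, 17, 16, 21]
x2, df, expected = chi_square_gof(observed, [0.2] * 5)
p_value = chi2_upper_tail_even_df(x2, df)

print(f"X2 = {x2:.2f}, df = {df}, p < .001: {p_value < 0.001}")
```

Under these assumptions the sketch reproduces the expected count of 40 per month and the computed value of 97.15 on 4 degrees of freedom.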
EXAMPLE 12.3.5
A certain human trait is thought to be inherited according to the ratio 1:2:1 for homozygous
dominant, heterozygous, and homozygous recessive. An examination of a simple random
sample of 200 individuals yielded the following distribution of the trait: dominant, 43;
heterozygous, 125; and recessive, 32. We wish to know if these data provide sufficient
evidence to cast doubt on the belief about the distribution of the trait.
Solution:
1.
Data. See statement of the example.
2.
Assumptions. We assume that the data meet the requirements for the
application of the chi-square goodness-of-fit test.
3.
Hypotheses.
H0: The trait is distributed according to the ratio 1:2:1 for homozygous
dominant, heterozygous, and homozygous recessive.
HA: The trait is not distributed according to the ratio 1:2:1.
4.
Test statistic. The test statistic is
\[ X^2 = \sum \left[ \frac{(O - E)^2}{E} \right] \]
5.
Distribution of test statistic. If $H_0$ is true, $X^2$ is distributed as chi-square
with 2 degrees of freedom.
6.
Decision rule. Suppose we let the probability of committing a type I
error be .05. Reject $H_0$ if the computed value of $X^2$ is equal to or greater
than 5.991.
7.
Calculation of test statistic. If H0 is true, the expected frequencies for
the three manifestations of the trait are 50, 100, and 50 for dominant,
heterozygous, and recessive, respectively. Consequently,
\[ X^2 = \frac{(43 - 50)^2}{50} + \frac{(125 - 100)^2}{100} + \frac{(32 - 50)^2}{50} = 13.71 \]
8.
Statistical decision. Since $13.71 > 5.991$, we reject $H_0$.
9.
Conclusion. We conclude that the trait is not distributed according to the
ratio 1:2:1.
10.
p value. Since $13.71 > 10.597$, the p value for the test is $p < .005$.
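As a check on Example 12.3.5, the following Python sketch converts the hypothesized 1:2:1 ratio into expected frequencies and computes the test statistic. The variable names are illustrative only. The p value line relies on the fact that, for exactly 2 degrees of freedom, the chi-square upper-tail probability is $e^{-x/2}$; this identity is a standard result and is not part of the worked example itself.

```python
import math

ratio = [1, 2, 1]             # hypothesized 1:2:1 inheritance ratio
observed = [43, 125, 32]      # dominant, heterozygous, recessive

n = sum(observed)
expected = [n * r / sum(ratio) for r in ratio]

# Sum of (O - E)^2 / E over the three categories
x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Exact upper-tail probability, valid because df == 2
p_value = math.exp(-x2 / 2)

print(f"expected = {expected}, X2 = {x2:.2f}, df = {df}, p = {p_value:.4f}")
```

The resulting p value is smaller than .005, in agreement with the table-based bound in step 10.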

EXERCISES
12.3.1
The following table shows the distribution of uric acid determinations taken on 250 patients. Test the
goodness-of-fit of these data to a normal distribution with $\mu = 5.74$ and $\sigma = 2.01$. Let $\alpha = .01$.

Uric Acid Determination   Observed Frequency
<1                           1
1 to 1.99                    5
2 to 2.99                   15
3 to 3.99                   24
4 to 4.99                   43
5 to 5.99                   50
6 to 6.99                   45
7 to 7.99                   30
8 to 8.99                   22
9 to 9.99                   10
10 or higher                 5
Total                      250
12.3.2
The following data were collected on 300 eight-year-old girls. Test, at the .05 level of significance,
the null hypothesis that the data are drawn from a normally distributed population. The sample
mean and standard deviation computed from grouped data are 127.02 and 5.08.
Height in Centimeters   Observed Frequency
114 to 115.9               5
116 to 117.9              10
118 to 119.9              14
120 to 121.9              21
122 to 123.9              30
124 to 125.9              40
126 to 127.9              45
128 to 129.9              43
130 to 131.9              42
132 to 133.9              30
134 to 135.9              11
136 to 137.9               5
138 to 139.9               4
Total                    300
12.3.3
The face sheet of patients’ records maintained in a local health department contains 10 entries.
A sample of 100 records revealed the following distribution of erroneous entries:
Number of Erroneous Entries Out of 10   Number of Records
0                                          8
1                                         25
2                                         32
3                                         24
4                                         10
5 or more                                  1
Total                                    100

Test the goodness-of-fit of these data to the binomial distribution with $p = .20$. Find the p value for
this test.
12.3.4
In a study conducted by Byers et al. (A-2), researchers tested a Poisson model for the distribution
of activities of daily living (ADL) scores after a 7-month prehabilitation program designed to
prevent functional decline among physically frail, community-living older persons. ADL meas-
ured the ability of individuals to perform essential tasks, including walking inside the house,
bathing, upper and lower body dressing, transferring from a chair, toileting, feeding, and
grooming. The scoring method used in this study assigned a value of 0 for no (personal) help
and no difficulty, 1 for difficulty but no help, and 2 for help regardless of difficulty. Scores were
summed to produce an overall score ranging from 0 to 16 (for eight tasks). There were 181 subjects
who completed the study. Suppose we use the authors’ scoring method to assess the status of
another group of 181 subjects relative to their activities of daily living. Let us assume that the
following results were obtained.
X            Observed Frequency   Expected Frequency
0                   74                 11.01
1                   27                 30.82
2                   14                 43.15
3                   14                 40.27
4                   11                 28.19
5                    7                 15.79
6                    5                  7.37
7                    4                  2.95
8                    3                  1.03
9                    2                  0.32
10                   3                  0.09
11                   4                  0.02
12 or more          13                  0.01
Source: Hypothetical data based on procedure reported by Amy L. Byers, Heather Allore,
Thomas M. Gill, and Peter N. Peduzzi, “Application of Negative Binomial Modeling for
Discrete Outcomes: A Case Study in Aging Research,” Journal of Clinical Epidemiology, 56
(2003), 559-564.
Test the null hypothesis that these data were drawn from a Poisson distribution with $\lambda = 2.8$. Let
$\alpha = .01$.
12.3.5
The following are the numbers of a particular organism found in 100 samples of water from
a pond:
Number of Organisms per Sample   Frequency
0                                   15
1                                   30
2                                   25
3                                   20
4                                    5
5                                    4
6                                    1
7                                    0
Total                              100
Test the null hypothesis that these data were drawn from a Poisson distribution. Determine the p value
for this test.

12.3.6
A research team conducted a survey in which the subjects were adult smokers. Each subject in a
sample of 200 was asked to indicate the extent to which he or she agreed with the statement: “I would
like to quit smoking.” The results were as follows:
Response             Number Responding
Strongly agree          102
Agree                    30
Disagree                 60
Strongly disagree         8
Can one conclude on the basis of these data that, in the sampled population, opinions are not equally
distributed over the four levels of agreement? Let the probability of committing a type I error be .05
and find the p value.
12.4
TESTS OF INDEPENDENCE
Another, and perhaps the most frequent, use of the chi-square distribution is to test the null
hypothesis that two criteria of classification, when applied to the same set of entities, are
independent. We say that two criteria of classification are independent if the distribution of
one criterion is the same no matter what the distribution of the other criterion. For example,
if socioeconomic status and area of residence of the inhabitants of a certain city are
independent, we would expect to find the same proportion of families in the low, medium,
and high socioeconomic groups in all areas of the city.
The Contingency Table The classification, according to two criteria, of a set of
entities, say, people, can be shown by a table in which the r rows represent the various
levels of one criterion of classification and the c columns represent the various levels of the
second criterion. Such a table is generally called a contingency table, with dimension $r \times c$.
The classification according to two criteria of a finite population of entities is shown in
Table 12.4.1.
We will be interested in testing the null hypothesis that in the population the two
criteria of classification are independent. If the hypothesis is rejected, we will conclude that
the two criteria of classification are not independent. A sample of size n will be drawn from
the population of entities, and the frequency of occurrence of entities in the sample
corresponding to the cells formed by the intersections of the rows and columns of Table
12.4.1 along with the marginal totals will be displayed in a table such as Table 12.4.2.

TABLE 12.4.1 Two-Way Classification of a Finite Population of Entities

Second Criterion of         First Criterion of Classification Level
Classification Level     1      2      3      ...    c      Total
1                        N11    N12    N13    ...    N1c    N1.
2                        N21    N22    N23    ...    N2c    N2.
3                        N31    N32    N33    ...    N3c    N3.
...                      ...    ...    ...    ...    ...    ...
r                        Nr1    Nr2    Nr3    ...    Nrc    Nr.
Total                    N.1    N.2    N.3    ...    N.c    N

TABLE 12.4.2 Two-Way Classification of a Sample of Entities

Second Criterion of         First Criterion of Classification Level
Classification Level     1      2      3      ...    c      Total
1                        n11    n12    n13    ...    n1c    n1.
2                        n21    n22    n23    ...    n2c    n2.
3                        n31    n32    n33    ...    n3c    n3.
...                      ...    ...    ...    ...    ...    ...
r                        nr1    nr2    nr3    ...    nrc    nr.
Total                    n.1    n.2    n.3    ...    n.c    n
Calculating the Expected Frequencies The expected frequency, under
the null hypothesis that the two criteria of classification are independent, is calculated for
each cell.
We learned in Chapter 3 (see Equation 3.4.4) that if two events are independent, the
probability of their joint occurrence is equal to the product of their individual probabilities.
Under the assumption of independence, for example, we compute the probability that one
of the n subjects represented in Table 12.4.2 will be counted in Row 1 and Column 1 of the
table (that is, in Cell 11) by multiplying the probability that the subject will be counted in
Row 1 by the probability that the subject will be counted in Column 1. In the notation of the
table, the desired calculation is
\[ \left( \frac{n_{1.}}{n} \right) \left( \frac{n_{.1}}{n} \right) \]
To obtain the expected frequency for Cell 11, we multiply this probability by the total
number of subjects, n. That is, the expected frequency for Cell 11 is given by
\[ \left( \frac{n_{1.}}{n} \right) \left( \frac{n_{.1}}{n} \right) (n) \]
Since the n in one of the denominators cancels into numerator n, this expression reduces to
\[ \frac{(n_{1.})(n_{.1})}{n} \]
In general, then, we see that to obtain the expected frequency for a given cell, we multiply
the total of the row in which the cell is located by the total of the column in which the cell is
located and divide the product by the grand total.
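The rule of multiplying the row total by the column total and dividing by the grand total can be sketched in a few lines of Python. The function name `expected_frequencies` and the small table of counts are hypothetical, introduced only to illustrate the arithmetic.

```python
def expected_frequencies(table):
    """Expected cell frequencies under independence:
    (row total) * (column total) / (grand total) for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Hypothetical 2 x 3 table of observed counts (not from the text)
counts = [[10, 20, 30],
          [20, 20, 20]]

expected = expected_frequencies(counts)
for row in expected:
    print(row)
```

Because both rows of this made-up table have the same total, the two rows of expected frequencies are identical, which is what independence implies when the row totals are equal.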

Observed Versus Expected Frequencies The expected frequencies and
observed frequencies are compared. If the discrepancy is sufficiently small, the null
hypothesis is tenable. If the discrepancy is sufficiently large, the null hypothesis is rejected,
and we conclude that the two criteria of classification are not independent. The decision as
to whether the discrepancy between observed and expected frequencies is sufficiently large
to cause rejection of H0 will be made on the basis of the size of the quantity computed when
we use Equation 12.2.4, where $O_i$ and $E_i$ refer, respectively, to the observed and expected
frequencies in the cells of Table 12.4.2. It would be more logical to designate the observed
and expected frequencies in these cells by $O_{ij}$ and $E_{ij}$, but to keep the notation simple and to
avoid the introduction of another formula, we have elected to use the simpler notation. It
will be helpful to think of the cells as being numbered from 1 to k, where 1 refers to Cell 11
and k refers to Cell rc. It can be shown that $X^2$ as defined in this manner is distributed
approximately as $\chi^2$ with $(r - 1)(c - 1)$ degrees of freedom when the null hypothesis is
true. If the computed value of $X^2$ is equal to or larger than the tabulated value of $\chi^2$ for some
$\alpha$, the null hypothesis is rejected at the $\alpha$ level of significance. The hypothesis testing
procedure is illustrated with the following example.
EXAMPLE 12.4.1
In 1992, the U.S. Public Health Service and the Centers for Disease Control and Prevention
recommended that all women of childbearing age consume 400 μg of folic acid daily to
reduce the risk of having a pregnancy that is affected by a neural tube defect such as spina
bifida or anencephaly. In a study by Stepanuk et al. (A-3), 693 pregnant women called a
teratology information service about their use of folic acid supplementation. The research-
ers wished to determine if preconceptional use of folic acid and race are independent. The
data appear in Table 12.4.3.
Solution:
1. Data. See Table 12.4.3.
2. Assumptions. We assume that the sample available for analysis is equiv-
alent to a simple random sample drawn from the population of interest.
TABLE 12.4.3 Race of Pregnant Caller and Use of Folic Acid

         Preconceptional Use of Folic Acid
Race      Yes     No    Total
White     260    299     559
Black      15     41      56
Other       7     14      21
Total     282    354     636
Source: Kathleen M. Stepanuk, Jorge E. Tolosa, Dawneete Lewis, Victoria
Meyers, Cynthia Royds, Juan Carlos Saogal, and Ron Librizzi, “Folic Acid
Supplementation Use Among Women Who Contact a Teratology Information
Service,” American Journal of Obstetrics and Gynecology, 187 (2002), 964-967.

3.
Hypotheses.
H0: Race and preconceptional use of folic acid are independent.
HA: The two variables are not independent.
Let $\alpha = .05$.
4.
Test statistic. The test statistic is
\[ X^2 = \sum_{i=1}^{k} \left[ \frac{(O_i - E_i)^2}{E_i} \right] \]
5.
Distribution of test statistic. When $H_0$ is true, $X^2$ is distributed
approximately as $\chi^2$ with $(r - 1)(c - 1) = (3 - 1)(2 - 1) = (2)(1) = 2$
degrees of freedom.
6.
Decision rule. Reject $H_0$ if the computed value of $X^2$ is equal to or
greater than 5.991.
7.
Calculation of test statistic. The expected frequency for the first cell is
$(559 \times 282)/636 = 247.86$. The other expected frequencies are calculated
in a similar manner. Observed and expected frequencies are
displayed in Table 12.4.4. From the observed and expected frequencies
we may compute
\[
\begin{aligned}
X^2 &= \sum \left[ \frac{(O_i - E_i)^2}{E_i} \right] \\
    &= \frac{(260 - 247.86)^2}{247.86} + \frac{(299 - 311.14)^2}{311.14} + \cdots + \frac{(14 - 11.69)^2}{11.69} \\
    &= .59461 + .47368 + \cdots + .45647 = 9.08960
\end{aligned}
\]
8.
Statistical decision. We reject $H_0$ since $9.08960 > 5.991$.
9.
Conclusion. We conclude that $H_0$ is false, and that there is a relationship
between race and preconceptional use of folic acid.
10.
p value. Since $7.378 < 9.08960 < 9.210$, $.01 < p < .025$.
TABLE 12.4.4 Observed and Expected Frequencies for Example 12.4.1

         Preconceptional Use of Folic Acid
Race     Yes             No              Total
White    260 (247.86)    299 (311.14)     559
Black     15 (24.83)      41 (31.17)       56
Other      7 (9.31)       14 (11.69)       21
Total    282             354              636
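The hand calculation of Example 12.4.1 can be reproduced with the following Python sketch. The function name `chi_square_independence` is hypothetical. The p value uses the exact upper-tail expression $e^{-x/2}$, which applies only because this table has 2 degrees of freedom. Small differences from the hand-computed 9.08960 are expected, since the sketch does not round the expected frequencies to two decimals.

```python
import math

def chi_square_independence(table):
    """Return (X2, df, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    x2 = sum((o - e) ** 2 / e
             for obs_row, exp_row in zip(table, expected)
             for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return x2, df, expected

# Table 12.4.3: rows are White, Black, Other; columns are Yes, No
folic = [[260, 299],
         [15, 41],
         [7, 14]]

x2, df, expected = chi_square_independence(folic)
p_value = math.exp(-x2 / 2)   # exact for df == 2 only

print(f"X2 = {x2:.4f}, df = {df}, p = {p_value:.4f}")
```

With unrounded expected frequencies the statistic is about 9.091, and the p value falls between .01 and .025 as stated in step 10.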

Computer Analysis The computer may be used to advantage in calculating $X^2$ for
tests of independence and tests of homogeneity. Figure 12.4.1 shows the procedure and
printout for Example 12.4.1 when the MINITAB program for computing $X^2$ from
contingency tables is used. The data were entered into MINITAB Columns 1 and 2,
corresponding to the columns of Table 12.4.3.
We may use SAS® to obtain an analysis and printout of contingency table data by
using the PROC FREQ statement. Figure 12.4.2 shows a partial SAS® printout reflecting
the analysis of the data of Example 12.4.1.
Data:
C1: 260 15 7
C2: 299 41 14

Dialog Box:                                    Session command:
Stat > Tables > Chi-square Test                MTB > CHISQUARE C1-C2
Type C1-C2 in Columns containing the table.
Click OK.

Output:
Chi-Square Test: C1, C2
Expected counts are printed below observed counts

            C1       C2    Total
1          260      299      559
        247.86   311.14
2           15       41       56
         24.83    31.17
3            7       14       21
          9.31    11.69
Total      282      354      636

Chi-Sq = 0.595 + 0.474 + 3.892 + 3.100 + 0.574 + 0.457 = 9.091
DF = 2, P-Value = 0.011
FIGURE 12.4.1
MINITAB procedure and output for chi-square analysis of data in Table 12.4.3.

The SAS System
The FREQ Procedure
Table of race by folic

race         folic
Frequency
Percent
Row Pct
Col Pct         No      Yes    Total
------------------------------------
Black           41       15       56
              6.45     2.36     8.81
             73.21    26.79
             11.58     5.32
------------------------------------
Other           14        7       21
              2.20     1.10     3.30
             66.67    33.33
              3.95     2.48
------------------------------------
White          299      260      559
             47.01    40.88    87.89
             53.49    46.51
             84.46    92.20
------------------------------------
Total          354      282      636
             55.66    44.34   100.00

Statistics for Table of race by folic

Statistic                      DF     Value     Prob
----------------------------------------------------
Chi-Square                      2    9.0913   0.0106
Likelihood Ratio Chi-Square     2    9.4808   0.0087
Mantel-Haenszel Chi-Square      1    8.9923   0.0027
Phi Coefficient                      0.1196
Contingency Coefficient              0.1187
Cramer's V                           0.1196

Sample Size = 636
FIGURE 12.4.2
Partial SAS® printout for the chi-square analysis of the data from
Example 12.4.1.

Note that the SAS® printout shows, in each cell, the percentage that cell frequency is
of its row total, its column total, and the grand total. Also shown, for each row and column
total, is the percentage that the total is of the grand total. In addition to the $X^2$ statistic,
SAS® gives the value of several other statistics that may be computed from contingency
table data. One of these, the Mantel-Haenszel chi-square statistic, will be discussed in a
later section of this chapter.
Small Expected Frequencies The problem of small expected frequencies
discussed in the previous section may be encountered when analyzing the data of
contingency tables. Although there is a lack of consensus on how to handle this problem,
many authors currently follow the rule given by Cochran (5). He suggests that for
contingency tables with more than 1 degree of freedom a minimum expectation of 1 is
allowable if no more than 20 percent of the cells have expected frequencies of less than 5.
To meet this rule, adjacent rows and/or adjacent columns may be combined when to
do so is logical in light of other considerations. If $X^2$ is based on fewer than 30 degrees of
freedom, expected frequencies as small as 2 can be tolerated. We did not experience the
problem of small expected frequencies in Example 12.4.1, since they were all greater
than 5.
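The first part of Cochran's rule quoted above (a minimum expectation of 1, with no more than 20 percent of cells below 5) can be written as a simple check. The function name `meets_cochran_rule` and the second table are hypothetical; the sketch covers only tables with more than 1 degree of freedom and does not encode the further allowance for fewer than 30 degrees of freedom.

```python
def meets_cochran_rule(expected):
    """True if every expected frequency is at least 1 and no more than
    20 percent of the cells have an expected frequency below 5."""
    cells = [e for row in expected for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    return min(cells) >= 1 and share_below_5 <= 0.20

# Expected frequencies from Table 12.4.4
expected_12_4_4 = [[247.86, 311.14],
                   [24.83, 31.17],
                   [9.31, 11.69]]

# Hypothetical table with one expected frequency below 1
hypothetical = [[0.8, 12.2],
                [6.0, 20.0],
                [7.0, 30.0]]

print(meets_cochran_rule(expected_12_4_4))
print(meets_cochran_rule(hypothetical))
```

A table that fails the check is a candidate for combining adjacent rows or columns, provided that doing so is logical for the data at hand.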
The 2 × 2 Contingency Table Sometimes each of two criteria of classification
may be broken down into only two categories, or levels. When data are
cross-classified in this manner, the result is a contingency table consisting of two rows and two
columns. Such a table is commonly referred to as a $2 \times 2$ table. The value of $X^2$ may be
computed by first calculating the expected cell frequencies in the manner discussed
above. In the case of a $2 \times 2$ contingency table, however, $X^2$ may be calculated by the
following shortcut formula:
\[ X^2 = \frac{n(ad - bc)^2}{(a + c)(b + d)(a + b)(c + d)} \tag{12.4.1} \]
where a, b, c, and d are the observed cell frequencies as shown in Table 12.4.5. When we
apply the $(r - 1)(c - 1)$ rule for finding degrees of freedom to a $2 \times 2$ table, the result is
1 degree of freedom. Let us illustrate this with an example.
TABLE 12.4.5 A 2 × 2 Contingency Table

Second Criterion        First Criterion of Classification
of Classification       1        2        Total
1                       a        b        a + b
2                       c        d        c + d
Total                   a + c    b + d    n

EXAMPLE 12.4.2
According to Silver and Aiello (A-4), falls are of major concern among polio survivors.
Researchers wanted to determine the impact of a fall on lifestyle changes. Table 12.4.6
shows the results of a study of 233 polio survivors on whether fear of falling resulted in
lifestyle changes.
Solution:
1.
Data. From the information given we may construct the $2 \times 2$ contingency
table displayed as Table 12.4.6.
2.
Assumptions. We assume that the sample is equivalent to a simple
random sample.
3.
Hypotheses.
H0: Fall status and lifestyle change because of fear of falling are
independent.
HA: The two variables are not independent.
Let $\alpha = .05$.
4.
Test statistic. The test statistic is
\[ X^2 = \sum_{i=1}^{k} \left[ \frac{(O_i - E_i)^2}{E_i} \right] \]
5.
Distribution of test statistic. When $H_0$ is true, $X^2$ is distributed
approximately as $\chi^2$ with $(r - 1)(c - 1) = (2 - 1)(2 - 1) = (1)(1) = 1$
degree of freedom.
6.
Decision rule. Reject $H_0$ if the computed value of $X^2$ is equal to or
greater than 3.841.
7.
Calculation of test statistic. By Equation 12.4.1 we compute
\[ X^2 = \frac{233[(131)(36) - (52)(14)]^2}{(145)(88)(183)(50)} = 31.7391 \]
8.
Statistical decision. We reject $H_0$ since $31.7391 > 3.841$.
TABLE 12.4.6 Contingency Table for the Data of Example 12.4.2

              Made Lifestyle Changes Because of Fear of Falling
              Yes     No    Total
Fallers       131     52     183
Nonfallers     14     36      50
Total         145     88     233
Source: J. K. Silver and D. D. Aiello, “Polio Survivors: Falls and Subsequent Injuries,”
American Journal of Physical Medicine and Rehabilitation, 81 (2002), 567-570.

9. Conclusion. We conclude that H0 is false, and that there is a relationship
between experiencing a fall and changing one’s lifestyle because of fear
of falling.
10. p value. Since $31.7391 > 7.879$, $p < .005$.
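Equation 12.4.1 applied to Table 12.4.6 can be sketched in Python as follows. The function name `chi_square_2x2` is hypothetical. The p value line uses the identity that, for 1 degree of freedom, the chi-square upper-tail probability equals erfc(sqrt(x/2)); this is a standard result that is assumed here rather than taken from the example.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Shortcut formula (Equation 12.4.1) for a 2 x 2 table laid out as
        a  b
        c  d
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))

# Table 12.4.6: fallers (131 yes, 52 no), nonfallers (14 yes, 36 no)
x2 = chi_square_2x2(131, 52, 14, 36)

# Upper-tail probability for 1 degree of freedom
p_value = math.erfc(math.sqrt(x2 / 2))

print(f"X2 = {x2:.4f}, p < .005: {p_value < 0.005}")
```

The shortcut gives the same value as summing the four cell contributions from observed and expected frequencies, which makes it a convenient arithmetic check.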
Small Expected Frequencies The problems of how to handle small expected
frequencies and small total sample sizes may arise in the analysis of $2 \times 2$ contingency
tables. Cochran (5) suggests that the $\chi^2$ test should not be used if $n < 20$ or if $20 < n < 40$
and any expected frequency is less than 5. When $n \geq 40$, an expected cell frequency as
small as 1 can be tolerated.
Yates's Correction The observed frequencies in a contingency table are discrete
and thereby give rise to a discrete statistic, $X^2$, which is approximated by the $\chi^2$
distribution, which is continuous. Yates (6) in 1934 proposed a procedure for correcting
for this in the case of $2 \times 2$ tables. The correction, as shown in Equation 12.4.2, consists of
subtracting half the total number of observations from the absolute value of the quantity
$ad - bc$ before squaring. That is,
\[ X^2_{\text{corrected}} = \frac{n(|ad - bc| - .5n)^2}{(a + c)(b + d)(a + b)(c + d)} \tag{12.4.2} \]
It is generally agreed that no correction is necessary for larger contingency tables.
Although Yates's correction for $2 \times 2$ tables has been used extensively in the past,
more recent investigators have questioned its use. As a result, some practitioners
recommend against its use.
We may, as a matter of interest, apply the correction to our current example. Using
Equation 12.4.2 and the data from Table 12.4.6, we may compute
\[ X^2_{\text{corrected}} = \frac{233[|(131)(36) - (52)(14)| - .5(233)]^2}{(145)(88)(183)(50)} = 29.9118 \]
As might be expected, with a sample this large, the difference in the two results is not
dramatic.
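A short Python sketch makes the comparison between the corrected and uncorrected statistics concrete. The function names are hypothetical, and the sketch implements Equations 12.4.1 and 12.4.2 exactly as written, without any additional safeguards.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected statistic, Equation 12.4.1."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))

def chi_square_2x2_yates(a, b, c, d):
    """Yates-corrected statistic, Equation 12.4.2: subtract n/2 from
    |ad - bc| before squaring."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - 0.5 * n) ** 2
    return numerator / ((a + c) * (b + d) * (a + b) * (c + d))

# Table 12.4.6
uncorrected = chi_square_2x2(131, 52, 14, 36)
corrected = chi_square_2x2_yates(131, 52, 14, 36)

print(f"uncorrected = {uncorrected:.4f}, corrected = {corrected:.4f}")
```

Both values far exceed the critical value of 3.841, so the statistical decision in Example 12.4.2 is the same with or without the correction.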
Tests of Independence: Characteristics The characteristics of a chi-
square test of independence that distinguish it from other chi-square tests are as follows:
1. A single sample is selected from a population of interest, and the subjects or objects
are cross-classified on the basis of the two variables of interest.
2. The rationale for calculating expected cell frequencies is based on the probability
law, which states that if two events (here the two criteria of classification) are
independent, the probability of their joint occurrence is equal to the product of their
individual probabilities.
3. The hypotheses and conclusions are stated in terms of the independence (or lack of
independence) of two variables.

EXERCISES
In the exercises that follow perform the test at the indicated level of significance and determine the p
value.
12.4.1
In the study by Silver and Aiello (A-4) cited in Example 12.4.2, a secondary objective was to
determine if the frequency of falls was independent of wheelchair use. The following table gives the
data for falls and wheelchair use among the subjects of the study.
              Wheelchair Use
              Yes     No
Fallers        62    121
Nonfallers     18     32
Source: J. K. Silver and D. D. Aiello, “Polio Survivors: Falls and
Subsequent Injuries,” American Journal of Physical Medicine and
Rehabilitation, 81 (2002), 567-570.
Do these data provide sufficient evidence to warrant the conclusion that wheelchair use and falling are
related? Let $\alpha = .05$.
12.4.2
Sternal surgical site infection (SSI) after coronary artery bypass graft surgery is a complication that
increases patient morbidity and costs for patients, payers, and the health care system. Segal and
Anderson (A-5) performed a study that examined two types of preoperative skin preparation before
performing open heart surgery. These two preparations used aqueous iodine and insoluble iodine with
the following results.
Comparison of Aqueous and Insoluble Preps

Prep Group          Infected   Not Infected
Aqueous iodine         14          94
Insoluble iodine        4          97
Source: Cynthia G. Segal and Jacqueline J. Anderson, “Preoperative Skin
Preparation of Cardiac Patients,” AORN Journal, 76 (2002), 821-827.
Do these data provide sufficient evidence at the $\alpha = .05$ level to justify the conclusion that the type of
skin preparation and infection are related?
12.4.3
The side effects of nonsteroidal antiinflammatory drugs (NSAIDs) include problems involving peptic
ulceration, renal function, and liver disease. In 1996, the American College of Rheumatology issued
and disseminated guidelines recommending baseline tests (CBC, hepatic panel, and renal tests) when
prescribing NSAIDs. A study was conducted by Rothenberg and Holcomb (A-6) to determine if
physicians taking part in a national database of computerized medical records performed the
recommended baseline tests when prescribing NSAIDs. The researchers classified physicians in
the study into four categories—those practicing in internal medicine, family practice, academic
family practice, and multispecialty groups. The data appear in the following table.

                            Performed Baseline Tests
Practice Type                Yes      No
Internal medicine            294     921
Family practice               98    2862
Academic family practice      50    3064
Multispecialty groups        203    2652
Source: Ralph Rothenberg and John P. Holcomb, “Guidelines for Monitoring of NSAIDs: Who
Listened?,” Journal of Clinical Rheumatology, 6 (2000), 258-265.
Do the data above provide sufficient evidence for us to conclude that type of practice and
performance of baseline tests are related? Use $\alpha = .01$.
12.4.4
Boles and Johnson (A-7) examined the beliefs held by adolescents regarding smoking and weight.
Respondents characterized their weight into three categories: underweight, overweight, or appropri-
ate. Smoking status was categorized according to the answer to the question, “Do you currently
smoke, meaning one or more cigarettes per day?” The following table shows the results of a telephone
study of adolescents in the age group 12-17.
               Smoking
               Yes     No
Underweight     17     97
Overweight      25    142
Appropriate     96    816
Source: Sharon M. Boles and Patrick B. Johnson, “Gender, Weight Concerns, and Adolescent
Smoking,” Journal of Addictive Diseases, 20 (2001), 5-14.
Do the data provide sufficient evidence to suggest that weight perception and smoking status are
related in adolescents? Let $\alpha = .05$.
12.4.5
A sample of 500 college students participated in a study designed to evaluate the level of college
students’ knowledge of a certain group of common diseases. The following table shows the students
classified by major field of study and level of knowledge of the group of diseases:
              Knowledge of Diseases
Major          Good    Poor    Total
Premedical      31      91      122
Other           19     359      378
Total           50     450      500
Do these data suggest that there is a relationship between knowledge of the group of diseases
and major field of study of the college students from which the present sample was drawn?
Let $\alpha = .05$.
12.4.6
The following table shows the results of a survey in which the subjects were a sample of 300 adults
residing in a certain metropolitan area. Each subject was asked to indicate which of three policies they
favored with respect to smoking in public places.

                                            Policy Favored
                         No Restrictions   Smoking Allowed in       No Smoking   No
Highest Education Level  on Smoking        Designated Areas Only    at All       Opinion   Total
College graduate               5                  44                   23           3        75
High-school graduate          15                 100                   30           5       150
Grade-school graduate         15                  40                   10          10        75
Total                         35                 184                   63          18       300
Can one conclude from these data that, in the sampled population, there is a relationship between
level of education and attitude toward smoking in public places? Let $\alpha = .05$.
12.5
TESTS OF HOMOGENEITY
A characteristic of the examples and exercises presented in the last section is that, in each
case, the total sample was assumed to have been drawn before the entities were classified
according to the two criteria of classification. That is, the observed number of entities falling
into each cell was determined after the sample was drawn. As a result, the row and column
totals are chance quantities not under the control of the investigator. We think of the sample
drawn under these conditions as a single sample drawn from a single population. On
occasion, however, either row or column totals may be under the control of the investigator;
that is, the investigator may specify that independent samples be drawn from each of several
populations. In this case, one set of marginal totals is said to be fixed, while the other set,
corresponding to the criterion of classification applied to the samples, is random. The former
procedure, as we have seen, leads to a chi-square test of independence. The latter situation
leads to a chi-square test of homogeneity. The two situations not only involve different
sampling procedures; they lead to different questions and null hypotheses. The test of
independence is concerned with the question: Are the two criteria of classification indepen-
dent? The homogeneity test is concerned with the question: Are the samples drawn from
populations that are homogeneous with respect to some criterion of classification? In the
latter case the null hypothesis states that the samples are drawn from the same population.
Despite these differences in concept and sampling procedure, the two tests are mathemati-
cally identical, as we see when we consider the following example.
Calculating Expected Frequencies Either the row categories or the col-
umn categories may represent the different populations from which the samples are drawn.
If, for example, three populations are sampled, they may be designated as populations 1, 2,
and 3, in which case these labels may serve as either row or column headings. If the variable
of interest has three categories, say, A, B, and C, these labels may serve as headings for rows
or columns, whichever is not used for the populations. If we use notation similar to that
adopted for Table 12.4.2, the contingency table for this situation, with columns used to
represent the populations, is shown as Table 12.5.1.

TABLE 12.5.1 A Contingency Table for Data for a Chi-Square Test of Homogeneity

                          Population
Variable Category     1      2      3      Total
A                     nA1    nA2    nA3    nA.
B                     nB1    nB2    nB3    nB.
C                     nC1    nC2    nC3    nC.
Total                 n.1    n.2    n.3    n

Before computing our test statistic we
need expected frequencies for each of the cells in Table 12.5.1. If the populations are indeed
homogeneous, or, equivalently, if the samples are all drawn from the same population, with
respect to the categories A, B, and C, our best estimate of the proportion in the combined
population who belong to category A is $n_{A.}/n$. By the same token, if the three populations
are homogeneous, we interpret this probability as applying to each of the populations
individually. For example, under the null hypothesis, $n_{A.}/n$ is our best estimate of the
probability that a subject picked at random from the combined population will belong to
category A. We would expect, then, to find $n_{.1}(n_{A.}/n)$ of those in the sample from population
1 to belong to category A, $n_{.2}(n_{A.}/n)$ of those in the sample from population 2 to belong to
category A, and $n_{.3}(n_{A.}/n)$ of those in the sample from population 3 to belong to category A.
These calculations yield the expected frequencies for the first row of Table 12.5.1. Similar
reasoning and calculations yield the expected frequencies for the other two rows.
We see again that the shortcut procedure of multiplying appropriate marginal totals
and dividing by the grand total yields the expected frequencies for the cells.
From the data in Table 12.5.1 we compute the following test statistic:
\[ X^2 = \sum_{i=1}^{k} \left[ \frac{(O_i - E_i)^2}{E_i} \right] \]
EXAMPLE 12.5.1
Narcolepsy is a disease involving disturbances of the sleep-wake cycle. Members of the
German Migraine and Headache Society (A-8) studied the relationship between migraine
headaches in 96 subjects diagnosed with narcolepsy and 96 healthy controls. The results
are shown in Table 12.5.2. We wish to know if we may conclude, on the basis of these data,
TABLE 12.5.2
Frequency of Migraine Headaches by Narcolepsy Status
Reported Migraine Headaches
Yes
No
Total
Narcoleptic subjects
21
75
96
Healthy controls
19
77
96
Total
40
152
192
Source: The DMG Study Group, “Migraine and Idiopathic Narcolepsy—A Case-Control Study,”
Cephalalgia, 23 (2003), 786-789.

that the narcolepsy population and healthy populations represented by the samples are not
homogeneous with respect to migraine frequency.
Solution:
1. Data. See Table 12.5.2.
2. Assumptions. We assume that we have a simple random sample from
each of the two populations of interest.
3. Hypotheses.
H0: The two populations are homogeneous with respect to migraine
frequency.
HA: The two populations are not homogeneous with respect to migraine
frequency.
Let α = .05.
4. Test statistic. The test statistic is
$$X^2 = \sum \left[ \frac{(O_i - E_i)^2}{E_i} \right]$$
5. Distribution of test statistic. If H0 is true, X² is distributed
approximately as χ² with (2 − 1)(2 − 1) = (1)(1) = 1 degree of freedom.
6. Decision rule. Reject H0 if the computed value of X² is equal to or
greater than 3.841.
7. Calculation of test statistic. The MINITAB output is shown in Figure
12.5.1.
Chi-Square Test
Expected counts are printed below observed counts
Rows: Narcolepsy    Columns: Migraine
          No        Yes       All
No        77        19        96
          76.00     20.00     96.00
Yes       75        21        96
          76.00     20.00     96.00
All       152       40        192
          152.00    40.00     192.00
Chi-Square = 0.126, DF = 1, P-Value = 0.722
FIGURE 12.5.1
MINITAB output for Example 12.5.1.

8. Statistical decision. Since .126 is less than the critical value of 3.841,
we are unable to reject the null hypothesis.
9. Conclusion. We conclude that the two populations may be homogeneous
with respect to migraine frequency.
10. p value. From the MINITAB output we see that p = .722.
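The figures reported for Example 12.5.1 can be checked with a short Python sketch. The function names are hypothetical, only the standard library is used, and the p value routine relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so it is valid for 1 degree of freedom only.

```python
import math

def chi_square_statistic(observed):
    """Sum of (O - E)^2 / E over all cells, with E from the marginal totals."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            x2 += (o - e) ** 2 / e
    return x2

def p_value_df1(x2):
    """Upper-tail probability of chi-square with 1 degree of freedom."""
    return math.erfc(math.sqrt(x2 / 2))

observed = [[21, 75], [19, 77]]      # Table 12.5.2
x2 = chi_square_statistic(observed)
p = p_value_df1(x2)
print(round(x2, 3), round(p, 3))     # agrees with the 0.126 and 0.722 of Figure 12.5.1
```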
Small Expected Frequencies The rules for small expected frequencies given
in the previous section are applicable when carrying out a test of homogeneity.
In summary, the chi-square test of homogeneity has the following characteristics:
1. Two or more populations are identified in advance, and an independent sample is
drawn from each.
2. Sample subjects or objects are placed in appropriate categories of the variable of
interest.
3. The calculation of expected cell frequencies is based on the rationale that if the
populations are homogeneous as stated in the null hypothesis, the best estimate of the
probability that a subject or object will fall into a particular category of the variable of
interest can be obtained by pooling the sample data.
4. The hypotheses and conclusions are stated in terms of homogeneity (with respect to
the variable of interest) of populations.
Test of Homogeneity and H0: p1 = p2 The chi-square test of homogeneity
for the two-sample case provides an alternative method for testing the null hypothesis that
two population proportions are equal. In Section 7.6, it will be recalled, we learned to test
H0: p1 = p2 against HA: p1 ≠ p2 by means of the statistic
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)_0}{\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n_1} + \dfrac{\bar{p}(1 - \bar{p})}{n_2}}}$$
where p̄ is obtained by pooling the data of the two independent samples available for
analysis.
Suppose, for example, that in a test of H0: p1 = p2 against HA: p1 ≠ p2, the sample
data were as follows: n1 = 100, p̂1 = .60, n2 = 120, p̂2 = .40. When we pool the sample
data we have
$$\bar{p} = \frac{.60(100) + .40(120)}{100 + 120} = \frac{108}{220} = .4909$$
and
$$z = \frac{.60 - .40}{\sqrt{\dfrac{(.4909)(.5091)}{100} + \dfrac{(.4909)(.5091)}{120}}} = 2.95469$$
which is significant at the .05 level since it is greater than the critical value of 1.96.

If we wish to test the same hypothesis using the chi-square approach, our contingency
table will be
                Characteristic Present
Sample          Yes     No      Total
1               60      40      100
2               48      72      120
Total           108     112     220
By Equation 12.4.1 we compute
$$X^2 = \frac{220[(60)(72) - (40)(48)]^2}{(108)(112)(100)(120)} = 8.7302$$
which is significant at the .05 level because it is greater than the critical value of 3.841. We
see, therefore, that we reach the same conclusion by both methods. This is not surprising
because, as explained in Section 12.2, χ²(1) = z². We note that 8.7302 = (2.95469)² and
that 3.841 = (1.96)².
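The agreement between the two approaches can be verified numerically. The following Python sketch recomputes both statistics from the sample data given above; the variable names are illustrative, and the pooled proportion is carried at full precision rather than rounded to .4909, so z may differ from the printed value in the last decimal place.

```python
import math

# Sample data from the text: n1 = 100 with proportion .60, n2 = 120 with proportion .40
n1, p1 = 100, 0.60
n2, p2 = 120, 0.40

# Pooled proportion and the z statistic of Section 7.6
p_bar = (p1 * n1 + p2 * n2) / (n1 + n2)
z = (p1 - p2) / math.sqrt(p_bar * (1 - p_bar) / n1 + p_bar * (1 - p_bar) / n2)

# The same data as a 2 x 2 table, with the shortcut formula (Equation 12.4.1)
a, b, c, d = 60, 40, 48, 72
n = a + b + c + d
chi2 = n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))

print(round(z, 2), round(z ** 2, 4), round(chi2, 4))
```

Apart from floating-point error, z squared and the chi-square statistic are the same number, which is why the two tests always lead to the same two-sided decision.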
EXERCISES
In the exercises that follow perform the test at the indicated level of significance and determine the p
value.
12.5.1
Refer to the study by Carter et al. (A-9), who investigated the effect of age at onset of bipolar disorder
on the course of the illness. One of the variables studied was subjects’ family history. Table 3.4.1
shows the frequency of a family history of mood disorders in the two groups of interest: early age at
onset (18 years or younger) and later age at onset (later than 18 years).
Family History of Mood Disorders    Early ≤ 18 (E)    Later > 18 (L)    Total
Negative (A)                        28                35                63
Bipolar disorder (B)                19                38                57
Unipolar (C)                        41                44                85
Unipolar and bipolar (D)            53                60                113
Total                               141               177               318
Source: Tasha D. Carter, Emanuela Mundo, Sagar V. Parkh, and James L. Kennedy,
“Early Age at Onset as a Risk Factor for Poor Outcome of Bipolar Disorder,” Journal of
Psychiatric Research, 37 (2003), 297-303.
Can we conclude on the basis of these data that subjects 18 or younger differ from subjects older than
18 with respect to family histories of mood disorders? Let α = .05.

12.5.2
Coughlin et al. (A-10) examined breast and cervical screening practices of Hispanic and non-
Hispanic women in counties that approximate the U.S. southern border region. The study used data
from the Behavioral Risk Factor Surveillance System surveys of adults ages 18 years or older
conducted in 1999 and 2000. The following table shows the number of observations of Hispanic
and non-Hispanic women who had received a mammogram in the past 2 years cross-classified by
marital status.
Marital Status                  Hispanic    Non-Hispanic    Total
Currently married               319         738             1057
Divorced or separated           130         329             459
Widowed                         88          402             490
Never married or living as
an unmarried couple             41          95              136
Total                           578         1564            2142
Source: Steven S. Coughlin, Robert J. Uhler, Thomas Richards, and Katherine
M. Wilson, “Breast and Cervical Cancer Screening Practices Among Hispanic
and Non-Hispanic Women Residing Near the United States-Mexico Border,
1999-2000,” Family and Community Health, 26, (2003), 130-139.
We wish to know if we may conclude on the basis of these data that marital status and ethnicity
(Hispanic and non-Hispanic) in border counties of the southern United States are not homogeneous.
Let α = .05.
12.5.3
Swor et al. (A-11) examined the effectiveness of cardiopulmonary resuscitation (CPR) training in
people over 55 years of age. They compared the skill retention rates of subjects in this age group who
completed a course in traditional CPR instruction with those who received chest-compression-only
cardiopulmonary resuscitation (CC-CPR). Independent groups were tested 3 months after training.
Among the 27 subjects receiving traditional CPR, 12 were rated as competent. In the CC-CPR group,
15 out of 29 were rated competent. Do these data provide sufficient evidence for us to conclude that
the two populations are not homogeneous with respect to competency rating 3 months after training?
Let α = .05.
12.5.4
In an air pollution study, a random sample of 200 households was selected from each of two
communities. A respondent in each household was asked whether or not anyone in the household was
bothered by air pollution. The responses were as follows:
                Any Member of Household
                Bothered by Air Pollution?
Community       Yes     No      Total
I               43      157     200
II              81      119     200
Total           124     276     400
Can the researchers conclude that the two communities differ with respect to the variable of interest?
Let α = .05.

12.5.5
In a simple random sample of 250 industrial workers with cancer, researchers found that 102 had
worked at jobs classified as “high exposure” with respect to suspected cancer-causing agents. Of the
remainder, 84 had worked at “moderate exposure” jobs, and 64 had experienced no known exposure
because of their jobs. In an independent simple random sample of 250 industrial workers from
the same area who had no history of cancer, 31 worked in “high exposure” jobs, 60 worked in
“moderate exposure” jobs, and 159 worked in jobs involving no known exposure to suspected cancer-
causing agents. Does it appear from these data that persons working in jobs that expose them to
suspected cancer-causing agents have an increased risk of contracting cancer? Let α = .05.
12.6
THE FISHER EXACT TEST
Sometimes we have data that can be summarized in a 2 × 2 contingency table, but these
data are derived from very small samples. The chi-square test is not an appropriate method
of analysis if minimum expected frequency requirements are not met. If, for example, n is
less than 20 or if n is between 20 and 40 and one of the expected frequencies is less than 5,
the chi-square test should be avoided.
A test that may be used when the size requirements of the chi-square test are not met
was proposed in the mid-1930s almost simultaneously by Fisher (7,8), Irwin (9), and Yates
(10). The test has come to be known as the Fisher exact test. It is called exact because, if
desired, it permits us to calculate the exact probability of obtaining the observed results or
results that are more extreme.
Data Arrangement When we use the Fisher exact test, we arrange the data in the
form of a 2 × 2 contingency table like Table 12.6.1. We arrange the frequencies in such a
way that A > B and choose the characteristic of interest so that a/A > b/B.
Some theorists believe that Fisher’s exact test is appropriate only when both marginal
totals of Table 12.6.1 are fixed by the experiment. This specific model does not appear to
arise very frequently in practice. Many experimenters, therefore, use the test when both
marginal totals are not fixed.
Assumptions The following are the assumptions for the Fisher exact test.
1. The data consist of A sample observations from population 1 and B sample
observations from population 2.
2. The samples are random and independent.
3. Each observation can be categorized as one of two mutually exclusive types.
TABLE 12.6.1
A 2 × 2 Contingency Table for the Fisher Exact Test
            With              Without
Sample      Characteristic    Characteristic    Total
1           a                 A − a             A
2           b                 B − b             B
Total       a + b             A + B − a − b     A + B

Hypotheses The following are the null hypotheses that may be tested and their
alternatives.
1. (Two-sided)
H0: The proportion with the characteristic of interest is the same in both populations;
that is, p1 = p2.
HA: The proportion with the characteristic of interest is not the same in both
populations; p1 ≠ p2.
2. (One-sided)
H0: The proportion with the characteristic of interest in population 1 is less than or
the same as the proportion in population 2; p1 ≤ p2.
HA: The proportion with the characteristic of interest is greater in population 1 than
in population 2; p1 > p2.
Test Statistic The test statistic is b, the number in sample 2 with the characteristic
of interest.
Decision Rule Finney (11) has prepared critical values of b for A ≤ 15. Latscha
(12) has extended Finney’s tables to accommodate values of A up to 20. Appendix Table J
gives these critical values of b for A between 3 and 20, inclusive. Significance levels of .05,
.025, .01, and .005 are included. The specific decision rules are as follows:
1. Two-sided test. Enter Table J with A, B, and a. If the observed value of b is equal to
or less than the integer in a given column, reject H0 at a level of significance equal to
twice the significance level shown at the top of that column. For example, suppose
A = 8, B = 7, a = 7, and the observed value of b is 1. We can reject the null
hypothesis at the 2(.05) = .10, the 2(.025) = .05, and the 2(.01) = .02 levels of
significance, but not at the 2(.005) = .01 level.
2. One-sided test. Enter Table J with A, B, and a. If the observed value of b is less than
or equal to the integer in a given column, reject H0 at the level of significance shown
at the top of that column. For example, suppose that A = 16, B = 8, a = 4, and the
observed value of b is 3. We can reject the null hypothesis at the .05 and .025 levels of
significance, but not at the .01 or .005 levels.
Large-Sample Approximation For sufficiently large samples we can test the
null hypothesis of the equality of two population proportions by using the normal
approximation. Compute
$$z = \frac{(a/A) - (b/B)}{\sqrt{\hat{p}(1 - \hat{p})(1/A + 1/B)}} \qquad (12.6.1)$$
where
$$\hat{p} = (a + b)/(A + B) \qquad (12.6.2)$$
and compare it for significance with appropriate critical values of the standard normal
distribution. The use of the normal approximation is generally considered satisfactory if a,

b, A − a, and B − b are all greater than or equal to 5. Alternatively, when sample sizes are
sufficiently large, we may test the null hypothesis by means of the chi-square test.
Further Reading The Fisher exact test has been the subject of some controversy
among statisticians. Some feel that the assumption of fixed marginal totals is unrealistic in
most practical applications. The controversy then centers around whether the test is
appropriate when both marginal totals are not fixed. For further discussion of this and other
points, see the articles by Barnard (13-15), Fisher (16), and Pearson (17).
Sweetland (18) compared the results of using the chi-square test with those obtained
using the Fisher exact test for samples of size A + B = 3 to A + B = 69. He found close
agreement when A and B were close in size and the test was one-sided.
Carr (19) presents an extension of the Fisher exact test to more than two samples of
equal size and gives an example to demonstrate the calculations. Neave (20) presents the
Fisher exact test in a new format; the test is treated as one of independence rather than of
homogeneity. He has prepared extensive tables for use with his approach.
The sensitivity of Fisher’s exact test to minor perturbations in 2 × 2 contingency
tables is discussed by Dupont (21).
EXAMPLE 12.6.1
The purpose of a study by Justesen et al. (A-12) was to evaluate the long-term efficacy of
taking indinavir/ritonavir twice a day in combination with two nucleoside reverse
transcriptase inhibitors among HIV-positive subjects who were divided into two groups.
Group 1 consisted of patients who had no history of taking protease inhibitors (PI Naïve).
Group 2 consisted of patients who had a previous history of taking a protease inhibitor (PI
Experienced). Table 12.6.2 shows whether these subjects remained on the regimen for the
120 weeks of follow-up. We wish to know if we may conclude that patients classified as
group 1 have a lower probability than subjects in group 2 of remaining on the regimen for
120 weeks.
TABLE 12.6.2
Regimen Status at 120 Weeks for
PI Naïve and PI Experienced Subjects Taking
Indinavir/Ritonavir as Described in Example 12.6.1
                                Remained in the Regimen
                                for 120 Weeks
                      Total     Yes     No
1 (PI Naïve)          9         2       7
2 (PI Experienced)    12        8       4
Total                 21        10      11
Source: U.S. Justesen, A. M. Lervfing, A. Thomsen, J. A. Lindberg,
C. Pedersen, and P. Tauris, “Low-Dose Indinavir in Combination with
Low-Dose Ritonavir: Steady-State Pharmacokinetics and Long-Term
Clinical Outcome Follow-Up,” HIV Medicine, 4 (2003), 250-254.

TABLE 12.6.3
Data of Table 12.6.2 Rearranged to Conform to the
Layout of Table 12.6.1
                       Remained in Regimen for 120 Weeks
                       Yes           No                    Total
2 (PI Experienced)     8 = a         4 = A − a             12 = A
1 (PI Naïve)           2 = b         7 = B − b             9 = B
Total                  10 = a + b    11 = A + B − a − b    21 = A + B
Solution:
1. Data. The data as reported are shown in Table 12.6.2. Table 12.6.3
shows the data rearranged to conform to the layout of Table 12.6.1.
Remaining on the regimen is the characteristic of interest.
2. Assumptions. We presume that the assumptions for application of the
Fisher exact test are met.
3. Hypotheses.
H0: The proportion of subjects remaining 120 weeks on the regimen in a
population of patients classified as group 2 is the same as or less
than the proportion of subjects remaining on the regimen 120 weeks
in a population classified as group 1.
HA: Group 2 patients have a higher rate than group 1 patients of
remaining on the regimen for 120 weeks.
4. Test statistic. The test statistic is the observed value of b as shown in
Table 12.6.3.
5. Distribution of test statistic. We determine the significance of b by
consulting Appendix Table J.
6. Decision rule. Suppose we let α = .05. The decision rule, then, is to
reject H0 if the observed value of b is equal to or less than 1, the value of
b in Table J for A = 12, B = 9, a = 8, and α = .05.
7. Calculation of test statistic. The observed value of b, as shown in
Table 12.6.3, is 2.
8. Statistical decision. Since 2 > 1, we fail to reject H0.
9. Conclusion. Since we fail to reject H0, we conclude that the null
hypothesis may be true. That is, it may be true that the rate of remaining
on the regimen for 120 weeks is the same or less for the PI experienced
group compared to the PI naïve group.
10. p value. We see in Table J that when A = 12, B = 9, a = 8, the value of
b = 2 has an exact probability of occurring by chance alone, when H0 is
true, greater than .05.

PI * Remained Cross-Tabulation
Count
                      Remained
                      Yes     No      Total
PI    Experienced     8       4       12
      Naive           2       7       9
Total                 10      11      21
Chi-Square Tests
                                               Asymp. Sig.    Exact Sig.    Exact Sig.
                              Value      df    (2-sided)      (2-sided)     (1-sided)
Pearson Chi-Square            4.073 (b)  1     .044
Continuity Correction (a)     2.486      1     .115
Likelihood Ratio              4.253      1     .039
Fisher’s Exact Test                                           .080          .056
Linear-by-Linear Association  3.879      1     .049
N of Valid Cases              21
a. Computed only for a 2 × 2 table
b. 2 cells (50.0%) have expected count less than 5. The minimum expected count is 4.29.
FIGURE 12.6.1
SPSS output for Example 12.6.1.
Various statistical software programs perform the calculations for the Fisher exact
test. Figure 12.6.1 shows the results of Example 12.6.1 as computed by SPSS. The exact p
value is provided for both a one-sided and a two-sided test. Based on these results, we fail to
reject H0 (p value >.05), just as we did using the statistical tables in the Appendix. Note
that in addition to the Fisher exact test several alternative tests are provided. The reader
should be aware that these alternative tests are not appropriate if the assumptions
underlying them have been violated.
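For readers without access to SPSS, the exact probabilities can be obtained directly from the hypergeometric distribution. The Python sketch below is a minimal illustration under stated assumptions: the function name fisher_exact_2x2 is hypothetical, the arguments follow the layout of Table 12.6.1, and the two-sided p value is taken as the sum of the probabilities of all tables no more probable than the observed one, which is one common convention among several.

```python
from math import comb

def fisher_exact_2x2(a, A, b, B):
    """Exact one-sided and two-sided p values for the layout of Table 12.6.1."""
    n_with = a + b                    # total number with the characteristic
    total = comb(A + B, n_with)       # equally likely allocations under H0

    def prob(x):
        # Probability that sample 1 contains x subjects with the characteristic
        return comb(A, x) * comb(B, n_with - x) / total

    lowest = max(0, n_with - B)       # smallest feasible count in sample 1
    highest = min(A, n_with)          # largest feasible count in sample 1
    p_obs = prob(a)
    # One-sided: observed table or tables more extreme in the direction a/A > b/B
    one_sided = sum(prob(x) for x in range(a, highest + 1))
    # Two-sided: all tables at most as probable as the observed table
    two_sided = sum(prob(x) for x in range(lowest, highest + 1)
                    if prob(x) <= p_obs * (1 + 1e-7))
    return one_sided, two_sided

# Table 12.6.3: a = 8, A = 12, b = 2, B = 9
one_sided, two_sided = fisher_exact_2x2(a=8, A=12, b=2, B=9)
print(round(one_sided, 3), round(two_sided, 3))   # compare with .056 and .080 in Figure 12.6.1
```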
EXERCISES
12.6.1
The goal of a study by Tahmassebi and Curzon (A-13) was to determine if drooling in children
with cerebral palsy is due to hypersalivation. One of the procedures toward that end was to examine
the salivary buffering capacity of cerebral palsied children and controls. The following table gives
the results.

                    Buffering Capacity
Group               Medium    High
Cerebral palsy      2         8
Control             3         7
Source: J. F. Tahmassebi and M. E. J. Curzon, “The Cause of Drooling in
Children with Cerebral Palsy—Hypersalivation or Swallowing Defect?”
International Journal of Paediatric Dentistry, 13 (2003), 106-111.
Test for a significant difference between cerebral palsied children and controls with respect to high or
low buffering capacity. Let α = .05 and find the p value.
12.6.2
In a study by Xiao and Shi (A-14), researchers studied the effect of cranberry juice in the treatment
and prevention of Helicobacter pylori infection in mice. The eradication of Helicobacter pylori
results in the healing of peptic ulcers. Researchers compared treatment with cranberry juice to “triple
therapy” (amoxicillin, bismuth subcitrate, and metronidazole) in mice infected with Helicobacter
pylori. After 4 weeks, they examined the mice to determine the frequency of eradication of the
bacterium in the two treatment groups. The following table shows the results.
                    No. of Mice with Helicobacter pylori Eradicated
                    Yes     No
Triple therapy      8       2
Cranberry juice     2       8
Source: Shu Dong Xiao and Tong Shi, “Is Cranberry Juice Effective in the Treatment and
Prevention of Helicobacter Pylori Infection of Mice,” Chinese Journal of Digestive Diseases,
4 (2003), 136-139.
May we conclude, on the basis of these data, that triple therapy is more effective than cranberry juice
at eradication of the bacterium? Let α = .05 and find the p value.
12.6.3
In a study by Shaked et al. (A-15), researchers studied 26 children with blunt pancreatic injuries.
These injuries occurred from a direct blow to the abdomen, bicycle handlebars, fall from height, or
car accident. Nineteen of the patients were classified as having minor injuries, and seven were
classified as having major injuries. Pseudocyst formation was suspected when signs of clinical
deterioration developed, such as increased abdominal pain, epigastric fullness, fever, and increased
pancreatic enzyme levels. In the major injury group, six of the seven children developed pseudocysts
while in the minor injury group, three of the 19 children developed pseudocysts. Is this sufficient
evidence to allow us to conclude that the proportion of children developing pseudocysts is higher in
the major injury group than in the minor injury group? Let α = .01.
12.7
RELATIVE RISK, ODDS RATIO, AND
THE MANTEL-HAENSZEL STATISTIC
In Chapter 8 we learned to use analysis of variance techniques to analyze data that arise
from designed experiments, investigations in which at least one variable is manipulated
in some way. Designed experiments, of course, are not the only sources of data that are

of interest to clinicians and other health sciences professionals. Another important class of
scientific investigation that is widely used is the observational study.
DEFINITION
An observational study is a scientific investigation in which neither the
subjects under study nor any of the variables of interest are manipulated
in any way.
An observational study, in other words, may be defined simply as an investigation
that is not an experiment. The simplest form of observational study is one in which there are
only two variables of interest. One of the variables is called the risk factor, or independent
variable, and the other variable is referred to as the outcome, or dependent variable.
DEFINITION
The term risk factor is used to designate a variable that is thought to be
related to some outcome variable. The risk factor may be a suspected
cause of some specific state of the outcome variable.
In a particular investigation, for example, the outcome variable might be subjects’
status relative to cancer and the risk factor might be their status with respect to cigarette
smoking. The model is further simplified if the variables are categorical with only two
categories per variable. For the outcome variable the categories might be cancer present
and cancer absent. With respect to the risk factor subjects might be categorized as smokers
and nonsmokers.
When the variables in observational studies are categorical, the data pertaining to
them may be displayed in a contingency table, and hence the inclusion of the topic in the
present chapter. We shall limit our discussion to the situation in which the outcome variable
and the risk factor are both dichotomous variables.
Types of Observational Studies There are two basic types of observational
studies, prospective studies and retrospective studies.
DEFINITION
A prospective study is an observational study in which two random
samples of subjects are selected. One sample consists of subjects who
possess the risk factor, and the other sample consists of subjects who do
not possess the risk factor. The subjects are followed into the future (that
is, they are followed prospectively), and a record is kept on the number of
subjects in each sample who, at some point in time, are classifiable into
each of the categories of the outcome variable.
The data resulting from a prospective study involving two dichotomous variables can
be displayed in a 2 × 2 contingency table that usually provides information regarding the
number of subjects with and without the risk factor and the number who did and did not

TABLE 12.7.1
Classification of a Sample of Subjects with Respect
to Disease Status and Risk Factor
                   Disease Status
Risk Factor     Present    Absent    Total at Risk
Present         a          b         a + b
Absent          c          d         c + d
Total           a + c      b + d     n
succumb to the disease of interest as well as the frequencies for each combination of
categories of the two variables.
DEFINITION
A retrospective study is the reverse of a prospective study. The samples are
selected from those falling into the categories of the outcome variable.
The investigator then looks back (that is, takes a retrospective look) at the
subjects and determines which ones have (or had) and which ones do not
have (or did not have) the risk factor.
From the data of a retrospective study we may construct a contingency table with
frequencies similar to those that are possible for the data of a prospective study.
In general, the prospective study is more expensive to conduct than the retrospective
study. The prospective study, however, more closely resembles an experiment.
Relative Risk The data resulting from a prospective study in which the dependent
variable and the risk factor are both dichotomous may be displayed in a 2 × 2 contingency
table such as Table 12.7.1. The risk of the development of the disease among the subjects
with the risk factor is a/(a + b). The risk of the development of the disease among the
subjects without the risk factor is c/(c + d). We define relative risk as follows.
DEFINITION
Relative risk is the ratio of the risk of developing a disease among subjects
with the risk factor to the risk of developing the disease among subjects
without the risk factor.
We represent the relative risk from a prospective study symbolically as
$$\widehat{RR} = \frac{a/(a + b)}{c/(c + d)} \qquad (12.7.1)$$
where a, b, c, and d are as defined in Table 12.7.1, and $\widehat{RR}$ indicates that the relative risk is
computed from a sample to be used as an estimate of the relative risk, RR, for the
population from which the sample was drawn.

We may construct a confidence interval for RR:
$$100(1 - \alpha)\%\ \text{CI} = \widehat{RR}^{\,1 \pm (z_\alpha / \sqrt{X^2})} \qquad (12.7.2)$$
where zα is the two-sided z value corresponding to the chosen confidence coefficient and X²
is computed by Equation 12.4.1.
Interpretation of RR The value of RR may range anywhere between zero and
infinity. A value of 1 indicates that there is no association between the status of the risk
factor and the status of the dependent variable. In most cases the two possible states of
the dependent variable are disease present and disease absent. We interpret an RR of 1 to
mean that the risk of acquiring the disease is the same for those subjects with the risk
factor and those without the risk factor. A value of RR greater than 1 indicates that the
risk of acquiring the disease is greater among subjects with the risk factor than among
subjects without the risk factor. An RR value that is less than 1 indicates less risk of
acquiring the disease among subjects with the risk factor than among subjects without
the risk factor. For example, a relative risk of 2 is taken to mean that those subjects with the
risk factor are twice as likely to acquire the disease as compared to subjects without the
risk factor.
We illustrate the calculation of relative risk by means of the following example.
EXAMPLE 12.7.1
In a prospective study of pregnant women, Magann et al. (A-16) collected extensive
information on exercise level of low-risk pregnant working women. A group of 217 women
did no voluntary or mandatory exercise during the pregnancy, while a group of 238 women
exercised extensively. One outcome variable of interest was experiencing preterm labor.
The results are summarized in Table 12.7.2.
We wish to estimate the relative risk of preterm labor when pregnant women exercise
extensively.
Solution: By Equation 12.7.1 we compute
$$\widehat{RR} = \frac{22/238}{18/217} = \frac{.0924}{.0829} = 1.1$$
TABLE 12.7.2
Subjects with and without the Risk Factor Who Became Cases
of Preterm Labor
Risk Factor           Cases of Preterm Labor    Noncases of Preterm Labor    Total
Extreme exercising    22                        216                          238
Not exercising        18                        199                          217
Total                 40                        415                          455
Source: Everett F. Magann, Sharon F. Evans, Beth Weitz, and John Newnham, “Antepartum, Intrapartum,
and Neonatal Significance of Exercise on Healthy Low-Risk Pregnant Working Women,” Obstetrics and
Gynecology, 99 (2002), 466-472.

Odds Ratio and Relative Risk Section
                  Common        Original      Iterated      Log Odds    Relative
Parameter         Odds Ratio    Odds Ratio    Odds Ratio    Ratio       Risk
Upper 95% C.L.                  2.1350        2.2683        0.7585      2.1192
Estimate          1.1260        1.1207        1.1207        0.1140      1.1144
Lower 95% C.L.                  0.5883        0.5606        -0.5305     0.5896
FIGURE 12.7.1
NCSS output for the data in Example 12.7.1.
These data indicate that the risk of experiencing preterm labor when a woman
exercises heavily is 1.1 times as great as it is among women who do not
exercise at all.
We compute the 95 percent confidence interval for RR as follows. By
Equation 12.4.1, we compute from the data in Table 12.7.2:
$$X^2 = \frac{455[(22)(199) - (216)(18)]^2}{(40)(415)(238)(217)} = .1274$$
By Equation 12.7.2, the lower and upper confidence limits are, respectively,
$1.1^{1 - 1.96/\sqrt{.1274}} = .65$ and $1.1^{1 + 1.96/\sqrt{.1274}} = 1.86$. Since the interval includes
1, we conclude, at the .05 level of significance, that the population relative risk may
be 1. In other words, we conclude that, in the population, there may not be
an increased risk of experiencing preterm labor when a pregnant woman
exercises extensively.
The data were processed by NCSS. The results are shown in Figure
12.7.1. The relative risk calculation is shown in the column at the far right of
the output, along with the 95% confidence limits. Because of rounding errors,
these values differ slightly from those given in the example.
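The calculations of Example 12.7.1 can be reproduced with a short Python sketch. The function name relative_risk_ci is hypothetical, and the interval is the test-based interval of Equation 12.7.2. Because the sketch carries the relative risk at full precision (about 1.114) rather than the rounded 1.1 used in the hand calculation, its limits come out near 0.61 and 2.02 instead of .65 and 1.86.

```python
import math

def relative_risk_ci(a, b, c, d, z=1.96):
    """Relative risk (Eq. 12.7.1) with a test-based interval (Eq. 12.7.2).

    a, b, c, d follow Table 12.7.1. The interval is undefined when
    ad = bc, because X^2 is then zero.
    """
    rr = (a / (a + b)) / (c / (c + d))
    n = a + b + c + d
    x2 = n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))
    half = z / math.sqrt(x2)
    lower, upper = sorted((rr ** (1 - half), rr ** (1 + half)))
    return rr, x2, lower, upper

# Table 12.7.2: 22 cases among 238 exercisers, 18 cases among 217 non-exercisers
rr, x2, lower, upper = relative_risk_ci(22, 216, 18, 199)
print(round(rr, 4), round(x2, 4), round(lower, 2), round(upper, 2))
```

The sensitivity of the limits to rounding in the point estimate is worth noting: the exponent in Equation 12.7.2 is large when X² is small, so small changes in the base are magnified.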
Odds Ratio When the data to be analyzed come from a retrospective study, relative
risk is not a meaningful measure for comparing two groups. As we have seen, a
retrospective study is based on a sample of subjects with the disease (cases) and a separate
sample of subjects without the disease (controls or noncases). We then retrospectively
determine the distribution of the risk factor among the cases and controls. Given the results
of a retrospective study involving two samples of subjects, cases and controls, we may
display the data in a 2 × 2 table such as Table 12.7.3, in which subjects are dichotomized
with respect to the presence and absence of the risk factor. Note that the column headings in
Table 12.7.3 differ from those in Table 12.7.1 to emphasize the fact that the data are from a
retrospective study and that the subjects were selected because they were either cases or
controls. When the data from a retrospective study are displayed as in Table 12.7.3,
the ratio a/(a + b), for example, is not an estimate of the risk of disease for subjects with
the risk factor. The appropriate measure for comparing cases and controls in a retrospective
study is the odds ratio. As noted in Chapter 11, in order to understand the concept of

TABLE 12.7.3
Subjects of a Retrospective Study
Classified According to Status Relative to a Risk Factor
and Whether They Are Cases or Controls
                      Sample
Risk Factor     Cases      Controls    Total
Present         a          b           a + b
Absent          c          d           c + d
Total           a + c      b + d       n
the odds ratio, we must understand the term odds, which is frequently used by those who
place bets on the outcomes of sporting events or participate in other types of gambling
activities.
DEFINITION
The odds for success are the ratio of the probability of success to the
probability of failure.
We use this definition of odds to define two odds that we can calculate from data
displayed as in Table 12.7.3:
1. The odds of being a case (having the disease) to being a control (not having the
disease) among subjects with the risk factor is $[a/(a + b)]/[b/(a + b)] = a/b$.
2. The odds of being a case (having the disease) to being a control (not having the
disease) among subjects without the risk factor is $[c/(c + d)]/[d/(c + d)] = c/d$.
We now define the odds ratio that we may compute from the data of a retrospective
study. We use the symbol $\widehat{OR}$ to indicate that the measure is computed from sample data
and used as an estimate of the population odds ratio, OR.
DEFINITION
The estimate of the population odds ratio is
$$\widehat{OR} = \frac{a/b}{c/d} = \frac{ad}{bc} \qquad (12.7.3)$$
where a, b, c, and d are as defined in Table 12.7.3.
We may construct a confidence interval for OR by the following method:
$$100(1 - \alpha)\%\ \text{CI} = \widehat{OR}^{\,1 \pm (z_{\alpha}/\sqrt{X^2})} \qquad (12.7.4)$$
where $z_{\alpha}$ is the two-sided z value corresponding to the chosen confidence coefficient and
$X^2$ is computed by Equation 12.4.1.
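Equations 12.7.3 and 12.7.4 can be sketched in a few lines of Python. This is a minimal illustration, not part of the original text: the function names and the cell counts are hypothetical, and the default z = 1.96 assumes a 95 percent confidence coefficient.

```python
import math


def chi_square_2x2(a, b, c, d):
    """X^2 for a 2 x 2 table by Equation 12.4.1 (no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))


def odds_ratio(a, b, c, d):
    """Sample odds ratio (a/b)/(c/d) = ad/bc, Equation 12.7.3."""
    return (a * d) / (b * c)


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Confidence limits OR^(1 +/- z/sqrt(X^2)), Equation 12.7.4.

    The limits are sorted because the exponent 1 + z/sqrt(X^2) gives the
    smaller limit when the odds ratio is below 1. The interval is not
    defined when X^2 = 0 (that is, when ad = bc).
    """
    or_hat = odds_ratio(a, b, c, d)
    shift = z / math.sqrt(chi_square_2x2(a, b, c, d))
    limits = (or_hat ** (1 - shift), or_hat ** (1 + shift))
    return min(limits), max(limits)


# Hypothetical counts laid out as in Table 12.7.3:
# a = 30 cases and b = 70 controls with the risk factor,
# c = 10 cases and d = 90 controls without it.
a, b, c, d = 30, 70, 10, 90
print(round(odds_ratio(a, b, c, d), 3))      # sample odds ratio
print(round(chi_square_2x2(a, b, c, d), 3))  # X^2 from Equation 12.4.1
print(odds_ratio_ci(a, b, c, d))             # lower and upper 95 percent limits
```

If the resulting interval excludes 1, the data suggest an association between the risk factor and case status at the chosen confidence level.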

Interpretation of the Odds Ratio In the case of a rare disease, the population
odds ratio provides a good approximation to the population relative risk. Consequently,
the sample odds ratio, being an estimate of the population odds ratio, provides an
indirect estimate of the population relative risk in the case of a rare disease.
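A heuristic sketch of why the approximation holds, written with the cell notation of a 2 × 2 table for a prospective layout (rows: risk factor present or absent; columns: disease present or absent). The cell symbols here stand in for the corresponding population frequencies; this derivation is an illustration and is not taken from the text. When the disease is rare, a is small relative to b and c is small relative to d, so that a + b ≈ b and c + d ≈ d:

```latex
RR = \frac{a/(a+b)}{c/(c+d)} \;\approx\; \frac{a/b}{c/d} = \frac{ad}{bc} = OR
```

The approximation deteriorates as the disease becomes more common, because a and c can then no longer be neglected in the row totals.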
The odds ratio can assume values between zero and $\infty$. A value of 1 indicates no
association between the risk factor and disease status. A value less than 1 indicates reduced
odds of the disease among subjects with the risk factor. A value greater than 1 indicates
increased odds of having the disease among subjects in whom the risk factor is present.
EXAMPLE 12.7.2
Toschke et al. (A-17) collected data on obesity status of children ages 5-6 years and the
smoking status of the mother during the pregnancy. Table 12.7.4 shows 3970 subjects
classified as cases or noncases of obesity and also classified according to smoking status of
the mother during pregnancy (the risk factor). We wish to compare the odds of obesity at
ages 5-6 among those whose mother smoked throughout the pregnancy with the odds of
obesity at age 5-6 among those whose mother did not smoke during pregnancy.
Solution: The odds ratio is the appropriate measure for answering the question posed.
By Equation 12.7.3 we compute
$$\widehat{OR} = \frac{(64)(3496)}{(342)(68)} = 9.62$$
We see that the odds of having had a mother who smoked throughout the
pregnancy are 9.62 times as great for obese children (cases) as for
nonobese children (noncases).
We compute the 95 percent confidence interval for OR as follows. By
Equation 12.4.1 we compute from the data in Table 12.7.4
$$X^2 = \frac{3970[(64)(3496) - (342)(68)]^2}{(132)(3838)(406)(3564)} = 217.6831$$
TABLE 12.7.4  Subjects Classified According to Obesity Status and Mother’s Smoking Status during Pregnancy

                                   Obesity Status
Smoking Status
During Pregnancy         Cases      Noncases      Total
Smoked throughout          64          342          406
Never smoked               68         3496         3564
Total                     132         3838         3970

Source: A. M. Toschke, S. M. Montgomery, U. Pfeiffer, and R. von Kries, “Early
Intrauterine Exposure to Tobacco-Inhaled Products and Obesity,” American Journal
of Epidemiology, 158 (2003), 1068-1074.

Smoking_status * Obsesity_status Cross-Tabulation
Count
                                          Obesity status
                                      Cases    Noncases    Total
Smoking_status   Smoked throughout      64        342        406
                 Never smoked           68       3496       3564
Total                                  132       3838       3970

Risk Estimate
                                                     95% Confidence Interval
                                           Value       Lower       Upper
Odds Ratio for Smoking_status
  (Smoked throughout / Never smoked)       9.621       6.719      13.775
For cohort Obesity_status = Cases          8.262       5.966      11.441
For cohort Obesity_status = Noncases        .859        .823        .896
N of Valid Cases                            3970

FIGURE 12.7.2  SPSS output for Example 12.7.2.
The lower and upper confidence limits for the population OR, respectively, are
$9.621^{\,1 - 1.96/\sqrt{217.6831}} = 7.12$ and $9.621^{\,1 + 1.96/\sqrt{217.6831}} = 13.00$. We conclude
with 95 percent confidence that the population OR is somewhere between
7.12 and 13.00. Because the interval does not include 1, we conclude that, in the
population, obese children (cases) are more likely than nonobese children
(noncases) to have had a mother who smoked throughout the pregnancy.
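The data from Example 12.7.2 were processed using SPSS. The
results are shown in Figure 12.7.2. The odds ratio calculation, along with
the 95% confidence limits, is shown in the top line of the Risk Estimate
box. The odds ratio agrees with the example apart from rounding. The
confidence limits differ somewhat more, which is consistent with the
software basing its interval on the standard error of the logarithm of the
odds ratio rather than on Equation 12.7.4.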
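The two sets of confidence limits for Example 12.7.2 can be compared numerically. The sketch below computes the interval of Equation 12.7.4 and, for comparison, a log-based interval, exp[ln(OR) ± z·SE] with SE = sqrt(1/a + 1/b + 1/c + 1/d). The log-based formula is an assumption introduced here for illustration (it is not given in the text), chosen because it yields limits close to those printed in Figure 12.7.2.

```python
import math

# Cell counts from Table 12.7.4 (a, b = smoked throughout; c, d = never smoked).
a, b, c, d = 64, 342, 68, 3496
z = 1.96  # two-sided z value for 95 percent confidence

or_hat = (a * d) / (b * c)

# Interval from Equation 12.7.4, using X^2 from Equation 12.4.1.
n = a + b + c + d
x2 = n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))
test_based = (or_hat ** (1 - z / math.sqrt(x2)),
              or_hat ** (1 + z / math.sqrt(x2)))

# Log-based interval (an assumed alternative, not stated in the text).
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
log_based = (math.exp(math.log(or_hat) - z * se_log_or),
             math.exp(math.log(or_hat) + z * se_log_or))

print(f"OR = {or_hat:.3f}")
print(f"Equation 12.7.4 limits: {test_based[0]:.2f}, {test_based[1]:.2f}")
print(f"Log-based limits:       {log_based[0]:.3f}, {log_based[1]:.3f}")
```

The first pair of limits should match the hand calculation (about 7.12 and 13.00), and the second pair should fall close to the values in the software output (6.719 and 13.775).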
The Mantel-Haenszel Statistic Frequently when we are studying the relationship
between the status of some disease and the status of some risk factor, we are
aware of another variable that may be associated with the disease, with the risk factor,
or with both in such a way that the true relationship between the disease status and the
risk factor is masked. Such a variable is called a confounding variable. For example,
experience might indicate the possibility that the relationship between some disease
and a suspected risk factor differs among different ethnic groups. We would then treat
ethnic membership as a confounding variable. When they can be identified, it is
desirable to control for confounding variables so that an unambiguous measure of the
relationship between disease status and risk factor may be calculated. A technique for
accomplishing this objective is the Mantel-Haenszel (22) procedure, so called in
recognition of the two men who developed it. The procedure allows us to test the null
hypothesis that there is no association between status with respect to disease and risk
factor status. Initially used only with data from retrospective studies, the Mantel-
Haenszel procedure is also appropriate for use with data from prospective studies, as
discussed by Mantel (23).
In the application of the Mantel-Haenszel procedure, case and control subjects are
assigned to strata corresponding to different values of the confounding variable. The data
are then analyzed within individual strata as well as across all strata. The discussion that
follows assumes that the data under analysis are from a retrospective or a prospective study
with case and noncase subjects classified according to whether they have or do not have the
suspected risk factor. The confounding variable is categorical, with the different categories
defining the strata. If the confounding variable is continuous it must be categorized. For
example, if the suspected confounding variable is age, we might group subjects into
mutually exclusive age categories. The data before stratification may be displayed as
shown in Table 12.7.3.
Application of the Mantel-Haenszel procedure consists of the following steps.
1. Form k strata corresponding to the k categories of the confounding variable. Table
12.7.5 shows the data display for the ith stratum.
2. For each stratum compute the expected frequency $e_i$ of the upper left-hand cell of
Table 12.7.5 as follows:
$$e_i = \frac{(a_i + b_i)(a_i + c_i)}{n_i} \qquad (12.7.5)$$
TABLE 12.7.5  Subjects in the ith Stratum of a Confounding Variable Classified According to Status Relative to a Risk Factor and Whether They Are Cases or Controls

                         Sample
Risk Factor      Cases          Controls       Total
Present          a_i            b_i            a_i + b_i
Absent           c_i            d_i            c_i + d_i
Total            a_i + c_i      b_i + d_i      n_i

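3. For each stratum compute
$$v_i = \frac{(a_i + b_i)(c_i + d_i)(a_i + c_i)(b_i + d_i)}{n_i^2 (n_i - 1)} \qquad (12.7.6)$$
4. Compute the Mantel-Haenszel test statistic, $\chi^2_{MH}$, as follows:
$$\chi^2_{MH} = \frac{\left(\sum_{i=1}^{k} a_i - \sum_{i=1}^{k} e_i\right)^2}{\sum_{i=1}^{k} v_i} \qquad (12.7.7)$$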
5. Reject the null hypothesis of no association between disease status and suspected risk
factor status in the population if the computed value of $\chi^2_{MH}$ is equal to or greater than
the critical value of the test statistic, which is the tabulated chi-square value for 1
degree of freedom and the chosen level of significance.
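The five steps above can be sketched as a short Python function. This is a minimal illustration under stated assumptions: the function name is hypothetical, each stratum is passed as a tuple (a, b, c, d) laid out as in Table 12.7.5, and the two strata in the usage lines are invented counts rather than data from the text.

```python
def mantel_haenszel_chi_square(strata):
    """Mantel-Haenszel test statistic of Equation 12.7.7.

    strata: iterable of (a, b, c, d) tuples, one per stratum,
    with cells arranged as in Table 12.7.5.
    """
    sum_a = sum_e = sum_v = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        sum_a += a
        # Expected frequency of the upper left-hand cell, Equation 12.7.5.
        sum_e += (a + b) * (a + c) / n
        # Stratum quantity v_i, Equation 12.7.6.
        sum_v += (a + b) * (c + d) * (a + c) * (b + d) / (n ** 2 * (n - 1))
    return (sum_a - sum_e) ** 2 / sum_v


# Hypothetical strata (not from the text).
strata = [(10, 20, 5, 25), (15, 15, 8, 22)]
statistic = mantel_haenszel_chi_square(strata)

# Step 5: compare with the tabulated chi-square value for 1 degree of
# freedom; 3.841 corresponds to a .05 level of significance.
print(round(statistic, 4), statistic >= 3.841)
```

Passing the strata as plain tuples keeps the sketch independent of any particular data library; the same function accepts any number of strata.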
Mantel-Haenszel Estimator of the Common Odds Ratio When we
have k strata of data, each of which may be displayed in a table like Table 12.7.5, we may
compute the Mantel-Haenszel estimator of the common odds ratio, $\widehat{OR}_{MH}$, as follows:
$$\widehat{OR}_{MH} = \frac{\sum_{i=1}^{k} (a_i d_i / n_i)}{\sum_{i=1}^{k} (b_i c_i / n_i)} \qquad (12.7.8)$$
When we use the Mantel-Haenszel estimator given by Equation 12.7.8, we assume that, in
the population, the odds ratio is the same for each stratum.
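As a rough illustration of Equation 12.7.8, the following Python sketch computes the common odds ratio together with the stratum-specific odds ratios, which offer an informal look at the assumption that the odds ratio is the same in every stratum. The function names and the counts are hypothetical and are not taken from the text.

```python
def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel estimator of the common odds ratio, Equation 12.7.8.

    strata: iterable of (a, b, c, d) tuples arranged as in Table 12.7.5.
    """
    strata = list(strata)
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


def stratum_odds_ratios(strata):
    """Odds ratio ad/bc computed separately within each stratum."""
    return [(a * d) / (b * c) for a, b, c, d in strata]


# Hypothetical strata (not from the text).
strata = [(10, 20, 5, 25), (15, 15, 8, 22)]
print(round(mantel_haenszel_odds_ratio(strata), 4))       # common odds ratio
print([round(x, 3) for x in stratum_odds_ratios(strata)])  # per-stratum odds ratios
```

When the per-stratum odds ratios differ widely, a single common estimate may be a poor summary of the data.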
We illustrate the use of the Mantel-Haenszel statistics with the following
examples.
EXAMPLE 12.7.3
In a study by LaMont et al. (A-18), researchers collected data on obstructive coronary
artery disease (OCAD), hypertension, and age among subjects identified by a treadmill
stress test as being at risk. In Table 12.7.6, counts on subjects in two age strata are presented
with hypertension as the risk factor and the presence of OCAD as the case/noncase
variable.
Solution:
1. Data. See Table 12.7.6.
2. Assumptions. We assume that the assumptions discussed earlier for the
valid use of the Mantel-Haenszel statistic are met.

TABLE 12.7.6  Patients Stratified by Age and Classified by Status Relative to Hypertension (the Risk Factor) and OCAD (Case/Noncase Variable)

Stratum 1 (55 and under)
Risk Factor
(Hypertension)      Cases (OCAD)      Noncases      Total
Present                  21              11           32
Absent                   16               6           22
Total                    37              17           54

Stratum 2 (over 55)
Risk Factor
(Hypertension)      Cases (OCAD)      Noncases      Total
Present                  50              14           64
Absent                   18               6           24
Total                    68              20           88

Source: Data provided courtesy of Matthew J. Budoff, MD.
3. Hypotheses.
$H_0$: There is no association between the presence of hypertension
and occurrence of OCAD in subjects 55 and under and subjects
over 55.
$H_A$: There is a relationship between the two variables.
4. Test statistic.
$$\chi^2_{MH} = \frac{\left(\sum_{i=1}^{k} a_i - \sum_{i=1}^{k} e_i\right)^2}{\sum_{i=1}^{k} v_i}$$
as given in Equation 12.7.7.
5. Distribution of test statistic. Chi-square with 1 degree of freedom.
6. Decision rule. Suppose we let $\alpha = .05$. Reject $H_0$ if the computed value
of the test statistic is greater than or equal to 3.841.
7. Calculation of test statistic. By Equation 12.7.5 we compute the
following expected frequencies:
$$e_1 = (21 + 11)(21 + 16)/54 = (32)(37)/54 = 21.93$$
$$e_2 = (50 + 14)(50 + 18)/88 = (64)(68)/88 = 49.45$$

By Equation 12.7.6 we compute
$$v_1 = (32)(22)(37)(17)/[(2916)(54 - 1)] = 2.87$$
$$v_2 = (64)(24)(68)(20)/[(7744)(88 - 1)] = 3.10$$
Finally, by Equation 12.7.7 we compute
$$\chi^2_{MH} = \frac{[(21 + 50) - (21.93 + 49.45)]^2}{2.87 + 3.10} = .0242$$
8. Statistical decision. Since .0242 < 3.841, we fail to reject $H_0$.
9. Conclusion. We conclude that there may not be an association between
hypertension and the occurrence of OCAD.
10. p value. Since .0242 < 2.706, the p value for this test is p > .10.
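The bound p > .10 in step 10 comes from the chi-square table. For 1 degree of freedom the p value can also be computed directly, because a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so that P(χ² ≥ x) = P(|z| ≥ √x). The Python sketch below uses that identity; the function name is hypothetical and the approach is offered as an illustration, not as the method used in the text.

```python
import math


def chi_square_1df_p_value(x2):
    """Upper-tail probability P(chi-square with 1 df >= x2).

    Uses P(Z^2 >= x2) = P(|Z| >= sqrt(x2)) = erfc(sqrt(x2 / 2)),
    where Z is a standard normal variable.
    """
    return math.erfc(math.sqrt(x2 / 2.0))


# Computed test statistic from step 7 of Example 12.7.3.
p_value = chi_square_1df_p_value(0.0242)
print(round(p_value, 3), p_value > 0.10)

# Sanity check against the tabulated critical values used in steps 6 and 10.
print(round(chi_square_1df_p_value(3.841), 3))  # close to .05
print(round(chi_square_1df_p_value(2.706), 3))  # close to .10
```

The identity applies only to 1 degree of freedom; other degrees of freedom require the general chi-square distribution function.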
We now illustrate the calculation of the Mantel-Haenszel estimator of the
common odds ratio.
EXAMPLE 12.7.4
Let us refer to the data in Table 12.7.6 and compute the common odds ratio.
Solution: From the stratified data in Table 12.7.6 we compute the numerator of the ratio
as follows:
$$(a_1 d_1/n_1) + (a_2 d_2/n_2) = [(21)(6)/54] + [(50)(6)/88] = 5.7424$$
The denominator of the ratio is
$$(b_1 c_1/n_1) + (b_2 c_2/n_2) = [(11)(16)/54] + [(14)(18)/88] = 6.1229$$
Now, by Equation 12.7.8, we compute the common odds ratio:
$$\widehat{OR}_{MH} = \frac{5.7424}{6.1229} = .94$$
From these results we estimate that, regardless of age, patients who
have hypertension are less likely to have OCAD than patients who do not
have hypertension.
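Hand calculation of the Mantel-Haenszel test statistics can prove to be a cumbersome
task. Fortunately, the researcher can find relief in one of several statistical software
packages that are available. To illustrate, results from the use of SPSS to process the data of
Example 12.7.3 are shown in Figure 12.7.3. The common odds ratio estimate and the
Cochran statistic in the output agree closely with the hand calculations. The
Mantel-Haenszel chi-square in the output is smaller than the value computed in the
example, which is consistent with the software applying a continuity correction.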

Smoking_status * Obsesity_status * Stratum Cross-Tabulation
Count
                                                     Obesity status
Stratum                                            Cases    Noncases    Total
55 and under   Smoking_status  Smoked throughout     21        11         32
                               Never smoked          16         6         22
               Total                                 37        17         54
Over 55        Smoking_status  Smoked throughout     50        14         64
                               Never smoked          18         6         24
               Total                                 68        20         88

Tests of Conditional Independence
                     Chi-Squared     df     Asymp. Sig. (2-sided)
Cochran’s                .025         1            .875
Mantel-Haenszel          .002         1            .961

Mantel-Haenszel Common Odds Ratio Estimate
Estimate                                                   .938
ln(Estimate)                                              -.064
Std. Error of ln(Estimate)                                 .412
Asymp. Sig. (2-sided)                                      .876
Asymp. 95% Confidence Interval
  Common Odds Ratio        Lower Bound                     .418
                           Upper Bound                    2.102
  ln(Common Odds Ratio)    Lower Bound                    -.871
                           Upper Bound                     .743

FIGURE 12.7.3  SPSS output for Example 12.7.3.
EXERCISES
12.7.1
Davy et al. (A-19) reported the results of a study involving survival from cervical cancer. The
researchers found that among subjects younger than age 50, 16 of 371 subjects had not survived for
1 year after diagnosis. In subjects age 50 or older, 219 of 376 had not survived for 1 year after
diagnosis. Compute the relative risk of death among subjects age 50 or older. Does it appear from
these data that older subjects diagnosed as having cervical cancer are prone to higher mortality
rates?

12.7.2
The objective of a prospective study by Stenestrand et al. (A-20) was to compare the mortality rate
following an acute myocardial infarction (AMI) among subjects receiving early revascularization to
the mortality rate among subjects receiving conservative treatments. Among 2554 patients receiving
revascularization within 14 days of AMI, 84 died in the year following the AMI. In the conservative
treatment group (risk factor present), 1751 of 19,358 patients died within a year of AMI. Compute the
relative risk of mortality in the conservative treatment group as compared to the revascularization
group in patients experiencing AMI.
12.7.3
Refer to Example 12.7.2. Toschke et al. (A-17), who collected data on obesity status of children ages
5-6 years and the smoking status of the mother during the pregnancy, also reported on another
outcome variable: whether the child was born premature (37 weeks or fewer of gestation). The
following table summarizes the results of this aspect of the study. The same risk factor (smoking
during pregnancy) is considered, but a case is now defined as a mother who gave birth prematurely.
                                Premature Birth Status
Smoking Status
During Pregnancy         Cases      Noncases      Total
Smoked throughout          36          370          406
Never smoked              168         3396         3564
Total                     204         3766         3970

Source: A. M. Toschke, S. M. Montgomery, U. Pfeiffer, and R. von Kries, “Early Intrauterine
Exposure to Tobacco-Inhaled Products and Obesity,” American Journal of Epidemiology, 158
(2003), 1068-1074.
Compute the odds ratio to determine if smoking throughout pregnancy is related to premature birth.
Use the chi-square test of independence to determine if one may conclude that there is an association
between smoking throughout pregnancy and premature birth. Let $\alpha = .05$.
12.7.4
Sugiyama et al. (A-21) examined risk factors for allergic diseases among 13- and 14-year-old
schoolchildren in Japan. One risk factor of interest was a family history of eating an unbalanced diet.
The following table shows the cases and noncases of children exhibiting symptoms of rhinitis in the
presence and absence of the risk factor.
                                 Rhinitis
Family History         Cases      Noncases      Total
Unbalanced diet         656         1451         2107
Balanced diet           677         1662         2339
Total                  1333         3113         4446

Source: Takako Sugiyama, Kumiya Sugiyama, Masao Toda, Tastuo Yukawa, Sohei Makino,
and Takeshi Fukuda, “Risk Factors for Asthma and Allergic Diseases Among 13-14-Year-Old
Schoolchildren in Japan,” Allergology International, 51 (2002), 139-150.
What is the estimated odds ratio of having rhinitis among subjects with a family history of an
unbalanced diet compared to those eating a balanced diet? Compute the 95 percent confidence
interval for the odds ratio.
12.7.5
According to Holben et al. (A-22), “Food insecurity implies a limited access to or availability of food
or a limited/uncertain ability to acquire food in socially acceptable ways.” These researchers

collected data on 297 families with a child in the Head Start nursery program in a rural area of Ohio
near Appalachia. The main outcome variable of the study was household status relative to food
security. Households that were not food secure are considered to be cases. The risk factor of interest
was the absence of a garden from which a household was able to supplement its food supply. In the
following table, the data are stratified by the head of household’s employment status outside the
home.
Stratum 1 (Employed Outside the Home)
Risk Factor      Cases      Noncases      Total
No garden          40          37           77
Garden             13          38           51
Total              53          75          128

Stratum 2 (Not Employed Outside the Home)
Risk Factor      Cases      Noncases      Total
No garden          75          38          113
Garden             15          33           48
Total              90          71          161

Source: Data provided courtesy of David H. Holben, Ph.D. and John P. Holcomb, Jr., Ph.D.
Compute the Mantel-Haenszel common odds ratio with stratification by employment status. Use the
Mantel-Haenszel chi-square test statistic to determine if we can conclude that there is an association
between the risk factor and food insecurity. Let $\alpha = .05$.
12.8
SUMMARY
In this chapter some uses of the versatile chi-square distribution are discussed. Chi-square
goodness-of-fit tests applied to the normal, binomial, and Poisson distributions are
presented. We see that the procedure consists of computing a statistic
$$X^2 = \sum \left[\frac{(O_i - E_i)^2}{E_i}\right]$$
that measures the discrepancy between the observed ($O_i$) and expected ($E_i$) frequencies of
occurrence of values in certain discrete categories. When the appropriate null hypothesis is
true, this quantity is distributed approximately as $\chi^2$. When $X^2$ is greater than or equal to the
tabulated value of $\chi^2$ for some $\alpha$, the null hypothesis is rejected at the $\alpha$ level of
significance.
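The statistic described above can be sketched in Python as follows. This is a minimal illustration with a hypothetical function name and invented frequencies (four categories, each with an expected frequency of 25); with no parameters estimated from the data, the degrees of freedom are the number of categories minus 1, and the tabulated chi-square value for 3 degrees of freedom at the .05 level is 7.815.

```python
def chi_square_statistic(observed, expected):
    """X^2 = sum over categories of (O_i - E_i)^2 / E_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical frequencies (not from the text).
observed = [18, 22, 30, 30]
expected = [25, 25, 25, 25]

x2 = chi_square_statistic(observed, expected)
critical_value = 7.815  # tabulated chi-square, 3 degrees of freedom, alpha = .05

print(round(x2, 2), x2 >= critical_value)
```

The null hypothesis is rejected only when the computed value is greater than or equal to the critical value.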
Tests of independence and tests of homogeneity are also discussed in this chapter.
The tests are mathematically equivalent but conceptually different. Again, these tests
essentially test the goodness-of-fit of observed data to expectation under hypotheses,
respectively, of independence of two criteria of classifying the data and the homogeneity of
proportions among two or more groups.

In addition, we discussed and illustrated in this chapter four other techniques for
analyzing frequency data that can be presented in the form of a 2 × 2 contingency table: the
Fisher exact test, the odds ratio, relative risk, and the Mantel-Haenszel procedure.
SUMMARY OF FORMULAS FOR CHAPTER 12
Each entry gives the formula number, the name, and the formula.

12.2.1  Standard normal random variable
$$z_i = \frac{y_i - \mu}{\sigma}$$

12.2.2  Chi-square distribution with n degrees of freedom
$$\chi^2_{(n)} = z_1^2 + z_2^2 + \cdots + z_n^2$$

12.2.3  Chi-square probability density function
$$f(u) = \frac{1}{\left(\frac{k}{2} - 1\right)!}\,\frac{1}{2^{k/2}}\,u^{(k/2) - 1} e^{-(u/2)}$$

12.2.4  Chi-square test statistic
$$\chi^2 = \sum \left[\frac{(O_i - E_i)^2}{E_i}\right]$$

12.4.1  Chi-square calculation formula for a 2 × 2 contingency table
$$\chi^2 = \frac{n(ad - bc)^2}{(a + c)(b + d)(a + b)(c + d)}$$

12.4.2  Yates’s corrected chi-square calculation for a 2 × 2 contingency table
$$\chi^2_{\text{corrected}} = \frac{n(|ad - bc| - .5n)^2}{(a + c)(b + d)(a + b)(c + d)}$$

12.6.1-12.6.2  Large-sample approximation to the chi-square
$$z = \frac{(a/A) - (b/B)}{\sqrt{\hat{p}(1 - \hat{p})(1/A + 1/B)}}$$
where
$$\hat{p} = (a + b)/(A + B)$$

12.7.1  Relative risk estimate
$$\widehat{RR} = \frac{a/(a + b)}{c/(c + d)}$$

12.7.2  Confidence interval for the relative risk estimate
$$100(1 - \alpha)\%\ \text{CI} = \widehat{RR}^{\,1 \pm (z_{\alpha}/\sqrt{\chi^2})}$$

12.7.3  Odds ratio estimate
$$\widehat{OR} = \frac{a/b}{c/d} = \frac{ad}{bc}$$

12.7.4  Confidence interval for the odds ratio estimate
$$100(1 - \alpha)\%\ \text{CI} = \widehat{OR}^{\,1 \pm (z_{\alpha}/\sqrt{\chi^2})}$$

12.7.5  Expected frequency in the Mantel-Haenszel statistic
$$e_i = \frac{(a_i + b_i)(a_i + c_i)}{n_i}$$

12.7.6  Stratum expected frequency in the Mantel-Haenszel statistic
$$v_i = \frac{(a_i + b_i)(c_i + d_i)(a_i + c_i)(b_i + d_i)}{n_i^2 (n_i - 1)}$$

12.7.7  Mantel-Haenszel test statistic
$$\chi^2_{MH} = \frac{\left(\sum_{i=1}^{k} a_i - \sum_{i=1}^{k} e_i\right)^2}{\sum_{i=1}^{k} v_i}$$

12.7.8  Mantel-Haenszel estimator of the common odds ratio
$$\widehat{OR}_{MH} = \frac{\sum_{i=1}^{k} (a_i d_i / n_i)}{\sum_{i=1}^{k} (b_i c_i / n_i)}$$
Symbol Key
a, b, c, d = cell frequencies in a 2 × 2 contingency table
A, B = row totals in the 2 × 2 contingency table
$\beta$ = regression coefficient
$\chi^2$ (or $X^2$) = chi-square
$e_i$ = expected frequency in the Mantel-Haenszel statistic
$E_i$ = expected frequency
$E(y|x)$ = expected value of y at x
k = degrees of freedom in the chi-square distribution
$\mu$ = mean
$O_i$ = observed frequency
$\widehat{OR}$ = odds ratio estimate
$\sigma$ = standard deviation
$\widehat{RR}$ = relative risk estimate
$v_i$ = stratum expected frequency in the Mantel-Haenszel statistic
$y_i$ = data value at point i
z = normal variate
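Formulas 12.4.1 and 12.4.2 differ only in the continuity correction, which the following Python sketch makes explicit. The function name and the `yates` flag are hypothetical; the counts are those of Table 12.7.4, so the uncorrected value can be checked against the 217.6831 computed in Example 12.7.2.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square statistic for a 2 x 2 contingency table.

    yates=False applies Formula 12.4.1; yates=True applies Formula 12.4.2
    exactly as written, subtracting .5n from |ad - bc| before squaring.
    """
    n = a + b + c + d
    difference = abs(a * d - b * c)
    if yates:
        difference -= 0.5 * n
    return n * difference ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))


# Cell counts from Table 12.7.4.
a, b, c, d = 64, 342, 68, 3496
print(round(chi_square_2x2(a, b, c, d), 4))              # Formula 12.4.1
print(round(chi_square_2x2(a, b, c, d, yates=True), 4))  # Formula 12.4.2
```

With a sample this large the correction changes the statistic only slightly; its effect is greater in small samples.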
REVIEW QUESTIONS AND EXERCISES
1. Explain how the chi-square distribution may be derived.
2. What are the mean and variance of the chi-square distribution?
3. Explain how the degrees of freedom are computed for the chi-square goodness-of-fit tests.
4. State Cochran’s rule for small expected frequencies in goodness-of-fit tests.
5. How does one adjust for small expected frequencies?
6. What is a contingency table?
7. How are the degrees of freedom computed when an $X^2$ value is computed from a contingency
table?

8. Explain the rationale behind the method of computing the expected frequencies in a test of
independence.
9. Explain the difference between a test of independence and a test of homogeneity.
10. Explain the rationale behind the method of computing the expected frequencies in a test of
homogeneity.
11. When do researchers use the Fisher exact test rather than the chi-square test?
12. Define the following:
(a) Observational study
(b) Risk factor
(c) Outcome
(d) Retrospective study
(e) Prospective study
(f) Relative risk
(g) Odds
(h) Odds ratio
(i) Confounding variable
13. Under what conditions is the Mantel-Haenszel test appropriate?
14. Explain how researchers interpret the following measures:
(a) Relative risk
(b) Odds ratio
(c) Mantel-Haenszel common odds ratio
15. In a study of violent victimization of women and men, Porcerelli et al. (A-23) collected
information from 679 women and 345 men ages 18 to 64 years at several family practice centers
in the metropolitan Detroit area. Patients filled out a health history questionnaire that included
a question about victimization. The following table shows the sample subjects cross-classified
by gender and the type of violent victimization reported. The victimization categories are
defined as no victimization, partner victimization (and not by others), victimization by a person
other than a partner (friend, family member, or stranger), and those who reported multiple
victimization.

Gender      No Victimization      Partner      Nonpartner      Multiple      Total
Women             611                34            16             18          679
Men               308                10            17             10          345
Total             919                44            33             28         1024

Source: John H. Porcerelli, Rosemary Cogan, Patricia P. West, Edward A. Rose, Dawn
Lambrecht, Karen E. Wilson, Richard K. Severson, and Dunia Karana, “Violent Victimization
of Women and Men: Physical and Psychiatric Symptoms,” Journal of the American Board of
Family Practice, 16 (2003), 32-39.
Can we conclude on the basis of these data that victimization status and gender are not independent?
Let $\alpha = .05$.
16. Refer to Exercise 15. The following table shows data reported by Porcerelli et al. for 644 African-
American and Caucasian women. May we conclude on the basis of these data that for women, race
and victimization status are not independent? Let $\alpha = .05$.

                      No Victimization      Partner      Nonpartner      Multiple      Total
Caucasian                   356                20             3              9          388
African-American            226                11            10              9          256
Total                       582                31            13             18          644

Source: John H. Porcerelli, Rosemary Cogan, Patricia P. West, Edward A. Rose, Dawn Lambrecht,
Karen E. Wilson, Richard K. Severson, and Dunia Karana, “Violent Victimization of Women and
Men: Physical and Psychiatric Symptoms,” Journal of the American Board of Family Practice, 16
(2003), 32-39.
17. A sample of 150 chronic carriers of a certain antigen and a sample of 500 noncarriers revealed the
following blood group distributions:

Blood Group      Carriers      Noncarriers      Total
O                   72            230            302
A                   54            192            246
B                   16             63             79
AB                   8             15             23
Total              150            500            650

Can one conclude from these data that the two populations from which the samples were drawn differ
with respect to blood group distribution? Let $\alpha = .05$. What is the p value for the test?
18. The following table shows 200 males classified according to social class and headache status:

                                             Social Class
Headache Group                           A       B       C      Total
No headache (in previous year)           6      30      22        58
Simple headache                         11      35      17        63
Unilateral headache (nonmigraine)        4      19      14        37
Migraine                                 5      25      12        42
Total                                   26     109      65       200

Do these data provide sufficient evidence to indicate that headache status and social class are related?
Let $\alpha = .05$. What is the p value for this test?
19. The following is the frequency distribution of scores made on an aptitude test by 175 applicants to a
physical therapy training facility ($\bar{x} = 39.71$, $s = 12.92$).

Score      Number of Applicants      Score      Number of Applicants
10-14               3                40-44               28
15-19               8                45-49               20
20-24              13                50-54               18
25-29              17                55-59               12
30-34              19                60-64                8
35-39              25                65-69                4
                                     Total              175

Do these data provide sufficient evidence to indicate that the population of scores is not normally
distributed? Let $\alpha = .05$. What is the p value for this test?
20. A local health department sponsored a venereal disease (VD) information program that was open to
high-school juniors and seniors who ranged in age from 16 to 19 years. The program director believed
that each age level was equally interested in knowing more about VD. Since each age level was about
equally represented in the area served, she felt that equal interest in VD would be reflected by equal
age-level attendance at the program. The age breakdown of those attending was as follows:

Age      Number Attending
16             26
17             50
18             44
19             40

Are these data incompatible with the program director’s belief that students in the four age levels are
equally interested in VD? Let $\alpha = .05$. What is the p value for this test?
21. In a survey, children under 15 years of age residing in the inner-city area of a large city were classified
according to ethnic group and hemoglobin level. The results were as follows:

                          Hemoglobin Level (g/100 ml)
Ethnic Group      10.0 or Greater      9.0-9.9      < 9.0      Total
A                       80               100          20        200
B                       99               190          96        385
C                       70                30          10        110
Total                  249               320         126        695

Do these data provide sufficient evidence to indicate, at the .05 level of significance, that the two
variables are related? What is the p value for this test?
22. A sample of reported cases of mumps in preschool children showed the following distribution by age:

Age (Years)      Number of Cases
Under 1                 6
1                      20
2                      35
3                      41
4                      48
Total                 150

Test the hypothesis that cases occur with equal frequency in the five age categories. Let $\alpha = .05$.
What is the p value for this test?
23. Each of a sample of 250 men drawn from a population of suspected joint disease victims was asked
which of three symptoms bothered him most. The same question was asked of a sample of 300
suspected women joint disease victims. The results were as follows:

Most Bothersome Symptom      Men      Women
Morning stiffness            111       102
Nocturnal pain                59        73
Joint swelling                80       125
Total                        250       300

Do these data provide sufficient evidence to indicate that the two populations are not homogeneous
with respect to major symptoms? Let $\alpha = .05$. What is the p value for this test?
For each of the Exercises 24 through 34, indicate whether a null hypothesis of homogeneity or a null
hypothesis of independence is appropriate.
24. A researcher wishes to compare the status of three communities with respect to immunity against polio
in preschool children. A sample of preschool children was drawn from each of the three communities.
25. In a study of the relationship between smoking and respiratory illness, a random sample of adults
were classified according to consumption of tobacco and extent of respiratory symptoms.
26. A physician who wishes to know more about the relationship between smoking and birth defects
studies the health records of a sample of mothers and their children, including stillbirths and
spontaneously aborted fetuses where possible.
27. A health research team believes that the incidence of depression is higher among people with
hypoglycemia than among people who do not suffer from this condition.
28. In a simple random sample of 200 patients undergoing therapy at a drug abuse treatment center,
60 percent belonged to ethnic group I. The remainder belonged to ethnic group II. In ethnic group I,
60 were being treated for alcohol abuse (A), 25 for marijuana abuse (B), and 20 for abuse of heroin,
illegal methadone, or some other opioid (C). The remainder had abused barbiturates, cocaine,
amphetamines, hallucinogens, or some other nonopioid besides marijuana (D). In ethnic group II the
abused drug category and the numbers involved were as follows:

A (28)      B (32)      C (13)      D (the remainder)

Can one conclude from these data that there is a relationship between ethnic group and choice of drug
to abuse? Let $\alpha = .05$ and find the p value.
29. Solar keratoses are skin lesions commonly found on the scalp, face, backs of hands, forearms, ears,
and neck. They are caused by long-term sun exposure, but they are not skin cancers. Chen et al.
(A-24) studied 39 subjects randomly assigned (with a 3 to 1 ratio) to imiquimod cream and a control
cream. The criterion for effectiveness was having 75 percent or more of the lesion area cleared after
14 weeks of treatment. There were 21 successes among 29 imiquimod-treated subjects and three
successes among 10 subjects using the control cream. The researchers used Fisher’s exact test and
obtained a p value of .027. What are the variables involved? Are the variables quantitative or
qualitative? What null and alternative hypotheses are appropriate? What are your conclusions?

30.
Janardhan et al. (A-25) examined 125 patients who underwent surgical or endovascular treatment for
intracranial aneurysms. At 30 days postprocedure, 17 subjects experienced transient/persistent
neurological deficits. The researchers performed logistic regression and found that the 95 percent
confidence interval for the odds ratio for aneurysm size was .09-.96. Aneurysm size was dichoto-
mized as less than 13 mm and greater than or equal to 13 mm. The larger tumors indicated higher odds
of deficits. Describe the variables as to whether they are continuous, discrete, quantitative, or
qualitative. What conclusions may be drawn from the given information?
31.
In a study of smoking cessation by Gold et al. (A-26), 189 subjects self-selected into three treatments:
nicotine patch only
(NTP), Bupropion SR only (B), and nicotine patch with Bupropion SR
ðNTP þ BÞ. Subjects were grouped by age into younger than 50 years old, between 50 and 64,
and 65 and older. There were 15 subjects younger than 50 years old who chose NTP, 26 who chose B,
and 16 who chose NTP þ B. In the 50-64 years category, six chose NTP, 54 chose B, and 40 chose
NTP þ B. In the oldest age category, six chose NTP, 21 chose B, and five chose NTP þ B. What
statistical technique studied in this chapter would be appropriate for analyzing these data? Describe
the variables involved as to whether they are continuous, discrete, quantitative, or qualitative. What
null and alternative hypotheses are appropriate? If you think you have sufficient information, conduct
a complete hypothesis test. What are your conclusions?
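For counts cross-classified by two categorical variables, as in this exercise, the computations for a chi-square test of independence can be sketched in Python as follows. The function names are illustrative assumptions, not part of any statistical package, and the p value routine uses a closed form that holds only when the degrees of freedom are even (here df = 4). The sketch is offered as one way to check hand calculations, not as the required solution.

```python
from math import exp


def chi_square_independence(table):
    """Return (statistic, df, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: (row total)(column total) / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


def chi_square_p_value_even_df(statistic, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = statistic / 2
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return exp(-half) * total


# Rows: younger than 50, 50-64, 65 and older. Columns: NTP, B, NTP + B.
observed = [[15, 26, 16], [6, 54, 40], [6, 21, 5]]
statistic, df, expected = chi_square_independence(observed)
p_value = chi_square_p_value_even_df(statistic, df)
print(round(statistic, 2), df, p_value < 0.05)
```

Printing the expected counts is also worthwhile, since small expected frequencies bear on whether the chi-square approximation is appropriate.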
32.
Kozinszky and Bartai (A-27) examined contraceptive use by teenage girls requesting abortion in
Szeged, Hungary. Subjects were classified as younger than 20 years old or 20 years old or older. Of
the women younger than 20 years old, 146 requested an abortion. Of the older group, 1054 requested
an abortion. A control group consisted of visitors to the family planning center who did not request an
abortion or persons accompanying women who requested an abortion. In the control group, there
were 147 women under 20 years of age and 1053 who were 20 years or older. One of the outcome
variables of interest was knowledge of emergency contraception. The researchers report that,
“Emergency contraception was significantly [(Mantel-Haenszel) p < .001] less well known among
the would-be aborter teenagers as compared to the older women requesting artificial abortion
(OR = .07) than the relevant knowledge of the teenage controls (OR = .10).” Explain the meaning
of the reported statistics. What are your conclusions based on the given information?
33.
The goal of a study by Crosignani et al. (A-28) was to assess the effect of road traffic exhaust on the
risk of childhood leukemia. They studied 120 children in Northern Italy identified through a
population-based cancer registry (cases). Four controls per case, matched by age and gender, were
sampled from population files. The researchers used a diffusion model of benzene to estimate
exposure to traffic exhaust. Compared to children whose homes were not exposed to road traffic
emissions, the rate of childhood leukemia was significantly higher for heavily exposed children.
Characterize this study as to whether it is observational, prospective, or retrospective. Describe the
variables as to whether they are continuous, discrete, quantitative, qualitative, a risk factor, or a
confounding variable. Explain the meaning of the reported results. What are your conclusions based
on the given information?
34.
Gallagher et al. (A-29) conducted a descriptive study to identify factors that influence women’s
attendance at cardiac rehabilitation programs following a cardiac event. One outcome variable of
interest was actual attendance at such a program. The researchers enrolled women discharged from
four metropolitan hospitals in Sydney, Australia. Of 183 women, only 57 women actually attended
programs. The authors reported odds ratios and confidence intervals on the following variables that
significantly affected outcome: age-squared (1.72; 1.10-2.70; women over the age of 70 had the
lowest odds, while women ages 55-70 years had the highest odds), perceived control (.92; .85-1.00),
employment (.20; .07-.58), diagnosis (6.82; 1.84-25.21; odds ratio was higher for women who
experienced coronary artery bypass grafting vs. myocardial infarction), and stressful event (.21; .06-.73).
Characterize this study as to whether it is observational, prospective, or retrospective. Describe the
variables as to whether they are continuous, discrete, quantitative, qualitative, a risk factor, or a
confounding variable. Explain the meaning of the reported odds ratios.
For each of the Exercises 35 through 51, do as many of the following as you think appropriate:
(a) Apply one or more of the techniques discussed in this chapter.
(b) Apply one or more of the techniques discussed in previous chapters.
(c) Construct graphs.
(d) Construct confidence intervals for population parameters.
(e) Formulate relevant hypotheses, perform the appropriate tests, and find p values.
(f) State the statistical decisions and clinical conclusions that the results of your hypothesis tests justify.
(g) Describe the population(s) to which you think your inferences are applicable.
(h) State the assumptions necessary for the validity of your analyses.
35.
In a prospective, randomized, double-blind study, Stanley et al. (A-30) examined the relative efficacy
and side effects of morphine and pethidine, drugs commonly used for patient-controlled analgesia
(PCA). Subjects were 40 women, between the ages of 20 and 65 years, undergoing total abdominal
hysterectomy. Patients were allocated randomly to receive morphine or pethidine by PCA. At the end
of the study, subjects described their appreciation of nausea and vomiting, pain, and satisfaction by
means of a three-point verbal scale. The results were as follows:
Satisfaction

Drug         Unhappy/Miserable   Moderately Happy   Happy/Delighted   Total
Pethidine            5                   9                  6           20
Morphine             9                   9                  2           20
Total               14                  18                  8           40

Pain

Drug         Unbearable/Severe   Moderate   Slight/None   Total
Pethidine            2              10           8          20
Morphine             2               8          10          20
Total                4              18          18          40

Nausea

Drug         Unbearable/Severe   Moderate   Slight/None   Total
Pethidine            5               9           6          20
Morphine             7               8           5          20
Total               12              17          11          40

Source: Data provided courtesy of Dr. Balraj L. Appadu.
36.
Screening data from a statewide lead poisoning prevention program between April 1990 and March
1991 were examined by Sargent et al. (A-31) in an effort to learn more about community risk factors
for iron deficiency in young children. Study subjects ranged in age between 6 and 59 months.
Among 1860 children with Hispanic surnames, 338 had iron deficiency. Four hundred fifty-seven
of 1139 with Southeast Asian surnames and 1034 of 8814 children with other surnames had iron
deficiency.
37.
To increase understanding of HIV-infection risk among patients with severe mental illness, Horwath
et al. (A-32) conducted a study to identify predictors of injection drug use among patients who did not
have a primary substance use disorder. Of 192 patients recruited from inpatient and outpatient public
psychiatric facilities, 123 were males. Twenty-nine of the males and nine of the females were found
to have a history of illicit-drug injection.
38.
Skinner et al. (A-33) conducted a clinical trial to determine whether treatment with melphalan,
prednisone, and colchicine (MPC) is superior to colchicine (C) alone. Subjects consisted of 100
patients with primary amyloidosis. Fifty were treated with C and 50 with MPC. Eighteen months
after the last person was admitted and 6 years after the trial began, 44 of those receiving C and 36 of
those receiving MPC had died.
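When an exercise such as this one reduces to a 2 × 2 table of group by outcome, the relative risk and odds ratio are simple functions of the four cell counts. The Python sketch below shows the arithmetic. The function names and the cell labeling (a, b for outcome present and absent in the first group; c, d for the second group) are assumptions made for illustration, and the choice of which group to treat as the reference is left to the analyst.

```python
def relative_risk(a, b, c, d):
    """Risk of the outcome in group 1 divided by the risk in group 2.

    a, b: outcome present / absent in group 1
    c, d: outcome present / absent in group 2
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Cross-product ratio ad / bc for the same 2 x 2 layout."""
    return (a * d) / (b * c)


# Counts from the exercise: 44 of 50 patients on C died (6 survived),
# and 36 of 50 patients on MPC died (14 survived).
rr = relative_risk(44, 6, 36, 14)
orr = odds_ratio(44, 6, 36, 14)
print(round(rr, 2), round(orr, 2))
```

Point estimates alone do not settle the question posed by the trial. A confidence interval and a hypothesis test, obtained by the methods of this chapter, are still needed.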
39.
The purpose of a study by Miyajima et al. (A-34) was to evaluate the changes of tumor cell
contamination in bone marrow (BM) and peripheral blood (PB) during the clinical course of patients
with advanced neuroblastoma. Their procedure involved detecting tyrosine hydroxylase (TH) mRNA
to clarify the appropriate source and time for harvesting hematopoietic stem cells for transplantation.
The authors used Fisher’s exact test in the analysis of their data. If available, read their article and
decide if you agree that Fisher’s exact test was the appropriate technique to use. If you agree,
duplicate their procedure and see if you get the same results. If you disagree, explain why.
40.
Cohen et al. (A-35) investigated the relationship between HIV seropositivity and bacterial vaginosis
in a population at high risk for sexual acquisition of HIV. Subjects were 144 female commercial sex
workers in Thailand of whom 62 were HIV-positive and 109 had a history of sexually transmitted
diseases (STD). In the HIV-negative group, 51 had a history of STD.
41.
The purpose of a study by Lipschitz et al. (A-36) was to examine, using a questionnaire, the rates and
characteristics of childhood abuse and adult assaults in a large general outpatient population.
Subjects consisted of 120 psychiatric outpatients (86 females, 34 males) in treatment at a large
hospital-based clinic in an inner-city area. Forty-seven females and six males reported incidents of
childhood sexual abuse.
42.
Subjects of a study by O’Brien et al. (A-37) consisted of 100 low-risk patients having well-dated
pregnancies. The investigators wished to evaluate the efficacy of a more gradual method for
promoting cervical change and delivery. Half of the patients were randomly assigned to receive
a placebo, and the remainder received 2 mg of intravaginal prostaglandin E2 (PGE2) for 5 consecutive
days. One of the infants born to mothers in the experimental group and four born to those in the
control group had macrosomia.
43.
The purposes of a study by Adra et al. (A-38) were to assess the influence of route of delivery on
neonatal outcome in fetuses with gastroschisis and to correlate ultrasonographic appearance of the
fetal bowel with immediate postnatal outcome. Among 27 cases of prenatally diagnosed gastroschisis
the ultrasonographic appearance of the fetal bowel was normal in 15. Postoperative complications
were observed in two of the 15 and in seven of the cases in which the ultrasonographic
appearance was not normal.
44.
Liu et al. (A-39) conducted household surveys in areas of Alabama under tornado warnings. In one of
the surveys (survey 2) the mean age of the 193 interviewees was 54 years. Of these 56.0 percent were
women, 88.6 percent were white, and 83.4 percent had a high-school education or higher. Among
the information collected were data on shelter-seeking activity and understanding of the term
“tornado warning.” One hundred twenty-eight respondents indicated that they usually seek
shelter when made aware of a tornado warning. Of these, 118 understood the meaning of tornado
warning. Forty-six of those who said they didn’t usually seek shelter understood the meaning
of the term.
45.
The purposes of a study by Patel et al. (A-40) were to investigate the incidence of acute angle-closure
glaucoma secondary to pupillary dilation and to identify screening methods for detecting angles at
risk of occlusion. Of 5308 subjects studied, 1287 were 70 years of age or older. Seventeen of the older
subjects and 21 of the younger subjects (40 through 69 years of age) were identified as having
potentially occludable angles.
46.
Voskuyl et al. (A-41) investigated those characteristics (including male gender) of patients with
rheumatoid arthritis (RA) that are associated with the development of rheumatoid vasculitis (RV).
Subjects consisted of 69 patients who had been diagnosed as having RV and 138 patients with RA
who were not suspected to have vasculitis. There were 32 males in the RV group and 38 among the
RA patients.
47.
Harris et al. (A-42) conducted a study to compare the efficacy of anterior colporrhaphy and
retropubic urethropexy performed for genuine stress urinary incontinence. The subjects were 76
women who had undergone one or the other surgery. Subjects in each group were comparable in age,
social status, race, parity, and weight. In 22 of the 41 cases reported as cured the surgery had been
performed by attending staff. In 10 of the failures, surgery had been performed by attending staff. All
other surgeries had been performed by resident surgeons.
48.
Kohashi et al. (A-43) conducted a study in which the subjects were patients with scoliosis. As part of
the study, 21 patients treated with braces were divided into two groups, group A (n_A = 12) and group
B (n_B = 9), on the basis of certain scoliosis progression factors. Two patients in group A and eight in
group B exhibited evidence of progressive deformity, while the others did not.
49.
In a study of patients with cervical intraepithelial neoplasia, Burger et al. (A-44) compared those who
were human papillomavirus (HPV)-positive and those who were HPV-negative with respect to risk
factors for HPV infection. Among their findings were 60 out of 91 nonsmokers with HPV infection
and 44 HPV-positive patients out of 50 who smoked 21 or more cigarettes per day.
50.
Thomas et al. (A-45) conducted a study to determine the correlates of compliance with follow-up
appointments and prescription filling after an emergency department visit. Among 235 respondents,
158 kept their appointments. Of these, 98 were females. Of those who missed their appointments, 31
were males.
51.
The subjects of a study conducted by O’Keefe and Lavan (A-46) were 60 patients with cognitive
impairment who required parenteral fluids for at least 48 hours. The patients were randomly assigned
to receive either intravenous (IV) or subcutaneous (SC) fluids. The mean age of the 30 patients in the
SC group was 81 years with a standard deviation of 6. Fifty-seven percent were females. The mean
age of the IV group was 84 years with a standard deviation of 7. Agitation related to the cannula or
drip was observed in 11 of the SC patients and 24 of the IV patients.
Exercises for Use with the Large Data Sets Available on the Following Website:
www.wiley.com/college/daniel
1.
Refer to the data on smoking, alcohol consumption, blood pressure, and respiratory disease among
1200 adults (SMOKING). The variables are as follows:
Sex (A): 1 = male; 0 = female
Smoking status (B): 0 = nonsmoker; 1 = smoker
Drinking level (C): 0 = nondrinker; 1 = light to moderate drinker; 2 = heavy drinker
Symptoms of respiratory disease (D): 1 = present; 0 = absent
High blood pressure status (E): 1 = present; 0 = absent
Select a simple random sample of size 100 from this population and carry out an analysis to see if you
can conclude that there is a relationship between smoking status and symptoms of respiratory disease.
Let α = .05 and determine the p value for your test. Compare your results with those of your
classmates.
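The sampling and tabulation steps called for in these exercises can be sketched in Python as follows. Because the SMOKING data set is not reproduced here, the population below is a randomly generated stand-in of 1200 records, and the helper name cross_tabulate is likewise hypothetical. Only the variable codes B (smoking status) and D (symptoms of respiratory disease) follow the coding listed above. In practice the records would be read from the data file on the website.

```python
import random
from collections import Counter


def cross_tabulate(records, row_key, col_key, row_levels=(0, 1), col_levels=(0, 1)):
    """Count records in each (row level, column level) cell of a two-way table."""
    counts = Counter((rec[row_key], rec[col_key]) for rec in records)
    return [[counts[(r, c)] for c in col_levels] for r in row_levels]


rng = random.Random(2024)  # fixed seed so the sample can be reproduced

# Hypothetical stand-in for the population of 1200 adults (not the real data).
population = [{"B": rng.randint(0, 1), "D": rng.randint(0, 1)} for _ in range(1200)]

# Simple random sample of size 100, drawn without replacement.
sample = rng.sample(population, 100)

# Rows: smoking status (0, 1). Columns: respiratory symptoms (0, 1).
table = cross_tabulate(sample, "B", "D")
print(table)
```

The resulting 2 × 2 table of observed frequencies is the starting point for the test of independence. Different seeds give different samples, which is why results will vary among classmates.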
2.
Refer to Exercise 1. Select a simple random sample of size 100 from the population and carry out a
test to see if you can conclude that there is a relationship between drinking status and high blood
pressure status in the population. Let α = .05 and determine the p value. Compare your results with
those of your classmates.
3.
Refer to Exercise 1. Select a simple random sample of size 100 from the population and carry out a
test to see if you can conclude that there is a relationship between gender and smoking status in the
population. Let α = .05 and determine the p value. Compare your results with those of your
classmates.
4.
Refer to Exercise 1. Select a simple random sample of size 100 from the population and carry out a
test to see if you can conclude that there is a relationship between gender and drinking level in the
population. Let α = .05 and find the p value. Compare your results with those of your classmates.
REFERENCES
Methodology References
1. KARL PEARSON, “On the Criterion that a Given System of Deviations from the Probable in the Case of a Correlated
System of Variables Is Such that It Can Be Reasonably Supposed to Have Arisen from Random Sampling,” The
London, Edinburgh and Dublin Philosophical Magazine and Journal of Science, Fifth Series, 50 (1900), 157-175.
Reprinted in Karl Pearson’s Early Statistical Papers, Cambridge University Press, 1948.
2. H. O. LANCASTER, The Chi-Squared Distribution, Wiley, New York, 1969.
3. MIKHAIL S. NIKULIN and PRISCILLA E. GREENWOOD, A Guide to Chi-Squared Testing, Wiley, New York, 1996.
4. WILLIAM G. COCHRAN, “The χ² Test of Goodness of Fit,” Annals of Mathematical Statistics, 23 (1952), 315-345.
5. WILLIAM G. COCHRAN, “Some Methods for Strengthening the Common χ² Tests,” Biometrics, 10 (1954),
417-451.
6. F. YATES, “Contingency Tables Involving Small Numbers and the χ² Tests,” Journal of the Royal Statistical
Society, Supplement, 1, 1934 (Series B), 217-235.
7. R. A. FISHER, Statistical Methods for Research Workers, Fifth Edition, Oliver and Boyd, Edinburgh, 1934.
8. R. A. FISHER, “The Logic of Inductive Inference,” Journal of the Royal Statistical Society Series A, 98 (1935),
39-54.
9. J. O. IRWIN, “Tests of Significance for Differences Between Percentages Based on Small Numbers,” Metron, 12
(1935), 83-94.
10. F. YATES, “Contingency Tables Involving Small Numbers and the χ² Test,” Journal of the Royal Statistical
Society, Supplement, 1, (1934), 217-235.
11. D. J. FINNEY, “The Fisher-Yates Test of Significance in 2 × 2 Contingency Tables,” Biometrika, 35 (1948),
145-156.
12. R. LATSCHA, “Tests of Significance in a 2 × 2 Contingency Table: Extension of Finney’s Table,” Biometrika,
40 (1955), 74-86.
13. G. A. BARNARD, “A New Test for 2 × 2 Tables,” Nature, 156 (1945), 117.
14. G. A. BARNARD, “A New Test for 2 × 2 Tables,” Nature, 156 (1945), 783-784.
15. G. A. BARNARD, “Significance Tests for 2 × 2 Tables,” Biometrika, 34 (1947), 123-138.
16. R. A. FISHER, “A New Test for 2 × 2 Tables,” Nature, 156 (1945), 388.
17. E. S. PEARSON, “The Choice of Statistical Tests Illustrated on the Interpretation of Data Classed in a 2 × 2 Table,”
Biometrika, 34 (1947), 139-167.
18. A. SWEETLAND, “A Comparison of the Chi-Square Test for 1 df and the Fisher Exact Test,” Rand Corporation,
Santa Monica, CA, 1972.
19. WENDELL E. CARR, “Fisher’s Exact Test Extended to More than Two Samples of Equal Size,” Technometrics, 22
(1980), 269-270.
20. HENRY R. NEAVE, “A New Look at an Old Test,” Bulletin of Applied Statistics, 9 (1982), 165-178.
21. WILLIAM D. DUPONT, “Sensitivity of Fisher’s Exact Test to Minor Perturbations in 2 × 2 Contingency Tables,”
Statistics in Medicine, 5 (1986), 629-635.
22. N. MANTEL and W. HAENSZEL, “Statistical Aspects of the Analysis of Data from Retrospective Studies of Disease,”
Journal of the National Cancer Institute, 22 (1959), 719-748.
23. N. MANTEL, “Chi-Square Tests with One Degree of Freedom: Extensions of the Mantel-Haenszel Procedure,”
Journal of the American Statistical Association, 58 (1963), 690-700.
Applications References
A-1.
CAROLE W. CRANOR and DALE B. CHRISTENSEN, “The Asheville Project: Short-Term Outcomes of a Community
Pharmacy Diabetes Care Program,” Journal of the American Pharmaceutical Association, 43 (2003), 149-159.
A-2.
AMY L. BYERS, HEATHER ALLORE, THOMAS M. GILL, and PETER N. PEDUZZI, “Application of Negative Binomial
Modeling for Discrete Outcomes: A Case Study in Aging Research,” Journal of Clinical Epidemiology, 56
(2003), 559-564.
A-3.
KATHLEEN M. STEPANUK, JORGE E. TOLOSA, DAWNEETE LEWIS, VICTORIA MEYERS, CYNTHIA ROYDS, JUAN CARLOS
SAOGAL, and RON LIBRIZZI, “Folic Acid Supplementation Use Among Women Who Contact a Teratology
Information Service,” American Journal of Obstetrics and Gynecology, 187 (2002), 964-967.
A-4.
J. K. SILVER and D. D. AIELLO, “Polio Survivors: Falls and Subsequent Injuries,” American Journal of Physical
Medicine and Rehabilitation, 81 (2002), 567-570.
A-5.
CYNTHIA G. SEGAL and JACQUELINE J. ANDERSON, “Preoperative Skin Preparation of Cardiac Patients,” AORN
Journal, 76 (2002), 821-827.
A-6.
RALPH ROTHENBERG and JOHN P. HOLCOMB, “Guidelines for Monitoring of NSAIDs: Who Listened?,” Journal of
Clinical Rheumatology, 6 (2000), 258-265.
A-7.
SHARON M. BOLES and PATRICK B. JOHNSON, “Gender, Weight Concerns, and Adolescent Smoking,” Journal of
Addictive Diseases, 20 (2001), 5-14.
A-8.
The DMG Study Group, “Migraine and Idiopathic Narcolepsy—A Case-Control Study,” Cephalalgia, 23 (2003),
786-789.
A-9.
TASHA D. CARTER, EMANUELA MUNDO, SAGAR V. PARKH, and JAMES L. KENNEDY, “Early Age at Onset as a Risk Factor
for Poor Outcome of Bipolar Disorder,” Journal of Psychiatric Research, 37 (2003), 297-303.
A-10.
STEVEN S. COUGHLIN, ROBERT J. UHLER, THOMAS RICHARDS, and KATHERINE M. WILSON, “Breast and Cervical Cancer
Screening Practices Among Hispanic and Non-Hispanic Women Residing Near the United States-Mexico
Border, 1999-2000,” Family and Community Health, 26 (2003), 130-139.
A-11.
ROBERT SWOR, SCOTT COMPTON, FERN VINING, LYNN OSOSKY FARR, SUE KOKKO, REBECCA PASCUAL, and RAYMOND E.
JACKSON, “A Randomized Controlled Trial of Chest Compression Only CPR for Older Adults: A Pilot Study,”
Resuscitation, 58 (2003), 177-185.
A-12.
U. S. JUSTESEN, A. M. LERVFING, A. THOMSEN, J. A. LINDBERG, C. PEDERSEN, and P. TAURIS, “Low-Dose Indinavir in
Combination with Low-Dose Ritonavir: Steady-State Pharmacokinetics and Long-Term Clinical Outcome
Follow-Up,” HIV Medicine, 4 (2003), 250-254.
A-13.
J. F. TAHMASSEBI and M. E. J. CURZON, “The Cause of Drooling in Children with Cerebral Palsy—Hypersalivation
or Swallowing Defect?” International Journal of Paediatric Dentistry, 13 (2003), 106-111.
A-14.
SHU DONG XIAO and TONG SHI, “Is Cranberry Juice Effective in the Treatment and Prevention of Helicobacter
Pylori Infection of Mice?,” Chinese Journal of Digestive Diseases, 4 (2003), 136-139.
A-15.
GAD SHAKED, OLEG KLEINER, ROBERT FINALLY, JACOB MORDECHAI, NITZA NEWMAN, and ZAHAVI COHEN,
“Management of Blunt Pancreatic Injuries in Children,” European Journal of Trauma, 29 (2003), 151-155.
A-16.
EVERETT F. MAGANN, SHARON F. EVANS, BETH WEITZ, and JOHN NEWNHAM, “Antepartum, Intrapartum, and Neonatal
Significance of Exercise on Healthy Low-Risk Pregnant Working Women,” Obstetrics and Gynecology, 99
(2002), 466-472.
A-17.
A. M. TOSCHKE, S. M. MONTGOMERY, U. PFEIFFER, and R. von KRIES, “Early Intrauterine Exposure to Tobacco-Inhaled
Products and Obesity,” American Journal of Epidemiology, 158 (2003), 1068-1074.
A-18.
DANIEL H. LAMONT, MATTHEW J. BUDOFF, DAVID M. SHAVELLE, ROBERT SHAVELLE, BRUCE H. BRUNDAGE, and JAMES
M. HAGAR, “Coronary Calcium Scanning Adds Incremental Value to Patients with Positive Stress Tests,”
American Heart Journal, 143 (2002), 861-867.
A-19.
MARGARET L. J. DAVY, TOM J. DODD, COLIN G. LUKE, and DAVID M. RODER, “Cervical Cancer: Effect of Glandular
Cell Type on Prognosis, Treatment, and Survival,” Obstetrics and Gynecology, 101 (2003), 38-45.
A-20.
U. STENESTRAND and L. WALLENTIN, “Early Revascularization and 1-Year Survival in 14-Day Survivors of Acute
Myocardial Infarction,” Lancet, 359 (2002), 1805-1811.
A-21.
TAKAKO SUGIYAMA, KUMIYA SUGIYAMA, MASAO TODA, TASTUO YUKAWA, SOHEI MAKINO, and TAKESHI FUKUDA, “Risk
Factors for Asthma and Allergic Diseases Among 13-14-Year-Old Schoolchildren in Japan,” Allergology
International, 51 (2002), 139-150.
A-22.
D. HOLBEN, M. C. MCCLINCY, J. P. HOLCOMB, and K. L. DEAN, “Food Security Status of Households in Appalachian
Ohio with Children in Head Start,” Journal of the American Dietetic Association, 104 (2004), 238-241.
A-23.
JOHN H. PORCERELLI, ROSEMARY COGAN, PATRICIA P. WEST, EDWARD A. ROSE, DAWN LAMBRECHT, KAREN E. WILSON,
RICHARD K. SEVERSON, and DUNIA KARANA, “Violent Victimization of Women and Men: Physical and Psychiatric
Symptoms,” Journal of the American Board of Family Practice, 16 (2003), 32-39.
A-24.
KENG CHEN, LEE MEI YAP, ROBIN MARKS, and STEPHEN SHUMACK, “Short-Course Therapy with Imiquimod 5%
Cream for Solar Keratoses: A Randomized Controlled Trial,” Australasian Journal of Dermatology, 44 (2003),
250-255.
A-25.
VALLABH JANARDHAN, ROBERT FRIEDLANDER, HOWARD RIINA, and PHILIP EDWIN STIEG, “Identifying Patients at Risk
for Postprocedural Morbidity After Treatment of Incidental Intracranial Aneurysms: The Role of Aneurysm Size
and Location,” Neurosurgical Focus, 13 (2002), 1-8.
A-26.
PAUL B. GOLD, ROBERT N. RUBEY, and RICHARD T. HARVEY, “Naturalistic, Self-Assignment Comparative Trial of
Bupropion SR, a Nicotine Patch, or Both for Smoking Cessation Treatment in Primary Care,” American Journal
on Addictions, 11 (2002), 315-331.
A-27.
ZOLTAN KOZINSZKY and GYORGY BARTAI, “Contraceptive Behavior of Teenagers Requesting Abortion,” European
Journal of Obstetrics and Gynecology and Reproductive Biology, 112 (2004), 80-83.
A-28.
PAOLO CROSIGNANI, ANDREA TITTARELLI, ALESSANDRO BORGINI, TIZIANA CODAZZI, ADRIANO ROVELLI, EMMA PORRO,
PAOLO CONTIERO, NADIA BIANCHI, GIOVANNA TAGLIABUE, ROSARIA FISSI, FRANCESCO ROSSITTO, and FRANCO BERRINO,
“Childhood Leukemia and Road Traffic: A Population-Based Case-Control Study,” International Journal of
Cancer, 108 (2004), 596-599.
A-29.
ROBYN GALLAGHER, SHARON MCKINLEY, and KATHLEEN DRACUP, “Predictors of Women’s Attendance at Cardiac
Rehabilitation Programs,” Progress in Cardiovascular Nursing, 18 (2003), 121-126.
A-30.
G. STANLEY, B. APPADU, M. MEAD, and D. J. ROWBOTHAM, “Dose Requirements, Efficacy and Side Effects of
Morphine and Pethidine Delivered by Patient-Controlled Analgesia After Gynaecological Surgery,” British
Journal of Anaesthesia, 76 (1996), 484-486.
A-31.
JAMES D. SARGENT, THERESE A. STUKEL, MADELINE A. DALTON, JEAN L. FREEMAN, and MARY JEAN BROWN, “Iron
Deficiency in Massachusetts Communities: Socioeconomic and Demographic Risk Factors Among Children,”
American Journal of Public Health, 86 (1996), 544-550.
A-32.
EWALD HORWATH, FRANCINE COURNOS, KAREN MCKINNON, JEANNINE R. GUIDO, and RICHARD HERMAN, “Illicit-Drug
Injection Among Psychiatric Patients Without a Primary Substance Use Disorder,” Psychiatric Services, 47
(1996), 181-185.
A-33.
MARTHA SKINNER, JENNIFER J. ANDERSON, ROBERT SIMMS, RODNEY FALK, MING WANG, CARYN A. LIBBEY, LEE ANNA JONES,
and ALAN S. COHEN, “Treatment of 100 Patients with Primary Amyloidosis: A Randomized Trial of Melphalan,
Prednisone, and Colchicine Versus Colchicine Only,” American Journal of Medicine, 100 (1996), 290-298.
A-34.
YUJI MIYAJIMA, KEIZO HORIBE, MINORU FUKUDA, KIMIKAZU MATSUMOTO, SHIN–ICHIRO NUMATA, HIROSHI MORI, and
KOJI KATO, “Sequential Detection of Tumor Cells in the Peripheral Blood and Bone Marrow of Patients with Stage
IV Neuroblastoma by the Reverse Transcription-Polymerase Chain Reaction for Tyrosine Hydroxylase mRNA,”
Cancer, 77 (1996), 1214-1219.
A-35.
CRAIG R. COHEN, ANN DUERR, NIWAT PRUITHITHADA, SUNGWAL RUGPAO, SHARON HILLIER, PATRICIA GARCIA, and
KENRAD NELSON, “Bacterial Vaginosis and HIV Seroprevalence Among Female Commercial Sex Workers in
Chiang Mai, Thailand,” AIDS, 9 (1995), 1093-1097.
A-36.
DEBORAH S. LIPSCHITZ, MARGARET L. KAPLAN, JODIE B. SORKENN, GIANNI L. FAEDDA, PETER CHORNEY, and GREGORY
M. ASNIS, “Prevalence and Characteristics of Physical and Sexual Abuse Among Psychiatric Outpatients,”
Psychiatric Services, 47 (1996), 189-191.
A-37.
JOHN M. O’BRIEN, BRIAN M. MERCER, NANCY T. CLEARY, and BAHA M. SIBAI, “Efficacy of Outpatient Induction with
Low-Dose Intravaginal Prostaglandin E2: A Randomized, Double-Blind, Placebo-Controlled Trial,” American
Journal of Obstetrics and Gynecology, 173 (1995), 1855-1859.
A-38.
ABDALLAH M. ADRA, HELAIN J. LANDY, JAIME NAHMIAS, and ORLANDO GOMEZ-MARÍN, “The Fetus with Gastroschisis:
Impact of Route of Delivery and Prenatal Ultrasonography,” American Journal of Obstetrics and
Gynecology, 174 (1996), 540-546.
A-39.
SIMIN LIU, LYNN E. QUENEMOEN, JOSEPHINE MALILAY, ERIC NOJI, THOMAS SINKS, and JAMES MENDLEIN, “Assessment
of a Severe-Weather Warning System and Disaster Preparedness, Calhoun County, Alabama, 1994,” American
Journal of Public Health, 86 (1996), 87-89.
A-40.
KETAN H. PATEL, JONATHAN C. JAVITT, JAMES M. TIELSCH, DEBRA A. STREET, JOANNE KATZ, HARRY A. QUIGLEY, and
ALFRED SOMMER, “Incidence of Acute Angle-Closure Glaucoma After Pharmacologic Mydriasis,” American
Journal of Ophthalmology, 120 (1995), 709-717.
A-41.
ALEXANDRE E. VOSKUYL, AEILKO H. ZWINDERMAN, MARIE LOUISE WESTEDT, JAN P. VANDENBROUCKE, FERDINAND C.
BREEDVELD, and JOHANNA M. W. HAZES, “Factors Associated with the Development of Vasculitis in Rheumatoid
Arthritis: Results of a Case-Control Study,” Annals of the Rheumatic Diseases, 55 (1996), 190-192.
A-42.
ROBERT L. HARRIS, CHRISTOPHER A. YANCEY, WINFRED L. WISER, JOHN C. MORRISON, and G. RODNEY MEEKS,
“Comparison of Anterior Colporrhaphy and Retropubic Urethropexy for Patients with Genuine Stress Urinary
Incontinence,” American Journal of Obstetrics and Gynecology, 173 (1995), 1671-1675.
A-43.
YOSHIHIRO KOHASHI, MASAYOSHI OGA, and YOICHI SUGIOKA, “A New Method Using Top Views of the Spine to
Predict the Progression of Curves in Idiopathic Scoliosis During Growth,” Spine, 21 (1996), 212-217.
A-44.
M. P. M. BURGER, H. HOLLEMA, W. J. L. M. PIETERS, F. P. SCHRÖDER, and W. G. V. QUINT, “Epidemiological
Evidence of Cervical Intraepithelial Neoplasia Without the Presence of Human Papillomavirus,” British Journal
of Cancer, 73 (1996), 831-836.
A-45.
ERIC J. THOMAS, HELEN R. BURSTIN, ANNE C. O’NEIL, E. JOHN ORAV, and TROYEN A. BRENNAN, “Patient
Noncompliance with Medical Advice After the Emergency Department Visit,” Annals of Emergency Medicine,
27 (1996), 49-55.
A-46.
S. T. O’KEEFE and J. N. LAVAN, “Subcutaneous Fluids in Elderly Hospital Patients with Cognitive Impairment,”
Gerontology, 42 (1996), 36-39.