– MedMuv


June 18, 2024

THE CHI-SQUARE DISTRIBUTION AND THE ANALYSIS OF FREQUENCIES

CHAPTER OVERVIEW

This chapter explores techniques that are commonly used in the analysis of

count or frequency data. Uses of the chi-square distribution, which was

mentioned briefly in Chapter 6, are discussed and illustrated in greater detail.

Additionally, statistical techniques often used in epidemiological studies are

introduced and demonstrated by means of examples.

TOPICS

INTRODUCTION

THE MATHEMATICAL PROPERTIES OF THE CHI-SQUARE DISTRIBUTION

TESTS OF GOODNESS-OF-FIT

TESTS OF INDEPENDENCE

TESTS OF HOMOGENEITY

THE FISHER EXACT TEST

RELATIVE RISK, ODDS RATIO, AND THE MANTEL-HAENSZEL STATISTIC

SUMMARY

LEARNING OUTCOMES

After studying this chapter, the student will

1. understand the mathematical properties of the chi-square distribution.

2. be able to use the chi-square distribution for goodness-of-fit tests.

3. be able to construct and use contingency tables to test independence

and homogeneity.

4. be able to apply Fisher’s exact test for 2 × 2 tables.

5. understand how to calculate and interpret the epidemiological concepts of relative

risk, odds ratios, and the Mantel-Haenszel statistic.


12.1 INTRODUCTION

In the chapters on estimation and hypothesis testing, brief mention is made of the chi-

square distribution in the construction of confidence intervals for, and the testing of,

hypotheses concerning a population variance. This distribution, which is one of the most

widely used distributions in statistical applications, has many other uses. Some of the more

common ones are presented in this chapter along with a more complete description of the

distribution itself, which follows in the next section.

The chi-square distribution is the most frequently employed statistical technique for

the analysis of count or frequency data. For example, we may know for a sample of

hospitalized patients how many are male and how many are female. For the same sample

we may also know how many have private insurance coverage, how many have Medicare

insurance, and how many are on Medicaid assistance. We may wish to know, for the

population from which the sample was drawn, if the type of insurance coverage differs

according to gender. For another sample of patients, we may have frequencies for each

diagnostic category represented and for each geographic area represented. We might want

to know if, in the population from which the sample was drawn, there is a relationship

between area of residence and diagnosis. We will learn how to use chi-square analysis to

answer these types of questions.

There are other statistical techniques that may be used to analyze frequency data in

an effort to answer other types of questions. In this chapter we will also learn about these

techniques.

12.2 THE MATHEMATICAL PROPERTIES OF THE CHI-SQUARE DISTRIBUTION

The chi-square distribution may be derived from normal distributions. Suppose that from a

normally distributed random variable Y with mean μ and variance σ² we randomly and
independently select samples of size n = 1. Each value selected may be transformed to the
standard normal variable z by the familiar formula

$$z_i = \frac{y_i - \mu}{\sigma} \tag{12.2.1}$$

Each value of z may be squared to obtain z². When we investigate the sampling distribution
of z², we find that it follows a chi-square distribution with 1 degree of freedom.
That is,

$$\chi^2_{(1)} = \left(\frac{y - \mu}{\sigma}\right)^2 = z^2$$

Now suppose that we randomly and independently select samples of size n = 2 from the

normally distributed population of Y values. Within each sample we may transform each


value of y to the standard normal variable z and square as before. If the resulting values of z²

for each sample are added, we may designate this sum by

$$\chi^2_{(2)} = \left(\frac{y_1 - \mu}{\sigma}\right)^2 + \left(\frac{y_2 - \mu}{\sigma}\right)^2 = z_1^2 + z_2^2$$

since it follows the chi-square distribution with 2 degrees of freedom, the number of

independent squared terms that are added together.

The procedure may be repeated for any sample size n. The sum of the resulting z²

values in each case will be distributed as chi-square with n degrees of freedom. In general,

then,

$$\chi^2_{(n)} = z_1^2 + z_2^2 + \cdots + z_n^2 \tag{12.2.2}$$

follows the chi-square distribution with n degrees of freedom. The mathematical form of

the chi-square distribution is as follows:

$$f(u) = \frac{1}{\left(\frac{k}{2} - 1\right)!} \, \frac{1}{2^{k/2}} \, u^{(k/2) - 1} e^{-(u/2)}, \quad u > 0 \tag{12.2.3}$$

where e is the irrational number 2.71828 . . . and k is the number of degrees of freedom.
The variate u is usually designated by the Greek letter chi (χ) and, hence, the distribution is
called the chi-square distribution. As we pointed out in Chapter 6, the chi-square
distribution has been tabulated in Appendix Table F. Further use of the table is
demonstrated as the need arises in succeeding sections.
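The density in Equation 12.2.3 can be evaluated directly. The following is a minimal sketch, not part of the original text: the function names are illustrative, and the factorial (k/2 − 1)! is computed with the gamma function so that odd values of k are also covered (an assumption about how the factorial is to be read for non-integer arguments).

```python
import math


def chi_square_pdf(u, k):
    """Density of Equation 12.2.3 for u > 0 and k degrees of freedom."""
    if u <= 0:
        return 0.0
    half_k = k / 2.0
    # math.gamma(half_k) plays the role of (k/2 - 1)! and also handles odd k.
    return (u ** (half_k - 1.0) * math.exp(-u / 2.0)
            / (math.gamma(half_k) * 2.0 ** half_k))


def area_under_pdf(k, upper=100.0, steps=20000):
    """Midpoint-rule approximation of the area under the density on (0, upper)."""
    width = upper / steps
    return sum(chi_square_pdf((i + 0.5) * width, k) for i in range(steps)) * width


# For k = 2 the density reduces to 0.5 * exp(-u / 2).
density_at_2 = chi_square_pdf(2.0, 2)

# The total area under any density should be (approximately) 1.
total_area = area_under_pdf(4)

print(density_at_2, total_area)
```

The numerical area check is only a sanity test of the formula; tabulated values such as those in Appendix Table F remain the reference used in the text.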

The mean and variance of the chi-square distribution are k and 2k, respectively. The
modal value of the distribution is k − 2 for values of k greater than or equal to 2 and is zero
for k = 1.

The shapes of the chi-square distributions for several values of k are shown in Figure
6.9.1. We observe in this figure that the shapes for k = 1 and k = 2 are quite different from
the general shape of the distribution for k > 2. We also see from this figure that chi-square
assumes values between 0 and infinity. It cannot take negative values, since it is the sum

of values that have been squared. A final characteristic of the chi-square distribution worth

noting is that the sum of two or more independent chi-square variables also follows a

chi-square distribution.
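The construction described in this section (standardize each normal value, square it, and add k such squares) can be imitated by simulation. The sketch below is illustrative only: the population mean and standard deviation, the seed, and the number of draws are arbitrary assumptions, and the sample mean and variance will only approximate k and 2k.

```python
import random
import statistics


def simulate_chi_square(k, draws, mu=10.0, sigma=2.0, seed=12345):
    """Return `draws` sums of k squared standard normal values.

    mu, sigma and seed are arbitrary illustrative choices.
    """
    rng = random.Random(seed)
    sums = []
    for _ in range(draws):
        total = 0.0
        for _ in range(k):
            y = rng.gauss(mu, sigma)   # value from the normal population of Y
            z = (y - mu) / sigma       # Equation 12.2.1
            total += z * z             # squared standard normal value
        sums.append(total)
    return sums


k = 3
values = simulate_chi_square(k, draws=20000)
sample_mean = statistics.mean(values)          # should be near k
sample_variance = statistics.variance(values)  # should be near 2k

print(sample_mean, sample_variance, min(values))
```

Because every simulated value is a sum of squares, none can be negative, which matches the remark that chi-square takes values between 0 and infinity.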

Types of Chi-Square Tests As already noted, we make use of the chi-square

distribution in this chapter in testing hypotheses where the data available for analysis are

in the form of frequencies. These hypothesis testing procedures are discussed under the

topics of tests of goodness-of-fit, tests of independence, and tests of homogeneity. We will

discover that, in a sense, all of the chi-square tests that we employ may be thought of as

goodness-of-fit tests, in that they test the goodness-of-fit of observed frequencies to

frequencies that one would expect if the data were generated under some particular theory

or hypothesis. We, however, reserve the phrase “goodness-of-fit” for use in a more


restricted sense. We use it to refer to a comparison of a sample distribution to some theoretical

distribution that it is assumed describes the population from which the sample came. The

justification of our use of the distribution in these situations is due to Karl Pearson (1), who

showed that the chi-square distribution may be used as a test of the agreement between

observation and hypothesis whenever the data are in the form of frequencies. An extensive

treatment of the chi-square distribution is to be found in the book by Lancaster (2). Nikulin

and Greenwood (3) offer practical advice for conducting chi-square tests.

Observed Versus Expected Frequencies The chi-square statistic is most

appropriate for use with categorical variables, such as marital status, whose values are

the categories married, single, widowed, and divorced. The quantitative data used in

the computation of the test statistic are the frequencies associated with each category of the

one or more variables under study. There are two sets of frequencies with which we are

concerned, observed frequencies and expected frequencies. The observed frequencies

are the number of subjects or objects in our sample that fall into the various categories of

the variable of interest. For example, if we have a sample of 100 hospital patients, we may

observe that 50 are married, 30 are single, 15 are widowed, and 5 are divorced. Expected

frequencies are the number of subjects or objects in our sample that we would expect to

observe if some null hypothesis about the variable is true. For example, our null hypothesis

might be that the four categories of marital status are equally represented in the population

from which we drew our sample. In that case we would expect our sample to contain 25

married, 25 single, 25 widowed, and 25 divorced patients.

The Chi-Square Test Statistic The test statistic for the chi-square tests we
discuss in this chapter is

$$X^2 = \sum \left[\frac{(O_i - E_i)^2}{E_i}\right] \tag{12.2.4}$$

When the null hypothesis is true, X² is distributed approximately as χ² with k − r

degrees of freedom. In determining the degrees of freedom, k is equal to the number of

groups for which observed and expected frequencies are available, and r is the number of

restrictions or constraints imposed on the given comparison. A restriction is imposed when

we force the sum of the expected frequencies to equal the sum of the observed frequencies,

and an additional restriction is imposed for each parameter that is estimated from the

sample.

In Equation 12.2.4, O_i is the observed frequency for the ith category of the variable of

interest, and E_i is the expected frequency (given that H₀ is true) for the ith category.

The quantity X² is a measure of the extent to which, in a given situation, pairs of

observed and expected frequencies agree. As we will see, the nature of X² is such that when

there is close agreement between observed and expected frequencies it is small, and when

the agreement is poor it is large. Consequently, only a sufficiently large value of X² will

cause rejection of the null hypothesis.

If there is perfect agreement between the observed frequencies and the frequencies

that one would expect, given that H₀ is true, the term O_i − E_i in Equation 12.2.4 will be


equal to zero for each pair of observed and expected frequencies. Such a result would yield

a value of X² equal to zero, and we would be unable to reject H₀.

When there is disagreement between observed frequencies and the frequencies one

would expect given that H₀ is true, at least one of the O_i − E_i terms in Equation 12.2.4 will
be a nonzero number. In general, the poorer the agreement between the O_i and the E_i, the
greater or the more frequent will be these nonzero values. As noted previously, if the
agreement between the O_i and the E_i is sufficiently poor (resulting in a sufficiently large X²
value), we will be able to reject H₀.

When there is disagreement between a pair of observed and expected frequencies, the

difference may be either positive or negative, depending on which of the two frequencies is

the larger. Since the measure of agreement, X², is a sum of component quantities whose

magnitudes depend on the difference O_i − E_i, positive and negative differences must be
given equal weight. This is achieved by squaring each O_i − E_i difference. Dividing the
squared differences by the appropriate expected frequency converts the quantity to a term
that is measured in original units. Adding these individual (O_i − E_i)²/E_i terms yields X², a

summary statistic that reflects the extent of the overall agreement between observed and

expected frequencies.
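Equation 12.2.4 translates into a few lines of code. The sketch below applies it to the marital status illustration given earlier (observed 50, 30, 15, 5 against an expected 25 in each category); the function name is an illustrative choice, not something defined in the text.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all categories (Equation 12.2.4)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Marital status example: married, single, widowed, divorced.
observed = [50, 30, 15, 5]
expected = [25.0, 25.0, 25.0, 25.0]  # equal representation under H0

x2 = chi_square_statistic(observed, expected)
print(x2)  # 25 + 1 + 4 + 16 = 46.0
```

With perfect agreement between the two lists the function returns zero, which is the case described above in which H₀ cannot be rejected.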

The Decision Rule The quantity Σ[(O_i − E_i)²/E_i] will be small if the observed
and expected frequencies are close together and will be large if the differences are large.
The computed value of X² is compared with the tabulated value of χ² with k − r
degrees of freedom. The decision rule, then, is: Reject H₀ if X² is greater than or equal to the
tabulated χ² for the chosen value of α.
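When a printed table is not at hand, the tabulated critical value can be approximated numerically. The following sketch is one possible approach, not the method used in the text: it computes the upper-tail area of the chi-square distribution for integer degrees of freedom with a standard recurrence and then finds the critical value by bisection. All function names are illustrative assumptions.

```python
import math


def chi_square_sf(x, df):
    """Upper-tail area P(chi-square >= x) for a positive integer df."""
    if x <= 0:
        return 1.0
    half_x = x / 2.0
    if df % 2 == 1:
        sf, k = math.erfc(math.sqrt(half_x)), 1   # closed form for 1 df
    else:
        sf, k = math.exp(-half_x), 2              # closed form for 2 df
    while k < df:
        # Moving from k to k + 2 degrees of freedom adds one term.
        sf += half_x ** (k / 2.0) * math.exp(-half_x) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return sf


def critical_value(alpha, df):
    """Value c with upper-tail area alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi_square_sf(hi, df) > alpha:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi_square_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def reject_h0(x2, alpha, df):
    """Decision rule: reject H0 if X^2 is at least the critical value."""
    return x2 >= critical_value(alpha, df)


print(round(critical_value(0.05, 5), 3))
print(reject_h0(10.566, 0.05, 5), reject_h0(97.15, 0.01, 4))
```

The values produced this way should agree with Appendix Table F to the precision printed there, for example 11.070 for 5 degrees of freedom at α = .05 and 13.277 for 4 degrees of freedom at α = .01.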

Small Expected Frequencies Frequently in applications of the chi-square test

the expected frequency for one or more categories will be small, perhaps much less than 1.

In the literature the point is frequently made that the approximation of X² to χ² is not
strictly valid when some of the expected frequencies are small. There is disagreement
among writers, however, over what size expected frequencies are allowable before making
some adjustment or abandoning χ² in favor of some alternative test. Some writers,
especially the earlier ones, suggest lower limits of 10, whereas others suggest that all
expected frequencies should be no less than 5. Cochran (4,5) suggests that for goodness-of-fit
tests of unimodal distributions (such as the normal), the minimum expected
frequency can be as low as 1. If, in practice, one encounters one or more expected

frequencies less than 1, adjacent categories may be combined to achieve the suggested

minimum. Combining reduces the number of categories and, therefore, the number of

degrees of freedom. Cochran’s suggestions appear to have been followed extensively by

practitioners in recent years.
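Combining adjacent categories can be automated. The sketch below is a simplified illustration under stated assumptions: it pools only at the two ends of the list (where small expected frequencies typically occur for unimodal distributions), it does not handle small expected frequencies in the interior, and the function names are hypothetical. The expected frequencies are those of Examples 12.3.3 and 12.3.2 later in this chapter.

```python
def pooling_groups(expected, minimum=1.0):
    """Group category indices so each end group has expected >= minimum."""
    n = len(expected)
    if n == 0:
        return []
    # Pool from the left end until the running total reaches the minimum.
    left = [0]
    left_total = expected[0]
    while left_total < minimum and left[-1] + 1 < n:
        left.append(left[-1] + 1)
        left_total += expected[left[-1]]
    if left[-1] == n - 1:
        return [left]
    # Pool from the right end in the same way, without crossing the left group.
    right = [n - 1]
    right_total = expected[n - 1]
    while right_total < minimum and right[0] - 1 > left[-1]:
        right.insert(0, right[0] - 1)
        right_total += expected[right[0]]
    middle = [[i] for i in range(left[-1] + 1, right[0])]
    return [left] + middle + [right]


def pooled_sums(values, groups):
    """Add up observed or expected frequencies within each group."""
    return [sum(values[i] for i in group) for group in groups]


# Expected frequencies from Example 12.3.3 (Poisson, 0 to "10 or more").
poisson_expected = [4.50, 13.41, 20.16, 20.16, 15.12, 9.09, 4.50, 1.98, 0.72, 0.27, 0.09]
poisson_groups = pooling_groups(poisson_expected)
poisson_pooled = pooled_sums(poisson_expected, poisson_groups)

# Expected frequencies from Example 12.3.2 (binomial, 0 to "10 or more").
binomial_expected = [0.38, 2.36, 7.08, 13.58, 18.67, 19.60, 16.33, 11.09, 6.23, 2.95, 1.73]
binomial_groups = pooling_groups(binomial_expected)
binomial_pooled = pooled_sums(binomial_expected, binomial_groups)

print(len(poisson_groups), poisson_pooled[-1])
print(len(binomial_groups), binomial_pooled[0])
```

The number of groups returned is the number of categories that enters the degrees-of-freedom calculation after combining.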

12.3 TESTS OF GOODNESS-OF-FIT

As we have pointed out, a goodness-of-fit test is appropriate when one wishes to decide if

an observed distribution of frequencies is incompatible with some preconceived or

hypothesized distribution.


We may, for example, wish to determine whether or not a sample of observed values

of some random variable is compatible with the hypothesis that it was drawn from a

population of values that is normally distributed. The procedure for reaching a decision

consists of placing the values into mutually exclusive categories or class intervals and

noting the frequency of occurrence of values in each category. We then make use of our

knowledge of normal distributions to determine the frequencies for each category that one

could expect if the sample had come from a normal distribution. If the discrepancy is of

such magnitude that it could have come about due to chance, we conclude that the sample

may have come from a normal distribution. In a similar manner, tests of goodness-of-fit

may be carried out in cases where the hypothesized distribution is the binomial, the

Poisson, or any other distribution. Let us illustrate in more detail with some examples of

tests of hypotheses of goodness-of-fit.

EXAMPLE 12.3.1 The Normal Distribution

Cranor and Christensen (A-1) conducted a study to assess short-term clinical, economic,

and humanistic outcomes of pharmaceutical care services for patients with diabetes in

community pharmacies. For 47 of the subjects in the study, cholesterol levels are

summarized in Table 12.3.1.

We wish to know whether these data provide sufficient evidence to indicate that the

sample did not come from a normally distributed population. Let α = .05.

Solution:

1. Data. See Table 12.3.1.

2. Assumptions. We assume that the sample available for analysis is a

simple random sample.

TABLE 12.3.1 Cholesterol Levels as Described in Example 12.3.1

Cholesterol Level (mg/dl)    Number of Subjects
100.0-124.9                   1
125.0-149.9                   3
150.0-174.9                   8
175.0-199.9                  18
200.0-224.9                   6
225.0-249.9                   4
250.0-274.9                   4
275.0-299.9                   3

Source: Data provided courtesy of Carole W. Cranor, and Dale B. Christensen, “The Asheville Project: Short-Term Outcomes of a Community Pharmacy Diabetes Care Program,” Journal of the American Pharmaceutical Association, 43 (2003), 149-159.


3. Hypotheses.
H₀: In the population from which the sample was drawn, cholesterol
levels are normally distributed.
H_A: The sampled population is not normally distributed.

4. Test statistic. The test statistic is

$$X^2 = \sum_{i=1}^{k} \left[\frac{(O_i - E_i)^2}{E_i}\right]$$

5. Distribution of test statistic. If H₀ is true, the test statistic is distributed
approximately as chi-square with k − r degrees of freedom. The values
of k and r will be determined later.

6. Decision rule. We will reject H₀ if the computed value of X² is equal to
or greater than the critical value of chi-square.

7. Calculation of test statistic. Since the mean and variance of the
hypothesized distribution are not specified, the sample data must be
used to estimate them. These parameters, or their estimates, will be
needed to compute the frequency that would be expected in each class
interval when the null hypothesis is true. The mean and standard
deviation computed from the grouped data of Table 12.3.1 are

$$\bar{x} = 198.67 \qquad s = 41.31$$

As the next step in the analysis, we must obtain for each class

interval the frequency of occurrence of values that we would expect when

the null hypothesis is true, that is, if the sample were, in fact, drawn from

a normally distributed population of values. To do this, we first determine

the expected relative frequency of occurrence of values for each class

interval and then multiply these expected relative frequencies by the total

number of values to obtain the expected number of values for each

interval.

The Expected Relative Frequencies

It will be recalled from our study of the normal distribution that the relative frequency of

occurrence of values equal to or less than some specified value, say, x₀, of the normally

distributed random variable X is equivalent to the area under the curve and to the left of x₀

as represented by the shaded area in Figure 12.3.1. We obtain the numerical value of this

area by converting x₀ to a standard normal deviation by the formula z₀ = (x₀ − μ)/σ and
finding the appropriate value in Appendix Table D. We use this procedure to obtain the
expected relative frequencies corresponding to each of the class intervals in Table 12.3.1.
We estimate μ and σ with x̄ and s as computed from the grouped sample data. The first step

consists of obtaining z values corresponding to the lower limit of each class interval. The

area between two successive z values will give the expected relative frequency of

occurrence of values for the corresponding class interval.

FIGURE 12.3.1 A normal distribution showing the relative frequency of occurrence of values
less than or equal to x₀. The shaded area represents the relative frequency of occurrence of values
equal to or less than x₀.

For example, to obtain the expected relative frequency of occurrence of values in the

interval 100.0 to 124.9 we proceed as follows:

The z value corresponding to X = 100.0 is

$$z = \frac{100.0 - 198.67}{41.31} = -2.39$$

The z value corresponding to X = 125.0 is

$$z = \frac{125.0 - 198.67}{41.31} = -1.78$$

In Appendix Table D we find that the area to the left of −2.39 is .0084, and the area to
the left of −1.78 is .0375. The area between −1.78 and −2.39 is equal to
.0375 − .0084 = .0291, which is equal to the expected relative frequency of occurrence

of cholesterol levels within the interval 100.0 to 124.9. This tells us that if the null

hypothesis is true, that is, if the cholesterol levels are normally distributed, we should

expect 2.91 percent of the values in our sample to be between 100.0 and 124.9. When we

multiply our total sample size, 47, by .0291 we find the expected frequency for the interval

to be 1.4. Similar calculations will give the expected frequencies for the other intervals as

shown in Table 12.3.2.
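The table lookup just described can be mirrored in code. This is a minimal sketch under stated assumptions: the standard normal area is computed from the complementary error function rather than read from Appendix Table D, and z is rounded to two decimals to imitate the table, so results may differ from hand calculations in the last digit. The function names are illustrative.

```python
import math


def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def expected_frequency(lower, upper, mean, sd, n):
    """Expected relative frequency and count for the interval [lower, upper)."""
    z_lower = round((lower - mean) / sd, 2)  # rounded as in a printed z table
    z_upper = round((upper - mean) / sd, 2)
    rel_freq = standard_normal_cdf(z_upper) - standard_normal_cdf(z_lower)
    return z_lower, z_upper, rel_freq, n * rel_freq


# Interval 100.0 to 124.9 of Example 12.3.1, with the estimates
# x-bar = 198.67 and s = 41.31 and a sample of 47 subjects.
z_lo, z_hi, rel_freq, exp_freq = expected_frequency(100.0, 125.0, 198.67, 41.31, 47)

print(z_lo, z_hi, round(rel_freq, 4), round(exp_freq, 1))
```

Repeating the call for each pair of successive lower class limits gives the remaining expected frequencies.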

TABLE 12.3.2 Class Intervals and Expected Frequencies for Example 12.3.1

                     z = (x_i − x̄)/s
                     At Lower Limit    Expected Relative    Expected
Class Interval       of Interval       Frequency            Frequency
< 100                                  .0084                  .4
100.0-124.9          −2.39             .0291                 1.4   (first two combined: 1.8)
125.0-149.9          −1.78             .0815                 3.8
150.0-174.9          −1.18             .1653                 7.8
175.0-199.9           −.57             .2277                10.7
200.0-224.9            .03             .2269                10.7
225.0-249.9            .64             .1536                 7.2
250.0-274.9           1.24             .0753                 3.5
275.0-299.9           1.85             .0251                 1.2
300.0 and greater     2.45             .0071                  .3   (last two combined: 1.5)


Comparing Observed and Expected Frequencies

We are now interested in examining the magnitudes of the discrepancies between the

observed frequencies and the expected frequencies, since we note that the two sets of

frequencies do not agree. We know that even if our sample were drawn from a normal

distribution of values, sampling variability alone would make it highly unlikely that the

observed and expected frequencies would agree perfectly. We wonder, then, if the

discrepancies between the observed and expected frequencies are small enough that we

feel it reasonable that they could have occurred by chance alone, when the null hypothesis

is true. If they are of this magnitude, we will be unwilling to reject the null hypothesis that

the sample came from a normally distributed population.

If the discrepancies are so large that it does not seem reasonable that they could have

occurred by chance alone when the null hypothesis is true, we will want to reject the null

hypothesis. The criterion against which we judge whether the discrepancies are “large” or

“small” is provided by the chi-square distribution.

The observed and expected frequencies along with each value of (O_i − E_i)²/E_i are
shown in Table 12.3.3. The first entry in the last column, for example, is computed from
(1 − 1.8)²/1.8 = .356. The other values of (O_i − E_i)²/E_i are computed in a similar
manner.

From Table 12.3.3 we see that X² = Σ[(O_i − E_i)²/E_i] = 10.566. The appropriate
degrees of freedom are 8 (the number of groups or class intervals) − 3 (for the three
restrictions: making ΣE_i = ΣO_i, and estimating μ and σ from the sample data) = 5.

8. Statistical decision. When we compare X² = 10.566 with values of χ² in
Appendix Table F, we see that it is less than χ²_.95 = 11.070, so that, at the
.05 level of significance, we cannot reject the null hypothesis that the
sample came from a normally distributed population.

TABLE 12.3.3 Observed and Expected Frequencies and (O_i − E_i)²/E_i for Example 12.3.1

                        Observed      Expected
                        Frequency     Frequency
Class Interval          (O_i)         (E_i)         (O_i − E_i)²/E_i
< 100                    0              .4
100.0-124.9              1             1.4
  (combined)             1             1.8             .356
125.0-149.9              3             3.8             .168
150.0-174.9              8             7.8             .005
175.0-199.9             18            10.7            4.980
200.0-224.9              6            10.7            2.064
225.0-249.9              4             7.2            1.422
250.0-274.9              4             3.5             .071
275.0-299.9              3             1.2
300.0 and greater        0              .3
  (combined)             3             1.5            1.500
Total                   47                           10.566


9. Conclusion. We conclude that in the sampled population, cholesterol

levels may follow a normal distribution.

10. p value. Since 11.070 > 10.566 > 9.236, .05 < p < .10. In other words,

the probability of obtaining a value of X² as large as 10.566, when the null

hypothesis is true, is between .05 and .10. Thus we conclude that such an

event is not sufficiently rare to reject the null hypothesis that the data come

from a normal distribution.
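The bookkeeping in steps 8 through 10 (degrees of freedom, then bracketing the p value between tabulated points) can be sketched as follows. The dictionary holds only the two critical values quoted above for 5 degrees of freedom; the function name and the dictionary layout are illustrative assumptions.

```python
def bracket_p_value(x2, upper_tail_table):
    """Bracket the p value using tabulated critical values.

    upper_tail_table maps an upper-tail area to its critical value
    for the relevant degrees of freedom.
    """
    lower_bound, upper_bound = 0.0, 1.0
    for tail_area, critical in upper_tail_table.items():
        if x2 >= critical:
            upper_bound = min(upper_bound, tail_area)  # p is at most this area
        else:
            lower_bound = max(lower_bound, tail_area)  # p exceeds this area
    return lower_bound, upper_bound


groups = 8         # class intervals left after combining
restrictions = 3   # equal totals, plus estimating the mean and the standard deviation
df = groups - restrictions

# Critical values for 5 degrees of freedom quoted in the example.
table_row_df5 = {0.10: 9.236, 0.05: 11.070}

lower, upper = bracket_p_value(10.566, table_row_df5)
print(df, lower, upper)  # 5 0.05 0.1
```

Had the mean and variance been specified in the null hypothesis, `restrictions` would be 1 and a different row of the table would apply.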

Sometimes the parameters are specified in the null hypothesis. It should be noted

that had the mean and variance of the population been specified as part of the null

hypothesis in Example 12.3.1, we would not have had to estimate them from the sample

and our degrees of freedom would have been 8 − 1 = 7.

Alternatives Although one frequently encounters in the literature the use of chi-

square to test for normality, it is not the most appropriate test to use when the hypothesized

distribution is continuous. The Kolmogorov-Smirnov test, described in Chapter 13, was

especially designed for goodness-of-fit tests involving continuous distributions.

EXAMPLE 12.3.2 The Binomial Distribution

In a study designed to determine patient acceptance of a new pain reliever, 100 physicians

each selected a sample of 25 patients to participate in the study. Each patient, after trying

the new pain reliever for a specified period of time, was asked whether it was preferable to

the pain reliever used regularly in the past.

The results of the study are shown in Table 12.3.4.

TABLE 12.3.4 Results of Study Described in Example 12.3.2

Number of Patients Out of 25     Number of Doctors          Total Number of Patients Preferring
Preferring New Pain Reliever     Reporting this Number      New Pain Reliever by Doctor

102

10 or more

Total

100

500


We are interested in determining whether or not these data are compatible with the

hypothesis that they were drawn from a population that follows a binomial distribution.

Again, we employ a chi-square goodness-of-fit test.

Solution: Since the binomial parameter, p, is not specified, it must be estimated from

the sample data. A total of 500 patients out of the 2500 patients participating

in the study said they preferred the new pain reliever, so that our point

estimate of p is p̂ = 500/2500 = .20. The expected relative frequencies can
be obtained by evaluating the binomial function

$$f(x) = {}_{25}C_x \, (.2)^x (.8)^{25 - x}$$

for x = 0, 1, . . . , 25. For example, to find the probability that out of a sample
of 25 patients none would prefer the new pain reliever, when in the total
population the true proportion preferring the new pain reliever is .2, we would
evaluate

$$f(0) = {}_{25}C_0 \, (.2)^0 (.8)^{25}$$

This can be done most easily by consulting Appendix Table B, where we see
that P(X = 0) = .0038. The relative frequency of occurrence of samples of

size 25 in which no patients prefer the new pain reliever is .0038. To obtain

the corresponding expected frequency, we multiply .0038 by 100 to get .38.

Similar calculations yield the remaining expected frequencies, which, along

with the observed frequencies, are shown in Table 12.3.5. We see in this table

TABLE 12.3.5 Calculations for Example 12.3.2

Number of Patients       Number of Doctors
Out of 25 Preferring     Reporting This Number        Expected Relative    Expected
New Pain Reliever        (Observed Frequency, O_i)    Frequency            Frequency E_i
0                                                     .0038                  .38
1                                                     .0236                 2.36   (first two combined: 2.74)
2                                                     .0708                 7.08
3                                                     .1358                13.58
4                                                     .1867                18.67
5                                                     .1960                19.60
6                                                     .1633                16.33
7                                                     .1109                11.09
8                                                     .0623                 6.23
9                                                     .0295                 2.95
10 or more                                            .0173                 1.73
Total                    100                          1.0000              100.00


that the first expected frequency is less than 1, so that we follow Cochran’s

suggestion and combine this group with the second group. When we do this,

all the expected frequencies are greater than 1.

From the data, we compute

$$X^2 = \frac{(11 - 2.74)^2}{2.74} + \frac{(8 - 7.08)^2}{7.08} + \cdots + \frac{(0 - 1.73)^2}{1.73} = 47.624$$

The appropriate degrees of freedom are 10 (the number of groups left

after combining the first two) less 2, or 8. One degree of freedom is lost

because we force the total of the expected frequencies to equal the total

observed frequencies, and one degree of freedom is sacrificed because we

estimated p from the sample data.

We compare our computed X² with the tabulated χ² with 8 degrees of

freedom and find that it is significant at the .005 level of significance; that is,

p < .005. We reject the null hypothesis that the data came from a binomial

distribution.
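The expected relative frequencies in this example can also be computed directly from the binomial function instead of Appendix Table B. The sketch below is illustrative; variable names are assumptions, and values computed this way may differ from the tabulated ones in the fourth decimal place because the printed table is rounded.

```python
import math


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials with success probability p."""
    return math.comb(n, x) * p ** x * (1.0 - p) ** (n - x)


n_patients = 25          # patients per physician
n_doctors = 100          # number of samples of size 25
p_hat = 500 / 2500       # estimate of p from the sample data

# Expected relative frequencies for 0, 1, ..., 9 and "10 or more".
relative = [binomial_pmf(x, n_patients, p_hat) for x in range(10)]
relative.append(1.0 - sum(relative))

# Expected frequencies: multiply by the number of doctors.
expected = [n_doctors * r for r in relative]

# The first expected frequency is below 1, so it is combined with the second.
combined_first_two = expected[0] + expected[1]

print(round(relative[0], 4), round(combined_first_two, 2))
```

The degrees of freedom then follow from the number of groups remaining after combining, less one for the total and one for estimating p.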

EXAMPLE 12.3.3 The Poisson Distribution

A hospital administrator wishes to test the null hypothesis that emergency admissions

follow a Poisson distribution with λ = 3. Suppose that over a period of 90 days the numbers

of emergency admissions were as shown in Table 12.3.6.

TABLE 12.3.6 Number of Emergency Admissions to a Hospital During a 90-Day Period

Day     Emergency Admissions

The data of Table 12.3.6 are summarized in Table 12.3.7.

Solution: To obtain the expected frequencies we first obtain the expected relative

frequencies by evaluating the Poisson function given by Equation 4.4.1 for

each entry in the left-hand column of Table 12.3.7. For example, the first

expected relative frequency is obtained by evaluating

$$f(0) = \frac{e^{-3} 3^0}{0!}$$

We may use Appendix Table C to find this and all the other expected rel-

ative frequencies that we need. Each of the expected relative frequencies

TABLE 12.3.7 Summary of Data Presented in Table 12.3.6

Number of Emergency      Number of Days This Number of
Admissions in a Day      Emergency Admissions Occurred

10 or more

Total


TABLE 12.3.8 Observed and Expected Frequencies and Components of X² for Example 12.3.3

Number of        Number of Days        Expected
Emergency        this Number           Relative      Expected
Admissions       Occurred, O_i         Frequency     Frequency     (O_i − E_i)²/E_i
0                 5                    .050           4.50          .056
1                14                    .149          13.41          .026
2                15                    .224          20.16         1.321
3                23                    .224          20.16          .400
4                16                    .168          15.12          .051
5                 9                    .101           9.09          .001
6                 3                    .050           4.50          .500
7                 3                    .022           1.98          .525
8                                      .008            .72
9                                      .003            .27
10 or more                             .001            .09
  (8 or more combined)   2                            1.08          .784
Total            90                   1.000          90.00         3.664

is multiplied by

90 to obtain the corresponding expected frequencies.

These values along with the observed and expected frequencies and the

components of X², (O_i − E_i)²/E_i, are displayed in Table 12.3.8, in which we
see that

$$X^2 = \sum \left[\frac{(O_i - E_i)^2}{E_i}\right] = \frac{(5 - 4.50)^2}{4.50} + \cdots + \frac{(2 - 1.08)^2}{1.08} = 3.664$$

We also note that the last three expected frequencies are less than 1, so that

they must be combined to avoid having any expected frequencies less than 1.

This means that we have only nine effective categories for computing degrees

of freedom. Since the parameter, λ, was specified in the null hypothesis, we
do not lose a degree of freedom for reasons of estimation, so that the
appropriate degrees of freedom are 9 − 1 = 8. By consulting Appendix
Table F, we find that the critical value of χ² for 8 degrees of freedom and
α = .05 is 15.507, so that we cannot reject the null hypothesis at the .05 level,
or for that matter any reasonable level, of significance (p > .10). We
conclude, therefore, that emergency admissions at this hospital may follow
a Poisson distribution with λ = 3. At least the observed data do not cast any

doubt on that hypothesis.

If the parameter λ has to be estimated from sample data, the estimate is

obtained by multiplying each value x by its frequency, summing these

products, and dividing the total by the sum of the frequencies.
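Both calculations in this example, the Poisson expected frequencies and the estimate of λ from a frequency table, are short enough to sketch in code. The function names are illustrative, and the small frequency table used for the λ estimate is hypothetical data invented purely to show the arithmetic; it is not taken from the example.

```python
import math


def poisson_pmf(x, lam):
    """Poisson probability of exactly x occurrences with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def estimate_lambda(values, frequencies):
    """Sum of (value * frequency) divided by the sum of the frequencies."""
    return sum(x * f for x, f in zip(values, frequencies)) / sum(frequencies)


lam = 3.0    # specified in the null hypothesis
days = 90

# Expected relative frequencies for 0, 1, ..., 9 and "10 or more".
relative = [poisson_pmf(x, lam) for x in range(10)]
relative.append(1.0 - sum(relative))

# Unrounded expected frequencies; the text multiplies the rounded table
# values by 90, so its figures (for example 4.50) differ slightly.
expected = [days * r for r in relative]

# HYPOTHETICAL frequency table, used only to illustrate the estimate.
hypothetical_values = [0, 1, 2, 3]
hypothetical_frequencies = [10, 20, 15, 5]
lam_hat = estimate_lambda(hypothetical_values, hypothetical_frequencies)

print([round(r, 3) for r in relative])
print(lam_hat)  # (0 + 20 + 30 + 15) / 50 = 1.3
```

When λ is estimated in this way, one additional degree of freedom is lost, in line with the rule for restrictions given in Section 12.2.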


EXAMPLE 12.3.4 The Uniform Distribution

The flu season in southern Nevada for 2005-2006 ran from December to April, the

coldest months of the year. The Southern Nevada Health District reported the numbers

of vaccine-preventable influenza cases shown in Table 12.3.9. We are interested in

knowing whether the numbers of flu cases in the district are equally distributed among

the five flu

season months. That is, we wish to know if flu cases follow a uniform

distribution.

Solution:

1. Data. See Table 12.3.9.

2. Assumptions. We assume that the reported cases of flu constitute a
simple random sample of cases of flu that occurred in the district.

3. Hypotheses.
H₀: Flu cases in southern Nevada are uniformly distributed over the five
flu season months.
H_A: Flu cases in southern Nevada are not uniformly distributed over the
five flu season months.
Let α = .01.

4. Test statistic. The test statistic is

$$X^2 = \sum \left[\frac{(O_i - E_i)^2}{E_i}\right]$$

5. Distribution of test statistic. If H₀ is true, X² is distributed approximately
as χ² with (5 − 1) = 4 degrees of freedom.

6. Decision rule. Reject H₀ if the computed value of X² is equal to or
greater than 13.277.

TABLE 12.3.9 Reported Vaccine-Preventable Influenza Cases from Southern Nevada, December 2005-April 2006

                    Number of Reported
Month               Cases of Influenza
December 2005        62
January 2006         84
February 2006        17
March 2006           16
April 2006           21
Total               200

Source: http://www.southernnevadahealthdistrict.org/epidemiology/disease_statistics.htm.


Chart of Observed and Expected Values (bar chart by category; legend: Expected, Observed)

                        Test                     Contribution
Category   Observed     Proportion   Expected    to Chi-Sq
1           62          0.2          40          12.100
2           84          0.2          40          48.400
3           17          0.2          40          13.225
4           16          0.2          40          14.400
5           21          0.2          40           9.025

N      DF    Chi-Sq    P-Value
200    4     97.15     0.000

FIGURE 12.3.2 MINITAB output for Example 12.3.4.

7. Calculation of test statistic. If the null hypothesis is true, we would

expect to observe 200/5 = 40 cases per month. Figure 12.3.2 shows the

computer printout obtained from MINITAB. The bar graph shows the

observed and expected frequencies per month. The chi-square table

provides the observed frequencies, the expected frequencies based on a

uniform distribution, and the individual chi-square contribution for each

test value.

8. Statistical decision. Since 97.15, the computed value of X², is greater than 13.277, we reject, based on these data, the null hypothesis of a uniform distribution of flu cases during the flu season in southern Nevada.

9. Conclusion. We conclude that the occurrence of flu cases does not

follow a uniform distribution.

10. p value. From the MINITAB output we see that p = .000 (i.e., p < .001).
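The calculation in Example 12.3.4 can be checked with a short sketch. The observed counts used below (62, 84, 17, 16, 21) are an assumption in the sense that they are back-calculated: they are the only non-negative counts consistent with the per-month contributions in Figure 12.3.2, an expected count of 40, and a total of 200. The variable names are illustrative only.

```python
import math

# Observed cases for December through April.
# ASSUMPTION: back-calculated from the contributions in Figure 12.3.2.
observed = [62, 84, 17, 16, 21]

# Under a uniform distribution every month has the same expected count.
expected = [sum(observed) / len(observed)] * len(observed)  # 40 per month

# Per-month contributions (O - E)^2 / E and their sum, the X^2 statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
x2 = sum(contributions)

# Upper-tail chi-square probability for 4 degrees of freedom.
# For even df there is a closed form; for df = 4 it is exp(-x/2) * (1 + x/2).
half = x2 / 2
p_value = math.exp(-half) * (1 + half)

print(contributions)
print(x2, p_value)
```

The closed-form tail probability is used only to keep the sketch free of third-party libraries; a statistics package would normally supply it.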

EXAMPLE 12.3.5

A certain human trait is thought to be inherited according to the ratio 1:2:1 for homozygous

dominant, heterozygous, and homozygous recessive. An examination of a simple random

sample of 200 individuals yielded the following distribution of the trait: dominant, 43;

heterozygous, 125; and recessive, 32. We wish to know if these data provide sufficient

evidence to cast doubt on the belief about the distribution of the trait.

Solution:

1. Data. See statement of the example.

2. Assumptions. We assume that the data meet the requirements for the application of the chi-square goodness-of-fit test.

3. Hypotheses.

H₀: The trait is distributed according to the ratio 1:2:1 for homozygous dominant, heterozygous, and homozygous recessive.

H_A: The trait is not distributed according to the ratio 1:2:1.

4. Test statistic. The test statistic is

$$X^2 = \sum \frac{(O - E)^2}{E}$$

5. Distribution of test statistic. If H₀ is true, X² is distributed as chi-square with 2 degrees of freedom.

6. Decision rule. Suppose we let the probability of committing a type I error be .05. Reject H₀ if the computed value of X² is equal to or greater than 5.991.

7. Calculation of test statistic. If H₀ is true, the expected frequencies for the three manifestations of the trait are 50, 100, and 50 for dominant, heterozygous, and recessive, respectively. Consequently,

$$X^2 = \frac{(43 - 50)^2}{50} + \frac{(125 - 100)^2}{100} + \frac{(32 - 50)^2}{50} = 13.71$$

8. Statistical decision. Since 13.71 > 5.991, we reject H₀.

9. Conclusion. We conclude that the trait is not distributed according to the ratio 1:2:1.

10. p value. Since 13.71 > 10.597, the p value for the test is p < .005.
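As a rough check of Example 12.3.5, the expected frequencies can be derived from the hypothesized 1:2:1 ratio instead of being typed in by hand. This is a minimal sketch; the dictionary names are illustrative, and the p value uses the fact that the upper tail of a chi-square distribution with 2 degrees of freedom is exp(−x/2).

```python
import math

observed = {"dominant": 43, "heterozygous": 125, "recessive": 32}
ratio = {"dominant": 1, "heterozygous": 2, "recessive": 1}

n = sum(observed.values())          # 200 individuals
parts = sum(ratio.values())         # 4 parts in the ratio 1:2:1

# Expected counts under H0: 50, 100, 50.
expected = {k: n * r / parts for k, r in ratio.items()}

x2 = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)

# Upper-tail probability; this closed form holds only for 2 degrees of freedom.
p_value = math.exp(-x2 / 2)

print(expected, x2, p_value)
```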


EXERCISES

12.3.1

The following table shows the distribution of uric acid determinations taken on 250 patients. Test the

goodness-of-fit of these data to a normal distribution with μ = 5.74 and σ = 2.01. Let α = .01.

Uric Acid

Observed

Uric Acid

Observed

Determination

Frequency

Determination

Frequency

6 to 6.99

1 to 1.99

7 to 7.99

2 to 2.99

8 to 8.99

3 to 3.99

9 to 9.99

4 to 4.99

10 or higher

5 to 5.99

Total

250

12.3.2

The following data were collected on 300 eight-year-old girls. Test, at the .05 level of significance,

the null hypothesis that the data are drawn from a normally distributed population. The sample

mean and standard deviation computed from grouped data are 127.02 and 5.08.

Height in

Observed

Height in

Observed

Centimeters

Frequency

Centimeters

Frequency

114 to 115.9

128 to 129.9

116 to 117.9

130 to 131.9

118 to 119.9

132 to 133.9

120 to 121.9

134 to 135.9

122 to 123.9

136 to 137.9

124 to 125.9

138 to 139.9

126 to 127.9

Total

300

12.3.3

The face sheet of patients’ records maintained in a local health department contains 10 entries.

A sample of 100 records revealed the following distribution of erroneous entries:

Number of Erroneous

Entries Out of 10

Number of Records

5 or more

Total

100


Test the goodness-of-fit of these data to the binomial distribution with p = .20. Find the p value for

this test.

12.3.4

In a study conducted by Byers et al. (A-2), researchers tested a Poisson model for the distribution

of activities of daily living (ADL) scores after a 7-month prehabilitation program designed to

prevent functional decline among physically frail, community-living older persons. ADL measured

the ability of individuals to perform essential tasks, including walking inside the house,

bathing, upper and lower body dressing, transferring from a chair, toileting, feeding, and

grooming. The scoring method used in this study assigned a value of 0 for no (personal) help

and no difficulty, 1 for difficulty but no help, and 2 for help regardless of difficulty. Scores were

summed to produce an overall score ranging from 0 to 16 (for eight tasks). There were 181 subjects

who completed the study. Suppose we use the authors’ scoring method to assess the status of

another group of 181 subjects relative to their activities of daily living. Let us assume that the

following results were obtained.

Observed

Expected

Observed

Expected

X Frequency X Frequency

Frequency X Frequency

11.01

2.95

30.82

1.03

43.15

0.32

40.27

0.09

28.19

0.02

15.79

12 or more

0.01

7.37

Source: Hypothetical data based on procedure reported by Amy L. Byers, Heather Allore,

Thomas M. Gill, and Peter N. Peduzzi, “Application of Negative Binomial Modeling for

Discrete Outcomes: A Case Study in Aging Research,” Journal of Clinical Epidemiology, 56

(2003), 559-564.

Test the null hypothesis that these data were drawn from a Poisson distribution with λ = 2.8. Let

α = .01.

12.3.5

The following are the numbers of a particular organism found in 100 samples of water from

a pond:

Number of Organisms

per Sample

Frequency

per Sample

Frequency

Total

100

Test the null hypothesis that these data were drawn from a Poisson distribution. Determine the p value

for this test.


12.3.6

A research team conducted a survey in which the subjects were adult smokers. Each subject in a

sample of 200 was asked to indicate the extent to which he or she agreed with the statement: “I would

like to quit smoking.” The results were as follows:

Response:

Strongly agree

Agree

Disagree

Strongly Disagree

Number

Responding:

102

Can one conclude on the basis of these data that, in the sampled population, opinions are not equally

distributed over the four levels of agreement? Let the probability of committing a type I error be .05

and find the p value.

12.4

TESTS OF INDEPENDENCE

Another, and perhaps the most frequent, use of the chi-square distribution is to test the null

hypothesis that two criteria of classification, when applied to the same set of entities, are

independent. We say that two criteria of classification are independent if the distribution of

one criterion is the same no matter what the distribution of the other criterion. For example,

if socioeconomic status and area of residence of the inhabitants of a certain city are

independent, we would expect to find the same proportion of families in the low, medium,

and high socioeconomic groups in all areas of the city.

The Contingency Table The classification, according to two criteria, of a set of

entities, say, people, can be shown by a table in which the r rows represent the various

levels of one criterion of classification and the c columns represent the various levels of the

second criterion. Such a table is generally called a contingency table, with dimension r × c.

The classification according to two criteria of a finite population of entities is shown in

Table 12.4.1.

We will be interested in testing the null hypothesis that in the population the two criteria of classification are independent. If the hypothesis is rejected, we will conclude that the two criteria of classification are not independent. A sample of size n will be drawn from the population of entities, and the frequency of occurrence of entities in the sample corresponding to the cells formed by the intersections of the rows and columns of Table 12.4.1 along with the marginal totals will be displayed in a table such as Table 12.4.2.

TABLE 12.4.1 Two-Way Classification of a Finite Population of Entities

| Second Criterion of Classification Level | First Criterion Level 1 | 2 | 3 | … | c | Total |
|---|---|---|---|---|---|---|
| 1 | N_11 | N_12 | N_13 | … | N_1c | N_1. |
| 2 | N_21 | N_22 | N_23 | … | N_2c | N_2. |
| 3 | N_31 | N_32 | N_33 | … | N_3c | N_3. |
| … | … | … | … | … | … | … |
| r | N_r1 | N_r2 | N_r3 | … | N_rc | N_r. |
| Total | N_.1 | N_.2 | N_.3 | … | N_.c | N |

TABLE 12.4.2 Two-Way Classification of a Sample of Entities

| Second Criterion of Classification Level | First Criterion Level 1 | 2 | 3 | … | c | Total |
|---|---|---|---|---|---|---|
| 1 | n_11 | n_12 | n_13 | … | n_1c | n_1. |
| 2 | n_21 | n_22 | n_23 | … | n_2c | n_2. |
| 3 | n_31 | n_32 | n_33 | … | n_3c | n_3. |
| … | … | … | … | … | … | … |
| r | n_r1 | n_r2 | n_r3 | … | n_rc | n_r. |
| Total | n_.1 | n_.2 | n_.3 | … | n_.c | n |

Calculating the Expected Frequencies The expected frequency, under

the null hypothesis that the two criteria of classification are independent, is calculated for

each cell.

We learned in Chapter 3 (see Equation 3.4.4) that if two events are independent, the

probability of their joint occurrence is equal to the product of their individual probabilities.

Under the assumption of independence, for example, we compute the probability that one

of the n subjects represented in Table 12.4.2 will be counted in Row 1 and Column 1 of the

table (that is, in Cell 11) by multiplying the probability that the subject will be counted in

Row 1 by the probability that the subject will be counted in Column 1. In the notation of the

table, the desired calculation is

$$\left(\frac{n_{1.}}{n}\right)\left(\frac{n_{.1}}{n}\right)$$

To obtain the expected frequency for Cell 11, we multiply this probability by the total number of subjects, n. That is, the expected frequency for Cell 11 is given by

$$\left(\frac{n_{1.}}{n}\right)\left(\frac{n_{.1}}{n}\right)(n)$$

Since the n in one of the denominators cancels the n in the numerator, this expression reduces to

$$\frac{(n_{1.})(n_{.1})}{n}$$

In general, then, we see that to obtain the expected frequency for a given cell, we multiply

the total of the row in which the cell is located by the total of the column in which the cell is

located and divide the product by the grand total.
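The rule just stated (row total times column total, divided by the grand total) can be sketched as a small function. The function name and the marginal totals in the usage lines are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
def expected_frequencies(row_totals, col_totals):
    """Expected cell counts under independence: (row total)(column total) / n."""
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must share one grand total")
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical 2 x 2 margins: rows total 30 and 70, columns total 40 and 60.
expected = expected_frequencies([30, 70], [40, 60])
print(expected)  # [[12.0, 18.0], [28.0, 42.0]]
```

Note that each row and each column of the expected table reproduces the corresponding marginal total, which is a quick sanity check on any hand calculation.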


Observed Versus Expected Frequencies The expected frequencies and

observed frequencies are compared. If the discrepancy is sufficiently small, the null

hypothesis is tenable. If the discrepancy is sufficiently large, the null hypothesis is rejected,

and we conclude that the two criteria of classification are not independent. The decision as

to whether the discrepancy between observed and expected frequencies is sufficiently large

to cause rejection of H₀ will be made on the basis of the size of the quantity computed when

we use Equation 12.2.4, where O_i and E_i refer, respectively, to the observed and expected

frequencies in the cells of Table 12.4.2. It would be more logical to designate the observed

and expected frequencies in these cells by O_ij and E_ij, but to keep the notation simple and to

avoid the introduction of another formula, we have elected to use the simpler notation. It

will be helpful to think of the cells as being numbered from 1 to k, where 1 refers to Cell 11

and k refers to Cell rc. It can be shown that X² as defined in this manner is distributed approximately as χ² with (r − 1)(c − 1) degrees of freedom when the null hypothesis is true. If the computed value of X² is equal to or larger than the tabulated value of χ² for some α, the null hypothesis is rejected at the α level of significance. The hypothesis testing

procedure is illustrated with the following example.

EXAMPLE 12.4.1

In 1992, the U.S. Public Health Service and the Centers for Disease Control and Prevention

recommended that all women of childbearing age consume 400 μg of folic acid daily to

reduce the risk of having a pregnancy that is affected by a neural tube defect such as spina

bifida or anencephaly. In a study by Stepanuk et al. (A-3), 693 pregnant women called a

teratology information service about their use of folic acid supplementation. The researchers

wished to determine if preconceptional use of folic acid and race are independent. The

data appear in Table 12.4.3.

Solution:

1. Data. See Table 12.4.3.

2. Assumptions. We assume that the sample available for analysis is equivalent to a simple random sample drawn from the population of interest.

TABLE 12.4.3 Race of Pregnant Caller and Use of Folic Acid

| Race | Preconceptional Use of Folic Acid: Yes | No | Total |
|---|---|---|---|
| White | 260 | 299 | 559 |
| Black | 15 | 41 | 56 |
| Other | 7 | 14 | 21 |
| Total | 282 | 354 | 636 |

Source: Kathleen M. Stepanuk, Jorge E. Tolosa, Dawneete Lewis, Victoria Meyers, Cynthia Royds, Juan Carlos Saogal, and Ron Librizzi, “Folic Acid Supplementation Use Among Women Who Contact a Teratology Information Service,” American Journal of Obstetrics and Gynecology, 187 (2002), 964-967.


3. Hypotheses.

H₀: Race and preconceptional use of folic acid are independent.

H_A: The two variables are not independent.

Let α = .05.

4. Test statistic. The test statistic is

$$X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

5. Distribution of test statistic. When H₀ is true, X² is distributed approximately as χ² with (r − 1)(c − 1) = (3 − 1)(2 − 1) = (2)(1) = 2 degrees of freedom.

6. Decision rule. Reject H₀ if the computed value of X² is equal to or greater than 5.991.

7. Calculation of test statistic. The expected frequency for the first cell is (559 × 282)/636 = 247.86. The other expected frequencies are calculated in a similar manner. Observed and expected frequencies are displayed in Table 12.4.4. From the observed and expected frequencies we may compute

$$X^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{(260 - 247.86)^2}{247.86} + \frac{(299 - 311.14)^2}{311.14} + \cdots + \frac{(14 - 11.69)^2}{11.69}$$

$$= .59461 + .47368 + \cdots + .45647 = 9.08960$$

8. Statistical decision. We reject H₀ since 9.08960 > 5.991.

9. Conclusion. We conclude that H₀ is false, and that there is a relationship between race and preconceptional use of folic acid.

10. p value. Since 7.378 < 9.08960 < 9.210, .01 < p < .025.

TABLE 12.4.4 Observed and Expected Frequencies for Example 12.4.1

| Race | Preconceptional Use of Folic Acid: Yes | No | Total |
|---|---|---|---|
| White | 260 (247.86) | 299 (311.14) | 559 |
| Black | 15 (24.83) | 41 (31.17) | 56 |
| Other | 7 (9.31) | 14 (11.69) | 21 |
| Total | 282 | 354 | 636 |
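The whole calculation for Example 12.4.1 can be sketched in a few lines. The counts for the Black and Other rows (15, 41 and 7, 14) are taken to agree with the MINITAB data columns and the SAS row percentages shown for this example; treat them as an assumption if your copy of Table 12.4.3 differs. Because the code does not round the expected frequencies, it gives 9.0913, the value reported by the software, rather than the hand-computed 9.08960.

```python
import math

# Rows: White, Black, Other. Columns: folic acid use Yes, No.
# ASSUMPTION: Black and Other counts inferred as described in the lead-in.
observed = [[260, 299], [15, 41], [7, 14]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

x2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n   # expected count under H0
        x2 += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)

# Upper-tail probability; exp(-x/2) is valid only because df == 2 here.
p_value = math.exp(-x2 / 2)

print(df, x2, p_value)
```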


Computer Analysis The computer may be used to advantage in calculating X² for tests of independence and tests of homogeneity. Figure 12.4.1 shows the procedure and printout for Example 12.4.1 when the MINITAB program for computing X² from contingency tables is used. The data were entered into MINITAB Columns 1 and 2, corresponding to the columns of Table 12.4.3.

We may use SAS® to obtain an analysis and printout of contingency table data by using the PROC FREQ statement. Figure 12.4.2 shows a partial SAS® printout reflecting the analysis of the data of Example 12.4.1.

Data:

C1: 260 15 7

C2: 299 41 14

Dialog box: Stat > Tables > Chi-square Test. Type C1-C2 in Columns containing the table. Click OK.

Session command: MTB > CHISQUARE C1-C2

Output:

Chi-Square Test: C1, C2

Expected counts are printed below observed counts (shown here in parentheses).

| Row | C1 | C2 | Total |
|---|---|---|---|
| 1 | 260 (247.86) | 299 (311.14) | 559 |
| 2 | 15 (24.83) | 41 (31.17) | 56 |
| 3 | 7 (9.31) | 14 (11.69) | 21 |
| Total | 282 | 354 | 636 |

Chi-Sq = 0.595 + 0.474 + 3.892 + 3.100 + 0.574 + 0.457 = 9.091

DF = 2, P-Value = 0.011

FIGURE 12.4.1 MINITAB procedure and output for chi-square analysis of data in Table 12.4.3.


The SAS System

The FREQ Procedure

Table of race by folic. Each cell lists Frequency / Percent / Row Pct / Col Pct; marginal totals list Frequency / Percent.

| race | folic: No | folic: Yes | Total |
|---|---|---|---|
| Black | 41 / 6.45 / 73.21 / 11.58 | 15 / 2.36 / 26.79 / 5.32 | 56 / 8.81 |
| Other | 14 / 2.20 / 66.67 / 3.95 | 7 / 1.10 / 33.33 / 2.48 | 21 / 3.30 |
| White | 299 / 47.01 / 53.49 / 84.46 | 260 / 40.88 / 46.51 / 92.20 | 559 / 87.89 |
| Total | 354 / 55.66 | 282 / 44.34 | 636 / 100.00 |

Statistics for Table of race by folic

| Statistic | Value | Prob |
|---|---|---|
| Chi-Square | 9.0913 | 0.0106 |
| Likelihood Ratio Chi-Square | 9.4808 | 0.0087 |
| Mantel-Haenszel Chi-Square | 8.9923 | 0.0027 |
| Phi Coefficient | 0.1196 | |
| Contingency Coefficient | 0.1187 | |
| Cramer’s V | 0.1196 | |

Sample Size = 636

FIGURE 12.4.2 Partial SAS® printout for the chi-square analysis of the data from Example 12.4.1.


Note that the SAS® printout shows, in each cell, the percentage that cell frequency is of its row total, its column total, and the grand total. Also shown, for each row and column total, is the percentage that the total is of the grand total. In addition to the X² statistic, SAS® gives the value of several other statistics that may be computed from contingency table data. One of these, the Mantel-Haenszel chi-square statistic, will be discussed in a later section of this chapter.

Small Expected Frequencies The problem of small expected frequencies

discussed in the previous section may be encountered when analyzing the data of

contingency tables. Although there is a lack of consensus on how to handle this problem,

many authors currently follow the rule given by Cochran (5). He suggests that for

contingency tables with more than 1 degree of freedom a minimum expectation of 1 is

allowable if no more than 20 percent of the cells have expected frequencies of less than 5.

To meet this rule, adjacent rows and/or adjacent columns may be combined when to

do so is logical in light of other considerations. If X² is based on less than 30 degrees of

freedom, expected frequencies as small as 2 can be tolerated. We did not experience the

problem of small expected frequencies in Example 12.4.1, since they were all greater

than 5.
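Cochran's rule for tables with more than 1 degree of freedom, as stated above, can be expressed as a simple check on the expected frequencies. This is only a sketch of that one rule; the function name and parameter names are hypothetical, and the second and third tables in the usage lines are made-up values for illustration.

```python
def cochran_rule_ok(expected, min_allowed=1.0, small=5.0, max_share=0.20):
    """True if no expected count is below min_allowed and at most
    max_share of the cells have expected counts below small."""
    cells = [e for row in expected for e in row]
    share_small = sum(e < small for e in cells) / len(cells)
    return min(cells) >= min_allowed and share_small <= max_share


# Expected frequencies from Table 12.4.4: all exceed 5, so the rule is met.
example_12_4_1 = [[247.86, 311.14], [24.83, 31.17], [9.31, 11.69]]
print(cochran_rule_ok(example_12_4_1))          # True

# Hypothetical tables: one small cell out of six is tolerated (about 17%),
# but an expected count below 1 is not.
print(cochran_rule_ok([[3, 12], [6, 20], [9, 30]]))    # True
print(cochran_rule_ok([[0.8, 12], [6, 20], [9, 30]]))  # False
```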

The 2 × 2 Contingency Table Sometimes each of two criteria of classification may be broken down into only two categories, or levels. When data are cross-classified in this manner, the result is a contingency table consisting of two rows and two columns. Such a table is commonly referred to as a 2 × 2 table. The value of X² may be computed by first calculating the expected cell frequencies in the manner discussed above. In the case of a 2 × 2 contingency table, however, X² may be calculated by the following shortcut formula:

$$X^2 = \frac{n(ad - bc)^2}{(a + c)(b + d)(a + b)(c + d)} \qquad (12.4.1)$$

where a, b, c, and d are the observed cell frequencies as shown in Table 12.4.5. When we apply the (r − 1)(c − 1) rule for finding degrees of freedom to a 2 × 2 table, the result is 1 degree of freedom. Let us illustrate this with an example.

TABLE 12.4.5 A 2 × 2 Contingency Table

| Second Criterion of Classification | First Criterion of Classification: 1 | 2 | Total |
|---|---|---|---|
| 1 | a | b | a + b |
| 2 | c | d | c + d |
| Total | a + c | b + d | n |


EXAMPLE 12.4.2

According to Silver and Aiello (A-4), falls are of major concern among polio survivors.

Researchers wanted to determine the impact of a fall on lifestyle changes. Table 12.4.6

shows the results of a study of 233 polio survivors on whether fear of falling resulted in

lifestyle changes.

Solution:

1. Data. From the information given we may construct the 2 × 2 contingency table displayed as Table 12.4.6.

2. Assumptions. We assume that the sample is equivalent to a simple random sample.

3. Hypotheses.

H₀: Fall status and lifestyle change because of fear of falling are independent.

H_A: The two variables are not independent.

Let α = .05.

4. Test statistic. The test statistic is

$$X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

5. Distribution of test statistic. When H₀ is true, X² is distributed approximately as χ² with (r − 1)(c − 1) = (2 − 1)(2 − 1) = (1)(1) = 1 degree of freedom.

6. Decision rule. Reject H₀ if the computed value of X² is equal to or greater than 3.841.

7. Calculation of test statistic. By Equation 12.4.1 we compute

$$X^2 = \frac{233[(131)(36) - (52)(14)]^2}{(145)(88)(183)(50)} = 31.7391$$

8. Statistical decision. We reject H₀ since 31.7391 > 3.841.

TABLE 12.4.6 Contingency Table for the Data of Example 12.4.2

| | Made Lifestyle Changes Because of Fear of Falling: Yes | No | Total |
|---|---|---|---|
| Fallers | 131 | 52 | 183 |
| Nonfallers | 14 | 36 | 50 |
| Total | 145 | 88 | 233 |

Source: J. K. Silver and D. D. Aiello, “Polio Survivors: Falls and Subsequent Injuries,” American Journal of Physical Medicine and Rehabilitation, 81 (2002), 567-570.


9. Conclusion. We conclude that H₀ is false, and that there is a relationship

between experiencing a fall and changing one’s lifestyle because of fear

of falling.

10. p value. Since 31.7391 > 7.879, p < .005.
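The shortcut formula of Equation 12.4.1 is easy to sketch in code, using the cell counts that appear in the calculation for Example 12.4.2 (a = 131, b = 52, c = 14, d = 36). The function name is illustrative. The p value relies on the identity that, for 1 degree of freedom, the upper-tail chi-square probability equals erfc(√(x/2)).

```python
import math


def chi_square_2x2(a, b, c, d):
    """Shortcut X^2 for a 2 x 2 table (Equation 12.4.1)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))


# Fallers: 131 yes, 52 no. Nonfallers: 14 yes, 36 no.
x2 = chi_square_2x2(131, 52, 14, 36)

# Upper-tail probability for 1 degree of freedom.
p_value = math.erfc(math.sqrt(x2 / 2))

print(x2, p_value)
```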

Small Expected Frequencies The problems of how to handle small expected frequencies and small total sample sizes may arise in the analysis of 2 × 2 contingency tables. Cochran (5) suggests that the χ² test should not be used if n < 20 or if 20 < n < 40 and any expected frequency is less than 5. When n ≥ 40, an expected cell frequency as small as 1 can be tolerated.

Yates’s Correction The observed frequencies in a contingency table are discrete and thereby give rise to a discrete statistic, X², which is approximated by the χ² distribution, which is continuous. Yates (6) in 1934 proposed a procedure for correcting for this in the case of 2 × 2 tables. The correction, as shown in Equation 12.4.2, consists of subtracting half the total number of observations from the absolute value of the quantity ad − bc before squaring. That is,

$$X^2_{\text{corrected}} = \frac{n(|ad - bc| - .5n)^2}{(a + c)(b + d)(a + b)(c + d)} \qquad (12.4.2)$$

It is generally agreed that no correction is necessary for larger contingency tables.

Although Yates’s correction for 2 × 2 tables has been used extensively in the past, more recent investigators have questioned its use. As a result, some practitioners recommend against its use.

We may, as a matter of interest, apply the correction to our current example. Using

Equation 12.4.2 and the data from Table 12.4.6, we may compute

$$X^2_{\text{corrected}} = \frac{233[|(131)(36) - (52)(14)| - .5(233)]^2}{(145)(88)(183)(50)} = 29.9118$$

As might be expected, with a sample this large, the difference in the two results is not

dramatic.
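For comparison, Equation 12.4.2 can be sketched alongside the uncorrected formula using the same cell counts as in the example (131, 52, 14, 36). The function names are illustrative only.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected shortcut X^2 (Equation 12.4.1)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))


def yates_chi_square_2x2(a, b, c, d):
    """X^2 with Yates's continuity correction (Equation 12.4.2)."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - 0.5 * n) ** 2
    return numerator / ((a + c) * (b + d) * (a + b) * (c + d))


uncorrected = chi_square_2x2(131, 52, 14, 36)
corrected = yates_chi_square_2x2(131, 52, 14, 36)
print(uncorrected, corrected)
```

The corrected value is always the smaller of the two whenever |ad − bc| exceeds n/2, which is why the correction makes the test more conservative.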

Tests of Independence: Characteristics The characteristics of a chi-

square test of independence that distinguish it from other chi-square tests are as follows:

1. A single sample is selected from a population of interest, and the subjects or objects

are cross-classified on the basis of the two variables of interest.

2. The rationale for calculating expected cell frequencies is based on the probability

law, which states that if two events (here the two criteria of classification) are

independent, the probability of their joint occurrence is equal to the product of their

individual probabilities.

3. The hypotheses and conclusions are stated in terms of the independence (or lack of

independence) of two variables.


EXERCISES

In the exercises that follow perform the test at the indicated level of significance and determine the p

value.

12.4.1

In the study by Silver and Aiello (A-4) cited in Example 12.4.2, a secondary objective was to

determine if the frequency of falls was independent of wheelchair use. The following table gives the

data for falls and wheelchair use among the subjects of the study.

Wheelchair Use

Yes

Fallers

121

Nonfallers

Source: J. K. Silver and D. D. Aiello, “Polio Survivors: Falls and

Subsequent Injuries,” American Journal of Physical Medicine and

Rehabilitation, 81 (2002), 567-570.

Do these data provide sufficient evidence to warrant the conclusion that wheelchair use and falling are

related? Let α = .05.

12.4.2

Sternal surgical site infection (SSI) after coronary artery bypass graft surgery is a complication that

increases patient morbidity and costs for patients, payers, and the health care system. Segal and

Anderson (A-5) performed a study that examined two types of preoperative skin preparation before

performing open heart surgery. These two preparations used aqueous iodine and insoluble iodine with

the following results.

Comparison of Aqueous

and Insoluble Preps

Prep Group

Infected

Not Infected

Aqueous iodine

Insoluble iodine

Source: Cynthia G. Segal and Jacqueline J. Anderson, “Preoperative Skin

Preparation of Cardiac Patients,” AORN Journal, 76 (2002), 821-827.

Do these data provide sufficient evidence at the α = .05 level to justify the conclusion that the type of

skin preparation and infection are related?

12.4.3

The side effects of nonsteroidal antiinflammatory drugs (NSAIDs) include problems involving peptic

ulceration, renal function, and liver disease. In 1996, the American College of Rheumatology issued

and disseminated guidelines recommending baseline tests (CBC, hepatic panel, and renal tests) when

prescribing NSAIDs. A study was conducted by Rothenberg and Holcomb (A-6) to determine if

physicians taking part in a national database of computerized medical records performed the

recommended baseline tests when prescribing NSAIDs. The researchers classified physicians in

the study into four categories—those practicing in internal medicine, family practice, academic

family practice, and multispeciality groups. The data appear in the following table.


Performed Baseline Tests

Practice Type

Yes

Internal medicine

294

921

Family practice

2862

Academic family practice

3064

Multispecialty groups

203

2652

Source: Ralph Rothenberg and John P. Holcomb, “Guidelines for Monitoring of NSAIDs: Who

Listened?,” Journal of Clinical Rheumatology, 6 (2000), 258-265.

Do the data above provide sufficient evidence for us to conclude that type of practice and

performance of baseline tests are related? Use α = .01.

12.4.4

Boles and Johnson (A-7) examined the beliefs held by adolescents regarding smoking and weight.

Respondents characterized their weight into three categories: underweight, overweight, or appropriate.

Smoking status was categorized according to the answer to the question, “Do you currently

smoke, meaning one or more cigarettes per day?” The following table shows the results of a telephone

study of adolescents in the age group 12-17.

Smoking

Yes

Underweight

Overweight

142

Appropriate

816

Source: Sharon M. Boles and Patrick B. Johnson, “Gender, Weight Concerns, and Adolescent

Smoking,” Journal of Addictive Diseases, 20 (2001), 5-14.

Do the data provide sufficient evidence to suggest that weight perception and smoking status are

related in adolescents? Let α = .05.

12.4.5

A sample of 500 college students participated in a study designed to evaluate the level of college

students’ knowledge of a certain group of common diseases. The following table shows the students

classified by major field of study and level of knowledge of the group of diseases:

Knowledge of Diseases

Major

Good

Poor

Total

Premedical

122

Other

359

378

Total

450

500

Do these data suggest that there is a relationship between knowledge of the group of diseases

and major field of study of the college students from which the present sample was drawn?

Let α = .05.

12.4.6

The following table shows the results of a survey in which the subjects were a sample of 300 adults

residing in a certain metropolitan area. Each subject was asked to indicate which of three policies they

favored with respect to smoking in public places.


Policy Favored

Smoking Allowed

Highest Education

Restrictions

in Designated

Smoking

Level

on Smoking

Areas Only

at All

Opinion Total

College graduate

High-school graduate

100

150

Grade-school graduate

Total

184

300

Can one conclude from these data that, in the sampled population, there is a relationship between

level of education and attitude toward smoking in public places? Let α = .05.

12.5

TESTS OF HOMOGENEITY

A characteristic of the examples and exercises presented in the last section is that, in each

case, the total sample was assumed to have been drawn before the entities were classified

according to the two criteria of classification. That is, the observed number of entities falling

into each cell was determined after the sample was drawn. As a result, the row and column

totals are chance quantities not under the control of the investigator. We think of the sample

drawn under these conditions as a single sample drawn from a single population. On

occasion, however, either row or column totals may be under the control of the investigator;

that is, the investigator may specify that independent samples be drawn from each of several

populations. In this case, one set of marginal totals is said to be fixed, while the other set,

corresponding to the criterion of classification applied to the samples, is random. The former

procedure, as we have seen, leads to a chi-square test of independence. The latter situation

leads to a chi-square test of homogeneity. The two situations not only involve different

sampling procedures; they lead to different questions and null hypotheses. The test of

independence is concerned with the question: Are the two criteria of classification independent?

The homogeneity test is concerned with the question: Are the samples drawn from

populations that are homogeneous with respect to some criterion of classification? In the

latter case the null hypothesis states that the samples are drawn from the same population.

Despite these differences in concept and sampling procedure, the two tests are mathematically

identical, as we see when we consider the following example.

Calculating Expected Frequencies Either the row categories or the column categories may represent the different populations from which the samples are drawn. If, for example, three populations are sampled, they may be designated as populations 1, 2, and 3, in which case these labels may serve as either row or column headings. If the variable of interest has three categories, say, A, B, and C, these labels may serve as headings for rows or columns, whichever is not used for the populations. If we use notation similar to that adopted for Table 12.4.2, the contingency table for this situation, with columns used to represent the populations, is shown as Table 12.5.1. Before computing our test statistic we need expected frequencies for each of the cells in Table 12.5.1. If the populations are indeed homogeneous, or, equivalently, if the samples are all drawn from the same population, with respect to the categories A, B, and C, our best estimate of the proportion in the combined population who belong to category A is n_A./n. By the same token, if the three populations are homogeneous, we interpret this probability as applying to each of the populations individually. For example, under the null hypothesis, n_A./n is our best estimate of the probability that a subject picked at random from the combined population will belong to category A. We would expect, then, to find n_.1(n_A./n) of those in the sample from population 1 to belong to category A, n_.2(n_A./n) of those in the sample from population 2 to belong to category A, and n_.3(n_A./n) of those in the sample from population 3 to belong to category A. These calculations yield the expected frequencies for the first row of Table 12.5.1. Similar reasoning and calculations yield the expected frequencies for the other two rows.

TABLE 12.5.1 A Contingency Table for Data for a Chi-Square Test of Homogeneity

| Variable Category | Population 1 | Population 2 | Population 3 | Total |
|---|---|---|---|---|
| A | n_A1 | n_A2 | n_A3 | n_A. |
| B | n_B1 | n_B2 | n_B3 | n_B. |
| C | n_C1 | n_C2 | n_C3 | n_C. |
| Total | n_.1 | n_.2 | n_.3 | n |

We see again that the shortcut procedure of multiplying appropriate marginal totals and dividing by the grand total yields the expected frequencies for the cells.

From the data in Table 12.5.1 we compute the following test statistic:

$$X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

EXAMPLE 12.5.1

Narcolepsy is a disease involving disturbances of the sleep-wake cycle. Members of the German Migraine and Headache Society (A-8) studied the frequency of migraine headaches in 96 subjects diagnosed with narcolepsy and 96 healthy controls. The results are shown in Table 12.5.2. We wish to know if we may conclude, on the basis of these data, that the narcolepsy population and healthy populations represented by the samples are not homogeneous with respect to migraine frequency.

TABLE 12.5.2 Frequency of Migraine Headaches by Narcolepsy Status

| | Reported Migraine Headaches: Yes | No | Total |
|---|---|---|---|
| Narcoleptic subjects | | | 96 |
| Healthy controls | | | 96 |
| Total | 40 | 152 | 192 |

Source: The DMG Study Group, “Migraine and Idiopathic Narcolepsy: A Case-Control Study,” Cephalalgia, 23 (2003), 786-789.

Solution:

1. Data. See Table 12.5.2.

2. Assumptions. We assume that we have a simple random sample from each of the two populations of interest.

3. Hypotheses.

H₀: The two populations are homogeneous with respect to migraine frequency.

H_A: The two populations are not homogeneous with respect to migraine frequency.

Let α = .05.

4. Test statistic. The test statistic is

$$X^2 = \sum \frac{(O_i - E_i)^2}{E_i}$$

5. Distribution of test statistic. If H₀ is true, X² is distributed approximately as χ² with (2 − 1)(2 − 1) = (1)(1) = 1 degree of freedom.

6. Decision rule. Reject H₀ if the computed value of X² is equal to or greater than 3.841.

7. Calculation of test statistic. The MINITAB output is shown in Figure 12.5.1.

Chi-Square Test

Expected counts are printed below observed counts

Rows: Narcolepsy    Columns: Migraine

            No       Yes       All
No                              96
         76.00     20.00     96.00
Yes                             96
         76.00     20.00     96.00
All        152        40       192
        152.00     40.00    192.00

Chi-Square = 0.126, DF = 1, P-Value = 0.722

FIGURE 12.5.1 MINITAB output for Example 12.5.1.


8. Statistical decision. Since .126 is less than the critical value of 3.841,

we are unable to reject the null hypothesis.

9. Conclusion. We conclude that the two populations may be homogeneous with respect to migraine frequency.

10. p value. From the MINITAB output we see that p = .722.
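For 1 degree of freedom the p value can be checked without tables, because a chi-square variable with 1 degree of freedom is the square of a standard normal variable. The sketch below assumes only Python's standard library; the function name is hypothetical. It starts from the rounded statistic .126, so the result may differ from the MINITAB value in the third decimal place.

```python
import math


def chi2_upper_tail_1df(x2):
    """Upper-tail probability P(chi-square with 1 df >= x2).

    With 1 degree of freedom the tail area equals P(|z| >= sqrt(x2)),
    which is erfc(sqrt(x2 / 2)).
    """
    return math.erfc(math.sqrt(x2 / 2.0))


p_value = chi2_upper_tail_1df(0.126)        # X^2 reported in Figure 12.5.1
p_at_critical = chi2_upper_tail_1df(3.841)  # critical value in the decision rule
print(round(p_value, 3), round(p_at_critical, 3))
```

The second call shows why 3.841 serves as the critical value at the .05 level: the area to its right is almost exactly .05.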

Small Expected Frequencies The rules for small expected frequencies given

in the previous section are applicable when carrying out a test of homogeneity.

In summary, the chi-square test of homogeneity has the following characteristics:

1. Two or more populations are identified in advance, and an independent sample is

drawn from each.

2. Sample subjects or objects are placed in appropriate categories of the variable of

interest.

3. The calculation of expected cell frequencies is based on the rationale that if the

populations are homogeneous as stated in the null hypothesis, the best estimate of the

probability that a subject or object will fall into a particular category of the variable of

interest can be obtained by pooling the sample data.

4. The hypotheses and conclusions are stated in terms of homogeneity (with respect to

the variable of interest) of populations.

Test of Homogeneity and H₀: p₁ = p₂

The chi-square test of homogeneity for the two-sample case provides an alternative method for testing the null hypothesis that two population proportions are equal. In Section 7.6, it will be recalled, we learned to test H₀: p₁ = p₂ against H_A: p₁ ≠ p₂ by means of the statistic

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)_0}{\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n_1} + \dfrac{\bar{p}(1 - \bar{p})}{n_2}}}$$

where p̄ is obtained by pooling the data of the two independent samples available for analysis.

Suppose, for example, that in a test of H₀: p₁ = p₂ against H_A: p₁ ≠ p₂, the sample data were as follows: n₁ = 100, p̂₁ = .60, n₂ = 120, p̂₂ = .40. When we pool the sample data we have

$$\bar{p} = \frac{.60(100) + .40(120)}{100 + 120} = \frac{108}{220} = .4909$$

and

$$z = \frac{.60 - .40}{\sqrt{\dfrac{(.4909)(.5091)}{100} + \dfrac{(.4909)(.5091)}{120}}} = 2.95469$$

which is significant at the .05 level since it is greater than the critical value of 1.96.


If we wish to test the same hypothesis using the chi-square approach, our contingency table will be

                 Characteristic Present
Sample           Yes      No       Total
1                60       40       100
2                48       72       120
Total            108      112      220

By Equation 12.4.1 we compute

$$X^2 = \frac{220[(60)(72) - (40)(48)]^2}{(108)(112)(100)(120)} = 8.7302$$

which is significant at the .05 level because it is greater than the critical value of 3.841. We see, therefore, that we reach the same conclusion by both methods. This is not surprising because, as explained in Section 12.2, χ²₍₁₎ = z². We note that 8.7302 = (2.95469)² and that 3.841 = (1.96)².
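The agreement between the two approaches can be verified numerically. The following sketch uses the sample figures given in this example (n₁ = 100, n₂ = 120, sample proportions .60 and .40); the variable names are illustrative, and the 2 × 2 shortcut formula is written out in the same form as the computation shown in the text.

```python
import math

n1, p1_hat = 100, 0.60
n2, p2_hat = 120, 0.40

# Pooled proportion and the z statistic for H0: p1 = p2.
p_bar = (p1_hat * n1 + p2_hat * n2) / (n1 + n2)
z = (p1_hat - p2_hat) / math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))

# The same data as a 2 x 2 table, using the shortcut
# X^2 = n(ad - bc)^2 / [(a + c)(b + d)(a + b)(c + d)].
a, b = 60, 40   # sample 1: yes, no
c, d = 48, 72   # sample 2: yes, no
n = a + b + c + d
x2 = n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))

print(round(z, 5), round(x2, 4), round(z ** 2, 4))
```

Squaring z reproduces X² up to floating-point error, which is the identity χ²₍₁₎ = z² at work.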

EXERCISES

In the exercises that follow perform the test at the indicated level of significance and determine the p

value.

12.5.1

Refer to the study by Carter et al. (A-9), who investigated the effect of age at onset of bipolar disorder on the course of the illness. One of the variables studied was subjects’ family history. Table 3.4.1 shows the frequency of a family history of mood disorders in the two groups of interest: early age at onset (18 years or younger) and later age at onset (later than 18 years).

Family History of Mood Disorders    Early ≤ 18 (E)    Later > 18 (L)    Total
Negative (A)
Bipolar disorder (B)
Unipolar (C)
Unipolar and bipolar (D)                                                113
Total                               141               177               318

Source: Tasha D. Carter, Emanuela Mundo, Sagar V. Parikh, and James L. Kennedy, “Early Age at Onset as a Risk Factor for Poor Outcome of Bipolar Disorder,” Journal of Psychiatric Research, 37 (2003), 297-303.

Can we conclude on the basis of these data that subjects 18 or younger differ from subjects older than 18 with respect to family histories of mood disorders? Let α = .05.


12.5.2

Coughlin et al. (A-10) examined breast and cervical screening practices of Hispanic and non-

Hispanic women in counties that approximate the U.S. southern border region. The study used data

from the Behavioral Risk Factor Surveillance System surveys of adults ages 18 years or older

conducted in 1999 and 2000. The following table shows the number of observations of Hispanic

and non-Hispanic women who had received a mammogram in the past 2 years cross-classified by

marital status.

Marital Status                                    Hispanic    Non-Hispanic    Total
Currently married                                 319         738             1057
Divorced or separated                             130         329             459
Widowed                                           88          402             490
Never married or living as an unmarried couple    41          95              136
Total                                             578         1564            2142

Source: Steven S. Coughlin, Robert J. Uhler, Thomas Richards, and Katherine

M. Wilson, “Breast and Cervical Cancer Screening Practices Among Hispanic

and Non-Hispanic Women Residing Near the United States-Mexico Border,

1999-2000,” Family and Community Health, 26, (2003), 130-139.

We wish to know if we may conclude on the basis of these data that marital status and ethnicity

(Hispanic and non-Hispanic) in border counties of the southern United States are not homogeneous.

Let α = .05.

12.5.3

Swor et al. (A-11) examined the effectiveness of cardiopulmonary resuscitation (CPR) training in

people over 55 years of age. They compared the skill retention rates of subjects in this age group who

completed a course in traditional CPR instruction with those who received chest-compression-only

cardiopulmonary resuscitation (CC-CPR). Independent groups were tested 3 months after training.

Among the 27 subjects receiving traditional CPR, 12 were rated as competent. In the CC-CPR group,

15 out of 29 were rated competent. Do these data provide sufficient evidence for us to conclude that

the two populations are not homogeneous with respect to competency rating 3 months after training?

Let α = .05.

12.5.4

In an air pollution study, a random sample of 200 households was selected from each of two

communities. A respondent in each household was asked whether or not anyone in the household was

bothered by air pollution. The responses were as follows:

                 Any Member of Household Bothered by Air Pollution?
Community        Yes      No       Total
I                43       157      200
II               81       119      200
Total            124      276      400

Can the researchers conclude that the two communities differ with respect to the variable of interest?

Let α = .05.


12.5.5

In a simple random sample of 250 industrial workers with cancer, researchers found that 102 had

worked at jobs classified as “high exposure” with respect to suspected cancer-causing agents. Of the

remainder, 84 had worked at “moderate exposure” jobs, and 64 had experienced no known exposure

because of their jobs. In an independent simple random sample of 250 industrial workers from

the same area who had no history of cancer, 31 worked in “high exposure” jobs, 60 worked in

“moderate exposure” jobs, and 159 worked in jobs involving no known exposure to suspected cancer-

causing agents. Does it appear from these data that persons working in jobs that expose them to

suspected cancer-causing agents have an increased risk of contracting cancer? Let α = .05.

12.6

THE FISHER EXACT TEST

Sometimes we have data that can be summarized in a 2 × 2 contingency table, but these

data are derived from very small samples. The chi-square test is not an appropriate method

of analysis if minimum expected frequency requirements are not met. If, for example, n is

less than 20 or if n is between 20 and 40 and one of the expected frequencies is less than 5,

the chi-square test should be avoided.

A test that may be used when the size requirements of the chi-square test are not met

was proposed in the mid-1930s almost simultaneously by Fisher (7,8), Irwin (9), and Yates

(10). The test has come to be known as the Fisher exact test. It is called exact because, if

desired, it permits us to calculate the exact probability of obtaining the observed results or

results that are more extreme.

Data Arrangement When we use the Fisher exact test, we arrange the data in the form of a 2 × 2 contingency table like Table 12.6.1. We arrange the frequencies in such a way that A > B and choose the characteristic of interest so that a/A > b/B.

Some theorists believe that Fisher’s exact test is appropriate only when both marginal

totals of Table 12.6.1 are fixed by the experiment. This specific model does not appear to

arise very frequently in practice. Many experimenters, therefore, use the test when both

marginal totals are not fixed.

Assumptions The following are the assumptions for the Fisher exact test.

1. The data consist of A sample observations from population 1 and B sample observations from population 2.

2. The samples are random and independent.

3. Each observation can be categorized as one of two mutually exclusive types.

TABLE 12.6.1
A 2 × 2 Contingency Table for the Fisher Exact Test

Sample     With Characteristic     Without Characteristic     Total
1          a                       A − a                      A
2          b                       B − b                      B
Total      a + b                   A + B − a − b              A + B


Hypotheses The following are the null hypotheses that may be tested and their

alternatives.

(Two-sided)

H₀: The proportion with the characteristic of interest is the same in both populations; that is, p₁ = p₂.

H_A: The proportion with the characteristic of interest is not the same in both populations; p₁ ≠ p₂.

(One-sided)

H₀: The proportion with the characteristic of interest in population 1 is less than or the same as the proportion in population 2; p₁ ≤ p₂.

H_A: The proportion with the characteristic of interest is greater in population 1 than in population 2; p₁ > p₂.

Test Statistic The test statistic is b, the number in sample 2 with the characteristic

of interest.

Decision Rule Finney (11) has prepared critical values of b for A ≤ 15. Latscha (12) has extended Finney’s tables to accommodate values of A up to 20. Appendix Table J gives these critical values of b for A between 3 and 20, inclusive. Significance levels of .05, .025, .01, and .005 are included. The specific decision rules are as follows:

1. Two-sided test. Enter Table J with A, B, and a. If the observed value of b is equal to or less than the integer in a given column, reject H₀ at a level of significance equal to twice the significance level shown at the top of that column. For example, suppose A = 8, B = 7, a = 7, and the observed value of b is 1. We can reject the null hypothesis at the 2(.05) = .10, the 2(.025) = .05, and the 2(.01) = .02 levels of significance, but not at the 2(.005) = .01 level.

2. One-sided test. Enter Table J with A, B, and a. If the observed value of b is less than or equal to the integer in a given column, reject H₀ at the level of significance shown at the top of that column. For example, suppose that A = 16, B = 8, a = 4, and the observed value of b is 3. We can reject the null hypothesis at the .05 and .025 levels of significance, but not at the .01 or .005 levels.

Large-Sample Approximation For sufficiently large samples we can test the null hypothesis of the equality of two population proportions by using the normal approximation. Compute

$$z = \frac{(a/A) - (b/B)}{\sqrt{\hat{p}(1 - \hat{p})(1/A + 1/B)}} \qquad (12.6.1)$$

where

$$\hat{p} = (a + b)/(A + B) \qquad (12.6.2)$$

and compare it for significance with appropriate critical values of the standard normal distribution. The use of the normal approximation is generally considered satisfactory if a, b, A − a, and B − b are all greater than or equal to 5. Alternatively, when sample sizes are sufficiently large, we may test the null hypothesis by means of the chi-square test.
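As a rough sketch of Equations 12.6.1 and 12.6.2, the function below returns z together with a flag for the rule of thumb that a, b, A − a, and B − b should all be at least 5. The function name and both sets of counts are hypothetical and serve only as illustration.

```python
import math


def fisher_normal_approx(a, A, b, B):
    """Return z of Equation 12.6.1 and whether every cell count is at least 5."""
    cells_ok = min(a, b, A - a, B - b) >= 5
    p_hat = (a + b) / (A + B)  # Equation 12.6.2
    z = (a / A - b / B) / math.sqrt(p_hat * (1 - p_hat) * (1 / A + 1 / B))
    return z, cells_ok


# Hypothetical counts, large enough for the approximation.
z, cells_ok = fisher_normal_approx(a=30, A=50, b=18, B=45)
print(round(z, 3), cells_ok)

# Hypothetical small-sample counts: two cells (b = 2 and A - a = 4) fall
# below 5, so the flag signals that an exact test is the safer choice.
_, small_ok = fisher_normal_approx(a=8, A=12, b=2, B=9)
print(small_ok)
```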

Further Reading The Fisher exact test has been the subject of some controversy

among statisticians. Some feel that the assumption of fixed marginal totals is unrealistic in

most practical applications. The controversy then centers around whether the test is

appropriate when both marginal totals are not fixed. For further discussion of this and other

points, see the articles by Barnard (13-15), Fisher (16), and Pearson (17).

Sweetland (18) compared the results of using the chi-square test with those obtained

using the Fisher exact test for samples of size A + B = 3 to A + B = 69. He found close

agreement when A and B were close in size and the test was one-sided.

Carr (19) presents an extension of the Fisher exact test to more than two samples of

equal size and gives an example to demonstrate the calculations. Neave (20) presents the

Fisher exact test in a new format; the test is treated as one of independence rather than of

homogeneity. He has prepared extensive tables for use with his approach.

The sensitivity of Fisher’s exact test to minor perturbations in 2 × 2 contingency tables is discussed by Dupont (21).

EXAMPLE 12.6.1

The purpose of a study by Justesen et al. (A-12) was to evaluate the long-term efficacy of taking indinavir/ritonavir twice a day in combination with two nucleoside reverse transcriptase inhibitors among HIV-positive subjects who were divided into two groups. Group 1 consisted of patients who had no history of taking protease inhibitors (PI Naïve). Group 2 consisted of patients who had a previous history of taking a protease inhibitor (PI Experienced). Table 12.6.2 shows whether these subjects remained on the regimen for the 120 weeks of follow-up. We wish to know if we may conclude that patients classified as group 1 have a lower probability than subjects in group 2 of remaining on the regimen for 120 weeks.

TABLE 12.6.2
Regimen Status at 120 Weeks for PI Naïve and PI Experienced Subjects Taking Indinavir/Ritonavir as Described in Example 12.6.1

                        Remained in the Regimen for 120 Weeks
                        Total     Yes     No
1 (PI Naïve)            9         2       7
2 (PI Experienced)      12        8       4
Total                   21        10      11

Source: U.S. Justesen, A. M. Lervfing, A. Thomsen, J. A. Lindberg, C. Pedersen, and P. Tauris, “Low-Dose Indinavir in Combination with Low-Dose Ritonavir: Steady-State Pharmacokinetics and Long-Term Clinical Outcome Follow-Up,” HIV Medicine, 4 (2003), 250-254.


TABLE 12.6.3
Data of Table 12.6.2 Rearranged to Conform to the Layout of Table 12.6.1

                        Remained in Regimen for 120 Weeks
                        Yes             No                      Total
2 (PI Experienced)      8 = a           4 = A − a               12 = A
1 (PI Naïve)            2 = b           7 = B − b               9 = B
Total                   10 = a + b      11 = A + B − a − b      21 = A + B

Solution:

1. Data. The data as reported are shown in Table 12.6.2. Table 12.6.3 shows the data rearranged to conform to the layout of Table 12.6.1. Remaining on the regimen is the characteristic of interest.

2. Assumptions. We presume that the assumptions for application of the Fisher exact test are met.

3. Hypotheses.

H₀: The proportion of subjects remaining 120 weeks on the regimen in a population of patients classified as group 2 is the same as or less than the proportion of subjects remaining on the regimen 120 weeks in a population classified as group 1.

H_A: Group 2 patients have a higher rate than group 1 patients of remaining on the regimen for 120 weeks.

4. Test statistic. The test statistic is the observed value of b as shown in Table 12.6.3.

5. Distribution of test statistic. We determine the significance of b by consulting Appendix Table J.

6. Decision rule. Suppose we let α = .05. The decision rule, then, is to reject H₀ if the observed value of b is equal to or less than 1, the value of b in Table J for A = 12, B = 9, a = 8, and α = .05.

7. Calculation of test statistic. The observed value of b, as shown in Table 12.6.3, is 2.

8. Statistical decision. Since 2 > 1, we fail to reject H₀.

9. Conclusion. Since we fail to reject H₀, we conclude that the null hypothesis may be true. That is, it may be true that the rate of remaining on the regimen for 120 weeks is the same or less for the PI experienced group compared to the PI naïve group.

10. p value. We see in Table J that when A = 12, B = 9, a = 8, the value of b = 2 has an exact probability of occurring by chance alone, when H₀ is true, greater than .05.


PI * Remained Cross-Tabulation

Count
                        Remained
                        Yes      No       Total
PI    Experienced       8        4        12
      Naive             2        7        9
Total                   10       11       21

Chi-Square Tests

                                Value      df    Asymp. Sig.    Exact Sig.    Exact Sig.
                                                 (2-sided)      (2-sided)     (1-sided)
Pearson Chi-Square              4.073^b    1     .044
Continuity Correction^a         2.486      1     .115
Likelihood Ratio                4.253      1     .039
Fisher’s Exact Test                                             .080          .05
Linear-by-Linear Association    3.879      1     .049
N of Valid Cases                21

a. Computed only for a 2 × 2 table
b. 2 cells (50.0%) have expected count less than 5. The minimum expected count is 4.29.

FIGURE 12.6.1 SPSS output for Example 12.6.1.

Various statistical software programs perform the calculations for the Fisher exact

test. Figure 12.6.1 shows the results of Example 12.6.1 as computed by SPSS. The exact p

value is provided for both a one-sided and a two-sided test. Based on these results, we fail to

reject H₀ (p value >.05), just as we did using the statistical tables in the Appendix. Note

that in addition to the Fisher exact test several alternative tests are provided. The reader

should be aware that these alternative tests are not appropriate if the assumptions underlying them have been violated.
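The exact probabilities behind the Fisher exact test come from the hypergeometric distribution, and for a table as small as Table 12.6.3 they can be enumerated directly. The sketch below is an illustration using only Python's standard library, not the procedure used by SPSS; the function name is hypothetical, and the two-sided rule shown (summing all tables no more probable than the observed one) is one common convention among several.

```python
from math import comb


def hypergeom_pmf(k, A, B, yes_total):
    """P(b = k): k of the yes_total 'yes' outcomes fall in the sample of size B."""
    return comb(yes_total, k) * comb(A + B - yes_total, B - k) / comb(A + B, B)


A, a = 12, 8   # PI experienced: sample size, number remaining on regimen
B, b = 9, 2    # PI naive: sample size, number remaining on regimen
yes_total = a + b

lowest = max(0, yes_total - A)
highest = min(B, yes_total)
probs = {k: hypergeom_pmf(k, A, B, yes_total) for k in range(lowest, highest + 1)}

# One-sided: values of b as small as, or smaller than, the observed value.
p_one_sided = sum(p for k, p in probs.items() if k <= b)

# Two-sided (one common convention): every table that is no more
# probable than the observed table.
p_obs = probs[b]
p_two_sided = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))

print(round(p_one_sided, 4), round(p_two_sided, 4))
```

Both probabilities exceed .05, in line with the decision reached from Table J, and the two-sided value can be compared with the .080 shown in Figure 12.6.1.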

EXERCISES

12.6.1

The goal of a study by Tahmassebi and Curzon (A-13) was to determine if drooling in children

with cerebral palsy is due to hypersalivation. One of the procedures toward that end was to examine

the salivary buffering capacity of cerebral palsied children and controls. The following table gives

the results.


                    Buffering Capacity
Group               Medium      High
Cerebral palsy
Control

Source: J. F. Tahmassebi and M. E. J. Curzon, “The Cause of Drooling in

Children with Cerebral Palsy—Hypersalivation or Swallowing Defect?”

International Journal of Paediatric Dentistry, 13 (2003), 106-111.

Test for a significant difference between cerebral palsied children and controls with respect to high or

low buffering capacity. Let α = .05 and find the p value.

12.6.2

In a study by Xiao and Shi (A-14), researchers studied the effect of cranberry juice in the treatment

and prevention of Helicobacter pylori infection in mice. The eradication of Helicobacter pylori

results in the healing of peptic ulcers. Researchers compared treatment with cranberry juice to “triple therapy” (amoxicillin, bismuth subcitrate, and metronidazole) in mice infected with Helicobacter

pylori. After 4 weeks, they examined the mice to determine the frequency of eradication of the

bacterium in the two treatment groups. The following table shows the results.

                    No. of Mice with Helicobacter pylori Eradicated
                    Yes      No
Triple therapy
Cranberry juice

Source: Shu Dong Xiao and Tong Shi, “Is Cranberry Juice Effective in the Treatment and

Prevention of Helicobacter Pylori Infection of Mice,” Chinese Journal of Digestive Diseases,

(2003), 136-139.

May we conclude, on the basis of these data, that triple therapy is more effective than cranberry juice

at eradication of the bacterium? Let α = .05 and find the p value.

12.6.3

In a study by Shaked et al. (A-15), researchers studied 26 children with blunt pancreatic injuries.

These injuries occurred from a direct blow to the abdomen, bicycle handlebars, fall from height, or

car accident. Nineteen of the patients were classified as having minor injuries, and seven were

classified as having major injuries. Pseudocyst formation was suspected when signs of clinical

deterioration developed, such as increased abdominal pain, epigastric fullness, fever, and increased

pancreatic enzyme levels. In the major injury group, six of the seven children developed pseudocysts

while in the minor injury group, three of the 19 children developed pseudocysts. Is this sufficient

evidence to allow us to conclude that the proportion of children developing pseudocysts is higher in

the major injury group than in the minor injury group? Let α = .01.

12.7

RELATIVE RISK, ODDS RATIO, AND

THE MANTEL-HAENSZEL STATISTIC

In Chapter 8 we learned to use analysis of variance techniques to analyze data that arise

from designed experiments, investigations in which at least one variable is manipulated

in some way. Designed experiments, of course, are not the only sources of data that are


of interest to clinicians and other health sciences professionals. Another important class of

scientific investigation that is widely used is the observational study.

DEFINITION

An observational study is a scientific investigation in which neither the

subjects under study nor any of the variables of interest are manipulated

in any way.

An observational study, in other words, may be defined simply as an investigation

that is not an experiment. The simplest form of observational study is one in which there are

only two variables of interest. One of the variables is called the risk factor, or independent

variable, and the other variable is referred to as the outcome, or dependent variable.

DEFINITION

The term risk factor is used to designate a variable that is thought to be

related to some outcome variable. The risk factor may be a suspected

cause of some specific state of the outcome variable.

In a particular investigation, for example, the outcome variable might be subjects’

status relative to cancer and the risk factor might be their status with respect to cigarette

smoking. The model is further simplified if the variables are categorical with only two

categories per variable. For the outcome variable the categories might be cancer present

and cancer absent. With respect to the risk factor subjects might be categorized as smokers

and nonsmokers.

When the variables in observational studies are categorical, the data pertaining to

them may be displayed in a contingency table, and hence the inclusion of the topic in the

present chapter. We shall limit our discussion to the situation in which the outcome variable

and the risk factor are both dichotomous variables.

Types of Observational Studies There are two basic types of observational

studies, prospective studies and retrospective studies.

DEFINITION

A prospective study is an observational study in which two random

samples of subjects are selected. One sample consists of subjects who

possess the risk factor, and the other sample consists of subjects who do

not possess the risk factor. The subjects are followed into the future (that

is, they are followed prospectively), and a record is kept on the number of

subjects in each sample who, at some point in time, are classifiable into

each of the categories of the outcome variable.

The data resulting from a prospective study involving two dichotomous variables can be displayed in a 2 × 2 contingency table that usually provides information regarding the number of subjects with and without the risk factor and the number who did and did not succumb to the disease of interest as well as the frequencies for each combination of categories of the two variables.

TABLE 12.7.1
Classification of a Sample of Subjects with Respect to Disease Status and Risk Factor

                  Disease Status
Risk Factor       Present     Absent      Total at Risk
Present           a           b           a + b
Absent            c           d           c + d
Total             a + c       b + d       n

DEFINITION

A retrospective study is the reverse of a prospective study. The samples are

selected from those falling into the categories of the outcome variable.

The investigator then looks back (that is, takes a retrospective look) at the

subjects and determines which ones have (or had) and which ones do not

have (or did not have) the risk factor.

From the data of a retrospective study we may construct a contingency table with

frequencies similar to those that are possible for the data of a prospective study.

In general, the prospective study is more expensive to conduct than the retrospective

study. The prospective study, however, more closely resembles an experiment.

Relative Risk The data resulting from a prospective study in which the dependent variable and the risk factor are both dichotomous may be displayed in a 2 × 2 contingency table such as Table 12.7.1. The risk of the development of the disease among the subjects with the risk factor is a/(a + b). The risk of the development of the disease among the subjects without the risk factor is c/(c + d). We define relative risk as follows.

DEFINITION

Relative risk is the ratio of the risk of developing a disease among subjects

with the risk factor to the risk of developing the disease among subjects

without the risk factor.

We represent the relative risk from a prospective study symbolically as

$$\widehat{RR} = \frac{a/(a + b)}{c/(c + d)} \qquad (12.7.1)$$

where a, b, c, and d are as defined in Table 12.7.1, and $\widehat{RR}$ indicates that the relative risk is computed from a sample to be used as an estimate of the relative risk, RR, for the population from which the sample was drawn.


We may construct a confidence interval for RR

$$100(1 - \alpha)\%\ \text{CI} = \widehat{RR}^{\,1 \pm (z_\alpha/\sqrt{X^2})} \qquad (12.7.2)$$

where z_α is the two-sided z value corresponding to the chosen confidence coefficient and X² is computed by Equation 12.4.1.

Interpretation of RR The value of RR may range anywhere between zero and

infinity. A value of 1 indicates that there is no association between the status of the risk

factor and the status of the dependent variable. In most cases the two possible states of

the dependent variable are disease present and disease absent. We interpret an RR of 1 to

mean that the risk of acquiring the disease is the same for those subjects with the risk

factor and those without the risk factor. A value of RR greater than 1 indicates that the

risk of acquiring the disease is greater among subjects with the risk factor than among

subjects without the risk factor. An RR value that is less than 1 indicates less risk of

acquiring the disease among subjects with the risk factor than among subjects without

the risk factor. For example, a relative risk of 2 is taken to mean that those subjects with the risk factor are twice as likely to acquire the disease as compared to subjects without the risk factor.

We illustrate the calculation of relative risk by means of the following example.

EXAMPLE 12.7.1

In a prospective study of pregnant women, Magann et al. (A-16) collected extensive

information on exercise level of low-risk pregnant working women. A group of 217 women

did no voluntary or mandatory exercise during the pregnancy, while a group of 238 women

exercised extensively. One outcome variable of interest was experiencing preterm labor.

The results are summarized in Table 12.7.2.

We wish to estimate the relative risk of preterm labor when pregnant women exercise

extensively.

Solution: By Equation 12.7.1 we compute

$$\widehat{RR} = \frac{22/238}{18/217} = \frac{.0924}{.0829} = 1.1$$

TABLE 12.7.2
Subjects with and without the Risk Factor Who Became Cases of Preterm Labor

Risk Factor           Cases of Preterm Labor    Noncases of Preterm Labor    Total
Extreme exercising    22                        216                          238
Not exercising        18                        199                          217
Total                 40                        415                          455

Source: Everett F. Magann, Sharon F. Evans, Beth Weitz, and John Newnham, “Antepartum, Intrapartum, and Neonatal Significance of Exercise on Healthy Low-Risk Pregnant Working Women,” Obstetrics and Gynecology, 99 (2002), 466-472.


Odds Ratio and Relative Risk Section

                   Common        Original      Iterated      Log Odds     Relative
Parameter          Odds Ratio    Odds Ratio    Odds Ratio    Ratio        Risk
Upper 95% C.L.                   2.1350        2.2683        0.7585       2.1
Estimate                         1.1260        1.1207        0.1140       1.1144
Lower 95% C.L.                   0.5883        0.5606        −0.5305      0.5896

FIGURE 12.7.1 NCSS output for the data in Example 12.7.1.

These data indicate that the risk of experiencing preterm labor when a woman

exercises heavily is 1.1 times as great as it is among women who do not

exercise at all.

We compute the 95 percent confidence interval for RR as follows. By Equation 12.4.1, we compute from the data in Table 12.7.2:

$$X^2 = \frac{455[(22)(199) - (216)(18)]^2}{(40)(415)(238)(217)} = .1274$$

By Equation 12.7.2, the lower and upper confidence limits are, respectively, $1.1^{1 - 1.96/\sqrt{.1274}} = .65$ and $1.1^{1 + 1.96/\sqrt{.1274}} = 1.86$. Since the interval includes 1, we conclude, at the .05 level of significance, that the population relative risk may be 1. In other words, we conclude that, in the population, there may not be an increased risk of experiencing preterm labor when a pregnant woman exercises extensively.

The data were processed by NCSS. The results are shown in Figure

12.7.1. The relative risk calculation is shown in the column at the far right of

the output, along with the 95% confidence limits. Because of rounding errors,

these values differ slightly from those given in the example.
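The hand calculations of this example can be retraced with a short Python sketch. The counts are those of Table 12.7.2; the helper name `exponent_ci` is hypothetical. To mirror the worked example, the interval is formed from the relative risk rounded to 1.1; using the unrounded estimate would give somewhat different limits.

```python
import math

# Counts from Table 12.7.2.
a, b = 22, 216   # extreme exercising: cases, noncases
c, d = 18, 199   # not exercising: cases, noncases
n = a + b + c + d

rr = (a / (a + b)) / (c / (c + d))  # Equation 12.7.1
x2 = n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))


def exponent_ci(estimate, x2, z=1.96):
    """Limits estimate ** (1 -/+ z / sqrt(X^2)), as in Equation 12.7.2."""
    half = z / math.sqrt(x2)
    limits = (estimate ** (1 - half), estimate ** (1 + half))
    return min(limits), max(limits)


# The worked example rounds the relative risk to 1.1 before forming the interval.
lower, upper = exponent_ci(round(rr, 1), x2)
print(round(rr, 4), round(x2, 4), round(lower, 2), round(upper, 2))
```

The `min`/`max` step matters only when the estimate is below 1, in which case the two exponents produce the limits in reverse order.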

Odds Ratio When the data to be analyzed come from a retrospective study, relative risk is not a meaningful measure for comparing two groups. As we have seen, a retrospective study is based on a sample of subjects with the disease (cases) and a separate sample of subjects without the disease (controls or noncases). We then retrospectively determine the distribution of the risk factor among the cases and controls. Given the results of a retrospective study involving two samples of subjects, cases and controls, we may display the data in a 2 × 2 table such as Table 12.7.3, in which subjects are dichotomized with respect to the presence and absence of the risk factor. Note that the column headings in Table 12.7.3 differ from those in Table 12.7.1 to emphasize the fact that the data are from a retrospective study and that the subjects were selected because they were either cases or controls. When the data from a retrospective study are displayed as in Table 12.7.3, the ratio a/(a + b), for example, is not an estimate of the risk of disease for subjects with the risk factor. The appropriate measure for comparing cases and controls in a retrospective study is the odds ratio. As noted in Chapter 11, in order to understand the concept of the odds ratio, we must understand the term odds, which is frequently used by those who place bets on the outcomes of sporting events or participate in other types of gambling activities.

TABLE 12.7.3
Subjects of a Retrospective Study Classified According to Status Relative to a Risk Factor and Whether They Are Cases or Controls

                  Sample
Risk Factor       Cases       Controls    Total
Present           a           b           a + b
Absent            c           d           c + d
Total             a + c       b + d       n

DEFINITION

The odds for success are the ratio of the probability of success to the

probability of failure.

We use this definition of odds to define two odds that we can calculate from data

displayed as in Table 12.7.3:

1. The odds of being a case (having the disease) to being a control (not having the disease) among subjects with the risk factor is [a/(a + b)]/[b/(a + b)] = a/b.

2. The odds of being a case (having the disease) to being a control (not having the disease) among subjects without the risk factor is [c/(c + d)]/[d/(c + d)] = c/d.

We now define the odds ratio that we may compute from the data of a retrospective study. We use the symbol $\widehat{OR}$ to indicate that the measure is computed from sample data and used as an estimate of the population odds ratio, OR.

DEFINITION

The estimate of the population odds ratio is

$$\widehat{OR} = \frac{a/b}{c/d} = \frac{ad}{bc} \tag{12.7.3}$$

where a, b, c, and d are as defined in Table 12.7.3.
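As a minimal sketch of Equation 12.7.3, the function below computes the two odds and their ratio from the four cell counts. The function name `odds_ratio` and the counts passed to it are hypothetical and serve only to illustrate the arithmetic; they do not come from any study in this chapter.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Sample odds ratio (a/b)/(c/d) = ad/(bc) for a 2 x 2 table.

    Cell layout follows Table 12.7.3: a, b = cases, controls with the
    risk factor; c, d = cases, controls without it.
    """
    if 0 in (b, c, d):
        # a/b needs b > 0, c/d needs d > 0, and dividing by c/d needs c > 0.
        raise ValueError("odds ratio is undefined when b, c, or d is zero")
    odds_with_factor = a / b        # odds of being a case, factor present
    odds_without_factor = c / d     # odds of being a case, factor absent
    return odds_with_factor / odds_without_factor


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = odds_ratio(a=10, b=20, c=5, d=40)
print(example)
```

With these illustrative counts the odds are 10/20 and 5/40, so the ratio is 4: the odds of being a case are four times as large when the risk factor is present.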

We may construct a confidence interval for OR by the following method:

$$100(1 - \alpha)\%\ \text{CI} = \widehat{OR}^{\,1 \pm \left(z_{\alpha}/\sqrt{X^{2}}\right)} \tag{12.7.4}$$

where $z_{\alpha}$ is the two-sided z value corresponding to the chosen confidence coefficient and X² is computed by Equation 12.4.1.


Interpretation of the Odds Ratio In the case of a rare disease, the population odds ratio provides a good approximation to the population relative risk. Consequently, the sample odds ratio, being an estimate of the population odds ratio, provides an indirect estimate of the population relative risk in the case of a rare disease.
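The rare-disease approximation can be seen from the two formulas themselves. As a sketch, suppose the population were cross-classified by risk factor and disease status with cell frequencies a, b, c, and d, where a and c count the diseased subjects (this population-level reading of the symbols is an assumption made here only for illustration). Then

$$RR = \frac{a/(a + b)}{c/(c + d)} = \frac{a(c + d)}{c(a + b)} \approx \frac{ad}{bc} = OR \quad \text{when } a \ll b \text{ and } c \ll d,$$

because a + b is then close to b and c + d is close to d. The more common the disease, the further the odds ratio lies from the relative risk (away from 1).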

The odds ratio can assume values between zero and infinity. A value of 1 indicates no

association between the risk factor and disease status. A value less than 1 indicates reduced

odds of the disease among subjects with the risk factor. A value greater than 1 indicates

increased odds of having the disease among subjects in whom the risk factor is present.

EXAMPLE 12.7.2

Toschke et al. (A-17) collected data on obesity status of children ages 5-6 years and the

smoking status of the mother during the pregnancy. Table 12.7.4 shows 3970 subjects

classified as cases or noncases of obesity and also classified according to smoking status of

the mother during pregnancy (the risk factor). We wish to compare the odds of obesity at

ages 5-6 among those whose mother smoked throughout the pregnancy with the odds of

obesity at ages 5-6 among those whose mother did not smoke during pregnancy.

Solution: The odds ratio is the appropriate measure for answering the question posed. By Equation 12.7.3 we compute

$$\widehat{OR} = \frac{(64)(3496)}{(342)(68)} = 9.62$$

We see that the odds of having had a mother who smoked throughout the pregnancy are 9.62 times as great for obese children (cases) as for nonobese children (noncases).

We compute the 95 percent confidence interval for OR as follows. By Equation 12.4.1 we compute from the data in Table 12.7.4

$$X^{2} = \frac{3970[(64)(3496) - (342)(68)]^{2}}{(132)(3838)(406)(3564)} = 217.6831$$

TABLE 12.7.4 Subjects Classified According to Obesity Status and Mother’s Smoking Status during Pregnancy

                            Obesity Status
Smoking Status
During Pregnancy      Cases    Noncases    Total
Smoked throughout        64         342      406
Never smoked             68        3496     3564
Total                   132        3838     3970

Source: A. M. Toschke, S. M. Montgomery, U. Pfeiffer, and R. von Kries, “Early Intrauterine Exposure to Tobacco-Inhaled Products and Obesity,” American Journal of Epidemiology, 158 (2003), 1068-1074.


Smoking_status * Obesity_status Cross-Tabulation

Count
                                        Obesity status
                                      Cases    Noncases    Total
Smoking_status   Smoked throughout       64         342      406
                 Never smoked            68        3496     3564
Total                                   132        3838     3970

Risk Estimate
                                                    95% Confidence Interval
                                           Value       Lower       Upper
Odds Ratio for Smoking_status
  (Smoked throughout / Never smoked)       9.621       6.719      13.775
For cohort Obesity_status Cases            8.262       5.966      11.441
For cohort Obesity_status Noncases          .859        .823        .896
N of Valid Cases                            3970

FIGURE 12.7.2 SPSS output for Example 12.7.2.

The lower and upper confidence limits for the population OR, respectively, are $9.62^{\,1 - 1.96/\sqrt{217.6831}} = 7.12$ and $9.62^{\,1 + 1.96/\sqrt{217.6831}} = 13.00$. We conclude with 95 percent confidence that the population OR is somewhere between 7.12 and 13.00. Because the interval does not include 1, we conclude that, in the population, obese children (cases) are more likely than nonobese children (noncases) to have had a mother who smoked throughout the pregnancy.

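The data from Example 12.7.2 were processed using SPSS. The results are shown in Figure 12.7.2. The odds ratio, along with its 95% confidence limits, is shown in the top line of the Risk Estimate box. The odds ratio agrees with the value in the example apart from rounding. The confidence limits (6.719 and 13.775) differ from those in the example because SPSS bases its interval on the standard error of the natural logarithm of the odds ratio rather than on Equation 12.7.4.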
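One way to account for the SPSS limits in Figure 12.7.2 is to compute the interval on the logarithmic scale, using the large-sample standard error of ln(OR), namely the square root of 1/a + 1/b + 1/c + 1/d. This log-based interval is not the method of Equation 12.7.4. It is shown here only as a sketch, under the assumption that the software uses it, and the function name is hypothetical.

```python
import math


def log_based_or_interval(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio with a confidence interval built on the log scale.

    Assumes the large-sample standard error
    sqrt(1/a + 1/b + 1/c + 1/d) for ln(OR).
    """
    or_hat = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(or_hat)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return or_hat, lower, upper


# Cell counts from Example 12.7.2.
or_hat, lower, upper = log_based_or_interval(64, 342, 68, 3496)
print(f"OR = {or_hat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

For the data of Example 12.7.2 these limits fall within about .001 of the SPSS values of 6.719 and 13.775, whereas Equation 12.7.4 gives 7.12 and 13.00.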

The Mantel-Haenszel Statistic Frequently when we are studying the relationship between the status of some disease and the status of some risk factor, we are


aware of another variable that may be associated with the disease, with the risk factor,

or with both in such a way that the true relationship between the disease status and the

risk factor is masked. Such a variable is called a confounding variable. For example,

experience might indicate the possibility that the relationship between some disease

and a suspected risk factor differs among different ethnic groups. We would then treat

ethnic membership as a confounding variable. When they can be identified, it is

desirable to control for confounding variables so that an unambiguous measure of the

relationship between disease status and risk factor may be calculated. A technique for

accomplishing this objective is the Mantel-Haenszel (22) procedure, so called in

recognition of the two men who developed it. The procedure allows us to test the null

hypothesis that there is no association between status with respect to disease and risk

factor status. Initially used only with data from retrospective studies, the Mantel-Haenszel procedure is also appropriate for use with data from prospective studies, as

discussed by Mantel (23).

In the application of the Mantel-Haenszel procedure, case and control subjects are

assigned to strata corresponding to different values of the confounding variable. The data

are then analyzed within individual strata as well as across all strata. The discussion that

follows assumes that the data under analysis are from a retrospective or a prospective study

with case and noncase subjects classified according to whether they have or do not have the

suspected risk factor. The confounding variable is categorical, with the different categories

defining the strata. If the confounding variable is continuous it must be categorized. For

example, if the suspected confounding variable is age, we might group subjects into

mutually exclusive age categories. The data before stratification may be displayed as

shown in Table 12.7.3.

Application of the Mantel-Haenszel procedure consists of the following steps.

1. Form k strata corresponding to the k categories of the confounding variable. Table

12.7.5 shows the data display for the ith stratum.

2. For each stratum compute the expected frequency $e_i$ of the upper left-hand cell of Table 12.7.5 as follows:

$$e_i = \frac{(a_i + b_i)(a_i + c_i)}{n_i} \tag{12.7.5}$$

TABLE 12.7.5 Subjects in the ith Stratum of a Confounding Variable Classified According to Status Relative to a Risk Factor and Whether They Are Cases or Controls

                    Sample
Risk Factor    Cases        Controls     Total
Present        a_i          b_i          a_i + b_i
Absent         c_i          d_i          c_i + d_i
Total          a_i + c_i    b_i + d_i    n_i


3. For each stratum compute

$$v_i = \frac{(a_i + b_i)(c_i + d_i)(a_i + c_i)(b_i + d_i)}{n_i^{2}(n_i - 1)} \tag{12.7.6}$$

4. Compute the Mantel-Haenszel test statistic, $\chi^{2}_{MH}$, as follows:

$$\chi^{2}_{MH} = \frac{\left(\sum_{i=1}^{k} a_i - \sum_{i=1}^{k} e_i\right)^{2}}{\sum_{i=1}^{k} v_i} \tag{12.7.7}$$

5. Reject the null hypothesis of no association between disease status and suspected risk factor status in the population if the computed value of $\chi^{2}_{MH}$ is equal to or greater than the critical value of the test statistic, which is the tabulated chi-square value for 1 degree of freedom and the chosen level of significance.
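The five steps above can be sketched as a single function. The name `mantel_haenszel_chi2` and the two strata of counts are hypothetical, introduced only to show how Equations 12.7.5 through 12.7.7 fit together. Each stratum is given as a tuple (a_i, b_i, c_i, d_i) laid out as in Table 12.7.5.

```python
from typing import Iterable, Tuple

Stratum = Tuple[int, int, int, int]  # (a_i, b_i, c_i, d_i)


def mantel_haenszel_chi2(strata: Iterable[Stratum]) -> float:
    """Mantel-Haenszel chi-square statistic of Equation 12.7.7."""
    sum_a = sum_e = sum_v = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        sum_a += a
        # Expected count of the upper left-hand cell (Eq. 12.7.5).
        sum_e += (a + b) * (a + c) / n
        # The quantity v_i of Eq. 12.7.6.
        sum_v += (a + b) * (c + d) * (a + c) * (b + d) / (n ** 2 * (n - 1))
    return (sum_a - sum_e) ** 2 / sum_v


# Two hypothetical strata, not taken from any study in this chapter.
hypothetical_strata = [(10, 5, 5, 10), (10, 5, 5, 10)]
statistic = mantel_haenszel_chi2(hypothetical_strata)

# Step 5: compare with the tabulated chi-square value for 1 degree of
# freedom; 3.841 corresponds to a significance level of .05.
reject_null = statistic >= 3.841
print(round(statistic, 4), reject_null)
```

Accumulating the three sums in one pass over the strata mirrors the hand calculation, where the stratum-level quantities are computed first and combined only in the final step.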

Mantel-Haenszel Estimator of the Common Odds Ratio When we have k strata of data, each of which may be displayed in a table like Table 12.7.5, we may compute the Mantel-Haenszel estimator of the common odds ratio, $\widehat{OR}_{MH}$, as follows:

$$\widehat{OR}_{MH} = \frac{\sum_{i=1}^{k} (a_i d_i / n_i)}{\sum_{i=1}^{k} (b_i c_i / n_i)} \tag{12.7.8}$$

When we use the Mantel-Haenszel estimator given by Equation 12.7.8, we assume that, in the population, the odds ratio is the same for each stratum.

We illustrate the use of the Mantel-Haenszel statistics with the following

examples.

EXAMPLE 12.7.3

In a study by LaMont et al. (A-18), researchers collected data on obstructive coronary

artery disease (OCAD), hypertension, and age among subjects identified by a treadmill

stress test as being at risk. In Table 12.7.6, counts on subjects in two age strata are presented

with hypertension as the risk factor and the presence of OCAD as the case/noncase

variable.

Solution:

1. Data. See Table 12.7.6.

2. Assumptions. We assume that the assumptions discussed earlier for the

valid use of the Mantel-Haenszel statistic are met.


TABLE 12.7.6 Patients Stratified by Age and Classified by Status Relative to Hypertension (the Risk Factor) and OCAD (Case/Noncase Variable)

Stratum 1 (55 and under)

Risk Factor
(Hypertension)    Cases (OCAD)    Noncases    Total
Present                     21          11       32
Absent                      16           6       22
Total                       37          17       54

Stratum 2 (over 55)

Risk Factor
(Hypertension)    Cases (OCAD)    Noncases    Total
Present                     50          14       64
Absent                      18           6       24
Total                       68          20       88

Source: Data provided courtesy of Matthew J. Budoff, MD.

3. Hypotheses.

H₀: There is no association between the presence of hypertension and occurrence of OCAD in subjects 55 and under and subjects over 55.

H_A: There is a relationship between the two variables.

4. Test statistic.

$$\chi^{2}_{MH} = \frac{\left(\sum_{i=1}^{k} a_i - \sum_{i=1}^{k} e_i\right)^{2}}{\sum_{i=1}^{k} v_i}$$

as given in Equation 12.7.7.

5. Distribution of test statistic. Chi-square with 1 degree of freedom.

6. Decision rule. Suppose we let α = .05. Reject H₀ if the computed value of the test statistic is greater than or equal to 3.841.

7. Calculation of test statistic. By Equation 12.7.5 we compute the following expected frequencies:

$$e_1 = (21 + 11)(21 + 16)/54 = (32)(37)/54 = 21.93$$

$$e_2 = (50 + 14)(50 + 18)/88 = (64)(68)/88 = 49.45$$


By Equation 12.7.6 we compute

$$v_1 = (32)(22)(37)(17)/[(2916)(54 - 1)] = 2.87$$

$$v_2 = (64)(24)(68)(20)/[(7744)(88 - 1)] = 3.10$$

Finally, by Equation 12.7.7 we compute

$$\chi^{2}_{MH} = \frac{[(21 + 50) - (21.93 + 49.45)]^{2}}{2.87 + 3.10} = .0242$$

8. Statistical decision. Since .0242 < 3.841, we fail to reject H₀.

9. Conclusion. We conclude that there may not be an association between hypertension and the occurrence of OCAD.

10. p value. Since .0242 < 2.706, the p value for this test is p > .10.
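The table lookup in step 10 only brackets the p value. For a chi-square statistic with 1 degree of freedom, the upper-tail probability can also be written in terms of the complementary error function, since P(χ² ≥ x) = P(|z| ≥ √x) = erfc(√(x/2)). The sketch below uses this identity; the function name is hypothetical.

```python
import math


def chi2_1df_p_value(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


# Test statistic computed in step 7 of Example 12.7.3.
p_value = chi2_1df_p_value(0.0242)
print(round(p_value, 3))
```

The result is close to .88, consistent with the statement p > .10. The identity holds only for 1 degree of freedom; other degrees of freedom require the general chi-square distribution function.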

We now illustrate the calculation of the Mantel-Haenszel estimator of the

common odds ratio.

EXAMPLE 12.7.4

Let us refer to the data in Table 12.7.6 and compute the common odds ratio.

Solution: From the stratified data in Table 12.7.6 we compute the numerator of the ratio as follows:

$$(a_1 d_1 / n_1) + (a_2 d_2 / n_2) = [(21)(6)/54] + [(50)(6)/88] = 5.7424$$

The denominator of the ratio is

$$(b_1 c_1 / n_1) + (b_2 c_2 / n_2) = [(11)(16)/54] + [(14)(18)/88] = 6.1229$$

Now, by Equation 12.7.8, we compute the common odds ratio:

$$\widehat{OR}_{MH} = \frac{5.7424}{6.1229} = .94$$

From these results we estimate that, regardless of age, patients who

have hypertension are less likely to have OCAD than patients who do not

have hypertension.
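The computation in Example 12.7.4 can be sketched in a few lines. The stratum counts below are those appearing in the numerator and denominator above (a₁ = 21, b₁ = 11, c₁ = 16, d₁ = 6 and a₂ = 50, b₂ = 14, c₂ = 18, d₂ = 6); the function name is hypothetical.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel common odds ratio of Equation 12.7.8.

    Each stratum is a tuple (a_i, b_i, c_i, d_i) as in Table 12.7.5.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Counts for the two age strata of Example 12.7.4.
age_strata = [(21, 11, 16, 6), (50, 14, 18, 6)]
common_or = mantel_haenszel_or(age_strata)
print(f"{common_or:.2f}")
```

Because each stratum contributes a_i d_i / n_i and b_i c_i / n_i rather than its own odds ratio, the estimate remains defined even if some stratum has a zero in the b_i or c_i cell, provided the denominator sum is positive.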

Hand calculation of the Mantel-Haenszel test statistics can prove to be a cumbersome task. Fortunately, the researcher can find relief in one of several statistical software packages that are available. To illustrate, results from the use of SPSS to process the data of Example 12.7.3 are shown in Figure 12.7.3. The Mantel-Haenszel chi-square reported by SPSS (.002) is smaller than the value computed in the example (.0242) because SPSS applies a continuity correction to the statistic; the remaining differences are due to rounding.


Smoking_status * Obesity_status * Stratum Cross-Tabulation

Count
                                                     Obesity status
Stratum                                            Cases    Noncases    Total
55 and under   Smoking_status   Smoked throughout     21          11       32
                                Never smoked          16           6       22
               Total                                  37          17       54
Over 55        Smoking_status   Smoked throughout     50          14       64
                                Never smoked          18           6       24
               Total                                  68          20       88

Tests of Conditional Independence
                     Chi-Squared    Asymp. Sig. (2-sided)
Cochran’s                   .025                     .875
Mantel-Haenszel             .002                     .961

Mantel-Haenszel Common Odds Ratio Estimate
Estimate                                                    .93
ln(Estimate)                                              -.064
Std. Error of ln(Estimate)                                  .41
Asymp. Sig. (2-sided)                                      .876
Asymp. 95% Confidence Interval
  Common Odds Ratio       Lower Bound                      .418
                          Upper Bound                     2.102
  ln(Common Odds Ratio)   Lower Bound                     -.871
                          Upper Bound                      .743

FIGURE 12.7.3 SPSS output for Example 12.7.3.

EXERCISES

12.7.1

Davy et al. (A-19) reported the results of a study involving survival from cervical cancer. The

researchers found that among subjects younger than age 50, 16 of 371 subjects had not survived for

1 year after diagnosis. In subjects age 50 or older, 219 of 376 had not survived for 1 year after

diagnosis. Compute the relative risk of death among subjects age 50 or older. Does it appear from

these data that older subjects diagnosed as having cervical cancer are prone to higher mortality

rates?


12.7.2

The objective of a prospective study by Stenestrand et al. (A-20) was to compare the mortality rate

following an acute myocardial infarction (AMI) among subjects receiving early revascularization to

the mortality rate among subjects receiving conservative treatments. Among 2554 patients receiving

revascularization within 14 days of AMI, 84 died in the year following the AMI. In the conservative

treatment group (risk factor present), 1751 of 19,358 patients died within a year of AMI. Compute the

relative risk of mortality in the conservative treatment group as compared to the revascularization

group in patients experiencing AMI.

12.7.3

Refer to Example 12.7.2. Toschke et al. (A-17), who collected data on obesity status of children ages

5-6 years and the smoking status of the mother during the pregnancy, also reported on another

outcome variable: whether the child was born premature (37 weeks or fewer of gestation). The

following table summarizes the results of this aspect of the study. The same risk factor (smoking

during pregnancy) is considered, but a case is now defined as a mother who gave birth prematurely.

                          Premature Birth Status
Smoking Status
During Pregnancy      Cases    Noncases    Total
Smoked throughout        36         370      406
Never smoked            168        3396     3564
Total                   204        3766     3970

Source: A. M. Toschke, S. M. Montgomery, U. Pfeiffer, and R. von Kries, “Early Intrauterine Exposure to Tobacco-Inhaled Products and Obesity,” American Journal of Epidemiology, 158 (2003), 1068-1074.

Compute the odds ratio to determine if smoking throughout pregnancy is related to premature birth. Use the chi-square test of independence to determine if one may conclude that there is an association between smoking throughout pregnancy and premature birth. Let α = .05.

12.7.4

Sugiyama et al. (A-21) examined risk factors for allergic diseases among 13- and 14-year-old

schoolchildren in Japan. One risk factor of interest was a family history of eating an unbalanced diet.

The following table shows the cases and noncases of children exhibiting symptoms of rhinitis in the

presence and absence of the risk factor.

                        Rhinitis
Family History     Cases    Noncases    Total
Unbalanced diet      656        1451     2107
Balanced diet        677        1662     2339
Total               1333        3113     4446

Source: Takako Sugiyama, Kumiya Sugiyama, Masao Toda, Tastuo Yukawa, Sohei Makino, and Takeshi Fukuda, “Risk Factors for Asthma and Allergic Diseases Among 13-14-Year-Old Schoolchildren in Japan,” Allergology International, 51 (2002), 139-150.

What is the estimated odds ratio of having rhinitis among subjects with a family history of an

unbalanced diet compared to those eating a balanced diet? Compute the 95 percent confidence

interval for the odds ratio.

12.7.5

According to Holben et al. (A-22), “Food insecurity implies a limited access to or availability of food

or a limited/uncertain ability to acquire food in socially acceptable ways.” These researchers


collected data on 297 families with a child in the Head Start nursery program in a rural area of Ohio

near Appalachia. The main outcome variable of the study was household status relative to food

security. Households that were not food secure are considered to be cases. The risk factor of interest

was the absence of a garden from which a household was able to supplement its food supply. In the

following table, the data are stratified by the head of household’s employment status outside the

home.

Stratum 1 (Employed Outside the Home)

Risk Factor    Cases    Noncases    Total
No garden
Garden
Total                                 128

Stratum 2 (Not Employed Outside the Home)

Risk Factor    Cases    Noncases    Total
No garden                             113
Garden
Total                                 161

Source: Data provided courtesy of David H. Holben, Ph.D. and John P. Holcomb, Jr., Ph.D.

Compute the Mantel-Haenszel common odds ratio with stratification by employment status. Use the

Mantel-Haenszel chi-square test statistic to determine if we can conclude that there is an association

between the risk factor and food insecurity. Let α = .05.

12.8

SUMMARY

In this chapter some uses of the versatile chi-square distribution are discussed. Chi-square

goodness-of-fit tests applied to the normal, binomial, and Poisson distributions are

presented. We see that the procedure consists of computing a statistic

$$X^{2} = \sum \left[ \frac{(O_i - E_i)^{2}}{E_i} \right]$$

that measures the discrepancy between the observed ($O_i$) and expected ($E_i$) frequencies of occurrence of values in certain discrete categories. When the appropriate null hypothesis is true, this quantity is distributed approximately as χ². When X² is greater than or equal to the tabulated value of χ² for some α, the null hypothesis is rejected at the α level of significance.

Tests of independence and tests of homogeneity are also discussed in this chapter.

The tests are mathematically equivalent but conceptually different. Again, these tests

essentially test the goodness-of-fit of observed data to expectation under hypotheses,

respectively, of independence of two criteria of classifying the data and the homogeneity of

proportions among two or more groups.


In addition, we discussed and illustrated in this chapter four other techniques for

analyzing frequency data that can be presented in the form of a 2 × 2 contingency table: the

Fisher exact test, the odds ratio, relative risk, and the Mantel-Haenszel procedure. Finally,

we discussed the basic concepts of survival analysis and illustrated the computational

procedures by means of two examples.

SUMMARY OF FORMULAS FOR CHAPTER 12

Formula Number, Name, and Formula

12.2.1 Standard normal random variable

$$z_i = \frac{y_i - \mu}{\sigma}$$

12.2.2 Chi-square distribution with n degrees of freedom

$$\chi^{2}_{(n)} = z_1^{2} + z_2^{2} + \cdots + z_n^{2}$$

12.2.3 Chi-square probability density function

$$f(u) = \frac{1}{\left(\frac{k}{2} - 1\right)!} \, \frac{1}{2^{k/2}} \, u^{(k/2) - 1} e^{-(u/2)}$$

12.2.4 Chi-square test statistic

$$\chi^{2} = \sum \left[ \frac{(O_i - E_i)^{2}}{E_i} \right]$$

12.4.1 Chi-square calculation formula for a 2 × 2 contingency table

$$\chi^{2} = \frac{n(ad - bc)^{2}}{(a + c)(b + d)(a + b)(c + d)}$$

12.4.2 Yates’s corrected chi-square calculation for a 2 × 2 contingency table

$$\chi^{2}_{\text{corrected}} = \frac{n(|ad - bc| - .5n)^{2}}{(a + c)(b + d)(a + b)(c + d)}$$

12.6.1-12.6.2 Large-sample approximation to the chi-square

$$z = \frac{(a/A) - (b/B)}{\sqrt{\hat{p}(1 - \hat{p})(1/A + 1/B)}}, \quad \text{where } \hat{p} = (a + b)/(A + B)$$

12.7.1 Relative risk estimate

$$\widehat{RR} = \frac{a/(a + b)}{c/(c + d)}$$

12.7.2 Confidence interval for the relative risk estimate

$$100(1 - \alpha)\%\ \text{CI} = \widehat{RR}^{\,1 \pm \left(z_{\alpha}/\sqrt{\chi^{2}}\right)}$$

12.7.3 Odds ratio estimate

$$\widehat{OR} = \frac{a/b}{c/d} = \frac{ad}{bc}$$

12.7.4 Confidence interval for the odds ratio estimate

$$100(1 - \alpha)\%\ \text{CI} = \widehat{OR}^{\,1 \pm \left(z_{\alpha}/\sqrt{\chi^{2}}\right)}$$

12.7.5 Expected frequency in the Mantel-Haenszel statistic

$$e_i = \frac{(a_i + b_i)(a_i + c_i)}{n_i}$$

12.7.6 Stratum expected frequency in the Mantel-Haenszel statistic

$$v_i = \frac{(a_i + b_i)(c_i + d_i)(a_i + c_i)(b_i + d_i)}{n_i^{2}(n_i - 1)}$$

12.7.7 Mantel-Haenszel test statistic

$$\chi^{2}_{MH} = \frac{\left(\sum_{i=1}^{k} a_i - \sum_{i=1}^{k} e_i\right)^{2}}{\sum_{i=1}^{k} v_i}$$

12.7.8 Mantel-Haenszel estimator of the common odds ratio

$$\widehat{OR}_{MH} = \frac{\sum_{i=1}^{k} (a_i d_i / n_i)}{\sum_{i=1}^{k} (b_i c_i / n_i)}$$

Symbol Key

a, b, c, d = cell frequencies in a 2 × 2 contingency table

A, B = row totals in the 2 × 2 contingency table

β = regression coefficient

χ² (or X²) = chi-square

e_i = expected frequency in the Mantel-Haenszel statistic

E_i = expected frequency

E(y|x) = expected value of y at x

k = degrees of freedom in the chi-square distribution

μ = mean

O_i = observed frequency

$\widehat{OR}$ = odds ratio estimate

σ = standard deviation

$\widehat{RR}$ = relative risk estimate

v_i = stratum expected frequency in the Mantel-Haenszel statistic

y_i = data value at point i

z = normal variate

REVIEW QUESTIONS AND EXERCISES

1. Explain how the chi-square distribution may be derived.

2. What are the mean and variance of the chi-square distribution?

3. Explain how the degrees of freedom are computed for the chi-square goodness-of-fit tests.

4. State Cochran’s rule for small expected frequencies in goodness-of-fit tests.

5. How does one adjust for small expected frequencies?

6. What is a contingency table?

7. How are the degrees of freedom computed when an X² value is computed from a contingency

table?


8. Explain the rationale behind the method of computing the expected frequencies in a test of independence.

9. Explain the difference between a test of independence and a test of homogeneity.

10. Explain the rationale behind the method of computing the expected frequencies in a test of homogeneity.

11. When do researchers use the Fisher exact test rather than the chi-square test?

12. Define the following:

(a) Observational study

(b) Risk factor

(d) Retrospective study

(e) Prospective study

(f) Relative risk

(g) Odds

(h) Odds ratio

(i) Confounding variable

13. Under what conditions is the Mantel-Haenszel test appropriate?

14. Explain how researchers interpret the following measures:

(a) Relative risk

(b) Odds ratio

15. In a study of violent victimization of women and men, Porcerelli et al. (A-23) collected information from 679 women and 345 men ages 18 to 64 years at several family practice centers in the metropolitan Detroit area. Patients filled out a health history questionnaire that included a question about victimization. The following table shows the sample subjects cross-classified by gender and the type of violent victimization reported. The victimization categories are defined as no victimization, partner victimization (and not by others), victimization by a person other than a partner (friend, family member, or stranger), and those who reported multiple victimization.

Gender    No Victimization    Partner    Nonpartner    Multiple    Total
Women                  611                                           679
Men                    308                                           345
Total                  919                                          1024

Source: John H. Porcerelli, Rosemary Cogan, Patricia P. West, Edward A. Rose, Dawn Lambrecht, Karen E. Wilson, Richard K. Severson, and Dunia Karana, “Violent Victimization of Women and Men: Physical and Psychiatric Symptoms,” Journal of the American Board of Family Practice, 16 (2003), 32-39.

Can we conclude on the basis of these data that victimization status and gender are not independent? Let α = .05.

16. Refer to Exercise 15. The following table shows data reported by Porcerelli et al. for 644 African-American and Caucasian women. May we conclude on the basis of these data that for women, race and victimization status are not independent? Let α = .05.

                    No Victimization    Partner    Nonpartner    Multiple    Total
Caucasian                        356                                           388
African-American                 226                                           256
Total                            582                                           644

Source: John H. Porcerelli, Rosemary Cogan, Patricia P. West, Edward A. Rose, Dawn Lambrecht, Karen E. Wilson, Richard K. Severson, and Dunia Karana, “Violent Victimization of Women and Men: Physical and Psychiatric Symptoms,” Journal of the American Board of Family Practice, 16 (2003), 32-39.

17. A sample of 150 chronic carriers of a certain antigen and a sample of 500 noncarriers revealed the following blood group distributions:

Blood Group

Carriers

Noncarriers

Total

230

302

192

246

Total

150

500

650

Can one conclude from these data that the two populations from which the samples were drawn differ

with respect to blood group distribution? Let α = .05. What is the p value for the test?

18. The following table shows 200 males classified according to social class and headache status:

Social Class

Headache Group

Total

No headache (in previous year)

Simple headache

Unilateral headache (nonmigraine)

Migraine

Total

109

200

Do these data provide sufficient evidence to indicate that headache status and social class are related?

Let α = .05. What is the p value for this test?

19. The following is the frequency distribution of scores made on an aptitude test by 175 applicants to a physical therapy training facility (x̄ = 39.71, s = 12.92).

Score

Number of Applicants

Score

Number of Applicants

10-14

40-44

15-19

45-49

20-24

50-54

25-29

55-59


30-34

60-64

35-39

65-69

Total

175

Do these data provide sufficient evidence to indicate that the population of scores is not normally

distributed? Let α = .05. What is the p value for this test?

20. A local health department sponsored a venereal disease (VD) information program that was open to high-school juniors and seniors who ranged in age from 16 to 19 years. The program director believed that each age level was equally interested in knowing more about VD. Since each age level was about equally represented in the area served, she felt that equal interest in VD would be reflected by equal age-level attendance at the program. The age breakdown of those attending was as follows:

Age

Number Attending

Are these data incompatible with the program director’s belief that students in the four age levels are

equally interested in VD? Let α = .05. What is the p value for this test?

21. Children under 15 years of age residing in the inner-city area of a large city were surveyed and classified according to ethnic group and hemoglobin level. The results were as follows:

Hemoglobin Level (g/100 ml)

Ethnic Group

10.0 or Greater

9.0-9.9

< 9.0

Total

100

200

190

385

110

Total

249

320

126

695

Do these data provide sufficient evidence to indicate, at the .05 level of significance, that the two

variables are related? What is the p value for this test?

22. A sample of reported cases of mumps in preschool children showed the following distribution by age:

Age (Years)

Number of Cases

Under 1

Total

150


Test the hypothesis that cases occur with equal frequency in the five age categories. Let α = .05.

What is the p value for this test?

23. Each of a sample of 250 men drawn from a population of suspected joint disease victims was asked which of three symptoms bothered him most. The same question was asked of a sample of 300 suspected women joint disease victims. The results were as follows:

Most Bothersome Symptom

Men

Women

Morning stiffness

111

102

Nocturnal pain

Joint swelling

125

Total

250

300

Do these data provide sufficient evidence to indicate that the two populations are not homogeneous

with respect to major symptoms? Let α = .05. What is the p value for this test?

For each of the Exercises 24 through 34, indicate whether a null hypothesis of homogeneity or a null

hypothesis of independence is appropriate.

24. A researcher wishes to compare the status of three communities with respect to immunity against polio in preschool children. A sample of preschool children was drawn from each of the three communities.

25. In a study of the relationship between smoking and respiratory illness, a random sample of adults were classified according to consumption of tobacco and extent of respiratory symptoms.

26. A physician who wished to know more about the relationship between smoking and birth defects studied the health records of a sample of mothers and their children, including stillbirths and spontaneously aborted fetuses where possible.

27. A health research team believes that the incidence of depression is higher among people with hypoglycemia than among people who do not suffer from this condition.

28. In a simple random sample of 200 patients undergoing therapy at a drug abuse treatment center, 60 percent belonged to ethnic group I. The remainder belonged to ethnic group II. In ethnic group I, 60 were being treated for alcohol abuse (A), 25 for marijuana abuse (B), and 20 for abuse of heroin, illegal methadone, or some other opioid (C). The remainder had abused barbiturates, cocaine, amphetamines, hallucinogens, or some other nonopioid besides marijuana (D). In ethnic group II the abused drug category and the numbers involved were as follows:

A (28)    B (32)    C (13)    D (the remainder)

Can one conclude from these data that there is a relationship between ethnic group and choice of drug to abuse? Let α = .05 and find the p value.

29. Solar keratoses are skin lesions commonly found on the scalp, face, backs of hands, forearms, ears, and neck. They are caused by long-term sun exposure, but they are not skin cancers. Chen et al. (A-24) studied 39 subjects randomly assigned (with a 3 to 1 ratio) to imiquimod cream and a control cream. The criterion for effectiveness was having 75 percent or more of the lesion area cleared after 14 weeks of treatment. There were 21 successes among 29 imiquimod-treated subjects and three successes among 10 subjects using the control cream. The researchers used Fisher’s exact test and obtained a p value of .027. What are the variables involved? Are the variables quantitative or qualitative? What null and alternative hypotheses are appropriate? What are your conclusions?


30. Janardhan et al. (A-25) examined 125 patients who underwent surgical or endovascular treatment for intracranial aneurysms. At 30 days postprocedure, 17 subjects experienced transient/persistent neurological deficits. The researchers performed logistic regression and found that the 95 percent confidence interval for the odds ratio for aneurysm size was .09-.96. Aneurysm size was dichotomized as less than 13 mm and greater than or equal to 13 mm. The larger tumors indicated higher odds of deficits. Describe the variables as to whether they are continuous, discrete, quantitative, or qualitative. What conclusions may be drawn from the given information?

31.

In a study of smoking cessation by Gold et al. (A-26), 189 subjects self-selected into three treatments:

nicotine patch only (NTP), Bupropion SR only (B), and nicotine patch with Bupropion SR

(NTP + B). Subjects were grouped by age into younger than 50 years old, between 50 and 64,

and 65 and older. There were 15 subjects younger than 50 years old who chose NTP, 26 who chose B,

and 16 who chose NTP + B. In the 50-64 years category, six chose NTP, 54 chose B, and 40 chose

NTP + B. In the oldest age category, six chose NTP, 21 chose B, and five chose NTP + B. What

statistical technique studied in this chapter would be appropriate for analyzing these data? Describe

the variables involved as to whether they are continuous, discrete, quantitative, or qualitative. What

null and alternative hypotheses are appropriate? If you think you have sufficient information, conduct

a complete hypothesis test. What are your conclusions?
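One way to set up the counts in Exercise 31 is as a 3 × 3 contingency table (age group by treatment) analyzed with the chi-square test of independence. The sketch below uses only the Python standard library; the function names are hypothetical, and the upper-tail probability uses a closed form that holds only when the degrees of freedom are even, which happens to be the case here (df = 4). It is offered as an illustration of the arithmetic, not as the analysis the authors performed.

```python
from math import exp

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: (row total)(column total)/n.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def chi_square_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form needs even df."""
    if df % 2 != 0:
        raise ValueError("closed form shown here requires even df")
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= (x / 2) / k
        total += term
    return exp(-x / 2) * total

# Rows: younger than 50, 50-64, 65 and older. Columns: NTP, B, NTP + B.
counts = [[15, 26, 16],
          [6, 54, 40],
          [6, 21, 5]]
stat, df = chi_square_independence(counts)
p_value = chi_square_sf_even_df(stat, df)
print(f"chi-square = {stat:.2f}, df = {df}, p = {p_value:.4f}")
```

One expected frequency (oldest group, NTP) falls slightly below 5, so the small-expected-frequency guidance discussed earlier in the chapter should be checked before relying on the result.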

32.

Kozinszky and Bartai (A-27) examined contraceptive use by teenage girls requesting abortion in

Szeged, Hungary. Subjects were classified as younger than 20 years old or 20 years old or older. Of

the younger than 20-year-old women, 146 requested an abortion. Of the older group, 1054 requested

an abortion. A control group consisted of visitors to the family planning center who did not request an

abortion or persons accompanying women who requested an abortion. In the control group, there

were 147 women under 20 years of age and 1053 who were 20 years or older. One of the outcome

variables of interest was knowledge of emergency contraception. The researchers report that,

“Emergency contraception was significantly [(Mantel-Haenszel) p < .001] less well known among

the would-be aborter teenagers as compared to the older women requesting artificial abortion

(OR = .07) than the relevant knowledge of the teenage controls (OR = .10).” Explain the meaning

of the reported statistics. What are your conclusions based on the given information?

33.

The goal of a study by Crosignani et al. (A-28) was to assess the effect of road traffic exhaust on the

risk of childhood leukemia. They studied 120 children in Northern Italy identified through a

population-based cancer registry (cases). Four controls per case, matched by age and gender, were

sampled from population files. The researchers used a diffusion model of benzene to estimate

exposure to traffic exhaust. Compared to children whose homes were not exposed to road traffic

emissions, the rate of childhood leukemia was significantly higher for heavily exposed children.

Characterize this study as to whether it is observational, prospective, or retrospective. Describe the

variables as to whether they are continuous, discrete, quantitative, qualitative, a risk factor, or a

confounding variable. Explain the meaning of the reported results. What are your conclusions based

on the given information?

34.

Gallagher et al. (A-29) conducted a descriptive study to identify factors that influence women’s

attendance at cardiac rehabilitation programs following a cardiac event. One outcome variable of

interest was actual attendance at such a program. The researchers enrolled women discharged from

four metropolitan hospitals in Sydney, Australia. Of 183 women, only 57 women actually attended

programs. The authors reported odds ratios and confidence intervals on the following variables that

significantly affected outcome: age-squared (1.72; 1.10-2.70; women over the age of 70 had the

lowest odds, while women ages 55-70 years had the highest odds), perceived control (.92; .85-1.00),

employment (.20; .07-.58), diagnosis (6.82; 1.84-25.21; the odds ratio was higher for women who

experienced coronary artery bypass grafting vs. myocardial infarction), and stressful event (.21; .06-.73).

Characterize this study as to whether it is observational, prospective, or retrospective. Describe the

variables as to whether they are continuous, discrete, quantitative, qualitative, a risk factor, or a

confounding variable. Explain the meaning of the reported odds ratios.

For each of the Exercises 35 through 51, do as many of the following as you think appropriate:

(a) Apply one or more of the techniques discussed in this chapter.

(b) Apply one or more of the techniques discussed in previous chapters.

(d) Construct confidence intervals for population parameters.

(e) Formulate relevant hypotheses, perform the appropriate tests, and find p values.

(f) State the statistical decisions and clinical conclusions that the results of your hypothesis tests justify.

(g) Describe the population(s) to which you think your inferences are applicable.

(h) State the assumptions necessary for the validity of your analyses.

35.

In a prospective, randomized, double-blind study, Stanley et al. (A-30) examined the relative efficacy

and side effects of morphine and pethidine, drugs commonly used for patient-controlled analgesia

(PCA). Subjects were 40 women, between the ages of 20 and 65 years, undergoing total abdominal

hysterectomy. Patients were allocated randomly to receive morphine or pethidine by PCA. At the end

of the study, subjects described their appreciation of nausea and vomiting, pain, and satisfaction by

means of a three-point verbal scale. The results were as follows:

Satisfaction

Drug         Unhappy/Miserable    Moderately Happy    Happy/Delighted    Total
Pethidine
Morphine
Total

Pain

Drug         Unbearable/Severe    Moderate    Slight/None    Total
Pethidine
Morphine
Total

Nausea

Drug         Unbearable/Severe    Moderate    Slight/None    Total
Pethidine
Morphine
Total

Source: Data provided courtesy of Dr. Balraj L. Appadu.

36.

Screening data from a statewide lead poisoning prevention program between April 1990 and March

1991 were examined by Sargent et al. (A-31) in an effort to learn more about community risk factors

for iron deficiency in young children. Study subjects ranged in age between 6 and 59 months.

Among 1860 children with Hispanic surnames, 338 had iron deficiency. Four hundred fifty-seven

of 1139 with Southeast Asian surnames and 1034 of 8814 children with other surnames had iron

deficiency.

37.

To increase understanding of HIV-infection risk among patients with severe mental illness, Horwath

et al. (A-32) conducted a study to identify predictors of injection drug use among patients who did not

have a primary substance use disorder. Of 192 patients recruited from inpatient and outpatient public

psychiatric facilities, 123 were males. Twenty-nine of the males and nine of the females were found

to have a history of illicit-drug injection.

38.

Skinner et al. (A-33) conducted a clinical trial to determine whether treatment with melphalan,

prednisone, and colchicine (MPC) is superior to colchicine (C) alone. Subjects consisted of 100

patients with primary amyloidosis. Fifty were treated with C and 50 with MPC. Eighteen months

after the last person was admitted and 6 years after the trial began, 44 of those receiving C and 36 of

those receiving MPC had died.
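For a 2 × 2 layout such as the one in Exercise 38, the chi-square statistic, the relative risk, and the odds ratio can all be computed from the four cell counts. The sketch below is one possible arrangement, assuming rows are the treatments (C, MPC) and columns are the outcomes (died, survived); the function name is hypothetical, no continuity correction is applied, and the p value uses the identity that a chi-square variable with 1 degree of freedom is the square of a standard normal variable.

```python
from math import erfc, sqrt

def two_by_two_summary(a, b, c, d):
    """Chi-square (no continuity correction), p value, relative risk, and
    odds ratio for the table [[a, b], [c, d]]; a and c count the outcome."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(stat / 2))  # upper tail of chi-square with 1 df
    relative_risk = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)
    return stat, p_value, relative_risk, odds_ratio

# Exercise 38: rows = (C, MPC); columns = (died, survived).
stat, p_value, rr, odds = two_by_two_summary(44, 6, 36, 14)
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")
print(f"relative risk of death (C vs. MPC) = {rr:.2f}, odds ratio = {odds:.2f}")
```

Because death is the tabulated outcome, a relative risk above 1 here means higher mortality in the colchicine-only group; whether a one-sided test is more appropriate for a superiority question is left to the reader.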

39.

The purpose of a study by Miyajima et al. (A-34) was to evaluate the changes of tumor cell

contamination in bone marrow (BM) and peripheral blood (PB) during the clinical course of patients

with advanced neuroblastoma. Their procedure involved detecting tyrosine hydroxylase (TH) mRNA

to clarify the appropriate source and time for harvesting hematopoietic stem cells for transplantation.

The authors used Fisher’s exact test in the analysis of their data. If available, read their article and

decide if you agree that Fisher’s exact test was the appropriate technique to use. If you agree,

duplicate their procedure and see if you get the same results. If you disagree, explain why.

40.

Cohen et al. (A-35) investigated the relationship between HIV seropositivity and bacterial vaginosis

in a population at high risk for sexual acquisition of HIV. Subjects were 144 female commercial sex

workers in Thailand of whom 62 were HIV-positive and 109 had a history of sexually transmitted

diseases (STD). In the HIV-negative group, 51 had a history of STD.

41.

The purpose of a study by Lipschitz et al. (A-36) was to examine, using a questionnaire, the rates and

characteristics of childhood abuse and adult assaults in a large general outpatient population.

Subjects consisted of 120 psychiatric outpatients (86 females, 34 males) in treatment at a large

hospital-based clinic in an inner-city area. Forty-seven females and six males reported incidents of

childhood sexual abuse.

42.

Subjects of a study by O’Brien et al. (A-37) consisted of 100 low-risk patients having well-dated

pregnancies. The investigators wished to evaluate the efficacy of a more gradual method for

promoting cervical change and delivery. Half of the patients were randomly assigned to receive

a placebo, and the remainder received 2 mg of intravaginal prostaglandin E₂ (PGE₂) for 5 consecutive

days. One of the infants born to mothers in the experimental group and four born to those in the

control group had macrosomia.

43.

The purposes of a study by Adra et al. (A-38) were to assess the influence of route of delivery on

neonatal outcome in fetuses with gastroschisis and to correlate ultrasonographic appearance of the

fetal bowel with immediate postnatal outcome. Among 27 cases of prenatally diagnosed gastroschisis

the ultrasonographic appearance of the fetal bowel was normal in 15. Postoperative complications

were observed in two of the 15 and in seven of the cases in which the ultrasonographic

appearance was not normal.

44.

Liu et al. (A-39) conducted household surveys in areas of Alabama under tornado warnings. In one of

the surveys (survey 2) the mean age of the 193 interviewees was 54 years. Of these 56.0 percent were

women, 88.6 percent were white, and 83.4 percent had a high-school education or higher. Among

the information collected were data on shelter-seeking activity and understanding of the term

“tornado warning.” One hundred twenty-eight respondents indicated that they usually seek

shelter when made aware of a tornado warning. Of these, 118 understood the meaning of tornado

warning. Forty-six of those who said they didn’t usually seek shelter understood the meaning

of the term.

45.

The purposes of a study by Patel et al. (A-40) were to investigate the incidence of acute angle-closure

glaucoma secondary to pupillary dilation and to identify screening methods for detecting angles at

risk of occlusion. Of 5308 subjects studied, 1287 were 70 years of age or older. Seventeen of the older

subjects and 21 of the younger subjects (40 through 69 years of age) were identified as having

potentially occludable angles.

46.

Voskuyl et al. (A-41) investigated those characteristics (including male gender) of patients with

rheumatoid arthritis (RA) that are associated with the development of rheumatoid vasculitis (RV).

Subjects consisted of 69 patients who had been diagnosed as having RV and 138 patients with RA

who were not suspected to have vasculitis. There were 32 males in the RV group and 38 among the

RA patients.

47.

Harris et al. (A-42) conducted a study to compare the efficacy of anterior colporrhaphy and

retropubic urethropexy performed for genuine stress urinary incontinence. The subjects were 76

women who had undergone one or the other surgery. Subjects in each group were comparable in age,

social status, race, parity, and weight. In 22 of the 41 cases reported as cured the surgery had been

performed by attending staff. In 10 of the failures, surgery had been performed by attending staff. All

other surgeries had been performed by resident surgeons.

48.

Kohashi et al. (A-43) conducted a study in which the subjects were patients with scoliosis. As part of

the study, 21 patients treated with braces were divided into two groups, group A (n_A = 12) and group

B (n_B = 9), on the basis of certain scoliosis progression factors. Two patients in group A and eight in

group B exhibited evidence of progressive deformity, while the others did not.

49.

In a study of patients with cervical intraepithelial neoplasia, Burger et al. (A-44) compared those who

were human papillomavirus (HPV)-positive and those who were HPV-negative with respect to risk

factors for HPV infection. Among their findings were 60 out of 91 nonsmokers with HPV infection

and 44 HPV-positive patients out of 50 who smoked 21 or more cigarettes per day.

50.

Thomas et al. (A-45) conducted a study to determine the correlates of compliance with follow-up

appointments and prescription filling after an emergency department visit. Among 235 respondents,

158 kept their appointments. Of these, 98 were females. Of those who missed their appointments, 31

were males.

51.

The subjects of a study conducted by O’Keefe and Lavan (A-46) were 60 patients with cognitive

impairment who required parenteral fluids for at least 48 hours. The patients were randomly assigned

to receive either intravenous (IV) or subcutaneous (SC) fluids. The mean age of the 30 patients in the

SC group was 81 years with a standard deviation of 6. Fifty-seven percent were females. The mean

age of the IV group was 84 years with a standard deviation of 7. Agitation related to the cannula or

drip was observed in 11 of the SC patients and 24 of the IV patients.

Exercises for Use with the Large Data Sets Available on the Following Website:

www.wiley.com/college/daniel

1.

Refer to the data on smoking, alcohol consumption, blood pressure, and respiratory disease among

1200 adults (SMOKING). The variables are as follows:

Sex (A): 1 = male; 0 = female

Smoking status (B): 0 = nonsmoker; 1 = smoker

Drinking level (C): 0 = nondrinker; 1 = light to moderate drinker; 2 = heavy drinker

Symptoms of respiratory disease (D): 1 = present; 0 = absent

High blood pressure status (E): 1 = present; 0 = absent

Select a simple random sample of size 100 from this population and carry out an analysis to see if you

can conclude that there is a relationship between smoking status and symptoms of respiratory disease.

Let α = .05 and determine the p value for your test. Compare your results with those of your

classmates.

2.

Refer to Exercise 1. Select a simple random sample of size 100 from the population and carry out a

test to see if you can conclude that there is a relationship between drinking status and high blood

pressure status in the population. Let α = .05 and determine the p value. Compare your results with

those of your classmates.

3.

Refer to Exercise 1. Select a simple random sample of size 100 from the population and carry out a

test to see if you can conclude that there is a relationship between gender and smoking status in the

population. Let α = .05 and determine the p value. Compare your results with those of your

classmates.

4.

Refer to Exercise 1. Select a simple random sample of size 100 from the population and carry out a

test to see if you can conclude that there is a relationship between gender and drinking level in the

population. Let α = .05 and find the p value. Compare your results with those of your classmates.

REFERENCES

Methodology References

1. KARL PEARSON, “On the Criterion that a Given System of Deviations from the Probable in the Case of a Correlated

System of Variables Is Such that It Can Be Reasonably Supposed to Have Arisen from Random Sampling,” The

London, Edinburgh and Dublin Philosophical Magazine and Journal of Science, Fifth Series, 50 (1900), 157-

175. Reprinted in Karl Pearson’s Early Statistical Papers, Cambridge University Press, 1948.

2. H. O. LANCASTER, The Chi-Squared Distribution, Wiley, New York, 1969.

3. MIKHAIL S. NIKULIN and PRISCILLA E. GREENWOOD, A Guide to Chi-Squared Testing, Wiley, New York, 1996.

4. WILLIAM G. COCHRAN, “The χ² Test of Goodness of Fit,” Annals of Mathematical Statistics, 23 (1952), 315-345.

5. WILLIAM G. COCHRAN, “Some Methods for Strengthening the Common χ² Tests,” Biometrics, 10 (1954),

417-451.

6. F. YATES, “Contingency Tables Involving Small Numbers and the χ² Tests,” Journal of the Royal Statistical

Society, Supplement, 1, 1934 (Series B), 217-235.

7. R. A. FISHER, Statistical Methods for Research Workers, Fifth Edition, Oliver and Boyd, Edinburgh, 1934.

8. R. A. FISHER, “The Logic of Inductive Inference,” Journal of the Royal Statistical Society Series A, 98 (1935),

39-54.

9. J. O. IRWIN, “Tests of Significance for Differences Between Percentages Based on Small Numbers,” Metron, 12

(1935), 83-94.

10. F. YATES, “Contingency Tables Involving Small Numbers and the χ² Test,” Journal of the Royal Statistical

Society, Supplement, 1, (1934), 217-235.

11. D. J. FINNEY, “The Fisher-Yates Test of Significance in 2 × 2 Contingency Tables,” Biometrika, 35 (1948),

145-156.

12. R. LATSCHA, “Tests of Significance in a 2 × 2 Contingency Table: Extension of Finney’s Table,” Biometrika,

(1955), 74-86.

13. G. A. BARNARD, “A New Test for 2 × 2 Tables,” Nature, 156 (1945), 117.

14. G. A. BARNARD, “A New Test for 2 × 2 Tables,” Nature, 156 (1945), 783-784.

15. G. A. BARNARD, “Significance Tests for 2 × 2 Tables,” Biometrika, 34 (1947), 123-138.

16. R. A. FISHER, “A New Test for 2 × 2 Tables,” Nature, 156 (1945), 388.

17. E. S. PEARSON, “The Choice of Statistical Tests Illustrated on the Interpretation of Data Classed in a 2 × 2 Table,”

Biometrika, 34 (1947), 139-167.

18. A. SWEETLAND, “A Comparison of the Chi-Square Test for 1 df and the Fisher Exact Test,” Rand Corporation,

Santa Monica, CA, 1972.

19. WENDELL E. CARR, “Fisher’s Exact Test Extended to More than Two Samples of Equal Size,” Technometrics, 22

(1980), 269-270.

20. HENRY R. NEAVE, “A New Look at an Old Test,” Bulletin of Applied Statistics, 9 (1982), 165-178.

21. WILLIAM D. DUPONT, “Sensitivity of Fisher’s Exact Test to Minor Perturbations in 2 × 2 Contingency Tables,”

Statistics in Medicine, 5 (1986), 629-635.

22. N. MANTEL and W. HAENSZEL, “Statistical Aspects of the Analysis of Data from Retrospective Studies of Disease,”

Journal of the National Cancer Institute, 22 (1959), 719-748.

23. N. MANTEL, “Chi-Square Tests with One Degree of Freedom: Extensions of the Mantel-Haenszel Procedure,”

Journal of the American Statistical Association, 58 (1963), 690-700.

Applications References

A-1.

CAROLE W. CRANOR and DALE B. CHRISTENSEN, “The Asheville Project: Short-Term Outcomes of a Community

Pharmacy Diabetes Care Program,” Journal of the American Pharmaceutical Association, 43 (2003), 149-159.

A-2.

AMY L. BYERS, HEATHER ALLORE, THOMAS M. GILL, and PETER N. PEDUZZI, “Application of Negative Binomial

Modeling for Discrete Outcomes: A Case Study in Aging Research,” Journal of Clinical Epidemiology, 56

(2003), 559-564.

A-3.

KATHLEEN M. STEPANUK, JORGE E. TOLOSA, DAWNEETE LEWIS, VICTORIA MEYERS, CYNTHIA ROYDS, JUAN CARLOS

SAOGAL, and RON LIBRIZZI, “Folic Acid Supplementation Use Among Women Who Contact a Teratology

Information Service,” American Journal of Obstetrics and Gynecology, 187 (2002), 964-967.

A-4.

J. K. SILVER and D. D. AIELLO, “Polio Survivors: Falls and Subsequent Injuries,” American Journal of Physical

Medicine and Rehabilitation, 81 (2002), 567-570.

A-5.

CYNTHIA G. SEGAL and JACQUELINE J. ANDERSON, “Preoperative Skin Preparation of Cardiac Patients,” AORN

Journal, 76 (2002), 821-827.

A-6.

RALPH ROTHENBERG and JOHN P. HOLCOMB, “Guidelines for Monitoring of NSAIDs: Who Listened?,” Journal of

Clinical Rheumatology, 6 (2000), 258-265.

A-7.

SHARON M. BOLES and PATRICK B. JOHNSON, “Gender, Weight Concerns, and Adolescent Smoking,” Journal of

Addictive Diseases, 20 (2001), 5-14.

A-8.

The DMG Study Group, “Migraine and Idiopathic Narcolepsy—A Case-Control Study,” Cephalalgia, 23 (2003),

786-789.

A-9.

TASHA D. CARTER, EMANUELA MUNDO, SAGAR V. PARKH, and JAMES L. KENNEDY, “Early Age at Onset as a Risk Factor

for Poor Outcome of Bipolar Disorder,” Journal of Psychiatric Research, 37 (2003), 297-303.

A-10.

STEVEN S. COUGHLIN, ROBERT J. UHLER, THOMAS RICHARDS, and KATHERINE M. WILSON, “Breast and Cervical Cancer

Screening Practices Among Hispanic and Non-Hispanic Women Residing Near the United States-Mexico

Border, 1999-2000,” Family and Community Health, 26 (2003), 130-139.

A-11.

ROBERT SWOR, SCOTT COMPTON, FERN VINING, LYNN OSOSKY FARR, SUE KOKKO, REBECCA PASCUAL, and RAYMOND E.

JACKSON, “A Randomized Controlled Trial of Chest Compression Only CPR for Older Adults: A Pilot Study,”

Resuscitation, 58 (2003), 177-185.

A-12.

U. S. JUSTESEN, A. M. LERVFING, A. THOMSEN, J. A. LINDBERG, C. PEDERSEN, and P. TAURIS, “Low-Dose Indinavir in

Combination with Low-Dose Ritonavir: Steady-State Pharmacokinetics and Long-Term Clinical Outcome

Follow-Up,” HIV Medicine, 4 (2003), 250-254.

A-13.

J. F. TAHMASSEBI and M. E. J. CURZON, “The Cause of Drooling in Children with Cerebral Palsy—Hypersalivation

or Swallowing Defect?” International Journal of Paediatric Dentistry, 13 (2003), 106-111.

A-14.

SHU DONG XIAO and TONG SHI, “Is Cranberry Juice Effective in the Treatment and Prevention of Helicobacter

Pylori Infection of Mice?,” Chinese Journal of Digestive Diseases, 4 (2003), 136-139.

A-15.

GAD SHAKED, OLEG KLEINER, ROBERT FINALLY, JACOB MORDECHAI, NITZA NEWMAN, and ZAHAVI COHEN,

“Management of Blunt Pancreatic Injuries in Children,” European Journal of Trauma, 29 (2003), 151-155.

A-16.

EVERETT F. MAGANN, SHARON F. EVANS, BETH WEITZ, and JOHN NEWNHAM, “Antepartum, Intrapartum, and Neonatal

Significance of Exercise on Healthy Low-Risk Pregnant Working Women,” Obstetrics and Gynecology, 99

(2002), 466-472.

A-17.

A. M. TOSCHKE, S. M. MONTGOMERY, U. PFEIFFER, and R. VON KRIES, “Early Intrauterine Exposure to Tobacco-Inhaled

Products and Obesity,” American Journal of Epidemiology, 158 (2003), 1068-1074.

A-18.

DANIEL H. LAMONT, MATTHEW J. BUDOFF, DAVID M. SHAVELLE, ROBERT SHAVELLE, BRUCE H. BRUNDAGE, and JAMES

M. HAGAR, “Coronary Calcium Scanning Adds Incremental Value to Patients with Positive Stress Tests,”

American Heart Journal, 143 (2002), 861-867.

A-19.

MARGARET L. J. DAVY, TOM J. DODD, COLIN G. LUKE, and DAVID M. RODER, “Cervical Cancer: Effect of Glandular

Cell Type on Prognosis, Treatment, and Survival,” Obstetrics and Gynecology, 101 (2003), 38-45.

A-20.

U. STENESTRAND and L. WALLENTIN, “Early Revascularization and 1-Year Survival in 14-Day Survivors of Acute

Myocardial Infarction,” Lancet, 359 (2002), 1805-1811.

A-21.

TAKAKO SUGIYAMA, KUMIYA SUGIYAMA, MASAO TODA, TASTUO YUKAWA, SOHEI MAKINO, and TAKESHI FUKUDA, “Risk

Factors for Asthma and Allergic Diseases Among 13-14-Year-Old Schoolchildren in Japan,” Allergology

International, 51 (2002), 139-150.

A-22.

D. HOLBEN, M. C. MCCLINCY, J. P. HOLCOMB, and K. L. DEAN, “Food Security Status of Households in Appalachian

Ohio with Children in Head Start,” Journal of American Dietetic Association, 104 (2004), 238-241.

A-23.

JOHN H. PORCERELLI, ROSEMARY COGAN, PATRICIA P. WEST, EDWARD A. ROSE, DAWN LAMBRECHT, KAREN E. WILSON,

RICHARD K. SEVERSON, and DUNIA KARANA, “Violent Victimization of Women and Men: Physical and Psychiatric

Symptoms,” Journal of the American Board of Family Practice, 16 (2003), 32-39.

A-24.

KENG CHEN, LEE MEI YAP, ROBIN MARKS, and STEPHEN SHUMACK, “Short-Course Therapy with Imiquimod 5%

Cream for Solar Keratoses: A Randomized Controlled Trial,” Australasian Journal of Dermatology, 44 (2003),

250-255.

A-25.

VALLABH JANARDHAN, ROBERT FRIEDLANDER, HOWARD RIINA, and PHILIP EDWIN STIEG, “Identifying Patients at Risk

for Postprocedural Morbidity After Treatment of Incidental Intracranial Aneurysms: The Role of Aneurysm Size

and Location,” Neurosurgical Focus, 13 (2002), 1-8.

A-26.

PAUL B. GOLD, ROBERT N. RUBEY, and RICHARD T. HARVEY, “Naturalistic, Self-Assignment Comparative Trial of

Bupropion SR, a Nicotine Patch, or Both for Smoking Cessation Treatment in Primary Care,” American Journal

on Addictions, 11 (2002), 315-331.

A-27.

ZOLTAN KOZINSZKY and GYORGY BARTAI, “Contraceptive Behavior of Teenagers Requesting Abortion,” European

Journal of Obstetrics and Gynecology and Reproductive Biology, 112 (2004), 80-83.

A-28.

PAOLO CROSIGNANI, ANDREA TITTARELLI, ALESSANDRO BORGINI, TIZIANA CODAZZI, ADRIANO ROVELLI, EMMA PORRO,

PAOLO CONTIERO, NADIA BIANCHI, GIOVANNA TAGLIABUE, ROSARIA FISSI, FRANCESCO ROSSITTO, and FRANCO BERRINO,

“Childhood Leukemia and Road Traffic: A Population-Based Case-Control Study,” International Journal of

Cancer, 108 (2004), 596-599.

A-29.

ROBYN GALLAGHER, SHARON MCKINLEY, and KATHLEEN DRACUP, “Predictors of Women’s Attendance at Cardiac

Rehabilitation Programs,” Progress in Cardiovascular Nursing, 18 (2003), 121-126.

A-30.

G. STANLEY, B. APPADU, M. MEAD, and D. J. ROWBOTHAM, “Dose Requirements, Efficacy and Side Effects of

Morphine and Pethidine Delivered by Patient-Controlled Analgesia After Gynaecological Surgery,” British

Journal of Anaesthesia, 76 (1996), 484-486.

A-31.

JAMES D. SARGENT, THERESE A. STUKEL, MADELINE A. DALTON, JEAN L. FREEMAN, and MARY JEAN BROWN, “Iron

Deficiency in Massachusetts Communities: Socioeconomic and Demographic Risk Factors Among Children,”

American Journal of Public Health, 86 (1996), 544-550.

A-32.

EWALD HORWATH, FRANCINE COURNOS, KAREN MCKINNON, JEANNINE R. GUIDO, and RICHARD HERMAN, “Illicit-Drug

Injection Among Psychiatric Patients Without a Primary Substance Use Disorder,” Psychiatric Services, 47

(1996), 181-185.

A-33.

MARTHA SKINNER, JENNIFER J. ANDERSON, ROBERT SIMMS, RODNEY FALK, MING WANG, CARYN A. LIBBEY, LEE ANNA JONES,

and ALAN S. COHEN, “Treatment of 100 Patients with Primary Amyloidosis: A Randomized Trial of Melphalan,

Prednisone, and Colchicine Versus Colchicine Only,” American Journal of Medicine, 100 (1996), 290-298.

A-34.

YUJI MIYAJIMA, KEIZO HORIBE, MINORU FUKUDA, KIMIKAZU MATSUMOTO, SHIN-ICHIRO NUMATA, HIROSHI MORI, and

KOJI KATO, “Sequential Detection of Tumor Cells in the Peripheral Blood and Bone Marrow of Patients with Stage

IV Neuroblastoma by the Reverse Transcription-Polymerase Chain Reaction for Tyrosine Hydroxylase mRNA,”

Cancer, 77 (1996), 1214-1219.

A-35.

CRAIG R. COHEN, ANN DUERR, NIWAT PRUITHITHADA, SUNGWAL RUGPAO, SHARON HILLIER, PATRICIA GARCIA, and

KENRAD NELSON, “Bacterial Vaginosis and HIV Seroprevalence Among Female Commercial Sex Workers in

Chiang Mai, Thailand,” AIDS, 9 (1995), 1093-1097.

A-36.

DEBORAH S. LIPSCHITZ, MARGARET L. KAPLAN, JODIE B. SORKENN, GIANNI L. FAEDDA, PETER CHORNEY, and GREGORY

M. ASNIS, “Prevalence and Characteristics of Physical and Sexual Abuse Among Psychiatric Outpatients,”

Psychiatric Services, 47 (1996), 189-191.

A-37.

JOHN M. O’BRIEN, BRIAN M. MERCER, NANCY T. CLEARY, and BAHA M. SIBAI, “Efficacy of Outpatient Induction with

Low-Dose Intravaginal Prostaglandin E₂: A Randomized, Double-Blind, Placebo-Controlled Trial,” American

Journal of Obstetrics and Gynecology, 173 (1995), 1855-1859.

A-38.

ABDALLAH M. ADRA, HELAIN J. LANDY, JAIME NAHMIAS, and ORLANDO GÓMEZ-MARÍN, “The Fetus with Gastroschisis:

Impact of Route of Delivery and Prenatal Ultrasonography,” American Journal of Obstetrics and

Gynecology, 174 (1996), 540-546.

A-39.

SIMIN LIU, LYNN E. QUENEMOEN, JOSEPHINE MALILAY, ERIC NOJI, THOMAS SINKS, and JAMES MENDLEIN, “Assessment

of a Severe-Weather Warning System and Disaster Preparedness, Calhoun County, Alabama, 1994,” American

Journal of Public Health, 86 (1996), 87-89.

A-40.

KETAN H. PATEL, JONATHAN C. JAVITT, JAMES M. TIELSCH, DEBRA A. STREET, JOANNE KATZ, HARRY A. QUIGLEY, and

ALFRED SOMMER, “Incidence of Acute Angle-Closure Glaucoma After Pharmacologic Mydriasis,” American

Journal of Ophthalmology, 120 (1995), 709-717.

A-41.

ALEXANDRE E. VOSKUYL, AEILKO H. ZWINDERMAN, MARIE LOUISE WESTEDT, JAN P. VANDENBROUCKE, FERDINAND C.

BREEDVELD, and JOHANNA M. W. HAZES, “Factors Associated with the Development of Vasculitis in Rheumatoid

Arthritis: Results of a Case-Control Study,” Annals of the Rheumatic Diseases, 55 (1996), 190-192.

A-42.

ROBERT L. HARRIS, CHRISTOPHER A. YANCEY, WINFRED L. WISER, JOHN C. MORRISON, and G. RODNEY MEEKS,

“Comparison of Anterior Colporrhaphy and Retropubic Urethropexy for Patients with Genuine Stress Urinary

Incontinence,” American Journal of Obstetrics and Gynecology, 173 (1995), 1671-1675.

A-43.

YOSHIHIRO KOHASHI, MASAYOSHI OGA, and YOICHI SUGIOKA, “A New Method Using Top Views of the Spine to

Predict the Progression of Curves in Idiopathic Scoliosis During Growth,” Spine, 21 (1996), 212-217.

A-44.

M. P. M. BURGER, H. HOLLEMA, W. J. L. M. PIETERS, F. P. SCHRÖDER, and W. G. V. QUINT, “Epidemiological

Evidence of Cervical Intraepithelial Neoplasia Without the Presence of Human Papillomavirus,” British Journal

of Cancer, 73 (1996), 831-836.

A-45.

ERIC J. THOMAS, HELEN R. BURSTIN, ANNE C. O’NEIL, E. JOHN ORAV, and TROYEN A. BRENNAN, “Patient

Noncompliance with Medical Advice After the Emergency Department Visit,” Annals of Emergency Medicine,

(1996), 49-55.

A-46.

S. T. O’KEEFE and J. N. LAVAN, “Subcutaneous Fluids in Elderly Hospital Patients with Cognitive Impairment,”

Gerontology, 42 (1996), 36-39.
