Methods of pharmacoeconomic evaluation: Cost of Illness, Cost-Minimization Analysis, Cost-Effectiveness Analysis

1 Methods of pharmacoeconomic evaluation

3.1 Cost of Illness

3.1.1 Approaches

3.1.2 Methods

3.2 Atopic Dermatitis

3.2.1 Therapy-Specific Cost

3.2.1.1 Topical Corticosteroids

3.2.1.2 Topical Immunomodulators

3.4 HPV

6.1 Cost-Minimization Analysis

6.1.1 What is Meant by Therapeutic Equivalence?

6.1.2 Optimizing Evidence from Clinical Trials

6.2 Sources of Clinical Trial Evidence

6.2.1 Superiority Trials

6.2.2 Equivalence Trials

6.2.2.1 Characteristics of Equivalence Trials

6.2.2.2 Equivalence Range or Margin

6.2.3 Non-Inferiority Trials

6.2.3.1 Characteristics of a Non-Inferiority Trial

6.2.3.2 Non-Inferiority Range or Margin

6.3 Other Issues To Be Addressed In Evaluating Equivalence

6.3.1 Statistical versus Clinical Significance

6.3.2 Equivalence in Single or Multiple Outcomes?

6.3.3 Whose Views of Clinical Equivalence Should be Preeminent?

6.3.4 Over What Period Should We Evaluate Clinical Equivalence?

6.4 Effectively Targeting the Use of CMA

7.1 The Rationale for Cost-Effectiveness Analysis

7.2 The Cost-Effectiveness Plane

7.3 Basic Components of a Cost-Effectiveness Analysis

7.3.1 Enumeration of the Options

7.3.2 Perspective of the Analysis

7.3.3 Time Horizon

7.3.4 Scope of the Analysis

7.3.5 Measuring and Valuing Costs

7.3.6 Measuring and Valuing Outcomes

7.3.7 Time Preference

7.3.8 Choice of Analytic Modeling Method

7.3.9 Accounting for Uncertainty

7.4 Calculation of Incremental Cost-Effectiveness Ratios

7.4.1 Dominance and Extended Dominance

7.4.2 Sensitivity Analysis

7.4.3 Interpretation of CEA Results

1 Methods of pharmacoeconomic evaluation

Pharmacoeconomic methods: Economic (Cost consequence, Cost benefit, Cost effectiveness, Cost minimization, Cost utility), Humanistic (Quality of life, Patient preferences, Patient satisfaction).

There are several types of pharmacoeconomic evaluation, each of which is useful in different circumstances:

• Cost-of-illness analyses consider the costs of a given disease without considering the outcome.

• Cost-minimization analyses compare the costs of interventions that provide the same outcome, with the ultimate aim of identifying the cheapest option.

• Cost-effectiveness analyses involve the comparison of cost per standardized unit of effectiveness for two or more interventions that provide varying outcomes.

• Cost-utility analyses aim to compare the cost per quality adjusted life-year for two or more interventions that provide varying outcomes.

• Finally, cost-benefit analyses compare the costs and benefits of two or more interventions that provide varying outcomes, where outcome is measured in monetary terms.

Cost minimisation analysis (CMA)

This involves measuring only costs, usually only to the health service, and is applicable only where the outcomes are identical and need not be considered separately. An example would be prescribing a generic preparation instead of the brand leader (lower cost but same health outcomes).
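As a minimal sketch of the arithmetic, assuming that identical outcomes have already been established, a CMA reduces to comparing total costs and picking the lowest. The option names and prices below are hypothetical and are not taken from any study.

```python
def cost_minimisation(options):
    """Return the (name, cost) pair with the lowest total cost.

    Only meaningful when every option is known to give the same outcome.
    """
    if not options:
        raise ValueError("at least one option is required")
    return min(options.items(), key=lambda item: item[1])


# Hypothetical annual cost per patient, all in one currency.
annual_cost = {"brand leader": 240.0, "generic": 95.0}

cheapest_name, cheapest_cost = cost_minimisation(annual_cost)
saving = annual_cost["brand leader"] - cheapest_cost
print(f"{cheapest_name}: {cheapest_cost:.2f} (saves {saving:.2f} per patient-year)")
```

The function deliberately knows nothing about outcomes: the judgement that the options are equivalent has to be made before it is called.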

Cost effectiveness analysis (CEA)

The term cost effectiveness is often used loosely to refer to the whole of economic evaluation, but should properly refer to a particular type of evaluation, in which the health benefit can be defined and measured in natural units (e.g. years of life saved, ulcers healed) and the costs are measured in money. It therefore compares therapies with qualitatively similar outcomes in a particular therapeutic area. For instance, in severe reflux oesophagitis, we could consider the costs per patient relieved of symptoms using a proton pump inhibitor compared to those using H2 blockers. CEA is the most commonly applied form of economic analysis in the literature, especially in drug therapy. It does not allow comparisons to be made between two totally different areas of medicine with different outcomes. The broad form of these evaluations is shown in box 1, and the key measure is the incremental cost effectiveness ratio (ICER).
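The ICER is the extra cost of one option over another divided by the extra effect it delivers. The sketch below illustrates that calculation for the reflux oesophagitis comparison. The costs and response counts are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def icer(cost_new, effect_new, cost_old, effect_old):
    """Incremental cost per extra unit of effect (new vs. old)."""
    delta_cost = cost_new - cost_old
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("no difference in effect: the ICER is undefined")
    return delta_cost / delta_effect


# Hypothetical totals per 100 patients treated.
ppi = {"cost": 12000.0, "relieved": 80}
h2_blocker = {"cost": 7000.0, "relieved": 60}

ratio = icer(ppi["cost"], ppi["relieved"],
             h2_blocker["cost"], h2_blocker["relieved"])
print(f"ICER: {ratio:.0f} per additional patient relieved of symptoms")
```

With these made-up figures the proton pump inhibitor costs 5000 more and relieves 20 more patients, so the ICER is 250 per additional patient relieved.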

Cost utility analysis (CUA)

This is similar to cost effectiveness in that the costs are measured in money and there is a defined outcome (box 2). But here the outcome is a unit of utility (e.g. a QALY). Since this endpoint is not directly dependent on the disease state, CUA can in theory look at more than one area of medicine, e.g. cost per QALY of coronary artery bypass grafting versus cost per QALY for erythropoietin in renal disease. In practice this is not so easy since the QALY is not a well defined fixed unit transferable from study to study. We should be particularly wary of attempts to draw up league tables of QALYs to allow comparisons between a range of therapies. The values in such tables have usually been derived at different times and in different ways and are not comparable.
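A QALY is commonly calculated as time spent in a health state multiplied by a utility weight for that state. The sketch below shows that calculation and the resulting cost per QALY gained. All durations, weights and costs are hypothetical, and the restriction of weights to the range 0 to 1 is a simplifying assumption of this example.

```python
def qalys(health_states):
    """Sum of (years in state) x (utility weight) over a sequence of states.

    This sketch assumes weights between 0 (dead) and 1 (full health).
    """
    total = 0.0
    for years, utility in health_states:
        if not 0.0 <= utility <= 1.0:
            raise ValueError("utility weight must lie between 0 and 1")
        total += years * utility
    return total


# Hypothetical patient pathways as (years, utility weight) pairs.
with_treatment = qalys([(4, 0.75), (6, 0.5)])
without_treatment = qalys([(5, 0.5)])

extra_cost = 21000.0  # hypothetical incremental cost of treatment
cost_per_qaly = extra_cost / (with_treatment - without_treatment)
print(f"QALYs gained: {with_treatment - without_treatment:.1f}")
print(f"Cost per QALY gained: {cost_per_qaly:.0f}")
```

Because the weights depend on how and when they were elicited, two studies can report different QALYs for the same pathway, which is one reason league tables need cautious reading.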

Cost benefit analysis (CBA)

Here, the benefit is measured as the associated economic benefit of an intervention (e.g. monetary value of returning a worker to employment earlier), and hence both costs and benefits are expressed in money. CBA may ignore many intangible but very important benefits not measurable in money terms, e.g. relief of anxiety. CBA may also seem to discriminate against those in whom a return to productive employment is unlikely, e.g. the elderly or the unemployed.

However, the virtue of this analysis is that it may allow comparisons to be made between very different areas, and not just medical ones, e.g. cost benefits of expanding university education (benefits of improved education and hence productivity) compared to establishing a back pain service (enhancing productivity by returning patients to work). This approach is not widely used in health economics, although many economists like it on theoretical grounds and because it removes some of the “sacred cow” protection which surrounds health care. They argue that health should be another commodity, and not necessarily valued more than other possible uses of the resources.

Cost consequences and other types of evaluation

Other forms of quasi-health economic evaluation may be seen in the literature but are not true economic evaluations because they do not weigh costs and benefits in an incremental manner. In some cases, often where studies consider multiple outcomes, costs and benefits are presented in a disaggregated form (e.g. health profiles). These evaluations are frequently referred to as cost consequences analyses. Burden of disease (also known as cost of illness) studies attempt to measure the health and resource implications arising to society from a particular disease.

3.1 Cost of Illness

Cost-of-illness (COI) analysis, often called burden-of-illness (BOI) analysis, measures the economic burden of disease and illness on society. The components of a pharmacoeconomic or cost-effectiveness analysis include costs and consequences. Costs can be divided into direct and indirect costs. Direct medical costs are those related to providing medical services, such as a hospital stay, physician fees for outpatient visits, and drug costs (including the cost of the medication itself and of any downstream adverse events that may arise as a result of drug administration). Direct nonmedical costs are expenses, such as transportation costs, that are a direct result of the illness. Direct costs are most frequently included in a COI study, whereas indirect costs, those associated with changes in individual productivity, are often omitted because they are difficult to obtain. Examples of indirect costs are lost time from work (absenteeism) and unpaid assistance from a family member. In addition, intangible costs, such as pain and suffering, may be included in the analysis. Analyses can be done from one or several perspectives, which helps in determining how disease costs are distributed across multiple stakeholders. The societal perspective typically includes indirect costs (for example, lost time from work) as well as direct costs, because both are costs to society. The payer perspective typically includes only direct costs.
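The effect of perspective on a COI total can be sketched as a filter over cost categories. The example below is illustrative only: the category labels, the `PERSPECTIVES` mapping and every amount are hypothetical. The mapping follows the text in counting direct costs for the payer and direct plus indirect costs for society.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostItem:
    name: str
    category: str  # "direct_medical", "direct_nonmedical" or "indirect"
    amount: float  # cost per patient per year


# Categories counted under each perspective (an assumption of this sketch):
# payer = direct costs only; societal = direct plus indirect costs.
PERSPECTIVES = {
    "payer": {"direct_medical", "direct_nonmedical"},
    "societal": {"direct_medical", "direct_nonmedical", "indirect"},
}


def cost_of_illness(items, perspective):
    """Sum the per-patient annual costs that a given perspective includes."""
    included = PERSPECTIVES[perspective]
    return sum(item.amount for item in items if item.category in included)


# Hypothetical per-patient annual costs.
items = [
    CostItem("hospital stay", "direct_medical", 400.0),
    CostItem("outpatient visits", "direct_medical", 250.0),
    CostItem("drugs", "direct_medical", 150.0),
    CostItem("transportation", "direct_nonmedical", 50.0),
    CostItem("lost time from work", "indirect", 600.0),
]

payer_total = cost_of_illness(items, "payer")
societal_total = cost_of_illness(items, "societal")
print(f"payer: {payer_total:.0f}, societal: {societal_total:.0f}")
```

Keeping the category on each item, rather than hard-coding two totals, makes it straightforward to report the same data from several perspectives.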

COI analyses are used to aid policy making, to guide resource allocation (that is, prioritizing resource use for disease treatment and prevention), and as baseline research from which to determine the potential benefit of new therapies.

3.1.1 Approaches

There are two approaches to conducting COI analyses: the prevalence-based approach and the incidence-based approach. The prevalence-based approach considers the cost of disease within a specified time period. It is most appropriate for diseases or illnesses that are measured within the time period of analysis and that do not change much over time (e.g., migraine) or for acute diseases (e.g., asthma, eczema).

This is in contrast to the incidence-based approach, which calculates the lifetime costs of disease. This approach is most appropriate for chronic diseases, such as hypertension, or diseases that take a long time to progress, such as diabetes. It considers disease progression and survival probability. The disease is first defined using existing disease definitions or classification systems, such as International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes. To accurately capture the disease COI over the appropriate timeframe, depending on the approach chosen, one must take into consideration the epidemiology of the disease under study and the demographic profile of the typical patient population.
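The difference between the two approaches can be sketched as two short calculations. The function names, case counts, costs and survival probabilities below are all hypothetical. The optional `discount_rate` parameter is an assumption added for illustration and is left at zero here.

```python
def prevalence_based_cost(prevalent_cases, annual_cost_per_case):
    """Cost of all existing cases within one specified year."""
    return prevalent_cases * annual_cost_per_case


def incidence_based_cost(new_cases, annual_cost_per_case,
                         survival_by_year, discount_rate=0.0):
    """Expected lifetime cost of the cases that begin in one year.

    survival_by_year[t] is the probability that a patient is still alive
    (and incurring costs) t years after onset, with t = 0 the onset year.
    """
    per_case = sum(
        annual_cost_per_case * p_alive / (1.0 + discount_rate) ** year
        for year, p_alive in enumerate(survival_by_year)
    )
    return new_cases * per_case


# Hypothetical inputs.
one_year_cost = prevalence_based_cost(prevalent_cases=10_000,
                                      annual_cost_per_case=500.0)
lifetime_cost = incidence_based_cost(new_cases=1_000,
                                     annual_cost_per_case=500.0,
                                     survival_by_year=[1.0, 0.5, 0.25])
print(f"prevalence-based (one year): {one_year_cost:,.0f}")
print(f"incidence-based (lifetime of new cases): {lifetime_cost:,.0f}")
```

The prevalence-based figure needs only a cross-section of cases and costs, whereas the incidence-based figure needs survival and progression data, which is why the epidemiology of the disease matters to the choice.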

3.1.2 Methods

A micro-costing method has been used in many studies to examine COI. The direct costs included in this method typically comprise out-of-pocket expenses for noninsured items (over-the-counter medications, visits to out-of-plan health practitioners, laundry/clothing, and specialty items) and co-payments for prescription medications and clinic visits determined from insurance claims databases as well as the usual direct cost items previously outlined.

Two examples of COI studies, atopic dermatitis (AD) and human papillomavirus (HPV), will now be examined.

3.2 Atopic Dermatitis

AD is a chronic disease that affects the skin of children and adults. It results in itchy, flaky skin and has a considerable impact on patient quality of life (QoL), as well as imposing a substantial monetary burden. Direct and indirect costs for AD have been measured in various countries and are substantial from both a patient and a societal perspective. The direct costs have been reported to range from $71 to $2,559 per patient per year. This variation in cost is due to differences in study methodology as well as differences in the health care systems of the various countries. Most of the costs of AD consist of indirect costs associated with time lost from work, lifestyle changes, and non-traditional or over-the-counter treatments for AD. The financial burden on the health care system and on society is expected to grow because the prevalence of the disease is increasing.

Indeed, studies in the past 7 years, using a prevalence-based approach to calculate COI, have demonstrated direct costs ranging from US$150 (using the approximate US$ equivalent in 2005) to US$5806 per patient per year, with the variation attributable to different cost-accounting methods. Table 3.1 lists numerous references in which per-patient COI was calculated in US$ (or equivalent).

Typically, outpatient visits and medications made up the majority of direct costs, ranging from approximately 62% to >90% [8]. The distribution of AD-associated direct costs from Fivenson and colleagues is shown in Figure 3.1 [8]. In those studies that examined indirect costs (e.g., the patient out-of-pocket costs for co-pays, medications, household items, loss of productivity), they made up substantial percentages of the total, e.g., 36% or 73%. Several studies showed increasing costs with worsening disease severity in adults. Using a micro cost-accounting approach, whereby costs of hospitalizations, consults, drug therapy, treatment procedures, diagnostic tests, laboratory tests, clinic visits, and urgent care visits were summed, Fivenson, Arnold, and colleagues (Table 3.2) reported an average annual per patient direct cost ranging from $435 in patients with mild disease to $3229 in patients with severe disease.

Figure 3.1 Distribution of atopic dermatitis-associated direct costs in a U.S. health plan.

Indirect costs also increased with worsening disease severity, by more than twofold to threefold [11] and by as much as almost tenfold [8]. Similarly, Ehlken and co-authors showed a greater than twofold difference in total (both direct and indirect) costs between patients with mild and severe disease.

3.2.1 Therapy-Specific Cost

Several studies have compared the cost of different uses of topical corticosteroids (TCS) vs. topical immunomodulators (i.e., pimecrolimus and tacrolimus) and of the topical immunomodulators against each other. Some of these are detailed below.

3.2.1.1 Topical Corticosteroids

Green and colleagues undertook a systematic review of 10 randomized controlled trials (RCTs) in patients with AD [9,10]. Their literature search at the time revealed no published studies of this nature. The authors noted a wide variation in price and product availability, with prices ranging from £0.60 (approximately US$1.09) for generic hydrocortisone to £4.88 (approximately US$8.80) for mometasone furoate (Elocon), the most expensive product at that time.

Six of the RCTs favored the once-daily option as the lowest-cost treatment and four favored a twice-daily option, with successful outcome being defined by overall response to treatment, relapse or flare-up rate, adverse effects, compliance, tolerability, patient preference measures, and impact on quality of life. One of the studies favoring twice-daily use achieved a greater benefit (number of successful treatment responders) at a greater cost. However, it was felt that this option would still likely be very cost-effective, given the relatively low prices of TCS. Two limitations were noted in the review: potentially low generalizability, because 80% of the RCTs referred to potent TCS in patients with moderate-to-severe disease whereas the majority of patients with AD have mild disease, and a lack of information on the quantity of product used.

3.2.1.2 Topical Immunomodulators

Clinical data show that topical immunomodulators are effective in AD, yet do not cause the significant adverse effects associated with TCS. Delea and colleagues retrospectively compared 157 pimecrolimus patients with 157 tacrolimus patients previously receiving TCS in a large claims database of managed care patients in terms of resource utilization (concomitant medications) and AD-related follow-up costs. They used propensity matching to control for differences between the groups in baseline demographic and clinical characteristics and utilization of AD-related services prior to assessment of disease severity. Patients in the pimecrolimus group had fewer pharmacy claims for TCS (mean 1.37 vs. 2.04, P = 0.021); this occurred primarily in the high-potency topical corticosteroid category. Fewer patients in the pimecrolimus group also received antistaphylococcal antibiotics during the follow-up period (16% vs. 27%, P = 0.014), and total AD-related costs during this time were lower in this group than in the tacrolimus group (mean $263 vs. $361, P = 0.012).

3.4 HPV

Persistent infection with cancer-associated HPV (termed oncogenic or high-risk HPV) causes the majority of squamous cell cervical cancer, the most common type of cervical cancer, and its histologic precursor lesions, the low-grade cervical dysplasia Cervical Intraepithelial Neoplasia-1 (CIN1) and the moderate-to-high-grade dysplasia CIN 2/3. Multiple HPV strains cause varying degrees of invasive cervical cancer (ICC) and its CIN precursors. HPV strains 16 and 18 cause approximately 70% of all cervical cancers [15,16] and of CIN3 cases, and 50% of CIN2 cases. In addition, HPV 16 and 18 cause approximately 35 to 50% of all CIN1. Low-oncogenic-risk HPV types 6 and 11 account for 90% of genital wart cases. Unfortunately, cytological and histological examinations cannot reliably distinguish patients who will progress from cervical dysplasia to ICC from those whose dysplasias will regress spontaneously, the latter being the vast majority of cases. This inability to definitively ascertain the natural history of HPV infection is one of the primary reasons for the dilemma with HPV vaccination.

Although cervical cancer screening programs, such as routine screening via the Papanicolaou (Pap) cervical smear, have substantially reduced the incidence and mortality of ICC in developed countries over the past 50 years, these declines have slowed in recent years due to poor sensitivity of cervical cytology, anxiety and morbidity of screening investigations, poor access to and attendance of screening programs, falling screening coverage, and poor predictive value for adenocarcinoma, an increasingly common cause of ICC. HPV is the most common sexually transmitted disease in the United States and virtually 100% of cervical cancer is due to HPV. HPV is also linked to head and neck cancer in men. There are more than 100 HPV strains (thereby potentially reducing vaccine efficacy for oncogenic strains not covered by the vaccine); HPV infection is often self-limited. One factor that mitigates the argument against using the vaccine is that the cost per QALY of screening with Pap smears falls (that is, cost-effectiveness improves) from USD 1 million/QALY if patients continue to be screened annually, as is the common current recommendation, to USD 150,000/QALY if patients are screened every 3 years, the latter a likely scenario if the vaccine is used.

Worldwide, cervical cancer accounts for 470,000 new cases and 233,000 deaths per year; it is the second-leading cause of cancer deaths, with 80% of these cases observed in developing countries. Women in developing countries are especially vulnerable as they lack access to both cervical cancer screening and treatment. In the United States, 9710 new cases of ICC were expected to be diagnosed in 2006, and about 3700 women were expected to die from ICC. The National Cancer Institute estimates an annual incidence of new genital HPV infections of 6 million. Quadrivalent Human Papillomavirus (HPV) Vaccine recombinant (Gardasil®), the vaccine recently approved for use in the United States and Europe, covers the two major oncogenic HPV strains (16 and 18) for cervical cancer. In addition, it covers HPV strains 6 and 11, the primary causes of genital warts. However, the vaccine does not offer full protection against cervical cancer, because it does not protect against HPV strains 31 and 45, which are also implicated in ICC and cervical dysplasia. To significantly reduce the rate of cervical cancer in the population as a whole, 70% of girls need to be vaccinated to achieve what is called “herd immunity”, in which the vaccine’s impact goes beyond just the people who are inoculated. So far, it is unknown whether HPV strains will mutate as the vaccine is introduced, although this is not very likely, seeing that HPV is a DNA-based virus.

Insinga and colleagues used administrative and laboratory data from a large U.S. health plan to examine costs, resource utilization, and annual health plan expenditures for cervical HPV-related disease. An episode of care was defined as beginning with a routine cervical smear, that is, one that required no evidence of follow-up for a previous Pap smear abnormality or ICD-9 diagnosis of a cervical abnormality during the previous 9 months. If CIN or cancer was not detected during an episode of care, biopsy results were termed false-positive. Because the data source was a prepaid health plan without direct billing for procedures or services, service-specific costs were assigned from the Medstat Marketscan database as a proxy for the health plan costs. Because of the small number of cervical cancer cases in the data set, costs were assigned on an age- and stage-specific basis using the Surveillance Epidemiology and End Results Program (SEER; National Cancer Institute; U.S. Department of Health and Human Services, Bethesda, MD) and an Agency for Healthcare Research and Quality evidence report. All cost estimates were converted to 2002 dollars using the Medical Care component of the Consumer Price Index.

The authors found that episodes of care after an abnormal routine cervical smear were $732 on average, compared with $57 for visits with negative results, with a statistically significant trend toward higher costs with increasing grade of initial cytologic abnormality. False-positive cervical smears cost $376 annually, while incomplete follow-up was $79. Regardless of age group, cervical HPV-related disease annual health care costs were $26,415 per 1000 enrollees, with the greatest costs of $51,863 being observed in the 20- to 29-year-old age group. The largest cost contribution was that of routine screening at 63.4% of total costs (range by age group of 54.1% to 70.8%), followed by cost of CIN 2/3, then cancer, false-positive smear, CIN 1, and incomplete follow-up (see Figure 3.2).

Figure 3.2 Distribution of cervical HPV-related disease direct costs in a commercial U.S. health plan.

Insinga and co-authors extrapolated their results to the general U.S. population to derive a total health care cost for HPV-related disease in 1998 of $3.4 billion, with expenditures for routine screening accounting for $2.1 billion, false-positive Pap test $300 million, CIN 1 $150 million, CIN 2/3 $450 million, and ICC $350 million in 2002 dollars. A follow-up study by the same authors estimated the annual direct costs of abnormal cervical findings and treating cancer at $3.5 billion in 2005 US$. Annual direct cost estimates in 2005 US dollars have been as high as $4.6 billion, and adding in costs of anogenital warts and other cancers associated with oncogenic HPV strains raises the total estimated economic burden to as high as US$5 billion in 2006 US$.
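The component figures quoted for the 1998 extrapolation can be summed and expressed as shares of the total. The short script below does this using only the numbers given in the text, in millions of 2002 dollars. The variable names are illustrative.

```python
# Component costs from the text, in millions of 2002 US dollars.
components_musd = {
    "routine screening": 2100,
    "false-positive Pap test": 300,
    "CIN 1": 150,
    "CIN 2/3": 450,
    "invasive cervical cancer": 350,
}

total_musd = sum(components_musd.values())
shares = {name: amount / total_musd
          for name, amount in components_musd.items()}

for name, amount in components_musd.items():
    print(f"{name:<26} ${amount:>5,} million  {shares[name]:6.1%}")
print(f"{'total':<26} ${total_musd:>5,} million")
```

The components sum to $3,350 million, consistent with the reported $3.4 billion after rounding, and routine screening accounts for a little under 63% of that total, close to the 63.4% share reported at health plan level.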

Insinga and colleagues also estimated indirect costs. They assumed that 130,377 women would have been alive during 2000 had they not died from cervical cancer during that or a previous year, that >75% of these women died before age 60 (with >25% dying prior to age 40), and that 37,594 (29%) of these women would have had labor force earnings during 2000. Using these data, the total productivity loss in 2000 owing to cervical cancer mortality was estimated at $1.3 billion, several times higher than recent estimates of the annual U.S. direct medical costs of US$300 to $400 million associated with cervical cancer. As in the AD studies, therefore, indirect costs are thought to account for a much greater burden than direct costs of HPV.
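The relationships among these figures can be checked with a few lines of arithmetic. The inputs below are the numbers quoted in the text. The average loss per woman with earnings is a derived illustration, not a figure reported by the authors.

```python
# Figures quoted in the text.
women_lost = 130_377            # would have been alive in 2000
with_earnings = 37_594          # of these, expected to have had earnings
productivity_loss_usd = 1.3e9   # total productivity loss in 2000
direct_cost_range_usd = (300e6, 400e6)  # annual direct medical costs

share_with_earnings = with_earnings / women_lost

# Derived for illustration only: average loss per woman with earnings.
loss_per_earner = productivity_loss_usd / with_earnings

# How many times larger the indirect cost is than the direct cost range.
ratios = [productivity_loss_usd / direct for direct in direct_cost_range_usd]

print(f"share with earnings: {share_with_earnings:.1%}")
print(f"average loss per woman with earnings: ${loss_per_earner:,.0f}")
print(f"indirect/direct ratio: {min(ratios):.2f} to {max(ratios):.2f}")
```

The share works out to roughly 29%, matching the text, and the productivity loss is between about 3 and 4.5 times the direct medical cost estimates, which is what "several times higher" summarizes.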

In summary, COI or BOI lays the foundation on which to frame the different types of analyses (see Chapters 4 through 9) that are used to make decisions in allocation of healthcare resources. As indirect costs, that is, productivity, often account for a substantial portion of the burden, these should be assessed as part of the COI computation whenever possible.

6.1 Cost-Minimization Analysis

The principal issues that are addressed in this chapter are:

1. The circumstances in which cost-minimization analysis (CMA) is an appropriate methodology to undertake health economic evaluations.

2. Steps that can be taken to improve the quality of CMAs and, hence, their reliability as a basis for healthcare decision making.

The appropriateness of any economic methodology depends on the nature and quality of the underlying clinical evidence; evaluations based on inappropriate or poor-quality clinical data fail to provide a reliable basis for health care decision-making. The primacy of clinical data is particularly evident in the case of CMA, in which, conditional on the health benefits of two competing options being equivalent, the least expensive option is preferred. Perhaps as a consequence of this apparent simplicity, scant attention has previously been paid to the theoretical and practical methods used to inform the analysis or to establish the appropriateness of this choice of methodology.

Many sources of clinical evidence can be used to support economic analyses; however, the “gold standard” is normally considered to be the randomized controlled trial (RCT), which holds everything constant with the exception of the drug being evaluated. Given that, by definition, the results of clinical trials cannot be known in advance, it is impossible to plan to undertake a CMA alongside an RCT because it is not certain that the health outcomes being compared will be equivalent (Donaldson, Hundley et al. 1996). Therefore, no prospective economic evaluation starts out as a CMA; only when the health outcomes generated are empirically demonstrated to be “identical or similar” will the CMA be adopted as an appropriate methodology by the health economist.

CMA is frequently portrayed as being the “poor relation” among health economic methodologies, with its apparent simplicity making it unworthy of being considered alongside more theoretically rigorous health economic methodologies. However, it is important that health economists recognize and acknowledge that the theoretical underpinnings of CMA are just as rigorous as those underpinning other methods of economic evaluation. Perhaps as a consequence of the comparative disdain in which CMA has been held, its use to date appears to have been poorly conceived and frequently inappropriate. In this regard, CMA has been frequently employed as an evaluative tool to support and justify the introduction of cheaper, but potentially less effective, treatments. The usual procedure is for the analyst to simply assume that the benefits of a new health technology are equivalent to the existing “gold standard” therapy without having sufficient evidence to justify such a claim. For example, by assuming a class effect for similar types of drugs (each drug in a class having equivalent outcomes) it then becomes possible to base subsequent analysis solely on a comparison of costs—an attractive strategy if you are introducing a cheaper but less effective drug.

The methods currently used to justify equivalence in outcomes in a CMA therefore appear to be inherently flawed and indicate an urgent need to improve the theoretical rigor underlying this aspect of CMAs if they are to be taken seriously as a method of economic evaluation. The current haphazard approach leads to a situation in which CMA is typically described in health economics textbooks as a form of economic evaluation where “… the decision simply revolves around the costs” (Gold, Siegel et al. 1996 p. 165).

This interpretation ignores the extreme rigor that should be required to ensure equivalence in health benefits prior to deciding on the appropriateness of employing CMA as an economic methodology. The crucial decision relates to the fact that CMA has been defined as being an appropriate methodology. Underpinning this decision is a detailed analysis of clinical data that convinces the analyst that the interventions being compared lead to equivalent health outcomes. Only in these strictly controlled circumstances is it legitimate for CMA to concentrate on costs alone. As such, a crucial and indispensable element underpinning the decision to use CMA as an economic methodology is the need to unambiguously determine the therapeutic equivalence of competing interventions (Newby and Hill 2003). In practice, therefore, the extent to which CMA represents an appropriate methodological structure is entirely determined by the interpretation that can be placed on the available clinical evidence.

6.1.1 What is Meant by Therapeutic Equivalence?

The extent to which alternative health care technologies are sufficiently similar to justify the use of CMA is an area of theoretical uncertainty and, thus, still open to subjective interpretation, with the majority of published CMAs appearing to be based on assumptions rather than evidence of clinical equivalence. This primacy of hope over experience may cause misleading recommendations to be made for health care resource allocation.

Given this fact, it is perhaps surprising that the exact nature of the evidence base required to support therapeutic equivalence and, hence, the appropriateness of CMA as an economic methodology has not been subject to more intense scrutiny. CMAs are frequently based on the results of clinical trials that have attempted but failed to identify the superiority of a new drug over the existing “gold standard” therapy. This occurs despite the obvious fact that the failure of a health intervention to prove superiority in a superiority trial (ST) in no way implies clinical equivalence. Recent advances in clinical trial design have made it easier to assess clinical equivalence in a more meaningful manner, with the development of non-inferiority trials (NIs) allowing this issue to be directly addressed. Alternatively, where a trial is initially designed as an ST but such superiority remains unproven, the analysis can be switched from superiority to non-inferiority in appropriate cases. The use of such improvements in trial design should enable CMAs to be more effectively targeted in a manner that ensures that they are undertaken only in appropriate circumstances using rigorous sources of evidence. In this manner, only CMAs that meet minimum standards with regard to clinical equivalence will be accepted; CMAs that fail to meet such criteria will be dismissed. Such an approach would enable health economics to gain enhanced credibility from the use of this potentially valuable economic methodology.

6.1.2 Optimizing Evidence from Clinical Trials

If CMAs are to form a reliable basis for health care decision-making, due consideration must be given to the claims of clinical equivalence that are crucial to the adoption of the CMA methodology. The implications of adopting an inappropriate clinical trial design or misinterpreting the results of a clinical trial are often considerable:

“… wrongly discounting treatments as ineffective will deprive patients of better care.

Wrongly accepting treatments as effective exposes patients to needless risks and wastes.” (Tarnow-Mordi and Healy 1999 p. 210).

RCTs typically compare the gold standard existing treatment with a new intervention (Tramer, Reynolds et al. 1998). RCTs can be structured as superiority trials (STs), equivalence trials (ETs), or non-inferiority trials (NIs). The trial designs differ in terms of their objectives and these differences have significant implications for the use of the CMA methodology. The greatest support for the use of CMA occurs when an ET proves that two health care technologies are clinically equivalent; however, there exists a myriad of “gray” areas that may be indicative of therapeutic equivalence and, hence, require more careful analytical consideration and judgement. Such gray areas are analyzed in detail in the remainder of this chapter.

6.2 Sources of Clinical Trial Evidence

6.2.1 Superiority Trials

The extent to which clinical evidence can be used to inform CMAs is dependent on the design of the RCT. STs are specifically designed to show a difference in health benefits between two health care technologies. Typically, the primary objective of the research is to determine whether an experimental intervention is more efficacious than the established gold standard treatment. To identify whether there is a difference in health benefits between two health care technologies it is necessary to begin with a null hypothesis that treatment X yields the same health benefits as treatment Y.

The ST uses a test statistic to obtain a p-value: the probability of observing an effect at least as large as the one seen if the null hypothesis were true. The smaller the p-value, the stronger the evidence against the null hypothesis and the more plausible it is that a difference exists between the health benefits generated by the treatments. P-values, therefore, can indicate statistically whether an effect is likely, but they say nothing about the size of the effect or its clinical relevance.

Newby and Hill (2003) emphasize the inadequacy of using p-values obtained in STs to interpret the results of clinical trials and recommend the use of confidence intervals and personal judgment when determining clinical equivalence before accepting or rejecting an equivalence claim:

“… leaving it up to the reader to decide whether the confidence interval includes or excludes potentially clinically important differences between two treatments. If it does not exclude differences … assume that the two drugs are not the same” (Newby and Hill 2003).

When the original objective of an ST is not achieved, there is an obvious incentive to refocus the analysis to support more restricted claims of clinical equivalence. However, STs are specifically designed to demonstrate that there is, indeed, a difference and, thus, to reject the null hypothesis in favor of the alternative hypothesis (i.e., that there is a difference). In STs, it is impossible to prove that the null hypothesis is true, as the aim is to reject it by proving that the observed difference is unlikely to be commensurate with equivalent health outcomes of the competing health care interventions.

In CMAs the clinical evidence from failed STs is often misinterpreted as proving that the health care interventions being compared are clinically equivalent. Such methodological flaws resulting from the misinterpretation of clinical trial results can also “…lead to false claims, inconsistencies and harm to patients” (Greene 2000 p. 715).

However, if appropriately planned for, it is possible to switch the focus of the analysis from superiority to non-inferiority in a single trial. Thus, failed STs that are well designed and have adequate sample size could therefore potentially be used to provide evidence of health equivalence for use in CMAs.

6.2.2 Equivalence Trials

6.2.2.1 Characteristics of Equivalence Trials

ETs are intended to demonstrate that the effect of a new treatment differs from the effect of the current treatment by no more than a specified equivalence margin, in either direction. The aim of an ET is, therefore, to specifically rule out significant clinical differences between the treatments by directly evaluating the extent to which two health care interventions have equivalent therapeutic effects. Briggs and O’Brien (2001) argue that CMA should be used only when clinical evidence has been obtained from an ET. They argue that it is inappropriate to use the results of a failed ST to demonstrate clinical equivalence: “… unless a study has been specifically designed to show the equivalence of treatments it would be inappropriate to conduct cost-minimization analysis” (Briggs and O’Brien 2001).

However, even where an equivalence trial indicates clinical equivalence in primary outcomes, scrutiny of secondary outcomes may reveal significant differences in safety, cost, or convenience. “… one therapy may offer clinical benefits such as a more convenient administration schedule, less potential for drug interaction or lower cost” (Hatala et al. 1999 p.9).

Reliance on a single clinical measure of effectiveness may potentially be misleading as it may fail to capture an important difference in health outcome between two alternatives. Thus, ideally, clinical equivalence should be established for a range of health outcomes before the use of CMA can be supported. In addition, in evaluating claims of clinical equivalence it is important to acknowledge that:

“It is never correct to claim that … there is no difference in effects of treatments.…

There will always be some uncertainty surrounding estimates of treatment effects, and a small difference can never be excluded” (Alderson and Chalmers 2003 p.476).

Even if one compared a drug with itself, there would be a difference; therefore, it cannot be unequivocally claimed that two health care technologies are clinically equivalent. Thus, even where the results of ETs indicate no difference, this may simply indicate that the true difference exists outside of the specified probabilities of error.

If clinical equivalence is demonstrated in a good quality ET, there remain two other issues that must be addressed prior to unambiguously supporting the use of the cost-minimization approach. First, the primary health outcome must encompass the main benefit(s) of the treatments being compared. Second, any differences in other health outcomes, e.g., secondary health outcomes, must be sufficiently small so as not to attain clinical significance. If these assumptions cannot be substantiated, then it would not be appropriate to adopt the CMA approach despite the availability of equivalence obtained in an ET.

6.2.2.2 Equivalence Range or Margin

A crucial step in the design of an ET is the definition of clinical equivalence. The equivalence margin attempts to incorporate all values that represent unimportant clinical differences in treatment and must be stipulated in advance of the clinical trial. The equivalence range, therefore, includes the largest difference between treatments that is clinically acceptable before treatments become defined as providing significantly different benefits. The first step in any ET is, therefore, to define the smallest unacceptable degree of inferiority or superiority to ensure that the ET can be appropriately powered. For example:

“… if the difference between the two groups in respect of change in pulmonary function was within +/– 1.5 units, then the treatments would be considered clinically equivalent” (Huson 2004 p.2).

This means that if treatment A is better or worse than treatment B by more than a 1.5 unit change in pulmonary function, the two treatments cannot be considered to be clinically equivalent. Clinical equivalence can be claimed if the 95% CI around the difference in treatments is found to lie entirely within the predetermined clinical equivalence margin. The setting of the equivalence margin communicates a judgment about what is and what is not clinically and statistically acceptable (Pater 2004).
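The decision rule described above can be sketched in a few lines of code. This is a minimal illustration rather than a prescribed method: the function name and the confidence limits are hypothetical, and only the margin of 1.5 units echoes the quoted example.

```python
def is_equivalent(ci_lower, ci_upper, margin):
    """Return True when the whole confidence interval for the treatment
    difference lies inside the equivalence range (-margin, +margin)."""
    if margin <= 0:
        raise ValueError("margin must be positive")
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")
    return -margin < ci_lower and ci_upper < margin


# Hypothetical 95% CIs for the difference in pulmonary function (A minus B).
print(is_equivalent(-0.8, 1.1, margin=1.5))  # True: CI sits inside +/- 1.5
print(is_equivalent(-0.4, 1.9, margin=1.5))  # False: upper limit crosses +1.5
```

The check is deliberately two-sided: a confidence limit beyond the margin in either direction prevents a claim of equivalence.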

Clearly, different clinical situations require different equivalence margins and analysts must justify their chosen range with regard to clinician opinion and previous trials comparing active controls with placebo. An equivalence margin that is too wide could mean that significantly different treatments are considered to be clinically equivalent; conversely an equivalence margin that is too narrow could mean that clinically equivalent treatments are mislabeled as being significantly different. It is important that good clinical judgment be combined with sound clinical and statistical reasoning to ensure that the chosen margin is clinically relevant and statistically feasible.

A negative study result from an ET can take two forms. The CI around the treatment difference may lie only partially within the equivalence margin, or it may lie entirely outside it; in either case the conclusion is that the possibility of a difference between the two treatments has not been rejected (see Figure 6.1).

Figure 6.1 Interpretation of equivalence trials.

6.2.3 Non-Inferiority Trials

6.2.3.1 Characteristics of a Non-Inferiority Trial

The rationale behind a NI trial is to demonstrate that the new health care technology is not worse than the current health care technology by a pre-stated clinical margin. This type of trial is useful when the clinical issue relates to the extent to which the new health care technology is as good as current therapy. In NIs, analysis is focused entirely in one direction—typically the new treatment is not worse than the established therapy by more than the non-inferiority margin specified. An improvement of any size fits within the definition of non-inferiority. Span and colleagues published the first paper that acknowledged the link between CMA and NIs:

“… the most efficient analysis of the clinical effect in a cost minimization study is the non-inferiority analysis” (p.262). They conclude that: “… to obtain valid results from a cost-minimization study, care has to be taken to adapt the correct methodology for non-inferiority testing in clinical outcomes” (Span, TenVergert et al. 2006 p.261).

To ensure a robust interpretation of trial results, some analysts call for both per protocol (PA) and intention to treat (ITT) analyses to be conducted and only if both types of analysis support the hypothesis should non-inferiority be claimed (Snapinn 2000). Therefore, the extent and nature of the evidence of non-inferiority that is required to provide an acceptable platform on which to base a CMA is still open to debate.

6.2.3.2 Non-Inferiority Range or Margin

The non-inferiority range should be set in relation to the clinical notion of a minimally important effect. An acceptable non-inferiority margin depends on defining a difference that has previously been identified as not being clinically significant. To do this, two additional conditions must be met. First, the smallest expected effect of the active control over placebo must exceed this margin to ensure that no positively harmful treatments can be introduced and, second, the margin must be no greater than the difference between active treatments judged clinically important.

In a NI, non-inferiority is demonstrated when the CI around the treatment difference lies entirely to the right of the lower bound of the non-inferiority margin. Non-inferiority is not demonstrated if the lower bound of the CI lies to the left of the non-inferiority margin (see Figure 6.2).

Figure 6.2 Interpreting NIs using CIs.
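The one-sided rule for NIs can be sketched in the same way. The sketch assumes that the treatment difference is expressed as new minus established therapy, with positive values favoring the new treatment; the function name and the numbers are hypothetical.

```python
def is_non_inferior(ci_lower, margin):
    """Return True when the lower confidence limit of the treatment
    difference (new minus established) lies to the right of -margin."""
    if margin <= 0:
        raise ValueError("margin must be positive")
    return ci_lower > -margin


# Hypothetical lower confidence limits with a non-inferiority margin of 1.5.
print(is_non_inferior(-0.9, margin=1.5))  # True: lower limit is above -1.5
print(is_non_inferior(-2.1, margin=1.5))  # False: lower limit is below -1.5
```

Only the lower confidence limit enters the test, which reflects the point made above that an improvement of any size is compatible with non-inferiority.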

6.3 Other Issues To Be Addressed In Evaluating Equivalence

6.3.1 Statistical versus Clinical Significance

One of the failings of statistical analyses undertaken in the context of an ST is that statistical significance may differ from clinical significance. Variables that are identified as exhibiting statistically significant differences may be entirely unimportant from a clinical perspective, whereas clinically crucial differences remain crucial even if they fail to achieve statistical significance. In contrast, in ETs and NIs, statistical and clinical significance are inextricably linked via the setting of equivalence and non-inferiority margins.

6.3.2 Equivalence in Single or Multiple Outcomes?

In any clinical trial it is necessary to identify a primary health outcome that is common to the competing alternative interventions. Choice and measurement of such an outcome measure is a crucial step in determining the appropriateness of the trial as an evidence source on which to undertake CMAs. To be of value, the primary health outcomes must be the dominant outcome from the perspective of both patients and clinicians and capture the most clinically relevant benefits of the competing treatments. If not, claims of clinical equivalence, even when based on ETs, are not sufficient to support the use of CMA.

In clinical practice it is highly unlikely that two health care interventions will yield exactly the same health benefits in all dimensions of clinical and patient outcomes. Typically, the design of ETs and NIs identifies a single endpoint for comparison despite the perception that one of the treatments is likely to offer significant advantages in another area. For example, where two treatments have equal efficacy, yet one is more convenient to patients, the extent to which CMA can be appropriately utilized depends largely on the perspective adopted by the analysis. Where equivalence is not demonstrated for all important outcomes, the analyst must provide explicit justification for using the cost-minimization approach in light of the study question and perspective. In large part, the interpretation of clinical equivalence will depend on the specific circumstances of the clinical trial, the range of outcomes being measured and the judgement of the analyst. In such cases it is difficult to provide specific guidance that would be appropriate in all cases.

6.3.3 Whose Views of Clinical Equivalence Should be Preeminent?

Definitions of clinical equivalence will depend on whose views we consider to be the most important (patients, clinicians, or society). Generally, lead investigators in clinical trials specify the primary and secondary health outcomes to be measured, with the primary outcome measure chosen on the basis of relevant clinical experience, published clinical evidence, and knowledge of patient needs. The crucial factor is to ensure that the choice of health outcome measures used to determine clinical equivalence is clinically meaningful to the patient.

6.3.4 Over What Period Should We Evaluate Clinical Equivalence?

The benefits of health care technologies will vary in relation to the time point at which they are measured. In a clinical trial the primary health outcome measure might exhibit statistically significant differences at 3 months but not at 6 or 12 months. In such circumstances, do we interpret the therapeutic interventions as being equivalent and, hence, appropriate for analysis in the context of a CMA? It is important to acknowledge that subsequent reanalysis is required to demonstrate continued clinical equivalence.

6.4 Effectively Targeting the Use of CMA

The current use of RCT evidence to support statements of clinical equivalence is inadequate, and clear and appropriate decision rules are required in the future to ensure that unambiguous evidence of clinical equivalence is a feature of future CMAs. In the absence of such evidence it would be potentially misleading to use flawed analyses as the basis for health care decision-making. While it is comparatively simple to identify circumstances in which the use of CMA as an economic methodology is clearly inappropriate, it is more difficult to specify unambiguous decision rules that identify circumstances in which clinical evidence clearly supports the use of CMA. The appropriateness of using CMA must be judged in the light of the totality of the clinical evidence supporting or refuting the hypothesis of therapeutic equivalence between two competing interventions, combined with the specialist knowledge and expertise required to place such evidence in context. However, certain limited guidance can be provided with regard to effectively targeting CMAs.

First, clinical evidence from a well-designed ET represents the gold standard in supporting claims of clinical equivalence in support of the use of CMA. However, even where data is available from an ET it still remains important to consider the extent to which the primary health outcome fully captures the benefits being derived from the health care treatments being compared. If other benefits are clinically meaningful to patients and clinicians, additional comparisons of clinical equivalence may be required.

Second, failure to prove clinical superiority should not be interpreted as providing evidence of clinical equivalence. In certain circumstances, and if planned into trial design, trial data may be re-analyzed to assess clinical equivalence, but such reinterpretation of the dataset requires further analysis if the use of CMA is to be justified. In particular, a non-inferiority statement should be stipulated in the clinical trial protocol to ensure that valuable information can still be derived even if superiority is not proven.

Third, the extent to which data from NIs can be used to justify CMAs is currently subject to a great amount of uncertainty. In particular, the extent to which proof of non-inferiority represents an acceptable approximation of “therapeutic equivalence” and, hence, justifies the use of CMAs is still open to debate.

Finally, where a CMA is based on valid claims of clinical equivalence derived from appropriate sources of RCT evidence, it represents an appropriate and powerful method of economic evaluation. However, it is crucial that in interpreting the results of CMAs, the informed decision-maker uses his or her clinical judgment to assess the quality and quantity of the evidence in support of therapeutic equivalence and, hence, identifies the theoretical justification for the use of CMA. In cases where the decision-maker does not accept claims of clinical equivalence, the results of the CMA should clearly not be used as the basis for decision-making.

The cost-minimization method of economic evaluation has always been employed in a more haphazard manner than other methods of economic evaluation. It is crucial to rectify this situation to ensure that only techniques that prove to be robust and reliable in improving health care decision-making are incorporated into the toolkit employed by the health economist. However, exactly how similar do outcomes have to be to support the application of this powerful economic methodology? The most appropriate design for a clinical trial to generate evidence that two health care technologies are identical or similar is the ET. Such trials are specifically designed for this purpose and, therefore, any differences that are identified between the health interventions being compared are neither clinically nor statistically significant.

It is essential that health economists and decision-makers are clear on what is meant by the concept of clinical equivalence and to acknowledge that, given the heterogeneous nature of patient populations and treatment outcomes, it is likely to prove impossible to achieve exact equivalence between competing health care interventions. Ultimately, it is up to the health economist to justify the use of CMA just as it is up to the decision-maker to judge the extent to which the results obtained should be influential in determining decision making.

7.1 The Rationale for Cost-Effectiveness Analysis

As noted in prior chapters, the economic evaluation of pharmacotherapies and other health care interventions is growing in importance as the resources directed toward health care account for progressively larger portions of the budgets of governments, employers, and individuals. Making rational decisions under conditions of resource constraints requires a method for comparing alternatives across a range of outcomes, allowing a direct ranking of the costs and benefits of specific strategies for preventing or treating a particular illness.

Cost-effectiveness analysis (CEA) provides a framework to compare two or more decision options by examining the ratio of the differences in costs and the differences in health effectiveness between options. The overall goal of CEA is to provide a single measure, the incremental cost-effectiveness ratio (ICER), which relates the amount of benefit derived by making an alternative treatment choice to the differential cost of that option. When two options are being compared, the ICER is calculated by the formula:

\[ \mathrm{ICER} = \frac{\mathrm{Cost}_{\text{option 1}} - \mathrm{Cost}_{\text{option 2}}}{\mathrm{Effectiveness}_{\text{option 1}} - \mathrm{Effectiveness}_{\text{option 2}}} \]

In medical or pharmacoeconomic cost-effectiveness analysis, health resource costs (the numerator) are in monetary terms, representing the difference in costs between choosing option 1 or option 2. In cost-effectiveness analysis, the differential benefits of the various options (the denominator) are non-monetary and represent the change in health effectiveness values implied by choosing option 1 over option 2. Typically, these health outcomes are measured as lives saved, life years gained, illness events avoided, or a variety of other clinical or health outcomes. Unlike CEA, cost-benefit analysis values both the costs and benefits of interventions in monetary terms. Cost-utility analysis, a subset of CEA where intervention effectiveness is adjusted based on the desirability (or utility) of the resulting health states, is discussed in Chapter 9 as it relates to the cost-effectiveness of human papillomavirus (HPV) vaccine.
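As a minimal sketch of the calculation, the ratio of the cost difference to the effectiveness difference can be computed as follows. The function name and all input values are hypothetical and are not taken from any study; effectiveness is assumed to be measured in life years.

```python
def icer(cost_1, effect_1, cost_2, effect_2):
    """Incremental cost-effectiveness ratio of option 1 versus option 2."""
    delta_cost = cost_1 - cost_2        # numerator: monetary units
    delta_effect = effect_1 - effect_2  # denominator: e.g., life years
    if delta_effect == 0:
        raise ZeroDivisionError("equal effectiveness: the ICER is undefined")
    return delta_cost / delta_effect


# Hypothetical example: option 1 costs $12,000 and yields 2.5 life years,
# option 2 costs $8,000 and yields 2.0 life years.
print(icer(12000, 2.5, 8000, 2.0))  # 8000.0 dollars per life year gained
```

When the two options are equally effective the ratio is undefined, and the sketch raises an error rather than returning a number.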

7.2 The Cost-Effectiveness Plane

A pharmacoeconomic analysis is often interested in how much more of a health outcome can be obtained for a given financial expenditure. Limited resources may, many times, constrain choices between medical options. The cost-effectiveness plane serves to clarify when these choices may be easy or difficult. The cost-effectiveness plane is typically drawn with the differences in cost (or the incremental cost) on the y-axis and the differences in effectiveness (or incremental effectiveness) between the two options on the x-axis (Figure 7.1). In this example we will compare an existing program with a new program. The existing program, acting as the comparator, will be at the origin of both the cost and effectiveness axes, depicting the current level of expenditure and benefit with which a new therapy is compared. The new therapy can be more expensive, less expensive, or equivalent in costs to the current option. Similarly, the new option can be more effective, less effective, or equivalent in clinical effectiveness as compared with the existing strategy or therapy.

Figure 7.1 The cost-effectiveness plane.

This produces four possible options for the results of the analysis of a new strategy compared with an existing one. If the new program is less expensive and more effective than the existing program, then the point representing the new program falls into the southeast (SE) quadrant of the cost-effectiveness plane. Points in this quadrant are called dominant, and strategies that have such a characteristic should be chosen over the existing strategy due to their superior outcome at diminished costs. These strategies are “cheaper and better” than current therapy and should be adopted. Examples of strategies in this quadrant are laparoscopic cholecystectomy compared with other therapies for symptomatic gallstones or interventions to decrease cigarette smoking.

If, on the other hand, the new program is more expensive and less effective than the existing one, then this program falls into the northwest (NW) quadrant of the plane. Strategies in this quadrant are considered to be dominated by the current strategy and should not be chosen due to poorer outcomes at greater cost. Although existing strategies in this quadrant are perhaps relatively rare, there are examples of strategies that do not appear to derive a benefit, yet incur substantially more health care costs than other options. Examples include amoxicillin prophylaxis compared with no antibiotic for dental procedures in patients at moderate risk for infective endocarditis and magnetic resonance imaging vs. endocrinologic follow-up of patients with asymptomatic pituitary microadenomas.

If the new program is either dominant or dominated (i.e., in the SE or NW quadrants), a formal CEA is not needed to assist the decision—the decision is (or should be) obvious. However, if the new program is both more effective and more costly, falling in the northeast (NE) quadrant, then a CEA would be useful to define the tradeoff between increases in costs and effectiveness and to calculate the cost per unit of effectiveness gained. Similarly, a CEA would also be useful if the new strategy fell into the SW quadrant as being both less costly and less effective than the existing program, once again to define the tradeoffs between programs and to ascertain the cost-effectiveness ratio. This graphical display emphasizes one of the most fundamental and important concepts of cost-effectiveness analysis; it is useful only when there is a tradeoff between the cost of a strategy and the benefit derived from that strategy.
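The four quadrants can be summarized as a simple classification of the incremental cost and incremental effectiveness of the new program relative to the existing one. The following is an illustrative sketch only; the function name, the return labels, and the handling of points that fall exactly on an axis are assumptions.

```python
def quadrant(delta_cost, delta_effect):
    """Locate a new program on the cost-effectiveness plane relative to
    the existing program at the origin."""
    if delta_cost == 0 or delta_effect == 0:
        return "on an axis"
    if delta_effect > 0 and delta_cost < 0:
        return "SE: dominant"
    if delta_effect < 0 and delta_cost > 0:
        return "NW: dominated"
    if delta_effect > 0 and delta_cost > 0:
        return "NE: tradeoff, CEA useful"
    return "SW: tradeoff, CEA useful"


print(quadrant(-500, 0.2))   # SE: dominant
print(quadrant(4000, 0.5))   # NE: tradeoff, CEA useful
```

Only the NE and SW results call for calculation of a cost-effectiveness ratio, in line with the discussion above.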

7.3 Basic Components of a Cost-Effectiveness Analysis

Several factors should be considered in the construction of a CEA (Table 7.1). A high-quality analysis will include and describe the relevant options, clearly state the perspective of the analysis, choose a relevant time horizon over which to track costs and effects, consider the appropriate population, accurately measure the costs and effectiveness of the competing options, account for the differential value of costs and outcomes that occur at different times in the future, and account for uncertainties of assumptions and values in the context of an appropriately constructed analytic model. Following is a description of these concepts in more detail.

7.3.1 Enumeration of the Options

A CEA requires a comparison between two or more options. A single option cannot be cost-effective in isolation—an option can be considered cost-effective or not cost-effective only in comparison with other options. Additionally, the cost-effectiveness of a strategy is highly dependent on the specific choice of comparators included in the analysis, and care must be taken to include all of the clinically reasonable options. At a minimum, the comparators include the current standard of care and a range of typically utilized options. A cost-effectiveness analysis of a new therapy compared with a strategy that is not typically used, or is used only in atypical circumstances, is not useful for clinicians or policy makers. It is often reasonable to include a “do nothing” option, especially if doing nothing is a legitimate clinical strategy, but also as a baseline comparator to assess the clinical realism of the model and analysis. In all cases, the strategies should be described in sufficient detail such that readers could replicate or implement the strategy in their own settings.

7.3.2 Perspective of the Analysis

Choosing the perspective or set of perspectives to be considered in a CEA is essential, as this choice determines the cost values to be contained in the analysis. For example, an analysis from the societal perspective considers all costs, while an analysis from the patient perspective would consider only costs borne by the patient. Other possible perspectives include the third-party payer (insurance) or health system perspective where costs for which these entities are responsible are considered in the analysis; the hospital or health agency perspective includes the costs of providing various health services. Whenever possible, the societal perspective should be included in the set of perspectives to be considered in analysis, because it is the broadest and is recommended for the reference case analysis by the Panel on Cost-Effectiveness in Health and Medicine.

7.3.3 Time Horizon

The analyst must decide a priori how long the costs and effects of the various interventions in the analysis will be tracked. This is usually determined by the clinical features of the illness or its treatment. For example, a CEA of a new antibiotic for acute dysuria treatment in otherwise healthy women might appropriately have a very short time horizon of only a month, as there are virtually no long-term effects of either the disease or its treatments. On the other hand, cost-effectiveness analyses designed to value the effects of cardiovascular risk reduction need to assess the outcomes for much longer time periods; typically such an analysis would follow treatments and effects until death. In any case, all of the strategies must be followed or modeled for the same time horizon. Methods exist for modeling costs and effects, even in situations where this modeling extends beyond the existence of specific data.

7.3.4 Scope of the Analysis

An analysis might be relevant for an entire population or for only a relatively small population subgroup; the analyst will need to appropriately choose the cohort to be considered in the analysis. For example, if an intervention is to be directed toward elderly patients with diabetes in order to prevent diabetes complications, limiting the scope of the analysis to an elderly, diabetic population is a logical choice, while if the question is regarding diabetes prevention in adults, a broader population scope is required. The scope of outcomes to be considered is another important consideration. In the example above, a broad or narrow range of diabetes outcomes could be considered in an analysis of elderly diabetics. If a small number of complications are modeled, the data requirements of the model would be less but the conclusions might be limited compared with a model with a broader range of complications considered. However, a more comprehensive model would have greater data needs and require more complex model construction. Choosing the scope of an analysis often means finding a balance between simplicity and complexity, frequently determined by the clinical situation modeled and the question to be examined.

7.3.5 Measuring and Valuing Costs

Data sources for costs must be found and incorporated into the analysis. Cost data can be obtained from clinical trials, but more often other sources will need to be utilized. In addition, the analyst will need to choose between micro-costing or macro-costing methodologies or some mix of the two, often based on the perspective taken in the analysis. Micro-costing enumerates and identifies each item that is incorporated into a particular service, requiring detailed data on supplies used, personnel, room, and instrument costs, and often needing time-and-motion studies to accurately capture medical service costs. Macro-costing (or gross costing) uses data, often from large government databases, to estimate average costs for a care episode, for example the average cost of coronary artery bypass grafting or of a hospital stay for pneumonia. In the US, Medicare reimbursement data or the Healthcare Cost and Utilization Project (HCUP) database are often used for this purpose.

7.3.6 Measuring and Valuing Outcomes

The effectiveness outcome for the analysis must be chosen and outcomes data found, often based on data availability. Randomized trials are excellent data sources on the effects of therapies, but study entrance criteria frequently limit applicability to a more general patient population. Cohort studies are useful for risk factor determination and for determining the natural history of an illness. Administrative databases are excellent sources for broad population-based estimates of disease and for the effectiveness of therapies, unlike randomized trials which, in general, estimate efficacy. However, administrative databases often pose difficulties in accounting for possible confounding variables in the data set. Meta-analyses provide summary measures for parameters, but studies considered are generally limited to randomized trials, thus limiting generalizability. The perspective of the analysis may also influence the effectiveness outcome chosen. Life years or quality-adjusted life years (QALYs) gained are certainly relevant for analyses using the relatively broad-based societal or health system perspectives, but may not be as important when a narrower perspective is chosen, such as that of an individual hospital, when effectiveness measures such as bed day saved or drug administration error avoided might be more relevant.

7.3.7 Time Preference

The differential timing of costs and outcomes should be considered in the analysis. This is typically accomplished through the use of discount rates, where costs and outcomes that occur in the present have higher values than those in the future.
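A minimal sketch of discounting is given below, using the standard present-value relationship in which an amount occurring t years in the future is divided by (1 + r) raised to the power t. The function names, the 3% rate, and the cost stream are hypothetical choices for illustration.

```python
def present_value(amount, rate, years):
    """Value today of an amount that occurs `years` in the future."""
    return amount / (1 + rate) ** years


def discounted_total(stream, rate):
    """Sum of a yearly stream of costs or outcomes in present-value terms.
    stream[0] occurs now, stream[1] in one year, and so on."""
    return sum(present_value(a, rate, t) for t, a in enumerate(stream))


# Hypothetical: $1,000 of cost in each of three consecutive years,
# discounted at an assumed rate of 3% per year.
costs = [1000, 1000, 1000]
print(round(discounted_total(costs, rate=0.03), 2))
```

The same function can be applied to outcomes such as life years, so that costs and effects occurring at different times are placed on a common footing.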

7.3.8 Choice of Analytic Modeling Method

The analytic model must also be selected. Cost data from clinical trials can allow relatively straightforward calculation of incremental cost-effectiveness ratios between management options, often the intervention arms of the clinical trial. More often, data for the analysis must come from a variety of sources and may require a decision analysis model as a framework for data synthesis.

7.3.9 Accounting for Uncertainty

Finally, a sensitivity analysis to elucidate the effects of uncertainty on model results should be performed. There are many goals of sensitivity analysis, and methods for conducting such analyses. During model construction and validation, sensitivity analysis is useful as a “debugging tool” to assure that the model behaves as it was designed to behave. After the model is finished, sensitivity analysis is useful to determine which variables have a large impact on the outcomes. Sensitivity analyses can be used to determine the cost-effectiveness ratio in specified subgroups of an analysis, as well as to determine how much a change in one variable will alter the cost-effectiveness ratio. Finally, probabilistic sensitivity analyses can be used to produce a version of a confidence limit or probability range around the cost-effectiveness ratio.
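One of the simplest forms, a one-way sensitivity analysis, varies a single input while holding the others at their base-case values and records the resulting cost-effectiveness ratio. The sketch below assumes a two-option comparison; the function names, parameter names, and values are all hypothetical.

```python
def icer(cost_new, effect_new, cost_old, effect_old):
    """Incremental cost-effectiveness ratio of the new versus old option."""
    return (cost_new - cost_old) / (effect_new - effect_old)


def one_way_sensitivity(base_case, parameter, values):
    """Recompute the ICER for each alternative value of one parameter,
    holding all other base-case inputs fixed."""
    results = []
    for value in values:
        inputs = dict(base_case)   # copy so the base case is not altered
        inputs[parameter] = value
        results.append((value, icer(**inputs)))
    return results


# Hypothetical base case and a range for the cost of the new option.
base_case = {"cost_new": 12000, "effect_new": 2.5,
             "cost_old": 8000, "effect_old": 2.0}
for value, ratio in one_way_sensitivity(base_case, "cost_new",
                                        [10000, 12000, 14000]):
    print(value, ratio)
```

A probabilistic sensitivity analysis would instead draw all inputs repeatedly from assumed distributions; that extension is not shown here.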

7.4 Calculation of Incremental Cost-Effectiveness Ratios

The ICER requires a detailed enumeration of the costs and benefits of the strategies being compared. Methods for measuring and estimating the costs and benefits of strategies and interventions are often quite complicated. In this section, we use the results of two existing pharmacoeconomic studies to illustrate the calculation and use of the ICER. The enumeration of costs and outcomes is described in detail in the studies themselves.

The following example considers low molecular weight heparin (LMWH) compared with warfarin for the secondary prevention of venous thromboembolism in patients with cancer. Aujesky used a decision analysis model and data from a variety of sources to estimate the incremental cost-effectiveness of two anticoagulant regimens. Analysis results, with effectiveness in life years, are outlined in Table 7.2.

Typically, the first step in calculation of ICERs among mutually exclusive options is to order the options by cost. LMWH is both more costly and more effective than warfarin; thus, neither strategy is dominant or dominated and a CEA would be useful. Subtracting the cost of the warfarin strategy from that of the LMWH strategy produces the incremental cost; the difference in life expectancy between strategies is the incremental effectiveness. Dividing the incremental cost by the incremental effectiveness produces the ICER, $115,847 per life year gained, which is the unit cost of an additional life year occurring as a result of LMWH use rather than warfarin.
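
Written out, with $C$ denoting the expected cost and $E$ the expected effectiveness (in life years) of each strategy (symbols introduced here for illustration), the calculation takes the form:

```latex
\mathrm{ICER}
  = \frac{C_{\mathrm{LMWH}} - C_{\mathrm{warfarin}}}
         {E_{\mathrm{LMWH}} - E_{\mathrm{warfarin}}}
  = \$115{,}847 \text{ per life year gained}
```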

7.4.1 Dominance and Extended Dominance

Calculation of the ICER can be more complicated when more than two strategies are being considered. One complicating characteristic of the analysis of many options is that some strategies may be dominated by others and should be removed from further analysis. As noted in the description of the cost-effectiveness plane, any strategy that is more expensive and less effective than an existing option for the same illness (i.e., is in the left upper quadrant compared with the existing strategy) is said to be strictly dominated; one would never choose such a strategy when an alternative would produce a better outcome at a cheaper price. Strict dominance is also termed strong dominance by some authors. A second type of dominance occurs when a particular strategy is more expensive and less effective than a linear combination of two other strategies. This is called extended dominance, and represents a situation where one could achieve a better outcome at less cost by treating the population with a mix of two alternative strategies, each applied to a proportion of patients. Extended dominance can also be referred to as weak dominance. We illustrate both types of dominance in the following example.

Using a decision analysis model, we performed a CEA of testing and antiviral treatment strategies for adult influenza, using days of influenza illness avoided as an effectiveness term in the analysis. Cost and effectiveness values estimated by this analysis are shown in Table 7.3. (Please note that in a separate analysis the other neuraminidase inhibitor, oseltamivir, was substituted for zanamivir, with similar cost-effectiveness results.)

Once again, the first step in calculation of incremental cost-effectiveness ratios among mutually exclusive options is to order the options by cost. Doing so with these data results in Table 7.4.

Next, options of lesser effectiveness and of equal or greater cost than another option are removed due to strict, or strong, dominance. These strictly dominated options, which are inferior both in terms of cost and effectiveness, do not need to be considered further in the analysis. In this example, “Testing, then amantadine” costs more and is less effective than “Amantadine (without testing).” Thus, “Testing, then amantadine” is strictly dominated and can be removed from consideration. Similarly, “Testing, then rimantadine” also costs more and is less effective than the “Amantadine” strategy and the “Rimantadine (without testing)” strategy and, thus, can be eliminated due to strict dominance. Removal of these two strategies results in Table 7.5.

Then, starting with the second row, the differences in cost and effectiveness between that row and the preceding row are calculated. These results are the incremental cost and incremental effectiveness between the two adjacent strategies. The incremental cost divided by the incremental effectiveness produces the ICER, the cost per illness day prevented. This same procedure is then followed for the remaining rows; the results are shown in Table 7.6.

Next, the calculated ICERs are examined for extended, or weak, dominance of strategies. This occurs when the ICER of a strategy is greater than that of the strategy below it, signifying that the subsequent strategy would be preferred. In this case both “Rimantadine” and “Test/Zanamivir” have higher ICERs than Zanamivir; thus, these strategies would not be preferred over Zanamivir due to extended dominance and can be removed from consideration. Removing these strategies from the table and recalculating the ICER of Zanamivir compared with Amantadine results in Table 7.7.

This same procedure can be performed graphically using the cost-effectiveness plane. Figure 7.2 depicts all the testing and treatment strategies on the cost-effectiveness plane. Starting with “No testing or treatment,” the least costly option, a line is drawn to the strategy that produces the shallowest slope (i.e., the smallest ICER), which is “Amantadine.” From Amantadine, the shallowest positive slope is to Zanamivir. The resulting line is the cost-effectiveness efficient frontier; any point not on this frontier is dominated, either by strict dominance or extended dominance, as illustrated by the “Testing” strategies and by the “Rimantadine” strategy.

Figure 7.2 Cost and effectiveness values for influenza management strategies plotted on the cost-effectiveness plane. The line represents the cost-effectiveness efficient frontier, gray points denote strategies that are strictly dominated, and open points show strategies that are eliminated from consideration by extended dominance.
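
The stepwise procedure described above (order by cost, remove strictly dominated options, then remove options eliminated by extended dominance) can be sketched in a few lines of Python. This is a minimal illustration rather than the method used in the influenza study: the `Strategy` type, the function names, and the cost and effectiveness values for the strategies labelled A to E are all hypothetical and are not the values in Tables 7.3 to 7.7.

```python
from typing import NamedTuple


class Strategy(NamedTuple):
    name: str
    cost: float    # expected cost per patient (hypothetical units)
    effect: float  # expected effectiveness, e.g. illness days avoided


def efficient_frontier(strategies):
    """Return the strategies left after strict and extended dominance."""
    # Step 1: order the options by cost (ties: more effective first).
    ordered = sorted(strategies, key=lambda s: (s.cost, -s.effect))

    # Step 2: strict dominance. An option that is no more effective than
    # a cheaper (or equally costly) option is removed.
    kept = []
    for s in ordered:
        if kept and s.effect <= kept[-1].effect:
            continue
        kept.append(s)

    # Step 3: extended dominance. If a strategy's ICER exceeds the ICER of
    # the next strategy, drop it and recompute against the previous one.
    frontier = []
    for s in kept:
        frontier.append(s)
        while len(frontier) >= 3:
            a, b, c = frontier[-3:]
            icer_ab = (b.cost - a.cost) / (b.effect - a.effect)
            icer_bc = (c.cost - b.cost) / (c.effect - b.effect)
            if icer_ab > icer_bc:
                del frontier[-2]  # b is eliminated by extended dominance
            else:
                break
    return frontier


def icers(frontier):
    """ICER of each frontier strategy relative to the one before it."""
    return [
        (cur.name, (cur.cost - prev.cost) / (cur.effect - prev.effect))
        for prev, cur in zip(frontier, frontier[1:])
    ]


# Hypothetical inputs, chosen only to exercise both kinds of dominance.
strategies = [
    Strategy("No intervention", 0.0, 0.0),
    Strategy("A", 10.0, 1.0),
    Strategy("B", 30.0, 0.8),  # costs more, less effective than A
    Strategy("C", 40.0, 1.2),  # ICER vs A exceeds the ICER of D vs C
    Strategy("D", 60.0, 2.0),
    Strategy("E", 70.0, 1.9),  # costs more, less effective than D
]

frontier = efficient_frontier(strategies)
for name, ratio in icers(frontier):
    print(f"{name}: {ratio:.0f} per unit of effectiveness")
```

In this hypothetical data set, B and E are removed by strict dominance and C by extended dominance, leaving "No intervention", A, and D on the frontier.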

All reasonable strategies should be included in cost-effectiveness analyses so that true ICERs can be calculated. For example, if the Amantadine strategy were omitted from the analysis above, the ICER of Zanamivir would be $60 per illness day avoided when compared with “No testing or treatment” rather than $198 when compared with Amantadine. Omitting Amantadine would not give a true picture of the incremental value of Zanamivir, i.e., it would not tell us how much more would be paid for the gains in effectiveness seen with Zanamivir compared with all other reasonable strategies.

Similar considerations apply to the average cost-effectiveness ratio, here the cost divided by the illness days avoided; for example, the average cost-effectiveness ratio for Zanamivir is $137.1/0.74 or $185.27 per illness day avoided. When comparing mutually exclusive strategies, as we are in this example, the absence of incremental comparisons between strategies in the average cost-effectiveness calculation does not allow for elimination of dominated strategies or for calculation of incremental gains and costs between strategies. The average cost-effectiveness ratio is useful in the evaluation of mutually compatible programs that are subject to a budget constraint, where programs are ranked, lowest to highest, by average cost-effectiveness ratio, then funded in that order until the budget is exhausted. Use of the average cost-effectiveness ratio in this fashion would maximize the health benefit for a given monetary expenditure; however, its use for this purpose has been largely theoretical to this point.
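
The budget-constrained ranking described in this paragraph can be sketched as follows. The programs P1 to P3, their costs and effects, and the budget are hypothetical values chosen for illustration; the stopping rule (stop at the first program that no longer fits the budget) is also an assumption, since the text does not specify how a partially affordable program is handled. Only the Zanamivir figures ($137.1 and 0.74 illness days avoided) come from the text.

```python
def average_cer(cost, effect):
    """Average cost-effectiveness ratio: total cost per unit of effect."""
    return cost / effect


def fund_by_average_cer(programs, budget):
    """Fund (name, cost, effect) programs in order of increasing average
    cost-effectiveness ratio until the next one would exceed the budget."""
    ranked = sorted(programs, key=lambda p: average_cer(p[1], p[2]))
    funded, spent = [], 0.0
    for name, cost, effect in ranked:
        if spent + cost > budget:
            break  # assumption: stop at the first program that does not fit
        funded.append(name)
        spent += cost
    return funded, spent


# Average ratio for Zanamivir, using the figures quoted in the text.
zanamivir_average = round(average_cer(137.1, 0.74), 2)

# Hypothetical, mutually compatible programs: (name, cost, effect).
programs = [("P1", 100.0, 50.0), ("P2", 200.0, 40.0), ("P3", 150.0, 50.0)]
funded, spent = fund_by_average_cer(programs, budget=300.0)
print(zanamivir_average, funded, spent)
```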

7.4.2 Sensitivity Analysis

The next step in a CEA is the performance of sensitivity analyses. Typically, univariate, or one-way, sensitivity analyses are performed on parameter values, and further multiple parameter sensitivity analyses may also be performed.

7.4.3 Interpretation of CEA Results

To reiterate a prior point, CEA hinges on comparisons between strategies. A single option alone cannot be cost-effective; options can only be cost-effective compared with other options. The relative cost-effectiveness of one option compared with another is subject to interpretation and, perhaps as a result, the term “cost-effective” has been misused (although perhaps less so now than in the past, due to increasing familiarity with the true meaning of the term). Cost-effective does not necessarily mean cost-saving. New health programs that are less costly and more effective than existing programs are clearly good buys, but a new program that costs more and is more effective than the existing program can be cost-effective without costs being saved, depending on how much one is willing to pay for a given health benefit. Cost-effective has also been incorrectly used to mean cost-saving when no determination of effectiveness differences between options has been performed. Choosing health insurance from one carrier because it costs less than insurance from another carrier is not a cost-effective decision when there is no comparison of health benefits between the plans; this would be a cost-minimization evaluation. Similarly, “cost-effective” has been misused to mean “effective” when there is no cost comparison. The correct meaning of “cost-effective” is that a program or strategy is worth the added cost because of the benefit it adds compared with other interventions. The application of the method requires a determination of the value of health care benefits as well as costs.

Returning to our influenza example, how can one interpret the incremental cost-effectiveness ratios of the amantadine and zanamivir strategies? One of the first steps in interpreting cost-effectiveness analyses is to understand what cost-effectiveness cannot do. It cannot make the “correct” choice; instead, it provides an analysis of the consequences of each choice. Cost-effectiveness analysis is not designed to address the social, political, or legal issues that might arise from a medical decision. Thus, if differing strategies involve questions of equity, social justice, legal responsibilities, or public opinion that need to be weighed in making a medical decision, consideration of more than strategy cost-effectiveness is necessary. Cost-effectiveness is one of many aspects of a decision to be considered and interpreted by decision-makers, be they physicians in the care of an individual patient or health policy makers in a broader population-based medical care context.

Let us assume for now that sociopolitical issues are similar between our example strategies, allowing us to concentrate on the cost-effectiveness results as a major basis for the decision. In this case the question is: which strategy should we choose based on the ICERs calculated for each strategy? Or more bluntly, which strategy is the most “cost-effective”? The answer depends on the willingness-to-pay per unit health outcome (here, per illness day avoided). If the willingness-to-pay is less than $9 per illness day avoided, then “No testing or treatment” would be chosen, since the ICERs of the other strategies are ≥$9 per illness day. If willingness-to-pay thresholds are higher, other strategies would be chosen: Amantadine is chosen if the willingness-to-pay is $9 – $197, and Zanamivir is chosen if the willingness-to-pay is ≥$198 per illness day avoided.
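
The decision rule in this paragraph can be expressed as a short function: among the strategies remaining on the efficient frontier, choose the most effective one whose ICER does not exceed the willingness-to-pay threshold. The sketch below uses the rounded ICERs quoted in the text ($9 and $198 per illness day avoided); the names `FRONTIER` and `preferred_strategy` are illustrative, and the 0.0 entry for "No testing or treatment" is a placeholder, since the least costly option has no ICER of its own.

```python
# (strategy, ICER in US$ per illness day avoided), ordered by increasing ICER.
FRONTIER = [
    ("No testing or treatment", 0.0),  # placeholder: baseline has no ICER
    ("Amantadine", 9.0),
    ("Zanamivir", 198.0),
]


def preferred_strategy(willingness_to_pay):
    """Most effective frontier strategy whose ICER is within the threshold."""
    choice = FRONTIER[0][0]
    for name, icer in FRONTIER[1:]:
        if icer <= willingness_to_pay:
            choice = name
    return choice


for wtp in (5, 9, 100, 198, 500):
    print(wtp, preferred_strategy(wtp))
```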

How, then, is a reasonable cost-effectiveness willingness-to-pay threshold determined? This is a difficult question with no clear answer at this point, complicated by the many possible effectiveness values (life years gained, lives saved, illness days avoided, etc.) that could be considered. Cost-effectiveness comparisons between interventions using a common effectiveness measure can be useful in gaining a sense of an intervention’s relative value. For example, if Treatment x for Disease X costs $100 per illness day prevented and is considered economically reasonable while Treatment y for Disease Y costs $500 per illness day avoided and is considered too expensive, then Treatment z for Disease Z costing $550 per illness day prevented might also be considered too expensive. However, the usefulness of this comparison depends on the similarity of illness days between Diseases X, Y, and Z. If Disease Z is worse than X or Y, then there might be a higher willingness-to-pay to avoid a more severe illness day from Disease Z than to avoid a more moderate illness day due to X or Y.

Sensitivity analysis may also be useful in the interpretation of results. If variation of analysis parameter values does not change the conclusion drawn from the base case analysis results, the analysis is said to be “robust,” which increases confidence in its results. Analyses that are not robust, where conclusions may change with variation of one or more parameter values, are termed “sensitive to variation,” and their results are viewed with less confidence. Depending on the data used in the analysis, this confidence or uncertainty can be quantified through development of confidence intervals for cost-effectiveness ratios in empiric data sets or the use of probabilistic sensitivity analysis and acceptability curves when empiric data sets are not available.

A number of other factors can make interpretation of CEAs challenging. Differences in analysis results can be due to methodologic differences between analyses. Cost-effectiveness analysis results are often dependent on the perspective, time horizon, and assumptions used in the analysis and, unless these factors are well-aligned between analyses, discordant results can arise based solely on these technical differences. Analyses using effectiveness values that are very specific to the medical scenario being examined, such as deep venous thrombosis prevented or lumbar discectomies avoided, may have few similar analyses available for comparison, making interpretation of their results challenging. Even if analyses with similar effectiveness values are available, their results could be difficult to compare with those of interventions for other disease processes using other effectiveness measures, thus limiting their comparability and interpretability. In these cases, a common effectiveness measure would facilitate cost-effectiveness comparisons over a broad spectrum of medical interventions. The use of quality-of-life utilities and QALYs in cost-utility analysis, along with methodologic recommendations to standardize analysis practices, such as those of the U.S. Panel on Cost-Effectiveness in Health and Medicine, is largely motivated by the need to facilitate such comparisons, and has resulted in resources such as the online CEA Registry from Tufts University to make direct comparisons possible.

Cost-effectiveness analyses compare medical intervention strategies through the calculation of the incremental cost-effectiveness ratio, a measure of the cost of changes in health outcomes. These analyses can be performed on clinical trial data when information on both costs and effectiveness is available or, more commonly, through the use of decision analysis models to synthesize data from many sources. Interpretation of CEA results can be challenging due to the variety of health outcomes that can be used as the effectiveness term in these analyses and to the absence of a definitive criterion for “cost-effective.” A subset of CEA, cost-utility analysis, attempts to make interpretation of results less difficult through the use of a common effectiveness term, the QALY.

Using Cost-Effectiveness Analysis for Setting Health Priorities

Governments around the world face budget constraints that compel them to make tough decisions about how best to invest funds for public health. They need a way to evaluate which investments will address the most pressing health problems and bring the greatest health gains. Cost-effectiveness analysis is an essential evaluation tool that allows policymakers and health planners to compare the health gains that various interventions can achieve with a given level of inputs. Getting the most value for money has been a central thrust in the analysis presented in Disease Control Priorities in Developing Countries, 2nd edition (DCP2). The basic concepts underlying the analysis, as well as needed improvements, are described here.

What Is Cost-Effectiveness Analysis?

Cost-effectiveness analysis is the primary tool for comparing the cost of a health intervention with the expected health gains. An intervention can be understood to be any activity, using human, financial, and other inputs, that aims to improve health. The health gain might be reducing the risk of a health problem, reducing the severity or duration of an illness or disability, or preventing death. If the health outcome is the same, say preventing death from measles either by immunizing a child or by treating the disease, then analysts need only compare the costs of different interventions that can achieve that outcome. The result is a cost-effectiveness ratio, expressed as cost per outcome, which can be compared across various types of services or various service locations that perform the same function. The ratio is always discussed in relative terms, as there is no “best” or absolute level of cost-effectiveness.

The cost-effectiveness of an intervention can vary greatly depending on a program’s size and scope. Typically, as program coverage expands and more people are served, the cost per outcome drops. For example, if more children can be immunized with the same fixed costs like nurses and clinics, then each additional immunization will be cheaper until the service approaches full capacity.

On the other hand, costs can rise as coverage expands if it becomes harder to reach additional patients. Therefore, depending on the comparison undertaken, an analyst might look at the average cost-effectiveness ratio or the incremental cost-effectiveness ratio. The average cost-effectiveness ratio looks at total costs and total results, starting from zero, while the incremental ratio compares additional costs and additional results, starting from the current level of coverage or services.

Using child immunizations as an example, the incremental cost of adding mobile vaccination teams might be lower than expanding fixed clinic services, particularly if the unvaccinated children are dispersed and hard to reach.

How Do Analysts Measure and Compare Different Health Outcomes?

Cost-effectiveness analysis requires health outcomes to be expressed in common units so that comparisons among interventions can be made. All analyses start with some unit, such as cases of a disease or injury, deaths, or numbers of people who quit smoking or adopt some other behavior. All interventions that prevent death are alike in terms of the common outcome. However, when lives are saved at different ages—averting a death from malaria at age 2 versus a heart attack at age 50—the outcome is no longer identical, and some adjustment must be made for the difference in years of life saved.

For interventions that prevent death, the analysis starts by estimating the deaths prevented and the age at death to yield the number of life-years saved. The number of life-years saved per death averted is the life expectancy remaining at the age at which the death would otherwise have occurred. Standard economic analysis then discounts the future years to take account of uncertainty and the advantage gained by investing early.

Discounting means reducing the value of the outcome in each future year by an amount that increases over time.

Thus, if values are discounted at 3 percent annually, that means dividing the values for year 1 by 1.03, those for year 2 by 1.03 squared, and so on. At that rate, preventing an infant death saves not all of the 60 to 80 years of life expectancy at birth (depending on the country), but at most 30 discounted years. Even when discounted, however, saving infants’ lives yields substantial health gains.
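
The arithmetic behind the figure of roughly 30 discounted years can be checked with a short calculation. This sketch assumes, following the description above, that each year of life is counted as one unit and that year 1 is already divided by 1.03; the function name `discounted_years` is illustrative.

```python
def discounted_years(years, rate=0.03):
    """Sum of 1 / (1 + rate)**t for t = 1 .. years."""
    return sum(1.0 / (1.0 + rate) ** t for t in range(1, years + 1))


# Life expectancy at birth of 60 and 80 years, discounted at 3 percent.
print(round(discounted_years(60), 2))
print(round(discounted_years(80), 2))
```

Under these assumptions, 60 undiscounted years correspond to about 27.7 discounted years and 80 undiscounted years to about 30.2, in line with the approximate ceiling of 30 discounted years quoted above.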

The Disease Control Priorities Project discounts years of life saved at a constant 3 percent per year. The same logic applies to interventions that avert a chronic condition or disability, except that different disabilities must be compared in severity. For short (acute) episodes of illness, age is not relevant—all life years are regarded as equally valuable—and therefore discounting has no effect.

New Metric Used in Cost-Effectiveness Analysis: The DALY

The disability-adjusted life year (DALY) was introduced by the World Health Organization and the World Bank in 1993 and has been used since, with some variations, for two related purposes. One is to measure the “burden of disease,” the extent to which premature deaths and disabilities cause a loss of health status compared to everyone’s living to old age in good health. The other purpose is to compare the value of health interventions that have multiple or different health outcomes occurring at different ages. In particular, it allows for measuring and comparing health outcomes other than saving lives. Used in DCP2, the DALY—or common unit of health loss or gain—takes into account the duration and severity of a health problem and discounts future years.

In cost-effectiveness analysis, the DALY represents the number of years of disability-free life that would be gained from a particular health intervention—yielding a cost per DALY where cost data are available or can be inferred.

Gaining a DALY through a health intervention reduces the burden of disease; it is the same as averting the loss of a DALY. The calculation includes assumptions about severity (if the health condition is not fatal), the age at which an illness or intervention occurs, the duration of ill health with and without the intervention, and the remaining life expectancy at the age when the gain occurs.

For interventions that aim to reduce risk factors for health (such as stopping smoking) rather than directly affect illnesses or injuries, the analysis includes estimates of reductions in ill health that result from changes in the level of risk. Stopping smoking, for example, reduces deaths from both cardiovascular disease and cancer.

DALYs allow analysts to compare the cost-effectiveness of different interventions and different health outcomes by expressing diverse health outcomes in a common unit. As a result, they can help guide where best to invest scarce health resources. For example, a coronary artery bypass costs, on average across world regions, US$37,000 per DALY gained (far beyond the per capita income of most countries), compared with an average of only US$409 for the polypill (several medications for preventing heart disease in a single pill). The latter is a “best buy” for developing countries.

However, both interventions are much less cost-effective than saving life years of a middle-aged person by treating active tuberculosis (and thereby preventing transmission) at a cost as low as US$15 per DALY.

In high-income countries, some analyses include quality-adjusted life years (QALYs), an alternative measure of how much a year of life is diminished if a person suffers health limitations. This measure can account for people suffering from more than one illness or disability and in varying degrees. However, it is used little in the Disease Control Priorities Project, which focuses on burden of disease in low- and middle-income countries.

Some Shortcomings of Cost-Effectiveness Analysis

Cost data can be extremely hard to find in developing countries. Ideally, cost-effectiveness analysis should include direct costs (such as doctors’ or nurses’ time and supplies used) as well as indirect costs (such as a portion of administrative costs). The cost of equipment also needs to be spread across its many uses. These costs are usually not readily available, however, and thus the costs of interventions reported in developed countries are often used and adjusted for developing-country settings. Alternatively, a study conducted in one low-income setting is sometimes used to estimate costs in all or several low-income countries.

Cost-effectiveness is only one criterion for judging whether an intervention has merit. Policymakers also must take into account total cost and whether an intervention is affordable at all, the capacity of the system to deliver it, and whether people will demand and use the service provided. Also, cost-effectiveness analysis might show that an intervention is worth doing, but it doesn’t necessarily mean that the public sector should undertake all of it. Private-sector services might be available and affordable for some portion of the population.

Equity is also a concern, because it can be more cost-effective to serve many people in large urban centers, where the cost per outcome is relatively low. Providing the same service in a poor, rural area—where fewer patients are seen or where staff and other inputs are harder to make available—might be less cost-effective but more worthy of public investment because it is more equitable.

What About Benefits Other Than Health?

Cost-effectiveness, whether expressed as a single health outcome or a DALY, only takes account of health benefits, which means it may underestimate the total benefit of some health interventions that also improve people’s productivity and other aspects of quality of life. Piped water and sanitation, for example, bring environmental as well as health benefits to communities, and save people’s time. Some interventions may not appear cost-effective on health grounds alone but may be justified by large non-health gains.

Estimating the value of all benefits of an intervention, including both health and non-health outcomes, requires expressing gains in monetary terms because “apples and oranges” such as life-years, income, and better school performance can’t be added together. Such comparisons are the domain of cost-benefit analysis, where the aim is to compare the total gains from different investments.

The preference for cost-effectiveness analysis in health stems in part from concern about the ethical implications of placing a monetary value on people’s lives. In cost-benefit analysis, the value of life years is most often expressed in terms of income lost or gained, which can be impractical to compute and also problematic to justify when addressing the needs of vulnerable (old, young, or disadvantaged) populations.

Improvements Needed

More and better data are needed in low- and middle-income countries so that analysts do not need to use cost data and assumptions from high-income countries or rely on expert judgments. The need for information starts, in some cases, with better estimates of the incidence and prevalence of particular diseases, and with data on the coverage and outcomes of health interventions. In most countries, estimating what it would cost to expand the coverage of existing interventions or to add new interventions relies heavily on assumptions.

Whenever possible, cost-effectiveness analyses should be conducted at the national or subnational level. This would allow planners to take full account of all the reasons cost-effectiveness varies from one place to another, and to develop priorities on the basis of analysis appropriate to local circumstances.

Conclusion

While many considerations, such as affordability, equity, and non-health benefits, may factor into decisions about health spending, cost-effectiveness analysis is an essential tool for decision makers. It can guide decisions about where best to spend limited resources and what to include in a package of health services that responds to a population’s greatest health needs.