[Federal Register Volume 64, Number 185 (Friday, September 24, 1999)]
[Notices]
[Pages 51767-51780]
From the Federal Register Online via the Government Publishing Office [www.gpo.gov]
[FR Doc No: 99-24855]


-----------------------------------------------------------------------

DEPARTMENT OF HEALTH AND HUMAN SERVICES

Food and Drug Administration
[Docket No. 99D-3082]


International Conference on Harmonisation; Choice of Control 
Group in Clinical Trials

AGENCY: Food and Drug Administration, HHS.

ACTION: Notice.

-----------------------------------------------------------------------

SUMMARY: The Food and Drug Administration (FDA) is publishing a draft 
guidance entitled ``E10 Choice of Control Group in Clinical Trials.'' 
The draft guidance was prepared under the auspices of the International 
Conference on Harmonisation of Technical Requirements for Registration 
of Pharmaceuticals for Human Use (ICH). The draft guidance sets forth 
general principles that are relevant to all controlled trials and are 
especially pertinent to the major clinical trials intended to 
demonstrate drug (including biological drug) efficacy. The draft 
guidance describes the principal types of control groups and discusses 
their appropriateness in particular situations. The draft guidance is 
intended to assist sponsors and investigators in the choice of control 
groups for clinical trials.

DATES: Written comments by December 23, 1999.

ADDRESSES: Submit written comments on the draft guidance to the Dockets 
Management Branch (HFA-305), Food and Drug Administration, 5630 Fishers 
Lane, rm. 1061, Rockville, MD 20852. Copies of the draft guidance are 
available from the Drug Information Branch (HFD-210), Center for Drug 
Evaluation and Research, Food and Drug Administration, 5600 Fishers 
Lane, Rockville, MD 20857, 301-827-4573. Single copies of the guidance 
may be obtained by mail from the Office of Communication, Training and 
Manufacturers Assistance (HFM-40), Center for Biologics Evaluation and 
Research (CBER), or by calling the CBER Voice Information System at 
1-800-835-4709 or 301-827-1800. Copies may be obtained from CBER's FAX 
Information System at 1-888-CBER-FAX or 301-827-3844.

FOR FURTHER INFORMATION CONTACT:
     Regarding the guidance: Robert Temple, Center for Drug Evaluation 
and Research (HFD-4), Food and Drug Administration, 5600 Fishers Lane, 
Rockville, MD 20857, 301-594-6758.
     Regarding the ICH: Janet J. Showalter, Office of Health Affairs 
(HFY-20), Food and Drug Administration, 5600 Fishers Lane, Rockville, 
MD 20857, 301-827-0864.

SUPPLEMENTARY INFORMATION: In recent years, many important initiatives 
have been undertaken by regulatory authorities and industry 
associations to promote international harmonization of regulatory 
requirements. FDA has participated in many meetings designed to enhance 
harmonization and is committed to seeking scientifically based 
harmonized technical procedures for pharmaceutical development. One of 
the goals of harmonization is to identify and then reduce differences 
in technical requirements for drug development among regulatory 
agencies.
     ICH was organized to provide an opportunity for tripartite 
harmonization initiatives to be developed with input from both 
regulatory and industry representatives. FDA also seeks input from 
consumer representatives and others. ICH is concerned with 
harmonization of technical requirements for the registration of 
pharmaceutical products among three regions: The European Union, Japan, 
and the United States. The six ICH sponsors are the European 
Commission, the European Federation of Pharmaceutical Industries 
Associations, the Japanese Ministry of Health and Welfare, the Japanese 
Pharmaceutical Manufacturers Association, the Centers for Drug 
Evaluation and Research and Biologics Evaluation and Research, FDA, and 
the Pharmaceutical Research and Manufacturers of America. The ICH 
Secretariat, which coordinates the preparation of documentation, is 
provided by the International Federation of Pharmaceutical 
Manufacturers Associations (IFPMA).
     The ICH Steering Committee includes representatives from each of 
the ICH sponsors and the IFPMA, as well as observers from the World 
Health Organization, the Canadian Health Protection Branch, and the 
European Free Trade Association.
     In May 1998, the ICH Steering Committee agreed that a draft 
guidance entitled ``E10 Choice of Control Group in Clinical Trials'' 
should be made available for public comment. The draft guidance is the 
product of the Efficacy Expert Working Group of the ICH. Comments about 
this draft will be considered by FDA and the Efficacy Expert Working 
Group.
     In accordance with FDA's good guidance practices (62 FR 8961, 
February 27, 1997), this document is now being called a guidance, 
rather than a guideline.
     The draft guidance sets forth general principles that are relevant 
to all controlled trials and are especially pertinent to the major 
clinical trials intended to demonstrate drug (including biological 
drug) efficacy. The draft guidance includes a description of the five 
principal types of controls, a discussion of two important purposes of 
clinical trials, and an exploration of the critical issue of assay 
sensitivity, i.e., whether a trial could have detected a difference 
between treatments when there was a difference, a particularly 
important issue in noninferiority/equivalence trials. In addition, the 
draft guidance presents a detailed description of each type of control 
and considers, for each: (1) Its ability to minimize bias, (2) ethical 
and practical issues associated with its use, (3) its usefulness and 
the quality of inference in particular situations, (4) modifications of 
study design or combinations with other controls that can resolve 
ethical, practical, or inferential concerns, and (5) its overall 
advantages and disadvantages.
     This draft guidance represents the agency's current thinking on 
the choice of control group in clinical trials. It does not create or 
confer any rights for or on any person and does not operate to bind FDA 
or the public. An alternative approach may be used if such approach 
satisfies the requirements of the applicable statute, regulations, or 
both.
     Interested persons may, on or before December 23, 1999, submit to 
the Dockets Management Branch (address above) written comments on the 
draft guidance. Two copies of any comments are to be submitted, except 
that individuals may submit one copy. Comments are to be identified 
with the docket number found in brackets in the heading of this 
document. The draft guidance and received comments may be seen in the 
office above between 9 a.m. and 4 p.m., Monday through Friday. An 
electronic version of this guidance is available on the Internet at 
``http://www.fda.gov/cder/guidance/index.htm'' or at CBER's World Wide 
Web site at ``http://www.fda.gov/cber/publications.htm''.
     The text of the draft guidance follows:

 E10 Choice of Control Group in Clinical Trials\1\
---------------------------------------------------------------------------

    \1\ This draft guidance represents the agency's current thinking 
on the choice of control group in clinical trials. It does not create 
or confer any rights for or on any person and does not operate to 
bind FDA or the public. An alternative approach may be used if such 
approach satisfies the requirements of the applicable statute, 
regulations, or both.
---------------------------------------------------------------------------

 1.0 Introduction

     The choice of control group is always a critical decision in 
designing a clinical trial. That choice affects the inferences that 
can be drawn from the trial, the degree to which bias in conducting 
and analyzing the study can be minimized, the types of subjects that 
can be recruited and the pace of recruitment, the kind of endpoints 
that can be studied, the public credibility of the results, the 
acceptability of the results by regulating authorities, and many 
other features of the study, its conduct, and its interpretation.

 1.1 General Scheme and Purpose of Guidance

     The general principles considered in this guidance are relevant 
to all controlled trials. They are of especially critical importance 
to the major clinical trials carried out during drug development to 
demonstrate efficacy. This guidance does not address the regulatory 
requirements in any region, but describes what studies using each 
design can demonstrate. Although any of the control groups described 
and discussed below may be useful and acceptable in studies serving 
as the basis for registration in at least some circumstances, they 
are not equally appropriate or useful in every case. After a 
brief description of the five principal kinds of controls (see 
section 1.3), a discussion of two important purposes of clinical 
trials (see section 1.4), and an exploration of the critical issue 
of assay sensitivity, i.e., whether a trial could have detected a 
difference between treatments when there was a difference, an issue 
of particular importance in noninferiority/equivalence trials (see 
section 1.5), the guidance will describe each kind of control group 
in more detail (see sections 2.0 through 2.5.7) and consider, for 
each:
      Its ability to minimize bias
      Ethical and practical issues associated with its use
      Its usefulness and the quality of inference in 
particular situations
      Modifications of study design or combinations with 
other controls that can resolve ethical, practical, or inferential 
concerns
      Its overall advantages and disadvantages
     Several other ICH guidances are particularly relevant to the 
choice of control group:
      E3: Structure and Content of Clinical Study Reports
      E4: Dose-Response Information to Support Drug 
Registration
      E6: Good Clinical Practice: Consolidated Guideline
      E8: General Considerations for Clinical Trials
      E9: Statistical Principles for Clinical Trials
     In this guidance, the terms ``test drug,'' ``study drug,'' and 
``investigational drug'' are considered synonymous and are used 
interchangeably; similarly, ``active control'' and ``positive 
control''; ``clinical trial'' and ``clinical study''; ``control'' 
and ``control group''; and ``treatment'' and ``drug'' are 
essentially equivalent terms.

 1.2 Purpose of Control Group

     Control groups have one major purpose: to allow discrimination 
of patient outcomes (changes in symptoms, signs, or other morbidity) 
caused by the test drug from outcomes caused by other factors, such 
as the natural progression of the disease, observer or patient 
expectations, or other treatment. The control group experience tells 
us what would have happened to patients if they had not received the 
test treatment (or what would have happened with a different 
treatment known to be effective).
     If the course of a disease were uniform in a given patient 
population, or predictable from patient characteristics such that 
outcome could be predicted reliably for any given subject or group 
of subjects, results of treatment could simply be compared with the 
known outcome without treatment. For example, one could assume that 
pain would have persisted for a defined time, blood pressure would 
not have changed, depression would have lasted for a defined time, 
tumors would have progressed, or the mortality after an acute 
infarction would have been the same as previously seen. In unusual 
cases, the course of illness is in fact predictable in a defined 
population and it may be possible to use a similar group of patients 
previously studied as a ``historical control'' (see section 1.3.5). 
In most situations, however, a concurrent control group is needed 
because it is not possible to predict outcome with adequate 
accuracy.
     A concurrent control group is one chosen from the same 
population as the test group and treated in a defined way as part of 
the same trial that studies the test drug. The test and control 
groups should be similar with regard to all baseline and on-
treatment variables that could influence outcome other than the 
study treatment. Failure to achieve this similarity can introduce a 
bias into the study. Bias here (and as used in ICH E9) means the 
systematic tendency of any aspects of the design, conduct, analysis, 
and interpretation of the results of clinical trials to make the 
estimate of a treatment effect deviate from its true value. 
Randomization and blinding are the two techniques usually used to 
prevent such bias and to ensure that the test treatment and control 
groups are similar at the start of the study and are treated 
similarly in the course of the study (see ICH E9). Whether a trial 
design includes these features is a critical determinant of its 
quality and persuasiveness.

 1.2.1 Randomization

     Assurance that subject populations are similar in test and 
control groups is best attained by randomly dividing a single sample 
population into groups that receive the test or control treatments. 
Randomization avoids systematic differences between groups with 
respect to variables that could affect outcome. The inability to 
eliminate systematic differences is the principal problem of studies 
without a concurrent randomized control (see external control 
trials, section 1.3.5). Randomization also provides a sound basis 
for statistical inference.
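     As an illustration only (the guidance does not prescribe an 
allocation method), random division of a single sample into test and 
control groups can be sketched in Python. The function name 
block_randomize, the block size, and the seed are assumptions made for 
this example; permuted blocks are one common way to keep group sizes 
balanced while leaving each assignment to chance.

```python
import random


def block_randomize(n_subjects, arms=("test", "control"), block_size=4, seed=0):
    """Assign subjects to arms using permuted blocks (illustrative sketch).

    Each block contains every arm equally often, in random order, so the
    groups stay close to balanced as enrollment proceeds.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # fixed seed only so the example is reproducible
    assignments = []
    while len(assignments) < n_subjects:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)  # the order within the block is left to chance
        assignments.extend(block)
    return assignments[:n_subjects]


# Hypothetical trial with 12 subjects enrolled in sequence.
allocation = block_randomize(12)
```

     In an actual trial the allocation sequence would be generated and 
concealed according to the protocol; the fixed seed here serves only to 
make the sketch repeatable.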

 1.2.2 Blinding

     The groups should not only be similar at baseline, but should 
be treated and observed similarly during the trial, except for 
receiving the test or control drug. Clinical trials are often 
``double-blind'' (or ``double-masked''), meaning that both subjects 
and investigators (including analysts of data, sponsors, and other 
clinical trial personnel) are unaware of each subject's assigned 
treatment, to minimize the potential biases resulting from 
differences in management, treatment, or assessment of patients, or 
interpretation of results that could arise as a result of subject or 
investigator knowledge of the assigned treatment. For example:
      Subjects on active drug might report more favorable 
outcomes because they expect a benefit or might be more likely to 
stay in a study if they knew they were on active drug.
      Observers might be less likely to identify and report 
treatment responses in a no-treatment group or might be more 
sensitive to a favorable outcome or adverse event in patients 
receiving active drug.
      Knowledge of treatment assignment could affect vigor 
of attempts to obtain on-study or followup data.
      Knowledge of treatment assignment could affect 
decisions about whether a subject should remain on treatment or 
receive concomitant medications or other ancillary therapy.
      Knowledge of treatment assignment could affect 
decisions as to whether a given subject's results should be included 
in an analysis.
      Knowledge of treatment assignment could affect choice 
of statistical analysis.
     Double-blinding is intended to ensure that subjective 
assessments and decisions are not affected by knowledge of treatment 
assignment.

 1.3 Types of Controls

     Control groups in clinical trials can be classified on the 
basis of two critical attributes: (1) The type of treatment received 
and (2) the method of determining who will be in the control group. 
The type of treatment may be any of the following four: (1) Placebo, 
(2) no treatment, (3) different dose or regimen of the study 
treatment, or (4) different active treatment. The principal methods 
of determining who will be in the control group are by randomization 
or by selection of a control population separate from the population 
treated in the trial (external or historical control). This document 
categorizes control groups into five types. The first four are 
concurrently controlled (the control group and test groups are 
chosen from the same population and treated concurrently), usually 
with random assignment to treatment, and are distinguished by which 
of the types of control treatments listed above are received. 
External (historical) control groups, regardless of the comparator 
treatment, are considered together as the fifth type because 
of serious concerns about the ability to ensure comparability of 
test and control groups in such trials and the ability to minimize 
important biases, making this design usable only in exceptional 
circumstances.
     It is increasingly common to carry out studies that have more 
than one kind of control group. Each kind of control is appropriate 
in some circumstances, but none is usable or adequate in every 
situation. The five kinds of control are:

 1.3.1 Placebo Concurrent Control

     In a placebo-controlled study, subjects are randomly assigned 
to a test treatment or to an identical-appearing inactive treatment. 
The treatments may be titrated to effect or tolerance, or may be 
given at one or more fixed doses. Such trials are almost always 
double-blind, with both subjects and investigators unaware of 
treatment assignment. The name of the control suggests that its 
purpose is to control for ``placebo'' effect (improvement in a 
subject resulting from believing that he or she is taking a drug), 
but that is not its only or major benefit. Rather, the placebo 
concurrent control design, by allowing blinding and randomization 
and including a group that receives an inert treatment, controls 
for all 
potential influences on the actual or apparent course of the disease 
other than those arising from the pharmacologic action of the test 
drug. These influences include spontaneous change (natural history 
of the disease), subject or investigator expectations, use of other 
therapy, and subjective elements of diagnosis or assessment. 
Placebo-controlled trials seek to show a difference between 
treatments when they are studying effectiveness, but may also seek 
to show lack of difference (of specified size) in evaluating a 
safety measurement.

 1.3.2 No-Treatment Concurrent Control

     In a no-treatment controlled study, subjects are randomly 
assigned to the test treatment or to no study treatment (i.e., 
neither test nor control therapy). The principal difference between 
this design and a placebo-controlled trial is that subjects and 
investigators are not blinded to treatment assignment. Because of 
the advantages of double-blind designs, this design is likely to be 
needed and suitable only 
when it is difficult or impossible to double-blind (e.g., medical 
versus surgical treatment, treatments with easily recognized 
toxicity) and only when there is reasonable confidence that study 
endpoints are objective and that the results of the study are 
unlikely to be influenced by the factors listed in section 1.2.2. 
Note that it is often possible to blind endpoint assessment, even if 
the overall trial is not double-blind. This is a valuable approach 
and should always be considered in studies that cannot be blinded, 
but it does not solve the other problems associated with knowing the 
treatment assignment (see section 1.2.2).

 1.3.3 Dose-Response Concurrent Control

     In a randomized, fixed-dose, dose-response study, subjects are 
randomized to one of several fixed-dose groups. Subjects may either 
be placed on their fixed dose initially or be raised to that dose 
gradually, but the intended comparison is between the groups on 
their final dose. Dose-response studies are usually double-blind. 
They may include a placebo (zero dose) and/or active control. In a 
concentration-controlled trial, treatment groups are titrated to 
several fixed-concentration windows; this type of trial is 
conceptually similar to a fixed-dose, dose-response trial.

 1.3.4 Active (Positive) Concurrent Control

     In an active-control (or positive control) study, subjects are 
randomly assigned to the test treatment or to an active-control 
drug. Such trials are usually double-blind, but this is not always 
possible; many oncology studies, for example, are considered 
impossible to blind because of different regimens, different routes 
of administration (see section 1.3.2), and different toxicities. 
Active-control trials can have two distinct objectives with respect 
to showing efficacy: (1) To show efficacy of the test drug by 
showing it is as good as (equivalent, not inferior to) a known 
effective agent or (2) to show efficacy by showing superiority of 
the test drug to the active control. They may also be used with the 
primary objective of comparing the efficacy/safety of the two drugs 
(see section 1.4). When this design is used to show equivalence/
noninferiority or to compare the drugs, it raises the critical 
question of whether the trial was capable of distinguishing active 
from inactive treatments (see section 1.5).

 1.3.5 External Control (Including Historical Control)

     An externally controlled study compares a group of subjects 
receiving the test treatment with a group of patients external to 
the study, rather than with an internal control group consisting of 
patients from the same population assigned to a different treatment. 
External controls can be a group of patients treated at an earlier 
time (historical control) or during the same time period but in 
another setting. The external control may be defined (a specific 
group of patients) or nondefined (a comparator group based on 
general medical knowledge of outcome). Use of this latter comparator 
is particularly treacherous (such trials are sometimes called 
uncontrolled) because general impressions are so often inaccurate. 
Baseline-controlled studies, in which subjects' status on therapy is 
compared with status before therapy (e.g., blood pressure, tumor 
size), are a variation of this type of control. In this case, the 
changes from baseline are often compared to a general impression of 
what would have happened without intervention, rather than to a 
specific historical experience, although a more defined experience 
can also be used.

 1.3.6 Multiple-Control Groups

     As will be described further below (see section 1.5.1), it is 
often possible and advantageous to use more than one kind of control 
in a single study, e.g., use of both active drug and placebo. 
Similarly, trials can use several doses of test drug and several 
doses of active control, with or without placebo. This design may be 
useful for active drug comparisons where the relative potency of the 
two drugs is not well established, or where the purpose of the trial 
is to establish relative potency.

 1.4 Purposes of Clinical Trials

     Two purposes of clinical trials should be distinguished: (1) 
Assessment of the efficacy and/or safety of a treatment and (2) 
assessment of the relative (comparative) efficacy, safety, benefit/
risk relationship or utility of two treatments.

 1.4.1 Evidence of Efficacy

     In some cases, the purpose of a trial is to demonstrate that a 
test drug has any clinical effect (or an effect of some specified 
size). A study using any of the control types may demonstrate 
efficacy of the test drug by showing that it is superior to the 
control (placebo, low dose, active drug). An active-control trial 
may, in addition, demonstrate efficacy in some cases by showing the 
new drug to be similar in efficacy to a known effective therapy. The 
known efficacy of the control is then attributed to the new drug. 
Clinical studies designed to demonstrate efficacy of a new drug by 
showing that it is similar in efficacy to a standard agent have been 
called ``equivalence'' trials. Because in this case the finding of 
interest is one-sided, these are actually noninferiority trials, 
attempting to show that the new drug is not less effective than the 
control by more than a defined amount. As the fundamental assumption 
of such studies is that showing noninferiority is evidence of 
efficacy, the decision to utilize this trial design necessitates 
attention to the question of whether the active control can be 
relied upon to have an effect in the setting of the trial and 
whether, as a result, the trial can be relied on not to find a truly 
inferior drug to be noninferior (see section 1.5).
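     The one-sided comparison described above can be sketched in 
Python. This is a minimal illustration under stated assumptions, not a 
method given in the guidance: the function names, the response counts, 
the margin of 0.10, and the use of a simple normal-approximation (Wald) 
confidence bound for a difference in response rates are all 
hypothetical choices made for the example.

```python
from math import sqrt
from statistics import NormalDist


def lower_bound_difference(x_test, n_test, x_ctrl, n_ctrl, alpha=0.025):
    """One-sided lower confidence bound for (test rate - control rate).

    Uses a normal-approximation (Wald) interval, assumed here for simplicity.
    """
    p_test = x_test / n_test
    p_ctrl = x_ctrl / n_ctrl
    se = sqrt(p_test * (1 - p_test) / n_test + p_ctrl * (1 - p_ctrl) / n_ctrl)
    z = NormalDist().inv_cdf(1 - alpha)
    return (p_test - p_ctrl) - z * se


def is_noninferior(x_test, n_test, x_ctrl, n_ctrl, margin, alpha=0.025):
    """Declare noninferiority if the test drug is not worse than the
    control by more than `margin` (a positive, prespecified amount)."""
    return lower_bound_difference(x_test, n_test, x_ctrl, n_ctrl, alpha) > -margin


# Hypothetical data: 320/400 responders on test drug, 328/400 on control.
bound = lower_bound_difference(320, 400, 328, 400)
passes_wide_margin = is_noninferior(320, 400, 328, 400, margin=0.10)
passes_narrow_margin = is_noninferior(320, 400, 328, 400, margin=0.05)
```

     With these hypothetical counts the same data meet a margin of 0.10 
but not a margin of 0.05, which shows how much the conclusion depends 
on the defined amount. Meeting the margin does not by itself show that 
the control had an effect in the trial (see section 1.5).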

 1.4.2 Comparative Efficacy and Safety

     In some cases, the focus of the trial is the comparison with 
another agent, not the efficacy of the test drug per se. Depending 
on the therapeutic area, these trials may be seen as providing 
information needed for relative benefit-risk assessment. The active 
comparator(s) should be acceptable to the region for which the data 
are meant. Depending on the situation, it may not be necessary to 
show equivalence or noninferiority; for example, a less effective 
drug could have safety advantages and thus be considered useful.
     Even though the primary focus of such a trial is the comparison 
of treatments rather than demonstration of efficacy, the cautions 
described for conducting and interpreting noninferiority trials need 
to be taken into account (see section 1.5). The ability of the 
comparative trial to detect a difference between treatments when one 
exists needs to be established because a trial incapable of 
distinguishing between treatments that are in fact different cannot 
provide useful comparative information.
     In addition, for the comparative trial to be informative 
concerning relative benefit and risk, the trial needs to be fair, 
i.e., each drug should have an opportunity to perform well. In 
practice, an active-control equivalence/noninferiority trial offered 
as evidence of efficacy should also almost always provide a fair 
comparison with the control, because any 
doubt as to whether the control in the study had its usual effect 
would undermine assurance that the trial had assay sensitivity (see 
section 1.5). Note that fairness is not an issue when the purpose of 
the trial is to show efficacy by demonstrating superiority to the 
control (i.e., the trial will show such efficacy even if the 
comparator is poorly used; such a trial will not, however, show an 
advantage over the control).
     Among aspects of study design that could unfairly favor one 
treatment group are choice of dose or patient population and 
selection and timing of endpoints.
     1.4.2.1  Dose. In comparing the test drug with an active 
control for the purpose of assessing relative benefit/risk, it is 
important to choose an appropriate dose and dose regimen of the 
control. In examining the results of a comparison of two drugs, it 
is important to consider whether an apparently less effective 
control drug has been used at too low a dose or whether the 
apparently less well tolerated control drug has been used at too 
high a dose. In some cases, to show superior efficacy or safety 
convincingly it will be necessary to study several doses of the 
control and perhaps of the test agent, unless the dose of test agent 
chosen is superior to any dose (or the only recommended dose) of the 
control and at least as well tolerated.
     1.4.2.2  Patient population. Selection of subjects for an 
active-control trial can affect outcome; the population studied 
should be carefully considered in evaluating what the trial has 
shown. For example, if subjects are drawn from a population of 
nonresponders to the standard agents, there would be a bias in favor 
of the new agent. The results of such a study could not be 
generalized to the entire population of previously untreated 
patients. The result is, however, still good evidence of the 
efficacy of the new drug. Moreover, a formal study of a new drug in 
nonresponders to other therapy, in which treatment failures are 
randomized to either the new or failed therapy (so long as this does 
not place the patients at risk), can provide an excellent 
demonstration of the value of the new agent in such nonresponders, a 
clinically valuable observation (see appendix).
     Similarly, it is sometimes possible to identify patient subsets 
more or less likely to have a favorable response or to have an 
adverse response to a particular drug. For example, black patients 
as a group tend to respond less well to the blood pressure-lowering 
effects of beta blockers and angiotensin-converting enzyme 
inhibitors, so that a comparison of a new antihypertensive with 
these drugs in these patients would tend to show superiority of the 
new drug. It would not be appropriate to 
conclude that the new drug is generally superior. Again, however, a 
planned study in a subgroup, with recognition of its limitations and 
of what conclusion can properly be drawn, could be informative. See 
the appendix for a general discussion of ``enrichment'' study 
designs, studies that choose a subset of the overall population to 
increase sensitivity of the study or to answer a specific, but 
narrow, question.
     1.4.2.3  Selection and timing of endpoints. When two treatments 
are used for the same disease or condition, they may differentially 
affect various outcomes of interest in that disease, particularly if 
they represent different classes or modalities of therapy. 
Therefore, when comparing them in a clinical trial, the choice and 
timing of endpoints may favor one therapy or the other. For example, 
thrombolytics in patients with acute myocardial infarction can 
reduce mortality but increase stroke risk. If a new, more active 
thrombolytic were compared with an older thrombolytic, the more 
active drug might look better if the endpoint were mortality, but 
worse if the endpoint were a composite of mortality and disabling 
stroke. Similarly, in comparing two analgesics in the management of 
dental pain, assigning a particularly heavy weight to pain at early 
time points would favor the agent with more rapid onset over an 
agent that provides greater or longer lasting relief.
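     The analgesic example can be made concrete with a small Python 
sketch. All numbers are invented for illustration: the hourly 
pain-relief scores for the two hypothetical agents, the two weighting 
schemes, and the function name weighted_relief are assumptions, not 
data or methods from the guidance.

```python
def weighted_relief(scores, weights):
    """Weighted sum of pain-relief scores across assessment times."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(s * w for s, w in zip(scores, weights))


# Hypothetical relief scores (0 = none, 3 = complete) at hours 1 through 6.
rapid_onset = [3, 3, 2, 1, 0, 0]      # fast relief that wears off early
longer_lasting = [1, 2, 3, 3, 3, 2]   # slower onset, more sustained relief

# Two hypothetical ways of weighting the time points in the endpoint.
early_heavy = [4, 3, 1, 1, 1, 1]      # emphasizes the first two hours
uniform = [1, 1, 1, 1, 1, 1]          # treats every hour equally

early_favors_rapid = (
    weighted_relief(rapid_onset, early_heavy)
    > weighted_relief(longer_lasting, early_heavy)
)
uniform_favors_lasting = (
    weighted_relief(longer_lasting, uniform)
    > weighted_relief(rapid_onset, uniform)
)
```

     In this sketch the same observations favor the rapid-onset agent 
under early-heavy weighting and the longer-lasting agent under uniform 
weighting, so the endpoint definition alone reverses the comparison.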

 1.5 Sensitivity-to-Drug-Effects and Assay Sensitivity of Studies 
Intended to Show Noninferiority/Equivalence

     As noted in section 1.4.1, use of an active-control 
noninferiority/equivalence design to demonstrate efficacy poses a 
particular problem, one not found in trials intended to show a 
difference between treatments. A demonstration of efficacy by 
showing noninferiority/equivalence of the new therapy to the 
established effective treatment or, more accurately, by showing that 
the difference between them is no larger than a specified size 
(margin), rests on a critical assumption: that if there is a true 
difference between the treatments, i.e., if the new drug has a much 
smaller effect or no effect, the study would not have concluded 
there was no such difference. This assumption, in turn, rests on the 
assumption that the active-control drug will have had an effect of a 
defined size in the study. If these assumptions are incorrect, an 
erroneous conclusion that a drug is effective may be reached because 
a trial seeming to support noninferiority will not in fact have done 
so.
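     The consequence of an incorrect assumption about the control's 
effect can be illustrated with a short Python sketch. Everything here 
is hypothetical: the response rates, the sample sizes, the margin of 
0.10, the function name declared_noninferior, and the 
normal-approximation confidence bound are assumptions chosen for the 
example. In both scenarios the test drug is inert, with the response 
rate a placebo would be expected to have (50 percent).

```python
from math import sqrt
from statistics import NormalDist


def declared_noninferior(x_test, n_test, x_ctrl, n_ctrl, margin, alpha=0.025):
    """Return True if the lower confidence bound for (test - control)
    lies above -margin (normal approximation, assumed for simplicity)."""
    p_t = x_test / n_test
    p_c = x_ctrl / n_ctrl
    se = sqrt(p_t * (1 - p_t) / n_test + p_c * (1 - p_c) / n_ctrl)
    z = NormalDist().inv_cdf(1 - alpha)
    return (p_t - p_c) - z * se > -margin


MARGIN = 0.10  # hypothetical prespecified margin

# Scenario A: the control has its usual effect (70% respond), so the
# inert test drug (50% respond) is correctly not declared noninferior.
control_works = declared_noninferior(200, 400, 280, 400, MARGIN)

# Scenario B: the control has no effect in this trial (50% respond), so
# the inert test drug looks "as good as" the control and passes.
control_fails = declared_noninferior(200, 400, 200, 400, MARGIN)
```

     Scenario B is the erroneous conclusion described above: the trial 
appears to support noninferiority, yet neither treatment showed any 
effect, and nothing in the trial's own data reveals this.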
     The ability of a specific trial to detect differences between 
treatments if they exist has been called, and is here termed, 
``assay sensitivity.'' In the noninferiority trial setting, assay 
sensitivity requires that there be an effect of the control drug in 
the trial of at least a specified size and that, because of the 
presence of that effect, the trial has an ability not to declare 
noninferiority of a new drug when the new drug is in fact inferior. 
As noted, because the actual effect size of the control in the trial 
is not measured, the presence of assay sensitivity must be deduced. 
In this document, the term assay sensitivity, a property of a 
particular trial, is distinguished from sensitivity-to-drug-effects. 
Sensitivity-to-drug-effects is defined as the ability of 
appropriately designed and conducted trials in a specific 
therapeutic area, using a specific active drug (or other drugs with 
similar effects), to reliably show a drug effect of at least a 
minimum size under the conditions of the trial. Sensitivity-to-drug-
effects is determined from historical experience; it will usually be 
established by a determination that such trials, when adequately 
powered, regularly distinguish active drugs from placebo. 
Sensitivity-to-drug-effects, established in this way, will imply 
that, in a similarly well-designed and conducted noninferiority 
trial, there will be an ability not to find an ineffective agent to 
be noninferior. Assay sensitivity, in contrast, applies to a 
specific trial and requires the actual presence of a control drug 
effect and thus the actual ability of the trial not to declare an 
inferior drug noninferior. This ability depends on the details of 
the design and conduct of a specific trial, as well as the presence 
of sensitivity-to-drug-effects.

 1.5.1 Need to Ensure Assay Sensitivity in Noninferiority 
(Equivalence) Trials; Difference-Showing Versus Noninferiority 
Studies

     When designing a noninferiority study, study designers need to 
consider the fundamental distinction between two kinds of clinical 
trials: (1) Those that seek to demonstrate efficacy by showing 
superiority of a treatment to a control (superiority trials) and (2) 
those that seek to show efficacy by demonstrating that a new 
treatment is as good as (not inferior by some specified amount to) a 
treatment known to be effective. In the difference-showing trial, 
the finding of a difference itself documents the assay sensitivity 
of the trial and documents the efficacy of the superior treatment, 
so long as the inferior treatment, if an active drug, is known to be 
no worse than a placebo. In the noninferiority situation, in 
contrast, a finding of noninferiority leaves unanswered the 
question: Would the study have led to a conclusion of noninferiority 
even if the study drug were inferior? In a noninferiority trial 
without a placebo group, there is no internal standard (that is, a 
showing of an active drug-placebo difference) to measure/ensure 
assay sensitivity. The existence of assay sensitivity of the trial 
therefore needs to be deduced or assumed based on past experience 
(``historically'') with the control drug, generally from placebo-
controlled trials, establishing the sensitivity-to-drug-effects of 
well-designed and conducted trials, together with evidence that the 
trial was in fact well conducted.
     The question of assay sensitivity, although particularly 
critical in noninferiority studies, actually arises in any trial 
that fails to detect a difference between treatments, including a 
placebo-controlled trial. If a drug fails to show superiority to 
placebo, for example, it means either that the drug was ineffective 
or that the study was not capable of detecting the effect of the 
drug. A straightforward solution to the problem of assay sensitivity 
is the three-arm study, including both placebo and a known active 
treatment, a study design with several advantages. Such a study 
measures effect size (test drug versus placebo) and allows 
comparison of test drug and active control in a setting where assay 
sensitivity is established by the active control-placebo comparison. 
The design is also particularly informative when the test drug and 
placebo give similar results in the study. In that case, if the 
active control is superior to placebo, the study did have assay 
sensitivity and the study provides some evidence that the test drug 
has little or no efficacy. On the other hand, if neither drug, 
including the known effective active control, can be distinguished 
from placebo with

[[Page 51771]]

respect to efficacy, the clinical study lacks assay sensitivity and 
does not provide evidence that the drug is ineffective.
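
     The reading of a three-arm study described above can be 
sketched as a small decision function. This is an illustration only: 
the function name and the two boolean inputs (each standing for the 
outcome of a prespecified comparison with placebo) are assumptions 
made for the example and are not part of this guidance.

```python
def interpret_three_arm(test_beats_placebo, active_beats_placebo):
    """Summarize a three-arm (test, active control, placebo) result.

    Both arguments are hypothetical booleans: True means the arm was
    shown superior to placebo in the study's prespecified analysis.
    """
    if test_beats_placebo:
        # A demonstrated difference documents assay sensitivity itself.
        return "test drug distinguished from placebo; efficacy supported"
    if active_beats_placebo:
        # The active control-placebo difference establishes assay sensitivity.
        return ("study had assay sensitivity; some evidence that the "
                "test drug has little or no efficacy")
    # Neither drug separated from placebo.
    return ("study lacked assay sensitivity; no evidence that the "
            "test drug is ineffective")


print(interpret_three_arm(False, True))
print(interpret_three_arm(False, False))
```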

 1.5.2 Choosing the Noninferiority Margin

     As noted earlier, most active-control ``equivalence'' trials 
are really noninferiority trials intended to establish the efficacy 
of a new drug. Analysis of the results of noninferiority trials is 
discussed in the ICH guidances E9 and E3. Briefly, in such a trial, 
new and established therapies are compared. Prior to the trial, an 
equivalence or noninferiority margin, sometimes called a ``delta,'' 
is selected. This margin is the degree of inferiority of the test 
drug compared to the control that the trial will attempt to exclude 
statistically. If the confidence interval for the difference between 
the test and control treatments excludes a degree of inferiority of 
the test drug as large as, or larger than, the margin, the test drug 
can be declared noninferior and thus effective; if the confidence 
interval includes a difference as large as the margin, the test drug 
cannot be declared noninferior and cannot be considered effective.
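
     As a minimal sketch of the confidence interval rule just 
described, the following assumes a normally distributed estimate of 
the test-minus-control difference on a scale where larger values are 
better. The function name, the two-sided 95 percent level, and the 
numbers are illustrative assumptions; the analysis itself is 
discussed in ICH E9 and E3.

```python
from statistics import NormalDist


def is_noninferior(diff, std_err, margin, conf_level=0.95):
    """Apply the confidence-interval rule for noninferiority.

    diff     estimated test-minus-control difference (larger = better)
    std_err  standard error of that estimate
    margin   noninferiority margin ("delta"), given as a positive number
    Returns (lower_bound, declared_noninferior).
    """
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    lower = diff - z * std_err
    # Noninferior only if the interval excludes a deficit as large as
    # (or larger than) the margin.
    return lower, lower > -margin


# Hypothetical results: same point estimate, different precision.
print(is_noninferior(diff=-1.0, std_err=1.0, margin=4.0))
print(is_noninferior(diff=-1.0, std_err=2.0, margin=4.0))
```

     In the second call the wider interval includes a deficit as 
large as the margin, so noninferiority cannot be declared even 
though the point estimate is unchanged.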
     The margin chosen for a noninferiority trial cannot be greater 
than the smallest effect size that the active drug would be reliably 
expected to have compared with placebo in the setting of the planned 
trial, but may be smaller based on clinical judgment. If a 
difference between active control and new drug favors the control by 
as much as or more than that amount, the new drug might have no 
effect at all. The margin generally is identified based on past 
experience in placebo-controlled trials of adequate design under 
conditions similar to those planned for the new trial. Note that 
exactly how to calculate the margin is not described in this 
document, and there is little published experience on how to do 
this. The determination of the margin is based on both statistical 
reasoning and clinical judgment, should reflect uncertainties in the 
evidence on which the choice is based, and should be suitably 
conservative. If this is done properly, a finding that the 
confidence interval for the difference between new drug and the 
active control excludes a suitably chosen margin could provide 
assurance that the drug has an effect greater than zero. In 
practice, the margin chosen usually will be smaller than that 
suggested by the smallest expected effect size of the active control 
because of interest in ensuring that some particular clinically 
acceptable effect size (or fraction of the control drug effect) was 
maintained. This would also be true in a trial whose primary focus 
is the therapeutic equivalence of a test drug and active control 
(see section 1.4.2), where it would be usual to seek assurance that 
the test and control drug were quite similar, not simply that the 
new drug had any effect at all.
     The fact that the choice of the margin to be excluded can only 
be based on past experience gives the noninferiority trial an 
element in common with a historically controlled (externally 
controlled) study. This study design is appropriate and reliable 
only when the historical estimate of an expected drug effect can be 
well supported by reference to the results of previous studies of 
the control drug. These studies should lead to the conclusion that 
the active control can consistently be distinguished from placebo in 
trials of design similar to the proposed trial (patient population, 
study size, study endpoints, dose, concomitant therapy, etc.) and 
should identify an effect size that represents the smallest effect 
that the control can reliably be expected to have. If placebo-
controlled trials of a design similar to the one proposed more than 
occasionally show no difference between the proposed active control 
and placebo, and this cannot be explained by some characteristic of 
the study, only superiority of the test drug would be interpretable. 
Note that it is the estimated difference from placebo, not the total 
change from baseline, that needs to be used to calculate the 
expected effect of the control.
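
     This document does not prescribe a method for calculating the 
margin, so the following is only a hedged sketch of the two points 
made above: the control effect is the difference from placebo, not 
the change from baseline, and the margin cannot exceed the smallest 
effect the control can reliably be expected to have. The function 
names, the historical values, and the choice to preserve half of the 
control effect are all hypothetical. A real determination would also 
reflect the uncertainty in each historical estimate and clinical 
judgment.

```python
def placebo_adjusted_effect(control_change, placebo_change):
    """Control effect as difference from placebo, not change from baseline."""
    return control_change - placebo_change


def noninferiority_margin(smallest_reliable_effect, fraction_preserved=0.5):
    """Margin that preserves a chosen fraction of the control effect.

    fraction_preserved=0.5 is an illustrative assumption only.
    """
    if smallest_reliable_effect <= 0:
        raise ValueError("no reliable control effect; margin cannot be supported")
    if not 0 <= fraction_preserved < 1:
        raise ValueError("fraction_preserved must be in [0, 1)")
    return (1 - fraction_preserved) * smallest_reliable_effect


# Hypothetical historical trials: (control change, placebo change from baseline).
historical = [(10.0, 6.0), (9.0, 4.0), (11.0, 5.0)]
effects = [placebo_adjusted_effect(c, p) for c, p in historical]
smallest = min(effects)
print(effects, smallest, noninferiority_margin(smallest))
```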

 1.5.3 Sensitivity-to-Drug-Effects Is Difficult to Support in Many 
Situations

     Whether the historically based assurance of sensitivity-to-
drug-effects of a trial is supported in any given case is to some 
degree a matter of judgment. There are many conditions, however, in 
which drugs considered effective cannot regularly be shown superior 
to placebo in well-controlled studies, and one therefore cannot 
reliably determine a minimum effect the drug will have in the 
setting of a specific trial. Such conditions tend to include those 
in which there is substantial improvement and variability in placebo 
groups, and/or in which the effects of therapy are small, or 
variable, such as depression, anxiety, dementia, angina, symptomatic 
congestive heart failure, seasonal allergies, and symptomatic 
gastroesophageal reflux disease.
     In all these cases, there is no doubt that the standard 
treatments are effective because there are many well-controlled 
studies of each of these drugs that have shown an effect. Based on 
available experience, however, it would be difficult to describe 
study conditions in which the drug would reliably have at least a 
minimum effect (i.e., conditions in which there is sensitivity-to-
drug-effects) and that, therefore, could be used to identify an 
appropriate margin. In some cases, the experience on which the 
expectation of sensitivity-to-drug-effects is based may be of 
questionable relevance, e.g., if standards of treatment and 
diagnosis have changed substantially over time. If someone proposing 
to use an active-control noninferiority design cannot provide 
acceptable support for the sensitivity-to-drug-effects of the study 
with the chosen noninferiority margin, a finding of noninferiority 
cannot be considered informative with respect to efficacy or to a 
showing of clinical comparability/equivalence.

 1.5.4 Assay Sensitivity and Study Quality in Noninferiority 
Designs

     Even where historical experience indicates that studies in a 
particular therapeutic area are likely to have sensitivity-to-drug-
effects, this likelihood can be undermined by the particular 
circumstances under which the study was conducted. Great attention 
therefore needs to be paid to how the trial was designed and 
conducted to determine whether it actually did have assay 
sensitivity. There are many factors that can reduce a trial's assay 
sensitivity, such as:
     1. Poor compliance with therapy
     2. Poor responsiveness of the study population to drug effects
     3. Use of concomitant medication or other treatment that 
interferes with the test drug or that reduces the extent of the 
potential response
     4. A population that tends to improve spontaneously, leaving no 
room for further drug-induced improvement
     5. Poor diagnostic criteria (patients lacking the disease to be 
studied)
     6. Inappropriate (insensitive) measures of drug effect
     7. Excessive variability of measurements
     8. Biased assessment of endpoint because of knowledge that all 
patients are receiving a potentially active drug, e.g., a tendency 
to read blood pressure responses as greater than they actually are, 
reducing the difference between test drug and control
     Clinical researchers and trial sponsors intend to perform high 
quality studies, and the publication of the Good Clinical Practices 
guidance will enhance study quality. Nonetheless, it should be 
appreciated that in trials intended to show a difference between 
treatments there is a strong imperative to utilize a good study 
design and minimize study errors, because trial imperfections 
increase the likelihood of failing to show a difference between 
treatments when one exists. In placebo-controlled trials, for 
example, there is often a withdrawal period to be sure study 
subjects actually have the disease for which treatment is intended, 
and great care is taken in defining entry criteria to be sure 
patients have an appropriate stage of the disease. It is common to 
have a single-blind placebo run-in period to discover and eliminate 
subjects who recover spontaneously, whose measurements are too 
variable, or who are likely to comply poorly with the protocol. 
There is close attention to trial conduct, including administration 
of the correct treatments to patients, encouraging compliance with 
medication use, controlling (or at least recording) concomitant drug 
use and other concomitant illness, and use of standard procedures 
for measurement (technique, timing, training periods). All of these 
efforts will help ensure that an effective drug will be 
distinguished from placebo. Nonetheless, in many clinical settings, 
despite the strong stimulus and extensive efforts to ensure study 
excellence and assay sensitivity, clinical studies are often unable 
to reliably distinguish effective drugs from placebo.
     In contrast, in trials intended to show that there is not a 
difference of a particular size (noninferiority) between two 
treatments, there is a much weaker stimulus to engage in many of 
these efforts, which help ensure that differences will be detected, 
i.e., ensure sensitivity, because failure to show a difference 
greater than the margin is the desired outcome of the study. 
Although some kinds of study error diminish observed differences 
between treatments, it is noted that some kinds of study errors can 
increase variance, which would decrease the likelihood of showing 
noninferiority by widening the confidence interval so that a

[[Page 51772]]

test drug-control difference greater than the margin cannot be 
excluded. There would therefore be a strong stimulus in these trials 
to reduce variance, which might be caused, for example, by poor 
measurement technique. Many errors of the kind described, however, 
reduce the observed difference between treatments (and thus assay 
sensitivity) without necessarily increasing variance. They therefore 
increase the 
likelihood that an inferior drug will be found noninferior.
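
     The effect of errors that shrink the observed difference 
without adding variance can be illustrated with a short calculation. 
The sketch assumes a normally distributed difference estimate and 
models the study error, hypothetically, as a simple dilution factor 
applied to the true difference (for example, from poor compliance). 
Neither the function name nor the numbers come from this guidance.

```python
from statistics import NormalDist


def prob_declare_noninferior(true_diff, std_err, margin, conf_level=0.95):
    """Chance that the CI lower bound clears -margin.

    true_diff is the expected test-minus-control difference actually
    observable in the trial (negative means the test drug is worse).
    """
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    # Declared noninferior when estimate - z*std_err > -margin,
    # where estimate ~ Normal(true_diff, std_err).
    threshold = -margin + z * std_err
    return 1 - NormalDist(true_diff, std_err).cdf(threshold)


margin, std_err = 4.0, 1.0
true_deficit = -4.0   # test drug inferior by exactly the margin
dilution = 0.5        # hypothetical: errors halve the observable difference

p_clean = prob_declare_noninferior(true_deficit, std_err, margin)
p_diluted = prob_declare_noninferior(dilution * true_deficit, std_err, margin)
print(round(p_clean, 3), round(p_diluted, 3))
```

     Under these assumptions the chance of wrongly declaring 
noninferiority rises from about 2.5 percent to roughly one in two, 
although the variance is unchanged.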
     When a noninferiority study is offered as evidence of 
effectiveness of a new drug, both the sponsor and regulatory 
authority need to pay particularly close attention to study quality. 
Whether a given study has assay sensitivity often cannot be 
determined, but the known reasons for failure to have such 
sensitivity should be monitored. The design and conduct of the study 
need to be shown to be similar to studies of the active control that 
were successful in the past. To ensure that sensitivity-to-drug-
effects seen in past studies is likely to be present in the new 
study, there should be close attention to critical design 
characteristics such as the entry criteria and characteristics of 
the study population (severity of medical condition, method of 
diagnosis), the specific endpoint measured and timing of 
assessments, and the use of washout periods to exclude patients 
without disease or to exclude patients with spontaneous improvement. 
Similarly, aspects of study conduct that could decrease assay 
sensitivity should also be examined, including such characteristics 
as compliance with therapy, monitoring of concomitant therapy, 
enforcement of entry criteria, and prevention of study dropouts.
     One other possibility should be considered. Even where a study 
seems likely to have sensitivity-to-drug-effects based on prior 
studies, the population studied or other aspects of study design or 
conduct in a noninferiority study may be so different that results 
with the active-control treatment are visibly atypical (e.g., cure 
rate in an antibiotic trial that is unusually high or low). In that 
case, the results of a noninferiority trial may not be persuasive.

 2.0 Detailed Consideration of Types of Control

 2.1 Placebo Control

 2.1.1 Description (See Section 1.3.1)

     In a placebo-controlled study, subjects are assigned, almost 
always by randomization, to either a test drug or to a placebo. A 
placebo is a ``dummy'' medication that appears as identical as 
possible to the investigational or test drug with respect to 
physical characteristics such as color, weight, taste and smell, but 
that does not contain the test drug. Some trials may study more than 
one dose of the test drug or include both an active control and 
placebo. In these cases, it may be easier for the investigator to 
use more than one placebo (``double-dummy'') than to try to make all 
treatments look the same. The use of placebo facilitates, and is 
almost always accompanied by, double-blinding (or double-masking). 
The difference in measured outcome between the active drug and 
placebo groups is the measure of drug effect under the conditions of 
the study. Within this general description there is a wide variety 
of designs that can be used successfully: Parallel or cross-over 
designs (see ICH E9), single fixed dose or titration in the active 
drug group, several fixed doses. Several designs meriting special 
attention will be described below. Note that not every study that 
includes a placebo is a placebo-controlled study. For example, an 
active-control study could use a placebo for each drug (double-
dummy) to facilitate blinding; this is still an active-control 
trial, not a placebo-controlled trial. A placebo-controlled trial is 
one in which treatment with a placebo is compared with treatment 
with an active drug.

 2.1.2 Ability to Minimize Bias

     The placebo-controlled trial, using randomization and blinding, 
generally reduces subject and investigator bias maximally, but such 
trials are not impervious to blind-breaking through recognition of 
pharmacologic effects of one treatment (perhaps a greater concern in 
cross-over designs); blinded outcome assessment can enhance bias 
reduction in such cases.

 2.1.3 Ethical Issues

     When a new agent is tested for a condition for which no 
effective treatment is known, there is usually no ethical problem 
with a study comparing the new agent to placebo. Use of a placebo 
control may raise problems of ethics, acceptability, and 
feasibility, however, when an effective treatment is available for 
the condition under study in a proposed trial. In cases where an 
available treatment is known to prevent serious harm, such as death 
or irreversible morbidity in the study population, it is generally 
inappropriate to use a placebo control. There are occasional 
exceptions, however, such as cases in which standard therapy has 
toxicity so severe that many patients will refuse therapy.
     In other situations, when there is no major health risk 
associated with withholding or delay of effective therapy, it is 
considered ethical to ask patients to participate in a placebo-
controlled trial, even if they may experience discomfort as a 
result, provided the setting is noncoercive and they are fully 
informed about available therapies and the consequences of delaying 
treatment. Such trials, however, may pose important practical 
problems. For example, deferred treatment of pain or other symptoms 
may be unacceptable to patients or physicians and they may not want 
to participate in such a study. Whether a particular placebo-
controlled trial of a new agent will be acceptable to subjects and 
investigators when there is known effective therapy is a matter of 
investigator, patient, and institutional review board (IRB)/
independent ethics committee (IEC) judgment, and acceptability may 
differ among ICH regions. Acceptability could depend on the specific 
design of the study and the patient population chosen, as will be 
discussed below (see section 2.1.5).
     Whether a particular placebo-controlled trial is ethical may, 
in some cases, depend on what is believed to have been clinically 
demonstrated and on the particular circumstances of the trial. For 
example, a short term placebo-controlled study of a new 
antihypertensive agent in patients with mild essential hypertension 
and no end-organ disease might be considered generally acceptable, 
while a longer study, or one that included sicker patients, probably 
would not be.
     It should be noted that use of a placebo or no-treatment 
control does not imply that the patient does not get any treatment 
at all. For instance, in an oncology trial, when no active drug is 
approved, patients in both the placebo/no-treatment group and the 
test drug group will receive needed palliative treatment, such as 
analgesics.

 2.1.4 Usefulness of Placebo-Controlled Trials and Quality/Validity 
of Inference in Particular Situations

     When used to show effectiveness of a treatment, the placebo-
controlled trial is as free of assumptions and need for external 
(extra-study) information as it is possible to be. Most trial design 
problems and careless errors result in failure to demonstrate a 
treatment difference (and thereby establish efficacy), so that the 
trial contains built-in incentives for study excellence. Even when 
the primary purpose of a trial is comparison of two active agents or 
assessment of dose-response, the addition of a placebo provides an 
internal standard that enhances the inferences that can be drawn 
from the other comparisons.
     Placebo-controlled trials also provide the maximum ability to 
distinguish adverse effects due to drug from those due to underlying 
disease or intercurrent illness. Note that where they are used to 
show similarity, for example, to show the absence of an adverse 
effect, placebo-controlled trials have the same assay sensitivity 
problem as any equivalence or noninferiority trial (see section 
1.5.1). To interpret the result, one must know that if the study 
drug caused an adverse event, it would have been observed.

 2.1.5 Modifications of Design and Combinations With Other Controls 
That Can Resolve Ethical, Practical, or Inferential Issues

     It is often possible to address the ethical or practical 
limitations of placebo-controlled trials by using modified study 
designs that still retain the inferential advantages of these 
trials. In addition, placebo-controlled trials can be made more 
informative by inclusion of additional treatment groups, such as 
multiple doses of the test agent or a known active-control 
treatment.
     2.1.5.1  Additional control groups.
     2.1.5.1.1  Three-arm study; placebo and active control. As 
noted in section 1.5.1, three-arm studies including an active-
control as well as a placebo-control group can readily assess 
whether a failure to distinguish test drug from placebo implies 
ineffectiveness of the test drug or simply a study that lacked the 
ability to identify an active drug. The placebo-standard drug 
comparison in such a trial provides internal evidence of assay 
sensitivity. It is possible to make the active groups larger than 
the placebo group in order to improve the precision of the active 
drug comparison, if this is considered important. This may also make 
the study more

[[Page 51773]]

appealing to patients, as there is less chance of being randomized 
to placebo.
     2.1.5.1.2  Additional doses. Randomization among several fixed 
doses of the test drug in addition to placebo allows assessment of 
dose-response and may be particularly useful in a comparative trial 
to ensure a fair comparison of treatments (see ICH E4: Dose-Response 
Information to Support Drug Registration).
     2.1.5.1.3  Factorial/combination studies. Factorial/combination 
(response-surface) designs may be used to explore 
several doses of the investigational drug as monotherapy and in 
combination with several doses of another agent proposed for use in 
combination with it. A single study of this type can define the 
properties of a wide array of combinations. Such studies are common 
in the evaluation of new antihypertensive therapies, but can be 
considered in a variety of settings where more than one treatment is 
used simultaneously. For example, the independent additive effects 
of aspirin and streptokinase in preventing mortality after a heart 
attack were shown in such a trial.
     2.1.5.2  Changes in study design.
     2.1.5.2.1 Add-on study, placebo-controlled; replacement study. 
An ``add-on'' study is a placebo-controlled trial of a new agent 
conducted in people also receiving standard therapy. Such studies 
are useful when standard therapy is known to decrease mortality or 
irreversible morbidity, so that the therapy cannot be withheld from 
a patient population known to benefit from it, and when a 
noninferiority trial with standard treatment as the active control 
cannot be carried out or would be difficult to interpret (see 
section 1.5). It is common to study anticancer, antiepileptic, and 
anti-heart-failure drugs this way. This design is useful only when 
standard therapy is not fully effective (which, however, is almost 
always the case), and it has the advantage of providing evidence of 
improved clinical outcomes (rather than ``mere'' noninferiority). 
Efficacy is, of course, established by such studies only for 
combination therapy, and the dose in a monotherapy situation might 
be different from the dose found to be effective in combination. In 
general, this approach is likely to succeed only when the new and 
standard therapies utilize different pharmacologic mechanisms, 
although there are exceptions. For example, AIDS combination 
therapies may show a beneficial effect of pharmacologically-related 
drugs because of delays in development of resistance.
     A variation of this design that can sometimes give information 
on monotherapy and that is particularly applicable in the setting of 
chronic disease is the replacement study, in which the new drug or 
placebo is added by random assignment to conventional treatment 
given at an effective dose and the conventional treatment is then 
withdrawn, usually by tapering. The ability to maintain the 
subjects' baseline status is then observed in the drug and placebo 
groups using predefined success criteria. This approach has been 
used to study steroid-sparing substitutions in steroid-dependent 
patients without need for initial steroid withdrawal and 
recrudescence of symptoms in a wash-out period, and has also been 
used to study antiepileptic drug monotherapy.
     2.1.5.2.2  ``Early escape''; rescue medication. It is possible 
to design a study to plan for ``early escape'' from ineffective 
therapy. Early escape refers to prompt removal of subjects whose 
clinical status worsens or fails to improve to a defined level 
(blood pressure not controlled by a prespecified time, seizure rate 
greater than some prescribed value, blood pressure rising to a 
certain level, angina frequency above a defined level, liver enzymes 
failing to normalize by a preset time in patients with hepatitis), 
who have a single event that treatment was intended to prevent 
(first recurrence of unstable angina, grand mal seizure, paroxysmal 
supraventricular arrhythmia), or who otherwise require added 
therapy. In such cases, the need to change therapy becomes a study 
endpoint. The criteria for deciding whether these endpoints have 
occurred should be well specified, and the timing of measurements 
should ensure that patients will not remain untreated with an active 
drug while their disease is poorly controlled. The primary 
difficulty with this trial design is that it may give information 
only on short-term effectiveness. The randomized withdrawal trial 
(see section 2.1.5.2.4), however, which can also incorporate early-
escape features, can give information on long-term effectiveness. It 
should be noted that formal use of rescue medication in response to 
clinical deterioration could be utilized similarly.
     2.1.5.2.3  Limited placebo period. In a longer term active-
control trial, the addition of a placebo group treated for a short 
period may establish assay sensitivity (at least for short-term 
effects). The trial would then continue without the placebo group.
     2.1.5.2.4  Randomized withdrawal. In a randomized withdrawal 
study, subjects receiving an investigational therapy for a specified 
time are randomly assigned to continued treatment with the 
investigational therapy or to placebo (i.e., withdrawal of active 
therapy). Subjects for such a trial could be derived from an 
organized open single-arm study, from an existing clinical cohort 
(but usually with a formal ``wash-in'' phase to establish the 
initial on-therapy baseline), from the active arm of a controlled 
trial, or from one or both arms of an active-control trial. Any 
difference that emerges between groups receiving continued treatment 
and placebo would demonstrate the effect of the active treatment. 
The prerandomization observation period on drug can be of any 
length; this approach can therefore be used to study long-term 
persistence of effectiveness when long-term placebo treatment would 
not be acceptable. The postwithdrawal observation period could be of 
fixed duration or could use early escape or time to event (e.g., 
relapse of depression) approaches. As with the early-escape design, 
procedures for monitoring patients and assessing study endpoints 
need careful attention to ensure that patients failing on an 
assigned treatment are identified rapidly.
     The randomized withdrawal approach is suitable in several 
situations. First, it may be suitable for drugs that appear to 
resolve an episode of recurring illness (e.g., antidepressants), in 
which case the withdrawal study is in effect a relapse-prevention 
study. Second, it may be used for drugs that suppress a symptom or 
sign (chronic pain, hypertension, angina), but where a long-term 
placebo-controlled trial would be difficult; in this case, the study 
can establish long-term efficacy. Third, the design can be used to 
determine how long a therapy should be continued (e.g., 
postinfarction treatments with a beta-blocker).
     The general advantage of randomized withdrawal designs, when 
used with an early-escape endpoint, such as return of symptoms, is 
that the period of placebo exposure with poor response that a 
patient would have to undergo is short.
     Dosing issues can be addressed by this type of design. After 
all patients had received an initial fixed dose, they could be 
randomly assigned in the ``withdrawal'' phase to several different 
doses (as well as placebo), a particularly useful approach when 
there is reason to think the initial and maintenance doses might be 
different, either on pharmacodynamic grounds or because there is 
substantial accumulation of active drug resulting from a long half 
life of parent drug or active metabolite. Note that the randomized 
withdrawal design could be used to assess dose-response after an 
initial placebo-controlled titration study. The titration study is 
an efficient design for establishing effectiveness, but does not 
give good dose-response information. The randomized withdrawal 
phase, with responders randomly assigned to several fixed doses and 
placebo, will study dose-response rigorously while allowing the 
efficiency of the titration design.
     In utilizing randomized withdrawal designs, it is important to 
appreciate the possibility of withdrawal phenomena, suggesting the 
wisdom of relatively slow tapering. A patient may develop tolerance 
to a drug such that no benefit is being accrued, but the drug's 
withdrawal may lead to disease exacerbation, resulting in an 
erroneous conclusion of persisting efficacy. It is also important to 
realize that treatment effects observed in these studies may be 
larger than those seen in the general population because randomized 
withdrawal studies are ``enriched'' with responders (see appendix). 
This phenomenon results when the study explicitly includes only 
subjects who appear to have responded to the drug or includes only 
people who have completed a previous phase of study (which is often 
an indicator of a good response).
     2.1.5.2.5  Other design considerations. In any placebo-
controlled study, unbalanced randomization (e.g., 2:1, study drug to 
placebo) may enhance the safety database and may also make the 
study more attractive to patients and/or investigators.

 2.1.6 Advantages of Placebo-Controlled Trials

     2.1.6.1  Ability to demonstrate efficacy credibly. As with other 
difference-showing trials, the interpretation of the placebo-
controlled study relies on no externally based

[[Page 51774]]

assumptions of sensitivity-to-drug-effects nor an assessment of 
assay sensitivity. These may be the only credible study designs in 
situations where it is not possible to conclude that noninferiority 
studies would have assay sensitivity (see section 1.5).
     2.1.6.2  Measures ``absolute'' effectiveness and safety. The 
placebo-controlled trial measures the absolute effect of treatment 
and allows a distinction between adverse events due to the drug and 
those due to the underlying disease or ``background noise.'' The 
absolute effect size information is valuable in a three-group trial 
(test, placebo, active), even if the primary purpose of the trial is 
the test versus active control comparison.
     2.1.6.3  Efficiency. Placebo-controlled trials are efficient in 
that they can detect treatment effects with a smaller sample size 
than any other type of concurrently controlled study. Active-control 
trials intended to show superiority of the new treatment are 
generally seeking smaller differences than the active-placebo 
difference sought in a placebo-controlled trial, resulting in the need 
for a larger sample size. Noninferiority active-control trials also 
need larger sample sizes because they must use conservative 
assumptions about the effect size of the control drug to ensure that 
noninferiority of the test drug would in fact demonstrate efficacy. 
Designers of dose-response studies need to guess at the shape and 
position of the dose-response curve and may wastefully assign some 
subjects to several doses that have no effect or are on a response 
plateau.
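     The sample size argument above can be made concrete with the 
usual normal-approximation formula for comparing two means, n per 
group = 2 (z_alpha + z_beta)^2 sigma^2 / delta^2. The Python sketch 
below is an illustration under assumptions not stated in this 
guidance (continuous endpoint, known common standard deviation, 
two-sided test at the 5 percent level, 80 percent power); the 
function name and the numbers are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate subjects per arm for a two-sided z-test of two means.

    delta is the difference the trial is designed to detect and sigma
    is the assumed common standard deviation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_beta = z.inv_cdf(power)
    return ceil(2.0 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# Hypothetical differences, in standard deviation units.
drug_vs_placebo = n_per_group(delta=0.50, sigma=1.0)
drug_vs_active = n_per_group(delta=0.25, sigma=1.0)
print(f"drug versus placebo: {drug_vs_placebo} per group")
print(f"drug versus active control: {drug_vs_active} per group")
```

     Because the required sample size varies with the inverse square 
of the difference sought, halving the difference roughly quadruples 
the number of subjects needed.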
     2.1.6.4  Minimizing the effect of subject and investigator 
expectations. Use of a blinded placebo control may decrease the 
amount of improvement resulting from subject or investigator 
expectations because both are aware that some subjects will receive 
no active drug. This may increase the ability of the study to detect 
true drug effects.

 2.1.7 Disadvantages of Placebo-Controlled Trials

     2.1.7.1  Ethical concerns (see sections 2.1.3 and 2.1.4). When 
effective therapy that is known to prevent harm exists for a 
particular population, that population cannot usually be ethically 
studied in placebo-controlled trials; the particular conditions and 
populations for which this is true may be controversial. Ethical 
concerns may also direct studies toward less ill subjects or cause 
studies to examine short-term endpoints when long-term outcomes are 
of greater interest. Where a placebo-controlled trial is unethical 
and an active-control trial would not be credible, it may be very 
difficult to study new drugs at all. For example, it would not be 
considered ethical to carry out a placebo-controlled trial of a beta 
blocker in postinfarction patients; yet it would be difficult to 
conclude that a noninferiority trial would have sensitivity-to-drug-
effects. The designs described in section 2.1.5 may be useful in 
some of these cases.
     2.1.7.2  Patient and physician practical concerns. Physicians 
and/or patients may be reluctant to accept the possibility that the 
patient will be assigned to the placebo treatment, even if there is 
general agreement that withholding or delaying treatment will not 
result in harm. Subjects who sense they are not improving may drop 
out of trials because they attribute lack of effect to having been 
treated with placebo, complicating the analysis of the study. With 
care, however, drop-out for lack of effectiveness can sometimes be 
used as a study endpoint. Although this may provide some information 
on drug effectiveness, such information is less precise than actual 
information on clinical status in subjects receiving their assigned 
treatment.
     2.1.7.3  Generalizability. It is sometimes argued that any 
controlled trial, but especially a placebo-controlled trial, 
represents an artificial environment that gives results different 
from true ``real world'' effectiveness. If study populations are 
unrepresentative in placebo-controlled trials because of ethical or 
practical concerns, questions about the generalizability of study 
results can arise. For example, patients with more serious disease 
may be excluded by protocol, investigator, or patient choice from 
placebo-controlled trials. In some cases, only a limited number of 
patients or centers may be willing to participate in studies. 
Whether these concerns actually (as opposed to theoretically) limit 
generalizability has not been established.
     2.1.7.4  No comparative information. Placebo-controlled trials 
lacking an active control give little useful information about 
comparative effectiveness, information that is of interest and 
importance in many circumstances. Such information cannot reliably 
be obtained from cross-study comparisons, as the conditions of the 
studies may have been quite different.

 2.2 No-Treatment Concurrent Control (See Section 1.3.2)

     The randomized no-treatment control is similar in its general 
properties and its advantages and disadvantages to the placebo-
controlled trial. Unlike the placebo-controlled trial, however, it 
cannot be fully blinded, and this can affect all aspects of the 
trial, including subject retention, patient management, and all 
aspects of observation (see section 1.2.2). This design is 
appropriate in circumstances where a placebo-controlled trial would 
be performed, except that blinding is not feasible because the 
treatments themselves are so different, e.g., radiation therapy 
versus surgery, or because the treatment side effects are so 
different. When this design is used, it is desirable to have 
critical decisions, such as eligibility and endpoint determination 
or changes in management, made by an observer blinded to treatment 
assignment. Decisions related to data analysis, such as inclusion of 
patients in analysis sets, should also be made by individuals 
without access to treatment assignment (see ICH E9 for further 
discussion).

 2.3 Dose-Response Concurrent Control (See Section 1.3.3)

 2.3.1 Description

     A dose-response study is one in which subjects are randomly 
assigned to one of several dosing groups, with or without a placebo 
group. Dose-response studies are carried out to establish the 
relation between dose and efficacy/adverse effects and/or to 
demonstrate efficacy. The first use is considered in ICH E4; the 
latter is the subject of this guidance. Evidence of efficacy could 
be based on significant differences in pair-wise comparisons between 
dosing groups or between dosing groups and placebo, or on evidence 
of a significant positive trend with increasing dose, even if no two 
groups are significantly different. In the latter case, however, 
further study may be needed to assess the effectiveness of the low 
doses. As noted in ICH E9, the particular approach for the primary 
efficacy analysis should be prespecified.
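     The point that a trend can be significant when no two groups 
differ significantly can be illustrated with a small Python sketch. 
It assumes equal group sizes, equally spaced doses, a known common 
standard deviation, and unadjusted two-sided tests at the 5 percent 
level; the group means, function names, and sample size are 
hypothetical and are not taken from any study.

```python
from itertools import combinations
from statistics import NormalDist


def pairwise_z(means, n, sigma):
    """z statistic for every pair of groups (equal n, known sigma)."""
    se = sigma * (2.0 / n) ** 0.5
    return {(i, j): (means[j] - means[i]) / se
            for i, j in combinations(range(len(means)), 2)}


def linear_trend_z(means, n, sigma):
    """z statistic for a linear contrast across equally spaced doses."""
    k = len(means)
    centre = (k - 1) / 2.0
    coeffs = [i - centre for i in range(k)]  # contrast weights sum to zero
    estimate = sum(c * m for c, m in zip(coeffs, means))
    se = sigma * (sum(c * c for c in coeffs) / n) ** 0.5
    return estimate / se


# Hypothetical group means for placebo and three equally spaced doses.
means = [0.00, 0.05, 0.33, 0.38]
group_size, sigma = 50, 1.0
critical = NormalDist().inv_cdf(0.975)  # two-sided 5 percent level

largest_pair = max(abs(z) for z in pairwise_z(means, group_size, sigma).values())
trend = linear_trend_z(means, group_size, sigma)
print(f"largest pairwise z = {largest_pair:.2f}, trend z = {trend:.2f}, "
      f"critical value = {critical:.2f}")
```

     In this constructed example every pair-wise comparison falls 
short of the critical value while the linear trend exceeds it, which 
is the situation in which the effectiveness of the lower doses 
remains uncertain.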
     There are several advantages to inclusion of a placebo (zero-
dose) group in a dose-response study. First, it avoids studies that 
are uninterpretable because all doses produce similar effects so 
that one cannot assess whether all doses are equally effective or 
equally ineffective. Second, the placebo group permits an estimate 
of absolute size of effect, although the estimate may not be very 
precise if the dosing groups are relatively small. Third, as the 
drug-placebo difference is generally larger than inter-dose 
differences, use of placebo may permit smaller sample sizes. The 
size of various dose groups need not be identical; e.g., larger 
samples could be used to give more precise information about the 
effect of smaller doses or be used to increase the power of the 
study to show a clear effect of what is expected to be the optimal 
dose. Dose-response studies can include one or more doses of an 
active-control agent. Randomized withdrawal designs can also assign 
subjects to multiple dosage levels.

 2.3.2 Ability to Minimize Bias

     If the dose-response study is blinded, it shares with other 
blinded designs an ability to minimize subject and investigator 
bias. When a drug has pharmacologic effects that could break the 
blind for some patients or investigators, it may be easier to 
preserve blinding in a dose-response study than in a placebo-
controlled trial. Masking treatments may necessitate multiple 
dummies or preparation of several different doses that look alike.

 2.3.3 Ethical Issues

     The ethical and practical concerns related to a dose-response 
study are similar to those affecting placebo-controlled trials. 
Where there is therapy known to be effective in preventing death or 
irreversible morbidity, it is no more ethically acceptable to 
randomize deliberately to subeffective therapy than it is to 
randomize to placebo. Where therapy is directed at less serious 
conditions or where the toxicity of the therapy is substantial 
relative to its benefits, dose-response studies that use low, 
potentially subeffective doses or placebo may be acceptable to 
patients and investigators.

 2.3.4 Usefulness of Dose-Response Studies and Quality/Validity of 
Inference in Particular Situations

     In general, a blinded dose-response study is useful for the 
determination of efficacy and safety in situations where a placebo-
controlled trial would be useful and has similar credibility (see 
section 2.1.4).

 2.3.5 Modifications of Design and Combinations With Other Controls 
That Can Resolve Ethical, Practical, or Inferential Problems

     In general, the sorts of modification made to placebo-
controlled studies to mitigate ethical, practical, or inferential 
problems are also applicable to dose-response studies (see section 
2.1.5).

 2.3.6 Advantages of Dose-Response Trials, Other Than Those Related 
to Any Difference-Showing Study

     2.3.6.1  Efficiency. Although a comparison of a large, fully 
effective dose to placebo is maximally efficient for showing 
efficacy, this design may produce unacceptable toxicity and gives no 
dose-response information. When the dose-response is monotonic, the 
dose-response trial is reasonably efficient in showing efficacy and 
also yields dose-response information. If the optimally effective 
dose is not known, it may be more prudent to study a range of doses 
than to choose a single dose that may prove to be suboptimal or 
toxic.
     2.3.6.2  Possible ethical advantage. In some cases, notably 
those in which there is likely to be dose-related efficacy and dose-
related important toxicity, the dose-response study may represent a 
difference-showing trial that can be ethically or practically 
conducted even where a placebo-controlled trial could not be, 
because there is reason for patients and investigators to accept 
lesser effectiveness in return for greater safety.

 2.3.7 Disadvantages of Dose-Response Study

     A potential problem that needs to be recognized is that a 
positive dose-response trend (i.e., a significant correlation 
between the dose and the efficacy outcome), without significant 
pair-wise differences, can establish efficacy, but may leave 
uncertainty as to which doses (other than the largest) are actually 
effective. But, of course, a single-dose study poses a similar 
problem with respect to doses below the one studied, giving no 
information at all about such doses.
     It should also be appreciated that it is not uncommon to show 
no difference between doses in a dose-response study; if there is no 
placebo group to provide a clear demonstration of an effect, this is 
a very costly ``no test'' outcome.
     If the therapeutic range is not known at all, the design may be 
inefficient, as many patients may be assigned to subtherapeutic or 
supratherapeutic doses.
     Dose-response designs may be less efficient than placebo-
controlled titration designs for showing the presence of a drug 
effect; they do, however, in most cases provide better dose-response 
information (see ICH E4).

 2.4 Active Control

 2.4.1 Description (See Section 1.3.4)

     An active-control (positive-control) trial is one in which an 
investigational drug is compared with a known active drug. Such 
trials are usually randomized and usually double-blind. The most 
crucial design question is whether the trial is intended to show a 
difference between the two drugs or to show noninferiority/
equivalence. A sponsor intending to demonstrate effectiveness by 
means of a trial showing noninferiority of the test drug to a 
standard agent needs to address the issue of the sensitivity-to-
drug-effects and assay sensitivity of the trial, as discussed in 
section 1.5. In a noninferiority/equivalence trial, the active-
control agent needs to be of established efficacy at the dose used 
and under the conditions of the study (see ICH E9: Statistical 
Principles for Clinical Trials). In general, this means it should be 
an agent acceptable in the region to which the studies will be 
submitted for the same indication at the dose being studied. A 
superiority study favoring the test drug, on the other hand, is 
readily interpretable as evidence of efficacy, even if the dose of 
active control is too low or the active control is of uncertain 
benefit (but not if it could be harmful). Such a result, however--
superiority in the trial of the test agent to the control--is 
interpretable as actual superiority of the test drug to the control 
treatment only when the active control is used in appropriate 
patients at an optimal dose and schedule (see section 1.4.2). Lack 
of appropriate use of the control drug would also make the study 
unusable as a noninferiority study if superiority of the test drug 
is not shown, because assay sensitivity of the study would not be 
ensured (see section 1.5.4).

 2.4.2 Ability to Minimize Bias

     A randomized and blinded active-control trial generally 
minimizes subject and investigator bias, but a note of caution is 
warranted. In a noninferiority trial, investigators and subjects 
know that all subjects are getting active drug, although they do not 
know which one. This could lead to a biased interpretation of 
results in the form of a tendency toward categorizing borderline 
cases as successes in partially subjective evaluations, e.g., in an 
antidepressant study. Such biases may decrease variance and/or 
treatment differences and thus can increase the likelihood of an 
incorrect finding of equivalence.

 2.4.3 Ethical Issues

     Active-control trials are generally considered to pose fewer 
ethical and practical problems than placebo-controlled trials 
because all subjects receive active treatment. It should be 
appreciated, however, that subjects getting a new agent are not 
getting standard therapy (just as a placebo group is not) and may be 
receiving an ineffective or harmful drug. This is an important 
matter if the active-control therapy is known to improve survival or 
decrease the occurrence of irreversible morbidity. There should 
therefore be a sound rationale for the investigational agent. If 
there is not strong reason to expect the new drug to be at least as 
good as the standard, an add-on study (see section 2.1.5.2.1) may be 
more appropriate, if the conditions allow such a design.
     Using a very low dose, either of the active control or of the 
test drug, may provide a de facto placebo that can be shown inferior 
to the full dose of the test drug. This, however, is only considered 
ethical where a placebo would also be ethical, unless there is a 
legitimate reason to study such low doses.

 2.4.4 Usefulness of Active-Control Trials and Quality/Validity of 
Inference in Particular Situations

     When a new drug shows an advantage over an active control, the 
study has inferential properties regarding the presence of efficacy 
equivalent to those of any other difference-showing trial, 
assuming that the active control is not actually harmful. When an 
active-control trial 
is used to show noninferiority/equivalence, there is the special 
consideration of sensitivity-to-drug-effects and assay sensitivity, 
which are considered above in section 1.5. If assay sensitivity is 
established, either historically (by reference to past experience 
with the control drug) or by including a placebo control as well as 
active control, the active-control trial can assess comparative 
efficacy.

 2.4.5 Modifications of Design and Combinations With Other Controls 
That Can Resolve Ethical, Practical, or Inferential Issues

     As discussed earlier (section 2.1.5), active-control studies 
can include a placebo group, multiple-dose groups of the test drug, 
and/or other dose groups of the active control. Comparative dose-
response studies, in which there are several doses of both test and 
active control, are typical in analgesic trials. The doses in 
active-control trials can be fixed or titrated, and both cross-over 
and parallel designs can be used. The assay sensitivity of a 
noninferiority trial can sometimes be supported by a randomized 
placebo-controlled withdrawal phase at the end (see section 
2.1.5.2.4). Active-control superiority studies in selected 
populations (nonresponders to other therapy) can be very useful and 
are generally easy to interpret (see appendix), although the results 
may not be generalizable.

 2.4.6 Advantages of Active-Control Trials

     2.4.6.1  Ethical/practical advantages. The active-control 
design, whether intended to show noninferiority/equivalence or 
superiority, reduces ethical concerns that arise from failure to use 
drugs with documented important health benefits. It also addresses 
patient and physician concerns about failure to use documented 
effective therapy. Recruitment and IRB/IEC approval may be 
facilitated, and it may be possible to study larger samples. There 
may be fewer dropouts due to lack of effectiveness.
     2.4.6.2  Information content. Where superiority to an active 
treatment is shown, active-control studies are readily interpretable 
regarding evidence of efficacy. The larger sample sizes needed are 
sometimes more achievable and acceptable in active-control trials 
and can provide more safety information. Active-control trials also 
can, if properly designed, provide information about relative 
efficacy.

 2.4.7 Disadvantages of Active-Control Trials

     2.4.7.1  Information content. See section 1.5 for discussion of 
the problem of assay sensitivity and the ability of the trial to 
support an efficacy conclusion in noninferiority/equivalence trials. 
Even when assay sensitivity is supported and the study is suitable 
for detecting efficacy, there is no direct assessment of absolute 
effect size and there is greater difficulty in quantitating safety 
outcomes as well.
     2.4.7.2  Large sample size. Generally, in noninferiority 
trials, the margin of difference that needs to be excluded is chosen 
conservatively, first, because the smallest effect of the active 
control expected in trials will ordinarily be used as the estimate 
of its effect and, second, because there will usually be an intent 
to rule out loss of more than some reasonable fraction (see section 
1.5.2) of the control drug effect, leading to a still smaller 
margin. Because of the need for conservative assumptions about 
control drug effect size, sample sizes may be very large. In a 
difference-showing active-control trial, the difference between two 
drugs is always smaller, often much smaller, than the expected 
difference between drug and placebo, again leading to large sample 
sizes.
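     The two sources of conservatism described above can be sketched 
numerically. The Python example below is an illustration only: it 
assumes a continuous endpoint with a known common standard deviation, 
a one-sided test at the 2.5 percent level, 80 percent power, and a 
true difference of zero between test and control. The historical 
effect estimates, the fraction of the effect whose loss is permitted, 
and the function names are all hypothetical.

```python
from math import ceil
from statistics import NormalDist


def noninferiority_margin(conservative_control_effect, allowed_loss_fraction):
    """Margin as a permitted fraction of a conservative control effect."""
    if not 0.0 < allowed_loss_fraction <= 1.0:
        raise ValueError("allowed_loss_fraction must be in (0, 1]")
    return conservative_control_effect * allowed_loss_fraction


def n_per_arm(difference, sigma, one_sided_alpha=0.025, power=0.80):
    """Approximate subjects per arm to detect or exclude `difference`."""
    z = NormalDist()
    z_sum = z.inv_cdf(1.0 - one_sided_alpha) + z.inv_cdf(power)
    return ceil(2.0 * (z_sum * sigma / difference) ** 2)


# Hypothetical inputs: the control typically beats placebo by 6 units,
# but the smallest effect expected in trials is taken to be 4 units.
typical_effect, conservative_effect, sigma = 6.0, 4.0, 8.0

# Rule out loss of more than half of the conservative effect.
margin = noninferiority_margin(conservative_effect, 0.5)

n_noninferiority = n_per_arm(margin, sigma)
n_placebo_trial = n_per_arm(typical_effect, sigma)
print(f"margin = {margin} units")
print(f"noninferiority trial: {n_noninferiority} per arm")
print(f"placebo-controlled trial: {n_placebo_trial} per arm")
```

     Each step (using the smallest expected control effect, then 
preserving a fraction of it) shrinks the margin, and the sample size 
grows with the inverse square of the margin.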

 2.5 External Control (Historical Control)

 2.5.1 Description

     An externally controlled trial is one in which the control 
group consists of patients who are not part of the same randomized 
study as the group receiving the investigational agent, i.e., there 
is no concurrently randomized comparative group. The control group 
is thus not derived from exactly the same population as the treated 
population. Usually, the control group is a well-documented 
population of patients observed at an earlier time (historical 
control), at another institution, or even at the same institution but 
outside the study. An external-control study could be a superiority 
study or an equivalence study. Sometimes certain patients from a 
larger experience are selected as a control group on the basis of 
particular characteristics that make them similar to the treatment 
group; there may even be an attempt to ``match'' particular control 
and treated patients.
     So-called ``baseline-controlled studies'' are a variety of 
externally controlled trials; these are sometimes thought to use 
``the patient as his own control,'' but that is logically incorrect. 
In fact, the comparator group is an estimate of what would have 
happened to the patients in the absence of therapy. Both baseline-
controlled trials and studies that use a more complicated on-off-on 
(cross-over) design, but that do not include a concurrently 
randomized control group, are of this type. As noted, in these 
studies the observed changes from baseline or between study periods 
are always compared, at least implicitly, to some estimate of what 
would have happened without the intervention. Such estimates are 
generally made on the basis of ``general knowledge,'' without 
reference to a specific control population. Although in some cases 
this is plainly reasonable, e.g., when the effect is dramatic, 
occurs rapidly following treatment, and is unlikely to have occurred 
spontaneously (e.g., general anesthesia, cardioversion, measurable 
tumor shrinkage), in most cases it is not so obvious and a specific 
historical experience should be sought. Designers and analysts of 
such trials need to be aware of the risks of this type of control 
and should be prepared to support its use.

 2.5.2 Ability to Minimize Bias

     Inability to control bias is the major and well-recognized 
limitation of externally controlled trials and is sufficient in many 
cases to make the design unsuitable. It is always difficult, in many 
cases impossible, to establish comparability of the treatment and 
control groups and thus to fulfill the major purpose of a control 
group (see section 1.2). The groups can be dissimilar with respect 
to a wide range of factors, other than the study drug, that could 
affect outcome, including demographic characteristics, diagnostic 
criteria, stage or duration of disease, concomitant treatments, and 
observational conditions (such as methods of assessing outcome, 
investigator expectations). Blinding and randomization are not 
available to minimize bias when external controls are used. It is 
well documented that untreated historical-control groups tend to 
have worse outcomes than an apparently similar control group in a 
randomized study, primarily because of selection bias. Control 
groups in a randomized study must meet certain criteria to be 
entered into the study, criteria that are generally more stringent 
and identify a less sick population than is typical of external-
control groups. An external-control group is often identified 
retrospectively, 
leading to potential bias in its selection. A consequence of the 
recognized inability to control bias is that the persuasiveness of 
findings from externally controlled trials depends on obtaining much 
more extreme levels of statistical significance and much larger 
estimated differences between treatments than would be considered 
persuasive in concurrently controlled trials.
     The inability to control bias restricts use of the external-
control design to situations in which the effect of treatment is 
dramatic and the usual course of the disease highly predictable. In 
addition, use of external controls should be limited to cases in 
which the endpoints are objective and the impact of baseline and 
treatment variables on the endpoint is well characterized.
     As noted, the lack of randomization and blinding, and the 
resultant problems with lack of assurance of comparability of test 
group and control group, make the likelihood of substantial bias 
inherent in this design and impossible to quantitate. Nonetheless, 
some approaches to design and conduct of externally controlled 
trials could lead them to be more persuasive and potentially less 
biased. A control group should be chosen for which there is detailed 
information, including, where needed, individual patient data 
regarding demographics, baseline status, concomitant therapy, and 
course on study. The control patients should be as similar as 
possible to the population expected to receive the test drug in the 
study and should have been treated in a similar setting and in a 
similar manner, except with respect to the study therapy. Study 
observations should utilize timing and methodology similar to those 
used in the control patients. To reduce selection bias, selection of 
the control group should be made before performing comparative 
analyses; this may not always be feasible, as outcomes from these 
control groups may have been published. Any matching on selection 
criteria or adjustments made to account for population differences 
should be specified prior to selection of the control and 
performance of the study. Where no obvious single ``optimal'' 
external control exists, it may be advisable to study multiple 
external controls, provided that the analytic plan specifies 
conservatively how each will be utilized in drawing inferences 
(e.g., study group should be substantially superior to the most 
favorable control to conclude efficacy). In some cases, it may be 
useful to have an independent set of reviewers reassess endpoints in 
the control group and in the test group in a blinded manner 
according to common criteria.

 2.5.3 Ethical Issues

     When a drug is intended to treat a serious illness for which 
there is no satisfactory treatment, especially if the new drug is 
seen as promising on the basis of theoretical considerations, animal 
data, or early human experience, there may be understandable 
reluctance to perform a comparative study with a concurrent control 
group of patients who would not receive the new treatment. At the 
same time, it is not responsible or ethical to carry out studies 
that have no realistic chance of credibly showing the efficacy of 
the treatment. It should be appreciated that many promising 
therapies have had less dramatic effects than expected or have shown 
no efficacy at all when tested in controlled trials. Investigators 
may, in these situations, be faced with very difficult judgments. It 
may be tempting in exceptional cases to initiate an externally 
controlled trial, hoping for a convincingly dramatic effect, with a 
prompt switch to randomized trials if this does not materialize.
     Alternatively, and generally preferably, in dealing with 
serious illnesses for which there is no satisfactory treatment, but 
where the course of the disease cannot be reliably predicted, even 
the earliest studies should be randomized. This is usually possible 
when studies are carried out before there is an impression that the 
therapy is effective. Studies can be monitored by independent data 
monitoring committees so that dramatic benefit can be detected 
early. Despite the use of a single-treatment group in an externally 
controlled trial, a placebo-controlled trial is usually a more 
efficient design (needing fewer subjects) in such cases, as the 
estimate of control group outcome generally needs to be made 
conservatively, causing the need for a larger sample size. Great 
caution (e.g., applying a more stringent significance level) is 
called for in externally controlled comparisons because there are 
likely to be both identified and unidentified or 
unmeasurable differences between the treatment and control groups, 
often favoring treatment. The concurrently controlled trial can 
detect extreme effects very rapidly and, in addition, can detect 
modest, but still valuable, effects that would not be credibly 
demonstrated by an externally controlled trial.

 2.5.4 Usefulness of Externally Controlled Trials and Quality/
Validity of Inference in Particular Situations

     An externally controlled trial should generally be considered 
only when prior belief in the superiority of the test therapy to 
all available alternatives is so strong that alternative designs 
appear unacceptable and the disease or condition to be treated has a 
well-documented, highly predictable course. It is often possible, 
even in these cases, to utilize alternative, randomized, 
concurrently controlled designs (see section 2.1.5 and appendix).
     Externally controlled trials are most likely to be persuasive 
when the study endpoint is objective, when the outcome on treatment 
is markedly different from that of the external control and a high 
level of statistical significance for the treatment-control 
comparison is attained, when the covariates influencing outcome of 
the disease are well characterized, and when the control closely 
resembles the study group in all known relevant baseline, treatment 
(other than study drug), and observational variables. Even in such 
cases, however, there are documented examples of erroneous 
conclusions arising from such trials.
     When an external-control trial is considered, appropriate 
attention to design and conduct may help reduce bias (see section 
2.5.2).

 2.5.5 Modifications of Design and Combinations With Other Controls 
That Can Resolve Ethical, Practical, or Inferential Problems

     The external-control design can incorporate elements of 
randomization and blinding through use of a randomized placebo-
controlled withdrawal phase, often with early-escape provisions, as 
described earlier (see section 2.1.5.2.4). The results of the 
initial period of treatment, in which subjects who appear to respond 
are identified and maintained on therapy, are thus ``validated'' by 
a rigorous, largely assumption- and bias-free study.

 2.5.6 Advantages of Externally Controlled Trials

     The main advantage of an externally controlled trial is that 
all patients can receive a promising drug, making the study more 
attractive to patients and physicians.
     The design has some potential efficiencies (smaller sample 
size) because all patients are exposed to test drug, of particular 
importance in rare diseases.

 2.5.7 Disadvantages of Externally Controlled Trials

     The externally controlled study cannot be blinded and is 
subject to patient, observer, and analyst bias, major disadvantages. 
It is possible to mitigate these problems to a degree, but even the 
steps suggested in section 2.5.2 cannot resolve such problems fully, 
as treatment assignment is not randomized and comparability of 
control and treatment groups at the start of treatment, and 
comparability of treatment of patients during the trial, cannot be 
ensured or well assessed. It is well documented that externally 
controlled trials tend to overestimate efficacy of test therapies.

 3.0 Choosing the Control Group

     Figure 1 and Table 1 provide a decision tree for choosing among 
different types of control groups. Although the table and figure 
focus on the choice of control to demonstrate efficacy, some designs 
also allow comparisons of test and control agents. The choice of 
control can be affected by the availability of therapies and by 
medical practices in specific regions. The potential usefulness of 
the principal types of control (placebo, active, and dose-response) 
in specific situations and for specific purposes is shown in Table 
1. The table should be used with the text describing the details of 
specific circumstances in which potential usefulness can be 
realized. In all cases, it is presumed that studies are 
appropriately designed. External controls are so distinct a case 
that they are not included in the table. In the table, a P notation 
refers to the need to make a convincing case that the study has 
assay sensitivity.
     In general, evidence of efficacy is most convincingly 
demonstrated by showing superiority to a concurrent control 
treatment. If a superiority trial is not feasible or is 
inappropriate for ethical or practical reasons, and if a defined 
treatment effect of the active control is regularly seen (e.g., as 
it is for antibiotics in most situations), a noninferiority/
equivalence study can be utilized and can be persuasive. Use of this 
design calls for close attention to the issue of sensitivity to drug 
effects in active-control noninferiority trials of the condition 
being studied and to the assay sensitivity of the particular study 
carried out (see section 1.5).
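     The arithmetic of a noninferiority comparison can be sketched as 
follows. This is only an illustration under stated assumptions: the 
response counts and the margin are hypothetical, the function name is 
invented for this example, and a simple normal-approximation 
confidence interval is assumed. The calculation says nothing about 
assay sensitivity, which still has to be argued separately.

```python
import math
from statistics import NormalDist


def noninferiority_check(x_test, n_test, x_ctrl, n_ctrl, margin, alpha=0.05):
    """Return (difference, lower CI bound, verdict) for test minus control.

    margin is a hypothetical, prespecified noninferiority margin on the
    response-rate scale; it is an assumption of this sketch.
    """
    p_test = x_test / n_test
    p_ctrl = x_ctrl / n_ctrl
    diff = p_test - p_ctrl
    # Unpooled (Wald) standard error of the difference in proportions.
    se = math.sqrt(p_test * (1 - p_test) / n_test
                   + p_ctrl * (1 - p_ctrl) / n_ctrl)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lower = diff - z * se
    # Noninferiority is supported only if the whole two-sided interval
    # lies above -margin.
    return diff, lower, lower > -margin


# Hypothetical data: 170/200 responders on test drug, 172/200 on control.
diff, lower, ok = noninferiority_check(170, 200, 172, 200, margin=0.10)
print(f"difference={diff:.3f}, lower bound={lower:.3f}, noninferior={ok}")
```

     With the same hypothetical data, a narrower margin (for example 
0.05) is not met, which shows how strongly the conclusion depends on 
the margin chosen before the trial.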

BILLING CODE 4160-01-F

[[Page 51778]]

[GRAPHIC] [TIFF OMITTED] TN24SE99.000



[[Page 51779]]

[GRAPHIC] [TIFF OMITTED] TN24SE99.001



BILLING CODE 4160-01-C

[[Page 51780]]

 APPENDIX

 Studies of Efficacy in Subsets of the Whole Population; Enrichment

 1.0 Introduction

     Ideally, the effect of a drug should be known in general and in 
relevant demographic and other subsets of the population, such as 
those defined by disease severity or other disease characteristics. 
To the extent study patients are not a random sample of the patients 
who will be treated with the drug once it is marketed, the 
generalizability of the results can be questioned. Even if the 
overall result is obtained in a representative sample, however, that 
does not suggest the result is the same in all people. If subject 
selection criteria can identify people more likely to respond to 
therapy (e.g., high renin hypertensives to beta blockers), we 
consider therapy more rational and the drug more useful.
     Subjects entering clinical studies are in fact almost never a 
random sample of the potential treatment population, and they are 
not treated exactly as a nonstudy patient would be treated. They 
must give informed consent, be able to follow instructions, and be 
able to get to the clinic. They are sometimes assessed for 
likelihood of complying with treatment. They are usually not very 
debilitated and generally are without complicated or life-
threatening illness, unless those conditions are being studied. They 
are usually selected using particularly stringent diagnostic 
criteria that make it very certain they actually have the disease to 
be treated (more likely than in clinical practice). Lead-in periods 
are often used to exclude subjects who improve spontaneously or 
whose relevant functional measures (blood pressure, exercise 
tolerance) are too variable. Of course, the entire setting of trials 
is artificial in varying degrees, generally directed toward reducing 
unwanted variability and increasing study efficiency.
     All of these departures from a truly unselected population of 
people likely to receive the drug are directed at identifying and 
including subjects likely to make a ``good assay population.'' They 
can be considered methods of ``enrichment'' of the population, 
modifications of a truly random sample of potential users to produce 
a population of subjects more likely to discriminate between an 
active and an inactive therapy. The kinds of enrichment described 
above are widely accepted and ``benign,'' i.e., it seems likely that 
results in such a population will be of general applicability, at 
least to patients with good compliance. There is a view, however, 
that in-use ``effectiveness'' may often be different from the 
artificial ``efficacy'' established in these enriched ``efficacy'' 
trials.
     There are other kinds of enrichment that could also be useful 
but that would more clearly alter the inference that could be drawn 
from the results. This should not discourage their use but should 
encourage attention to what such studies do, and do not, show. Some 
enrichments of potential value include:

 1.1 Studies of Patients Nonresponsive to, or Intolerant of, Other 
Therapy

     In this kind of study, patients failing therapy on a drug, or 
failing to tolerate it acceptably, are randomized to the failed or 
poorly tolerated therapy or to the investigational treatment. 
Greater efficacy (or better tolerance) of the new therapy shows that 
the drug is useful in failures on the other therapy. This is a 
valuable showing if, e.g., the drug is relatively toxic and intended 
for a ``second-line'' use, but it does not show that the new therapy 
is superior in general, and such studies need to be carefully 
interpreted. By selecting study patients who will only infrequently 
respond to the control agent or who are very likely to have a 
particular adverse effect of the control drug, the design 
facilitates showing the second drug's advantage in that 
circumstance. A direct comparison of the two drugs in an unselected 
population that could contain responders to both drugs would need to 
be much larger to show a difference between the treatments, even if 
there was an overall advantage of the new drug. Moreover, it could 
be that each drug has a similar rate of nonresponders (but the other 
drug works in some of these), so that no difference could be seen in 
a direct comparison in unselected subjects.
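     The sample size point above can be illustrated with a rough 
calculation. The sketch below uses the usual normal-approximation 
formula for comparing two response rates; the response rates, the 
significance level, the power, and the function name are all 
hypothetical choices made for this example, not values from any 
trial.

```python
import math
from statistics import NormalDist


def per_arm_sample_size(p_control, p_test, alpha=0.05, power=0.80):
    """Approximate subjects per arm needed to detect p_test vs p_control."""
    if p_control == p_test:
        # Equal response rates: no sample size can show a difference.
        raise ValueError("response rates are equal; no difference to detect")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_test * (1 - p_test)
    delta = p_test - p_control
    return math.ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)


# Hypothetical enriched population (prior failures on the control drug):
# few respond to the control on re-treatment, many respond to the new drug.
enriched = per_arm_sample_size(p_control=0.10, p_test=0.40)

# Hypothetical unselected population: both drugs work in many patients,
# so the difference in response rates is small.
unselected = per_arm_sample_size(p_control=0.50, p_test=0.60)

print(f"enriched: {enriched} per arm, unselected: {unselected} per arm")
```

     Under these assumed rates the unselected comparison needs more 
than ten times as many subjects per arm as the enriched one, and the 
equal-rates case cannot show a difference at any size.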
     In this design, it is usually critical to randomize the 
nonresponders or intolerants to both the new agent and the failed 
agent, rather than simply place the failures on the new drug. 
Patients who failed previously may ``respond'' to the failed drug 
when it is readministered in a clinical trial, or may tolerate the 
previously poorly tolerated drug in the new circumstance. 
Randomization to the failed drug can present a problem, however. In 
the ``intolerance'' case, although subjects 
can be randomized to a drug that has caused certain kinds of 
intolerance, they cannot be randomized to a drug that would endanger 
them if administered (e.g., if the intolerance was anaphylaxis or 
liver necrosis). Similarly, in the nonresponder case, patients 
cannot be restudied on the failed drug if failure would lead to 
harm. In some cases, the prior experience may be an adequate control 
(e.g., failure of a tumor to respond), which makes the study in effect 
a baseline-controlled design.

 1.2 Studies in Likely or Known Responders

     If patients cannot respond to the main pharmacologic effect of 
the drug, they cannot be expected to show a clinical response. Thus, 
subjects with no blood pressure response to sublingual nitroglycerin 
have been excluded from trials of organic nitrates, as they show no 
ability to respond to the mechanism of action of these drugs and 
including them would only dilute the drug effect. A similar approach 
was used in the Cardiac Arrhythmia Suppression Trial (CAST). Only 
subjects responding to encainide or flecainide with a 70 percent 
reduction in ventricular premature beats (VPB's) were randomized to 
the mortality phase of the study because there was no reason to 
include people who could not possibly benefit (i.e., people with no 
VPB reduction). It is important in such cases to record the number 
of subjects screened in order to construct the study population so 
that users of the drug will have a reasonable expectation of what 
they will encounter. It will often be appropriate to incorporate 
similar selection criteria in labeling the drug for use.
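     The dilution described above, and the value of recording 
screening numbers, can be put in simple arithmetic. The sketch below 
assumes that nonresponders show no effect at all and that outcome 
variability is the same with or without them; the function names and 
all numbers are hypothetical.

```python
def dilution_summary(effect_in_responders, responder_fraction):
    """Return (average effect in the mixed population, sample size inflation).

    Assumes nonresponders contribute zero effect and that the outcome
    variance is unchanged, so required sample size scales with
    1 / effect**2.
    """
    diluted_effect = effect_in_responders * responder_fraction
    inflation = 1 / responder_fraction ** 2
    return diluted_effect, inflation


def screening_yield(n_screened, n_randomized):
    """Fraction of screened subjects who met the response criterion."""
    return n_randomized / n_screened


# Hypothetical: a 10-unit effect in responders, half the population responds.
effect, inflation = dilution_summary(effect_in_responders=10.0,
                                     responder_fraction=0.5)
print(f"average effect={effect}, sample size inflation={inflation}x")

# Hypothetical screening log: 1000 subjects screened, 600 randomized.
print(f"screening yield={screening_yield(1000, 600):.0%}")
```

     The screening yield is the figure a prescriber would need in 
order to judge how many treated patients are likely to resemble the 
study population.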
     The nitroglycerin and CAST enrichment approaches were generally 
accepted. A potentially more controversial enrichment procedure 
would be to identify responders in an initial open phase, withdraw 
treatment, then carry out a randomized study in the responders. This 
could be a useful approach when efficacy has proved difficult to 
demonstrate. For example, it has been difficult to obtain evidence 
that gut motility-modifying agents are effective in gastroesophageal 
reflux disease, perhaps because there are unrecognized 
pathophysiologic subsets of patients, some of which can respond and 
some of which cannot. It seems possible that identifying apparent 
responders clinically, then randomizing the apparent responders to 
drug and placebo treatments, would best utilize both clinical 
observation and rigorous design.
     In seeking dose-response information, little is to be learned 
from studying the drug in a population of nonresponders (although 
one would want to know the proportion of the population that is 
nonresponsive). Such studies might better be carried out in known 
responders to the drug. Similarly, in evaluating a drug of a 
particular class, studies including only known responders to the 
class might be more likely to detect an effect of the drug or to 
show differences between members of the class.
     Finally, it should be appreciated that randomized withdrawal 
studies (see section 2.1.5.2.4), and studies of maintenance 
treatment in general, are often studies in known responders and can 
therefore be expected to show greater effect than studies in an 
unselected population.

    Dated: September 16, 1999.
Margaret M. Dotzel,
Acting Associate Commissioner for Policy
[FR Doc. 99-24855 Filed 9-23-99; 8:45 am]
BILLING CODE 4160-01-F