CHAPTER 16

Multi-Armed Trials

16.1 Introduction

It is becoming increasingly common to conduct trials with three or more treatment arms. The substantial overhead expense in setting up a large clinical trial means that there are economies of scale and time if several competing treatments can be compared in the same trial. Follmann, Proschan & Geller (1994, p. 331) describe an example of a hypertension prevention trial in which subjects were randomized either to a control group or to one of three interventions, weight loss, sodium restriction, or stress management, making a total of four treatment arms. The primary endpoint was change in diastolic blood pressure from a baseline measurement taken on entry to the study.

We illustrate the methods for multi-armed trials in the simplest case of a balanced one-way layout comparing the means of $J$ univariate normal distributions, where $J \ge 3$. Extensions to other situations will be referred to later. Independent observations are available from each arm $j = 1, \ldots, J$, and these are assumed to be normally distributed with unknown mean $\mu_j$ and common variance $\sigma^2$. Observations are taken in groups of $g$ per treatment arm. For $j = 1, \ldots, J$ and $k = 1, \ldots, K$, let $\bar{X}_{jk}$ denote the sample mean of the $n_k = g\,k$ responses from treatment arm $j$ available at stage $k$. Also, let $s_k^2$ denote the pooled within-arms estimate of $\sigma^2$ available at stage $k$.

16.2 Global Tests

16.2.1 Group Sequential Chi-Squared Tests

In some circumstances it may be appropriate to monitor a single global test statistic. For testing the hypothesis of homogeneity of $J$ normal means, we can consider group sequential tests based on monitoring successive chi-squared and F-statistics. For known $\sigma^2$, the test of $H_0\colon \mu_1 = \ldots = \mu_J$ is based on values of the standardized between-arms sum of squares statistic

$$S_k = \frac{n_k}{\sigma^2} \sum_{j=1}^{J} (\bar{X}_{jk} - \bar{X}_{\cdot k})^2, \quad k = 1, \ldots, K. \tag{16.1}$$

Here $n_k = g\,k$, the cumulative sample size on each arm, and

$$\bar{X}_{\cdot k} = \frac{1}{J} \sum_{j=1}^{J} \bar{X}_{jk}$$

is the overall mean at analysis $k$.

© 2000 by Chapman & Hall/CRC

We consider a group sequential chi-squared test which stops to reject $H_0$ the first time $S_k \ge c_k$ and accepts $H_0$ at the
final stage, $K$, if $S_K < c_K$. Critical values $c_1, \ldots, c_K$ are to be constructed so that the test has a specified Type I error probability $\alpha$. The marginal distribution of $S_k$ is chi-squared with $p = J - 1$ degrees of freedom and noncentrality parameter

$$n_k \sum_{j=1}^{J} (\mu_j - \bar{\mu})^2 / \sigma^2,$$

where $\bar{\mu} = (\mu_1 + \ldots + \mu_J)/J$. The joint distribution of $S_1, \ldots, S_K$ does not have the canonical distribution (3.1). However, by using the Helmert transformation (e.g., Stuart & Ord, 1987, p. 350), $S_k$ can be written as a quadratic form in $J - 1$ independent normal variates. This allows us to apply the results of Jennison & Turnbull (1991b) who show that the sequence $S_1, S_2, \ldots$ is Markov and derive its joint distribution, in which $S_k$ is a multiple of a non-central chi-squared variate conditional on $S_{k-1}$. This joint distribution is quite tractable and can be used to calculate critical values giving tests with specified Type I error probabilities.

For boundaries analogous to the Pocock test of Section 2.4 with constant nominal significance levels, we set $c_k = C_P(p, K, \alpha)$, say, for $k = 1, \ldots, K$, where $p = J - 1$. For boundaries analogous to those of the O'Brien and Fleming test of Section 2.5, we set $c_k = (K/k)\, C_B(p, K, \alpha)$, $k = 1, \ldots, K$. Values of the constants $C_P(p, K, \alpha)$ and $C_B(p, K, \alpha)$ are provided in Table 2 of Jennison & Turnbull (1991b) for $p = 1$ to 5, $K = 1$ to 10, 20 and 50, and $\alpha = 0.01$, 0.05 and 0.10. An abbreviated version of this table with $\alpha = 0.05$ only is shown in Table 16.1.

If the group sizes or the numbers of subjects allocated to each arm within groups are unequal, tests can be based on the statistics

$$S_k = \frac{1}{\sigma^2} \sum_{j=1}^{J} n_{jk} (\bar{X}_{jk} - \bar{X}_{\cdot k})^2, \quad k = 1, \ldots, K, \tag{16.2}$$

where $n_{jk}$ is the cumulative sample size on arm $j$ at stage $k$ and

$$\bar{X}_{\cdot k} = \sum_{j=1}^{J} n_{jk} \bar{X}_{jk} \Big/ \sum_{j=1}^{J} n_{jk}$$

is the overall mean. The marginal distribution of $S_k$ is still $\chi^2_{J-1}$ and, in keeping with the significance level approach of Section 3.3, critical values $c_1, \ldots, c_K$ obtained assuming equally sized groups can be applied to the data actually observed to give an approximate test. As long as the imbalances are not too severe, the Type I error probability should remain close to its nominal level.

Proschan, Follmann & Geller (1994) have shown how (16.1) can be generalized to
accommodate unequal variances. Again, the exact tables are no longer applicable and the same significance level approach could be used to obtain an approximate test. Alternatively, Proschan et al. (1994) suggest using simulation to obtain critical values.

Table 16.1 Constants $C_P(p, K, \alpha)$ and $C_B(p, K, \alpha)$ for, respectively, Pocock-type and O'Brien & Fleming-type repeated $\chi^2$-tests of homogeneity of $J$ normal means. Tests have $K$ analyses, the $\chi^2$ statistic at each analysis has $p = J - 1$ degrees of freedom and the Type I error probability is $\alpha = 0.05$.

Number of    Degrees of    Number of
arms, J      freedom, p    analyses, K    C_P(p,K,α)    C_B(p,K,α)

   2             1              1            3.84          3.84
                                2            4.74          3.91
                                3            5.24          4.02
                                4            5.58          4.10
                                5            5.82          4.16
                                6            6.02          4.21
                               10            6.53          4.35

   3             2              1            5.99          5.99
                                2            7.08          6.02
                                3            7.67          6.12
                                4            8.06          6.20
                                5            8.35          6.27
                                6            8.58          6.33
                               10            9.17          6.48

   4             3              1            7.81          7.81
                                2            9.04          7.83
                                3            9.69          7.92
                                4           10.13          7.99
                                5           10.44          8.06
                                6           10.69          8.11
                               10           11.34          8.26

   5             4              1            9.49          9.49
                                2           10.82          9.50
                                3           11.53          9.57
                                4           12.00          9.64
                                5           12.35          9.71
                                6           12.62          9.77
                               10           13.32          9.93

   6             5              1           11.07         11.07
                                2           12.49         11.08
                                3           13.25         11.14
                                4           13.75         11.21
                                5           14.12         11.27
                                6           14.41         11.33
                               10           15.16         11.50

Pocock-type: critical value $c_k$ for the $\chi^2$-statistic at analysis $k$ is $C_P(p, K, \alpha)$, $k = 1, \ldots, K$.
O'Brien & Fleming-type: critical value $c_k$ for the $\chi^2$-statistic at analysis $k$ is $(K/k)\, C_B(p, K, \alpha)$, $k = 1, \ldots, K$.

16.2.2 Group Sequential F-Tests

In practice, the variance $\sigma^2$ will usually be unknown. In that case we replace $\sigma^2$ by its current estimate $s_k^2$ in (16.1) and monitor the F-statistics

$$F_k = \frac{n_k}{(J - 1)\, s_k^2} \sum_{j=1}^{J} (\bar{X}_{jk} - \bar{X}_{\cdot k})^2, \quad k = 1, \ldots, K. \tag{16.3}$$

A group sequential F-test would then stop to reject $H_0$ the first time $F_k \ge c_k$, or accept $H_0$ on reaching the final stage with $F_K$ [...] $k_1$, the two remaining comparisons would use critical values $c_k = 2.303\sqrt{5/k}$ instead of $c_k = 2.448\sqrt{5/k}$. If another treatment arm is dropped at the next analysis, based on this new critical value, the one remaining comparison would subsequently use $c_k = 2.040\sqrt{5/k}$. Follmann et al. (1994) show the same step-down procedure preserves the experiment-wise error of $\alpha$ in case (i), all pairwise comparisons, but only when there are $J = 3$ arms. Their argument breaks down for $J \ge 4$ and the validity of the result is an open question.

Two well known multiple comparison procedures for
all pairwise differences in the fixed sample case are Fisher's Least Significant Difference (LSD) method and the Newman-Keuls procedure; see Hochberg & Tamhane (1987). These methods have been generalized to the group sequential setting by Proschan et al. (1994).

In the analogue of the LSD procedure, initially the global statistic (16.1) or (16.3) is used to monitor the study at the specified error rate $\alpha$. When, and only when, this statistic exceeds its boundary, the pairwise differences are examined using boundary values for unadjusted group sequential tests with two-sided Type I error probability $\alpha$, as described in Chapters 2 and 3. Any treatment arm found inferior in a pairwise comparison at this stage is dropped from the study. At subsequent stages, only pairwise differences among the remaining treatment arms continue to be monitored using unadjusted two-sample group sequential tests, and any treatment found to be inferior is dropped. As an example, with $J = 4$ treatment arms, $K = 5$ analyses, $\alpha = 0.05$ and O'Brien & Fleming-type boundaries, we would start by using the global statistic (16.1) with boundary values $8.06\,(5/k)$ for $k = 1, \ldots, 5$. (The constant $C_B(3, 5, 0.05) = 8.06$ is obtained from Table 16.1.) If the boundary is exceeded, then at that and any subsequent stage, all pairwise differences are examined using a two-sample statistic such as (2.4) and critical values $2.04\sqrt{5/k}$ for $k = 1, \ldots, 5$. Here the constant $C_B(5, 0.05) = 2.04$ can be read from Table 2.3, or from the column in Table 15.2 for $p = 1$.

The generalization of the Newman-Keuls procedure is similar and is based on the range statistic instead of the chi-squared statistic. All pairwise absolute differences in means fall below a threshold $c$ if and only if the range of the means falls below this threshold, so a range test can also be defined in terms of the difference between the largest and smallest sample means. At any stage, suppose there are $J'$ arms that have not been dropped from the study. Then, the standardized two-sample statistic for comparing the largest and smallest sample means is compared
with the level $\alpha$ critical value for $p = J'(J' - 1)/2$ comparisons. Exact values of this critical value can be obtained from Tables 1 and 2 of Follmann et al. (1994) or Table I of Proschan et al. (1994). Alternatively, one could use the slightly conservative Bonferroni-based values from Tables 15.1 and 15.2. If the difference between largest and smallest sample means is found to be significant, the exercise is repeated comparing the largest mean to the second smallest and the second largest to the smallest using critical values for the case of $J' - 1$ arms. This process is repeated until no further ranges are significant; for details, see Hochberg & Tamhane (1987, p. 66), who treat the fixed sample case. All arms found inferior by this process are dropped from the study and the procedure continues to the next stage, with only the ranges of contending arms continuing to be monitored.

As in the fixed sample case, the group sequential versions of the LSD and Newman-Keuls procedures only weakly protect the experiment-wise Type I error rate. That is, the probability of any true hypothesis being rejected is no more than the specified $\alpha$ only under the global null hypothesis that all treatment means are equal. For other configurations of means $\mu_j$, $j = 1, \ldots, J$, this probability may exceed $\alpha$. For example, this can happen under either procedure if there are four treatment arms with means $\mu_1 = \mu_2 \ll \mu_3 = \mu_4$; then for either procedure, the global test will reject early and there is a probability of approximately $2\alpha$ that at least one of the two true hypotheses $\mu_1 = \mu_2$ or $\mu_3 = \mu_4$ will be rejected. In the fixed sample case, Hochberg & Tamhane (1987, p. 69) show how the experiment-wise error can be strongly controlled at level $\alpha$ by reducing the nominal levels at which each component comparison is made; in principle, the same modification could be made to the group sequential versions.

16.4 Bibliography and Notes

Proschan et al. (1994) adapt the ideas presented here in the context of monitoring mean responses for multi-armed trials to survival time responses. The
more technical details are given in Follmann et al. (1994, Appendix A.2).

Siegmund (1993) and Betensky (1996) propose a fully sequential design for comparing three treatments using a hybrid method, similar in spirit to the LSD procedure described above. Their procedure consists of two phases. In the first, a global statistic is monitored sequentially to test for homogeneity of the three means. If this hypothesis is accepted, the experiment terminates; otherwise, the least promising treatment arm is dropped and, in the second phase, a sequential test is performed with the remaining two treatments. Betensky (1997c) adapts the same procedure to accommodate survival time endpoints.

Hughes (1993) proposes a group sequential procedure based on two-sample statistics for pairwise comparisons with the aim of dividing the treatment arms into two subsets, one labeled inferior, the other superior. At interim analyses, treatments determined to be in the inferior group are dropped. If, at termination, only one arm is contained in the superior set, it is identified as the best one; otherwise all treatments in that set are declared superior. The procedure is in the same spirit as subset selection procedures, often used as screening methods to pick the most promising treatments for future study from a larger group of candidate treatments; see, for example, Bechhofer, Santner & Goldsman (1995, Chapter 3). In fact, in the ranking and selection literature, there is a considerable body of work on sequential and multi-stage procedures for identifying the best of several competing treatments. An indifference zone formulation is often adopted whereby, typically, designs must satisfy the requirement that the best treatment is selected with a pre-specified probability $P$ whenever its mean exceeds that of the next best by a specified amount. Bechhofer et al. (1995, Chapter 2) survey the various procedures that have been proposed.

When there are multiple treatment arms, it becomes interesting to ask if gains can be made by unequal allocation of subjects. Of
course, group sequential procedures in which arms can be dropped at interim analyses are, essentially, attempting to reduce sample sizes on the inferior arms. This is often highly desirable from an ethical viewpoint, and possibly also from an economic one. The more general class of designs in which the fractions of subjects randomly allocated to each treatment are allowed to depend on the accumulating results are called "adaptive allocation" designs or in some cases "multi-armed bandit" procedures. We discuss such designs in the next chapter.
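
As a rough illustration of the global chi-squared test of Section 16.2.1, the following Python sketch evaluates the statistic (16.1) from the arm means and compares a sequence of statistics with Pocock-type and O'Brien & Fleming-type boundaries. It assumes a known variance and equal group sizes. The two constants are copied from the J = 4, K = 5 rows of Table 16.1; the function names and the numerical inputs in the usage note are hypothetical and serve only to show the mechanics.

```python
from typing import List, Optional, Sequence

# Constants for p = J - 1 = 3, K = 5, alpha = 0.05, copied from Table 16.1.
C_P = 10.44  # Pocock-type constant C_P(3, 5, 0.05)
C_B = 8.06   # O'Brien & Fleming-type constant C_B(3, 5, 0.05)


def chi_squared_statistic(arm_means: Sequence[float], n_k: int, sigma2: float) -> float:
    """Statistic (16.1): (n_k / sigma^2) * sum over arms of (arm mean - overall mean)^2.

    Assumes every arm has the same cumulative sample size n_k and that
    the common variance sigma2 is known.
    """
    overall = sum(arm_means) / len(arm_means)
    return n_k / sigma2 * sum((m - overall) ** 2 for m in arm_means)


def pocock_boundary(K: int, c_p: float) -> List[float]:
    """Pocock-type critical values: c_k = C_P for every analysis k."""
    return [c_p] * K


def obf_boundary(K: int, c_b: float) -> List[float]:
    """O'Brien & Fleming-type critical values on the chi-squared scale: c_k = (K / k) * C_B."""
    return [c_b * K / k for k in range(1, K + 1)]


def first_rejection(stats: Sequence[float], boundary: Sequence[float]) -> Optional[int]:
    """Return the first analysis k (1-based) at which S_k >= c_k, or None if H0 is accepted."""
    for k, (s, c) in enumerate(zip(stats, boundary), start=1):
        if s >= c:
            return k
    return None
```

For instance, with made-up arm means `[1.0, 1.0, -1.0, -1.0]`, `n_k = 10` and `sigma2 = 4.0`, `chi_squared_statistic` returns 10.0. With K = 5, `obf_boundary(5, C_B)` starts at about 40.3 and falls to 8.06 at the final analysis, which matches the boundary values 8.06 (5/k) quoted in the LSD example above.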
