Encyclopaedic Companion to Medical Statistics

Encyclopaedic Companion to Medical Statistics Second Edition Edited by Brian S. Everitt Professor Emeritus, King's College,...

Author: Brian S. Everitt | Christopher Palmer (editors)


Encyclopaedic Companion to Medical Statistics Second Edition

Edited by Brian S. Everitt Professor Emeritus, King's College, London, UK

and Christopher R. Palmer

Director of the Centre for Applied Medical Statistics, University of Cambridge, UK

With a Foreword by Richard Horton

WILEY A John Wiley and Sons, Ltd, Publication

This edition first published 2011 © 2011 John Wiley & Sons, Ltd

John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom

For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.

The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books. Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

Library of Congress Cataloging-in-Publication Data

The encyclopaedic companion to medical statistics / edited by Brian S. Everitt and Christopher R. Palmer; with a foreword by Richard Horton. - 2nd ed.

p.; cm.

Includes bibliographical references. Summary: "Encyclopaedic Companion to Medical Statistics, contains readable accounts of almost 400 statistical topics central to current medical research. Each entry has been written by an individual chosen for both their expertise in the field and their ability to communicate statistical concepts successfully to medical researchers. Real examples from the biomedical literature and relevant illustrations feature in many entries, and extensive cross referencing signposts the reader to related entries" - Provided by publisher. ISBN 978-0-470-68419-1 (hb)

1. Medical statistics - Encyclopedias. I. Everitt, Brian. II. Palmer, Christopher Ralph. [DNLM: 1. Statistics as Topic - Encyclopedias - English. 2. ~ 'J'bcaIaic~~ WA 13 E562 2010] RA409.E527 2010 6Io.120~·

2010018141

A catalogue record for this book is available from the British Library.

Print ISBN: 978-0-470-68419-1 ePDF ISBN: 978-0-470-66974-7

To Mary-Elizabeth
Brian S. Everitt

To Cathy-Joan, Laura, Carolyn and David
Christopher R. Palmer

Contents

Foreword
Preface to the Second Edition
Preface
Biographical Information on the Editors
List of Contributors
Abbreviations and Acronyms
Encyclopaedic Companion to Medical Statistics A-Z

Foreword

This encyclopaedia contains no entry for 'Peer review'. In my small corner of the medical statistical universe, this seems like a gross sin of omission. Instead, the process of evaluating research papers is discussed under 'Critical appraisal'. Are these two procedures synonymous? And, irrespective of whether they are or are not, should anybody care? I believe that peer review and critical appraisal do differ, that these differences matter a great deal when considering the ways in which readers should interpret the medical literature, and that an understanding of these differences helps to place medical statistics in its proper context when surveying the wide horizon of clinical and public health research. The editors of this quite wonderfully rewarding treatise on statistical terms have defined critical appraisal as 'the process of evaluating research reports and assessing their contribution to scientific knowledge'. This statement follows naturally from the meaning of the words 'criticism' (the art of judging) and 'appraisal' (the estimation of quality). That is to say, critical appraisal is an estimation of worth followed by some kind of judgment - a judgment that leans more towards an art than a science. As a non-statistician, I rather warm to the precise imprecision of this definition. Now consider the more commonly embedded term 'peer review' and look how inferior it is! Who is this anonymous idealised peer? Generally, one would consider a peer to be an equal, somebody who comes from a group comparable to that from which the person under scrutiny has emerged. This intellectual egalitarian is subsequently set the task of viewing again (to take 're-view' at its most literal meaning) the work under consideration. But to view with what purpose? None is specified. Despite these practical shortcomings, editors of biomedical journals remain wedded to 'peer review'. We feel uncomfortable with the notion of critical appraisal.

The embodiment of peer review as a distinct scientific discipline is the series of international congresses devoted to peer review in biomedical publication, organised jointly by JAMA and the BMJ. These congresses have spawned hundreds of abstracts, dozens of research papers, and four theme issues of JAMA. They are entirely commendable in every way. For the editors of JAMA and the BMJ, peer review encompasses a broad range of activities: mechanisms of editorial decision making, together with their quality, validity, and practicality, online peer review and publication, pre-publication posting of information, quality assurance of reviewers and editors, authorship and contributorship, conflicts of interest, scientific misconduct, peer review of grant proposals, economic aspects of peer review, and the future of scientific publication. In other words, peer review is a tremendously elastic concept, allowing editors to stretch it to mean whatever interests them at a given (whimsical) moment in time and place. Indeed, its elasticity is seen by many of us as its great strength. The concept grows in richness and understanding as our own appreciation of its complexity and nuance soars. The impenetrable nature of peer review, and the obscure and hard-to-learn expertise it demands, feeds our brittle egos. The notion of critical appraisal, by contrast, is far thinner in meaning, with much less room for editorial manipulation and aggrandisement.

Even if peer review and critical appraisal do differ, should anyone actually care? Yes, they should, and for a very simple reason: the idea of peer review is now bankrupt. Its retention as an operation within the biomedical sciences reflects the interests of those who wish to preserve their own power and position. Peer review is fundamentally anti-democratic. It elevates the mediocre. It asphyxiates originality and it kills careers. How so? Peer review is not about intelligent engagement with a piece of research. It is about defining the margins of what is acceptable and unacceptable to the reviewer. The mythical 'peer' is being asked to view again, after the editor, the work in question and to offer a comment about the geographical location of that work on the map of existing knowledge. If there is space on this map, and provided the work does not disrupt (too much) the terrain established by others, its location can be secured and marked by sanctioning publication. If the disruption is too great, the work's wish to seek a place of rest must be vetoed. Peer review is about the agency of power to preserve established orthodoxy.
It has nothing to do with science. It has everything to do with ideology - and the maintenance of a quiet life of privilege and mystique. Instead, critical appraisal is about incrementally working one's way towards truth.¹ It can never be about truth itself. The essence of biomedical research is estimation. Our world resists certainty. Critical appraisal is about transparent, measurable analysis that cuts a path towards greater precision.

1. Horton, R. 2002: Postpublication criticism and the shaping of clinical knowledge. JAMA 287, 2843-7.

Critical appraisal refuses to veil itself in the gaudy adornments that editors pin to peer review in order to embellish their own importance in the cartography of scientific inquiry. A far more robust instrument critical appraisal is for that refusal.

What do these differences tell us about the proper place of medical statistics in biomedicine today? In my view, as a lapsed doctor and a now wrinkled editor, medical statistics is the most important aspect of our critical appraisal of any piece of new research. The evaluations by so-called peers in the clinical specialties that concern a particular research paper provide valuable insight into how that work will be received by a community of practitioners or scholars. However, as an editor I am less interested in reception than I am in meaning.² I want a tough interrogation of new work before its publication, according to commonly agreed standards of questioning - standards that I can see and evaluate for myself. To return to my personal definition of critical appraisal, I want an estimation of quality combined with a judgment. I do not want a view from the club culture of one particular academic discipline. The rejection of peer review by the editors of this encyclopaedia is therefore a triumph of liberty against the forces of conformity.

Yet still today, too much of medicine takes medical statistics for granted. Time and again, we see research that has clearly not been within a hundred miles of a statistical brain. Physicians usually make poor scientists, and physicians and scientists together too often play the part of amateur statistician - with appalling consequences. The future of a successful biomedical research enterprise depends on the flourishing of the discipline we call medical statistics. It is not at all clear to me that those who so depend on medical statistics appreciate either that dependence or the fragility of its foundation. If this magnificent encyclopaedia can be deployed in the ongoing argument about the future of twenty-first century academic medicine, then not only the research enterprise but also the public's health and well-being will be far stronger tomorrow than it is today.

Richard Horton
Editor, Lancet

2. Horton, R. 2000: Common sense and figures: the rhetoric of validity in medicine. Statist Med 19, 3149-64.


Preface

Statistical science plays an important role in medical research. Indeed a major part of the key to the progress in medicine from the 17th century to the present day has been the collection and valid interpretation of evidence, particularly quantitative evidence, provided by the application of statistical methods to medical investigations. Current medical journals are full of statistical material, both relatively simple (for example, t-tests, p-values, linear regression) and, increasingly, more complex (for example, generalised estimating equations, cluster analysis, Bayesian methods). The latter material reflects the vibrant state of statistical research with many new methods having practical implications for medicine being developed in the last two decades or so. But why is statistics important in medicine? Some possible answers are:

(1) Medical practice and medical research generate large amounts of data. Such data are generally full of uncertainty and variation, and extracting the 'signal' from the 'noise' is usually not trivial.

(2) Medicine involves asking questions that have strong statistical overtones. How common is the disease? Who is especially likely to contract a particular condition? What are the chances that a patient diagnosed with breast cancer will survive more than five years?

(3) The evaluation of competing treatments or preventative measures relies heavily on statistical concepts in both the design and analysis phase.

Recognition of the importance of statistics in medicine has increased considerably in recent years. The last decade, in particular, has seen the emergence of evidence-based medicine, and with it the need for clinicians to keep one step ahead of their patients, many of whom nowadays have access to virtually unlimited information (much of it being virtual, yet some of it being limited in its reliability). Compared with previous generations of medical students, today's pre-clinical undergraduates are being taught more about statistical principles than their predecessors. Furthermore, today's clinical researchers are faced (happily, in our view) with growing numbers of biomedical journals utilising statistical referees as part of their peer review processes (see CRITICAL APPRAISAL and STATISTICAL REFEREEING). This enhances the quality of the papers journal editors select, although from the clinical researcher's perspective it has made publication in leading journals more challenging than ever before.

So statistics is (and are) prevalent in the medical world now and is set to remain so for the future. Clearly, clinicians and medical researchers need to know something about the subject, even if only to make their discussion with a friendly statistician more fruitful. The article on consulting a statistician quotes one of the forefathers of modern statistics, R.A. Fisher who, back in 1938, observed wryly: 'To consult the statistician after an experiment is finished is often merely to ask him to conduct a post-mortem examination. He can perhaps say what the experiment died of.' Thus, one of our hopes for the usefulness and helpfulness of the Encyclopaedic Companion to Medical Statistics is that it may serve to encourage both productive and timely interactions between medical researchers and statisticians. Another sincere hope is that it fills a gap between, on the one hand, textbooks that delve into possibly too much theory and, on the other hand, shorter dictionaries that may not necessarily focus on the needs of medical researchers, or else have entries that are tantalisingly succinct.

To meet these ends, the present reference work contains concise, informative, relatively nontechnical, and hence, we trust, readable accounts of over 350 topics central to modern medical statistics. Topics are covered either briefly or more extensively, in general, in accordance with the subject matter's perceived importance, although we acknowledge there will be disagreement, inevitably, about our choice of article lengths. Many entries benefit from containing real-life, clinical examples. Each has been written by an individual chosen not only for subject-matter expertise in the field but, just as importantly, also by ability to communicate statistical concepts to others. The extensive cross-referencing supplied using SMALL CAPITALS to indicate terms that appear as separate entries should help the reader to find his or her way around and also serves to point out associated topics that might be of interest elsewhere within the Encyclopaedic Companion. All but the shortest entries contain references to further resources where the interested reader can learn in greater depth about the particular topic. Thus, while hoping this work is found to be mostly comprehensible we do not claim it to be fully comprehensive. As co-editors we take joint responsibility for any errors ('sins of commission') and would positively welcome suggestions for possible new topics to consider for future inclusion to rectify perceived missing entries ('sins of omission').

Our thanks are due to numerous people - first, to all of the many contributors for providing such excellent material, mostly on time (mostly!) with particular gratitude extended to those who contributed multiple articles or who handled requests for additional articles so gracefully. Next, we appreciated the tremendous and indispensable efforts of staff at Arnold, especially Liz Gooster and Liz Wilson, and not least for their remaining calm during an editor's moments of anxiety and neurosis about the entire project. In addition we would like to thank Harriet Meteyard for her constant support and encouragement throughout the preparation of this book. Finally, our family members deserve especial thanks for having been extra tolerant of our time spent on developing and executing this extensive project from beginning to end. It is our hope that the Encyclopaedic Companion proves all these efforts and sacrifices to be well worthwhile, becoming a useful, regularly-thumbed reference added to the bookshelf of many of those involved in contemplating, conducting or contributing to medical research.

Brian S. Everitt and Christopher R. Palmer
January 2005

Biographical Information on the Editors

Brian S. Everitt - Professor Emeritus, King's College London. After 35 years at the Institute of Psychiatry, University of London, Brian Everitt retired in May 2004. Author of approximately 100 journal articles and over 50 books on statistics, and also co-editor of Statistical Methods in Medical Research. Writing continues apace in retirement but now punctuated by tennis, walks in the country, guitar playing and visits to the gym, rather than by committees, committees and more committees.

Christopher R. Palmer, founding Director of Cambridge University's Centre for Applied Medical Statistics, regularly teaches and collaborates with current and future doctors. His first degree was from Oxford, while graduate and postdoctoral studies were in the USA (at UNC-Chapel Hill and Harvard). He has shifted from mathematical towards applied statistics, with particular interest in the ethics of clinical trials and the use of flexible designs whenever appropriate. Fundamentally, he likes to promote sound statistical thinking in all areas of medical research and hopes this volume might help towards that end. Chris served as Deputy or Acting Editor for Statistics in Medicine, 1996-2000, and is a longstanding statistical reviewer for The Lancet. He and his wife have three children they consider to be more than statistically significant.

List of Contributors

K. R. Abrams (KRA), Centre for Biostatistics and Genetic Epidemiology, Department of Health Sciences, University of Leicester, Leicester LE1 7RH, UK

Colin Baigent (CB), Clinical Trial Service Unit and Epidemiological Studies Unit (CTSU), Richard Doll Building, Old Road Campus, Roosevelt Drive, Oxford OX3 7LF, UK

Alun Bedding (AB), Quantitative Sciences, GlaxoSmithKline, Medicines Research Centre, Gunnels Wood Road, Stevenage, Hertfordshire SG1 2NY, UK

Tijl De Bie (TDB), ISIS Research Group, Building 1, University of Southampton, Southampton SO17 1BJ, UK

Kathe Bjork (KB), Primetrics, Inc., Arvada, Colorado, USA (kathe@primetrics.net)

J. Martin Bland (JMB), Professor of Health Statistics, Department of Health Sciences, University of York, Heslington, York YO10 5DD, UK

Matteo Bottai (MtB), Division of Biostatistics, Arnold School of Public Health, University of South Carolina, 800 Sumter Street, Columbia, SC 29208, USA, and also Unit of Biostatistics, Institute of Environmental Medicine, Karolinska Institutet, Nobels väg 13, Stockholm, Sweden

Michelle Bradley (MMB), Health Information and Quality Authority, George's Court, George's Lane, Dublin 7, Ireland

Sara Brookes (SB), Department of Social Medicine, University of Bristol, Canynge Hall, Whiteladies Road, Clifton, Bristol BS8 2PR, UK

Marc Buyse (MB), International Drug Development Institute (IDDI), 30 avenue provinciale, 1340 Louvain-la-Neuve, Belgium

M.J. Campbell (MJC), School of Health and Related Research, University of Sheffield, Regent Court, 30 Regent Street, Sheffield S1 4DA, UK

James R. Carpenter (JRC), Medical Statistics Unit, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK

Lucy M. Carpenter (LMC), Department of Public Health, University of Oxford, and Nuffield College, Oxford OX1 1NF, UK

Susan Chinn (SC), Respiratory Epidemiology and Public Health, Imperial College, Emmanuel Kaye Building, Manresa Road, London SW3 6LR, UK

Tim Cole (TJC), MRC Centre of Epidemiology for Child Health, UCL Institute of Child Health, 30 Guilford Street, London WC1N 1EH, UK

Chris Corcoran (CCo), Department of Mathematics and Statistics, Utah State University, Logan, UT 84322-3900, USA

Nello Cristianini (NC), UC Davis Department of Statistics, 360 Kerr Hall, One Shields Avenue, Davis, CA 95616, USA

Sarah Crozier (SRC), MRC Lifecourse Epidemiology Unit, University of Southampton, Southampton General Hospital, Southampton SO16 6YD, UK

Carole Cummins (CLC), The University of Birmingham, Department of Public Health, Epidemiology and Biostatistics, 90 Vincent Drive, Edgbaston, Birmingham B15 2TH

George Davey-Smith (GDS), School of Social and Community Medicine, University of Bristol, Oakfield House, Oakfield Grove, Bristol BS8 2BN, UK

Simon Day (SD), Roche Products Limited, Welwyn Garden City, Hertfordshire AL7 1TW, UK

Daniela De Angelis (DDA), Statistics, Modelling and Economics Department, Health Protection Agency, Centre for Infections, London and MRC Biostatistics Unit, Institute of Public Health, University Forvie Site, Robinson Way, Cambridge CB2 0SR, UK

Jonathan Deeks (JD), Public Health, Epidemiology and Biostatistics, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK

Graham Dunn (GD), Health Sciences Research Group, School of Community Based Medicine, University of Manchester, Jean McFarlane Building, Oxford Road, Manchester M13 9PL, UK

Doug Easton (DE), Department of Public Health and Primary Care, University of Cambridge, Strangeways Research Laboratory, Worts Causeway, Cambridge CB1 8RN, UK

Jonathan Emberson (JE), Clinical Trial Service Unit and Epidemiological Studies Unit (CTSU), Richard Doll Building, Old Road Campus, Roosevelt Drive, Oxford OX3 7LF, UK

Richard Emsley (RE), Health Sciences Research Group, School of Community Based Medicine, University of Manchester, Jean McFarlane Building, Oxford Road, Manchester M13 9PL, UK

Brian S. Everitt (BSE), Biostatistics Department, Institute of Psychiatry, Denmark Hill, London SE5 8AF, UK

David Faraggi (DF), Department of Statistics, University of Haifa, Haifa 31905, Israel

W. Harper Gilmour (WHG), Section for Public Health and Health Policy, Division of Community Based Sciences, University of Glasgow, Glasgow G12 8RZ, UK

Els Goetghebeur (EG), Department of Applied Mathematics and Statistics, Ghent University, Krijgslaan 281-S9, 9000 Ghent, Belgium

Andrew Grieve (AG), Division of Health & Social Care Research, Department of Primary Care and Public Health Sciences, School of Medicine, King's College, Floor 7, Capital House, 42 Weston Street, London SE1 3QD, UK

Julian P. T. Higgins (JPTH), MRC Biostatistics Unit, Institute of Public Health, University Forvie Site, Robinson Way, Cambridge CB2 0SR, UK

Theodore R. Holford (TRH), Division of Biostatistics, Yale School of Public Health, Yale, New Haven, CT 06520, USA

Sally Hollis (SH), AstraZeneca, Parklands, Alderley Park, Macclesfield, Cheshire SK10 4TF, UK

Torsten Hothorn (TH), Institut für Statistik, Ludwig-Maximilians-Universität München, Ludwigstraße 33, DE-80539 München, Germany

Hazel Inskip (HI), MRC Lifecourse Epidemiology Unit, University of Southampton, Southampton General Hospital, Southampton SO16 6YD, UK

Tony Johnson (TJ), MRC Biostatistics Unit, Institute of Public Health, University Forvie Site, Robinson Way, Cambridge CB2 0SR, UK and MRC Clinical Trials Unit, 222 Euston Road, London NW1 2DA

Karen Kafadar (KKa), Department of Mathematics, University of Colorado at Denver, PO Box 173364, Campus Box 170, Denver, CO 80217-3364, USA

KyungMann Kim (KK), Department of Biostatistics and Medical Informatics, University of Wisconsin Medical School, 600 Highland Ave., Madison, WI 53792-4675, USA

Ruth King (RK), School of Mathematics and Statistics, Mathematical Institute, University of St Andrews, Fife KY16 9SS, UK

Wojtek Krzanowski (WK), School of Engineering, Mathematics and Physical Science, University of Exeter, Harrison Building, North Park Road, Exeter EX4 4QF, UK

Ranjit Lall (RL), Warwick Emergency Care and Rehabilitation, Division of Health in the Community, Warwick Medical School, University of Warwick, The Farmhouse, Gibbet Hill Campus, Coventry CV4 7AL, UK

Sabine Landau (SL), Biostatistics Department, Institute of Psychiatry, King's College, Denmark Hill, London SE5 8AF, UK

Andrew B. Lawson (AL), Division of Biostatistics and Epidemiology, College of Medicine, Medical University of South Carolina, Charleston, SC 29415, USA

Morven Leese (ML), Health Service and Population Research Department, Institute of Psychiatry, King's College, Denmark Hill, London SE5 8AF, UK

Andy Lynch (AGL), Department of Oncology, University of Cambridge, Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE, UK

Cyrus Mehta (CM), President, Cytel Software Corporation, 675 Massachusetts Avenue, Cambridge, MA 02139, USA

Richard Morris (RM), Department of Primary Care and Population Health, UCL Medical School, Royal Free Campus, London NW3 2PF, UK

Paul Murrell (PM), Department of Statistics, The University of Auckland, Private Bag 92019, Auckland, New Zealand

Christopher R. Palmer (CRP), Department of Public Health and Primary Care, Institute of Public Health, University Forvie Site, Robinson Way, Cambridge CB2 0SR, UK

Max Parmar (MP), MRC Clinical Trials Unit, 222 Euston Road, London NW1 2DA, UK

Nitin Patel (NP), Cytel Software Corporation, 675 Massachusetts Avenue, Cambridge, MA 02139-3309, USA

John Powles (JP), Department of Public Health and Primary Care, Institute of Public Health, University Forvie Site, Robinson Way, Cambridge CB2 0SR, UK

P. Prescott (PP), Faculty of Mathematical Studies, University of Southampton, Southampton SO17 1BJ, UK

Sophia Rabe-Hesketh (SRH), Graduate School of Education and Graduate Group in Biostatistics, University of California, Berkeley, 3659 Tolman Hall, California 94720, USA and Institute of Education, University of London

Benjamin Reiser (BR), Department of Statistics, University of Haifa, Haifa 31905, Israel

Shaun Seaman (SRS), MRC Biostatistics Unit, Institute of Public Health, University Forvie Site, Robinson Way, Cambridge CB2 0SR, UK

Mark Segal (MRS), Division of Biostatistics, University of California, 185 Berry Street, Suite 5100, San Francisco, CA 94107, USA

Pralay Senchaudhuri (PSe), Cytel Software Corporation, 675 Massachusetts Avenue, Cambridge, MA 02139-3309, USA

Stephen Senn (SS), Department of Statistics, The University of Glasgow, Glasgow G12 8QQ, UK

Pak Sham (PS), Department of Psychiatry, The University of Hong Kong, Queen Mary Hospital, 102 Pokfulam Rd, Hong Kong

C. Sharp (CS), Computing Department, Institute of Psychiatry, Denmark Hill, London SE5 8AF, UK

Arvid Sjölander (AJS), Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Nobels väg 12A, 171 77 Stockholm, Sweden

Anders Skrondal (AS), Division of Epidemiology, Norwegian Institute of Public Health, PO Box 4404 Nydalen, Oslo, Norway

Nigel Smeeton (NCS), King's College London, Department of Primary Care and Public Health Sciences, Division of Health and Social Care Research, 7th Floor Capital House, 42 Weston Street, London SE1 3QD, UK

Nigel Stallard (NS), Warwick Medical School, University of Warwick, Coventry CV4 7AL, UK

Jonathan Sterne (JS), School of Social and Community Medicine, University of Bristol, Canynge Hall, 39 Whatley Road, Bristol BS8 2PS, UK

Elisabeth Svensson (ES), Swedish Business School/Statistics, Örebro University, Örebro, Sweden

Matthew Sydes (MS), MRC Clinical Trials Unit, 222 Euston Road, London NW1 2DA, UK

Jeremy Taylor (JMGT), Department of Biostatistics, University of Michigan, 1420 Washington Heights, Ann Arbor, MI 48109-2029, USA

Kate Tilling (KT), School of Social and Community Medicine, University of Bristol, Canynge Hall, 39 Whatley Road, Bristol BS8 2PS, UK

Brian Tom (BT), MRC Biostatistics Unit, Institute of Public Health, University Forvie Site, Robinson Way, Cambridge CB2 0SR, UK

Rebecca Turner (RT), MRC Biostatistics Unit, Institute of Public Health, University Forvie Site, Robinson Way, Cambridge CB2 0SR, UK

Andy Vail (AV), Biostatistics Group, University of Manchester, Oxford Road, Manchester M13 9PL, UK

Stijn Vansteelandt (SV), Ghent University, Dept. of Applied Mathematics and Computer Science, Krijgslaan 281, S9, B-9000 Ghent, Belgium

Sarah L. Vowler (SLV), Bioinformatics Core, Cancer Research UK, Cambridge Research Institute, Robinson Way, Cambridge CB2 0RE, UK

Stephen J. Walters (SJW), Medical Statistics Group, School of Health and Related Research, University of Sheffield, Regent Court, 30 Regent Street, Sheffield S1 4DA, UK

J.G. Wheeler (JGW), Quanticate Ltd, Bevan House, 9-11 Bancroft Court, Hitchin, Herts SG5 1LH, UK

Brandon Whitcher (BW), Clinical Imaging Centre, GlaxoSmithKline, Hammersmith Hospital, Du Cane Road, London W12 0HS, UK

Ian White (IW), MRC Biostatistics Unit, Institute of Public Health, University Forvie Site, Robinson Way, Cambridge CB2 0SR, UK

Janet Wittes (JW), 1710 Rhode Island Ave. NW, Suite 200, Washington DC 20036, USA

Mark Woodward (MW), The George Institute for International Health, PO Box M201, Missenden Road, Sydney NSW 2050, Australia

Ru-Fang Yeh (RY), University of California San Francisco, Campus Box Number 0560, 500 Parnassus, 420 MU-W, San Francisco, CA 94143-0560, USA

Abbreviations and Acronyms

ACES  Active control equivalence studies
ACET  Active control equivalence test
AD  Adaptive design
AI  Artificial intelligence
AIC  Akaike's information criterion
ANCOVA  Analysis of covariance
ANOVA  Analysis of variance
AR  Autoregressive
ARMA  Autoregressive moving average
AUC  Area under curve
BIC  Bayesian information criterion
BUGS  Bayesian inference Using Gibbs Sampling (software)
CACE  Complier average causal effect
CART  Classification and regression tree
CAT  Computer-adaptive testing
CBA  Cost-benefit analysis
CEA  Cost-effectiveness analysis
CI  Confidence interval
CONSORT  Consolidation of standards for reporting trials
COREC  Central Office for Research Ethics Committees
CPMP  Committee for Proprietary Medicinal Products
CPO  Conditional predictive ordinate
CrI  Credible interval
CRM  Continual reassessment method
CSM  Committee on Safety of Medicines
CUA  Cost-utility analysis
CV  Coefficient of variation
CWT  Continuous wavelet transform
DAG  Directed acyclic graph
DALY  Disability adjusted life-year
DAR  Dropout at random
DCAR  Dropout completely at random
DDD  Data-dependent design
DE  Design effect
DoF  Degrees of freedom
DIC  Deviance information criterion
DM  Data mining
DMC  Data monitoring committee
DSMC  Data and safety monitoring committee
DWT  Discrete wavelet transform
DZ  Dizygotic
EBM  Evidence-based medicine
EDA  Exploratory data analysis
EM  Expectation-maximisation
EMEA  European Medicines Evaluation Agency
FDA  Food and Drug Administration
GAM  Generalised additive model
GEE  Generalised estimating equations
GFR  General fertility rate
GIS  Geographical information system
GLIM  Generalised linear interactive modelling (software)
GLIMM  Generalised linear mixed model
GLM  Generalised linear model
GLMM  Generalised linear mixed model
GRR  Gross reproduction rate
GWAS  Genome-wide association studies
HALE  Health-adjusted life expectancy
HMM  Hidden Markov model
HPDI  Highest posterior density interval
HREC  Human research ethics committee
HRQoL  Health-related quality of life
IBD  Identity-by-descent
ICC  Intraclass (or intracluster) correlation coefficient
ICER  Incremental cost-effectiveness ratio
ICH  International Conference on Harmonization
IRB  Institutional review board
ITT  Intention-to-treat
IV  Instrumental variable
KDD  Knowledge discovery in databases
KM  Kaplan-Meier
kNN  k-nearest neighbour
LDF  Linear discriminant function
LR  Likelihood ratio
LREC  Local research ethics committee
LS  Least squares
LST  Large simple trial
MA  Moving average
MANOVA  Multivariate analysis of variance
MAR  Missing at random
MCA  Medicines Control Agency
MCAR  Missing completely at random


M.m. cliaiD MoRa.: Carlo

M~·_·~.~ PnJducis·

.RepIalDay Apacy . M~m_ JikemioacI CIIIinuiIe _ (- C8IimalicIa)

MREC

Mulliccalre ·lUCarch ethiCs caaunillee

MsB

Mc.a.1IqUIR e~

MTQ

MliXim .... Iolerali:d .. . . cloSe Manozyplic: ~ (arnoninfanaalhe)

MZ NI ~.

·:Ncl mcmeIaIy ~ftt.

NNH NNT

.NulDJM:r .aated·IO.hann Niam....... lOlMat Ncpii~ pmlicii~ value

NPV

NUs

Mdt .OLS QR

PeA

PDP PEsr fGM

ftl

.0Idin.y: inst sipuua ·Odds,.aio -PriDciplil companc:al.

anal,. ......,.litY "'Y runciIa. ...... ad B.arian of ~~(IOft~D)

,..,. ~ nicuuni . .

Quanlitali~ .lloci

RCT· REB·

bndomIsed conIIOIIaIlriai Rc~.I.~ .

RBML,

llcscaIda.eIhics cammilleC b ..... max_UIII. UbiihOOd

·ROC·

~eaciwropifttiq ~

ROt

~_ofintaat

RPW

~isacl play~. . .

•• SSM

·S8.t

.SPRT

.SS

,S8

·TDT

PnIpOrtiDaaI . . . .

POP pp

.,. .,...,col·

p'-p

~.JlCl'CCntile

• • • 1~ . . . ~H~t7. . . . . . . . . .

pn,bability

.

.1M

Tail

n·

VAS

wLSi!

~n-·;miIl ~hiJit , .r-" l"" ........ Y c.

Positive paalic:live ·alae

~

Quiaai~1IIlIiIe

.~

SVM

PIiarwnac:oJci_:-.....lo_ _- - - ' - _ : -

PPV

Quality or lire

SMR

PKIPD ppp.

QoL

.SD !S.

' . '

.

~ty. ad)i~ ure:-~.

RR

Naiiaaal Racan:h ElJiic:s . . . .SerVice . ·Net . . . .1CtiDn . .

-~miit

~Y·

RcIaIiWJ

.Ii*

SIancIanI deviation SI8nCIanl~

_ur81 slanclaidi.d

slIuIdIud ciJcr .Of'the mCan;: ~

DDIcJ

l'IIOIIIiIily. raIio . S.aultical parameIric nap psababilily tcit . .ratio .

Seq..,."

SUmofsquans

.

Sum of.squaR:S ~ 10 error s~ -reCtor madu_

TransmiisioD dlsiciltiDn lest· 1bbiJ. fertility rata . ·~rncihod

TiiquIar rat aDaJOpe...Je . W~ .... squan:a estimate

Y"""

(or lIati~)

A

accelerated factor

See SURVIVAL ANALYSIS

accelerated failure time models

See SURVIVAL ANALYSIS, TRANSFORMATION

active control equivalence studies The classic randomised CLINICAL TRIAL seeks to prove superiority of a new treatment to an existing one and a successful conclusion is one in which such proof is demonstrated. The famous MRC trial of streptomycin is a case in point (Medical Research Council Streptomycin in Tuberculosis Trials Committee, 1948). The trial concluded with a significant difference in outcome in favour of the group given streptomycin compared to the group that was not. In recent years, however, there has been an increasing interest in trials whose objective is to show that some new therapy is no worse as regards some outcome than an existing treatment. Such trials have particular features and difficulties that were described in an important paper by Makuch and Johnson (1989) in which they used the term 'active control equivalence studies' (ACES). Actually, the term is not ideally chosen since, unlike bioequivalence studies, where the object is to show that the bioavailability of a new formulation is not only at most 20% less than that of an existing formulation, but also at most 25% more, and hence where equivalence to some degree is genuinely the aim, in ACES it is almost always the case that only noninferiority is the goal.

It may be questioned as to why the rather modest goal of noninferiority should be of any interest in drug regulation. There are several reasons. The first is that the new drug may have advantages in terms of tolerability. Second, the new drug, while showing no net advantage to the existing one, may increase patient choice and this can be useful. For example, many people have an aspirin allergy. Hence, it is desirable to have alternative analgesics, even if no better on average than aspirin. Third, it may become necessary to withdraw treatments from the market and one can never predict when this may happen. There are now several statins on the market. The fact that this is so means that withdrawal of cerivastatin does not make it impossible for physicians to continue to treat their patients with this class of drug. Fourth, introduction of further equivalent therapies before patent expiry of an innovator in the class may permit price competition to the advantage of reimbursers (although such competition is probably not particularly effective; Senn and Rosati, 2003). However, the fifth reason is probably the most important. Drug regulation is designed to satisfy some minimum requirements for pharmaceuticals: that they are of sufficient quality, are safe and efficacious. Efficacy is demonstrated if the treatment is better than placebo, even if it is not as good as some other treatments. The comparison of a new drug to an active treatment may be dictated by ethics but the object of the trial may simply be an indirect proof that the treatment is better than placebo through comparison to an agent whose efficacy is accepted.

Recently the issue of the indirect comparison to placebo has been taken more seriously. Consider the case where we have a single effective treatment on the market, say A, whose efficacy has been demonstrated in a series of trials comparing it to placebo. We now run some new trials comparing a further treatment, B, to A. Taking all these trials together, they then have the structure of an incomplete blocks design. The effect of B compared to placebo can then be estimated using the double contrast of B compared to A and A compared to placebo. This approach has been examined in detail by Hasselblad and Kong (2001). A consequence of taking this particular view of matters is that the precision with which the effect of A was established compared to placebo cannot be exceeded by the indirect comparison of B to placebo, since the variance of this indirect contrast is the sum of the variances of the two direct contrasts. This is, however, not the only difficulty with such studies. The following are some of those that apply.
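The double contrast described above can be illustrated with a minimal sketch. It assumes the two direct estimates are independent and approximately normal on a common additive scale; the function name and the numerical inputs are hypothetical and are not taken from Hasselblad and Kong (2001).

```python
import math
from statistics import NormalDist


def indirect_contrast(est_b_vs_a, se_b_vs_a, est_a_vs_p, se_a_vs_p, level=0.95):
    """Indirect estimate of B versus placebo from two independent direct contrasts.

    Assumption (not from the source): both estimates are on the same additive
    scale and are independent, so their variances add.
    """
    estimate = est_b_vs_a + est_a_vs_p              # (B - A) + (A - placebo)
    se = math.sqrt(se_b_vs_a ** 2 + se_a_vs_p ** 2)  # variances add
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate, se, (estimate - z * se, estimate + z * se)


# Hypothetical numbers: B is slightly worse than A, A clearly beats placebo.
estimate, se, interval = indirect_contrast(-0.1, 0.3, 1.0, 0.4)
print(estimate, se, interval)
```

Whatever the size of the new trials, the standard error returned can never fall below that of the historical A versus placebo contrast, which is the limitation on precision noted in the text.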

Establishing a clinically irrelevant difference. If the route of a formal analysis compared to placebo via an indirect contrast is taken, this particular difficulty may be finessed. The new treatment is shown to be 'significantly' better than placebo, albeit using an indirect argument, and the extent of its inferiority to the comparator is only of relevance to the extent that it impinges on the proof of efficacy compared to placebo. If this proof is provided, then the comparison to the active comparator is 'water under the bridge'. If this particular approach is not taken, however, then any proof of efficacy of the new treatment rests on a demonstration that it is not 'substantially inferior' to the comparator, which comparator is accepted as being efficacious. This raises the issue as to what it means for a drug to be not substantially inferior to another one. This appears to require that some margin $\Delta$, $\Delta > 0$, be adopted such that if $\tau$ is the difference between the new treatment and the standard (where $\tau < 0$ indicates inferiority of the new treatment) then it is judged substantially inferior if $\tau \le -\Delta$ and not substantially inferior or 'equivalent' if $\tau > -\Delta$.


Technical statistical aspects. In a Neyman-Pearson framework (see Salsburg, 1998) the test of noninferiority requires one to use a shifted NULL HYPOTHESIS. One might, therefore, adopt $H_0\colon \tau \le -\Delta$. The situation is not as controversial as that for true bioequivalence, where the fact that two hypotheses have to be rejected, that of inferiority and that of superiority, means that an intuitive approach of seeing that the confidence limits for the difference lie within the limits of equivalence is not 'optimal' (Berger and Hsu, 1996), although the 'optimal' test may in practice be worse (Perlman and Wu, 1999; Senn, 2001). In practice, in the case of ACES if the lower limit of the conventional $1 - \alpha$ two-sided CONFIDENCE INTERVAL for $\tau$ exceeds $-\Delta$, the hypothesis of substantial inferiority may be rejected at the level $\alpha$ and noninferiority asserted. It might be thought that a one-sided confidence interval would be sufficient for this purpose. However, the general regulatory convention is that all tests designed to show superiority are two-sided (despite apparent purpose) and, since such tests are a special case of a noninferiority test with $\Delta = 0$, use of one-sided tests for noninferiority would lead to inconsistencies (Senn, 1997; Committee for Proprietary Medicinal Products, 2000). In a Bayesian framework (see BAYESIAN METHODS) one might require that the posterior probability of substantial inferiority be less than some specified amount. Alternatively, use of a loss function would permit a decision analytic method, such as has been proposed for bioequivalence (Lindley, 1998), to be used.
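The confidence interval rule just described can be sketched as follows. This is only an illustration under a normal approximation, which is an assumption made here; the function name and the example figures are hypothetical.

```python
from statistics import NormalDist


def noninferiority_by_ci(tau_hat, se, margin, alpha=0.05):
    """Apply the confidence interval rule for noninferiority.

    tau_hat : estimated difference, new minus standard (negative = new is worse)
    se      : standard error of tau_hat
    margin  : the clinically irrelevant difference, Delta > 0
    Assumption (not from the source): tau_hat is approximately normal.
    Returns the lower limit of the two-sided 1 - alpha interval and whether
    it exceeds -margin.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lower = tau_hat - z * se
    return lower, lower > -margin


# Hypothetical trial: the new treatment looks slightly worse, margin of 3 units.
print(noninferiority_by_ci(tau_hat=-0.5, se=1.0, margin=3.0))
# The same data with a stricter margin of 2 units no longer support noninferiority.
print(noninferiority_by_ci(tau_hat=-0.5, se=1.0, margin=2.0))
```

Setting `margin=0` in this sketch reduces the rule to the usual requirement for superiority, namely that the lower limit of the two-sided interval exceeds zero.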

Power of trials. Note that the reason one does not employ a value of $\Delta = 0$ in practice is that unless it is expected that the new treatment really is better than the standard, the power of the resulting test could never exceed 50%. However, the clinically irrelevant difference is likely to be less than the clinically relevant difference used in conventional trials. Hence, if the new treatment is actually no better than the standard treatment, then, for a given sample size, the noncentrality parameter, $\delta = \Delta/\mathrm{SE}(\hat{\tau})$, is likely to be smaller for ACES than for trials designed to show superiority. Consequently, ACES either have lower power or higher sample sizes than conventional trials.
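To make the role of the noncentrality parameter concrete, the following sketch computes the approximate power of the confidence interval rule for a given margin and standard error. It assumes a normally distributed estimate with known standard error; the function and its arguments are hypothetical names introduced for illustration.

```python
from statistics import NormalDist


def noninferiority_power(margin, se, true_tau=0.0, alpha=0.05):
    """Approximate probability that the lower two-sided 1 - alpha limit exceeds -margin.

    Assumption (not from the source): tau_hat ~ Normal(true_tau, se**2), so
    P(tau_hat - z * se > -margin) = Phi((true_tau + margin) / se - z).
    """
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf((true_tau + margin) / se - z)


# With the new treatment truly equal to the standard (true_tau = 0), power is
# driven entirely by the ratio margin / se, the noncentrality parameter.
for ratio in (1.0, 2.0, 2.8, 3.5):
    print(ratio, round(noninferiority_power(margin=ratio, se=1.0), 3))
```

Because the margin is usually smaller than a clinically relevant difference, reaching a given ratio `margin / se` requires a smaller standard error, and hence a larger sample, than in a comparable superiority trial.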

Assay sensitivity. A problem with ACES is that if the trial appears to show noninferiority of the new treatment, then there are three plausible explanations. The first, that of chance, is one that statistical analysis is designed to address. The second, that the new treatment is indeed noninferior, is what was desired to prove. However, a third possibility, that the experiment was not sensitive enough to find a difference, is difficult to exclude. This issue has been referred to as one of 'competence' (Senn, 1993) and affects whatever inferential framework one decides to use. An analogy may be useful here. In a game of hunt the thimble, a found thimble renders the quality of the strategy used for finding it irrelevant. It is no more 'found' if a good strategy were used than if a bad one were. However, a failure to find a thimble does not automatically justify the conclusion that the room does not contain one and the quality of the search employed is a crucial consideration in any judgement that it does not.

The effect of DROPOUTS, NONCOMPLIANCE and the role of analysis. It is plausible that in many circumstances in conventional superiority trials if noncompliance or dropouts are a problem an INTENTION-TO-TREAT analysis will give a more modest estimate of the treatment effect than will a PER PROTOCOL analysis. In ACES, it is at least plausible that this may not be the case.

Conflict of requirements of additivity and clinical relevance. It may be that the clinically irrelevant difference is most meaningfully established on a scale that is not additive. For example, in a trial of an anti-infective, it could be most appropriate to establish that the difference in cure rate on the probability scale was not greater than some specified amount. Contrariwise, the log-odds scale might lend itself more readily to statistical modelling. This can lead to considerable difficulties (Holmgren, 1999), in particular because a trial does not recruit a random sample from the target population. It may be that further modelling using additional data may be necessary (Senn, 2000).

A common circumstance likely to make regulatory authorities ask questions is that a trial that was designed with optimism to show superiority to an active comparator fails to do so, but then is used to attempt to demonstrate noninferiority. This particular set of circumstances has become the subject of one of the European Medicines Evaluation Agency's 'points to consider' (Senn, 1997; Committee for Proprietary Medicinal Products, 2000). This stresses the desirability of establishing the trial's purpose pre-performance and also warns against establishing the clinically irrelevant difference, $\Delta$, after the trial is complete. It regards putting a trial that was designed to show superiority to the purpose of noninferiority as an unacceptable use but accepts the converse. The guideline recognises that there are no issues of multiple testing involved with such switches (Bauer and Kieser, 1996) but that establishing values of $\Delta$ retrospectively may be biasing. Thus, it is preferable for investigators to specify in advance (e.g. by means of formal changes to the CLINICAL TRIALS PROTOCOL) their intended switch of purpose and to fix the value of $\Delta$ prior to data unblinding.

This, however, raises the issue as to whether the value of $\Delta$ is not something the regulator should declare for given indications rather than relying on the sponsor to do so. Otherwise, a regulator could be faced with the following position. Drug B is registered on the basis of comparison to a standard treatment A because the lower confidence limit for the treatment effect, $\tau_{B-A}$, exceeds some pre-specified value, $-\Delta$. However, a further drug, C, which has also been compared to A, is not granted a licence because a superiority trial was planned. Although superiority to A was not proven, the lower confidence limit for the treatment effect $\tau_{C-A}$ excludes a smaller possible difference between C and A than is excluded for the difference between B and A by the trial that has led to registration of B. SS

Bauer, P. and Kieser, M. 1996: A unifying approach for confidence intervals and testing of equivalence and difference. Biometrika 83, 4, 934-7.

Berger, R. L. and Hsu, J. C. 1996: Bioequivalence trials, intersection-union tests and equivalence confidence sets. Statistical Science 11, 4, 283-302.

Committee for Proprietary Medicinal Products 2000: Points to consider on switching between superiority and non-inferiority.

Hasselblad, V. and Kong, D. F. 2001: Statistical methods for comparison to placebo in active-control studies. Drug Information Journal 35, 435-49.

Holmgren, E. B. 1999: Establishing equivalence by showing that a specified percentage of the effect of the active control over placebo is maintained. Journal of Biopharmaceutical Statistics 9, 4, 651-9.

Lindley, D. V. 1998: Decision analysis and bioequivalence trials. Statistical Science 13, 2, 136-41.

Makuch, R. and Johnson, M. 1989: Issues in planning and interpreting active control equivalence studies. Journal of Clinical Epidemiology 42, 6, 503-11.

Medical Research Council Streptomycin in Tuberculosis Trials Committee 1948: Streptomycin treatment for pulmonary tuberculosis. British Medical Journal ii, 769-82.

Perlman, M. D. and Wu, L. 1999: The emperor's new tests. Statistical Science 14, 4, 355-69.

Salsburg, D. 1998: Hypothesis testing. In Armitage, P. and Colton, T. (eds), Encyclopedia of biostatistics. Chichester: John Wiley & Sons, Ltd.

Senn, S. J. 1993: Inherent difficulties with active control equivalence studies. Statistics in Medicine 12, 24, 2367-75.

Senn, S. J. 1997: Statistical issues in drug development. Chichester: John Wiley & Sons, Ltd.

Senn, S. J. 2000: Consensus and controversy in pharmaceutical statistics (with discussion). The Statistician 49, 135-76.

Senn, S. J. 2001: Statistical issues in bioequivalence. Statistics in Medicine 20, 17-18, 2785-99.

Senn, S. J. and Rosati, N. 2003: Editorial: Pharmaceuticals, patents and competition - some statistical issues. Journal of the Royal Statistical Society Series A - Statistics in Society 166, 271-7.

adaptive designs

CLINICAL TRIALS that are adaptive are modified in some way by the data that have already been collected within that trial. The most common way the designs adapt is in the allocation of treatment, as a function of the response. For example, we may be interested in a dose that gives a 20% chance of toxicity, where excesses to this level of toxicity would be harmful. Therefore, we may want to design the trial in such a way that, as more information is gathered, doses are allocated to optimise the estimate of that dose. If we were to use a traditional fully randomised approach to running the trial, which is not adaptive, we would probably not look at the data until the end of the trial, thereby risking exposing subjects to toxic doses and also possibly failing to produce an optimal estimate of the required dose. Another such example of an adaptive design is given in Rosenberger and Lachin (1993), whereby there are two treatments in the study, A and B, and as information emerges from the trial the treatment assignment probabilities are adapted in an attempt to assign more patients to the treatment performing better thus far. Therefore, when a patient enters the study, if treatment A appears to be better than treatment B, a patient has a greater than 50% chance of being allocated treatment A - and vice versa.

Because adaptive designs modify the allocation of treatment on an ongoing basis, and thus protect patients from ineffective or toxic doses, they can be said to be more ethical than traditional designs. Rosenberger and Palmer (1999) consider the ethical dilemma between collective and individual ethics (see ETHICS AND CLINICAL TRIALS) and argue that in a clinical trial setting individual ethics should be uppermost; i.e. consideration should be towards doing what is best for patients in the current trial as opposed to doing what is best for future patients who stand to benefit from the results of the current trial. The Declaration of Helsinki of October 2000 outlines the tension between these two types of ethics by stating: 'Considerations related to the well-being of the human subject should take precedence over the interests of science and society.' It is adaptive designs that address the individual ethics, as opposed to fully randomised designs, which address those collective ethics. We will be dealing primarily with response adaptive designs here, such as those just outlined, and will not be describing those designs that attempt dynamically to balance the randomisation for covariate information, such as outlined by Pocock and Simon (1975) (see DATA-DEPENDENT DESIGNS, MINIMISATION).

The randomised play-the-winner (RPW) design attempts to allocate treatments to patients sequentially based on a simple probability model. Rosenberger (1999) emphasises that the RPW design specifically applies to the situation where the outcome from a trial is binary, i.e. either 'success' or 'failure', and where there are only two treatments, e.g. drug A and drug B. At the start of the trial there is an assumed urn of $\alpha$ balls of type A (which relate to drug A) and $\beta$ balls of type B (which relate to drug B). When a subject is recruited, a ball is drawn from the urn and then replaced. If the ball is type A then the subject is allocated to drug A, if type B then the subject is allocated to drug B. When the subject's outcome is available (and we assume that the outcome is available before the next subject is randomised), the urn is updated. If the response is a success on drug A, then a ball of type A is put into the urn, and similarly for a success on drug B. If the outcome is a failure on drug A, then a ball of type B is put into the urn, and again similarly for a failure on drug B. In this way, the balls build up such that a new subject has a better chance of being allocated to a better treatment. Rosenberger (1999) concludes with a table of conditions under which the RPW rule is reasonable and provides a realistic alternative to the standard clinical trial design. These are given in the table.

adaptive designs Conditions under which the RPW is reasonable (Rosenberger, 1999)
• The therapies have been evaluated previously for toxicity
• The response is binary
• Delay in response is moderate, allowing adapting to take place
• Sample sizes are moderate (at least 50 subjects)
• Duration of the trial is limited and recruitment can take place during the entire trial
• The trial is carefully planned with extensive computations done under different models and initial urn compositions
• The experimental therapy is expected to have significant benefits to public health if it proves effective
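The urn scheme of the RPW design can be sketched in a few lines of simulation code. This is an illustrative sketch only: the function names, the default of one ball of each type and the success probabilities are hypothetical, and each outcome is assumed to be known before the next subject is randomised, as in the description above.

```python
import random


def rpw_update(urn, arm, success):
    """Add one ball: the same type after a success, the other type after a failure."""
    other = "B" if arm == "A" else "A"
    urn[arm if success else other] += 1


def simulate_rpw(p_success, n_subjects, alpha=1, beta=1, seed=0):
    """Simulate a randomised play-the-winner trial with two arms.

    p_success : dict of (hypothetical) true success probabilities for "A" and "B"
    alpha, beta : initial numbers of type A and type B balls in the urn
    """
    rng = random.Random(seed)
    urn = {"A": alpha, "B": beta}
    allocations = []
    for _ in range(n_subjects):
        # Draw a ball with replacement to allocate the next subject.
        prob_a = urn["A"] / (urn["A"] + urn["B"])
        arm = "A" if rng.random() < prob_a else "B"
        success = rng.random() < p_success[arm]
        rpw_update(urn, arm, success)
        allocations.append(arm)
    return urn, allocations


urn, allocations = simulate_rpw({"A": 0.7, "B": 0.4}, n_subjects=50)
print(urn, allocations.count("A"), allocations.count("B"))
```

Running such simulations under different success probabilities and initial urn compositions is one way of carrying out the 'extensive computations' listed among the conditions in the table.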

Traditional dose-response studies, where patients are allocated to a limited number of doses along an assumed dose-response curve, are limited and, some would say, wrong. For example, if the assumed dose-response model is incorrect then patients may be allocated to ineffective or unsafe doses. One answer could be to increase the number of doses. However, this would result in many patients allocated to wasted doses. It would be much better to increase the number of doses and allocate doses to a subject based on current knowledge of the dose-response curve, which best optimises some pre-specified criteria. This is precisely what Bayesian response adaptive designs attempt to do, by employing Bayesian DECISION THEORY to a utility function. Thus, the dose that most optimally addresses the utility is allocated to the next available subject or cohort of subjects.

One of the first BAYESIAN METHODS described was the continual reassessment method (CRM), introduced by O'Quigley, Pepe and Fisher (1990), and originally devised for dose-escalation studies in oncology. Whitehead et al. (2001a) suggest that the method could also be used for applications in other serious diseases. The CRM envisages a study whereby human volunteers are treated sequentially, in order to detect a dose with a probability of toxicity of 20%, i.e. TD20. The response is a binary response, 'toxicity' or 'no toxicity'. Before the study starts, investigators are asked to provide what their best guess is of a probability of toxicity at each of the series of doses. The first patient is then treated with the dose that is considered to be the closest to the TD20. Once the outcome is observed the PROBABILITY of toxicity at each of the doses is recalculated using the Bayesian method of statistics. The procedure continues in this way until it settles on a single dose. Whitehead et al. (2001a) point out that the CRM does home in on the TD20 quickly and efficiently, but there has been concern that early on in the trial subjects could be allocated to too high a dose, leading to potential toxicity problems. This has led to a number of modifications, such as starting at the lowest dose and never skipping a dose during the escalation.

Whitehead et al. (2001b) suggest practical extensions to the CRM for pharmacokinetic data, employing the use of Bayesian decision theory to allocate treatments optimally to subjects. They argue that conventional dose-escalation studies carried out in healthy volunteers do not normally employ statistical methodology or formal guidelines for dose escalation. As such the studies can take a long time to complete with little opportunity to skip doses. The methods proposed allocate doses in order to maximise the information about the dose-response curve, given a pre-specified safety constraint. They use two simple utility or gain functions, one that allocates the highest allowable dose under the safety constraint and the other that allocates doses in order to optimise the shape of the dose-response curve.

Krams et al. (2003) also use a Bayesian decision theory approach with sequential dose allocation in a Phase II study in acute stroke therapy by inhibition of neutrophils (ASTIN), which employs up to 15 dose levels. They use a response-adaptive procedure in order to find a dose that gives an improvement over that of placebo in the primary ENDPOINT, allocating the next subject either to the optimal dose or PLACEBO. Stopping rules were employed by which if the posterior probability of an effective drug or ineffective drug were greater than 0.9 then the decision would be made either to go on to a confirmatory trial (effective drug) or to stop development (ineffective drug). In this way, they were able to stop development of a compound more quickly than would have been possible under the traditional paradigm.

In 2006, the Pharmaceutical Research and Manufacturers of America (PhRMA) Adaptive Design Working Group published a series of papers in an issue of the Drug Information Association (DIA) journal detailing various aspects of these trials. Topics included terminology and classification; implementation; confidentiality and trial integrity; adaptive dose response; seamless Phase II/III; and sample size re-estimation (see Drug Information Journal 40, 425-84, 2006). In addition, and reflecting the growing interest in adaptive designs, there have been numerous special editions of other journals devoted to these trials, including Journal of Statistical Planning and Inference, issue 136(2), 2006; Journal of Biopharmaceutical Statistics, issues 16(5), 2006 and 17(6), 2007; and Statistics in Medicine, issue 27(10), 2008. AS
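The updating step of the CRM described above can be sketched as follows. The sketch uses a one-parameter 'power' working model, in which the toxicity probability at a dose is the investigators' prior guess raised to the power exp(a), with a normal prior on a evaluated on a grid. This working model, the prior standard deviation and all names and numbers are assumptions made for illustration; they are not taken from O'Quigley, Pepe and Fisher (1990).

```python
import math


def crm_recommend(skeleton, doses_given, toxicities, target=0.20, prior_sd=1.34):
    """One CRM update: posterior mean toxicity per dose and the dose closest to target.

    skeleton    : prior guesses of toxicity probability at each dose (increasing)
    doses_given : indices of the doses already administered
    toxicities  : matching outcomes, True for 'toxicity', False for 'no toxicity'
    Assumed working model: p_i(a) = skeleton[i] ** exp(a), a ~ Normal(0, prior_sd**2).
    """
    grid = [-4 + 8 * i / 400 for i in range(401)]
    log_post = []
    for a in grid:
        lp = -0.5 * (a / prior_sd) ** 2  # log prior, up to a constant
        for dose, tox in zip(doses_given, toxicities):
            p = skeleton[dose] ** math.exp(a)
            lp += math.log(p) if tox else math.log(1 - p)
        log_post.append(lp)
    top = max(log_post)
    weights = [math.exp(lp - top) for lp in log_post]
    total = sum(weights)
    post_tox = [
        sum(w * s ** math.exp(a) for w, a in zip(weights, grid)) / total
        for s in skeleton
    ]
    best = min(range(len(skeleton)), key=lambda i: abs(post_tox[i] - target))
    return best, post_tox


skeleton = [0.05, 0.10, 0.20, 0.35, 0.50]  # hypothetical prior guesses
print(crm_recommend(skeleton, doses_given=[2], toxicities=[True]))
print(crm_recommend(skeleton, doses_given=[2], toxicities=[False]))
```

A modified CRM of the kind mentioned in the text would add restrictions on top of this recommendation, for example starting at the lowest dose and never moving up by more than one dose level at a time.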

Krams, M., Lees, K., Hacke, W., Grieve, A. P., Orgogozo, J.-M. and Ford, G. A. 2003: Acute stroke therapy by inhibition of neutrophils (ASTIN). An adaptive dose-response study of UK-279,276 in acute ischemic stroke. Stroke 34, 2543-8.

Pocock, S. and Simon, R. 1975: Sequential treatment assignment with balancing for prognostic factors in controlled clinical trials. Biometrics 31, 103-15.

O'Quigley, J., Pepe, M. and Fisher, L. 1990: Continual reassessment method: a practical design for Phase 1 clinical trials in cancer. Biometrics 46, 33-48.

Rosenberger, W. F. 1999: Randomized play-the-winner clinical trials: review and recommendations. Controlled Clinical Trials 20, 328-42.

Rosenberger, W. F. and Lachin, J. M. 1993: The use of response-adaptive designs in clinical trials. Controlled Clinical Trials 14, 471-84.

Rosenberger, W. F. and Palmer, C. R. 1999: Ethics and practice: alternative designs for Phase III randomised clinical trials. Controlled Clinical Trials 20, 172-86.

Whitehead, J., Zhou, Y., Patterson, S., Webber, D. and Francis, S. 2001a: Easy-to-implement Bayesian methods for dose-escalation studies in healthy volunteers. Biostatistics 2, 47-61.

Whitehead, J., Zhou, Y., Stallard, N., Todd, S. and Whitehead, A. 2001b: Learning from previous responses in Phase 1 dose-escalation studies. British Journal of Clinical Pharmacology 52, 1-7.

adaptive randomisation

See ADAPTIVE DESIGNS, RANDOMISATION

adjustment for noncompliance in randomised controlled trials In clinical medicine, 'noncompliance' occurs when a patient does not fully follow a prescribed course of treatment. The alternative terms 'adherence' and 'concordance' attempt to avoid the authoritarian overtones of 'compliance'. In randomised CLINICAL TRIALS, we are concerned with any departure from a randomised treatment, whether due to noncompliance or a treatment change agreed with medical staff. In a trial to compare two types of medication (drug A and drug B, say) for the treatment of heart disease, for example, patients may refuse or forget to take any of their medication or forget to take it some of the time (partial compliance). Patients allocated to receive drug A might switch to drug B, and vice versa. Some of the patients may even take another medication altogether (drug C, say) or, particularly if the therapy appears to be failing, receive a much more radical intervention such as surgery. A further complication for the estimation of treatment effects arises when patients who fail to comply with their prescribed treatment are also those who are more likely to be lost to follow-up.

Rationale. Conventionally, trials with departures from randomised treatment are analysed by INTENTION-TO-TREAT. This directly compares the effectiveness of the different treatment policies as actually implemented in the trial, e.g. 'drug A plus changes' versus 'drug B plus changes'. Unlike effectiveness, efficacy relates to the effects of the treatments themselves, and is not estimated by an intention-to-treat analysis. Researchers may also be interested in the effectiveness of an intervention in other circumstances, e.g. if public suspicion of the intervention had been reduced by the positive results of a clinical trial. In these circumstances, the rates of compliance may be improved and adjustment for this change may be attempted.

It is important to define the aim of adjustment for noncompliance. For example, in a trial of immediate versus deferred zidovudine in asymptomatic HIV infection, the initial plan was to defer zidovudine until the onset of symptomatic disease. However, following a protocol amendment, some individuals started zidovudine before the onset of symptomatic disease (White et al., 1997). There was interest in estimating the effect that would have been observed under the original protocol. Zidovudine before the onset of symptomatic disease was therefore regarded as 'noncompliance'. Other individuals stopped zidovudine treatment because of adverse events. Additional adjustment for stopping treatment would not answer a clinically relevant question, so the analysis did not aim to estimate efficacy.

Adjustment for noncompliance is useful in a variety of situations. Patients may be most interested in treatment efficacy. Differences in compliance may help to explain variation of a treatment effect with time, between subgroups in a trial or between trials in a META-ANALYSIS. Reconciling trial data with observational data may require adjustment for noncompliance in the trial. Policy analysis may require projections for situations with improved compliance.

Most attempts to allow for noncompliance use on-treatment analysis or PER PROTOCOL analysis. This only provides a valid comparison of the treatments themselves (efficacy) if compliers and noncompliers do not differ systematically in their disease state or prognosis. In practice this is unlikely to be the case, so selection bias occurs. Heart disease patients who comply with their prescribed medication, for example, are also those who are likely to improve their diet or take more exercise and these changes, in turn, are likely to lead to a better outcome. SELECTION BIAS may often be reduced by adjustment for baseline covariates, but there is still no guarantee of an unbiased analysis. For example, in the Coronary Drug Project, 5-year mortality of poor compliers was 28.2 % compared with 15.1 % in good compliers and adjustment for 40 baseline factors only reduced the difference to 25.8 % versus 16.4 % (The Coronary Drug Project Research Group, 1980).

Newer 'randomisation-based' methods can estimate efficacy while avoiding selection bias by directly comparing the groups as randomised as in an intention-to-treat analysis (White, 2005). This is made possible by considering the


subgroup of 'compliers' who would have received their randomised treatment, whichever group they were randomised to. For example, a trial in Indonesian children compared vitamin A supplementation with no intervention, the outcome being 12-month mortality. Vitamin A supplementation was actually received by only 80 % of the intervention arm and by none of the control arm. Sommer and Zeger (1991) considered the subgroup who did not receive vitamin A in the intervention arm and a corresponding subgroup of the control arm who would not have received vitamin A if they had been allocated to receive it. These 'noncomplier' subgroups were assumed to be unaffected by allocation to vitamin A. It is then straightforward to estimate the number of noncompliers in the control arm and their mean outcome, and hence the risk difference, risk ratio or odds ratio in compliers. This is often called the 'complier average causal effect' (CACE) estimate (Little and Rubin, 2000). This approach is a special case of PRINCIPAL STRATIFICATION.

A more general approach requires a model relating potential outcomes for each individual under different counterfactual treatments. A simple model might say that each individual would have blood pressure b mmHg lower if they took the drug with perfect compliance than if they did not take the drug, with proportional blood pressure reductions for partial compliance. Such a model may be fitted by observing that untreated blood pressure must have the same distribution in each randomised group (Fischer-Lapp and Goetghebeur, 1999). An important advantage of these methods is that no assumption is required about the relationship between compliance and potential outcomes. They are closely related to the use of INSTRUMENTAL VARIABLES methods (Dunn and Bentall, 2007).

The approaches just described are generally only able to estimate one treatment effect in a two-arm trial. They tend to be hopelessly imprecise in situations such as EQUIVALENCE STUDIES where patients may stop all treatment during the trial, so that the analysis requires estimation of the effect of both treatments. In this case it is possible to adjust the randomised comparison using observational estimation of one or more treatment effects - i.e. assuming there are no unmeasured confounders for treatment. Methods such as marginal structural modelling can work even when actual treatment is both a consequence of symptomatic deterioration and a cause of slower disease progression (see Little and Rubin, 2000, for references to this literature).

A trial with noncompliance has less POWER than one with perfect compliance, as a result of the reduced effect size as estimated in an intention-to-treat analysis, and it is natural to want to recover the lost power. However, many of the new procedures preserve the intention-to-treat SIGNIFICANCE LEVEL and therefore do not affect power. In some cases, it is impossible to regain power without making some assumption about comparability of noncompliers and compliers. In other situations, some gain in power is theoretically possible, but this is unlikely to be appreciable in practice (Becque and White, 2008). Significance testing should therefore rely on intention-to-treat analysis even when other methods are used to estimate efficacy. IW/GD
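The complier average causal effect calculation described for the vitamin A example can be sketched in a few lines. This is only an illustrative sketch: the function names and all the counts are hypothetical (they are not the trial's data), and it assumes, as in the text, that nobody in the control arm receives the intervention and that noncompliers are unaffected by allocation.

```python
def cace_risk_difference(n_treat, n_treat_compliers, events_treat_compliers,
                         events_treat_noncompliers, n_control, events_control):
    """Risk difference in compliers (CACE) for a two-arm trial.

    Assumptions (hypothetical sketch): the control arm has no access to the
    intervention, and noncompliers' outcomes do not depend on allocation.
    """
    prop_compliers = n_treat_compliers / n_treat
    n_treat_noncompliers = n_treat - n_treat_compliers
    risk_noncompliers = events_treat_noncompliers / n_treat_noncompliers

    # By randomisation, the control arm holds the same proportion of
    # would-be noncompliers, with the same risk as the observed noncompliers.
    n_control_noncompliers = n_control * (1 - prop_compliers)
    events_control_noncompliers = n_control_noncompliers * risk_noncompliers

    # Whatever is left in the control arm belongs to the would-be compliers.
    n_control_compliers = n_control - n_control_noncompliers
    risk_control_compliers = (
        (events_control - events_control_noncompliers) / n_control_compliers
    )
    risk_treat_compliers = events_treat_compliers / n_treat_compliers
    return risk_treat_compliers - risk_control_compliers


def itt_risk_difference(n_treat, events_treat, n_control, events_control):
    """Intention-to-treat risk difference (groups compared as randomised)."""
    return events_treat / n_treat - events_control / n_control


# Hypothetical counts chosen only to illustrate the arithmetic.
cace = cace_risk_difference(10000, 8000, 40, 40, 10000, 100)
itt = itt_risk_difference(10000, 80, 10000, 100)
print(f"ITT risk difference:  {itt:.4f}")
print(f"CACE risk difference: {cace:.4f}")
```

Under these assumptions the CACE equals the intention-to-treat difference divided by the proportion of compliers, which is why the estimate inherits the intention-to-treat significance level.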

Angrist, J. D., Imbens, G. W. and Rubin, D. B. 1996: Identification of causal effects using instrumental variables (with discussion). Journal of the American Statistical Association 91, 444-72.
Becque, T. and White, I. R. 2008: Regaining power lost by non-compliance via full probability modelling. Statistics in Medicine 27, 5640-63.
Dunn, G. and Bentall, R. 2007: Modelling treatment-effect heterogeneity in randomized controlled trials of complex interventions (psychological treatments). Statistics in Medicine 26, 4719-45.
Fischer-Lapp, K. and Goetghebeur, E. 1999: Practical properties of some structural mean analyses of the effect of compliance in randomized trials. Controlled Clinical Trials 20, 531-46.
Little, R. J. and Rubin, D. B. 2000: Causal effects in clinical and epidemiological studies via potential outcomes: concepts and analytical approaches. Annual Review of Public Health 21, 121-45.
The Coronary Drug Project Research Group 1980: Influence of adherence to treatment and response to cholesterol on mortality in the Coronary Drug Project. New England Journal of Medicine 303, 1038-41.
White, I. R. 2005: Uses and limitations of randomization-based efficacy estimators. Statistical Methods in Medical Research 14, 327-47.
White, I. R., Walker, S., Babiker, A. G. and Darbyshire, J. H. 1997: Impact of treatment changes on the interpretation of the Concorde trial. AIDS 11, 999-1006.

age-period cohort analysis

To understand the effect of time on a particular outcome for an individual it is essential to realise the relevant temporal perspective. Age affects many aspects of life, including the risk of disease, so this is an essential component of any analysis of time trends. Period denotes the date of the outcome and if the outcome varies with period it is likely to be due to some underlying factor that affects the outcome and varies in the same way for the entire population under study. Cohort, contrariwise, refers to generational effects caused by factors that only affect particular age groups when their level changes with time.

An example of a period effect would be a potential effect of an air contaminant that affected all age groups in the same way. If the level of exposure to that factor increased/decreased with time, exerting a change in the outcome in all age groups, then we would expect a related pattern across all age groups in the study. In studies that take place over long periods of time, the technology for measuring the outcome may change, giving rise to an artifactual effect that was not due to change in exposure to a causative agent. For example, intensive screening for disease can identify disease cases that would not previously have been identified, thus artificially increasing the disease rate in a population that has had no change in exposure over time.

Cohort (also called birth cohort) effects may be due to factors related to exposures associated with the date of birth, such as the introduction of a particular drug or practice during pregnancy that was brought in at a particular point in time. For example, a pregnancy practice associated with increased risk and adopted by the population of mothers during a particular time period could affect the risk during the life span of the entire generation born during that period. While it is common to refer to these effects as being associated with year of birth, they could also be the result of changes in exposure that occurred after birth. In many individuals, lifestyle factors that may affect disease risk over a lifetime are fixed as they approach adulthood. A quantification of these effects on such a generation would give rise to a comparison of these cohort or generational effects.

An inherent redundancy among these three temporal factors arises from the fact that knowing any two factors implies the value of the third. For example, if we know an individual's age (a) at a given date or period (p), then the cohort is the difference (c = p - a). This linear dependence gives rise to an identifiability problem in a formal regression model that attempts to obtain quantitative estimates of regression parameters associated with each temporal element:

\[ E[Y] = \beta_0 + a\beta_a + p\beta_p + c\beta_c \]

Using the linear relationship between the temporal factors gives rise to:

\[ E[Y] = \beta_0 + a\beta_a + p\beta_p + (p - a)\beta_c = \beta_0 + a(\beta_a - \beta_c) + p(\beta_p + \beta_c) \]

which has only two identifiable parameters besides the intercept instead of the expected three. Another way of visualising this phenomenon is that all combinations of age, period and cohort may be displayed in the LEXIS DIAGRAM (see the figure), which is obviously a representation of a two-dimensional plane instead of the three dimensions expected for three separate factors.

[Figure: age-period cohort analysis. Lexis diagram showing the relationship between age, period and cohort. The diagonal line traces the age-period lifetime for an individual born in 1947. Horizontal axis: Age (years).]

In general, these analyses are not limited to linear effects applied to a continuous measure of time, but instead they are applied to temporal intervals, such as disease rates observed for 5- or 10-year intervals of age and period. When the widths of these intervals are equal, the model may be expressed as:

\[ E[Y_{ijk}] = \mu + \alpha_i + \pi_j + \gamma_k \]

where $\mu$ is the intercept, $\alpha_i$ the effect of age for the $i$th ($i = 1, \ldots, I$) interval, $\pi_j$ the effect of period for the $j$th ($j = 1, \ldots, J$) interval and $\gamma_k$ the effect of the $k$th cohort ($k = j - i + I = 1, \ldots, K = I + J - 1$). The usual constraints in this model imply that $\sum_i \alpha_i = \sum_j \pi_j = \sum_k \gamma_k = 0$. The identifiability problem manifests itself through a single unidentifiable parameter (Fienberg and Mason, 1979), which can be more easily seen if we partition each temporal effect into components of overall linear trend and curvature or departure from linear trend. For example, age can be given by $\alpha_i = \tilde{i}\beta_\alpha + \tilde{\alpha}_i$, where $\tilde{i} = i - 0.5(I + 1)$, $\beta_\alpha$ is the overall slope and $\tilde{\alpha}_i$ the curvature. The overall model can be expressed as:

\[ E[Y_{ijk}] = \mu + (\tilde{i}\beta_\alpha + \tilde{\alpha}_i) + (\tilde{j}\beta_\pi + \tilde{\pi}_j) + (\tilde{k}\beta_\gamma + \tilde{\gamma}_k) = \mu + \tilde{i}(\beta_\alpha - \beta_\gamma) + \tilde{j}(\beta_\pi + \beta_\gamma) + \tilde{\alpha}_i + \tilde{\pi}_j + \tilde{\gamma}_k \]

because $\tilde{k} = \tilde{j} - \tilde{i}$. Thus, each of the curvatures can be uniquely determined, but the overall slopes are hopelessly entangled so that only certain combinations can be uniquely estimated (Holford, 1983).

The implication of the identifiability problem is that the overall direction of the effect for any of the three temporal components cannot be determined from a regression analysis. Thus, we cannot even determine whether the trends are increasing or decreasing with cohort, for instance. The second figure displays several combinations of age, period and cohort parameters, each set of which provides an identical set of fitted rates. Notice that as the period

parameters are rotated clockwise, the age and cohort parameters are comparably rotated in the counterclockwise direction. Each of these parameters can be rotated a full 180°, but it is important also to realise that they cannot be rotated one at a time, only all together. Thus, even though the specific trends cannot be uniquely estimated, certain combinations of the overall trend can be uniquely determined, such as $\beta_\pi + \beta_\gamma$, which is called the net drift (Clayton and Schifflers, 1987a, 1987b). Alternative drift estimates covering shorter timespans can also be determined and these have practical significance in that they describe the experience of following a particular age group in time, because both period and cohort will advance together. Curvatures, by way of contrast, are completely determined, including polynomial parameters for the square and higher powers, changes in slopes and second differences. The significance test for any one of the temporal effects in the presence of the other two will generally be a test of the corresponding curvature and not the slope. Holford provides further detail on how software can be set up for fitting these models (Holford, 2004). TRH

Clayton, D. and Schifflers, E. 1987a: Models for temporal variation in cancer rates I: Age-period and age-cohort models. Statistics in Medicine 6, 449-67.
Clayton, D. and Schifflers, E. 1987b: Models for temporal variation in cancer rates II: Age-period-cohort models. Statistics in Medicine 6, 469-81.
Fienberg, S. E. and Mason, W. M. 1979: Identification and estimation of age-period-cohort models in the analysis of discrete archival data. Sociological Methodology 1979, 1-67.
Holford, T. R. 1983: The estimation of age, period and cohort effects for vital rates. Biometrics 39, 311-24.
Holford, T. R. 2004: Temporal factors in public health surveillance: sorting out age, period and cohort effects. In Brookmeyer, R. and Stroup, D. F. (eds), Monitoring the health of populations. Oxford: Oxford University Press, pp. 99-126.
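The identifiability problem in the age-period cohort entry above can be checked numerically. The sketch below is illustrative only: the function names and slope values are hypothetical, and it uses the continuous linear model with cohort = period - age. Adding a constant to the age and cohort slopes while subtracting it from the period slope leaves every fitted value unchanged, and the net drift (period slope plus cohort slope) is also unchanged.

```python
import math


def linear_predictor(age, period, beta0, beta_age, beta_period, beta_cohort):
    """E[Y] under the linear age-period-cohort model, with cohort = period - age."""
    cohort = period - age
    return beta0 + age * beta_age + period * beta_period + cohort * beta_cohort


def rotate_slopes(beta_age, beta_period, beta_cohort, shift):
    """Move the three slopes together: age and cohort up, period down."""
    return beta_age + shift, beta_period - shift, beta_cohort + shift


# Hypothetical slopes, used only to demonstrate the algebra.
original = (0.04, 0.01, 0.02)
rotated = rotate_slopes(*original, shift=0.5)

grid = [(age, period) for age in (30, 40, 50) for period in (1980, 1990, 2000)]
same_fit = all(
    math.isclose(
        linear_predictor(age, period, -5.0, *original),
        linear_predictor(age, period, -5.0, *rotated),
        abs_tol=1e-9,
    )
    for age, period in grid
)

net_drift_original = original[1] + original[2]
net_drift_rotated = rotated[1] + rotated[2]

print("identical fitted values:", same_fit)
print("net drift before and after:", net_drift_original, net_drift_rotated)
```

The data therefore cannot distinguish the two sets of slopes, whereas the net drift is estimable.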

[Figure: age period cohort analysis. Age, period and cohort effects for pre-menopausal breast cancer incidence for SEER. Legend: period slope -0.00, -0.05, -0.10, -0.15, -0.20.]

age-related reference ranges

These are ranges of values of a measurement that identify the upper and lower limit of normality in the population, where the range varies according to the subject's age. Reference ranges are an important part of medical diagnosis, where a continuous measurement (e.g. blood pressure) needs converting to a binary variable for decision-making purposes. If the patient's value lies outside the measurement's reference range it is treated as abnormal and the patient is investigated further.

The construction of reference ranges involves estimating the range of values that covers a specified percentage of the reference population, often 95 %. Usually this is the central part of the distribution with equal tail area probabilities, although in some cases the reference range is bounded at zero or infinity. For normally distributed data the range can be derived from the population mean and STANDARD DEVIATION (SD), the 95 % range, for example, being the mean plus or minus 2 SDs. For nonnormal data the simplest approach is to use quantiles, i.e. rank and count the data, then the 2.5 % and 97.5 % points are the lower and upper limits of the 95 % reference range. However, this is inefficient and requires a large sample. If the data are skew they can be transformed, e.g. to logarithms, and then the reference range can be calculated from the mean and SD on the transformed scale and transformed back to the original scale. A more flexible variant is to use a Box-Cox power transformation (of which the logarithm is a special case), which adjusts for skewness more precisely (see TRANSFORMATIONS).

Age-related reference ranges are reference ranges that depend on age. They arise most commonly in paediatrics, notably for age-related measures of body size like height and weight, which can be displayed as GROWTH CHARTS. The principles of reference range estimation are essentially the same when they are age related, except that the ranges for adjacent age groups need to be consistent. Avoiding discontinuities at the age group boundaries requires the summary statistics that define the reference range (e.g. the mean and SD) to change smoothly with age, but imposing this constraint complicates the fitting process.

For normally distributed homoscedastic data, where the SD is constant across age, the age-related mean can be estimated by LINEAR REGRESSION and the reference range constructed around the regression curve using the residual SD. The regression curve is estimated using a smoothing regression function, e.g. a polynomial, fractional polynomial or generalized additive (cubic spline) curve. If the SD changes with age, as is often the case, a curve of the age-related SD also needs to be estimated by the regression methods of Aitkin (1987) or Altman (1993) and the age-related mean obtained using weighted linear regression with weights corresponding to the inverse square of the age-related SD. The age-related reference range is again constructed around the regression curve using the SD curve.

When the data are skew it may be possible to adjust for the skewness using a single, e.g. logarithmic, transformation at all ages. However, often the degree of skewness is itself age related, although this needs a large sample to show it. In this case an age-related summary statistic for the skewness has to be estimated, along with the age-related mean and SD. The LMS METHOD is a popular way to do this, or alternatively the EN method of Royston and Wright (1998). For more extreme nonnormal data, a nonparametric approach based on QUANTILE REGRESSION is needed, a form of least absolute errors regression, where smooth curves are constructed for the age-related upper and lower limits of the reference range. The figure gives age-related reference ranges for systolic and diastolic blood pressure in boys aged 4-24, estimated by the LMS method.

[Figure: age-related reference ranges. Age-related 95 % reference ranges for blood pressure in boys: systolic (solid lines) and diastolic (dotted lines). Horizontal axis: Age (years).]

There are two advantages of reference ranges based on an underlying frequency distribution, as opposed to being derived from quantile regression. The first is efficiency - the standard errors of the reference range limits are smaller. The second is analytical convenience - data for individuals can be converted to z-SCORES, indicating how many SDs they are above or below the median of the distribution, which is a convenient way of adjusting for age prior to further analysis. TJC

[See also GROWTH CHARTS]
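The simplest (non-age-related) reference range calculations mentioned above can be sketched with the Python standard library. This is a minimal sketch under stated assumptions: the function names are hypothetical, the multiplier of 2 SDs follows the text, and the quantile version relies on `statistics.quantiles` with 40 equal-probability groups so that the first and last cut points are the 2.5 % and 97.5 % points. It does not attempt any age smoothing, the LMS method or quantile regression.

```python
import math
import statistics


def normal_reference_range(values, n_sd=2.0):
    """Mean plus or minus n_sd SDs, assuming approximately normal data."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return mean - n_sd * sd, mean + n_sd * sd


def log_reference_range(values, n_sd=2.0):
    """Range computed on the log scale and transformed back (positive, skew data)."""
    low, high = normal_reference_range([math.log(v) for v in values], n_sd)
    return math.exp(low), math.exp(high)


def quantile_reference_range(values):
    """Empirical 2.5 % and 97.5 % points; needs a large sample to be stable."""
    cut_points = statistics.quantiles(values, n=40)
    return cut_points[0], cut_points[-1]


def z_score(value, centre, sd):
    """Number of SDs a value lies above or below the centre of the distribution."""
    return (value - centre) / sd


# Hypothetical measurements, for illustration only.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
print(normal_reference_range(sample))
print(log_reference_range(sample))
print(quantile_reference_range(list(range(1, 1001))))
```

For age-related ranges the same limits would be built around fitted mean and SD curves rather than single summary values.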

Aitkin, M. 1987: Modelling variance heterogeneity in normal regression using GLIM. Applied Statistics 36, 332-9.
Altman, D. G. 1993: Construction of age-related reference centiles using absolute residuals. Statistics in Medicine 12, 917-24.
Cole, T. J. and Green, P. J. 1992: Smoothing reference centile curves: the LMS method and penalized likelihood. Statistics in Medicine 11, 1305-19.
Koenker, R. W. and D'Orey, V. 1987: Computing regression quantiles. Applied Statistics 36, 383-93.
Royston, P. and Wright, E. M. 1998: A method for estimating age-specific reference intervals ('normal ranges') based on fractional polynomials and exponential transformation. Journal of the Royal Statistical Society, Series A 161, 79-101.

age-specific rates

These are rates calculated within a number of relatively narrow age bands. A crude rate is the number of events occurring in a population during a specified time period divided by an estimate of the size of the population. However, when comparing rates between populations with different age distributions, it is necessary to consider rates at specific ages separately.

In the table, death rates are presented for Costa Rica and the United Kingdom for 1999, derived from data from the United Nations (2002). The final column gives the age-specific rates for broad age bands and the crude (total) rate. The age-specific rate is calculated as the number of deaths in the particular age group divided by the population of that age group. In Costa Rica, the death rate at ages 0-15 is calculated as 1296/1 070 000. The rate is expressed per 1000 persons so the rate is multiplied by 1000 to give the rate of 1.2 per 1000 in the final column of the table.

age-specific rates Population, number of deaths and death rates from all causes for Costa Rica and the United Kingdom for the year 1999

Costa Rica
Age group   Population (100 000s)   % in age group   Deaths    Death rate/1000
0-15        10.7                    32 %             1296      1.2
15-49       17.4                    52 %             2766      1.6
50-69       3.9                     12 %             3447      8.8
70+         1.3                     4 %              7523      56.6
Total       33.4                                     15032     4.5

United Kingdom
Age group   Population (100 000s)   % in age group   Deaths    Death rate/1000
0-15        113.9                   19 %             5850      0.5
15-49       288.0                   48 %             31228     1.1
50-69       126.1                   21 %             120759    9.6
70+         67.0                    11 %             474225    70.8
Total       595.0                                    632062    10.6

The crude (total) rate for Costa Rica is less than half that for the UK. However, at no age is the rate in the UK double that for Costa Rica and for some age groups the rate is higher in Costa Rica than in the UK. Note that the percentages of the population in each age group (third column) differ markedly. The UK population is much older (11 % of the population are over 70 compared with 4 % in Costa Rica). The different age structure explains the misleading comparison between the crude rates. Age-specific rates are cumbersome to compare across a number of populations. Standardisation methods are often used to provide an age-adjusted summary rate for each population.

Many countries publish age-specific rates for all cause and specific causes of death, e.g. the annual publications of the Office for National Statistics (ONS) in England and Wales (ONS, 2002). Age-specific disease incidence rates are also published in various countries, most notably cancer incidence, for which international data are compiled by the International Agency for Research on Cancer (Parkin et al., 2003). Age-specific prevalence rates for exposures such as smoking can also be derived, but are more usually obtained from specific surveys such as the General Household Survey (Walker et al., 2001). HI

[See also CAUSE-SPECIFIC DEATH RATE, STANDARDISED MORTALITY RATIO]

Office for National Statistics 2002: Mortality statistics: cause. Review of the Registrar General on deaths by cause, sex and age, in England and Wales, 2001. London: Office for National Statistics.
Parkin, D. M., Whelan, S., Ferlay, J., Teppo, L. and Thomas, D. B. 2003: Cancer incidence in five continents, vol. VIII. Lyon: IARC Scientific Publications.
United Nations 2002: 2000 demographic yearbook. New York: United Nations.
Walker, A., Maher, J., Coulthard, M., Goddard, E. and Thomas, M. 2001: Living in Britain: results from the 2000 General Household Survey. London: The Stationery Office.
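The age-specific and crude rate arithmetic for the table in this entry can be reproduced with a short sketch. The populations and deaths are the table's totals and its Costa Rican 0-15 band; the function and variable names are hypothetical. Because the table shows populations rounded to the nearest 10 000, rates recomputed this way can differ slightly from printed values that were based on unrounded populations.

```python
def rate_per_1000(deaths, population_100_000s):
    """Deaths divided by population, expressed per 1000 persons."""
    population = population_100_000s * 100_000
    return deaths / population * 1000


# Figures taken from the table (population in 100 000s, deaths).
costa_rica_0_15 = rate_per_1000(1296, 10.7)
costa_rica_crude = rate_per_1000(15032, 33.4)
uk_crude = rate_per_1000(632062, 595.0)

print(f"Costa Rica, ages 0-15: {costa_rica_0_15:.1f} per 1000")
print(f"Costa Rica, crude:     {costa_rica_crude:.1f} per 1000")
print(f"United Kingdom, crude: {uk_crude:.1f} per 1000")
```

The crude comparison (about 4.5 versus 10.6 per 1000) mostly reflects the older UK age structure rather than higher mortality at each age.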

agreement

Agreement in repeated assessments is a fundamental criterion for quality of assessments on rating scales. The use of rating scales and other kinds of ordered classifications of complex qualitative variables is interdisciplinary and unlimited. Rating scale assessments produce ordinal data, the ordered categories representing only a rank order of the intensity of a particular variable and not a numerical value in a mathematical sense, although the use of numerical labelling could give a false impression of quantitative data (see RANK INVARIANCE).

The main quality concepts of scale assessments are reliability and validity. Reliability (see MEASUREMENT PRECISION AND RELIABILITY) refers to the extent to which repeated measurements of the same object yield the same result, which means agreement in repeated assessments of various designs. In interrater reliability (see MEASUREMENT PRECISION AND RELIABILITY) studies are made of the level of agreement between observers that classify the same object or individual, and intrarater reliability (see INTRACLASS CORRELATION COEFFICIENT) studies refer to agreement in test-retest scale assessments by the same rater.

The frequency distribution of pairs of ordinal data is described in a square CONTINGENCY TABLE (see the figure with parts I, II and III), and in the case of continuous assessments on a visual analogue scale, VAS, by a scatter plot. The percentage agreement (PA) is a basic agreement measure. When the agreement is unsatisfactorily small the reasons for disagreement can be evaluated by a statistical method that takes account of the rank-invariant properties of ordinal data and that makes it possible to identify and measure systematic disagreement, when present, separately from disagreement caused by individual variability in assessments. Systematic disagreement is population based and reveals a systematic change in conditions or memory bias between test-retest assessments, or between raters who interpret the scale categories differently. Large individual variability, on the other hand, is a sign of poor quality of a rating scale as it allows for uncertainty in repeated assessments.

The presence of systematic disagreement in the use of the scale categories between the two assessments is revealed by different frequency distributions, i.e. different marginal distributions (parts I and II). Systematic disagreement regarding the categorical levels and in the way of concentrating the assessments on the categories is measured by the relative position (RP) and the relative concentration (RC) respectively. The RP expresses the extent to which the marginal distribution of assessments Y is shifted towards higher categories than the marginal distribution of X, rather than the opposite. A theoretical description is the difference between the probabilities $P(X < Y) - P(Y < X)$. Possible values of RP range from (-1) to 1, and RP is positive when higher scale categories are more frequently used in the assessments Y than in X when compared with the opposite. Correspondingly, the RC expresses the extent to which the marginal distribution of Y assessments is more concentrated to central scale categories than is the marginal distribution of X, theoretically described by the difference in probabilities $P(X_l < Y_k < X_\nu) - P(Y_l < X_k < Y_\nu)$. Possible values range from (-1) to 1, and a positive RC indicates that the assessments Y are more concentrated than X. Zero or very small values of both RP and RC mean that the systematic part of an observed disagreement in paired assessments is negligible.

Systematic disagreement is evident by the marginal heterogeneity, and by pairing off the two sets of marginal frequencies, the so-called rank-transformable pattern of agreement (RTPA) is constructed. The RTPA describes the expected pattern in the case of systematic disagreement only. All pairs of observations of the RTPA will have the same rank ordering in the two assessments provided that the ranks are tied to the cells, which is the definition of the augmented ranking procedure (aug-ranks) (see RANKING). Part II in the figure is the RTPA of the pattern in part I. The observed distribution of pairs in part I deviates from this RTPA, which means that some of the pairs of aug-ranks given to the observations differ. The relative rank variance (RV) is a rank-based measure of this observed individual variability, i.e. unexplained by the measures of systematic disagreement:

\[ RV = \frac{6}{n^3} \sum_{i} \sum_{j} x_{ij} (\Delta \bar{R}_{ij})^2 \]

where $n$ is the number of paired assessments, $x_{ij}$ is the number of pairs in the $ij$th cell and $(\Delta \bar{R}_{ij})^2$ is the square of the mean aug-rank difference of the $ij$th cell, and the summation is made over all cells $ij$ of the $m \times m$ square table, $0 \le RV \le 1$ (Svensson et al., 1996; Svensson, 1998a).

[Figure: agreement. Examples of paired ordinal data from interrater assessments (Rater Y by Rater X) on a four-point scale with the ordered categories labelled A < B < C < D. The rank-transformable pattern of agreement (RTPA) is shaded. The measures of percentage agreement (PA), the relative position (RP), the relative concentration (RC) and the relative rank variance (RV) are given. Part I: PA 12 %, RP -0.49, RC 0.16, RV 0.08. Part II: RP -0.49, RC 0.16, RV 0. Part III: PA 62 %, RP = RC = 0, RV 0.05.]

Cohen's coefficient kappa ($\kappa$) is a commonly used measure of agreement adjusted for the chance expected agreement (see KAPPA AND WEIGHTED KAPPA). The calculations of Cronbach's alpha and other so-called reliability coefficients are based on the assumption of quantitative, normally distributed data, which is not achievable in data from rating scales. There is also a widespread misuse of the correlation coefficient as a reliability measure. The correlation coefficient (see CORRELATION) measures the degree of association between two variables and does not measure the level of agreement.

In part I of the figure the PA is 12 %, and the observed disagreement is mainly explained by a systematic disagreement in position. The negative RP value (-0.49) and the RTPA (part II) show that the assessments Y systematically used lower categories than X. A slight additional individual variability, RV = 0.08, is observed. SPEARMAN'S RANK CORRELATION COEFFICIENT, $r_s$, is 0.66 in part I of the figure and 0.97 in part II, ignoring the fact that the assessments are systematically biased and unreliable. The same holds for the coefficient kappa (-0.14). In part III the marginal homogeneity and the zero RP and RC values confirm that the disagreement (38 %) is entirely explained by slight individual dispersion (RV = 0.05) from the RTPA, which is the main diagonal in this case. The $r_s$ is 0.61 and the $\kappa$ is 0.45.

Besides reliability studies, the level of disagreement is of main interest in paired assessments 'before and after' treatment for analysis of change in outcome or treatment effect. In this application of the disagreement measures, nonzero RP and RC values indicate the level of common group change in outcomes, and the heterogeneity in changes among the individuals is measured by the RV (Svensson, 1998b). ES

Svensson, E. 1998a: Application of a rank-invariant method to evaluate reliability of ordered categorical assessments. Journal of Epidemiology and Biostatistics 3, 403-9.
Svensson, E. 1998b: Ordinal invariant measures for individual and group changes in ordered categorical data. Statistics in Medicine 17, 2923-36.
Svensson, E., Starmark, J.-E., Ekholm, S., von Essen, C. and Johansson, A. 1996: Analysis of inter-observer disagreement in the assessment of subarachnoid blood and acute hydrocephalus on CT scans. Neurological Research 18, 487-94.
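Two of the measures in the agreement entry above, the percentage agreement and the relative position, can be sketched directly from a square contingency table. This is an illustrative sketch only: the 3 x 3 table is hypothetical (it is not the figure's data), the function names are invented for the example, and RP is computed from the two marginal distributions as the difference P(X < Y) - P(Y < X) described in the text. The relative concentration and relative rank variance are not implemented here.

```python
def percentage_agreement(table):
    """Percentage of pairs on the main diagonal of a square table."""
    n = sum(sum(row) for row in table)
    agree = sum(table[i][i] for i in range(len(table)))
    return 100.0 * agree / n


def relative_position(table):
    """P(X < Y) - P(Y < X) from the marginals; rows are Y, columns are X,
    both ordered from the lowest to the highest category."""
    m = len(table)
    n = sum(sum(row) for row in table)
    y_marginal = [sum(table[i]) for i in range(m)]
    x_marginal = [sum(table[i][j] for i in range(m)) for j in range(m)]

    p_x_below_y = 0.0
    p_y_below_x = 0.0
    cum_x = 0  # number of X assessments in categories below the current one
    cum_y = 0  # number of Y assessments in categories below the current one
    for k in range(m):
        p_x_below_y += y_marginal[k] * cum_x / n**2
        p_y_below_x += x_marginal[k] * cum_y / n**2
        cum_x += x_marginal[k]
        cum_y += y_marginal[k]
    return p_x_below_y - p_y_below_x


# Hypothetical paired assessments: rows = rater Y, columns = rater X.
example = [
    [5, 3, 0],
    [1, 6, 2],
    [0, 1, 2],
]
pa = percentage_agreement(example)
rp = relative_position(example)
print(f"PA = {pa:.0f} %, RP = {rp:.3f}")
```

A negative RP, as in this hypothetical table, indicates that rater Y tends to use lower categories than rater X.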

Akaike's information criterion

Akaike's information criterion (AIC) is an index used to discriminate between competing models. It is widely used when there is the issue of model choice where we wish to find the most parsimonious model (see Akaike, 1974). Often there may be a number of possible models that can be fitted to the data, from which parameters can be estimated using, for example, MAXIMUM LIKELIHOOD ESTIMATION. Generally, complex models are more flexible, but contain a relatively large number of parameters, whereas simpler models with fewer parameters may compromise the fit of the model to the data. Essentially, the AIC statistic compares competing models by considering the trade-off between the complexity of the model and the corresponding fit of the model to the data. The AIC statistic is widely used, particularly as it can be used to compare even nonnested models when likelihood ratio tests cannot be applied. Let $x$ denote the data and $\hat{\theta}$ the corresponding maximum likelihood estimates (MLEs) of the parameters. Then, the AIC for a given model is given by:

$$\mathrm{AIC} = -2 \log L(\hat{\theta}; x) + 2p$$

where $p$ denotes the number of parameters in the given model being fitted to the data and $\log L(\hat{\theta}; x)$ the corresponding log-likelihood evaluated at the MLEs of the parameters. The AIC statistic is calculated for each possible model being considered. The model deemed optimal is the one with the smallest AIC value, i.e. a model with a relatively small number of parameters that adequately fits the data. The AIC is generally easy to calculate given the maximum of the likelihood function and is very versatile, allowing us to compare, for example, nonnested models. We note that corrections have been suggested to the AIC statistic to allow for data with overdispersion (denoted by QAIC) and small sample sizes ($\mathrm{AIC}_c$). See, for example, Burnham and Anderson (2002), Sections 2.4-5.

The AIC statistic has also been used to compare the performance of different models, relative to each other (Buckland, Burnham and Augustin, 1997; Burnham and Anderson, 2002, Section 2.6). It is not the absolute values of the AIC statistics that are important but their relative values, in particular their difference. For each model the term $\Delta\mathrm{AIC} = \mathrm{AIC} - \min \mathrm{AIC}$ is calculated, where $\min \mathrm{AIC}$ is the value of the AIC statistic for the model deemed optimal. Clearly, $\Delta\mathrm{AIC} = 0$ for the model deemed optimal; the larger the value of $\Delta\mathrm{AIC}$ the poorer the model. The relative penalised likelihood weights $w_i$ can also be calculated for each model $i = 1, \ldots, m$, where:

$$w_i = \frac{\exp(-\Delta\mathrm{AIC}_i/2)}{\sum_{j=1}^{m} \exp(-\Delta\mathrm{AIC}_j/2)}$$

and $\Delta\mathrm{AIC}_i$ denotes the corresponding $\Delta\mathrm{AIC}$ value associated with model $i$. The weights provide a scale to interpret the difference in values for the models. Finally, these model weights can be used to obtain a (weighted) model-averaged estimate of parameters of interest. RK

[See also DEVIANCE, LIKELIHOOD RATIO]

Akaike, H. 1974: A new look at the statistical model identification. IEEE Transactions on Automatic Control AC-19, 716-23. Buckland, S. T., Burnham, K. P. and Augustin, N. H. 1997: Model selection: an integral part of inference. Biometrics 53, 603-18. Burnham, K. P. and Anderson, D. R. 2002: Model selection and multimodel inference, 2nd edition. Heidelberg: Springer Verlag.
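The AIC, the differences from the best model and the resulting model weights can be computed in a few lines. The following is a minimal Python sketch only; the function names and the three log-likelihood values are hypothetical and chosen purely for illustration.

```python
import math


def aic(log_likelihood, n_params):
    """AIC = -2 log L + 2p, with log L evaluated at the MLEs."""
    return -2.0 * log_likelihood + 2.0 * n_params


def akaike_weights(aic_values):
    """Return (delta_aic, weights) for a list of AIC values."""
    best = min(aic_values)
    deltas = [a - best for a in aic_values]
    relative = [math.exp(-d / 2.0) for d in deltas]
    total = sum(relative)
    return deltas, [r / total for r in relative]


# Hypothetical maximised log-likelihoods and parameter counts (assumed values).
models = {"A": (-100.0, 2), "B": (-98.5, 3), "C": (-98.4, 5)}
aics = {name: aic(ll, p) for name, (ll, p) in models.items()}
deltas, weights = akaike_weights(list(aics.values()))

for name, d, w in zip(aics, deltas, weights):
    print(f"model {name}: AIC={aics[name]:.1f}  dAIC={d:.1f}  weight={w:.3f}")
```

In this made-up example model C has the highest likelihood but also the most parameters, so the penalty term leaves model B with the smallest AIC and the largest weight.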


allelic association

This is an association between two alleles (at two different loci), or between an allele and a phenotypic trait, in the population. Since humans are diploid a more technical definition of the former is necessary: two alleles are associated if their frequency of co-occurrence in the same haplotype (i.e. the genetic material transmitted from one parent) is greater than the product of the marginal frequencies of the two alleles. Association between two alleles is also known as linkage disequilibrium. The reason is that, in a large population under random mating, the extent of association between two alleles (as measured by the difference between the frequency of the haplotype containing the two alleles and the product of the frequencies of the two alleles) decreases by a factor equal to one minus the recombination fraction (see GENETIC LINKAGE) between the two loci, per generation. Thus allelic association represents a state of disequilibrium that tends to dissipate, at a rate determined by the strength of linkage, towards the state of equilibrium between the two alleles, when the frequency of the haplotype is equal to the product of the frequencies of the two constituent alleles.

Associations between two alleles can arise in a population for a number of reasons. The mutation that gave rise to the more recent allele may have occurred on a chromosome that happened to contain the other allele. Random genetic drift during a population bottleneck may have led to the overrepresentation of some haplotypes. The mixing of two populations with different allele frequencies may have resulted in associations between alleles in the overall population. When, for any of these reasons, such allelic associations arose many generations ago, only those occurring between tightly linked loci are likely to have persisted to the current generation. We would therefore expect an imperfect inverse relationship between the extent of association between two alleles and the distance between them.

An association between an allele and a disease may be the result of a direct causal relationship. In other words, the allele is a causal variant that is functional and increases the risk of the disease. However, it could also be indirect, with the allele being in linkage disequilibrium with a causal variant. The presence of linkage disequilibrium between tightly linked loci means that it is possible to screen a chromosomal region for a causal variant without examining all the alleles, only a sufficient number to ensure that any causal variant in the region is likely to be in linkage disequilibrium with one or more of the alleles examined. The polymorphisms chosen to represent themselves and the associated polymorphisms in their vicinity in an association study are called tag polymorphisms. The International HapMap Project (www.hapmap.org) has characterised the pattern of allelic associations among over 3 million single nucleotide polymorphisms (SNPs) in the human genome in three major populations (Europeans, Africans and Asians).

Classical epidemiological designs (CASE-CONTROL STUDIES, COHORT STUDIES, CROSS-SECTIONAL STUDIES) are readily applicable to the study of disease-allele associations, as are the statistical methods developed for these designs (e.g. LOGISTIC REGRESSION, SURVIVAL ANALYSIS). These designs are potentially susceptible to the problem of hidden population stratification, which can lead to spurious associations or mask true associations. Family-based association designs are robust to population stratification and usually consist of the use of either parental or sibling controls. Methods for the analysis of matched samples, such as McNEMAR'S TEST (also called the transmission disequilibrium test in the context of parental controls) and CONDITIONAL LOGISTIC REGRESSION, are applicable to these designs. The study of disease-allele associations is a complementary strategy to linkage analysis, in the localisation and identification of genes that increase the risk of disease. In general, allelic association is unlikely to be detected when the marker locus is quite far (>1 megabase) from the disease locus, but can be much more powerful than linkage when the marker locus is close enough to the disease locus to be in substantial linkage disequilibrium with it, particularly when the effect size of the disease locus is small. For this reason, allelic association is particularly appealing for searching regions that demonstrate linkage to the disease or for the investigation of specific candidate genes. However, technological developments have enabled the efficient genotyping of up to 1 million SNPs in a single array, and this has led to association studies on the whole-genome scale (called genome-wide association studies, or GWAS) that have coverage of over 90% of common variants (allele frequency > 5%) in the genome. PS
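The measure of association between two alleles described in this entry, and its decay under random mating, can be illustrated numerically. The following is a minimal Python sketch, not part of the original entry; the function names and the haplotype and allele frequencies are hypothetical.

```python
def linkage_disequilibrium(p_haplotype, p_allele_a, p_allele_b):
    """Haplotype frequency minus the product of the two allele frequencies."""
    return p_haplotype - p_allele_a * p_allele_b


def decayed(d0, recombination_fraction, generations):
    """Disequilibrium after the given number of generations of random mating.

    Each generation multiplies the disequilibrium by one minus the
    recombination fraction between the two loci.
    """
    return d0 * (1.0 - recombination_fraction) ** generations


# Hypothetical frequencies: haplotype 0.30, alleles 0.50 and 0.40.
d0 = linkage_disequilibrium(0.30, 0.50, 0.40)

# Tightly linked loci (fraction 0.01) versus unlinked loci (fraction 0.5).
tight = decayed(d0, 0.01, 100)
loose = decayed(d0, 0.5, 100)
print(f"initial D = {d0:.3f}; after 100 generations: "
      f"tight = {tight:.4f}, unlinked = {loose:.2e}")
```

With these assumed values the association between tightly linked loci is still clearly present after 100 generations, whereas for unlinked loci it has effectively vanished, which is the basis for expecting persistent association only over short distances.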

all subsets regression

A form of regression in which all possible models are compared using some appropriate criterion for indicating the 'best' models. If there are $p$ explanatory variables in the data, there are a total of $2^p - 1$ possible regression models because each explanatory variable can be in or out of the model and the model containing no explanatory variables is excluded. One possible criterion for comparing models is the MALLOWS' $C_p$ STATISTIC and to illustrate its use we will apply it to data that arise from a study of 25 patients with cystic fibrosis reported in O'Neill et al. (1983), and also given in Altman (1991). Data for the first three patients are given in the first table. The dependent variable in this case is a measure of malnutrition (PEmax). Some of the models considered in the all subsets regression of these data are shown in the second table, together with their associated $C_p$ values, where $p$ refers to the number of parameters in a particular model, i.e. a model that includes a subset of $p - 1$ of the explanatory variables plus an intercept.

all subsets regression  Cystic fibrosis data; first three subjects

Sub  Age  Sex  Height  Weight  BMP  FEV  RV   FRC  TLC  PEmax
1    7    0    109     13.1    68   32   258  183  137  95
2    7    1    112     12.9    65   19   449  245  134  85
3    8    0    124     14.1    65   22   441  268  147  100

Sub: subject number
Sex: 0 = male, 1 = female
BMP: body mass (weight/height^2) as a percentage of the age-specific median in normal individuals
FEV: forced expiratory volume in one second
RV: residual volume
FRC: functional residual capacity
TLC: total lung capacity
PEmax: maximal static expiratory pressure (cmH2O)

all subsets regression  Some of the models fitted in the all subsets regression of the cystic fibrosis data (size is one more than the number of variables in a model, to include the intercept)

Model  Size  Terms                                              Cp
7      2     Sex                                                17.24
14     3     Sex, weight                                        4.63
21     4     Age, FEV, RV                                       2.62
28     4     Age, BMP, FEV                                      4.5
35     5     Sex, weight, BMP, FEV                              2.95
42     6     Age, weight, BMP, FEV, RV                          2.8
49     6     Age, sex, height, FEV, TLC                         6.99
56*    7     Age, sex, height, FEV, RV, TLC                     7.06
63     8     Sex, weight, BMP, FEV, RV, FRC, TLC                6.49
70     9     Age, height, weight, BMP, FEV, RV, FRC, TLC        8.06
77     9     Age, sex, height, BMP, FEV, RV, FRC, TLC           10.29

* Models close to the line Cp = p.

If $C_p$ is plotted against $p$, the subsets of explanatory variables most worth considering in trying to find a parsimonious model are those lying close to the line $C_p = p$. All subsets regression has been found to be particularly useful in applications of COX'S REGRESSION MODEL (see Kuk, 1984). SSE

[See also MULTIPLE LINEAR REGRESSION]

Altman, D. G. 1991: Practical statistics for medical research. London: CRC/Chapman & Hall. Kuk, A. Y. C. 1984: All subsets regression in a proportional hazards model. Biometrika 71, 587-92. O'Neill, S., Leahy, F., Pasterkamp, H. and Tal, A. 1983: The effects of chronic hyperinflation, nutritional status and posture on respiratory muscle strength in cystic fibrosis. American Review of Respiratory Disorders 128, 1051-4.
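The count of candidate models and the "close to the line Cp = p" rule can be checked with a short script. This is an illustrative Python sketch only: the variable list and the (model, size, Cp) triples are as read from the two tables of this entry, and the distance |Cp - p| is a simple stand-in (an assumption) for judging closeness by eye on a plot.

```python
from itertools import combinations

# The nine explanatory variables in the cystic fibrosis data.
variables = ["Age", "Sex", "Height", "Weight", "BMP", "FEV", "RV", "FRC", "TLC"]

# Every non-empty subset of the explanatory variables is one candidate model.
subsets = [
    subset
    for size in range(1, len(variables) + 1)
    for subset in combinations(variables, size)
]
print(len(subsets))  # 2**9 - 1 candidate models

# (model number, size p including the intercept, Cp) as read from the table.
table = [
    (7, 2, 17.24), (14, 3, 4.63), (21, 4, 2.62), (28, 4, 4.5),
    (35, 5, 2.95), (42, 6, 2.8), (49, 6, 6.99), (56, 7, 7.06),
    (63, 8, 6.49), (70, 9, 8.06), (77, 9, 10.29),
]

# Rank the tabulated models by their distance from the line Cp = p.
ranked = sorted(table, key=lambda row: abs(row[2] - row[1]))
closest = ranked[0]
print(closest)
```

Among the tabulated models, the one flagged in the table as lying close to the line is also the one with the smallest |Cp - p|.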

alternative hypothesis

See HYPOTHESIS TESTS

AMOS

See STRUCTURAL EQUATION MODELLING SOFTWARE

analysis of covariance (ANCOVA, ANOCOVA)

This is an extension of the analysis of variance (ANOVA) that incorporates a continuous explanatory variable. Where ANOVA aims to detect if there is a change in the mean value of a variable across two or more groups, ANCOVA (or rarely ANOCOVA) does the same but adjusts for a continuous covariate. Most commonly this covariate will be a baseline measurement, allowing the analysis to adjust for initial variation between participants and isolate the effects due to the treatment factor. However, sometimes a different covariate is used. For example, Karhunen et al. (1994) consider the association between alcohol intake (divided into four categories) and numbers of Purkinje cells. In doing so they introduce age as a continuous covariate in order to 'control' or 'adjust' for the effects of age on cell numbers. Under other circumstances the authors could have been interested in the effects of age and wanted to adjust for alcohol intake. Despite being the same analysis computationally, this is not typically what is thought of as analysis of covariance and might more commonly be presented as a 'regression'. Indeed the various analysis of variance methods can all be viewed from within a regression framework, which demonstrates that ANCOVA can be extended to cope with more than one continuous covariate.

Mathematically, ANCOVA follows a similar path to that for ANOVA and the output is usually summarised in a similar table, although the details may vary. The promised benefits of the analysis of covariance are clear. If one has an unbalanced observational study, then ANCOVA can adjust for differences in baseline values and remove a potential bias from the results. By the same token, if one has a randomised trial that is naturally balanced, then ANCOVA reduces the amount of unexplained variation in the data and thus increases the power of the test.

However, ANCOVA can only be employed if the appropriate assumptions are met. These include those of ANOVA (i.e. normality of residuals, homoscedasticity) as well as the appropriateness of the ANCOVA model. Is the relationship with the covariate truly linear? Does the effect of the covariate vary between groups? Failing to meet these assumptions can lead to the introduction of important but subtle biases. It is a frequent concern that medical research papers report a covariate as having been 'controlled' or 'adjusted' for, with no evidence that the control or adjustment was appropriate. For further details see Altman (1991), Owen and Froman (1998), Miller and Chapman (2001) and Vickers and Altman (2001). AGL

[See also GENERALISED LINEAR MODEL]

Altman, D. G. 1991: Practical statistics for medical research. London: Chapman & Hall. Karhunen, P. J., Erkinjuntti, T. and Laippala, P. 1994: Moderate alcohol consumption and loss of cerebellar Purkinje cells. British Medical Journal 308, 1663-7. Miller, G. A. and Chapman, J. P. 2001: Misunderstanding analysis of covariance. Journal of Abnormal Psychology 110, 40-8. Owen, S. V. and Froman, R. D. 1998: Uses and abuses of the analysis of covariance. Research in Nursing and Health 21, 557-62. Vickers, A. J. and Altman, D. G. 2001: Analysing controlled trials with baseline and follow-up measurements. British Medical Journal 323, 1123-4.
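The adjustment that ANCOVA makes for a baseline covariate can be sketched with the classical pooled within-group slope. The Python below is a minimal illustration under the stated ANCOVA assumptions (a linear covariate effect with the same slope in every group); the function names and the tiny data set are hypothetical and deliberately noise-free.

```python
def mean(values):
    return sum(values) / len(values)


def ancova_adjusted_means(groups):
    """groups maps a label to (covariate values, outcome values).

    Estimates a common (pooled within-group) slope for the covariate and
    returns it together with each group's outcome mean adjusted to the
    overall covariate mean.
    """
    sxy = sxx = 0.0
    all_x = []
    for x, y in groups.values():
        mx, my = mean(x), mean(y)
        sxy += sum((a - mx) * (b - my) for a, b in zip(x, y))
        sxx += sum((a - mx) ** 2 for a in x)
        all_x.extend(x)
    slope = sxy / sxx
    grand_x = mean(all_x)
    adjusted = {
        label: mean(y) - slope * (mean(x) - grand_x)
        for label, (x, y) in groups.items()
    }
    return slope, adjusted


# Hypothetical baseline (x) and follow-up (y) values in two unbalanced arms.
data = {
    "control": ([1.0, 2.0, 3.0], [3.0, 5.0, 7.0]),
    "treatment": ([3.0, 4.0, 5.0], [10.0, 12.0, 14.0]),
}
slope, adjusted = ancova_adjusted_means(data)
raw_diff = mean(data["treatment"][1]) - mean(data["control"][1])
adj_diff = adjusted["treatment"] - adjusted["control"]
print(slope, raw_diff, adj_diff)
```

Because the treatment arm in this made-up example starts with higher baseline values, the unadjusted difference in means overstates the group effect; the adjusted difference removes the part attributable to the baseline imbalance.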

analysis of variance (ANOVA)

Often referring to the one-way analysis of variance, it is a test for a common MEAN in multiple groups that we describe in detail here. Analysis of variance frequently arises in the comparison of more complicated models, but the same logical arguments apply. In all cases, the underlying concept is to partition the observed variance into quantities attributable to specific explanatory sources, and then consider important those sources that explain 'more than their fair share' of the variance.

Despite the confusion sometimes caused by the name, the one-way analysis of variance is a method for testing to see whether multiple samples come from populations that share the same mean. In this respect it can be viewed as an extension to the t-test, which assesses whether samples from two populations share a common mean. An analysis of variance performed on two samples is equivalent to performing a t-test.

ANOVA assumes that all the samples come from populations with a NORMAL DISTRIBUTION that share the same VARIANCE. It can be viewed in a number of ways, but essentially compares the estimate of the variance obtained within samples (that makes no assumption that the populations have a common mean) with an estimate of the variance from the sample means (which will require the assumption that the populations have the same mean). If the two estimates of the variance are different, then this is evidence that our assumption of equality failed and, therefore, that the populations do not all have the same mean.

Note that the variance of a single sample is estimated as the sum of squared differences from the mean divided by the sample size minus one. The sum of squared differences term is interpretable as a measure of the total variation in the sample. In the analysis of variance, by combining all groups together, one can calculate this measure for all the data. This is termed the 'total sum of squares' or 'total SS'. Variation in the data is either 'between' or 'within' the samples. The 'within groups sum of squares' or 'within SS' can be calculated as the sum of squared differences from the individual sample means (rather than the differences from the overall mean that produced the total SS). 'Between groups sum of squares' or 'between SS' can be calculated directly, but is most easily calculated by subtraction of the within SS from the total SS.

The two estimates of the variance (or 'mean square' as it is often termed in this context) can then be calculated. The between groups mean square is equal to the between SS divided by the number of groups minus one. The within groups mean square is equal to the within SS divided by the number of observations minus the number of groups. An F-statistic is then calculated as the between groups variance divided by the within groups variance. Under the assumptions of normality and homoscedasticity (common variance) this statistic will be an observation from an F-DISTRIBUTION if the groups come from populations with a common mean. The DEGREES OF FREEDOM of the F-distribution are the number of groups minus one and the number of observations minus the number of groups. From the F-distribution, we can calculate the probability of observing such an extreme value of the F-statistic if the populations have a common mean.

This is a one-tailed test. If the value is unusually small, this suggests the between groups variance is unusually small and so is not evidence of variation between the groups. Therefore, the test is to find the probability, if the populations do have a common mean, of observing a value greater than that observed.

A natural way of presenting ANOVA is the ANOVA table. Given N observations that fall into k groups, it is necessary to calculate the total SS and the within SS as described earlier and then the analysis can be completed as presented in the first table.

analysis of variance (ANOVA)  The analysis of variance table

Source of variance  Degrees of freedom  Sums of squares                    Mean squares                     F                      P
Between groups      k - 1               Between SS = Total SS - Within SS  Between MS = Between SS/(k - 1)  Between MS/Within MS   P
Within groups       N - k               Within SS                          Within MS = Within SS/(N - k)
Total               N - 1               Total SS

Murphy et al. (1994) conducted an analysis of variance to see if milk consumption before the age of 25 affects bone density of the hip in later life. A total of 248 women participated in this part of their study (N = 248) and were divided into groups that represent low, medium and high milk consumptions (k = 3). The samples had similar variances and so at least one of the assumptions for ANOVA was satisfied. As is common for reasons of space, the ANOVA table was not presented in the published paper, just the P-value, but enough data were presented for an approximate reconstruction. We can infer that the within SS is approximately 4.4 and the between SS is approximately 0.15. This leads to an F-statistic of approximately 4. From the reported P-value (0.023), it can be calculated from the F-distribution (with 2 and 245 respectively for numerator and denominator degrees of freedom) that the F-statistic was 3.8. The conclusion then is that there is evidence that these samples do not come from populations that share a common mean. The reconstructed table is presented in the second table (entries in bold in this table were inferred from the paper, the rest simply follow from the calculations).

analysis of variance (ANOVA)  Approximate reconstruction of the analysis of variance table from Murphy et al. (1994)

Source of variance  Degrees of freedom  Sums of squares  Mean squares  F    P
Between groups      2                   0.15             0.08          3.8  0.023
Within groups       245                 4.4              0.02
Total               247                 4.6

It is preferable to conduct an analysis of variance rather than to conduct t-tests between all pairs of groups. ANOVA avoids problems of multiple testing and thus keeps control of the SIGNIFICANCE LEVEL. Having conducted an ANOVA and rejected the hypothesis of common means, it may then be desired to test to see which groups are responsible (although a plot of the data might be as informative). In this case, care must be taken to correct for the problems of making MULTIPLE COMPARISONS.

It is important to take note of the assumptions being made, rather than simply ignoring them. ANOVA can be quite robust to variation from normality, but heteroscedasticity can be a serious problem. Residual plots can be used to help assess the normality and BOXPLOTS can be used to help assess the heteroscedasticity. Possible formal tests for the assumptions are the KOLMOGOROV-SMIRNOV TEST and LEVENE'S TEST respectively. If the assumptions do not hold, then TRANSFORMATION of the data might correct this. Otherwise a number of nonparametric alternatives to ANOVA exist, the most commonly used being the KRUSKAL-WALLIS TEST and the FRIEDMAN TEST.

The one-way analysis of variance is appropriate when our data are simply divided into a number of groups. There are many other forms of analysis of variance. The TWO-WAY ANALYSIS OF VARIANCE should be used when the groups are defined by two factors. Suppose, for example, we had six groups: the three groups of women in Murphy et al. (1994) and three groups of men at the same levels of milk consumption. Rather than a one-way analysis of variance, a two-way analysis of variance with gender and milk consumption as the two factors would be appropriate in this instance.

If the data are multiple observations from the same subjects, perhaps measurements of cholesterol levels 0, 7, 14, 21 and 28 days after starting a new diet on several individuals, then a REPEATED MEASURES ANALYSIS OF VARIANCE would be appropriate. This is a special case of the two-way ANOVA and can be viewed as an extension of the paired sample t-test. If there are observations of more than one characteristic from the individuals in several groups, e.g. measures of both the diastolic and systolic blood pressure, then a multivariate analysis of variance (MANOVA) can be used. If, however, it is desired to correct for a measured baseline covariate, such as body mass index, in the analysis, then an ANALYSIS OF COVARIANCE (ANCOVA) may be used.

All these techniques could be implemented through a regression framework, in most cases MULTIPLE LINEAR REGRESSION. The advantages of doing so would be the transition from the use of a HYPOTHESIS TEST to an actual estimate of effect sizes. This approach would also allow more flexibility; for instance in the case of Murphy et al. (1994) we could account for the natural ordering of the levels of milk consumption that ANOVA ignores. As a general principle, estimation and modelling are usually preferred to testing of hypotheses. For further details see Altman (1991) and Altman and Bland (1996). AGL

Altman, D. G. 1991: Practical statistics for medical research. London: Chapman & Hall. Altman, D. G. and Bland, J. M. 1996: Statistics notes: comparing several groups using analysis of variance. British Medical Journal 312, 1472-3. Murphy, S., Khaw, K.-T., May, H. and Compston, J. E. 1994: Milk consumption and bone mineral density in middle aged and elderly women. British Medical Journal 308, 939-41.
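The arithmetic behind the ANOVA table can be followed step by step in code. The Python below is a minimal sketch using only the quantities defined in this entry; the three small samples are hypothetical, and the P-value is omitted because it requires the F-distribution, which is not available in the Python standard library.

```python
def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, mean squares and F."""
    values = [v for g in groups for v in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    # Total SS: squared differences from the overall mean.
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    # Within SS: squared differences from each sample's own mean.
    within_ss = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        within_ss += sum((v - group_mean) ** 2 for v in g)
    # Between SS obtained by subtraction, as described in the entry.
    between_ss = total_ss - within_ss
    between_ms = between_ss / (k - 1)
    within_ms = within_ss / (n - k)
    return {
        "total_ss": total_ss,
        "within_ss": within_ss,
        "between_ss": between_ss,
        "df": (k - 1, n - k),
        "between_ms": between_ms,
        "within_ms": within_ms,
        "F": between_ms / within_ms,
    }


# Hypothetical samples from three groups.
samples = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
result = one_way_anova(samples)
print(result["df"], round(result["F"], 2))

# F implied by the approximate sums of squares quoted for Murphy et al. (1994).
f_murphy = (0.15 / 2) / (4.4 / 245)
print(round(f_murphy, 2))
```

The second calculation shows why the rounded sums of squares give an F-statistic of "approximately 4", slightly different from the 3.8 implied by the reported P-value.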

area under the curve (AUC)

This is a simple and useful method of obtaining a summary measure from plotted data. Medical research is frequently concerned with serial data, as in repeated measurements (see REPEATED MEASURES ANALYSIS OF VARIANCE) on a subject over time, e.g. blood aspirin concentration measured at various times over a 2-hour interval (Matthews et al., 1990). Say we have $n$ measurements $y_i$ taken at times $t_i$ ($i = 1, \ldots, n$). Such data are frequently exhibited by plotting $y_i$ versus $t_i$ and joining the resulting points by straight-line segments resulting in a 'curve'. The resulting area under the curve (AUC) is often used as a single-number summary measure for the individual subject. Further analysis of the subjects or comparison of groups of subjects is carried out based on the summary measures. The AUC for the set of points $(y_i, t_i)$, $i = 1, \ldots, n$, is typically calculated by the trapezium rule:

$$\mathrm{AUC} = \frac{1}{2} \sum_{i=1}^{n-1} (t_{i+1} - t_i)(y_i + y_{i+1})$$

The AUC is used as a summary measure in many areas of medical research, including bioequivalence and pharmacokinetics. It plays an especially important role in the analysis of RECEIVER OPERATING CHARACTERISTIC (ROC) CURVES. The area under the ROC curve of a diagnostic marker (test) measures the ability of the marker to discriminate between healthy and diseased subjects. It is the most commonly used measure of performance of a marker. We use the convention that larger marker values are more indicative of disease. Then if we randomly pick one subject from the healthy population and one from the diseased population we would 'expect' that the value of the marker for the healthy subject would be smaller than the corresponding value for the diseased subject. AUC is the probability that this, in fact, occurs. The larger the AUC, the better the overall discriminatory accuracy of the marker. An area of 1 represents a perfect test while an area of 1/2 represents a worthless test having a discriminatory ability that is the equivalent of differentiating between healthy and diseased subjects by a fair coin toss. Consider the example discussed in the entry for the ROC curve. The points on the curve are given in the table.

area under the curve  Summary data used in an ROC curve

Specificity (y_i)      0  0.56  0.84  0.94  0.98  1.00
1 - Sensitivity (t_i)  0  0.04  0.12  0.32  0.60  1.00

The data presented result in an AUC as follows:

$$\mathrm{AUC} = 0.5\,[(0.04 - 0)(0 + 0.56) + (0.12 - 0.04)(0.56 + 0.84) + (0.32 - 0.12)(0.84 + 0.94) + (0.60 - 0.32)(0.94 + 0.98) + (1.00 - 0.60)(0.98 + 1.00)] = 0.91$$

An area of 0.91 indicates the high discriminatory ability of the marker. For the ROC curve, estimating the area by the trapezium rule is equivalent to computing the Wilcoxon or Mann-Whitney statistic divided by the product of the sample sizes on the healthy and diseased populations. For smoothed ROC curves, alternative estimates of the AUC are available (Faraggi and Reiser, 2002). The effectiveness of alternative diagnostic markers is usually studied by comparing their AUCs (Wieand et al., 1989). Adjustments of these areas for covariate information, selection bias and pooling effects are discussed in the references given in the entry for the ROC curve. Schisterman et al. (2001) consider corrections of the AUC for measurement error. For further details see Hanley and McNeil (1982). DF/BR

Faraggi, D. and Reiser, B. 2002: Estimation of the area under the ROC curve. Statistics in Medicine 21, 3093-106. Hanley, J. A. and McNeil, B. J. 1982: The meaning and use of the area under the receiver operating characteristic (ROC) curve. Radiology 143, 29-36. Matthews, J. N. S., Altman, D. G., Campbell, M. J. and Royston, P. 1990: Analysis of serial measurements in medical research. British Medical Journal 300, 230-5. Schisterman, E., Faraggi, D., Reiser, B. and Trevisan, M. 2001: Statistical inference for the area under the ROC curve in the presence of random measurement error. American Journal of Epidemiology 154, 174-9. Wieand, S., Gail, M. H., James, B. R. and James, K. L. 1989: A family of non-parametric statistics for comparing diagnostic markers with paired or unpaired data. Biometrika 76, 585-92.
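The trapezium rule used in this entry is easy to reproduce. The Python below is a minimal sketch: the six points are those of the entry's worked example, while the function names and the small sets of marker values used for the pairwise (Mann-Whitney type) calculation are hypothetical.

```python
def auc_trapezium(t, y):
    """Area under the straight-line 'curve' through the points (t[i], y[i])."""
    if len(t) != len(y):
        raise ValueError("t and y must have the same length")
    return 0.5 * sum(
        (t[i + 1] - t[i]) * (y[i] + y[i + 1]) for i in range(len(t) - 1)
    )


def auc_pairwise(healthy, diseased):
    """Proportion of (healthy, diseased) pairs in which the healthy value is
    the smaller one, counting ties as one half."""
    score = 0.0
    for h in healthy:
        for d in diseased:
            if h < d:
                score += 1.0
            elif h == d:
                score += 0.5
    return score / (len(healthy) * len(diseased))


# Points of the ROC curve from the worked example in the entry.
t = [0.0, 0.04, 0.12, 0.32, 0.60, 1.00]
y = [0.0, 0.56, 0.84, 0.94, 0.98, 1.00]
area = auc_trapezium(t, y)
print(round(area, 2))

# Hypothetical marker values, only to illustrate the probabilistic reading.
pairwise = auc_pairwise([1.0, 2.0, 3.0], [2.0, 4.0, 5.0])
print(round(pairwise, 3))
```

The first result reproduces the area obtained by hand in the entry; the second illustrates the interpretation of the AUC as the probability that a randomly chosen healthy subject has a smaller marker value than a randomly chosen diseased subject.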

artificial intelligence (AI)

This branch of computer science is devoted to the simulation of intelligent behaviour in machines. Traditional focus areas of AI are machine vision, MACHINE LEARNING, natural-language processing and speech recognition. Historically an interdisciplinary field, and hence characterised by the presence of several competing paradigms and approaches, recently AI has started developing a more unified conceptual framework, based largely on the convergence of statistical and algorithmic ideas.

A constant theme of AI throughout its history has been 'pattern recognition', the crucial task of detecting 'patterns' (regularities, relations, laws) within data. This task has emerged as a roadblock in all the traditional areas mentioned earlier and hence has attracted significant attention. Since most current approaches to pattern recognition involve significant use of statistics, this has become an important tool in AI in general.

Recently, AI has been applied to a new series of important problems and this, in turn, has heavily affected general AI research. Important applications of modern AI include: intelligent data analysis (see also DATA MINING IN MEDICINE); information retrieval and filtering from the web; bioinformatics; and computational biology. Traditional application areas, by way of contrast, included the design of EXPERT SYSTEMS for medical or industrial diagnosis, methods for scheduling in logistics and creation of other decision-making assistant software.

The imprecise definition of what AI actually is has made it harder in time to gauge the impact of this research field on everyday applications. A number of widely used computer programs would have met early definitions of artificial intelligence, e.g. popular web-based recommendation systems or air travel planning advisors. Popular techniques for pattern recognition such as NEURAL NETWORKS, decision trees and cluster analysis (see CLUSTER ANALYSIS IN MEDICINE) have made their way into the standard toolbox of data analysis and are commonly found in the toolbox of any biology lab. Machine vision methods are routinely used in analysing medical images, as well as parts of systems such as microarray machines for collecting gene expression data. Web retrieval and email filtering software also incorporate several ideas from natural-language processing and pattern recognition, and the modern sequence analysis of genomic data heavily relies on techniques originally developed for speech recognition. Intelligent web agents exist to find, assess and retrieve relevant information for the user and speech-recognition systems are routinely used in automatic phone information systems.

The field of artificial intelligence has clearly produced a number of practical applications but, the critics say, these have been achieved without solving the general problem of building intelligent machines. Maybe for this reason, generally the main success story of AI is reported to be the defeat of the chess world champion Garry Kasparov by an IBM algorithm in 1997.

The origin of the field of AI is often identified with a paper by A. M. Turing, which appeared in 1950 in the journal Mind, and with a workshop held at Dartmouth College in the summer of 1956, although many key ideas had already been debated before, during the early years of cybernetics. Modern techniques of artificial intelligence include Bayesian belief networks, part of the more general field of probabilistic graphical models; pattern-recognition algorithms such as SUPPORT VECTOR MACHINES, which represent the convergence of ideas from classical statistics and from neural networks analysis; statistical analysis of natural language text and machine vision algorithms; reinforcement learning algorithms, which represent a connection with control theory; and many other methods. NC/TDB

Bishop, C. 1996: Neural networks for pattern recognition. Oxford: Oxford University Press. Mitchell, T. 1995: Machine learning. Maidenhead: McGraw-Hill. Russell, S. and Norvig, P. 2002: Artificial intelligence: a modern approach, 2nd edition. Harlow: Prentice Hall. Shawe-Taylor, J. and Cristianini, N. 2004: Kernel methods for pattern analysis. Cambridge: Cambridge University Press.

association

This is the statistical dependence between two variables. Measures of association, unlike descriptive statistics of a single variable, summarise the extent to which one variable increases or decreases in relation to a change in a second variable. The basic graphical analysis of two variables is the SCATTERPLOT, which provides evidence of association in the shape and direction of the scatter of points. In the example given here, there appears to be an association between body mass index and systolic blood pressure values in a sample of a few thousand middle-aged men and women: higher values of body mass index tend to be associated with higher values of systolic blood pressure, suggesting a 'positive' association. A 'negative' association, in contrast, would describe a situation where an increase in one variable tends to be related to a decrease in the second variable. Various statistical measures can be used to interpret the degree of association.

Correlation coefficient. This specifically measures the degree of linear association between two quantitative variables on a scale from negative one to positive one. A value of zero indicates a total absence of linear association, while a value of positive or negative one indicates a perfect linear relationship. The correlation coefficient between body mass index and systolic blood pressure in our example was 0.25, indicating a positive association that is less than perfectly linear. However, adherence to a linear relationship is only one form of association and it is easy to imagine other plausible patterns of association, such as a parabolic scatter, in which the change in one variable may be perfectly reflected in the change in the second variable, but the correlation coefficient might be close to zero.
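The remark about a parabolic scatter can be checked numerically. The following is a minimal sketch in Python, using made-up data (not the body mass index example) and a hypothetical helper `pearson_r`; it compares a perfectly parabolic relationship with a perfectly linear one.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: x is symmetric about zero and y is determined exactly by x.
xs = [i / 10 for i in range(-20, 21)]
parabola = [x ** 2 for x in xs]        # perfect but nonlinear dependence
line = [2.0 * x + 1.0 for x in xs]     # perfect linear dependence

r_parabola = pearson_r(xs, parabola)
r_line = pearson_r(xs, line)
print(f"parabolic scatter: r = {r_parabola:.3f}")
print(f"linear scatter:    r = {r_line:.3f}")
```

In the parabolic case y is completely determined by x, yet the coefficient is essentially zero because the positive and negative halves of the scatter cancel.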


Regression coefficient. In the case of simple linear regression, there is a complete correspondence between the correlation coefficient and the regression coefficient for the slope ($\beta$). The regression coefficient, therefore, also measures association, but its value is interpreted as the magnitude of change in the dependent variable that arises, on average, from a unit change in the independent variable. In our example, an estimate of $\beta = 1.37$ indicated that a 1 kg/m² increase in the body mass index was associated with an average increase of 1.37 mmHg systolic blood pressure. However, in more complex regression models, the regression coefficient can measure other forms of association beyond linear dependence. For example, either the dependent or independent variable may be mathematically transformed, such as raising to a higher power, taking logarithms, etc., and the association measured by the regression coefficient would express a nonlinear change in one variable in response to a change in the second variable.

Relative risk. In the special case of two binary variables, various ratio measures are often used to quantify the degree of association. For example, one variable might be a measure of disease occurrence, the other a biological or environmental quantity. Most commonly the ratio would compare probability of disease expressed as an odds, a risk or some other relevant approximation to the risk. A relative risk value of 1, indicating equal risks in both groups, suggests that no association exists between the biological or environmental quantity and disease.

If a statistical measure suggests positive or negative association, this should not immediately be taken to imply that the association is valid and generalisable. Several considerations might lead us to question the importance of an observed statistical association. First, consideration of the STANDARD ERROR of the measure of association, generally reflecting the size of the sample, places the magnitude of association in perspective with the magnitude of random error. Apparently strong associations may in fact be poorly estimated and fall short of statistical significance. Second, an apparent association may be entirely spurious (i.e. 'confounded') due to the influence of other measured, or unmeasured, variables that have not been accounted for in the analysis. For example, in a preliminary statistical enquiry, risk of coronary heart disease may appear to be associated with watching television, although consideration of the underlying relationship with obesity and physical exercise would probably suggest that the preliminary finding was spurious. An association may alter after adjustment for the interdependence of other variables and the general validity of a measure of association would often depend on the extent to which such potential interdependencies have been taken into account. Studies measuring several variables often utilise multiple regression models to estimate adjusted regression coefficients and partial correlation coefficients by including all relevant variables in the model. However, even after allowing for such interdependencies, the much stronger claim of CAUSALITY between two variables would generally require examination of more stringent criteria. Third, an observed association may be specific to the chosen range of the variables or to the particular group of subjects studied and any inference beyond the range of the data to hand would require careful consideration of the method of sample selection. Various forms of selection bias may limit the generalisability of the association. JGW
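The correspondence between the correlation coefficient and the slope in simple linear regression, mentioned in this entry, can be made concrete: the least-squares slope equals the correlation multiplied by the ratio of the standard deviations. The sketch below uses invented body mass index and blood pressure values (they are not the data behind the 0.25 and 1.37 quoted in the entry), and the function name is hypothetical.

```python
import math

def slope_and_correlation(xs, ys):
    """Return (least-squares slope, Pearson r, sd of x, sd of y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx                      # change in y per unit change in x
    r = sxy / math.sqrt(sxx * syy)         # dimensionless, between -1 and 1
    return slope, r, math.sqrt(sxx / (n - 1)), math.sqrt(syy / (n - 1))

# Hypothetical illustration only: BMI (kg/m^2) and systolic pressure (mmHg).
bmi = [20, 22, 24, 26, 28, 30, 32]
sbp = [118, 121, 127, 126, 133, 131, 140]

slope, r, sd_bmi, sd_sbp = slope_and_correlation(bmi, sbp)
print(f"slope = {slope:.3f} mmHg per kg/m^2, r = {r:.3f}")
print(f"r * sd_y / sd_x = {r * sd_sbp / sd_bmi:.3f}")
```

The two printed slope values agree, which is the sense in which the two measures carry the same information about linear association while differing in units and interpretation.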

as treated

See INTENTION-TO-TREAT

attenuation due to measurement error This is a bias reducing the size of a correlation or a regression coefficient due to imprecision of data measurement. Consider an analytical epidemiological study in which the aim is to estimate the CORRELATION between true average consumption of alcohol (mg per day) and true average systolic blood pressure (mmHg). Blood pressure measurements are well known to be variable within individuals and a single measurement is likely to be rather imprecise (see MEASUREMENT PRECISION AND RELIABILITY). Such a statement is even more true of a single day's intake of alcohol as a measure of the true average daily intake of alcohol (even if that day's intake were found to be measured without error). Now, in the epidemiological study we chose, for each participant, to measure systolic blood pressure once and then ask them to recall their alcohol intake the previous day. If we now calculate the Pearson product-moment correlation between the two measures we are likely to get a positive value that may be statistically significant (assuming we have a large enough sample) but will not be particularly high (i.e. not far above zero). Suppose, for the sake of argument, that we have found a value of this correlation to be 0.20. It should be fairly obvious that as the measures of systolic blood pressure and alcohol get less precise (equivalent for a fixed population to lowering their reliabilities) the correlation will tend to zero. This is attenuation due to measurement error.

Let the observed measurement of blood pressure for the $i$th participant be $Y_i$ and the corresponding true average blood pressure be $\tau_i$. Similarly, let the measured alcohol intake be $X_i$ with a true average of $\eta_i$. We have estimated the correlation between $Y$ and $X$, $\rho_{YX}$, when we are really interested in the correlation between the true values, $\rho_{\tau\eta}$. If the errors of measurement for blood pressure are uncorrelated with those for alcohol consumption then it can be shown that the following relationship holds:

$$\rho_{YX} = \rho_{\tau\eta}\sqrt{\kappa_Y \kappa_X} \qquad (1)$$

Here, $\kappa_Y$ and $\kappa_X$ are the reliabilities of the blood pressure and alcohol consumption measurements respectively. It follows that:

$$\rho_{\tau\eta} = \frac{\rho_{YX}}{\sqrt{\kappa_Y \kappa_X}} \qquad (2)$$

Provided we know the reliabilities for the two measurements, this equation can be used to adjust the observed correlation between $Y$ and $X$ to obtain the required correlation between their true average values. If we know that $\kappa_Y = 0.3$ and $\kappa_X = 0.7$, for example, the required correlation is $0.2/\sqrt{0.3 \times 0.7} = 0.44$. If, instead of a correlation, the linear regression coefficient for the effect of alcohol consumption on blood pressure were of key interest then:

$$\beta_{YX} = \kappa_X\,\beta_{\tau\eta} \qquad (3)$$

and, again, the required adjustment is straightforward. Equation (3) also holds approximately if we were to use a logistic regression to predict the presence/absence of hypertension. These calculations are fine as long as we have valid estimates of the reliabilities. However, they are only valid in these very simple situations as described. Epidemiologists almost always wish to adjust their estimates to allow for confounding and some of these confounders are inevitably going to be prone to MEASUREMENT ERROR. Under these circumstances life is considerably more complicated! We cannot even be certain that the estimate of the required parameter will be attenuated, never mind being attenuated in a way described by equation (3). Readers are referred elsewhere to these much more challenging but more realistic situations (Carroll, Ruppert and Stefanski, 1995; Cheng and Van Ness, 1999; Gustafson, 2003). GD

Carroll, R. J., Ruppert, D. and Stefanski, L. A. 1995: Measurement error in nonlinear models. London: Chapman & Hall. Cheng, C.-L. and Van Ness, J. W. 1999: Statistical regression with measurement error. London: Arnold. Gustafson, P. 2003: Measurement error and misclassification in statistics and epidemiology. London: Chapman & Hall/CRC.

attributable risk As a measure of the public health significance of exposure to a risk factor for disease, the attributable risk provides an estimate of the proportion of diseased subjects that may be attributed to the exposure. It is defined by:

$$\lambda = \frac{\Pr\{D\} - \Pr\{D \mid \bar{E}\}}{\Pr\{D\}}$$

where $\Pr\{D\}$ is the probability that an individual develops disease and $E$ and $\bar{E}$ represent whether an individual is exposed or not exposed to the factor of interest (Levin, 1953). Ideally, one would like to know both $\Pr\{D\}$ and $\Pr\{D \mid \bar{E}\}$ for the population under study, but for some study designs this is not possible, so if one wishes to use the measure, care is needed to design a study that will provide as good an estimate as possible, especially when one employs an observational study. Using BAYES' THEOREM and rearranging the equation, we can obtain an expression in terms of the relative risk (RR):

$$\lambda = \frac{\Pr\{E\}(RR - 1)}{1 + \Pr\{E\}(RR - 1)}$$

where $\Pr\{E\}$ is the prevalence of exposure in the population at large. This is a convenient way of expressing the measure of association, because RR is often estimated using alternative study designs, including CASE-CONTROL, COHORT AND CROSS-SECTIONAL STUDIES.

Attributable risk is most easily interpreted when the factor of interest increases risk, i.e. RR > 1, and in these cases the possible range of the measure is from 0 to 1. An attributable risk of zero can occur when no individuals in the population are exposed to the factor of interest, or if the factor is not related to risk of disease, RR = 1. The measure is not easily interpreted when the exposure is protective, RR < 1, so it is generally not used in this case. By redefining the reference group, one can always express the results of a study in a form in which RR is greater than 1, so this is not a serious limitation. In addition, the measure is often expressed as a percent. As RR becomes large, $\lambda$ goes to 1, but $\lambda$ goes to zero either as the proportion exposed, $\Pr\{E\}$, becomes small or as the relative risk, RR, approaches the null value of 1. If an entire population is exposed to a particular factor, $\Pr\{E\} = 1$, then the second equation (above) reduces to $\lambda = (RR - 1)/RR$.

The table shows a typical 2 × 2 table that can be used to display the results from an epidemiological study. In a case-control study, the column totals are generally regarded as being fixed by design and the odds ratio or cross-product ratio is used as a good approximation to the estimate of RR when the disease is rare. In addition, the exposure distribution in the controls, $\Pr\{E\} = \Pr\{E \mid \bar{D}\}$, is considered to be representative of the exposure distribution in the overall population. Substituting in the sample estimates of these quantities gives rise to what is the maximum likelihood estimate of $\lambda$:

$$\hat{\lambda} = \frac{ad - bc}{d(a + c)}$$

attributable risk Results from an epidemiological study with two levels of exposure and disease status

              Disease status
Exposure      D        D̄        Total
E             a        b        a + b
Ē             c        d        c + d
Total         a + c    b + d    N


When setting a confidence interval about the estimate, Walter (1975) suggests using the normal approximation on the log transformation of the complement of the estimate:

$$\mathrm{var}[\log(1 - \hat{\lambda})] = \frac{a}{c(a + c)} + \frac{b}{d(b + d)}$$
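As an illustration of the case-control calculations in this entry, the following Python sketch computes the point estimate $(ad - bc)/[d(a + c)]$ and an approximate interval based on the variance of $\log(1 - \hat{\lambda})$, taken as $a/[c(a+c)] + b/[d(b+d)]$. The cell labels follow the entry's 2 × 2 table (a, b exposed cases and controls; c, d unexposed cases and controls). The counts, the function name and the use of 1.96 for a 95% interval are assumptions made for the example.

```python
import math

def attributable_risk_case_control(a, b, c, d, z=1.96):
    """Point estimate and approximate interval for attributable risk.

    a: exposed cases, b: exposed controls,
    c: unexposed cases, d: unexposed controls.
    """
    lam = (a * d - b * c) / (d * (a + c))
    # Variance of log(1 - lambda-hat) under the normal approximation.
    var_log = a / (c * (a + c)) + b / (d * (b + d))
    half_width = z * math.sqrt(var_log)
    log_complement = math.log(1.0 - lam)
    # Back-transform; the ends swap because lambda = 1 - exp(log(1 - lambda)).
    lower = 1.0 - math.exp(log_complement + half_width)
    upper = 1.0 - math.exp(log_complement - half_width)
    return lam, lower, upper

# Hypothetical counts, for illustration only.
a, b, c, d = 60, 30, 40, 70
lam, lower, upper = attributable_risk_case_control(a, b, c, d)

# Cross-check against the relative-risk form, using the odds ratio for RR
# and the exposure proportion among controls for Pr{E}.
odds_ratio = (a * d) / (b * c)
p_exposed = b / (b + d)
lam_from_rr = p_exposed * (odds_ratio - 1) / (1 + p_exposed * (odds_ratio - 1))

print(f"attributable risk = {lam:.3f} ({lower:.3f} to {upper:.3f})")
print(f"via odds ratio    = {lam_from_rr:.3f}")
```

Working on the scale of $\log(1 - \hat{\lambda})$ keeps the back-transformed upper limit below 1, which a symmetric interval on the original scale would not guarantee.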

Alternatively, Leung and Kupper (1981) have suggested using a logit transformation of the estimate.

In a cohort study, the row totals in the table are regarded as fixed; therefore such a study does not provide a good internal estimate of the exposure distribution and neither does it provide a good estimate of the unconditional probability of disease. In this case, the proportion exposed is usually derived from another study, perhaps an earlier case-control study or a survey of the entire population. A cross-sectional study provides both an estimate of the relative risk and the overall population distribution, so in that sense it is ideal for estimating attributable risk. However, a cross-sectional study suffers in other ways (see CROSS-SECTIONAL STUDIES). Walter (1976) discusses the properties of estimates of attributable risk using these alternative study designs.

Methods for estimating attributable risk for a particular exposure while adjusting for potential confounding factors depend on whether the effect is constant over the levels of the covariates under consideration. When the effect is constant, it can be represented as having a common relative risk over the strata when using a stratified approach, such as the Mantel-Haenszel method, or it can be represented by a main effect only in a model, such as the linear logistic model. In these situations, one can directly use the adjusted estimator of the relative risk, along with an estimate of the exposure distribution in the diseased group in the second equation (above) to obtain an estimate of the adjusted attributable risk (Walter, 1976; Greenland, 1987). However, the assumption that the association can be described without the inclusion of an interaction term is a strong one and it is critical in that a seriously biased estimate can result if it is not true. An estimate of attributable risk that can be used either in a stratified analysis in which the effect is not homogeneous across strata or in a generalised linear model that includes interaction terms can be expressed as:

$$\hat{\lambda} = 1 - \sum_{j}\sum_{i} \frac{\rho_{ij}}{RR_{i|j}}$$

where $j$ represents the levels of the factor(s) being adjusted, $i$ represents the levels of exposure, $\rho_{ij}$ is the proportion of diseased individuals in $(i, j)$ and $RR_{i|j}$ the relative risk for exposure level $i$ for individuals with level $j$ of the covariates being adjusted (Walter, 1976; Benichou, 1993). TRH
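A small numerical sketch may help with the adjusted estimate, taken here as one minus the sum over exposure levels i and covariate levels j of the proportion of cases in cell (i, j) divided by the relative risk for exposure level i within level j. The proportions, relative risks, stratum labels and function name below are all invented for illustration; the unexposed level has relative risk 1 by definition.

```python
def adjusted_attributable_risk(case_proportions, relative_risks):
    """Adjusted attributable risk from stratum-specific quantities.

    case_proportions: dict mapping (exposure level, stratum) to the
        proportion of all diseased individuals in that cell; must sum to 1.
    relative_risks: dict with the same keys giving the relative risk for
        that exposure level within that stratum (1.0 for the unexposed).
    """
    total = sum(case_proportions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("case proportions must sum to 1")
    return 1.0 - sum(
        proportion / relative_risks[cell]
        for cell, proportion in case_proportions.items()
    )

# Hypothetical values: exposure 0 = unexposed, 1 = exposed; two age strata.
rho = {(0, "young"): 0.20, (1, "young"): 0.20,
       (0, "old"): 0.30, (1, "old"): 0.30}
rr = {(0, "young"): 1.0, (1, "young"): 2.0,
      (0, "old"): 1.0, (1, "old"): 4.0}

adjusted = adjusted_attributable_risk(rho, rr)
print(f"adjusted attributable risk = {adjusted:.3f}")
```

With a single stratum and a binary exposure the expression collapses to the proportion of cases exposed multiplied by (RR - 1)/RR, which gives a quick sanity check on any implementation.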

Benichou, J. 1993: Methods of adjustment for estimating the attributable risk in case-control studies: a review. Statistics in Medicine 10, 1753-73. Greenland, S. 1987: Variance estimators for attributable fraction estimates, consistent in both large strata and sparse data. Statistics in Medicine 6, 701. Leung, H. K. and Kupper, L. L. 1981: Comparison of confidence intervals for attributable risk. Biometrics 37, 293-302. Levin, M. L. 1953: The occurrence of lung cancer in man. Acta Unio Internationalis Contra Cancrum 9, 531-41. Walter, S. D. 1975: The distribution of Levin's measure of attributable risk. Biometrika 62, 371-4. Walter, S. D. 1976: The estimation and interpretation of attributable risk in health research. Biometrics 32, 829-49.

AUC

See AREA UNDER THE CURVE

autocorrelation

See CORRELATION

automatic selection procedures These are procedures for identifying a parsimonious model in regression in general and MULTIPLE LINEAR REGRESSION in particular. Such methods are needed because in regression analysis an underfitted model can lead to severely biased estimation and prediction. In contrast, an overfitted model can seriously degrade the efficiency of the resulting parameter estimates and predictions. Consequently a variety of techniques, all with the aim of selecting the most important explanatory variables for predicting the response variable and thereby obtaining a parsimonious and effectively predictive model, have been developed. Perhaps the three most commonly used methods are forward selection, backward elimination and a combination of both of these, known as stepwise regression. The forward selection approach begins with an initial model that contains only an intercept and successively adds explanatory variables to the model from the pool of candidate variables until a stage is reached where none of the candidate variables, if added to the current model, would contribute information that is statistically important concerning the expected value of the response. The backward elimination method begins with an initial model that contains all the explanatory variables being used in the study and then first identifies the single variable that contributes the least information about the expected value of the response; if this is deemed not to be 'significant' then the variable is eliminated from the current model. Successive steps of the method result in a 'final' model from which no further variables can be eliminated without adversely affecting, in a statistical sense, the predicted value of the expected response. The stepwise regression method combines elements of both forward selection and backward elimination. The initial model considered is one that contains only an intercept. Explanatory variables are then considered for inclusion in the current model, as described previously for forward selection, but now in each step of the procedure variables


included previously are also considered for possible elimination as in the backward method, and they might be removed if the presence of new variables in the model makes their contribution to predicting the expected response no longer significant. In multiple linear regression the criterion used for assessing whether or not a variable should be added to an existing model in forward selection or removed from an existing model in backward elimination is, essentially, the change in the residual sum-of-squares produced by the inclusion or exclusion of the variable. Specifically in forward selection an 'F-statistic', known as the F-to-enter, is calculated as:

$$F = \frac{RSS_m - RSS_{m+1}}{RSS_{m+1}/(n - m - 2)}$$

where $RSS_m$ and $RSS_{m+1}$ are the residual sums of squares when models with $m$ and $m + 1$ explanatory variables have been fitted and $n$ is the number of observations. The F-to-enter is then compared with a preset term; calculated Fs greater than the preset value lead to the variable under consideration being added to the model. In backward selection a calculated F less than a corresponding F-to-remove leads to a variable being removed from the current model. In the stepwise procedure variables are entered as with forward selection, but after each addition of a new variable those variables currently in the model are considered for removal by the backward elimination process. (For more details see Petrie and Sabin, 2005.) In other types of regression, for example, LOGISTIC REGRESSION, other criteria are used for judging whether or not a variable should be entered into or removed from the current model. When applying regression techniques to HIGH-DIMENSIONAL DATA more sophisticated variable selection techniques are needed (see, for example, Francois, 2008).

None of the automatic procedures for selecting subsets of variables is foolproof and it is possible for them to be seriously misleading in some circumstances (see Agresti, 1996). That said, at least one can be more confident in a chosen model if all three procedures converge on to the same set of variables, as occurs quite frequently, but not always, in practice. When different subsets of variables are indicated, judgement is necessary to decide on a preferred model, such judgement being based on the desire to create a parsimonious model that is likely to be generalisable, not overly complex as if modelling mere quirks of the particular dataset on which it is based, and yet including important or standard parameters deemed to be of clinical relevance. SSE
[See also ALL SUBSETS REGRESSION]

Agresti, A. 1996: Introduction to categorical data analysis. New York: John Wiley & Sons, Inc. Francois, D. 2008: High-dimensional data analysis: from optimal metrics to feature selection. VDM Verlag. Petrie, A. and Sabin, C. 2005: Medical statistics at a glance, 2nd edition. Chichester: Wiley-Blackwell.
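Forward selection with the F-to-enter criterion, F = (RSS_m - RSS_{m+1}) / [RSS_{m+1}/(n - m - 2)], can be sketched in plain Python as below. This is an illustrative outline rather than a production routine: the data are invented, the threshold of 4.0 is an arbitrary assumption standing in for the preset value, the helper names are hypothetical, and least squares is solved through the normal equations, which is adequate for a toy example but not numerically robust.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def rss(columns, y):
    """Residual sum of squares for y regressed on an intercept plus columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    XtX = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)] for r in range(p)]
    Xty = [sum(X[i][r] * y[i] for i in range(n)) for r in range(p)]
    beta = solve(XtX, Xty)
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))

def forward_selection(candidates, y, f_to_enter=4.0):
    """Return the names of the variables entered, in order of entry."""
    n = len(y)
    selected = []
    current_rss = rss([], y)               # intercept-only model
    while len(selected) < len(candidates):
        m = len(selected)
        best = None                        # (F, name, RSS) of best candidate
        for name, values in candidates.items():
            if name in selected:
                continue
            new_rss = rss([candidates[s] for s in selected] + [values], y)
            f = (current_rss - new_rss) / (new_rss / (n - m - 2))
            if f > f_to_enter and (best is None or f > best[0]):
                best = (f, name, new_rss)
        if best is None:                   # no candidate exceeds the threshold
            break
        selected.append(best[1])
        current_rss = best[2]
    return selected

# Hypothetical data: y depends on x1 and x3; x2 is unrelated to y.
candidates = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    "x2": [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8],
    "x3": [2, 7, 1, 8, 2, 8, 1, 8, 2, 8, 4, 5],
}
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.1, -0.2, 0.1]
y = [5 + 2 * a + 1.5 * c + e
     for a, c, e in zip(candidates["x1"], candidates["x3"], noise)]

print(forward_selection(candidates, y))
```

Backward elimination and the stepwise variant follow the same pattern, with an F-to-remove check applied to the variables already in the model.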

available case analysis This is an approach to multivariate data containing missing values on a number of variables, in which MEANS, VARIANCES and covariances (see COVARIANCE MATRIX) are calculated from all available subjects with nonmissing values on the variable (means and variances) or pair of variables (covariances) involved. Although this approach makes use of as much of the observed data as possible, it does have disadvantages. For example, the summary statistics for each variable may be based on different numbers of observations and the calculated variance-covariance matrix may now not be suitable for methods of multivariate analysis such as PRINCIPAL COMPONENTS ANALYSIS and FACTOR ANALYSIS for reasons described in Schafer (1997). SSE
[See also MISSING DATA, MULTIPLE IMPUTATION]

Schafer, J. L. 1997: Analysis of incomplete multivariate data. Boca Raton, Florida: Chapman & Hall/CRC.
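The bookkeeping behind an available case analysis can be sketched as follows. The data are invented, missing values are coded as `None`, and the function name is hypothetical. One detail is an assumption rather than something the entry specifies: each covariance here uses means computed from the same pairs of observations that enter that covariance.

```python
from itertools import combinations

def available_case_summary(rows, names):
    """Means, variances and covariances from all available observations."""
    columns = {name: [row[k] for row in rows] for k, name in enumerate(names)}
    summary = {"n": {}, "mean": {}, "var": {}, "cov": {}}
    for name, values in columns.items():
        observed = [v for v in values if v is not None]
        n = len(observed)
        mean = sum(observed) / n
        summary["n"][name] = n
        summary["mean"][name] = mean
        summary["var"][name] = sum((v - mean) ** 2 for v in observed) / (n - 1)
    for u, v in combinations(names, 2):
        # Keep only subjects with both variables observed.
        pairs = [(p, q) for p, q in zip(columns[u], columns[v])
                 if p is not None and q is not None]
        n = len(pairs)
        mean_u = sum(p for p, _ in pairs) / n
        mean_v = sum(q for _, q in pairs) / n
        summary["n"][(u, v)] = n
        summary["cov"][(u, v)] = sum(
            (p - mean_u) * (q - mean_v) for p, q in pairs) / (n - 1)
    return summary

# Hypothetical data: five subjects, three variables, None = missing.
rows = [
    (1, 2, None),
    (2, 4, 1),
    (3, None, 2),
    (4, 8, 3),
    (None, 10, 5),
]
summary = available_case_summary(rows, ["x", "y", "z"])
print(summary["n"])
print(summary["cov"])
```

The printed counts show each variable using four subjects but each covariance only three, and a different three each time; this mismatch is what can leave the assembled matrix unsuitable for the multivariate methods mentioned in the entry.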

average age at death

This flawed statistic is sometimes used for summarising life expectancy and other aspects of mortality. For example, Andersen (1990) comments on a study that compared average age at death for male symphony orchestra conductors and for the entire US male population and showed that, on average, the conductors lived about 4 years longer. The difference is, however, largely illusory because, as age at entry was birth, those in the US male population who died in infancy and childhood were included in the calculation of the average lifespan, whereas only men who survived long enough to become conductors could enter the conductor cohort. The apparent difference in longevity disappeared after accounting for infant and perinatal mortality. In the other direction, a study in the USA that used average age at death of rock stars (which, on the basis of 321 such deaths, they found to be 36.9 years) to warn of the perils of rock music also got it wrong. It took no account of the rock stars still alive. Proper analysis of mortality involves the determination of AGE-SPECIFIC RATES for mortality, which requires denominator data on the age distribution of the population (see Colton, 1974). SSE

Andersen, B. 1990: Methodological errors in medical research. Oxford: Blackwell Scientific. Colton, T. 1974: Statistics in medicine. Boston: Little, Brown and Co.

average treatment effect on the treated (ATT) See PROPENSITY SCORES

average treatment effect on the untreated (ATU) See PROPENSITY SCORES

B

back-calculation Also known as back-projection, this is a means of estimating, for example, past HIV infection rates and predicting the number of new AIDS cases in the future and was first proposed in the mid-1980s (Brookmeyer and Gail, 1986). The essence of the method is contained in the equation:

$$d(t) = \int_0^t h(s)\,p(t - s)\,ds$$

where $d(t)$ and $h(s)$ denote the disease diagnosis rate at time $t$ and the infection rate at time $s$, and $p(\cdot)$ indicates the probability distribution (density) of the incubation time (or INCUBATION PERIOD). This expression states that the rate of disease diagnosis at time $t$ depends on the rate of new infections at time $s$ and on the distribution of the incubation time $t - s$. Therefore, if any two of these three components are known, the third can be inferred. Typically, the disease diagnosis rate for $t$ up to the current time $T$ and the distribution of the incubation time are assumed known and the infection rate is estimated. The figure explains the idea in a discrete time framework using the HIV epidemic as an example. Here, the interest is in estimating HIV incidence and in predicting future AIDS cases. Suppose data on new AIDS cases over time up to the current time $T$ are available together with the information on the distribution of the incubation time. It is then possible to estimate the number of past infections that have resulted in the observed AIDS cases. The estimated incidence of HIV can be used in conjunction with the distribution of the incubation time to produce short-term projections of new AIDS diagnoses. Note that in this particular case, the MEDIAN length of the incubation time is of the order of 10 years, with very few individuals developing AIDS within a short time period from infection. The observed AIDS cases therefore provide information on infections that occurred in the distant past, rather than in recent years. Estimates of incidence of infection for the years just preceding $T$ will necessarily be quite inaccurate, as they are based on little information. Care should then be taken in the interpretation of recent trends in the number of infections. However, this problem will not affect projections of AIDS cases as long as they are short term.

[Figure: back-calculation. Back-calculation of HIV incidence and prediction of future AIDS cases; horizontal axis: calendar time up to the current time T]

A number of formulations of the back-calculation equation have been proposed. To give a flavour of the estimation problem, it is convenient to use a discrete version of our first equation. Let $t_0$ be the beginning of the epidemic and $y_k$ the number of individuals that develop the disease endpoint of interest (e.g. AIDS in an HIV context) in the $k$th time interval $[t_{k-1}, t_k)$ for $k = 1, \ldots, K$. Suppose that $f_{ij}$, the probability of developing the disease endpoint in the $j$th time interval given infection in the $i$th interval, is also known. Then the expected number of new disease cases in $[t_{k-1}, t_k)$ can be expressed as:

$$E(y_k) = \sum_{i=1}^{k} E(h_i)\,f_{ik}$$

where $h_i$ is the unobserved number of new infections in the $i$th time interval. Assuming that the $h_i$ are independently distributed according to a POISSON DISTRIBUTION with parameter $E(h_i)$, then the $y_k$ are also Poisson distributed with parameter $E(y_k)$. From this the likelihood for the observed data can be constructed and maximised to obtain estimates of the number of new infections over time (see MAXIMUM LIKELIHOOD ESTIMATION). In practice, estimation of $h = (h_1, \ldots, h_K)$ is not so straightforward. The high dimensionality of $h$ can lead to unstable estimates. In order to avoid lack of identifiability,


some structure needs to be imposed on the shape of $h$. This has typically been achieved by choosing fully parametric models for $h = h(\theta)$. The problem is then reduced to an estimation of $\theta$, conveniently chosen to be of a lower dimension than $h$. Alternatively, to retain some flexibility, weakly parametric models (i.e. step functions constant over a long period of time) have been specified or smoothness constraints on $h$ have been introduced. This has created a rich literature, especially in the HIV field (see Brookmeyer and Gail, 1994).

Attractive in principle, given the simplicity of the idea, the method does require precise knowledge of at least two of the three components introduced already. However, perfect information is rarely available. For example, as in HIV, the incidence of the disease endpoint, typically acquired from surveillance schemes, might be affected by reporting delay or underreporting. Further, the distribution of the incubation time may also be imprecisely known. Results can be highly sensitive to misspecification of the inputs. It is therefore important that data are appropriately adjusted for delay in reporting before they are used in the back-calculation. Equally, it is essential that sensitivity analyses to the model chosen for the distribution of the incubation time are carried out. One more limitation of the method is the inability to provide precise estimates of the incidence of infection in recent times. This is a particularly serious problem for diseases with long incubation times, as seen in the HIV example.

These limitations notwithstanding, the back-calculation method has been widely used and developed in various ways, especially in the HIV area. Notably, the original methodology assumed a fixed distribution for the incubation time, independent of calendar time or age at infection. However, therapeutic changes over time and the discovery of a clear dependence of HIV progression on age at infection have made the time-age independence assumption untenable. This has led to the development of age-time specific versions of back-calculation. Equally, the need to estimate the number of individuals at different stages of the development of HIV has resulted in the development of 'staged' back-calculation, where the incubation time is divided into stages according to the value of markers of HIV disease. A final example is given by the need to refine estimation of HIV incidence, especially in recent years, and AIDS projections. This has resulted in a further development of the method, now able to incorporate external information on the disease spread as well as other surveillance data, in addition to AIDS diagnoses (see De Angelis, Gilks and Day, 1998; Becker, Lewis and Li, 2003). The method and its developments have found important application in other contexts besides HIV. Examples include the assessment of the bovine spongiform encephalopathy epidemic in cattle and the consequent Creutzfeldt-Jakob disease epidemic in humans in Great Britain, the estimation of the Hepatitis C virus epidemic in France and the estimation of the number of new injecting drug users in Australia. DDA

Bacchetti, P. 1998: Back-calculation. In Armitage, P. and Colton, T. (eds), Encyclopedia of biostatistics, Vol. 1. Chichester: John Wiley & Sons, Ltd, pp. 235-42. Becker, N. G., Lewis, J. J. C. and Li, Z. F. 2003: Age-specific back-projection of HIV diagnosis data. Statistics in Medicine 22, 2177-90. Brookmeyer, R. and Gail, M. H. 1986: Minimum size of the acquired immunodeficiency syndrome (AIDS) epidemic in the United States. Lancet 2(8519), 1320-2. Brookmeyer, R. and Gail, M. H. 1994: AIDS epidemiology: a quantitative approach. New York: Oxford University Press. De Angelis, D., Gilks, W. R. and Day, N. E. 1998: Bayesian projection of the acquired immune deficiency syndrome epidemic. Journal of the Royal Statistical Society C (Applied Statistics) 47, 449-81.
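The discrete form of back-calculation, in which the expected number of cases in interval k is the sum over earlier intervals i of the expected infections in i multiplied by the probability of diagnosis in k, lends itself to a short numerical sketch. Several things below are assumptions rather than part of the entry: the incubation distribution is taken to depend only on the lag k - i, the numbers are invented, the function names are hypothetical, and the unconstrained Poisson likelihood is maximised with a plain EM iteration (one common device for this kind of deconvolution), with none of the smoothing or parametric structure that the entry says is needed in practice.

```python
def expected_cases(infections, incubation):
    """Expected cases per interval: sum over i <= k of h_i * p(k - i)."""
    K = len(infections)
    return [
        sum(infections[i] * incubation[k - i]
            for i in range(k + 1) if k - i < len(incubation))
        for k in range(K)
    ]

def back_calculate(cases, incubation, iterations=500):
    """EM iterations for the Poisson back-calculation likelihood."""
    K = len(cases)
    # Probability that an infection in interval i is diagnosed within the
    # observation window (intervals i, ..., K - 1).
    seen = [sum(incubation[: K - i]) for i in range(K)]
    h = [sum(cases) / K] * K               # flat starting values
    for _ in range(iterations):
        mu = expected_cases(h, incubation)
        updated = []
        for i in range(K):
            weight = sum(
                cases[k] * incubation[k - i] / mu[k]
                for k in range(i, K)
                if k - i < len(incubation) and mu[k] > 0
            )
            updated.append(h[i] * weight / seen[i] if seen[i] > 0 else 0.0)
        h = updated
    return h

# Hypothetical inputs: incubation[d] is the probability of diagnosis d
# intervals after infection; the "observed" cases are generated exactly
# from a known infection curve so the reconstruction can be compared.
incubation = [0.05, 0.15, 0.30, 0.30, 0.20]
true_infections = [5.0, 10.0, 20.0, 30.0, 20.0, 10.0]
cases = expected_cases(true_infections, incubation)

estimated = back_calculate(cases, incubation)
for k, (truth, est) in enumerate(zip(true_infections, estimated), start=1):
    print(f"interval {k}: true {truth:5.1f}  estimated {est:6.2f}")
```

Estimates for the most recent intervals rest on very few observed diagnoses, so they tend to be the least stable, in line with the entry's warning about recent incidence.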

backwards regression

See LOGISTIC REGRESSION, MULTIPLE LINEAR REGRESSION

balance

See RANDOMISATION

bar chart

A graphical display of data classified into a number of (usually unordered) categories. Equal width rectangular bars are used to represent each category, with the heights of the bars being proportional to the observed frequency in the corresponding category. An example is shown in the figure.

[Figure: bar chart. Mortality rates per 1000 live births for children under five in five different countries]

An extension of the simple bar chart is the component bar chart (also known as the stacked bar chart) in which particular lengths of each bar are differentiated to represent a number of frequencies associated with each category forming the chart. Shading or colour can be used to enhance the display. An example is given in the second figure; here the numbers of patients in the four categories of a response variable for two treatments (BP and CP) are displayed.

The basic bar chart is often of little more help in understanding categorical data than the numerical data themselves. However, sophisticated adaptations of the graphic can become an extremely effective tool for displaying a complex set of categorical data. That this is so can be illustrated by an example taken from Sarkar (2008) that uses data summarising the fates of the 2201 passengers on the Titanic. The data are categorised by economic status (class of ticket, first, second or third, or crew), sex (male or female), age (adult or child) and whether they survived or not (the data are available on Sarkar's website, http://lmdvr.r-forge.r-project.org/). The first diagram produced by Sarkar is shown in the third figure. This plot looks impressive but is dominated by the third 'panel' (adult males) as heights of bars represent counts and all panels have the same limits. Sadly, all the plot tells us is that there were many more males than females aboard (particularly among the crew, which is the largest group) and that there were even fewer children. The plot becomes more illuminating about what really happened to the passengers if the proportion of survivors is plotted and by allowing independent horizontal scales for the different ...

... the trial is stopped, because even an optimist should be persuaded that the new treatment is no better than the standard. Similarly, if PPP > PPPcrit, the trial is stopped because even a pessimist should be persuaded that the new treatment is better. For further details see Heitjan (1997). SRS
[See also BAYESIAN METHODS]

HtItjIuI, D. P. 1997: Bayesian interim analysis or Phase I( cancer cliaical trials. Sltlli"icJ in Medicine 16. 1791-802.

benchmarking This is a proecdun: for adjusting a less n:liable series ofobservations to make it c:onsisIcnl with man:: n:liable mcasun:ments known as benclrnJtlTle,. For example. data on hospilal bed occupation coUc:cted monlhly will DOl necessarily ~e wilb ftgura collc:ctc:d aDnuaily and Ihe monthly IIgura (which an: likely 10 be less ~Iiable because the annual 11l1Ircs will pmbably originate from a census, exhausli~ administrative n:conls 01' a iarJer sample) may be: adjusted al some poinllO ap:e wilb the I11CR n:liable annual fil1lRlS. Benchmarking isotlen useclloacljust time-series dala 10 annual bc:ac:hmarb while pn:scrving as far as possible Ihe monlh-to-maath movemenl of abe original series (see. rar exampl~ Cholelle and Dqum. 1994). SSE CIIDieUe, P. A. ad ........ £. 8. 1994: Bcnclunartiagtimc series with aUlocondalcd 1illlYC)' cnors./,,'emtlliolloJSialislic,Rnk..·fil. 365-77.

See OIlAPHICAL MOIlELS

Bayesian persuasion probabilities These

Berkson's fallacy

1ft

posterior pmbabilitics Ibat a new lreatmenl beil1l lelilc:d in a Phase II clinical lrial is beller dum or no belle.. than a sIaDdanIlmilmenl.1n a Phase II lrial IN1ERIM AN.UYSES 1ft carried oullO dcaennine whether ar nolto Slop Ihc bial early becaus~ on Ihc basis of the dala aln:acIy accrued. the: new IMalment appears eilher unlikely 10 be beller than the slaDdanlllallDenl or alikely nollo be: better Ihan it.

Sometimes a spurious relationship can be concluded because the data from which the conclusion was derived came from a special source, which is not representative of the general population. Such bias is known as Berkson's fallacy and it can only be avoided by careful study design (Walter, 1980; Feinstein, Walter and Horwitz, 1986; Woodward, 2005).

A classic example of this bias is the study of autopsies by Pearl (1929). Fewer autopsies than expected found both tuberculosis and cancer to occur together: the frequency of cancer was thus lower among tuberculosis victims than others. This led Pearl to the erroneous conclusion that tuberculosis might be offering people some kind of protection against cancer, even leading to the suggestion that cancer patients might be treated with the protein of the tuberculosis bacterium. The problem with this line of thinking is that not every death is autopsied; in this case it turned out that people who died with both diseases were less likely to be autopsied, leading to an artificial lack of numbers with both diseases in Pearl's autopsy series.

Berkson's fallacy is a particular problem with case-control studies. For example, suppose that both the case and control series are derived from hospitals. If it happened that anyone with both the 'case' disease and some other disease were more likely to be hospitalised than someone with only one of the pair, we may well see a relationship between the prevalence of the two diseases in the case-control study, even when there is really no such relationship in the general population. Exactly the same situation may also give rise to spurious relationships between any risk factor for the 'second' disease and the disease that defines cases.

For instance, consider a hospital-based case-control study of coffee drinking and angina among the elderly. Suppose that coffee drinking is a risk factor for Parkinson's disease. If someone has Parkinson's disease she or he is unlikely to be hospitalised unless she or he develops a potentially life-threatening condition, such as angina. Most individuals with angina will be treated in the community, the exception, perhaps, being when there is a disabling comorbidity. The result of these hypothetical conditions might be a higher proportion with Parkinson's disease (who tend to drink coffee) among the angina cases in hospital than among the controls (people with other illnesses). The case-control study would thus find coffee drinking to be a risk factor for angina, even if this were not actually so.
[See also BIAS IN OBSERVATIONAL STUDIES]
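The hospital-based mechanism described above can be checked with a little arithmetic. The following is a minimal sketch, not taken from the original entry: the disease prevalences and admission probabilities are hypothetical values chosen only for illustration, and the two diseases are assumed to be independent in the general population.

```python
def odds_ratio(both, a_only, b_only, neither):
    """Odds ratio for the association between disease A and disease B."""
    return (both * neither) / (a_only * b_only)


def population_cells(p_a, p_b):
    """Cell probabilities when A and B occur independently (assumption)."""
    return {
        "both": p_a * p_b,
        "a_only": p_a * (1 - p_b),
        "b_only": (1 - p_a) * p_b,
        "neither": (1 - p_a) * (1 - p_b),
    }


def hospital_cells(cells, admission):
    """Keep only the fraction of each cell that is admitted to hospital."""
    return {name: cells[name] * admission[name] for name in cells}


# Hypothetical prevalences: 10% have disease A, 20% have disease B.
population = population_cells(0.10, 0.20)

# Hypothetical admission probabilities: people with both diseases are
# far more likely to be hospitalised than people with only one.
admission = {"both": 0.50, "a_only": 0.10, "b_only": 0.10, "neither": 0.05}
hospital = hospital_cells(population, admission)

or_population = odds_ratio(**population)
or_hospital = odds_ratio(**hospital)
print(f"Odds ratio in the population: {or_population:.2f}")
print(f"Odds ratio among hospital patients: {or_hospital:.2f}")
```

Under these assumed values the diseases are unrelated in the population, yet an association appears among hospital patients purely because admission depends on both diseases.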

Feinstein, A. R., Walter, S. D. and Horwitz, R. I. 1986: An analysis of Berkson's bias in case-control studies. Journal of Chronic Diseases 39, 495-504.
Pearl, R. 1929: Cancer and tuberculosis. American Journal of Hygiene 9, 97-159.
Walter, S. D. 1980: Berkson's bias and its control in epidemiological studies. Journal of Chronic Diseases 33, 721-5.
Woodward, M. 2005: Epidemiology: study design and data analysis, 2nd edition. Boca Raton: Chapman & Hall/CRC Press.

beta distribution This is a flexible PROBABILITY DISTRIBUTION, commonly used to describe a proportion. Whereas many of the distributions we encounter are nonzero over an infinite range of values, the beta distribution is nonzero only in the range 0 to 1. By rescaling, it can be useful any time that a distribution is required over a finite range. The distribution is defined by two parameters, r and s, and has the density function:

$$f(x) = \frac{x^{r-1}(1-x)^{s-1}}{B(r, s)}$$

where the $B(r, s)$ term can be viewed as a constant to ensure that the total probability is equal to 1. The MEAN of the beta distribution is $r/(r+s)$ and the VARIANCE is $rs/[(r+s)^2(r+s+1)]$.

The parameters r and s define the shape of the distribution. This shape can be wide ranging, with u-shaped curves, n-shaped curves, strictly increasing/decreasing curves and triangular distributions all possible. Some of the possible distributions are illustrated in the figure. If r and s are equal then the distribution will be symmetric.

Note the similarities to the BINOMIAL DISTRIBUTION. Where the binomial models the distribution of the number of successes, when given the probability of a success, the beta can model the probability of a success given the number of successes. Indeed, in a Bayesian analysis (see BAYESIAN METHODS), the beta distribution is the conjugate prior for the binomial distribution.

The beta distribution is related to a number of other distributions. It contains the uniform distribution over (0,1) as a special case (when r = 1 and s = 1), it is increasingly well approximated by a NORMAL DISTRIBUTION as r and s increase and it can result from constructions of the form A/(A + B) where A and B are both random variables with GAMMA DISTRIBUTIONS. For further details on how the beta distribution relates to other distributions, see Leemis (1986).

The beta distribution is most commonly used to model proportions. Suppose that we wish to estimate the specificity of a test that in trials correctly identifies 50 of the 52 participants that do not have a condition. The usual normal approximation will not suffice since it leads to an interval from 0.91 to 1.01, and a value greater than 1 makes no sense. There are a number of ways to use the beta distribution in estimating the interval (see Brown, Cai and DasGupta, 2001). AGL
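The specificity example can be reproduced numerically. The sketch below is illustrative only: it computes the normal-approximation (Wald) interval for 50 successes out of 52 and, as one of the several beta-based options, an interval from the Beta(50.5, 2.5) distribution that arises under a Jeffreys prior. The choice of the Jeffreys prior, the function names and the crude midpoint-rule quantile are all assumptions made here for the illustration, not the entry's own method.

```python
import math


def beta_mean_var(r, s):
    """Mean and variance of a beta distribution with parameters r and s."""
    mean = r / (r + s)
    var = r * s / ((r + s) ** 2 * (r + s + 1))
    return mean, var


def beta_pdf(x, r, s):
    """Beta density at 0 < x < 1, evaluated on the log scale for stability."""
    log_b = math.lgamma(r) + math.lgamma(s) - math.lgamma(r + s)
    return math.exp((r - 1) * math.log(x) + (s - 1) * math.log(1 - x) - log_b)


def beta_quantile(q, r, s, steps=20000):
    """Crude quantile: accumulate the density using the midpoint rule."""
    width = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * width
        total += beta_pdf(x, r, s) * width
        if total >= q:
            return x
    return 1.0


def wald_interval(successes, n, z=1.96):
    """Normal-approximation interval for a proportion."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def jeffreys_interval(successes, n, level=0.95):
    """Equal-tailed interval from Beta(successes + 0.5, failures + 0.5)."""
    r = successes + 0.5
    s = n - successes + 0.5
    tail = (1 - level) / 2
    return beta_quantile(tail, r, s), beta_quantile(1 - tail, r, s)


wald = wald_interval(50, 52)
jeff = jeffreys_interval(50, 52)
print(f"Normal approximation: {wald[0]:.2f} to {wald[1]:.2f}")
print(f"Beta-based interval:  {jeff[0]:.2f} to {jeff[1]:.2f}")
```

Because the beta distribution is nonzero only between 0 and 1, the beta-based interval cannot extend beyond 1, unlike the normal approximation.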

Brown, L. D., Cai, T. T. and DasGupta, A. 2001: Interval estimation for a binomial proportion. Statistical Science 16, 101-33.
Leemis, L. M. 1986: Relationships among common univariate distributions. The American Statistician 40, 2, 143-6.

[Figure: beta distribution. Illustrating the variety of forms that the beta distribution can take: (a) the uniform distribution over (0,1), (b) a bimodal concave distribution (Jeffreys' prior), (c) a curve with a single mode, (d) a linear function of the proportion, (e) a nonlinear, but still strictly increasing distribution, (f) an example that is well approximated by the normal distribution. Horizontal axis in each panel: proportion of successes]

bias Any experiment, study or measuring process is said to be biased if it produces an outcome that differs from the 'truth' in a systematic way. Bias can occur at any stage of the research process from the literature review through to the publication of the results (Armitage and Colton, 2005).

It is important to distinguish between bias or systematic error, on the one hand, and random error, on the other hand. For example, suppose that we had a population of subjects with a MEAN weight of 80 kg and a STANDARD DEVIATION of 10 kg. If we select a simple random sample of 25 subjects from this population and measure their weights using a well-calibrated set of scales, then it is possible that the mean weight for this sample will be substantially different from 80 kg. In fact, there is about a 1 in 20 chance that the sample mean will be more than 4 kg below or 4 kg above the true mean of 80 kg.

However, simple random sampling produces an unbiased estimate of the true mean weight because, if the process of selecting a simple random sample of 25 subjects and computing the sample mean weight were repeated a large number of times, the distribution of the sample means would be centred around the true mean of 80 kg. The larger the sample size, the closer the sample means will be clustered around the true population mean. In other words, the expected value of the sample mean equals the population mean. In this scenario, there is no bias and any deviation of the observed sample mean from the true value can be accounted for by pure chance, known as random variation or random error.

If, however, the weights of a random sample of subjects were measured using a poorly calibrated set of scales that weighed each subject as being 2 kg heavier than the actual weight, this would lead to a biased estimate of the true population mean weight. The size of this bias, or systematic error, would not be reduced by increasing the sample size and the distribution of the sample mean will be centred around 82 kg rather than 80 kg.

The systematic error in this example is a measurement bias due to a faulty measuring instrument. More generally, measurement bias could be due to such diverse causes as poor questionnaire design, faulty equipment, observer error or respondent error (Silman and Macfarlane, 2002). Examples of observer error include misreading the scale on an instrument, bias in reporting results by an unblinded evaluator in a clinical trial or bias in eliciting information about the exposure history of cases and controls in a CASE-CONTROL STUDY. Examples of respondent error include biased reporting of symptoms by unblinded patients in a clinical trial and bias in recall of exposure history by cases and controls in a case-control study (see BIAS IN OBSERVATIONAL STUDIES).

All types of study are susceptible to design bias. This can arise from many sources, such as SELECTION BIAS (when the subjects selected for study are not representative of the target population), NONRESPONSE BIAS (when there is a systematic difference between the characteristics of those who choose to participate and those who do not) and noncomparability bias (when groups of subjects chosen for comparison in, for example, a case-control study are not in fact comparable). Randomised trials (see CLINICAL TRIALS) are generally regarded as being least susceptible to design biases. The scope for BIAS IN OBSERVATIONAL STUDIES, especially case-control studies, is much greater. Armitage and Colton (2005), Ellenberg (1994), Porta and Last (2001) and Sackett (1979) all provide a comprehensive description of sources of design bias.

Analysis bias arises from errors in the analysis of data. This covers such issues as confounding bias (in which confounding factors have not been appropriately adjusted for in the analysis) and analysis method bias (including inappropriate assumptions about the distribution of variables, faulty strategies for handling MISSING DATA or OUTLIERS, unplanned SUBGROUP ANALYSIS and data dredging) (Armitage and Colton, 2005; Davey Smith and Ebrahim, 2002).

Ensuring that the interpretation of data is unbiased is just as important as ensuring that the processes of design, measurement and analysis are unbiased. Bias in the interpretation of data can be conscious or unconscious and is particularly difficult to address because it involves subjective judgements on the part of the researchers. Kaptchuk (2003) provides an overview of the issues involved.

There is some evidence to suggest that the source of funding for drug studies is related to the outcome. A systematic review by Lexchin et al. (2003) demonstrated a systematic bias in favour of the products made by the company funding the research. The main sources of this bias were thought to be inappropriate selection of treatments to compare against the product being investigated and publication bias. Potential sources of investigator bias are reviewed in detail by Greenland (2009).

Finally, publication bias (see SYSTEMATIC REVIEWS AND META-ANALYSIS) can arise from two main sources. First, researchers are more likely to submit papers for publication if the research produces a statistically and clinically significant result rather than an inconclusive result. Second, journal editors are more likely to publish papers reporting statistically and clinically significant results (Dubben, 2009). WHG
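The weighing example in this entry can be put into numbers. This is a small sketch under stated assumptions: the sample mean is treated as approximately normally distributed, and the variable and function names are introduced here purely for illustration. It shows that the random error (the standard error) shrinks as the sample size grows, whereas the 2 kg offset from the miscalibrated scales does not.

```python
import math
from statistics import NormalDist

TRUE_MEAN = 80.0     # population mean weight (kg), from the example
SD = 10.0            # population standard deviation (kg), from the example
SCALE_OFFSET = 2.0   # systematic error of the faulty scales (kg)


def standard_error(sd, n):
    """Standard deviation of the mean of a simple random sample of size n."""
    return sd / math.sqrt(n)


def prob_outside(margin, sd, n):
    """Chance that the sample mean misses the true mean by more than margin,
    assuming the sample mean is approximately normally distributed."""
    z = margin / standard_error(sd, n)
    return 2 * (1 - NormalDist().cdf(z))


chance = prob_outside(4.0, SD, 25)
print(f"P(sample mean more than 4 kg from 80 kg), n = 25: {chance:.3f}")

for n in (25, 100, 400):
    se = standard_error(SD, n)
    centre = TRUE_MEAN + SCALE_OFFSET
    # Random error shrinks with n; the centre of the biased estimates does not move.
    print(f"n = {n:3d}: standard error = {se:.1f} kg, "
          f"faulty-scale estimates centred on {centre:.0f} kg")
```

With 25 subjects the standard error is 2 kg, so a miss of more than 4 kg corresponds to roughly two standard errors, which is where the figure of about 1 in 20 comes from.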

[See also NONRESPONSE BIAS, SELECTION BIAS]

Armitage, P. and Colton, T. (eds) 2005: Encyclopedia of biostatistics, 2nd edition. New York: John Wiley & Sons, Inc.
Davey Smith, G. and Ebrahim, S. 2002: Data dredging, bias, or confounding. British Medical Journal 325, 1437-8.
Dubben, H.-H. 2009: New methods to deal with publication bias. British Medical Journal 339, b3272.
Ellenberg, J. H. 1994: Selection bias in observational and experimental studies. Statistics in Medicine 13, 557-67.
Greenland, S. 2009: Accounting for uncertainty about investigator bias: disclosure is informative. Journal of Epidemiology and Community Health 63, 593-8.
Kaptchuk, T. J. 2003: Effect of interpretive bias on research evidence. British Medical Journal 326, 1453-5.
Lexchin, J., Bero, L. A., Djulbegovic, B. and Clark, O. 2003: Pharmaceutical industry sponsorship and research outcome and quality: systematic review. British Medical Journal 326, 1167-70.
Porta, M. and Last, J. M. 2008: A dictionary of epidemiology, 5th edition. Oxford: Oxford University Press.
Sackett, D. L. 1979: Bias in analytic research. Journal of Chronic Diseases 32, 51-63.
Silman, A. J. and Macfarlane, G. J. 2002: Epidemiological studies: a practical guide, 2nd edition. Cambridge: Cambridge University Press.

bias in observational studies In an ideal study, an investigator seeks to estimate the effect of an exposure to a factor on an outcome of interest. We might like to be able to look at what happens to a population when the factor is at one level and then turn back time and rerun things at the second level, but that is impossible, of course. Very often it is not even possible or practical to conduct an experiment in which the levels of exposure are controlled, so that one is left with analysing observational data that occur naturally. Bias is any systematic departure from this idealised construct, which is distinct from purely random error, which is zero on average. The latter can be dealt with by reducing variability in the measure of association, which can be accomplished in a variety of ways, including the increase of the overall sample size. However, bias cannot be reduced by increasing the sample size and it can only be controlled through carefully conducted research by an investigator.

There have been attempts to catalogue the types of bias that can occur and these broadly fall into three sources: the selection of study subjects, errors in the information collected and confounding, or entangling the effects with other causes of the outcome (Hill and Kleinbaum, 1998).

In order to discuss the sources of bias in an observational study in more concrete terms, consider a hypothetical epidemiological study in which the results are summarised in a 2 x 2 table (shown in the table). We are interested in studying the association between exposure and disease in a manner that avoids bias. Among the choices of study design from which data for this 2 x 2 table may have arisen are a CROSS-SECTIONAL STUDY, a COHORT STUDY or a CASE-CONTROL STUDY. In a cross-sectional study, N subjects are sampled and the four cell frequencies determined, but in a cohort study, groups of exposed and unexposed subjects are chosen, essentially fixing the row totals, and then the column frequencies are determined by what transpires during the course of follow-up. For a case-control study, the column totals are regarded as fixed and subjects distributed to each row within a column depending on their exposure history, which would usually be gleaned by interview. Fundamental to each of these study designs is the utilisation of a random sample, either overall or within the rows or columns.

bias in observational studies Tabulated results from an epidemiological study with two levels of exposure and disease status

                 Diseased
Exposed      Yes      No       Total
Yes          a        b        a+b
No           c        d        c+d
Total        a+c      b+d      N
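One common summary of association for a 2 x 2 table laid out in this way is the odds ratio, ad/(bc). The following minimal sketch uses hypothetical counts chosen only for illustration (they do not come from the entry) to show how sensitive that estimate can be to a handful of subjects missing from a single cell when the disease is uncommon.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio ad/(bc) for a 2 x 2 table with cells a, b (exposed row)
    and c, d (unexposed row); columns are diseased yes/no."""
    return (a * d) / (b * c)


# Hypothetical cohort: 1000 exposed and 1000 unexposed subjects,
# each group with a 2% risk of disease, so there is no true association.
complete = odds_ratio(a=20, b=980, c=20, d=980)

# The same cohort after 5 exposed, diseased subjects go uncounted:
# only 0.25% of the 2000 subjects, but a quarter of cell a.
incomplete = odds_ratio(a=15, b=980, c=20, d=980)

print(f"Odds ratio with complete data: {complete:.2f}")
print(f"Odds ratio with 5 subjects missing from cell a: {incomplete:.2f}")
```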

Selection bias occurs when the proportion recruited from the target population that is counted in a cell of the 2 x 2 table depends on both the row and the column. One way in which this can occur in a cohort study is if there are differential diagnoses depending on the exposure status. For example, suppose that an exposure of interest occurs in a manufacturing plant that provides health insurance for its employees, but among the unexposed are substantial numbers who are uninsured. If the insured receive regular checkups from their physicians, this may increase the likelihood of a correct diagnosis among those exposed, while similar cases may have been missed for the unexposed that are uninsured. Clearly, this would bias an estimate of the odds ratio that would be calculated from such a study.

Another potential source of such bias in a cohort study may arise from loss to follow-up, e.g. if instead of exposure the investigator is interested in whether a person is using a particular type of treatment. Suppose, however, that the treatment is not only ineffective but also causes unpleasant symptoms that are related to the occurrence of the disease outcome. If the individuals so affected drop out of the study, this would artificially lower the count in this cell of the 2 x 2 table and bias the estimate. Notice that the magnitude of the effect of this selection bias may be substantial, even if the number lost represents a small proportion of the total. This is especially true when the proportion that develops the disease is small, so that the portion lost in a cell of the table is relatively high, even though the number lost represents a small proportion of the overall sample.

In a case-control study, a common source of bias when selecting cases can occur when subjects with a prevalent disease are enrolled into the study, some of whom may have had the disease for some time. Those who have been ill for a long period of time will be more likely to be enrolled if such a study design is used, a phenomenon known as LENGTH-BIASED SAMPLING. If the primary aim of the study is to study the association between exposure and the occurrence of the disease, this will clearly lead to a biased estimate of association, but this could have been avoided by only enrolling newly diagnosed cases instead.

The choice of appropriate controls in a case-control study can be an especially common source of bias. If the cases are selected from among those who are diagnosed at a collaborating set of hospitals, then the controls should ideally be a representative sample of those who are healthy in the catchment areas of those hospitals.
If all hospitals in an area are cooperating with a study, then this could be accomplished by recruiting a random sample of the overall population in the geographic area. Random digit dialling is one approach that has been useful in populations well covered by telephones, but it is becoming more difficult to employ the method with the increasing use of current technologies such as cell phones, caller ID and no-call lists.

In some studies, controls are selected using subjects who have been admitted into the same hospital for a disease that is unrelated to the exposure of interest. This would result in a group of subjects from the same catchment area as the cases, thus avoiding one source of potential selection bias. The estimate of association in such a study would be the difference between the effect of exposure on the disease of interest and its effect on the 'control disease' (Breslow, 1978, 1982). If one has chosen a control disease that is not related to exposure, i.e. the effect is zero, then the estimate of association will be an unbiased estimate of the effect on disease risk. However, it is often difficult to be certain that this is the case because the assumption may just be the result of a lack of knowledge about the aetiology of disease affecting the controls.

A cross-sectional study can be a useful way of obtaining a snapshot of the association between two or more variables at a single point in time, especially if the population chosen for study is of broad interest and a carefully planned method for drawing a random sample has been put in place. Some national health surveys are good examples of such studies, such as those conducted by the National Center for Health Statistics. However, if the aim is to study disease aetiology or other outcomes that evolve over time, then the single snapshot in time can be a serious limitation. For example, in an epidemiological study, subjects with a disease who have been identified by a survey conducted at a single point in time would necessarily be prevalent cases, which is a potential source of bias here as it is in a case-control study.

Information bias in an observational study arises from error in the variables that have been collected as part of the data for each subject in a study. Such errors can either be differential or nondifferential, i.e. random. Differential error in reporting values summarised in a 2 x 2 table would arise if the error rate for reporting the variable in the column depended on the row or vice versa. This would obviously be a potentially important source of bias when estimating an association. However, bias can also arise when the error is nondifferential or purely random. Case-control studies can be prone to information bias because someone with a serious illness may remember their history of exposure to the factor of interest quite differently from a healthy control. This RECALL BIAS can be especially significant when other studies of the exposure of interest have entered the public's consciousness or been reported in the news.
One technique for minimising its elTect is to use a wellstructured interview in which the questions havc been clearly and unambiguously phrased and posed in an identical manner to all subjects in the slUdy. This requires considerable effort on the part of an investigator. in that the questionnaire would need 10 be pre-leslCd and the inten'iewer5 weillrained. Infonnation bias can potentially also atTect a study by subconsciously inftuenc:ing evaluations by interviewers. proressional diagnosticians or eyen laboratory technicians. This could happen if the: individual has a preconceived idea of what the results of a study will be or of the way the results arc going. Thus. it is generally prefem:d that the study hypotheses not be known to those responsible for collecting the dala or that the status of a subject be masked. a procedure in which the person rec:onling the data is said to be blind with respect to the outcome. These measures should reduce the: possibility for differential errors. but nol nondilTercntial errors. While it is intuiti\'ely casy to appreciate that dilTerential cnor of measurement can bias the results of an observational study. nondiffcrential cnor can also have an effect as well. If only a single variable is affected by nondilTcn:ntiai error, Ihc:n

the effecl is generally 10 attenuate the effect. i.e. to bias the estimated association towards the null value of no association. This would tend to make the results or a study with nondilTercntial error in one of the variables c:onsen'alive in the sense: that it would make it morcdifficultlo establish that an estimated association was not due 10 chance alone. Contrariwise. il would also result in an und~imate of an effect. whK:h can be important when trying 10 delennine the public health significance of exposure 10 a particular factor. It is most desirable to minimise information bias during the design and data collection phase ofa study by minimising measurement error, but it is generally not possible to be enUrely successful in these efforts. One approach to correcting for bias at the data analysis phase is to introdUClC a correction factor that takes into account the measurement error. In the case of a 2 x 2 table. formulae ha,'e been provided for this (Barron. 1977: Copeland el al.• 1977) and similar approaches are also available for use in lOGISTIC REORESSJON (Rosner, Spiegelman and Willett. 1990). There is now a rich variety orSlatistical techniques fordeaJing with errors in variables. many of which are described in the text by Canoll. Ruppert and Stefanski (1995). Confounding arises when the estimated effect for an association of interest is entangled with another factor. pemaps one that is well known to be associated with the outcome. It is conceptually relaled to aliasing in design of experiments. in which two effects arc completely entangled. and collinearit, in other contexts. 'The potential for confounding in an observational slUdy of two variables exists when each is associated with a tbinl variable. the confounder. in the presence of the factor orintcrcsL Pn:cisc definitions of confounding go to the heart oflhe objcctives ofobserwlionai studies and various models have been proposed as a theorelical basis ror its eITect (Rubin. 
1974: WlCkramarablc and Holford. 1981). Altemativdy. c:oI/apmbi/il)' is sometimes used as a simple and practical alternalive 10 more fonnal definitions of confounding (Bishop. Fienburg and Holland. 1973). An association is collapsible with respect to a putative confounder ir the estimated association is unchanged whc:a adjusUng for the confounder in the analysis. Approaches for dealing with a potential confounder are in essence to estimate the association holding the value of the confounderconstanl.ln a designed experiment. this would be accomplished by selecting strata or blocks of subjects with identical values ofthe confounder and only vary the exposure of interest within the strata. One way of accomplishing a similar elTect in an observational study is to stratify the data by abe potential confounder and then combine information across the strata. if the elTect is constant. using the MANTEL-HAENml.MElHooorsomethingsimilar(Mantcland HaentszeJ. 1959). Altcmath'ely. one can adjust ror one or IIIOI'e putative confounders by including them in a model. such as the linear logistic model (Hosmer and Lcmcshow.

41

81MODALDISTRIBUTION _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ __ 1989) or an altemaliYe CBlfltAUSfD LIIIEAR UOIB. (McCullqb and Nc:IcIu. 1989) rar D binary Iapoa5e. It is enlird), possible thDlaaobsenationalstudy will not be able to separate out the elrc:cl or an exposun: of interest froID the elrect or ziaother expos~ thai it lhouPl 10 be a aJOfounder. This is not unlike lhc more geaeral problem or eoIlinearily ahal arises in lhc context of IqI'asion anal),sis. In these silUalions. illDB)' onl), be possible to conduct a new study in which the design has been carefully conllnlcteci so that one can tcase apaJt lhc separatc contributions of a faclOr of interest from iRs confounder. TRH ........ B. A. 1917: nc eft'ectsof milclassikation on the eslimlle ofldati~risk.BiDml'triull.414_18. ~ Y.M.M., ........ S. It. aad RGIIaDd, P. W. 1973: Distnte mu/tiWll'me QllQ/yJis: tht!rNyamlpraclice. C_bridp, MA: MIT ~ss. BnIIow, N. 1978: The prupodionaI hazards madel: appIicatians in epidemioioU. Comtnlllfiarliolu in Sltltislirs - 'Th«Iry _ MlIlMRflllifI A7. 4. 315-32. BnsIcnr, N. 1912: Dcsip aad aall)'sis of auc-am1rOl 5IUdicL Annual R~rieM' of Publk Health 3, 29-54. CarnII, R. S., 1bIpptrt. Do ad SWWMli. L. A. 1995: Mmsumnml nror ill nonJiMtIT motIels. LoacIon: a.pmaa a: Hall. CopeIwd. K. T., ClltckGWaJ'.H.,M~A.J.ad RoI~R.H.I917: Bias to misclusificatioD in the estilDltiaa of n:1aIM: risk. AmeritIDI Jollnllll ofEpitkmiDIolY lOS. 488-05. HII, H. A. .... o.O.I998:Biasiaobscmlicnlstudie.s.InAmlitap.P.andCoitan, T. (cds). EIIcyt'lop«lill of biosllllmics. ~r: John W'aIe)' &: Saus.LId. B_r,D. W.... ............,S.1989:AppWloginr: regrasirHl. New VOlt: John Waley a: Sons. Inc. M...., N. ad .......*W. 1959: SlaIisticaJ aspects of the _ysis of daIa from mrospedivc studies of disease. Journal of 1M NtltiolltlJ Cllllc:er lrulilule 22, 71~. MtOa....... P. aM NeIder, J. A. 1989: GeneTaliMtiliMtD'motkIs. Loadoa: Chapnan a: Hall ROIDIr, B., SpIeaI>·n. Do . . W. . . . \V. C. 
1990: Correction of logistic regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. American Journal of Epidemiology 132, 734-45. Rubin, D. B. 1974: Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology 66, 688-701. Wickramaratne, P. J. and Holford, T. R. 1987: Confounding in epidemiologic studies: the adequacy of the control group as a measure of confounding. Biometrics 43, 751-65.

bimodal distribution This is a PROBABILITY DISTRIBUTION or a FREQUENCY DISTRIBUTION with two modes. Often the two modes in the distribution correspond to the data arising from two distinct populations. The first figure shows a bimodal density function arising from a weighted sum of two NORMAL DISTRIBUTIONS (a FINITE MIXTURE DISTRIBUTION). An example of a histogram with two distinct modes is shown in the second figure. The data here correspond to the sizes of myelinated lumbosacral ventral root fibres taken from a kitten of a particular age. The first mode is associated with axons of gamma neurons and the second with alpha neurons. Other examples of medical bimodal distributions are the age of incidence of Hodgkin's lymphoma and the speed of inactivation of the drug isoniazid in US adults. An account of the use of bimodal distributions in a medical setting is given in Hagberg et al. (2001). BSE

[Figure: bimodal distribution. Finite mixture distribution; density plotted against x]

[Figure: bimodal distribution. Histogram with two distinct modes; horizontal axis: fibre size (mm, scaled)]

Hagberg, G. E., Zito, G., Patria, F. and Sanes, J. N. 2001: Improved detection of event-related functional MRI signals using probability functions. NeuroImage 14, 1193-205.
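The weighted sum of two normal densities described in this entry can be sketched in a few lines. The weights, means and standard deviations below are illustrative assumptions rather than values from the figure, and `count_modes` is a hypothetical helper that simply counts local maxima on a grid. The sketch also shows that a two-component mixture is not automatically bimodal: the components must be sufficiently separated.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weight, mean1, sd1, mean2, sd2):
    """Weighted sum of two normal densities (a finite mixture)."""
    return (weight * normal_pdf(x, mean1, sd1)
            + (1.0 - weight) * normal_pdf(x, mean2, sd2))


def count_modes(density, lo, hi, steps=2000):
    """Count interior local maxima of a density evaluated on a grid."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    ys = [density(x) for x in xs]
    return sum(1 for i in range(1, steps) if ys[i - 1] < ys[i] > ys[i + 1])


def well_separated(x):
    # Assumed parameters: equal weights, means 3 and 7, unit sd.
    return mixture_pdf(x, 0.5, 3.0, 1.0, 7.0, 1.0)


def overlapping(x):
    # Assumed parameters: the means are only one sd apart.
    return mixture_pdf(x, 0.5, 4.5, 1.0, 5.5, 1.0)


print(count_modes(well_separated, 0.0, 10.0))  # two modes
print(count_modes(overlapping, 0.0, 10.0))     # a single mode
```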

binomial distribution This is the PROBABILITY DISTRIBUTION of the number of 'successes', X, in a series of n independent trials, where the probability of a success is p for each trial. Specifically the distribution is given by:

\[ \Pr(X = x) = \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x}, \qquad x = 0, 1, 2, \ldots, n \]

where n! (factorial n) is the product of all the integers up to and including n and 0! is defined to be 1. The mean of the distribution is np and its variance np(1 - p). Some binomial distributions with n = 10 and different values of p are shown in the figure. The distribution often occurs in medicine as the basis for testing the hypothesis that the


probability of some event of interest takes a particular value. For example, a researcher may postulate that 10% of a population is infected with a virus and, on sampling 20 people at random from the population, finds that 6 people have the virus. Is there any evidence that the infection rate is higher than the hypothesised value of 10%? To answer this question a P-value can be computed from the binomial distribution as the probability that 6 or more people in the 20 sampled have the virus when the probability that a person is infected is 0.1, i.e. the sum:

\[ \sum_{x=6}^{20} \frac{20!}{x!\,(20-x)!}\, (0.1)^{x} (0.9)^{20-x} \]

The resulting value is 0.01, giving strong evidence that the infection rate is larger than 10%. As well as testing for a specific proportion, the binomial distribution can be used in calculating CONFIDENCE INTERVALS for a proportion. Villanueva et al. (2003) use the binomial distribution to estimate confidence intervals for the proportion of adverts in medical journals with inaccurate claims. More details of the binomial distribution can be found in Altman (1991). BSE/AGL

[Figure: binomial distribution. Binomial distributions for various values of n and p; each panel plots probability against number of successes]
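The P-value in the infection example can be checked directly from the probability function. This is a minimal sketch using only the standard library; the function names are illustrative.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Pr(X = x) for a binomial distribution with n trials, success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


def upper_tail(k, n, p):
    """Pr(X >= k): probability of k or more successes in n trials."""
    return sum(binomial_pmf(x, n, p) for x in range(k, n + 1))


# Six or more infected people among 20 sampled when the true rate is 10%.
p_value = upper_tail(6, 20, 0.1)
print(round(p_value, 3))  # about 0.011, i.e. the 0.01 quoted in the text

# Mean and variance computed from the probability function agree with np and np(1 - p).
mean = sum(x * binomial_pmf(x, 20, 0.1) for x in range(21))
variance = sum((x - mean) ** 2 * binomial_pmf(x, 20, 0.1) for x in range(21))
print(round(mean, 6), round(variance, 6))
```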

Altman, D. G. 1991: Practical statistics for medical research. London: Chapman & Hall. Villanueva, P., Peiró, S., Librero, J. and Pereiró, I. 2003: Accuracy of pharmaceutical advertisements in medical journals. Lancet 361, 27-32.

bioinformatics This is a term for the coming together of molecular biology, computer science, mathematics and statistics to deal with the ever-expanding genomic and proteomic databases, which are themselves the result of rapid technological advances in DNA sequencing, gene expression measurement and macromolecular structure determination. In many cases such techniques give rise to HIGH DIMENSIONAL DATA. A comprehensive account of bioinformatics is given in Zvelebil and Baum (2007). BSE

Zvelebil, M. and Baum, J. 2007: Understanding bioinformatics. Garland Science.

birth cohort studies These are studies established to examine growth, development and health of children from birth. However, given sufficient follow-up they also provide

insights into influences on adult disease that operate throughout the life course. In principle, a birth cohort study is one in which all study participants are recruited at birth and then followed over time. The cohort is defined by the location and the time period in which the participants were born, which may be those born in one week or over a period of a year or more. The members of the cohort are then followed up at various time points to ascertain risk factors and health outcomes. As the cohort ages the focus of the research tends to shift. In the early years, the emphasis tends to be on childhood growth and development and risk of childhood illness, but as the cohort matures adult risk factors such as smoking and obesity and health measures such as blood pressure start to be of greater interest. Outcome variables in childhood such as height can later be considered as risk factors when assessing chronic disease later in life. Intergenerational and genetic factors are of interest and information on family members is included in many birth cohorts (Lawlor, Andersen and Batty, 2009).

The first cohort to be established using recruitment at birth was the National Survey of Health and Development, which studied babies born in Britain in the first week of March 1946 (Wadsworth et al., 2006). The cohort has been followed up on more than 20 occasions since birth, the latest being at age … years. Contacts with the participants have been by postal questionnaire, home and clinic visits, and through schools and links with health and educational professionals. Britain has two other birth cohort studies conducted on similar lines. They comprise those born between 3 and 9 March 1958 (Power and Elliott, 2006) and between 5 and 11 April 1970 (Elliott and Shepherd, 2006) respectively. Both studies have included a number of follow-ups that have given insights into the growth and development of these cohorts through childhood, adolescence and into adulthood. Cross-cohort comparisons have also been possible and have allowed examination of secular trends, e.g. into Crohn's disease, ulcerative colitis and irritable bowel syndrome (Ehlin et al., 2003). The Millennium Cohort was recruited in a different way (Smith and Joshi, 2002), and a further cohort study is planned for births in 2012.

Birth cohort studies are not of course confined to Britain, though until recently comprehensive national coverage has rarely been attempted elsewhere. The Scandinavian countries have well-developed linkage systems and in Norway and Denmark studies of over 100 000 births have been launched (Olsen et al., 2001; Magnus et al., 2006), and the United States has embarked on a study of a similar scale (Branum et al., 2003, and see http://www.nationalchildrensstudy.gov). Frequently, birth cohorts are located in one town or city. For example, the Pelotas Birth Cohort Study in Brazil recruited all births in the city of Pelotas during 1982. It represents a good example of a birth cohort study with long-term follow-up in a developing country (Victora and Barros, 2006).

Many birth cohorts have been defined retrospectively. Thus births in a defined geographical area during a specified time period are identified from established records. The data can then be linked to other standard records such as death indices, or the study population can be traced and those still alive can be assessed by post or by interview. An example of this is the birth records of the 1920s and 1930s from the English county of Hertfordshire that were extracted in the 1980s. The population was traced through the National Health Service Central Register and details of deaths and current general practitioner addresses obtained. This allowed not only an analysis of mortality in relation to birth and infant weight but also enabled follow-up of the survivors to examine them for risk factors for chronic conditions such as cardiovascular disease (Syddall et al., 2005).

Some retrospectively defined birth cohorts have focused on particular events that gave rise to extreme living conditions. For example, those born in Amsterdam in 1944-1945 around the time of the famine imposed by the German occupation have been followed up to assess the impact of famine at key stages of pregnancy and early life (Roseboom et al., 2001). Similarly, a cohort of men born in 1916-1935 was identified from one district in Leningrad, a third of whom had experienced starvation during the siege of Leningrad in 1941-1944 when they were around the age of puberty (Sparén et al., 2004). The whole cohort was followed up and invited to take part in health examinations to assess the long-term effects of the famine.

There is also interest in defining birth cohorts at an earlier time point than birth. A child's growth and development begins before birth and so characterisation of aspects of pregnancy is considered important in determining the long-term influences on the offspring's health.
The Avon Longitudinal Study of Parents and Children (ALSPAC) recruited 14 000 pregnant women resident in the English county of Avon whose expected dates of delivery were between 1 April 1991 and 31 December 1992. The women and their offspring have been followed up by means of postal questionnaires on many occasions and a subsample known as the Children in Focus was seen at clinics 10 times before the age of seven years. From that age onwards clinics began for the entire cohort (Golding et al., 2001).

Taking this one step further, with an increasing focus on the very early origins of life, two cohort studies have recruited women before pregnancy. The first of these recruited some 2500 women in six villages near Pune in India. Of these, over 1000 became pregnant and full data were obtained on nearly 800 births. This cohort has now been followed up into adolescence (Rao et al., 2001). In the UK, the Southampton Women's Survey recruited over 12 500 women aged 20 to 34 years when they were not pregnant and over 3000 of them were studied throughout subsequent pregnancies and the children are being followed up (Inskip et al., 2006). The recently launched US National Children's Study is mainly recruiting women in the first trimester of pregnancy but is also including some women before conception (http://www.nationalchildrensstudy.gov).

Birth cohort studies have many strengths. Usually they capture a cross-section of the population and they have all the advantages of longitudinal studies. However, the weakness is that over the life course a large percentage of a prospectively defined birth cohort tends to drop out. Many DROPOUTS are due to death as the cohort ages or to migration out of the region or country of study. Persistent questioning and requests to attend clinics or be visited at home add to the attrition, as some participants feel that they have contributed enough and their motivation wanes. The remaining cohort may no longer represent the general population. Retrospectively defined cohorts can suffer less from this problem but then they often lack sufficient data on the early years. HI
(See also COHORT STUDIES)

Branum, A. M., Collman, G. W., Correa, A. et al. 2003: National Children's Study of environmental effects on child health and development. Environmental Health Perspectives 111, 642-6. Ehlin, A. G. C., Montgomery, S. M., Ekbom, A., Pounder, R. E. and Wakefield, A. J. 2003: Prevalence of gastrointestinal diseases in two British national birth cohorts. Gut 52, 1117-21. Elliott, J. and Shepherd, P. 2006: Cohort profile: 1970 British Birth Cohort (BCS70). International Journal of Epidemiology 35, 836-43. Golding, J., Pembrey, M., Jones, R. and the ALSPAC Study Team 2001: ALSPAC - The Avon Longitudinal Study of Parents and Children I. Study methodology. Paediatric and Perinatal Epidemiology 15, 74-87. Inskip, H. M., Godfrey, K. M., Robinson, S. M., Law, C. M., Barker, D. J. P. and Cooper, C. 2006: Cohort profile: The Southampton Women's Survey. International Journal of Epidemiology 35, 42-8. Lawlor, D. A., Andersen, A. M. and Batty, G. D. 2009: Birth cohort studies: past, present and future. International Journal of Epidemiology 38, 897-902. Magnus, P., Irgens, L. M., Haug, K., Nystad, W., Skjaerven, R. and Stoltenberg, C. 2006: Cohort profile: The Norwegian Mother and Child Cohort Study (MoBa). International Journal of Epidemiology 35, 1146-50. Olsen, J., Melbye, M., Olsen, S. F. et al. 2001: The Danish National Birth Cohort - its background, structure and aim. Scandinavian Journal of Public Health 29, 300-7. Power, C. and Elliott, J. 2006: Cohort profile: 1958 British Birth Cohort (National Child Development Study). International Journal of Epidemiology 35, 34-41. Rao, S., Yajnik, C. S., Kanade, A., Fall, C. H. D., Margetts, B. M., Jackson, A. A., Shier, R., Joshi, S., Rege, S., Lubree, H. and Desai, B. 2001: Intake of micronutrient-rich foods in rural Indian mothers is associated with the size of their babies at birth: Pune Maternal Nutrition Study. Journal of Nutrition 131, 1217-24. Roseboom, T. J., van der Meulen, J. H. P., Osmond, C., Barker, D. J. P., Ravelli, A. C. J. and Bleker, O. P. 2001: Adult survival after prenatal exposure to the Dutch famine 1944-45. Paediatric and Perinatal Epidemiology 15, 220-5. Smith, K. and Joshi, H. 2002: The Millennium Cohort Study. Population Trends 107, 30-4. Sparén, P., Vågerö, D., Shestov, D. B., Plavinskaja, S., Parfenova, N., Hoptiar, V., Paturot, D. and Galanti, M. R. 2004: Long term mortality after severe starvation during the siege of Leningrad: prospective cohort study. British Medical Journal 328, 11. Syddall, H. E., Aihie Sayer, A., Dennison, E. M., Martin, H. J., Barker, D. J. P., Cooper, C. and the Hertfordshire Cohort Study Group 2005: Cohort profile: The Hertfordshire Cohort Study. International Journal of Epidemiology 34, 1234-42. Victora, C. G. and Barros, F. C. 2006: Cohort profile: The 1982 Pelotas (Brazil) Birth Cohort Study. International Journal of Epidemiology 35, 237-42. Wadsworth, M., Kuh, D., Richards, M. and Hardy, R. 2006: Cohort profile: The 1946 National Birth Cohort (MRC National Survey of Health and Development). International Journal of Epidemiology 35, 49-54.

biserial correlation See CORRELATION

bivariate boxplot This is a two-dimensional analogue of the BOXPLOT for univariate data, which is based on calculating 'robust' measures of location, scale and correlation. It consists essentially of a pair of concentric ellipses, one of which (the 'hinge') includes 50% of the data and the other (called the 'fence') which delineates potential troublesome outliers. In addition, resistant regression lines of both y on x and x on y are shown, with their intersection showing the bivariate location estimator. The acute angle between the regression lines will be small for a large absolute value of correlations and large for small absolute values. Details of the construction of the bivariate boxplot are given in Goldberg and Iglewicz (1992). This type of boxplot may be useful in indicating the distributional properties of the data and in identifying possible outliers. An example for a SCATTERPLOT of the number of manufacturing enterprises employing more than 20 people against the pollution level as measured by sulfur dioxide concentration for a number of US cities is shown in the figure; five cities are indicated as outliers, four that have more than a thousand manufacturing enterprises, but one, Providence, that has only a relatively small number of manufacturing enterprises. BSE
(See also SCATTERPLOT MATRICES)

[Figure: bivariate boxplot. Scatterplot of sulfur dioxide concentration against number of manufacturing enterprises for cities in the USA, showing the bivariate boxplot of the data; horizontal axis: number of manufacturing enterprises employing 20 or more workers]

Goldberg, K. M. and Iglewicz, B. 1992: Bivariate extensions of the boxplot. Technometrics 34, 307-20.
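The statement about the acute angle between the two regression lines can be illustrated numerically. The sketch below is an assumption-laden simplification: it uses the ordinary least-squares slopes implied by a correlation r and two standard deviations, not the resistant lines of Goldberg and Iglewicz, and the function name is hypothetical.

```python
import math


def acute_angle_between_regression_lines(r, sd_x=1.0, sd_y=1.0):
    """Acute angle (degrees) between the y-on-x and x-on-y least-squares lines.

    Both lines are drawn in the same (x, y) plane. The y-on-x line has slope
    r * sd_y / sd_x; the x-on-y line satisfies dx/dy = r * sd_x / sd_y.
    """
    d1 = (sd_x, r * sd_y)      # direction vector of the y-on-x line
    d2 = (r * sd_x, sd_y)      # direction vector of the x-on-y line
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    dot = d1[0] * d2[0] + d1[1] * d2[1]
    return math.degrees(math.atan2(abs(cross), abs(dot)))


for r in (0.0, 0.3, 0.5, 0.9, 1.0):
    print(r, round(acute_angle_between_regression_lines(r), 2))
```

The angle shrinks from a right angle at r = 0 to zero at r = 1, and depends only on the magnitude of r when the two standard deviations are fixed.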

bivariate distribution For each and every pair of feasible values, the probability that a pair of variables will take those values. This then is a natural extension of the idea of a univariate probability distribution, applicable when we are measuring two paired variables, e.g. a person's height and weight. If the two variables are independent then the bivariate distribution will not be of particular interest. When they are correlated, as with height and weight, however, it becomes important to consider the bivariate distribution. If we were to look at a sample we might see that 20% of the sample are over 6 feet tall and 20% of the sample are under 50 kg in weight, but only by looking at the bivariate distribution would we know that there were no (or few) people who were both over 6 feet tall and under 50 kg in weight. Whereas univariate distributions can usually be depicted in a BAR CHART or HISTOGRAM, bivariate distributions, due to their additional dimension, cannot. If the two variables are categorical then a simple cross-tabulation will probably be most informative, while if the two variables are continuous a SCATTERPLOT will probably be appropriate. For example, whereas in previous univariate work McLaren et al. (2000) have separately looked at the distributions of red blood cell volume and haemoglobin levels when identifying anaemia, in a more sophisticated approach, McLaren et al. (2001) employ a bivariate distribution of red cell volume and haemoglobin. The most commonly encountered bivariate distribution is the BIVARIATE NORMAL DISTRIBUTION, a special case of the MULTIVARIATE NORMAL DISTRIBUTION. AGL

McLaren, C. E., Kambour, E. L., McLachlan, G. J., Lukaski, H. C., Li, X., Brittenham, G. M. and McLaren, G. D. 2000: Patient-specific analysis of sequential haematological data by multiple linear regression and mixture distribution modelling. Statistics in Medicine 19, 83-98. McLaren, C. E., Cadez, I. V., Smyth, P. and McLachlan, G. J. 2001: Classification of disorders of anemia on the basis of mixture model parameters. Technical Report 01-56. Irvine: Information and Computer Science Department, University of California.
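The height and weight illustration can be made concrete with a small cross-tabulation. The counts below are invented for illustration only; they are chosen so that each margin is 20% while the joint cell is empty, which is exactly the situation the entry describes.

```python
# Hypothetical counts for 100 people, cross-classified by height and weight.
counts = {
    ("over 6 ft", "under 50 kg"): 0,
    ("over 6 ft", "50 kg or more"): 20,
    ("6 ft or under", "under 50 kg"): 20,
    ("6 ft or under", "50 kg or more"): 60,
}
total = sum(counts.values())


def marginal(level, axis):
    """Marginal proportion for one level of height (axis 0) or weight (axis 1)."""
    return sum(c for key, c in counts.items() if key[axis] == level) / total


p_tall = marginal("over 6 ft", 0)
p_light = marginal("under 50 kg", 1)
p_both = counts[("over 6 ft", "under 50 kg")] / total

print(p_tall, p_light)   # the two univariate summaries: 20% each
print(p_both)            # the joint proportion: no one is both tall and light
print(p_tall * p_light)  # what independence would have predicted (about 4%)
```

The two margins alone cannot distinguish this table from one in which 4% of people fall in the joint cell; only the bivariate distribution shows the difference.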

bivariate normal distribution This is a special case of the multivariate normal distribution with two variables and the most common example of a bivariate distribution. The bivariate normal distribution is worthy of mention because of all the multivariate normal distributions it is the most commonly used, the easiest to illustrate and the easiest to write out in mathematical notation. Given two variables, X and Y, the probability density function of the bivariate normal distribution is defined by the means of X and Y (here denoted $\mu_X$ and $\mu_Y$ respectively), the STANDARD DEVIATIONS of X and Y (here denoted $\sigma_X$ and $\sigma_Y$ respectively) and the CORRELATION of X and Y (denoted $\rho$). Given these values, the probability density at the values x and y, f(x, y), is:

\[ f(x, y) = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left[ \frac{(x-\mu_X)^2}{\sigma_X^2} - \frac{2\rho (x-\mu_X)(y-\mu_Y)}{\sigma_X \sigma_Y} + \frac{(y-\mu_Y)^2}{\sigma_Y^2} \right] \right\} \]

This formula may not appear particularly pleasant, but is easier to handle than for higher variate normal distributions because of the single correlation involved. For further details of the distribution see Chatfield and Collins (1980) and Grimmett and Stirzaker (1992). When graphed as a SCATTERPLOT, data from this distribution will appear as a cluster of points in an approximately elliptical shape with the density of the points being greatest at the centre of the ellipse. The location of the ellipse will be dependent on the two means, while the standard deviations and correlation determine the angle and spread of the ellipse. The ellipse gets 'narrower' as the magnitude of the correlation increases and approaches a straight line at $\rho = 1$ or $-1$. Suryapranata et al. (2001) use the bivariate normal distribution in order to compare simultaneously the clinical effectiveness and cost-effectiveness of two treatments for patients with acute myocardial infarction. By drawing a graph of the difference in effect against the difference in cost they were able to illustrate a CONFIDENCE INTERVAL for the differences between the two treatments as an ellipse. A convenient property of the bivariate normal distribution is the fact that the marginal distributions of the two variables are univariate normal: i.e. if Y is ignored, then X by itself has a NORMAL DISTRIBUTION (and vice versa). Also, the conditional distributions of X and Y are normal. To put this another way, if Y is observed to take a particular value, then the unknown value of X still has a normal distribution given this knowledge. AGL

Chatfield, C. and Collins, A. J. 1980: Introduction to multivariate analysis. London: Chapman & Hall. Grimmett, G. R. and Stirzaker, D. R. 1992: Probability and random processes, 2nd edition. Oxford: Clarendon Press. Suryapranata, H., Ottervanger, J. P., Nibbering, E., van't Hof, A. W. J., Hoorntje, J. C. A., de Boer, M. J., Al, M. J. and Zijlstra, F. 2001: Long term outcome and cost-effectiveness of stenting versus balloon angioplasty for acute myocardial infarction. Heart 85, 667-71.
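The density and the conditional-normal property can be sketched as follows. The conditional mean and standard deviation used here are the standard results for the bivariate normal (mean $\mu_X + \rho\sigma_X (y-\mu_Y)/\sigma_Y$, standard deviation $\sigma_X\sqrt{1-\rho^2}$), which the entry states in words but does not write out; the numerical parameters are invented for illustration.

```python
import math


def normal_pdf(x, mean, sd):
    """Univariate normal density."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Bivariate normal density f(x, y)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    one_minus_r2 = 1.0 - rho * rho
    quad = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / one_minus_r2
    norm = 2.0 * math.pi * sigma_x * sigma_y * math.sqrt(one_minus_r2)
    return math.exp(-0.5 * quad) / norm


def conditional_x_given_y(y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Mean and sd of the (normal) distribution of X once Y = y is observed."""
    mean = mu_x + rho * sigma_x * (y - mu_y) / sigma_y
    sd = sigma_x * math.sqrt(1.0 - rho * rho)
    return mean, sd


# Hypothetical parameters: X is height in cm, Y is weight in kg.
params = dict(mu_x=175.0, mu_y=75.0, sigma_x=7.0, sigma_y=12.0, rho=0.5)

joint = bivariate_normal_pdf(180.0, 80.0, **params)
cond_mean, cond_sd = conditional_x_given_y(80.0, **params)
# The joint density factorises as (marginal of Y) x (conditional of X given Y).
product = (normal_pdf(80.0, params["mu_y"], params["sigma_y"])
           * normal_pdf(180.0, cond_mean, cond_sd))
print(joint, product)  # the two values agree up to rounding error
```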

Bland-Altman plot See LIMITS OF AGREEMENT

blinding See CLINICAL TRIALS, CRITICAL APPRAISAL

blocked randomisation See RANDOMISATION

Bonferroni correction This correction is used when performing multiple significance tests in order to avoid an excess of false positives (Shaffer, 1995). Suppose, for example, STUDENT'S t-TEST is to be applied to sample data on six variables to assess mean differences in two populations of interest. If the NULL HYPOTHESIS of no difference in means holds for each of the six variables, and each of the six tests is performed at the 5% SIGNIFICANCE LEVEL, the probability of falsely rejecting the equality of at least one pair of means is 0.26 (this assumes the variables are independent), a fivefold increase over the nominal significance level. The Bonferroni correction approach to this problem involves using a significance level of $\alpha/n$ rather than $\alpha$ for each of the n tests to be performed. For a small number of multiple tests (up to about 5) this method provides a simple and acceptable answer to the problem of inflating the Type I error. The correction is, however, highly conservative and not recommended if large numbers of tests are to be applied, particularly since its use can lead to the rather unsatisfactory situation where many tests are significant at the $\alpha$ level but none at level $\alpha/n$ (Perneger, 1998). In addition, the Bonferroni correction ignores the degree to which the variables may be correlated, which again leads to conservatism when such correlations are substantial. BSE
[See also MULTIPLE COMPARISON PROCEDURES]

Perneger, T. V. 1998: What's wrong with Bonferroni adjustments? British Medical Journal 316, 1236-8. Shaffer, J. P. 1995: Multiple hypothesis testing. Annual Review of Psychology 46, 561-84.
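The arithmetic behind the figure of 0.26 is short enough to verify directly. The sketch assumes independent tests, as the entry does; the function names are illustrative.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false rejection among n independent tests,
    each carried out at level alpha, when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_level(alpha, n_tests):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / n_tests


uncorrected = familywise_error_rate(0.05, 6)
per_test = bonferroni_level(0.05, 6)
corrected = familywise_error_rate(per_test, 6)

print(round(uncorrected, 2))  # 0.26, about five times the nominal 0.05
print(round(per_test, 4))     # each test is now carried out at about 0.0083
print(corrected < 0.05)       # True: the overall error rate is held below 0.05
```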

boosting This is a class of optimization algorithms that can be applied to fit a number of classical and modern statistical models. It originated in machine learning and computer science (Meir and Rätsch, 2003; Schapire, 2003) but has been adopted in statistics as well. From a statistical point of view, boosting works by iteratively fitting residuals obtained from rather simple regression models (Bühlmann and Hothorn, 2007) (see MULTIPLE LINEAR REGRESSION). These models are called base-learners and determine the structure of the final model which, in essence, is the sum of all base-learners. The method is attractive because it can be applied to multiple linear regression, LOGISTIC REGRESSION, classification, SURVIVAL ANALYSIS, robust regression (see ROBUSTNESS), QUANTILE REGRESSION, etc. Furthermore, the regression relationship can be restricted to linear or additive functions, which facilitates interpretation of the final model. Unlike RANDOM FORESTS, boosting is sensitive to its most important hyperparameter, the number of iterations of the algorithm. Too large a value will cause overfitting. Thus, cross-validation techniques have to be applied to determine an appropriate number of iterations. The algorithm is especially useful for model fitting for HIGH-DIMENSIONAL DATA, i.e. when the number of observations is smaller than the number of explanatory variables. Models fitted by boosting algorithms have been successfully applied to weight estimation for foetuses by three-dimensional ultrasound imaging or for predicting cancer subtypes based on gene expression and single nucleotide polymorphisms (SNPs) data. TH

Bühlmann, P. and Hothorn, T. 2007: Boosting algorithms: regularization, prediction and model fitting. Statistical Science 22(4), 477-505. Meir, R. and Rätsch, G. 2003: An introduction to boosting and leveraging. In Advanced lectures on machine learning (LNAI 2600). Schapire, R. E. 2003: The boosting approach to machine learning: an overview. In Denison, D. D., Hansen, M. H., Holmes, C., Mallick, B. and Yu, B. (eds), Nonlinear estimation and classification. New York: Springer.
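The idea of repeatedly fitting simple base-learners to residuals can be sketched for squared-error loss. This is one common variant (componentwise least-squares boosting with a shrinkage factor), offered as an illustration under stated assumptions rather than as the algorithm of any particular package: the covariates are assumed to be centred, the step length `nu` and the toy data are invented, and each base-learner is a straight line through the origin in a single covariate.

```python
def l2_boost(X, y, n_iter=100, nu=0.1):
    """Componentwise least-squares boosting (a sketch).

    X is a list of rows with centred columns; y is the response.
    Returns the intercept and one coefficient per column.
    """
    n, p = len(X), len(X[0])
    cols = [[row[j] for row in X] for j in range(p)]
    intercept = sum(y) / n
    coef = [0.0] * p
    resid = [yi - intercept for yi in y]
    for _ in range(n_iter):
        best_j, best_beta, best_rss = None, 0.0, float("inf")
        for j, col in enumerate(cols):
            sxx = sum(v * v for v in col)
            if sxx == 0.0:
                continue
            # Base-learner: least-squares fit of the residuals on covariate j.
            beta = sum(v * r for v, r in zip(col, resid)) / sxx
            rss = sum((r - beta * v) ** 2 for v, r in zip(col, resid))
            if rss < best_rss:
                best_j, best_beta, best_rss = j, beta, rss
        if best_j is None:
            break
        # Add a shrunken copy of the best base-learner to the model.
        coef[best_j] += nu * best_beta
        resid = [r - nu * best_beta * v for v, r in zip(cols[best_j], resid)]
    return intercept, coef


# Toy data (invented): y depends on the first covariate only, y = 3 + 2 * x1.
X = [[-2, 1], [-1, -1], [0, 0], [1, -1], [2, 1]]
y = [-1, 1, 3, 5, 7]

print(l2_boost(X, y, n_iter=5))    # few iterations: coefficient shrunk towards 0
print(l2_boost(X, y, n_iter=100))  # many iterations: close to the least-squares fit
```

The number of iterations acts as the regularisation parameter: stopping early leaves the coefficients shrunken, which is why it is usually chosen by cross-validation.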

bootstrap The bootstrap is a computationally intensive technique for statistical inference, which can be used when the assumptions that underpin much of classical statistical inference are questionable. This may be because the data are not normally distributed or the dataset is small so that theoretical results based on large sample theory are inapplicable. For example, the bootstrap can be used to estimate the BIAS and STANDARD ERROR of parameter estimates together with CONFIDENCE INTERVALS.

In effect, as we illustrate in the figure, the bootstrap is a data resampling technique. It was formally introduced by Efron (see the discussion in Efron and Tibshirani, 1993) and, although it has a sound theoretical basis, the idea that there is something magical about it is reflected in its name. The term bootstrap derives from the phrase to pull oneself up by one's bootstraps, widely thought to be based on one of the 18th century Adventures of Baron Munchausen. The Baron found himself at the bottom of a deep lake and saved himself by hauling himself up by his bootstraps.

[Figure: three rows showing the population (parameter value $\theta$), the sample of 12 individuals drawn from it (parameter estimate $\hat{\theta}$) and seven bootstrap samples of size 12 (parameter estimates $\hat{\theta}^*_1, \ldots, \hat{\theta}^*_7$)]

bootstrap Schematic illustration of bootstrapping

We will describe the idea using the figure. Suppose we have a population in which the true value of a quantity of interest, say adult height, is denoted by $\theta$. We wish to estimate $\theta$ and take a sample of 12 individuals from this population. In the figure the population is denoted by the large rectangle in the first row (note that the numbers identify population members and are not their adult heights). In this population, the 12 individuals to be included in the sample are numbered. They comprise the actual sample, which is shown in the second row. Our estimate of adult height, calculated from this sample, is denoted by $\hat{\theta}$. In order to quantify how close $\hat{\theta}$, the estimate of adult height in our sample, is likely to be to $\theta$, the actual adult height in the population, we need at the very least to estimate the variance of $\hat{\theta}$.

Imagine doing this in the following way. Take a large number, say B, of samples of size 12 from the population. In each of these samples, calculate an estimate of adult height. Call these estimates $\hat{\theta}_1, \ldots, \hat{\theta}_B$. Then estimate the variance of $\hat{\theta}$ by the sample variance of $(\hat{\theta}_1, \ldots, \hat{\theta}_B)$. Of course, this approach is impossible in practice; if we could afford to draw B extra samples of size 12, we would have drawn a much larger sample initially! However, an approximation to it can be achieved as follows. Suppose we sample with replacement from the 12 observations in the data (second row in the figure) to form a 'subsample', also of size 12. Seven possible such 'subsamples' are shown in the third row of the figure. For example, the first subsample, shown in the first rectangle in the third row, consists of the following observations (note some observations will occur more than once, and some not at all): (1, 2, 2, 2, 3, 3, 5, 6, 7, 7, 8, 11). These 'subsamples' are known as bootstrap samples. Using each of these bootstrap samples we calculate an estimate of adult height. By convention, these are denoted with a '*' to indicate they have been calculated from a bootstrap sample. From the seven bootstrap samples in the third row of the figure, we therefore get $\hat{\theta}^*_1, \ldots, \hat{\theta}^*_7$. Now we simply estimate the variability of the estimate of adult height calculated from the actual data by the sample variance of the bootstrap estimates $\hat{\theta}^*_1, \ldots, \hat{\theta}^*_7$. Of course, in practice we would need many more than seven bootstrap estimates.

Another way of looking at this is as follows. We wish to learn about the relationship between the true population parameter value, $\theta$, and estimates of $\theta$ obtained from samples from the population, denoted $\hat{\theta}$. To do this, we pretend the observed data are the population and repeatedly sample from the data to learn about the relationship between $\hat{\theta}$ and estimates obtained from the resampled data, denoted $\hat{\theta}^*$. In other words, we say:

\[ \text{Distribution of estimates } \hat{\theta} \text{ given } \theta \ \text{ is approximated by } \ \text{Distribution of estimates } \hat{\theta}^* \text{ given } \hat{\theta} \qquad (1) \]

This is known as the bootstrap principle. It is important to separate this principle from simulation, which is used to estimate the distribution of estimates $\hat{\theta}^*$ given $\hat{\theta}$. In fact, there are two potential sources of error in bootstrap procedures. The first arises because the bootstrap principle does not hold true, i.e. the two distributions in equation (1) are not equal. The second arises because we only use a finite number of bootstrap samples, B, to estimate the distribution of the $\hat{\theta}^*$s. However, this error can be made as small as we like by simply increasing B, whereas the bootstrap error is fixed. One of the arts of bootstrapping is to consider simple functions of $\hat{\theta}$, such as $(\hat{\theta} - \theta)/\hat{\sigma}$ (where $\hat{\sigma}$ is the sample standard error of $\hat{\theta}$), for which the bootstrap principle is more nearly true.

To make things more concrete, we illustrate how to use the bootstrap to estimate VARIANCE. Consider the data in the table. We are interested in estimating the average change in the carbon monoxide transfer factor. The obvious estimate is the mean: (33 + 2 + 24 + 27 + 4 + 1 - 6)/7 = 12.14. Suppose

A

_____________________________________________________________ 'we wae ~Ie 10 draw a lar&e a~.1J, samplesoflhc: ~ size u ..., in Ihc: table fmm the 'popIdaaioa' of ~

with ellielcenpox and c:atillllde Ihc: a\Wllp chinle aD .c:acb. Denob: Ihe n:suIlilll cstimala by 91 ••• ~ , ii, and recalllhat the InIevalueiD thcpapulalian iscalled 9..Thca an esIimalc or the varimJec: would be:

IIIOIIDXftIe,,.,,,,.

.......... 011111"" II. ctrion ./tld",..for dtklcapo.Y. ",..,mI. tJdnrbSitm," 1larpi11ll1lRlltI/kr tI sillY of11M ..wi (Din." I11III Hilrlcky.

.... ""6 wiI. 1997. p.6l1

En,,.,·

Wark

C""",e III (Wft'k - Enlry,

1 2 3 4

40 ·50

73 ·52

33

5

60 62

PtI'''''

6 7

56

80

58

15

66

64

63 60

hquenc:y of Ihese ot.r¥aIiaas in abe data in die ftatlable;

"y ·aII accui' GIKle. Rows ~Il show Ihe fn:quency or IIae observations in boatstrap umplc:s l...g.

Thus.

ia the 8nl

baolIInIp cia'........ obsc:naIian 1~s DOl appcaI',obscn.aIion 2 appealS ~ ~ absc:nali_ 3'~ twice.. Dbaer..; vati_ 4 once. obIerYaliDa 5 cIocs DDt appear. observalian 6 appealS ancc and oIa:rwitian 7 does DOl appem-: lite IIIC8D is dlen I!.71. ~ table shOws B.9 baolIInIp S8IIIpIcs. We Ihus have 9.1, •••• each..of ~ch . . . . ill approxilnab:ly Ihe &1liiie ~llIlicmIIIip to 9 u 0 does 10 the IIUe parameIer 9. We can use"'" 10 learn about Ihc ~laticiaslUp ___ ; aad O. $Peciftcally. the baolIInIp nlillUde of variance is: . . .

0,.

1 -B

2 24 'Z1 4 1

~NP

te"...~2 r-I

0.·-6 r

(3)

CcImpIIriIll with (2). we see the baolIInIp vasiaa (3) is _hal by (a).pUltiIll "'. na~ 1D~iD& with a ·hat". and (b) puUilll a ..... Oft wbaI is left. 1his rule oflbumb is ftI)' UBCfuI in practice. SubslibltiD& Ihe baolIInIp cstiIIIab:s Iian die secand IIIWe

-6

gives:

• (9;-9)1 -I~.

(2)

B,:

~ [(11.7.-12.14)2 + (7.00-12.14)2 + (18.57-12.14)2

UsiD& Ihe boubti. principle: (I), we CSIimu this by (a) . . - . 9 by its p.ae &am the _ 0, aacI (b) . . - . dac: iii by. ii. wlacn: cac:h is the IIICIIIl cadIon monoxide . . . . . in the ilh bootstrap sample.. The secancI .... shows the boaIslrap in action. 1'hI: ftnI lOW shows lite absenrccI diff'cn:aces, conapondilac to the rourdI co"ma of . . Intlable. ~ IKCIIId nM shows.the

i;

+ (1,.43-12.14)2 + (15.71-12.14r + (26.43-12.14)2

+ (13.29-12.14)2 +(24:29-12.14)2 + (16.43-12.14)1]

= 7.112

.........pFnqwnriawil. ""'klr.m~fnJm '.tWiginlll.'lIm 'kjin' '''''tlpJlNrlnern:'' t1f"_~'rk bDoI.,,.. ..",.. SItI'Ulk

0bIcI val differences

Flaauenc:y in

33 1

observed clara lit bootllnlp sample

2 1

24 1

27 1

3

2

1

1

2ad bacIISInp I8111pIc

1

I

.3n1 baotsInp IIIIIIPIc

3

I

4dI boaIstnIp umple

I

I

SIll boaIstnIp IIIIIIple

7.

4 1

1 1 1 1

1

2

1

2

1

2

2

2

6Ih boaIstnIp IIIIIIple

4

1

1

1

boaIstnIp IIIIIIple

I

2

1

1

8th boaIstnIp umple

2·

I

2

2

9dI boaIstnIp IIIIIIple

I

1

1

2

-6 I

I

I

1 2

2

(me_)

..

B= 12.14

'I...... =

11.11

... = 18.51 ... ' .. = 15.43 ... "... = 15.11 " ='26..43 ... IJ., = 13.29 .... '.... =24.29 " = 16.43 '2 =7.00

0]

I

41
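The arithmetic of the worked example can be checked with a short Python sketch. It uses only the seven observed changes, the frequencies described in the text for the first bootstrap sample and the nine quoted bootstrap means; the function name `bootstrap_variance` is an illustrative choice, not part of any package.

```python
import math

# Observed changes (week minus entry) for the seven patients in the first table.
changes = [33, 2, 24, 27, 4, 1, -6]
theta_hat = sum(changes) / len(changes)  # the sample mean, about 12.14

# Frequencies of each observation in the first bootstrap sample, as described in the text.
first_sample_freq = [0, 3, 2, 1, 0, 1, 0]
first_sample_mean = (
    sum(f * x for f, x in zip(first_sample_freq, changes)) / sum(first_sample_freq)
)

# Means of the nine bootstrap samples, as quoted in the worked calculation.
boot_means = [11.71, 7.00, 18.57, 15.43, 15.71, 26.43, 13.29, 24.29, 16.43]


def bootstrap_variance(estimates, centre):
    """Equation (3): mean squared deviation of the bootstrap estimates from the centre."""
    return sum((e - centre) ** 2 for e in estimates) / len(estimates)


# The worked example uses the rounded mean 12.14, so the same rounding is applied here.
boot_var = bootstrap_variance(boot_means, round(theta_hat, 2))
boot_se = math.sqrt(boot_var)

print(round(theta_hat, 2), round(first_sample_mean, 2), round(boot_se, 2))
```

With only nine bootstrap samples the resulting standard error is crude, which is the point the text goes on to make.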

However, B = 9 is not nearly enough. Typically we may need around B = 800 bootstrap samples to estimate the variance accurately (Booth and Sarkar, 1998). Taking B = 1000, we find that the bootstrap variance of the mean is $5.39^2$, which compares with the maximum likelihood estimate of $5.38^2$. The bootstrap estimate of the standard error of the mean, 12.14, is thus 5.39. Of course, this example is only illustrative; we know the answer anyway. However, in many circumstances we may not, e.g. if the data are not normally distributed and we want the standard error of the median or some other nonstandard measure of the data's 'centre'.

The bootstrap principle can clearly be applied much more widely. It is probably most often used to calculate confidence intervals (Carpenter and Bithell, 2000), where it avoids the need to rely on large sample theory or assumptions concerning the distribution of the data. For example, the distribution of individual patients' hospital costs is usually very skew and the bootstrap has been applied to calculate confidence intervals for the average cost of hospitalisation. Other applications include hypothesis tests, power calculations and estimating the predictive performance a statistical model will have when applied to a new dataset that was not used in formulating or estimating the model.

In order for the bootstrap principle (1) to hold, it is necessary for the bootstrap sampling to mimic the actual data sampling. Therefore, if we are bootstrapping a clinical trial with two treatments, we should sample with replacement within each treatment group, to preserve the randomisation. Other situations require different approaches.

The bootstrap resampling illustrated here does not depend on any statistical model and is an example of the non-parametric bootstrap. An alternative, the parametric bootstrap, is less widely used. This samples data from a parametric statistical model, such as a regression model, rather than with replacement from the observed data.

Lastly, note that the bootstrap, although it uses simulation, is characterised by the bootstrap principle (1). It is thus quite distinct from two other common uses of simulation, randomisation tests and MARKOV CHAIN MONTE CARLO (for fitting BAYESIAN MODELS). JRC
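As a hedged illustration of the non-parametric bootstrap with a larger B, the following Python sketch resamples the seven observed changes with replacement and compares the bootstrap standard error of the mean with the maximum likelihood value. The function names and the seed are illustrative assumptions; the exact bootstrap figure depends on the random numbers drawn, so it will be close to, but not necessarily equal to, the 5.39 quoted above.

```python
import math
import random

# Observed changes in the carbon monoxide transfer factor for the seven patients.
changes = [33, 2, 24, 27, 4, 1, -6]


def mean(values):
    return sum(values) / len(values)


def bootstrap_se_of_mean(data, n_boot, seed=0):
    """Non-parametric bootstrap standard error of the mean, following equation (3)."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for reproducibility
    n = len(data)
    theta_hat = mean(data)
    # Each bootstrap sample is drawn with replacement and has the same size as the data.
    boot_means = [mean(rng.choices(data, k=n)) for _ in range(n_boot)]
    variance = sum((m - theta_hat) ** 2 for m in boot_means) / n_boot
    return math.sqrt(variance)


def ml_se_of_mean(data):
    """Maximum likelihood standard error of the mean (variance divisor n, not n - 1)."""
    n = len(data)
    theta_hat = mean(data)
    ml_variance = sum((x - theta_hat) ** 2 for x in data) / n
    return math.sqrt(ml_variance / n)


print("bootstrap SE, B = 1000:", round(bootstrap_se_of_mean(changes, 1000), 2))
print("maximum likelihood SE: ", round(ml_se_of_mean(changes), 2))
```

For a two-arm trial the same resampling function would be applied separately within each treatment group, in line with the requirement that the bootstrap sampling mimic the actual data sampling.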

[Acknowledgements James R. Carpenter was supported by ESRC Research Methods Programme grant H3332S0047, titled 'Missing data in multilevel models'.]

Booth, J. G. and Sarkar, S. 1998: Monte-Carlo approximation of bootstrap variances. The American Statistician 52, 354–7. Carpenter, J. and Bithell, J. 2000: Bootstrap confidence intervals: when, which, what? A practical guide for medical statisticians. Statistics in Medicine 19, 1141–64. Davison, A. C. and Hinkley, D. V. 1997: Bootstrap methods and their application. Cambridge: Cambridge University Press. Efron, B. and Tibshirani, R. 1993: An introduction to the bootstrap. New York: Chapman & Hall.

boxplot This is a graphical display useful for highlighting important distributional features of a variable. The diagram is based on the five-number summary of a dataset, the numbers being the minimum, the lower quartile, the median, the upper quartile and the maximum. The boxplot is constructed by first drawing a 'box' with ends at the lower and upper quartiles of the data; next a horizontal line (or some other feature) is used to indicate the position of the median within the box and then lines are drawn from each end of the box to the most remote observations. One convention modifies this last step by truncating the lines to within (unmarked) points given by the upper quartile plus 1.5 times the interquartile range (the difference between the upper and lower quartiles) and the lower quartile minus 1.5 times the interquartile range. In this case, any observations outside these limits are represented individually by some means in the finished graphic. Different computer packages may employ slightly different conventions for displaying extreme or outlying values. The resulting diagram schematically represents the body of the data minus the extreme observations. It is particularly useful for comparing the distributional features of a variable in different groups, as illustrated in the figure, which shows the birthweights of infants with severe idiopathic respiratory disorder, classified by whether or not the infant survived. For other examples see Altman (1991). BSE
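The quantities behind a boxplot can be computed directly, as in the following Python sketch. The data values are hypothetical, and the quartile rule (the 'inclusive' method of the standard library) is only one of the several conventions that, as noted above, differ between packages.

```python
import statistics


def boxplot_summary(values, whisker=1.5):
    """Return the five-number summary, the whisker ends and any outlying values."""
    data = sorted(values)
    # The quartile convention is an assumption; packages interpolate in different ways.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker * iqr  # lower quartile minus 1.5 times the IQR
    upper_fence = q3 + whisker * iqr  # upper quartile plus 1.5 times the IQR
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return {
        "minimum": data[0],
        "lower_quartile": q1,
        "median": median,
        "upper_quartile": q3,
        "maximum": data[-1],
        "whisker_low": inside[0],    # most remote observation inside the lower fence
        "whisker_high": inside[-1],  # most remote observation inside the upper fence
        "outliers": outliers,        # plotted individually in the finished graphic
    }


# Hypothetical values, chosen only to show one observation beyond the upper fence.
hypothetical_values = [10, 12, 15, 18, 20, 21, 22, 23, 39]
summary = boxplot_summary(hypothetical_values)
print(summary)
```

Here the upper whisker stops at 23 and the value 39 is flagged for individual display, because it lies above the upper quartile plus 1.5 times the interquartile range.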

[Figure: side-by-side boxplots of birthweight (kg) for two groups, 'Baby died' and 'Baby survived'.]

boxplot Birthweights (kg) of infants with severe idiopathic respiratory disease syndrome

(See also HISTOGRAM, STEM-AND-LEAF PLOT)

Altman, D. G. 1991: Practical statistics for medical research. London: Chapman & Hall.

Box-Cox transformation

See TRANSFORMATIONS

Bradford Hill criteria Guidelines for drawing conclusions about causal relationships were proposed by Sir Austin Bradford Hill, Professor Emeritus of Medical Statistics at the London School of Hygiene and Tropical Medicine, in his address to the Section of Occupational Medicine of the Royal Society of Medicine in 1965 (Bradford Hill, 1965). Bradford Hill's guidelines drew on his many contributions to chronic disease research in the post-war era, including the groundbreaking work with Richard Doll on the link between smoking and lung cancer. The following nine aspects were proposed for deciding whether a statistical association might be causal: strength - magnitude of the association, as observed by measures such as the ratio of incidence rates; consistency - repeated observation of the association in different populations and circumstances and by different researchers; specificity - whether a cause leads to a single effect in a given population; temporality - whether the cause precedes the effect in time; biological gradient - existence of a trend or dose-response curve between the cause and effect; plausibility - whether the association is consistent with current biological knowledge; coherence - ensuring that the interpretation of cause and effect does not conflict with what is known of the natural history of the disease; experiment - existence of experimental rather than observational evidence, such as through conducting a randomised trial or by introduction of a preventive measure; analogy - comparison with previous research that identified similar effect mechanisms.

Bradford Hill did not intend these guidelines to be philosophically rigorous 'criteria' for causal inference, but rather a basis for decision making that could lead to timely action for the good of public health. With further consideration, some of the guidelines (e.g. 'specificity') are less than universal in their utility and some commentators have proposed alternative criteria more firmly rooted in deductive logic (Weed, 1986). However, in proposing these guidelines, Bradford Hill advocated an approach of making the best use of the totality of available evidence: 'All scientific work is incomplete - that does not confer upon us the freedom to ignore the knowledge we already have, or to postpone the action it demands.' JGW

Bradford Hill, A. 1965: The environment and disease: association or causation? Proceedings of the Royal Society of Medicine 58, 295. Weed, D. L. 1986: On the logic of causal inference. American Journal of Epidemiology 123, 965–79.

bubbleplot This is a graphical display for three variables in which two variables are used to form a SCATTERPLOT and then the values of the third variable are represented by circles with radii proportional to these values and centred on the appropriate point in the scatterplot. An example is shown in the figure; here the data are for 41 cities in the USA and the two variables forming the scatterplot are average annual temperature and average annual wind speed, with the 'bubbles' representing the pollution level as measured by the concentration of sulphur dioxide in the air. The plot suggests that higher pollution levels are associated with a combination of lower annual temperature and higher average wind speed. More details of bubbleplots can be found in Everitt (2003). BSE

[Figure: bubbleplot with average annual temperature (Fahrenheit) on the horizontal axis; circle size represents sulphur dioxide concentration.]

bubbleplot Bubbleplot of annual temperature and wind speed against pollution level as measured by sulphur dioxide concentration in the air for 41 cities in the USA

(See also BIVARIATE BOXPLOT and SCATTERPLOT MATRICES)

Everitt, B. S. 2003: Modern medical statistics. London: Arnold.

BUGS and WinBUGS The use of BAYESIAN METHODS in practical problems in medical statistics and other substantive areas of application has been hindered until relatively recently by computational aspects. In particular, the integrals needed to obtain posterior marginal, conditional and predictive distributions in many multiparameter problems are not usually analytically tractable, so that asymptotic approximations, numerical integration techniques or simulation-based methods are required (Bernardo and Smith, 1994). In many practical problems in medical statistics the structure and nature of the models used have made parameter estimation particularly amenable to the use of MARKOV CHAIN MONTE CARLO (MCMC) simulation methods and it is these that the software packages Bayesian inference Using Gibbs Sampling (BUGS) and WinBUGS (Windows version of BUGS) implement. (Latest versions at the time of writing are BUGS 0.5 and WinBUGS 1.4, freely available from www.mrc-bsu.cam.ac.uk.)

BUGS and WinBUGS use the BUGS syntax, which is similar to that of S-PLUS or R, to specify the likelihood and prior distributions for the statistical model in question, together with initial starting values for the sampler (Gilks, Thomas and Spiegelhalter, 1994). Within WinBUGS the specification of models may also be in terms of directed acyclic graphs (DAGs) using the Doodle feature (see GRAPHICAL MODELS), with the appropriate code being produced automatically. Additions to the most recent version of WinBUGS (1.4) are the ability to use scripts so that WinBUGS may be used in 'batch mode' and improved graphics capabilities, together with calculation of the deviance information criterion (DIC) to assess model complexity and fit (see BAYESIAN METHODS). In addition, the suite of S-PLUS functions CODA (Best, Cowles and Vines, 1995) can be used to explore convergence issues with output from BUGS and WinBUGS. Specific developments of WinBUGS are PKBUGS, which allows MCMC methods to be used for complex population pharmacokinetic/pharmacodynamic (PK/PD) models, and GeoBUGS, which is an add-on to WinBUGS that fits spatial models and produces a range of maps as output.

BUGS and WinBUGS require the user to specify statistical models in terms of the LIKELIHOOD and PRIOR DISTRIBUTIONS (see BAYESIAN METHODS) and use MCMC methods to evaluate the model. They are therefore only recommended for users skilled at undertaking Bayesian analyses and must be used with considerable care; the manual even comes with a 'health warning'! KRA

Bernardo, J. M. and Smith, A. F. M. 1994: Bayesian theory. Chichester: John Wiley & Sons, Ltd. Best, N. G., Cowles, M. K. and Vines, S. K. 1995: CODA Convergence diagnosis and output analysis software for Gibbs Sampler output: Version 0.1. Cambridge: MRC Biostatistics Unit. Gilks, W. R., Thomas, A. and Spiegelhalter, D. J. 1994: A language and program for complex Bayesian modelling. The Statistician 43, 169–78. Spiegelhalter, D. J., Thomas, A. and Best, N. G. 2001: WinBUGS version 1.4 user manual. Cambridge: MRC Biostatistics Unit.

C

calibration Consider a situation in which we wish to measure serum concentrations of hormones, enzymes and other proteins, for example, using such methods as radioimmunoassays (RIA) and enzyme-linked immunosorbent assays (ELISA). Three key questions in the development of such assays are (a) how does the expected value (average) of the assay response change as a function of the true amount of the target material in the serum samples, (b) how does the VARIANCE (or STANDARD DEVIATION) of the assay results change with the average assay result and, subsequently, (c) how might we use a particular assay result to determine the amount of the target material in a new sample of serum? We leave question (c) for the time being and concentrate on questions (a) and (b). Let the assay response be Y and let the true level of the target material be X. We wish to determine the form of the functions F and G in the following two equations:

$$E(Y \mid X) = F(X) \qquad (1)$$

and

$$\mathrm{Var}(Y \mid X) = G(E(Y \mid X)) \qquad (2)$$

Here we assume that the values of X are known without MEASUREMENT ERROR. We are concerned with what is often referred to as absolute calibration. If we do not have access to the truth, but only have measurements using alternative assays, $Y_1$ and $Y_2$, say, then we are concerned with the problem of comparative calibration (for the latter see METHOD COMPARISON STUDIES). Typically, such a univariate calibration study involves performing the assay procedure (ideally with full, independent, replications) on each of N training samples or specimens with known values of X, and then using various data analytic and modelling procedures to evaluate the form of F and G. The statistical methods might be fully parametric (fitting linear or nonlinear models, for example, with an assumed parametric model for the variance) or nonparametric (essentially fitting an arbitrarily shaped smooth dose-response curve).

Suppose an analytical chemist wishes to use some form of absorption spectroscopy to study the composition of, say, certain body fluids. He or she is likely to use measurements of many peak heights from such spectra to measure several substances simultaneously. This activity is the multivariate analogue of the univariate case, i.e. multivariate calibration. Technically, multivariate calibration is much more difficult than the simpler univariate problem, but the ultimate aims and logic are similar. We start with the latter and then briefly discuss the former.

Instead of dealing with the technical complexities of fitting nonlinear models with heterogeneous error distributions, we will consider an example that, by comparison, appears to be quite simple. Suppose we have a simple colorimetric assay for urinary glucose. We obtain a series of specimens with known glucose concentrations (X) and then measure the absorbance using the relevant assay procedure. We assume that the calibration function F is a straight line and that the variance of the Y measurements is independent of X (i.e. the 'error' variance is constant). Fitting a simple linear regression model for Y using ordinary least squares gives us estimates of the intercept ($\alpha$) and slope ($\beta$) of the straight line relating X to Y. Having answered questions (a) and (b) using the simple regression analysis, we now move on to question (c). Suppose we are presented with a new urine specimen and are asked to determine its glucose content. The classical method of estimating the unknown X from our measurement, Y, involves using information from the above regression of Y on X. The required estimate is given by:

$$\hat{X} = (Y - \alpha)/\beta \qquad (3)$$

An alternative is the so-called inverse estimator suggested by Krutchkoff (1967). This involves using the original X, Y data to regress X on Y to obtain estimates of the intercept ($\gamma$) and slope ($\lambda$), and then simply using these parameter estimates to predict X given a new Y, i.e.:

$$\hat{X} = \gamma + \lambda Y \qquad (4)$$
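The two estimators in equations (3) and (4) can be compared with a small Python sketch. The training concentrations and absorbances below are hypothetical values invented for illustration, and the helper names (`fit_line`, `classical_estimate`, `inverse_estimate`) are likewise assumptions rather than functions from any library.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def classical_estimate(x, y, y_new):
    """Equation (3): regress Y on X, then invert the fitted line."""
    alpha, beta = fit_line(x, y)
    return (y_new - alpha) / beta


def inverse_estimate(x, y, y_new):
    """Equation (4): regress X on Y and predict X directly."""
    gamma, lam = fit_line(y, x)
    return gamma + lam * y_new


# Hypothetical training data: known glucose concentrations and measured absorbances.
known_x = [0, 2, 4, 6, 8, 10]
absorbance = [0.02, 0.21, 0.39, 0.62, 0.79, 1.01]

new_reading = 0.90  # hypothetical absorbance for a new specimen
x_classical = classical_estimate(known_x, absorbance, new_reading)
x_inverse = inverse_estimate(known_x, absorbance, new_reading)
print(round(x_classical, 3), round(x_inverse, 3))
```

The two answers differ slightly: for the same training data the inverse estimate always lies at least as close to the mean of the training X values as the classical estimate does, and the two coincide only when the training points fall exactly on a straight line.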

For details of the properties of these two estimators, see the review by Osborne (1991).

To illustrate the ideas of multivariate calibration, consider a relatively simple example. Suppose we wish to measure the concentration of a particular metabolite in the blood (X) but we are now able to use, say, three different colorimetric assay procedures to obtain values $Y_1$, $Y_2$ and $Y_3$. Assuming that the three corresponding calibration curves ($F_1$, $F_2$ and $F_3$), as before, are all straight lines (but with different intercepts, slopes and 'error' variances) we can use MULTIVARIATE LINEAR REGRESSION (or three separate regressions) in order to estimate the parameters of the three calibration curves. The classical approach to the use of a new set of three measurements ($Y_1$, $Y_2$ and $Y_3$) on a new specimen to predict an unknown X is the multivariate generalisation of the univariate problem. Details of multivariate calibration are well beyond the scope of the present article, however, and readers are referred to Thomas (1994) and Naes et al. (2002) for further information. Considering our present example, one simple approach (particularly if we are prepared to assume conditional independence of the Y values) might involve estimating the unknown X using each of the $Y_1$, $Y_2$ and $Y_3$ values separately (in each case using equation (3) above) and then producing a weighted average of these three estimates, with weights proportional to their estimated precision. An example of the inverse approach would be to produce a multiple regression to predict the unknown X from the three Y measurements. This has obvious technical drawbacks, however, because of MULTICOLLINEARITY (high correlations between the three Y values). One possible solution involves the use of principal components regression. A PRINCIPAL COMPONENTS ANALYSIS is carried out on the Y values and then one or more of the resulting components are used to predict the unknown X. Further details of principal components regression and alternative analytical strategies can be found in Thomas (1994) and Naes et al. (2002).

Whatever method of prediction is used, however, it is important in both univariate and multivariate calibration problems that the performance of the predictions is adequately evaluated. This might involve validation using a test set of new X, Y values or internal cross-validation (use of the LEAVE-ONE-OUT CROSS-VALIDATION approach, for example) using the original training set. GD
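The 'weighted average with weights proportional to estimated precision' mentioned above can be sketched in Python as follows. Precision is taken here to mean the reciprocal of each estimate's variance, which is the usual reading but an assumption on our part; the numbers are hypothetical, and the combined variance formula relies on the conditional independence assumption stated in the text.

```python
def precision_weighted_average(estimates, variances):
    """Combine separate estimates of X using weights proportional to 1 / variance."""
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    combined = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    # Variance of the combined estimate, valid only if the estimates are independent.
    combined_variance = 1.0 / total_weight
    return combined, combined_variance


# Hypothetical classical estimates of X from three assays, with their estimated variances.
estimates_from_assays = [5.1, 4.8, 5.4]
estimated_variances = [0.04, 0.09, 0.25]

combined, combined_variance = precision_weighted_average(
    estimates_from_assays, estimated_variances
)
print(round(combined, 3), round(combined_variance, 4))
```

The most precise assay dominates the combined value, and the combined variance is smaller than that of any single assay.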

Krutchkoff, R. G. 1967: Classical and inverse regression methods of calibration. Technometrics 9, 425–39. Naes, T., Isaksson, T., Fearn, T. and Davies, T. 2002: A user-friendly guide to multivariate calibration and classification. Chichester, UK: NIR Publications. Osborne, C. 1991: Statistical calibration: a review. International Statistical Review 59, 309–36. Thomas, E. V. 1994: A primer on multivariate calibration. Analytical Chemistry 66, 795A–804A.

caliper matching

See MATCHING

canonical correlation analysis This technique establishes whether relationships exist between a priori groups of variables in a study. For example, in a study of heart disease, we might ask if there is a connection between personal physical characteristics such as age, weight and height, on the one hand, and the systolic and diastolic blood pressures of the individuals, on the other. Alternatively, in chronic depression, a study might be aimed at uncovering relationships between personal social and financial variables such as gender, age, educational level, income and a range of health variables including various indicators of depression. In another example, a public health survey might be conducted to explore connections between housing quality variables and indicators of different illnesses. A first attempt at analysing the strength of association between two groups of variables (e.g. between housing quality and illness) might involve examination of all correlations between pairs of variables, one from each group. However, if each group contains more than just a few variables, such an approach is bound to lead to confusion.

Ideally, one would like to replace each set of original variables by a new set, in such a way that the new variables were mutually uncorrelated within sets and just a few of them exhibited correlation between sets. Canonical correlation analysis takes just such an approach, and finds optimal sets of linear transformations of the original variables, one for each original group of variables. Suppose that $u_1, u_2, \ldots, u_s$ are the transformed variables for one set (say, the housing quality variables), while $v_1, v_2, \ldots, v_s$ are the transformed variables for the other set (say, the illness variables). 'Optimality' is defined by requiring the correlation between $u_1$ and $v_1$ to be as large as possible among all linear combinations of the original variables, that between $u_2$ and $v_2$ to be the next largest, that between $u_3$ and $v_3$ the third largest and so on, subject to the following constraints: $u_1, u_2, \ldots, u_s$ are mutually uncorrelated; $v_1, v_2, \ldots, v_s$ are mutually uncorrelated; and any $u_i$, $v_j$ pair is uncorrelated when $i \neq j$. It is clearly not possible to have more (uncorrelated) transformed variables than there were original variables in a set, so the number s of pairs that can be derived is equal to the smaller of the numbers of original variables in the two groups.

The effect of canonical correlation analysis is thus to channel all the association between the two groups of variables through the resulting pairs of linear combinations $(u_1, v_1), (u_2, v_2), \ldots$. These derived variables are known as canonical variates. The only nonzero correlations remaining in the correlation matrix of the new variables are those between corresponding pairs of canonical variates, i.e. between $u_i$ and $v_i$ for $i = 1, \ldots, s$: they are known as the canonical correlations of the system.

Most computer software packages that contain multivariate statistical procedures will conduct such an analysis. They will also quote a significance level against each canonical correlation, appropriate for testing the NULL HYPOTHESIS that all succeeding population canonical correlations are zero. Such significance levels should be treated with some caution, as they rely on the assumption that the data follow a MULTIVARIATE NORMAL DISTRIBUTION. Nonetheless, the number of significant canonical correlations is usually taken to indicate the number of (independent) connections that exist between the two groups of variables. Inspection of the coefficients of each original variable in each canonical variate may also provide an interpretation of the canonical variate in the same manner as interpretation of principal components, which may help to identify the nature of the connection between the groups (see PRINCIPAL COMPONENT ANALYSIS). However, again a cautionary note is in order, because such interpretation is not quite as straightforward as for principal components. The reason for the complication is that there may be very diverse VARIANCES and covariances (see COVARIANCE MATRICES) among the original variables in the two groups, which affects the sizes of the coefficients in the canonical variates, and there is no convenient normalisation to place all coefficients on an equal footing. This drawback can be alleviated to some extent by restricting interpretation to the standardised coefficients, i.e. the coefficients that are appropriate when the original variables have been standardised, but nevertheless the problem still remains.

To illustrate the technique, consider a canonical correlation analysis between the 'health' variables and the 'personal' variables in the Los Angeles depression study of 294 respondents presented by Afifi and Clark (1984, Chapter 15). The four 'personal' variables were: gender, age, income and education level (numerically coded from the lowest, 'less than high school', to the highest, 'finished doctorate'), while the two 'health' variables were: CESD (the sum of 20 separate numerical scales measuring different aspects of depression) and health (a numerical score measuring 'general health'). The correlation matrix between these variables for the sample is shown in the table.

canonical correlation analysis Correlation matrix for two health and four personal variables in the LA depression study

| | CESD | Health | Gender | Age | Education | Income |
|---|---|---|---|---|---|---|
| CESD | 1.0 | | | | | |
| Health | 0.212 | 1.0 | | | | |
| Gender | … | 0.091 | 1.0 | | | |
| Age | -0.161 | … | … | 1.0 | | |
| Education | -0.101 | -0.210 | -0.106 | -0.208 | 1.0 | |
| Income | -0.151 | -0.183 | -0.110 | -0.192 | 0.491 | 1.0 |

Here the maximum number of canonical variate pairs is $s = \min(2, 4) = 2$. The first canonical correlation turns out to be 0.405 and this has a significance level P < 0.00001. It might be argued that gender and education are unlikely to have normal distributions, so this significance level should not be taken too literally. Nevertheless, there does seem to be strong evidence that the first canonical correlation is significant. The corresponding canonical variates, in terms of standardised original variables, are:

$$u_1 = -0.490\,\mathrm{CESD} + 0.982\,\mathrm{Health}$$

$$v_1 = 0.025\,\mathrm{Gender} + 0.871\,\mathrm{Age} - 0.383\,\mathrm{Education} + 0.082\,\mathrm{Income}$$

High coefficients correspond to CESD (negatively) and health (positively) for the perceived health variables and to age (positively) and education (negatively) for the personal variables. Thus relatively older and uneducated people tend to score low in terms of depression, but perceive their health as relatively poor, while relatively younger but educated people have the opposite health perception.

The second canonical correlation is 0.266, which has a significance level P < 0.001, so also carries interpretive worth. The corresponding canonical variates are:

$$u_2 = 0.899\,\mathrm{CESD} + 0.288\,\mathrm{Health}$$

$$v_2 = 0.396\,\mathrm{Gender} - 0.443\,\mathrm{Age} - 0.448\,\mathrm{Education} - 0.555\,\mathrm{Income}$$

Since the higher value of the gender variable is for females, the interpretation here is that relatively young, poor and uneducated females are associated with higher depression scores and, to a lesser extent, with poor perceived health. Thus there are two interpretable 'dimensions' of connection between the two sets of variables. A SCATTERPLOT of the scores of respondents against each pair of canonical variates would help to identify any anomalous individuals in the sample.

A further interesting application of canonical correlation analysis occurs when there is just a single set of variables, but they are measured on individuals in a number of a priori distinct groups or populations. For example, a set of signs or symptoms $x_1, x_2, \ldots, x_p$ is observed on a sample of patients suffering from jaundice and each patient is classified into one of g illnesses that have the external manifestation of jaundice. We can thus define a set of indicator variables $y_1, y_2, \ldots, y_g$ that specify a patient's illness, by setting the values $y_i = 1$, $y_j = 0$ ($j \neq i$) for a patient suffering from illness i. A canonical correlation analysis with the x values as one set of variables and the y values as the other set of variables will then produce the linear combinations of the x values that are most highly correlated with linear combinations of the group indicator variables. Since the latter define the best way to view group differences, the former are just the canonical variables that best discriminate between the g groups of individuals (see DISCRIMINANT FUNCTION ANALYSIS).
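For readers who wish to see how canonical correlations arise from a correlation matrix, the following Python sketch handles the special case in which the first group contains exactly two variables, as with the two 'health' variables above. It relies on the standard result that the squared canonical correlations are the eigenvalues of $R_{11}^{-1}R_{12}R_{22}^{-1}R_{21}$, where $R_{11}$ and $R_{22}$ are the within-group correlation matrices and $R_{12}$ the between-group block. The function names and the small illustrative matrices are assumptions made for this sketch; they are not the study data.

```python
import math


def mat_mul(a, b):
    """Matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def inverse(a):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def canonical_correlations_two(r11, r12, r22):
    """Canonical correlations when the first group has exactly two variables."""
    m = mat_mul(mat_mul(inverse(r11), r12), mat_mul(inverse(r22), transpose(r12)))
    # Eigenvalues of the 2 x 2 matrix m from its trace and determinant.
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(max(trace * trace / 4.0 - det, 0.0))
    eigenvalues = [trace / 2.0 + disc, trace / 2.0 - disc]
    return [math.sqrt(max(ev, 0.0)) for ev in eigenvalues]


# Illustrative blocks: uncorrelated variables within each group, so the canonical
# correlations are simply the two nonzero between-group correlations.
r11 = [[1.0, 0.0], [0.0, 1.0]]
r22 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
r12 = [[0.6, 0.0, 0.0], [0.0, 0.2, 0.0]]
print(canonical_correlations_two(r11, r12, r22))
```

In practice a multivariate package would be used, since it also supplies the coefficients of the canonical variates and the associated significance levels.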

Finally, the variables in a study may fall into more than two a priori sets and some general between-set measure of association is required. Various possible definitions of such association may be made and, consequently, the ideas of canonical correlation analysis may be generalised in various ways. However, such generalisation is quite complicated and interpretation of the results becomes much more problematic. Gnanadesikan (1997) provides a brief overview and further references. WK

Afifi, A. A. and Clark, V. 1984: Computer-aided multivariate analysis. California: Wadsworth. Gnanadesikan, R. 1997: Methods for statistical data analysis of multivariate observations, 2nd edition. New York: John Wiley & Sons, Inc.

canonical variates

See CANONICAL CORRELATION ANALYSIS

capture-recapture methods This is an alternative approach to a census for estimating population size that operates by sampling the population several times, identifying individuals who appear more than once. Capture-recapture methods have a long history dating back to 1786, when Laplace used such a technique to estimate the size of the total population of France. Traditionally, capture-recapture methodology was primarily focused on wildlife populations, but has increasingly been applied to human populations, particularly within epidemiological situations.

Within the ecological field, capture-recapture experiments involve observers going into the field and recording all animals that are observed (either visual sightings or trappings) at a sequence of capture events. On the initial capture event, all animals that are observed are recorded, uniquely marked and released back into the population. At each subsequent capture event, all unmarked animals are recorded and uniquely marked, all marked animals are recorded and all animals released. The data from such an experiment are simply the set of capture histories for each individual animal observed within the study. Each individual capture history is typically represented by a series of zeros and ones, where the 0 and the 1 denote the absence or presence, respectively, of the individual at each capture event.

Models for capture-recapture data generally take one of two forms, closed or open, and a series of models has been proposed for each. Closed models assume that the population is constant throughout the study period, with no births, deaths or migrations, whereas open models allow for these transitions in the population. Generally speaking, the parameters of interest differ between the two models. For example, within closed populations, the total population size is generally of particular interest; conversely, for open populations, parameters of interest may include birth rates, death rates, migration rates and/or productivity rates. We initially, briefly, consider the capture-recapture methods often considered for wildlife data before considering in further detail epidemiological models.

For closed populations, Otis et al. (1978) described a series of different capture-recapture models, relating to possible heterogeneity in the capture rates as a result of time, trap response or individual effects. King and Brooks (2008) have incorporated these models into a Bayesian framework. Mixture models have become increasingly popular to model individual heterogeneity (Pledger, 2000; Morgan and Ridout, 2008). Additionally, there have been a series of models proposed for open populations, dependent on the parameters of interest, with perhaps the most widely used being the Cormack-Jolly-Seber model, where the survival rates are of primary interest, and the Arnason-Schwarz model, which incorporates multistrata data. Recent advances include the generalisation to multievent models (Pradel, 2005), where the state of an individual may only be partially observed. Within the epidemiological literature, closed populations are usually modelled, with the total population size of particular interest, and this is what we shall focus on here.

capture-recapture methods  Example of an incomplete contingency table, with three sources: A, B and C. The entries n_ijk denote the number of individuals observed in the given cell, where 0/1 represents absence/presence on the given list. The cell n_000 is unobserved and hence unknown

         A=1, B=1    A=1, B=0    A=0, B=1    A=0, B=0
C=1      n_111       n_101       n_011       n_001
C=0      n_110       n_100       n_010       n_000

For example, many areas of scientific research focus on the estimation of population size: from the number of susceptibles to a given disease, to the number of drug addicts in a particular area or the number of injuries sustained in the workplace. However, it is usually impossible to enumerate each member of a population, possibly due to their number (e.g. the number of web pages on the internet) or when the population is 'hidden' (such as the number of injecting drug users). Thus, data are often collected in the form of a series of incomplete population counts using a variety of sources or lists. Each source corresponds to a capture event and an individual being recorded by a given source corresponds to being observed at that capture event. It is assumed that each individual is uniquely identifiable by each source. Then, the data are simply the capture histories of all individuals observed. The data are usually summarised in the form of a 2^k contingency table, where k is the number of sources, and the cell entries correspond to the number of individuals that are observed by each combination of sources (i.e. the number of individuals observed with the same capture history). Clearly, the contingency table is incomplete since the number of individuals belonging to the population but not observed within the study is unknown (see the table, for example). Unlike the ecological application, the sources do not usually have a temporal sequencing as for the capture events and so different models have been developed within the epidemiological application.

Within the epidemiological field the capture-recapture approach is sometimes called 'multiple record systems' and the corresponding estimate of the total population referred to as 'Bernoulli census estimates' or 'ascertainment corrected rates'. The most common approach to analysing epidemiological data of this sort is via the use of LOG-LINEAR MODELS, introduced by Fienberg (1972). In these models, the logarithm of the expected cell count is expressed as a linear function of parameters. These parameters represent main effect terms for individual sources and associations between two or more sources. Thus, these models allow for interactions between different sources. The model assuming independence between each source is simply a special case with no interactions present.

There are usually a number of possible log-linear models that can be fitted to the data, each specifying different sets of interactions between the sources. Traditionally, classical analyses consist of initially finding the model which provides the best fit to the data, using, for example, LIKELIHOOD RATIO TESTS and/or information criteria, such as AKAIKE'S INFORMATION CRITERION (AIC) or the Bayesian information criterion (BIC). Once the given model has been selected, the total population is estimated using MAXIMUM LIKELIHOOD ESTIMATION for the missing cell, combined with the observed number of individuals (see, for example, Hook and Regal, 1995). Recently, Bayesian approaches have also been developed for fitting log-linear models to the data, addressing in particular the issue of model choice (King and Brooks, 2001; King et al., 2005). This approach also allows the calculation of a model-averaged estimate of the total population, removing the model-dependence problem that may arise when only a single model is chosen on which to base inference. Alternatives to log-linear models include use of the Rasch model (see Carriquiry and Fienberg, 1998) in order to model possible heterogeneity in the population, and latent class models when the individuals can be categorised into different subpopulations. RK
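The independence model mentioned above can be illustrated with a small numerical sketch. It assumes three sources and entirely hypothetical counts (none of the numbers come from this entry), and it uses a simple fixed-point iteration on the standard estimating equation for the independence model rather than a general log-linear fitting routine. The function name and arguments are illustrative only.

```python
# Hypothetical three-source capture-recapture counts (NOT data from the text).
# Keys are capture histories (A, B, C): 1 = listed by the source, 0 = not listed.
observed = {
    (1, 1, 1): 10, (1, 1, 0): 20, (1, 0, 1): 15, (1, 0, 0): 60,
    (0, 1, 1): 12, (0, 1, 0): 50, (0, 0, 1): 40,
}


def estimate_total_independence(counts, tol=1e-10, max_iter=10_000):
    """Estimate the total population size N under the independence model.

    Under independence the expected unobserved cell satisfies
        n000 = N * (1 - nA/N) * (1 - nB/N) * (1 - nC/N),
    where nA, nB, nC are the numbers listed by each source. The loop
    repeats N <- n_observed + n000 until N stops changing.
    """
    n_observed = sum(counts.values())
    source_totals = [
        sum(n for history, n in counts.items() if history[s] == 1)
        for s in range(3)
    ]
    total = float(n_observed)  # starting value: nobody was missed
    for _ in range(max_iter):
        missing = total
        for n_s in source_totals:
            missing *= 1.0 - n_s / total
        new_total = n_observed + missing
        if abs(new_total - total) < tol:
            return new_total, missing
        total = new_total
    raise RuntimeError("iteration did not converge")


n_hat, n000_hat = estimate_total_independence(observed)
print(f"estimated unobserved cell: {n000_hat:.1f}")
print(f"estimated total population: {n_hat:.1f}")
```

In practice, models with interactions between sources would be fitted with log-linear modelling software and compared before an estimate is reported; the sketch covers only the no-interaction special case.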

Carriquiry, A. L. and Fienberg, S. E. 1998: Rasch models. In Armitage, P. and Colton, T. (eds), Encyclopedia of biostatistics. Chichester: John Wiley & Sons, Ltd. Fienberg, S. E. 1972: The multiple recapture census for closed populations and incomplete 2^k contingency tables. Biometrika 59, 591-603. Hook, E. B. and Regal, R. R. 1995: Capture-recapture methods in epidemiology: methods and limitations. Epidemiologic Reviews 17, 243-64. King, R. and Brooks, S. P. 2001: On the Bayesian analysis of population size. Biometrika 88, 317-36. King, R. and Brooks, S. P. 2008: On the Bayesian estimation of a closed population size in the presence of heterogeneity and model uncertainty. Biometrics 64, 816-24. King, R., Bird, S. M., Brooks, S. P., Hay, G. and Hutchinson, S. 2005: Prior information in behavioural capture-recapture methods: demography influences injectors' propensity to be listed on data sources and their drugs-related mortality. American Journal of Epidemiology 162, 694-703. Morgan, B. J. T. and Ridout, M. S. 2008: A new mixture model for capture heterogeneity. Journal of the Royal Statistical Society: Series C 57, 433-46. Otis, D. L., Burnham, K. P., White, G. C. and Anderson, D. R. 1978: Statistical inference from capture data on closed animal populations. Wildlife Monographs 62, 1-135. Pledger, S. 2000: Unified maximum likelihood estimates for closed capture-recapture models using mixtures. Biometrics 56, 434-42. Pradel, R. 2005: Multievent: an extension of multistate capture-recapture models to uncertain states. Biometrics 61, 442-7.

carry-over  See CROSSOVER TRIALS

CART  This is an acronym for classification and regression tree. See TREE-STRUCTURED METHODS

cartogram  This is a diagram in which descriptive statistical information is displayed on a geographical map by means of shading, by using a variety of different symbols or by some more involved procedure. Two examples are given in the figures. A description of how cartograms may be constructed is given in Gusein-Zade and Tikunov (1993). SSE

[See also DISEASE CLUSTERING]

cartogram  Cartogram of life expectancy in the USA by state

cartogram  1996 US population cartogram (all states are resized relative to their population)

Gusein-Zade, S. and Tikunov, V. 1993: A new method for constructing continuous cartograms. Cartography and Geographic Information Systems 20, 167-73.

case-cohort studies

See CASE

[...] then a positive response, Y = 1, is observed. For example, in a toxicological study investigating the effect of different doses of an experimental drug on mice, a mouse may die if the exposure dosage exceeds the underlying tolerance the mouse has for the drug. In studies concerning mobility disability in the elderly, the underlying latent response variable for the self-reported inability to walk a quarter of a mile may be the subject's true mobility level. Hence each individual's response to the question 'Are you able to walk a quarter of a mile?' will depend on his or her cut-off point, which is the threshold level on this latent scale at which he or she will move from Y = 0 to Y = 1. Thus coefficients in the regression model above may be interpreted as the effects of the covariates on the latent variable, y*.

The complementary log-log model can also be derived from noting the relationship between the probability of a positive response in a time interval of length T, say (or an analogous measure, say volume), and the response rate, λ say, for this time interval under the Poisson assumption (see POISSON DISTRIBUTION). For example, this relationship is utilised in the development of models for dilution and serological studies, where the probabilities of growth occurring on a plate at a particular dilution and of a person living in a particular disease endemic area being infected with this disease in one year, respectively, are of interest.

This model also follows naturally from the application of the proportional hazards assumption to grouped survival data. The regression coefficients in this case are interpreted as log hazard ratios (see SURVIVAL ANALYSIS) or log relative risks. However, if P is small (P < 0.2), the regression coefficients can also be interpreted as log odds. For further details see Collett (2002). BT

Collett, D. 2002: Modelling binary data, 2nd edition. London: Chapman & Hall/CRC. [...], A. C. 1998: Extreme values. In Armitage, P. and Colton, T. (eds), Encyclopedia of biostatistics. Chichester: John Wiley & Sons, Ltd.
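The link between the Poisson assumption and the complementary log-log scale described in this entry can be illustrated numerically. The sketch below assumes the standard Poisson result that the probability of at least one event in an interval of length T is 1 − exp(−λT). The function names and the values chosen for the rate and interval length are hypothetical.

```python
import math


def prob_positive(rate, length):
    """P(at least one event) for a Poisson process with the given rate
    observed over an interval of the given length."""
    return 1.0 - math.exp(-rate * length)


def cloglog(p):
    """Complementary log-log transformation of a probability p."""
    return math.log(-math.log(1.0 - p))


def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1.0 - p))


rate, length = 0.05, 2.0          # hypothetical values
p = prob_positive(rate, length)
# On the complementary log-log scale the probability is linear in log(rate).
print(cloglog(p), math.log(rate) + math.log(length))
# For small p the complementary log-log and logit scales are close.
print(cloglog(p), logit(p))
```

The second comparison is one way to see why, for small P, coefficients on the two scales have a similar interpretation.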

complete case analysis  This is an analysis that uses only individuals who have a complete set of the intended measurements included in a study. An individual with a missing value on one or more variables will not be included in the analysis. When there are many individuals with missing values this approach can considerably reduce the effective sample size. In most circumstances complete case analysis is not to be recommended since other approaches such as multiple imputation can be used in order to retain as full a dataset as possible, thereby improving efficiency and reducing bias. SSE

(See also DROPOUTS)
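The loss of sample size under a complete case analysis can be seen in a minimal sketch. The records, variable names and the use of None to mark a missing value are all hypothetical.

```python
# Hypothetical records; None marks a missing measurement.
records = [
    {"age": 54, "sbp": 132, "chol": 5.1},
    {"age": 61, "sbp": None, "chol": 6.0},
    {"age": 47, "sbp": 118, "chol": None},
    {"age": 70, "sbp": 145, "chol": 5.8},
    {"age": None, "sbp": 127, "chol": 4.9},
]


def complete_cases(rows):
    """Keep only rows in which every variable has been observed."""
    return [row for row in rows if all(v is not None for v in row.values())]


kept = complete_cases(records)
print(f"{len(kept)} of {len(records)} individuals retained")
```

Here each incomplete individual is missing only one value, yet three of the five are dropped, which is the inefficiency that approaches such as multiple imputation aim to avoid.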

complex interventions  These are interventions that contain several interacting components and/or multiple outcomes. Most CLINICAL TRIALS focus on a single intervention with a primary outcome measure (see ENDPOINTS); the standard trial compares a group of patients who receive a new medication with a group who do not, and the allocation to groups is done at random. Increasingly, however, we wish to test more complex interventions. Multifaceted interventions or those that have multiple outcomes are hard to develop and their testing presents a range of challenges. Disciplines such as public health and preventive medicine need to assess complex packages of measures that are tailored to individuals or groups, rather than a single medication given in standard doses to recipients. Interventions targeted at behaviour change can involve various components and be particularly complex. Sometimes the intervention continues to develop after the initial assessment, and ongoing evaluation is needed, though this can be hard to accomplish.

There is no specific definition of what makes an intervention complex; even standard 'simple' interventions may have elements of complexity. CLUSTER RANDOMIZED TRIALS have been developed to deal with the complexity that arises when interventions are applied to groups such as hospital wards, surgeons conducting operations and community-based groups. However, the complexity can go much further. Multifaceted interventions to alter diet or physical activity, for example, can be extremely complex.

The British Medical Research Council has developed guidance for the development and evaluation of complex interventions (Craig et al., 2008). Some of the key points that have emerged over the years relating to these interventions include the need to develop the intervention carefully, drawing on all the evidence to date using systematic reviews, understanding the theory underlying the intervention, modelling the processes and outcomes, and assessing feasibility and piloting the methods, before embarking on a full evaluation of the intervention. A clear description of the intervention is vital, both for those involved to understand it fully and in reporting the findings so that the intervention can be introduced elsewhere, or tested further.

Various approaches to developing and evaluating interventions have been developed, such as the multiphase optimisation strategy (MOST) (Collins et al., 2005) and the Reach, Effectiveness, Adoption, Implementation and Maintenance (RE-AIM) framework (Glasgow, Vogt and Boles, 1999). The National Institute for Health and Clinical Excellence (2007) has also produced guidance for planning, delivering and evaluating public health activities aimed at behaviour change.

The choice of study design depends on the intervention being assessed. Randomisation should be employed wherever possible and the gold standard is the randomised controlled trial, perhaps with variations on that methodology, such as preference trials, and randomised consent, stepped wedge and N-OF-1 TRIALS. However, sometimes randomised assessment is not possible. An example of this is the introduction of the Sure Start Local Programmes in England, which was a government initiative to provide care and support to families and children in the most deprived areas in the country. The intervention was undoubtedly complex in that services offered in each Sure Start area varied according to locally identified priorities, and no control groups were monitored contemporaneously. The evaluation team used a quasi-experimental approach to compare the data from Sure Start with similar data from children in the Millennium Cohort Study who lived in similarly deprived parts of the country but who did not yet have access to Sure Start programmes (Melhuish et al., 2008). This falls short of the ideal of a randomised assessment but was a serious attempt to evaluate a large, widespread and costly intervention.

Public health assessments often need to maximise the information available from 'natural experiments' (Petticrew et al., 2005). Thus the introduction of a new supermarket, traffic calming measures or the building of a major road may have health effects that may or may not be the intended consequence of the intervention. Randomised assessments are rarely possible and, although less than ideal, a before-and-after comparison, preferably contrasted with one or more comparable areas, may be the best that can be achieved.

Particularly difficult is the assessment of interventions affecting entire populations introduced by national or local government. Examples are the introduction of water fluoridation (see http://www.southamptonhealth.nhs.uk/publichealth/fluoridation) or smoking bans in various countries (Haw et al., 2006; Pell et al., 2008). A prerequisite for such assessments is collection of data before the intervention commences and this can be politically difficult, as it may necessitate delaying the implementation of the new measures.

Complex interventions require imaginative approaches but they also need systematic evaluation and an understanding of the true nature of the intervention under study. All components of the intervention and its consequences, intended and unintended, have to be assessed and researchers need to understand the theoretical and practical background to the intervention that they are investigating. Campbell et al. (2000) offer the sensible advice that a mixture of RCTs and other research designs is needed fully to assess complex interventions. HI

Campbell, M. et al. 2000: Framework for the design and evaluation of complex interventions to improve health. British Medical Journal 321, 694-6. Collins, L. M., Murphy, S. A., Nair, V. N. and Strecher, V. J. 2005: A strategy for optimizing and evaluating behavioral interventions. Annals of Behavioral Medicine 30(1), 65-73. Craig, P., Dieppe, P., Macintyre, S., Michie, S., Nazareth, I. and Petticrew, M. 2008: Developing and evaluating complex interventions: the new Medical Research Council guidance. British Medical Journal 337, a1655. Full guidance available at: www.mrc.ac.uk/complexinterventionsguidance. Glasgow, R. E., Vogt, T. M. and Boles, S. M. 1999: Evaluating the public health impact of health promotion interventions: the RE-AIM framework. American Journal of Public Health 89, 1322-7. Haw, S. J., Gruer, L., Amos, A., Currie, C., Fischbacher, C., Fong, G. T. et al. 2006: Legislation on smoking in enclosed public places in Scotland: how will we evaluate the impact? Journal of Public Health 28, 24-30. Melhuish, E., Belsky, J., Leyland, A. H., Barnes, J. and the National Evaluation of Sure Start Research Team 2008: Effects of fully-established Sure Start Local Programmes on 3-year-old children and their families living in England: a quasi-experimental observational study. Lancet 372, 1641-7. National Institute for Health and Clinical Excellence (NICE) 2007: Behaviour change at population, community and individual levels. In NICE Public Health Guidance. London: NICE. Pell, J. P., Haw, S., Cobbe, S., Newby, D. E., Pell, A. C. R., Fischbacher, C. et al. 2008: Smoke-free legislation and hospitalizations for acute coronary syndrome. New England Journal of Medicine 359, 482-91. Petticrew, M., Cummins, S., Ferrell, C., Findlay, A., Higgins, C., Hoy, C. et al. 2005: Natural experiments: an underused tool for public health? Public Health 119, 751-7.

complier average causal effect (CACE)  See ADJUSTMENT FOR NONCOMPLIANCE IN RANDOMISED CONTROLLED TRIALS

component bar chart  See BAR CHART

components of variance  These are variance parameters that quantify the variation attributable to random effect terms included in a regression model. For example, a simple RANDOM EFFECTS MODEL for diastolic blood pressure measurements on patients recruited from a number of clinics includes random effects to represent the variability between clinics and random residual effects to represent the variability between patients. If no further random effect terms are added to this model, the model is said to include two components of variance. The VARIANCE of the random clinic effects in the model is the between-clinic variance component and the variance of the random residual effects is the between-patient variance component in this example. Under this model, the total variance of the individual patient measurements is assumed to be equal to the sum of the variance components.

Suppose in this example of blood pressure measurements on patients within clinics that the overall mean value is estimated as 80 mmHg, with the between-clinic variance component estimated as 7 and the between-patient variance component estimated as 135. The estimated between-clinic variance component allows construction of a 95% range for the mean blood pressure values at the different clinics, using the approach for calculating a reference interval. Here, values that are within approximately two (between-clinic) standard deviations of the overall mean are 80 − 1.96√7 = 74.8 mmHg and 80 + 1.96√7 = 85.2 mmHg. It is therefore estimated that the majority of mean blood pressure values for different clinics lie between 74.8 mmHg and 85.2 mmHg.

Estimation of variance components is relevant in a number of application areas. In HEALTH SERVICES RESEARCH, variance components can be used to describe the variability between administrative or geographical units such as clinics, hospitals or towns and, separately, the variability between patients within units. In LONGITUDINAL DATA, variance components can be used to describe the variability between patients and, separately, the variability between measurements within patients.

When the data of interest are from a balanced design, there is a standard approach for estimation of variance components that is based on ANALYSIS OF VARIANCE. As an example, consider some data representing six repeated measurements of the peak expiratory flow rate (PEFR) for 10 patients with asthma. A simple random effects model for the PEFR measurements includes a between-patient variance component $\sigma_b^2$ and a within-patient variance component $\sigma_w^2$. Because the same number of observations is available for every patient, the dataset is balanced and the variance components can be estimated using an analysis of variance table for the data. The table presents the observed sums of squares and mean squares, as in a conventional analysis of variance. Under the random effects model assumed here, the expected values for the mean squares can be expressed in terms of the variance components $\sigma_b^2$ and $\sigma_w^2$. By equating the observed mean squares with their expected values, estimates are obtained as 191.41 for $\sigma_w^2$ and (11903.83 − 191.41)/6 = 1952.07 for $\sigma_b^2$.

components of variance  Observed sums of squares and mean squares

Source of variation    Degrees of freedom    Sum of squares    Mean squares    Expected mean squares
Between patients       9                     107134.51         11903.83        $\sigma_w^2 + 6\sigma_b^2$
Within patients        50                    9570.70           191.41          $\sigma_w^2$
Total                  59                    116705.21
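The analysis of variance estimates quoted for the PEFR example can be checked with a short calculation. This is a minimal sketch of the method-of-moments step for a balanced one-way design, using the sums of squares and degrees of freedom as read from the table; the function and argument names are illustrative.

```python
def anova_variance_components(ss_between, df_between, ss_within, df_within,
                              n_per_group):
    """Method-of-moments estimates for a balanced one-way random effects model.

    E(MS_within)  = var_within
    E(MS_between) = var_within + n_per_group * var_between
    """
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    var_within = ms_within
    var_between = (ms_between - ms_within) / n_per_group
    return ms_between, ms_within, var_between, var_within


# Sums of squares and degrees of freedom from the PEFR table:
# 10 patients, 6 repeated measurements each.
ms_b, ms_w, var_b, var_w = anova_variance_components(
    ss_between=107134.51, df_between=9,
    ss_within=9570.70, df_within=50,
    n_per_group=6,
)
print(f"mean squares: between = {ms_b:.2f}, within = {ms_w:.2f}")
print(f"variance components: between = {var_b:.2f}, within = {var_w:.2f}")
```

Note that the between-group estimate is a difference of mean squares, so with other data it can come out negative.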

Many study designs produce unbalanced data; e.g. health services research studies that include a number of hospitals or clinics commonly recruit varying numbers of patients from these and longitudinal studies need not collect equal numbers of measurements from all subjects. Several methods are available for estimation of variance components in unbalanced datasets. Extensions of the analysis of variance approach to the unbalanced case have been proposed, but these are not now commonly used. Estimation of variance components using the method of MAXIMUM LIKELIHOOD ESTIMATION can be achieved within many statistical software packages. However, maximum likelihood estimates of variance components are biased downwards in general. The preferred method for estimation of variance components in unbalanced data is RESTRICTED MAXIMUM LIKELIHOOD ESTIMATION (REML), which is also available within many software packages. In balanced datasets, REML estimation gives the same results as the analysis of variance approach as just described, whereas maximum likelihood estimation does not.

By definition, a component of variance is nonnegative, since it corresponds to the variance of a set of random effects. However, the methods for estimation of variance components can produce negative values. Usually, this occurs when the true value of the variance component is small and nonnegative. One approach to proceeding is to set the negative estimate to zero. Estimation and reporting should be handled with care for data in which a negative variance estimate has been obtained (Brown and Prescott, 1999). For further accounts of variance components, see Goldstein (1995), Searle (1971) and Snijders and Bosker (1999). RT

Brown, H. and Prescott, R. 1999: Applied mixed models in medicine. Chichester: John Wiley & Sons, Ltd. Goldstein, H. 1995: Multilevel statistical models. London: Arnold. Searle, S. R. 1971: Linear models. New York: John Wiley & Sons, Inc. Snijders, T. and Bosker, R. 1999: Multilevel analysis. London: Sage.

composite endpoint  See ENDPOINTS

compound symmetry  This term is used to describe the structure of a covariance matrix that has all its diagonal elements equal to the same value (say $\sigma_1^2$) and all its off-diagonal elements equal to another value (say $\sigma_{12}$), i.e. a covariance matrix with the form:

\[
\Sigma = \begin{pmatrix}
\sigma_1^2 & \sigma_{12} & \cdots & \sigma_{12} \\
\sigma_{12} & \sigma_1^2 & \cdots & \sigma_{12} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{12} & \sigma_{12} & \cdots & \sigma_1^2
\end{pmatrix}
\]

Such a structure is assumed by some approaches to the analysis of longitudinal data, e.g. the random intercept model, although it is generally unrealistic since, in practice, variances often increase with time and covariances frequently decrease with the time interval between two measurements. An account of testing for compound symmetry is given in Votaw (1948). SSE

(See also LINEAR MIXED EFFECTS MODELS)

Votaw, D. F. 1948: Testing compound symmetry in a normal multivariate distribution. Annals of Mathematical Statistics 19, 447-73.

conditional independence graphs  See GRAPHICAL MODELS

conditional logistic regression  This is a form of logistic regression that can be applied to matched datasets, particularly data from matched CASE-CONTROL STUDIES (see MATCHED PAIRS ANALYSIS). For such data the usual logistic regression model cannot be used since the number of parameters increases at the same rate as the sample size, with the consequence that MAXIMUM LIKELIHOOD ESTIMATION is no longer viable. The problem is overcome by regarding particular parameters as a 'nuisance' that do not need to be estimated (see NUISANCE PARAMETERS). A conditional likelihood function can then be created that will yield maximum likelihood estimators of the parameters of most interest, i.e. the regression coefficients of the EXPLANATORY VARIABLES involved. The mathematics of the procedure are described, for example, in Collett (2003). The conditional logistic regression models can be applied using standard logistic regression software as follows: first, set the sample size to the number of matched pairs; next, use as explanatory variables the differences between the values for each case and control; then, set the value of the response variable to one for all observations; and, finally, exclude the constant term from the model. SSE

Collett, D. 2003: Modelling binary data, 2nd edition. Boca Raton: Chapman & Hall/CRC Press.

conditional probability  A conditional probability is the probability of an event given that another event has occurred. For example, consider two events a and b. The probability of both a and b occurring, denoted $P(a \wedge b)$, using the multiplication rule (see PROBABILITY) can be expressed as:

\[ P(a \wedge b) = P(a \mid b) \times P(b) \tag{1} \]

Rearranging equation (1) yields the conditional probability of a given b as:

\[ P(a \mid b) = \frac{P(a \wedge b)}{P(b)} \tag{2} \]

If a and b are independent, then $P(a \wedge b) = P(a)P(b)$ and hence from equation (2) $P(a \mid b) = P(a)$. Frequently, we wish to reverse the conditioning, i.e. rather than $P(a \mid b)$ we want $P(b \mid a)$, and this can be achieved using BAYES' THEOREM.

Conditional probabilities are frequently used in epidemiology (Clayton and Hills, 1993). The figure shows a typical situation in which individuals can develop a disease or not, denoted D+ and D− respectively, having been exposed or not, denoted E+ and E− respectively. The conditional probability that individuals develop the disease given that they were exposed, i.e. P(D+|E+), is 0.8. KRA

[Figure: probability tree. Exposure branches E+ (0.6) and E− (0.4); given E+, D+ has probability 0.8 and D− 0.2; given E−, D+ has probability 0.3 and D− 0.7.]

Clayton, D. and Hills, M. 1993: Statistical models in epidemiology. Oxford: Oxford University Press.
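As a worked illustration of the multiplication rule and of reversing the conditioning, the sketch below uses branch probabilities of the kind shown in the probability tree for this entry. The value P(E+) = 0.6 and the other inputs are assumptions as read from the figure, and the variable names are illustrative only.

```python
# Branch probabilities as read from the probability tree (assumed values).
p_e = 0.6                 # P(E+): probability of being exposed
p_d_given_e = 0.8         # P(D+ | E+)
p_d_given_not_e = 0.3     # P(D+ | E-)

# Multiplication rule: P(D+ and E+) = P(D+ | E+) * P(E+).
p_d_and_e = p_d_given_e * p_e
p_d_and_not_e = p_d_given_not_e * (1.0 - p_e)

# Total probability of disease, summing over the two exposure branches.
p_d = p_d_and_e + p_d_and_not_e

# Reversing the conditioning (Bayes' theorem): P(E+ | D+).
p_e_given_d = p_d_and_e / p_d

print(f"P(D+ and E+) = {p_d_and_e:.2f}")
print(f"P(D+)        = {p_d:.2f}")
print(f"P(E+ | D+)   = {p_e_given_d:.2f}")
```

With these particular inputs P(E+|D+) happens to equal P(D+|E+); in general the two conditional probabilities differ, which is why the direction of conditioning matters.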

confidence intervals  This is a range of values calculated from a sample so that a given proportion of intervals thus calculated from such samples would contain the true population value. In research, we collect data on our research subjects so we can draw conclusions about some larger population. For example, in a randomised controlled trial comparing two obstetric regimes, the relative risk of Caesarean section for active management of labour compared to routine management was 0.97, with a confidence interval 0.60 to 1.56 (Sadler, Davison and McCowan, 2000). This trial was carried out in one obstetric unit in New Zealand. We are not specifically interested in this unit or in these patients. We are interested in what they can tell us about what would happen if we treated future patients with active management of labour rather than routine management. We want to know not the relative risk for these particular women but the relative risk for all women. The trial subjects form a sample that we use to draw some conclusions about the population of such patients in other clinical centres in New Zealand and other countries, now and in the future.

The observed relative risk of Caesarean section, 0.97, provides an estimate of the relative risk we would expect to see in this wider population. It is called a point estimate because it is a single number. If we were to repeat the trial, we would not get exactly the same point estimate. Other similar trials cited by Sadler, Davison and McCowan (2000) gave different relative risks: 0.75, 1.01 and 0.64. Each of these trials represents a different sample of patients and clinicians and there is bound to be some variation between samples. Hence we cannot conclude that the relative risk in the population will be the same as that found in our particular trial sample. The relative risk that we get in any particular sample would be compatible with a range of possible differences in the population. We estimate this range of possibilities in the population with the confidence interval.

A 95% confidence interval is defined in such a way that, if we were to repeat the trial many times and calculate a confidence interval for each, 95% of these intervals would include the relative risk for the population. Thus if we estimate that the population value is within the 95% confidence interval, we will be correct for 95% of samples.

This is a pretty difficult concept to get to grips with. The figure shows a computer simulation of relative risks and confidence intervals for 100 studies where the relative risk in the population is 0.90 and the sample size and Caesarean rate are similar to those in the New Zealand study (Sadler, Davison and McCowan, 2000). Of these 100 confidence intervals, 95 include the population value (chosen to be 0.90). Many researchers misunderstand confidence intervals and think that 95% of samples will produce point estimates within this confidence interval. This is simply not true. In the simulation, the first sample confidence interval is 0.46 to 1.15, and only 83% of sample relative risks are within these limits.

confidence intervals  Confidence intervals for 100 simulated relative risks (vertical axis: relative risk; intervals not including the population relative risk are marked)

Such intervals are not unique and indeed many intervals with this property could be chosen. We usually choose the interval so that, of those intervals that do not include the population value, half will be wholly greater than that value and half wholly less. This often leads to intervals that are symmetrical about the point estimate, although in the case of RELATIVE RISKS AND ODDS RATIOS this symmetry usually occurs on the logarithmic rather than the natural scale.

In principle, a confidence interval can be found for any quantity estimated from a sample. There are several different methods for doing this, some simple and some not. First, we shall show how confidence intervals can be found for two of the simplest statistics, MEAN and proportion, for continuous and categorical data respectively, and then see what they show about confidence intervals in general.

In the St George's Birthweight Study (Brooke et al., 1989) data on birth weight and gestational age on 1749 pregnancies were obtained. For the 1603 births at 37 weeks' gestation or more the mean birth weight was 3384 g and the STANDARD DEVIATION was 449 g. This is a large sample and the sample mean will be an observation from a NORMAL DISTRIBUTION whose mean is the unknown mean birth weight in the population and whose standard deviation is well estimated by the standard error 449/√1603 = 11.2. For a normal distribution, 95% of observations are less than 1.96 standard deviations from the mean, so 95% of sample means will be less than 1.96 standard errors from the population mean. The 95% confidence interval has as a lower limit the sample mean minus 1.96 standard errors and as an upper limit the sample mean plus 1.96 standard errors, 3384 − 1.96 × 11.2 to 3384 + 1.96 × 11.2 = 3362 to 3406 g.

Similar methods can be used for many large sample estimates. We need the estimate to be from an approximately normal distribution and the standard error to be well estimated. We can estimate a confidence interval for a proportion p using the standard error formula for a BINOMIAL DISTRIBUTION, √(p(1 − p)/n). For example, in the St George's Birthweight Study 146 of 1749 births occurred at less than 37 completed weeks' gestation. The proportion is thus 146/1749 = 0.08348 or 8.3%. The standard error is estimated by √(0.08348(1 − 0.08348)/1749) = 0.006614. The 95% confidence interval is thus 0.08348 − 1.96 × 0.006614 = 0.07052 to 0.08348 + 1.96 × 0.006614 = 0.09644. Rounding this, we get 0.071 to 0.096, which is from 7.1% to 9.6%.

For small samples things get much more complicated. We cannot assume that the estimate follows a normal distribution or that the standard error is a good estimate of the standard deviation of whatever distribution it does follow. For means, we can use a method based on the standard error if we assume that the data themselves follow a normal distribution. If we make this assumption then for a sample of n observations the difference between the sample mean and the unknown population mean divided by the standard error follows a t-DISTRIBUTION with n − 1 DEGREES OF FREEDOM. Rather than 95% of samples having means within 1.96 standard errors of the population mean, they have means within $t_{0.05}$ standard errors of the population mean, where $t_{0.05}$ is the two-sided 5% point of the t-distribution with n − 1 degrees of freedom. In the birth weight study there were 11 babies born at 34 weeks' gestation. Their mean birth weight was 2477 g with a standard deviation of 531 g, giving the standard error 531/√11 = 160.1 g. There were 11 − 1 = 10 degrees of freedom and the 5% point of the t-distribution is 2.228.
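The interval calculations in this part of the entry can be reproduced with a few lines of code. The sketch below is a minimal illustration, not a general-purpose routine: the function names are hypothetical, and the t value 2.228 is taken from the text rather than computed.

```python
import math

Z_95 = 1.96  # two-sided 5% point of the standard Normal distribution


def mean_ci(mean, sd, n, multiplier=Z_95):
    """Interval mean +/- multiplier * standard error, with SE = sd / sqrt(n)."""
    se = sd / math.sqrt(n)
    return mean - multiplier * se, mean + multiplier * se


def proportion_ci(events, n, multiplier=Z_95):
    """Large-sample interval for a proportion, SE = sqrt(p * (1 - p) / n)."""
    p = events / n
    se = math.sqrt(p * (1.0 - p) / n)
    return p - multiplier * se, p + multiplier * se


# Births at 37 weeks or more: mean 3384 g, SD 449 g, n = 1603.
print("mean, large sample:", mean_ci(3384, 449, 1603))
# Births before 37 completed weeks: 146 of 1749.
print("proportion:", proportion_ci(146, 1749))
# Babies born at 34 weeks: mean 2477 g, SD 531 g, n = 11,
# using the tabulated t value 2.228 for 10 degrees of freedom.
print("mean, small sample:", mean_ci(2477, 531, 11, multiplier=2.228))
```

The same function serves for the large-sample and small-sample means because only the multiplier changes, from 1.96 to the appropriate point of the t-distribution.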
The 95% confidence interval for the mean birth weight of babies born at 34 weeks was therefore 2477 - 2.228 × 160.1 to 2477 + 2.228 × 160.1, namely from 2120 to 2834 g.

For a proportion estimated from a small sample or small number of events, things do not work in the same way. The standard error estimate can go disastrously wrong. In a study of isolated intracardiac echogenic foci in foetuses, we found one trisomy-21 abnormality among 177 subjects (Prefumo et al., 2001). The proportion was thus 1/177 = 0.00565, or 5.65 per thousand. The usual 95% confidence interval using the normal approximation to the binomial distribution gives -5.4 to 16.6 per thousand, clearly impossible. The large sample assumption has broken down. Researchers will actually quote such impossible intervals and journals have been known to publish them! Sometimes, realising that the negative limit is impossible, researchers will replace it by zero,
but this too, though better, is still wrong. The lower limit of the confidence interval cannot actually be zero in this example. Since we have found a case in the sample, it is not possible that there are no cases in the population.

There are a number of different methods to improve this interval (Newcombe, 1998). One of these uses a procedure based on the exact individual probabilities of the binomial distribution. The binomial distribution has two parameters, the number of independent observations n we make (e.g. number of patients) and the probability p that any given observation will be a 'yes'. This probability is what we are trying to estimate. We find the lower confidence limit as the value of p so that the probability of obtaining the observed number of 'yes's or more will be 0.025 and the upper limit as the value of p so that the probability of the observed number of 'yes's or fewer will be 0.025. These probabilities are obtained by summing the exact binomial probabilities for all the possible numbers of 'yes's equal to and beyond that observed. The calculations for such methods are extremely tedious, but not to a computer. For the echogenic foci data the 95% confidence interval by this method is 0.00014 to 0.03107, or 0.14 to 31 per thousand. This is an example of an exact method calculation, because it uses the exact probabilities of the distribution (see EXACT METHODS FOR CATEGORICAL DATA). There are several other computer-intensive methods that can be used, such as the BOOTSTRAP and those based on rank tests.

The confidence interval allows for what is called sampling variation. This means that it reflects the difference between estimate and population value likely in random samples from that population. However, it does not take into account other sources of variation, termed nonsampling variation. The sample that we have is from geographical space, in that it contains one hospital, as in the active management trial (Sadler, Davison and McCowan, 2000). Even the largest clinical trial will contain at most only a few hospitals and their patients. The hospitals are not chosen randomly, so the sample will differ from the population in an unknown and inestimable way. It is also a sample in time, in that we want the sample of patients seen in the past to tell us about patients whom we will see in the future. The sample may not be as good at estimating quantities in this wider population as the confidence interval suggests.

The interval quoted in the active management trial was a 95% confidence interval and 95% of such intervals would contain the relative risk for the population. We could also calculate intervals for other percentages, e.g. a 99% interval, calculated so that 99% of possible intervals would contain the population value. For the Caesarean section relative risk the corresponding 99% confidence interval would be 0.52 to 1.81, wider than the 95% interval of 0.60 to 1.56 reported. In compensation, more of these intervals would contain the population value.
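The exact binomial procedure described earlier in this entry for the echogenic foci data (one case in 177) can be sketched as below, next to the normal approximation that gives the impossible negative limit. This is a minimal sketch, not a reference implementation: the helper names and the use of simple bisection are choices made for the illustration, and the direct summation is only suitable for modest counts.

```python
from math import comb, sqrt


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), by summing exact probabilities."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve(f, target, lo=0.0, hi=1.0, steps=100):
    """Bisection for f(p) = target, with f monotonic in p on [lo, hi]."""
    increasing = f(hi) > f(lo)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, alpha=0.05):
    """Exact interval: each tail probability is set equal to alpha / 2."""
    # Lower limit: probability of k or more 'yes's equals alpha / 2
    lower = 0.0 if k == 0 else solve(lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper limit: probability of k or fewer 'yes's equals alpha / 2
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


k, n = 1, 177
p_hat = k / n
se = sqrt(p_hat * (1 - p_hat) / n)
wald = (p_hat - 1.96 * se, p_hat + 1.96 * se)  # normal approximation
lower, upper = exact_ci(k, n)

print("normal approximation: %.5f to %.5f" % wald)
print("exact method:         %.5f to %.5f" % (lower, upper))
```

The exact lower limit is small but strictly positive, in line with the argument that one observed case rules out a population with no cases.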

We could calculate a much narrower interval. A 50% confidence interval is calculated as estimate minus or plus 0.67 standard errors, compared to estimate minus or plus 1.96 standard errors for a 95% confidence interval. The 50% interval based on a large sample normal approximation is only 34% of the width of the 95% interval. This is not very useful as an estimate, as only 50% of such intervals contain the population value they are estimating. However, it shows that if we calculate 95% confidence intervals, we can say that for about 50% of samples the middle third of the 95% confidence interval will contain the population parameter. Thus, 95% is chosen as a standard confidence level as a reasonable compromise between width (or precision) and coverage probability (accuracy).

Significance tests and confidence intervals are closely related. Many null hypotheses are about the value of something we can also estimate, such as the difference in mean between two groups. It will usually be the case that if the NULL HYPOTHESIS value (difference or regression coefficient = 0, odds ratio or relative risk = 1.0) is contained within a 95% confidence interval then the P-value will be greater than 0.05. For example, in the Birthweight Study, we might want to test the null hypothesis that mean birth weight in the population is 3400 g. To test this, we subtract 3400 from the observed mean and divide by the standard error, 11.2. This ratio, (3384 - 3400)/11.2 = -1.43, would be an observation from the standard normal distribution if the null hypothesis were true, giving P = 0.15. Here the 95% confidence interval (3362 to 3406 g) includes the null hypothesis value for the mean, 3400 g, and P > 0.05. Contrariwise, we might want to test the null hypothesis that the population mean birth weight was 3500 g. Now the test statistic is (3384 - 3500)/11.2 = -10.36, giving P < 0.0001.
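The multipliers for different confidence levels, and the two large-sample tests of the mean birth weight just described, can be checked with a short sketch. The function names are invented for the illustration; only the standard normal distribution from the Python standard library is used.

```python
from statistics import NormalDist

std_normal = NormalDist()


def z_multiplier(level):
    """Standard errors either side of the estimate for a confidence level."""
    return std_normal.inv_cdf(0.5 + level / 2)


def two_sided_p(estimate, null_value, se):
    """Large-sample z statistic and its two-sided P-value."""
    z = (estimate - null_value) / se
    return z, 2 * std_normal.cdf(-abs(z))


# Width of a 50% interval relative to a 95% interval
width_ratio = z_multiplier(0.50) / z_multiplier(0.95)

# Mean birth weight 3384 g with standard error 11.2 g
z_3400, p_3400 = two_sided_p(3384, 3400, 11.2)
z_3500, p_3500 = two_sided_p(3384, 3500, 11.2)

print("50%% / 95%% width ratio: %.2f" % width_ratio)
print("null 3400 g: z = %.2f, P = %.2f" % (z_3400, p_3400))
print("null 3500 g: z = %.2f, P = %.2g" % (z_3500, p_3500))
```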
The null hypothesis value is not included in the confidence interval and the difference is significant. Thus the 95% confidence interval can be used to do a significance test at the 5% level.

For means and their differences there is an exact relationship between the usual confidence interval and the usual significance test, because the standard error is not related to the quantities being compared (means) and thus is not affected by the null hypothesis. It may not work for proportions, relative risks, odds ratios, etc. For example, let us test the null hypothesis that in the population the proportion of births at less than 37 weeks' gestation is 8%. Under the null hypothesis, the proportion is 0.08 and the standard error is √(0.08(1 - 0.08)/1749) = 0.006487, not the same as the 0.006614 used for the confidence interval. The test statistic is (0.08348 - 0.08)/0.006487 = 0.54, P = 0.59. The null hypothesis value of the proportion is within the confidence interval 0.071 to 0.096 and the difference is not significant. Now let us consider a null hypothesis value just outside the confidence interval, 0.097. The standard error, if the null hypothesis were true, would be √(0.097 × (1 - 0.097)/1749) = 0.007077. The test statistic is (0.08348 - 0.097)/0.007077 = -1.91, P = 0.056, so the difference is not significant even though the null hypothesis value lies outside the confidence interval. This effect of the null hypothesis on the standard error is why we sometimes see odds ratios, relative risks and standardised mortality ratios where the 95% confidence interval includes 1.0, but the ratio is reported as significant.

Researchers are now encouraged to present results as confidence intervals instead of, or in addition to, P-values (Gardner and Altman, 1986). This approach is more informative than the practice of giving a P-value or stating 'significant' or 'not significant', as it provides an estimate of the size of the possible difference or ratio between the groups in the population. This is particularly useful when differences are not statistically significant, as it enables the reader to judge whether a potentially important difference could have been missed. P-values and confidence intervals both have their role and if possible both should be given. Most major medical journals now include in their recommendations to authors that the main results of studies be presented using confidence intervals (or their equivalent) and that authors should avoid relying solely on hypothesis testing.

Finally, some comments on the Bayesian perspective, there being two differing statistical philosophies, the Bayesian and the frequentist. At present few Bayesian analyses appear in the medical literature, although we may expect to see more of them in future (see BAYESIAN METHODS). People often talk about a 95% confidence interval as including the unknown population value with probability 0.95, saying, for instance, there is a 95% chance that the true value lies within the computed 95% confidence interval. Now, it is true that if we set out to collect a new sample, the probability that its confidence interval will include the population value is 0.95. However, once the sample has been collected and the interval calculated, it either includes the population value or it does not, we just do not know which. In strict frequentist terms, we cannot talk about the probability of the population parameter having any given value or range of values. It has a constant, albeit unknown, value with no probability distribution. A Bayesian is willing to think of the population value as a variable with a distribution, which represents the uncertainty in our estimate of it. Bayesians quote something called a CREDIBLE INTERVAL, which is a range of possible values that has a given probability of including the unknown population value. This probability is often set at 95%. Thus a 95% credible interval is a set of values that is estimated to include the population value with probability 95%, whereas a 95% confidence interval is a set of values chosen so that 95% of such sets would include the population value. For the proportion of births before 37 weeks, a Bayesian credible interval, assuming no prior knowledge, is 7.1% to 9.7%, virtually the same as the confidence interval (7.1% to 9.6%). The difference is academic, which is perhaps why academics have spent so much time arguing about it. JMB
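Two calculations from the closing part of this entry can be sketched together: the test of a proportion, where the standard error depends on the null value, and a Bayesian credible interval for the same proportion. The sketch rests on assumptions that the entry does not state: "no prior knowledge" is represented by a uniform Beta(1, 1) prior, the credible interval is taken as equal-tailed, and the function names are invented for the illustration.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist

events, n = 146, 1749  # births before 37 weeks, total births
p_hat = events / n


def z_test_proportion(p0):
    """z test using the standard error implied by the null value p0."""
    se0 = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se0
    return se0, z, 2 * NormalDist().cdf(-abs(z))


def binom_tail_ge(a, m, x):
    """P(X >= a) for X ~ Binomial(m, x), summed in log space."""
    below = sum(
        exp(lgamma(m + 1) - lgamma(k + 1) - lgamma(m - k + 1)
            + k * log(x) + (m - k) * log(1 - x))
        for k in range(a)
    )
    return 1 - below


def beta_quantile(q, a, b):
    """Quantile of Beta(a, b) for integer a and b, using the identity
    P(Beta(a, b) <= x) = P(Binomial(a + b - 1, x) >= a) and bisection."""
    lo, hi = 1e-12, 1 - 1e-12
    for _ in range(60):
        mid = (lo + hi) / 2
        if binom_tail_ge(a, a + b - 1, mid) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


se_in, z_in, p_in = z_test_proportion(0.08)     # null value inside the interval
se_out, z_out, p_out = z_test_proportion(0.097)  # null value just outside it

# Assumed uniform prior: posterior is Beta(events + 1, n - events + 1)
a_post, b_post = events + 1, n - events + 1
credible = (beta_quantile(0.025, a_post, b_post),
            beta_quantile(0.975, a_post, b_post))

print("null 0.080: se = %.6f, z = %.2f, P = %.2f" % (se_in, z_in, p_in))
print("null 0.097: se = %.6f, z = %.2f, P = %.3f" % (se_out, z_out, p_out))
print("95%% credible interval: %.3f to %.3f" % credible)
```

With a different prior the credible interval would shift slightly, so any agreement with the figures quoted in the entry should be read as illustrative rather than as a reconstruction of the author's calculation.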

Brooke, O. G., Anderson, H. R., Bland, J. M., Peacock, J. L. and Stewart, C. M. 1989: Effects on birth weight of smoking, alcohol, caffeine, socioeconomic factors, and psychosocial stress. British Medical Journal 298, 795-801. Gardner, M. J. and Altman, D. G. 1986: Confidence intervals rather than P values: estimation rather than hypothesis testing. British Medical Journal 292, 746-50. Newcombe, R. G. 1998: Two-sided confidence intervals for the single proportion: comparison of seven methods. Statistics in Medicine 17, 857-72. Prefumo, F., Presti, F., Mavrides, E., Sanusi, A. F., Bland, J. M., Campbell, S. and Carvalho, J. S. 2001: Isolated echogenic foci in the fetal heart: do they increase the risk of trisomy 21 in a population previously screened by nuchal translucency? Ultrasound in Obstetrics and Gynecology 18, 126-30. Sadler, L. C., Davison, T. and McCowan, L. M. 2000: A randomised controlled trial and meta-analysis of active management of labour. British Journal of Obstetrics and Gynaecology 107, 909-15.

confidence level

See CONFIDENCE INTERVALS

confirmatory factor analysis

This is a procedure for testing a hypothesised factor structure for a set of observed variables. The hypothesised structure will specify both the number of factors and which observed variables are related to which factors (Dunn, Everitt and Pickles, 1993). This contrasts with FACTOR ANALYSIS when used in its exploratory mode, when the number of factors has to be determined in some way from the data and no a priori constraints are placed on the factor structure. Confirmatory factor analysis is a theory-testing model as opposed to a theory-generating method like exploratory factor analysis.

The first step in a confirmatory factor analysis involves the calculation of either a correlation or a COVARIANCE MATRIX for a set of observed variables. Then possibly a number of competing factor models are proposed, derived either from theory or previously performed exploratory factor analyses on other datasets. The models will differ in their specifications of 'free' and 'fixed' parameters. MAXIMUM LIKELIHOOD ESTIMATION is generally used to estimate the free parameters in a model. Confirmatory factor analysis models can be fitted using one of a number of available software packages (LISREL, EQS, MPLUS) and a variety of methods can be used to test the fit of a model and to compare the fit of two competing models.

As an example of where this approach might be applied, consider a psychiatrist who measures a number of variables on a sample of mentally ill patients. The psychiatrist believes that some of the observed variables are related to a patient's depression and others to anxiety, and he or she is particularly interested in estimating the correlation between these two, essentially, LATENT VARIABLES. To make things specific suppose there are six observed variables with the first three indicating depression and the remaining three, anxiety. The correlated two-factor model to be fitted is described graphically by the path diagram shown in the figure. Apart from the error variances, the parameters to be estimated are the loadings of the first three variables on factor one (depression) - variables four, five and six are constrained to have zero loadings on this variable - and the loadings of the last three variables on factor two (anxiety) - now the first three variables are constrained to have zero loadings. The estimated correlation between the latent variables, depression and anxiety, will be a disattenuated correlation, i.e. one in which the effects of measurement errors in the observed variables have been effectively removed.

confirmatory factor analysis Path diagram for depression and anxiety example

Detailed examples of the application of confirmatory factor analysis are given in Huba, Wingard and Bentler (1981) and Dunn, Everitt and Pickles (1993). BSE

Dunn, G., Everitt, B. S. and Pickles, A. 1993: Modelling covariances and latent variables using EQS. Boca Raton: CRC Press/Chapman & Hall. Huba, G. J., Wingard, J. A. and Bentler, P. M. 1981: A comparison of two latent variable causal models for adolescent drug use. Journal of Personality and Social Psychology 40, 180-93.

confounding

See BIAS IN OBSERVATIONAL STUDIES, CASE-CONTROL STUDIES

Consolidated Standards for Reporting Trials (CONSORT) statement

This research tool was designed to improve the quality of reports of clinical trials (Begg et al., 1996; Moher et al., 2001). The core contribution of the CONSORT statement consists of a flow diagram (see the figure) and a checklist. The flow diagram enables reviewers and readers to grasp quickly how many eligible participants were randomly assigned to each arm of the trial and whether any imbalances are apparent regarding numbers of patients withdrawing from or failing to comply with their assigned treatment (see DROPOUTS). Large discrepancies or imbalances suggest the need for conducting not only INTENTION-TO-TREAT (ITT) analyses but also PER PROTOCOL analyses to seek corroboration. Such information is frequently difficult or impossible to ascertain from trial reports as they were reported in the past. The checklist identifies 21 items that should be incorporated in the title, abstract, introduction, methods, results or conclusion of every randomized clinical trial. More details can be found at www.consort-statement.org. BSE

(See also CRITICAL APPRAISAL, STATISTICAL REFEREEING)

Consolidated Standards for Reporting Trials statement Flow diagram of CONSORT statement. Boxes, from top to bottom: registered or eligible patients (n = ...); not randomised (n = ...), with reasons (n = ...); then, for each of the standard intervention and intervention arms: received the intervention as allocated (n = ...) or did not receive it as allocated (n = ...); followed up (n = ...), with timing of primary and secondary outcomes; withdrawn (n = ...), subdivided into intervention ineffective (n = ...), lost to follow-up (n = ...) and other (n = ...); completed trial (n = ...).

Begg, C., Cho, M., Eastwood, S., Horton, R., Moher, D., Olkin, I. et al. 1996: Improving the quality of reporting of randomized clinical trials: the CONSORT statement. Journal of the American Medical Association 276, 637-9. Moher, D., Schulz, K. F. and Altman, D. G. 2001: The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomized trials. Annals of Internal Medicine 134, 657-62.

consulting a statistician

'To consult the statistician after an experiment is finished is often merely to ask him to conduct a post-mortem examination. He can perhaps say what the experiment died of.' So said R. A. Fisher, later Sir Ronald, widely considered the founding father of modern statistics, and of RANDOMISATION in particular, as long ago as 1938. His tongue-in-cheek message remains sage advice, just as true today, as a reminder of the single most important aspect of seeking statistical advice: to seek it early. Many novice researchers make the mistake of believing the statistician to be the numbers person only to be approached, and then with trepidation, once data have been collected. In actuality, a consultation with a statistician should be a positive experience and opportunity to assist planning all aspects of study design, meaning neither just the subsequent analysis nor the narrow matter of SAMPLE SIZE DETERMINATION.

Naturally, there are important differences in how statistical consulting takes place according to whether the setting is within a university, a hospital, a pharmaceutical company, a government agency and so on, due to the obvious differences between public and private sector employers, not to mention geographical differences from one continent to another. Statistical consulting can also take place in a variety of ways: telephone, email or face-to-face, or a mixture thereof. This entry will focus on the most productive manner, namely face-to-face, since this maximises effective two-way communication. It also concentrates on those aspects of research projects that are reasonably consistent regardless of the particular environment, although the author's perspective is based on experience within academic settings.

The remainder of this entry examines the sort of project-specific advice a statistician can give, notably including general guidance on preparing for a first meeting with a statistician and some observations on the interaction between the statistician and clinical researcher. What cannot be included, necessarily, is local advice on where to find a nearby consulting statistician in the first place. In the event none is available, one should consider using textbooks or WEB RESOURCES IN MEDICAL STATISTICS, or even travelling to attend a short course offering an introduction to the subject. For further details concerning technical content, in addition to the process, of statistical consultations, the reader is referred to the rest of this volume or else to one of several books, such as Hand and Everitt (1987), Derr (2000) or Cabrera and McDougall (2001).

Broadly, research can be subdivided into a number of distinct stages as depicted in the figure. The worst time to approach a statistician is at the post-refereeing stage of a submitted journal article. Consultant statisticians may be able to offer some remedial help at this late stage, but only on matters of analysis, interpretation or presentation. The most common reasons why statistical referees recommend rejection of submitted manuscripts to biomedical journals pertain to design issues, which is hardly surprising when one realises that fundamental flaws in study design simply cannot be retrieved by sophisticated analyses (see STATISTICAL REFEREEING). Thus, if the paper has not been rejected outright on statistical grounds, there may be hope for the manuscript after suitable revision. A statistician approached at such a late stage is likely to drop more than a subtle hint that it would be altogether more satisfactory essentially to heed Fisher's advice and request that the researcher come along sooner in a project's life cycle the next time!

consulting a statistician Schematic diagram of the research process, from initial thoughts through to dissemination of the study results. Statistical input should ideally be sought at the study formulation stage

There is a temptation to think seeing a statistician is unnecessary if one has confidence in one's own statistical knowledge and ability (and access to relevant SOFTWARE). This can be a dangerous policy for the novice researcher, especially if the confidence turns out to be misplaced or handling the data more complex than envisaged. Even veterans of medical research with substantial statistical skills of their own can find consulting a statistician invaluable, despite the time and effort required in the midst of busy research agendas and clinical commitments. This is by no means just to delegate data-related tasks but to gain additional, independent input about the intended study from an altogether different perspective. Statisticians, after all, do not see the world of medicine and research in the same way as those on the frontline dealing directly with patients, nor for that matter those working with test tubes in the laboratory.

What sort of help can a statistician offer? Clearly this depends on the nature of the project itself and the extent of the statistician's involvement. For example, if in an academic environment a student seeks advice on a research project forming a part of a degree, then involvement will be necessarily less than in a full collaboration. In the former case, the student needs to own the analyses and be able to defend them single-handedly, so that the role of the statistical consultant is to point in the right direction by recommending an appropriate choice of study design and method for data analysis. To help avoid becoming a surrogate supervisor by default it can be helpful to suggest that the student's project supervisor should also attend the consultation.

It is important to clarify early on in the consultation process if it is expected to become a full collaboration, for then issues of payment, if indicated, and co-authorship (or acknowledgement for lesser statistical involvement) need to be discussed and agreed. Payment for statistical advice remains a delicate matter and local rules would dictate. It is sensible, so as not to discourage those who most need statistical help, to have a policy whereby the first meeting (of say about an hour) is provided free of any direct charge to the consultee. Parker and Berman (1998) provide some helpful criteria for suggesting when authorship may or may not be appropriate. As a rule, if the finished piece of work could not have attained its statistical quality without the assistance of the consultant statistician, and more than just elementary descriptive or inferential statistics are involved, then the default ought to be co-authorship for the statistician. There is at least anecdotal evidence that statistical co-authorship enhances chances of publication in first-choice journals. Equally, there is a danger that statisticians' names can be used against their wishes to lend perhaps more credence than is due to some submitted papers or grant applications!

What should be brought to a first meeting? In order to make the most use of the time available it is best for the consultee to have made some specific preparations. A checklist can assist, perhaps in the format of a QUESTIONNAIRE to be completed in advance of the initial meeting. Useful questions to address both 'housekeeping' matters as well as more substantive issues concerning the project include the following:

1. What is the single main aim of the project? (A brief answer to this fundamental question at least ensures that the meeting can be focused.)
2. What stage is the project at right now? (Options can be forming ideas/designing protocol/collecting data/analysis of data/writing up/referee's comments.)
3. What area(s) do you think you need help with? (Some areas are formulating ideas/sample size calculation/designing protocol/making grant application/randomisation practicalities/carrying out the study/collecting data/managing data/analysis you are doing/checking your analysis/checking written report/responding to referee.)
4. What role would you like the statistician to play? (These could be advisor/co-applicant/interpreter of results/co-author, although note that the statistician would reserve the right to decline the latter if authorship was felt to be inappropriate.)
5. Does this work form part of a dissertation or thesis? (See comments above concerning student work.)
6. What is the source of patients or subjects and the criteria for selecting them? (This allows an opportunity to discuss or review appropriate study design.)
7. How many subjects are required or available? (If this is to be a topic for advice, there is a need to know clinically relevant differences in proportions involved and/or standard deviations for continuous outcomes.)
8. What is the main outcome measure? (Again to focus attention on primary as opposed to secondary ENDPOINTS or, in the worst case, to ensure the project does pre-specify at least one endpoint.)
9. What is the main comparison or relationship of interest? (To encourage being as specific as possible and to check for a suitable control group.)
10. What other quantities are being measured and when? (For example, BASELINE MEASUREMENTS, covariates, secondary outcomes.)
11. What problems have been or are anticipated in data collection? (To discuss, for example, accuracy, MISSING DATA, repeated measures, matching but essentially any potential BIAS.)
12. What are expected or hoped-for results at the study's end? (Again to focus on the real reason for performing the research.)
13. Are there any specific approaches to data analysis intended? (For instance, the same method as in a previously published study, preferably with a hard copy to be handed over.)
14. Is there any further information you would like to give regarding the study? (A suitable closing question to allow, one hopes, any pertinent facts to emerge.)

It is best if answers to the above catalogue of questions can be sent in advance of the meeting, along with a brief description of the project and copies of related documents to assist the statistician's understanding (e.g. protocol, grant application, ethics committee submission).

In terms of practicalities, the statistician may have some further expectations of the consultee to bring or transmit in advance of the first meeting at which data are to be analysed (recall, ideally, this is not at the first encounter!). Statisticians do not usually take on the more mundane data entry tasks, so would not be prepared to type in the numbers. They may express preferences for how data are presented electronically in terms of file type (e.g. Excel being a common choice) and media (floppy disks are fine, though somewhat old-fashioned; email attachments generally work better for small-to-moderate sized datasets, or USB pen drives more generally). In any event, it is always important to check for viruses to avoid spreading contaminated files. The layout of the data should ordinarily be as a spreadsheet, with well-labelled variable names, one column per variable and one row per subject. It is best to ask if there is a data entry preference when handling repeated measures data, but if in doubt the spreadsheet works well. In any case, data provided must be reasonably clean and free from data entry errors, although the odd OUTLIER is excusable. Due to confidentiality issues (e.g. the UK's Data Protection Act 1998) there should not be any uncoded individual patient identifiers: i.e. names and addresses and other information that could be used to trace individuals must have been removed. Obviously, the anonymisation process must generate unique patient IDs in order to be fully reversible so that queries with data can be checked from original records that are stored elsewhere.

While it is not a serious problem, it is better to code data numerically rather than alphanumerically. For example, '1' and '2' for 'male' and 'female' respectively is better than use of 'M', 'm', 'male', 'Male', 'MALE', etc., especially as accidental leading or trailing blanks can add to potential confusion, possibly creating a needless missing data point on subsequent conversion to numeric format. In general, categorical variables should have a different number representing each group, with an accompanying description, or internal labelling, of how the categories are coded. Equally, missing data are better handled by inserting an obviously impossible value (e.g. '-99' when all other values are positive) rather than just leaving a spreadsheet cell blank. The statistician would rather be told about any such embedded code, however, to avoid unnecessary runs of software routines after noticing, for example, strange residuals in regression analyses.

An altogether less tangible item to bring along, but arguably the most important for a successful meeting, can be summarised as the right attitude. Statistical consultation involves a high degree of communication and mutual respect. Since areas of expertise are different, jargon is to be avoided both by the statistician and the consultee. (Medics are not alone in having big words, or abbreviations and acronyms, to describe things that are obvious only to themselves!) Punctuality is important but it is understood that medical emergencies can and do occur that necessitate being late, in which case arranging for a telephone message is a simple courtesy, or cutting short an ongoing meeting at a bleeper's notice. However, there is no such thing as a statistical emergency, so there is little excuse for the consultee who demands an immediate appointment with a statistician or expects results to be turned around within, say, 24 hours to meet his or her deadline for a grant, ethics or conference submission, particularly as such deadlines are typically known months in advance.

Also bear in mind some consulting statisticians are new to their jobs. Just as some training of junior doctors occurs 'on the job', so too do junior statisticians have to learn, ideally under supervision from someone more experienced, by interacting with real clients in real consultations. The transition from a university degree course to a practising statistical consultant is never automatic. An attitude of patience is helpful in these circumstances, much as tolerance by drivers stuck behind a learner struggling with hill starts (all were learner drivers once!).

To close, and in keeping with the spirit of Fisher's advice quoted earlier, it can be instructive to consider ways of having an unhelpful meeting between a medical researcher and statistician. So long as both parties can avoid making these mistakes, there is scope for real progress and genuine collaboration.

First, what are some of the ways a statistician can upset medical colleagues?

1. Being too nit-picky, precise, detail-oriented and failing to see the big picture.
2. Being slow to respond to requests for appointments or to analyse data.
3. Being overly critical of genuine-but-flawed attempts to analyse data themselves.
4. Using unnecessary jargon.
5. Using unnecessarily complicated methods when simpler ones suffice.
6. Spending too much, or too little, time during the consultation.
7. Embarking on a mathematical lecture within a consultation.
8. Only expecting to meet on your home turf (despite owning a laptop).
9. Believing there is such a thing as an average patient.
10. Thinking EVIDENCE-BASED MEDICINE (EBM) means clinical experience counts for nothing compared to having a few well-honed CRITICAL APPRAISAL skills and a recently published META-ANALYSIS to hand.

Finally, how to upset your statistician?

1. Saying 'This will only take 5 minutes of your time', for it will not.
2. Arriving unannounced, late or not at all (notwithstanding genuine emergencies).
3. Waiting until the grant or ethics application deadline is tomorrow and leaving no time for review or statistical input before sending the document off.
4. Drip-feeding data or hypotheses or telling half the story ('Oh, actually it's the same patient seen five times') or shifting between study aims.
5. Taking for granted - not considering acknowledgement or co-authorship or bothering to inform if that application or journal submission was ever successful or not.
6. Saying, in earshot, 'I just need the stato to crunch the numbers' and generally regarding the statistician as a technical service provider.
7. Expecting knowledge of specialist medical terminology.
8. Expecting poorly entered data to be cleaned or forgetting to run a virus check on your data.
9. Demanding 'What's the P-value?' or 'Can't you find one that is significant?'
10. Coming too late in the research process and complaining about a statistical post-mortem!

CRP

Cabrera, J. and McDougall, A. 2001: Statistical consulting. New York: Springer. Derr, J. 2000: Statistical consulting: a guide to effective communication. Pacific Grove, CA: Duxbury Press. Hand, D. J. and Everitt, B. S. (eds) 1987: The statistical consultant in action. Cambridge: Cambridge University Press. Parker, R. A. and Berman, N. G. 1998: Criteria for authorship for statisticians in medical papers. Statistics in Medicine 17, 2289-99.

contingency coefficient

This is a measure of the strength of an association between two categorical variables. While the CHI-SQUARE TEST can detect an association between two variables, it is not a good measure of the strength of that association. This is because it is also dependent on the sample size and the number of categories into which the variables are classed. Typically, contingency coefficients are adjustments of the chi-square statistic, intended to remove the dependence on those factors. Because they are based on the chi-square statistic, any attempt to test the contingency coefficient for significance will merely resolve into repeating the chi-square test of independence. The two most common contingency coefficients are Cramér's contingency coefficient (also known as Cramér's C, Cramér's V and occasionally Cramér's v) and Pearson's contingency coefficient (often just referred to as the contingency coefficient or as Pearson's coefficient of mean square contingency). For a table with r rows and c columns, with k being set as equal to the smaller of r and c, that produces a chi-square statistic of X² from n observations, the formulae for Cramér's and Pearson's coefficients are:

$$\text{Cramér's coefficient} = \sqrt{\frac{X^2}{n(k-1)}}$$

and

$$\text{Pearson's coefficient} = \sqrt{\frac{X^2}{X^2+n}}$$

While Cramér's coefficient can take values from zero to one, Pearson's coefficient can never reach one (the denominator is clearly always larger than the numerator). In fact Pearson's coefficient has a known maximum of $\sqrt{(k-1)/k}$, so it is possible to rescale this coefficient to lie in the range 0 to 1. While the use of these measures is popular in some fields, more so if we consider that the phi coefficient for a 2 x 2 table (see CORRELATION) is a special case of Cramér's coefficient, interpretation is not straightforward. Clearly, in some sense, the larger the coefficient is, the greater the association. However, the absolute value does not have any clear meaning and comparing coefficients from two tables (especially tables of different dimensions) is not straightforward. Contingency coefficients are widely used as a result of their convenience and in spite of their limitations. For 2 x 2 tables, odds ratios are possibly a better measure as it is easy to produce confidence intervals and they have a familiar interpretation. For larger tables with at least one ordered categorical variable a measure based on the Spearman rank correlation might be more appropriate. For further details see Goodman and Kruskal (1954), Fleiss (1981), Siegel and Castellan (1988) and Conover (1999). AGL

Conover, W. J. 1999: Practical nonparametric statistics. New York: John Wiley & Sons, Inc. Fleiss, J. L. 1981: Statistical methods for rates and proportions, 2nd edition. New York: John Wiley & Sons, Inc. Goodman, L. A. and Kruskal, W. H. 1954: Measures of association for cross-classifications. Journal of the American Statistical Association 49, 732-64. Siegel, S. and Castellan Jr, N. J. 1988: Nonparametric statistics for the behavioural sciences, 2nd edition. New York: McGraw-Hill.

contingency tables

These are cross-tabulations that arise when a sample from some population is classified with respect to two or more qualitative variables. The first table shows a simple example involving two such variables each with three categories. A more complex contingency table that involves a classification with respect to three variables is shown in the second table.

contingency tables Incidence of cerebral tumours

                 Type
Site       A      B      C    Total
I         23      9      6      38
II        21      4      3      28
III       34     24     17      75
Total     78     37     26     141

I, frontal lobes; II, temporal lobes; III, other cerebral areas. A, benign tumours; B, malignant tumours; C, other cerebral tumours.
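The two coefficients defined under CONTINGENCY COEFFICIENT can be illustrated on the cerebral tumours table. The following is a minimal Python sketch, not a definitive implementation: the function names are hypothetical, the counts are those of the first table, and it assumes every row and column total is non-zero so that no expected count is zero.

```python
from math import sqrt


def chi_square(table):
    """Return (X2, n) for a two-way table of counts given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            x2 += (observed - expected) ** 2 / expected
    return x2, n


def contingency_coefficients(table):
    """Cramer's and Pearson's coefficients, plus Pearson's maximum sqrt((k-1)/k)."""
    x2, n = chi_square(table)
    k = min(len(table), len(table[0]))  # smaller of the row and column counts
    return {
        "chi_square": x2,
        "cramer": sqrt(x2 / (n * (k - 1))),
        "pearson": sqrt(x2 / (x2 + n)),
        "pearson_max": sqrt((k - 1) / k),
    }


# Cerebral tumours table: rows are sites I to III, columns are types A to C.
tumours = [[23, 9, 6], [21, 4, 3], [34, 24, 17]]
result = contingency_coefficients(tumours)
```

For these data the chi-square statistic is about 7.84, Cramér's coefficient about 0.17 and Pearson's coefficient about 0.23, against a maximum for Pearson's coefficient of about 0.82 for a 3 x 3 table.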

[Table: contingency tables. Coronary heart disease: counts of subjects with CHD (yes) and without CHD (no), cross-classified by blood pressure (levels 1 to 4) and serum cholesterol (levels 1 to 4).]

Blood pressure: 1, [...] 167 mmHg; Serum cholesterol: [...]

[...]

$$D_i = \frac{r_i^2}{p s^2}\,\frac{h_i}{(1-h_i)^2}$$

where r_i is the residual of the ith observation, p is the number of independent variables, s² is the variance of the estimate and h_i is the leverage of the ith observation, given by the ith diagonal element of the so-called 'hat' matrix H = X(X'X)⁻¹X', where X is the data matrix of independent values (see Cook and Weisberg, 1999). For a point to be influential it must be both an outlier, i.e. have a high residual, and it must also have high leverage, i.e. be far from the centre of gravity of the points (see the figure).

An informal approach is to sort the distances in order and examine the few cases with the highest distances. A large jump between these and the rest can suggest points worth investigating. Cases thus identified might be considered for removal or at least further investigation (subject to the caution that removal of OUTLIERS always requires). Analogous quantities for other models are available, e.g. for LOGISTIC REGRESSION (see Pregibon, 1981). If the interest is in one or more particular parameters in the regression, rather than the complete set taken as a whole, 'dfbetas' can be computed: these estimate the changes in the individual parameters after deleting each case. ML

Cook, R. D. 1977: Detection of influential observations in linear regression. Technometrics 19, 15-18. Cook, R. D. and Weisberg, S. 1999: Applied regression including computing and graphics. New York: John Wiley & Sons, Inc. Hamilton, L. C. 1992: Regression with graphics. Belmont: Duxbury. Pregibon, D. 1981: Logistic regression diagnostics. Annals of Statistics 9, 705-24.
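As a rough illustration of the leverage and distance quantities in the Cook's distance entry, the following Python sketch handles a simple linear regression with one predictor. The function name and the toy data are hypothetical. Two choices are assumptions made for the example: p is taken as 2 (counting the intercept and the slope as fitted parameters, a common convention), and the closed-form leverage used holds only for the one-predictor case.

```python
def cooks_distances(x, y):
    """Leverages, residuals and Cook's distances for y = a + b*x fitted by least squares."""
    n = len(x)
    p = 2  # assumption: intercept and slope counted as the fitted parameters
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - p)  # residual variance estimate
    # Diagonal of the hat matrix for a single predictor plus intercept.
    leverages = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    distances = [
        (e ** 2 / (p * s2)) * h / (1 - h) ** 2
        for e, h in zip(residuals, leverages)
    ]
    return leverages, residuals, distances


# Hypothetical data: nine points near a line and one far-out point that is off it.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 6.9, 8.2, 8.8, 5.0]
leverages, residuals, distances = cooks_distances(x, y)
most_influential = max(range(len(x)), key=distances.__getitem__)
```

Sorting `distances` and looking for a large jump at the top mirrors the informal approach described in the entry; here the last point, which combines a large residual with high leverage, stands out.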

[Figure: Cook's distance. Three points in a sample of 50, only one of which (C) has a high Cook's distance. Point A has a high residual but low leverage and B has a low residual but high leverage.]

coplots

See TRELLIS GRAPHS

COREC

See ETHICAL REVIEW COMMITTEES

correlation

Correlation is used to measure the strength of the linear relationship between two random variables. If we plot two variables on a SCATTERPLOT, their correlation is a measure of how closely the points lie to a straight line. We measure correlation by a correlation coefficient. The simplest of these is PEARSON'S CORRELATION COEFFICIENT, also known as the product-moment correlation coefficient or simply as the correlation coefficient. This is the sum of products of differences from the MEAN divided by the square roots of the two sums of squares about the mean and is usually denoted by r.

$$r = \frac{\sum_i (x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_i (x_i-\bar{x})^2 \sum_i (y_i-\bar{y})^2}}$$
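Pearson's correlation coefficient, as defined in words above, can be sketched in a few lines of Python. The function names and the toy data are hypothetical. The second function gives the t statistic used later in this entry to test a zero population correlation, in its usual form t = r√((n − 2)/(1 − r²)), which is assumed here to be the form intended.

```python
from math import sqrt


def pearson_r(x, y):
    """Sum of products about the means over the root of the two sums of squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing zero population correlation (requires |r| < 1)."""
    return r * sqrt((n - 2) / (1 - r * r))


x = [-2, -1, 0, 1, 2]
r_line = pearson_r(x, [3 * xi + 1 for xi in x])   # points exactly on a rising line
r_curve = pearson_r(x, [xi ** 2 for xi in x])     # exact but nonlinear relationship
```

The second data set shows why the coefficient measures only linear association: y is determined exactly by x, yet the products of differences cancel and the coefficient is zero.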

The confusing symbol 'r' (rather than 'c') is for historical reasons: it appears to have indicated 'regression' originally. It is now well established and if a medical paper uses 'r = ...' without explanation, it usually means the correlation coefficient. When we want to distinguish between the correlation coefficient in a sample, r, and the correlation coefficient in the population from which the sample was drawn, we use 'ρ', the Greek letter 'rho', to denote the latter. The figure shows some sample correlation coefficients. The coefficient is positive when large values of y are associated with large values of x, the variables being said to be positively correlated, as in (a), (b) and (c) in the figure.

The majority of observations will have either both values greater than the mean or both less than the mean. In either case, observation minus mean will have the same sign, either positive or negative, for both variables and the product of these differences will be positive. Hence the sum of products will be positive and the correlation coefficient will be positive. The correlation coefficient is negative when small values of y are associated with large values of x, the variables being negatively correlated as in (g), (h) and (i) in the figure. The majority of observations will have one value greater than the mean and the other less than the mean. Observation minus mean will have different signs for the two variables and the product of these differences will be negative. Hence the sum of products will be negative and the correlation coefficient will be negative. The correlation coefficient has a maximum value of +1 when the points all lie exactly on a straight line and the variables are positively correlated and a minimum of -1 when the points all lie exactly on a straight line and the variables are negatively correlated. When there is no linear relationship at all the coefficient is zero and the variables are said to be uncorrelated, as in (d) in the figure. Correlation only measures the strength of the linear (i.e. straight line) relationship. Nonlinear relationships may be missed or underestimated by it. In the figure, (e) shows a strong relationship yet the correlation coefficient is zero and (f) shows an exact mathematical relationship, without any random variation, yet the correlation coefficient is less than one because the relationship is not a straight line.

We can test the NULL HYPOTHESIS that the population correlation is zero, i.e. that there is no linear relationship between the two variables, using a simple t-test. At least one of the two variables must follow a normal distribution and the observations must be independent. If we can assume this, we require only the value of r and the sample size n. Then, if the null hypothesis were true,

$$t = r\sqrt{\frac{n-2}{1-r^2}}$$

would follow a t-distribution. There are tables of this test in many books and almost all programs that calculate r also give the P-VALUE. As a result, correlation coefficients in medical papers are almost invariably followed by P-values, e.g. 'r = 0.57, P [...]

[Figure: correlation. Scatterplots of variable Y against variable X in panels (a) to (i), each labelled with its sample correlation coefficient r.]

[...] 20 events for each category combination. For each variable being considered in a Cox model we test the NULL HYPOTHESIS that the variable is not important to the model, i.e. that the parameter value associated with the variable, β, is zero; this is equivalent to the hazard ratio (HR) for that variable being one, since HR = exp(β) = e⁰ = 1. This can be tested with a z-statistic where z = b/SE(b), where b is the estimate of the parameter and SE(b) is its standard error. Under the null hypothesis this should follow a standard normal distribution and thus P-VALUES can be calculated in the usual way. We may assess models and the addition or removal of variables to models using a variety of different tests including the Wald test, LIKELIHOOD RATIO test and score test. The score test is the most complex and less commonly used test. The Wald test looks at the change in the overall value between two models where the DEGREES OF FREEDOM is the number of different variables between the models. The likelihood ratio test compares the 'likelihoods' of the two models and takes a more general approach than the Wald test; it looks at how the included variables explain the variation in the model. This is, therefore, the preferred method for reasons of consistency and stability. The time of each outcome event (failure time) is not actually relevant in a Cox model, but the ordering of these failures is. Therefore, consideration needs to be given to the order of failures in the event of failures with tied event times. These can be dealt with in a series of methods including marginal calculation, partial calculation, Efron approximation and Breslow approximation (see Kalbfleisch and Prentice, 2002). The last of these is the simplest and is an adequate approximation if there are relatively few tied failures. Care should be taken when using the Cox model if there are too many tied event times.
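The z test for a single Cox model coefficient described above can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the coefficient and standard error are made-up numbers rather than output from a fitted model, and the two-sided P-value uses the standard normal distribution via the complementary error function.

```python
from math import erfc, exp, sqrt


def wald_z_test(b, se):
    """z statistic, two-sided normal P-value and hazard ratio for one coefficient."""
    z = b / se
    p_value = erfc(abs(z) / sqrt(2))  # P(|Z| > |z|) for a standard normal Z
    hazard_ratio = exp(b)
    return z, p_value, hazard_ratio


# Hypothetical estimate: b = log(2) with standard error 0.25.
z, p_value, hazard_ratio = wald_z_test(0.693147, 0.25)

# Under the null hypothesis value b = 0 the hazard ratio is exactly one.
z0, p0, hr0 = wald_z_test(0.0, 0.25)
```

In practice b and SE(b) would come from statistical software that fits the Cox model; the sketch covers only the final step from estimate to test.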

As for normal linear regression, it is possible to assess the fit of the Cox model by calculating residuals. However, there are no unique residuals for the Cox model. Commonly used residuals are Schoenfeld and martingale residuals, although it can be difficult to interpret whichever are used. It is also possible to assess whether individual explanatory variables violate the proportional hazards assumption (see PROPORTIONAL HAZARDS) and therefore assess whether a variable should be included in the model. MS/MP

Cleves, M. A., Gould, W. W. and Gutierrez, R. G. 2003: An introduction to survival analysis using Stata, revised edition. Texas: Stata Press. Cox, D. R. 1972: Regression models and life tables (with discussion). Journal of the Royal Statistical Society B 34, 187-220. Kalbfleisch, J. D. and Prentice, R. L. 2002: The statistical analysis of failure time data, 2nd edition. New York: John Wiley & Sons, Inc. Machin, D. and Parmar, M. K. B. 1995: Survival analysis: a practical approach. London: John Wiley & Sons, Ltd.

Cramér's contingency coefficient

See CONTINGENCY COEFFICIENT

credible interval

When the aim of a Bayesian analysis (see BAYESIAN METHODS) is to provide a scientific inference about an unknown parameter all the required information about the uncertainties involved is contained in the POSTERIOR DISTRIBUTION. There is a sense in which the only truly satisfactory inference summary is the complete 'picture' represented by the posterior distribution. Alternatively, a range of posterior distributions corresponding to a range of prior specifications allows a display of the sensitivity of 'conclusions' to 'assumptions'. Sometimes, however, providing a complete picture of the uncertainty in estimation by a posterior distribution is less convenient than providing a low-level summary of the message contained within it. Credible intervals are to Bayesian statistics as CONFIDENCE INTERVALS are to frequentist statistics; they provide a simple summary of the uncertainty associated with the estimate of an unknown parameter. If we suppose that the posterior distribution for an unknown parameter θ is denoted by p(θ) then an interval (θ_L, θ_U) is said to form a 100(1 - α)% posterior credible interval for θ if

$$\int_{\theta_L}^{\theta_U} p(\theta)\,d\theta = 1-\alpha$$

There are infinitely many ways to determine a credible interval, some of which are illustrated in the first figure. Credible interval (1) is determined so that it excludes regions of equal posterior probability, each tail corresponding to a probability of α/2. Credible interval (2) excludes a region of exactly α on the lower side extending to infinity on the right. In contrast, credible interval (3) excludes a region of exactly α content on the right beginning at zero. The final credible interval (4) is termed the highest posterior density (HPD) interval and is constructed so that every parameter value within the interval has a higher density than every value outside the interval and hence they are more likely values. It can be demonstrated that the HPD interval in any particular case determines the shortest credible interval. If the posterior distribution is symmetric (and unimodal) then the equal-tailed and HPD intervals coincide. The concept of a credible interval generalises to more than a single parameter.
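The difference between the equal-tailed and HPD intervals can be illustrated numerically. The sketch below is only an approximation under stated assumptions: the posterior is represented by simulated draws, a gamma distribution stands in for a skewed posterior purely for illustration, the function names are hypothetical, and the HPD interval is approximated by the shortest interval containing the required proportion of draws (appropriate for a unimodal posterior).

```python
import math
import random


def equal_tailed_interval(samples, alpha=0.05):
    """Interval leaving (approximately) alpha/2 of the draws in each tail."""
    s = sorted(samples)
    n = len(s)
    m = math.ceil((1 - alpha) * n)   # number of draws the interval must contain
    start = (n - m) // 2             # equal numbers of draws excluded on each side
    return s[start], s[start + m - 1]


def hpd_interval(samples, alpha=0.05):
    """Shortest interval containing the same number of draws (HPD approximation)."""
    s = sorted(samples)
    n = len(s)
    m = math.ceil((1 - alpha) * n)
    best = min(range(n - m + 1), key=lambda i: s[i + m - 1] - s[i])
    return s[best], s[best + m - 1]


random.seed(1)
# Hypothetical skewed posterior, represented by 20000 simulated draws.
draws = [random.gammavariate(2.0, 1.0) for _ in range(20000)]
equal_tailed = equal_tailed_interval(draws)
hpd = hpd_interval(draws)
```

Because the equal-tailed interval is one of the candidate windows searched by `hpd_interval`, the HPD approximation can never be the wider of the two, which matches the shortest-interval property stated above.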

[Figure: credible interval. Credible intervals for an unknown parameter, with intervals (1) to (4) marked.]

The second figure illustrates a bivariate (two-parameter) HPD region constructed in exactly the same way as the univariate case but with every point within the region having a higher density than every point outside it. The specific example is taken from an extension of the Bradley-Terry paired-comparison model (see, for example, Imrey, 1998), accounting for ties. For a given probability content such a credible region has the smallest area. AG

[Figure: credible interval. Bivariate HPD region; horizontal axis: probability of treatment preference.]

Imrey, P. B. 1998: Bradley-Terry model. In Armitage, P. and Colton, T. (eds), Encyclopedia of biostatistics. Chichester: John Wiley & Sons, Ltd.

critical appraisal

This is a process that evaluates research reports and assesses their contribution to scientific knowledge and is typically applied to research papers in medical journals. A careful evaluation of the medical literature is important because the quality of research is variable, and often very poor. It is imprudent to assume that a paper is error free just because it has been published: even papers in well-respected journals contain faults that cast doubt on the conclusions. Altman has researched the extent and implications of errors in the medical literature, estimating that reviews have found statistical errors in about half of published papers (Altman, 1991a). The problem of poor-quality research is set in the context of the increasing use of statistics in the medical literature. Altman (1991a) describes two surveys of research papers published in the New England Journal of Medicine, in 1978-1979 and 1990. In this time the proportion of papers containing nothing more than descriptive statistics fell from 27% to 11%, while the proportion using more complex statistical methods, such as SURVIVAL ANALYSIS, increased dramatically. A good understanding of statistical analysis, alongside an awareness of statistical issues surrounding research design and execution, is therefore essential to effective appraisal of the medical literature.

Altman (1994) argued, in an editorial entitled 'The scandal of poor medical research', that research in the medical arena is often done with the aim of furthering a curriculum vitae, rather than promoting scientific knowledge. He suggests that: 'Much poor research arises because researchers feel compelled for career reasons to carry out research that they are ill equipped to perform, and nobody stops them.' The situation is compounded because the individual is 'expected to carry out some research with the aim of publishing several papers', the number of publications being 'a dubious indicator of ability to do good research; its relevance to the ability to be a good doctor is even more obscure'. This culture, Altman argues, leads to poor-quality research. An additional difficulty arises because junior doctors typically move jobs frequently, but are nevertheless often expected to conduct research during their short tenures. This may lead to small sample sizes as well as inadequate time for training, planning, analysis and formulation of conclusions. Further problems may occur when investigators are expected to complete research initiated by their predecessors. Altman (1991b) suggests that easy access to computers and statistical packages unaccompanied by corresponding technical understanding, as well as inadequate statistical education, also contribute to the errors.

Elsewhere, Altman (1980) describes the grave ethical implications of poor-quality research. He argues that it is unethical to carry out bad scientific experiments since patients may be subject to unnecessary risk, discomfort and inconvenience, while other resources, including the researcher's time, are diverted from more valuable functions. The publication of erroneous results is also unethical since it may lead directly to patients receiving an inferior treatment. More subtle consequences are the encouragement to other researchers to replicate flawed methods or to do further research based on erroneous premises, as well as the difficulty in getting ethics committees to permit further research when it is thought that the 'correct' answer is known.

Many medical journals employ statistical reviewers in an attempt to improve the quality of research design, statistical analysis and presentation of results (see STATISTICAL REFEREEING). Goodman, Altman and George (1998) report that in a 1993-1995 study 37% of journals surveyed had a policy that guaranteed a statistical review before an acceptance decision. Direct evidence of the effect of statistical reviewing is limited. However, Schor and Karten (1966) studied the implementation of a programme of statistical review at a leading medical journal. Of the 514 original contributions considered 26% were judged statistically acceptable; this increased to 74% once these manuscripts had been published after statistical review. Gardner and Bond (1990) performed a similar study on 45 papers submitted to the British Medical Journal. They found that only 11% were initially considered suitable for publication, but after statistical review 84% were regarded to be of an acceptable statistical standard. However, there is much research that has not undergone statistical review and reviewing itself is a subjective process.

It is thus essential to read papers in the medical literature cautiously: the reputation of a journal is not a guarantee of the quality of research reported. Errors in research vary in their magnitude and impact and a major element of critical appraisal is therefore to evaluate their potential effects on conclusions. The following describes some common errors in the design, analysis, presentation and interpretation of medical research. The comments are not comprehensive, but represent some of the more widespread and important errors made in the research process (see also pitfalls in medical research). Andersen's (1990) book, Methodological errors in medical research,

contains a more complete description of errors illustrated by examples from the medical literature, although the author nevertheless describes it as an incomplete catalogue.

Medical research can be broadly divided into CLINICAL TRIALS, COHORT STUDIES, CASE-CONTROL STUDIES and CROSS-SECTIONAL STUDIES. Clinical trials are experimental studies where the investigator assigns participants to different interventions, preferably randomly. The well-conducted randomised controlled double-blind clinical trial comes the closest to establishing cause and effect between intervention and outcome in a single study. Cohort studies, case-control studies and cross-sectional studies are all observational studies. Here the investigator observes participants without making any intervention. Conclusions are not considered as robust as those from experimental studies because factors controlling exposure may also be related to the outcome. However, observational studies are common because it is often impractical or unethical to assign participants to interventions: for example, it would not be possible to assign subjects to be smokers or nonsmokers. Each study design has advantages and disadvantages for specific research questions and an important initial consideration in critical appraisal is whether an appropriate experimental design has been employed. Several of the following criteria relate to clinical trials where rigorous design and conduct of studies is imperative if results are to be conclusive.

The vast majority of research studies cannot consider the whole population of interest and therefore a sample is selected. Results from this sample are then applied to the population of interest. If this inference is to be valid it is vital that the sample is representative of the population. A key concept is that of random selection; if the participants are randomly selected from the population then there is the best chance of the sample being truly representative. The research setting is often a pertinent consideration here: a study of childbirth in a maternity hospital receiving a high proportion of referrals for complications may not be representative of childbirths throughout the country. DROPOUT or refusal rates are also important issues because there is a strong possibility that those who do not take part in a study are systematically different from those who do. Although dropouts and refusals should be minimised, they are usually inevitable and a good research paper will describe the representative nature of a sample by reporting clearly the number originally selected, as well as the number completing the study. Reasons for dropout should be given, if possible, and any available characteristics of those who do not complete the study compared with those that do. All research papers must describe features of the study sample so that comparisons with the relevant population can be made.

A very common problem in experimental design is the lack of a pre-study sample size calculation. This indicates the number of participants required to be reasonably likely to detect a clinically significant effect. It is considered unethical to undertake a study with insufficient numbers to detect such an effect. Therefore it is important that a pre-study sample size calculation is performed and described in the research report, providing sufficient detail about the assumptions made, so that the calculation can be verified. Sample size considerations should be based on the primary outcome variable in a study, and should also allow for dropout or refusal rates.

In clinical trials the concepts of blinding and random allocation to intervention are essential aspects of experimental design. Blinding is necessary because BIAS may enter a study through a participant or observer knowing the intervention allocation. In a double-blind clinical trial neither the participant nor the observer is aware of the allocation. Blinding is clearly not always feasible, for example, in a trial comparing an intervention of physiotherapy to no physiotherapy among a sample of elderly patients who have had a fall. In some trials, it may only be possible to blind the participant and not the observer (known as single-blind trials), but it is important that the maximum level of blindness possible is used. Random allocation to intervention is a further desirable feature of clinical trials. It is necessary that groups of participants receiving different interventions are as similar as possible so that any effects at the end of the trial are attributable only to differences in the intervention. RANDOMISATION optimises the chance that the groups will be as similar as possible. Unfortunately, many 'trials' are not planned, but instead are based on existing routinely collected data. Allocation to intervention in these instances is never truly random and is often particularly defective. For example, two surgeons in a hospital performed many operations to reduce snoring, each surgeon using a different technique. Analysis of several years of routine outcome data attempted to compare the two techniques. Here the surgeon was a confounder and it is not possible to deduce whether differences in outcome were due to the effects of the surgical technique or of the surgeons who operated. The double-blind randomised controlled trial (see CLINICAL TRIALS) is considered the gold standard of medical research. If a research report states that blinding and random allocation have been used then it is important that the procedures employed in implementing each are described; it is insufficient to assume that authors understand the meaning of these terms. Errors in research design would be reduced if statisticians were consulted more often in the early stages of the research process so that statistical issues could be considered throughout (see CONSULTING A STATISTICIAN). Unfortunately, errors in the design of experiments are nearly always impossible to correct and therefore the research may be fatally flawed.

Errors in statistical analysis are also widespread. Many statistical techniques make assumptions about the data to which they are applied, but a mistake often observed in research papers is that these assumptions have not been met. A common assumption is that of data conforming to a NORMAL DISTRIBUTION. It is, unfortunately, not always possible to tell whether a variable is normally distributed when the raw data

cannot be inspected. However. summary slalislics may be provided and for measuremenlS that cannot be negative. which is often the case in medical n:sc:arch. it can be infem:d that the: data have a skewed distribulion ir the standard deviation is more than half the MEAN. although the converse is nul necessarily tnJc. When daIa do not ClOIIfonn 10 the assumption of nannality they should eilhc:r be transformed (sec 11lANSRlStAnON) or NONPARAME11UC METHODS used instead. It may be clear from graphs or ranges that oulliers arc pn:scnl in data. These can haYC a ClOIIsiderable effect on Slatislical analyses. Generally. however. values should not be allCI'ed or delded if there is no evidence or a mistake. Instead if OUTLIERS are pn:scnl a n:scan;h paper should indicate that Sleps were taken to inveSligate their elTects. Again. transformations. ar nonpanunebic methods may be appropriale. A common assumption of Slatistical tcsls is thai all the observations are independenL However, multiple obsenalions on one subject arc not indcpcndentand should therefore noI be analysed as such. For example. the results of hearing tests in the right and left ears of a group of study participants should not all be entered into an analysis whcm observations are assumed to be independenL InsIead. the a\'UBge of the right and left measurcmc:nts could be laken. or the results from just abe left or right ear might be chosen. It is also erroneous to anaIysc paired data ignoring the pairing. Pain:d data can arise when a one-to-one matched design has been uscd or when two measurements are made on the same subject. e.g. beftR and aflc:1' In:a1ment. METHOD COMPARISON sruolES are common in medical n:search and CORRELATION is often misused 10 assess agreement between the two. Com:lalion measures linear association. rather than a,greement. so if one method always gives a value ofexactly lwice the other method a perfect correJalion would be found although agn:cment is clearly lacking. Instead. 
Agreement between two continuous variables should instead be assessed using the technique described by Bland and Altman (1986).

A major problem encountered in the analysis of medical research is that of multiple testing (see MULTIPLE COMPARISON PROCEDURES). Choosing the conventional significance level of 0.05 means that if 20 statistical tests were performed we would expect one to be significant purely by chance. Therefore the conclusions of a paper reporting one significant result among 20 tests performed should be tempered by this fact. It may be that an adjustment for multiple comparisons, such as a BONFERRONI CORRECTION, is appropriate. A distinction should always be made between prior hypotheses and those resulting from exploration of the data, so that the same data are not used for testing a hypothesis as for generating it (see POST HOC ANALYSIS).
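The arithmetic behind the multiple testing problem can be sketched as follows. The function names are illustrative only, and the probability of at least one false positive assumes, for simplicity, that the tests are independent and that every null hypothesis is true.

```python
def expected_false_positives(alpha, n_tests):
    """Expected number of significant results when every null hypothesis is true."""
    return alpha * n_tests


def prob_at_least_one_false_positive(alpha, n_tests):
    """Chance of one or more false positives, assuming independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests


alpha, n_tests = 0.05, 20
print(expected_false_positives(alpha, n_tests))
print(round(prob_at_least_one_false_positive(alpha, n_tests), 3))
print(bonferroni_threshold(alpha, n_tests))
```

With 20 tests at the 0.05 level the expected number of chance findings is one, and under independence the chance of at least one is well over a half, which is the motivation for testing each hypothesis at the stricter Bonferroni level.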

Other errors in analysis abound, including the use of correlation to relate change to initial value, failure to take account of ordered categories or the evaluation of a diagnostic test only by means of SENSITIVITY and SPECIFICITY when the POSITIVE and NEGATIVE PREDICTIVE VALUES would be more informative. Whatever method of analysis is used, an important requirement of the research report is that all techniques employed are clearly specified for each analysis. Unusual or obscure methods should be referenced and methods that exist in more than one form, such as Pearson's or Spearman's correlation coefficient (see CORRELATION), must be identified unambiguously.

Errors can also be made in presentation, although these may have more trivial implications for the conclusions of a paper than the errors described earlier. Nevertheless, good presentation is important to ensure that the reader is not misled or confused. It is worth noting that a poor-quality paper may not necessarily describe poor-quality research, but if insufficient detail is provided in a report it is unsatisfactory to assume that the research has been performed acceptably.

P-values are often used in the medical literature to indicate statistical significance. However, it is preferable that CONFIDENCE INTERVALS are used in the presentation of results to give an immediate idea of the clinical significance of an effect. If a 95% confidence interval does not include zero (or, more generally, the value specified in the NULL HYPOTHESIS) then the P-value will be less than 0.05. Thus there is a close relation between confidence intervals and P-values (see TESTS AS CONFIDENCE INTERVALS), but confidence intervals additionally demonstrate the magnitude of the effect of interest.

MEASURES OF SPREAD should be quoted alongside MEASURES OF LOCATION to indicate variability around the average measurement. However, the ± notation is discouraged because its use to denote the STANDARD DEVIATION, the standard error and the half-width of a confidence interval has led to some ambiguity. Thus, rather than describing mothers in a study of pregnancy by saying 'the mean age of mothers was 28 ± 4.6 years', the data are better summarised as 'the mean age of mothers was 28 years (SD 4.6)'.

Since the vast majority of statistical analysis is now performed using computers, research papers should present exact P-values, which are far more informative, such as P = 0.014, rather than ranges, such as P < 0.05. The notation 'NS' for nonsignificant is even less revealing. However, there is no need to be specific below 0.0001. Authors must justify the appropriateness of a one-sided P-value quoted in a research paper. A one-sided P-value should only be used in the very rare situation where an observed difference could only have occurred in one direction. The decision to use a one-sided P-value should be made prior to the data analysis and hence not be dependent on the results.

Spurious precision is another common error in the medical literature that impairs the readability and credibility of a paper. When presenting results the precision of the original data must be borne in mind. Altman (1991b) suggests that means should not be quoted to more than one decimal place more than the original data and standard deviations or standard errors to no more than two. Likewise, percentages need not be given to more than one decimal place and P-values need not have more than two significant figures.

Errors also arise in graphical presentation (see GRAPHICAL DECEPTION). Graphs that do not include a true zero on the vertical axis or that change scale in the middle of an axis can be misleading, as can the unnecessary use of three-dimensional effects. Other errors include the plotting of means without any indication of variability and the failure to show coincident points on a SCATTERPLOT.

Misinterpretation is common when P-values are presented. It must be remembered that the conventional cutoff of 0.05 is purely arbitrary. A frequent mistake is to interpret a value of, say, 0.045 as significant, but a value of 0.055 as not significant, when in reality there is very little difference between the two. There is also a prevailing belief that significant P-values are indicative of more successful research than nonsignificant P-values. This attitude is reflected in studies being described as 'positive' or 'negative', depending on the significance of the findings. Results should not be evaluated solely on the statistical significance of the findings, but also on their clinical significance (see CLINICAL VERSUS STATISTICAL SIGNIFICANCE). The use of confidence intervals is a helpful antidote to this problem.

A further serious error of interpretation is to interpret ASSOCIATION as causation. The only type of study where CAUSALITY can be inferred is a well-conducted randomised controlled trial. Otherwise, great care should be taken in the interpretation of results; in particular, the likely effect of confounders must be considered. A final area where conclusions are often not treated with sufficient caution is that of inference from a sample to a population.
Although a sample should theoretically be random, in practice this may not be realistic. Therefore a research paper should attempt to report any likely biases in the selection process and implications this may have for the findings reported.

When critically appraising a research paper it is helpful to have a checklist of issues to consider. A checklist is particularly useful because it is easier to spot errors than omissions and, as already noted, it is inappropriate to infer that a correct procedure was employed when the relevant information is not included. The British Medical Journal provides two checklists for use by its statistical reviewers that can be used when critically appraising a paper; these are published in Gardner, Machin and Campbell (1986) or can be found on the British Medical Journal website. One checklist is intended specifically for clinical trials and so includes questions relevant only to this study design; the other is for use with all other study types.


In conclusion, critical appraisal is an essential skill for readers of the medical literature; it is important that readers have the confidence to question conclusions stated by the authors and the statistical knowledge to assess the methods used. The consequences of the range of errors described in this section can vary from reducing the readability of a paper to reversing the direction of the results. An important part of the critical appraisal process, therefore, is to make a judgement about the implications of any issues raised. A study should not be discarded because a single flaw is found but, instead, a subjective assessment of the impact on the findings must be made. SRC

Altman, D. G. 1980: Statistics and ethics in medical research. British Medical Journal 281, 1182-4.
Altman, D. G. 1991a: Statistics in medical journals: developments in the 1980s. Statistics in Medicine 10, 1897-913.
Altman, D. G. 1991b: Practical statistics for medical research. London: Chapman & Hall.
Altman, D. G. 1994: The scandal of poor medical research. British Medical Journal 308, 283-4.
Andersen, B. 1990: Methodological errors in medical research. Oxford: Blackwell.
Bland, J. M. and Altman, D. G. 1986: Statistical methods for assessing agreement between two methods of clinical measurement. Lancet i, 307-10.
Gardner, M. J. and Bond, J. 1990: An exploratory study of statistical assessment of papers published in the British Medical Journal. Journal of the American Medical Association 263, 1355-7.
Gardner, M. J., Machin, D. and Campbell, M. J. 1986: Use of check lists in assessing the statistical content of medical studies. British Medical Journal 292, 810-12.
Goodman, S. N., Altman, D. G. and George, S. L. 1998: Statistical reviewing policies of medical journals: caveat lector? Journal of General Internal Medicine 13, 753-6.
Schor, S. and Karten, I. 1966: Statistical evaluation of medical journal manuscripts. Journal of the American Medical Association 195, 1123-8.

crossover trials

These are trials in which patients are allocated to sequences of treatments with the object of studying differences between individual treatments or subsequences of treatments (Senn, 2002). That is to say, each patient is treated more than once and the responses under different treatments for the same patient can then be compared. This is best explained by considering some examples. Suppose we are interested in general in comparing treatments A, B, C, etc., and that patients will be allocated to sequences of treatment of the form ABC, CBA, etc., where, for example, ABC means that the patient will receive A in a first period, B in a second and C in a third.

When only two treatments are being compared, a very popular type of crossover design is one in which patients are allocated at random and usually in equal numbers to one of two sequences AB or BA. Such a trial was run by Graff-Lonnevig and Browaldh (1990) comparing the effects of single doses of inhaled formoterol (12 µg) and salbutamol (200 µg) in 14 moderately or severely asthmatic children. If we give the label A to formoterol and B to salbutamol, then children were allocated at random to one of the two sequences AB or BA.

Where three treatments are being compared, patients may be allocated in equal numbers to one of three sequences forming a Latin square, either ABC, BCA and CAB or ACB, BAC and CBA, or it may be that both Latin squares would be employed, so that patients would be allocated in equal numbers to each of the six possible sequences involving each of the three treatments. For example, Dahlof and Bjorkman (1993) compared two doses of the potassium salt of diclofenac (50 mg or 100 mg) to placebo in the treatment of migraine in 72 patients. If A is placebo, B is the lower dose of diclofenac and C the higher one, then their design involved allocating patients to one of the six sequences ABC, ACB, etc.

More complex designs than this are possible. For example, it may sometimes be the case that the number of treatments that one wishes to study is greater than the number of periods in which it is considered realistic to treat patients. So-called incomplete block designs, in which patients receive suitably chosen subsets of the treatments to be investigated, are popular. At the other extreme, it may be that it is possible to treat patients in more periods than there are treatments, leading to so-called replicate designs. As we shall discuss, these are extremely useful for the purpose of studying an individual response.

Because crossover trials permit comparisons on a within-patient basis, they are efficient compared to parallel group trials and considerable savings in patient numbers are possible. However, crossover trials are clearly unsuitable for any condition in which death or cure is the outcome and their appropriate use is restricted to chronic diseases and treatments whose effects are reversible. Suitable conditions include asthma, rheumatism and migraine.
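The three-treatment sequence sets mentioned above can be generated mechanically. The following is a minimal sketch: the function names, the seed and the balanced allocation of 72 hypothetical patients (12 per sequence) are assumptions made for illustration and are not a description of how any of the cited trials was actually randomised.

```python
import random
from collections import Counter
from itertools import permutations


def cyclic_latin_square(first_row):
    """Return the sequences obtained by cyclically shifting first_row."""
    n = len(first_row)
    return ["".join(first_row[(shift + k) % n] for k in range(n)) for shift in range(n)]


def is_latin_square(sequences):
    """Check each treatment occurs once per sequence (row) and once per period (column)."""
    treatments = set(sequences[0])
    rows_ok = all(len(seq) == len(treatments) and set(seq) == treatments for seq in sequences)
    cols_ok = all(len(col) == len(treatments) and set(col) == treatments for col in zip(*sequences))
    return rows_ok and cols_ok


def balanced_allocation(n_patients, sequences, seed=0):
    """Randomly allocate patients to sequences in equal numbers."""
    if n_patients % len(sequences) != 0:
        raise ValueError("n_patients must be a multiple of the number of sequences")
    allocation = list(sequences) * (n_patients // len(sequences))
    random.Random(seed).shuffle(allocation)
    return allocation


square_1 = cyclic_latin_square("ABC")   # ABC, BCA, CAB
square_2 = cyclic_latin_square("ACB")   # ACB, CBA, BAC
all_six = sorted("".join(p) for p in permutations("ABC"))

allocation = balanced_allocation(72, all_six)
counts = Counter(allocation)
print(square_1, square_2, dict(counts))
```

Using both squares together gives every possible ordering of the three treatments, so each treatment appears equally often in each period and equally often after each other treatment.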
It is not just the condition, however, but also the treatment and the ENDPOINT that determine the suitability of crossover trials. For example, they can be used to study blood pressure itself in short-term trials in hypertension but not the long-term sequelae of hypertension, such as, for example, stroke or kidney or eye damage. In asthma they are more suitable for studying the effects of beta-agonists, which are relatively short term and reversible, than those of steroids, which have longer term effects.

In such conditions where crossover trials may be employed, it is nearly always the case that the sample size required to prove efficacy, even if a parallel group trial is used, is considerably less than that required to demonstrate safety of the drug. Hence, in Phase III, where safety considerations are extremely important, there is no point in reducing the sample size by employing crossover trials anyway (see PHASE III TRIALS, PHARMACOVIGILANCE). Consequently, some discussions of the comparative merits of crossover trials and parallel group trials that appear in the scientific literature are rather misleading. In practice, crossover trials are never an alternative for the major parallel group trials carried out in Phase III. They can, however, be extremely useful in Phases I and II for pharmacokinetic and pharmacodynamic modelling, for dose finding for tolerability in healthy volunteers and for efficacy using pharmacodynamic outcomes in patients (see PHASE I TRIALS, PHASE II TRIALS). They can also be useful elsewhere for answering certain specialist questions, such as, for example, demonstrating equivalence (see EQUIVALENCE STUDIES: DESIGN) of generic and brand name products using so-called bioequivalence studies (Senn and Ezzet, 1999).

Unlike the parallel group trial, the basic unit of replication in a crossover design is not the patient but an episode of treatment. Since a general necessary assumption in standard analyses of experiments is that there is no interference between units, this is clearly potentially more problematic for crossover trials than for parallel group trials. It is inherently more plausible that the treatment given to a patient in an earlier period may affect the response for the same patient when being given a further treatment in a subsequent period, a phenomenon known as carry-over, than that the treatment given to one patient may affect another. (There are some cases where even parallel group trials may suffer from interference between units, in particular if infectious diseases are involved or if group therapy takes place. This may lead to cluster randomisation being necessary, but this is plausibly an infrequent problem.) In fact, carry-over is regarded as being the central (potential) difficulty of crossover trials and much of the considerable literature devoted to the design and analysis of these trials is concerned with matters to do with controlling for carry-over.

The phenomenon of carry-over means that it is prudent, indeed necessary, to employ a so-called washout period.
This is a period between the measurement of the effect of one treatment and the next in which the effect of the previous treatment is allowed to dissipate. Washout treatments can be passive, if washout is allowed to occur without any treatment being given (Senn, 2002). This may seem the natural approach from the experimental point of view. It has, however, the disadvantage that the patient is expected to tolerate a period in which no therapy is offered. An alternative strategy is that of employing an active washout period (Senn, 2002). This might involve a near-immediate switch of the patient's therapy but a delay of measurement until a suitable period has taken place during which the effects of the previous treatment have disappeared. (For further details, see Senn, 2002.)

It seems plausible that crossover trials will be more vulnerable to dropouts than parallel group trials because of the greater demands on patient time that the former make, because dropouts in one period will also lead to loss of data from subsequent periods and because incomplete data will unbalance designs and lead to disproportionate losses in information.

It should be noted that in nearly all crossover trials, subjects are not recruited simultaneously. The exception is some designs in which healthy volunteers are used. For designs involving patients they must, of course, be treated when they present. Consequently, 'period' has a relative meaning in the context of crossover trials. For example, in an AB/BA crossover some patients will usually have completed both periods of treatment before others have even started in the trial.

A popular linear model for responses for a crossover trial with I patients in J periods with T treatments giving rise to L forms of carry-over may be expressed as follows (Jones and Kenward, 1989). We let the response in period j, j = 1, ..., J on patient i, i = 1, ..., I be $Y_{ij}$, the treatment given to that patient in that period be t(i, j) = 1, ..., T and the form of carry-over be l(i, j). Then we write:

$$Y_{ij} = \mu + \phi_i + \pi_j + \tau_{t(i,j)} + \lambda_{l(i,j)} + \varepsilon_{ij}$$

Here $\mu$ is a grand mean, $\phi_i$ is an effect due to patient i, $\pi_j$ is an effect due to period j, $\tau_{t(i,j)}$ is the effect of treatment t(i, j), $\lambda_{l(i,j)}$ is the carry-over effect of type l(i, j) and the $\varepsilon_{ij}$ are within-patient error terms usually assumed identically and independently distributed with variance $\sigma^2$ (say). The following points may be noted in connection with this model.

The model is severely overparameterised. However, interest centres on contrasts between the various $\tau$ terms and, given various restrictive assumptions about the carry-over terms, these will usually be estimable.

Since there can be no carry-over in period 1, for each patient $\lambda_{l(i,1)} = 0$ for all i.

In practice, to make progress in estimation, further restrictive assumptions are introduced about the carry-over terms. There are two very popular choices. The first is to assume that any washout strategy has been successful and that all carry-over terms are zero. The second is to assume that 'simple' carry-over applies and that carry-over depends only on the treatment given in the previous period, so that we may write $l(i,j) = t(i,j-1)$, $j \ge 2$. This last assumption may seem more reasonable than the first but in practice there are very few imaginable circumstances under which the second assumption would apply if the first did not, as it seems plausible that if carry-over occurred the effect of the engendering treatment would be modified by the perturbed treatment (Senn, 2002).

In practice, although designs can easily be found in which patient, period and treatment effects are orthogonal to each other, if carry-over effects are included, the design matrix will usually be nonorthogonal and there will be a loss in efficiency.

For certain designs, for most purposes it makes no difference whether the patient effects $\phi_i$ are taken as FIXED EFFECTS or RANDOM EFFECTS. However, for incomplete block designs in which T > J, interpatient information will usually be recoverable by taking the patient effects as random, and this will also usually be the case if carry-over effects are included in the model (Senn, 2002).

The following are a number of controversies and issues that are relevant to crossover designs. The most notorious controversy concerning carry-over has been in connection with the AB/BA design. For many years a popular approach to dealing with carry-over was the so-called two-stage procedure originally proposed by Grizzle (1965). He noted that in the presence of carry-over the treatment effect was not estimable and hence proposed that a preliminary test of carry-over be made. If carry-over were detected, the second period data should be ignored and a between-patient test using first-period data only should be employed. However, a subsequent paper by Freeman (1989) showed that this strategy was extremely biased as a whole and did not maintain nominal Type I error rates. It is possible to adjust the two-stage procedure so that it maintains the overall Type I error rate (Senn, 1997; Wang and Hung, 1997) but it has less power than the strategy of simply ignoring carry-over and is not recommended (Senn, 1997).

Various extremely complicated designs and analysis strategies have been proposed for dealing with carry-over. They all make restrictive and unrealistic assumptions about the nature of carry-over, however, and they nearly all involve a penalty in terms of increased variances of estimators of the treatment effect (Senn, 2002). This would seem to leave washout as the only reasonable strategy for dealing with carry-over. However, this approach is bound to leave some investigators unhappy because of the reliance it makes on judgement based on biology and pharmacology rather than strategies of design or analysis based on purely statistical principles.

More general error structures than those considered here are possible.
In particular, one could allow for a true random effect, that is to say the possibility that different patients react differently to treatment. From one point of view we would then have the patient effect as a random intercept parameter for each patient but then add a slope parameter for a given treatment for a given patient. For a given treatment these would then be assumed to be randomly distributed with unknown variance to be estimated. It is also possible, although this appears to be rarely attempted, to allow for an autocorrelation of the within-patient errors.

Baseline values at the beginning of each treatment period are sometimes collected in crossover trials but extreme care is required in their use. For many designs it is quite plausible that the outcome values may be unaffected by carry-over but that the baseline values would be. In that case, incorporating the baseline values in the analysis might introduce a BIAS due to carry-over that would not otherwise be present.
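To make the within-patient logic of the AB/BA design concrete, the sketch below simulates responses as a grand mean plus patient, period and treatment effects plus error, with carry-over assumed to be zero (the first of the two popular assumptions discussed in this entry). All parameter values, the grand mean of 10 and the function names are hypothetical choices for illustration only. The treatment effect is estimated as half the difference between the mean period differences in the two sequence groups, which removes both the patient effects and the period effect.

```python
import random
from statistics import mean


def simulate_ab_ba(n_per_sequence, tau_a, tau_b, period_effect, patient_sd, error_sd, seed=0):
    """Simulate an AB/BA crossover with no carry-over; returns (sequence, y1, y2) per patient."""
    rng = random.Random(seed)
    grand_mean = 10.0  # arbitrary illustrative value
    data = []
    for sequence in ("AB", "BA"):
        for _ in range(n_per_sequence):
            patient_effect = rng.gauss(0.0, patient_sd)
            responses = []
            for period, treatment in enumerate(sequence):
                tau = tau_a if treatment == "A" else tau_b
                pi = period_effect if period == 1 else 0.0  # effect of the second period
                responses.append(grand_mean + patient_effect + pi + tau + rng.gauss(0.0, error_sd))
            data.append((sequence, responses[0], responses[1]))
    return data


def estimate_treatment_effect(data):
    """Estimate tau_B - tau_A from period differences (second minus first) in each sequence."""
    d_ab = [y2 - y1 for seq, y1, y2 in data if seq == "AB"]  # contains period + (B - A)
    d_ba = [y2 - y1 for seq, y1, y2 in data if seq == "BA"]  # contains period - (B - A)
    return (mean(d_ab) - mean(d_ba)) / 2


trial = simulate_ab_ba(n_per_sequence=7, tau_a=0.0, tau_b=2.0,
                       period_effect=1.5, patient_sd=5.0, error_sd=1.0)
print(round(estimate_treatment_effect(trial), 2))
```

Because each patient acts as his or her own control, the large between-patient variation (here a standard deviation of 5) does not enter the estimate at all; only the within-patient error does. If unequal carry-over were present, the same estimator would be biased, which is why washout matters.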


There have been various attempts to produce Bayesian analyses of crossover trials (Grieve, 1985). In theory this is attractive in that it permits compromise positions to be adopted between that of assuming that carry-over is absent to allowing that it may have any value at all. In practice, it is difficult to capture in the model the dependence that must inevitably exist between belief in the magnitude of the treatment effect and that of the carry-over effect.

Ironically, despite the considerable potential of crossover trials to measure individual response to treatment, especially if replicate designs are employed whereby the number of periods exceeds the number of treatments so that J > T, and hence make a genuine contribution to the currently fashionable field of pharmacogenomics, this possibility has received most attention where it is least important, namely in investigating individual response to different formulations in the context of bioequivalence.

Despite some limitations of application and some difficulties in their use it would be wrong to conclude, however, that crossover trials have no place in drug development. They can be extremely efficient compared to parallel group trials and are far superior for the purpose of investigating true random effects. They are extremely valuable on occasion, in particular in pharmacokinetic studies and in dose finding in Phase II. SS

Dahlof, C. and Bjorkman, R. 1993: Diclofenac-K (50 and 100 mg) and placebo in the acute treatment of migraine. Cephalalgia 13, 2, 117-23.
Freeman, P. 1989: The performance of the two-stage analysis of two-treatment, two-period cross-over trials. Statistics in Medicine 8, 1421-32.
Graff-Lonnevig, V. and Browaldh, L. 1990: Twelve hours bronchodilating effect of inhaled formoterol in children with asthma: a double-blind cross-over study versus salbutamol. Clinical and Experimental Allergy 20, 429-32.
Grieve, A. P. 1985: A Bayesian analysis of the two-period crossover design for clinical trials. Biometrics 41, 4, 979-90.
Grizzle, J. E. 1965: The two-period change over design and its use in clinical trials. Biometrics 21, 467-80.
Jones, B. and Kenward, M. G. 1989: Design and analysis of cross-over trials. London: Chapman & Hall.
Senn, S. J. 1997: The case for cross-over trials in Phase III (letter; comment). Statistics in Medicine 16, 17, 2021-2.
Senn, S. J. 2002: Cross-over trials in clinical research, 2nd edition. Chichester: John Wiley & Sons, Ltd.
Senn, S. J. and Ezzet, F. 1999: Clinical cross-over trials in Phase I. Statistical Methods in Medical Research 8, 3, 263-71.
Wang, S. J. and Hung, H. M. 1997: Use of two-stage test statistic in the two-period crossover trials. Biometrics 53, 3, 1081-91.

cross-sectional studies The objective of a cross-sectional study is to determine the distribution of a variable or the joint distribution of more than one variable in a population. This may be accomplished by obtaining a representative sample of the population of interest through the use of a SIMPLE RANDOM SAMPLE, a stratified random sample or a complex survey design. Such a study is characterised by the fact that subjects are only observed at a single point in time even though the phenomena associated with the variables of interest may have evolved through a dynamic process that develops over time (Kleinbaum, Kupper and Morgenstern, 1982; Rothman and Greenland, 1998). However, because the study subjects are only observed at a single point in time, essential features in the temporal patterns will be missing, which renders it impossible to conduct a thorough longitudinal analysis of the phenomenon of interest. This approach is also sometimes used when conducting an epidemiologic study in which subjects are recruited without regard to their exposure or disease status, so that the information on each corresponds to the status of subjects at the time of the interview only.

In order to appreciate the inherent limitation that exists when subjects are only observed at a single point in time, consider the hypothetical example illustrated by the SCATTERPLOT in the figure, part (a). Subjects are observed at different ages and the scatterplot suggests that the outcome tends to decrease as age increases. In contrast, a longitudinal study would observe subjects at multiple time points, thus enabling the investigator to track the development of the outcome over time. It is apparent that the tracking of individual subjects in our hypothetical example may have arisen either from an increase in the response as the subject ages (figure (b)) or a decrease (figure (c)). By only observing subjects at a single point in time, it is impossible to distinguish between the trends associated with enrolment into the study and those that evolve as each individual subject ages (Diggle, Liang and Zeger, 1994). This limitation is not resolved by conducting repeated cross-sectional studies carried out at different points in time, unless the studies are designed so as to obtain repeated assessments of the same individuals.

In epidemiology, a cross-sectional study contrasts with a COHORT STUDY and a CASE-CONTROL STUDY.
In a cohort study, subjects are selected on the basis of their exposure status and then followed until the disease develops so that an investigator can directly assess the association with disease development. In a typical case-control study, incident cases are recruited near the time at which the disease is diagnosed and exposure is assessed by recall. In either case, an investigator would be studying incident cases while a cross-sectional study would be studying prevalent cases, i.e. cases that may have occurred at some time in the past. This would confound effects on disease incidence with effects on prognosis or survival. A cross-sectional study is especially prone to LENGTH-BIASED SAMPLING because the prevalent cases with a long period of illness before death would be more likely to enter the study than would a subject who died shortly after diagnosis. Therefore, the design is primarily used in the study of diseases with relatively short-term effects. One example of such a study would be an attempt to discover causes of a food poisoning outbreak in a school, by identifying all students and assessing the specific food they had consumed and whether they had become ill.

[Figure: cross-sectional studies. Result from a hypothetical cross-sectional study (a) and corresponding longitudinal studies with increasing (b) and decreasing (c) time trends. Each panel plots y against x.]

Another concern in a cross-sectional study of disease is whether there is a systematic BIAS or inaccuracy in the reporting of exposure by disease status. In some cases this can be avoided by using exposure measures that are not affected by disease, e.g. the determination of a particular genotype through the use of genomic analyses. However, if this cannot be avoided, then this potential source of bias may limit the strength of the association, as well as the strength of the evidence that the exposure of interest affects the aetiology of disease. TRH

Diggle, P. J., Liang, K.-Y. and Zeger, S. L. 1994: Analysis of longitudinal data. Oxford: Clarendon Press.
Kleinbaum, D. G., Kupper, L. L. and Morgenstern, H. 1982: Epidemiologic research: principles and quantitative methods. Belmont: Lifetime Learning Publications.
Rothman, K. J. and Greenland, S. 1998: Modern epidemiology. Philadelphia: Lippincott-Raven.

cross-validation See DISCRIMINANT FUNCTION ANALYSIS

crude birth/death rate See DEMOGRAPHY

cubic spline See SCATTERPLOT SMOOTHERS

cure models A cure model can be used in survival analysis when there are 'immunes' or 'long-term survivors' present in the data (Maller and Zhou, 1996). In such a setting, immune or cured subjects are censored since cure can never be observed, while susceptible subjects would eventually develop the event if followed for long enough. A typical example where a cure model might be appropriate would exhibit a Kaplan-Meier estimate of the marginal time-to-event distribution that levelled off at long follow-up times to a nonzero value (see KAPLAN-MEIER ESTIMATOR). An example is in studies of cancer for which a significant proportion of patients may be cured by the treatment.

A mixture model formulation is one approach to analysing such data (see FINITE MIXTURE DISTRIBUTIONS). Assume that a fraction p of the population are susceptibles and the remaining fraction are not; then the survival function S(t) for the population is given by:

$$S(t) = p\,S_1(t) + (1 - p)$$

where $S_1(t)$ is the survival function for the susceptible group and where covariates can affect both p and $S_1(t)$. Let $(t_i, \delta_i, Z_i)$ be the observations, where $Z_i$ is a vector of covariates, $t_i$ the observed or censored time and $\delta_i$ the censoring indicator. Let $D_i$ indicate cure status for each subject, denoted by $D_i = 1$ for a susceptible subject and $D_i = 2$ for cured. Thus each censored subject has either $D_i = 1$ and the event has not yet occurred or has $D_i = 2$. The incidence model is typically given by a logistic form:

$$P(D_i = 1 \mid Z_i) = \frac{\exp(b'Z_i)}{1 + \exp(b'Z_i)}$$

Among susceptible individuals, the time to event has a distribution, such as a Weibull (Farewell, 1982):

$$S_1(t_i \mid Z_i) = \exp\left[-\exp(\gamma' Z_i)\, t_i^{\tau}\right]$$

An attractive feature of this model is the two separate components. The parameters $b$ measure the effect of covariates on whether the event will occur and the parameters $\gamma$ measure the effect of the covariates on when the event will occur given that the subject is susceptible. These two components are sometimes called incidence and latency and can have nice interpretations in given applications. Different formulations can be used. Li and Taylor (2002) and Yamaguchi (1992) considered parametric and semi-parametric accelerated failure time models for the latency model. Kuk and Chen (1992), Peng and Dear (2000) and Sy and Taylor (2000) considered a semi-parametric proportional hazards model for the latency model.

One problem associated with the cure model is near nonidentifiability (Farewell, 1986; Li, Taylor and Sy, 2001). This arises due to the lack of information at the end of the follow-up period, resulting in difficulties in distinguishing models with a high incidence of susceptibles and long tails of $S_1(t)$ from models with a low incidence of susceptibles and short tails of $S_1(t)$. The incorporation of longitudinal data into the cure model is one way to reduce the problem (Law, Taylor and Sandler, 2002).

While the parameters in $p$ and $S_1(t)$ have nice interpretations, in some applications the marginal survival distribution $S(t)$ and its dependence on $Z$ may be of most interest. This distribution is easily obtained from the estimates of $p$ and $S_1(t)$. Predicting the cure status of a censored subject may also be of interest. The formula to estimate the probability that a censored subject is in the susceptible group is given by:

$$P(D_i = 1 \mid T_i > t_i) = \frac{p\,S_1(t_i)}{p\,S_1(t_i) + 1 - p}$$
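As an illustration only, the quantities above can be sketched in Python. The sketch assumes a logistic incidence model and the Weibull latency model given earlier; the function names, the argument name `shape` (standing for the Weibull exponent) and the numerical values are hypothetical and are not taken from any of the cited papers.

```python
import math


def incidence(b, z):
    """Logistic incidence p(Z): probability that a subject is susceptible."""
    eta = sum(bj * zj for bj, zj in zip(b, z))
    return 1.0 / (1.0 + math.exp(-eta))


def latency_survival(t, gamma, z, shape):
    """Weibull latency S_1(t | Z) = exp(-exp(gamma'Z) * t**shape)."""
    eta = sum(gj * zj for gj, zj in zip(gamma, z))
    return math.exp(-math.exp(eta) * t ** shape)


def marginal_survival(p, s1):
    """Population survival S(t) = p * S_1(t) + 1 - p."""
    return p * s1 + 1.0 - p


def prob_susceptible_if_censored(p, s1):
    """P(D = 1 | T > t) = p * S_1(t) / (p * S_1(t) + 1 - p)."""
    return p * s1 / (p * s1 + 1.0 - p)


# Hypothetical subject: a single covariate equal to 1 and zero coefficients,
# censored at t = log(2) with a Weibull exponent of 1.
z = (1.0,)
p = incidence((0.0,), z)
s1 = latency_survival(math.log(2.0), (0.0,), z, shape=1.0)
print(marginal_survival(p, s1), prob_susceptible_if_censored(p, s1))
```

The longer a subject remains event-free, the smaller $S_1(t)$ becomes, so the estimated probability of being susceptible falls towards zero and the subject looks increasingly likely to be cured.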

The mixture cure model $S(t)$ does not in general have a proportional hazards structure. In order to keep this, however, nonmixture cure models have been proposed (Tsodikov, 1998; Chen, Ibrahim and Sinha, 1999). In these models, a bounded cumulative hazard is assumed: $\lim_{t \to \infty} \Lambda(t) = \theta$. One way to enforce this property is to write $\Lambda(t) = \theta F(t)$, where $F(t)$ is the distribution function of a nonnegative random variable. Then the survival distribution $S(t)$ for the population can be written as $S(t) = e^{-\theta F(t)}$, which has the cure rate $e^{-\theta}$. Covariates can be incorporated into the nonmixture cure model by assuming $\theta(Z_i) = \exp(\beta'Z_i)$.

Cure models are worthy of consideration for analysing data for which there is a strong scientific rationale for the existence of a cured group and empirical evidence of a nonzero limiting survival fraction, together with a substantial number of censored observations with long follow-up times.
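A minimal Python sketch of the nonmixture formulation follows. It assumes, purely for illustration, that $F(t)$ is a Weibull distribution function; that choice, the function names and the numerical values are hypothetical and are not prescribed by the cited papers.

```python
import math


def weibull_cdf(t, scale=1.0, shape=1.0):
    """An assumed choice of F(t): the Weibull distribution function."""
    return 1.0 - math.exp(-((t / scale) ** shape))


def theta(beta, z):
    """Covariate-dependent bound on the cumulative hazard, exp(beta'Z)."""
    return math.exp(sum(bj * zj for bj, zj in zip(beta, z)))


def population_survival(t, theta_value, cdf=weibull_cdf):
    """S(t) = exp(-theta * F(t))."""
    return math.exp(-theta_value * cdf(t))


def cure_rate(theta_value):
    """Limiting survival fraction exp(-theta), reached as F(t) tends to 1."""
    return math.exp(-theta_value)


# Two hypothetical covariate patterns sharing the same F(t).
theta_a = theta((math.log(2.0),), (1.0,))
theta_b = theta((math.log(2.0),), (0.0,))
for t in (0.5, 1.0, 5.0):
    # The ratio of cumulative hazards -log S(t) is the same at every t.
    ratio = math.log(population_survival(t, theta_a)) / math.log(
        population_survival(t, theta_b)
    )
    print(t, ratio, population_survival(t, theta_a), cure_rate(theta_a))
```

Because the covariates enter only through $\theta$, the ratio of cumulative hazards between two subjects does not depend on $t$, which is the proportional hazards structure that the mixture formulation lacks.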

An example where the cure model was applied arose from a study of 672 tonsil cancer patients treated with radiation therapy (Sy and Taylor, 2000). The radiation can eliminate all the cancer cells in the tonsil of some patients and thus the cancer will not reappear in the tonsil and the patient is regarded as being cured. If the radiation is not successful at eliminating all the cancer cells in the tonsil, those that remain will re-grow and become detectable, as a local recurrence, within about 3 years. This is a good situation where a cure model could be appropriate because there is a scientific rationale for a cured group and because a Kaplan-Meier estimate of time of local recurrence will exhibit a long plateau region if there is sufficient follow-up in the data. For these data, there were 206 events of local recurrence and most patients had more than 3 years' follow-up. The main interest was in understanding the effect of the total dose of radiation and the overall treatment time between the start and the end of radiation on local recurrence. Other covariates, such as stage of the tumour and age of the patient, were included in the analysis. A mixture cure model was used, with a logistic model for the incidence and a semi-parametric proportional hazards model for the latency.

The results suggested that stage, dose and treatment time were strongly associated with whether the tumour recurred, as indicated by the parameters in the logistic model. Age, however, was not associated with the incidence. The estimates of the relative hazards parameters in the latency part of the model suggested stage, dose and overall treatment time were not associated with when the recurrence would occur. The patient's age, however, was strongly associated with when recurrence would happen, given that the patient was not cured. The direction of the association was that younger patients would recur earlier. One possible interpretation of this is that young patients tend to have the same susceptibility to treatment as older patients but they tend to have faster growing cancers that will recur earlier if not cured. The initial size or stage of the tumour and how it is treated are important factors in determining whether a patient is cured, but are not important in determining how fast the tumour grows back after treatment if not cured. JMGT

Chen, M. H., Ibrahim, J. G. and Sinha, D. 1999: A new Bayesian model for survival data with a surviving fraction. Journal of the American Statistical Association 94, 909-19. Farewell, V. T. 1982: The use of mixture models for the analysis of survival data with long-term survivors. Biometrics 38, 1041-6. Farewell, V. T. 1986: Mixture models in survival analysis: are they worth the risk? The Canadian Journal of Statistics 14, 257-62. Kuk, A. Y. C. and Chen, C. H. 1992: A mixture model combining logistic regression with proportional hazards regression. Biometrika 79, 531-41. Law, N. J., Taylor, J. M. G. and Sandler, H. 2002: The joint modelling of a longitudinal disease progression marker and the failure time process in the presence of cure. Biostatistics 3, 547-63. Li, C. S. and Taylor, J. M. G. 2002: A semi-parametric accelerated failure time cure model. Statistics in Medicine 21, 3235-47. Li, C. S., Taylor, J. M. G. and Sy, J. P. 2001: Identifiability of cure models. Statistics and Probability Letters 54, 389-95. Maller, R. A. and Zhou, X. 1996: Survival analysis with long-term survivors. New York: John Wiley & Sons, Inc. Peng, Y. and Dear, K. B. G. 2000: A nonparametric mixture model for cure rate estimation. Biometrics 56, 237-43. Sy, J. P. and Taylor, J. M. G. 2000: Estimation in a Cox proportional hazards cure model. Biometrics 56, 227-36. Tsodikov, A. 1998: A proportional hazard model taking account of long-term survivors. Biometrics 54, 1508-15. Yamaguchi, K. 1992: Accelerated failure time regression models with a regression model of surviving fraction: an application to the analysis of 'permanent employment'. Journal of the American Statistical Association 87, 284-92.

curtailment sampling

See INTERIM ANALYSIS


D

data and safety monitoring boards These are committees of experts set up to monitor the safety of participants and the validity and integrity of data in CLINICAL TRIALS. Some form of data and safety monitoring is called for in any trial to ensure minimal acceptable risks to trial participants and continually to reassess the risks versus benefits of trial interventions during the conduct, to make sure that there is equipoise in continuing the trial. The International Conference on Harmonization defines good clinical practice (GCP) as 'an international ethical and scientific quality standard for designing, conducting, recording and reporting of trials that involve participation of human subjects'. Monitoring of trials for safety of participants, integrity of data leading to valid conclusions, adequate trial conduct and considerations for early termination to avoid unnecessary experimentation on human subjects is thus necessary to meet the stated GCP requirements.

The trial sponsor, the investigators and the institutional review board (IRB), also known as the ethics committee (EC), are at the frontline of safety monitoring for trial participants and they assume and share the responsibilities. However, the sponsor may elect to establish a data and safety monitoring board (DSMB), also known as an independent data monitoring committee (IDMC or DMC), and delegate part of its responsibilities to the DSMB. The establishment of a DSMB is recommended based on the recognition that monitoring of safety at regular intervals is essential to ensure safety of trial participants and that individuals directly involved in the conduct and management of a trial may not be suited for objective review of emerging interim data.

All clinical trials require monitoring of safety and efficacy data, but not all require monitoring by a DSMB independent of the sponsor and investigators. The degree and extent of such monitoring should depend on the potential risks associated with interventions,
the severity of disease and ENDPOINTS of the trial, and the method of monitoring on the size, scope and complexity. A DSMB is generally required for large, randomised controlled trials comparing mortality or major irreversible morbidity as a primary endpoint or pivotal trials for regulatory approval of marketing.

A DSMB is a body of experts who review accumulating data, both safety and efficacy, from an ongoing trial at regular intervals and advise the sponsor about the risks versus benefits and the scientific merit of continuing the trial. A typical DSMB is made up of people with pertinent expertise, including clinicians and scientists knowledgeable about the disease and interventions under investigation and a statistician knowledgeable about clinical trials methodology, including methods for INTERIM ANALYSIS, to interpret the emerging data appropriately. A DSMB may also include a patient advocate or an ethicist. A DSMB is a separate entity from an institutional review board and its members should not be involved with the trial they monitor and should have no conflict of interest, either scientific or financial.

A DSMB is primarily responsible for the appropriate oversight and monitoring of the conduct of trials for safety of participants and validity and integrity of the data. More specifically, the primary responsibilities of a DSMB include review of the study protocol and the plans for data and safety monitoring; evaluation of the progress of the study, including recruitment of trial participants, timeliness and completeness of follow-up, compliance with protocol procedures, performance of participating sites and other factors that may affect study outcome; assessments of risks versus benefits; making recommendations to the sponsor and the investigators concerning continuation, modifications to the protocol or termination of the trial; and communicating the findings from data and safety monitoring to the local IRBs.

A DSMB will allow difficult mid-study decisions about the trials. DSMB members are provided unblinded data on the important outcome measurements at regular intervals or at intervals specified in the protocol. These unblinded data should be kept confidential from the sponsor and the investigators. A DSMB is responsible for making recommendations to the sponsor as to whether the trial should continue as originally planned or with modifications to the design,
be temporarily suspended with respect to enrolment or trial interventions until some uncertainty is adequately addressed, or be terminated either because there is no longer equipoise among trial interventions or because it is highly unlikely that the trial can be successfully completed or meet its scientific goals.

The independence of the DSMB is intended to control the sharing of important comparative information and to protect the integrity of the trial from adverse impact resulting from premature knowledge about the emerging data. While small differences may be well accepted as nondefinitive, awareness of such differences may make investigators reluctant to enter patients on the trial, lead them to limit entry to a certain subset of patients or to encourage patients to withdraw if they are assigned what they perceive as an inferior intervention. Such tendencies will introduce biases and diminish the reliability of the trial's eventual results or even preclude the completion of the trial. Limiting the access to unblinded interim data to a DSMB relieves the sponsor of the burden of deciding whether it is ethical to continue to randomise patients and helps protect the trial from biases in patient entry or evaluation.

A DSMB should have standard operating procedures and maintain records of all its meetings and deliberations, including interim results, and these should be available for review when the trial is completed. The DSMB standard operating procedures should specify meeting quorum, schedule and procedures, its decision-making rules and meeting follow-up. A DSMB should be consulted about the contents of interim reports that serve as a basis for the DSMB deliberations. A practical perspective on DSMBs and the recommendations for the operation and management of DSMBs can be found in Ellenberg, Fleming and DeMets (2002), reflecting a recent guidance for clinical trial sponsors on the establishment and operation of clinical trial data monitoring committees by the Food and Drug Administration of the US Department of Health and Human Services (www.fda.gov/cber/gdlns/clindatmon.htm).

Interim analyses of comparative trials are necessary to ensure that large differences between interventions do not go unnoticed, as well as to detect excess toxicity or unanticipated flaws in study designs. Routine reporting of toxicities or information about intervention administration helps ensure that interventions are being given safely and properly and improves trial quality. Routine reporting of outcome results, however, can harm study quality. In general, a DSMB would examine not only the trial data but also relevant external evidence from other sources. Its recommendation to the sponsor should be based on the interpretation of the results of the ongoing trial in the context of existing outside scientific data relevant to such interpretation. A final decision,
as to whether or not to continue the trial, should not rely solely on a formal test of statistical significance. The DSMB meetings provide a setting in which the clinical significance of early differences or lack thereof can be discussed openly with interim data and the complex statistical issues involved in sequential monitoring of a trial can be discussed at length. Focused discussions of the progress towards the scientific goals of a study are facilitated by a DSMB with access to unblinded data. Since intervention effects will be examined by a small group, the danger will be reduced that a promising trial will be informally stopped early with reduced accrual because of overinterpretation of interim results by the sponsor or the investigators. In addition, sequential monitoring rules are at best guidelines for complex decisions involving many aspects of a trial.

Deliberations and conclusions of the monitoring should be communicated to the sponsor and the IRBs without compromising the integrity of the trial. Recommendations resulting from monitoring activities should be reviewed by the study team and adequately addressed. Local IRBs should be provided feedback on a regular basis, including findings from adverse events and recommendations derived from data and safety monitoring. KK

Ellenberg, S. S., Fleming, T. R. and DeMets, D. L. 2002: Data monitoring committees. Boca Raton: Chapman & Hall/CRC.

data entry This process puts observations into electronic format for computer analysis. No successful statistical investigation takes place without the reliable and accurate collection of data and its conversion into a suitable electronic form for computerised analysis. While ostensibly a simple clerical process, it often suffers neglect in planning and execution that can jeopardise the smooth running of a research project. The reliability of data collection is not specifically an issue for data entry, but we will see later how technological changes in data entry can encroach on the process.

Most important for the majority of investigators is the accurate entry of data. In a formal research project such as a clinical trial there will be established and inviolable clerical procedures for data collection and entry that will help to ensure accuracy. But in many academic studies the researchers themselves will take responsibility for the complete collection and entry process. With modern statistical packages and moderate information technology (IT) literacy on behalf of the user, this is a perfectly feasible and economic process for studies up to a few hundred variables and a few hundred cases. The spreadsheet data entry facilities of SPSS or Excel provide an easy way of entering data and, given that the researcher is entering the data, he or she can make checks during the procedure.

Two problems occur with this approach. One is simple clerical error or absent-mindedness in typing data; the other is a lack of an audit trail for changes in the spreadsheet. Dual entry is typically used to correct the first. Programs such as SPSS Data Entry or a program written in MS-Access or similar permit one user to enter data and then another to re-enter it from scratch. Any inconsistency is flagged and the appropriate variable checked. Such programs can also incorporate range checking. While taking extra effort, it is well worth the initial investment in design.
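The dual entry and range checks just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each entry pass is held as a dictionary keyed by a record identifier, and the variable names, limits and values are hypothetical. It does not reproduce the behaviour of SPSS Data Entry or any other package.

```python
def compare_entries(first, second):
    """List (record_id, variable, first_value, second_value) for each mismatch."""
    discrepancies = []
    for record_id in sorted(set(first) | set(second)):
        row_a = first.get(record_id, {})
        row_b = second.get(record_id, {})
        for variable in sorted(set(row_a) | set(row_b)):
            a, b = row_a.get(variable), row_b.get(variable)
            if a != b:
                discrepancies.append((record_id, variable, a, b))
    return discrepancies


def range_check(records, limits):
    """List (record_id, variable, value) for missing or out-of-range values."""
    problems = []
    for record_id in sorted(records):
        for variable, (low, high) in sorted(limits.items()):
            value = records[record_id].get(variable)
            if value is None or not (low <= value <= high):
                problems.append((record_id, variable, value))
    return problems


# Hypothetical data: the second operator transposed the digits of one age.
first_pass = {1: {"age": 54, "sbp": 120}, 2: {"age": 47, "sbp": 135}}
second_pass = {1: {"age": 45, "sbp": 120}, 2: {"age": 47, "sbp": 135}}
limits = {"age": (18, 100), "sbp": (70, 250)}

print(compare_entries(first_pass, second_pass))
print(range_check(second_pass, limits))
```

In this example both ages are plausible, so the range check passes while the comparison of the two passes flags the record. That is the kind of clerical error range checks alone cannot detect.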
Some argue against the administrative burden of double data entry, however, on the grounds that range checks, etc., will detect most clerical errors, yet it remains a sensible precaution, especially if temporary or external staff are to be used for data entry.

The audit of change is important, particularly if several researchers are reviewing the data. An individual correcting a variable may unwittingly invalidate another's previous analysis. It is good practice in these circumstances to set up a core dataset and then use a program to change individual data elements if they need revision. Thus, for example, a file with SPSS syntax language data transformation commands can be used to compute changes to an SPSS data file. It is then available for review by all in the team.

There is much interest in using personal digital assistants (PDAs), or internet browsers, for data entry, sometimes directly by the subjects themselves. The superficial attractiveness of these procedures can be misleading. Transferring and merging data from a PDA is not necessarily simple and will usually require significant manual intervention. This can be a source of error and care needs to be taken in design to prevent this. The use of a web page for data entry potentially gives access to many thousands of respondents. Setting up a reliable data entry page is not so simple. Browsing sessions often terminate for communications reasons mid-session and therefore program logic needs to identify successful completion. Care needs to be taken to identify unique data entry sessions by, for example, the originating IP address of the client browser. Data security is needed so a user cannot accidentally see other entries. A reasonable amount of programming effort is required to do this, and database, programming and HTML design skills will certainly be needed to achieve it successfully. This does not mean it is not possible to design a simple web page to acquire data; it is just more complicated to acquire data both reliably and accurately.

Planning is the key to successful data entry. Before any data are collected the process flow for entering data into the computer package and its checking should be described and adhered to rigidly by the research team. Such discipline will pay off in smoothing the path to analysis. CS

[See also DATA MANAGEMENT]

data-dependent designs These are methods used for allocating treatments to patients in clinical trials that make constructive use of the emerging responses. Compared with traditional trial designs using predetermined sample sizes, data-dependent designs aim to impart some advantage to trial participants, reaching a conclusion sooner and/or exposing fewer to inferior therapy during the course of the trial. Such designs have been around in theory for at least as long as modern CLINICAL TRIALS, although their practical applications have hitherto been very limited. Other terms in use for similar methodological approaches for such trials include flexible designs, dynamic designs, ADAPTIVE DESIGNS and learn-as-you-go designs, although there is no apparent consensus on the nomenclature.

There are four broad categories of data-dependent designs, each of which shares the same spirit of learning from the accumulating data within the trial, as opposed to ignoring intermediate results until completion of the trial. These categories are: sequential, Bayesian, decision-theoretic and adaptive. Their descriptions given later in this entry deliberately avoid too much mathematical detail. Also, a distinction is drawn here from two other related types of clinical trial design not discussed further. First, there are designs that use MINIMISATION to incorporate knowledge of covariates of patients already entered into a trial (and hence self-modify according to treatment allocation though not to treatment response) and, second, there are trial designs intended to have an internal pilot study (see PILOT STUDIES).

Initially, however, it is worth considering briefly some historical background to help understand why modern clinical trials have emerged in the way they have. Typically they feature fixed sample sizes dictated by error probability considerations (see TYPE I ERROR and TYPE II ERROR rates), treatments being allocated equally (usually, to maximise POWER of a test) and results kept hidden (to all but a DATA AND SAFETY MONITORING BOARD (DSMB), if appointed) until the final analysis, which is conducted well after the final patient has been enrolled.

Interestingly, and tellingly, the roots of today's trials lie not in medicine but in agriculture. In the UK in the 1920s, R. A. Fisher began conducting crop field trials to try to determine which type of fertiliser produced higher yields of wheat. Realising there were more factors than could be listed (soil composition, aspect, slope, water and so on) that might influence total yield, Fisher (1926) pioneered RANDOMISATION to cope with the problem of balancing all the known and unknown variables as far as possible. It was a statistical masterstroke, for such use of an external chance mechanism alone could ensure that the comparison between fertilisers was fair and unbiased. Specifically, any difference observed in crop yield at harvest time could be attributed to the one factor that was known to be different between the groups, namely the fertiliser, all other factors being expected to be equal. Hence,
inference from any observed differences between groups would link cause and effect (here, fertiliser and yield) as strongly as possible (see CAUSALITY).

The first medical application of randomisation came in the late 1940s, when A. B. Hill used Fisher's technique in a clinical trial testing streptomycin for the treatment of pulmonary tuberculosis. This was not without some controversy at the time, but Hill convinced sceptics by arguing that randomisation was also a fair way of allocating the scarce resource involved, given that the treatment was in strictly limited supply. This Medical Research Council sponsored trial became the first randomised controlled clinical trial to be published (MRC, 1948).

However, trials of today are fundamentally quite similar to those of 50 or 60 years ago, in that they typically involve equal allocation of treatments to patients, generally after performing a power calculation to determine a target number to be recruited. Thus, in a two-treatment comparative trial, half the patients customarily receive the standard and half the experimental treatment. As already mentioned, with the possible exception of DSMB committee members and a statistician conducting an INTERIM ANALYSIS, no one looks at the results until all the patients have been randomised and followed up. At the end of the trial it is possible that the experimental treatment is declared a statistically significant improvement and heralded as a clinical success. It is an ethical problem, however, if statistical 'failure' means the patient died and one can look back with some remorse wondering 'if only we had come to this conclusion sooner perhaps we could have saved some lives' (see ETHICS AND CLINICAL TRIALS). Even if the outcome is not as serious as death, the dilemma persists: could fewer patients in the study have suffered on the way to reaching a valid conclusion?

This last question has motivated much research by ethically minded statisticians. Ironically, this work dates back at least as far as the first modern clinical trial, for the whole area of SEQUENTIAL ANALYSIS traces its history to the 1940s, World War II and US government-contracted statistician Abraham Wald (see Wald, 1943). His work, like Fisher's, was not in the medical area of application, but in ammunition testing, an altogether different example of seeking to cope with precious and limited resources. Medical application of sequential methods does seem entirely appropriate, after all, as patients arrive to be treated sequentially (they are not all waiting in line outside the doctor's office, or hospital clinic, at the start of a trial) and, similarly, results from some are available sooner than from others.

The rationale for sequential trials involves looking carefully at data as they accrue with a view to stopping just in time. Hence, the number of experimental units required is not fixed in advance but is a random variable. Theory shows that the expected number involved in a sequentially analysed randomised controlled trial is less than in the corresponding fixed sample size trial,
for any given power and level of significance. It is possible, when treatment groups fare broadly equally well, for a sequential trial to need slightly more patients overall compared with a trial using traditional design, but this would be quite unusual.

For better or worse, the clinical trial as conducted and analysed today is not in Wald's style of testing ammunition but rather in Fisher's application of fertiliser to fields of wheat. These two metaphors illustrate the fundamental difference between the statistics behind clinical trials that strive to learn-as-they-go and those that wait, literally, until harvest time before beginning to make scientific inferences. The reader may decide whether it is right that normative practice sees clinical trial volunteers afforded the same respect as the fertiliser rather than the ammunition.

Following Wald's pioneering research, sequential designs have evolved as sophisticated tools to assist those on DSMBs and hence can be considered mainstream, in contrast to the remaining design types discussed below. It should be said, however, that these methods are not routinely implemented as primary analytical tools for driving trials. Instead at best they are used as 'back seat drivers' to exert indirect influence on trial conduct. How do they do this? Essentially, as data accumulate, a test statistic can be plotted on a graph of treatment difference versus time, and trial recruitment can be recommended to terminate just as soon as a predetermined boundary is crossed. This boundary may take on various shapes, the simplest being triangular with two possible options; either treatment A or B is declared better depending on which side of the triangle is crossed first. To allow for a third, nonconclusive, option with a predetermined maximum trial sample size, the boundary outline is modified to include a vertical line at a given point on the time (strictly 'information') axis. The idea is to stop the trial in favour of treatment A, say, if the upper line of the boundary is crossed first; B if the lower line; or else, conclude no clinically relevant difference between A and B if the vertical line is reached first. There are variations on this theme, with rules such as those derived by Pocock (1983) and by O'Brien and Fleming (1979) being popular examples. Thus it is not necessary to update the graph after every single observation. One can apply rules, called group sequential methods, that update after small batches of results become available. For more details refer to Jennison and Turnbull (2001). Statistical software for implementing these rules is readily available in several commercial packages (e.g. EaSt, PEST, S+SeqTrial).

One disadvantage with sequentially designed experiments is that their usefulness, namely their potential to learn while in progress, is self-limiting to trials having relatively rapid ENDPOINTS. Thus a sequential trial offers little benefit over a traditional, fixed sample size trial if the outcome remains unknown until years after randomisation. This may be so in breast cancer, for example, but is no limitation, for instance, in emergency medicine or in rapidly fatal diseases.

Turning to Bayesian designs,
investigators start by eliciting a PRIOR DISTRIBUTION, either from a panel of clinical experts or from a reasonable selection of available theoretical distributions thought to mimic reality in terms of treatment success distributions (see BAYESIAN METHODS). For example, a BETA DISTRIBUTION with suitably chosen parameters can represent initial beliefs about a treatment's efficacy ranging from negatively skewed to uniformly distributed to positively skewed. In practice, there is virtue in choosing a prior that makes the experimental treatment appear initially a weak contender, so that positive results in favour of the treatment are not too dependent on initial choice of the prior. As the patients' results accumulate, the conditional distribution given the data thus far is evaluated: the so-called POSTERIOR DISTRIBUTION, amalgamating the prior and the LIKELIHOOD. Inference is based on the posterior, including the evaluation of CREDIBLE INTERVALS, analogous to CONFIDENCE INTERVALS in the frequentist context.
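The updating step just described can be sketched for a binary outcome, assuming a beta prior combined with a binomial likelihood, so that the posterior is again a beta distribution. The prior parameters, the patient counts and the Monte Carlo approximation of the credible interval are illustrative assumptions only, and the function names are hypothetical.

```python
import random


def beta_posterior(prior_a, prior_b, successes, failures):
    """Beta(a, b) prior plus binomial data gives Beta(a + s, b + f)."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def credible_interval(a, b, level=0.95, draws=20000, seed=1):
    """Approximate equal-tailed credible interval from simulated draws."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    tail = (1.0 - level) / 2.0
    lower_index = int(tail * draws)
    upper_index = min(draws - 1, int((1.0 - tail) * draws))
    return samples[lower_index], samples[upper_index]


# A deliberately pessimistic prior (mean 1/3) for the experimental treatment,
# updated with hypothetical results from 20 patients: 14 successes, 6 failures.
a, b = beta_posterior(2.0, 4.0, successes=14, failures=6)
print(beta_mean(2.0, 4.0), beta_mean(a, b), credible_interval(a, b))
```

Starting from a pessimistic prior means that a posterior favouring the treatment owes more to the data than to the initial beliefs, which is the virtue noted in the text.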

An advantage is the ease of interpretation of these intervals, for they have more intuitive meaning to clinicians and patients. A disadvantage is the general lack of awareness of Bayesian methods since these are less often encountered than those from the frequentist school. This is reflected in the comparative lack of statistical textbooks, courses and software aligned to the Bayesian paradigm. Spiegelhalter, Freedman and Parmar (1994) provide an excellent overview of Bayesian methodology applied to clinical trials. Some see the subjective or arbitrary nature of the prior distribution involved as a weakness; others regard it as a positive opportunity to incorporate provisional information about the potential new treatment.

The third broad category of data-dependent designs involves the use of DECISION THEORY. Some experimental studies can be conducted with the resulting inference, in terms of how the information will be used to reach a practical decision concerning which treatment to recommend, as the driving force. For example, one can specify a criterion such as minimising expected successes lost, or maximising successes gained, over the course of a predetermined number of future patients, called the horizon, within and outside a comparative trial. Another criterion could be maximising the probability of correct selection of the superior treatment. Either way, the focus is on the pragmatic need to make a decision and use one of the treatments or not once the trial is over, in a direct attempt to balance the needs of current and future patients. It is possible to discount future patients by putting more weight on present results, although this whole area can become mathematically quite intricate, especially when modelling with unconstrained 'multiarmed bandits' in the context of deciding among several treatments. Nevertheless, practical simplifications can be incorporated,
such as limiting to equal allocation among remaining treatments. In the case of just two treatments this amounts to allocating pairs of treatments until such time as it is optimal, by whatever criterion, to cease the comparative stage. After that one can switch all remaining patients within the horizon to the preferred treatment, or maybe enter them into a brand new randomised trial comparing this 'winner' with another novel treatment that is ready for a comparative trial. Thus, one is not constrained in actuality to put all remaining patients on to the indicated treatment, but one can act safely in the knowledge that the selection of the winner is working to the best available information, where 'best' is guaranteed until the original horizon is reached. (Note, in practice, the choice of horizon in absolute terms is not critical, for only an approximate size would need to be specified.) Objections to the subjective nature of prior distributions involved in this type of decision-theoretic framework can be alleviated, for example, by appealing to minimax criteria. This means implementing a design that has good theoretical properties across a broad range of priors. Development of computer software to allow such designs to be implemented has been slower than for sequential methods, contributing to the current lack of use of decision-theoretic methods in practice.

The fourth category considered here, (response-)ADAPTIVE DESIGNS, is the most extreme type of data-dependent design. It incorporates the accruing information from the data to modify the treatment allocation probabilities away from 50:50 in the case of two treatments. Thus, for example, whereas the trial would start with equal allocation, as the data begin to favour one treatment even slightly, the odds of the next allocation being to that treatment become accordingly fractionally higher. In practice it works like this. Imagine a bag containing an equal number of red and blue balls. A red ball drawn indicates the next allocation is to treatment A; a blue ball, treatment B. If a success occurs, a ball of the appropriate colour is added to the bag before the next drawing, and hence treatment allocation, takes place.

Adaptive designs were set back by a rather poor prototypical example of a mid-1980s trial (Bartlett, Roloff and Cornell, 1985) involving extracorporeal membrane oxygenation (ECMO) therapy, which has received much attention in the statistical and medical literature. Ethicists, clinicians and statisticians have all contributed to the debate about this particular trial. It involved critically ill newborn babies and the relevant outcome in question really was a matter of life and death. In retrospect, it was clearly a mistake to begin this trial with precisely one ball of each colour in the bag instead of, say, ten of each. What ensued was a highly unbalanced distribution of treatment allocation (for ECMO babies generally lived, unlike many of those not on ECMO therapy), rendering sensible inference difficult, if not impossible. On the other hand, it can be said that since the ECMO trial, computing power and mobile technology,
two prerequisites for successfully conducting an adaptive design, have taken huge leaps forward, making this design far more feasible to implement successfully than ever before. These adaptive designs are the most controversial in the family of data-dependent designs. This is because they appear to react too quickly to early data, which may be subject to systematic bias, or time trends, and if not careful can begin to adapt too swiftly to chance results. There is also the criticism that if one treatment happens to be a PLACEBO, why should anything change after a success or a failure on such an inert substance? Nevertheless, with suitable cautions and awareness of the issues involved, adaptive designs can be a highly effective and ethically appealing design, despite once again the relative dearth of positive examples of their use so far. A growing number of statisticians believe the 21st century will be characterised by more use of computer-intensive, data-dependent methods, so long as those responsible for


conducting clinical trials are open to receiving suggestions on how to advance trial methodology. For further details, including when data-dependent methods are considered most suitable and a proposed strategy for their introduction, see Palmer (2002). In closing, it is worth remembering why one should consider using data-dependent designs. The primary reason is for their ethical advantage in terms of how patients within trials are regarded, without compromising the scientific rigour or usefulness of studies for the sake of future patients. There are secondary reasons besides, with benefits derived from the side effect of expecting fewer patients to be involved in learn-as-you-go trials compared with traditional trials. These benefits pertain to trial sponsors (notably the pharmaceutical industry), doctors, patients and ultimately the science of medicine itself. CRP

Bartlett, R. H., Roloff, D. W., Cornell, R. G. et al. 1985: Extracorporeal circulation in neonatal respiratory failure: a prospective randomised study. Pediatrics 76, 479-87.
Fisher, R. A. 1926: The arrangement of field experiments. Journal of the Ministry of Agriculture, Great Britain 503-13.
Jennison, C. and Turnbull, B. W. 2001: Group sequential methods with applications to clinical trials. London: Chapman & Hall/CRC.
MRC Streptomycin in Tuberculosis Trials Committee 1948: Streptomycin treatment for pulmonary tuberculosis. British Medical Journal 769-82.
O'Brien, P. C. and Fleming, T. R. 1979: A multiple testing procedure for clinical trials. Biometrics 35, 549-56.
Palmer, C. R. 2002: Ethics, data-dependent designs and the strategy of clinical trials: time to start learning-as-we-go? Statistical Methods in Medical Research 11, 381-402.
Pocock, S. J. 1983: Clinical trials: a practical approach. Chichester: John Wiley & Sons, Ltd.
Spiegelhalter, D. J., Freedman, L. S. and Parmar, M. K. B. 1994: Bayesian approaches to randomised trials. Journal of the Royal Statistical Society, Series A 157, 357-416.
Wald, A. 1943: Sequential analysis of statistical data: report submitted to Applied Mathematics Panel, National Defense Research Committee (declassified in Wald, A. 1947: Sequential analysis. New York: John Wiley & Sons, Inc.).
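
The urn scheme described in the adaptive designs discussion above (draw a ball to allocate, add a ball of the same colour after each success) can be sketched in a few lines. This is only an illustrative simulation under stated assumptions: the function name, the success probabilities and the default of ten initial balls per colour are hypothetical choices, and the sketch is not a reconstruction of any actual trial.

```python
import random


def simulate_urn_trial(p_success, n_patients, initial_balls=10, seed=0):
    """Simulate response-adaptive allocation with a two-colour urn.

    p_success: dict with hypothetical success probabilities for arms "A" and "B".
    initial_balls: number of balls of each colour at the start (an assumption).
    Returns the list of allocations and the final urn composition.
    """
    rng = random.Random(seed)
    urn = {"A": initial_balls, "B": initial_balls}
    allocations = []
    for _ in range(n_patients):
        total = urn["A"] + urn["B"]
        # Drawing a ball: arm A is chosen with probability (A balls) / (all balls).
        arm = "A" if rng.random() < urn["A"] / total else "B"
        allocations.append(arm)
        # A success adds one ball of the allocated arm's colour; a failure adds nothing.
        if rng.random() < p_success[arm]:
            urn[arm] += 1
    return allocations, urn


if __name__ == "__main__":
    for start in (1, 10):
        alloc, urn = simulate_urn_trial({"A": 0.8, "B": 0.4}, 50, initial_balls=start)
        print(start, alloc.count("A"), alloc.count("B"), urn)
```

Running the sketch with one initial ball per colour and then with ten gives a feel for how strongly the starting composition of the bag governs early imbalance in allocation.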

data dredging

See POST HOC ANALYSIS

data management

This is the systematic management of a large structured collection of information. 'Data management' is always a component of data analysis, but is usually a more significant issue in large or multicentre studies where the data management features of software packages such as SPSS or Excel are inadequate. This will also be the case when the 'data model' of the study - the entities for which data are collected and the relationships between them - does not fit the standard rectangular data model of the spreadsheet or the classical statistical package. Thus, for example, a study comparing treatment in hospitals may have three entities - hospital, ward and patient - that need data recording at each level. Longitudinal or repeated measurement studies also generate data that do not so easily fit the rectangular model.

Nonetheless, very many complicated studies are managed and analysed entirely in packages such as SPSS or SAS (see STATISTICAL PACKAGES). SAS in particular has very strong data management features. Several data files for each entity can be created, and the data merge features of these programs used in order to prepare specific analyses. Because this merging of files is manual, more skill and experience on behalf of the researcher is needed and thorough documentation and understanding of merge procedures is essential to prevent error. Nonetheless, because only one programming language is used, the procedures are consistent and easier to learn. Although such an approach seems 'low-tech', there is much to recommend it for many studies. The main weakness of this approach is the manual management of transaction updates and production of an audit trail. If, in the example above, more patients are recruited in a particular ward, then derivative files that include hospital or ward variables need to be recreated manually. In a very dynamic data environment this is tedious and also error prone. Equally the correction of values in one file similarly requires the recreation of all the derivative files. An alternative is to consider using a formal data management tool, and this usually implies a database. With a fast modern PC, desktop database packages such as MS-Access are capable of managing datasets with some tens of thousands of cases and several hundred variables. Only the very largest studies will require a full SQL-compliant database, although there may be sound reasons for using the latter for security and access control. Almost inevitably deployment of a database will require the production of data entry and update screens, a process that requires some programming ability. This is particularly the case if transaction control and an audit trail of changes are needed. Second,
the database query statements needed to provide the appropriate rectangular matrix datasets for analysis can be complicated, and can require subtle understanding of SQL. Such in-depth expertise is not normally easily accessible in a research team and may be expensive to provide. Before deciding to use a database for data storage the research team should plan and budget for such skills to be available throughout the life of the project. Employing a programmer who then gets another job just before the end of the study can leave a research team without the support they need when wanting finally to analyse the data. For this reason alone, it is often sensible to consider the acquisition of specialised clinical data management packages. These often include all the data checks and forms necessary for formal CLINICAL TRIALS. Entering the appropriate terms in any web search engine will bring up several hundred companies offering products that are suitable - the difficulty will be in selecting one. Although there may be a seemingly significant initial cost (perhaps several thousand euros or


dollars), the saving on development time, as well as the predictability and security of software operation, give a rapid payback. At project conception it is usually possible to outline the extent of data management requirements dependent on the complexity of the problem. It is sensible for prior specification and budgeting of the software needed to take place, rather than awaiting project start and then developing ad hoc solutions. This will give the research team the security of control of the data over the project lifetime. CS (See also DATA ENTRY)
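
As an illustration of the query step mentioned above, the following sketch stores the hospital, ward and patient entities in separate tables and derives the rectangular, one-row-per-patient dataset with a join. It uses Python's built-in sqlite3 module purely for convenience; the table names, column names and values are hypothetical, and a real study would need a schema matched to its own data model.

```python
import sqlite3


def build_rectangular_dataset():
    """Return one row per patient, carrying ward- and hospital-level variables."""
    con = sqlite3.connect(":memory:")
    # Three entities, each recorded once at its own level (hypothetical schema).
    con.executescript("""
        CREATE TABLE hospital (hospital_id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE ward (ward_id INTEGER PRIMARY KEY,
                           hospital_id INTEGER REFERENCES hospital(hospital_id),
                           beds INTEGER);
        CREATE TABLE patient (patient_id INTEGER PRIMARY KEY,
                              ward_id INTEGER REFERENCES ward(ward_id),
                              age INTEGER);
        INSERT INTO hospital VALUES (1, 'North'), (2, 'South');
        INSERT INTO ward VALUES (10, 1, 20), (11, 1, 12), (20, 2, 30);
        INSERT INTO patient VALUES (100, 10, 64), (101, 10, 71),
                                   (102, 11, 58), (103, 20, 49);
    """)
    # The join regenerates the analysis file on demand, so newly recruited
    # patients or corrected values never require derivative files to be rebuilt by hand.
    rows = con.execute("""
        SELECT p.patient_id, p.age, w.ward_id, w.beds, h.hospital_id, h.region
        FROM patient AS p
        JOIN ward AS w ON w.ward_id = p.ward_id
        JOIN hospital AS h ON h.hospital_id = w.hospital_id
        ORDER BY p.patient_id
    """).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in build_rectangular_dataset():
        print(row)
```

The design point is that each value is stored once, at the level of the entity it describes, and the rectangular matrix is a derived view rather than a manually maintained file.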

data mining in medicine This is a branch of both computer science and statistics devoted to extracting useful knowledge from databases (also known as KDD, knowledge discovery in databases). In general, such knowledge is obtained by detecting various types of regularity and relation among the data, most often association rules, classification rules (see DISCRIMINANT FUNCTION ANALYSIS), linear and nonlinear dependencies and clusters (see CLUSTER ANALYSIS). Depending on the context in which it is performed, data mining research may emphasise computational scalability of the algorithms or statistical significance of the results. The field benefits from a major injection of ideas and tools from general MACHINE LEARNING and pattern recognition and, as such, it is often considered also as part of ARTIFICIAL INTELLIGENCE.

Data mining (DM) is often described as an interactive process that involves both the computer and the human component. This is also why data visualisation is considered an essential part of the process. DM is more general than traditional statistical analysis, in the type of regularities that can be found (e.g. decision trees), in the size of the datasets (often in the range of millions of data items) and in the strong emphasis on visualisation of the data and automation of the analysis. The application of DM to medicine has a long tradition. Automatic data collection in modern medicine is increasingly pushing towards the development and deployment of tools able to handle and analyse data in a computer-supported fashion. Being able to detect sets of symptoms that are often simultaneously present (association analysis) can help predict which other symptoms may be observed (association rules). Observing many patient descriptions as well as their diagnoses may help find a rule to predict the diagnosis given a new patient (classification analysis). Spotting groups of similar patients can help customise the therapy (cluster analysis). Finally, being able to predict the expected cost of a patient based on his or her history can help insurance companies optimise their services (regression analysis - see MULTIPLE REGRESSION). An early application of data mining in medicine is the decision tree learner ASSISTANT (Cestnik, Kononenko and Bratko, 1987; Witten and Frank, 1999). It was developed specifically to deal with the particular characteristics of medical datasets. A whole new chapter in the application of DM techniques to biomedical data is being written with the introduction of genome-wide datasets. Genomic sequences for several organisms are now available online and the availability of high-throughput gene expression and proteomic data highlights the urgent need for efficient and flexible algorithms to extract the wealth of medical information contained in them. Datasets recording human genetic variability (SNPs) are soon expected and, with them, the possibility of correlating genotypic with phenotypic information (see GENETIC EPIDEMIOLOGY).
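
The association analysis of symptoms mentioned above is usually quantified by the support of a symptom set and the confidence of a rule. The sketch below computes both for a handful of invented patient records; the symptom names and data are hypothetical, and practical association-rule miners use far more efficient search than this direct count.

```python
def support(records, items):
    """Proportion of records that contain every item in `items`."""
    items = set(items)
    return sum(items <= record for record in records) / len(records)


def confidence(records, antecedent, consequent):
    """Estimated P(consequent present | antecedent present) for a rule."""
    joint = support(records, set(antecedent) | set(consequent))
    base = support(records, antecedent)
    return joint / base if base > 0 else float("nan")


# Hypothetical symptom sets, one per patient.
patients = [
    {"fever", "cough", "fatigue"},
    {"fever", "cough"},
    {"fever", "rash"},
    {"cough", "fatigue"},
    {"fever", "cough", "fatigue"},
]

if __name__ == "__main__":
    print(support(patients, {"fever", "cough"}))
    print(confidence(patients, {"fever", "cough"}, {"fatigue"}))
```

A rule such as 'fever and cough imply fatigue' would normally be reported only if both its support and its confidence exceeded thresholds chosen by the analyst.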

Classic examples of modern data mining methods are systems such as BLAST (Altschul et al., 1990), which allows researchers to find related genetic sequences efficiently, together with a statistical assessment of the degree of similarity. Significant biological discoveries are now routinely being made by combining DM methods with traditional laboratory techniques. For example, the discovery of novel regulatory regions for heat shock genes in Caenorhabditis elegans (Thakurta et al., 2002) was made by mining vast amounts of gene expression and sequence data for significant patterns. NC/RDS

Altschul, S. F., Gish, W., Miller, W., Myers, E. W. and Lipman, D. J. 1990: Basic local alignment search tool. Journal of Molecular Biology 215, 403-10.
Bratko, I. and Kononenko, I. 1987: Learning diagnostic rules from incomplete and noisy data. In Phelps, B. (ed.), AI methods in statistics. London: Gower Technical Press.
Cestnik, B., Kononenko, I. and Bratko, I. 1987: ASSISTANT 86: a knowledge elicitation tool for sophisticated users. In Bratko, I. and Lavrac, N. (eds), Progress in machine learning. Wilmslow: Sigma Press.
Hand, D. J., Mannila, H. and Smyth, P. 2001: Principles of data mining (adaptive computation and machine learning). Cambridge, MA: MIT Press.
Lavrac, N. 1999: Selected techniques for data mining in medicine. Artificial Intelligence in Medicine 16, 1, 3-23.
Thakurta, D. G., Palomar, L., Stormo, G. D., Tedesco, P., Johnson, T. E., Walker, D. W., Lithgow, G., Kim, S. and Link, C. D. 2002: Identification of a novel cis-regulatory element involved in the heat shock response in Caenorhabditis elegans using microarray gene expression and computational methods. Genome Research 12, 5, 701-12.
Witten, I. H. and Frank, E. 1999: Data mining: practical machine learning tools and techniques with Java implementations. London: Morgan Kaufmann.

data monitoring committee (DMC)

See DATA AND SAFETY MONITORING BOARDS

decision theory This is an approach to the analysis of data that leads to a choice between a number of alternative actions by consideration of the likely consequences. This is in contrast to the commonest form of analysis of data from a CLINICAL TRIAL that is based on hypothesis testing.


The decision-theoretic approach is most suitable for decisions in which the possible actions are entirely within the control of the decision maker. In a clinical trials setting, most suggestions for the use of decision theory have been in early-phase trials, the final outcome of which is usually a decision as to whether or not to continue with the clinical development programme. In a decision-theoretic framework, the actions that will be taken as a result of the analysis are explicitly identified and a utility, or gain, assigned to each expressing the desirability of the action as a function of some unknown parameter. For example, in an early-phase drug trial, possible actions might be either conducting further clinical trials with the drug or abandoning the clinical development programme. The desirability of each of these actions depends on the true unknown efficacy. Information on the unknown parameter is summarised by a Bayesian posterior distribution (see BAYESIAN METHODS), indicating those values that are plausible given the observed data, and this can be used to obtain an expected value for the utility for each action. The action with the largest posterior expected utility may then be identified. This is the action that will be chosen by a rational decision maker whose preferences are accurately represented by the utility function.

A simple example based on that considered by Sylvester (1988) (see also the correction by Hilden, 1990) illustrates decision making in a PHASE II TRIAL. At the end of the trial a decision will be made as to whether or not to continue with Phase III development. The desirability of each of these two options is summarised by a utility function, which, if the observed data are binary (success/failure), depends on the true success rate, which will be denoted by $p$. Suppose that the success probability for the existing standard treatment is known: e.g. it may be known to be 0.5. Suppose also that if the Phase II trial is successful, some known number, denoted $m$, of patients will be treated with the new treatment in PHASE III TRIALS and that if it is found to be effective in the Phase III trial, a total of $l$ additional patients will be treated with the new treatment. Patients treated with the standard treatment in the Phase III trial will receive the same treatment, regardless of the outcome of the Phase II trial, so need not be considered. Suppose that the utility of each action can be measured by the number of future successes expected if that action is taken.

If the Phase III trial is not conducted, the $m + l$ future patients will all receive the standard treatment. The success rate for these patients will then be 0.5, so that the expected number of future successes is $0.5(m + l)$. This does not depend on the success rate for the new drug since this will not be used for any further patients. If the Phase III trial is conducted, $m$ patients will receive the new treatment in the trial, so that the expected number of successes for the patients in this trial is $pm$. If the Phase III trial is unsuccessful, the $l$ further patients will receive the standard treatment and the expected number of successes for these patients will be $0.5l$. If the Phase III trial is successful, these patients will receive the new treatment so that the expected number of successes will be $pl$. We will assume that the Phase III trial will give the correct answer, so that it will be successful whenever $p > 0.5$, in which case the number of extra successes from treating the $l$ future patients with the new rather than the standard treatment is the difference between $pl$ and $0.5l$, which is $(p - 0.5)l$. The total expected number of successes if the Phase III trial is conducted will be:

$$mp + 0.5l + (p - 0.5)\,l\,I(p > 0.5),$$

where the indicator function $I(p > 0.5)$ is equal to 1 if $p > 0.5$ and 0 if $p \le 0.5$. If $p$ is large, this utility is large, reflecting the fact that continuation to Phase III is desirable, and if $p$ is small, the utility is smaller as continuation to Phase III is undesirable.

If the value of $p$ were known, we would take the action corresponding to the larger of the two utilities: i.e. we would continue to Phase III if the expected number of successes from doing so, $mp + 0.5l + (p - 0.5)\,l\,I(p > 0.5)$, was greater than the expected number from abandoning development of the experimental treatment, $0.5(m + l)$. In practice, of course, $p$ is not known, but instead the information on $p$ given by the observed data is summarised by its posterior distribution and the expected number of future successes from each action must be averaged over this distribution. The optimal action corresponding to the larger expected number of successes can then be selected.

In addition to making decisions at the end of a clinical trial, as illustrated here, decision theory can be used to make decisions before the study starts regarding the study design for clinical trials, as discussed by Sylvester (1988), and it is in this context that the approach is most often proposed. Design decisions considered might be those taken before the study starts regarding the sample size for a fixed sample size study or those taken during a sequential trial (see SEQUENTIAL ANALYSIS) about the future conduct of the trial. In the latter case, a method known as 'dynamic programming' may be used to obtain a sequence of optimal decisions by working backwards through the trial, considering each decision point in turn. Examples are given by Berry and Stangl (1996), who consider the problems of when to stop a sequential trial involving a single experimental treatment and of deciding which treatment to use for each patient in a sequential trial comparing an experimental treatment with a control. Although the suggestion to use decision theory in clinical trials has a long history (see, for example, Colton, 1963), there has been little practical application (see DATA-DEPENDENT DESIGNS). The use of the approach has probably been limited by the difficulties associated with specification of appropriate utility functions. The detailed specification of the gain function also means that designs must be obtained with a particular type of trial in mind. One possible solution is


to use what has been called a stylised Bayesian approach, as illustrated, for example, by Stallard, Thall and Whitehead (1999), in which parameters in the utility function are selected so as to lead to a design with attractive frequentist properties. NS

Berry, D. A. and Stangl, D. K. 1996: Bayesian methods in health-related research. In Berry, D. A. and Stangl, D. K. (eds), Bayesian biostatistics. New York: Marcel Dekker.
Colton, T. 1963: A model for selecting one of two medical treatments. Journal of the American Statistical Association 58, 388-400.
Hilden, J. 1990: Corrected loss calculation for Phase II trials. Biometrics 46, 535-8.
Stallard, N., Thall, P. F. and Whitehead, J. 1999: Decision theoretic designs for Phase II clinical trials with multiple outcomes. Biometrics 55, 971-7.
Sylvester, R. J. 1988: A Bayesian approach to the design of Phase II clinical trials. Biometrics 44, 823-36.
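
The Phase II example in this entry averages the two utilities, 0.5(m + l) for abandoning and mp + 0.5l + (p - 0.5) l I(p > 0.5) for continuing, over the posterior distribution of p. The sketch below does this numerically. It assumes, purely for illustration, that the posterior is a Beta(a, b) distribution (as would arise from a Beta prior with binary data); the function names, the grid size and the example values of m and l are hypothetical and are not taken from Sylvester (1988).

```python
import math


def beta_pdf(p, a, b):
    """Density of a Beta(a, b) distribution at 0 < p < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))


def utility_continue(p, m, l, p_standard=0.5):
    """Expected future successes if Phase III is run: mp + 0.5l + (p - 0.5) l I(p > 0.5)."""
    extra = (p - p_standard) * l if p > p_standard else 0.0
    return m * p + p_standard * l + extra


def utility_abandon(m, l, p_standard=0.5):
    """Expected future successes if development stops: 0.5 (m + l)."""
    return p_standard * (m + l)


def posterior_expected_utilities(a, b, m, l, n_grid=20000):
    """Average the 'continue' utility over a Beta(a, b) posterior (midpoint rule)."""
    width = 1.0 / n_grid
    expected_continue = 0.0
    for i in range(n_grid):
        p = (i + 0.5) * width  # midpoint of the i-th sub-interval of (0, 1)
        expected_continue += utility_continue(p, m, l) * beta_pdf(p, a, b) * width
    # The 'abandon' utility does not depend on p, so no averaging is needed.
    return expected_continue, utility_abandon(m, l)


if __name__ == "__main__":
    go, stop = posterior_expected_utilities(a=8, b=4, m=100, l=1000)
    print("continue" if go > stop else "abandon", go, stop)
```

Because this stylised utility assumes the Phase III trial always gives the correct answer, the only cost of continuing is the m trial patients exposed to a possibly inferior treatment, so continuation is favoured unless the posterior places most of its mass well below 0.5 and m is not small relative to l.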

degrees of freedom This is an elusive concept that occurs throughout statistics. Essentially, the term degrees of freedom (DoF) means the number of independent units of information in a sample relevant to the estimation of a parameter or the calculation of a statistic. For example, in a 2 x 2 CONTINGENCY TABLE with a given set of marginal totals, only one of the four cell frequencies is free and the table is therefore said to have one degree of freedom. In many cases the term corresponds to the number of parameters in a model and in others to the number of parameters in a statistical distribution such as the t-DISTRIBUTION, the F-DISTRIBUTION and the CHI-SQUARE DISTRIBUTION. BSE
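
The 2 x 2 example can be made concrete: once the marginal totals are fixed, choosing a single cell determines the other three. The following sketch (the function name and the example totals are hypothetical) fills in the table from one free cell.

```python
def complete_2x2(row_totals, col_totals, a):
    """Fill a 2 x 2 table from its margins and the single free cell `a` (top left)."""
    r1, r2 = row_totals
    c1, c2 = col_totals
    if r1 + r2 != c1 + c2:
        raise ValueError("row and column totals must have the same grand total")
    b = r1 - a  # top right is forced by the first row total
    c = c1 - a  # bottom left is forced by the first column total
    d = r2 - c  # bottom right is forced by the second row total
    if min(a, b, c, d) < 0:
        raise ValueError("cell value is incompatible with the margins")
    return [[a, b], [c, d]]


if __name__ == "__main__":
    print(complete_2x2((30, 20), (25, 25), 10))
```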

demography

The study of population processes (Preston, Heuveline and Guillot, 2001). This entry provides a brief survey of the following topics: measures of fertility, measures of mortality, age standardisation, sources of data, historical demography and the demographic transition, population projection, population ageing and summary measures of population health. Note that many demographic measures that are defined as 'rates' are not true rates in the sense that they are not measures of events per unit of person-time. These measures are identified by placing the 'rate' of their title in quotes. We begin by discussing measures of fertility. Fertility refers to actual childbearing performance, not childbearing potential, which is called fecundity. The crude birth rate is the number of births per conventional unit of person-time. The person-time denominator is typically based on estimates of population size at mid-period multiplied by the length of the period. When the period is a single calendar year (the usual circumstance) then the length of the period equals 1. For example, in England and Wales in 2001 there were 594 634 live births registered and the mid-year population was estimated at 52 084 500, giving a crude birth rate of 11.4 births per 1000 population per year.
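
The arithmetic behind the crude birth rate quoted above can be checked with a short sketch; the function name and its parameters are hypothetical, while the figures are those given in the text.

```python
def crude_rate(events, mid_period_population, period_years=1.0, per=1000):
    """Events per `per` person-years, using mid-period population x period length
    as the person-time denominator."""
    person_years = mid_period_population * period_years
    return events / person_years * per


if __name__ == "__main__":
    # England and Wales, 2001: live births and mid-year population from the text.
    print(round(crude_rate(594_634, 52_084_500), 1))
```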

The analogously calculated crude death rate may be subtracted from it to give the rate of natural increase. More specific measures of fertility are desirable because population age structure affects the childbearing potential of the population and because, at an individual level, births (unlike deaths) can be repeated and the likelihood of this happening depends on reproductive experience to that point. Thus, cumulative measures of individual fertility are also desirable. The general fertility 'rate' (GFR) is calculated as the number of live births per conventional unit of female person-years in the age range 15 to 49, while the total fertility 'rate' (TFR) estimates the average number of babies that would be born per full reproductive lifetime - given current age-specific fertility rates. It equals the sum of the probabilities of giving birth in each of the years of life in which such a birth could occur, conventionally from 15 to 49. The gross reproduction 'rate' (GRR) is the 'rate' at which mothers are reproducing themselves. It is an estimate of the average number of daughters that would be born to a woman during her lifetime if she passed through the childbearing ages experiencing the age-specific fertility rates of the population of interest. It can be estimated on the assumption that the proportion of female births is (approximately) 100/(100 + 105) = 0.488. The GRR is then 0.488 x TFR. The net reproduction 'rate' (NRR) takes into account the prevailing mortality among women and thus estimates the extent to which each generation of mothers actually reproduce themselves, allowing for the proportion who die before reproducing. It can be calculated from a hypothetical birth cohort (e.g. of 1000 females) who are aged through the reproductive lifespan and exposed to the given death rates using lifetable methods. This yields an expected number of person-years lived by the cohort of potential mothers in each of the age intervals in which births could occur. Thus,
formulaically (with $\sum$ denoting 'sum of'),

$$\mathrm{NRR} = \frac{0.488 \times \sum \left(\text{probability of giving birth in each year of life in which a birth could occur} \times \text{person-years lived by the cohort at that age}\right)}{\text{number in the cohort}}.$$

By definition, a NRR of 1.0 equals 'replacement level fertility'. 'Zero population growth' will not typically be approached until several decades after the attainment of replacement level fertility because (previously) increasing populations typically have a higher proportion in the reproductive ages than would obtain in the corresponding stationary population. This excess reproductive potential creates substantial momentum that is not slowed until the age structure approaches that of a stationary population. (See the discussion of stable population to follow.) Turning to measures of mortality, the most basic is the crude death rate, analogous to the crude birth rate, giving the number of deaths per conventional unit of population time. Thus in 1999 in the US, 2 391 399 deaths were reported and


the estimated mid-year population was 272 691 000, giving a crude death rate of 8.77 per 1000 population per year. Death rates may be specific for sex, age group and cause: e.g. 8337 men aged 55-64 had their cause of death entered as heart attack (acute myocardial infarction) on their death certificates in England and Wales in 1990 - out of an estimated mean population at risk of $25.26 \times 10^5$, giving a rate specific for age, sex and cause of 330 per $10^5$ per year. Among other specific death 'rates' and ratios, an important one is the infant mortality 'rate', conceptually, the probability of dying between birth and exact age 1 (${}_{1}q_{0}$ in lifetable notation). It is operationally defined, in relation, for example, to events occurring in a given calendar year, as:

$$\frac{\text{Number of deaths of liveborn infants who have not reached their first birthday}}{\text{Number of live births}} \times 1000$$

Note that this operational definition requires accurate counts of births and infant deaths and is therefore difficult to implement in the absence of a national vital statistics system with complete (e.g. greater than 95%) coverage. Only 75 of 191 member states of the World Health Organisation (WHO) met this criterion in 2000. Thus the infant mortality rate is most difficult to measure in those populations where infant mortality tends to be highest and of greatest public health importance. In such populations it is usually estimated using model lifetables starting from estimates of the child mortality 'rate', which tend to be more robustly estimated. Another important 'rate' is the child (or under 5) mortality 'rate', conceptually, the probability of dying between birth and exact age 5 (${}_{5}q_{0}$ in lifetable notation), conventionally multiplied by 1000. It is the most robustly estimated measure of mortality in early life in low and middle income countries without comprehensive vital statistical systems. In such countries,
it can be measured operationally (in demographic and health surveys and in censuses) by asking women of reproductive age about all the babies they have had and which of these have since died. There are standard demographic methods for using answers to these questions to estimate ${}_{5}q_{0}$. If details of dates of birth and death are available then mortality rates can be estimated directly. If only the numbers ever born and numbers alive (or dead) are known then the indirect method (also known as the 'Brass' method) is used. The WHO estimates that each year about 22% of global deaths occur in populations for which estimates of this type provide the only available evidence on mortality levels (at any age). The adult mortality 'rate' is the probability of dying between exact age 15 and exact age 60 (${}_{45}q_{15}$). It is typically used by international agencies as a summary measure of adult mortality levels in low and middle income countries. High income countries with comprehensive vital statistical systems tend not to use this measure for their own purposes.
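
The mortality figures quoted earlier in this entry can be reproduced with a short sketch, which also makes the distinction between a true rate (events per unit of person-time) and a 'rate' such as infant mortality (deaths per 1000 live births) explicit. The function names are hypothetical, and the infant mortality inputs in the usage lines are invented for illustration only.

```python
def rate_per(events, person_years, per):
    """A true rate: events per `per` person-years at risk."""
    return events / person_years * per


def infant_mortality_rate(infant_deaths, live_births):
    """Infant mortality 'rate': infant deaths per 1000 live births
    (a probability-type measure, not events per unit of person-time)."""
    return infant_deaths / live_births * 1000


if __name__ == "__main__":
    # Crude death rate, US 1999 (figures from the text), per 1000 per year.
    print(round(rate_per(2_391_399, 272_691_000, 1000), 2))
    # Age-, sex- and cause-specific rate, England and Wales 1990, per 100 000 per year.
    print(round(rate_per(8337, 25.26e5, 1e5)))
    # Hypothetical counts, for illustration only.
    print(infant_mortality_rate(500, 100_000))
```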

Maternal mortality is a topic of great policy interest globally but its measurement is fraught with difficulty. The WHO defines maternal death as the 'death of a woman while pregnant or within 42 days of termination of pregnancy, irrespective of the duration and the site of the pregnancy, from any cause related to or aggravated by the pregnancy or its management but not from accidental or incidental causes'. The maternal mortality ratio is the ratio of maternal deaths to live births x 100 000. Even in countries with the best vital statistical systems it is estimated that around one-third of maternal deaths are not identified as such by the ICD code assigned for the underlying cause of death. Elsewhere the ratio is subject to even greater uncertainty - making it unsuitable for comparisons between countries or over time. The global maternal mortality ratio for 1995 was estimated at 397/100 000 births with an uncertainty interval extending from 234 to 635 - emphasising the magnitude of the uncertainty associated with this measure. In order to make fair comparisons, especially internationally, in demography it is necessary to standardise vital (birth and death) rates. Crude (unstandardised) death rates are poor guides for comparing the force of mortality in different populations: a retirement town, with many of its population in the older (dying) age ranges, will, purely as a function of its age structure, tend to have a very high crude death rate. The processes used to age-standardise can be conveniently described by distinguishing between the 'population of interest', i.e. the population whose vital rates are being characterised, and the 'reference' or 'standard' population, i.e. either an artificial or a real population, used for standardisation. Direct standardisation is one method requiring relatively precise estimates of the age-specific death rates in the population of interest. The standard population provides a standard age structure.
The age-specific death rates of the population of interest are applied to the component age strata (of standardised size) of the standard population. In this way, each age stratum of the standard population is made to experience the same force of mortality as the corresponding age stratum in the population of interest. The resulting sum of deaths (in the standard population) is no longer influenced by the age structure of the population of interest and, when this sum is divided by the appropriate denominator (100 000 in this case), it yields a directly age-standardised death rate. A directly age-standardised death rate is, in effect, a weighted mean of the age-specific rates using a standard set of weights - with the weight for each age stratum being the standardised proportion it comprises of the total standard population. The second method is called indirect age standardisation. If the population of interest is small or the deaths of interest are from an uncommon cause, the number of deaths occurring in some age strata may be too small to produce the stable estimates of the age-specific death rates that the direct


method requires. In the indirect method, the age-specific death rates of a standard population are projected on to the age strata of the population of interest to give the number of deaths that would be expected in each age stratum on the basis of the standard rates. The ratio of the total observed deaths to the total 'expected deaths' is usually called the standardised mortality ratio (or SMR). Because SMRs are still influenced by the age structure of the populations of interest, each should, strictly, only be compared with the value for the standard population, i.e. with 1.00 (or 100 depending on which base is chosen). Sources of data used for demographic measures can be illustrated for mortality. Around one-quarter of the 57 million deaths estimated to occur each year occur in countries where the vital statistical system has been judged to be at least 95% complete. Around 13% occur in populations whose vital statistical systems are less than 95% complete. For India, China and several smaller countries, vital rates are estimated using data from sample registration and surveillance systems. In these systems some 1% or so of the national population is covered by intensive surveillance for vital events. For populations in which around 22% of global deaths occur, child mortality can be estimated from survey and census returns on the numbers of children born and numbers still alive, even though there is little or no direct evidence on adult mortality levels. These are typically estimated using model lifetables to match plausible adult mortality levels to estimated child mortality. This leaves around 5% of deaths occurring in populations with no recent data on child or adult mortality. Estimates of mortality in this last category of populations are entirely 'model based'; i.e. they are predicted from other known or estimated characteristics of the population. The calculation of death rates also requires estimates of populations at risk of dying.
A minority of countries have regular censuses with coverage deemed complete. These countries estimate populations in intercensal years using adjacent censuses. At the other end of the data availability spectrum are populations with no recent censuses. For this group, bodies such as the Population Division of the UN have a long experience in preparing 'model-based' estimates of the size and age and sex distribution of the population, albeit with substantial levels of uncertainty. Thus, while mortality estimates are now prepared by international bodies such as the WHO for all components of the human population, many of these are subject to substantial uncertainty. The evolving philosophy of the WHO has been to make the best use of all available evidence and then to seek to quantify the level of uncertainty attaching to the resulting estimates. Life expectancy estimates published by WHO are now presented with uncertainty intervals. These intervals aim to quantify all sources of uncertainty, not just that associated with sampling error - hence their description as uncertainty rather than CONFIDENCE INTERVALS.
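The indirect method of standardisation described earlier in this entry reduces to a short calculation. The following is a minimal sketch in Python and is not part of the original entry: the function name `indirect_smr` and every rate, population count and death total in it are hypothetical values chosen only for illustration.

```python
def indirect_smr(standard_rates, population, observed_deaths):
    """Standardised mortality ratio by indirect standardisation.

    standard_rates  -- age-specific death rates of the standard population
    population      -- numbers at risk in the population of interest,
                       one entry per age stratum (same order as the rates)
    observed_deaths -- total deaths observed in the population of interest
    """
    if len(standard_rates) != len(population):
        raise ValueError("one standard rate is needed per age stratum")
    # Project the standard rates on to the age structure of interest.
    expected_by_stratum = [rate * n for rate, n in zip(standard_rates, population)]
    expected_total = sum(expected_by_stratum)
    return observed_deaths / expected_total, expected_by_stratum


# Hypothetical illustration with three age strata.
standard_rates = [0.001, 0.005, 0.020]   # deaths per person-year (assumed)
population = [10000, 5000, 2000]         # persons at risk (assumed)
observed_deaths = 90                     # assumed

smr, expected_by_stratum = indirect_smr(standard_rates, population, observed_deaths)
print(expected_by_stratum, round(smr, 2))
```

An SMR computed in this way is read against the value for the standard population (1.00 on this base), in line with the caution given in the text.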

Historical demography is the branch of demography that studies how and why the force of mortality has changed through historical time, informing our understanding of the main determinants of human health, and is therefore of considerable interest. Historical demographers typically work their way backwards in time from more recent periods, with data that are readily available and of good reliability, to earlier periods where there are problems with either the availability or the quality of the available evidence. Mortality estimates based on a formalised system of data collection by parishes are available for Sweden from the mid-18th century. For England and Wales an official system of vital registration began in 1837. Before such systems were in place, parishes in England, for example, kept records of baptisms, burials and marriages. Historical demographers have used these records to 'reconstitute' families and from these genealogies have obtained both numerators (vital events) and denominators (estimates of person-time lived) for the estimation of vital rates. Family reconstitution has yielded estimates of fertility and mortality levels for England that now extend back to the 16th century. These constitute the longest such series for a North Atlantic society.

There have been two main findings from these data. First, it has been shown that the main means by which the English population adjusted to cyclic variation in economic fortunes, in the early modern period, was via the regulation of marriage (Wrigley et al., 1997). When economic conditions became difficult, age at marriage increased and the proportion never marrying also increased. These departures from the pattern of universal early marriage as seen elsewhere have been characterised as the European marriage pattern. As nuptiality varied, so did fertility and with it the rate of population increase. Second, a high level of adult mortality in England in the early modern period has been observed.
While somewhat more than 50% of those born survived to adulthood, among English males, for example, only around 30% of 15-year-olds could expect to survive to 65. It is of interest to note here that high levels of adult mortality were also typical of the poor agrarian society of India on the eve of its demographic transition. Around 1900, only 1 in 6 of 15-year-old Indian males could expect to survive to 65.

The overall transition from a 'pre-modern' to a 'late modern' pattern of vital rates is described as the demographic transition. It begins with high mortality and fertility rates, followed by a period in which mortality declines in advance of the decline in fertility - a phase of the transition in which population growth accelerates. Fertility then declines - in an idealised form to reach replacement level (NRR 1.0), with a new equilibrium being finally established with high survivorship. As has already been implied, the starting point for this demographic transition was more favourable in northwest Europe (in, say, the 17th century, when birth and death rates were 'submaximal') than in poor agrarian societies such as India (around, say, 1900, when birth and death rates were exceptionally high).

Turning from the past to the future, another important aspect of demography is making population projections and forecasts. Population projections, as the name implies, project existing populations forward in time under stated assumptions and in accord with established relationships between demographic parameters. Some projections may be known to be unrealistic but be carried out to explore 'what if' scenarios. Forecasts are those projections that are believed most likely to predict the future. The standard method for projecting populations is known as the cohort component method. Typically, each 5-year age group in the population of interest is projected forward 5 calendar years at a time. It is depleted by expected losses to death and emigration and augmented by expected levels of immigration. At the beginning of life, births (to existing residents and to immigrants) are predicted. For these purposes, attention focuses on females to whom assumed fertility schedules are applied. A parallel exercise for males makes up the numbers. This exercise is repeated, starting again with the expected population in 5 years' time. The migration component usually introduces the largest levels of uncertainty into the calculations. Realistic assumptions entail nonlinear trends in fertility and perhaps also mortality so that the assumed rates need to be adjusted for each 5-year calendar period. Both the United Nations Population Division and the US Bureau of the Census prepare projections on 'high', 'medium' and 'low' assumptions for key inputs. Thus, estimates for the size of the US population in 2050 vary by 102 million between low and high fertility assumptions, by 48 million between low and high mortality assumptions and 87 million between low and high migration assumptions.
There is a general recognition that this scenario-based approach needs to be replaced by a more systematic approach to the quantification of uncertainty and its representation in probability distributions.

Understanding of the determination of age structure rests on the theory of stable populations. Stable populations emerge when the growth rate in the number of births is constant (or the schedule of age-specific fertility rates is constant), the schedule of age-specific death rates (i.e. the lifetable) is constant and there is no migration. In such populations, to which many historical populations approximate, various mathematical relationships hold between key parameters. The age distribution, the birth rate, the death rate and the growth rate are entirely determined by the fertility and mortality schedules. Populations that are not themselves currently approximating the stable model can nonetheless be said to have a 'stable equivalent', i.e. the population that would emerge if the birth and death schedules were allowed to act continuously. From this equivalence an 'intrinsic growth rate' may be determined.
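The projection step described in this entry, and the way fixed schedules lead towards a stable growth rate, can be sketched in a few lines of Python. This is an illustrative simplification rather than the method of any agency: it assumes a closed female-only population (no migration), age groups as wide as the projection interval, an open-ended last age group, and fertility applied to the women present at the start of each interval. The function names and all numerical schedules are hypothetical.

```python
def project_step(women, survival, births_per_woman, female_share, newborn_survival):
    """Advance a closed female population by one projection interval.

    women[i]            -- women in age group i at the start of the interval
    survival[i]         -- proportion of group i still alive one interval later
    births_per_woman[i] -- births per woman in group i during the interval
    female_share        -- proportion of births that are girls
    newborn_survival    -- proportion of newborn girls alive at the interval's end
    """
    n = len(women)
    if n < 2 or not (len(survival) == len(births_per_woman) == n):
        raise ValueError("need at least two age groups and matching schedules")
    projected = [0.0] * n
    # Each cohort moves up one age group, depleted by expected deaths.
    for i in range(n - 1):
        projected[i + 1] = women[i] * survival[i]
    # The last age group is open-ended, so its survivors remain in it.
    projected[n - 1] += women[n - 1] * survival[n - 1]
    # Births: fertility schedule applied to women at the start (a simplification).
    births = sum(f * w for f, w in zip(births_per_woman, women))
    projected[0] = births * female_share * newborn_survival
    return projected


def long_run_growth_factor(women, survival, births_per_woman,
                           female_share, newborn_survival, steps=200):
    """Repeat the projection with fixed schedules and return the final
    ratio of total population size between successive intervals."""
    current = list(women)
    ratio = None
    for _ in range(steps):
        following = project_step(current, survival, births_per_woman,
                                 female_share, newborn_survival)
        ratio = sum(following) / sum(current)
        current = following
    return ratio


# Hypothetical three-group illustration (all values assumed).
women = [100.0, 100.0, 100.0]
survival = [0.9, 0.8, 0.0]
births_per_woman = [0.0, 2.0, 0.5]

after_one = project_step(women, survival, births_per_woman, 0.5, 0.9)
growth = long_run_growth_factor(women, survival, births_per_woman, 0.5, 0.9)
print(after_one, round(growth, 4))
```

In this toy model the ratio settles to a constant once the age distribution stops changing; its natural logarithm divided by the interval length plays the role of the intrinsic growth rate mentioned in the text.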

One of the most striking and counterintuitive findings from stable population theory is that population age structure is very much more sensitive to changes in the fertility schedule than to changes in the mortality schedule (Coale, 1955). Thus with a gross reproductive rate of 2, increases in life expectancy from 40 to 60 years are associated with reductions in the mean age of the population. This is because increases in survival are proportionally greatest at each end of the life span. The increases in survival in the early years of life lead to larger cohorts of parents who in turn produce more children, keeping the base of the population pyramid extended. However, as fertility falls and life expectancy extends, proportions aged over 65 do increase. Finally, as populations approach stationarity (sustained equality of birth and death rates), increases in survival are reflected in increased proportions of aged persons.

On the way to such equilibrium, substantial perturbations may arise due to the passage of cohorts that are 'large' relative to those that immediately follow. These may have arisen from short periods of increased fertility, e.g. 'post-war baby boomers' in Western countries, or from the last 'large' birth cohorts before subsequent substantial and rapid falls in fertility, e.g. in such countries as Japan, China and Italy. In the next half century these presumptively transitional phenomena will result in periods of marked 'population ageing' when the relevant 'large' cohorts pass age 65. According to the UN Population Division's 'medium' variant projections, proportions aged over 65 will increase over the period 2010 to 2025 from 8.3% to 13.4% in China, from 20.4% to 24.4% in Italy and from 22.6% to an extraordinary 29.7% in Japan. By contrast, increases in the USA are expected to be more modest: from 16.1% to 18.1% (United Nations Population Division, 2009). As populations approach stationarity,
assumptions about limits to life expectancy become increasingly relevant. Oeppen and Vaupel (2002) have shown how demographers have repeatedly underestimated such limits. Mortality decline at high ages has continued in low mortality countries and has so far shown little evidence of slowing down at the highest ages.

Demography has played an important role in the development of methods for measuring the burden of disease and injury (Murray et al., 2002). For example, the health-adjusted life expectancy (HALE) measure seeks to estimate the expectation of life in 'full health'. Time expected to be spent in less than full health is subtracted from total life expectancy, after weighting by the severity of the departure from full health. 'Health gap' measures, such as the disability-adjusted life-year (DALY) lost, estimate the hypothetical flows of 'lost healthy lifetime' arising from deaths and from onsets of disease and injury during the period of interest. For the 'years of life lost' component (and for long-term nonfatal health losses), gaps are estimated relative to a standard lifetable with a female life expectancy at birth of 82.5 years and a male life expectancy at birth of 80.0 years. Unlike health expectancy-type measures (such as HALE), health gap measures can be decomposed by allocating DALYs lost to the diseases and injuries responsible and also into the determinants of the diseases and injuries. JP

Coale, A. J. 1955: How the age distribution of a human population is determined. In Proceedings of Cold Spring Harbor Symposia on quantitative biology, pp. 83-9.
Murray, C. J. L., Salomon, J. A., Mathers, C. D. and Lopez, A. D. 2002: Summary measures of population health: concepts, ethics, measurement and applications. Geneva: World Health Organization (www.who.int/pub/smph/en/index.html).
Oeppen, J. and Vaupel, J. W. 2002: Broken limits to life expectancy. Science 296, 1029-31.
Preston, S. H., Heuveline, P. and Guillot, M. 2001: Demography: measuring and modeling population processes. Oxford: Blackwell.
United Nations Population Division 2009: World population prospects: the 2008 revision. New York: UN Department of Economic and Social Affairs (http://www.un.org/esa/population/).
Wrigley, E. A., Davies, R. S., Oeppen, J. E. and Schofield, R. S. 1997: English population history from family reconstitution, 1580-1837. Cambridge: Cambridge University Press.

density estimation Kernel estimate showing individual kernels (Silverman, 1986) [figure not reproduced]

dendrogram See CLUSTER ANALYSIS IN MEDICINE

density estimation This is the estimate of a probability distribution from a sample of observations. In many situations in medical research we may wish to use a sample of observations to estimate the frequency distribution or probability density of a variable of interest. Commonly this estimation problem is approached by simply constructing a HISTOGRAM of the data. However, the histogram may not be the most effective way of displaying the distribution of a variable, because of its dependence on the number of classes chosen. The problem becomes even more acute if two-dimensional histograms are used to estimate BIVARIATE DISTRIBUTIONS.

The density estimates provided by one- and two-dimensional histograms can be improved in a number of ways. If, of course, we were willing to assume a particular form for the distribution, e.g. normal, then density estimation would be reduced to estimating the parameters of the chosen density function. More commonly, however, we would like the data to 'speak for themselves' as it were, in which case we might choose one of a variety of the nonparametric density estimation procedures available. Perhaps the most common are the kernel density estimators, which are essentially smoothed estimates of the proportion of observations falling in intervals of some size. The essential components of such estimators are the kernel function and bandwidth or smoothing parameter. The kernel estimator is a sum of 'bumps' placed at the observations. The kernel function determines the shape of the bumps while the window width determines their width. Details of the mathematics involved are given in Silverman (1986) and Wand and Jones (1995), but the essence of the procedure can be gleaned from the first figure. Here the kernel function is Gaussian, and the diagram shows the individual bumps at each observation as well as the density estimate obtained from adding them up.

The kernel density estimator considered as a series of bumps centred at the observations has a simple extension to two dimensions as described in, for example, Silverman (1986). Here we content ourselves with an example. The second figure shows a plot of birth and death rates for 69 countries and the third figure (in (a) and (b)) shows perspective plots of density estimates given by using different kernel functions.

density estimation Scatterplot of birth and death rates for 69 countries [figure not reproduced; horizontal axis: birth rate]

density estimation Perspective plots of two density estimates for the birth and death rate data: (a) bivariate normal kernel; (b) Epanechnikov kernel [figure not reproduced]

Bivariate density estimates can also be useful when applied to the separate panels in a SCATTERPLOT MATRIX of data with more than two variables. As an example the fourth figure shows the scatterplot matrix of data consisting of three body measurements on 20 individuals with a contour plot of the appropriate estimated bivariate density function on each panel. There is clear evidence of two modes in the estimated densities, which is explained by the presence of men and women in the sample. SSE
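The description of the kernel estimator as a sum of 'bumps' corresponds to the usual formula in which the estimate at a point x is the average of kernel values K((x - x_i)/h) over the observations x_i, divided by the bandwidth h. A minimal sketch in Python with a Gaussian kernel follows; it is not taken from the entry, the function names are illustrative, and the observations and bandwidth are hypothetical values.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used as the shape of each bump."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_density(x, observations, bandwidth):
    """Kernel density estimate at x: the sum of bumps centred at the
    observations, scaled by 1 / (n * bandwidth) so that it integrates to 1."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    n = len(observations)
    bumps = (gaussian_kernel((x - xi) / bandwidth) for xi in observations)
    return sum(bumps) / (n * bandwidth)


# Hypothetical sample and bandwidth (assumed for illustration only).
observations = [1.0, 1.5, 3.0, 4.2, 4.5]
bandwidth = 0.5

# Evaluate the estimate on a grid, as one would for plotting.
grid = [-5.0 + 0.01 * k for k in range(1501)]
density = [kernel_density(x, observations, bandwidth) for x in grid]

# Crude numerical check that the estimate is a proper density.
area = sum(density) * 0.01
print(round(area, 3))
```

A larger bandwidth widens each bump and gives a smoother estimate; a smaller one lets individual observations show through, which is the role the text assigns to the window width.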

Silverman, B. W. 1986: Density estimation for statistics and data analysis. London: CRC/Chapman & Hall.
Wand, M. P. and Jones, M. C. 1995: Kernel smoothing. London: CRC/Chapman & Hall.

density estimation Scatterplot matrix of body measurements data showing the estimated bivariate densities on each panel [figure not reproduced]


dental statistics Dentistry is concerned with the provision of care for the teeth, supporting tissues and the gums, and the treatment of diseases affecting these areas of the mouth. In the United Kingdom, the Social Survey Division of the Office for National Statistics carries out the Adult Dental Health Survey every 10 years (see, for example, the report on the 1998 survey by


with one degree of freedom comparing observed and predicted counts is standard but for loci with more than two alleles a permutation-based test is preferable. PS
[See also ALLELIC ASSOCIATION, GENETIC EPIDEMIOLOGY, HERITABILITY]

Hawthorne effect This is a possible effect that might be produced in an experiment or study simply from subjects' awareness of participation in some form of scientific investigation. That individual behaviours might be altered because they know they are being studied was first said to have been demonstrated in a research project carried out at the Hawthorne Plant of the Western Electric Company in Cicero, Illinois, in the late 1920s. The major finding of the study was that, almost regardless of the experimental manipulation employed, the production of the workers seemed to improve. The implication of the effect is that people who are singled out for a study of any kind may improve their performance or behaviour, not because of any specific condition being tested but simply because of the attention they receive. A medical example suggested by Gail (1998) involves a study of methods to promote smoking cessation, in which it is necessary to contact study participants each year to determine smoking status. The Hawthorne effect could distort study results if this repeated annual contact affected smoking behaviour or the reporting of smoking behaviour. A further, more recent medical example of the appearance of the Hawthorne effect is given in Fox, Brennan and Chasen (2008). SSE

Fox, N. S., Brennan, J. S. and Chasen, S. T. 2008: Clinical estimation of fetal weight and the Hawthorne effect. European Journal of Obstetrics, Gynecology and Reproductive Biology 141, 111-14.
Gail, M. H. 1998: Hawthorne effect. In Armitage, P. and Colton, T. (eds), Encyclopedia of biostatistics. Chichester: John Wiley & Sons, Ltd.

hazard function See PROPORTIONAL HAZARDS, SURVIVAL ANALYSIS - AN OVERVIEW

health-adjusted life expectancy (HALE) See DEMOGRAPHY

health services research Health services research, according to Bowling (2002), 'is concerned with the relationship between the provision, effectiveness and efficient use of health services and the health needs of the population. It is narrower than health research'. It thus entails measuring and evaluating the inputs, processes and outcomes of healthcare provision. Input and process information that is primarily aimed at assisting healthcare managers and providers, especially when collected on a routine basis, is probably more correctly considered as audit or quality assurance.

Generally speaking, such routine data can rarely be used for research purposes, due to difficulties in maintaining standards in data collection. An exception would be a long-term case register containing data on all patients in a given area gathered in a strictly controlled and objective fashion.

While most standard statistical methods are potentially applicable in health services research, some are more useful than others. This is because health services research is often relatively complex, involving as it does the analysis of different interventions, outcomes and levels of data simultaneously. Because of this complexity, and also sometimes because of ethical issues, relatively unusual experimental or pseudoexperimental designs such as stepped wedge designs, preference trials and randomised consent designs are available in addition to the more standard experimental designs, such as individually or cluster randomised CLINICAL TRIALS. Observational studies, such as CROSS-SECTIONAL STUDIES, COHORT STUDIES and CASE-CONTROL STUDIES, may be more appropriate or indeed the only feasible option for studying health services in naturalistic settings. For further information on various approaches, see the MRC guidance on complex interventions, which has been revised and updated from 2000 (Craig et al., 2008).

A contrast between the typical healthcare trial and the typical pharmacological trial is that the latter usually focuses on the outcome for the individual patient and assesses some particular therapeutic intervention such as a drug, a surgical procedure or a psychological intervention. The remit of a typical healthcare trial, contrariwise, tends to be broader and more complex since it often involves the evaluation of one or more interventions, the environment in which they take place and the personnel administering them. The outcomes may be measured at the patient level but they may also be measured at other levels, such as the ward or the hospital, or indeed at several of these levels simultaneously. Three nested levels here might be patient, ward and hospital, and these would all need to be taken into account in a MULTILEVEL MODEL.

In a discussion aimed specifically at psychiatrists, but which is nevertheless generally applicable, Dunn (2001) draws attention to some of the problems inherent in health service trials. One of these is the HAWTHORNE EFFECT, in which there is a nonspecific or PLACEBO effect that is not directly associated with the specific content of the intervention but is rather due to the mere fact of participation in the study. Dunn points out that, in health service research, providers as well as patients may be subject to such an effect. Avoiding it is often more difficult in healthcare trials compared to clinical trials because blinding of participants may be impractical or unethical.

The definition of outcome is often quite problematic in health services research since interventions may potentially produce multiple and conflicting changes in several dimensions. Important statistical issues in this area are thus dealing with multiple significance testing and combining outcomes into summary statistics. Economic analysis, aimed at balancing the effectiveness of outcomes against the cost of providing interventions, is commonly performed in health services research (see COST-EFFECTIVENESS ANALYSIS). One issue that has to be addressed in this context is whether service use information, such as number of hospital admissions, should be regarded as an outcome in its own right or whether it should be considered purely on the cost side of the equation. The views of clinicians and health economists may differ on this point.

Often outcomes are concerned with such concepts as patient satisfaction or QUALITY OF LIFE (Fayers and Machin, 2007), which may be difficult to define and capture. Even once they have been defined conceptually, outcomes are not always straightforward to measure and often involve the use of QUESTIONNAIRES. The latter may be prone to test-retest imprecision, due to a subject's inconsistency or, indeed, genuine changes from one time point to the next, or disagreement between raters (in cases where the questionnaires are administered and interpreted by someone other than the subject). Methods for assessing MEASUREMENT ERROR are thus important in health services research. The analysis of the psychometric properties of instruments, such as their reliability and validity (see Streiner and Norman, 2008), may be necessary where instruments have been developed especially for a study.

The treatment of MISSING DATA, and data quality in general, is also a relatively common issue arising in health services research. This is because there is generally less control over data collection in the community or a hospital, as opposed to an experimental laboratory or dedicated clinic. The standard CONSORT STATEMENT may need to be adapted (see Boutron et al., 2008) for nonclinical outcomes.
Sometimes the focus in health services research is on aggregate data from high-level units such as hospitals or health authorities. For example, methods for comparing the performance of health providers in league tables may be required. Goldstein and Spiegelhalter (1996) discuss some of the issues arising from the comparison of institutional performance. Methods for analysing spatial statistics are used when the geographical location of the units is also important and such methods may be integrated with a geographical information system (GIS): this is a specialised form of database that holds complex geographical data so as to allow it to be visualised. Such methods may be aimed at identifying outlying disease clusters, examining the impact of area-wide interventions or measuring health inequalities and relating them to other area-wide data such as social deprivation. ML

[See also ECOLOGICAL STUDIES]

Boutron, I., Moher, D., Altman, D., Schulz, K. and Ravaud, P. 2008: Extending the CONSORT statement to randomized trials of non-pharmacologic treatment: explanation and elaboration. Annals of Internal Medicine 148, 295-309.
Bowling, A. 2002: Research methods in health, 2nd edition. Buckingham: Open University Press.
Craig, P., Dieppe, P., Macintyre, S., Michie, S., Nazareth, I. and Petticrew, M. 2008: Developing and evaluating complex interventions: the new Medical Research Council guidance. British Medical Journal 337, a1655.
Dunn, G. 2001: Statistical methods for measuring outcomes. In Thornicroft, G. and Tansella, M. (eds), Mental health outcome measures, 2nd edition. New York: Springer Verlag, pp. 5-18.
Fayers, P. M. and Machin, D. 2007: Quality of life: the assessment, analysis and interpretation of patient-reported outcomes, 2nd edition. New York: Wiley & Sons, Inc.
Goldstein, H. and Spiegelhalter, D. J. 1996: League tables and their limitations: statistical issues in comparisons of institutional performance. Journal of the Royal Statistical Society, Series A 159, 385-443.
Streiner, D. L. and Norman, G. R. 2008: Health measurement scales: a practical guide to their development and use, 4th edition. Oxford: Oxford University Press.


heritability In the broad sense, heritability is the proportion of the variance of a given trait that is explained by genetic differences in a population. In the narrow sense, genetic differences are restricted to those due to the additive effects of alleles. Heritability is a key concept in population genetics introduced by Sir R. A. Fisher, in close connection with his work on the ANALYSIS OF VARIANCE. Nonadditive genetic influences, which include interactions between alleles at the same locus (dominance) or at different loci (epistasis), are included in broad but not narrow heritability.

In humans, heritability is usually estimated by twin or adoption studies (see TWIN ANALYSIS). The classical twin design relies on the fact that monozygotic (MZ) twins are developed from the same fertilised egg and are therefore genetically identical, whereas dizygotic (DZ) twins are like ordinary brothers and sisters in being developed from two separate fertilised ova and therefore share on average 50% of their genes. Given this fact, and under some additional assumptions (including the equality of the environmental similarity between MZ and DZ twins and the absence of dominance and epistasis), a simple estimate of heritability is given by twice the difference between the intraclass MZ and DZ correlations for the trait. This simple method of estimation for the heritability is known as Falconer's formula.

Adoption studies work under the assumption that any correlation between an adoptee and his or her biological family is entirely genetic in origin. In the absence of epistasis, twice the correlation between adoptee and biological parent provides an estimate of narrow heritability. Similarly, the intraclass correlation for MZ twins reared apart provides an estimate for the broad heritability.
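The two simple estimators described above amount to one line of arithmetic each. The sketch below, in Python, is illustrative only: the function names are invented for this example and the correlations are hypothetical, not taken from any study. It inherits all the assumptions listed in the text (equal environmental similarity for MZ and DZ twins, no dominance or epistasis).

```python
def falconer_heritability(r_mz, r_dz):
    """Falconer's formula: twice the difference between the intraclass
    correlations for MZ and DZ twin pairs."""
    return 2.0 * (r_mz - r_dz)


def adoption_narrow_heritability(r_adoptee_parent):
    """Narrow heritability from an adoption design: twice the correlation
    between adoptee and biological parent (assumes no epistasis)."""
    return 2.0 * r_adoptee_parent


# Hypothetical correlations, assumed purely for illustration.
h2_twin = falconer_heritability(r_mz=0.80, r_dz=0.50)
h2_adoption = adoption_narrow_heritability(r_adoptee_parent=0.25)
print(round(h2_twin, 2), round(h2_adoption, 2))
```

Because both are differences or multiples of sample correlations, estimates outside the range 0 to 1 can occur in practice and would signal sampling error or a violated assumption rather than a meaningful heritability.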

A high heritability is sometimes misinterpreted as meaning that the trait is unlikely to respond to environmental changes. Heritability reflects on the genetic and environmental differences that exist in a particular population; it cannot be used to predict the consequences of environmental changes outside the normal range for the population. A familiar example is that the mental retardation that is invariably associated with the genetic condition phenylketonuria in a natural population can be prevented by the introduction of a low-phenylalanine diet in early infancy. PS
[See also GENETIC EPIDEMIOLOGY, GENETIC LINKAGE, QUANTITATIVE TRAIT LOCI]

hierarchical models See LOG-LINEAR MODELS

high-dimensional data This is a term used for datasets that are characterised by a very large number of variables and a much more modest number of observations. In the 21st century such datasets are collected in many areas, e.g. text/web data mining (see DATA MINING IN MEDICINE) and BIOINFORMATICS. The task of extracting meaningful statistical and biological information from such datasets presents many challenges for which a number of recent methodological developments may be helpful; for details see, for example, Francois (2008). SSE

Francois, D. 2008: High-dimensional data analysis: from optimal metrics to feature selection. VDM Verlag.

histogram This is a graphical representation of a frequency distribution in which each class interval is represented by a vertical bar whose base is the class interval and whose height is the number of observations in the class interval. When the class intervals are unequally spaced the histogram is drawn in such a way that the area of each bar is proportional to the frequency for that class interval. Scott (1979) considers how to choose the optimal number and width of classes in a histogram, for these are matters of choice. Two examples of histograms are shown in the figure. The histogram is generally used for two purposes, counting and displaying the distribution of a variable, although it is relatively ineffective for both, with stem-and-leaf plots being better for counting and boxplots better for assessing distributional properties. SSE

histogram (a) Heights (cm) of elderly women; (b) survival times (days) of patients with leukaemia [figure not reproduced]

Scott, D. W. 1979: On optimal and data-based histograms. Biometrika 66, 605-10.

historical controls This refers to the use of past data for the purpose of making comparisons with present data in a research context. Unfortunately, despite the appeal of desiring to make efficient use of previously collected data sources, with information stored perhaps on a computer database, the use of historical controls is fraught with BIAS (Pocock, 1983). One cannot make reliable inferences in controlled CLINICAL TRIALS by comparing new data with old. The main reason why bias would be introduced is the lack of comparability at baseline between the two groups. Only concurrent RANDOMISATION of eligible participants can bestow such between-group comparability, since randomising alone can seek to ensure treatment groups are balanced with respect to all the known and (innumerable) unknown risk factors. CRP

Pocock, S. J. 1983: Clinical trials: a practical approach. Chichester: John Wiley & Sons, Ltd.

historical demography See DEMOGRAPHY

history of medical statistics

The lint atlempts at °mc:clical statistics' might perhaps be CGDsiden:d the early elTOIts 10 keep track of bil1hs and deaths Ihraulh chun:h n:cords or wc:ddi11l1. christenings and burials. However. mon: ambitious statistical procedun:s than simple counting would have been IlIIIel)' unwclcame to physicians until well inlo the 17th centur)' simpl), lxcausc the)' might ha~ raised Ibe unlhinkable speeR of questioning Ibe invulnerability mast or lhem still claimed. Medical practices at the lime wen: largel), based on uncritical n:liancc on past experience.. po.sl 11«. ergo propler hot: reasonil1l. and veneratiaa or the "1I'1I1h' as pnlClaimc:d by authoritative ftgun:s such as Galen (130-200), a On:cl ph)'sician whose inftuCDCC dominatal medicine: far many centuries. Such atliludcs largel)' SliW an)' intcn:&l in expcrimcntatiaa or proper scicatific investigation or explanation of medical phenomena. Even the rew cliDicians who did Slrive to increase their knowlc:dge by close abscnDlion or simple experimcat oRca illlelplCtcd their IIDdil1lS in the Iighl of the cum:nlly acceptccI dopua. Sevc:nlauthors have painted out whaI must qualify as the waders earliest n=c:cxdccI camparalive trial. Described in the biblical book or Daniel. Ju:ac,e cin=a 600 B~ Daniel and three colleques ellpn:sscd their prererence not to be given road thai had been ~ conlnr)' to their beliefs. Their slUdy involwd a prior hypothesis and primary ENDPOINT. albeit rather subjcclivc (facial appearance). and the trial chnlion was limited 10 just 10 cIa)'s. 11ac control IJ'DIIP. which n:c:eived thc SlaIIdard r~ was an unkDown size. but. clearl)" the IR:atmCDt poup, which n:c:eived \ICIetables and water onl)', was small. at just rour. 11Ie study turned out posilively rar Daniel. Despite modcm-cIay criticism. notably lack of 1WIlOMISA11C»I. 110 ane could critic:isc Daniel far his inftucatial choice or publicatiaa (see Daniel I: 1-16.. Holy Bible). 
By the late 17th and early 18th centuries, medicine began its slow progress from a sort of mystical certainty to a scientifically more acceptable uncertainty about many of its procedures. The taking of systematic observations and carrying out of experiments became more widespread. John Graunt (1620-1674), son of a London draper, for example, published his Natural and political observations made upon the bills of mortality in 1662 and derived the first ever life table. Graunt was what might today be termed a vital statistician: he examined the risk inherent in the process of birth, marriage and death and used bills of mortality (weekly reports on the numbers and causes of death in an area) to compare one disease with another and one year with another by calculating mortality statistics. Graunt's work and ideas had considerable influence and bills of mortality were also introduced in Paris and other cities in Europe. Early experimental work in medicine is illustrated by the often quoted example of James Lind's (1716-1794) study undertaken on board the ship the Salisbury in 1747. Lind assessed several different possible treatments for scurvy by giving each to a different pair of sailors with the disease. He observed that the two men given oranges and lemons made the most dramatic recovery, although it was to be another 40 years before the Admiralty was convinced enough by Lind's finding to issue lemon juice to members of the British Navy. The 1700s also saw the first appearance of a procedure that looks remarkably similar to a modern-day SIGNIFICANCE TEST, specifically a SIGN TEST. This arose from John Arbuthnot's (1667-1735) endeavours to argue the case for Divine Providence in the stability of the ratio of the number of men to women. Arbuthnot maintained that the guiding hand of a divine being was to be discerned in the nearly constant ratio of male to female christenings recorded annually in London over the years 1629-1710. The data presented by Arbuthnot (1710) showed that in each of the 82 years in this period, the annual number of male christenings had been consistently higher than the number of female christenings, but never very much higher. He then essentially tested a null hypothesis of 'chance' determination of sex at birth, against an alternative of Divine Providence, by calculating, under the assumption that the null hypothesis is true, a PROBABILITY defined by reference to the observed data. Arbuthnot's representation of chance in this context was the toss of a fair two-sided coin, in which case the distribution of births would be $(1/2 + 1/2)^{82}$, so that the observed excess of male christenings on each of 82 occasions had an extremely small probability, $(1/2)^{82}$ or about $2 \times 10^{-25}$, thus providing support for the Divine Providence hypothesis. Arbuthnot offered an explanation for the greater supply of males as a wise economy of nature, as the males were more subject to accidents and diseases, having to seek their food with danger. Therefore, provident nature, to repair the loss, brings forth more males. The near equality of the sexes is designed so that every male may have a female in the same country and of suitable age. Other mathematical developments in the 18th century that were of special relevance for medical statistics included


Daniel Bernoulli's (1700-1782) development of the normal approximation to the BINOMIAL DISTRIBUTION, which was also used in studies of the stability of the sex ratio at birth. The use of medical statistics in pursuing reform is illustrated by the work of Florence Nightingale (1820-1910). In her efforts to improve the squalid hospital conditions in Turkey during the Crimean War, and in her subsequent campaigns to improve the health and living conditions of the British Army, the sanitary conditions and administration of hospitals and the nursing profession, Florence Nightingale was not unlike many other Victorian reformers. However, in one important respect she was very different, since she marshalled massive amounts of data, carefully arranged, tabulated and graphed, and presented this material to ministers, viceroys and others, to convince them of the justice of her case. No other major national cause had previously been championed through the presentation of sound statistical data and those who opposed Florence Nightingale's reforms went down to defeat because her data were unanswerable; their publication led to an outcry. Another telling example of how careful arrangement of data was used in the 19th century to save lives is provided by the work of the epidemiologist John Snow (1813-1858). After an outbreak of cholera in central London in September 1854, Snow used data collected by the General Register Office and plotted the location of deaths on a map of the area and also showed the location of the area's 11 water pumps. The resulting map is shown in the figure. Examining the scatter over the surface of the map, Snow observed that nearly all the cholera deaths were among those who lived near the Broad Street pump. However, before claiming that he had discovered a possible causal connection, Snow made a more detailed investigation of the deaths that had occurred near some other pumps.
He visited the families of 10 of the deceased and found that four of these, because they preferred its taste, regularly sent for water from the Broad Street pump. Three others were children who attended a school near the Broad Street pump. One other finding that initially confused Snow was that there were no deaths among workers in a brewery close to the Broad Street pump, a confusion that was quickly resolved when it became apparent that the workers drank only beer, never water. Snow's findings were sufficiently compelling to persuade the authorities to remove the handle of the Broad Street pump and, in days, the neighbourhood epidemic that had claimed more than 500 lives had ended. Later in the 19th century and in the early 20th century, the work of people such as Sir Francis Galton (1822-1911), Wilhelm Lexis (1837-1914) and, in particular, Karl Pearson (1857-1936) began to change the emphasis in statistics from the descriptive to the mathematical. The concept of CORRELATION and its measurement by a correlation coefficient was introduced. Statistical inference began to develop and enter

history of medical statistics Snow's map of cholera deaths in the Broad Street area

areas of scientific investigation, including medical research. In 1909 Ronald Aylmer Fisher (later Sir Ronald) (1890-1962) entered Cambridge to study mathematics, the first step to becoming the most influential statistician of the 20th century. Fisher developed MAXIMUM LIKELIHOOD ESTIMATION, worked on evolutionary theory, made massive contributions to genetics and invented the ANALYSIS OF VARIANCE. However, Fisher's most important contribution to medical statistics was his introduction of randomisation as a principle in the design of certain experiments. In Fisher's case the experiments were in agriculture and were concerned with which fertilisers led to the greatest crop yields. Fisher divided agricultural areas into plots and randomly assigned the plots to different experimental fertilisers. The principle was soon adopted in medicine in studies to compare competing therapies for a particular condition, leading, of course, to the randomised CLINICAL TRIAL (RCT), described by eminent British statistician Sir David Cox as 'the most important


contribution of 20th-century statistics'. The first properly performed randomised clinical trial is now generally acknowledged to be that published in 1948 by another giant of 20th century medical statistics, Sir Austin Bradford Hill (1897-1991), who investigated the use of streptomycin in the treatment of pulmonary tuberculosis. Nowadays, it is estimated that over 8000 RCTs are undertaken worldwide every year. At about the time that Bradford Hill was busy with the first randomised clinical trial, another development was taking place, which, by revolutionising man's ability to calculate, was to have a dramatic effect on the science of statistics and the work of statisticians. The computer age was about to begin, although it would be some years before statisticians were entirely relieved of the burden of undertaking large amounts of laborious arithmetic on some pre-computer calculator. However, in the 1960s, the first statistical software packages began to appear, which made the application of many complex statistical procedures easy and routine. The influence of increasing, inexpensive computing power on statistics continues to this day and over the last 20 years its almost universal availability has meant that research workers in statistics in general, and medical statistics in particular, no longer have to keep one eye on the computational difficulties when developing new methods of analysis. The result has been the introduction of many exciting and powerful new statistical methods, many of which are of great importance in medical statistics. Notable examples, to name but a few, are BOOTSTRAP, COX'S REGRESSION, GENERALISED ESTIMATING EQUATIONS, LOGISTIC REGRESSION and MULTIPLE IMPUTATION. In addition, BAYESIAN METHODS, at one time little more than an intellectual curiosity without practical implications because of their associated computational requirements, can now be applied relatively routinely. Many interesting examples are described in Congdon (2001). There seems little doubt that the remarkable success of medical statistics will continue into the 21st century. BSE [See also DEMOGRAPHY, EPIDEMIOLOGY]

Arbuthnot, J. 1710: An argument for Divine Providence, taken from the constant regularity observ'd in the births of both sexes. Philosophical Transactions of the Royal Society 27, 186-90. Congdon, P. 2001: Bayesian statistical modelling. Chichester: John Wiley & Sons, Ltd.

hotspot clustering See DISEASE CLUSTERING

Huber-White estimate

See CLUSTER RANDOMISED TRIALS

human research ethics board (HREB) See ETHICAL REVIEW COMMITTEES

hypothesis tests The testing of hypotheses is fundamental to statistics and arguments about appropriate ways to test hypotheses date back to disputes between the founders of statistical inference, during the first half of the 20th century. R. A. Fisher proposed SIGNIFICANCE TESTS as a means of examining the discrepancy between the data and a null hypothesis (e.g. the null hypothesis that there is no association between two variables). The P-VALUE (significance level) is the PROBABILITY that an association as large or larger than that observed in the data would occur if the null hypothesis were true. In Fisher's approach the null hypothesis is never proved or established, but is possibly disproved. Fisher advocated P = 0.05 (5% significance) as a standard level for concluding that there is evidence against the hypothesis tested, although not as an absolute rule:

If P is between .1 and .9 there is certainly no reason to suspect the hypothesis tested. If it is below .02 it is strongly indicated that the hypothesis fails to account for the whole of the facts. We shall not often be astray if we draw a conventional line at .05 (Fisher, 1950). In fact no scientific worker has a fixed level of significance at which from year to year, and in all circumstances, he rejects hypotheses; he rather gives his mind to each particular case in the light of his evidence and his ideas (Fisher, 1973).

For Fisher, interpretation of the P-value was ultimately for the experimenter: e.g. a P-value of around 0.05 might lead neither to belief nor disbelief in the null hypothesis, but to a decision to perform another experiment. To some extent, use of thresholds for significance resulted from the reduction in the size of statistical tables when only the quantiles of distributions (such as 0.1, 0.05 and 0.01) were tabulated. Dislike of the subjective interpretation inherent in Fisher's approach led Neyman and Pearson (1933) to propose what they called hypothesis tests, which were designed to provide an objective, decision-theoretic approach to the results of experiments. Instead of focusing on evidence against a null hypothesis, Neyman and Pearson considered how to decide between two competing hypotheses, the null hypothesis and a specified alternative hypothesis. For example, the null hypothesis might state that the difference between the means of two normally distributed variables is zero, while the alternative hypothesis might state that this difference is 10. Based on this paradigm, Neyman and Pearson argued that there are two types of error that could be made in interpreting the results of an experiment (see ERRORS IN HYPOTHESIS TESTS). We make a TYPE I ERROR if we reject the null hypothesis when it is, in fact, true, while we make a TYPE II ERROR if we accept the null hypothesis when it is, in fact, false. Neyman and Pearson then showed how to find optimal rules that would, in the long run, minimise the probabilities (the Type I and Type II error


rates) of making these errors over a series of many experiments. The Type I error rate, usually denoted as $\alpha$, is closely related to the P-value since if, for example, the Type I error rate is fixed at 5%, then we will reject the null hypothesis when P < 0.05. The Type II error rate is usually denoted as $\beta$ and the power of the test (the probability that we do not make a Type II error if the alternative hypothesis is true) is $1 - \beta$. Based on these ideas, Neyman and Pearson were able to derive tests that were 'best' in the sense that they minimised the Type II error rate, given a particular Type I error rate. It is important to realise that in this paradigm we do not attempt to infer whether the null hypothesis is true:

No test based upon a theory of probability can by itself provide any valuable evidence of the truth or falsehood of a hypothesis. But we may look at the purpose of tests from another viewpoint. Without hoping to know whether each separate hypothesis is true or false, we may search for rules to govern our behaviour with regard to them, in following which we insure that, in the long run of experience, we shall not often be wrong (Neyman and Pearson, 1933).

To illustrate the differences between the two approaches, consider the hypothetical controlled trial of a new cholesterol-lowering drug, with results (mean post-treatment cholesterol) summarised in the table.

hypothesis tests Results of a hypothetical controlled trial of a new cholesterol-lowering drug

Group       Number of participants   Mean cholesterol (mg/dl)   Standard deviation
New drug    15                       205                        25
Placebo     15                       220                        25
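The comparison discussed in this entry can be checked from summary statistics alone. The sketch below is an illustration, not part of the original analysis. It assumes group means of 205 mg/dl (new drug) and 220 mg/dl (placebo), consistent with the 15 mg/dl reduction described in the text, and a common standard deviation of 25 mg/dl. The function names `t_pdf` and `two_sided_p` are hypothetical, and the critical value 2.048 is taken from standard t tables for 28 degrees of freedom.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided P-value, integrating the density on [0, |t|] by Simpson's rule."""
    upper = abs(t)
    h = upper / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3            # P(-|t| < T < |t|)
    return 1 - central

# Summary statistics (assumed values, see the lead-in).
n = 15                                     # participants per group
mean_drug, mean_placebo, sd = 205.0, 220.0, 25.0

diff = mean_placebo - mean_drug            # observed reduction in mean cholesterol
se = sd * math.sqrt(2 / n)                 # standard error with equal n and equal SD
df = 2 * n - 2
t_stat = diff / se
p_value = two_sided_p(t_stat, df)

T_CRIT_28 = 2.048                          # tabulated 97.5th percentile of t, 28 df
ci = (diff - T_CRIT_28 * se, diff + T_CRIT_28 * se)

print(f"t = {t_stat:.3f} on {df} df, P = {p_value:.2f}")
print(f"95% CI for the reduction: {ci[0]:.1f} to {ci[1]:.1f} mg/dl")
```

Under these assumptions the P-value and interval should agree, to the precision quoted, with the figures reported in this entry.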

Mean cholesterol has been reduced by 15 mg/dl; a reduction of this magnitude might lead to a substantial reduction in the risk of heart disease. An unpaired t-test gives P = 0.11. Based on Fisher's approach, the null hypothesis has not been disproved. However, a thoughtful investigator might, rather than discarding the drug, proceed to conduct a larger trial. Application of the Neyman-Pearson approach requires the specification of both Type I and Type II error rates in advance, so we must specify a precise alternative hypothesis, e.g. that the mean reduction is 10 mg/dl. An investigator attempting to follow the Neyman-Pearson approach would need to report not only that the test was not significant at the 5% level (Type I error rate 5%) but also the prespecified Type II error rate. However, the power of a study with 15 patients per group to detect a difference of 10 mg/dl is only 19.5%; for a study that is too small, such as this one, there is no choice of Type I and Type II error rates that is satisfactory. Had we done a POWER calculation on the basis that we wished to detect a difference of 10 mg/dl with 80% power at 5% significance, we would have found that we require a much larger study, with 99 patients in each group. The use of power calculations to ensure that studies are large enough to detect associations of interest is an enduring legacy of Neyman and Pearson's work. Now that most statistical computer packages report precise P-values, there seems little justification in reporting the results of our drug trial as P > 0.05, P > 0.1 or 'NS' (nonsignificant) unless one is following a pre-specified choice of both Type I and Type II error rates. This is rarely the case: even in randomised trials we will usually investigate a number of hypotheses beyond the primary one for which the trial was designed. Therefore, in modern medical statistics, it is usual to report the precise P-value, together with the estimated difference and the CONFIDENCE INTERVAL for the difference. For example, for our hypothetical trial we could report that the MEAN reduction in cholesterol was 15 mg/dl (95% CI -3.7 mg/dl to 33.7 mg/dl, P = 0.11). When we examine the confidence interval we see that the results are consistent either with a substantial and clinically important reduction in mean cholesterol or with a modest increase. Examining the confidence interval should help us avoid the common error of equating 'nonsignificance' with acceptance of the null hypothesis that the drug has no effect, regardless of the power of the study to detect differences of interest. A number of books and articles discuss in more detail the testing of hypotheses, the arguments between the Fisher and Neyman-Pearson schools of inference and the case for Bayesian reasoning as an alternative (e.g. Cox, 1982; Oakes, 1986; Lehmann, 1993; Goodman, 1999a, 1999b; Sterne and Davey Smith, 2001). JS
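The power and sample size figures quoted in this entry can be approximated with a normal-theory calculation. The following sketch is not the authors' own calculation. It assumes a two-sided test at the 5% level, a common standard deviation of 25 mg/dl and equal group sizes; the helper names `power_two_sample` and `n_per_group` are hypothetical.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def power_two_sample(delta, sd, n, alpha=0.05):
    """Approximate power to detect a true mean difference delta with n per group."""
    se = sd * math.sqrt(2 / n)             # standard error of the difference in means
    z_crit = Z.inv_cdf(1 - alpha / 2)      # two-sided critical value
    shift = delta / se                     # standardised true difference
    # Probability that the test statistic falls in either rejection region.
    return Z.cdf(shift - z_crit) + Z.cdf(-shift - z_crit)

def n_per_group(delta, sd, power=0.80, alpha=0.05):
    """Approximate number of participants needed in each group."""
    z_crit = Z.inv_cdf(1 - alpha / 2)
    z_power = Z.inv_cdf(power)
    return math.ceil(2 * (z_crit + z_power) ** 2 * sd ** 2 / delta ** 2)

power_15 = power_two_sample(delta=10, sd=25, n=15)
n_required = n_per_group(delta=10, sd=25, power=0.80)

print(f"Power with 15 per group: {power_15:.1%}")
print(f"Participants needed per group for 80% power: {n_required}")
```

Because the normal approximation ignores the extra uncertainty from estimating the standard deviation, an exact calculation based on the t distribution may give slightly different numbers.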

Cox, D. R. 1982: Statistical significance tests. British Journal of Clinical Pharmacology 14, 325-31. Fisher, R. A. 1950: Statistical methods for research workers. London: Oliver and Boyd. Fisher, R. A. 1973: Statistical methods and scientific inference. London: Collins Macmillan. Goodman, S. N. 1999a: Toward evidence-based medical statistics. 1: the P-value fallacy. Annals of Internal Medicine 130, 995-1004. Goodman, S. N. 1999b: Toward evidence-based medical statistics. 2: the Bayes factor. Annals of Internal Medicine 130, 1005-13. Lehmann, E. L. 1993: The Fisher, Neyman-Pearson theories of testing hypotheses: one theory or two? Journal of the American Statistical Association 88, 1242-9. Neyman, J. and Pearson, E. 1933: On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society, Series A 231, 289-337. Oakes, M. 1986: Statistical inference. Chichester: John Wiley & Sons, Ltd. Sterne, J. A. C. and Davey Smith, G. 2001: Sifting the evidence - what's wrong with significance tests? British Medical Journal 322, 226-31.

I

ICC Abbreviation for INTRACLUSTER CORRELATION COEFFICIENT

ICER Abbreviation for INCREMENTAL COST-EFFECTIVENESS RATIO. See COST-EFFECTIVENESS ANALYSIS

Immune proportion

This proportion indicates individuals who may not be subject to death, failure, relapse, etc., in a sample of censored survival times. The presence of such individuals may be indicated by a relatively high number of individuals with large censored survival times. Finite mixture distributions can be used to investigate such data. Specifically, the population is assumed to consist of two components. The first, which is present in proportion p, say, contains those individuals who are susceptible to some event of interest (death, relapse, etc.) and have, say, an exponential distribution for the time to the occurrence of the event. These individuals are subject to right censoring. The remaining proportion, 1 - p, of the population is assumed to be immune to, or cured of, the disease and for these individuals the event never happens. Consequently, observations on their survival times are always censored at the limit of follow-up. An important aspect of such analysis is to consider whether or not an immune proportion does in fact exist in the population (see, for example, Maller and Zhou, 1995). BSE [See also CURE MODELS]

Maller, R. A. and Zhou, S. 1995: Testing for the presence of immune or cured individuals in censored survival data. Biometrics 51, 1197-205.

Imputation

See MULTIPLE IMPUTATION

Incidence

The incidence of a disease is the number of new cases of the disease occurring within a specified period of time in a defined population. A time period of 1 year is most commonly used, but any appropriate length of time can be substituted. It is generally presented as a rate. Thus:

\[
\text{Incidence rate} = \frac{\text{Number of new cases of the disease in one year}}{\text{Number in the population at risk}}
\]

This assumes that the size of the study population remains constant over the time period for which the rate is calculated. Small increases or decreases in population size over a year, for example, can be dealt with by using the mid-year population as the denominator for the incidence rate. This results in a number between 0 and 1, but for ease of presentation it is often expressed as a rate per 1000, per 100 000 or per 1 000 000 depending on the disease rarity. As an example, the incidence rate of colorectal cancer in males aged 60-64 in Scotland was 159 per 100 000 in the year 2006 compared to 206 per 100 000 in the year 2000 (NHS National Services Scotland: Information Services Division, www.isdscotland.org). Thus incidence rates can be used to measure risk and compare risks across time or between different populations. This definition is rather simplistic because it ignores the fact that when new cases of the disease occur, the subject is no longer at risk and should ideally be removed from the denominator. It is also unsatisfactory for dealing with data from LONGITUDINAL STUDIES in which subjects may be followed up for varying lengths of time. For these studies the incidence rate can be defined as:

\[
\text{Incidence rate} = \frac{\text{Number of new cases of the disease in the defined population}}{\text{Total length of time for which individuals have been followed up}}
\]
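As a small numerical illustration of the two definitions given in this entry, the sketch below computes an incidence rate with a population denominator and with a person-time denominator. The counts, follow-up times and function names are invented for the example and do not come from any real study.

```python
def incidence_rate(new_cases, population_at_risk, per=100_000):
    """New cases in the period divided by the population at risk, scaled to 'per'."""
    return new_cases * per / population_at_risk

def person_time_incidence_rate(new_cases, follow_up_years, per=1_000):
    """New cases divided by total person-years of follow-up, scaled to 'per'."""
    total_person_years = sum(follow_up_years)
    return new_cases * per / total_person_years

# Hypothetical one-year count: 48 new cases, mid-year population of 30 000.
annual_rate = incidence_rate(new_cases=48, population_at_risk=30_000)

# Hypothetical longitudinal study: six subjects followed for varying lengths
# of time (in years), two of whom developed the disease.
follow_up = [2.0, 3.5, 1.5, 5.0, 4.0, 4.0]
py_rate = person_time_incidence_rate(new_cases=2, follow_up_years=follow_up)

print(f"{annual_rate:.0f} per 100 000 per year")
print(f"{py_rate:.0f} per 1000 person-years")
```

In the person-time version, each subject contributes only the time for which he or she was actually observed, which is what allows follow-up of varying length.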

The denominator gives the number of person-years of observation. Incidence rates defined in this way are often expressed as rates per 100 or per 1000 person-years of observation. (A more detailed discussion of incidence and incidence rates is given in Rothman, Greenland and Lash, 2008.) Care should be taken to distinguish between incidence and PREVALENCE. Although the definitions appear similar at first sight, they are used for different purposes and it is essential to distinguish between them correctly. Further details can be found in Woodward (2004). WHG

Rothman, K. J., Greenland, S. and Lash, T. L. 2008: Modern epidemiology, 3rd edition. Philadelphia: Lippincott, Williams and Wilkins. Woodward, M. 2004: Epidemiology: study design and data analysis, 2nd edition. Boca Raton: Chapman & Hall.

Inclusion and exclusion criteria

These criteria operationalise the choice of study group, a choice that lies at the heart of the design of, and inference from, CLINICAL TRIALS. 'Inclusion' criteria define the population of interest; 'exclusion' criteria remove people for whom the study treatment is contraindicated or unlikely to be effective. Collectively, inclusion criteria and exclusion criteria comprise the entry criteria or eligibility criteria. Biological plausibility, the internal validity of the study, the epidemiological basis for generalisability and statistical power all play parts in selecting entry criteria and in making recommendations from the results of the trial. The selection of those to be enrolled in a trial often reflects a deliberate attempt to


select a study population homogeneous enough to allow a true treatment effect to become manifest, yet heterogeneous enough to permit reliable generalisation to a broader population. Clinical trials necessarily study people with more homogeneous characteristics than the patients to whom clinicians will apply the results. Strict representativeness is relevant to the generalisability of clinical trials but is not essential to inference from them. In randomised studies, the logical basis for drawing conclusions lies in the act of RANDOMISATION. The process of concluding that the effect seen in a clinical trial will apply to another population is informal and subjective (Cowan and Wittes, 1994). Homogeneity of the study population differs from homogeneity of the treatment effect. The former refers to a study group's sharing similar characteristics; the latter refers to an effect of treatment whose expected magnitude and direction would lead to the same recommendation for use or nonuse in identifiable subgroups. If a therapy affects a wide group of people quite similarly, then either a homogeneous or heterogeneous study group will provide similar answers regarding the magnitude of treatment effect. An ideal study group would consist of a cohort for whom the treatment is effective and corresponding to whom is an identifiable population that will be treated. Defining such a study group before the trial is usually difficult. Available data are rarely sufficiently reliable to provide serious guidance about whom to include. Early-phase studies typically define narrow entry criteria to establish preliminary safety or to demonstrate proof of concept (see PHASE I TRIALS, PHASE II TRIALS). Such trials often exclude children, pregnant and nursing women, the frail elderly and other vulnerable populations. Later phase trials with narrow entry criteria specify the type of patient likely to benefit most and then test whether the treatment works for them (see PHASE III TRIALS, PHASE IV TRIALS).
A study showing benefit in this narrow group of participants may lead to future trials with wider entry criteria. A treatment with important heterogeneity of effect requires a homogeneous study population. Trials with wide entry criteria address whether the treatment under study works on average when applied to potential users. Wide entry criteria simplify screening and recruitment. Enrolling a wide range of people is consistent with assuming homogeneity of effect while affording the investigator a tentative glimpse at the likelihood of the truth of that assumption. Biological plausibility should play a decisive role in selecting the range of people to enrol in a trial. Study entry criteria should aim to achieve heterogeneity when no convincing information at the start of the study suggests that sizeable differential effects are likely. As a heterogeneous study group leads to variation in the incidence of ENDPOINTS, increasing heterogeneity generally requires an increased sample size.

Defining entry criteria requires an operational definition of the disease in a treatment trial or a specification of who is at risk in a prevention trial. Allowing people with questionable diagnoses to enter a trial tends to attenuate the estimated treatment effect and hence decreases statistical power. Yet often the insistence on unequivocal documentation of diagnosis excludes many people who in fact would receive the treatment if the trial shows benefit (Yusuf, Held and Teo, 1994). Trials must exclude people known to have contraindications to the treatments under study or those who are particularly vulnerable. Similarly, trials of therapies already known to be effective or ineffective in certain groups should exclude those groups of patients. Some randomised trials use an 'uncertainty principle' to guide entry (see MEGA-TRIAL). 'A patient can be entered if, and only if, the responsible clinician is substantially uncertain which of the trial treatments would be most appropriate for that particular patient' (Peto and Baigent, 1998). Typical PROTOCOLS FOR CLINICAL TRIALS exclude people unlikely to finish a study or to adhere to the protocol. Many clinical trials have very few participants with some specific characteristics. A trial may exclude racial or ethnic groups, not because the entry criteria preclude their participation but because the clinics involved in the study do not have access to them. In summary, trial designers should ensure that each entry criterion represents a defensible limitation on the study group; however, the fact of inclusion does not usually provide much information about the effect of treatment in specific groups of people. The argument that only by including, say, women and minorities, can one legitimately apply the results of a trial needs to be tempered with the fact that a trial rarely gives enough information about specific groups to learn much about the effect of treatment for them.
When the trial is over, the results should usually be applied quite broadly, both to people whose demographic characteristics are similar and dissimilar to those in the trial; however, the medical community should maintain an intellectual stance open to suggestive data indicating differences. The situation is more complicated for groups of people defined by such medical or physiologic variables as diagnosis, severity, prognostic features, prior history or concomitant medications, for often apparently cogent biological reasons justify exclusions. Here too a critical questioning of the reasons for exclusion is warranted: in many cases very few data are available to support even strongly held views. Designers of clinical trials should construct entry criteria bearing in mind the purpose of the current trial, the available knowledge of the study treatments being tested, the likely studies that will follow the trial and how investigators, practising clinicians, patients and regulatory agencies will interpret the results in light of the entry criteria. JW

Cowan, C. D. and Wittes, J. 1994: Intercept studies, clinical trials, and cluster experiments: to whom can we extrapolate? Controlled Clinical Trials 15, 24-9. Peto, R. and Baigent, C. 1998: Trials: the next 50 years. British Medical Journal 317, 1170-1. Yusuf, S., Held, P. and Teo, K. K. 1994: Selection of patients for randomised controlled trials: implications of wide or narrow eligibility criteria. Statistics in Medicine 9, 73-86.

Incomplete block designs

See CROSSOVER TRIALS

Incremental cost-effectiveness ratio (ICER) See COST-EFFECTIVENESS ANALYSIS

Incubation period

Tbis is Ihe time inlcnal between

the acquisilion of inrCClioa and the appcaranc:e or sympkHDatic disease. Examples include the time bet\\'CCD exposun: to mdialiOll GI' to a chemical can:inogca aacI the oc:cummce or cancer and Ihe lime fram iDfc:cliOD with HaVand the: 0Dsel of AIDS.

The leagth orlbc incubation perioddc:pencls on thediseasc. nangalll fram days. rar instance. in Ihe case or malaria to a numba- or yam rar HIV. 11Ie incubatiaD periodlypically yaries from individual to individual and may depend Oft the close or the cliseasc-causinl &genl m:eiYed. Oiven this yariabiUly. it makes SCII5C to talk about incubation period disbibutiOft. The incubation period diSlributicm I'll) represcnls thepmbabilily thallhe leqlhofthe incubaaiaD period is less than or c:qualto I time unils. EslimlllioD and chlll'KlcriSlllioD of F(I) is impodant for a numba- or IaSORL For diseases with short ilK'ubalion periods. such as outbrcab. knowl. or the incubation period is esseDtiaI to the investigatiOll or the circumstances in which Ihe disease bas spRad. ID the case of cliscaues with loal incubation periods, such as HlV or CmllZfelcltJakob disease. infonnaliOll on RI) is a accessary input to Ihe estimation and projection of Ibc evolution of the epidemic (sec BACK-CAl.CtU1IDN). Finally. it is yel)' imporlanl to identify covarillles duat milht affCClIhe lellllla or the incubation pcriocI for an elTective clinical management or the palienl. The ideal setup to estimate the inaibation period distribution is a CCIIORf S11JDY wbc:R individuals 8R uniDfectc:cl at enrolment aad 8R followed up toobscrve bolla theoccllJRllCC of infection and the appearance or symplOmalic disease. Tbe Rsullilll obsc:n'8lioas will be right cellSDRd as every individual will have either clevelapc:d the: disea5C or bc:c:n censoml by the: end or Ihc: foHow-up period (see CBlSORED 0IISEJtVA110NS). Classical survi\..l analysis can be usc:d to eslimatc: FCl) both nonparamc:lrically, via KAIIlAN-MEID JILOTS. and parametrically. by fillinl paramc:lric mocIc:ls to the riPt~nsan:cI data. Usually. especially far diseases willa a 10111 iDcUbatiOll time. such cohort studies ~ diflicultto set up. EslimaliaD or the incubation period distribuliOD is Ihc:n


carried out either using information on individuals who have already developed symptoms or following up cohorts of individuals who are already infected, but have not yet developed the disease. In either case, biased results can be obtained if estimation does not properly account for the sampling criteria by which individuals are included in the study. DDA

Brookmeyer, R. 1998: Incubation period of infectious diseases. In Armitage, P. and Colton, T. (eds), Encyclopedia of biostatistics, Vol. 3, pp. 2011-16. Chichester: John Wiley & Sons, Ltd. Brookmeyer, R. and Gail, M. H. 1994: AIDS epidemiology: a quantitative approach. New York: Oxford University Press.

Indirect standardisation

See DEMOGRAPHY

Individual ethics

See ETHICS AND CLINICAL TRIALS

Infant mortality rate

See DEMOGRAPHY

Informative censoring/dropout

See CENSORED OBSERVATIONS, DROPOUT, MISSING DATA

Informative dropout

Synonym for NONIGNORABLE DROPOUT

Informed consent

See ETHICS AND CLINICAL TRIALS

Instantaneous death rate

See SURVIVAL ANALYSIS - AN OVERVIEW

Institutional review board (IRB)

See ETHICAL REVIEW COMMITTEES

Instrumental variables

A variable that is highly correlated with an explanatory variable but has no direct influence on the response variable (i.e. its effect is mediated by the explanatory variable). Consider a situation in which we can assume that a response variable, Y, is linearly related to an explanatory variable, X, as follows:

$Y = \alpha + \beta X + \varepsilon$    (1)

where $\varepsilon$ is a random deviation of a particular value of Y from that expected from its relationship with X. Typically, we wish to use a sample of (x, y) pairs of measurements in order to estimate the unknown values of the parameters, $\alpha$ and $\beta$. The familiar ORDINARY LEAST SQUARES (OLS) estimator of $\beta$ is equivalent to the ratio of the estimated COVARIANCE of X and Y to the estimated variance of X (this ratio is usually calculated by dividing the sum of the cross-products of the deviations of the X and Y values from their respective means by the sum of squared deviations of the X values from their mean). It is possible to demonstrate that such an estimate is unbiased for $\beta$ provided certain assumptions hold, the key one being that X and $\varepsilon$ are uncorrelated.


Now, if we have an omitted variable, C, correlated with X, and such that the true model is, in fact:

$Y = \alpha + \beta X + \gamma C + \eta$    (2)

where $\eta$ is the random deviation of Y from that explained by the model. If we still proceed with our naive OLS estimator as for equation (1) then we will obtain a biased estimate of $\beta$. This is a result of the fact that the correlation between X and $\varepsilon$ is no longer zero. This is an example of what econometricians call endogeneity (see Wooldridge, 2003). In epidemiology, the variable C is known as a confounder (in this case a hidden confounder). In such circumstances, how might we obtain a valid estimate of $\beta$? The obvious answer is to measure C and fit equation (2). Another approach (much more common in economics than in medical applications) is to find a variable that is strongly correlated with X, but uncorrelated with the residual, $\varepsilon$. Such a variable is called an instrumental variable (IV) or instrument, for short. Now let us consider a different circumstance. Suppose that the values of X are measured subject to error such that:

$X = \tau + \delta$    (3)

in which the $\delta$ values are random measurement errors with zero mean and assumed to be uncorrelated both with each other and with the true values, $\tau$. The relationship we are really interested in is the following:

$Y = \alpha + \beta\tau + \varepsilon$    (4)

How do we estimate $\beta$? Again, using OLS in a regression of Y against X would produce a biased result (see ATTENUATION DUE TO MEASUREMENT ERROR). This is another example of the endogeneity problem. A similar situation holds when we attempt the comparative calibration of two measurement methods, both subject to measurement errors (see METHOD COMPARISON STUDIES). If we were in the fortuitous position of knowing the VARIANCE of the measurement errors in X, or the reliability of X, we would be able to make appropriate corrections. Another approach is again to find an instrumental variable, a variable that is strongly correlated with X but conditionally independent of Y given X. Consider an instrumental variable, Z. The instrumental variable (IV) estimator of $\beta$ in equation (2) or (4) is:

$\hat{\beta}_{IV} = \frac{\sum (Z - \bar{Z})(Y - \bar{Y})}{\sum (Z - \bar{Z})(X - \bar{X})}$    (5)

which is equivalent to the ratio of the estimated covariance of Z and Y to the estimated covariance of Z and X. Typically this estimate is obtained through the use of a two-stage least squares (2SLS or TSLS) algorithm (see Wooldridge, 2003, for further details of the method, including the sampling distribution of the IV estimate). This algorithm is available in most large general-purpose software packages. Note that its validity is not dependent on any distributional assumptions concerning either Z or X. Both could be binary (yes/no) indicators, for example. For linear models, IV estimates can also be obtained with ease using structural equation modelling (see STRUCTURAL EQUATION MODELS and STRUCTURAL EQUATION MODELLING SOFTWARE).

As an example, an early medical application of instrumental variable methods was provided by Permutt and Hebel (1989). They describe a trial in which pregnant women were randomly allocated to receive encouragement to reduce or stop their cigarette smoking during pregnancy (the treatment group), or not (the control group), indicated by the binary variable, Z. An intermediate outcome variable (X) was the amount of cigarette smoking recorded during pregnancy. The ultimate outcome (Y) was the birth weight of the newborn child. Readers will be familiar with evaluating the effect of RANDOMIZATION on the child's birth weight. However, what about the effect of smoking (X) on birth weight? Smoking is likely to have been reduced in the group subject to encouragement, but also in the control group (but, presumably, to a lesser extent). There are also likely to be hidden confounders (e.g. other health promoting behaviours) that are associated with both the mother's smoking during pregnancy and the child's birth weight. Smoking (X) is an endogenous treatment variable. The problem is solved by noting that randomization (Z) is an obvious candidate for the instrumental variable. If the intervention (i.e. encouragement to reduce smoking) works then randomization should be correlated with smoking during pregnancy. It is also reasonable to assume that the effect of randomization is completely mediated by its effect on smoking (that there is no direct effect of randomization on outcome, the birth weight of the child). Randomization (Z), in fact, is increasingly being used as an instrumental variable in the estimation of the effect of treatment received (X) on outcome (Y) in randomized controlled trials subject to nonadherence or noncompliance with the allocated treatment (see ADJUSTMENT FOR NONCOMPLIANCE IN RANDOMIZED CONTROLLED TRIALS). The potential for the use of instrumental variables in epidemiological investigations is illustrated by Greenland (2000) (see also MENDELIAN RANDOMISATION). Health economic applications are reviewed by Newhouse and McClellan (1998).

What about MEASUREMENT ERROR problems? Well, first note that for the example provided by Permutt and Hebel (1989) the IV estimate of the effect of mother's smoking on her child's birth weight is not attenuated by the inevitable measurement error in the number of cigarettes smoked by the mother. The IV estimator effectively copes with the simultaneous problems of confounding and measurement error. What about the problem solely due to measurement error? Here an obvious choice for an instrument is a measurement of the characteristic measured as X using a different procedure. Smoking (X) could be measured by self-report (in


a diary, for example) and a suitable instrument (Z) might be a measurement of a biomarker of nicotine consumption (cotinine levels in the blood, for example). The key here is to be able to convince oneself of the conditional independence of Z (biomarker measurement) and outcome, Y (health status), given the fallible indicator of exposure, X (self-reported cigarette smoking). Dunn (2004) provides detailed descriptions of the use of instrumental variable methodology in the evaluation of measurement errors, mainly in the context of linear models, but also in latent class modelling of binary diagnostic test results. Nonlinear models are much more difficult to deal with and are well beyond the scope of this article (but see Stefanski and Buzas, 1995). GD

Dunn, G. 2004: Statistical evaluation of measurement errors. London: Arnold. Greenland, S. 2000: An introduction to instrumental variables for epidemiologists. International Journal of Epidemiology 29, 722-9 (Erratum, p. 1102). Newhouse, J. P. and McClellan, M. 1998: Econometrics in outcomes research: the use of instrumental variables. Annual Review of Public Health 19, 17-34. Permutt, T. and Hebel, J. R. 1989: Simultaneous-equation estimation in a clinical trial of the effect of smoking on birth weight. Biometrics 45, 619-22. Stefanski, L. A. and Buzas, J. S. 1995: Instrumental variable estimation in binary regression measurement error models. Journal of the American Statistical Association 90, 541-9. Wooldridge, J. M. 2003: Introductory econometrics: a modern approach, 2nd edition. Mason, Ohio: South-Western.
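The ratio-of-covariances form of the instrumental variable estimator described in this entry can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data-generating model, the coefficient values, the sample size and the names `covariance`, `beta_ols` and `beta_iv` are all invented for illustration and are not taken from Permutt and Hebel (1989) or any other cited study.

```python
import random


def covariance(u, v):
    """Sample covariance of two equal-length lists."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (len(u) - 1)


random.seed(1)
n = 20000
true_beta = 0.5          # hypothetical effect of X on Y

z, x, y = [], [], []
for _ in range(n):
    zi = random.randint(0, 1)                  # randomised binary instrument Z
    ci = random.gauss(0, 1)                    # hidden confounder C (never used in estimation)
    xi = 1.0 * zi + ci + random.gauss(0, 1)    # exposure depends on Z and C
    yi = 2.0 + true_beta * xi + ci + random.gauss(0, 1)  # Z affects Y only through X
    z.append(zi)
    x.append(xi)
    y.append(yi)

# Naive OLS slope: cov(X, Y) / var(X); biased because C is omitted.
beta_ols = covariance(x, y) / covariance(x, x)

# IV estimator: cov(Z, Y) / cov(Z, X).
beta_iv = covariance(z, y) / covariance(z, x)

print(f"OLS estimate: {beta_ols:.3f}")
print(f"IV estimate:  {beta_iv:.3f} (true value {true_beta})")
```

In this simulated setting the OLS slope is pulled away from the true value by the hidden confounder, while the IV estimate stays close to it. With real data the result depends entirely on whether the instrument assumptions are plausible.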


Integrated hazard function

See SURVIVAL ANALYSIS - AN OVERVIEW

Intention-to-treat: Coronary artery bypass surgery in stable angina pectoris trial. Mortality at two years after randomisation by allocated and actual intervention (European Coronary Surgery Study Group, 1979)

Allocated intervention   Medical   Medical    Surgical   Surgical
Actual intervention      Medical   Surgical   Surgical   Medical
Survivors                296       48         353        20
Deaths                   27        2          15         6
Mortality                8.4%      4.0%       4.1%       23.1%
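The arithmetic behind this table can be checked with a short script. It is a sketch only: the cell counts are those of the table, while the "as-treated" grouping (by actual rather than allocated intervention) and the names `cells`, `mortality`, `itt` and `as_treated` are illustrative assumptions added here to show what happens when deaths are not attributed to the allocated arm.

```python
# Each cell: (allocated intervention, actual intervention, survivors, deaths)
cells = [
    ("Medical", "Medical", 296, 27),
    ("Medical", "Surgical", 48, 2),
    ("Surgical", "Surgical", 353, 15),
    ("Surgical", "Medical", 20, 6),
]


def mortality(rows):
    """Deaths divided by the total number of participants in the given cells."""
    deaths = sum(r[3] for r in rows)
    total = sum(r[2] + r[3] for r in rows)
    return deaths / total


# Intention-to-treat: group by the allocated intervention (index 0).
itt = {arm: mortality([c for c in cells if c[0] == arm])
       for arm in ("Medical", "Surgical")}

# Illustrative as-treated comparison: group by the actual intervention (index 1).
as_treated = {arm: mortality([c for c in cells if c[1] == arm])
              for arm in ("Medical", "Surgical")}

for arm in ("Medical", "Surgical"):
    print(f"{arm}: ITT {itt[arm]:.1%}, as-treated {as_treated[arm]:.1%}")
```

Grouping by actual intervention moves the six early deaths out of the surgical group, which makes surgical mortality look lower than under the intention-to-treat grouping.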

died before surgery could be done, and exclusion of such participants from one arm only introduces BIAS. The ITT analysis of these data would compare a mortality rate of 7.8% (29/373) in those allocated to medical treatment with a rate of 5.3% (21/394) in those allocated to surgery. If the six deaths that occurred in participants allocated to surgical intervention who died before receiving surgery (identified by 'Actual intervention = Medical' in the table) are not attributed to surgical intervention using an intention-to-treat analysis, surgery would appear to have a falsely low mortality rate. Since protocol deviations and noncompliance are likely to occur in routine use of an intervention, ITT analysis can provide an estimate of the treatment effect, which reasonably reflects what might happen in clinical practice. It is therefore the most suitable approach for pragmatic trials that aim to measure the overall effectiveness of an intervention policy in

Intention-to-treat (ITT)

This is a principle used in the design, analysis and conduct of randomised CLINICAL TRIALS (Heritier, Gebski and

$|Z| > 1.96$ at a significance level 0.05 where:

$Z = \frac{S - 200}{\sqrt{400 \times 0.5 \times 0.5}}$

or equivalently if $|S - 200| \geq 20$. After 350 tosses, we will reject $H_0$ for sure with 220 heads. With 210 heads, however, it depends on the future outcomes. Consider a fixed sample test of $H_0$: $\theta = 0$ at a significance level $\alpha$ with power $1 - \beta$ to detect the difference $\theta = \theta_1$. The conditional probability of rejection of $H_0$, i.e. conditional power, at $\theta$ is defined as:

$P_C(\theta) = \Pr(S \in R \mid D(t); \theta)$

For some $\gamma_0, \gamma_1 > 1/2$, a stochastic curtailment test rejects the null hypothesis if:

$P_C(\theta_1) \approx 1$ and $P_C(0) > \gamma_0$

or accepts the null hypothesis (rejects the alternative hypothesis) if:

$P_C(0) \approx 0$ and $P_C(\theta_1) < 1 - \gamma_1$

According to Lan, Simon and Halperin (1982), the Type I and II error probabilities are inflated but remain bounded from above by:

$\alpha' = \alpha/\gamma_0$ and $\beta' = \beta/\gamma_1$

Generally stochastic curtailment is very conservative and if $\gamma_0 = 1 = \gamma_1$, it becomes deterministic curtailment. A formal significance test is only one factor in the complex decision process of whether to continue, modify or stop a trial. Interim analyses based on group sequential methods, triangular tests or stochastic curtailment procedures provide objective guidelines to the DATA AND SAFETY MONITORING BOARDS.

The choice for the method of interim analyses should depend on the desired operating characteristics of the study in terms of the early stopping property, the maximum sample size requirement and the expected sample size. For example, if the study continues through all K analyses, the group sequential design will accrue more participants than the fixed sample design, which is likely to occur if $H_0$ is true. However, if the study stops at an earlier interim analysis, the group sequential design will accrue fewer participants on average than the fixed sample design, which is likely to occur if $H_1$ is true. A randomised Phase III trial should never be terminated in the early stages of recruitment merely because it is failing to reach the anticipated 'minimal' benefit envisaged at the design stage of the study. This is because early termination of a study in these circumstances will leave the associated CONFIDENCE INTERVAL unacceptably wide, thereby indicating the possibility of a plausible, and maybe worthwhile, advantage to one therapy even when there is no true difference in intervention effects. In such circumstances the level of uncertainty remains unacceptably high. KK

[See also DATA AND SAFETY MONITORING BOARDS]
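The conditional probability of rejection in the coin-tossing illustration can be computed directly from the binomial distribution of the remaining tosses. The sketch below assumes the setting described in this entry (400 tosses in total, rejection when the final number of heads S satisfies |S - 200| >= 20); the function name `conditional_power` and its parameters are illustrative choices, not part of the source.

```python
from math import comb


def conditional_power(tosses_so_far, heads_so_far, p=0.5, n_total=400, margin=20):
    """Probability that the final test rejects H0, given the data so far.

    The remaining tosses are Binomial(n_total - tosses_so_far, p); the test
    rejects when the final number of heads differs from n_total / 2 by at
    least `margin`.
    """
    remaining = n_total - tosses_so_far
    centre = n_total / 2
    prob = 0.0
    for k in range(remaining + 1):          # k = heads among the remaining tosses
        final_heads = heads_so_far + k
        if abs(final_heads - centre) >= margin:
            prob += comb(remaining, k) * p ** k * (1 - p) ** (remaining - k)
    return prob


# After 350 tosses with 220 heads, rejection is already certain.
print(conditional_power(350, 220))
# With 210 heads the outcome depends on the remaining 50 tosses.
print(conditional_power(350, 210))
# With 175 heads rejection under p = 0.5 is unlikely.
print(conditional_power(350, 175))
```

Evaluating the function at p = 0.5 gives the conditional power under the null hypothesis; passing another value of p gives it under an alternative, which is what a stochastic curtailment rule compares with its thresholds.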

Anderson, T. W. 1960: A modification of the sequential probability ratio test to reduce the sample size. Annals of Mathematical Statistics 31, 165-97. Anscombe, F. J. 1954: Fixed-sample-size analysis of sequential observations. Biometrics 10, 89-100. Chang, M. N., Hwang, I. K. and Shih, W. J. 1998: Group sequential designs using both Type I and Type II error probability spending functions. Communications in Statistics, Part A - Theory and Methods 27, 1323-39. Lan, K. K. G. and DeMets, D. L. 1983: Discrete sequential boundaries for clinical trials. Biometrika 70, 659-63. Lan, K. K. G., Simon, R. and Halperin, M. 1982: Stochastically curtailed tests in long-term clinical trials. Sequential Analysis 1, 207-19. O'Brien, P. C. and Fleming, T. R. 1979: A multiple testing procedure for clinical trials. Biometrics 35, 549-56. Pampallona, S., Tsiatis, A. A. and Kim, K. 2001: Spending functions for Type I and Type II error probabilities of group sequential trials. Drug Information Journal 72, 247-60. Pocock, S. J. 1977: Group sequential methods in the design and analysis of clinical trials. Biometrika 64, 191-9. Wald, A. 1947: Sequential analysis. New York: John Wiley & Sons, Inc. Whitehead, J. 1997: The design and analysis of sequential clinical trials, 2nd revised edition. Chichester: John Wiley & Sons, Ltd.

Internal pilot study

See PILOT STUDIES, SAMPLE SIZE DETERMINATION IN CLINICAL TRIALS

Interquartile range This range is a MEASURE OF SPREAD defined as the interval between the values that are located one-quarter and three-quarters of the way through the sample when the observations are ordered. Thus, it encloses the middle 50% of the data points. For example, suppose the weights in kilograms of 11 elderly men from a community sample attending a clinic are:

51 60 61 63 65 66 70 72 77 85 95

(median: 66; interquartile range: 61 to 77)

Then, the interquartile range is the interval between the third and ninth values, i.e. 61 kg to 77 kg, a difference of 16 kg. The interquartile range is most informative if the upper and lower values are both quoted, rather than simply the interval between them. The lower value is known as the lower quartile or 25th percentile and the upper value as the upper quartile or 75th percentile. If the number of observations + 1 is divisible by 4 then the interquartile range is simple to calculate. If this is not the case then the values for the interquartile range need to be interpolated. In general, the position of the lower quartile is calculated by multiplying the sample size plus one by 0.25, and by 0.75 in the case of the upper quartile. Therefore, if another man attends the clinic with a weight of 100 kg then the lower quartile is now at position (12 + 1)/4 = 3 1/4:

51 60 61 63 65 66 70 72 77 85 95 100

(lower quartile: between 61 and 63; upper quartile: between 77 and 85)

Thus the lower quartile lies a quarter of the way between 61 and 63, interpolated as 61.5 kg. Similarly, the upper quartile lies three-quarters of the way between 77 and 85, interpolated as 83 kg. The interquartile range is typically used as a measure of spread around the median. Like the median, it is useful when the data are not symmetrically distributed because it is not unduly affected by the presence of SKEWNESS or OUTLIERS. SRC
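The positional rule used in this example (multiply the sample size plus one by 0.25 or 0.75, then interpolate between neighbouring ordered values) can be sketched as follows. The function name `quartile` and the handling of positions that fall outside the sample are illustrative assumptions; other software may use different interpolation conventions and give slightly different answers.

```python
import math


def quartile(sorted_values, p):
    """Value at position (n + 1) * p in an ordered sample, with linear interpolation."""
    n = len(sorted_values)
    position = (n + 1) * p            # 1-based position in the ordered sample
    lower = math.floor(position)
    fraction = position - lower
    if lower < 1:                     # position falls before the first value
        return sorted_values[0]
    if lower >= n:                    # position falls at or beyond the last value
        return sorted_values[-1]
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + fraction * (above - below)


weights_11 = [51, 60, 61, 63, 65, 66, 70, 72, 77, 85, 95]
weights_12 = weights_11 + [100]

for weights in (weights_11, weights_12):
    q1 = quartile(weights, 0.25)
    q3 = quartile(weights, 0.75)
    print(f"n = {len(weights)}: lower quartile {q1}, upper quartile {q3}, width {q3 - q1}")
```

Python's `statistics.quantiles` uses the same (n + 1) positions under its default `'exclusive'` method, so it should agree with this sketch on these data.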

Intraclass correlation coefficient

See INTRACLUSTER CORRELATION COEFFICIENT

Intracluster correlation coefficient (ICC)

This is a measure that quantifies the extent of similarity among individual observations within clusters. For example, when a study collects data on patients from a number of different clinics, the intracluster correlation coefficient (ICC) represents the degree to which patients attending the same clinic are more similar than the patients attending different clinics. Also known as the intraclass correlation coefficient, the ICC takes values between 0 and 1, where the value 0 corresponds to the situation where individuals from the same cluster are no more alike than individuals from different clusters and higher values indicate greater similarity within clusters.

The ICC has been used extensively in several application areas. In health services research, it is used to measure the extent of similarity of patients within administrative units such as hospitals or geographical units such as towns. In family studies, it measures the degree of resemblance among members of the same family. In psychological research, it is used when examining reliability (see MEASUREMENT PRECISION AND RELIABILITY), where the same measurements are taken on subjects by different assessors.

When individuals are sampled within clusters such as hospitals or families, the ICC representing within-cluster similarity is defined as the proportion of the variation between individuals that is explained by the variation between clusters. Formally, this definition assumes a simple random effects model for the ENDPOINT of interest, which includes random cluster effects with VARIANCE $\sigma_b^2$ and individual residual effects with variance $\sigma^2$, and the ICC is defined as $\sigma_b^2/(\sigma_b^2 + \sigma^2)$. The most common approach for estimating the ICC is to obtain estimates for $\sigma_b^2$ and $\sigma^2$ by fitting the RANDOM EFFECTS MODEL (Donner and Wells, 1986) to the data and to substitute these into the formula. CONFIDENCE INTERVALS for reporting alongside the ICC estimate are also obtained using the assumptions of the random effects model. In most settings, negative ICC values are regarded as implausible, so negative values obtained for ICC estimates are set to 0 and lower limits of confidence intervals are truncated at 0. When considering the ICC for a binary outcome, e.g. the presence or absence of a disease in family members, an ICC estimate can be obtained as described above, but the methods for constructing confidence intervals are based on different assumptions.

The simple model outlined here is not appropriate for all types of design. When measuring reliability, the definition of the ICC differs according to whether the focus is on 'consistency' or 'absolute agreement', as described by McGraw and Wong (1996). Some study designs require more complex models: e.g. when repeated measurements are available on patients within clusters or when wishing to estimate multiple ICC values simultaneously (Donner, 1986). RT

Donner, A. 1986: A review of inference procedures for the intraclass correlation coefficient in the one-way random effects model. International Statistical Review 54, 67-82. Donner, A. and Wells, G. 1986: A comparison of confidence interval methods for the intraclass correlation coefficient. Biometrics 42, 401-12. McGraw, K. O. and Wong, S. P. 1996: Forming inferences about some intraclass correlation coefficients. Psychological Methods 1, 30-46.
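One simple way to obtain the two variance estimates and substitute them into the ICC formula is the one-way analysis of variance (method of moments) estimator for clusters of equal size. This is a sketch of that particular estimator, chosen here for illustration; it is not necessarily the fitting method the cited authors have in mind, and the function name `anova_icc` and the toy data are invented.

```python
def anova_icc(clusters):
    """ICC from a balanced one-way layout, truncated at 0.

    `clusters` is a list of equal-length lists of observations, one per cluster.
    """
    k = len(clusters)                 # number of clusters
    m = len(clusters[0])              # observations per cluster (assumed equal)
    grand_mean = sum(sum(c) for c in clusters) / (k * m)
    means = [sum(c) / m for c in clusters]

    # Between-cluster and within-cluster mean squares.
    msb = m * sum((mu - grand_mean) ** 2 for mu in means) / (k - 1)
    msw = sum((v - mu) ** 2 for c, mu in zip(clusters, means) for v in c) / (k * (m - 1))

    var_between = (msb - msw) / m     # moment estimate of the cluster variance
    var_within = msw                  # estimate of the residual variance
    total = var_between + var_within
    if total <= 0:                    # no variation at all: ICC is undefined, return 0
        return 0.0
    return max(0.0, var_between / total)


# Patients within a clinic identical, clinics different: ICC of 1.
print(anova_icc([[1, 1], [3, 3]]))
# Clinics with identical means: the negative estimate is set to 0.
print(anova_icc([[1, 3], [3, 1]]))
```

The truncation at 0 mirrors the convention, described in this entry, of treating negative ICC estimates as implausible.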

Inverse probability weighting (IPW) This refers to a general method of adjusting M-estimators (Everitt and Skrondal, 2010) for confounding or selection bias (e.g. due to CENSORED OBSERVATIONS, MISSING DATA or sample selection) when the unverifiable assumption of no unmeasured confounders, noninformative censoring or missingness at random is, respectively, met. The underlying idea is to filter spurious associations away from the data by weighting each subject's data inversely to the magnitude of those associations. In the process, one redresses imbalances so that issues of confounding or selection bias may subsequently be ignored in the analysis. We will illustrate IPW with two examples. Consider first a setting where the relation between some exposure A and some outcome Y is of interest, but distorted by measured confounders L. Then weighting each subject's data by the reciprocal of the conditional density $f(A|L)$ of exposure A given confounders L eliminates the association between A and L, and thereby eliminates confounding by L. This implies that standard measures of association between A and Y, when applied to the inversely weighted data, are no longer distorted by confounding due to L. In the special case where A is dichotomous, taking values 0 for the 'unexposed' and 1 for the 'exposed', the adjusted association between A and Y can thus be calculated as:

$\frac{\sum_{i:A_i=1} w_{i1} Y_i}{\sum_{i:A_i=1} w_{i1}} - \frac{\sum_{i:A_i=0} w_{i0} Y_i}{\sum_{i:A_i=0} w_{i0}}$    (6)

where the weights $w_{i1} = 1/\Pr(A_i = 1|L_i)$ and $w_{i0} = 1/\Pr(A_i = 0|L_i)$ can be estimated based on the fitted values of a LOGISTIC REGRESSION model for A, given L. More generally, adjustment for measured confounding can be accomplished via off-the-shelf software packages (see STATISTICAL PACKAGES) by fitting a regression model for the outcome, involving the exposure of interest only (e.g. $E(Y|A) = \alpha + \beta A$), while assigning each subject's data the given weight (i.e. $w_{i1}$ for those with $A_i = 1$ and $w_{i0}$ for those with $A_i = 0$). The impact of inverse probability weighting is to standardize (see DEMOGRAPHY) the expected outcome in the exposed and unexposed respectively, with the total group as the reference population (Sato and Matsuyama, 2003). It follows that, when L includes all confounders of the relation between A and Y, the IPW estimate (6) can be interpreted as the change in average outcome that would be observed if the total group were exposed versus unexposed. As such, IPW provides an alternative to direct adjustment methods for measured confounders, where one involves models for the regression of exposure rather than outcome on the measured confounders.

Consider next a setting where the interest lies in a linear regression (see MULTIPLE LINEAR REGRESSION) of a completely observed outcome Y on an incompletely observed covariate X. When missingness in X is influenced by the observed outcome and possibly also by extraneous, measured covariates Z, but has no residual association with the missing X, then a regression analysis of Y on X in the complete cases (i.e. those with complete data in X) may give biased results. When the association between missingness and its predictors is filtered away through inverse probability weighting of the complete cases, the missingness becomes completely at random so that an analysis of the reweighted complete cases becomes valid. This can be accomplished via off-the-shelf software packages by fitting a regression model for the outcome to the complete cases, while assigning each subject's data the weight $1/\Pr(R = 1|Y, Z)$, where R is a missingness indicator that assigns 1 to subjects with complete data and 0 to subjects with incomplete data in X. The missingness probability appearing in the weights can, for instance, be estimated via logistic regression analysis. This idea that BIAS due to missing data can be corrected by weighting each of these subjects' observations by the inverse of the probability of observing complete data dates back to at least Horvitz and Thompson (1952). For many years, the IPW method gained little acceptance because of its imprecision relative to more popular missing data methods, such as MULTIPLE IMPUTATION. This has changed drastically over the past decade, since the seminal work of Robins, Rotnitzky and Zhao (1994), who demonstrated how the precision of IPW estimators could be greatly improved to the point where they became competitive with imputation estimators.

The recent success of IPW is largely due to its ability to enable adjustment for time-varying confounding where standard regression adjustment fails (see MARGINAL STRUCTURAL MODELS). This ability results from its capacity to filter associations away from the data. IPW methods have the further advantage that they are generically applicable in a wide variety of settings, that they enable relatively simple sensitivity analyses to investigate the impact of violation of unverifiable (missing data) assumptions (Scharfstein, Rotnitzky and Robins, 1999) and that they are less prone to extrapolation than more standard adjustment methods (e.g. regression adjustment for confounders, multiple imputation) (Tan, 2008). The main limitation of IPW estimators is that they can be unstable and imprecise when some subjects receive large weights. This may happen when there is strong confounding, strongly informative censoring or missingness, or when a continuous exposure requires inverse weighting by a density. In that case, one must consider heuristic weight


truncation (Cole and Hernán, 2008) or recourse to efficient (doubly robust) IPW estimators (Robins et al., 2001; Goetgeluk, Vansteelandt and Goetghebeur). The latter typically also bring advantages in terms of robustness against model misspecification, but are computationally more demanding. Finally, the IPW method enjoys the unexpected property that using estimated rather than known weights tends to increase precision. However, when standard statistical software packages (e.g. SAS, STATA, R: see STATISTICAL PACKAGES) are used for IPW, caution is required in interpreting the standard errors provided by the software, because these ignore the imprecision of the estimated weights. By more
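The weighted contrast of equation (6) for a dichotomous exposure can be sketched on a small artificial data set. Everything in this example is an assumption made for illustration: the data are constructed so that the true exposure effect is 1, the single confounder L is binary, and the exposure probabilities are estimated by stratum proportions (which is what a saturated logistic regression of A on a binary L would give). The names `records`, `propensity`, `crude` and `ipw` are hypothetical.

```python
# Artificial data: (L, A, Y) with Y = A + 2 * L, so the true effect of A is 1.
# Exposure is more common when L = 1, which confounds the crude comparison.
records = (
    [(0, 1, 1.0)] * 20 + [(0, 0, 0.0)] * 80 +
    [(1, 1, 3.0)] * 80 + [(1, 0, 2.0)] * 20
)

# Estimate Pr(A = 1 | L) within each stratum of L.
propensity = {}
for level in (0, 1):
    stratum = [a for (l, a, _) in records if l == level]
    propensity[level] = sum(stratum) / len(stratum)


def weighted_mean(rows):
    """Weighted mean outcome; rows are (weight, outcome) pairs."""
    return sum(w * y for w, y in rows) / sum(w for w, _ in rows)


# Crude contrast: all weights equal to 1.
crude = (weighted_mean([(1.0, y) for (_, a, y) in records if a == 1])
         - weighted_mean([(1.0, y) for (_, a, y) in records if a == 0]))

# IPW contrast: weight 1 / Pr(A = 1 | L) for the exposed and
# 1 / Pr(A = 0 | L) for the unexposed, as in equation (6).
exposed = [(1.0 / propensity[l], y) for (l, a, y) in records if a == 1]
unexposed = [(1.0 / (1.0 - propensity[l]), y) for (l, a, y) in records if a == 0]
ipw = weighted_mean(exposed) - weighted_mean(unexposed)

print(f"crude difference: {crude:.2f}")
print(f"IPW difference:   {ipw:.2f}")
```

In this constructed example the crude difference overstates the effect because exposed subjects more often have L = 1, while the weighted difference recovers the value built into the data. No standard errors are computed here; as the entry notes, naive software standard errors ignore the estimation of the weights.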

[Figure: two cross-tabulations, (a) and (b), of the paired assessments made by Expert X and Expert Y in categories A, B, C and D; marginal totals 4, 9, 12 and 34 (total 59).]

kappa and weighted kappa Two examples of paired distributions of assessments made by experts X and Y when classifying 59 subjects in the ordered-scale categories A, B, C and D

number of categories decreases. Therefore, kappa values from different studies are not comparable and using rules for interpretation saying that kappa larger than 0.6 represents good agreement is not meaningful. The weighted kappa depends on the choice of weights (Maclure and Willett, 1987; Altman, 2000). Two disagreement patterns inspired by data from a reliability study in neuroradiology illustrate the limitations of a summary measure of reliability. Two experts, X and Y, independently judged 59 objects on a four-point scale, here labelled A, B, C and D. Two hypothetical frequency distributions of the pairs of judgements are given in the figures (parts (a) and (b)). The disagreement patterns differ, but the percentage agreements are similar: 75% (44 of 59) in

[Figure: least squares estimation. Scatterplot and least squares regression line of body weight versus body height (in cm) in a random sample of 250 American men.]

expected outcomes. When the model is linear and the outcomes are independent, weights chosen as the inverse of the individual variances yield the most efficient estimator. When, furthermore, the outcomes are normally distributed, this WLSE then also coincides with the maximum likelihood estimate (MLE). In other instances, e.g. with logistic regression and many other generalised linear models, the optimal weights depend on the target parameter. Because the latter is unknown, a method of iteratively reweighted least squares is usually applied, whereby weights in each iteration depend on previous estimates for the unknown parameter. Even though the LSE may be less precise than the MLE under certain models, least squares estimation is popular because it does not require a complete description of the sampling distribution of the observed data. For example, it is not necessary to assume that outcomes are normally distributed to derive least squares estimates of the unknown regression coefficients in the MULTIPLE LINEAR REGRESSION model. The model of interest for the means is all that is needed. The LSE is therefore immune to misspecification of the sampling distribution, unlike the MLE.
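For a straight-line model, weighted least squares with weights equal to the inverse of the individual variances has a simple closed form. The sketch below illustrates that special case only; the function name `weighted_least_squares`, the toy data and the assumed variances are invented, and the variances are treated as known, which is rarely true in practice.

```python
def weighted_least_squares(x, y, variances):
    """Intercept and slope minimising sum((y - a - b * x)**2 / variance)."""
    weights = [1.0 / v for v in variances]      # inverse-variance weights
    total_weight = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total_weight
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total_weight
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Toy data lying exactly on y = 1 + 2x; any positive weights recover the line.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
variances = [1.0, 4.0, 1.0, 4.0]                # hypothetical known variances
intercept, slope = weighted_least_squares(x, y, variances)
print(intercept, slope)
```

When the variances depend on the unknown parameters, as in logistic regression, the same calculation would be repeated with weights updated from the current estimates, which is the idea behind iteratively reweighted least squares.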

Further modifications of least squares estimation have been devised to achieve additional goals. For instance, to enhance robustness to outlying observations, L1-regression is based on minimising the average absolute deviation between the observations and the regression function. SV/EG

Penrose, K. W., Nelson, A. G. and Fisher, A. G. 1985: Generalized body composition prediction equation for men using simple measurement techniques. Medicine and Science in Sports and Exercise 17, 189.

Leave-one-out cross-validation

See DISCRIMINANT FUNCTION ANALYSIS

Length-biased sampling This form of sampling arises when items are sampled in proportion to their values on some variable of interest, e.g. a sampling scheme based on the number of patient visits. A BIAS may be introduced because some individuals are more likely to be selected than others simply because they make more frequent visits. The problem arises in SCREENING STUDIES, where the sample of cases detected is likely to contain an excess of slow-growing cancers compared to the sample diagnosed positive because of their symptoms. If length-biased sampling is ignored, the estimate of the true population MEAN can be greatly inflated. An example of length-biased sampling is described in Davidov and Zelen (2001) in the context of the assessment of familial risk of disease based on referent databases in which the larger the family, the greater the probability of finding the family in the database. SSC

[See also BIAS, BIAS IN OBSERVATIONAL STUDIES, SAMPLING METHODS - AN OVERVIEW]

Davidov, O. and Zelen, M. 2001: Referent sampling, family history and relative risk: the role of length-biased sampling. Biostatistics 2, 173-81.
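The inflation of the mean under length-biased sampling can be seen with a small calculation. If each item is selected with probability proportional to its value, the expected value of a sampled item is the sum of squared values divided by the sum of values. The sketch below uses an invented population of visit counts; the function names are illustrative.

```python
def population_mean(values):
    """Ordinary mean of the population values."""
    return sum(values) / len(values)


def length_biased_mean(values):
    """Expected value of one item drawn with probability proportional to its value."""
    return sum(v * v for v in values) / sum(values)


# Hypothetical numbers of clinic visits per patient in a small population.
visits = [1, 2, 3, 4]
print(population_mean(visits))      # mean number of visits per patient
print(length_biased_mean(visits))   # mean among patients sampled visit by visit
```

The length-biased mean exceeds the population mean whenever the values are not all equal, which is the bias described in this entry.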

Levene's test This is used to test whether two or more groups have equal VARIANCE. The NULL HYPOTHESIS states that the variance of all groups is equal; the alternative hypothesis states that the variances are unequal for at least one pair. Equal variance of two or more groups is a frequent assumption for parametric tests, ANALYSIS OF VARIANCE for example, and Levene's test can be used to verify this assumption. Levene's test is relatively simple and robust to departures from normality. To perform Levene's test we begin by finding, for each group, the absolute differences between the observed values and the MEDIAN, MEAN or trimmed mean. The groups need not be of equal size. Whether to use the median, mean or trimmed mean depends on the underlying distribution of the data. If the data are symmetric and moderate tailed the mean provides the best power, using the trimmed mean performs best if the data are heavy tailed and the median performs best if the data are skewed. However, using the median provides good ROBUSTNESS for many types of nonnormal data while retaining good power (Wilcox, 1998). We complete Levene's test by performing an analysis of variance on these absolute differences (see the table).

Levene's test Data from applying Levene's test on three different treatment groups

Group 1   Absolute       Group 2   Absolute       Group 3   Absolute
          differences              differences              differences
24        5.5            22        0.5            8         5
23        4.5            21        0.5            12        1
19        0.5            18        3.5            14        1
18        0.5            25        3.5            15        2
14        4.5            18        3.5            12        1
6         12.5           22        0.5            13        0
5         13.5           16        5.5            7         6
21        2.5            17        4.5            14        1
23        4.5            26        4.5            13        0
17        1.5            23        1.5            15        2
Median    18.5                     21.5                     13

For example, for treatment 1 a score of 23 differs by 4.5 points from the median, while a score of 14 also differs by 4.5 points from the median. The idea is that the larger the differences in some groups compared to others, the more the spread and, hence, the more likely it will be that the variance in the populations, from which they arose, is not the same. Therefore, a one-way analysis of variance on these differences will test this. This results in an F-statistic of 4.21 with an associated P-value of 0.026 so we reject the null hypothesis that the variance of the groups is equal. More details can be found in Brown and Forsythe (1974). MMB

Brown, M. B. and Forsythe, A. B. 1974: Robust tests for the equality of variances. Journal of the American Statistical Association 69, 346, 364-7. Wilcox, R. 1998: Trimming and Winsorization. In Armitage, P. and Colton, T. (eds), Encyclopedia of biostatistics. Chichester: John Wiley & Sons, Ltd.
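The two steps of the median-based form of Levene's test (absolute differences from each group median, then a one-way analysis of variance on those differences) can be sketched as follows. This is a minimal illustration, not the entry's own computation: the function name `brown_forsythe_f` and the two small data sets are hypothetical, and only the F-statistic is computed (a P-value would additionally need the F distribution with k - 1 and N - k degrees of freedom).

```python
from statistics import mean, median


def brown_forsythe_f(*groups):
    """F statistic from a one-way ANOVA on absolute deviations from group medians."""
    # Step 1: absolute difference between each observation and its group median
    devs = [[abs(v - median(g)) for v in g] for g in groups]
    k = len(devs)                         # number of groups
    n_total = sum(len(d) for d in devs)   # total number of observations
    grand = mean([v for d in devs for v in d])
    # Step 2: one-way analysis of variance on the absolute differences
    ss_between = sum(len(d) * (mean(d) - grand) ** 2 for d in devs)
    ss_within = sum((v - mean(d)) ** 2 for d in devs for v in d)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical data: same median, visibly different spread
group_a = [10, 12, 14]
group_b = [5, 12, 19]
f_stat = brown_forsythe_f(group_a, group_b)
print(round(f_stat, 3))
```

The groups need not be of equal size, so the function accepts any number of lists of any length. Replacing `median` by `mean` inside the first step would give the mean-based variant described in the entry.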

Lexis diagram This is a descriptive tool used in epidemiology and demography, being a plot of individuals in a study on two timescales simultaneously. These timescales are most commonly calendar time and age. Each individual is then represented by a diagonal line at 45° to each axis, which begins at the calendar time and age at enrolment and ends at the calendar time and age at the event of interest (e.g. death) or censoring. As an example, the table shows the year of birth, age at enrolment and age at death/censoring of four individuals enrolled in a study that began in 1975 and ended with follow-up in 2004. The corresponding Lexis diagram is shown in the figure (see page 242). A death is shown by a filled circle and a censoring event by an empty circle. One use of the Lexis diagram relates to estimating, or adjusting for, the effects of age and calendar time on the mortality (or morbidity) rate. In this application, age is divided into (e.g. 5-year) bands and calendar time is divided into (e.g. 5-year) periods. It is assumed that the mortality rate (or baseline mortality rate) is piecewise constant, i.e. is constant in each combination of age band and calendar period. To estimate the mortality rate in each of these band-period combinations, it is necessary to calculate the number of deaths and total time at risk in each band-period.

Lexis diagram Study data for four individuals

Individual   Birth year   Age at enrolment   Status     Age at death or censoring
A            1940         35                 Died       [illegible]
B            1951         24                 Censored   53
C            1954         30                 Died       48
D            1960         21                 Censored   30
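The time at risk that one individual contributes to each age band and calendar period can be worked out by walking along the diagonal line and stopping at every grid line it crosses. The sketch below does this for individual C in the table (born 1954, enrolled aged 30, followed to age 48) with 10-year bands and periods; the function name `split_follow_up` and its arguments are illustrative assumptions, not part of the entry.

```python
def split_follow_up(birth_year, age_in, age_out, width=10):
    """Return {(age band start, period start): years at risk} for one individual."""
    cells = {}
    age = float(age_in)
    while age < age_out:
        calendar = birth_year + age
        band = int(age // width) * width          # current age band
        period = int(calendar // width) * width   # current calendar period
        # Age at which the diagonal next crosses a horizontal (age) or
        # vertical (calendar) grid line, or at which follow-up ends
        next_age = min(band + width, period + width - birth_year, age_out)
        cells[(band, period)] = cells.get((band, period), 0.0) + (next_age - age)
        age = next_age
    return cells


cells = split_follow_up(birth_year=1954, age_in=30, age_out=48)
for (band, period), years in cells.items():
    print(f"age {band}-{band + 10}, period {period}-{period + 10}: {years} years")
```

Summing such contributions over all individuals gives the denominator for the rate in each band-period; the event at the end of follow-up, if any, is counted in the last cell visited.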

For example, if age bands and calendar periods of 10 years' duration are used, Individual C joins the cohort in 1984 aged 30. He changes calendar period after 6 years, in 1990. Then he changes age band 4 years later, in 1994. Finally, he changes calendar period 6 years later, in 2000, and dies 2 years later. The Lexis diagram representation makes it easy to see when he changes from one band-period to another: this happens whenever his diagonal line crosses a horizontal or vertical line. Individual C contributes 6, 4, 6 and 2 years at risk respectively to four band-periods and a death to the last of these.

Variations on this simple Lexis diagram may be obtained by changing the timescales represented by the axes, by marking other symbols on the diagonal lines to represent other events of interest or by using colour to differentiate periods spent by an individual in different states. For further details see Keiding (1990), Goldman (1992) and Clayton and Hills (1993).

siis

Clayton, D. and Hills, M. 1993: Statistical models in epidemiology. Oxford: Oxford University Press. Goldman, A. I. 1992: EVENTCHARTS: visualizing survival and other timed-events data. American Statistician 46, 13-18. Keiding, N. 1990: Statistical inference in the Lexis diagram. Philosophical Transactions of the Royal Society of London, Series A 332, 487-509.

[Figure: Lexis diagram for four individuals. Horizontal axis: calendar year; vertical axis: age.]

[Figure: life expectancy. Survival curve; the shaded area corresponds to the area under the survival curve from age x onwards and is equal to e(x)S(x).]

life expectancy This popular summary measure is used in demography and epidemiology for assessing the current health of a population or for health comparisons between populations. For a specific population, the standard

definition of life expectancy at a given age is the average number of years an individual of that particular age has remaining if the age-specific mortality rates do not change in the future. These age-specific mortality rates can be obtained from the LIFE TABLES for that population, which may be stratified by variables known to be strongly associated with mortality, such as sex, calendar period and smoking status if available. The life table can be estimated either nonparametrically or parametrically (see, for example, Gaitatzis et al., 2004, and Tom and Farewell, 2009). In standard (actuarial) life table notation, the life expectancy at age x, $e_x$, is given by $e_x = T_x / l_x$, where $T_x$ is the number of person-years lived between the exact age x and extinction, and $l_x$ is the number of persons still alive at the exact age x. Thus, for example, the life expectancy at birth, $e_0$, in a given birth cohort or life table population is the average person-years lived from birth if the current age-specific mortality rates remain unchanged in the future. The continuous time mathematical representation of life expectancy (also known as the mean residual lifetime) of an individual known to have survived to an age x in terms of the current force of mortality (i.e. hazard function over age; see SURVIVAL ANALYSIS - AN OVERVIEW), $\lambda(u)$, is given by

$e(x) = \int_x^\infty \exp\left\{ -\int_x^u \lambda(v)\, dv \right\} du = \frac{1}{S(x)} \int_x^\infty S(u)\, du$

and is equivalent to the area under the survival curve, S(u), from age x onwards divided by the probability of surviving up to age x, S(x) (see the figure on page 242). Note that because the age-specific mortality rates are expected to change in the future, the life expectancy is not a measure of how long a specific individual of a given age in the population of interest is actually expected to live further. Although life expectancy is a long-standing and easily understood indicator of present population health, it has increasingly been seen as too crude for this purpose since it does not take into account the impact of chronic diseases and disability. Extensions of life expectancy to healthy life expectancy, disability-free life expectancy and, more generally, to life expectancies in various health states have been made and can be estimated through the fitting of MULTISTATE MODELS (see, for example, Buter et al., 2008). BT

Buter, T. C., van den Hout, A., Matthews, F. E., Larsen, J. P., Brayne, C. and Aarsland, D. 2008: Dementia and survival in Parkinson disease - a 12-year population study. Neurology 70, 1017-22. Gaitatzis, A., Johnson, A. L., Chadwick, D. W., Shorvon, S. D. and Sander, J. W. 2004: Life expectancy in people with newly diagnosed epilepsy. Brain 127, 2427-32. Tom, B. D. M. and Farewell, V. T. 2009: Statistical methods for individual-level data in cohort mortality studies of rheumatic diseases. Communications in Statistics - Theory and Methods 38, 3472-87.
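The continuous-time definition (area under the survival curve beyond age x, divided by the probability of surviving to age x) can be checked numerically. The sketch below integrates a user-supplied hazard function with the trapezium rule; the function name `life_expectancy`, the integration horizon and the step size are illustrative assumptions. A constant hazard is used as the example because its mean residual lifetime is known to be the reciprocal of the hazard.

```python
import math


def life_expectancy(hazard, x, horizon=400.0, step=0.01):
    """Approximate e(x): integral over u >= x of exp(-cumulative hazard from x to u)."""
    n_steps = int(round(horizon / step))
    cumulative_hazard = 0.0
    survival_prev = 1.0      # S(u)/S(x) at u = x
    area = 0.0
    u = float(x)
    for _ in range(n_steps):
        # trapezium rule for the cumulative hazard over [u, u + step]
        cumulative_hazard += 0.5 * (hazard(u) + hazard(u + step)) * step
        survival = math.exp(-cumulative_hazard)
        # trapezium rule for the area under the conditional survival curve
        area += 0.5 * (survival_prev + survival) * step
        survival_prev = survival
        u += step
    return area


# Hypothetical constant force of mortality of 0.05 per year from age 60
e_60 = life_expectancy(lambda age: 0.05, x=60)
print(round(e_60, 2))
```

Because the function works with the hazard from age x onwards, the division by S(x) is implicit: the conditional survival curve starts at 1 at age x.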

life tables Life tables are models that conveniently summarise the level of mortality in a population of interest. Their best-known function, life expectancy, has a ready intuitive meaning. Life table functions are independent of the age structure of the population whose mortality level they are used to summarise. Period life tables are used to summarise the mortality experience during a given period, e.g. a calendar year. Cohort life tables summarise the experience of a defined cohort as it ages through calendar time. For the necessary mortality observations to be available to construct them, the relevant cohort has to be at least towards the end of its lifespan. Full life tables have one row for each year of life, usually to age 110 (see the figure on page 244). Abridged life tables typically have one row for each 5 years of life except that there are usually separate rows for ages 0 to 1 and 1 to 5.

Constructing life tables. There are two main steps to building a life table. First, there is the 'preliminary computation', in which the observed age and sex-specific death rates during the period of interest are converted into corresponding risks of death between two exact ages. Suppose, for example, that the observed central death rate in the population of interest for persons aged 40 to 44 last birthday is 0.003404. (This is $_5M_{40}$ in life table notation when referring to observations made in the population of interest, and is conventionally taken to be an unbiased estimator of the corresponding life table function $_5m_{40}$.) The risk of death between exact age 40 and exact age 45 is given by

$_5q_{40} = \frac{5 \times 0.003404}{1 + (1 - 0.59) \times 5 \times 0.003404} = 0.01690$

where $_5a_{40}$ (here 0.59) is the fraction of the age interval lived, on average, by those who die during it. The risk of death across the age interval is close, but not equal to, the central death rate. (In this case, the central death rate times 5 - to take account of the age interval width - equals 0.01702, slightly greater than the risk.)

Second, there is the computation of the life table proper (see the figure). In constructing the life table an initial hypothetical cohort of 100000 ($l_0$, known as the radix) is subjected across each successive age interval to the calculated risks of death. Therefore, starting at birth (x = 0), 100000 are exposed to the risk of death before exact age 1, i.e. $_1q_0$, which in this example equals 0.02006. This results in 2006 deaths in the interval ($_1d_0$). The person-time lived in the interval ($_1L_0$) is 1 year for all who survived it ($l_1 = 97994$) plus the time lived by those who died in the interval - which equals $_1d_0 \times {}_1a_0$ (the fraction of the interval lived by those who died within it) or 2006 × 0.129, which equals 98252, as shown, when added to $l_1$. (For economy, $_na_x$ is not shown in the table: it is 0.129 for the first year of life and approximately 0.5 thereafter.) The $_nL_x$ column is calculated in this way, one row at a time, to the end of the lifespan. A special procedure is then needed for closing the last open-ended interval - representing, in the table shown, the person-time lived beyond 109 ($_\infty L_{109}$). This is estimated by $l_{109} / {}_\infty M_{109}$. (The justification for this is that the remaining survival time is taken to be distributed exponentially with a mean of $1 / {}_\infty M_{109}$.) $T_x$ is then summed back from the end of the lifespan, beginning, in this example, with $T_{109}$, which equals $_\infty L_{109}$. Moving up one row, $T_{108}$ then equals $T_{109} + {}_1L_{108}$ and so on back to $T_0$, where $T_0$ represents all the person-time lived by the 100000 who set out, so the average person-time lived, or life expectancy $e_0$, is $T_0 / l_0$ and more generally, for any age x, $e_x = T_x / l_x$.

Life table functions and notation. x is exact age x. n refers to the width of the age interval being considered; in a full life table where n = 1 it may be omitted. $e_x$ is life expectancy at exact age x. $_nm_x$ is a central death rate for persons aged between x and x + n in the hypothetical life table population; it is estimated by $_nM_x$. $_nM_x$ is the observed central death rate in the population of interest. $l_x$ is the number of persons still alive at exact age x. $_nq_x$ is the risk (probability) of death between ages x and x + n. $_np_x$ is the probability of surviving from exact age x to x + n (equals $1 - {}_nq_x$). $_na_x$ is the average fraction of the interval lived by those who die between x and x + n. $_nL_x$ is the number of person-years lived between exact ages x and x + n. $T_x$ is the number of person-years lived between exact age x and the extinction of the hypothetical cohort.

[Figure: life tables. Extract of first 6 and last 10 rows of a full life table for US white males in 1970. Columns: age interval (period of life between two exact ages, x to x + n); width of age interval (n); proportion of persons alive at the beginning of the age interval dying during the interval ($_nq_x$); of 100000 born alive, number living at the beginning of the age interval ($l_x$) and number dying during the age interval ($_nd_x$); stationary population, number of person-years lived in the age interval ($_nL_x$) and in this and all subsequent age intervals ($T_x$); average number of years of life remaining at the beginning of the age interval ($e_x$). Note: $_na_x$ is 0.129 for the first year of life and approximately 0.5 for all subsequent years.]

Life table populations can be interpreted in two ways: (a) as fully hypothetical constructs or models in which 100000 individuals are imagined, as it were, to be born in the same instant and then instantaneously subjected to the relevant risks of death throughout their hypothetical lifespans; or (b) as representing stationary populations experiencing constant, and equal, birth and death rates. In this latter interpretation, $_nL_x$ gives the expected number of individuals aged x to x + n and $T_0$ gives the total population size. In such stationary populations there are $l_0$ births each year, so the birth rate = $l_0 / T_0$ - the inverse of the life expectancy. Thus:

Crude birth rate = Crude death rate = $1 / e_0$

Uses of the life table. The $l_x$ and $q_x$ columns have many uses in summarising mortality risks in populations of interest. Thus infant mortality, conceptually defined as the risk of death before the first birthday, is $_1q_0$. The under 5 mortality 'rate' (actually the risk of death before age 5) is $1 - l_5 / l_0$. The adult mortality 'rate' ($_{45}q_{15}$, or the probability of death between 15 and 60) is $1 - l_{60} / l_{15}$. Similarly, the probability of survival between any two ages i and j is given by $l_j / l_i$. Life tables have long been used to provide comparable summaries of mortality risks in populations. They are also serving as the basis of newer 'summary measures of average population health', which combine information on both the risks of premature death and of nonfatal illness and injury. Such summary measures may be either 'health expectancies' (such as 'health adjusted life expectancy') or 'health gaps' (such as DALYs (disability adjusted life years) lost).
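The construction steps and the summary measures above can be sketched in a few lines. This is a minimal illustration for single-year intervals only, under stated assumptions: the function names `rate_to_risk` and `build_life_table` are hypothetical, the table is truncated after one year of age for brevity, and the closing rate of 1/68 per year for the open-ended interval is an invented value, not one taken from the entry. The values 0.003404, 0.59, 0.02006 and 0.129 are those quoted in the text.

```python
def rate_to_risk(n, central_rate, a):
    """Convert a central death rate over an n-year interval into a risk of death."""
    return n * central_rate / (1 + (1 - a) * n * central_rate)


def build_life_table(q, a, closing_rate, radix=100000.0):
    """Single-year life table from risks q and fractions a, closed by l / closing_rate."""
    l = [radix]          # survivors at each exact age
    d, L = [], []        # deaths and person-years lived in each interval
    for q_x, a_x in zip(q, a):
        deaths = l[-1] * q_x
        survivors = l[-1] - deaths
        d.append(deaths)
        L.append(survivors + a_x * deaths)   # 1 year each for survivors, a_x for deaths
        l.append(survivors)
    L.append(l[-1] / closing_rate)           # open-ended final interval
    T, running = [], 0.0
    for person_years in reversed(L):         # sum person-years back from the last row
        running += person_years
        T.append(running)
    T.reverse()
    e = [t_x / l_x for t_x, l_x in zip(T, l)]
    return {"l": l, "d": d, "L": L, "T": T, "e": e}


risk_40_45 = rate_to_risk(5, 0.003404, 0.59)
table = build_life_table(q=[0.02006], a=[0.129], closing_rate=1 / 68.0)
infant_mortality = 1 - table["l"][1] / table["l"][0]
print(round(risk_40_45, 5), round(table["L"][0]), round(infant_mortality, 5))
```

With a full set of risks, the same dictionary gives the other summaries directly, for example `1 - table["l"][5] / table["l"][0]` for the under 5 mortality risk.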

The $d_x$ and $l_x$ functions when plotted for a given population at successive time intervals show how the distribution of age at death changes as life expectancy has risen. One interpretation of recent trends in low mortality countries is that the rise in $e_0$ has been disproportionately due to a reduction in [...] adult deaths. As this process continues, a larger and larger proportion of each generation survives until closer to the maximum lifespan. The distribution of deaths by age at death becomes concentrated at a high age - manifest as a reduced dispersion in the distribution of deaths by age ($d_x$) in the life table. The corresponding shift in pattern for the survivorship ($l_x / l_0$) function is for it to remain high until closer to the maximum lifespan and then fall sharply - a process described as the 'rectangularisation of the survival curve'. This 'optimistic' interpretation of recent trends is taken to imply that there has been no extension of the average duration of ill-health in the period immediately before death. For further details see Elandt-Johnson and Johnson (1980), Preston, Heuveline and Guillot (2001), Lopez et al. (2003) and Peeters et al. (2003). JP

Elandt-Johnson, R. C. and Johnson, N. L. 1980: Survival models and data analysis. New York: John Wiley & Sons, Inc. Lopez, A. D., Ahmad, O. B., Guillot, M., Ferguson, B. D., Salomon, J. A., Murray, C. J. L. and Hill, K. H. 2003: Life tables for 191 countries for 2000: data, methods, results. In Murray, C. J. L. and Evans, D. B. (eds), Health systems performance assessment: debates, methods and empiricism. Geneva: World Health Organization, pp. 335-53. Peeters, A., Barendregt, J. J., Willekens, F., Mackenbach, J. P., Al Mamun, A. and Bonneux, L. 2003: Obesity in adulthood and its consequences for life expectancy: a life-table analysis. Annals of Internal Medicine 138, 24-32. Preston, S. H., Heuveline, P. and Guillot, M. 2001: Demography: measuring and modeling population processes. Oxford: Blackwell.

likelihood The likelihood function plays two roles in statistics. First, in its own right it provides a means for estimating unknown parameters by finding the value of the unknown parameter(s) that maximises it (maximum likelihood) as well as for comparing hypotheses (LIKELIHOOD RATIO). Second, it has a role in Bayesian statistics (see BAYESIAN METHODS).

Suppose interest lies in learning about the response of patients suffering from influenza symptoms to a new treatment. Data are collected from 10 patients, of whom six respond positively. What can be said about the unknown probability of a positive response $\pi$? By definition, the probability of a positive response is $\pi$ and, of a negative response, $1 - \pi$. Suppose that we have observed six positive responses and four negative responses in that order. The likelihood of this happening is $\pi^6 (1 - \pi)^4$. In practice, the order of observation of the responses is arbitrary and we could account for this by multiplying the likelihood we have calculated by the number of ways six positive responses and four negative responses could occur. If this is done the likelihood becomes:

$\frac{6!}{4!\,2!} \pi^6 (1 - \pi)^4 = 15 \pi^6 (1 - \pi)^4$

which corresponds to a probability from a BINOMIAL DISTRIBUTION. For different values of $\pi$ we can determine the likelihood and on this basis find the most likely value. For example, if $\pi$ has the value 0.1 the likelihood is $15 \times 0.1^6 \times 0.9^4 = 0.0000098$ and for the values of $\pi$ = 0.3, 0.5, 0.7 and 0.9 the corresponding likelihood values are 0.00263, 0.0146, 0.0143 and 0.00080 respectively. Therefore, of these values, 0.5 is the most likely. In fact, we can plot the likelihood values for all potential values of $\pi$ and choose the value that gives the maximum, as in the first figure. From the figure, we can conclude that the most likely value for $\pi$ is 0.6, as it gives the largest likelihood value. This value is the maximum likelihood estimator.

[Figure: likelihood. Likelihood function based on six positive responses out of 10. Horizontal axis: probability of positive response ($\pi$); vertical axis: likelihood.]

The same approach can be used for other types of data. For example, Altman (1991) gives the following data on the daily energy intake (kJ) of 11 healthy women: 5260, 5470, 5640, 6180, 6390, 6515, 6805, 7515, 7515, 8230, 8770. Assuming that these data arise from a NORMAL DISTRIBUTION with a common MEAN denoted $\mu$ and known STANDARD DEVIATION 1100 we can determine the likelihood as a function of the unknown $\mu$ and plot it as before. The second figure illustrates this, in which the maximum likelihood occurs at the value 6754.

In Bayesian statistics, the likelihood works to modify the PRIOR DISTRIBUTION to yield the POSTERIOR DISTRIBUTION and represents the information contained in the experiment about the parameter of interest. In a formal way, Bayesian analysis proceeds by calculating:

$p(\theta \mid \text{Data}) \propto p(\text{Data} \mid \theta)\, p(\theta)$

in which $p(\theta)$ is the prior distribution expressing initial beliefs in the parameter of interest, $\theta$, $p(\theta \mid \text{Data})$ is the corresponding posterior distribution of beliefs and $p(\text{Data} \mid \theta)$ is the likelihood. If there is great prior uncertainty about the parameter of interest so that the prior distribution is essentially flat relative to the region in which the likelihood is peaked, then it has little impact on modifying prior beliefs. In such circumstances, the posterior distribution is essentially proportional to the likelihood so that posterior beliefs about the parameter are dictated by the location and shape of the likelihood. In particular, the posterior mode, the value believed to be the most likely after collecting data, is essentially equivalent to the value that maximises the likelihood, i.e. the MAXIMUM LIKELIHOOD ESTIMATION. AG

Altman, D. G. 1991: Practical statistics for medical research. London: Chapman & Hall.
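The two worked examples in this entry can be reproduced with a simple grid search. The sketch below is illustrative only: the function names, the grids and the default arguments are assumptions. The constant 15 is the multiplier printed in the entry; for ten patients the binomial coefficient would be 10!/(6!4!) = 210, but any constant that does not involve the parameter leaves the location of the maximum unchanged.

```python
import math


def binomial_likelihood(pi, positives=6, negatives=4, constant=15):
    """Likelihood of pi for six positive and four negative responses."""
    return constant * pi ** positives * (1 - pi) ** negatives


# Evaluate the likelihood over a fine grid of candidate values and keep the largest
pi_grid = [i / 1000 for i in range(1001)]
pi_hat = max(pi_grid, key=binomial_likelihood)

# Daily energy intake (kJ) of 11 healthy women, as listed in the entry
intakes = [5260, 5470, 5640, 6180, 6390, 6515, 6805, 7515, 7515, 8230, 8770]


def normal_log_likelihood(mu, data=intakes, sd=1100.0):
    """Log-likelihood of the mean mu for normal data with known standard deviation."""
    return sum(
        -0.5 * math.log(2 * math.pi * sd ** 2) - (y - mu) ** 2 / (2 * sd ** 2)
        for y in data
    )


mu_hat = max(range(5000, 9001), key=normal_log_likelihood)
print(pi_hat, mu_hat)
```

Working with the log-likelihood for the normal example avoids multiplying eleven very small densities together; the maximising value is the same because the logarithm is monotonic.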

[Figure: likelihood. Likelihood function for the mean daily energy intake based on a sample of 11 values. Horizontal axis: mean daily energy intake (kJ); vertical axis: likelihood.]

likelihood ratio The likelihood ratio provides a method for comparing competing hypotheses based on the LIKELIHOOD calculated from experimental data. It also plays a role in Bayesian hypothesis testing (see BAYESIAN METHODS). Suppose interest lies in learning about the response of patients suffering from influenza symptoms to a new treatment. Data are collected from 10 patients, of whom six respond positively. What can be said about the competing hypotheses $H_1$: $\pi = 0.3$ and $H_2$: $\pi = 0.7$? The likelihood of obtaining six positive results and four negative results is:

$\frac{6!}{4!\,2!} \pi^6 (1 - \pi)^4 = 15 \pi^6 (1 - \pi)^4$

and this can be determined for the competing values of $\pi$ under the pair of hypotheses. For hypothesis $H_1$ the value is 0.00263, while that for $H_2$ is 0.0143. The ratio of these values is 5.44, the likelihood ratio of $H_2$ against $H_1$, indicating, in this instance, that hypothesis $H_2$ is almost 5½ times as likely as $H_1$, which is strong evidence in favour of $H_2$ rather than $H_1$. The Bayesian equivalent to this form of hypothesis testing is based on determining the ratio of the posterior probabilities of the hypotheses. Formally, we calculate:

$p(H_i \mid \text{Data}) \propto p(\text{Data} \mid H_i)\, p(H_i), \quad i = 1, 2$

in which $p(H_i)$ is the prior probability of hypothesis $H_i$, expressing initial beliefs in its veracity, $p(H_i \mid \text{Data})$ is the corresponding posterior probability and $p(\text{Data} \mid H_i)$ is the likelihood of the hypothesis giving rise to the data. By taking the ratio of the two expressions just given, the ratio of the posterior probabilities of the two hypotheses can be expressed as:

$\frac{p(H_2 \mid \text{Data})}{p(H_1 \mid \text{Data})} = \frac{p(\text{Data} \mid H_2)}{p(\text{Data} \mid H_1)} \times \frac{p(H_2)}{p(H_1)}$

The left-hand side of this expression is the posterior odds ratio, the first term on the right-hand side is the likelihood ratio and the second term is the prior odds ratio. This form of Bayesian analysis is familiar in diagnostic testing. In that context the likelihood ratio is expressed as:

$\text{Likelihood ratio} = \frac{\text{Probability(positive test result} \mid \text{disease)}}{\text{Probability(positive test result} \mid \text{no disease)}} = \frac{\text{sensitivity}}{1 - \text{specificity}}$

AG

Altman, D. G. 1991: Practical statistics for medical research. London: Chapman & Hall.
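The calculations in this entry can be sketched as follows. The function names are illustrative assumptions, and the equal prior probabilities and the sensitivity and specificity of 0.9 and 0.8 are hypothetical values chosen only to show the arithmetic. The multiplying constant in the likelihood cancels in the ratio, so it is omitted.

```python
def likelihood(pi, positives=6, negatives=4):
    """Likelihood of pi, up to a constant that cancels in any ratio."""
    return pi ** positives * (1 - pi) ** negatives


def posterior_odds(likelihood_ratio, prior_h2, prior_h1):
    """Posterior odds of H2 against H1 = likelihood ratio x prior odds."""
    return likelihood_ratio * (prior_h2 / prior_h1)


def diagnostic_likelihood_ratio(sensitivity, specificity):
    """Likelihood ratio of a positive test result."""
    return sensitivity / (1 - specificity)


lr = likelihood(0.7) / likelihood(0.3)            # H2: pi = 0.7 against H1: pi = 0.3
odds = posterior_odds(lr, prior_h2=0.5, prior_h1=0.5)
posterior_h2 = odds / (1 + odds)                  # convert odds back to a probability
lr_positive = diagnostic_likelihood_ratio(sensitivity=0.9, specificity=0.8)
print(round(lr, 2), round(posterior_h2, 3), round(lr_positive, 2))
```

With equal prior probabilities the posterior odds equal the likelihood ratio, which is why the likelihood ratio alone is often read as the weight of evidence supplied by the data.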

Likert scales

These scales are used to measure the extent to which an individual agrees with a statement. A Likert scale typically has five levels, ranging from 'strongly disagree' to 'strongly agree'. One common alternative is to use an even number of options in order to avoid having a 'neutral' option. A typical Likert scale questionnaire item with five levels is the following: In a proposed study of mild asthma, it is ethically acceptable to give some participants a placebo treatment.

• strongly disagree
• disagree
• neither agree nor disagree
• agree
• strongly agree

The data from a Likert scale are often coded as a number (e.g. 1 to 5) and it is typical for responses from multiple items to be summed or averaged to provide an overall score related to an underlying issue or LATENT VARIABLE. When there are multiple items, it is recommended that the order of responses be reversed for some items, to help prevent subjects falling into a simple pattern of responses (e.g. always select 'strongly agree'). The data from one or more Likert scale items are often analysed as interval data. For this to be a justifiable approach, it is important to test and develop the items properly, perhaps using a pilot study and test for validation and reliability (see MEASUREMENT PRECISION AND RELIABILITY). For further details see DeVellis (1991) and Streiner and Norman (1995). PM [See also FACTOR ANALYSIS]

DeVellis, R. F. 1991: Scale development: theory and applications. London: Sage. Streiner, D. L. and Norman, G. R. 1995: Health measurement scales: a practical guide to their development and use, 2nd edition. Oxford: Oxford University Press.
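Scoring a multi-item Likert instrument in which some items are presented in reversed order can be sketched as below. The item names, the responses and the function name `score_likert` are hypothetical; the only logic assumed is that a reversed item coded 1 to 5 is mapped back with 6 minus the recorded value before summing.

```python
def score_likert(responses, reversed_items, n_levels=5):
    """Sum item codes (1..n_levels), reverse-coding the items listed in reversed_items."""
    total = 0
    for item, value in responses.items():
        if not 1 <= value <= n_levels:
            raise ValueError(f"response to {item!r} is outside 1..{n_levels}")
        # a reversed item is mapped back so that a high code always means agreement
        total += (n_levels + 1 - value) if item in reversed_items else value
    return total


# Hypothetical responses from one participant to three items
responses = {"item_1": 5, "item_2": 1, "item_3": 4}
total_score = score_likert(responses, reversed_items={"item_2"})
mean_score = total_score / len(responses)
print(total_score, round(mean_score, 2))
```

Averaging rather than summing keeps the overall score on the original 1 to 5 scale, which can make scores comparable when participants answer different numbers of items.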

limits of agreement

This approach was developed by Bland and Altman (1986) to assess AGREEMENT in method comparison studies. Based on both graphical techniques and straightforward statistical calculations, it is simple to apply and interpret. It quantifies agreement between two methods through the mean difference (i.e. the estimate of the systematic BIAS of one method relative to the other) and the standard [...]

[...] includes random effects for patients to acknowledge that the response tends to differ between patients and that repeated measurements taken on the same patient are therefore alike. This model is written as:

$y_{ij} = (\alpha + u_i) + \beta t_{ij} + \gamma x_i + e_{ij}$

where the $u_i$ are random patient effects and the $e_{ij}$ are random residual effects, which represent the variability between measurements within patients. The parameters $\alpha$, $\beta$ and $\gamma$ are fixed effects that represent, respectively, the overall mean response in the control group (where $x_i = 0$), the trend in response over time and the treatment effect, which is constant over time. The two sets of random effects $u_i$ and $e_{ij}$ are independent and it is usual to assume these to be normally distributed, $u_i \sim N(0, \sigma_u^2)$ and $e_{ij} \sim N(0, \sigma_e^2)$. This basic mixed-effects model for the data can be extended in a number of interesting ways. For example, we could allow the trend in response over time to vary from one patient to another, we could include additional explanatory variables or we could allow the effect of treatment to vary over time. For a range of extended mixed-effects models and guidance on their interpretation, readers are referred to the full entry on this subject area, titled MULTILEVEL MODELS (see also Everitt and Pickles, 2004; Pinheiro and Bates, 2000). This entry provides details of methods and software for estimation of mixed-effects models and also covers topics such as handling of missing data and complex applications. RT [See also GENERALISED ESTIMATING EQUATIONS, MULTILEVEL MODELS]

Everitt, B. S. and Pickles, A. 2004: Statistical aspects of the design and analysis of clinical trials, 2nd edition. London: Imperial College Press. Pinheiro, J. C. and Bates, D. M. 2000: Mixed-effects models in S and S-PLUS. New York: Springer.
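One way to see what the random patient effect does is to simulate data from the model. The sketch below generates responses with a patient-specific intercept, a common time trend and a constant treatment effect; it is an illustration under assumptions, not an estimation method. The function name `simulate_mixed_model`, the alternating allocation of patients to control and treatment, and all parameter values are hypothetical.

```python
import random


def simulate_mixed_model(n_patients, times, alpha, beta, gamma, sd_u, sd_e, seed=0):
    """Simulate rows (patient, time, treatment, response) from a random-intercept model."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_patients):
        x_i = i % 2                      # assumed allocation: alternate control (0) / treated (1)
        u_i = rng.gauss(0.0, sd_u)       # random patient effect, shared by all of i's measurements
        for t_ij in times:
            e_ij = rng.gauss(0.0, sd_e)  # residual effect, new for every measurement
            y_ij = (alpha + u_i) + beta * t_ij + gamma * x_i + e_ij
            rows.append((i, t_ij, x_i, y_ij))
    return rows


# With both standard deviations set to zero only the fixed effects remain
fixed_only = simulate_mixed_model(2, [0, 1], alpha=10, beta=2, gamma=3, sd_u=0, sd_e=0)
noisy = simulate_mixed_model(20, [0, 1, 2, 3], alpha=10, beta=2, gamma=3, sd_u=4, sd_e=1)
print([row[3] for row in fixed_only], len(noisy))
```

Because each patient's measurements share the same random effect, responses within a patient are more alike than responses from different patients whenever `sd_u` is greater than zero.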

linear regression See MULTIPLE LINEAR REGRESSION

linkage disequilibrium See ALLELIC ASSOCIATION

LISREL See STRUCTURAL EQUATION MODELLING SOFTWARE

LMS method

This is a method used for constructing AGE-RELATED REFERENCE RANGES, typically applied to GROWTH CHARTS. The underlying age-specific FREQUENCY DISTRIBUTION of the measurement (typically anthropometry such as height or weight, though it can be applied to any ratio scale measurement) is summarised by three age-varying parameters representing the first three moments of the distribution. The first is the MEDIAN as a MEASURE OF LOCATION, the second is the COEFFICIENT OF VARIATION or CV as a MEASURE OF SPREAD or scale and the third is the Box-Cox power transformation (see TRANSFORMATIONS) needed to adjust for SKEWNESS, as a measure of shape. KURTOSIS is assumed to match that of the NORMAL DISTRIBUTION, and is not estimated as such. Adjusting for skewness ensures a symmetric distribution, so that the MEAN on the transformed scale is also the median on the original scale. The CV is estimated rather than the STANDARD DEVIATION (SD), as the SD often increases with age in proportion to the mean, whereas the CV is relatively uncorrelated with age. The original publication describing the LMS method used the notation $\lambda$ for the Box-Cox POWER, $\mu$ for the median and $\sigma$ for the CV - hence the LMS method. The three age-related curves are referred to as the L curve, M curve and S curve respectively, and together they allow any required QUANTILE of the distribution to be constructed as a smooth function of age. Equally they allow individual measurements to be expressed as a standardised residual or z-SCORE adjusted for skewness. See GROWTH CHARTS for an example of a centile chart constructed using the LMS method.

The LMS method is a semi-parametric regression model where the three parameters of the distribution are estimated as generalized additive cubic smoothing spline curves (see SCATTERPLOT SMOOTHERS) using penalized LIKELIHOOD. The only analytical decision to make when fitting the model is to specify the number of equivalent degrees of freedom (edf) required for each of the three smoothing spline curves, so that they are neither under- nor oversmoothed. Criteria such as AKAIKE'S INFORMATION CRITERION or the Bayesian information criterion (Everitt and Skrondal, 2010) are useful here to trade off improved fit against increased model complexity. For infant anthropometry the M curve is often steep at birth and progressively shallower with increasing age. Transforming the age scale can help here, e.g. with a square root transformation to stretch younger ages and shrink older ages, as this tends to linearise the M curve, simplify the curve shape and improve the fit. The LMS method is a special case of the family of GENERALIZED ADDITIVE MODELS for location, scale and shape (GAMLSS). These are powerful models that can be applied to many different distributions, where up to four moments of the distribution are estimated as separate generalised additive regression models. For further details see Cole (1988), Cole and Green (1992) and Rigby and Stasinopoulos (2005). TJC

Cole, T. J. 1988: Fitting smoothed centile curves to reference data (with discussion). Journal of the Royal Statistical Society, Series A 151, 385-418. Cole, T. J. and Green, P. J. 1992: Smoothing reference centile curves: the LMS method and penalized likelihood. Statistics in Medicine 11, 1305-19. Everitt, B. S. and Skrondal, A. 2010: Cambridge dictionary of statistics, 4th edition. Cambridge: Cambridge University Press. Rigby, R. A. and Stasinopoulos, D. M. 2005: Generalized additive models for location, scale and shape (with discussion). Applied Statistics 54, 507-54.
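Once the L, M and S values at a given age are known, converting a measurement to a z-score and a z-score back to a centile value is a short calculation. The entry does not print the formulae; the sketch below uses the Box-Cox form usually quoted for the LMS method, which should be checked against Cole and Green (1992) before use. The function names and the numerical values of L, M and S are hypothetical.

```python
import math


def lms_z_score(y, L, M, S):
    """z-score of measurement y given Box-Cox power L, median M and CV S."""
    if L == 0:
        return math.log(y / M) / S          # limiting case of the power transformation
    return ((y / M) ** L - 1) / (L * S)


def lms_centile_value(z, L, M, S):
    """Measurement corresponding to z-score z (inverse of lms_z_score)."""
    if L == 0:
        return M * math.exp(S * z)
    return M * (1 + L * S * z) ** (1 / L)


# Hypothetical parameter values at one age
L, M, S = -0.5, 10.0, 0.1
z = lms_z_score(12.0, L, M, S)
upper = lms_centile_value(1.645, L, M, S)   # value at roughly the 95th centile
print(round(z, 3), round(upper, 2))
```

A measurement equal to the median always has a z-score of zero, whatever the values of L and S, which is a quick check on any implementation.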

local research ethics committee (LREC) See E11IICAL RE\'JEW cmalllTEES

locally weighted regression See

SCA11UIWI"

SMOOrHEIS

loess

See SCATIDPLOI' st.IDOI1IERS

logistic discrimination See DISCRIMINANT FUNCTION ANALYSIS

logistic regression A form of regression analysis to be used when the response is a binary variable. Medical outcomes often have only two possibilities. Whether a patient is dead or alive is the most obvious, but the presence or absence of particular diagnoses, symptoms or signs are also examples of binary or dichotomous variables. Hypertension, obesity and airways obstruction are diagnoses that result from observing that a particular measurement is above or below a particular value, thus creating a binary outcome from a continuous measurement.

Methods for the analysis of binary data differ from those for continuous variables. First, the summary statistic to describe the results is a proportion or percentage of individuals who are dead (or alive), have the symptom or in general have the designated outcome. Data from a continuous variable are summarised by the MEAN and STANDARD DEVIATION or MEDIAN and INTERQUARTILE RANGE, as how variable the values are is required as well as a typical value, while for binary data the proportion or percentage tells us everything. Second, when we analyse a binary outcome in relation to EXPLANATORY VARIABLES we cannot use STUDENT'S t-TEST, ANALYSIS OF VARIANCE or MULTIPLE LINEAR REGRESSION, as the data are not normally distributed, do not have the same VARIANCE for groups with different outcome proportions and predictions of proportions must not fall outside the range zero to one, which can happen if multiple linear regression of a proportion is used.

Binary data can be analysed in relation to a single categorical explanatory variable using the CHI-SQUARE TEST, but very frequently it is necessary to analyse a binary outcome in relation to several explanatory variables, some or all of which may be continuous. For example, in a study that investigated whether reported wheeze is related to the use of gas for cooking, it would be desirable to take age and gender into account, and also conditions in the home, such as an extractor fan, that might affect the concentration of the combustion product thought to be responsible for any increase in symptoms. Alternatively, we might want to analyse wheeze in relation to the use of gas for cooking and passive smoking simultaneously. To analyse binary data and adjust the relation to the factor of primary interest for confounding variables or to determine to which of several potential explanatory variables the outcome is related, an analogue of analysis of variance and multiple regression is required. Logistic regression meets these requirements. An explanation of the method is easiest in relation to an example.

Logistic regression has been used to describe the distribution of age at menarche in girls and the factors associated with early or delayed menarche. Roberts, Rozner and Swan (1971) carried out a cross-sectional survey of girls in South Shields, County Durham, in 1967. Data are shown in the first table.

logistic regression Number of girls and number recorded as menstruating, by age group

Age group    No. of girls    No. menstruating    % menstruating
11-<12             82                  4                4.9
12-<13            304                 76               25.0
13-<14            366                171               46.7
14-<15            351                285               81.2
15-<16            216                209               96.8

The percentage of girls who had reached menarche, of course, increased with age, being very low in the youngest age group and very high in the oldest. Had younger age groups been included the percentage menstruating would have been less than 4.9% and 0% if sufficiently low. Similarly, the percentage would have been close to 100% in older age groups. The relation of proportion or percentage menstruating to age can be described by an S-shaped (or 'sigmoid') curve. The data from the first table are plotted together with a fitted smooth sigmoid curve in the figure.

logistic regression Cumulative logistic curve [figure: proportion menstruating, from 0 to 1, plotted against age, with the fitted curve]

The curve shown is a cumulative logistic curve, selected from the family of such curves so that it best describes the data. Its formula is:

$$\pi = \frac{1}{1 + e^{-(\alpha + \beta x)}}$$

where $\pi$ is the proportion menstruating by age $x$ and $\alpha$ and $\beta$ are the parameters that describe the best fitting curve. These parameters were estimated to be -18.40 and 1.37 respectively to fit the curve shown. Median age at menarche is calculated at $\pi = 0.5$, i.e. where $-(\alpha + \beta x) = 0$ or $x = -\alpha/\beta$, so was 13.4 years from these data. As the logistic distribution is symmetric this is an estimate of mean age at menarche. The equation defining $\pi$ can be rewritten:

$$\log_e\left[\frac{\pi}{1 - \pi}\right] = \alpha + \beta x$$
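As a minimal sketch of the arithmetic just described, the fitted curve can be evaluated with the two parameter estimates quoted in the text. The function names and the age-group mid-points used for printing are illustrative assumptions, not part of the original analysis.

```python
import math

# Parameter estimates quoted in the text for the age-at-menarche curve.
ALPHA = -18.40
BETA = 1.37


def logistic_curve(x, alpha=ALPHA, beta=BETA):
    """Proportion predicted at age x: 1 / (1 + exp(-(alpha + beta * x)))."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))


def logit(p):
    """Logistic transformation log_e(p / (1 - p))."""
    return math.log(p / (1.0 - p))


# Median age: the age at which alpha + beta * x = 0, i.e. the proportion is 0.5.
median_age = -ALPHA / BETA

if __name__ == "__main__":
    print(f"median age at menarche: {median_age:.1f} years")
    # Mid-points of the one-year age groups (an assumption made for display only).
    for age in (11.5, 12.5, 13.5, 14.5, 15.5):
        print(age, round(logistic_curve(age), 3))
```

Applying `logit` to the output of `logistic_curve` recovers the straight line $\alpha + \beta x$, which is the sense in which the transformation linearises the relation with age.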

The left-hand side of this equation is known as the logistic transformation of the proportion $\pi$. Its effect is to stretch the scale, so that the transformed variable can take values from minus infinity ($-\infty$), corresponding to $\pi = 0$, to plus infinity ($+\infty$), corresponding to $\pi = 1$, and also to linearise the relation with age. Fitting the logistic curve can therefore be achieved by least squares regression of the logistic transform of $\pi$ on $x$ (see LEAST SQUARES ESTIMATION), except that the transformation does not achieve homogeneity of variance and so iteratively weighted least squares regression is required. However, most modern computer programs use MAXIMUM LIKELIHOOD ESTIMATION, which also requires iteration, and individual rather than grouped data are usually analysed. Full specification of the binary outcome $y$ for individuals requires that $y$ is distributed as a binomial distribution with parameter 1 (here also known as a Bernoulli distribution) and success probability $\pi$.

The logistic curve is not the only curve that could be fitted to describe the data. A fitted cumulative NORMAL DISTRIBUTION would be almost indistinguishable from the logistic curve. Fitting a normal distribution before the days of electronic computing was known as probit analysis (see PROBIT MODELS), a probit being a normal equivalent deviate with 5 added to avoid negative numbers and was developed for use in pharmacology (Finney, 1971). The distribution of the dose of a toxic substance required to kill a given strain of animal is known as the tolerance distribution. It cannot usually be observed directly, but if groups of animals are given different doses of the drug and the proportions dying are recorded, then a sigmoid curve of proportion with dose is observed that describes the cumulative tolerance distribution. Finney (1964) ascribed the logistic transformation to Berkson and showed the close agreement of the normal and logistic distributions, but favoured the normal distribution to describe the tolerance distribution of drug toxicity. Hence, in general, the normal or probit transformation was used when there was an underlying tolerance distribution. An exception was age at menarche; it became accepted that the logistic transformation should be used (Finney, 1971) as one study apparently found a better fit of the logistic transformation than of the normal distribution.

Just as linear regression can be extended to multiple regression and also incorporate categorical explanatory variables, so can logistic regression. A multiple logistic regression equation can be written:

$$\log_e\left[\frac{\pi}{1 - \pi}\right] = \alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \cdots$$

where the $x_i$ can be continuous or dummy variables to indicate categories of groups. For example, Roberts, Danskin and Chinn (1975) analysed age at menarche in relation to family size, in categories of one, two, three, four and at least five children. In the model shown graphically in the paper, $x_1$ was age, and four dummy variables $x_2$ to $x_5$ were used to describe the differences in median age at menarche between the five family size groups, corresponding to fitting parallel sigmoid curves. The estimates $b_i$ of the $\beta_i$ are known as logistic regression coefficients.

To return to our first example: if the presence or absence of wheeze is the outcome and the presence or absence of a gas cooker the independent or explanatory variable, with no other factors considered for the moment, then if the dummy (indicator) variable $x$ is 0 for absence and 1 for presence of a gas cooker, then we have:

$$\log_e\left[\frac{\pi_{\text{gas}}}{1 - \pi_{\text{gas}}}\right] - \log_e\left[\frac{\pi_{\text{no gas}}}{1 - \pi_{\text{no gas}}}\right] = \beta$$

Each of the terms within brackets is an odds and a difference in log odds, when antilogged, is an ODDS RATIO. Antilogging both sides of the equation gives:

$$\frac{\pi_{\text{gas}}(1 - \pi_{\text{no gas}})}{\pi_{\text{no gas}}(1 - \pi_{\text{gas}})} = e^{\beta}$$

The odds ratio on the left-hand side of this equation is the odds of having wheeze in the presence of a gas cooker divided by the odds of having wheeze if no gas cooker is present. It will approximate the relative risk, or risk ratio, of wheeze in the presence of a gas cooker compared to no gas cooker provided that the prevalence of wheeze is low. This relative risk is estimated by $e^{b}$. The differences of this example from the age at menarche example are that $\pi$ for wheeze is unlikely to be greater than, say, 0.2, and no 'tolerance distribution' analogous to that of age at menarche is directly specified, although one can envisage this being the distribution of whatever product of combustion of gas is responsible for increased wheeze in people in homes with gas cookers. Even with such a distribution specified, it is unlikely that exposure would ever be high enough to cause 100% wheeze, so, in practice, only the lower portion of the sigmoid curve is relevant.

As with applications in general, it was the availability of software that led to an expansion in the use of logistic regression, in particular GLIM (generalised linear interactive modelling) in the early 1970s. Although now largely superseded, notably but not exclusively by Stata, GLIM enabled unbalanced analysis of variance, multiple linear regression and multiple logistic regression models to be fitted within the same framework. The application of logistic regression in epidemiology and public health journals showed a steep rise from around 1980 (Hosmer, Taber and Lemeshow, 1991; Chinn, 2001). Odds ratios were used in epidemiology and, in particular, for the results of a CASE-CONTROL STUDY before the widespread availability of computers and statistical software enabled easy fitting of multiple logistic regression models; therefore logistic regression was readily adopted by epidemiologists. It was also established as the appropriate method for the analysis of case-control studies with adjustment for confounding. When cases and controls are individually matched the method of analysis is conditional logistic regression.

Most statistical software for logistic regression requires the binary outcome to be coded 0 and 1, with 1 for the 'positive' outcome. Like all estimates from a sample, an odds ratio has an associated CONFIDENCE INTERVAL. The NULL HYPOTHESIS of no relation to an explanatory variable is an odds ratio of one or equivalently zero for the corresponding logistic regression coefficient. Older papers gave logistic regression coefficients with standard errors or 95% confidence intervals, but more recent papers give odds ratios with 95% confidence intervals.
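The statement that an odds ratio approximates the relative risk only when prevalence is low can be checked numerically. The sketch below uses made-up prevalences of wheeze with and without a gas cooker; they are assumptions chosen for illustration and do not come from any study cited here.

```python
def odds(p):
    """Odds corresponding to a proportion p."""
    return p / (1.0 - p)


def odds_ratio(p_exposed, p_unexposed):
    """Ratio of the odds in the exposed group to the odds in the unexposed group."""
    return odds(p_exposed) / odds(p_unexposed)


def risk_ratio(p_exposed, p_unexposed):
    """Relative risk: ratio of the two proportions."""
    return p_exposed / p_unexposed


if __name__ == "__main__":
    # Hypothetical (gas cooker, no gas cooker) prevalences: one low pair, one high pair.
    for p_gas, p_no_gas in ((0.08, 0.05), (0.80, 0.50)):
        print(
            f"prevalences {p_gas:.2f} vs {p_no_gas:.2f}: "
            f"risk ratio {risk_ratio(p_gas, p_no_gas):.2f}, "
            f"odds ratio {odds_ratio(p_gas, p_no_gas):.2f}"
        )
```

Both hypothetical pairs have the same risk ratio, but the odds ratio stays close to it only for the low-prevalence pair.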

For example, Somerville, Rona and Chinn (1988) gave logistic regression coefficients in a study of passive smoking by children in a survey of 5- to 11-year-old children in England and Scotland. One result is shown in the first line of the second table. The logistic regression coefficient of the symptom, reported by a parent in a self-administered questionnaire, 'chest EVER sounds wheezy or whistling', on passive smoking as measured by the total number of cigarettes per day reported to be smoked by the parents, was 0.011 with a standard error of 0.005. By calculating the coefficient ± 1.96 × standard error, a 95% confidence interval for the logistic regression coefficient can be obtained. Antilogging (base e) the coefficient and each limit of its confidence interval gives the odds ratio and its 95% confidence interval in the second line. However, the odds ratio associated with exposure to just one cigarette a day is not very useful; 20 cigarettes a day represents a more common exposure of children who are exposed to passive smoking. To obtain the odds ratio associated with exposure to 20 cigarettes a day, multiply both the logistic regression coefficient and its standard error by 20 and repeat the confidence interval calculation and antilogging to obtain the third line of the table.

logistic regression Alternative presentations of result of logistic regression analysis, illustrated by 'chest EVER sounds wheezy or whistling' in relation to passive smoking for children in the National Study of Health and Growth (Somerville, Rona and Chinn, 1988)

                                                                      Result
Logistic regression coefficient ± standard error on number of
  cigarettes smoked at home by father and mother                      0.011 ± 0.005
Odds ratio per cigarette smoked (95% confidence interval)             1.011 (1.001 to 1.021)
Odds ratio per 20 cigarettes smoked (95% confidence interval)         1.246 (1.024 to 1.516)
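The three lines of the table can be reproduced from the published coefficient and standard error alone. The following is a minimal sketch of that calculation; the function name and the `per` argument are illustrative choices rather than anything used by the original authors.

```python
import math


def odds_ratio_with_ci(coef, se, per=1.0, z=1.96):
    """Odds ratio and 95% confidence limits for a change of `per` units.

    The coefficient and its standard error are both multiplied by `per`,
    the limits are formed on the log-odds scale and then antilogged.
    """
    b = coef * per
    s = se * per
    return math.exp(b), math.exp(b - z * s), math.exp(b + z * s)


if __name__ == "__main__":
    coef, se = 0.011, 0.005  # values quoted in the first line of the table
    for per in (1, 20):
        estimate, lower, upper = odds_ratio_with_ci(coef, se, per)
        print(f"per {per} cigarette(s): {estimate:.3f} ({lower:.3f} to {upper:.3f})")
```

Because the limits are symmetric on the log-odds scale and antilogged afterwards, the resulting interval for the odds ratio is not symmetric around the estimate.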

Although the evidence against the null hypothesis was not strong (p = 0.021) and the 95% confidence interval correspondingly wide, the results in the third line show that the size of the likely effect is not negligible, which could not be easily appreciated from either of the first two rows. Note that the confidence interval for the logistic regression coefficient is symmetric around the estimate, but that for an odds ratio it is not. It is tempting to interpret the third line of the second table as meaning that exposure to 20 cigarettes smoked a day in the home results in an increased risk of wheeze of between 2.4% and 51.6%. This is interpreting an odds ratio as if it were a relative risk, which is only justified if the prevalence of wheeze is low, say less than 10% (Zhang and Yu, 1998). In this case it was 10.9%, so perhaps not too misleading, but it is easy to find examples of incorrect interpretation of odds ratios in the medical literature (Chinn, 2001). Although the fact that the odds ratio is biased away from the null value of 1 as an estimate of relative risk is well known to statisticians and epidemiologists, it is often conveniently ignored in the medical literature, especially in the reporting of results of CROSS-SECTIONAL STUDIES. In fact, it is possible to estimate relative risk directly, by binomial regression, but at the expense of the iterative model fitting sometimes failing to converge (Chinn, 2001).

Logistic regression is essential for the analysis of unmatched case-control studies and is likely to continue to be the most used method for the analysis of binary outcomes in cross-sectional studies. Statistically it cannot be faulted; it is in the reporting, and the fact that an odds ratio does not estimate relative risk directly, that the problem lies. Binary outcomes in COHORT STUDIES should be analysed by SURVIVAL ANALYSIS, unless the follow-up time is constant, which is rarely the case.

The P-VALUE associated with the odds ratio, to test a difference from 1, can be obtained by dividing the logistic regression coefficient by its STANDARD ERROR and comparing the result with the normal distribution, as the null hypothesis value for the logistic regression coefficient is zero. Note that the normal distribution is used rather than the t-DISTRIBUTION, as no residual standard deviation is estimated. This is because a binomial distribution is assumed for the observations, which is specified only by the expected proportion and does not involve a standard deviation. Alternatively, if the model were fitted by MAXIMUM LIKELIHOOD, the LIKELIHOOD RATIO test can be used and will usually give approximately the same answer for a single parameter. If model 1 is the model with the factor of interest included, with likelihood $l_1$, and model 2 that with it omitted, with likelihood $l_2$, then $-2\log(l_2/l_1)$ has a CHI-SQUARE DISTRIBUTION with DEGREES OF FREEDOM equal to the difference in the number of fitted parameters. This can be used to test the equivalence of several parameters, e.g. equal median age at menarche for girls from different sizes of family (Roberts, Danskin and Chinn, 1975).

Related to testing for association of outcome with risk factors is that of goodness of fit of the model. This is more difficult to assess than with a linear regression model, as individual values are each 0 or 1, so a plot of observed against fitted values, or of residuals, is uninformative. For associated reasons the overall likelihood ratio statistic cannot be used. Hosmer, Taber and Lemeshow (1991) give a number of plots that can be used and the necessary calculations are implemented in Stata.
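The two tests described above, for a single parameter, can be sketched with the standard library alone. The sketch is restricted to one degree of freedom, where the chi-square tail area equals a two-sided normal tail area. The coefficient, standard error and log-likelihoods in the usage lines are hypothetical values, not results from any study cited here.

```python
import math


def wald_p_value(coef, se):
    """Two-sided P-value from comparing coef / se with the standard normal."""
    z = coef / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def likelihood_ratio_test_1df(loglik_with, loglik_without):
    """Likelihood ratio statistic -2 log(l2 / l1) and its P-value on 1 df.

    loglik_with is log(l1), the model including the factor of interest;
    loglik_without is log(l2), the model omitting it.
    """
    statistic = max(0.0, -2.0 * (loglik_without - loglik_with))
    # For 1 df, P(chi-square > s) equals P(|Z| > sqrt(s)) for standard normal Z.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the functions.
    print(wald_p_value(coef=0.50, se=0.20))
    print(likelihood_ratio_test_1df(loglik_with=-100.0, loglik_without=-103.2))
```

Testing several parameters at once needs the chi-square distribution with more than one degree of freedom, which this sketch deliberately does not cover.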

Logistic regression as described here for a binary outcome is a special case of the more general multinomial, or polytomous, logistic regression for a categorical outcome with three or more possible values (see LOGIT MODELS FOR ORDINAL RESPONSES). It is also closely related to the LOG-LINEAR MODEL, which assumes a POISSON DISTRIBUTION for the count in each cell of a contingency table. Each is an example of a GENERALISED LINEAR MODEL.

Medical journals now frequently report results from multiple logistic regression, showing odds ratios, P-values and confidence intervals. These need to be read carefully, as seemingly similar tables may be used in different situations. The lines of the table may be for different binary outcomes or independent analyses of the same outcome with different explanatory variables. The odds ratios will often be adjusted, for a list of stated variables such as age and sex, although unadjusted odds ratios may also be shown. An example is shown in the second table of Lawlor, Patel and Ebrahim (2003), in which odds ratios of falls in women aged 60 to 79 with drug use are given. Each row of the table gives results for one class of drug, while there are columns for 'crude', i.e. unadjusted, and fully adjusted odds ratios for each of three outcomes: any falls, two or more falls and falls where medical attention was given. The variables used to adjust the fully adjusted odds ratios are listed as a footnote to the table. Other papers give odds ratios that are mutually adjusted for other factors shown in the same table of results, i.e. all the results come from a single multiple logistic regression and full information is given, while Lawlor, Patel and Ebrahim (2003) (described earlier) appear to have carried out 21 adjusted analyses (three outcomes by seven drug classes). (For an example of mutually adjusted odds ratios see Slap et al. (2003), first table in the abridged printed version, second table in the full electronic version.)

Particularly when all results shown are 'statistically significant' (SS), the reader needs to ascertain whether all factors in the model are shown and whether the final model was selected from a set of possible models. This is appropriate if either the question is 'What factors are associated with the outcome?' or a parsimonious model is required for prediction purposes and selected either by forwards or backwards stepwise elimination. However, as with a similar procedure with multiple regression, it must be understood that prediction on a further dataset will not be as good as on the one from which the prediction was derived, and exclusion or inclusion of factors with P-values close to the chosen critical value may not be reproducible. By the same token, however, when there is a stated hypothesis, the odds ratio of interest should ideally be adjusted for all factors determined a priori to be of potential importance. Some of these may not be associated with outcome in the data at the conventional level of statistical significance, but adjustment can still affect the odds ratio of


interest. It is useful when there may be controversy over the number of potentially confounding variables to be included to give both unadjusted, fully adjusted and, perhaps also, partially adjusted odds ratios. SC

Chinn, S. 2001: The rise and fall of logistic regression. Australasian Epidemiologist 4, 7-10. Finney, D. J. 1964: Statistical method in biological assay, 2nd edition. London: Griffin. Finney, D. J. 1971: Probit analysis, 3rd edition. Cambridge: Cambridge University Press. Hosmer, D. W., Taber, S. and Lemeshow, S. 1991: The importance of assessing the fit of logistic regression models: a case study. American Journal of Public Health 81, 1630-5. Lawlor, D. A., Patel, R. and Ebrahim, S. 2003: Association between falls in elderly women and chronic diseases and drug use: cross-sectional study. British Medical Journal 327, 712-15. Roberts, D. F., Rozner, L. M. and Swan, A. V. 1971: Age at menarche, physique and environment in industrial northeast England. Acta Paediatrica Scandinavica 60, 158-64. Roberts, D. F., Danskin, M. J. and Chinn, S. 1975: Menarcheal age in Northumberland. Acta Paediatrica Scandinavica 64, 845-52. Slap, G. B., Lot, L., Huang, B., Daniyam, C. A., Zink, T. M. and Succop, P. A. 2003: Sexual behaviour of adolescents in Nigeria: cross-sectional survey of secondary school students. British Medical Journal 326, 15-18. Somerville, S., Rona, R. J. and Chinn, S. 1988: Passive smoking and respiratory conditions in primary school children. Journal of Epidemiology and Community Health 42, 105-10. Zhang, J. and Yu, K. F. 1998: What's the relative risk? Journal of the American Medical Association 280, 1690-1.

logit models for ordinal responses A regression model is a statistical model for describing the relationship between one or more explanatory variables and the response (dependent) variable. The purpose of statistical modelling is to fit the best model from a medical and epidemiological point of view that describes this relationship. The statistical modelling of how the relationship between the explanatory variables and the response variable could be described depends on how the response variable is recorded. The linear regression model assumes continuous quantitative response values. When the response variable has only two possible values or is measured on a rating scale, a logit transformation of the response values will meet the assumption of continuity.

A simple linear regression model describes how much a continuous quantitative response variable ($y$) depends on the explanatory ($x$) variable by the expression $y = a + bx$, where $a$ is a constant, the intercept, and $b$ is the regression coefficient, which contains the important information about the dependence of $y$ on $x$. According to the model, $y$ will change $b$ units when $x$ increases 1 unit. In a multiple regression model a linear combination of several explanatory variables is included. The purpose could be to investigate how the response variable depends on all explanatory variables together. Some of the variables could also be included in the model as confounding factors, which means that they would disturb the relationship of interest if not being adjusted for.

In the case of only two responses, success/failure or diseased/nondiseased, the range of possible response values is between zero and one: e.g. when the probability of success is $p = 0.8$, then the probability of failure is $(1 - p) = 0.2$. As the modelling assumes unlimited possible continuous values, the explanatory variable will be linked to the response variable by a logit transformation. Then the odds of success is the ratio between the probability of success and the probability of failure: odds $= p/(1 - p)$. The logit of the proportion $p$ is defined as the log odds $= \text{logit}(p) = \ln[p/(1 - p)]$, where ln denotes the logarithm to the base e. The regression model is called a (linear) LOGISTIC REGRESSION model, $\text{logit}(p) = a + bx$, and the multiple logistic regression model is $\text{logit}(p) = a + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$, when $k$ explanatory variables are included. The interpretation of the relationship between an explanatory variable $x$ and the probability $p$ of success is that when $x$ increases 1 unit, the odds will change by a factor of $e^{b}$. For example, $\text{logit}(p) = 3.2 + 1.3x$ means that the $\text{logit}(p)$, or the log odds, is predicted to change 1.3 for each unit of increase in $x$ and hence the odds of success will change by a factor of $e^{1.3} = 3.7$.

Logistic regression is commonly used to compare the odds of success between two groups of subjects with and without some prognostic property, such as smoking habits. For illustration, consider a model for having a specific disease, $\text{logit}(p) = 3.2 + 1.3\,\text{age} + 0.4\,\text{smoking}$, where the prognostic variables are age (years) and smoking habits coded as smokers = 1 and nonsmokers = 0. Assuming the same age in the two groups, the logit for smokers is $\text{logit}(p_+) = 3.2 + 1.3\,\text{age} + 0.4$ and for nonsmokers $\text{logit}(p_-) = 3.2 + 1.3\,\text{age} + 0$. The difference between these logits is $\text{logit}(p_+) - \text{logit}(p_-) = 0.4$, which is a difference between the log odds of disease in smokers and nonsmokers. This difference between logarithms is the same as the logarithm of a ratio, in this case the log odds ratio, lnOR. Thus, lnOR = 0.4, which was the regression coefficient associated with the variable smoking; then OR $= e^{0.4} = 1.5$. According to this example, we can predict that the odds of having the disease are related to smoking habits and are predicted to be 1.5 times larger in smokers than in nonsmokers, after adjustment for age. The logit transformation makes it possible to model how a dichotomous response variable depends on the explanatory variables.

The logit transformation is also suitable for ordered categorical (ordinal) responses, provided there is dichotomisation of the response categories. Consider a four-point scale with the categories 'none', 'slight', 'moderate' and 'severe'. Assume that the numbers of observations in the categories are $n_1$, $n_2$, $n_3$ and $n_4$, respectively and the total number of observations is $n$. The cumulative, continuation-ratio and adjacent-categories logits are three approaches to creating dichotomous datasets considering the ordered structure of the ordinal responses.
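The worked example above can be followed step by step in a few lines. The coefficients are those of the illustrative model in the text; the function names and the age of 50 used in the usage lines are assumptions made for the sketch.

```python
import math


def logit(p):
    """Log odds ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))


def inv_logit(value):
    """Probability corresponding to a given log odds."""
    return 1.0 / (1.0 + math.exp(-value))


# Coefficients of the illustrative model in the text.
INTERCEPT, B_AGE, B_SMOKING = 3.2, 1.3, 0.4


def linear_predictor(age, smoker):
    """logit(p) for a subject of the given age; smoker is coded 1 or 0."""
    return INTERCEPT + B_AGE * age + B_SMOKING * smoker


if __name__ == "__main__":
    age = 50  # any common age gives the same difference in logits
    log_or = linear_predictor(age, 1) - linear_predictor(age, 0)
    print(f"log odds ratio for smoking: {log_or:.1f}")
    print(f"odds ratio for smoking: {math.exp(log_or):.1f}")
    print(f"odds multiplier per unit of x when b = 1.3: {math.exp(1.3):.1f}")
```

The age term cancels in the difference, which is why the odds ratio for smoking is described as adjusted for age.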

In the cumulative logit, also called the proportional odds model, the probability of being in the lower categories is compared with the probability of being in the higher. Empirically, the number of observations in categories representing lower levels is compared with the number of observations in the higher levels of the scale. There are $(m - 1)$ possible cut-off points between categories in a scale with $m$ categories when creating cumulative logits. In the four-point scale there are three possible cumulative logits; when the cut-off point is the first category, 'none', the cumulative logit is $\ln[n_1/(n_2 + n_3 + n_4)]$; by moving the cut-off point one category at a time the cumulative logits will be $\ln[(n_1 + n_2)/(n_3 + n_4)]$ and $\ln[(n_1 + n_2 + n_3)/n_4]$. In cumulative logits, all data are used in each logit. The first cumulative logit could be interpreted as the log odds of the response 'none', as compared with 'slight', 'moderate' and 'severe'. If the variable is pain this cut-off point seems reasonable. Absence of pain is compared with presence of pain, but the other cumulative logits could also be of interest in a logistic model.

In the continuation-ratio approach, the number of observations in one category is compared with the number of observations in all categories representing lower levels. In the four-point scale the continuation-ratio logits are $\ln(n_2/n_1)$, $\ln[n_3/(n_1 + n_2)]$ and $\ln[n_4/(n_1 + n_2 + n_3)]$.

In the adjacent-categories logit, adjacent categories are compared: in the four-point scale the logits are $\ln(n_2/n_1)$, $\ln(n_3/n_2)$ and $\ln(n_4/n_3)$ and this approach is also applicable to categorical/nominal data.

After dichotomisation, the logits for ordinal data can be used in the logistic regression model for dichotomous data and with corresponding interpretation of odds ratios, when evaluating possible relationships between dichotomised ordinal responses and some prognostic variable, when controlling for other prognostic or disturbing background variables. For further details see Agresti (1984), Altman (2000) and Campbell (2001). ES (See also LINEAR REGRESSION, LOG-LINEAR MODELS, MULTIPLE REGRESSION MODELS)
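The three dichotomisations described in this entry differ only in which categories are pooled on each side of the ratio. A minimal sketch, using made-up counts for the four-point scale (the counts and function names are assumptions for illustration):

```python
import math


def cumulative_logits(counts):
    """ln(lower categories pooled / higher categories pooled) at each cut-off."""
    return [
        math.log(sum(counts[:k]) / sum(counts[k:]))
        for k in range(1, len(counts))
    ]


def continuation_ratio_logits(counts):
    """ln(one category / all lower categories pooled)."""
    return [
        math.log(counts[k] / sum(counts[:k]))
        for k in range(1, len(counts))
    ]


def adjacent_category_logits(counts):
    """ln(one category / the category immediately below it)."""
    return [
        math.log(counts[k] / counts[k - 1])
        for k in range(1, len(counts))
    ]


if __name__ == "__main__":
    # Hypothetical counts for 'none', 'slight', 'moderate', 'severe'.
    counts = [40, 30, 20, 10]
    print("cumulative:        ", cumulative_logits(counts))
    print("continuation-ratio:", continuation_ratio_logits(counts))
    print("adjacent-category: ", adjacent_category_logits(counts))
```

Each function returns three logits for a four-point scale, one for each of the $(m - 1)$ possible comparisons.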

Agresti, A. 1984: Analysis of ordinal categorical data. New York: John Wiley & Sons, Inc. Altman, D. G. 2000: Practical statistics for medical research. Boca Raton: Chapman & Hall/CRC. Campbell, M. J. 2001: Statistics at square two. Bristol: BMJ Books.

log-linear models These are models that serve to describe the relationships between frequencies (counts) and one or more variables that affect their size. In practice, log-linear models are most often used in connection with CONTINGENCY TABLES to describe the nature of associations between multiple nominal categorical variables. The analysis of contingency tables formed from three or more categorical variables will be the primary concern of this entry, since two-way contingency tables are dealt with in the entry mentioned earlier. When a sample from some population is classified with respect to more than two qualitative variables, the resulting data can be displayed as a multiway contingency table. As an example, we consider the three-way contingency table resulting from classifying 1330 patients according to blood pressure, serum cholesterol and coronary heart disease (see the first table). In three-way tables 'layering' is

used to accommodate the levels of the third categorical variable (heart disease).

log-linear models Cross-classification of patients with respect to three clinical variables discussed in Ku and Kullback (1974)

Coronary        Serum cholesterol              Blood pressure
heart disease   (mg/100 cc)          <127   127-146   147-166    ≥167    Total
Yes             <200                    2         3         3       4       12
                200-219                 3         2         1       3        9
                220-259                 8        11         6       6       31
                ≥260                    7        12        11      11       41
                Total                  20        28        21      24       93
No              <200                  117       121        47      22      307
                200-219                85        98        43      20      246
                220-259               119       209        68      43      439
                ≥260                   67        99        46      33      245
                Total                 388       527       204     118     1237
Overall total                         408       555       225     142     1330

Several independence hypotheses might be of interest in the three-way contingency table. These correspond to different combinations of first-order relationships between pairs of categorical variables:

(1) mutual independence of the three variables, i.e. none of the pairs of variables is associated;
(2) partial independence, i.e. an association exists between two of the variables, both of which are independent of the third;
(3) conditional independence, i.e. two of the variables are independent in each level of the third, but each may be associated with the third variable;
(4) mutual association, i.e. each pair of variables is associated within each level of the third variable.

In addition, the three variables in a three-way contingency table may display a more complex form of association, namely what is known as a second-order relationship. This means that the type and/or degree of association between two categorical variables is different in some or all levels of the remaining variable. In theory, in a $k$-dimensional table relationships up to $(k - 1)$th order can be investigated but the interpretation of higher order relationships becomes increasingly more difficult. For some of the hypotheses of interest in multiway tables, the corresponding expected values under the NULL HYPOTHESIS can be calculated directly from appropriate marginal totals, but for others some form of iterative fitting algorithm is needed (see Everitt, 1992, for details).

The basic idea of log-linear modelling is to translate the different hypotheses of interest in a multiway table into a sequence of statistical models so as to provide a systematic approach to the analysis of complex multidimensional tables and, in addition, to provide estimates of the magnitudes of effects of interest. The analysis of three-dimensional tables poses entirely new conceptual problems as compared with those in two dimensions. However, the extension from tables of three dimensions to four or more, while becoming more complex in analysis and interpretation, poses no further new problems and here description of the analysis of higher order contingency tables will be in terms of those arising from three categorical variables.

The nomenclature used for dealing with the $r \times c$ table is easily extended to deal with a three-dimensional $r \times c \times l$ contingency table having $r$ rows, $c$ columns and $l$ layers. The observed frequency in the $ijk$th cell is now represented by $n_{ijk}$ for $i = 1, 2, \ldots, r$, $j = 1, 2, \ldots, c$, $k = 1, 2, \ldots, l$. The general model is:

$$\ln(F_{ijk}) = \text{linear function of parameters}$$

where $F_{ijk}$ are theoretical expected frequencies in a three-way table under a particular hypothesis. A saturated model for the $F_{ijk}$, i.e. a model that explains all the variation in the data, is given by:

$$\ln(F_{ijk}) = u + u_{1(i)} + u_{2(j)} + u_{3(k)} + u_{12(ij)} + u_{13(ik)} + u_{23(jk)} + u_{123(ijk)}$$

where $u$ is an unknown parameter referred to as an 'overall mean effect' since all the other model terms are restricted to be deviation terms; $u_{1(i)}$ with $\sum_i u_{1(i)} = 0$ is an unknown deviation term that varies with the level of variable 1 and is called the 'main effect of variable 1'; $u_{2(j)}$ with $\sum_j u_{2(j)} = 0$ is an unknown deviation term that varies with the level of variable 2, the so-called 'main effect of variable 2'; $u_{3(k)}$ with $\sum_k u_{3(k)} = 0$ is an unknown deviation term that varies with the level of variable 3, the so-called 'main effect of variable 3'; $u_{12(ij)}$ with $\sum_i u_{12(ij)} = 0$ for all $j \in \{1, \ldots, c\}$ and $\sum_j u_{12(ij)} = 0$ for all $i \in \{1, \ldots, r\}$ is a further unknown deviation term for the $i$th category of variable 1 and the $j$th category of variable 2, the so-called 'interaction between variables 1 and 2'; $u_{13(ik)}$ with $\sum_i u_{13(ik)} = 0$ for all $k \in \{1, \ldots, l\}$ and $\sum_k u_{13(ik)} = 0$ for all $i \in \{1, \ldots, r\}$ is a further unknown deviation term for the $i$th category of variable 1 and the $k$th category of variable 3, the so-called 'interaction between variables 1 and 3'; $u_{23(jk)}$ with $\sum_j u_{23(jk)} = 0$ for all $k \in \{1, \ldots, l\}$ and $\sum_k u_{23(jk)} = 0$ for all $j \in \{1, \ldots, c\}$ is a further unknown deviation term for the $j$th category of variable 2 and the $k$th category of variable 3, the so-called 'interaction between variables 2 and 3'; $u_{123(ijk)}$ with $\sum_i u_{123(ijk)} = 0$ for all $j$ and $k$, $\sum_j u_{123(ijk)} = 0$ for all $i$ and $k$ and $\sum_k u_{123(ijk)} = 0$ for all $i$ and $j$ is yet another unknown deviation term for the $i$th category of variable 1 with the $j$th category of variable 2 and the $k$th category of variable 3, the so-called 'three-way interaction'.

The main effect terms in the second to fourth of these terms serve to model the single variable marginal distributions. The two-way interaction terms in the fifth to seventh terms model the first-order relationships. Different combinations of absence/presence of the three two-way interactions correspond to the mutual, partial, conditional independence or mutual association hypotheses. The three-way interaction term in the eighth term models the second-order relationship. For example, for the data in our first table we might compare the following sequence of models:

(1) all cell frequencies are the same:

$$\ln(F_{ijk}) = u$$

(2) marginal totals for variable 2 (say cholesterol) and 3 (say heart disease) are equal:

$$\ln(F_{ijk}) = u + u_{1(i)}$$

(3) marginal totals for variable 3 (heart disease) are equal:

$$\ln(F_{ijk}) = u + u_{1(i)} + u_{2(j)}$$

(4) the variables blood pressure, cholesterol and heart disease are mutually independent:

$$\ln(F_{ijk}) = u + u_{1(i)} + u_{2(j)} + u_{3(k)}$$

(5) variables 1 (blood pressure) and 2 (cholesterol) are associated and both are independent of variable 3 (heart disease):

$$\ln(F_{ijk}) = u + u_{1(i)} + u_{2(j)} + u_{3(k)} + u_{12(ij)}$$

(6) variables 2 (cholesterol) and 3 (heart disease) are conditionally independent given the level of variable 1 (blood pressure):

$$\ln(F_{ijk}) = u + u_{1(i)} + u_{2(j)} + u_{3(k)} + u_{12(ij)} + u_{13(ik)}$$

(7) all pairs of variables are associated:

$$\ln(F_{ijk}) = u + u_{1(i)} + u_{2(j)} + u_{3(k)} + u_{12(ij)} + u_{13(ik)} + u_{23(jk)}$$

(8) saturated model for the three-way table, including the second-order relationship:

$$\ln(F_{ijk}) = u + u_{1(i)} + u_{2(j)} + u_{3(k)} + u_{12(ij)} + u_{13(ik)} + u_{23(jk)} + u_{123(ijk)}$$

log-linear models Identification of an adequate log-linear model for the data in the first table

Step 1. Add interaction between blood pressure and cholesterol. Simpler model: minimal model (4), mutual independence. More complex model: model (5), partial independence of heart disease. LR test: 9 degrees of freedom, statistic 24.45, p-value 0.0036.
Step 2. Add interaction between blood pressure and heart disease. Simpler model: model (5), partial independence of heart disease. More complex model: model (6), conditional independence of heart disease and cholesterol. LR test: 3 degrees of freedom, statistic 30.45.
Step 3. Add interaction between cholesterol and heart disease. Simpler model: model (6), conditional independence of heart disease and cholesterol. More complex model: model (7), mutual association between blood pressure, cholesterol and heart disease.
Step 4. Simpler model: model (7), mutual association between blood pressure, cholesterol and heart disease. More complex model: saturated model (8), all first- and second-order relationships.

The test statistics are

$$U_1 = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - R_1, \qquad U_2 = n_1 n_2 + \frac{n_2(n_2 + 1)}{2} - R_2,$$

where $n_1$ = the number of observations in group 1, $n_2$ = the number of observations in group 2, $R_1$ = the sum of the ranks assigned to group 1 and $R_2$ = the sum of the ranks assigned to group 2. Calculate $U = \min(U_1, U_2)$. Compare $U$ with the critical value of the Mann-Whitney U tables. The null hypothesis is rejected if the value of $U$ is less than or equal to the critical value in the tables. The value of $U/(n_1 n_2)$ can be interpreted as the probability that a new observation from group 1 is less than a new observation from group 2.

As an example, data in the table show Mcm2 levels in two groups of people, seven with and eight without fibrosis of the liver. The groups do not have similar distributions. Note that the assigned rank for the lowest Mcm2 value observed, due to the tie, is midway between 1 and 2, exemplifying the convention mentioned earlier.

[Table: Mcm2 levels in subjects with fibrosis (group 1) and without fibrosis (group 2)]

... 10, there is not sufficient evidence to reject the null hypothesis. Therefore, there is no evidence of a difference in spread or location between the two groups. There is a probability of 0.23 (= 13/56) that a new observation from group 1 will be less than a new observation from group 2. For further details see Pett (1997), Hart (2001) and Swinscow and Campbell (2002). SLY
[See also MEDIAN TEST]

Hart, A. 2001: Mann-Whitney test is not just a test of medians: differences in spread can be important. British Medical Journal 323, 391-3. Pett, M. A. 1997: Nonparametric statistics for health care research. Thousand Oaks: Sage. Swinscow, T. D. V. and Campbell, M. J. 2002: Statistics at square one, 10th edition. London: BMJ Books.

Mann-Whitney U test Synonym for MANN-WHITNEY RANK SUM TEST
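The calculation of U described in the Mann-Whitney rank sum test entry can be sketched in a few lines of Python. This is a minimal illustration, not the entry's own worked example: the two small data sets are hypothetical, the function names are invented here, and the rank-sum formulas for U1 and U2 are the standard ones, assumed to match those intended by the entry. Tied values receive the average of the ranks they span.

```python
def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


def mann_whitney_u(group1, group2):
    """Return (U1, U2, U) computed from the rank sums R1 and R2."""
    n1, n2 = len(group1), len(group2)
    ranks = midranks(list(group1) + list(group2))
    r1 = sum(ranks[:n1])   # sum of ranks assigned to group 1
    r2 = sum(ranks[n1:])   # sum of ranks assigned to group 2
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return u1, u2, min(u1, u2)


# Hypothetical data, chosen only to include one tie between the groups.
group1 = [1.2, 3.4, 2.2]
group2 = [2.2, 5.1, 4.0, 6.3]
u1, u2, u = mann_whitney_u(group1, group2)
prob_group1_lower = u1 / (len(group1) * len(group2))
print(u1, u2, u, prob_group1_lower)
```

With these formulas U1 counts the pairs in which the group 1 value is the smaller one (ties counting one half), so U1 divided by n1 n2 is the proportion interpreted in the entry as a probability. The resulting U would still have to be compared with tabulated critical values.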

MANOVA See ANALYSIS OF VARIANCE

Mantel-Haenszel methods These are a collection of statistical methods for stratified, categorical data. When analysing data from an epidemiological study, one should be aware of the danger of confounding. For example, in a CASE-CONTROL STUDY of the association between an industrial chemical and a particular cancer, 100 cases and 100 controls are recruited. When the data are analysed, the ODDS RATIO associated with the chemical is 0.91, suggesting no association or a possible protective effect. However, it is noticed that when the data are stratified by sex, the odds ratio in men is 1.29 and in women is 1.38, suggesting a possible harmful effect (data are shown in the table). The reason for this reversal in the odds ratio is that sex is a confounder in the association between exposure and disease. The disease is more common in women, but women are less likely to be exposed, and so exposure appears protective if one does not adjust for sex.

Mantel-Haenszel methods Summary data from a case-control study, stratified by sex

              Men               Women             Total
              Case   Control    Case   Control    Case   Control
Exposed          7      11         4       1        11      12
Unexposed       34      69        55      19        89      88

One method for overcoming confounding is to stratify the data, as in this case where stratification was by sex. The statistic of interest (e.g. the odds ratio) is calculated for each stratum separately. It is then often desirable to combine these stratum-specific statistics into a single overall measure, to calculate a standard error for this and also to test a null hypothesis (e.g. that the odds ratio is one). If the number of subjects in each stratum is large, this may be done using MAXIMUM LIKELIHOOD METHODS (e.g. LOGISTIC REGRESSION), by introducing an additional parameter into the model for each stratum. However, when data are sparse, i.e. when the number of subjects in a stratum may be small, maximum likelihood may give biased estimates. In this situation, it is necessary to use either conditional maximum likelihood methods or Mantel-Haenszel methods. The latter have the advantage of being very straightforward to calculate and, for this reason, are popular. Mantel-Haenszel methods do not require that the numbers of individuals in each stratum be large, only that the total number of subjects be large enough. However, if even the total number of subjects is small it is necessary to use 'exact' methods (see EXACT METHODS FOR CATEGORICAL DATA).
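The reversal described in the example can be checked numerically. The sketch below uses the counts from the stratified table and computes the crude odds ratio, the stratum-specific odds ratios and a pooled estimate. The pooled formula is the standard Mantel-Haenszel odds ratio estimator, which the entry names but does not write out, so its use here is an assumption about what is intended; the function and variable names are illustrative only.

```python
# Each stratum: (exposed cases, exposed controls, unexposed cases, unexposed controls)
strata = {
    "men": (7, 11, 34, 69),
    "women": (4, 1, 55, 19),
}


def odds_ratio(a, b, c, d):
    """Odds ratio of a 2 x 2 table: a, b = exposed cases/controls; c, d = unexposed."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Pooled odds ratio: sum(a*d/n) / sum(b*c/n) over strata, n = stratum total."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Collapsing over sex gives the crude table (11, 12, 89, 88).
crude_table = tuple(sum(column) for column in zip(*strata.values()))
or_crude = odds_ratio(*crude_table)
or_by_stratum = {name: odds_ratio(*table) for name, table in strata.items()}
or_mh = mantel_haenszel_or(strata.values())

print(round(or_crude, 2))                                   # 0.91
print({k: round(v, 2) for k, v in or_by_stratum.items()})  # men 1.29, women 1.38
print(round(or_mh, 2))                                      # 1.31
```

The pooled estimate lies between the two stratum-specific odds ratios, whereas the crude odds ratio falls below one, which is the confounding by sex that the entry describes.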

Mantel-Haenszel methods are available for estimating odds ratios from case-control data, rate ratios or rate differences from cohort data and odds ratios or risk ratios from case-cohort data. They may also be useful when analysing repeated-measures designs. When the exposure and the outcome (disease) are both binary, the analysis of a case-control study with stratification is an example of the analysis of multiple 2 x 2 tables: one table for each stratum and, in each table, two rows for exposure and two columns for outcome. Mantel-Haenszel methods also exist for the more general situation of multiple I x J tables, e.g. a case-control study with more than two possible exposure (or treatment) levels (I > 2) and/or more than two possible outcomes (J > 2). Both the exposure and outcome variables can be treated as either nominal or ordinal categorical variables.

Finally, when combining several stratum-specific estimates to form a single overall estimate, it is important to consider whether this is sensible. If the odds ratio (or other measure) appears to vary greatly from one stratum to another, possibly even being much greater than one in some strata and much less than one in other strata, and this variation is more than would be expected by chance, a single summary measure may not be very meaningful. In this situation it is better to report the odds ratio estimate for each stratum separately. Thus, before calculating the overall odds ratio (or other measure), it is worth testing the null hypothesis of homogeneity, i.e. that the odds ratio does not vary from one stratum to another. The Breslow-Day test is one such test. For further details see Kuritz, Landis and Koch (1988), Clayton and Hills (1993) and Rothman and Greenland (1998). SRS

Clayton, D. and Hills, M. 1993: Statistical models in epidemiology. Oxford: Blackwell Science Publications. Kuritz, S. J., Landis, J. R. and Koch, G. G. 1988: A general overview of Mantel-Haenszel methods: applications and recent developments. Annual Review of Public Health 9, 123-60. Rothman, K. J. and Greenland, S. 1998: Modern epidemiology, 2nd edition. Philadelphia: Lippincott-Raven Publishers.

marginal structural models These are regression models for so-called counterfactual or potential outcomes $Y_a$, which express what the outcome of interest, $Y$, would have looked like if level $a$ of the target exposure $A$ had been received (Robins, Hernán and Brumback, 2000). The models are labelled 'marginal' because they are models for exposure effects at the population level, rather than within strata defined by covariate values. They are labelled 'structural' because, by construction, their parameters carry a causal interpretation. For instance, in the linear marginal structural model:

$$E(Y_a) = \alpha + \beta a,$$

the intercept $\alpha$ expresses what the expected outcome in the population would be if all subjects were unexposed (i.e. $a = 0$). The regression slope:

$$\beta = E(Y_{a+1}) - E(Y_a)$$

encodes the expected change in outcome that would result from a unit increase in the exposure. It compares potential outcomes for the 'same' subjects under different exposure levels and thereby can be interpreted as an average CAUSAL EFFECT of the exposure on the outcome. This contrasts with standard regression models:

$$E(Y|A = a) = \alpha' + \beta' a,$$

where the regression slope:

$$\beta' = E(Y|A = a + 1) - E(Y|A = a)$$

compares the expected outcome between differently exposed subgroups ($A = a + 1$ and $A = a$) of the population. When these subgroups are not inherently comparable, then $\beta'$ (unlike $\beta$) cannot be interpreted as a causal exposure effect.

In standard regression models, comparability across treatment levels can be achieved by adjusting for measured confounders $L$ of the exposure-outcome relationship. In marginal structural models, adjustment for confounding happens through a weighting procedure, called INVERSE PROBABILITY WEIGHTING (IPW), which works in two steps:

(1) A pseudo sample is constructed by weighting each subject by the inverse of the probability of the observed exposure, given the observed confounders: $1/\Pr(A|L)$. For instance, a subject with confounder level $L = 1$ who is exposed to level $A = 1$ is weighted with $1/\Pr(A = 1|L = 1)$. Similarly, a subject with $L = 1$ who is exposed to level $A = 0$ is weighted with $1/\Pr(A = 0|L = 1)$. Here, $\Pr(A|L)$ can be estimated as the fitted value from a standard regression of $A$ on $L$, e.g. a logistic regression if $A$ is dichotomous.

(2) Because the weighting eliminates confounding, the marginal structural model can be fitted to the pseudo sample as if there were no confounding. This can be accomplished via off-the-shelf software packages by fitting the corresponding regression model $E(Y|A) = \alpha + \beta A$ while assigning each subject's data the given weight.

We emphasise that all methods of confounding adjustment, in particular standard regression adjustment and IPW, crucially rely on the assumption of no unmeasured confounding. This assumption holds if $L$ contains all confounders of the exposure-outcome relationship, i.e. all factors that directly affect the exposure and are also associated with the outcome. This is visualised in the causal diagram of the first figure through the absence of an arrow from the unmeasured variables $U$ to $A$. If not all confounders of the exposure-outcome relationship are included in the estimation of the weights $1/\Pr(A|L)$, then the assumption of no unmeasured confounding is violated, in which case the estimate of the causal exposure effect may be biased. The assumption of no unmeasured confounding is untestable and has to be defended by subject matter knowledge.
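The two IPW steps can be illustrated on simulated data. The sketch below is an illustration under stated assumptions rather than a recommended implementation: the data-generating mechanism (a single binary confounder L, a true exposure effect of 1.0, a confounder effect of 2.0), the sample size and the random seed are all invented for the example. Because L is a single binary variable, Pr(A|L) is estimated by the observed proportions within each level of L, which for this case gives the same fitted values as a logistic regression of A on L. With a binary exposure, the weighted regression slope equals the difference between weighted outcome means.

```python
import random

rng = random.Random(2011)  # arbitrary seed, for reproducibility only

# Simulate (L, A, Y): L raises both the chance of exposure and the outcome.
data = []
for _ in range(20000):
    l = 1 if rng.random() < 0.5 else 0
    a = 1 if rng.random() < (0.8 if l == 1 else 0.2) else 0
    y = 1.0 * a + 2.0 * l + rng.gauss(0.0, 1.0)  # assumed true causal slope: 1.0
    data.append((l, a, y))

# Step 1: estimate Pr(A = 1 | L) within each level of L and form weights 1/Pr(A | L).
p_exposed = {}
for level in (0, 1):
    exposures = [a for l, a, _ in data if l == level]
    p_exposed[level] = sum(exposures) / len(exposures)


def weight(l, a):
    """Inverse probability of the exposure actually received, given L."""
    p = p_exposed[l]
    return 1.0 / (p if a == 1 else 1.0 - p)


# Step 2: fit E(Y | A) = alpha + beta * A to the weighted pseudo sample.
def mean_outcome(a_level, use_weights):
    numerator = denominator = 0.0
    for l, a, y in data:
        if a == a_level:
            w = weight(l, a) if use_weights else 1.0
            numerator += w * y
            denominator += w
    return numerator / denominator


naive_slope = mean_outcome(1, False) - mean_outcome(0, False)
ipw_slope = mean_outcome(1, True) - mean_outcome(0, True)
print(round(naive_slope, 2), round(ipw_slope, 2))
```

Under these assumptions the unweighted slope is distorted by confounding (it is expected to be near 2.2), while the weighted slope should be close to the assumed causal effect of 1.0. No standard errors are computed here.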

marginal structural models Causal diagram representing the data generating mechanism, with A, exposure; L, measured confounders; Y, outcome; U, unmeasured variables that affect L and Y

Because marginal structural models encode population-averaged effects, they can conveniently be used for standardisation (see DEMOGRAPHY) with the total group as the standard population (Sato and Matsuyama, 2003). Because the IPW estimation procedure involves the PROPENSITY SCORE, the resulting estimates inherit the properties of propensity score adjusted estimators.

Marginal structural models are, however, most commonly adopted for assessing the effect of a time-varying exposure on an outcome in the presence of time-varying confounders. This is because these models avoid regression adjustment for time-varying confounders, which is fallible when the time-varying confounders are both affected by past exposure levels and affecting future exposure levels. For instance, CD4 count confounds the relationship between AZT treatment and survival in HIV-infected subjects because it affects the physicians' assignment of particular AZT levels and is associated with survival. At the same time, it is affected by earlier AZT exposure levels. The reason why standard regression adjustment fails in this context is because, on the one hand, it eliminates indirect exposure effects that are mediated through these confounders (e.g. indirect effects of AZT on survival through its effect on the CD4 count) and, on the other hand, it may induce a so-called collider-stratification BIAS by which a spurious ASSOCIATION between exposure and outcome arises, even in the absence of an exposure effect. Inference for marginal structural models suffers neither of these two limitations because it involves no regression adjustment for confounding.

The IPW procedure, which is used instead, is now slightly more involved as it must acknowledge the time-varying nature of the exposure. With $A^t$ and $L^t$ denoting the exposure (e.g. the AZT level) and confounders (e.g. the CD4 count) respectively, measured at study cycle $t$, the weights now take the form:

$$\frac{1}{\prod_{t=1}^{T} \Pr(A^t|\bar{A}^{t-1}, \bar{L}^t)} \qquad (1)$$

where $\bar{A}^{t-1} = (A^1, \ldots, A^{t-1})$ and $\bar{L}^t = (L^1, \ldots, L^t)$ refer to exposure and confounder history respectively and where $T$ is the end-of-study time. Next, a marginal structural model for time-varying exposures can be fitted. For instance, with $Y$ denoting 1 for subjects who survive the end-of-study time and 0 otherwise, the marginal structural model:

$$\operatorname{logit} \Pr(Y_{\bar{a}^T} = 1) = \alpha + \beta a^T + \gamma \sum_{t=1}^{T-1} a^t \qquad (2)$$

is indexed by parameters:

$$\exp(\beta) = \frac{\operatorname{odds}(Y_{(\bar{a}^{T-1}, 1)} = 1)}{\operatorname{odds}(Y_{(\bar{a}^{T-1}, 0)} = 1)}$$

encoding the short-term effect of AZT on the odds of death, and $\exp(\gamma)$ capturing a long-term effect. These parameters can be estimated by fitting the corresponding regression model $\operatorname{logit} \Pr(Y = 1|\bar{A}^T) = \alpha + \beta A^T + \gamma \sum_{t=1}^{T-1} A^t$ while assigning each subject's data the given weight.

Just as for point exposures, the IPW procedure for time-varying exposures crucially relies on the untestable assumption of no unmeasured confounding. For time-varying exposures, this assumption holds if all factors beside $(\bar{L}^{t-1}, \bar{A}^{t-1})$ that affect the exposure at time $t$ and are associated with the outcome are contained in $L^t$, for each time $t$. This scenario is visualised in the causal diagram of the second figure, for $T = 2$, through the absence of an arrow from the unmeasured variables $U$ to $A^t$. In this figure the arrows from $(\bar{L}^{t-1}, \bar{A}^{t-1})$ to $A^t$ indicate that the received AZT level at time $t$ may be affected by the subjects' treatment and CD4 count history up to $t$. Similarly, the arrows from $(\bar{L}^{t-1}, \bar{A}^{t-1})$ to $L^t$ indicate that the CD4 count at time $t$ may be affected by the subjects' treatment and CD4 count history up to $t$. $U$ represents all unmeasured variables (e.g. genetics, lifestyle factors) that affect a subject's health status over time.

marginal structural models Causal diagram, with $A^t$, exposure at time t; $L^t$, measured confounders at time t; Y, outcome; U, unmeasured variables that affect $L^t$ and Y

When standard statistical software packages (SAS, STATA, R) are used for marginal structural models through IPW, caution is needed in interpreting the STANDARD ERRORS provided by the software (see STATISTICAL PACKAGES), because these ignore the imprecision of the estimated weights. By using routines that report so-called sandwich estimators, conservative standard errors are obtained.

A major appeal of marginal structural models is that the underlying IPW estimation procedure straightforwardly generalises to more complex settings with time-varying outcomes (Hernán, Brumback and Robins, 2002) or survival ENDPOINTS (Hernán, Brumback and Robins, 2000). A drawback is that some subjects may have small probabilities $\Pr(A^t|\bar{A}^{t-1}, \bar{L}^t)$ at certain time points, so that they receive influential weights (1). This can make the IPW estimate unstable and imprecise. To some extent, this problem can be mitigated by using so-called stabilised weights, calculated as:

$$\frac{\prod_{t=1}^{T} \Pr(A^t|\bar{A}^{t-1})}{\prod_{t=1}^{T} \Pr(A^t|\bar{A}^{t-1}, \bar{L}^t)}.$$

When this is insufficient, progress can sometimes be made by including baseline covariates in the marginal structural model, i.e. covariates that precede $A^1$ in time (e.g. sex, ethnicity). For example, the model in (2) can be generalised to:

$$\operatorname{logit} \Pr(Y_{\bar{a}^T} = 1|V = v) = \alpha + \beta a^T + \gamma \sum_{t=1}^{T-1} a^t + \delta v + \psi a^T v, \qquad (3)$$

where $V$ is a baseline covariate that is contained in the measured baseline confounder $L^1$, and $\psi$ is a covariate-exposure interaction. When the model in (3) is fitted through IPW, the (stabilised) weights are modified as:

$$\frac{\prod_{t=1}^{T} \Pr(A^t|\bar{A}^{t-1}, V = v)}{\prod_{t=1}^{T} \Pr(A^t|\bar{A}^{t-1}, \bar{L}^t, V = v)}.$$
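To make the difference between the weights in (1) and their stabilised counterpart concrete, the sketch below computes both for a single subject followed over three study cycles. The per-cycle probabilities are hypothetical numbers chosen for illustration; in practice they would be fitted values from models for the exposure at each cycle, and the function names are not taken from any package.

```python
from math import prod


def ipw_weight(p_given_exposure_and_confounders):
    """Weight (1): inverse of the product over cycles of Pr(A^t | exposure history, confounder history)."""
    return 1.0 / prod(p_given_exposure_and_confounders)


def stabilised_weight(p_given_exposure_only, p_given_exposure_and_confounders):
    """Stabilised weight: product of Pr(A^t | exposure history) divided by
    the product of Pr(A^t | exposure history, confounder history)."""
    return prod(p_given_exposure_only) / prod(p_given_exposure_and_confounders)


# Hypothetical fitted probabilities of the exposure actually received, T = 3 cycles.
denominator_probs = [0.5, 0.1, 0.4]   # conditional on exposure and confounder history
numerator_probs = [0.5, 0.25, 0.5]    # conditional on exposure history only

w = ipw_weight(denominator_probs)
sw = stabilised_weight(numerator_probs, denominator_probs)
print(round(w, 3), round(sw, 3))  # 50.0 3.125
```

A single small probability (0.1 in the second cycle) is enough to produce a weight of 50, whereas the stabilised weight for the same subject is about 3, which is the sense in which stabilisation makes individual subjects less influential.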

The weights modified to condition on $V$ are typically less influential. This adaptation may, however, be insufficient when the exposure has strong predictors or when it is measured on a continuous scale, in which case $\Pr(A^t|\bar{A}^{t-1}, \bar{L}^t)$ refers to the density of $A^t$, given $\bar{A}^{t-1}$ and $\bar{L}^t$. In that case, one must have recourse to more efficient (doubly robust) estimation strategies or consider the related, but more complex class of structural nested models (Robins, 1997). The latter models have the additional advantage that, unlike marginal structural models, they can allow for modification of the exposure effect by time-varying covariates. AJS/SV

Hernán, M. A., Brumback, B. and Robins, J. M. 2000: Marginal structural models to estimate the causal effect of zidovudine on the survival of HIV-positive men. Epidemiology 11, 561-70. Hernán, M. A., Brumback, B. and Robins, J. M. 2002: Estimating the causal effect of zidovudine on CD4 count with a marginal structural model for repeated measures. Statistics in Medicine 21, 1689-709. Pearl, J. 2000: Causality: models, reasoning, and inference. Cambridge: Cambridge University Press. Robins, J. M. 1997: Causal inference from complex longitudinal data. In Latent variable modeling and applications to causality. New York: Springer Verlag. Robins, J. M., Hernán, M. A. and Brumback, B. 2000: Marginal structural models and causal inference in epidemiology. Epidemiology 11, 550-60. Sato, T. and Matsuyama, Y. 2003: Marginal structural models as a tool for standardization. Epidemiology 14, 680-6.

Markov chain Monte Carlo (MCMC) BAYES' THEOREM (1) provides a means for combining data, $y$, in the form of the LIKELIHOOD, $p(y|\theta)$, with external evidence in the form of a PRIOR DISTRIBUTION for $\theta$, $p(\theta)$, to produce a POSTERIOR DISTRIBUTION, $p(\theta|y)$ (see BAYESIAN METHODS). However, in order to make inferences about either the posterior distribution itself or to obtain the posterior expectation of a function of the model parameters, $\theta$, as in (2), we have to evaluate often high-dimensional integrals, which are only rarely analytically tractable. Consequently, much of Bayesian statistics over the last 30 years has been concerned with either parameterising models such that the integrals simplify or with the use of approximation methods (Bernardo and Smith, 1994). Such approximate methods fall into three broad categories: asymptotic approximations, e.g. Laplace approximations; numerical integration techniques, e.g. Gaussian quadrature; or simulation methods, e.g. Monte Carlo simulation (Bernardo and Smith, 1994):

$$p(\theta|y) = \frac{p(\theta)p(y|\theta)}{\int p(\theta)p(y|\theta)\,d\theta} \qquad (1)$$

$$E[f(\theta)|y] = \int f(\theta)p(\theta|y)\,d\theta \qquad (2)$$

Given a sample of values for $\theta$ from the joint posterior distribution, $p(\theta|y)$, $\{\theta^{(m)}, m = 1, \ldots, M\}$, then the posterior expectation of $f(\theta)$ can be approximated by:

$$E[f(\theta)|y] \approx \frac{1}{M} \sum_{m=1}^{M} f(\theta^{(m)}) \qquad (3)$$

While the use of (3) appears appealing, in practice, generation of samples from often high-dimensional joint posterior distributions can be difficult. However, for (3) to hold, the samples generated need not be independent, but rather be from a Markov chain whose stationary distribution is, in fact, the posterior distribution. A Markov chain is a sequence of random variables $\theta^{(1)}, \theta^{(2)}, \ldots$, such that $\theta^{(i)}$ only depends on $\theta^{(i-1)}$, and not the rest of the random variables. Constructing such a chain then gives rise to Markov chain Monte Carlo simulation (Casella and George, 1992; Brooks, 1998).

The construction of a Markov chain with a stationary distribution that is the posterior distribution is relatively straightforward and was initially proposed by Metropolis et al. (1953) and later generalised by Hastings (1970) and is now referred to as the Metropolis-Hastings algorithm. At the $i$th of $m$ iterations generate a candidate value for $\theta$, $\theta^*$, from a proposal distribution, $q(\theta^*|\theta^{(i-1)})$, and then with probability $\alpha(\theta^{(i-1)}, \theta^*)$ accept it, i.e. $\theta^{(i)} = \theta^*$, or reject it, i.e. $\theta^{(i)} = \theta^{(i-1)}$. $\alpha(\theta^{(i-1)}, \theta^*)$ is given by the following equation, which, in practice, is achieved by generating a value $u$ from a uniform [0, 1] distribution and, if $u \le \alpha(\theta^{(i-1)}, \theta^*)$, accepting $\theta^{(i)} = \theta^*$:

$$\alpha(\theta^{(i-1)}, \theta^*) = \min\left[1, \frac{p(\theta^*|y)\,q(\theta^{(i-1)}|\theta^*)}{p(\theta^{(i-1)}|y)\,q(\theta^*|\theta^{(i-1)})}\right] \qquad (4)$$

Clearly, if $q(\cdot)$ is still a multivariate distribution, the generation of samples may still be difficult. In practice, most applications of the Metropolis-Hastings algorithm use a single component proposal distribution (Gilks, Richardson and Spiegelhalter, 1996).

If the Metropolis-Hastings algorithm is irreducible then regardless of where it starts it will sample from the entire domain of $p(\theta|y)$ within a finite number of iterations and produce samples from the stationary distribution, i.e. $p(\theta|y)$. Thus, it should not be dependent on the starting values. Clearly, one way in which to verify irreducibility is to use the algorithm a number of times with different starting values and inspect the samples obtained. Even if the algorithm is irreducible it has to be run long enough so that it will 'forget' its starting values and, in practice, this is achieved by running the algorithm for a 'burn-in' period and discarding the first $n$ samples and basing inferences on only the last $m - n$ samples. Of crucial importance, therefore, is the question of how large $m$ and $n$ should be. In practice, a combination of formal methods that have been advocated, together with knowledge of the statistical model and inspection of the samples obtained via sensitivity analyses to choices of $m$, $n$ and the starting values, is the most pragmatic approach (Cowles and Carlin, 1996; Gilks, Richardson and Spiegelhalter, 1996). Examination of the autocorrelation between the samples at various numbers of iterations apart can reveal algorithms that are mixing slowly, i.e. that are slow to cover the whole of $p(\theta|y)$, and thus need to be run for considerable numbers of iterations. An alternative, often preferred, option is to consider the re-parameterisation of the statistical model in order to increase the rate of mixing. In linear regression models, centring of covariates and, in the case of hierarchical models, hierarchical centring have been shown to have dramatic effects on the rate of mixing (Gelfand, Sahu and Carlin, 1995; Gilks, Richardson and Spiegelhalter, 1996).

A special case of the single component Metropolis-Hastings algorithm is the Gibbs sampler in which the proposal distributions are the set of full conditionals and the acceptance probability (4) is always equal to 1 (Geman and Geman, 1984; Gelfand and Smith, 1990). Thus, given a set of initial or starting values for the $p$ parameters in a statistical model, $\{\theta_1^{(0)}, \ldots, \theta_p^{(0)}\}$, the Gibbs sampler at each iteration draws a sample from each of the conditional distributions in turn:

$$\begin{aligned} \theta_1^{(i)} &\sim p(\theta_1|\theta_2^{(i-1)}, \ldots, \theta_p^{(i-1)}, y) \\ &\;\;\vdots \\ \theta_p^{(i)} &\sim p(\theta_p|\theta_1^{(i)}, \theta_2^{(i)}, \ldots, \theta_{p-1}^{(i)}, y) \end{aligned} \qquad (5)$$

Thus, the realisations $\{\theta_1^{(1)}, \ldots, \theta_p^{(1)}\}, \ldots, \{\theta_1^{(m)}, \ldots, \theta_p^{(m)}\}$ after $m$ iterations provide samples from the marginal posterior distributions and on which inferences can be based. Sampling from the conditional distributions in (5) can be difficult unless they are univariate, although for many HIERARCHICAL MODELS they are, or they are log-concave, in which case adaptive rejection sampling may be used (Gilks, Richardson and Spiegelhalter, 1996). One particular appeal

of the Gibbs sampler is that, in essence, it is simple to implement and while it can be programmed in a variety of computer languages and software packages, the development of user-friendly software such as BUGS AND WINBUGS has promoted its widespread use in numerous applied settings in biomedical research (Gelman and Rubin, 1996). KRA

Bernardo, J. M. and Smith, A. F. M. 1994: Bayesian theory. Chichester: John Wiley & Sons, Ltd. Brooks, S. P. 1998: Markov chain Monte Carlo method and its applications. Journal of the Royal Statistical Society, Series D 47, 69-100. Casella, G. and George, E. 1992: Explaining the Gibbs sampler. American Statistician 46, 167-74. Cowles, M. K. and Carlin, B. P. 1996: Markov chain Monte Carlo convergence diagnostics: a comparative review. Journal of the American Statistical Association 91, 883-904. Gelfand, A. E. and Smith, A. F. M. 1990: Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association 85, 398-409. Gelfand, A. E., Sahu, S. K. and Carlin, B. P. 1995: Efficient parameterisations for normal linear mixed models. Biometrika 82, 479-88. Gelman, A. and Rubin, D. B. 1996: Markov chain Monte Carlo methods in biostatistics. Statistical Methods in Medical Research 5, 339-55. Geman, S. and Geman, D. 1984: Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence 6, 721-41. Gilks, W. R., Richardson, S. and Spiegelhalter, D. J. 1996: Markov chain Monte Carlo methods in practice. New York: Chapman & Hall. Hastings, W. K. 1970: Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57, 97-109. Metropolis, N., Rosenbluth, A. W., Teller, M. N. and Teller, A. H. 1953: Equations of state calculations by fast computing machine. Journal of Chemical Physics 21, 1087-91.
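The Metropolis-Hastings steps described in the Markov chain Monte Carlo entry above (propose a candidate, compute the acceptance probability, compare it with a uniform draw, discard a burn-in period, then average over the retained samples) can be sketched as follows. Everything specific here is an assumption made for illustration: the target is a standard normal 'posterior' known only up to a constant, the proposal is a normal random walk, and the chain length, burn-in, step size, starting value and seed are arbitrary. The proposal densities are kept in the acceptance ratio even though they cancel for this symmetric proposal, so that the code mirrors the general form of the acceptance probability.

```python
import math
import random


def log_target(theta):
    """Unnormalised log posterior; a standard normal is assumed for illustration."""
    return -0.5 * theta * theta


def log_proposal(to, frm, step):
    """Log density, up to a constant, of a normal random-walk proposal q(to | frm)."""
    return -0.5 * ((to - frm) / step) ** 2


def metropolis_hastings(m, start, step, rng):
    """Run m iterations and return the m sampled values (the start is not included)."""
    chain = [start]
    for _ in range(m):
        current = chain[-1]
        candidate = rng.gauss(current, step)
        # log of min[1, p(cand) q(current | cand) / (p(current) q(cand | current))]
        log_alpha = min(
            0.0,
            log_target(candidate) + log_proposal(current, candidate, step)
            - log_target(current) - log_proposal(candidate, current, step),
        )
        u = rng.random()
        chain.append(candidate if u <= math.exp(log_alpha) else current)
    return chain[1:]


rng = random.Random(1953)  # arbitrary seed
m, n_burn = 20000, 2000
samples = metropolis_hastings(m, start=5.0, step=2.0, rng=rng)
kept = samples[n_burn:]  # base inferences on the last m - n samples

# Monte Carlo approximations of posterior expectations of f(theta) = theta and theta^2.
posterior_mean = sum(kept) / len(kept)
posterior_second_moment = sum(t * t for t in kept) / len(kept)
print(round(posterior_mean, 2), round(posterior_second_moment, 2))
```

For this assumed target the two averages should be near 0 and 1 respectively. Rerunning with different starting values and inspecting the retained samples is the informal check of irreducibility that the entry suggests.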

matched pairs analysis Different types of designs may lead to matched pairs analysis. Individually matched subjects in prospective studies, individually matched controls to cases in retrospective studies and pairs of data obtained when the same individual is measured twice are examples of matched pairs (see MATCHED SAMPLES). A sample of matched pairs consists of statistically dependent data and in statistical analysis the pair, not single subjects, should be the unit. Matched pairs analysis may concern concepts like change, difference and odds, but also AGREEMENT and ASSOCIATION. Questions of change could include: Is there a difference in outcome due to different treatments between individually matched subjects? Is there a change in outcome within subjects before and after a treatment? Do patients prefer one treatment to another? Statistical methods for matched pairs analysis of quantitative, ordered categorical and dichotomous data respectively will be presented.

The cholesterol level was measured in 20 students before and after a period of having a diet that was supposed to have a cholesterol-lowering effect. As each student was measured twice, the difference between the two values was the outcome variable. The changes in cholesterol ranged from -1.0 mmol/l (increase) to 0.8 mmol/l (decrease). The table shows three different statistical approaches to matched pairs analysis of quantitative data:

The mean approach. Provided that the dataset of differences is a sample from a NORMAL DISTRIBUTION, the paired STUDENT'S t-TEST of the null hypothesis of zero mean change can be used. The observed mean change was 0.23 mmol/l and according to the test (see the table) one can conclude that the diet will significantly decrease the mean cholesterol level in a representative population by about 0.04-0.42 mmol/l, which is the 95% CONFIDENCE INTERVAL (CI).

The median approach. The WILCOXON SIGNED RANK TEST requires no assumptions about the distribution of the differences in quantitative data. The median change was 0.2 mmol/l and according to the test the null hypothesis of no MEDIAN CHANGE can be rejected (P = 0.01).

The dichotomisation approach. The cholesterol level decreased in 16, increased in three and was unchanged in one student. If the null hypothesis of unchanged values were true one would expect about the same numbers of positive as negative differences: this comparison is performed by a sign test. Unchanged values provide no information about the direction of change and will be excluded. The BINOMIAL DISTRIBUTION is used for exact calculation of the PROBABILITY of getting the observed or even more extreme unbalance in negative and positive differences when the null hypothesis is true. The table shows that the probability of the observed or more extreme unbalance was 0.004, which is strong evidence that the diet will change the cholesterol level. The large sample approximation of the one-sample sign test (Altman, 2000) can be written as:

$$z_c = \frac{|r - np| - \frac{1}{2}}{\sqrt{np(1 - p)}}$$

where $r$ is the number of differences of one sign among $n$ nonzero differences and $p$ is the probability under the null hypothesis of having the actual sign ($p = \frac{1}{2}$). In the example, $r = 16$, $n = 19$ and $z_c = 2.75$. The proportion of students with a decrease in cholesterol was 84% and the 95% CI (see Newcombe and Altman, 2000) deviates from that of the null hypothesis (50%) (see the table).

Matched pairs analysis of ordered categorical data is applied to a dataset from a study in diagnostic radiology (Svensson et al., 2002). The patient's perceived difficulty during each of two radiological examinations, here denoted CT and CO, was rated on a scale with the categories 'not at all', 'slightly', 'fairly' and 'very' difficult. Each of the 108 patients underwent both examinations, which means paired data (see the figure).
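The sign test figures quoted for the cholesterol example can be reproduced directly from r = 16 decreases among n = 19 nonzero differences. The short sketch below computes the exact binomial probability and the large-sample statistic with its continuity correction. One assumption is made explicit: the exact probability is taken as two-sided, obtained by doubling the upper tail, which is the reading consistent with the reported value of 0.004.

```python
from math import comb, sqrt

r, n, p = 16, 19, 0.5  # decreases, nonzero differences, null probability of a given sign

# Exact sign test: probability of r or more differences of one sign, doubled (two-sided).
upper_tail = sum(comb(n, k) for k in range(r, n + 1)) / 2 ** n
p_exact = 2 * upper_tail

# Large-sample approximation with continuity correction.
z_c = (abs(r - n * p) - 0.5) / sqrt(n * p * (1 - p))

proportion_decreased = r / n

print(round(p_exact, 3))               # 0.004
print(round(z_c, 2))                   # 2.75
print(round(proportion_decreased, 2))  # 0.84
```

The student with an unchanged value is excluded before the calculation, as the entry specifies, which is why n is 19 rather than 20.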

MATCHED PAIRS ANALYSIS _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ __

matched pairs analysis Three different approaches to matched pairs analysis of change in S-cholesterol in a sample of 20 students

Negative changes, 3; positive changes, 16: The exact sign test, P = 0.004. The approximate sign test: $z_c$ = 2.75, P = 0.006. The 95% CI for the proportion of students with decreased cholesterol: 62% to 94%.
Mean change 0.23 mmol/l; standard deviation 0.41 mmol/l: Student's paired t: 2.527, significance level P = 0.02. The 95% CI of mean change: (0.04 to 0.42) mmol/l.
Quartiles ($Q_1$; $Q_3$): (0.1; 0.5) mmol/l: Wilcoxon signed ranks matched pairs test, P = 0.01. The 95% CI of median change: (0.1 to 0.5) mmol/l.

[Figure: cross-classification of CT perceived difficulty against CO perceived difficulty, each rated 'not at all', 'slightly', 'fairly' or 'very' difficult; 108 patients in total]

matched pairs analysis Frequency distribution of paired data from evaluation concerning perceived difficulty concerning two radiological examinations (CT, CO)

Data from scale assessments have an ordered structure only, which means that change is not defined by the difference. Therefore, the same statistical methods as for paired dichotomous data will be used. A common expression for the sign test is:

$$z_c = \frac{|b - c| - 1}{\sqrt{b + c}}$$

where b and c denote the numbers of pairs with different categories. A McNEMAR'S TEST is an equivalent test (Bland, 1996; Altman, 2000).

One approach to dichotomise the data is to compare the numbers of pairs below and above the diagonal of unchanged categories. For the data in the figure, the 17 patients who rated the CT a higher level of difficulty than the CO are compared with the 45 pairs above the diagonal. According to the sign test ($z_c$ = 3.43, P = 0.0006), this observed imbalance in changes gives evidence enough to conclude that the CT is perceived as significantly less difficult than the CO examination.

An alternative way of dichotomising is by grouping the data in two categories, 'not at all difficult' (+) and 'difficult' (-), the latter containing three categories of difficulty. The table is thereby simplified to two concordant and two discordant combinations of categories. The discordant pairs of 32 CT(+), CO(-) and 13 CT(-), CO(+) contain information about the difference in perceived difficulty between the examinations. The SIGN TEST ($z_c$ = 2.68, P = 0.007) confirms that more patients will find the CT examination 'not at all difficult' when compared with their rating of the CO examination. As is evident from the figure, a larger proportion of the patients (17%) judged the CT as being 'not at all difficult', as 48 patients (44%) rated the CT and 29 (27%) rated the CO 'not at all difficult'. The 95% CI for the difference in the paired proportions, $\Delta p$, was from 5% to 29% according to the expression $\Delta p \pm 1.96 \times SE(\Delta p)$. The STANDARD ERROR (SE) is:

$$SE(\Delta p) = \frac{1}{n}\sqrt{b + c - \frac{(b - c)^2}{n}}$$

where n is the total number of patients (n = 108) and b and c are the numbers of discordant pairs (Altman, 2000; Newcombe and Altman, 2000).

In a pair-matched CASE-CONTROL STUDY the interest lies in exposure to the risk factor. Using retrospective case-control studies, individuals having a specific disease (e.g. lung cancer) are compared with individuals without the disease. Both the outcome variable (diseased, nondiseased) and the exposure to the risk factor (exposed, not exposed) are dichotomous. Within each pair, there are four possible combinations of disease status and exposure. Two sets of pairs are concordant, but information about the relationship between exposure and disease is given by the pairs with different exposure. Denote the number of pairs with only the case exposed $n_{+-}$ and the number of pairs with the case unexposed and the control exposed $n_{-+}$. Provided the numbers of discordant pairs are nonzero, the odds ratio in matched pairs is calculated by OR = $n_{+-}/n_{-+}$. An OR larger than unity indicates a relationship between exposure and disease, i.e. a higher odds of developing disease when exposed (McNeil, 1999). ES
[See also CORRELATION, KAPPA AND WEIGHTED KAPPA, MATCHING, MATCHED SAMPLES]
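The calculations for the radiology example can be sketched in a few lines. The sketch assumes the continuity-corrected sign test statistic $(|b - c| - 1)/\sqrt{b + c}$, the large-sample standard error for a difference in paired proportions and the ratio of discordant pairs as the matched odds ratio; all function names are hypothetical.

```python
from math import erfc, sqrt


def sign_test_z(b, c):
    """Continuity-corrected sign (McNemar-type) statistic for discordant counts b and c."""
    return (abs(b - c) - 1) / sqrt(b + c)


def two_sided_p(z):
    """Two-sided P-value from the standard normal distribution."""
    return erfc(abs(z) / sqrt(2))


def paired_difference_ci(b, c, n, z_crit=1.96):
    """Difference in paired proportions with a large-sample 95% CI."""
    diff = (b - c) / n
    se = sqrt(b + c - (b - c) ** 2 / n) / n
    return diff, diff - z_crit * se, diff + z_crit * se


def matched_odds_ratio(n_case_only_exposed, n_control_only_exposed):
    """Odds ratio from discordant pairs only; undefined if the denominator is zero."""
    if n_control_only_exposed == 0:
        raise ValueError("odds ratio undefined without discordant pairs of both kinds")
    return n_case_only_exposed / n_control_only_exposed


if __name__ == "__main__":
    for b, c in [(45, 17), (32, 13)]:
        z = sign_test_z(b, c)
        print(b, c, round(z, 2), round(two_sided_p(z), 4))
    print([round(v, 3) for v in paired_difference_ci(32, 13, 108)])
```

For the pairs above and below the diagonal (45 and 17) this gives $z_c$ of about 3.43, and for the dichotomised data (32 and 13) about 2.68. The interval for the difference in paired proportions comes out at roughly 0.06 to 0.29, close to the quoted 5% to 29%; the small discrepancy at the lower limit may reflect rounding or a different interval method in the source.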

Altman, D. G. 2000: Practical statistics for medical research. Boca Raton: Chapman & Hall/CRC. Bland, M. 1996: An introduction to medical statistics, 2nd edition. Oxford: Oxford Medical Press. McNeil, D. 1999: Epidemiological research methods. New York: John Wiley & Sons, Inc. Newcombe, R. G. and Altman, D. G. 2000: Proportions and their differences. In Altman, D. G., Machin, D., Bryant, T. N. and ...

... and clinical assessors. Most of the statistical work is done in conjunction with the clinical assessors, although statistical considerations often come into other areas such as assessment of pre-clinical safety studies, determination of product shelf life and assessment of marketing/advertising claims. Companies apply for a 'marketing authorisation' (formerly called simply a 'licence'). Agency staff assess data and prepare assessment reports, which are considered by the Commission on Human Medicines (CHM) and some of its expert subcommittees. The CHM is a panel of independent experts (including practising doctors, pharmacologists, statisticians and lay members) that meets monthly and advises on the granting, or otherwise, of a marketing authorisation. The final decision on granting is made by the government minister responsible for health. In certain circumstances, companies may appeal against unfavourable decisions.

The MHRA works in close collaboration with other European national agencies and the European Medicines Agency (EMA), which is based in London. There are a variety of routes by which companies can apply for a marketing authorisation within Europe, including national licences in as many (or as few) EU member states as they wish or a centralised licence covering all member states. In the latter case, two member states will be allocated to complete a comprehensive assessment of the application but all other member states are given the opportunity to raise concerns. The European counterpart to the CHM is the Committee for Medicinal Products for Human Use (CHMP), which meets monthly. The MHRA and EMA also work with other agencies across the world and, in particular, contribute to and follow guidelines jointly prepared by the International Conference on Harmonization (a collaborative effort between the major geographical regions of Europe, Japan and the USA).

The assessment of safety and efficacy from CLINICAL TRIALS is similar to that for refereeing a paper for a medical journal but explores much more detail (see REGULATORY STATISTICAL MATTERS). The law requires companies to submit details of all trials that have been conducted. Each of these trials will have detailed study reports running to hundreds of pages and further extensive appendices including: a copy of the protocol and any amendments; individual case reports for serious adverse events/reactions; line listings of individual patient data; possibly efficacy results presented separately for each participating centre; copies of investigators' curriculum vitae; documentation of quality and purity of product used in the trials; and so on. These appendices may run to hundreds of volumes and, hence, the need for a variety of disciplines within the assessment teams. More information about the Agency, its work and regulations pertaining to medicinal products and medical devices is available at the MHRA website: www.mhra.gov.uk. SD

Medicines Control Agency (MCA) See MEDICINES AND HEALTHCARE PRODUCTS REGULATORY AGENCY

mega-trial This is a large-scale randomised trial (generally involving several thousand subjects) that is designed to detect the effects of one or more treatments on major ENDPOINTS, such as death or disability. The need for mega-trials arises because the vast majority of treatments have only moderate effects on such endpoints, typically producing relative reductions of at most a quarter. Any study aiming to detect such a moderate effect needs to be able to guarantee that any BIASES and random errors inherent in its design are substantially smaller than the expected treatment effect (Collins and MacMahon, 2001). This will ensure that the results of the study either confirm the presence of a moderate effect convincingly or, if the treatment is ineffective, provide clear evidence that this is so.

For a study to avoid moderate biases requires RANDOMISATION using a method that precludes knowledge of each successive allocation. Randomisation in CLINICAL TRIALS is intended to maximise the LIKELIHOOD that each type of patient will have been allocated in similar proportions to the different treatment strategies being investigated (Armitage, 1982). Randomisation requires that trial procedures are organised in a way that ensures that the decision to enter a patient is made irreversibly and without knowledge of which trial treatment a patient will be allocated. Even when studies are randomised, however, moderate biases can still be introduced by inappropriate analysis or interpretation. One well-recognised circumstance is when patients are excluded after randomisation, particularly when the prognosis of the excluded patients in one treatment group differs from that in the other (such as might occur, for example, if noncompliers were excluded after randomisation).

The requirements for reliable assessment of moderate treatment effects are as follows:

Negligible biases, i.e. guaranteed avoidance of moderate biases, involve: proper randomisation (nonrandomised methods cannot guarantee the avoidance of moderate biases); analysis by allocated treatments (i.e. an INTENTION-TO-TREAT analysis); chief emphasis on overall results (with no unduly data-derived subgroup analysis); and systematic META-ANALYSIS of all the relevant randomised trials (with no unduly data-dependent emphasis on the results from particular studies).

Small random errors, i.e. guaranteed avoidance of moderate random errors, involve: use of large numbers (with minimal data collection since detailed statistical analyses of masses of data on prognostic features generally add little to the effective size of a trial); and systematic meta-analysis of all the relevant randomised trials.

While avoidance of moderate biases requires careful attention both to the randomisation process and to the analysis and interpretation of the available trial evidence, a study can only avoid moderate random errors if it accumulates a sufficiently large number of events. When major endpoints such as death affect only a small proportion of those randomised, very large numbers of patients need to be studied before estimates of treatment effect can be guaranteed to be statistically (and hence medically) convincing. In these circumstances, when a treatment has the potential to be used widely and hence confer large benefit (or harm), a mega-trial is the only type of study that is sufficiently reliable. For example, for an event that is expected to occur among 10% of subjects without active treatment, over 20000 subjects are required to demonstrate a 20% relative risk reduction (i.e. from 10% to 8%) reliably (i.e. with 90% POWER at a TYPE I ERROR rate of 1%).

For a mega-trial to randomise large numbers of trial subjects, the main barriers to rapid recruitment need to be removed. To facilitate this, the information recorded at entry should be brief and should concentrate on those few clinical details that are of paramount importance (including at most only a few major prognostic factors and only a few variables that are thought likely to influence substantially the benefits or hazards of treatment). Similarly, the information recorded at follow-up should be limited to serious outcomes, adverse events and to approximate measures of compliance. (Other outcomes, such as surrogate endpoints, that are of interest but do not need to be studied on such a large scale,
may best be assessed in separate smaller studies or in subsets of these large studies when this is practicable.) Keeping a trial as simple as possible increases the likelihood that it will be able to recruit large numbers of patients. For this reason, mega-trials are also known as 'large simple trials'.

For ethical reasons, randomisation is appropriate only if both the doctor and the patient feel substantially uncertain as to which trial treatment is best. The 'uncertainty principle' maximises the potential for recruitment within this ethical constraint. This says that a patient can be entered if, and only if, the responsible physician is substantially uncertain as to which of the trial treatments would be most appropriate for that particular patient. A patient should not be entered if the responsible physician or the patient are, for any medical or nonmedical reasons, reasonably certain that one of the treatments that might be allocated would be inappropriate for this particular individual (either in comparison with no treatment or in comparison with some other treatment that could be offered to the patient in or outside the trial).

If many hospitals are collaborating in a trial then wholehearted use of the uncertainty principle encourages heterogeneity in the resulting trial population and this, in mega-trials, may add substantially to the practical value of the results. Among the early trials of fibrinolytic therapy, for example, most of the studies had restrictive trial entry criteria that precluded the randomisation of elderly patients, so those trials contributed nothing of direct relevance to the important clinical question of whether treatment was useful among older patients. Other trials that did not impose an upper age limit, however, did include some elderly patients and were therefore able to show that age alone is not a contraindication to fibrinolytic therapy (Fibrinolytic Therapy Trialists' Collaborative Group, 1994). Mega-trials adopting the uncertainty principle to determine eligibility maximise heterogeneity of the study sample, which in turn ensures that their results are relevant to a very diverse range of future patients. CB
[See also ETHICS]

Armitage, P. 1982: The role of randomisation in clinical trials. Statistics in Medicine 1, 345-52. Collins, R. and MacMahon, S. 2001: Reliable assessment of the effects of treatment on mortality and major morbidity, I: clinical trials. The Lancet 357, 373-80. Fibrinolytic Therapy Trialists' Collaborative Group 1994: Indications for fibrinolytic therapy in suspected acute myocardial infarction: collaborative overview of early mortality and major morbidity results from all randomised trials of more than 1000 patients. The Lancet 343, 311-22.

Mendelian randomisation Mendelian randomisation refers to a method of leveraging improved causal inference (see CAUSALITY) from observational data through utilisation of the random assignment of an individual's genotype from their parental genotypes. It is justified by interpretations of the laws of Mendelian genetics. Assuming that the probability that a postmeiotic germ cell that has received any particular allele at segregation contributes to a viable conceptus is independent of environment (from Mendel's first law) and that genetic variants sort independently (from Mendel's second law), then at a population level these variants will not be associated with the confounding factors that generally distort conventional observational studies. Formally, random allocation of genotype occurs within family groups (from parents to offspring), and interpretation of such studies is closely analogous to that of randomised controlled trials (see CLINICAL TRIALS) (Davey Smith and Ebrahim, 2003). However, it has repeatedly been demonstrated that at a population level genetic variants are generally unrelated to potential confounding factors (Davey Smith et al., 2008). Confounding by other genetic variants will only occur for variants located close together on the same chromosome, when they are said to be in linkage disequilibrium (LD) with each other.

The term 'Mendelian randomisation' was first applied in a study using the availability of genetically compatible siblings to evaluate the effectiveness of bone marrow transplant in haematopoietic cancer (Gray and Wheatley, 1991), an approach that is conceptually similar but distinct in terms of design (Davey Smith, 2006). The concept applied to population-based epidemiological studies was most clearly articulated by Katan (1986), who proposed that since polymorphic forms of the apolipoprotein E (APOE) gene were related to different average levels of serum cholesterol, individuals with the genotype associated with lower average cholesterol should be expected to have a higher cancer risk if low cholesterol levels increased the risk of cancer. If, however, reverse causation or confounding generated the association between low cholesterol and cancer, then no association would be expected.

The conditions for, and assumptions underlying, a successful Mendelian randomisation study were elaborated in 2003 (Davey Smith and Ebrahim, 2003), where the general proposition that Mendelian randomisation could be used to make causal inferences about the relationship between modifiable risk factors and disease outcomes was advanced. It was argued that if genetic variants are robustly related to different levels of exposure to a risk factor, then these genetic variants should be related to disease risk to the extent predicted by their influence on the risk factor. Mendelian randomisation implies that genotype-disease ASSOCIATIONS should not be affected either by confounding or by reverse causation, and many biases inherent in conventional observational studies may also be avoided (Davey Smith and Ebrahim, 2003). Such associations may therefore imply a causal effect of the risk factor on the disease outcome. However, causal inferences from these studies may be undermined by issues including pleiotropy (the genotype has direct effects on more than one risk factor for the disease outcome), population stratification (population subgroups exist that experience both different disease rates and different frequencies of the genotypes of interest, resulting in confounded associations between genotype and disease) and canalization (buffering of the effect of genotype on disease by compensatory biological mechanisms) (Davey Smith and Ebrahim, 2003).

Thomas and Conti (2004) pointed out that the Mendelian randomisation approach involves application of the method of INSTRUMENTAL VARIABLES, which is commonly used in econometrics. An instrumental variable satisfies the following assumptions: (a) it is associated with the exposure of interest, (b) it is independent of confounding factors and (c) it is independent of the outcome given the risk factors and confounding factors. Lawlor et al. (2008) reviewed instrumental variables methodology in the context of Mendelian randomisation studies. Because genetic variants often explain only a small proportion of the variance in the target risk factor, very large sample sizes may be needed to achieve precise estimates of the causal effect of the risk factor on disease outcomes. JS/GDS
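The instrumental variable idea can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a method taken from the entry: a single biallelic variant with allele frequency 0.3, a linear effect of genotype on exposure, an unmeasured confounder affecting both exposure and outcome, and the simple ratio (Wald) estimator cov(G, Y)/cov(G, X). All names and parameter values are hypothetical.

```python
import random


def simulate(n=20000, true_effect=0.5, seed=1):
    """Simulate genotype G, confounded exposure X and outcome Y."""
    rng = random.Random(seed)
    g, x, y = [], [], []
    for _ in range(n):
        gi = (rng.random() < 0.3) + (rng.random() < 0.3)   # allele count 0, 1 or 2
        u = rng.gauss(0, 1)                                # unmeasured confounder
        xi = 0.5 * gi + u + rng.gauss(0, 1)                # exposure depends on G and U
        yi = true_effect * xi + u + rng.gauss(0, 1)        # outcome depends on X and U only
        g.append(gi)
        x.append(xi)
        y.append(yi)
    return g, x, y


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)


def observational_slope(x, y):
    """Ordinary regression slope of Y on X, distorted by the confounder."""
    return cov(x, y) / cov(x, x)


def wald_ratio(g, x, y):
    """Instrumental variable (ratio) estimate using genotype as the instrument."""
    return cov(g, y) / cov(g, x)


if __name__ == "__main__":
    g, x, y = simulate()
    print("observational slope:", round(observational_slope(x, y), 2))
    print("instrumental variable estimate:", round(wald_ratio(g, x, y), 2))
```

In this set-up the observational slope is pulled well above the true effect of 0.5 by the confounder, whereas the genotype-based estimate is centred near 0.5 but is less precise, which is why very large samples are typically needed.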


Davey Smith, G. 2006: Capitalising on Mendelian randomization to assess the effects of treatments. James Lind Library: www.jameslindlibrary.org. Davey Smith, G. and Ebrahim, S. 2003: 'Mendelian randomisation': can genetic epidemiology contribute to understanding environmental determinants of disease? International Journal of Epidemiology 32, 1-22. Davey Smith, G., Lawlor, D. A., Harbord, R., Timpson, N., Day, I. and Ebrahim, S. 2008: Clustered environments and randomized genes: a fundamental distinction between conventional and genetic epidemiology. PLoS Medicine 4, e352; DOI:10.1371/journal.pmed.0040352. Gray, R. and Wheatley, K. 1991: How to avoid bias when comparing bone marrow transplantation with chemotherapy. Bone Marrow Transplantation 7 (Suppl. 3), 9-12. Katan, M. B. 1986: Apolipoprotein E isoforms, serum cholesterol, and cancer. Lancet 1, 8479, 507-8. Lawlor, D. A., Harbord, R. M., Sterne, J. A. C., Timpson, N. and Davey Smith, G. 2008: Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Statistics in Medicine 27, 8, 1133-63. Thomas, D. C. and Conti, D. V. 2004: Commentary: the concept of 'Mendelian randomization'. International Journal of Epidemiology 33, 21-5.

meta-analysis See SYSTEMATIC REVIEWS AND META-ANALYSIS

meta-regression This is an analysis of the relationship between study characteristics and study results in the context of a META-ANALYSIS. Independent studies of the same problem, e.g. multiple CLINICAL TRIALS of a particular drug or multiple CASE-CONTROL STUDIES of the same exposure-disease ASSOCIATION, will inevitably differ in many ways. Some of the variation may cause the effects being evaluated to be different in different studies, a situation commonly known as heterogeneity. Meta-regression analyses are similar to traditional LINEAR REGRESSION analyses, a conceptual difference being that entire studies, rather than individuals, are the units of analysis. Characteristics of studies are used as explanatory (independent) variables and estimates of effect are used as outcome (dependent) variables. Regression coefficients describe how the effects across the studies increase per unit increase in the characteristic.

Study characteristics might include numerical summaries of types of participants, variation in the implementation of an intervention or different methodological features. Estimates of effect may be, for example, ODDS RATIOS, hazard ratios or differences in mean responses, depending on the type of study and the nature of the outcome data. Ratio measures of effect are usually analysed on the (natural, or base e) log scale. Studies are weighted in the analysis to reflect imprecision in their results, the weights typically involving the inverse variances of the effect estimates.

A meta-regression may be a primary reason for assembling multiple studies, although meta-regressions are perhaps most commonly used as secondary analyses to investigate heterogeneity when a traditional meta-analysis was the primary objective of a review of studies. The study characteristics may be categorical or quantitative and several may be included in the same analysis. For categorical characteristics, meta-regression may be viewed as a generalisation of subgroup analyses, where the subgrouping is by studies rather than by participants. Meta-regression should ideally be conducted only as part of a thorough systematic review to ensure that the studies involved are reliably identified and appraised.

Notable examples of meta-regression analyses include an investigation of the dose-response relationship between aspirin and secondary prevention of stroke. Among clinical trials administering aspirin at different doses, no relationship was apparent between aspirin dose and the relative risk of recurrence (Johnson et al., 1999). A second example is provided by Zeegers, Jellema and Ostrer (2003), who present a meta-analysis of observational studies comparing prostate cancer risk between people with and without family history of prostate cancer. They used meta-regression to perform several subgroup analyses to assess the robustness of their finding that a family history of the disease is associated with roughly a doubling of risk. Studies were broken down by study design, year of publication and ethnic group, among other characteristics. A third example that has inspired development of meta-regression methodology is an analysis describing a relationship between the geographical latitude of studies of the BCG vaccine and the relative risk of tuberculosis in those vaccinated versus those not (Berkey et al., 1995).

A convenient illustration of a meta-regression analysis for a single characteristic is a simple SCATTERPLOT as in the figure, which shows the result of the BCG vaccine meta-regression. The circles represent studies, with the size of each circle proportional to the precision of the relative risk estimate from that study. The meta-regression line illustrates that the vaccine was observed to be more effective further away from the equator.

[Figure: scatterplot of relative risk (logarithmic axis, ticks from 0.1 to 2) against distance from equator (degrees of latitude), with fitted meta-regression line]

meta-regression Meta-regression analysis of the relationship between effectiveness of BCG vaccine and latitude of study (data from Berkey et al., 1995)

Meta-regression analyses involve observational comparisons across studies and may suffer from BIAS due to confounding, since studies similar in one characteristic may be similar in others. Causal relationships between characteristics and results can seldom be drawn with confidence. A particular problem is that in most situations the number of studies in a meta-regression is small while the number of potentially important characteristics is large. Thus any meta-regression analyses performed should be driven by a strong scientific rationale and ideally pre-specified and limited in number. It may be necessary to control for the possibility of false-positive findings since the risk of a TYPE I ERROR increases substantially when multiple meta-regression analyses are undertaken.

It is possible to summarise participant-level characteristics at the level of a study for use in a meta-regression. Thus the MEAN age of participants, the proportion of females or the average length of follow-up might be used as study-level characteristics. Such analyses should be interpreted carefully, as they may not adequately detect true associations. For example, suppose the effect of an intervention truly depends on a patient's age. If several clinical trials each include a wide range of ages, but the mean age is similar across trials, then a meta-regression relating mean age to size of effect will fail to detect the relationship that would be evident from within-trial analyses. When interest focuses on participant-level characteristics, a meta-analysis of individual participant-level data is the most reliable method of separating within-study from among-study relationships. Potential limitations of meta-regression, including those already mentioned, are discussed by Thompson and Higgins (2002).

In common with meta-analysis, meta-regression may be conducted assuming either a FIXED EFFECT model or a RANDOM EFFECTS MODEL. A fixed effect meta-regression assumes that the study characteristic(s) explain all of the interstudy variation in effects. It may be performed using weighted linear regression of the effect estimates on the study characteristics, weighting by the inverse VARIANCES of the effect estimates. However, the STANDARD ERRORS of the regression coefficients need to be corrected to account for the fact that the variances are known, by dividing them by the square root of the mean square error from the weighted regression. A random effects meta-regression allows for variation in study effects that is not explained by the study characteristics. Such 'residual heterogeneity' of effects is commonly assumed to follow a NORMAL DISTRIBUTION analogous to a random effects meta-analysis. Random effects meta-regression requires more specialised software, although a convenient implementation is available for Stata (Sterne, Bradburn and Egger, 2001 and see also STATISTICAL PACKAGES). Since it is unlikely that heterogeneity can be fully explained by a finite selection of study characteristics, random effects meta-regression has been recommended as the default choice (Thompson and Higgins, 2002).
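The fixed effect calculation described above can be sketched for a single study characteristic. This is an illustrative sketch only: the data are invented for the example (they are not the BCG trial results), the function name is hypothetical, and the correction is implemented as division of the weighted least squares standard error by the square root of the residual mean square.

```python
from math import sqrt


def fixed_effect_meta_regression(effects, variances, covariate):
    """Inverse-variance weighted regression of study effects on one characteristic."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    xbar = sum(wi * x for wi, x in zip(w, covariate)) / sw
    ybar = sum(wi * y for wi, y in zip(w, effects)) / sw
    sxx = sum(wi * (x - xbar) ** 2 for wi, x in zip(w, covariate))
    sxy = sum(wi * (x - xbar) * (y - ybar) for wi, x, y in zip(w, covariate, effects))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    rss = sum(wi * (y - intercept - slope * x) ** 2
              for wi, x, y in zip(w, covariate, effects))
    mse = rss / (len(effects) - 2)          # residual mean square from the weighted fit
    se_wls = sqrt(mse / sxx)                # what standard regression software reports
    se_fixed = sqrt(1.0 / sxx)              # corrected value, equal to se_wls / sqrt(mse)
    return {"intercept": intercept, "slope": slope, "mse": mse,
            "se_wls": se_wls, "se_fixed": se_fixed}


if __name__ == "__main__":
    # Hypothetical log relative risks, their variances and study latitudes.
    log_rr = [-0.10, -0.35, -0.50, -0.95, -1.20]
    var = [0.04, 0.09, 0.02, 0.05, 0.03]
    latitude = [10, 20, 30, 40, 50]
    print(fixed_effect_meta_regression(log_rr, var, latitude))
```

A random effects meta-regression would add a between-study variance component to each weight; that step is not shown here because it requires an estimate of the residual heterogeneity.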

Meta-regression may be performed using alternative methods specific to the nature of the individual level outcome data when these are available. For example, when the outcome data from individuals in the studies are binary, then meta-regression may be undertaken using logistic regression. Implementations of meta-regression that require special consideration include the relationship between effect estimates and underlying risk (due to correlation between effect estimates and risks), investigations of publication bias (again due to possible correlation between effect estimates and measures of precision) and nonindependent outcomes (e.g. entering subgroups from the same study). JPTH

Berkey, C. S., Hoaglin, D. C., Mosteller, F. and Colditz, G. A. 1995: A random effects regression model for meta-analysis. Statistics in Medicine 14, 395-411. Johnson, E. S., Lanes, S. F., Wentworth III, C. E., Satterfield, M. H., Abebe, B. L. and Dicker, L. W. 1999: A metaregression analysis of the dose-response effect of aspirin on stroke. Archives of Internal Medicine 159, 1248-53. Sterne, J. A. C., Bradburn, M. J. and Egger, M. 2001: Meta-analysis in Stata™. In Egger, M., Davey Smith, G. and Altman, D. G. (eds), Systematic reviews in health care: meta-analysis in context, 2nd edition. London: BMJ Publication Group. Thompson, S. G. and Higgins, J. P. T. 2002: How should meta-regression analyses be undertaken and interpreted? Statistics in Medicine 21, 1559-73. Zeegers, M. P. A., Jellema, A. and Ostrer, H. 2003: Empiric risk of prostate carcinoma for relatives of patients with prostate carcinoma: a meta-analysis. Cancer 97, 1894-903.

method comparison studies At its simplest, a method comparison study involves the measurement of a given characteristic on a sample of subjects or specimens by two different methods. To take a simple and familiar example, we could imagine a study in which the body temperature of each of, say, n patients was assessed once using an old mercury thermometer calibrated in degrees Fahrenheit (F) and again using a modern thermometer calibrated in degrees Celsius (C). If the true temperature of the ith individual (in degrees Celsius) is $\tau_i$ then the resulting set of measurements might be represented by the following two equations:

$$C_i = \tau_i + \delta_i, \qquad F_i = 32 + 1.8\,\tau_i + \varepsilon_i \qquad (1)$$

The numbers 32 and 1.8 follow from the temperatures of freezing and boiling water (i.e. 0°C ≡ 32°F; 100°C ≡ 212°F). The key characteristic of the design is that the resulting data are a series of paired measurements ($C_i$, $F_i$). The $\delta$ and $\varepsilon$ values in these two equations correspond to random measurement errors that are assumed to be uncorrelated both with each other and with the patient's true temperature. They are both assumed to have an average value of zero. They cannot be determined individually but statistical methods can be used to assess their variability (their variances, $\sigma_\delta^2$ and $\sigma_\varepsilon^2$, respectively).

Now we will complicate matters by choosing to measure a characteristic, such as a tissue enzyme, using two different assay methods with indeterminate scales. Arbitrarily choosing one (X) to be the standard (or, indeed, it may be already recognised as the standard against which a new assay is being compared) and the other (Y) to be the comparator, a realistic statistical model might have the form:

$$X_i = \tau_i + \delta_i, \qquad Y_i = \alpha + \beta\,\tau_i + \varepsilon_i \qquad (2)$$

Here the values 32 and 1.8 have been replaced by unknown constants $\alpha$ and $\beta$ respectively. Our task is now to take the n pairs of measurements (the $X_i$, $Y_i$) and use statistical methods to estimate $\alpha$ and $\beta$, together with the variances $\sigma_\delta^2$ and $\sigma_\varepsilon^2$. These are the parameters of the MEASUREMENT ERROR model (possibly with the addition of the variance of the true scores, $\sigma_\tau^2$, depending on how the patients or specimens have been selected). Before describing how we might attempt to carry out this estimation, however, it will be useful to discuss briefly what we might wish to learn or decide from the results of a method comparison study.

We might wish to estimate the parameters of a relative calibration or measurement error model such as that described by the pair of equations in (2). It would obviously be of interest to know the values of $\alpha$ and $\beta$ so that we might know how to convert the scale of X to that of Y, or vice versa. In particular, we might wish to establish whether $\alpha = 0$ or $\beta = 1$, or both (i.e. are the scales the same?). If the scales of measurement are the same (i.e. $\alpha = 0$ and $\beta = 1$) then the two measurement methods are MEAN or average equivalent. In this case we might also wish to know whether the two methods are equally precise (or whether one is more precise than the other) by comparing the estimates of the VARIANCES of the measurement errors (i.e. comparing estimates of $\sigma_\delta^2$ and $\sigma_\varepsilon^2$). If two methods are mean equivalent and their precisions are the same, then they are fully individually equivalent. In the theory of psychometric tests (applicable to the measurement of depression or anxiety, for example), tests that are mean equivalent or individually equivalent are referred to as being $\tau$-equivalent or parallel respectively. Measurements using alternative methods that are individually equivalent or parallel are fully interchangeable without any loss of information.

Suppose, however, that we wish to evaluate and compare the precisions of two methods that are known not to be mean equivalent. How, for example, do we compare the performance (precision) of an old thermometer calibrated in degrees Fahrenheit with a new one in degrees Celsius? We would need first to convert the Fahrenheit measurements to degrees Celsius (or vice versa) and only then compare the variances of the measurement errors. For methods X and Y, the relevant ratio (i.e. relative precision) for this comparison is $\beta^2\sigma_\delta^2/\sigma_\varepsilon^2$. Here a direct comparison of $\sigma_\delta^2$ and $\sigma_\varepsilon^2$ would provide the answer to the wrong question.

A less stringent question might involve asking whether the two measurements on a given patient are close enough. We do not ask whether two methods are exactly equivalent but whether, for all practical purposes, they are interchangeable. In this situation we may abandon the measurement model in the equations in (2) entirely and concentrate on the paired differences ($X_i - Y_i$) as indicators of agreement between the two methods (Bland and Altman, 1986, 1999). If the agreement is good enough then we can for all practical purposes replace a measurement made using one of the methods by a corresponding measurement using the other one. This is the rationale for the construction of LIMITS OF AGREEMENT (Bland and Altman, 1986, 1999). A very useful graphical summary to accompany these calculations is what is usually known as the Bland-Altman plot, a plot of the difference between the two measurements, $X_i - Y_i$, against their mean, $(X_i + Y_i)/2$ (Bland and Altman, 1986, 1999). In addition, one might wish to produce a simple Y versus X SCATTERPLOT, together with an estimate of their product-moment correlation and concordance CORRELATION (Lin, 1989).

Many investigators, however, will wish to go beyond testing for equivalence. They will wish to know, for example, whether Y is better than X. Is the new method an improvement on the old one? Or, contrariwise, is it worse? If this is the aim then we have no option but to collect the relevant data (the design might need to be more informative than those discussed so far), postulate realistic statistical models for the measurements and proceed to test whether the models are appropriate and, if so, to find what the estimates of the model's parameters tell us about the performance of the methods.
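The agreement approach based on paired differences can be sketched as follows. The sketch assumes the conventional definition of 95% limits of agreement as the mean difference plus or minus 1.96 standard deviations of the differences; the function names and the example measurements are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(x, y, z_crit=1.96):
    """Mean difference (bias) and approximate 95% limits of agreement."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    sd = stdev(diffs)                      # sample standard deviation of differences
    return bias, bias - z_crit * sd, bias + z_crit * sd


def bland_altman_points(x, y):
    """Coordinates for a Bland-Altman plot: (pair mean, pair difference)."""
    return [((a + b) / 2, a - b) for a, b in zip(x, y)]


if __name__ == "__main__":
    x = [10, 12, 14, 16]                   # invented measurements by method X
    y = [9, 12, 13, 17]                    # invented measurements by method Y
    print(limits_of_agreement(x, y))
    print(bland_altman_points(x, y))
```

Whether the resulting limits are narrow enough for the two methods to be treated as interchangeable is a clinical judgement rather than a statistical one.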
Returning to the statistical model described in (2), how do we estimate the parameters and test hypotheses concerning them, given a set of paired measurements (X_i, Y_i)? Well, the simple answer is that we cannot. There is insufficient information provided by the data to enable us to estimate these parameters. The technical phrase for this is the 'problem of model underidentification'. The only way to proceed is by making various assumptions concerning some of the parameters to produce a model that is identified and then to estimate the remaining parameters. Examples of these assumptions include (a) knowing the variance of the measurement errors of the standard (or its RELIABILITY), (b) assuming a common scale of measurement (i.e. that β = 1) or (c) knowing the relative sizes of the two measurement error variances (i.e. their ratio). The trouble with each of these assumptions is that we are assuming something about the measurement methods that we


would ideally have wished to study as part of the method comparison study. The other problem is that if the chosen assumption is not actually valid we are likely to finish up with the wrong conclusions. Consider, for example, the assumption that we know the ratio of the error variances. This leads to the use of a method known as orthogonal or Deming's regression (very popular in clinical chemistry). Typically the measurement error variances are estimated for each of the methods by repeatedly measuring the relevant characteristic on the same individual(s) or specimen(s). This enables us to estimate repeatability variances. These are only valid estimates of the measurement error variances if the repeated measurements do not have correlated measurement errors. Correlated measurement errors are almost universal and one should be very wary of the use of Deming's regression when they are known to be a possibility (Carroll and Ruppert, 1996).

The only really satisfactory way out is to use a more informative design involving one or more of the following features: replication using each of the methods, the use of instrumental variables (see INSTRUMENTAL VARIABLES) and the use of more than three different methods of measurement within the study. The other key feature of these studies should be an adequate sample size. Most method comparison studies are too small. Statistical analyses for the data arising from the more informative designs, with more realistic measurement models, are beyond the scope of this entry, but the methods are described in considerable detail in Dunn (2004). The methods typically involve software developed for STRUCTURAL EQUATION MODELLING (see SOFTWARE FOR STRUCTURAL EQUATIONS MODELS). Methods for the comparison of binary measurements (diagnostic tests) can also be found in Dunn (2004). GD
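For readers who want to see what the known-ratio assumption buys, here is a minimal sketch of the Deming slope estimate. It assumes that `delta`, the ratio of the error variance of Y to the error variance of X, is known; `delta = 1` gives orthogonal regression. The function name `deming_fit` and the data are hypothetical, and the sketch inherits every caveat raised above about correlated measurement errors.

```python
import numpy as np

def deming_fit(x, y, delta=1.0):
    """Deming regression of y on x.

    delta is the ASSUMED known ratio var(error in y) / var(error in x).
    Returns (intercept, slope). Undefined when the sample covariance is zero.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sxx = np.var(x, ddof=1)
    syy = np.var(y, ddof=1)
    sxy = np.cov(x, y, ddof=1)[0, 1]
    root = np.sqrt((syy - delta * sxx) ** 2 + 4.0 * delta * sxy ** 2)
    slope = (syy - delta * sxx + root) / (2.0 * sxy)
    intercept = y.mean() - slope * x.mean()
    return intercept, slope

# Hypothetical error-free data lying exactly on y = 1 + 2x
intercept, slope = deming_fit([1, 2, 3, 4], [3, 5, 7, 9], delta=1.0)
print(intercept, slope)
```

If the assumed ratio is wrong, the slope estimate is biased, which is the practical reason for preferring the more informative designs listed above.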

Bland, J. M. and Altman, D. G. 1986: Statistical methods for assessing agreement between two methods of clinical measurement. Lancet i, 307-10.
Bland, J. M. and Altman, D. G. 1999: Measuring agreement in method comparison studies. Statistical Methods in Medical Research 8, 135-60.
Carroll, R. J. and Ruppert, D. 1996: The use and misuse of orthogonal regression in linear errors-in-variables models. The American Statistician 50, 1-6.
Dunn, G. 2004: Statistical evaluation of measurement errors. London: Arnold.
Lin, L. I-K. 1989: A concordance correlation coefficient to evaluate reproducibility. Biometrics 45, 255-68 (see Corrections in Biometrics 56, 324-5).

MHRA

See MEDICINES AND HEALTHCARE PRODUCTS REGULATORY AGENCY

microarray experiments

These are studies in microbiology that are designed to measure the expression levels of genes in a particular organism, generally in response to some stimuli or conditions believed to stimulate the organism's genes, and hence can be used to assess probabilistically the risk of developing a disease, or assessing environmental sensitivity or adaptability, for which genetic expression has been identified. Presently, three technologies have been developed to measure these expression levels; all are based on the biological concept of hybridisation between matching nucleotides. By comparing the expression levels of the full complement of genes from the organism with those from a 'normal' or 'control' individual, one can identify genes that are 'differentially expressed' (expressed at different levels by the two genomes). To the extent that a gene has been identified as either causative or highly associated with a particular disease (e.g. BRCA1 gene on chromosome 17 for suppression of an early-onset breast cancer tumour or the Rb gene on chromosome 13 for suppression of a retinoblastoma tumour), the risk of disease can be estimated and mechanisms of causation can be elucidated. In some cases, the analysis may permit early intervention to prevent the onset or progression of the disease (e.g. identification of an absent gene may permit early diagnosis and treatment). Similarly, adaptability or sensitivity to environmental contaminants (e.g. metals) can be compared among organisms that manifest differing levels of gene expression. (An example of such an ASSOCIATION has been identified in Daphnia, a freshwater crustacean that has a compact and well-characterised genome sequence, in their acclimation and adaptation to cadmium in lakes; cf. Shaw et al., 2008.) The identification of genes responsible for an organism's health condition and response to the environment provides important clues on potential causes of disease.

The measurements in microarray experiments reflect levels of complementary deoxyribonucleic acid (cDNA), reverse-transcribed (as explained below) from cellular messenger ribonucleic acid (mRNA), not the levels of proteins that the organism manufactures in response to measured elevated levels of mRNA. (While not definitely proven, one assumes that an organism's mRNA levels would increase as a precursor to the manufacture of relevant protein products needed to respond to the stimulus.) Presently, three technologies permit the measurement of gene expression levels on microarrays: treated glass slides with spotted and immobilized cDNA (cDNA slides), chips spotted with manufactured 25 base pair sequences of nucleotides found in genes (oligonucleotide arrays) and high-density chips with synthesised longer sequences of oligonucleotides (high-density chips). All three technologies are based on the biological concept of hybridisation between matching nucleotides, and can contain multiple copies of single-stranded genes or gene fragments, called probes, linked to a substrate or surface for binding with expressed transcripts from target tissues.

The genetic code for an organism is contained in organised strings of four nucleotides (A = adenine, C = cytosine, G = guanine, T = thymine), arranged in triplets such that

each triplet codes for one of 20 amino acids. (Multiple triplets may code for the same amino acid.) Strings of amino acids are called peptides. Peptides can act independently in a cell or they can combine with other peptides to form complex proteins used by the organism for cell function. Genetic material known as deoxyribonucleic acid (DNA) is arranged in a double-stranded helical structure, with complementary base pairs on either side of the helix (AT/TA or CG/GC). In response to a stimulus to produce a protein, the coding genes in the DNA are transcribed into messenger RNA (mRNA) for translation into peptides.

To test for gene expression, the investigator harvests cells from tissues of the types under study and the mRNA is reverse-transcribed into its more stable form (complementary DNA, or cDNA), split into smaller strands and denatured ('unzipped'), yielding single-stranded cDNA. The various tissue types are labelled with different fluorescing chemicals that can be detected by an instrument. Present instrumentation allows for the detection of two different chemicals that fluoresce at sufficiently different wavelengths that they can be readily distinguished, thus allowing the co-hybridisation of treatment and control samples on the same microarray for a direct comparison and minimizing technical variability, though the technology is not limited to only two scanning channels. For quantitative measurement, one sample or a mixture of cDNA strands is placed on to the slide or chip containing the gene probes, and the strands of nucleotides in the target sample are allowed to bind (hybridise) to their matching partners. Spots on the slide or chip where hybridisation has occurred indicate gene products that are present in large quantities and may have been expressed in response to the stimulus. With this technology, the reported intensity level at a particular location on the slide or chip is a summary of fluorescence measurements detected by a CCD (charge-coupled device) camera as a series of pixels that comprise the spot on the slide.

The three related technologies for measuring gene expression differ in the process that is used to manufacture the probes on the slide or chip. In cDNA slides, the probes typically are obtained from a cDNA library, which has thousands of bacterial colonies with cloned cDNA fragments. Once isolated in the bacterial hosts, the DNA fragments undergo a series of complex processes that amplify and then mechanically deposit them on the treated glass slide substrate. In oligonucleotide arrays, the gene fragments consist of specific DNA strings ('probes') of 25-70 manufactured nucleotides that are placed on the chip robotically. Commercial microarray manufacturers use either photolithography or digital mirror devices and photoreactant chemistries. In all cases, the DNA probes are placed in an array of rows and columns, hence the term 'microarray'.

The technologies also differ in the experimental protocols that yield quantitative gene expression data. For glass substrate microarrays, the hybridisation solution

contains a mixture of two types of cells, control and experimental, whose mRNA is reverse-transcribed into the more stable cDNA and then labelled with two different fluorophores: control cells labelled with Cyanine 3, or Cy3 (green dye), and experimental cells, e.g. cells subjected to stress, heat, radiation or chemicals, or known to originate from diseased tissue, labelled with Cyanine 5, or Cy5 (red dye). When the mRNA concentration is high in these samples, their cDNA will bind to their corresponding probes on the spotted cDNA slide; an optical detector in a laser scanner will measure the fluorescence at wavelengths corresponding to the green and red dyes (532 nm and 635 nm respectively). Good EXPERIMENTAL DESIGN will include technical replicates that interchange the dyes in a separate hybridisation experiment to account for imbalances in the signal intensities from the two types of fluorophores and expected BIASES from gene-dye interactions (e.g. possible degradation in the cDNA samples between the first scan at 532 nm and second scan at 635 nm). The ratio of the relative abundance of red and green dyes at these two wavelengths on a certain spot indicates relative mRNA concentration between the experimental and control samples at those genes. Thus, the gene expression levels in the genes under the experimental condition can be compared directly with those under the control condition. However, other technologies are limited to hybridising a single labelled sample or target at one time, thus requiring the addition of control probes and alternate protocols for normalising the signal intensity data across replicates and across chips for reliable comparisons of gene expression between the experimental and control conditions.

Oligonucleotide arrays circumvent the possible inaccuracies that can arise in the preparation of cDNA probes and the control and experimental samples for spotted array slides, by using predefined and prefabricated sequences of 25-70 nucleotides to characterise each gene. Rather than mechanically depositing DNA, oligonucleotide probes can be synthesised directly on to the substrate. For arrays manufactured via the photolithography-like process, the probe cells measure 24 × 24 or 50 × 50 micrometres square and are divided in 8 × 8 pixels; arrays manufactured using photoreactant technology measure 13 × 13 micrometres and these arrays can accommodate more probes. As with cDNA slides, cells from the target sample, labelled again with fluorophores, will hybridise to those squares on the chip that correspond to the complementary strands of the target sample's single-stranded cDNA. For these experiments, the target sample contains only one type of cell (e.g. treatment, or control); the assessment of expression is in comparison to the expression level on an adjacent probe, which is exactly the same as the gene probe except for certain nucleotides (e.g. 13th out of 25 nucleotides). This 'mismatch' (MM) for the 'perfect match' (PM) sequence is only a rough guide, since a target sample with elevated mRNA concentration for a

certain gene may hybridise sufficiently to both the PM and MM probes. However, the results are believed to be less variable, since the probes on the chips are manufactured in more carefully controlled concentrations. Gene expression levels are measured again by a laser scanner that detects the optical energy in the pixels at the various probes (PM and MM) on the chip.

The analysis of the data (fluorescence levels at the various locations on the slide or chip) depends upon the technology. For cDNA experiments, the analysis usually involves the logarithm of the ratio of the expression levels between the target and control samples. For oligonucleotide experiments, the analysis involves a weighted linear combination of the logarithm of the PM expression level and the logarithm of the MM expression level (with some authors choosing zero for the weights of the MM values).

Microarray analysis involves several considerations, including: the separation of 'spot' pixels from 'background' pixels and the determination of the expression level from the intensities recorded from the 'spot' pixels; the adjustment of the calculated spot intensity for background ('background correction'); the normalisation of the range of fluorescence values from one experiment to another, particularly with oligonucleotide chips; experimental design of multiple slides or chips (Kerr and Churchill, 2001; Yang and Speed, 2002; Casella, 2001); data transformations (Yang et al., 2002; Kafadar and Phang, 2003); statistical methods of inference and combining information from multiple cDNA experiments (Amaratunga and Cabrera, 2001; Dudoit et al., 2002) and from multiple oligonucleotide arrays (Efron et al., 2001; Irizarry et al., 2002) and adjustments for MULTIPLE COMPARISONS (Reiner, Yekutieli and Benjamini, 2002; Efron, 2004; Benjamini and Yekutieli, 2005).

The 'low-level' analysis consists of the necessary 'pre-processing' steps, including data TRANSFORMATIONS (usually the logarithm) to address partially the nonnormality of the expression levels, and normalisation and background correction methods to adjust for different signal intensities across different microarray experiments and sources of variation arising from the chip manufacturing process and background intensity levels. The 'high-level' analysis usually involves clustering (see CLUSTER ANALYSIS IN MEDICINE) the gene expression levels into groups of genes that are believed to respond similarly, but no consensus has been achieved on the best methods for normalising, clustering and reducing the number of genes to consider as 'significantly differentially expressed' when searching for associations between disease and gene locations on chromosomes.

Microarray analyses have become a standard screening tool in the exploration and elucidation of mechanisms of disease and for studying the interface between the environment and the genome. They have also been used to understand the effect of certain exposures better, such as anthrax or anthrax-like organisms, or metals, on cells from

human and animal populations, and hence to characterise better the risk of such agents to these populations (Human Genome Program, 2002). Examples of useful gene microarray-based investigations that potentially can improve public health practice include discoveries of candidates for biomarkers of disease; measurements of perturbations in the cell cycle under differing conditions; uncovering genetic underpinnings of numerous human, animal and plant diseases; and explorations of the impacts of environmental change on ecosystems and populations of humans, animals, plants and disease-causing organisms. An example of the potential utility of microarrays is in the tracking of antigenic drifts and shifts of influenza viruses and the changes in the virus-host relationship. Through these types of analyses, the fields of genomics, proteomics and statistics will contribute substantially to how we perceive, measure and address disease and environmental change. Genome- and proteome-scale microarrays will be used increasingly as a cost-effective means of quantifying risk and identifying precursors for diseases, for developing vaccines and for other public health and environmental initiatives. KK/KB

[See also ALLELIC ASSOCIATION, GENETIC EPIDEMIOLOGY]

Amaratunga, D. and Cabrera, J. 2001: Analysis of data from viral DNA microchips. Journal of the American Statistical Association

96, 456, 1161-70.
Casella, G. 2001: Statistical design. New York: Springer.
Dudoit, S., Yang, Y., Callow, M. and Speed, T. 2002: Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. Statistica Sinica 12, 111-39.
Efron, B. 2004: Large-scale simultaneous hypothesis testing: the choice of a null hypothesis. Journal of the American Statistical Association 99, 96-104.
Efron, B., Tibshirani, R., Storey, J. and Tusher, V. 2001: Empirical Bayes analysis of a microarray experiment. Journal of the American Statistical Association 96, 1151-60.
Human Genome Program 2002: US Dept of Energy Human Genome News V12, N1-2, February 2002: http://www.ornl.gov/sci/techresources/Human_Genome/publicat/hgn/v12n1/HGN121_2.pdf.
Irizarry, R. A., Hobbs, B., Collin, F., Beazer-Barclay, Y. D., Antonellis, K. J., Scherf, U. and Speed, T. P. 2002: Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 19, 185-93.
Kafadar, K. and Phang, T. 2003: Transformations, background estimation, and process effects in the statistical analysis of microarrays. Computational Statistics and Data Analysis 44, 313-38.
Reiner, A., Yekutieli, D. and Benjamini, Y. 2002: Identifying differentially expressed genes using false discovery rate controlling procedures. Bioinformatics 19, 3, 368-75.
Shaw, J. R., Pfrender, M. E., Eads, B. D., Klaper, R., Callaghan, A., Colson, I., Jansen, B., Gilbert, D. and Colbourne, J. K. 2008: Daphnia as an emerging model for toxicological genomics. In Hogstrand, C. and Kille, P. (eds), Advances in experimental biology on toxicogenomics. Elsevier, pp. 165-219.
Yang, Y. H. and Speed, T. P. 2002: Design issues for cDNA microarray experiments. Nature Reviews 3, 579-88.
Yang, Y. H., Dudoit, S., Luu, P. and Speed, T. P. 2002: Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Research 30, E15.
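As a rough illustration of two steps named in the microarray experiments entry above, the sketch below computes per-spot log ratios for a two-colour cDNA slide and then applies a false discovery rate adjustment to a set of per-gene p-values. It assumes background-corrected intensities, base-2 logarithms and the Benjamini-Hochberg step-up procedure as one possible multiple-comparisons adjustment; the function names and all numbers are hypothetical, and normalisation is deliberately left out.

```python
import numpy as np

def log_ratios(red, green):
    """log2(red / green) per spot, for background-corrected intensities."""
    red = np.asarray(red, dtype=float)
    green = np.asarray(green, dtype=float)
    return np.log2(red / green)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # enforce monotonicity, working down from the largest p-value
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(adjusted_sorted, 1.0)
    return adjusted

# Hypothetical Cy5 (experimental) and Cy3 (control) intensities for two spots
print(log_ratios([400.0, 100.0], [100.0, 100.0]))
# Hypothetical per-gene p-values from some test of differential expression
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Genes whose adjusted p-value falls below a chosen false discovery rate would be flagged as differentially expressed; as the entry notes, there is no consensus on the best way to make that cut.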

mid-P-value

See EXACT METHODS FOR CATEGORICAL DATA

MIM

See GRAPHICAL MODELS

minimisation

This method is sometimes used to balance RANDOMISATION in a CLINICAL TRIAL when there are several factors on which it is considered necessary to try to force balance across the treatment groups. Simple randomisation will, in theory (or in 'the long run'), ensure that treatment groups are equally represented with respect to all known and unknown prognostic factors but, for any particular trial, this balance may not be as good as we would hope. When there are only a few factors for which balance is necessary (such as gender or stage of disease) then simple stratified randomisation may be sufficient. However, if there are more than two or three factors on which to try to balance, then the number of strata becomes excessive and the logistics of the trial become overwhelming. Minimisation was a method proposed by Taves (1974) and, more extensively, by Pocock and Simon (1975) as a way of balancing simultaneously for several factors (see also Pocock, 1983, pp. 84-6).

It is important to realise that in most trials patients arrive sequentially, rather than all being available as a 'pool' of patients at the beginning. Hence, when a patient of a certain demographic and/or disease state enrols, we do not know when (or even if) a similar patient will enrol subsequently. However, if two similar patients were to be available for a trial, it would be desirable to allocate one to each of the treatment groups (a method easily extendable to more than two treatment groups). If there were only one factor on which to balance the randomisation then for patients within each stratum we would (optimally) allocate them alternately between the treatments. If there is more than one factor, e.g. gender (male/female) and disease stage (early/progressive/advanced), we have to 'trade off' the benefits of allocating to one treatment in order to ensure an equal balance of males/females across the treatment groups and simultaneously to ensure an equal balance of early/progressive/advanced patients across the treatment groups. Often to balance gender, we might be better off allocating the patient to one treatment but to balance for disease stage we might be better off allocating the patient to the other. Hence we use the term 'minimisation': to try to minimise the degree of imbalance across all the identified factors.

The following example is described by Day (1999) and concerns a trial randomising general practitioners to an


intervention group or to control (see also Steptoe et al., 1999). Three factors were identified on which to balance the groups: the Jarman score (a level of social status), the ratio of number of patients to hourly nurse practice hours ('low' or 'high') and the fundholding status of the practice (in three categories). Assume we are partway through the trial and the first 18 practices have been allocated as in the table. Balance looks reasonably good. Now assume that the next (the 19th) practice is of type: low Jarman score, high patient-practice nurse-hours and is a nonfundholder. We calculate 'scores' for these types of practice, which are 4 + 5 + 3 = 12 for the intervention group and 3 + 4 + 3 = 10 for the control group. Imbalance is 'in favour' of intervention, so by allocating this practice to control, we minimise the imbalance.
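The scoring step in this worked example can be sketched in a few lines. This is a deterministic illustration only: the dictionary keys and function names are hypothetical, the counts are transcribed from the table of the first 18 practices as read here, and the random element that practical minimisation algorithms usually add is left out.

```python
# counts[factor][level] = (intervention, control) among the first 18 practices
counts = {
    "jarman": {"low": (4, 3), "middle": (3, 5), "high": (2, 1)},
    "nurse_hours": {"low": (4, 5), "high": (5, 4)},
    "fundholding": {"non": (3, 3), "wave1": (4, 3), "wave2": (2, 3)},
}

def minimisation_scores(counts, profile):
    """Sum, over factors, the existing counts that share the new unit's levels."""
    intervention = sum(counts[f][level][0] for f, level in profile.items())
    control = sum(counts[f][level][1] for f, level in profile.items())
    return intervention, control

def allocate(counts, profile):
    """Allocate to the arm with the smaller score; report a tie explicitly."""
    intervention, control = minimisation_scores(counts, profile)
    if intervention > control:
        return "control"
    if control > intervention:
        return "intervention"
    return "tie"

# Profile of the 19th practice from the worked example
practice_19 = {"jarman": "low", "nurse_hours": "high", "fundholding": "non"}
print(minimisation_scores(counts, practice_19))  # (12, 10)
print(allocate(counts, practice_19))             # control
```

In a real trial a tie, and often the non-tied case as well, would be resolved with a random draw so that the next assignment cannot be predicted with certainty.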

minimisation  Allocation of first 18 general practices and profile of 19th practice indicated (*)

Prognostic factor                 Intervention group   Control group
Jarman score
  Low *                                   4                  3
  Middle                                  3                  5
  High                                    2                  1
Patient-practice hours/week
  Low                                     4                  5
  High *                                  5                  4
Fundholding status
  Nonfundholder *                         3                  3
  1st wave entry                          4                  3
  2nd wave entry                          2                  3

When Taves and then, the following year, Pocock and Simon published their early papers on this topic, they explained how simple the method is to use and, in particular, how, for a single institution, it is quite possible to 'minimise' on several factors with a simple card index system. In a MULTICENTRE TRIAL this would effectively be 'minimisation, stratified by centre'. With modern telephone and computer systems used for central randomisation it becomes even easier to use minimisation across centres (possibly using 'centre' as one of the minimisation factors).

It was mentioned earlier how an 'optimal' allocation could easily be determined in the case of a single factor but that would only be optimal in the sense of minimising the imbalance. Maintaining BLINDING is also important and most minimisation algorithms, particularly those run on computers, incorporate an element of randomisation within them so that, even with complete knowledge of all the patients in the study so far, it is not possible to guarantee correctly guessing the next patient assignment.

Minimisation is not without controversy. Including such a random component satisfies most critics, but some (such as Rosenberger and Lachin, 2002) still consider that all the theoretical aspects of how the analysis should be done have not been fully worked out. SD

Day, S. 1999: Treatment allocation by the method of minimisation. British Medical Journal 319, 947-8.
Pocock, S. J. 1983: Clinical trials: a practical approach. Chichester: John Wiley & Sons, Ltd.
Pocock, S. J. and Simon, R. 1975: Sequential treatment assignment with balancing for prognostic factors in the controlled clinical trial. Biometrics 31, 103-15.
Rosenberger, W. F. and Lachin, J. M. 2002: Randomization in clinical trials. New York: John Wiley & Sons, Inc.
Steptoe, A., Doherty, S., Rink, E., Kerry, S., Kendrick, T. and Hilton, S. 1999: Behavioural counselling in general practice for the promotion of healthy behaviour among adults at increased risk of coronary heart disease: randomised trial. British Medical Journal 319, 943-7.
Taves, D. R. 1974: Minimization: a new method of assigning patients to treatment and control groups. Clinical Pharmacology and Therapeutics 15, 443-53.

missing at random (MAR)

See DROPOUTS, MISSING DATA

missing completely at random (MCAR)

See DROPOUTS, MISSING DATA

missing data

Well-designed statistical studies draw

a representative sample from the study population by following a sampling plan and a detailed protocol. Often, some of the planned data are unavailable or otherwise absent from the database, hence the term 'missing data'. The data that would be observed if all intended measurements were obtained will be called 'potential data'. The potential data that are not missing, combined with an indicator of availability of each planned response, form the 'observed data'. The name 'missing data' may suggest that these data can simply be forgotten by the data analyst, but nothing could be further from the truth.

Missing data form one of the hardest challenges for data analysts. This is because the missed data can be intrinsically different from observed data in ways that are hard to predict, and thus leave a biased sample. For instance, when studying the evolution of CD4 counts over time, AIDS patients may fail to return for planned clinic visits, not only when they are sick as a result of low CD4 counts but also when they feel good and no longer in need of treatment. In view of this, three types of missing data are typically distinguished (Little and Rubin, 2002):

(1) The simplest situation occurs when the risk of missing a certain part of the data is the same for all subjects, regardless of their potential data values. This process is known as 'missing completely at random' (MCAR). It happens, for instance, in a study where very expensive

outcome measures are, by design, only gathered in a random subsample. In that case, missing observations can simply be deleted from the dataset and ignored in further analyses.

(2) A more realistic situation occurs when the risk of missing a certain part of the data is constant over the potential outcomes for that part among subjects for whom we observe the same outcomes on a well-chosen subset of variables. This condition is easily interpreted when dealing with baseline covariates that are always observed and a single outcome that can be missing. It becomes complex with nonmonotone missingness patterns. In either case, the data are then called 'missing at random' (MAR). This happens, for instance, in two-stage sampling designs where a subgroup of patients is invited to the second study cycle, depending on their first outcome. A naive data analysis, which ignores missing data, may then be misleading, unless it conditions on the correct subset of observed data.

(3) When neither of these two constraints holds, missingness is called informative or nonignorable (NI). In a health survey, for instance, one may lose the uninterested or very busy respondents, who have their own disease profile.

Under each of the above scenarios the popular MAXIMUM LIKELIHOOD ESTIMATION method can be fruitfully employed for unbiased estimation of parameters in the study population. The challenge is then to propose a (parsimonious) model relating the distribution of observed and potential data. Typically, one chooses either a so-called 'selection model' or a 'pattern-mixture' model (Little and Rubin, 2002). The former adds to the usual model for the distribution of the potential data a model for the conditional distribution of being observed, given the potential data. The latter models the conditional distribution of the potential data for each level of the response indicator and adds to it a model for the distribution of the response indicator.
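Written out, the two factorisations of the joint distribution look as follows. The notation is an assumption made for this sketch and does not appear in the entry: Y denotes the potential data, R the response indicator, and the Greek letters are generic parameter vectors.

```latex
% Y: potential data, R: response indicator (notation assumed for illustration)
\begin{align*}
\text{selection model:} \quad
  & f(y, r \mid \theta, \psi) = f(y \mid \theta)\, f(r \mid y, \psi) \\
\text{pattern-mixture model:} \quad
  & f(y, r \mid \phi, \pi) = f(y \mid r, \phi)\, f(r \mid \pi)
\end{align*}
```

Both describe the same joint distribution; they differ in which conditional distribution the analyst chooses to model directly.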
In both cases one averages over all possible values of the missing data to find the observed data distribution that enters the maximum likelihood procedure. A very useful property of MAR is that maximum likelihood estimation can avoid the need to model the probability of being observed and still allow for inference on the potential data. This makes maximum likelihood estimation very popular in this setting. Nonetheless, observed data likelihoods under MAR may have complex forms not covered by standard statistical packages. To help avoid lengthy computations in routine practice, the EM ALGORITHM and imputation techniques (see MULTIPLE IMPUTATION) have been devised. EM is an iterative algorithm that replaces the usual log-likelihood of the potential data by its conditional expectation, given the observed data. Maximum likelihood estimates are then obtained by maximizing it in the usual way and the expected log-likelihood is updated. Imputation methods 'fill in' the missing

data by simulatinl from their dislribution condilional on the available daIa. 1bc rcsullinl °completcd~ dalasel is Ihc:n anaIy&c:d using standard softWWR as if no data wen: missiJl&. The lossofinformalion due to the missinl data must. however. be n:copiscd when STANIWtD BlIORS ~ derh'Cd. Carn:ctcd SIaIIcIard cmn have thc:n:ron: beeD pmposed based on the wrialion in eSlimates over difrc:n:nt random imputatiaas (Lillie and Rubin. 2002). One dnawback of the maximum IikcJihoad appmach is that estimates can be biased when the potential data model is misspc:c:ificd. One may ~fOlC choose to specify less fcalUn:S of the model and follow abe Horvitz-Thompson principle, which helps achicve robustness Blainst model misspc:ciftcatioa (Pn:isser. Lohman and Rathouz, 20(2). Ht:n:. the completely observed data an: upweilhtcd by Ihc inverse condilional probability of being observed.li~n the potenlial data. to compcasatc for similar COUDIcIpaI1s that are missiRl. This line of resean:h has seen extensive developments in mx:nt years and is enterinl stalistical praclic:c as softWIR becames IIIIIR n:adily available. Rcprdlcss of Ihc acIoplcd approach. observed data alone seldom contain infonnalion that distinguishes MAR from NI. One is slalislically spealc.iq blind and must rely on luidancc from olhcr sourccs tonaakc progras. nis is made abundantly clc:arby Ihc pattern mixture approach. Indeed. the paIlcrn with unobservedraponsccomplctcly laclc.s infonnalion on the dala distribution. and unbiBllCd inf'en:ac:c needs unverifiable . . sumplions rqanling the dependence of missinlDCSs on Ihc potential data. Reassurance that ·misscd· dalaan:COIIIparable to observed ones is found when data ~ 'missing by desilD~ • bul is hard to obtain otherwise. It is heDClC very impanant at the desiln SlDlC 10 plan 10 pther such information thai helps detennine thediltribulion oflhc mis&c:d data.lncxperimcalal Sludics one seclc.s to plhcrdaaa over timc that can help pn:cIicL Furthcrmon:. 
a sensitivity analysis can be conducted by examining how estimates vary as different choices are postulated for the unknown outcome distribution in nonresponders. This practice of describing how conclusions vary over plausible but untestable assumptions is recommended (Kenward, Goetghebeur and Molenberghs, 2001; Scharfstein, Daniels and Robins, 2003). While enormous progress has been made in statistical methodology for dealing with missing data, many problems remain in practice and in theory. The term 'missing data' is often misunderstood or the methods are abused. When answers to certain questions are intrinsically meaningless or undefined in certain categories of people (e.g. blood pressure of dead patients), it is very hard to justify missing data constructions that give nonresponders the outcome distribution of responders and base conclusions on outcomes averaged over both groups. Furthermore, the MAR assumption is frequently adopted for mathematical convenience but may be difficult to interpret or justify. The goal and relevance

of any analysis, along with justified assumptions, must come first. In causal inference, for instance, one has fruitfully exploited missing data constructs (Vansteelandt and Goetghebeur, 2005). At other times one ignores valuable and simple MAR methods to fall for simplistic analyses that can be very misleading. Due caution is always necessary, additional thought is required in model selection and, with regard to missing data in general, the familiar adage holds: prevention is better than cure. EG/SV [See also DROPOUTS]
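The inverse probability weighting idea described in this entry can be sketched on a toy example. The sketch below assumes, purely for illustration, that missingness depends only on a fully observed grouping variable, so that the probability of being observed can be estimated by the observed fraction within each group; the function names and the data are hypothetical and are not taken from the cited references.

```python
def complete_case_mean(records):
    """Mean of the outcomes that were actually observed (ignores missingness)."""
    observed = [y for _, y in records if y is not None]
    return sum(observed) / len(observed)


def ipw_mean(records):
    """Inverse-probability-weighted mean of the outcome.

    records: list of (group, outcome) pairs; outcome is None when missing.
    Assumption: within a group, every subject is equally likely to be observed.
    """
    totals, seen = {}, {}
    for group, y in records:
        totals[group] = totals.get(group, 0) + 1
        if y is not None:
            seen[group] = seen.get(group, 0) + 1
    # Estimated probability of being observed, per group.
    p_obs = {g: seen.get(g, 0) / n for g, n in totals.items()}

    weighted_sum = weight_total = 0.0
    for group, y in records:
        if y is not None:
            w = 1.0 / p_obs[group]  # upweight subjects who stand in for missing ones
            weighted_sum += w * y
            weight_total += w
    return weighted_sum / weight_total


# Hypothetical data: group A fully observed, half of group B missing.
records = [("A", 10.0)] * 4 + [("B", 20.0)] * 2 + [("B", None)] * 2

cc = complete_case_mean(records)
ipw = ipw_mean(records)
```

In this constructed example the complete-case mean is pulled towards group A, whereas the weighted mean recovers the value that would have been obtained had all eight outcomes been seen. With real data the observation probabilities must themselves be modelled, and the weights are only as good as that model.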

Kenward, M. G., Goetghebeur, E. and Molenberghs, G. 2001: Sensitivity analysis for incomplete categorical data. Statistical Modelling 1, 31-48. Little, R. J. A. and Rubin, D. B. 2002: Statistical analysis with missing data. New York: John Wiley & Sons, Inc. Preisser, J. S., Lohman, K. K. and Rathouz, P. J. 2002: Performance of weighted estimating equations for longitudinal binary data with drop-outs missing at random. Statistics in Medicine 21, 3035-54. Scharfstein, D. O., Daniels, M. J. and Robins, J. M. 2003: Incorporating prior beliefs about selection bias into the analysis of randomized trials with missing outcomes. Biostatistics 4, 495-512. Vansteelandt, S. and Goetghebeur, E. 2005: Sense and sensitivity when correcting for observed exposures in randomized clinical trials. Statistics in Medicine 24, 191-210.

MLwiN

See MULTILEVEL MODELS

mode The mode is a measure of location. It is simply the value that occurs most often. For example, the hair colour of 573 babies born at a UK maternity hospital is shown in the table. In this example, the mode is 'medium brown'.

Frequency distribution for hair colour

Hair colour          Frequency
Blond                       41
Pale brown/blond           147
Medium brown               244
Dark brown                 121
Black                       14
Red                          6

The mode, however, is of limited value in summarising continuous data. In contrast to the MEAN, the mode is not sensitive to OUTLIERS but need not be a unique value, as a distribution of data may be bimodal or multimodal (having two, or more than two, modes respectively). SRC
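As a small illustration, the mode can be read off a frequency table by picking the category with the largest count. The sketch below uses the hair colour frequencies from the table (category labels as read from it); the helper name `modes` is hypothetical, and it returns a list because the mode need not be unique.

```python
# Frequencies from the hair colour table (573 babies in total).
hair_colour = {
    "Blond": 41,
    "Pale brown/blond": 147,
    "Medium brown": 244,
    "Dark brown": 121,
    "Black": 14,
    "Red": 6,
}


def modes(frequencies):
    """Return every category that attains the highest frequency."""
    highest = max(frequencies.values())
    return [category for category, count in frequencies.items() if count == highest]


total = sum(hair_colour.values())
hair_modes = modes(hair_colour)

# A bimodal example: two categories tie for the highest frequency.
tied_modes = modes({"a": 2, "b": 2, "c": 1})
```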

Mplus

See STRUCTURAL EQUATION MODELLING

multicentre research ethics committee (MREC) See ETHICAL REVIEW COMMITTEES

multicentre trials These are studies that are carried out in several distinct centres, sites or units (hospitals, clinical departments, etc.). The first trials of new therapies in man (Phase I trials) require few subjects who must be monitored very tightly; therefore they are almost always carried out in a single centre. These trials are typically followed by medium-sized multicentre efficacy trials (Phase II trials). If the results of these early trials are promising, then larger multicentre trials are carried out to confirm the efficacy and safety of the new therapies (Phase III trials). Multicentre trials are often performed in several countries or even several continents. The conduct of multicentre trials can be overseen by a steering committee, headed by the study chair and consisting of persons designated to represent study centres, disciplines or activities. Multicentre trials are needed primarily to accrue the number of subjects or patients required for the study over a reasonably short time period. For common diseases, multicentre trials may have several centres with large numbers of subjects per centre or, in the case of rare diseases, they may have a large number of centres with very few subjects per centre. Patients treated at different centres (let alone different countries or continents) may be expected to differ substantially in terms of their ethnicity, exposure to aetiologic or risk factors, living conditions and access to health resources, etc. Such heterogeneity may be a drawback for two main reasons: first, the VARIANCE of the outcome of interest is increased because of the heterogeneity and, second, a treatment benefit in some patient subpopulations might be missed in a trial that accrued other patient subpopulations as well. On closer inspection, neither of these two reasons argues against multicentre trials. The increase in sample size that results from heterogeneity is usually negligible compared to the potential number of patients available at multiple centres.
In many situations, no centre would be able to accrue the required number of patients; hence a single-centre trial would be infeasible. In addition, the results of a multicentre trial are more readily generalisable than those of a single-centre trial, because they are obtained in a patient sample more likely to reflect the population of interest. If a treatment is thought a priori to exert its benefit selectively in a patient subpopulation, a multicentre trial is still indicated with prior exclusion of patients unlikely to benefit. An added advantage of multicentre trials is their ROBUSTNESS to fraud, delinquencies and other quality problems that may arise at a few centres. This was illustrated by a series of large multicentre trials in early breast cancer, in which exclusion of a fraudulent centre did not have any sizeable impact on the trial results (Peto et al., 1997). In contrast, another highly publicised case of fraud in a trial of high-dose chemotherapy for advanced breast cancer had disastrous consequences, because its results were completely dominated by fraudulent data. Very few centres had taken part in this trial and fraud from a single investigator was sufficient to cause dramatic BIAS in the reported results (Weiss et al.,

2000). Published results of well-conducted multicentre trials often have a direct impact on clinical practice, while single-centre trials generally need to be reproduced on a larger scale before their results are accepted as valid. The CLINICAL TRIAL PROTOCOL should encourage the participating centres to put the same procedures in place as regards patient management, measurement of treatment effects and other aspects of the study that may have a bearing on the therapeutic results. Some heterogeneity is unavoidable in multicentre trials, but such heterogeneity is unimportant so long as it does not directly affect the outcome of interest. In a randomised trial, in particular, differences between centres are ignorable if they do not impact the treatment effects of interest. Thus if one centre tended to recruit patients of poor prognosis and another centre tended to recruit patients of good prognosis, this difference would not compromise the trial results if the treatment effects were independent of prognosis. Such independence is generally unknown but postulated before the trial and it can, in fact, be studied within the trial itself. In the example just given, differences in prognostic mix between centres might be confounded by other centre-specific factors (such as concomitant medications, supportive care, etc.), and hence it is advisable to standardise trial procedures across all participating centres to the extent possible. The logistics of a multicentre trial can be fairly complex and provisions must be made for drug shipment and storage, trial material distribution, ethical approval and compliance with local regulations, and training of investigators and local staff to avoid variation in patient management, evaluation criteria, follow-up schemes, etc. All these issues can be discussed at investigator meetings and monitoring visits during the trial.
Statistical quality control checks can also be performed to identify discrepancies between centres that may call for more thorough investigation, especially in centres that are found to be clear OUTLIERS. Such checks are best performed while the trial is ongoing, so that remedial action can be taken early, and this can be facilitated by electronic data capture that feeds patient data to a central database in real time. There is no conclusive evidence that data quality is related to the number of patients enrolled in each centre (Sylvester et al., 1981; Hawkins et al., 1990). In multicentre trials, RANDOMISATION of the patients is generally organised centrally, rather than performed in each centre. Centralised randomisation requires that all centres access a central resource, usually by internet, telephone or fax, to obtain the next treatment allocation. Such centralised control is useful to follow the accrual of patients into the trial and to check eligibility criteria in a uniform way prior to treatment allocation. Centralisation of the randomisation process also guarantees that it cannot be biased by foreknowledge of the next treatment assignment, which could happen in open-label trials with the use of randomisation


lists. It is usually desirable to stratify the allocation by centre and more generally by important prognostic factors measured at baseline, such as severity of disease, patient status, age, etc. Such stratification can be implemented with permuted lists of treatment allocations or through dynamic allocation using MINIMISATION. Allocation through minimisation has the advantages of being able to take account of many prognostic factors and of being completely unpredictable at any given centre in the absence of information on patients already randomised in all other centres. Multicentre trials are usually designed and their sample size calculated under the assumption that the treatment effect is the same in all centres. Whether this assumption is supported by the data can be tested formally. Assume that in the $i$th of $I$ centres, the true treatment effect is given by $\tau_i$ and the estimate of $\tau_i$, noted $\hat{\tau}_i$, is asymptotically normally distributed with variance $V_i$:

$$\hat{\tau}_i \sim N(\tau_i, V_i)$$

The measure of treatment effect is taken such that no treatment effect corresponds to $\tau_i = 0$. Interest focuses first and foremost on whether there is statistical evidence of an overall treatment effect, which can be tested through the test statistic:

$$X_T^2 = \frac{\left(\sum_{i=1}^{I} w_i \hat{\tau}_i\right)^2}{\sum_{i=1}^{I} w_i}$$

where $w_i = V_i^{-1}$ denotes the inverse of the variance of the treatment effect in the $i$th centre. Under the null hypothesis of no treatment effect ($\tau_1 = \tau_2 = \ldots = \tau_I = 0$), this test statistic has an asymptotic $\chi^2$ distribution (see CHI-SQUARE DISTRIBUTION) with one DEGREE OF FREEDOM. Under the assumption of a common treatment effect in all centres ($\tau_1 = \tau_2 = \ldots = \tau_I = \tau$), the treatment effect is estimated by:

$$\hat{\tau} = \frac{\sum_{i=1}^{I} w_i \hat{\tau}_i}{\sum_{i=1}^{I} w_i}$$

In other words, this is a weighted average of the treatment effects in all centres. The presence of heterogeneity between centres can be tested through the test statistic:

$$X_H^2 = \sum_{i=1}^{I} w_i \left(\hat{\tau}_i - \hat{\tau}\right)^2$$

which has an asymptotic $\chi^2$ distribution with $I - 1$ degrees of freedom. In practice, this test for heterogeneity in treatment effects between centres is not very informative, because it lacks POWER to detect true underlying differences, especially when there are many centres ($I$ large) with few patients per centre.
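The three quantities just described (the overall test statistic, the inverse-variance weighted estimate and the heterogeneity statistic) are simple to compute once centre-level estimates and variances are available. The following is a minimal sketch under stated assumptions: the function name, the dictionary keys and the example numbers are hypothetical, and the p-value is given only for the one-degree-of-freedom overall test, using the identity between a chi-squared variable on one degree of freedom and a squared standard normal.

```python
import math


def centre_tests(estimates, variances):
    """Inverse-variance weighted summary of centre-specific treatment effects.

    estimates: estimated treatment effect in each centre
    variances: variance of each estimate
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    sum_w_est = sum(w * t for w, t in zip(weights, estimates))

    overall_stat = sum_w_est ** 2 / sum_w          # 1 degree of freedom under the null
    pooled = sum_w_est / sum_w                     # weighted average effect
    het_stat = sum(w * (t - pooled) ** 2 for w, t in zip(weights, estimates))

    # P(chi-squared with 1 df > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
    overall_p = math.erfc(math.sqrt(overall_stat / 2.0))

    return {
        "overall_stat": overall_stat,
        "overall_p": overall_p,
        "pooled": pooled,
        "het_stat": het_stat,
        "het_df": len(estimates) - 1,
    }


# Hypothetical estimates from three centres with equal variances.
result = centre_tests([0.5, 0.3, 0.7], [0.04, 0.04, 0.04])
```

With these made-up inputs the pooled estimate is 0.5, the overall statistic is 18.75 and the heterogeneity statistic is 2.0 on two degrees of freedom. A p-value for the heterogeneity statistic would need a general chi-squared tail function, which is deliberately left out of this sketch.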


Moreover, when statistical heterogeneity is found between centres, it may be difficult to ascribe it to a well-identified factor and the interpretation of the overall treatment effect may be controversial. The same test for heterogeneity is more useful when centres can be meaningfully combined according to a common characteristic (for instance all centres that have access to certain equipment or that use certain supportive treatments). When centres are thus combined for the purposes of statistical analysis, the grouping should be defined prospectively and blindly to treatment allocation and results in the various centres. A grouping of centres based solely on their sample sizes is unlikely to be informative. Even when the formal test of heterogeneity fails to reach statistical significance, heterogeneity can be explored through descriptive statistics or graphical displays of the treatment effects in individual centres or groups of centres. Marked differences in treatment effects between centres would cause concern, especially if much of the overall effect was attributable to an unexpectedly large effect in a single centre or if treatment had a markedly negative effect in some centres, an overall positive treatment effect notwithstanding. Whenever substantial heterogeneity is found, attempts should be made to find an explanation in terms of identifiable features of trial management or subject characteristics. Such an explanation may suggest further analyses or appropriate interpretation. In the absence of an explanation, alternative estimates of the treatment effect may be required in order to substantiate the robustness of the trial results (International Conference on Harmonisation, 1998).

Regardless of the presence of statistical heterogeneity between centres, the statistical model adopted for the estimation and testing of treatment effects may account for centre through stratification or by inclusion of a fixed or random effect for centre in the model. If the number of subjects per centre is limited, centre effects are poorly estimated and the inclusion of centre effects in the model negatively affects the power of the treatment comparisons. In such cases, it is preferable to ignore the centre in the analysis. MB

Hawkins, B. S., Prior, M. J., Fisher, M. R. and Blackhurst, D. W. 1990: Relationship between rate of patient enrolment and quality of clinical center performance in two multicenter trials in ophthalmology. Controlled Clinical Trials 11, 374-94. International Conference on Harmonisation 1998: E-9 document, guidance on statistical principles for clinical trials. Federal Register 63, 179, 49583-98. Peto, R., Collins, R., Sackett, D., Darbyshire, J., Babiker, A., Buyse, M., Stewart, H., Baum, M., Goldhirsch, A., Bonadonna, G., Valagussa, P., Rutqvist, L., Elbourne, D., Altman, D., Dalesio, O., Parmar, M., Hill, C., Clarke, M., Gray, R. and Doll, R. 1997: The trials of Dr Bernard Fisher: a European perspective on an American episode. Controlled Clinical Trials 18, 1-13. Sylvester, R., Pinedo, H., De Pauw, M., Staquet, M., Buyse, M., Renard, J. and Bonadonna, G. 1981: Quality of institutional participation in multicenter clinical trials. New England Journal of Medicine 305, 852-5. Weiss, R. B., Rifkin, R. M., Stewart, F. M., Theriault, R. L., Williams, L. A., Herman, A. A. and Beveridge, R. A. 2000: High-dose chemotherapy for high-risk primary breast cancer: an on-site review of the Bezwoda study. The Lancet 355, 999-1003.

multicollinearity This term is used particularly in MULTIPLE LINEAR REGRESSION to indicate situations where the EXPLANATORY VARIABLES are linearly related, thus making the estimation of regression coefficients in the usual way essentially impossible. Including the sum or average of the explanatory variables as a variable would lead to this problem. For example, in a blood pressure study one cannot include among explanatory variables systolic blood pressure (SBP), diastolic blood pressure (DBP) and, additionally, a linear combination of the two, such as mean blood pressure, without causing the model to break down completely. Another example is using too many dummy variables to code a categorical explanatory variable. In practice, of course, approximate multicollinearity, i.e. where one of the explanatory variables can be predicted with considerable accuracy from the other explanatory variables, will be of more cause for concern and can lead to inflated variances for the estimated regression coefficients. Some evidence for approximate multicollinearity can be found by looking at the multiple correlation coefficients (see CORRELATION) of each explanatory variable with the other explanatory variables; if any of these is close to one then multicollinearity should be suspected. There is no optimal way of dealing with multicollinearity but in many cases the simplest solution is to remove explanatory variables that are highly correlated or combine variables in some way. More details are given in Miles and Shevlin (2001). BSE

Miles, J. and Shevlin, M. 2001: Applying regression and correlation. London: Sage.
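The diagnostic described in this entry, regressing each explanatory variable on the others and inspecting the squared multiple correlation, can be sketched without any statistical library. The code below is an illustration only: the blood pressure values are invented, mean blood pressure is taken as (SBP + 2 x DBP)/3 purely as an example of an exact linear combination, and the helper names are hypothetical.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def r_squared(y, predictors):
    """Squared multiple correlation of y with the predictors (intercept included)."""
    n = len(y)
    design = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    p = len(design[0])
    # Normal equations (X'X) beta = X'y for ordinary least squares.
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)] for a in range(p)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Invented blood pressure readings for six subjects.
sbp = [120.0, 135.0, 150.0, 110.0, 142.0, 128.0]
dbp = [80.0, 85.0, 95.0, 70.0, 88.0, 79.0]
mean_bp = [(s + 2.0 * d) / 3.0 for s, d in zip(sbp, dbp)]

r2_exact = r_squared(mean_bp, [sbp, dbp])   # exact linear combination of the other two
r2_approx = r_squared(sbp, [dbp])           # correlated, but not perfectly
```

Here `r2_exact` is one up to rounding error, signalling that mean blood pressure adds nothing beyond SBP and DBP, whereas `r2_approx` is high but below one, the situation described above as approximate multicollinearity.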

multidimensional scaling This technique is often used in psychology but less often in medicine. The basis of the method is a proximity matrix arising either directly from experiments in which subjects are asked to assess the similarity of pairs of stimuli or, indirectly, as a measure of the CORRELATION or COVARIANCE of a pair of stimuli derived from a number of measurements made on each. In some cases, high proximity values correspond to stimuli that are similar (similarities); in others, the reverse is the case (dissimilarities). As an example, the table shows judgements about various brands of cola made by a subject using a visual analogue scale with the anchor points 'same' (having a score of 0) and 'different' (having a score of 100). In this example, the resulting rating for a pair of colas is a dissimilarity, with low values indicating that the two colas are regarded as alike and vice versa. A similarity measure would have been obtained had the anchor points been reversed, although similarities are usually scaled to lie between zero and one.
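To make the link between the two kinds of proximity concrete, a rating on the 0 to 100 'same' to 'different' scale can be turned into a similarity between zero and one. The linear rescaling below is only one possible convention, assumed here for illustration, and the function name is hypothetical.

```python
def to_similarity(dissimilarity, scale_max=100.0):
    """Map a dissimilarity rating on [0, scale_max] to a similarity on [0, 1].

    Assumes a simple linear reversal of the scale: 0 ('same') becomes 1
    and scale_max ('different') becomes 0.
    """
    if not 0.0 <= dissimilarity <= scale_max:
        raise ValueError("rating outside the scale")
    return 1.0 - dissimilarity / scale_max


# A pair of colas rated 16 (fairly alike) and another rated 99 (very different).
alike = to_similarity(16)
different = to_similarity(99)
```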

Dissimilarity data for pairs of 10 colas for a subject (Subject 1)

Cola    1    2    3    4    5    6    7    8    9   10
 1      0
 2     16    0
 3     81   47    0
 4     56   32   71    0
 5     87   68   44   71    0
 6     60   35   21   98   34    0
 7     84   94   98   57   99   99    0
 8     50   87   79   73   19   92   45    0
 9     99   25   53   98   52   17   99   84    0
10     16   92   90   83   79   44   24   18   98    0

Researchers with data in the form of proximity matrices are generally interested in uncovering any structure or pattern they may contain and multidimensional scaling aims to help by representing the observed proximities as a spatial or geometrical model in which the distances between the points (usually taken to be Euclidean) correspond in some way to the observed proximities. In general, this simply means that the larger the dissimilarity (or the smaller the similarity), the further apart should be the points representing them in the final geometrical model. The required spatial model is defined by a set of $d$-dimensional points, each representing one of the stimuli of interest, and a measure of the distance between these points. The objective of multidimensional scaling is to determine both the dimensionality of the model (i.e. the value of $d$) and the values of the coordinates. The coordinates of the points in the model that represent the proximities can be found in a variety of ways. One simple approach is to choose the coordinate values (for a given value of $d$) to minimise $S$, defined as:

$$S = \sum_{i<j} \left(\delta_{ij} - d_{ij}\right)^2$$

where $\delta_{ij}$ is the observed dissimilarity for stimuli $i$ and $j$, and $d_{ij}$ is the distance between the points representing stimuli $i$ and $j$. Since the distances $d_{ij}$ are a function of the coordinate values, so also is $S$. For various reasons, $S$ is not generally a suitable function for comparing distances and dissimilarities and full details of more suitable criteria can be found in, for example, Everitt and Rabe-Hesketh (1997). This also includes a discussion of how many dimensions are needed to give an adequate fit of the geometrical model to the observed proximities. An illustration of how multidimensional scaling has been used in a medical setting is provided in Cliff et al. (1995). Here a matrix of similarities is calculated in which each element is the number of months in which there are reported measles cases/deaths in both of a pair of areas. The greater the value of such a similarity, the greater the similarity of the time series of the occurrence of measles in the two areas. (In this study, the time series for each area considered consisted of monthly totals of measles cases for the 31-year period from January 1960 to December 1990.) The accompanying figure shows the multidimensional scaling solutions corresponding to a one-, two- and three-dimensional solution. BSE

Cliff, A. D., Haggett, P., Smallman-Raynor, M. R., Stroup, D. F. and Williamson, G. D. 1995: The application of multidimensional scaling methods to epidemiological data. Statistical Methods in Medical Research 4, 102-23. Everitt, B. S. and Rabe-Hesketh, S. 1997: The analysis of proximity data. London: Arnold.

multilevel models Multilevel (also known as random

effect, hierarchical and mixed) models are an extensive and flexible class of models for correlated data, which arise widely in medical statistics. For example, adult height or weight may be correlated with those of other family members and the chance of post-surgical infection may be correlated with that of other patients with the same surgical team. Further, many studies involve the repeated measurement of subjects' outcomes throughout follow-up (LONGITUDINAL DATA). Such observations are usually quite strongly correlated. Multilevel models relax the assumption, required for ordinary least squares (OLS) regression, that each response is independent. They have their roots in agricultural experiments; indeed they embrace all the classical analysis of VARIANCE models. They have found ready application in social science, medical and economic research. A brief history is given by Kreft and de Leeuw (1998, p. 16). The data structure is viewed as a series of levels (or hierarchies). For example, consider a multicentre trial where subjects' quantitative outcomes are recorded repeatedly over time. The first figure shows a possible data structure. Level 1 has the repeated observations that are nested within subjects at level 2. Subjects are in turn nested within centres at level 3. A multilevel analysis enables us to allow correctly for, and model, the CORRELATION induced by this structure. If we have longitudinal data, we can investigate how subjects change with time, which could be quite different to the cross-sectional relationship (Diggle et al., 2002, p. 16). In addition to the usual fixed parameters (whose interpretation is similar to their OLS counterparts), RANDOM EFFECTS are introduced to model the correlation structure, as described below. The mix of fixed and random effects gives rise to the term mixed models. Once alerted, we see hierarchical structures everywhere: subjects within wards within counties, patients within hospitals within health authorities, and so on. Thus it is natural to ask what is gained by a multilevel model, and when they are unnecessary.

[Figure: multidimensional scaling. MDS plots of USA regions in one-, two- and three-dimensional space. Data are from monthly time series of reported measles cases, 1960-1990. Taken from Cliff et al., 1995.]

[Figure: multilevel models. Diagram of a three-level data structure (level 3, level 2, level 1).]

First, OLS STANDARD ERRORS are wrong when the data are multilevel. For example, subjects within a cluster are similar to each other, i.e. not independent. They therefore convey less information about the value of a parameter than an independent (unclustered) sample of the same size (Goldstein, 2003, p. 23).

Second, OLS does not permit exploration of the variance structure. For example, we may wish to estimate the proportion of the total variance between subjects (the INTRACLUSTER CORRELATION coefficient (ICC), equation (1) below) or we may want to investigate how the variance changes as a function of covariates. Multilevel models will add little to an OLS analysis when data are effectively independent, so that the ICC is close to zero. However, it is wise to be cautious, as even a small ICC can have a nontrivial effect.
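One way to see why even a small ICC matters is the familiar design effect for the variance of a mean under equal cluster sizes, 1 + (m - 1) x ICC, where m is the number of observations per cluster. This formula is a standard result brought in here for illustration rather than something stated in the entry, and the function names and numbers below are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Variance inflation for a mean when observations come in equal-sized clusters."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Size of an independent sample carrying the same information about the mean."""
    return n_total / design_effect(cluster_size, icc)


# 420 observations in clusters of 21 with a seemingly small ICC of 0.05.
deff = design_effect(21, 0.05)
n_eff = effective_sample_size(420, 21, 0.05)
```

In this made-up case the design effect is 2, so the 420 clustered observations are worth roughly 210 independent ones, and a naive OLS standard error for the mean would be too small by a factor of about 1.4.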

The plan of this article is as follows. First, the key ideas of multilevel models are outlined, followed by a discussion of commonly used algorithms for fitting multilevel models. Then extensions to discrete data are described and the relationship to GENERALISED ESTIMATING EQUATIONS (GEEs) is given. Medical applications and further extensions are discussed and then missing data, design and software. Some suggestions for further reading are given at the end of the entry.

Consider the multicentre trial of the first figure. Focusing on levels 1 and 2, we begin by describing the simplest model, which allows for correlation between the observations, before outlining how more flexible models can be built up. The idea is to generalise OLS regression. An OLS model would have a single regression line relating the average response to time. A multilevel model, however, can be thought of as extending this to include a regression line for each subject. Thus, whereas in OLS regression the observations are distributed about a single regression line, in multilevel models we can view each subject's responses as distributed about their subject-specific regression line. The subject-specific regression lines are then distributed about the overall average regression line. This is illustrated in the second figure. Here, the overall average relationship between the response and time is given by the bold line, $Y = \alpha + \beta t$. Five subject-specific regression lines are shown, which are parallel to this. Each subject's observations are distributed about their regression line. Five examples of this are given in the top half of the figure. In this simple case, each subject's regression line is parallel to the overall average line. The distance between the $j$th subject-specific line, $Y = (\alpha + u_j) + \beta t$, and the average line $Y = \alpha + \beta t$ is $u_j$ (in the second figure, $j$ = 1, 2, 3, 4 or 5). These $u_j$ are known as the subject-specific random effects, also known as the level 2 residuals. They are assumed to be normally distributed about zero. The vertical distances between each subject's responses and their subject-specific regression line are known as the level 1 residuals. These are analogous to the residuals in OLS models and are likewise assumed to be normally distributed about zero. The normal densities of the level 2 and some level 1 residuals are shown in the second figure. In the top half, we see five observations, marked '+'. The vertical distances to their subject-specific regression line are five level 1 residuals. The NORMAL DISTRIBUTION of these residuals is illustrated by the five normal densities about the subject-specific regression lines. Then, on the left-hand side, the level 2 residuals, $u_1, \ldots, u_5$, are shown. Their normal distribution is sketched on the left-hand side of the figure.

The parameters $\alpha$ and $\beta$ in the figure are known as fixed parameters. They have a similar interpretation to their counterparts in the OLS models, so $\alpha$ is the average response at time zero and $\beta$ is the average change in response per unit change in time. However, we have two new parameters, known as random parameters, which are the variance of the level 2 residuals, called $\sigma_u^2$, and the variance of the level 1 residuals, called $\sigma_e^2$. Thus, in the second figure, the density of the $u_j$ is sketched on the left and has variance $\sigma_u^2$. The five densities of the level 1 residuals have common variance $\sigma_e^2$. Often, $\sigma_u^2$ is called the between-subject variance and $\sigma_e^2$ the within-subject variance. The second figure represents the simplest multilevel model. As each $u_j$ can be viewed as a random contribution to the intercept of the $j$th person's regression line, which is $(\alpha + u_j)$, this is often known as a RANDOM INTERCEPT MODEL. It is also a simple example of a COMPONENTS OF VARIANCE model as there is a single variance term corresponding to each level in the model ($\sigma_u^2$ for level 2, $\sigma_e^2$ for level 1).

[Figure: multilevel models. Schematic illustration of the random intercept model (response against time).]

The motivation for multilevel models was their ability to model the correlation structure of the data. We therefore consider the correlation structure implied by the random intercept model. To do this, we have to consider the variance of each observation and the COVARIANCE between observations. First, consider the variance. In multilevel models, the random component is the RESIDUALS. Residuals from different levels are always assumed to be independent. Likewise, residuals corresponding to different units within a level (i.e. different observations within level 1 and different subjects within level 2) are assumed to be independent. The total variance of each observation is thus the sum of the variance of the residuals at each level. Thus, in the random intercept model, where each observation has a residual at level 1 and level 2, the variance of an observation is $\sigma_u^2 + \sigma_e^2$. Second, consider the covariance. In the random intercept model of the second figure, different observations from the same subject, $j$, share a common random component, their level 2 residual $u_j$. Their covariance is therefore $\mathrm{Cov}(u_j, u_j) = E(u_j u_j) = \sigma_u^2$. However, observations from different subjects share no common residuals. Their covariance is therefore zero. Recalling that $\mathrm{Cor}(A, B) = \mathrm{Cov}(A, B)/\sqrt{\mathrm{Var}(A)\,\mathrm{Var}(B)}$, we see that the correlation structure implied by the random intercept model is:

$$\rho = \begin{cases} 1 & \text{same subject, same time} \\ \dfrac{\sigma_u^2}{\sqrt{(\sigma_u^2 + \sigma_e^2)(\sigma_u^2 + \sigma_e^2)}} = \dfrac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2} & \text{same subject, different time} \\ 0 & \text{different subjects} \end{cases} \qquad (1)$$

Thus the random intercept model of the second figure implies a fixed correlation, $\rho$, among a subject's responses, independent of how far apart in time they are. This is known as a compound symmetry or exchangeable correlation structure. The correlation, $\rho$, in equation (1) is also known as the intra-level-2-unit correlation, or more commonly the ICC: in random intercept models, $\rho$ measures the proportion of total variance that is between subjects. If $\rho = 0$, then observations are independent. Consider how the random intercept model illustrated in the second figure compares to fitting an OLS line to each subject in turn. Such OLS lines would be unbiased estimates of each subject's true line. However, they might be imprecisely estimated, particularly if a subject has few observations. Conversely, the estimate of the overall average line ($\alpha + \beta t$) is a precise, but biased, estimate of each subject's true line. Both extremes are undesirable. By fitting a multilevel model, we compromise between the two extremes. The estimates of the $u_j$ are known as 'best linear unbiased predictors' (BLUPs)

and, as their name suggests, have certain optimality properties (Verbeke and Molenberghs, 2000, p. 80). The practical effect is that the subject-specific regression lines estimated by the multilevel model are drawn (or shrunk) closer to the mean line than the OLS estimates, and the fewer the observations on a subject, the more their line is drawn towards (borrows strength from) the MEAN regression line. This is often referred to as shrinkage in the literature. Having fitted the random intercept model, we should examine the level 1 and level 2 residuals to check whether they are approximately normal, as the multilevel model assumes, and identify OUTLIERS. Level 2 residuals can also be used to distinguish outlying subjects: this has found wide application in medical settings. For most longitudinal data, the correlation between observations declines as the time between them increases. Thus the fixed correlation structure of the random intercept model, $\rho$, is insufficient. A natural extension is to allow subjects to have their own slopes as well as their own intercepts, as illustrated in the third figure. As before, the overall average regression line is $Y = \alpha + \beta t$. Now, however, the $j$th subject's regression line is $Y = (\alpha + u_j) + (\beta + v_j)t$. In the random intercept model the $u_j$ were normally distributed with mean 0 and variance $\sigma_u^2$. In the RANDOM INTERCEPT AND SLOPE MODEL $(u_j, v_j)$ have a BIVARIATE NORMAL DISTRIBUTION about (0, 0).
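The shrinkage described above can be made concrete in the simplest setting. The sketch below assumes a random intercept model with no covariates and with the overall mean and both variance components treated as known; under those assumptions the predicted subject mean is the overall mean plus a fraction, between-subject variance / (between-subject variance + within-subject variance / n), of the subject's observed deviation. This is a standard textbook form used here for illustration, not a formula given in this entry, and all names and numbers are hypothetical.

```python
def shrinkage_factor(var_between, var_within, n_obs):
    """Weight given to a subject's own mean; approaches 1 as n_obs grows."""
    return var_between / (var_between + var_within / n_obs)


def shrunken_mean(subject_mean, overall_mean, var_between, var_within, n_obs):
    """Predicted subject-level mean, pulled towards the overall mean."""
    b = shrinkage_factor(var_between, var_within, n_obs)
    return overall_mean + b * (subject_mean - overall_mean)


# Hypothetical variance components: between-subject 4, within-subject 16.
# Two subjects share the same observed mean (60) but differ in how much data they have.
few_obs = shrunken_mean(60.0, 50.0, 4.0, 16.0, n_obs=1)
many_obs = shrunken_mean(60.0, 50.0, 4.0, 16.0, n_obs=16)
```

With a single observation the prediction moves only a fifth of the way from the overall mean of 50 towards the observed 60, whereas with sixteen observations it moves four-fifths of the way, which is the 'borrowing strength' behaviour described in the text. In practice the variance components are estimated, and software handles covariates and unequal designs.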

As before, the level 1 residuals are the vertical distances between a subject's observations and their subject-specific regression line. The level 2 residuals are now $(u_j, v_j)$, so we have two level 2 residuals per subject, representing the random intercept and slope respectively. We can calculate the variance and covariance of observations in a similar way to the random intercept model, although the algebra is more involved. Then we can derive the correlation structure implied by this model. The variance of the responses is no longer constrained by the model to be constant: it can now increase with time. Further, the correlation between observations on the same subject can decline as the time between them increases. Hence this model is often more appropriate for longitudinal data.

The way the random intercept and slope model builds on the random intercept model suggests many further extensions. To begin with, if we have additional covariates, they too can have random effects. For example, if we include a treatment variable, subject-specific treatment effects can be estimated. Levels can be added to the model to describe additional levels in the data. For example, the first figure shows that subjects are nested within centres. We can extend the random intercept model to include a random effect at the centre level. Such a model yields estimates of components of variance at each level (centre, subject and observation), so the proportion of the total variance between centres can be calculated. Further, the level 3 (centre) residuals can be examined to

[Figure: subject-specific regression lines $Y = (\alpha + u_j) + (\beta + v_j)t$ for several subjects, each with its own intercept and slope; vertical axis Y, horizontal axis Time]

multilevel models Schematic illustration of the random intercept and slope model

indicate outliers, and covariates can be given random centre-level terms as well as random subject-level terms. The level 1 variance (which is analogous to the residual variance in OLS models) can also be modelled by covariates: e.g. male level 1 residuals may be more variable than those from females. This is known as modelling complex variation.

Sometimes the random intercept and slope model is not sufficiently flexible to model the correlation structure, particularly if observations are close together in time. Many options are possible: if subjects are observed at identical times, then an attractive alternative is an unstructured COVARIANCE MATRIX, which imposes no parametric model on the covariance. Much has been written on this; see, for example, Verbeke and Molenberghs (2000, Chapter 16) and Diggle et al. (2002, Chapter 5).

Multilevel models for quantitative data are typically based on the multivariate normal distribution. Thus, the likelihood of the data can be written down and maximised using adaptations of Newton-Raphson techniques (for details, see Raudenbush and Bryk, 2002, Chapter 14). Alternatively, a Bayesian approach can be adopted (see the chapter by Clayton, D.G. in Gilks, Richardson and Spiegelhalter, 1996). If likelihood methods are adopted, restricted maximum likelihood (REML) is usually used (Verbeke and Molenberghs, 2000, p. 43). This corrects the downward bias of maximum likelihood estimates of variance and entails negligible extra work computationally. However, changes in REML log-likelihoods cannot generally be used to compare nested models, so maximum likelihood may be preferred for model building (Goldstein, 2003, p. 36), although in uncommon situations with many fixed parameters the two can give quite different answers (Verbeke and Molenberghs, 2000, p. 198).

GENERALISED LINEAR MODELS (GLMs) extend OLS models to discrete responses. Analogously, generalised linear mixed models (GLMMs), sometimes called nonlinear mixed models, extend multilevel models to discrete responses. As with GLMs, we model a function of the PROBABILITY that the response takes on a particular value. In GLMMs, however, for responses on the same subject, this probability shares a subject-specific term. For example, we can make the random intercept model, illustrated by the second figure, a GLMM by letting Y follow a binomial distribution and writing the overall regression line as $\mathrm{logit}(\Pr\{Y = y\}) = \alpha + \beta t$. The subject-specific regression line for subject $j$ would be

$$\mathrm{logit}(\Pr\{Y = y\}) = (\alpha + u_j) + \beta t \qquad (2)$$

and, as before, the level 2 residuals, $u_j$, would be normally distributed about zero with variance $\sigma_u^2$. Note that, as in GLMs, in GLMMs the level 1 variance is a fixed function of the mean. Therefore, there is no term corresponding to $\sigma_e^2$. Also, as with GLMs, the function 'logit' in equation (2) is known as the link function. Alternative link functions (e.g. log, inverse normal) can be used together with other probability models such as the Poisson or negative binomial.

Unfortunately, fitting GLMMs is not nearly as straightforward as fitting multilevel models to quantitative data, because the LIKELIHOOD is much more difficult to compute. Three approaches, all discussed by Goldstein (2003), are commonly adopted. The first approach is QUASI-LIKELIHOOD. There are two forms of this, penalised quasi-likelihood and marginalised quasi-likelihood. Both methods rely on approximations, which can be made to first or second order. The approximations involved mean that quasi-likelihood methods provide biased parameter estimates: in particular, estimates of variance components tend to be downwardly biased. This bias is


most marked in data sets with few level 1 units per level 2 unit or probabilities close to boundaries (e.g. 1 or 0 for binary data). The bias is least for second-order penalised quasi-likelihood. Another drawback is that, with quasi-likelihood, no estimate of the log-likelihood is available for comparing models. The second approach relies on numerical or Monte Carlo integration methods. This is computationally considerably more intensive if several random effects or levels are involved. Nevertheless, it is becoming increasingly feasible. An additional advantage is that these methods provide an estimate of the log-likelihood, which can be used for hypothesis testing and interval estimation. The third approach is to adopt a Bayesian formulation with uninformative priors. Many common models are implemented in MLwiN (Rasbash et al., 2000) and several models are described in the WinBUGS manual (Spiegelhalter, Thomas and Best, 1999). Note that these methods can be extended to provide multilevel versions of more general multinomial models (see the chapter by Yang, M. in Leyland and Goldstein, 2001).

Multilevel models are likelihood based. An alternative class of methods, known as GENERALISED ESTIMATING EQUATIONS (GEEs), can also be used for multilevel data (see Diggle et al., 2002, Chapter 11). GEEs model the mean and variance of the data only; unlike multilevel models, a PROBABILITY DISTRIBUTION for the data (e.g. normal) is not specified. Standard errors are often estimated robustly from the sampling variance of the residuals, using the Huber-White sandwich estimator (see HUBER-WHITE ESTIMATOR) (Diggle et al., 2002, p. 80). A theoretical advantage of GEEs is that the fixed parameter estimates are consistent (i.e. reliable if there is sufficient data) even if the covariance structure is wrongly specified; however, they may be inefficient if the covariance structure is substantially misspecified (Goldstein, 2003, p. 21).
The drawback is that variance components are not explicitly modelled, but are treated as nuisance parameters, whereas from the multilevel modelling perspective the variance components contain useful insights. This is directly related to an important, but subtle, difference between the two. Fixed parameter estimates from multilevel models estimate the effect of a covariate on a subject conditional on the value of their subject-specific effects. GEE parameter estimates are marginalised over subject-specific effects: they estimate the average effect of a covariate over the population the data are drawn from. For multilevel models for quantitative data, conditional and marginal estimates of fixed parameters coincide. For discrete data they do not; often marginal estimates are markedly smaller in magnitude (compare Tables 11.1 and 8.2 in Diggle et al., 2002).
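The attenuation of marginal relative to conditional effects for discrete data can be illustrated numerically. The following Python is a rough sketch under stated assumptions, not a reproduction of any analysis in the cited texts: it takes a random intercept logistic model, logit Pr(Y = 1 | u) = alpha + beta x + u with u normally distributed about zero, averages the subject-specific probabilities over u with a simple trapezoid rule, and converts the averaged probabilities back to a log odds ratio. The function names, grid settings and parameter values are all hypothetical.

```python
import math


def expit(x):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    return math.log(p / (1.0 - p))


def marginal_probability(linear_predictor, sd_u, n_points=4001, width=8.0):
    """Average expit(linear_predictor + u) over u ~ N(0, sd_u**2).

    Uses a trapezoid rule on [-width*sd_u, width*sd_u]; dividing by the
    summed weights normalises the (unnormalised) normal density.
    """
    lo = -width * sd_u
    step = 2.0 * width * sd_u / (n_points - 1)
    total = 0.0
    weight_sum = 0.0
    for i in range(n_points):
        u = lo + i * step
        w = math.exp(-0.5 * (u / sd_u) ** 2)
        if i == 0 or i == n_points - 1:
            w *= 0.5  # trapezoid end-point weights
        total += w * expit(linear_predictor + u)
        weight_sum += w
    return total / weight_sum


def marginal_log_odds_ratio(alpha, beta, sd_u):
    """Population-averaged log odds ratio for x = 1 versus x = 0."""
    p0 = marginal_probability(alpha, sd_u)
    p1 = marginal_probability(alpha + beta, sd_u)
    return logit(p1) - logit(p0)


conditional_effect = 1.0  # assumed subject-specific log odds ratio
marginal_small_sd = marginal_log_odds_ratio(0.0, conditional_effect, sd_u=0.1)
marginal_large_sd = marginal_log_odds_ratio(0.0, conditional_effect, sd_u=2.0)
```

With a small random intercept standard deviation the marginal value stays close to the conditional one; with a large standard deviation it is clearly smaller in magnitude, which is the pattern described above.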

The appropriate approach depends on the scientific question. If the primary aim is to model the average response as a function of covariates and time, and the correlation is a nuisance, GEEs may be preferred. The resulting parameter estimates are often known as population averaged. Conversely, if understanding of the variance structure is important, e.g. in investigating determinants of variation in growth rates, multilevel models are required. A complication with conditional models is that, because the interpretation of the fixed parameters is conditional on the variance model, if this is changed the interpretation is generally altered.

The literature on medical applications of multilevel models is vast and growing. A good starting point is the collection of papers in Leyland and Goldstein (2001), which includes models for growth data, spatial distribution of mortality and morbidity, and institutional comparisons. The latter is an important and widespread application of multilevel models. Applications to META-ANALYSIS are discussed by Hardy and Thompson (1996) (quantitative data) and Turner et al. (2000) (binary data), CROSSOVER TRIALS by Jones and Kenward (2003, Chapters 5 and 6) and CLUSTER RANDOMISED TRIALS by Donner and Klar (2000).

So far, we have assumed that each subject at each time only has one response. However, the covariance model readily extends to allow multivariate responses at each time. For example, a subject's diastolic and systolic blood pressure can be modelled simultaneously (see the chapter by McLeod, A. in Leyland and Goldstein, 2001). The multilevel framework can also be extended to handle time-to-event data, with subjects having repeated events and a common frailty (the commonly adopted term for a subject-specific random effect in survival analysis). Indeed, frailties at different levels of the hierarchy can be fitted (Singer and Willett, 2002). Another extension is what is termed 'cross-classified' data.
Here subjects are members of more than one hierarchy. For example, subjects may be nested within general practices and health authorities, but may also be nested within distinct neighbourhoods, served by a number of general practices. They therefore belong to more than one hierarchy. Parameter estimation is no longer always straightforward (Goldstein, 2003, Chapter 11).

Frequently in studies involving longitudinal follow-up, a proportion of the intended responses will be unobserved. An important advantage of multilevel models over classical techniques is that a complete set of observations on each subject included in the analysis is not required; subjects can still be included in the analysis with partially observed response data. Further, if subjects are missing responses, or drop out, then provided that, given their observed data, the reason for the dropout does not depend on the unseen responses (the MISSING AT RANDOM, MAR, assumption), parameter estimates from

multilevel models are still valid (Little and Rubin, 2002). Thus, if data are analysed using multilevel models and response data are MAR, ad hoc IMPUTATION techniques such as replacing a missing observation by the previous seen observation (LOCF) are not required; indeed they will generally introduce BIAS (Molenberghs et al., 2004). Sensitivity to MAR can be assessed; see, for example, Carpenter, Pocock and Lamm (2002). However, parameter estimates from GEEs are not valid under MAR: to guarantee their validity a stronger assumption, missing completely at random (MCAR), is required. This assumption states that the reason for a subject's unobserved data is independent of both their observed and unobserved data. Although GEEs can be modified to cope with MAR data, to do this efficiently requires a nontrivial multistage estimation process. The above does not apply to missing covariate information: if a nontrivial degree of covariate information is missing, it usually needs to be recovered using appropriate data imputation methods.

The generality of multilevel models means that the distribution of many test statistics is only known under the null hypothesis, so simulation often has to be employed in sample size calculations. Simplifying assumptions enable progress in special cases (Diggle et al., 2002, p. 24).

As multilevel models become more mainstream, software to fit the basic models for quantitative data is becoming increasingly available in standard packages. All the models described here can be fitted using MLwiN (Rasbash et al., 2000); many can be fitted using PROCs MIXED, GLIMMIX and NLMIXED in SAS version 8.x (SAS v. 8.1, 2002). For very large datasets, SAS is preferable. A comprehensive review of the capabilities of available packages is given on the MLwiN website, www.mlwin.com. Bayesian model fitting can be performed with the WinBUGS package (Spiegelhalter et al., 1999).
This is very flexible, but the user is required to write the program to fit the model, and a degree of knowledge about MARKOV CHAIN MONTE CARLO methods is required.

Newcomers to multilevel modelling should start with one of the many excellent books now available. The least technical of these is Kreft and de Leeuw (1998), which gives a basic introduction to models for quantitative data, from a social science perspective. The software MLwiN (Rasbash et al., 2000) also comes with an accessible manual and many examples. Raudenbush and Bryk (2002) give a much more extensive treatment, including discrete response models, with many detailed social science examples. For the methodologically inclined, Verbeke and Molenberghs (2000) give a comprehensive overview for quantitative data from a longitudinal perspective. Examples are analysed in detail using mostly SAS (SAS v. 8.1, 2002) and some

MLwiN (Rasbash et al., 2000) and SPLUS (SPLUS v. 6, 2003). The latter half is given over to problems with missing data. Less detailed but more general is Diggle et al. (2002), who also come from a longitudinal standpoint, and discuss quantitative and discrete data, multilevel models, GEEs and transition models. Most recently, Goldstein (2003) gives a comprehensive account of the current state of multilevel modelling, including outlines of technical details and many illustrative examples. JRC

[Acknowledgement: James R. Carpenter was supported by ESRC Research Methods Programme grant H333250047, titled 'Missing data in multi-level models'.]

Carpenter, J., Pocock, S. and Lamm, C. J. 2002: Coping with missing data in clinical trials: a model based approach applied to asthma trials. Statistics in Medicine 21, 1043-66.
Diggle, P. J., Heagerty, P., Liang, K. Y. and Zeger, S. L. 2002: Analysis of longitudinal data, 2nd edition. Oxford: Oxford University Press.
Donner, A. and Klar, N. 2000: Design and analysis of cluster randomization trials in health research. London: Arnold.
Gilks, W. R., Richardson, S. and Spiegelhalter, D. J. (eds) 1996: Markov chain Monte Carlo in practice. London: Chapman & Hall.
Goldstein, H. 2003: Multilevel statistical models, 2nd edition. London: Arnold.
Hardy, R. J. and Thompson, S. G. 1996: A likelihood approach to meta-analysis with random effects. Statistics in Medicine 15, 619-29.
Jones, B. and Kenward, M. G. 2003: Design and analysis of cross-over trials, 2nd edition. London: Chapman & Hall.
Kreft, I. and de Leeuw, J. 1998: Introducing multilevel modelling. London: Sage.
Leyland, A. H. and Goldstein, H. (eds) 2001: Multilevel modelling of health statistics. Chichester: John Wiley & Sons, Ltd.
Little, R. J. A. and Rubin, D. B. 2002: Statistical analysis with missing data, 2nd edition. Chichester: John Wiley & Sons, Ltd.
Molenberghs, G., Thijs, H., Jansen, I., Beunckens, C., Kenward, M. G., Mallinckrodt, C. and Carroll, R. J. 2004: Analyzing incomplete longitudinal clinical trial data. Biostatistics 5, 445-64.
Rasbash, J., Browne, W., Goldstein, H., Yang, M., Plewis, I., Healy, M., Woodhouse, G., Draper, D., Langford, I. and Lewis, T. 2000: A user's guide to MLwiN (version 2.1). London: Institute of Education.
Raudenbush, S. W. and Bryk, A. S. 2002: Hierarchical linear models: applications and data analysis methods, 2nd edition. London: Sage.
SAS v. 8.1 2002: SAS Worldwide Headquarters, SAS Campus Drive, Cary, NC 27513-2414, USA. www.sas.com.
Singer, J. D. and Willett, J. B. 2002: Applied longitudinal data analysis: modelling change and event occurrence. New York: Oxford University Press.
Spiegelhalter, D. J., Thomas, A. and Best, N. G. 1999: WinBUGS version 1.2 user manual. Cambridge: MRC Biostatistics Unit.
SPLUS v. 6 2003: Insightful Switzerland, Christoph Merian-Ring 11, 4153 Reinach, Switzerland.
Turner, R. M., Omar, R. Z., Yang, M., Goldstein, H. and Thompson, S. G. 2000: A multilevel model framework for meta-analysis of clinical trials with binary outcomes. Statistics in Medicine 19, 3417-32.
Verbeke, G. and Molenberghs, G. 2000: Linear mixed models for longitudinal data. New York: Springer Verlag.


multinomial distribution This is a generalisation of the BINOMIAL DISTRIBUTION to the case where more than two outcomes are possible for every 'trial'. Whereas the binomial distribution addresses the number of successes (and thus implicitly the number of failures also) in the case where every event or trial can only result in a success or a failure, the multinomial distribution models the numbers of each outcome in the case where each event or trial can have one of multiple outcomes. For example, Lossos et al. (2000) note that, when modelling genetic mutations in a situation with four rather than two distinct genotypes, it is necessary to extend the usual binomial model to a multinomial one. In general, for $n$ observations, each of which can independently take one of $N$ mutually exclusive outcomes with probabilities $p_1, p_2, \ldots, p_N$ (where $p_1 + p_2 + \cdots + p_N = 1$), the PROBABILITY of seeing $x_1$ observations achieving outcome 1, $x_2$ observations achieving outcome 2, etc., where $x_1 + x_2 + \cdots + x_N = n$, is given by:

$$\Pr(x_1, x_2, \ldots, x_N) = \frac{n!}{x_1!\, x_2! \cdots x_N!}\; p_1^{x_1} p_2^{x_2} \cdots p_N^{x_N}$$

where $n!$ (factorial $n$) is the product of all the integers up to and including $n$, namely $n \times (n - 1) \times (n - 2) \times \cdots \times 3 \times 2 \times 1$, with $0!$ defined to be 1. Note that since the data are multidimensional, there is no single mean value of the distribution as such, although (as for the binomial distribution) the expected number to be seen with outcome $k$ is $p_k n$. AGL
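As an illustration of the multinomial probability and the expected counts $p_k n$, a minimal Python sketch follows. The function names and the example counts and probabilities are hypothetical and chosen only so that the arithmetic is easy to follow by hand.

```python
import math


def multinomial_probability(counts, probs):
    """Probability of observing `counts` when outcome k has probability probs[k]."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    n = sum(counts)
    # n! / (x_1! x_2! ... x_N!), kept as an exact integer
    coefficient = math.factorial(n)
    for x in counts:
        coefficient //= math.factorial(x)
    probability = float(coefficient)
    for x, p in zip(counts, probs):
        probability *= p ** x
    return probability


def expected_counts(n, probs):
    """Expected number of observations with each outcome, n * p_k."""
    return [n * p for p in probs]


# Four observations over three outcomes with probabilities 0.5, 0.25, 0.25:
# 4!/(2! 1! 1!) * 0.5**2 * 0.25 * 0.25 = 12 * 0.015625
example_probability = multinomial_probability([2, 1, 1], [0.5, 0.25, 0.25])
example_expected = expected_counts(4, [0.5, 0.25, 0.25])
```

With two outcomes the same function reduces to the binomial probability.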

Lossos, I. S., Tibshirani, R., Narasimhan, B. and Levy, R. 2000: The inference of antigen selection on Ig genes. Journal of Immunology 165, 5122-6.

multiple comparisons Procedures for a detailed examination of where differences between a set of MEANS lie, usually applied after a significant F-test in an ANALYSIS OF VARIANCE has led to the rejection of the hypothesis that all the means are equal. A large number of multiple comparison techniques has been proposed but no single technique is best in all situations. The major distinction between the techniques is how they control the inflation of the TYPE I ERROR that would occur if, for example, a simple STUDENT'S t-TEST was applied to test the equality of each pair of means. One very simple procedure for dealing with the inflation is to judge the P-values from each t-test against a significance level of $\alpha/m$ rather than $\alpha$, the nominal size of the Type I error, where $m$ is the number of t-tests performed; this is known as the BONFERRONI CORRECTION. Many alternative approaches are available, most of which are based on the usual t-statistic, but which differ in the choice of critical value against which the t-statistic is compared. A comprehensive account of multiple comparison procedures is given in Hsu (1996). SSE

Hsu, J. C. 1996: Multiple comparisons. London: Chapman & Hall.
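The inflation of the Type I error and the effect of judging each P-value against $\alpha/m$ can be sketched in a few lines of Python. This is an illustration under a simplifying assumption: the inflation formula treats the $m$ tests as independent, which pairwise t-tests on a common set of means are not, so the figures indicate the size of the problem rather than exact error rates. The Bonferroni bound itself does not rely on independence. The function names and example P-values are hypothetical.

```python
def familywise_error_rate(m, alpha=0.05):
    """Chance of at least one false positive in m independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_level(m, alpha=0.05):
    """Per-test significance level alpha/m."""
    return alpha / m


def bonferroni_reject(p_values, alpha=0.05):
    """Flag each P-value that falls below alpha/m, where m = len(p_values)."""
    level = bonferroni_level(len(p_values), alpha)
    return [p < level for p in p_values]


# Three tests, each judged at 0.05 (no correction)
uncorrected = familywise_error_rate(3)
# The same three tests, each judged at 0.05/3
corrected = familywise_error_rate(3, bonferroni_level(3))
# Hypothetical P-values from three pairwise t-tests
decisions = bonferroni_reject([0.001, 0.03, 0.04])
```

In this example only the first comparison remains significant after correction, although all three P-values are below 0.05.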

multiple correlation coefficient

See CORRELATION

multiple imputation This is a method by which missing values in a dataset are replaced by more than one, usually between 3 and 10, simulated versions. Each of the simulated complete datasets is then analysed by the method relevant to the investigation to hand and the results combined to produce estimates, STANDARD ERRORS and CONFIDENCE INTERVALS that incorporate missing data uncertainty. Introducing appropriate random error into the imputation process makes it possible to get approximately unbiased estimates of all parameters, although the data must be missing at random for this to be the case. The multiple imputations themselves are created by a Bayesian approach (see BAYESIAN METHODS), which requires specification of a parametric model for the complete data and, if necessary, a model for the mechanism by which data become missing. A comprehensive account of multiple imputation and details of associated software are given in Schafer (1997). BSE

[See also DROPOUTS]

Schafer, J. 1997: The analysis of incomplete multivariate data. Boca Raton: CRC/Chapman & Hall.
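The step in which results from the imputed datasets are combined can be sketched as follows. The entry does not spell out the combination rule; the Python below assumes the standard rules usually attributed to Rubin, in which the pooled estimate is the average of the per-dataset estimates and the total variance is the average within-imputation variance plus $(1 + 1/m)$ times the between-imputation variance. The function name and the example numbers are hypothetical.

```python
import math


def pool_imputed_results(estimates, variances):
    """Combine one parameter's estimates from m imputed datasets.

    estimates: the m point estimates
    variances: the m squared standard errors
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching results from at least two imputations")
    pooled_estimate = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - pooled_estimate) ** 2 for q in estimates) / (m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled_estimate, math.sqrt(total)


# Hypothetical results from three imputed datasets
pooled_estimate, pooled_se = pool_imputed_results([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what carries the missing data uncertainty into the standard error: if all imputations gave identical estimates, the pooled variance would reduce to the average complete-data variance.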

multiple linear regression

This is a technique used to model, or characterise quantitatively, the relationship between a response variable, $y$, and a set of explanatory variables, $x_1, x_2, \ldots, x_q$. The explanatory variables are strictly assumed to be known or under the control of the investigator, i.e. they are not considered to be random variables. In practice, where this is rarely the case, the results from a multiple regression analysis are interpreted as being conditional on the observed values of the explanatory variables. The multiple regression model can be written as:

$$y = \beta_0 + \beta_1 x_1 + \cdots + \beta_q x_q + \varepsilon$$

where $\beta_0$ is an intercept and $\beta_1, \beta_2, \ldots, \beta_q$ are regression coefficients that measure the change in the response variable associated with a unit change in the corresponding explanatory variable, conditional on the other explanatory variables remaining constant. If the explanatory variables are highly correlated such an interpretation is problematic. The residual, $\varepsilon$, is assumed to have a normal distribution with MEAN zero and VARIANCE $\sigma^2$. An alternative way of writing the multiple regression model is that $y$ is distributed normally with mean $\mu$ and variance $\sigma^2$, where $\mu = \beta_0 + \beta_1 x_1 + \cdots + \beta_q x_q$. This formulation makes it clear that the model is only

suitable for continuous response variables with, conditional on the values of the explanatory variables, a NORMAL DISTRIBUTION with constant variance. For a sample of $n$ response values along with the corresponding values of the explanatory variables, the aim of multiple regression is to arrive at a set of values for the regression coefficients that make the values of the response variable predicted from the model as 'close' as possible to the observed response values. Estimation of the parameters of the model ($\beta_0, \beta_1, \ldots, \beta_q$) is usually by least squares (see Rawlings, Pantula and Dickey, 1998). The variation in the response variable can be partitioned into a part due to regression on the explanatory variables and a residual. This partition can be set out in an ANALYSIS OF VARIANCE-type table as shown.

multiple linear regression An analysis of variance-type table

Source       DoF          SS      MS                 MSR
Regression   q            RGSS    RGSS/q             RGMS/RSMS
Residual     n - q - 1    RSS     RSS/(n - q - 1)

Under the null hypothesis that all the regression coefficients, $\beta_1, \beta_2, \ldots, \beta_q$, are zero, the mean square ratio (MSR) in this table can be tested against an F-DISTRIBUTION with $q$ and $n - q - 1$ DEGREES OF FREEDOM. The residual mean square is an estimator of $\sigma^2$ and is used in calculating STANDARD ERRORS of the estimated regression coefficients (see Rawlings, Pantula and Dickey, 1998). The MULTIPLE CORRELATION COEFFICIENT is the correlation between the observed values of the response and the values predicted by the model. The square of the multiple correlation coefficient gives the proportion of the variance of the response that can be explained by the explanatory variables.

The overall test that all the regression coefficients in a multiple regression model are zero is seldom of great interest. The investigator is more likely to be concerned with assessing whether some subset of the explanatory variables might be almost as successful as the full set in explaining the variation in the response variable; i.e. a more parsimonious model is sought. Various procedures have been suggested to help in this search (see ALL SUBSETS REGRESSION, AUTOMATIC SELECTION PROCEDURES).
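The least squares fit and the partition of variation described above can be sketched in plain Python. This is an illustrative sketch, not the software used for the analyses in this entry: it solves the normal equations by Gaussian elimination, which is adequate for a small, well-conditioned example but is not how statistical packages usually proceed. The function names and the tiny data set (two explanatory variables, four observations) are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_multiple_regression(xs, y):
    """Least squares fit with an intercept, plus the analysis of variance quantities."""
    n, q = len(y), len(xs[0])
    design = [[1.0] + list(row) for row in xs]  # leading column of ones
    p = q + 1
    xtx = [[sum(design[i][j] * design[i][k] for i in range(n)) for k in range(p)]
           for j in range(p)]
    xty = [sum(design[i][j] * y[i] for i in range(n)) for j in range(p)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * d for b, d in zip(beta, row)) for row in design]
    y_bar = sum(y) / n
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual SS
    tss = sum((yi - y_bar) ** 2 for yi in y)                 # total SS
    rgss = tss - rss                                         # regression SS
    msr = (rgss / q) / (rss / (n - q - 1))                   # mean square ratio
    return {"beta": beta, "rgss": rgss, "rss": rss, "msr": msr,
            "r_squared": rgss / tss}


# Hypothetical data: two explanatory variables, four observations
example_x = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]]
example_y = [1.0, 2.0, 4.0, 7.0]
example_fit = fit_multiple_regression(example_x, example_y)
```

In the example the explanatory variables are orthogonal, so the coefficients can be checked by hand; the mean square ratio would be referred to an F-distribution with $q = 2$ and $n - q - 1 = 1$ degrees of freedom.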

Once a final model has been settled on, the assumptions of the multiple linear regression approach for the data to hand need to be checked. One way to investigate the possible failings of a model is to examine what are known as residuals, defined as:

Residual = observed response value - predicted response value

The $n$ sample residuals can be plotted in a variety of ways to assess particular assumptions of the multiple regression model: a HISTOGRAM or STEM-AND-LEAF PLOT of the residuals can be useful in checking for normality of the error terms in the model; plots of the residuals against the corresponding values of each explanatory variable may help to uncover when the relationship between the response and an explanatory variable is more complex than that originally assumed (it may suggest that a quadratic term is needed to model a 'U-shaped' or 'J-shaped' apparent relationship); a plot of the residuals against the fitted values may identify, for example, that multivariate OUTLIERS are present and worthy of further investigation and checking, or perhaps that the variance of the response increases with the fitted values, suggesting that a transformation of the response should be considered. There are now many other regression diagnostics available (see, for example, Lovie, 1991).

To illustrate multiple regression we shall use the data shown in the second table. These data arise from a study of 20 patients with hypertension (Daniel, 1995). In practice, of course, there would be too few patients to allow a sensible analysis with six explanatory variables. The response variable here is the mean arterial blood pressure (mmHg).

multiple linear regression Data for 20 patients with hypertension

Patient   BP    Age   Weight   SA     TimeHt   Pulse   Stress
1         105   47    85.4     1.75   5.1      63      33
2         115   49    94.2     2.10   3.8      70      14
3         116   49    95.3     1.98   8.2      72      10
4         117   50    94.7     2.01   5.8      73      99
5         112   51    89.4     1.89   7.0      72      95
6         121   48    99.5     2.25   9.3      71      10
7         121   49    99.8     2.25   2.5      69      42
8         110   47    90.9     1.90   6.2      66      8
9         110   49    89.2     1.83   7.1      69      62
10        114   48    92.7     2.07   5.6      64      35
11        114   47    94.4     2.07   5.3      74      90
12        115   49    94.1     1.98   5.6      71      21
13        114   50    91.6     2.05   10.2     68      47
14        106   45    87.1     1.92   5.6      67      80
15        125   52    101.3    2.19   10.0     76      98
16        114   46    94.5     1.98   7.4      69      95
17        106   46    87.0     1.87   3.6      62      18
18        113   46    94.5     1.90   4.3      70      12
19        110   48    90.5     1.88   9.0      71      99
20        122   56    95.7     2.07   7.0      75      99

BP: Mean arterial blood pressure (mmHg); Age: Age in years; Weight: Weight in kg; SA: Body surface area (square metres); TimeHt: Duration of hypertension (years); Pulse: Basal pulse (beats/min); Stress: Measure of stress.


multiple linear regression Results for data in the second table

Term          Estimated regression coefficient   Standard error   t-value    p-value
(Intercept)   -12.8705                           2.5566           -5.0341    0.0002
Age             0.7033                           0.0496           14.1770    0.0000
Weight          0.9699                           0.0631           15.3691    0.0000
SA              3.7765                           1.5802            2.3900    0.0327
TimeHt          0.0684                           0.0484            1.4117    0.1815
Pulse          -0.0845                           0.0516           -1.6370    0.1256
Stress          0.0056                           0.0034            1.6328    0.1265

Residual standard error: 0.4072 on 13 degrees of freedom; Multiple R-Squared: 0.9962.

The LEAST SQUARES ESTIMATES of the regression parameters are shown in the third table. The square of the multiple correlation coefficient is 0.99 and the mean squares ratio described above takes the value 560.6; tested against an F-distribution with 6 and 13 degrees of freedom the associated P-VALUE is extremely small. Clearly, the hypothesis that all the regression coefficients are zero can be safely rejected. For these data the sample size is too small for residual plots to be particularly informative. However, for interest, the figure shows a plot of residuals against fitted values. The plot gives no cause for concern in respect of the constant variance assumption. SSE

[Figure: scatterplot of residuals (vertical axis, approximately -0.8 to 0.4) against fitted values of response (horizontal axis, 105 to 125)]

multiple linear regression Residuals plotted against fitted values

[See also GENERALISED LINEAR MODELS, LOGISTIC REGRESSION]

Daniel, W. 1995: Biostatistics: a foundation for analysis in the health sciences, 6th edition. New York: John Wiley & Sons, Inc.
Lovie, P. 1991: Regression diagnostics. In Lovie, P. and Lovie, A. D. (eds), New developments in statistics for psychology and the social sciences. London: Routledge.
Rawlings, J. O., Pantula, S. G. and Dickey, D. A. 1998: Applied regression analysis: a research tool. New York: Springer.

multiple record systems

See CAPTURE-RECAPTURE METHODS

multiple testing

This refers to carrying out multiple (more than one, but possibly very many) statistical SIGNIFICANCE TESTS. The problem is one of not controlling the overall TYPE I ERROR rate when we perform many significance tests. The Type I error is the probability of falsely rejecting the null hypothesis ($H_0$) when it is actually true. If we compare two treatments in a CLINICAL TRIAL, we generally state the null hypothesis to be that there is no difference in mean response (or in death rates, or cure rates, etc.) between the two treatments. This is not a statement about the data that we see in the trial (the sample means, $\bar{x}_i$) but rather one about the true (but unknown) population means, $\mu_i$. The alternative ($H_1$) is simply the converse, i.e. that there is a difference between the treatments. Now, usually (although it is a very arbitrary yardstick), we reject $H_0$ and declare that a difference between treatments exists if the calculated P-VALUE is less than 5%. So for any single significance test, if the null hypothesis is true, implicitly we are accepting a risk of being wrong of 5%, and for many situations, many people consider that an adequately small risk.

However, what happens if we perform more than one significance test to answer the same (or related) questions and we are prepared to reject the null hypothesis if either (or both) tests give P < 0.05? In the simplest case of two independent tests, the PROBABILITY of either test 1 or test 2 (or, indeed, both tests) giving us a small (say, <5%) P-value is $1 - 0.95^2$, which equals 0.0975 (or close to 10%). If we carried out three, four, five or even ten independent significance tests, the probabilities that at least one of them will give a small (<5%) P-value would be, respectively, 0.143, 0.186, 0.226 and 0.401. These are the FALSE POSITIVE ERROR RATES and it is apparent that, very quickly, the risk of erroneously declaring a statistically significant difference between the treatments (i.e. when all of the null hypotheses are true) becomes much (and unacceptably) higher than 5%. Therefore, we need methods to correct for this inflated Type I error rate.

The simplest method to use is the BONFERRONI CORRECTION. Using this very simplistic line of reasoning, if we are to carry out two significance tests but want to ensure that the overall chance of making a false positive error is kept at 5%, then we should test each of the two null hypotheses at the 2.5% level. Then the probability of either test 1 or test 2 (or both tests) giving us a small (now 'small' means less than 2.5%) P-value is $1 - 0.975^2$, which is close to 0.05. So if either or both tests meet this more stringent level of statistical significance, we can reject the null hypothesis and only run a 5% risk of making a false positive claim


about a difference between the treatments. His idea extends very easily to more than two tests: for three we compare each calculated value of P with 0.05/3 (= 0.0167), for four tests we compare each calculated value of P with 0.05/4 (= 0.0125), and so on.

There are many reasons why multiple testing (particularly in CLINICAL TRIALS, but in other medical applications too) might occur, even in studies that just compare two treatments. Examples include: many ENDPOINTS, more than one time point (e.g. short-term response and long-term response), more than one analysis method (e.g. using a NONPARAMETRIC METHOD and its equivalent parametric test), different definitions of response to treatment (e.g. difference in means, or mean changes from baseline, or percentage change from baseline, or proportion of 'responders', etc.), differences within (and between) various subgroups (see SUBGROUP ANALYSIS), multiple INTERIM ANALYSES, and so on. In studies comparing more than two treatments, there are all these same possible examples of multiplicity in addition to those of many comparisons between all the different treatment arms. It is therefore quite clear that very quickly the simple scenario of one significance test of one primary endpoint can become unrealistic.

The Bonferroni method is simple but also very inefficient: it lacks statistical POWER to identify real differences between treatments (see, for example, Perneger, 1998). It can also lead to the very uncomfortable situation of two tests giving P-values of, say, 0.03 and 0.04. At first sight, these both appear to demonstrate a statistically significant difference between the groups; they are both less than 0.05. However, because there are two tests, Bonferroni says we need to compare each of them against the more stringent criterion of 0.025, but now neither meets this level of stringency! Hence, various (in fact, numerous) other methods have been proposed: some are simple (albeit not as simple as Bonferroni) and some are very complex. The details are beyond the scope of this text although a good overview is given by, for example, Hochberg and Tamhane (1987). We will illustrate ideas using two simple methods.

One is by Holm (1979). He proposed ordering all (of k) calculated P-values from $P_1$ (the largest) to $P_k$ (the smallest). The smallest calculated P-value ($P_k$) should be compared with $\alpha/k$: if $P_k < \alpha/k$ then we declare this difference to be statistically significant and move to the next smallest of the P-values, $P_{k-1}$. If $P_{k-1} < \alpha/(k-1)$ then we declare this difference also to be statistically significant. The procedure continues until all of the calculated P-values have been tested, or until one of the P-values fails to meet the criterion for statistical significance. In this case, the procedure stops and no more of the tests are considered significant.

Another method is called the 'closed testing' approach; it is best described by an example. If we wanted to compare two doses of an active drug with PLACEBO (or some other reference product) then it might seem that we need to carry out each test at a 2.5% significance level. However, assuming we believe that the dose-response relationship will be monotonically increasing then we can begin by just testing the highest dose against the reference at the standard 5% level. If the P-value is smaller than 5%, then the difference is declared statistically significant and the next dose (and its P-value) is considered. This P-value is also compared to the 5% level and, if smaller than this, is declared significant. The procedure could continue if there were several doses. Each test is carried out at the 5% significance level until one fails to meet it; then all testing stops and none of the other tests is considered significant. This is a much more powerful procedure than Bonferroni's although it has a major problem if a treatment turns out to have no effect on the first test (in this case the highest dose) but may have substantial effects at lower doses: none of the secondary tests can be considered 'significant' because the very first test (for the highest dose) failed. These two approaches (and there are many others) illustrate that there is no simple, single approach to solving multiplicity problems that is applicable in all situations. SD

Hochberg, Y. and Tamhane, A. C. 1987: Multiple comparison procedures. New York: John Wiley & Sons, Inc. Holm, S. 1979: A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics 6, 65-70. Perneger, T. V. 1998: What's wrong with Bonferroni adjustments? British Medical Journal 316, 1236-8.
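The arithmetic in this entry can be checked with a short sketch. It is an illustration only, assuming independent tests for the familywise error calculation; the function names are hypothetical and are not taken from any statistical package.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability that at least one of n independent tests gives P < alpha
    when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_reject(p_values, alpha=0.05):
    """Bonferroni: compare every P-value with alpha / k (k = number of tests)."""
    k = len(p_values)
    return [p < alpha / k for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Holm's procedure: smallest P-value against alpha/k, the next smallest
    against alpha/(k-1), and so on, stopping at the first failure."""
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])  # smallest first
    reject = [False] * k
    for step, idx in enumerate(order):
        if p_values[idx] < alpha / (k - step):
            reject[idx] = True
        else:
            break  # no larger P-value is declared significant
    return reject


if __name__ == "__main__":
    for n in (2, 3, 4, 5, 10):
        print(n, round(familywise_error_rate(n), 3))
    print(bonferroni_reject([0.03, 0.04]))  # the 'uncomfortable' example
    print(holm_reject([0.02, 0.04]))
```

With P-values of 0.02 and 0.04, Bonferroni declares only the first significant, whereas Holm's procedure declares both, which illustrates its gain in power.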

multistage cluster sample

This is a nonprobabilistic method of sampling used when members of a population are arranged in subgroups or clusters. In this method clusters are the sampling unit rather than individuals. Members within a cluster should be as different as possible whereas clusters, by way of contrast, should be as alike as possible. However, this condition is hard to satisfy and since two members of a cluster will be more alike than two from different clusters, it is better to have many small clusters than a few large clusters, as this reduces sampling error. Each cluster should be similar to the total population but on a smaller scale. Clusters must be distinct from each other and every member of the population should fall within a cluster. In some situations, it is necessary for all clusters to be of a similar size and this may require the pooling of some clusters. Otherwise, the PROBABILITY that a cluster is chosen can be made proportional to its size, so that bigger clusters are more likely to be chosen than smaller clusters and the probability of selecting an individual member of the cluster is inversely proportional to the size of the cluster.

For a single-stage cluster sample a list of the clusters is constructed. Then a random sample of the clusters is taken. This may be a simple random sample, with each cluster having an equal probability of being included in the sample, or it may be that the probability of being in the sample is proportional to the size of the cluster. Once the clusters have been selected each member of the cluster is included in the sample.


For a two-stage cluster sample the method is the same as a single-stage cluster sample but once the clusters have been selected then a SIMPLE RANDOM SAMPLE is used to select the members of the cluster to be included in the sample. Clusters may also form larger clusters, in which case multistage sampling would be used. First, the clusters would be sampled, followed by the subclusters. Depending on the makeup of the population there may be many stages to the multistage sampling. Multistage cluster sampling was used in a study of violations of the international code of marketing of breast milk substitutes (Taylor, 1998). Here, the capital cities of four chosen countries were the main clusters, districts were randomly selected subclusters, health facilities were randomly selected from the subclusters and mothers were systematically sampled from the health facilities. The main advantage is that no sampling frame is required. The main disadvantage is that the sampling is nonrandom and sampling error increases by taking multiple samples, as there is sampling error at each stage. For further details see Crawshaw and Chambers (1994). SLV
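The two-stage procedure described above might be sketched as follows. This is a minimal illustration under stated assumptions: the function name, its arguments and the dictionary layout are hypothetical, and drawing clusters one at a time with size-based weights gives inclusion probabilities that are only approximately proportional to size.

```python
import random


def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None,
                             proportional_to_size=False):
    """Stage 1: select clusters. Stage 2: simple random sample within each.

    clusters: dict mapping a cluster name to the list of its members.
    """
    rng = random.Random(seed)
    names = sorted(clusters)
    if proportional_to_size:
        # Sequential weighted draws without replacement (approximate PPS).
        remaining = list(names)
        chosen = []
        for _ in range(n_clusters):
            weights = [len(clusters[name]) for name in remaining]
            pick = rng.choices(remaining, weights=weights, k=1)[0]
            chosen.append(pick)
            remaining.remove(pick)
    else:
        # Simple random sample of clusters: equal chance for each cluster.
        chosen = rng.sample(names, n_clusters)

    sample = {}
    for name in chosen:
        members = clusters[name]
        size = min(n_per_cluster, len(members))
        sample[name] = rng.sample(members, size)  # second-stage sample
    return sample
```

Setting `n_per_cluster` at least as large as the biggest cluster reproduces the single-stage design, in which every member of a selected cluster is included.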

Crawshaw, J. and Chambers, J. 1994: A concise course in A level statistics, 3rd edition. Cheltenham: Stanley Thornes Publishers Ltd. Taylor, A. 1998: Violations of the international code of marketing of breast milk substitutes: prevalence in four countries. British Medical Journal 316, 1117-22.

multistate models

These are often adopted for analysing event history (see SURVIVAL ANALYSIS - AN OVERVIEW) and LONGITUDINAL DATA. They are commonly applied in studies where subjects are followed up over time with respect to a (stochastic) process of interest that is observed to occupy exactly one of a finite number of discrete states at any point in time. Multistate models have been found to be extremely useful in medical areas such as psoriatic arthritis, where individuals may move between a number of disability states (e.g. mild, moderate and severe disability) over the longitudinal course of their disease (Husted et al., 2005); in hepatitis C virus (HCV) disease progression studies where liver biopsy scores are used to determine the stage of HCV-related liver disease (Sweeting et al., 2006); in describing the states, characterised by the occurrence of various events (acute graft versus host disease, chronic graft versus host disease, relapse and death in remission), which a leukaemia patient may enter following bone marrow transplantation (Keiding, Klein and Horowitz, 2001); and in other areas such as Alzheimer's disease, bronchiolitis obliterans syndrome, cancer, cognitive impairment, diabetic retinopathy, HIV/AIDS, studies of twins and in competing risks of death studies.

Multistate models are based on stochastic processes that move through a series of discrete states in continuous time. The movements between states are called transitions. States can be transient (movements out are allowed) or absorbing, if further transitions out are prohibited. The state structure describes the states and determines which transitions are allowed. Different state structures may characterise the same stochastic process and are thus not unique to the process. The state structure chosen depends on the questions of interest, the transparency of model assumptions and the ease with which to make inferences. This structure can be represented schematically with a multistate diagram in which boxes represent the various states and arrows between the boxes represent the possible transitions that can occur. Figures (a) to (e) present various multistate diagrams of commonly observed multistate process types.

[Figure: multistate models Various diagrams of commonly observed multistate process types. Panels: (a) Survival Model (Alive, Dead); (b) Progressive Model (Well, Sick, Dead); (c) Competing Risks Model; (d) Disability Model (Healthy, Disabled, Dead); (e) Illness-Recovery-Death Model]

A multistate process can be specified fully either through its transition intensities (also known as hazard functions, see SURVIVAL ANALYSIS - AN OVERVIEW) or by its transition probabilities. The transition intensities are the instantaneous probabilities per time unit (i.e. the transition rates) of going from one state to another, given the history (development) of the process just prior to the times of the transitions. When a transition from one state directly to another is impossible the corresponding transition intensity is zero. The transition probabilities represent the conditional probabilities of the process being in particular states at various times, given that the process was observed in specific states at earlier times and the histories of the process up to these earlier times. For movements out of an absorbing state the transition probabilities will clearly be zero.

Mathematically, given a multistate process for a subject, $X(t)$, at time $t \ge 0$ with a finite discrete state space denoted by $S = \{1, 2, \ldots, m\}$ and history up to some time just prior to $s < t$, $H(s^-)$, then the transition probability of moving from state $i$ at time $s$ to state $j$ at time $t$ is given by:

$$P_{ij}(s,t) = \Pr(X(t) = j \mid X(s) = i; H(s^-))$$

and the transition intensity of making an $i$ to $j$ instantaneous transition (denoted by $i \to j$) at time $t$ is given by:

$$\alpha_{ij}(t) = \lim_{\Delta t \to 0} \frac{\Pr(X(t + \Delta t) = j \mid X(t) = i; H(t^-))}{\Delta t}$$

It is important to note that both the transition intensities and the transition probabilities are essential for the development, estimation and inference of multistate models. The transition intensities are important in the formulation of necessary models for the data, in terms of incorporating covariates that explain some of the heterogeneity across subjects and for representing the assumptions (e.g. time homogeneity, Markov or semi-Markov, piecewise constant or Weibull baseline intensities, proportional intensities (see PROPORTIONAL HAZARDS), etc.) being made. The transition probabilities, on the other hand, are important for constructing the likelihood function (see LIKELIHOOD) to be maximised and for making long-range predictions.

A simple and mathematically tractable example of a multistate model, which has been described often in the literature, is the time homogeneous Markov model. Here transition intensities are assumed constant over, or independent of, time (i.e. time homogeneous) and only depend on the history of the process through the current state (i.e. Markov assumption). In this special case, a model for the transition intensities, incorporating baseline explanatory variables, can be specified using a multiplicative structure and the proportional intensities (or hazards) assumption proposed by Sir David Cox in 1972, which is that the transition intensity, $\alpha_{ij}(t)$, of making an $i \to j$ transition at time $t$ is given by:

$$\alpha_{ij}(t) = \alpha_{ij0} \exp(\beta_{1ij} z_1 + \cdots + \beta_{pij} z_p)$$

where $\alpha_{ij0}$ corresponds to a baseline intensity of making an $i \to j$ transition, which is being modified by the exponential of a linear combination of the baseline explanatory variables, $z_1, \ldots, z_p$. The $p$ regression coefficients, $\beta_{1ij}, \ldots, \beta_{pij}$, associated with these baseline explanatory variables are assumed here to be transition intensity specific, although constraints on them can lead to more parsimonious models. The exponentials of the regression coefficients are interpreted as rate ratios. Further extensions beyond simple time homogeneous Markov multistate models have been made. The reader is referred to review articles by Commenges (1999), Hougaard (1999), Andersen and Keiding (2002) and Meira-Machado et al. (2009) for further details.
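As a rough illustration of the time homogeneous Markov model, the sketch below builds an intensity matrix from constant transition intensities and computes transition probabilities over an interval of length t. It relies on the standard result, not stated in this entry, that for such a model the transition probability matrix is the matrix exponential of the intensity matrix multiplied by t. All function names are hypothetical, states are numbered from 0, and the series-based matrix exponential is a simple stand-in for a library routine.

```python
import math


def proportional_intensity(baseline, betas, covariates):
    """alpha_ij = alpha_ij0 * exp(beta_1ij * z_1 + ... + beta_pij * z_p)."""
    linear_predictor = sum(b * z for b, z in zip(betas, covariates))
    return baseline * math.exp(linear_predictor)


def intensity_matrix(intensities, n_states):
    """Build Q from a dict {(i, j): alpha_ij}; each row sums to zero."""
    q = [[0.0] * n_states for _ in range(n_states)]
    for (i, j), rate in intensities.items():
        q[i][j] = rate
    for i in range(n_states):
        q[i][i] = -sum(q[i][j] for j in range(n_states) if j != i)
    return q


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_probabilities(q, t, squarings=10, terms=12):
    """P(t) = exp(Q t), by a truncated Taylor series with scaling and squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    step = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = mat_mul(term, step)
        term = [[x / k for x in row] for row in term]  # now (Q*scale)^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


if __name__ == "__main__":
    # Hypothetical intensities for a Well (0), Sick (1), Dead (2) structure.
    q = intensity_matrix({(0, 1): 0.2, (0, 2): 0.05, (1, 2): 0.3}, 3)
    for row in transition_probabilities(q, 2.0):
        print([round(x, 4) for x in row])
```

In the printed matrix the row for the absorbing state (Dead) has probability one of remaining there, in line with the remark that transition probabilities out of an absorbing state are zero.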

Frequently in event history and longitudinal data studies in medical research, observation of the exact transition times may not occur. This may be because, by the end of follow-up, not all individuals under study have reached an absorbing state, which thus results in right censored observation times. Follow-up of some individuals may have only happened some time after the process began and various events may have occurred between the start of the process and the start of follow-up, which may result in left censoring if the times of these various events are unknown. Furthermore, there are many longitudinal studies where subjects are observed intermittently (i.e. discretely in time) and the times of transitions are interval censored (i.e. the exact times of transitions are unobserved), except possibly for an absorbing state such as death. Finally, left truncation may occur when individuals come under observation only some known time after the 'natural' defined time origin of the process. For example, potential participants eligible for a study would only enter if they have not died by study commencement. The presence of these observation and selection schemes may pose special problems for valid inference, and assumptions on how these features may or may not be informative for the multistate process are required in order to construct the appropriate likelihood function. In most situations, at least initially, the 'sampling time process' is assumed to be noninformative (ignorable) for the multistate process. BT

Andersen, P. K. and Keiding, N. 2002: Multi-state models for event history analysis. Statistical Methods in Medical Research 11, 91-115. Commenges, D. 1999: Multi-state models in epidemiology. Lifetime Data Analysis 5, 315-27. Hougaard, P. 1999: Multi-state models: a review. Lifetime Data Analysis 5, 239-64. Husted, J. A., Tom, B. D., Farewell, V. T., Schentag, C. and Gladman, D. D. 2005: Description and prediction of physical functional disability in psoriatic arthritis: a longitudinal analysis using a Markov model approach. Arthritis Care and Research 53, 401-9. Keiding, N., Klein, J. P. and Horowitz, M. M. 2001: Multistate models and outcome prediction in bone marrow transplantation. Statistics in Medicine 20, 1871-85. Meira-Machado, L., de Uña-Álvarez, J., Cadarso-Suárez, C. and Andersen, P. K. 2009: Multi-state models for the analysis of time-to-event data. Statistical Methods in Medical Research 18, 195-222. Sweeting, M. J., De Angelis, D., Neal, K. R., Ramsay, M. E., Irving, W. L., Wright, M., Brant, L., Harris, H. E., Trent HCV Study Group, HCV National Register Steering Group 2006: Estimated progression rates in three United Kingdom hepatitis C cohorts differed according to method of recruitment. Journal of Clinical Epidemiology 59, 144-52.

multivariate analysis of variance (MANOVA) See ANALYSIS OF VARIANCE

multivariate normal distribution

This is a generalisation of the NORMAL DISTRIBUTION to more than one dimension and the PROBABILITY law that underlies many methods of multivariate analysis.


N

negative binomial distribution

This is the PROBABILITY DISTRIBUTION of the number of events required in order to observe k 'successes'. Contrast this with the BINOMIAL DISTRIBUTION, which models the number of successes that will occur given a fixed number of trials. Also note that, since the GEOMETRIC DISTRIBUTION models the number of events required to observe one success, it is a special case of the negative binomial. If each event independently has a probability of success, p, then the probability mass function for the number of events, x, required to observe k successes is:

$$\Pr(X = x) = \frac{(x-1)!}{(k-1)!\,(x-k)!}\, p^{k} (1-p)^{x-k}$$

where n! (factorial n) is given by the product of the positive integers up to and including n, and 0! is defined to be 1. The MEAN of the distribution is $k/p$ and the VARIANCE is $k(1-p)/p^2$.

The distribution can be generalised to the case where the k parameter is not an integer (by replacing the factorial terms with gamma functions as mentioned in GAMMA DISTRIBUTION), which then enables the following interpretation. Suppose we have observations of count data from a population of size N, where each person's count will be independently distributed as Poisson with some parameter $\lambda$. In this case, we would expect the counts in the population to be distributed again as a POISSON DISTRIBUTION. Yet often the population exhibits more variance than can be explained by a Poisson distribution, an example of OVERDISPERSION. One reason for this might be that individuals do not share the same value for $\lambda$. For example, Mwangi et al. (2008) show that counts of malaria episodes in 373 children show more variability than can be explained by a Poisson distribution, but show also that the negative binomial distribution provides a much better fit. This is attributed to variation in the susceptibility of the children, with some children being at increased risk of clinical malaria compared to others, and so the model assuming a common $\lambda$ does not hold.

The negative binomial distribution is not the only distribution to allow for greater dispersion than the Poisson distribution, but, specifically, if values from individuals are Poisson distributed but the values of $\lambda$ vary between individuals according to a gamma distribution, then the population frequencies will be distributed as a negative binomial distribution. For further discussion of this and other aspects of the distribution see Grimmett and Stirzaker (1992), Gelman et al. (1995) and Glynn and Buring (1996). AGL

Gelman, A., Carlin, J. B., Stern, H. S. and Rubin, D. B. 1995: Bayesian data analysis. Boca Raton: Chapman & Hall/CRC. Glynn, R. J. and Buring, J. E. 1996: Ways of measuring rates of recurrent events. British Medical Journal 312, 364-7. Grimmett, G. R. and Stirzaker, D. R. 1992: Probability and random processes, 2nd edition. Oxford: Clarendon Press. Mwangi, T. W., Fegan, G., Williams, T. N., Kinyanjui, S. M., Snow, R. W. et al. 2008: Evidence for over-dispersion in the distribution of clinical malaria episodes in children. PLoS ONE 3(5), e2196.
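The probability mass function, mean and variance given in this entry can be checked numerically. The sketch below is illustrative only and assumes an integer k; the function names are hypothetical, and the infinite sums are truncated at a large value of x.

```python
from math import comb


def neg_binomial_pmf(x, k, p):
    """Pr(X = x): probability that the k-th success occurs on event x."""
    if x < k:
        return 0.0
    # (x-1)! / ((k-1)! (x-k)!) is the binomial coefficient C(x-1, k-1).
    return comb(x - 1, k - 1) * p ** k * (1 - p) ** (x - k)


def summarise(k, p, max_x=500):
    """Approximate total probability, mean and variance by direct summation."""
    probs = [(x, neg_binomial_pmf(x, k, p)) for x in range(k, max_x + 1)]
    total = sum(pr for _, pr in probs)
    mean = sum(x * pr for x, pr in probs)
    variance = sum((x - mean) ** 2 * pr for x, pr in probs)
    return total, mean, variance


if __name__ == "__main__":
    k, p = 3, 0.4
    total, mean, variance = summarise(k, p)
    print(total, mean, k / p)                   # mean against k/p
    print(variance, k * (1 - p) / p ** 2)       # variance against k(1-p)/p^2
```

Setting k = 1 gives the geometric distribution mentioned above as a special case.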

negative predictive value (NPV)

This is defined for a diagnostic test for a particular condition as the PROBABILITY that those who have a negative test do not actually have the condition under investigation as measured by a reference or 'gold' standard. (Contrast this with the POSITIVE PREDICTIVE VALUE.) If the data are set out as in the table, then:

$$\mathrm{NPV} = \frac{d}{c + d}$$

NPV can also be expressed as a percentage.

negative predictive value General table of test results among a + b + c + d individuals sampled

                 Condition
Test result      Present    Absent    Total
Positive         a          b         a + b
Negative         c          d         c + d
Total            a + c      b + d     a + b + c + d

The NPV should be presented with CONFIDENCE INTERVALS (typically set at 95%) calculated using an appropriate method such as that of Wilson that will not produce impossible values (percentages greater than 100 or below 0) when NPV approaches extreme values. CLC
[See also FALSE NEGATIVE RATE, FALSE POSITIVE RATE, POSITIVE PREDICTIVE VALUE, SENSITIVITY, SPECIFICITY]

Altman, D. G., Machin, D., Bryant, T. N. and Gardner, M. J. 2000: Statistics with confidence, 2nd edition. London: BMJ Books.
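A minimal sketch of the calculation follows, using the Wilson score interval without continuity correction. The function names are hypothetical; c and d are the negative-test counts for those with and without the condition, as in the formula above.

```python
from math import sqrt


def npv(c, d):
    """NPV = d / (c + d): d true negatives among c + d negative tests."""
    return d / (c + d)


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a proportion (z of about 1.96 gives 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = (z * sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2))
                  / denom)
    # Clamp only to absorb floating-point rounding at the boundaries.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


if __name__ == "__main__":
    c, d = 5, 95  # hypothetical counts
    print(npv(c, d), wilson_interval(d, c + d))
    print(wilson_interval(50, 50))  # stays within [0, 1] when NPV is 100%
```

Unlike the simple normal approximation, the interval stays within 0 and 1 even when every negative test is a true negative.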

nested case-control studies

This is a form of CASE-CONTROL STUDY in which the cases and controls are drawn from within a larger study. In other words, they are nested within a parent study, which is usually a COHORT STUDY but sometimes a CROSS-SECTIONAL STUDY or a PREVALENCE study.


The nested nature of such studies provides the method's strength. One concern in the design of case-control studies is the appropriate choice of controls, all of whom should be eligible to be cases if they were to develop the disease. A case-control study nested within a cohort study overcomes this concern as a control within the cohort who developed the disease would be counted as a case. Usually, but not necessarily, the controls who are chosen are matched to the cases on various confounding factors such as age and sex. The usual reason for conducting a nested case-control study, rather than analysing data on the entire cohort or survey, is economy. Usually more data are collected on the participants of the nested study than in the main study. Sometimes these data are derived from the analysis of stored samples, blood or urine, for example, or from further information being obtained from the participants.

In a study to examine the role of sex steroid hormones in relation to endometrial cancer, Lukanova et al. (2004) nested a case-control study within three large cohort studies from Italy, Sweden and the United States. The cohorts comprised over 65000 women from whom venous blood samples had been taken at enrolment in the cohorts. From within the cohorts, 124 cases of endometrial cancer were identified and two controls per case were chosen from within the same cohort as the case and matched on various factors including date of and age at blood donation. In this way, they were able to confine the processing of the samples to the 124 cases and corresponding controls, providing a great saving on processing samples from all participants in the three cohorts, yet without major loss of statistical POWER.

While the processing of blood samples is often a component of nested case-control studies, other samples can be the focus. In a study to assess the role of selenium in coronary heart disease in men, toenail clippings were obtained for selenium analysis (Yoshizawa et al., 2003). The study was nested within the Health Professionals' Study in the United States, which is a cohort study of over 50000 men. Within the cohort 470 participants developed coronary heart disease and a matched control was chosen for each one. Thus, fewer than 1000 toenail samples had to be analysed for selenium.

The cost of sample processing is not the only reason for conducting a nested case-control study. While information collected on the cohort is of interest, sometimes further data collection is required. For example, London et al. (2003) conducted a case-control study of breast cancer nested within a cohort study of more than 50000 women. Their interest was in residential magnetic field exposure and for this they were able to focus on the 743 cases identified in the cohort as having breast cancer and a comparable number of controls. Detailed assessment of the magnetic field exposure was made in the homes of the selected participants. Data on other risk factors for breast cancer and possible confounding factors were already available in the data that had been collected for the entire cohort.

Case-control studies can also be nested within cross-sectional surveys. Baker et al. (2003) conducted a survey of approximately 3000 men in southern England to ascertain the prevalence of knee disorders in the general population. A nested case-control component considered the cases who had undergone knee surgery. The focus was on occupational and sporting activities. The activities undertaken by the cases at the birthday prior to their reported onset of symptoms were considered. For each case, five controls were selected, matched to the case within 1 year of age. The activities undertaken by the controls at the same birthday as the case were then considered. The nested case-control study thus allowed the investigators to avoid bias due to the cases being likely to give up activities at an earlier age than the controls, because of knee pain. A matched analysis (see MATCHED PAIRS ANALYSIS) was required and thus the entire cross-sectional study was a weaker tool for this particular analysis than the matched nested case-control method.

A further variant on the method is to nest a case-control study within a large routine data collection system. Agerbo (2003) analysed the risk of suicide in relation to spouse's psychiatric illness or suicide. The data were obtained by linking the Danish population registers using the unique personal identification numbers assigned to all people in the country. All suicides were identified, as were 20 matched controls per case. All the spouses and children living with the cases and controls were identified from the registers, along with information on diagnoses from the Danish psychiatric register. The study showed that there was a greater risk of suicide among those whose spouse had been admitted to hospital with a psychiatric disorder or who had died, particularly if the death had been by suicide. Few countries are able to conduct such studies as the linkage between record systems is not possible or is not allowed, but where it is possible, such as in Scandinavia, the opportunities for such epidemiological studies are great.

In a cohort study in which cases are recruited prospectively it is possible that controls identified at one time point become cases later. This is particularly likely to happen if the disease is common. An example of this is a nested case-control study within a birth cohort in Sweden (Emenius et al., 2003). Here the focus was on recurrent wheezing in children in relation to nitrogen dioxide exposure. Wheezing is common in childhood and cases were identified from assessment of the cohort at 1 and 2 years of age. Controls were chosen from within the cohort and matched to the cases on the day of birth. Three controls, selected to match cases identified at the age of 1 year, were found to be wheezing at the 2-year assessment and so were also included as cases at that time point. Such nested case-control studies in which controls can become cases are sometimes called case-cohort studies. HI
[See also BIRTH COHORT STUDIES, CASE-CONTROL STUDIES]
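The selection of matched controls from within a cohort might be sketched as below. This is a simplified illustration, not the procedure of any of the studies cited: the function name, the record layout and the matching rule (same sex, age within a fixed gap) are assumptions, the timing of case occurrence is ignored, and the same person may be drawn as a control for more than one case.

```python
import random


def sample_matched_controls(cohort, case_ids, controls_per_case=2,
                            max_age_gap=1, seed=None):
    """For each case, draw controls of the same sex and similar age.

    cohort: dict mapping a participant id to {"age": int, "sex": str}.
    Controls are drawn only from cohort members who are not cases.
    """
    rng = random.Random(seed)
    cases = set(case_ids)
    matched = {}
    for case_id in sorted(cases):
        case = cohort[case_id]
        eligible = [
            pid for pid, person in sorted(cohort.items())
            if pid not in cases
            and person["sex"] == case["sex"]
            and abs(person["age"] - case["age"]) <= max_age_gap
        ]
        size = min(controls_per_case, len(eligible))
        matched[case_id] = rng.sample(eligible, size)
    return matched
```

Only the cases and their selected controls would then need further data collection or sample processing, which is the economy the entry describes.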

Agerbo, E. 2003: Risk of suicide and spouse's psychiatric illness or suicide: nested case-control study. British Medical Journal 327, 1025-6. Baker, P., Reading, I., Cooper, C. and Coggon, D. 2003: Knee disorders in the general population and their relation to occupation. Occupational and Environmental Medicine 60, 794-7. Emenius, G., Pershagen, G., Berglind, N., Kwon, H.-J., Lewné, M., Nordvall, S. L. and Wickman, M. 2003: NO2, as a marker of air pollution, and recurrent wheezing in children: a nested case-control study within the BAMSE birth cohort. Occupational and Environmental Medicine 60, 876-81. London, S. J., Pogoda, J. M., Hwang, K. L., Langholz, B., Monroe, K. R., Kolonel, L. N., Kaune, W. T., Peters, J. M. and Henderson, B. E. 2003: Residential magnetic field exposure and breast cancer risk: a nested case-control study from a multiethnic cohort in Los Angeles County, California. American Journal of Epidemiology 158, 969-80. Lukanova, A., Lundin, E., Micheli, A., Arslan, A., Ferrari, P., Rinaldi, S., Krogh, V., Lenner, P., Shore, R. E., Biessy, C., Muti, P., Riboli, E., Koenig, K. L., Levitz, M., Stattin, P., Berrino, F., Hallmans, G., Kaaks, R., Toniolo, P. and Zeleniuch-Jacquotte, A. 2004: Circulating levels of sex steroid hormones and risk of endometrial cancer in postmenopausal women. International Journal of Cancer 108, 425-32. Yoshizawa, K., Ascherio, A., Morris, J. S., Stampfer, M. J., Giovannucci, E., Baskett, C. K., Willett, W. C. and Rimm, E. B. 2003: Prospective study of selenium levels in toenails and risk of coronary heart disease in men. American Journal of Epidemiology 158, 852-60.

net monetary benefit (NMB) See COST-EFFECTIVENESS ANALYSIS

net reproduction rate (NRR) See DEMOGRAPHY

neural networks

This is a general class of algorithms for MACHINE LEARNING. A neural network can be described as a parameterised class of functions, specified by a weighted graph (the network's architecture). The weights associated with the edges of the graph are the parameters. Originally, neural networks were motivated by analogy with the structure of the brain. The nodes of the neural network correspond to the neurons and the edges to neuron interactions. For directed graphs, we can distinguish recurrent architectures (containing cycles) and feedforward architectures (acyclic). A very important special case of feedforward networks is given by layered networks, in which the nodes of the graph are organised into layers such that connections are possible only between elements of two consecutive layers. The weight between the jth unit and the kth unit of successive layers $l-1$ and $l$ in a network is indicated by $w_{kj}$ and it is often assumed that all elements of a layer are connected to all elements of the successive layer (fully connected architecture). In this way, the connections between two layers $l-1$ and $l$ can be represented by a weight matrix $W_l$, whose entry at row $k$ and column $j$ corresponds to the weight $w_{kj}$ of the edge from node $j$ to node $k$ in the successive layers (see the figure). It is customary to call the first layer the input layer and the last one the output layer. The remaining ones are called hidden layers.

[Figure: neural networks Connections between layers and the weight matrix (input layer, hidden layer, output layer)]

A 'perceptron' can be described as a network of this type with no hidden layers. It can also be seen as the building block of complex networks, in that each unit can be regarded as a perceptron (if instead of the transfer function one uses a threshold function, returning Boolean values). Therefore, layered feedforward neural networks as described above are also often referred to as 'multilayer perceptrons'.

In a layered network, the function is computed sequentially, assigning the value of the argument to the input layer, then calculating the activation level of the successive layers as described later, until the output layer is reached. The output of the function computed by the network is the activation value of the output unit. All units in a layer are updated simultaneously and all the layers are updated sequentially, based on the output of the previous layer. The units of layer $l$ calculate their output values $y_l$ by a linear combination of the values at the previous layer, $y_{l-1}$, followed by a nonlinear transformation $f: \mathbb{R} \to \mathbb{R}$, as follows:

$$y_l = f(W_l\, y_{l-1})$$

where $W_l$ is the edge weight matrix between layer $l-1$ and layer $l$, and where $f$ is called the transfer function. A common choice for this transfer function is the logistic function:

$$f(z) = \frac{1}{1 + e^{-z}}$$

Notice that each neural network thus represents a class of nonlinear functions parameterised by the weights, whose values determine the input/output behaviour of the neural network. Training the network amounts to choosing the values of the weights automatically. For this, a (labelled) training dataset is needed and an error function for the performance of the network has to be fixed. Training a neural network can then be done by finding those weights that minimise the network's error on such samples (i.e. by fitting the network to the data).
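The layer-by-layer computation can be sketched in a few lines. This is an illustration under stated assumptions: weight matrices are plain nested lists with `weights[k][j]` linking unit j of the previous layer to unit k of the current one, no bias terms are included, and the function names are hypothetical.

```python
import math


def logistic(z):
    """Logistic transfer function f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_output(weights, previous):
    """y_l = f(W_l y_{l-1}), with f applied to each unit separately."""
    return [logistic(sum(w_kj * y_j for w_kj, y_j in zip(row, previous)))
            for row in weights]


def forward(weight_matrices, inputs):
    """Assign the inputs to the input layer, then update each layer in turn."""
    activation = list(inputs)
    for weights in weight_matrices:
        activation = layer_output(weights, activation)
    return activation


if __name__ == "__main__":
    # Hypothetical 2-3-1 network: one hidden layer with three units.
    w_hidden = [[0.5, -0.3], [0.1, 0.8], [-0.7, 0.2]]
    w_output = [[1.0, -1.0, 0.5]]
    print(forward([w_hidden, w_output], [1.0, 2.0]))
```

Passing a single weight matrix gives a network with no hidden layers, which corresponds to the perceptron case described above except that the logistic function replaces the threshold.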


More concretely, in the parameter space the error function evaluated on the training data translates to a cost function that associates each configuration of the edge weights with a given error on the training set. Such a function is typically nonconvex, so that it can be minimised only locally, which is often done by gradient descent. A technique known as 'backpropagation' provides a way to compute the necessary gradients efficiently, allowing the network to find a local minimum of the training error with respect to network weights. The fact that the training algorithm is thus only guaranteed to converge to a local minimum implies that the solution is affected by the initial estimate for the weights. This is one of the major problems of neural networks. Also problematic is the design of the architecture (e.g. the size and the number of hidden layers), often chosen as the result of trial and error. Some such problems have been overcome by the introduction of the related method of support vector machines. Other types of network arise from different design choices. For example, radial basis function networks use a different transfer function; Kohonen networks are used for clustering problems; Hopfield networks are used for combinatorial optimisation problems. Different training methods also exist. NC/TDB

[See also CLUSTER ANALYSIS IN MEDICINE]

Baldi, P. 1991: Bioinformatics: the machine learning approach. Cambridge, MA: MIT Press. Bishop, C. 1996: Neural networks for pattern recognition. Oxford: Oxford University Press. Cristianini, N. and Shawe-Taylor, J. 2000: An introduction to support vector machines. Cambridge: Cambridge University Press. Mitchell, T. 1995: Machine learning. McGraw-Hill.

NMB

Abbreviation for net monetary benefit. See COST-EFFECTIVENESS ANALYSIS

N-of-1 trials

An N-of-1 (or single-patient) trial combines clinical practice with the well-established methodology of the RANDOMISED CONTROLLED TRIAL to compare the effectiveness of two or more treatment options within an individual. The N-of-1 trial offers a design that facilitates identification of responders and nonresponders to treatment and subsequent determination of optimum therapy for the individual. Indeed, within the context of the hierarchy of evidence-based study designs, it has recently been suggested that N-of-1 trials deliver the highest strength of evidence for making individual patient treatment decisions (Guyatt et al., 2000).

In clinical practice, the clinician commonly performs a 'therapeutic trial' or 'trial of therapy', in which the individual patient receives a treatment and the subsequent clinical course determines whether treatment is judged effective and is continued. Such an approach has serious potential BIASES due to the PLACEBO effect, the natural history of the condition and the urge of the patient and clinician not to disappoint one another. The methodology of the N-of-1 trial at least partially overcomes some of these potential biases.

N-of-1 trials generally compare a single new therapy with a current standard therapy or a placebo. However, as with traditional randomised controlled trials, it is also possible to compare more than two treatment options. In an N-of-1 trial the individual serves as his or her own control, receiving all treatments under investigation. Ideally, such a trial is conducted as a double-blind (both the individual and outcome assessor blind to allocated treatment in any treatment period) multicrossover trial with three or more periods for each treatment. Repeated alternations between treatment periods with the new intervention and the control ensure several comparisons between the treatments. The trial design will, however, be tailored to the clinical entity and therapies involved.

The time commitment of such trials by both patients and health professionals is considerable. N-of-1 trials rely on cooperation between individual clinicians and patients. Hence, the patient's (and clinician's) commitment to the trial is essential for it to reach fruition. The duration of an N-of-1 trial will largely depend on the nature of the condition and the treatments under investigation, but is likely to continue for between several weeks and several months if not far longer. Hence, such trials are only effective for chronic and stable conditions where the natural history of the condition is unlikely to change dramatically over the course of the trial. Examples of their use in different clinical areas include osteoarthritis, gastroesophageal reflux disease, attention deficit hyperactivity disorder and chronic airflow limitation, among others (March et al., 1994). One problem encountered in N-of-1 trials, as in CROSSOVER TRIALS, is carry-over effects of treatment, which may reduce the estimated treatment effect.
The therapies under investigation should therefore have a rapid onset and cessation of effect that will help to minimise any carry-over effects. In addition, a washout period between treatments can be incorporated into the trial or a run-in period, where the first few days on each treatment are not evaluated. Because of the expense and time involved, it is important to determine at the outset whether an N-of-1 trial is really indicated for an individual, i.e. is the effectiveness of treatment in doubt for this specific individual? Full criteria (summarised earlier) that should be satisfied before an N-of-1 trial is commenced are provided by Guyatt et al. (1988).

When the individual patient and clinician are in agreement that an N-of-1 trial is justifiable, this design provides the additional opportunity to measure the symptoms that matter to the individual concerned. In addition to standardised and validated disease-specific and generic outcome measures, the individual is asked to identify their most troubling symptoms or problems associated with the illness that are important in their everyday lives. These then form the basis of a self-administered diary or questionnaire. It may be a daily diary or weekly summary depending on symptoms and treatment duration, but where possible several separate measurements should be taken within each treatment period. The opportunity to measure the symptoms that matter to the individual is a unique feature of N-of-1 trials.

In a classical randomised controlled trial the lowest experimental unit is the individual; in an N-of-1 trial it is the treatment period. Therefore the sample size in an N-of-1 trial is the number of treatment periods applied. Sample size or POWER calculations used with classical randomised controlled trials can also be used for N-of-1 trials. However, they make certain assumptions concerning the independence of data from each treatment period, which may not be reasonable. While a large number of treatment periods would increase the statistical power, the natural course of the clinical entity, therapy characteristics and patient compliance will generally put an upper limit on this number and thus statistical power will generally remain low.

Random assignment of subjects to treatments in classical randomised controlled trials is essential in order to obtain comparable groups with respect to explanatory and confounding variables. Correspondingly, random assignment of treatments to treatment periods is essential in N-of-1 trials. Once the number of treatment periods has been determined there are a number of ways of randomising the treatments to periods. The most recommended design when comparing two treatments (as is most commonly the case) is random allocation within pairs of treatment periods. For example, for the comparison of treatment A versus treatment B during eight treatment periods, the following RANDOMISATION schedule might be generated: AB AB BA AB. This approach avoids the possibility of several consecutive treatment periods with the same treatment.
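Random allocation within pairs of treatment periods, as described above, can be sketched as follows. This is an illustrative sketch only: the function name `paired_schedule` and the seed value are hypothetical, and a real trial would use its own documented randomisation procedure.

```python
import random

def paired_schedule(n_pairs, treatments=("A", "B"), seed=None):
    """Randomise the order of two treatments within each pair of periods.

    Returns a list such as ["AB", "BA", ...]; every pair contains each
    treatment exactly once, so no treatment can occupy more than two
    consecutive periods.
    """
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_pairs):
        pair = list(treatments)
        rng.shuffle(pair)  # random order within this pair of periods
        schedule.append("".join(pair))
    return schedule

# Eight treatment periods correspond to four pairs.
schedule = paired_schedule(4, seed=2011)
print(" ".join(schedule))
```

Because the randomisation is restricted to the order within each pair, the longest possible run of one treatment is two periods (for example the B periods in AB BA).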
In terms of the analyses of an N-of-1 trial an important first approach is to plot the data and examine the results visually. The more theoretical methods depend heavily on the type of randomisation used. When the paired design is employed, the simplest approach may be to perform a SIGN TEST, which examines the LIKELIHOOD of the individual preferring the same treatment within each pair of treatment periods. However, this does not assess the strength of the treatment effect, only the direction of it. A more powerful alternative is the STUDENT'S t-TEST (either paired or unpaired depending on randomisation). For such analyses the paired design is again preferable since it goes some way to reduce the impact of AUTOCORRELATION (such statistical tests assume that observations from one treatment period to the next will be independent). Recording several measurements within each treatment period and comparing averages across the periods can reduce this problem further. Parametric tests also make the assumption of normality and nonparametric tests may alternatively be used. In addition, BAYESIAN METHODS are available (Zucker et al., 1997) for combining information from a series of N-of-1 trials. When an individual's N-of-1 trial has been completed the results will be summarised and disseminated during a feedback session between the clinician and patient to inform future treatment. SB

Guyatt, G., Sackett, D., Adachi, J., Roberts, R., [...]

[...] $= \Pr(x_i'\beta + \varepsilon_i > 0) = \Pr(\varepsilon_i > -x_i'\beta) = \Pr(\varepsilon_i \le x_i'\beta) = \Phi(x_i'\beta)$

where the penultimate equality hinges on the symmetry of the normal density of $\varepsilon_i$. Consider an ordinal response variable with $S + 1$ categories $0, \ldots, S$. An ordinal probit model can be specified using

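[Equation: piecewise definition of the ordinal response $y_i$ in terms of the latent variable $y_i^*$; not recoverable from the source]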

quantile-quantile (Q-Q) plots

See PROBABILITY PLOTS

quantile regression

This is a statistical regression method that models any specified QUANTILE (e.g. MEDIAN, first quartile, 90th percentile) of a continuous dependent variable given a set of EXPLANATORY VARIABLES. It is analogous to linear regression (see MULTIPLE LINEAR REGRESSION), which models the MEAN of the dependent variable instead. When applied to the median, quantile regression is known as median regression. Quantile regression has several appealing features, some of which are illustrated in the following four examples. Although inspired by real-life applications, all the examples presented are fictitious. In them, descriptions and interpretations are kept as concise as possible and may occasionally be simplistic. It is hoped that they may nevertheless facilitate understanding of the prominent features of quantile regression.

Example 1. Quantiles are of substantive interest. Forced vital capacity (FVC) measures the total volume of air one can exhale after a deep inhalation and is commonly used along with other indexes to evaluate lung function. Lower values of FVC may be indicative of some pulmonary disorder. In clinical practice it is of great interest to compare an individual's observed measure with reference values of normal. FVC is known to change physiologically along with age, sex and height. Reference values should therefore be age-, sex- and height-specific. The first figure shows a SCATTERPLOT of FVC against age measured on healthy, nonsmoking men. The lines depict the 5th percentile estimated by quantile regression (solid) and mean FVC by LINEAR REGRESSION (dashed). The lines are estimated for men of one fixed height. The 5th percentile line can be interpreted as follows: at any given age, 95% of healthy men of that height are expected to measure above the line. Observed FVC measures that fall below are typically considered subnormal. Note that mean FVC estimated by linear regression is hardly of any interest in this context.
FVC measures of perfectly healthy individuals are expected to fall above and below the mean line. Insofar as they are not too low, they should raise no suspicion. Quantile regression may estimate other percentiles of normal (e.g. the 1st, the 10th). Quantiles are of research interest in many other settings, which include, for instance, a median lethal dose in toxicology, percentiles of seawater concentration of chemicals in environmental studies, median survival time in CLINICAL TRIALS and 90th percentile of the time from an emergency call to admission in a hospital in emergency medicine.

quantile regression Scatterplot of forced vital capacity against age (years) measured in 1000 fictitious individuals with the 5th percentile estimated by quantile regression (solid line) and the mean estimated by linear regression (dashed line)

Example 2. Quantiles provide insight. Body mass index (BMI = weight/height-squared, in kg/m²) is often used when studying obesity. The second figure shows a scatterplot of BMI against age in sedentary children (left-hand panel) and in children on a physical activity programme (right-hand panel). The solid lines in each panel represent from bottom to top the estimated 5th, 25th, 50th, 75th and 95th percentiles. At 10 years of age the distributions of BMI values in the two groups look similar. With ageing, however, the two distributions separate. The larger BMI values are estimated to grow higher in the sedentary population than in the active population. However, the lower 50% of the BMI values in the two populations seem not to be conspicuously impacted by a sedentary lifestyle. Indeed, the slopes estimated by quantile regression do not differ significantly between the two groups for any percentile below the median. Linear regression (not shown) would provide estimates for the slopes of mean BMI. They would show a diluted, average effect, which could allow but a partial understanding of the complex impact of the physical activity programme on BMI.

quantile regression Scatterplot of body mass index against age (years) in 500 fictitious sedentary children (left-hand panel) and 500 fictitious children on a physical activity programme (right-hand panel) with the 5th, 25th, 50th, 75th and 95th percentiles estimated by quantile regression (lines bottom to top)

Example 3. Quantiles allow variable transformation. Uranium is a naturally occurring alpha-emitting radionuclide and a toxic heavy metallic element with carcinogenic potential. Groundwater concentrations of uranium are measured in the vicinity of a polluting source. The third figure shows the scatterplot of uranium concentrations (left-hand panel) and the logarithm transform of uranium (right-hand panel) against distance (miles) from the source. The solid lines represent the 5th, 50th and 95th percentiles of uranium and the dashed line its mean. Modelling the relationship between uranium and distance is simpler on the logarithm scale, where it is approximately linear, than on the untransformed scale. Quantile regression allows transformation of the dependent variable. The quantiles of uranium are estimated on the logarithmic scale and then transformed back to the untransformed scale. In the untransformed scale the estimated quantile curves are thus constrained to be positive, which is clearly desirable. In general, in linear regression the dependent variable should not be transformed, despite this being common practice. Inference on transformed outcome would carry no information about untransformed outcome, unless strong distributional assumptions were made. A direct application of linear regression to untransformed uranium, however, produces nonsensical negative estimates of mean uranium. The nonlinear relationship between mean uranium and distance should instead be modelled with some other appropriate method (e.g. splines - see SCATTERPLOT SMOOTHERS - and nonlinear regression methods). Even then, however, inference about the mean only may be unsatisfactory. In the presence of skewed distributions the mean may be highly affected by few unusually large values. Inference about a set of quantiles with quantile regression generally permits more complete inference.

Example 4. Quantiles are robust to outliers and measurement error. Sample data may sometimes contain unusually large or unusually small values, often referred to as OUTLIERS. Outliers may occur because they are present in the population from where the sample is drawn or because of measurement errors. Both cases are extremely frequent in real applications. When outliers are present, the median may be a better summary statistic than the mean to assess the location of a distribution because, unlike the mean, it is largely unaffected by them. When the distribution of some variable given the independent variables has unusually large or small values, the median may be more efficient than the mean, in that it has more POWER and gives narrower CONFIDENCE INTERVALS. If, for example, the distribution is normal, then the median is less efficient than the mean; if it is exponential they are equally efficient; if it is a STUDENT'S t-DISTRIBUTION with 3 DEGREES OF FREEDOM the median is more efficient. The robustness to outliers and [...]
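The robustness of the median discussed in Example 4 can be illustrated with a small sketch. It relies on a standard fact that is not stated in the entry itself: a sample quantile minimises the asymmetric absolute ("check") loss, which is the criterion quantile regression minimises when there are no explanatory variables. The data values and function names below are hypothetical.

```python
def check_loss(residual, tau):
    """Asymmetric absolute loss for quantile level tau (0 < tau < 1)."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def quantile_by_check_loss(values, tau):
    """Intercept-only 'quantile regression': pick the observed value that
    minimises the total check loss (a minimiser always lies at a data point)."""
    return min(values, key=lambda q: sum(check_loss(v - q, tau) for v in values))

def mean(values):
    return sum(values) / len(values)

# Hypothetical measurements; the second sample replaces the largest
# value by a gross measurement error.
clean = [2.1, 2.4, 2.5, 2.7, 3.0, 3.1, 3.3, 3.6, 3.8]
contaminated = clean[:-1] + [38.0]

median_clean = quantile_by_check_loss(clean, 0.5)
median_contaminated = quantile_by_check_loss(contaminated, 0.5)
print(median_clean, median_contaminated)
print(mean(clean), mean(contaminated))
```

In this sketch the median is the same in both samples, whereas the mean more than doubles when the single erroneous value is introduced.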

ranks, ranking procedures

Nonparametric statistical methods (see NONPARAMETRIC METHODS - AN OVERVIEW) are useful for all types of data, while parametric statistical methods are applicable to quantitative data that meet the criteria of being normally distributed (see NORMAL DISTRIBUTION) or other known PROBABILITY DISTRIBUTIONS. A common approach of nonparametric statistical methods is to transform data to ranks. A ranking of $n$ ordered observations is a set of numerical ranks $[1, 2, \ldots, n]$ that will represent the observations in statistical analyses. The rank sum is $\frac{1}{2}n(n + 1)$ and the mean rank is $\frac{1}{2}(n + 1)$, where $n$ is the number of observations. Assessments on rating scales with a limited number of possible categories imply that groups of observations will share the same category, and these observations will share the same rank value, which often is the MEAN of the ranks that belong to the group of observations, so-called tied ranks. The calculations of the MANN-WHITNEY RANK SUM TEST of the difference between two independent groups of data and of SPEARMAN'S RANK CORRELATION COEFFICIENT are based on this type of rank transformation (Siegel and Castellan, 1988; Gibbons and Chakraborti, 2003).

The first figure shows the frequency distribution of pairs of data from psychiatric assessments of the severity of fatigue and the level of concentration difficulties in 43 patients. The assessments of the two variables were made on rating scales having five ordered categories denoted F1, ..., F5 and C1, ..., C5, where F5 and C5 represent lack of symptom or difficulty. According to the frequency distribution of assessments on the fatigue scale 16 patients were judged to the category F1, which represents the most severe level of fatigue, and these will share the ranks from 1 to 16, the mean rank being 8.5. The seven patients in the category F2 will share the ranks 17 to 23, the mean rank being $16 + \frac{1}{2}(7 + 1) = 20$. The mean ranks of the two sets of distributions are shown in this figure. The pairs of cell frequencies are replaced by pairs of ranks when the relationship between the severity of fatigue and concentration difficulty is calculated by Spearman's rank correlation coefficient. This means that the observation (F2, C1) will get the pair of tied ranks (20; 5.5) and the three observations (F2, C2) (20; 13.5), and so on. Spearman's rank correlation coefficient, when adjusted for tied observations, is 0.75.

              F1     F2     F3     F4     F5    Total   Tied rank
  C5           1             4            11      16      35.5
  C4           2      1      2             1       6      24.5
  C3           2      2      1                     5      19
  C2           3      3                            6      13.5
  C1           8      1      1                    10       5.5
  Total       16      7      8      0     12      43
  Tied rank    8.5   20     27.5          37.5

ranks, ranking procedures The frequency distribution of pairs of data from psychiatric assessments of the severity of fatigue and the level of concentration difficulties in 43 patients. The rating scales have five ordered categories denoted F1, ..., F5 and C1, ..., C5, where F5 and C5 represent a lack of symptom or difficulty. The two sets of marginal distributions and the tied rank values of the marginal frequencies are given

A bivariate ranking approach developed for analysis of paired ordinal data regarding agreement and disagreement that takes account of the information given by the pairs of data is suggested by Svensson (1997). In this augmented ranking approach (aug-rank), the ranks are tied to the pairs of data, which means to the observations in the cells of a square CONTINGENCY TABLE or to the points of a SCATTERPLOT of data from VISUAL ANALOGUE SCALE (VAS) assessments. This means that the augmented rank of the assessments X depends on the pairing with Y. The second figure part (a) shows the paired distribution of 50 assessments made by two raters labelled X and Y. The three individuals categorised A by rater X are found in the cells (A;A), (A;A) and (A;B), which means that rater Y has assessed one of these individuals to a higher category than has rater X and this individual will therefore be given a higher aug-rank X-value. The aug-rank X-values of these three pairs are therefore 1.5, 1.5 and 3 respectively (see part (b)). According to rater Y, 14 individuals are categorised A, but (2, 8, 3, 1) of them are categorised A, B, C, D respectively by X, and therefore the aug-rank Y-values of these four groups of individuals will differ (see part (b)).
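The tied (mean) ranks and the tie-adjusted Spearman coefficient quoted for the fatigue and concentration assessments can be checked with a short sketch. The cell frequencies below are read from the first figure of this entry and should be treated as an assumption, since the printed table is partly illegible; categories F1-F5 and C1-C5 are coded 1-5, and all function names are hypothetical. With ties, Spearman's coefficient is computed here as the Pearson correlation of the tied ranks.

```python
import math

# Cell frequencies keyed by (fatigue category, concentration category),
# as read from the first figure (an assumption; see the lead-in).
counts = {
    (1, 1): 8, (1, 2): 3, (1, 3): 2, (1, 4): 2, (1, 5): 1,
    (2, 1): 1, (2, 2): 3, (2, 3): 2, (2, 4): 1,
    (3, 1): 1, (3, 3): 1, (3, 4): 2, (3, 5): 4,
    (5, 4): 1, (5, 5): 11,
}

def tied_ranks(values):
    """Give every observation the mean of the ranks shared by its tie group."""
    first, last = {}, {}
    for position, value in enumerate(sorted(values), start=1):
        first.setdefault(value, position)
        last[value] = position
    return [(first[v] + last[v]) / 2 for v in values]

def pearson(u, v):
    """Pearson correlation of two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

# Expand the table into one pair of category codes per patient.
fatigue = [f for (f, c), n in counts.items() for _ in range(n)]
concentration = [c for (f, c), n in counts.items() for _ in range(n)]

fatigue_ranks = tied_ranks(fatigue)
concentration_ranks = tied_ranks(concentration)
rho = pearson(fatigue_ranks, concentration_ranks)
print(round(rho, 2))
```

The first patient in category F1 receives the tied rank 8.5 and every patient in F2 receives 20, in line with the text.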

(a)
                          Rater X
  Rater Y        A       B       C       D
     D                           1       1
     C                   2       2      14
     B           1       1      11       3
     A           2       8       3       1

ranks, ranking procedures (a) The paired distribution of interrater assessments of 50 individuals by a four-point scale (A, B, C, D). The agreement diagonal is marked

(b) (Aug-rank-X; Aug-rank-Y)
                          Rater X
  Rater Y        A             B              C              D
     D                                        (31; 49)       (50; 50)
     C                         (13.5; 31.5)   (29.5; 33.5)   (42.5; 41.5)
     B           (3; 15)       (12; 16)       (23; 22)       (34; 29)
     A           (1.5; 1.5)    (7.5; 6.5)     (16; 12)       (32; 14)

ranks, ranking procedures (b) The paired distribution of aug-ranks of the frequency distribution of paired assessments X and Y of (a)

This aug-rank approach to taking account of information from the pairs of ordered categorical assessments when replacing paired ordinal data with pairs of aug-ranks makes it possible to identify and separately analyse a possible systematic component of observed disagreement from the occasional, noise, variability (see AGREEMENT). A complete agreement in all pairs of aug-ranks defines the rank-transformable pattern of agreement (RTPA), which is uniquely related to the two marginal distributions. The RTPA is the distribution of pairs that is expected when the observed disagreement is completely explained by a systematic disagreement and in the case of complete agreement. ES
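The augmented ranking of the paired assessments by raters X and Y can be sketched as follows. The cell frequencies are those of part (a) of the second figure as reconstructed from the aug-rank values in part (b), so they are an assumption rather than a verbatim copy of the printed table; the function name is hypothetical. Pairs are ordered by one rater's category and, within that category, by the other rater's category, and each cell receives the mean of the ranks it occupies.

```python
# Cell frequencies keyed by (rater X category, rater Y category);
# reconstructed from the second figure (an assumption; see the lead-in).
counts = {
    ("A", "A"): 2, ("A", "B"): 1,
    ("B", "A"): 8, ("B", "B"): 1, ("B", "C"): 2,
    ("C", "A"): 3, ("C", "B"): 11, ("C", "C"): 2, ("C", "D"): 1,
    ("D", "A"): 1, ("D", "B"): 3, ("D", "C"): 14, ("D", "D"): 1,
}

def aug_ranks(counts, primary):
    """Augmented mean rank of every cell for the rater in position
    `primary` (0 for X, 1 for Y).

    Cells are sorted by the primary rater's category first and by the
    other rater's category second; a cell holding n pairs that follow
    `used` earlier pairs gets the mean rank used + (n + 1) / 2.
    """
    other = 1 - primary
    ordered = sorted(counts, key=lambda cell: (cell[primary], cell[other]))
    ranks, used = {}, 0
    for cell in ordered:
        n = counts[cell]
        ranks[cell] = used + (n + 1) / 2
        used += n
    return ranks

aug_x = aug_ranks(counts, primary=0)
aug_y = aug_ranks(counts, primary=1)

for cell in sorted(counts):
    print(cell, (aug_x[cell], aug_y[cell]))
```

For the cell in which X assigns A and Y assigns B, the sketch gives the pair (3; 15), matching part (b) of the figure.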

[See also VALIDITY OF SCALES]

Gibbons, J. D. and Chakraborti, S. 2003: Nonparametric statistical inference, 4th edition, revised and expanded. New York: Marcel Dekker. Siegel, S. and Castellan, N. J. 1988: Nonparametric statistics for the behavioral sciences, 2nd edition. New York: McGraw-Hill. Svensson, E. 1997: A coefficient of agreement adjusted for bias in paired ordered categorical data. Biometrical Journal 39, 643-57.

reading the medical literature

See CRITICAL APPRAISAL

recall bias

This is a BIAS that can occur in CASE-CONTROL STUDIES if cases are more likely to recall having been exposed than are controls. In some case-control studies retrospective data on exposure are obtained from historical records. However, in most situations such data are not available and data are instead obtained by interviewing cases and controls (or their relatives). When this is done there is a chance that cases may be more likely to remember having been exposed than are controls. Even where there is no genuine difference in frequency of exposure between cases and controls, this differential recall may cause an apparent difference, so that the exposure appears to be associated with disease. For example, in a case-control study of congenital malformations, mothers are asked about prior exposures to infectious diseases, drugs, environmental pollutants, etc. It is quite plausible that a mother who has given birth to a malformed child will be more interested in the study and make more effort to remember instances of past exposure. It is also possible for recall bias to operate in the opposite direction: e.g. if, through shame, cases were less likely than controls to admit exposure. (For further details see Hennekens and Buring, 1987, and Rothman and Greenland, 1998.) SRS

[See also BIAS IN OBSERVATIONAL STUDIES]

Hennekens, C. H. and Buring, J. E. 1987: Epidemiology in medicine. New York: Little, Brown and Company. Rothman, K. J. and Greenland, S. 1998: Modern epidemiology, 2nd edition. Philadelphia: Lippincott-Raven Publishers.

receiver operating characteristic (ROC) curve

Diagnostic testing plays an increasingly important role in modern medicine and the ROC curve is a common graphical tool for displaying the discriminatory ability of a diagnostic marker (test) in distinguishing between diseased and healthy subjects. The outcome of a diagnostic test can be dichotomous (positive, negative), ordinal (e.g. normal, questionable, abnormal) or continuous (e.g. PSA measurements). The ROC curve arises only for ordinal and continuous outcomes. A diagnostic marker is generally evaluated by comparison to a definite gold standard procedure/test. Such gold standards are often complicated to conduct, intrusive, not sufficiently timely or expensive. This motivates the search for inexpensive, easily measurable and reliable alternatives. A subject is assessed as diseased or healthy depending on whether the corresponding marker value is above or below a given threshold. Associated with any threshold value are the PROBABILITY of a true positive (SENSITIVITY) and the probability of a true negative (SPECIFICITY). The ROC curve presents graphically the trade-off between sensitivity and specificity for every possible threshold value. By convention, the plot displays the sensitivity on the y axis and 1 - specificity on the x axis.
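The construction of an ROC curve from a continuous marker can be sketched as follows: each observed marker value is used in turn as the threshold, and the resulting sensitivity and specificity give one point of the curve. The marker values and function name are hypothetical, and the sketch assumes that larger marker values indicate disease, so a subject is called positive when the marker is at or above the threshold.

```python
def roc_points(diseased, healthy):
    """Return (1 - specificity, sensitivity) for every candidate threshold.

    A subject tests positive when the marker value is >= the threshold.
    The point (0, 0), corresponding to a threshold above every observed
    value, is appended so that the curve starts at the origin.
    """
    points = []
    for threshold in sorted(set(diseased) | set(healthy)):
        sensitivity = sum(v >= threshold for v in diseased) / len(diseased)
        specificity = sum(v < threshold for v in healthy) / len(healthy)
        points.append((1.0 - specificity, sensitivity))
    points.append((0.0, 0.0))
    return sorted(points)

# Hypothetical marker values for subjects classified by a gold standard.
diseased = [3.1, 4.2, 5.0, 6.3]
healthy = [1.0, 2.2, 2.8, 3.5]

for x, y in roc_points(diseased, healthy):
    print(f"1 - specificity = {x:.2f}, sensitivity = {y:.2f}")
```

With these hypothetical values a threshold of 3.1 classifies all diseased subjects as positive while misclassifying one of the four healthy subjects, which gives the point (0.25, 1.00).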

[...] b and c, are called knots. The number of knots can vary according to the amount of data available for fitting the function (see Harrell, 2001).

scatterplot smoothers A linear spline function with knots at a = 1, b = 3, c = 5

The linear spline is simple and can approximate some relationships, but it is not smooth and so will not fit highly curved functions well. The problem is overcome by using piecewise polynomials, in particular cubics, which have been found to have nice properties with good ability to fit a variety of complex relationships. The result is a cubic spline, which arises formally by seeking a smooth curve $g(x)$ to summarise the dependence of $y$ on $x$, which minimises the expression:

$$\sum_i \{y_i - g(x_i)\}^2 + \lambda \int g''(x)^2 \, dx$$

where $g''(x)$ is the second derivative of $g(x)$ with respect to $x$. Although when written formally this criterion looks a little formidable, it is really nothing more than an effort to govern the trade-off between the goodness-of-fit of the data (as measured by $\sum_i \{y_i - g(x_i)\}^2$) and the 'wiggliness' or departure of linearity of $g$ as measured by $\int g''(x)^2 \, dx$; for a linear function, this latter part would be zero. The parameter $\lambda$ governs the smoothness of $g$, with larger values resulting in a smoother curve. The solution is a cubic spline, i.e. a series of cubic polynomials joined at the unique observed values of the explanatory variable, $x_i$. (For more details, see Friedman, 1991.)

The 'effective number of parameters' (analogous to the number of parameters in a parametric fit) or DEGREES OF FREEDOM of a cubic spline smoother is generally used to specify its smoothness rather than $\lambda$ directly. A numerical search is then used to determine the value of $\lambda$ corresponding to the required degrees of freedom. The complexity of a cubic spline is approximately the same as a polynomial of degree one less than the degrees of freedom. However, the cubic spline smoother 'spreads out' its parameters in a more even way and hence is much more flexible than polynomial regression.

We shall illustrate the use of cubic splines by fitting such a curve to the monthly deaths from bronchitis, emphysema and asthma in the UK from 1974 to 1979 for men and women. A scatterplot of the data and the fitted cubic spline is shown in the third figure. For these data, locally weighted regression is not so successful in representing the data. The fourth figure shows a number of plots of the data with added locally weighted regression fits, again with different values of $\lambda$ and $\alpha$. Here the characteristic cyclical nature of the data is only picked up with $\lambda = 2$ and $\alpha = 0.25$. In the other three diagrams the amount of smoothing is too great to reveal the structure in the data. SSE

Cleveland, W. S. 1979: Robust locally weighted regression and smoothing scatterplots. Journal of the American Statistical Association 74, 829-36. Friedman, J. H. 1991: Multivariate adaptive regression splines. Annals of Statistics 19, 1-67. Harrell, F. E. 2001: Regression modeling strategies with applications to linear models, logistic regression and survival analysis. New York: Springer.
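A linear spline with knots at a = 1, b = 3 and c = 5, as in the figure for this entry, can be written as a straight line plus one 'hinge' term per knot, so that the slope changes at each knot while the function stays continuous. This truncated-line form is a common way of writing such a function and is an assumption here, since the defining equation is not legible in the source; the coefficient values and function names are hypothetical.

```python
def hinge(u):
    """Positive part (u)+ : u when u > 0, otherwise 0."""
    return u if u > 0 else 0.0

def linear_spline(x, coefs, knots=(1.0, 3.0, 5.0)):
    """Evaluate b0 + b1*x + sum_k b_k * (x - knot_k)+ .

    coefs = (b0, b1, b_a, b_b, b_c); each hinge coefficient is the change
    in slope at the corresponding knot.
    """
    b0, b1, *hinge_coefs = coefs
    value = b0 + b1 * x
    for b, knot in zip(hinge_coefs, knots):
        value += b * hinge(x - knot)
    return value

# Hypothetical coefficients: slopes 2, -1, 1.5 and 0.5 on the four segments.
coefs = (1.0, 2.0, -3.0, 2.5, -1.0)
for x in range(0, 7):
    print(x, linear_spline(float(x), coefs))
```

The fitted values change direction at each knot but have no jumps, which is why the function is continuous yet not smooth.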

[Figure: scatterplot of the monthly deaths (vertical axis, about 1500 to 3500) against month (horizontal axis, 0 to 60)]

scatterplot smoothers Locally weighted regression fits for monthly deaths from bronchitis in the UK 1974-1979. Panels: (a) degree 1, smoothness 0.25; (b) degree 1; (c) degree 2, smoothness 0.25; (d) degree 2, smoothness 0.75

scree plot  This is a SCATTERPLOT of the VARIANCES of the factors in a FACTOR ANALYSIS or the components in a PRINCIPAL COMPONENT ANALYSIS against their RANKS in terms of magnitude. The plot can be used to provide an informal estimate of the number of factors (components) by retaining as many factors (components) as there are variances that fall before the last large drop on the plot. An example of such a plot that suggests three factors is shown in the figure. Other examples are given in Preacher and MacCallum (2003). BSE
[See KAISER'S RULE]

[Figure: scree plot. A scree plot for a principal components analysis of a correlation matrix of 10 observed variables. Horizontal axis: Component number.]

Preacher, K. J. and MacCallum, R. C. 2003: Repairing Tom Swift's electric factor analysis machine. Understanding Statistics 2, 13-44.

screening studies  These are planned investigations to determine the effect of administering a diagnostic test to detect the presence or absence of preclinical disease in asymptomatic individuals. The encounter is initiated by the health professional rather than by the patient since no clinical symptoms are apparent that otherwise would drive the patient to seek medical diagnosis. The aim of screening is to separate the population into two groups: those with a high versus low probability of the given disorder, usually one that is perceived to be a serious public health condition. Implicit to screening is the existence of a clearly recognisable outcome that is indicative of preclinical disease and the assumption that early diagnosis is beneficial in some way, such as better prognosis, earlier treatment, less invasive medical procedures, higher 'quality of life' or reduced chances of mortality. Examples of diagnostic tests used in screening: 1. mammography, to detect preclinical breast cancer disease in women; 2. a blood test to measure prostate specific antigen (PSA), as high levels in men are believed to be associated with preclinical disease of prostate cancer; 3. blood pressure and cholesterol levels, as high levels of both are associated with cardiac disease.

Screening tests are not without costs (cost of examination; costs of false positive results arising from follow-up laboratory procedures; costs of false negative results arising from false hope of disease-free status). Screening studies are designed to quantify: the nature of the benefit (e.g. reduction in mortality, extended survival time, measures of quality of life); the target population that is expected to benefit from screening (in terms of age/gender/ethnic groups); and the error rates (false positives and false negatives). The FALSE POSITIVE RATE is the probability that the test asserts 'disease' when, in fact, no disease is present (conversely, SPECIFICITY, or the probability of obtaining a negative result when disease is absent); the FALSE NEGATIVE RATE is the probability that the test asserts 'no disease' when in fact disease is present (conversely, SENSITIVITY, or the probability of obtaining a positive result when disease is present). For diagnostic purposes, sensitivity should be high, while, for screening purposes, specificity should be high, to avoid unnecessary follow-up testing of disease-free individuals. Screening tests are indicated when the benefits are judged to outweigh the potential drawbacks (costs, risks of false positives and false negatives, etc.).

Because nonrandomised trials are subject to self-SELECTION BIAS, randomised screening trials offer the best and most reliable mechanism for evaluating the potential benefit from screening. In randomised screening trials, study arm participants are offered screening at regular intervals and the control arm participants follow their 'usual medical care'. Due to the cost of screening, such trials are usually conducted using a 'stop-screen design', in which screening is offered for a limited time only (e.g. annual screens for 3 to 5 years).

Several important differences between screening studies, used to evaluate the potential benefit of a screening intervention, and CLINICAL TRIALS, for the evaluation of a specific therapeutic intervention, make the design and analysis of screening studies very challenging. In clinical treatment trials, the cases are specified in the protocol to be comparable in both the study and control arms of the trial; in screening trials, participants are initially asymptomatic and are randomised to the study ('offered screening') or control ('follow usual medical care') arms of the trial and cases evolve as the study progresses. If the screening test is successful, then cases will arise sooner in the study arm than in the control arm, so survival times (time of ENDPOINT minus time of diagnosis) will be longer in the study arm than in the control arm,
even in the absence of a screening benefit. This BIAS in the evaluation of screening is known as 'lead time bias'. Also, because cases with longer pre-clinical disease durations are more likely to be detected by screening than cases with shorter pre-clinical durations, the cases that arise in the study arm of a screening trial are more likely to be less aggressive and hence have a more favourable prognosis, even in the absence of a screening benefit. This phenomenon is known as LENGTH-BIASED SAMPLING. Study arm participants also experience 'overdiagnosis bias', or the tendency of the screening test to suggest apparent but really nonthreatening disease. (In an ideal world, this bias would not affect the results if further diagnostic tests later eliminate these individuals as cases of disease.) Finally, noncompliance in both arms is inevitable: some participants in the study arm may refuse screening, while some in the control arm may seek screening. Thus, the cases that arise in the two arms of a screening trial may not be comparable as they are in a treatment trial. RANDOMISATION ensures that the participant characteristics are the same in both arms, including those that lead to noncompliance of either type in either arm, allowing for an INTENTION-TO-TREAT analysis (Byar et al., 1976).

The most common measures used to evaluate screening are reduction in mortality (comparison of death rates) and mean benefit time (difference in the MEAN survival time between the time of entry into the trial and the case endpoint). Randomisation ensures: 1. that the participant characteristics are the same in the two trial arms, including those that lead to noncompliance of either type in either arm, and 2. the elimination of bias due to lead time, when survival is measured from the time of entry into trial. Statistical methods to estimate the benefit (reduction in mortality or extended survival time), lead time and the effect of length-biased sampling have been proposed. For overviews of the issues related to screening and for statistical methodology of design and analysis of screening studies, see Zelen and Feinleib (1969), Zelen (1976), Goldberg and Wittes (1981), Prorok and Connor (1986), Gastwirth (1987), Shapiro et al. (1988), Prorok, Connor and Baker (1990), Connor and Prorok (1994), Kafadar and Prorok (1994, 1996, 2003, 2005) and Baker, Kramer and Prorok (2002).

Screening studies are also used to evaluate the outcomes of designed trials to screen drug compounds for their potential to be biologically active. A typical drug screening protocol may involve several stages based on the response of the compound to various reactions: e.g. 'Conduct experiment 1: if the energy from the reaction is less than a specified level, reject the compound; otherwise, conduct experiment 2: if the second reaction is less than a second specified level, reject; otherwise, submit the compound for further testing.' The evaluation of such drug screening designs involves the same kind of consideration as the evaluation of randomised screening trials used on human subjects, described earlier. See Roseberry and Gehan (1964) and Schultz et al. (1973) for designs and analysis of drug screening trials, as well as related articles in the literature on screening designs to detect unacceptable products in manufacturing. KKa
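The lead time bias described above can be illustrated with a small simulation. This is a minimal sketch under assumed values, not a method taken from the works cited in this entry: every case has a time of death that screening does not change, and screening merely advances diagnosis by a hypothetical fixed lead time. The function name, the exponential survival distribution and all numeric settings are illustrative assumptions.

```python
import random

def mean_survival_from_diagnosis(n_cases=10_000, lead_time=2.0, seed=1):
    """Compare mean survival measured from diagnosis with and without screening.

    Assumption: screening advances diagnosis by `lead_time` years but leaves
    the time of death unchanged, i.e. there is no real screening benefit.
    """
    rng = random.Random(seed)
    control_total = 0.0
    screened_total = 0.0
    for _ in range(n_cases):
        # Survival after clinical (symptomatic) diagnosis, mean 5 years (assumed).
        survival_after_clinical_dx = rng.expovariate(1 / 5.0)
        control_total += survival_after_clinical_dx
        # Screen-detected cases are diagnosed earlier; the date of death is unchanged.
        screened_total += survival_after_clinical_dx + lead_time
    return control_total / n_cases, screened_total / n_cases

control_mean, screened_mean = mean_survival_from_diagnosis()
apparent_gain = screened_mean - control_mean
print(f"control arm: {control_mean:.2f} years from diagnosis")
print(f"study arm:   {screened_mean:.2f} years from diagnosis")
print(f"apparent gain: {apparent_gain:.2f} years, all of it lead time")
```

Under these assumptions the study arm appears to survive longer by exactly the lead time although nobody lives longer, which is why survival is better measured from the time of entry into the trial.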

Baker, S. G., Kramer, B. S. and Prorok, P. C. 2002: Statistical issues in randomised trials of cancer screening. BMC Medical Research Methodology 2, 11: www.biomedcentral.com/1471-2288/2/11.
Byar, D. P., Simon, R. M., Friedewald, W. T. et al. 1976: Randomized clinical trials: perspectives on some recent ideas. New England Journal of Medicine 295, 74-80.
Connor, R. J. and Prorok, P. C. 1994: Issues in the mortality analyses of randomized controlled trials of cancer screening. Controlled Clinical Trials 15, 81-99.
Gastwirth, J. L. 1987: The statistical precision of medical screening procedures. Statistical Science 2, 213-38.
Goldberg, J. D. and Wittes, J. T. 1981: The evaluation of medical screening procedures. The American Statistician 35, 4-11.
Kafadar, K. and Prorok, P. C. 1994: A data-analytic approach for estimating lead time and screening benefit based on survival curves in randomised trials. Statistics in Medicine 13, 569-86.
Kafadar, K. and Prorok, P. C. 1996: Computer simulation experiments of randomized screening trials. Computational Statistics and Data Analysis 23, 263-91.
Kafadar, K. and Prorok, P. C. 2003: Alternative definitions of comparable case groups and estimates of lead time and benefit time in randomized cancer screening trials. Statistics in Medicine 22, 83-111.
Kafadar, K. and Prorok, P. C. 2005: Computational methods in medical decision making: to screen or not to screen? Statistics in Medicine 24, 569-81.
Prorok, P. C. and Connor, R. J. 1986: Screening for the early detection of cancer. Cancer Investigation 4, 225-38.
Prorok, P. C., Connor, R. J. and Baker, S. G. 1990: Statistical considerations in cancer screening programs. Urologic Clinics of North America 17, 699-708.
Roseberry, T. D. and Gehan, E. A. 1964: Biometrics 20, 73-84.
Schultz, J. R., Nichol, F. R., Elfring, G. L. and Weed, S. D. 1973: Multiple-stage procedures for drug screening. Biometrics 29, 293-300.
Shapiro, S., Venet, W., Strax, P. and Venet, L. 1988: Periodic screening for breast cancer: the Health Insurance Plan project and its sequelae, 1963-1986. Baltimore: Johns Hopkins University Press.
Zelen, M. 1976: Theory of early detection of breast cancer in the general population. In Heuson, J. C., Mattheiem, W. H. and Rozencweig, M. (eds), Breast cancer: trends in research and treatment. New York: Raven Press, pp. 287-301.
Zelen, M. and Feinleib, M. 1969: On the theory of screening for chronic diseases. Biometrika 56, 601-14.

seamless Phase II/III trials  Traditional drug development follows several distinct phases of development through to registration. PHASE I TRIALS are usually followed by PHASE II TRIALS in order to choose the optimal dose, and then after some planning time PHASE III TRIALS are initiated. Although it is highly desirable to reduce the time between Phase II and Phase III there is usually a minimum time that teams want to spend planning Phase III. It should be an objective to reduce this time. One possible way of reducing this time is to carry out a seamless Phase II/III design. In this type of design the time between Phase II and Phase III is reduced and the two are combined into a single trial. An adaptive seamless trial is one in which the final analysis will use data from patients enrolled before and after the adaptation. The figure gives an illustration of the difference between a traditional Phase II/Phase III approach and a seamless learn/confirm approach.

[Figure: seamless Phase II/III trials. Comparison of the traditional Phase II/Phase III approach (top panel) and seamless learning/confirming (bottom panel). Top panel: Phase II with Doses A to D and placebo, followed after a gap by Phase III. Bottom panel: Stage A (learning) with Doses A to D and placebo, running directly into Stage B (confirming).]

There are certain considerations that need to be taken into account for a seamless design to be feasible. The most important is the time a patient needs to be followed to reach the ENDPOINT, which is to be used to make the dose selection. If the time to reach the endpoint is short in relation to patient recruitment then enrolment can continue while the decision is made and the number of overrunning patients, i.e. those randomised to doses not taken forward, will be minimised. However, if this time is long then the number of overrunning patients will be more significant and a seamless study less applicable. It is also advisable to use a well-established endpoint measure or surrogate marker (see SURROGATE ENDPOINTS) when implementing a seamless design. Where the goal of Phase II is to establish an endpoint for Phase III, it is unlikely that a seamless trial will be accepted.

There are also some logistical considerations that need to be taken into account, particularly around the drug supply and drug packaging. To this extent drug development programmes that do not have complicated or expensive regimens are more suited to seamless designs. Regulatory agencies are also likely to have many questions around the use of a seamless study and are likely to require a second confirmatory study. It is unlikely that a seamless Phase II/III study will be accepted as a single confirmatory study. One other consideration is that of maintaining the blind until all data are frozen at the end of the Phase III part. This then requires the use of an independent data monitoring committee to make the decision to switch to Phase III.

There are many methods of analysis that can be used in a seamless study. All methods must be seen to control the overall TYPE I ERROR, as this is a paramount requirement for the regulatory agencies and is not negotiable for a confirmatory study. Todd and Stallard (2005) consider a group sequential method (see INTERIM ANALYSIS) that incorporates a treatment selection based on a short-term endpoint, followed by a confirmatory phase that uses a longer term endpoint. Bauer and Kieser (1999) consider the use of P-VALUE combination tests in a seamless trial by combining the information before and after the adaptation. Inoue, Thall and Berry (2002) and Schmidli, Bretz and Racine-Poon (2007) consider Bayesian decision rules to decide which dose to take forward to the confirmatory phase. However, while Inoue, Thall and Berry (2002) use BAYESIAN METHODS in both the Phase II and Phase III parts of the study, Schmidli, Bretz and Racine-Poon (2007) use Bayesian methods for dose selection but traditional frequentist methods in Phase III, using a combination test to control the Type I error.

Whichever method is used, simulations need to be carried out to understand the operating characteristics of the design. Submission of these designs to regulatory agencies would require these simulations. A more thorough coverage of the considerations, operational aspects and examples is given in Maca et al. (2006). Sponsor representation in seamless designs is controversial, but in a seamless adaptive design there can be more motivation for sponsor participation in the dose decision process. AB

Bauer, P. and Kieser, M. 1999: Combining different phases in the development of medical treatments within a single trial. Statistics in Medicine 18, 1833-48.
Inoue, L. Y. T., Thall, P. F. and Berry, D. A. 2002: Seamlessly expanding a randomized Phase II trial to Phase III. Biometrics 58, 823-31.
Maca, J., Bhattacharya, S., Dragalin, V., Gallo, P. and Krams, M. 2006: Adaptive seamless Phase II/III designs - background, operational aspects and examples. Drug Information Journal 40, 463-73.
Schmidli, H., Bretz, F. and Racine-Poon, A. 2007: Bayesian predictive power for interim adaptation in seamless Phase II/III trials where the endpoint is survival up to some specified timepoint. Statistics in Medicine 26, 4925-38.
Todd, S. and Stallard, N. 2005: A new clinical trial design combining Phases II and III: sequential designs with treatment selection and a change of endpoint. Drug Information Journal 39, 109-18.
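The idea of a p-value combination test, mentioned in the entry on seamless Phase II/III trials, can be sketched with Fisher's product criterion, one well-known way of combining independent stage-wise p-values. The entry does not say which combination function any cited paper uses, so this is an illustration under that assumption only; the function name and the example p-values are hypothetical.

```python
import math

def fisher_combined_p(p1: float, p2: float) -> float:
    """Combine two independent stage-wise p-values with Fisher's product test.

    Under the null hypothesis, -2 * ln(p1 * p2) follows a chi-squared
    distribution with 4 degrees of freedom, whose upper tail probability has
    the closed form exp(-x / 2) * (1 + x / 2). Substituting
    x = -2 * ln(p1 * p2) gives q * (1 - ln(q)) with q = p1 * p2.
    """
    if not (0 < p1 <= 1 and 0 < p2 <= 1):
        raise ValueError("p-values must lie in (0, 1]")
    q = p1 * p2
    return q * (1 - math.log(q))

# Hypothetical p-values from the data before and after the adaptation.
p_stage1, p_stage2 = 0.10, 0.02
combined = fisher_combined_p(p_stage1, p_stage2)
print(f"combined p-value: {combined:.4f}")
```

Because each stage contributes only through its own p-value, the design of the second stage can depend on first-stage data without inflating the Type I error, provided the second-stage p-value remains valid under the null hypothesis.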

secondary endpoints  See ENDPOINTS

segregation analysis  The observation of characteristic segregation ratios among the offspring of particular parental crosses was first made by the Austrian monk Gregor Mendel (1822-1884) in his experiments on the garden pea. These observations enabled him to formulate a theory of genetic transmission from parent to offspring. Mendel studied discrete traits (called PHENOTYPES) in the garden pea (e.g. smooth versus wrinkled seed) and, after many generations of inbreeding, obtained pure lines (with uniform phenotype, e.g. all having smooth seeds, over many generations) for each trait. When two pure lines with different phenotypes (e.g. smooth and wrinkled seeds) were crossed, all the offspring (called the F1 generation) were of the same phenotype (e.g. all smooth). The trait that is uniformly present in the F1 generation is said to be dominant, while the absent alternative is said to be recessive. When F1 individuals were crossed with the recessive pure line (which is called a 'back-cross'), half the offspring had the dominant phenotype and the other half the recessive phenotype. When two F1 individuals were crossed (which is called an 'inter-cross'), three-quarters of the offspring had the dominant phenotype and one-quarter the recessive phenotype. These characteristic 1:1 and 3:1 ratios are called segregation ratios.

Segregation ratios are explained by the fact that each individual receives a complete set of genes from both parents, so that each gene is present in duplicate. When there are different forms of the same gene, each form (or allele) may correspond to a different phenotype, but when an individual has two different alleles (i.e. is heterozygous), the phenotype of one of the alleles (the recessive allele) is completely masked by the phenotype of the other allele (the dominant allele). Thus the F1 generation from two different pure (i.e. homozygous) lines will be all heterozygous and therefore display the dominant phenotype. A back-cross will result in half the offspring having the heterozygous genotype and the other half having the homozygous recessive genotype. An inter-cross will result in half the offspring being heterozygous, one-quarter being homozygous dominant and one-quarter homozygous recessive.

Classical segregation analysis is the examination of the offspring of different mating types to see if Mendelian segregation ratios are present. When such ratios are observed, the inference is made that the phenotype in question is determined by a single underlying genetic locus. Complex segregation analysis is a further development of this method for traits in which Mendelian segregation ratios may be masked by complexities such as the involvement of background genetic or environmental factors in addition to a locus of major effect. PS
(See also ALLELIC ASSOCIATION, GENETIC EPIDEMIOLOGY, GENETIC LINKAGE, GENOTYPE, PHENOTYPE)
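The 1:1 and 3:1 segregation ratios described above follow from enumerating the equally likely allele combinations at a single locus. The sketch below does this enumeration; the allele labels "A" (dominant) and "a" (recessive) and the function name are illustrative assumptions, not notation from this entry.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def offspring_phenotypes(parent1: str, parent2: str) -> dict:
    """Return phenotype proportions among offspring of a single-locus cross.

    Genotypes are two-letter strings such as "Aa": upper case "A" is the
    dominant allele and lower case "a" the recessive allele (assumed labels).
    Each parent transmits one of its two alleles with equal probability.
    """
    counts = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # The dominant phenotype shows if at least one dominant allele is present.
        phenotype = "dominant" if "A" in (allele1, allele2) else "recessive"
        counts[phenotype] += 1
    total = sum(counts.values())
    return {phenotype: Fraction(n, total) for phenotype, n in counts.items()}

f1 = offspring_phenotypes("AA", "aa")           # cross of two pure lines
back_cross = offspring_phenotypes("Aa", "aa")   # F1 crossed with recessive pure line
inter_cross = offspring_phenotypes("Aa", "Aa")  # F1 crossed with F1
print("F1:", f1)
print("back-cross:", back_cross)
print("inter-cross:", inter_cross)
```

Classical segregation analysis compares observed offspring proportions with expected proportions of this kind.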

selection bias  Systematic differences between those who are selected for study and those who are not selected, so that the selected sample is not representative of the target population. For example, in a survey of the smoking habits of 14 year olds, a convenient sampling frame would be children attending schools in a defined geographical area. However, not all 14 year olds will be included in this sampling frame and if the reasons for exclusion are associated with the smoking habit a biased estimate of the prevalence of smoking will be obtained. Another area where the choice of sampling frame might lead to BIAS is in telephone sampling, where households without telephones would be systematically excluded.

Even when an appropriate sampling frame is used for a survey, nonrandom sampling can lead to biased estimates. For example, in a study of overcrowding, an appropriate sampling frame might be all households in an electoral ward or postcode sector, listed in order of postal address. However, a systematic sample of every eighth household might overrepresent certain types of accommodation, such as flats on a particular floor (e.g. ground floor or top floor) in tenement blocks of eight. If the average number of people per household differs systematically between floors, this is likely to lead to a biased estimate of overcrowding. Ideally, probability sampling methods should be used to avoid selection bias in surveys (see SAMPLING METHODS - AN OVERVIEW).

One type of study that is almost never carried out on a random sample of the target population is a randomised CLINICAL TRIAL. Trials rely on random allocation to treatment groups for their internal validity but, because of tight eligibility criteria for patient selection, those in the trial may not be representative of all patients with the condition being treated.

Epidemiological studies, especially CASE-CONTROL STUDIES, are susceptible to selection bias. In case-control studies it can be extremely difficult to obtain a control group that is representative of all noncases in the same target population that the cases arise from. This can result in biased estimates of the ODDS RATIO in either direction, depending on the form of selection bias. These issues are discussed in detail in Sackett (1979) and Ellenberg (1994). Even in carefully designed OBSERVATIONAL STUDIES it can be difficult or impossible to rule out selection bias as a possible explanation for an observed association (Boydell et al., 2001). Selection bias can occur in many other contexts. For example, Kho et al. (2009) describe how it can result from the requirement for written informed consent in studies of medical records. WHG

Boydell, J., van Os, J., McKenzie, K. et al. 2001: Incidence of schizophrenia in ethnic minorities in London: ecological study into interactions with environment. British Medical Journal 323, 1336-8.
Ellenberg, J. H. 1994: Selection bias in observational and experimental studies. Statistics in Medicine 13, 557-67.
Kho, M. E., Duffett, M., Willison, D. J. et al. 2009: Written informed consent and selection bias in observational studies using medical records: systematic review. British Medical Journal 338, b866. DOI: 10.1136/bmj.b866.
Sackett, D. L. 1979: Bias in analytic research. Journal of Chronic Diseases 32, 51-63.
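The overcrowding example in the selection bias entry, in which every eighth household is sampled from tenement blocks of eight, can be made concrete with a toy sampling frame. All numbers below are hypothetical assumptions chosen only to show the mechanism: 50 blocks of eight flats listed in address order, with household size depending on position within the block.

```python
# Assumed household sizes by position within a block of eight flats:
# flats 1-2 on the ground floor, ..., flats 7-8 on the top floor.
SIZES_WITHIN_BLOCK = [5, 5, 4, 4, 3, 3, 2, 2]
N_BLOCKS = 50

frame = SIZES_WITHIN_BLOCK * N_BLOCKS  # household sizes in address order
true_mean = sum(frame) / len(frame)

def systematic_sample_mean(start: int, step: int = 8) -> float:
    """Mean household size when every `step`-th household is taken from `start`."""
    sample = frame[start::step]
    return sum(sample) / len(sample)

estimates = [systematic_sample_mean(start) for start in range(8)]
print(f"true mean household size: {true_mean}")
print(f"estimates by starting position: {estimates}")
```

Because the sampling interval equals the block size, each systematic sample contains flats from one position only, so under these assumptions no starting point recovers the true mean.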

sensitivity  This is a measure of how well an alternative test performs when it is compared with the reference or 'gold' standard test for the diagnosis of a condition. Sensitivity is the proportion of patients who are correctly identified by the test as having the condition out of all patients who have the condition. Sensitivity may also be expressed as a percentage and is the counterpart to SPECIFICITY. The reference standard may be the best available diagnostic test or may be a combination of diagnostic methods, including following up patients until all with the disease have presented with clinical symptoms. For example, in a study of mammography, the reference standard for breast cancer would include all women who went on to develop breast cancer, whether they were first diagnosed radiologically, histologically or symptomatically. Thus, the best design when a diagnostic test is evaluated against a reference standard is a COHORT STUDY with complete follow-up. When the data are set out as in the table:

$$\text{Sensitivity} = \frac{a}{a + c}$$

sensitivity  General table of test results among a + b + c + d individuals sampled

                   Disease
Test         Present    Absent    Total
Positive     a          b         a + b
Negative     c          d         c + d
Total        a + c      b + d     a + b + c + d

Sensitivity should be presented with CONFIDENCE INTERVALS, typically set at 95%, calculated using an appropriate method such as that of Wilson (described in Altman et al., 2000), which will produce asymmetric confidence intervals without impossible values, i.e. that will not give values for the upper limit of the confidence interval > 1 when sensitivity approaches 1 and the sample size is small.

Where a test result is a continuous measurement, e.g. liver enzymes in serum, a cut-off point for abnormal values is chosen. If a lower value is chosen, then sensitivity will be relatively high, but specificity relatively low. The impact of all possible cut-off points can be displayed graphically in a RECEIVER OPERATING CHARACTERISTIC (ROC) CURVE by plotting sensitivity at each cut-off point on the y axis against 1 - specificity at each cut-off point on the x axis. The choice of cut-off point is not, however, solely a statistical decision, as the balance between the FALSE POSITIVE RATE and the FALSE NEGATIVE RATE should be related to the clinical context and consequences of wrong diagnosis for the patient and healthcare system.

A sample size calculation for sensitivity can be made by specifying a confidence interval (e.g. 95%) and an acceptable width for the lower bound of the confidence interval. Where the anticipated sensitivity is high and the sample size small, a 'small sample' method should be used: a sample size table can be found in Machin et al. (1997). CLC
(See also LIKELIHOOD RATIO, NEGATIVE PREDICTIVE VALUE, POSITIVE PREDICTIVE VALUE, TRUE POSITIVE RATE)

Altman, D. G., Machin, D., Bryant, T. N. and Gardner, M. J. 2000: Statistics with confidence, 2nd edition. London: BMJ Books.
Machin, D., Campbell, M. J., Fayers, P. and Pinol, A. 1997: Sample size tables for clinical studies, 2nd edition. Oxford: Blackwell Science Ltd.
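The sensitivity calculation and the Wilson interval mentioned in the sensitivity entry can be sketched as follows. The function name and the example counts are hypothetical; the letters a and c follow the entry's table (diseased individuals with positive and negative tests respectively), and the interval is the standard Wilson score interval without continuity correction.

```python
import math

def sensitivity_with_wilson_ci(a: int, c: int, z: float = 1.96):
    """Sensitivity a / (a + c) with a Wilson score confidence interval.

    a: diseased individuals with a positive test (true positives)
    c: diseased individuals with a negative test (false negatives)
    z: standard normal quantile; 1.96 gives an approximate 95% interval
    """
    n = a + c
    if n == 0:
        raise ValueError("no diseased individuals in the sample")
    p = a / n
    denominator = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denominator
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denominator
    return p, centre - half_width, centre + half_width

# Hypothetical data: 19 of 20 diseased patients test positive.
estimate, lower, upper = sensitivity_with_wilson_ci(a=19, c=1)
print(f"sensitivity = {estimate:.2f}, 95% CI {lower:.3f} to {upper:.3f}")
```

The interval is asymmetric about the estimate and its upper limit stays below 1, whereas the simple estimate plus or minus 1.96 standard errors would exceed 1 for these counts.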

sequential analysis  A method allowing hypothesis tests to be conducted on a number of occasions as the data accumulate through the course of a CLINICAL TRIAL. A trial monitored in this way is usually called a sequential trial. This approach is in contrast to the use of a standard fixed sample size trial design, in which a single hypothesis test is conducted at the end of a trial, usually when some specified sample size has been attained, with no allowance to collect further data and repeat the test. Sequential analysis methods are attractive in clinical trials since, for ethical reasons, it is often important to analyse the data as they accumulate and to stop the study as soon as the presence or absence of a treatment effect is indicated sufficiently clearly. Although the total sample size for a sequential trial is not fixed in advance - it depends on the observed data - an additional advantage of sequential methodology is that trials may be constructed so that the expected sample size is smaller than that for a fixed sample size trial with the same Type I error rate and POWER.

Suppose that in a clinical trial we wish to compare two groups of patients, with one receiving the experimental treatment and the other the control treatment. Formally, we define some measure of the treatment difference between the experimental and control groups, which we will denote by $\theta$. This treatment difference may, for example, be measured by the difference between the MEAN response for a normally distributed ENDPOINT, the log-odds ratio for a binary endpoint or the log-hazard ratio for a survival time endpoint. We generally wish to test the NULL HYPOTHESIS that there is no difference between the treatment groups, i.e. that $\theta = 0$.

In a standard fixed sample size test, some test statistic is obtained and compared with a critical value. The critical value is chosen so as to give a specified Type I error rate, i.e. to ensure that the risk of concluding that there is a treatment difference when, in fact, the treatments are identical is controlled, usually to be no more than 5%. If this standard hypothesis test is repeated at a number of INTERIM ANALYSES, there are a number of opportunities to conclude that the treatments are different. The risk of doing so on at least one occasion therefore increases above 5%, so that the overall Type I error rate exceeds 5% and a valid test is no longer provided. This problem is addressed by sequential analysis, in which the repeated hypothesis tests are conducted in such a way as to maintain an overall Type I error rate for the sequential trial as a whole.

Although sequential monitoring methods have been proposed based on a range of possible test statistics (see, for example, Jennison and Turnbull, 2000, for a discussion of possible methods), a general sequential approach is based on the use of the efficient score statistic (see Whitehead, 1997) as a measure of the treatment difference. Large positive values correspond to an indication of superiority of the experimental treatment, large negative values to an indication of superiority of the control treatment, while values close to zero indicate little difference between the treatments.
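The inflation of the overall Type I error rate under repeated unadjusted testing, described above, can be checked with a short simulation. This is a minimal sketch under assumed settings rather than any method from the works cited: responses have known unit variance, the two arms are of equal size, the interim analyses are equally spaced and each uses an ordinary two-sided 5% z-test. The function name and all parameter values are illustrative.

```python
import math
import random

def overall_type1_error(n_looks=5, n_per_look=20, n_trials=20_000,
                        z_crit=1.96, seed=2024):
    """Estimate the chance of at least one 'significant' result when theta = 0.

    The running sum of (experimental - control) responses gains an increment
    distributed as N(0, 2 * n_per_look) between consecutive looks, because each
    difference of two independent N(0, 1) responses has variance 2.
    """
    rng = random.Random(seed)
    increment_sd = math.sqrt(2 * n_per_look)
    rejections = 0
    for _ in range(n_trials):
        sum_diff = 0.0
        n = 0  # patients per arm accumulated so far
        for _ in range(n_looks):
            sum_diff += rng.gauss(0, increment_sd)
            n += n_per_look
            z = sum_diff / math.sqrt(2 * n)  # standardised test statistic
            if abs(z) > z_crit:
                rejections += 1
                break  # trial stops at the first 'significant' look
    return rejections / n_trials

single_look = overall_type1_error(n_looks=1)
five_looks = overall_type1_error(n_looks=5)
print(f"one analysis:  {single_look:.3f}")
print(f"five analyses: {five_looks:.3f}")
```

With a single analysis the estimated error rate stays close to the nominal 5%, while five unadjusted looks push it well above that level, which is the problem the sequential critical values are designed to solve.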
The exact form of the score statistic depends on the type of data used and the way in which the treatment difference is measured. As an example, for binary data, with the treatment difference measured by the log-odds ratio, if equal numbers of patients have received the experimental and control treatments, the score statistic is half of the difference in observed numbers of successes on the experimental and control arms. For survival data, with the treatment difference measured by the log-hazard ratio, the score statistic is the log-rank statistic (see SURVIVAL ANALYSIS).

In a sequential trial, a number of interim analyses are conducted. The value of the score statistic is calculated at each interim analysis together with the observed Fisher's information, a quantity related to the sample size summarising the amount of information available. If, at any interim analysis, the value of the score statistic is sufficiently large, the trial is stopped and it is concluded that the experimental treatment is superior to the control treatment. If the score statistic is too small, the trial is stopped and, depending on the way in which the test is constructed, it may either be


concluded ....1 the experimental tn:almcnt is inferior to the control or that the~ is insufficient evidence to distinguish bc:Iwecn the two tn:atmcnts. If neither criterion is mel, that is ror intennedi.te \'slues or the sc~ st.tistics. the bial continues to the next interim analysis. Graphically. the observed \'alues or the liCo~ stalistic may be ploucd against the wlues of the inrormation. As. the infonn.tion available incn:ascs Ihroughout Ihe trial the ploucd points form what is called a SDmple path. At each interim analysis. the sample path is «llllparcd with upper and lower critic.1 values. with the bial stopped as soon as the SCOI'e stalistic lies either above the upperc:ritical value or below the lower critical \'alue. The critical \'alues. which. in general. lake diffe~nt values .t the diffen:nt interim analyses. thus define a continu.tion n:gion. As aln:ady explained. the problem of sequential analysis is the calculDlion or the critical valucs so as to gi\'e • specified Type I error mte, for example. or Sc.t. As the choit:lC of critical values to achieve dais aim is not unique. problems of appropriate choices ror use in a sequential dinicaltrial setting ~ also of inten:st. In particular. in contrast to fixed sample size hypothesis tests. asymmetric sequential methods ~ possible. A fixed sample size test tlud is designed to ha,'e specified power, say 9O'it. to detect a treatment efTecl orgi\'Cn size. say fJ = 0•• has equal power to detect the opposite bUtment efTect or the same magnitude. i. e. 0= -fJ•. A sequential test may be eonslnlcted to have power 0.9 to detect 0 =' •. but lower power to detect 0 = -fJ •. Such a sequential tesl may h.ve a smaller expected sample size when -0, than when fJ=' •• This is sometimes desirable in clinical trials.. 
when it is advantageous to stop a trial as soon as possible ir the experimental treatment appears to be inrerior to the conlrOl and there is no dcsi~ to continue n:cruiting patients to lest whether or not this inferiority is statistically significant. The method based on the score statistic is a very ftexible one. since,. as shown by Scharfstein, Tsiatis and Robins (1997) for a wide range or problems.. conditional on the observed information values. the seo~ statistics at the interim analyses are approximately normally distributed. This means tlud critical values can be obtained b.sed on this normality to proVide sequential tests that can be used ror many difTen:nt types of data and choit:lCS of measure ror the treatment diffen:nce. Two distinct .pproaches to the calculation of the critical valucs with which the efficient scon: statistics ~ eompan:d have been developed. The 8m. which is sometimes called the bounc/Q,ies applYHlch. is based on modelling a continuous sample path. The second uses Ihe .ssumed normality of the sc~ statistics directly. evaluating the critical values via a n:cw-sive numerical integnlion technique. wi'" the fonn of the sequential test often specified by wlud is called a spending function. The two .pproaches ~ described in detail later. A more general approach. the adaptive design method. which is not based on the asymptotic


normality of the score statistics, is also briefly described. After a brief example, the problem of analysis at the end of a sequential trial is then discussed. We then continue with a description of the related area of response-driven designs and end with some comments on the role of a DATA AND SAFETY MONITORING COMMITTEE in a sequential clinical trial.

In the boundaries approach, the approximate NORMAL DISTRIBUTION of the score statistics evaluated at the interim analyses means that the observed values can be considered as points on a Brownian motion with drift equal to the treatment difference, observed at times given by the observed information. This has led to the consideration of the abstract concept of continuous monitoring, in which the value of the test statistic is taken to be observed at all times rather than at the discrete times given by the interim analyses. The plotted sample path thus forms a continuous line, which is compared with continuous boundaries, which may be expressed as functions of the information level. Many of the theoretical developments in sequential analysis have been based on consideration of this problem. A consequence of this formulation is that, since the sample path is considered to be continuous, the trial stops exactly on a boundary, whereas for a discretely monitored trial, there is some overshoot of the critical value when the trial stops.

The boundaries approach stems from the work of Wald (1947), who developed the sequential probability ratio test (SPRT) for the testing of armaments during the Second World War. In Wald's SPRT, after each observation, the LIKELIHOOD RATIO for the simple alternative hypothesis relative to the null hypothesis is calculated and the test continues so long as this likelihood ratio falls within some fixed range, equivalent to the plotted values of the score statistic lying between two parallel straight boundaries. Wald derived stopping limits so as to give a test with a
specified Type I error rate and power under the assumption of continuous monitoring. Among all tests with the same properties, the SPRT minimises the expected sample size when either the null or alternative hypothesis holds. However, the parallel boundaries give a test that, although it terminates with probability 1, has no finite maximum sample size. This feature makes it unsuitable for many clinical trials.

Following the work of Wald, a number of alternative forms for boundaries that maintain the overall Type I error rate have been proposed. Whitehead (1997) describes a wide range of such tests. One form that is particularly commonly used in sequential clinical trials is the triangular test. This test has straight boundaries that form a triangular-shaped continuation region. The test approximately minimises the maximum expected sample size among all tests with the same error rates and has a high probability of stopping with a sample size below that of the equivalent fixed sample size test. The critical values obtained using the boundaries approach maintain the overall Type I error rate for a continuously


monitored test. In practice, monitoring is necessarily discrete, since even if an interim analysis is conducted after observation of each patient, the information will increase in small steps. This means that if the critical values from the boundaries approach are used, the Type I error rate will be less than the planned level of, for example, 5%. Whitehead (1997) has proposed a correction to modify the continuous boundaries to allow for the discretely monitored sample path. This correction brings in the critical values by an amount equal to the expected overshoot of the discrete sample path. The correction is particularly accurate for the triangular test. In general, specialist software is needed for the construction of critical values using the boundaries approach. A commercially available software package, Planning and Evaluation of Sequential Trials (PEST), is available from the Medical and Pharmaceutical Statistics Research Unit at Lancaster University for the calculation of the boundaries.

An alternative approach uses a recursive numerical integration method for calculation of the overall Type I error rate for a sequential trial with specified critical values under the assumption that the score statistics observed at the interim analyses are normally distributed (Armitage, McPherson and Rowe, 1969). As well as demonstrating the effect of conducting interim analyses without adjusting for MULTIPLE TESTING, this method allows the construction of critical values to maintain an overall Type I error rate of, say, 5%. Using this approach, Pocock (1977) and O'Brien and Fleming (1979) calculated critical values for sequential tests that preserve the overall Type I error rate at 5% when, for O'Brien and Fleming's design, the critical values with which the score statistics are compared are the same at each interim analysis, and, for Pocock's design, the critical values correspond to the same P-VALUE for a conventional analysis performed at each interim analysis.
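The inflation of the Type I error rate by unadjusted interim analyses can be illustrated by simulation. The sketch below is not the recursive numerical integration of Armitage, McPherson and Rowe; it substitutes a simple Monte Carlo estimate. It assumes five equally spaced analyses, each adding one unit of information, and a two-sided test. The constant 2.413 is the value usually quoted from published tables for a Pocock-type design with five analyses at an overall 5% level, and is an assumption here rather than something stated in this entry. The function name is hypothetical.

```python
import random


def overall_type1_error(n_looks, critical_z, n_trials=20000, seed=2011):
    """Estimate by simulation the chance of rejecting at any look under the null."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        score = 0.0
        for look in range(1, n_looks + 1):
            # Under the null hypothesis each new block of data adds an
            # independent N(0, 1) increment to the score statistic
            # (assumed: one unit of information per look).
            score += rng.gauss(0.0, 1.0)
            z = score / look ** 0.5  # standardised statistic at this look
            if abs(z) > critical_z:
                rejections += 1
                break
    return rejections / n_trials


single_look = overall_type1_error(1, 1.96)         # conventional fixed sample test
naive_five_looks = overall_type1_error(5, 1.96)    # unadjusted repeated testing
pocock_five_looks = overall_type1_error(5, 2.413)  # constant critical value (assumed)
print(single_look, naive_five_looks, pocock_five_looks)
```

With the unadjusted critical value the estimated overall error rate should come out well above the nominal 5%, while the larger constant critical value should bring it back close to 5%.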
The critical values obtained were tabulated to allow easy implementation without the need for additional computation. Although these methods, particularly that proposed by O'Brien and Fleming, remain in use, they are not always the most appropriate designs in the clinical trial setting. Pocock's design has been criticised because it has a relatively high chance of leading to rejection of the null hypothesis very early in the trial. O'Brien and Fleming's design, in contrast, is unlikely to stop early in the trial unless there is very strong evidence of a treatment difference. If the two treatments are very similar, both designs are likely to lead to a trial requiring more patients than the equivalent fixed sample size trial.

A more flexible design approach is provided by the spending function method proposed by Lan and DeMets (1983). In this approach, the total overall Type I error rate of, say, 5% is considered to be spent through the course of the trial, with the rate at which it is spent controlled by the specified spending function. Not only does this introduce

flexibility in the choice of the shape of the stopping boundaries but it also, in contrast to the tests of Pocock and O'Brien and Fleming, allows construction of a test that maintains the Type I error rate if interim analyses are not taken at the planned times. Many forms can be used for the spending function, but families of functions to give tests with certain properties have been proposed. A thorough review of the approach is given by Jennison and Turnbull (2000). As with the boundaries approach, specialist software is required to calculate the critical values. The software package EAST produced by Cytel Software Corporation and the S-PLUS module SeqTrial produced by MathSoft perform the necessary calculations.

An alternative to the sequential design approaches based on the assumption of normality for score statistics just described is the adaptive design approach described by Bauer and Köhne (1994). Although the ideas can be extended to trials with greater numbers of stages, Bauer and Köhne focus on a two-stage design and assume that the data from each stage are independent of those from the other stage. Suppose that a standard hypothesis test of the null hypothesis that there is no treatment difference is conducted based on the data obtained from each stage, leading to two P-values, P1 and P2. A result of Fisher cited by Bauer and Köhne shows that, if there is no treatment difference, -2 log(P1P2) follows a CHI-SQUARE DISTRIBUTION on 4 DEGREES OF FREEDOM, allowing the data from the two stages to be combined in a single test. The fact that the only assumption made is the independence of data from the two stages means that this approach has great flexibility, enabling changes to many features of the trial design without invalidating the final test. The most common change discussed is modification of the sample size of the second stage based on the predicted power of the trial at the end of the first stage,
but possible changes go far beyond this to include changes of the endpoint being measured and the null hypothesis being tested. The adaptive design approach has been criticised, however, for the fact that the test statistic, -2 log(P1P2), is not a sufficient statistic for the treatment difference. This leads to a lack of power for the test, so that, if the flexibility of the adaptive design is not utilised, a sequential test based on the boundaries approach or the spending function method can be found that is as powerful and has smaller expected sample size.

As an example of a sequential analysis, the figure shows results from the analysis of a small trial to assess the efficacy of Viagra in men suffering erectile dysfunction as a result of spinal cord injury. Eligible men with a regular female partner, who were attending clinics in Southport, Belfast and Stoke Mandeville, were randomised between Viagra and a matching PLACEBO pill. After four weeks they were asked whether the treatment received had improved their erections. The trial was designed using the boundaries approach with the triangular test being chosen as an appropriate design. The solid


lines on the figure illustrate the continuation region for this test when the efficient score statistic, Z, is plotted against the observed Fisher's information, V. The trial continues until the values of Z and V lead to a point outside this triangular region.
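For a success/failure endpoint with the treatment difference measured as a log odds ratio, Z and V can be computed directly from the numbers of successes on each arm. This entry does not give the formulas; the sketch below assumes the forms commonly used for binary data, Z = (n_C S_E - n_E S_C)/n and V = n_E n_C S F/n^3, where S and F are the total numbers of successes and failures and n is the total number of patients. It is applied to the response counts reported for the Viagra trial in this entry. The function name is hypothetical, and the stopping boundaries themselves are not computed.

```python
def score_and_information(success_e, n_e, success_c, n_c):
    """Efficient score Z and Fisher's information V for a 2x2 table (assumed formulas)."""
    n = n_e + n_c
    successes = success_e + success_c
    failures = n - successes
    z = (n_c * success_e - n_e * success_c) / n
    v = n_e * n_c * successes * failures / n ** 3
    return z, v


# Improvement counts reported in this entry: (Viagra successes, Viagra n,
# placebo successes, placebo n) at the first analysis, the third analysis,
# and after the remaining men under treatment were added.
reported_counts = [(5, 6, 1, 6), (8, 10, 1, 10), (9, 12, 1, 14)]
sample_path = [score_and_information(*counts) for counts in reported_counts]
for z, v in sample_path:
    print(f"Z = {z:.3f}, V = {v:.3f}")
```

Each (Z, V) pair is one point of the sample path, with V on the horizontal axis and Z on the vertical axis.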

sequential analysis Continuation region and sample path for a clinical trial of Viagra in men with spinal cord injury [figure not reproduced; vertical axis Z]

At the first interim analysis, 12 men had completed four weeks' treatment with 5/6 on Viagra and 1/6 on placebo reporting improvement. The first plotted point on the figure represents these data. To allow for the fact that the trial is not monitored continuously, the boundaries are adjusted using the so-called Christmas tree correction, so that the plotted point is compared with the inner dotted boundaries shown. As the point is between these boundaries, recruitment to the trial continued. At the third interim analysis, the observed improvement rates were 8/10 on Viagra and 1/10 on placebo. On the basis of these data, the upper boundary was reached and recruitment closed. When the results on the 6 men under treatment were added, the improvement rates became 9/12 and 1/14 respectively, leading to the fourth plotted point. The design allowed a strong positive conclusion to be drawn after only 26 men had been treated. This is in comparison with a conventional fixed sample size trial, for which 57 subjects, more than twice as many, would have been required for a design of the same power.

The methods described earlier mainly lead to tests conducted at a number of interim analyses of the data obtained in a clinical trial, with the possibility of stopping the trial as soon as sufficient evidence of a treatment effect, or the lack of such an effect, is obtained. Much more general methods can be envisaged in which many aspects of a clinical trial design may be reconsidered following an interim analysis. Such methods are sometimes referred to as DATA-DEPENDENT DESIGNS, or response-driven designs, or, rather confusingly given that the same terms are used for the different

approaches described here, as either sequential designs or adaptive designs. A review of response-driven design methods is given by Rosenberger (1996).

A simple response-driven design is the play-the-winner design for a clinical trial comparing two treatments on the basis of a success/failure endpoint. The purpose of this design is to replace the random allocation of treatment to patients with a method that leads to more patients receiving the superior treatment. Since, of course, it is not known at the beginning of the trial which treatment is superior, the first patient may be assigned to a treatment at random. If a success is observed from the treatment of this patient, the next patient receives the same treatment. If a failure is observed, the next patient receives the other treatment. Each subsequent patient receives the same treatment as the previous one if a success was observed and the other treatment if a failure was observed. In practice, the simple play-the-winner rule is generally modified to include some random element. Several other rules with different properties, but with the common aim of assigning more patients to the most successful treatment arm, have also been suggested.

Response-driven designs have found most use in early phase clinical trials for dose finding. Here, the dose of the experimental treatment that is to be given to each patient in the trial is determined depending on the responses from patients treated earlier. Often the aim is to determine a dose that leads to a certain proportion of patients experiencing some event: in trials in oncology, for example, toxicity rates of 20% are often considered optimal. The use of response-driven designs in such trials means that the optimal dose can be efficiently estimated without exposing patients to large doses that may be highly toxic.

The use of a sequential stopping rule in a clinical trial means that many of the standard analysis methods are no longer appropriate.
Suppose that a sequential trial has stopped at some interim analysis with the test statistic exceeding the upper critical value, i.e. with the conclusion that the experimental treatment is superior to the control treatment. The trial has stopped precisely because of the large observed value of the random score statistic. This means that a standard unbiased estimate of the treatment difference based on the observed value of the test statistic, e.g. that from the common MAXIMUM LIKELIHOOD ESTIMATION, will, on average, overestimate the true value of the treatment difference. The P-value from a standard analysis will, in a similar way, on average be too small; i.e. it will overstate the evidence against the null hypothesis. Special methods of analysis allowing for the sequential monitoring have been developed. These are described in detail by Whitehead (1997) and Jennison and Turnbull (2000) and are implemented in the software packages PEST, EaSt and SeqTrial.

In large-scale clinical trials, monitoring of accumulating data is commonly undertaken by an independent data and

safety monitoring committee (DMC). The primary role of such a committee is to ensure the safety of patients recruited to the trial. It is therefore natural, in a sequential clinical trial, that the DMC should be involved in the interim analyses conducted to assess the treatment difference and in decisions of whether or not the study should be stopped. The involvement of a DMC, the use of a carefully chosen sequential stopping rule, approved by the DMC before the start of the study, and a final analysis that allows for the sequential monitoring provide a clinical trial that can be stopped when appropriate without compromising the statistical integrity of the results obtained. NS

Armitage, P., McPherson, C. K. and Rowe, B. C. 1969: Repeated significance tests on accumulating data. Journal of the Royal Statistical Society, Series A 132, 235-44.
Bauer, P. and Köhne, K. 1994: Evaluation of experiments with adaptive interim analyses. Biometrics 50, 1029-41.
Jennison, C. and Turnbull, B. W. 2000: Group sequential methods with applications to clinical trials. Boca Raton: Chapman & Hall/CRC.
Lan, K. K. G. and DeMets, D. L. 1983: Discrete sequential boundaries for clinical trials. Biometrika 70, 659-63.
O'Brien, P. C. and Fleming, T. R. 1979: A multiple testing procedure for clinical trials. Biometrics 35, 549-56.
Pocock, S. J. 1977: Group sequential methods in the design and analysis of clinical trials. Biometrika 64, 191-9.
Rosenberger, W. F. 1996: New directions in adaptive designs. Statistical Science 11, 137-49.
Scharfstein, D. O., Tsiatis, A. A. and Robins, J. M. 1997: Semiparametric efficiency and its implications on the design and analysis of group-sequential studies. Journal of the American Statistical Association 92, 1342-50.
Wald, A. 1947: Sequential analysis. New York: John Wiley & Sons.
Whitehead, J. 1997: The design and analysis of sequential clinical trials. Chichester: John Wiley & Sons, Ltd.

SeqTrial

See SEQUENTIAL ANALYSIS

sequential probability ratio test (SPRT)

See SEQUENTIAL ANALYSIS

shrinkage

See MULTILEVEL MODELS

sign test

This is one of the oldest nonparametric methods and one of the simplest. It is so named because it uses the sign of the differences rather than their magnitude and is therefore less sensitive than the WILCOXON SIGNED RANK TEST. It can be used for two samples that are matched or paired with a NULL HYPOTHESIS that the MEDIANS are not different between the two groups. Alternatively, it can be used in the one-sample case to compare to a particular value, e.g. the median, where the null hypothesis is that the group median is not different to the proposed median. It is a nonparametric version of both the paired and the ONE-SAMPLE t-TEST.

For the two-sample case, find the sign of the difference between the two values in the pair. Calculate N, the number of differences showing a sign. For the one-sample case, find the sign of the difference between each subject's value and the value of interest. Calculate N, the number of observations that are different to the value of interest. Then for both cases let x be the number of fewer signs, x = min(+s, -s), i.e. the smaller of the numbers of plus and minus signs, and compare x to the critical region of the BINOMIAL DISTRIBUTION, N, 1/2. Reject the null hypothesis if x is less than or equal to the critical value.

As part of a study, the general health section of the SF-36 was collected. The subjects' values (shown in the first table) are to be compared to the expected value of 72 within the population. There are 5 plus signs, 9 minus signs and 1 tie; therefore x = 5 and N = 14. From the tables of the binomial distribution (N = 14, P = 1/2) the critical value is 3. As 5 is greater than 3 there is insufficient evidence to reject the null hypothesis, so it is concluded that this group's general health scores are not different from those expected in the population.

General health scores were collected on this group of subjects at a second time point; the scores at this time point are shown in the second table. This time, there are 7 plus signs, 8 minuses and no ties. Therefore x = 7 and N = 15. Compare x = 7 to the critical value of the binomial distribution 15, 1/2. This value is 3; as 7 is greater than 3 there is insufficient evidence to reject the null hypothesis. Therefore the general health scores are not different at the two time points. For further details see Pett (1997) and Siegel and Castellan (1998). SLY

Pett, M. A. 1997: Nonparametric statistics for health care research. Thousand Oaks: Sage.
Siegel, S. and Castellan, N. J. 1998: Nonparametric statistics for the behavioral sciences, 2nd edition. New York: McGraw-Hill.

sign test Subjects' values in the general health section of the SF-36, using signs from the sign test

GH value   60  55  75  100  55  60  50  60  72  40  90  75  70  75  55
Sign        -   -   +    +   -   -   -   -   =   -   +   +   -   +   -

sign test Second recording of subjects' values in the general health section of the SF-36

Time 1   60  55   75  100  55  60  50  60  72  40  90  75  70  75  55
Time 2   40  45  100   50  70  95  95  65  85  55  70  45  75  65  50
Sign      +   +    -    +   -   -   -   -   -   -   +   +   -   +   +
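The one-sample calculation in the sign test entry can be sketched as follows, using the general health values from the first table and the reference value of 72. The function name is hypothetical. The exact two-sided P-value from the binomial distribution with P = 1/2 is an addition made here for illustration; the entry itself compares x with a tabulated critical value.

```python
from math import comb


def sign_test(values, reference):
    """One-sample sign test: counts of signs and an exact two-sided P-value."""
    plus = sum(v > reference for v in values)
    minus = sum(v < reference for v in values)
    n = plus + minus      # ties are dropped
    x = min(plus, minus)  # the number of the fewer signs
    # P(X <= x) for X ~ Binomial(n, 1/2), doubled for a two-sided test.
    one_tail = sum(comb(n, k) for k in range(x + 1)) / 2 ** n
    return plus, minus, n, x, min(1.0, 2 * one_tail)


gh_values = [60, 55, 75, 100, 55, 60, 50, 60, 72, 40, 90, 75, 70, 75, 55]
plus, minus, n, x, p_value = sign_test(gh_values, 72)
print(plus, minus, n, x, round(p_value, 3))
```

For paired data, the same function can be applied to the within-pair differences with a reference value of 0.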

significance tests and significance levels

Significance tests were introduced by R. A. Fisher (1925) as a means of assessing the evidence against a NULL HYPOTHESIS. Often, such a null hypothesis states that there is no association between two variables: e.g. between hypertension and subsequent heart disease. Significance tests are conducted by calculating the P-VALUE, defined as the PROBABILITY, if the null hypothesis were true, that we would have observed an association at least as large as the one we did by chance. The term significance level is sometimes used as a synonym for the P-value. If the P-value is small we have evidence against the null hypothesis: the entry on P-values describes their calculation and interpretation in more detail.

Fisher suggested that if the P-value is sufficiently small then the result of the test should be regarded as providing evidence against the null hypothesis. He advocated that a conventional line be drawn at 5% significance (although he rejected fixed rules) and described results of experiments in which the P-value was sufficiently small as statistically significant. Sterne and Davey Smith (2001) have argued that in situations typical of modern medical research, P-values of around 0.05 provide only modest evidence against the null hypothesis.

A different use of the phrase significance level arises from the hypothesis testing approach to the interpretation of experiments advocated by Neyman and Pearson (1933), who showed how to find optimal rules that would minimise the TYPE I and TYPE II ERROR rates over a series of many experiments. We make a Type I error if we reject the null hypothesis when it is in fact true, while we make a Type II error if we accept the null hypothesis when it is, in fact, false (see HYPOTHESIS TESTS). The Type I error rate, usually denoted as α, is closely related to the P-value since if, for example, the Type I error rate is fixed at 5%, then we will reject the null hypothesis when P < 0.05. Therefore,
researchers using the Neyman-Pearson approach often report simply that the P-value for their test was less than their chosen significance level. There is, however, an important distinction between the use of the term significance level to refer to the evidence against the null hypothesis provided by a particular experiment (Fisher's approach) and the choice of a fixed significance level that, together with the Type II error rate, will be used to determine our behaviour with regard to the results. Goodman (1999) discusses the confusion caused by the failure to appreciate this distinction in more detail. JS

Fisher, R. A. 1925: Statistical methods for research workers. Edinburgh: Oliver & Boyd.
Goodman, S. N. 1999: Toward evidence-based medical statistics. 1: The P-value fallacy. Annals of Internal Medicine 130, 995-1004.
Neyman, J. and Pearson, E. 1933: On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society, Series A 231, 289-337.
Sterne, J. A. and Davey Smith, G. 2001: Sifting the evidence - what's wrong with significance tests? British Medical Journal 322, 226-31.


simple random sample

This is the most basic sampling technique. It is where a smaller group, a sample, is chosen by chance from a population. Each member of the population has an equal and known probability of being chosen to be in the sample. Each sample of a given size also has an equal probability of being chosen from the population. Sampling is usually done without replacement, so that each member of the population can only be selected for inclusion in the sample once.

To choose a random sample, first a list is needed of every member of the population to be sampled: this is the sampling frame. Each member of this list is then assigned a number from 1 to N (where N is the total size of the population) in any order. Each member of the list then has a probability of 1/N of being picked by any single random number. A random number generator, or table, is then used to select a random number. The member of the population assigned that number is then selected to be included in the sample. This process is repeated until a sample of the required size is obtained.

For example, suppose that a survey of doctors' opinions is to be carried out. There are 500 doctors in a hospital and a 10% sample is to be collected. First, the sampling frame needs to be obtained - a list of all doctors in the hospital. Next, each doctor is assigned a number from 1 to 500, e.g. in alphabetical order or the order on the list. Now look at a random number table, which gives the following numbers, say:

28049 11632 68254 14217 44612 05049
16831 13213 76103 07222 31852 43501

Therefore, reading the digits in groups of three, the sample would include doctors numbered 280, 491, 163, 268, 254, 142, 174, 461, 205, 49, 168, 311, 321, 376, 103, 72, 223, 185, 243, 501. As 501 is outside the range of the numbers assigned it is ignored. Care is needed so as not to ignore leading zeros or else some numbers might be inadvertently overlooked.

The main advantage of this method of sampling is the lack of classification error, as no information needs to be known about items except that they are in the population. It is useful when little is known about the population, only that it is likely to be homogeneous. The main disadvantage is that it might not be possible to find the sampling frame. In the example given earlier, there might not be a list of all the doctors in the hospital, meaning that a different method would need to be used. For further details see Crawshaw and Chambers (1994). SLY

Crawshaw, J. and Chambers, J. 1994: A concise course in A level statistics, 3rd edition. Cheltenham: Stanley Thornes Publishers Ltd.
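The reading of the random number table in the doctors example can be sketched as below. The digit groups are those quoted in the entry, read row by row. The function name and its `width` parameter are hypothetical. Skipping repeated numbers reflects sampling without replacement, although no repeats occur in this short extract.

```python
def select_from_digit_table(groups, population_size, width=3):
    """Read a table of random digits in fixed-width chunks and keep valid labels."""
    digits = "".join(groups)
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        # int() keeps chunks with leading zeros, e.g. "049" -> 49.
        number = int(digits[start:start + width])
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
    return chosen


table = ["28049", "11632", "68254", "14217", "44612", "05049",
         "16831", "13213", "76103", "07222", "31852", "43501"]
sample = select_from_digit_table(table, population_size=500)
print(sample)
```

In practice the sample would usually be drawn with a random number generator rather than a printed table; Python's `random.sample(range(1, 501), 50)` draws 50 distinct labels from 1 to 500.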

simple randomisation

See RANDOMISATION

Simpson's paradox

This is the observation that a measure of association between two categorical variables may be identical within the levels of a third categorical variable, but can take on an entirely different value when the variable is disregarded and the association measure calculated from the pooled data. As an example consider the three-way contingency table shown in the table. Infants born in two clinics during a certain time period were categorised according to survival and amount of pre-natal care received.

Simpson's paradox Three-way classification of infant survival and amount of pre-natal care in two clinics, taken from Everitt (1992)

                                      Infant survival
Clinic   Amount of pre-natal care     Died   Survived
A        Less                            3      176
A        More                            4      293
B        Less                           17      197
B        More                            2       23

Calculated within clinics, the odds of survival vary little between the two pre-natal care groups (ODDS RATIO (OR) for survival, comparing the higher with the lower amount of care, clinic A: OR = 1.25, clinic B: OR = 0.99) and the corresponding CHI-SQUARED TESTS of independence of survival and amount of care do not reach significance. If, however, the data are collapsed over clinics, the odds ratio becomes OR = 2.82 and is statistically significant according to a chi-squared test, and the conclusion would be that the amount of care and survival are related.

Such a situation occurs when the third variable is associated with both the other variables and, therefore, confounds the association between the variables of interest. Here, relatively more pre-natal care is given in clinic A and the survival percentage is also higher in clinic A than B. Therefore, to some extent the pooled measure of ASSOCIATION between survival and pre-natal care measures both the association with pre-natal care as well as that with clinics. To take account of the levels of a confounding variable, such as a clinic, a pooled within-level measure of association can be constructed (see MANTEL-HAENSZEL METHODS) or a statistical model can be used to adjust the association of interest for the confounder (see LOGISTIC REGRESSION, LOG-LINEAR MODELS). SL

Everitt, B. S. 1992: The analysis of contingency tables, 2nd edition. Boca Raton: Chapman & Hall/CRC.

skewness

Data are described as skewed if they have an asymmetric distribution. When the tail of the distribution is on the right-hand side (see part (a) in the first figure) the data have positive or right-hand skew. When the tail of the distribution is on the left-hand side (see part (b) in the first figure) the data have negative or left-hand skew.

skewness Example of right-hand and left-hand skew [figure not reproduced]

Many distributions encountered in analyses of medical data are positively skewed. For example, leptin, a fat-related growth hormone, was measured in umbilical cord blood samples taken from 407 babies born at 37 weeks' gestation or later. The distribution of the results is given in the second figure; the data are positively skewed since relatively few babies have cord leptin levels above 20 ng/ml.

Analysis of skewed data can proceed either using the RANKS of the data or using transformed values. Analyses using ranks are known as nonparametric or distribution-free methods, because they make no assumptions about the distribution of the data. When describing skewed data using nonparametric methods the MEDIAN is a suitable MEASURE OF LOCATION. Alternative analysis techniques are based on transformed values. These use parametric methods, which rest on the assumption that the data have a particular distribution, usually a NORMAL DISTRIBUTION. Although skewed data do not conform to this assumption, it may be possible to apply a mathematical TRANSFORMATION to the data so that they do. When the data are positively skewed it is often found that the logarithmic (log) transformation is appropriate. If the leptin data are logged they have an approximately normal

distribution, as shown in the third figure. When describing skewed data in this situation then the GEOMETRIC MEAN is an appropriate parametric measure of location.

skewness Logarithmic transformation of leptin readings [figure not reproduced; horizontal axis: log cord leptin (log ng/ml)]

Mathematical measures of skewness can be used to describe distributions. Data with a symmetric distribution, such as the normal distribution, have a skewness of zero. Positive values for skewness indicate a positively skewed distribution whereas negative values for skewness indicate a negative skew. The skewness of the raw cord leptin measurements is 2.7, whereas that of the log-transformed measurements is 0.2, which is considerably closer to zero. SRC
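The effect of a log transformation on skewness can be sketched with a small calculation. The entry does not say which skewness coefficient was used for the leptin data; the code below assumes the simple moment coefficient (third central moment divided by the 1.5th power of the second). The data are hypothetical values chosen so that their logarithms are exactly symmetric; they are not the leptin measurements. The function names are also hypothetical.

```python
import math


def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5 (assumed definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def geometric_mean(values):
    """Back-transformed mean of the logged values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


raw = [1, 2, 4, 8, 16, 32, 64]      # hypothetical, positively skewed
logged = [math.log(v) for v in raw]  # symmetric after the log transformation

print(skewness(raw), skewness(logged), geometric_mean(raw))
```

For these illustrative values the raw data have clearly positive skewness, the logged data have skewness of essentially zero, and the geometric mean coincides with the median of the raw values.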

software

See STATISTICAL PACKAGES

spatial epidemiology

This is the analysis of epidemiological or public health data that are geographically referenced. Typically the data arise in two forms: either (a) the residential addresses of cases of disease are known or (b) arbitrary small areas such as census tracts, zip codes or postcodes have counts of disease observed within them. The locational information is used in the analysis, usually to make inferences about spatial health effects.

Often hypotheses of interest in spatial EPIDEMIOLOGY focus on whether the residential address of cases of disease yields insight into etiology of the disease or, in a public health application, whether adverse environmental health hazards exist locally within a region (as exemplified by local increases in disease risk). For example, in a study of the relationship between malaria endemicity and diabetes in Sardinia a strong negative relationship has been found. This relation had a spatial expression and the geographical distribution of malaria was important in generating explanatory models for the relation (Bernardinelli et al., 1999). In public health practice, it is of considerable importance to be able to assess whether localised areas that have larger than expected numbers of cases of disease are related to any underlying environmental cause. Here, spatial evidence of


a link between cases and a source is fundamental in the analysis. Evidence such as a decline in risk with distance from the putative source of hazard or elevation of risk in a preferred direction is important in this regard (see, for example, Lawson, 2001, 2006, Chapter 7; Lawson et al., 1999).

There are four main areas where statistical methods have seen development in spatial epidemiology: DISEASE MAPPING, DISEASE CLUSTERING, ecological analysis and disease map surveillance. Before looking in detail at each of these areas, it is appropriate to consider some common themes or issues that arise in all areas of the subject.

A fundamental feature of data available for analysis in spatial epidemiology is that it is usually discrete (either in the form of a point process or counting process), and the cases of concern arise from within a local human population that varies in spatial density and in susceptibility to the disease of interest. Hence any model or test procedure must make allowance for this background (nuisance) population effect. The background population effect can be allowed for in a variety of ways. For count data it is commonplace to obtain expected rates for the disease of interest based on the age-sex structure of the local population, and some crude estimates of local relative risk are often computed from the ratio of observed to expected counts (e.g. STANDARDISED MORTALITY/incidence RATIOS, or SMRs). For

case event data, expected rates are not available at the resolution of the case locations and the use of the spatial distribution of a control disease has been advocated. In that case the spatial variation in the case disease is compared to the spatial variation in the control disease. A major issue in this approach is the correct choice of control disease. It is important to choose a control that is matched to the age-sex structure of the case disease but is unaffected by the feature of interest. For example,

[Figure: spatial epidemiology. Distribution of cases of childhood lymphoma and leukaemia in Humberside, UK, 1974-1986]

in the analysis of cases around a putative health hazard, a control disease should not be affected by the health hazard. Counts of control disease cases could also be used instead of expected rates when analysing count data. The first (see page 427) and second figures display case event and control data maps for a region of the UK for a fixed time period. The third figure displays a typical count data example.

[Figure: spatial epidemiology. Control distribution: distribution ... of live births from the birth register in Humberside, UK, 1974-1986]

For case event data, locations are often the residential addresses of cases and the cases arise from a heterogeneous

population that varies both in spatial density and in susceptibility to disease. A heterogeneous Poisson process model is often assumed as a starting point for further analysis. The focus of interest for making inference concerns parameters describing excess risk, usually in a relative risk function, which is included in the first-order intensity of the Poisson process. It is possible that population or environmental heterogeneity may be unobserved in the data set. This could be because either the population background hazard is not directly available or the disease displays a tendency to cluster (perhaps due to unmeasured covariates). The heterogeneity could be spatially correlated, or it could lack CORRELATION, in which case it could be regarded as a type of OVERDISPERSION. One can include such unobserved heterogeneity within the modelling framework as a RANDOM EFFECT. A considerable literature has developed concerning the analysis of count data in spatial epidemiology (e.g. see reviews in Elliott et al., 2000; Lawson, 2001, 2006; Lawson and Williams, 2001; Lawson et al., 2003). The usual model adopted for the analysis of region counts assumes that the counts are independent Poisson random variables with parameter $\lambda_i$ in the $i$th region. This model may be extended to include unobserved heterogeneity between regions by introducing a prior distribution for the log relative risks ($\log \lambda_i$). Incorporation of such heterogeneity has become a common approach and the Besag, York and Mollié (BYM) model is now a standard model. A full Bayesian analysis using this model (see BAYESIAN METHODS) is available on WinBUGS and many extensions of this model have been proposed (for various examples see Lawson, 2009). AL
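The crude relative risk estimate mentioned above, the ratio of observed to expected counts in each small area, can be sketched in a few lines. This is a minimal illustration only: the function name `smr` and the counts are hypothetical, and the expected counts are assumed to have been obtained already from age-sex standardisation.

```python
def smr(observed, expected):
    """Return the ratio of observed to expected counts for each small area.

    Both arguments are sequences of equal length; every expected count is
    assumed to be strictly positive.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return [o / e for o, e in zip(observed, expected)]


# Hypothetical counts for four small areas (not taken from any real study).
observed = [12, 5, 0, 30]
expected = [8.0, 5.0, 2.5, 20.0]

ratios = smr(observed, expected)
print(ratios)
```

Ratios of this kind are unstable where the expected count is small, which is one motivation for the random effect models described in the entry.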

[Figure: spatial epidemiology. Distribution of counts of sudden infant death (SID) within the counties of North Carolina, USA, 1974-1978]

Bernardinelli, L., Pascutto, C. et al. 1999: Ecological regression with errors in covariates: an application. In Lawson, A. B., Biggeri, A., Böhning, D., Lesaffre, E. et al. (eds), Disease mapping and risk assessment for public health. New York: John Wiley & Sons, Inc., p. 329.
Elliott, P., Wakefield, J. et al. (eds) 2000: Spatial epidemiology: methods and applications. London: Oxford University Press.
Lawson, A. B. 2001: Statistical methods in spatial epidemiology. New York: John Wiley & Sons, Inc.
Lawson, A. B. 2006: Statistical methods in spatial epidemiology, 2nd edition. New York: John Wiley & Sons, Inc.
Lawson, A. B. 2009: Bayesian disease mapping: hierarchical modeling in spatial epidemiology. New York: CRC Press.
Lawson, A. B. and Williams, F. L. R. 2001: An introductory guide to disease mapping. New York: John Wiley & Sons, Inc.
Lawson, A., Biggeri, A. et al. 1999: A review of modelling approaches in health risk assessment around putative sources. In Lawson, A. B., Biggeri, A., Böhning, D., Lesaffre, E. et al. (eds), Disease mapping and risk assessment for public health. New York: John Wiley & Sons, Inc., pp. 231-45.
Lawson, A. B., Browne, W. et al. 2003: Disease mapping with WinBUGS and MLwiN. New York: John Wiley & Sons, Inc.

spatio-temporal disease surveillance

This is the detection of aberrations in health or disease, usually as they arise. This definition stresses both the unusual nature of the disease event and also the importance of temporal change in surveillance. How 'unusual' an event or sequence of events is becomes, of course, an issue in the design of any surveillance system. The Centers for Disease Control (CDC) define surveillance as: '... the ongoing, systematic collection, analysis, and interpretation of health data essential to the planning, implementation, and evaluation of public health practice, closely integrated with the timely dissemination of these data to those who need to know. The final link of the surveillance chain is the application of these data to prevention and control. A surveillance system includes a functional capacity for data collection, analysis, and dissemination linked to public health programs' (Thacker, 1994). This definition stresses the collection, analysis and dissemination of data in a timely manner, and hence it is very broad and stresses the focus on public health needs. However, it is possible to distinguish two basic types of surveillance that play different roles in public health activities. First, retrospective surveillance concerns the collection of historical data on disease occurrence and its examination. The purpose of such analysis may be to inform decision makers as to temporal or spatial trends and other features of disease behaviour. This form of surveillance is closely associated with classical epidemiological analysis (see EPIDEMIOLOGY), and differs mainly in its focus on public health needs. Second, prospective surveillance is the online or active examination of disease data to discover changes in disease at the time or close to the time of occurrence. In this case, monitoring of disease occurrence is done 'as data arrive' so that decisions can be made concerning outbreaks of disease. The importance of this form of surveillance has been

heightened with the recent rise in terrorism and the potential threat to health from bioterrorism. The release of toxic or highly infectious agents into a population would be of grave concern in this context and so it is now important that fast and accurate surveillance of disease be undertaken to detect changes as early as possible. The design of disease surveillance systems must consider some of the following issues:

1. Early detection. It is important to detect changes to

disease incidence as early as possible. For example, for certain infectious diseases it may take up to 7 days to receive laboratory confirmation of cases. However, such a delay may be unacceptable if a serious event had occurred. Hence ways to speed up detection could be important.

2. Syndromic methods. The use of ancillary information is required in order to speed up detection of population health aberrations. A formal definition is given by Sosin (2003): '... the public health term Syndromic Surveillance has been applied to systematic and ongoing collection, analysis and interpretation of data that precede diagnosis (e.g. laboratory test requests, emergency department chief complaint, ambulance response logs, prescription drug purchases, school or work absenteeism, as well as signs and symptoms recorded during acute care visits) and that can signal a sufficient probability of an outbreak to warrant public health investigation.' Often covariate information could be useful in helping to establish the character of an outbreak. The first figure (page 430) displays the time series of pharmaceutical sales and gastrointestinal disease reporting for a Canadian example (for further examples see Lawson and Kleinman, 2005).

3. Sensitivity and specificity of detection methods. The calibration of a surveillance system is very important in that false aberration alarms (false positives) could lead to unnecessary public alarm, while false aberration negatives could lead to health disasters. Farrington and Andrews (2004) and Le Strat (2005) provide more detail on these issues.

4. Which disease and what to look for? In retrospective surveillance the disease is usually known and the features of interest are also known (e.g. trends). In prospective surveillance, particularly where bioterrorism may play a role, there could be little a priori knowledge of the disease and the aberration to look for. Hence, detection systems must have the capability to deal with multiple diseases and possibly multiple forms of aberration. This is known as multivariate-multifocus surveillance. Because of the potentially huge database searching problem that results from this, DATA MINING approaches have been adopted (see, for example, Wong et al., 2002, and Madigan, 2005).


[Figure: spatio-temporal disease surveillance. Battlefords, Saskatchewan: epidemic curve and time series of over-the-counter (OTC) sales, January to May 2001 (from Edge et al., 2004, Canadian Journal of Public Health)]

1. Temporal. In temporal surveillance a time series of events is available (either as a time series of counts of disease in intervals or as a point process of reporting or notification times). With time series of discrete counts, the aberrations that might be of interest are highlighted in the second figure, which shows that there are three main changes (A, B, C). The first aberration (A) is a sharp rise in risk (jump) or changepoint in level. The second aberration (B) that might be of interest is a cluster of risk (an increase and then a decrease in risk). This can only be detected retrospectively of course. The third aberration (C) is an overall process change where the level of the process is changed but also the variability is increased or decreased. When a point process of event times is observed, decisions need to be made whenever a new event arrives. Aberrations that are found in point processes can take the form of unusual aggregations of points (temporal clusters), sharp changes in the rate of occurrence (jumps) and overall change (as above).

[Figure: spatio-temporal disease surveillance. Schematic features of aberrations found in temporal surveillance]

2. Spatial. If a spatial domain (disease map) is to be monitored then spatial and spatio-temporal aberrations must be considered. Spatial aberrations could consist of discontinuities in risk between regions (jumps in risk across region boundaries), spatial clusters of risk (localised aggregations of cases of disease) and a long-range trend in risk. When disease maps are monitored in time then changes in the above features might be of interest. The development of new clusters of (say) infectious disease in time might signal localised outbreaks, for example.

Multivariate issues

In health surveillance systems designed for prospective surveillance,

many of the above features would have to be detectable across a wide range of diseases. Not only would whole disease streams be monitored but subsets of disease streams may be required to be examined. For example, for early detection of effects it may be important to monitor fine subsets (such as old or young age groups). Also, it may be that CORRELATIONS between subsets may be important when making detection decisions. In addition, if disease maps are to be monitored over time as well as time series, then the problem grows considerably. The BioSense system developed by CDC (www.cdc.gov/phin/componentinitiatives/biosense/index.html) has a multivariate and mixed stream (spatial and temporal) capacity.

Models

One of the current concerns about large-scale surveillance systems is the ability to model correctly the variation in disease and to calibrate properly SENSITIVITY and specificity with such a multivariate and multifocus task. The application of Bayesian hierarchical modelling has been advocated for surveillance purposes and this may prove to be useful in its ability to deal with large-scale systems evolving in time in a natural way (Zhou and Lawson, 2008). For example, recursive Bayesian learning could be important. Clearly computational issues could be paramount here, given the potentially multivariate nature of the problem, and there is a need to optimise computation. Sequential Monte Carlo methods have been proposed to deal with computational speedups (Kong, Liu and Wong, 1994; Doucet, De Freitas and Gordon, 2001; Doucet, Godsill and Andrieu, 2005; Vidal Rodeiro and Lawson, 2006) as have novel algorithmic speedups (Neill and Moore, 2005). AL

[See also TIME SERIES IN MEDICINE]

Doucet, A., De Freitas, N. and Gordon, N. 2001: Sequential Monte Carlo methods in practice. New York: Springer.
Doucet, A., Godsill, S. and Andrieu, C. 2005: On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing 10, 197-201.
Farrington, P. and Andrews, N. 2004: Outbreak detection: application to infectious disease surveillance. In Brookmeyer, R. and Stroup, D. (eds), Monitoring the health of populations: statistical principles and methods for public health surveillance. New York: Oxford University Press.
Kong, A., Liu, J. and Wong, W. 1994: Sequential imputations and Bayesian missing data problems. Journal of the American Statistical Association 89, 278.
Lawson, A. B. and Kleinman, K. 2005: Spatial and syndromic surveillance for public health. New York: John Wiley & Sons, Inc.
Le Strat, Y. 2005: Overview of temporal surveillance. In Lawson, A. B. and Kleinman, K. (eds), Spatial and syndromic surveillance for public health. New York: John Wiley & Sons, Inc.
Madigan, D. 2005: Bayesian data mining for health surveillance. In Lawson, A. B. and Kleinman, K. (eds), Spatial and syndromic surveillance for public health. New York: John Wiley & Sons, Inc.
Neill, D. and Moore, A. W. 2005: Efficient scan statistic computations. In Lawson, A. B. and Kleinman, K. (eds), Spatial and syndromic surveillance for public health. New York: John Wiley & Sons, Inc.
Sosin, D. 2003: Draft framework for evaluating syndromic surveillance systems. Journal of Urban Health 80, i8-i13.
Thacker, S. 1994: Historical development. In Teutsch, S. and Churchill, R. (eds), Principles and practice of public health surveillance. New York: Oxford University Press.
Vidal Rodeiro, C. and Lawson, A. B. 2006: Online updating of space-time disease surveillance models via particle filters. Statistical Methods in Medical Research 15, 1-22.
Wong, W., Moore, A., Cooper, G. and Wagner, M. 2002: Rule-based anomaly pattern detection for detecting disease outbreaks. In 18th National Conference on Artificial Intelligence. Cambridge, MA: MIT Press.
Zhou, H. and Lawson, A. B. 2008: EWMA smoothing and Bayesian spatial modelling for health surveillance. Statistics in Medicine 27, 5907-28.

Spearman's rank correlation coefficient See CORRELATION

Spearman's rho (ρ)

Also known as Spearman's rank correlation coefficient, this is a measure of the relationship between two variables that uses only the rankings of the observations. If the ranked values of the two variables for a set of $n$ individuals are $a_i$ and $b_i$, with $d_i = a_i - b_i$, then the coefficient is defined explicitly as:

$$\rho = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n^3 - n}$$

In essence, ρ is simply Pearson's product moment correlation coefficient (see CORRELATION) between the rankings a and b. We can illustrate the coefficient on the data shown in the table, which were collected to investigate the relationship between MEAN annual temperature and the mortality rate for a type of breast cancer in women. The data relate to certain regions of Great Britain, Norway and Sweden (see Lea, 1965). Here, the Spearman correlation is 0.90 and Pearson's product moment correlation 0.87. In general, the Spearman coefficient is more robust against the presence of outliers. SSE
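The calculation can be sketched as follows, using the sixteen temperature and mortality pairs as read from the table for this entry. The function names are illustrative only, and the ranking helper assumes there are no tied values, which holds for these data.

```python
from math import sqrt

# Mean annual temperature and breast cancer mortality (Lea, 1965),
# transcribed from the table accompanying this entry.
temperature = [51.3, 49.9, 50.0, 49.2, 48.5, 47.8, 47.3, 45.1,
               46.3, 42.1, 44.2, 43.5, 42.3, 40.2, 31.8, 34.0]
mortality = [102.5, 104.5, 100.4, 95.9, 87.0, 95.0, 88.6, 89.2,
             78.9, 84.6, 81.7, 72.2, 65.1, 68.1, 67.3, 52.5]


def ranks(values):
    """Rank the values from 1 (smallest) to n; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rho from the sum of squared rank differences."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n ** 3 - n)


def pearson_r(x, y):
    """Pearson's product moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


rho = spearman_rho(temperature, mortality)
r = pearson_r(temperature, mortality)
print(round(rho, 2), round(r, 2))
```

Applying `pearson_r` to the two sets of ranks gives the same value as `spearman_rho`, which is the sense in which ρ is Pearson's coefficient between the rankings.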


[Table: Spearman's rho (ρ). Breast cancer mortality and temperature]

Temperature    Mortality
51.3           102.5
49.9           104.5
50.0           100.4
49.2            95.9
48.5            87.0
47.8            95.0
47.3            88.6
45.1            89.2
46.3            78.9
42.1            84.6
44.2            81.7
43.5            72.2
42.3            65.1
40.2            68.1
31.8            67.3
34.0            52.5

[See also CORRELATION, NONPARAMETRIC METHODS - AN OVERVIEW]

Lea, A. J. 1965: New observations on distribution of neoplasms of female breast in certain European countries. British Medical Journal 1, 488-90.

specificity

This is a measure of how well an alternative test performs when it is compared with the reference or 'gold' standard test for diagnosis of a condition. Specificity is the proportion of patients correctly identified as free from the condition by the diagnostic test out of all patients who do not have the condition. Specificity may also be expressed as a percentage and is the counterpart to SENSITIVITY. The reference standard may be the best available diagnostic test or may be a combination of diagnostic methods, including following up patients until all patients with the disease have presented with clinical symptoms. It follows that the best design when a diagnostic test is evaluated against a reference standard is a COHORT STUDY. Investigators should consider whether verification BIAS is present: this occurs when obtaining a negative result on one diagnostic test influences the chances of a patient going on to have further tests, so that some patients with the condition never receive the correct diagnosis. When the data are set out as in the table:

$$\text{Specificity} = \frac{d}{b + d}$$

[Table: specificity. General table of test results among a + b + c + d individuals sampled]

                 Condition present    Condition absent
Test positive    a                    b
Test negative    c                    d
Total            a + c                b + d

Specificity should be presented with CONFIDENCE INTERVALS, typically 95%, calculated using an appropriate method such as that of Wilson (described in Altman et al., 2000) that will not produce impossible values, i.e. that will not give values for the upper confidence limit > 1 when specificity approaches 1 and the sample size is small. When a test result is a continuous measurement, for example HDL cholesterol, a cut-off point for abnormal values is chosen. If a higher value is chosen, then specificity will be relatively high, but sensitivity relatively low. The impact of all possible cut-off points can be displayed graphically in a RECEIVER OPERATING CHARACTERISTIC (ROC) curve. The choice of cut-off point is not, however, solely a statistical decision, as the balance between the TRUE POSITIVE RATE and the FALSE NEGATIVE RATE should be related to the clinical context and consequences of wrong diagnosis both for the patient and the healthcare system. A sample size calculation for specificity can be made by stipulating a confidence interval (e.g. 95%) and an acceptable width for the lower bound of the confidence interval. Where the anticipated specificity is high and the sample size is small, a 'small sample' method should be used: a sample size table is included in Machin et al. (1997).

[See also NEGATIVE PREDICTIVE VALUE, POSITIVE PREDICTIVE VALUE, TRUE POSITIVE RATE]
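As an illustration of the points above, specificity and a Wilson score interval can be computed from the cells b and d of the general table. The sketch below uses the Wilson interval without continuity correction; the function names and the counts are hypothetical, and the default z = 1.96 corresponds to a 95% interval.

```python
from math import sqrt


def specificity(b, d):
    """Proportion of condition-free patients with a negative test: d / (b + d)."""
    return d / (b + d)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (no continuity correction)."""
    p = successes / n
    denominator = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denominator
    half_width = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denominator
    return centre - half_width, centre + half_width


# Hypothetical small study: 30 patients without the condition,
# of whom 29 test negative (d) and 1 tests positive (b).
b, d = 1, 29
spec = specificity(b, d)
lower, upper = wilson_interval(d, b + d)
print(round(spec, 3), round(lower, 3), round(upper, 3))
```

With these counts the upper limit stays below 1, whereas the simple interval based on specificity plus or minus 1.96 standard errors would exceed 1.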

Altman, D. G., Machin, D., Bryant, T. N. and Gardner, M. J. 2000: Statistics with confidence, 2nd edition. London: BMJ Books.
Machin, D., Campbell, M., Fayers, P. and Pinol, A. 1997: Sample size tables for clinical studies, 2nd edition. Oxford: Blackwell Science.

spending function See SEQUENTIAL ANALYSIS

spline function See SCATTERPLOT SMOOTHERS

S-PLUS See STATISTICAL PACKAGES

SPSS See STATISTICAL PACKAGES

stable population See DEMOGRAPHY

stacked bar chart See BAR CHART

standard deviation

This is a measure of spread intended to give an indication of the spread of a series of values ($x_1, x_2, \ldots, x_n$) about their MEAN ($\bar{x}$). Taking the average of the differences from the mean may initially seem a good measure of their spread, but in fact this is always zero. Therefore, the standard deviation is based on the average of the squared differences from the mean, since these are all positive. Taking the square root of this result gives a measure that is in the same units as the original values. Thus, the standard deviation ($s$) is calculated using the following formula. Here $n$ is the number of observations, $i$ takes values from 1 to $n$ and the $\sum$ notation denotes the sum, i.e. $(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2$:

$$s = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}}$$

Note that the formula uses division by $n - 1$, rather than $n$, when taking the average of the squared differences. This gives a result that is the estimate of the standard deviation in the whole population, which is being estimated from the sample available. The standard deviation can be denoted SD, sd, s or σ, although the last technically refers to the standard deviation of a population, rather than a sample. To calculate the standard deviation by hand there is a more convenient and mathematically equivalent formula:

$$s = \sqrt{\frac{\sum x_i^2 - \left(\sum x_i\right)^2 / n}{n - 1}}$$

As an example, the bone mineral content (BMC) of 10 babies was measured using dual energy X-ray absorptiometry (DXA). The measurements in grams are: 46.6, 46.9, 49.2, 49.8, 53.2, 61.1, 68.1, 73.1, 77.1 and 78.6. It is simple to calculate that the sum of the observations $\sum x_i = 603.7$ and the sum of the squares of the observations $\sum x_i^2 = 37938.89$. Thus,

$$s = \sqrt{\frac{37938.89 - [(603.7)^2 / 10]}{9}} = 12.88\ \text{g}$$

The VARIANCE of a set of measurements is the square of their standard deviation. Although the variance has many uses, the standard deviation is a more meaningful descriptive statistic because it is in the same units as the raw data. Whereas square millimetres, mm², may have an obvious interpretation, square millimetres of mercury, mmHg², does not. Altman (1991) suggests that standard deviations may be quoted with one or two more decimal places than the original values. The standard deviation is typically used as a measure of spread alongside the mean and is most appropriate when the data are approximately symmetrically distributed. It has the useful property that when the data follow a NORMAL DISTRIBUTION, then approximately 95% of the observations will be within two standard deviations of the mean. The figure shows the case of a standard normal distribution, which has a mean of 0 and a standard deviation of 1. SRC
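The worked example can be checked with a short sketch that computes the standard deviation of the ten measurements by both the defining formula and the hand-calculation formula. The function names are illustrative only.

```python
from math import sqrt

# The ten measurements (in grams) quoted in the example.
bmc = [46.6, 46.9, 49.2, 49.8, 53.2, 61.1, 68.1, 73.1, 77.1, 78.6]


def sd_definition(x):
    """Square root of the sum of squared deviations from the mean over n - 1."""
    n = len(x)
    mean = sum(x) / n
    return sqrt(sum((xi - mean) ** 2 for xi in x) / (n - 1))


def sd_shortcut(x):
    """Equivalent hand-calculation form using the sum and the sum of squares."""
    n = len(x)
    total = sum(x)
    total_of_squares = sum(xi * xi for xi in x)
    return sqrt((total_of_squares - total ** 2 / n) / (n - 1))


print(round(sum(bmc), 1))
print(round(sd_definition(bmc), 2), round(sd_shortcut(bmc), 2))
```

Both functions give a standard deviation of about 12.88 g. The shortcut form subtracts two large, nearly equal numbers, so in floating-point arithmetic the defining form is usually the safer choice for data with a large mean and small spread.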

[Figure: standard deviation. Standard normal distribution, with mean of 0 and SD of 1; horizontal axis: standard deviations]

Altman, D. G. 1991: Practical statistics for medical research. London: Chapman & Hall.

standard error

This is the STANDARD DEVIATION of the SAMPLING DISTRIBUTION of a statistic. For example, the standard error of the sample MEAN of $n$ observations is $\sigma/\sqrt{n}$, where $\sigma^2$ is the VARIANCE of the original observations. A useful aide-mémoire to distinguish when to use standard deviation (SD) and when to use standard error (SE) is to recall: 'SD for description, SE for estimation.' In particular, when describing patient characteristics in a sample, as in a research paper's typical Table 1, means and SDs should be reported, whereas when seeking to learn from the sample and apply results to the relevant population, i.e. performing statistical inference either by HYPOTHESIS TESTS or estimation by CONFIDENCE INTERVALS, then the standard error is used. The SE of the mean is necessarily smaller than the SD (for $n > 1$) and it is wrong to use SE as a MEASURE OF SPREAD when describing samples. More generally, standard errors can be attached to any sample-based quantity, not just the mean of a single sample of continuously distributed data, as just discussed. The general form of a large-sample 95% confidence interval for a population parameter (numerical characteristic) is the sample-based point estimate ± 1.96 (standard errors), where 1.96 arises from the standard NORMAL DISTRIBUTION and the standard error is that of the point estimate, itself the best sample-based guess for the value of the parameter. For two-sample inference, this is usually a quantity such as the difference in population means, for continuous data, or the difference in population proportions, for categorical data. SSE
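The distinction between SD and SE, and the general form of the large-sample interval, can be illustrated as follows. The sample values and function names are hypothetical. The sample is deliberately tiny so that the arithmetic is easy to follow, and the multiplier 1.96 is strictly a large-sample approximation.

```python
from math import sqrt
from statistics import mean, stdev


def standard_error_of_mean(x):
    """Estimated standard error of the sample mean: SD / sqrt(n)."""
    return stdev(x) / sqrt(len(x))


def large_sample_ci(estimate, standard_error, z=1.96):
    """Point estimate plus or minus z standard errors."""
    return estimate - z * standard_error, estimate + z * standard_error


# Hypothetical measurements from a single sample.
sample = [4.1, 5.0, 5.6, 4.8, 5.3, 4.9, 5.2, 5.1]

sd = stdev(sample)                   # describes the spread of the sample
se = standard_error_of_mean(sample)  # describes the precision of the mean
lower, upper = large_sample_ci(mean(sample), se)
print(round(sd, 3), round(se, 3), round(lower, 3), round(upper, 3))
```

The same `large_sample_ci` function applies to any point estimate with a known standard error, such as a difference in means or a difference in proportions.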


standard population See DEMOGRAPHY

standardised mortality ratio (SMR) See DEMOGRAPHY

STATA See STATISTICAL PACKAGES

statistical consulting See CONSULTING A STATISTICIAN

statistical methods in molecular biology

Molecular biology is the branch of biology that studies the structure and function of biological macromolecules of a cell, and especially their genetic role. Three types of macromolecules are the main subjects of interest: deoxyribonucleic acids (DNA), ribonucleic acids (RNA) and proteins. Genetic information is encoded in the DNA and inherited from parents to children and, when expressed, a gene, the basic unit of inheritance, is first transcribed to messenger RNA, which then carries the information to a cellular machinery (ribosome) for protein production. This basic principle of the information flow in biology is often referred to as the 'central dogma', put forward by Francis Crick in 1958. A central goal of molecular biology is to decipher the genetic information and understand the regulation of protein synthesis and interaction in cellular processes. The rapid advance of biotechnology in the past few decades has facilitated manipulation of these important biopolymers and allowed scientists to clone, sequence and amplify DNA. As a result, a large amount of biological sequence and structural information has been generated and deposited into publicly accessible databases. The phenomenal growth of biological data is underpinned by the developments of high-throughput DNA sequencing and microarray technologies and the recent progress in giant research projects such as the human genome project that produced the sequence of the human genome. The word 'genome' refers to the entire collection of genetic material of an organism. These advances result in many complex and massive datasets, sometimes decoupled from specific biological questions under investigation. The need to extract scientific insights from these rich data by computational and analytic means has spawned the new field of bioinformatics and computational molecular biology, which deals with storage, retrieval and analysis of biological data. These can consist of information stored in the genetic code, but also experimental results from various sources, patient statistics and scientific literature. Bioinformatics is highly interdisciplinary, using techniques and concepts from informatics, statistics, mathematics, physics, chemistry, biochemistry and linguistics. Nowadays, various biological databases and practical applications of bioinformatics are readily available through the internet and are widely used in biological and medical research.

A wide spectrum of statistical methods has been successfully applied in bioinformatics, ranging from the basic summary statistics and exploratory data analysis tools, to sophisticated hidden Markov models and Bayesian resampling methods (see BAYESIAN METHODS, MARKOV CHAIN MONTE CARLO). Analyses in bioinformatics focus on three types of datasets: genome sequences, macromolecule structures and large-scale functional genomics experiments. Various other data types are also involved, such as taxonomy trees, sequence polymorphisms, relationship data from metabolic pathways, patient statistics, text from scientific literature and so on. DNA sequences are the primary data from the sequencing projects and they only become really valuable through multiple layers of annotation and organisation. Several areas of bioinformatics analysis are relevant when dealing with DNA and protein sequences: sequence assembly, to establish the correct order of sequence contigs for a contiguous sequence; prediction of functional units, to identify subsets of sequences that code for various functional signals such as protein coding genes, promoters, splice sites and regulatory elements; and sequence comparison and database search, to retrieve data efficiently from organised databases. Most of these analyses involve sequence alignment, one of the classic problems in the early development of bioinformatics. Sequence alignment is the basic tool that allows us to determine the similarity of two or more sequences and infer components that might be conserved through evolution and natural selection. To align two protein sequences, similarity scores are assigned to all possible pairs of residues and the sequences are aligned to each other so as to maximise the sum total of scores in the sequence pairings induced by the alignment. Dynamic programming-based algorithms were developed to overcome the large search space for the solution of optimal global and local alignment problems (Needleman and Wunsch, 1970; Smith and Waterman, 1981). Dynamic programming is a general algorithmic technique that solves an optimisation problem by recursively using 'divide and conquer' for its subproblems. Faster heuristic word-based alignment algorithms were later introduced for large database similarity searches (BLAST by Altschul et al., 1990; FASTA by Pearson and Lipman, 1988). These algorithms build alignments by extending or joining common short patterns ('words'); they are computationally efficient, but often yield suboptimal solutions. The interpretation of alignment scores and database search results was aided by statistical significance derived from simulations and PROBABILITY theory of extreme value distributions under the framework of standard statistical hypothesis testing (Karlin and Altschul, 1990). These classic results have become indispensable tools for biomedical researchers and computational biologists to analyse molecular sequence data.
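The dynamic programming idea behind optimal global alignment can be sketched as below. This is a minimal illustration in the spirit of the Needleman and Wunsch approach, not a reproduction of any published implementation. The scoring scheme (match +1, mismatch -1, linear gap penalty -2) is an assumption chosen for simplicity, whereas protein alignments in practice use a full table of residue-pair similarity scores. The function returns only the optimal score, not the alignment itself.

```python
def global_alignment_score(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score by dynamic programming.

    score[i][j] holds the best score for aligning the first i symbols of
    seq_a with the first j symbols of seq_b.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty sequence costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align the two symbols
                score[i - 1][j] + gap,       # gap in seq_b
                score[i][j - 1] + gap,       # gap in seq_a
            )
    return score[-1][-1]


print(global_alignment_score("ACGT", "ACT"))
```

Each cell is filled from three previously computed neighbours, so the work grows with the product of the two sequence lengths, which is why the faster word-based heuristics were introduced for large database searches.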

Statistical models are also routinely used to construct probabilistic profiles to characterise the regularity of biological signals based on collections of pre-aligned sequences and to increase SENSITIVITY of searches. For example, a block-based product multinomial model can be used to describe the position-specific base distributions of the 5' splice site (exon-intron junction) signal in humans (see the figure), which gives a richer representation of the sequence motif than the consensus CAG|GTGAG ('|' indicates the exon-intron junction). A position-specific scoring matrix can be derived subsequently using logarithms of the ODDS RATIO of the signal to background base to evaluate matches of new query sequences to the sequence motif and to quantify the information content of the signal sequence pattern. The information content of a signal is defined as the average score of random sequence matches, measured in 'bits' using the log (base two) odds ratio scores that represent the number of 0-1s necessary to code for this signal in a binary coding system. For instance, the human 5' splice site depicted in the figure contains 8 bits of information, meaning that 'decoy' splice sites will be observed roughly every $2^8 = 256$ bases in random sequence. Note that the information content can also be formulated as the relative entropy (or Kullback-Leibler distance) of the signal to background nucleotide frequency distributions in the context of information theory. More sophisticated models and scoring matrices are also available to capture dependencies among neighbouring positions using Markov models and others. Another area of biological sequence analysis that relies heavily on statistical reasoning is gene finding or, more generally, predicting complex features from a sequence. The goal of protein-coding gene finding is to locate gene features such as exons and introns in a DNA genomic sequence, which

=

AAGOTGCTGTG CAOOTGAOTGG AATGTACGTGT CAOOTGAOCGG CAGGTATGCGO AAGOTAAAGTT CAOOTGAOCCC GCGGTAAOAOO GGOOTGAOTCA GAGGTGTGTGC CAGGTAATCAA ACGGTAAGCCC GTGGTGAGCOO AAGGTOOGTGC GAGGTGAGAGG AAGGTGAGGGC CAGGTAAGGCA CAGGTOAGCCT

is the essential ftrst-pass annotation of the genome project products. In addition to inferringholllologous (evolUtionarily n::laled) gene slnK:tu.a from database similarily sean:hes. Slatisticai ab initio gene-ftndil1l programmes ha~ been developed to integnde all known realUres aDd 'grammars' of protcilH:oding genes in a probabilistic model. Hidden Morko., models (HMMs) ~ at the heart of the mast popular gene finders (Genscan by Burge aDd Karlin. 1997. and n::viewcd in Dwbin elol•• 1998). HMMs WCI'e originally developed in Ihe early 19705 by elcctriaal engineers for the problem of spcccb recognition -to identif)' what sequence of phonemes (or words) was spoken from a long sequence of category labels n::pn:scnting the speech s~. The resemblance of the gene-finding problem to speech recognition and the way HMMs an:: formulated make them especially suited in this context. In addition. HMMs ~ lhcon:tically well-founded models. combining probabilistic madelling and fonnallangu&le theory that guaranlecs 'sensible' predictions that obey speciftecl grammatical rules e~n though they might not be the 4XJIRlCl genes. There are also wcll.cfocumented and computationally emcient methods for plll8lllClcr estimation (e.g. expc:ctatioo-maximisation) and optimisation (Vilerbi algorithm). A Markov chain is a series of mndom events oc:cuning with probabilities c:oaditionaOy dependent on the stale of lhe pm:eding event(s). A hidden Markov madel is a Markov chain in which each sIDle generales an observation according to some mle (usually stochastic). 111c: objective is 10 infu the hidden stale sequence that maximises the posterior probability of the obsen'CCI event sequence givc:a lhe model. For example. the hicldc:n Slates may repn:scnt words or phonemes aDd lhe observations are lhe acoustic signal.
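The decoding step for a hidden Markov model can be sketched with the Viterbi algorithm mentioned above. The two-state 'exon'/'intron' model below, including every probability in it, is a hypothetical toy chosen only to make the example run; it is not the model used by Genscan or any other gene finder. Log probabilities are used to avoid numerical underflow, which requires all the toy probabilities to be strictly positive.

```python
import math

# Hypothetical toy model: all numbers below are illustrative assumptions.
STATES = ("exon", "intron")
START_P = {"exon": 0.5, "intron": 0.5}
TRANS_P = {
    "exon": {"exon": 0.9, "intron": 0.1},
    "intron": {"exon": 0.1, "intron": 0.9},
}
EMIT_P = {
    "exon": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "intron": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
}


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable hidden state path and its log probability."""
    # best[t][s]: log probability of the best path that ends in state s at time t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, logp = max(
                ((r, best[t - 1][r] + math.log(trans_p[r][s])) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = logp + math.log(emit_p[s][obs[t]])
            back[t][s] = prev
    # Trace back from the best final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


path, logp = viterbi("GCGCGCGCATATATAT", STATES, START_P, TRANS_P, EMIT_P)
print(path)
```

With these toy parameters the GC-rich half of the sequence is labelled 'exon' and the AT-rich half 'intron'. Real gene finders use many more states to encode the 'grammar' of a gene (splice sites, reading frame, and so on).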

Position    -3    -2    -1    +1    +2    +3    +4    +5    +6    +7    +8
A         0.34  0.65  0.10  0.00  0.00  0.61  0.70  0.09  0.18  0.29  0.22
C         0.36  0.10  0.03  0.00  0.01  0.03  0.07  0.06  0.15  0.19  0.25
G         0.18  0.11  0.81  1.00  0.00  0.34  0.11  0.78  0.19  0.30  0.24
T         0.11  0.14  0.07  0.00  0.99  0.03  0.12  0.08  0.49  0.22  0.29

statistical methods in molecular biology The human 5' splice site (exon-intron junction signal)
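The position-specific scoring matrix and the information content described in the text can be sketched from the base frequencies in the figure. Two choices below are assumptions that the entry does not specify: a uniform background of 0.25 per base, and a small pseudocount (0.01) so that bases never observed at a position receive a finite negative score. Columns are renormalised because the printed frequencies are rounded.

```python
import math

# Base frequencies at positions -3..+8, copied from the figure.
FREQ = {
    "A": [0.34, 0.65, 0.10, 0.00, 0.00, 0.61, 0.70, 0.09, 0.18, 0.29, 0.22],
    "C": [0.36, 0.10, 0.03, 0.00, 0.01, 0.03, 0.07, 0.06, 0.15, 0.19, 0.25],
    "G": [0.18, 0.11, 0.81, 1.00, 0.00, 0.34, 0.11, 0.78, 0.19, 0.30, 0.24],
    "T": [0.11, 0.14, 0.07, 0.00, 0.99, 0.03, 0.12, 0.08, 0.49, 0.22, 0.29],
}
BACKGROUND = 0.25  # assumption: uniform background base frequency


def log_odds_matrix(freq, background=BACKGROUND, pseudo=0.01):
    """Log2 odds of signal versus background for each base and position."""
    bases = sorted(freq)
    width = len(freq[bases[0]])
    pssm = {b: [] for b in bases}
    for j in range(width):
        total = sum(freq[b][j] + pseudo for b in bases)
        for b in bases:
            p = (freq[b][j] + pseudo) / total
            pssm[b].append(math.log2(p / background))
    return pssm


def score(seq, pssm):
    """Sum of position-specific log-odds scores for a candidate site."""
    return sum(pssm[base][j] for j, base in enumerate(seq))


def information_content(freq, background=BACKGROUND):
    """Relative entropy (in bits) of the signal to the background."""
    bases = sorted(freq)
    width = len(freq[bases[0]])
    bits = 0.0
    for j in range(width):
        total = sum(freq[b][j] for b in bases)
        for b in bases:
            p = freq[b][j] / total
            if p > 0:  # the term p*log(p) is taken as 0 when p = 0
                bits += p * math.log2(p / background)
    return bits


pssm = log_odds_matrix(FREQ)
print(score("CAGGTAAGTAT", pssm), score("TTTTTTTTTTT", pssm))
print(information_content(FREQ))
```

Under these assumptions the information content comes out at about 8 bits, in line with the value quoted in the text, and a sequence resembling the consensus scores well above an arbitrary sequence.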

Motif discovery is an area under active research and has benefited from sophisticated modern statistical techniques. In a typical setting, a collection of sequences derived from MICROARRAY EXPERIMENTS or various sources are believed to share common sequence motifs that often represent functional domains or regulatory elements, and the challenge is to find the unknown signals and locate them in individual sequences. One approach is to formulate the multiple alignment information as MISSING DATA and infer them together with other parameters of the statistical model, given only the sequences as observables. Advanced statistical modelling and iterative computation techniques such as the EM ALGORITHM and Markov chain Monte Carlo are typically used for simultaneous model estimation (Liu, Neuwald and Lawrence, 1999).

The function of a protein is determined by its three-dimensional structure. The problem of predicting the three-dimensional structure of a protein from its amino acid sequence (or the protein-folding problem, because proteins are capable of quickly folding into their stable, unique three-dimensional structure, starting from a random coil conformation without additional genetic mechanisms) is one of the biggest challenges in bioinformatics. There are three major lines of approaches for protein structure prediction: comparative modelling, fold recognition and ab initio prediction. Comparative modelling makes use of sequence alignment and database searches and builds on the fact that evolutionarily related proteins with similar sequences have a similar structure. For proteins without a homologous sequence of known structure, the approach of 'threading' has been developed. It is assumed that a small collection of 'folds', perhaps several hundreds in number, can be used to model the majority of protein domains in all organisms.
The protein-folding problem is thus reduced to the task of classifying the query protein, based on its primary sequence, into one of the folding classes in a database of known three-dimensional structures. This classification is often accomplished using complicated statistical models such as Gibbs sampling and HMMs to parameterise the fit of a sequence to a given fold and solve the optimisation problem accordingly. Analogous to the gene-finding problem, one may attempt to compute a protein's structure directly from its sequence, based on biophysical understanding of how the three-dimensional structure of proteins is attained. The challenge can be broken down into two components: devising a scoring function that can distinguish between correct and incorrect structures and a search method to explore the conformational space efficiently. If successful, direct folding certainly would give a deeper insight than the 'top-down' threading or homology modelling approaches. However, no reliable method has yet emerged in this category.

During the past few years, the development of DNA array technology has scaled up the traditionally one-gene-at-a-time functional studies to allow the monitoring of hundreds of thousands of genes simultaneously. A large number of statistical issues arise in connection with these studies and these have fostered unprecedented conversation and collaborations between biologists and statisticians to establish means to plan, process and analyse these massive datasets. Many branches of statistics have been revived and/or extended by their recent applications in the analysis of functional genomics and molecular data, including DATA MINING methods to discover and classify patterns, MULTIPLE TESTING procedures to adjust P-VALUES to control false discovery rates and meta-analysis (see SYSTEMATIC REVIEWS AND META-ANALYSIS) to combine experimental results from various sources. New statistical methods will soon be needed when combining information from multiple distinct data types (sequence, gene expression, protein structures, sequence variation and phenotypes) for the same subjects. RFY
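One widely used way of adjusting P-values to control the false discovery rate is the Benjamini-Hochberg step-up procedure. The entry does not name a specific procedure, so the choice here is an assumption made for illustration, and the function name is hypothetical. The sketch returns adjusted P-values in the original order of the tests (for example, one per gene on an array).

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the tests sorted by increasing P-value
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Tests whose adjusted P-value falls below a chosen threshold (say 0.05) are declared discoveries; under the procedure's assumptions the expected proportion of false discoveries among them is then controlled at that level.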

Altschul, S. F., Gish, W., Miller, W., Myers, E. W. and Lipman, D. J. 1990: Basic local alignment search tool. Journal of Molecular Biology 215, 403-10.
Burge, C. B. and Karlin, S. 1997: Prediction of complete gene structures in human genomic DNA. Journal of Molecular Biology 268, 78-94.
Durbin, R., Eddy, S., Krogh, A. and Mitchison, G. 1998: Biological sequence analysis: probabilistic models of proteins and nucleic acids. Cambridge: Cambridge University Press.
Karlin, S. and Altschul, S. F. 1990: Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. Proceedings of the National Academy of Sciences of the United States of America 87, 2264-8.
Liu, J. S., Neuwald, A. and Lawrence, C. 1999: Markovian structures in biological sequence alignments. Journal of the American Statistical Association 94, 1-15.
Needleman, S. B. and Wunsch, C. D. 1970: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48, 443-53.
Pearson, W. R. and Lipman, D. J. 1988: Improved tools for biological sequence comparison. Proceedings of the National Academy of Sciences of the United States of America 85, 2444-8.
Smith, T. F. and Waterman, M. S. 1981: Identification of common molecular subsequences. Journal of Molecular Biology 147, 195-7.

statistical packages

In 2010 the Association for Survey Computing (ASC) website (www.asc.org.uk) listed around 200 statistics packages. Many of these have been under development for nearly 40 years and therefore it is both a very mature and diverse software market. While many of these 200 or so packages are developed for niche markets, there are still several generic software suites. It seems almost invidious to try to select and discuss individual packages. However, there are clearly some well-known and long-established packages, and to many the term 'statistical package' is almost synonymous with SPSS™ or possibly SAS™. Given the variety of analyses that these packages offer, they can meet most user needs. It would seem likely that a virtual monopoly should exist, but in fact there have been new entrants gaining popularity. Comparing these is instructive about trends in the development of statistical software. The packages in the first table are the ones on which we will concentrate here.

statistical packages Major statistical packages

SPSS     www.spss.com
SAS      www.sas.com
STATA    www.stata.com
S-Plus   www.insightful.com

The prevalence of these major packages notwithstanding, there are other packages, as listed in the second table, although these will not be further discussed. Competition has been good for the development of programs and potential purchasers should always be aware of options outside the norm that may well fit their requirements. Together with the ASC website (given earlier), it will always be profitable to make comparisons when purchasing.

Other major statistical packages

Genstat      www.vsn-intl.com
STATISTICA   www.statsoft.com
NCSS         www.ncss.com
SYSTAT       www.systat.com

Naturally enough, one wants a statistical package to do statistics and the leading packages cover a wide range. These include basic descriptive statistics, including EDA-style charting, comprehensive cross-tabulation analysis, means testing, the general linear model, multivariate methods, data reduction and clustering, nonparametrics, log-linear modelling, time series, and more.

The conversion in the late 1980s-early 1990s of the packages SPSS and SAS to run on desktop PCs seemed to cause a hiatus in the development of statistical methodology within these suites. Quite possibly, one of the main reasons for this was the need to develop new user interfaces, as an alternative to the command-line format previously used on mainframe and minicomputers. With the DOS interface model being rapidly succeeded by that of Windows™, major consecutive design changes were needed. This did seem to leave a window of opportunity for new entrants to the market, which could write directly using modern programming architectures. S-Plus is perhaps the earliest example of this, initially written for the UNIX system and then subsequently ported to PCs. The design was conceptually novel, based on the notion of an extensible statistical calculator. It provides advanced graphics facilities and has become popular with professional statisticians for its ability to develop analysis methodologies, rather than being tied to a rigid framework. Over time S-Plus has developed to add extensive user interface enhancements as well as larger statistical libraries. The public domain 'R' (www.r-project.org) is based on a similar philosophy to S-Plus (see R).

STATA has become a very popular alternative for similar reasons. Starting out as a command-line-driven program, it has matured over the years to offer a windowing interface in addition. Its attractiveness to researchers has been a modern approach to statistical testing, as well as its ability to incorporate new methodologies quickly. Not only do the developers have an architecture that permits easy incremental expansion, users themselves can program their own procedures. This has gained the support of the professional statistical community, who through their educative role have promoted the package's popularity.

Partly as a result of competition, packages have also begun to differentiate themselves in terms of extending extra support to the whole data analysis process. While the actual test result remains the core of any analysis, data management is far more demanding in terms of time. The resources needed to support DATA MANAGEMENT in a MULTICENTRE clinical TRIAL are significantly larger than those for a classical experiment. In these scenarios, managing and manipulating data prior to analysis becomes very important. SAS has long specialised in data management support, with flexible procedures for merging and manipulating datasets, as well as links to database packages. In the pharmaceutical industry SAS is almost a de facto standard for major analyses, reflecting its ability to support the strong audit requirements in the industry.
To a certain extent other packages have been restricted to the rectangular data model (or spreadsheet) view of data, although all are now improving these features.

One direct effect of the development of statistical packages has been to introduce the possibility of statistical data analysis to a wider audience than just statisticians. Since these users are often in finance and commerce, they represent a significant revenue stream to package producers and making the programmes friendly for nonspecialist audiences has become a priority for some. SPSS's menu-driven 'point-and-click' interface, for example, epitomises this model. In contrast, the command-line models of SAS, STATA or S-Plus require more dedicated training, although as noted earlier all have developed similar facilities. (STATA 8 introduced a menu-driven interface in 2003 to complement its traditional command-line orientation.)

Integrating advanced data-entry features with a statistical analysis package is common. The predominant spreadsheet data entry model can be enhanced to include data entry forms, data checking and audit. The large packages such as SPSS and SAS provide 'add-on' programs for this. Other programs provide direct database links so that data entry can be provided in a normal programming package such as Microsoft Access and then directly imported for analysis.

While traditionally the results of an analysis are interpreted and then incorporated into a final report, packages have begun to differentiate themselves on their ability to produce tables and results that can be directly pasted into a presentation quality report. Packages vary widely on their ability to do this and support can be patchy. SPSS provides a very good ability to move results tables, but the exported graphics are not of such a good quality. STATA, by contrast, does not offer sophisticated export of results, but has in its latest versions excellent graphical output. SAS offers fully programmable reporting features that are very flexible, but challenging for the naive user.

While the main focus of any statistical user is on the large packages, dedicated packages still have a role. As an example, programs such as NQUERY (www.statsol.ie), dedicated to sample size estimation, do one particular job very well and are popular as a result. The lone, innovative researcher (an example perhaps being MX found at www.vcu.edu/mx/) is also a likely producer of innovative software.

An important dimension for the individual consumer can be price. Some of the major packages have prices that match their capabilities: the single researcher, particularly in the commercial sector, may find this an important factor in choice. All the relevant websites can give guidance on obtaining price quotations. Rather than ossifying, the marketplace for statistical software is healthy and researchers can find themselves well supported with a choice of diverse packages. CS

statistical parametric map See STATISTICS IN IMAGING

statistical refereeing

There have been hundreds of review articles published in the biomedical literature that point out statistical errors in the design, conduct, analysis, summary and presentation of research studies. The contents of every general medical journal (most notably Annals of Internal Medicine, British Medical Journal, Journal of the American Medical Association, Lancet and New England Journal of Medicine), as well as of many specialist ones, have been subjected to this intense scrutiny, sometimes frequently. These review articles have focused on particular statistical tests, frequency of usage and correct application of techniques of statistical analysis, design of CLINICAL TRIALS and epidemiological studies, use of POWER calculations and CONFIDENCE INTERVALS and many other aspects. Their almost universal conclusion is that a substantial percentage of research studies, perhaps as many as 50%, published in the biomedical literature contains errors of sufficient magnitude to cast some doubt on the validity of the conclusions that have been drawn. This does not mean that the conclusions are wrong, but it does imply that they may not be right, and this inevitably leads to serious concern about the consequences both for understanding of disease and for the treatment of patients.

One solution to this problem has been the introduction of medical statisticians into the peer review process. Some have advocated that all submitted papers should be scrutinised in this way, arguing that statistical review of those that are not published, no matter how poor, will at least lead to higher standards in research and improvement in future papers. In view of the very large number of biomedical journals and the huge numbers of papers submitted for publication every year, such a remedy is impracticable. An alternative, now used by several journals, is to divide the peer review process into two stages, whereby papers considered by the editors as candidates for publication are sent first to subject matter referees (physicians, surgeons, epidemiologists, etc.) and those recommended for publication by them are then sent to statisticians for further specialist review.

The process of statistical review is complex, requires sophisticated judgement and varies considerably in its application to every section of a paper (abstract, introduction, methods, results and discussion). Altman (1998) reviews some of the difficulties and provides practical examples of both definite errors and matters of judgement, within study design, analysis, presentation and interpretation. There are 12 broad aims of statistical review that can be summarised as follows: to prevent publication of studies that have a fundamental flaw in design; to prevent publication of papers that have a fundamental flaw in interpretation; to ensure that key aspects of background, design and methods of analysis are reported clearly; to ensure that key features of the design are reflected in the analysis; to ensure that the best methods of analysis, appropriate to the data, are used; to ensure that the presentation of results is adequate and employs summary statistics that are justified by the design, the data and the analysis; to ensure that tables are accurate and are consistent both with the text and with each other; to ensure that the style of figures is appropriate, that they are consistent with text and tables and not unduly repetitious of other content; to guard against excessive analysis and spurious accuracy; to ensure that conclusions are justified by the results; to ensure that content of the discussion is justified by the results and, in particular, that it avoids generalisation far beyond the confines of the paper; and, finally, to ensure that the abstract accords with the paper. The statistical reviewer may also comment on subject matter when an expert within the medical specialty of the paper, but will not indicate typos, except when these are critical for accuracy within formulae or text. Indeed, pointing out inconsequential typos is not part of any aspect of any review process: they should be disregarded by expert reviewers and left entirely to the journal's copyeditor!

Since statistical review is complicated and, for the reviewer, sometimes excessively tedious, with the necessity of making very similar, sometimes the same, comments about manuscript after manuscript, detailed statistical guidelines and checklists have been written with the specific intention of helping authors (and reviewers). These have been supported by the editors of many biomedical journals and referred to in the journal's guidelines to authors. Examples can be found in Altman et al. (2000) and Gardner et al. (2000). Those most widely used for clinical trials are the CONSORT guidelines (Moher, Schulz and Altman, 2001, updated 2010), for which there is accompanying explanation (Altman et al., 2001, also updated 2010), and extension to cluster trials, noninferiority and equivalence randomised trials, herbal medicine interventions, nonpharmacological interventions, harms, abstracts and pragmatic trials (see www.consort-statement.org). The checklist that forms part of the CONSORT statement is intended to accompany a submitted paper and to indicate where in the manuscript each item in the checklist has been addressed, thus serving as a useful reminder to authors and an aide to referees. There are also recent guidelines for reporting META-ANALYSIS (PRISMA, which supersedes QUOROM), for observational studies (STROBE) and for genetic ASSOCIATION studies (STREGA); details can be found through the EQUATOR network (www.equator-network.org).

Statistical review is intended to be helpful and constructive; it should also reassure authors and readers that published papers are sound. However, it is not always seen from this perspective and editors of journals need to be vigilant in ensuring that it does not become a focus for controversy and dispute, as can happen, for example, when authors parade the views of 'their own statistician' to counter comments from a referee. There is at present little incentive for statisticians to engage in such review: it does not enhance their careers, there is no specific training for it, small (if any) remuneration, it is time consuming and 'the only likely concrete consequence of good reviewing is future requests for more reviews' (Bacchetti, 2002). Bacchetti also points out that statistics is a rich area for finding mistakes and, when coupled with 'the notion that finding flaws is the key to high quality peer review', can lead to 'finding flaws that are not really there'. This reinforces the need for sound statistical judgement. Statisticians may also have to counter mistaken criticisms from subject matter reviewers with limited statistical knowledge (Bacchetti, 2002).

The final part of statistical review is usually a recommendation to the journal's editor either to accept, accept with revision, revise and resubmit, or reject the paper. The distinction between the second and third is sometimes difficult and can only be made by balancing the extent and nature of the revisions against the capabilities of the authors as evinced from the submitted paper. Rejection by the statistician can also lead to provocation, especially as authors will be aware that their 'subject matter' peers have already judged it sound. In 1937 the Lancet's leading article that heralded the series of classic papers by Bradford Hill on The Principles of Medical Statistics forewarned: 'It is exasperating, when we studied a problem by methods that we have spent laborious years in mastering, to find our conclusions questioned, and perhaps refuted, by someone who could not have made the observations himself. It requires more equanimity than most of us possess to acknowledge that the fault is in ourselves.' Authors of papers are advised to read statistical reviews carefully, put them aside for 48 hours and only then start to think about how to respond. For further information and discussion see Rubinstein (2005), Smith (2005) and Ware (2005). TJ

Altman, D. G. 1998: Statistical reviewing for medical journals. Statistics in Medicine 17, 2661-74.
Altman, D. G., Gore, S. M., Gardner, M. J. and Pocock, S. J. 2000: Statistical guidelines for contributors to medical journals. In Altman, D. G., Machin, D., Bryant, T. N. and Gardner, M. J. (eds), Statistics with confidence, 2nd edition. London: BMJ Books, 171-90.
Altman, D. G., Schulz, K. F., Moher, D., Egger, M., Davidoff, F., Elbourne, D., Gøtzsche, P. C. and Lang, T. for the CONSORT Group 2001: The revised CONSORT statement for reporting randomised trials: explanation and elaboration. Annals of Internal Medicine 134, 663-94.
Bacchetti, P. 2002: Peer review of statistics in medical research: the other problem. British Medical Journal 324, 1271-73.
Gardner, M. J., Machin, D., Campbell, M. J. and Altman, D. G. 2000: Statistical checklists. In Altman, D. G., Machin, D., Bryant, T. N. and Gardner, M. J. (eds), Statistics with confidence, 2nd edition. London: BMJ Books, 191-201.
Moher, D., Schulz, K. F. and Altman, D. G. for the CONSORT Group 2001: The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomised trials. Annals of Internal Medicine 134, 657-62.
Rubinstein, L. V. 2005: Statistical review for medical journals, guidelines for authors. In Colton, T. and Armitage, P. (eds), Encyclopedia of Biostatistics, 2nd edition. Chichester: John Wiley & Sons, Ltd, pages 5190-5192.
Smith, R. 2005: Statistical review for medical journals, journal's perspective. In Colton, T. and Armitage, P. (eds), Encyclopedia of Biostatistics, 2nd edition. Chichester: John Wiley & Sons, Ltd, pages 5193-5196.
Ware, J. H. 2005: Statistical review for medical journals. In Colton, T. and Armitage, P. (eds), Encyclopedia of Biostatistics, 2nd edition. Chichester: John Wiley & Sons, Ltd, pages 5186-5190.

statistics in imaging

This is the use of statistical techniques to analyse and quantify information contained in digital image format. Imaging is widely used in medicine to visualise objects, structures and even physical processes in vivo and in vitro. A significant advantage in medical imaging is the ability to visualise structures or processes without relying on surgical operations. Thus, animals may be recycled in drug discovery and development or patients may not suffer from intrusive procedures. The ability to acquire information without intrusive procedures is also a disadvantage to medical imaging. This raises the issue of surrogate imaging ENDPOINTS (or biomarkers): i.e. how well do the conclusions from an imaging experiment correspond to physical properties obtained from an intrusive procedure?

Although the human visual system is very good at extracting information from images, the sheer amount of data being produced creates the common problem of 'not enough time to look at everything'. Statistical techniques using computers enable researchers and clinicians to summarise large numbers of images rapidly so that patterns, trends, regions of activation, etc., may be identified and quantified. Besides the amount of information, medical imaging systems also see beyond the visible light spectrum and are able to process information from a wide range of the electromagnetic spectrum. Examples of medical imaging systems include conventional radiology (X-rays), angiography (imaging of a system of blood vessels using X-rays), positron emission tomography (PET), X-ray transmission computed tomography (CT), magnetic resonance imaging (MRI), microscopy, single photon emission (computed) tomography (SPET or SPECT), spectroscopy and ultrasound imaging. Even electroencephalograms (EEGs) or magnetoencephalograms (MEGs) are examples of imaging systems, albeit with very poor spatial resolution when compared to MRI or PET.

An image is a two-dimensional function that depends on spatial coordinates, where the amplitude of the function represents the brightness or grey level of the image at a particular point. Individual elements of the image are known as picture elements, pixels for short. Images may be collected to form a three-dimensional data structure, or volume, where the individual elements are called voxels. This is common in, for example, MRI and PET, where an experiment on a single subject will involve acquiring information in three spatial dimensions and in time.

Traditional statistical techniques in image analysis include areas such as signal and morphological processing. Signal processing applications include image enhancement, image restoration, colour image processing, wavelets and compression. Morphological processing assumes that set theory may be applied to manipulate structures present in an image.

A relatively new area of research in imaging is the use of MRI in functional or pharmacological studies of the brain. Functional MRI (fMRI) is now well developed and seeks to associate brain functions (human or animal) with specific regions of the brain. Pharmacological MRI (phMRI) is relatively new and seeks to associate pharmacokinetics with specific regions of the (animal) brain. Although group studies are widespread, consider a single-subject analysis from a typical fMRI experiment. After data acquisition, a set of images associated with distinct slices of the brain is available for analysis. Each slice will have a time sequence associated with it; i.e. the imaging experiment contains both spatial and temporal information. Given knowledge of the study design, the goal is to identify regions of the brain where significant activation was observed, where activation is measured by the intensity of the signal observed in the fMRI experiment. Signal intensity is related to the ratio of oxygenated and deoxygenated blood locally in the brain.

statistics in imaging Example of an MRI slice (left) and voxel time course (right) for the voxel at (x, y) = (50, 30), plotted against time. The experimental design has been superimposed on the time course plot, where the visual stimulation is shown by a dashed line and the audio stimulation is shown by a dotted line (data provided by the Brain Mapping Unit, Department of Psychiatry, University of Cambridge)

The figure shows a typical slice from an MRI experiment, together with a voxel time course and the study design of on/off sequences for visual (dashed line) and auditory (dotted line) stimuli. Each voxel in the image has an associated time course; a mask that eliminates nonbrain voxels is typically used to focus the data analysis. LINEAR REGRESSION or, more fully, fitting the GENERALISED LINEAR MODEL (GLM), is performed on each voxel using the experimental design, convolved with a function to model the haemodynamic response of the patient, as the independent variable. Trend removal is an important step and may be applied as a preprocessing step or by incorporating low-frequency terms explicitly in the GLM. The typical assumption of independence between observations is not true in fMRI data: methods such as prewhitening, autoregressive modelling and least squares with adjustment for correlated errors are attempts to overcome the limitations of ordinary least squares.

Fitting the GLM to fMRI data may be performed on an individual voxel, on a cluster of voxels known as a region of interest (ROI), where the data are averaged in space to produce a single time course, or on every brain voxel in the image. For the first two cases, standard theory for statistical inference on regression models may be applied. For the third case, techniques such as Gaussian random field theory, resampling (see BOOTSTRAP) and adjustments by multiple comparison procedures have been used. Regardless of which method is applied, a set of voxels is obtained where significant activation during the experiment was detected. Researchers then relate the images to the anatomical regions identified in the activation image, also known as a statistical parametric map (SPM). Information from a group of patients may be combined or compared by first registering all images with a standard brain. The most common brain atlas used is the Talairach atlas.
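The single-voxel regression described in this entry can be sketched as follows. Everything specific here is a hypothetical assumption for illustration: the on/off block lengths, the crude five-point response kernel standing in for a haemodynamic response function, and the function names. The fit is plain ordinary least squares with an intercept, so it deliberately ignores the trend removal and correlated errors that the text warns about.

```python
import math


def convolve(signal, kernel):
    """Discrete convolution of signal with kernel, truncated to len(signal)."""
    out = []
    for t in range(len(signal)):
        out.append(sum(kernel[k] * signal[t - k]
                       for k in range(len(kernel)) if t - k >= 0))
    return out


def fit_voxel(y, x):
    """Ordinary least squares fit of y = b0 + b1 * x; returns (b0, b1, t)."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)      # standard error of the slope
    t = b1 / se if se > 0 else math.inf      # t statistic for activation
    return b0, b1, t


# Hypothetical design: 10 scans off, 10 scans on, repeated 5 times.
boxcar = ([0.0] * 10 + [1.0] * 10) * 5
# Hypothetical response kernel (weights sum to 1), not a fitted HRF.
hrf = [0.0, 0.2, 0.5, 0.2, 0.1]
regressor = convolve(boxcar, hrf)

# Synthetic voxel time course: baseline 100, effect 2, small alternating noise.
voxel = [100 + 2 * r + 0.5 * (-1) ** i for i, r in enumerate(regressor)]
print(fit_voxel(voxel, regressor))
```

Repeating the fit for every voxel inside the brain mask yields one t statistic per voxel, which is the raw material for a statistical parametric map; thresholding that map then has to account for the very large number of simultaneous tests.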
a random elTec:ts or fixed elTccts model (sec LINEAR MIXID &n:crs MODELS) may be used 10 apply a statistical hypothesis tesl between groups of subjects in the experiment. For more details sc:e Serra (1982). Olasbey and Horgan ( 1995), Mooncn and Bandcltini ( 1999). Oonzalez and Woods (2002) and Worsley el aJ. (2002). B\v Glasbey. C. A. aDd Horpll,O. W. 1995: Image tlIItIl),su/or the bioiogittJI stienres. Chichester: John Waley cl Sons. Ltd. Oaazaltz, R. C. aDd Woods, R. E. 2002: Digital inrage protessing. 2nd edition. Englewood Clift's, NJ: Prcatioe Hall.l\loo.... c. T. W. ad Bandeltlal,P. A. (cds) 1999: FllIIctionol MRl. Bulin: Springer-Verlag. Serra.J.I982: Image tllfolysu tllfdmothmlDliml morphology. London: Academic Pn:ss. Worsley. It. J .. LI.... C. H., Alto... J., Pe..... V.. Daco. G. H.. M ..... F.ad Evul, A. C. 2002: A geDeral statistical approacb for fMRI datL NeuroImage IS, 1. I-IS.
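The single-voxel regression described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the block design, the toy gamma-shaped response curve in `hrf_kernel`, the noise level and the "true" effect are all invented here, and the noise is simulated as independent, so the sketch deliberately ignores the autocorrelation that prewhitening or autoregressive modelling would address.

```python
import math
import random


def hrf_kernel(length=12):
    """Toy gamma-shaped response curve (an assumed shape, not a fitted HRF)."""
    raw = [(t ** 5) * math.exp(-t) for t in range(length)]
    total = sum(raw)
    return [v / total for v in raw]


def convolve(signal, kernel):
    """Causal discrete convolution, truncated to the length of the signal."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            if t - k >= 0:
                acc += w * signal[t - k]
        out.append(acc)
    return out


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical on/off design: 10 scans off, 10 scans on, repeated 6 times.
design = ([0.0] * 10 + [1.0] * 10) * 6
regressor = convolve(design, hrf_kernel())

# Simulated time course for one voxel (baseline 100, assumed true effect 2.0).
rng = random.Random(0)
true_effect = 2.0
voxel = [100.0 + true_effect * r + rng.gauss(0.0, 0.5) for r in regressor]

effect, baseline = ols_slope_intercept(regressor, voxel)
print(f"estimated effect {effect:.2f}, estimated baseline {baseline:.2f}")
```

Repeating the same fit for every voxel inside the brain mask would give one effect estimate per voxel, which is the raw material for a statistical parametric map.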

StatXact This is a specialised software package for the exact analysis of small-sample categorical and nonparametric data with special emphasis on data in the form of contingency tables. The term 'small-sample' applies equally to datasets with only a few observations, to large but unbalanced datasets or to CONTINGENCY TABLES with zeros and small cell counts in some of the cells but large cell counts in other cells. In these settings, StatXact produces exact P-VALUES and exact CONFIDENCE INTERVALS instead of relying on possibly unreliable large-sample theory for its inferences. The inference is based on generating permutation distributions of the appropriate test statistics in a conditional reference set. Different reviews of StatXact are given by Lynch, Landis and Localio (1991), Wass (2000) and Oster (2002). The current version, StatXact 6, offers exact P-values for one-, two- and K-sample problems, 2 x 2, 2 x c and r x c contingency tables and measures of ASSOCIATION. The data may be either unstratified or stratified. Both independent and blocked samples are accommodated. This version computes the exact confidence interval for ODDS RATIOS that arise from 2 x 2 and 2 x c contingency tables, as well as an exact confidence interval for the MEDIAN shift parameter in an ordered 2 x c contingency table. StatXact offers procedures that cater explicitly to binomial data, Poisson data, nominal categorical data, ordered categorical data, ordered correlated categorical data, continuous complete data and continuous right-censored data. For comparing two proportions (either from dependent or independent samples), StatXact provides the exact unconditional confidence interval for a difference in proportions or the ratio of two proportions and computes exact P-values for tests of equivalence and noninferiority. In addition to tools for exact inference, StatXact also provides exact power and sample-size calculations for study designs involving one, two or several binomial populations. In the two-binomial case, these features include exact power and sample-size calculations for designing noninferiority and equivalence studies. In case the computation of an exact P-value becomes infeasible due to the lack of either time or computing memory, StatXact produces an unbiased estimate of the exact P-value to at least two decimal digits of accuracy using efficient Monte Carlo simulation strategies (see MARKOV CHAIN MONTE CARLO). The user can arbitrarily increase the number of Monte Carlo simulations in order to increase the accuracy. StatXact 6 runs on Microsoft Windows NT/2000/XP as a standalone product. In addition, a special version, StatXact PROCs for SAS Users, is available as external SAS procedures for both the Microsoft Windows and Unix operating systems. CC/PS/CM/NP

Lynch, J. C., Landis, J. R. and Localio, A. R. 1991: StatXact. The American Statistician 45, 2, 151-4. Oster, R. A. 2002: An examination of statistical software packages for categorical data analysis using exact methods. The American Statistician 56, 3, 235-46. Wass, J. A. 2000: StatXact 4 for Windows. Biotech Software and Internet Report 1, 1, 17-23.
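To make the idea of an exact P-value from a conditional reference set concrete, the following sketch enumerates every 2 x 2 table sharing the observed margins and sums the probabilities of tables no more likely than the one observed (the usual two-sided Fisher exact test). It is a plain-Python illustration only; the function name and the example table are invented here, and nothing is implied about how StatXact itself performs the computation.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Exact two-sided P-value for the 2 x 2 table [[a, b], [c, d]],
    conditioning on both sets of margins (hypergeometric reference set)."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given the margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are no more probable than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical small table: 3 and 1 in the first row, 1 and 3 in the second.
p = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p, 4))
```

Full enumeration of this kind is only practical for small tables, which is where the Monte Carlo estimate mentioned in the entry becomes useful.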


stem-and-leaf plot Essentially, this is an enhanced HISTOGRAM in which the actual data values are retained for inspection. Observed values are each divided into a suitable 'stem' and 'leaf', e.g. the tens figure and the units figure in many examples, and then all the leaves corresponding to a particular stem are listed (usually horizontally) next to the value of the stem. An example is shown in the figure.

stem-and-leaf plot A stem-and-leaf plot for the heights in centimetres of 351 elderly women

The plot combines the visual picture of the data provided by the histogram with a display of the ordered data values.

The design of stem-and-leaf plots is discussed in Velleman and Hoaglin (1981). It is important to use a typeface for which each digit occupies equivalent space, otherwise a key feature of being 'a histogram on its side' is lost. SSE

Velleman, P. F. and Hoaglin, D. C. 1981: Applications, basics, and computing of exploratory data analysis. Boston: Duxbury.
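The construction described in this entry is simple enough to sketch directly. The example below assumes non-negative whole-number data, uses the tens figure (and above) as the stem and the units figure as the leaf, and prints one line per stem; the function name and the twelve heights are hypothetical and are not the data shown in the figure.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return display lines with the tens (and above) as stem and units as leaf.

    Assumes non-negative whole numbers; empty stems are still printed so that
    the display keeps the shape of a histogram on its side.
    """
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(int(v), 10)
        groups[stem].append(leaf)
    lines = []
    for stem in range(min(groups), max(groups) + 1):
        leaves = "".join(str(leaf) for leaf in groups.get(stem, []))
        lines.append(f"{stem:>3} | {leaves}")
    return lines


# Hypothetical heights in centimetres.
heights = [142, 145, 148, 151, 153, 153, 157, 160, 162, 162, 165, 171]
for line in stem_and_leaf(heights):
    print(line)
```

Printed in a fixed-width typeface, the length of each row of leaves is proportional to the number of observations on that stem.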

stepwise regression See LOGISTIC REGRESSION, MULTIPLE LINEAR REGRESSION

stochastic process This is any system that develops in accordance with probabilistic laws, usually in time but sometimes in space and possibly even in both time and space. For example, the spread of an epidemic is a stochastic process and its development can be tracked in time, across some terrain or at the conjunction of both time and position. The constituents of a stochastic process are its state, X say, and its indexing variables, s or t. The state is the primary measure of interest, such as number of individuals ill, while the indexing variable denotes either the time (t) or the position (s) at which the state is measured. A discrete indexing variable is usually shown as a subscript, but a continuous index appears within traditional function notation. For example, suppose that the state of the epidemic is the number of individuals who are ill. Then X_t would denote the number of individuals ill at time t if observations were taken at the start of each day, while X(s) would denote the number of individuals ill at position s measured continuously in space. Of course, the state of the process can also be either discrete (e.g. number of individuals ill) or continuous (e.g. ECG reading of a cardiac patient). An essential ingredient in a stochastic process is the dependence of either successive or neighbouring observations. Different assumptions about the dependence structure lead to different types of stochastic process, which can be used as models for many observations collected in practice. The objective is usually to derive theoretical PROBABILITIES for the various states of the system and thus to use these probabilities either for predicting the future behaviour of the system or for gaining some understanding of its mechanism. Many practical systems can be modelled adequately by assuming a Markovian dependence structure, in which the PROBABILITY DISTRIBUTION of X depends only on the most recent or neighbouring value. Standard stochastic processes that accord with such an assumption include random walks, Markov chains, branching processes, birth-and-death processes, queues and Poisson processes. Jones and Smith (2001) provide an accessible introduction to the mathematics of such processes. Some classical applications of stochastic models to medicine are described in Gurland (1964). Successful uses of Markov models in medical contexts range in time and application from the planning of patient care (Davies, Johnson and Farrow, 1975) to resource provision (Davies and Davies, 1994) and the cost-effectiveness of vaccines (Byrnes, 2002). Many more examples can be found in journals such as Health Care Management Science. WK

Byrnes, G. B. 2002: A Markov model for sample size calculation and inference in vaccine cost-effectiveness studies. Statistics in Medicine 21, 3249-60. Davies, R. and Davies, H. T. O. 1994: Modelling patient flows and resource provision in health systems. Omega, International Journal of Management Science 22, 123-31. Davies, R., Johnson, D. and Farrow, S. 1975: Planning patient care with a Markov model. Operational Research Quarterly 26, 599-607. Gurland, J. (ed.) 1964: Stochastic models in medicine and biology. Madison, WI: University of Wisconsin Press. Jones, P. W. and Smith, P. 2001: Stochastic processes: an introduction. London: Arnold.
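A Markovian dependence structure of the kind described above can be illustrated with a two-state chain for a single patient, observed once a day. The states (0 = well, 1 = ill) and the two transition probabilities are assumptions made purely for illustration; the sketch compares the simulated long-run proportion of days spent ill with the theoretical stationary probability.

```python
import random


def simulate_chain(p_well_to_ill, p_ill_to_well, steps, rng):
    """Simulate a two-state Markov chain (0 = well, 1 = ill), starting well.

    The next state depends only on the current state, which is the
    Markov property.
    """
    state = 0
    path = []
    for _ in range(steps):
        u = rng.random()
        if state == 0:
            state = 1 if u < p_well_to_ill else 0
        else:
            state = 0 if u < p_ill_to_well else 1
        path.append(state)
    return path


# Hypothetical daily transition probabilities.
P_WELL_TO_ILL = 0.1
P_ILL_TO_WELL = 0.3

rng = random.Random(1)
path = simulate_chain(P_WELL_TO_ILL, P_ILL_TO_WELL, 50_000, rng)

observed = sum(path) / len(path)
# Stationary probability of being ill for a two-state chain.
stationary = P_WELL_TO_ILL / (P_WELL_TO_ILL + P_ILL_TO_WELL)
print(f"simulated {observed:.3f}, theoretical {stationary:.3f}")
```

The theoretical value is the kind of derived probability the entry refers to; the simulation is a check that the chain behaves as the theory predicts.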

stratified randomisation See RANDOMISATION

stratified sampling Stratified sampling occurs within defined strata of some population. This should be carried out when the population contains easily identifiable subpopulations. If the sizes of the strata are different then proportional allocation should be used. If the STANDARD DEVIATIONS are known in advance then optimal or Neyman allocation can be used to minimise the VARIANCE of the estimate of the population MEAN. If they are unknown it is possible to use a pilot study to estimate the standard deviations.

The method is as follows. Define the strata that the population falls into. Decide if the strata are of a similar size and if the standard deviations are known. For similar sized strata use simple random sampling to select members of each stratum. If the sizes are different then the number in each stratum is proportional to stratum size. Then simple random sampling is used to obtain the correct number in each stratum. If the standard deviations are known in advance then, for a fixed total sample size n, the optimal allocation is obtained by choosing n_j so that:

\[ \frac{n_j}{n} = \frac{N_j S_j}{\sum_{k=1}^{s} N_k S_k} \]

where N_j is the number in the jth stratum, S_j is the standard deviation of values of items within the stratum, n is the fixed total sample size, n_j is the number to be chosen by simple random sampling from the stratum and s is the number of strata. Thornhill et al. (2000) used stratified sampling in a study of disability following head injury. The patients were stratified according to the Glasgow coma score. The mild and unclassified patients were further stratified by the presenting hospital and a simple random sample was taken. In general, if the population can be separated into distinguishable strata then the estimates from stratified sampling will be more precise than from a SIMPLE RANDOM SAMPLE and therefore it can be efficient. The disadvantages are that it can be difficult to choose the strata, it is not useful without homogeneous subgroups, it can require accurate information about the population and it can be expensive. For more details see Crawshaw and Chambers (1994) and Upton and Cook (2002). SLV

Crawshaw, J. and Chambers, J. 1994: A concise course in A-level statistics, 3rd edition. Cheltenham: Stanley Thornes Publishers Ltd. Thornhill, S., Teasdale, G. M., Murray, G. D., McEwen, J., Roy, C. W. and Penny, K. I. 2000: Disability in young people and adults one year after head injury: prospective cohort study. British Medical Journal 320, 1631-5. Upton, G. and Cook, I. 2002: Dictionary of statistics. Oxford: Oxford University Press.
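The two allocation rules discussed in this entry can be compared with a short calculation. The stratum sizes, standard deviations and total sample size below are hypothetical; the functions return fractional allocations, which in practice would be rounded to whole numbers (a step left out of this sketch).

```python
def proportional_allocation(sizes, n):
    """Sample size per stratum proportional to stratum size N_j."""
    total = sum(sizes)
    return [n * N / total for N in sizes]


def neyman_allocation(sizes, sds, n):
    """Sample size per stratum proportional to N_j * S_j (Neyman allocation)."""
    weights = [N * S for N, S in zip(sizes, sds)]
    total = sum(weights)
    return [n * w / total for w in weights]


# Hypothetical strata: sizes N_j, standard deviations S_j, total sample n.
sizes = [500, 300, 200]
sds = [2.0, 5.0, 10.0]
n = 100

prop = proportional_allocation(sizes, n)
neyman = neyman_allocation(sizes, sds, n)
print("proportional:", [round(v, 1) for v in prop])
print("Neyman:      ", [round(v, 1) for v in neyman])
```

With these illustrative figures the smallest stratum receives the largest share under Neyman allocation because its values are the most variable.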

structural equation modelling software The four most commonly used packages for fitting structural equation models are:

EQS (http://www.mvsoft.com/)
LISREL (http://www.ssicentral.com/)
MPlus (http://www.statmodel.com/order.html)
AMOS (http://www.amosdevelopment.com/)

All four allow the fitting of complex models relatively easily, although MPlus is possibly the most flexible. Each package's website provides specific information on their capabilities, as well as availability and cost. SSE

structural equation models

The operational definition provided by Pearl (2000, p. 160) states: 'An equation y = βx + ε is said to be structural if it is to be interpreted as follows: In an ideal experiment where we control X to x and any other set Z of variables (not containing X or Y) to z, the value of Y is given by βx + ε, where ε is not a function of the settings x and z.' The key word here is 'control'. We are observing values of Y after manipulating or fixing the values of X. The model implies that the values of Y, in fact, are determined by the values of X. A structural equation model is a description of the causal effect of X on Y. It is a CAUSAL MODEL, and the parameter β is a measure of the causal effect of X on Y. It should be clearly distinguished from a linear regression equation that simply describes the ASSOCIATION between two random variables, X and Y. If we are able, in practice, to intervene and control the values of X (by random allocation, for example) then it is straightforward to use the resulting data to obtain a valid estimate of β. If, however, we do not have control of X, but can only observe the values of X and Y (and Z), as in an epidemiological or other type of OBSERVATIONAL STUDY, for example, this does not invalidate the above operational definition, but the challenge for the data analyst is to find a valid (i.e. unbiased) estimate of the causal parameter β under these circumstances. The equation y = βx + ε is, of course, a description of a very simple structural model. It is common to collect data on several response variables (Ys) and several explanatory variables (Xs) and to construct a series of structural equations of the following form:

(i = 1 to I; j, k = 1 to J)    (1)

in which several of the β values will be fixed to be zero, a priori. The others are to be estimated from the data. The form of the equations defined in (1), i.e. the structural theory that determines the pattern of β values to be estimated and those fixed at zero, is determined by the investigator's prior knowledge or hypotheses concerning the causal processes generating the data. Quoting Byrne (1994, p. 3): 'Structural equation modeling (SEM) is a statistical methodology that takes a hypothesis-testing (i.e. confirmatory) approach to the multivariate analysis of a structural theory bearing on some phenomenon.' Typically, SEM involves (a) the specification of a set of structural equations, (b) representation of these structural equations using a graphical model (a path diagram; see later), (c) simultaneously fitting the set of structural equations to a given set of data in order to estimate the β values and to test the adequacy of the model. If the model fails to fit then the investigator may revise the model and try again. The success of the exercise is likely to be highly dependent upon the quality of the investigator's prior knowledge of the likely


causal mechanisms under test and how much thought he or she has given to the design of the study in the first place. Good design and subsequent statistical analyses require technical knowledge, skill and experience. For technical knowledge, readers are referred to introductory texts by Dunn, Everitt and Pickles (1993), Byrne (1994) and Shipley (2000), and to the advanced monograph by Bollen (1989). Discussion of SEM in the context of recent work on causal inference can be found in Pearl (2000, 2009) and, again, in Shipley (2000). Traditionally, SEM has concentrated on structural models for quantitative data, which are usually assumed to be multivariate normal. Extensions from the traditional linear structural equations (i.e. LINEAR REGRESSION) to generalized linear structural equations are discussed by Skrondal and Rabe-Hesketh (2004). It is frequently the case that we cannot measure constructs directly, or at least not without considerable MEASUREMENT ERROR. This gives rise to the idea of LATENT VARIABLES. These are characteristics that are not directly observable. They may be straightforward concepts such as height, weight, amount of exposure to a known toxin, or concentration of a given metabolite in blood or urine, but we explicitly allow that they cannot be measured without error. The observed measurement is a manifest or indicator variable, while the corresponding unknown, but true, value is a latent variable. However, latent variables may be more abstract theoretical constructs that are introduced to explain COVARIANCE between manifest or indicator variables. An example of this last type is the set of scores on a battery of cognitive tests that are assumed in some way to reflect a subject's cognitive ability or general intelligence. Another example could be a set of symptom severity scores (the manifest variables), which are assumed to be indicators of a patient's overall degree of depression (the latent variable). Typically, a data analyst will propose a formal measurement model (usually equivalent to some form of factor analysis representation) to relate the observed measurements with the underlying latent variables. We can then proceed to propose structural or causal hypotheses involving the latent variables instead of the fallible (error-prone) indicators. We start, for example, with a COVARIANCE MATRIX for the observed variables. We fit a general structural equation model to this covariance or moments matrix. This procedure will involve the simultaneous fitting of the measurement equations for the relevant latent variables and their corresponding indicators and of the structural equations thought to reflect the assumed causal relationships between the latent variables. Specialist software packages are now widely available for such analyses (see STRUCTURAL EQUATION MODELLING SOFTWARE).

Structural equation models are very often represented by a graphical structure known as a path diagram (see PATH ANALYSIS). In a path diagram the proposed relationships between variables (whether latent or observed) are represented either by a single-headed arrow (indicating the direction of a causal effect) or a double-headed one (indicating correlation). The observed or manifest variables are usually placed within a rectangular or square box, while latent variables are placed within an oval or a circle. Random measurement errors and residuals from structural equations, although they are strictly speaking latent variables, are not traditionally placed within a circle or oval. Path diagrams are very closely related to the graphical representations (directed acyclic graphs, or DAGs, for example; see GRAPHICAL MODELS) that have relatively recently been developed elsewhere (see Pearl, 2000, 2009, for example). Two simple examples of path diagrams are shown in the two figures. A detailed explanation will be given in the following section.

structural equation models Path diagram to represent the structural equations linking encouragement to stop smoking during pregnancy (Z), the amount smoked during pregnancy (X) and the birth weight of the child (Y). D_X and D_Y are randomly distributed residuals

structural equation models Path diagram to represent the structural equations linking encouragement to stop smoking during pregnancy (Z), the true amount smoked during pregnancy (T_X) and the birth weight of the child (Y). D_X and D_Y are randomly distributed residuals. X1 and X2 are error-prone indicators of smoking, with uncorrelated measurement errors E1 and E2 respectively

For an example, Permutt and Hebel (1989) describe a trial in which pregnant women were randomly allocated to receive encouragement to reduce or stop their cigarette smoking during pregnancy (the treatment group) or not (the control group), indicated by the binary variable, Z. An intermediate outcome variable (X) was the amount of cigarette smoking recorded during pregnancy. The ultimate outcome (Y) was the birth weight of the newborn child. Smoking is likely to have been reduced in the group subject to encouragement, but also in the control group (although, presumably, to a lesser extent). There are also likely to be hidden confounders (e.g. other health promoting behaviours) that are associated with both the mother's smoking during pregnancy and the child's birth weight. Smoking (X) is an endogenous treatment variable: the above confounding will result in the residual from a structural equation model to explain the level of smoking by RANDOMISATION to receive encouragement being correlated with the residual from the structural equation linking observed levels of smoking to the birth weight of the child. We assume that there is no direct effect of randomization (Z) on outcome (Y): the effect of Z on Y is an indirect one through smoking (X); i.e. Z is an INSTRUMENTAL VARIABLE. Ignoring the intercept terms, the two structural equations are the following:

\[ X = \gamma Z + D_X \quad \text{and} \quad Y = \beta X + D_Y \]

In fitting these two models to the appropriate data we acknowledge the correlation (ρ) between the residuals, D_X and D_Y (those components of X and Y not explained by Z and X respectively). The overall model is illustrated by the first figure (page 444). Now, what if we acknowledge that smoking levels cannot be measured accurately and we decide to obtain two different measurements on each person in the trial (X1 and X2, say, being self-reported numbers of packs per day, obtained at 6 months and 8 months into the pregnancy)? The true level of smoking is now represented by the variable T_X. Our measurement model is represented by the two equations:

\[ X_1 = T_X + E_1 \quad \text{and} \quad X_2 = T_X + E_2 \]

We assume that the E1 and E2 measurement errors are uncorrelated and that there is no change in the true level of smoking between the two times. The revised structural equations now use T_X rather than X, as follows:

\[ T_X = \gamma Z + D_X \quad \text{and} \quad Y = \beta T_X + D_Y \]

The corresponding path diagram is shown in the second figure (page 444). Note that not all of the model parameters implied by the model in the second figure can be estimated. The model is too complex for the data at hand. The model as a whole is said to be underidentified, but the good news is that we can still estimate β, the parameter most likely to be of interest to the investigator. Problems of underidentification are beyond the scope of this entry, but are covered by the standard textbooks on structural equations modelling referenced below. GD

Bollen, K. A. 1989: Structural equations with latent variables. New York: John Wiley & Sons, Inc. Byrne, B. M. 1994: Structural equation modeling with EQS and EQS/Windows. Thousand Oaks, CA: Sage Publications. Dunn, G., Everitt, B. S. and Pickles, A. 1993: Modelling covariances and latent variables using EQS. London: Chapman & Hall. Pearl, J. 2000: 2nd edition, 2009: Causality. Cambridge: Cambridge University Press. Permutt, T. and Hebel, J. R. 1989: Simultaneous-equation estimation in a clinical trial of the effect of smoking on birth weight. Biometrics 45, 619-22. Shipley, B. 2000: Cause and correlation in biology. Cambridge: Cambridge University Press. Skrondal, A. and Rabe-Hesketh, S. 2004: Generalized latent variable modeling: multilevel, longitudinal and structural equation models. Boca Raton, FL: Chapman & Hall/CRC.
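The role of the instrumental variable in the smoking example can be illustrated by simulation. The sketch below generates data from the two structural equations X = γZ + D_X and Y = βX + D_Y with a hidden confounder shared by the two residuals, then compares the naive regression slope of Y on X with the simple instrumental-variable ratio cov(Z, Y)/cov(Z, X). All numerical values (γ, β, the confounder and noise variances, the sample size) are invented for illustration, and the ratio estimator is a stand-in for, not a reproduction of, the simultaneous fitting carried out by SEM software.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (n - 1)


rng = random.Random(42)
n = 20_000
gamma, beta = -1.0, -0.5   # hypothetical structural parameters

Z, X, Y = [], [], []
for _ in range(n):
    z = rng.randint(0, 1)            # randomised encouragement (0 or 1)
    u = rng.gauss(0.0, 1.0)          # hidden confounder
    d_x = u + rng.gauss(0.0, 1.0)    # residual of the smoking equation
    d_y = -u + rng.gauss(0.0, 1.0)   # residual of the outcome equation
    x = gamma * z + d_x
    y = beta * x + d_y
    Z.append(z)
    X.append(x)
    Y.append(y)

naive = cov(X, Y) / cov(X, X)   # biased: D_X and D_Y are correlated
iv = cov(Z, Y) / cov(Z, X)      # uses only the variation in X induced by Z
print(f"true beta {beta}, naive slope {naive:.2f}, IV estimate {iv:.2f}")
```

Because Z is randomised and affects Y only through X, the ratio recovers β even though the residuals are correlated, whereas the naive slope absorbs the confounding.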

Student's t-distribution See t-DISTRIBUTION

Student's t-test William Sealy Gosset, who worked under the pseudonym of 'Student', developed the Student's t-test. The Student's t-test is commonly referred to merely as the t-test. The simplest use of the t-test is in comparing the MEAN of a sample to some specified population mean; this is usually called the one-sample t-test. The t-test can be modified to compare the means of two independent samples (the two-sample t-test) and for paired data to compare the differences between the pairs (the paired t-test). Student's t-test is a parametric test and certain assumptions are made about the data. These are that the observations within each group (with independent samples) or the differences (with paired samples) are approximately normally distributed and for the two-sample case we also require the two groups to have similar VARIANCES. If the sample data do not meet the assumptions then the analysis is seriously flawed. However, the t-test is 'robust' and is not greatly affected by a moderate failure to meet the assumptions.

The one-sample t-test can be used to compare the mean of a sample to a certain specified value. This value is usually the population mean. The NULL HYPOTHESIS states that there is no significant difference between the sample mean and the population mean and the alternative hypothesis states that there is a significant difference between the sample mean and the population mean. The assumption we make is that the data are a random sample of independent observations from an underlying normal distribution. The test statistic is given by:

\[ t = \frac{\text{sample mean} - \text{hypothesised mean}}{\text{standard error of sample mean}} \]

This is compared against the t-DISTRIBUTION with n - 1 DEGREES OF FREEDOM, where n is the sample size. So t is the deviation of a normal variable from its hypothesised mean measured in STANDARD ERROR units. The standard error of the sample mean is calculated by STANDARD DEVIATION/√n, where n is the sample size.

For example, suppose BMI values for a sample of 25 people were measured and a mean value of 24.5 was found with a sample standard deviation of 2.5. To test if this sample mean BMI is significantly different from a population mean BMI of 26 we can use the one-sample t-test, where our null hypothesis is that there is no difference between the sample mean of 24.5 and the population mean of 26. This allows us to calculate the test statistic as follows:

\[ t = \frac{24.5 - 26}{2.5/\sqrt{25}} = -3.0 \]

Using the t-distribution with (n - 1) = 24 degrees of freedom, we find a P-VALUE of 0.0062. The result is statistically significant and we therefore reject the null hypothesis in favour of the alternative hypothesis that the mean BMI of the sample is significantly different from 26.

We can use the two-sample t-test to determine the statistical significance of an observed difference between the mean values of some variable between two subgroups or between separate populations. For example, we could look at the differences in heights between males and females. The test statistic for the two-sample t-test is given by:

\[ t = \frac{\text{difference in sample means} - \text{difference in hypothesised means}}{\text{standard error of the difference in the two sample means}} \]

Frequently the null hypothesis of interest is whether the two groups have equal means and the corresponding two-sided alternative hypothesis is that the means are in fact different. For example, when comparing the mean outcome for two different treatments is the difference in means observed a statistically significant one? In this case the test statistic reduces to:

\[ t = \frac{\text{difference in the two sample means}}{\text{standard error of the difference in the two sample means}} \]

This is then compared to the t-distribution with n_1 + n_2 - 2 degrees of freedom, where n_1 is the sample size for the first group and n_2 is the sample size for the second group. The standard error of the difference in the two sample means is given by:

\[ \mathrm{SE}(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \, \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \]

and s_1 and s_2 are the standard deviations for groups one and two respectively.

For the paired t-test, the data are dependent, i.e. there is a one-to-one correspondence between the values in the two samples. Paired data can occur from two measurements on the same person, e.g. before and after treatment or the same subject measured at different times. It is incorrect to analyse paired data ignoring the pairing in such circumstances, as important information is lost. Some factors you do not control in the experiment will affect the before and the after measurements equally, so they will not affect the difference between before and after. By looking only at the differences, a paired t-test corrects for these factors. The two-sample paired t-test usually tests the null hypothesis that the population mean of the paired differences of the two samples is zero. We assume that the paired differences are independent. To perform the paired t-test we calculate the difference between each set of pairs and then perform a one-sample t-test on the differences with the null hypothesis that the population mean of the differences is equal to zero. More details can be found in Altman (1991). MMB

Altman, D. G. 1991: Practical statistics for medical research. London: Chapman & Hall.
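The BMI example from this entry, and the reduction of the paired test to a one-sample test on the differences, can be checked with a short standard-library sketch. The P-value is obtained here by numerically integrating the t density with Simpson's rule, which is a convenience chosen for this illustration rather than the method a statistics package would use; the function names and the small before/after dataset are hypothetical.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=20_000):
    """Two-sided P-value: 1 minus the area under the density on [-|t|, |t|].

    The area on [0, |t|] is found by Simpson's rule (steps must be even)
    and doubled by symmetry.
    """
    a = abs(t)
    h = a / steps
    s = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * s * h / 3


def one_sample_t(mean, mu0, sd, n):
    """Return (t statistic, degrees of freedom, two-sided P-value)."""
    se = sd / math.sqrt(n)
    t = (mean - mu0) / se
    return t, n - 1, two_sided_p(t, n - 1)


def paired_t(before, after):
    """Paired t-test as a one-sample test on the differences (after - before)."""
    d = [b - a for a, b in zip(before, after)]
    return one_sample_t(statistics.mean(d), 0.0, statistics.stdev(d), len(d))


# BMI example from the text: n = 25, mean 24.5, SD 2.5, hypothesised mean 26.
t, df, p = one_sample_t(24.5, 26.0, 2.5, 25)
print(f"t = {t:.1f}, df = {df}, P = {p:.4f}")

# Hypothetical before/after measurements on four subjects.
t_pair, df_pair, p_pair = paired_t([10, 12, 14, 16], [11, 14, 15, 18])
print(f"paired t = {t_pair:.3f}, df = {df_pair}")
```

The first call reproduces the test statistic of -3.0 on 24 degrees of freedom given in the worked example.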

subgroup analysis This form of analysis is often employed in CLINICAL TRIALS in an attempt to identify particular subgroups of patients for whom a treatment works better (or worse) than for the overall patient population. For example, does a treatment work better for men than for women? Such a question is a natural one for clinicians to ask since they do not treat 'average' patients and, when confronted with a female patient with a certain condition, would like to know whether the accepted treatment for the condition works, say, less well for women. Assessing whether the effect of treatment varies according to the value of one or more patient characteristics is relatively straightforward from a statistical viewpoint, involving nothing more than testing a treatment by covariate interaction. However, many statisticians would caution against such analyses and, if undertaken at all, suggest that they are interpreted extremely cautiously in the spirit of 'exploration' rather than anything more formal. The reasons for such caution are not difficult to identify. First, trials can rarely provide sufficient POWER to detect such subgroup/interaction effects; clinical trials accrue sufficient participants to provide adequate precision for estimating quantities of primary interest, usually overall treatment effects. Confining attention to subgroups almost always results in estimates of inadequate precision. A trial just large enough to evaluate an overall treatment effect reliably will almost inevitably lack precision for evaluating differential treatment effects between different population subgroups. Second, RANDOMISATION ensures that the overall treatment groups in a clinical trial are likely to be comparable. Subgroups may not enjoy the same degree of balance in patient characteristics. Finally, there are often many possible prognostic factors in the baseline data, e.g. age, gender, race, type or stage of disease, from which to form subgroups, so that analyses may quickly degenerate into 'data dredging', from which arises the potential for post hoc emphasis on the subgroup analysis giving results of most interest to the investigator, with emphasis given to results deemed 'statistically significant' contributing, in turn, to a preponderance of 'p < 0.05' results published in the medical literature (an excess of false positive findings, therefore). Other potential dangers of subgroup analysis can be found in detail in Pocock et al. (2002). SSE

Pocock, S. J., Assmann, S. E., Enos, L. E. and Kasten, L. E. 2002: Subgroup analysis, covariate adjustment and baseline comparisons in clinical trial reporting: current practice and problems. Statistics in Medicine 21, 2917-30.
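The 'data dredging' concern can be made concrete with a small calculation. Assuming, purely for illustration, that k subgroup tests are independent, that each is carried out at the 5% level and that no subgroup has any true effect, the chance of at least one 'p < 0.05' result is 1 - 0.95^k. Real subgroup tests are rarely independent, so the figures below are indicative only.

```python
def prob_at_least_one_false_positive(k, alpha=0.05):
    """Chance of one or more significant results among k independent
    tests at level alpha when every null hypothesis is true."""
    return 1 - (1 - alpha) ** k


for k in (1, 5, 10, 20):
    print(k, round(prob_at_least_one_false_positive(k), 3))
```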

sufficient-component cause model See CAUSAL MODELS

summary measure analysis This is a relatively straightforward approach to the analysis of LONGITUDINAL DATA, in which the repeated measurements of a response variable made on each individual in the study are reduced in some way to a single number that is considered to capture an essential feature of the response over time. In this way, the multivariate nature of the repeated observations is transformed to a univariate measure. The approach has been in use for many years; see, for example, Oldham (1962) and Matthews et al. (1989). The most important consideration when applying a summary measure analysis is the choice of a suitable summary measure, a choice that needs to be made before any data are collected. The measure chosen needs to be relevant to the particular questions of interest in the study and in the broader scientific context in which the study takes place. A wide range of summary measures has been proposed, as shown in the first table. According to Frison and Pocock (1992), the average response over time is often likely to be the most relevant, particularly in CLINICAL TRIALS. Having chosen a suitable summary measure, analysis will involve nothing more complicated than the application of Student's t-test or calculation of a CONFIDENCE INTERVAL for the group difference when two groups are being compared or a one-way ANALYSIS OF VARIANCE when there are more than two groups. If considered more appropriate because of the distributional properties of the selected summary measure, then analogous NONPARAMETRIC METHODS might be used. The summary measure approach can be illustrated using the data shown in the second table, which come from a study of alcohol dependence. Two groups of subjects, one with severe dependence and one with moderate dependence on alcohol, had their salsolinol excretion levels (in millimoles) recorded on four consecutive days.

summary measure analysis Salsolinol excretion data

                      Day
Subject      1       2       3       4
Group 1 (moderate dependence)
 1          0.33    0.70    2.33    3.20
 2          5.30    0.90    1.80    0.70
 3          2.50    2.10    1.12    1.01
 4          0.98    0.32    3.91    0.66
 5          0.39    0.69    0.73    2.45
 6          0.31    6.34    0.63    3.86
Group 2 (severe dependence)
 7          0.64    0.70    1.00    1.40
 8          0.73    1.85    3.60    2.60
 9          0.70    4.20    7.30    5.40
10          0.40    1.60    1.40    7.10
11          2.50    1.30    0.70    0.70
12          7.80    1.20    2.60    1.80
13          1.90    1.30    4.40    2.80
14          0.50    0.40    1.10    8.10

summary measure analysis Possible summary measures (from Matthews et al., 1989)

Type of data   Question of interest                                 Summary measure
Peaked         Is overall value of outcome variable the same        Overall mean (equal time intervals) or area
               in different groups?                                 under curve (unequal intervals)
Peaked         Is maximum (minimum) response different              Maximum (minimum) value
               between groups?
Peaked         Is time to maximum (minimum) response                Time to maximum (minimum) response
               different between groups?
Growth         Is rate of change of outcome different               Regression coefficient
               between groups?
Growth         Is eventual value of outcome different               Final value of outcome or difference between
               between groups?                                      last and first values or percentage change
                                                                    between first and last values
Growth         Is response in one group delayed relative            Time to reach a particular value (e.g. a fixed
               to the other?                                        percentage of baseline)
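Several of the summary measures listed in the table can be computed for one subject with a few lines of code. The following is a minimal sketch, not a prescribed implementation: the function name and the dictionary keys are hypothetical, the area under the curve uses the trapezium rule, and the regression coefficient is the ordinary least squares slope of response on time. The example series is the first subject of the salsolinol data as read from the data table.

```python
def summary_measures(times, values):
    """Reduce one subject's repeated measurements to candidate summary measures."""
    n = len(values)
    mean_value = sum(values) / n
    max_value = max(values)
    time_to_max = times[values.index(max_value)]
    # Area under the curve by the trapezium rule (handles unequal intervals).
    auc = sum((times[i + 1] - times[i]) * (values[i] + values[i + 1]) / 2.0
              for i in range(n - 1))
    # Least squares slope of response on time (rate of change).
    t_bar = sum(times) / n
    slope = (sum((t - t_bar) * (v - mean_value) for t, v in zip(times, values))
             / sum((t - t_bar) ** 2 for t in times))
    return {
        "mean": mean_value,
        "max": max_value,
        "time_to_max": time_to_max,
        "auc": auc,
        "slope": slope,
        "final": values[-1],
        "change": values[-1] - values[0],
    }


days = [1, 2, 3, 4]
subject_1 = [0.33, 0.70, 2.33, 3.20]
measures = summary_measures(days, subject_1)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Whichever measure is used, each subject contributes a single number, so the between-group comparison becomes a univariate problem.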


Using the mean of the four measurements available for each subject as the summary measure leads to the results shown in the third table. There is no evidence of a group difference in salsolinol excretion levels.

summary measure analysis Results from using the mean as a summary measure for the data in the second table

          Moderate   Severe
Mean      1.80       2.49
sd        0.60       1.09
n         6          8
t = -1.40, df = 12, p = 0.19
95% CI: [-1.77, 0.39]
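The calculation behind the results table can be sketched as follows. This is an illustration only: the data values are my transcription of the salsolinol table and should be treated as an assumption, the function names are hypothetical, and the two-sided 5% critical value of Student's t with 12 degrees of freedom (about 2.179) is entered as a constant to avoid needing a statistics library.

```python
import math
from statistics import mean, stdev

# Salsolinol excretion on four consecutive days (transcribed; an assumption).
moderate = [
    [0.33, 0.70, 2.33, 3.20], [5.30, 0.90, 1.80, 0.70], [2.50, 2.10, 1.12, 1.01],
    [0.98, 0.32, 3.91, 0.66], [0.39, 0.69, 0.73, 2.45], [0.31, 6.34, 0.63, 3.86],
]
severe = [
    [0.64, 0.70, 1.00, 1.40], [0.73, 1.85, 3.60, 2.60], [0.70, 4.20, 7.30, 5.40],
    [0.40, 1.60, 1.40, 7.10], [2.50, 1.30, 0.70, 0.70], [7.80, 1.20, 2.60, 1.80],
    [1.90, 1.30, 4.40, 2.80], [0.50, 0.40, 1.10, 8.10],
]


def subject_means(group):
    """Summary measure: the mean of each subject's repeated measurements."""
    return [mean(row) for row in group]


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance; returns (diff, se, t, df)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    diff = mean(a) - mean(b)
    return diff, se, diff / se, na + nb - 2


T_CRIT_DF12 = 2.179  # two-sided 5% point of t on 12 df

m, s = subject_means(moderate), subject_means(severe)
diff, se, t_stat, df = pooled_t(m, s)
ci_low, ci_high = diff - T_CRIT_DF12 * se, diff + T_CRIT_DF12 * se

print(f"moderate: mean {mean(m):.2f}, sd {stdev(m):.2f}, n {len(m)}")
print(f"severe:   mean {mean(s):.2f}, sd {stdev(s):.2f}, n {len(s)}")
print(f"t = {t_stat:.2f} on {df} df, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

With these values the group means, standard deviations and confidence interval agree with the table; the t statistic comes out at about -1.39, in line with the reported -1.40.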

A possible alternative to the use of the MEAN as a summary measure is to use the maximum excretion rate recorded over the four days. Applying the WILCOXON RANK SUM TEST to this summary measure results in a test statistic of 36 and associated P-VALUE of 0.21. The summary measure approach to the analysis of longitudinal data can accommodate missing data but the implicit assumption is that these are missing completely at random (see DROPOUTS). SSE

[See also AREA UNDER CURVE]

Frison, L. and Pocock, S. J. 1992: Repeated measures in clinical trials: analysis using mean summary statistics and its implications for design. Statistics in Medicine 11, 1685-704. Matthews, J. N. S., Altman, D. G., Campbell, M. J. and Royston, P. 1989: Analysis of serial measurements in medical research. British Medical Journal 300, 230-5. Oldham, P. D. 1962: A note on the analysis of repeated measurements of the same subjects. Journal of Chronic Diseases 15, 969-77.

support vector machines

These are algorithms for learning complex classification and regression functions, belonging to the general family of 'kernel methods' discussed later. Their computational and statistical efficiency recently made them one of the tools of choice in certain biological DATA MINING applications.

Support vector machines (SVMs) work by embedding the data into a feature space by means of kernel functions (the so-called 'kernel trick'). In the binary classification case, a hyperplane that separates the two classes is sought in this feature space. New data points are then classified into one of the two classes according to their position with respect to this hyperplane. SVMs owe their name to their property of isolating an (often small) subset of data points called 'support vectors', which have interesting theoretical properties.
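The 'kernel trick' mentioned above can be illustrated with a toy calculation. The sketch below is not an SVM; it only shows, for one simple kernel chosen for illustration (the homogeneous polynomial kernel of degree 2 on two-dimensional inputs), that evaluating the kernel in the original space gives the same number as an inner product in an explicit feature space. All function names are hypothetical.

```python
import math


def dot(u, v):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, z):
    """Degree-2 polynomial kernel k(x, z) = (x . z)^2, computed in the input space."""
    return dot(x, z) ** 2


def poly2_features(x):
    """Explicit feature map for 2-d inputs whose inner product equals poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


x, z = (1.0, 2.0), (3.0, -1.0)
k_implicit = poly2_kernel(x, z)
k_explicit = dot(poly2_features(x), poly2_features(z))
print(k_implicit, k_explicit)
```

An algorithm written purely in terms of inner products can therefore work in the feature space while only ever calling the kernel function, which is what makes kernels on strings, trees or graphs usable.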

The SVM approach has several important virtues when compared with earlier approaches: the choice of the hyperplane is founded on statistical arguments; the hyperplane can be found by solving a convex (quadratic) optimisation problem, which means that training an SVM is not subject to local minima; when a nonlinear kernel function is used, the hyperplane in the feature space can correspond to a complex (nonlinear) decision boundary in the original data domain. Even more interestingly, kernel functions can be defined not only on vectorial data but on virtually any kind of data, making it possible to classify strings, images, trees or nodes in a graph; the classification of unseen data points is generally computationally cheap and depends on the number of support vectors. First introduced in 1992, support vector machines are now one of the standard tools in PATTERN RECOGNITION applications, mostly due to their computational efficiency and statistical stability. In recent years, extensions of this algorithm to deal with a number of important data analysis tasks have been proposed, resulting in the general family of 'kernel methods' (Shawe-Taylor and Cristianini, 2004) (see DENSITY ESTIMATION).

The kinds of relation detected by kernel methods include classifications, regressions, clustering (see CLUSTER ANALYSIS IN MEDICINE), principal components (see PRINCIPAL COMPONENT ANALYSIS), canonical correlations (see CANONICAL CORRELATION ANALYSIS) and many others. In the same way as with SVMs, the kernel trick allows these methods to be applied in a feature space that is induced by the kernel, making kernel methods applicable to virtually any kind of data. Elegantly, the development of kernel methods can always be decomposed into two modular steps: the kernel design, on the one hand, and the choice of the algorithm, on the other hand. The kernel design part implicitly defines the feature space, which should contain all available information that is relevant for the problem at hand. The choice of the algorithm (which needs to be written in terms of kernels) can be made independently of the kernel design. As with SVMs, most kernel methods reduce their training phase to optimising a convex cost function or to solving a simple eigenvalue problem, hence avoiding one of the main computational pitfalls of NEURAL NETWORKS. However, since they often implicitly make use of very high dimensional spaces, kernel methods run the risk of overfitting. For this reason, their design needs to incorporate principles of statistical learning theory, which help to identify the crucial parameters that need to be controlled in order to avoid this risk (see Vapnik, 1995). For further reference on SVMs, see Cristianini and Shawe-Taylor (2000). NC/TDB

Cristianini, N. and Shawe-Taylor, J. 2000: An introduction to support vector machines. Cambridge: Cambridge University Press (www.support-vector.net). Shawe-Taylor, J. and Cristianini, N. 2004: Kernel methods for pattern analysis. Cambridge: Cambridge University Press (www.kernel-methods.net). Vapnik, V. 1995: The nature of statistical learning theory. New York: Springer.

surrogate endpoints These are ENDPOINTS that can replace a clinical endpoint for the purpose of assessing the effects of new treatments earlier, at lower cost, or with greater statistical SENSITIVITY. Surrogate endpoints can include measurements of a biomarker, defined as 'a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention' (Biomarkers Definitions Working Group, 2001). Use of a biomarker as a surrogate endpoint can also be useful if the final endpoint measurement is unduly invasive or uncomfortable. For an endpoint to be a surrogate for a clinical endpoint it must be a measure of disease such that: (a) the size (or frequency) correlates strongly with that clinical endpoint (e.g. blood pressure is positively correlated with the risk of stroke) and (b) treatments producing a change in the surrogate endpoint also modify the risk of that particular clinical endpoint (e.g. reducing blood pressure reduces the risk of stroke).

Surrogate endpoints are routinely used in early drug development, where interest focuses on showing that new treatments have enough activity to warrant further research. In confirmatory PHASE III TRIALS, however, interest focuses on showing that new treatments have the anticipated clinical benefits, and in such situations surrogate endpoints can only be used if they have undergone rigorous statistical evaluation (or 'validation') (Burzykowski, Molenberghs and Buyse, 2005). Indeed, some promising surrogate endpoints have proven to be unreliable predictors of clinical benefits. For example, cardiac arrhythmia was believed to be a good surrogate endpoint for mortality after an acute heart attack, since in these circumstances patients with a higher risk of such an arrhythmia have a greater risk of death. However, several drugs (e.g. lignocaine, flecainide) that prevent arrhythmias after a heart attack actually increase mortality (Echt et al., 1991). Similarly, some blood pressure-lowering drugs (such as angiotensin-converting enzyme inhibitors) have much larger effects on vascular mortality than might be predicted from their effects on blood pressure (Heart Outcomes Prevention Evaluation Study Investigators, 2000). In contrast, disease-free survival has recently been validated as an acceptable surrogate for overall survival in patients with colorectal cancer treated with fluoropyrimidines (Sargent et al., 2005).

Prentice (1989) proposed a definition and operational criteria for the validation of surrogate endpoints. Although the strict criteria proposed by Prentice seem too stringent to ever be met in practice, his landmark paper sparked interest in developing statistical methods that could be used to show that a surrogate is acceptable (or 'validated') for the purposes of assessing a specific class of treatments in a specific disease setting. One approach consists of using a MULTILEVEL MODEL to show that the surrogate endpoint predicts the true endpoint ('individual-level' surrogacy), and that the effects of a treatment on the surrogate endpoint predict the effects of the treatment on the true endpoint ('trial-level' surrogacy) (Buyse et al., 2000). The latter condition requires data to be available from several units, usually from a META-ANALYSIS of several trials. Another approach consists of using a CAUSAL MODEL to compare the causal effect of treatment on the true endpoint in patients for whom treatment does, and does not, affect the surrogate. See Weir and Walley (2006) for a review of the terminology and surrogate validation models. CB/MB

Biomarkers Definitions Working Group 2001: Biomarkers and surrogate endpoints: preferred definitions and conceptual framework. Clinical Pharmacology and Therapeutics 69, 89-95. Burzykowski, T., Molenberghs, G. and Buyse, M. (eds) 2005: Evaluation of surrogate endpoints. Springer-Verlag. Buyse, M., Molenberghs, G., Burzykowski, T., Renard, D. and Geys, H. 2000: The validation of surrogate endpoints in meta-analyses of randomized experiments. Biostatistics 1, 49-67. Echt, D. S., Liebson, P. R., Mitchell, L. B. et al. and the Cardiac Arrhythmia Suppression Trial (CAST) Investigators 1991: Mortality and morbidity in patients receiving encainide, flecainide, or placebo: the cardiac arrhythmia suppression trial. New England Journal of Medicine 324, 781-8. Heart Outcomes Prevention Evaluation Study Investigators 2000: Effects of an angiotensin-converting enzyme inhibitor, ramipril, on death from cardiovascular causes, myocardial infarction, and stroke in high-risk patients. New England Journal of Medicine 342, 145-53. Prentice, R. L. 1989: Surrogate endpoints in clinical trials: definition and operational criteria. Statistics in Medicine 8, 431-40. Sargent, D., Wieand, S., Haller, D. G. et al. 2005: Disease-free survival (DFS) vs overall survival (OS) as a primary endpoint for adjuvant colon cancer studies: individual patient data from 20,898 patients on 18 randomized trials. Journal of Clinical Oncology 23, 8664-70. Weir, C. J. and Walley, R. J. 2006: Statistical evaluation of biomarkers as surrogate endpoints: a literature review. Statistics in Medicine 25, 183-203.

survival analysis - an overview This covers methods for the analysis of time-to-event data, e.g. survival times. Survival data occur when the outcome of interest is the time from a well-defined time origin to the occurrence of a particular event or ENDPOINT. If the endpoint is the death of a patient the resulting data are, literally, survival times. However, other endpoints are possible, e.g. the time to relief or recurrence of symptoms. Such observations are often referred to as time-to-event data although survival data is commonly used as a generic term.

Standard statistical methodology is not usually appropriate for such data, for two main reasons. First, the distribution of survival time in general is likely to display positive SKEWNESS and so assuming normality for an analysis (as done, for example, by a t-TEST or a regression) is probably not reasonable. Second, and more critical than doubts about normality, is the presence of censored observations, where the survival time of an individual is referred to as censored when the endpoint of interest has not yet been reached (more precisely, right censored). For true survival times this might be because the data from a study are analysed at a time point when some participants are still alive. Another reason for censored event times is that an individual might have been lost to follow-up for reasons unrelated to the event of interest, e.g. due to moving to a location that cannot be traced or due to accidental death (see DROPOUTS). When censoring occurs all that is known is that the actual, but unknown, survival time is larger than the censored survival time.

Specialised statistical techniques developed to analyse such censored and possibly skewed outcomes are known as survival analysis. An important assumption made in standard survival analysis is that the censoring is noninformative, i.e. that the actual survival time of an individual is independent of any mechanism that causes that individual's survival time to be censored. For simplicity, this description also concentrates on techniques for continuous survival times - the analysis of discrete survival times is described in Collett (2003).

As an example, consider data that arise from a double-blind, randomised controlled clinical trial (RCT) (see CLINICAL TRIALS) to compare treatments for prostate cancer (placebo versus 1.0 mg of diethylstilbestrol (DES) administered daily by mouth). The full dataset is given in Andrews and Herzberg (1985) and the first table shows the first seven of a subset of 38 patients used here and discussed in Collett (2003). In this study, the time origin was the date on which a cancer sufferer was randomised to a treatment and the endpoint is the death of a patient from prostate cancer. The survival times of patients who died from other causes or were lost during the follow-up process are regarded as right censored. The 'status' variable in the first table takes the value unity if the patient has died from prostate cancer and zero if the survival time is censored. In addition to survival times, a number of prognostic factors were recorded, namely the age of the patient at trial entry, their serum haemoglobin level in gm/100 ml, the size of their primary tumour in cm² and the value of a combined index of tumour stage and grade (the Gleason index, with larger values indicating more advanced tumours). The main aim of this study was to compare the survival experience between the two treatment groups.

In general, to describe survival two functions of time are of central interest - the survival function and the hazard function. These are described in some detail next. The survival function $S(t)$ is defined as the probability that an individual's survival time, $T$, is greater than or equal to time $t$, i.e.:

\[ S(t) = \mathrm{Prob}(T \ge t) \]

The graph of $S(t)$ against $t$ is known as the survival curve. The survival curve can be thought of as a particular way of displaying the frequency distribution of the event times, rather than by, say, a HISTOGRAM. When there are no censored observations in the sample of survival times, the survival function can be estimated by the empirical survivor function:

\[ \hat{S}(t) = \frac{\text{Number of individuals with survival times} \ge t}{\text{Number of individuals in the dataset}} \]

Since every subject is 'alive' at the beginning of the study and no one is observed to survive longer than the largest of the observed survival times, $t_{\max}$, then:

\[ \hat{S}(0) = 1 \quad \text{and} \quad \hat{S}(t) = 0 \ \text{for} \ t > t_{\max} \]

Furthermore, the estimated survivor function is assumed constant between two adjacent death times, so that a plot of $\hat{S}(t)$ against $t$ is a step function that decreases immediately after each 'death'. This simple method cannot be used when there are censored observations since the method does not allow for information provided by an individual whose survival time is censored before time $t$ to be used in the computing of the estimate at $t$.

The most commonly used method for estimating the survival function for survival data containing censored observations is the product-limit or KAPLAN-MEIER ESTIMATOR. The essence of this approach is the use of a product of a series of conditional probabilities. One alternative estimator for censored survival times, derived differently but in practice often similar, is the Nelson-Aalen estimator. Approximate STANDARD ERRORS and pointwise symmetric or asymmetric CONFIDENCE INTERVALS for the survivor function at a given time can be derived to determine the precision of the estimator - details are given in Collett (2003).

The Kaplan-Meier estimators of the survivor curves for the two prostate cancer treatments are shown graphically in the figure. The survivor curves are step functions that decrease at the time points when participants died of the cancer. The censored observations in the data are indicated by the 'cross' marks on the curves. In our patient sample there is approximately a difference of 20% in the proportion surviving for at least 50 to 60 months between the treatment groups.

Since the distribution of survival times tends to be positively skewed the MEDIAN is the preferred summary measure of location. The median survival time is the time beyond which 50% of the individuals in the population under study are expected to survive and, once the survivor function has been estimated by $\hat{S}(t)$, can be estimated by the smallest observed survival time, $t_{50}$, for which the value of the estimated survivor function is less than 0.5. The estimated median survival time can be read from the survival curve by finding the smallest value on the x axis for which the survival proportion reaches less than 0.5. The figure shows that the median survival in the placebo group can be estimated as 69 months while an estimate for the DES group is not available since survival exceeds 50% throughout the study period. A similar procedure can be used to estimate other percentiles of the distribution of the survival times and approximate confidence intervals can be found once the variance of the estimated percentile has been derived from the VARIANCE of the estimator of the survivor function.

survival analysis Survival times of prostate cancer patients

Patient   Treatment            Survival time   Status                 Age       Serum haem.   Size of tumour   Gleason
number    (1=placebo, 2=DES)   (months)        (1=died, 0=censored)   (years)   (gm/100 ml)   (cm²)            index
1         1                    65              0                      67        13.4          34               8
2         2                    61              0                      60        14.6           4               10
3         2                    60              0                      77        15.6           3               8
4         1                    58              0                      64        16.2           6               9
5         2                    51              0                      65        14.1          21               9
6         1                    51              0                      61        13.5           8               8
7         1                    14              1                      73        12.4          18               11
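The product-limit idea described in this entry, a running product of conditional survival probabilities at each death time, can be sketched in a few lines. This is a minimal illustration on made-up data (the `times` and `events` values below are hypothetical, not the prostate cancer data), and the function names are my own. Subjects censored at a death time are counted as still at risk at that time, which is the usual convention.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of S(t).

    times  : observed survival or censoring times
    events : 1 if the time is a death, 0 if it is right censored
    Returns a list of (death time, estimated survival just after it).
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose observed time equals t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Conditional probability of surviving t given at risk at t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


def median_survival(curve):
    """Smallest death time at which the estimated survival drops below 0.5."""
    for t, s in curve:
        if s < 0.5:
            return t
    return None  # survival never falls below 50%: median not estimable


times = [2, 3, 3, 5, 6, 8, 9, 12]    # hypothetical follow-up times (months)
events = [1, 1, 0, 1, 0, 1, 1, 0]    # hypothetical status indicators
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t:2d}  S(t) = {s:.3f}")
print("median survival:", median_survival(curve))
```

Returning `None` when the curve never falls below 0.5 mirrors the situation described for the DES group, where no median can be read off the curve.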

In the analysis of survival data, it is often of some interest to assess which periods have the highest and which the lowest chance of death (or whatever the event of interest happens to be) among those people alive at the time. The appropriate quantity for such risks is the hazard function, $h(t)$, defined as the (scaled) PROBABILITY that an individual experiences an event in a small time interval $\delta t$, given that the individual has survived up to the beginning of the interval. The hazard function therefore represents the instantaneous death rate for an individual surviving to time $t$. It is a measure of how likely an individual is to experience an event as a function of the age of the individual. The hazard function may remain constant, increase or decrease with time or take some more complex form. The hazard function of death in human beings, for example, has a 'bathtub' shape. It is relatively high immediately after birth, declines rapidly in the early years and then remains relatively constant until beginning to rise during late middle age. A Kaplan-Meier type estimator of the hazard function is given by the proportion of individuals experiencing an event in an interval per unit time, given that they have survived to the beginning of the interval. However, the estimated hazard function is generally considered 'too noisy' for practical use. Instead, the cumulative or integrated hazard function, which is derived from the hazard function by summation, is usually displayed to describe the change in hazard over time.
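For reference, the verbal definition of the hazard function can be written in the standard textbook form below. The symbol $H(t)$ for the cumulative (integrated) hazard is notation assumed here rather than taken from the entry, and the last identity holds for continuous survival times.

```latex
h(t) = \lim_{\delta t \to 0}
       \frac{\Pr(t \le T < t + \delta t \mid T \ge t)}{\delta t},
\qquad
H(t) = \int_0^t h(u)\,du,
\qquad
S(t) = \exp\{-H(t)\}.
```

Division by $\delta t$ is the 'scaling' referred to in the text; it makes $h(t)$ a rate per unit time rather than a probability, so it can exceed 1.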

survival analysis Display of Kaplan-Meier survivor curves (figure; horizontal axis: Time (months))

In addition to comparing survivor functions graphically, a more formal statistical test for a group difference is often required in order to compare survival times analytically. In the absence of censoring, a nonparametric test such as the Mann-Whitney test could be used (see MANN-WHITNEY RANK SUM TEST). In the presence of censoring, the log-rank or Mantel-Haenszel test is the most commonly used nonparametric test (see MANTEL-HAENSZEL METHODS). It tests the NULL HYPOTHESIS that the population survival functions $S_1(t), S_2(t), \ldots, S_k(t)$ are the same in $k$ groups. Briefly, the test is based on computing the expected number of deaths for each observed 'death' time in the dataset, assuming that the chances of dying, given that subjects are at risk, are the same in the groups. The total number of expected deaths is then computed for each group by adding the expected number of deaths for each failure time. The test finally compares the observed number of deaths in each group with the expected number of deaths using a CHI-SQUARE TEST with $k - 1$ DEGREES OF FREEDOM (see Hosmer and Lemeshow, 1999). The log-rank test statistic weights contributions from all failure times equally. Several alternative test statistics have been proposed that give differential weights to the failure times. For example, the generalised Wilcoxon test (or Breslow test) uses weights equal to the number at risk. For the prostate cancer data in the first table the log-rank test ($X^2 = 4.4$ on 1 degree of freedom, $P = 0.036$) detects a significant group difference in favour of longer survival on DES treatment while the Wilcoxon test, which puts relatively more weight on differences between the survival curves at earlier times, fails to reach significance at the 5% test level ($X^2 = 3.4$ on 1 degree of freedom, $P = 0.065$).

Modelling survival times is useful especially when there are several explanatory variables to consider. For example, in the prostate cancer trial patients were randomised to treatment groups so that the theoretical distributions of the prognostic factors were the same in the two groups. However, empirical distributions in the patient sample might still vary and if the prognostic variables are related to survival they might confound the group difference. A survival analysis that 'adjusts' the group difference for the prognostic factor(s) is needed. The main approaches used for modelling the effects of covariates on survival can be divided roughly into two classes - models based on assuming proportional hazards and models for direct effects on the survival times.

The main technique used for modelling survival times is due to Cox (1972) and is known as the PROPORTIONAL HAZARDS model or, more simply, Cox's regression (see COX'S REGRESSION MODEL). In essence, the technique acts as the analogue of multiple regression for survival times containing censored observations, for which multiple regression itself is clearly not suitable. Briefly, the procedure models the hazard function and central to it is the assumption that the hazard functions for two individuals at any point in time are proportional, the so-called proportional hazards assumption. In other words, if an individual has a risk of 'death' at some initial time point that is twice as high as another individual, then at all later times the risk of death remains twice as high. Cox's model is made up of an unspecified baseline hazard function, $h_0(t)$, which is then multiplied by a suitable function of an individual's explanatory variable values, to give the individual's hazard function. The interpretation of the regression parameter of the $i$th covariate, $\beta_i$, is that $\exp(\beta_i)$ gives the hazard or INCIDENCE rate change associated with an increase of one unit in the $i$th covariate, all other explanatory variables remaining constant. Cox's regression is considered a semi-parametric procedure because the baseline hazard function, $h_0(t)$, and by implication the PROBABILITY DISTRIBUTION of the survival times, does not have to be specified. The baseline hazard is left unspecified; a different parameter is essentially included for each unique survival time. These parameters can be thought of as NUISANCE PARAMETERS whose purpose is merely to control the parameters of interest for any changes in the hazard over time.

Cox's regression can be used to model the prostate cancer survival data. To start with, a model containing only the single treatment factor is fitted. The estimated regression coefficient of a DES indicator variable is -1.98 with a standard error of 1.1. This translates into an (unadjusted) hazard ratio of $\exp(-1.98) = 0.138$. In other words, DES treatment is estimated to reduce the hazard of immediate death by 86.2% relative to PLACEBO treatment. According to a LIKELIHOOD RATIO (LR) test, the unadjusted effect of DES is statistically significant at the 5% level ($X^2 = 4.55$ on 1 degree of freedom, $P = 0.033$). For the prostate cancer data, it is of interest to determine the effect of DES after controlling for the other prognostic variables. Likelihood ratio tests showed that dropping age and serum haemoglobin from a model that contains the treatment indicator variable and all four prognostic variables did not significantly worsen the model fit (at the 10% level); the fit of the final model is shown in the second table. After adjusting for the effects of tumour size and stage the hazard reduction for DES relative to placebo treatment is reduced to 67.1% and is no longer statistically significant (LR test: $X^2 = 0.48$ on 1 degree of freedom, $P = 0.49$). Both tumour size and Gleason index have a hazard ratio above unity, indicating that increases in tumour size and advanced stages are estimated to increase the chance of death.

survival analysis Parameter estimates from Cox's regression of survival on treatment group, tumour size and Gleason index

                 Regression                  Standard   Hazard ratio            95% CI for $\exp(\beta)$
Effect           coefficient ($\hat\beta$)   error      ($\exp(\hat\beta)$)     Lower limit   Upper limit
DES              -1.113                      1.20       0.329                   0.031         3.47
Tumour size       0.0826                     0.048      1.086                   0.990         1.19
Gleason index     0.7102                     0.338      2.034                   1.049         3.95

Cox's model does not require specification of the probability distribution of the survival times. The hazard function is not restricted to a specific form and as a result the semi-parametric model has considerable flexibility and is widely used. However, if the assumption of a particular probability distribution for the data is valid, inferences based on such an assumption are more precise. For example, estimates of hazard ratios or median survival times will have smaller standard errors. A fully parametric proportional hazards model makes the same assumptions as Cox's regression but in addition also assumes that the baseline hazard function, $h_0(t)$, can be parameterised according to a specific model for the distribution of the survival times. Survival time distributions that can be used for this purpose, i.e. that have the proportional hazards property, are principally the EXPONENTIAL, Weibull and Gompertz DISTRIBUTIONS. Different distributions imply different shapes of the hazard function, and in practice the distribution that best describes the functional form of the observed hazard function is chosen - for details see Collett (2003).

survival analysis Parameter estimates from log-logistic accelerated failure time model of survival on treatment group, tumour size and Gleason index

                 Regression                   Standard   Acceleration factor      95% CI for $\exp(-\alpha)$
Effect           coefficient ($\hat\alpha$)   error      ($\exp(-\hat\alpha)$)    Lower limit   Upper limit
DES               0.628                       0.55       0.534                    0.182         1.568
Tumour size      -0.031                       0.022      1.031                    0.988         1.077
Gleason index    -0.335                       0.203      1.398                    0.939         2.080

A family of fully parametric models that accommodate direct multiplicative effects of covariates on survival times, and hence do not have to rely on proportional hazards, are accelerated failure time models. A wider range of survival time distributions possesses the accelerated failure time property, principally the exponential, Weibull, log-logistic, generalised GAMMA or LOGNORMAL DISTRIBUTION. In addition, this family of parametric models includes distributions (e.g. the log-logistic distribution) that model unimodal hazard functions while all distributions suitable for the proportional hazards model imply hazard functions that increase or decrease monotonically. The latter property might be limiting, for example, for modelling the hazard of dying after a complicated operation that peaks in the postoperative period. The general accelerated failure time model for the effects of $p$ explanatory variables, $x_1, x_2, \ldots, x_p$, can be represented as a log-linear model for survival time, $T$, namely:

\[ \ln(T) = \alpha_0 + \sum_{i=1}^{p} \alpha_i x_i + \text{error} \]

where Cli ••••• CI.. are the un coeflicienlS or the expllllUdaly Yariables and Go aD illlcn:epl puamelC:r. The ~k:r a, re8ects the effi:clthat Ihe idI COYariIlla has on Iog-sunival lime willa pasilive values indicaIilll Ihe IUn'ifti time increases willi incrasinl yalues of the covariaIe aDd vice vena. In terms or the oriPnai IimescaIe. Ihe

"'at

~~~CU~E

____________________________________________________

model implies that the explanatory variables measured on an individual act multiplicatively and so affect the speed of progression to the event of interest. The interpretation of the parameter $\alpha_i$ is therefore that $\exp(\alpha_i)$ gives the factor by which any survival time percentile (e.g. the median survival time) changes per unit increase in $x_i$, all other explanatory variables remaining constant. Expressed differently, the probability that an individual with covariate value $x_i + 1$ survives beyond $t$ is equal to the probability that an individual with value $x_i$ survives beyond $\exp(-\alpha_i)t$. Hence $\exp(-\alpha_i)$ determines the change in the speed with which individuals proceed along the timescale, and the coefficient is known as the acceleration factor of the $i$th covariate. Software packages typically use the log-linear formulation. The regression coefficients from fitting a log-logistic accelerated failure time model to the prostate cancer survival times using treatment, size of tumour and Gleason index as predictor variables are shown in the third table. The negative regression coefficients suggest that the survival times tend to be shorter for larger values of tumour size and Gleason index. The positive regression coefficient for the DES treatment indicator suggests that survival times tend to be longer for individuals assigned to the active treatment after adjusting for the effects of tumour size and stage. The estimated acceleration factor for an individual in the DES group compared with the placebo group is exp(-0.621) = 0.534, i.e. DES is estimated to slow down the progression of the cancer by a factor of about 2. While possibly clinically relevant, this effect is, however, not statistically significant (LR test: $X^2$ = 1.57 on 1 degree of freedom, P = 0.21). In summary, survival analysis is a powerful tool for analysing time-to-event data. The classical techniques, Kaplan-Meier estimation, Cox's regression and accelerated failure time modelling, are implemented in most general purpose STATISTICAL PACKAGES, with the S-Plus package having particularly extensive facilities for fitting and assessing nonstandard Cox models. The area is complex and one of active current research. For more recent advances, such as frailty models to include RANDOM EFFECTS, MULTISTATE MODELS to model different transition rates and models for competing risks, the reader is referred to Andersen (2002), Crowder (2001) and Hougaard (2000). SL
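The acceleration-factor interpretation described above can be checked numerically. The following is a minimal sketch, assuming a log-logistic accelerated failure time model written in log-linear form with a single covariate; the function names and the parameter values are hypothetical and are not taken from the prostate cancer analysis.

```python
import math


def loglogistic_survival(t, x, intercept, coef, scale):
    """P(T > t | x) when log T = intercept + coef * x + scale * e,
    with e following a standard logistic distribution."""
    z = (math.log(t) - intercept - coef * x) / scale
    return 1.0 / (1.0 + math.exp(z))


def median_survival(x, intercept, coef):
    """Median of T: the logistic error has median 0, so the median
    survival time is exp(intercept + coef * x)."""
    return math.exp(intercept + coef * x)


# Hypothetical parameter values, chosen only for illustration.
INTERCEPT, COEF, SCALE = 4.0, 0.5, 0.7

# A one-unit increase in x multiplies the median by exp(coef) ...
median_ratio = median_survival(1, INTERCEPT, COEF) / median_survival(0, INTERCEPT, COEF)

# ... and surviving beyond t at x + 1 is as likely as surviving
# beyond exp(-coef) * t at x.
t = 30.0
lhs = loglogistic_survival(t, 1, INTERCEPT, COEF, SCALE)
rhs = loglogistic_survival(math.exp(-COEF) * t, 0, INTERCEPT, COEF, SCALE)
```

Both identities hold for any percentile, not only the median, which is why a single coefficient can be read as a stretching or shrinking of the whole timescale.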

Andersen, P. K. (ed.) 2002: Multistate models. Statistical Methods in Medical Research 11. London: Arnold. Andrews, D. F. and Herzberg, A. M. 1985: Data. New York: Springer. Collett, D. 2003: Modelling survival data in medical research, 2nd edition. London: Chapman & Hall/CRC. Cox, D. R. 1972: Regression models and life tables (with discussion). Journal of the Royal Statistical Society, Series B 34, 187-220. Crowder, M. J. 2001: Classical competing risks. Boca Raton, FL: Chapman & Hall/CRC. Hosmer, D. W. and Lemeshow, S. 1999: Applied survival analysis. New York: John Wiley & Sons, Inc. Hougaard, P. 2000: Analysis of multivariate survival data. New York: Springer.

survival curve

See KAPLAN-MEIER ESTIMATION, SURVIVAL ANALYSIS - AN OVERVIEW

survival function

See SURVIVAL ANALYSIS

systematic reviews and meta-analysis This is an approach to the combining of results from the many individual CLINICAL TRIALS of a particular treatment or therapy that may have been carried out over the course of time. Such a procedure is needed because individual trials are rarely large enough to answer the questions we want to answer as reliably as we would like. In practice, most trials are too small for adequate conclusions to be drawn about potentially small advantages of particular therapies. Advocacy of large trials is a natural response to this situation, but it is not always possible to launch very large trials before therapies become widely accepted or rejected prematurely. An alternative possibility is to examine the results from all relevant trials, a process that involves two components, one qualitative, i.e. the extraction of the relevant literature and description of the available trials, in terms of their relevance and methodological strengths and weaknesses (the systematic review), and the other quantitative, i.e. mathematically combining results from different studies, even on occasions when these studies have used different measures to assess outcome. This component is known as a meta-analysis (Normand, 1999). Informal synthesis of evidence from different studies is, of course, nothing new, but it is now generally accepted that meta-analysis gives the systematic review an objectivity that is inevitably lacking in the classical review article and can also help the process to achieve greater precision and generalisability of findings than any single study. There remain sceptics who feel that the conclusions from a meta-analysis often go far beyond what the technique and the data justify, but despite such concerns, the demand for systematic reviews of healthcare interventions has developed rapidly during the last decade, initiated by the widespread adoption of the principles of EVIDENCE-BASED MEDICINE both among healthcare practitioners and policymakers. Such reviews are now increasingly used as a basis for both individual treatment decisions and the funding of healthcare and healthcare research worldwide. This growth in systematic reviews is reflected in the current state of the COCHRANE COLLABORATION database, containing as it does more than 1200 complete systematic reviews, with a further 1000 due to be added soon. Systematic reviews and the subsequent meta-analysis have a number of aims: to review systematically the available evidence from a particular research area; to provide quantitative

summaries of the results from each study; to combine the results across studies if appropriate - such combination of results leads to greater statistical power in estimating treatment effects; to assess the amount of variability between studies; to estimate the degree of benefit associated with a particular study treatment; to identify study characteristics associated with particularly effective treatments. Ideally, the trials included in a systematic review should be clinically homogeneous. For example, they might all study a similar type of patient for a similar duration with the same treatment in the two arms of each trial. In practice, of course, the trials included are far more likely to differ in some aspects, such as eligibility criteria, duration of treatment, length of follow-up and how ancillary care is used. On occasions, even treatment itself may not be identical in all the trials. This implies that, in most circumstances, the objective of a systematic review cannot be equated with that of a single large trial, even if that trial has wide eligibility. While a single trial focuses on the effect of a specific treatment in specific situations, a meta-analysis aims for a more generalisable conclusion about the effect of a generic treatment policy in a wider range of areas. When the trials included in a systematic review do differ in some of their components, therapeutic effects may very well be different, but these differences are likely to be in the size of the effects rather than their direction. It would, after all, be extraordinary if treatment effects were exactly the same when estimated from trials in different countries, in different populations, in different age groups or under different treatment regimens. If the studies were big enough it would be possible to measure these differences reliably, but in most cases this will not be possible. However, meta-analysis allows the investigation of sources of possible heterogeneity in the results from different trials, as we shall see later, and discourages the common, simplistic and often misleading interpretation that the results of individual clinical trials are in conflict because some are labelled 'positive' (i.e. statistically significant) and others 'negative' (i.e. statistically nonsignificant). A systematic approach to synthesising information can often both estimate the degree of benefit from a particular therapy and assess whether the benefit depends on specific characteristics of the studies. The selection of studies is the greatest single concern in applying meta-analysis and there are at least three important components of the selection process, namely breadth, quality and representativeness (Pocock, 1996). Breadth relates to the decision as to whether to study a very specific narrow question (e.g. the same drug, disease and setting for studies following a common protocol) or a more generic problem (e.g. a broad class of treatments for a range of conditions in a variety of settings). The broader the meta-analysis, the more difficulty there is in interpreting the combined evidence as

regards future policy. Consequently, the broader the meta-analysis, the more it needs to be interpreted qualitatively rather than quantitatively. The quality and reliability of a systematic review depend on the quality of the data in the included studies, although criticisms of meta-analyses for including original studies of questionable quality are typical examples of shooting the messenger who bears bad news. Aspects of quality of the original articles that are pertinent to the reliability of the meta-analysis include a valid RANDOMISATION process (we are assuming that in meta-analysis of clinical trials, only randomised trials will be selected), MINIMISATION of potential BIASES introduced by DROPOUTS, acceptable methods of analysis, level of BLINDING and recording of adequate clinical details. Several attempts have been made to make this aspect of meta-analysis more rigorous by applying specially constructed quality assessment scales to the candidate trials for inclusion in the analysis. Determining quality would be helped if the results from so many trials were not so poorly reported. In the future, this may be improved by the CONSORT statement (CONSOLIDATED STANDARDS FOR REPORTING TRIALS).

The representativeness of the studies in a systematic review depends largely on having an acceptable search strategy. Once the researcher has established the goals of the systematic review, an ambitious literature search needs to be undertaken, the literature obtained and then summarised. Possible sources of material include the published literature, unpublished literature, uncompleted research reports, work in progress, conference/symposia proceedings, dissertations, expert informants, granting agencies, trial registries, industry and journal hand searching. The search will probably begin by using computerised bibliographic databases of published and unpublished research review articles, for example MEDLINE. This is clearly a sensible strategy, although there is some evidence of deficiencies in MEDLINE when searching for RANDOMISED CONTROLLED TRIALS. Ensuring that a meta-analysis is truly representative can be problematic. It has long been known that journal articles are not a representative sample of work addressed to any particular area of research. Research with statistically significant results is potentially more likely to be submitted and published than work with null or nonsignificant results, particularly if the studies are small. The problem is made worse by the fact that many medical studies look at multiple outcomes and there is a tendency for only those outcomes suggesting a significant effect to be mentioned when the study is written up. Outcomes that show no clear treatment effect are often ignored and so will not be included in any later review of studies looking at those particular outcomes. Publication bias is likely to lead to an overrepresentation of positive results.


Clearly it becomes of some importance to assess the likelihood of publication bias in any meta-analysis reported in the literature. A well-known informal method of investigating this potential problem is the so-called FUNNEL PLOT, usually a plot of a measure of a study's precision (e.g. one over the STANDARD ERROR) against effect size. The most precise estimates (e.g. those from the largest studies) will be at the top of the plot and those from less precise or smaller studies at the bottom. The expectation of a 'funnel' shape in the plot relies on two empirical observations. First, the variances of studies in a meta-analysis are not identical, but are distributed in such a way that there are fewer precise studies and rather more imprecise ones and, second, at any fixed level of VARIANCE, studies are symmetrically distributed about the MEAN. Evidence of publication bias is provided by an absence of studies on the left-hand side of the base of the funnel. The assumption is that, whether because of editorial policy or author inaction or some other reason, these studies (which are not statistically significant) are the ones that might not be published. An example of a funnel plot suggesting the possible presence of publication bias is given in the figure (taken from Duval and Tweedie, 2000). Various proposals have been made as to how to test for publication bias in a systematic review, although none of these is wholly satisfactory. The danger of the testing approach is the temptation to assume that, if the test is not significant, there is no problem and the possibility of publication bias can be conveniently ignored. In practice, however, publication bias is very likely endemic to all empirical research and so should be assumed present, whatever the result of some testing procedures with possibly low POWER.
Once the studies for systematic review have been selected and the possible problems of publication bias addressed,


effect sizes and variance estimates are extracted from the selected papers, reports, etc., and subjected to a meta-analysis, in which the aim is to provide a global test of significance for the overall NULL HYPOTHESIS of no effect in all studies and to calculate an estimate and a CONFIDENCE INTERVAL of the overall effect size. Two models are usually considered, one involving FIXED EFFECTS and the other RANDOM EFFECTS (Fleiss, 1993; Sutton et al., 2000). The former assumes that the true effect is the same for all studies whereas the latter assumes that individual studies have different effect sizes that vary randomly around the overall mean effect size. Thus the random effects model specifically allows for the existence of both between-study heterogeneity and within-study variability. When the research question concerns whether treatment has produced an effect, on the average, in the set of studies being analysed, then the fixed effects model for the studies may be the more appropriate; here there is no interest in generalising the results to other studies. Many statisticians believe, however, that the random effects model is more appropriate than a fixed effects model for meta-analysis, because between-study variation is an important source of uncertainty that should not be ignored when assigning uncertainty to pooled results. Tests of homogeneity are available, i.e. a test that the between-study variance component is zero - if it is, a fixed effects model is considered justified. Such a test is, however, likely to be of low power for detecting departures from homogeneity and so its practical consequences are probably quite limited. The essential feature of both the fixed and random effects models for meta-analysis is the use of a weighted mean of treatment effect sizes from the individual studies, with the weights usually being the reciprocals of the associated

systematic reviews and meta-analysis (a) Funnel plot of 35 simulated studies and meta-analysis with true effect size of zero; estimated effect size is 0.080 with a 95% confidence interval of [-0.018, 0.178]; (b) funnel plot as in (a) with five leftmost studies suppressed; overall effect size is now estimated as 0.124 with a 95% confidence interval of [0.037, 0.210]. Horizontal axis in both panels: effect size. Reprinted from Duval and Tweedie, 2000, with permission from The Journal of the American Statistical Association. Copyright 2000 by the American Statistical Association. All rights reserved
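The weighted mean that underlies both the fixed and random effects models described in the text can be sketched as follows. This is a minimal illustration, assuming that the weights are reciprocals of the study variances and, for the random effects model, that the between-study variance is estimated by the DerSimonian-Laird moment estimator; that estimator is a common choice but is not named in the text. The function names and the study data are hypothetical.

```python
import math


def fixed_effect(effects, variances):
    """Inverse-variance weighted mean and its standard error."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, math.sqrt(1.0 / total)


def between_study_variance(effects, variances):
    """DerSimonian-Laird moment estimate of the between-study
    variance (assumed estimator), truncated at zero."""
    weights = [1.0 / v for v in variances]
    pooled, _ = fixed_effect(effects, variances)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    return max(0.0, (q - (len(effects) - 1)) / c)


def random_effects(effects, variances):
    """Weighted mean with the between-study variance added to each
    within-study variance before the weights are formed."""
    tau2 = between_study_variance(effects, variances)
    pooled, se = fixed_effect(effects, [v + tau2 for v in variances])
    return pooled, se, tau2


def ci95(estimate, se):
    """Large-sample 95% confidence interval."""
    return estimate - 1.96 * se, estimate + 1.96 * se


# Hypothetical effect sizes and variances from four studies.
effects = [0.10, 0.30, 0.35, 0.65]
variances = [0.01, 0.04, 0.02, 0.09]

fixed_estimate, fixed_se = fixed_effect(effects, variances)
random_estimate, random_se, tau2 = random_effects(effects, variances)
```

Because the random effects weights include the extra variance component, the pooled standard error under that model is never smaller than under the fixed effects model, which reflects the additional uncertainty discussed above.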

standard deviation This is a measure of spread intended to give an indication of the spread of a series of values ($x_1, x_2, \ldots, x_n$) about their MEAN ($\bar{x}$). Taking the average of the differences from the mean may initially seem a good measure of their spread, but in fact this is always zero. Therefore, the standard deviation is based on the average of the squared differences from the mean, since these are all positive. Taking the square root of this result gives a measure that is in the same units as the original values. Thus, the standard deviation ($s$) is calculated using the following formula. Here $n$ is the number of observations, $i$ takes values from 1 to $n$ and the $\sum$ notation denotes the sum, i.e. $(x_1-\bar{x})^2 + (x_2-\bar{x})^2 + \cdots + (x_n-\bar{x})^2$:

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}$$

Note that the formula involves division by $n - 1$, rather than $n$, when taking the average of the squared differences. This gives a result that is a better estimate of the standard deviation in the whole population, which is being estimated from the sample available. The standard deviation can be denoted SD, sd, $s$ or $\sigma$, although the last technically refers to the standard deviation of a population, rather than a sample. To calculate the standard deviation by hand there is a more convenient and mathematically equivalent formula:

$$s = \sqrt{\frac{\sum x_i^2 - \left(\sum x_i\right)^2 / n}{n-1}}$$

As an example, the bone mineral content (BMC) of 10 babies was measured using dual energy X-ray absorptiometry (DXA). The measurements in grams were: 46.6, 46.9, 49.2, 49.8, 53.2, 61.1, 68.1, 73.1, 77.1 and 78.6. It is simple to calculate that the sum of the observations $\sum x_i = 603.7$ and the sum of the squares of the observations $\sum x_i^2 = 37938.89$. Thus,

$$s = \sqrt{\frac{37938.89 - (603.7)^2 / 10}{9}} = 12.88$$

The variance of a set of measurements is the square of their standard deviation. Although the variance has many uses, the standard deviation is a more meaningful descriptive statistic because it is in the same units as the raw data. Whereas square millimetres, mm², may have an obvious interpretation, square millimetres of mercury, mmHg², does not. Altman (1991) suggests that standard deviations may be quoted with one or two more decimal places than the original values. The standard deviation is typically used as a measure of spread alongside the mean and is most appropriate when the data are approximately symmetrically distributed. It has the useful property that when the data follow a NORMAL DISTRIBUTION then approximately 95% of the observations will be within two standard deviations of the mean. The figure shows the case of a standard normal distribution, which has a mean of 0 and a standard deviation of 1. SRC

standard deviation Standard normal distribution, with mean of 0 and SD of 1 (horizontal axis: standard deviations; the central 95% of the distribution is marked)

Altman, D. G. 1991: Practical statistics for medical research. London: Chapman & Hall.

standard error This is the STANDARD DEVIATION of the SAMPLING DISTRIBUTION of a statistic. For example, the standard error of the sample MEAN of $n$ observations is $\sigma/\sqrt{n}$, where $\sigma^2$ is the VARIANCE of the original observations. A useful aide-memoire to distinguish when to use standard deviation (SD) and when to use standard error (SE) is to recall: 'SD for description, SE for estimation.' In particular, when describing patient characteristics in a sample, as in a research paper's typical Table 1, means and SDs should be reported, whereas when seeking to learn from the sample and apply results to the relevant population, i.e. performing statistical inference either by HYPOTHESIS TESTS or estimation by CONFIDENCE INTERVALS, then the standard error is used. The SE is necessarily smaller than the SD and it is wrong to use SE as a MEASURE OF SPREAD when describing samples. More generally, standard errors can be attached to any sample-based quantity, not just the mean of a single sample of continuously distributed data, as just discussed. The general form of a large-sample 95% confidence interval for a population parameter (numerical characteristic) is the sample-based point estimate ± 1.96 (standard errors), where 1.96 arises from the standard NORMAL DISTRIBUTION and the standard error is that of the point estimate, itself the best sample-based guess for the value of the parameter. For two-sample inference, this is usually a quantity such as the difference in population means, for continuous data, or the difference in population proportions, for categorical data. SSE
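The two standard deviation formulae and the standard error of the mean can be checked on the ten measurements quoted in the worked example above. This is a minimal sketch; the function names are hypothetical, and the 1.96 multiplier is the large-sample value given in the text, so with only ten observations the resulting interval should be read as illustrative.

```python
import math

# The ten measurements (in grams) quoted in the worked example.
measurements = [46.6, 46.9, 49.2, 49.8, 53.2, 61.1, 68.1, 73.1, 77.1, 78.6]


def sample_sd(values):
    """Square root of the sum of squared deviations over n - 1."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


def sample_sd_by_hand(values):
    """Equivalent form using only the sum and the sum of squares."""
    n = len(values)
    total = sum(values)
    total_squares = sum(x * x for x in values)
    return math.sqrt((total_squares - total * total / n) / (n - 1))


def standard_error_of_mean(values):
    """Sample SD divided by the square root of the sample size."""
    return sample_sd(values) / math.sqrt(len(values))


def large_sample_ci95(estimate, standard_error):
    """Point estimate plus or minus 1.96 standard errors."""
    return estimate - 1.96 * standard_error, estimate + 1.96 * standard_error


mean = sum(measurements) / len(measurements)
sd = sample_sd(measurements)
se = standard_error_of_mean(measurements)
interval = large_sample_ci95(mean, se)
```

The SD describes how much the individual measurements vary, whereas the SE, being the SD divided by the square root of the sample size, describes how precisely the sample mean estimates the population mean.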

standard population

See DEMOGRAPHY

standardised mortality ratio (SMR)

See DEMOGRAPHY

STATA

See STATISTICAL PACKAGES

statistical consulting

See CONSULTING A STATISTICIAN

statistical methods in molecular biology

Molecular biology is the branch of biology that studies the structure and function of biological macromolecules of a cell, and especially their genetic role. Three types of macromolecules are the main subjects of interest: deoxyribonucleic acids (DNA), ribonucleic acids (RNA) and proteins. Genetic information is encoded in the DNA and inherited from parents to children and, when expressed, a gene, the basic unit of inheritance, is first transcribed to messenger RNA, which then carries the information to a cellular machinery (ribosome) for protein production. This basic principle of the information flow in biology is often referred to as the 'central dogma', put forward by Francis Crick in 1958. A central goal of molecular biology is to decipher the genetic information and understand the regulation of protein synthesis and interaction in cellular processes. The rapid advance of biotechnology in the past few decades has facilitated manipulation of these important biopolymers and allowed scientists to clone, sequence and amplify DNA. As a result, a large amount of biological sequence and structural information has been generated and deposited into publicly accessible databases. The phenomenal growth of biological data is underpinned by the developments of high-throughput DNA sequencing and microarray technologies and the recent progress in giant research projects such as the human genome project that produced the sequence of the human genome. The word 'genome' refers to the entire collection of genetic material of an organism. These advances result in many complex and massive datasets, sometimes decoupled from specific biological questions under investigation. The need to extract scientific insights from these rich data by computational and analytic means has spawned the new field of bioinformatics and computational molecular biology, which deals with storage, retrieval and analysis of biological data. These can consist of information stored in the genetic code, but also experimental results from various sources, patient statistics and scientific literature. Bioinformatics is highly interdisciplinary, using techniques and concepts from informatics, statistics, mathematics, physics, chemistry, biochemistry and linguistics. Nowadays, various biological databases and practical applications of bioinformatics are readily available through the internet and are widely used in biological and medical research.

A wide spectrum of statistical methods has been successfully applied in bioinformatics, ranging from the basic summary statistics and exploratory data analysis tools, to sophisticated hidden Markov models and Bayesian resampling methods (see BAYESIAN METHODS, MARKOV CHAIN MONTE CARLO). Analyses in bioinformatics focus on three types of datasets: genome sequences, macromolecule structures and large-scale functional genomic experiments. Various other data types are also involved, such as taxonomy trees, sequence polymorphisms, relationship data from metabolic pathways, patient statistics, text from scientific literature and so on. DNA sequences are the primary data from the sequencing projects and they only become really valuable through multiple layers of annotation and organisation. Several areas of bioinformatics analysis are relevant when dealing with DNA and protein sequences: sequence assembly, to establish the correct order of sequence contigs for a contiguous sequence; prediction of functional units, to identify subsets of sequences that code for various functional signals such as protein coding genes, promoters, splice sites, regulatory elements; and sequence comparison and database search, to retrieve data efficiently from organised databases. Most of these analyses involve sequence alignment, one of the classic problems in the early development of bioinformatics. Sequence alignment is the basic tool that allows us to determine the similarity of two or more sequences and infer components that might be conserved through evolution and natural selection. To align two protein sequences, similarity scores are assigned to all possible pairs of residues and the sequences are aligned to each other so as to maximise the sum total of scores in the sequence pairings induced by the alignment.
Dynamic programming-based algorithms were developed to overcome the large search space for the solution of optimal global and local alignment problems (Needleman and Wunsch, 1970; Smith and Waterman, 1981). Dynamic programming is a general algorithmic technique that solves an optimisation problem by recursively using 'divide and conquer' for its subproblems. Faster heuristic word-based alignment algorithms were later introduced for large database similarity searches (BLAST by Altschul et al., 1990; FASTA by Pearson and Lipman, 1988). These algorithms build alignments by extending or joining common short patterns ('words'); they are computationally efficient, but often yield suboptimal solutions. The interpretation of alignment scores and database search results was aided by statistical significance derived from simulations and PROBABILITY theory of extreme value distributions under the framework of standard statistical hypothesis testing (Karlin and Altschul, 1990). These classic results have become indispensable tools for biomedical researchers and computational biologists to analyse molecular sequence data.
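The dynamic programming idea behind optimal global alignment can be sketched in a few lines. This is a minimal illustration in the style of the Needleman and Wunsch approach, assuming a simple match/mismatch score and a linear gap penalty in place of a residue similarity matrix; the function name and the default scores are hypothetical.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best total score over all global alignments of sequences a and b.

    score[i][j] holds the best score for aligning the first i symbols
    of a with the first j symbols of b; each cell is built from the
    three neighbouring subproblems that were already solved.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap  # prefix of a aligned against gaps only
    for j in range(1, cols):
        score[0][j] = j * gap  # prefix of b aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]


identical = global_alignment_score("GATTACA", "GATTACA")
shifted = global_alignment_score("GATTACA", "GATTAC")
```

The table has one cell per pair of prefixes, so the work grows with the product of the two sequence lengths, which is what makes exhaustive alignment against a large database expensive and motivates the word-based heuristics mentioned above.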

Statistical models are also routinely used to construct probabilistic profiles to characterise the regularity of biological signals based on collections of prealigned sequences and to increase SENSITIVITY of searches. For example, a block-based product multinomial model can be used to describe the position-specific base distributions of the 5' splice site (exon-intron junction) signal in humans (see the figure), which gives a richer representation of the sequence motif than the consensus CAG|GTGAG ('|' indicates the exon-intron junction). A position-specific scoring matrix can be derived subsequently using logarithms of the ODDS RATIO of the signal to background base frequencies to evaluate matches of new query sequences to the sequence motif and to quantify the information content of the signal sequence pattern. The information content of a signal is defined as the average score of random sequence matches, measured in 'bits' using the log (base two) odds ratio scores that represent the number of 0-1s necessary to code for this signal in a binary coding system. For instance, the human 5' splice site depicted in the figure contains 8 bits of information, meaning that 'decoy' splice sites will be observed roughly every $2^8 = 256$ bases in random sequence. Note that the information content can also be formulated as the relative entropy (or Kullback-Leibler distance) of the signal to background nucleotide frequency distributions in the context of information theory. More sophisticated models and scoring matrices are also available to capture dependencies among neighbouring positions using Markov models and others. Another area of biological sequence analysis that relies heavily on statistical reasoning is gene finding or, more generally, predicting complex features from a sequence. The goal of protein-coding gene finding is to locate gene features such as exons and introns in a DNA genomic sequence, which


is the essential first-pass annotation of the genome project products. In addition to inferring homologous (evolutionarily related) gene structures from database similarity searches, statistical ab initio gene-finding programs have been developed to integrate all known features and 'grammars' of protein-coding genes in a probabilistic model. Hidden Markov models (HMMs) are at the heart of the most popular gene finders (Genscan by Burge and Karlin, 1997, and reviewed in Durbin et al., 1998). HMMs were originally developed in the early 1970s by electrical engineers for the problem of speech recognition - to identify what sequence of phonemes (or words) was spoken from a long sequence of category labels representing the speech signal. The resemblance of the gene-finding problem to speech recognition and the way HMMs are formulated make them especially suited in this context. In addition, HMMs are theoretically well-founded models, combining probabilistic modelling and formal language theory that guarantees 'sensible' predictions that obey specified grammatical rules even though they might not be the correct genes.

There are also well-documented and computationally efficient methods for parameter estimation (e.g. expectation-maximisation) and optimisation (Viterbi algorithm). A Markov chain is a series of random events occurring with probabilities conditionally dependent on the state of the preceding event(s). A hidden Markov model is a Markov chain in which each state generates an observation according to some rule (usually stochastic). The objective is to infer the hidden state sequence that has the highest posterior probability given the observed event sequence and the model. For example, the hidden states may represent words or phonemes and the observations are the acoustic signal.
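The search for the most probable hidden state sequence can be sketched with the Viterbi algorithm mentioned above. This is a minimal illustration, assuming a toy two-state model in which 'exon' states favour G and C and 'intron' states favour A and T; the state names, probabilities and function name are hypothetical and far simpler than the models used in real gene finders.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most probable hidden state path, computed in log space.

    All probabilities are assumed to be strictly positive so that
    their logarithms exist.
    """
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Best predecessor for state s at this position.
            prev = max(states, key=lambda p: best[-1][p] + math.log(trans_p[p][s]))
            scores[s] = (best[-1][prev] + math.log(trans_p[prev][s])
                         + math.log(emit_p[s][obs]))
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical toy model.
STATES = ("exon", "intron")
START = {"exon": 0.5, "intron": 0.5}
TRANS = {"exon": {"exon": 0.9, "intron": 0.1},
         "intron": {"exon": 0.1, "intron": 0.9}}
EMIT = {"exon": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
        "intron": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4}}

path = viterbi("GCGCGCATATAT", STATES, START, TRANS, EMIT)
```

Like the alignment algorithms, this is dynamic programming: at each position only the best path into each state is kept, so the cost grows linearly with the sequence length rather than with the number of possible state paths.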

Position:    -3    -2    -1    +1    +2    +3    +4    +5    +6    +7    +8
A          0.34  0.65  0.10  0.00  0.00  0.61  0.70  0.09  0.18  0.29  0.22
C          0.38  0.10  0.03  0.00  0.01  0.03  0.07  0.06  0.15  0.19  0.25
G          0.18  0.11  0.81  1.00  0.00  0.34  0.11  0.78  0.19  0.30  0.24
T          0.11  0.14  0.06  0.00  0.99  0.03  0.12  0.08  0.49  0.22  0.29

statistical methods in molecular biology The human 5' splice site (exon-intron junction signal)
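The position-specific scoring and information content described for the 5' splice site can be sketched as follows. This is a minimal illustration under several assumptions: the base frequencies are transcribed from the figure (the T value at position -1 is taken as 0.06 so that the column sums to one), the background is uniform at 0.25 per base, columns are rescaled to sum exactly to one, and a small pseudocount avoids taking the logarithm of zero when scoring. The function names are hypothetical.

```python
import math

# Frequencies transcribed from the figure, positions -3..-1, +1..+8.
# Assumption: T at position -1 is 0.06 so that the column sums to one.
FREQUENCIES = {
    "A": [0.34, 0.65, 0.10, 0.00, 0.00, 0.61, 0.70, 0.09, 0.18, 0.29, 0.22],
    "C": [0.38, 0.10, 0.03, 0.00, 0.01, 0.03, 0.07, 0.06, 0.15, 0.19, 0.25],
    "G": [0.18, 0.11, 0.81, 1.00, 0.00, 0.34, 0.11, 0.78, 0.19, 0.30, 0.24],
    "T": [0.11, 0.14, 0.06, 0.00, 0.99, 0.03, 0.12, 0.08, 0.49, 0.22, 0.29],
}
BACKGROUND = 0.25  # assumed uniform background base frequency


def normalised_columns(frequencies):
    """One dict per position, rescaled to remove rounding error."""
    length = len(frequencies["A"])
    columns = []
    for j in range(length):
        total = sum(frequencies[base][j] for base in "ACGT")
        columns.append({base: frequencies[base][j] / total for base in "ACGT"})
    return columns


def information_content(columns, background=BACKGROUND):
    """Relative entropy of signal to background, in bits, summed over
    positions (zero-frequency bases contribute nothing)."""
    return sum(p * math.log2(p / background)
               for column in columns for p in column.values() if p > 0)


def log_odds_score(sequence, columns, background=BACKGROUND, pseudocount=0.01):
    """Sum of log (base two) odds scores of a query against the motif."""
    score = 0.0
    for base, column in zip(sequence, columns):
        p = (column[base] + pseudocount) / (1.0 + 4.0 * pseudocount)
        score += math.log2(p / background)
    return score


columns = normalised_columns(FREQUENCIES)
bits = information_content(columns)
strong_site = log_odds_score("CAGGTGAGTAC", columns)
weak_site = log_odds_score("TTTTTTTTTTT", columns)
```

Under these assumptions the summed relative entropy comes out at roughly 8 bits, in line with the figure quoted in the text, and a query resembling the consensus scores well above one that does not.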

Motif discovery is an area under active research and has benefited from sophisticated modern statistical techniques. In a typical setting, a collection of sequences derived from MICROARRAY EXPERIMENTS or various sources are believed to share common sequence motifs that often represent functional domains or regulatory elements, and the challenge is to find the unknown signals and locate them in individual sequences. One approach is to formulate the multiple alignment information as MISSING DATA and infer them together with other parameters of the statistical model, given only the sequences as observables. Advanced statistical modelling and iterative computation techniques such as the EM ALGORITHM and Markov chain Monte Carlo are typically used for simultaneous model estimation (Liu, Neuwald and Lawrence, 1999). The function of a protein is determined by its three-dimensional structure. The problem of predicting the three-dimensional structure of a protein from its amino acid sequence (or the protein-folding problem, because proteins are capable of quickly folding into their stable, unique three-dimensional structure, starting from a random coil conformation without additional genetic mechanisms) is one of the biggest challenges in bioinformatics. There are three major lines of approaches for protein structure prediction: comparative modelling, fold recognition and ab initio prediction. Comparative modelling makes use of sequence alignment and database searches and builds on the fact that evolutionarily related proteins with similar sequences have a similar structure. For proteins without a homologous sequence of known structure, the approach of 'threading' has been developed. It is assumed that a small collection of 'folds', perhaps several hundreds in number, can be used to model the majority of protein domains in all organisms.
The protein-folding problem is thus reduced to the task of classifying the query protein based on its primary sequence into one of the folding classes in a database of known three-dimensional structures. This classification is often accomplished using complicated statistical models such as Gibbs sampling and HMMs to parameterise the fit of a sequence to a given fold and solve the optimisation problem accordingly. Analogous to the gene-finding problem, one may attempt to compute a protein's structure directly from its sequence, based on biophysical understanding of how the three-dimensional structure of proteins is attained. The challenge can be broken down into two components: devising a scoring function that can distinguish between correct and incorrect structures and a search method to explore the conformational space efficiently. If successful, direct folding certainly would give a deeper insight than the 'top-down' threading or homology modelling approaches. However, currently no reliable method has yet emerged in this category. During the past few years, the development of DNA array technology has scaled up the traditionally one-gene-at-a-time

functional studies to allow lhe monitoring of hundreds of thousands of genes simultaneously. A large number of statistical issues arise in connection wilh these studies and these have fosteml unpn:ccdented conversation and collaborations between biologisL~ and statisticians to establish means to plan. process and analyse these massive datasets. Many branches of statistics have been re"ivcd andlor extended by their recent applications in the analysis of functional genomics and molcc:ular data. including DATA !.IININO methods to disco\'Cr and classify paUenas. WIl1PLE 1ESTINO proocdurcs to adjust P-VALUfS to control false discovery rates and meta-analysis (see SYS1BIATJC REVIEWS AND MET.O\-ANALYSIS) to combine experimental n:su11S from various sources. New slatistical methods will soon be ncccled when combining infonnation from multiple distinct data types (sequence. gene expression. protein structures. sequence variation and phc:nolypes) for the same subjeclS. RFY

Altschul, S. F., Gish, W., Miller, W., Myers, E. W. and Lipman, D. J. 1990: Basic local alignment search tool. Journal of Molecular Biology 215, 403-10.
Burge, C. B. and Karlin, S. 1997: Prediction of complete gene structures in human genomic DNA. Journal of Molecular Biology 268, 78-94.
Durbin, R., Eddy, S., Krogh, A. and Mitchison, G. 1998: Biological sequence analysis: probabilistic models of proteins and nucleic acids. Cambridge: Cambridge University Press.
Karlin, S. and Altschul, S. F. 1990: Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. Proceedings of the National Academy of Sciences of the United States of America 87, 2264-8.
Liu, J. S., Neuwald, A. and Lawrence, C. 1999: Markovian structures in biological sequence alignments. Journal of the American Statistical Association 94, 1-15.
Needleman, S. B. and Wunsch, C. D. 1970: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48, 443-53.
Pearson, W. R. and Lipman, D. J. 1988: Improved tools for biological sequence comparison. Proceedings of the National Academy of Sciences of the United States of America 85, 2444-8.
Smith, T. F. and Waterman, M. S. 1981: Identification of common molecular subsequences. Journal of Molecular Biology 147, 195-7.
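As a rough illustration of the multiple testing adjustment mentioned in the preceding entry (controlling the false discovery rate when many genes are tested at once), the sketch below applies the Benjamini-Hochberg step-up rule. The entry does not name a specific procedure, so this is shown only as one common choice; the function name and the P-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected
    by the Benjamini-Hochberg step-up rule at false discovery rate q."""
    m = len(p_values)
    # Indices of the P-values in ascending order
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k such that the k-th smallest P-value is <= (k / m) * q
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject every hypothesis whose rank is at most k_max
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Hypothetical P-values from ten gene-level tests
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p, q=0.05))
```

Because the rule is step-up, a P-value that misses its own threshold can still be rejected when a larger P-value further down the ordering meets its threshold.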

statistical packages

In 2010 the Association for Survey Computing (ASC) website (www.asc.org.uk) listed around 200 statistical packages. Many of these have been under development for nearly 40 years and therefore it is both a very mature and diverse software market. While many of these packages are developed for niche markets, there are still several generic software suites. It seems almost invidious to try to select and discuss individual packages. However, there are clearly some well-known and long-established packages, and to many the term 'statistical package' is almost synonymous with SPSS or possibly SAS. Given the variety of analyses that these packages offer, they can meet most user needs. It would seem likely that a virtual monopoly should exist, but in fact there have been new entrants gaining popularity. Comparing these is instructive about trends in the development of statistical software. The packages in the first table are the ones on which we will concentrate here.

statistical packages  Major statistical packages

Package     Website
SPSS        www.spss.com
SAS         www.sas.com
STATA       www.stata.com
S-Plus      www.insightful.com

The prevalence of these major packages notwithstanding, there are other packages, as listed in the second table, although these will not be further discussed. Competition has been good for the development of programs and potential purchasers should always be aware of options outside the norm that may well fit their requirements. With the help of the ASC website (given earlier), it will always be profitable to make comparisons when purchasing.

statistical packages  Other major statistical packages

Package     Website
Genstat     www.vsn-intl.com
STATISTICA  www.statsoft.com
NCSS        www.ncss.com
SYSTAT      www.systat.com

Naturally enough, one wants a statistical package to do statistics and the leading packages cover a wide range. These include basic descriptive statistics, including EDA-style charting, comprehensive cross-tabulation analysis, means testing, the general linear model, multivariate methods, data reduction and clustering, nonparametrics, log-linear modelling, time series and more. The conversion in the late 1980s to early 1990s of the packages SPSS and SAS to run on desktop PCs seemed to cause a hiatus in the development of statistical methodology within these suites. Quite possibly, one of the main reasons for this was the need to develop new user interfaces, as an alternative to the command-line format previously used on mainframe and minicomputers. With the DOS interface model being rapidly succeeded by that of Windows, major consecutive design changes were needed. This did seem to leave a window of opportunity for new entrants to the market, which could write directly using modern programming architectures. S-Plus is perhaps the earliest example of this, initially written for the UNIX system and then subsequently ported to PCs. The design was conceptually novel, based on the notion of an extensible statistical calculator. It provides advanced graphics facilities and has become popular with professional statisticians for its ability to develop analysis methodologies, rather than being tied to a rigid framework. Over time S-Plus has developed to add extensive user interface enhancements as well as larger statistical libraries. The public domain 'R' (www.r-project.org) is based on a similar philosophy to S-Plus (see R).

STATA has become a very popular alternative for similar reasons. Starting out as a command-line-driven program, it has matured over the years to offer a windowing interface in addition. Its attractiveness to researchers has been a modern approach to statistical testing, as well as its ability to incorporate new methodologies quickly. Not only do the developers have an architecture that permits easy incremental expansion, users themselves can program their own procedures. This has gained the support of the professional statistical community, who through their educative role have promoted the package's popularity.

Partly as a result of competition, packages have also begun to differentiate themselves in terms of extending extra support to the whole data analysis process. While the actual test result remains the core of any analysis, data management is far more demanding in terms of time. The resources needed to support DATA MANAGEMENT in a MULTICENTRE clinical TRIAL are significantly larger than those for a classical experiment. In these scenarios, managing and manipulating data prior to analysis becomes very important. SAS has long specialised in data management support, with flexible procedures for merging and manipulating datasets, as well as links to database packages. In the pharmaceutical industry SAS is almost a de facto standard for major analyses, reflecting its ability to support the strong audit requirements in the industry.
To a certain extent other packages have been restricted to the rectangular data model (or spreadsheet) view of data, although all are now improving these features.

One direct effect of the development of statistical packages has been to introduce the possibility of statistical data analysis to a wider audience than just statisticians. Since these users are often in finance and commerce, they represent a significant revenue stream to package producers and making the program user friendly for nonspecialist audiences has become a priority for some. SPSS's menu-driven 'point-and-click' interface, for example, epitomises this model. In contrast, the command-line models of SAS, STATA or S-Plus require more dedicated training, although as noted earlier all have developed similar facilities. (STATA 8 introduced a menu-driven interface in 2003 to complement its traditional command-line orientation.)

Integrating advanced data-entry features with a statistical analysis package is common. The predominant spreadsheet data entry model can be enhanced to include data entry forms, data checking and audit. The large packages such as SPSS and SAS provide 'add-on' programs for this. Other programs provide direct database links so that data entry can be provided in a normal programming package such as Microsoft Access and then directly imported for analysis.

While traditionally the results of an analysis are interpreted and then incorporated into a final report, packages have begun to differentiate themselves on their ability to produce tables and results that can be directly pasted into a presentation quality report. Packages vary widely on their ability to do this and support can be patchy. SPSS provides a very good ability to move results tables, but the exported graphics are not of such a good quality. STATA, by contrast, does not offer sophisticated export of results, but has in its latest versions excellent graphical output. SAS offers full programmable reporting features that are very flexible, but challenging for the naive user.

While the main focus of any statistical user is on the large packages, dedicated packages still have a role. As an example, programs such as NQUERY (www.statsol.ie), dedicated to sample size estimation, do one particular job very well and are popular as a result. The lone, innovative researcher (an example perhaps being MX found at www.vcu.edu/mx/) is also a likely producer of innovative software.

An important dimension for the individual consumer can be price. Some of the major packages have prices that match their capabilities; the single researcher, particularly in the commercial sector, may find this an important factor in choice. All the relevant websites can give guidance on obtaining price quotations. Rather than ossifying, the marketplace for statistical software is healthy and researchers can find themselves well supported with a choice of diverse packages. CS

statistical parametric map

See STATISTICS IN IMAGING

statistical refereeing There have been hundreds of review articles published in the biomedical literature that point out statistical errors in the design, conduct, analysis, summary and presentation of research studies. The contents of every general medical journal (most notably Annals of Internal Medicine, British Medical Journal, Journal of the American Medical Association, Lancet and New England Journal of Medicine), as well as of many specialist ones, have been subjected to this intense scrutiny, sometimes frequently. These review articles have focused on particular statistical tests, frequency of usage and correct application of techniques of statistical analysis, design of CLINICAL TRIALS and epidemiological studies, use of POWER calculations and CONFIDENCE INTERVALS and many other aspects. Their almost universal conclusion is that a substantial percentage of research studies, perhaps as many as 50%, published in the biomedical literature contain errors of sufficient magnitude to cast some doubt on the validity of the conclusions that have been drawn. This does not mean that the conclusions are wrong, but it does imply that they may not be right, and this inevitably leads to serious concern about the consequences both for understanding of disease and for the treatment of patients.

One solution to this problem has been the introduction of medical statisticians into the peer review process. Some have advocated that all submitted papers should be scrutinised in this way, arguing that statistical review of those that are not published, no matter how poor, will at least lead to higher standards in research and improvement in future papers. In view of the very large number of biomedical journals and the huge numbers of papers submitted for publication every year, such a remedy is impracticable. An alternative, now used by several journals, is to divide the peer review process into two stages, whereby papers considered by the editors as candidates for publication are sent first to subject matter referees (physicians, surgeons, epidemiologists, etc.) and those recommended for publication by them are then sent to statisticians for further specialist review.

The process of statistical review is complex, requires sophisticated judgement and varies considerably in its application to every section of a paper (abstract, introduction, methods, results and discussion). Altman (1998) reviews some of the difficulties and provides practical examples of both definite errors and matters of judgement, within study design, analysis, presentation and interpretation. There are 12 broad aims of statistical review that can be summarised as follows: to prevent publication of studies that have a fundamental flaw in design; to prevent publication of papers that have a fundamental flaw in interpretation; to ensure that key aspects of background, design and methods of analysis are reported clearly; to ensure that key features of the design are reflected in the analysis; to ensure that the best methods of analysis, appropriate to the data, are used; to ensure that the presentation of results is adequate and employs summary statistics that are justified by the design, the data and the analysis; to ensure that tables are accurate and are consistent both with the text and with each other; to ensure that the style of figures is appropriate, that they are consistent with text and tables and not unduly repetitious of other content; to guard against excessive analysis and spurious accuracy; to ensure that conclusions are justified by the results; to ensure that content of the discussion is justified by the results and, in particular, that it avoids generalisation far beyond the confines of the paper; and, finally, to ensure that the abstract accords with the paper. The statistical reviewer may also comment on subject matter when an expert within the medical specialty of the paper, but will not indicate typos, except when these are critical for accuracy within formulae or text. Indeed, pointing

out inconsequential typos is not part of any aspect of any review process; they should be disregarded by expert reviewers and left entirely to the journal's copyeditor!

Since statistical review is complicated and, for the reviewer, sometimes excessively tedious, with the necessity of making very similar, sometimes the same, comments about manuscript after manuscript, detailed statistical guidelines and checklists have been written with the specific intention of helping authors (and reviewers). These have been supported by the editors of many biomedical journals and referred to in the journal's guidelines to authors. Examples can be found in Altman et al. (2000) and Gardner et al. (2000). Those most widely used for clinical trials are the CONSORT guidelines (Moher, Schulz and Altman, 2001, updated 2010), for which there is accompanying explanation (Altman et al., 2001, also updated 2010), and extension to cluster trials, noninferiority and equivalence randomised trials, herbal medicine interventions, nonpharmacological interventions, harms, abstracts and pragmatic trials (see www.consort-statement.org). The checklist that forms part of the CONSORT statement is intended to accompany a submitted paper and to indicate where in the manuscript each item in the checklist has been addressed, thus serving as a useful reminder to authors and an aide to referees. There are also recent guidelines for reporting META-ANALYSES (PRISMA, which supersedes QUOROM), for observational studies (STROBE) and for genetic ASSOCIATION studies (STREGA); details can be found through the EQUATOR network (www.equator-network.org).

Statistical review is intended to be helpful and constructive; it should also reassure authors and readers that published papers are sound. However, it is not always seen from this perspective and editors of journals need to be vigilant in ensuring that it does not become a focus for controversy and dispute, as can happen, for example, when authors parade the views of 'their own statistician' to counter comments from a referee. There is at present little incentive for statisticians to engage in such review: it does not enhance their careers, there is no specific training for it, small (if any) remuneration, it is time consuming and 'the only likely career consequence of good reviewing is future requests for more reviews' (Bacchetti, 2002). Bacchetti also points out that statistics is a rich area for finding mistakes and, when coupled with 'the notion that finding flaws is the key to high quality peer review', can lead to 'finding flaws that are not really there'. This reinforces the need for sound statistical judgement. Statisticians may also have to counter mistaken criticisms from subject matter reviewers with limited statistical knowledge (Bacchetti, 2002).

The final part of statistical review is usually a recommendation to the journal's editor either to accept, accept with revision, revise and resubmit, or reject the paper. The distinction between the second and third is sometimes difficult and can only be made by balancing the extent and nature of the revisions against the capabilities of the authors evinced

from the submitted paper. Rejection by the statistician can also lead to provocation, especially as authors will be aware that their 'subject matter' peers have already judged it sound. In 1937 the Lancet's leading article that heralded the series of classic papers by Bradford Hill on The Principles of Medical Statistics forewarned: 'It is exasperating, when we have studied a problem by methods that we have spent laborious years in mastering, to find our conclusions questioned, and perhaps refuted, by someone who could not have made the observations himself. It requires more equanimity than most of us possess to acknowledge that the fault is in ourselves.' Authors of papers are advised to read statistical reviews carefully, put them aside for 48 hours and only then start to think about how to respond. For further information and discussion see Rubinstein (2005), Smith (2005), Ware (2005). TJ

Altman, D. G. 1998: Statistical reviewing for medical journals. Statistics in Medicine 17, 2661-74.
Altman, D. G., Gore, S. M., Gardner, M. J. and Pocock, S. J. 2000: Statistical guidelines for contributors to medical journals. In Altman, D. G., Machin, D., Bryant, T. N. and Gardner, M. J. (eds), Statistics with confidence, 2nd edition. London: BMJ Books, 171-90.
Altman, D. G., Schulz, K. F., Moher, D., Egger, M., Davidoff, F., Elbourne, D., Gøtzsche, P. C. and Lang, T. for the CONSORT Group 2001: The revised CONSORT statement for reporting randomised trials: explanation and elaboration. Annals of Internal Medicine 134, 663-94.
Bacchetti, P. 2002: Peer review of statistics in medical research: the other problem. British Medical Journal 324, 1271-73.
Gardner, M. J., Machin, D., Campbell, M. J. and Altman, D. G. 2000: Statistical checklists. In Altman, D. G., Machin, D., Bryant, T. N. and Gardner, M. J. (eds), Statistics with confidence, 2nd edition. London: BMJ Books, 191-201.
Moher, D., Schulz, K. F. and Altman, D. G. for the CONSORT Group 2001: The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomised trials. Annals of Internal Medicine 134, 657-62.
Rubinstein, L. V. 2005: Statistical review for medical journals, guidelines for authors. In Colton, T. and Armitage, P. (eds), Encyclopedia of Biostatistics, 2nd edition. Chichester: John Wiley & Sons, Ltd, pages 5190-5192.
Smith, R. 2005: Statistical review for medical journals, journal's perspective. In Colton, T. and Armitage, P. (eds), Encyclopedia of Biostatistics, 2nd edition. Chichester: John Wiley & Sons, Ltd, pages 5193-5196.
Ware, J. H. 2005: Statistical review for medical journals. In Colton, T. and Armitage, P. (eds), Encyclopedia of Biostatistics, 2nd edition. Chichester: John Wiley & Sons, Ltd, pages 5186-5190.

statistics in imaging

This is the use of statistical techniques to analyse and quantify information contained in digital image format. Imaging is widely used in medicine to visualise objects, structures and even physical processes in vivo and in vitro. A significant advantage in medical imaging is the ability to visualise structures or processes without relying on surgical operations. Thus, animals may be recycled in drug discovery and development or patients may not suffer from intrusive procedures. The ability to acquire information without intrusive procedures is also a disadvantage to medical imaging. This raises the issue of surrogate imaging ENDPOINTS (or surrogates): i.e. how well do the conclusions from an imaging experiment correspond to physical properties obtained from an intrusive procedure?

Although the human visual system is very good at extracting information from images, the sheer amount of data being produced creates the common problem of 'not enough time to look at everything'. Statistical techniques using computers enable researchers and clinicians to summarise large numbers of images rapidly so that patterns, trends, regions of activation, etc., may be identified and quantified. Besides the amount of information, medical imaging systems also see beyond the visible light spectrum and are able to process information from a wide range of the electromagnetic spectrum. Examples of medical imaging systems include conventional radiology (X-rays), angiography (imaging of a system of blood vessels using X-rays), positron emission tomography (PET), X-ray transmission computed tomography (CT), magnetic resonance imaging (MRI), microscopy, single photon emission (computed) tomography (SPET or SPECT), spectroscopy and ultrasound imaging. Even electroencephalograms (EEGs) or magnetoencephalograms (MEGs) are examples of imaging systems, albeit with very poor spatial resolution when compared to MRI or PET.

An image is a two-dimensional function that depends on spatial coordinates, where the amplitude of the function represents the brightness or grey level of the image at a particular point. Individual elements of the image are known as picture elements, pixels for short. Images may be concatenated to form a three-dimensional data structure, or volume, where the individual elements are called voxels. This is common in, for example, MRI and PET, where an experiment on a single subject will involve acquiring information in three spatial dimensions and in time.

Traditional statistical techniques in image analysis include areas such as signal and morphological processing. Signal processing applications include image enhancement, image restoration, colour image processing, wavelets and compression. Morphological processing assumes that set theory may be applied to manipulate structures present in an image.

A relatively new area of research in imaging is the use of MRI in functional or pharmacological studies of the brain. Functional MRI (fMRI) is now well developed and seeks to associate brain functions (human or animal) with specific regions of the brain. Pharmacological MRI (phMRI) is relatively new and seeks to associate pharmacokinetics with specific regions of the (animal) brain. Although group studies are widespread, consider a single-subject analysis from a typical fMRI experiment. After data acquisition, a set of images associated with distinct slices of the brain is available for analysis. Each slice will have a time sequence associated with it; i.e. the imaging experiment contains both spatial and temporal information. Given knowledge of the study design, the goal is to identify regions of the brain where significant activation was observed, where activation is measured by the intensity of the signal observed in the fMRI experiment. Signal intensity is related to the ratio of oxygenated and deoxygenated blood locally in the brain.

statistics in imaging  Example of an fMRI slice (left) and voxel time course (right) for the voxel at (X,Y) = (50,30), plotted against time. The experimental design has been superimposed on the time course plot, where the visual stimulation is shown by a dashed line and the audio stimulation is shown by a dotted line (data provided by the Brain Mapping Unit, Department of Psychiatry, University of Cambridge)

The figure shows a typical slice from an MRI experiment, a voxel time course and the study design of on/off sequences for visual (dashed line) and auditory (dotted line) stimuli. Each voxel in the image has an associated time course; a mask that eliminates nonbrain voxels is typically used to focus the data analysis. LINEAR REGRESSION, or, more fully, using the GENERALISED LINEAR MODEL (GLM), is performed on each voxel using the experimental design, convolved with a function to model the haemodynamic response of the patient, as the independent variable. Trend removal is an important step and may be applied as a preprocessing step or by incorporating low-frequency terms explicitly in the GLM. The typical assumption of independence between observations is not true in fMRI data; methods such as pre-whitening, autoregressive modelling and least squares with adjustment for correlated errors are attempts to overcome the limitations of ordinary least squares.

Fitting the GLM to fMRI data may be performed on an individual voxel, on a cluster of voxels known as a region of interest (ROI), where the data are averaged in space to produce a single time course, or on every brain voxel in the image. For the first two cases, standard theory for statistical inference on regression models may be applied. For the third case, techniques such as Gaussian random field theory, resampling (see BOOTSTRAP) and adjustments by multiple comparison procedures have been used. Regardless of which method is applied, a set of voxels is obtained where significant activation during the experiment was detected. Researchers then relate the images to the anatomical regions identified in the activation image, also known as a statistical parametric map (SPM).

Information from a group of patients may be combined or compared by first registering all images with a standard brain. The most common brain atlas used is the Talairach atlas. Then, a random effects or fixed effects model (see LINEAR MIXED EFFECTS MODELS) may be used to apply a statistical hypothesis test between groups of subjects in the experiment. For more details see Serra (1982), Glasbey and Horgan (1995), Moonen and Bandettini (1999), Gonzalez and Woods (2002) and Worsley et al. (2002). BW

Glasbey, C. A. and Horgan, G. W. 1995: Image analysis for the biological sciences. Chichester: John Wiley & Sons, Ltd.
Gonzalez, R. C. and Woods, R. E. 2002: Digital image processing, 2nd edition. Englewood Cliffs, NJ: Prentice Hall.
Moonen, C. T. W. and Bandettini, P. A. (eds) 1999: Functional MRI. Berlin: Springer-Verlag.
Serra, J. 1982: Image analysis and mathematical morphology. London: Academic Press.
Worsley, K. J., Liao, C. H., Aston, J., Petre, V., Duncan, G. H., Morales, F. and Evans, A. C. 2002: A general statistical approach for fMRI data. NeuroImage 15, 1, 1-15.
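The voxel-wise regression described in this entry can be sketched in a few lines. The example below is only a minimal illustration under stated assumptions: the on/off design, the gamma-shaped response kernel (`toy_hrf`) and the simulated voxel are all hypothetical, and the fit is ordinary least squares with an intercept and a single regressor, so it ignores the trend removal and correlated errors that the entry warns about.

```python
import math
import random


def boxcar(n_scans, block_len):
    """On/off design: blocks of `block_len` scans, starting with 'off' (0)."""
    return [float((t // block_len) % 2) for t in range(n_scans)]


def toy_hrf(length=8):
    """Hypothetical gamma-shaped response kernel, normalised to sum to 1."""
    raw = [t ** 2 * math.exp(-t) for t in range(length)]
    total = sum(raw)
    return [r / total for r in raw]


def convolve(signal, kernel):
    """Causal discrete convolution, truncated to the length of the signal."""
    return [
        sum(kernel[k] * signal[t - k] for k in range(len(kernel)) if t - k >= 0)
        for t in range(len(signal))
    ]


def fit_voxel(y, x):
    """Ordinary least squares fit of y = b0 + b1 * x.
    Returns (b0, b1, t statistic for b1)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    t_stat = b1 / se if se > 0 else math.inf
    return b0, b1, t_stat


# Simulated voxel: baseline 100, activation effect 2, independent Gaussian noise
rng = random.Random(1)
regressor = convolve(boxcar(100, 10), toy_hrf())
voxel = [100.0 + 2.0 * xi + rng.gauss(0.0, 0.5) for xi in regressor]
print(fit_voxel(voxel, regressor))
```

Applying `fit_voxel` to every masked voxel and thresholding the resulting t statistics would give a crude activation map; a realistic analysis would add low-frequency terms and an error model for temporal correlation.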

StatXact This is a specialised software package for the exact analysis of small-sample categorical and nonparametric data with special emphasis on data in the form of contingency tables. The term 'small-sample' applies equally to datasets with only a few observations, to large but unbalanced datasets or to CONTINGENCY TABLES with zeros and small cell counts in some of the cells but large cell counts in other cells. In these settings, StatXact produces exact P-VALUES and exact CONFIDENCE INTERVALS instead of relying on possibly unreliable large-sample theory for its inferences. The inference is based on generating permutation distributions of the appropriate test statistics in a conditional reference set. Different reviews of StatXact are given by Lynch, Landis and Localio (1991), Wass (2000) and Oster (2002).

The current version, StatXact 6, offers exact P-values for one-, two- and K-sample problems, 2 x 2, 2 x c and r x c contingency tables and measures of ASSOCIATION. The data may be either unstratified or stratified. Both independent and blocked samples are accommodated. This version computes the exact confidence interval for ODDS RATIOS that arise from 2 x 2 and 2 x c contingency tables, as well as an exact confidence interval for the shift parameter in an ordered 2 x c contingency table. StatXact offers procedures that cater explicitly to binomial data, Poisson data, nominal categorical data, ordered categorical data, ordered correlated categorical data, continuous complete data and continuous right-censored data. For comparing two proportions (either from dependent or independent samples), StatXact provides the exact unconditional confidence interval for a difference in proportions or the ratio of two proportions and computes exact P-values for tests of equivalence and noninferiority.

In addition to tools for exact inference, StatXact also provides exact power and sample-size calculations for study designs involving one, two or several binomial populations. In the two-binomial case, these features include exact power and sample-size calculations for designing noninferiority and equivalence studies.

In case the computation of an exact P-value becomes infeasible due to the lack of either time or computing memory, StatXact produces an unbiased estimate of the exact P-value to at least two decimal digits of accuracy using efficient Monte Carlo simulation strategies (see MARKOV CHAIN MONTE CARLO). The user can arbitrarily increase the number of Monte Carlo simulations in order to increase the accuracy.

StatXact 6 runs on Microsoft Windows NT/2000/XP as a standalone product. In addition, a special version, StatXact PROCs for SAS Users, is available as external SAS procedures for both the Microsoft Windows and Unix operating systems. CCo/PSe/CM/NP

Lynch, J. C., Landis, J. R. and Localio, A. R. 1991: StatXact. The American Statistician 45, 2, 151-4.
Oster, R. A. 2002: An examination of statistical software packages for categorical data analysis using exact methods. The American Statistician 56, 3, 235-46.
Wass, J. A. 2000: StatXact 4 for Windows. Biotech Software and Internet Report 1, 1, 17-23.
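To make the idea of a permutation distribution in a conditional reference set concrete, the sketch below treats the simplest case, a 2 x 2 table with both margins fixed. It is not StatXact's algorithm or interface: the function names are hypothetical, the exact P-value is obtained by brute-force enumeration (Fisher's exact test), and the Monte Carlo version uses plain random reshuffling rather than the efficient strategies the entry refers to.

```python
import random
from math import comb


def _table_prob(x, row1, row2, col1):
    """Hypergeometric probability that the top-left cell equals x,
    given fixed row totals (row1, row2) and first column total col1."""
    return comb(row1, x) * comb(row2, col1 - x) / comb(row1 + row2, col1)


def fisher_exact_2x2(a, b, c, d):
    """Two-sided exact P-value for the table [[a, b], [c, d]]: the total
    probability of all tables no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = _table_prob(a, row1, row2, col1)
    return sum(
        p
        for p in (_table_prob(x, row1, row2, col1) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-9)  # small tolerance for floating-point ties
    )


def monte_carlo_p_2x2(a, b, c, d, n_sim=10000, seed=0):
    """Monte Carlo estimate of the same P-value, sampling tables with the
    observed margins by randomly reassigning column labels to rows."""
    rng = random.Random(seed)
    row1, row2, col1 = a + b, c + d, a + c
    labels = [1] * col1 + [0] * (row1 + row2 - col1)  # 1 = first column
    p_obs = _table_prob(a, row1, row2, col1)
    hits = 0
    for _ in range(n_sim):
        x = sum(rng.sample(labels, row1))  # simulated top-left cell
        if _table_prob(x, row1, row2, col1) <= p_obs * (1 + 1e-9):
            hits += 1
    return hits / n_sim


# Hypothetical table [[3, 1], [1, 3]]: the exact two-sided P-value is 34/70
print(fisher_exact_2x2(3, 1, 1, 3))
print(monte_carlo_p_2x2(3, 1, 1, 3, n_sim=20000, seed=1))
```

Increasing `n_sim` shrinks the sampling error of the estimate roughly in proportion to the square root of the number of simulations, which is the trade-off the entry describes.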


stem-and-leaf plot

Essentially, this is an enhanced HISTOGRAM in which the actual data values are retained for inspection. Observed values are each divided into a suitable 'stem' and 'leaf', e.g. the tens figure and the units figure in many examples, and then all the leaves corresponding to a particular stem are listed (usually horizontally) next to the value of the stem. An example is shown in the figure.


stem-and-leaf plot  A stem-and-leaf plot for the heights in centimetres of 351 elderly women

The plot combines the visual picture of the data provided by the histogram with a display of the ordered data values. The design of stem-and-leaf plots is discussed in Velleman and Hoaglin (1981). It is important to use a typeface for which each digit occupies equivalent space, otherwise a key feature of being 'a histogram on its side' is lost. SSE

Velleman, P. F. and Hoaglin, D. C. 1981: Applications, basics, and computing of exploratory data analysis. Boston: Duxbury.
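The construction described in this entry can be sketched in a few lines. The function name and the sample heights are hypothetical, and the sketch assumes non-negative integer values with the final digit as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf display (stem = tens, leaf = units)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(str(v % 10))
    # Include empty stems so the display keeps its 'histogram on its side' shape.
    stems = range(min(leaves), max(leaves) + 1)
    return [f"{stem} | {''.join(leaves[stem])}" for stem in stems]


heights_cm = [142, 145, 145, 151, 158, 160]  # hypothetical heights
for line in stem_and_leaf(heights_cm):
    print(line)
```

Printed in a fixed-width typeface, the length of each row of leaves is proportional to the frequency for that stem.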

stepwise regression See LOGISTIC REGRESSION, MULTIPLE LINEAR REGRESSION

stochastic process

This is any system that develops in accordance with probabilistic laws, usually in time but sometimes in space and possibly even in both time and space. For example, the spread of an epidemic is a stochastic process and its development can be tracked in time, across some terrain or at the conjunction of both time and position. The constituents of a stochastic process are its state, X say, and its indexing variable(s), s or t. The state is the primary measure of interest, such as number of individuals ill, while the indexing variable denotes either the time (t) or the position (s) at which the state is measured. A discrete indexing variable is usually shown as a subscript, but a continuous index appears within traditional function notation. For example, suppose that the state of the epidemic is the number of individuals who are ill. Then X_t would denote the number of individuals ill at time t if observations were taken at the start of each day, while X(s) would denote the number of individuals ill at position s measured continuously in space. Of course, the state of the process can also be either discrete (e.g. number of individuals ill) or continuous (e.g. ECG reading of a cardiac patient).

An essential ingredient in a stochastic process is the dependence of either successive or neighbouring observations. Different assumptions about the dependence structure lead to different types of stochastic process, which can be used as models for many observations collected in practice. The objective is usually to derive theoretical PROBABILITIES for the various states of the system and thus to use these probabilities either for predicting the future behaviour of the system or for gaining some understanding of its mechanism. Many practical systems can be modelled adequately by assuming a Markovian dependence structure, in which the PROBABILITY DISTRIBUTION of X depends only on the most recent or neighbouring value. Standard stochastic processes that accord with such an assumption include random walks, Markov chains, branching processes, birth-and-death processes, queues and Poisson processes. Jones and Smith (2001) provide an accessible introduction to the mathematics of such processes.

Some classical applications of stochastic models to medicine are described in Gurland (1964). Successful uses of Markov models in medical contexts range in time and application from the planning of patient care (Davies, Johnson and Farrow, 1975) to resource provision (Davies and Davies, 1994) and the cost-effectiveness of vaccines (Byrnes, 2002). Many more examples can be found in journals such as Health Care Management Science. WK

Byrnes, G. B. 2002: A Markov model for sample size calculation and inference in vaccine cost-effectiveness studies. Statistics in Medicine 21, 3249-60. Davies, R. and Davies, H. T. O. 1994: Modelling patient flows and resource provision in health systems. Omega, International Journal of Management Science 22, 123-31. Davies, R., Johnson, D. and Farrow, S. 1975: Planning patient care with a Markov model. Operational Research Quarterly 26, 599-607. Gurland, J. (ed.) 1964: Stochastic models in medicine and biology. Madison, WI: University of Wisconsin Press. Jones, P. W. and Smith, P. 2001: Stochastic processes, an introduction. London: Arnold.
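A Markovian dependence structure with a discrete state and a discrete time index can be sketched as follows. The two states ('well' and 'ill') and the daily transition probabilities are hypothetical values chosen for illustration, not taken from any of the studies cited.

```python
import random

STATES = ["well", "ill"]
# Hypothetical transition matrix: row i gives P(next state | current state i).
P = [[0.9, 0.1],   # from 'well'
     [0.5, 0.5]]   # from 'ill'


def step_distribution(dist, P):
    """Advance the state probabilities by one time step."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def distribution_after(dist, P, n_steps):
    """Theoretical state probabilities after n_steps steps."""
    for _ in range(n_steps):
        dist = step_distribution(dist, P)
    return dist


def simulate_path(start, P, n_steps, seed=0):
    """Simulate one realisation X_0, X_1, ..., X_n of the chain (state indices)."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(n_steps):
        path.append(rng.choices(range(len(P)), weights=P[path[-1]])[0])
    return path


print(distribution_after([1.0, 0.0], P, 200))   # approaches (5/6, 1/6)
print([STATES[i] for i in simulate_path(0, P, 10)])
```

The first function pair derives theoretical probabilities for the states; the simulation shows how each new value depends only on the most recent one.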

stratified randomisation See RANDOMISATION

stratified sampling Stratified sampling occurs within defined strata of some population. This should be carried out when the population contains easily identifiable subpopulations. If the sizes of the strata are different then proportional allocation should be used. If the STANDARD DEVIATIONS are known in advance then optimal or Neyman allocation can be used to minimise the VARIANCE of the estimate of the population MEAN. If they are unknown it is possible to use a pilot study to estimate the standard deviations.

The method is as follows. Define the strata that the population falls into. Decide if the strata are of a similar size and if the standard deviations are known. For similar sized strata use simple random sampling to select members of each stratum. If the sizes are different then the number sampled in each stratum is proportional to stratum size. Then simple random sampling is used to obtain the correct number in each stratum. If the standard deviations are known in advance then, for a fixed total sample size n, the allocation is obtained by choosing n_j so that:

\[ n_j = n \, \frac{N_j S_j}{\sum_{k=1}^{s} N_k S_k} \]

where N_j is the number in the jth stratum, S_j is the standard deviation of the values of items within that stratum, n is the fixed total sample size, n_j is the number to be chosen by simple random sampling from the stratum and s is the number of strata.

Thornhill et al. (2000) used stratified sampling in a study of disability following head injury. The patients were stratified according to the Glasgow coma score. The mild and unclassified patients were further stratified by the presenting hospital and a simple random sample was taken. In general, if the population can be separated into distinguishable strata then the estimates from stratified sampling will be more precise than from a SIMPLE RANDOM SAMPLE and therefore it can be efficient. The disadvantages are that it can be difficult to choose the strata, it is not useful without homogeneous subgroups, it can require accurate information about the population and it can be expensive. For more details see Crawshaw and Chambers (1994) and Upton and Cook (2002). SLV

Crawshaw, J. and Chambers, J. 1994: A concise course in A level statistics, 3rd edition. Cheltenham: Stanley Thornes Publishers Ltd. Thornhill, S., Teasdale, G. M., Murray, G. D., McEwen, J., Roy, C. W. and Penny, K. I. 2000: Disability in young people and adults one year after head injury: prospective cohort study. British Medical Journal 320, 1631-5. Upton, G. and Cook, I. 2002: Dictionary of statistics. Oxford: Oxford University Press.
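The difference between proportional and Neyman (optimal) allocation can be seen in a short sketch. The stratum sizes and standard deviations below are hypothetical, and the function names are assumptions made for illustration; in practice the resulting numbers would be rounded to whole sampling units.

```python
def proportional_allocation(n, sizes):
    """Allocate a total sample of n in proportion to stratum sizes N_j."""
    total = sum(sizes)
    return [n * N / total for N in sizes]


def neyman_allocation(n, sizes, sds):
    """Allocate a total sample of n in proportion to N_j * S_j."""
    weights = [N * S for N, S in zip(sizes, sds)]
    total = sum(weights)
    return [n * w / total for w in weights]


# Hypothetical population: two strata, the smaller one twice as variable.
sizes = [1000, 2000]
sds = [10.0, 5.0]
print(proportional_allocation(90, sizes))   # [30.0, 60.0]
print(neyman_allocation(100, sizes, sds))   # [50.0, 50.0]
```

Neyman allocation samples the more variable stratum more heavily than its size alone would suggest.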

structural equation modelling software

The four most commonly used packages for fitting structural equation models are:

EQS (http://www.mvsoft.com/)
LISREL (http://www.ssicentral.com/)
MPlus (http://www.statmodel.com/order.html)
AMOS (http://www.amosdevelopment.com/)

All four allow the fitting of complex models relatively easily, although MPlus is possibly the most flexible. Each package's website provides specific information on its capabilities, as well as availability and cost. SSE

structural equation models The operational definition provided by Pearl (2000, p. 160) states: 'An equation y = βx + ε is said to be structural if it is to be interpreted as follows: In an ideal experiment where we control X to x and any other set Z of variables (not containing X or Y) to z, the value of Y is given by βx + ε, where ε is not a function of the settings x and z.' The key word here is 'control'. We are observing values of Y after manipulating or fixing the values of X. The model implies that the values of Y, in fact, are determined by the values of X. A structural equation model is a description of the causal effect of X on Y. It is a CAUSAL MODEL, and the parameter β is a measure of the causal effect of X on Y. It should be clearly distinguished from a linear regression equation that simply describes the ASSOCIATION between two random variables, X and Y. If we are able, in practice, to intervene and control the values of X (by random allocation, for example) then it is straightforward to use the resulting data to obtain a valid estimate of β. If, however, we do not have control of X, but can only observe the values of X and Y (and Z), as in an epidemiological or other type of OBSERVATIONAL STUDY, for example, this does not invalidate the above operational definition, but the challenge for the data analyst is to find a valid (i.e. unbiased) estimate of the causal parameter β under these circumstances.

The equation y = βx + ε is, of course, a description of a very simple structural model. It is common to collect data on several response variables (Ys) and several explanatory variables (Xs) and to construct a series of structural equations of the following form:

[equation (1) is missing from the source text] (i = 1 to I; j, k = 1 to J)    (1)

in which several of the β values will be fixed to be zero, a priori. The others are to be estimated from the data. The form of the equations defined in (1) - i.e. the structural theory that determines the pattern of β values to be estimated and those fixed at zero - is determined by the investigator's prior knowledge or hypotheses concerning the causal processes generating the data. Quoting Byrne (1994, p. 3): 'Structural equation modeling (SEM) is a statistical methodology that takes a hypothesis-testing (i.e. confirmatory) approach to the multivariate analysis of a structural theory bearing on some phenomenon.' Typically, SEM involves (a) the specification of a set of structural equations, (b) representation of these structural equations using a graphical model (a path diagram - see later), (c) simultaneously fitting the set of structural equations to a given set of data in order to estimate the β values and to test the adequacy of the model. If the model fails to fit then the investigator may revise the model and try again. The success of the exercise is likely to be highly dependent upon the quality of the investigator's prior knowledge of the likely


causal mechanisms under test and how much thought he or she has given to the design of the study in the first place. Good design and subsequent statistical analyses require technical knowledge, skill and experience. For technical knowledge, readers are referred to introductory texts by Dunn, Everitt and Pickles (1993), Byrne (1994) and Shipley (2000), and to the advanced monograph by Bollen (1989). Discussion of SEM in the context of recent work on causal inference can be found in Pearl (2000, 2009) and, again, in Shipley (2000).

Traditionally, SEM has concentrated on structural models for quantitative data, which are usually assumed to be multivariate normal. Extensions from the traditional linear structural equations (i.e. LINEAR REGRESSION) to generalized linear structural equations are discussed by Skrondal and Rabe-Hesketh (2004).

It is frequently the case that we cannot measure constructs directly, or at least not without considerable MEASUREMENT ERROR. This gives rise to the idea of LATENT VARIABLES. These are characteristics that are not directly observable. They may be straightforward concepts such as height, weight, amount of exposure to a known toxin, or concentration of a given metabolite in blood or urine, but we explicitly acknowledge that they cannot be measured without error. The observed measurement is a manifest or indicator variable, while the corresponding unknown, but true, value is a latent variable. However, latent variables may be more abstract theoretical constructs that are introduced to explain COVARIANCE between manifest or indicator variables. An example of this last type is a subject's cognitive ability or general intelligence, which the set of scores on a battery of cognitive tests is assumed in some way to reflect. Another example would be a set of symptom severity scores (the manifest variables), which are assumed to be indicators of a patient's overall degree of depression (the latent variable).

Typically, a data analyst will propose a formal measurement model (usually equivalent to some form of factor analysis representation) to relate the observed measurements to the underlying latent variables. We can then proceed to propose structural or causal hypotheses involving the latent variables instead of the fallible (error-prone) indicators. We start, for example, with a COVARIANCE MATRIX for the observed variables. We fit a general structural equation model to this covariance or moments matrix. This procedure will involve the simultaneous fitting of the measurement equations for the relevant latent variables and their corresponding indicators and of the structural equations thought to reflect the assumed causal relationships between the latent variables. Specialist software packages are now widely available for such analyses (see STRUCTURAL EQUATION MODELLING SOFTWARE).

Structural equation models are very often represented by a graphical structure known as a path diagram (see PATH ANALYSIS). In a path diagram the proposed relationships between variables (whether manifest or latent) are represented either by a single-headed arrow (indicating the direction of a causal effect) or a double-headed one (indicating CORRELATION). The observed or manifest variables are usually placed within a rectangular or square box, while latent variables are placed within an oval or a circle. Random measurement errors and residuals from structural equations, although they are strictly speaking latent variables, are not traditionally placed within a circle or oval. Path diagrams are very closely related to the graphical representations (directed acyclic graphs, or DAGs, for example; see GRAPHICAL MODELS) that have relatively recently been developed elsewhere (see Pearl, 2000, 2009, for example). Two simple examples of path diagrams are shown in the two figures. A detailed explanation will be given in the following section.

[Figure: path diagram with variables Z, X and Y and residuals D_X and D_Y (correlation ρ)]

structural equation models Path diagram to represent the structural equations linking encouragement to stop smoking during pregnancy (Z), the amount smoked during pregnancy (X) and the birth weight of the child (Y). D_X and D_Y are randomly distributed residuals

[Figure: path diagram with variables Z, T_X and Y, indicators X_1 and X_2 with errors E_1 and E_2, and residuals D_X and D_Y (correlation ρ)]

structural equation models Path diagram to represent the structural equations linking encouragement to stop smoking during pregnancy (Z), the true amount smoked during pregnancy (T_X) and the birth weight of the child (Y). D_X and D_Y are randomly distributed residuals. X_1 and X_2 are error-prone indicators of smoking, with uncorrelated measurement errors E_1 and E_2 respectively

For an example, Permutt and Hebel (1989) describe a trial in which pregnant women were randomly allocated to receive encouragement to reduce or stop their cigarette smoking during pregnancy (the treatment group) or not (the control group) - indicated by the binary variable, Z. An intermediate outcome variable (X) was the amount of cigarette smoking recorded during pregnancy. The ultimate outcome (Y) was the birth weight of the newborn child. Smoking is likely to have been reduced in the group subject to encouragement, but also in the control group (although, presumably, to a lesser extent). There are also likely to be hidden confounders (e.g. other health promoting behaviours) that are associated with both the mother's smoking during pregnancy and the child's birth weight. Smoking (X) is an endogenous treatment variable: the above confounding will result in the residual from a structural equation model to explain the level of smoking by RANDOMISATION to receive encouragement being correlated with the residual from the structural equation linking observed levels of smoking to the birth weight of the child. We assume that there is no direct effect of randomization (Z) on outcome (Y): the effect of Z on Y is an indirect one through smoking (X), i.e. Z is an INSTRUMENTAL VARIABLE. Ignoring the intercept terms, the two structural equations are the following:

\[ X = \gamma Z + D_X \quad \text{and} \quad Y = \beta X + D_Y \]

In fitting these two models to the appropriate data we acknowledge the correlation (ρ) between the residuals, D_X and D_Y (those components of X and Y not explained by Z and X respectively). The overall model is illustrated by the first figure.

Now, what if we acknowledge that smoking levels cannot be measured accurately and we decide to obtain two different measurements on each person in the trial (X_1 and X_2, say, being self-reported numbers of packs per day, obtained at 6 months and 8 months into the pregnancy)? The true level of smoking is now represented by the variable T_X. Our measurement model is represented by the two equations:

\[ X_1 = T_X + E_1 \quad \text{and} \quad X_2 = T_X + E_2 \]

We assume that the E_1 and E_2 measurement errors are uncorrelated and that there is no change in the true level of smoking between the two times. The revised structural equations now use T_X rather than X, as follows:

\[ T_X = \gamma Z + D_X \quad \text{and} \quad Y = \beta T_X + D_Y \]

The corresponding path diagram is shown in the second figure. Note that not all of the model parameters implied by the model in the second figure can be estimated. The model is too complex for the data at hand. The model as a whole is said to be underidentified, but the good news is that we can still estimate β, the parameter most likely to be of interest to the investigator. Problems of underidentification are beyond the scope of this entry, but are covered by the standard textbooks on structural equations modelling referenced below. GD

Bollen, K. A. 1989: Structural equations with latent variables. New York: John Wiley & Sons, Inc. Byrne, B. M. 1994: Structural equation modeling with EQS and EQS/Windows. Thousand Oaks, CA: Sage Publications. Dunn, G., Everitt, B. S. and Pickles, A. 1993: Modelling covariances and latent variables using EQS. London: Chapman & Hall. Pearl, J. 2000; 2nd edition, 2009: Causality. Cambridge: Cambridge University Press. Permutt, T. and Hebel, J. R. 1989: Simultaneous-equation estimation in a clinical trial of the effect of smoking on birth weight. Biometrics 45, 619-22. Shipley, B. 2000: Cause and correlation in biology. Cambridge: Cambridge University Press. Skrondal, A. and Rabe-Hesketh, S. 2004: Generalized latent variable modeling: multilevel, longitudinal and structural equations models. Boca Raton, FL: Chapman & Hall/CRC.

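The role of the randomised encouragement Z as an instrumental variable in the smoking example can be illustrated by simulation. This is a sketch under stated assumptions, not the analysis of Permutt and Hebel: every numerical value (γ = -5, β = -15, the intercepts and the noise levels) is hypothetical, and the simple ratio-of-covariances estimator is used in place of full simultaneous-equation fitting.

```python
import random


def mean_of(v):
    return sum(v) / len(v)


def covariance(a, b):
    ma, mb = mean_of(a), mean_of(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def simulate_trial(n, gamma, beta, seed=0):
    """Simulate Z (randomised), X (smoking) and Y (birth weight).

    A hidden confounder u lowers smoking and raises birth weight, so the
    residuals of the two structural equations are correlated.
    """
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.randint(0, 1)                  # random allocation
        u = rng.gauss(0, 1)                     # hidden confounder
        xi = 10 + gamma * zi - 3 * u + rng.gauss(0, 2)
        yi = 3400 + beta * xi + 100 * u + rng.gauss(0, 300)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate_trial(20_000, gamma=-5.0, beta=-15.0)
naive_slope = covariance(x, y) / covariance(x, x)   # regression of Y on X
iv_estimate = covariance(z, y) / covariance(z, x)   # uses Z as instrument
print(round(naive_slope, 1), round(iv_estimate, 1))
```

Under these assumptions the regression slope of Y on X is biased by the confounder, whereas the instrumental variable estimate is close to the β used to generate the data.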
Student's t-distribution

See t-DISTRIBUTION

Student's t-test

William Sealy Gosset, who worked under the pseudonym of 'Student', developed the Student's t-test. The Student's t-test is commonly referred to merely as the t-test. The simplest use of the t-test is in comparing the MEAN of a sample to some specified population mean; this is usually called the one-sample t-test. The t-test can be modified to compare the means of two independent samples (the two-sample t-test) and for paired data to compare the differences between the pairs (the paired t-test).

Student's t-test is a parametric test and certain assumptions are made about the data. These are that the observations within each group (with independent samples) or the differences (with paired samples) are approximately normally distributed and for the two-sample case we also require the two groups to have similar VARIANCES. If the sample data depart markedly from these assumptions then the analysis may be seriously flawed. However, the t-test is 'robust' and is not greatly affected by a moderate failure to meet the assumptions.

The one-sample t-test can be used to compare the mean of a sample to a certain specified value. This value is usually the population mean. The NULL HYPOTHESIS states that the mean of the population from which the sample was drawn is equal to the specified value and the alternative hypothesis states that it differs from that value. The assumption we make is that the data are a random sample of independent observations from an underlying normal distribution. The test statistic t is given by:

\[ t = \frac{\text{sample mean} - \text{hypothesised mean}}{\text{standard error of sample mean}} \]

This is compared against the t-DISTRIBUTION with n - 1 DEGREES OF FREEDOM, where n is the sample size. So t is the deviation of a normal variable from its hypothesised mean measured in STANDARD ERROR units. The standard error of the sample mean is estimated by s/√n, where s is the sample STANDARD DEVIATION.
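The one-sample statistic can be computed directly from summary values. The sketch below uses the figures from this entry's BMI illustration (mean 24.5, standard deviation 2.5, n = 25, hypothesised mean 26); the function names are assumptions, and the P-value is obtained by numerically integrating the t density with Simpson's rule rather than by calling a statistics library.

```python
import math


def t_density(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, n_intervals=2000):
    """Two-sided P-value: 1 minus the area under the density on [-|t|, |t|]."""
    b = abs(t)
    h = b / n_intervals                      # n_intervals must be even
    total = t_density(0.0, df) + t_density(b, df)
    for i in range(1, n_intervals):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central = 2 * total * h / 3              # area on [-|t|, |t|] by symmetry
    return 1 - central


def one_sample_t(sample_mean, sample_sd, n, hypothesised_mean):
    """Return the t statistic and its degrees of freedom."""
    se = sample_sd / math.sqrt(n)            # standard error of the mean
    return (sample_mean - hypothesised_mean) / se, n - 1


t, df = one_sample_t(24.5, 2.5, 25, 26)
print(t, df, round(two_sided_p(t, df), 4))
```

The numerical integration is adequate for moderate degrees of freedom; for very large samples math.gamma would overflow and a library routine would be the better choice.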

For example, suppose BMI values for a sample of 25 people were measured and a mean value of 24.5 was found with a sample standard deviation of 2.5. To test if this sample mean BMI is significantly different from a population mean BMI of 26 we can use the one-sample t-test, where our null hypothesis is that there is no difference between the mean of the population sampled and the population mean of 26. This allows us to calculate the test statistic as follows:

\[ t = \frac{24.5 - 26}{2.5/\sqrt{25}} = -3.0 \]

Using a t-distribution with (n - 1) = 24 degrees of freedom, we find a P-VALUE of 0.0062. The result is statistically significant and we therefore reject the null hypothesis in favour of the alternative hypothesis that the mean BMI differs from 26.

We can use the two-sample t-test to determine the statistical significance of an observed difference between the mean values of some variable between two subgroups or between separate populations. For example, we could look at the differences in heights between males and females. The test statistic for the two-sample t-test is given by:

\[ t = \frac{\text{difference in sample means} - \text{difference in hypothesised means}}{\text{standard error of the difference in the two sample means}} \]

Frequently the null hypothesis of interest is whether the two groups have equal means and the corresponding two-sided alternative hypothesis is that the means are in fact different. For example, when comparing the mean outcome for two different treatments, is the difference in means observed a statistically significant one? In this case the test statistic reduces to:

\[ t = \frac{\text{difference in the two sample means}}{\text{standard error of the difference in the two sample means}} \]

This is then compared to the t-distribution with n_1 + n_2 - 2 degrees of freedom, where n_1 is the sample size for the first group and n_2 is the sample size for the second group. The standard error of the difference in the two sample means is given by:

\[ SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}, \qquad s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \]

and s_1 and s_2 are the standard deviations for groups one and two respectively.

For the paired t-test, the data are dependent, i.e. there is a one-to-one correspondence between the values in the two samples. Paired data can occur from two measurements on the same person, e.g. before and after treatment, or the same subject measured at different times. It is incorrect to analyse paired data ignoring the pairing in such circumstances, as important information is lost. Some factors you do not control in the experiment will affect the before and the after measurements equally, so they will not affect the difference between before and after. By looking only at the differences, a paired t-test corrects for these factors. The paired t-test usually tests the null hypothesis that the population mean of the paired differences of the two samples is zero. We assume that the paired differences are independent. To perform the paired t-test we calculate the difference between each set of pairs and then perform a one-sample t-test on the differences with the null hypothesis that the population mean of the differences is equal to zero. More details can be found in Altman (1991). MMB

Altman, D. G. 1991: Practical statistics for medical research. London: Chapman & Hall.
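The two-sample (pooled variance) and paired statistics described in this entry can be sketched as follows. The function names and the small data sets are hypothetical; the functions return only the t statistic and its degrees of freedom, leaving the P-value to tables or software.

```python
import math
from statistics import mean, variance


def two_sample_t(a, b):
    """Pooled-variance two-sample t statistic and degrees of freedom."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    se = math.sqrt(pooled_var / n1 + pooled_var / n2)
    return (mean(a) - mean(b)) / se, df


def paired_t(before, after):
    """Paired t statistic: a one-sample t-test on the within-pair differences."""
    diffs = [y - x for x, y in zip(before, after)]
    n = len(diffs)
    se = math.sqrt(variance(diffs) / n)
    return mean(diffs) / se, n - 1


# Hypothetical independent groups.
print(two_sample_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6]))   # t = -1.0 with 8 df

# Hypothetical before/after measurements on the same four subjects.
print(paired_t([10, 12, 9, 11], [12, 13, 11, 14]))      # t is about 4.90 with 3 df
```

Note that the paired analysis uses n - 1 degrees of freedom, where n is the number of pairs, not the number of observations.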

subgroup analysis This form of analysis is often employed in CLINICAL TRIALS in an attempt to identify particular subgroups of patients for whom a treatment works better (or worse) than for the overall patient population. For example, does a treatment work better for men than for women? Such a question is a natural one for clinicians to ask since they do not treat 'average' patients and, when confronted with a female patient with a certain condition, would like to know whether the accepted treatment for the condition works, say, less well for women.

Assessing whether the effect of treatment varies according to the value of one or more patient characteristics is relatively straightforward from a statistical viewpoint, involving nothing more than testing a treatment by covariate interaction. However, many statisticians would caution against such analyses and, if undertaken at all, suggest that they are interpreted extremely cautiously in the spirit of 'exploration' rather than anything more formal. The reasons for such caution are not difficult to identify.

First, trials can rarely provide sufficient POWER to detect such subgroup/interaction effects; clinical trials accrue sufficient participants to provide adequate precision for estimating quantities of primary interest, usually overall treatment effects. Confining attention to subgroups almost always results in estimates of inadequate precision. A trial just large enough to evaluate an overall treatment effect reliably will almost inevitably lack precision for evaluating differential treatment effects between different population subgroups.

Second, RANDOMISATION ensures that the overall treatment groups in a clinical trial are likely to be comparable. Subgroups may not enjoy the same degree of balance in patient characteristics.

Finally, there are often many possible prognostic factors in the baseline data, e.g. age, gender, race, type or stage of disease, from which to form subgroups, so that analyses may quickly degenerate into 'data dredging', from which arises the potential for post hoc emphasis on the subgroup analysis giving results of most interest to the investigator, with undue emphasis given to results deemed 'statistically significant' contributing, in turn, to a preponderance of 'p < 0.05' results published in the medical literature (an excess of false positive findings, therefore). Other potential dangers of subgroup analysis can be found in detail in Pocock et al. (2002). SSE
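One simple way to test a treatment by covariate interaction, when a treatment effect and its standard error are available for each of two independent subgroups, is to compare the two estimates with a normal approximation. The sketch below assumes that setting; the function name and all numbers are hypothetical.

```python
import math


def interaction_z_test(effect_a, se_a, effect_b, se_b):
    """Compare treatment effects in two independent subgroups.

    Returns the difference in effects, its standard error, the z statistic
    and a two-sided P-value from the normal approximation.
    """
    diff = effect_a - effect_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))     # two-sided normal tail area
    return diff, se_diff, z, p


# Hypothetical treatment effects: 5.0 (SE 2.0) in men, 1.0 (SE 2.0) in women.
print(interaction_z_test(5.0, 2.0, 1.0, 2.0))
```

In this hypothetical case the effect in men alone would look 'significant' (5.0/2.0 = 2.5) while the interaction test gives P of about 0.16, which illustrates why separate within-subgroup tests are a poor substitute for a test of interaction and why such tests tend to lack power.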

Pocock, S. J., Assmann, S. E., Enos, L. E. and Kasten, L. E. 2002: Subgroup analysis, covariate adjustment and baseline comparisons in clinical trial reporting: current practice and problems. Statistics in Medicine 21, 2917-30.

sufficient-component cause model See CAUSAL MODELS

summary measure analysis This is a relatively straightforward approach to the analysis of LONGITUDINAL DATA, in which the repeated measurements of a response variable made on each individual in the study are reduced in some way to a single number that is considered to capture an essential feature of the response over time. In this way, the multivariate nature of the repeated observations is transformed to a univariate measure. The approach has been in use for many years - see, for example, Oldham (1962) and Matthews et al. (1989).

The most important consideration when applying a summary measure analysis is the choice of a suitable summary measure, a choice that needs to be made before any data are collected. The measure chosen needs to be relevant to the particular questions of interest in the study and in the broader scientific context in which the study takes place. A wide range of summary measures has been proposed, as shown in the first table. According to Frison and Pocock (1992), the average response over time is often likely to be the most relevant, particularly in CLINICAL TRIALS.

Having chosen a suitable summary measure, analysis will involve nothing more complicated than the application of Student's t-test or calculation of a CONFIDENCE INTERVAL for the group difference when two groups are being compared or a one-way ANALYSIS OF VARIANCE when there are more than two groups. If considered more appropriate because of the distributional properties of the selected summary measure, then analogous NONPARAMETRIC METHODS might be used.

The summary measure approach can be illustrated using the data shown in the second table, which come from a study of alcohol dependence. Two groups of subjects, one with severe dependence and one with moderate dependence on alcohol, had their salsolinol excretion levels (in millimoles) recorded on four consecutive days.

summary measure analysis Possible summary measures (from Matthews et al., 1989)

Type of data | Question of interest | Summary measure
Peaked | Is overall value of outcome variable the same in different groups? | Overall mean (equal time intervals) or area under curve (unequal intervals)
Peaked | Is maximum (minimum) response different between groups? | Maximum (minimum) value
Peaked | Is time to maximum (minimum) response different between groups? | Time to maximum (minimum) response
Growth | Is rate of change of outcome different between groups? | Regression coefficient
Growth | Is eventual value of outcome different between groups? | Final value of outcome or difference between last and first values or percentage change between first and last values
Growth | Is response in one group delayed relative to the other? | Time to reach a particular value (e.g. a fixed percentage of baseline)

summary measure analysis Salsolinol excretion data

Subject   Day 1   Day 2   Day 3   Day 4
Group 1 (moderate dependence)
1         0.33    0.70    2.33    3.20
2         5.30    0.90    1.80    0.70
3         2.50    2.10    1.12    1.01
4         0.98    0.32    3.91    0.66
5         0.39    0.69    0.73    2.45
6         0.31    6.34    0.63    3.86
Group 2 (severe dependence)
7         0.64    0.70    1.00    1.40
8         0.73    1.85    3.60    2.60
9         0.70    4.20    7.30    5.40
10        0.40    1.60    1.40    7.10
11        2.50    1.30    0.70    0.70
12        7.80    1.20    2.60    1.80
13        1.90    1.30    4.40    2.80
14        0.50    0.40    1.10    8.10
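Several of the summary measures listed in this entry can be computed for a single subject in a few lines. The sketch below uses the four daily readings for subject 1 of the salsolinol data (0.33, 0.70, 2.33, 3.20); the function name and the choice of measures returned are assumptions made for illustration.

```python
def summary_measures(times, values):
    """Reduce one subject's repeated measurements to single-number summaries."""
    n = len(values)
    mean_value = sum(values) / n
    max_value = max(values)
    time_of_max = times[values.index(max_value)]
    # Area under the curve by the trapezium rule (handles unequal intervals).
    auc = sum((times[i + 1] - times[i]) * (values[i] + values[i + 1]) / 2
              for i in range(n - 1))
    # Least-squares slope of response on time (rate of change).
    t_bar = sum(times) / n
    slope = (sum((t - t_bar) * (v - mean_value) for t, v in zip(times, values))
             / sum((t - t_bar) ** 2 for t in times))
    return {"mean": mean_value, "max": max_value, "time_of_max": time_of_max,
            "auc": auc, "slope": slope, "last_minus_first": values[-1] - values[0]}


days = [1, 2, 3, 4]
subject_1 = [0.33, 0.70, 2.33, 3.20]
print(summary_measures(days, subject_1))
```

Whichever measure is chosen, the resulting one number per subject is then analysed with an ordinary univariate method.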


Using the mean of the four measurements available for each subject as the summary measure leads to the results shown in the third table. There is no evidence of a group difference in salsolinol excretion levels.

summary measure analysis Results from using the mean as a summary measure for the data in the second table

          Moderate   Severe
Mean      1.80       2.49
sd        0.60       1.09
n         6          8
t = -1.40, df = 12, P = 0.19
95% CI: (-1.77, 0.39)

A possible alternative to the use of the MEAN as a summary measure is to use the maximum excretion rate recorded over the four days. Applying the WILCOXON RANK SUM TEST to this summary measure results in a test statistic of 36 and associated P-VALUE of 0.21.

The summary measure approach to the analysis of longitudinal data can accommodate missing data but the implicit assumption is that these are missing completely at random (see DROPOUTS). SSE

(See also AREA UNDER CURVE)

Frison, L. and Pocock, S. J. 1992: Repeated measures in clinical trials: analysis using mean summary statistics and its implications for design. Statistics in Medicine 11, 1685-704. Matthews, J. N. S., Altman, D. G., Campbell, M. J. and Royston, P. 1989: Analysis of serial measurements in medical research. British Medical Journal 300, 230-5. Oldham, P. D. 1962: A note on the analysis of repeated measurements of the same subjects. Journal of Chronic Disorders 15, 969-77.
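The t statistic and confidence interval reported in this entry's results table can be checked approximately from the rounded group summaries alone (means 1.80 and 2.49, standard deviations 0.60 and 1.09, n = 6 and 8). In the sketch below the function name is an assumption and the critical value 2.179 is the tabulated 97.5th percentile of the t-distribution with 12 degrees of freedom; small discrepancies from the published t = -1.40 are expected because the inputs are rounded.

```python
import math


def pooled_t_from_summary(m1, s1, n1, m2, s2, n2):
    """Pooled-variance t statistic, degrees of freedom and standard error."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df, se


T_CRIT_12 = 2.179   # from standard t tables, 12 degrees of freedom

t, df, se = pooled_t_from_summary(1.80, 0.60, 6, 2.49, 1.09, 8)
diff = 1.80 - 2.49
ci = (diff - T_CRIT_12 * se, diff + T_CRIT_12 * se)
print(round(t, 2), df, tuple(round(v, 2) for v in ci))
```

The interval includes zero, in agreement with the conclusion that there is no evidence of a group difference.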

support Vector machines

1'bcsc arc algorithms for learning complex ciassiftcation and n:;n:ssion functions. belonging to the geru:raI famil), or 'kernel methods' discussed later. Their 4X11Dpulalional and statistical eflk:iCftC)' n:cently made them one or the tools of choice in cenaiD biologie&l DATA MINING applications. Support vector machines (SVMs) work by embedding the data into a fcatu~ space by means or kernel runctions (the so-called 'kernel trick'). In lhe binary classiftcation cue, a separating hyperplane that sepandCs the two classes is sought in this featu~ space. New da.. points will be classiftc:d into one of both classes according to their position with respect to this hyperplane. SVMs owe their name to their property of isolating a (often small) subset of data points called 'support veclOn·. which have interesting thccRtical properties.

The SVM applOach has several important virtues when compared with earlier approaches: the choice of the hyperplane is foundc:d on statistical arguments: the hyperplane can be found by solving a convex (quacbalic) optimisation problem. which means that IJaining an SVM is not subject 10 local minima: when a nonlinear kernel runction is used. the hyperplane in the rcatu~ space can correspond to a complex (nonlinear) decision boundary in the orieinal data domain. Even ~ inlm:Slingl),. kc:mcl functions can be defined DOl only on vectorial data but on virtually aay kind of data. making it possible lO classifY strings. images. trees or nodes in a graph: lheclassiftcalion ofunsecn data points i'&enerally computationally cheap and depends on the number ofsupport vectors. Fint introduced in 1992, support vector machines are now one or the standard tools in PATI'ER.'I ItE(.'()(JJI(JT applications.. mostly due to their computational emciency and statistical stability. In n:c:ent yc&rs. extensions or this algorithm 10 deal with a number of imponDnt data analysis tasks have been proposc:d, resulting in the general ramily or 'kernel methods' (Shawe-Taylor and Cristianini. 20(4) (see DENSrJ'Y ES~S).

The kinds of relation detected by kernel methods include classifications, regressions, clustering (see CLUSTER ANALYSIS IN MEDICINE), principal components (see PRINCIPAL COMPONENT ANALYSIS), canonical correlations (see CANONICAL CORRELATION ANALYSIS) and many others. In the same way as with SVMs, the kernel trick allows these methods to be applied in a feature space that is induced by this kernel, making kernel methods applicable to virtually any kind of data. Elegantly, the development of kernel methods can always be decomposed into two modular steps: the kernel design, on the one hand, and the choice of the algorithm, on the other hand. The kernel design part implicitly defines the feature space, which should contain all available information that is relevant for the problem at hand. The choice of the algorithm (which needs to be written in terms of kernels) can be done independently from the kernel design. As with SVMs, most kernel methods reduce their training phase to optimising a convex cost function or to solving a simple eigenvalue problem, hence avoiding one of the main computational pitfalls of NEURAL NETWORKS. However, since they often implicitly make use of very high dimensional spaces, kernel methods run the risk of overfitting. For this reason, their design needs to incorporate principles of statistical learning theory, which help to identify the crucial parameters that need to be controlled in order to avoid this risk (see Vapnik, 1995). For further reference on SVMs, see Cristianini and Shawe-Taylor (2000). NC/TDB

Cristianini, N. and Shawe-Taylor, J. 2000: An introduction to support vector machines. Cambridge: Cambridge University Press (www.support-vector.net). Shawe-Taylor, J. and Cristianini, N. 2004: Kernel methods for pattern analysis. Cambridge: Cambridge University Press (www.kernel-methods.net). Vapnik, V. 1995: The nature of statistical learning theory. New York: Springer.

surrogate endpoints These are ENDPOINTS that can replace a clinical endpoint for the purpose of assessing the effects of new treatments earlier, at lower cost, or with greater statistical sensitivity. Surrogate endpoints can include measurements of a biomarker, defined as 'a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention' (Biomarkers Definitions Working Group, 2001). Use of a biomarker as a surrogate endpoint can also be useful if the final endpoint measurement is unduly invasive or uncomfortable. For an endpoint to be a surrogate for a clinical endpoint it must be a measure of disease such that: (a) the size (or frequency) correlates strongly with that clinical endpoint (e.g. blood pressure is positively correlated with the risk of stroke) and (b) treatments producing a change in the surrogate endpoint also modify the risk of that particular clinical endpoint (e.g. reducing blood pressure reduces the risk of stroke).

Surrogate endpoints are routinely used in early drug development, where interest focuses on showing that new treatments have enough activity to warrant further research. In confirmatory PHASE III TRIALS, however, interest focuses on showing that new treatments have the anticipated clinical benefits, and in such situations surrogate endpoints can only be used if they have undergone rigorous statistical evaluation (or 'validation') (Burzykowski, Molenberghs and Buyse, 2005). Indeed, some promising surrogate endpoints have proven to be unreliable predictors of clinical benefits. For example, cardiac arrhythmia was believed to be a good surrogate endpoint for mortality after an acute heart attack, since in these circumstances patients with a higher risk of such an arrhythmia have a greater risk of death. However, several drugs (e.g. lignocaine, flecainide) that prevent arrhythmias after a heart attack actually increase mortality (Echt et al., 1991). Similarly, some blood pressure-lowering drugs (such as angiotensin-converting enzyme inhibitors) have much larger effects on vascular mortality than might be predicted from their effects on blood pressure (Heart Outcomes Prevention Evaluation Study Investigators, 2000). In contrast, disease-free survival has recently been validated as an acceptable surrogate for overall survival in patients with colorectal cancer treated with fluoropyrimidines (Sargent et al., 2005).

Prentice (1989) proposed a definition and operational criteria for the validation of surrogate endpoints. Although the strict criteria proposed by Prentice seem too stringent to ever be met in practice, his landmark paper sparked interest in developing statistical methods that could be used to show that a surrogate is acceptable (or 'validated') for the purposes of assessing a specific class of treatments in a specific disease setting. One approach consists of using a MULTILEVEL MODEL to show that the surrogate endpoint predicts the true endpoint ('individual-level' surrogacy), and that the effects of a treatment on the surrogate endpoint predict the effects of the treatment on the true endpoint ('trial-level' surrogacy) (Buyse et al., 2000). The latter condition requires data to be available from several units, usually from a META-ANALYSIS of several trials. Another approach consists of using a CAUSAL MODEL to compare the causal effect of treatment on the true endpoint in patients for whom treatment does, and does not, affect the surrogate. See Weir and Walley (2006) for a review of the terminology and surrogate validation models. CB/MB

Biomarkers Definitions Working Group 2001: Biomarkers and surrogate endpoints: preferred definitions and conceptual framework. Clinical Pharmacology and Therapeutics 69, 89-95. Burzykowski, T., Molenberghs, G. and Buyse, M. (eds) 2005: Evaluation of surrogate endpoints. Springer. Buyse, M., Molenberghs, G., Burzykowski, T., Renard, D. and Geys, H. 2000: The validation of surrogate endpoints in meta-analyses of randomized experiments. Biostatistics 1, 49-67. Echt, D. S., Liebson, P. R., Mitchell, L. B. and the Cardiac Arrhythmia Suppression Trial (CAST) Investigators 1991: Mortality and morbidity in patients receiving encainide, flecainide, or placebo: the Cardiac Arrhythmia Suppression Trial. New England Journal of Medicine 324, 781-8. Heart Outcomes Prevention Evaluation Study Investigators 2000: Effects of an angiotensin-converting enzyme inhibitor, ramipril, on death from cardiovascular causes, myocardial infarction, and stroke in high-risk patients. New England Journal of Medicine 342, 145-53. Prentice, R. L. 1989: Surrogate endpoints in clinical trials: definition and operational criteria. Statistics in Medicine 8, 431-40. Sargent, D. J., Wieand, S., Haller, D. G. et al. 2005: Disease-free survival (DFS) vs overall survival (OS) as a primary endpoint for adjuvant colon cancer studies: individual patient data from 20,898 patients on 18 randomized trials. Journal of Clinical Oncology 23, 8664-70. Weir, C. J. and Walley, R. J. 2006: Statistical evaluation of biomarkers as surrogate endpoints: a literature review. Statistics in Medicine 25, 183-203.

survival analysis - an overview

This covers methods for the analysis of time-to-event data, e.g. survival times. Survival data occur when the outcome of interest is the time from a well-defined time origin to the occurrence of a particular event or ENDPOINT. If the endpoint is the death of a patient the resulting data are, literally, survival times. However, other endpoints are possible, e.g. the time to relief or recurrence of symptoms. Such observations are often referred to as time-to-event data although survival data is commonly used as a generic term. Standard statistical methodology is not usually appropriate for such data, for two main reasons. First, the distribution of survival time in general is likely to display positive SKEWNESS and so assuming normality for an analysis (as done, for example, by a t-TEST or a regression) is probably not reasonable.


Second, more critical than doubts about normality, however, is the presence of censored observations, where the survival time of an individual is referred to as censored when the endpoint of interest has not yet been reached (more precisely, right censored). For true survival times this might be because the data from a study are analysed at a time point when some participants are still alive. Another reason for censored event times is that an individual might have been lost to follow-up for reasons unrelated to the event of interest, e.g. due to moving to a location that cannot be traced or due to accidental death (see DROPOUTS). When censoring occurs all that is known is that the actual, but unknown, survival time is larger than the censored survival time.

Specialised statistical techniques developed to analyse such censored and possibly skewed outcomes are known as survival analysis. An important assumption made in standard survival analysis is that the censoring is noninformative, i.e. that the actual survival time of an individual is independent of any mechanism that causes that individual's survival time to be censored. For simplicity, this description also concentrates on techniques for continuous survival times - the analysis of discrete survival times is described in Collett (2003).

As an example, consider data that arise from a double-blind, randomised controlled clinical trial (RCT) (see CLINICAL TRIALS) to compare treatments for prostate cancer (placebo versus 1.0 mg of diethylstilbestrol (DES) administered daily by mouth). The full dataset is given in Andrews and Herzberg (1985) and the first table shows the first seven of a subset of 38 patients used here and discussed in Collett (2003). In this study, the time of origin was the date on which a cancer sufferer was randomised to a treatment and the endpoint is the death of a patient from prostate cancer. The survival times of patients who died from other causes or were lost during the follow-up process are regarded as right censored. The 'status' variable in the first table takes the value unity if the patient has died from prostate cancer and zero if the survival time is censored. In addition to survival times, a number of prognostic factors were recorded, namely the age of the patient at trial entry, their serum haemoglobin level in gm/100 ml, the size of their primary tumour in cm² and the value of a combined index of tumour stage and grade (the Gleason index with larger values indicating more advanced tumours). The main aim of this study was to compare the survival experience between the two treatment groups.

In general, to describe survival two functions of time are of central interest - the survival function and the hazard function. These are described in some detail next. The survival function $S(t)$ is defined as the probability that an individual's survival time, $T$, is greater than or equal to time $t$, i.e.:

$$S(t) = \mathrm{Prob}(T \geq t)$$

The graph of $S(t)$ against $t$ is known as the survival curve. The survival curve can be thought of as a particular way of displaying the frequency distribution of the event times, rather than by, say, a HISTOGRAM. When there are no censored observations in the sample of survival times, the survival function can be estimated by the empirical survivor function:

$$\hat{S}(t) = \frac{\text{Number of individuals with survival times} \geq t}{\text{Number of individuals in the dataset}}$$

Since every subject is 'alive' at the beginning of the study and no one is observed to survive longer than the largest of the observed survival times, $t_{\max}$, then:

$$\hat{S}(0) = 1 \quad \text{and} \quad \hat{S}(t_{\max}) = 0$$

Furthermore, the estimated survivor function is assumed constant between two adjacent death times, so that a plot of $\hat{S}(t)$ against $t$ is a step function that decreases immediately after each 'death'. This simple method cannot be used when there are censored observations since the method does not allow for information provided by an individual whose survival time is censored before time $t$ to be used in the computing of the estimate at $t$. The most commonly used method for estimating the survival function for survival data containing censored observations is the product-limit or KAPLAN-MEIER ESTIMATOR. The essence of this approach is the use of a product of a series of conditional probabilities. One alternative estimator for censored survival times, derived differently but in practice often similar, is the Nelson-Aalen estimator. Approximate STANDARD ERRORS and pointwise symmetric or asymmetric CONFIDENCE INTERVALS for the survival function at a given time can be derived to determine the precision of the estimator - details are given in Collett (2003).

survival analysis - an overview Survival times of prostate cancer patients

Patient   Treatment       Survival   Status          Age       Serum haem.   Size of   Gleason
number    (1 = placebo,   time       (1 = died,      (years)   (gm/100 ml)   tumour    index
          2 = DES)        (months)   0 = censored)                           (cm²)
1         1               65         0               67        13.4          34        8
2         2               61         0               60        14.6          4         10
3         2               60         0               77        15.6          3         8
4         1               58         0               64        16.2          6         9
5         2               51         0               65        14.1          21        9
6         1               51         0               61        13.5          8         8
7         1               14         1               73        12.4          18        11

The Kaplan-Meier estimators of the survivor curves for the two prostate cancer treatments are shown graphically in the figure. The survivor curves are step functions that decrease at the time points when participants died of the cancer. The censored observations in the data are indicated by the 'cross' marks on the curves. In our patient sample there is approximately a difference of 20% in the proportion surviving for at least 50 to 60 months between the treatment groups.

Since the distribution of survival times tends to be positively skewed the MEDIAN is the preferred summary measure of location. The median survival time is the time beyond which 50% of the individuals in the population under study are expected to survive and, once the survivor function has been estimated by $\hat{S}(t)$, can be estimated by the smallest observed survival time, $t_{50}$, for which the value of the estimated survivor function is less than 0.5. The estimated median survival time can be read from the survival curve by finding the smallest value on the x axis for which the survival proportion reaches less than 0.5. The figure shows that the median survival in the placebo group can be estimated as 69 months while an estimate for the DES group is not available since survival exceeds 50% throughout the study period. A similar procedure can be used to estimate other percentiles of the distribution of the survival times and approximate confidence intervals can be found once the variance of the estimated percentile has been derived from the VARIANCE of the estimator of the survivor function.

In the analysis of survival data, it is often of some interest to assess which periods have the highest and which the lowest chance of death (or whatever the event of interest happens to be) among those people alive at the time. The appropriate quantity for such risks is the hazard function, $h(t)$, defined as the (scaled) PROBABILITY that an individual experiences an event in a small time interval $dt$, given that the individual has survived up to the beginning of the interval. The hazard function therefore represents the instantaneous death rate for an individual surviving to time $t$. It is a measure of how likely an individual is to experience an event as a function of the age of the individual. The hazard function may remain constant, increase or decrease with time or take some more complex form. The hazard function of death in human beings, for example, has a 'bathtub' shape. It is relatively high immediately after birth, declines rapidly in the early years and then remains relatively constant until beginning to rise during late middle age.

A Kaplan-Meier type estimator of the hazard function is given by the proportion of individuals experiencing an event in an interval per unit time, given that they have survived to the beginning of the interval. However, the estimated hazard function is generally considered 'too noisy' for practical use. Instead, the cumulative or integrated hazard function, which is derived from the hazard function by summation, is usually displayed to describe the change in hazard over time.
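The product-limit calculation and the median read-off described in this entry can be sketched as follows. This is a minimal illustration in plain Python, not the routine used for the analyses reported here; the function names are hypothetical. The estimate is recorded at its value immediately after each death time, and with no censoring it reduces to the empirical survivor function.

```python
def kaplan_meier(times, events):
    """Return [(t, S_hat(t))] at each distinct death time.

    times  -- observed follow-up times
    events -- 1 if the event (death) was observed, 0 if right censored
    Each step multiplies the running estimate by the conditional
    probability of surviving that death time, 1 - d_j / n_j.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose observed time equals t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored individuals both leave the risk set.
        n_at_risk -= removed
    return curve


def median_survival(curve):
    """Smallest death time at which S_hat falls below 0.5, else None."""
    for t, s in curve:
        if s < 0.5:
            return t
    return None


if __name__ == "__main__":
    # Hypothetical data for illustration: 0 marks a censored time.
    example_times = [6, 7, 10, 15, 19, 25]
    example_events = [1, 0, 1, 1, 0, 1]
    km = kaplan_meier(example_times, example_events)
    for t, s in km:
        print(t, round(s, 3))
    print("median:", median_survival(km))
```

A return value of `None` from `median_survival` corresponds to the situation described for the DES group, where the estimated survivor function never drops below 0.5 during follow-up.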

[Figure: survival analysis - an overview. Display of Kaplan-Meier survivor curves. Horizontal axis: Time (months), 0 to 80; vertical axis scaled from 0.0 to 1.0.]
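The cumulative hazard obtained 'by summation', and the Nelson-Aalen alternative mentioned earlier in this entry, can be sketched in the same style. The code assumes the usual form of the Nelson-Aalen estimator, a running sum of deaths divided by the number at risk, with exp(-H) as the implied survivor estimate; the function names and the example data are hypothetical.

```python
import math


def nelson_aalen(times, events):
    """Return [(t, H_hat(t))] at each distinct event time.

    H_hat(t) is the running sum of d_j / n_j, where d_j is the number of
    events at time t_j and n_j the number still at risk just before t_j.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    cumulative_hazard = 0.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            cumulative_hazard += deaths / n_at_risk
            steps.append((t, cumulative_hazard))
        n_at_risk -= removed
    return steps


def survival_from_cumulative_hazard(steps):
    """Survivor estimate exp(-H_hat(t)) implied by the cumulative hazard."""
    return [(t, math.exp(-h)) for t, h in steps]


if __name__ == "__main__":
    # Hypothetical data for illustration: 0 marks a censored time.
    hazard_steps = nelson_aalen([6, 7, 10, 15, 19, 25], [1, 0, 1, 1, 0, 1])
    for (t, h), (_, s) in zip(hazard_steps,
                              survival_from_cumulative_hazard(hazard_steps)):
        print(t, round(h, 3), round(s, 3))
```

Plotting the steps of `nelson_aalen` against time gives the kind of cumulative hazard display that is preferred over the noisy interval-by-interval hazard estimate.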

In addition to comparing survivor functions graphically, a more formal statistical test for a group difference is often required in order to compare survival times analytically. In the absence of censoring, a nonparametric test such as the Mann-Whitney test could be used (see MANN-WHITNEY RANK SUM TEST). In the presence of censoring, the log-rank or Mantel-Haenszel test is the most commonly used nonparametric test (see MANTEL-HAENSZEL METHODS). It tests the NULL HYPOTHESIS that the population survival functions $S_1(t), S_2(t), \ldots, S_k(t)$ are the same in $k$ groups. Briefly, the test is based on computing the expected number of deaths for each observed 'death' time in the dataset, assuming that the chances of dying, given that subjects are at risk, are the same in the groups. The total number of expected deaths is then computed for each group by adding the expected number of deaths for each failure time. The test finally compares the observed number of deaths in each group with the expected number of deaths using a CHI-SQUARE TEST with $k - 1$ DEGREES OF FREEDOM (see Hosmer and Lemeshow, 1999).

The log-rank test statistic, $X^2$, weights contributions from all failure times equally. Several alternative test statistics have been proposed that give differential weights to the failure times. For example, the generalised Wilcoxon test (or Breslow test) uses weights equal to the number at risk. For the prostate cancer data in the first table the log-rank test ($X^2 = 4.4$ on 1 degree of freedom, $P = 0.036$) detects a significant group difference in favour of longer survival on DES treatment while the Wilcoxon test, which puts relatively more weight on differences between the survival curves at earlier times, fails to reach significance at the 5% test level ($X^2 = 3.4$ on 1 degree of freedom, $P = 0.065$).

Modelling survival times is useful especially when there are several explanatory variables to consider. For example, in the prostate cancer trial patients were randomised to treatment groups so that the theoretical distributions of the diagnostic factors were the same in the two groups. However, empirical distributions in the patient sample might still vary and if the prognostic variables are related to survival they might confound the group difference. A survival analysis that 'adjusts' the group difference for the prognostic factor(s) is needed. The main approaches used for modelling the effects of covariates on survival can be divided roughly into two classes - models based on assuming proportional hazards and models for direct effects on the survival times.

The main technique used for modelling survival times is due to Cox (1972) and is known as the PROPORTIONAL HAZARDS model or, more simply, Cox's regression (see COX'S REGRESSION MODEL). In essence, the technique acts as the analogue of multiple regression for survival times containing censored observations, for which multiple regression itself is clearly not suitable. Briefly, the procedure models the hazard function and central to it is the assumption that the hazard functions for two individuals at any point in time are proportional, the so-called proportional hazards assumption. In other words, if an individual has a risk of 'death' at some initial time point that is twice as high as another individual, then at all later times the risk of death remains twice as high. Cox's model is made up of an unspecified baseline hazard function, $h_0(t)$, which is then multiplied by a suitable function of an individual's explanatory variable values, to give the individual's hazard function. The interpretation of the regression parameter of the $i$th covariate, $\beta_i$, is that $\exp(\beta_i)$ gives the hazard or INCIDENCE rate change associated with an increase of one unit in the $i$th covariate, all other explanatory variables remaining constant.

Cox's regression is considered a semi-parametric procedure because the baseline hazard function, $h_0(t)$, and by implication the PROBABILITY DISTRIBUTION of the survival times, does not have to be specified. The baseline hazard is left unspecified; a different parameter is essentially included for each unique survival time. These parameters can be thought of as NUISANCE PARAMETERS whose purpose is merely to control the parameters of interest for any changes in the hazard over time.

Cox's regression can be used to model the prostate cancer survival data. To start with, a model containing only the single treatment factor is fitted. The estimated regression coefficient of a DES indicator variable is -1.98 with a standard error of 1.1. This translates into an (unadjusted) hazard ratio of $\exp(-1.98) = 0.138$. In other words, DES treatment is estimated to reduce the hazard of immediate death by 86.2% relative to PLACEBO treatment. According to a LIKELIHOOD RATIO (LR) test, the unadjusted effect of DES is statistically significant at the 5% level ($X^2 = 4.55$ on 1 degree of freedom, $P = 0.033$).

For the prostate cancer data, it is of interest to determine the effect of DES after controlling for the other prognostic variables. Likelihood ratio tests showed that dropping age and serum haemoglobin from a model that contains the treatment indicator variable and all four prognostic variables did not significantly worsen the model fit (at the 10% level); the fit of the final model is shown in the second table. After adjusting for the effects of tumour size and stage the hazard reduction for DES relative to placebo treatment is reduced to 67.1% and is no longer statistically significant (LR test: $X^2 = 0.48$ on 1 degree of freedom, $P = 0.49$). Both tumour size and Gleason index have a hazard ratio above unity, indicating that increases in tumour size and advanced stages are estimated to increase the chance of death.

survival analysis - an overview Parameter estimates from Cox's regression of survival on treatment group, tumour size and Gleason index

                 Effect estimate                                              95% CI for exp($\hat{\beta}$)
Predictor        Regression                    Standard   Hazard ratio            Lower    Upper
variable         coefficient ($\hat{\beta}$)   error      (exp($\hat{\beta}$))    limit    limit
DES              -1.113                        1.203      0.329                   0.031    3.47
Tumour size      0.0826                        0.048      1.086                   0.990    1.19
Gleason index    0.7102                        0.338      2.034                   1.049    3.95

Cox's model does not require specification of the probability distribution of the survival times. The hazard function is not restricted to a specific form and as a result the semi-parametric model has considerable flexibility and is widely used. However, if the assumption of a particular probability distribution for the data is valid, inferences based on such an assumption are more precise. For example, estimates of hazard ratios or median survival times will have smaller standard errors.

A fully parametric proportional hazards model makes the same assumptions as Cox's regression but in addition also assumes that the baseline hazard function, $h_0(t)$, can be parameterised according to a specific model for the distribution of the survival times. Survival time distributions that can be used for this purpose, i.e. that have the proportional hazards property, are principally the EXPONENTIAL, Weibull and Gompertz DISTRIBUTIONS. Different distributions imply different shapes of the hazard function, and in practice the distribution that best describes the functional form of the observed hazard function is chosen - for details see Collett (2003).

A family of fully parametric models that accommodate direct multiplicative effects of covariates on survival times and hence do not have to rely on proportional hazards are accelerated failure time models. A wider range of survival time distributions possesses the accelerated failure time property, principally the exponential, Weibull, log-logistic, generalised GAMMA or LOGNORMAL DISTRIBUTIONS. In addition, this family of parametric models includes distributions (e.g. the log-logistic distribution) that model unimodal hazard functions while all distributions suitable for the proportional hazards model imply hazard functions that increase or decrease monotonically. The latter property might be limiting, for example, for modelling the hazard of dying after a complicated operation that peaks in the post-operative period.

survival analysis - an overview Parameter estimates from log-logistic accelerated failure time model of survival on treatment group, tumour size and Gleason index

                 Effect estimate                                                    95% CI for exp($-\hat{\alpha}$)
Predictor        Regression                     Standard      Acceleration factor       Lower    Upper
variable         coefficient ($\hat{\alpha}$)   error         (exp($-\hat{\alpha}$))    limit    limit
DES              0.628                          [illegible]   0.534                     0.182    1.568
Tumour size      -0.031                         0.022         1.031                     0.988    1.077
Gleason index    -0.335                         0.203         1.398                     0.939    2.080
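The ratio-scale columns of the second and third tables can be checked from the printed coefficients and standard errors. The sketch below assumes Wald-type intervals (estimate plus or minus 1.96 standard errors on the log scale); the entry does not state how its intervals were computed, so this is an assumption, although it does reproduce the printed figures to the precision shown. The function name is hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def exp_effect_with_ci(coefficient, standard_error, sign=1.0, z=Z_95):
    """Return (estimate, lower, upper) on the ratio scale.

    sign=+1 gives exp(coefficient), e.g. a hazard ratio from a Cox model;
    sign=-1 gives exp(-coefficient), the acceleration factor convention
    used for the accelerated failure time model in this entry.
    The interval is a Wald interval, which is an assumption of this sketch.
    """
    centre = sign * coefficient
    half_width = z * standard_error
    return (math.exp(centre),
            math.exp(centre - half_width),
            math.exp(centre + half_width))


# Coefficients and standard errors as printed in the second and third tables.
hr_des = exp_effect_with_ci(-1.113, 1.203)
hr_gleason = exp_effect_with_ci(0.7102, 0.338)
af_tumour = exp_effect_with_ci(-0.031, 0.022, sign=-1.0)

if __name__ == "__main__":
    rows = [("DES hazard ratio", hr_des),
            ("Gleason index hazard ratio", hr_gleason),
            ("Tumour size acceleration factor", af_tumour)]
    for name, (estimate, lower, upper) in rows:
        print(f"{name}: {estimate:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

The percentage hazard reduction quoted in the text is one minus the hazard ratio, so a hazard ratio of 0.329 corresponds to the 67.1% reduction reported for DES after adjustment.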

The general accelerated failure time model for the effects of $p$ explanatory variables, $x_1, x_2, \ldots, x_p$, can be represented as a log-linear model for survival time, $T$, namely:

$$\ln(T) = \alpha_0 + \sum_{i=1}^{p} \alpha_i x_i + \text{error}$$

where $\alpha_1, \ldots, \alpha_p$ are the unknown coefficients of the explanatory variables and $\alpha_0$ an intercept parameter. The parameter $\alpha_i$ reflects the effect that the $i$th covariate has on log-survival time with positive values indicating that the survival time increases with increasing values of the covariate and vice versa. In terms of the original timescale, the model implies that the explanatory variables measured on an individual act multiplicatively and so affect the speed of progression to the event of interest. The interpretation of the parameter $\alpha_i$ is therefore that $\exp(\alpha_i)$ gives the factor by which any survival time percentile (e.g. the median survival time) changes per unit increase in $x_i$, all other explanatory variables remaining constant. Expressed differently, the probability that an individual with covariate value $x_i + 1$ survives beyond $t$ is equal to the probability that an individual with value $x_i$ survives beyond $\exp(-\alpha_i)t$. Hence $\exp(-\alpha_i)$ determines the change in the speed with which individuals proceed along the timescale, and the coefficient is known as the acceleration factor of the $i$th covariate. Software packages typically use the log-linear formulation.

The regression coefficients from fitting a log-logistic accelerated failure time model to the prostate cancer survival times using treatment, size of tumour and Gleason index as predictor variables are shown in the third table. The negative regression coefficients suggest that the survival times tend to be shorter for larger values of tumour size and Gleason index. The positive regression coefficient for the DES treatment indicator suggests that survival times tend to be longer for individuals assigned to the active treatment after adjusting for the effects of tumour size and stage. The estimated acceleration factor for an individual in the DES group compared with the placebo group is $\exp(-0.628) = 0.534$, i.e. DES is estimated to slow down the progression of the cancer by a factor of about 2. While possibly clinically relevant, this effect is, however, not statistically significant (LR test: $X^2 = 1.57$ on 1 degree of freedom, $P = 0.21$).

In summary, survival analysis is a powerful tool for analysing time-to-event data. The classical techniques, Kaplan-Meier estimation, Cox's regression and accelerated failure time modelling, are implemented in most general purpose STATISTICAL PACKAGES, with the S-Plus package having particularly extensive facilities for fitting and assessing nonstandard Cox models. The area is complex and one of active current research. For more recent advances, such as frailty models to include RANDOM EFFECTS, MULTISTATE MODELS to model different transition rates and models for competing risks, the reader is referred to Andersen (2002), Crowder (2001) and Hougaard (2000). SL

Andersen, P. K. (ed.) 2002: Multistate models, statistical methods in medical research 11. London: Arnold. Andrews, D. F. and Herzberg, A. M. 1985: Data. New York: Springer. Collett, D. 2003: Modelling survival data in medical research, 2nd edition. London: Chapman & Hall/CRC. Cox, D. R. 1972: Regression models and life tables (with discussion). Journal of the Royal Statistical Society, Series B 34, 187-220. Crowder, M. J. 2001: Classical competing risks. Boca Raton, FL: Chapman & Hall/CRC. Hosmer, D. W. and Lemeshow, S. 1999: Applied survival analysis. New York: John Wiley & Sons, Inc. Hougaard, P. 2000: Analysis of multivariate survival data. New York: Springer.
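As a companion to the description of the log-rank test in the survival analysis entry, the observed-versus-expected calculation for two groups can be sketched as follows. The sketch uses the Mantel-Haenszel (hypergeometric) variance, a common form of the two-group statistic that the entry itself does not spell out, so treat that choice, the function names and the example data as assumptions. For 1 degree of freedom the chi-square tail probability equals erfc(sqrt(x / 2)).

```python
import math


def chi2_1df_pvalue(x):
    """Upper tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


def logrank_two_groups(times, events, groups):
    """Two-group log-rank test; groups are coded 0 and 1.

    Returns (observed_1, expected_1, chi_square, p_value), where the
    observed and expected counts refer to deaths in group 1.
    """
    death_times = sorted({t for t, e in zip(times, events) if e == 1})
    observed_1 = 0.0
    expected_1 = 0.0
    variance = 0.0
    for t in death_times:
        # Risk sets just before time t.
        n = sum(1 for u in times if u >= t)
        n1 = sum(1 for u, g in zip(times, groups) if u >= t and g == 1)
        # Deaths occurring exactly at time t.
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        d1 = sum(1 for u, e, g in zip(times, events, groups)
                 if u == t and e == 1 and g == 1)
        observed_1 += d1
        expected_1 += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1.0 - n1 / n) * (n - d) / (n - 1)
    if variance == 0.0:
        raise ValueError("variance is zero; the test is not defined")
    chi_square = (observed_1 - expected_1) ** 2 / variance
    return observed_1, expected_1, chi_square, chi2_1df_pvalue(chi_square)


if __name__ == "__main__":
    # Hypothetical data: group 0 fails early, group 1 fails late.
    t = [1, 2, 3, 4, 5, 6]
    e = [1, 1, 1, 1, 1, 1]
    g = [0, 0, 0, 1, 1, 1]
    print(logrank_two_groups(t, e, g))
```

Applied to the statistic quoted in the entry, `chi2_1df_pvalue(4.4)` agrees with the reported P value of 0.036 to three decimal places.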

survival curve

See KAPLAN-MEIER ESTIMATION, SURVIVAL ANALYSIS - AN OVERVIEW

survival function

See SURVIVAL ANALYSIS

systematiC reviews and meta-analyals This is an approach to thc combining of n:suIas from the many individual CUNlCAL1'RIALS of a parlicuiartRatment orlherapy that may have been camed out over thc course of time. Such a procedure is ncccL:d because individual trials an: rarely large enough 10 answer the quc:slions we want to answer as reliably as we would like. In practice. most trials are too small for adequate eanclusions 10 be drawn about potentially small advantages of particular therapies. Advocacy ofl~e trials is a natural n:spanse to this situation. but it is nul always possible to launch "ery large trials before thCIBpies become widely acxeptc:d or rejected prcmatun:ly. An alcernative possibility is to examine the n:sults from all n:levant trials. a process that involves two CXJIDpanents. one qua/;IIl';ve. i.c. the extraction of the n:levantliterature and description of the awilable trials. in IeJms of their relevance and methodological stn:agths and weaknesses (the S)~/enltllic revi,.,). and the other qrlOlllilal;l'e, i.e. mathematically combining n:sults from difTerent studies, even on occasions when these studies have used diffen:nt measures to assess outcome. 1'1Iis component is known as a meill-tlllalysis (Normand. (999). Informal synthesis of evidence from different studies is. of counc., nothing new. but it is now generally accepted that meta-analysis gives the systematic n:view an objectivity that is ineVitably lacking in the classical review anicle and can alSID help the process to achieve greater pn:cision and gencnlisability of findings than any single: study. Then: remain sceptics who feel that the conclusions from a meta-analysis oRca go far beyond what the technique and the data justify. but despite such concerns. the demand for systematic reviews of healthcare intervcntions has developed rapidly during the last decade. 
initialc:d by the Widespread adoption of the principles of EVIDENCE-BASED MEDICINE both among healthcan: pnlCtitioners and policymakers. Such reviews are now inc~asingly used as a basis for both individual tmIIment decisions and the funding of hcaI~ and heallhcan: n:scarch worldwide. This growth in systematic n:views is reRccted in the cunaal stale of the COCHRANE C'OLLAIIORAtION database containing as it does mon: than 1200 complete systematic reviews. with a furtbel' 1000 due to be added soon. SystclDlllic revicwsand the subsequent mda-analysis have a number of aims: to l'C\'iew systematicaDy the available evidence from a puticuJar n:sean:h an:a: 10 provide quantitative

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ SYSTEMATIC REVIEWS AND META-ANALYSIS summaries of the results from each sbidy: to combine the ~ts 8C'I'OSS studies if appropriate - such combinatian or mndts leads to gRalcrstalislicai power inemnmting lIealment ef1'cds: to 85SCSS the amau.. of Yariability between studies: to estinudc the degR:e ofbeneftt associalcd with a particuill1'sbidy trc:abnent: to idenlify study chamcterislics associated with particularly effeclive ImItments. Ideally. the trials included in a systematic review should be clinically homogeneous. For example. they might all study a similar type of patient for a similar dUl1ltion with the same treatment in the two anns of each trial. In practice. of course, the trials included are far more likely to differ in some aspects. such as eligibility criteria. duration of treatment. length of follow-up and how ancillary c~ is used. On occasions. even treatment itself may not be identical in all the trials. This implies that. in most cin:umstances. the objective of a systematic review conn,,' be equaled with that of a Single large trial. even if thatlrial has wide eligibility. While a single trial focuses on the effect of a specific treatment in specific situations. a meta-analysis aims for a more generalisable conclusion about the effect of a generic treatment policy in a wider range of areas. When the mals included in a systematic review do differ in some of their components. therapeutic effects may very well be different, but these differences an: likely to be in the size of the effects rather than their din:ction. It would. aner all. be cxlraordinary iftrc:almenteffects ~exactly Ihc same when eslimalc:d from trials in dif1'cn:nt countries.. in different popuIalioas. in ditTcn:nt age groups or under diffi:rcnt treatment regimens. If the studies ~ big enough it would be possible to meaMR these dif1'ercnces reliably. but in most cases this will noI be possible. However. 
meta-analysis allows the investigation of sources of possible heterogeneity in the results from different trials, as we shall see later, and discourages the common, simplistic and often misleading interpretation that the results of individual clinical trials are in conflict because some are labelled 'positive' (i.e. statistically significant) and others 'negative' (i.e. statistically nonsignificant). A systematic approach to synthesising information can often estimate both the degree of benefit from a particular therapy and whether the benefit depends on specific characteristics of the studies. The selection of studies is the greatest single concern in applying meta-analysis and there are at least three important components of the selection process, namely breadth, quality and representativeness (Pocock, 1996). Breadth relates to the decision as to whether to study a very specific narrow question (e.g. the same drug, disease and setting for studies following a common protocol) or a more generic problem (e.g. a broad class of treatments for a range of conditions in a variety of settings). The broader the meta-analysis, the more difficulty there is in interpreting the combined evidence as

regards future policy. Consequently, the broader the meta-analysis, the more it needs to be interpreted qualitatively rather than quantitatively. The quality and reliability of a systematic review depend on the quality of the data in the included studies, although criticisms of meta-analyses for including original studies of questionable quality are typical examples of shooting the messenger who bears bad news. Aspects of quality of the original articles that are pertinent to the reliability of the meta-analysis include a valid RANDOMISATION process (we are assuming that in meta-analysis of clinical trials, only randomised trials will be selected), MINIMISATION of potential BIASES introduced by DROPOUTS, acceptable methods of analysis, level of BLINDING and recording of adequate clinical details. Several attempts have been made to make this aspect of meta-analysis more rigorous by applying specially constructed quality assessment scales to the candidate trials for inclusion in the analysis. Determining quality would be helped if the results from so many trials were not so poorly reported. In the future, this may be improved by the CONSORT statement (CONSOLIDATED STANDARDS FOR REPORTING TRIALS).

The representativeness of the studies in a systematic review depends largely on having an acceptable search strategy. Once the researcher has established the goals of the systematic review, an ambitious literature search needs to be undertaken, the literature obtained and then summarised. Possible sources of material include the published literature, unpublished literature, uncompleted research reports, work in progress, conference/symposia proceedings, dissertations, expert informants, granting agencies, trial registries, industry and journal hand searching. The search will probably begin by using computerised bibliographic databases of published and unpublished research review articles, for example, MEDLINE. This is clearly a sensible strategy, although there is some evidence of deficiencies in MEDLINE when searching for RANDOMISED CONTROLLED TRIALS. Ensuring that a meta-analysis is truly representative can be problematic. It has long been known that journal articles are not a representative sample of work addressed to any particular area of research. Research with statistically significant results is potentially more likely to be submitted and published than work with null or nonsignificant results, particularly if the studies are small. The problem is made worse by the fact that many medical studies look at multiple outcomes and there is a tendency for only those outcomes suggesting a significant effect to be mentioned when the study is written up. Outcomes that show no clear treatment effect are often ignored and so will not be included in any later review of studies looking at those particular outcomes. Publication bias is likely to lead to an overrepresentation of positive results.

Clearly it becomes of some importance to assess the likelihood of publication bias in any meta-analysis reported in the literature. A well-known informal method of investigating this potential problem is the so-called FUNNEL PLOT, usually a plot of a measure of a study's precision (e.g. one over the STANDARD ERROR) against effect size. The most precise estimates (e.g. those from the largest studies) will be at the top of the plot and those from less precise or smaller studies at the bottom. The expectation of a 'funnel' shape in the plot relies on two empirical observations. First, the variances of studies in a meta-analysis are not identical, but are distributed in such a way that there are fewer precise studies and rather more imprecise ones and, second, at any fixed level of VARIANCE, studies are symmetrically distributed about the MEAN. Evidence of publication bias is provided by an absence of studies on the left-hand side of the base of the funnel. The assumption is that, whether because of editorial policy or author inaction or some other reason, these studies (which are not statistically significant) are the ones that might not be published. An example of a funnel plot suggesting the possible presence of publication bias is given in the figure (taken from Duval and Tweedie, 2000). Various proposals have been made as to how to test for publication bias in a systematic review, although none of these is wholly satisfactory. The danger of the testing approach is the temptation to assume that, if the test is not significant, there is no problem and the possibility of publication bias can be conveniently ignored. In practice, however, publication bias is very likely endemic to all empirical research and so should be assumed present, whatever the result of some testing procedure with possibly low POWER.
systematic reviews and meta-analysis (a) Funnel plot of 35 simulated studies and meta-analysis with true effect size of zero: estimated effect size is 0.080 with a 95% confidence interval of [-0.018, 0.178]; (b) funnel plot as in (a) with five leftmost studies suppressed; overall effect size is now estimated as 0.124 with a 95% confidence interval of [0.037, 0.210]. Reprinted from Duval and Tweedie, 2000, with permission from The Journal of the American Statistical Association. Copyright 2000 by the American Statistical Association. All rights reserved

Once the studies for systematic review have been selected and the possible problems of publication bias addressed, effect sizes and variance estimates are extracted from the selected papers, reports, etc., and subjected to a meta-analysis, in which the aim is to provide a global test of significance for the overall NULL HYPOTHESIS of no effect in all studies and to calculate an estimate and a CONFIDENCE INTERVAL of the overall effect size. Two models are usually considered, one involving FIXED EFFECTS and the other RANDOM EFFECTS (Fleiss, 1993; Sutton et al., 2000). The former assumes that the true effect is the same for all studies, whereas the latter assumes that individual studies have different effect sizes that vary randomly around the overall mean effect size. The random effects model specifically allows for the existence of both between-study heterogeneity and within-study variability. When the research question concerns whether treatment has produced an effect, on the average, in the set of studies being analysed, then the fixed effects model for the studies may be the more appropriate; here there is no interest in generalising the results to other studies. Many statisticians believe, however, that the random effects model is more appropriate than a fixed effects model for meta-analysis, because between-study variation is an important source of uncertainty that should not be ignored when assigning uncertainty to pooled results. Tests of homogeneity are available, i.e. a test that the between-study variance component is zero; if it is, a fixed effects model is considered justified. Such a test is, however, likely to be of low power for detecting departures from homogeneity and so its practical consequences are probably quite limited. The essential feature of both the fixed and random effects models for meta-analysis is the use of a weighted mean of treatment effect sizes from the individual studies, with the weights usually being the reciprocals of the associated variances. Effect sizes might be standardised mean differences for continuous RESPONSE VARIABLES or RELATIVE RISKS AND ODDS RATIOS for binary outcomes. Both fixed effects and random effects models result in a test of zero effect size and a confidence interval for effect size. However, it should be remembered that, in general, a more important aspect of meta-analysis is often the exploration of the likely heterogeneity of effect sizes from the different studies. Random effects models, for example, allow for such heterogeneity but they do not offer any way of exploring and potentially explaining the reasons study results vary. In other words, random effects models do not 'control for', 'adjust for' or 'explain away' heterogeneity. Understanding heterogeneity should perhaps be the primary focus of the majority of meta-analyses carried out in medicine. The examination of heterogeneity may begin with formal statistical tests for its presence, but even in the absence of statistical evidence of heterogeneity, exploration of the relationship of effect size to study characteristics may still be valuable. The question of importance is: what causes heterogeneity in systematic reviews of clinical trials? Study of the causes of heterogeneity of treatment effects in a meta-analysis often involves the technique generally known as META-REGRESSION. Essentially, this is nothing more than a weighted regression analysis with effect size as the dependent variable, a number of study characteristics as explanatory variables and weights usually being the reciprocal of the sum of the estimated variance of a study and the estimated between-study variance, although other more complex approaches have been described. Meta-regression can, like subgroup analysis within a single clinical trial, quickly become little more than DATA DREDGING. This danger can be partially dealt with at least by prespecification of the covariates that will be investigated as potential sources of heterogeneity.
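The weighted means described above can be sketched in a few lines. This is a minimal illustration, not the method of any particular package: the function name `pool` is hypothetical, and the between-study variance is estimated with the DerSimonian-Laird moment estimator, which is an assumption here because the entry does not name an estimator. Cochran's Q is included as one common homogeneity statistic.

```python
import math

def pool(effects, variances):
    """Inverse-variance pooling under fixed and random effects models.

    effects   : study effect sizes (e.g. log odds ratios)
    variances : their estimated within-study variances
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed effects: weights are reciprocals of the within-study variances.
    w = [1.0 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)

    # Cochran's Q: weighted squared deviations from the fixed effects estimate.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))

    # DerSimonian-Laird estimate of between-study variance (assumed estimator).
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random effects: weights are reciprocals of (within + between) variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    random_ = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)

    return {
        "fixed": fixed, "se_fixed": math.sqrt(1.0 / sum(w)),
        "random": random_, "se_random": math.sqrt(1.0 / sum(w_re)),
        "Q": q, "tau2": tau2,
    }
```

When the estimated between-study variance is zero the two sets of weights coincide, so the fixed and random effects results are identical; when it is positive the random effects standard error is the larger of the two.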
As an example of the systematic review and associated meta-analysis we shall consider transcranial magnetic stimulation (TMS) for the treatment of depression. Such treatment involves placing a high-intensity magnetic field of brief duration at the scalp surface to induce an electrical field at the cortical surface that can alter neuronal function. Repetitive TMS (rTMS) involves applying trains of these magnetic pulses. In humans rTMS has been shown to produce changes in frontal lobe blood flow and to normalise the response to dexamethasone in depression. Since trials in the late 1990s, rTMS has been proposed as a treatment for drug-resistant depression, schizophrenia and mania. McNamara et al. (2001) report a systematic review of the published data, in which RANDOMISED CONTROLLED TRIALS were searched for using a variety of databases, including Medline and Embase. Sixteen published clinical trials of rTMS for depression were identified, but eight were excluded because there was no randomised control group


and a further three excluded for reasons given in the original paper. The results from the five trials accepted for the meta-analysis are shown in the table.

systematic reviews and meta-analysis Data for five RCTs of rTMS

[Table: rows Trial 1 to Trial 5; columns rTMS (Improved, Not improved) and Placebo (Improved, Not improved); cell counts in the order printed: 11, 6, 11, 1, 6, 7, 1, 8, 4, 4, 6, 17, 18, 8, 24, 4, 2, 4, 1, 10]

The results from both the fixed effects and random effects models are, for these data, exactly the same. The overall effect size (log odds ratio) is estimated to be 1.33 with a standard error of 0.37, leading to an estimated odds ratio of 3.78 with 95% confidence interval (1.83, 7.81). BSE (See also FOREST PLOT)
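The step from the pooled log odds ratio to the odds ratio and its interval is a simple back-transformation. The sketch below assumes a normal-approximation (Wald) interval with multiplier 1.96, which reproduces the figures quoted; the function name is hypothetical.

```python
import math

def odds_ratio_ci(log_or, se, z=1.96):
    """Back-transform a log odds ratio and its Wald interval to the odds-ratio scale."""
    return (
        math.exp(log_or),            # point estimate
        math.exp(log_or - z * se),   # lower limit
        math.exp(log_or + z * se),   # upper limit
    )

estimate, lower, upper = odds_ratio_ci(1.33, 0.37)
print(f"{estimate:.2f} ({lower:.2f}, {upper:.2f})")  # 3.78 (1.83, 7.81)
```

The interval is symmetric on the log scale but not on the odds-ratio scale, which is why the upper limit lies further from 3.78 than the lower one.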

Duval, S. and Tweedie, R. L. 2000: Nonparametric 'trim and fill' method of accounting for publication bias in meta-analysis. Journal of the American Statistical Association 95, 89-98. Fleiss, J. L. 1993: The statistical basis of meta-analysis. Statistical Methods in Medical Research 2, 121-45. McNamara, B., Ray, J. L., Arthurs, O. J. and Boniface, S. 2001: Transcranial magnetic stimulation for depression and other psychiatric disorders. Psychological Medicine 31, 1141-6. Normand, S. T. 1999: Meta-analysis: formulating, evaluating, combining and reporting. Statistics in Medicine 18, 321-59. Pocock, S. J. 1996: Clinical trials: a statistician's perspective. In Armitage, P. and David, H. A. (eds), Advances in biometry. Chichester: John Wiley & Sons, Ltd. Sutton, A. J., Abrams, K. R., Jones, D. R. and Sheldon, T. A. 2000: Methods for meta-analysis in medical research. Chichester: John Wiley & Sons, Ltd.

systematic sample

Every kth element in a list is included in the sample. To obtain such a sample, begin with the sampling frame and arrange it in an order, which may be alphabetical or some other order. Then select the number of samples to be taken. Select a random starting point in the list. Divide the size of the population by the number of samples to be taken. This is the length of interval, k. Then every kth unit, depending on the starting point, is included in the sample until the number of samples to be taken is reached. This may mean starting again from the beginning of the list. For example, Little, Keefe and White (1995) used a systematic sample when studying melanoma patients in


a general practice. Every 125th patient on the general practice register was selected to be included; this was to yield a minimum of 60 individuals. The main advantage of systematic sampling is that it is a quick and easy-to-use sampling method, particularly when dealing with large samples, where it is often used in preference to SIMPLE RANDOM SAMPLES. However, if there is a periodic cycle within the sampling frame then estimates obtained from systematic sampling may be incorrect. If the periodic cycle is recognised then the starting point and the length of interval between chosen items can be varied. Systematic sampling

can often only be used when there is a sampling frame available that can be ordered in some way. For further details see Crawshaw and Chambers (1994) and Upton and Cook (2002). SLV

Crawshaw, J. and Chambers, J. 1994: A concise course in A level statistics, 3rd edition.
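The selection procedure described in this entry can be sketched as follows. This is one reading of the text, under stated assumptions: the interval k is taken as the integer part of the population size divided by the sample size, the start is drawn uniformly from the whole list, and selection wraps round to the beginning of the list when the end is reached. The function name and the example frame are hypothetical.

```python
import random

def systematic_sample(frame, n, rng=None):
    """Select n units from an ordered sampling frame at a fixed interval k."""
    rng = rng or random.Random()
    size = len(frame)
    if not 0 < n <= size:
        raise ValueError("sample size must be between 1 and the frame size")
    k = size // n                  # length of interval
    start = rng.randrange(size)    # random starting point in the list
    # Take every kth unit, starting again from the top of the list if needed.
    return [frame[(start + i * k) % size] for i in range(n)]

# Hypothetical register of 7500 patients, sampling 60 of them (k = 125).
register = list(range(1, 7501))
chosen = systematic_sample(register, 60, rng=random.Random(2011))
```

Because n times k never exceeds the frame size, the wrap-around cannot select the same unit twice.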


TEACHING MEDICAL STATISTICS

Lectures work better when students who have been challenged with this material are then able to ask a statistician to explain things that puzzle them. The concepts acquired in this way are more likely to be backed up by other parts of their course than the calculation of CHI-SQUARED TESTS. If students can be equipped with the basic ideas of variability, measurement, RANDOMISATION, estimation and significance, we have done well. The machines can do the sums. Increasing numbers of healthcare students are taught by problem-based learning (PBL), a system intended to prepare students for a life of EVIDENCE-BASED MEDICINE. Statistics is rarely taught as part of the core PBL programme, but is instead taught as a separate addition to PBL cases, or in a separate, parallel lecture- or seminar-based course, or not at all. This is bad news not only for medical statistics (and those who teach it) but also for medicine. Surely the skills needed for the interpretation of evidence should be central in a course preparing students for evidence-based practice. It happens because tutors, mainly laboratory scientists or clinicians, feel insecure about teaching statistics and because the 'problems' of PBL are usually descriptions of a patient. Tutors need to be convinced that they do not need to know the subject to facilitate students' mutual education and course organisers need to be convinced that problems can be a publication or a community problem, that the patient case is not the only way. Healthcare practitioners are usually taught statistics as part of study for a higher professional qualification. The key application is still the interpretation of numerical evidence, mainly in the context of published research. However, they often have the more immediate goal of passing a demanding examination with a high failure rate. Some of these examinations include some quite advanced statistics, such as those in radiotherapy or public health. The teacher can make use of this by collaborating with the students to defeat the examiner and concentrating on past questions. I find that starting with a few multiple-choice questions to identify areas of difficulty and then explaining the answers the students get wrong works very well. Once the basics have been covered in this way, past examination questions form the ideal motivator. It is for the examiners to design their tests so that in order to pass them the students must learn what the examiners think they need to know. For those who do not have to satisfy an examiner but simply wish to understand their own subject's literature better, indirect teaching is frequently used. Many journals have carried long series of articles on statistics intended to help their readers understand what is published, a practice that began with the early ground-breaking Lancet articles by Bradford Hill in the 1930s and continues still. Researchers have very different needs from practitioners. They must acquire the skills to design studies and analyse data. Understanding of concepts, while still central, is not enough. Practical skills are usually developed in hands-on

computing practical classes, preferably using software of the type that they will use in their own research. Lectures have a more natural place in this teaching, as methods and their applications and limitations can be described. We can even risk a few mathematical formulae without too much discouragement of well-motivated students. The opportunity to discuss their own projects is very attractive to these students. Textbooks are particularly important to this group. At one time the market was flooded with poor books on medical statistics (Bland and Altman, 1987), but there are now many good ones. Another source of statistical education for researchers comes from individual discussions of their projects with a statistician. This is a two-way street, as they educate the statistician about the research topic and medicine in general. I have learned so much from the people who have come to me for help. For new statisticians, statistics is usually a Master's course taken by graduates in mathematics or other quantitative subjects. It is possible to study statistics as a branch of mathematics without real data making much of an appearance, but if students have chosen to study medical statistics specifically we would expect them to want a practical course with the focus on application to real problems. Clearly, they must become familiar with the common techniques of design and analysis and should be able to analyse data within both the frequentist and the Bayesian frameworks (see BAYESIAN METHODS). As nearly all statistical analysis is now done using general-purpose statistical software, they should learn the basics of the software they are likely to meet. At the time of writing, SAS, Stata and BUGS would be contenders for the programs of choice, but familiarity with other widely used or specialist software could be included. Statisticians need not only technical skills but also the ability to collaborate with and give advice to members of other disciplines.
Experience is the best teacher, but experience is what you get just after you needed it. We would like to give our students a bit of experience before they are plunged into real-life problems. Medicine and other healthcare professions have much to teach us here. I used to run a session for MSc students in medical statistics where I invited clinical researchers who sought my advice to come and get it in front of a live audience. I pointed out to them that if I went to consult them, they would do it with an audience of medical students. Perhaps we could incorporate this type of advisory clinic into our teaching. We want to enable our students not only to use the current set of statistical methods but also to develop new ones where these are needed. To this end they need some theory as well as the practice of statistics. I think that statisticians should also have a secure basis for thinking that the statistical methods that we routinely use are in some way the best methods we could use and for this reason a theoretical course will provide valuable grounding, even though they may never use it again.

We should not think that students will learn and retain all we teach them or that if we do not teach them something they will never know it. Being taught is only a part of learning and good students will continue to learn throughout their careers. What we must try to do is to give them the desire to retain what they have learned already and the ability to add to their knowledge whenever they need to. JMB

Bland, J. M. and Altman, D. G. 1987: Caveat doctor: a grim tale of medical statistics textbooks. British Medical Journal 295, 979.

tetrachoric correlation coefficient

See CORRELATION

thrive lines

See GROWTH CHARTS

time-dependent variables

Time-dependent covariates, also known as time-varying covariates or updating covariates, are variables that can change their value over time. They are particularly important for prognostic models, such as COX'S REGRESSION MODEL. They should be distinguished from fixed covariates, which are measurable at baseline and do not change with time. Examples of fixed covariates are race and sex. Age varies with time, but is completely predictable from baseline data and so is not included among time-dependent covariates. Time-dependent covariates may be classed as being internal or external (Altman and de Stavola, 1994). External factors impact on outcome but do not explicitly reference time, e.g. the half-life of a drug treatment; whereas internal factors are measurements taken at set times relating to the individual or their condition, e.g. blood pressure or blood markers. The reason for considering the inclusion of time-dependent covariates is that including only baseline variables may ignore a great deal of potential prognostic information. Therefore, the inclusion of time-dependent covariates may substantially increase the potential detail and accuracy in a model. For example, increases (or decreases) over time in patients' blood pressure may be a better predictor of future prognosis than a single baseline value of blood pressure. The Cox model can be extended to include time-dependent covariates instead of, or in addition to, fixed covariates. In simplest terms, the hazard for a time-dependent covariate takes the form $h(t) = \exp(\gamma z(t))$, where $h(t)$ is the hazard at time $t$, $z(t)$ is a time-dependent covariate and $\gamma$ is its coefficient value. As for fixed covariates, all data types can be entered as time-dependent covariates into a Cox regression model. It is important to assess the assumption of PROPORTIONAL HAZARDS once any time-dependent covariate has been taken into account. These variables do add additional complications to any model. First, they require the dataset being analysed to contain additional variables or additional observations (depending on the dataset's structure). Second, it can be difficult to obtain complete data on these variables, especially with increasing time. MISSING DATA can be problematic. Third, these variables effectively increase the choice of Cox models available for consideration. One must ensure that issues of multiplicity of testing are addressed. Finally, there are issues of interpretation. Including time-dependent covariates in a model may be practically simple, but the greater difficulty lies in interpreting the data: one must be sure how any variable would be interpreted before including it in a model. Simply, the hazard ratio for a time-dependent covariate represents an additional change in risk associated with a change in this variable over time. For example, when considering bone pain as an outcome after treatment for prostate cancer, one may wish to record the development of osteoarthritis over time as the second condition may increase the risk of bone pain. When interpreting output it can be complicated trying to tease cause from effect with such variables. MS/MP

Altman, D. G. and de Stavola, B. L. 1994: Practical problems in fitting a proportional hazards model to data with updated measurements of the covariates. Statistics in Medicine 13, 4, 301-41. Cleves, M. A., Gould, W. W. and Gutierrez, R. G. 2003: An introduction to survival analysis using Stata, revised edition. Texas: Stata Press. Machin, D. and Parmar, M. K. B. 1995: Survival analysis: a practical approach. Chichester: John Wiley & Sons, Ltd. Piantadosi, S. 1997: Clinical trials. New York: Wiley Interscience.
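The remark that time-dependent covariates require additional observations in the dataset can be pictured with a small sketch. One widely used layout, assumed here rather than prescribed by the entry, stores one row per interval over which the covariate value is constant, with the event recorded only on the final row. The function name and the blood pressure readings are hypothetical.

```python
def split_follow_up(end_time, event, measurements):
    """Expand one subject's follow-up into (start, stop] rows.

    end_time     : time of event or censoring
    event        : True if the subject had the event at end_time
    measurements : list of (time, value) pairs sorted by time, first at time 0
    """
    rows = []
    for i, (start, value) in enumerate(measurements):
        if start >= end_time:
            break  # readings taken at or after the end of follow-up are unused
        next_time = measurements[i + 1][0] if i + 1 < len(measurements) else end_time
        stop = min(next_time, end_time)
        rows.append({
            "start": start,
            "stop": stop,
            "value": value,
            # the event indicator is set only on the interval that ends follow-up
            "event": int(bool(event) and stop == end_time),
        })
    return rows

# Hypothetical subject: systolic pressure read at months 0, 4 and 8, event at month 10.
for row in split_follow_up(10, True, [(0, 120), (4, 135), (8, 150)]):
    print(row)
```

Each subject then contributes several rows instead of one, which is the sense in which the dataset gains observations.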

time series in medicine

Chatfield (1989) has defined a time series as 'a sequence of observations ordered in time'. In medicine and medical research, observations are often ordered in time and special techniques have evolved to deal with them. It is helpful to think of three types of time series: (1) single series, often long; (2) more than one series, each of moderate length; and (3) many shorter series. There are at least three reasons for collecting a single time series:

(a) To predict some future event. An example might be measuring creatinine clearance from kidney failure patients where the main aim is to predict complete kidney failure.

(b) To test whether some event in time has an effect on subsequent outcomes. These are sometimes called before-and-after studies or interrupted time series (Glass et al., 1975). Examples include the effect of seat belt legislation on deaths due to car accidents, the effect of NHS Direct on consultations with a general practitioner and behavioural experiments in psychology.

(c) To look for trends and rhythms in the series. An example would be a spectral analysis of the electroencephalogram (EEG) signal to measure the strength of alpha waves.


For series of more moderate length a common theme is to examine whether there is an ASSOCIATION between two series. Examples include studies to examine relationships between cot deaths and environmental temperature and daily deaths from heart disease and air pollution. Shorter series are often dealt with under the term 'repeated measures'. The reason for measuring observations over time is that a series would more accurately reflect the action of treatment than a single measure at one point in time. A typical example would be repeated measures in a CLINICAL TRIAL, such as blood pressure measured monthly for a year. Summary measures (Matthews et al., 1990) such as the AREA UNDER THE CURVE or the slope of the response over time are often the outcomes of interest. Repeating observations can improve the accuracy of the estimates of treatment effects. The most basic time series model is the autoregressive (AR(1)) model (Diggle, 1990). Given a time series $x_t$, from which the MEAN value has been subtracted, an AR(1) is given by:

$$x_t = \alpha x_{t-1} + \varepsilon_t, \quad \text{where } -1 < \alpha < 1$$
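As a rough illustration of the AR(1) recursion, the sketch below generates a series from a supplied noise sequence. The function name, the starting value `x0` and the choice of Gaussian noise in the example are assumptions made for illustration only.

```python
import random

def ar1(alpha, noise, x0=0.0):
    """Generate x_t = alpha * x_(t-1) + e_t for each e_t in noise."""
    if not -1 < alpha < 1:
        raise ValueError("alpha must lie strictly between -1 and 1")
    series, previous = [], x0
    for e in noise:
        previous = alpha * previous + e   # depends only on the preceding value
        series.append(previous)
    return series

# Example: 200 points with alpha = 0.8 and standard normal noise (seeded).
rng = random.Random(0)
x = ar1(0.8, [rng.gauss(0, 1) for _ in range(200)])

# With no noise the series simply decays geometrically towards zero.
print(ar1(0.5, [0.0, 0.0, 0.0], x0=8.0))  # [4.0, 2.0, 1.0]
```

Passing the noise in as an argument keeps the recursion itself deterministic, which makes it easy to check by hand.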

This model is often called a Markov model because the value at one point in time depends only on the value at the point immediately preceding it. The model is easily extended to AR($p$), where $p > 1$. For $p = 2$, certain values of the coefficients can give models in which cycles appear. Some forecasting models use autoregressive models, but in medicine these are rarely used because usually one is more interested in estimating trends, which are more easily fitted using conventional models. The complementary model to the AR(1) is the moving average model MA(1):

$$x_t = \varepsilon_t + \beta \varepsilon_{t-1}, \quad \text{where } -1 < \beta < 1$$

This model extends in the same way to MA($q$), where $q > 1$. These models can be combined to produce an autoregressive moving average (ARMA) model. This has been found to model many time series and requires only low values of $p$ and $q$. The procedure of fitting these models is often known as Box-Jenkins modelling (Box and Jenkins, 1976). In general, this type of modelling is more common in the forecasting and control of industrial processes, although on occasion it has been applied in medicine. Perhaps the most common feature of the analysis of clinical signals is to look for regularly occurring features or rhythms (Campbell, 1996). For humans to maintain stable bodily functions, clinical signals must be constrained to lie within certain limits. This is done using nonlinear feedback loops, which tend to make patterns within signals recur regularly. A simple example will illustrate the point. To remain healthy, humans must maintain blood pressure to within certain narrow limits. Blood pressure is mediated through the baroreceptors, located in the wall of the aortic

arch and in the wall of the carotid sinus. If blood pressure is too high, then signals from the baroreceptors result in vasodilation, which drops the blood pressure. If pressure is too low, then vasoconstriction occurs to increase the blood pressure. The feedback mechanism is thought to be nonlinear and incorporates a delay and, for these reasons, at-rest rhythms can occur spontaneously. Periodogram analysis involves decomposing a signal into individual frequency components where the amplitude of these components is proportional to the 'energy' of the signal at that frequency. It is a convenient method for summarising a long time series and is a natural procedure if we believe there are rhythms in the data. Periodogram analysis is the method of choice for the analysis of clinical signals. The problem with the periodogram is that it is an inconsistent estimator, in that its VARIANCE does not reduce as the sample size increases. To achieve a consistent estimator various smoothing techniques (known as 'windows') are applied to the periodogram, so that it estimates what is known as the spectrum. There are three major components to be found in a typical heart rate spectrum and these are also present in the blood pressure spectrum. A region of activity occurs at around 0.25 Hz, which is attributable to respiration (respiratory sinus arrhythmia) and this is thought to be a marker of vagal (parasympathetic) activity. A second component at around 0.1 Hz arises from spontaneous vasomotor activity within the blood pressure control system and is mediated by vagal and sympathetic activity. A third, low-frequency component at around 0.04 Hz is thought to arise from thermoregulatory activity. An example is given in the first figure on page 463 (Bernardi et al., 2001), which shows the effect of recitation of mantras or prayers on the spectrum of respiration, heart rate (RR interval), blood pressure and mid-cerebral blood flow.
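The decomposition into frequency components can be sketched with a direct discrete Fourier transform. This is a minimal, unsmoothed periodogram under stated assumptions: the scaling (squared modulus divided by the series length) is one common convention among several, the function name is hypothetical, and the test signal is an artificial sine wave at 0.1 Hz sampled once per second rather than real physiological data.

```python
import cmath
import math

def periodogram(x, fs=1.0):
    """Return (frequency, power) pairs for the mean-corrected series x.

    fs is the sampling frequency in Hz; frequencies run from fs/n up to fs/2.
    """
    n = len(x)
    mean = sum(x) / n
    centred = [v - mean for v in x]
    result = []
    for k in range(1, n // 2 + 1):
        # Fourier coefficient at frequency k * fs / n
        coef = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                   for t, v in enumerate(centred))
        result.append((k * fs / n, abs(coef) ** 2 / n))
    return result

# Artificial rhythm at 0.1 Hz (6 cycles per minute), 100 samples at 1 Hz.
signal = [math.sin(2 * math.pi * 0.1 * t) for t in range(100)]
peak_freq = max(periodogram(signal), key=lambda pair: pair[1])[0]
print(peak_freq)  # 0.1
```

No window is applied here, so this is the raw periodogram rather than a smoothed spectrum estimate; with noisy data the raw ordinates would be highly variable, as the text notes.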
II can be sec:a that n:cilalion conocnlrDtes the power of the signal at a cycle with a frequency of about 6 cycles per minute (0.1 Hz). Some signals are esseatially continuous, w~ othcn are discrete. For example. the heart rate is measured from surface electrodes on the chest from the electrocardiogram (ECO). Although the ECO is continuous. the heart nile is usually derived rrom the OR' wave in the ECO. which is a sudden spike just pn:cedilll the venbicularcontrac:tion. ThUs, the heart beat si,naI is essenlially a point process. Some authors have analysed the inlclbeat intervals. thus arriving al a spc:cbUln. which ellimales frequencies per beat. ralhcr than per uniltime. Others sample the heart rate (or RR interval) si,naI at Iqular intervals or niter the point process 10 produce a continuous signal that can be sampled. The elcctrocnc:c:phlogram (EEO) is electrical aclivity of the brain measured by electrodes at the surface of the skaD. Then: is an immc:ase amount of literature devalcd to the spectral analysis ofEEOs.ln particular. six spcctml pcakscan be idcatified. These peaks. with a typical range of frequencies

are: delta 1 (0.5-2.0 Hz), delta 2 (2.0-4.0 Hz), theta (4.0-8.0 Hz), alpha (8.0-12.0 Hz), sigma (12.0-14.0 Hz) and beta (14.0-20.0 Hz). The peaks can be used, for example, to classify different levels of sleep. Recently, there has been interest in describing neural processes in the context of nonlinear dynamics and, in particular, in the rapidly evolving field of 'deterministic' chaos.

[Figure: time series in medicine. Spectrum showing effects (in one subject) of rhythmic rituals compared with spontaneous breathing, on respiratory and cardiovascular rhythms. Note slow rhythmic oscillations (approximately 6/min) in all signals during recitation (Bernardi et al., 2001, British Medical Journal 323, with permission from the BMJ Publishing Group).]

An assumption before applying spectral analysis is that the signal is stationary (i.e. the mean and variance do not change). However, medical signals are not stationary in the usual sense. They contain rhythms that may come and go in the time interval, the frequencies may vary or amplitudes of cycles at certain frequencies increase or decrease. Spectral analysis considers the entire time interval and so cycles that only occur in part of the interval will have their spectral peaks attenuated by the low power in other parts of the interval. One solution is to divide the series up into sections and compute the spectrum for each section. The difficulty here is that it is not realistic to think of a signal being stationary in sections. A better intuitive model is one in which the signal 'evolves' slowly so that the nonstationary component is slow in comparison with the signal in which we are interested. The newly developed field of WAVELET ANALYSIS is used to analyse signals of this kind.

The main problem with time series is that serial correlation invalidates one of the main assumptions of conventional regression, namely that the errors in the model are independent of each other. A second problem is that if we are interested in the relationship between two time dependent variables y and x, any other variable associated with time will be a confounder between them. For example, if both series increase with time, then they will appear correlated. This has been the source of much amusement, with positive correlations such as those between the annual population of Holland and the number of storks' nests and sales of ice cream and deaths by drowning being quoted as evidence of causation! To make progress one has to try and fit a model that removes the confounder variables. For a continuous outcome

suppose the model is $y_t = \beta x_t + \nu_t$, $t = 1, \ldots, n$, where $y_t$ is the dependent value measured at time $t$, $x_t$ a vector of confounder and predictor variables and $\beta$ a vector of regression coefficients. If the residuals are serially correlated, then ordinary least squares does not provide valid estimates of the STANDARD ERRORS of the parameters. If we assume that $x_t$ and $\nu_t$ are generated by AR(1) processes with parameters $\alpha$ and $\gamma$, then, using ordinary least squares to estimate $\beta$, the ratio of the estimated variance to the true variance is approximately $(1 - \alpha\gamma)/(1 + \alpha\gamma)$. In general, $x_t$ and $\nu_t$ are likely to be positively autocorrelated, so that $\alpha\gamma > 0$ and the ratio is below 1. Thus the effect of ignoring serial CORRELATION is to give artificially low estimates of the standard error (SE) of the regression coefficients. This means declaring significance more often than the significance level would suggest, under the NULL HYPOTHESIS of no ASSOCIATION.

Assuming $\alpha$ is known, a method of generalised least squares known as the Cochrane-Orcutt procedure (Cochrane and Orcutt, 1949) can be employed. Write $y_t^* = y_t - \alpha y_{t-1}$ and $x_t^* = x_t - \alpha x_{t-1}$. Obtain an estimate of $\beta$ using ordinary least squares on $y_t^*$ and $x_t^*$. However, since $\alpha$ will not usually be known it can be estimated from the ordinary least squares residuals $e_t$ by:

$$\hat{\alpha} = \frac{\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=1}^{n} e_t^2}$$

This leads to an iterative procedure in which we can construct a new set of transformed variables and thus a new set of regression estimates and so on until convergence.
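The iterative Cochrane-Orcutt procedure can be sketched as follows for the simplest case of an intercept and a single predictor. This is a minimal illustration rather than the published algorithm in full: the function names, the convergence tolerance, the restriction to one predictor and the simulated data (true slope 2, true error autocorrelation 0.6) are assumptions made for the example.

```python
import random


def ols_two_columns(const, x, y):
    """Least squares fit of y = a*const + b*x by solving the 2x2 normal equations."""
    scc = sum(c * c for c in const)
    scx = sum(c * v for c, v in zip(const, x))
    sxx = sum(v * v for v in x)
    scy = sum(c * v for c, v in zip(const, y))
    sxy = sum(u * v for u, v in zip(x, y))
    det = scc * sxx - scx * scx
    a = (scy * sxx - scx * sxy) / det
    b = (scc * sxy - scx * scy) / det
    return a, b


def cochrane_orcutt(x, y, tol=1e-8, max_iter=100):
    """Iterate between estimating alpha from residuals and refitting on
    the transformed variables y*_t = y_t - alpha*y_{t-1}, x*_t = x_t - alpha*x_{t-1}."""
    n = len(y)
    a, b = ols_two_columns([1.0] * n, x, y)  # start from ordinary least squares
    alpha = 0.0
    for _ in range(max_iter):
        e = [yt - a - b * xt for xt, yt in zip(x, y)]
        new_alpha = (sum(e[t] * e[t - 1] for t in range(1, n))
                     / sum(v * v for v in e))
        y_star = [y[t] - new_alpha * y[t - 1] for t in range(1, n)]
        x_star = [x[t] - new_alpha * x[t - 1] for t in range(1, n)]
        # The intercept column is transformed too, so a stays on the original scale.
        const_star = [1.0 - new_alpha] * (n - 1)
        a, b = ols_two_columns(const_star, x_star, y_star)
        converged = abs(new_alpha - alpha) < tol
        alpha = new_alpha
        if converged:
            break
    return a, b, alpha


# Simulated example (assumed values): AR(1) predictor and AR(1) errors.
random.seed(1)
n = 2000
x, nu = [0.0], [0.0]
for _ in range(n - 1):
    x.append(0.5 * x[-1] + random.gauss(0, 1))
    nu.append(0.6 * nu[-1] + random.gauss(0, 1))
y = [1.0 + 2.0 * xt + vt for xt, vt in zip(x, nu)]
a_hat, b_hat, alpha_hat = cochrane_orcutt(x, y)
print(a_hat, b_hat, alpha_hat)
```

With several predictors the same loop applies, with the two-column fit replaced by a general least squares solver.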

The iterative Cochrane-Orcutt procedure can be interpreted as a stepwise algorithm for computing MAXIMUM LIKELIHOOD ESTIMATES of $\alpha$ and $\beta$ where the initial observation $y_1$ is regarded as fixed. If the residuals can be assumed to be normally distributed then full maximum likelihood methods are available, which estimate $\alpha$ and $\beta$ simultaneously, and this can be generalised to higher order autoregressive models. These models can be fitted using (say) PROC AUTOREG or AUTOREGRESSION in the computer packages SAS and SPSS respectively (see STATISTICAL PACKAGES). However, autocorrelation of residuals can appear because the wrong model is being fitted. For example, if the true response was quadratic and a linear model was fitted, the errors would appear as a group of negative errors, a group of positive errors and then a group of negative errors. It is a better strategy to obtain a good model than to use an autoregressive error model as a panacea for models that simply do not fit.

Interrupted time series are often either before-and-after treatment for single subjects or before-and-after intervention for populations. An important question for the analysis is whether the data are correlated. The main reason for correlation might be because the same subject is measured before and after. However, if we removed the subject effect, the data may be independent and so, for example, a two-sample STUDENT'S t-TEST would be valid. One could look for three different sorts of effect on an outcome of an intervention at one point in time: (a) a change in slope, (b) a change in level or (c) a combination of a change in slope and a change in level.

[Figure: time series in medicine. Monthly number of contacts with GPs before and after introduction of NHS Direct (Munro et al., 2000, British Medical Journal 321, 150-3, with permission from the BMJ Publishing Group). Series shown: control cooperatives; cooperatives in NHS Direct areas (all contacts, contacts dealt with by telephone advice, contacts resulting in direct contact with a doctor). Horizontal axis: month, 1997 to 1998.]

NHS Direct is a telephone system designed to relieve pressure on general practitioners (GPs). It was introduced into the UK in 1998 (Munro et al., 2000). We are interested in whether it has an effect on the number of telephone calls to GPs. The most likely model is a change in slope (see the second figure on page 464). One simple method is the following. We make the origin for time the point at which the intervention occurred. The model is $y_t = \alpha + \beta_1 t + \beta_2 t' + \varepsilon_t$, where $y_t$ is the monthly number of calls to selected practices in month $t$ and $t' = 0$ if $t < 0$, $t' = t$ if $t > 0$. Thus a test of the effect is to test whether $\beta_2 = 0$. As stated earlier, conventional regression will give invalid results if the errors are serially correlated and so we need to check the serial correlation of $\varepsilon_t$. Again, we can use certain statistical packages to fit the model assuming the errors are generated by an autoregressive process.

Many epidemiological series consist of counts and require Poisson regression rather than ordinary linear regression. We can also employ a method similar to the Cochrane-Orcutt method to allow for serial correlation and use GENERALISED ESTIMATING EQUATIONS to estimate the parameters. Campbell (1994) analysed the dependence of daily deaths from sudden infant death syndrome (SIDS) in England and Wales from 1979 to 1983 on the mean daily environmental temperature measured in London. The input was mean daily temperature and the output daily deaths due to SIDS. There is clear seasonality in the mortality series but this does not mean that there is a causal relationship between temperature and cot deaths since many factors behave seasonally, such as length of day and rainfall. It is only when these effects are removed that we can deduce a possible relationship.
A model was fitted that removed seasonality and then included a linear temperature effect. The coefficient associated with mean temperature 3-5 days before the death was -0.041 (SE 0.005). We interpret this as saying that a 1 °C drop in temperature is associated with a rise in SIDS of about 4%. Further investigations demonstrated that the relationship was approximately linear. We can test the residuals for autocorrelation, using tests such as the Durbin-Watson test (first-order AR) and the Ljung-Box test (general order). However, one should ask if it is sensible to test for serial correlation and only include serial correlation in the model if the test is significant. One should also ask why the data are serially correlated. Serial correlation could be split into intrinsic correlation (endogenous) and extrinsic correlation (exogenous). Intrinsic correlation means that the value at a particular time depends directly on the value at an earlier time. Examples include: serum cholesterol at different times, population in age groups in successive years, epidemics of measles. Extrinsic correlation occurs because both variables depend on some third (time-dependent) variable. Examples include daily SIDS, where the

deaths are not caused by epidemics and are unrelated to each other except through (say) the weather.

We will not cover repeated measures in detail here. Commonly they arise when individuals have measurements taken repeatedly over time (see REPEATED MEASURES ANALYSIS OF VARIANCE). Often the serial correlation aspect of the data can be removed by the simple expedient of using summary measures (Matthews et al., 1990). If not, usually either a simple AR(1) model is assumed, or what is known as an exchangeable correlation model or compound symmetry. This is generated by a model of the form:

$$y_{it} = \mu + a_i + \varepsilon_{it}$$

where $y_{it}$ is an outcome at time $t$ on subject $i$, $a_i$ is the effect of subject $i$, which is assumed normally distributed with variance $\sigma_a^2$, $E(a_i a_j) = 0$ when $i \neq j$ and $\varepsilon_{it}$ has variance $\sigma^2$. The effect of this is to generate a covariance matrix with $\sigma_a^2$ on the off-diagonal and $\sigma^2 + \sigma_a^2$ on the diagonal terms. Although one would expect measurements made further away to be less correlated (i.e. perhaps an AR(1)), in practice compound symmetry has been found to be a reasonable assumption in many cases.

We need to distinguish between methods where serial correlation is an important part of the model, such as for prediction, and where it is simply a nuisance. If it is a nuisance, then we need to examine intrinsic and extrinsic correlation. We should allow for serial correlation in regression modelling. Often serial correlation can be 'made to go away' and so the time series aspect is not a major concern. Compound symmetry is a useful assumption for repeated measures in RANDOMISED CONTROLLED TRIALS. MJC
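To make the compound symmetry (exchangeable correlation) structure in the time series entry above concrete, the short sketch below builds the covariance matrix implied by a subject effect with variance sigma_a^2 and an error with variance sigma^2. The function names and the numerical values are hypothetical, chosen only for illustration.

```python
def compound_symmetry(n_times, var_subject, var_error):
    """Covariance matrix of one subject's repeated measurements:
    var_subject off the diagonal, var_subject + var_error on the diagonal."""
    return [[var_subject + (var_error if row == col else 0.0)
             for col in range(n_times)]
            for row in range(n_times)]


def exchangeable_correlation(var_subject, var_error):
    """Common correlation between any two measurements on the same subject."""
    return var_subject / (var_subject + var_error)


# Assumed values: subject variance 2, error variance 1, three time points.
cov = compound_symmetry(3, var_subject=2.0, var_error=1.0)
rho = exchangeable_correlation(2.0, 1.0)
for row in cov:
    print(row)
print(rho)
```

Every pair of time points shares the same correlation, however far apart they are, which is what distinguishes this structure from an AR(1) model.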

Bernardi, L., Sleight, P., Bandinelli, G., Cencetti, S., Fattorini, L., Wdowczyc-Szulc, J. and Lagi, A. 2001: Effect of rosary prayer and yoga mantras on autonomic cardiovascular rhythms: comparative study. British Medical Journal 323, 1446-9. Box, G. E. P. and Jenkins, G. 1976: Time series analysis: forecasting and control. San Francisco: Holden Day. Campbell, M. J. 1994: Time series regression for counts: an investigation into the relationship between sudden infant death syndrome and environmental temperature. Journal of the Royal Statistical Society, Series A 157, 191-208. Campbell, M. J. 1996: Spectral analysis of clinical signals: an interface between medical statisticians and medical engineers. Statistical Methods in Medical Research 5, 51-66. Chatfield, C. 1989: The analysis of time series: an introduction, 4th edition. London: Chapman & Hall. Cochrane, D. and Orcutt, G. H. 1949: Application of least squares regression to relationships containing autocorrelated error terms. Journal of the American Statistical Association 44, 32-61. Diggle, P. J. 1990: Time series. A biostatistical introduction. Oxford: Oxford Science Publications. Glass, G. V., Willson, V. L. and Gottman, J. M. 1975: Design and analysis of time series experiments. Colorado: Colorado Associated Press. Matthews, J. N. S., Altman, D. G., Campbell, M. J. and Royston, J. P. 1990: Analysis of serial measurements in medical research. British Medical Journal 300, 230-5. Munro, J., Nicholl, J., O'Cathain, A. and Knowles, E. 2000: Impact of NHS Direct on demand for immediate care: observational study. British Medical Journal 321, 150-3.

time trade-off technique See VON NEUMANN-MORGENSTERN STANDARD GAMBLE

total fertility rate (TFR) See DEMOGRAPHY

transformations The use of transformations in statistics has a long history. For example, the Wilson-Hilferty cube root transformation for CHI-SQUARE DISTRIBUTIONS, the Fisher z-TRANSFORMATION for CORRELATIONS, the use of logarithms for biological data and the arc-sine root transformations for proportions are well-known procedures. In most cases the use of transformations is not an end in itself, but rather a means to an end. The ultimate benefit is usually not what the transformation directly achieves, but rather that it allows subsequent analysis to be simpler, more revealing or more accurate. What is most important is how the transformation aids in the interpretation and description of the data. The transformations may be applied to observations, either response or explanatory variables, or to parameters or statistics, or they might be an explicit part of a statistical model. The purposes of using transformations include:

... refers to the completeness of the scale or multi-scale questionnaire in the coverage of important areas. Subconcepts like face, ecological, decision, ..., divergent, consensual, sampling validity, comprehensiveness and feasibility have been used in studies (Svensson, 1993).

Assessments on RATING SCALES generate ordinal data with rank-invariant properties only, which means that the responses indicate an ordering and not a mathematical value. The results of statistical treatment of data must not be changed when relabelling the ordered responses (see RANK INVARIANCE). Appropriate statistical methods for evaluation of criterion and construct validity often refer to the order consistency or to the relationship between the scales of comparison.

The SCATTERPLOT of low back pain perceived by 48 patients with at least 4 years' pain history, made on a VISUAL ANALOGUE SCALE (VAS) and a verbal descriptive scale (VDS-5) having the five categories of perceived pain, 'none', 'negligible', 'moderate', 'rather severe', 'very severe', is shown in the first figure (Svensson et al., 2009). As is evident from the plot there is large overlapping; even patients with 'moderate pain' on the VDS used VAS positions from 28 to 73, and the same VAS positions were used by patients with 'rather severe pain'. The proportion of overlapping pain among all possible different pairs of data defines the measure of disorder D. D equals 0.07, which means that 7% of all possible combinations of different pairs are disordered. The expected pattern of complete order consistency, the rank-transformable pattern of agreement (RTPA), is constructed by pairing off the two sets of distributions of data against each other. The measure of disorder expresses the observed dispersion of pairs from this order consistent distribution of interchangeability between the scales. The cut-off response values for interscale calibration are also provided by the RTPA, and it is obvious that there is no linear correspondence between VAS and discrete scale assessments (see the second figure) (see RANKS, RANKING PROCEDURES) (Svensson, 1993, 2000a, 2000b).

[Figure: validity of scales. The distribution of paired assessments of back pain on a visual analogue scale for pain (VAS pain) and a five-point verbal descriptive scale for pain (VDS-5 pain), n = 48. Horizontal axis: VAS pain, 0 to 100; vertical axis: VDS-5 pain, 1 to 5.]

[Figure: validity of scales. The rank-transformable pattern of agreement (RTPA), uniquely defined by the two sets of frequency distributions of data in the first figure. Panel title: RTPA VAS vs VDS-5 back pain. Horizontal axis: VAS pain, 0 to 100.]

There are other measures that could be applied to evaluation of various kinds of validity of scales. Depending on the purpose, SPEARMAN'S RANK CORRELATION COEFFICIENT, Goodman-Kruskal's gamma (see Everitt, 1992) and Kendall's tau (see CORRELATION) could be suitable. Spearman's rank-order correlation coefficient is a commonly used nonparametric measure of ASSOCIATION. However, a strong association does not necessarily mean a high level of order consistency, and does not indicate that two scales are interchangeable. The Pearson correlation coefficient (see CORRELATION), STUDENT'S t-TEST and the ANALYSIS OF VARIANCE are also common in validity studies. A serious drawback is that these methods assume normally distributed quantitative data (see NORMAL DISTRIBUTION), and such requirements are not met by data from rating scales (Svensson, 2000). ES

Everitt, B. S. 1992: The analysis of contingency tables, 2nd edition. London: Chapman & Hall. Svensson, E. 1993: Analysis of systematic and random differences between paired ordinal categorical data (dissertation). Göteborg: Göteborg University. Svensson, E. 2000a: Comparison of the quality of assessments using continuous and discrete ordinal rating scales. Biometrical Journal 42, 417-34. Svensson, E. 2000b: Concordance between ratings using different scales for the same variable. Statistics in Medicine 19, 3483-96. Svensson, E., Schillberg, B., Kling, A. M. and Nyström, B. 2009: The balanced inventory for spinal disorders. The validity of a disease specific questionnaire for evaluation of outcomes in patients with various spinal disorders. Spine 34, 1976-83.

variance The variance is the square of the STANDARD DEVIATION. It is calculated using the following formula:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

in which $n$ is the number of observations, $\bar{x}$ is their mean, $i$ takes values from 1 to $n$ and the $\Sigma$ notation denotes the sum, i.e. $(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \ldots + (x_n - \bar{x})^2$. $s^2$ and $\sigma^2$ are used to indicate the variance. Technically, the former refers to the variance of the sample and the latter to the variance of the population, which is being estimated by the sample and is marginally smaller, since the divisor is $n$ instead of $n - 1$ in the formula. When quoting a mean to summarise data, it is also customary to quote a sample standard deviation. This is the square root of the sample variance, and is in the same units as the raw data. SRC (See also COVARIANCE MATRIX)
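As a small worked illustration of the variance entry, the sketch below computes the sample variance with divisor n - 1, the corresponding quantity with divisor n, and the sample standard deviation. The function names and the eight data values are invented for the example.

```python
import math


def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    if n < 2:
        raise ValueError("at least two observations are needed")
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def variance_divisor_n(xs):
    """The same sum of squares divided by n, which is marginally smaller."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


data = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical observations, mean 5
s2 = sample_variance(data)                # 32 / 7
sigma2 = variance_divisor_n(data)         # 32 / 8
sd = math.sqrt(s2)                        # same units as the raw data
print(s2, sigma2, sd)
```

The two divisors give noticeably different answers only for small samples; with eight observations the values here are 32/7 and 32/8.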

variance components See COMPONENTS OF VARIANCE

variogram This is a procedure that provides a description of the autocorrelation (see CORRELATION) in time series or spatial clusters. It is the latter that forms the focus for the following account. It is important to describe and model this autocorrelation so as to incorporate it into estimation and prediction procedures. For example, consider disease incidences measured at spatial locations. To construct a map, one would need to interpolate the incidence value for the locations at which it was not observed and in the absence of large-scale spatial trend such predictions should give larger weights to nearby locations if the autocorrelation were increasing with decreasing distances.

The variogram is based on the semi-variance $\gamma(x, y; h_x, h_y)$, which measures half the variance of the difference between two values of an outcome, $Z$, observed at two spatial locations referenced by the spatial coordinates $(x, y)$ and $(x + h_x, y + h_y)$. Strictly speaking, the theoretical variogram is defined as twice the semi-variance, i.e.:

$$2\gamma(x, y; h_x, h_y) = E\{[Z(x, y) - Z(x + h_x, y + h_y)]^2\}$$

but the semi-variance itself is usually referred to as the variogram. It represents an inverse measure of the statistical dependency of the variables at locations $(x, y)$ and $(x + h_x, y + h_y)$. In all generality, the variogram is a function of both the location $(x, y)$ and the distance and direction $(h_x, h_y)$. Hence, to estimate it, replicate observations at each location would be needed. In practice, only one such realisation is available. To overcome this, the intrinsic hypothesis is introduced, which makes assumptions about the difference $Z(x, y) - Z(x + h_x, y + h_y)$. It states that for the spatial area under investigation (1) the expectation of the difference is zero, i.e. that there is no spatial trend, and (2) the variance of the difference depends only on the distance vector $(h_x, h_y)$ and not the location. For variograms that reach an asymptote (so-called bounded variograms) this hypothesis is equivalent to assuming second-order stationarity of the measures themselves. Under this assumption it is possible to estimate the variogram from the data. For simplicity, further assuming that the variogram is isotropic, i.e. that it only depends on the distance, $h$, between two locations and not the direction of $(h_x, h_y)$, the variogram can be estimated by the empirical variogram:

$$\hat{\gamma}(h) = \frac{1}{2|N(h)|} \sum_{(i,j) \in N(h)} (z_i - z_j)^2$$

where $N(h)$ is the set of distinct location pairs $(i, j)$ that are distance $h$ apart, $|N(h)|$ the number of such pairs and $z_i$ and $z_j$ the observed values at these locations. To achieve reasonable numbers an estimation is carried out at discrete lags and distance bins are allowed for. Care has to be taken when choosing the number of lags, lag increments and bin widths (see Cressie, 1993).

Since the main goal of variogram analysis is to find a parsimonious description of the spatial autocorrelation structure of a variable, a variogram of a particular functional form is usually fitted to the empirical variogram. Suitable monotonically increasing functions for bounded variograms are defined through three parameters: the nugget effect represents microscale variation or measurement error, the sill represents the variance of the outcome measure and the effective range is the distance at which autocorrelation becomes negligible. For an example, the open symbols in the figure show an empirical variogram to which a spherical variogram function (a particular choice of functional form) was fitted. The curve is fully described by the parameter estimates (effective range = 0.8, sill = 1.75, nugget = 0.7) and indicates small-scale spatial autocorrelation up to a distance of 0.8.

[Figure: variogram. Spherical variogram model (curve) fitted to an empirical variogram (open symbols) by optimising an objective function (objective = 0.2504). Horizontal axis: distance.]

Once a variogram function has been identified and fitted this function is usually considered as known and fed into a prediction routine to specify the interpolation weights (see Lawson, 1998), although simultaneous maximum likelihood estimation of the variogram parameters and possible trend parameters is considered preferable (see Cressie, 1993). SL

Cressie, N. A. C. 1993: Statistics for spatial data. New York: John Wiley & Sons, Inc. Lawson, A. B. 1998: Statistical map. In Armitage, P. and Colton, T. (eds), Encyclopedia of biostatistics. Chichester: John Wiley & Sons, Ltd.

velocity charts See GROWTH CHARTS
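Returning to the variogram entry, the isotropic empirical variogram, with pairs grouped into distance bins around each lag, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the symmetric bin of half-width `tol` around each lag and the four collinear example locations are all hypothetical.

```python
import math
from itertools import combinations


def empirical_variogram(coords, values, lags, tol):
    """For each lag h, average (z_i - z_j)^2 / 2 over all distinct pairs of
    locations whose separation lies within tol of h.

    Returns one estimate per lag, or None where no pair falls in the bin.
    """
    sums = [0.0] * len(lags)
    counts = [0] * len(lags)
    points = list(zip(coords, values))
    for (p, zp), (q, zq) in combinations(points, 2):
        distance = math.dist(p, q)
        for k, h in enumerate(lags):
            if abs(distance - h) <= tol:
                sums[k] += (zp - zq) ** 2
                counts[k] += 1
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]


# Hypothetical data: four locations on a line, one unit apart.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]
gamma = empirical_variogram(coords, values, lags=[1.0, 2.0, 3.0], tol=0.25)
print(gamma)
```

In this toy example the estimate rises with distance, as expected when nearby values are more alike than distant ones; fitting a functional form such as the spherical model would be a separate optimisation step.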

visual analogue scale These are scales used to measure a subjective assessment, such as 'amount of pain' or 'level of anxiety', particularly when the assessment is believed to lie along a continuum rather than only taking a discrete set of values. The item consists of a line, typically 10 cm in length, with lowest and highest values indicated by labels at each end. The subject is expected to place a mark on the line to represent his or her assessment. An example follows (with a cross indicating where a subject has placed a mark):

How much pain do you feel?

No pain |----X-----------------------------------| Unbearable pain

The data from a visual analogue scale are recorded by measuring how far along the line from the left end the subject has placed a mark. It is important to remember that, although it is possible to record the data from a visual analogue scale with great accuracy, the value is very subjective. In the example, the subject's response may be recorded as 1.1 (because it is 1.1 cm from the left-hand end of the line), but there is no objective unit on this value. If another subject records a value of 2.1, it is not necessarily the case that this subject experiences more pain than the first subject. However, if the first subject is measured again (say, after a month) and gives a score of 2.1, it is possible to interpret this to mean that the first subject is now experiencing more pain than previously.

It is also important to remember that a visual analogue scale is unlikely to be linear. For example, a distance of 1 cm at one end of the scale does not necessarily represent the same difference as a distance of 1 cm at the other end of the scale. This cautions against the use of standard methods for continuous data; a common recommendation is to analyse ranks of the scores rather than the raw scores. For further details see Altman (1991) and Streiner and Norman (1995). PM

Altman, D. G. 1991: Practical statistics for medical research. London: Chapman & Hall. Streiner, D. L. and Norman, G. R. 1995: Health measurement scales: a practical guide to their development and use, 2nd edition. Oxford: Oxford University Press.
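The recommendation in the visual analogue scale entry to analyse ranks rather than raw scores can be illustrated with a short sketch that converts scores to ranks, giving tied scores the average of the ranks they share. The function name and the example scores are hypothetical.

```python
def midranks(scores):
    """Ranks from 1 (smallest score) upwards; tied scores share their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks


vas_scores_cm = [1.1, 2.1, 2.1, 5.0]   # hypothetical marks, in cm from the left end
print(midranks(vas_scores_cm))
```

Only the ordering of the marks is used, so any monotone stretching of the line leaves the ranks unchanged.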

Von Neumann-Morgenstern standard gamble This classic method of measuring preferences in health economics was first presented in von Neumann and Morgenstern (1953). The method uses hypothetical lotteries as a means of measuring people's preferences when faced with a choice between treatments that offer potential benefit in quality of life (see QUALITY OF LIFE MEASUREMENT), but with the trade-off that there is a finite possibility that the patient will not survive the treatment. An individual might be asked to choose between the certainty of surviving a fixed period in a particular state of ill health and a gamble between surviving for the same period without disability, on the one hand, and immediate death, on the other. The probability of surviving without disability, as opposed to dying, is then varied until the person shows no preference between the certain option and the gamble. This probability then defines the utility of an individual for the disabled state between 0 and 1, whose ENDPOINTS are death and perfect health.

Because few patients are accustomed to dealing in probabilities, an alternative method called the time trade-off technique is often adopted. This begins by estimating the likely remaining years of life for a healthy subject, using actuarial tables, and then the following question is asked: 'Imagine living the remainder of your natural span (an estimated number of years would be given) in your present state. Contrast this with the alternative that you remain in perfect health for fewer years. How many years would you sacrifice if you could have perfect health?'

An example of the use of the standard gamble approach is given in Petrou and Campbell (1997) who use it to estimate utilities for a range of health states in colorectal carcinoma. They were able to demonstrate that the quality of life benefits of stabilisation in the treatment of advanced metastatic colorectal cancer were rated almost as highly as those of a partial response. They also showed that the benefits of a drug licensed for the treatment of colorectal cancer in patients who had failed an established 5-FU-containing regimen outweighed the short-term impact of toxicity in those patients who achieved at least a stabilisation of their disease. BSB

Petrou, S. and Campbell, N. 1997: Stabilisation in colorectal cancer. International Journal of Palliative Nursing 3, 275. von Neumann, J. and Morgenstern, O. 1953: Theory of games and economic behavior. New York: John Wiley & Sons, Inc.

W

washout period See CROSSOVER TRIALS

wavelet analysis This is a method of representing a function by projecting it on to a collection of basis functions derived from a single wavelet (often referred to as the mother wavelet $\psi(t)$). All basis functions (wavelets) required in the analysis are translated (shifted) and dilated (stretched) versions of the mother wavelet. Unlike the Fourier transform (see Subba Rao, 1998), whose basis functions are derived from sine and cosine waves with persistent oscillations, the basis functions for the wavelet transform are nonzero and oscillate for a short interval. As a result, the wavelet transform simultaneously localises information from a function in both time and frequency. For functions with time-varying characteristics or sudden changes, the wavelet transform has proved quite useful.

Two main flavours of the wavelet transform are the continuous wavelet transform (CWT) and the discrete wavelet transform (DWT). They differ by the fact that the transform works with continuous or discrete translations and dilations, respectively, of the wavelet function. The CWT is a highly redundant transform with the family of wavelets being computed via $\psi(at + b)$, where $a$ and $b$ are real numbers. In general, the number of wavelet coefficients is much greater than the number of observations. Popular continuous wavelet functions include the Morlet, Mexican hat and first derivative of a Gaussian density function (top row of the figure). Notice that all the wavelet functions oscillate, i.e. they have both positive and negative values, and the Morlet wavelet is complex valued (the real and imaginary portions are plotted using different line types).

The DWT uses a wavelet that is translated and dilated by discrete values, i.e. of the form $\psi(2^j t + k)$ where $j$ and $k$ are integers. The parameter $j$ is commonly referred to as the scale. The DWT may be an orthogonal or biorthogonal transform depending on the wavelet function. Popular orthogonal discrete wavelet functions include the Haar and Daubechies families of wavelets (bottom row of the figure). Although the discrete wavelet functions displayed look continuous they are derived from two, four and eight unique values, from left to right. Notice, the discrete wavelet functions are not symmetric, except for the Haar wavelet, and much less smooth when compared to continuous wavelet functions. A compromise between the CWT and DWT is partially achieved by using the translation invariant DWT where a discrete wavelet function is applied to all possible integer shifts over the data in time via $\psi(2^j t + k)$. This results in a redundant

[Figure: wavelet analysis. Top row: Morlet, Mexican hat, first derivative of Gaussian. Bottom row: Haar, Daubechies (length 4), Daubechies (length 8).]
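As an illustration of the discrete wavelet transform described in the wavelet analysis entry, the sketch below implements the orthogonal Haar DWT, the simplest member of the families mentioned. The function names, the sign convention for the detail coefficients and the step-shaped example signal are assumptions made for the example; the series length is assumed to be a power of two.

```python
import math


def haar_step(signal):
    """One level of the Haar DWT: scaled pairwise sums (smooth part)
    and scaled pairwise differences (detail part)."""
    scale = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt(signal):
    """Full Haar decomposition of a series whose length is a power of two.

    Returns the final smooth coefficient and a list of detail coefficients,
    finest scale first.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("series length must be a power of two")
    approx = list(signal)
    details = []
    while len(approx) > 1:
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx[0], details


# Hypothetical signal with a sudden change half-way through.
signal = [4.0, 4.0, 4.0, 4.0, 0.0, 0.0, 0.0, 0.0]
smooth, details = haar_dwt(signal)
energy_in = sum(v * v for v in signal)
energy_out = smooth ** 2 + sum(d * d for level in details for d in level)
print(smooth, details)
```

Because the transform is orthogonal, the sum of squares of the coefficients equals that of the original series, and the sudden change shows up only in the coarsest detail coefficient.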

[Table fragment: which test to use]
Independent: χ²-test (r × k); Kruskal-Wallis test, Jonckheere-Terpstra test; analysis of variance (ANOVA)
Related: Cochran Q-test; Friedman test; repeated measures ANOVA
Association between two variables: contingency coefficient; Spearman's rank correlation, Kendall's tau correlation; Pearson product-moment correlation
Agreement between two variables: kappa coefficient; weighted kappa coefficient; limits of agreement

... MEDICAL STATISTICS); e.g. www.whichtest.info/index.htm may prove helpful.

Three factors influence choice of statistical test: the nature of the response (type of data being analysed); the number of groups sampled (one, two or many); and, if more than one, the nature of the sampling (matched or independent). The response or outcome variable can be continuous and approximately normally distributed or dichotomous (a binary 'yes/no' outcome) or intermediate to these in a variety of ways. For example, the response variable could be in ordered categories (see ANALYSIS OF ORDINAL DATA). Otherwise, the response variable could be continuous and nonnormally distributed, being skewed or containing OUTLIERS, perhaps. In either of the latter cases it is appropriate to apply one of the many NONPARAMETRIC METHODS. In the special case of the response variable being the time until an event, which may or may not have occurred by the time of analysis (strictly, database closure), then survival methods would be used to handle the CENSORED OBSERVATIONS. Notably, this entails a version of the log-rank test or one of its alternatives and can be stratified or not depending on the structure of the data (see SURVIVAL ANALYSIS).

The number of groups being sampled is generally obvious, although sometimes care must be taken about analysing the correct statistical unit. In a cluster randomised study, for instance, it is the clusters that need to be analysed, not the individuals forming the clusters. When repeated measurements are taken, while more sophisticated approaches can be adopted, the simplest is to convert each individual's data into a suitable summary statistic prior to analysis. For example, this statistic might be the AREA UNDER THE CURVE, slope of the regression line or MEAN observation, etc., depending on whatever was previously decided to be the most clinically meaningful. In practice, note that statistical convenience should not be the criterion for choosing among possible summary statistics (see SUMMARY MEASURE ANALYSIS).
Lastly, the relationship among groups is crucial for deciding on the correct testing procedure. In the simplest case involving two groups, one needs to know whether sample data were paired (also known as matched, related or dependent) or unpaired (unmatched, unrelated or independent). This is usually straightforward, e.g. whenever data are collected on the same patients before and after an intervention or when a pair of organs (ear, eye, hand, kidney, etc.) is measured within the same person or when twins are studied within a controlled experiment. It can be less clear how best to analyse data in certain CASE-CONTROL STUDIES, matched by sex and age to within a fixed number of years, however. This is because, often, the purpose of matching is to create broadly comparable groups according to basic demographic status, rather than attempting to achieve precisely well-matched pairs (see MATCHING).
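The summary-statistic approach to repeated measurements described above can be illustrated with a minimal sketch. It reduces each individual's series to one number, here the area under the curve by the trapezoidal rule, so that a simple test can then be applied to one value per person. The function name, time points and measurements are all hypothetical and chosen only for illustration; which summary statistic to use should be decided on clinical grounds, as the text stresses.

```python
def area_under_curve(times, values):
    """Trapezoidal area under one individual's response curve."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two matched time/value pairs")
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        # Area of the trapezium between consecutive measurements.
        area += width * (values[i] + values[i - 1]) / 2.0
    return area


# Hypothetical repeated measurements (illustrative values only).
times = [0, 1, 2, 4]
patients = {
    "A": [5.0, 7.0, 6.0, 4.0],
    "B": [5.0, 5.0, 5.0, 5.0],
}

# One summary value per individual; these, not the raw series,
# would then be compared between groups.
summaries = {pid: area_under_curve(times, vals) for pid, vals in patients.items()}
print(summaries)
```

The same pattern applies to other summaries such as the mean observation or the slope of a fitted line: only the reducing function changes.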

The table shows, according to the three basic criteria, when to use which test method in the simplest cases. For completion, it also indicates which procedure applies when assessing ASSOCIATION or AGREEMENT. Again, further details can be found under individually named entries. However, as emphasised throughout, confidence intervals are preferred to tests and, for more informative analyses, modelling or regression techniques can be better still. These provide mutually adjusted results for important confounders, an altogether more satisfactory approach to handling data and superior to expecting it to be adequately described by a P-value, as if relationships within the data could possibly be encapsulated by a single number, a hopelessly false ambition. Nevertheless, viewed positively, the right hypothesis test can help to rule out chance as an explanation for discrepant data apparent in one or more random samples, and lead the investigator on towards a fuller analysis of the data collected and, in turn, a deeper clinical understanding. CRP (See also EXACT METHODS FOR CATEGORICAL DATA, HYPOTHESIS TESTS)

Wilcoxon rank sum test

See MANN-WHITNEY RANK SUM TEST

Wilcoxon signed rank test

This is a nonparametric version of the paired t-test (see STUDENT'S t-TEST) used for two groups that are either matched or paired. It is more sensitive than the SIGN TEST as it uses the magnitude of the differences between the pairs, not simply the sign of the difference. It gives more weight to pairs that show large differences than those that show small differences. The Wilcoxon signed rank test tests the assumption that the sum of the positive RANKS equals the sum of the negative ranks. The test statistic is the smaller of the sum of the positive and the sum of the negative ranks. The data should be continuous or ordinal in

Wilcoxon signed rank test  Mcm2 and Ki67 values, data from a study of patients with cancer

Patient   Mcm2    Ki67    Difference   Rank of difference   Signed rank of difference
1         14.78   14.78     0
2          7.96    8.68    -0.72       1                    -1
3         10.89    1.57     9.32       2                    +2
4         12.10    1.85    10.25       3                    +3
5         18.23    5.84    12.39       4                    +4
6         16.40    3.04    13.36       5                    +5
7         18.02    3.96    14.06       6                    +6
8         23.35    8.16    15.19       7                    +7
9         26.70    8.40    18.30       8                    +8


nature. The paired differences should be independent and symmetrical about the true median difference. First, calculate the difference for each pair (variable 1 − variable 2). Then rank the absolute values of the differences, lowest to highest, assigning average ranks to ties in the differences and no rank to zero differences. Find the sum of the ranks for the positive differences (W+) and the sum of the ranks for the negative differences (W−). Find N, the total number of differences not including ties (zero differences). Find the critical value from tables of Wilcoxon's W, taking W = min(W+, W−); reject the null hypothesis if W is less than or equal to the critical value.

In a study of patients with cancer, Mcm2 and Ki67 values were compared to see if there was a difference between them; the data are shown in the table. A plot of the differences was used to check the assumption of symmetry. The test was performed with W = min(W+, W−) = 1 and N = 8. From tables (N = 8, α = 0.05) the critical value is 3. As 1 is less than 3, there is sufficient evidence to reject the null hypothesis. Therefore there is a difference between Mcm2 and Ki67 values: Mcm2 values tend to be higher than Ki67 values. For further details see Pett (1997), Siegel and Castellan (1988) and Swinscow and Campbell (2002). SLV

Pett, M. A. 1997: Nonparametric statistics for health care research. Thousand Oaks: Sage.
Siegel, S. and Castellan, N. J. 1988: Nonparametric statistics for the behavioral sciences, 2nd edition. New York: McGraw-Hill.
Swinscow, T. D. V. and Campbell, M. J. 2002: Statistics at square one, 10th edition. London: BMJ Books.
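The ranking procedure in the Wilcoxon signed rank test entry can be traced with a short sketch. It uses the Mcm2 and Ki67 values from the entry's table, with the Ki67 values for patients 4 and 5 taken as 1.85 and 5.84, the values consistent with the tabulated differences. The function name is hypothetical. The sketch computes only W+, W−, W and N; the critical value still has to be looked up in tables, as the entry describes.

```python
def wilcoxon_signed_rank(first, second):
    """Return (W+, W-, W, N) for paired samples.

    Zero differences receive no rank; tied absolute differences
    share the average of the ranks they occupy.
    """
    # Rounding guards against floating-point noise in the subtraction.
    diffs = [round(a - b, 10) for a, b in zip(first, second)]
    nonzero = sorted((d for d in diffs if d != 0), key=abs)

    w_plus = 0.0
    w_minus = 0.0
    i = 0
    while i < len(nonzero):
        # Find the run of tied absolute differences starting at i.
        j = i
        while j < len(nonzero) and abs(nonzero[j]) == abs(nonzero[i]):
            j += 1
        average_rank = (i + 1 + j) / 2.0  # mean of ranks i+1, ..., j
        for d in nonzero[i:j]:
            if d > 0:
                w_plus += average_rank
            else:
                w_minus += average_rank
        i = j
    return w_plus, w_minus, min(w_plus, w_minus), len(nonzero)


mcm2 = [14.78, 7.96, 10.89, 12.10, 18.23, 16.40, 18.02, 23.35, 26.70]
ki67 = [14.78, 8.68, 1.57, 1.85, 5.84, 3.04, 3.96, 8.16, 8.40]

w_plus, w_minus, w, n = wilcoxon_signed_rank(mcm2, ki67)
print(w_plus, w_minus, w, n)
```

For these data the single negative difference has the smallest magnitude, so W− = 1, W+ = 35, W = 1 and N = 8, matching the figures quoted in the entry.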

WinBUGS See BUGS AND WINBUGS

WLSE Abbreviation for weighted least squares estimator. See LEAST SQUARES ESTIMATION

Y

Yates' correction This is an adjustment made to a CHI-SQUARE TEST when the number of observations is small. Yates' correction is an example of a continuity correction and is designed specifically for 2 × 2 frequency tables. The chi-square test uses the CHI-SQUARE DISTRIBUTION to determine whether a set of observed counts from a study (the number of observations in each cell of a frequency table) differ significantly from the expected counts predicted by a hypothesis. This use of the chi-square distribution is an approximation based on the use of the NORMAL DISTRIBUTION to approximate the distribution of the number of observations in each cell of the frequency table. The use of the normal distribution is only an approximation because the number of observations in a cell of a frequency table is a discrete value (it can only take nonnegative integer values: 0, 1, 2, ...) whereas the normal distribution is continuous (it can take any value). When the number of observations is large this difference becomes irrelevant (the approximation becomes very good), but when the number of observations is small the approximation can mean that the standard chi-square test result is invalid (the P-VALUE reported by the test may be too low).

Yates' correction is an attempt to take account of the approximation in the calculation of the chi-square test statistic. The effect of the correction is to decrease the size of the test statistic, which increases the P-value from the test. This means that Yates' correction will always give a more conservative test; it will be less likely to report a significant result. There is no universal agreement on when Yates' correction should be applied. With ever greater processing speeds and memory capacity it is now usually feasible to use exact methods instead, which avoid approximations altogether (see FISHER'S EXACT TEST). For further details see Altman (1991). PM

Altman, D. G. 1991: Practical statistics for medical research. London: Chapman & Hall.
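The effect of Yates' correction on a 2 × 2 table can be seen in a small sketch. The entry does not print the formula; the code uses the usual form of the correction, in which 0.5 is subtracted from each absolute difference between observed and expected counts before squaring. The function name and the counts are hypothetical, and clamping the adjusted difference at zero is an implementation choice made for this sketch. The P-value uses the identity that, for one degree of freedom, the upper tail of the chi-square distribution at x equals erfc(sqrt(x/2)).

```python
import math


def chi_square_2x2(table, yates=False):
    """Chi-square statistic and P-value (1 df) for a 2 x 2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            deviation = abs(observed - expected)
            if yates:
                # Continuity correction; never push the deviation below zero.
                deviation = max(deviation - 0.5, 0.0)
            statistic += deviation ** 2 / expected

    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts, for illustration only.
counts = [[8, 2], [3, 7]]

plain_stat, plain_p = chi_square_2x2(counts)
yates_stat, yates_p = chi_square_2x2(counts, yates=True)
print(plain_stat, plain_p)
print(yates_stat, yates_p)
```

As the entry states, the corrected statistic is smaller and its P-value larger, so the corrected test is the more conservative of the two. The sketch assumes every expected count is positive.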


z-scores This is a way of expressing measurements so that they can be compared across different scales. Sample values X are converted to z-scores Z using the formula:

\[ Z = \frac{X - \mu}{\mathrm{SD}} \]

where \(\mu\) and SD are the MEAN and STANDARD DEVIATION (SD) for X, obtained either from the sample or from an external reference. The mean and SD of Z are close to zero and one, respectively. z-scores correspond to quantiles of the distribution: a z-score of −0.67 corresponds to the 25th centile, and positive z-scores correspond to quantiles in the upper half of the distribution.

The LMS method extends the definition of the z-score to adjust for SKEWNESS:

\[ Z = \frac{(X/M)^{L} - 1}{L \times S} \]

where M is the median, S the coefficient of variation and L the Box-Cox power needed to remove the skewness. When L = 1 (i.e. a normal distribution) this simplifies to the first equation.
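The two z-score formulas in this entry can be compared in a brief sketch. The function names and numerical values are hypothetical. The sketch assumes that S is the SD divided by the median and handles only a nonzero power L; the limiting case L = 0 is not covered by the text above and is not attempted here.

```python
def z_score(x, mean, sd):
    """Ordinary z-score: distance from the mean in SD units."""
    return (x - mean) / sd


def lms_z_score(x, L, M, S):
    """Skewness-adjusted z-score from the LMS formula, for L != 0.

    L is the Box-Cox power, M the median and S the coefficient of variation.
    """
    if L == 0:
        raise ValueError("this sketch covers only L != 0")
    return ((x / M) ** L - 1.0) / (L * S)


# Illustrative values: centre 10 with SD 2, so S = 2 / 10 = 0.2.
simple = z_score(12.0, mean=10.0, sd=2.0)
adjusted = lms_z_score(12.0, L=1.0, M=10.0, S=0.2)
print(simple, adjusted)
```

With L = 1 the LMS expression becomes (X − M)/(M × S), so the two functions agree whenever M is the mean and M × S is the SD, which is the simplification the entry describes.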
