ADVANCED TEXTS IN ECONOMETRICS
General Editors
C. W. J. GRANGER    G. E. MIZON
CO-INTEGRATION, ERROR CORRECTION, AND THE ECONOMETRIC ANALYSIS OF NON-STATIONARY DATA
Anindya Banerjee, Juan J. Dolado, John W. Galbraith, and David F. Hendry
OXFORD UNIVERSITY PRESS
This book has been printed digitally and produced in a standard specification in order to ensure its continuing availability
OXFORD UNIVERSITY PRESS
Great Clarendon Street, Oxford OX2 6DP

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide in
Oxford New York
Auckland Bangkok Buenos Aires Cape Town Chennai Dar es Salaam Delhi Hong Kong Istanbul Karachi Kolkata Kuala Lumpur Madrid Melbourne Mexico City Mumbai Nairobi Sao Paulo Shanghai Taipei Tokyo Toronto

Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries

Published in the United States by Oxford University Press Inc., New York

© A. Banerjee, J. J. Dolado, J. W. Galbraith, and D. F. Hendry 1993

The moral rights of the author have been asserted
Database right Oxford University Press (maker)
Reprinted 2003

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above

You must not circulate this book in any other binding or cover and you must impose this same condition on any acquirer

ISBN 0-19-828810-7
Preface

This book is intended as a guide to the literature on co-integration and modelling of integrated processes. Time-series econometrics has developed rapidly during the past decade, but especially so in the analysis of non-stationarity. In particular, the study of integrated processes has grown in importance from the status of an exotic topic, discussed only in technical journals, to being an essential part of the econometrician's collection of techniques. It has thereby developed into an area of interest for econometric theorists and applied econometricians alike. This book is aimed at graduate students in economics, applied econometricians, econometric theorists, and the general audience of economists who use empirical methods to analyse time series.

Despite the growing importance of the literature on integration and co-integration, most accounts of this literature remain confined to journals, edited collections of papers, or survey papers. While some of the surveys are quite detailed, space restrictions usually do not allow a full exposition of many of the theoretical points. This book attempts to bridge the gap between accounts such as surveys, which are mainly descriptive, and accounts that are mainly theoretical. It explains the important concepts informally and also presents them formally. The asymptotic theory of integrated processes is described and the tools provided by this theory are used to derive, in some detail, the distributions of estimators. By taking readers step by step through some of the main derivations, our hope is to make the theory readily accessible to a wide audience.

We have tried to make the book as self-contained as possible. A knowledge of econometrics, statistics, and matrix algebra at the level of a final-year undergraduate or first-year graduate course in econometrics is assumed, but otherwise all of the important statistical concepts and techniques are described.

A book such as this one, which discusses an area that is developing rapidly, is inevitably incomplete and runs the risk of not being quite up-to-date. To limit the time taken in writing and revising, we did not seek to chase a frontier that was expanding in many directions. Rather, the topics covered reflect our views of issues, models, and methods that are likely to remain important for some time to come, many of which will continue to provide the platform for future research.
Acknowledgements

Our book was written in two continents, three years, and four universities, so the list of people, across time, space, and departments, to whom we owe extensive debts of gratitude has grown formidably large. A major part of this debt is owed to the Departments of Economics at the Universities of California at San Diego, Florida in Gainesville, McGill, and Oxford, and the Bank of Spain, where the authors either worked or visited for substantial periods. Their generous support of our work is much appreciated.

The book has also benefited greatly from the patient scrutiny of several of our colleagues, who read the entire typescript and made detailed comments. We have pleasure in thanking Michael Clements, Rob Engle, Neil Ericsson, Tony Hall (and several of his students), Colin Hargreaves, Søren Johansen, Katarina Juselius, Teun Kloek, James MacKinnon, G. S. Maddala, Grayham Mizon, Jean-François Richard, Mark Rush, Neil Shephard, Timo Teräsvirta, and four anonymous referees for their help. They have made a great contribution to this book, and found many infelicities in earlier versions, but of course are not responsible for any that remain.

Early versions of the book were inflicted by us upon our graduate students. Among those who suffered from the confusion caused by obscure notation and prose, but continued unflinchingly, Hughes Dauphin, Carol Dole, Jesus Gonzalo, Catherine Liston, Claudio Lupi, Neil Rickman, and Geeta Singh deserve special thanks. We are also indebted to Julia Campos, Michael Clements, Steven Cook, Neil Ericsson and Claudio Lupi for proof reading.

The financial support of the Economic and Social Research Council (UK) under grants B01250024 and R231184 and the Fonds pour la Formation des Chercheurs et l'Aide à la Recherche (Quebec) is gratefully acknowledged. Finally, we thank Andrew Schuller and the editors of this series, who remained encouraging about the project despite its many difficulties.

Oxford      A. B.
Madrid      J. J. D.
Montreal    J. W. G.
Oxford      D. F. H.
Contents

Notational Conventions, Symbols, and Abbreviations

1. Introduction and Overview
   1.1. Equilibrium relationships and the long run
   1.2. Stationarity and equilibrium relationships
   1.3. Equilibrium and the specification of dynamic models
   1.4. Estimation of long-run relationships and testing for orders of integration and co-integration
   1.5. Preliminary concepts and definitions
   1.6. Data representation and transformations
   1.7. Examples: typical ARMA processes
   1.8. Empirical time series: money, prices, output, and interest rates
   1.9. Outline of later chapters
   Appendix

2. Linear Transformations, Error Correction, and the Long Run in Dynamic Regression
   2.1. Transformations of a simple model
   2.2. The error-correction model
   2.3. An example
   2.4. Bårdsen and Bewley transformations
   2.5. Equivalence of estimates from different transformations
   2.6. Homogeneity and the ECM as a linear transformation of the ADL
   2.7. Variances of estimates of long-run multipliers
   2.8. Expectational variables and the interpretation of long-run solutions

3. Properties of Integrated Processes
   3.1. Spurious regression
   3.2. Trends and random walks
   3.3. Some statistical features of integrated processes
   3.4. Asymptotic theory for integrated processes
   3.5. Using Wiener distribution theory
   3.6. Near-integrated processes

4. Testing for a Unit Root
   4.1. Similar tests and exogenous regressors in the DGP
   4.2. General dynamic models for the process of interest
   4.3. Non-parametric tests for a unit root
   4.4. Tests on more than one parameter
   4.5. Further extensions
   4.6. Asymptotic distributions of test statistics

5. Co-integration
   5.1. An example
   5.2. Polynomial matrices
   5.3. Integration and co-integration: formal definitions and theorems
   5.4. Significance of alternative representations
   5.5. Alternative representations of co-integrated variables: two examples
   5.6. Engle-Granger two-step procedure

6. Regression with Integrated Variables
   6.1. Unbalanced regressions and orthogonality tests
   6.2. Dynamic regressions
   6.3. Functional forms and transformations
   Appendix: Vector Brownian Motion

7. Co-integration in Individual Equations
   7.1. Estimating a single co-integrating vector
   7.2. Tests for co-integration in a single equation
   7.3. Response surfaces for critical values
   7.4. Finite-sample biases in OLS estimates
   7.5. Powers of single-equation co-integration tests
   7.6. An empirical illustration
   7.7. Fully modified estimation
   7.8. A fully modified least-squares estimator
   7.9. Dynamic specification
   7.10. Examples
   Appendix: Covariance Matrices

8. Co-integration in Systems of Equations
   8.1. Co-integration and error correction
   8.2. Estimating co-integrating vectors in systems
   8.3. Inference about the co-integration space
   8.4. An empirical illustration
   8.5. Extensions
   8.6. A second example of the Johansen maximum likelihood approach
   8.7. Asymptotic distributions of estimators of co-integrating vectors in I(1) systems

9. Conclusion
   9.1. Summary
   9.2. The invariance of co-integrating vectors
   9.3. Invariance of co-integration under seasonal adjustment
   9.4. Structured time-series models and co-integration
   9.5. Recent research on integration and co-integration
   9.6. Reinterpreting econometrics time-series problems

References
Acknowledgements for Quoted Extracts
Author Index
Subject Index
Notational Conventions, Symbols, and Abbreviations

The following notational conventions will be used throughout the text:

Y, y                                endogenous variables
X, Z, x, z                          exogenous variables, or vectors containing both y and z
Greek letters                       population values (parameters)
Greek letters with ^ or ~           sample values (estimates)
Bold lower case (Roman or Greek)    vectors
Bold upper case (Roman or Greek)    matrices

Equation numbers

Equations are numbered consecutively in each chapter and referred to within that chapter by this number alone. Equations from other chapters are referred to by the chapter number and equation number within chapter; e.g. the fifth equation in Chapter 2 is (5) within Chapter 2, and (2.5) elsewhere.

Symbols

lag operator
first-difference operator
Kronecker product
for all
modulus or absolute value of x, where x is a scalar
determinant of A, where A is a matrix
x conditional on y
weak convergence
convergence in distribution
convergence in probability

Abbreviations

ADF                 augmented Dickey-Fuller
ADL                 autoregressive-distributed lag
AR                  autoregression
ARIMA               autoregressive integrated moving average
ARMA                autoregressive-moving average
ARMAX               ARMA + additional exogenous processes
ASE                 asymptotic standard error
BM                  Brownian motion
CI(d, b)            co-integrated of order d, b
CLT                 central limit theorem
COMFAC              common factor error representation
CRDW                co-integrating regression DW statistic
diag                diagonal matrix
d.f.                degrees of freedom
DF                  Dickey-Fuller
DGP                 data-generation process
DW                  Durbin-Watson statistic
ECM                 error-correction model/mechanism
ESE                 (average) estimated standard error
FCLT                functional central limit theorem/s
FIML                full-information maximum likelihood
GLS                 generalized least squares
GNP                 gross national product
I(d)                integrated of order d
ID                  independently distributed
IID                 independently and identically distributed
IMA                 integrated moving average
IN($\mu, \sigma^2$) independently and normally distributed with mean $\mu$ and variance $\sigma^2$
IV                  instrumental variables
LIML                limited-information maximum likelihood
MA                  moving average
MDS                 martingale difference sequence
MLE                 maximum likelihood estimator
N($\mu, \sigma^2$)  normally distributed with mean $\mu$ and variance $\sigma^2$
NI                  near-integrated
OLS                 ordinary least squares
SC                  Schwarz information criterion
SD                  standard deviation
SE                  standard error
SI                  seasonally integrated
SSD                 sample standard deviation
T                   sample size or last observation in a time-series
TFE                 total final expenditure
VAR                 vector autoregression
var                 variance
vec                 vectorizing operator
W(r)                Wiener (Brownian motion) process with increments of variance r
1. Introduction and Overview

This book considers the econometric analysis of both stationary and non-stationary processes which may be linked by equilibrium relationships. It exposits the main tools, techniques, models, concepts, and distributions involved in econometric modelling of possibly non-stationary time-series data. Since the focus is on equilibrium concepts, including co-integration and error correction, the analysis begins with a discussion of the application of these concepts to stationary empirical models. Later we will show that integrated processes can be reduced to this case by suitable transformations that take advantage of co-integrating (equilibrium) relationships. In this chapter we will introduce some important concepts from time-series analysis and the theory of stochastic processes, and in particular the theory of Brownian motion processes. We also offer several empirical examples which use these concepts.

A significant re-evaluation of the statistical basis of econometric modelling took place during the 1980s. Its analytical basis expanded from the assumption of stationarity to include integrated processes. The effect of this shift is far from complete, but is already radical, influencing the choice of model forms, modelling practices, statistical inference, distribution theory, and the interpretation of many traditional concepts such as simultaneity, measurement errors, collinearity, forecasting, and exogeneity. This book attempts to analyse these issues, describe the tools necessary to investigate integrated processes, and relate the new methods to those more familiar to econometricians. Research is continuing at a rapid pace, and since this book cannot cover all of the techniques that have been explored, we will concentrate on those that we believe will remain useful.

Time-series econometrics is concerned with the estimation of relationships among groups of variables, each of which is observed at a number of consecutive points in time. The relationships among these variables may be complicated; in particular, the value of each variable may depend on the values taken by many others in several previous time periods. In consequence, the effect that a change in one variable has on another depends upon the time horizon that we consider. It is easy to imagine examples in which a change in one quantity has little or no effect on another at first and a substantial effect later. Alternatively, a variable may have a substantial effect on another for a time, but that effect may eventually die out.

It is useful, therefore, to distinguish what are often called 'short-run' relationships (those holding over a relatively short period) from 'long-run' relationships. The former relate to links that do not persist. For example, a sudden storm may temporarily reduce the supply of fresh fish and increase its price, but later fair weather will lead to the re-establishing of the earlier price if demand is unaltered. The long-run relationships determine the generally prevailing price-quantity combinations transacted in the market, and so are closely linked to the concepts of equilibrium relationships in economic theory and of persistent co-movements of economic time series in econometrics. Our first task is to clarify these concepts.
1.1. Equilibrium Relationships and the Long Run

An equilibrium state is defined as one in which there is no inherent tendency to change. A disequilibrium is any situation that is not an equilibrium and hence characterizes a state that contains the seeds of its own destruction. An equilibrium state may or may not have the property of either local or global stability; thus, it may or may not be true that the system tends to return to the equilibrium state when it is perturbed. However, we generally consider only stable equilibria, since unstable equilibria will not persist given that there are stochastic shocks to the economy. That is, equilibria are states to which the system is attracted, other things being equal. It may also be possible in some circumstances to view the forces tending to push the system back into equilibrium as depending upon the magnitude of the deviation from equilibrium at a given point in time.

Equilibrium may be either general or partial. In the latter case, a given market is viewed as having attained equilibrium in spite of the fact that we have not taken account of the feedback from other markets. In both cases, an equilibrium relationship is expressed through a function $f(x_1, x_2, \ldots, x_n) = 0$, which describes the relationships that hold among the $n$ variables $x_1$ to $x_n$ when the system is in equilibrium. The phrase 'long-run equilibrium' is also used to denote the equilibrium relationship to which a system converges over time. Over finite periods of time, the long-run or equilibrium relationships may fail to hold, but they will eventually hold to any degree of accuracy if the equilibrium is stable, and if the system does not experience further shocks from outside. Expressed differently, a long-run equilibrium relationship entails a systematic co-movement among economic variables which an economic system exemplifies precisely in the long run; we will write equations representing such co-movements without time subscripts as, e.g., $x_1 = \beta x_2$ to denote a linear long-run relation between $x_1$ and $x_2$.

Our definition of equilibrium is therefore not that in which 'equilibrium' refers to clearing in a particular market and where 'disequilibrium' means that supply is not equal to demand, as in Quandt (1978, 1982): we use the term 'market-clearing' for the former and a 'non-clearing market' for the latter. A non-clearing market involves quantity rationing of some agents and, depending on the institutional structure, may or may not involve a deviation from an equilibrium functional relationship.

There is of course a connection between the meaning of 'equilibrium' used in econometrics by Quandt and others, and that used here, which is more common in time-series analysis. When a market clears, an equilibrium relationship of the type we have defined may also occur because clearing of that market may return the system to a state in which some functional relationship among observable variables holds. Our definition is intended to be general and therefore to incorporate market-clearing equilibria, as well as others which may arise through the behaviour of a variety of different types of systems. For example, we would say that an equilibrium relationship exists between aggregate consumption and income if consumption tends toward a fraction $\gamma$ of income in the absence of shocks which may temporarily perturb the relationship. This need not be an equilibrium in the Quandt (1978) sense, however, because it may not correspond to the clearing of markets. (All consumers may remain credit-rationed, for example.)

Even if shocks to a system are constantly occurring so that the economic system is never in equilibrium, the concept of long-run equilibrium may nonetheless be useful. The present is the long-run outcome of the distant past and, as will be made precise below, a long-run relationship will often hold 'on average' over time. Moreover, a stable equilibrium has the property that a given deviation from the equilibrium becomes more and more unlikely as the magnitude of the deviation is greater, so that one may be reasonably confident that the discrepancy between the actual relationship connecting variables and this long-run relationship is within certain bounds. Precise definitions are provided in Chapter 5.

Methods for investigating such long-run relationships are our concern here. An examination of these methods will lead us to discuss aspects of time-series analysis, of dynamic modelling in general, and of the rapidly growing literature treating co-integration, error correction, and inference from non-stationary data. The first step is to clarify the statistical notion of stationarity and its links to the concept of equilibrium.
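The claim in this section that a stable equilibrium is eventually restored once shocks cease, and holds 'on average' when they do not, can be illustrated with a small worked case. This is only a sketch under assumptions that are not in the text: the deviation from equilibrium is written $d_t$, it is assumed to follow a first-order adjustment with coefficient $\rho$, and the shocks $\varepsilon_t$ are assumed IID with mean zero and variance $\sigma^2$.

```latex
% Assumed adjustment rule (illustrative only): a fraction (1 - \rho) of any
% deviation d_{t-1} = x_{1,t-1} - \beta x_{2,t-1} is removed each period.
\begin{align*}
\text{No further shocks:}\quad
  & d_t = \rho\, d_{t-1},\ |\rho| < 1
    \;\Longrightarrow\; d_t = \rho^{t} d_0 \to 0 \ \text{as } t \to \infty, \\
\text{Continuing shocks:}\quad
  & d_t = \rho\, d_{t-1} + \varepsilon_t
    \;\Longrightarrow\; \mathrm{E}(d_t) = 0,\quad
    \mathrm{var}(d_t) = \frac{\sigma^2}{1 - \rho^2}
    \ \text{(in the stationary state)}.
\end{align*}
```

Under these assumptions the relation $x_1 = \beta x_2$ is never exactly satisfied while shocks continue, yet the deviation has mean zero and a bounded variance, which is the sense in which the relation holds on average.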
1.2. Stationarity and Equilibrium Relationships

In economic theory, the concept of equilibrium is well established and well defined. The statistical concept of equilibrium centres on that of a stationary process, which will be defined formally below. A substantial body of methods is developing around the statistical features of equilibrium relationships among time-series processes, and the concepts of stationarity and particular forms of non-stationarity are crucial to these methods.

If a particular relationship such as $x_1 = \beta x_2$ emerges as the economic system is allowed to settle down, this will describe an equilibrium to an econometrician just as to a theorist. In actual time series, however, the relation $x_{1t} = \beta x_{2t}$ may never be observed to hold. Consequently, we look for ways of characterizing the relationships that can be observed to hold between $x_{1t}$ and $x_{2t}$.

Roughly speaking (again, terms will be defined precisely in Chapter 5), we say that an equilibrium relationship $f(x_1, x_2) = 0$ holds between two variables $x_1$ and $x_2$ if the amount $\varepsilon_t = f(x_{1t}, x_{2t})$ by which actual observations deviate from this equilibrium is a median-zero stationary process.¹ That is, the 'error' or discrepancy between outcome and postulated equilibrium has a fixed distribution, centred on zero, that does not change over time. This error cannot therefore grow indefinitely; if it did, the relationship could not have been an equilibrium one since the system is free to move ever further away from it. Of course, it may be difficult to distinguish in finite samples between an ever-growing discrepancy in an hypothesized equilibrium relationship and a random fluctuation; formal statistical tests for problems such as this are discussed in later chapters.

Given the characterization above, the short-run discrepancy $\varepsilon_t$ in an equilibrium relationship must have no tendency to grow systematically over time. However, since this error represents shocks that are constantly occurring and affecting economic variables, in a real economic system there is no systematic tendency for this error to diminish over time either. It would fall away to zero only if shocks were to cease.

This definition of an equilibrium relationship holds automatically when applied to series that are themselves stationary. For any two stationary series $\{x_{1t}\}$ and $\{x_{2t}\}$, irrespective of any substantive economic relationship between these two alone, a difference of the form $\{x_{1t} - b x_{2t}\}$ must be a stationary series for any $b$. Thus, whether or not there exists a non-zero $\beta$ which describes a true equilibrium relationship, corresponding to a non-zero derivative between $x_1$ and $x_2$, any arbitrarily chosen $b$ will meet the statistical equilibrium condition. This does not imply that we cannot use statistical methods to determine the parameters of a long-run relationship, but simply that one stage of the process, in which we look for a stationary discrepancy, is unnecessary.

However, this concept of statistical equilibrium is necessary and useful in examining equilibrium relationships between variables tending to grow over time. In such cases, if the actual relationship is $x_1 = \beta x_2$, the discrepancy $x_{1t} - b x_{2t}$ will be non-stationary for any $b \neq \beta$, since the discrepancy deviates from the true relationship by the constant proportion $(b - \beta)$ of the growing variable $x_{2t}$; only the true relationship can yield a stationary discrepancy. With more than two variables, however, there may be more than one equilibrium relation, and this leads to another of the statistical problems that is currently being pursued: the empirical determination of the number of equilibrium relationships between three or more non-stationary time series.

¹ Later we will consider more precisely the properties that the deviation must have. The requirement is usually stated as being that the deviation from the equilibrium relationship be integrated of order zero (see below); alternatively, we might impose only the weaker requirement that the unconditional expectation of the deviation from the equilibrium relationship be zero, implying that only the first moment need exist and be constant. For simplicity, we omit intercepts from the present discussion.
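The point that, for growing variables, only the true coefficient yields a discrepancy that does not drift can be sketched with a short simulation. Everything here is an illustrative assumption rather than part of the text: the function name `half_sample_means`, the data-generating process (a random walk with drift for $x_{2t}$ and white-noise deviations around $x_{1t} = \beta x_{2t}$), and the parameter values.

```python
import numpy as np


def half_sample_means(b, beta=2.0, T=2000, seed=0):
    """Mean of the discrepancy x1 - b*x2 in each half of a simulated sample.

    Hypothetical helper for illustration; beta, T and seed are assumed values.
    """
    rng = np.random.default_rng(seed)
    # x2: a random walk with positive drift, so it tends to grow over time.
    x2 = np.cumsum(0.5 + rng.standard_normal(T))
    # x1 obeys the long-run relation x1 = beta * x2 up to a stationary error.
    x1 = beta * x2 + rng.standard_normal(T)
    d = x1 - b * x2
    half = T // 2
    return d[:half].mean(), d[half:].mean()


if __name__ == "__main__":
    for b in (2.0, 1.5):
        first, second = half_sample_means(b)
        print(f"b = {b}: first-half mean {first:.2f}, second-half mean {second:.2f}")
```

With `b` equal to the assumed true value the two half-sample means should both sit near zero, whereas for any other `b` the discrepancy inherits the growth of `x2`, so its mean shifts between the two halves. A comparison of half-sample means is only a crude diagnostic; it is not one of the formal tests discussed in later chapters.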
1.3. Equilibrium and the Specification of Dynamic Models

Equilibrium relationships have played an explicit role in econometric modelling since its foundations (see Morgan 1990). If there exists a stable equilibrium $x_1 = \beta x_2$, the discrepancy $\{x_{1t} - \beta x_{2t}\}$ evidently contains useful information since on average the system will move towards that equilibrium if it is not already there. In particular, $(x_{1,t-1} - \beta x_{2,t-1})$ represents the previous disequilibrium. Suppose the equilibrium relationship is between a variable $\{y_t\}$ to be modelled and some series $\{z_t\}$ which is exogenous in an appropriate sense. If we let $x_{1t} = y_t$ and $x_{2t} = z_t$ to distinguish their status, and denote the equilibrium by $y = \beta z$, then the discrepancy, or error, $\{y_t - \beta z_t\}$ should be a useful explanatory variable for the next direction of movement of $y_t$. In particular, when $y_t - \beta z_t$ is positive, $y_t$ is too high relative to $z_t$, and on average we might expect a fall in $y$ in future periods relative to its trend growth. The term $(y_{t-1} - \beta z_{t-1})$, called an error-correction mechanism, is therefore sometimes included in dynamic regressions (see Sargan 1964, Hendry and Anderson 1977, and Davidson, Hendry, Srba, and Yeo 1978).

The true parameter $\beta$ characterizing the relationship is not known in general. This need not prevent the error-correction mechanism from being useful, however, since the unknown parameter can either be estimated separately in a prior analysis or estimated in the course of modelling the variable of interest. Moreover, the general error-correction mechanism can be shown to be equivalent to various other transformations of a general linear model incorporating past values of both the variable of interest and the explanatory variables (see Chapter 2). A particular advantage of the error-correction mechanism is that the extent of adjustment in a given period to deviations from long-run equilibrium is given by the estimated equation without any further calculation. Other forms of the estimated model are also convenient in that they allow the implied long-run relation itself to be seen directly. Considerations such as these are discussed in the following chapter.

The practice of exploiting information contained in the current deviation from an equilibrium relationship, in explaining the path of a variable, has benefited from the formalization of the concept of co-integration by Granger (1981) and Engle and Granger (1987). The informal definition of statistical equilibrium discussed above is based upon a special case of the definition of co-integration. Further, the practice of modelling co-integrated series is closely related to error-correction mechanisms: error-correcting behaviour on the part of economic agents will induce co-integrating relationships among the corresponding time series and vice versa.

A series that is tending to grow over time cannot be stationary (although it may possibly be stationary around some deterministic trend), but the changes in that series might be.
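The error-correction idea discussed in this section can be made concrete with a small simulation. It is a sketch under stated assumptions, not the general model treated in Chapter 2: the first-order form $\Delta y_t = \gamma \Delta z_t - \alpha (y_{t-1} - \beta z_{t-1}) + u_t$, the symbols $\gamma$ and $\alpha$, the function names `simulate_ecm` and `fit_ecm`, and all parameter values are chosen purely for illustration, and $\beta$ is treated as known from a prior analysis.

```python
import numpy as np


def simulate_ecm(T=5000, beta=1.0, gamma=0.5, alpha=0.2, seed=1):
    """Simulate y from an assumed first-order error-correction equation.

    z is an exogenous random walk; y adjusts towards beta * z at rate alpha.
    """
    rng = np.random.default_rng(seed)
    z = np.cumsum(rng.standard_normal(T))
    u = rng.standard_normal(T)
    y = np.empty(T)
    y[0] = beta * z[0]  # start in equilibrium
    for t in range(1, T):
        disequilibrium = y[t - 1] - beta * z[t - 1]
        y[t] = y[t - 1] + gamma * (z[t] - z[t - 1]) - alpha * disequilibrium + u[t]
    return y, z


def fit_ecm(y, z, beta):
    """OLS of dy_t on dz_t and the lagged discrepancy (y - beta*z)_{t-1}."""
    dy = np.diff(y)
    dz = np.diff(z)
    lagged_discrepancy = (y - beta * z)[:-1]
    X = np.column_stack([dz, lagged_discrepancy])
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    return coef[0], coef[1]  # impact effect, error-correction coefficient


if __name__ == "__main__":
    y, z = simulate_ecm()
    impact, correction = fit_ecm(y, z, beta=1.0)
    print(f"impact effect {impact:.3f}, error-correction coefficient {correction:.3f}")
```

The coefficient on the lagged discrepancy estimates $-\alpha$, so a negative value indicates that $y$ falls, on average, after a period in which it was too high relative to $z$; its size can be read off directly as the share of the previous deviation removed each period. In this simulated example `y` itself wanders with the random walk `z`, while its changes `np.diff(y)` form a stationary series.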
T o tak e a mechanica l example, i f a n objec t ha s a fixe d averag e positio n aroun d whic h i t moves, alway s returnin g afte r som e interva l t o thi s positio n lik e a randomly perturbe d weigh t a t th e en d o f a spring , the n it s displacemen t may b e a stationar y series . A n objec t tha t ha s n o suc h fixe d positio n may nevertheles s hav e a velocit y (th e chang e i n positio n pe r uni t time) , or acceleratio n (th e chang e i n th e velocit y pe r uni t time) , tha t i s stationary. Fo r example , i f th e objec t i s movin g eve r furthe r fro m it s point o f origin , bu t wit h velocit y fluctuatin g aroun d som e fixe d positiv e mean accordin g t o a fixe d distributio n function , the n th e velocit y o f th e object i s a stationary series. A serie s is said t o be integrate d o f order 1 (1(1)) if , althoug h it is itself non-stationary, th e change s i n thi s serie s for m a stationar y series . I t i s said t o b e integrate d o f orde r 2 (1(2) ) if , althoug h th e change s ar e non stationary, th e changes in th e changes for m a stationar y series . I n othe r words, i f th e serie s mus t b e difference d exactl y k time s t o achiev e stationarity, the n th e serie s i s l(k), s o that a stationary serie s i s 1(0). W e will us e th e ter m 'integrate d process ' t o refe r t o a serie s wit h orde r o f integration strictl y greate r tha n zero : precis e definition s ar e give n i n Chapter 3 . We ca n no w conside r th e concep t o f co-integration , it s relatio n t o th e
definition of long-run equilibrium between series given above, and its use as part of a statistical description of the behaviour of time series that satisfy some equilibrium relationship. A simple example concerns two series, each of which is integrated of order 1. Assume that a long-run equilibrium relationship holds between them, and that it is linear: $x_1 = \beta x_2$. Then $(x_1 - \beta x_2)$ must be equal to zero in equilibrium and the series $\{x_{1t} - \beta x_{2t}\}$ has a constant unconditional mean of zero. This need not imply that $\{x_{1t} - \beta x_{2t}\}$ is stationary: the variance of $\{x_{1t} - \beta x_{2t}\}$ might be non-constant, for example. The definition of co-integration given by Engle and Granger (1987), and discussed in Chapter 5, does however require stationarity of the deviation $\{x_{1t} - \beta x_{2t}\}$. When stationarity does hold, we say that $x_1$ and $x_2$ are co-integrated (1,1), denoted CI(1,1); that is, they are each integrated of order 1, and there exists some linear combination $\{x_{1t} - \beta x_{2t}\}$ which is integrated of an order one lower than the components (i.e. is I(0) here). If $\{x_{1t} - \beta x_{2t}\}$ has a constant unconditional mean but is not stationary, then we may still want to say that an equilibrium relationship holds; the series will not, however, fit the strict Engle-Granger definition of co-integration, which requires that some linear combination be stationary.

A substantive long-run equilibrium relationship is something from which the variables involved can deviate, but not by an ever-growing amount. That is, the discrepancy or error in the relationship cannot be integrated of any order greater than zero. Series integrated of strictly positive orders which are linked by such an equilibrium relationship must, therefore, be co-integrated with each other.
In the example just given, the fact that the integrated series $x_1$ and $x_2$ move together in the long run is reflected in the fact that they are co-integrated; a linear relation yields a stationary deviation.

More generally, we can speak of variables that are co-integrated (a, b) when $a \ge b$ and $b > 0$, where a is the order of integration of the variables and b is the reduction in order of integration produced by the linear combination, which then has order of integration $a - b$. When $b > 0$, a linear relation exists between the variables which is integrated of lower order than either of the variables themselves, but which may none the less not be I(0). In the latter case ($a - b > 0$), the variables may deviate from the linear relationship by an ever-growing amount, and so it is not the kind of relationship that we have been calling a long-run equilibrium. Nevertheless, variables that are CI(a, b) for $b > 0$ do contain some information about the long-run behaviour of the series involved.

Since a relationship between co-integrated variables can be shown to be representable using an error-correction mechanism (see Chapter 5), and since such representations have been found to be valuable in empirical modelling, there is a formal counterpart to the informal
argument above suggesting the usefulness of equilibrium information in specifying dynamic regression models.
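The CI(1,1) case described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the coefficient `beta = 2.0`, the sample size, the seed, and the helper `half_sample_variances` are hypothetical choices made for illustration, and the deviation from equilibrium is taken to be white noise.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 2000
beta = 2.0  # hypothetical co-integrating coefficient (an assumption for this sketch)

# x2 is a random walk, hence I(1): its first difference is white noise.
x2 = np.cumsum(rng.standard_normal(T))
# x1 tracks beta * x2 up to a stationary (white-noise) deviation, so x1 is also I(1).
deviation = rng.standard_normal(T)
x1 = beta * x2 + deviation

def half_sample_variances(series):
    """Sample variance of the first and second halves of a series."""
    half = len(series) // 2
    return series[:half].var(), series[half:].var()

print("x1 in levels:        ", half_sample_variances(x1))
print("x1 - beta * x2:      ", half_sample_variances(x1 - beta * x2))
print("first difference x1: ", half_sample_variances(np.diff(x1)))
```

The linear combination `x1 - beta * x2` recovers the stationary deviation, so its variance should look similar in both halves of the sample, whereas the variances of the level series have no reason to agree across sub-samples.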
1.4. Estimation of Long-Run Relationships and Testing for Orders of Integration and Co-integration

The existence of long-run relationships between variables, the potential orders of integration of particular time series, and the implications of these for the specification of dynamic econometric models can be understood as mathematical properties without implying that we know whether or not such relationships exist, let alone what their forms for a particular empirical problem would be.

When an estimated regression equation implies an equilibrium relationship between two processes, it is a straightforward operation to extract the estimated long-run equilibrium relation regardless of the form in which the equation is estimated. The calculation can be made by expressing the equation in an equilibrium form and taking its expectation. This is analogous to assuming a state in which the values of the variables do not change, so that the dating of variables becomes irrelevant and the equation is treated as deterministic. Computing the derivative between the two series is then straightforward. Approximations to the variances of estimated long-run multipliers can also be computed. Chapter 2 explores various transformations of the linear model that are convenient for these and related calculations.

Testing for the existence of such an equilibrium relationship is not nearly so simple. First, it is difficult empirically to establish the orders of integration of individual time series. Second, the order of integration of a linear relationship among variables is even harder to discover than the order of integration of a single series: drawing inferences is complicated by the fact that the parameters of the relationship are in general unknown.
Testing whether an individual series is I(1) as opposed to I(0) is the problem that has been widely discussed as that of testing for a 'unit root' in a time series. Strategies for performing such testing have had to contend with the problem that I(0) alternatives in which the series is 'close' to being I(1) (so that the power of the test is low) are very plausible in many economic circumstances. Further, the form of the data generation process (e.g. the orders of dynamics; the question of which exogenous variables enter; etc.) is not known, and critical values of test statistics are typically sensitive to the structure of the process.

Fuller (1976) and Dickey and Fuller (1979) emphasized that testing for non-stationarity (again, I(1) as opposed to I(0) series) is more difficult than conventional t-tests of the hypothesis that the autoregressive parameter is equal to one in an AR(1) model. In fact, where there are roots greater than or equal to one, conventionally used tests do not have standard asymptotic distributions. The original tests were variants of conventional tests, with critical values retabulated using Monte Carlo experiments to reflect the changes in distribution when, under the null, the series are non-stationary.

These original tests were based on simple forms of autoregressive model: an AR(1) model, with or without drift and time trend terms (i.e. $y_t = \alpha y_{t-1} [+ \beta] [+ \gamma t] + \varepsilon_t$). Such simple forms may often be poor approximations to the data generation process. This will manifest itself in the failure of the estimated model to pass various mis-specification tests. In particular, tests for residual autocorrelation will often reflect autocorrelated processes that have been omitted from the model specification. One way of dealing with the problem of finding an adequate model within which to test for non-stationarity has therefore been to retain a simple autoregressive model form, but with a non-parametric correction to the values of the test statistic to allow for a general form of autocorrelation in the residuals. Another approach attempts to capture the autocorrelation through the addition of extra lagged terms in the dependent variable. These issues are addressed in Chapter 4.

When series may contain more than one 'unit root', i.e. where they may be I(2) or of higher orders, testing becomes yet more difficult because the sequence in which different hypotheses are tested can affect inference. Such issues are also considered in Chapter 4.
A related method can be applied to the problem of testing for an equilibrium relation between integrated variables. A prior step must be added to the method above, in which a linear relationship between or among the variables in question is estimated. Testing for co-integration then entails testing the order of integration of the error in this relationship. For example, a stationary error in a model relating integrated series entails an equilibrium relationship. Conversely, if there were no equilibrium relationship, there would be nothing to tie these series to any estimated linear relation, and this would imply non-stationarity of the residuals.

It might appear at first sight, for example, that testing for co-integration between I(1) series $\{x_{1t}\}$ and $\{x_{2t}\}$ would be precisely the same as a test of the hypothesis that $\{e_t\} = \{x_{1t} - \beta x_{2t}\}$ is I(1) against the alternative that $\{e_t\}$ is I(0). However, this is true only under very strong assumptions. Necessary conditions include that there is only one co-integrating relation and the values of its parameters are known. In the bivariate case, when $\beta$ is estimated, the series that one tests for stationarity is $\{\hat{e}_t\} = \{x_{1t} - \hat{\beta} x_{2t}\}$. Since linear regression minimizes the variance of $\hat{e}_t$, the estimated series of deviations from equilibrium has a smaller variance than the true deviations $\{x_{1t} - \beta x_{2t}\}$, assuming that $\beta$
exists. That is, the method by which $\beta$ is usually estimated amounts to choosing $\beta$ in such a way that the two variables are given the best chance to appear to move together. Regression makes co-integration appear to be present more often than it should, so that the critical values of test statistics must be adjusted to reflect the fact that $\beta$ is estimated. Co-integration tests are therefore similar, but not identical, to standard stationarity tests.

Chapter 7 explores these tests for co-integration, and Chapter 8 extends the discussion to estimation and testing in systems of equations.
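The point that estimating $\beta$ by regression shrinks the apparent deviations can be checked numerically. The sketch below is illustrative only: it assumes a regression through the origin (so 'variance' is measured by the sum of squared deviations), a hypothetical true coefficient `beta = 1.0`, and white-noise true deviations; none of these choices comes from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200
beta = 1.0  # hypothetical true coefficient (assumption for this sketch)

x2 = np.cumsum(rng.standard_normal(T))   # I(1) regressor
true_dev = rng.standard_normal(T)        # stationary true deviation
x1 = beta * x2 + true_dev

# OLS through the origin: beta_hat minimizes the residual sum of squares,
# so no other coefficient, including the true beta, can give a smaller one.
beta_hat = (x2 @ x1) / (x2 @ x2)
resid = x1 - beta_hat * x2

ss_true = float(true_dev @ true_dev)
ss_resid = float(resid @ resid)
print("beta_hat:", beta_hat)
print("sum of squares, estimated deviations:", ss_resid)
print("sum of squares, true deviations:     ", ss_true)
```

Because the inequality holds by construction in every sample, residual-based tests need critical values that allow for the estimation of $\beta$, as the text explains.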
1.5. Preliminary Concepts and Definitions

We assume that readers are acquainted with the fundamental principles and methods of econometrics and statistical inference. It is nonetheless worth reviewing some important concepts and definitions that will be used in later chapters, establishing terminology as we do so.
1.5.1. Stochastic Processes and Time-series Models

A number of concepts from standard time-series analysis will be necessary. Box and Jenkins (1970) give a thorough treatment of these models.

A stochastic process is an ordered sequence of random variables $\{x(s, t),\ s \in S,\ t \in \mathbb{T}\}$, such that, for each $t \in \mathbb{T}$, $x(\cdot, t)$ is a random variable on the sample space $S$ and, for each $s \in S$, $x(s, \cdot)$ is a realization of the stochastic process on the index set $\mathbb{T}$ (that is, an ordered set of values, each corresponding to one value of the index set). A given realization of the process may be represented as $\{x(t),\ t \in \mathbb{T}\}$, and this notation is also often used for the stochastic process itself. In later chapters we will typically refer to realizations of stochastic processes by the notation $x_t$ for a value at $t$, and $\{x_t\}_1^T$ (or $\{x_t\}$ or $\{x_t\}_{t=1}^T$) for a full set of values corresponding to an index set $\mathbb{T} = \{1, 2, \ldots, T\}$. We will also restrict our attention to discrete stochastic processes, for which the index set is a discrete set, in which case we generally use the notation $x_t$ rather than $x(t)$, which may apply also to continuous processes.

Next, let $\{x(t),\ t \in \mathbb{T}\}$ be a stochastic process such that $E(|x(t)|) < \infty$ for all $t \in \mathbb{T}$, and $E(x(t) \mid \mathcal{I}_{t-1}) = x(t-1)$ for all $t \in \mathbb{T}$, where $E(\cdot)$ is the expectations operator and $\mathcal{I}_{t-1}$ represents a particular information set of data realized by time $t - 1$. Then $\{x(t),\ t \in \mathbb{T}\}$ is called a
martingale with respect to $\{\mathcal{I}_t,\ t \in \mathbb{T}\}$. A martingale difference sequence can then be defined by $\{y(t) = x(t) - x(t-1),\ t \in \mathbb{T}\}$. It follows that $E(|y(t)|)$ [...] $x_2(t), \ldots, x_m(t))'$, then we require in addition that covariances of the form $E[x_k(t_i) x_l(t_j)]$ are finite constants and are functions of $i$, $j$, $k$, $l$ only, for any admissible $i$, $j$, $k$, and $l$.

We will not offer a rigorous definition of an integrated process at this stage but we can highlight a number of the issues involved. An integrated process is one that can be made stationary by differencing. A discrete process integrated of order $d$ must be differenced $d$ times to reach stationarity; that is, $\Delta^d x_t$ is stationary where the differencing operator $\Delta^d$ is defined by $(1 - L)^d$ (using the lag operator $L$, itself defined by $L^n x_t = x_{t-n}$). For example, the first difference is $\Delta x_t = x_t - x_{t-1}$, and the second difference is $\Delta^2 x_t = \Delta x_t - \Delta x_{t-1} = x_t - 2x_{t-1} + x_{t-2} = (1 - L)^2 x_t$. The process $(1 - L)x_t = \varepsilon_t$, where $\{\varepsilon_t\}$ is a white-noise series (see below), is called a random walk and is a simple example of a process integrated of order 1.

Two issues merit comment. First, if $x_t$ is stationary then so is $\Delta x_t$ or even $\Delta^d x_t$ for $d > 0$. Thus, the stationarity of $\Delta^d x_t$ is not sufficient for $x_t$ to be I(d). (Recall that an I(d) process is one that must be differenced $d$ times to achieve stationarity.) Secondly, consider the stable autoregressive process, $x_t = \alpha_0 + \alpha_1 x_{t-1} + \varepsilon_t$, where $|\alpha_1| < 1$, $x_0 = 0$, and $\varepsilon_t \sim IN(0, \sigma^2)$, $t = 1, \ldots, T$. Then $\{x_t\}$ is non-stationary since $E(x_t) =$ [...] $\lambda)$, where $X_{t-1}$ denotes the history of the variable $x$: $X_{t-1} = (x_{t-1}, x_{t-2}, \ldots, x_0)$. Let the parameters $\lambda \in \Lambda$ be partitioned into $(\lambda_1, \lambda_2)$ to support the factorization

$$D(x_t \mid X_{t-1}, \lambda) = D(y_t \mid z_t, X_{t-1}, \lambda_1)\, D(z_t \mid X_{t-1}, \lambda_2).$$

Then $[(y_t \mid z_t; \lambda_1), (z_t; \lambda_2)]$ operates a sequential cut on $D(x_t \mid X_{t-1}, \lambda)$ if and only if $\lambda_1$
and $\lambda_2$ are variation free; that is, if and only if

$$(\lambda_1, \lambda_2) \in \Lambda_1 \times \Lambda_2,$$

so that the parameter space $\Lambda$ is the direct product of $\Lambda_1$ and $\Lambda_2$. In other words, for any values of $\lambda_1$ and $\lambda_2$, admissible values of the parameters $\lambda$ of the joint distribution can be recovered. The essential element of weak exogeneity is that the marginal distribution contains no information relevant to $\lambda_1$ (for an exposition, see Ericsson 1992).

Weak exogeneity: $z_t$ is weakly exogenous for a set of parameters of interest $\psi$ if and only if there exists a partition $(\lambda_1, \lambda_2)$ of $\lambda$ such that (i) $\psi$ is a function of $\lambda_1$ alone, and (ii) $[(y_t \mid z_t; \lambda_1), (z_t; \lambda_2)]$ operates a sequential cut.

Strong exogeneity: $z_t$ is strongly exogenous for $\psi$ if and only if $z_t$ is weakly exogenous for $\psi$ and [...] so that $y$ does not Granger-cause $z$.

Super exogeneity: $z_t$ is super exogenous for $\psi$ if and only if $z_t$ is weakly exogenous for $\psi$ and $\lambda_1$ is invariant to interventions affecting $\lambda_2$.

Weak exogeneity ensures that there is no loss of information about parameters of interest from analysing only the conditional distribution; a variable $z_t$ is weakly exogenous for a set of parameters $\psi$ if inference concerning $\psi$ can be made conditional on $z_t$ with no loss of information relative to that which could be obtained using the joint density of $y_t$ and
$z_t$. Strong exogeneity is necessary for multi-step forecasting which proceeds by forecasting future $z$s and then forecasting $y$s conditional on those $z$s. Super exogeneity sustains policy analysis on $\lambda_1$ when the marginal distribution of $z_t$ is altered.

Engle et al. contrast these three types of exogeneity with the traditional concepts of strict exogeneity and pre-determinedness. If $u_t$ is the error term in a model, then $z_t$ is said to be strictly exogenous if $E[z_t u_{t+i}] = 0\ \forall i$, whereas $z_t$ is said to be predetermined if $E[z_t u_{t+i}] = 0\ \forall i \ge 0$. Engle et al. show that the latter concepts are neither necessary nor sufficient for valid inference since neither relates to parameters of interest.

The following example (from Engle et al. 1983) seeks to clarify these concepts. Consider the DGP:
with
The parameters ($\beta$, [...] denote weak convergence in the sense that the probability measures converge: this is the analogue, for function spaces, of convergence in distribution for random variables (see Hall and Heyde 1980). Then, under weak assumptions about $\{x_t\}$, (4)
Furthermore, if $f(\cdot)$ is a continuous functional on [0, 1], then (5)
FIG 1.2. Mapping the 10-point graph on to a step function
FIG 1.3. Step representation of a random walk over 10 points
FIG 1.4. Step representation of a random walk over 100 points

For further details, see Billingsley (1968), Dickey and Fuller (1979, 1981), Hall and Heyde (1980), and Phillips (1986, 1987a).

In distributions involving I(1) variables, functionals of Wiener processes arise quite generally, whereas conventional methods of obtaining limiting distributions tend to be specific to the assumptions made about the data or error process.3 Also, many of the statistics regularly used in

3 By this we mean that only weak restrictions need to be satisfied by the $\{x_t\}$ sequence for convergence results such as (4) and (5) to hold. Phillips (1987a) provides a good account of this issue, and a discussion is also contained in Ch. 3.
FIG 1.5. Step representation of a random walk over 1000 points
empirical research involving I(1) time series have different distributions from those that arise with I(0) data. In particular, many statistics in I(1) processes do not converge to constants, as in the I(0) case, but instead converge to random variables. Thus, different critical values may be required for tests, depending on the degree of integration of the time series.

Consider the random walk, $y_t = y_{t-1} + \varepsilon_t$, with $\varepsilon_t \sim IN(0, 1)$ and $y_0 = 0$. Then
Alternatively, from (7),
Similarly, $\mathrm{corr}^2(y_t, y_{t-k})$ has a numerator of $(t - k)^2$ and a denominator of $t(t - k)$ for $k > 0$, and so equals $1 - k/t$. When $k < 0$, let $s = t - k$ so that $t = s + k$, and let $r = -k > 0$, in which case
Since $y_0 = 0$, we have that
The last approximation uses

To illustrate the use of Wiener processes in deriving distributions involving I(1) variables, we will derive the limiting distribution of the sample mean, $\bar{y} = T^{-1} \sum_{t=1}^{T} y_t$. Because $\{y_t\}$ is a random walk, its mean converges to a functional of a Wiener process. Let $R_T(r) = y_{[Tr]}/\sqrt{T} = y_{i-1}/\sqrt{T}$ for $(i - 1)/T \le r < i/T$ $(i = 1, \ldots, T)$, and $R_T(1) = y_T/\sqrt{T}$. $R_T(r)$ is a step function with steps at $i/T$, for $i = 1, \ldots, T$, and is constant between steps. Thus,
The last expression is $\bar{y}_1/\sqrt{T}$, where $\bar{y}_1$ is the lagged mean. This result uses the fact that, for any constant $c$,
From (3) and (4),
and hence
The unlagged sample mean has the same limiting distribution.
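The behaviour of the scaled sample mean of a random walk can be explored by simulation. The sketch below relies on one standard fact that is not derived here: the integral of a standard Wiener process over [0, 1] is normally distributed with mean zero and variance 1/3, so $\bar{y}/\sqrt{T}$ should have variance close to 1/3 in large samples. The sample size, number of replications, and seed are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(42)
T, N = 500, 4000   # sample size and number of replications (illustrative choices)

# Each row holds one random walk y_1, ..., y_T with y_0 = 0 and IN(0, 1) increments.
y = np.cumsum(rng.standard_normal((N, T)), axis=1)

# Scaled sample mean: ybar / sqrt(T) = T**(-3/2) * sum_t y_t.
scaled_mean = y.mean(axis=1) / np.sqrt(T)

print("mean of scaled sample mean:    ", scaled_mean.mean())
print("variance of scaled sample mean:", scaled_mean.var())
print("reference value 1/3:           ", 1.0 / 3.0)
```

Unlike the sample mean of a stationary series, $\bar{y}$ itself does not settle down as $T$ grows; only after division by $\sqrt{T}$ does it have a non-degenerate limiting distribution.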
Thus, som e functiona l o f Wiene r processe s ar e familia r rando m vari ables i n disguis e and w e will develo p thi s aspect a s we proceed. A proo f of (11 ) i s given in the Appendix . 7.5.7. Monte Carlo Simulation The purpos e o f Mont e Carl o simulatio n i s t o evaluat e b y experimen t quantities tha t woul d be ver y difficult o r impossibl e t o evaluat e analytically. Suc h experiment s typicall y begi n b y creatin g a se t o f dat a wit h known statistica l properties . Thi s i s achieve d b y specifyin g ever y aspec t of a data-generatin g process , o r clas s o f suc h processes , an d replacin g the rando m error s o f th e DG P b y pseudo-rando m numbers . Pseudo random number s ar e number s generate d deterministicall y t o mimi c a random proces s wit h a particula r distribution . A n investigato r typically generates a large numbe r o f suc h artificial data set s (calle d replications ) to investigat e statistica l technique s whic h analys e thes e dat a a s i f th e process generating them were no t known. Th e performanc e o f th e statistical techniqu e i n revealin g som e characteristi c o f th e dat a se t ma y 4
Strictl y speaking , th e versio n w e us e her e i s a specia l cas e o f thi s theorem , sometime s called the Liapuno v centra l limi t theorem.
then be evaluated by generating its distribution from independent replications of the experiment and comparing the results with the known characteristics of the process generating the data.

For example, an econometrician may wish to examine the performance of the standard t-test in data generated by a random walk. Artificial data-sets following a random walk may easily be constructed using pseudo-random disturbances, and the empirical distribution of the t-statistic in samples of size T can be generated by replicating N sets of T observations. The mean, variance, or various critical values of the t-statistic can be calculated from the empirical distribution and, for sufficiently large N, will be close to their population (i.e. analytic) counterparts. The investigator can also vary the parameters of the DGP in order to observe their effects on the outcome. In each experiment, the investigator knows the true parameters of the process, and so can evaluate the estimators and tests used.

Unlike analytical studies, Monte Carlo simulations cannot produce exact results; any result from a Monte Carlo experiment comes from a (pseudo-)random sample, and therefore has some variability attached to it. Moreover, Monte Carlo experiments are inevitably specific to the particular data generation processes examined (although it may be possible to prove analytically that results will be invariant to certain parameters in the process). Nonetheless, Monte Carlo results are useful when analytical results are difficult to obtain. In particular, Monte Carlo experiments are often used to investigate the finite-sample performance of statistical techniques, the analytical properties of which are known only asymptotically.
There are a number of subtleties to the design and interpretation of Monte Carlo experiments which demand careful attention, including the methods used to generate pseudo-random numbers, variance-reduction methods such as common random numbers, antithetic random numbers and control variates intended to improve precision, the calculation of standard errors of the experimental estimates of unknown quantities, the use of response surfaces to summarize and interpolate results, and recursive updating of quantities of interest. Expositions of Monte Carlo methods may be found in, for example, Hammersley and Handscomb (1964), Hendry (1984), Ripley (1987), Hendry, Neale, and Ericsson (1990), and Davidson and MacKinnon (1992).
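The random-walk t-test experiment described in this section can be sketched in a few lines. This is a minimal illustration rather than a reproduction of any published design: the regression has no constant or trend, `T`, `N`, and the seed are arbitrary, and the helper `t_stat_unit_root` is a hypothetical name introduced here.

```python
import numpy as np

rng = np.random.default_rng(123)
T, N = 100, 5000   # sample size and number of replications (illustrative choices)

def t_stat_unit_root(y):
    """t-statistic for H0: alpha = 1 in y_t = alpha * y_{t-1} + e_t (no constant)."""
    y_lag, y_cur = y[:-1], y[1:]
    alpha_hat = (y_lag @ y_cur) / (y_lag @ y_lag)
    resid = y_cur - alpha_hat * y_lag
    s2 = (resid @ resid) / (len(y_cur) - 1)   # residual variance estimate
    se = np.sqrt(s2 / (y_lag @ y_lag))        # standard error of alpha_hat
    return (alpha_hat - 1.0) / se

t_stats = np.empty(N)
for i in range(N):
    # One replication: a random walk with y_0 = 0 and pseudo-random IN(0, 1) errors.
    y = np.concatenate(([0.0], np.cumsum(rng.standard_normal(T))))
    t_stats[i] = t_stat_unit_root(y)

q05 = np.quantile(t_stats, 0.05)
print("mean of t-statistic:  ", t_stats.mean())
print("empirical 5% quantile:", q05)
```

With the null of a unit root true in every replication, the empirical distribution is shifted to the left of the standard normal, so the 5 per cent quantile should fall below the conventional one-sided value of about -1.645. Each reported number carries Monte Carlo sampling error that shrinks as N increases.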
1.6. Data Representation and Transformations

Since data transformations play an important role in econometrics generally, we briefly consider their impact on I(1) data. Consider the hypothesis that a set of integrated data can be described by a linear
model with a constant error variance. In particular, a normally distributed random walk with drift is often postulated so that $\Delta x_t \sim IN(\mu, \sigma^2)$. Many economic time series (such as consumption, national income and expenditure, or the price level) do grow over time, but the amount by which they grow in each period also tends to rise. However, $\Delta x_t = x_t - x_{t-1}$ will be stationary only if the absolute amount of growth is stationary, in which case for $\mu > 0$, $\sigma/x_t$ will tend to zero. Percentage growth, by contrast, often displays no obvious tendency to rise or fall, making it a more likely candidate for stationarity. Since the levels of many economic variables are initially positive, and recalling that
we see that stationarity of the rate of growth implies stationarity of $\Delta \log(x_t)$. Changes in the logarithms of economic data series such as those just mentioned, therefore, seem more likely to be stationary than changes in the levels. We will return to this point in Chapter 6 below, where we consider how co-integration is affected by the logarithmic transformation. We illustrate some of these points with actual data series.

The time series that we analyse is real net national product (Y, in 1929 £million) for the United Kingdom over 1872-1975. The data are taken from Friedman and Schwartz (1982) and are also investigated in Hendry and Ericsson (1991a). Figures 1.6-1.9 plot this data series and
FIG 1.6. UK real net national product (Y in 1929 £million), 1872-1975
FIG 1.7. Logarithm (log Y) of UK real net national product
various transformations of it. Figure 1.6 plots the untransformed series $Y_t$; the series is tending to grow by increasing amounts, and so would be better approximated by a convex function than by a straight line. This is visible from the upward curvature and the much closer fit of the quadratic trend line compared with the linear trend. In Fig. 1.7, we plot the logarithm of the series: the curvature is no longer apparent, and the quadratic and linear trends are very similar and fit about equally well. Thus, the logarithm of the series is relatively well approximated by a straight line and, while growing, there is no evident tendency for the growth rate to change over time.

Figure 1.8 plots the changes, $\Delta Y_t$. There is a tendency for both the mean and the variance to grow over time, and the linear trend shown highlights the former. (It requires more careful inspection to see the latter owing to the very large shock in 1919-20.) Differencing the initial series has therefore not produced a stationary series. In Fig. 1.9, however, where $\Delta \log Y_t$ is plotted, there is no longer any major change in the mean or variability of the series over the sample, with perhaps a slight tendency for the variance to be smaller in the period since 1945. Certainly, any trend in the mean of $\Delta \log Y_t$ is negligible. This series, then, may well be stationary, although neither the logarithmic transformation nor the first-difference transformation produced a stationary series on its own. Since the differences in the logarithms appear stationary, we might expect to find that the logarithms of the original
FIG 1.8. Changes ($\Delta Y$) in UK real net national product
FIG 1.9. Changes in the logarithm ($\Delta \log Y$) of UK real net national product

series are I(1), while the untransformed initial series apparently is not and differencing it is not sufficient to produce stationarity.

Alternatively, any linear model of $\Delta Y_t$ will have an error term, which we denote by $u_t$, with a standard deviation $\sigma_u$ that must be in the same
units as $Y_t$. Since these are 1929 £million, the linear model assumes a constant absolute error standard deviation. However, net national product has grown about six-fold over the sample so that $\sigma_u/Y_t$ (the relative error) will be much smaller in 1975 than in 1875. It would be difficult to imagine reasons for such a decline.

The log-linear model, by way of contrast, assumes a constant relative error standard deviation (e.g. 2½ per cent of $Y_t$ at all points in time), which seems much more plausible. Failing to transform the data adequately violates the statistical model of an I(1) or I(0) series, and can induce trending means and variances, making testing less reliable. Certainly, a relatively long time series is needed to make such factors obvious, but they operate even within post-war quarterly data (see e.g. Ermini and Hendry 1991). Moreover, changes in means and variances over time are very apparent in nominal time series, and can confuse attempts to determine co-integration. Granger and Hallman (1991) analyse general transformations in I(1) time series, and Chapter 4 below explores formal statistical tests of hypotheses about the degree of integration of individual time series.
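The contrast between absolute and relative errors can be reproduced with artificial data. The sketch below does not use the Friedman and Schwartz series: it assumes a hypothetical process in which $\log Y_t$ is a random walk with drift, with growth-rate mean `mu = 0.02` and standard deviation `sigma = 0.03` chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
T = 400
mu, sigma = 0.02, 0.03   # hypothetical mean and s.d. of the growth rate

# log Y_t is a random walk with drift, so Delta log Y_t ~ IN(mu, sigma^2).
log_y = np.cumsum(mu + sigma * rng.standard_normal(T))
y = np.exp(log_y)

def first_vs_second_half_sd(series):
    """Standard deviations of the first and second halves of a series."""
    half = len(series) // 2
    return series[:half].std(), series[half:].std()

sd_dlevel = first_vs_second_half_sd(np.diff(y))
sd_dlog = first_vs_second_half_sd(np.diff(log_y))
print("s.d. of changes in levels (first half, second half):", sd_dlevel)
print("s.d. of changes in logs   (first half, second half):", sd_dlog)
```

Under these assumptions the changes in levels become far more variable as the level rises, while the changes in logarithms keep roughly the same spread, which mirrors the pattern described for Figs. 1.8 and 1.9.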
1.7. Examples: Typical ARMA Processes

Figures 1.10-1.20 present graphs of typical examples of series generated by special cases of ARMA(1,1) processes. For ease of comparison, each series is computer-generated using the same set of 200 observations on normally distributed white-noise errors $\varepsilon_t \sim IN(0, 1)$ with $u_0 = 0$. The data generation processes are:

Fig. 1.10  $u_t = \varepsilon_t$  [white noise]
Fig. 1.11  $u_t = \varepsilon_t + 0.8\varepsilon_{t-1}$  [MA(1), stationary]
Fig. 1.12  $u_t = \varepsilon_t - 0.8\varepsilon_{t-1}$  [MA(1), stationary]
Fig. 1.13  $u_t = 0.5u_{t-1} + \varepsilon_t$  [AR(1), stationary]
Fig. 1.14  $u_t = 0.5u_{t-1} + \varepsilon_t + 0.8\varepsilon_{t-1}$  [ARMA(1,1), stationary]
Fig. 1.15  $u_t = 0.5u_{t-1} + \varepsilon_t - 0.8\varepsilon_{t-1}$  [ARMA(1,1), stationary]
Fig. 1.16  $u_t = 0.9u_{t-1} + \varepsilon_t$  [AR(1), stationary]
Fig. 1.17  $u_t = 0.9u_{t-1} + \varepsilon_t + 0.8\varepsilon_{t-1}$  [ARMA(1,1), stationary]
Fig. 1.18  $u_t = 0.99u_{t-1} + \varepsilon_t$  [AR(1), stationary]
Fig. 1.19  $u_t = 1.00u_{t-1} + \varepsilon_t$  [AR(1), non-stationary]
Fig. 1.20  $u_t = 1.01u_{t-1} + \varepsilon_t$  [AR(1), non-stationary]
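Series of the kind listed above can be generated from one common set of errors. The sketch below is an illustration, not the code used to draw the figures: the function `simulate_arma11`, the seed, and the convention that the pre-sample error is zero are assumptions introduced here.

```python
import numpy as np

def simulate_arma11(ar, ma, eps, u0=0.0):
    """Generate u_t = ar * u_{t-1} + eps_t + ma * eps_{t-1}, t = 1, ..., len(eps).

    Assumption: the pre-sample error eps_0 is set to zero.
    """
    u = np.empty(len(eps))
    u_prev, eps_prev = u0, 0.0
    for t, e in enumerate(eps):
        u[t] = ar * u_prev + e + ma * eps_prev
        u_prev, eps_prev = u[t], e
    return u

rng = np.random.default_rng(2024)
eps = rng.standard_normal(200)   # one common set of IN(0, 1) errors

# (AR, MA) pairs matching the eleven data generation processes listed above.
specs = [(0.0, 0.0), (0.0, 0.8), (0.0, -0.8), (0.5, 0.0), (0.5, 0.8), (0.5, -0.8),
         (0.9, 0.0), (0.9, 0.8), (0.99, 0.0), (1.0, 0.0), (1.01, 0.0)]
series = {spec: simulate_arma11(*spec, eps) for spec in specs}

for spec, u in series.items():
    print(spec, "range:", round(float(u.min()), 2), "to", round(float(u.max()), 2))
```

Using the same `eps` for every specification isolates the effect of the AR and MA coefficients; with `ar = 1.0` and `ma = 0.0` the output is simply the cumulative sum of the errors.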
Figures 1.10-1.20 (horizontal axis: Observation):
FIG 1.10. AR = 0.0; MA = 0.0
FIG 1.11. AR = 0.0; MA = 0.8
FIG 1.12. AR = 0.0; MA = -0.8
FIG 1.13. AR = 0.5; MA = 0.0
FIG 1.14. AR = 0.5; MA = 0.8
FIG 1.15. AR = 0.5; MA = -0.8
FIG 1.16. AR = 0.9; MA = 0.0
FIG 1.17. AR = 0.9; MA = 0.8
FIG 1.18. AR = 0.99; MA = 0.0
FIG 1.19. AR = 1.00; MA = 0.00
FIG 1.20. AR = 1.01; MA = 0.00
A process such as that in Fig. 1.19, an AR(1) with a unit root, is a random walk and may also be expressed as ARIMA(0,1,0).

The scales on the graphs in Figs. 1.10-1.20 are not identical; for the non-stationary processes, in particular, the graphs show very wide movements relative to those of the stationary series. Non-stationary processes with roots strictly greater than unity grow very quickly even where those roots are quite close to 1, as can be seen from Fig. 1.20, an AR(1) with a root in the autoregressive part of 1.01. The stationary processes in Figs. 1.10-1.18 have unconditional means of zero and finite unconditional variances. They are 'tied' to this zero mean in the sense that deviations from it cannot accumulate indefinitely. By contrast, the process with a single root of exactly unity (Fig. 1.19) has an unconditional variance which increases over time and will tend to wander widely (see equation (7)) with an unbounded expected crossing time of the origin. The process with a root greater than unity (Fig. 1.20) is explosive and will tend to either $+\infty$ or $-\infty$ [...], and $u_t$ may be approximated by an MA(n) process with increasing accuracy as $n \to \infty$. If $\alpha = 1$, however, the first term does not disappear, and the approximation fails; this follows from the failure of the stationarity condition stated above. When $\alpha = 1$,

$$u_t = u_{t-n} + \sum_{i=0}^{n-1} \varepsilon_{t-i},$$
$$u_t = u_{t-n} + \sum_{i=0}^{n-1} \varepsilon_{t-i},$$

so that $u_t$ is the sum of a starting value, $u_{t-n}$, and all the errors accruing between $t - n + 1$ and $t$. This representation of the process $\{u_t\}$ as a sum of past contributions is the source of the relationship between integration in this time-series sense and integration in the integral calculus, where the integral of a function may be thought of as the limit of a sum of discrete areas under a curve. Figure 1.19 is the cumulative sum, or discrete integral, of the errors recorded in Fig. 1.10.

Many economic time series have been modelled using ARMA or ARIMA processes, and models of these types will be used frequently in the following chapters in describing the methods and tests. Priestley (1989) provides examples of other types of models that may be used to characterize non-stationary processes.
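The link between the AR(1) recursion and the cumulative sum can be checked numerically. The following is a minimal sketch rather than anything taken from the text: it assumes the AR(1) form $u_t = \alpha u_{t-1} + \varepsilon_t$ with a zero starting value and standard normal errors, and the function name `simulate_ar1`, the seed, and the sample size are illustrative choices.

```python
import numpy as np

def simulate_ar1(alpha, errors, u0=0.0):
    """Generate u_t = alpha * u_{t-1} + e_t from a given error sequence."""
    u = np.empty(len(errors))
    prev = u0
    for t, e in enumerate(errors):
        prev = alpha * prev + e
        u[t] = prev
    return u

rng = np.random.default_rng(0)        # arbitrary seed, for reproducibility only
errors = rng.standard_normal(200)     # assumed N(0, 1) errors

stationary = simulate_ar1(0.5, errors)    # root inside the stationary region
random_walk = simulate_ar1(1.0, errors)   # unit root
explosive = simulate_ar1(1.01, errors)    # root slightly above unity

# With alpha = 1 and u_0 = 0 the recursion is the cumulative sum of the errors.
assert np.allclose(random_walk, np.cumsum(errors))

print("sample variance, alpha = 0.5 :", stationary.var())
print("sample variance, alpha = 1.0 :", random_walk.var())
print("sample variance, alpha = 1.01:", explosive.var())
```

Feeding the same errors to all three processes mirrors the design of the figures, so that differences between the series reflect the autoregressive root alone.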
1.8. Empirical Time Series: Money, Prices, Output, and Interest Rates

Figure 1.21 graphs the logarithms of quarterly, seasonally adjusted, nominal M1 and prices (the implicit deflator of total final expenditure, TFE) in the UK over the period 1963-89. The series (denoted $\log M_t$ and $\log P_t$) have strong trends and are relatively smooth, although their growth rates alter perceptibly around 1974 and again around 1980. Such data are not unlike realizations from highly autoregressive (I(1)) processes. Figure 1.22 shows their first differences $\Delta\log(M_t)$ and $\Delta\log(P_t)$. These are more erratic but are still highly autocorrelated. The growth
FIG 1.21. Time series of money (M1) and prices (implicit deflator of total final expenditure) in the UK, seasonally adjusted, in logs
FIG 1.22. Time series of $\Delta\log M_t$ and $\Delta\log P_t$
rate of $M$ appears to have increased over time, whereas that of $P$ has fallen, especially after 1980. These data do not seem to be stationary, although the graphs by themselves do not reveal the source of the non-stationarity.

Next, Fig. 1.23 shows the behaviour of logs of the real money supply ($\log(M_t/P_t)$) and real TFE ($\log(Y_t)$). It might have been anticipated from Fig. 1.21 that $\log(M_t)$ and $\log(P_t)$ moved sufficiently closely over the whole sample for this differential to be stationary, but Fig. 1.23 shows that the real money supply is non-stationary. The formal apparatus of testing for co-integration developed in Chapter 7 is designed to detect such relationships statistically. By way of contrast, $\log(Y_t)$ looks more like a series with a constant linear trend, subject to perturbations in 1973/4 and 1979/80.

In economic terms, surprising features of Figs. 1.22-1.23 are the low pairwise correlations between $\Delta\log(M_t)$ and $\Delta\log(P_t)$, and between $\log(M_t/P_t)$ and $\log(Y_t)$, respectively. However, such results have no implications for the existence or otherwise of well-defined relationships between these variables. Monetary theory suggests that the opportunity cost of holding money is an important determinant of the demand for money, so Fig. 1.24 shows the time series of the interest rate ($R_t$, a three-month local authority bill rate adjusted for financial innovation) and the rate of inflation, plotted in units that maximize their apparent correlation. The series $\{R_t\}$ also seems to be non-stationary, but with a different time profile from the other series. In particular, it is much less smooth than the other level series, but less erratic than their changes.

Finally, Fig. 1.25 shows $\Delta\log(Y_t)$ and $\Delta R_t$. These are possibly weakly stationary, although both appear to have higher variances in the middle of the sample than at the ends. However, neither is highly autocorrelated, nor do they drift noticeably in any direction. We will analyse the four series $\log(M_t)$, $\log(P_t)$, $\log(Y_t)$, and $R_t$ as a system in later chapters. (See Hendry and Ericsson (1991b), who provided the data.)
FIG 1.23. Time series of real money ($\log M_t/P_t$) and real TFE ($\log Y_t$)
FIG 1.24. Time series of a three-month interest rate ($R_t$) and the rate of inflation ($\Delta\log P_t$) in the UK
FIG 1.25. Time series of $\Delta\log Y_t$ and $\Delta R_t$
1.9. Outline of Later Chapters

Chapter 2 discusses dynamic models for stationary processes. This allows us to introduce, in a familiar context, a number of considerations which will prove important later. Various equivalent transformations of linear autoregressive-distributed lag models are considered, especially error-correction, Bewley, and Bardsen forms. The role of expectations in stationary processes is also investigated and is related to the absence of weak exogeneity for the parameters of the economic agents' decision functions.

Chapter 3 then considers the analysis of I(1) variables, and explores the concepts of unit roots, non-stationarity, orders of integration, and near integration. The behaviour of least-squares estimators applied to
spurious relationships is investigated and a number of results established for Wiener processes (see Phillips 1987a). Univariate tests for unit roots are discussed in Chapter 4, and the formal definitions in Chapter 3 are related to the properties of integrated series. Monte Carlo results illustrate the various distributions. Extensions to multiple unit roots and seasonal data are considered, and several examples are described in detail.

Chapter 5 moves on to the topic of co-integration. Following a bivariate example and formal definitions, the Granger Representation Theorem is described, linking co-integration to error correction, and clarifying the status of other representations such as common trends. The original Engle-Granger two-step estimator of the co-integrating relationship is analysed. Chapter 6 first considers inconsistent regressions sometimes used in orthogonality tests; the analysis then turns to distributions of estimators in dynamic regressions with I(1) data, based on the results in Sims, Stock, and Watson (1990), and is illustrated by a number of examples.

Chapter 7 discusses testing for co-integration. A range of tests is considered, based on testing for a unit root in the residuals from the static regression. While widely used, such tests have drawbacks, and Monte Carlo experiments are used to illustrate some of these. Tests based on single-equation dynamic models are also considered.

Finally, in Chapter 8, co-integration in systems of equations is analysed. Linear co-integrated systems are expressed in error-correction form, and maximum likelihood estimation and inference for co-integrating vectors are discussed, focusing on the approach proposed by Johansen (1988). A range of extensions is considered, as are various other estimators. The analysis is again illustrated by a number of examples and simulation experiments.
Appendix

Equation (11)

To prove (11), we need to construct a random variable $X_t$, where
If
then, by the Liapunov central limit theorem,

The proof of (11) is in three steps. First, consider (from (6)) the sample mean:
and
Thus $X_t \sim \mathrm{ID}(0, \sigma_t^2)$, as required. Further, noting that

and using normality of $\varepsilon_t$

and all the conditions of the Liapunov theorem are satisfied. Therefore,

Finally, using the results above, and noting that $\bar{y} = T\bar{X}$

Since $\bar{y}/\sqrt{T} \Rightarrow \int_0^1 W(r)\,dr$ from results above, we have that $\bar{y}/\sqrt{T}$ converges to both $\int_0^1 W(r)\,dr$ and to $N(0, 1/3)$. Therefore

The derivations of later results follow similar lines.
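The limiting variance of 1/3 can be checked by simulation. The sketch below is illustrative and not part of the original proof: it assumes a driftless random walk with $y_0 = 0$ and independent N(0, 1) errors, and the function name `scaled_sample_means`, the seed, the number of replications, and the sample size are arbitrary choices.

```python
import numpy as np

def scaled_sample_means(n_reps, T, rng):
    """Return ybar / sqrt(T) for n_reps driftless random walks of length T."""
    errors = rng.standard_normal((n_reps, T))   # assumed N(0, 1) errors
    walks = np.cumsum(errors, axis=1)           # y_t = e_1 + ... + e_t, with y_0 = 0
    return walks.mean(axis=1) / np.sqrt(T)

rng = np.random.default_rng(12345)              # arbitrary seed
draws = scaled_sample_means(n_reps=10000, T=400, rng=rng)

print("simulated mean    :", draws.mean())      # should be close to 0
print("simulated variance:", draws.var())       # should be close to 1/3
```

For finite $T$ the exact variance under these assumptions is $(T+1)(2T+1)/(6T^2)$, which approaches 1/3 as $T$ grows, so a small upward discrepancy at moderate $T$ is to be expected.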
2
Linear Transformations, Error Correction, and the Long Run in Dynamic Regression

We begin by considering the properties of linear autoregressive-distributed lag (ADL) models for stationary data processes. Transformations of the ADL model to error correction and to various other forms are described. We discuss the estimation of long-run multipliers from dynamic models, and the equivalence of the estimates of these multipliers (and their variances) from any of several different forms. Finally, we consider inference about long-run multipliers where expectational variables are present, and the potential problems are shown to be special cases of the general invalidity of inference when the regressors are not weakly exogenous for parameters of interest.

In later chapters, we will concentrate on the importance of integrated processes for econometric modelling, and in particular on the detection of the stochastic trends embodied in integrated processes, on identifying series that share stochastic trends and therefore satisfy long-run equilibrium relations, and on the implications of such properties for the estimation of economic relationships. Before beginning to explore these concepts, however, there are a number of aspects of the use and specification of dynamic econometric models which can be reviewed without a thorough knowledge of integrated processes, and which will be useful in later discussion. The calculation of the parameters of long-run relationships from estimated models, the interpretation of linear transformations, and the forms of particular models such as the error-correction model are among these topics.
The variables used in this chapter may all be treated as being stationary, but readers who are familiar with the concepts examined in later chapters will recognize that the same results apply if the variables are co-integrated.

One simple but fundamental problem that we address is the following: given a variable which in general depends upon its own past and on the values of various exogenous variables, how can we determine the long-run equilibrium relationship between the endogenous variable and the exogenous variables? If an endogenous variable $y_t$ is expressed as a
function only of the value of a set of exogenous variables $z_t$ at the same point in time, the effect of $z_t$ on $y_t$ is immediate and complete; however, if a lag distribution applies to every variable in the model, the long-run effect must be derived as a function of all the lag distributions. Moreover, there are other types of information that can be revealed by a dynamic equation; any of a number of equivalent forms will provide the same information about, say, short-run and long-run adjustment, but different forms of the equation will reveal different types of information conveniently.

We will consider a number of ways in which to estimate long-run multipliers from dynamic regression models, and in doing so will examine several different types of model. After describing the general autoregressive-distributed lag (ADL) model from which the other models are derived, we first concentrate upon the error-correction model, in which the terms representing the extent of deviation from equilibrium are explicitly present in the estimated equation, and which therefore immediately displays information about the adjustment that a process makes to a deviation from some long-run equilibrium.

This chapter will emphasize two important points about linear transformations. First, each of the transformations contains precisely the same information: the estimated values of long-run multipliers, hypothesis test statistics, and explanatory powers of the differently transformed models are all identical. The choice of transformation can be made purely on the basis of convenience, and we will consider which ones are convenient for different purposes. The second point is a corollary of the first, but is worth emphasizing: the estimates of short-run adjustment parameters from the error-correction model do not depend upon the parameter $\theta$ used in defining the error-correction term $y_{t-1} - \theta z_{t-1}$, as long as other levels terms are present to allow for adjustment to the chosen parameter. In particular, a value of unity for $\theta$ may be chosen, leading to what is called 'homogeneity' (an error-correction term of $y_{t-1} - z_{t-1}$), as long as the necessary extra terms are present.

Next, we consider several other transformations of the autoregressive-distributed lag model, due to Bewley (1979) (and discussed by Wickens and Breusch 1988) and Bardsen (1989). Each of these transformations can be related to the error-correction transformation, and we indicate some of the implications of this fact for estimation using one or other of the transformations. Finally, we will discuss some potential difficulties in the estimation of long-run equilibrium relations and their interpretation, following McCallum (1984), Kelly (1985), and Hendry and Neale (1988).

While this chapter deals explicitly with stationary (I(0)) processes, many of the models considered can be used with co-integrated processes as well, as explored in Chapters 5 and 6. In particular, the equivalence of these transformations (in the sense that each form can be derived
from any other by operating linearly on the variables) is relevant when dealing with the Granger Representation Theorem, also discussed in Chapter 5. This equivalence has implications for derivations of the distributions of coefficient estimates in co-integrated systems. In a particular transformation, for example, the variables may all be integrated of order zero, so that the asymptotic theory of stationary processes applies to the distributions of the estimates. Such a parameterization might be convenient for inference if, for example, the original form contained both I(1) and I(0) variables, because its information content is identical to that of the original parameterization. These issues are considered at length in Chapter 6, and the analysis in this chapter provides useful background for that discussion.
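The equivalence of transformations can be illustrated numerically. The following is a minimal sketch under stated assumptions, not a procedure taken from the text: it assumes a first-order model with a single regressor, $y_t = \alpha_0 + \alpha_1 y_{t-1} + \beta_0 x_t + \beta_1 x_{t-1} + \varepsilon_t$, with arbitrary parameter values, and fits by least squares both the levels form and an error-correction form containing the homogeneous term $y_{t-1} - x_{t-1}$ together with the extra level $x_{t-1}$. The helper `ols` and all variable names are illustrative.

```python
import numpy as np

def ols(y, X):
    """Least-squares coefficients and residuals for y = X b + e."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

rng = np.random.default_rng(1)   # arbitrary seed
T = 500
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = 0.7 * x[t - 1] + rng.standard_normal()
    # Assumed parameter values: alpha0 = 1, alpha1 = 0.6, beta0 = 0.5, beta1 = 0.1
    y[t] = 1.0 + 0.6 * y[t - 1] + 0.5 * x[t] + 0.1 * x[t - 1] + 0.5 * rng.standard_normal()

y0, y1 = y[1:], y[:-1]           # y_t and y_{t-1}
x0, x1 = x[1:], x[:-1]           # x_t and x_{t-1}
const = np.ones(T - 1)

# Levels (ADL) form: y_t on 1, y_{t-1}, x_t, x_{t-1}
adl, adl_resid = ols(y0, np.column_stack([const, y1, x0, x1]))
k_adl = (adl[2] + adl[3]) / (1.0 - adl[1])

# Error-correction form: dy_t on 1, dx_t, (y_{t-1} - x_{t-1}), x_{t-1}
ecm, ecm_resid = ols(y0 - y1, np.column_stack([const, x0 - x1, y1 - x1, x1]))
k_ecm = 1.0 - ecm[3] / ecm[2]

print("long-run multiplier from ADL form:", k_adl)
print("long-run multiplier from ECM form:", k_ecm)
print("residuals identical:", np.allclose(adl_resid, ecm_resid))
```

Because the second regression is an exact linear reparameterization of the first, the residuals and the implied long-run multiplier agree up to rounding error, while the coefficient on the error-correction term equals the estimated $\alpha_1 - 1$.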
2.1. Transformations of a Simple Model

Before beginning a general treatment, we consider the first-order linear autoregressive-distributed lag model, denoted ADL(1,1), as an example and derive several linear transformations of it. Each transformation is equivalent in the sense that each implies the same relationship between exogenous and endogenous variables. The ADL(1,1) is

where $\varepsilon_t \sim \mathrm{IID}(0, \sigma^2)$ and $|\alpha_1| < 1$ (see Hendry, Pagan, and Sargan 1984).

First consider a static equilibrium defined, as above, as an environment in which all change has ceased, recalling that we are treating $(y_t, x_t)$ as jointly stationary. The long-run values are given by the unconditional expectations of the form $E(y_t)$ in (1a). Defining $y^* = E(y_t)$ and $x^* = E(x_t)\ \forall t$, we have, since $E(\varepsilon_t) = 0$,

and hence
or
Then $k_1$ is the long-run multiplier of $y$ with respect to $x$. Now subtract $y_{t-1}$ from both sides of (1a) and then add and subtract $\beta_0 x_{t-1}$ on the right-hand side to get¹

1 Equation (1a) is invariant to such linear transformations which preserve the error process $\{\varepsilon_t\}$.
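The algebra described in this passage can be sketched as follows. The display is a reconstruction under an assumption, not a quotation: it supposes that (1a) is the standard ADL(1,1) with intercept $\alpha_0$ and coefficients $\alpha_1$, $\beta_0$, $\beta_1$, which is consistent with the symbols used in the surrounding text; the intercept $k_0$ is a label introduced here for illustration.

```latex
\begin{align*}
y_t &= \alpha_0 + \alpha_1 y_{t-1} + \beta_0 x_t + \beta_1 x_{t-1} + \varepsilon_t
  && \text{(assumed form of (1a))} \\
y^* &= \alpha_0 + \alpha_1 y^* + (\beta_0 + \beta_1)\, x^*
  && \text{(taking expectations, } E(\varepsilon_t) = 0\text{)} \\
y^* &= \frac{\alpha_0}{1 - \alpha_1} + \frac{\beta_0 + \beta_1}{1 - \alpha_1}\, x^*
     = k_0 + k_1 x^*
  && \text{(long-run solution, } |\alpha_1| < 1\text{)} \\
\Delta y_t &= \alpha_0 + (\alpha_1 - 1)\, y_{t-1} + \beta_0\, \Delta x_t
     + (\beta_0 + \beta_1)\, x_{t-1} + \varepsilon_t
  && \text{(subtract } y_{t-1}\text{; add and subtract } \beta_0 x_{t-1}\text{)}
\end{align*}
```

Under this assumed form, $k_1 = (\beta_0 + \beta_1)/(1 - \alpha_1)$, and the error term $\varepsilon_t$ is unchanged by the transformation, as the footnote requires.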
Alternatively, we could have added and subtracted $(\beta_0 + \beta_1)x_{t-1}$ on the right side, to get

All of these equations imply the same relationship, because any one can be derived from another without violating the equality. In equations (1c) and (1d), however, terms representing the discrepancy between $y_{t-1}$ and $x_{t-1}$ or between $y_{t-1}$ and $k_1 x_{t-1}$ appear explicitly; the coefficient (the same for each form) on these terms can be taken as a measure of the speed of adjustment of $y$ to a discrepancy between $y$ and $x$ in the previous period. We examine such error-correction models in detail in the next section.

Equation (1b) is similar to (1c) and (1d) in that the same information appears explicitly as a coefficient; that is, (

... where $\delta_i$ is not equal to one. The $\theta_j$ are the equilibrium multipliers given above: $\theta_j = \beta_j(1)/\alpha(1)$; and if the $\theta_j$ were known, they could be inserted directly into the ECM terms in (3) and the terms in lagged $x$ could be eliminated.³ In terms of the parameters of (3),
Since the ECM is simply a linear transformation of the ADL model, we might ask what its distinguishing feature is. The answer is that in the ECM formulation, parameters describing the extent of short-run adjustment to disequilibrium are immediately provided by the regression. Although the form in (3) is analytically convenient, it is not a useful empirical specification. In practice, a single error-correction term at lag $r$ is preferable, as it induces a more interpretable and more nearly orthogonal parameterization.

The error-correction mechanism will be of particular value where the extent of an adjustment to a deviation from equilibrium is especially interesting. It is clear that the ECM provides this information when the error-correction terms are of the form $(y_{t-i} - \sum_j \theta_j x_{jt-i})$, with $\theta_j$ a known parameter. If $\theta_j$ is not known it can be estimated; moreover, an unknown $\theta_j$ can implicitly be allowed for in the error-correction term through the inclusion of extra lags in the $x_j$, without affecting the magnitude of the estimated coefficients $\eta_i$ in (3). Hence these parameters do not need to be estimated at an earlier stage in order to allow us to use the ECM. In fact, an important point in favour of the generalized ECM (3) is that the estimated coefficients on the error-correction terms are unaffected by the incorporation of any constant $\theta$ into the term; this will be proved after we have established some other results which will simplify the proof. The implication is that we can interpret the coefficients $\eta_i$ in (3) directly as adjustments to disequilibrium even though the true disequilibrium term is given by $(y_{t-i} - \sum_j \theta_j x_{jt-i})$ and not by $(y_{t-i} - \sum_j x_{jt-i})$. Hence the use of a generalized ECM does not imply homogeneity ($\theta = 1$) as long as extra lags in the $x_j$ are incorporated, even though the error-correction terms that enter (3) do not explicitly allow for $\theta \neq 1$.

3 Note that this requires
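The invariance claimed here can be illustrated in the simplest case. The sketch below is not the general proof promised in the text: it assumes a first-order model with one regressor and arbitrary parameter values (implying a long-run multiplier of 1.5), and compares the estimated coefficient on $y_{t-1} - \theta x_{t-1}$ across several values of $\theta$, with and without the extra level $x_{t-1}$. The function names `simulate_adl` and `ecm_adjustment` are hypothetical.

```python
import numpy as np

def simulate_adl(T, rng):
    """Simulate an assumed ADL(1,1): y_t = 1 + 0.6 y_{t-1} + 0.5 x_t + 0.1 x_{t-1} + e_t."""
    x = np.zeros(T)
    y = np.zeros(T)
    for t in range(1, T):
        x[t] = 0.7 * x[t - 1] + rng.standard_normal()
        y[t] = 1.0 + 0.6 * y[t - 1] + 0.5 * x[t] + 0.1 * x[t - 1] + 0.5 * rng.standard_normal()
    return y, x

def ecm_adjustment(y, x, theta, extra_level=True):
    """Coefficient on (y_{t-1} - theta * x_{t-1}) in a first-order ECM regression."""
    dy, dx = np.diff(y), np.diff(x)
    y1, x1 = y[:-1], x[:-1]
    cols = [np.ones_like(dy), dx, y1 - theta * x1]
    if extra_level:
        cols.append(x1)              # extra level term absorbs the choice of theta
    coef, *_ = np.linalg.lstsq(np.column_stack(cols), dy, rcond=None)
    return coef[2]

y, x = simulate_adl(400, np.random.default_rng(3))   # arbitrary seed
thetas = (1.0, 1.5, 3.0)
with_level = [ecm_adjustment(y, x, th) for th in thetas]
without_level = [ecm_adjustment(y, x, th, extra_level=False) for th in thetas]

print("with x_{t-1} included   :", with_level)
print("without x_{t-1} included:", without_level)
```

With the extra level included, the three estimates coincide because the regressors span the same space for every $\theta$; once the level is dropped, the estimate depends on the value of $\theta$ imposed.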
2.3. An Example

An example of the use of the error-correction mechanism can be found in Davidson et al. (1978), who use a homogeneous ($\theta = 1$) error-correction mechanism in the modelling of consumers' expenditure. The 'error' to which adjustment is made in the model is the difference between the logarithms of consumption and income, each lagged four quarters. The error-correction term is significant in a wide variety of specifications. In particular, using quarterly seasonally unadjusted data from the United Kingdom, expressed at constant prices over the sample period of 1958(I)-1970(IV), the authors favour the model⁴ (standard errors in parentheses):
where the statistics $z_1$ and $z_2$ are asymptotic $\chi^2$ tests for parameter constancy and serially independent residuals, respectively, with degrees of freedom in parentheses; $\hat{C}_t$ is the fitted value of real consumers' expenditure on non-durable goods and services $C_t$; $Y_t$ is real personal disposable income; $P_t$ is the price deflator for consumption; and $D^0$ is a dummy variable for changes in taxation. The error-correction term has a coefficient that is reasonably substantial as well as statistically significant at conventional levels. The model can readily be derived from an ADL model, noting that $\log(C/Y)_{t-4} = \log C_{t-4} - \log Y_{t-4} = c_{t-4} - y_{t-4}$, using lower-case letters to denote logarithms.

4 The symbol $\Delta_1\Delta_4$ represents the first difference of the fourth difference; e.g. $\Delta_4 \log Y_t - \Delta_4 \log Y_{t-1} = \Delta_1\Delta_4 \log Y_t$.

On the additional assumption that $\Delta_4 c_t$, $\Delta_4 y_t$, and $\Delta_4 p_t$ are stationary, with $E(\Delta_4 c_t) = g_c$, $E(\Delta_4 y_t) = g_y$, and $E(\Delta_4 p_t) = p_a$ (the annual rate of inflation), then, taking expectations of the equation above for fixed values of the estimated parameters,

Hence $C^* = kY^*$ where $k = \exp(-5.3 g_y - 1.3 p_a)$, noting that $g_c = g_y$ given the proportional long-run solution. This form of solution is consistent with the life-cycle hypothesis (see Deaton and Muellbauer 1980), in which case the coefficients of $g_y$ and $p_a$ should correspond to the negatives of the annual wealth-income and liquid asset-income ratios. The resulting values seem sensible. For positive real growth or inflation, $k$

... ) show. In demonstrating these points we will make use of the general structure that Wickens and Breusch use to compare linear transformations of regression models.

Take as a basic structure the regression model

where the $X$ matrix contains lagged (but not contemporaneous) $y$ as well as contemporaneous and lagged $x$ terms, and $\gamma$ is a $k \times 1$ vector. Define this as corresponding to the ADL model (2). The representations (4) and (5) involve transforming the matrices $y$ and $X$ by a transformation matrix $A$, such that, following Wickens and Breusch,
so that

For example, take $m = n = 2$ and $p = 1$ in (2), so that the matrix of the transformation to the Bardsen form (5) is

[Equation (10a): the $7 \times 7$ transformation matrix $A$, with entries 0, 1, and $-1$]
since $x_t' = [y_t, 1, y_{t-1}, y_{t-2}, x_t, x_{t-1}, x_{t-2}]$ maps onto a vector whose leading elements are $\Delta y_t$, 1, $\Delta y_{t-1}$, and $\Delta x$ ...

... 1, then we have

From (1) it is clear that $\Delta y_t$ is no longer stationary: it depends not only upon the stationary process $u_t$, but also upon the non-stationary process $y_{t-1}$ (since $\rho - 1 > 0$). Hence an AR(1) process with a coefficient of 1 is I(1), but the same process with a coefficient of 1.01 is not, since differencing will not reduce this process to stationarity.

Many economic time series may contain an exact unit root if we consider logarithmic transformations of the form routinely applied to economic time series. Otherwise, roots very close to, but slightly greater than, unity imply non-stationary series that are not I($d$) for any $d$. Roots slightly less than unity generate near-integrated series. Such processes will tend to be difficult to distinguish from those with roots of exactly unity on moderately sized samples; such processes are discussed in Chapter 3. Roots substantially greater than unity, by contrast, will be easily detected, as the explosive character of the series will be clear with even fairly small samples.

Consider the simplest data-generation process within which we can discuss tests for unit roots:
If one were testing the true hypothesis $H_0: \rho = \rho_0$ for $\rho_0 < 1$, the test would be easily performed. Running the regression (2), the t-statistic $(\hat{\rho} - \rho_0)/\mathrm{SE}(\hat{\rho})$ has, asymptotically, a standard normal distribution and can be compared with tables of significance points for $N(0, 1)$. In small samples the statistic is approximately t-distributed, although the coefficient estimate $\hat{\rho}$ is biased downward slightly.

For $\rho_0 = 1$, however, this result no longer holds. The distribution of the test statistic just given is not asymptotically normal, or even symmetric. Critical values have been tabulated by D. A. Dickey and are reported in, e.g., Fuller (1976). It is instructive to examine these in detail, and they are recorded as Tables 4.1 and 4.2.

The critical values in Fuller's tables pertain to each of three different models: it is important to note at the outset that, as in many other instances, the distributions of test statistics obtained depend not only on the data-generation process, but also on the model with which we investigate it. For the time being, we will consider three possible models:
The null hypothesis is that $\rho_i = 1$ for $i = a, b, c$. The applicability of each model depends on what is known about the DGP, since we want to construct similar tests (that is, tests for which the distribution of the test statistic under the null hypothesis is independent of nuisance parameters in the DGP). If a test is not similar, then the appropriate critical values may depend upon unknown nuisance parameters (e.g. a constant), which will invalidate standard inferences. We will return to the similarity of tests below. For the moment, we will follow much of the literature on the topic in assuming that (2) is the DGP, in which case the issue does not arise since (2) contains no nuisance parameters.

Another formulation of the DGP deals with a potential difficulty that arises from (2) concerning the status of the nuisance parameters under the alternative $H_1: \rho < 1$. Reconsider (2) when there is an intercept

... arbitrary $y_0$ ... Extension of (3c) necessary

Thus, for example, in case (i), if the model is given by (3c), the appropriate critical values are given by Tables 4.1(c) and 4.2(c). The same tables can be used to conduct inference in (iii), despite a non-zero value of $\mu$ in the DGP, because (3c) yields a similar test. Similarity implies that the distributions of $\hat{\rho}$ and its associated t-statistic are not affected by the value, under the null, of the nuisance parameter, and the critical values are the same as the ones that would apply for $\mu = 0$, namely, those in Tables 4.1(c) and 4.2(c).

There are a number of noteworthy additional points. In case (i) there are no nuisance parameters, so that similarity is a trivial property. In general, as this summary suggests, a similar test having a Dickey-Fuller distribution requires that the model used contain more parameters than the DGP. In order to have a similar test for (iv), one would then need a model with a term such as $t^2$, necessitating another block of critical values in each of Tables 4.1 and 4.2. In case (ii), for example, we need at least model (3b) (with a constant) to allow for the unknown starting value. In case (iii) we have an unknown constant and need the trend term in model (3c) to allow for its effect.

Each of these similar tests is also exact in finite samples, provided appropriate critical values are available. In general, however, it will be necessary to abandon exact tests in order to use variants of the Dickey-Fuller test where there are more unknown parameters. These parameters can typically be estimated, so that asymptotically they can be accounted for and a test provided. Again, Kiviet and Phillips offer general exact and similar tests for DGPs where the dynamics are restricted to first-order, as well as demonstrating the similarity of the tests just mentioned.

In the case of exact parameterizations, such as case (iii) with model (3b), we do not have similar tests with the Dickey-Fuller distributions. However, as West (1988) showed, the t-statistics in the exactly parameterized case (with exogenous items such as a constant in the DGP) are asymptotically normal, just as are t-statistics used for standard problems. In finite samples, however, the Dickey-Fuller distributions may be a better approximation than the normal distribution. We will explore this asymptotic normality further in Chapter 6 below.
1 Critical values are those corresponding to the model used in Table 4.1 or 4.2.
4.2. General Dynamic Models for the Process of Interest

The first of the methods for allowing richer dynamics in the DGP of the process of interest, $\{y_t\}$, was developed concurrently with the test that we have already described for a unit root in the AR(1) model, and is reported in Fuller (1976). These more general methods yield test statistics that have the same limiting distributions as those already discussed, because they are based on consistent estimates of 'nuisance' parameters. Hence we may use the last rows of Tables 4.1(a)-(c) or 4.2(a)-(c) for inference with these statistics in large samples, but in small samples percentage points of their distributions will not in general be the same as for those applicable under the strong assumptions of the simple Dickey-Fuller model. When $y_t$ follows an AR($p$) process,

a test can be constructed with the regression model:
The coefficient $\rho$ is used to test for a unit root, and $T(\hat{\rho} - 1)$ and $(\hat{\rho} - 1)/\mathrm{SE}(\hat{\rho})$ have the limiting distributions tabulated in Tables 4.1(a) and 4.2(a) for $T \to \infty$. Moreover, just as in the case of an AR(1) process, we can extend this regression model to allow for the possibility that the data-generation process contains a constant (drift) term or a deterministic time trend. Again, for suitably modified regression models, the asymptotic distributions of the statistics based on $\hat{\rho}$ are those given in Tables 4.1(b)/(c) and 4.2(b)/(c) for $T \to \infty$. These procedures are called 'augmented' Dickey-Fuller (ADF) tests.

The aim in modifications such as these to the simpler form of the Dickey-Fuller test is to use lagged changes in the dependent variable to capture autocorrelated omitted variables which would otherwise, by default, appear in the (necessarily autocorrelated) error term. With the additional lagged terms it will be possible, if the DGP has the form of (4), to produce a model (5) in which asymptotically the error terms are white noise, because the nuisance parameters are known asymptotically and the terms involving them may be removed from the error term. With white-noise errors, the asymptotic Monte Carlo critical values given in the first two tables may be applied. Moreover, the asymptotic distribution of the coefficient on the $y_{t-1}$ term in (5) is not affected by the inclusion of the additional $\Delta y_{t-i}$ terms. If $y_t$ is I(1), the differenced
terms are all I(0) and appropriate scaling ensures that the variance-covariance matrix is asymptotically block-diagonal. (That is, all cross-product terms of I(0) and I(1) variables in the matrix are asymptotically negligible.) It is this asymptotic orthogonality that drives the result, much as, in a standard regression model, one uses the orthogonality of the information matrix to prove the statistical independence of the estimated coefficient vector from the estimate of the standard error. The asymptotic theory and the issue of 'appropriate' scaling are discussed later in this chapter and in Chapter 6.
By allowing the DGP to take the form (4) rather than the much more restrictive AR(1) form (3), we have expanded the class of models to which we can validly apply unit-root tests of this type. Note that, as it will generally be the case that $p$ is unknown even where $y_t$ is strictly an AR($p$) process, it is generally safer to take $p$ to be a fairly generous number; if too many lags are present in (5), the regression is free to set them to zero at the cost of some loss in efficiency, whereas too few lags implies some remaining autocorrelation in (5) and hence the inapplicability of even the asymptotic distributions in Tables 4.1 and 4.2. One can, of course, perform tests for autocorrelation on the estimated residuals from (5) in order to check the acceptability of the premise that these residuals are white noise. Alternatively, model selection procedures can be used to choose $p$, and test for a unit root, jointly (see Hall 1990).
We have, therefore, a class of tests for the unit root which can validly be applied to series that follow AR($p$) processes containing no more than one unit root.
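The augmented regression described above can be sketched in a few lines. The function name `adf_regression`, its arguments, and the choice to write the regression in differences (regressing $\Delta y_t$ on $y_{t-1}$ and $k$ lagged differences, which is algebraically equivalent to a regression in levels) are assumptions made for illustration and are not taken from the text. The sketch returns only the estimate and the $t$-type statistic; critical values must still come from the Dickey-Fuller tables.

```python
import numpy as np

def adf_regression(y, k, deterministic="none"):
    """OLS for  dy_t = (rho - 1) * y_{t-1} + sum_{j=1..k} g_j * dy_{t-j} [+ c + d*t] + e_t.

    Hypothetical helper: returns (rho_hat, t_stat) with
    t_stat = (rho_hat - 1) / SE(rho_hat).
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                        # dy[i] = y[i+1] - y[i]
    lhs = dy[k:]                           # dy_t for the usable sample
    cols = [y[k:-1]]                       # y_{t-1}
    # Lagged differences dy_{t-j}, aligned with lhs.
    cols += [dy[k - j:dy.size - j] for j in range(1, k + 1)]
    n = lhs.size
    if deterministic in ("c", "ct"):       # constant (drift) term
        cols.append(np.ones(n))
    if deterministic == "ct":              # linear time trend
        cols.append(np.arange(1, n + 1, dtype=float))
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, lhs, rcond=None)
    resid = lhs - X @ beta
    s2 = resid @ resid / (n - X.shape[1])  # residual variance
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[0, 0])
    return 1.0 + beta[0], beta[0] / se
```

Choosing a generous `k` costs some efficiency, while too small a `k` leaves autocorrelation in the residuals, as the text notes; checking the residuals for autocorrelation after estimation is a sensible companion step.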
The next natural step is to attempt to extend further the class of series to which we can apply such tests, ideally in such a way as to allow exogenous variables to enter the process as well. Said and Dickey (1984) provide a test procedure valid for a general ARMA process in the errors; Phillips (1987a) and Perron and Phillips (1988) offer a still more general procedure.
While the Said-Dickey approach does represent a generalization of the Dickey-Fuller procedure, it again yields test statistics with the same asymptotic critical values as those tabulated by Dickey and Fuller. The particular advantage of this test is that we can apply it not only to models with MA parts in the errors, but also to models for which (as is typically the case) the orders of the AR and MA polynomials in the error process are unknown. The method involves approximating the true process by an autoregression in which the number of lags increases with sample size.
Begin by assuming that the data-generation process follows:
so that the error term in the autoregression follows an ARMA($p$, $q$), presumed to be stationary and invertible. The DGP can be rewritten as
where $k$ is large enough to allow a good approximation to the ARMA($p$, $q$) process $\{u_t\}$, so that $\{v_t\}$ is approximately white noise. The null hypothesis is again that $\rho = 1$. Said and Dickey show that the test is valid in spite of the facts that $p$ and $q$ are unknown and that the ARMA($p$, $q$) is approximated by an AR process, as long as $k$ increases with the sample size $T$ so that there exist numbers $c$ and $r$, $c > 0$ and $r > 0$, such that $ck > T^{1/r}$ and $T^{-1/3}k \to 0$. Hence $T^{1/3}$ is an upper bound on the rate at which the number of lags, $k$, should be made to grow with the sample size. Ordinary least-squares estimation of the model (6) is proven to yield a consistent estimator of $(\rho - 1)$; the test can then be based on the $t$-type statistic, $(\hat{\rho} - 1)/\mathrm{SE}(\hat{\rho})$, using Table 4.2(a). Clearly, the form of the regression implied by the Said-Dickey test is precisely the same as that of the augmented Dickey-Fuller test.
In this case Table 4.2(a), corresponding to a model containing no drift or trend, is used, but the test can also be adapted to allow for a non-zero drift term $\mu$ in the model. The test is modified only in so far as it is then based not on $y_t$ but on $y_t - \bar{y}$, where $\bar{y} = T^{-1}\sum_{t=1}^{T} y_t$. The regression model (6) remains the same except for the first regressor, which becomes $(y_{t-1} - \bar{y})$, and test statistics are calculated in the same way. By analogy to the earlier results for Dickey-Fuller and augmented Dickey-Fuller tests, it is not surprising that we now refer to Table 4.2(b), corresponding to a model containing a drift term, for the significance points of the (asymptotic) distributions of the statistics.
Monte Carlo studies of test power in models with autocorrelated error processes, described by Dickey et al.
(1986), suggest that the empirical levels of the $T(\hat{\rho} - 1)$ statistics tend to be farther from the nominal test levels than those of the $t$-type statistics. Dickey et al. therefore suggest the use of the $t$-type statistics in these cases. Deviation of nominal from actual test levels is particularly great in DGPs with MA parts such that the MA lag polynomial contains a factor of $(1 - \theta L)$, with $\theta$ near unity. The near-cancellation of such a factor with the factor $(1 - L)$ in the AR lag polynomial (under the null) affects the actual levels of both $T(\hat{\rho} - 1)$ and $t$-type statistics, but is especially serious for the former.
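A short worked step may make the near-cancellation concrete. The notation below ($\varepsilon_t$ for a white-noise innovation, and the initial conditions $y_0 = 0$, $\varepsilon_0 = 0$) is assumed here for illustration and is not taken from the text.

```latex
\begin{align*}
(1 - L)\,y_t &= (1 - \theta L)\,\varepsilon_t
  && \text{unit root in the AR part, MA(1) error} \\
y_t &= \sum_{s=1}^{t} \bigl(\varepsilon_s - \theta\,\varepsilon_{s-1}\bigr)
  && \text{summing from } y_0 = 0 \\
    &= \varepsilon_t + (1 - \theta)\sum_{s=1}^{t-1} \varepsilon_s
  && \text{using } \varepsilon_0 = 0
\end{align*}
```

The random-walk component carries the weight $(1 - \theta)$. As $\theta$ approaches unity this weight shrinks towards zero, so that in a finite sample $y_t$ is hard to distinguish from the white-noise term $\varepsilon_t$, which is one way to understand why a true unit-root null tends to be rejected too often in such DGPs.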
4.3. Non-parametric Tests for a Unit Root
In extending the original tests above to allow for higher-order autocorrelation, we added extra terms to the regression model to account for the
autocorrelation in the residuals that would otherwise be present. By extending the model, it was possible to continue to draw valid inferences from the asymptotic critical values given in Tables 4.1 and 4.2; otherwise it would have been necessary to recompute these critical values for each different DGP, which in turn would require knowledge of the unobservable orders ($p$) of the processes in these underlying DGPs.
In expanding the set of models to which we can apply these tests, our aim is to avoid increasing the number of tables of critical values that we must find and use while nonetheless allowing for quite general DGPs. Phillips (1987a) provides an alternative procedure that largely allows us to do so; our exposition relies on further results reported in Perron (1988) and Phillips and Perron (1988). Rather than taking account of extra elements in the DGP by adding them to the regression model, Phillips suggests accounting for the autocorrelation that will be present (when these terms are omitted) through a non-parametric correction to the standard statistics. That is, while the Dickey-Fuller procedure aims to retain the validity of tests based on white-noise errors in the regression model by ensuring that those errors are indeed white noise, the Phillips procedure acts instead to modify the statistics after estimation in order to take into account the effect that autocorrelated errors will have on the results. Asymptotically, the statistic is corrected by the appropriate amount, and so the same limiting distributions apply. From one perspective, the effect is the same as that of ADF-type tests: we can validly conduct asymptotic inference using Tables 4.1 and 4.2.
This procedure does not, however, require the estimation of additional parameters in the regression model.
The data-generation process that is assumed to hold is
or equivalently
It is important to note, however, that the error term is not being assumed to follow a white-noise process. The conditions that $u_t$ must satisfy in (7a) and (7b) are those listed above in Chapter 3 as conditions (3.16a)-(3.16d) given in Phillips (1987a).
As with the Dickey-Fuller tests, tests of the Phillips type are based upon one of three different regression models, differing only in one case from those used earlier, by centring the trend term:
and
It is easy to calculate from these regressions the coefficient estimates and the '$t$-statistics' for each. For tests of the significance of $\rho_i$, the statistics are then adjusted to reflect autocorrelation in the corresponding $u_{it}$ series. (We will omit subscripts $a$, $b$, or $c$ on $u_t$ to simplify notation.) If we define
and
then the limiting distributions of the test statistics do not depend upon the parameters of the process determining the sequence $\{u_t\}$ if $\sigma^2 = \sigma_u^2$. In the case of test statistics of the Dickey-Fuller (DF) type that we examined earlier, the model is presumed to capture the relevant features of the process in such a way that the errors are independently and identically distributed; the latter is sufficient to guarantee that $\sigma^2 = \sigma_u^2$. Note that the statistics used in the DF-type parametric tests do emerge as special cases of the non-parametric statistics where the estimates of the parameters $\sigma^2$ and $\sigma_u^2$ are equal (i.e. where the estimates $S_u^2$ and $S_{T\ell}^2$, given in (11) and (12) below, are equal).
We will see this more clearly when we examine the non-parametric statistics. In order to do so, we first need consistent estimators of $\sigma^2$ and $\sigma_u^2$. There are a number of possible choices. If $\mu = 0$ in the DGP (7), then the standard estimator from any of (8a), (8b), (8c) will be consistent for $\sigma_u^2$; that is,
where $\hat{u}_t$ represents the residuals from one of (8a), (8b), (8c), above. If $\mu \neq 0$, the estimator is not consistent using the residuals $\{\hat{u}_{at}\}$, but residuals from either of the other two models do yield a consistent estimate.
For the estimator of $\sigma^2$, a consistent estimator can be found at the cost of strengthening the assumptions. First, condition (3.16b) is replaced with the condition that $\sup_t E(|u_t|^{2\beta}) < \infty$ for some $\beta > 2$. Next, a condition must be placed on the lag truncation parameter $\ell$ which will be used in defining the estimator of $\sigma^2$. The condition is that $\ell \to \infty$ as $T \to \infty$, such that $\ell$ is $o(T^{1/4})$. That is, the number of lags used in
estimating autocorrelations of the residuals increases with the sample size, but less quickly than its fourth root. Given these conditions, a consistent estimator of $\sigma^2$ is
The estimator is indexed by the lag truncation parameter $\ell$ to indicate that different choices of $\ell$ will lead to different values. It remains only to specify the residuals to be used in (12), and, as in (11) above, we may choose them from any of (8a), (8b), (8c) if $\mu = 0$. Also as in (11), $\mu \neq 0$ requires that we use the residuals from one of the models that does contain a constant term in order to preserve the consistency of this variance estimate. Evidently the safe strategy is to take residual estimates from (8b) or (8c) in any case where there seems even a small probability that the data-generation process contains a constant (drift) term.
It is important to note that both of the variance estimates $S_u^2$ and $S_{T\ell}^2$ could be defined using the first differences $y_t - y_{t-1}$ rather than the residuals $\hat{u}_t$. Under the null hypothesis that $\rho = 1$ and that the drift and trend terms are zero, the two will of course be equivalent asymptotically. In finite samples, which of the two methods is used can make a substantial difference, however; we will return to this point below.
While $S_{T\ell}^2$ just defined is consistent for $\sigma^2$ given residuals from the appropriate model, it unfortunately does not guarantee a non-negative estimate for finite sample sizes. However, one can guarantee a non-negative estimate with a simple modification of (12) pioneered by Newey and West (1987), which is moreover consistent under precisely the same conditions as is (12). Define
where $\omega_\ell(j) = 1 - j(\ell + 1)^{-1}$. A few examples of tests using these quantities to transform the test statistics can be presented without further discussion. Thereafter we will present statistics for hypotheses on $\mu_b$, $\mu_c$, and $\gamma_c$ in (8b) and (8c), and for hypotheses involving $\rho$ as well as these parameters.
Consider the hypothesis that $\rho_b = 1$ (in (8b)).² An asymptotically valid test consists of the statistic³
² We treat the initial observation as fixed at zero; not all statistics here are invariant to the initial value. See Phillips (1987a) and Perron (1988).
³ These statistics are valid for either choice of $S_{T\ell}^2$ given above (i.e. the Phillips or Newey-West forms).
or, alternatively,
where $t(\hat{\rho}_b)$ is the $t$-statistic associated with testing the null hypothesis $\rho_b = 1$. The first of these statistics, $Z(\hat{\rho}_b)$, has under the null hypothesis ($H_0$: $\rho_b = 1$) the limiting distribution given in Table 4.1(b) ($T \to \infty$); the second has the limiting distribution given in Table 4.2(b) ($T \to \infty$) under the same null. It is especially useful to note again here the fact that the original Dickey-Fuller statistics are special cases of these. Under Dickey and Fuller's assumptions, the $\{u_t\}$ are independently and identically distributed, implying, as we noted above, that $\sigma_u^2 = \sigma^2$ and therefore that $E(S_{T\ell}^2) = E(S_u^2)$. Hence on average $S_{T\ell}^2 = S_u^2$, and $Z(\hat{\rho}_b)$ reduces to $T(\hat{\rho}_b - 1)$. This is precisely the first of the statistics that Dickey and Fuller examine. Moreover, $Z(t(\hat{\rho}_b))$ reduces to $t(\hat{\rho}_b)$, the ordinary regression $t$-statistic, and has the distribution given in Table 4.2.
The corresponding statistics for models (8a) and (8c) are also given in Perron (1988), and share this property. For (8a), the test statistics are similar to (14) and (15). They are (with $y_0 = 0$)
and
Analogous to the tests on (8b), (16) has the significance points given in Table 4.1(a) and (17) those in Table 4.2(a). Finally, for model (8c), we have
and
having the limiting distributions tabulated in Tables 4.1(c) and 4.2(c) respectively. The quantity $D_x$ is defined as the determinant of the inner product of the data matrix with itself: for (8c),
where, again, summations are over all available elements of the vectors.
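The corrected statistics for the model with a constant can be sketched as follows. This is a minimal illustration under stated assumptions, not a transcription of the text's equations: the regression is taken to be $y_t = \mu_b + \rho_b y_{t-1} + u_t$, the variance estimators follow the verbal description above (a residual variance $S_u^2$ and a Newey-West long-run variance $S_{T\ell}^2$ with weights $1 - j/(\ell + 1)$), and the two corrections are written in the form in which the Phillips-Perron statistics are commonly stated. The function names `long_run_variance` and `pp_statistics` are hypothetical.

```python
import numpy as np

def long_run_variance(u, lag):
    """Newey-West estimate: T^{-1} sum u_t^2 + 2 sum_{j=1..lag} w(j) T^{-1} sum u_t u_{t-j},
    with Bartlett weights w(j) = 1 - j / (lag + 1)."""
    u = np.asarray(u, dtype=float)
    T = u.size
    s2 = u @ u / T
    for j in range(1, lag + 1):
        w = 1.0 - j / (lag + 1.0)
        s2 += 2.0 * w * (u[j:] @ u[:-j]) / T
    return s2

def pp_statistics(y, lag):
    """Return (Z_rho, Z_t, rho_hat, t_rho) for y_t = mu + rho * y_{t-1} + u_t.

    Assumed forms of the corrections (commonly stated version):
      Z_rho = T (rho_hat - 1) - (S2_Tl - S2_u) / (2 m_yy)
      Z_t   = (S_u / S_Tl) t_rho - (S2_Tl - S2_u) / (2 S_Tl sqrt(m_yy))
    with m_yy = T^{-2} sum (y_{t-1} - mean(y_{t-1}))^2.
    """
    y = np.asarray(y, dtype=float)
    y_lag, y_cur = y[:-1], y[1:]
    T = y_cur.size
    X = np.column_stack([np.ones(T), y_lag])
    beta, *_ = np.linalg.lstsq(X, y_cur, rcond=None)
    resid = y_cur - X @ beta
    rho_hat = beta[1]
    s2_ols = resid @ resid / (T - 2)
    se_rho = np.sqrt(s2_ols * np.linalg.inv(X.T @ X)[1, 1])
    t_rho = (rho_hat - 1.0) / se_rho        # ordinary regression t-statistic
    s2_u = resid @ resid / T                # short-run variance estimate
    s2_tl = long_run_variance(resid, lag)   # long-run variance estimate
    m_yy = np.sum((y_lag - y_lag.mean()) ** 2) / T**2
    z_rho = T * (rho_hat - 1.0) - 0.5 * (s2_tl - s2_u) / m_yy
    z_t = (np.sqrt(s2_u / s2_tl) * t_rho
           - 0.5 * (s2_tl - s2_u) / (np.sqrt(s2_tl) * np.sqrt(m_yy)))
    return z_rho, z_t, rho_hat, t_rho
```

Setting `lag=0` makes the two variance estimates coincide, so the corrected statistics collapse to $T(\hat{\rho}_b - 1)$ and the ordinary $t$-statistic, which mirrors the special-case property noted in the text.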
In addition to the extension of the Phillips (1987a) results to the case of regression models containing constant and trend, Phillips and Perron (1988) present simulation evidence regarding the power of the Phillips-type procedures vis-à-vis that of the Said-Dickey procedure, each being applicable to processes that have general ARMA($p$, $q$) processes in the errors from a regression model that consists of a constant and lagged dependent variable. The data-generation process is taken to be
To characterize the results roughly, the Phillips or Phillips-Perron test generally has higher power, but suffers substantial size distortions for $\theta < 0$, in samples of sizes typically found in economics. The Said-Dickey test also involves size distortions for $\theta < 0$, but much smaller ones: that is, each test rejects a true null of $\rho = 1$ more than the nominal size (5 per cent in these experiments) states, but the problem is much worse for the $Z(\hat{\rho})$ and $Z(t(\hat{\rho}))$ statistics of Phillips and Perron, where rejections of the true null range as high as 99.7 per cent for $\theta = -0.8$. (Size and power also depend upon the number of lags chosen in the Said-Dickey test and on the lag truncation parameter in the Phillips-Perron tests.) For the Said-Dickey test, the largest size distortions (with two lags, a true null is rejected approximately 67.7 per cent of the time at a nominal size of 5 per cent) disappear as the number of lags used increases, falling to only 12 per cent where 12 lags are used.
This simulation study is of course a limited one, dealing as it does with only one ARMA process for the equation errors. It does however suggest that the Phillips-type tests are more likely to reject the null of a unit root, whether or not it is false; for errors with strong negative MA components, the difference is quite large. One might suspect as well that the power of the Said-Dickey procedure would be higher for processes involving AR errors, because the test regression captures AR terms precisely.
Phillips and Perron conclude by recommending their own $Z(\hat{\rho})$ test for models with positive MA or IID errors, and the Said-Dickey statistic for models with negative MA errors.
4.4. Tests on More than One Parameter
The tests above have all been directed at testing the level autoregressive parameter alone. In models (8b) and (8c), however, there are other parameters present, and one may be interested in a formal test of the hypothesis that one of these is zero, or in a joint test. Tests similar to
those above can be provided, but a further set of tables must be used to find the significance points of the distributions of the resulting test statistics. Tables 4.4 and 4.5 below are based on those given by Dickey and Fuller (1981), who provide likelihood ratio, $t$-type, and $F$-type statistics for tests on the parameters $\mu_b$, $\mu_c$, and $\gamma_c$ in (8b) and (8c). The tables are again derived from a Monte Carlo simulation.
The statistics that Dickey and Fuller offer are derived under the assumption that $u_{bt}$ and $u_{ct}$ are white-noise processes, but they show that, as was the case with tests above, the same distributions can be applied where the errors follow an autoregressive process and a correctly specified model is used to estimate the parameters of this process. As we noted earlier, however, it is desirable to generalize the tests to be applicable to as broad as possible a class of error processes, of unknown form. This can be done, once again, using a non-parametric correction.
Table 4.3 summarizes the $t$-type, $F$-type, and non-parametric test statistics used for several null hypotheses involving the parameters $\mu$ and $\gamma$. In addition to the quantities defined above, we require
The Phillips-Perron corrections to the standard Dickey-Fuller statistics must however be used cautiously. Again, the accumulated evidence of several Monte Carlo simulation studies suggests that the non-parametrically corrected test statistics do not always have the correct sizes even in fairly large samples.
Schwert (1989) makes this point forcefully. His results, amplifying those in the Phillips-Perron simulations reported earlier, show that the critical values of the augmented Dickey-Fuller test statistics, given by the standard Dickey-Fuller tables, are much more robust to the presence of moving average terms in the errors of the random-walk process than are the corresponding non-parametrically adjusted Dickey-Fuller statistics. An example, taken from Schwert, is sufficient to illustrate the point.
The data-generation process is given by⁴
$y_t = y_{t-1} + u_t + \theta u_{t-1},$
⁴ For conformity with the notation of Phillips-Perron used earlier, the sign on the coefficient $\theta$ is changed here.
TABLE 4.3(a). Test statistics for simple hypotheses in models with drift and trendᵃ
Statistic type    Test statistic
ᵃ Critical values for $Z(\tau_1)$, $Z(\tau_2)$, and $Z(\tau_3)$ are the same as those for $\tau_1$, $\tau_2$, and $\tau_3$ respectively and are tabulated in Table 4.4. Note also that $S_u^2$ and $S_{T\ell}^2$ are defined with respect to the residuals of a particular model, and so differ across models (8a), (8b), and (8c).
ᶜ ti(j) is the $i$th diagonal element of the inverse second-moment matrix of the regressors in model $j$.
Sources: Dickey and Fuller (1981) and Perron (1988).
TABLE 4.3(b). Test statistics for joint hypothesesᵃ
ᵃ Critical values for $Z(\Phi_1)$, $Z(\Phi_2)$, and $Z(\Phi_3)$ are the same as those for $\Phi_1$, $\Phi_2$, and $\Phi_3$.

Sample
size     0.01   0.025  0.05   0.10   0.90   0.95   0.975  0.99
(a) Test statistic $\Phi_1$; DGP: (8b) with $\rho_b = 1$, $\mu_b = 0$; model (8b)
25       0.29   0.38   0.49   0.65   4.12   5.18   6.30   7.88
50       0.29   0.39   0.50   0.66   3.94   4.86   5.80   7.06
100      0.29   0.39   0.50   0.67   3.86   4.71   5.57   6.70
250      0.30   0.39   0.51   0.67   3.81   4.63   5.45   6.52
500      0.30   0.39   0.51   0.67   3.79   4.61   5.41   6.47
∞        0.30   0.40   0.51   0.67   3.78   4.59   5.38   6.43
(b) Test statistic $\Phi_2$; DGP: (8c) with $\rho_c = 1$, $\mu_c = 0$, $\gamma_c = 0$; model (8c)
25       0.61   0.75   0.89   1.10   4.67   5.68   6.75   8.21
50       0.62   0.77   0.91   1.12   4.31   5.13   5.94   7.02
100      0.63   0.77   0.92   1.12   4.16   4.88   5.59   6.50
250      0.63   0.77   0.92   1.13   4.07   4.75   5.40   6.22
500      0.63   0.77   0.92   1.13   4.05   4.71   5.35   6.15
∞        0.63   0.77   0.92   1.13   4.03   4.68   5.31   6.09
(c) Test statistic $\Phi_3$; DGP: (8c) with $\rho_c = 1$, $\gamma_c = 0$; model (8c)
25       0.74   0.90   1.08   1.33   5.91   7.24   8.65   10.61
50       0.76   0.93   1.11   1.37   5.61   6.73   7.81   9.31
100      0.76   0.94   1.12   1.38   5.47   6.49   7.44   8.73
250      0.76   0.94   1.13   1.39   5.39   6.34   7.25   8.43
500      0.76   0.94   1.13   1.39   5.36   6.30   7.20   8.34
∞        0.77   0.94   1.13   1.39   5.34   6.25   7.16   8.27
ᵃ All entries in the left half of the table have standard errors of less than 0.005; those in the right half, less than 0.06.
Source: Dickey and Fuller (1981: 1063).
computed for two different lengths of lags. The first lag length is given by $\ell_4 = [4(T/100)^{1/4}]$ and the second by $\ell_{12} = [12(T/100)^{1/4}]$; $[x]$ denotes the largest integer less than or equal to $x$.
The results of this experiment are presented in Tables 1 and 2 of Schwert (1989: 148-9). They indicate that the distributions of the Phillips-Perron tests are not close to the Dickey-Fuller distribution. The distributions are closest when $\theta = 0.5$ or $0.8$ but differ markedly for values of $\theta = -0.5$ and $-0.8$. The discrepancies persist even with sample sizes as large as $T = 1000$. The ADF statistics, on the other hand, have distributions that are much closer on average to the Dickey-Fuller distribution.
The poor behaviour of the Phillips-Perron tests where negative MA terms are present persists in regressions that incorporate a time trend.
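The two lag-length rules quoted above are easy to tabulate. The helper name `schwert_lag` and its argument names are assumptions made for illustration; the rule itself is the integer part of $c\,(T/100)^{1/4}$ with $c = 4$ or $c = 12$.

```python
import math

def schwert_lag(T, c):
    """Largest integer less than or equal to c * (T / 100) ** (1/4).

    Hypothetical helper: c = 4 gives the shorter rule, c = 12 the longer one.
    """
    return math.floor(c * (T / 100.0) ** 0.25)

# Lag lengths implied by the two rules at a few sample sizes.
for T in (25, 100, 1000):
    print(T, schwert_lag(T, 4), schwert_lag(T, 12))
```

Because the lag length grows only with the fourth root of the sample size, a tenfold increase in $T$ from 100 to 1000 less than doubles the number of lags.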
Schwert also reports the distributions of the normalized unit-root estimators (i.e. $T(\hat{\rho} - 1)$) in their ADF and non-parametrically corrected DF versions. The conclusions remain unaltered. Finally, Schwert's simulations do suggest that the finite-sample performance under the null of the Phillips-Perron procedures, in the cases where MA terms cause size distortions, is better when $S_u^2$ and $S_{T\ell}^2$ are calculated using the first differences of $y_t$ than where the regression residuals are used. However, the tests may then fail to be consistent against some stationary alternative hypotheses (Stock and Watson 1988b). It seems safest, therefore, to avoid these tests if there is any evidence of the kind of MA component to the errors that causes size distortions.
An alternative procedure is proposed by Hall (1989), who suggests that IV be used in place of OLS in augmented Dickey-Fuller tests. The level instrumental variable used in place of $y_{t-1}$ is $y_{t-(k+1)}$, where the residual autocorrelation function has non-zero elements only up to lag $k$ (see Section 4.6.4 below). Hall's Monte Carlo results suggest that the method performs well, particularly for negative MA error processes.
4.5. Further Extensions
Two more extensions of the testing procedure may be considered. The first concerns testing for multiple unit roots in a process. The second is testing for unit roots at seasonal frequencies. Inventories may be regarded as a good example of a variable that is likely to be I(2) (contains two unit roots), as it is constructed by aggregating a function of flow variables (production and sales) which are individually I(1); a test for multiple unit roots would therefore be important when dealing with stock variables of this kind. Tests for seasonal unit roots are applicable when seasonal data are used. Standard unit-root tests may provide misleading results in the presence of integration at seasonal frequencies.
4.5.1. Multiple Unit Roots
Consider the problem of testing for $d > 1$ unit roots in a series. The sequence of testing (which starts with a test for a single unit root in the undifferenced series, then proceeds to a test for a second unit root, that is, tests the first-differenced series, if the first null of a unit root in levels is not rejected, and so on) does not constitute a statistically valid testing sequence, since all of the unit-root tests considered in this chapter take the complete absence of unit roots as the alternative
hypothesis. Dickey and Pantula (1987) suggest a more natural sequential testing procedure for unit roots which takes the largest⁵ number of unit roots under consideration as the first maintained hypothesis and then decreases the order of differencing each time the current null hypothesis is rejected. This continues until the first time the null hypothesis is not rejected.
The sequential procedure may be illustrated for the case $d = 2$. Let us consider the AR(2) model,
$(1 - \rho_1 L)(1 - \rho_2 L)\,y_t = \varepsilon_t.$
This model can be re-parameterized as
$\Delta^2 y_t = \beta_1 \Delta y_{t-1} + \beta_2 y_{t-1} + \varepsilon_t,$
where $\beta_1 = (\rho_1 \rho_2 - 1)$ and $\beta_2 = -(1 - \rho_1)(1 - \rho_2)$.
The testing procedure consists of the following steps:
1. Test the null hypothesis of two unit roots against the alternative of a single unit root. Under this null hypothesis $\beta_1 = \beta_2 = 0$ and an $F$-test may be used to test it. Such a test, however, does not take account of the one-sided nature of the alternative hypothesis. A more powerful procedure follows from noting that, under both the null and the alternative hypotheses, $\beta_2 = 0$. However, $\beta_1 = 0$ under the null hypothesis but is less than zero under the alternative hypothesis. Thus, a more powerful test is given by estimating the regression of $\Delta^2 y_t$ on $\Delta y_{t-1}$, computing the $t$-ratio of $\hat{\beta}_1$, and performing a one-sided lower-tail test using the Dickey-Fuller critical values.
2. If the null hypothesis above is rejected, proceed to test the null of one unit root versus the stationary alternative. Here $H_0$ and $H_1$ are given by $\beta_1 < 0$, $\beta_2 = 0$, and $\beta_1 < 0$, $\beta_2 < 0$ respectively. Thus, a one-sided $t$-test here involves estimating the regression of $\Delta^2 y_t$ on $\Delta y_{t-1}$ and $y_{t-1}$, computing the $t$-ratio of $\hat{\beta}_2$, and comparing it with the Dickey-Fuller values.
This testing procedure may be generalized to testing for three or more unit roots.
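The two steps above can be sketched as a pair of OLS regressions. The function names and the decision to omit a constant from both regressions are assumptions made for illustration; the sketch returns only the two $t$-ratios, which must then be compared, in sequence, with the appropriate Dickey-Fuller critical values.

```python
import numpy as np

def _t_ratio(X, z, col):
    """OLS t-ratio of coefficient `col` in the regression of z on X."""
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    s2 = resid @ resid / (len(z) - X.shape[1])
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[col, col])
    return beta[col] / se

def dickey_pantula_t_ratios(y):
    """Return (t1, t2) for the two-unit-root sequence (hypothetical helper).

    t1: t-ratio on dy_{t-1} in the regression of d2y_t on dy_{t-1}
        (step 1: two unit roots against one).
    t2: t-ratio on y_{t-1} in the regression of d2y_t on dy_{t-1} and y_{t-1}
        (step 2: one unit root against none; used only if step 1 rejects).
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)          # dy[i]  = y[i+1] - y[i]
    d2y = np.diff(dy)        # d2y[i] = second difference dated at y[i+2]
    lag_dy = dy[:-1]         # dy_{t-1}, aligned with d2y
    lag_y = y[1:-1]          # y_{t-1},  aligned with d2y
    t1 = _t_ratio(lag_dy[:, None], d2y, 0)
    t2 = _t_ratio(np.column_stack([lag_dy, lag_y]), d2y, 1)
    return t1, t2
```

Both tests are one-sided and lower-tailed, so large negative values of `t1` and then `t2` are evidence against two unit roots and one unit root respectively.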
⁵ Note that the first sequence took the smallest number (i.e. 1) of unit roots as its first maintained hypothesis.
Dickey and Pantula (1987) report the results of a simulation study. Their general conclusion is that the sequential procedure, consisting of testing a null hypothesis of $k$ unit roots against an alternative of $k - 1$ unit roots, based on $t$-tests, is considerably more powerful than an $F$-test-based procedure.
4.5.2. Seasonal Integration
We have so far focused attention on testing for a unit root at the zero frequency. However, when seasonal data are used, it may be necessary
to allow for seasonal averaging or seasonal differencing to achieve stationarity. For example, the appropriate difference to use to transform to stationarity may not be $x_t - x_{t-1}$, but $x_t - x_{t-4}$ in quarterly data or $x_t - x_{t-12}$ in monthly data. Seasonal integration (and co-integration) and testing for unit roots at seasonal frequencies are discussed by Engle, Granger, and Hallman (1988), Ghysels (1990), Hylleberg, Engle, Granger, and Yoo (1990), Engle, Granger, Hylleberg, and Lee (1993), and Ilmakunnas (1990) among others.
Just as a time series with no seasonal component may be well described by a deterministic process, a stationary stochastic process, or an integrated process, the seasonal component of a time series may be well described by a process from any of these classes, or may combine elements of each. While it is common practice to model a seasonal component as having a deterministic or stationary form, there may be cases where it is appropriate to allow the model of the seasonal component to drift substantially over time. This possibility is implicit in the practice of seasonal differencing (see e.g. Box and Jenkins 1970), whereby a process observed $s$ times per year would be transformed to its $s$-period difference, $x_t - x_{t-s}$, on the assumption that the process contains an integrated seasonal component.
In order to allow for a unit root at a seasonal frequency, it is useful to factor the lag polynomial of the process. If the lag polynomial contains a factor $(1 - L^s) = \Delta_s$, corresponding to a seasonal unit root, then it can be factorized as
$(1 - L^s) = (1 - L)(1 + L + L^2 + \cdots + L^{s-1}) = \Delta S(L).$
That is, the seasonal difference operator can be broken down into the product of the first difference operator and the moving-average seasonal filter $S(L)$ containing further roots of modulus unity.
Engle et al. (1988) define a variable $x_t$ to be seasonally integrated of orders $d$ and $D$ (denoted SI($d$, $D$)), if $\Delta^d S(L)^D x_t$ is stationary. Thus, for quarterly data, in the terminology established above, if $\Delta_4 x_t$ is stationary, then $x_t$ is SI(1, 1) with $S(L) = 1 + L + L^2 + L^3$. Further,
$(1 - L^4) = (1 - L)(1 + L)(1 + L^2) = (1 - L)(1 + L)(1 - iL)(1 + iL).$
Hence the quarterly seasonal unit root process has four roots of modulus unity: one at the zero frequency, one at the two-quarter (half-yearly) frequency, and a pair of complex conjugate roots at the four-quarter (annual) frequency. To relate these roots to frequencies in an intuitive way, consider the deterministic process $a(L)x_t = 0$. For
$a(L) = (1 + L)$, then $x_{t+1} = -x_t$ and so $x_{t+2} = x_t$; the process returns to its original value on a cycle with a period of 2. For $a(L) = (1 - iL)$, then $x_{t+1} = i x_t$, $x_{t+2} = i^2 x_t = -x_t$, $x_{t+3} = -i x_t$, and $x_{t+4} = -i^2 x_t = x_t$, so that the process repeats with a period of 4.
As with a process with a single unit root at the zero frequency (e.g. the random walk $(1 - L)x_t = \varepsilon_t$), a seasonally integrated process such as $(1 - L^4)x_t = \varepsilon_t$ retains the effect of shocks indefinitely, and has a variance which increases linearly with time. However, because the seasonally integrated process contains multiple roots of modulus unity, it does not behave like an I(1) process in all respects. For example, shocks to the system will also alter the seasonal pattern of the series, so that the sequences of observations corresponding to each quarter may evolve in different ways. The first difference of such a seasonally integrated process will not be stationary.
Testing for a unit root at a seasonal frequency has much in common with testing for unit roots at the zero frequency. Tests have been proposed by Hasza and Fuller (1982), Dickey, Hasza, and Fuller (1984), Osborn, Chui, Smith, and Birchenhall (1988), Hylleberg et al. (1990), and Engle et al. (1993), among others. We will follow Hylleberg et al. in describing a testing strategy.
Consider a process observed quarterly and generated by
where $\varepsilon_t$ is IID(0, $\sigma^2$) and $\gamma(L)$ is a fourth-order lag polynomial. We wish to test the null hypothesis that the roots of $\gamma(L)$ lie on the unit circle, against the hypothesis that they lie outside. Defining three positive parameters
generalization of (22) so that we can again use the transformed model
where now $z_t' = (z_{1,t}', z_{2,t}, z_{3,t}, z_{4,t})$ and $\theta' = (\theta_1', \theta_2, \theta_3, \theta_4)$.
To define the elements of $z_t$, let $\bar{\mu}_c = E(\Delta y_t) = (1 - \beta(1))^{-1}\mu_c = b\mu_c$, the unconditional mean of the drift under the null, using $b = (1 - \beta(1))^{-1}$. Next, let

The $\theta_i$ are given by $\theta_1' = (\beta_1, \beta_2, \ldots, \beta_p)$, $\theta_2 = \mu_c + \beta(1)\bar{\mu}_c + \gamma_c$, $\theta_3 = \rho_c$, and $\theta_4 = \gamma_c + \rho_c\bar{\mu}_c$. The scaling matrix $\Upsilon_T$ becomes $\mathrm{diag}(T^{1/2}\iota_p, T^{1/2}, T, T^{3/2})$, where $\iota_p$ is the unit vector of dimension $p$. Finally, $\Omega_p = E(z_{1,t} z_{1,t}')$, the covariance matrix of $z_{1,t}$. The elements of the matrices $V_T$ and $\phi_T$ are similar to those for the simple Dickey-Fuller test. Then, using $\xrightarrow{p}$ to denote convergence in probability
Again, Table 3.3 may be applied to find the densities of the Wiener processes appearing above, with the exception of that appearing in the expression for $V_{T,3,3}$; again, an expansion for this density is given by Abadir (1992).

$V$ is therefore block diagonal, and the estimators of the nuisance parameters $\beta$ are asymptotically normal and do not affect the asymptotic distributions of the Dickey-Fuller statistics, so that the same critical values can be used. The $b$s that appear in some of the expressions cancel appropriately to make this possible. This may be seen in the simplest case where the model does not include either the constant or the trend term but does include the $\Delta y_{t-i}$ terms. Noting that in this case the terms $V_{T,1,2}$, $V_{T,1,4}$, $V_{T,2,3}$, $V_{T,2,4}$, $V_{T,3,4}$, $\phi_{T,2}$, and $\phi_{T,4}$ are not relevant, and that $V = \mathrm{diag}(\omega^{11}, \ldots, \omega^{pp}, V_{3,3})$, where $\omega^{ii}$ is the $i$th diagonal element of $\Omega_p$, the distribution of the $t$-statistic is given by $t = (\sigma^2 V_{T,3,3})^{-1/2}\phi_{T,3}$. This has the standard Dickey-Fuller distribution with the critical values given by Table 4.2(a). The results extend to the cases where the constant and (or) trend are (is) included in the model, with the critical values given by Tables 4.2(b) and 4.2(c) respectively.
The inclusion of the I(0) terms $\Delta y_{t-i}$ leaves unchanged the asymptotic distributions of the parameters of interest.
4.6.3. Example: Non-parametric Test Statistics (Phillips 1987a)

Consider the simple random-walk process $y_t = y_{t-1} + u_t$. The main features of non-parametric corrections may be illustrated by assuming that the only restrictions imposed on the stochastic process $\{u_t\}_{t=1}^{\infty}$ are those given by conditions (3.16a)-(3.16d); $\{u_t\}_{t=1}^{\infty}$ may therefore be an ARMA(p,q) process, in which case the $t$-statistic for $\rho$, in the model $y_t = \rho y_{t-1} + u_t$, does not have the standard Dickey-Fuller distribution.

As discussed earlier in this chapter, a non-parametric correction is one way of accounting for the autocorrelation in the $\{u_t\}_{t=1}^{\infty}$ series. This correction enables us to retain the use of the Dickey-Fuller critical values to conduct inference and therefore expands the range of models to which the Dickey-Fuller tests can be applied.

Using the results in (3.21)-(3.24), the estimator $\hat{\rho}$ and its $t$-ratio $t(\hat{\rho})$ have the following limiting distributions:
where $\lambda = (\sigma^2 - \sigma_u^2)/2$, where $\sigma^2$ and $\sigma_u^2$ are as defined in (10a) and (10b). If the $u_t$ are IID$(0, \sigma^2)$, then $\sigma^2 = \sigma_u^2$, and $\lambda = 0$. If so, the distributions of $\hat{\rho}$ and its $t$-ratio in (31) and (32) above are the usual Dickey-Fuller distributions.

It may then be verified that the limiting distribution of the statistic $Z(\hat{\rho})$, where
is the same as the distribution obtained by setting $\lambda = 0$ in (31). This
follows from an inspection of (31) and by noting that
Similarly, the limiting distribution of the statistic $Z(t(\hat{\rho}))$, where
is the same as the distribution obtained by setting $\lambda = 0$ in (32).

The limiting distributions of (33) and (34) are unchanged when $\lambda$ is replaced by $\hat{\lambda}$ in these expressions, where $\hat{\lambda}$ is a consistent estimator of $\lambda$. Consistent estimators of $\sigma^2$ and $\sigma_u^2$ are required in order to obtain a consistent estimator of $\lambda$ and to implement the non-parametric corrections. A consistent estimator of $\sigma_u^2$ is given by either $T^{-1}\sum_{t=1}^{T}(y_t - y_{t-1})^2$ or $T^{-1}\sum_{t=1}^{T}(y_t - \hat{\rho}y_{t-1})^2$. The asymptotic equivalence of the two estimators follows from the property that $\hat{\rho} \to 1$ in probability.⁸ A consistent estimator of $\sigma^2$ can be obtained from (12) or (13) as before.

Using arguments similar to those outlined above, the non-parametric corrections for the more elaborate models, which include a constant or a constant and trend, may be derived. In particular, $Z(\hat{\rho}_i)$ and $Z(t(\hat{\rho}_i))$ ($i = b, c$) may be obtained.
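As a rough illustration of how the corrections in this example might be computed, the sketch below assumes the standard forms from Phillips (1987), $Z(\hat{\rho}) = T(\hat{\rho} - 1) - \hat{\lambda}/(T^{-2}\sum y_{t-1}^2)$ and $Z(t(\hat{\rho})) = (\hat{\sigma}_u/\hat{\sigma})\,t(\hat{\rho}) - \hat{\lambda}/\{\hat{\sigma}(T^{-2}\sum y_{t-1}^2)^{1/2}\}$, with $\hat{\lambda} = (\hat{\sigma}^2 - \hat{\sigma}_u^2)/2$. The function name `phillips_z`, the truncation-lag argument `lags`, and the use of Bartlett weights for the long-run variance are assumptions made for this example; they stand in for the estimators referred to in (12) and (13).

```python
import math

def phillips_z(y, lags):
    """Sketch of Z(rho) and Z(t) for y_t = rho*y_{t-1} + u_t (no constant).

    `lags` is the truncation lag of an assumed Bartlett-weighted
    long-run variance estimator; it is not prescribed by the text.
    """
    T = len(y) - 1                          # usable observations
    y_lag, y_cur = y[:-1], y[1:]
    syy = sum(v * v for v in y_lag)         # sum of y_{t-1}^2
    rho = sum(a * b for a, b in zip(y_cur, y_lag)) / syy    # OLS estimate
    u = [a - rho * b for a, b in zip(y_cur, y_lag)]         # residuals
    s2_u = sum(e * e for e in u) / T        # estimate of sigma_u^2

    # Estimate of sigma^2: variance plus weighted autocovariances
    s2 = s2_u
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1.0)
        gamma_j = sum(u[t] * u[t - j] for t in range(j, T)) / T
        s2 += 2.0 * weight * gamma_j

    lam = 0.5 * (s2 - s2_u)                 # estimate of lambda
    t_rho = (rho - 1.0) / math.sqrt(s2_u / syy)   # conventional t-ratio
    scaled = syy / T ** 2                   # T^{-2} * sum of y_{t-1}^2
    z_rho = T * (rho - 1.0) - lam / scaled
    z_t = math.sqrt(s2_u / s2) * t_rho - lam / (math.sqrt(s2) * math.sqrt(scaled))
    return z_rho, z_t
```

With `lags=0` the two variance estimates coincide, so $\hat{\lambda} = 0$ and the statistics reduce to the uncorrected $T(\hat{\rho} - 1)$ and $t(\hat{\rho})$, mirroring the IID case discussed above.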
4.6.4. Example: Instrumental Variables Test for Unit Roots (Hall 1989)

The non-parametric statistics described in Example 4.6.3 are known not to perform well in finite samples in the presence of negative moving-average errors (see Schwert 1989). Hall (1989) proposed estimation by instrumental variables as an alternative to the use of non-parametric corrections. He showed that in the regression model $y_t = \rho y_{t-1} + u_t$, where $u_t$ is a moving-average process of some specified order and $\rho$ is equal to 1 under $H_0$, the estimator $\hat{\rho}_{IV}$ has the standard Dickey-Fuller distribution.

The intuition for this result may be easily described: $\hat{\rho}_{OLS}$ in the above model does not have the standard Dickey-Fuller distribution because of the bias induced by the correlation between $y_{t-1}$ and $u_t$ (when $u_t$ is an ARMA(p,q) process). It is therefore necessary to use a correction factor to remove this bias. This bias does not appear when, say, $y_{t-2}$ is used as an instrument for $y_{t-1}$ and $u_t$ is an MA(1) process. The Dickey-Fuller tables can thus be used directly. We formalize this intuition next by presenting a simple example and by using some of the distributional results derived earlier in the chapter. Throughout, to simplify the algebra, adequate initial observations are assumed to be available, so all sums are taken over $1, \ldots, T$.

⁸ As noted above, the finite-sample behaviour of these two estimators may be quite different (see Schwert 1989).

Let the DGP be given by

$$y_t = y_{t-1} + u_t, \qquad (35a)$$
$$u_t = \varepsilon_t + \theta\varepsilon_{t-1}, \qquad (35b)$$
$$\varepsilon_t \sim \mathrm{IID}(0, \sigma_\varepsilon^2). \qquad (35c)$$
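Then $\hat{\rho}_{IV}$, the instrumental variables estimator of $\rho$ which uses $y_{t-2}$ as an instrument for $y_{t-1}$, is given by

$$\hat{\rho}_{IV} = \frac{\sum_{t=1}^{T} y_t y_{t-2}}{\sum_{t=1}^{T} y_{t-1} y_{t-2}}. \qquad (36)$$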
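The estimator just described is the ratio $\sum y_t y_{t-2} / \sum y_{t-1} y_{t-2}$, and a minimal sketch of it next to the OLS estimator may help fix ideas. The function names `rho_ols`, `rho_iv`, and `simulate_rw_ma1` are hypothetical, and the simulated process (a random walk with MA(1) errors $u_t = \varepsilon_t + \theta\varepsilon_{t-1}$ and standard normal $\varepsilon_t$) is an assumption chosen to match the setting of this example.

```python
import random

def rho_ols(y):
    """OLS estimate of rho in y_t = rho*y_{t-1} + u_t."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den

def rho_iv(y, k=2):
    """IV estimate of rho using y_{t-k} as the instrument for y_{t-1}."""
    num = sum(y[t] * y[t - k] for t in range(k, len(y)))
    den = sum(y[t - 1] * y[t - k] for t in range(k, len(y)))
    return num / den

def simulate_rw_ma1(T, theta, seed=0):
    """Random walk y_t = y_{t-1} + u_t with u_t = eps_t + theta*eps_{t-1}."""
    rng = random.Random(seed)
    eps_prev = rng.gauss(0.0, 1.0)
    y = [0.0]
    for _ in range(T):
        eps = rng.gauss(0.0, 1.0)
        y.append(y[-1] + eps + theta * eps_prev)
        eps_prev = eps
    return y

# Compare the two estimators on one simulated path with a negative MA root
path = simulate_rw_ma1(T=500, theta=-0.5, seed=1)
estimates = (rho_ols(path), rho_iv(path))
```

Repeating the comparison over many seeds would give a crude picture of the finite-sample behaviour of $T(\hat{\rho} - 1)$ under each estimator; a single path, as here, only shows the mechanics.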
Next, we want to prove that

where $W(r)$ is the Wiener process associated with the sequence $\{u_t\}$. The RHS of this expression is the limiting distribution of the simple Dickey-Fuller test for a model like (35) when the $u_t$ are IID (see Section 4.6.1). Thus, we need to show that, for the instrument $y_{t-k}$

Note that

Proof of (i). From (35a),
This follows from the fact that

Recall now from (3.23) that

for the DGP given by (35a)-(35c). Further, for the error process $u_t$, $\sigma_u^2 = (1 + \theta^2)\sigma_\varepsilon^2$ and $\sigma^2 = (1 + \theta)^2\sigma_\varepsilon^2$. It also follows from (35b) that

Using (39), it is now possible to see from (38) that

But $\sigma_u^2 = (1 + \theta^2)\sigma_\varepsilon^2$. Hence

The last equality follows from the expression for $\sigma^2$ given previously; (i) now follows routinely from (40).

Proof of (ii).
All terms of the form $T^{-2}\sum_{t=1}^{T} y_{t-k}u_{t-j}$, $j = 1, 2, \ldots, (k - 1)$, converge in probability to zero. This is because the scaling $T^{-1}$ is appropriate for these sums to have non-degenerate distributions.⁹ The scaling $T^{-2}$ induces degeneracy. The distribution of $T^{-2}\sum_{t=1}^{T} y_{t-1}^2$ is given by $\sigma^2\left(\int_0^1 W(r)^2\,dr\right)$ for the DGP (35a)-(35c); (ii) now follows routinely.

Finally, (37) follows from (36), using $k = 2$ in (i) and (ii), since

⁹ This follows from arguments similar to those used to prove (3.21)-(3.24).
It also follows from (37) that the $t$-ratio form of the test,
has the Dickey-Fuller $t$-distribution, where $\hat{\sigma}$ is a consistent estimator of $\sigma$ (possibly equal to $(1 + \hat{\theta})\hat{\sigma}_\varepsilon$, where $\hat{\theta}$ and $\hat{\sigma}_\varepsilon$ are OLS estimators of $\theta$ and $\sigma_\varepsilon$).

Thus, estimation by instrumental variables has the same effect as the non-parametric corrections to $\hat{\rho}_{OLS}$ proposed by Phillips and Perron.

In a small Monte Carlo study, Hall (1989) shows that the size problems associated with the Phillips-Perron test are partially alleviated by the use of this instrumental variable procedure. However, substantial size distortions remain in the cases where $\theta < 0$ in the null model. No power calculations are reported in Hall's paper.

4.6.5. Example: Bounds Test for Unit Roots (Phillips and Ouliaris 1988)

A limitation of the testing procedures discussed in this chapter is that the distributions of the test statistics are non-standard. Consequently, a number of different sets of critical values have to be used to implement the tests.

This problem is at the heart of a literature which exploits the idea that differencing an I(0) series induces a unit root in the moving-average representation of the process. Use is made of this fact to devise a unit-root test based on the long-run variance, defined in (3.16c), of the first-differenced time series. The critical values are taken from the standard normal table.

In order to illustrate this approach, assume that $y_t$ follows the IMA(1,1) process,

$$\Delta y_t = (1 - \theta L)\varepsilon_t = u_t, \qquad (41)$$

with $\varepsilon_t \sim \mathrm{IID}(0, \sigma_\varepsilon^2)$. The long-run variance of $\Delta y_t$ is $\sigma^2 = (1 - \theta)^2\sigma_\varepsilon^2$, so $\sigma^2 \neq 0$ if and only if $\theta \neq 1$. In other words, if $y_t$ is I(0), $\Delta y_t$ will have [...] is sufficient for the variables to be called 'co-integrated'.

⁴ The example is taken from Engle and Granger (1987).
138 Co-integratio
n
Although this is a simple example, much of the method and reasoning can be generalized to more complex cases. What is crucial is that, while $\{x_t\}$ and $\{y_t\}$ are integrated processes, not tied to any fixed means, a linear combination of the two variables makes the resulting series a stationary process and the variables $x$ and $y$ may be said to be linked by the corresponding equilibrium relationship.

It is interesting to note that in the bivariate case we have the added bonus that this equilibrium relationship, if such a relationship exists, is unique. The proof is straightforward and follows by contradiction. Suppose not: that is, suppose that there exist two distinct co-integrating parameters $\alpha$ and $\gamma$ such that $\{x_t + \alpha y_t\}$ and $\{x_t + \gamma y_t\}$ are both I(0). This implies that $(\alpha - \gamma)y_t$ is also I(0) because subtracting one I($d$) series from another cannot lead to a series integrated of order $(d + 1)$ (or higher). But since $\{y_t\}$ is I(1), a non-zero constant times $\{y_t\}$ is also I(1). Hence we have a contradiction unless $\alpha = \gamma$.

The analysis is not quite so straightforward in the multivariate case as we must allow for the possibility of several co-integrating vectors. Nevertheless, much of the intuition gained from the analysis of the bivariate case carries through to richer examples.

There are at least three reasons for regarding the concept of co-integration as central to econometric modelling with integrated variables, as well as to the examination of long-run relationships among those variables.

The first is the link that the concept formalizes among variables of higher orders of integration, for which some linear combination is of a lower order of integration.
In the most widely used examples, a reduction is made from variables that require first-differencing for stationarity to a composite time-series that is stationary in levels. In addition, this composite stationary variable, constructed by taking a linear combination of the original series, may be said to characterize the equilibrium relationship linking the series. If an equilibrium exists among several variables so that such a stationary linear combination exists, we may count on eventual return of this linear combination to its mean (typically zero).

Second, and following directly from this identification of co-integration with equilibrium, is the complementary idea of meaningful versus spurious regression. Regressions involving levels of time series of non-stationary variables make sense if and only if these variables are co-integrated. A test for co-integration then yields a useful method of distinguishing meaningful regressions from those that Yule (1926) called 'nonsense' and Granger and Newbold (1974, 1977) called 'spurious'.

Finally, another important property characterizes variables that are co-integrated. A set of co-integrated variables is known to have, among other representations, an error-correction representation; that is, the
relationship may be expressed so that a term representing the deviation of observed values from the long-run equilibrium enters the model. This is an interesting result by itself, but is even more noteworthy as a contribution to resolving, or synthesizing, the debate between time-series analysts and those favouring econometric methods. It allows a reconciliation, at least in part, of time-series methods of analysing data that traditionally considered only the properties of differenced time series (which could more legitimately be assumed stationary) and those econometric methods that laid emphasis on the equilibrium relationships between variables and therefore focused on the levels of variables. Both methods as traditionally used could be said to have been flawed, the former by the implied necessity of ignoring information contained in the levels of variables, the latter by its tendency to ignore the spurious regression problem.

Reliance on the use of differenced data, as a potential cure for the spurious regression problem, raises a set of new issues. An example of a potentially controversial recommendation for modelling economic time series appears in Granger and Newbold (1977, p. 206; emphasis in original): 'In the presence of some autocorrelation of the errors ... first differencing might be expected to go a long way towards alleviating the problem and is certainly preferable to doing nothing at all.'

As an illustration, Granger and Newbold cite the results of Sheppard (1971), who regressed UK consumption on autonomous expenditure and mid-year money stock for both levels and changes, using annual data over the period 1947-62.
The results were taken to indicate the existence of a significant relationship in levels which disappeared entirely when first differences were employed. The levels regression, characterized by a high value of $R^2$ and a low value of the Durbin-Watson statistic, is spurious. However, the first-differenced regression appears to be testing a different hypothesis.⁵ The differencing operation, in particular, omits any information about long-run adjustments that the data may contain.

Thus, while the spurious regression problem is a serious one, the practice of differencing integrated series to achieve stationarity, and of treating the resulting series as the proper objects of econometric analysis, is not without costs. Error-correction mechanisms (ECMs) are intended to provide a way of combining the advantages of modelling both levels and differences. In an error-correction model the dynamics of both short-run (changes) and long-run (levels) adjustment processes are modelled simultaneously. This idea of incorporating the dynamic adjustment to steady-state targets in the form of error-correction terms, suggested by Sargan (1964) and developed by Hendry and Anderson (1977) and Davidson et al. (1978), among others, therefore offers the possibility of revealing information about both short-run and long-run relationships.

The theory of co-integration provides a unified framework for the analysis of ECMs and of time series in which the variables share one or more stochastic trends. We elaborate upon the alternative representations of co-integrated systems in Section 5.3, where we also provide a more formal description of the theory; we first review the theory of polynomial matrices which is necessary for a thorough understanding of several proofs in the next sections and in following chapters.

⁵ In the next chapter we discuss the consequences of differencing (and over-differencing) in cases where differencing (any number of times) does not alleviate the problems of non-stationarity and where transforming the series monotonically, prior to differencing, appears to be the appropriate procedure.
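To make the error-correction idea concrete, the sketch below traces a hypothetical single-equation ECM, $\Delta y_t = \beta\,\Delta x_t - g\,(y_{t-1} - a\,x_{t-1}) + e_t$, through a one-off step change in $x$. The model form, the function name `ecm_path`, and the parameter values are illustrative assumptions, not a specification taken from the text; shocks are set to zero so that the adjustment path is easy to read.

```python
def ecm_path(x, y0, a=1.0, g=0.3, beta=0.5, shocks=None):
    """Generate y_t from dy_t = beta*dx_t - g*(y_{t-1} - a*x_{t-1}) + e_t."""
    shocks = shocks or [0.0] * len(x)
    y = [y0]
    for t in range(1, len(x)):
        dx = x[t] - x[t - 1]                  # short-run (changes) term
        gap = y[t - 1] - a * x[t - 1]         # deviation from long-run equilibrium
        y.append(y[t - 1] + beta * dx - g * gap + shocks[t])
    return y

# x steps from 10 to 12 at t = 5; y starts on the equilibrium path y = x
x = [10.0] * 5 + [12.0] * 15
y = ecm_path(x, y0=10.0)
gaps = [yt - xt for yt, xt in zip(y, x)]      # equilibrium errors y_t - a*x_t with a = 1
```

In this run half of the change in $x$ is passed through on impact, and the remaining equilibrium error then shrinks by the factor $(1 - g)$ each period. A model in differences alone would capture the impact effect but not the subsequent return towards $y = x$.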
5.2. Polynomial Matrices

A polynomial matrix $A(L)$ is a matrix for which the elements $\{a_{ij}(L)\}$ are scalar polynomials in an argument $L$:

$$a_{ij}(L) = \sum_{h=0}^{k_{ij}} a_{ij,h} L^h,$$

where $k_{ij} < \infty$. Useful references to the algebra of polynomial matrices include Gel'fand (1967) and Gantmacher (1959). The degree, $k$, of $A(L)$ is the highest of the orders $k_{ij}$ of the element polynomials:

$$k = \max_{i,j} k_{ij}.$$

Thus, $A(L)$ can be expressed as

$$A(L) = \sum_{i=0}^{k} A_i L^i. \qquad (10)$$

The determinant $|A(L)|$ of a polynomial matrix $A(L)$ is a scalar polynomial. A familiar example of a polynomial matrix is $A(\lambda) = (A_0 - \lambda I)$, which occurs in the characteristic equation

$$|A_0 - \lambda I| = 0,$$

which may be solved for eigenvalues of the matrix $A_0$. Every matrix satisfies its own characteristic equation (the Cayley-Hamilton theorem) in that, if we let $f(\lambda) = |A(\lambda)|$, then $f(A_0) = 0$ (where this is interpreted as a matrix expression). In general, if $A(L) = \sum_{i=0}^{k} A_i L^i$, then we will also use the notation $A(B) = \sum_{i=0}^{k} A_i B^i$, for a matrix argument $B$.
The inverse of a finite polynomial matrix $A(L)$ of degree $k$ which has all roots of the determinantal equation $|A(z)| = 0$ strictly outside the unit circle⁶ is given, in general, by an infinite-order matrix $C(L) = \sum_{i=0}^{\infty} C_i L^i$. This matrix is well defined if and only if $\sum_{i=0}^{k} C_i L^i$ is a convergent sequence as $k \to \infty$. For $|z| > 1$ (equivalently, $|L| = |z^{-1}| < 1$), a sufficient condition for this to hold is $|C_i| \le \rho^i$ where $|\rho| < 1$.⁷ The $C_i$ are defined by an infinite set of matrix identities which may be described in a simple scalar case, where $A(L) = 1 - \rho L = a_0 + a_1 L$, as follows:

$$c_0 a_0 = 1, \quad c_1 a_0 + c_0 a_1 = 0, \quad c_2 a_0 + c_1 a_1 = 0, \quad \ldots \qquad (11)$$

such that

$$c_0 = 1, \quad c_1 = \rho, \quad c_2 = \rho^2, \quad \ldots, \quad c_i = \rho^i.$$
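The scalar construction can be sketched in a few lines by imposing $C(L)A(L) = 1$ and solving for one coefficient at a time. The function name `inverse_coefficients` and the truncation argument `n_terms` are hypothetical; the recursion itself is just the equating of powers of $L$ described in the text, applied to a scalar polynomial with $a_0 \neq 0$.

```python
def inverse_coefficients(a, n_terms):
    """First n_terms coefficients of C(L) = 1 / A(L).

    `a` holds the scalar coefficients a[0], a[1], ... of A(L); a[0] must be
    non-zero. The coefficient of L^i in C(L)A(L) must be 1 for i = 0 and
    0 otherwise, which pins down c_i given c_0, ..., c_{i-1}.
    """
    c = []
    for i in range(n_terms):
        # Contribution of the already-known c_j to the coefficient of L^i
        acc = sum(c[j] * a[i - j] for j in range(i) if i - j < len(a))
        target = 1.0 if i == 0 else 0.0
        c.append((target - acc) / a[0])
    return c

# A(L) = 1 - 0.5 L, so the inverse should have coefficients 0.5**i
coeffs = inverse_coefficients([1.0, -0.5], 6)
```

For $|\rho| < 1$ the coefficients decay geometrically, which is the exponential-decay condition mentioned above; with $\rho = 1$ the same recursion returns a sequence of ones that does not decay, which is one way of seeing why unit roots need separate treatment.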
The construction given by (11) is derived by using the property $C(L)A(L) = 1$ and equating powers of $L$. The algebra generalizes to high-order scalar polynomials $A(L)$ and to matrix polynomials $A(L)$. In the next section of this chapter and in Chapter 8 we shall need to deal with matrix polynomials that have unit roots ($z = 1$). In these cases, while the matrix $A(L)$ may not have a well defined inverse because of failure of rank conditions, transforming $A(L)$ and pre- and post-multiplying it by suitable matrices will lead to an invertible matrix provided certain conditions are satisfied.

Two polynomial matrices $R(L)$ and $T(L)$ are said to be equivalent if and only if there exist two invertible matrices $U(L)$ and $V(L)$ such that

$$R(L) = U(L)\,T(L)\,V(L). \qquad (12)$$

Every polynomial matrix $A(L)$ can be divided on the left by a matrix of the form $(B - LI)$ for any matrix $B$ so that, where $A(L)$ is of degree $k$,

$$A(L) = (B - LI)\,H(L) + D, \qquad (13)$$

where $H(L)$ is of degree $k - 1$ and $D$ is a constant matrix, the remainder term. To obtain the precise form of $D$, we will derive this result, which is simply a linear transformation of the original polynomial matrix. We have

⁶ That is, denoting an arbitrary root of the determinant equation by $z$, $|z| > 1 + \epsilon$, for some $\epsilon > 0$, for all $z$ satisfying this equation.

⁷ Note that this exponential decay condition is only sufficient and not necessary to guarantee convergence.
and so on. By induction, we can continue this substitution for any $k$ to get
A similar result holds for division on the right. In dealing with integrated series, the case $B = I$ is of particular interest; then

$$A(L) = A(1) + (1 - L)\,H(L), \qquad (15)$$

where $A(1)$ is equal to $A(L)$ evaluated at $L = 1$. Note that from (13) and (15), for the case $B = I$,
and
Further, $A(1)$ is called the total effect. When $D = A(1) = 0$, then $A(L)$ is divisible on the left by $(1 - L)I$ without a remainder, and hence can be rewritten in terms of the operator $(1 - L)$ alone.

The next main result to be proved is the isomorphic relationship between polynomial matrices and companion matrices. This will clarify the derivation of latent roots of polynomial matrices, which are of great interest in analysing dynamics and co-integration. Consider the system of $n$ deterministic linear equations:

$$A(L)x_t = \sum_{i=0}^{k} A_i x_{t-i} = 0. \qquad (16)$$
We set $A_0 = I$ as a normalization. The same information can be represented in stacked form (called the companion form) by defining the following matrices and vectors:
Direct multiplication of $\Phi$ into $X_{t-1}$ and comparison of that outcome with $X_t$ reveals that the second expression in (18) merely augments the original system with a set of identities of the form $x_{t-1} = x_{t-1}$, etc. The corresponding advantage of companion forms is that, whatever the value of $k$ in (16), the companion form is always of first order, and hence can be analysed using already established tools. This advantage is pronounced when we wish to find the eigenvalues of $A(L)$, and do so by solving

It will be convenient to re-express (19) in terms of the negatives of the inverses of the eigenvalues, $\mu = -1/\lambda$, and to solve

Using the definition of $\Phi$ from (17) in (20), we have
From the partitioned inverse formula, where $|D| \neq 0$,
The first equality follows from the fact that the determinant of the first matrix following the equality is one. Repeating these operations in the alternative direction, if $|E| \neq 0$, establishes that

Both results will be used below. Here, we apply (22) to the determinant in (21), choosing $E$ as the large $n(k - 1) \times n(k - 1)$ matrix in the upper-left corner, and $D = I$. Then $FD^{-1}G$ is zero except for its top-right block, which is [...], and $|D| = 1$. Thus,
(23)

Comparing (21) with (23), the analysis can be seen to repeat, leading to $|A(\mu)|$ after $k - 1$ steps. Thus, the latent roots can be found by equating either expression to zero and solving. Since $A(\cdot)$ is $n \times n$, $\Phi$ is $nk \times nk$ and so has $nk$ eigenvalues, as required.

From (13), when $B = I$, if $A(1)$ has rank $r < n$, then $|A(1)| = 0$ and hence $A(L)$ has $n - r$ unit roots. Conversely, if $A(1)$ has rank $n$, $A(L)$ has none of its eigenvalues equal to unity.

Next, derivatives of polynomial matrices with respect to their arguments will be needed, and we have
This is reminiscent of the mean-lag formula in a scalar distributed lag. From the result that $H(1) = -\sum_{i=1}^{k} iA_i$, we now see that $H(1) = -\Gamma$.

Thus, when $A(1) = 0$, so that $A(L) = (1 - L)H(L)$, then $|H(L)| = 0$ delivers the remaining eigenvalues. If $H(1)$ did not have rank $n$ when $A(1) = 0$, then $|H(1)| = 0$, so $H(L)$ also has unit roots. Using (13) and (15) to write $H(L) = H(1) + (1 - L)K(L)$, we note that, in the extreme case that $\Gamma = 0$, $H(L) = (1 - L)K(L)$, which implies that $A(L) = (1 - L)^2 K(L)$. Consequently, equation (16) would become $(1 - L)^2 K(L)x_t = 0$, yielding a system in second differences. There is a close affinity between the ranks of $A(1)$, $H(1)$, etc., and the number of differences that can be extracted from $A(L)$.

Finally, polynomial matrices are invariant under non-singular linear transformations in that they have many equivalent representations with the same properties. This is clear from (13) above. More generally,
In terms of (16),

For example, when $k = 1$,

Such linear transformations are used regularly in Chapter 8.
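Looking back at the division result for the case $B = I$, the statements in this section ($A(L) = (1 - L)H(L)$ when $A(1) = 0$, and $H(1) = -\sum_{i=1}^{k} iA_i$) are consistent with the decomposition $A(L) = A(1) + (1 - L)H(L)$ with $H_j = -\sum_{i>j} A_i$. The sketch below assumes that form and checks it numerically for a small example; the helper names and the particular $2 \times 2$ coefficient matrices are hypothetical and chosen only for illustration.

```python
def mat_add(X, Y):
    """Element-wise sum of two equally sized matrices (lists of rows)."""
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def mat_scale(c, X):
    """Multiply every element of X by the scalar c."""
    return [[c * a for a in r] for r in X]

def zeros(n):
    return [[0.0] * n for _ in range(n)]

def divide_by_one_minus_L(A):
    """Split A = [A_0, ..., A_k] into A(1) and H = [H_0, ..., H_{k-1}]
    such that A(L) = A(1) + (1 - L) H(L), using H_j = -(A_{j+1} + ... + A_k)."""
    n = len(A[0])
    total = zeros(n)
    for Ai in A:
        total = mat_add(total, Ai)            # A(1) = sum of all A_i
    H = []
    for j in range(len(A) - 1):
        tail = zeros(n)
        for Ai in A[j + 1:]:
            tail = mat_add(tail, Ai)
        H.append(mat_scale(-1.0, tail))
    return total, H

def recombine(total, H):
    """Rebuild the coefficients of A(1) + (1 - L) H(L)."""
    n, k = len(total), len(H)
    out = []
    for m in range(k + 1):
        cur = H[m] if m < k else zeros(n)
        prev = H[m - 1] if m >= 1 else zeros(n)
        coeff = mat_add(cur, mat_scale(-1.0, prev))
        if m == 0:
            coeff = mat_add(coeff, total)
        out.append(coeff)
    return out

A = [[[1.0, 0.0], [0.0, 1.0]],                # A_0 = I (normalization)
     [[-1.5, 0.0], [0.25, -1.0]],             # A_1
     [[0.5, 0.0], [0.0, 0.0]]]                # A_2
A1, H = divide_by_one_minus_L(A)
H_at_one = mat_add(H[0], H[1])                # H(1) = H_0 + H_1
```

In this example $A(1)$ is singular, which corresponds to the reduced-rank case discussed above; the code only verifies the algebraic identity and makes no attempt to count unit roots.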
5.3. Integration and Co-integration: Formal Definitions and Theorems

DEFINITION 1 (adapted from Engle and Granger 1987). The components of the vector $x_t$ are said to be co-integrated of order $d$, $b$, denoted $x_t \sim \mathrm{CI}(d, b)$, if (i) $x_t$ is I($d$) and (ii) there exists a non-zero vector $\alpha$ such that $\alpha' x_t \sim \mathrm{I}(d - b)$, $d \ge b > 0$. The vector $\alpha$ is called the co-integrating vector.

If $x_t$ has $n > 2$ components, then there may be more than one co-integrating vector $\alpha$; it is possible for several equilibrium relationships to govern the joint evolution of the variables. If there exist exactly $r$ linearly independent co-integrating vectors with $r \le n - 1$, then these can be gathered into an $n \times r$ matrix $\alpha$. The rank of $\alpha$ will be $r$ and is called the co-integrating rank.

DEFINITION 2. A vector time-series $x_t$ has an error-correction representation if it can be expressed as

where $\omega_t$ is a stationary multivariate disturbance, with $A(0) = I_n$, $A(1)$ having only finite elements, $z_t = \alpha' x_t$, and $\gamma$ a non-zero
vector. For the case where $d = b = 1$, and with co-integrating rank $r$, the Granger Representation Theorem holds (see Section 5.3.1).

Granger's theorem will prove that a co-integrated system of variables can be represented in three main forms: the vector autoregressive (VAR), error-correction, and moving-average forms. These representations are all isomorphic to each other, and the theorem establishes the restrictions that hold between the lag-polynomial matrices in each representation of the process.

We may prove the theorem in at least three (equivalent) ways, depending on the representation from which we choose to start. The theorem is stated in Section 5.3.1. Following this statement, we take the autoregressive representation as our starting-point and derive the main results. This proof is due to Johansen (1991a). The sub-section after the proof contains a detailed interpretation of the results. In Chapter 8 we return to the theorem and provide another proof, this time starting from the moving-average representation. Proving the theorem in two ways highlights some interesting symmetries which exist among the equivalent representations of the process.
5.3.1. Granger Representation Theorem (adapted from Engle and Granger 1987 and Johansen 1991a)

Let $x_t$ be an I(1) vector of $n$ components, each with (possibly) deterministic trend in mean. Suppose that the system can be written as a finite-order vector autoregression:
(25)

where the $\varepsilon_t$ satisfy assumptions (3.16a)-(3.16d) and the first $k$ data points $x_{1-k}, x_{1-k+1}, \ldots, x_0$ are fixed. The model can then be rewritten in error-correction form as
Both (25) and (26) can be written as

where
Equation (26) may also be written as

where $\Psi(L) = (1 - L)^{-1}(\pi(L) - \pi(1)L^k) = I_n - \sum_{i=1}^{k-1}\Gamma_i L^i$. From (13) above, $\Psi(L)$ can always be constructed. Further, the derivative of $\pi(z)$ at $z = 1$ is equal to $-\Psi = -\Psi(1)$.

Define the orthogonal complement $P_\perp$ of any matrix $P$ of rank $q$ and dimension $n \times q$ as follows ($0 < q < n$):
(i) $P_\perp$ is of dimension $n \times (n - q)$;
(ii) $P_\perp' P = 0_{(n-q)\times q}$, $P' P_\perp = 0_{q\times(n-q)}$;
(iii) $P_\perp$ has rank $n - q$, and lies in the null space of $P'$.

Certain key assumptions may now be stated.

ASSUMPTION A1. The characteristic polynomial,
has roots either equal to or strictly greater than one; that is, $|\pi(z)| = 0$ implies that either $|z| > 1$ or $z = 1$.

ASSUMPTION A2. The $n \times n$ matrix $\pi$ has reduced rank $r < n$ and is therefore expressible as the product of two $n \times r$ matrices $\gamma$ and $\alpha$, where $\gamma$ and $\alpha$ have rank $r$. Thus $\pi = \gamma\alpha'$.

ASSUMPTION A3. The $(n - r) \times (n - r)$ matrix $\gamma_\perp'\Psi\alpha_\perp$ has full rank $n - r$.

Assumption A1 guarantees that the non-stationarity of $x_t$ can be removed by differencing. A2 rules out a stationary $x_t$ process. If $\pi$ had full rank (that is, if $|\pi(z)|$ had no roots at one), then from (27), $x_t = \pi^{-1}(L)(\mu + \varepsilon_t)$, which would imply that $x_t$ was stationary. It is also the statement, in the autoregressive form, that the system has $r$ linearly independent co-integrating vectors. In light of Assumption A2, $\gamma\alpha'$ provides a transformation of the $\pi$ matrix (and hence a linear combination of the $x_{it}$ which is stationary). The significance of A3 will become evident in due course, but essentially, it ensures that $x_t$ is integrated of order no greater than 1. Under the assumptions stated above, the following results may be proved:

(R1) $\Delta x_t$ is stationary.
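(R2) $\alpha' x_t$ is stationary.
(R3) $E(\Delta x_t) =$
(R4) $E(\alpha' x_t) = -($
(R5) $\Delta x_t$ has a moving-average representation given by
(R6) $C(1) = \alpha_\perp(\gamma_\perp'\Psi\alpha_\perp)^{-1}\gamma_\perp'$ has rank $n - r$.
(R7) $\alpha' C(1) = 0_{r\times n}$, $C(1)\gamma = 0_{n\times r}$.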
where $C(L) = C(1) + (1 - L)C_1(L)$, $\tau = C(1)\mu$, $x_0$ is a constant (vector) of integration, and $S_t = C_1(L)\varepsilon_t$.

Proof. Multiply (27) by $\gamma'$ and $\gamma_\perp'$ respectively to obtain the equations
using the decomposition $\pi = \gamma\alpha'$ and the result that $\gamma_\perp'\gamma = 0_{(n-r)\times r}$. The matrix $\pi$ is not invertible, and the system given by (28a)-(28b) therefore cannot be inverted directly to express the $x_{it}$ in terms of the $\varepsilon_{it}$. To obtain an invertible system, we define two new variables, $\omega_t = (\alpha'\alpha)^{-1}\alpha' x_t$ and $v_t = (\alpha_\perp'\alpha_\perp)^{-1}\alpha_\perp'\Delta x_t$. Next, define the matrices $\bar{\alpha} = \alpha(\alpha'\alpha)^{-1}$ and $\bar{\alpha}_\perp = \alpha_\perp(\alpha_\perp'\alpha_\perp)^{-1}$. Let $R = (\alpha, \alpha_\perp)$ be an $n \times n$ matrix of rank $n$. Then $R(R'R)^{-1}R' = I_n$ and hence $(\alpha\bar{\alpha}' + \alpha_\perp\bar{\alpha}_\perp') = I_n$. Thus,

Substituting in (28a)-(28b) gives
where in (28a) the first term on the left-hand side needs to be written first as $-(\gamma'\gamma)(\alpha'\alpha)(\alpha'\alpha)^{-1}\alpha' x_t$. The equations for $\omega_t$ and $v_t$ can now be written in autoregressive form as

with

For $z = 1$, this matrix has determinant
which is non-zero by Assumptions A2 and A3. Hence $z = 1$ is not a root. For $z \neq 1$, straightforward but tedious algebra enables us to express the matrix $A(z)$ as

To show this, substitute for $\Psi(z)$ in $A(z)$ in terms of $\pi(z)$ and $\pi(1) = -\pi$ from (27), and use the decomposition $\pi = \gamma\alpha'$ and the orthogonality condition $\gamma_\perp'\gamma = \alpha_\perp'\alpha = 0_{(n-r)\times r}$. For $z \neq 1$, therefore, from (31),

where we have used the result that the determinant of a matrix obtained by multiplying $n - r$ columns (or rows) of an $n \times n$ matrix by a constant is the determinant of the original matrix multiplied by the constant raised to the power $n - r$. Thus, for $z \neq 1$, $|A(z)| = 0$ if and only if $|\pi(z)| = 0$. By Assumption A1, if we exclude $z = 1$, the only remaining roots of this determinant lie outside the unit circle.

This shows that all the roots of $|A(z)| = 0$ are outside the unit disk. Hence the system defined by (29a)-(29b) is invertible and [...] which are all I(0) if co-integrability holds. Thus, $\alpha$ is consistently estimated by the regression despite the complete omission of all dynamics. In fact,
(48)
Since $\{v_t\}$ is $I(0)$ under co-integrability but $\{x_t\}$ is $I(1)$,
whereas
Thus,
which implies that
Hence $\hat\alpha$ converges to $\alpha$ at rate $T$ and not at the usual rate $T^{1/2}$; that is, $\hat\alpha - \alpha$ is $O_p(T^{-1})$ rather than $O_p(T^{-1/2})$. Convergence is rapid asymptotically and it is this rapid convergence of the estimates of the coefficients that is used by Engle and Granger as the basis of their two-step estimator.
Since $\hat\alpha$ differs from $\alpha$ by terms of $O_p(T^{-1})$, the asymptotic results for estimation of dynamic models with $I(1)$ variables will be the same whether $\alpha$ is estimated or known. Moreover, differencing must reduce the order of integration of an integrated variable by unity, so if $\Delta y_t$ is related to $\Delta x_t$, and perhaps lags of both of these, and if $\{x_t\}$ and $\{y_t\}$ are co-integrated, then $y_{t-1} - \alpha x_{t-1}$ is $I(0)$ and can be included in the ECM model as if $\alpha$ were known (that is, the sampling variance of $\hat\alpha$ can be ignored). If $\{y_t\}$ and $\{x_t\}$ are not co-integrated, then we have the familiar spurious regression problem; if they are co-integrated, the benefits accruing from a static regression are potentially large.
The so-called 'super-consistency theorem' due to Stock (1987) may be stated formally as follows.
THEOREM (Stock 1987). Suppose that $x_t$ satisfies $(1 - L)x_t = C(L)\varepsilon_t$ with $C(L) = C(1) + (1 - L)C^*(L)$, where $C^*(L)$ has all of its latent roots inside the unit circle. If $C^*(L)$ is absolutely summable,10 the disturbances have finite fourth-order absolute moments, and $x_t$ is CI(1,1) with $r$ co-integrating vectors (incorporated in a matrix $\alpha$) satisfying, uniquely,
then11
Thus, instead of converging at rate $T^{1/2}$, as in stationary processes, least-squares estimators converge at a rate of $T$. This theorem and the error-correction representation of co-integrated systems may be allied to give the following theorem.
THEOREM (Engle and Granger 1987). The two-step estimator of a single equation of an error-correction system with one co-integrating vector, obtained by taking the estimate $\hat\alpha$ of $\alpha$ from the static regression in place of the true value for estimation of the error-correction form at a second stage, will have the same limiting distribution as the maximum-likelihood estimator using the true value of $\alpha$. Least-squares standard errors in the second stage will provide consistent estimates of the true standard errors.
10 The infinite sequence $\{c_j\}_{j=1}^{\infty}$ is said to be absolutely summable if $\sum_{j=1}^{\infty}|c_j| < \infty$. For the matrix $C^*(L)$ to be absolutely summable, the condition is that $\sum_{j=0}^{\infty}\|C_j^*\| < \infty$.
11 The elements of $q$ and $Q$ will typically be all zeroes and ones, defining one coefficient in each column of $\alpha$ to be unity and defining rotations if $r > 1$. $M = \mathrm{plim}\,E(T^{-2}\sum_{t=1}^{T}x_tx_t')$.
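The rate-$T$ convergence of the static-regression estimate can be checked informally by simulation. The following is a minimal Python sketch and is not taken from the text: it assumes a bivariate system in which $x_t$ is a driftless random walk and $y_t = \alpha x_t + v_t$, with $v_t$ a stationary AR(1) error so that the static regression omits all dynamics. The function names, the value $\alpha = 2$, the AR coefficient 0.5, and the number of replications are illustrative assumptions.

```python
import random


def static_ols_error(T, alpha=2.0, seed=0):
    """Absolute error of the static OLS estimate of alpha (no intercept).

    Assumed DGP (illustrative only): x_t is a random walk, and
    y_t = alpha * x_t + v_t with v_t = 0.5 * v_{t-1} + noise, so y and x
    are co-integrated and the regression ignores the dynamics in v_t.
    """
    rng = random.Random(seed)
    x = 0.0
    v = 0.0
    sxy = 0.0
    sxx = 0.0
    for _ in range(T):
        x += rng.gauss(0.0, 1.0)            # I(1) regressor
        v = 0.5 * v + rng.gauss(0.0, 1.0)   # I(0) equilibrium error
        y = alpha * x + v
        sxy += x * y
        sxx += x * x
    return abs(sxy / sxx - alpha)


def mean_error(T, reps=200):
    """Average absolute estimation error over `reps` simulated samples."""
    return sum(static_ols_error(T, seed=s) for s in range(reps)) / reps


if __name__ == "__main__":
    for T in (100, 400, 1600):
        err = mean_error(T)
        # T * err should stay roughly stable if the error is O_p(1/T).
        print(T, round(err, 5), round(T * err, 3))
```

If the estimator converged at the usual rate $T^{1/2}$, a sixteen-fold increase in the sample size would shrink the average error by a factor of about 4; under super-consistency the factor should be much closer to 16.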
5.6.1. Sketch-proof of Engle-Granger Theorem (Bivariate Case)
The following is a proof of this theorem for the bivariate case. Consider the estimation of $\beta$ and $\gamma$ in the two equations given by
$\Delta y_t = \beta\Delta x_t + \gamma(y_{t-1} - \alpha x_{t-1}) + \varepsilon_t$, (53)
$\Delta y_t = \beta\Delta x_t + \gamma(y_{t-1} - \hat\alpha x_{t-1}) + \varepsilon_t^*$. (54)
$y_t$ and $x_t$ are co-integrated $I(1)$ variables with the co-integrating parameter given by $\alpha$. In the context of the discussion in this chapter, the error-correction mechanism is estimated in (53) using the true value of the co-integrating parameter, while in (54) $\hat\alpha$ is substituted for $\alpha$, where $\hat\alpha$ is derived from the static regression of $y_t$ on $x_t$. Also, $\varepsilon_t^* = \varepsilon_t + \gamma(\hat\alpha - \alpha)x_{t-1}$. Let $z_t = y_t - \alpha x_t$. We need to show that the asymptotic distributions of the estimators $\hat\beta$ and $\hat\gamma$, of $\beta$ and $\gamma$ respectively, are the same regardless of whether one uses $\alpha$ or $\hat\alpha$ (that is, whether one estimates (53) or (54)).
In standard fashion, we have from (53) (assuming adequate initial values)
The estimators derived from (54) are also given by (55) but with $\hat z_{t-1}$ and $\varepsilon_t^*$ replacing $z_{t-1}$ and $\varepsilon_t$. From this, it is easy to deduce that the result will be demonstrated if the following conditions are shown to be true:
(iii) the asymptotic distributions of are the same;
(iv) the asymptotic distributions of are the same.
In (53), we assume that $\{\varepsilon_t\}$ is an innovation process such that $E(\Delta x_t\varepsilon_t) = 0$.
Note first that, by the properties of $I(0)$ and $I(1)$ series, as used and discussed in Chapters 3 and 4, the following expressions are $O_p(1)$ (that is, non-explosive and non-degenerate as $T \to \infty$):
Secondly,
Using (59),
Result (i) now follows from (57) and (58). Also,
Result (ii) now follows from (56), (57), and (58). Finally,
By (57) and (58), the last two expressions on the right-hand side of the above equality are $O_p(T^{-1/2})$. Result (iii) follows, and (iv) is proved analogously from:
6
Regression with Integrated Variables
We have seen how the presence of integrated variables poses some special problems which do not appear when working with stationary series. These might lead us to believe that a new range of techniques needs to be considered in order to handle such data. However, as we show in this chapter, we can continue to apply standard regressions if we pay attention to orders of integration and use dynamic specifications which take account of any co-integrating relationships among the variables.
The Engle-Granger theorem in Chapter 5, laying emphasis on simple static regressions, implies a good deal about the way in which an investigator ought to proceed with an econometric study of integrated variables. Some of this is related to the evolution of modelling practice among econometricians.
Econometricians of the 1970s began to be suspicious of regressions using data in levels. Their suspicions were reinforced by worries expressed by time-series analysts relating to spurious regressions. The focus of attention began to shift towards the need to have properly specified models with rich dynamic structures. The move, following Mizon (1977), Sims (1977), Hendry and Mizon (1978), and Hendry and Richard (1982), was towards a method of econometric research that preferred models which began with as general a specification as possible, and continued with simplification to a parsimonious econometric model following from imposing constraints consistent with observed data. (See Spanos (1986) for a detailed treatment.) The literature on co-integration reinstated some confidence in static regressions in levels, and good econometric method appeared to have taken a full circle; as long as the $I(1)$ variables were co-integrated, such regressions made sense.
There are nonetheless several reasons for continuing to treat static regressions as being in general sub-optimal. First of all, the estimate $\hat\alpha$ is biased for the co-integrating parameter $\alpha$ and, although that bias is $O_p(T^{-1})$, it can be substantial in finite samples. The bias is likely to be a function of some parameter such as the mean lag of the dynamic adjustment process relating $\{y_t\}$ to $\{x_t\}$. In some circumstances, therefore, a return to dynamic modelling would seem to be the appropriate response to the problems of static-regression biases. Already a body of work exists demonstrating the poor performance of static regressions for many types of problem (Banerjee, Dolado, Hendry, and Smith 1986, and Stock 1987). Second, the distributions of coefficient estimates will typically take non-standard forms even where the series are co-integrated. The 'non-standardness', by which we generally mean asymptotic non-normality, comes from the property that the series are integrated of order greater than or equal to 1. The fundamental point is that the distribution theory that applies to non-stationary series is different from the familiar Gaussian asymptotic theory. The estimators have distributions, in general, which are functionals of the Wiener processes discussed in Chapters 1 and 3. However, some of the standard asymptotic theory may be restored in dynamic models.
We will elaborate on the second of these points, leaving a discussion of the first until Chapter 7. It is important to point out at the outset, in order not to mislead readers, that it is not true that single-equation dynamic models are necessarily superior to their static counterparts. The next two sections present examples where single-equation dynamic models do perform satisfactorily. Yet, as the discussion in Chapter 8 shows, it is possible to construct many cases where single-equation dynamic models by themselves are not sufficient for obtaining efficient and unbiased estimates (see Engle et al. 1983 and Phillips and Loretan 1991).
There are several interrelated difficulties which are important and which collectively imply that the issue is broader than simply a comparison of dynamic with static models.
An informal description of the problems encountered in modelling non-stationary variables in a single-equation framework would identify at least five effects. First, the presence of unit roots induces non-standard distributions of the coefficient estimates. Second, the error process may not be a martingale difference sequence. Third, the explanatory variables may each be generated by processes that display autocorrelation; taken in conjunction with the second effect, this gives rise to 'second-order' biases. Fourth, there may be more than one co-integrating vector. Finally, the explanatory variables in the single equation may not be weakly exogenous for the parameters being estimated. Weak exogeneity can fail if, say, a co-integrating vector enters more than one equation in the system generating the variables.
Static regressions can be affected by all five of the problems listed above, while dynamic models may be able to accommodate the first three effects, as in the examples given in the sections that follow. However, estimates derived from single-equation dynamic models are not optimal if weak exogeneity fails to hold. This final observation
extends the discussion from the realm of modelling unit-root processes to the all-encompassing realm of general econometric modelling. This discussion is formalized in Chapter 8 and illustrated with several examples.
6.1. Unbalanced Regressions and Orthogonality Tests
Mankiw and Shapiro (1985, 1986) drew attention to a problem that may arise in applying standard distributions to inference where there are non-stationary (or borderline non-stationary) series present, and in particular to the problem of inference concerning orthogonality between series. While the problem is, as with spurious regression, essentially a problem of integrated data, it will appear with near-integrated data in finite samples.1 With this qualification, the problem may be said to arise in unbalanced regressions: that is, regressions in which the regressand is not of the same order of integration as the regressors, or any linear combination of the regressors.2
The Mankiw-Shapiro discussion centres on a condition such as
$E_{t-1}(y_t) = c$, implying $y_t = c + v_t$, $E_{t-1}(v_t) = 0$, (1)
where $E_{t-1}$ is interpreted as the expectation, conditional on information realized at time $t - 1$, of the value of some variable which may be dated in the future. That such a condition holds is often tested with a regression such as
where $c_2 = 0$ under the null hypothesis that (1) holds. Examples of such hypotheses and tests arise frequently in models that postulate the full use of all realized information. One such example from macroeconomics is Hall's (1978) formulation of the life-cycle/permanent-income model, which, given a stringent set of assumptions, implies that consumption should follow a random walk. Tests of this hypothesis have typically taken the form of regressions of differenced consumption on a constant and one or more lagged income or consumption terms; under the null hypothesis the coefficients on the lagged terms should not be significantly different from zero.
Mankiw and Shapiro suggest examining the case in which the regressor $x_t$ follows the AR(1) process:
1 While the experiments reported here use borderline stationary data, the results will also apply to integrated series.
2 These are sometimes called inconsistent regressions. Inconsistency in this sense is unrelated to the concept of an inconsistent estimator of a parameter: see n. 3.
with $\mathrm{corr}(\varepsilon_t, v_t) = \rho$ and $\mathrm{corr}(\varepsilon_{t+j}, v_t) = 0\ \forall j \neq 0$.
Note that this is not a problem of simultaneity bias: the regressor $x_{t-1}$ is uncorrelated with $v_t$. A structure such as this is appropriate in many models in which these tests have been used. In the Hall (1978) model, for example, $\rho = 1$ where $x_t$ and $y_t$ represent current income and the change in current consumption respectively. Mankiw and Shapiro use Monte Carlo simulations to tabulate estimates of the actual rejection frequencies and critical values in $t$-type tests of $H_0$: $c_2 = 0$, when standard $t$-values are used. Table 6.1 reproduces a selection of their results for model (2) and also for the model with a linear time trend,
TABLE 6.1. Percentage rejection frequencies of standard $t$-tests at nominal 5 per cent level^a
DGP: (1) + (3); Sample size = $T$; No. of replications = 1000

               Model (2), $\rho$ =              Model (4), $\rho$ =
$\theta$      1.0   0.9   0.8   0.5   0.0      1.0   0.9   0.8   0.5   0.0
(a) T = 50
0.999          30    24    20    11     7       60    45    36    16     6
0.99           26    20    15    10     7       54    40    33    15     6
0.98           22    17    15     8     7       50    37    30    14     5
0.95           17    12    10     7     6       38    30    25    12     6
0.90           12     9     8     6     6       28    22    19    10     6
0.00            5     6     6     5     5        6     7     7     5     6
(b) T = 200
0.999          29    23    20    10     5       61    48    38    18     5
0.99           18    15    13     8     4       41    32    27    13     5
0.98           13    10     9     7     5       29    24    20    11     6
0.95            9     7     7     6     5       17    14    12     7     6
0.90            7     6     6     6     6       10     9     8     6     7
0.00            5     4     4     5     5        5     5     4     5     5
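The size distortions in Table 6.1 can be explored with a small Monte Carlo experiment. The following Python sketch is illustrative only and does not reproduce Mankiw and Shapiro's design exactly. It assumes that the test regression (2) has the form $y_t = c_1 + c_2x_{t-1} + v_t$, that the regressor process (3) is $x_t = \theta x_{t-1} + \varepsilon_t$ with $x_0 = 0$, that the innovations are Gaussian with unit variance, and that the nominal 5 per cent two-sided critical value is 1.96. The function names and the seed are hypothetical.

```python
import math
import random


def slope_t_ratio(y, x):
    """t-ratio on c2 in the OLS regression y = c1 + c2 * x + error."""
    n = len(y)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    c2 = sxy / sxx
    c1 = my - c2 * mx
    rss = sum((b - c1 - c2 * a) ** 2 for a, b in zip(x, y))
    return c2 / math.sqrt(rss / (n - 2) / sxx)


def rejection_rate(theta, rho, T=50, reps=1000, crit=1.96, seed=1):
    """Share of replications in which |t| > crit although c2 = 0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x_prev = 0.0                      # assumed start-up value x_0 = 0
        x_lag, y = [], []
        for _ in range(T):
            eps = rng.gauss(0.0, 1.0)
            u = rng.gauss(0.0, 1.0)
            # v_t has unit variance and correlation rho with eps_t.
            v = rho * eps + math.sqrt(1.0 - rho * rho) * u
            x_lag.append(x_prev)          # regressor is x_{t-1}
            y.append(v)                   # y_t = c + v_t with c = 0
            x_prev = theta * x_prev + eps
        if abs(slope_t_ratio(y, x_lag)) > crit:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    for rho in (1.0, 0.5, 0.0):
        print(rho, rejection_rate(theta=0.999, rho=rho))
```

With $\theta$ close to one and $\rho$ close to one, the rejection frequency should lie well above the nominal 5 per cent, while for $\rho = 0$ it should stay close to the nominal level, in line with the pattern of the Model (2) columns of the table.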
a
This table compares two sample sizes. While the test size distortions are generally smaller for the larger sample and will vanish as $T \to \infty$, this feature is specific to the borderline-stationary processes used ($\theta$ [...]
[...] $\xi_t^1 = \sum_{s=1}^{t}\eta_s$, and defining $\xi_t^j$ (the $j$-fold summation of the $\eta_t$) recursively as $\xi_t^j = \sum_{s=1}^{t}\xi_s^{j-1}$, $j > 1$, the transformation $D$ is chosen such that
or, equivalently where
and $L$ is the lag operator. The variates $\nu_t$ are referred to as the canonical regressors associated with $Y_t$. The lag polynomial $F_{11}(L)$ has dimension $k_1 \times N$, and $\sum_{j=0}^{\infty}F_{11j}F_{11j}'$ is non-singular. $F_{jj}$ is assumed to have full row rank $k_j$ (which may be equal to zero) for $j = 2, \ldots, 2g + 1$, so
Since we may be interested in estimating only some of the $k$ equations in (28), we next need to define a selection matrix $C$. If we needed to consider only $n \le k$, we could look at the regression of $CY_t$ on $Y_{t-1}$, where $C$ is an $n \times k$ matrix of constants. The $n$ regression equations to be estimated are then
The asymptotic analysis in SSW is derived in stacked single-equation form. In order to use this form, we need the symbol $\otimes$, which denotes a Kronecker product defined as follows: consider the $m \times n$ matrix $A = \{a_{ij}\}$ and the $p \times q$ matrix $B$; the Kronecker product of $A$ and $B$ (in that order) is the $mp \times nq$ matrix,
$A \otimes B = \{a_{ij}B\}$.
$\mathrm{vec}(\cdot)$ denotes the column-wise vectoring operator. Thus, writing the matrix $A$ as $A = (a_1, a_2, \ldots, a_n)$, where each of the $a_i$ is an $m \times 1$ vector, $\mathrm{vec}(A)$ is given by
$\mathrm{vec}(A) = (a_1', a_2', \ldots, a_n')'$.
If $X = [Y_1, Y_2, \ldots, Y_{T-1}]'$, $s = \mathrm{vec}(S)$, $v = \mathrm{vec}(\eta)$, and $\beta = \mathrm{vec}((A)')$, then (32) can be written in stacked form as
In order to express (33) in terms of the transformed regressors $Z = [Z_1, Z_2, \ldots, Z_{T-1}]' = XD'$, note that the coefficient vector corresponding to these is given by $\delta = (I_n \otimes D'^{-1})\beta$.11 Thus, finally,
11 To show this, substitute for $Z = XD'$ and $\delta = (I_n \otimes D'^{-1})\beta$ in (34) giving $s = (I_n \otimes XD')(I_n \otimes D'^{-1})\beta + (\Sigma^{1/2} \otimes I_{T-1})v$. Now $(A_1 \otimes A_2)(A_3 \otimes A_4) = (A_1A_3) \otimes (A_2A_4)$, for arbitrary matrices $A_i$, $i = 1, 2, 3, 4$, provided the matrices are conformable. Using this rule (33) is recovered as required.
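The Kronecker product, the vec operator, and the mixed-product rule used in the footnote can be verified numerically. The sketch below is a minimal pure-Python illustration with small integer matrices chosen arbitrarily for the purpose; the helper names `kron`, `vec`, and `matmul` are not from the text.

```python
def kron(A, B):
    """Kronecker product of A (m x n) and B (p x q): the mp x nq matrix {a_ij * B}."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def vec(A):
    """Column-wise vectoring: stack the columns of A into a single list."""
    return [A[i][j] for j in range(len(A[0])) for i in range(len(A))]


def matmul(A, B):
    """Ordinary matrix product of conformable matrices A and B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


# Arbitrary conformable integer matrices (illustrative values only).
A1 = [[1, 2], [3, 4]]             # 2 x 2
A2 = [[1, 0, 2], [0, 1, 1]]       # 2 x 3
A3 = [[0, 1], [1, 0]]             # 2 x 2
A4 = [[1, 2], [0, 1], [3, 0]]     # 3 x 2

# Mixed-product rule: (A1 (x) A2)(A3 (x) A4) = (A1 A3) (x) (A2 A4).
lhs = matmul(kron(A1, A2), kron(A3, A4))
rhs = kron(matmul(A1, A3), matmul(A2, A4))
print(lhs == rhs)   # True
print(vec(A1))      # [1, 3, 2, 4]
```

Integer entries are used so that the two sides of the mixed-product rule can be compared exactly, without any floating-point tolerance.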
The OLS estimator $\hat\delta$ of $\delta$ in the stacked transformed regression model (34) is given by
It is possible to see from (30) that the moments involving the different components of $Z_t$ converge at different rates. For example, $Z_{1,t}$ and $Z_{2,t}$ are $O_p(1)$ while $Z_{3,t}$ is $O_p(t^{1/2})$, $Z_{4,t}$ is $O_p(t)$, and so on. Hence the sample second moments, which are what we would be interested in when looking at the matrix $Z'Z$, converge at a rate of $T$ for the $Z_{1,t}$ and $Z_{2,t}$ components, at a rate $T^2$ for the $Z_{3,t}$ component, and at a rate $T^3$ for the $Z_{4,t}$ component. In order to handle these different orders, SSW use the scaling matrix $\Upsilon_T$, given by
(36) 1
All the convergence results use the scaled $Z'Z$ matrix $\Upsilon_T^{-1}Z'Z\Upsilon_T^{-1}$; let us call this scaled matrix $Q$.
The first step in the proof is to derive the limiting matrix for $Q$. SSW show that, under certain regularity conditions, $Q \Rightarrow V$ where the elements of $V$ may be described as follows:
(a) $V_{11}$ and $V_{12}$ are non-random matrices given by $\sum_{j=0}^{\infty}F_{11j}F_{11j}'$ and $\sum_{j=0}^{\infty}F_{11j}F_{21j}'$ respectively. Additionally, $V_{12} = V_{21}'$.
(b) $V_{1p} = V_{p1}' = 0$, $p = 3, \ldots, 2g + 1$.
(c) $V_{22}$ is also non-random, given by $F_{22}F_{22}' + \sum_{j=0}^{\infty}F_{21j}F_{21j}'$.
(d) $V_{mp}$, where $m, p = 3, 5, 7, \ldots, 2g + 1$, are random matrices involving functionals of multivariate Wiener processes.
(e) $V_{mp}$, where $m = 2, 4, 6, \ldots, 2g$, $p = 3, 5, 7, \ldots, 2g + 1$, are also random matrices involving functionals of multivariate Wiener processes.
(f) $V_{mp} = [2/(p + m - 2)]F_{mm}F_{pp}'$, $p = 4, 6, \ldots, 2g$, $m = 2, 4, 6, \ldots, 2g$.
This is the first time we have used multivariate Wiener processes. The mathematical details involved in going from univariate to multivariate Wiener processes are complex and will not be dealt with here (for a good account, see Phillips and Durlauf 1986). However, the generalizations from our analysis in Chapter 3 can be understood intuitively fairly easily and the appendix sketches the bivariate case.
Thus, each element of a standardized $n \times 1$ multivariate Wiener process $W(r)$ is a univariate Wiener process and the elements of $W(r)$ are independent. In particular, $W(1)$ has the multivariate standard normal distribution, that is, $N(0, I_n)$. Further, $W(r) \in C[0,1]^n$, where $C[0,1]$ is the space of continuous functions defined on $[0,1]$.
Convergence results analogous to (3.17), for a sequence of mean zero random vectors $\{u_t\}$, can be proved by defining standardized sums such as
with $(t - 1)/T \le r < t/T$, and the matrix $\Omega$ is the long-run variance-covariance matrix of $u_t$ defined by $\Omega = \lim_{T\to\infty}E(T^{-1}S_TS_T')$ analogously with (3.16c). The $\{u_t\}$ innovation sequence satisfies conditions equivalent to those given by (3.16a)-(3.16d) for the univariate case. Provided suitable regularity conditions are satisfied, the following multivariate analogue of (3.18) may be proved: $R_T(r) \Rightarrow W(r)$.
Finally, multivariate analogues of all the convergence results given earlier for univariate processes may be derived. Thus, for example, referring to Table 3.3, where $y_t = y_{t-1} + u_t$:
To derive the results above we have assumed, as in Table 3.3, that $\{u_t\}$ is a white-noise innovation sequence with $I_n$ as the variance matrix.
The next step of the argument involves rewriting the estimator $\hat\delta$ in a form such that its distribution can be derived. This is done by first defining a non-singular matrix $H$ which, in essence, transposes the stacked version of the matrix $Z$. Thus,
(37)
From (35) ,
by substituting for $s$ from (34). Next, using the result that
Thus,
(38)
As noted above the matrix $V$ is the limiting matrix of $Q$. The asymptotic distribution of
is needed to give us the final result. This limiting vector, denoted by $\phi$, takes the following form:
where
(a) $\phi_m$ for all $m \ge 3$ are functionals of multivariate Wiener processes;
(b) $\phi_2 = \phi_{21} + \phi_{22}$, where $\phi_{22} = \mathrm{vec}[F_{22}W(1)'\Sigma^{1/2}]$, $W(1)$ has the multivariate standard normal distribution, and
Finally,
where $(\phi_1, \phi_{21})$ are independent of $(\phi_{22}, \phi_3, \ldots, \phi_{2g+1})$. Consolidating these steps, we have the following theorem.
This provides us with several interesting results. First, $\hat\delta$, and hence $\hat\beta$, is a consistent estimator of $\delta$, respectively $\beta$, in the presence of arbitrarily many unit roots and deterministic time trends. This observation relies on the assumption that the model is correctly specified, in the sense that the errors are martingale difference sequences, and the $\Upsilon_T$ may rescale by powers of $T$ greater than $\tfrac{1}{2}$.
We have already noted that the estimated coefficients on the elements of $Z_t$ converge to their probability limits at different rates. Hence, if some of the transformed regressors are dominated, in an order of probability sense, by stochastic components, their limiting distributions will be non-normal. On the other hand, if there are no $Z_t$ regressors dominated by stochastic trends (that is, if $k_3 = k_5 = \ldots = k_{2g+1} = 0$), then $\hat\delta$, and hence $\hat\beta$, has an asymptotic normal joint distribution. This happens because the terms involving the random integrals are no longer present, as may be seen from (30), where $k_3, k_5, \ldots, k_{2g+1}$ are the ranks of matrices multiplying the stochastic canonical regressors. If these matrices are absent, the transformed regression is considerably simplified as it is expressible solely in terms of stationary variables and deterministic trend terms. In such a case, therefore, $H(I_n \otimes \Upsilon_T)(\hat\delta - \delta) \Rightarrow N(0, H(\Sigma \otimes V^{-1})H')$ where $V$ is now a non-random matrix. Additionally the F-statistic associated with testing an arbitrary set of $q$ linear restrictions $R\beta = r$ is asymptotically distributed as $\chi_q^2/q$ in this case.
If a single stochastic trend is dominated by a non-stochastic trend, then, again, asymptotic normality holds. This is the result of West (1988) and may be seen using (30) and keeping track of the rates of convergence of the sample moments of the separate components of $Z_t$. Consider, for example, the set of canonical regressors given by $(\eta_t, 1, \xi_t^1, t)'$ and suppose the transformed regression is expressible in terms of these canonical regressors. Thus, while the sample variability of the stochastic trend term is $O_p(T)$, that of the deterministic trend is $O(T^{3/2})$. As shown by West (1988), and discussed in Section 6.2.1, in deriving the asymptotic distribution for this case, the deterministic trend component dominates the stochastic component and asymptotic normality follows.
The Stock-West (1988) example, discussed earlier, works because we are able to rewrite the regression in terms of canonical regressors which do not have any dominating stochastic component. The issue of domination, in this context, is best addressed by looking at the scaling matrix.
Four more examples will now be given to illustrate these arguments, using the framework developed above. The final example in this set of four contains recommendations for modelling with integrated series.
6.2.5. Example (Sims et al. 1990: 119)
Let the process $\{x_t\}$ be generated according to the following AR(2) process without drift:
Under $H_0$, $\beta_0 = 0$, $\beta_1 + \beta_2 = 1$ and $|\beta_2| < 1$ so that the autoregressive polynomial in (39) has only one unit root. If a constant is included in the regression of $x_t$ on its two lags, $Y_t$ (in the notation developed earlier) is given by
Transforming to the canonical regressor form,12 we have
(40)
where $\delta_1 = -\beta_2$, $\delta_2 = \beta_0$, and $\delta_3 = \beta_1 + \beta_2$, $Z_{1,t} = \Delta x_t$, $Z_{2,t} = 1$, and $Z_{3,t} = x_t$. It may also be shown that
(41)
where $\theta(L) = (1 + \beta_2L)^{-1}$ and $\theta^*(L) = (1 - L)^{-1}[\theta(L) - \theta(1)]$.
Note from (41) that $F_{21}(L) = 0$. This implies, by referring to the description of the $V$ matrix above, that $V$ is block-diagonal. The estimate $\hat\delta_1$ of the coefficient on the (differenced) stationary term has an asymptotically normal distribution with mean 0 and variance given by $V_{11}^{-1}$. The marginal distribution of $\hat\delta_2$, however, is not normal; because $F_{23}$ is not equal to zero, $Z_{2,t}$ and $Z_{3,t}$ are asymptotically correlated, and since $Z_{3,t}$ has a Wiener distribution, so does the coefficient on $Z_{2,t}$.
If an intercept is not included in the regression, we have a $2 \times 2$ block-diagonal $V$ matrix. The estimated coefficient $\hat\delta_1$ still has an asymptotically normal distribution, with $\hat\delta_1$ converging to its probability limit at rate $T^{1/2}$, while $\hat\delta_3$ has a Wiener distribution with convergence at rate $T$. Any joint test involving $\hat\delta_1$ and $\hat\delta_3$ will also have a non-standard distribution.
The analogy with the Stock-West example is direct. In (27) we had a series of terms integrated of order zero. The coefficient estimates on all these stationary terms were jointly and individually asymptotically normally distributed. The joint distribution of $\hat\beta$ in (27), with any of the $\hat\pi_i$, was of course non-standard. This observation applies equally well here.
There is, however, an important difference between the Stock-West example and the current example. In the former case, because $\beta$ had already been set equal to 1, our parameters of interest could all be written as coefficients on mean-zero and non-integrated variables. Inference could then be conducted using standard tables. In the latter case, although we can use standard tables to test for the significance of $\beta_2$, a test of $\beta_1 + \beta_2 = 1$ still requires us to use non-standard distribution theory (and so tables constructed by simulation). In a sense, our rewriting in terms of stationary variables is not sufficiently successful to enable us to conduct inference solely using standard tables. Example 6.2.6 examines this issue in more detail.
12 This transformation is not unique, and one could imagine choosing others; however, (39) can be rewritten as $x_t = (\beta_1 + \beta_2)x_{t-1} - \beta_2(x_{t-1} - x_{t-2}) + \eta_t$, because $\beta_0 = 0$ under the null, and this suggests the decomposition given by (40). It has the advantage of making $\delta_1$ ($= -\beta_2$) the coefficient of a non-integrated random variable, since $x_t$ is an integrated series.
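The re-parameterization behind this example is purely algebraic: regressing $x_t$ on $(x_{t-1}, x_{t-2})$ and regressing it on $(\Delta x_{t-1}, x_{t-1})$ span the same space, so the OLS estimates satisfy $\hat\delta_1 = -\hat\beta_2$ and $\hat\delta_3 = \hat\beta_1 + \hat\beta_2$ exactly. The Python sketch below illustrates this for the no-intercept case. It is an illustration under stated assumptions rather than the authors' own computation: the value $\beta_2 = -0.4$, the sample size, the seed, and the function names are all hypothetical choices.

```python
import random


def ols2(y, r1, r2):
    """OLS of y on two regressors without intercept; returns (b1, b2)."""
    s11 = sum(a * a for a in r1)
    s22 = sum(b * b for b in r2)
    s12 = sum(a * b for a, b in zip(r1, r2))
    s1y = sum(a * c for a, c in zip(r1, y))
    s2y = sum(b * c for b, c in zip(r2, y))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def simulate_ar2(T, beta2, seed=0):
    """x_t = (1 - beta2) x_{t-1} + beta2 x_{t-2} + eta_t: one unit root, no drift."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(T):
        x.append((1.0 - beta2) * x[-1] + beta2 * x[-2] + rng.gauss(0.0, 1.0))
    return x


def compare(T=500, beta2=-0.4, seed=0):
    """Estimate both parameterizations on the same simulated sample."""
    x = simulate_ar2(T, beta2, seed)
    y, lag1, lag2 = x[2:], x[1:-1], x[:-2]
    dlag1 = [a - b for a, b in zip(lag1, lag2)]   # Delta x_{t-1}
    b1, b2 = ols2(y, lag1, lag2)                  # levels form
    d1, d3 = ols2(y, dlag1, lag1)                 # canonical form
    return b1, b2, d1, d3


if __name__ == "__main__":
    b1, b2, d1, d3 = compare()
    print("beta1 + beta2 =", b1 + b2, " delta3 =", d3)
    print("-beta2        =", -b2, " delta1 =", d1)
```

The point of the canonical form is therefore not a different fit but a different reading of the same fit: $\hat\delta_1$ is attached to a stationary regressor, while $\hat\delta_3$ carries the unit-root behaviour.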
6.2.6. Example (Sims et al. 1990: 128)
Suppose now that $x_t$ is generated as in Section 6.2.5 but $\beta_0$ is non-zero under the null. The canonical representation13 yields
(42)
(43)
where $\theta(L)$ and $\theta^*(L)$ are defined as in Section 6.2.5 above.
Here, unlike the example in Section 6.2.5, there are no elements of $Z_t$ dominated by a stochastic integrated process. The stochastic-trend term is dominated, in sample variability, by the deterministic-trend term $t$. A detailed discussion of this case appears in West (1988).
6.2.7. Example (Banerjee and Dolado 1988)
This example is a consolidation of most of the principal points discussed in the pages above. It is a variation of the Stock-West example, and all statements concerning the distributions of various parameter estimates may be derived from earlier general principles.
13 This decomposition again has the advantage of making $\delta_1$ the coefficient of a non-integrated variable. The motivation for choosing this transformation is therefore similar to that given for the example in Sect. 6.2.5.
Consider th e followin g regression :
where $y_t$ denotes the logarithm of disposable income and $c_t$ the logarithm of consumption, and both variables are $I(1)$ in levels. Here, although we have non-stationary variables as regressors, if they are co-integrated with each other, as they must be if any of the permanent-income/life-cycle models of consumption are to make sense, then this co-integration property makes both sides of the regression equation $I(0)$ and the $t$-tests of the coefficients of all the regressors are asymptotically normal. The long-run multiplier between consumption and income can be deduced much as in any dynamic model.
A variant of (44) is the model
Although the individual $t$-ratios are asymptotically normally distributed, the distribution of the Wald statistic, used for testing the joint null hypothesis $\beta = \delta = 0$, is a functional of a Wiener process and its distribution is non-standard. More interestingly, if (45) were re-parameterized as
where $s_{t-1} = y_{t-1} - c_{t-1}$, $\gamma_1 = \beta + \delta$, $\gamma_2 = \beta$, and $s_{t-1}$ may be shown to be $I(0)$ under the assumptions of the permanent-income hypothesis, then $t(\gamma_1 = 0)$ would be a functional of a Wiener process whereas $t(\gamma_2 = 0)$ would have an asymptotically normal distribution.
In the general model given by (44), the following results may be proved, using theorems 1 and 2 in SSW (1990):
(a) The $t$-statistic of each coefficient individually is asymptotically normally distributed.
(b) The F-statistics of joint significance of any proper subset of the set of stationary regressors have standard asymptotic distributions. Thus, any test of the joint significance of $\Delta y_{t-j}$ ($j = 1, \ldots, n - 1$) and $\Delta c_{t-j}$ ($j = 1, \ldots, m - 1$) will have the correct size if standard tables are used. Further, given that the non-stationary variables are co-integrated, if the regressors in the non-stationary set were combined, say, to give $p$ stationary regressors and $q$ non-stationary regressors,14 an F-statistic that uses any of the derived $p$ stationary regressors in combination with any of the original stationary regressors will also have a standard distribution asymptotically.
(c) The F-statistics of joint significance of any subset of the set of non-stationary regressors have non-standard distributions. Moreover, an F-statistic that uses any stationary regressors in combination with any non-stationary regressors will have a non-standard distribution.
14 In (46), for example, $p = q = 1$ and the original number of non-stationary regressors (excluding the trend) is 2.
Point (a) is obtained from the property of the non-stationary regressors forming a co-integrated set; as in Section 6.2.3 above, both $\delta$ and $\beta$ can be written as coefficients on mean-zero stationary variables (with (46) giving one such re-parameterization for $\beta$). The next example reconsiders this point in the context of modelling practice. Point (b) is not surprising because the F-statistics considered use only stationary regressors. The fact that some of these stationary regressors may be re-parameterizations of some or all of the original non-stationary regressors is an interesting feature.
Point (c) is surprising in two respects. Consider (44) and (46); the first surprising feature is the non-standard behaviour of the F-statistic and the second is that, while the $t$-ratio of the coefficient of $c_{t-1}$ has a standard distribution under parameterization (45), under the linear re-parameterization given by (46) the $t$-ratio has a Wiener distribution. Both results follow from the asymptotic singularity of a particular variance-covariance matrix.15
Consider $\hat\gamma_1$ in (46), which tends to a non-degenerate distribution at rate $T$; $T^{1/2}\hat\gamma_2$ is asymptotically normally distributed. Thus,
and s o
This accounts for the asymptotic singularity of the variance-covariance matrix of $[\hat{\delta}, \hat{\beta}]'$ and the corresponding non-standard behaviour of the F-statistic in (45). However, the distribution of $T\hat{\gamma}_1$ may be shown to be non-degenerate. $\hat{\gamma}_1$ can be written as a functional of Wiener processes, and the scaling factor (of $T$) suggests the resulting non-standard distribution.
15 The asymptotic singularity of the variance-covariance matrix is the problem of multi-collinearity in another guise. On this, also see SSW (1990).
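The singularity argument can be sketched in two lines. This is a hedged illustration that assumes only the definition $\gamma_1 = \beta + \delta$ and the convergence rates stated in the text; the intermediate notation is illustrative and is not taken from the authors' displays.

```latex
% Assumption: \hat{\gamma}_1 = \hat{\beta} + \hat{\delta}, with T(\hat{\gamma}_1 - \gamma_1) = O_p(1).
\begin{align*}
T^{1/2}(\hat{\gamma}_1 - \gamma_1)
  &= T^{-1/2}\bigl[\,T(\hat{\gamma}_1 - \gamma_1)\bigr] \xrightarrow{\;p\;} 0, \\
T^{1/2}(\hat{\beta} - \beta) + T^{1/2}(\hat{\delta} - \delta)
  &\xrightarrow{\;p\;} 0 .
\end{align*}
% The T^{1/2}-scaled estimation errors of \hat{\beta} and \hat{\delta} are therefore
% asymptotically perfectly negatively correlated, so their joint limiting
% variance-covariance matrix has rank one.
```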
It is instructive to note that the regression given by (44) would not be sensible unless the right-hand variables or regressors were co-integrated. A special example of (44) was discussed in Section 6.1, where we spoke of an unbalanced regression. This is a much more general point than that made in the context of spurious regression. A regression involving a right-hand set of variables integrated of an order different from the order of integration of the left-hand side is just as problematic as a regression between two unrelated non-stationary series. In each case, the distributions of the statistics are non-standard.
6.2.8. Example (Stock and Watson 1988a)
Stock and Watson (1988a) provide an example of the dangers involved in not properly taking account of the orders of integration of the regressors and the regressand. They set up a simple data-generation process based on the permanent-income hypothesis:
where
$y^*_t$ = the permanent component of disposable income, which is assumed to follow a random walk
$c_t$ = consumption
$y^s_t$ = transitory component of disposable income, which is a stationary innovation process
$p_t$ = price level in period $t$.
The innovation processes $u_t$ and $v_t$ are uncorrelated.
Stock and Watson relate the tale of two econometricians trying to test versions of Friedman's permanent income hypothesis. The misguided econometrician, unaware of or choosing to ignore the orders of integration of the series, estimates the following regressions:
$c_t = \alpha_1 + \beta_1 p_t$ (to check money illusion)
$c_t = \alpha_2 + \beta_2 t$ (to check whether consumption has a trend)
$\Delta c_t = \alpha_3 + \beta_3 \Delta y_t$ (to calculate the marginal propensity to consume)
$\Delta c_t = \alpha_4 + \beta_4 y_{t-1}$ (to test the permanent income hypothesis).
Each of the inferences from these regressions is invalid.
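A minimal Monte Carlo sketch of the first of these regressions, assuming only that consumption and the price level are unrelated random walks. The function name, sample size, and replication count are illustrative choices and are not taken from Stock and Watson.

```python
import numpy as np

def spurious_rejection_rate(T=100, reps=500, seed=0):
    """Share of replications in which |t(beta_1)| > 1.96 when c_t is regressed
    on a constant and p_t, with c_t and p_t independent random walks."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        c = np.cumsum(rng.standard_normal(T))   # consumption: random walk
        p = np.cumsum(rng.standard_normal(T))   # price level: unrelated random walk
        X = np.column_stack([np.ones(T), p])
        coef, *_ = np.linalg.lstsq(X, c, rcond=None)
        resid = c - X @ coef
        sigma2 = resid @ resid / (T - 2)         # conventional OLS error variance
        cov = sigma2 * np.linalg.inv(X.T @ X)
        t_beta = coef[1] / np.sqrt(cov[1, 1])
        rejections += int(abs(t_beta) > 1.96)
    return rejections / reps

rate = spurious_rejection_rate()
print(f"nominal 5% test rejects beta_1 = 0 in {rate:.0%} of replications")
```

If the conventional t-test were valid here, the rejection rate would be close to 5 per cent; in this kind of experiment it is typically far higher, which is the sense in which the finding of a large t-statistic is spurious.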
The first regression is a spurious regression of the classical Granger-Newbold kind; $c_t$ and $p_t$ are unrelated random walks, and the econometrician's finding of a large t-statistic for $\beta_1$, thereby leading him to conclude in favour of money illusion,16 is a spurious one.
The second regression is also spurious since it attempts to explain a random walk (or, in other words, a stochastically trending variable) by a deterministic trend. Nelson and Kang (1981) pointed out the dangers of running regressions which attempt to de-trend stochastically trending data in the vain hope of achieving stationarity around a trend. In both cases the problems with the inferences arise because the regressions involve variables that are not co-integrated (see Chapter 3).
The third equation appears to be correctly specified but nevertheless leads to downwardly biased estimates of the coefficient for the marginal propensity to consume because the change in disposable income measures the change in permanent income with error, since it includes the change in transitory income as well. The final regression is what we called an 'unbalanced regression' as it tries to explain a variable integrated of order zero by a variable integrated of order 1. The series of papers noted above (Mankiw and Shapiro 1985, 1986; Banerjee and Dolado 1988; Galbraith et al. 1987) consider the extent to which the t-statistics in such cases are biased away from zero, leading to misleading inferences about the significance of coefficients.
Stock and Watson compare the predicament of this econometrician with econometrician B, say, who looks at the results of the following alternative regressions:
The inferences from each of these regressions will be, by and large, correct. The first regression here is the standard co-integrating regression and this time is valid. The estimate of the coefficient $\delta_1$ will have a Wiener distribution but will be super-consistent. The reported standard error will be incorrect owing to untreated autocorrelation.
The second regression can be re-parameterized17 as
Thus, $\delta_3$ can be written as a coefficient on a stationary variable (as can $\delta_2$ treated in isolation). The theory, as described above, implies that the
16 Inference of this kind would appear to be faulty, in any case. To consider a rejection of $H_0\colon \beta_1 = 0$ as a reason for accepting any specific alternative is statistically and logically unjustifiable.
17 Or in a form analogous to that given by (44).
usual t and F distributions18 will apply. A similar argument applies to the third regression, with the exception that in this case $y_{t-1} - c_{t-1}$ forms the co-integrating relation. Stock and West (1988) and Banerjee and Dolado (1988) discuss regressions of this form in further detail.
The moral of the econometricians' story is the need to keep track of the orders of integration on both sides of the regression equation, which usually means incorporating dynamics; models that have restrictive dynamic structures are relatively likely to give misleading inferences simply for reasons of inconsistency of orders of integration. Specificity was clearly the problem with several of the models proposed by the first econometrician. A general-to-specific method of econometric modelling would have overcome many of the problems of spurious inferences and non-standard distributions. An initial model, more general than the one postulated by the second econometrician, of the form, say,
would be more appropriate for inference when weak exogeneity conditions are satisfied.19 Account must be taken of facts (a)-(c) of Section 6.2.7 when conducting such inference; more generally, the example illustrates ways in which the theory of modelling with integrated variables has contributed to improving our understanding of what constitutes good practice in dynamic modelling.
6.3. Functional Forms and Transformations
We drew attention in Chapter 1 to the fact that many economic time series will come close to conformity with the integrated models only if a logarithmic transformation is applied. The logarithms of many such series may be integrated, but it seems unlikely that the untransformed levels of macroeconomic time series such as consumption, national income, and the price level could be made stationary by differencing alone. It is worth examining this transformation more closely, along with the effect that it may be expected to have on an equilibrium relationship. If the levels of two series are co-integrated, do we expect the logarithms to be co-integrated also, and vice versa?
Begin by examining a series with a tendency to grow over time subject to stochastic shocks which tend to grow with the underlying series. For example,
18 The F-distribution will apply when looking at tests of joint significance of subsets of regressors, each of which is I(0). In this example, because one of the regressors is I(1) and the other is I(0), the F-statistic will have a non-standard distribution.
19 See Ch. 8 and earlier discussion in this chapter.
where $\varepsilon_t$ has a mean of 1 and is log-normally distributed. A series such as $Y_t$ might describe a number of economic time series, at least in broad outline. Taking the logarithmic transformation of (51) and using lowercase letters to denote the transformed variables with $Y_t > 0$,
where $\log(1 + \gamma) \approx \gamma$ and $e_t = \log(\varepsilon_t)$.
Equation (53) is indeed commonly used as a simple characterization of the logarithms of economic time series. As a description of such a transformed data series, (52) or (53) seems at least admissible; $\Delta y_t$ is the growth rate of the level series $Y_t$, and this growth rate varies around a (typically positive) mean. That this equation could describe the level of the series (so $y_t$ denotes the original data without the logarithmic transformation) seems implausible, however: (53) would then imply that the absolute amount of growth varies around a fixed mean, and therefore that, as the series grows, the average amount of growth falls to zero as a proportion of the series itself. Moreover, $\sigma^2/\mathrm{var}(Y_t)$ would tend to zero, forcing the series to become essentially deterministic in relative terms. This criticism does not apply to (53) since $\sigma$ is a proportion of $Y_t$.
Ermini and Hendry (1991) consider the issue of testing 'logarithms versus levels' by formulating a test based on the encompassing principle. The null model $M_1$ may be said to encompass the rival or alternative model $M_2$ if $M_1$ is able to explain the findings of $M_2$. Alternatively, if the rival model does not adequately characterize the properties of the process generating the series, the null model ought to be able to predict the form of mis-specification one would expect to find if the rival model were estimated.
To pursue the last point, suppose a data series $\{Y_t\}$ is well characterized by a random walk in logarithms with a stable drift and homoskedastic errors. Suppose further that this implies that regressing $\Delta Y_t$ on a constant would yield unstable estimates and heteroskedastic errors.
A simple initial test would then be to estimate the random walk in both logarithms and levels and see whether the models displayed the predicted behaviour.20 If the null model also had predictions to offer about
20 The processes corresponding to 'random walk in logarithms' and 'random walk in levels' are $\Delta y_t = \mu_1 + \varepsilon_t$ and $\Delta Y_t = \mu_2 + v_t$, respectively.
the form of the instability of the parameters, the test could be sharpened by testing for the presence of particular kinds of mis-specification, say, drift or variances of errors increasing exponentially over time.
In general, the entire argument should also be run in reverse by taking the rival model as the null; however, linear models do not ensure positive observations, so awkward issues arise.
We illustrate this discussion with the time series analysed in Chapter 1, namely real net national product ($Y_t$ in 1929 £million) for the United Kingdom over 1872-1975 (from Friedman and Schwartz 1982). The approach follows that in Ermini and Hendry (1991).
First, we model the level of net national product over the sample 1875-1975 by OLS. Only one lagged difference was needed to remove any residual serial correlation, yielding
where the standard errors of coefficient estimates are shown in parentheses, $\hat{\sigma}$ is the equation standard error, and SC is the Schwarz criterion. (Smaller values on balance produce preferable models.) Since the mean of $Y$ is 4701.0, the $\hat{\sigma}$ as a percentage of $Y$ is 3.1 per cent. However, the coefficients are not constant over the sample period, as shown in Fig. 6.1 for the intercept, and Fig. 6.2 for the one-step residuals and $\hat{\sigma}$. (See Hendry (1989) for details.)21 The intercept trends upwards, and $\hat{\sigma}$ increases over time, even ignoring the large shock in 1919-20. On any constancy test, the model is rejected at far beyond the 1 per cent level (e.g. that of Hansen 1992).
Next we model growth in logs. As before, one lagged difference removed residual serial correlation, giving
21 Recursive estimation involves estimating an equation over successively larger sub-samples, starting from a minimum sub-sample and extending to the full sample. Parameter instability may be tracked by looking at the behaviour of the estimated coefficients, as sample size is increased, to see whether they fluctuate significantly or remain stable. Recursive Chow (1960) tests may be computed in at least two ways. The first involves estimating the equation from, say, $t = 1$ to $t = T_1$, where $T_1$ is greater than the minimum sample size, and then from $t = 1$ to $t = T_1 + 1$. The one-step-ahead Chow test is based on a comparison of the residual variance of the two estimated equations and is an F-test under the null of parameter constancy. A second test is given by estimating the equation from, say, $t = 1$ to $t = T_1$ and comparing the residual variance of this regression with that of the equation estimated over the full sample. A sequence of these Chow tests is built up by augmenting the sub-sample size by one at each step, e.g. $T_1 + 1$ to $T_1 + 2$, and
FIG 6.1. Recursive estimates of intercept in levels model
FIG 6.2. One-step residuals in levels model
comparing the residual variance of each of these equations with the full sample residual variance. Alternatively, the sequence of one-step residuals (or forecast errors) can be examined relative to the residual variance at each sample size.
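The recursive estimation described in this footnote can be sketched as follows. This is a minimal illustration on simulated data, assuming a random walk in logarithms with drift 0.02 and error standard deviation 0.03 (illustrative values, not the estimates for the UK series); the function name `recursive_intercepts` is hypothetical, and the lagged difference used in the text's regressions is omitted for brevity.

```python
import numpy as np

def recursive_intercepts(d, min_obs=20):
    """OLS of d on a constant over samples 1..k, for k = min_obs..len(d).
    With only a constant, the OLS estimate is the sub-sample mean."""
    k = np.arange(min_obs, len(d) + 1)
    return np.cumsum(d)[min_obs - 1:] / k

rng = np.random.default_rng(1)
T, mu, sigma = 200, 0.02, 0.03              # assumed drift and error s.d. in logs
y = np.concatenate([[0.0], np.cumsum(mu + sigma * rng.standard_normal(T))])
Y = np.exp(y)                                # level series implied by the log model

b_logs = recursive_intercepts(np.diff(y))    # random walk in logarithms
b_levels = recursive_intercepts(np.diff(Y))  # rival random walk in levels

# Spread of the levels differences in the first and second half of the sample
dY = np.diff(Y)
sd_first, sd_second = dY[: T // 2].std(), dY[T // 2:].std()

print("logs:   intercept at k=20 and k=T:", b_logs[0], b_logs[-1])
print("levels: intercept at k=20 and k=T:", b_levels[0], b_levels[-1])
print("levels: s.d. of dY, first vs second half:", sd_first, sd_second)
```

Under these assumptions the recursive intercept of the levels model should drift upwards and its error spread should grow, while the intercept of the log model should stay close to the assumed drift, which is the pattern of mis-specification the log model predicts for its rival.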
The percentage $\hat{\sigma}$ is 3.3 per cent but now the intercept is constant as shown in Fig. 6.3, and little residual heteroskedasticity remains (see Fig. 6.4). The model fails constancy tests only prior to the large shock in 1919-20.
Ermini and Hendry use results from Ermini and Granger (1991) to describe the particular form of instability and heteroskedasticity one would expect in the model in levels if the data were generated by the logarithmic model. Ermini and Granger show that, if the data are generated by
with time-invariant distribution $\Delta y_t \sim \mathrm{IN}(\mu, \sigma^2)$, and if the rival model is
then $E(\Delta Y_t) =$
0 between which there is a co-integrating relationship in levels:
Defining the transformed series $x_t = \log(X_t)$ and $z_t = \log(Z_t)$, we have
Using a Taylor series expansion of the logarithmic function, we obtain
from which we can see that the terms in the summation will decline in importance as $Z_t$ grows, since by (59) $u_t$ is of fixed variance, while the variance of $Z_t$ is of $O(t)$. Hence we expect to find an equilibrium relation of some sort among the logarithms of variables that are co-integrated in levels. Asymptotically, this equilibrium relation is of a degenerate kind with the distribution of $x_t - z_t$ collapsing around $\log(\beta)$. This is also a testable prediction of the hypothesis that the random walk model in levels encompasses the logarithmic model,22 although the test is likely to have low power because the variance in the errors is likely to persist even in fairly large samples.
Conversely, if we begin with a co-integrating relationship between two series which have already been transformed to logarithms, then the relationship among the levels of the series is
which implies
22 To see this, simply substitute $X_{t-1}$ for $Z_t$. The instability of the random walk model in levels made a formal test in the levels → logarithms direction unnecessary in the Ermini-Hendry discussion, although in principle such a test could be carried out.
FIG 6.5. Recursive estimates of d
or
This no longer has the form of a standard co-integrating relationship, since $W_t - kV_t = V_t(V_t^{\beta-1}\nu_t - k) = \eta_t$; while $\nu_t$ may remain a stationary process, the error term $\eta_t$ in the new relationship depends on the integrated series $V_t$ and is therefore not stationary in general. No co-integrating relationship may therefore appear, and a regression of the form $W_t = kV_t + \eta_t$ is likely to display considerable instability.
At the same time, it should be noted that, in either of the above examples, only one of the logarithm and the level of a variable will be an integrated process (capable of being made stationary by differencing), although stationarity or non-stationarity will be common to both representations. The standard definition of co-integration, which describes equilibrium relations among integrated processes, can be legitimately applied to only one of the two cases at a time.
The fact remains, however, that a co-integrating relationship among the levels of variables suggests the existence of some linear equilibrium relationship among the logarithms of those same variables. The converse need not in general be true.
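Both directions of the argument in this section can be sketched in a few lines. The display below is a hedged reconstruction: it assumes a levels relation $X_t = \beta Z_t + u_t$ and a log relation $w_t = \beta v_t + u_t$ with $\nu_t = \exp(u_t)$. These forms are consistent with the surrounding text but are assumptions, not quotations of the numbered equations.

```latex
% Levels to logarithms: assumes X_t = \beta Z_t + u_t with \beta Z_t > 0 and |u_t| < \beta Z_t.
\begin{align*}
x_t = \log(\beta Z_t + u_t)
  &= \log\beta + z_t + \log\Bigl(1 + \frac{u_t}{\beta Z_t}\Bigr) \\
  &= \log\beta + z_t + \sum_{j=1}^{\infty} \frac{(-1)^{j+1}}{j}
     \Bigl(\frac{u_t}{\beta Z_t}\Bigr)^{j} .
\end{align*}
% Logarithms to levels: assumes w_t = \beta v_t + u_t and defines \nu_t = e^{u_t}.
\[
W_t = V_t^{\beta}\,\nu_t
\quad\Longrightarrow\quad
W_t - kV_t = V_t\bigl(V_t^{\beta-1}\nu_t - k\bigr) = \eta_t .
\]
```

In the first case the summation shrinks as $Z_t$ grows, so $x_t - z_t$ settles around $\log\beta$; in the second the error $\eta_t$ is scaled by the integrated series $V_t$, so it does not settle.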
Appendix: Vector Brownian Motion
Consider the bivariate I(1) data generation process given by:
The DGP in (A1) is a re-parameterization of a general bivariate normal distribution for $(\Delta y_t, \Delta z_t)$ with covariance $\eta\sigma_2^2$ and defines the integrated vector process:
when $x_t = (y_t : z_t)'$ and $v_t = (\varepsilon_{1t} + \eta\varepsilon_{2t}, \varepsilon_{2t})'$. Then $v_t$ has non-unit error variance matrix $\Sigma$:
As in Chapter 1, a suitably scaled function of $x_t$ converges to a vector Brownian motion process, denoted BM($\Sigma$). We first derive the standardized Brownian motion by the transform:
and $s = \sigma_1/\sigma_2$. Then $m_t$ has a unit error variance matrix since:
Alternatively, from (A2) and (A4):
(A6)
Next, using a component by component analysis similar to that in Chapter 3, from (A5):
where $B(r) = (B_1(r), B_2(r))'$ (denoted BM(I)), and the $B_i(r)$ are the standardized Wiener processes associated with accumulating the $\{\varepsilon_{it}\}$. Further:
These vector formulae are natural generalizations of the scalar Wiener processes in Chapter 3.
Scalar functions of vector I(1) variables can be handled as follows. Consider the distribution of the difference between $y_t$ and $z_t$, namely $u_t = d'x_t$ for $d' = (1, -1)$. Then from (A4):
(A10)
By direct calculation from (A1), however,
and $W(r)$ is the Wiener process associated with $\{w_t/\sigma_w\}$. By definition, $w_t = \varepsilon_{1t} + (\eta - 1)\varepsilon_{2t}$, so that $\sigma_w W(r) = \sigma_1 B_1(r) + (\eta - 1)\sigma_2 B_2(r)$, and hence the expressions in (A10) and (A11) are equal, but provide different insights into the behaviour of the scalar second moment.
Similarly, let $f' = (1, 0)$ so that $f'e_t = \varepsilon_{1t}/\sigma_1$, then we can derive a covariance such as:
Returning to the standardized vector Brownian motion, let $V(r) = (V_1(r), V_2(r))'$ (which is BM($\Sigma$)) be associated with the accumulation of $\{v_t\}$. Now $V_1(r)$ and $V_2(r)$ are not independent since $E(v_{1t}v_{2t}) \neq 0$. The standardized vector Brownian motion is $B(r) = K'V(r)$ where $K'$ is defined in (A4). Multiplying out, we have:
$B_1(r) = \sigma_1^{-1}[V_1(r) - \eta V_2(r)]$ and $B_2(r) = \sigma_2^{-1}V_2(r)$. (A13)
Indeed, if we condition $v_{1t}$ on $v_{2t}$ (which generates $\varepsilon_{1t}$) and let $V_{1.2}(r)$ be the associated "conditional" unstandardized Wiener process, then
$V_{1.2}(r)$ and $V_2(r)$ are independent. Because $\varepsilon_{1t} = v_{1t} - E(v_{1t} \mid v_{2t}) = v_{1t} - \eta v_{2t}$ we see that $V_{1.2}(r) = V_1(r) - \eta V_2(r) = \sigma_1 B_1(r)$ from (A13).
Finally, consider an expression of the form:
Then the error covariance matrix is added on if the cross-product under analysis is a contemporaneous rather than a lagged one (see the appendix to Chapter 7 for an extension). Phillips and Durlauf (1986) and Phillips (1988b) provide proofs and generalizations.
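A small numerical check of the standardization used in this appendix. The matrix `K_t` below is a hypothetical choice of $K'$, inferred from the relations $V_1(r) - \eta V_2(r) = \sigma_1 B_1(r)$ and $V_2(r) = \sigma_2 B_2(r)$ rather than copied from (A4), and the parameter values are arbitrary.

```python
import numpy as np

sigma1, sigma2, eta = 1.5, 0.8, 0.6           # illustrative values only

# Covariance of v_t = (eps_1t + eta*eps_2t, eps_2t)' with independent eps_1t, eps_2t
Sigma = np.array([[sigma1**2 + eta**2 * sigma2**2, eta * sigma2**2],
                  [eta * sigma2**2,                sigma2**2]])

# Assumed standardizing transform K' (so that m_t = K' v_t has unit variance matrix)
K_t = np.array([[1 / sigma1, -eta / sigma1],
                [0.0,         1 / sigma2]])

unit = K_t @ Sigma @ K_t.T                    # should equal the 2x2 identity

# Variance of w_t = eps_1t + (eta - 1)*eps_2t, the increment of u_t = y_t - z_t,
# computed directly and as d' Sigma d with d' = (1, -1)
d = np.array([1.0, -1.0])
var_w_direct = sigma1**2 + (eta - 1) ** 2 * sigma2**2
var_w_from_sigma = d @ Sigma @ d

print(np.round(unit, 12))
print(var_w_direct, var_w_from_sigma)
```

The two routes to the variance of $w_t$ mirror the point made in the text: the vector representation and the direct scalar calculation give the same second moment.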
7
Co-integration in Individual Equations
We first examine methods of testing for co-integration via static regressions, and provide simulation estimates of the upper percentage points of the distributions of statistics used in the tests. Next, we look at the properties of the estimators derived from such static regressions. In particular, we focus on the finite-sample biases in the estimates of co-integrating vectors and the powers of tests to detect co-integration. Finally, we consider modified estimators and dynamic models. In Chapter 8, systems methods of estimating co-integrating relations will be considered.
The previous chapter focused on the properties of co-integrated processes and the implications of modelling with co-integrated variables. We have discussed the 'super-consistency' of the coefficient estimates in the static or co-integrating regression, balanced and unbalanced regressions, and the distributions of the statistics commonly used to test for the significance of regression coefficients.
The two issues of being able to test for the existence of an equilibrium relationship among variables and to accurately estimate such a relationship are complementary. Indeed, as demonstrated in discussing spurious regressions in Chapter 3, static regressions among integrated series are meaningful if and only if they involve co-integrated variables. Thus, it is of interest to discover, first, how well the most frequently used tests of co-integration perform, and second, how accurately the corresponding equilibrium relationship is estimated.
The objective of this chapter is to develop tests applicable to single equations which may be used to detect a long-term relationship of the form discussed and exploited in earlier chapters.
We also attempt to formulate some recommendations for efficient estimation of co-integrating parameters and testing for co-integration in finite samples. It will become clear from the discussion that the asymptotic properties of static regression estimators are often rather different from their behaviour in empirically relevant sample sizes. Further, lack of weak exogeneity due to co-integrating vectors entering several equations also alters finite-sample behaviour. It therefore becomes important, in the face of data
limitations, to consider alternative methods which do not rely exclusively on single-equation static regressions. These are the topic of Sections 7-9.
7.1. Estimating a Single Co-integrating Vector
Consider the problem of estimating the single co-integrating vector $\alpha$ using the static model
We conduct the discussion in this and the following sections in three stages. First, we elaborate upon the theorems presented in Chapter 5 and develop an intuitive discussion of static regressions. Next, we proceed to the issue of testing for co-integration using static regressions. The testing and the parameterization of the equilibrium relationship are seen to be complementary exercises. Finally, we discuss simulation studies which cast light on the behaviour, in finite samples, of the static-regression estimators and the powers of the tests for co-integration.
In order to keep the analysis as tractable as possible, we will restrict ourselves to considering CI(1,1) systems. Thus, suppose that all the elements in $x_t$ are I(1). In general, then, any linear combination $\delta'x_t$ of the elements of $x_t$ will produce an I(1) series $u_t$. The only exception, if one exists, is a co-integrating vector $\alpha$ such that $\alpha'x_t$ is I(0).1 Ordinary least squares minimizes the residual variance of $x_t$, and therefore a simple OLS regression of the form (1) should provide an excellent approximation to the true co-integrating vector when one exists, as discussed in Chapter 5.
The simplicity of this method and the elegance of the theoretical argument help explain the popularity of such regressions. All that is needed to parameterize a long-run equilibrium relationship among a set of variables is a static OLS regression. This regression is performed as the first step of the Engle-Granger two-step estimator2 and serves as a preliminary check on the equilibrium relationships postulated by economic theory to exist among the variables.
1 Initially we focus on the case where (apart from normalization) the co-integrating vector $\alpha$ is unique and is therefore of dimension $n \times 1$. As the analysis in Ch. 5 showed (especially the discussion of the Granger Representation Theorem), this is clearly a restrictive assumption to make. In general, there will exist $r$ co-integrating vectors, $0 \le r \le n - 1$, and when gathered in an array, the matrix $\alpha$ will be of order $n \times r$. The problem of estimating co-integrating vectors in systems is considered in Ch. 8.
2 The two-step estimator and its asymptotic properties are discussed in Ch. 5. The general case is derived by Engle and Granger (1987: 262, Theorem 2).
However, there are reasons for preferring alternatives to the simple static regression in samples of the size typical in economics. This chapter will consider dynamic regression methods and modified estimators. These techniques help to reduce or eliminate sources of finite-sample biases which arise from static estimation, and which can be very substantial in practice.
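A minimal sketch of the static regression discussed in this section, assuming a bivariate CI(1,1) system $y_t = \beta z_t + u_t$ with $z_t$ a random walk and $u_t$ a stationary AR(1). The data-generation process, parameter values, and function names are illustrative assumptions, chosen only to show the estimator and how its error behaves as the sample grows.

```python
import numpy as np

def static_ols_beta(T, beta=2.0, rho=0.5, seed=0):
    """Simulate y_t = beta*z_t + u_t (z_t random walk, u_t AR(1)) and return
    the OLS slope from the static regression of y_t on a constant and z_t."""
    rng = np.random.default_rng(seed)
    z = np.cumsum(rng.standard_normal(T))
    e = rng.standard_normal(T)
    u = np.zeros(T)
    for t in range(1, T):
        u[t] = rho * u[t - 1] + e[t]          # stationary equilibrium error
    y = beta * z + u
    X = np.column_stack([np.ones(T), z])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]

def mean_abs_error(T, reps=200, beta=2.0):
    """Average |beta_hat - beta| over independent replications."""
    return np.mean([abs(static_ols_beta(T, beta, seed=s) - beta) for s in range(reps)])

err_small, err_large = mean_abs_error(50), mean_abs_error(400)
print("mean |error| at T=50: ", err_small)
print("mean |error| at T=400:", err_large)
```

The error should shrink much faster than the usual square-root-of-T rate would suggest, but at sample sizes such as T = 50 it can still be noticeable, which is the practical concern raised above.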
7.2. Tests for Co-integration in a Single Equation
The simplest tests for co-integration proposed by Engle and Granger test for the existence of a unit root in the residuals of the static regression. The methods of Chapter 4 can therefore be followed with minor modifications. We first consider the bivariate case, where $x_t = (y_t, z_t)'$.
The modifications are necessary because, while the tests for unit roots discussed in Chapter 4 use the original series, say $\{w_t\}$, the co-integration tests are based on the estimated, or derived, residual series,
Hence, as the co-integrating regression estimates $\beta$ before the test is performed, the co-integration test is not simply a standard test for a unit root in the series $u_t$.
If $\beta$ were known in the example presented in Chapter 5 (given by equations (5.1)-(5.6)), the null hypothesis of no co-integration, corresponding to $\rho$ equal to 1, could be tested by constructing the series $u_t = y_t - \beta z_t$, treating this series as the one that has the unit root under the null, and using the Dickey-Fuller tables. However, if $\beta$ is unknown, it must be estimated (e.g.) from the static regression of $y_t$ on $z_t$. The test is based on the null hypothesis of no co-integration, with the critical values for the test statistics calculated to ensure the appropriate probability of rejection of the null hypothesis.
Some of the most widely used tests of co-integration have been the co-integrating regression Durbin-Watson test (CRDW), the Dickey-Fuller test (DF), and the augmented Dickey-Fuller test (ADF).
The CRDW, suggested by Sargan and Bhargava (1983), is computed in exactly the same fashion as the usual DW statistic and is given by
$\mathrm{CRDW} = \sum_{t=2}^{T}(\hat{u}_t - \hat{u}_{t-1})^2 \big/ \sum_{t=1}^{T}\hat{u}_t^2$
where $\hat{u}_t$ denotes the OLS residual from the co-integrating regression.
The null hypothesis being tested, using the CRDW statistic, is of a single unit root: that is, $u_t$ is a random walk. This is to be contrasted
with the conventional use made of this statistic in standard regression analysis where the null of no first-order autocorrelation is tested.
The use of this statistic is problematic in the present setting. First, the test statistic for co-integration depends upon the number of regressors in the co-integrating equation and, more generally, on the data-generation process and hence on the precise data matrix. Only bounds on the critical values are available.3 Second, the bounds diverge as the number of regressors is increased, and eventually cease to have any practical value for the purposes of inference. Finally, the statistic assumes the null where $u_t$ is a random walk, and the alternative where $u_t$ is a stationary first-order autoregressive process. In such circumstances, Bhargava (1986) demonstrates that it has excellent power properties relative to alternative tests. However, the tabulated bounds are not correct if there is higher-order residual autocorrelation, as will commonly occur. Exact inference is therefore possible if and only if each regression exercise is augmented by the use of algorithms such as that of Imhof (1961) to compute the relevant critical values. In principle, it is possible for simulation methods to be used to compute the critical values. However, in practice this implies a proliferation of tables of different critical values for different data-generation processes and simulation exercises.
As we have argued previously, the only hope for uncomplicated inference lies in generating a robust set of critical values. Robustness is defined by lack of sensitivity of the critical values to a wide range of changes to the data-generation process. Tests that are similar for a wide range of nuisance parameters would ensure this non-sensitivity.
In other words, it is important to have a set of tables that could be used regardless of the precise properties of the DGP, as long as the regression model is parameterized to satisfy certain basic properties such as balance. Tests of co-integration based not directly on the residuals but on the regression coefficients themselves might have higher power. As an alternative method, one could consider using non-parametric corrections of the sort described in Chapter 4 to conduct inference using only a small set of tables, for a range of possible data-generation processes. Examples of both these procedures will be presented in due course.
Similar qualifications apply to the use of the DF statistic and less so to the ADF, if the number of $\Delta w_{t-i}$ terms appearing in the data-generation process coincides with those used in the implementation of the test. Since the number of such terms appearing in the DGP is unknown, it seems safest to over-specify the ADF regression, and use as many
3 While the CRDW statistic does not have a limiting distribution with a non-zero variance, $T(\mathrm{CRDW}) = T^{-1}\sum_{t=2}^{T}(\hat{u}_t - \hat{u}_{t-1})^2 \big/ T^{-2}\sum_{t=1}^{T}\hat{u}_t^2$ does.
208 Co-integratio
n i n Individua l Equations
lagged term s a s degrees-of-freedo m restrictions wil l allow . O f course , i n practice, th e choic e o f the la g structure i n ADF test s ma y be a d hoc an d different result s ca n b e obtaine d b y changin g th e lengt h o f th e auto regression. I n particular , th e powe r o f th e tes t ma y b e affecte d ad versely. Table 7. 1 provides , fo r illustratio n ( a mor e detaile d descriptio n o f applicable critica l value s wil l b e give n below) , th e 5 pe r cen t critica l values o f th e DW , ADF(l) , an d ADF(4 ) tests , fo r thre e sampl e size s (T = 50, 100 , 200) . Th e data-generatio n process i s a n «-variat e rando m walk wit h n less tha n o r equa l to 5 , as in Engle an d Yo o (1987) . It i s importan t t o emphasiz e that , i n commo n wit h th e test s fo r uni t roots, test s fo r co-integratio n ma y lac k powe r t o discriminat e betwee n unit root s an d borderline-stationar y processes. I n a small-scal e stud y of the powe r propertie s o f thi s test , Engl e an d Grange r (1987 ) sho w that , when th e data-generatio n proces s o f th e disturbance s o f the co-integrat ing equatio n i s a n AR(1 ) proces s wit h th e autoregressiv e paramete r equal t o 0.9 , th e power s o f the CRDW , DF , an d AD F test s a t th e 5 per cent critica l value s ar e 20 , 15 , an d 1 1 per cen t respectively . Whe n th e DGP i s altered t o b e a more genera l AR(1 ) proces s wit h a unit root , th e power o f th e AD F tes t become s 6 0 per cent , dominatin g strongl y bot h the power s of the CRD W an d D F test s a t the 5 per cen t level. Engle an d Grange r (1987 ) emphasiz e th e robustnes s t o change s in th e data-generation proces s o f th e AD F critica l values . Th e discussio n i n Chapter 4 help s t o explai n thi s result . Phillip s an d Ouliari s (1990 ) sho w that th e limitin g distribution of the AD F tes t statisti c is the sam e a s tha t of th e non-parametricall y adjuste d D F statistic . 
Becaus e th e limitin g distribution o f th e latte r statisti c i s invarian t t o nuisanc e parameter s i n the processe s generatin g th e dat a series , th e resul t follows . Eac h tes t manages t o correc t fo r variou s features that ma y be presen t i n the DGP , in on e cas e b y capturin g th e effect s i n a regressio n model , i n th e othe r by implicitl y adjusting th e critica l values. Phillips an d Ouliari s (1990 ) deriv e th e distribution s of severa l test s o f co-integration. W e clos e thi s sectio n b y presentin g a summar y o f th e theoretical result s presente d there . The y conside r th e linea r co-integrating regressions :
and
where $y_t$ and $z_t$ satisfy (multivariate) unit-root processes. The asymptotic distributions of a number of residual-based tests are discussed, from which we will consider five (this analysis is of course related to the
TABLE 7.1. Five per cent critical values for the co-integration tests

n    T      CRDW    ADF(1)   ADF(4)
2    50     0.72    -3.43    -3.29
     100    0.38    -3.38    -3.17
     200    0.20    -3.37    -3.25
3    50     0.89    -3.82    -3.75
     100    0.48    -3.76    -3.62
     200    0.25    -3.74    -3.78
4    50     1.05    -4.18    -3.98
     100    0.58    -4.12    -4.02
     200    0.30    -4.11    -4.13
5    50     1.19    -4.51    -4.15
     100    0.68    -4.48    -4.36
     200    0.35    -4.42    -4.43
Source: The CRDW critical values (see Sargan and Bhargava 1983) and the ADF(1) critical values were generated by PC-NAIVE using 10,000 replications. The ADF(4) critical values have been taken from Engle and Yoo (1987). The ADF critical values are computed by replicating the regression $\Delta\hat{u}_t = \rho\hat{u}_{t-1} + \sum_{i=1}$ [...]

[...]-7.4(b) pertain to static models which do contain constant terms. The figures show the relationship between bias and sample size for four different values of the ratio of standard deviations. The horizontal scale is implicitly $\log_2(T/25)$ so that the four points shown are equidistant. First of all, it is evident that the bias does not decline at rate $T$. For example, in Fig. 7.4(a) ($\sigma_1/\sigma_2 = 0.5$), with $\rho_2 = 0.6$, the bias at $T = 25$ is 0.45, at $T = 50$ is 0.32, at $T = 100$ is 0.21, and at $T = 200$ is 0.13. Thus, an eightfold increase in sample size reduces the bias by a factor of approximately 3.5. As another example, we see in Fig. 7.2(a) ($\sigma_1/\sigma_2 = 4$), with $\rho_2 = 0.6$, the biases at the same set of sample sizes are 0.017, 0.010, 0.005, 0.0026.6 Here an eightfold increase in sample size reduces the bias by a factor of 6.5. Using a standard-deviation ratio of 4 again but a value of $\rho_2 = 0.9$, the biases are 0.04, 0.024, 0.014, and 0.008, a fivefold decrease in bias. The rate of decline of the bias is always faster than $T^{1/2}$ but not as fast as $T$ for sample sizes up to 200.

6 These numbers are taken from the experimental output rather than read from the figures. The standard error of the smallest of these numbers is roughly $5 \times 10^{-5}$.

FIG 7.1(a). No constant in model, estimated bias v. sample size, s = 16
FIG 7.1(b). Constant in model, estimated bias v. sample size, s = 16
FIG 7.2(a). No constant in model, estimated bias v. sample size, s = 4
FIG 7.2(b). Constant in model, estimated bias v. sample size, s = 4
FIG 7.3(a). No constant in model, estimated bias v. sample size, s = 1
FIG 7.3(b). Constant in model, estimated bias v. sample size.
FIG 7.4(a). No constant in model, estimated bias v. sample size, s = 0.5
FIG 7.4(b). Constant in model, estimated bias v. sample size, s = 0.5

Second, the biases increase uniformly in $\rho_2$ and decrease uniformly in $\sigma_1/\sigma_2$. To understand this, we can rewrite (9) and (10) to get
Since $\rho_1 = 1$, $\{v_t\}$ is a random walk and therefore asymptotically dominates [...]

7 We are grateful to Tom Rothenberg for pointing out that $R^2$ is a random variable in the present context. However, it remains a useful descriptive statistic.

8 The problem of finite sample biases was also demonstrated by Hendry and Neale (1987). Using recursive procedures for OLS estimation, they estimated a bivariate static regression for sample sizes ranging from 40 to 200, considering the bias of the coefficient estimate for each sample size. The results indicated that, even for sample sizes of 200, the long-run coefficient from the static regression was approximately 0.7 while the true long-run coefficient was 1.0. Convergence to the true value was not nearly as fast in practice as $T^{-1}$, which dominates for sufficiently large $T$: see (18) below.

[...] $\Delta z_{t-j}$, and $(y - \gamma z)_{t-k}$, where the values of $i$, $j$, and $k$ will depend upon the nature of the ARIMA process generating $\{y_t\}$ and $\{z_t\}$ [...] are all contained in the residual $u_t$; when $|\gamma_1| < 1$, $\beta = (\gamma_2 + \gamma_3)/(1 - \gamma_1)$. In general, $u_t$ will be serially correlated. Its long-run variance $\sigma^2$, which appears in the expressions for the Wiener distributional limits of the sample moments, is given by

where
It ma y then be show n that
Phillips (1986) shows that it is the presence of $\lambda$ in (18) that causes the biases.

9 See e.g. the derivation of the ECM representation in Ch. 5 for CI(1, 1) series.

10 A simple rewriting of equation (10) above, to take account of the structure of the residual autocorrelation, gives us a version of (14a) with the $\gamma_i$ suitably interpreted. Later in this chapter we consider a generalization of (14) and investigate the consequences of using static and dynamic regressions.
A simple way to reduce the biases is to reparameterize the equation in such a way that $\lambda$ is set at zero. Both (15a) and (15b) satisfy this property. For comparison, following Banerjee et al. (1986), we ran a second set of experiments in order to investigate the effects of such re-parameterizations. Using the DGP given by (14a)-(14b), we estimate equation (15a), with a lagged $z$ included as an extra regressor. The dynamic regression equation estimated is therefore
The extra lagged variable, $z_{t-1}$, is included to avoid imposing homogeneity (see Chapter 2), as it would be unrealistic to assume that the investigator knows the precise form of the data-generation process. The co-integrating coefficient is estimated by computing the expression $1 - d/c$: see Sect. 2.4. The static regression given by (16) is also estimated.

The strong exogeneity property required of $z_t$ is guaranteed, in the design of the experiment, by drawing $\varepsilon_{1t}$ and $\varepsilon_{2t}$ from uncorrelated pseudo-normal distributions. The values of $\gamma_i$ ($i = 1, \ldots, 3$) are varied as in Table 7.3, while ensuring that long-run homogeneity is preserved. The sample sizes and the ratio of the standard deviations of $\varepsilon_{1t}$ and $\varepsilon_{2t}$ are also varied, to give a set of 90 experiments. The simulations are all conducted with 5000 replications.

The purpose of the first part of this exercise is to compare the biases in the estimates of the co-integrating parameter obtained from dynamic regression with those obtained from the static regression. (The true value of the co-integrating parameter is 1.) Some of the results for different configurations of the $\gamma_i$ parameters and standard-deviation ratios are given in Table 7.3. We report the estimated biases, for four different sample sizes, in the static model. The corresponding estimated biases from the dynamic regression (where the co-integrating parameter is calculated as $(1 - d/c)$) are in almost all cases so small as to be within 2 Monte Carlo standard errors of zero and so are not reported. We will return to the comparison of these estimators (static and dynamic) below; for the time being, the noteworthy point is simply that substantial biases remain in static estimates for parameter combinations at which the biases in dynamic estimates are zero, or very close to zero, since the dynamic model has been specified so as to make $\lambda$ close to zero.

While the dynamic estimates contain negligible biases in these examples, $z_t$ is strongly exogenous for the parameter of interest. While it is fairly straightforward to extend this specification to include weakly exogenous $z_t$, the usefulness of estimates from dynamic single equations is reduced substantially if the regressors are not weakly exogenous. It also becomes difficult to make unambiguous comparisons between
TABLE 7.3. Biases in static models(a)
DGP: (14a) + (14b); 5000 replications

                                              Sample size (T)
                                              25      50      100     200     400
$\gamma_1 = 0.9$, $\gamma_2 = 0.5$, $s = 3$   -0.39   -0.25   -0.15   -0.07   -0.04
$\gamma_1 = 0.9$, $\gamma_2 = 0.5$, $s = 1$   -0.32   -0.22   -0.14   -0.08   -0.04
$\gamma_1 = 0.5$, $\gamma_2 = 0.1$, $s = 3$   -0.23   -0.13   -0.07   -0.03   -0.02
$\gamma_1 = 0.5$, $\gamma_2 = 0.1$, $s = 1$   -0.21   -0.12   -0.06   -0.03   -0.02

(a) Standard errors of these estimates vary widely, but the estimated biases are in almost all cases significantly different from zero, for sample sizes of 50 or greater. Note that again the biases appear to decline less quickly than $T^{-1}$, but more quickly than $T^{-1/2}$. Calculations were undertaken using GAUSS.
dynamic and static single-equation estimates. We discuss this issue below.

Recalling the discussion in Chapter 5, a test of the null hypothesis $H_0$: $c = 0$, based on the $t$-statistic $t_{c=0}$, is a valid test for co-integration.11 This statistic, under the null of no co-integration, is not asymptotically normally distributed. Therefore a second part of the exercise was used to compute the critical values of the distribution of $t_{c=0}$ and to use these critical values to derive the power of this statistic, for a range of cases, to detect co-integration. This is an example of a test of co-integration based not directly on the residuals, but on a regression coefficient. A power comparison, between a residual-based test and the $t$-test, is given in Table 7.7; but first we use a more general DGP to consider further the issue of finite sample biases.

11 When $y$ and $z$ are not co-integrated, $(y - z)_{t-1}$ is I(1), in which case (19) can only be balanced if $c = 0$. This observation forms the logical basis for a test of co-integration based on $t_{c=0}$. The strong exogeneity of $z_t$ (for the parameters in (14a)) ensures that a test based on estimates from a single equation such as (19) is fully efficient.

7.4.1. General Data-generation Processes

Consider now the comparison of static and dynamic estimates of the long-run multiplier when the time series are derived from a more general DGP. The experiments described above are special cases of this more general DGP. The 'static' estimate of the co-integrating coefficient $\beta$ is called $\beta_s$, while the dynamic estimate is denoted $\beta_d$. The exogenous variable is generated as
so that $z_t$ can be made either I(0) or I(1) by choice of [...]
[...] $(y_{t-1} + 2z_{t-1}) + \varepsilon_{2t}. \qquad (10')$
The static regression involves estimating an equation of the form

and the DF test is conducted on

where $\hat{v}_t = y_t - \hat{\beta} z_t$. The DGP is optimal for the DF test here because (10') has a valid common factor when $E(\varepsilon_{1t}\varepsilon_{2t}) = 0$ (see Hendry and Mizon 1978, and Sargan 1980). Since $\beta = -2$, $v_t = y_t + 2z_t$, so that (10') corresponds to $\Delta v_t = (\rho_2 - 1)v_{t-1} + \varepsilon_{2t}$ and hence $\omega_t$ coincides with $\varepsilon_{2t}$ except for terms involving $(\hat{\beta} - \beta)z_t$, etc. For this reason, the DGP selected by Engle and Granger (1987) is relatively favourable to the DF test.

By contrast, consider the DGP in (14) with the static regression in (16) and the same form of DF test as in (24):

In this case, $u_t = y_t - \hat{\beta} z_t$ so that in (25), evaluated at $\hat{\beta} = \beta$,

hence

In (26), a common-factor restriction is imposed on the dynamics, but this time it is not necessarily a valid representation of (14a). Indeed, since $\beta = 1$ by homogeneity, (14a) can be written as

Comparison with (26) reveals that the new error $[\varepsilon_{1t} + (\gamma_2 - 1)\Delta z_t]$ is white noise, but has a larger variance than that of the error in (14a). Kremers et al. (1992) show that the $t$-statistic in (24) retains the Dickey-Fuller distribution under the null,
[...] $\gamma_2 = 0.1$, $s = 3$
    T = 25     0.66/0.35
        50     0.99/0.84
        100    1.00/1.00
(e) $\gamma_1 = 0.5$, $\gamma_2 = 0.1$, $s = 1$
    T = 25     0.79/0.31
        50     1.00/0.80
        100    1.00/1.00
(f) $\gamma_1 = 0.5$, $\gamma_2 = 0.1$, $s = 1/3$
    T = 25     0.94/0.23
        50     1.00/0.75
        100    1.00/1.00

(a) $s = \sigma_1/\sigma_2$.
in Table 7.7 can be obtained from the following analysis. Neglecting the intercept, the ADF test essentially involves testing $\gamma_1 = 1$ in

where the first step regression of $y_t$ on $z_t$ estimates $\beta$, which here has a population value of unity. Under the alternative, $y_{t-1} - \beta z_{t-1}$ is stationary, and for $\gamma_3 = 1$ the non-centrality of the ADF pseudo $t$-test will be given approximately by
(see Mizon and Hendry 1980), where ASE denotes the coefficient asymptotic standard error calibrated to a sample size of $T$. For given design parameter values, the ASE is easily calculated using PC-NAIVE, and some outcomes are shown below.

Similarly, the $t_{c=0}$ test is actually based on testing $\gamma_1 = 1$ in
Since the regressor $y_{t-1} - z_{t-1}$ is stationary under the alternative, if $\gamma_3 + \gamma_2 + \gamma_1 = 1$ is imposed and hence $z_{t-1}$ omitted, the asymptotic non-centrality of the $t$-test of $\gamma_1 = 1$ (again in PC-NAIVE), yields the following illustrative values for $T = 25$:

Case    $NC_{adf}$    $NC_{ecm}$
(a)     -1.15         -1.19
(b)     -1.15         -1.28
(c)     -1.15         -1.52
(d)     -2.89         -3.25
(e)     -2.89         -3.88
(f)     -2.89         -5.32
In practice, these approximate non-centralities were close to the mean values of the corresponding test statistics in the Monte Carlo, except for (a)-(c) for the ADF, which had a mean of about -2.15 (see (4.28)). Their values help explain both the increasing powers of both tests across the experiments and the relatively better performance of $t_{c=0}$. Compared with the critical values in Table 7.6, and given the sampling standard deviations of the tests of about 0.8 for ADF and 1.0 for $t_{c=0}$, the non-centralities also account for the absolute powers of the tests: when the mean outcome is below the critical value, a power of less than 0.5 usually results; when the mean is more than one standard deviation below the critical value, the resulting power is under 0.2; two standard deviations lower induces a very low power; and so on. Similar arguments apply for deviations of the mean above the critical value.

Overall, there would seem to be some advantage in modelling dynamics less restrictively than by common factors when the latter is a poor approximation. Note that the absence of any contemporaneous effect from $\Delta z_t$ always induces a violation of common factors. Finally, since the long-run parameter is not assumed known in these experiments, the $t_{c=0}$ test procedure is an operational one, and has the same number of parameters here as the ADF test.

The main drawback to such an approach is its dependence on strong exogeneity. Boswijk (1991) proposes a Wald test for co-integration in individual equations when the regressors are not even weakly exogenous. This jointly tests the null for the coefficients of all the lagged levels in a Bardsen formulation. The resulting test is asymptotically similar and in effect tests for a common factor of unity (see Hendry and Mizon 1978). Boswijk and Franses (1992) investigate the power of this test.
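The contrast between the residual-based DF statistic and the coefficient-based $t_{c=0}$ statistic can be illustrated with a small simulation. The sketch below is only indicative: the DGP, the function names, and the parameter values (`g`, `c`, `sd2`) are hypothetical choices made for this illustration, and the ECM regression is our reading of the dynamic equation described in the text (a regression of $\Delta y_t$ on $\Delta z_t$, $(y - z)_{t-1}$, and $z_{t-1}$). Setting the standard deviation of the $z$ innovations to 3 corresponds roughly to $s = \sigma_1/\sigma_2 = 1/3$, a case where the common-factor restriction is a poor approximation.

```python
import numpy as np


def ols_t(y, X):
    """OLS coefficients and conventional t-ratios for y = X b + e."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return b, b / se


def df_stat(y, z):
    """DF t-ratio computed on the residuals of the static regression of y on (1, z)."""
    X = np.column_stack([np.ones_like(z), z])
    b, _ = ols_t(y, X)
    u = y - X @ b
    _, t = ols_t(np.diff(u), u[:-1, None])
    return t[0]


def ecm_stat(y, z):
    """t-ratio on (y - z)_{t-1} in a regression of dy_t on (1, dz_t, (y - z)_{t-1}, z_{t-1})."""
    dy, dz = np.diff(y), np.diff(z)
    X = np.column_stack([np.ones_like(dy), dz, (y - z)[:-1], z[:-1]])
    _, t = ols_t(dy, X)
    return t[2]


def simulate(T=200, g=0.1, c=-0.5, sd2=3.0, seed=0):
    """Hypothetical DGP: dz_t = e2_t, dy_t = g*dz_t + c*(y - z)_{t-1} + e1_t."""
    rng = np.random.default_rng(seed)
    e1 = rng.normal(0.0, 1.0, T)
    e2 = rng.normal(0.0, sd2, T)
    z = np.cumsum(e2)
    y = np.zeros(T)
    for t in range(1, T):
        y[t] = y[t - 1] + g * (z[t] - z[t - 1]) + c * (y[t - 1] - z[t - 1]) + e1[t]
    return y, z


y, z = simulate()
df_t, ecm_t = df_stat(y, z), ecm_stat(y, z)
print(f"DF t-ratio on static residuals: {df_t:.2f}")
print(f"ECM t-ratio (t_c=0):            {ecm_t:.2f}")
```

Neither statistic should be compared with standard normal or Student-$t$ tables; the appropriate critical values are those discussed in the text. The point of the sketch is only the relative size of the two statistics when $\Delta z_t$ has a small contemporaneous coefficient.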
7.6. An Empirical Illustration

To illustrate several tests for co-integration in single equations, we return to consider the UK seasonally adjusted quarterly data on money demand. The raw data series were shown in Chapter 1, and we concentrate here on the DW, DF, and ADF tests based on a static regression, and on their comparison with a dynamic regression, which is heavily over-parameterized. In all cases, we assume that there is only one co-integrating vector and that it enters the money-demand model. See Kremers et al. (1992) and Ericsson, Campos, and Tran (1990) for related analyses.

The long-run determinants of the demand for transactions money $M$, as measured by M1, are the price level $P$, real income as measured by constant 1985-price total final expenditure $X_{85}$, and the opportunity cost of holding money measured by $R_n$. (See Hendry and Ericsson (1991b) for details of its calculation.) We assumed a log-linear equation, consonant with price and income homogeneity, given by

where lower-case letters denote logs, $\alpha_1 = 1$ is anticipated, and $\alpha_i > 0$, $i = 1, 2, 3$. Least-squares estimation of the static regression over the sample 1963(I) to 1989(II) yielded
The residuals were then tested for a unit root using the DF and ADF tests, the latter commencing with four lags and testing down. The following results were obtained:

No lagged values of $\Delta\hat{u}$ proved significant, leading to the DF test:

In no case does any test reject the null of no co-integration, as the $t$-values on the estimated coefficient of $\hat{u}_{t-1}$ are in the neighbourhood of 2 in both the DF and the ADF regressions. That outcome continues to hold if a trend is added to the basic static-regression model (30), or if
price homogeneity is imposed and $\Delta p$ added as a regressor, corresponding to allowing $m$ and $p$ to be I(2), with $(m - p)$ and $\Delta p$ being I(1). In that last case, $R^2$ for real money is equal to only 0.68.

We assume now that $\Delta p$, $x_{85}$, and $R_n$ are weakly exogenous for the parameters in the conditional money demand model. The outcome of estimating a dynamic equation in the levels of the variables with five lags on each of $m - p$, $\Delta p$, $x_{85}$, and $R_n$ (plus a constant) by least squares is shown in Table 7.8.

TABLE 7.8. Empirical results
Variable
Lag 1
0 m— p
-1.000
xss
-0.041 0.115 -0.411 0.117 -0.757 0.210 -0.124 0.169
SE SE
Rn
SE Ap SE CONSTANT SE
0.
3
2
4
5
Sum o f lags
A 0.164 .147 0.549 0.,240 0,,251 0 .152 0,,132 0.,135 0 ,131 0.109 0,,028 0.118 0.087 0,.162 0.293 -0,,067 -0.,240 0,,130 0 .139 0.119 0 .026 0.135 0..139 0..139 -0.361 -0,,122 -0.,046 -0 .084 -0.045 -1.070 0.130 0 .187 0.178 0.,185 0.,176 0 .175 0.069 -1,.102 0.020 0,,307 -0.,412 -0 .329 0 .222 0.255 0,.253 0,,246 0 .246 0.203 - -0.12 4 0 .169
$R^2 = 0.9966$   $\hat{\sigma} = 0.0130$   F(23, 76) = 975.38   DW = 1.976   SC = -7.853
Mean = 10.896131   SD = 0.196173
Normality $\chi^2(2) = 4.29$   AR 1-5 F[5, 71] = 0.20   ARCH 4 F[4, 68] = 0.22
$X_i^2$ F[37, 38] = 0.66   RESET F[1, 75] = 0.98   COMFAC F[15, 76] = 3.14

Tests on the significance of each variable

Variable    F[num., denom.]   Value     Probability   Unit-root t-test
$m - p$     F[5, 76]          340.201   0.000         -5.168
$x_{85}$    F[6, 76]          7.801     0.000          6.171
$R_n$       F[6, 76]          12.127    0.000         -5.719
$\Delta p$  F[6, 76]          6.846     0.000         -4.963
CONSTANT    F[1, 76]          0.536     0.466         -0.732

Solved static long-run equation

$m - p = 1.102\,x_{85} - 7.278\,R_n - 7.493\,\Delta p - 0.842$
         (0.112)         (0.528)       (1.482)         (1.230)
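The solved long-run coefficients can be checked directly from the sums of the lag coefficients: each long-run coefficient is minus the sum of the lags on the regressor divided by the sum of the lags on the dependent variable (which includes the normalization of -1 at lag 0). The short sketch below does this arithmetic. The input sums are our reading of the 'Sum of lags' column of Table 7.8 and should be treated as assumptions, as should the variable names; small discrepancies from the reported long-run solution reflect rounding of the inputs to three decimal places.

```python
# Sums of lag coefficients, as read from the final column of Table 7.8
# (assumed readings; the dependent variable m - p has coefficient -1 at lag 0).
sum_dep = -0.147
sums = {"x85": 0.162, "Rn": -1.070, "dp": -1.102, "const": -0.124}

# Long-run coefficient = -(sum of lags on regressor) / (sum of lags on dependent variable)
long_run = {name: -value / sum_dep for name, value in sums.items()}

for name, coef in long_run.items():
    print(f"{name}: {coef:.3f}")
```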
These dynamic estimates are well behaved: the unit-root $t$-tests are all in the neighbourhood of 5 or larger in absolute value and every regressor matters as a set (i.e. testing all five lags); the solved long run is well defined and compares favourably with (30) since the three economic variables have highly significant coefficients with sensible signs and magnitudes; the goodness of fit is reasonable; and the diagnostic tests of the dynamic specification are all acceptable. Note that the sum of all the lags of the dependent variable, as shown in the final column of Table 7.8, is similar to that found in the DF regression, but has a much smaller standard error.

Only the first lag is strongly significant, as is shown in Table 7.9. Tests of common factors in the lag polynomials using the procedure in Sargan (1980) yield the results in Table 7.10.

Thus, the hypothesis of five common factors can be rejected at any reasonable level of significance. Recalling the discussion in Section 7.5 above, this outcome helps explain why the DF and ADF tests did not reject the null of no co-integration, whereas the dynamic model has done so decisively. Given that the common factor restrictions are rejected, the DF and ADF tests are not well suited to detecting co-integration. The ECM version of this equation, reported in Hendry and Ericsson (1991b), has a $t$-value greater than 10 in absolute value for the ECM coefficient, in a model which parsimoniously encompasses the unrestricted equation fitted above. Thus, the evidence favours rejecting no co-integration, and the results in the next chapter support that claim.

TABLE 7.9. Tests on the significance of each lag

Lag   F[num., denom.]   Value    Probability
5     F[4, 76]          0.691    0.600
4     F[4, 76]          1.615    0.179
3     F[4, 76]          1.654    0.170
2     F[4, 76]          1.416    0.237
1     F[4, 76]          12.967   0.000

TABLE 7.10. COMFAC Wald test statistic summary table

Order   $\chi^2$ d.f.   Value    Incremental $\chi^2$ d.f.   Value
1       3               0.086    3                           0.086
2       6               0.196    3                           0.110
3       9               4.176    3                           3.980
4       12              8.101    3                           3.925
5       15              47.128   3                           39.028
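The structure of Table 7.10 can be verified with a few lines of arithmetic: each incremental statistic is the difference between successive cumulative Wald statistics, and each increment has 3 degrees of freedom. The sketch below recomputes the increments and attaches upper-tail probabilities using the closed-form $\chi^2(3)$ survival function; the helper name `chi2_sf_3df` is introduced here for illustration only, and the $p$-values are our addition rather than figures reported in the table.

```python
import math


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-squared variate with 3 d.f. (closed form)."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


# Cumulative Wald statistics from Table 7.10, for 1 to 5 common factors.
cumulative = [0.086, 0.196, 4.176, 8.101, 47.128]

# Incremental statistics are first differences of the cumulative ones.
incremental = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]
p_values = [chi2_sf_3df(v) for v in incremental]

for order, (inc, p) in enumerate(zip(incremental, p_values), start=1):
    print(f"order {order}: incremental chi2(3) = {inc:.3f}, p = {p:.4g}")
```

On this reading, the first four increments are individually unremarkable and the rejection is driven by the fifth. Dividing the cumulative statistic of 47.128 by its 15 degrees of freedom gives about 3.14, which matches the COMFAC F[15, 76] value reported with Table 7.8.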
7.7. Fully Modified Estimation

This section considers methods for correcting the finite-sample biases in static regressions. Park and Phillips (1988), Phillips and Durlauf (1986), Phillips and Hansen (1990), and Phillips (1988a, 1991) have argued that the performance of estimators of co-integrating vectors based on static regressions is adversely affected by the existence of second-order biases. As shown in the examples below, these biases have no effect on the consistency of the estimators, but result in the asymptotic distributions of scaled estimators, such as $T(\hat{\beta} - \beta)$ in (31) below, having non-zero means.

Such biases play a potentially important role in finite samples. For example, let the variables $y_{1t}$ and $y_{2t}$ be generated by
When the $\{u_{it}\}$ are autocorrelated and intercorrelated, a static regression of $y_{1t}$ on $y_{2t}$, by not using any information about the process generating $y_{2t}$, provides an estimate of $\beta$ which can be quite severely biased even in fairly large samples. Phillips et al. therefore recommend full-system maximum likelihood estimation of co-integrated systems. As an alternative to estimation of the full system, they propose correcting the single-equation estimates non-parametrically in order to obtain median-unbiased and asymptotically normal estimates. These recommended corrections, for simultaneity bias and residual autocorrelation, use expressions derived from the asymptotic distributions of the estimators although the corrections are made to estimators from finite samples. Phillips and Hansen (1990) show that these corrections work effectively in sample sizes as small as 50.15 Their example is presented in Section 7.10.4 below.

15 While it is possible to derive exact expressions for the biases in finite samples to any desired level of accuracy, using Edgeworth-type expansions, this is a complicated procedure.

The estimates obtained from fully modified and full-information methods are asymptotically equivalent. This equivalence is of interest because it links the discussion with a third possible method of reducing finite-sample biases, namely, estimating single-equation dynamic regressions. The aim of the analysis in this section is to compare the non-parametrically corrected estimates (which are also asymptotically efficient and median-unbiased) with estimates obtained from dynamic regressions in either their ADL or ECM forms. The form of the autocorrelation in the error process in (31) and (32) is crucial to this comparison. For some specifications of the error process, a dynamic regression equation implicitly performs the same corrections as those achieved by the non-parametric correction terms. The long-run estimates obtained from this properly specified dynamic equation are then equivalent, asymptotically, to the non-parametrically corrected estimates.16 In such cases, therefore, two ways of incorporating information about the marginal process (that is, the process generating $y_{2t}$) present themselves: non-parametric correction, or dynamic specification. However, for other specifications of the autocorrelation process a single-equation dynamic regression may fail to achieve efficiency, or eliminate the effects of second-order bias, regardless of the richness of the parameterization, owing to a failure of the conditioning variables to be weakly exogenous for the parameters of the dynamic equation.

Our theoretical discussion is based on Phillips (1988a). Although it is fairly straightforward to describe and categorize the circumstances under which dynamic single-equation estimates will perform well, the detailed theoretical background for this description is lengthy and complex. Readers interested in implementing the non-parametric corrections are referred to the papers by Phillips and his co-authors cited previously. We shall focus on presenting the arguments intuitively and will illustrate the theoretical analysis with two simulation exercises, the first taken from Phillips and Hansen (1990), and the second from Gonzalo (1990).
7.8. A Fully Modified Least-squares Estimator

Consider the data-generation process given by (31) and (32) and disregard, for the moment, the precise autocorrelation structure of $u_t = [u_{1t}, u_{2t}]'$. Assume only that $u_t$ is weakly stationary with its mean vector and long-run covariance matrix given by $[0, 0]'$ and $\Omega$ respectively, where $\Omega = \{\omega_{ij}\}_{i,j=1,2}$.17 The following decomposition of the $\Omega$ matrix is useful in understanding its structure: $\Omega = V + \Gamma + \Gamma'$, where $V = E[u_0 u_0']$ and $\Gamma = \sum_{k=1}$ [...]

[...] $\hat{\omega}_{21}$ and $\hat{\omega}_{22}$ are consistent estimates of the corresponding elements in the long-run covariance matrix, and $\hat{\Delta}$ is a consistent estimate of $\Delta$. Under quite general conditions,

The notation BM($\Omega_{11.2}$) is used to denote a bivariate Brownian motion process with covariance matrix $\Omega_{11.2}$ and is a matrix generalization of scalar Wiener processes, as discussed in Chapter 6. The limiting distribution (37) is a covariance matrix mixture of normals (see Table 3.3).

The 'full modification' in (33) achieves two notable aims. First, by taking account of any serial correlation in the residuals, the bias correction term $\delta^{+}$ mitigates the effects of second-order bias. Second, the corrections for long-run simultaneity in the system made by using $y_{1t}^{+}$ (in place of $y_{1t}$) permit the use of conventional (asymptotic) procedures for inference. Thus, defining the fully modified standard error by $s^{+}$ where,

where $\hat{\omega}_{11.2}$ is a consistent estimator of $\omega_{11.2}$, we have the following result:
Phillips and Hansen (1990) show that this approach is asymptotically equivalent to systems procedures such as full maximum likelihood estimation discussed in Chapter 8. Both (38), which simplifies the process of inference, and the reduction in the second-order bias in $\beta^+$ help estimation and testing of single equations in co-integrated systems. Our use of a simple data-generation process is solely for the purposes of exposition; the literature to which we have referred is capable of treating co-integrated systems at a high level of generality.
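The mechanics of the fully modified correction can be illustrated numerically. The sketch below is not the authors' implementation. It assumes the usual Phillips-Hansen form of the corrections, namely $y_{1t}^+ = y_{1t} - \hat{\omega}_{12}\hat{\omega}_{22}^{-1}\Delta y_{2t}$, $\hat{\delta}^+ = \hat{\Delta}_{21} - \hat{\Delta}_{22}\hat{\omega}_{22}^{-1}\hat{\omega}_{21}$, and $(s^+)^2 = \hat{\omega}_{11.2}/\sum y_{2t}^2$, together with a DGP of the form $y_{1t} = \beta y_{2t} + u_{1t}$, $\Delta y_{2t} = u_{2t}$. The MA(1) error process, the parameter values, and all function names (`bartlett_lrv`, `fm_ols`, `simulate`) are illustrative assumptions.

```python
import numpy as np


def bartlett_lrv(u, M):
    """Bartlett-window estimates of the two-sided long-run covariance
    (omega) and the one-sided sum (delta) of a T x 2 residual array u."""
    T = u.shape[0]
    gamma0 = u.T @ u / T
    omega = gamma0.copy()
    delta = gamma0.copy()
    for k in range(1, M + 1):
        w = 1.0 - k / (M + 1.0)          # triangular (Bartlett) weight
        gk = u[:-k].T @ u[k:] / T        # gk[i, j] ~ E[u_i(t-k) * u_j(t)]
        omega += w * (gk + gk.T)
        delta += w * gk
    return omega, delta


def fm_ols(y1, y2, M=5):
    """OLS and fully modified estimates of beta in y1 = beta*y2 + u1
    (scalar regressor, no intercept; assumed form of the corrections)."""
    dy2 = np.diff(y2)
    y1c, y2c = y1[1:], y2[1:]
    T = len(y1c)
    syy = y2c @ y2c
    b_ols = (y2c @ y1c) / syy
    u = np.column_stack([y1c - b_ols * y2c, dy2])
    omega, delta = bartlett_lrv(u, M)
    # Correction for long-run endogeneity of y2.
    y1_plus = y1c - omega[0, 1] / omega[1, 1] * dy2
    # Correction for second-order (serial-correlation) bias.
    delta_plus = delta[1, 0] - delta[1, 1] * omega[1, 0] / omega[1, 1]
    b_fm = (y2c @ y1_plus - T * delta_plus) / syy
    omega_11_2 = omega[0, 0] - omega[0, 1] ** 2 / omega[1, 1]
    se_fm = np.sqrt(omega_11_2 / syy)
    return b_ols, b_fm, se_fm


def simulate(T, beta=2.0, sigma21=0.8, theta21=0.4, seed=0):
    """Hypothetical DGP: u_t = e_t + Theta e_{t-1}, with correlated e_t."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, sigma21], [sigma21, 1.0]])
    e = rng.multivariate_normal(np.zeros(2), cov, size=T + 1)
    theta = np.array([[0.3, 0.4], [theta21, 0.6]])
    u = e[1:] + e[:-1] @ theta.T
    y2 = np.cumsum(u[:, 1])              # Delta y2 = u2
    y1 = beta * y2 + u[:, 0]             # co-integrating relation
    return y1, y2


y1, y2 = simulate(2000)
b_ols, b_fm, se_fm = fm_ols(y1, y2)
print(b_ols, b_fm, se_fm)
```

The ratio $(\beta^+ - \beta)/s^+$ is the quantity that would be referred to standard normal tables. The lag truncation `M=5` mirrors the window length mentioned later in the simulation study, but it is a tuning choice rather than a prescription.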
7.9. Dynamic Specification
Is it possible, by suitable dynamic specification alone, to make the same corrections as those made by the technique described above? In order to answer this question, Phillips (1988a) considers a dynamic version of equation (31):
$y_{1t} = \beta y_{2t} + \gamma' x_t + \eta_t$ (39)
where $x_t$ is a vector with jointly stationary elements. Thus, $x_t$ contains lagged values of $\Delta y_{1t}$ and current and lagged values of $\Delta y_{2t}$. While far from being a general dynamic model, (39) is a linear-in-parameters ADL model.
The process of constructing a regression equation such as (39) has been extensively discussed in the literature (see, in particular, Engle et al. 1983). Thus, focusing on the DGP given by (31) and (32) and imposing no restrictions upon the autocorrelation structure of the $u_{it}$,
where $\mathcal{F}_{t-1}$ is the information set containing information on past realizations of $y_{1t}$, $y_{2t}$ and hence of $u_{i,t-j}$, $i = 1, 2$; $j \ge 1$. By construction, $\{\eta_t\}$ in (40) is a martingale difference sequence.
If the process generating $u_t$ is now specialized to the case where it is a linear process, so that
where
The variance of $\eta_t$ is given by $\sigma_{11.2} = \sigma_{11} - \sigma_{21}^2\sigma_{22}^{-1}$, and $\eta_t$ is orthogonal to $\varepsilon_{2t}$ as well as to the entire history of $\varepsilon$, given by $(\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots)$.18
Note that [...], $d_2(L) = \sum_{j=0}^{\infty} d_{2j}L^j$, and $\eta_t \sim IN(0, \sigma_{11.2})$, which is independent of the regressors. It is then possible to show that
(45)
where $\hat{\beta}$ is the estimate of the coefficient of $y_{2t}$ in (44). $B_\eta(r)$ and $B_2(r)$ comprise a bivariate Brownian motion process with a well-defined variance-covariance matrix.
The question posed at the beginning of this sub-section can now be answered. Comparing (37) and (45), the fully modified estimator $\beta^+$ and the dynamic single-equation least-squares estimator are equivalent if and only if $B_\eta(r) = B_{1.2}(r)$. These two Brownian motion processes are not necessarily equal to each other. This is because $B_\eta(r)$ can be correlated with $B_2(r)$, despite its construction in (40). The generating mechanism for $u_{2t}$ may therefore be informative, and optimal inference then requires joint estimation with the error-correction model. Phillips (1988) describes this as a failure of weak exogeneity or valid conditioning. If, on the other hand, $B_\eta(r)$ and $B_2(r)$ are uncorrelated at all frequencies, the conditional process is completely informative for the purposes of estimation of $\beta$ and the marginal process generating $u_{2t}$ may be ignored. In such a case, $B_\eta(r) = B_{1.2}(r)$.
The examples following this sub-section will elaborate upon these conditions, but we will close this section with an interpretation. The non-equivalence of the dynamic regression estimator and the fully modified estimator arises from possible correlation between the residuals $\eta_t$ of the conditional process and the residuals $u_{2t}$ of the marginal process. This correlation arises because, although $\eta_t$ is orthogonal to $u_{2t}$ and the past history of $u_{2t}$ ($\eta_t$ is orthogonal to its own past by construction), $u_{2t}$ is not necessarily orthogonal to the past of $u_{1t}$ and hence $(\eta_t, u_{2t})'$ jointly is not a martingale difference sequence (MDS).
Three examples are presented below. They are adapted from Phillips (1988a) and are special cases of the examples appearing in that paper. Three different specifications of the autocorrelation structure of the $u_t$ process are considered while the data-generation process continues to be (31) and (32).
The examples help to integrate and interpret the discussions on weak exogeneity, dynamic modelling, and fully modified estimation. Exogeneity plays an important role in dealing with non-stationary variables. Dynamic regression equations in which the conditioning is on weakly or strongly exogenous variables (for the parameters of interest) provide asymptotically unbiased estimates. Further, inference may be conducted with standard tables. In cases where such conditioning is not possible, improperly conditioned equations lead to inefficient and biased estimates. The full system must therefore be estimated or the non-parametrically modified estimates used. It is seen that fully modified estimation is another way of addressing the issue of the completeness of conditional models for purposes of estimation and inference.
7.10. Examples
7.10.1. Example (Phillips 1988a: 352)
In reduced form, the DGP (31) and (32) is given by
Hence
Thus, using the formula for the conditional expectation of bivariate normal random variables, we have
Defining
and using (48), we obtain
or, alternatively,
where
Finally, substituting for $\varepsilon$ [...]
Several features are now evident. By construction, $\eta_t$ is an MDS. Second, again by construction, $\eta_t$ is uncorrelated with $u_{2t}$.19 From (47), we have that the $u_{2t}$ process is serially uncorrelated both with past $u_{2t}$ and with past $u_{1t}$. It follows that $\eta_t$ and $u_{2t}$ are incoherent (that is, uncorrelated at all lags or frequencies), that the long-run covariance matrix of $[\eta_t, u_{2t}]'$ is diagonal, and that the estimation of a single dynamic equation should provide a fully efficient and unbiased estimate of the vector $\alpha$.
Looking at the conditional and marginal processes given by (50) and the second equation in (46) respectively, and at the properties identified in the previous paragraph, single-equation least squares on (50) is equivalent to full-information maximum likelihood for estimating $\beta$. The orthogonality of the $\eta_t$ and $u_{2t}$ processes ensures that the joint likelihood function for the system factorizes into the likelihood functions for the marginal and conditional models given by the second equation in (46) and (50) respectively. There are no cross-equation restrictions; the parameter of interest $\beta$ can be estimated and identified from (50) alone; and, recalling the discussion of weak exogeneity in Chapter 1, the marginal process generating $u_{2t}$ need not be modelled when estimating $\beta$.
7.10.2. Example (Phillips 1988a: 355)
where,
Then
The long-run covariance matrix of $(\eta_t, u_{2t})'$ is given by
where $\sigma_{11.2} = \sigma_{11} - \sigma_{12}^2\sigma_{22}^{-1}$. The expression for $\Omega_{\eta.2}$ follows from application of the conditional-expectations formula and from inspection of (53). $\eta_t$ and $u_{2t}$ are again incoherent, and the limit Brownian motions are
where $B_\eta$ and $B_2$ are independent and $B_\eta = B_{1.2}$. Thus, estimating a dynamic single-equation model (the conditional model) provides estimates identical, asymptotically, to those provided by the Phillips-Hansen procedure. Here the conditional model is given by
In error-correction format, we may rewrite (54) as
Equation (54) is the one that must be estimated in order to obtain an asymptotically unbiased estimator of $\beta$. The static regression is augmented in (54) by the terms $\Delta y_{2t}$ and $\Delta y_{2,t-1}$. These additional terms are incorporated to reduce or eliminate, in finite samples, the effects of second-order bias, without estimating the full system.
Phillips (1988a) notes that the bias correction term $\delta^+$ for this example is equal to zero since $\Delta = (\omega_{12}, \omega_{22})'$. However, to obtain fully modified estimates, from (34) $y_{1t}$ needs to be corrected for long-run endogeneity as follows:
The same correction is achieved in the dynamic regression by the two $\Delta y_{2,t-j}$ terms in (55). The static regression produces biases by ignoring these corrections.
7.10.3. Example (Phillips 1988a: 356)
We take the process $(\varepsilon_{1t}, \varepsilon_{2t})'$ to be distributed as in Section 7.10.2. Then it may be shown that
The long-run covariance matrix is given by
where $\sigma_{11.2}$ is as defined in Section 7.10.2, and
The Brownian motions $B_\eta$ and $B_2$ are correlated and the single-equation dynamic estimator and the fully modified estimator are no longer equivalent, unless $\theta_{21} = 0$. For the structure of the correlation between $B_\eta$ and $B_2$ (see Phillips 1988a):
where $B_{\eta.2}(r)$ is a univariate Brownian motion process with variance given by $\sigma_{11.2} - [\ldots]$ and is independent of $B_2(r)$. Further,
From (58), setting $\theta_{21}$ equal to zero makes the $B_\eta(r)$ and $B_{\eta.2}(r)$ processes equivalent to each other. Further, $B_{\eta.2}(r)$ has a variance of $\sigma_{11.2}$ and is in all respects equivalent to the $B_{1.2}(r)$ process given in (37) above. Thus, the $B_\eta(r)$ and $B_{1.2}(r)$ processes are equivalent, and, in accordance with the previous discussion, this equivalence leads to the equivalence of the single-equation dynamic estimator and the fully modified estimator.
It should be noted that $\theta_{21} \neq 0$ also implies that the $\Gamma$-type terms (see Section 7.8) are important in the long-run variance matrix for the $(\eta_t, u_{2t})'$ process. This is just another way of saying that the past of the process is important (and so, in the $(\eta_t, u_{2t})'$ construction we have not achieved a martingale difference sequence). Thus, the equivalence of dynamic single-equation estimators and fully modified estimators may also be assessed by looking for the presence of $\Gamma$-type terms in the long-run variance matrix. These are the terms (for example, the first term in (59)) that give rise to biases in the single-equation dynamic estimates of the co-integrating vector.
The necessary and sufficient condition for non-equivalence has a natural interpretation in the language of an earlier literature on dynamic
modelling. It is evident that the condition $\theta_{21} \neq 0$ violates weak exogeneity20 as may be verified from (57); and once again, it may be seen that the issues of fully modified estimation and dynamic specification are closely related. This example forms the basis for the simulation exercise discussed in the final sub-section.
7.10.4. Simulation Example (Phillips and Hansen 1990: 116)
The data-generation process for their simulation study is given by
The design of the experiment consisted in allowing $\sigma_{21}$ and $\theta_{21}$ to vary. Thus, four values of $\sigma_{21}$ and three values of $\theta_{21}$ were used. The values of $\sigma_{21}$ considered were -0.8, -0.4, 0.4, and 0.8, and the three values of the moving-average parameter $\theta_{21}$ were 0.8, 0.4, and 0.0.21 $\beta$ was set equal to 2 for all twelve combinations of the values of $\sigma_{21}$ and $\theta_{21}$. The aim was to calculate and compare the distributions of estimators and $t$-statistics for the co-integrating parameter obtained by OLS, single-equation dynamic, and fully modified methods.
For the fully modified method, Phillips and Hansen used a Bartlett triangular window of lag length 5 and the OLS residuals $\hat{u}_{1t}$ to calculate non-parametric estimates of $\Delta$, $\Omega$ and hence of $\delta^+$. We shall denote these estimates by $\hat{\Delta}$, $\hat{\Omega}$, and $\hat{\delta}^+$. The OLS $t$-statistic was estimated by using $\hat{\omega}_{11}$ (the (1,1) element from the non-parametrically estimated long-run variance matrix) as an estimate of the standard error. The dynamic equation regressed $y_{1t}$ on ($y_{2t}$, $\Delta y_{2t}$, $\Delta y_{2,t-1}$, $\Delta y_{2,t-2}$, $\Delta y_{1,t-1}$, $\Delta y_{1,t-2}$), using 30,000 replications for each simulation (that is, for each pair of values of ($\theta_{21}$, [...] $\theta_{21} = 0$, the dynamic $t$-statistic is substantially less biased (in all but one case) than the FM $t$-statistic, but its variance is much higher.
Since the use of the normal distribution is a considerable simplification and the bias comparisons are at best ambiguous for the dynamic estimates (when $\theta_{21} \neq 0$), there may be reasons to prefer the FM estimator over the D estimator when only long-run parameters are of interest. This recommendation must be qualified by noting that a more richly parameterized dynamic model may have provided lower biases and a distribution of the $t$-statistic closer to the normal distribution.
Performance with a negative MA parameter is also important; some early studies have suggested that the FM estimator performs less well in such cases. Both these qualifications point to the need for more extensive simulation studies.
What is clear from all the studies considered so far is the poor performance of unmodified estimates derived from static regressions. Some form of incorporation of the dynamic structure of the data-generation process, either by means of a non-parametric correction of the static regression estimates or by running dynamic regressions, is
22 Phillips and Hansen rationalize this behaviour by stating that 'when this condition [$\theta_{21} = 0$] does hold, the parametric nature of the [dynamic] method gives it a natural advantage over our semi-parametric approach' (1990: 119).
TABLE 7.12. Mean (standard deviation) of $t$-statistics

$\sigma_{21}$   Method   $\theta_{21} = 0.8$   $\theta_{21} = 0.4$   $\theta_{21} = 0.0$
-0.8    OLS      -1.616 (1.268)    -1.240 (1.105)    -0.930 (1.00)
-0.8    D        -1.259 (2.040)    -0.563 (1.701)    -0.003 (1.40)
-0.8    FM       -0.388 (1.432)    -0.449 (1.092)    -0.025 (0.896)
-0.4    OLS      -1.156 (1.32)     -0.986 (1.25)     -0.754 (1.149)
-0.4    D        -1.058 (1.69)     -0.636 (1.57)     -0.163 (1.388)
-0.4    FM       -0.729 (1.49)     -0.516 (1.35)     -0.335 (1.193)
0.4     OLS      -0.711 (1.19)     -0.520 (1.21)     -0.267 (1.24)
0.4     D        -0.664 (1.29)     -0.478 (1.34)     -0.213 (1.37)
0.4     FM       -0.606 (1.26)     -0.267 (1.30)     0.096 (1.36)
0.8     OLS      -0.575 (0.955)    -0.302 (0.979)    -0.098 (1.04)
0.8     D        -0.445 (1.15)     -0.339 (1.25)     -0.184 (1.36)
0.8     FM       -0.519 (0.922)    -0.102 (0.962)    0.418 (1.12)

Reproduced from Phillips and Hansen (1990).
necessary for inference. While super-consistency theorems show that I(0) terms may be ignored asymptotically in regressions with I(1) variables, these asymptotic results have little bearing on sample sizes common in econometrics, where I(0) terms are important and need to be accommodated.
The other important issue raised by these examples is the weak exogeneity of the conditioning variables for the parameters of interest. Reconsider the DGP in (31) and (32) where $u_t$ is a first-order autoregressive process, so that a finite lag length dynamic model is valid:
where
Then
or
in terms of I(0) variables. Let $E[\varepsilon_{1t} \mid \varepsilon_{2t}] = \sigma_{12}\sigma_{22}^{-1}\varepsilon_{2t} = \gamma\varepsilon_{2t}$ so [...]
Further, assume that $\theta = (\beta^* : \alpha : \beta : \delta)'$ denotes the parameters of interest, and indeed that $\theta$ is both constant and invariant to regime shifts affecting $\Delta y_{2t}$. Nevertheless, although (61) appears to define a valid conditional model for all values of $\theta$, if $c_{21} \neq 0$ then $\Delta y_{2t}$ is not weakly exogenous for $\theta$. Because of the resulting non-diagonality of the long-run covariance matrix, this loss of weak exogeneity can have a detrimental impact on the bias and efficiency of the least-squares estimator of $\theta$ in finite samples.
In fact, $c_{21} \neq 0$ jointly violates the weak and strong exogeneity of $y_{2t}$ for $\theta$. To sort out which aspect is dominant, three cases merit comment: the following implications are based on Monte Carlo studies of (61). First, even if $\gamma = 0$, so that there is no simultaneity and $\beta^* = \beta$, the previous conclusion holds. Second, if $\gamma \neq 0$ whereas $c_{21} = 0$, $y_{2t}$ is strongly exogenous for $\theta$ and no problems result. Finally, if strong exogeneity alone is violated, but weak exogeneity holds, as would happen if $\Delta y_{1,t-1}$ directly affected $\Delta y_{2t}$ when $c_{21} = 0$, there are again no serious bias effects. Thus, the presence of the co-integrating vector in another equation appears to be the primary determinant of the finite-sample bias. Consequently, co-integration forces a renewed emphasis on systems methods if potentially misleading inferences are to be avoided. That is the focus of Chapter 8.
Appendix: Covariance Matrices
Consider the DGP in (A1) where $y_t$ is the stationary first-order vector autoregressive process:
$y_t = A y_{t-1} + \varepsilon_t$ where $\varepsilon_t \sim IN(0, \Sigma)$, (A1)
and all the latent roots of $A$ lie inside the unit circle. There are three distinct covariance matrices relevant to the analysis, as follows, noting that $E(y_t) = 0$.
(a) The conditional (or contemporaneous) covariance matrix
(b) The unconditional covariance matrix
obtained as shown by substituting (A1) for $y_t$, multiplying out, and using stationarity. The elements of $G$ can be obtained by vectoring (A3) and solving.
(c) The long-run covariance matrix
Consider the finite sample expression, analogous to $E[T^{-1}S_T^2]$ in the scalar case:
Rewriting $\Omega$ as $(I - A)(I - A)^{-1}G + \Lambda + \Lambda' + G(I - A')^{-1}(I - A') - G$, on simplifying we have that:
However, a more convenient form of $\Omega$, directly related to the spectral density at the origin, results from (A3):
(A5)
so that on pre-multiplying $\Sigma$ by $(I - A)^{-1}$ and post-multiplying by $(I - A')^{-1}$ and using (A4):
Similar principles apply to deriving these three matrices in more general weakly stationary processes. As a second example, if (A1) is altered to the first-order moving average:
then, using $\mathcal{I}_{t-1}$ to denote available information:
and:
(A10)
Following Phillips and Durlauf (1986), consider a general I(1) vector process:
and $v_t$ is a weakly stationary stochastic process with unconditional covariance $E(v_t v_t') = G$ and long-run covariance $\Omega = G + \Lambda + \Lambda'$. From (A4), $\Lambda$ can be written as:
Extending the analysis in Chapter 3 to allow for vector processes, and in the appendix to Chapter 6 to allow for non-IID errors, $x_T/\sqrt{T}$ converges to the vector Brownian motion $BM(\Omega)$:
Then:
These vector formulae could be standardized using $V(r) = K'B(r)$ where $\Omega^{-1} = KK'$.
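The three covariance matrices of the first-order vector autoregression (A1) are easily computed, and the two routes to the long-run matrix can be checked against each other. The sketch below is illustrative only: it assumes the convention $\Lambda = \sum_{j \ge 1} E[y_t y_{t-j}'] = \sum_{j \ge 1} A^j G$, solves $G = AGA' + \Sigma$ by vectoring, and compares $G + \Lambda + \Lambda'$ with $(I - A)^{-1}\Sigma(I - A')^{-1}$. The function name and the numerical values of $A$ and $\Sigma$ are hypothetical.

```python
import numpy as np


def var1_covariances(A, Sigma):
    """Unconditional (G), one-sided (Lam) and long-run (Omega) covariance
    matrices of the stationary VAR(1): y_t = A y_{t-1} + e_t, e_t ~ (0, Sigma)."""
    n = A.shape[0]
    I = np.eye(n)
    # vec(G) = (I - A kron A)^{-1} vec(Sigma), using column-major vec.
    vec_sigma = Sigma.reshape(-1, order="F")
    vec_g = np.linalg.solve(np.eye(n * n) - np.kron(A, A), vec_sigma)
    G = vec_g.reshape(n, n, order="F")
    # Lam = sum_{j>=1} A^j G = (I - A)^{-1} A G  (assumed sign convention).
    Lam = np.linalg.solve(I - A, A @ G)
    omega_from_sums = G + Lam + Lam.T
    inv = np.linalg.inv(I - A)
    omega_spectral = inv @ Sigma @ inv.T
    return G, Lam, omega_from_sums, omega_spectral


A = np.array([[0.5, 0.1], [0.2, 0.3]])
Sigma = np.array([[1.0, 0.3], [0.3, 2.0]])
G, Lam, omega_a, omega_b = var1_covariances(A, Sigma)
print(np.allclose(omega_a, omega_b))
```

Setting $A = 0$ collapses all three matrices to $\Sigma$, which is the case in which the long-run and contemporaneous covariances coincide.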
8
Co-integration in Systems of Equations
We have so far considered only single-equation estimation and testing. While the estimation of single equations is convenient and often efficient, for some purposes only estimation of a system provides sufficient information. This is true, for example, when we consider the estimation of multiple co-integrating vectors, and inference about the number of such vectors. Traditionally, systems have been estimated when there is a failure of weak exogeneity in a single equation, and these considerations also apply here. This chapter examines methods of finding the co-integrating rank, considers circumstances when dynamic single-equation methods will be asymptotically equivalent to systems methods, and provides examples to illustrate these issues. Asymptotic distributions are also derived.
In earlier chapters, we investigated data series containing unit roots in their scalar autoregressive representations (i.e. their marginal distributions), and denoted such series as I(1). In this chapter we will consider a vector time series of dimension $n$, $x_t = (x_{1t}, x_{2t}, \ldots, x_{nt})'$ (generalizing the analysis to any number of variables), where $x_t$ is I(1) so that $\Delta x_t$ is I(0). Generally, any arbitrary linear combination of the elements of $x_t$, say $w_t = \alpha' x_t$, will also be I(1), and such linear combinations imply or give rise to spurious regressions. However, there may exist vectors $\alpha_i$ such that $\alpha_i' x_t \sim I(0)$, in which case the relevant components of $x_t$ are co-integrated.
In the simplest bivariate case, as we have seen, we may take $x_t = (y_t, z_t)'$, where $y_t$ and $z_t$ are individually I(1). The arbitrary linear combination $(y_t - \kappa z_t)$ will also be I(1), but if there exists a value $\kappa_1$ of $\kappa$ such that $(y_t - \kappa_1 z_t) \sim I(0)$, then $y_t$ and $z_t$ are co-integrated. Letting $\alpha_1' = (1, -\kappa_1)$ be the co-integrating vector in this case, $\alpha_1$ must be unique, since for any other value $\kappa^*$, $y_t - \kappa^* z_t = y_t - \kappa_1 z_t + (\kappa_1 - \kappa^*)z_t = w_t + (\kappa_1 - \kappa^*)z_t$, which is the sum of an I(0) process and an I(1) process, and therefore I(1) unless $\kappa_1 = \kappa^*$.
For $n$ elements in $x_t \sim I(1)$, there can be, at most, $n - 1$ co-integrating combinations.1 Hence $0 \le r \le n - 1$ and the $r$ vectors may be gathered in an $n \times r$ matrix $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_r]$. Outside the bivariate model, $n > 2$ and the co-integrating matrix is no longer unique in the absence of prior information. We noted in Chapter 2 the related issue for stationary equilibria, only some of which need correspond to substantive economic hypotheses.
A simple case of non-uniqueness occurs when subsets of the $x_{jt}$ are co-integrated. In fact, for any non-singular $r \times r$ matrix $F$, $w_t^* = F\alpha' x_t = \alpha^{*\prime} x_t$ is also I(0). This last result shows that linear combinations of the co-integrating vectors themselves form co-integrating combinations. Since $\alpha_i' x_t$ and $\alpha_j' x_t$ are I(0), so is any linear combination thereof. In the terminology of linear algebra, the dimension of the co-integrating space (given by the rank of the matrix $\alpha$) is $r$ and the columns of $\alpha$ form the basis vectors of this space. Pre-multiplying $\alpha'$ by an $r \times r$ non-singular matrix $F$ does not alter either the co-integrating space or its dimensions. Therefore, strictly speaking, estimating the co-integrating matrix $\alpha$ essentially involves deriving the basis vectors. The matrix $\alpha$ is non-unique in the absence of prior information.
A brief justification may be offered for focusing on the open interval $(0, n)$ of $\mathbb{N}$ as the domain of values for $r$. When $r = n$, $x_t$ must be I(0), as shown in Section 8.1 below. We therefore exclude this case when we know that $x_t$ is I(1) and only consider stochastic processes where variables are marginally I(1). Thus, $n - r > 0$, and we can re-express the process $\{x_t\}$ in terms of I(0) processes, using the $r$ co-integrating relationships and $n - r$ first differences of the process.
The case of $r = 0$ is a trivial one as it implies the absence of even a single co-integrating vector and suggests respecification of the system in differences.
As we saw in Chapter 5, Engle and Granger (1987) established an isomorphism between co-integration and error-correction models. In order to examine co-integration in systems of equations, we will derive that result, formulating the system in ECM form, in some detail below, starting this time from the moving-average representation of the process. From that system, a maximum likelihood estimator (MLE) of $r$, the number of co-integrating relationships, will be obtained based on a method proposed by Johansen (1988). This will in turn enable us to test hypotheses concerning the dimension of the co-integration space, and establish a 'central value' of $\alpha$.
1 A proof of this result is given in Sect. 8.1.
8.1. Co-integration and Error Correction
We now return to the representation of a co-integrated system in autoregressive or (equivalently) in error-correction form. When $\{\Delta x_t\}$ is a stationary process (possibly) with drift, we can express it as a multivariate moving average using the Wold (1954) decomposition theorem:
where $\varepsilon_t \sim IID(0, \Omega)$; $L$ is again the lag operator, and $C(L)$ is a polynomial matrix in $L$ given by
The cumulative or total effect from $C(L)$ is given by
where the $C_j$ again obey an exponential decay condition of the form discussed in Chapter 5. Using $C(1)$, we can rewrite $C(L)$ as
where $C^*(L) = \sum_{j=0}^{\infty} C_j^* L^j$ and $C_j^* = -\sum_{i=j+1}^{\infty} C_i$ so that $C_0^* = I_n - C(1)$. Note that the existence of these matrices is again guaranteed by the exponential decay condition. Thus, from (1),
or
where $\mu = C(1)m$.
The key assumptions needed to derive the autoregressive representation of the process are given below. As in Chapter 5, the proof follows Johansen (1991a).
ASSUMPTION B1. The characteristic polynomial,
has roots either equal to or strictly greater than 1; that is, $|C(z)| = 0$ implies that either $|z| > 1$ or $z = 1$.
ASSUMPTION B2. The matrix $C(1)$ has reduced rank $n - r$ and is therefore expressible as the product of two $n \times (n - r)$ matrices $\phi$ and $\eta$, where $\phi$ and $\eta$ have rank $n - r$. Thus, $C(1) = \phi\eta'$.
ASSUMPTION B3. The $r \times r$ matrix $\phi_\perp' C^*(1)\eta_\perp$ has full rank $r$.2
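The decomposition of $C(L)$ into its value at unity and a remainder polynomial is purely algebraic and can be verified for any finite-order polynomial matrix. The sketch below is an illustration under our own assumptions: a hypothetical second-order $C(L)$ with $C_0 = I_n$, the coefficients $C_j^* = -\sum_{i > j} C_i$ as defined in the text, and the identity $C(z) = C(1) + (1 - z)C^*(z)$, which is the form of the rewriting assumed here. The function names are ours.

```python
import numpy as np


def decompose(C):
    """Given MA coefficient matrices C_0..C_q (array of shape (q+1, n, n)),
    return C(1) and the coefficients C*_0..C*_{q-1} of C*(L)."""
    C = np.asarray(C, dtype=float)
    C1 = C.sum(axis=0)
    # C*_j = - sum_{i > j} C_i
    Cstar = np.array([-C[j + 1:].sum(axis=0) for j in range(len(C) - 1)])
    return C1, Cstar


def poly_eval(coefs, z):
    """Evaluate the matrix polynomial sum_j coefs[j] * z**j at scalar z."""
    return sum(c * z ** j for j, c in enumerate(coefs))


# Hypothetical coefficients with C_0 = I.
C = np.array([
    np.eye(2),
    [[0.4, -0.2], [0.1, 0.3]],
    [[-0.1, 0.05], [0.0, 0.2]],
])
C1, Cstar = decompose(C)
z = 0.7
lhs = poly_eval(C, z)
rhs = C1 + (1 - z) * poly_eval(Cstar, z)
print(np.allclose(lhs, rhs), np.allclose(Cstar[0], np.eye(2) - C1))
```

In a co-integrated system $C(1)$ would additionally have reduced rank $n - r$ (Assumption B2); the hypothetical coefficients above are not constructed to satisfy that restriction.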
Assumptions B1-B3 are analogous to Assumptions A1-A3 in Chapter 5. Given our results on $C(1)$ in Chapter 5, it is natural to require that $C(1)$ be of reduced rank and have rank $n - r$. Also, $r = n$ implies that $C(1)$ is identically the null matrix. Thus, from (3), $(\Delta x_t - \mu) = C^*(L)\Delta\varepsilon_t$, which implies, after integration, that $x_t$ is integrated (at most) of order 0. Assumption B3 then rules out the possibility that $C^*(L)$ has a root on the unit circle, so $x_t$ cannot be integrated of order $-1$. In either case, we have a contradiction of the assumption that the components of $x_t$ are I(1).
To derive the autoregressive representation, multiply (3) by $\phi'$ and $\phi_\perp'$ respectively to obtain the equations
using the decomposition $C(1) = \phi\eta'$ and the result that
The matrix $C(1)$ is not invertible and the system given by (4a) and (4b) therefore cannot be inverted directly to express the $x_{it}$ in terms of the $\varepsilon_{it}$. An invertible system is obtained by defining, as in Chapter 5, two new variables, $w_t = (\eta'\eta)^{-1}\eta'\varepsilon_t$ and $y_t = (\eta_\perp'\eta_\perp)^{-1}\eta_\perp'\Delta\varepsilon_t$. Repeating the steps used in Chapter 5, the matrices $\bar{\eta}$ and $\bar{\eta}_\perp$ are defined as $\eta(\eta'\eta)^{-1}$ and $\eta_\perp(\eta_\perp'\eta_\perp)^{-1}$ respectively. Next, again as in Chapter 5,
Substituting into (4a) and (4b) gives
We therefore have
with
For $z = 1$, this matrix has determinant
2 The orthogonal complement of a matrix is defined in Sect. 5.3.1. Using this definition, $\phi_\perp$ and $\eta_\perp$ are $n \times r$ dimensional matrices with rank $r$.
which is non-zero, using Assumptions B2 and B3. Thus, $B(z)$ does not have a root at 1. For $|z| > 1$,
where (7) may be shown by substituting for $C^*(z)$ in $B(z)$ in terms of $C(z)$ and $C(1) = \phi\eta'$, and using the orthogonality condition $\phi_\perp'\phi = \eta_\perp'\eta = 0_{r \times (n-r)}$. For $|z| > 1$, from (7),
Thus for $|z| > 1$, $|B(z)| = 0$ if and only if $|C(z)| = 0$. Excluding $z = 1$, by Assumption B1 the only remaining roots of this determinant lie outside the unit circle.
All the roots of $|B(z)| = 0$ are therefore outside the unit disk, and the system defined by (5a) and (5b) is invertible. Thus, from (6),
Also from (6), note that
and, using the formula for inversion of partitioned matrices,
From the definition of $\Delta\varepsilon_t$,
where $F(L) = [\bar{\eta}(1 - L), \ldots]$ [...]
Integrating (9) gives
where $x_0$ is the constant of integration. To derive the value of $F(1)$, note that
Substituting for $(B(1))^{-1}$ from (8) gives $F(1) = \eta_\perp(\phi_\perp' C^*(1)\eta_\perp)^{-1}\phi_\perp'$. Thus, recalling that $\mu = C(1)m = (\phi\eta')m$, $F(1)\mu = 0_{n \times 1}$. The autoregressive representation, in its final form, is therefore given by
Several features of the derivations above are noteworthy, particularly with respect to the $F(1)$ matrix. First, $F(1)C(1) = C(1)F(1) = 0_n$. This result follows from substituting $\eta_\perp(\phi_\perp' C^*(1)\eta_\perp)^{-1}\phi_\perp'$ for $F(1)$ and $\phi\eta'$ for $C(1)$ and using the orthogonality conditions. This re-emphasizes the duality, first mentioned in Chapter 5, between the impact matrix in the MA representation, given by $C(1)$, and the impact matrix in the AR representation, given here by $F(1)$. The null space of the former is the range space of the latter and vice versa.
Second, the isomorphism of $F(1)$ with the $\gamma\alpha'$ matrix in Chapter 5 can be demonstrated easily. Note that
Both $\eta_\perp(\phi_\perp' C^*(1)\eta_\perp)^{-1}$ and $\phi_\perp$ are matrices of rank $r$ and dimension $n \times r$. Thus, redefining $\phi_\perp'$ as $\alpha'$ and $\eta_\perp(\phi_\perp' C^*(1)\eta_\perp)^{-1}$ as $\gamma$, we have $F(1) = \gamma\alpha'$, which is an $n \times n$ matrix with rank $r$ and is isomorphic to $\pi$. It is natural to define $\phi_\perp' x_t$ ($\alpha' x_t$ in Chapter 5) as the co-integrated combinations of the $x_{it}$. Integrating (4b) shows that $\phi_\perp' x_t$ does not contain an integrated component of the form $\sum_{i=1}^{t}\varepsilon_i$. Further, by the orthogonality of $\mu$ with $\phi_\perp$, the co-integrating combinations do not contain a trend. Both these results match exactly the corresponding results on $\alpha' x_t$ in Chapter 5.
Third, if $B(L)$ were not of full rank, it would be possible to extract another unit root in the representation given by (6), and the system would be I(0) instead of I(1), as assumed originally. The importance of Assumption B3 is now clear. Finally, using the result that the rank of $F(1)$ is $r$, it is possible to rewrite the model in error-correction form as
where $F(1)$, like $\pi$ in Chapter 5, is a matrix of rank $r$ and can therefore be decomposed into two $n \times r$ matrices, each of rank $r$. The steps involved in going from the final autoregressive form of the system to the ECM form are given in (5.25)-(5.27), with $\pi$ playing the role of $F(1)$.
Sections 5.3 and 8.1 have demonstrated the isomorphism of the moving-average, error-correction, and autoregressive representations of co-integrated processes. The next section returns to the autoregressive representation and relates this to the method used by Johansen (1988) which tests the rank of $\pi = \gamma\alpha'$, since, if there are $r$ co-integrating vectors and $\gamma\alpha' = \pi$, then rank$(\pi) = r$. The non-uniqueness of these vectors (in the absence of a priori information) is easily seen:
for all $r \times r$ non-singular matrices $P$.
However, since rank$(\alpha) = r$, we can normalize $\alpha^*$ (perhaps after suitable rearrangement of the variables) such that $\alpha^{*\prime} = (I_r : \beta')$, and so $\alpha^{*\prime} x_t = x_{at} + \beta' x_{bt}$ where $x_t' = (x_{at}' : x_{bt}')$.
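The non-uniqueness of the factorization, and the normalization that removes it, can be illustrated with a small numerical sketch. The matrices below are arbitrary, and the function name `normalize` is ours. The only assumptions are those stated in the text: $\pi = \gamma\alpha'$ is unchanged when $\alpha'$ is pre-multiplied by a non-singular $P$ and $\gamma$ is post-multiplied by $P^{-1}$, and the leading $r \times r$ block of $\alpha'$ is invertible (after any reordering of the variables).

```python
import numpy as np


def normalize(alpha, gamma):
    """Rotate (alpha, gamma) so that alpha*' = (I_r : beta') while leaving
    pi = gamma alpha' unchanged. Assumes alpha[:r, :] is non-singular."""
    r = alpha.shape[1]
    P = np.linalg.inv(alpha[:r, :].T)       # P alpha' has I_r as leading block
    alpha_star = alpha @ P.T                # alpha*' = P alpha'
    gamma_star = gamma @ np.linalg.inv(P)   # gamma*  = gamma P^{-1}
    return alpha_star, gamma_star


rng = np.random.default_rng(1)
n, r = 4, 2
alpha = rng.standard_normal((n, r))         # hypothetical co-integrating matrix
gamma = rng.standard_normal((n, r))         # hypothetical loading matrix
pi = gamma @ alpha.T

alpha_star, gamma_star = normalize(alpha, gamma)
beta = alpha_star[r:, :]                    # free coefficients after normalization
print(np.allclose(gamma_star @ alpha_star.T, pi))
print(np.linalg.matrix_rank(pi))
```

Only the $(n - r) \times r$ block `beta` is left free after normalization, which reflects the fact that the co-integrating space, rather than any particular basis for it, is what the data can determine.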
An important point for inference, given (10), is that the ECM terms $\alpha'x_{t-k}$ will generally enter more than one equation. This will violate weak exogeneity when $\alpha$ is a parameter of interest, since the ECMs will be present in some of the other marginal distributions, and will therefore necessitate joint estimation for efficiency as discussed in Chapter 7 (see e.g. Phillips 1991, and Phillips and Hansen 1990). Hence a necessary condition for the use of single-equation methods to be appropriate in the analysis of co-integrated systems is that the relevant ECM terms enter only the equation under study; this is clearly not a sufficient condition, since it is possible that there can be links between other parameters.

As an illustration of (10), consider the case where $n = 2$ and $r = 1$. Let $\alpha' = (1, -\kappa)$ and $x_0 = 0$, so that the respective systems become

The form in (11) is the 'canonical' representation in I(0) space, and Phillips (1991) focuses on estimation of this system. When $E(u_{1t}u_{2t}) \neq 0$, a 'simultaneity problem' is present, but this can be dealt with by the inclusion of $\Delta x_{2t}$ as a regressor in the first equation of (11). The functional central-limit theorems for Wiener processes noted in earlier chapters apply despite the serial dependence in $u_t = [u_{1t}, u_{2t}]'$, and direct estimation of $\kappa$ in the first equation of (11) can be seen as the method originally proposed by Engle and Granger (1987). Inference must, however, allow for the serial dependence in $u_t$.

The latter system, (11'), highlights the 'structural' form. At least one of $\gamma_1$ or $\gamma_2$ must be non-zero, since otherwise the system can be expressed in terms of differenced variables alone. Weak exogeneity is violated by (among other possibilities) $\gamma_1\gamma_2 \neq 0$. Since we are unlikely to know a priori which other equations are influenced by any given ECM, we turn now to a method of estimating the co-integrating rank $r$ of a system, which will also allow tests of this aspect of weak exogeneity.

8.2. Estimating Co-integrating Vectors in Systems

Consider the linear system in (10) rewritten as

$$\Delta x_t = \sum_{i=1}^{k-1} D_i\,\Delta x_{t-i} + \pi x_{t-k} + \varepsilon_t, \qquad (12)$$
where, for simplicity, we have excluded deterministic terms such as trends or constants. We shall return to a consideration of these in Section 8.5. In general, the number of co-integrating vectors will be unknown in empirical modelling, and must first be determined from the data. This step is important, because both under- and over-estimation of $r$ have potentially serious consequences for estimation and inference. Under-estimation implies the omission of empirically relevant error-correction terms, with these omitted terms being relegated to $\varepsilon_t$. Over-estimation implies that the distributions of statistics will be non-standard. This may be demonstrated by inspection of (12). If $\pi$ is correctly specified, all the variables in (12) are I(0) and standard distributional results apply. However, $\pi x_{t-k}$ will not be I(0) if the matrix $\alpha$ contains vectors $\alpha_a$, say, such that $\alpha_a'x_{t-k}$ is not a co-integrating combination and is therefore I(1). The vector $\pi x_{t-k}$ will then have a mixture of I(0) and I(1) terms corresponding to the correct and incorrect (or over-estimated) co-integrating vectors respectively. Incorrect inferences will result from the use of conventional critical values in tests. We will see later that this may also have an adverse effect on forecasting accuracy.

Once $r$ is known, we can proceed to estimate $\alpha$ and $\gamma$, noting that non-singular linear combinations of these matrices provide equivalent representations. Indeed, $(\alpha : \gamma)$ is an over-parameterization of $\pi$, so only the dimension of the co-integrating space can be established directly.

A test for the null hypothesis that there are $r$ co-integrating vectors can be based on the maximum likelihood approach proposed by Johansen (1988). The test is equivalent to testing whether $\pi = \gamma\alpha'$, where $\alpha$ and $\gamma$ are $n \times r$; hence it is a test of the hypothesis that $\pi$ has less than full rank.

We emphasize that, of the three distinct cases, (i) $r = n$, (ii) $r = 0$, and (iii) $0 < r < n$, only case (iii) will be considered formally. We have already shown that case (i) implies that all the variables in $x_t$ are I(0) and would only be of interest if our initial assumption, that $x_t$ is I(1), were incorrect. In case (ii), $\pi = 0$ and the system ought to be respecified in differences to achieve stationarity. We can potentially cover this case as an extreme of case (iii).

For $0 < r < n$, under the assumptions that (12) is the DGP, that all coefficient matrices are constant, that $x_{1-k}, \ldots, x_0$ are given and that$^{3}$

$$\varepsilon_t \sim \mathsf{IN}(0, \Omega),$$

3 Phillips and Durlauf (1986) derive the limiting distribution of the least-squares estimator of (the equivalent of) $\pi$, allowing for more general error processes.
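the log-likelihood function is derived from the multivariate normal distribution:$^{4}$

$$L(D_1, \ldots, D_{k-1}, \pi, \Omega) = -\frac{Tn}{2}\log(2\Pi) - \frac{T}{2}\log|\Omega| - \frac{1}{2}\sum_{t=1}^{T}\varepsilon_t'\Omega^{-1}\varepsilon_t.$$

The first step is to concentrate $L(\cdot)$ with respect to $\Omega$, which involves no new considerations, and yields the conventional result that $\hat\Omega = T^{-1}\sum_{t=1}^{T}\varepsilon_t\varepsilon_t'$. Next, we remove the known I(0) variables from (12) to focus on the matrix of interest $\pi$, which requires concentrating $L(\cdot)$ with respect to $(D_1, \ldots, D_{k-1})$. To do so, since the $\{D_i\}$ are unrestricted, we can partial out the effects of $(\Delta x_{t-1}, \ldots, \Delta x_{t-k+1})$ from both $\Delta x_t$ and $x_{t-k}$ by regression, to obtain residuals $R_{0t}$ and $R_{kt}$ respectively. Let $q_t = (\Delta x_{t-1}', \ldots, \Delta x_{t-k+1}')'$; then

$$R_{0t} = \Delta x_t - \Big(\sum_{s}\Delta x_s q_s'\Big)\Big(\sum_{s} q_s q_s'\Big)^{-1} q_t, \qquad R_{kt} = x_{t-k} - \Big(\sum_{s} x_{s-k} q_s'\Big)\Big(\sum_{s} q_s q_s'\Big)^{-1} q_t.$$

The concentrated likelihood function $L^*(\pi)$ now depends only on $\{R_{0t}, R_{kt}\}$ and takes the form

$$L^*(\pi) = K - \frac{T}{2}\log\Big|\,T^{-1}\sum_{t=1}^{T}(R_{0t} - \pi R_{kt})(R_{0t} - \pi R_{kt})'\Big|. \qquad (18)$$

Next, we compute the second-moment matrices of all of these residuals and their cross-products, $S_{00}$, $S_{0k}$, $S_{k0}$, $S_{kk}$, where

$$S_{ij} = T^{-1}\sum_{t=1}^{T} R_{it}R_{jt}', \qquad i, j = 0, k.$$

4 Note that we use the upper-case $\Pi$ for the ratio of the circumference of a circle to its diameter, as opposed to the lower-case $\pi$ defined earlier as the matrix product $\gamma\alpha'$.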
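Consequently, from (18),

$$L^*(\pi) = K - \frac{T}{2}\log\big|S_{00} - \pi S_{k0} - S_{0k}\pi' + \pi S_{kk}\pi'\big|. \qquad (20)$$

If $\pi$ were unrestricted, a conventional regression estimator would result. However, we are interested in the class of solutions that result from the imposition of the restriction that $\pi = \gamma\alpha'$. Hence, from (20),

$$L^*(\gamma, \alpha) = K - \frac{T}{2}\log\big|S_{00} - \gamma\alpha'S_{k0} - S_{0k}\alpha\gamma' + \gamma\alpha'S_{kk}\alpha\gamma'\big|. \qquad (21)$$

Next, concentrate $L^*(\gamma, \alpha)$ with respect to $\gamma$, which will deliver an expression for the MLE of $\gamma$ as a function of $\alpha$, and yields a further concentrated likelihood function which depends only on $\alpha$. Once the MLE of $\alpha$ is obtained, we can solve backwards for estimates of all the other unknown parameters as functions of the MLE of $\alpha$. Thus, from (21),

$$\hat\gamma = S_{0k}\alpha(\alpha'S_{kk}\alpha)^{-1}.$$

Substituting $\hat\gamma$ into (21) yields $L^{**}(\alpha)$:

$$L^{**}(\alpha) = K - \frac{T}{2}\log\big|S_{00} - S_{0k}\alpha(\alpha'S_{kk}\alpha)^{-1}\alpha'S_{k0}\big|. \qquad (23)$$

At first sight, differentiating $L^{**}(\alpha)$ with respect to $\alpha$ looks formidable, but in fact the algebra involved is close to that underlying the well-known LIML estimator for a single equation from a simultaneous system; both depend on reduced-rank restrictions being imposed. In order to solve the problem, we apply partitioned inversion results to (23) and obtain

$$\big|S_{00} - S_{0k}\alpha(\alpha'S_{kk}\alpha)^{-1}\alpha'S_{k0}\big| = \frac{|S_{00}|\,\big|\alpha'S_{kk}\alpha - \alpha'S_{k0}S_{00}^{-1}S_{0k}\alpha\big|}{|\alpha'S_{kk}\alpha|}.$$

Then maximizing $L^{**}(\alpha)$ with respect to $\alpha$ corresponds to minimizing the generalized variance ratio,

$$\frac{\big|\alpha'(S_{kk} - S_{k0}S_{00}^{-1}S_{0k})\alpha\big|}{|\alpha'S_{kk}\alpha|},$$

noting that $|S_{00}|$ is a constant. To locate that minimum, we proceed as with LIML and impose the normalization that $\alpha'S_{kk}\alpha = I$. The MLE now requires that we minimize, with respect to $\alpha$,

$$\big|\alpha'S_{kk}\alpha - \alpha'S_{k0}S_{00}^{-1}S_{0k}\alpha\big| \quad \text{subject to} \quad \alpha'S_{kk}\alpha = I.$$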
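The concentration steps above translate into a few lines of linear algebra. What follows is a minimal sketch under stated assumptions, not the authors' code: it assumes NumPy, omits deterministic terms, and uses a function name (`johansen_eigen`) and simulated data of our own. It partials out the lagged differences, forms $S_{00}$, $S_{0k}$, $S_{kk}$, and solves the eigenvalue problem $|\lambda S_{kk} - S_{k0}S_{00}^{-1}S_{0k}| = 0$ to which the normalized minimization leads in Johansen (1988); the Cholesky route used to solve it is simply one convenient choice.

```python
import numpy as np

def johansen_eigen(x, k=1):
    """Reduced-rank regression for dx_t = sum_i D_i dx_{t-i} + pi x_{t-k} + e_t.

    Returns the eigenvalues (largest first), eigenvectors alpha normalized so
    that alpha' S_kk alpha = I, the tuple (S_00, S_0k, S_kk), and the
    effective sample size T.
    """
    x = np.asarray(x, dtype=float)
    dx = np.diff(x, axis=0)
    T = x.shape[0] - k
    d0 = dx[k - 1:]                       # dx_t for the T usable dates
    xk = x[:T]                            # x_{t-k} aligned with d0
    if k > 1:
        # q_t stacks dx_{t-1}, ..., dx_{t-k+1}; partial it out by regression.
        q = np.hstack([dx[k - 1 - i:k - 1 - i + T] for i in range(1, k)])
        r0 = d0 - q @ np.linalg.lstsq(q, d0, rcond=None)[0]
        rk = xk - q @ np.linalg.lstsq(q, xk, rcond=None)[0]
    else:
        r0, rk = d0, xk
    s00 = r0.T @ r0 / T
    s0k = r0.T @ rk / T
    skk = rk.T @ rk / T
    # Turn the generalized problem into a symmetric one: S_kk = C C'.
    c = np.linalg.cholesky(skk)
    a = s0k.T @ np.linalg.solve(s00, s0k)            # S_k0 S_00^{-1} S_0k
    m = np.linalg.solve(c, np.linalg.solve(c, a).T)  # C^{-1} a C^{-T}
    lam, v = np.linalg.eigh((m + m.T) / 2.0)
    order = np.argsort(lam)[::-1]
    lam, v = lam[order], v[:, order]
    alpha = np.linalg.solve(c.T, v)                  # alpha' S_kk alpha = I
    return lam, alpha, (s00, s0k, skk), T

# Simulated example (illustrative values): y_t = 2 z_t + u_t with z_t a random walk.
rng = np.random.default_rng(0)
z = np.cumsum(rng.standard_normal(2000))
y = 2.0 * z + rng.standard_normal(2000)
lam, alpha, (s00, s0k, skk), T = johansen_eigen(np.column_stack([y, z]), k=1)

beta_hat = -alpha[1, 0] / alpha[0, 0]     # first vector rescaled to (1, -beta)
gamma_hat = s0k @ alpha[:, :1]            # gamma-hat = S_0k alpha under the normalization
max_stats = -T * np.log(1.0 - lam)        # one statistic per eigenvalue
print(lam, beta_hat, max_stats)
```

In this design one eigenvalue is well away from zero and the other is close to it, matching a single co-integrating vector; the first eigenvector, rescaled, estimates $(1, -\beta)$.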
This involves finding the saddle-point of the Lagrangian, where

, ij, and p. Johansen shows that, asymptotically, this procedure determines the correct parameters. He also obtains the relevant limiting distributions of the estimators.

8.5.8. Weak Exogeneity and Conditional Models

Most large-scale econometric systems and many other empirical models are open in the sense that they treat a subset of the variables as 'exogenous'. In this sub-section, we will focus on the potential weak exogeneity of contemporaneous conditioning variables for the parameters of interest in I(1) co-integrated systems (see Engle et al. 1983). As discussed in Chapter 1, weak exogeneity requires that there is no loss of information about the parameters of interest in reducing the analysis from the joint distribution to a conditional model. The concept was developed initially in the context of stationary processes, but as the results in Chapter 7 suggested, it plays an important role in I(1) systems as well.

In particular, when the vector of observables $x_t$ is I(1) there can be cross-equation links between parameters, which are induced by the occurrence in several equations of common co-integrating combinations $\alpha'x_t$. If $\alpha'x_t$ enters both the $i$th and $j$th equations, then $x_{jt}$ cannot be weakly exogenous for the parameters of the $i$th equation, since the parameters of the two equations share common components of $\alpha'x_t$ and so cannot be variation free. Failure to account for such parameter dependencies can adversely affect the validity of inference in finite samples (see Chapter 7, Phillips 1991, Phillips and Loretan 1991, and Hendry and Mizon 1992).

To develop notation for an I(1) open system, two partitions of $x_t$ are needed. To exposit the basic idea, it is convenient to return to the first-order system in (38) above, written as

where $\varepsilon_t \sim \mathsf{IN}(0, \Sigma)$ and $\alpha'$ is $r \times n$ of rank $r$. First, we have the usual
transformed partition of $x_t$ into $w_t = (x_t'\alpha : \Delta x_{bt}')'$, capturing the locations of the unit roots and the co-integrating vectors, where there are $r$ elements in $x_t'\alpha$ and $(n - r)$ in $\Delta x_{bt}$. The history of the process up to time $t - 1$ is denoted in I(0) space by $W_{t-1} = (w_1, \ldots, w_{t-1})$. Second, we partition $\Delta x_t$ into $(\Delta x_{1t}' : \Delta x_{2t}')'$, where $\Delta x_{2t}$ is $m \times 1$ and is to be treated as weakly exogenous for the vector parameter of interest $\phi \in \Phi$, which includes those elements of $\alpha$ and $\gamma$ relevant to $\Delta x_{1t}$. For later use, we explicitly write out $\pi x_{t-1}$ in terms of $(x_{1t-1}' : x_{2t-1}')'$, when there are $r_1 + r_2 = r$ co-integrating relations in the two blocks, namely

$$\gamma\alpha'x_{t-1} = \begin{pmatrix}\gamma_{11} & \gamma_{12}\\ \gamma_{21} & \gamma_{22}\end{pmatrix}\begin{pmatrix}\alpha_{11}' & \alpha_{12}'\\ \alpha_{21}' & \alpha_{22}'\end{pmatrix}\begin{pmatrix}x_{1t-1}\\ x_{2t-1}\end{pmatrix}. \qquad (55)$$

The dimensions of $\gamma_{11}$, $\gamma_{12}$, $\gamma_{21}$, and $\gamma_{22}$ are $(n - m) \times r_1$, $(n - m) \times r_2$, $m \times r_1$, and $m \times r_2$ respectively; and, correspondingly, $\alpha_{11}'$, $\alpha_{12}'$, $\alpha_{21}'$, and $\alpha_{22}'$ are $r_1 \times (n - m)$, $r_1 \times m$, $r_2 \times (n - m)$, and $r_2 \times m$. If $r_2 = 0$, then the relevant elements are set to zero. Since the analysis in terms of $w_t$ is in I(0) space, the approach in Engle et al. applies.

The complete set of parameters of the joint distribution is $\theta \in \Theta$, and these are mapped one-for-one to $f(\theta) = \lambda \in \Lambda$, and partitioned into $\lambda = (\lambda_1 : \lambda_2)'$ where $\lambda_1 \in \Lambda_1$ and $\lambda_2 \in \Lambda_2$. Factorize the joint sequential density $D_x(\Delta x_t \mid W_{t-1}, \theta)$ of $\Delta x_t$ into its conditional and marginal components:

$$D_x(\Delta x_t \mid W_{t-1}, \theta) = D_{x_1|x_2}(\Delta x_{1t} \mid \Delta x_{2t}, W_{t-1}, \lambda_1)\, D_{x_2}(\Delta x_{2t} \mid W_{t-1}, \lambda_2). \qquad (56)$$

Since $w_{t-1} = (x_{t-1}'\alpha : \Delta x_{bt-1}')'$, all the information on the co-integrating vectors is retained in $W_{t-1}$. Consequently, $\Delta x_{2t}$ is weakly exogenous for $\phi$ if $\phi$ depends on $\lambda_1$ alone, and $\lambda_1$ and $\lambda_2$ are variation free, so that $\Lambda = \Lambda_1 \times \Lambda_2$. Weak exogeneity of $\Delta x_{2t}$ for $\phi$ cannot occur when $\lambda_1$ and $\lambda_2$ both depend on common components of $\alpha$.

As a consequence of the normality assumption, and using the expression in (55) for $\gamma\alpha'x_{t-1}$, conditioning $\Delta x_{1t}$ on $\Delta x_{2t}$ leads to the mean of the conditional density:
where $W = \Sigma_{12}\Sigma_{22}^{-1}$. Thus, a necessary condition for the weak exogeneity of $\Delta x_{2t}$ for $(\gamma_{11} : \alpha_{11} : \alpha_{12})$ is that either $\{\gamma_{12} - W\gamma_{22}\} = 0$ or $\gamma_{22} = 0$; i.e. $(\alpha_{21}'x_{1t-1} + \alpha_{22}'x_{2t-1})$ appears in only one of $D_{x_1|x_2}(\cdot)$ or $D_{x_2}(\cdot)$, but not both. Further, unless $\gamma_{21} = 0$, then $(\alpha_{11}'x_{1t-1} + \alpha_{12}'x_{2t-1})$ will appear in the marginal distribution of $\Delta x_{2t}$, so $\gamma_{21} = 0$ is also necessary. There are sufficient conditions for these necessary conditions to hold, including $\gamma_{21} = 0$, $\gamma_{22} = 0$ and $\gamma_{12} = 0$, where the latter two arise because $r_2 = 0$. Such conditions can be tested using the approach in Johansen (1992b) and Johansen and Juselius (1990).

Short-run parameters may depend on some of the elements in $\alpha$ without jeopardizing efficient inferences about long-run parameters of interest. However, if all the elements of $\phi$ are of interest, then again variation-free parameters are required, and any cross-restrictions violate weak exogeneity.

To illustrate this analysis, reconsider the example in equations (31) and (32) and (60) of Chapter 7. There is one co-integrating vector with parameter $\beta$, $r_1 = r = 1$, $r_2 = 0$, $m = 1$, and $n = 2$:

This representation is in terms of $w_t$ (see (38) above) but is written as a triangular system error correction as in Phillips (1991), imposing a specific first-order autoregressive parametric form for the error process $u_t$ (compared with the general processes allowed by Phillips):

The unconditional covariance matrix of $u_t$ is $\mathrm{plim}\; T^{-1}\sum u_t u_t' = G$, derived in Section 8.5.1. Let $c_{12} = c_{22} = 0$ since these parameters only determine the presence of the lagged difference of $x_{2t}$, and do not affect co-integration vectors. Then the long-run covariance matrix is (see Ch. 7 appendix):

where $\omega_{11} = \sigma_{11}/(1 - c_{11})^2$ and $\omega_{12} = \sigma_{12}/(1 - c_{11})$. The non-diagonality of $\Omega$ implies that there is information about the parameters of each equation in the other. However, by conditioning $\Delta x_{1t}$ on $\Delta x_{2t}$ in the first equation, the $\sigma_{12}$ effect is removed. Then even if the first equation is dynamic, so $c_{11} \neq 0$, the diagonality of $\Omega$ only depends on $c_{21} = 0$. When $c_{21} \neq 0$, the long-run covariance matrix is non-diagonal and there
is a loss of weak exogeneity, which can have a detrimental impact on the bias and efficiency of the least-squares estimator of $\beta$ in finite samples. Note that $c_{12} \neq 0$ can be corrected within the first equation treated in isolation by adding lagged $\Delta x_{2t}$, but that $c_{21} \neq 0$ requires modelling the system (although corrections based on adding leads of $\Delta x_{2t}$ have been proposed to exploit the obverse Granger causality of $x_1$ on $x_2$: see Stock and Watson 1991).

We now derive the conditional and marginal factorizations. In terms of observables, the original system from Chapter 7 can be written as $w_t = Cw_{t-1} + e_t$, or

Rewritten as a VAR in I(0) variables as in (37), we have

where $d_{12} = c_{12} + \beta c_{22}$, $\gamma_{11} = (c_{11} - 1 + \beta c_{21})$, $d_{22} = c_{22}$, and $\gamma_{21} = c_{21}$. The restricted first column of $D$ is an incidental effect from assuming a first-order autoregressive error initially.

Finally, solving for the conditional and marginal representations, we have

where $W = \sigma_{12}\sigma_{22}^{-1}$, $\lambda_{11} = (\beta + W)$, $\lambda_{12} = (c_{11} - 1 - Wc_{21})$, $\lambda_{13} = (c_{12} - Wc_{22})$, $\lambda_{21} = c_{21}$, $\lambda_{22} = c_{22}$, and $E[v_t\varepsilon_{2t}] = 0$. Assume that $\phi = (\lambda_{11} : \lambda_{12} : \lambda_{13} : \beta)'$ is the vector parameter of interest. When $\lambda_{21} = 0$, least-squares estimation of $\phi$ from the first equation involves no loss of information. In fact, $x_{2t}$ is strongly exogenous for $\phi$ in such a system. However, when $\lambda_{21} \neq 0$, $\Delta x_{2t}$ is not weakly exogenous for $\phi$ and the analysis is not fully efficient. Monte Carlo studies (e.g. Phillips and Loretan 1991) confirm the impact of this loss of efficiency in finite samples (see Chapter 7).

Irrespective of the value of $\lambda_{21}$, the first equation in (62) is the conditional expectation from (58), namely

Thus, once data are I(1) but co-integrated, the fact that an equation coincides with the conditional expectation is not sufficient to justify single-equation least-squares modelling. Rather surprisingly, weak exogeneity is at least as important in I(1) processes as in I(0) processes.
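The coefficient definitions in this example can be retraced step by step. The derivation below is a sketch under an assumption of ours: it takes the Chapter 7 example to be the triangular system $x_{1t} = \beta x_{2t} + u_{1t}$, $\Delta x_{2t} = u_{2t}$ with $u_t = Cu_{t-1} + e_t$, and writes $\sigma_{ij}$ for the elements of the covariance matrix of $e_t$. Under that reading, $u_{1t-1} = x_{1t-1} - \beta x_{2t-1}$ and $u_{2t-1} = \Delta x_{2t-1}$, so that

$$\begin{aligned}
\Delta x_{2t} &= c_{21}(x_{1t-1} - \beta x_{2t-1}) + c_{22}\Delta x_{2t-1} + e_{2t},\\
\Delta x_{1t} &= \beta\Delta x_{2t} + \Delta u_{1t}
 = \beta\Delta x_{2t} + (c_{11} - 1)(x_{1t-1} - \beta x_{2t-1}) + c_{12}\Delta x_{2t-1} + e_{1t}.
\end{aligned}$$

Substituting the first line into the second gives the VAR coefficients quoted in the text, $\gamma_{11} = c_{11} - 1 + \beta c_{21}$ and $d_{12} = c_{12} + \beta c_{22}$. For the conditional model, normality allows $e_{1t} = We_{2t} + v_t$ with $W = \sigma_{12}\sigma_{22}^{-1}$ and $E[v_t e_{2t}] = 0$; replacing $e_{2t}$ by $\Delta x_{2t} - c_{21}(x_{1t-1} - \beta x_{2t-1}) - c_{22}\Delta x_{2t-1}$ yields

$$\Delta x_{1t} = (\beta + W)\Delta x_{2t} + (c_{11} - 1 - Wc_{21})(x_{1t-1} - \beta x_{2t-1}) + (c_{12} - Wc_{22})\Delta x_{2t-1} + v_t,$$

which reproduces $\lambda_{11}$, $\lambda_{12}$, and $\lambda_{13}$. The marginal equation keeps the error-correction term with coefficient $\lambda_{21} = c_{21}$, so $\beta$ enters both equations unless $c_{21} = 0$; this is the cross-equation link that breaks weak exogeneity.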
8.6. A Second Example of the Johansen Maximum Likelihood Approach

We reconsider the UK seasonally adjusted quarterly data from Sect. 7.6 on money, prices, output, and interest rates, this time treated as a system, represented by a VAR with two lags on each of $m - p$, $\Delta p$, $x_{85}$, and $R_n$, plus a constant and a trend. The lag length was selected by commencing at five lags on every variable, and sequentially testing from the highest order. The sample was 1964(3)-1989(2). The residual standard deviations of the four equations were 0.0161, 0.0069, 0.0126, and 0.0127 respectively, and on recursive F-tests all four equations had acceptably constant coefficients using one-off I(0) critical values. The residuals also yielded insignificant outcomes on $\chi^2$ tests for autocorrelation but not for normality.

In almost every instance, two co-integrating combinations were significant (i.e. two unit roots were rejected); the second of these was virtually the same in all lag specifications, but the first was often a linear combination of the first two rows reported in Table 8.9. Such a finding matches that in Hendry and Mizon (1992) and Ericsson et al. (1991). Beginning with the largest statistics, two of the tests in each column are significant (see Osterwald-Lenum 1992: Table 2).

TABLE 8.8. Eigenvalues, test statistics, and 5 per cent critical values

r    Eigenvalue $\mu_i$    $-T\log(1-\mu_i)$    5% c.v.    $-T\sum\log(1-\mu_i)$    5% c.v.
0    0.517240              72.82                30.33      109.17                   54.64
1    0.249694              28.73                23.78       36.34                   34.55
2    0.060350               6.22                16.87        7.62                   18.17
3    0.013817               1.39                 3.74        1.39                    3.74

The corresponding eigenvectors are shown in Table 8.9, in rows, augmented by the two non-co-integrating combinations in the last two rows.

TABLE 8.9. Normalized eigenvectors $\alpha'$

Variable      $m - p$    $\Delta p$    $x_{85}$    $R_n$
$\alpha_1'$    1.0000     6.3966       -0.8938      7.6838
$\alpha_2'$    0.0311     1.0000       -0.3334     -0.1377
$v_1'$        -0.2633     0.9435        1.0000     -1.2117
$v_2'$         0.9838     4.5659       -0.7701      1.0000

The first row suggests the following long-run solution for the money equation:

This is close to that found from the single-equation dynamic analysis in Chapter 7. No trend is required. The $\gamma$ matrix is given in Table 8.10. Only the first entry in the first column is at all large, so that the first co-integrating vector only affects the first equation, consistent with the weak exogeneity of $x_{85}$, $R_n$, and $\Delta p$ for the parameters of the money-demand equation. This again matches the finding over a shorter sample in Hendry and Mizon (1992).

The second row of Table 8.9 delivers the approximate long-run solution

This corresponds to the impact of excess demand, as measured by the deviation of output from its linear trend, on inflation, with a small and possibly insignificant effect from interest rates. No additional trend is then required. The second column of $\gamma$ shows a large effect of this ECM on all four equations, violating any possibility of treating any of the four variables as weakly exogenous in a model of inflation or excess demand when the parameters of interest include the long-run multipliers.

When the ordering of variables is $(m - p, \Delta p, x_{85}, R_n)$ the long-run $\pi$ matrix is

$$\pi = \begin{pmatrix}
-0.082 & -0.245 & -0.081 & -0.761\\
-0.009 & -0.474 & 0.164 & 0.112\\
0.007 & 0.146 & -0.108 & -0.147\\
-0.021 & -0.119 & 0.149 & -0.059
\end{pmatrix}$$
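The figures in Tables 8.8 to 8.10 can be cross-checked numerically. The sketch below is an illustration of ours rather than part of the original analysis. It assumes an effective sample of $T = 100$ (the number of quarters in 1964(3)-1989(2)), recomputes both columns of test statistics from the eigenvalues, applies the usual sequential rule to choose $r$, and compares the reported long-run matrix with $\gamma\alpha'$ formed from all four rows of Table 8.9 and all four columns of Table 8.10. All variable names are hypothetical.

```python
import numpy as np

# Eigenvalues (largest first) and 5 per cent critical values from Table 8.8.
mu = np.array([0.517240, 0.249694, 0.060350, 0.013817])
cv_max = np.array([30.33, 23.78, 16.87, 3.74])
cv_trace = np.array([54.64, 34.55, 18.17, 3.74])
T = 100  # assumption: quarters in 1964(3)-1989(2)

# One statistic per eigenvalue, and the cumulated version that sums the
# terms for the remaining (smaller) eigenvalues.
max_stat = -T * np.log(1.0 - mu)
trace_stat = np.cumsum(max_stat[::-1])[::-1]

def selected_rank(stats, critical_values):
    """Test r = 0, 1, ... in turn and stop at the first non-rejection."""
    for r, (s, c) in enumerate(zip(stats, critical_values)):
        if s <= c:
            return r
    return len(stats)

rank_max = selected_rank(max_stat, cv_max)
rank_trace = selected_rank(trace_stat, cv_trace)

# Rows of Table 8.9 and the adjustment coefficients of Table 8.10, with
# variables ordered as (m - p, Delta p, x85, Rn).
alpha_t = np.array([
    [1.0000, 6.3966, -0.8938, 7.6838],
    [0.0311, 1.0000, -0.3334, -0.1377],
    [-0.2633, 0.9435, 1.0000, -1.2117],
    [0.9838, 4.5659, -0.7701, 1.0000],
])
gamma = np.array([
    [-0.0952, 0.4268, -0.0300, -0.0076],
    [0.0048, -0.5147, -0.0013, 0.0024],
    [-0.0210, 0.2578, -0.0318, 0.0116],
    [-0.0001, -0.2253, 0.0796, 0.0069],
])
pi_reported = np.array([
    [-0.082, -0.245, -0.081, -0.761],
    [-0.009, -0.474, 0.164, 0.112],
    [0.007, 0.146, -0.108, -0.147],
    [-0.021, -0.119, 0.149, -0.059],
])

pi_full = gamma @ alpha_t                    # unrestricted long-run matrix
pi_rank2 = gamma[:, :2] @ alpha_t[:2, :]     # keeping only the two ECMs
max_gap = np.abs(pi_full - pi_reported).max()
print(np.round(max_stat, 2), np.round(trace_stat, 2))
print(rank_max, rank_trace, max_gap)
```

Under these assumptions both sequences stop at $r = 2$, and the product of the two tables agrees with the reported matrix up to rounding. `pi_rank2` shows what the long-run matrix looks like once the two non-co-integrating combinations are dropped.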
TABLE 8.10. Adjustment coefficients $\gamma$

Variable      $\gamma_1$    $\gamma_2$
$m - p$       -0.0952        0.4268      -0.0300     -0.0076
$\Delta p$     0.0048       -0.5147      -0.0013      0.0024
$x_{85}$      -0.0210        0.2578      -0.0318      0.0116
$R_n$         -0.0001       -0.2253       0.0796      0.0069

8.7. Asymptotic Distributions of Estimators of Co-integrating Vectors in I(1) Systems

Gonzalo (1990) reviews and compares the various alternatives to OLS for the estimation of co-integrating vectors, including those proposed by Stock (1987), Stock and Watson (1988b), Johansen (1988), Phillips (1988a), and Phillips and Hansen (1990). While all of the suggested methods share the super-consistency property, we have seen that there can be substantial differences in their performance on moderately sized samples.

Gonzalo makes the comparison on a simple data generation process in which co-integration holds between the I(1) series $z_t$ and $y_t$:

and
This system is a special case of (58) and can therefore be represented in the error-correction form

where $u_{1t} = \beta\varepsilon_{2t} + \varepsilon_{1t}$, $u_{2t} = \varepsilon_{2t}$, and $E(uu') = \Lambda$, with

The logarithm of the likelihood function for the ECM is therefore

$$L(\alpha, \gamma, \Lambda) = K - \frac{T}{2}\ln|\Lambda| - \frac{1}{2}\sum_{t=1}^{T}(\Delta x_t - \gamma\alpha'x_{t-1})'\Lambda^{-1}(\Delta x_t - \gamma\alpha'x_{t-1}), \qquad (66)$$

where $x_t = (y_t, z_t)'$, $\gamma = (\rho - 1, 0)'$, $\alpha' = (1, -\beta)$, and $\gamma\alpha'$ is the $2 \times 2$ matrix of rank 1 given in (64).

The systems (63) and (64) have the property that $z_t$ is weakly exogenous for $\beta$. Since the $u_{it}$ are normally distributed (from (63)), take conditional expectations in (64)

Taking the covariances of the $u_t$ from (65), we have
The parameter $\beta$ is recoverable from (67). Moreover, $\beta$ does not enter the marginal distribution.

Weak exogeneity of $z_t$ for $\beta$ implies that inference concerning $\beta$ can be carried out with no loss of information by using the density of $y_t$ conditional on $z_t$ and ignoring the marginal density of $z_t$ (that is, the DGP of $z_t$). It is then not surprising that, when the log-likelihood is formally split into a conditional and a marginal likelihood, the marginal density contains no information about $\beta$. That is, (66) can be rewritten as

with $\Lambda_0 = \Lambda_{11} - \Lambda_{12}\Lambda_{22}^{-1}\Lambda_{21}$, $\xi_t = \Delta y_t - (\rho - 1)(y_{t-1} - \beta z_{t-1}) - \psi\Delta z_t$, and, finally, $\psi = \Lambda_{12}\Lambda_{22}^{-1} = (\beta + \theta\sigma_1/\sigma_2)$; $\psi$ can be interpreted as a short-run multiplier, being the coefficient on $\Delta z_t$ in (67), while the long-run multiplier is $\beta$, from (63). The term in parentheses in (68) is the marginal likelihood of $z_t$ (or $\Delta z_t$) and does not involve $\beta$; estimation of $\beta$ can be carried out by maximizing the conditional likelihood alone. The estimate is that which would be obtained from OLS in the regression corresponding to (67).

In order to discuss the asymptotic properties of different estimation methods, we use the multivariate functional central-limit theorem and transformation to the unit interval described in Chapter 6. For the vector $e_t = (v_t, \varepsilon_{2t})'$, let $p_t = p_{t-1} + e_t$. Then

with $B(r) = (B_1(r), B_2(r))'$. The long-run covariance matrix of this bivariate Brownian motion process can be calculated as in the appendix to Chapter 7:
Further,
where
Hence
Results on the asymptotic distributions of the different estimators of co-integrating parameters will be stated without proof, but can be found in Gonzalo (1990).

(i) Static regression estimated by OLS. For $x_t$ generated by (63), the OLS estimator of $\beta$ in a static regression has the asymptotic distribution

using the decomposition $B_1(s) = \omega_{12}\omega_{22}^{-1}B_2(s) + \ldots$

... $\psi = \beta$ implies $\theta = 0$, and so $A_2 = A_3 = 0$. While the limiting distribution above is specific to the DGP (63), $\psi = \beta$ will typically only arise because of an absence of lagged values of $z_t$ and $y_t$ from the DGP; if for example $y_t = \psi z_t + \gamma_1 y_{t-1} + \gamma_2 z_{t-1} + \text{error}$, then the long-run multiplier is $\beta = (\psi + \gamma_2)/(1 - \gamma_1)$, in which case $\gamma_1 = \gamma_2 = 0$ is sufficient for $\beta = \psi$. A common factor ($\gamma_2 = -\psi\gamma_1$) is necessary and sufficient. The terms $A_2$ and $A_3$ above can be eliminated when $\psi \neq \beta$ by the use of other estimation methods, as will be seen below.
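The relation between the short-run and long-run multipliers in the dynamic example above is easy to verify with a few numbers. The following sketch is purely illustrative: the function names and parameter values are ours, and it simply evaluates $\beta = (\psi + \gamma_2)/(1 - \gamma_1)$ and the common-factor condition $\gamma_2 = -\psi\gamma_1$.

```python
import math

def long_run_multiplier(psi, gamma1, gamma2):
    """Static solution of y_t = psi*z_t + gamma1*y_{t-1} + gamma2*z_{t-1} + error."""
    if gamma1 == 1.0:
        raise ValueError("gamma1 = 1 gives no finite long-run solution")
    return (psi + gamma2) / (1.0 - gamma1)

def has_common_factor(psi, gamma1, gamma2, tol=1e-12):
    """True when gamma2 = -psi * gamma1, the condition for beta = psi."""
    return abs(gamma2 + psi * gamma1) <= tol

# Hypothetical parameter values: static, common-factor, and general dynamics.
cases = [(0.5, 0.0, 0.0), (0.5, 0.4, -0.2), (0.5, 0.4, 0.1)]
for psi, g1, g2 in cases:
    beta = long_run_multiplier(psi, g1, g2)
    print(psi, g1, g2, beta, has_common_factor(psi, g1, g2))
```

In the first two cases the long-run and short-run multipliers coincide; in the third, where the common-factor restriction fails, they differ.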
(ii) Non-linear least squares (Stock 1987). This method, which eliminates the bias contained in (70c), consists in minimizing the sum of squared residuals defined as

which is non-linear in that the coefficient on $z_{t-1}$ in the corresponding regression model is $\gamma_1\beta$. The coefficient $\beta$ can however be recovered from the ordinary linear regression

The asymptotic distribution of this NLS estimator is similar to that in (69), but with the term (70c) omitted and (70b) modified to

Comparing (70b) and (70b'), we see that (70b') contains a factor of $\psi$ rather than $(\psi - \beta)$. As (70b) is one of the terms responsible for second-order bias, it seems likely that OLS will perform relatively well when $\psi - \beta = 0$, reducing the bias in (70b), and that NLS will perform relatively well when $\psi = 0$, reducing the bias in (70b'). In the Monte Carlo study of Stock (1987), the DGP chosen implies that $\psi = 0$, leading to the superiority of the NLS technique; where $\psi = \beta$, however, OLS may do better. Recall from the definition of $\psi$ that $\psi = \beta$ if $\theta$, a scaling factor for the correlation between the underlying white-noise disturbances in $y_t$ and $z_t$, is equal to zero.

(iii) Full-information maximum likelihood (FIML). The FIML procedure of Johansen (1988) for estimating the matrix $\alpha$ of co-integrating vectors in a system is described above. Gonzalo shows that, for the DGP (63), the FIML estimator of $\beta$ has the asymptotic distribution

where $A_1$ is as given in (70a). Therefore (71) is equivalent to (69) with terms $A_2$ and $A_3$ eliminated. FIML estimation eliminates two sources of bias: the non-symmetry caused by $\psi \neq \beta$, which leads to a bias in median (term (70b)), and the simultaneous-equations bias, which is a bias in mean (term (70c)) and results when the long-run covariance between $z_t$ and $v_t$ in (63) is not accounted for. The FIML estimator is asymptotically symmetrically distributed.
Moreover, the asymptotic distribution given in (71) is a mixture of normals. (Recall that in (70a) $B_2(s)$ and $W(s)$ are independent Brownian motion processes.) As a result, standard asymptotic chi-squared hypothesis tests are valid.

(iv) Other estimators. Stock and Watson (1988b) and Bossaerts (1988) propose additional methods of estimation based on principal components and canonical correlations respectively.

The principal-component method finds the linear combination of $y_t$ and $z_t$ with minimum variance, which amounts to finding the co-integrating vector. Given the covariance matrix of $(y_t, z_t)$, the principal-component estimate of the co-integrating vector is the eigenvector corresponding to the smallest eigenvalue of this covariance matrix. For the DGP (63), its asymptotic distribution is like that of OLS as given in (69), with the addition of a fourth term grouped with $A_1$, $A_2$ and $A_3$. Calling this term $A_4$,

The additional term affects the bias in mean, which may be larger or smaller than that of OLS as this term may be positive or negative. Like FIML, the principal-component method lends itself naturally to the estimation of more than one co-integrating vector.

The method of canonical correlation is based on a search for the linear combination of $(y_t, z_t)$ and $(y_{t-1}, z_{t-1})$ which has the maximal correlation subject to normalization and identification constraints.

Gonzalo compares the methods in a Monte Carlo simulation that uses a DGP similar to (63), but with (63a) modified to

where $a_1 = 0$ or $1$ and with $\sigma_1 = 1$. The results are consistent with the analysis of biases given above, and in particular support the contention that the Johansen-type FIML estimator will tend to be superior. Which of OLS and NLS is superior depends, as anticipated, on the parameters $\psi$ and $\psi - \beta$. Moreover, as we have seen above, it appears that the efficiency cost of over-parameterization of the FIML or NLS estimators is modest, while the consequences of under-parameterization may be more serious.
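The principal-component estimator described under (iv) is simple enough to sketch directly. The example below is a minimal illustration assuming NumPy; the simulated DGP ($y_t = 2z_t + u_t$ with $z_t$ a random walk) and the function name are ours and are not Gonzalo's Monte Carlo design. It takes the eigenvector of the sample covariance matrix of $(y_t, z_t)$ associated with the smallest eigenvalue and rescales it so that the coefficient on $y_t$ is one.

```python
import numpy as np

def principal_component_vector(y, z):
    """Minimum-variance combination of (y, z), scaled so the weight on y is 1."""
    cov = np.cov(np.vstack([y, z]))          # 2 x 2 sample covariance matrix
    eigval, eigvec = np.linalg.eigh(cov)     # eigenvalues in ascending order
    v = eigvec[:, 0]                         # direction with the smallest variance
    return v / v[0]

rng = np.random.default_rng(0)
T, beta = 2000, 2.0                          # illustrative values only
z = np.cumsum(rng.standard_normal(T))        # I(1) regressor
y = beta * z + rng.standard_normal(T)        # co-integrated with z

vec = principal_component_vector(y, z)
beta_pc = -vec[1]                            # vector is (1, -beta_pc)
print(vec, beta_pc)
```

Because the variance of the I(1) component dominates in large samples, the minimum-variance direction lines up with the co-integrating vector, which is the intuition given in the text.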
9 Conclusion

We briefly summarize the main themes of the book, and then consider the invariance of the matrix of co-integrating vectors in a linear system under both linear transformations and seasonal adjustment. Next, co-integration is related to structured time-series models, which offer an alternative approach to modelling integrated data. Recent research on integration and co-integration is described, and the book concludes by re-interpreting some old econometric problems in the light of co-integration theory.

9.1. Summary

Many economic time series appear to be non-stationary and to drift over time. Efficient inference in time-series econometrics requires taking account of this phenomenon. This book described the modelling of economic variables as integrated processes, allowing for the possibility that variables may be linked in the long run, implying that linear combinations of them are co-integrated.

We first presented the background to the theory of integrated series, building on concepts from time-series analysis and the theory of stochastic processes. The resulting distributions of estimators and tests applied to integrated data were functionals of Wiener processes, which when combined with a functional central-limit theorem led to a powerful and general method for deriving their limiting distributions. These were different from the limiting distributions conventionally applied to stationary processes, both because the normalization factor was the sample size rather than its square root, and because the form of the asymptotic distribution was non-normal. An important implication was that the critical values of test statistics differed between I(0) and I(1) data. Although the asymptotic distribution theory involved new types of derivations, it was feasible to master the logic of Wiener processes without excessive effort; the pay-off was that the approach simplified other derivations (such as constancy tests, as in Hansen 1992), and, in addition, was very general.

The Wiener process tools then allowed us to analyse such diverse problems as spurious (or nonsense) regressions, spurious detrending,
parametric an d non-parametricall y adjuste d univariat e test s fo r uni t roots, regression s o n 1(1 ) data , an d test s fo r co-integration . W e showe d that eve n wit h 1(1 ) dat a man y test s ha d conventiona l distributions , bu t some di d not , s o car e wa s require d i n conductin g inference . Fo r example, test s suc h a s the Johansen statisti c Tlo g (1 - A ) for co-integration ha d distribution s whic h wer e functiona l o f Wiene r processes , although test s o n co-integratin g vector s wer e asymptoticall y normal . I n particular, over-identificatio n test s neede d t o b e formulate d after map ping t o th e spac e o f 1(0) variable s t o ensur e tha t thei r distribution s wer e not a mixture of thes e tw o type s of distributions (se e Hendr y an d Mi/o n 1992). Conditionin g test s o n th e 1(1 ) decisio n fo r th e numbe r o f co-integrating relation s allowe d th e test s t o b e treate d a s having conventional distributions . Co-integration provide d a conceptua l framewor k fo r mappin g t o 1(0 ) space an d therefor e w e examine d i t a s a data-reductio n too l an d investigated som e o f it s wide-rangin g implications. Test s fo r co-integra tion base d o n residual s fro m stati c regression s an d o n system s wer e derived. Th e Grange r Representatio n Theore m linke d co-integratio n t o a variet y of other representations , includin g error-correction mechanism s (ECMs) whic h hav e been widel y used sinc e th e lat e 1970s . This lin k in tur n entail s a ne w view of dynamics : lagged feedbacks an d ECMs d o no t necessaril y violate rationalit y in a n 1(1 ) world . Further , a s in Davidso n e t al. (1978) , th e rol e o f differencin g i s a s a transform , which preserve s co-integration , an d no t a s a filter , whic h eliminate s levels variable s an d henc e lose s co-integration . Conversely , omittin g a n ECM generall y induces a negative moving-averag e error, a point elabor ated upo n below .
9.2. The Invariance of Co-integrating Vectors

Linear systems, perhaps formulated after suitable data transformations (such as logarithms) intended to make linearity a reasonable approximation, play a leading role in co-integration analysis. A linear system is invariant under non-singular linear transforms, but usually its parameters are altered by such transforms. Chapter 2 discussed the properties of linear autoregressive distributed lag (ADL) models for stationary data, relating transformations of ADLs to ECMs to demonstrate the equivalence of estimators of long-run multipliers from any of the transforms even though the parameters of the equation were altered. In I(1) processes, the corresponding result is that co-integration defines an invariant of a linear system, as we now show.

Consider an identified $n \times r$ co-integration matrix $\alpha$ in the I(1) system:
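$$\Delta x_t = \mu + \Gamma\Delta x_{t-1} + \gamma\alpha' x_{t-1} + \varepsilon_t \qquad (1)$$

where $\varepsilon_t \sim \mathrm{IN}(0, \Sigma)$. The system in (1) has parameters $(\Gamma, \gamma, \alpha, \mu, \Sigma)$. Then $x_t$ is I(1) if and only if $\operatorname{rank}(\gamma_\perp'\Psi\alpha_\perp) = n - r$, where $\Psi$ is the mean lag matrix defined in Chapter 8. Here $(\gamma : \gamma_\perp)$ has rank $n$, with $\gamma_\perp$ being $n \times (n - r)$ such that $\gamma_\perp'\gamma = 0$, and $(\alpha : \alpha_\perp)$ has rank $n$ with $\alpha_\perp'\alpha = 0$ for $\alpha_\perp$ of size $n \times (n - r)$. Pre-multiplying (1) by a known $n \times n$ non-singular matrix $B$ (so $|B| \neq 0$),

$$B\Delta x_t = B\mu + B\Gamma\Delta x_{t-1} + B\gamma\alpha' x_{t-1} + B\varepsilon_t = \mu^* + \Gamma^*\Delta x_{t-1} + \gamma^*\alpha' x_{t-1} + \varepsilon_t^*. \qquad (2)$$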
The system in (2) has the same likelihood as (1), but with parameters $(\Gamma^*, \gamma^*, \alpha, \mu^*, \Sigma^*)$ where $\Sigma^* = B\Sigma B'$; an example of an admissible transform is any just-identified reformulation of (1). Only $\alpha$ is unaffected by the linear transform, and $\alpha' x_{t-1}$ remains the co-integrating combination, so $\alpha$ is an invariant parameter of the system.

The I(1) property of the system is also preserved as follows. The mean-lag matrix becomes $\Psi^* = B\Psi$ and, letting $(\gamma^* : \gamma_\perp^*) = (B\gamma : B'^{-1}\gamma_\perp)$ so that $\gamma_\perp^{*\prime}\gamma^* = 0$, then

$$\gamma_\perp^{*\prime}\Psi^*\alpha_\perp = \gamma_\perp' B^{-1} B\Psi\alpha_\perp = \gamma_\perp'\Psi\alpha_\perp$$

and hence the two matrices have the same rank. The invariance of $\alpha$ is a natural property of reduced-rank systems and extends to I(2) processes and to conditional systems. Thus, for a given vector $x_t$, reduced forms, marginal models, conditional models, and structural forms all can be modelled with the same set of co-integration vectors.
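The rank argument of this section can be checked numerically. The following is a minimal sketch rather than anything taken from the text: the dimensions, the random draws standing in for $\gamma$, $\alpha$, $\Psi$, and $B$, and the helper name `orth_complement` are all assumptions made for illustration.

```python
import numpy as np

def orth_complement(m):
    """Return a basis (as columns) for the space orthogonal to the columns of m."""
    u, _, _ = np.linalg.svd(m, full_matrices=True)
    rank = np.linalg.matrix_rank(m)
    return u[:, rank:]

rng = np.random.default_rng(0)
n, r = 4, 2                                   # hypothetical system size and co-integrating rank

gamma = rng.normal(size=(n, r))               # stand-in adjustment matrix
alpha = rng.normal(size=(n, r))               # stand-in co-integration matrix
psi = rng.normal(size=(n, n))                 # stand-in mean-lag matrix
B = np.eye(n) + 0.1 * rng.normal(size=(n, n)) # a well-conditioned non-singular transform

gamma_perp = orth_complement(gamma)
alpha_perp = orth_complement(alpha)

# Transformed quantities: gamma* = B gamma, Psi* = B Psi, gamma_perp* = B'^{-1} gamma_perp.
gamma_star = B @ gamma
psi_star = B @ psi
gamma_perp_star = np.linalg.inv(B).T @ gamma_perp

original = gamma_perp.T @ psi @ alpha_perp
transformed = gamma_perp_star.T @ psi_star @ alpha_perp

print(np.linalg.matrix_rank(original), np.linalg.matrix_rank(transformed))
```

Because $B^{-1}B$ cancels inside the product, the two $(n-r) \times (n-r)$ matrices coincide exactly, not merely in rank, while $\alpha$ itself never enters the transformation.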
9.3. Invariance of Co-integration Under Seasonal Adjustment

The co-integrating vector $\alpha$ is invariant to seasonal adjustment by a diagonal seasonal filter $S(L)$ which satisfies the scale-preserving property $S(1) = I$, as does a procedure like X-11. The results in this section are drawn from Ericsson, Hendry, and Tran (1992). It is assumed that $S(L)$ annihilates any deterministic seasonal dummies. The invariance result holds because $S(L)$ can be written as (see Chapter 5):

$$S(L) = I + S^*(L)\Delta.$$

We first show the co-integration relation between adjusted and unadjusted data and then establish the invariance of the co-integration matrix $\alpha$ of $x_t$. Let $x_t^a = S(L)x_t$ denote the seasonally adjusted vector variable. Then

$$x_t^a = S(L)x_t = x_t + S^*(L)\Delta x_t$$
so that $x_t^a - x_t = S^*(L)\Delta x_t$. Hence $x_t^a$ and $x_t$ co-integrate with a unit coefficient to I(0) when $x_t$ is I(1). Most seasonal adjustment filters are two-sided and symmetric for most of the available sample, so that in fact $S^*(1) = 0$ and $S(L) = I + S^{**}(L)\Delta^2$. Then $x_t^a - x_t = S^{**}(L)\Delta^2 x_t$ so that co-integration to I(0) occurs between adjusted and unadjusted data even when $x_t$ is I(2). Alternatively, if $\Delta x_t$ is I(0) with a non-zero mean (as in GNP), then $x_t^a - x_t$ has a zero mean, as seems sensible for the seasonal residual. Generally, if $S(L) = I + S^\dagger(L)\Delta^d$, then $x_t^a$ and $x_t$ co-integrate with a unit coefficient to I(0) when $x_t$ is I($d$), and also have a zero mean difference when $x_t$ is I($d-1$). When $x_t^a - x_t$ is at most I(0), any co-integrating vector $\alpha'$ of either $x_t^a$ or $x_t$ is a co-integrating vector of the other, so co-integration parameters are unaffected by $S(L)$. Since $x_t^a = x_t + S^{**}(L)\Delta^2 x_t$, we have that

$$\alpha' x_t^a - \alpha' x_t = \alpha' S^{**}(L)\Delta^2 x_t$$
and hence the difference is at least two orders of integration lower than that of $x_t$.

However, the adjustment parameter $\gamma$ is altered as follows. Multiply (1) by $S(L)$ to give

$$\Delta x_t^a = S(L)\mu + S(L)\Gamma\Delta x_{t-1} + S(L)\gamma\alpha' x_{t-1} + S(L)\varepsilon_t.$$

By suitable addition and subtraction of lags and differences of $x_t^a$ on the right-hand side,

$$\Delta x_t^a = \mu + \Gamma\Delta x_{t-1}^a + \gamma\alpha' x_{t-1}^a + v_t^a. \qquad (6)$$

When $S^\dagger(L)$ is a scalar times the unit matrix (the same filter for all $x_{it}$), $v_t^a = \varepsilon_t^a$. In (6), it looks as if $\gamma$ is also an invariant, but as $v_t^a$ involves lagged, current, and future differences of $x_t$ of $d$th or higher order, as well as $\varepsilon_t^a$, one of $v_t^a$ or $\varepsilon_t$ is likely to be autocorrelated. Since $\alpha' x_{t-1}^a$ is an I(0) variable, conventional serial correlation biases apply to it, and hence $\gamma$ will usually be affected by whether or not the data are seasonally adjusted. The short-run dynamics will be changed when $\varepsilon_t$ is an innovation, because $v_t^a$ is correlated with $\Delta x_{t-1}^a$, and additional lags are needed to remove its autocorrelation.
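The factorization $S(L) = I + S^{**}(L)\Delta^2$ for a symmetric two-sided filter can be illustrated in the scalar case. The sketch below is an assumption-laden example, not the X-11 procedure: the filter is a hypothetical centred 2x4 moving average chosen only because its weights are symmetric and sum to one, and the simulated series is an arbitrary random walk with drift.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical symmetric two-sided filter on leads/lags -2..2 (a centred 2x4
# moving average); its weights sum to one, so S(1) = 1.
weights = np.array([1.0, 2.0, 2.0, 2.0, 1.0]) / 8.0

# Coefficients of L^2 * (S(L) - 1), in ascending powers of L.
shifted = weights.copy()
shifted[2] -= 1.0

# Divide by (1 - L)^2 = 1 - 2L + L^2. A zero remainder confirms that
# S(L) - 1 contains the factor Delta^2; the quotient gives the weights of S**(L).
quotient, remainder = P.polydiv(shifted, [1.0, -2.0, 1.0])

# Check x^a_t - x_t = S**(L) Delta^2 x_t on a simulated I(1) series with drift.
rng = np.random.default_rng(1)
x = np.cumsum(0.5 + rng.normal(size=200))
adjusted = np.convolve(x, weights, mode="valid")   # filtered series, interior points only
second_diff = np.diff(x, n=2)                      # Delta^2 x_t
gap = np.convolve(second_diff, quotient, mode="valid")

print(quotient, np.max(np.abs(adjusted - x[2:-2] - gap)))
```

The drift in the simulated series drops out of `adjusted - x[2:-2]`, which is the zero-mean-difference property noted above for a series whose first difference is I(0) with non-zero mean.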
9.4. Structured Time-series Models and Co-integration

An alternative approach to modelling integrated processes is offered by structured time-series models (see Harvey 1989).¹ In this section, we briefly explain their form and relate their data description properties to a co-integrated system. A simple univariate example is given by

$$y_t = \mu_t + \varepsilon_t \quad \text{where } \varepsilon_t \sim \mathrm{IN}(0, \sigma_\varepsilon^2) \qquad (7)$$

$$\mu_t = \mu_{t-1} + v_t \quad \text{where } v_t \sim \mathrm{IN}(0, \sigma_v^2) \qquad (8)$$
and $E[\varepsilon_t v_s] = 0\ \forall t, s$. Their form generally leads to the presence of negative moving-average errors, since (7) and (8) imply that

$$\Delta y_t = \varepsilon_t - \varepsilon_{t-1} + v_t. \qquad (9)$$

The process $\{\varepsilon_t - \varepsilon_{t-1} + v_t\}$ can be re-expressed as a first-order moving average $\{e_t - \theta e_{t-1}\}$, where the moments of the derived process are identical to those of the original process and determine $\theta$. The variance of the former is $2\sigma_\varepsilon^2 + \sigma_v^2$, and that of the latter, $\{e_t - \theta e_{t-1}\}$, is $(1 + \theta^2)\sigma_e^2$, and these must be equal to each other; their first-order auto-covariances are $-\sigma_\varepsilon^2$ and $-\theta\sigma_e^2$, and again these must be equal. All longer lag covariances vanish. Equating the first-order serial correlation coefficients of the two representations yields

$$\frac{-\theta}{1 + \theta^2} = \frac{-1}{2 + q} \qquad (10)$$

where $q = \sigma_v^2/\sigma_\varepsilon^2$. Equation (10) is a quadratic in $\theta$ that, given $q$, can be solved for a value of $\theta$ between 0 and 1. Finally, equating first-order covariances, $\sigma_e^2 = \sigma_\varepsilon^2/\theta$. Thus, $\Delta y_t$ is I(0) and has a negative moving-average error with parameter $\theta$: $\Delta y_t = e_t - \theta e_{t-1}$.

There are close links between negative moving-average errors and error-correction mechanisms as remarked earlier (see e.g. Gregoir and Laroque 1991). Consider a simple co-integrated system,
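$$\Delta y_t = \lambda(\beta z_{t-1} - y_{t-1}) + u_t \qquad (11)$$

$$\Delta z_t = v_t. \qquad (12)$$

To marginalize with respect to $z$ at all lags in (11), first rewrite it as

$$y_t = (1 - \lambda)y_{t-1} + \lambda\beta z_{t-1} + u_t \qquad (13)$$

so that, in terms of differences,

$$\Delta y_t = (1 - \lambda)\Delta y_{t-1} + w_t. \qquad (14)$$

In (14), $w_t = \lambda\beta v_{t-1} + \Delta u_t$ and, as with (9), when $\{v_t\}$ and $\{u_s\}$ are mutually independent, we can rewrite $w_t$ as $\xi_t - \tau\xi_{t-1}$, where equating

¹ Harvey calls such models 'structural', but as that word is heavily over-used in econometrics, we have substituted 'structured'.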
moments yields $-\tau/(1 + \tau^2) = -1/(2 + s)$ for $s = \lambda^2\beta^2\sigma_v^2/\sigma_u^2$. Thus, a negative moving-average error also results from the marginalization providing $\lambda \neq 0$ (the unit root in (14) cancels when $\lambda = 0$ since then $s = 0$ and so $\tau = 1$). If (7) and (8) allowed for a short-run dynamic element, the observed outcome would be similar to that entailed by (14).
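Both moment-matching conditions, (10) for $\theta$ and its analogue for $\tau$, reduce to the same quadratic, $\theta^2 - (2 + q)\theta + 1 = 0$, whose two roots are reciprocals of each other. A minimal sketch of the solution follows; the function names `ma_parameter` and `innovation_variance` are hypothetical, introduced only for this illustration.

```python
import math

def ma_parameter(q):
    """Root in (0, 1] of theta / (1 + theta^2) = 1 / (2 + q), for q >= 0.

    The quadratic theta^2 - (2 + q) theta + 1 = 0 has reciprocal roots;
    the smaller one is the invertible moving-average parameter.
    """
    if q < 0:
        raise ValueError("q is a variance ratio and must be non-negative")
    b = 2.0 + q
    return (b - math.sqrt(b * b - 4.0)) / 2.0

def innovation_variance(sigma_eps2, theta):
    """Variance of e_t implied by equating first-order autocovariances."""
    return sigma_eps2 / theta

# Example with an assumed signal-to-noise ratio q = sigma_v^2 / sigma_eps^2 = 1.
sigma_eps2, sigma_v2 = 1.0, 1.0
theta = ma_parameter(sigma_v2 / sigma_eps2)
sigma_e2 = innovation_variance(sigma_eps2, theta)
print(theta, sigma_e2)
```

When $q = 0$ (or $s = 0$ in the marginalized system) the root is exactly one, which is the case in which the moving-average unit root cancels the autoregressive one.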
A structured time-series model that generalizes (8) by including a time-varying slope generates an I(2) series,

Thus, as long as $\sigma_\eta^2 \neq 0$,

Hence from (7),

When $\sigma_\eta^2 = 0$, we have $\zeta_t = \zeta_{t-1} = \zeta_0$, say, so that

and $\zeta_0$ is the mean growth rate $E[\Delta y_t] = g_y = \zeta_0$. When $\sigma_\eta^2 \neq 0$, (18) entails changes in $E[\Delta y_t] = g_y(t)$ over time and generates $y_t$ as I(2).

The alternative possibility to evolving growth rates is that of changes in means over time, so that $g_y(t)$ takes different values in different epochs. Such behaviour could be approximated by a model in which the distribution $D_\eta(\eta_t)$ was non-normal, with a large mass at zero and small probabilities of large values. Then $\zeta_t$ would usually be constant, but would occasionally jump to a new level. Thus, it is unsurprising that discrimination between integrated and regime-change models is difficult (see Perron 1989). Conversely, there are close affinities between structured time-series and econometric models for integrated data. Indeed, several researchers have suggested switching from a unit-root null to one of I(0) or co-integration. For example, one might seek to test $\sigma_v^2 = 0$ when $\sigma_\eta^2 = 0$ (so $\zeta_t = \zeta_0$) as a test for a unit root (see e.g. Kwiatkowski, Phillips, and Schmidt (1991) and Leybourne and McCabe (1992)).
9.5. Recent Research on Integration and Co-integration

During the last decade there has been an explosion of research on integrated and co-integrated processes. Dozens of papers appeared while we were writing the book, and many will appear between completion of
writing and its appearance in print. With such a rapidly moving target, we focused on central research topics to explain what seem likely to remain the major concepts, tools, techniques, models, methods, and tests.

Consequently, some research areas received scant treatment, including other estimation methods for co-integration vectors, as well as studies of their properties: see inter alia Ahn and Reinsel (1988), Bewley, Orden, and Fisher (1991), Boswijk (1991), Box and Tiao (1977), Engle and Yoo (1991), Phillips (1991), Saikkonen (1991). Some comparative Monte Carlo studies of finite sample behaviour and related econometric theory have been noted, but others appear apace and we can expect many more over the next few years clarifying the choice of method, and the likely problems confronting each proposal. Researchers will also study the problems of joint selection of, e.g. lag length and the number of co-integration vectors. Another research topic is the order in which hypothesis tests should be conducted. Intuition suggests that it should be constancy, lag length, co-integration, congruence of the system, weak exogeneity, structural restrictions, encompassing, intercepts (and whether they lie in the co-integration space), etc. However, the distributions of tests of the first hypothesis are affected by the presence of co-integration, and it may well be difficult to implement a good order, although if the data are indeed I(1), tests for lag length based on lagged first differences will be in I(0) space. One recommendation concerning choices of methods and estimators that emerged as we proceeded was for a systems approach in preference to single-equation modelling until weak exogeneity has been ascertained.
Further developments have occurred in testing for unit roots in univariate processes such as instrumental variables tests and Durbin-Hausman tests (see e.g. Hall 1991, Choi 1992, Schmidt and Phillips 1992, Kremers et al. 1992; and Banerjee and Hendry 1992 for a summary). However, the previous recommendation of modelling the system rather than using univariate representations brings into question the point of conducting unit-root tests in marginal processes. One purpose might be to reject the null of integration against trend stationarity. Here, the available tests are known to have relatively low power. In particular, investigators often use $t(\rho = 1)$ rather than $T(\rho - 1)$ (see Sect. 4.6) although Monte Carlo evidence shows the latter to have higher power. In any case, failure to reject the null does not entail accepting it as 'true'. For example, univariate unit-root tests can reflect other non-modelled forms of non-stationarity such as regime shifts, and inherent non-stationarity in mean and variance functions. Further, variables inherit unit roots from marginalizing with respect to other unit root processes on which they depend. Thus, failure to reject a null of a unit root tells us little about the persistence of shocks to the variable
being considered in isolation or in a small, highly marginalized system as discussed in Campbell and Perron (1991).

A second purpose might be to check that variables in a system are not I(2) (see e.g. Pantula 1991), so the null would be a unit root in the differences of the original variables. However, if the intention is to model the system, then it seems better to proceed from the general to the specific here as well and test the necessary rank conditions on the mean lag matrix of the system (see following (1) above). Nevertheless, sequential tests in this context raise some new problems. For example, the outcome of a pretest for a unit root (i.e. reject or not reject) affects the critical values used to test economic hypotheses, so the possibility of Type-I errors at the first stage may lead to size or power distortions at the second stage when conventional initial values are used.

Finally, a unit root may be of interest in order to validate a specific estimator (e.g. Engle-Granger) by appealing to super-consistency. Here a unit root test may be of descriptive value as it depends on the ratio of the covariance of the first difference with the level to the variance of the level, and so should be close to zero when there is a unit root, although we showed in Section 3.6 that similar distributions will result for integrated and near-integrated processes. The ratio of the variance of the first difference to that of the level is another index of the rapidity of accrual of information (either from trends or from drift).

Other likely research interests concern tests of structural, long-run, exogeneity, causality, and encompassing hypotheses (see e.g. Boswijk 1991, Hendry and Mizon 1992, and Banerjee and Hendry 1992).
Modelling I(2) systems is in its infancy (see Johansen 1991b), but has close links to multi-co-integration and the analysis of stock-flow relations (see Granger and Lee 1990). This last development provides an additional explanation for such phenomena as the role of inflation in real money demand equations: if nominal money and the price level are I(2), and real money and inflation are I(1), then the last may be needed to create an I(0) co-integration vector. Extensive developments also seem likely to occur in estimation and dynamic modelling, since for many objectives in econometrics, including forecasting and policy, the focus of interest must be all parameters of the system and not just the long-run parameters.

In co-integrated processes, weak exogeneity of the conditioning variables for the parameters of interest remains as vital as it did in stationary processes, even for the long-run parameters. Thus, it is important to test for the presence of co-integrating vectors in other equations as discussed in Chapter 8. Doing so, however, implies system modelling even for an LM test (see Boswijk 1991). Further, Urbain (1992) shows that tests for orthogonality between regressors and errors lack power to detect such a weak exogeneity failure.
9.6. Reinterpreting Econometrics Time-series Problems

Integration and co-integration also lead to the re-interpretation of many extant econometrics time-series problems. We consider a few of these, commencing with multi-collinearity.
9.6.1. Multi-collinearity

When $x_t \sim$ I(1) and $\alpha' x_t \sim$ I(0), then including all the elements of $x_t$ or $x_{t-1}$ as regressors in a single equation will induce an apparently serious collinearity problem. The second moment matrix $(X'X)$ will be $O(T^2)$, whereas the linear combination $(\alpha' X'X\alpha)$ will be $O(T)$. Consequently, $(T^{-2}X'X)$ will converge on a singular matrix. Generally, it is inadvisable to 'solve' this problem by deleting variables; for I(1) data, doing so jeopardizes the possibility of co-integration. If the dependent variable is I(0), then the solution is to find the co-integrating combination $\alpha' x_t$ or $\alpha' x_{t-1}$ and use that as an explanatory variable. This strategy corresponds to the usual recommendation of transforming to near-orthogonal and interpretable variables. In other cases, where the dependent variable is I(1) but is co-integrated with a subset of $x_t$, say, elimination may be sensible, but Wiener-based critical values should be used for variables that cannot be written implicitly as an I(0) function (see Chapter 7). These ideas are related to the earlier technique of confluence analysis in Hendry and Morgan (1989).
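The different orders of magnitude of $X'X$ and $\alpha' X'X\alpha$ can be seen in a small simulation. This is an illustrative sketch under assumed settings, not a result from the text: the bivariate data-generating process, the sample size, the seed, and the co-integrating vector $(1, -2)'$ are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(42)
T = 5000

# Two I(1) series driven by one common stochastic trend (assumed DGP).
trend = np.cumsum(rng.normal(size=T))
x1 = trend + rng.normal(size=T)
x2 = 0.5 * trend + rng.normal(size=T)
X = np.column_stack([x1, x2])

# By construction x1 - 2 * x2 is I(0), so alpha = (1, -2)' co-integrates.
alpha = np.array([1.0, -2.0])

# T^{-2} X'X: one eigenvalue stays O(1), the other shrinks at rate 1/T,
# so the scaled moment matrix is close to singular.
scaled_moments = X.T @ X / T**2
eigenvalues = np.linalg.eigvalsh(scaled_moments)   # ascending order

# alpha' X'X alpha grows only at rate T, so dividing by T gives a stable
# quantity close to the variance of the I(0) combination (here 1 + 4 = 5).
coint_moment = alpha @ X.T @ X @ alpha / T

print(eigenvalues, coint_moment)
```

Replacing the two levels by the single regressor `X @ alpha` removes the near-singularity, which is the transformation to near-orthogonal variables recommended above.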
9.6.2. Measurement Errors

Measurement errors are a second problem where treatment recommendations can differ in the light of data being integrated. When $x_t \sim$ I(1), then $\Delta x_t \sim$ I(0), and if the data are in logarithms, then the changes are growth rates. If observed growth rates are to be at all sensible, then the error with which they are measured must not be I(1) or higher. Letting $x_t^o$ denote the observed series, one possible model is $\Delta x_t^o = \Delta x_t + u_t$, where $u_t$ is I(0), so that

$$x_t^o - x_t = (x_0^o - x_0) + \sum_{i=1}^{t} u_i. \qquad (20)$$

If the measurement error in levels is denoted $w_t = x_t^o - x_t$, then $w_t$ is apparently I(1). This consideration therefore only rather weakly bounds the scale of measurement error. Indeed, if the DGP is of the form that $\Delta x_t = \varepsilon_t$, then $u_t$ and $\varepsilon_t$ are essentially indistinguishable in models of $x_t^o$.
However, when $\alpha' x_t$ is an I(0) co-integrating combination, then, on pre-multiplying (20) by $\alpha'$,

$$\alpha' x_t^o - \alpha' x_t = \alpha'(x_0^o - x_0) + \sum_{i=1}^{t} \alpha' u_i.$$

Since $\Delta\alpha' x_t$ is I(−1), $\alpha' x_t^o$ will be I(0) only if $\alpha' u_t$ is I(−1). Thus, I(0) measurement errors on growth rates must co-integrate to I(−1) with co-integration matrix $\alpha$ if the observed series are to co-integrate in the same way as the latent variables when the measurement errors are I(0) on growth rates. Nowak (1990) calls a failure to observe $\alpha' x_t^o$ being I(0) when $\alpha' x_t$ is I(0) a problem of 'hidden co-integration'. However, many co-integration relationships, such as consumption and income, are likely to have connected measurement errors. Governmental statistical bureaux may even correct the data on such series in a related way to avoid divergence, which suggests an I(0) measurement error for, say, the ratio between them.

An alternative model of measurement error for logarithms is one with a constant-percentage standard deviation, so that the size of the absolute error grows with the variable. This leads to $x_t^o = x_t + v_t$ where $\operatorname{var}[v_t]$ is constant. Such a measurement error would not impede co-integration analyses, in that inconsistency would not result as in an I(0) setting, but would have the usual impact in I(0) representations since $\alpha' v_t$ could be I(0). An important instance is when $v_t$ is an expectations error, in which case the distributions of the long-run parameter estimates are unaffected but short-run parameter estimates may be biased (see Engle and Granger 1987, and Hendry and Neale 1988).

9.6.3. Incorrectly Omitted and Included Variables

When a relevant I(1) variable is omitted from a relationship, I(0) co-integration is impossible and serious biases can result.
In particular, for an I(0) dependent variable, all the remaining I(1) regressors may cease to be significant given the appropriate critical values, leading the model to collapse to one in differences. Including an irrelevant I(1) variable or vector will probably lower the efficiency of estimates of the co-integrating vectors but should be detectable in large enough samples, with the usual possibility of Type-I errors.

If one incorrectly includes an I(0) variable in a co-integration vector in a static regression, its coefficient will be biased when that variable is correlated with omitted I(0) variables. The consequences in the maximum likelihood procedure seem less serious as it is possible to test for a unit vector (i.e. one of the form $(0 \ldots 0\ 1\ 0 \ldots 0)$) lying in the co-integration space (see Sect. 8.5.2). However, conditioning on the estimated coefficients of I(0) variables is inappropriate, and spuriously small confidence intervals for the remaining I(0) effects will usually result. Finally, excluding an I(0) variable from a model will not affect the long-run parameter estimates in large samples, but will usually bias the short-run parameters as in conventional econometric derivations.

9.6.4. Parameter Change in Integrated Processes

The most serious problem arising from possible parameter change in econometrics is the predictive failure of models that fail to incorporate the necessary effects. Unfortunately, it is difficult even to diagnose the problem since it is easy to confuse an I(1) process with an I(0) subject to shifts (see e.g. Perron 1989, Rappoport and Reichlin 1989, and Hendry and Neale 1991). Indeed, as noted in Section 9.4 above, structured time-series models implement the latter and produce the former. Whether it is more useful to view economic data as integrated (in the sense of having a unit root in the autoregressive representation subject to regular small shocks) or as subject to large and persistent regime shifts (the abolition of fixed exchange rates following Bretton Woods, or their reinstatement in the ERM; the formation of OPEC; the denationalization of large sectors of an economy; new forms of monetary control or their removal; financial and technological innovation; etc.) remains to be seen. However, both types are bound to play important roles, and although we have focused on the former in this book, understanding economic behaviour will necessitate modelling both integrated data and breaks appropriately.
Ex ante, structural breaks can lead to bad predictions, which I(1) data alone do not seem to cause. Ex post, testing for parameter change in I(1) data must allow for a wide range of possible choices for break points. Useful developments are occurring in deriving appropriate tests based on Wiener distributions, and decision taking in this area should improve rapidly (see Nyblom 1989, Chu and White 1991, 1992, Andrews and Ploberger 1991, Hansen 1991, and Lin and Terasvirta 1991).

9.6.5. Conditional Models of Co-integrated Processes

Chapter 8 emphasized the maximum-likelihood approach to testing for and estimating co-integrating vectors in the context of a VAR. This imposed the minimum conditioning assumptions and allowed a clear focus on the properties of co-integration estimation. However, many papers have begun to develop approaches in the context of systems that
treat a subset of variables as weakly exogenous for all the parameters of interest: see Johansen (1992a, 1992b), Johansen and Juselius (1990), and Boswijk (1991), inter alia. Related work includes that on testing for Granger causality in co-integrated systems (see Toda and Phillips 1991, Mosconi and Giannini 1992, and Hunter 1992).

For a long time, econometricians have 'talked' co-integration without realizing it: for example, Klein (1953) discusses various great ratios of economics, namely consumption-income, capital-output, wage share in total income, and so on, implicitly assuming a stationary, or I(0), world. From our perspective, given that the components of these relations are I(1), Klein's ratios are early examples of co-integration hypotheses. In a log-linear multivariate analysis, these postulate particular forms for the rows of the co-integration matrix, highlighting the potential confirmatory role of the methods discussed in Chapter 8. Econometricians need no longer simply assume long-run equilibrium relations since it is feasible to test for their existence. Once that is established the analysis is reduced from I(1) to I(0) space, allowing the application of well established tools.

Thus, the recent focus on conditional or open models takes us back to the 1970s in an important sense with the links between economic theory or long-run equilibrium reasoning and data modelling having been placed on a sounder footing.

As we have shown in this book, there still remain many difficult theoretical and empirical problems to be overcome. However, the literature on co-integration, error correction and the econometric analysis of non-stationary data has enabled us to gain many important insights into modelling relationships among integrated variables. This has enhanced rather than replaced existing methods of dynamic econometric modelling of economic time series.
References

ABADIR, K. M. (1992), 'The Limiting Distribution of the Autocorrelation Coefficient Under a Unit Root', Annals of Statistics, forthcoming.
AHN, S. K., and REINSEL, G. C. (1988), 'Nested Reduced-Rank Autoregressive Models for Multiple Time Series', Journal of the American Statistical Association, 83: 849-56.
ANDERSON, T. W. (1958), An Introduction to Multivariate Statistical Analysis, John Wiley, New York.
ANDERSON, T. W. (1976), 'Estimation of Linear Functional Relationships: Approximate Distributions and Connections with Simultaneous Equations in Econometrics (with discussion)', Journal of the Royal Statistical Society B, 38: 1-36.
ANDREWS, D. W. K., and PLOBERGER, W. (1991), 'Optimal Tests of Parameter Constancy', mimeo., Yale University Press.
BANERJEE, A., and DOLADO, J. (1987), 'Do We Reject Rational Expectations Models Too Often? Interpreting Evidence using Nagar Expansions', Economics Letters, 24: 27-32.
BANERJEE, A., and DOLADO, J. (1988), 'Tests of the Life Cycle-Permanent Income Hypothesis in the Presence of Random Walks: Asymptotic Theory and Small Sample Interpretations', Oxford Economic Papers, 40: 610-33.
BANERJEE, A., DOLADO, J., and GALBRAITH, J. W. (1990a), 'Orthogonality Tests with De-trended Data: Interpreting Monte Carlo Results using Nagar Expansions', Economics Letters, 32: 19-24.
BANERJEE, A., DOLADO, J., HENDRY, D. F., and SMITH, G. W. (1986), 'Exploring Equilibrium Relationships in Econometrics through Static Models: Some Monte Carlo Evidence', Oxford Bulletin of Economics and Statistics, 48: 253-77.
BANERJEE, A., GALBRAITH, J. W., and DOLADO, J. (1990b), 'Dynamic Specification with the General Error-Correction Form', Oxford Bulletin of Economics and Statistics, 52: 95-104.
BANERJEE, A., and HENDRY, D. F. (eds.) (1992), Testing Integration and Cointegration, special issue of the Oxford Bulletin of Economics and Statistics, 54, 225-55.
BARDSEN, G. (1989), 'The Estimation of Long-Run Coefficients from Error Correction Models', Oxford Bulletin of Economics and Statistics, 51: 345-50.
BEWLEY, R. A. (1979), 'The Direct Estimation of the Equilibrium Response in a Linear Model', Economics Letters, 3: 357-61.
BEWLEY, R. A., ORDEN, D., and FISHER, L. (1991), 'Box-Tiao and Johansen Canonical Estimators of Cointegrating Vectors', University of New South Wales, Economics Discussion Paper, 91/5.
BHARGAVA, A. (1986), 'On the Theory of Testing for Unit Roots in Observed Time Series', Review of Economic Studies, 53: 369-84.
BILLINGSLEY, P. (1968), Convergence of Probability Measures, John Wiley, New York.
BOSSAERTS, P. (1988), 'Common Non-Stationary Components of Asset Prices', Journal of Economic Dynamics and Control, 12: 347-64.
BOSWIJK, H. P. (1991), 'Testing for Cointegration in Structural Models', University of Amsterdam, Econometrics Discussion Paper AE7/91.
BOSWIJK, H. P. (1992), 'Efficient Inference on Cointegration Parameters in Structural Error Correction Models', University of Amsterdam, Econometrics Discussion Paper.
BOSWIJK, H. P., and FRANSES, P. H. (1992), 'Dynamic Specification and Cointegration', Oxford Bulletin of Economics and Statistics, 54: 369-81.
BOX, G. E. P., and JENKINS, G. M. (1970), Time Series Analysis: Forecasting and Control, Holden-Day, San Francisco.
BOX, G. E. P., and TIAO, G. C. (1977), 'A Canonical Analysis of Multiple Time Series', Biometrika, 64: 355-65.
BRANDNER, P., and KUNST, R. (1990), 'Forecasting Vector Autoregressions: The Influence of Cointegration', Memorandum 265, IAS, Vienna.
CAMPBELL, B., and DUFOUR, J.-M. (1991), 'Over-Rejections in Rational Expectations Models: A Non-Parametric Approach to the Mankiw-Shapiro Problem', Economics Letters, 35: 285-90.
CAMPBELL, J. Y., and PERRON, P. (1991), 'Pitfalls and Opportunities: What Macroeconomists Should Know About Unit Roots', in Blanchard, O. J. and Fischer, S. (eds), NBER Economics Annual 1991, MIT Press.
CAMPBELL, J. Y., and SHILLER, R. J. (1991), 'Cointegration and Tests of Present Value Models', Journal of Political Economy, 95: 1062-88.
CHAMBERS, M. J. (1991), 'A Note on Forecasting in Co-Integrated Systems', Department of Economics, University of Essex.
CHAN, N. H., and WEI, C. Z. (1988), 'Limiting Distributions of Least-Squares Estimates of Unstable Autoregressive Processes', Annals of Statistics, 16: 367-401.
CHOI, I. (1992), 'Durbin-Hausman Tests for Unit Roots', Oxford Bulletin of Economics and Statistics, 54: 289-304.
CHONG, Y. Y., and HENDRY, D. F. (1986), 'Econometric Evaluation of Linear Macroeconomic Models', Review of Economic Studies, 53: 671-90.
CHOW, G. C. (1960), 'Tests of Equality Between Sets of Coefficients in Two Linear Regressions', Econometrica, 52: 211-22.
CHU, C.-S. J., and WHITE, H. (1991), 'Testing for Structural Change in some Simple Time Series Models', Discussion Paper 91-6, University of California, San Diego, Dept. of Economics.
CHU, C.-S. J., and WHITE, H. (1992), 'A Direct Test for Changing Trend', Journal of Business and Economic Statistics, 10: 289-99.
CLEMENTS, M. P., and HENDRY, D. F. (1991), 'On the Limitations of Mean Square Error Forecast Comparisons', Discussion paper 138, Oxford Institute of Economics and Statistics. Forthcoming, Journal of Forecasting.
CLEMENTS, M. P., and HENDRY, D. F. (1992), 'Forecasting in Cointegrated Systems', Discussion paper 139, Oxford Institute of Economics and Statistics.
DAVIDSON, J. E. H., HENDRY, D. F., SRBA, F., and YEO, S. (1978), 'Econometric Modelling of the Aggregate Time-Series Relationship Between Consumers' Expenditure and Income in the United Kingdom', Economic Journal, 88: 661-92.
DAVIDSON, R., and MACKINNON, J. G. (1992), Estimation and Inference in Econometrics, Oxford University Press.
DEATON, A. S., and MUELLBAUER, J. N. J. (1980), Economics and Consumer Behavior, Cambridge University Press.
DICKEY, D. A. (1976), 'Estimation and Hypothesis Testing for Nonstationary Time Series', Ph.D. dissertation, Iowa State University.
DICKEY, D. A., and FULLER, W. A. (1979), 'Distribution of the Estimators for Autoregressive Time Series with a Unit Root', Journal of the American Statistical Association, 74: 427-31.
DICKEY, D. A., and FULLER, W. A. (1981), 'Likelihood Ratio Statistics for Autoregressive Time Series with a Unit Root', Econometrica, 49: 1057-72.
DICKEY, D. A., and PANTULA, S. G. (1987), 'Determining the Order of Differencing in Autoregressive Processes', Journal of Business and Economic Statistics, 15: 455-61.
DICKEY, D. A., and SAID, S. E. (1981), 'Testing ARIMA(p, 1, q) against ARMA(p + 1, q)', Proceedings of the Business and Economic Statistics Section, American Statistical Association, 28: 318-22.
DICKEY, D. A., BELL, W. R., and MILLER, R. B. (1986), 'Unit Roots in Time Series Models: Tests and Implications', American Statistician, 40: 12-26.
DICKEY, D. A., HASZA, D. P., and FULLER, W. A. (1984), 'Testing for a Unit Root in Seasonal Time Series', Journal of the American Statistical Association, 79: 355-67.
DURLAUF, S. N., and PHILLIPS, P. C. B. (1988), 'Trends versus Random Walks in Time Series Analysis', Econometrica, 56: 1333-54.
ENGLE, R. F., and GRANGER, C. W. J. (1987), 'Co-integration and Error Correction: Representation, Estimation and Testing', Econometrica, 55: 251-76.
ENGLE, R. F., and YOO, B. S. (1987), 'Forecasting and Testing in Co-integrated Systems', Journal of Econometrics, 35: 143-59.
ENGLE, R. F., and YOO, B. S. (1991), 'Cointegrated Economic Time Series: An Overview with New Results', in R. F. Engle and C. W. J. Granger (eds.), Long-Run Economic Relationships, Oxford University Press, 237-66.
ENGLE, R. F., GRANGER, C. W. J., and HALLMAN, J. (1988), 'Merging Short- and Long-run Forecasts: An Application of Seasonal Co-integration to Monthly Electricity Sales Forecasting', Journal of Econometrics, 40: 45-62.
ENGLE, R. F., GRANGER, C. W. J., HYLLEBERG, S., and LEE, H. S. (1993), 'Seasonal Co-Integration: The Japanese Consumption Function', Journal of Econometrics, 55: 275-98.
ENGLE, R. F., HENDRY, D. F., and RICHARD, J.-F. (1983), 'Exogeneity', Econometrica, 51: 277-304.
ERICSSON, N. R. (1992), Cointegration, Exogeneity and Policy Analysis, Special Issue, Journal of Policy Modeling, 14, 3 and 4.
ERICSSON, N. R., CAMPOS, J., and TRAN, H.-A. (1990), 'PC-GIVE and David Hendry's Econometric Methodology', Revista de Econometrica, X, 7-117.
ERICSSON, N. R., and HENDRY, D. F. (1985), 'Conditional Econometric Modelling: An Application to New House Prices in the United Kingdom', in Atkinson, A. C. and Fienberg, S. E. (eds), A Celebration of Statistics, Springer-Verlag, 251-85.
ERICSSON, N. R., HENDRY, D. F., and TRAN, H.-A. (1992), 'Cointegration, Seasonality, Encompassing and the Demand for Money in the United Kingdom', Discussion Paper, Board of Governors of the Federal Reserve System, Washington, DC.
ERMINI, L., and GRANGER, C. W. J. (1991), 'Some Generalizations on the
(1988) , 'Mergin g Short - an d Long-run Forecasts : An Applicatio n of Seasona l Co-integratio n to Monthl y Electricity Sales Forecasting', Journal of Econometrics, 40: 45-62. -HYLLEBURG, S. , an d LEE , H. S . (1993) , 'Seasona l Co-Integration : Th e Japanese Consumptio n Function' , Journal of Econometrics, 55: 275-98. -HENDRY, D . F. , an d RICHARD , J.-F . (1983) , 'Exogeneity' , Econometrica, 51: 277-304. ERICSSON, N . R . (1992) , Cointegration, Exogeneity an d Policy Analysis, Specia l Issue, Journal of Policy Modeling, 14 , 3 and 4 . CAMPOS, J. , an d TRAN , H.-A . (1990) , 'PC-GIV E an d Davi d Hendry' s Econometric Methodology' , Revista de Econometrica, X, 7-117. and HENDRY , D . F . (1985) , 'Conditiona l Econometri c Modelling : A n Application t o Ne w House Prices i n the Unite d Kingdom' , i n Atkinson, A. C . and Fienberg, S . E . (eds) , A Celebration o f Statistics, Springer-Verlag , 251-85. -HENDRY, D . F . an d TRAN , H.-A . (1992 ) 'Cointegration , Seasonality , Encompassing an d th e Deman d fo r Mone y i n th e Unite d Kingdom' , Discus sion Paper , Boar d o f Governor s o f th e Federa l Reserv e System , Washington, DC. ERMINI, L. , an d GRANGER , C . W . J . (1991) , 'Som e Generalization s o n th e
314 Reference
s
Algebra o f 7(1 ) Processes' , Workin g Paper , Departmen t o f Economics , University of Hawaii at Manoa . ERMINI, L. , an d HENDRY , D . F . (1991) , 'Lo g Incom e vs . Linea r Income : A n Application o f th e Encompassin g Principle' , Workin g Pape r no . 91-11 , De partment o f Economics, Universit y of Hawaii at Manoa. EVANS, G . B . A. , an d SAVIN , N . E . (1981) , 'Testin g fo r Uni t Roots : 1' , Econometrica, 49: 753-79. (1984), Testin g for Unit Roots : 2 ' Econometrica, 52 : 1241-69. FRIEDMAN, M. , an d SCHWARTZ , A . J . (1982) , Monetary Trends i n th e United States and the United Kingdom: Their Relation to Income, Prices, and Interest Rates, 1867-1975, Universit y o f Chicago Press . FULLER, W . A . (1976) , Introduction t o Statistical Time Series, John Wiley , New York. GALBRAITH, J . W. , DOLADO , J. , an d BANERJEE , A . (1987) , 'Rejection s o f Orthogonality i n Rationa l Expectation s Models: Furthe r Mont e Carl o Result s for a n Extende d Se t of Regressors', Economics Letters, 25 : 243-7. GANTMACHER, F . R . (1959) , Applications o f th e Theory o f Matrices, Inter science, Ne w York. GEL'FAND, J . M . (1967) , Lectures on Linear Algebra, Interscience , New York. GEWEKE, J . (1986) , 'Th e Super-Neutralit y of Mone y i n th e Unite d States : A n Interpretation o f the Evidence' , Econometrica, 54 : 1-21 . GHYSELS, E . (1990) , 'O n th e Economic s an d Econometric s o f Seasonally' , paper presente d t o th e Sixt h World Congress o f the Econometri c Society. GONZALO, J . (1990) , 'Compariso n o f Fiv e Alternativ e Method s o f Estimatin g Long-Run Equilibriu m Relationships' , Discussio n Paper , Universit y of Cali fornia a t Sa n Diego. GRANGER, C . W . J . (1981) , 'Som e Properties o f Time Serie s Dat a an d thei r Us e in Econometri c Mode l Specification' , Journal of Econometrics, 16: 121-30. (1983), 'Forecastin g Whit e Noise', i n A. Zellne r (ed.) 
, Applied Time Series Analysis o f Economic Data, Bureau o f the Census , Washington, DC, 308-14 . (1986), 'Development s i n th e Stud y of Co-integrate d Economi c Variables' , Oxford Bulletin of Economics an d Statistics, 48: 213-28. -and HALLMAN , J . (1991) , 'Th e Algebr a o f 1(1) Processes' , Journal of Time Series Analysis, 12 : 207-24. -and LEE , T.-H. (1990) , 'Multicointegration' , i n G . F . Rhode s Jr . an d T . B . Fomby (eds.) , Advances i n Econometrics, JA I Press , Greenwic h Conn. , 71-84. and NEWBOLD , P . (1974) , 'Spuriou s Regression s i n Econometrics' , Journal of Econometrics, 2: 111-20 . -(1977), 'Th e Tim e Serie s Approac h t o Econometri c Mode l Building' , in C . A . Sim s (ed.) , Ne w Methods i n Business Cycle Research, Federa l Reserve Ban k o f Minneapolis. -(1978), Forecasting Economic Time Series, Academi c Press , Ne w York. — and WEISS , A . A . (1983) , 'Time-Serie s Analysi s o f Error-Correctio n Models', i n S . Karlin , T . Amemiya , an d L . A . Goodma n (eds.) , Studies i n Econometrics, Time Series an d Multivariate Statistics, Academi c Press , Ne w York.
GREGOIR, S., and LAROQUE, G. (1991) 'Multivariate Integrated Time Series: A General Error Correction Representation with Associated Estimation and Test Procedures', Discussion paper 53/G305, INSEE, Paris.
GRIMMET, G. R., and STIRZAKER, D. R. (1982), Probability and Random Processes, Oxford University Press.
HALDRUP, N., and HYLLEBERG, S. (1991), 'Integration, Near-Integration and Deterministic Trends', Discussion Paper no. 1991-15, Aarhus University, Denmark.
HALL, A. (1989), 'Testing for a Unit Root in the Presence of Moving Average Errors', Biometrika, 79: 49-56.
--- (1990), 'Testing for a Unit Root in Time Series using Instrumental Variables Estimators with Pre-test Data-Based Model Selection', Discussion Paper, North Carolina State University.
--- (1991), 'Model Selection and Unit Root Tests based on Instrumental Variables Estimators', Discussion paper, North Carolina State University.
HALL, A. D., ANDERSON, H. M., and GRANGER, C. W. J. (1992), 'A Cointegration Analysis of Treasury Bill Yields', Review of Economics and Statistics, 74: 116-25.
HALL, P., and HEYDE, C. C. (1980), Martingale Limit Theory and Applications, Academic Press, New York.
HALL, R. E. (1978), 'Stochastic Implications of the Life-Cycle Permanent Income Hypothesis', Journal of Political Economy, 86: 971-87.
HAMMERSLEY, J. M., and HANDSCOMB, D. C. (1964), Monte Carlo Methods, Methuen, London.
HANSEN, B. E. (1991), 'Tests for Parameter Instability in Regressions with I(1) Processes', Discussion paper, University of Rochester.
--- (1992), 'Testing for Parameter Instability in Linear Models', Journal of Policy Modeling, 14: 517-33.
HARVEY, A. C. (1989), Forecasting, Structural Time Series Models and the Kalman Filter, Cambridge University Press.
HASZA, D. P., and FULLER, W. A. (1982), 'Testing for Nonstationary Parameter Specifications in Seasonal Time-Series Models', Annals of Statistics, 10: 1209-16.
HENDRY, D. F. (1984), 'Monte Carlo Experimentation in Econometrics', ch. 16 in Z. Griliches and M. D. Intrilligator (eds.), Handbook of Econometrics, ii, North-Holland, Amsterdam, 937-76.
--- (1989), PC-GIVE: An Interactive Econometric Modelling System, Institute of Economics and Statistics, Oxford University, Oxford.
--- (1991a), 'Using PC-NAIVE in Teaching Econometrics', Oxford Bulletin of Economics and Statistics, 53, 199-223.
--- (1991b), 'Economic Forecasting', Report to the Treasury and Civil Service Committee, UK.
--- and ANDERSON, G. J. (1977), 'Testing Dynamic Specification in Small Simultaneous Models: An Application to a Model of Building Society Behavior in the United Kingdom', ch. 8c in M. D. Intrilligator (ed.), Frontiers of Quantitative Economics, iii(a), North-Holland, Amsterdam, 361-83.
--- and CLEMENTS, M. P. (1992), 'Towards a Theory of Economic Forecasting', unpublished paper, Institute of Economics and Statistics, Oxford University.
HENDRY, D. F., and ERICSSON, N. R. (1991a), 'An Econometric Appraisal of U.K. Money Demand in Monetary Trends in the United States and the United Kingdom by Milton Friedman and Anna J. Schwartz', American Economic Review, 81: 8-38.
--- and ERICSSON, N. R. (1991b), 'Modelling the Demand for Narrow Money in the United Kingdom and the United States', European Economic Review, 35: 833-81.
--- and MIZON, G. E. (1978), 'Serial Correlation as a Convenient Simplification, not a Nuisance: A Comment on a Study of the Demand for Money by the Bank of England', Economic Journal, 88: 549-63.
--- (1992), 'Evaluating Dynamic Models by Encompassing the VAR', in P. C. B. Phillips (ed.), Models, Methods, and Applications of Econometrics, Basil Blackwell, Oxford.
--- and MORGAN, M. S. (1989), 'A Re-analysis of Confluence Analysis', Oxford Economic Papers, 41: 35-52; reprinted in N. de Marchi and C. L. Gilbert (eds.), History and Methodology of Econometrics, Clarendon Press, Oxford, 1990.
--- MUELLBAUER, J. N. J., and MURPHY, A. (1990), 'The Econometrics of DHSY', in J. D. Hey and D. Winch (eds.), A Century of Economics, Basil Blackwell, Oxford, 298-334.
--- and NEALE, A. J. (1987), 'Monte Carlo Experimentation using PC-NAIVE', in T. Fomby and G. Rhodes (eds.), Advances in Econometrics, vi, JAI Press, Greenwich, Conn., 91-125.
--- (1988), 'Interpreting Long-Run Equilibrium Solutions in Conventional Macro Models: A Comment', Economic Journal, 98: 808-17.
--- (1991), 'A Monte Carlo Study of the Effects of Structural Breaks on Unit Root Tests', in P. Hackl and A. H. Westlund (eds.), Economic Structural Change: Analysis and Forecasting, Springer-Verlag, Vienna, 95-119.
--- and ERICSSON, N. R. (1990), PC-NAIVE: An Interactive Program for Monte Carlo Experimentation in Econometrics, Institute of Economics and Statistics, Oxford University, Oxford.
--- PAGAN, A. R., and SARGAN, J. D. (1984), 'Dynamic Specification', ch. 18 in Z. Griliches and M. D. Intrilligator (eds.), Handbook of Econometrics, ii, North-Holland, Amsterdam, 1023-100.
--- and RICHARD, J.-F. (1982), 'On the Formulation of Empirical Models in Dynamic Econometrics', Journal of Econometrics, 20: 3-33.
--- and UNGERN-STERNBERG, T. VON (1981), 'Liquidity and Inflation Effects on Consumers' Behaviour', ch. 9 in A. S. Deaton (ed.) Essays in the Theory and Measurement of Consumers' Behaviour, Cambridge University Press, 237-60.
HUNTER, J. (1992), 'Tests of Cointegrating Exogeneity for PPP and Uncovered Interest Rate Parity in the UK', Journal of Policy Modeling, 14: 453-64.
HYLLEBERG, S. (1991), Modelling Seasonality, Oxford University Press.
--- and MIZON, G. E. (1989a), 'Cointegration and Error Correction Mechanisms', Economic Journal (Supplement), 99: 113-25.
--- (1989b), 'A Note on the Distribution of the Least Squares Estimator of a Random Walk with Drift', Economics Letters, 29: 225-30.
--- ENGLE, R. F., GRANGER, C. W. J., and YOO, B. S. (1990), 'Seasonal Integration and Co-Integration', Journal of Econometrics, 44: 215-28.
ILMAKUNNAS, P. (1990), 'Testing the Order of Differencing in Quarterly Data: An Illustration of the Testing Sequence', Oxford Bulletin of Economics and Statistics, 52: 79-88.
IMHOF, P. (1961), 'Computing the Distribution of Quadratic Forms in Normal Variates', Biometrika, 48: 419-26.
JARQUE, C. M., and BERA, A. K. (1980), 'Efficient Tests for Normality, Homoskedasticity and Serial Independence of Regression Residuals', Economics Letters, 6: 255-9.
JAZWINSKI, A. H. (1970), Stochastic Processes and Filtering Theory, Academic Press, New York.
JOHANSEN, S. (1988), 'Statistical Analysis of Cointegration Vectors', Journal of Economic Dynamics and Control, 12: 231-54.
--- (1989), 'The Power of the Likelihood Ratio Test for Cointegration', mimeo, Institute of Mathematical Statistics, University of Copenhagen.
--- (1991a), 'Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian Vector Autoregressive Models', Econometrica, 59: 1551-80.
--- (1991b), 'A Statistical Analysis of Cointegration for I(2) variables', Institute of Mathematical Statistics, University of Copenhagen.
--- (1992a), 'Cointegration in Partial Systems and the Efficiency of Single Equation Analysis', Journal of Econometrics, 52: 389-402.
--- (1992b), 'Testing Weak Exogeneity and the Order of Cointegration in UK Money Demand', Journal of Policy Modeling, 14: 313-34.
--- and JUSELIUS, K. (1990), 'Maximum Likelihood Estimation and Inference on Cointegration - with Applications to the Demand for Money', Oxford Bulletin of Economics and Statistics, 52: 169-210.
KELLY, C. M. (1985), 'A Cautionary Note on the Interpretation of Long-Run Equilibrium Solutions in Conventional Macro Models', Economic Journal, 95: 1078-86.
KIVIET, J., and PHILLIPS, G. D. A. (1992), 'Exact Similar Tests for Unit Roots and Cointegration', Oxford Bulletin of Economics and Statistics, 54: 349-67.
KLEIN, L. R. (1953), A Textbook of Econometrics, Row, Peterson and Company, Evanston, Ill.
KOERTS, J., and ABRAHAMSE, A. P. J. (1969), On the Theory and Application of the General Linear Model, Rotterdam University Press.
KREMERS, J. J. M., ERICSSON, N. R., and DOLADO, J. (1992), 'The Power of Co-integration Tests', Oxford Bulletin of Economics and Statistics, 54: 325-48.
KWIATKOWSKI, D., PHILLIPS, P. C. B., and SCHMIDT, P. (1991), 'Testing the Null Hypothesis of Stationarity against the Alternative of a Unit Root: How Sure Are We that Economic Time Series Have a Unit Root', Cowles Foundation Discussion Paper No. 979.
LEYBOURNE, S. J., and MCCABE, B. P. M. (1992), 'A Simple Test for Cointegration', typescript, Nottingham University.
LIN, C.-F., and TERASVIRTA, T. (1991), 'Testing the Constancy of Regression Parameters against Continuous Structural Change', Discussion paper, University of California at San Diego.
MCCALLUM, B. T. (1984), 'On Low-Frequency Estimates of Long-Run Relationships in Macroeconomics', Journal of Monetary Economics, 14: 3-14.
MACKINNON, J. G. (1991), 'Critical Values for Co-Integration Tests', in R. F. Engle and C. W. J. Granger (eds.), Long-Run Economic Relationships, Oxford University Press, 267-76.
MANKIW, N. G., and SHAPIRO, M. D. (1985), 'Trends, Random Walks and Tests of the Permanent Income Hypothesis', Journal of Monetary Economics, 16: 165-74.
--- (1986), 'Do We Reject Too Often? Small Sample Properties of Tests of Rational Expectations Models', Economics Letters, 20: 139-45.
MANN, H. B., and WALD, A. (1943), 'On Stochastic Limit and Order Relationships', Annals of Mathematical Statistics, 14: 217-77.
MIZON, G. E. (1977), 'Model Selection Procedures', in M. J. Artis and A. R. Nobay (eds.), Studies in Modern Economic Analysis, Basil Blackwell, Oxford.
--- and HENDRY, D. F. (1980), 'An Empirical Application and Monte Carlo Analysis of Tests of Dynamic Specification', Review of Economic Studies, 47: 21-45.
MORGAN, M. S. (1990), The History of Econometric Ideas, Cambridge University Press.
MOSCONI, R., and GIANNINI, C. (1992), 'Non-Causality in Cointegrated Systems: Representation, Estimation and Testing', Oxford Bulletin of Economics and Statistics, 54: 399-417.
NANKERVIS, J. C., and SAVIN, N. E. (1985), 'Testing the Autoregressive Parameter with the t-statistic', Journal of Econometrics, 27: 143-61.
--- (1987), 'Finite Sample Distributions of t and F Statistics in an AR(1) model with an Exogenous Variable', Econometric Theory, 3: 387-408.
NELSON, C. R., and KANG, H. (1981), 'Spurious Periodicity in Inappropriately Detrended Time Series', Journal of Monetary Economics, 10: 139-62.
NEWEY, W. K., and WEST, K. D. (1987), 'A Simple Positive Semi-Definite Heteroskedasticity and Autocorrelation-Consistent Covariance Matrix', Econometrica, 55: 703-8.
NOWAK, E. (1990), 'Hidden Cointegration', Discussion paper, University of California at San Diego.
NYBLOM, J. (1989), 'Testing for the Constancy of Parameters over Time', Journal of the American Statistical Association, 84: 223-30.
OSBORN, D. R., CHIU, A. P. L., SMITH, J. P., and BIRCHENHALL, C. R. (1988), 'Seasonality and the Order of Integration for Consumption', Oxford Bulletin of Economics and Statistics, 50: 361-78.
OSTERWALD-LENUM, M. (1992), 'A Note with Fractiles of the Asymptotic Distribution of the Maximum Likelihood Cointegration Rank Test Statistics: Four Cases', Oxford Bulletin of Economics and Statistics, 54: 461-72.
PANTULA, S. G. (1991), 'Testing for Unit Roots in Time Series Data', Econometric Theory, 5: 265-71.
PARK, J. Y., and PHILLIPS, P. C. B. (1988), 'Statistical Inference in Regressions with Integrated Processes: Part I', Econometric Theory, 4: 468-97.
PERRON, P. (1988), 'Trends and Random Walks in Macroeconomic Time Series: Further Evidence from a New Approach', Journal of Economic Dynamics and Control, 12: 297-332.
--- (1989), 'The Great Crash, the Oil Shock and the Unit Root Hypothesis', Econometrica, 57: 1361-402.
PHILLIPS, P. C. B. (1986), 'Understanding Spurious Regressions in Econometrics', Journal of Econometrics, 33: 311-40.
--- (1987a), 'Time Series Regression with a Unit Root', Econometrica, 55: 277-301.
--- (1987b), 'Towards a Unified Asymptotic Theory of Autoregression', Biometrika, 74: 535-48.
--- (1988a), 'Reflections on Econometric Methodology', Economic Record, 64: 344-59.
--- (1988b), 'Multiple Regression with Integrated Time Series', Contemporary Mathematics, 80: 79-105.
--- (1991), 'Optimal Inference in Co-integrated Systems', Econometrica, 59: 282-306.
--- and DURLAUF, S. N. (1986), 'Multiple Time Series Regression with Integrated Processes', Review of Economic Studies, 53: 473-95.
--- and HANSEN, B. E. (1990), 'Statistical Inference in Instrumental Variables Regression with I(1) Processes', Review of Economic Studies, 57: 99-125.
--- and LORETAN, M. (1991), 'Estimating Long-Run Economic Equilibria', Review of Economic Studies, 58: 407-36.
--- and OULIARIS, S. (1988), 'Testing for Co-integration using Principal Components Methods', Journal of Economic Dynamics and Control, 12: 205-30.
--- (1990), 'Asymptotic Properties of Residual Based Tests for Cointegration', Econometrica, 58: 165-93.
--- and PARK, J. Y. (1988), 'Asymptotic Equivalence of Ordinary Least Squares and Generalized Least Squares in Regressions with Integrated Variables', Journal of the American Statistical Association, 83: 111-15.
--- and PERRON, P. (1988), 'Testing for a Unit Root in Time Series Regression', Biometrika, 75: 335-46.
PRIESTLEY, M. B. (1989), Nonlinear and Nonstationary Time Series Analysis, Academic Press, New York.
QUANDT, R. E. (1978), 'Tests of Equilibrium vs. Disequilibrium Hypotheses', International Economic Review, 19: 435-52.
--- (1982), 'Econometric Disequilibrium Models', Econometric Reviews, 1: 1-63.
RAPPOPORT, P., and REICHLIN, L. (1989), 'Segmented Trends and Non-Stationary Time Series', Economic Journal, 99: 168-77.
REIMERS, H. E. (1991), 'Comparisons of Tests for Multivariate Co-integration', Discussion Paper no. 58, Christian-Albrechts University, Kiel.
RIPLEY, B. D. (1987), Stochastic Simulation, John Wiley, New York.
SAID, S. E., and DICKEY, D. A. (1984), 'Testing for Unit Roots in Autoregressive-Moving Average Models of Unknown Order', Biometrika, 71: 599-607.
SAIKKONNEN, P. (1991), 'Asymptotically Efficient Estimation of Cointegrating Regressions', Econometric Theory, 1: 1-21.
SAMPSON, M. (1991), 'The Effect of Parameter Uncertainty on Forecast Variances and Confidence Intervals for Unit Root and Trend Stationary Time Series Models', Journal of Applied Econometrics, 6: 67-76.
SARGAN, J. D. (1964), 'Wages and Prices in the United Kingdom: A Study in Econometric Methodology', in P. E. Hart, G. Mills, and J. K. Whitaker (eds.), Econometric Analysis for National Economic Planning, Butterworth, London; reprinted in D. F. Hendry and K. F. Wallis (eds.), Econometrics and Quantitative Economics, Basil Blackwell, Oxford, 1984.
SARGAN, J. D. (1980), 'Some Tests of Dynamic Specification for a Single Equation', Econometrica, 48: 879-97.
--- and BHARGAVA, A. (1983), 'Testing Residuals from Least Squares Regression for Being Generated by the Gaussian Random Walk', Econometrica, 51: 153-74.
SCHMIDT, P., and PHILLIPS, P. C. B. (1992), 'LM test for a Unit Root in the Presence of Deterministic Trends', Oxford Bulletin of Economics and Statistics, 54: 257-87.
SCHWERT, G. W. (1989), 'Tests for Unit Roots: A Monte Carlo Investigation', Journal of Business and Economic Statistics, 1: 147-59.
SHEPPARD, D. K. (1971), The Growth and Role of UK Financial Institutions 1890-1962, Methuen, London.
SIMS, C. A. (ed.) (1977), New Methods in Business Cycle Research, Federal Reserve Bank of Minneapolis.
--- STOCK, J. H., and WATSON, M. W. (1990), 'Inference in Linear Time Series with Some Unit Roots', Econometrica, 58: 113-44.
SPANOS, A. (1986), Statistical Foundations of Econometric Modelling, Cambridge University Press.
STOCK, J. H. (1987), 'Asymptotic Properties of Least-Squares Estimators of Co-integrating Vectors', Econometrica, 55: 1035-56.
--- and WATSON, M. W. (1988a), 'Variable Trends in Economic Time Series', Journal of Economic Perspectives, 2: 147-74.
--- (1988b), 'Testing for Common Trends', Journal of the American Statistical Association, 83: 1097-107.
--- (1991) 'A Simple MLE of Cointegrating Vectors in General Integrated Systems', Typescript, Northwestern University.
--- and WEST, K. D. (1988), 'Integrated Regressors and Tests of the Permanent Income Hypothesis', Journal of Monetary Economics, 21: 85-96.
TODA, H., and PHILLIPS, P. C. B. (1991), 'Vector Autoregressions and Causality', Cowles Foundation Discussion Paper, 997.
URBAIN, J.-P. (1992), 'On Weak Exogeneity in Error Correction Models', Oxford Bulletin of Economics and Statistics, 54: 187-207.
WEST, K. D. (1988), 'Asymptotic Normality, when Regressors have a Unit Root', Econometrica, 56: 1397-418.
WHITE, H. (1980), 'A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity', Econometrica, 48: 817-38.
--- (1984), Asymptotic Theory for Econometricians, Academic Press, New York.
WICKENS, M. R., and BREUSCH, T. S. (1988), 'Dynamic Specification, the Long Run and the Estimation of Transformed Regression Models', Economic Journal, 98 (Conference 1988): 189-205.
WOLD, H. (1954), A Study in the Analysis of Stationary Time Series, Almqvist and Wiksell, Stockholm.
YULE, G. U. (1926), 'Why Do We Sometimes Get Nonsense Correlations Between Time Series? A Study in Sampling and the Nature of Time Series', Journal of the Royal Statistical Society, 89: 1-64.
Acknowledgements for Quoted Extracts
The authors are grateful to the following for permission to reproduce extracts:
Elsevier Science Publishers, for material from N. G. Mankiw and M. D. Shapiro (1986), 'Do we reject too often: Small-sample properties of rational expectations models', Economics Letters, 20: 142-3.
The Review of Economic Studies, for material from P. C. B. Phillips and B. E. Hansen (1990), 'Statistical Inference in Instrumental Variables Regression with I(1) Processes', Review of Economic Studies, 57: 116-17.
The Econometric Society, for material from D. A. Dickey and W. A. Fuller (1981), 'Likelihood Ratio Statistics for Autoregressive Time Series with a Unit Root', Econometrica, 49: 1062-3.
David A. Dickey, Professor of Statistics, North Carolina State University.
John Wiley & Sons, Inc., for material from Wayne A. Fuller (1976), Introduction to Statistical Time Series, 371-3.
Author Index
Abadir, K. M. 126, 128
Abrahamse, A. P. J. 104
Ahn, S. K. 305
Anderson, G. J. 5, 50, 140
Anderson, H. 272
Anderson, T. W. 70n., 265n., 285
Andrews, D. W. K. 310
Banerjee, A. 55, 95, 97, 163, 166, 177n., 187, 191, 192, 214, 215, 220, 222, 230, 233, 306, 307
Bardsen, G. 47, 53, 56, 62, 235
Bewley, R. 47, 49, 53, 152, 305
Bhargava, A. 101, 104, 155, 176, 207, 209
Billingsley, P. 24, 89
Birchenhall, C. R. 122
Bossaerts, P. 298
Boswijk, H. P. 235, 305, 307, 310
Box, G. E. P. 10, 13, 121, 305
Brandner, P. 282
Breusch, T. S. 47, 55, 56, 59, 62, 63, 64
Campbell, B. 167n.
Campbell, J. Y. 306
Campos, J. 236
Chan, N. H. 91, 96n.
Chiu, A. P. L. 122
Choi, I. 306
Chong, Y. Y. 282
Chow, G. C. 194n.
Chu, C.-S. J. 310
Clements, M. P. 282, 283, 285
Davidson, J. E. H. 5, 50, 52, 140, 300
Davidson, R. 16, 28
Deaton, A. S. 53
Dickey, D. A. 8, 24, 82, 100, 103, 107, 108, 112-23, 169
Dolado, J. J. 55, 97, 163, 166, 177n., 187, 191, 192, 230
Dufour, J.-M. 167n.
Durlauf, S. N. 82, 92, 93, 182, 203, 238, 254, 262n.
Engle, R. F. 6, 7, 17, 18, 19, 43, 67, 84n., 121, 122, 137n., 145, 146, 152, 157-9, 163, 205n., 208, 209, 211, 215, 231, 242, 256, 261, 278, 279, 282, 283, 287, 288, 305, 309
Ericsson, N. R. 18, 28, 29, 41, 230, 232, 236, 238, 269, 292, 301
Ermini, L. 32, 193-7
Evans, G. B. A. 104
Fisher, I. 65
Fisher, L. 305
Frances, P.-H. 235
Friedman, M. 29, 190, 194
Fuller, W. A. 8, 13, 14, 15, 24, 26, 100-3, 106, 107, 112-23, 169
Galbraith, J. W. 55, 98, 166, 177n., 191
Gantmacher, F. R. 140
Gel'fand, J. M. 140
Ghysels, E. 121
Giannini, C. 310
Gonzalo, J. 240, 285, 286, 293, 294, 296-8
Granger, C. W. J. 6, 7, 32, 43, 69, 70, 81, 83, 84n., 121, 137n., 138, 139, 145, 146, 157-9, 196, 205n., 208, 209, 215, 231, 256, 257, 260, 261, 272, 278, 285, 287, 307, 309, 310
Gregoir, S. 304
Grimmet, G. R. 96
Haldrup, N. 96
Hall, A. 107, 119, 130, 133, 306
Hall, A. D. 272
Hall, P. 23, 24, 89n., 179n.
Hall, R. E. 164, 165, 177
Hallman, J. 32, 121
Hammersley, J. M. 28
Handscombe, D. C. 28
Hansen, B. E. 176, 194, 238-41, 246, 248-51, 261, 294, 299, 310
Harvey, A. C. 303
Hasza, D. P. 122, 123
Hendry, D. F. 5, 17, 28, 29, 32, 41, 47, 48, 49, 50, 53, 65, 95, 101, 140, 162, 163, 193-5, 197, 221, 229, 231-3, 235, 236, 238, 269, 278, 279, 282, 283, 285, 288, 292, 300, 301, 306-309
Heyde, C. C. 23, 24, 89n., 179n.
Hunter, J. 310
Hylleberg, S. 96, 121-3, 152, 170
Ilmakunnas, P. 121
Imhof, P. 104, 207
Jenkins, G. M. 10, 13, 121
Johansen, S. 43, 96, 146, 151, 153, 211, 256, 257, 260, 262, 265, 268, 271, 272, 277, 287, 288, 290, 292, 294, 297, 298, 307, 310
Juselius, K. 271, 272, 277, 290, 310
Kang, H. 191
Kelly, C. M. 47, 64, 65, 66
Kiviet, J. 104, 105, 169n., 232
Klein, L. R. 310
Koerts, J. 104
Kremers, J. M. J. 230-3, 306
Kunst, R. 282
Kwiatkowski, D. 304
Laroque, G. 304
Lee, H. S. 121
Lee, T.-H. 287, 307
Lin, C.-F. 310
Loretan, M. 163, 288, 291
Leybourne, S. J. 304
McCabe, B. P. M. 304
McCallum, B. T. 47, 64-6
MacKinnon, J. G. 16, 28, 211, 213, 214
Mankiw, N. G. 164, 165, 166, 177n., 191
Mann, H. B. 14
Mizon, G. E. 101, 152, 162, 170, 231, 235, 278, 285, 288, 292, 300, 307
Morgan, M. S. 5, 308
Mosconi, R. 310
Muellbauer, J. N. J. 53
Murphy, A. 53
Nankervis, J. C. 104
Neale, A. J. 47, 65, 221, 309
Nelson, C. R. 191
Newbold, P. 69, 70, 81, 83, 138, 139, 191
Newey, W. K. 111
Nowak, E. 308
Nyblom, J. 310
Orden, D. 305
Osborn, D. R. 122, 123
Osterwald-Lenum, M. 268-76, 292
Ouliaris, S. 133, 134, 208, 210, 211
Pagan, A. R. 48
Pantula, S. G. 120, 121, 306
Park, J. Y. 176, 238
Perron, P. 107, 109, 111-19, 133, 248n., 304, 306
Phillips, G. D. A. 104, 105, 169n., 232
Phillips, P. C. B. 22, 24, 43, 71, 72, 81-3, 86-8, 90-3, 95, 96, 101, 107, 109, 111, 113, 114, 119, 129, 133, 134, 163, 175, 176, 179n., 182, 203, 208, 210, 211, 222, 230, 238-41, 242-51, 254, 261, 262n., 277, 288, 290, 291, 294, 304-6, 310
Ploberger, W. 310
Priestley, M. B. 40
Quandt, R. E. 3
Rappoport, P. 309
Reichlin, L. 309
Reimers, H. E. 286
Reinsel, G. C. 305
Richard, J.-F. 18, 162
Ripley, B. D. 28
Rothenberg, T. 220n.
Said, S. E. 82, 107, 108, 113
Saikkonnen, P. 305
Sampson, M. 282
Sargan, J. D. 5, 48, 50, 101, 140, 155, 176, 207, 209, 229, 231, 238, 285
Savin, N. E. 104
Schmidt, P. 101, 304, 306
Schwartz, A. J. 29, 194
Schwert, G. W. 82, 114, 119, 130, 248n.
Shapiro, M. D. 164-6, 177n., 191
Sheppard, D. K. 139
Sims, C. A. 43, 125, 162, 168, 178, 186-9
Smith, G. W. 163
Spanos, A. 12, 16, 72, 162
Stirzaker, D. R. 96
Stock, J. H. 43, 119, 152, 158, 163, 172, 177, 178, 185-90, 192, 211, 278, 291, 294, 296-8
Terasvirta, T. 310
Tiao, G. C. 305
Toda, H. 310
Tran, H.-A. 236, 301
Ungern-Sternberg, T. von 288
Urbain, J.-P. 307
Wald, A. 14, 43
Watson, M. W. 119, 152, 178, 187-90, 211, 278, 291, 294, 298
Wei, C. Z. 91, 96n.
West, K. D. 105, 111, 169, 171, 172, 177, 178, 185-7, 189n., 192
White, H. 15, 16, 27, 86, 89, 90, 310
Wickens, M. R. 47, 55, 56, 59, 62, 63, 64
Wold, H. 257
Yeo, S. 5
Yoo, B. S. 121, 152, 208, 209, 278, 279, 282, 283, 287, 305
Yule, G. U. 69, 70n., 71, 77, 138
Subject Inde x absolute summabilit y 15 8 adjustment: coefficient 15 5 disequilibrium 51 , 52, 55, 61 speed of 26 8 approximation theore m 12 3 asymptotic: convergence 15 8 independence 16 , 17 normality 105 , 126, 134, 163, 177, 178, 180, 185 ; and drif t ter m 169-7 4 asymptotic standar d erro r (ASE ) 235 Augmented Dickey-Fulle r tes t (ADF ) 106 , 108, 109 , 207-12, 232-4 , 238, 239 n. asymptotic distributio n 127 , 128 comparison wit h non-parametrically ad justed D F 114- 9 use o f IV i n 11 9 autocorrelation 13 , 71-2, 83 , 129, 163, 191, 206, 207, 212, 221 n., 238-42, 244, 286, 29 2 function 12 , 1 3 autocovariance functio n 12 , 13 autoregressive: -distributed lag (ADL) model 47-55 , 60-4, 224 , 239, 242 error 83 , 114 , 191, 291 process 12 , 72, 251, 257-60; see also autoregressive moving-average (ARMA) proces s representation (VAR) , see co-integrat ing: representations o f co-integrate d systems autoregressive integrate d moving-averag e (ARIMA) process 13 , 38, 39, 221 autoregressive moving-averag e (ARMA ) process 12 , 13, 39, 84 , 85 , 88 , 107, 108 examples o f 32- 8 Bardsen transformation , se e transformation: Bardse n Bartlett windo w 24 8 Bewley: representation 152 , 153 transformation, se e transformation : Bewley bias 67 , 68, 191 , 244, 246-8, 249, 250, 290 , 309 in AR(1 ) parameter 100 , 101 correction ter m 241 , 246
  in estimates of co-integrating vector 162-3, 214-30, 238, 239, 246, 250, 252
  second-order 163, 176, 238, 240, 246, 296, 297
  simultaneity 238, 241, 297, 298
borderline-stationary 39, 95, 166, 208, 225
  see also near-integrated process
bounds test 133, 134
Brownian motion 21, 89, 152, 153, 241, 243, 246, 247, 255, 278, 296, 297
  see also Wiener process
  vector 200-3
Cayley-Hamilton theorem 140
central limit theorem 16, 73, 88, 89, 171, 295
  functional (FCLT), see functional central limit theorem
  Liapunov 16, 27, 44
  Lindeberg-Feller 27
co-integrating:
  combination 279, 283, 288
  parameters 215, 220, 222, 224, 248
  rank 145, 146, 262
  regression 191, 220, 229, 230; asymptotic theory of 174-7
  representations of co-integrated systems (EC, MA, VAR) 146, 153-7, 257-61
  vector 137, 138, 145, 158, 159, 163, 205, 214, 236, 248, 252-6, 262, 267, 268, 276, 277, 285, 289, 290, 293; asymptotic distribution of estimators of 293-8; biases in estimation of, see bias; generalized 179; invariance of 300-3
co-integration 6-8, 67, 136-61, 167, 189, 255, 268, 300, 308
  definition 145
  in logarithms or levels 198, 199
  multi- 287, 307
  seasonal 121, 151
  space 256, 266-99, 273, 279
  system 257, 260, 261
  testing for 9, 134, 176, 205-52, 286; table of critical values 213; test power 230-5
common factor 13, 101, 231, 233, 235, 238, 239, 285, 296
common trend 152, 153, 278
companion form 143, 181-3, 272
concentrated series 88, 89, 263, 264, 272
conditioning, improper 244, 245
constant, inclusion of 212-19
continuous mapping theorem 89, 90
convergence:
  in distribution 16
  of functionals of Wiener processes 91, 183
  in probability 14, 15, 16, 86, 157, 176, 185
  to random variable 86, 89
  rate of 14, 125, 158-9, 168
  weak 23, 89
Cramer's theorem 173, 177
cross-equation restrictions 155, 245
decomposition 179, 240, 260, 296
deterministic trend, see trend: non-stochastic
de-trending 70, 82, 83, 191
  spurious 92-3
diagonalization 265, 266, 273, 290
Dickey-Fuller:
  distribution/critical values 97, 98, 100-3, 105, 106, 121, 129-32, 167, 169, 170, 210-11, 268; tables 102-3
  test (DF) 101, 104-10, 112, 114-19, 207-12, 231, 233, 235, 236, 238, 239n., 267; asymptotic distribution of 124-7; tests on more than one parameter 113, 114, 116
differencing 11, 30, 99, 111, 119, 134, 139, 147, 153, 158, 168, 192, 199, 300
  seasonal 121, 122
diffusion process 96
discontinuity 95, 96
Donsker's theorem 89
drift term 9, 72, 101, 106, 108, 111, 151
  see also trend: non-stochastic
dummy variable 134, 270-6, 288
Durbin-Hausman tests 306
Durbin-Watson test 73, 81, 93
  in co-integrating regression (CRDW test) 176, 207-8, 235-6
dynamic:
  estimator 223, 224-30, 237, 243, 244, 247-51
  modelling/regression 5, 8, 46, 47, 50, 51, 106, 163, 167-71, 177, 178, 192, 214, 221n., 222-4, 225-6, 229, 239, 243, 246, 247
  omitted dynamics 157, 220, 229
  specification 168, 240, 242-4
  system 278
Edgeworth expansion 239n.
eigen-:
  value 134, 140, 143, 144, 179, 265, 266, 267, 268, 270, 277, 292, 298
  vector 265, 270, 292, 298
empirical data/results 29-32, 40-2, 52-3, 159, 194-7, 235-8, 269-71, 292, 293
encompassing 193, 198, 238
endogeneity 176, 246
Engle-Granger:
  theorem 159-62
  two-step procedure 153, 157-61, 205n., 278, 285, 283
equilibrium:
  dis- 2
  multiplier, see long-run: multiplier
  relationship 2-9, 46, 47, 50, 54, 55, 136-9, 192, 205
  state 2, 4
  static 48
ergodicity 16, 17, 88, 89
error-correction 5, 6, 47, 51, 55, 63, 64, 96, 224n., 246
  mechanism 5-7, 51-4, 139, 140, 151, 232, 234, 238, 268, 270-5, 278, 279, 294, 300, 304
  model 47, 49-52, 55, 63, 158, 159, 239, 243, 256, 257, 260, 261, 268, 274, 277-9, 290; generalized 50, 52, 60, 61
  representation 138, 139, 153; definition of 145; derivation of 154-7
  term 50-3, 60, 61, 140, 151, 155, 157, 262
exact test 105
exogeneity 17-18, 288
  strict 19, 67
  strong 18, 20, 222-3, 244, 252, 291
  super 18-20
  in unit root tests 107
  weak 18, 20, 65-8, 163, 168, 192, 204, 223, 240, 243-5, 248, 251-2, 261, 268, 288-91, 295; importance in co-integrated processes 252, 307
finite sample biases, see bias
Fisher effect 65
forecasting 278-85
  multi-step 18, 19
frequency:
  domain 88n.
  zero v. seasonal 122
Frisch-Waugh theorem 70n.
full-information maximum-likelihood (FIML) 238, 239, 241, 245, 250, 297, 298
fully modified estimation 238-41, 243, 244, 246-50
  estimator 243, 244, 247, 248, 249, 250
  method 239, 240
functional central limit theorem (FCLT) 22, 89, 124-7, 261, 295, 299
generalized co-integrating vector 179
general-to-specific modelling 168, 192
Granger causality 18, 291
Granger Representation Theorem 48, 146-53, 300
homogeneity 47, 51, 52, 60, 61, 221, 222, 231, 236
impact matrix 151, 260
inconsistent regression 164-8, 190, 191, 229, 230
innovation sequence 12, 85-7, 183
instrumental variables (IV) 55, 59, 62, 63, 119, 130-3
integrated process 1, 6, 7, 11, 12, 21, 39, 69-71, 73, 136-8, 162-99
  asymptotic theory of 86-91
  near-, see near-integrated process
  properties of 84-6
  see also non-stationary process
integration:
  order of, see order of integration
  seasonal, see seasonal integration
intercept 72, 151, 210, 232, 234, 271, 272, 273, 274
interim multiplier representation 153
invariance 20, 282, 283
  principle 22; see also functional central limit theorem
invertibility 13, 84, 108, 242
invertible system 148, 149, 258, 259, 266
Jacobian 62, 63
Johansen maximum-likelihood procedure 211, 262-9, 285, 286, 300
  power of 277, 278
Kronecker product 181
lag 9, 11, 47, 50, 52, 66, 106-8, 123, 225, 248, 250, 251, 286, 303
  length 248, 286
  mean 287
  polynomial 229
  structure 208, 222, 229
  truncation parameter 110, 111, 113
latent root 13, 104, 142, 144, 158, 224
law of large numbers 86, 90
life-cycle hypothesis 164, 188
likelihood ratio tests 153, 277, 278, 294, 295
limited-information maximum-likelihood (LIML) 264, 285
linear system 300
logarithms v. levels 29-32, 193-7
long-run:
  covariance matrix 240, 241, 245-7, 252, 290
  multiplier 8, 47-9, 51, 54, 57, 59-64, 188, 230, 235, 293, 295, 296; variance of estimates of 61-4
  relationship 2, 7, 8, 140, 220; see also co-integrating: vector
  response 153
  solution 50, 64-8
marginal:
  distribution 18, 19, 290, 295
  process 240, 243-5, 248n.
marginalization 304
market clearing 3
martingale difference sequence (MDS) 11, 12, 21, 163, 179n., 185, 242, 244, 245, 247
maximal-eigenvalue statistic 267, 273
maximum-likelihood 159, 241-5, 256, 262, 264, 265, 266, 267, 269, 277, 283, 285, 286, 288
  full-information, see full-information maximum-likelihood
  limited-information, see limited-information maximum-likelihood
mean lag 144, 287, 301
memory 85
mixing:
  coefficient 87
  strong 16, 17, 87
  uniform 16, 17
mixingale 179n.
Monte Carlo:
  method 9, 27, 28
  response surfaces 28, 211, 213, 214
  results 73-83, 101, 106, 108, 114, 117-19, 133, 165, 214, 215, 222-3, 225-9, 232-5, 248-51, 279, 282, 283, 285, 291, 298
  standard error 75
moving-average 12, 88; see also autoregressive moving-average (ARMA) process
  component of errors 107
  negative components 113, 119, 250, 304
  parameter 248n.
  representation 133, 153, 155, 156
  seasonal filter 121
multiple roots 119-22
multiplier, long-run, see long-run: multiplier
near-integrated process 95-7, 99, 164, 166, 225, 231, 277
nearly-inconsistent regression 229, 230
non-centrality parameter 97, 98
non-parametric:
  correction/test 9, 108-10, 114-9, 130, 208, 210, 211, 238-40, 251
  asymptotic theory of 129-30
  estimation 244, 248, 249
nonsense regression 69, 80, 138
  see also spurious regression
non-stationarity 4, 8, 9, 65, 67, 72, 81-4, 134, 150, 215
  transformation to stationarity 69, 70, 82, 83, 99, 134, 147
non-stationary process 5, 6, 9, 38, 39, 70, 71, 81, 163, 244
  v. integrated process 12
normality 180, 289
  asymptotic, see asymptotic: normality
normalization 57-9, 265, 285
nuisance parameters 100, 104-6, 172, 176, 207, 210
order:
  of magnitude 14, 15, 21, 90
  in probability 14, 15
order of integration 6-9, 48, 79-80, 84, 85, 147, 151, 190-2, 258
  defined 84
  first 137, 177
  higher 138, 157, 163
  zero 137
Ornstein-Uhlenbeck process 96
orthogonal complement 147
orthogonality 86, 149, 151, 242, 244, 245, 258n., 259, 260, 273
  asymptotic 107
  testing 164-8
over-identification test 278, 300
over-rejection 206, 210, 286
parameterization 48, 207, 208, 250, 274, 275
  of dynamics 221
  exact 105, 224
  of nearly-integrated processes 95
  over-/under- 224-9, 262
permanent income hypothesis 164, 177, 178, 188, 190
Perron-Phillips/Phillips test, see non-parametric: correction/test
polynomial matrices 140-5, 152, 257
  isomorphism with companion matrices 142-4
power series expansion 97
power of tests 8, 15, 96, 101, 108, 113, 198, 208, 214, 223-4, 230-5, 277, 278, 286
pre-determinedness 19
random walk 11, 21, 22, 24-9, 38, 71, 72, 82, 87, 93, 100, 101, 114, 191, 220, 272
  in logarithms or levels 193n.
  see also unit root
rank:
  co-integrating, see co-integrating: rank
  full 56, 58, 59, 144, 147, 151, 181, 258, 260, 287
  reduced 144, 147, 151, 256, 257, 264, 285, 287, 288, 301
recursive estimation 194n., 221n.
re-parameterization 67, 157, 168, 189, 191, 222
  see also transformation
representation theorem, see Granger Representation Theorem
Said-Dickey test 107, 108
  compared with Perron-Phillips test 113
Sargan-Bhargava test, see Durbin-Watson test (CRDW test)
Schwarz Criterion 194, 286
seasonal adjustment filter 301, 303
seasonal integration 121-3
sequential cut 18, 19
similar tests 100, 104, 105, 169n.
size distortions 113, 133, 166, 167
Slutsky's theorem 89, 173
spurious:
  correlation 70, 71; in de-trended random walks 82, 83
  regression 69-81, 83, 92-5, 134, 138-9, 158, 159, 162, 191, 230, 255
stacked form, see companion form
static regression 162, 163, 167, 205, 214, 220-3, 231, 238, 246, 251, 296
  comparison with dynamic 167, 168, 224-30
  example of 236
  see also Engle-Granger: two-step procedure
stationarity 1, 4, 12, 13, 17, 69, 212, 262
stationary process 4, 5, 6, 7, 9, 11, 29, 38, 39, 47, 85, 86, 134, 138, 256, 257, 267, 279
  strictly 11, 12
  weakly/second-order/covariance 11, 12
stochastic:
  differential equation 96
  trend, see trend, stochastic
structural representation 261, 303
super-consistency 158, 176, 191, 214, 220, 230, 251, 294, 296
total effect 142, 257
trace 267, 273
transformation 6, 28-32, 88, 111, 125, 178-80, 185
  ADL 51, 59
  ADL to ECM 60, 61 300, 301
  Bardsen 51, 54-9, 62, 63
  Bewley 51, 53-6, 58n., 59, 60, 62, 63
  equivalence of 54-60, 62, 64
  linear 47, 51, 60, 61, 63, 64, 145, 152, 178, 224; in dynamic regression 167-8, 177, 178; of polynomial matrices 144, 145
  logarithmic 99, 192-9
trend (inclusion of) 5, 9, 82, 100, 101, 106, 125, 185, 211, 212, 213, 214, 236
  non-stochastic (deterministic) 6, 20, 21, 69-72, 82, 84, 125, 146, 151, 172, 173, 185, 187, 275
  stochastic 153, 169, 172, 174, 179, 180, 185, 187, 191; see also common trend; unit root
  sums of powers of 20
unit circle 13, 104, 123, 141, 149, 158
unit root 8, 9, 13, 38, 72, 83-6, 95, 96, 133, 144, 147, 163, 177, 185, 215, 236, 255, 258-60, 267, 270, 287, 289
  multiple 122
  near- 95, 99; see also near-integrated process
  in polynomial matrix 141
  testing for 8, 96, 99-135, 206, 211, 215, 306; descriptive value 306; in marginal processes 306; at seasonal frequency 120-3
variance-covariance matrix 62, 107, 183, 189, 243, 252-4, 273
  long-run 248, 249
vector autoregression (VAR) 278, 279, 283, 291, 292
vectoring operator 181, 273
Wald statistic 127, 188, 239
Wiener process 21-3, 26, 86-91, 93, 96, 131, 188, 189, 241, 261, 268
  distribution 191, 221
  functional of 24, 90, 93, 125-8, 163, 188, 300
  multivariate 182-4, 200-3, 268
white noise 11, 12, 22, 87, 106, 231
Wold Decomposition Theorem 257, 258