Co-integration, Error Correction, and the Econometric Analysis of Non-Stationary Data (Advanced Texts in Econometrics)

ADVANCED TEXTS IN ECONOMETRICS. General Editors: C. W. J. Granger and G. E. Mizon

Author: Anindya Banerjee | Juan Dolado | J. W. Galbraith | David Hendry



CO-INTEGRATION, ERROR CORRECTION, AND THE ECONOMETRIC ANALYSIS OF NON-STATIONARY DATA

Anindya Banerjee, Juan J. Dolado, John W. Galbraith, and David F. Hendry

OXFORD UNIVERSITY PRESS

This book has been printed digitally and produced in a standard specification in order to ensure its continuing availability

Great Clarendon Street, Oxford OX2 6DP

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide in

Oxford New York

Auckland Bangkok Buenos Aires Cape Town Chennai Dar es Salaam Delhi Hong Kong Istanbul Karachi Kolkata Kuala Lumpur Madrid Melbourne Mexico City Mumbai Nairobi Sao Paulo Shanghai Taipei Tokyo Toronto

Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries

Published in the United States by Oxford University Press Inc., New York

© A. Banerjee, J. J. Dolado, J. W. Galbraith, and D. F. Hendry 1993

The moral rights of the author have been asserted

Database right Oxford University Press (maker)

Reprinted 2003

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above

You must not circulate this book in any other binding or cover and you must impose this same condition on any acquirer

ISBN 0-19-828810-7

Preface

This book is intended as a guide to the literature on co-integration and modelling of integrated processes. Time-series econometrics has developed rapidly during the past decade, but especially so in the analysis of non-stationarity. In particular, the study of integrated processes has grown in importance from the status of an exotic topic, discussed only in technical journals, to being an essential part of the econometrician's collection of techniques. It has thereby developed into an area of interest for econometric theorists and applied econometricians alike. This book is aimed at graduate students in economics, applied econometricians, econometric theorists, and the general audience of economists who use empirical methods to analyse time series.

Despite the growing importance of the literature on integration and co-integration, most accounts of this literature remain confined to journals, edited collections of papers, or survey papers. While some of the surveys are quite detailed, space restrictions usually do not allow a full exposition of many of the theoretical points. This book attempts to bridge the gap between accounts such as surveys, which are mainly descriptive, and accounts that are mainly theoretical. It explains the important concepts informally and also presents them formally. The asymptotic theory of integrated processes is described and the tools provided by this theory are used to derive, in some detail, the distributions of estimators. By taking readers step by step through some of the main derivations, our hope is to make the theory readily accessible to a wide audience.

We have tried to make the book as self-contained as possible. A knowledge of econometrics, statistics, and matrix algebra at the level of a final-year undergraduate or first-year graduate course in econometrics is assumed, but otherwise all of the important statistical concepts and techniques are described.

A book such as this one, which discusses an area that is developing rapidly, is inevitably incomplete and runs the risk of not being quite up-to-date. To limit the time taken in writing and revising, we did not seek to chase a frontier that was expanding in many directions. Rather, the topics covered reflect our views of issues, models, and methods that are likely to remain important for some time to come, many of which will continue to provide the platform for future research.

Acknowledgements

Our book was written in two continents, three years, and four universities, so the list of people, across time, space, and departments, to whom we owe extensive debts of gratitude has grown formidably large. A major part of this debt is owed to the Departments of Economics at the Universities of California at San Diego, Florida in Gainesville, McGill, and Oxford, and the Bank of Spain, where the authors either worked or visited for substantial periods. Their generous support of our work is much appreciated.

The book has also benefited greatly from the patient scrutiny of several of our colleagues, who read the entire typescript and made detailed comments. We have pleasure in thanking Michael Clements, Rob Engle, Neil Ericsson, Tony Hall (and several of his students), Colin Hargreaves, Søren Johansen, Katarina Juselius, Teun Kloek, James MacKinnon, G. S. Maddala, Grayham Mizon, Jean-François Richard, Mark Rush, Neil Shephard, Timo Teräsvirta, and four anonymous referees for their help. They have made a great contribution to this book, and found many infelicities in earlier versions, but of course are not responsible for any that remain.

Early versions of the book were inflicted by us upon our graduate students. Among those who suffered from the confusion caused by obscure notation and prose, but continued unflinchingly, Hughes Dauphin, Carol Dole, Jesus Gonzalo, Catherine Liston, Claudio Lupi, Neil Rickman, and Geeta Singh deserve special thanks. We are also indebted to Julia Campos, Michael Clements, Steven Cook, Neil Ericsson, and Claudio Lupi for proof reading.

The financial support of the Economic and Social Research Council (UK) under grants B01250024 and R231184 and the Fonds pour la Formation des Chercheurs et l'Aide à la Recherche (Quebec) is gratefully acknowledged. Finally, we thank Andrew Schuller and the editors of this series, who remained encouraging about the project despite its many difficulties.

Oxford    A.B.
Madrid    J.J.D.
Montreal  J.W.G.
Oxford    D.F.H.

Contents

Notational Conventions, Symbols, and Abbreviations

1. Introduction and Overview
    1.1. Equilibrium relationships and the long run
    1.2. Stationarity and equilibrium relationships
    1.3. Equilibrium and the specification of dynamic models
    1.4. Estimation of long-run relationships and testing for orders of integration and co-integration
    1.5. Preliminary concepts and definitions
    1.6. Data representation and transformations
    1.7. Examples: typical ARMA processes
    1.8. Empirical time series: money, prices, output, and interest rates
    1.9. Outline of later chapters
    Appendix

2. Linear Transformations, Error Correction, and the Long Run in Dynamic Regression
    2.1. Transformations of a simple model
    2.2. The error-correction model
    2.3. An example
    2.4. Bårdsen and Bewley transformations
    2.5. Equivalence of estimates from different transformations
    2.6. Homogeneity and the ECM as a linear transformation of the ADL
    2.7. Variances of estimates of long-run multipliers
    2.8. Expectational variables and the interpretation of long-run solutions

3. Properties of Integrated Processes
    3.1. Spurious regression
    3.2. Trends and random walks
    3.3. Some statistical features of integrated processes
    3.4. Asymptotic theory for integrated processes
    3.5. Using Wiener distribution theory
    3.6. Near-integrated processes

4. Testing for a Unit Root
    4.1. Similar tests and exogenous regressors in the DGP
    4.2. General dynamic models for the process of interest
    4.3. Non-parametric tests for a unit root
    4.4. Tests on more than one parameter
    4.5. Further extensions
    4.6. Asymptotic distributions of test statistics

5. Co-integration
    5.1. An example
    5.2. Polynomial matrices
    5.3. Integration and co-integration: formal definitions and theorems
    5.4. Significance of alternative representations
    5.5. Alternative representations of co-integrated variables: two examples
    5.6. Engle-Granger two-step procedure

6. Regression with Integrated Variables
    6.1. Unbalanced regressions and orthogonality tests
    6.2. Dynamic regressions
    6.3. Functional forms and transformations
    Appendix: Vector Brownian Motion

7. Co-integration in Individual Equations
    7.1. Estimating a single co-integrating vector
    7.2. Tests for co-integration in a single equation
    7.3. Response surfaces for critical values
    7.4. Finite-sample biases in OLS estimates
    7.5. Powers of single-equation co-integration tests
    7.6. An empirical illustration
    7.7. Fully modified estimation
    7.8. A fully modified least-squares estimator
    7.9. Dynamic specification
    7.10. Examples
    Appendix: Covariance Matrices

8. Co-integration in Systems of Equations
    8.1. Co-integration and error correction
    8.2. Estimating co-integrating vectors in systems
    8.3. Inference about the co-integration space
    8.4. An empirical illustration
    8.5. Extensions
    8.6. A second example of the Johansen maximum likelihood approach
    8.7. Asymptotic distributions of estimators of co-integrating vectors in I(1) systems

9. Conclusion
    9.1. Summary
    9.2. The invariance of co-integrating vectors
    9.3. Invariance of co-integration under seasonal adjustment
    9.4. Structured time-series models and co-integration
    9.5. Recent research on integration and co-integration
    9.6. Reinterpreting econometrics time-series problems

References
Acknowledgements for Quoted Extracts
Author Index
Subject Index


Notational Conventions, Symbols, and Abbreviations

The following notational conventions will be used throughout the text:

Y, y                                  endogenous variables
X, Z, x, z                            exogenous variables, or vectors containing both y and z
Greek letters                         population values (parameters)
Greek letters with ^ or ~             sample values (estimates)
Bold lower case (Roman or Greek)      vectors
Bold upper case (Roman or Greek)      matrices

Equation numbers

Equations are numbered consecutively in each chapter and referred to within that chapter by this number alone. Equations from other chapters are referred to by the chapter number and equation number within chapter; e.g. the fifth equation in Chapter 2 is (5) within Chapter 2, and (2.5) elsewhere.

Symbols

L        lag operator
Δ        first-difference operator
⊗        Kronecker product
∀        for all
|x|      modulus or absolute value of x, where x is a scalar
|A|      determinant of A, where A is a matrix
x|y      x conditional on y
⇒        weak convergence
→ (d)    convergence in distribution
→ (p)    convergence in probability

Abbreviations

ADF         augmented Dickey-Fuller
ADL         autoregressive-distributed lag
AR          autoregression
ARIMA       autoregressive integrated moving average
ARMA        autoregressive-moving average
ARMAX       ARMA + additional exogenous processes
ASE         asymptotic standard error
BM          Brownian motion
CI(d, b)    co-integrated of order d, b
CLT         central limit theorem
COMFAC      common factor error representation
CRDW        co-integrating regression DW statistic
diag        diagonal matrix
d.f.        degrees of freedom
DF          Dickey-Fuller
DGP         data-generation process
DW          Durbin-Watson statistic
ECM         error-correction model/mechanism
ESE         (average) estimated standard error
FCLT        functional central limit theorem/s
FIML        full-information maximum likelihood
GLS         generalized least squares
GNP         gross national product
I(d)        integrated of order d
ID          independently distributed
IID         independently and identically distributed
IMA         integrated moving average
IN(μ, σ²)   independently and normally distributed with mean μ and variance σ²
IV          instrumental variables
LIML        limited-information maximum likelihood
MA          moving average
MDS         martingale difference sequence
MLE         maximum likelihood estimator
N(μ, σ²)    normally distributed with mean μ and variance σ²
NI          near-integrated
OLS         ordinary least squares
SC          Schwarz information criterion
SD          standard deviation
SE          standard error
SI          seasonally integrated
SSD         sample standard deviation
T           sample size or last observation in a time-series
TFE         total final expenditure
VAR         vector autoregression
var         variance
vec         vectorizing operator
W(r)        Wiener (Brownian motion) process with increments of variance r


1. Introduction and Overview

This book considers the econometric analysis of both stationary and non-stationary processes which may be linked by equilibrium relationships. It exposits the main tools, techniques, models, concepts, and distributions involved in econometric modelling of possibly non-stationary time-series data. Since the focus is on equilibrium concepts, including co-integration and error correction, the analysis begins with a discussion of the application of these concepts to stationary empirical models. Later we will show that integrated processes can be reduced to this case by suitable transformations that take advantage of co-integrating (equilibrium) relationships. In this chapter we will introduce some important concepts from time-series analysis and the theory of stochastic processes, and in particular the theory of Brownian motion processes. We also offer several empirical examples which use these concepts.

A significant re-evaluation of the statistical basis of econometric modelling took place during the 1980s. Its analytical basis expanded from the assumption of stationarity to include integrated processes. The effect of this shift is far from complete, but is already radical, influencing the choice of model forms, modelling practices, statistical inference, distribution theory, and the interpretation of many traditional concepts such as simultaneity, measurement errors, collinearity, forecasting, and exogeneity. This book attempts to analyse these issues, describe the tools necessary to investigate integrated processes, and relate the new methods to those more familiar to econometricians. Research is continuing at a rapid pace, and since this book cannot cover all of the techniques that have been explored, we will concentrate on those that we believe will remain useful.

Time-series econometrics is concerned with the estimation of relationships among groups of variables, each of which is observed at a number of consecutive points in time. The relationships among these variables may be complicated; in particular, the value of each variable may depend on the values taken by many others in several previous time periods. In consequence, the effect that a change in one variable has on another depends upon the time horizon that we consider. It is easy to imagine examples in which a change in one quantity has little or no effect on another at first and a substantial effect later. Alternatively, a variable may have a substantial effect on another for a time, but that effect may eventually die out.

It is useful, therefore, to distinguish what are often called 'short-run' relationships (those holding over a relatively short period) from 'long-run' relationships. The former relate to links that do not persist. For example, a sudden storm may temporarily reduce the supply of fresh fish and increase its price, but later fair weather will lead to the re-establishing of the earlier price if demand is unaltered. The long-run relationships determine the generally prevailing price-quantity combinations transacted in the market, and so are closely linked to the concepts of equilibrium relationships in economic theory and of persistent co-movements of economic time series in econometrics. Our first task is to clarify these concepts.

1.1. Equilibrium Relationships and the Long Run

An equilibrium state is defined as one in which there is no inherent tendency to change. A disequilibrium is any situation that is not an equilibrium and hence characterizes a state that contains the seeds of its own destruction. An equilibrium state may or may not have the property of either local or global stability; thus, it may or may not be true that the system tends to return to the equilibrium state when it is perturbed. However, we generally consider only stable equilibria, since unstable equilibria will not persist given that there are stochastic shocks to the economy. That is, equilibria are states to which the system is attracted, other things being equal. It may also be possible in some circumstances to view the forces tending to push the system back into equilibrium as depending upon the magnitude of the deviation from equilibrium at a given point in time.

Equilibrium may be either general or partial. In the latter case, a given market is viewed as having attained equilibrium in spite of the fact that we have not taken account of the feedback from other markets. In both cases, an equilibrium relationship is expressed through a function $f(x_1, x_2, \ldots, x_n) = 0$, which describes the relationships that hold among the $n$ variables $x_1$ to $x_n$ when the system is in equilibrium. The phrase 'long-run equilibrium' is also used to denote the equilibrium relationship to which a system converges over time. Over finite periods of time, the long-run or equilibrium relationships may fail to hold, but they will eventually hold to any degree of accuracy if the equilibrium is stable, and if the system does not experience further shocks from outside.

Expressed differently, a long-run equilibrium relationship entails a systematic co-movement among economic variables which an economic system exemplifies precisely in the long run; we will write equations representing such co-movements without time subscripts as, e.g., $x_1 = \beta x_2$ to denote a linear long-run relation between $x_1$ and $x_2$.

Our definition of equilibrium is therefore not that in which 'equilibrium' refers to clearing in a particular market and where 'disequilibrium' means that supply is not equal to demand, as in Quandt (1978, 1982): we use the term 'market-clearing' for the former and a 'non-clearing market' for the latter. A non-clearing market involves quantity rationing of some agents and, depending on the institutional structure, may or may not involve a deviation from an equilibrium functional relationship.

There is of course a connection between the meaning of 'equilibrium' used in econometrics by Quandt and others, and that used here, which is more common in time-series analysis. When a market clears, an equilibrium relationship of the type we have defined may also occur because clearing of that market may return the system to a state in which some functional relationship among observable variables holds. Our definition is intended to be general and therefore to incorporate market-clearing equilibria, as well as others which may arise through the behaviour of a variety of different types of systems. For example, we would say that an equilibrium relationship exists between aggregate consumption and income if consumption tends toward a fraction $\gamma$ of income in the absence of shocks which may temporarily perturb the relationship. This need not be an equilibrium in the Quandt (1978) sense, however, because it may not correspond to the clearing of markets. (All consumers may remain credit-rationed, for example.)

Even if shocks to a system are constantly occurring so that the economic system is never in equilibrium, the concept of long-run equilibrium may nonetheless be useful. The present is the long-run outcome of the distant past and, as will be made precise below, a long-run relationship will often hold 'on average' over time. Moreover, a stable equilibrium has the property that a given deviation from the equilibrium becomes more and more unlikely as the magnitude of the deviation is greater, so that one may be reasonably confident that the discrepancy between the actual relationship connecting variables and this long-run relationship is within certain bounds. Precise definitions are provided in Chapter 5.

Methods for investigating such long-run relationships are our concern here. An examination of these methods will lead us to discuss aspects of time-series analysis, of dynamic modelling in general, and of the rapidly growing literature treating co-integration, error correction, and inference from non-stationary data. The first step is to clarify the statistical notion of stationarity and its links to the concept of equilibrium.
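The statement in this section that a stable long-run relationship "will eventually hold to any degree of accuracy" once outside shocks cease can be made concrete with a small sketch. The adjustment rule used here (a fixed share `alpha` of the gap closed each period), the function name, and all parameter values are illustrative assumptions rather than anything specified in the text.

```python
def deviation_path(x1_0, x2, beta, alpha, periods):
    """Path of the deviation x1_t - beta * x2 with no further shocks.

    Assumed (hypothetical) adjustment rule, with x2 held fixed:
        x1_t = x1_{t-1} - alpha * (x1_{t-1} - beta * x2)
    so each period removes a share alpha of the previous deviation.
    """
    x1 = x1_0
    path = [x1 - beta * x2]
    for _ in range(periods):
        x1 = x1 - alpha * (x1 - beta * x2)
        path.append(x1 - beta * x2)
    return path


# Start 2 units above the long-run relation x1 = 2 * x2 and close half the gap each period.
path = deviation_path(x1_0=10.0, x2=4.0, beta=2.0, alpha=0.5, periods=20)
```

Under this rule the deviation is multiplied by `1 - alpha` each period, so it shrinks towards zero whenever `alpha` lies strictly between 0 and 2. Outside that range the gap does not shrink, which corresponds to an unstable equilibrium.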


1.2. Stationarity and Equilibrium Relationships

In economic theory, the concept of equilibrium is well established and well defined. The statistical concept of equilibrium centres on that of a stationary process, which will be defined formally below. A substantial body of methods is developing around the statistical features of equilibrium relationships among time-series processes, and the concepts of stationarity and particular forms of non-stationarity are crucial to these methods.

If a particular relationship such as $x_1 = \beta x_2$ emerges as the economic system is allowed to settle down, this will describe an equilibrium to an econometrician just as to a theorist. In actual time series, however, the relation $x_{1t} = \beta x_{2t}$ may never be observed to hold. Consequently, we look for ways of characterizing the relationships that can be observed to hold between $x_{1t}$ and $x_{2t}$.

Roughly speaking (again, terms will be defined precisely in Chapter 5), we say that an equilibrium relationship $f(x_1, x_2) = 0$ holds between two variables $x_1$ and $x_2$ if the amount $\varepsilon_t = f(x_{1t}, x_{2t})$ by which actual observations deviate from this equilibrium is a median-zero stationary process.¹ That is, the 'error' or discrepancy between outcome and postulated equilibrium has a fixed distribution, centred on zero, that does not change over time. This error cannot therefore grow indefinitely; if it did, the relationship could not have been an equilibrium one since the system is free to move ever further away from it. Of course, it may be difficult to distinguish in finite samples between an ever-growing discrepancy in an hypothesized equilibrium relationship and a random fluctuation; formal statistical tests for problems such as this are discussed in later chapters.

Given the characterization above, the short-run discrepancy $\varepsilon_t$ in an equilibrium relationship must have no tendency to grow systematically over time. However, since this error represents shocks that are constantly occurring and affecting economic variables, in a real economic system there is no systematic tendency for this error to diminish over time either. It would fall away to zero only if shocks were to cease.

This definition of an equilibrium relationship holds automatically when applied to series that are themselves stationary. For any two stationary series $\{x_{1t}\}$ and $\{x_{2t}\}$, irrespective of any substantive economic relationship between these two alone, a difference of the form $\{x_{1t} - b x_{2t}\}$ must be a stationary series for any $b$. Thus, whether or not there exists a non-zero $\beta$ which describes a true equilibrium relationship, corresponding to a non-zero derivative between $x_1$ and $x_2$, any arbitrarily chosen $b$ will meet the statistical equilibrium condition. This does not imply that we cannot use statistical methods to determine the parameters of a long-run relationship, but simply that one stage of the process, in which we look for a stationary discrepancy, is unnecessary.

However, this concept of statistical equilibrium is necessary and useful in examining equilibrium relationships between variables tending to grow over time. In such cases, if the actual relationship is $x_1 = \beta x_2$, the discrepancy $x_{1t} - b x_{2t}$ will be non-stationary for any $b \neq \beta$, since the discrepancy deviates from the true relationship by the constant proportion $(b - \beta)$ of the growing variable $x_{2t}$; only the true relationship can yield a stationary discrepancy. With more than two variables, however, there may be more than one equilibrium relation, and this leads to another of the statistical problems that is currently being pursued: the empirical determination of the number of equilibrium relationships between three or more non-stationary time series.

¹ Later we will consider more precisely the properties that the deviation must have. The requirement is usually stated as being that the deviation from the equilibrium relationship be integrated of order zero (see below); alternatively, we might impose only the weaker requirement that the unconditional expectation of the deviation from the equilibrium relationship be zero, implying that only the first moment need exist and be constant. For simplicity, we omit intercepts from the present discussion.
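The point that only the true coefficient yields a stationary discrepancy when the variables grow over time can be checked numerically. The following is a minimal simulation sketch: the data-generating process (a random walk with drift for $x_{2t}$, plus independent Gaussian errors), the function names, and every parameter value are assumptions chosen for illustration, not taken from the text.

```python
import random
import statistics


def simulate(beta=2.0, drift=0.5, n=2000, seed=0):
    """Generate x1 = beta * x2 + stationary error, where x2 grows over time."""
    rng = random.Random(seed)
    x1, x2 = [], []
    level = 0.0
    for _ in range(n):
        level += drift + rng.gauss(0.0, 1.0)            # x2: random walk with drift
        x2.append(level)
        x1.append(beta * level + rng.gauss(0.0, 1.0))   # deviation from x1 = beta * x2
    return x1, x2


def discrepancy(x1, x2, b):
    """Series x1_t - b * x2_t for a candidate coefficient b."""
    return [u - b * v for u, v in zip(x1, x2)]


x1, x2 = simulate()
sd_true = statistics.pstdev(discrepancy(x1, x2, 2.0))   # b equal to beta
sd_wrong = statistics.pstdev(discrepancy(x1, x2, 1.9))  # b slightly off
```

With `b` equal to the true coefficient the discrepancy is just the stationary error, so its spread stays near the error's standard deviation. With `b = 1.9` the discrepancy inherits the share $(\beta - b)$ of the growing variable, and its sample spread is much larger and keeps increasing with the sample length.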

1.3. Equilibrium and the Specification of Dynamic Models

Equilibrium relationships have played an explicit role in econometric modelling since its foundations (see Morgan 1990). If there exists a stable equilibrium $x_1 = \beta x_2$, the discrepancy $\{x_{1t} - \beta x_{2t}\}$ evidently contains useful information since on average the system will move towards that equilibrium if it is not already there. In particular, $(x_{1t-1} - \beta x_{2t-1})$ represents the previous disequilibrium. Suppose the equilibrium relationship is between a variable $\{y_t\}$ to be modelled and some series $\{z_t\}$ which is exogenous in an appropriate sense. If we let $x_{1t} = y_t$ and $x_{2t} = z_t$ to distinguish their status, and denote the equilibrium by $y = \beta z$, then the discrepancy, or error, $\{y_t - \beta z_t\}$ should be a useful explanatory variable for the next direction of movement of $y_t$. In particular, when $y_t - \beta z_t$ is positive, $y_t$ is too high relative to $z_t$, and on average we might expect a fall in $y$ in future periods relative to its trend growth. The term $(y_{t-1} - \beta z_{t-1})$, called an error-correction mechanism, is therefore sometimes included in dynamic regressions (see Sargan 1964, Hendry and Anderson 1977, and Davidson, Hendry, Srba, and Yeo 1978).

The true parameter $\beta$ characterizing the relationship is not known in general. This need not prevent the error-correction mechanism from being useful, however, since the unknown parameter can either be estimated separately in a prior analysis or estimated in the course of modelling the variable of interest. Moreover, the general error-correction mechanism can be shown to be equivalent to various other transformations of a general linear model incorporating past values of both the variable of interest and the explanatory variables (see Chapter 2). A particular advantage of the error-correction mechanism is that the extent of adjustment in a given period to deviations from long-run equilibrium is given by the estimated equation without any further calculation. Other forms of the estimated model are also convenient in that they allow the implied long-run relation itself to be seen directly. Considerations such as these are discussed in the following chapter.

The practice of exploiting information contained in the current deviation from an equilibrium relationship, in explaining the path of a variable, has benefited from the formalization of the concept of co-integration by Granger (1981) and Engle and Granger (1987). The informal definition of statistical equilibrium discussed above is based upon a special case of the definition of co-integration. Further, the practice of modelling co-integrated series is closely related to error-correction mechanisms: error-correcting behaviour on the part of economic agents will induce co-integrating relationships among the corresponding time series and vice versa.

A series that is tending to grow over time cannot be stationary (although it may possibly be stationary around some deterministic trend), but the changes in that series might be.
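The contrast between a growing series and its changes can be illustrated with a short simulation. This is a sketch under assumed conditions: a random walk with drift stands in for a series tending to grow over time, and the function names, sample size, drift, and seed are all hypothetical choices.

```python
import random
import statistics


def random_walk_with_drift(n=4000, drift=0.2, seed=1):
    """Cumulate Gaussian shocks plus a constant drift; the level tends to grow."""
    rng = random.Random(seed)
    level, series = 0.0, []
    for _ in range(n):
        level += drift + rng.gauss(0.0, 1.0)
        series.append(level)
    return series


def difference(series):
    """Period-to-period changes: series_t - series_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]


def half_means(series):
    """Sample means of the first and second halves of a series."""
    mid = len(series) // 2
    return statistics.mean(series[:mid]), statistics.mean(series[mid:])


levels = random_walk_with_drift()
changes = difference(levels)
level_means = half_means(levels)     # far apart: the level has no fixed mean
change_means = half_means(changes)   # both close to the drift
```

Comparing sub-sample means is only a rough diagnostic, not a formal test. It does show the pattern described: the level's mean shifts with the sample period, while the mean of the changes stays near the same value in both halves.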
T o tak e a mechanica l example, i f a n objec t ha s a fixe d averag e positio n aroun d whic h i t moves, alway s returnin g afte r som e interva l t o thi s positio n lik e a randomly perturbe d weigh t a t th e en d o f a spring , the n it s displacemen t may b e a stationar y series . A n objec t tha t ha s n o suc h fixe d positio n may nevertheles s hav e a velocit y (th e chang e i n positio n pe r uni t time) , or acceleratio n (th e chang e i n th e velocit y pe r uni t time) , tha t i s stationary. Fo r example , i f th e objec t i s movin g eve r furthe r fro m it s point o f origin , bu t wit h velocit y fluctuatin g aroun d som e fixe d positiv e mean accordin g t o a fixe d distributio n function , the n th e velocit y o f th e object i s a stationary series. A serie s is said t o be integrate d o f order 1 (1(1)) if , althoug h it is itself non-stationary, th e change s i n thi s serie s for m a stationar y series . I t i s said t o b e integrate d o f orde r 2 (1(2) ) if , althoug h th e change s ar e non stationary, th e changes in th e changes for m a stationar y series . I n othe r words, i f th e serie s mus t b e difference d exactl y k time s t o achiev e stationarity, the n th e serie s i s l(k), s o that a stationary serie s i s 1(0). W e will us e th e ter m 'integrate d process ' t o refe r t o a serie s wit h orde r o f integration strictl y greate r tha n zero : precis e definition s ar e give n i n Chapter 3 . We ca n no w conside r th e concep t o f co-integration , it s relatio n t o th e

definition of long-run equilibrium between series given above, and its use as part of a statistical description of the behaviour of time series that satisfy some equilibrium relationship. A simple example concerns two series, each of which is integrated of order 1. Assume that a long-run equilibrium relationship holds between them, and that it is linear: $x_1 = \beta x_2$. Then $(x_1 - \beta x_2)$ must be equal to zero in equilibrium and the series $\{x_{1t} - \beta x_{2t}\}$ has a constant unconditional mean of zero. This need not imply that $\{x_{1t} - \beta x_{2t}\}$ is stationary: the variance of $\{x_{1t} - \beta x_{2t}\}$ might be non-constant, for example. The definition of co-integration given by Engle and Granger (1987), and discussed in Chapter 5, does however require stationarity of the deviation $\{x_{1t} - \beta x_{2t}\}$. When stationarity does hold, we say that $x_1$ and $x_2$ are co-integrated (1,1), denoted CI(1,1); that is, they are each integrated of order 1, and there exists some linear combination $\{x_{1t} - \beta x_{2t}\}$ which is integrated of an order one lower than the components (i.e. is I(0) here). If $\{x_{1t} - \beta x_{2t}\}$ has a constant unconditional mean but is not stationary, then we may still want to say that an equilibrium relationship holds; the series will not, however, fit the strict Engle-Granger definition of co-integration, which requires that some linear combination be stationary.

A substantive long-run equilibrium relationship is something from which the variables involved can deviate, but not by an ever-growing amount. That is, the discrepancy or error in the relationship cannot be integrated of any order greater than zero. Series integrated of strictly positive orders which are linked by such an equilibrium relationship must, therefore, be co-integrated with each other.
In the example just given, the fact that the integrated series $x_1$ and $x_2$ move together in the long run is reflected in the fact that they are co-integrated; a linear relation yields a stationary deviation.

More generally, we can speak of variables that are co-integrated (a, b) when $a \ge b$ and $b > 0$, where a is the order of integration of the variables and b is the reduction in order of integration produced by the linear combination, which then has order of integration $a - b$. When $b > 0$, a linear relation exists between the variables which is integrated of lower order than either of the variables themselves, but which may none the less not be I(0). In the latter case ($a - b > 0$), the variables may deviate from the linear relationship by an ever-growing amount, and so it is not the kind of relationship that we have been calling a long-run equilibrium. Nevertheless, variables that are CI(a, b) for $b > 0$ do contain some information about the long-run behaviour of the series involved.

Since a relationship between co-integrated variables can be shown to be representable using an error-correction mechanism (see Chapter 5), and since such representations have been found to be valuable in empirical modelling, there is a formal counterpart to the informal


argument above suggesting the usefulness of equilibrium information in specifying dynamic regression models.
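The CI(1,1) case described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the coefficient `beta`, the sample size `T`, and the white-noise form of the deviation are hypothetical choices made for illustration, not values taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500       # hypothetical sample size
beta = 2.0    # hypothetical long-run coefficient

# x2 is a random walk, hence I(1): the cumulative sum of white noise.
x2 = np.cumsum(rng.standard_normal(T))

# x1 tracks beta * x2 up to a stationary (here white-noise) deviation,
# so x1 is also I(1).
deviation = rng.standard_normal(T)
x1 = beta * x2 + deviation

# The linear combination x1 - beta * x2 recovers the stationary deviation,
# which is what makes the pair CI(1,1) in this constructed example.
combo = x1 - beta * x2

print("sample s.d. of x1:           ", x1.std())
print("sample s.d. of x1 - beta*x2: ", combo.std())
```

Each series wanders widely on its own, while the linear combination stays close to zero, which is the sense in which the two series move together in the long run.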

1.4. Estimation of Long-Run Relationships and Testing for Orders of Integration and Co-integration

The existence of long-run relationships between variables, the potential orders of integration of particular time series, and the implications of these for the specification of dynamic econometric models can be understood as mathematical properties without implying that we know whether or not such relationships exist, let alone what their forms for a particular empirical problem would be.

When an estimated regression equation implies an equilibrium relationship between two processes, it is a straightforward operation to extract the estimated long-run equilibrium relation regardless of the form in which the equation is estimated. The calculation can be made by expressing the equation in an equilibrium form and taking its expectation. This is analogous to assuming a state in which the values of the variables do not change, so that the dating of variables becomes irrelevant and the equation is treated as deterministic. Computing the derivative between the two series is then straightforward. Approximations to the variances of estimated long-run multipliers can also be computed. Chapter 2 explores various transformations of the linear model that are convenient for these and related calculations.

Testing for the existence of such an equilibrium relationship is not nearly so simple. First, it is difficult empirically to establish the orders of integration of individual time series. Second, the order of integration of a linear relationship among variables is even harder to discover than the order of integration of a single series: drawing inferences is complicated by the fact that the parameters of the relationship are in general unknown.
Testing whether an individual series is I(1) as opposed to I(0) is the problem that has been widely discussed as that of testing for a 'unit root' in a time series. Strategies for performing such testing have had to contend with the problem that I(0) alternatives in which the series is 'close' to being I(1) (so that the power of the test is low) are very plausible in many economic circumstances. Further, the form of the data generation process (e.g. the orders of dynamics; the question of which exogenous variables enter; etc.) is not known, and critical values of test statistics are typically sensitive to the structure of the process.

Fuller (1976) and Dickey and Fuller (1979) emphasized that testing for non-stationarity (again, I(1) as opposed to I(0) series) is more difficult than conventional t-tests of the hypothesis that the autoregressive parameter is equal to one in an AR(1) model. In fact, where there are roots greater than or equal to one, conventionally used tests do not have standard asymptotic distributions. The original tests were variants of conventional tests, with critical values retabulated using Monte Carlo experiments to reflect the changes in distribution when, under the null, the series are non-stationary.

These original tests were based on simple forms of autoregressive model: an AR(1) model, with or without drift and time trend terms (i.e. $y_t = \alpha y_{t-1} [+ \beta] [+ \gamma t] + \varepsilon_t$). Such simple forms may often be poor approximations to the data generation process. This will manifest itself in the failure of the estimated model to pass various mis-specification tests. In particular, tests for residual autocorrelation will often reflect autocorrelated processes that have been omitted from the model specification. One way of dealing with the problem of finding an adequate model within which to test for non-stationarity has therefore been to retain a simple autoregressive model form, but with a non-parametric correction to the values of the test statistic to allow for a general form of autocorrelation in the residuals. Another approach attempts to capture the autocorrelation through the addition of extra lagged terms in the dependent variable. These issues are addressed in Chapter 4.

When series may contain more than one 'unit root' (i.e. where they may be I(2) or of higher orders), testing becomes yet more difficult because the sequence in which different hypotheses are tested can affect inference. Such issues are also considered in Chapter 4.
A related method can be applied to the problem of testing for an equilibrium relation between integrated variables. A prior step must be added to the method above, in which a linear relationship between or among the variables in question is estimated. Testing for co-integration then entails testing the order of integration of the error in this relationship. For example, a stationary error in a model relating integrated series entails an equilibrium relationship. Conversely, if there were no equilibrium relationship, there would be nothing to tie these series to any estimated linear relation, and this would imply non-stationarity of the residuals.

It might appear at first sight, for example, that testing for co-integration between I(1) series $\{x_{1t}\}$ and $\{x_{2t}\}$ would be precisely the same as a test of the hypothesis that $\{e_t\} = \{x_{1t} - \beta x_{2t}\}$ is I(1) against the alternative that $\{e_t\}$ is I(0). However, this is true only under very strong assumptions. Necessary conditions include that there is only one co-integrating relation and the values of its parameters are known. In the bivariate case, when $\beta$ is estimated, the series that one tests for stationarity is $\{\hat{e}_t\} = \{x_{1t} - \hat{\beta} x_{2t}\}$. Since linear regression minimizes the variance of $\hat{e}_t$, the estimated series of deviations from equilibrium has a smaller variance than the true deviations $\{x_{1t} - \beta x_{2t}\}$, assuming that $\beta$


exists. That is, the method by which $\beta$ is usually estimated amounts to choosing $\beta$ in such a way that the two variables are given the best chance to appear to move together. Regression makes co-integration appear to be present more often than it should, so that the critical values of test statistics must be adjusted to reflect the fact that $\beta$ is estimated. Co-integration tests are therefore similar, but not identical, to standard stationarity tests.

Chapter 7 explores these tests for co-integration, and Chapter 8 extends the discussion to estimation and testing in systems of equations.
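The point that regression gives the two variables "the best chance to appear to move together" can be checked numerically. The following is a minimal sketch, assuming a hypothetical true coefficient `beta` and a regression that includes a constant; it compares the sample variance of the OLS residuals with that of the true deviations.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200      # hypothetical sample size
beta = 1.0   # hypothetical true long-run coefficient

x2 = np.cumsum(rng.standard_normal(T))   # I(1) regressor
true_dev = rng.standard_normal(T)        # true stationary deviations
x1 = beta * x2 + true_dev

# OLS of x1 on a constant and x2 (the estimated linear relationship).
X = np.column_stack([np.ones(T), x2])
coef, *_ = np.linalg.lstsq(X, x1, rcond=None)
resid = x1 - X @ coef                    # estimated deviations

var_true = true_dev.var()
var_est = resid.var()
print("variance of true deviations:     ", var_true)
print("variance of estimated deviations:", var_est)
```

Because least squares minimizes the residual sum of squares over the intercept and slope, `var_est` can never exceed `var_true` in the sample, which is why critical values for residual-based tests must differ from those used when $\beta$ is known.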

1.5. Preliminary Concepts and Definitions

We assume that readers are acquainted with the fundamental principles and methods of econometrics and statistical inference. It is nonetheless worth reviewing some important concepts and definitions that will be used in later chapters, establishing terminology as we do so.

1.5.1. Stochastic Processes and Time-series Models

A number of concepts from standard time-series analysis will be necessary. Box and Jenkins (1970) give a thorough treatment of these models.

A stochastic process is an ordered sequence of random variables $\{x(s, t), s \in S, t \in \mathbb{T}\}$, such that, for each $t \in \mathbb{T}$, $x(\cdot, t)$ is a random variable on the sample space $S$ and, for each $s \in S$, $x(s, \cdot)$ is a realization of the stochastic process on the index set $\mathbb{T}$ (that is, an ordered set of values, each corresponding to one value of the index set). A given realization of the process may be represented as $\{x(t), t \in \mathbb{T}\}$, and this notation is also often used for the stochastic process itself. In later chapters we will typically refer to realizations of stochastic processes by the notation $x_t$ for a value at $t$, and $\{x_t\}_1^T$ (or $\{x_t\}$ or $\{x_t\}_{t=1}^T$) for a full set of values corresponding to an index set $\mathbb{T} = \{1, 2, \ldots, T\}$. We will also restrict our attention to discrete stochastic processes, for which the index set is a discrete set, in which case we generally use the notation $x_t$ rather than $x(t)$, which may apply also to continuous processes.

Next, let $\{x(t), t \in \mathbb{T}\}$ be a stochastic process such that $E(|x(t)|) < \infty$ for all $t \in \mathbb{T}$, and $E(x(t) \mid \mathcal{I}_{t-1}) = x(t-1)$ for all $t \in \mathbb{T}$, where $E(\cdot)$ is the expectations operator and $\mathcal{I}_{t-1}$ represents a particular information set of data realized by time $t - 1$. Then $\{x(t), t \in \mathbb{T}\}$ is called a


martingale with respect to $\{\mathcal{I}_t, t \in \mathbb{T}\}$. A martingale difference sequence can then be defined by $\{y(t) = x(t) - x(t-1), t \in \mathbb{T}\}$. It follows that $E(|y(t)|)$ [...] $x_2(t), \ldots, x_m(t))'$, then we require in addition that covariances of the form $E[x_k(t_i) x_l(t_j)]$ are finite constants and are functions of $i$, $j$, $k$, $l$ only, for any admissible $i$, $j$, $k$, and $l$.

We will not offer a rigorous definition of an integrated process at this stage but we can highlight a number of the issues involved. An integrated process is one that can be made stationary by differencing. A discrete process integrated of order $d$ must be differenced $d$ times to reach stationarity; that is, $\Delta^d x_t$ is stationary where the differencing operator $\Delta^d$ is defined by $(1 - L)^d$ (using the lag operator $L$, itself defined by $L^n x_t = x_{t-n}$). For example, the first difference is $\Delta x_t = x_t - x_{t-1}$, and the second difference is $\Delta^2 x_t = \Delta x_t - \Delta x_{t-1} = x_t - 2x_{t-1} + x_{t-2} = (1 - L)^2 x_t$. The process $(1 - L)x_t = \varepsilon_t$, where $\{\varepsilon_t\}$ is a white-noise series (see below), is called a random walk and is a simple example of a process integrated of order 1.

Two issues merit comment. First, if $x_t$ is stationary then so is $\Delta x_t$, or even $\Delta^d x_t$ for $d > 0$. Thus, the stationarity of $\Delta^d x_t$ is not sufficient for $x_t$ to be I(d). (Recall that an I(d) process is one that must be differenced $d$ times to achieve stationarity.) Secondly, consider the stable autoregressive process, $x_t = \alpha_0 + \alpha_1 x_{t-1} + \varepsilon_t$, where $|\alpha_1| < 1$, $x_0 = 0$, and $\varepsilon_t \sim \mathrm{IN}(0, \sigma^2)$, $t = 1, \ldots, T$. Then $\{x_t\}$ is non-stationary since $E(x_t) =$ [...] $\lambda)$, where $X_{t-1}$ denotes the history of the variable $x$: $X_{t-1} = (x_{t-1}, x_{t-2}, \ldots, x_0)$. Let the parameters $\lambda \in \Lambda$ be partitioned into $(\lambda_1, \lambda_2)$ to support the factorization

Then $[(y_t \mid z_t; \lambda_1), (z_t; \lambda_2)]$ operates a sequential cut on $D(x_t \mid X_{t-1}, \lambda)$ if and only if $\lambda_1$
and $\lambda_2$ are variation free; that is, if and only if

so that the parameter space $\Lambda$ is the direct product of $\Lambda_1$ and $\Lambda_2$. In other words, for any values of $\lambda_1$ and $\lambda_2$, admissible values of the parameters $\lambda$ of the joint distribution can be recovered. The essential element of weak exogeneity is that the marginal distribution contains no information relevant to $\lambda_1$ (for an exposition, see Ericsson 1992).

Weak exogeneity: $z_t$ is weakly exogenous for a set of parameters of interest $\psi$ if and only if there exists a partition $(\lambda_1, \lambda_2)$ of $\lambda$ such that (i) $\psi$ is a function of $\lambda_1$ alone, and (ii) $[(y_t \mid z_t; \lambda_1), (z_t; \lambda_2)]$ operates a sequential cut.

Strong exogeneity: $z_t$ is strongly exogenous for $\psi$ if and only if $z_t$ is weakly exogenous for $\psi$ and

so that $y$ does not Granger-cause $z$.

Super exogeneity: $z_t$ is super exogenous for $\psi$ if and only if $z_t$ is weakly exogenous for $\psi$ and $\lambda_1$ is invariant to interventions affecting $\lambda_2$.

Weak exogeneity ensures that there is no loss of information about parameters of interest from analysing only the conditional distribution; a variable $z_t$ is weakly exogenous for a set of parameters $\psi$ if inference concerning $\psi$ can be made conditional on $z_t$ with no loss of information relative to that which could be obtained using the joint density of $y_t$ and



$z_t$. Strong exogeneity is necessary for multi-step forecasting which proceeds by forecasting future $z$s and then forecasting $y$s conditional on those $z$s. Super exogeneity sustains policy analysis on $\lambda_1$ when the marginal distribution of $z_t$ is altered.

Engle et al. contrast these three types of exogeneity with the traditional concepts of strict exogeneity and pre-determinedness. If $u_t$ is the error term in a model, then $z_t$ is said to be strictly exogenous if $E[z_t u_{t+i}] = 0 \ \forall i$, whereas $z_t$ is said to be predetermined if $E[z_t u_{t+i}] = 0 \ \forall i \ge 0$. Engle et al. show that the latter concepts are neither necessary nor sufficient for valid inference since neither relates to parameters of interest.

The following example (from Engle et al. 1983) seeks to clarify these concepts. Consider the DGP:

with

The parameters ($\beta$, [...] denote weak convergence in the sense that the probability measures converge: this is the analogue, for function spaces, of convergence in distribution for random variables (see Hall and Heyde 1980). Then, under weak assumptions about $\{x_t\}$, (4)

Furthermore, if $f(\cdot)$ is a continuous functional on [0, 1], then (5)

FIG 1.2. Mapping the 10-point graph on to a step function


FIG 1.3. Step representation of a random walk over 10 points

FIG 1.4. Step representation of a random walk over 100 points

FIG 1.5. Step representation of a random walk over 1000 points

For further details, see Billingsley (1968), Dickey and Fuller (1979, 1981), Hall and Heyde (1980), and Phillips (1986, 1987a).

In distributions involving I(1) variables, functionals of Wiener processes arise quite generally, whereas conventional methods of obtaining limiting distributions tend to be specific to the assumptions made about the data or error process.3 Also, many of the statistics regularly used in empirical research involving I(1) time series have different distributions from those that arise with I(0) data. In particular, many statistics in I(1) processes do not converge to constants, as in the I(0) case, but instead converge to random variables. Thus, different critical values may be required for tests, depending on the degree of integration of the time series.

3 By this we mean that only weak restrictions need to be satisfied by the $\{x_t\}$ sequence for convergence results such as (4) and (5) to hold. Phillips (1987a) provides a good account of this issue, and a discussion is also contained in Ch. 3.

Consider the random walk, $y_t = y_{t-1} + \varepsilon_t$, with $\varepsilon_t \sim \mathrm{IN}(0, 1)$ and $y_0 = 0$. Then

Alternatively, from (7),


Similarly, $\mathrm{corr}^2(y_t, y_{t-k})$ has a numerator of $(t - k)^2$ and a denominator of $t(t - k)$ for $k > 0$, and so equals $1 - k/t$. When $k < 0$, let $s = t - k$ so that $t = s + k$, and let $r = -k > 0$, in which case

Since $y_0 = 0$, we have that

The last approximation uses

To illustrate the use of Wiener processes in deriving distributions involving I(1) variables, we will derive the limiting distribution of the sample mean, $\bar{y} = T^{-1} \sum_{t=1}^{T} y_t$. Because $\{y_t\}$ is a random walk, its mean converges to a functional of a Wiener process. Let $R_T(r) = y_{[Tr]}/\sqrt{T} = y_{i-1}/\sqrt{T}$ for $(i - 1)/T \le r < i/T$ ($i = 1, \ldots, T$), and $R_T(1) = y_T/\sqrt{T}$. $R_T(r)$ is a step function with steps at $i/T$, for $i = 1, \ldots, T$, and is constant between steps. Thus,



The last expression is $\bar{y}_{-1}/\sqrt{T}$, where $\bar{y}_{-1}$ is the lagged mean. This result uses the fact that, for any constant $c$,

From (3) and (4),

and hence

The unlagged sample mean has the same limiting distribution.

An interesting aspect of (10) is that the Lindeberg-Feller central limit theorem4 (which applies to independent but heterogeneously distributed observations; see White 1984) can be applied to obtain the distribution of $\bar{y}$ and hence show that

Thus, some functionals of Wiener processes are familiar random variables in disguise and we will develop this aspect as we proceed. A proof of (11) is given in the Appendix.

4 Strictly speaking, the version we use here is a special case of this theorem, sometimes called the Liapunov central limit theorem.

1.5.7. Monte Carlo Simulation

The purpose of Monte Carlo simulation is to evaluate by experiment quantities that would be very difficult or impossible to evaluate analytically. Such experiments typically begin by creating a set of data with known statistical properties. This is achieved by specifying every aspect of a data-generating process, or class of such processes, and replacing the random errors of the DGP by pseudo-random numbers. Pseudo-random numbers are numbers generated deterministically to mimic a random process with a particular distribution. An investigator typically generates a large number of such artificial data sets (called replications) to investigate statistical techniques which analyse these data as if the process generating them were not known. The performance of the statistical technique in revealing some characteristic of the data set may then be evaluated by generating its distribution from independent replications of the experiment and comparing the results with the known characteristics of the process generating the data.

For example, an econometrician may wish to examine the performance of the standard t-test in data generated by a random walk. Artificial data-sets following a random walk may easily be constructed using pseudo-random disturbances, and the empirical distribution of the t-statistic in samples of size T can be generated by replicating N sets of T observations. The mean, variance, or various critical values of the t-statistic can be calculated from the empirical distribution and, for sufficiently large N, will be close to their population (i.e. analytic) counterparts. The investigator can also vary the parameters of the DGP in order to observe their effects on the outcome. In each experiment, the investigator knows the true parameters of the process, and so can evaluate the estimators and tests used.

Unlike analytical studies, Monte Carlo simulations cannot produce exact results; any result from a Monte Carlo experiment comes from a (pseudo-)random sample, and therefore has some variability attached to it. Moreover, Monte Carlo experiments are inevitably specific to the particular data generation processes examined (although it may be possible to prove analytically that results will be invariant to certain parameters in the process). Nonetheless, Monte Carlo results are useful when analytical results are difficult to obtain. In particular, Monte Carlo experiments are often used to investigate the finite-sample performance of statistical techniques, the analytical properties of which are known only asymptotically.
There are a number of subtleties to the design and interpretation of Monte Carlo experiments which demand careful attention, including the methods used to generate pseudo-random numbers, variance-reduction methods such as common random numbers, antithetic random numbers and control variates intended to improve precision, the calculation of standard errors of the experimental estimates of unknown quantities, the use of response surfaces to summarize and interpolate results, and recursive updating of quantities of interest. Expositions of Monte Carlo methods may be found in, for example, Hammersley and Handscomb (1964), Hendry (1984), Ripley (1987), Hendry, Neale, and Ericsson (1990), and Davidson and MacKinnon (1992).
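The random-walk t-test experiment described in this section can be sketched as follows. This is an illustrative outline rather than the design used by any of the studies cited: the function name `df_t_statistics`, the choice of N = 2000 replications of T = 100 observations, and the regression without a constant are all assumptions made here for brevity.

```python
import numpy as np

def df_t_statistics(n_reps=2000, T=100, seed=0):
    """Empirical distribution of the t-statistic for rho = 1 in
    y_t = rho * y_{t-1} + e_t when the data are a random walk."""
    rng = np.random.default_rng(seed)
    t_stats = np.empty(n_reps)
    for r in range(n_reps):
        eps = rng.standard_normal(T)              # pseudo-random disturbances
        y = np.cumsum(eps)                        # random walk with y_0 = 0
        y_lag = np.concatenate(([0.0], y[:-1]))   # y_{t-1}, starting from y_0
        rho_hat = (y_lag @ y) / (y_lag @ y_lag)   # OLS without a constant
        resid = y - rho_hat * y_lag
        s2 = (resid @ resid) / (T - 1)            # residual variance estimate
        se = np.sqrt(s2 / (y_lag @ y_lag))        # conventional standard error
        t_stats[r] = (rho_hat - 1.0) / se
    return t_stats

t_stats = df_t_statistics()
q05 = np.quantile(t_stats, 0.05)
print("mean of t-statistic:    ", t_stats.mean())
print("empirical 5% quantile:  ", q05)
print("standard normal 5% value: -1.645")
```

The empirical 5 per cent point lies noticeably below the standard normal value, and the distribution is shifted to the left of zero, which is the kind of departure from conventional critical values that the retabulated tests are meant to capture. The estimate itself carries sampling variability that shrinks as the number of replications grows.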

1.6. Data Representation and Transformations

Since data transformations play an important role in econometrics generally, we briefly consider their impact on I(1) data. Consider the hypothesis that a set of integrated data can be described by a linear



model with a constant error variance. In particular, a normally distributed random walk with drift is often postulated so that $\Delta x_t \sim \mathrm{IN}(\mu, \sigma^2)$. Many economic time series (such as consumption, national income and expenditure, or the price level) do grow over time, but the amount by which they grow in each period also tends to rise. However, $\Delta x_t = x_t - x_{t-1}$ will be stationary only if the absolute amount of growth is stationary, in which case for $\mu > 0$, $\mu/x_t$ will tend to zero. Percentage growth, by contrast, often displays no obvious tendency to rise or fall, making it a more likely candidate for stationarity. Since the levels of many economic variables are initially positive, and recalling that

we see that stationarity of the rate of growth implies stationarity of $\Delta \log(x_t)$. Changes in the logarithms of economic data series such as those just mentioned, therefore, seem more likely to be stationary than changes in the levels. We will return to this point in Chapter 6 below, where we consider how co-integration is affected by the logarithmic transformation. We illustrate some of these points with actual data series.

The time series that we analyse is real net national product (Y, in 1929 £million) for the United Kingdom over 1872-1975. The data are taken from Friedman and Schwartz (1982) and are also investigated in Hendry and Ericsson (1991a). Figures 1.6-1.9 plot this data series and

FIG 1.6. UK real net national product (Y in 1929 £million), 1872-1975


FIG 1.7. Logarithm (log Y) of UK real net national product

various transformations of it. Figure 1.6 plots the untransformed series $Y_t$; the series is tending to grow by increasing amounts, and so would be better approximated by a convex function than by a straight line. This is visible from the upward curvature and the much closer fit of the quadratic trend line compared with the linear trend. In Fig. 1.7, we plot the logarithm of the series: the curvature is no longer apparent, and the quadratic and linear trends are very similar and fit about equally well. Thus, the logarithm of the series is relatively well approximated by a straight line and, while growing, there is no evident tendency for the growth rate to change over time.

Figure 1.8 plots the changes, $\Delta Y_t$. There is a tendency for both the mean and the variance to grow over time, and the linear trend shown highlights the former. (It requires more careful inspection to see the latter owing to the very large shock in 1919-20.) Differencing the initial series has therefore not produced a stationary series. In Fig. 1.9, however, where $\Delta \log Y_t$ is plotted, there is no longer any major change in the mean or variability of the series over the sample, with perhaps a slight tendency for the variance to be smaller in the period since 1945. Certainly, any trend in the mean of $\Delta \log Y_t$ is negligible. This series, then, may well be stationary, although neither the logarithmic transformation nor the first-difference transformation produced a stationary series on its own. Since the differences in the logarithms appear stationary, we might expect to find that the logarithms of the original



FIG 1.8. Changes ($\Delta Y$) in UK real net national product

FIG 1.9. Changes in the logarithm ($\Delta \log Y$) of UK real net national product

series are I(1), while the untransformed initial series apparently is not and differencing it is not sufficient to produce stationarity.

Alternatively, any linear model of $\Delta Y_t$ will have an error term, which we denote by $u_t$, with a standard deviation $\sigma_u$ that must be in the same


units as $Y_t$. Since these are 1929 £million, the linear model assumes a constant absolute error standard deviation. However, net national product has grown about six-fold over the sample so that $\sigma_u/Y_t$ (the relative error) will be much smaller in 1975 than in 1875. It would be difficult to imagine reasons for such a decline.

The log-linear model, by way of contrast, assumes a constant relative error standard deviation (e.g. 2½ per cent of $Y_t$ at all points in time), which seems much more plausible. Failing to transform the data adequately violates the statistical model of an I(1) or I(0) series, and can induce trending means and variances, making testing less reliable. Certainly, a relatively long time series is needed to make such factors obvious, but they operate even within post-war quarterly data (see e.g. Ermini and Hendry 1991). Moreover, changes in means and variances over time are very apparent in nominal time series, and can confuse attempts to determine co-integration. Granger and Hallman (1991) analyse general transformations in I(1) time series, and Chapter 4 below explores formal statistical tests of hypotheses about the degree of integration of individual time series.
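The contrast between differencing levels and differencing logarithms can be reproduced with artificial data. The sketch below does not use the UK net national product series; it assumes a hypothetical process in which the logarithm is a random walk with drift `mu` and error standard deviation `sigma`, and compares the variability of the changes in the two halves of the sample.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 400                   # hypothetical sample size
mu, sigma = 0.02, 0.02    # hypothetical drift and s.d. of log growth

# log Y is a random walk with drift; Y is its exponential.
log_y = np.cumsum(mu + sigma * rng.standard_normal(T))
y = np.exp(log_y)

d_level = np.diff(y)      # changes in the level
d_log = np.diff(log_y)    # changes in the logarithm

half = len(d_level) // 2
ratio_level = d_level[half:].std() / d_level[:half].std()
ratio_log = d_log[half:].std() / d_log[:half].std()

print("s.d. ratio (second half / first half), changes in level:", ratio_level)
print("s.d. ratio (second half / first half), changes in log:  ", ratio_log)
```

Under these assumptions the changes in the level become far more variable as the level rises, while the changes in the logarithm keep roughly the same spread, mirroring the pattern described for Figs. 1.8 and 1.9.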

1.7. Examples: Typical ARMA Processes

Figures 1.10-1.20 present graphs of typical examples of series generated by special cases of ARMA(1,1) processes. For ease of comparison, each series is computer-generated using the same set of 200 observations on normally distributed white-noise errors $\varepsilon_t \sim \mathrm{IN}(0, 1)$ with $u_0 = 0$. The data generation processes are:

Fig. 1.10  $u_t = \varepsilon_t$  [white noise]

Fig. 1.11  $u_t = \varepsilon_t + 0.8\varepsilon_{t-1}$  [MA(1), stationary]

Fig. 1.12  $u_t = \varepsilon_t - 0.8\varepsilon_{t-1}$  [MA(1), stationary]

Fig. 1.13  $u_t = 0.5u_{t-1} + \varepsilon_t$  [AR(1), stationary]

Fig. 1.14  $u_t = 0.5u_{t-1} + \varepsilon_t + 0.8\varepsilon_{t-1}$  [ARMA(1,1), stationary]

Fig. 1.15  $u_t = 0.5u_{t-1} + \varepsilon_t - 0.8\varepsilon_{t-1}$  [ARMA(1,1), stationary]

Fig. 1.16  $u_t = 0.9u_{t-1} + \varepsilon_t$  [AR(1), stationary]

Fig. 1.17  $u_t = 0.9u_{t-1} + \varepsilon_t + 0.8\varepsilon_{t-1}$  [ARMA(1,1), stationary]

Fig. 1.18  $u_t = 0.99u_{t-1} + \varepsilon_t$  [AR(1), stationary]

Fig. 1.19  $u_t = 1.00u_{t-1} + \varepsilon_t$  [AR(1), non-stationary]

Fig. 1.20  $u_t = 1.01u_{t-1} + \varepsilon_t$  [AR(1), non-stationary]

FIG 1.10. AR = 0.0; MA = 0.0 (horizontal axis: Observation)

FIG 1.11. AR = 0.0; MA = 0.8 (horizontal axis: Observation)

FIG 1.12. AR = 0.0; MA = -0.8 (horizontal axis: Observation)

FIG 1.13. AR = 0.5; MA = 0.0 (horizontal axis: Observation)

FIG 1.14. AR = 0.5; MA = 0.8 (horizontal axis: Observation)

FIG 1.15. AR = 0.5; MA = -0.8 (horizontal axis: Observation)

FIG 1.16. AR = 0.9; MA = 0.0 (horizontal axis: Observation)

FIG 1.17. AR = 0.9; MA = 0.8 (horizontal axis: Observation)

FIG 1.18. AR = 0.99; MA = 0.0 (horizontal axis: Observation)

FIG 1.19. AR = 1.00; MA = 0.00 (horizontal axis: Observation)

FIG 1.20. AR = 1.01; MA = 0.00 (horizontal axis: Observation)

A process such as that in Fig. 1.19, an AR(1) with a unit root, is a random walk and may also be expressed as ARIMA(0,1,0).

The scales on the graphs in Figs. 1.10-1.20 are not identical; for the non-stationary processes, in particular, the graphs show very wide movements relative to those of the stationary series. Non-stationary processes with roots strictly greater than unity grow very quickly even where those roots are quite close to 1, as can be seen from Fig. 1.20, an AR(1) with a root in the autoregressive part of 1.01. The stationary processes in Figs. 1.10-1.18 have unconditional means of zero and finite unconditional variances. They are 'tied' to this zero mean in the sense that deviations from it cannot accumulate indefinitely. By contrast, the process with a single root of exactly unity (Fig. 1.19) has an unconditional variance which increases over time and will tend to wander widely (see equation (7)) with an unbounded expected crossing time of the origin. The process with a root greater than unity (Fig. 1.20) is explosive and will tend to either $+\infty$ or $-\infty$ [...] and $u_t$ may be approximated by an MA($n$) process with increasing accuracy as $n \to \infty$. If $\alpha = 1$, however, the first term does not disappear, and the approximation fails; this follows from the failure of the stationarity condition stated above. When $\alpha = 1$,

so that $u_t$ is the sum of a starting value, $u_{t-n}$, and all the errors accruing between $t - n + 1$ and $t$. This representation of the process $\{u_t\}$ as a sum of past contributions is the source of the relationship of integration in this time-series sense and integration in the integral calculus, where the integral of a function may be thought of as the limit of a sum of discrete areas under a curve. Figure 1.19 is the cumulative sum, or discrete integral, of the errors recorded in Fig. 1.10.

Many economic time series have been modelled using ARMA or ARIMA processes, and models of these types will be used frequently in the following chapters in describing the methods and tests. Priestley (1989) provides examples of other types of models that may be used to characterize non-stationary processes.
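The contrast between stationary, unit-root, and explosive AR(1) processes, and the cumulative-sum representation of the random walk, can be illustrated by simulation. The following is a minimal sketch, assuming the first-order form $u_t = \alpha u_{t-1} + \varepsilon_t$ with standard normal errors and $u_0 = 0$, so that for $\alpha = 1$ repeated substitution gives $u_t = u_{t-n} + \sum_{i=0}^{n-1} \varepsilon_{t-i}$. The function name `simulate_ar1`, the seed, and the sample size are illustrative choices, not taken from the text.

```python
import random
from itertools import accumulate

def simulate_ar1(alpha, errors, u0=0.0):
    """Generate u_t = alpha * u_{t-1} + e_t from a given list of errors."""
    path, u = [], u0
    for e in errors:
        u = alpha * u + e
        path.append(u)
    return path

rng = random.Random(12345)                      # fixed seed for reproducibility
errors = [rng.gauss(0.0, 1.0) for _ in range(200)]

# The same errors drive all three series, so only alpha differs.
stationary = simulate_ar1(0.5, errors)          # coefficient below one
random_walk = simulate_ar1(1.0, errors)         # exact unit root
explosive = simulate_ar1(1.01, errors)          # coefficient slightly above one

# With alpha = 1 the path is the running (cumulative) sum of the errors.
cumulative_sum = list(accumulate(errors))
max_gap = max(abs(a - b) for a, b in zip(random_walk, cumulative_sum))

print("largest |random walk - cumulative sum|:", max_gap)
print("range of stationary series:", max(stationary) - min(stationary))
print("range of random walk:      ", max(random_walk) - min(random_walk))
```

Feeding one set of errors to every process mirrors the way Fig. 1.19 is described as the cumulative sum of the errors in Fig. 1.10.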

1.8. Empirical Time Series: Money, Prices, Output, and Interest Rates

Figure 1.21 graphs the logarithms of quarterly, seasonally adjusted, nominal M1 and prices (the implicit deflator of total final expenditure, TFE) in the UK over the period 1963-89. The series (denoted $\log M_t$ and $\log P_t$) have strong trends and are relatively smooth, although their growth rates alter perceptibly around 1974 and again around 1980. Such data are not unlike realizations from highly autoregressive (I(1)) processes. Figure 1.22 shows their first differences $\Delta \log M_t$ and $\Delta \log P_t$. These are more erratic but are still highly autocorrelated. The growth rate of $M$ appears to have increased over time, whereas that of $P$ has fallen, especially after 1980. These data do not seem to be stationary although the graphs by themselves do not reveal the source of the non-stationarity.

FIG 1.21. Time series of money (M1) and prices (implicit deflator of total final expenditure) in the UK, seasonally adjusted, in logs

FIG 1.22. Time series of $\Delta \log M_t$ and $\Delta \log P_t$

Next, Fig. 1.23 shows the behaviour of logs of the real money supply ($\log(M_t/P_t)$) and real TFE ($\log Y_t$). It might have been anticipated from Fig. 1.21 that $\log M_t$ and $\log P_t$ moved sufficiently closely over the whole sample for this differential to be stationary, but Fig. 1.23 shows that the real money supply is non-stationary. The formal apparatus of testing for co-integration developed in Chapter 7 is designed to detect such relationships statistically. By way of contrast, $\log Y_t$ looks more like a series with a constant linear trend, subject to perturbations in 1973/4 and 1979/80.

FIG 1.23. Time series of real money ($\log M_t/P_t$) and real TFE ($\log Y_t$)

In economic terms, surprising features of Figs. 1.22-1.23 are the low pairwise correlations between $\Delta \log M_t$ and $\Delta \log P_t$, and between $\log(M_t/P_t)$ and $\log Y_t$, respectively. However, such results have no implications for the existence or otherwise of well defined relationships between these variables. Monetary theory suggests that the opportunity cost of holding money is an important determinant of the demand for money, so Fig. 1.24 shows the time series of the interest rate ($R_t$, a three-month local authority bill rate adjusted for financial innovation) and the rate of inflation, plotted in units that maximize their apparent correlation. The series $\{R_t\}$ also seems to be non-stationary, but with a different time profile from the other series. In particular, it is much less smooth than the other level series, but less erratic than their changes.

FIG 1.24. Time series of a three-month interest rate ($R_t$) and the rate of inflation ($\Delta \log P_t$) in the UK

Finally, Fig. 1.25 shows $\Delta \log Y_t$ and $\Delta R_t$. These are possibly weakly stationary, although both appear to have higher variances in the middle of the sample than at the ends. However, neither is highly autocorrelated, nor do they drift noticeably in any direction. We will analyse the four series $\log M_t$, $\log P_t$, $\log Y_t$, and $R_t$ as a system in later chapters. (See Hendry and Ericsson (1991b), who provided the data.)

FIG 1.25. Time series of $\Delta \log Y_t$ and $\Delta R_t$

1.9. Outline of Later Chapters

Chapter 2 discusses dynamic models for stationary processes. This allows us to introduce, in a familiar context, a number of considerations which will prove important later. Various equivalent transformations of linear autoregressive-distributed lag models are considered, especially error-correction, Bewley, and Bårdsen forms. The role of expectations in stationary processes is also investigated and is related to the absence of weak exogeneity for the parameters of the economic agents' decision functions.

Chapter 3 then considers the analysis of I(1) variables, and explores the concepts of unit roots, non-stationarity, orders of integration, and near integration. The behaviour of least-squares estimators applied to spurious relationships is investigated and a number of results established for Wiener processes (see Phillips 1987a). Univariate tests for unit roots are discussed in Chapter 4, and the formal definitions in Chapter 3 are related to the properties of integrated series. Monte Carlo results illustrate the various distributions. Extensions to multiple unit roots and seasonal data are considered, and several examples are described in detail.

Chapter 5 moves on to the topic of co-integration. Following a bivariate example and formal definitions, the Granger Representation Theorem is described, linking co-integration to error correction, and clarifying the status of other representations such as common trends. The original Engle-Granger two-step estimator of the co-integrating relationship is analysed. Chapter 6 first considers inconsistent regressions sometimes used in orthogonality tests; the analysis then turns to distributions of estimators in dynamic regressions with I(1) data, based on the results in Sims, Stock, and Watson (1990), and is illustrated by a number of examples.

Chapter 7 discusses testing for co-integration. A range of tests is considered, based on testing for a unit root in the residuals from the static regression. While widely used, such tests have drawbacks, and Monte Carlo experiments are used to illustrate some of these. Tests based on single-equation dynamic models are also considered.

Finally, in Chapter 8, co-integration in systems of equations is analysed. Linear co-integrated systems are expressed in error-correction form and maximum likelihood estimation and inference for co-integrating vectors are discussed, focusing on the approach proposed by Johansen (1988). A range of extensions is considered, as are various other estimators. The analysis is again illustrated by a number of examples and simulation experiments.

Appendix

Equation (11)

To prove (11), we need to construct a random variable $X_t$, where

If

then, by the Liapunov central limit theorem,

The proof of (11) is in three steps. First, consider (from (6)) the sample mean:

and

Thus $X_t \sim \mathrm{ID}(0, \sigma_t^2)$, as required. Further, noting that

and using normality of $\varepsilon_t$

and all the conditions of the Liapunov theorem are satisfied. Therefore,

Finally, using the results above, and noting that $\bar{y} = T\bar{X}$

Since $\bar{y}/\sqrt{T} \Rightarrow \int_0^1 W(r)\,dr$ from results above, we have that $\bar{y}/\sqrt{T}$ converges to both $\int_0^1 W(r)\,dr$ and to $N(0, 1/3)$. Therefore

The derivations of later results follow similar lines.
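The limiting $N(0, 1/3)$ distribution for $\bar{y}/\sqrt{T}$ can be checked by simulation. The sketch below assumes a driftless random walk with standard normal increments and $y_0 = 0$ (so that the error variance is one); the replication count, sample size, and function name are illustrative and not taken from the text.

```python
import random
import statistics

def scaled_mean_of_random_walk(T, rng):
    """Return ybar / sqrt(T) for one random walk y_t = y_{t-1} + e_t, y_0 = 0."""
    y, total = 0.0, 0.0
    for _ in range(T):
        y += rng.gauss(0.0, 1.0)
        total += y
    return (total / T) / T ** 0.5

rng = random.Random(2024)
T, replications = 200, 5000
draws = [scaled_mean_of_random_walk(T, rng) for _ in range(replications)]

sample_mean = statistics.fmean(draws)
sample_var = statistics.pvariance(draws)
# Exact finite-sample variance: (T + 1)(2T + 1) / (6 T^2), which tends to 1/3.
exact_var = (T + 1) * (2 * T + 1) / (6 * T ** 2)

print(f"mean = {sample_mean:.3f}, variance = {sample_var:.3f}, exact = {exact_var:.4f}")
```

The exact variance follows from writing $\bar{y}$ as a weighted sum of the errors, with weight $(T - s + 1)/T$ on $\varepsilon_s$.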


2

Linear Transformations, Error Correction, and the Long Run in Dynamic Regression

We begin by considering the properties of linear autoregressive-distributed lag (ADL) models for stationary data processes. Transformations of the ADL model to error correction and to various other forms are described. We discuss the estimation of long-run multipliers from dynamic models, and the equivalence of the estimates of these multipliers (and their variances) from any of several different forms. Finally, we consider inference about long-run multipliers where expectational variables are present, and the potential problems are shown to be special cases of the general invalidity of inference when the regressors are not weakly exogenous for parameters of interest.

In later chapters, we will concentrate on the importance of integrated processes for econometric modelling, and in particular on the detection of the stochastic trends embodied in integrated processes, on identifying series that share stochastic trends and therefore satisfy long-run equilibrium relations, and on the implications of such properties for the estimation of economic relationships. Before beginning to explore these concepts, however, there are a number of aspects of the use and specification of dynamic econometric models which can be reviewed without a thorough knowledge of integrated processes, and which will be useful in later discussion. The calculation of the parameters of long-run relationships from estimated models, the interpretation of linear transformations, and the forms of particular models such as the error-correction model are among these topics.
The variables used in this chapter may all be treated as being stationary, but readers who are familiar with the concepts examined in later chapters will recognize that the same results apply if the variables are co-integrated.

One simple but fundamental problem that we address is the following: given a variable which in general depends upon its own past and on the values of various exogenous variables, how can we determine the long-run equilibrium relationship between the endogenous variable and the exogenous variables? If an endogenous variable $y_t$ is expressed as a function only of the value of a set of exogenous variables $z_t$ at the same point in time, the effect of $z_t$ on $y_t$ is immediate and complete; however, if a lag distribution applies to every variable in the model, the long-run effect must be derived as a function of all the lag distributions. Moreover, there are other types of information that can be revealed by a dynamic equation; any of a number of equivalent forms will provide the same information about, say, short-run and long-run adjustment, but different forms of the equation will reveal different types of information conveniently.

We will consider a number of ways in which to estimate long-run multipliers from dynamic regression models, and in doing so will examine several different types of model. After describing the general autoregressive-distributed lag (ADL) model from which the other models are derived, we first concentrate upon the error-correction model, in which the terms representing the extent of deviation from equilibrium are explicitly present in the estimated equation, and which therefore immediately displays information about the adjustment that a process makes to a deviation from some long-run equilibrium.

This chapter will emphasize two important points about linear transformations. First, each of the transformations contains precisely the same information: the estimated values of long-run multipliers, hypothesis test statistics, and explanatory powers of the differently transformed models are all identical. The choice of transformation can be made purely on the basis of convenience, and we will consider which ones are convenient for different purposes.
The second point is a corollary of the first, but is worth emphasizing: the estimates of short-run adjustment parameters from the error-correction model do not depend upon the parameter $\theta$ used in defining the error-correction term $y_{t-1} - \theta z_{t-1}$, as long as other levels terms are present to allow for adjustment to the chosen parameter. In particular, a value of unity for $\theta$ may be chosen, leading to what is called 'homogeneity' (an error-correction term of $y_{t-1} - z_{t-1}$), as long as the necessary extra terms are present.

Next, we consider several other transformations of the autoregressive-distributed lag model, due to Bewley (1979) (and discussed by Wickens and Breusch 1988) and Bårdsen (1989). Each of these transformations can be related to the error-correction transformation, and we indicate some of the implications of this fact for estimation using one or other of the transformations. Finally, we will discuss some potential difficulties in the estimation of long-run equilibrium relations and their interpretation, following McCallum (1984), Kelly (1985), and Hendry and Neale (1988).

While this chapter deals explicitly with stationary (I(0)) processes, many of the models considered can be used with co-integrated processes as well, as explored in Chapters 5 and 6. In particular, the equivalence of these transformations (in the sense that each form can be derived from any other by operating linearly on the variables) is relevant when dealing with the Granger Representation Theorem, also discussed in Chapter 5. This equivalence has implications for derivations of the distributions of coefficient estimates in co-integrated systems. In a particular transformation, for example, the variables may all be integrated of order zero, so that the asymptotic theory of stationary processes applies to the distributions of the estimates. Such a parameterization might be convenient for inference, because its information content is identical to that of the original parameterization, if for example that form contained both I(1) and I(0) variables. These issues are considered at length in Chapter 6, and the analysis in this chapter provides useful background for that discussion.

2.1. Transformations of a Simple Model

Before beginning a general treatment, we consider the first-order linear autoregressive-distributed lag model, denoted ADL(1,1), as an example and derive several linear transformations of it. Each transformation is equivalent in the sense that each implies the same relationship between exogenous and endogenous variables. The ADL(1,1) is

where $\varepsilon_t \sim \mathrm{IID}(0, \sigma^2)$ and $|\alpha_1| < 1$ (see Hendry, Pagan, and Sargan 1984).

First consider a static equilibrium defined, as above, as an environment in which all change has ceased, recalling that we are treating $(y_t, x_t)$ as jointly stationary. The long-run values are given by the unconditional expectations of the form $E(y_t)$ in (1a). Defining $y^* = E(y_t)$ and $x^* = E(x_t)\ \forall t$, we have, since $E(\varepsilon_t) = 0$,

and hence

or

Then $k_1$ is the long-run multiplier of $y$ with respect to $x$. Now subtract $y_{t-1}$ from both sides of (1a) and then add and subtract $\beta_0 x_{t-1}$ on the right-hand side to get

¹ Equation (1a) is invariant to such linear transformations which preserve the error process $\{\varepsilon_t\}$.
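The algebra described in this section can be sketched as follows. This is a reconstruction under the assumption that (1a) takes the standard ADL(1,1) form with intercept $\alpha_0$; the symbol $k_0$ for the long-run intercept is likewise an assumption, and the lines are not quotations of the numbered equations. The last two lines are forms in which the discrepancy between $y_{t-1}$ and $x_{t-1}$, or between $y_{t-1}$ and $k_1 x_{t-1}$, appears explicitly.

```latex
\begin{align*}
% Assumed form of the ADL(1,1) model
y_t &= \alpha_0 + \alpha_1 y_{t-1} + \beta_0 x_t + \beta_1 x_{t-1} + \varepsilon_t \\
% Static equilibrium: take expectations with y^* = E(y_t), x^* = E(x_t)
y^* &= \alpha_0 + \alpha_1 y^* + \beta_0 x^* + \beta_1 x^*
  \;\Longrightarrow\;
  y^* = \frac{\alpha_0 + (\beta_0 + \beta_1)\,x^*}{1 - \alpha_1} = k_0 + k_1 x^*,
  \qquad k_1 = \frac{\beta_0 + \beta_1}{1 - \alpha_1} \\
% Subtract y_{t-1}; add and subtract \beta_0 x_{t-1}
\Delta y_t &= \alpha_0 + (\alpha_1 - 1)\,y_{t-1} + \beta_0\,\Delta x_t
  + (\beta_0 + \beta_1)\,x_{t-1} + \varepsilon_t \\
% Homogeneous error-correction term y_{t-1} - x_{t-1}
\Delta y_t &= \alpha_0 + (\alpha_1 - 1)(y_{t-1} - x_{t-1}) + \beta_0\,\Delta x_t
  + (\alpha_1 + \beta_0 + \beta_1 - 1)\,x_{t-1} + \varepsilon_t \\
% Error-correction term using the long-run multiplier k_1
\Delta y_t &= \alpha_0 + (\alpha_1 - 1)(y_{t-1} - k_1 x_{t-1}) + \beta_0\,\Delta x_t + \varepsilon_t
\end{align*}
```

Each line follows from the one before by adding and subtracting the same quantity, so the error term $\varepsilon_t$ is unchanged throughout.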

Alternatively, we could have added and subtracted $(\beta_0 + \beta_1)x_{t-1}$ on the right side, to get

All of these equations imply the same relationship, because any one can be derived from another without violating the equality. In equations (1c) and (1d), however, terms representing the discrepancy between $y_{t-1}$ and $x_{t-1}$ or between $y_{t-1}$ and $k_1 x_{t-1}$ appear explicitly; the coefficient (the same for each form) on these terms can be taken as a measure of the speed of adjustment of $y$ to a discrepancy between $y$ and $x$ in the previous period. We examine such error-correction models in detail in the next section.

Equation (1b) is similar to (1c) and (1d) in that the same information appears explicitly as a coefficient; that is, [...]

[...] where $\theta_j$ is not equal to one. The $\theta_j$ are the equilibrium multipliers given above: $\theta_j = \beta_j(1)/\alpha(1)$; and if the $\theta_j$ were known, they could be inserted directly into the ECM terms in (3) and the terms in lagged $x$ could be eliminated.³ In terms of the parameters of (3),

Since the ECM is simply a linear transformation of the ADL model, we might ask what its distinguishing feature is. The answer is that in the ECM formulation, parameters describing the extent of short-run adjustment to disequilibrium are immediately provided by the regression. Although the form in (3) is analytically convenient, it is not a useful empirical specification. In practice, a single error-correction term at lag $r$ is preferable, as it induces a more interpretable and more nearly orthogonal parameterization.

The error-correction mechanism will be of particular value where the extent of an adjustment to a deviation from equilibrium is especially interesting. It is clear that the ECM provides this information when the error-correction terms are of the form $(y_{t-i} - \sum_j \theta_j x_{j,t-i})$, with $\theta_j$ a known parameter. If $\theta_j$ is not known it can be estimated; moreover, an unknown $\theta_j$ can implicitly be allowed for in the error-correction term through the inclusion of extra lags in the $x_j$, without affecting the magnitude of the estimated coefficients $\eta_i$ in (3). Hence these parameters do not need to be estimated at an earlier stage in order to allow us to use the ECM. In fact, an important point in favour of the generalized ECM (3) is that the estimated coefficients on the error-correction terms are unaffected by the incorporation of any constant $\theta$ into the term; this will be proved after we have established some other results which will simplify the proof. The implication is that we can interpret the coefficients $\eta_i$ in (3) directly as adjustments to disequilibrium even though the true disequilibrium term is given by $(y_{t-i} - \sum_j \theta_j x_{j,t-i})$ and not by $(y_{t-i} - \sum_j x_{j,t-i})$. Hence the use of a generalized ECM does not imply homogeneity ($\theta = 1$) as long as extra lags in the $x_j$ are incorporated, even though the error-correction terms that enter (3) do not explicitly allow for $\theta \neq 1$.

³ Note that this requires
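The invariance of the adjustment coefficient to the value of $\theta$ placed in the error-correction term can be checked numerically. The sketch below assumes a single regressor, an ADL(1,1) data-generation process with illustrative parameter values, and ordinary least squares via NumPy; none of these choices, nor the helper name `ecm_fit`, comes from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500
a0, a1, b0, b1 = 0.5, 0.7, 0.4, 0.2       # illustrative ADL(1,1) parameters

# Stationary AR(1) regressor and ADL(1,1) dependent variable.
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = 0.6 * x[t - 1] + rng.normal()
    y[t] = a0 + a1 * y[t - 1] + b0 * x[t] + b1 * x[t - 1] + rng.normal(scale=0.5)

dy, dx = np.diff(y), np.diff(x)
y_lag, x_lag = y[:-1], x[:-1]

def ecm_fit(theta):
    """OLS of dy on [1, dx, y_lag - theta * x_lag, x_lag]; returns coefficients and residuals."""
    X = np.column_stack([np.ones(T - 1), dx, y_lag - theta * x_lag, x_lag])
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    return coef, dy - X @ coef

coef_h, resid_h = ecm_fit(1.0)            # homogeneous term y_{t-1} - x_{t-1}
coef_k, resid_k = ecm_fit(0.3)            # arbitrary theta

# Adjustment coefficient (third element) and residuals agree across theta.
print("adjustment coefficients:", coef_h[2], coef_k[2])
# Implied long-run multiplier: theta - (coef on x_lag) / (adjustment coef).
k1_h = 1.0 - coef_h[3] / coef_h[2]
k1_k = 0.3 - coef_k[3] / coef_k[2]
print("long-run multipliers:", k1_h, k1_k, "true:", (b0 + b1) / (1 - a1))
```

Only the coefficient on the extra levels term `x_lag` changes with `theta`; it absorbs the difference between the chosen value and the long-run multiplier.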

2.3. An Example

An example of the use of the error-correction mechanism can be found in Davidson et al. (1978), who use a homogeneous ($\theta = 1$) error-correction mechanism in the modelling of consumers' expenditure. The 'error' to which adjustment is made in the model is the difference between the logarithms of consumption and income, each lagged four quarters. The error-correction term is significant in a wide variety of specifications. In particular, using quarterly seasonally unadjusted data from the United Kingdom, expressed at constant prices over the sample period 1958(I)-1970(IV), the authors favour the model⁴ (standard errors in parentheses):

where the statistics $z_1$ and $z_2$ are asymptotic $\chi^2$ tests for parameter constancy and serially independent residuals, respectively, with degrees of freedom in parentheses; $\hat{C}_t$ is the fitted value of real consumers' expenditure on non-durable goods and services $C_t$; $Y_t$ is real personal disposable income; $P_t$ is the price deflator for consumption; and $D^o$ is a dummy variable for changes in taxation. The error-correction term has a coefficient that is reasonably substantial as well as statistically significant at conventional levels. The model can readily be derived from an ADL model, noting that $\log(C/Y)_{t-4} = \log C_{t-4} - \log Y_{t-4} = c_{t-4} - y_{t-4}$, using lower-case letters to denote logarithms.

⁴ The symbol $\Delta_1\Delta_4$ represents the first difference of the fourth difference; e.g. $\Delta_4 \log Y_t - \Delta_4 \log Y_{t-1} = \Delta_1\Delta_4 \log Y_t$.

On the additional assumption that $\Delta_4 c_t$, $\Delta_4 y_t$, and $\Delta_4 p_t$ are stationary, with $E(\Delta_4 c_t) = g_c$, $E(\Delta_4 y_t) = g_y$, and $E(\Delta_4 p_t) = p_a$ (the annual rate of inflation), then, taking expectations of the equation above for fixed values of the estimated parameters,

Hence $C^* = kY^*$ where $k = \exp(-5.3 g_y - 1.3 p_a)$, noting that $g_c = g_y$ given the proportional long-run solution. This form of solution is consistent with the life-cycle hypothesis (see Deaton and Muellbauer 1980), in which case the coefficients of $g_y$ and $p_a$ should correspond to the negatives of the annual wealth-income and liquid asset-income ratios. The resulting values seem sensible. For positive real growth or inflation, $k$ [...]

[...]) show. In demonstrating these points we will make use of the general structure that Wickens and Breusch use to compare linear transformations of regression models. Take as a basic structure the regression model

where the $X$ matrix contains lagged (but not contemporaneous) $y$ as well as contemporaneous and lagged $x$ terms, and $\gamma$ is a $k \times 1$ vector. Define this as corresponding to the ADL model (2). The representations (4) and (5) involve transforming the matrices $\gamma$ and $X$ by a transformation matrix $A$, such that, following Wickens and Breusch,

so that

For example, take $m = n = 2$ and $p = 1$ in (2) so that the matrix of the transformation to the Bårdsen form (5) is

[Equation (10a): a transformation matrix with entries 0, 1, and -1; its layout is not recoverable here.]

since $x_t' = [y_t, 1, y_{t-1}, y_{t-2}, x_t, x_{t-1}, x_{t-2}]$ maps onto the transformed vector $[\Delta y_t, 1, \Delta y_{t-1}, \Delta x$ [...]

[...] 1, then we have

From (1) it is clear that $\Delta y_t$ is no longer stationary: it depends not only upon the stationary process $u_t$, but also upon the non-stationary process $y_{t-1}$ (since $\rho - 1 > 0$). Hence an AR(1) process with a coefficient of 1 is I(1), but the same process with a coefficient of 1.01 is not, since differencing will not reduce this process to stationarity.

Many economic time series may contain an exact unit root if we consider logarithmic transformations of the form routinely applied to economic time series. Otherwise, roots very close to, but slightly greater than, unity imply non-stationary series that are not I($d$) for any $d$. Roots slightly less than unity generate near-integrated series. Such processes will tend to be difficult to distinguish from those with roots of exactly unity on moderately sized samples; such processes are discussed in Chapter 3. Roots substantially greater than unity, by contrast, will be easily detected as the explosive character of the series will be clear with even fairly small samples.

Consider the simplest data-generation process within which we can discuss tests for unit roots:


If one were testing the true hypothesis $H_0: \rho = \rho_0$ for $\rho_0 < 1$, the test would be easily performed. Running the regression (2), the t-statistic $(\hat{\rho} - \rho_0)/\mathrm{SE}(\hat{\rho})$ has, asymptotically, a standard normal distribution and can be compared with tables of significance points for $N(0, 1)$. In small samples the statistic is approximately t-distributed, although the coefficient estimate $\hat{\rho}$ is biased downward slightly.

For $\rho_0 = 1$, however, this result no longer holds. The distribution of the test statistic just given is not asymptotically normal, or even symmetric. Critical values have been tabulated by D. A. Dickey and are reported in, e.g., Fuller (1976). It is instructive to examine these in detail, and they are recorded as Tables 4.1 and 4.2.

The critical values in Fuller's tables pertain to each of three different models: it is important to note at the outset that, as in many other instances, the distributions of test statistics obtained depend not only on the data-generation process, but also on the model with which we investigate it. For the time being, we will consider three possible models:

The null hypothesis is that $\rho_i = 1$ for $i = a, b, c$. The applicability of each model depends on what is known about the DGP, since we want to construct similar tests (that is, tests for which the distribution of the test statistic under the null hypothesis is independent of nuisance parameters in the DGP). If a test is not similar, then the appropriate critical values may depend upon unknown nuisance parameters (e.g. a constant), which will invalidate standard inferences. We will return to the similarity of tests below. For the moment, we will follow much of the literature on the topic in assuming that (2) is the DGP, in which case the issue does not arise since (2) contains no nuisance parameters.

Another formulation of the DGP deals with a potential difficulty that arises from (2) concerning the status of the nuisance parameters under the alternative $H_1: \rho < 1$. Reconsider (2) when there is an intercept [...]

[...] arbitrary $y_0$ [...] Extension of (3c) necessary

Thus, for example, in case (i), if the model is given by (3c), the appropriate critical values are given by Tables 4.1(c) and 4.2(c). The same tables can be used to conduct inference in (iii), despite a non-zero value of $\mu$ in the DGP, because (3c) yields a similar test. Similarity implies that the distributions of $\hat{\rho}$ and its associated t-statistic are not affected by the value, under the null, of the nuisance parameter, and the critical values are the same as the ones that would apply for $\mu = 0$, namely, those in Tables 4.1(c) and 4.2(c).

There are a number of noteworthy additional points. In case (i) there are no nuisance parameters, so that similarity is a trivial property. In general, as this summary suggests, a similar test having a Dickey-Fuller distribution requires that the model used contain more parameters than the DGP. In order to have a similar test for (iv), one would then need a model with a term such as $t^2$, necessitating another block of critical values in each of Tables 4.1 and 4.2. In case (ii), for example, we need at least model (3b) (with a constant) to allow for the unknown starting value. In case (iii) we have an unknown constant and need the trend term in model (3c) to allow for its effect.

Each of these similar tests is also exact in finite samples, provided appropriate critical values are available. In general, however, it will be necessary to abandon exact tests in order to use variants of the Dickey-Fuller test where there are more unknown parameters. These parameters can typically be estimated, so that asymptotically they can be accounted for and a test provided. Again, Kiviet and Phillips offer general exact and similar tests for DGPs where the dynamics are restricted to first-order, as well as demonstrating the similarity of the tests just mentioned.

In the case of exact parameterizations, such as case (iii) with model (3b), we do not have similar tests with the Dickey-Fuller distributions. However, as West (1988) showed, the t-statistics in the exactly parameterized case (with exogenous items such as a constant in the DGP) are asymptotically normal, just as are t-statistics used for standard problems. In finite samples, however, the Dickey-Fuller distributions may be a better approximation than the normal distribution. We will explore this asymptotic normality further in Chapter 6 below.

¹ Critical values are those corresponding to the model used in Table 4.1 or 4.2.
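The non-normality of the unit-root test statistic when $\rho_0 = 1$ can be seen in a small Monte Carlo experiment. This is a minimal sketch, assuming the DGP is a driftless random walk with standard normal errors and $y_0 = 0$, and that the fitted model is the regression of $y_t$ on $y_{t-1}$ with no constant or trend; the sample size, replication count, and function name are illustrative and not taken from the text.

```python
import random
import statistics

def df_t_statistic(T, rng):
    """t-ratio for rho = 1 in the regression y_t = rho * y_{t-1} + u_t (no constant)."""
    y = [0.0]
    for _ in range(T):
        y.append(y[-1] + rng.gauss(0.0, 1.0))
    lagged, current = y[:-1], y[1:]
    sum_ll = sum(l * l for l in lagged)
    rho_hat = sum(l * c for l, c in zip(lagged, current)) / sum_ll
    rss = sum((c - rho_hat * l) ** 2 for l, c in zip(lagged, current))
    s2 = rss / (T - 1)                      # one estimated coefficient
    return (rho_hat - 1.0) / (s2 / sum_ll) ** 0.5

rng = random.Random(42)
T, replications = 100, 5000
stats = sorted(df_t_statistic(T, rng) for _ in range(replications))

q05 = stats[int(0.05 * replications)]       # empirical 5% point
share_negative = sum(s < 0 for s in stats) / replications

print(f"empirical 5% point: {q05:.2f} (standard normal: -1.645)")
print(f"median: {statistics.median(stats):.2f}, share below zero: {share_negative:.2f}")
```

Under these assumptions the empirical 5% point should fall well below the standard normal value, and most of the mass should lie below zero, which is why normal critical values would reject a true unit root too often.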


4.2. General Dynamic Models for the Process of Interest

The first of the methods for allowing richer dynamics in the DGP of the process of interest, $\{y_t\}$, was developed concurrently with the test that we have already described for a unit root in the AR(1) model, and is reported in Fuller (1976). These more general methods yield test statistics that have the same limiting distributions as those already discussed, because they are based on consistent estimates of 'nuisance' parameters. Hence we may use the last rows of Tables 4.1(a)-(c) or 4.2(a)-(c) for inference with these statistics in large samples, but in small samples percentage points of their distributions will not in general be the same as for those applicable under the strong assumptions of the simple Dickey-Fuller model.

When $y_t$ follows an AR($p$) process,

a test can be constructed with the regression model:

The coefficient $\rho$ is used to test for a unit root, and $T(\hat\rho - 1)$ and $(\hat\rho - 1)/\mathrm{SE}(\hat\rho)$ have the limiting distributions tabulated in Tables 4.1(a) and 4.2(a) for $T \to \infty$. Moreover, just as in the case of an AR(1) process, we can extend this regression model to allow for the possibility that the data-generation process contains a constant (drift) term or a deterministic time trend. Again, for suitably modified regression models, the asymptotic distributions of the statistics based on $\hat\rho$ are those given in Tables 4.1(b)/(c) and 4.2(b)/(c) for $T \to \infty$. These procedures are called 'augmented' Dickey-Fuller (ADF) tests.

The aim in modifications such as these to the simpler form of the Dickey-Fuller test is to use lagged changes in the dependent variable to capture autocorrelated omitted variables which would otherwise, by default, appear in the (necessarily autocorrelated) error term. With the additional lagged terms it will be possible, if the DGP has the form of (4), to produce a model (5) in which asymptotically the error terms are white noise, because the nuisance parameters are known asymptotically and the terms involving them may be removed from the error term. With white-noise errors, the asymptotic Monte Carlo critical values given in the first two tables may be applied. Moreover, the asymptotic distribution of the coefficient on the $y_{t-1}$ term in (5) is not affected by the inclusion of the additional $\Delta y_{t-j}$ terms. If $y_t$ is I(1), the differenced terms are all I(0) and appropriate scaling ensures that the variance-covariance matrix is asymptotically block-diagonal. (That is, all cross-product terms of I(0) and I(1) variables in the matrix are asymptotically negligible.) It is this asymptotic orthogonality that drives the result, much as, in a standard regression model, one uses the orthogonality of the information matrix to prove the statistical independence of the estimated coefficient vector from the estimate of the standard error. The asymptotic theory and the issue of 'appropriate' scaling are discussed later in this chapter and in Chapter 6.

By allowing the DGP to take the form (4) rather than the much more restrictive AR(1) form (3), we have expanded the class of models to which we can validly apply unit-root tests of this type. Note that, as $p$ will generally be unknown even where $y_t$ is strictly an AR($p$) process, it is generally safer to take $p$ to be a fairly generous number; if too many lags are present in (5), the regression is free to set them to zero at the cost of some loss in efficiency, whereas too few lags implies some remaining autocorrelation in (5) and hence the inapplicability of even the asymptotic distributions in Tables 4.1 and 4.2. One can, of course, perform tests for autocorrelation on the estimated residuals from (5) in order to check the acceptability of the premise that these residuals are white noise. Alternatively, model selection procedures can be used to choose $p$, and test for a unit root, jointly (see Hall 1990).

We have, therefore, a class of tests for the unit root which can validly be applied to series that follow AR($p$) processes containing no more than one unit root.
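As an illustration of the augmented regression just described, the following is a minimal sketch rather than a definitive implementation. It assumes the regression is written in the familiar difference form, $\Delta y_t = (\rho - 1)y_{t-1} + \sum_{j} \gamma_j \Delta y_{t-j} + \text{error}$, with no constant or trend; the function name `adf_regression`, the symbol $\gamma_j$, and the choice of NumPy are assumptions made for the example.

```python
import numpy as np

def adf_regression(y, lags):
    """OLS of dy_t on y_{t-1} and `lags` lagged differences (no constant).

    Returns (rho_hat, t_stat) with t_stat = (rho_hat - 1) / SE(rho_hat).
    Hypothetical helper for illustration only.
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                       # dy[i] = y[i+1] - y[i]
    lhs = dy[lags:]                       # regressand: current differences
    cols = [y[lags:-1]]                   # lagged level y_{t-1}
    for j in range(1, lags + 1):          # augmentation terms dy_{t-j}
        cols.append(dy[lags - j:len(dy) - j])
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, lhs, rcond=None)
    resid = lhs - X @ beta
    s2 = resid @ resid / (len(lhs) - X.shape[1])   # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)
    rho_hat = 1.0 + beta[0]               # coefficient on y_{t-1} is rho - 1
    t_stat = beta[0] / np.sqrt(cov[0, 0])
    return rho_hat, t_stat

rng = np.random.default_rng(0)
random_walk = np.cumsum(rng.standard_normal(1000))
print(adf_regression(random_walk, lags=2))
```

The resulting t-type statistic would be compared with the Dickey-Fuller tables rather than with Student's t; adding a column of ones (and a trend) to `X` gives the drift and trend variants.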
The next natural step is to attempt to extend further the class of series to which we can apply such tests, ideally in such a way as to allow exogenous variables to enter the process as well. Said and Dickey (1984) provide a test procedure valid for a general ARMA process in the errors; Phillips (1987a) and Perron and Phillips (1988) offer a still more general procedure.

While the Said-Dickey approach does represent a generalization of the Dickey-Fuller procedure, it again yields test statistics with the same asymptotic critical values as those tabulated by Dickey and Fuller. The particular advantage of this test is that we can apply it not only to models with MA parts in the errors, but also to models for which (as is typically the case) the orders of the AR and MA polynomials in the error process are unknown. The method involves approximating the true process by an autoregression in which the number of lags increases with sample size.

Begin by assuming that the data-generation process follows:


so that the error term in the autoregression follows an ARMA($p$, $q$) process, presumed to be stationary and invertible. The DGP can be rewritten as

where $k$ is large enough to allow a good approximation to the ARMA($p$, $q$) process $\{u_t\}$, so that $\{v_t\}$ is approximately white noise. The null hypothesis is again that $\rho = 1$. Said and Dickey show that the test is valid in spite of the facts that $p$ and $q$ are unknown and that the ARMA($p$, $q$) is approximated by an AR process, as long as $k$ increases with the sample size $T$ so that there exist numbers $c$ and $r$, $c > 0$ and $r > 0$, such that $ck > T^{1/r}$ and $T^{-1/3}k \to 0$. Hence $T^{1/3}$ is an upper bound on the rate at which the number of lags, $k$, should be made to grow with the sample size. Ordinary least-squares estimation of the model (6) is proven to yield a consistent estimator of $(\rho - 1)$; the test can then be based on the $t$-type statistic, $(\hat\rho - 1)/\mathrm{SE}(\hat\rho)$, using Table 4.2(a). Clearly, the form of the regression implied by the Said-Dickey test is precisely the same as that of the augmented Dickey-Fuller test.

In this case Table 4.2(a), corresponding to a model containing no drift or trend, is used, but the test can also be adapted to allow for a non-zero drift term $\mu$ in the model. The test is modified only in so far as it is then based not on $y_t$ but on $y_t - \bar{y}$, where $\bar{y} = T^{-1}\sum_{t=1}^{T} y_t$. The regression model (6) remains the same except for the first regressor, which becomes $(y_{t-1} - \bar{y})$, and test statistics are calculated in the same way. By analogy to the earlier results for Dickey-Fuller and augmented Dickey-Fuller tests, it is not surprising that we now refer to Table 4.2(b), corresponding to a model containing a drift term, for the significance points of the (asymptotic) distributions of the statistics.

Monte Carlo studies of test power in models with autocorrelated error processes, described by Dickey et al. (1986), suggest that the empirical levels of the $T(\hat\rho - 1)$ statistics tend to be farther from the nominal test levels than those of the $t$-type statistics. Dickey et al. therefore suggest the use of the $t$-type statistics in these cases. Deviation of nominal from actual test levels is particularly great in DGPs with MA parts such that the MA lag polynomial contains a factor of $(1 - \theta L)$, with $\theta$ near unity. The near-cancellation of such a factor with the factor $(1 - L)$ in the AR lag polynomial (under the null) affects the actual levels of both $T(\hat\rho - 1)$ and $t$-type statistics, but is especially serious for the former.
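One way to see why an MA factor $(1 - \theta L)$ with $\theta$ near unity is troublesome for an autoregressive approximation is to look at how slowly the implied AR weights die out. The sketch below is an illustration only: it assumes an error $u_t = (1 - \theta L)e_t$ with $|\theta| < 1$, and the function names and the 0.05 tolerance are arbitrary choices, not part of the Said-Dickey procedure.

```python
import math

def ar_weights(theta, k):
    """First k autoregressive weights of u_t = (1 - theta*L) e_t.

    Inverting the MA factor gives u_t = -sum_{j>=1} theta**j * u_{t-j} + e_t,
    so the weight on lag j is -theta**j.
    """
    return [-(theta ** j) for j in range(1, k + 1)]

def lags_needed(theta, tol=0.05):
    """Smallest k for which |theta|**k <= tol (tol is an arbitrary cut-off)."""
    return math.ceil(math.log(tol) / math.log(abs(theta)))

for theta in (0.5, 0.8, 0.95):
    print(theta, lags_needed(theta))
```

The number of lags needed before the omitted weights become small rises sharply as $\theta$ approaches one, which is consistent with the requirement that $k$ grow with the sample size.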

4.3. Non-parametric Tests for a Unit Root

In extending the original tests above to allow for higher-order autocorrelation, we added extra terms to the regression model to account for the autocorrelation in the residuals that would otherwise be present. By extending the model, it was possible to continue to draw valid inferences from the asymptotic critical values given in Tables 4.1 and 4.2; otherwise it would have been necessary to recompute these critical values for each different DGP, which in turn would require knowledge of the unobservable orders ($p$) of the processes in these underlying DGPs.

In expanding the set of models to which we can apply these tests, our aim is to avoid increasing the number of tables of critical values that we must find and use while nonetheless allowing for quite general DGPs. Phillips (1987a) provides an alternative procedure that largely allows us to do so; our exposition relies on further results reported in Perron (1988) and Phillips and Perron (1988). Rather than taking account of extra elements in the DGP by adding them to the regression model, Phillips suggests accounting for the autocorrelation that will be present (when these terms are omitted) through a non-parametric correction to the standard statistics. That is, while the Dickey-Fuller procedure aims to retain the validity of tests based on white-noise errors in the regression model by ensuring that those errors are indeed white noise, the Phillips procedure acts instead to modify the statistics after estimation in order to take into account the effect that autocorrelated errors will have on the results. Asymptotically, the statistic is corrected by the appropriate amount, and so the same limiting distributions apply. From one perspective, the effect is the same as that of ADF-type tests: we can validly conduct asymptotic inference using Tables 4.1 and 4.2.
This procedure does not, however, require the estimation of additional parameters in the regression model.

The data-generation process that is assumed to hold is

or equivalently

It is important to note, however, that the error term is not being assumed to follow a white-noise process. The conditions that $u_t$ must satisfy in (7a) and (7b) are those listed above in Chapter 3 as conditions (3.16a)-(3.16d), given in Phillips (1987a).

As with the Dickey-Fuller tests, tests of the Phillips type are based upon one of three different regression models, differing only in one case from those used earlier, by centring the trend term:

$$y_t = \rho_a y_{t-1} + u_{at}, \qquad (8a)$$

$$y_t = \mu_b + \rho_b y_{t-1} + u_{bt}, \qquad (8b)$$

and

$$y_t = \mu_c + \gamma_c\,(t - T/2) + \rho_c y_{t-1} + u_{ct}. \qquad (8c)$$

It is easy to calculate from these regressions the coefficient estimates and the '$t$-statistics' for each. For tests of the significance of $\rho_i$, the statistics are then adjusted to reflect autocorrelation in the corresponding $u_{it}$ series. (We will omit subscripts $a$, $b$, or $c$ on $u_t$ to simplify notation.) If we define
$$\sigma_u^2 = \lim_{T\to\infty} T^{-1}\sum_{t=1}^{T} E(u_t^2)$$

and
$$\sigma^2 = \lim_{T\to\infty} E\bigl(T^{-1}S_T^2\bigr), \qquad S_T = \sum_{t=1}^{T} u_t,$$

then the limiting distributions of the test statistics do not depend upon the parameters of the process determining the sequence $\{u_t\}$ if $\sigma^2 = \sigma_u^2$. In the case of test statistics of the Dickey-Fuller (DF) type that we examined earlier, the model is presumed to capture the relevant features of the process in such a way that the errors are independently and identically distributed; the latter is sufficient to guarantee that $\sigma^2 = \sigma_u^2$. Note that the statistics used in the DF-type parametric tests do emerge as special cases of the non-parametric statistics where the estimates of the parameters $\sigma^2$ and $\sigma_u^2$ are equal (i.e. where the estimates $S_u^2$ and $S_{T\ell}^2$, given in (11) and (12) below, are equal).

We will see this more clearly when we examine the non-parametric statistics. In order to do so, we first need consistent estimators of $\sigma^2$ and $\sigma_u^2$. There are a number of possible choices. If $\mu = 0$ in the DGP (7), then the standard estimator from any of (8a), (8b), (8c) will be consistent for $\sigma_u^2$; that is,
$$S_u^2 = T^{-1}\sum_{t=1}^{T}\hat{u}_t^2, \qquad (11)$$

where $\hat{u}_t$ represents the residuals from one of (8a), (8b), (8c), above. If $\mu \neq 0$, the estimator is not consistent using the residuals $\{\hat{u}_{at}\}$, but residuals from either of the other two models do yield a consistent estimate.

For the estimator of $\sigma^2$, a consistent estimator can be found at the cost of strengthening the assumptions. First, condition (3.16b) is replaced with the condition that $\sup_t E(|u_t|^{2\beta}) < \infty$ for some $\beta > 2$. Next, a condition must be placed on the lag truncation parameter $\ell$ which will be used in defining the estimator of $\sigma^2$. The condition is that $\ell \to \infty$ as $T \to \infty$, such that $\ell$ is $o(T^{1/4})$. That is, the number of lags used in estimating autocorrelations of the residuals increases with the sample size, but less quickly than its fourth root. Given these conditions, a consistent estimator of $\sigma^2$ is
$$S_{T\ell}^2 = T^{-1}\sum_{t=1}^{T}\hat{u}_t^2 + 2T^{-1}\sum_{j=1}^{\ell}\sum_{t=j+1}^{T}\hat{u}_t\hat{u}_{t-j}. \qquad (12)$$

The estimator is indexed by the lag truncation parameter $\ell$ to indicate that different choices of $\ell$ will lead to different values. It remains only to specify the residuals to be used in (12), and, as in (11) above, we may choose them from any of (8a), (8b), (8c) if $\mu = 0$. Also as in (11), $\mu \neq 0$ requires that we use the residuals from one of the models that does contain a constant term in order to preserve the consistency of this variance estimate. Evidently the safe strategy is to take residual estimates from (8b) or (8c) in any case where there seems even a small probability that the data-generation process contains a constant (drift) term.

It is important to note that both of the variance estimates $S_u^2$ and $S_{T\ell}^2$ could be defined using the first differences $y_t - y_{t-1}$ rather than the residuals $\hat{u}_t$. Under the null hypothesis that $\rho = 1$ and that the drift and trend terms are zero, the two will of course be equivalent asymptotically. In finite samples, which of the two methods is used can make a substantial difference, however; we will return to this point below.

While $S_{T\ell}^2$ just defined is consistent for $\sigma^2$ given residuals from the appropriate model, it unfortunately does not guarantee a non-negative estimate for finite sample sizes. However, one can guarantee a non-negative estimate with a simple modification of (12) pioneered by Newey and West (1987), which is moreover consistent under precisely the same conditions as is (12). Define
$$S_{T\ell}^2 = T^{-1}\sum_{t=1}^{T}\hat{u}_t^2 + 2T^{-1}\sum_{j=1}^{\ell}\omega_\ell(j)\sum_{t=j+1}^{T}\hat{u}_t\hat{u}_{t-j}, \qquad (13)$$

where $\omega_\ell(j) = 1 - j(\ell + 1)^{-1}$. A few examples of tests using these quantities to transform the test statistics can be presented without further discussion. Thereafter we will present statistics for hypotheses on $\mu_b$, $\mu_c$, and $\gamma_c$ in (8b) and (8c), and for hypotheses involving $\rho$ as well as these parameters.

Consider the hypothesis that $\rho_b = 1$ (in (8b)).² An asymptotically valid test consists of the statistic³

$$Z(\hat\rho_b) = T(\hat\rho_b - 1) - \frac{S_{T\ell}^2 - S_u^2}{2\,T^{-2}\sum_{t=1}^{T}(y_{t-1} - \bar{y}_{-1})^2} \qquad (14)$$

or, alternatively,

$$Z(t(\hat\rho_b)) = \frac{S_u}{S_{T\ell}}\,t(\hat\rho_b) - \frac{S_{T\ell}^2 - S_u^2}{2\,S_{T\ell}\bigl[T^{-2}\sum_{t=1}^{T}(y_{t-1} - \bar{y}_{-1})^2\bigr]^{1/2}} \qquad (15)$$

where $t(\hat\rho_b)$ is the $t$-statistic associated with testing the null hypothesis $\rho_b = 1$ and $\bar{y}_{-1}$ is the sample mean of $y_{t-1}$. The first of these statistics, $Z(\hat\rho_b)$, has under the null hypothesis ($H_0$: $\rho_b = 1$) the limiting distribution given in Table 4.1(b) ($T \to \infty$); the second has the limiting distribution given in Table 4.2(b) ($T \to \infty$) under the same null. It is especially useful to note again here the fact that the original Dickey-Fuller statistics are special cases of these. Under Dickey and Fuller's assumptions, the $\{u_t\}$ are independently and identically distributed, implying, as we noted above, that $\sigma_u^2 = \sigma^2$ and therefore that $E(S_{T\ell}^2) = E(S_u^2)$. Hence on average $S_{T\ell}^2 = S_u^2$, and $Z(\hat\rho_b)$ reduces to $T(\hat\rho_b - 1)$. This is precisely the first of the statistics that Dickey and Fuller examine. Moreover, $Z(t(\hat\rho_b))$ reduces to $t(\hat\rho_b)$, the ordinary regression $t$-statistic, and has the distribution given in Table 4.2.

² We treat the initial observation as fixed at zero; not all statistics here are invariant to the initial value. See Phillips (1987a) and Perron (1988).
³ These statistics are valid for either choice of $S_{T\ell}^2$ given above (i.e. the Phillips or Newey-West forms).

The corresponding statistics for models (8a) and (8c) are also given in Perron (1988), and share this property. For (8a), the test statistics are similar to (14) and (15). They are (with $y_0 = 0$)

and

Analogous to the tests on (8b), (16) has the significance points given in Table 4.1(a) and (17) those in Table 4.2(a). Finally, for model (8c), we have

and

having the limiting distributions tabulated in Tables 4.1(c) and 4.2(c) respectively. The quantity $D_x$ is defined as the determinant of the inner product of the data matrix with itself: for (8c),

where, again, summations are over all available elements of the vectors.
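The non-parametric correction can be sketched for the regression with a constant, model (8b). This is a hedged illustration, not the authors' code: it assumes the usual Phillips-Perron form of the two corrected statistics for that model and the Bartlett weights $\omega_\ell(j) = 1 - j/(\ell + 1)$, and the function names `newey_west_variance` and `phillips_perron_8b` are hypothetical.

```python
import numpy as np

def newey_west_variance(u, ell):
    """Long-run variance estimate with weights w(j) = 1 - j / (ell + 1)."""
    u = np.asarray(u, dtype=float)
    T = len(u)
    s2 = u @ u / T
    for j in range(1, ell + 1):
        w = 1.0 - j / (ell + 1.0)
        s2 += 2.0 * w * (u[j:] @ u[:-j]) / T
    return s2

def phillips_perron_8b(y, ell):
    """Corrected statistics for y_t = mu + rho * y_{t-1} + u_t (assumed form)."""
    y = np.asarray(y, dtype=float)
    y_lag, y_cur = y[:-1], y[1:]
    T = len(y_cur)
    X = np.column_stack([np.ones(T), y_lag])
    beta, *_ = np.linalg.lstsq(X, y_cur, rcond=None)
    u = y_cur - X @ beta
    rho = beta[1]
    se_rho = np.sqrt((u @ u / (T - 2)) * np.linalg.inv(X.T @ X)[1, 1])
    t_rho = (rho - 1.0) / se_rho
    s2_u = u @ u / T                          # short-run variance estimate
    s2_tl = newey_west_variance(u, ell)       # long-run variance estimate
    m = np.sum((y_lag - y_lag.mean()) ** 2) / T ** 2
    z_rho = T * (rho - 1.0) - (s2_tl - s2_u) / (2.0 * m)
    z_t = (np.sqrt(s2_u / s2_tl) * t_rho
           - (s2_tl - s2_u) / (2.0 * np.sqrt(s2_tl * m)))
    return {"rho": rho, "t_rho": t_rho, "z_rho": z_rho, "z_t": z_t}

rng = np.random.default_rng(1)
y = np.cumsum(rng.standard_normal(200))
print(phillips_perron_8b(y, ell=4))
```

With `ell=0` the two variance estimates coincide, so the corrected statistics collapse to $T(\hat\rho_b - 1)$ and the ordinary $t$-statistic, which mirrors the special-case remark in the text.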


In addition to the extension of the Phillips (1987a) results to the case of regression models containing constant and trend, Phillips and Perron (1988) present simulation evidence regarding the power of the Phillips-type procedures vis-à-vis that of the Said-Dickey procedure, each being applicable to processes that have general ARMA($p$, $q$) processes in the errors from a regression model that consists of a constant and lagged dependent variable. The data-generation process is taken to be

To characterize the results roughly, the Phillips or Phillips-Perron test generally has higher power, but suffers substantial size distortions for $\theta < 0$, in samples of sizes typically found in economics. The Said-Dickey test also involves size distortions for $\theta < 0$, but much smaller ones: that is, each test rejects a true null of $\rho = 1$ more than the nominal size (5 per cent in these experiments) states, but the problem is much worse for the $Z(\hat\rho)$ and $Z(t(\hat\rho))$ statistics of Phillips and Perron, where rejections of the true null range as high as 99.7 per cent for $\theta = -0.8$. (Size and power also depend upon the number of lags chosen in the Said-Dickey test and on the lag truncation parameter in the Phillips-Perron tests.) For the Said-Dickey test, the largest size distortions (with two lags, a true null is rejected approximately 67.7 per cent of the time at a nominal size of 5 per cent) largely disappear as the number of lags used increases, falling to only 12 per cent where 12 lags are used.

This simulation study is of course a limited one, dealing as it does with only one ARMA process for the equation errors. It does however suggest that the Phillips-type tests are more likely to reject the null of a unit root, whether or not it is false; for errors with strong negative MA components, the difference is quite large. One might suspect as well that the power of the Said-Dickey procedure would be higher for processes involving AR errors, because the test regression captures AR terms precisely.

Phillips and Perron conclude by recommending their own $Z(\hat\rho)$ test for models with positive MA or IID errors, and the Said-Dickey statistic for models with negative MA errors.
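The dependence of size on the number of lags can be explored with a small Monte Carlo experiment. The sketch below is illustrative only and does not reproduce the Phillips-Perron design: it assumes a random walk with MA(1) errors, $u_t = e_t + \theta e_{t-1}$, a regression with a constant, and an asymptotic 5 per cent critical value of $-2.86$ for the $t$-type statistic; the function names, sample size, and replication count are arbitrary choices.

```python
import numpy as np

def adf_t_with_constant(y, lags):
    """t-type statistic on y_{t-1} in a regression with constant and lags."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    lhs = dy[lags:]
    cols = [np.ones(len(lhs)), y[lags:-1]]
    for j in range(1, lags + 1):
        cols.append(dy[lags - j:len(dy) - j])
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, lhs, rcond=None)
    resid = lhs - X @ beta
    s2 = resid @ resid / (len(lhs) - X.shape[1])
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se

def rejection_rate(theta, lags, T=100, reps=200, crit=-2.86, seed=0):
    """Share of replications rejecting a true unit-root null."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        e = rng.standard_normal(T + 1)
        u = e[1:] + theta * e[:-1]      # MA(1) errors (assumed design)
        y = np.cumsum(u)                # random walk, so the null is true
        if adf_t_with_constant(y, lags) < crit:
            rejections += 1
    return rejections / reps

for k in (2, 12):
    print(k, rejection_rate(theta=-0.8, lags=k))
```

With a strongly negative $\theta$, short lag lengths should over-reject substantially while longer ones bring the rejection frequency much closer to the nominal level, in line with the pattern described above.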

4.4. Tests on More than One Parameter

The tests above have all been directed at testing the level autoregressive parameter alone. In models (8b) and (8c), however, there are other parameters present, and one may be interested in a formal test of the hypothesis that one of these is zero, or in a joint test. Tests similar to those above can be provided, but a further set of tables must be used to find the significance points of the distributions of the resulting test statistics. Tables 4.4 and 4.5 below are based on those given by Dickey and Fuller (1981), who provide likelihood ratio, $t$-type, and $F$-type statistics for tests on the parameters $\mu_b$, $\mu_c$, and $\gamma_c$ in (8b) and (8c). The tables are again derived from a Monte Carlo simulation.

The statistics that Dickey and Fuller offer are derived under the assumption that $u_{bt}$ and $u_{ct}$ are white-noise processes, but they show that, as was the case with tests above, the same distributions can be applied where the errors follow an autoregressive process and a correctly specified model is used to estimate the parameters of this process. As we noted earlier, however, it is desirable to generalize the tests to be applicable to as broad as possible a class of error processes, of unknown form. This can be done, once again, using a non-parametric correction.

Table 4.3 summarizes the $t$-type, $F$-type, and non-parametric test statistics used for several null hypotheses involving the parameters $\mu$ and $\gamma$. In addition to the quantities defined above, we require

The Phillips-Perron corrections to the standard Dickey-Fuller statistics must however be used cautiously. Again, the accumulated evidence of several Monte Carlo simulation studies suggests that the non-parametrically corrected test statistics do not always have the correct sizes even in fairly large samples.

Schwert (1989) makes this point forcefully. His results, amplifying those in the Phillips-Perron simulations reported earlier, show that the critical values of the augmented Dickey-Fuller test statistics, given by the standard Dickey-Fuller tables, are much more robust to the presence of moving average terms in the errors of the random-walk process than are the corresponding non-parametrically adjusted Dickey-Fuller statistics. An example, taken from Schwert, is sufficient to illustrate the point. The data-generation process is given by⁴

$$y_t = y_{t-1} + u_t + \theta u_{t-1}$$

⁴ For conformity with the notation of Phillips-Perron used earlier, the sign of the coefficient on $\theta$ is changed here.

TABLE 4.3(a). Test statistics for simple hypotheses in models with drift and trend (a)

[Columns: Statistic type; Test statistic]

(a) Critical values for $Z(\tau_1)$, $Z(\tau_2)$, and $Z(\tau_3)$ are the same as those for $\tau_1$, $\tau_2$, and $\tau_3$ respectively and are tabulated in Table 4.4. Note also that $S_u^2$ and $S_{T\ell}^2$ are defined with respect to the residuals of a particular model, and so differ across models (8a), (8b), and (8c).
(c) $t_i(j)$ is the $i$th diagonal element of the inverse second-moment matrix of the regressors in model $j$.
Sources: Dickey and Fuller (1981) and Perron (1988).

TABLE 4.3(b). Test statistics for joint hypotheses (a)

(a) Critical values for $Z(\Phi_1)$, $Z(\Phi_2)$, and $Z(\Phi_3)$ are the same as those for $\Phi_1$, $\Phi_2$, and $\Phi_3$.

(a) Test statistic $\Phi_1$; DGP: (8b) with $\rho_b = 1$, $\mu_b = 0$; model (8b)

Sample size    0.01   0.025   0.05   0.10   0.90   0.95   0.975   0.99
25             0.29   0.38    0.49   0.65   4.12   5.18   6.30    7.88
50             0.29   0.39    0.50   0.66   3.94   4.86   5.80    7.06
100            0.29   0.39    0.50   0.67   3.86   4.71   5.57    6.70
250            0.30   0.39    0.51   0.67   3.81   4.63   5.45    6.52
500            0.30   0.39    0.51   0.67   3.79   4.61   5.41    6.47
∞              0.30   0.40    0.51   0.67   3.78   4.59   5.38    6.43

(b) Test statistic $\Phi_2$; DGP: (8c) with $\rho_c = 1$, $\mu_c = 0$, $\gamma_c = 0$; model (8c)

Sample size    0.01   0.025   0.05   0.10   0.90   0.95   0.975   0.99
25             0.61   0.75    0.89   1.10   4.67   5.68   6.75    8.21
50             0.62   0.77    0.91   1.12   4.31   5.13   5.94    7.02
100            0.63   0.77    0.92   1.12   4.16   4.88   5.59    6.50
250            0.63   0.77    0.92   1.13   4.07   4.75   5.40    6.22
500            0.63   0.77    0.92   1.13   4.05   4.71   5.35    6.15
∞              0.63   0.77    0.92   1.13   4.03   4.68   5.31    6.09

(c) Test statistic $\Phi_3$; DGP: (8c) with $\rho_c = 1$, $\gamma_c = 0$; model (8c)

Sample size    0.01   0.025   0.05   0.10   0.90   0.95   0.975   0.99
25             0.74   0.90    1.08   1.33   5.91   7.24   8.65    10.61
50             0.76   0.93    1.11   1.37   5.61   6.73   7.81    9.31
100            0.76   0.94    1.12   1.38   5.47   6.49   7.44    8.73
250            0.76   0.94    1.13   1.39   5.39   6.34   7.25    8.43
500            0.76   0.94    1.13   1.39   5.36   6.30   7.20    8.34
∞              0.77   0.94    1.13   1.39   5.34   6.25   7.16    8.27

All entries in the left half of the table have standard errors of less than 0.005; those in the right half, less than 0.06.
Source: Dickey and Fuller (1981: 1063).

The statistics are computed for two different lag lengths. The first lag length is given by $\ell_4 = [4(T/100)^{1/4}]$ and the second by $\ell_{12} = [12(T/100)^{1/4}]$; $[x]$ denotes the largest integer less than or equal to $x$.

The results of this experiment are presented in Tables 1 and 2 of Schwert (1989: 148-9). They indicate that the distributions of the Phillips-Perron tests are not close to the Dickey-Fuller distribution. The distributions are closest when $\theta = 0.5$ or $0.8$ but differ markedly for values of $\theta = -0.5$ and $-0.8$. The discrepancies persist even with sample sizes as large as $T = 1000$. The ADF statistics, on the other hand, have distributions that are much closer on average to the Dickey-Fuller distribution.

The poor behaviour of the Phillips-Perron tests where negative MA terms are present persists in regressions that incorporate a time trend.
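The two lag-length rules quoted above are simple to evaluate. The following sketch computes $[m(T/100)^{1/4}]$ for a given multiplier $m$; the function name `schwert_lag` is a label chosen for this example.

```python
import math

def schwert_lag(T, multiplier):
    """Integer part of multiplier * (T / 100) ** (1/4)."""
    return math.floor(multiplier * (T / 100.0) ** 0.25)

for T in (25, 100, 1000):
    print(T, schwert_lag(T, 4), schwert_lag(T, 12))
```

Both rules grow with the fourth root of the sample size, so even at $T = 1000$ the shorter rule uses only a handful of lags.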



Schwert also reports the distributions of the normalized unit-root estimators (i.e. $T(\hat\rho - 1)$) in their ADF and non-parametrically corrected DF versions. The conclusions remain unaltered. Finally, Schwert's simulations do suggest that the finite-sample performance under the null of the Phillips-Perron procedures, in the cases where MA terms cause size distortions, is better when $S_u^2$ and $S_{T\ell}^2$ are calculated using the first differences of $y_t$ than where the regression residuals are used. However, the tests may then fail to be consistent against some stationary alternative hypotheses (Stock and Watson 1988b). It seems safest, therefore, to avoid these tests if there is any evidence of the kind of MA component to the errors that causes size distortions.

An alternative procedure is proposed by Hall (1989), who suggests that IV be used in place of OLS in augmented Dickey-Fuller tests. The level instrumental variable used in place of $y_{t-1}$ is $y_{t-(k+1)}$, where the residual autocorrelation function has non-zero elements only up to lag $k$ (see Section 4.6.4 below). Hall's Monte Carlo results suggest that the method performs well, particularly for negative MA error processes.

4.5. Further Extensions

Two more extensions of the testing procedure may be considered. The first concerns testing for multiple unit roots in a process. The second is testing for unit roots at seasonal frequencies. Inventories may be regarded as a good example of a variable that is likely to be I(2) (contains two unit roots), as it is constructed by aggregating a function of flow variables (production and sales) which are individually I(1); a test for multiple unit roots would therefore be important when dealing with stock variables of this kind. Tests for seasonal unit roots are applicable when seasonal data are used. Standard unit-root tests may provide misleading results in the presence of integration at seasonal frequencies.

4.5.1. Multiple Unit Roots

Consider the problem of testing for $d > 1$ unit roots in a series. The sequence of testing that starts with a test for a single unit root in the undifferenced series, then proceeds to a test for a second unit root (that is, tests the first-differenced series) if the first null (of a unit root in levels) is not rejected, and so on, does not constitute a statistically valid testing sequence, since all of the unit-root tests considered in this chapter take the complete absence of unit roots as the alternative hypothesis. Dickey and Pantula (1987) suggest a more natural sequential testing procedure for unit roots which takes the largest⁵ number of unit roots under consideration as the first maintained hypothesis and then decreases the order of differencing each time the current null hypothesis is rejected. This continues until the first time the null hypothesis is not rejected.

The sequential procedure may be illustrated for the case $d = 2$. Let us consider the AR(2) model,

$$(1 - \rho_1 L)(1 - \rho_2 L)\,y_t = \varepsilon_t.$$

This model can be re-parameterized as

$$\Delta^2 y_t = \beta_1 \Delta y_{t-1} + \beta_2 y_{t-1} + \varepsilon_t,$$

where $\beta_1 = (\rho_1\rho_2 - 1)$ and $\beta_2 = -(1 - \rho_1)(1 - \rho_2)$. The testing procedure consists of the following steps:

1. Test the null hypothesis of two unit roots against the alternative of a single unit root. Under this null hypothesis $\beta_1 = \beta_2 = 0$ and an $F$-test may be used to test it. Such a test, however, does not take account of the one-sided nature of the alternative hypothesis. A more powerful procedure follows from noting that, under both the null and the alternative hypotheses, $\beta_2 = 0$. However, $\beta_1 = 0$ under the null hypothesis but is less than zero under the alternative hypothesis. Thus, a more powerful test is given by estimating the regression of $\Delta^2 y_t$ on $\Delta y_{t-1}$, computing the $t$-ratio of $\hat\beta_1$, and performing a one-sided lower-tail test using the Dickey-Fuller critical values.

2. If the null hypothesis above is rejected, proceed to test the null of one unit root versus the stationary alternative. Here $H_0$ and $H_1$ are given by $\beta_1 < 0$, $\beta_2 = 0$, and $\beta_1 < 0$, $\beta_2 < 0$ respectively. Thus, a one-sided $t$-test here involves estimating the regression of $\Delta^2 y_t$ on $\Delta y_{t-1}$ and $y_{t-1}$, computing the $t$-ratio of $\hat\beta_2$, and comparing it with the Dickey-Fuller values.

This testing procedure may be generalized to testing for three or more unit roots.
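The two steps above can be sketched in code. This is a minimal illustration under stated assumptions, not the Dickey-Pantula implementation: it takes the re-parameterized regression to be $\Delta^2 y_t = \beta_1 \Delta y_{t-1} + \beta_2 y_{t-1} + \text{error}$ with $\beta_1 = \rho_1\rho_2 - 1$ and $\beta_2 = -(1 - \rho_1)(1 - \rho_2)$, and the helper names are hypothetical.

```python
import numpy as np

def reparameterize(rho1, rho2):
    """Map the AR(2) roots to (beta1, beta2) of the differenced regression."""
    return rho1 * rho2 - 1.0, -(1.0 - rho1) * (1.0 - rho2)

def t_ratio(X, z, index):
    """OLS t-ratio of coefficient `index` in the regression of z on X."""
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    s2 = resid @ resid / (len(z) - X.shape[1])
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[index, index])
    return beta[index] / se

def dickey_pantula_t_ratios(y):
    """t-ratios for step 1 (beta1 alone) and step 2 (beta2 given beta1)."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    d2y = np.diff(dy)        # second difference at time t
    dy_lag = dy[:-1]         # first difference at time t-1
    y_lag = y[1:-1]          # level at time t-1
    step1 = t_ratio(dy_lag[:, None], d2y, 0)
    step2 = t_ratio(np.column_stack([dy_lag, y_lag]), d2y, 1)
    return step1, step2

rng = np.random.default_rng(2)
i2_series = np.cumsum(np.cumsum(rng.standard_normal(300)))
print(dickey_pantula_t_ratios(i2_series))
```

Each t-ratio would be compared with the lower tail of the appropriate Dickey-Fuller table; step 2 is consulted only if step 1 rejects.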
Dickey and Pantula (1987) contains the results of a simulation study. Their general conclusion is that the sequential procedure, consisting of testing a null hypothesis of $k$ unit roots against an alternative of $k - 1$ unit roots, based on $t$-tests, is considerably more powerful than an $F$-test-based procedure.

⁵ Note that the first sequence took the smallest number (i.e. 1) of unit roots as its first maintained hypothesis.

4.5.2. Seasonal Integration

We have so far focused attention on testing for a unit root at the zero frequency. However, when seasonal data are used, it may be necessary to allow for seasonal averaging or seasonal differencing to achieve stationarity. For example, the appropriate difference to use to transform to stationarity may not be $x_t - x_{t-1}$, but $x_t - x_{t-4}$ in quarterly data or $x_t - x_{t-12}$ in monthly data. Seasonal integration (and co-integration) and testing for unit roots at seasonal frequencies are discussed by Engle, Granger, and Hallman (1988), Ghysels (1990), Hylleberg, Engle, Granger, and Yoo (1990), Engle, Granger, Hylleberg, and Lee (1993), and Ilmakunnas (1990) among others.

Just as a time series with no seasonal component may be well described by a deterministic process, a stationary stochastic process, or an integrated process, the seasonal component of a time series may be well described by a process from any of these classes, or may combine elements of each. While it is common practice to model a seasonal component as having a deterministic or stationary form, there may be cases where it is appropriate to allow the model of the seasonal component to drift substantially over time. This possibility is implicit in the practice of seasonal differencing (see e.g. Box and Jenkins 1970), whereby a process observed $s$ times per year would be transformed to its $s$-period difference, $x_t - x_{t-s}$, on the assumption that the process contains an integrated seasonal component.

In order to allow for a unit root at a seasonal frequency, it is useful to factor the lag polynomial of the process. If the lag polynomial contains a factor $(1 - L^s) = \Delta_s$, corresponding to a seasonal unit root, then it can be factorized as
$$(1 - L^s) = (1 - L)(1 + L + L^2 + \dots + L^{s-1}) = \Delta S(L).$$

That is, the seasonal difference operator can be broken down into the product of the first difference operator and the moving-average seasonal filter $S(L)$ containing further roots of modulus unity.

Engle et al. (1988) define a variable $x_t$ to be seasonally integrated of orders $d$ and $D$ (denoted SI($d$, $D$)), if $\Delta^d S(L)^D x_t$ is stationary. Thus, for quarterly data, in the terminology established above, if $\Delta_4 x_t$ is stationary, then $x_t$ is SI(1, 1) with $S(L) = 1 + L + L^2 + L^3$. Further,
$$(1 - L^4) = (1 - L)(1 + L)(1 + L^2) = (1 - L)(1 + L)(1 - iL)(1 + iL).$$

Hence the quarterly seasonal unit root process has four roots of modulus unity: one at the zero frequency, one at the two-quarter (half-yearly) frequency, and a pair of complex conjugate roots at the four-quarter (annual) frequency. To relate these roots to frequencies in an intuitive way, consider the deterministic process $\alpha(L)x_t = 0$. For $\alpha(L) = (1 + L)$, then $x_{t+1} = -x_t$ and so $x_{t+2} = x_t$; the process returns to its original value on a cycle with a period of 2. For $\alpha(L) = (1 - iL)$, then $x_{t+1} = ix_t$, $x_{t+2} = i^2x_t = -x_t$, $x_{t+3} = -ix_t$, and $x_{t+4} = -i^2x_t = x_t$, so that the process repeats with a period of 4.

As with a process with a single unit root at the zero frequency (e.g. the random walk $(1 - L)x_t = \varepsilon_t$), a seasonally integrated process such as $(1 - L^4)x_t = \varepsilon_t$ retains the effect of shocks indefinitely, and has a variance which increases linearly with time. However, because the seasonally integrated process contains multiple roots of modulus unity, it does not behave like an I(1) process in all respects. For example, shocks to the system will also alter the seasonal pattern of the series, so that the sequences of observations corresponding to each quarter may evolve in different ways. The first difference of such a seasonally integrated process will not be stationary.

Testing for a unit root at a seasonal frequency has much in common with testing for unit roots at the zero frequency. Tests have been proposed by Hasza and Fuller (1982), Dickey, Hasza, and Fuller (1984), Osborn, Chui, Smith, and Birchenhall (1988), Hylleberg et al. (1990), and Engle et al. (1993), among others. We will follow Hylleberg et al. in describing a testing strategy.

Consider a process observed quarterly and generated by

$$\gamma(L)\,x_t = \varepsilon_t,$$

where $\varepsilon_t$ is IID(0, $\sigma^2$) and $\gamma(L)$ is a fourth-order lag polynomial. We wish to test the null hypothesis that the roots of $\gamma(L)$ lie on the unit circle, against the hypothesis that they lie outside. Defining three positive parameters [...]

[...] generalization of (22) so that we can again use the transformed model

where now $z_t' = (z_{1,t}', z_{2,t}, z_{3,t}, z_{4,t})$ and $\theta' = (\theta_1', \theta_2, \theta_3, \theta_4)$.
To define the elements of $z_t$, let $\bar{\mu}_c = E(\Delta y_t) = (1 - \beta(1))^{-1}\mu_c = b\mu_c$, the unconditional mean of the drift under the null, using $b = (1 - \beta(1))^{-1}$. Next, let

The $\theta_i$ are given by $\theta_1' = (\beta_1, \beta_2, \ldots, \beta_p)$, $\theta_2 = \mu_c + \beta(1)\bar{\mu}_c + \gamma_c$, $\theta_3 = \rho_c$, and $\theta_4 = \gamma_c + \rho_c\bar{\mu}_c$. The scaling matrix $\Upsilon_T$ becomes $\mathrm{diag}(T^{1/2}\iota_p, T^{1/2}, T, T^{3/2})$ where $\iota_p$ is the unit vector of dimension $p$. Finally $\Omega_p = E(z_{1,t}z_{1,t}')$, the covariance matrix of $z_{1,t}$. The elements of the matrices $V_T$ and $\phi_T$ are similar to those for the simple Dickey-Fuller test. Then, using $\xrightarrow{p}$ to denote convergence in probability


Again, Table 3.3 may be applied to find the densities of the Wiener processes appearing above, with the exception of that appearing in the expression for $V_{T,3,3}$; again, an expansion for this density is given by Abadir (1992).

$V$ is therefore block diagonal, and the estimators of the nuisance parameters $\beta$ are asymptotically normal and do not affect the asymptotic distributions of the Dickey-Fuller statistics, so that the same critical values can be used. The $b$s that appear in some of the expressions cancel appropriately to make this possible. This may be seen in the simplest case where the model does not include either the constant or the trend term but does include the $\Delta y_{t-i}$ terms. Noting that in this case the terms $V_{T,1,2}$, $V_{T,1,4}$, $V_{T,2,3}$, $V_{T,2,4}$, $V_{T,3,4}$, $\phi_{T,2}$, and $\phi_{T,4}$ are not relevant, and that $V = \mathrm{diag}(\omega^{11}, \ldots, \omega^{pp}, V_{3,3})$, where $\omega^{ii}$ is the $i$th diagonal element of $\Omega_p$, the distribution of the $t$-statistic is given by $t = (\sigma^2V_{T,3,3})^{-1/2}\phi_{T,3}$. This has the standard Dickey-Fuller distribution with the critical values given by Table 4.2(a). The results extend to the cases where the constant and (or) trend are (is) included in the model, with the critical values given by Tables 4.2(b) and 4.2(c) respectively.



The inclusion of the I(0) terms $\Delta y_{t-i}$ leaves unchanged the asymptotic distributions of the parameters of interest.
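The simplest case discussed above (no constant, no trend, lagged differences included) can be sketched in code. The following is only an illustration under stated assumptions: a single lagged difference is used, the function names `simulate_ar1` and `adf_t_stat` are hypothetical, and the reference value of about -1.95 used in the usage note is the commonly tabulated 5 per cent Dickey-Fuller value for this case, with Table 4.2(a) remaining the authority.

```python
import math
import random


def simulate_ar1(n, rho, seed):
    """Generate y[t] = rho*y[t-1] + e[t] with standard normal shocks (illustrative DGP)."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n):
        y.append(rho * y[-1] + rng.gauss(0.0, 1.0))
    return y


def adf_t_stat(y):
    """t-ratio on y[t-1] in  dy[t] = phi*y[t-1] + beta*dy[t-1] + e[t]  (no constant, no trend)."""
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    dep = dy[1:]   # dependent variable dy[t]
    x1 = y[1:-1]   # lagged level y[t-1]
    x2 = dy[:-1]   # lagged difference dy[t-1]
    # Cross-products for the two-regressor normal equations.
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * d for a, d in zip(x1, dep))
    s2y = sum(b * d for b, d in zip(x2, dep))
    det = s11 * s22 - s12 * s12
    phi = (s22 * s1y - s12 * s2y) / det
    beta = (s11 * s2y - s12 * s1y) / det
    resid = [d - phi * a - beta * b for d, a, b in zip(dep, x1, x2)]
    s2 = sum(r * r for r in resid) / (len(dep) - 2)
    # Var(phi_hat) = s2 * [(X'X)^{-1}]_{11} = s2 * s22 / det.
    return phi / math.sqrt(s2 * s22 / det)
```

For a clearly stationary series such as `simulate_ar1(500, 0.5, seed=1)` the statistic falls far below -1.95, so the unit-root null is rejected; for a simulated random walk (`rho = 1.0`) it is compared with the same Dickey-Fuller table rather than with normal critical values.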

4.6.3. Example: Non-parametric Test Statistics (Phillips 1987a)

Consider the simple random-walk process $y_t = y_{t-1} + u_t$. The main features of non-parametric corrections may be illustrated by assuming that the only restrictions imposed on the stochastic process $\{u_t\}_{t=1}^{\infty}$ are those given by conditions (3.16a)-(3.16d); $\{u_t\}_{t=1}^{\infty}$ may therefore be an ARMA($p$, $q$) process, in which case the $t$-statistic for $\rho$, in the model $y_t = \rho y_{t-1} + u_t$, does not have the standard Dickey-Fuller distribution.

As discussed earlier in this chapter, a non-parametric correction is one way of accounting for the autocorrelation in the $\{u_t\}_{t=1}^{\infty}$ series. This correction enables us to retain the use of the Dickey-Fuller critical values to conduct inference and therefore expands the range of models to which the Dickey-Fuller tests can be applied.

Using the results in (3.21)-(3.24), the estimator $\hat{\rho}$ and its $t$-ratio $t(\hat{\rho})$ have the following limiting distributions:

where $\lambda = (\sigma^2 - \sigma_u^2)/2$, and $\sigma^2$ and $\sigma_u^2$ are as defined in (10a) and (10b). If the $u_t$ are IID$(0, \sigma^2)$, then $\sigma^2 = \sigma_u^2$ and $\lambda = 0$. If so, the distributions of $\hat{\rho}$ and its $t$-ratio in (31) and (32) above are the usual Dickey-Fuller distributions.

It may then be verified that the limiting distribution of the statistic $Z(\hat{\rho})$, where

is the same as the distribution obtained by setting $\lambda = 0$ in (31). This follows from an inspection of (31) and by noting that

Similarly, the limiting distribution of $Z(t(\hat{\rho}))$, where

is the same as the distribution obtained by setting $\lambda = 0$ in (32).

The limiting distributions of (33) and (34) are unchanged when $\lambda$ is replaced by $\hat{\lambda}$ in these expressions, where $\hat{\lambda}$ is a consistent estimator of $\lambda$. Consistent estimators of $\sigma^2$ and $\sigma_u^2$ are required in order to obtain a consistent estimator of $\lambda$ and to implement the non-parametric corrections. A consistent estimator of $\sigma_u^2$ is given by either $T^{-1}\sum_{t=1}^{T}(y_t - y_{t-1})^2$ or $T^{-1}\sum_{t=1}^{T}(y_t - \hat{\rho}y_{t-1})^2$. The asymptotic equivalence of the two estimators follows from the property that $\hat{\rho} \to 1$ in probability.⁸ A consistent estimator of $\sigma^2$ can be obtained from (12) or (13) as before.

Using arguments similar to those outlined above, the non-parametric corrections for the more elaborate models, which include a constant or a constant and trend, may be derived. In particular, $Z(\hat{\rho}_i)$ and $Z(t(\hat{\rho}_i))$ ($i = b, c$) may be obtained.
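A minimal sketch of the non-parametric correction for the model without constant or trend may help fix ideas. It assumes the standard Phillips (1987) forms $Z(\hat{\rho}) = T(\hat{\rho} - 1) - \hat{\lambda}/(T^{-2}\sum y_{t-1}^2)$ and $Z(t(\hat{\rho})) = (\hat{\sigma}_u/\hat{\sigma})\,t(\hat{\rho}) - \hat{\lambda}/\{\hat{\sigma}(T^{-2}\sum y_{t-1}^2)^{1/2}\}$, and it estimates $\sigma^2$ with a Bartlett-weighted sum of autocovariances. That kernel choice, the truncation lag, and all function names are assumptions made for illustration and need not coincide with the estimators in (12) or (13).

```python
import math
import random


def simulate_unit_root(n, theta, seed):
    """Random walk y[t] = y[t-1] + u[t] with MA(1) errors u[t] = e[t] + theta*e[t-1]."""
    rng = random.Random(seed)
    e_prev = rng.gauss(0.0, 1.0)
    y = [0.0]
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)
        y.append(y[-1] + e + theta * e_prev)
        e_prev = e
    return y


def long_run_variance(u, lags):
    """Bartlett-weighted estimate of the long-run variance of u (assumed estimator)."""
    n = len(u)
    s = sum(v * v for v in u) / n
    for j in range(1, lags + 1):
        gamma_j = sum(u[t] * u[t - j] for t in range(j, n)) / n
        s += 2.0 * (1.0 - j / (lags + 1.0)) * gamma_j
    return s


def phillips_z(y, lags):
    """Return (T*(rho_hat - 1), Z_rho, t_rho, Z_t) for y[t] = rho*y[t-1] + u[t]."""
    ylag, ycur = y[:-1], y[1:]
    n = len(ycur)
    syy = sum(a * a for a in ylag)
    rho = sum(a * b for a, b in zip(ylag, ycur)) / syy
    u = [b - rho * a for a, b in zip(ylag, ycur)]
    s2_u = sum(v * v for v in u) / n       # estimate of sigma_u^2
    s2 = long_run_variance(u, lags)        # estimate of sigma^2
    lam = 0.5 * (s2 - s2_u)                # estimate of lambda
    t_rho = (rho - 1.0) / math.sqrt(s2_u / syy)
    norm_rho = n * (rho - 1.0)
    z_rho = norm_rho - lam / (syy / n ** 2)
    z_t = math.sqrt(s2_u / s2) * t_rho - lam / math.sqrt(s2 * syy / n ** 2)
    return norm_rho, z_rho, t_rho, z_t
```

With `lags=0` the two variance estimates coincide, $\hat{\lambda} = 0$, and the corrected statistics reduce to the uncorrected ones, mirroring the IID case in the text. With negative moving-average errors $\hat{\lambda}$ is typically negative, so the correction shifts $Z(\hat{\rho})$ upwards relative to $T(\hat{\rho} - 1)$.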

⁸ As noted above, the finite-sample behaviour of these two estimators may be quite different (see Schwert 1989).

4.6.4. Example: Instrumental Variables Test for Unit Roots (Hall 1989)

The non-parametric statistics described in example 4.6.3 are known not to perform well in finite samples in the presence of negative moving-average errors (see Schwert 1989). Hall (1989) proposed estimation by instrumental variables as an alternative to the use of non-parametric corrections. He showed that in the regression model $y_t = \rho y_{t-1} + u_t$, where $u_t$ is a moving-average process of some specified order and $\rho$ is equal to 1 under $H_0$, $\hat{\rho}_{IV}$ has the standard Dickey-Fuller distribution.

The intuition for this result may be easily described: $\hat{\rho}_{OLS}$ in the above model does not have the standard Dickey-Fuller distribution because of the bias induced by the correlation between $y_{t-1}$ and $u_t$ (when $u_t$ is an ARMA($p$, $q$) process). It is therefore necessary to use a correction factor to remove this bias. This bias does not appear when, say, $y_{t-2}$ is used as an instrument for $y_{t-1}$ and $u_t$ is an MA(1) process. The Dickey-Fuller tables can thus be used directly. We formalize this intuition next by presenting a simple example and by using some of the distributional results derived earlier in the chapter. Throughout, to simplify the algebra, adequate initial observations are assumed to be available, so all sums are taken over $1, \ldots, T$.

Let the DGP be given by

Then $\hat{\rho}_{IV}$, the instrumental variables estimator of $\rho$ which uses $y_{t-2}$ as an instrument for $y_{t-1}$, is given by

Next, we want to prove that

where $W(r)$ is the Wiener process associated with the sequence $\{u_t\}$. The RHS of this expression is the limiting distribution of the simple Dickey-Fuller test for a model like (35) when the $u_t$ are IID (see Section 4.6.1). Thus, we need to show that, for the instrument $y_{t-k}$,

Note that

Proof of (i). From (35a),


This follows from the fact that

Recall now from (3.23) that

for the DGP given by (35a)-(35c). Further, for the error process $u_t$, $\sigma_u^2 = (1 + \theta^2)\sigma_\varepsilon^2$ and $\sigma^2 = (1 + \theta)^2\sigma_\varepsilon^2$. It also follows from (35b) that

Using (39), it is now possible to see from (38) that

But $\sigma_u^2 = (1 + \theta^2)\sigma_\varepsilon^2$. Hence

The last equality follows from the expression for $\sigma^2$ given previously. (i) now follows routinely from (40).

Proof of (ii).

All terms of the form $T^{-2}\sum_{t=1}^{T}y_{t-1}u_{t-j}$, $j = 1, 2, \ldots, (k - 1)$, converge in probability to zero. This is because the scaling $T^{-1}$ is appropriate for these sums to have non-degenerate distributions.⁹ The scaling $T^{-2}$ induces degeneracy. The distribution of $T^{-2}\sum_{t=1}^{T}y_{t-1}^2$ is given by $\sigma^2\left(\int_0^1 W(r)^2\,dr\right)$ for the DGP (35a)-(35c); (ii) now follows routinely.

Finally, (37) follows from (36), using $k = 2$ in (i) and (ii), since

⁹ This follows from arguments similar to those used to prove (3.21)-(3.24).



It also follows from (37) that the $t$-ratio form of the test,

has the Dickey-Fuller $t$-distribution, where $\hat{\sigma}$ is a consistent estimator of $\sigma$ (possibly equal to $(1 + \hat{\theta})\hat{\sigma}_\varepsilon$, where $\hat{\theta}$ and $\hat{\sigma}_\varepsilon$ are OLS estimators of $\theta$ and $\sigma_\varepsilon$).

Thus, estimation by instrumental variables has the same effect as the non-parametric corrections to $\hat{\rho}$(OLS) proposed by Phillips and Perron. In a small Monte Carlo study, Hall (1989) shows that the size problems associated with the Phillips-Perron test are partially alleviated by the use of this instrumental variable procedure. However, substantial size distortions remain in the cases where $\theta < 0$ in the null model. No power calculations are reported in Hall's paper.

4.6.5. Example: Bounds Test for Unit Roots (Phillips and Ouliaris 1988)

A limitation of the testing procedures discussed in this chapter is that the distributions of the test statistics are non-standard. Consequently, a number of different sets of critical values have to be used to implement the tests.

This problem is at the heart of a literature which exploits the idea that differencing an I(0) series induces a unit root in the moving-average representation of the process. Use is made of this fact to devise a unit-root test based on the long-run variance, defined in (3.16c), of the first-differenced time series. The critical values are taken from the standard normal table.

In order to illustrate this approach, assume that $y_t$ follows the IMA(1,1) process,

$$\Delta y_t = (1 - \theta L)\varepsilon_t = u_t, \qquad (41)$$

with $\varepsilon_t \sim \mathrm{IID}(0, \sigma_\varepsilon^2)$. The long-run variance of $\Delta y_t$ is $\sigma^2 = (1 - \theta)^2\sigma_\varepsilon^2$, so $\sigma^2 \neq 0$ if and only if $\theta \neq 1$. In other words, if $y_t$ is I(0), $\Delta y_t$ will have

0) is sufficient for the variables to be called 'co-integrated'.

⁴ The example is taken from Engle and Granger (1987).


Although this is a simple example, much of the method and reasoning can be generalized to more complex cases. What is crucial is that, while $\{x_t\}$ and $\{y_t\}$ are integrated processes, not tied to any fixed means, a linear combination of the two variables makes the resulting series a stationary process and the variables $x$ and $y$ may be said to be linked by the corresponding equilibrium relationship.

It is interesting to note that in the bivariate case we have the added bonus that this equilibrium relationship, if such a relationship exists, is unique. The proof is straightforward and follows by contradiction. Suppose not: that is, suppose that there exist two distinct co-integrating parameters $\alpha$ and $\gamma$ such that $\{x_t + \alpha y_t\}$ and $\{x_t + \gamma y_t\}$ are both I(0). This implies that $(\alpha - \gamma)y_t$ is also I(0) because subtracting one I($d$) series from another cannot lead to a series integrated of order $(d + 1)$ (or higher). But since $\{y_t\}$ is I(1), a non-zero constant times $\{y_t\}$ is also I(1). Hence we have a contradiction unless $\alpha = \gamma$.

The analysis is not quite so straightforward in the multivariate case as we must allow for the possibility of several co-integrating vectors. Nevertheless, much of the intuition gained from the analysis of the bivariate case carries through to richer examples.

There are at least three reasons for regarding the concept of co-integration as central to econometric modelling with integrated variables, as well as to the examination of long-run relationships among those variables.

The first is the link that the concept formalizes among variables of higher orders of integration, for which some linear combination is of a lower order of integration.
In the most widely used examples, a reduction is made from variables that require first-differencing for stationarity to a composite time-series that is stationary in levels. In addition, this composite stationary variable, constructed by taking a linear combination of the original series, may be said to characterize the equilibrium relationship linking the series. If an equilibrium exists among several variables so that such a stationary linear combination exists, we may count on eventual return of this linear combination to its mean (typically zero).

Second, and following directly from this identification of co-integration with equilibrium, is the complementary idea of meaningful versus spurious regression. Regressions involving levels of time series of non-stationary variables make sense if and only if these variables are co-integrated. A test for co-integration then yields a useful method of distinguishing meaningful regressions from those that Yule (1926) called 'nonsense' and Granger and Newbold (1974, 1977) called 'spurious'.

Finally, another important property characterizes variables that are co-integrated. A set of co-integrated variables is known to have, among other representations, an error-correction representation; that is, the relationship may be expressed so that a term representing the deviation of observed values from the long-run equilibrium enters the model. This is an interesting result by itself, but is even more noteworthy as a contribution to resolving, or synthesizing, the debate between time-series analysts and those favouring econometric methods. It allows a reconciliation, at least in part, of time-series methods of analysing data that traditionally considered only the properties of differenced time series (which could more legitimately be assumed stationary) and those econometric methods that laid emphasis on the equilibrium relationships between variables and therefore focused on the levels of variables. Both methods as traditionally used could be said to have been flawed, the former by the implied necessity of ignoring information contained in the levels of variables, the latter by its tendency to ignore the spurious regression problem.

Reliance on the use of differenced data, as a potential cure for the spurious regression problem, raises a set of new issues. An example of a potentially controversial recommendation for modelling economic time series appears in Granger and Newbold (1977 p. 206; emphasis in original): 'In the presence of some autocorrelation of the errors . . . first differencing might be expected to go a long way towards alleviating the problem and is certainly preferable to doing nothing at all.'

As an illustration, Granger and Newbold cite the results of Sheppard (1971), who regressed UK consumption on autonomous expenditure and mid-year money stock for both levels and changes, using annual data over the period 1947-62.
The results were taken to indicate the existence of a significant relationship in levels which disappeared entirely when first differences were employed. The levels regression, characterized by a high value of $R^2$ and a low value of the Durbin-Watson statistic, is spurious. However, the first-differenced regression appears to be testing a different hypothesis.⁵ The differencing operation, in particular, omits any information about long-run adjustments that the data may contain.

⁵ In the next chapter we discuss the consequences of differencing (and over-differencing) in cases where differencing (any number of times) does not alleviate the problems of non-stationarity and where transforming the series monotonically, prior to differencing, appears to be the appropriate procedure.

Thus, while the spurious regression problem is a serious one, the practice of differencing integrated series to achieve stationarity, and of treating the resulting series as the proper objects of econometric analysis, is not without costs. Error-correction mechanisms (ECMs) are intended to provide a way of combining the advantages of modelling both levels and differences. In an error-correction model the dynamics of both short-run (changes) and long-run (levels) adjustment processes are modelled simultaneously. This idea of incorporating the dynamic adjustment to steady-state targets in the form of error-correction terms, suggested by Sargan (1964) and developed by Hendry and Anderson (1977) and Davidson et al. (1978), among others, therefore offers the possibility of revealing information about both short-run and long-run relationships.

The theory of co-integration provides a unified framework for the analysis of ECMs and of time series in which the variables share one or more stochastic trends. We elaborate upon the alternative representations of co-integrated systems in Section 5.3, where we also provide a more formal description of the theory; we first review the theory of polynomial matrices which is necessary for a thorough understanding of several proofs in the next sections and in following chapters.
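The way an error-correction term combines short-run and long-run adjustment can be illustrated with a deterministic sketch. The single-equation form $\Delta y_t = \beta\,\Delta x_t - \gamma\,(y_{t-1} - \alpha x_{t-1})$ used here, the parameter values, and the function name `ecm_path` are assumptions chosen for illustration, not a model taken from the text.

```python
def ecm_path(x, alpha, beta, gamma, y0):
    """Iterate  dy[t] = beta*dx[t] - gamma*(y[t-1] - alpha*x[t-1])  without shocks.

    alpha: long-run coefficient, beta: short-run impact,
    gamma: speed of adjustment towards y = alpha*x (all illustrative).
    """
    y = [y0]
    for t in range(1, len(x)):
        dx = x[t] - x[t - 1]
        disequilibrium = y[t - 1] - alpha * x[t - 1]
        y.append(y[t - 1] + beta * dx - gamma * disequilibrium)
    return y


if __name__ == "__main__":
    # A permanent unit step in x after five periods.
    x = [0.0] * 5 + [1.0] * 60
    with_ecm = ecm_path(x, alpha=2.0, beta=0.5, gamma=0.25, y0=0.0)
    differences_only = ecm_path(x, alpha=2.0, beta=0.5, gamma=0.0, y0=0.0)
    print(with_ecm[5], with_ecm[-1])   # impact response, then approach to alpha*x
    print(differences_only[-1])        # stays at the impact response
```

With `gamma > 0` the gap $y_t - \alpha x_t$ shrinks by the factor $(1 - \gamma)$ each period, so $y$ ends up at the long-run value $\alpha x$. Setting `gamma = 0` gives a model in differences only, which reproduces the impact effect but carries no information about the levels relationship.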

5.2. Polynomial Matrices

A polynomial matrix $A(L)$ is a matrix for which the elements $\{a_{ij}(L)\}$ are scalar polynomials in an argument $L$:

where $k_{ij} < \infty$. Useful references to the algebra of polynomial matrices include Gel'fand (1967) and Gantmacher (1959). The degree, $k$, of $A(L)$ is the highest of the orders $k_{ij}$ of the element polynomials:

$$k = \max_{i,j} k_{ij}.$$

Thus, $A(L)$ can be expressed as

$$A(L) = \sum_{i=0}^{k} A_iL^i. \qquad (10)$$

The determinant $|A(L)|$ of a polynomial matrix $A(L)$ is a scalar polynomial.

A familiar example of a polynomial matrix is $A(\lambda) = (A_0 - \lambda I)$, which occurs in the characteristic equation $|A_0 - \lambda I| = 0$, which may be solved for eigenvalues of the matrix $A_0$. Every matrix satisfies its own characteristic equation (the Cayley-Hamilton theorem) in that, if we let $f(\lambda) = |A(\lambda)|$, then $f(A_0) = 0$ (where this is interpreted as a matrix expression). In general, if $A(L) = \sum_{i=0}^{k}A_iL^i$, then we will also use the notation $A(B) = \sum_{i=0}^{k}A_iB^i$, for a matrix argument $B$.


The inverse of a finite polynomial matrix $A(L)$ of degree $k$ which has all roots of the determinantal equation $|A(z)| = 0$ strictly outside the unit circle⁶ is given, in general, by an infinite-order matrix $C(L) = \sum_{i=0}^{\infty}C_iL^i$. This matrix is well defined if and only if $\sum_{i=0}^{k}C_iL^i$ is a convergent sequence as $k \to \infty$. For $|z| > 1$ (equivalently, $|L| = |z^{-1}| < 1$), a sufficient condition for this to hold is $|C_i| \le |\rho^i|$ where $|\rho| < 1$.⁷ The $C_i$ are defined by an infinite set of matrix identities which may be described in a simple scalar case, where $A(L) = 1 - \rho L = a_0 + a_1L$, as follows:

such that

The construction given by (11) is derived by using the property $C(L)A(L) = 1$ and equating powers of $L$.

The algebra generalizes to high-order scalar polynomials $A(L)$ and to matrix polynomials $A(L)$. In the next section of this chapter and in Chapter 8 we shall need to deal with matrix polynomials that have unit roots ($z = 1$). In these cases, while the matrix $A(L)$ may not have a well defined inverse because of failure of rank conditions, transforming $A(L)$ and pre- and post-multiplying it by suitable matrices will lead to an invertible matrix provided certain conditions are satisfied.

⁶ That is, denoting an arbitrary root of the determinant equation by $z$, $|z| > 1 + \varepsilon$, for some $\varepsilon > 0$, for all $z$ satisfying this equation.

⁷ Note that this exponential decay condition is only sufficient and not necessary to guarantee convergence.

Two polynomial matrices $R(L)$ and $T(L)$ are said to be equivalent if and only if there exist two invertible matrices $U(L)$ and $V(L)$ such that

Every polynomial matrix $A(L)$ can be divided on the left by a matrix of the form $(B - LI)$ for any matrix $B$ so that, where $A(L)$ is of degree $k$,

where $H(L)$ is of degree $k - 1$ and $D$ is a constant matrix, the remainder term. To obtain the precise form of $D$, we will derive this result, which is simply a linear transformation of the original polynomial matrix. We have

and so on. By induction, we can continue this substitution for any $k$ to get

A similar result holds for division on the right. In dealing with integrated series, the case $B = I$ is of particular interest; then

where $A(1)$ is equal to $A(L)$ evaluated at $L = 1$. Note that from (13) and (15), for the case $B = I$,

and

Further, $A(1)$ is called the total effect. When $D = A(1) = 0$, then $A(L)$ is divisible on the left by $(1 - L)I$ without a remainder, and hence can be rewritten in terms of the operator $(1 - L)$ alone.

The next main result to be proved is the isomorphic relationship between polynomial matrices and companion matrices. This will clarify the derivation of latent roots of polynomial matrices, which are of great interest in analysing dynamics and co-integration. Consider the system of $n$ deterministic linear equations:

We set $A_0 = I$ as a normalization. The same information can be represented in stacked form (called the companion form) by defining the following matrices and vectors:

Direct multiplication of $\Phi$ into $X_{t-1}$ and comparison of that outcome with $X_t$ reveals that the second expression in (18) merely augments the original system with a set of identities of the form $x_{t-1} = x_{t-1}$, etc. The corresponding advantage of companion forms is that, whatever the value of $k$ in (16), the companion form is always of first order, and hence can be analysed using already established tools. This advantage is pronounced when we wish to find the eigenvalues of $A(L)$, and do so by solving

It will be convenient to re-express (19) in terms of the negatives of the inverses of the eigenvalues, $\mu = -1/\lambda$, and to solve

Using the definition of $\Phi$ from (17) in (20), we have

From the partitioned inverse formula, where $|D| \neq 0$,

The first equality follows from the fact that the determinant of the first matrix following the equality is one. Repeating these operations in the alternative direction, if $|E| \neq 0$, establishes that

Both results will be used below. Here, we apply (22) to the determinant in (21), choosing $E$ as the large $n(k - 1) \times n(k - 1)$ matrix in the upper-left corner, and $D = I$. Then $FD^{-1}G$ is zero except for its top-right block, which is $-\hat{A}_k$, and $|D| = 1$. Thus,

(23)

Comparing (21) with (23), the analysis can be seen to repeat, leading to $|A(\mu)|$ after $k - 1$ steps. Thus, the latent roots can be found by equating either expression to zero and solving. Since $A(\cdot)$ is $n \times n$, $\Phi$ is $nk \times nk$ and so has $nk$ eigenvalues, as required.

From (13), when $B = I$, if $A(1)$ has rank $r < n$, then $|A(1)| = 0$ and hence $A(L)$ has $n - r$ unit roots. Conversely, if $A(1)$ has rank $n$, $A(L)$ has none of its eigenvalues equal to unity.

Next, derivatives of polynomial matrices with respect to their arguments will be needed, and we have

This is reminiscent of the mean-lag formula in a scalar distributed lag. From the result that $H(1) = -\sum_{i=1}^{k}iA_i$, we now see that $H(1) = -\Gamma$. Thus, when $A(1) = 0$, so that $A(L) = (1 - L)H(L)$, then $|H(L)| = 0$ delivers the remaining eigenvalues. If $H(1)$ did not have rank $n$ when $A(1) = 0$, then $|H(1)| = 0$, so $H(L)$ also has unit roots. Using (13) and (15) to write $H(L) = H(1) + (1 - L)K(L)$, we note that, in the extreme case that $\Gamma = 0$, $H(L) = (1 - L)K(L)$, which implies that $A(L) = (1 - L)^2K(L)$. Consequently, equation (16) would become $(1 - L)^2K(L)x_t = 0$, yielding a system in second differences. There is a close affinity between the ranks of $A(1)$, $H(1)$, etc., and the number of differences that can be extracted from $A(L)$.

Finally, polynomial matrices are invariant under non-singular linear transformations in that they have many equivalent representations with the same properties. This is clear from (13) above. More generally,

In terms of (16),

For example, when $k = 1$,

Such linear transformations are used regularly in Chapter 8.
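The division result for $B = I$ can be checked numerically. The sketch below assumes the decomposition takes the form $A(L) = A(1) + (1 - L)H(L)$ with $H_j = -\sum_{i>j}A_i$, which is what equating powers of $L$ gives and which reproduces $H(1) = -\sum_{i=1}^{k}iA_i$. The helper names and the example coefficients are hypothetical.

```python
def mat_add(a, b):
    """Element-wise sum of two equally sized matrices stored as nested lists."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_scale(c, a):
    """Multiply every element of matrix a by the scalar c."""
    return [[c * x for x in row] for row in a]


def zeros(n):
    return [[0.0] * n for _ in range(n)]


def split_at_unity(coeffs):
    """Given [A_0, ..., A_k], return (A(1), [H_0, ..., H_{k-1}]) with
    A(L) = A(1) + (1 - L) H(L) and H_j = -(A_{j+1} + ... + A_k)."""
    n = len(coeffs[0])
    total = zeros(n)
    for a in coeffs:
        total = mat_add(total, a)
    h = []
    tail = zeros(n)
    for a in reversed(coeffs[1:]):
        tail = mat_add(tail, a)          # running sum A_k + ... + A_{j+1}
        h.append(mat_scale(-1.0, tail))
    h.reverse()
    return total, h


def recompose(total, h):
    """Coefficients of A(1) + (1 - L) H(L), used to verify the split."""
    k = len(h)
    coeffs = [mat_add(total, h[0])]
    for m in range(1, k):
        coeffs.append(mat_add(h[m], mat_scale(-1.0, h[m - 1])))
    coeffs.append(mat_scale(-1.0, h[k - 1]))
    return coeffs
```

For example, with $A_0 = I$, $A_1 = \begin{pmatrix} -1.5 & 0 \\ -0.5 & -0.25 \end{pmatrix}$ and $A_2 = \begin{pmatrix} 0.5 & 0 \\ 0 & -0.25 \end{pmatrix}$, the total effect $A(1)$ has a zero first row and hence reduced rank, so one difference can be extracted in that direction, while $H(1)$ equals $-(A_1 + 2A_2)$.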

5.3. Integration and Co-integration: Formal Definitions and Theorems

DEFINITION 1 (adapted from Engle and Granger 1987). The components of the vector $x_t$ are said to be co-integrated of order $d$, $b$, denoted $x_t \sim \mathrm{CI}(d, b)$, if (i) $x_t$ is I($d$) and (ii) there exists a non-zero vector $\alpha$ such that $\alpha'x_t \sim \mathrm{I}(d - b)$, $d \ge b > 0$. The vector $\alpha$ is called the co-integrating vector.

If $x_t$ has $n > 2$ components, then there may be more than one co-integrating vector $\alpha$; it is possible for several equilibrium relationships to govern the joint evolution of the variables. If there exist exactly $r$ linearly independent co-integrating vectors with $r \le n - 1$, then these can be gathered into an $n \times r$ matrix $\alpha$. The rank of $\alpha$ will be $r$ and is called the co-integrating rank.

DEFINITION 2. A vector time-series $x_t$ has an error-correction representation if it can be expressed as

where $\omega_t$ is a stationary multivariate disturbance, with $A(0) = I_n$, $A(1)$ having only finite elements, $z_t = \alpha'x_t$, and $\gamma$ a non-zero vector. For the case where $d = b = 1$, and with co-integrating rank $r$, the Granger Representation Theorem holds (see Section 5.3.1).

Granger's theorem shows that a co-integrated system of variables can be represented in three main forms: the vector autoregressive (VAR), error-correction, and moving-average forms. These representations are all isomorphic to each other, and the theorem establishes the restrictions that hold between the lag-polynomial matrices in each representation of the process.

We may prove the theorem in at least three (equivalent) ways, depending on the representation from which we choose to start. The theorem is stated in Section 5.3.1. Following this statement, we take the autoregressive representation as our starting-point and derive the main results. This proof is due to Johansen (1991a). The sub-section after the proof contains a detailed interpretation of the results. In Chapter 8 we return to the theorem and provide another proof, this time starting from the moving-average representation. Proving the theorem in two ways highlights some interesting symmetries which exist among the equivalent representations of the process.
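The link between the autoregressive and error-correction forms can be previewed in the smallest possible case. The sketch below assumes a bivariate VAR(1), $x_t = Ax_{t-1} + \varepsilon_t$, rewritten as $\Delta x_t = \Pi x_{t-1} + \varepsilon_t$ with $\Pi = A - I$; the matrix $A$, the factors `gamma` and `alpha`, and the function names are illustrative choices, not values from the text.

```python
def var1_to_ecm(a):
    """Return Pi = A - I, so that x[t] = A x[t-1] + e[t] reads dx[t] = Pi x[t-1] + e[t]."""
    n = len(a)
    return [[a[i][j] - (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]


def outer(g, al):
    """Outer product g * al' of two vectors stored as lists."""
    return [[gi * aj for aj in al] for gi in g]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def step(a, x):
    """One shock-free step of the VAR(1): returns A x."""
    return [sum(a[i][j] * x[j] for j in range(len(x))) for i in range(len(a))]


if __name__ == "__main__":
    A = [[1.0, 0.0], [0.5, 0.5]]     # x1 is a random walk, x2 adjusts towards x1
    Pi = var1_to_ecm(A)
    gamma, alpha = [0.0, -0.5], [-1.0, 1.0]
    print(det2(Pi))                  # zero: Pi has reduced rank
    print(outer(gamma, alpha))       # reproduces Pi, so Pi = gamma * alpha'
    x = [0.0, 1.0]
    for _ in range(3):
        x = step(A, x)
        print(alpha[0] * x[0] + alpha[1] * x[1])   # alpha'x halves each period
```

Here $\Pi$ has rank one and factors as $\gamma\alpha'$, and the combination $\alpha'x_t = x_{2t} - x_{1t}$ follows a stable first-order autoregression with coefficient 0.5 even though each component is I(1).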

5.3.1. Granger Representation Theorem (adapted from Engle and Granger 1987 and Johansen 1991a)

Let $x_t$ be an I(1) vector of $n$ components, each with (possibly) deterministic trend in mean. Suppose that the system can be written as a finite-order vector autoregression:

(25)

where the $\varepsilon_t$ satisfy assumptions (3.16a)-(3.16d) and the first $k$ data points $x_{1-k}, x_{1-k+1}, \ldots, x_0$ are fixed. The model can then be rewritten in error-correction form as

Both (25) and (26) can be written as

where


Equation (26) may also be written as

where $\Psi(L) = (1 - L)^{-1}(\pi(L) - \pi(1)L^k) = I_n - \sum_{i=1}^{k-1}\Gamma_iL^i$. From (13) above, $\Psi(L)$ can always be constructed. Further, the derivative of $\pi(z)$ at $z = 1$ is equal to $-\Psi = -\Psi(1)$.

Define the orthogonal complement $P_\perp$ of any matrix $P$ of rank $q$ and dimension $n \times q$ as follows ($0 < q < n$):

(i) $P_\perp$ is of dimension $n \times (n - q)$;
(ii) $P_\perp'P = 0_{(n-q) \times q}$, $P'P_\perp = 0_{q \times (n-q)}$;
(iii) $P_\perp$ has rank $n - q$, and lies in the null space of $P$.

Certain key assumptions may now be stated.

ASSUMPTION A1. The characteristic polynomial,

has roots either equal to or strictly greater than one in modulus; that is, $|\pi(z)| = 0$ implies that either $|z| > 1$ or $z = 1$.

ASSUMPTION A2. The $n \times n$ matrix $\Pi$ has reduced rank $r < n$ and is therefore expressible as the product of two $n \times r$ matrices $\gamma$ and $\alpha$, where $\gamma$ and $\alpha$ have rank $r$. Thus $\Pi = \gamma\alpha'$.

ASSUMPTION A3. The $(n - r) \times (n - r)$ matrix $\gamma_\perp'\Psi\alpha_\perp$ has full rank $n - r$.

Assumption A1 guarantees that the non-stationarity of $x_t$ can be removed by differencing. A2 rules out a stationary $x_t$ process. If $\Pi$ had full rank (that is, if $|\pi(z)|$ had no roots at one), then from (27), $x_t = \pi^{-1}(L)(\mu + \varepsilon_t)$, which would imply that $x_t$ was stationary. It is also the statement, in the autoregressive form, that the system has $r$ linearly independent co-integrating vectors. In light of Assumption A2, $\gamma\alpha'$ provides a transformation of the $\Pi$ matrix (and hence a linear combination of the $x_{it}$ which is stationary). The significance of A3 will become evident in due course, but essentially, it ensures that $x_t$ is integrated of order no greater than 1. Under the assumptions stated above, the following results may be proved:

(R1) $\Delta x_t$ is stationary.


(R2) $\alpha'x_t$ is stationary.
(R3) $E(\Delta x_t)$ =

(R4) $E(\alpha'x_t)$ = -(

(R5) $\Delta x_t$ has a moving-average representation given by

(R6) $C(1) = \alpha_\perp(\gamma_\perp'\Psi\alpha_\perp)^{-1}\gamma_\perp'$ has rank $n - r$.
(R7) $\alpha'C(1) = 0_{r \times n}$, $C(1)\gamma = 0_{n \times r}$.

where $C(L) = C(1) + (1 - L)C_1(L)$, $\tau = C(1)\mu$, $x_0$ is a constant (vector) of integration, and $S_t = C_1(L)\varepsilon_t$.

Proof. Multiply (27) by $\gamma'$ and $\gamma_\perp'$ respectively to obtain the equations

using the decomposition $\Pi = \gamma\alpha'$ and the result that $\gamma_\perp'\gamma = 0_{(n-r) \times r}$. The matrix $\Pi$ is not invertible, and the system given by (28a)-(28b) therefore cannot be inverted directly to express the $x_{it}$ in terms of the $\varepsilon_{it}$. To obtain an invertible system, we define two new variables, $\omega_t = (\alpha'\alpha)^{-1}\alpha'x_t$ and $v_t = (\alpha_\perp'\alpha_\perp)^{-1}\alpha_\perp'\Delta x_t$. Next, define the matrices $\bar{\alpha} = \alpha(\alpha'\alpha)^{-1}$ and $\bar{\alpha}_\perp = \alpha_\perp(\alpha_\perp'\alpha_\perp)^{-1}$. Let $R = (\alpha, \alpha_\perp)$ be an $n \times n$ matrix of rank $n$. Then $R(R'R)^{-1}R' = I_n$ and hence $(\alpha\bar{\alpha}' + \alpha_\perp\bar{\alpha}_\perp') = I_n$. Thus,

Substituting in (28a)-(28b) gives

where in (28a) the first term on the left-hand side needs to be written first as $-(\gamma'\gamma)(\alpha'\alpha)(\alpha'\alpha)^{-1}\alpha'x_t$. The equations for $\omega_t$ and $v_t$ can now be written in autoregressive form as

with

For $z = 1$, this matrix has determinant


which i s non-zer o b y Assumption s A 2 an d A3 . Henc e z = 1 i s no t a root. Fo r z + 1, straightforwar d bu t tediou s algebr a enables u s t o express th e matri x A(z) as To sho w this , substitut e for *P(z ) in A(z ) in term s of n(z) and jr(l) = — nfro m (27) , and us e th e decompositio n n = y«' an d th e orthogonality conditio n yly = a' La = 0( n _ r ) X r . Fo r z = £ 1, therefore , from (31), where w e have used th e resul t tha t th e determinan t o f a matrix obtained by multiplyin g n — r column s (o r rows ) o f a n n x n matri x b y a constant i s th e determinan t o f th e origina l matri x multiplie d b y th e constant raise d t o th e powe r n — r. Thus , fo r z ¥= 1, |A(z) | = 0 i f an d only i f |;r(z) | = 0 . B y Assumptio n Al , i f w e exclud e z = 1, th e onl y remaining roots o f this determinant li e outside th e uni t circle. This show s tha t al l th e root s o f |A(z) | = 0 ar e outsid e th e uni t disk . Hence th e syste m define d b y (29a)-(29b) i s invertibl e an d 0, whic h ar e al l 1(0 ) i f co-integrabilit y holds . Thus , a i s consistently estimate d b y th e regressio n despit e th e complet e omissio n of al l dynamics. I n fact ,

(48)

Since $\{v_t\}$ is I(0) under co-integrability but $\{x_t\}$ is I(1),

whereas

Thus,

which implies that

Hence $\hat\alpha$ converges to $\alpha$ at a rate of $O_p(T)$ and not at the usual rate of $O_p(T^{1/2})$. Convergence is rapid asymptotically and it is this rapid convergence of the estimates of the coefficients that is used by Engle and Granger as the basis of their two-step estimator.

Since $\hat\alpha$ differs from $\alpha$ by terms of $O_p(T^{-1})$, the asymptotic results for estimation of dynamic models with I(1) variables will be the same whether $\alpha$ is estimated or known. Moreover, differencing must reduce the order of integration of an integrated variable by unity, so if $\Delta y_t$ is related to $\Delta x_t$, and perhaps lags of both of these, and if $\{x_t\}$ and $\{y_t\}$ are co-integrated, then $y_{t-1} - \alpha x_{t-1}$ is I(0) and can be included in the ECM model as if $\alpha$ were known (that is, the sampling variance of $\hat\alpha$ can be ignored). If $\{y_t\}$ and $\{x_t\}$ are not co-integrated, then we have the familiar spurious regression problem; if they are co-integrated, the benefits accruing from a static regression are potentially large.

The so-called 'super-consistency theorem' due to Stock (1987) may be stated formally as follows.

THEOREM (Stock 1987). Suppose that $x_t$ satisfies $(1 - L)x_t = C(L)\varepsilon_t$ with $C(L) = C(1) + (1 - L)C^*(L)$, where $C^*(L)$ has all of its latent roots inside the unit circle. If $C^*(L)$ is absolutely summable,$^{10}$ the disturbances have finite fourth-order absolute moments, and $x_t$ is CI(1,1) with $r$ co-integrating vectors (incorporated in a matrix $\alpha$) satisfying, uniquely, then$^{11}$

Thus, instead of converging at rate $T^{1/2}$, as in stationary processes, least-squares estimators converge at a rate of $T$. This theorem and the error-correction representation of co-integrated systems may be allied to give the following theorem.

THEOREM (Engle and Granger 1987). The two-step estimator of a single equation of an error-correction system with one co-integrating vector, obtained by taking the estimate $\hat\alpha$ of $\alpha$ from the static regression in place of the true value for estimation of the error-correction form at a second stage, will have the same limiting distribution as the maximum-likelihood estimator using the true value of $\alpha$. Least-squares standard errors in the second stage will provide consistent estimates of the true standard errors.

10 The infinite sequence $\{c_j\}_{j=1}^{\infty}$ is said to be absolutely summable if $\sum_{j=1}^{\infty}|c_j| < \infty$. For the matrix $C^*(L)$ to be absolutely summable, the condition is that $\sum_{j=0}^{\infty}\|C_j^*\| < \infty$.
11 The elements of $q$ and $Q$ will typically be all zeroes and ones, defining one coefficient in each column of $\alpha$ to be unity and defining rotations if $r > 1$. $M = \operatorname{plim} E(T^{-2}\sum_{t=1}^{T} x_t x_t')$.
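The rate result and the two-step idea can be illustrated numerically. The following is a minimal sketch under stated assumptions rather than a reproduction of any published experiment: it assumes a bivariate system in which $x_t$ is a pure random walk and $y_t = \alpha x_t + v_t$ with white-noise $v_t$, so that the error-correction form is $\Delta y_t = \beta\Delta x_t + \gamma(y_{t-1} - \alpha x_{t-1}) + v_t$ with $\beta = \alpha$ and $\gamma = -1$. The function names (`simulate_pair`, `static_alpha`, `ecm_coefficients`, `mean_abs_error`) and all parameter values are hypothetical choices made for the illustration.

```python
import numpy as np


def simulate_pair(T, alpha, rng):
    """Assumed DGP: x_t is a random walk, y_t = alpha * x_t + v_t, v_t white noise."""
    x = np.cumsum(rng.standard_normal(T))
    v = rng.standard_normal(T)
    return x, alpha * x + v


def static_alpha(x, y):
    """Step 1: static levels regression of y_t on x_t (no intercept, no dynamics)."""
    return (x @ y) / (x @ x)


def ecm_coefficients(x, y, a):
    """Step 2: OLS of dy_t on dx_t and the lagged error (y - a * x)_{t-1}."""
    dy, dx = np.diff(y), np.diff(x)
    z_lag = (y - a * x)[:-1]
    X = np.column_stack([dx, z_lag])
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    return coef  # (beta_hat, gamma_hat)


def mean_abs_error(T, alpha=2.0, reps=400, seed=0):
    """Monte Carlo mean of |alpha_hat - alpha| from the static regression."""
    rng = np.random.default_rng(seed)
    errs = [abs(static_alpha(*simulate_pair(T, alpha, rng)) - alpha)
            for _ in range(reps)]
    return float(np.mean(errs))


# A 16-fold increase in T should shrink the error by roughly 16 (rate T),
# not by roughly 4 (rate T^{1/2}).
ratio = mean_abs_error(100) / mean_abs_error(1600)

# Second stage with the true alpha and with the first-stage estimate.
rng = np.random.default_rng(1)
x, y = simulate_pair(1000, 2.0, rng)
coef_true = ecm_coefficients(x, y, 2.0)
coef_two_step = ecm_coefficients(x, y, static_alpha(x, y))
print(ratio, coef_true, coef_two_step)
```

In this sketch the second-stage estimates obtained with $\hat\alpha$ should be very close to those obtained with the true $\alpha$, which is the practical content of the Engle-Granger theorem; the sketch says nothing about finite-sample bias in less favourable designs.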

5.6.1. Sketch-proof of Engle-Granger Theorem (Bivariate Case)

The following is a proof of this theorem for the bivariate case. Consider the estimation of $\beta$ and $\gamma$ in the two equations given by

$y_t$ and $x_t$ are co-integrated I(1) variables with the co-integrating parameter given by $\alpha$. In the context of the discussion in this chapter, the error-correction mechanism is estimated in (53) using the true value of the co-integrating parameter, while in (54) $\hat\alpha$ is substituted for $\alpha$, where $\hat\alpha$ is derived from the static regression of $y_t$ on $x_t$. Also, $\varepsilon_t^* = \varepsilon_t + \gamma(\hat\alpha - \alpha)x_{t-1}$. Let $z_t = y_t - \alpha x_t$.

We need to show that the asymptotic distributions of the estimators $\hat\beta$ and $\hat\gamma$, of $\beta$ and $\gamma$ respectively, are the same regardless of whether one uses $\alpha$ or $\hat\alpha$ (that is, whether one estimates (53) or (54)). In standard fashion, we have from (53) (assuming adequate initial values)

The estimators derived from (54) are also given by (55) but with $\hat z_{t-1}$ and $\varepsilon_t^*$ replacing $z_{t-1}$ and $\varepsilon_t$. From this, it is easy to deduce that the result will be demonstrated if the following conditions are shown to be true:

(iii) the asymptotic distributions of are the same;

(iv) the asymptotic distributions of are the same.

In (53), we assume that $\{\varepsilon_t\}$ is an innovation process such that $E(\Delta x_t\varepsilon_t) = 0$. Note first that, by the properties of I(0) and I(1) series, as used and discussed in Chapters 3 and 4, the following expressions are $O_p(1)$ (that is, non-explosive and non-degenerate as $T \to \infty$):

Secondly, Using (59) ,

Result (i) now follows from (57) and (58). Also,

Result (ii) now follows from (56), (57), and (58). Finally,

By (57) and (58), the last two expressions on the right-hand side of the above equality are $O_p(T^{-1/2})$. Result (iii) follows, and (iv) is proved analogously from:

6

Regression with Integrated Variables

We have seen how the presence of integrated variables poses some special problems which do not appear when working with stationary series. These might lead us to believe that a new range of techniques needs to be considered in order to handle such data. However, as we show in this chapter, we can continue to apply standard regressions if we pay attention to orders of integration and use dynamic specifications which take account of any co-integrating relationships among the variables.

The Engle-Granger theorem in Chapter 5, laying emphasis on simple static regressions, implies a good deal about the way in which an investigator ought to proceed with an econometric study of integrated variables. Some of this is related to the evolution of modelling practice among econometricians.

Econometricians of the 1970s began to be suspicious of regressions using data in levels. Their suspicions were reinforced by worries expressed by time-series analysts relating to spurious regressions. The focus of attention began to shift towards the need to have properly specified models with rich dynamic structures. The move, following Mizon (1977), Sims (1977), Hendry and Mizon (1978), and Hendry and Richard (1982), was towards a method of econometric research that preferred models which began with as general a specification as possible, and continued with simplification to a parsimonious econometric model following from imposing constraints consistent with observed data. (See Spanos (1986) for a detailed treatment.) The literature on co-integration reinstated some confidence in static regressions in levels, and good econometric method appeared to have taken a full circle; as long as the I(1) variables were co-integrated, such regressions made sense.

There are nonetheless several reasons for continuing to treat static regressions as being in general sub-optimal. First of all, the estimate $\hat\alpha$ is biased for the co-integrating parameter $\alpha$ and, although that bias is $O_p(T^{-1})$, it can be substantial in finite samples. The bias is likely to be a function of some parameter such as the mean lag of the dynamic adjustment process relating $\{y_t\}$ to $\{x_t\}$. In some circumstances, therefore, a return to dynamic modelling would seem to be the appropriate response to the problems of static-regression biases. Already a body of work exists demonstrating the poor performance of static regressions for many types of problem (Banerjee, Dolado, Hendry, and Smith 1986, and Stock 1987). Second, the distributions of coefficient estimates will typically take non-standard forms even where the series are co-integrated. The 'non-standardness', by which we generally mean asymptotic non-normality, comes from the property that the series are integrated of order greater than or equal to 1. The fundamental point is that the distribution theory that applies to non-stationary series is different from the familiar Gaussian asymptotic theory. The estimators have distributions, in general, which are functionals of the Wiener processes discussed in Chapters 1 and 3. However, some of the standard asymptotic theory may be restored in dynamic models.

We will elaborate on the second of these points, leaving a discussion of the first until Chapter 7. It is important to point out at the outset, in order not to mislead readers, that it is not true that single-equation dynamic models are necessarily superior to their static counterparts. The next two sections present examples where single-equation dynamic models do perform satisfactorily. Yet, as the discussion in Chapter 8 shows, it is possible to construct many cases where single-equation dynamic models by themselves are not sufficient for obtaining efficient and unbiased estimates (see Engle et al. 1983 and Phillips and Loretan 1991).

There are several interrelated difficulties which are important and which collectively imply that the issue is broader than simply a comparison of dynamic with static models.
An informal description of the problems encountered in modelling non-stationary variables in a single-equation framework would identify at least five effects. First, the presence of unit roots induces non-standard distributions of the coefficient estimates. Second, the error process may not be a martingale difference sequence. Third, the explanatory variables may each be generated by processes that display autocorrelation; taken in conjunction with the second effect, this gives rise to 'second-order' biases. Fourth, there may be more than one co-integrating vector. Finally, the explanatory variables in the single equation may not be weakly exogenous for the parameters being estimated. Weak exogeneity can fail if, say, a co-integrating vector enters more than one equation in the system generating the variables.

Static regressions can be affected by all five of the problems listed above, while dynamic models may be able to accommodate the first three effects, as in the examples given in the sections that follow. However, estimates derived from single-equation dynamic models are not optimal if weak exogeneity fails to hold. This final observation extends the discussion from the realm of modelling unit-root processes to the all-encompassing realm of general econometric modelling. This discussion is formalized in Chapter 8 and illustrated with several examples.

6.1. Unbalanced Regressions and Orthogonality Tests

Mankiw and Shapiro (1985, 1986) drew attention to a problem that may arise in applying standard distributions to inference where there are non-stationary (or borderline non-stationary) series present, and in particular to the problem of inference concerning orthogonality between series. While the problem is, as with spurious regression, essentially a problem of integrated data, it will appear with near-integrated data in finite samples.$^{1}$ With this qualification, the problem may be said to arise in unbalanced regressions: that is, regressions in which the regressand is not of the same order of integration as the regressors, or any linear combination of the regressors.$^{2}$

1 While the experiments reported here use borderline stationary data, the results will also apply to integrated series.
2 These are sometimes called inconsistent regressions. Inconsistency in this sense is unrelated to the concept of an inconsistent estimator of a parameter: see n. 3.

The Mankiw-Shapiro discussion centres on a condition such as

$$E_{t-1}(y_t) = c, \quad \text{implying} \quad y_t = c + v_t, \quad E_{t-1}(v_t) = 0, \tag{1}$$

where $E_{t-1}$ is interpreted as the expectation, conditional on information realized at time $t - 1$, of the value of some variable which may be dated in the future. That such a condition holds is often tested with a regression such as

$$y_t = c_1 + c_2 x_{t-1} + v_t, \tag{2}$$

where $c_2 = 0$ under the null hypothesis that (1) holds. Examples of such hypotheses and tests arise frequently in models that postulate the full use of all realized information. One such example from macroeconomics is Hall's (1978) formulation of the life-cycle/permanent-income model, which, given a stringent set of assumptions, implies that consumption should follow a random walk. Tests of this hypothesis have typically taken the form of regressions of differenced consumption on a constant and one or more lagged income or consumption terms; under the null hypothesis the coefficients on the lagged terms should not be significantly different from zero.

Mankiw and Shapiro suggest examining the case in which the regressor $x_t$ follows the AR(1) process:

$$x_t = \theta x_{t-1} + e_t, \tag{3}$$

with $\operatorname{corr}(e_t, v_t) = \rho$ and $\operatorname{corr}(e_{t+j}, v_t) = 0 \;\; \forall j \neq 0$.

Note that this is not a problem of simultaneity bias: the regressor $x_{t-1}$ is uncorrelated with $v_t$. A structure such as this is appropriate in many models in which these tests have been used. In the Hall (1978) model, for example, $\rho = 1$ where $x_t$ and $y_t$ represent current income and the change in current consumption respectively. Mankiw and Shapiro use Monte Carlo simulations to tabulate estimates of the actual rejection frequencies and critical values in $t$-type tests of $H_0$: $c_2 = 0$, when standard $t$-values are used. Table 6.1 reproduces a selection of their results for model (2) and also for the model with a linear time trend,

$$y_t = c_1 + c_2 x_{t-1} + c_3 t + v_t. \tag{4}$$
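The kind of Monte Carlo experiment described here can be sketched in a few lines. The following is an illustrative sketch only, not Mankiw and Shapiro's code: it assumes Gaussian innovations, a zero initial value for the regressor, a regression of $y_t$ on a constant and $x_{t-1}$ with $x_t = \theta x_{t-1} + e_t$ and $\operatorname{corr}(e_t, v_t) = \rho$, and the asymptotic two-sided 5 per cent critical value 1.96. The function name `rejection_rate` and its defaults are hypothetical.

```python
import numpy as np


def rejection_rate(theta, rho, T=50, reps=1000, seed=0, crit=1.96):
    """Share of replications in which |t| on x_{t-1} exceeds crit when c2 = 0."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        e = rng.standard_normal(T + 1)
        # v_t has unit variance and contemporaneous correlation rho with e_t.
        v = rho * e + np.sqrt(1.0 - rho**2) * rng.standard_normal(T + 1)
        x = np.zeros(T + 1)                 # assumed start value x_0 = 0
        for t in range(1, T + 1):
            x[t] = theta * x[t - 1] + e[t]
        y = v[1:]                           # y_t = c + v_t with c = 0 under the null
        X = np.column_stack([np.ones(T), x[:-1]])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        s2 = resid @ resid / (T - 2)
        se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
        rejections += abs(coef[1] / se) > crit
    return rejections / reps


high = rejection_rate(theta=0.999, rho=1.0)   # near-integrated regressor, rho = 1
low = rejection_rate(theta=0.999, rho=0.0)    # same regressor, uncorrelated errors
print(high, low)
```

With a near-integrated regressor and $\rho = 1$ the nominal 5 per cent test should reject far too often, while with $\rho = 0$ the rejection frequency should stay close to the nominal level; exact figures depend on the assumptions listed above and on the random seed.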

TABLE 6.1. Percentage rejection frequencies of standard t-tests at nominal 5 per cent level$^{a}$
DGP: (1) + (3); Sample size = T; No. of replications = 1000

                      Model (2)                        Model (4)
θ \ ρ        1.0   0.9   0.8   0.5   0.0       1.0   0.9   0.8   0.5   0.0
(a) T = 50
0.999         30    24    20    11     7        60    45    36    16     6
0.99          26    20    15    10     7        54    40    33    15     6
0.98          22    17    15     8     7        50    37    30    14     5
0.95          17    12    10     7     6        38    30    25    12     6
0.90          12     9     8     6     6        28    22    19    10     6
0.00           5     6     6     5     5         6     7     7     5     6
(b) T = 200
0.999         29    23    20    10     5        61    48    38    18     5
0.99          18    15    13     8     4        41    32    27    13     5
0.98          13    10     9     7     5        29    24    20    11     6
0.95           9     7     7     6     5        17    14    12     7     6
0.90           7     6     6     6     6        10     9     8     6     7
0.00           5     4     4     5     5         5     5     4     5     5

a This table compares two sample sizes. While the test size distortions are generally smaller for the larger sample and will vanish as $T \to \infty$, this feature is specific to the borderline-stationary processes used (...

$\xi_t^1 = \sum_{s=1}^{t}\eta_s$, and defining $\xi_t^j$ (the $j$-fold summation of the $\eta_t$) recursively as $\xi_t^j = \sum_{s=1}^{t}\xi_s^{j-1}$, $1 < j \le g$, the transformation $D$ is chosen such that

or, equivalently where

and $L$ is the lag operator. The variates $v_t$ are referred to as the canonical regressors associated with $Y_t$. The lag polynomial $F_{11}(L)$ has dimension $k_1 \times N$, and $\sum_{j=0}^{\infty}F_{11j}F_{11j}'$ is non-singular. $F_{jj}$ is assumed to have full row rank $k_j$ (may be equal to zero) for $j = 2, \ldots, 2g + 1$, so

Since we may be interested in estimating only some of the $k$ equations in (28), we next need to define a selection matrix $C$. If we needed to consider only $n \le k$, we could look at the regression of $CY_t$ on $Y_{t-1}$, where $C$ is an $n \times k$ matrix of constants. The $n$ regression equations to be estimated are then

The asymptotic analysis in SSW is derived in stacked single-equation form. In order to use this form, we need the symbol $\otimes$ which denotes a Kronecker product defined as follows: consider the $m \times n$ matrix $A = \{a_{ij}\}$ and the $p \times q$ matrix $B$; the Kronecker product of $A$ and $B$ (in that order) is the $mp \times nq$ matrix,

$$A \otimes B = \{a_{ij}B\}.$$

$\operatorname{vec}(\cdot)$ denotes the column-wise vectoring operator. Thus, writing the matrix $A$ as $A = (a_1, a_2, \ldots, a_n)$, where each of the $a_i$ is an $m \times 1$ vector, $\operatorname{vec}(A)$ is given by

$$\operatorname{vec}(A) = (a_1', a_2', \ldots, a_n')'.$$

$X = [Y_1, Y_2, \ldots, Y_{T-1}]'$, $s = \operatorname{vec}(S)$, $v = \operatorname{vec}(\eta)$, and $\beta = \operatorname{vec}((A)')$, then (32) can be written in stacked form as

In order to express (33) in terms of the transformed regressors $Z = [Z_1', Z_2', \ldots, Z_{T-1}']' = XD'$, note that the coefficient vector corresponding to these is given by $\delta = (I_n \otimes D'^{-1})\beta$.$^{11}$ Thus, finally,

11 To show this, substitute for $Z = XD'$ and $\delta = (I_n \otimes D'^{-1})\beta$ in (34) giving $s = (I_n \otimes XD')(I_n \otimes D'^{-1})\beta + (\Sigma^{1/2} \otimes I_{T-1})v$. Now $(A_1 \otimes A_2)(A_3 \otimes A_4) = (A_1A_3) \otimes (A_2A_4)$, for arbitrary matrices $A_i$, $i = 1, 2, 3, 4$, provided the matrices are conformable. Using this rule (33) is recovered as required.
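The Kronecker-product and vec manipulations used in the stacked form can be checked numerically. The sketch below is illustrative only: the dimensions and the random matrices are arbitrary assumptions, `vec` is a helper defined here (it is not a NumPy function), and the matrix `D` is simply a well-conditioned invertible matrix standing in for the transformation of the text.

```python
import numpy as np


def vec(M):
    """Column-wise vectoring operator: stack the columns of M into one vector."""
    return M.reshape(-1, order="F")


rng = np.random.default_rng(0)

# Dimension rule: (m x n) kron (p x q) is (mp x nq).
kron_shape = np.kron(rng.standard_normal((2, 3)), rng.standard_normal((4, 5))).shape

# Mixed-product rule: (A1 kron A2)(A3 kron A4) = (A1 A3) kron (A2 A4).
A1, A2 = rng.standard_normal((2, 3)), rng.standard_normal((4, 2))
A3, A4 = rng.standard_normal((3, 2)), rng.standard_normal((2, 5))
mixed_product_holds = np.allclose(np.kron(A1, A2) @ np.kron(A3, A4),
                                  np.kron(A1 @ A3, A2 @ A4))

# Stacking: vec(X A') = (I_n kron X) vec(A') for X (obs x k) and A (n x k).
obs, k, n = 6, 3, 2
X = rng.standard_normal((obs, k))
A = rng.standard_normal((n, k))
beta = vec(A.T)
stacked_form_holds = np.allclose(vec(X @ A.T), np.kron(np.eye(n), X) @ beta)

# Transformed regressors: Z = X D', delta = (I_n kron D'^{-1}) beta give the same fit.
D = np.eye(k) + 0.1 * rng.standard_normal((k, k))
Z = X @ D.T
delta = np.kron(np.eye(n), np.linalg.inv(D.T)) @ beta
transformed_fit_equal = np.allclose(np.kron(np.eye(n), Z) @ delta,
                                    np.kron(np.eye(n), X) @ beta)
print(kron_shape, mixed_product_holds, stacked_form_holds, transformed_fit_equal)
```

The last check is the mixed-product rule at work: $(I_n \otimes XD')(I_n \otimes D'^{-1}) = I_n \otimes X$, so the systematic part of the stacked regression is unchanged by the transformation.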

The OLS estimator $\hat\delta$ of $\delta$ in the stacked transformed regression model (34) is given by

It is possible to see from (30) that the moments involving the different components of $Z_t$ converge at different rates. For example, $Z_{1,t}$ and $Z_{2,t}$ are $O_p(1)$ while $Z_{3,t}$ is $O_p(t^{1/2})$, $Z_{4,t}$ is $O_p(t)$, and so on. Hence the sample second moments, which is what we would be interested in when looking at the matrix $Z'Z$, converge at a rate of $T$ for the $Z_{1,t}$ and $Z_{2,t}$ components, at a rate $T^2$ for the $Z_{3,t}$ component, and at a rate $T^3$ for the $Z_{4,t}$ component. In order to handle these different orders, SSW use the scaling matrix $\Upsilon_T$, given by

(36)

All the convergence results use the scaled $Z'Z$ matrix $\Upsilon_T^{-1}Z'Z\Upsilon_T^{-1}$; let us call this scaled matrix $Q$. The first step in the proof is to derive the limiting matrix for $Q$. SSW show that, under certain regularity conditions, $Q \Rightarrow V$ where the elements of $V$ may be described as follows:
(a) $V_{11}$ and $V_{12}$ are non-random matrices given by $\sum_{j=0}^{\infty}F_{11j}F_{11j}'$ and $\sum_{j=0}^{\infty}F_{11j}F_{21j}'$ respectively. Additionally, $V_{12} = V_{21}'$.
(b) $V_{1p} = V_{p1}' = 0$, $p = 3, \ldots, 2g + 1$.
(c) $V_{22}$ is also non-random, given by $F_{22}F_{22}' + \sum_{j=0}^{\infty}F_{21j}F_{21j}'$.
(d) $V_{mp}$, where $m, p = 3, 5, 7, \ldots, 2g + 1$, are random matrices involving functionals of multivariate Wiener processes.
(e) $V_{mp}$, where $m = 2, 4, 6, \ldots, 2g$, $p = 3, 5, 7, \ldots, 2g + 1$, are also random matrices involving functionals of multivariate Wiener processes.
(f) $V_{mp} = [2/(p + m - 2)]F_{mm}F_{pp}'$, $p = 4, 6, \ldots, 2g$, $m = 2, 4, 6, \ldots, 2g$.

This is the first time we have used multivariate Wiener processes. The mathematical details involved in going from univariate to multivariate Wiener processes are complex and will not be dealt with here (for a good account, see Phillips and Durlauf 1986). However the generalizations from our analysis in Chapter 3 can be understood intuitively fairly easily and the appendix sketches the bivariate case.

Thus, each element of a standardized $n \times 1$ multivariate Wiener process $W(r)$ is a univariate Wiener process and the elements of $W(r)$ are independent. In particular, $W(1)$ has the multivariate standard normal distribution, that is, $W(1) \sim N(0, I_n)$. Further, $W(r) \in C[0,1]^n$, where $C[0,1]$ is the space of continuous functions defined on $[0,1]$.

Convergence results analogous to (3.17), for a sequence of mean zero random vectors $\{u_t\}$, can be proved by defining standardized sums such as

with $(t - 1)/T \le r < t/T$, and the matrix $\Omega$ is the long-run variance-covariance matrix of $u_t$ defined by $\Omega = \lim_{T\to\infty}E(T^{-1}S_TS_T')$ analogously with (3.16c). The $\{u_t\}$ innovation sequence satisfies conditions equivalent to those given by (3.16a)-(3.16d) for the univariate case. Provided suitable regularity conditions are satisfied, the following multivariate analogue of (3.18) may be proved: $R_T(r) \Rightarrow W(r)$. Finally, multivariate analogues of all the convergence results given earlier for univariate processes may be derived. Thus, for example, referring to Table 3.3, where $y_t = y_{t-1} + u_t$:

To derive the results above we have assumed, as in Table 3.3, that $\{u_t\}$ is a white-noise innovation sequence with $I_n$ as the variance matrix. The next step of the argument involves rewriting the estimator $\hat\delta$ in a form such that its distribution can be derived. This is done by first defining a non-singular matrix $H$ which, in essence, transposes the stacked version of the matrix $Z$. Thus, (37)

From (35),

by substituting for $s$ from (34). Next, using the result that

Thus,

(38)

As noted above the matrix $V$ is the limiting matrix of $Q$. The asymptotic distribution of

is needed to give us the final result. This limiting vector, denoted by $\phi$, takes the following form:

where
(a) $\phi_m$ for all $m \ge 3$ are functionals of multivariate Wiener processes;
(b) $\phi_2 = \phi_{21} + \phi_{22}$, where $\phi_{22} = \operatorname{vec}[F_{22}W(1)'\Sigma^{1/2}]$, $W(1)$ has the multivariate standard normal distribution, and

Finally,

where $(\phi_1, \phi_{21})$ are independent of $(\phi_{22}, \phi_3, \ldots, \phi_{2g+1})$. Consolidating these steps, we have the following theorem.

This provides us with several interesting results. First, $\hat\delta$, and hence $\hat\beta$, is a consistent estimator of $\delta$, respectively $\beta$, in the presence of arbitrarily many unit roots and deterministic time trends. This observation relies on the assumption that the model is correctly specified, in the sense that the errors are martingale difference sequences, and the $\Upsilon_T$ may rescale by powers of $T$ greater than $\tfrac{1}{2}$.

We have already noted that the estimated coefficients on the elements of $Z_t$ converge to their probability limits at different rates. Hence, if some of the transformed regressors are dominated, in an order of probability sense, by stochastic components, their limiting distributions will be non-normal. On the other hand, if there are no $Z_t$ regressors dominated by stochastic trends (that is, if $k_3 = k_5 = \ldots = k_{2g+1} = 0$), then $\hat\delta$, and hence $\hat\beta$, has an asymptotic normal joint distribution. This happens because the terms involving the random integrals are no longer present, as may be seen from (30), where $k_3, k_5, \ldots, k_{2g+1}$ are the ranks of matrices multiplying the stochastic canonical regressors. If these matrices are absent, the transformed regression is considerably simplified as it is expressible solely in terms of stationary variables and deterministic trend terms. In such a case, therefore, $H(I_n \otimes \Upsilon_T)(\hat\delta - \delta) \Rightarrow N(0, H(\Sigma \otimes V^{-1})H')$ where $V$ is now a non-random matrix. Additionally the $F$-statistic associated with testing an arbitrary set of $q$ linear restrictions $R\beta = r$ is asymptotically distributed as $\chi_q^2/q$ in this case.

If a single stochastic trend is dominated by a non-stochastic trend, then, again, asymptotic normality holds. This is the result of West (1988) and may be seen using (30) and keeping track of the rates of convergence of the sample moments of the separate components of $Z_t$. Consider, for example, the set of canonical regressors given by $(\eta_t, 1, \xi_t^1, t)'$ and suppose the transformed regression is expressible in terms of these canonical regressors. Thus, while the sample variability of the stochastic trend term is $O_p(T)$, that of the deterministic trend is $O(T^{3/2})$. As shown by West (1988), and discussed in Section 6.2.1, in deriving the asymptotic distribution for this case, the deterministic trend component dominates the stochastic component and asymptotic normality follows.

The Stock-West (1988) example, discussed earlier, works because we are able to rewrite the regression in terms of canonical regressors which do not have any dominating stochastic component. The issue of domination, in this context, is best addressed by looking at the scaling matrix.

Four more examples will now be given to illustrate these arguments, using the framework developed above. The final example in this set of four contains recommendations for modelling with integrated series.

6.2.5. Example (Sims et al. 1990: 119)

Let the process $\{x_t\}$ be generated according to the following AR(2) process without drift:


Under $H_0$, $\beta_0 = 0$, $\beta_1 + \beta_2 = 1$ and $|\beta_2| < 1$ so that the autoregressive polynomial in (39) has only one unit root. If a constant is included in the regression of $x_t$ on its two lags, $Y_t$ (in the notation developed earlier) is given by

Transforming to the canonical regressor form,$^{12}$ we have

(40)

where $\delta_1 = -\beta_2$, $\delta_2 = \beta_0$, and $\delta_3 = \beta_1 + \beta_2$, $Z_{1,t} = \Delta x_t$, $Z_{2,t} = 1$, and $Z_{3,t} = x_t$. It may also be shown that

(41)

where $\theta(L) = (1 + \beta_2L)^{-1}$ and $\theta^*(L) = (1 - L)^{-1}[\theta(L) - \theta(1)]$.

Note from (41) that $F_{21}(L) = 0$. This implies, by referring to the description of the $V$ matrix above, that $V$ is block-diagonal. The estimate $\hat\delta_1$ of the coefficient on the (differenced) stationary term has an asymptotically normal distribution with mean 0 and variance given by $V_{11}^{-1}$. The marginal distribution of $\hat\delta_2$, however, is not normal; because $F_{23}$ is not equal to zero, $Z_{2,t}$ and $Z_{3,t}$ are asymptotically correlated, and since $Z_{3,t}$ has a Wiener distribution, so does the coefficient on $Z_{2,t}$.

If an intercept is not included in the regression, we have a $2 \times 2$ block-diagonal $V$ matrix. The estimated coefficient $\hat\delta_1$ still has an asymptotically normal distribution, with $\hat\delta_1$ converging to its probability limit at rate $T^{1/2}$, while $\hat\delta_3$ has a Wiener distribution with convergence at rate $T$. Any joint test involving $\hat\delta_1$ and $\hat\delta_3$ will also have a non-standard distribution.

The analogy with the Stock-West example is direct. In (27) we had a series of terms integrated of order zero. The coefficient estimates on all these stationary terms were jointly and individually asymptotically normally distributed. The joint distribution of $\hat\beta$ in (27), with any of the $\hat\pi_i$, was of course non-standard. This observation applies equally well here.

There is, however, an important difference between the Stock-West example and the current example. In the former case, because $\beta$ had already been set equal to 1, our parameters of interest could all be written as coefficients on mean-zero and non-integrated variables. Inference could then be conducted using standard tables. In the latter case, although we can use standard tables to test for the significance of $\beta_2$, a test of $\beta_1 + \beta_2 = 1$ still requires us to use non-standard distribution theory (and so tables constructed by simulation). In a sense, our rewriting in terms of stationary variables is not sufficiently successful to enable us to conduct inference solely using standard tables. Example 6.2.6 examines this issue in more detail.

12 This transformation is not unique, and one could imagine choosing others; however, (39) can be rewritten as $x_t = (\beta_1 + \beta_2)x_{t-1} - \beta_2(x_{t-1} - x_{t-2}) + \eta_t$, because $\beta_0 = 0$ under the null, and this suggests the decomposition given by (40). It has the advantage of making $\delta_1$ ($= -\beta_2$) the coefficient of a non-integrated random variable, since $x_t$ is an integrated series.
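The re-parameterization used in this example is a purely linear transformation of the regressors, so the levels regression and the canonical-form regression give the same fit, with coefficients linked by $\delta_1 = -\beta_2$, $\delta_2 = \beta_0$, and $\delta_3 = \beta_1 + \beta_2$. The sketch below checks this numerically under assumed parameter values ($\beta_1 = 1.4$, $\beta_2 = -0.4$, $\beta_0 = 0$, Gaussian innovations, zero start values); these choices are ours for illustration and do not come from the original example.

```python
import numpy as np

rng = np.random.default_rng(0)
T, beta1, beta2 = 500, 1.4, -0.4        # beta1 + beta2 = 1 and |beta2| < 1: one unit root
eta = rng.standard_normal(T)
x = np.zeros(T)                          # assumed start values x_0 = x_1 = 0
for t in range(2, T):
    x[t] = beta1 * x[t - 1] + beta2 * x[t - 2] + eta[t]

y = x[2:]
ones = np.ones(T - 2)
x_lag1, x_lag2 = x[1:-1], x[:-2]

# Levels form: x_t on (1, x_{t-1}, x_{t-2}) with coefficients (b0, b1, b2).
X_levels = np.column_stack([ones, x_lag1, x_lag2])
b, *_ = np.linalg.lstsq(X_levels, y, rcond=None)

# Canonical form: x_t on (dx_{t-1}, 1, x_{t-1}) with coefficients (d1, d2, d3).
Z_canon = np.column_stack([x_lag1 - x_lag2, ones, x_lag1])
d, *_ = np.linalg.lstsq(Z_canon, y, rcond=None)

implied = np.array([-b[2], b[0], b[1] + b[2]])
print(d, implied)
```

The two coefficient vectors printed at the end should agree to numerical precision. What differs between the parameterizations is not the fit but which coefficients are attached to stationary regressors and which to the integrated regressor, and hence which tests can use standard tables.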

6.2.6. Example (Sims et al. 1990: 128)

Suppose now that $x_t$ is generated as in Section 6.2.5 but $\beta_0$ is non-zero under the null. The canonical representation$^{13}$ yields

(42)

(43)

where $\theta(L)$ and $\theta^*(L)$ are defined as in Section 6.2.5 above.

Here, unlike the example in Section 6.2.5, there are no elements of $Z_t$ dominated by a stochastic integrated process. The stochastic-trend term is dominated, in sample variability, by the deterministic-trend term $t$. A detailed discussion of this case appears in West (1988).
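The sense in which a deterministic trend dominates a stochastic trend in sample variability can be seen in a small simulation. This is a sketch under simplifying assumptions: instead of the AR(2) process of the example it uses a pure random walk with drift, $x_t = \mu t + \xi_t$ with $\xi_t$ a driftless Gaussian random walk, and the function name `scaled_second_moments` and the values of $T$ and $\mu$ are hypothetical.

```python
import numpy as np


def scaled_second_moments(T, drift, seed=0):
    """Return (sum xi_t^2 / T^2, sum x_t^2 / T^3) for x_t = drift * t + xi_t."""
    rng = np.random.default_rng(seed)
    xi = np.cumsum(rng.standard_normal(T))   # stochastic trend, O_p(t^{1/2})
    t = np.arange(1, T + 1)
    x = drift * t + xi                       # deterministic trend, O(t), plus xi_t
    return (xi @ xi) / T**2, (x @ x) / T**3


stoch, total = scaled_second_moments(10_000, drift=1.0)
print(stoch, total)
```

The second moment of the stochastic trend needs scaling by $T^2$ and stays random, whereas the second moment of the drifting series needs scaling by $T^3$ and settles near the non-random limit $\mu^2/3$; the stochastic component contributes only lower-order terms.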

6.2.7. Example (Banerjee and Dolado 1988)

This example is a consolidation of most of the principal points discussed in the pages above. It is a variation of the Stock-West example, and all statements concerning the distributions of various parameter estimates may be derived from earlier general principles.

13 This decomposition again has the advantage of making $\delta_1$ the coefficient of a non-integrated variable. The motivation for choosing this transformation is therefore similar to that given for the example in Sect. 6.2.5.

Consider the following regression:

where $y_t$ denotes the logarithm of disposable income and $c_t$ the logarithm of consumption, and both variables are I(1) in levels. Here, although we have non-stationary variables as regressors, if they are co-integrated with each other, as they must be if any of the permanent-income/life-cycle models of consumption are to make sense, then this co-integration property makes both sides of the regression equation I(0) and the $t$-tests of the coefficients of all the regressors are asymptotically normal. The long-run multiplier between consumption and income can be deduced much as in any dynamic model. A variant of (44) is the model

Although the individual $t$-ratios are asymptotically normally distributed, the distribution of the Wald statistic, used for testing the joint null hypothesis $\beta = \delta = 0$, is a functional of a Wiener process and its distribution is non-standard. More interestingly, if (45) were re-parameterized as

where $s_{t-1} = y_{t-1} - c_{t-1}$, $\gamma_1 = \beta + \delta$, $\gamma_2 = \beta$, and $s_{t-1}$ may be shown to be I(0) under the assumptions of the permanent-income hypothesis, then $t(\gamma_1 = 0)$ would be a functional of a Wiener process whereas $t(\gamma_2 = 0)$ would have an asymptotically normal distribution.

In the general model given by (44), the following results may be proved, using theorems 1 and 2 in SSW (1990):
(a) The $t$-statistic of each coefficient individually is asymptotically normally distributed.
(b) The $F$-statistics of joint significance of any proper subset of the set of stationary regressors have standard asymptotic distributions. Thus, any test of the joint significance of $\Delta y_{t-j}$ ($j = 1, \ldots, n - 1$) and $\Delta c_{t-j}$ ($j = 1, \ldots, m - 1$) will have the correct size if standard tables are used. Further, given that the non-stationary variables are co-integrated, if the regressors in the non-stationary set were combined, say, to give $p$ stationary regressors and $q$ non-stationary regressors,$^{14}$ an $F$-statistic that uses any of the derived $p$ stationary regressors in combination with any of the original stationary regressors will also have a standard distribution asymptotically.
(c) The $F$-statistics of joint significance of any subset of the set of non-stationary regressors have non-standard distributions. Moreover, an $F$-statistic that uses any stationary regressors in combination with any non-stationary regressors will have a non-standard distribution.

14 In (46), for example, $p = q = 1$ and the original number of non-stationary regressors (excluding the trend) is 2.

Point (a) is obtained from the property of the non-stationary regressors forming a co-integrated set; as in Section 6.2.3 above, both $\delta$ and $\beta$ can be written as coefficients on mean-zero stationary variables (with (46) giving one such re-parameterization for $\beta$). The next example reconsiders this point in the context of modelling practice. Point (b) is not surprising because the $F$-statistics considered use only stationary regressors. The fact that some of these stationary regressors may be re-parameterizations of some or all of the original non-stationary regressors is an interesting feature.

Point (c) is surprising in two respects. Consider (44) and (46); the first surprising feature is the non-standard behaviour of the $F$-statistic and the second is that, while the $t$-ratio of the coefficient of $c_{t-1}$ has a standard distribution under parameterization (45), under the linear re-parameterization given by (46) the $t$-ratio has a Wiener distribution. Both results follow from the asymptotic singularity of a particular variance-covariance matrix.$^{15}$

Consider $\hat\gamma_1$ in (46), which tends to a non-degenerate distribution at rate $T$; $T^{1/2}\hat\gamma_2$ is asymptotically normally distributed. Thus,

and so

This accounts for the asymptotic singularity of the variance-covariance matrix of $[\hat{\delta}, \hat{\beta}]'$ and the corresponding non-standard behaviour of the F-statistic in (45). However, the distribution of $T\hat{\gamma}_1$ may be shown to be non-degenerate. $\hat{\gamma}_1$ can be written as a functional of Wiener processes, and the scaling factor (of $T$) suggests the resulting non-standard distribution.

15 The asymptotic singularity of the variance-covariance matrix is the problem of multi-collinearity in another guise. On this, also see SSW (1990).


It is instructive to note that the regression given by (44) would not be sensible unless the right-hand variables or regressors were co-integrated. A special example of (44) was discussed in Section 6.1, where we spoke of an unbalanced regression. This is a much more general point than that made in the context of spurious regression. A regression involving a right-hand set of variables integrated of an order different from the order of integration of the left-hand side is just as problematic as a regression between two unrelated non-stationary series. In each case, the distributions of the statistics are non-standard.

6.2.8. Example (Stock and Watson 1988a)

Stock and Watson (1988a) provide an example of the dangers involved in not properly taking account of the orders of integration of the regressors and the regressand. They set up a simple data-generation process based on the permanent-income hypothesis:

where
$y^p_t$ = the permanent component of disposable income, which is assumed to follow a random walk
$c_t$ = consumption
$y^s_t$ = transitory component of disposable income, which is a stationary innovation process
$p_t$ = price level in period $t$.

The innovation processes $u_t$ and $v_t$ are uncorrelated.

Stock and Watson relate the tale of two econometricians trying to test versions of Friedman's permanent income hypothesis. The misguided econometrician, unaware of or choosing to ignore the orders of integration of the series, estimates the following regressions:

$c_t = \alpha_1 + \beta_1 p_t$ (to check money illusion)
$c_t = \alpha_2 + \beta_2 t$ (to check whether consumption has a trend)
$\Delta c_t = \alpha_3 + \beta_3 \Delta y_t$ (to calculate the marginal propensity to consume)
$\Delta c_t = \alpha_4 + \beta_4 y_{t-1}$ (to test the permanent income hypothesis).

Each of the inferences from these regressions is invalid.
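The first and fourth of these regressions can be illustrated with a small Monte Carlo sketch. The data-generation process coded below is an assumption made for illustration only (consumption equal to a random-walk permanent income, observed income equal to permanent plus white-noise transitory income, and an unrelated random-walk price level); the function names, sample size, and replication count are hypothetical. The sketch counts how often a nominal 5 per cent two-sided t-test rejects a zero slope, and adds a balanced regression on a stationary regressor as a benchmark.

```python
import numpy as np

def slope_t_stat(y, x):
    """t-ratio of the slope in an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    s2 = resid @ resid / (len(y) - X.shape[1])
    slope_var = s2 * np.linalg.inv(X.T @ X)[1, 1]
    return coef[1] / np.sqrt(slope_var)

def rejection_rates(T=100, reps=500, seed=0):
    """Share of replications in which |t| > 1.96 for the slope coefficient."""
    rng = np.random.default_rng(seed)
    hits = {"c_on_p": 0, "dc_on_ylag": 0, "dc_on_noise": 0}
    for _ in range(reps):
        u, v, w = rng.standard_normal((3, T))
        y_perm = np.cumsum(u)   # assumed: permanent income is a random walk
        c = y_perm              # assumed: consumption equals permanent income
        y = y_perm + v          # assumed: income = permanent + transitory part
        p = np.cumsum(w)        # price level: a random walk unrelated to c
        dc = np.diff(c)
        # Regression 1: two unrelated I(1) series (spurious regression).
        hits["c_on_p"] += abs(slope_t_stat(c, p)) > 1.96
        # Regression 4: I(0) regressand on an I(1) regressor (unbalanced).
        hits["dc_on_ylag"] += abs(slope_t_stat(dc, y[:-1])) > 1.96
        # Benchmark: I(0) regressand on an unrelated I(0) regressor.
        hits["dc_on_noise"] += abs(slope_t_stat(dc, v[:-1])) > 1.96
    return {name: count / reps for name, count in hits.items()}

if __name__ == "__main__":
    for name, rate in rejection_rates().items():
        print(f"{name}: rejection rate {rate:.2f}")
```

Under these assumptions the first two rejection rates should lie well above the nominal 5 per cent, while the benchmark should stay close to it.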



The first regression is a spurious regression of the classical Granger-Newbold kind; $c_t$ and $p_t$ are unrelated random walks, and the econometrician's finding of a large t-statistic for $\beta_1$, thereby leading him to conclude in favour of money illusion,16 is a spurious one.

The second regression is also spurious since it attempts to explain a random walk (or, in other words, a stochastically trending variable) by a deterministic trend. Nelson and Kang (1981) pointed out the dangers of running regressions which attempt to de-trend stochastically trending data in the vain hope of achieving stationarity around a trend. In both cases the problems with the inferences arise because the regressions involve variables that are not co-integrated (see Chapter 3).

The third equation appears to be correctly specified but nevertheless leads to downwardly biased estimates of the coefficient for the marginal propensity to consume because disposable income measures the change in permanent income with error, since it includes the change in transitory income as well. The final regression is what we called an 'unbalanced regression' as it tries to explain a variable integrated of order zero by a variable integrated of order 1. The series of papers noted above (Mankiw and Shapiro 1985, 1986; Banerjee and Dolado 1988; Galbraith et al. 1987) consider the extent to which the t-statistics in such cases are biased away from zero, leading to misleading inferences about the significance of coefficients.

Stock and Watson compare the predicament of this econometrician with econometrician B, say, who looks at the results of the following alternative regressions:

The inferences from each of these regressions will be, by and large, correct. The first regression here is the standard co-integrating regression and this time is valid. The estimate of the coefficient $\delta_1$ will have a Wiener distribution but will be super-consistent. The reported standard error will be incorrect owing to untreated autocorrelation.

The second regression can be re-parameterized17 as

Thus, $\delta_3$ can be written as a coefficient on a stationary variable (as can $\delta_2$ treated in isolation). The theory, as described above, implies that the usual t and F distributions18 will apply. A similar argument applies to the third regression, with the exception that in this case $y_{t-1} - c_{t-1}$ forms the co-integrating relation. Stock and West (1988) and Banerjee and Dolado (1988) discuss regressions of this form in further detail.

The moral of the econometricians' story is the need to keep track of the orders of integration on both sides of the regression equation, which usually means incorporating dynamics; models that have restrictive dynamic structures are relatively likely to give misleading inferences simply for reasons of inconsistency of orders of integration. Specificity was clearly the problem with several of the models proposed by the first econometrician. A general-to-specific method of econometric modelling would have overcome many of the problems of spurious inferences and non-standard distributions. An initial model, more general than the one postulated by the second econometrician, of the form, say,

would be more appropriate for inference when weak exogeneity conditions are satisfied.19 Account must be taken of facts (a)-(c) of Section 6.2.7 when conducting such inference; more generally, the example illustrates ways in which the theory of modelling with integrated variables has contributed to improving our understanding of what constitutes good practice in dynamic modelling.

16 Inference of this kind would appear to be faulty, in any case. To consider a rejection of $H_0\colon \beta_1 = 0$ as a reason for accepting any specific alternative is statistically and logically unjustifiable.
17 Or in a form analogous to that given by (44).

18 The F-distribution will apply when looking at tests of joint significance of subsets of regressors, each of which is I(0). In this example, because one of the regressors is I(1) and the other is I(0), the F-statistic will have a non-standard distribution.
19 See Ch. 8 and earlier discussion in this chapter.

6.3. Functional Forms and Transformations

We drew attention in Chapter 1 to the fact that many economic time series will come close to conformity with the integrated models only if a logarithmic transformation is applied. The logarithms of many such series may be integrated, but it seems unlikely that the untransformed levels of macroeconomic time series such as consumption, national income, and the price level could be made stationary by differencing alone. It is worth examining this transformation more closely, along with the effect that it may be expected to have on an equilibrium relationship. If the levels of two series are co-integrated, do we expect the logarithms to be co-integrated also, and vice versa?

Begin by examining a series with a tendency to grow over time subject to stochastic shocks which tend to grow with the underlying series. For example,



where $\varepsilon_t$ has a mean of 1 and is log-normally distributed. A series such as $Y_t$ might describe a number of economic time series, at least in broad outline. Taking the logarithmic transformation of (51) and using lowercase letters to denote the transformed variables with $Y_t > 0$,

where $\log(1 + \gamma) \approx \gamma$ and $e_t = \log(\varepsilon_t)$.

Equation (53) is indeed commonly used as a simple characterization of the logarithms of economic time series. As a description of such a transformed data series, (52) or (53) seems at least admissible; $\Delta y_t$ is the growth rate of the level series $Y_t$, and this growth rate varies around a (typically positive) mean. That this equation could describe the level of the series (so $y_t$ denotes the original data without the logarithmic transformation) seems implausible, however: (53) would then imply that the absolute amount of growth varies around a fixed mean, and therefore that, as the series grows, the average amount of growth falls to zero as a proportion of the series itself. Moreover, $\sigma^2/\operatorname{var}(Y_t)$ would tend to zero, forcing the series to become essentially deterministic in relative terms. This criticism does not apply to (53) since $\sigma$ is a proportion of $Y_t$.

Ermini and Hendry (1991) consider the issue of testing 'logarithms versus levels' by formulating a test based on the encompassing principle. The null model $M_1$ may be said to encompass the rival or alternative model $M_2$ if $M_1$ is able to explain the findings of $M_2$. Alternatively, if the rival model does not adequately characterize the properties of the process generating the series, the null model ought to be able to predict the form of mis-specification one would expect to find if the rival model were estimated.

To pursue the last point, suppose a data series $\{Y_t\}$ is well characterized by a random walk in logarithms with a stable drift and homoskedastic errors. Suppose further that this implies that regressing $\Delta Y_t$ on a constant would yield unstable estimates and heteroskedastic errors. A simple initial test would then be to estimate the random walk in both logarithms and levels and see whether the models displayed the predicted behaviour.20 If the null model also had predictions to offer about the form of the instability of the parameters, the test could be sharpened by testing for the presence of particular kinds of mis-specification (say, drift or variances of errors increasing exponentially over time). In general, the entire argument should also be run in reverse by taking the rival model as the null; however, linear models do not ensure positive observations, so awkward issues arise.

20 The processes corresponding to 'random walk in logarithms' and 'random walk in levels' are $\Delta y_t = \mu_1 + \varepsilon_t$ and $\Delta Y_t = \mu_2 + v_t$, respectively.

We illustrate this discussion with the time series analysed in Chapter 1, namely real net national product ($Y_t$ in 1929 £million) for the United Kingdom over 1872-1975 (from Friedman and Schwartz 1982). The approach follows that in Ermini and Hendry (1991).

First, we model the level of net national product over the sample 1875-1975 by OLS. Only one lagged difference was needed to remove any residual serial correlation, yielding

where the standard errors of coefficient estimates are shown in parentheses, $\hat{\sigma}$ is the equation standard error, and SC is the Schwarz criterion. (Smaller values on balance produce preferable models.) Since the mean of $Y$ is 4701.0, $\hat{\sigma}$ as a percentage of $Y$ is 3.1 per cent. However, the coefficients are not constant over the sample period, as shown in Fig. 6.1 for the intercept, and Fig. 6.2 for the one-step residuals and $\hat{\sigma}$. (See Hendry (1989) for details.)21 The intercept trends upwards, and $\hat{\sigma}$ increases over time, even ignoring the large shock in 1919-20. On any constancy test, the model is rejected at far beyond the 1 per cent level (e.g. that of Hansen 1992).

Next we model growth in logs. As before, one lagged difference removed residual serial correlation, giving

21 Recursive estimation involves estimating an equation over successively larger sub-samples, starting from a minimum sub-sample and extending to the full sample. Parameter instability may be tracked by looking at the behaviour of the estimated coefficients, as sample size is increased, to see whether they fluctuate significantly or remain stable. Recursive Chow (1960) tests may be computed in at least two ways. The first involves estimating the equation from, say, $t = 1$ to $t = T_1$, where $T_1$ is greater than the minimum sample size, and then from $t = 1$ to $t = T_1 + 1$. The one-step-ahead Chow test is based on a comparison of the residual variance of the two estimated equations and is an F-test under the null of parameter constancy. A second test is given by estimating the equation from, say, $t = 1$ to $t = T_1$ and comparing the residual variance of this regression with that of the equation estimated over the full sample. A sequence of these Chow tests is built up by augmenting the sub-sample size by one at each step, e.g. $T_1 + 1$ to $T_1 + 2$, and comparing the residual variance of each of these equations with the full sample residual variance. Alternatively, the sequence of one-step residuals (or forecast errors) can be examined relative to the residual variance at each sample size.

FIG 6.1. Recursive estimates of intercept in levels model

FIG 6.2. One-step residuals in levels model
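The recursive estimation described in the footnote, and the contrast between the levels and logarithmic models, can be sketched on simulated data. The series below is artificial, not the net national product data: it is generated as a random walk in logarithms, and the drift (0.02), error standard deviation (0.03), sample length, and function names are all hypothetical choices made for illustration. The intercept of a regression of the first difference on a constant is re-estimated on each expanding sub-sample, once in levels and once in logarithms.

```python
import numpy as np

def recursive_ols(y, X, min_obs):
    """OLS coefficients re-estimated on observations 1..n, n = min_obs..T."""
    path = []
    for n in range(min_obs, len(y) + 1):
        coef, *_ = np.linalg.lstsq(X[:n], y[:n], rcond=None)
        path.append(coef)
    return np.array(path)

def simulate_log_random_walk(T=200, drift=0.02, sigma=0.03, seed=0):
    """Level series whose logarithm is a random walk with constant drift."""
    rng = np.random.default_rng(seed)
    log_y = np.cumsum(drift + sigma * rng.standard_normal(T))
    return np.exp(log_y)

Y = simulate_log_random_walk()
const = np.ones((len(Y) - 1, 1))

# Intercept of "Delta Y on a constant" (levels) and of
# "Delta log Y on a constant" (logarithms), on expanding samples.
levels_path = recursive_ols(np.diff(Y), const, min_obs=20)[:, 0]
logs_path = recursive_ols(np.diff(np.log(Y)), const, min_obs=20)[:, 0]

if __name__ == "__main__":
    print("levels intercept, first and last:", levels_path[0], levels_path[-1])
    print("logs intercept, first and last:  ", logs_path[0], logs_path[-1])
```

If the logarithmic model is the data-generation process, the recursive intercept in levels should drift upwards as the sample lengthens, whereas the intercept in logarithms should settle near the assumed drift.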



The percentage $\hat{\sigma}$ is 3.3 per cent but now the intercept is constant as shown in Fig. 6.3, and little residual heteroskedasticity remains (see Fig. 6.4). The model fails constancy tests only prior to the large shock in 1919-20.

Ermini and Hendry use results from Ermini and Granger (1991) to describe the particular form of instability and heteroskedasticity one would expect in the model in levels if the data were generated by the logarithmic model. Ermini and Granger show that, if the data are generated by

with time-invariant distribution $\Delta y_t \sim \mathrm{IN}(\mu, \sigma^2)$, and if the rival model is

then $E(\Delta Y_t) =$

0 between which there is a co-integrating relationship in levels:

Defining the transformed series $x_t = \log(X_t)$ and $z_t = \log(Z_t)$, we have

Using a Taylor series expansion of the logarithmic function, we obtain

from which we can see that the terms in the summation will decline in importance as $Z_t$ grows, since by (59) $u_t$ is of fixed variance, while the variance of $Z_t$ is of $O(t)$. Hence we expect to find an equilibrium relation of some sort among the logarithms of variables that are co-integrated in levels. Asymptotically, this equilibrium relation is of a degenerate kind with the distribution of $x_t - z_t$ collapsing around $\log(\beta)$. This is also a testable prediction of the hypothesis that the random walk model in levels encompasses the logarithmic model,22 although the test is likely to have low power because the variance in the errors is likely to persist even in fairly large samples.

Conversely, if we begin with a co-integrating relationship between two series which have already been transformed to logarithms, then the relationship among the levels of the series is

which implies

22 To see this, simply substitute $X_{t-1}$ for $Z_t$. The instability of the random walk model in levels made a formal test in the levels → logarithms direction unnecessary in the Ermini-Hendry discussion, although in principle such a test could be carried out.


FIG 6.5. Recursive estimates of $\hat{\sigma}$

or

This no longer has the form of a standard co-integrating relationship, since $W_t - kV_t = V_t(V_t^{\beta-1}\nu_t - k) = \eta_t$; while $\nu_t$ may remain a stationary process, the error term $\eta_t$ in the new relationship depends on the integrated series $V_t$ and is therefore not stationary in general. No co-integrating relationship may therefore appear, and a regression of the form $W_t = kV_t + \eta_t$ is likely to display considerable instability.

At the same time, it should be noted that, in either of the above examples, only one of the logarithm and the level of a variable will be an integrated process (capable of being made stationary by differencing), although stationarity or non-stationarity will be common to both representations. The standard definition of co-integration, which describes equilibrium relations among integrated processes, can be legitimately applied to only one of the two cases at a time.

The fact remains, however, that a co-integrating relationship among the levels of variables suggests the existence of some linear equilibrium relationship among the logarithms of those same variables. The converse need not in general be true.
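The first half of this argument, that co-integration in levels leaves the gap between the logarithms collapsing around $\log(\beta)$, can be checked numerically. The sketch below is illustrative only: the series are simulated, and the values $\beta = 2$, a starting level of 50, a unit drift, and unit-variance errors are hypothetical choices that keep both series positive so that logarithms exist.

```python
import numpy as np

def log_gap_spread(T=2000, beta=2.0, z0=50.0, drift=1.0, seed=0):
    """Std. dev. of log(X) - log(Z) - log(beta) in the first and last quarter.

    Z is a random walk with positive drift (integrated in levels) and
    X = beta * Z + u with stationary u, so X and Z are co-integrated in levels.
    """
    rng = np.random.default_rng(seed)
    Z = z0 + np.cumsum(drift + rng.standard_normal(T))
    X = beta * Z + rng.standard_normal(T)
    gap = np.log(X) - np.log(Z) - np.log(beta)
    q = T // 4
    return gap[:q].std(), gap[-q:].std()

if __name__ == "__main__":
    early, late = log_gap_spread()
    print(f"spread of log gap: early {early:.5f}, late {late:.5f}")
```

Because the error has fixed variance while $Z_t$ grows, the spread of the log gap in the last quarter of the sample should be much smaller than in the first.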

Appendix: Vector Brownian Motion

Consider the bivariate I(1) data generation process given by:

The DGP in (A1) is a re-parameterization of a general bivariate normal distribution for $(\Delta y_t, \Delta z_t)$ with covariance $\eta\sigma_2^2$ and defines the integrated vector process:

when $x_t = (y_t : z_t)'$ and $v_t = (\varepsilon_{1t} + \eta\varepsilon_{2t}, \varepsilon_{2t})'$. Then $v_t$ has non-unit error variance matrix $\Sigma$:

As in Chapter 1, a suitably scaled function of $x_t$ converges to a vector Brownian motion process, denoted BM($\Sigma$). We first derive the standardized Brownian motion by the transform:

and $s = \sigma_1/\sigma_2$. Then $e_t$ has a unit error variance matrix since:

Alternatively, from (A2) and (A4):

(A6)

Next, using a component by component analysis similar to that in Chapter 3, from (A5):

where $B(r) = (B_1(r), B_2(r))'$ (denoted BM(I)), and the $B_i(r)$ are the standardized Wiener processes associated with accumulating the $\{e_{it}\}$. Further:



These vector formulae are natural generalizations of the scalar Wiener processes in Chapter 3.

Scalar functions of vector I(1) variables can be handled as follows. Consider the distribution of the difference between $y_t$ and $z_t$, namely $u_t = d'x_t$ for $d' = (1, -1)$. Then from (A4):

(A10)

By direct calculation from (A1) however,

and $W(r)$ is the Wiener process associated with $\{w_t/\sigma_w\}$. By definition, $w_t = \varepsilon_{1t} + (\eta - 1)\varepsilon_{2t}$, so that $\sigma_w W(r) = \sigma_1 B_1(r) + (\eta - 1)\sigma_2 B_2(r)$, and hence the expressions in (A10) and (A11) are equal, but provide different insights into the behaviour of the scalar second moment.

Similarly, let $f' = (1, 0)$ so that $f'e_t = \varepsilon_{1t}/\sigma_1$, then we can derive a covariance such as:

Returning to the standardized vector Brownian motion, let $V(r) = (V_1(r), V_2(r))'$ (which is BM($\Sigma$)) be associated with the accumulation of $\{v_t\}$. Now $V_1(r)$ and $V_2(r)$ are not independent since $E(v_{1t}v_{2t}) \neq 0$. The standardized vector Brownian motion is $B(r) = K'V(r)$ where $K'$ is defined in (A4). Multiplying out, we have:

(A13)
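The multiplication can be sketched from the definitions of $v_t$ and of the standardized innovations. Assuming that $\varepsilon_{1t}$ and $\varepsilon_{2t}$ are independent with variances $\sigma_1^2$ and $\sigma_2^2$, and that $K'$ maps $v_t$ into $(\varepsilon_{1t}/\sigma_1, \varepsilon_{2t}/\sigma_2)'$, the components would be as follows. The explicit form of $K'$ written here is an inference from those definitions rather than a quotation of (A4), which may use an equivalent scaling in terms of $s$.

```latex
K' =
\begin{pmatrix}
\sigma_1^{-1} & -\eta\,\sigma_1^{-1} \\
0 & \sigma_2^{-1}
\end{pmatrix},
\qquad
B_1(r) = \frac{V_1(r) - \eta V_2(r)}{\sigma_1},
\qquad
B_2(r) = \frac{V_2(r)}{\sigma_2}.
```

With $\Sigma$ having elements $\sigma_1^2 + \eta^2\sigma_2^2$, $\eta\sigma_2^2$, and $\sigma_2^2$, this choice satisfies $K'\Sigma K = I$, as a standardizing transform requires.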

Indeed, if we condition $v_{1t}$ on $v_{2t}$ (which generates $\varepsilon_{1t}$) and let $V_{1\cdot 2}(r)$ be the associated "conditional" unstandardized Wiener process, then $V_{1\cdot 2}(r)$ and $V_2(r)$ are independent. Because $\varepsilon_{1t} = v_{1t} - E(v_{1t} \mid v_{2t}) = v_{1t} - \eta v_{2t}$, we see that $V_{1\cdot 2}(r) = V_1(r) - \eta V_2(r) = \sigma_1 B_1(r)$ from (A13).

Finally, consider an expression of the form:

Then the error covariance matrix is added on if the cross-product under analysis is a contemporaneous rather than a lagged one (see the appendix to Chapter 7 for an extension). Phillips and Durlauf (1986) and Phillips (1988b) provide proofs and generalizations.

7

Co-integration in Individual Equations

We first examine methods of testing for co-integration via static regressions, and provide simulation estimates of the upper percentage points of the distributions of statistics used in the tests. Next, we look at the properties of the estimators derived from such static regressions. In particular, we focus on the finite-sample biases in the estimates of co-integrating vectors and the powers of tests to detect co-integration. Finally, we consider modified estimators and dynamic models. In Chapter 8, systems methods of estimating co-integrating relations will be considered.

The previous chapter focused on the properties of co-integrated processes and the implications of modelling with co-integrated variables. We have discussed the 'super-consistency' of the coefficient estimates in the static or co-integrating regression, balanced and unbalanced regressions, and the distributions of the statistics commonly used to test for the significance of regression coefficients.

The two issues of being able to test for the existence of an equilibrium relationship among variables and to accurately estimate such a relationship are complementary. Indeed, as demonstrated in discussing spurious regressions in Chapter 3, static regressions among integrated series are meaningful if and only if they involve co-integrated variables. Thus, it is of interest to discover, first, how well the most frequently used tests of co-integration perform, and second, how accurately the corresponding equilibrium relationship is estimated.

The objective of this chapter is to develop tests applicable to single equations which may be used to detect a long-term relationship of the form discussed and exploited in earlier chapters. We also attempt to formulate some recommendations for efficient estimation of co-integrating parameters and testing for co-integration in finite samples. It will become clear from the discussion that the asymptotic properties of static regression estimators are often rather different from their behaviour in empirically relevant sample sizes. Further, lack of weak exogeneity due to co-integrating vectors entering several equations also alters finite sample behaviour. It therefore becomes important, in the face of data limitations, to consider alternative methods which do not rely exclusively on single-equation static regressions. These are the topic of Sections 7-9.

7.1. Estimating a Single Co-integrating Vector

Consider the problem of estimating the single co-integrating vector $\alpha$ using the static model

We conduct the discussion in this and the following sections in three stages. First, we elaborate upon the theorems presented in Chapter 5 and develop an intuitive discussion of static regressions. Next, we proceed to the issue of testing for co-integration using static regressions. The testing and the parameterization of the equilibrium relationship are seen to be complementary exercises. Finally, we discuss simulation studies which cast light on the behaviour, in finite samples, of the static-regression estimators and the powers of the tests for co-integration.

In order to keep the analysis as tractable as possible, we will restrict ourselves to considering CI(1,1) systems. Thus, suppose that all the elements in $x_t$ are I(1). In general, then, any linear combination $\delta'x_t$ of the elements of $x_t$ will produce an I(1) series $u_t$. The only exception, if one exists, is a co-integrating vector $\alpha$ such that $\alpha'x_t$ is I(0).1 Ordinary least squares minimizes the residual variance of $\delta'x_t$, and therefore a simple OLS regression of the form (1) should provide an excellent approximation to the true co-integrating vector when one exists, as discussed in Chapter 5.

The simplicity of this method and the elegance of the theoretical argument help explain the popularity of such regressions. All that is needed to parameterize a long-run equilibrium relationship among a set of variables is a static OLS regression. This regression is performed as the first step of the Engle-Granger two-step estimator2 and serves as a preliminary check on the equilibrium relationships postulated by economic theory to exist among the variables.

However, there are reasons for preferring alternatives to the simple static regression in samples of the size typical in economics. This chapter will consider dynamic regression methods and modified estimators. These techniques help to reduce or eliminate sources of finite-sample biases which arise from static estimation, and which can be very substantial in practice.

1 Initially we focus on the case where (apart from normalization) the co-integrating vector $\alpha$ is unique and is therefore of dimension $n \times 1$. As the analysis in Ch. 5 showed (especially the discussion of the Granger Representation Theorem), this is clearly a restrictive assumption to make. In general, there will exist $r$ co-integrating vectors, $0 \le r \le n - 1$, and when gathered in an array, the matrix $\alpha$ will be of order $n \times r$. The problem of estimating co-integrating vectors in systems is considered in Ch. 8.
2 The two-step estimator and its asymptotic properties are discussed in Ch. 5. The general case is derived by Engle and Granger (1987: 262, Theorem 2).
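The behaviour of the static regression can be explored with a minimal simulation sketch. Everything in it is an assumption made for illustration: a bivariate system in which $z_t$ is a pure random walk, the equilibrium error follows a stationary AR(1) process with coefficient 0.5, and the true co-integrating coefficient is 2; the function name is hypothetical. The regression is the static OLS of $y_t$ on a constant and $z_t$.

```python
import numpy as np

def static_ols_slope(T, beta=2.0, rho=0.5, seed=0):
    """Slope from a static OLS regression of y on a constant and z.

    z is a random walk and y = beta * z + u, where u is a stationary
    AR(1) error, so that (y, z) are co-integrated with coefficient beta.
    """
    rng = np.random.default_rng(seed)
    z = np.cumsum(rng.standard_normal(T))
    e = rng.standard_normal(T)
    u = np.zeros(T)
    for t in range(1, T):
        u[t] = rho * u[t - 1] + e[t]   # autocorrelated equilibrium error
    y = beta * z + u
    X = np.column_stack([np.ones(T), z])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]

if __name__ == "__main__":
    for T in (50, 200, 2000):
        print(T, static_ols_slope(T) - 2.0)   # estimation error at each T
```

Repeating the exercise over many seeds, or letting the error be correlated with the innovations of $z_t$, is one way to examine the finite-sample biases that motivate the dynamic and modified estimators considered later in the chapter.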

7.2. Tests for Co-integration in a Single Equation

The simplest tests for co-integration proposed by Engle and Granger test for the existence of a unit root in the residuals of the static regression. The methods of Chapter 4 can therefore be followed with minor modifications. We first consider the bivariate case, where $x_t = (y_t, z_t)'$.

The modifications are necessary because, while the tests for unit roots discussed in Chapter 4 use the original series, say $\{w_t\}$, the co-integration tests are based on the estimated, or derived, residual series,

Hence, as the co-integrating regression estimates $\beta$ before the test is performed, the co-integration test is not simply a standard test for a unit root in the series $u_t$.

If $\beta$ were known in the example presented in Chapter 5 (given by equations (5.1)-(5.6)), the null hypothesis of no co-integration, corresponding to $\rho$ equal to 1, could be tested by constructing the series $u_t = y_t - \beta z_t$, treating this series as the one that has the unit root under the null, and using the Dickey-Fuller tables. However, if $\beta$ is unknown, it must be estimated (e.g.) from the static regression of $y_t$ on $z_t$. The test is based on the null hypothesis of no co-integration, with the critical values for the test statistics calculated to ensure the appropriate probability of rejection of the null hypothesis.

Some of the most widely used tests of co-integration have been the co-integrating regression Durbin-Watson test (CRDW), the Dickey-Fuller test (DF), and the augmented Dickey-Fuller test (ADF).

The CRDW, suggested by Sargan and Bhargava (1983), is computed in exactly the same fashion as the usual DW statistic and is given by

$$\mathrm{CRDW} = \frac{\sum_{t=2}^{T}(\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=1}^{T}\hat{u}_t^2},$$

where $\hat{u}_t$ denotes the OLS residual from the co-integrating regression.

The null hypothesis being tested, using the CRDW statistic, is of a single unit root: that is, $u_t$ is a random walk. This is to be contrasted with the conventional use made of this statistic in standard regression analysis where the null of no first-order autocorrelation is tested.

The use of this statistic is problematic in the present setting. First, the test statistic for co-integration depends upon the number of regressors in the co-integrating equation and, more generally, on the data-generation process and hence on the precise data matrix. Only bounds on the critical values are available.3 Second, the bounds diverge as the number of regressors is increased, and eventually cease to have any practical value for the purposes of inference. Finally, the statistic assumes the null where $u_t$ is a random walk, and the alternative where $u_t$ is a stationary first-order autoregressive process. In such circumstances, Bhargava (1986) demonstrates that it has excellent power properties relative to alternative tests. However, the tabulated bounds are not correct if there is higher-order residual autocorrelation, as will commonly occur. Exact inference is therefore possible if and only if each regression exercise is augmented by the use of algorithms such as that of Imhof (1961) to compute the relevant critical values. In principle, it is possible for simulation methods to be used to compute the critical values. However, in practice this implies a proliferation of tables of different critical values for different data-generation processes and simulation exercises.

As we have argued previously, the only hope for uncomplicated inference lies in generating a robust set of critical values. Robustness is defined by lack of sensitivity of the critical values to a wide range of changes to the data-generation process. Tests that are similar for a wide range of nuisance parameters would ensure this non-sensitivity.
In other words, it is important to have a set of tables that could be used regardless of the precise properties of the DGP, as long as the regression model is parameterized to satisfy certain basic properties such as balance. Tests of co-integration based not directly on the residuals but on the regression coefficients themselves might have higher power. As an alternative method, one could consider using non-parametric corrections of the sort described in Chapter 4 to conduct inference using only a small set of tables, for a range of possible data-generation processes. Examples of both these procedures will be presented in due course.

3 While the CRDW statistic does not have a limiting distribution with a non-zero variance, $T(\mathrm{CRDW}) = T^{-1}\sum_{t=2}^{T}(\hat{u}_t - \hat{u}_{t-1})^2 / T^{-2}\sum_{t=1}^{T}\hat{u}_t^2$ does.

Similar qualifications apply to the use of the DF statistic and less so to the ADF, if the number of $\Delta u_{t-i}$ terms appearing in the data-generation process coincides with those used in the implementation of the test. Since the number of such terms appearing in the DGP is unknown, it seems safest to over-specify the ADF regression, and use as many lagged terms as degrees-of-freedom restrictions will allow. Of course, in practice, the choice of the lag structure in ADF tests may be ad hoc and different results can be obtained by changing the length of the autoregression. In particular, the power of the test may be affected adversely.

Table 7.1 provides, for illustration (a more detailed description of applicable critical values will be given below), the 5 per cent critical values of the DW, ADF(1), and ADF(4) tests, for three sample sizes ($T = 50, 100, 200$). The data-generation process is an $n$-variate random walk with $n$ less than or equal to 5, as in Engle and Yoo (1987).

It is important to emphasize that, in common with the tests for unit roots, tests for co-integration may lack power to discriminate between unit roots and borderline-stationary processes. In a small-scale study of the power properties of this test, Engle and Granger (1987) show that, when the data-generation process of the disturbances of the co-integrating equation is an AR(1) process with the autoregressive parameter equal to 0.9, the powers of the CRDW, DF, and ADF tests at the 5 per cent critical values are 20, 15, and 11 per cent respectively. When the DGP is altered to be a more general AR(1) process with a unit root, the power of the ADF test becomes 60 per cent, dominating strongly both the powers of the CRDW and DF tests at the 5 per cent level.

Engle and Granger (1987) emphasize the robustness to changes in the data-generation process of the ADF critical values. The discussion in Chapter 4 helps to explain this result. Phillips and Ouliaris (1990) show that the limiting distribution of the ADF test statistic is the same as that of the non-parametrically adjusted DF statistic. Because the limiting distribution of the latter statistic is invariant to nuisance parameters in the processes generating the data series, the result follows. Each test manages to correct for various features that may be present in the DGP, in one case by capturing the effects in a regression model, in the other by implicitly adjusting the critical values.

Phillips and Ouliaris (1990) derive the distributions of several tests of co-integration. We close this section by presenting a summary of the theoretical results presented there. They consider the linear co-integrating regressions:
Becaus e th e limitin g distribution o f th e latte r statisti c i s invarian t t o nuisanc e parameter s i n the processe s generatin g th e dat a series , th e resul t follows . Eac h tes t manages t o correc t fo r variou s features that ma y be presen t i n the DGP , in on e cas e b y capturin g th e effect s i n a regressio n model , i n th e othe r by implicitl y adjusting th e critica l values. Phillips an d Ouliari s (1990 ) deriv e th e distribution s of severa l test s o f co-integration. W e clos e thi s sectio n b y presentin g a summar y o f th e theoretical result s presente d there . The y conside r th e linea r co-integrating regressions :

and

where $y_t$ and $z_t$ satisfy (multivariate) unit-root processes. The asymptotic distributions of a number of residual-based tests are discussed, from which we will consider five (this analysis is of course related to the


TABLE 7.1. Five per cent critical values for the co-integration tests

n    T      CRDW    ADF(1)    ADF(4)
2    50     0.72    -3.43     -3.29
     100    0.38    -3.38     -3.17
     200    0.20    -3.37     -3.25
3    50     0.89    -3.82     -3.75
     100    0.48    -3.76     -3.62
     200    0.25    -3.74     -3.78
4    50     1.05    -4.18     -3.98
     100    0.58    -4.12     -4.02
     200    0.30    -4.11     -4.13
5    50     1.19    -4.51     -4.15
     100    0.68    -4.48     -4.36
     200    0.35    -4.42     -4.43

Source: The CRDW critical values (see Sargan and Bhargava 1983) and the ADF(1) critical values were generated by PC-NAIVE using 10,000 replications. The ADF(4) critical values have been taken from Engle and Yoo (1987). The ADF critical values are computed by replicating the regression $\Delta u_t = \rho u_{t-1} + \sum_{i=1}^{p}$ […]

[Fig. 7.1(a). No constant in model, estimated bias v. sample size, s = 16]
[Fig. 7.1(b). Constant in model, estimated bias v. sample size, s = 16]
[Fig. 7.2(a). No constant in model, estimated bias v. sample size, s = 4]
[Fig. 7.2(b). Constant in model, estimated bias v. sample size, s = 4]
[Fig. 7.3(a). No constant in model, estimated bias v. sample size, s = 1]
[Fig. 7.3(b). Constant in model, estimated bias v. sample size, s = 1]

[…] Figs. 7.1(b)-7.4(b) pertain to static models which do contain constant terms. The figures show the relationship between bias and sample size for four different values of the ratio of standard deviations. The horizontal scale is implicitly $\log_2(T/25)$ so that the four points shown are equidistant. First of all, it is evident that the bias does not decline at rate $T$. For example, in Fig. 7.4(a) ($\sigma_1/\sigma_2 = 0.5$), with $\rho_2 = 0.6$, the bias at $T = 25$ is 0.45, at $T = 50$ is 0.32, at $T = 100$ is 0.21, and at $T = 200$ is 0.13. Thus, an eightfold increase in sample size reduces the bias by a factor of approximately 3.5. As another example, we see in Fig. 7.2(a) ($\sigma_1/\sigma_2 = 4$), with $\rho_2 = 0.6$, the biases at the same set of sample sizes are 0.017, 0.010, 0.005, 0.0026.⁶ Here an eightfold increase in sample size reduces the bias by a factor of 6.5. Using a standard-deviation ratio of 4 again but a value of $\rho_2 = 0.9$, the biases are 0.04, 0.024, 0.014, and 0.008, a fivefold decrease in bias. The rate of decline of the bias is always faster than $T^{1/2}$ but not as fast as $T$ for sample sizes up to 200.

⁶ These numbers are taken from the experimental output rather than read from the figures. The standard error of the smallest of these numbers is roughly $5 \times 10^{-5}$.

Second, the biases increase uniformly in $\rho_2$ and decrease uniformly in $\sigma_1/\sigma_2$. To understand this, we can rewrite (9) and (10) to get

[Fig. 7.4(a). No constant in model, estimated bias v. sample size, s = 0.5]
[Fig. 7.4(b). Constant in model, estimated bias v. sample size, s = 0.5]

Since $\rho_1 = 1$, $\{v_t\}$ is a random walk and therefore asymptotically dominates […] $\Delta z_{t-j}$, and $(y - \gamma z)_{t-k}$ where the values of $i$, $j$, and $k$ will depend upon the nature of the ARIMA process generating $\{y_t\}$ and $\{z_t\}$ […] are contained in the residual $u_t$; when $|\gamma_1| < 1$, $\beta = (\gamma_2 + \gamma_3)/(1 - \gamma_1)$. In general, $u_t$ will be serially correlated. Its long-run variance $\sigma^2$, which appears in the expressions for the Wiener distributional limits of the sample moments, is given by

where

⁷ We are grateful to Tom Rothenberg for pointing out that $R^2$ is a random variable in the present context. However, it remains a useful descriptive statistic.

⁸ The problem of finite sample biases was also demonstrated by Hendry and Neale (1987). Using recursive procedures for OLS estimation, they estimated a bivariate static regression for sample sizes ranging from 40 to 200, considering the bias of the coefficient estimate for each sample size. The results indicated that, even for sample sizes of 200, the long-run coefficient from the static regression was approximately 0.7 while the true long-run coefficient was 1.0. Convergence to the true value was not nearly as fast in practice as $T^{-1}$, which dominates for sufficiently large $T$: see (18) below.

It may then be shown that

Phillips (1986) shows that it is the presence of $\lambda$ in (18) that causes the biases.

⁹ See e.g. the derivation of the ECM representation in Ch. 5 for CI(1, 1) series.

¹⁰ A simple rewriting of equation (10) above, to take account of the structure of the residual autocorrelation, gives us a version of (14a) with the $\gamma_i$ suitably interpreted. Later in this chapter we consider a generalization of (14) and investigate the consequences of using static and dynamic regressions.

A simple way to reduce the biases is to reparameterize the equation in such a way that $\lambda$ is set at zero. Both (15a) and (15b) satisfy this property. For comparison, following Banerjee et al. (1986), we ran a second set of experiments in order to investigate the effects of such re-parameterizations. Using the DGP given by (14a)-(14b), we estimate equation (15a), with a lagged $z$ included as an extra regressor. The dynamic regression equation estimated is therefore

The extra lagged variable, $z_{t-1}$, is included to avoid imposing homogeneity (see Chapter 2), as it would be unrealistic to assume that the investigator knows the precise form of the data-generation process. The co-integrating coefficient is estimated by computing the expression $1 - d/c$: see Sect. 2.4. The static regression given by (16) is also estimated.

The strong exogeneity property required of $z_t$ is guaranteed, in the design of the experiment, by drawing $\varepsilon_{1t}$ and $\varepsilon_{2t}$ from uncorrelated pseudo-normal distributions. The values of $\gamma_i$ ($i = 1, \ldots, 3$) are varied as in Table 7.3, while ensuring that long-run homogeneity is preserved. The sample sizes and the ratio of the standard deviations of $\varepsilon_{1t}$ and $\varepsilon_{2t}$ are also varied, to give a set of 90 experiments. The simulations are all conducted with 5000 replications.

The purpose of the first part of this exercise is to compare the biases in the estimates of the co-integrating parameter obtained from dynamic regression with those obtained from the static regression. (The true value of the co-integrating parameter is 1.) Some of the results for different configurations of the $\gamma_i$ parameters and standard-deviation ratios are given in Table 7.3. We report the estimated biases, for four different sample sizes, in the static model. The corresponding estimated biases from the dynamic regression (where the co-integrating parameter is calculated as $(1 - d/c)$) are in almost all cases so small as to be within 2 Monte Carlo standard errors of zero and so are not reported. We will return to the comparison of these estimators (static and dynamic) below; for the time being, the noteworthy point is simply that substantial biases remain in static estimates for parameter combinations at which the biases in dynamic estimates are zero, or very close to zero, since the dynamic model has been specified so as to make $\lambda$ close to zero.

While the dynamic estimates contain negligible biases in these examples, $z_t$ is strongly exogenous for the parameter of interest. While it is fairly straightforward to extend this specification to include weakly exogenous $z_t$, the usefulness of estimates from dynamic single equations is reduced substantially if the regressors are not weakly exogenous. It also becomes difficult to make unambiguous comparisons between

TABLE 7.3. Biases in static models (a)
DGP: (14a) + (14b); 5000 replications

                                    Sample size (T)
                                    25       50       100      200      400
γ1 = 0.9, γ2 = 0.5, s = 3          -0.39    -0.25    -0.15    -0.07    -0.04
γ1 = 0.9, γ2 = 0.5, s = 1          -0.32    -0.22    -0.14    -0.08    -0.04
γ1 = 0.5, γ2 = 0.1, s = 3          -0.23    -0.13    -0.07    -0.03    -0.02
γ1 = 0.5, γ2 = 0.1, s = 1          -0.21    -0.12    -0.06    -0.03    -0.02

(a) Standard errors of these estimates vary widely, but the estimated biases are in almost all cases significantly different from zero, for sample sizes of 50 or greater. Note that again the biases appear to decline less quickly than $T^{-1}$, but more quickly than $T^{-1/2}$. Calculations were undertaken using GAUSS.

dynamic and static single-equation estimates. We discuss this issue below.

Recalling the discussion in Chapter 5, a test of the null hypothesis $H_0$: $c = 0$, based on the $t$-statistic $t_{c=0}$, is a valid test for co-integration.¹¹ This statistic, under the null of no co-integration, is not asymptotically normally distributed. Therefore a second part of the exercise was used to compute the critical values of the distribution of $t_{c=0}$ and to use these critical values to derive the power of this statistic, for a range of cases, to detect co-integration. This is an example of a test of co-integration based not directly on the residuals, but on a regression coefficient. A power comparison, between a residual-based test and the $t$-test, is given in Table 7.7; but first we use a more general DGP to consider further the issue of finite sample biases.

¹¹ When $y$ and $z$ are not co-integrated, $(y - z)_{t-1}$ is I(1), in which case (19) can only be balanced if $c = 0$. This observation forms the logical basis for a test of co-integration based on $t_{c=0}$. The strong exogeneity of $z_t$ (for the parameters in (14a)) ensures that a test based on estimates from a single equation such as (19) is fully efficient.

7.4.1. General Data-generation Processes

Consider now the comparison of static and dynamic estimates of the long-run multiplier when the time series are derived from a more general DGP. The experiments described above are special cases of this more general DGP. The 'static' estimate of the co-integrating coefficient $\beta$ is called $\hat\beta_s$, while the dynamic estimate is denoted $\hat\beta_d$.

The exogenous variable is generated as

so that $z_t$ can be made either I(0) or I(1) by an appropriate choice of its parameters.

to + ON O> in '—v Ô

Ô

(N rO

ONONÔfN

rH f—•* rH ^H

^H 00 C-4

CN U"j ^ CO

oo

f^^^l

CN •* C4

o o "^

f^^1^

OS

^ ^ '-J

^^^

CN (N i-H

r*]o^Hv.oooorr>oONO ooooooocJoooooocsooooooocJo

N ^

^ ^-

O O O O O O O O O O O O O O O C 5 O O O O O O O O O O O O O O O C 5 OOOOOOOOT-H^Hi-H^-5ooOOOOOC5T-HT-H^H^-H

\O

. O C 5 O O O C 5 O O O O O O O O O O O O O O O O O

,0

Q,

>§ T3 (0

.s '^ OJ

o> 6O

PH

CQ t-l CO

O

>

M 4)

"H3 T3 O

"Q-

" CT

o '§ ca

a >> Q ^ &

7

228

u? C/3

in

O

'-H O

OO O-^ ^*~~s

•^~

MD CN) T—1 ON ^) f^ i^j- CO Tf i—i | 1

m s*-'

CO CN

O O

00 -
^^

ON ON ON ON O

8

CN

7

r~ ON ON oo o 1 1 1 1

8

1

x—V

1

1

1

y—V

1

t- ON

in ON H

3 g> en

^ aj 2

(^

0 -^ 4H

u .an M

C3

O

^ 'fe Q^ & CD TO 43

3

a CD

M) S1

II

.g 3 CD Ô

CS 00

PH

CD

O Q ^ X

^H

g

N

?

CD >H CTJ

6

Q. CD cu 1) ^

§

H

CTj TJ CD "

CD

S »

-^

-g ^

°

Q SH CQ

B

.Si

cd

CD

^ "§

S ^J

+j CD

•SP i3 ^

'||

^ 0

-1 S

*" "CD 0 T3

?%

9 2

CD

J2 '55 x 2

* .£ CD "*—'

g CD

c

CN 1/1 i —1 1

~ ^

s —

CO

in r-~ oo co

7 CN

2

rH

O O

ON

T

o o o oo 00 00 o o o o o O 00 oo oo o o o rH O O o o O O rH T—1 1—I o o o o o o

in ONON in o o o o

o o

in m

o o o o 1 1 1 1

in m

m m o in in o

1

CO

rH

^•vG 0s!--

QQ.

$$\Delta(y_t + 2z_t) = (\rho_2 - 1)(y_{t-1} + 2z_{t-1}) + \varepsilon_{2t}. \qquad (10')$$

The static regression involves estimating an equation of the form

and the DF test is conducted on

where $\hat v_t = y_t - \hat\beta z_t$. The DGP is optimal for the DF test here because (10') has a valid common factor when $E(\varepsilon_{1t}\varepsilon_{2t}) = 0$ (see Hendry and Mizon 1978, and Sargan 1980). Since $\beta = -2$, $v_t = y_t + 2z_t$, so that (10') corresponds to $\Delta v_t = (\rho_2 - 1)v_{t-1} + \varepsilon_{2t}$ and hence $\omega_t$ coincides with $\varepsilon_{2t}$ except for terms involving $(\hat\beta - \beta)z_t$, etc. For this reason, the DGP selected by Engle and Granger (1987) is relatively favourable to the DF test.

By contrast, consider the DGP in (14) with the static regression in (16) and the same form of DF test as in (24):

In this case, $\hat u_t = y_t - \hat\beta z_t$ so that in (25), evaluated at $\hat\beta = \beta$,

hence

In (26), a common-factor restriction is imposed on the dynamics, but this time it is not necessarily a valid representation of (14a). Indeed, since $\beta = 1$ by homogeneity, (14a) can be written as

Comparison with (26) reveals that the new error $[\varepsilon_{1t} + (\gamma_2 - 1)\Delta z_t]$ is white noise, but has a larger variance than that of the error in (14a). Kremers et al. (1992) show that $t_\rho$ in (24) retains the Dickey-Fuller distribution under the null,

[Table 7.7, surviving entries only (a)]

Case                                   T = 25       T = 50       T = 100
(d)  γ1 = 0.5, γ2 = 0.1, s = 3         0.66/0.35    0.99/0.84    1.00/1.00
(e)  γ1 = 0.5, γ2 = 0.1, s = 1         0.79/0.31    1.00/0.80    1.00/1.00
(f)  γ1 = 0.5, γ2 = 0.1, s = 1/3       0.94/0.23    1.00/0.75    1.00/1.00

(a) $s = \sigma_1/\sigma_2$.

in Table 7.7 can be obtained from the following analysis. Neglecting the intercept, the ADF test essentially involves testing $\gamma_1 = 1$ in

where the first step regression of $y_t$ on $z_t$ estimates $\beta$, which here has a population value of unity. Under the alternative, $y_{t-1} - \beta z_{t-1}$ is stationary, and for $\gamma_3 = 1$ the non-centrality of the ADF pseudo $t$-test will be given approximately by

(see Mizon and Hendry 1980), where ASE denotes the coefficient asymptotic standard error calibrated to a sample size of $T$. For given design parameter values, the ASE is easily calculated using PC-NAIVE, and some outcomes are shown below. Similarly, the $t_{c=0}$ test is actually based on testing $\gamma_1 = 1$ in

Since the regressor $y_{t-1} - z_{t-1}$ is stationary under the alternative, if $\gamma_3 + \gamma_2 + \gamma_1 = 1$ is imposed and hence $z_{t-1}$ omitted, the asymptotic non-centrality of the $t$-test of $\gamma_1 = 1$ (again in PC-NAIVE), yields the following illustrative values for $T = 25$:

Case    NC_adf    NC_ecm
(a)     -1.15     -1.19
(b)     -1.15     -1.28
(c)     -1.15     -1.52
(d)     -2.89     -3.25
(e)     -2.89     -3.88
(f)     -2.89     -5.32

In practice, these approximate non-centralities were close to the mean values of the corresponding test statistics in the Monte Carlo, except for (a)-(c) for the ADF, which had a mean of about -2.15 (see (4.28)). Their values help explain both the increasing powers of both tests across the experiments and the relatively better performance of $t_{c=0}$. Compared with the critical values in Table 7.6, and given the sampling standard deviations of the tests of about 0.8 for ADF and 1.0 for $t_{c=0}$, the non-centralities also account for the absolute powers of the tests: when the mean outcome is below the critical value, a power of less than 0.5 usually results; when the mean is more than one standard deviation below the critical value, the resulting power is under 0.2; two standard deviations lower induces a very low power; and so on. Similar arguments apply for deviations of the mean above the critical value.

Overall, there would seem to be some advantage in modelling dynamics less restrictively than by common factors when the latter is a poor approximation. Note that the absence of any contemporaneous effect from $\Delta z_t$ always induces a violation of common factors. Finally, since the long-run parameter is not assumed known in these experiments, the $t_{c=0}$ test procedure is an operational one, and has the same number of parameters here as the ADF test.

The main drawback to such an approach is its dependence on strong exogeneity. Boswijk (1991) proposes a Wald test for co-integration in individual equations when the regressors are not even weakly exogenous. This jointly tests the null for the coefficients of all the lagged levels in a Bårdsen formulation. The resulting test is asymptotically similar and in effect tests for a common factor of unity (see Hendry and Mizon 1978). Boswijk and Franses (1992) investigate the power of this test.
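The non-centrality argument of this section can be mimicked with a back-of-the-envelope calculation. The sketch below is not the PC-NAIVE computation; it assumes a homogeneous DGP of the form $\Delta y_t = \gamma_2 \Delta z_t + (\gamma_1 - 1)(y - z)_{t-1} + \varepsilon_{1t}$ with $\Delta z_t = \varepsilon_{2t}$, and approximates each non-centrality by the population coefficient divided by its asymptotic standard error. The function names and the `var_ratio` argument (taken here to be $\sigma_1^2/\sigma_2^2$) are illustrative assumptions.

```python
import math


def nc_df(gamma1, T):
    """Approximate non-centrality of the DF/ADF t-statistic.

    Assumes w_t = y_t - z_t is a stationary AR(1) with coefficient gamma1,
    so the coefficient (gamma1 - 1) has asymptotic standard error
    sqrt((1 - gamma1**2) / T).
    """
    return (gamma1 - 1.0) * math.sqrt(T / (1.0 - gamma1 ** 2))


def nc_ecm(gamma1, gamma2, var_ratio, T):
    """Approximate non-centrality of the ECM t-statistic.

    The DF regression error is eps1 + (gamma2 - 1) * dz, whose variance
    exceeds that of eps1 alone; the ECM regression keeps dz as a regressor,
    so its non-centrality is larger in absolute value by the factor below.
    """
    inflation = math.sqrt(1.0 + (gamma2 - 1.0) ** 2 / var_ratio)
    return nc_df(gamma1, T) * inflation


if __name__ == "__main__":
    for g1, g2 in [(0.9, 0.5), (0.5, 0.1)]:
        print(g1, g2, round(nc_df(g1, 25), 2), round(nc_ecm(g1, g2, 1.0, 25), 2))
```

Under these assumptions the sketch gives roughly -1.15 and -2.89 for the DF statistic at $T = 25$ and, for a unit ratio, roughly -1.28 and -3.88 for the ECM statistic, which is in line with the illustrative values quoted above. The gap between the two widens as the variance of $\Delta z_t$ grows relative to that of $\varepsilon_{1t}$, which is the sense in which an invalid common-factor restriction costs power.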

7.6. An Empirical Illustration

To illustrate several tests for co-integration in single equations, we return to consider the UK seasonally adjusted quarterly data on money demand. The raw data series were shown in Chapter 1, and we concentrate here on the DW, DF, and ADF tests based on a static regression, and on their comparison with a dynamic regression, which is heavily over-parameterized. In all cases, we assume that there is only one co-integrating vector and that it enters the money-demand model. See Kremers et al. (1992) and Ericsson, Campos, and Tran (1990) for related analyses.

The long-run determinants of the demand for transactions money $M$, as measured by M1, are the price level $P$, real income as measured by constant 1985-price total final expenditure $X_{85}$, and the opportunity cost of holding money measured by $R_n$. (See Hendry and Ericsson (1991b) for details of its calculation.) We assumed a log-linear equation, consonant with price and income homogeneity, given by

where lower-case letters denote logs, $\alpha_1 = 1$ is anticipated, and $\alpha_i > 0$, $i = 1, 2, 3$. Least-squares estimation of the static regression over the sample 1963(I) to 1989(II) yielded

The residuals were then tested for a unit root using the DF and ADF tests, the latter commencing with four lags and testing down. The following results were obtained:

No lagged values of $\Delta u$ proved significant, leading to the DF test:

In no case does any test reject the null of no co-integration, as the $t$-values on the estimated coefficient of $u_{t-1}$ are in the neighbourhood of 2 in both the DF and the ADF regressions. That outcome continues to hold if a trend is added to the basic static-regression model (30), or if price homogeneity is imposed and $\Delta p$ added as a regressor, corresponding to allowing $m$ and $p$ to be I(2), with $(m - p)$ and $\Delta p$ being I(1). In that last case, $R^2$ for real money is equal to only 0.68.

We assume now that $\Delta p$, $x_{85}$, and $R_n$ are weakly exogenous for the parameters in the conditional money demand model. The outcome of estimating a dynamic equation in the levels of the variables with five lags on each of $m - p$, $\Delta p$, $x_{85}$, and $R_n$ (plus a constant) by least squares is shown in Table 7.8.

TABLE 7.8. Empirical results

Variable

Lag 1

0 m— p

-1.000

xss

-0.041 0.115 -0.411 0.117 -0.757 0.210 -0.124 0.169

SE SE

Rn

SE Ap SE CONSTANT SE

0.

3

2

4

5

Sum o f lags

A 0.164 .147 0.549 0.,240 0,,251 0 .152 0,,132 0.,135 0 ,131 0.109 0,,028 0.118 0.087 0,.162 0.293 -0,,067 -0.,240 0,,130 0 .139 0.119 0 .026 0.135 0..139 0..139 -0.361 -0,,122 -0.,046 -0 .084 -0.045 -1.070 0.130 0 .187 0.178 0.,185 0.,176 0 .175 0.069 -1,.102 0.020 0,,307 -0.,412 -0 .329 0 .222 0.255 0,.253 0,,246 0 .246 0.203 - -0.12 4 0 .169

R² = 0.9966   σ = 0.0130   F(23, 76) = 975.38   DW = 1.976   SC = -7.853
Mean = 10.896131   SD = 0.196173
Normality χ²(2) = 4.29   AR 1-5 F[5, 71] = 0.20   ARCH 4 F[4, 68] = 0.22
X_i² F[37, 38] = 0.66   RESET F[1, 75] = 0.98   COMFAC F[15, 76] = 3.14

Tests on the significance of each variable

Variable    F[num., denom.]    Value      Probability    Unit-root t-test
m - p       F[5, 76]           340.201    0.000          -5.168
x85         F[6, 76]           7.801      0.000          6.171
Rn          F[6, 76]           12.127     0.000          -5.719
Δp          F[6, 76]           6.846      0.000          -4.963
CONSTANT    F[1, 76]           0.536      0.466          -0.732

Solved static long-run equation

m - p = 1.102 x85 - 7.278 Rn - 7.493 Δp - 0.842
       (0.112)     (0.528)    (1.482)    (1.230)
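The solved long-run equation can be checked against the 'Sum of lags' column of Table 7.8: each long-run coefficient is minus the sum of the lag coefficients on that regressor divided by the sum for the dependent variable (which enters with -1 at lag 0). The following is a minimal sketch, assuming the sums as read from the table; the sign of the $m - p$ entry is inferred from its negative unit-root $t$-value, and the dictionary keys and function name are illustrative only.

```python
# Sums of lag coefficients, as read from the final column of Table 7.8.
# Assumption: the m-p sum is negative, consistent with its unit-root t-value.
sum_of_lags = {
    "m-p": -0.147,      # dependent variable, normalized with -1 at lag 0
    "x85": 0.162,
    "Rn": -1.070,
    "dp": -1.102,
    "constant": -0.124,
}


def solve_long_run(sums, dependent="m-p"):
    """Return long-run coefficients: -(sum for regressor) / (sum for dependent)."""
    denom = sums[dependent]
    return {name: -value / denom
            for name, value in sums.items() if name != dependent}


long_run = solve_long_run(sum_of_lags)

if __name__ == "__main__":
    for name, coef in long_run.items():
        print(f"{name}: {coef:.3f}")
```

The results agree with the printed long-run coefficients to within the rounding of the three-decimal sums. The same ratio explains why a small standard error on the sum for the dependent variable matters: it is the denominator of every long-run coefficient.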

These dynamic estimates are well behaved: the unit-root $t$-tests are all in the neighbourhood of 5 or larger in absolute value and every regressor matters as a set (i.e. testing all five lags); the solved long run is well defined and compares favourably with (30) since the three economic variables have highly significant coefficients with sensible signs and magnitudes; the goodness of fit is reasonable; and the diagnostic tests of the dynamic specification are all acceptable. Note that the sum of all the lags of the dependent variable, as shown in the final column of Table 7.8, is similar to that found in the DF regression, but has a much smaller standard error.

Only the first lag is strongly significant, as is shown in Table 7.9. Tests of common factors in the lag polynomials using the procedure in Sargan (1980) yield the results in Table 7.10.

Thus, the hypothesis of five common factors can be rejected at any reasonable level of significance. Recalling the discussion in Section 7.5 above, this outcome helps explain why the DF and ADF tests did not reject the null of no co-integration, whereas the dynamic model has done so decisively. Given that the common factor restrictions are rejected, the DF and ADF tests are not well suited to detecting co-integration. The ECM version of this equation, reported in Hendry and Ericsson (1991b), has a $t$-value greater than 10 in absolute value for the ECM coefficient, in a model which parsimoniously encompasses the unrestricted equation fitted above. Thus, the evidence favours rejecting no co-integration, and the results in the next chapter support that claim.

TABLE 7.9. Tests on the significance of each lag

Lag    F[num., denom.]    Value     Probability
5      F[4, 76]           0.691     0.600
4      F[4, 76]           1.615     0.179
3      F[4, 76]           1.654     0.170
2      F[4, 76]           1.416     0.237
1      F[4, 76]           12.967    0.000

TABLE 7.10. COMFAC Wald test statistic summary table

Order    χ² d.f.    Value     Incremental χ² d.f.    Value
1        3          0.086     3                      0.086
2        6          0.196     3                      0.110
3        9          4.176     3                      3.980
4        12         8.101     3                      3.925
5        15         47.128    3                      39.028
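The incremental column of Table 7.10 is simply the first difference of the cumulative Wald statistics, each increment carrying 3 degrees of freedom. A minimal sketch of that arithmetic follows; the variable names are illustrative, and the comparison uses the standard 5 per cent critical value of $\chi^2(3)$, 7.815.

```python
# Cumulative COMFAC Wald statistics for 1 to 5 common factors (Table 7.10).
cumulative = [0.086, 0.196, 4.176, 8.101, 47.128]
CHI2_3_CRIT_5PCT = 7.815  # 5 per cent critical value of chi-square(3)


def incremental(stats):
    """Difference successive cumulative Wald statistics."""
    previous = [0.0] + stats[:-1]
    return [s - p for p, s in zip(previous, stats)]


increments = incremental(cumulative)

# Orders whose incremental statistic exceeds the chi-square(3) critical value.
rejected_orders = [order for order, inc in enumerate(increments, start=1)
                   if inc > CHI2_3_CRIT_5PCT]

if __name__ == "__main__":
    for order, inc in enumerate(increments, start=1):
        print(order, round(inc, 3))
    print("rejected at 5 per cent:", rejected_orders)
```

Only the fifth increment exceeds the critical value, which is the pattern behind the statement that five common factors are rejected while the first four restrictions are individually acceptable.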


7.7. Fully Modified Estimation

This section considers methods for correcting the finite-sample biases in static regressions. Park and Phillips (1988), Phillips and Durlauf (1986), Phillips and Hansen (1990), and Phillips (1988a, 1991) have argued that the performance of estimators of co-integrating vectors based on static regressions is adversely affected by the existence of second-order biases. As shown in the examples below, these biases have no effect on the consistency of the estimators, but result in the asymptotic distributions of scaled estimators, such as $T(\hat\beta - \beta)$ in (31) below, having non-zero means.

Such biases play a potentially important role in finite samples. For example, let the variables $y_{1t}$ and $y_{2t}$ be generated by

When the $\{u_{it}\}$ are autocorrelated and intercorrelated, a static regression of $y_{1t}$ on $y_{2t}$, by not using any information about the process generating $y_{2t}$, provides an estimate of $\beta$ which can be quite severely biased even in fairly large samples. Phillips et al. therefore recommend full-system maximum likelihood estimation of co-integrated systems. As an alternative to estimation of the full system, they propose correcting the single-equation estimates non-parametrically in order to obtain median-unbiased and asymptotically normal estimates. These recommended corrections, for simultaneity bias and residual autocorrelation, use expressions derived from the asymptotic distributions of the estimators although the corrections are made to estimators from finite samples. Phillips and Hansen (1990) show that these corrections work effectively in sample sizes as small as 50.¹⁵ Their example is presented in Section 7.10.4 below.

¹⁵ While it is possible to derive exact expressions for the biases in finite samples to any desired level of accuracy, using Edgeworth-type expansions, this is a complicated procedure.

The estimates obtained from fully modified and full-information methods are asymptotically equivalent. This equivalence is of interest because it links the discussion with a third possible method of reducing finite-sample biases, namely, estimating single-equation dynamic regressions. The aim of the analysis in this section is to compare the non-parametrically corrected estimates (which are also asymptotically efficient and median-unbiased) with estimates obtained from dynamic regressions in either their ADL or ECM forms. The form of the autocorrelation in the error process in (31) and (32) is crucial to this comparison. For some specifications of the error process, a dynamic regression equation implicitly performs the same corrections as those achieved by the non-parametric correction terms. The long-run estimates obtained from this properly specified dynamic equation are then equivalent, asymptotically, to the non-parametrically corrected estimates.¹⁶ In such cases, therefore, two ways of incorporating information about the marginal process (that is, the process generating $y_{2t}$) present themselves: non-parametric correction, or dynamic specification. However, for other specifications of the autocorrelation process a single-equation dynamic regression may fail to achieve efficiency, or eliminate the effects of second-order bias, regardless of the richness of the parameterization, owing to a failure of the conditioning variables to be weakly exogenous for the parameters of the dynamic equation.

Our theoretical discussion is based on Phillips (1988a). Although it is fairly straightforward to describe and categorize the circumstances under which dynamic single-equation estimates will perform well, the detailed theoretical background for this description is lengthy and complex. Readers interested in implementing the non-parametric corrections are referred to the papers by Phillips and his co-authors cited previously. We shall focus on presenting the arguments intuitively and will illustrate the theoretical analysis with two simulation exercises, the first taken from Phillips and Hansen (1990), and the second from Gonzalo (1990).
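The claim that second-order bias shifts the mean of $T(\hat\beta - \beta)$ without affecting consistency can be illustrated by a small simulation. The sketch below is not one of the experiments cited in the text. It assumes a simple DGP, $y_{2t} = y_{2,t-1} + u_{2t}$ and $y_{1t} = \beta y_{2t} + u_{1t}$, in which the errors are only contemporaneously correlated (no serial correlation), and it regresses $y_{1t}$ on $y_{2t}$ without an intercept. The function name, arguments, and seed are illustrative.

```python
import random


def static_ols_bias(T, reps, corr, beta=1.0, seed=0):
    """Monte Carlo mean of T * (beta_hat - beta) from a static regression.

    Assumed DGP (illustrative): y2_t = y2_{t-1} + u2_t, y1_t = beta*y2_t + u1_t,
    with unit error variances and corr(u1_t, u2_t) = corr.
    """
    rng = random.Random(seed)
    scale = (1.0 - corr ** 2) ** 0.5
    total = 0.0
    for _ in range(reps):
        y2 = 0.0
        sxy = 0.0
        sxx = 0.0
        for _ in range(T):
            u2 = rng.gauss(0.0, 1.0)
            u1 = corr * u2 + scale * rng.gauss(0.0, 1.0)
            y2 += u2                      # random-walk regressor
            y1 = beta * y2 + u1           # co-integrating relation
            sxy += y2 * y1
            sxx += y2 * y2
        total += T * (sxy / sxx - beta)   # scaled estimation error
    return total / reps


if __name__ == "__main__":
    print("correlated errors: ", static_ols_bias(50, 500, 0.8))
    print("independent errors:", static_ols_bias(50, 500, 0.0))
```

With independent errors the scaled estimation error should average close to zero, whereas with correlated errors its mean is displaced, which is the simultaneity component of the bias that the non-parametric corrections are designed to remove. Adding serial correlation to the errors would introduce the second component discussed above.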

7.8. A Fully Modifie d Least-square s Estimato r Consider th e data-generatio n proces s give n b y (31 ) an d (32 ) an d disregard, fo r th e moment , th e precis e autocorrelatio n structur e o f u ( = [«],, « 2 f]'• Assum e onl y tha t u ( i s weakly stationary with it s mean vector an d long-ru n covarianc e matri x give n b y [0,0] ' an d S 2 respect ively, wher e i H = {a)y}y = 12 . 17 Th e followin g decompositio n o f th e fl matrix i s usefu l i n understandin g it s structure : Q = V + F + F" , wher e V = £[u 0uo] an d r = 2)/t= i21 and fi)22 are consisten t estimates o f th e correspondin g element s i n th e long-ru n covarianc e matrix, an d A i s a consisten t estimat e o f A . Unde r quit e genera l conditions,

The notatio n BM(12 U 2) i s used t o denot e a bivariat e Brownia n motion process wit h covarianc e matri x S2n. 2 an d i s a matri x generalizatio n o f scalar Wiene r processes, a s discussed i n Chapter 6 . The limitin g distribution (37 ) is a covariance matri x mixture of normals (see Table 3.3). The 'ful l modification ' i n (33 ) achieve s tw o notabl e aims . First , b y taking accoun t o f an y seria l correlatio n i n th e residuals , th e bia s correction ter m 6 + mitigate s th e effect s o f second-orde r bias . Second , the correction s fo r long-ru n simultaneit y i n th e syste m mad e b y usin g yit (i n plac e o f yi t) permi t th e us e o f conventiona l (asymptotic ) procedures fo r inference . Thus , definin g th e full y modifie d standar d error b y s+ where ,

where $\hat{\omega}_{11.2}$ is a consistent estimator of $\omega_{11.2}$, we have the following result:


Phillips and Hansen (1990) show that this approach is asymptotically equivalent to systems procedures such as full maximum likelihood estimation discussed in Chapter 8. Both (38), which simplifies the process of inference, and the reduction in the second-order bias in $\beta^+$ help estimation and testing of single equations in co-integrated systems. Our use of a simple data-generation process is solely for the purposes of exposition; the literature to which we have referred is capable of treating co-integrated systems at a high level of generality.
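The non-parametric ingredient of the fully modified approach is an estimate of the long-run covariance matrix $\Omega = V + \Gamma + \Gamma'$. The sketch below shows one common way of forming such an estimate with triangular (Bartlett) weights. It is an illustration only, not the authors' procedure: the function names `autocov` and `long_run_cov`, the mean-zero treatment of the input, and the tiny deterministic data set are all assumptions made here for exposition.

```python
def autocov(u, k):
    """Sample estimate of E[u_0 u_k'], i.e. (1/T) * sum_t u[t-k] u[t]' (rows assumed mean zero)."""
    T, n = len(u), len(u[0])
    return [[sum(u[t - k][i] * u[t][j] for t in range(k, T)) / T
             for j in range(n)] for i in range(n)]


def long_run_cov(u, lag):
    """Bartlett-weighted estimate of Omega = V + Gamma + Gamma' (hypothetical helper)."""
    n = len(u[0])
    omega = autocov(u, 0)                      # V: contemporaneous covariance
    for k in range(1, lag + 1):
        w = 1.0 - k / (lag + 1.0)              # triangular (Bartlett) weight
        g = autocov(u, k)                      # k-th term of the one-sided sum Gamma
        for i in range(n):
            for j in range(n):
                omega[i][j] += w * (g[i][j] + g[j][i])
    return omega


# Tiny deterministic illustration: the first component alternates in sign,
# so its negative first-order autocovariance pulls the long-run variance below V.
u = [(1.0, 0.0), (-1.0, 0.0), (1.0, 0.0), (-1.0, 0.0)]
omega_hat = long_run_cov(u, lag=1)
```

With `lag=0` the function returns the contemporaneous matrix $V$ alone; a positive lag length adds the weighted autocovariance terms that a static regression ignores.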

7.9. Dynamic Specification

Is it possible, by suitable dynamic specification alone, to make the same corrections as those made by the technique described above? In order to answer this question, Phillips (1988a) considers a dynamic version of equation (31):

$y_{1t} = \beta y_{2t} + \gamma' x_t + \eta_t$ (39)

where $x_t$ is a vector with jointly stationary elements. Thus, $x_t$ contains lagged values of $\Delta y_{1t}$ and current and lagged values of $\Delta y_{2t}$. While far from being a general dynamic model, (39) is a linear-in-parameters ADL model.

The process of constructing a regression equation such as (39) has been extensively discussed in the literature (see, in particular, Engle et al. 1983). Thus, focusing on the DGP given by (31) and (32) and imposing no restrictions upon the autocorrelation structure of the $u_{it}$,

where $\mathcal{F}_{t-1}$ is the information set containing information on past realizations of $y_{1t}$, $y_{2t}$ and hence of $u_{i,t-j}$, $i = 1, 2$; $j \geq 1$. By construction, $\{\eta_t\}$ in (40) is a martingale difference sequence. If the process generating $u_t$ is now specialized to the case where it is a linear process, so that

where

The variance of $\eta_t$ is given by $\sigma_{11.2} = \sigma_{11} - \sigma_{21}^2\sigma_{22}^{-1}$, and $\eta_t$ is orthogonal to $\varepsilon_{2t}$ as well as to the entire history of $\varepsilon$, given by $(\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots)$.¹⁸

Note that […], $d_2(L) = \sum_{j=0}^{\infty} d_{2j}L^j$, and $\nu_t \sim \mathrm{IN}(0, \sigma_{11.2})$, which is independent of the regressors. It is then possible to show that

(45)

where $\hat{\beta}$ is the estimate of the coefficient of $y_{2t}$ in (44). $B_\nu(r)$ and $B_2(r)$ comprise a bivariate Brownian motion process with a well-defined variance-covariance matrix.

The question posed at the beginning of this sub-section can now be answered. Comparing (37) and (45), the fully modified estimator $\beta^+$ and the dynamic single-equation least-squares estimator are equivalent if and only if $B_\nu(r) = B_{1.2}(r)$. These two Brownian motion processes are not necessarily equal to each other. This is because $B_\nu(r)$ can be correlated with $B_2(r)$, despite its construction in (40). The generating mechanism for $u_{2t}$ may therefore be informative, and optimal inference then requires joint estimation with the error-correction model. Phillips (1988) describes this as a failure of weak exogeneity or valid conditioning. If, on the other hand, $B_\nu(r)$ and $B_2(r)$ are uncorrelated at all frequencies, the conditional process is completely informative for the purposes of estimation of $\beta$ and the marginal process generating $u_{2t}$ may be ignored. In such a case, $B_\nu(r) = B_{1.2}(r)$.
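The favourable case, in which the marginal process may be ignored, can be illustrated with a small simulation. The sketch below is not taken from the text: it assumes IID normal errors with unit variances and contemporaneous covariance 0.8, $\beta = 2$, $T = 2000$, no serial correlation, and a hypothetical least-squares helper `ols`. Under these assumptions the static regression of $y_{1t}$ on $y_{2t}$ can be compared with a regression augmented by $\Delta y_{2t}$, whose coefficient should be close to $\sigma_{12}\sigma_{22}^{-1} = 0.8$.

```python
import random


def ols(X, y):
    """Least-squares coefficients obtained by solving the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for c in range(k):                                   # Gauss-Jordan elimination
        p = max(range(c, k), key=lambda i: abs(a[i][c]))  # partial pivoting
        a[c], a[p] = a[p], a[c]
        for i in range(k):
            if i != c:
                f = a[i][c] / a[c][c]
                a[i] = [x - f * z for x, z in zip(a[i], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]


def simulate(T, beta, s12, seed):
    """y1 = beta*y2 + u1, dy2 = u2; (u1, u2) IID normal, unit variances, covariance s12."""
    rng = random.Random(seed)
    y1, y2, dy2, level = [], [], [], 0.0
    for _ in range(T):
        e2 = rng.gauss(0.0, 1.0)
        e1 = s12 * e2 + (1.0 - s12 ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        level += e2                     # y2 is a random walk
        y2.append(level)
        dy2.append(e2)
        y1.append(beta * level + e1)
    return y1, y2, dy2


y1, y2, dy2 = simulate(T=2000, beta=2.0, s12=0.8, seed=1)
b_static = ols([[v] for v in y2], y1)[0]                      # static regression
b_dynamic, gamma_hat = ols([[v, d] for v, d in zip(y2, dy2)], y1)  # augmented by dy2
```

Both estimates of $\beta$ are super-consistent in this design; the point of the augmentation is that the added regressor absorbs the contemporaneous correlation between the two errors, which is the simplest instance of conditioning on a weakly exogenous variable.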


The examples following this sub-section will elaborate upon these conditions, but we will close this section with an interpretation. The non-equivalence of the dynamic regression estimator and the fully modified estimator arises from possible correlation between the residuals $\eta_t$ of the conditional process and the residuals $u_{2t}$ of the marginal process. This correlation arises because, although $\eta_t$ is orthogonal to $u_{2t}$ and the past history of $u_{2t}$ ($\eta_t$ is orthogonal to its own past by construction), $u_{2t}$ is not necessarily orthogonal to the past of $u_{1t}$ and hence $(\eta_t, u_{2t})'$ jointly is not a martingale difference sequence (MDS).

Three examples are presented below. They are adapted from Phillips (1988a) and are special cases of the examples appearing in that paper. Three different specifications of the autocorrelation structure of the $u_t$ process are considered while the data-generation process continues to be (31) and (32).

The examples help to integrate and interpret the discussions on weak exogeneity, dynamic modelling, and fully modified estimation. Exogeneity plays an important role in dealing with non-stationary variables. Dynamic regression equations in which the conditioning is on weakly or strongly exogenous variables (for the parameters of interest) provide asymptotically unbiased estimates. Further, inference may be conducted with standard tables. In cases where such conditioning is not possible, improperly conditioned equations lead to inefficient and biased estimates. The full system must therefore be estimated or the non-parametrically modified estimates used. It is seen that fully modified estimation is another way of addressing the issue of the completeness of conditional models for purposes of estimation and inference.

7.10. Examples

7.10.1. Example (Phillips 1988a: 352)

In reduced form, the DGP (31) and (32) is given by

Hence


Thus, using the formula for the conditional expectation of bivariate normal random variables, we have

Defining

and using (48), we obtain

or, alternatively,

where

Finally, substituting for […]

Several features are now evident. By construction, $\eta_t$ is an MDS. Second, again by construction, $\eta_t$ is uncorrelated with $u_{2t}$.¹⁹ From (47), we have that the $u_{2t}$ process is serially uncorrelated both with past $u_{2t}$ and with past $u_{1t}$. It follows that $\eta_t$ and $u_{2t}$ are incoherent (that is, uncorrelated at all lags or frequencies), that the long-run covariance matrix of $[\eta_t, u_{2t}]'$ is diagonal, and that the estimation of a single dynamic equation should provide a fully efficient and unbiased estimate of the vector $\alpha$.

Looking at the conditional and marginal processes given by (50) and the second equation in (46) respectively, and at the properties identified in the previous paragraph, single-equation least squares on (50) is equivalent to full-information maximum likelihood for estimating $\beta$. The orthogonality of the $\eta_t$ and $u_{2t}$ processes ensures that the joint likelihood function for the system factorizes into the likelihood functions for the marginal and conditional models given by the second equation in (46) and (50) respectively. There are no cross-equation restrictions; the parameter of interest $\beta$ can be estimated and identified from (50) alone; and, recalling the discussion of weak exogeneity in Chapter 1, the marginal process generating $u_{2t}$ need not be modelled when estimating $\beta$.

7.10.2. Example (Phillips 1988a: 355)

where,



Then

The long-run covariance matrix of $(\eta_t, u_{2t})'$ is given by

where $\sigma_{11.2} = \sigma_{11} - \sigma_{12}^2\sigma_{22}^{-1}$. The expression for $\Omega_{11.2}$ follows from application of the conditional-expectations formula and from inspection of (53). $\eta_t$ and $u_{2t}$ are again incoherent, and the limit Brownian motions are

where $B_\eta$ and $B_2$ are independent and $B_\eta = B_{1.2}$. Thus, estimating a dynamic single-equation model (the conditional model) provides estimates identical, asymptotically, to those provided by the Phillips-Hansen procedure. Here the conditional model is given by

In error-correction format, we may rewrite (54) as

Equation (54) is the one that must be estimated in order to obtain an asymptotically unbiased estimator of $\beta$. The static regression is augmented in (54) by the terms $\Delta y_{2t}$ and $\Delta y_{2,t-1}$. These additional terms are incorporated to reduce or eliminate, in finite samples, the effects of second-order bias, without estimating the full system.

Phillips (1988a) notes that the bias correction term $\delta^+$ for this example is equal to zero since $\Delta = (\omega_{12}, \omega_{22})'$. However, to obtain fully modified estimates, from (34) $y_{1t}$ needs to be corrected for long-run endogeneity as follows:

The same correction is achieved in the dynamic regression by the two $\Delta y_{2,t-j}$ terms in (55). The static regression produces biases by ignoring these corrections.

7.10.3. Example (Phillips 1988a: 356)


We take the process $(\varepsilon_{1t}, \varepsilon_{2t})'$ to be distributed as in Section 7.10.2. Then it may be shown that

The long-run covariance matrix is given by

where $\sigma_{11.2}$ is as defined in Section 7.10.2, and

The Brownian motions $B_\eta$ and $B_2$ are correlated and the single-equation dynamic estimator and the fully modified estimator are no longer equivalent, unless $\theta_{21} = 0$. For the structure of the correlation between $B_\eta$ and $B_2$ (see Phillips 1988a):

where $B_{\eta.2}(r)$ is a univariate Brownian motion process with variance given by $\sigma_{11.2} - {}$[…] and is independent of $B_2(r)$. Further,

From (58), setting $\theta_{21}$ equal to zero makes the $B_\eta(r)$ and $B_{\eta.2}(r)$ processes equivalent to each other. Further, $B_{\eta.2}(r)$ has a variance of $\sigma_{11.2}$ and is in all respects equivalent to the $B_{1.2}(r)$ process given in (37) above. Thus, the $B_\eta(r)$ and $B_{1.2}(r)$ processes are equivalent, and, in accordance with the previous discussion, this equivalence leads to the equivalence of the single-equation dynamic estimator and the fully modified estimator.

It should be noted that $\theta_{21} \neq 0$ also implies that the $\Gamma$-type terms (see Section 7.8) are important in the long-run variance matrix for the $(\eta_t, u_{2t})'$ process. This is just another way of saying that the past of the process is important (and so, in the $(\eta_t, u_{2t})'$ construction we have not achieved a martingale difference sequence). Thus, the equivalence of dynamic single-equation estimators and fully modified estimators may also be assessed by looking for the presence of $\Gamma$-type terms in the long-run variance matrix. These are the terms (for example, the first term in (59)) that give rise to biases in the single-equation dynamic estimates of the co-integrating vector.

The necessary and sufficient condition for non-equivalence has a natural interpretation in the language of an earlier literature on dynamic



modelling. It is evident that the condition $\theta_{21} \neq 0$ violates weak exogeneity²⁰ as may be verified from (57); and once again, it may be seen that the issues of fully modified estimation and dynamic specification are closely related. This example forms the basis for the simulation exercise discussed in the final sub-section.

7.10.4. Simulation Example (Phillips and Hansen 1990: 116)

The data-generation process for their simulation study is given by

The design of the experiment consisted in allowing $\sigma_{21}$ and $\theta_{21}$ to vary. Thus, four values of $\sigma_{21}$ and three values of $\theta_{21}$ were used. The values of $\sigma_{21}$ considered were -0.8, -0.4, 0.4, and 0.8, and the three values of the moving-average parameter $\theta_{21}$ were 0.8, 0.4, and 0.0.²¹ $\beta$ was set equal to 2 for all twelve combinations of the values of $\sigma_{21}$ and $\theta_{21}$. The aim was to calculate and compare the distributions of estimators and t-statistics for the co-integrating parameter obtained by OLS, single-equation dynamic, and fully modified methods.

For the fully modified method, Phillips and Hansen used a Bartlett triangular window of lag length 5 and the OLS residuals $\hat{u}_{1t}$ to calculate non-parametric estimates of $\Delta$, $\Omega$ and hence of $\delta^+$. We shall denote these estimates by $\hat{\Delta}$, $\hat{\Omega}$, and $\hat{\delta}^+$. The OLS t-statistic was estimated by using $\hat{\Omega}_{11}$ (the (1,1) element from the non-parametrically estimated long-run variance matrix) as an estimate of the standard error. The dynamic equation regressed $y_{1t}$ on ($y_{2t}$, $\Delta y_{2t}$, $\Delta y_{2,t-1}$, $\Delta y_{2,t-2}$, $\Delta y_{1,t-1}$, $\Delta y_{1,t-2}$), using 30,000 replications for each simulation (that is, for each pair of values of ($\theta_{21}$, $\sigma_{21}$)) […] $\theta_{21} = 0$, the dynamic t-statistic is substantially less biased (in all but one case) than the FM t-statistic, but its variance is much higher.

Since the use of the normal distribution is a considerable simplification and the bias comparisons are at best ambiguous for the dynamic estimates (when $\theta_{21} \neq 0$), there may be reasons to prefer the FM estimator over the D estimator when only long-run parameters are of interest. This recommendation must be qualified by noting that a more richly parameterized dynamic model may have provided lower biases and a distribution of the t-statistic closer to the normal distribution. Performance with a negative MA parameter is also important; some early studies have suggested that the FM estimator performs less well in such cases. Both these qualifications point to the need for more extensive simulation studies.

What is clear from all the studies considered so far is the poor performance of unmodified estimates derived from static regressions. Some form of incorporation of the dynamic structure of the data-generation process, either by means of a non-parametric correction of the static regression estimates or by running dynamic regressions, is

²² Phillips and Hansen rationalize this behaviour by stating that 'when this condition [$\theta_{21} = 0$] does hold, the parametric nature of the [dynamic] method gives it a natural advantage over our semi-parametric approach' (1990: 119).



TABLE 7.12. Mean (standard deviation) of […]

                         $\theta_{21} = 0.8$   $\theta_{21} = 0.4$   $\theta_{21} = 0.0$

$\sigma_{21} = -0.8$
  OLS                    -1.616 (1.268)        -1.240 (1.105)        -0.930 (1.00)
  D                      -1.259 (2.040)        -0.563 (1.701)        -0.003 (1.40)
  FM                     -0.388 (1.432)        -0.449 (1.092)        -0.025 (0.896)

$\sigma_{21} = -0.4$
  OLS                    -1.156 (1.32)         -0.986 (1.25)         -0.754 (1.149)
  D                      -1.058 (1.69)         -0.636 (1.57)         -0.163 (1.388)
  FM                     -0.729 (1.49)         -0.516 (1.35)         -0.335 (1.193)

$\sigma_{21} = 0.4$
  OLS                    -0.711 (1.19)         -0.520 (1.21)         -0.267 (1.24)
  D                      -0.664 (1.29)         -0.478 (1.34)         -0.213 (1.37)
  FM                     -0.606 (1.26)         -0.267 (1.30)          0.096 (1.36)

$\sigma_{21} = 0.8$
  OLS                    -0.575 (0.955)        -0.302 (0.979)        -0.098 (1.04)
  D                      -0.445 (1.15)         -0.339 (1.25)         -0.184 (1.36)
  FM                     -0.519 (0.922)        -0.102 (0.962)         0.418 (1.12)

Reproduced from Phillips and Hansen (1990).

necessary for inference. While super-consistency theorems show that I(0) terms may be ignored asymptotically in regressions with I(1) variables, these asymptotic results have little bearing on sample sizes common in econometrics, where I(0) terms are important and need to be accommodated.

The other important issue raised by these examples is the weak exogeneity of the conditioning variables for the parameters of interest. Reconsider the DGP in (31) and (32) where $u_t$ is a first-order autoregressive process, so that a finite lag length dynamic model is valid:

where

Then

or


in terms of I(0) variables. Let $E[\varepsilon_{1t} \mid \varepsilon_{2t}] = \sigma_{12}\sigma_{22}^{-1}\varepsilon_{2t} = \gamma\varepsilon_{2t}$ so […]

Further, assume that $\theta = (\beta^* : \alpha : \beta : \delta)'$ denotes the parameters of interest, and indeed that $\theta$ is both constant and invariant to regime shifts affecting $\Delta y_{2t}$. Nevertheless, although (61) appears to define a valid conditional model for all values of $\theta$, if $c_{21} \neq 0$ then $\Delta y_{2t}$ is not weakly exogenous for $\theta$. Because of the resulting non-diagonality of the long-run covariance matrix, this loss of weak exogeneity can have a detrimental impact on the bias and efficiency of the least-squares estimator of $\theta$ in finite samples.

In fact, $c_{21} \neq 0$ jointly violates the weak and strong exogeneity of $y_{2t}$ for $\theta$. To sort out which aspect is dominant, three cases merit comment: the following implications are based on Monte Carlo studies of (61). First, even if $\gamma = 0$, so that there is no simultaneity and $\beta^* = \beta$, the previous conclusion holds. Second, if $\gamma \neq 0$ whereas $c_{21} = 0$, $y_{2t}$ is strongly exogenous for $\theta$ and no problems result. Finally, if strong exogeneity alone is violated, but weak exogeneity holds, as would happen if $\Delta y_{1,t-1}$ directly affected $\Delta y_{2t}$ when $c_{21} = 0$, there are again no serious bias effects. Thus, the presence of the co-integrating vector in another equation appears to be the primary determinant of the finite-sample bias. Consequently, co-integration forces a renewed emphasis on systems methods if potentially misleading inferences are to be avoided. That is the focus of Chapter 8.

Appendix: Covariance Matrices

Consider the DGP in (A1) where $y_t$ is the stationary first-order vector autoregressive process:

$y_t = A y_{t-1} + \varepsilon_t$ where $\varepsilon_t \sim \mathrm{IN}(0, \Sigma)$, (A1)

and all the latent roots of $A$ lie inside the unit circle. There are three distinct covariance matrices relevant to the analysis, as follows, noting that $E(y_t) = 0$.

(a) The conditional (or contemporaneous) covariance matrix


(b) The unconditional covariance matrix

obtained as shown by substituting (A1) for $y_t$, multiplying out, and using stationarity. The elements of $G$ can be obtained by vectoring (A3) and solving.

(c) The long-run covariance matrix

Consider the finite sample expression, analogous to $E[T^{-1}S_T^2]$ in the scalar case:

Rewriting $\Omega$ as $(I - A)(I - A)^{-1}G + \Lambda + \Lambda' + G(I - A')^{-1}(I - A') - G$, on simplifying we have that:

However, a more convenient form of $\Omega$, directly related to the spectral density at the origin, results from (A3):

(A5)

so that on pre-multiplying $\Sigma$ by $(I - A)^{-1}$ and post-multiplying by $(I - A')^{-1}$ and using (A4):



Similar principles apply to deriving these three matrices in more general weakly stationary processes. As a second example, if (A1) is altered to the first-order moving average:

then, using $\mathcal{J}_{t-1}$ to denote available information:

and:

(A10)

Following Phillips and Durlauf (1986), consider a general I(1) vector process:

and $v_t$ is a weakly stationary stochastic process with unconditional covariance $E(v_t v_t') = G$ and long-run covariance $\Omega = G + \Lambda + \Lambda'$. From (A4), $\Lambda$ can be written as:

Extending the analysis in Chapter 3 to allow for vector processes, and in the appendix to Chapter 6 to allow for non-IID errors, $x_T/\sqrt{T}$ converges to the vector Brownian motion $BM(\Omega)$:

Then:

These vector formulae could be standardized using $V(r) = K'B(r)$ where $\Omega^{-1} = KK'$.
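For the first-order vector autoregression (A1), the three covariance matrices of this appendix can be checked numerically. The sketch below uses illustrative values of $A$ and $\Sigma$ that are assumptions, not values from the text, and takes $\Lambda$ to be the one-sided sum $\sum_{j \geq 1} A^j G$ (the other one-sided sum is its transpose, so $\Lambda + \Lambda'$ is unaffected by the choice). It obtains $G$ by iterating $G = AGA' + \Sigma$ to its fixed point rather than by vectoring, and then compares $G + \Lambda + \Lambda'$ with $(I - A)^{-1}\Sigma(I - A')^{-1}$.

```python
def matmul(a, b):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def add(a, b, sign=1.0):
    """a + sign * b, element by element."""
    return [[a[i][j] + sign * b[i][j] for j in range(2)] for i in range(2)]


def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


I2 = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.5, 0.1], [0.2, 0.3]]   # illustrative; latent roots (about 0.57 and 0.23) lie inside the unit circle
S = [[1.0, 0.3], [0.3, 2.0]]   # (a) conditional covariance matrix Sigma (illustrative)

# (b) unconditional covariance: iterate G <- A G A' + Sigma until it settles.
G = S
for _ in range(500):
    G = add(matmul(matmul(A, G), transpose(A)), S)

# (c) long-run covariance, computed in two ways.
IA_inv = inverse(add(I2, A, -1.0))                            # (I - A)^{-1}
Lam = matmul(matmul(A, IA_inv), G)                            # sum over j >= 1 of A^j G
omega_sum = add(add(G, Lam), transpose(Lam))                  # G + Lambda + Lambda'
omega_closed = matmul(matmul(IA_inv, S), transpose(IA_inv))   # (I - A)^{-1} Sigma (I - A')^{-1}
```

The two expressions for the long-run matrix agree up to rounding error, which is the content of the simplification carried out in part (c).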

8

Co-integration in Systems of Equations

We have so far considered only single-equation estimation and testing. While the estimation of single equations is convenient and often efficient, for some purposes only estimation of a system provides sufficient information. This is true, for example, when we consider the estimation of multiple co-integrating vectors, and inference about the number of such vectors. Traditionally, systems have been estimated when there is a failure of weak exogeneity in a single equation, and these considerations also apply here. This chapter examines methods of finding the co-integrating rank, considers circumstances when dynamic single-equation methods will be asymptotically equivalent to systems methods, and provides examples to illustrate these issues. Asymptotic distributions are also derived.

In earlier chapters, we investigated data series containing unit roots in their scalar autoregressive representations (i.e. their marginal distributions), and denoted such series as I(1). In this chapter we will consider a vector time series of dimension $n$, $x_t = (x_{1t}, x_{2t}, \ldots, x_{nt})'$ (generalizing the analysis to any number of variables), where $x_t$ is I(1) so that $\Delta x_t$ is I(0). Generally, any arbitrary linear combination of the elements of $x_t$, say $w_t = \alpha' x_t$, will also be I(1), and such linear combinations imply or give rise to spurious regressions. However, there may exist vectors $\alpha_i$ such that

in which case the relevant components of $x_t$ are co-integrated.

In the simplest bivariate case, as we have seen, we may take $x_t = (y_t, z_t)'$, where $y_t$ and $z_t$ are individually I(1). The arbitrary linear combination $(y_t - \kappa z_t)$ will also be I(1), but if there exists a value $\kappa_1$ of $\kappa$ such that $(y_t - \kappa_1 z_t) \sim$ I(0), then $y_t$ and $z_t$ are co-integrated. Letting $\alpha_1' = (1, -\kappa_1)$ be the co-integrating vector in this case, $\alpha_1$ must be unique, since for any other value $\kappa^*$, $y_t - \kappa^* z_t = y_t - \kappa_1 z_t + (\kappa_1 - \kappa^*)z_t = w_t + (\kappa_1 - \kappa^*)z_t$, which is the sum of an I(0) process and an I(1) process, and therefore I(1) unless $\kappa_1 = \kappa^*$.
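The uniqueness argument for the bivariate case can be seen in simulated data. The sketch below is only an illustration under assumptions made here: a Gaussian random walk for $z_t$, $\kappa_1 = 1.5$, unit-variance noise, $T = 2000$, and the hypothetical helper names `variance` and `combo_variance`. It compares the sample variance of $y_t - \kappa z_t$ at $\kappa = \kappa_1$ with that at a nearby value, where the term $(\kappa_1 - \kappa^*)z_t$ brings the stochastic trend back in.

```python
import random


def variance(x):
    """Sample variance about the sample mean."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)


rng = random.Random(0)
T, kappa1 = 2000, 1.5            # illustrative sample size and co-integrating coefficient
z, y, level = [], [], 0.0
for _ in range(T):
    level += rng.gauss(0.0, 1.0)                      # z_t is a random walk, hence I(1)
    z.append(level)
    y.append(kappa1 * level + rng.gauss(0.0, 1.0))    # y_t - kappa1 * z_t is I(0)


def combo_variance(kappa):
    """Sample variance of the linear combination y_t - kappa * z_t."""
    return variance([yt - kappa * zt for yt, zt in zip(y, z)])


var_at_kappa1 = combo_variance(kappa1)        # stationary combination
var_off = combo_variance(kappa1 - 0.5)        # contains 0.5 * z_t, so inherits the trend
```

At $\kappa_1$ the variance stays near that of the stationary noise, whereas at any other value it grows with the sample size because the combination is again I(1).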


For $n$ elements in $x_t \sim$ I(1), there can be, at most, $n - 1$ co-integrating combinations.¹ Hence $0 \leq r \leq n - 1$ and the $r$ vectors may be gathered in an $n \times r$ matrix $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_r]$. Outside the bivariate model, $n > 2$ and the co-integrating matrix is no longer unique in the absence of prior information. We noted in Chapter 2 the related issue for stationary equilibria, only some of which need correspond to substantive economic hypotheses.

A simple case of non-uniqueness occurs when subsets of the $x_{jt}$ are co-integrated. In fact, for any non-singular $r \times r$ matrix $F$, $w_t^* = F\alpha' x_t = \alpha^{*\prime} x_t$ is also I(0). This last result shows that linear combinations of the co-integrating vectors themselves form co-integrating combinations. Since $\alpha_i' x_t$ and $\alpha_j' x_t$ are I(0), so is any linear combination thereof. In the terminology of linear algebra, the dimension of the co-integrating space (given by the rank of the matrix $\alpha$) is $r$ and the columns of $\alpha$ form the basis vectors of this space. Pre-multiplying $\alpha'$ by an $r \times r$ non-singular matrix $F$ does not alter either the co-integrating space or its dimensions. Therefore, strictly speaking, estimating the co-integrating matrix $\alpha$ essentially involves deriving the basis vectors. The matrix $\alpha$ is non-unique in the absence of prior information.

A brief justification may be offered for focusing on the open interval $(0, n)$ of $\mathbb{N}$, as the domain of values for $r$. When $r = n$, $x_t$ must be I(0), as shown in Section 8.1 below. We therefore exclude this case when we know that $x_t$ is I(1) and only consider stochastic processes where variables are marginally I(1). Thus, $n - r > 0$, and we can re-express the process $\{x_t\}$ in terms of I(0) processes, using the $r$ co-integrating relationships and $n - r$ first differences of the process. The case of $r = 0$ is a trivial one as it implies the absence of even a single co-integrating vector and suggests respecification of the system in differences.

As we saw in Chapter 5, Engle and Granger (1987) established an isomorphism between co-integration and error-correction models. In order to examine co-integration in systems of equations, we will derive that result, formulating the system in ECM form, in some detail below, starting this time from the moving-average representation of the process. From that system, a maximum likelihood estimator (MLE) of $r$, the number of co-integrating relationships, will be obtained based on a method proposed by Johansen (1988). This will in turn enable us to test hypotheses concerning the dimension of the co-integration space, and establish a 'central value' of $\alpha$. A proof of this result is given in Sect. 8.1.


8.1. Co-integration and Error Correction

We now return to the representation of a co-integrated system in autoregressive or (equivalently) in error-correction form. When $\{\Delta x_t\}$ is a stationary process (possibly) with drift, we can express it as a multivariate moving average using the Wold (1954) decomposition theorem:

where $\varepsilon_t \sim \mathrm{IID}(0, \Omega)$; $L$ is again the lag operator, and $C(L)$ is a polynomial matrix in $L$ given by

The cumulative or total effect from $C(L)$ is given by

where the $C_j$ again obey an exponential decay condition of the form discussed in Chapter 5. Using $C(1)$, we can rewrite $C(L)$ as

where $C^*(L) = \sum_{i=0}^{\infty} C_i^* L^i$ and $C_i^* = -\sum_{j=i+1}^{\infty} C_j$ so that $C_0^* = I_n - C(1)$. Note that the existence of these matrices is again guaranteed by the exponential decay condition. Thus, from (1),

or

where $\mu = C(1)m$.

The key assumptions needed to derive the autoregressive representation of the process are given below. As in Chapter 5, the proof follows Johansen (1991a).

ASSUMPTION B1. The characteristic polynomial,

has roots either equal to or strictly greater than 1; that is, $|C(z)| = 0$ implies that either $|z| > 1$ or $z = 1$.

ASSUMPTION B2. The matrix $C(1)$ has reduced rank $n - r$ and is therefore expressible as the product of two $n \times (n - r)$ matrices $\phi$ and $\eta$, where $\phi$ and $\eta$ have rank $n - r$. Thus, $C(1) = \phi\eta'$.


ASSUMPTION B3. The $r \times r$ matrix $\phi_\perp' C^*(1)\eta_\perp$ has full rank $r$.²

Assumptions B1-B3 are analogous to Assumptions A1-A3 in Chapter 5. Given our results on $C(1)$ in Chapter 5, it is natural to require that $C(1)$ be of reduced rank and have rank $n - r$. Also, $r = n$ implies that $C(1)$ is identically the null matrix. Thus, from (3), $(\Delta x_t - \mu) = C^*(L)\Delta\varepsilon_t$, which implies, after integration, that $x_t$ is integrated (at most) of order 0. Assumption B3 then rules out the possibility that $C^*(L)$ has a root on the unit circle, so $x_t$ cannot be integrated of order $-1$. In either case, we have a contradiction of the assumption that the components of $x_t$ are I(1).

To derive the autoregressive representation, multiply (3) by $\phi'$ and $\phi_\perp'$ respectively to obtain the equations

using the decomposition $C(1) = \phi\eta'$ and the result that

The matrix $C(1)$ is not invertible and the system given by (4a) and (4b) therefore cannot be inverted directly to express the $x_{it}$ in terms of the $\varepsilon_{it}$. An invertible system is obtained by defining, as in Chapter 5, two new variables, $w_t = (\eta'\eta)^{-1}\eta'\varepsilon_t$ and $y_t = (\eta_\perp'\eta_\perp)^{-1}\eta_\perp'\Delta\varepsilon_t$. Repeating the steps used in Chapter 5, the matrices $\bar{\eta}$ and $\bar{\eta}_\perp$ are defined as $\eta(\eta'\eta)^{-1}$ and $\eta_\perp(\eta_\perp'\eta_\perp)^{-1}$ respectively. Next, again as in Chapter 5,

Substituting into (4a) and (4b) gives

We therefor e hav e with

For $z = 1$, this matrix has determinant

² The orthogonal complement of a matrix is defined in Sect. 5.3.1. Using this definition, $\phi_\perp$ and $\eta_\perp$ are $n \times r$ dimensional matrices with rank $r$.


which is non-zero, using Assumptions B2 and B3. Thus, $B(z)$ does not have a root at 1. For $z \neq 1$,

where (7) may be shown by substituting for $C^*(z)$ in $B(z)$ in terms of $C(z)$ and $C(1) = \phi\eta'$, and using the orthogonality condition $\phi_\perp'\phi = \eta_\perp'\eta = 0_{r \times (n-r)}$. For $z \neq 1$, from (7),

Thus for $z \neq 1$, $|B(z)| = 0$ if and only if $|C(z)| = 0$. Excluding $z = 1$, by Assumption B1 the only remaining roots of this determinant lie outside the unit circle.

All the roots of $|B(z)| = 0$ are therefore outside the unit disk, and the system defined by (5a) and (5b) is invertible. Thus, from (6),

Also from (6), note that

and, using the formula for inversion of partitioned matrices,

From the definition of $\Delta\varepsilon_t$,

where $F(L) = [\bar{\eta}(1 - L),$ […]

Integrating (9) gives

where $x_0$ is the constant of integration. To derive the value of $F(1)$, note that

Substituting for $(B(1))^{-1}$ from (8) gives $F(1) =$ […]

Thus, recalling that $\mu = C(1)m = (\phi\eta')m$, $F(1)\mu = 0_{n \times 1}$. The autoregressive representation, in its final form, is therefore given by


Several features of the derivations above are noteworthy, particularly with respect to the $F(1)$ matrix. First, $F(1)C(1) = C(1)F(1) = 0_n$. This result follows from substituting $\eta_\perp(\phi_\perp' C^*(1)\eta_\perp)^{-1}\phi_\perp'$ for $F(1)$ and $\phi\eta'$ for $C(1)$ and using the orthogonality conditions. This re-emphasizes the duality, first mentioned in Chapter 5, between the impact matrix in the MA representation, given by $C(1)$, and the impact matrix in the AR representation, given here by $F(1)$. The null space of the former is the range space of the latter and vice versa.

Second, the isomorphism of $F(1)$ with the $\gamma\alpha'$ matrix in Chapter 5 can be demonstrated easily. Note that

Both $\eta_\perp(\phi_\perp' C^*(1)\eta_\perp)^{-1}$ and $\phi_\perp$ are matrices of rank $r$ and dimension $n \times r$. Thus, redefining $\phi_\perp'$ as $\alpha'$ and $\eta_\perp(\phi_\perp' C^*(1)\eta_\perp)^{-1}$ as $\gamma$, we have $F(1) = \gamma\alpha'$, which is an $n \times n$ matrix with rank $r$ and is isomorphic to $\pi$. It is natural to define $\phi_\perp' x_t$ ($\alpha' x_t$ in Chapter 5) as the co-integrated combinations of the $x_{it}$. Integrating (4b) shows that $\phi_\perp' x_t$ does not contain an integrated component of the form $\sum_{i=1}^{t}\varepsilon_i$. Further, by the orthogonality of $\mu$ with $\phi_\perp$, the co-integrating combinations do not contain a trend. Both these results match exactly the corresponding results on $\alpha' x_t$ in Chapter 5.

Third, if $B(L)$ were not of full rank, it would be possible to extract another unit root in the representation given by (6), and the system would be I(0) instead of I(1), as assumed originally. The importance of Assumption B3 is now clear. Finally, using the result that the rank of $F(1)$ is $r$, it is possible to rewrite the model in error-correction form as

where $F(1)$, like $\pi$ in Chapter 5, is a matrix of rank $r$ and can therefore be decomposed into two $n \times r$ matrices, each of rank $r$. The steps involved in going from the final autoregressive form of the system to the ECM form are given in (5.25)-(5.27), with $\pi$ playing the role of $F(1)$.

Sections 5.3 and 8.1 have demonstrated the isomorphism of the moving-average, error-correction, and autoregressive representations of co-integrated processes. The next section returns to the autoregressive representation and relates this to the method used by Johansen (1988) which tests the rank of $\pi = \gamma\alpha'$, since, if there are $r$ co-integrating vectors and $\gamma\alpha' = \pi$, then rank$(\pi) = r$. The non-uniqueness of these vectors (in the absence of a priori information) is easily seen:

for all $r \times r$ non-singular matrices $P$. However, since rank$(\alpha) = r$, we can normalize $\alpha^*$ (perhaps after suitable rearrangement of the variables) such that $\alpha^{*\prime} = (I_r : \beta')$, and so $\alpha^{*\prime} x_t = x_{at} + \beta' x_{bt}$ where $x_t' = (x_{at}' : x_{bt}')$.
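The duality $F(1)C(1) = C(1)F(1) = 0_n$ noted above can be checked on a small numerical case. The sketch below takes $n = 2$ and $r = 1$, so that $\phi$, $\eta$ and their orthogonal complements are simple vectors; all numerical values, including the matrix standing in for $C^*(1)$, are hypothetical and chosen only so that Assumption B3 holds. It builds $C(1) = \phi\eta'$ and $F(1) = \gamma\alpha'$ with $\alpha = \phi_\perp$ and $\gamma = \eta_\perp(\phi_\perp' C^*(1)\eta_\perp)^{-1}$.

```python
def outer(a, b):
    """Outer product a b' of two vectors."""
    return [[ai * bj for bj in b] for ai in a]


def matmul(a, b):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(m, v):
    return [dot(row, v) for row in m]


phi, eta = [1.0, 2.0], [3.0, -1.0]             # n x (n - r) with n = 2, r = 1 (hypothetical values)
phi_perp, eta_perp = [2.0, -1.0], [1.0, 3.0]   # orthogonal complements: phi_perp'phi = eta_perp'eta = 0
C1 = outer(phi, eta)                           # C(1) = phi eta', of reduced rank n - r = 1
C_star1 = [[0.7, 0.2], [-0.4, 1.1]]            # hypothetical stand-in for C*(1)

# Assumption B3: phi_perp' C*(1) eta_perp must be non-zero (full rank r = 1).
b3 = dot(phi_perp, matvec(C_star1, eta_perp))
gamma = [v / b3 for v in eta_perp]             # gamma = eta_perp (phi_perp' C*(1) eta_perp)^{-1}
alpha = phi_perp                               # alpha' = phi_perp'
F1 = outer(gamma, alpha)                       # F(1) = gamma alpha', of rank r = 1
FC, CF = matmul(F1, C1), matmul(C1, F1)        # both products should vanish
```

Because $\alpha'\phi = 0$ and $\eta'\gamma = 0$, both products are null matrices: the range of $C(1)$ lies in the null space of $F(1)$ and conversely, which is the duality between the two impact matrices.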


An important point for inference, given (10), is that the ECM terms $\alpha' x_{t-k}$ will generally enter more than one equation. This will violate weak exogeneity when $\alpha$ is a parameter of interest, since the ECMs will be present in some of the other marginal distributions, and will therefore necessitate joint estimation for efficiency as discussed in Chapter 7 (see e.g. Phillips 1991, and Phillips and Hansen 1990). Hence a necessary condition for the use of single-equation methods to be appropriate in the analysis of co-integrated systems is that the relevant ECM terms enter only the equation under study; this is clearly not a sufficient condition, since it is possible that there can be links between other parameters.

As an illustration of (10), consider the case where $n = 2$ and $r = 1$. Let $\alpha' = (1, -\kappa)$ and $x_0 = 0$, so that the respective systems become (11) and (11').

The form in (11) is the 'canonical' representation in I(0) space, and Phillips (1991) focuses on estimation of this system. When $E(u_{1t}u_{2t}) \neq 0$, a 'simultaneity problem' is present, but this can be dealt with by the inclusion of $\Delta x_{2t}$ as a regressor in the first equation of (11). The functional central-limit theorems for Wiener processes noted in earlier chapters apply despite the serial dependence in $u_t = [u_{1t}, u_{2t}]'$, and direct estimation of $\kappa$ in the first equation of (11) can be seen as the method originally proposed by Engle and Granger (1987). Inference must, however, allow for the serial dependence in $u_t$.

The latter system, (11'), highlights the 'structural' form. At least one of $\gamma_1$ or $\gamma_2$ must be non-zero, since otherwise the system can be expressed in terms of differenced variables alone. Weak exogeneity is violated by (among other possibilities) $\gamma_1\gamma_2 \neq 0$. Since we are unlikely to know a priori which other equations are influenced by any given ECM, we turn now to a method of estimating the co-integrating rank $r$ of a system, which will also allow tests of this aspect of weak exogeneity.
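A small simulation can illustrate direct estimation of $\kappa$ in the bivariate case. The sketch assumes a triangular reading of the canonical form, $x_{1t} = \kappa x_{2t} + u_{1t}$ and $\Delta x_{2t} = u_{2t}$ with $x_0 = 0$, and, purely for simplicity, independent standard-normal $u_{1t}$ and $u_{2t}$. That specification, the function names, and the parameter values are assumptions made for illustration, not a transcription of (11).

```python
import random

def simulate_triangular(kappa, T, seed=0):
    """Simulate x1_t = kappa * x2_t + u1_t and x2_t = x2_{t-1} + u2_t, with x_0 = 0."""
    rng = random.Random(seed)
    x1, x2 = [], []
    level = 0.0
    for _ in range(T):
        level += rng.gauss(0.0, 1.0)                     # u2_t drives the stochastic trend
        x2.append(level)
        x1.append(kappa * level + rng.gauss(0.0, 1.0))   # u1_t is the I(0) equilibrium error
    return x1, x2

def static_ols_slope(x1, x2):
    """Static regression of x1 on x2 without intercept (Engle-Granger first step)."""
    return sum(a * b for a, b in zip(x1, x2)) / sum(b * b for b in x2)

x1, x2 = simulate_triangular(kappa=2.0, T=2000, seed=1)
kappa_hat = static_ols_slope(x1, x2)
print(round(kappa_hat, 4))
```

Because the regressor is I(1), the estimate settles close to the true value much faster than in a stationary regression; with correlated or serially dependent errors the point estimate remains consistent but, as the text notes, inference must allow for that dependence.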

8.2. Estimating Co-integrating Vectors in Systems

Consider the linear system in (10) rewritten as

$$\Delta x_t = \sum_{i=1}^{k-1} D_i\,\Delta x_{t-i} + \pi x_{t-k} + \varepsilon_t, \qquad (12)$$

where, for simplicity, we have excluded deterministic terms such as trends or constants. We shall return to a consideration of these in Section 8.5. In general, the number of co-integrating vectors will be unknown in empirical modelling, and must first be determined from the data. This step is important, because both under- and over-estimation of $r$ have potentially serious consequences for estimation and inference. Under-estimation implies the omission of empirically relevant error-correction terms, with these omitted terms being relegated to $\varepsilon_t$. Over-estimation implies that the distributions of statistics will be non-standard. This may be demonstrated by inspection of (12). If $\pi$ is correctly specified, all the variables in (12) are I(0) and standard distributional results apply. However, $\pi x_{t-k}$ will not be I(0) if the matrix $\alpha$ contains vectors $\alpha_i$, say, such that $\alpha_i' x_{t-k}$ is not a co-integrating combination and is therefore I(1). The vector $\pi x_{t-k}$ will then have a mixture of I(0) and I(1) terms corresponding to the correct and incorrect (or over-estimated) co-integrating vectors respectively. Incorrect inferences will result from the use of conventional critical values in tests. We will see later that this may also have an adverse effect on forecasting accuracy.

Once $r$ is known, we can proceed to estimate $\alpha$ and $\gamma$, noting that non-singular linear combinations of these matrices provide equivalent representations. Indeed, $(\alpha : \gamma)$ is an over-parameterization of $\pi$, so only the dimension of the co-integrating space can be established directly.

A test for the null hypothesis that there are $r$ co-integrating vectors can be based on the maximum likelihood approach proposed by Johansen (1988). The test is equivalent to testing whether $\pi = \gamma\alpha'$, where $\alpha$ and $\gamma$ are $n \times r$; hence it is a test of the hypothesis that $\pi$ has less than full rank.

We emphasize that, of the three distinct cases, (i) $r = n$, (ii) $r = 0$, and (iii) $0 < r < n$, only case (iii) will be considered formally. We have already shown that case (i) implies that all the variables in $x_t$ are I(0) and would only be of interest if our initial assumption, that $x_t$ is I(1), were incorrect. In case (ii), $\pi = 0$ and the system ought to be respecified in differences to achieve stationarity. We can potentially cover this case as an extreme of case (iii).

For $0 < r < n$, under the assumptions that (12) is the DGP, that all coefficient matrices are constant, that $x_{1-k}, \ldots, x_0$ are given, and that $\varepsilon_t \sim \mathrm{IN}(0, \Omega)$,³ the log-likelihood function is derived from the multivariate normal distribution:⁴

3 Phillips and Durlauf (1986) derive the limiting distribution of the least-squares estimator of (the equivalent of) $\pi$, allowing for more general error processes.

The first step is to concentrate $L(\cdot)$ with respect to $\Omega$, which involves no new considerations, and yields the conventional result that $\hat{\Omega} = T^{-1}\sum_{t=1}^{T}\varepsilon_t\varepsilon_t'$. Next, we remove the known I(0) variables from (12) to focus on the matrix of interest $\pi$, which requires concentrating $L(\cdot)$ with respect to $(D_1, \ldots, D_{k-1})$. To do so, since the $\{D_i\}$ are unrestricted, we can partial out the effects of $(\Delta x_{t-1}, \ldots, \Delta x_{t-k+1})$ from both $\Delta x_t$ and $x_{t-k}$ by regression, to obtain residuals $R_{0t}$ and $R_{kt}$ respectively. Let $q_t = (\Delta x_{t-1}', \ldots, \Delta x_{t-k+1}')'$; then

The concentrated likelihood function $L^{*}(\pi)$ now depends only on $\{R_{0t}, R_{kt}\}$ and takes the form

Next, we compute the second-moment matrices of all of these residuals and their cross-products, $S_{00}$, $S_{0k}$, $S_{k0}$, $S_{kk}$, where

4 Note that we use the upper-case $\Pi$ for the ratio of the circumference of a circle to its diameter, as opposed to the lower-case $\pi$ defined earlier as the matrix product $\gamma\alpha'$.


Consequently, from (18),

If $\pi$ were unrestricted, a conventional regression estimator would result. However, we are interested in the class of solutions that result from the imposition of the restriction that $\pi = \gamma\alpha'$. Hence, from (20),

Next, concentrate $L^{*}(\gamma, \alpha)$ with respect to $\gamma$, which will deliver an expression for the MLE of $\gamma$ as a function of $\alpha$, and yields a further concentrated likelihood function which depends only on $\alpha$. Once the MLE of $\alpha$ is obtained, we can solve backwards for estimates of all the other unknown parameters as functions of the MLE of $\alpha$. Thus, from (21),

Substituting $\hat{\gamma}$ into (21) yields $L^{**}(\alpha)$:

At first sight, differentiating $L^{**}(\alpha)$ with respect to $\alpha$ looks formidable, but in fact the algebra involved is close to that underlying the well-known LIML estimator for a single equation from a simultaneous system; both depend on reduced-rank restrictions being imposed. In order to solve the problem, we apply partitioned inversion results to (23) and obtain

Then maximizing $L^{**}(\alpha)$ with respect to $\alpha$ corresponds to minimizing the generalized variance ratio, noting that $|S_{00}|$ is a constant. To locate that minimum, we proceed as with LIML and impose the normalization that $\alpha' S_{kk}\alpha = I$. The MLE now requires that we minimize, with respect to $\alpha$,
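The steps of this derivation can be summarized in the standard reduced-rank regression form associated with Johansen (1988). The block below is a sketch written in this section's symbols ($R_{0t}$, $R_{kt}$, $S_{ij}$, $\gamma$, $\alpha$); it is not a transcription of the numbered displays, and the use of $\mu_i$ for the eigenvalues and $K$ for the constant is an assumption about notation.

```latex
\begin{align*}
R_{0t} &= \Delta x_t - \Big(\sum_t \Delta x_t q_t'\Big)\Big(\sum_t q_t q_t'\Big)^{-1} q_t, &
R_{kt} &= x_{t-k} - \Big(\sum_t x_{t-k} q_t'\Big)\Big(\sum_t q_t q_t'\Big)^{-1} q_t, \\
S_{ij} &= T^{-1}\sum_{t=1}^{T} R_{it}R_{jt}', & i, j &\in \{0, k\}, \\
\hat{\gamma}(\alpha) &= S_{0k}\alpha\,(\alpha' S_{kk}\alpha)^{-1}, &
\hat{\Omega}(\alpha) &= S_{00} - S_{0k}\alpha\,(\alpha' S_{kk}\alpha)^{-1}\alpha' S_{k0}, \\
L^{**}(\alpha) &= K - \tfrac{T}{2}\log\big|\hat{\Omega}(\alpha)\big|, &
\big|\hat{\Omega}(\alpha)\big| &= |S_{00}|\,
  \frac{\big|\alpha'(S_{kk} - S_{k0}S_{00}^{-1}S_{0k})\alpha\big|}{\big|\alpha' S_{kk}\alpha\big|}.
\end{align*}
% Minimizing the ratio subject to alpha' S_kk alpha = I leads to the eigenvalue problem
\begin{equation*}
\big|\mu S_{kk} - S_{k0}S_{00}^{-1}S_{0k}\big| = 0,
\qquad 1 > \hat{\mu}_1 \ge \dots \ge \hat{\mu}_n \ge 0,
\end{equation*}
% with alpha-hat given by the eigenvectors of the r largest eigenvalues.
```

The determinant identity in the fourth line is the partitioned-inversion step; it is what turns the likelihood problem into the generalized variance ratio described above.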



This involves finding the saddle-point of the Lagrangian, where
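To make the eigenvalue calculation concrete, here is a minimal sketch for a bivariate first-order system, where no lagged differences need to be partialled out, so that $R_{0t} = \Delta x_t$ and $R_{kt} = x_{t-1}$. The data-generating process, its parameter values, and all function names are assumptions chosen for illustration. The eigenvalues are those of $S_{kk}^{-1}S_{k0}S_{00}^{-1}S_{0k}$, obtained here from the $2 \times 2$ characteristic equation.

```python
import math
import random

def simulate_ecm(T, kappa=2.0, g1=-0.5, g2=0.0, seed=0):
    """Simulate dx_t = (g1, g2)' * (x1_{t-1} - kappa * x2_{t-1}) + e_t, starting at x_0 = 0."""
    rng = random.Random(seed)
    x = [(0.0, 0.0)]
    for _ in range(T):
        x1, x2 = x[-1]
        ecm = x1 - kappa * x2
        x.append((x1 + g1 * ecm + rng.gauss(0.0, 1.0),
                  x2 + g2 * ecm + rng.gauss(0.0, 1.0)))
    return x

def moment(a, b):
    """2 x 2 second-moment matrix T^{-1} sum_t a_t b_t'."""
    T = len(a)
    return [[sum(u[i] * v[j] for u, v in zip(a, b)) / T for j in range(2)]
            for i in range(2)]

def inv2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def mul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def johansen_bivariate(x):
    """Eigenvalues, normalized first eigenvector, and trace statistics for a VAR(1)."""
    r0 = [(b[0] - a[0], b[1] - a[1]) for a, b in zip(x, x[1:])]   # R_0t = dx_t
    rk = x[:-1]                                                   # R_kt = x_{t-1}
    T = len(r0)
    s00, s0k = moment(r0, r0), moment(r0, rk)
    sk0, skk = moment(rk, r0), moment(rk, rk)
    m = mul2(mul2(inv2(skk), sk0), mul2(inv2(s00), s0k))
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    eigs = [(tr + root) / 2.0, (tr - root) / 2.0]
    # Eigenvector of the largest eigenvalue, scaled so that its first element is 1.
    alpha_hat = (1.0, (eigs[0] - m[0][0]) / m[0][1])
    trace = [-T * sum(math.log(1.0 - e) for e in eigs[r:]) for r in range(2)]
    return eigs, alpha_hat, trace

eigs, alpha_hat, trace = johansen_bivariate(simulate_ecm(T=2000, seed=1))
print(eigs, alpha_hat, trace)
```

With one genuine co-integrating relation, the first eigenvalue should be clearly away from zero, the second close to zero, and the normalized eigenvector close to $(1, -\kappa)$. Critical values for the statistics are non-standard and are not computed here.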
, ij, and p. Johansen shows that, asymptotically, this procedure determines the correct parameters. He also obtains the relevant limiting distributions of the estimators.

8.5.8. Weak Exogeneity and Conditional Models

Most large-scale econometric systems and many other empirical models are open in the sense that they treat a subset of the variables as 'exogenous'. In this sub-section, we will focus on the potential weak exogeneity of contemporaneous conditioning variables for the parameters of interest in I(1) co-integrated systems (see Engle et al. 1983). As discussed in Chapter 1, weak exogeneity requires that there is no loss of information about the parameters of interest in reducing the analysis from the joint distribution to a conditional model. The concept was developed initially in the context of stationary processes, but as the results in Chapter 7 suggested, it plays an important role in I(1) systems as well.

In particular, when the vector of observables $x_t$ is I(1) there can be cross-equation links between parameters, which are induced by the occurrence in several equations of common co-integrating combinations $\alpha' x_t$. If $\alpha' x_t$ enters both the $i$th and $j$th equations, then $x_{jt}$ cannot be weakly exogenous for the parameters of the $i$th equation, since the parameters of the two equations share common components of $\alpha' x_t$ and so cannot be variation free. Failure to account for such parameter dependencies can adversely affect the validity of inference in finite samples (see Chapter 7, Phillips 1991, Phillips and Loretan 1991, and Hendry and Mizon 1992).

To develop notation for an I(1) open system, two partitions of $x_t$ are needed. To exposit the basic idea, it is convenient to return to the first-order system in (38) above, written as

where $\varepsilon_t \sim \mathrm{IN}(0, \Sigma)$ and $\alpha'$ is $r \times n$ of rank $r$. First, we have the usual transformed partition of $x_t$ into $w_t = (x_t'\alpha : \Delta x_{bt}')'$, capturing the locations of the unit roots and the co-integrating vectors, where there are $r$ elements in $x_t'\alpha$ and $(n - r)$ in $\Delta x_{bt}$. The history of the process up to time $t - 1$ is denoted in I(0) space by $W_{t-1} = (w_1, \ldots, w_{t-1})$. Second, we partition $\Delta x_t$ into $(\Delta x_{1t}' : \Delta x_{2t}')'$, where $\Delta x_{2t}$ is $m \times 1$ and is to be treated as weakly exogenous for the vector parameter of interest $\phi \in \Phi$, which includes those elements of $\alpha$ and $\gamma$ relevant to $\Delta x_{1t}$. For later use, we explicitly write out $\pi x_{t-1}$ in terms of $(x_{1,t-1}' : x_{2,t-1}')'$, when there are $r_1 + r_2 = r$ co-integrating relations in the two blocks, namely

The dimensions of $\gamma_{11}$, $\gamma_{12}$, $\gamma_{21}$, and $\gamma_{22}$ are $(n - m) \times r_1$, $(n - m) \times r_2$, $m \times r_1$, and $m \times r_2$ respectively; and, correspondingly, $\alpha_{11}'$, $\alpha_{12}'$, $\alpha_{21}'$, and $\alpha_{22}'$ are $r_1 \times (n - m)$, $r_1 \times m$, $r_2 \times (n - m)$, and $r_2 \times m$. If $r_2 = 0$, then the relevant elements are set to zero. Since the analysis in terms of $w_t$ is in I(0) space, the approach in Engle et al. applies.

The complete set of parameters of the joint distribution is $\theta \in \Theta$, and these are mapped one-for-one to $f(\theta) = \lambda \in \Lambda$, and partitioned into $\lambda = (\lambda_1 : \lambda_2)'$ where $\lambda_1 \in \Lambda_1$ and $\lambda_2 \in \Lambda_2$. Factorize the joint sequential density $D_x(\Delta x_t \mid W_{t-1}, \theta)$ of $\Delta x_t$ into its conditional and marginal components, as in (56).

Since $w_{t-1} = (x_{t-1}'\alpha : \Delta x_{b,t-1}')'$, all the information on the co-integrating vectors is retained in $W_{t-1}$. Consequently, $\Delta x_{2t}$ is weakly exogenous for $\phi$ if $\phi$ depends on $\lambda_1$ alone, and $\lambda_1$ and $\lambda_2$ are variation free, so that $\Lambda = \Lambda_1 \times \Lambda_2$. Weak exogeneity of $\Delta x_{2t}$ for $\phi$ cannot occur when $\lambda_1$ and $\lambda_2$ both depend on common components of $\alpha$.

As a consequence of the normality assumption, and using the expression in (55) for $\gamma\alpha' x_{t-1}$, conditioning $\Delta x_{1t}$ on $\Delta x_{2t}$ leads to the mean of the conditional density:


where $\Psi = \Sigma_{12}\Sigma_{22}^{-1}$. Thus, a necessary condition for the weak exogeneity of $\Delta x_{2t}$ for $(\gamma_{11} : \alpha_{11} : \alpha_{12})$ is that either $\{\gamma_{12} - \Psi\gamma_{22}\} = 0$ or $\gamma_{22} = 0$; i.e. $(\alpha_{21}' x_{1,t-1} + \alpha_{22}' x_{2,t-1})$ appears in only one of $D_{x_1|x_2}(\cdot)$ or $D_{x_2}(\cdot)$, but not both. Further, unless $\gamma_{21} = 0$, then $(\alpha_{11}' x_{1,t-1} + \alpha_{12}' x_{2,t-1})$ will appear in the marginal distribution of $\Delta x_{2t}$, so $\gamma_{21} = 0$ is also necessary. There are sufficient conditions for these necessary conditions to hold, including $\gamma_{21} = 0$, $\gamma_{22} = 0$, and $\gamma_{12} = 0$, where the latter two arise because $r_2 = 0$. Such conditions can be tested using the approach in Johansen (1992b) and Johansen and Juselius (1990).

Short-run parameters may depend on some of the elements in $\alpha$ without jeopardizing efficient inferences about long-run parameters of interest. However, if all the elements of $\phi$ are of interest, then again variation-free parameters are required, and any cross-restrictions violate weak exogeneity.

To illustrate this analysis, reconsider the example in equations (31) and (32) and (60) of Chapter 7. There is one co-integrating vector with parameter $\beta$, $r_1 = r = 1$, $r_2 = 0$, $m = 1$, and $n = 2$:

This representation is in terms of $w_t$ (see (38) above) but is written as a triangular system error correction as in Phillips (1991), imposing a specific first-order autoregressive parametric form for the error process $u_t$ (compared with the general processes allowed by Phillips):

The unconditional covariance matrix of $u_t$ is $\mathrm{plim}\; T^{-1}\sum u_t u_t' = G$, derived in Section 8.5.1. Let $c_{12} = c_{22} = 0$, since these parameters only determine the presence of the lagged difference of $x_{2t}$ and do not affect co-integration vectors. Then the long-run covariance matrix is (see Ch. 7 appendix):

where $\omega_{11} = \sigma_{11}/(1 - c_{11})^2$ and $\omega_{12} = \sigma_{12}/(1 - c_{11})$. The non-diagonality of $\Omega$ implies that there is information about the parameters of each equation in the other. However, by conditioning $\Delta x_{1t}$ on $\Delta x_{2t}$ in the first equation, the $\sigma_{12}$ effect is removed. Then even if the first equation is dynamic, so $c_{11} \neq 0$, the diagonality of $\Omega$ only depends on $c_{21} = 0$. When $c_{21} \neq 0$, the long-run covariance matrix is non-diagonal and there is a loss of weak exogeneity, which can have a detrimental impact on the bias and efficiency of the least-squares estimator of $\beta$ in finite samples. Note that $c_{12} \neq 0$ can be corrected within the first equation treated in isolation by adding lagged $\Delta x_{2t}$, but that $c_{21} \neq 0$ requires modelling the system (although corrections based on adding leads of $\Delta x_{2t}$ have been proposed to exploit the obverse Granger causality of $x_1$ on $x_2$: see Stock and Watson 1991).

We now derive the conditional and marginal factorizations. In terms of observables, the original system from Chapter 7 can be written as $w_t = Cw_{t-1} + \varepsilon_t$, or

Rewritten as a VAR in I(0) variables as in (37), we have

where $d_{12} = c_{12} + \beta c_{22}$, $\gamma_{11} = (c_{11} - 1 + \beta c_{21})$, $d_{22} = c_{22}$, and $\gamma_{21} = c_{21}$. The restricted first column of $D$ is an incidental effect from assuming a first-order autoregressive error initially. Finally, solving for the conditional and marginal representations, we have

where $\Psi = \sigma_{12}\sigma_{22}^{-1}$, $\lambda_{11} = (\beta + \Psi)$, $\lambda_{12} = (c_{11} - 1 - \Psi c_{21})$, $\lambda_{13} = (c_{12} - \Psi c_{22})$, $\lambda_{21} = c_{21}$, $\lambda_{22} = c_{22}$, and $E[v_t\varepsilon_{2t}] = 0$. Assume that $\phi = (\lambda_{11} : \lambda_{12} : \lambda_{13} : \beta)'$ is the vector parameter of interest. When $\lambda_{21} = 0$, least-squares estimation of $\phi$ from the first equation involves no loss of information. In fact, $x_{2t}$ is strongly exogenous for $\phi$ in such a system. However, when $\lambda_{21} \neq 0$, $\Delta x_{2t}$ is not weakly exogenous for $\phi$ and the analysis is not fully efficient. Monte Carlo studies (e.g. Phillips and Loretan 1991) confirm the impact of this loss of efficiency in finite samples (see Chapter 7).

Irrespective of the value of $\lambda_{21}$, the first equation in (62) is the conditional expectation from (58), namely

Thus, once data are I(1) but co-integrated, the fact that an equation coincides with the conditional expectation is not sufficient to justify single-equation least-squares modelling. Rather surprisingly, weak exogeneity is at least as important in I(1) processes as in I(0) processes.
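The coefficient definitions quoted in this example can be recovered by a short derivation. The sketch below assumes $u_{1t} = x_{1t} - \beta x_{2t}$ and $u_{2t} = \Delta x_{2t}$, with $u_t = Cu_{t-1} + e_t$ and $e_t$ having covariance elements $\sigma_{ij}$; the symbols $e_{1t}$, $e_{2t}$ for the innovations are an assumption about notation rather than a transcription of the lost displays.

```latex
\begin{align*}
\Delta x_{1t} &= \Delta u_{1t} + \beta\,\Delta x_{2t}
  = (c_{11} - 1 + \beta c_{21})\,u_{1,t-1} + (c_{12} + \beta c_{22})\,\Delta x_{2,t-1}
    + (e_{1t} + \beta e_{2t}), \\
\Delta x_{2t} &= c_{21}\,u_{1,t-1} + c_{22}\,\Delta x_{2,t-1} + e_{2t}, \\[4pt]
\frac{\operatorname{Cov}(e_{1t} + \beta e_{2t},\, e_{2t})}{\operatorname{Var}(e_{2t})}
  &= \frac{\sigma_{12} + \beta\sigma_{22}}{\sigma_{22}} = \beta + \Psi,
  \qquad \Psi = \sigma_{12}\sigma_{22}^{-1}, \\[4pt]
\Delta x_{1t} &= (\beta + \Psi)\,\Delta x_{2t}
  + (c_{11} - 1 - \Psi c_{21})\,u_{1,t-1}
  + (c_{12} - \Psi c_{22})\,\Delta x_{2,t-1} + v_t, \\
v_t &= e_{1t} - \Psi e_{2t}, \qquad E[v_t e_{2t}] = \sigma_{12} - \Psi\sigma_{22} = 0 .
\end{align*}
```

The fourth line follows by subtracting $(\beta + \Psi)$ times the second line from the first. The $\beta c_{21}$ and $\beta c_{22}$ terms cancel, which is why $\lambda_{12}$ and $\lambda_{13}$ involve only $\Psi$, while $c_{21}$ still appears in both the conditional and the marginal equation.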


8.6. A Second Example of the Johansen Maximum Likelihood Approach

We reconsider the UK seasonally adjusted quarterly data from Sect. 7.6 on money, prices, output, and interest rates, this time treated as a system, represented by a VAR with two lags on each of $m - p$, $\Delta p$, $x_{85}$, and $R_n$, plus a constant and a trend. The lag length was selected by commencing at five lags on every variable, and sequentially testing from the highest order. The sample was 1964(3)-1989(2). The residual standard deviations of the four equations were 0.0161, 0.0069, 0.0126, and 0.0127 respectively, and on recursive F-tests all four equations had acceptably constant coefficients using one-off I(0) critical values. The residuals also yielded insignificant outcomes on $\chi^2$ tests for autocorrelation but not for normality.

In almost every instance, two co-integrating combinations were significant (i.e. two unit roots were rejected); the second of these was virtually the same in all lag specifications, but the first was often a linear combination of the first two rows reported in Table 8.9. Such a finding matches that in Hendry and Mizon (1992) and Ericsson et al. (1991). Beginning with the largest statistics, two of the tests in each column are significant (see Osterwald-Lenum 1992: Table 2).

TABLE 8.8. Eigenvalues, test statistics, and 5 per cent critical values

Null hypothesis      Eigenvalue μ_i   −T log(1 − μ_i)   5% critical value   −T Σ log(1 − μ_i)   5% critical value
r = 0 (n − r = 4)    0.517240         72.82             30.33               109.17              54.64
r = 1 (n − r = 3)    0.249694         28.73             23.78                36.34              34.55
r = 2 (n − r = 2)    0.060350          6.22             16.87                 7.62              18.17
r = 3 (n − r = 1)    0.013817          1.39              3.74                 1.39               3.74

The corresponding eigenvectors are shown in Table 8.9, in rows, augmented by the two non-co-integrating combinations in the last two rows.

TABLE 8.9. Normalized eigenvectors α'

Variable    m − p      Δp        x_85       R_n
α_1'        1.0000     6.3966    −0.8938     7.6838
α_2'        0.0311     1.0000    −0.3334    −0.1377
v_1'       −0.2633     0.9435     1.0000    −1.2117
v_2'        0.9838     4.5659    −0.7701     1.0000

The first row suggests the following long-run solution for the money equation:

This is close to that found from the single-equation dynamic analysis in Chapter 7. No trend is required. The $\gamma$ matrix is given in Table 8.10. Only the first entry in the first column is at all large, so that the first co-integrating vector only affects the first equation, consistent with the weak exogeneity of $x_{85}$, $R_n$, and $\Delta p$ for the parameters of the money-demand equation. This again matches the finding over a shorter sample in Hendry and Mizon (1992).

The second row of Table 8.9 delivers the approximate long-run solution

This corresponds to the impact of excess demand, as measured by the deviation from its linear trend, on inflation, with a small and possibly insignificant effect from interest rates. No additional trend is then required. The second column of $\gamma$ shows a large effect of this ECM on all four equations, violating any possibility of treating any of the four variables as weakly exogenous in a model of inflation or excess demand when the parameters of interest include the long-run multipliers.

When the ordering of variables is $(m - p, \Delta p, x_{85}, R_n)$ the long-run $\pi$ matrix is

$$\pi = \begin{pmatrix} -0.082 & -0.245 & -0.081 & -0.761 \\ 0.164 & -0.009 & -0.474 & 0.112 \\ 0.007 & 0.146 & -0.108 & -0.147 \\ -0.021 & -0.119 & 0.149 & -0.059 \end{pmatrix}$$
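The reported numbers can be cross-checked with a few lines of arithmetic. The sketch below makes two assumptions that are not stated in the text: that the effective sample is $T = 100$ (the number of quarters from 1964(3) to 1989(2)), and that the full $4 \times 4$ adjustment matrix, including the two columns attached to the non-co-integrating combinations, multiplies the full eigenvector matrix to give $\pi$. All variable and function names are hypothetical.

```python
import math

T = 100  # assumed: quarterly observations from 1964(3) to 1989(2)
EIGENVALUES = [0.517240, 0.249694, 0.060350, 0.013817]
MAX_CRIT = [30.33, 23.78, 16.87, 3.74]      # 5% critical values, maximal-eigenvalue test
TRACE_CRIT = [54.64, 34.55, 18.17, 3.74]    # 5% critical values, trace test

# Rows: alpha_1', alpha_2', v_1', v_2'; columns: m - p, Dp, x85, Rn (Table 8.9).
EIGENVECTORS = [[1.0000, 6.3966, -0.8938, 7.6838],
                [0.0311, 1.0000, -0.3334, -0.1377],
                [-0.2633, 0.9435, 1.0000, -1.2117],
                [0.9838, 4.5659, -0.7701, 1.0000]]
# Rows: equations for m - p, Dp, x85, Rn; columns: gamma_1, gamma_2, then the
# two remaining columns of the adjustment matrix (Table 8.10).
GAMMA = [[-0.0952, 0.4268, -0.0300, -0.0076],
         [0.0048, -0.5147, -0.0013, 0.0024],
         [-0.0210, 0.2578, -0.0318, 0.0116],
         [-0.0001, -0.2253, 0.0796, 0.0069]]

def johansen_statistics(eigenvalues, T):
    """Maximal-eigenvalue statistics and trace statistics for r = 0, 1, ..."""
    max_eig = [-T * math.log(1.0 - mu) for mu in eigenvalues]
    trace = [sum(max_eig[r:]) for r in range(len(eigenvalues))]
    return max_eig, trace

def select_rank(statistics, critical_values):
    """Sequential testing from r = 0: stop at the first non-rejection."""
    r = 0
    for stat, crit in zip(statistics, critical_values):
        if stat <= crit:
            break
        r += 1
    return r

def long_run_matrix(gamma, eigenvectors):
    """pi = gamma * alpha', using all four columns of gamma and all four eigenvector rows."""
    n = len(gamma)
    return [[sum(gamma[i][k] * eigenvectors[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

max_eig, trace = johansen_statistics(EIGENVALUES, T)
print([round(s, 2) for s in max_eig], [round(s, 2) for s in trace])
print(select_rank(max_eig, MAX_CRIT), select_rank(trace, TRACE_CRIT))
for row in long_run_matrix(GAMMA, EIGENVECTORS):
    print([round(v, 3) for v in row])
```

Under these assumptions the statistics agree with Table 8.8 to rounding and both sequences select $r = 2$. The product agrees with the reported $\pi$ matrix in its first, third, and fourth rows; the second row of the product contains the same four values as the reported second row but in the order $(-0.009, -0.474, 0.164, 0.112)$, which may indicate that the entries of that row are out of sequence.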

TABLE 8.10. Adjustment coefficients γ

Variable    γ_1        γ_2
m − p      −0.0952     0.4268    −0.0300    −0.0076
Δp          0.0048    −0.5147    −0.0013     0.0024
x_85       −0.0210     0.2578    −0.0318     0.0116
R_n        −0.0001    −0.2253     0.0796     0.0069

8.7. Asymptotic Distributions of Estimators of Co-integrating Vectors in I(1) Systems

Gonzalo (1990) reviews and compares the various alternatives to OLS for the estimation of co-integrating vectors, including those proposed by Stock (1987), Stock and Watson (1988b), Johansen (1988), Phillips (1988a), and Phillips and Hansen (1990). While all of the suggested methods share the super-consistency property, we have seen that there can be substantial differences in their performance on moderately sized samples.

Gonzalo makes the comparison on a simple data generation process in which co-integration holds between the I(1) series $z_t$ and $y_t$:

and

This system is a special case of (58) and can therefore be represented in the error-correction form

where $u_{1t} = \beta\varepsilon_{2t} + \varepsilon_{1t}$, $u_{2t} = \varepsilon_{2t}$, and $E(uu') = \Lambda$, with

The logarithm of the likelihood function for the ECM is therefore

$L(\alpha, \gamma, \Lambda) = K - (T/2)\ln|\Lambda|$

where $x_t = (y_t, z_t)'$, $\gamma = (\rho - 1, 0)'$, $\alpha' = (1, -\beta)$, and $\gamma\alpha'$ is the $2 \times 2$ matrix of rank 1 given in (64).

The systems (63) and (64) have the property that $z_t$ is weakly exogenous for $\beta$. Since the $u_{it}$ are normally distributed (from (63)), take conditional expectations in (64)

Taking the covariances of the $u_t$ from (65), we have


The parameter $\beta$ is recoverable from (67). Moreover, $\beta$ does not enter the marginal distribution.

Weak exogeneity of $z_t$ for $\beta$ implies that inference concerning $\beta$ can be carried out with no loss of information by using the density of $y_t$ conditional on $z_t$ and ignoring the marginal density of $z_t$ (that is, the DGP of $z_t$). It is then not surprising that, when the log-likelihood is formally split into a conditional and a marginal likelihood, the marginal density contains no information about $\beta$. That is, (66) can be rewritten as

with $\Lambda_0 = \Lambda_{11} - \Lambda_{12}\Lambda_{22}^{-1}\Lambda_{21}$, $\xi_t = \Delta y_t - (\rho - 1)(y_{t-1} - \beta z_{t-1}) - \psi\Delta z_t$, and, finally, $\psi = \Lambda_{12}\Lambda_{22}^{-1} = (\beta + \theta\sigma_1/\sigma_2)$; $\psi$ can be interpreted as a short-run multiplier, being the coefficient on $\Delta z_t$ in (67), while the long-run multiplier is $\beta$, from (63). The term in parentheses in (68) is the marginal likelihood of $z_t$ (or $\Delta z_t$) and does not involve $\beta$; estimation of $\beta$ can be carried out by maximizing the conditional likelihood alone. The estimate is that which would be obtained from OLS in the regression corresponding to (67).

In order to discuss the asymptotic properties of different estimation methods, we use the multivariate functional central-limit theorem and transformation to the unit interval described in Chapter 6. For the vector $e_t = (v_t, \varepsilon_{2t})'$, let $p_t = p_{t-1} + e_t$. Then

with $B(r) = (B_1(r), B_2(r))'$. The long-run covariance matrix of this bivariate Brownian motion process can be calculated as in the appendix to Chapter 7:

Further,

where


Hence

Results on the asymptotic distributions of the different estimators of co-integrating parameters will be stated without proof, but can be found in Gonzalo (1990).

(i) Static regression estimated by OLS. For $x_t$ generated by (63), the OLS estimator of $\beta$ in a static regression has the asymptotic distribution

using the decomposition $B_1(s) = \omega_{12}\omega_{22}^{-1}B_2(s) + \ldots$ $\psi = \beta$ implies $\theta = 0$, and so $A_2 = A_3 = 0$. While the limiting distribution above is specific to the DGP (63), $\psi = \beta$ will typically only arise because of an absence of lagged values of $z_t$ and $y_t$ from the DGP; if, for example, $y_t = \psi z_t + \gamma_1 y_{t-1} + \gamma_2 z_{t-1} + \text{error}$, then the long-run multiplier is $\beta = (\psi + \gamma_2)/(1 - \gamma_1)$, in which case $\gamma_1 = \gamma_2 = 0$ is sufficient for $\beta = \psi$. A common factor ($\gamma_2 = -\psi\gamma_1$) is necessary and sufficient. The terms $A_2$ and $A_3$ above can be eliminated when $\psi \neq \beta$ by the use of other estimation methods, as will be seen below.
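The relation between the short-run and long-run multipliers in the dynamic example can be verified directly. The following sketch evaluates $\beta = (\psi + \gamma_2)/(1 - \gamma_1)$ for the model $y_t = \psi z_t + \gamma_1 y_{t-1} + \gamma_2 z_{t-1} + \text{error}$; the function name and the numerical values are illustrative assumptions.

```python
from fractions import Fraction as F

def long_run_multiplier(psi, gamma1, gamma2):
    """Static solution of y_t = psi*z_t + gamma1*y_{t-1} + gamma2*z_{t-1} + error."""
    if gamma1 == 1:
        raise ValueError("no long-run solution when gamma1 = 1")
    return (psi + gamma2) / (1 - gamma1)

psi, gamma1 = F(1, 2), F(1, 2)

# General case: the long-run multiplier differs from the impact multiplier.
print(long_run_multiplier(psi, gamma1, F(1, 4)))

# Common-factor restriction gamma2 = -psi * gamma1: the two multipliers coincide.
print(long_run_multiplier(psi, gamma1, -psi * gamma1) == psi)

# No lagged terms at all is a special case of the common-factor restriction.
print(long_run_multiplier(psi, F(0), F(0)) == psi)
```

The second call shows why the common factor is sufficient: $\psi + \gamma_2 = \psi(1 - \gamma_1)$, so the ratio collapses to $\psi$. Conversely, $\beta = \psi$ forces $\gamma_2 = -\psi\gamma_1$, which is the necessity part of the statement.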



(ii) Non-linear least squares (Stock 1987). This method, which eliminates the bias contained in (70c), consists in minimizing the sum of squared residuals defined as

which is non-linear in that the coefficient on $z_{t-1}$ in the corresponding regression model is $\gamma_1\beta$. The coefficient $\beta$ can however be recovered from the ordinary linear regression

The asymptotic distribution of this NLS estimator is similar to that in (69), but with the term (70c) omitted and (70b) modified to

Comparing (70b) and (70b'), we see that (70b') contains a factor of $\psi$ rather than $(\psi - \beta)$. As (70b) is one of the terms responsible for second-order bias, it seems likely that OLS will perform relatively well when $\psi - \beta = 0$, reducing the bias in (70b), and that NLS will perform relatively well when $\psi = 0$, reducing the bias in (70b'). In the Monte Carlo study of Stock (1987), the DGP chosen implies that $\psi = 0$, leading to the superiority of the NLS technique; where $\psi = \beta$, however, OLS may do better. Recall from the definition of $\psi$ that $\psi = \beta$ if $\theta$, a scaling factor for the correlation between the underlying white-noise disturbances in $y_t$ and $z_t$, is equal to zero.

(iii) Full-information maximum likelihood (FIML). The FIML procedure of Johansen (1988) for estimating the matrix $\alpha$ of co-integrating vectors in a system is described above. Gonzalo shows that, for the DGP (63), the FIML estimator of $\beta$ has the asymptotic distribution

where $A_1$ is as given in (70a). Therefore (71) is equivalent to (69) with terms $A_2$ and $A_3$ eliminated. FIML estimation eliminates two sources of bias: the non-symmetry caused by $\psi \neq \beta$, which leads to a bias in median (term (70b)), and the simultaneous-equations bias, which is a bias in mean (term (70c)) and results when the long-run covariance between $z_t$ and $v_t$ in (63) is not accounted for. The FIML estimator is asymptotically symmetrically distributed.



Moreover, the asymptotic distribution given in (71) is a mixture of normals. (Recall that in (70a) $B_2(s)$ and $W(s)$ are independent Brownian motion processes.) As a result, standard asymptotic chi-squared hypothesis tests are valid.

(iv) Other estimators. Stock and Watson (1988b) and Bossaerts (1988) propose additional methods of estimation based on principal components and canonical correlations respectively.

The principal-component method finds the linear combination of $y_t$ and $z_t$ with minimum variance, which amounts to finding the co-integrating vector. Given the covariance matrix of $(y_t, z_t)$, the principal-component estimate of the co-integrating vector is the eigenvector corresponding to the smallest eigenvalue of this covariance matrix. For the DGP (63), its asymptotic distribution is like that of OLS as given in (69), with the addition of a fourth term grouped with $A_1$, $A_2$, and $A_3$. Calling this term $A_4$,

The additional term affects the bias in mean, which may be larger or smaller than that of OLS as this term may be positive or negative. Like FIML, the principal-component method lends itself naturally to the estimation of more than one co-integrating vector.

The method of canonical correlation is based on a search for the linear combination of $(y_t, z_t)$ and $(y_{t-1}, z_{t-1})$ which has the maximal correlation subject to normalization and identification constraints.

Gonzalo compares the methods in a Monte Carlo simulation that uses a DGP similar to (63), but with (63a) modified to

where $a_1 = 0$ or 1 and with $a_1 = 1$. The results are consistent with the analysis of biases given above, and in particular support the contention that the Johansen-type FIML estimator will tend to be superior. Which of OLS and NLS is superior depends, as anticipated, on the parameters $\psi$ and $\psi - \beta$. Moreover, as we have seen above, it appears that the efficiency cost of over-parameterization of the FIML or NLS estimators is modest, while the consequences of under-parameterization may be more serious.
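The principal-component estimator described under (iv) is easy to sketch for two series. The example assumes a simple process in the spirit of the comparison above, $y_t = \beta z_t + v_t$ with $v_t = \rho v_{t-1} + \varepsilon_{1t}$ and $\Delta z_t = \varepsilon_{2t}$, with independent innovations; this process, the parameter values, and the function names are illustrative assumptions rather than Gonzalo's design.

```python
import math
import random

def simulate_pair(beta, rho, T, seed=0):
    """y_t = beta*z_t + v_t, v_t = rho*v_{t-1} + e1_t, z_t = z_{t-1} + e2_t."""
    rng = random.Random(seed)
    y, z = [], []
    z_level, v = 0.0, 0.0
    for _ in range(T):
        z_level += rng.gauss(0.0, 1.0)
        v = rho * v + rng.gauss(0.0, 1.0)
        z.append(z_level)
        y.append(beta * z_level + v)
    return y, z

def principal_component_beta(y, z):
    """Co-integrating vector (1, -b) from the smallest-eigenvalue eigenvector
    of the sample covariance matrix of (y_t, z_t); returns b."""
    T = len(y)
    my, mz = sum(y) / T, sum(z) / T
    syy = sum((a - my) ** 2 for a in y) / T
    szz = sum((b - mz) ** 2 for b in z) / T
    syz = sum((a - my) * (b - mz) for a, b in zip(y, z)) / T
    tr, det = syy + szz, syy * szz - syz * syz
    lam_min = (tr - math.sqrt(tr * tr - 4.0 * det)) / 2.0
    # First row of (S - lam*I)v = 0 with v = (1, -b): (syy - lam) - b*syz = 0.
    return (syy - lam_min) / syz

y, z = simulate_pair(beta=1.0, rho=0.5, T=4000, seed=3)
print(round(principal_component_beta(y, z), 4))
```

The minimum-variance direction is the one along which the stochastic trend cancels, which is why the smallest eigenvalue, rather than the largest, identifies the co-integrating combination.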

9

Conclusion

We briefly summarize the main themes of the book, and then consider the invariance of the matrix of co-integrating vectors in a linear system under both linear transformations and seasonal adjustment. Next, co-integration is related to structured time-series models, which offer an alternative approach to modelling integrated data. Recent research on integration and co-integration is described, and the book concludes by re-interpreting some old econometric problems in the light of co-integration theory.

9.1. Summary

Many economic time series appear to be non-stationary and to drift over time. Efficient inference in time-series econometrics requires taking account of this phenomenon. This book described the modelling of economic variables as integrated processes, allowing for the possibility that variables may be linked in the long run, implying that linear combinations of them are co-integrated.

We first presented the background to the theory of integrated series, building on concepts from time-series analysis and the theory of stochastic processes. The resulting distributions of estimators and tests applied to integrated data were functionals of Wiener processes, which when combined with a functional central-limit theorem led to a powerful and general method for deriving their limiting distributions. These were different from the limiting distributions conventionally applied to stationary processes, both because the normalization factor was the sample size rather than its square root, and because the form of the asymptotic distribution was non-normal. An important implication was that the critical values of test statistics differed between I(0) and I(1) data. Although the asymptotic distribution theory involved new types of derivations, it was feasible to master the logic of Wiener processes without excessive effort; the pay-off was that the approach simplified other derivations (such as constancy tests, as in Hansen 1992), and, in addition, was very general.

The Wiener process tools then allowed us to analyse such diverse problems as spurious (or nonsense) regressions, spurious detrending, parametric and non-parametrically adjusted univariate tests for unit roots, regressions on I(1) data, and tests for co-integration. We showed that even with I(1) data many tests had conventional distributions, but some did not, so care was required in conducting inference. For example, tests such as the Johansen statistic $T\log(1 - \lambda)$ for co-integration had distributions which were functionals of Wiener processes, although tests on co-integrating vectors were asymptotically normal. In particular, over-identification tests needed to be formulated after mapping to the space of I(0) variables to ensure that their distributions were not a mixture of these two types of distributions (see Hendry and Mizon 1992). Conditioning tests on the I(1) decision for the number of co-integrating relations allowed the tests to be treated as having conventional distributions.

Co-integration provided a conceptual framework for mapping to I(0) space and therefore we examined it as a data-reduction tool and investigated some of its wide-ranging implications. Tests for co-integration based on residuals from static regressions and on systems were derived. The Granger Representation Theorem linked co-integration to a variety of other representations, including error-correction mechanisms (ECMs), which have been widely used since the late 1970s.

This link in turn entails a new view of dynamics: lagged feedbacks and ECMs do not necessarily violate rationality in an I(1) world. Further, as in Davidson et al. (1978), the role of differencing is as a transform, which preserves co-integration, and not as a filter, which eliminates levels variables and hence loses co-integration. Conversely, omitting an ECM generally induces a negative moving-average error, a point elaborated upon below.

9.2. The Invariance of Co-integrating Vectors

Linear systems, perhaps formulated after suitable data transformations (such as logarithms) intended to make linearity a reasonable approximation, play a leading role in co-integration analysis. A linear system is invariant under non-singular linear transforms, but usually its parameters are altered by such transforms. Chapter 2 discussed the properties of linear autoregressive distributed lag (ADL) models for stationary data, relating transformations of ADLs to ECMs to demonstrate the equivalence of estimators of long-run multipliers from any of the transforms even though the parameters of the equation were altered. In I(1) processes, the corresponding result is that co-integration defines an invariant of a linear system, as we now show.

Consider an identified $n \times r$ co-integration matrix $\alpha$ in the I(1) system:


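$$\Delta x_t = \mu + \Gamma\Delta x_{t-1} + \gamma\alpha' x_{t-1} + \varepsilon_t \tag{1}$$

where $\varepsilon_t \sim \mathsf{IN}(0, \Sigma)$. The system in (1) has parameters $(\Gamma, \gamma, \alpha, \mu, \Sigma)$. Then $x_t$ is I(1) if and only if $\operatorname{rank}(\gamma_\perp'\Psi\alpha_\perp) = n - r$, where $\Psi$ is the mean-lag matrix defined in Chapter 8. Here $(\gamma : \gamma_\perp)$ has rank $n$, with $\gamma_\perp$ being $n \times (n - r)$ such that $\gamma_\perp'\gamma = 0$, and $(\alpha : \alpha_\perp)$ has rank $n$ with $\alpha_\perp'\alpha = 0$ for $\alpha_\perp$ of size $n \times (n - r)$. Pre-multiplying (1) by a known $n \times n$ non-singular matrix $B$ (so $|B| \neq 0$),

$$B\Delta x_t = \mu^* + \Gamma^*\Delta x_{t-1} + \gamma^*\alpha' x_{t-1} + \varepsilon^*_t, \tag{2}$$

where $\mu^* = B\mu$, $\Gamma^* = B\Gamma$, $\gamma^* = B\gamma$, and $\varepsilon^*_t = B\varepsilon_t$.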

The system in (2) has the same likelihood as (1), but with parameters $(\Gamma^*, \gamma^*, \alpha, \mu^*, \Sigma^*)$, where $\Sigma^* = B\Sigma B'$; an example of an admissible transform is any just-identified reformulation of (1). Only $\alpha$ is unaffected by the linear transform, and $\alpha' x_{t-1}$ remains the co-integrating combination, so $\alpha$ is an invariant parameter of the system.

The I(1) property of the system is also preserved, as follows. The mean-lag matrix becomes $\Psi^* = B\Psi$ and, letting $(\gamma^* : \gamma^*_\perp) = (B\gamma : (B')^{-1}\gamma_\perp)$ so that $\gamma^{*\prime}\gamma^*_\perp = 0$, then

$$\gamma^{*\prime}_\perp\Psi^*\alpha_\perp = \gamma_\perp' B^{-1}B\Psi\alpha_\perp = \gamma_\perp'\Psi\alpha_\perp,$$

and hence the two matrices have the same rank. The invariance of $\alpha$ is a natural property of reduced-rank systems and extends to I(2) processes and to conditional systems. Thus, for a given vector $x_t$, reduced forms, marginal models, conditional models, and structural forms can all be modelled with the same set of co-integration vectors.
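The invariance argument above can be checked numerically. The following is a minimal sketch, assuming randomly drawn matrices stand in for $\gamma$, $\alpha$, the mean-lag matrix $\Psi$ and the transform $B$; these values and the helper name `orth_complement` are illustrative assumptions, not taken from the text.

```python
import numpy as np

def orth_complement(a):
    """Basis for the orthogonal complement of the column space of a.

    Hypothetical helper; assumes a (n x r) has full column rank r.
    """
    r = a.shape[1]
    u, _, _ = np.linalg.svd(a, full_matrices=True)
    return u[:, r:]

rng = np.random.default_rng(0)
n, r = 4, 2
gamma = rng.standard_normal((n, r))   # adjustment coefficients (illustrative values)
alpha = rng.standard_normal((n, r))   # co-integrating vectors (illustrative values)
psi = rng.standard_normal((n, n))     # stand-in for the mean-lag matrix
B = rng.standard_normal((n, n))       # non-singular with probability one

gamma_perp = orth_complement(gamma)
alpha_perp = orth_complement(alpha)

# Transformed system: gamma* = B gamma, Psi* = B Psi, gamma*_perp = (B')^{-1} gamma_perp.
gamma_star = B @ gamma
psi_star = B @ psi
gamma_star_perp = np.linalg.inv(B).T @ gamma_perp

m = gamma_perp.T @ psi @ alpha_perp                  # gamma_perp' Psi alpha_perp
m_star = gamma_star_perp.T @ psi_star @ alpha_perp   # same product after the transform

long_run = gamma @ alpha.T            # gamma alpha' in the original system
long_run_star = gamma_star @ alpha.T  # (B gamma) alpha': alpha itself is unchanged

print(np.linalg.matrix_rank(m), np.linalg.matrix_rank(m_star))
```

Only the adjustment matrix is rescaled by $B$; the product $\gamma_\perp'\Psi\alpha_\perp$ is reproduced exactly (up to rounding), so the rank condition cannot be affected by the transform.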

9.3. Invariance of Co-integration Under Seasonal Adjustment

The co-integrating vector $\alpha$ is invariant to seasonal adjustment by a diagonal seasonal filter $S(L)$ which satisfies the scale-preserving property $S(1) = I$, as does a procedure like X-11. The results in this section are drawn from Ericsson, Hendry, and Tran (1992). It is assumed that $S(L)$ annihilates any deterministic seasonal dummies. The invariance result holds because $S(L)$ can be written as (see Chapter 5):

$$S(L) = I + S^*(L)\Delta.$$

We first show the co-integration relation between adjusted and unadjusted data and then establish the invariance of the co-integration matrix $\alpha$ of $x_t$. Let $x^a_t = S(L)x_t$ denote the seasonally adjusted vector variable. Then

$$x^a_t = S(L)x_t = x_t + S^*(L)\Delta x_t,$$


so that $x^a_t - x_t = S^*(L)\Delta x_t$. Hence $x^a_t$ and $x_t$ co-integrate with a unit coefficient to I(0) when $x_t$ is I(1). Most seasonal adjustment filters are two-sided and symmetric for most of the available sample, so that in fact $S^*(1) = 0$ and $S(L) = I + S^{**}(L)\Delta^2$. Then $x^a_t - x_t = S^{**}(L)\Delta^2 x_t$, so that co-integration to I(0) occurs between adjusted and unadjusted data even when $x_t$ is I(2). Alternatively, if $\Delta x_t$ is I(0) with a non-zero mean (as in GNP), then $x^a_t - x_t$ has a zero mean, as seems sensible for the seasonal residual. Generally, if $S(L) = I + S^{\dagger}(L)\Delta^d$, then $x^a_t$ and $x_t$ co-integrate with a unit coefficient to I(0) when $x_t$ is I($d$), and also have a zero mean difference when $x_t$ is I($d-1$). When $x^a_t - x_t$ is at most I(0), any co-integrating vector $\alpha'$ of either $x^a_t$ or $x_t$ is a co-integrating vector of the other, so co-integration parameters are unaffected by $S(L)$. Since $x^a_t = x_t + S^{**}(L)\Delta^2 x_t$, we have that

$$\alpha' x^a_t - \alpha' x_t = \alpha' S^{**}(L)\Delta^2 x_t,$$

and hence the difference is at least two orders of integration lower than that of $x_t$.

However, the adjustment parameter $\gamma$ is altered, as follows. Multiply (1) by $S(L)$ to give

$$\Delta x^a_t = S(L)\mu + S(L)\Gamma\Delta x_{t-1} + S(L)\gamma\alpha' x_{t-1} + S(L)\varepsilon_t.$$

By suitable addition and subtraction of lags and differences of $x^a_t$ on the right-hand side,

$$\Delta x^a_t = \mu + \Gamma\Delta x^a_{t-1} + \gamma\alpha' x^a_{t-1} + v^a_t. \tag{6}$$

When $S^{\dagger}(L)$ is a scalar times the unit matrix (the same filter for all $x_{it}$), $v^a_t = \varepsilon^a_t$, where $\varepsilon^a_t = S(L)\varepsilon_t$. In (6), it looks as if $\gamma$ is also an invariant, but as $v^a_t$ involves lagged, current, and future differences of $x_t$ of $d$th or higher order, as well as $\varepsilon^a_t$, one of $v^a_t$ or $\varepsilon_t$ is likely to be autocorrelated. Since $\alpha' x^a_{t-1}$ is an I(0) variable, conventional serial correlation biases apply to it, and hence $\gamma$ will usually be affected by whether or not the data are seasonally adjusted. The short-run dynamics will be changed when $\varepsilon_t$ is an innovation, because $v^a_t$ is correlated with $\Delta x^a_{t-1}$, and additional lags are needed to remove its autocorrelation.
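The filter properties used in this section can be illustrated with a small sketch. It assumes a centred 2x4 moving average for quarterly data as a simple stand-in for a symmetric two-sided filter such as X-11 (the actual X-11 weights are different); the function name `adjust` and the test series are illustrative assumptions.

```python
import numpy as np

# Centred 2x4 moving average: symmetric, two-sided, weights on lags -2..2.
# Used here only as a stand-in for a seasonal adjustment filter.
weights = np.array([1.0, 2.0, 2.0, 2.0, 1.0]) / 8.0
lags = np.arange(-2, 3)

scale_preserving = np.isclose(weights.sum(), 1.0)   # S(1) = 1
first_moment = float(np.dot(lags, weights))         # zero by symmetry, so S*(1) = 0

def adjust(x):
    """Apply the filter; only interior points, where all weights fit, are returned."""
    return np.convolve(x, weights, mode="valid")

t = np.arange(40, dtype=float)
linear = 3.0 + 0.5 * t                               # first difference has a non-zero mean
quadratic = 0.1 * t**2                               # second difference is constant
seasonal = np.tile([1.0, -1.0, 2.0, -2.0], 10)       # deterministic quarterly pattern

gap_linear = adjust(linear) - linear[2:-2]           # adjusted minus unadjusted
gap_quadratic = adjust(quadratic) - quadratic[2:-2]
removed_seasonal = adjust(seasonal)

print(scale_preserving, first_moment)
print(gap_linear.max(), gap_quadratic[0], np.abs(removed_seasonal).max())
```

Because the weights sum to one and are symmetric, the gap between adjusted and unadjusted data depends only on second differences: it vanishes for the linear trend and is a constant for the quadratic one, while the deterministic seasonal pattern is annihilated.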


9.4. Structured Time-series Models and Co-integration

An alternative approach to modelling integrated processes is offered by structured time-series models (see Harvey 1989).¹ In this section, we briefly explain their form and relate their data description properties to a co-integrated system. A simple univariate example is given by

$$y_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathsf{IN}(0, \sigma^2_\varepsilon), \tag{7}$$

$$\mu_t = \mu_{t-1} + v_t, \qquad v_t \sim \mathsf{IN}(0, \sigma^2_v), \tag{8}$$

and $E[\varepsilon_t v_s] = 0 \ \forall t, s$. Their form generally leads to the presence of negative moving-average errors, since (7) and (8) imply that

$$\Delta y_t = \varepsilon_t - \varepsilon_{t-1} + v_t. \tag{9}$$

The process $\{\varepsilon_t - \varepsilon_{t-1} + v_t\}$ can be re-expressed as a first-order moving average $\{e_t - \theta e_{t-1}\}$, where the moments of the derived process are identical to those of the original process and determine $\theta$. The variance of the former is $2\sigma^2_\varepsilon + \sigma^2_v$, and that of the latter, $\{e_t - \theta e_{t-1}\}$, is $(1 + \theta^2)\sigma^2_e$, and these must be equal to each other; their first-order auto-covariances are $-\sigma^2_\varepsilon$ and $-\theta\sigma^2_e$, and again these must be equal. All longer lag covariances vanish. Equating the first-order serial correlation coefficients of the two representations yields

$$\frac{-\theta}{1 + \theta^2} = \frac{-1}{2 + q}, \tag{10}$$

where $q = \sigma^2_v/\sigma^2_\varepsilon$. Equation (10) is a quadratic in $\theta$ that, given $q$, can be solved for a value of $\theta$ between 0 and 1. Finally, equating first-order covariances, $\sigma^2_e = \sigma^2_\varepsilon/\theta$. Thus, $\Delta y_t$ is I(0) and has a negative moving-average error with parameter $\theta$: $\Delta y_t = e_t - \theta e_{t-1}$.

There are close links between negative moving-average errors and error-correction mechanisms as remarked earlier (see e.g. Gregoir and Laroque 1991). Consider a simple co-integrated system,

To marginalize with respect to $z$ at all lags in (11), first rewrite it and then express it in terms of differences to obtain (14). In (14), $w_t = \lambda\beta v_{t-1} + \Delta u_t$ and, as with (9), when $\{v_t\}$ and $\{u_s\}$ are mutually independent, we can rewrite $w_t$ as $\xi_t - \tau\xi_{t-1}$, where equating moments yields $-\tau/(1 + \tau^2) = -1/(2 + s)$ for $s = \lambda^2\beta^2\sigma^2_v/\sigma^2_u$. Thus, a negative moving-average error also results from the marginalization providing $\lambda \neq 0$ (the unit root in (14) cancels when $\lambda = 0$ since then $s = 0$ and so $\tau = 1$). If (7) and (8) allowed for a short-run dynamic element, the observed outcome would be similar to that entailed by (14).

¹ Harvey calls such models 'structural', but as that word is heavily over-used in econometrics, we have substituted 'structured'.
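The moment-matching step in this section reduces to solving $\theta/(1 + \theta^2) = 1/(2 + q)$, that is $\theta^2 - (2 + q)\theta + 1 = 0$, and taking the root inside the unit interval; the same relation with $s$ in place of $q$ gives $\tau$. The sketch below works this through for illustrative variances; the function names and the numerical values are assumptions made for the example.

```python
import math

def ma_parameter(q):
    """Invertible root of theta / (1 + theta**2) = 1 / (2 + q), for q >= 0."""
    b = 2.0 + q
    return (b - math.sqrt(b * b - 4.0)) / 2.0

def implied_moments(sigma2_eps, sigma2_v):
    """Return (theta, sigma2_e) of the MA(1) matching eps_t - eps_{t-1} + v_t."""
    q = sigma2_v / sigma2_eps
    theta = ma_parameter(q)
    sigma2_e = sigma2_eps / theta      # from equating first-order autocovariances
    return theta, sigma2_e

# Illustrative values: sigma2_eps = 1 and sigma2_v = 0.5, so q = 0.5.
theta, sigma2_e = implied_moments(sigma2_eps=1.0, sigma2_v=0.5)
variance_ma = (1.0 + theta**2) * sigma2_e   # should match 2*sigma2_eps + sigma2_v
autocov_ma = -theta * sigma2_e              # should match -sigma2_eps

tau_no_feedback = ma_parameter(0.0)         # s = 0: the moving-average root is one

print(theta, sigma2_e, variance_ma, autocov_ma, tau_no_feedback)
```

With $q = 0.5$ the quadratic has roots 0.5 and 2, so $\theta = 0.5$ and $\sigma^2_e = 2$, which reproduces the variance 2.5 and first-order autocovariance $-1$ of the original process. Setting the ratio to zero gives a root of one, the case in which the unit root cancels.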

A structured time-series model that generalizes (8) by including a time-varying slope generates an I(2) series.

Thus, as long as $\sigma^2_\eta \neq 0$, the slope $\zeta_t$ varies over time and hence, from (7), so does the growth of $y_t$, which is the content of (18). When $\sigma^2_\eta = 0$, we have $\zeta_t = \zeta_{t-1} = \zeta_0$, say, so that $\zeta_0$ is the mean growth rate $E[\Delta y_t] = g_y = \zeta_0$. When $\sigma^2_\eta \neq 0$, (18) entails changes in $E[\Delta y_t] = g_y(t)$ over time and generates $y_t$ as I(2).

The alternative possibility to evolving growth rates is that of changes in means over time, so that $g_y(t)$ takes different values in different epochs. Such behaviour could be approximated by a model in which the distribution $D_\eta(\eta_t)$ was non-normal, with a large mass at zero and small probabilities of large values. Then $\zeta_t$ would usually be constant, but would occasionally jump to a new level. Thus, it is unsurprising that discrimination between integrated and regime-change models is difficult (see Perron 1989). Conversely, there are close affinities between structured time-series and econometric models for integrated data. Indeed, several researchers have suggested switching from a unit-root null to one of I(0) or co-integration. For example, one might seek to test $\sigma^2_v = 0$ when $\sigma^2_\eta = 0$ (so $\zeta_t = \zeta_0$) as a test for a unit root (see e.g. Kwiatkowski, Phillips, and Schmidt (1991) and Leybourne and McCabe (1992)).
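As a worked illustration of the time-varying slope case, the following sketch assumes the standard local linear trend form associated with Harvey (1989). The level symbol $\mu_t$, the dating of the slope as $\zeta_{t-1}$, and the disturbance names are assumptions made here for the example and may differ from the displays in the original text.

```latex
\begin{align*}
  y_t     &= \mu_t + \varepsilon_t, \\
  \mu_t   &= \mu_{t-1} + \zeta_{t-1} + v_t, \\
  \zeta_t &= \zeta_{t-1} + \eta_t, \qquad \operatorname{var}[\eta_t] = \sigma^2_\eta, \\[4pt]
  \Delta y_t   &= \zeta_{t-1} + v_t + \Delta\varepsilon_t, \\
  \Delta^2 y_t &= \eta_{t-1} + \Delta v_t + \Delta^2\varepsilon_t .
\end{align*}
```

Under these assumptions, $\sigma^2_\eta = 0$ gives $\zeta_t = \zeta_0$ and $E[\Delta y_t] = \zeta_0$, so $y_t$ is I(1) with constant mean growth, whereas $\sigma^2_\eta \neq 0$ makes $\Delta y_t$ inherit the random-walk slope, so that $y_t$ needs to be differenced twice to reach I(0).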

9.5. Recent Research on Integration and Co-integration

During the last decade there has been an explosion of research on integrated and co-integrated processes. Dozens of papers appeared while we were writing the book, and many will appear between completion of


writing and its appearance in print. With such a rapidly moving target, we focused on central research topics to explain what seem likely to remain the major concepts, tools, techniques, models, methods, and tests.

Consequently, some research areas received scant treatment, including other estimation methods for co-integration vectors, as well as studies of their properties: see inter alia Ahn and Reinsel (1988), Bewley, Orden, and Fisher (1991), Boswijk (1991), Box and Tiao (1977), Engle and Yoo (1991), Phillips (1991), Saikkonen (1991). Some comparative Monte Carlo studies of finite sample behaviour and related econometric theory have been noted, but others appear apace and we can expect many more over the next few years clarifying the choice of method, and the likely problems confronting each proposal. Researchers will also study the problems of joint selection of, e.g., lag length and the number of co-integration vectors. Another research topic is the order in which hypothesis tests should be conducted. Intuition suggests that it should be constancy, lag length, co-integration, congruence of the system, weak exogeneity, structural restrictions, encompassing, intercepts (and whether they lie in the co-integration space), etc. However, the distributions of tests of the first hypothesis are affected by the presence of co-integration, and it may well be difficult to implement a good order, although if the data are indeed I(1), tests for lag length based on lagged first differences will be in I(0) space. One recommendation concerning choices of methods and estimators that emerged as we proceeded was for a systems approach in preference to single-equation modelling until weak exogeneity has been ascertained.
Further developments have occurred in testing for unit roots in univariate processes such as instrumental variables tests and Durbin-Hausman tests (see e.g. Hall 1991, Choi 1992, Schmidt and Phillips 1992, Kremers et al. 1992; and Banerjee and Hendry 1992 for a summary). However, the previous recommendation of modelling the system rather than using univariate representations brings into question the point of conducting unit-root tests in marginal processes. One purpose might be to reject the null of integration against trend stationarity. Here, the available tests are known to have relatively low power. In particular, investigators often use $t(\rho = 1)$ rather than $T(\hat\rho - 1)$ (see Sect. 4.6) although Monte Carlo evidence shows the latter to have higher power. In any case, failure to reject the null does not entail accepting it as 'true'. For example, univariate unit-root tests can reflect other non-modelled forms of non-stationarity such as regime shifts, and inherent non-stationarity in mean and variance functions. Further, variables inherit unit roots from marginalizing with respect to other unit-root processes on which they depend. Thus, failure to reject a null of a unit root tells us little about the persistence of shocks to the variable


being considered in isolation or in a small, highly marginalized system as discussed in Campbell and Perron (1991).

A second purpose might be to check that variables in a system are not I(2) (see e.g. Pantula 1991), so the null would be a unit root in the differences of the original variables. However, if the intention is to model the system, then it seems better to proceed from the general to the specific here as well and test the necessary rank conditions on the mean lag matrix of the system (see following (1) above). Nevertheless, sequential tests in this context raise some new problems. For example, the outcome of a pretest for a unit root (i.e. reject or not reject) affects the critical values used to test economic hypotheses, so the possibility of Type-I errors at the first stage may lead to size or power distortions at the second stage when conventional initial values are used.

Finally, a unit root may be of interest in order to validate a specific estimator (e.g. Engle-Granger) by appealing to super-consistency. Here a unit root test may be of descriptive value as it depends on the ratio of the covariance of the first difference with the level to the variance of the level, and so should be close to zero when there is a unit root, although we showed in Section 3.6 that similar distributions will result for integrated and near-integrated processes. The ratio of the variance of the first difference to that of the level is another index of the rapidity of accrual of information (either from trends or from drift).

Other likely research interests concern tests of structural, long-run, exogeneity, causality, and encompassing hypotheses (see e.g. Boswijk 1991, Hendry and Mizon 1992, and Banerjee and Hendry 1992).
Modelling I(2) systems is in its infancy (see Johansen 1991b), but has close links to multi-co-integration and the analysis of stock-flow relations (see Granger and Lee 1990). This last development provides an additional explanation for such phenomena as the role of inflation in real money demand equations: if nominal money and the price level are I(2), and real money and inflation are I(1), then the last may be needed to create an I(0) co-integration vector. Extensive developments also seem likely to occur in estimation and dynamic modelling, since for many objectives in econometrics, including forecasting and policy, the focus of interest must be all parameters of the system and not just the long-run parameters.

In co-integrated processes, weak exogeneity of the conditioning variables for the parameters of interest remains as vital as it did in stationary processes, even for the long-run parameters. Thus, it is important to test for the presence of co-integrating vectors in other equations as discussed in Chapter 8. Doing so, however, implies system modelling even for an LM test (see Boswijk 1991). Further, Urbain (1992) shows that tests for orthogonality between regressors and errors lack power to detect such a weak exogeneity failure.
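The two univariate statistics contrasted in this section, the coefficient statistic $T(\hat\rho - 1)$ and the t-ratio for $\rho = 1$, can both be computed from the same first-order regression. The sketch below assumes the simplest case with no constant or trend, and the function name `dickey_fuller_stats` is hypothetical. Neither statistic has a conventional distribution under the null, so no p-values are attempted.

```python
import numpy as np

def dickey_fuller_stats(y):
    """Return (rho_hat, T*(rho_hat - 1), t-ratio) for y_t = rho*y_{t-1} + e_t.

    Hypothetical helper: no constant, no trend, no lag augmentation.
    """
    y = np.asarray(y, dtype=float)
    lag, diff = y[:-1], np.diff(y)
    T = lag.size
    # rho_hat - 1 is the (uncentred) cross-moment of the first difference with
    # the lagged level, divided by the second moment of the lagged level.
    rho_minus_one = np.dot(lag, diff) / np.dot(lag, lag)
    resid = diff - rho_minus_one * lag
    s2 = np.dot(resid, resid) / (T - 1)
    se = np.sqrt(s2 / np.dot(lag, lag))
    return 1.0 + rho_minus_one, T * rho_minus_one, rho_minus_one / se

rng = np.random.default_rng(1)
walk = np.cumsum(rng.standard_normal(500))       # I(1): unit root

shocks = rng.standard_normal(500)
stationary = np.zeros(500)
for t in range(1, 500):                          # I(0): AR(1) with rho = 0.5
    stationary[t] = 0.5 * stationary[t - 1] + shocks[t]

stats_walk = dickey_fuller_stats(walk)
stats_stationary = dickey_fuller_stats(stationary)
print(stats_walk)
print(stats_stationary)
```

For the random walk the moment ratio is close to zero, as the descriptive interpretation in the text suggests, whereas for the stationary series both statistics are large and negative.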


9.6. Reinterpreting Econometrics Time-series Problems

Integration and co-integration also lead to the re-interpretation of many extant econometrics time-series problems. We consider a few of these, commencing with multi-collinearity.

9.6.1. Multi-collinearity

When $x_t \sim$ I(1) and $\alpha' x_t \sim$ I(0), then including all the elements of $x_t$ or $x_{t-1}$ as regressors in a single equation will induce an apparently serious collinearity problem. The second moment matrix $(X'X)$ will be $O(T^2)$, whereas the linear combination $(\alpha' X'X\alpha)$ will be $O(T)$. Consequently, $(T^{-2}X'X)$ will converge on a singular matrix. Generally, it is inadvisable to 'solve' this problem by deleting variables; for I(1) data, doing so jeopardizes the possibility of co-integration. If the dependent variable is I(0), then the solution is to find the co-integrating combination $\alpha' x_t$ or $\alpha' x_{t-1}$ and use that as an explanatory variable. This strategy corresponds to the usual recommendation of transforming to near-orthogonal and interpretable variables. In other cases, where the dependent variable is I(1) but is co-integrated with a subset of $x_t$, say, elimination may be sensible, but Wiener-based critical values should be used for variables that cannot be written implicitly as an I(0) function (see Chapter 7). These ideas are related to the earlier technique of confluence analysis in Hendry and Morgan (1989).
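The orders of magnitude behind this collinearity can be seen in a small simulation. The sketch below assumes a bivariate system built so that $\alpha = (1, -1)'$ is the co-integrating vector; the sample size, seed, and data-generating process are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 5000
x1 = np.cumsum(rng.standard_normal(T))     # I(1) regressor (random walk)
x2 = x1 + rng.standard_normal(T)           # co-integrated with x1 by construction
X = np.column_stack([x1, x2])
alpha = np.array([1.0, -1.0])              # co-integrating vector (by construction)

scaled_moments = X.T @ X / T**2            # T^{-2} X'X
eigenvalues = np.linalg.eigvalsh(scaled_moments)   # ascending order
eigen_ratio = eigenvalues[0] / eigenvalues[-1]     # near zero: almost singular

combo = X @ alpha                          # alpha'x_t, an I(0) series here
combo_second_moment = combo @ combo / T    # stays O(1), so alpha'X'X alpha is O(T)

print(eigenvalues, eigen_ratio, combo_second_moment)
```

The near-zero eigenvalue of the scaled moment matrix points in the direction of $\alpha$, which is why replacing the levels by the co-integrating combination, not deleting a regressor, is the transformation that removes the apparent collinearity.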

9.6.2. Measurement Errors

Measurement errors are a second problem where treatment recommendations can differ in the light of data being integrated. When $x_t \sim$ I(1), then $\Delta x_t \sim$ I(0), and if the data are in logarithms, then the changes are growth rates. If observed growth rates are to be at all sensible, then the error with which they are measured must not be I(1) or higher. Letting $x^o_t$ denote the observed series, one possible model is $\Delta x^o_t = \Delta x_t + u_t$, where $u_t$ is I(0), so that

$$x^o_t = x_t + (x^o_0 - x_0) + \sum_{j=1}^{t} u_j. \tag{20}$$

If the measurement error in levels is denoted $w_t = x^o_t - x_t$, then $w_t$ is apparently I(1). This consideration therefore only rather weakly bounds the scale of measurement error. Indeed, if the DGP is of the form that $\Delta x_t = \varepsilon_t$, then $u_t$ and $\varepsilon_t$ are essentially indistinguishable in models of $x^o_t$.


However, when $\alpha' x_t$ is an I(0) co-integrating combination, then, on pre-multiplying (20) by $\alpha'$,

$$\alpha' x^o_t = \alpha' x_t + \alpha'(x^o_0 - x_0) + \sum_{j=1}^{t}\alpha' u_j, \quad\text{so that}\quad \Delta\alpha' x^o_t = \Delta\alpha' x_t + \alpha' u_t.$$

Since $\Delta\alpha' x_t$ is I(-1), $\alpha' x^o_t$ will be I(0) only if $\alpha' u_t$ is I(-1). Thus, I(0) measurement errors on growth rates must co-integrate to I(-1) with co-integration matrix $\alpha$ if the observed series are to co-integrate in the same way as the latent variables. Nowak (1990) calls a failure to observe $\alpha' x^o_t$ being I(0) when $\alpha' x_t$ is I(0) a problem of 'hidden co-integration'. However, many co-integration relationships, such as consumption and income, are likely to have connected measurement errors. Governmental statistical bureaux may even correct the data on such series in a related way to avoid divergence, which suggests an I(0) measurement error for, say, the ratio between them.

An alternative model of measurement error for logarithms is one with a constant-percentage standard deviation, so that the size of the absolute error grows with the variable. This leads to $x^o_t = x_t + v_t$ where $\operatorname{var}[v_t]$ is constant. Such a measurement error would not impede co-integration analyses, in that inconsistency would not result as in an I(0) setting, but would have the usual impact in I(0) representations since $\alpha' v_t$ could be I(0). An important instance is when $v_t$ is an expectations error, in which case the distributions of the long-run parameter estimates are unaffected but short-run parameter estimates may be biased (see Engle and Granger 1987, and Hendry and Neale 1988).

9.6.3. Incorrectly Omitted and Included Variables

When a relevant I(1) variable is omitted from a relationship, I(0) co-integration is impossible and serious biases can result.
In particular, for an I(0) dependent variable, all the remaining I(1) regressors may cease to be significant given the appropriate critical values, leading the model to collapse to one in differences. Including an irrelevant I(1) variable or vector will probably lower the efficiency of estimates of the co-integrating vectors but should be detectable in large enough samples, with the usual possibility of Type-I errors.

If one incorrectly includes an I(0) variable in a co-integration vector in a static regression, its coefficient will be biased when that variable is correlated with omitted I(0) variables. The consequences in the maximum likelihood procedure seem less serious as it is possible to test for a unit vector (i.e. one of the form $(0 \ldots 0\ 1\ 0 \ldots 0)$) lying in the co-integration space (see Sect. 8.5.2). However, conditioning on the estimated coefficients of I(0) variables is inappropriate, and spuriously small confidence intervals for the remaining I(0) effects will usually result. Finally, excluding an I(0) variable from a model will not affect the long-run parameter estimates in large samples, but will usually bias the short-run parameters as in conventional econometric derivations.

9.6.4. Parameter Change in Integrated Processes

The most serious problem arising from possible parameter change in econometrics is the predictive failure of models that fail to incorporate the necessary effects. Unfortunately, it is difficult even to diagnose the problem since it is easy to confuse an I(1) process with an I(0) process subject to shifts (see e.g. Perron 1989, Rappoport and Reichlin 1989, and Hendry and Neale 1991). Indeed, as noted in Section 9.4 above, structured time-series models implement the latter and produce the former. Whether it is more useful to view economic data as integrated (in the sense of having a unit root in the autoregressive representation subject to regular small shocks) or as subject to large and persistent regime shifts (the abolition of fixed exchange rates following Bretton Woods, or their reinstatement in the ERM; the formation of OPEC; the denationalization of large sectors of an economy; new forms of monetary control or their removal; financial and technological innovation; etc.) remains to be seen. However, both types are bound to play important roles, and although we have focused on the former in this book, understanding economic behaviour will necessitate modelling both integrated data and breaks appropriately.
Ex ante, structural breaks can lead to bad predictions, which I(1) data alone do not seem to cause. Ex post, testing for parameter change in I(1) data must allow for a wide range of possible choices for break points. Useful developments are occurring in deriving appropriate tests based on Wiener distributions, and decision taking in this area should improve rapidly (see Nyblom 1989, Chu and White 1991, 1992, Andrews and Ploberger 1991, Hansen 1991, and Lin and Terasvirta 1991).

9.6.5. Conditional Models of Co-integrated Processes

Chapter 8 emphasized the maximum-likelihood approach to testing for and estimating co-integrating vectors in the context of a VAR. This imposed the minimum conditioning assumptions and allowed a clear focus on the properties of co-integration estimation. However, many papers have begun to develop approaches in the context of systems that


treat a subset of variables as weakly exogenous for all the parameters of interest: see Johansen (1992a, 1992b), Johansen and Juselius (1990), and Boswijk (1991), inter alia. Related work includes that on testing for Granger causality in co-integrated systems (see Toda and Phillips 1991, Mosconi and Giannini 1992, and Hunter 1992).

For a long time, econometricians have 'talked' co-integration without realizing it: for example, Klein (1953) discusses various great ratios of economics, namely consumption-income, capital-output, wage share in total income, and so on, implicitly assuming a stationary, or I(0), world. From our perspective, given that the components of these relations are I(1), Klein's ratios are early examples of co-integration hypotheses. In a log-linear multivariate analysis, these postulate particular forms for the rows of the co-integration matrix, highlighting the potential confirmatory role of the methods discussed in Chapter 8. Econometricians need no longer simply assume long-run equilibrium relations since it is feasible to test for their existence. Once that is established the analysis is reduced from I(1) to I(0) space, allowing the application of well established tools.

Thus, the recent focus on conditional or open models takes us back to the 1970s in an important sense, with the links between economic theory or long-run equilibrium reasoning and data modelling having been placed on a sounder footing.

As we have shown in this book, there still remain many difficult theoretical and empirical problems to be overcome. However, the literature on co-integration, error correction and the econometric analysis of non-stationary data has enabled us to gain many important insights into modelling relationships among integrated variables. This has enhanced rather than replaced existing methods of dynamic econometric modelling of economic time series.

References

ABADIR, K. M. (1992), 'The Limiting Distribution of the Autocorrelation Coefficient Under a Unit Root', Annals of Statistics, forthcoming.
AHN, S. K., and REINSEL, G. C. (1988), 'Nested Reduced-Rank Autoregressive Models for Multiple Time Series', Journal of the American Statistical Association, 83: 849-56.
ANDERSON, T. W. (1958), An Introduction to Multivariate Statistical Analysis, John Wiley, New York.
—— (1976), 'Estimation of Linear Functional Relationships: Approximate Distributions and Connections with Simultaneous Equations in Econometrics (with discussion)', Journal of the Royal Statistical Society B, 38: 1-36.
ANDREWS, D. W. K., and PLOBERGER, W. (1991), 'Optimal Tests of Parameter Constancy', mimeo., Yale University Press.
BANERJEE, A., and DOLADO, J. (1987), 'Do We Reject Rational Expectations Models Too Often? Interpreting Evidence using Nagar Expansions', Economics Letters, 24: 27-32.
—— —— (1988), 'Tests of the Life Cycle-Permanent Income Hypothesis in the Presence of Random Walks: Asymptotic Theory and Small Sample Interpretations', Oxford Economic Papers, 40: 610-33.
—— —— and GALBRAITH, J. W. (1990a), 'Orthogonality Tests with De-trended Data: Interpreting Monte Carlo Results using Nagar Expansions', Economics Letters, 32: 19-24.
—— —— HENDRY, D. F., and SMITH, G. W. (1986), 'Exploring Equilibrium Relationships in Econometrics through Static Models: Some Monte Carlo Evidence', Oxford Bulletin of Economics and Statistics, 48: 253-77.
—— GALBRAITH, J. W., and DOLADO, J. (1990b), 'Dynamic Specification with the General Error-Correction Form', Oxford Bulletin of Economics and Statistics, 52: 95-104.
—— and HENDRY, D. F. (eds.) (1992), Testing Integration and Cointegration, special issue of the Oxford Bulletin of Economics and Statistics, 54, 225-55.
BARDSEN, G. (1989), 'The Estimation of Long-Run Coefficients from Error Correction Models', Oxford Bulletin of Economics and Statistics, 51: 345-50.
BEWLEY, R. A. (1979), 'The Direct Estimation of the Equilibrium Response in a Linear Model', Economics Letters, 3: 357-61.
BEWLEY, R. A., ORDEN, D., and FISHER, L. (1991), 'Box-Tiao and Johansen Canonical Estimators of Cointegrating Vectors', University of New South Wales, Economics Discussion Paper, 91/5.
BHARGAVA, A. (1986), 'On the Theory of Testing for Unit Roots in Observed Time Series', Review of Economic Studies, 53: 369-84.
BILLINGSLEY, P. (1968), Convergence of Probability Measures, John Wiley, New York.
BOSSAERTS, P. (1988), 'Common Non-Stationary Components of Asset Prices', Journal of Economic Dynamics and Control, 12: 347-64.


BOSWIJK, H. P. (1991), 'Testing for Cointegration in Structural Models', University of Amsterdam, Econometrics Discussion Paper AE7/91.
—— (1992), 'Efficient Inference on Cointegration Parameters in Structural Error Correction Models', University of Amsterdam, Econometrics Discussion Paper,
—— and FRANSES, P. H. (1992), 'Dynamic Specification and Cointegration', Oxford Bulletin of Economics and Statistics, 54: 369-81.
BOX, G. E. P., and JENKINS, G. M. (1970), Time Series Analysis Forecasting and Control, Holden-Day, San Francisco.
—— and TIAO, G. C. (1977), 'A Canonical Analysis of Multiple Time Series', Biometrika, 64: 355-65.
BRANDNER, P., and KUNST, R. (1990), 'Forecasting Vector Autoregressions: The Influence of Cointegration', Memorandum 265, IAS, Vienna.
CAMPBELL, B., and DUFOUR, J.-M. (1991), 'Over-Rejections in Rational Expectations Models: A Non-Parametric Approach to the Mankiw-Shapiro Problem', Economics Letters, 35: 285-90.
CAMPBELL, J. Y., and PERRON, P. (1991), 'Pitfalls and Opportunities: What Macroeconomists Should Know About Unit Roots', in Blanchard, O. J. and Fischer, S. (eds), NBER Economics Annual 1991, MIT Press.
—— and SHILLER, R. J. (1991), 'Cointegration and Tests of Present Value Models', Journal of Political Economy, 95: 1062-88.
CHAMBERS, M. J. (1991), 'A Note on Forecasting in Co-Integrated Systems', Department of Economics, University of Essex.
CHAN, N. H., and WEI, C. Z. (1988), 'Limiting Distributions of Least-Squares Estimates of Unstable Autoregressive Processes', Annals of Statistics, 16: 367-401.
CHOI, I. (1992), 'Durbin-Hausman Tests for Unit Roots', Oxford Bulletin of Economics and Statistics, 54: 289-304.
CHONG, Y. Y., and HENDRY, D. F. (1986), 'Econometric Evaluation of Linear Macroeconomic Models', Review of Economic Studies, 53: 671-90.
CHOW, G. C. (1960), 'Tests of Equality Between Sets of Coefficients in Two Linear Regressions', Econometrica, 52: 211-22.
CHU, C.-S. J., and WHITE, H. (1991), 'Testing for Structural Change in some Simple Time Series Models', Discussion Paper 91-6, University of California, San Diego, Dept. of Economics.
—— —— (1992), 'A Direct Test for Changing Trend', Journal of Business and Economic Statistics, 10: 289-99.
CLEMENTS, M. P., and HENDRY, D. F. (1991), 'On the Limitations of Mean Square Error Forecast Comparisons', Discussion paper 138, Oxford Institute of Economics and Statistics. Forthcoming, Journal of Forecasting.
—— —— (1992), 'Forecasting in Cointegrated Systems', Discussion paper 139, Oxford Institute of Economics and Statistics.
DAVIDSON, J. E. H., HENDRY, D. F., SRBA, F., and YEO, S. (1978), 'Econometric Modelling of the Aggregate Time-Series Relationship Between Consumers' Expenditure and Income in the United Kingdom', Economic Journal, 88: 661-92.
DAVIDSON, R., and MACKINNON, J. G. (1992), Estimation and Inference in Econometrics, Oxford University Press.
DEATON, A. S., and MUELLBAUER, J. N. J. (1980), Economics and Consumer Behavior, Cambridge University Press.
DICKEY, D. A. (1976), 'Estimation and Hypothesis Testing for Nonstationary Time Series', Ph.D. dissertation, Iowa State University.
—— and FULLER, W. A. (1979), 'Distribution of the Estimators for Autoregressive Time Series with a Unit Root', Journal of the American Statistical Association, 74: 427-31.
—— —— (1981), 'Likelihood Ratio Statistics for Autoregressive Time Series with a Unit Root', Econometrica, 49: 1057-72.
—— and PANTULA, S. G. (1987), 'Determining the Order of Differencing in Autoregressive Processes', Journal of Business and Economic Statistics, 15: 455-61.
—— and SAID, S. E. (1981), 'Testing ARIMA(p, 1, q) against ARMA(p + 1, q)', Proceedings of the Business and Economic Statistics Section, American Statistical Association, 28: 318-22.
—— BELL, W. R., and MILLER, R. B. (1986), 'Unit Roots in Time Series Models: Tests and Implications', American Statistician, 40: 12-26.
—— HASZA, D. P., and FULLER, W. A. (1984), 'Testing for a Unit Root in Seasonal Time Series', Journal of the American Statistical Association, 79: 355-67.
DURLAUF, S. N., and PHILLIPS, P. C. B. (1988), 'Trends versus Random Walks in Time Series Analysis', Econometrica, 56: 1333-54.
ENGLE, R. F., and GRANGER, C. W. J. (1987), 'Co-integration and Error Correction: Representation, Estimation and Testing', Econometrica, 55: 251-76.
—— and YOO, B. S. (1987), 'Forecasting and Testing in Co-integrated Systems', Journal of Econometrics, 35: 143-59.
—— —— (1991), 'Cointegrated Economic Time Series: An Overview with New Results', in R. F. Engle and C. W. J. Granger (eds.), Long-Run Economic Relationships, Oxford University Press, 237-66.
—— GRANGER, C. W. J., and HALLMAN, J. (1988), 'Merging Short- and Long-run Forecasts: An Application of Seasonal Co-integration to Monthly Electricity Sales Forecasting', Journal of Econometrics, 40: 45-62.
—— —— HYLLEBERG, S., and LEE, H. S. (1993), 'Seasonal Co-Integration: The Japanese Consumption Function', Journal of Econometrics, 55: 275-98.
—— HENDRY, D. F., and RICHARD, J.-F. (1983), 'Exogeneity', Econometrica, 51: 277-304.
ERICSSON, N. R. (1992), Cointegration, Exogeneity and Policy Analysis, Special Issue, Journal of Policy Modeling, 14, 3 and 4.
—— CAMPOS, J., and TRAN, H.-A. (1990), 'PC-GIVE and David Hendry's Econometric Methodology', Revista de Econometrica, X, 7-117.
—— and HENDRY, D. F. (1985), 'Conditional Econometric Modelling: An Application to New House Prices in the United Kingdom', in Atkinson, A. C. and Fienberg, S. E. (eds), A Celebration of Statistics, Springer-Verlag, 251-85.
—— HENDRY, D. F., and TRAN, H.-A. (1992), 'Cointegration, Seasonality, Encompassing and the Demand for Money in the United Kingdom', Discussion Paper, Board of Governors of the Federal Reserve System, Washington, DC.
ERMINI, L., and GRANGER, C. W. J. (1991), 'Some Generalizations on the

314 Reference

s

Algebra o f 7(1 ) Processes' , Workin g Paper , Departmen t o f Economics , University of Hawaii at Manoa . ERMINI, L. , an d HENDRY , D . F . (1991) , 'Lo g Incom e vs . Linea r Income : A n Application o f th e Encompassin g Principle' , Workin g Pape r no . 91-11 , De partment o f Economics, Universit y of Hawaii at Manoa. EVANS, G . B . A. , an d SAVIN , N . E . (1981) , 'Testin g fo r Uni t Roots : 1' , Econometrica, 49: 753-79. (1984), Testin g for Unit Roots : 2 ' Econometrica, 52 : 1241-69. FRIEDMAN, M. , an d SCHWARTZ , A . J . (1982) , Monetary Trends i n th e United States and the United Kingdom: Their Relation to Income, Prices, and Interest Rates, 1867-1975, Universit y o f Chicago Press . FULLER, W . A . (1976) , Introduction t o Statistical Time Series, John Wiley , New York. GALBRAITH, J . W. , DOLADO , J. , an d BANERJEE , A . (1987) , 'Rejection s o f Orthogonality i n Rationa l Expectation s Models: Furthe r Mont e Carl o Result s for a n Extende d Se t of Regressors', Economics Letters, 25 : 243-7. GANTMACHER, F . R . (1959) , Applications o f th e Theory o f Matrices, Inter science, Ne w York. GEL'FAND, J . M . (1967) , Lectures on Linear Algebra, Interscience , New York. GEWEKE, J . (1986) , 'Th e Super-Neutralit y of Mone y i n th e Unite d States : A n Interpretation o f the Evidence' , Econometrica, 54 : 1-21 . GHYSELS, E . (1990) , 'O n th e Economic s an d Econometric s o f Seasonally' , paper presente d t o th e Sixt h World Congress o f the Econometri c Society. GONZALO, J . (1990) , 'Compariso n o f Fiv e Alternativ e Method s o f Estimatin g Long-Run Equilibriu m Relationships' , Discussio n Paper , Universit y of Cali fornia a t Sa n Diego. GRANGER, C . W . J . (1981) , 'Som e Properties o f Time Serie s Dat a an d thei r Us e in Econometri c Mode l Specification' , Journal of Econometrics, 16: 121-30. (1983), 'Forecastin g Whit e Noise', i n A. Zellne r (ed.) 
, Applied Time Series Analysis o f Economic Data, Bureau o f the Census , Washington, DC, 308-14 . (1986), 'Development s i n th e Stud y of Co-integrate d Economi c Variables' , Oxford Bulletin of Economics an d Statistics, 48: 213-28. -and HALLMAN , J . (1991) , 'Th e Algebr a o f 1(1) Processes' , Journal of Time Series Analysis, 12 : 207-24. -and LEE , T.-H. (1990) , 'Multicointegration' , i n G . F . Rhode s Jr . an d T . B . Fomby (eds.) , Advances i n Econometrics, JA I Press , Greenwic h Conn. , 71-84. and NEWBOLD , P . (1974) , 'Spuriou s Regression s i n Econometrics' , Journal of Econometrics, 2: 111-20 . -(1977), 'Th e Tim e Serie s Approac h t o Econometri c Mode l Building' , in C . A . Sim s (ed.) , Ne w Methods i n Business Cycle Research, Federa l Reserve Ban k o f Minneapolis. -(1978), Forecasting Economic Time Series, Academi c Press , Ne w York. — and WEISS , A . A . (1983) , 'Time-Serie s Analysi s o f Error-Correctio n Models', i n S . Karlin , T . Amemiya , an d L . A . Goodma n (eds.) , Studies i n Econometrics, Time Series an d Multivariate Statistics, Academi c Press , Ne w York.

GREGOIR, S., and LAROQUE, G. (1991), 'Multivariate Integrated Time Series: A General Error Correction Representation with Associated Estimation and Test Procedures', Discussion paper 53/G305, INSEE, Paris.
GRIMMET, G. R., and STIRZAKER, D. R. (1982), Probability and Random Processes, Oxford University Press.
HALDRUP, N., and HYLLEBERG, S. (1991), 'Integration, Near-Integration and Deterministic Trends', Discussion Paper no. 1991-15, Aarhus University, Denmark.
HALL, A. (1989), 'Testing for a Unit Root in the Presence of Moving Average Errors', Biometrika, 79: 49-56.
-- (1990), 'Testing for a Unit Root in Time Series using Instrumental Variables Estimators with Pre-test Data-Based Model Selection', Discussion Paper, North Carolina State University.
-- (1991), 'Model Selection and Unit Root Tests based on Instrumental Variables Estimators', Discussion paper, North Carolina State University.
HALL, A. D., ANDERSON, H. M., and GRANGER, C. W. J. (1992), 'A Cointegration Analysis of Treasury Bill Yields', Review of Economics and Statistics, 74: 116-25.
HALL, P., and HEYDE, C. C. (1980), Martingale Limit Theory and Applications, Academic Press, New York.
HALL, R. E. (1978), 'Stochastic Implications of the Life-Cycle Permanent Income Hypothesis', Journal of Political Economy, 86: 971-87.
HAMMERSLEY, J. M., and HANDSCOMB, D. C. (1964), Monte Carlo Methods, Methuen, London.
HANSEN, B. E. (1991), 'Tests for Parameter Instability in Regressions with I(1) Processes', Discussion paper, University of Rochester.
-- (1992), 'Testing for Parameter Instability in Linear Models', Journal of Policy Modeling, 14: 517-33.
HARVEY, A. C. (1989), Forecasting, Structural Time Series Models and the Kalman Filter, Cambridge University Press.
HASZA, D. P., and FULLER, W. A. (1982), 'Testing for Nonstationary Parameter Specifications in Seasonal Time-Series Models', Annals of Statistics, 10: 1209-16.
HENDRY, D. F. (1984), 'Monte Carlo Experimentation in Econometrics', ch. 16 in Z. Griliches and M. D. Intrilligator (eds.), Handbook of Econometrics, ii, North-Holland, Amsterdam, 937-76.
-- (1989), PC-GIVE: An Interactive Econometric Modelling System, Institute of Economics and Statistics, Oxford University, Oxford.
-- (1991a), 'Using PC-NAIVE in Teaching Econometrics', Oxford Bulletin of Economics and Statistics, 53: 199-223.
-- (1991b), 'Economic Forecasting', Report to the Treasury and Civil Service Committee, UK.
-- and ANDERSON, G. J. (1977), 'Testing Dynamic Specification in Small Simultaneous Models: An Application to a Model of Building Society Behavior in the United Kingdom', ch. 8c in M. D. Intrilligator (ed.), Frontiers of Quantitative Economics, iii(a), North-Holland, Amsterdam, 361-83.
-- and CLEMENTS, M. P. (1992), 'Towards a Theory of Economic Forecasting', unpublished paper, Institute of Economics and Statistics, Oxford University.

HENDRY, D. F., and ERICSSON, N. R. (1991a), 'An Econometric Appraisal of U.K. Money Demand in Monetary Trends in the United States and the United Kingdom by Milton Friedman and Anna J. Schwartz', American Economic Review, 81: 8-38.
-- and ERICSSON, N. R. (1991b), 'Modelling the Demand for Narrow Money in the United Kingdom and the United States', European Economic Review, 35: 833-81.
-- and MIZON, G. E. (1978), 'Serial Correlation as a Convenient Simplification, not a Nuisance: A Comment on a Study of the Demand for Money by the Bank of England', Economic Journal, 88: 549-63.
-- -- (1992), 'Evaluating Dynamic Models by Encompassing the VAR', in P. C. B. Phillips (ed.), Models, Methods, and Applications of Econometrics, Basil Blackwell, Oxford.
-- and MORGAN, M. S. (1989), 'A Re-analysis of Confluence Analysis', Oxford Economic Papers, 41: 35-52; reprinted in N. de Marchi and C. L. Gilbert (eds.), History and Methodology of Econometrics, Clarendon Press, Oxford, 1990.
-- MUELLBAUER, J. N. J., and MURPHY, A. (1990), 'The Econometrics of DHSY', in J. D. Hey and D. Winch (eds.), A Century of Economics, Basil Blackwell, Oxford, 298-334.
-- and NEALE, A. J. (1987), 'Monte Carlo Experimentation using PC-NAIVE', in T. Fomby and G. Rhodes (eds.), Advances in Econometrics, vi, JAI Press, Greenwich, Conn., 91-125.
-- -- (1988), 'Interpreting Long-Run Equilibrium Solutions in Conventional Macro Models: A Comment', Economic Journal, 98: 808-17.
-- -- (1991), 'A Monte Carlo Study of the Effects of Structural Breaks on Unit Root Tests', in P. Hackl and A. H. Westlund (eds.), Economic Structural Change: Analysis and Forecasting, Springer-Verlag, Vienna, 95-119.
-- -- and ERICSSON, N. R. (1990), PC-NAIVE: An Interactive Program for Monte Carlo Experimentation in Econometrics, Institute of Economics and Statistics, Oxford University, Oxford.
-- PAGAN, A. R., and SARGAN, J. D. (1984), 'Dynamic Specification', ch. 18 in Z. Griliches and M. D. Intrilligator (eds.), Handbook of Econometrics, ii, North-Holland, Amsterdam, 1023-100.
-- and RICHARD, J.-F. (1982), 'On the Formulation of Empirical Models in Dynamic Econometrics', Journal of Econometrics, 20: 3-33.
-- and UNGERN-STERNBERG, T. VON (1981), 'Liquidity and Inflation Effects on Consumers' Behaviour', ch. 9 in A. S. Deaton (ed.), Essays in the Theory and Measurement of Consumers' Behaviour, Cambridge University Press, 237-60.
HUNTER, J. (1992), 'Tests of Cointegrating Exogeneity for PPP and Uncovered Interest Rate Parity in the UK', Journal of Policy Modeling, 14: 453-64.
HYLLEBERG, S. (1991), Modelling Seasonality, Oxford University Press.
-- and MIZON, G. E. (1989a), 'Cointegration and Error Correction Mechanisms', Economic Journal (Supplement), 99: 113-25.
-- -- (1989b), 'A Note on the Distribution of the Least Squares Estimator of a Random Walk with Drift', Economics Letters, 29: 225-30.
-- ENGLE, R. F., GRANGER, C. W. J., and YOO, B. S. (1990), 'Seasonal Integration and Co-Integration', Journal of Econometrics, 44: 215-28.

ILMAKUNNAS, P. (1990), 'Testing the Order of Differencing in Quarterly Data: An Illustration of the Testing Sequence', Oxford Bulletin of Economics and Statistics, 52: 79-88.
IMHOF, P. (1961), 'Computing the Distribution of Quadratic Forms in Normal Variates', Biometrika, 48: 419-26.
JARQUE, C. M., and BERA, A. K. (1980), 'Efficient Tests for Normality, Homoskedasticity and Serial Independence of Regression Residuals', Economics Letters, 6: 255-9.
JAZWINSKI, A. H. (1970), Stochastic Processes and Filtering Theory, Academic Press, New York.
JOHANSEN, S. (1988), 'Statistical Analysis of Cointegration Vectors', Journal of Economic Dynamics and Control, 12: 231-54.
-- (1989), 'The Power of the Likelihood Ratio Test for Cointegration', mimeo, Institute of Mathematical Statistics, University of Copenhagen.
-- (1991a), 'Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian Vector Autoregressive Models', Econometrica, 59: 1551-80.
-- (1991b), 'A Statistical Analysis of Cointegration for I(2) variables', Institute of Mathematical Statistics, University of Copenhagen.
-- (1992a), 'Cointegration in Partial Systems and the Efficiency of Single Equation Analysis', Journal of Econometrics, 52: 389-402.
-- (1992b), 'Testing Weak Exogeneity and the Order of Cointegration in UK Money Demand', Journal of Policy Modeling, 14: 313-34.
-- and JUSELIUS, K. (1990), 'Maximum Likelihood Estimation and Inference on Cointegration, with Applications to the Demand for Money', Oxford Bulletin of Economics and Statistics, 52: 169-210.
KELLY, C. M. (1985), 'A Cautionary Note on the Interpretation of Long-Run Equilibrium Solutions in Conventional Macro Models', Economic Journal, 95: 1078-86.
KIVIET, J., and PHILLIPS, G. D. A. (1992), 'Exact Similar Tests for Unit Roots and Cointegration', Oxford Bulletin of Economics and Statistics, 54: 349-67.
KLEIN, L. R. (1953), A Textbook of Econometrics, Row, Peterson and Company, Evanston, Ill.
KOERTS, J., and ABRAHAMSE, A. P. J. (1969), On the Theory and Application of the General Linear Model, Rotterdam University Press.
KREMERS, J. J. M., ERICSSON, N. R., and DOLADO, J. (1992), 'The Power of Co-integration Tests', Oxford Bulletin of Economics and Statistics, 54: 325-48.
KWIATKOWSKI, D., PHILLIPS, P. C. B., and SCHMIDT, P. (1991), 'Testing the Null Hypothesis of Stationarity against the Alternative of a Unit Root: How Sure Are We that Economic Time Series Have a Unit Root', Cowles Foundation Discussion Paper No. 979.
LEYBOURNE, S. J., and MCCABE, B. P. M. (1992), 'A Simple Test for Cointegration', typescript, Nottingham University.
LIN, C.-F., and TERASVIRTA, T. (1991), 'Testing the Constancy of Regression Parameters against Continuous Structural Change', Discussion paper, University of California at San Diego.
MCCALLUM, B. T. (1984), 'On Low-Frequency Estimates of Long-Run Relationships in Macroeconomics', Journal of Monetary Economics, 14: 3-14.
MACKINNON, J. G. (1991), 'Critical Values for Co-Integration Tests', in R. F. Engle and C. W. J. Granger (eds.), Long-Run Economic Relationships, Oxford University Press, 267-76.
MANKIW, N. G., and SHAPIRO, M. D. (1985), 'Trends, Random Walks and Tests of the Permanent Income Hypothesis', Journal of Monetary Economics, 16: 165-74.
-- -- (1986), 'Do We Reject Too Often? Small Sample Properties of Tests of Rational Expectations Models', Economics Letters, 20: 139-45.
MANN, H. B., and WALD, A. (1943), 'On Stochastic Limit and Order Relationships', Annals of Mathematical Statistics, 14: 217-77.
MIZON, G. E. (1977), 'Model Selection Procedures', in M. J. Artis and A. R. Nobay (eds.), Studies in Modern Economic Analysis, Basil Blackwell, Oxford.
-- and HENDRY, D. F. (1980), 'An Empirical Application and Monte Carlo Analysis of Tests of Dynamic Specification', Review of Economic Studies, 47: 21-45.
MORGAN, M. S. (1990), The History of Econometric Ideas, Cambridge University Press.
MOSCONI, R., and GIANNINI, C. (1992), 'Non-Causality in Cointegrated Systems: Representation, Estimation and Testing', Oxford Bulletin of Economics and Statistics, 54: 399-417.
NANKERVIS, J. C., and SAVIN, N. E. (1985), 'Testing the Autoregressive Parameter with the t-statistic', Journal of Econometrics, 27: 143-61.
-- -- (1987), 'Finite Sample Distributions of t and F Statistics in an AR(1) model with an Exogenous Variable', Econometric Theory, 3: 387-408.
NELSON, C. R., and KANG, H. (1981), 'Spurious Periodicity in Inappropriately Detrended Time Series', Journal of Monetary Economics, 10: 139-62.
NEWEY, W. K., and WEST, K. D. (1987), 'A Simple Positive Semi-Definite Heteroskedasticity and Autocorrelation-Consistent Covariance Matrix', Econometrica, 55: 703-8.
NOWAK, E. (1990), 'Hidden Cointegration', Discussion paper, University of California at San Diego.
NYBLOM, J. (1989), 'Testing for the Constancy of Parameters over Time', Journal of the American Statistical Association, 84: 223-30.
OSBORN, D. R., CHIU, A. P. L., SMITH, J. P., and BIRCHENHALL, C. R. (1988), 'Seasonality and the Order of Integration for Consumption', Oxford Bulletin of Economics and Statistics, 50: 361-78.
OSTERWALD-LENUM, M. (1992), 'A Note with Fractiles of the Asymptotic Distribution of the Maximum Likelihood Cointegration Rank Test Statistics: Four Cases', Oxford Bulletin of Economics and Statistics, 54: 461-72.
PANTULA, S. G. (1991), 'Testing for Unit Roots in Time Series Data', Econometric Theory, 5: 265-71.
PARK, J. Y., and PHILLIPS, P. C. B. (1988), 'Statistical Inference in Regressions with Integrated Processes: Part 1', Econometric Theory, 4: 468-97.
PERRON, P. (1988), 'Trends and Random Walks in Macroeconomic Time Series: Further Evidence from a New Approach', Journal of Economic Dynamics and Control, 12: 297-332.
-- (1989), 'The Great Crash, the Oil Shock and the Unit Root Hypothesis', Econometrica, 57: 1361-402.
PHILLIPS, P. C. B. (1986), 'Understanding Spurious Regressions in Econometrics', Journal of Econometrics, 33: 311-40.
-- (1987a), 'Time Series Regression with a Unit Root', Econometrica, 55: 277-301.
-- (1987b), 'Towards a Unified Asymptotic Theory of Autoregression', Biometrika, 74: 535-48.
-- (1988a), 'Reflections on Econometric Methodology', Economic Record, 64: 344-59.
-- (1988b), 'Multiple Regression with Integrated Time Series', Contemporary Mathematics, 80: 79-105.
-- (1991), 'Optimal Inference in Co-integrated Systems', Econometrica, 59: 282-306.
-- and DURLAUF, S. N. (1986), 'Multiple Time Series Regression with Integrated Processes', Review of Economic Studies, 53: 473-95.
-- and HANSEN, B. E. (1990), 'Statistical Inference in Instrumental Variables Regression with I(1) Processes', Review of Economic Studies, 57: 99-125.
-- and LORETAN, M. (1991), 'Estimating Long-Run Economic Equilibria', Review of Economic Studies, 58: 407-36.
-- and OULIARIS, S. (1988), 'Testing for Co-integration using Principal Components Methods', Journal of Economic Dynamics and Control, 12: 205-30.
-- -- (1990), 'Asymptotic Properties of Residual Based Tests for Cointegration', Econometrica, 58: 165-93.
-- and PARK, J. Y. (1988), 'Asymptotic Equivalence of Ordinary Least Squares and Generalized Least Squares in Regressions with Integrated Variables', Journal of the American Statistical Association, 83: 111-15.
-- and PERRON, P. (1988), 'Testing for a Unit Root in Time Series Regression', Biometrika, 75: 335-46.
PRIESTLEY, M. B. (1989), Nonlinear and Nonstationary Time Series Analysis, Academic Press, New York.
QUANDT, R. E. (1978), 'Tests of Equilibrium vs. Disequilibrium Hypotheses', International Economic Review, 19: 435-52.
-- (1982), 'Econometric Disequilibrium Models', Econometric Reviews, 1: 1-63.
RAPPOPORT, P., and REICHLIN, L. (1989), 'Segmented Trends and Non-Stationary Time Series', Economic Journal, 99: 168-77.
REIMERS, H. E. (1991), 'Comparisons of Tests for Multivariate Co-integration', Discussion Paper no. 58, Christian-Albrechts University, Kiel.
RIPLEY, B. D. (1987), Stochastic Simulation, John Wiley, New York.
SAID, S. E., and DICKEY, D. A. (1984), 'Testing for Unit Roots in Autoregressive-Moving Average Models of Unknown Order', Biometrika, 71: 599-607.
SAIKKONNEN, P. (1991), 'Asymptotically Efficient Estimation of Cointegrating Regressions', Econometric Theory, 1: 1-21.
SAMPSON, M. (1991), 'The Effect of Parameter Uncertainty on Forecast Variances and Confidence Intervals for Unit Root and Trend Stationary Time Series Models', Journal of Applied Econometrics, 6: 67-76.
SARGAN, J. D. (1964), 'Wages and Prices in the United Kingdom: A Study in Econometric Methodology', in P. E. Hart, G. Mills, and J. K. Whitaker (eds.), Econometric Analysis for National Economic Planning, Butterworth, London; reprinted in D. F. Hendry and K. F. Wallis (eds.), Econometrics and Quantitative Economics, Basil Blackwell, Oxford, 1984.
SARGAN, J. D. (1980), 'Some Tests of Dynamic Specification for a Single Equation', Econometrica, 48: 879-97.
-- and BHARGAVA, A. (1983), 'Testing Residuals from Least Squares Regression for Being Generated by the Gaussian Random Walk', Econometrica, 51: 153-74.
SCHMIDT, P., and PHILLIPS, P. C. B. (1992), 'LM test for a Unit Root in the Presence of Deterministic Trends', Oxford Bulletin of Economics and Statistics, 54: 257-87.
SCHWERT, G. W. (1989), 'Tests for Unit Roots: A Monte Carlo Investigation', Journal of Business and Economic Statistics, 1: 147-59.
SHEPPARD, D. K. (1971), The Growth and Role of UK Financial Institutions 1890-1962, Methuen, London.
SIMS, C. A. (ed.) (1977), New Methods in Business Cycle Research, Federal Reserve Bank of Minneapolis.
-- STOCK, J. H., and WATSON, M. W. (1990), 'Inference in Linear Time Series with Some Unit Roots', Econometrica, 58: 113-44.
SPANOS, A. (1986), Statistical Foundations of Econometric Modelling, Cambridge University Press.
STOCK, J. H. (1987), 'Asymptotic Properties of Least-Squares Estimators of Co-integrating Vectors', Econometrica, 55: 1035-56.
-- and WATSON, M. W. (1988a), 'Variable Trends in Economic Time Series', Journal of Economic Perspectives, 2: 147-74.
-- -- (1988b), 'Testing for Common Trends', Journal of the American Statistical Association, 83: 1097-107.
-- -- (1991), 'A Simple MLE of Cointegrating Vectors in General Integrated Systems', Typescript, Northwestern University.
-- and WEST, K. D. (1988), 'Integrated Regressors and Tests of the Permanent Income Hypothesis', Journal of Monetary Economics, 21: 85-96.
TODA, H., and PHILLIPS, P. C. B. (1991), 'Vector Autoregressions and Causality', Cowles Foundation Discussion Paper, 997.
URBAIN, J.-P. (1992), 'On Weak Exogeneity in Error Correction Models', Oxford Bulletin of Economics and Statistics, 54: 187-207.
WEST, K. D. (1988), 'Asymptotic Normality, when Regressors have a Unit Root', Econometrica, 56: 1397-418.
WHITE, H. (1980), 'A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity', Econometrica, 48: 817-38.
-- (1984), Asymptotic Theory for Econometricians, Academic Press, New York.
WICKENS, M. R., and BREUSCH, T. S. (1988), 'Dynamic Specification, the Long Run and the Estimation of Transformed Regression Models', Economic Journal, 98 (Conference 1988): 189-205.
WOLD, H. (1954), A Study in the Analysis of Stationary Time Series, Almqvist and Wiksell, Stockholm.
YULE, G. U. (1926), 'Why Do We Sometimes Get Nonsense Correlations Between Time Series? A Study in Sampling and the Nature of Time Series', Journal of the Royal Statistical Society, 89: 1-64.

Acknowledgements for Quoted Extracts

The authors are grateful to the following for permission to reproduce extracts:

Elsevier Science Publishers, for material from N. G. Mankiw and M. D. Shapiro (1986), 'Do we reject too often: Small-sample properties of rational expectations models', Economics Letters, 20: 142-3.

The Review of Economic Studies, for material from P. C. B. Phillips and B. E. Hansen (1990), 'Statistical Inference in Instrumental Variables Regression with I(1) Processes', Review of Economic Studies, 57: 116-17.

The Econometric Society, for material from D. A. Dickey and W. A. Fuller (1981), 'Likelihood Ratio Statistics for Autoregressive Time Series with a Unit Root', Econometrica, 49: 1062-3.

David A. Dickey, Professor of Statistics, North Carolina State University.

John Wiley & Sons, Inc., for material from Wayne A. Fuller (1976), Introduction to Statistical Time Series, 371-3.


Author Index

Abadir, K. M. 126, 128
Abrahamse, A. P. J. 104
Ahn, S. K. 305
Anderson, G. J. 5, 50, 140
Anderson, H. 272
Anderson, T. W. 70n., 265n., 285
Andrews, D. W. K. 310
Banerjee, A. 55, 95, 97, 163, 166, 177n., 187, 191, 192, 214, 215, 220, 222, 230, 233, 306, 307
Bardsen, G. 47, 53, 56, 62, 235
Bewley, R. 47, 49, 53, 152, 305
Bhargava, A. 101, 104, 155, 176, 207, 209
Billingsley, P. 24, 89
Birchenhall, C. R. 122
Bossaerts, P. 298
Boswijk, H. P. 235, 305, 307, 310
Box, G. E. P. 10, 13, 121, 305
Brandner, P. 282
Breusch, T. S. 47, 55, 56, 59, 62, 63, 64
Campbell, B. 167n.
Campbell, J. Y. 306
Campos, J. 236
Chan, N. H. 91, 96n.
Chiu, A. P. L. 122
Choi, I. 306
Chong, Y. Y. 282
Chow, G. C. 194n.
Chu, C.-S. J. 310
Clements, M. P. 282, 283, 285
Davidson, J. E. H. 5, 50, 52, 140, 300
Davidson, R. 16, 28
Deaton, A. S. 53
Dickey, D. A. 8, 24, 82, 100, 103, 107, 108, 112-23, 169
Dolado, J. J. 55, 97, 163, 166, 177n., 187, 191, 192, 230
Dufour, J.-M. 167n.
Durlauf, S. N. 82, 92, 93, 182, 203, 238, 254, 262n.
Engle, R. F. 6, 7, 17, 18, 19, 43, 67, 84n., 121, 122, 137n., 145, 146, 152, 157-9, 163, 205n., 208, 209, 211, 215, 231, 242, 256, 261, 278, 279, 282, 283, 287, 288, 305, 309
Ericsson, N. R. 18, 28, 29, 41, 230, 232, 236, 238, 269, 292, 301
Ermini, L. 32, 193-7
Evans, G. B. A. 104
Fisher, I. 65
Fisher, L. 305
Frances, P.-H. 235
Friedman, M. 29, 190, 194
Fuller, W. A. 8, 13, 14, 15, 24, 26, 100-3, 106, 107, 112-23, 169
Galbraith, J. W. 55, 98, 166, 177n., 191
Gantmacher, F. R. 140
Gel'fand, J. M. 140
Ghysels, E. 121
Giannini, C. 310
Gonzalo, J. 240, 285, 286, 293, 294, 296-8
Granger, C. W. J. 6, 7, 32, 43, 69, 70, 81, 83, 84n., 121, 137n., 138, 139, 145, 146, 157-9, 196, 205n., 208, 209, 215, 231, 256, 257, 260, 261, 272, 278, 285, 287, 307, 309, 310
Gregoir, S. 304
Grimmet, G. R. 96
Haldrup, N. 96
Hall, A. 107, 119, 130, 133, 306
Hall, A. D. 272
Hall, P. 23, 24, 89n., 179n.
Hall, R. E. 164, 165, 177
Hallman, J. 32, 121
Hammersley, J. M. 28
Handscombe, D. C. 28
Hansen, B. E. 176, 194, 238-41, 246, 248-51, 261, 294, 299, 310
Harvey, A. C. 303
Hasza, D. P. 122, 123
Hendry, D. F. 5, 17, 28, 29, 32, 41, 47, 48, 49, 50, 53, 65, 95, 101, 140, 162, 163, 193-5, 197, 221, 229, 231-3, 235, 236, 238, 269, 278, 279, 282, 283, 285, 288, 292, 300, 301, 306-309
Heyde, C. C. 23, 24, 89n., 179n.
Hunter, J. 310
Hylleberg, S. 96, 121-3, 152, 170
Ilmakunnas, P. 121
Imhof, P. 104, 207

Jenkins, G. M. 10, 13, 121
Johansen, S. 43, 96, 146, 151, 153, 211, 256, 257, 260, 262, 265, 268, 271, 272, 277, 287, 288, 290, 292, 294, 297, 298, 307, 310
Juselius, K. 271, 272, 277, 290, 310
Kang, H. 191
Kelly, C. M. 47, 64, 65, 66
Kiviet, J. 104, 105, 169n., 232
Klein, L. R. 310
Koerts, J. 104
Kremers, J. M. J. 230-3, 306
Kunst, R. 282
Kwiatkowski, D. 304
Laroque, G. 304
Lee, H. S. 121
Lee, T.-H. 287, 307
Lin, C.-F. 310
Loretan, M. 163, 288, 291
Leybourne, S. J. 304
McCabe, B. P. M. 304
McCallum, B. T. 47, 64-6
MacKinnon, J. G. 16, 28, 211, 213, 214
Mankiw, N. G. 164, 165, 166, 177n., 191
Mann, H. B. 14
Mizon, G. E. 101, 152, 162, 170, 231, 235, 278, 285, 288, 292, 300, 307
Morgan, M. S. 5, 308
Mosconi, R. 310
Muellbauer, J. N. J. 53
Murphy, A. 53
Nankervis, J. C. 104
Neale, A. J. 47, 65, 221, 309
Nelson, C. R. 191
Newbold, P. 69, 70, 81, 83, 138, 139, 191
Newey, W. K. 111
Nowak, E. 308
Nyblom, J. 310
Orden, D. 305
Osborn, D. R. 122, 123
Osterwald-Lenum, M. 268-76, 292
Ouliaris, S. 133, 134, 208, 210, 211
Pagan, A. R. 48
Pantula, S. G. 120, 121, 306
Park, J. Y. 176, 238
Perron, P. 107, 109, 111-19, 133, 248n., 304, 306
Phillips, G. D. A. 104, 105, 169n., 232
Phillips, P. C. B. 22, 24, 43, 71, 72, 81-3, 86-8, 90-3, 95, 96, 101, 107, 109, 111, 113, 114, 119, 129, 133, 134, 163, 175, 176, 179n., 182, 203, 208, 210, 211, 222, 230, 238-41, 242-51, 254, 261, 262n., 277, 288, 290, 291, 294, 304-6, 310
Ploberger, W. 310
Priestley, M. B. 40
Quandt, R. E. 3
Rappoport, P. 309
Reichlin, L. 309
Reimers, H. E. 286
Reinsel, G. C. 305
Richard, J.-F. 18, 162
Ripley, B. D. 28
Rothenberg, T. 220n.
Said, S. E. 82, 107, 108, 113
Saikkonnen, P. 305
Sampson, M. 282
Sargan, J. D. 5, 48, 50, 101, 140, 155, 176, 207, 209, 229, 231, 238, 285
Savin, N. E. 104
Schmidt, P. 101, 304, 306
Schwartz, A. J. 29, 194
Schwert, G. W. 82, 114, 119, 130, 248n.
Shapiro, M. D. 164-6, 177n., 191
Sheppard, D. K. 139
Sims, C. A. 43, 125, 162, 168, 178, 186-9
Smith, G. W. 163
Spanos, A. 12, 16, 72, 162
Stirzaker, D. R. 96
Stock, J. H. 43, 119, 152, 158, 163, 172, 177, 178, 185-90, 192, 211, 278, 291, 294, 296-8
Terasvirta, T. 310
Tiao, G. C. 305
Toda, H. 310
Tran, H.-A. 236, 301
Ungern-Sternberg, T. von 288
Urbain, J.-P. 307
Wald, A. 14, 43
Watson, M. W. 119, 152, 178, 187-90, 211, 278, 291, 294, 298
Wei, C. Z. 91, 96n.
West, K. D. 105, 111, 169, 171, 172, 177, 178, 185-7, 189n., 192
White, H. 15, 16, 27, 86, 89, 90, 310
Wickens, M. R. 47, 55, 56, 59, 62, 63, 64
Wold, H. 257
Yeo, S. 5
Yoo, B. S. 121, 152, 208, 209, 278, 279, 282, 283, 287, 305
Yule, G. U. 69, 70n., 71, 77, 138

Subject Index

absolute summability 158
adjustment:
  coefficient 155
  disequilibrium 51, 52, 55, 61
  speed of 268
approximation theorem 123
asymptotic:
  convergence 158
  independence 16, 17
  normality 105, 126, 134, 163, 177, 178, 180, 185; and drift term 169-74
asymptotic standard error (ASE) 235
Augmented Dickey-Fuller test (ADF) 106, 108, 109, 207-12, 232-4, 238, 239n.
  asymptotic distribution 127, 128
  comparison with non-parametrically adjusted DF 114-9
  use of IV in 119
autocorrelation 13, 71-2, 83, 129, 163, 191, 206, 207, 212, 221n., 238-42, 244, 286, 292
  function 12, 13
autocovariance function 12, 13
autoregressive:
  -distributed lag (ADL) model 47-55, 60-4, 224, 239, 242
  error 83, 114, 191, 291
  process 12, 72, 251, 257-60; see also autoregressive moving-average (ARMA) process
  representation (VAR), see co-integrating: representations of co-integrated systems
autoregressive integrated moving-average (ARIMA) process 13, 38, 39, 221
autoregressive moving-average (ARMA) process 12, 13, 39, 84, 85, 88, 107, 108
  examples of 32-8
Bardsen transformation, see transformation: Bardsen
Bartlett window 248
Bewley:
  representation 152, 153
  transformation, see transformation: Bewley
bias 67, 68, 191, 244, 246-8, 249, 250, 290, 309
  in AR(1) parameter 100, 101
  correction term 241, 246
  in estimates of co-integrating vector 162-3, 214-30, 238, 239, 246, 250, 252
  second-order 163, 176, 238, 240, 246, 296, 297
  simultaneity 238, 241, 297, 298
borderline-stationary 39, 95, 166, 208, 225
  see also near-integrated process
bounds test 133, 134
Brownian motion 21, 89, 152, 153, 241, 243, 246, 247, 255, 278, 296, 297
  see also Wiener process
  vector 200-3
Cayley-Hamilton theorem 140
central limit theorem 16, 73, 88, 89, 171, 295
  functional (FCLT), see functional central limit theorem
  Liapunov 16, 27, 44
  Lindeberg-Feller 27
co-integrating:
  combination 279, 283, 288
  parameters 215, 220, 222, 224, 248
  rank 145, 146, 262
  regression 191, 220, 229, 230; asymptotic theory of 174-7
  representations of co-integrated systems (EC, MA, VAR) 146, 153-7, 257-61
  vector 137, 138, 145, 158, 159, 163, 205, 214, 236, 248, 252-6, 262, 267, 268, 276, 277, 285, 289, 290, 293; asymptotic distribution of estimators of 293-8; biases in estimation of, see bias; generalized 179; invariance of 300-3
co-integration 6-8, 67, 136-61, 167, 189, 255, 268, 300, 308
  definition 145
  in logarithms or levels 198, 199
  multi- 287, 307
  seasonal 121, 151
  space 256, 266-99, 273, 279
  system 257, 260, 261
  testing for 9, 134, 176, 205-52, 286; table of critical values 213; test power 230-5
common factor 13, 101, 231, 233, 235, 238, 239, 285, 296
common trend 152, 153, 278
companion form 143, 181-3, 272
concentrated series 88, 89, 263, 264, 272


conditioning, improper 244, 245
constant, inclusion of 212-19
continuous mapping theorem 89, 90
convergence:
  in distribution 16
  of functionals of Wiener processes 91, 183
  in probability 14, 15, 16, 86, 157, 176, 185
  to random variable 86, 89
  rate of 14, 125, 158-9, 168
  weak 23, 89
Cramer's theorem 173, 177
cross-equation restrictions 155, 245
decomposition 179, 240, 260, 296
deterministic trend, see trend: non-stochastic
de-trending 70, 82, 83, 191
  spurious 92-3
diagonalization 265, 266, 273, 290
Dickey-Fuller:
  distribution/critical values 97, 98, 100-3, 105, 106, 121, 129-32, 167, 169, 170, 210-11, 268; tables 102-3
  test (DF) 101, 104-10, 112, 114-19, 207-12, 231, 233, 235, 236, 238, 239n., 267; asymptotic distribution of 124-7; tests on more than one parameter 113, 114, 116
differencing 11, 30, 99, 111, 119, 134, 139, 147, 153, 158, 168, 192, 199, 300
  seasonal 121, 122
diffusion process 96
discontinuity 95, 96
Donsker's theorem 89
drift term 9, 72, 101, 106, 108, 111, 151
  see also trend: non-stochastic
dummy variable 134, 270-6, 288
Durbin-Hausman tests 306
Durbin-Watson test 73, 81, 93
  in co-integrating regression (CRDW test) 176, 207-8, 235-6
dynamic:
  estimator 223, 224-30, 237, 243, 244, 247-51
  modelling/regression 5, 8, 46, 47, 50, 51, 106, 163, 167-71, 177, 178, 192, 214, 221n., 222-4, 225-6, 229, 239, 243, 246, 247
  omitted dynamics 157, 220, 229
  specification 168, 240, 242-4
  system 278
Edgeworth expansion 239n.
eigen-:
  value 134, 140, 143, 144, 179, 265, 266, 267, 268, 270, 277, 292, 298
  vector 265, 270, 292, 298
empirical data/results 29-32, 40-2, 52-3, 159, 194-7, 235-8, 269-71, 292, 293
encompassing 193, 198, 238
endogeneity 176, 246
Engle-Granger:
  theorem 159-62
  two-step procedure 153, 157-61, 205n., 278, 285, 283
equilibrium:
  dis- 2
  multiplier, see long-run: multiplier
  relationship 2-9, 46, 47, 50, 54, 55, 136-9, 192, 205
  state 2, 4
  static 48
ergodicity 16, 17, 88, 89
error-correction 5, 6, 47, 51, 55, 63, 64, 96, 224n., 246
  mechanism 5-7, 51-4, 139, 140, 151, 232, 234, 238, 268, 270-5, 278, 279, 294, 300, 304
  model 47, 49-52, 55, 63, 158, 159, 239, 243, 256, 257, 260, 61, 268, 274, 277-9, 290; generalized 50, 52, 60, 61
  representation 138, 139, 153; definition of 145; derivation of 154-7
  term 50-3, 60, 61, 140, 151, 155, 157, 262
exact test 105
exogeneity 17-18, 288
  strict 19, 67
  strong 18, 20, 222-3, 244, 252, 291
  super 18-20
  in unit root tests 107
  weak 18, 20, 65-8, 163, 168, 192, 204, 223, 240, 243-5, 248, 251-2, 261, 268, 288-91, 295; importance in co-integrated processes 252, 307
finite sample biases, see bias
Fisher effect 65
forecasting 278-85
  multi-step 18, 19
frequency:
  domain 88n.
  zero v. seasonal 122
Frisch-Waugh theorem 70n.
full-information maximum-likelihood (FIML) 238, 239, 241, 245, 250, 297, 298
fully modified estimation 238-41, 243, 244, 246-50
  estimator 243, 244, 247, 248, 249, 250
  method 239, 240
functional central limit theorem (FCLT) 22, 89, 124-7, 261, 295, 299

generalized co-integrating vector 179
general-to-specific modelling 168, 192
Granger causality 18, 291
Granger Representation Theorem 48, 146-53, 300
homogeneity 47, 51, 52, 60, 61, 221, 222, 231, 236
impact matrix 151, 260
inconsistent regression 164-8, 190, 191, 229, 230
innovation sequence 12, 85-7, 183
instrumental variables (IV) 55, 59, 62, 63, 119, 130-3
integrated process 1, 6, 7, 11, 12, 21, 39, 69-71, 73, 136-8, 162-99
  asymptotic theory of 86-91
  near-, see near-integrated process
  properties of 84-6
  see also non-stationary process
integration:
  order of, see order of integration
  seasonal, see seasonal integration
intercept 72, 151, 210, 232, 234, 271, 272, 273, 274
interim multiplier representation 153
invariance 20, 282, 283
  principle 22; see also functional central limit theorem
invertibility 13, 84, 108, 242
invertible system 148, 149, 258, 259, 266
Jacobian 62, 63
Johansen maximum-likelihood procedure 211, 262-9, 285, 286, 300
  power of 277, 278
Kronecker product 181
lag 9, 11, 47, 50, 52, 66, 106-8, 123, 225, 248, 250, 251, 286, 303
  length 248, 286
  mean 287
  polynomial 229
  structure 208, 222, 229
  truncation parameter 110, 111, 113
latent root 13, 104, 142, 144, 158, 224
law of large numbers 86, 90
life-cycle hypothesis 164, 188
likelihood ratio tests 153, 277, 278, 294, 295
limited-information maximum-likelihood (LIML) 264, 285
linear system 300
logarithms v. levels 29-32, 193-7
long-run:
  covariance matrix 240, 241, 245-7, 252, 290
  multiplier 8, 47-9, 51, 54, 57, 59-64, 188, 230, 235, 293, 295, 296; variance of estimates of 61-4
  relationship 2, 7, 8, 140, 220; see also co-integrating: vector
  response 153
  solution 50, 64-8
marginal:
  distribution 18, 19, 290, 295
  process 240, 243-5, 248n.
marginalization 304
market clearing 3
martingale difference sequence (MDS) 11, 12, 21, 163, 179n., 185, 242, 244, 245, 247
maximal-eigenvalue statistic 267, 273
maximum-likelihood 159, 241-5, 256, 262, 264, 265, 266, 267, 269, 277, 283, 285, 286, 288
  full-information, see full-information maximum-likelihood
  limited-information, see limited-information maximum-likelihood
mean lag 144, 287, 301
memory 85
mixing:
  coefficient 87
  strong 16, 17, 87
  uniform 16, 17
mixingale 179n.
Monte Carlo:
  method 9, 27, 28
  response surfaces 28, 211, 213, 214
  results 73-83, 101, 106, 108, 114, 117-19, 133, 165, 214, 215, 222-3, 225-9, 232-5, 248-51, 279, 282, 283, 285, 291, 298
  standard error 75
moving-average 12, 88; see also autoregressive moving-average (ARMA) process
  component of errors 107
  negative components 113, 119, 250, 304
  parameter 248n.
  representation 133, 153, 155, 156
  seasonal filter 121
multiple roots 119-22
multiplier, long-run, see long-run: multiplier
near-integrated process 95-7, 99, 164, 166, 225, 231, 277
nearly-inconsistent regression 229, 230
non-centrality parameter 97, 98


non-parametric:
  correction/test 9, 108-10, 114-9, 130, 208, 210, 211, 238-40, 251
  asymptotic theory of 129-30
  estimation 244, 248, 249
nonsense regression 69, 80, 138
  see also spurious regression
non-stationarity 4, 8, 9, 65, 67, 72, 81-4, 134, 150, 215
  transformation to stationarity 69, 70, 82, 83, 99, 134, 147
non-stationary process 5, 6, 9, 38, 39, 70, 71, 81, 163, 244
  v. integrated process 12
normality 180, 289
  asymptotic, see asymptotic: normality
normalization 57-9, 265, 285
nuisance parameters 100, 104-6, 172, 176, 207, 210
order:
  of magnitude 14, 15, 21, 90
  in probability 14, 15
order of integration 6-9, 48, 79-80, 84, 85, 147, 151, 190-2, 258
  defined 84
  first 137, 177
  higher 138, 157, 163
  zero 137
Ornstein-Uhlenbeck process 96
orthogonal complement 147
orthogonality 86, 149, 151, 242, 244, 245, 258n., 259, 260, 273
  asymptotic 107
  testing 164-8
over-identification test 278, 300
over-rejection 206, 210, 286
parameterization 48, 207, 208, 250, 274, 275
  of dynamics 221
  exact 105, 224
  of nearly-integrated processes 95
  over-/under- 224-9, 262
permanent income hypothesis 164, 177, 178, 188, 190
Perron-Phillips/Phillips test, see non-parametric: correction/test
polynomial matrices 140-5, 152, 257
  isomorphism with companion matrices 142-4
power series expansion 97
power of tests 8, 15, 96, 101, 108, 113, 198, 208, 214, 223-4, 230-5, 277, 278, 286
pre-determinedness 19
random walk 11, 21, 22, 24-9, 38, 71, 72, 82, 87, 93, 100, 101, 114, 191, 220, 272
  in logarithms or levels 193n.
  see also unit root
rank:
  co-integrating, see co-integrating: rank
  full 56, 58, 59, 144, 147, 151, 181, 258, 260, 287
  reduced 144, 147, 151, 256, 257, 264, 285, 287, 288, 301
recursive estimation 194n., 221n.
re-parameterization 67, 157, 168, 189, 191, 222
  see also transformation
representation theorem, see Granger Representation Theorem
Said-Dickey test 107, 108
  compared with Perron-Phillips test 113
Sargan-Bhargava test, see Durbin-Watson test (CRDW test)
Schwarz Criterion 194, 286
seasonal adjustment filter 301, 303
seasonal integration 121-3
sequential cut 18, 19
similar tests 100, 104, 105, 169n.
size distortions 113, 133, 166, 167
Slutsky's theorem 89, 173
spurious:
  correlation 70, 71; in de-trended random walks 82, 83
  regression 69-81, 83, 92-5, 134, 138-9, 158, 159, 162, 191, 230, 255
stacked form, see companion form
static regression 162, 163, 167, 205, 214, 220-3, 231, 238, 246, 251, 296
  comparison with dynamic 167, 168, 224-30
  example of 236
  see also Engle-Granger: two-step procedure
stationarity 1, 4, 12, 13, 17, 69, 212, 262
stationary process 4, 5, 6, 7, 9, 11, 29, 38, 39, 47, 85, 86, 134, 138, 256, 257, 267, 279
  strictly 11, 12
  weakly/second-order/covariance 11, 12
stochastic:
  differential equation 96
  trend, see trend, stochastic
structural representation 261, 303
super-consistency 158, 176, 191, 214, 220, 230, 251, 294, 296
total effect 142, 257
trace 267, 273
transformation 6, 28-32, 88, 111, 125, 178-80, 185
  ADL 51, 59
  ADL to ECM 60, 61 300, 301

transformation (cont.):
  Bardsen 51, 54-9, 62, 63
  Bewley 51, 53-6, 58n., 59, 60, 62, 63
  equivalence of 54-60, 62, 64
  linear 47, 51, 60, 61, 63, 64, 145, 152, 178, 224; in dynamic regression 167-8, 177, 178; of polynomial matrices 144, 145
  logarithmic 99, 192-9
trend (inclusion of) 5, 9, 82, 100, 101, 106, 125, 185, 211, 212, 213, 214, 236
  non-stochastic (deterministic) 6, 20, 21, 69-72, 82, 84, 125, 146, 151, 172, 173, 185, 187, 275
  stochastic 153, 169, 172, 174, 179, 180, 185, 187, 191; see also common trend; unit root
  sums of powers of 20
unit circle 13, 104, 123, 141, 149, 158
unit root 8, 9, 13, 38, 72, 83-6, 95, 96, 133, 144, 147, 163, 177, 185, 215, 236, 255, 258-60, 267, 270, 287, 289
  multiple 122
  near- 95, 99; see also near-integrated process
  in polynomial matrix 141
  testing for 8, 96, 99-135, 206, 211, 215, 306; descriptive value 306; in marginal processes 306; at seasonal frequency 120-3
variance-covariance matrix 62, 107, 183, 189, 243, 252-4, 273
  long-run 248, 249
vector autoregression (VAR) 278, 279, 283, 291, 292
vectoring operator 181, 273
Wald statistic 127, 188, 239
Wiener process 21-3, 26, 86-91, 93, 96, 131, 188, 189, 241, 261, 268
  distribution 191, 221
  functional of 24, 90, 93, 125-8, 163, 188, 300
  multivariate 182-4, 200-3, 268
white noise 11, 12, 22, 87, 106, 231
Wold Decomposition Theorem 257, 258
