Group Invariance in Statistical Inference

Narayan C. Giri
University of Montreal, Canada

World Scientific
Singapore • New Jersey • London • Hong Kong

Published by World Scientific Publishing Co. Pte. Ltd.
P O Box 128, Farrer Road, Singapore 912805
USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

GROUP INVARIANCE IN STATISTICAL INFERENCE
Copyright © 1996 by World Scientific Publishing Co. Pte. Ltd.
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN 9810218753

Printed in Singapore.

To NILIMA NABANITA NANDIAN
CONTENTS

Chapter 0. GROUP INVARIANCE
0.0. Introduction
0.1. Examples

Chapter 1. MATRICES, GROUPS AND JACOBIANS
1.0. Introduction
1.1. Matrices
1.1.1. Characteristic Roots and Vectors
1.1.2. Factorization of Matrices
1.1.3. Partitioned Matrices
1.2. Groups
1.3. Homomorphism, Isomorphism and Direct Product
1.4. Topological Transitive Groups
1.5. Jacobians

Chapter 2. INVARIANCE
2.0. Introduction
2.1. Invariance of Distributions
2.1.1. Transformation of Variable in Abstract Integral
2.2. Invariance of Testing Problems
2.3. Invariance of Statistical Tests and Maximal Invariant
2.4. Some Examples of Maximal Invariants
2.5. Distribution of Maximal Invariant
2.5.1. Existence of an Invariant Probability Measure on O(p) (Group of p x p Orthogonal Matrices)
2.6. Applications
2.7. The Distribution of a Maximal Invariant in the General Case
2.8. An Important Multivariate Distribution
2.9. Almost Invariance, Sufficiency and Invariance
2.10. Invariance, Type D and E Regions

Chapter 3. EQUIVARIANT ESTIMATION IN CURVED MODELS
3.1. Best Equivariant Estimation of μ with Σ Known
3.1.1. Maximum Likelihood Estimators
3.2. A Particular Case
3.2.1. An Application
3.2.2. Maximum Likelihood Estimator
3.3. Best Equivariant Estimation in Curved Covariance Models
3.3.1. Characterization of Equivariant Estimators of Σ
3.3.2. Characterization of Equivariant Estimators of θ

Chapter 4. SOME BEST INVARIANT TESTS IN MULTINORMALS
4.0. Introduction
4.1. Tests of Mean Vector
4.2. The Classification Problem (Two Populations)
4.3. Tests of Multiple Correlation
4.4. Tests of Multiple Correlation with Partial Information

Chapter 5. SOME MINIMAX TESTS IN MULTINORMALS
5.0. Introduction
5.1. Locally Minimax Tests
5.2. Asymptotically Minimax Tests
5.3. Minimax Tests
5.3.1. Hotelling's T² Test
5.3.2. R² Test
5.3.3. ε-Minimax Test (Linnik, 1966)

Chapter 6. LOCALLY MINIMAX TESTS IN SYMMETRICAL DISTRIBUTIONS
6.0. Introduction
6.1. Elliptically Symmetric Distributions
6.2. Locally Minimax Tests in E_p(μ, Σ)
6.3. Examples

Chapter 7. TYPE D AND E REGIONS
Chapter 0
GROUP INVARIANCE

0.0. Introduction

One of the unpleasant facts of statistical problems is that they are often too big or too difficult to admit of practical solutions. Statistical decisions are made on the basis of sample observations. Sample observations often contain information which is not relevant to the making of the statistical decision. Some simplification is introduced by characterizing the decision rules in terms of the minimal sufficient statistic, which discards the part of the sample observations that is of no value for any decision making concerning the parameter, thereby reducing the dimension of the sample space to that of the minimal sufficient statistic. This, however, does not reduce the dimension of the parametric space. By introducing the group invariance principle and restricting attention to invariant decision rules, a reduction of the dimension of the parametric space is possible. In view of the fact that sufficiency and group invariance are both successful in reducing the dimension of statistical problems, one is naturally interested in knowing whether both principles can be used simultaneously and, if so, in what order. Hall, Wijsman and Ghosh (1965) have shown that under certain conditions this reduction can be carried out by using both principles simultaneously, and that the order in which they are used is immaterial in such cases. However, one can avoid verifying these conditions by replacing the sample space by the space of the sufficient statistic and then using group invariance on that space. In this monograph we treat multivariate problems only, where the reduction in dimension is very significant. In what follows we use the term invariance to indicate group invariance.

In statistics the term invariance is used in the mathematical sense, to denote a property that remains unchanged (invariant) under a group of transformations. In actual practice many statistical problems possess such a property. As in other branches of the applied sciences, it is a generally accepted principle in statistics that if a problem with a unique solution is invariant under a group of transformations, then the solution should also be invariant under it. This notion has an old origin in the statistical sciences. Apart from this natural justification for the use of invariant decision rules, the unpublished work of Hunt and Stein towards the end of the Second World War has given this principle strong support as to its applicability and meaningfulness in proving various optimum properties, like minimaxity and admissibility, of statistical decision rules. Although a great deal has been written concerning this principle in statistical inference, no great amount of literature exists concerning the problem of discerning whether or not a given statistical problem is actually invariant under a certain group of transformations. Brillinger (1963) gave necessary and sufficient conditions that a statistical problem must satisfy in order to be invariant under a fairly large class of groups of transformations, including Lie groups. In this monograph we treat invariance in the framework of statistical decision rules only.

De Finetti (1964), in his theory of exchangeability, treats invariance of the distribution of sample observations under finite permutations. It provides a crucial link between his theory of subjective probability and the frequency approach to probability. The classical statistical methods take as basic a family of distributions; the true distribution of the sample observations is an unknown member of this family, about which statistical inference is required. According to De Finetti's approach no probability is unknown. If $x_1, x_2, \ldots$ are the outcomes of a sequence of experiments conducted under similar conditions, subjective uncertainty is expressed directly by ascribing to the corresponding random variables $X_1, X_2, \ldots$ a known joint distribution. When some of the $X$'s are observed, predictive inference about the others is made by conditioning the original distribution on the observations. De Finetti has shown that these approaches are equivalent when the subjectivist's joint distribution is invariant under finite permutations.

Two other related principles, known in the literature, are the weak invariance and the strong invariance principles. The weak invariance principle is used
to demonstrate the sufficiency of the classical assumptions associated with the weak convergence of stable laws (Billingsley, 1968). This is popularly known as Donsker's theorem (Donsker, 1951). Let $X_1, X_2, \ldots$ be independently distributed random variables with the same mean zero and the same variance $\sigma^2$, and let

$$S_0 = 0, \quad S_j = \sum_{i=1}^{j} X_i, \quad X_n(t) = \frac{S_{[nt]}}{\sigma\sqrt{n}}, \quad 0 \le t \le 1.$$

Donsker proved that $\{X_n(t)\}$ converges weakly to Brownian motion. The strong invariance principle has been introduced to prove strong convergence results (Tusnády, 1977). Here the term invariance is used in the sense that if $X_1, X_2, \ldots$ are independently distributed random variables with the same mean 0 and the same variance $\sigma^2$, and if $h$ is a continuous functional on $C[0,1]$, then the limiting distribution of $h(X_n)$ does not depend on any other property of the $X_i$.
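The content of the weak invariance principle is easy to see by simulation. The following sketch is not from the text; the sample size, number of replications, choice of functional (the supremum of the path), and the two increment distributions are all arbitrary illustrative choices. Two very different mean-zero, variance-one increment laws give essentially the same distribution for the functional, namely that of the supremum of Brownian motion on $[0,1]$.

```python
# Weak invariance principle, illustrated by simulation: the law of a
# continuous functional of X_n(t) = S_[nt]/(sigma*sqrt(n)) is, in the limit,
# the same for any mean-zero increment distribution with variance sigma^2.
# A minimal sketch; n, reps and the two increment laws are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
n, reps = 2000, 5000

def sup_functional(increments, sigma=1.0):
    """sup_t X_n(t) for each simulated path (one path per row)."""
    paths = np.cumsum(increments, axis=1)            # S_1, ..., S_n per row
    return paths.max(axis=1) / (sigma * np.sqrt(increments.shape[1]))

# Two very different mean-0, variance-1 increment laws.
rademacher   = rng.choice([-1.0, 1.0], size=(reps, n))
centered_exp = rng.exponential(1.0, size=(reps, n)) - 1.0

sup_r = sup_functional(rademacher)
sup_e = sup_functional(centered_exp)

# Both should match the law of sup of Brownian motion on [0,1], which is
# |N(0,1)| by the reflection principle; compare a few quantiles.
for q in (0.5, 0.9, 0.95):
    print(f"q={q}: Rademacher {np.quantile(sup_r, q):.3f}, "
          f"centered exp {np.quantile(sup_e, q):.3f}")
```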
0.1. Examples

We now give an example to show how the solution of a statistical problem can be obtained through a direct application of group theoretic results.

Example 0.1.1. Let $X_\alpha = (X_{\alpha 1}, \ldots, X_{\alpha p})'$, $\alpha = 1, \ldots, N\ (N > p)$, be independently and identically distributed $p$-variate normal random vectors with the same mean $\mu = (\mu_1, \ldots, \mu_p)'$ and the same positive definite covariance matrix $\Sigma$.
T h e problem of testing
0 remains unchanged (invariant)
under the full linear group Gj(p) of p x p nonsingular matrices g transforming each Xi * gXi, i = l,...,N.

Let
N
N
It is wellknown { G i r i , 1977) that (X,S) transformation on the space of {X,S)
(X,S)^(gX,gSg'),
is sufficient for ( / * , £ ) . T h e induced
is given by
g € Gi(p) •
(0.2)
Since this transformation permits arbitrary changes of X, S and any reasonable statistical test procedure should not depend on any such arbitrary change by g, we conclude that a reasonable statistical test procedure should depend on (X, S) only through 2
T
1
= N(N  l l X ' S " * .
It is wellknown that (Giri, 1977) the distribution of T
(0.3) 2
is given by
4
Group /nuariance in Statistical
2
2
fr*(t \6 )
=
(Nl)T(^Np)) (
where 6
2
Inference

\8 y(t j{N 2
2
+
2
= JV/i'£y
Under H S
l
> T{\N
if £ > 0 2
+ j)
= 0 and under ff i S
0
2
(0.4)
> 0.
Applying
Neyman and Pearson's L e m m a we conclude from (0.4) that the uniformly 2
most powerful test based on T T
2
test, which rejects H
0
of H
0
against Hi is the wellknown Hotelling's
for large values of
2
T. 3
N o t e . I n this problem the dimension of fl is p + also the dimension of the (X,S).
= P(P+ )
For the distribution of T
2
)
w n
i h is c
the parameter is
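Hotelling's $T^2$ test is easy to carry out numerically. The sketch below is not from the text; the sample size, dimension, level, and simulated data are arbitrary illustrative choices. It computes $T^2$ as in (0.1)–(0.3) and obtains the null p-value via the standard fact that $(N-p)T^2/(p(N-1))$ has the $F$ distribution with $p$ and $N-p$ degrees of freedom under $H_0$.

```python
# Hotelling's T^2 test of H0: mu = 0, following (0.1)-(0.3).
# A minimal sketch; N, p, alpha and the simulated data are arbitrary.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
N, p, alpha = 40, 3, 0.05
X = rng.multivariate_normal(mean=[0.5, 0.0, 0.0],
                            cov=np.eye(p), size=N)   # sample drawn under H1

xbar = X.mean(axis=0)                                # (0.1): sample mean
S = (X - xbar).T @ (X - xbar)                        # (0.1): sums of products
T2 = N * (N - 1) * xbar @ np.linalg.solve(S, xbar)   # (0.3)

# Under H0, (N - p) T^2 / (p (N - 1)) ~ F_{p, N-p}.
F = (N - p) * T2 / (p * (N - 1))
p_value = stats.f.sf(F, p, N - p)
print(f"T2 = {T2:.3f}, F = {F:.3f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```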
One main reason for the intuitive appeal that, for an invariant problem with a unique solution, the solution should be invariant, is probably the belief that there should be a unique way of analysing a collection of statistical data. As a word of caution we should point out that in cases where the use of an invariant decision rule conflicts violently with the desire to make a correct decision or to have a smaller risk, it must be abandoned. We give below one such example, which is due to Charles Stein, as reported by Lehmann (1959, p. 338).

Example 0.1.2. Let $X = (X_1, \ldots, X_p)'$, $Y = (Y_1, \ldots, Y_p)'$ be independently distributed normal $p$-vectors with the same mean 0 and positive definite covariance matrices $\Sigma$, $\Delta\Sigma$ respectively, where $\Delta$ is an unknown scalar constant. Consider the problem of testing $H_0 : \Delta = 1$ against $H_1 : \Delta > 1$. This problem is invariant under $G_l(p)$ transforming $X \to gX$, $Y \to gY$, $g \in G_l(p)$. Since this group is transitive (see Chapter 1) on the space of values of $(X, Y)$ with probability one, the uniformly most powerful invariant test of level $\alpha$ under $G_l(p)$ is the trivial test $\phi(X, Y) = \alpha$, which rejects $H_0$ with constant probability $\alpha$ for all values $(x, y)$ of $(X, Y)$. Hence the maximum power that can be achieved over the alternatives $H_1$ by any invariant test under $G_l(p)$ is also $\alpha$. But the test which rejects $H_0$ whenever

$$\sum_{i=1}^{p} Y_i^2 \Big/ \sum_{i=1}^{p} X_i^2 > C, \tag{0.5}$$

where the constant $C$ depends on the level $\alpha$, has strictly increasing power $\beta(\Delta)$, whose minimum over the set $\Delta \ge \Delta_1 > 1$ is $\beta(\Delta_1) > \beta(1) = \alpha$. For more discussions and results refer to Giri (1983a, 1983b).
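A quick Monte Carlo check of this phenomenon is sketched below. It is not from the text: $\Sigma = I$ is fixed for simplicity (a special case in which the null distribution of the ratio is known exactly), and $p$, $\alpha$, the grid of $\Delta$ values and the number of replications are arbitrary choices. The ratio test (0.5) has power strictly increasing in $\Delta$, while every invariant test is stuck at power $\alpha$.

```python
# Monte Carlo sketch of Example 0.1.2 (illustrative; Sigma = I is chosen for
# simplicity, and p, alpha, the Delta grid and reps are arbitrary). Every
# G_l(p)-invariant test has power alpha; the ratio test (0.5) does better.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
p, alpha, reps = 5, 0.05, 200_000
C = stats.f.ppf(1 - alpha, p, p)     # with Sigma = I, ratio/Delta ~ F_{p,p}

for Delta in (1.0, 1.5, 2.0, 4.0):
    X = rng.normal(size=(reps, p))                    # N(0, I)
    Y = rng.normal(size=(reps, p)) * np.sqrt(Delta)   # N(0, Delta * I)
    ratio = (Y**2).sum(axis=1) / (X**2).sum(axis=1)
    print(f"Delta = {Delta}: power of ratio test = {(ratio > C).mean():.4f}")
# Power is alpha at Delta = 1 and strictly increasing in Delta, while the
# best invariant test's power stays at alpha = 0.05 for every Delta.
```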
Exercises

1. Let $X_1, \ldots, X_n$ be independently and identically distributed normal random variables with the same mean $\theta$ and the same unknown variance $\sigma^2$, and let $H_0 : \theta = 0$ and $H_1 : \theta \neq 0$.
(a) Find the largest group of transformations which leaves the problem of testing $H_0$ against $H_1$ invariant.
(b) Using the group theoretic notion show that the two-sided Student $t$-test is uniformly most powerful among all tests based on $t$.

2. Univariate General Linear Hypothesis. Let $X_1, \ldots, X_n$ be independently distributed normal random variables with $E(X_i) = \theta_i$, $\mathrm{Var}(X_i) = \sigma^2$, $i = 1, \ldots, n$. Let $\Omega$ be the linear coordinate space of dimension $n$ and let $\Pi_\Omega$ and $\Pi_\omega$ be two linear subspaces of $\Omega$ such that $\Pi_\omega \subset \Pi_\Omega$, $\dim \Pi_\Omega = l$ and $\dim \Pi_\omega = k$, $l > k$. Consider the problem of testing $H_0 : \theta = (\theta_1, \ldots, \theta_n)' \in \Pi_\omega$ against the alternatives $H_1 : \theta \in \Pi_\Omega$.
(a) Find the largest group of transformations which leaves the problem invariant.
(b) Using the group theoretic notions show that the usual $F$-test is uniformly most powerful for testing $H_0$ against $H_1$.

3. Let $X_1, \ldots, X_n$ be independently distributed normal random variables with the same mean $\theta$ and the same variance $\sigma^2$. For testing $H_0 : \sigma^2 = \sigma_0^2$ against $H_1 : \sigma^2 = \sigma_1^2 < \sigma_0^2$, where …

The determinant of the lower triangular matrix $C$ is $\det C = \prod_{i=1}^{p} c_{ii}$. We shall also write $\det C = |C|$ for convenience. A square matrix $C = (c_{ij})$ of order $p$ is an upper triangular matrix if $c_{ij} = 0$ for $i > j$, and $\det C = \prod_{i=1}^{p} c_{ii}$. A square matrix of order $p$ is nonsingular if $\det C \neq 0$; if $\det C = 0$ then $C$ is a singular matrix. A nonsingular matrix $C$ of order $p$ is orthogonal if $CC' = C'C = I$. The inverse of a nonsingular matrix $C$ of order $p$ is the unique matrix $C^{-1}$ such that $CC^{-1} = C^{-1}C = I$. From this it follows that $\det C^{-1} = (\det C)^{-1}$.

A square matrix $C = (c_{ij})$ of order $p$, or the associated quadratic form $x'Cx = \sum_i \sum_j c_{ij} x_i x_j$, is positive definite if $x'Cx > 0$ for all $x = (x_1, \ldots, x_p)' \neq 0$. If $C$ is positive definite, $C^{-1}$ is positive definite, and for any nonsingular matrix $A$ of order $p$, $ACA'$ is also positive definite.

1.1.1. Characteristic Roots and Vectors

The characteristic roots of a square matrix $C$ of order $p$ are given by the roots of the characteristic equation

$$\det(C - \lambda I) = 0, \tag{1.2}$$

where $\lambda$ is real. As $\det(\theta C \theta' - \lambda I) = \det(C - \lambda I)$ for any orthogonal matrix $\theta$ of order $p$, the characteristic roots of $C$ remain invariant under the transformation $C \to \theta C \theta'$. The vector $x = (x_1, \ldots, x_p)' \neq 0$ satisfying

$$(C - \lambda I)x = 0 \tag{1.3}$$

is the characteristic vector of $C$ corresponding to its characteristic root $\lambda$. If $x$ is a characteristic vector of $C$ corresponding to its characteristic root $\lambda$, then any scalar multiple $ax$, $a \neq 0$, is also a characteristic vector of $C$ corresponding to $\lambda$.

Some Results on Characteristic Roots and Vectors

1. The characteristic roots of a real symmetric matrix are real.
2. The characteristic vectors corresponding to distinct characteristic roots of a symmetric matrix are orthogonal.
3. The characteristic roots of a symmetric positive definite matrix $C$ are all positive.
4. Given any real symmetric matrix $C$ of order $p$, there exists an orthogonal matrix $\theta$ of order $p$ such that $\theta C \theta'$ is a diagonal matrix $D(\lambda_1, \ldots, \lambda_p)$, where $\lambda_1, \ldots, \lambda_p$ are the characteristic roots of $C$. Hence $\det C = \prod_{i=1}^{p} \lambda_i$.
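These facts are easy to verify numerically. The sketch below is illustrative only; the dimension and the random matrices are arbitrary choices. It checks the determinant rule for triangular matrices, the positivity of the roots of a symmetric positive definite matrix, the identity $\det C = \prod_i \lambda_i$, and the invariance of the characteristic roots under $C \to \theta C \theta'$.

```python
# Numerical check of the Sec. 1.1-1.1.1 facts (illustrative; p and the
# random matrices are arbitrary choices).
import numpy as np

rng = np.random.default_rng(3)
p = 4

# det of a triangular matrix is the product of its diagonal elements.
L = np.tril(rng.normal(size=(p, p)))
assert np.isclose(np.linalg.det(L), np.prod(np.diag(L)))

# A symmetric positive definite C: all characteristic roots positive,
# and det C equals their product.
A = rng.normal(size=(p, p))
C = A @ A.T + p * np.eye(p)
roots = np.linalg.eigvalsh(C)
assert (roots > 0).all()
assert np.isclose(np.linalg.det(C), roots.prod())

# Characteristic roots are invariant under C -> theta C theta' for an
# orthogonal theta (here obtained from a QR decomposition).
theta, _ = np.linalg.qr(rng.normal(size=(p, p)))
assert np.allclose(roots, np.linalg.eigvalsh(theta @ C @ theta.T))
print("all checks passed; roots of C:", np.round(roots, 3))
```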