Journal of Mathematical Sciences, Vol. 128, No. 1, 2005
A BERRY–ESSEEN BOUND FOR U-STATISTICS
L. V. Gadasina
UDC 519.21
The rate of convergence in the central limit theorem for nondegenerate U -statistics of n independent random variables is investigated under minimal sufficient moment conditions on canonical functions of the Hoeffding representation. Bibliography: 12 titles.
1. Introduction

Let \(X_1, \dots, X_n\) be independent random variables with values in a measurable space \((\mathfrak{X}, \mathfrak{B})\). Assume that \(n \ge m \ge 2\) and consider a U-statistic
\[
U_n = U_n(X_1, \dots, X_n) = \sum_{1 \le i_1 < \dots < i_m \le n} \Phi_{i_1 \dots i_m}(X_{i_1}, \dots, X_{i_m}).
\]
If \(\sigma_n > 0\), then
\[
\Delta_n \le A_m \Bigl( \sigma_n^{-2} \sum_{j=1}^{n} E g_j^2 I(|g_j| > \sigma_n)
 + \sigma_n^{-3} \sum_{j=1}^{n} E|g_j|^3 I(|g_j| \le \sigma_n)
 + \sigma_n^{-1} \sum_{c=2}^{m} \sum_{\check I_c} E|g_{I_c}| I(|g_{I_c}| > \sigma_n) \Bigr),
\]
where \(\sum_{\check I_c}\) denotes summation over all \(1 \le i_1 < \dots < i_c \le n\).
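As a concrete illustration of the definition above (a sketch added here, not from the paper): with \(m = 2\) and the symmetric kernel \(\Phi(x, y) = (x - y)^2/2\), the unnormalized sum over pairs equals \(\binom{n}{2}\) times the unbiased sample variance. The data below are an arbitrary illustrative choice.

```python
from itertools import combinations

def u_statistic(xs, phi, m):
    """Unnormalized U-statistic: sum of phi over all index tuples i_1 < ... < i_m."""
    return sum(phi(*(xs[i] for i in idx)) for idx in combinations(range(len(xs)), m))

# Illustrative data; kernel phi(x, y) = (x - y)^2 / 2 with m = 2.
xs = [1.0, 2.0, 3.0, 4.0]
n = len(xs)
un = u_statistic(xs, lambda x, y: 0.5 * (x - y) ** 2, 2)

# For this kernel, U_n = C(n, 2) * (unbiased sample variance).
mean = sum(xs) / n
s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)
pairs = n * (n - 1) / 2
```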
and
\[
h = \Bigl( \sum_{j=1}^{n} E|z_j|^3 \Bigr)^{-1}. \tag{2}
\]
Without loss of generality, we may assume that \(h > 2^{m+3}\). By the Esseen inequality,
\[
\sup_x |P(s_n + v_n + w_n < x) - \Phi(x)| \le \frac{2}{\pi} \int_0^h |E e^{it(s_n+v_n+w_n)} - e^{-t^2/2}|\, t^{-1}\, dt + \frac{24}{\pi\sqrt{2\pi}}\, h^{-1}
\]
\[
\le \frac{2}{\pi} \int_0^h |E e^{its_n} - e^{-t^2/2}|\, t^{-1}\, dt
 + \frac{2}{\pi} \int_0^{2^{m+3}} |E e^{it(s_n+v_n+w_n)} - E e^{its_n}|\, t^{-1}\, dt
\]
\[
 + \frac{2}{\pi} \int_{2^{m+3}}^{h} |E e^{it(s_n+v_n+w_n)} - E e^{its_n}|\, t^{-1}\, dt + \frac{24}{\pi\sqrt{2\pi}}\, h^{-1}
 = \varepsilon_1 + \varepsilon_2 + \varepsilon_3 + \frac{24}{\pi\sqrt{2\pi}}\, h^{-1}.
\]
To estimate \(\varepsilon_1\), we use [6, Chapter XVI.5, Theorem 2] with \(T = h\) instead of \(T = 8h/9\) as follows:
\[
\varepsilon_1 \le \pi^{-1} \bar\sigma_n^{-3} \sum_{j=1}^{n} E\Bigl| g_{jn} + \sum_{c=2}^{m} h_{cjn} \Bigr|^3
\le 16 \pi^{-1} \bar\sigma_n^{-3} \sum_{j=1}^{n} E|g_{jn}|^3
 + 32 \pi^{-1} (m-1)^2 m\, \bar\sigma_n^{-3} \sigma_n^2 \sum_{c=2}^{m} \sum_{\check I_c} E|g_{I_c}| I(|g_{I_c}| > \sigma_n).
\]
The term containing \(h^{-1}\) can be estimated similarly.
Proposition 1. Let \(\gamma_1^{(k)}, \dots, \gamma_n^{(k)}\), \(k = 1, \dots, c\), be a series of sequences consisting of independent identically distributed random variables such that \(\gamma_{i_1}^{(1)}, \dots, \gamma_{i_c}^{(c)}\) are independent for \(i_1 \ne \dots \ne i_c\) and, in addition, do not depend on \(X_1, \dots, X_n\). Then
\[
E\Bigl| \sum_{\check I_c} \gamma_{i_1}^{(1)} \cdots \gamma_{i_c}^{(c)} g_{I_c n} \Bigr|^p
 \le 2^{(2-p)c} \prod_{s=1}^{c} E|\gamma_1^{(s)}|^p \sum_{\check I_c} E|g_{I_c n}|^p,
 \qquad 1 \le p \le 2.
\]
To prove this proposition, we need the following statement.

Proposition 2. For any \(x \in \mathbb{R}\), the following inequalities hold:
(1) \(|e^{ix} - 1| \le |x|\);
(2) \(|e^{ix} - 1| \le 2|x|^{p-1}\), \(1 \le p \le 2\);
(3) \(|e^{ix} - 1 - ix| \le 2|x|^p\), \(1 \le p \le 2\);
(4) \(a \cdot b \le p^{-1} a^p + q^{-1} b^q\), \(a > 0\), \(b > 0\), \(p^{-1} + q^{-1} = 1\).

Estimation of \(\varepsilon_2\). We need the following lemma.

Lemma 1. If \(0 \le t \le h\) and \(1 \le i_1 < \dots < i_c \le n\) for \(c = 1, \dots, m\), then
\[
\prod_{k=1,\, k \notin I_c}^{n} |E \exp(it z_k)| \le \exp\bigl( -t^2 (1/3 - c\, 2^{-(2m+9)/3}) \bigr).
\]
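The elementary bounds of Proposition 2 admit a quick numeric sanity check (a sketch added here, not from the paper). Inequality (2) is checked in the form \(|e^{ix}-1| \le 2|x|^{p-1}\), which is valid for all real \(x\) and \(1 \le p \le 2\); without a constant it would fail at \(p = 1\), \(x = \pi\). The grids are arbitrary illustrative choices.

```python
import cmath

def prop2_holds(x, p, tol=1e-12):
    e = cmath.exp(1j * x)
    return (abs(e - 1) <= abs(x) + tol                         # (1)
            and abs(e - 1) <= 2 * abs(x) ** (p - 1) + tol      # (2), with the factor 2
            and abs(e - 1 - 1j * x) <= 2 * abs(x) ** p + tol)  # (3)

xs = [k / 7.0 for k in range(-70, 71) if k != 0]
ps = [1.0, 1.3, 1.7, 2.0]
ok123 = all(prop2_holds(x, p) for x in xs for p in ps)

# (4) Young's inequality: a*b <= a^p / p + b^q / q for conjugate exponents p, q.
ok4 = True
for p in [1.25, 1.5, 2.0]:
    q = p / (p - 1)
    for a in [0.1, 0.7, 1.0, 2.5]:
        for b in [0.2, 1.0, 3.0]:
            ok4 = ok4 and a * b <= a ** p / p + b ** q / q + 1e-12
```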
Proof. Expanding \(\exp(it z_k)\) into the Taylor series, we see that
\[
|E \exp(it z_k)| \le 1 - \frac{1}{2} E z_k^2\, t^2 + \frac{1}{6} |t|^3 E|z_k|^3
 \le 1 - \frac{1}{2} t^2 \Bigl( E z_k^2 - \frac{1}{3} h E|z_k|^3 \Bigr)
 \le \exp\Bigl( -\frac{1}{2} t^2 \Bigl( E z_k^2 - \frac{1}{3} h E|z_k|^3 \Bigr) \Bigr),
\]
where the last inequality follows from the inequality \(1 - x \le e^{-x}\), which holds for all \(x \in \mathbb{R}\). Thus,
\[
\prod_{k=1,\, k \notin I_c}^{n} |E \exp(it z_k)|
 \le \exp\Bigl( -\frac{1}{2} t^2 \Bigl( \sum_{k \notin I_c} E z_k^2 - \frac{1}{3} h \sum_{k \notin I_c} E|z_k|^3 \Bigr) \Bigr)
 = \exp\Bigl( -\frac{1}{2} t^2 \Bigl( \frac{2}{3} - E z_{i_1}^2 - \dots - E z_{i_c}^2 + \frac{1}{3} h E|z_{i_1}|^3 + \dots + \frac{1}{3} h E|z_{i_c}|^3 \Bigr) \Bigr).
\]
In addition, \(E|z_i|^3 \ge 0\) and \(E z_i^2 \le (E|z_i|^3)^{2/3} \le h^{-2/3} \le 2^{-2(m+3)/3}\). This proves the lemma.

Consider the integrand in \(\varepsilon_2\):
\[
E e^{it(s_n+v_n+w_n)} - E e^{its_n}
 = E e^{its_n} (e^{it(v_n+w_n)} - e^{itv_n} + e^{itv_n} - 1)
 = E e^{it(s_n+v_n)} (e^{itw_n} - 1) + E e^{its_n} (e^{itv_n} - 1)
\]
\[
 = E e^{it(s_n+v_n)} (e^{itw_n} - 1 - itw_n) + it\, E e^{its_n} w_n + it\, E e^{its_n} (e^{itv_n} - 1) w_n + E e^{its_n} (e^{itv_n} - 1)
 = L_1 + L_2 + L_3 + L_4.
\]
To estimate \(L_1\), we apply the third inequality of Proposition 2 and then Proposition 1 as follows:
\[
|L_1| \le 2|t|^p E|w_n|^p
 \le 2^{(2-p)m+1} (m-1)^{p-1} \bar\sigma_n^{-p} |t|^p \sum_{c=2}^{m} \sum_{\check I_c} E|g_{I_c n}|^p,
 \qquad 1 \le p \le 2.
\]
Now we estimate \(L_2\):
\[
|L_2| \le \bar\sigma_n^{-1} |t| \sum_{c=2}^{m} \sum_{\check I_c}
 \Bigl| E \exp\Bigl( it \sum_{k \notin I_c} z_k \Bigr) e^{itz_{i_1}} \cdots e^{itz_{i_c}} g_{I_c n} \Bigr|
 = \bar\sigma_n^{-1} |t| \sum_{c=2}^{m} \sum_{\check I_c}
 \Bigl| E \exp\Bigl( it \sum_{k \notin I_c} z_k \Bigr) \Bigr| \cdot
 |E (e^{it\hat z_{i_1}} - 1) \cdots (e^{it\hat z_{i_c}} - 1) g_{I_c n}|.
\]
The last equality holds since the functions \(g_{I_c n}\) are degenerate. By Lemma 1,
\[
|L_2| \le \bar\sigma_n^{-1} |t|^{m+1} \exp\bigl( -t^2 (1/3 - m\, 2^{-(2m+9)/3}) \bigr)
 \sum_{c=2}^{m} \sum_{\check I_c} E|\hat z_{i_1} \cdots \hat z_{i_c} g_{I_c n}|.
\]
Now we pass to \(L_3\). The Hölder inequality with \(1 < p \le 2\) and \(q = p/(p-1)\) implies that
\(|L_3| \le |t| (E|e^{itv_n} - 1|^q)^{1/q} (E|w_n|^p)^{1/p}\). By Propositions 1 and 2,
\[
|L_3| \le 2^{1/q} |t|^{1+1/q} (E|v_n|)^{1/q} (E|w_n|^p)^{1/p}
 \le \frac{2}{q} |t| E|v_n| + \frac{1}{p} |t|^p E|w_n|^p
\]
\[
 \le \frac{2}{q}\, 3^m |t| \bar\sigma_n^{-1} \sum_{c=3}^{m} \sum_{\check I_c} E|g_{I_c}| I(|g_{I_c}| > \sigma_n)
 + \frac{1}{p}\, 2^{(2-p)m} (m-1)^{p-1} |t|^p \bar\sigma_n^{-p} \sum_{c=2}^{m} \sum_{\check I_c} E|g_{I_c n}|^p.
\]
Finally,
\[
|L_4| \le |t| E|v_n|
 \le \bar\sigma_n^{-1} |t| \sum_{d=2}^{m-1} \sum_{c=d+1}^{m} \sum_{\check J_d} E|g_{I_c J_d n}|
 \le 3^m \bar\sigma_n^{-1} |t| \sum_{c=3}^{m} \sum_{\check I_c} E|g_{I_c}| I(|g_{I_c}| > \sigma_n).
\]
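Two ingredients of the proof of Lemma 1 are easy to verify independently (a sketch added here, not from the paper): the bound \(1 - x \le e^{-x}\), the Lyapunov-type inequality \(E z^2 \le (E|z|^3)^{2/3}\), and the arithmetic \(\tfrac12 \cdot 2^{-2(m+3)/3} = 2^{-(2m+9)/3}\) that turns the exponent \(2/3 - c\,2^{-2(m+3)/3}\) of the proof into the \(1/3 - c\,2^{-(2m+9)/3}\) of the statement. The discrete test laws are illustrative choices.

```python
import math

# (i) 1 - x <= exp(-x) for all real x, used to exponentiate the Taylor bound.
ok_exp = all(1 - k / 10.0 <= math.exp(-k / 10.0) + 1e-12 for k in range(-50, 51))

# (ii) Lyapunov-type bound E z^2 <= (E|z|^3)^(2/3), by exact enumeration
#      of a few discrete laws given as (value, probability) pairs.
laws = [
    [(-1.0, 0.5), (1.0, 0.5)],
    [(-2.0, 0.2), (0.0, 0.3), (1.0, 0.5)],
    [(-0.5, 0.7), (2.0, 0.3)],
]
ok_lyap = True
for law in laws:
    m2 = sum(p * v * v for v, p in law)
    m3 = sum(p * abs(v) ** 3 for v, p in law)
    ok_lyap = ok_lyap and m2 <= m3 ** (2.0 / 3.0) + 1e-12

# (iii) (1/2) * 2^(-2(m+3)/3) == 2^(-(2m+9)/3), the exponent bookkeeping.
ok_arith = all(abs(0.5 * 2 ** (-2 * (m + 3) / 3) - 2 ** (-(2 * m + 9) / 3)) < 1e-15
               for m in range(2, 12))
```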
Estimation of \(\varepsilon_3\). Decompose the integrand as follows:
\[
E e^{it(s_n+v_n+w_n)} - E e^{its_n}
 = (E e^{it(s_n+w_n)} - E e^{its_n}) + (E e^{it(s_n+v_n+w_n)} - E e^{it(s_n+w_n)}) = I_1 + I_2.
\]
We use a randomization method (see, e.g., [1, 2]). Fix \(t \in [2^{m+3}, h]\) and consider a sequence of mutually independent and independent of \(X_1, \dots, X_n\) random variables \(\alpha_1, \dots, \alpha_n\) with the Bernoulli distribution such that
\[
P(\alpha_j = 1) = 1 - P(\alpha_j = 0) = f(t) = 9 \cdot 2^{m-1} t^{-2} \log(t), \qquad j = 1, \dots, n.
\]
Note that \(f(t) \in [0, 1]\) for all such \(t\). The random variables \(X_j\), \(j = 1, \dots, n\), can be represented in the following form:
\[
X_j \stackrel{d}{=} \alpha_j \bar X_j + (1 - \alpha_j) \tilde X_j, \qquad j = 1, \dots, n,
\]
where \(\bar X_j\) and \(\tilde X_j\) are independent copies of \(X_j\) for all \(j = 1, \dots, n\). In this case,
\[
s_n \stackrel{d}{=} \sum_{j=1}^{n} \alpha_j \bar z_j + \sum_{j=1}^{n} (1 - \alpha_j) \tilde z_j
 = s_n^{(1)}(\bar X) + s_n^{(2)}(\tilde X).
\]
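The claim that \(f(t) = 9 \cdot 2^{m-1} t^{-2} \log(t)\) is a valid Bernoulli probability on the range \(t \ge 2^{m+3}\) can be checked directly: at the left endpoint \(f(2^{m+3}) = 9(m+3)\log 2 \cdot 2^{-m-7} < 1\), and \(t^{-2}\log t\) decreases from there. A quick check (the grid is an arbitrary illustrative choice):

```python
import math

def f(t, m):
    # Bernoulli success probability from the randomization step.
    return 9 * 2 ** (m - 1) * math.log(t) / t ** 2

ok_range = True
for m in range(2, 10):
    t0 = 2 ** (m + 3)
    for k in range(40):          # grid of points t >= 2^(m+3)
        t = t0 * 1.5 ** k
        ok_range = ok_range and 0.0 <= f(t, m) <= 1.0
```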
Since the functions \(g_{I_c n}\) are additive and homogeneous,
\[
w_n \stackrel{d}{=} \sum_{c=2}^{m} \sum_{k=0}^{c} \bar W_{ck},
\]
where
\[
\bar W_{ck} = \bar\sigma_n^{-1} \sum_{\check L_k} \sum_{\check I_c,\, I_c \supset L_k}
 \prod_{s=1}^{k} \alpha_{l_s} \prod_{s=k+1}^{c} (1 - \alpha_{i_s})\,
 g_{I_c n}(\bar X_{L_k}, \tilde X_{I_c \setminus L_k}), \tag{6}
\]
and
\[
v_n \stackrel{d}{=} \sum_{d=2}^{m-1} \sum_{c=d+1}^{m} \sum_{k=0}^{d} \bar V_{cdk},
\]
where
\[
\bar V_{cdk} = \bar\sigma_n^{-1} \sum_{\check L_k} \sum_{\check J_d,\, J_d \supset L_k}
 \prod_{s=1}^{k} \alpha_{l_s} \prod_{s=k+1}^{d} (1 - \alpha_{j_s})\,
 g_{I_c J_d n}(\bar X_{L_k}, \tilde X_{J_d \setminus L_k}).
\]
Note that \(\bar W_{ck}\) consists of \(\binom{c}{k}\) summands and \(\bar V_{cdk}\) consists of \(\binom{d}{k}\) summands.
Lemma 2. Denote the sequence \(\alpha_1, \dots, \alpha_n\) by \(\alpha\). Then
\[
E|E[e^{its_n^{(1)}} \mid \alpha]|^2 \le t^{-3 \cdot 2^{m-1}},
\qquad
E|E[e^{its_n^{(2)}} \mid \alpha]|^2 \le t^{3 \cdot 2^{m-1}} e^{-t^2/3},
\]
\[
E\Bigl| E\Bigl[ \exp\Bigl( it \sum_{k \notin I_c} \alpha_k z_k \Bigr) \Bigm| \alpha \Bigr] \Bigr|^2
 \le t^{-3 \cdot 2^{m/3-1} (2^{2m/3} - 3c/4)},
\]
and
\[
E\Bigl| E\Bigl[ \exp\Bigl( it \sum_{k \notin I_c} (1 - \alpha_k) z_k \Bigr) \Bigm| \alpha \Bigr] \Bigr|^2
 \le t^{3 \cdot 2^{m/3-1} (2^{2m/3} - 3c/4)} \exp\bigl( -t^2 (1/3 - c\, 2^{-2(m+3)/3}) \bigr).
\]
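The powers of \(t\) in Lemma 2 arise from \(\exp(-t^2 f(t) X) = t^{-9 \cdot 2^{m-1} X}\), so the third bound amounts to the algebraic identity \(9 \cdot 2^{m-1}(1/3 - c\, 2^{-2(m+3)/3}) = 3 \cdot 2^{m/3-1}(2^{2m/3} - 3c/4)\). A numeric confirmation of this bookkeeping (added here as a check, not part of the paper):

```python
# Verify 9*2^(m-1) * (1/3 - c*2^(-2(m+3)/3)) == 3*2^(m/3-1) * (2^(2m/3) - 3c/4)
ok_identity = True
for m in range(2, 15):
    for c in range(1, m + 1):
        lhs = 9 * 2 ** (m - 1) * (1.0 / 3.0 - c * 2.0 ** (-2 * (m + 3) / 3))
        rhs = 3 * 2.0 ** (m / 3 - 1) * (2.0 ** (2 * m / 3) - 3.0 * c / 4.0)
        ok_identity = ok_identity and abs(lhs - rhs) <= 1e-9 * max(1.0, abs(lhs))
```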
Proof. Consider a sequence of random variables \(\breve X_1, \dots, \breve X_n\) such that the \(\breve X_j\) are independent copies of \(X_j\) for all \(j = 1, \dots, n\). Then \(\bar z_j\), \(\tilde z_j\), and \(z_j(\breve X_j) = \breve z_j\) are independent and identically distributed for all \(j = 1, \dots, n\). It follows that
\[
E\Bigl| E\Bigl[ \exp\Bigl( it \sum_{j=1}^{n} \alpha_j \bar z_j \Bigr) \Bigm| \alpha \Bigr] \Bigr|^2
 = E\, E\Bigl[ \exp\Bigl( it \sum_{j=1}^{n} \alpha_j (\bar z_j - \breve z_j) \Bigr) \Bigm| \alpha \Bigr]
 = E \exp\Bigl( it \sum_{j=1}^{n} \alpha_j (\bar z_j - \breve z_j) \Bigr)
 = \prod_{j=1}^{n} E \exp(it \alpha_j (\bar z_j - \breve z_j)).
\]
Expand \(\exp(it \alpha_j (\bar z_j - \breve z_j))\) into the Taylor series:
\[
|E \exp(it \alpha_j (\bar z_j - \breve z_j))|
 \le 1 - \frac{1}{2} t^2 E(\alpha_j (\bar z_j - \breve z_j))^2 + \frac{1}{6} |t|^3 E(\alpha_j |\bar z_j - \breve z_j|)^3.
\]
Using the inequality \(E|\bar z_j - \breve z_j|^3 \le 4 E|z_j|^3\), we see that
\[
|E \exp(it \alpha_j (\bar z_j - \breve z_j))|
 \le 1 - t^2 f(t) \Bigl( E z_j^2 - \frac{2}{3} h E|z_j|^3 \Bigr)
 \le \exp\Bigl( -t^2 f(t) \Bigl( E z_j^2 - \frac{2}{3} h E|z_j|^3 \Bigr) \Bigr).
\]
Hence,
\[
E|E[e^{its_n^{(1)}} \mid \alpha]|^2
 \le \exp\Bigl( -t^2 f(t) \Bigl( \sum_{j=1}^{n} E z_j^2 - \frac{2}{3} h \sum_{j=1}^{n} E|z_j|^3 \Bigr) \Bigr)
 = \exp\bigl( -3 \cdot 2^{m-1} t^2 t^{-2} \log(t) \bigr) = t^{-3 \cdot 2^{m-1}}.
\]
This proves the first inequality of the lemma. We proceed similarly to prove the second inequality:
\[
E|E[e^{its_n^{(2)}} \mid \alpha]|^2 \le \exp\Bigl( -\frac{1}{3} t^2 (1 - f(t)) \Bigr) = t^{3 \cdot 2^{m-1}} e^{-t^2/3}.
\]
The last two inequalities are obtained similarly to Lemma 1:
\[
E\Bigl| E\Bigl[ \exp\Bigl( it \sum_{k \notin I_c} \alpha_k z_k \Bigr) \Bigm| \alpha \Bigr] \Bigr|^2
 \le \exp\Bigl( -t^2 f(t) \Bigl( \frac{1}{3} - \sum_{s=1}^{c} E z_{i_s}^2 + \frac{2}{3} h \sum_{s=1}^{c} E|z_{i_s}|^3 \Bigr) \Bigr)
 \le \exp\Bigl( -t^2 f(t) \Bigl( \frac{1}{3} - c\, 2^{-2(m+3)/3} \Bigr) \Bigr)
 = t^{-3 \cdot 2^{m/3-1}(2^{2m/3} - 3c/4)}.
\]
The proof of Lemma 2 is completed.
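The proof above uses \(E|\bar z_j - \breve z_j|^3 \le 4 E|z_j|^3\) for independent copies of a centered variable. A spot check by exact enumeration of a few centered discrete laws (an illustration added here, not a proof; the symmetric two-point law attains equality):

```python
from itertools import product

laws = [                       # centered discrete laws, illustrative choices
    [(-1.0, 0.5), (1.0, 0.5)],
    [(-1.0, 0.8), (4.0, 0.2)],
    [(-3.0, 0.25), (1.0, 0.75)],
]
ok_sym = True
for law in laws:
    mean = sum(p * v for v, p in law)
    m3 = sum(p * abs(v) ** 3 for v, p in law)
    d3 = sum(pa * pb * abs(a - b) ** 3
             for (a, pa), (b, pb) in product(law, law))
    ok_sym = ok_sym and abs(mean) < 1e-12 and d3 <= 4 * m3 + 1e-12
```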
Decompose \(I_1\) as follows:
\[
I_1 = E \exp\Bigl( it \Bigl( s_n^{(1)} + s_n^{(2)} + \sum_{c=2}^{m} \sum_{k=0}^{c} \bar W_{ck} \Bigr) \Bigr)
 - E \exp(it(s_n^{(1)} + s_n^{(2)}))
\]
\[
 = E \exp\Bigl( it \Bigl( s_n^{(1)} + s_n^{(2)} + \sum_{c=2}^{m} \bar W_{cc} \Bigr) \Bigr)
 - E \exp(it(s_n^{(1)} + s_n^{(2)}))
\]
\[
 + E \exp\Bigl( it \Bigl( s_n^{(1)} + s_n^{(2)} + \sum_{c=2}^{m} (\bar W_{cc} + \bar W_{c0}) \Bigr) \Bigr)
 - E \exp\Bigl( it \Bigl( s_n^{(1)} + s_n^{(2)} + \sum_{c=2}^{m} \bar W_{cc} \Bigr) \Bigr)
\]
\[
 + E \exp\Bigl( it \Bigl( s_n^{(1)} + s_n^{(2)} + \sum_{c=2}^{m} \sum_{k=0}^{c} \bar W_{ck} \Bigr) \Bigr)
 - E \exp\Bigl( it \Bigl( s_n^{(1)} + s_n^{(2)} + \sum_{c=2}^{m} (\bar W_{cc} + \bar W_{c0}) \Bigr) \Bigr)
 = I_{11} + I_{12} + I_{13}.
\]
Now we transform the first summand:
\[
I_{11} = E \exp(it(s_n^{(1)} + s_n^{(2)}))
 \Bigl( \exp\Bigl( it \sum_{c=2}^{m} \bar W_{cc} \Bigr) - 1 - it \sum_{c=2}^{m} \bar W_{cc} \Bigr)
 + it\, E \exp(it(s_n^{(1)} + s_n^{(2)})) \sum_{c=2}^{m} \bar W_{cc} = I_{111} + I_{112}.
\]
For a fixed \(\alpha\), the random variable \(s_n^{(2)}\) does not depend on \(s_n^{(1)}\) and \(\bar W_{cc}\). Hence, Proposition 2 implies the inequality
\[
|I_{111}| \le 2|t|^p (m-1)^{p-1} \sum_{c=2}^{m}
 E\bigl| E[e^{its_n^{(2)}} \mid \alpha] \cdot E[|\bar W_{cc}|^p \mid \alpha] \bigr|.
\]
Using the Hölder inequality, Proposition 1, and Lemma 2, we see that
\[
|I_{111}| \le A_m \bar\sigma_n^{-p} |t|^p \sum_{c=2}^{m}
 (E|E[e^{its_n^{(2)}} \mid \alpha]|^2)^{1/2}
 \Bigl( \prod_{s=1}^{c} E|\alpha_{i_s}|^{2p} \Bigr)^{1/2}
 \sum_{\check I_c} E|g_{I_c n}|^p
 \le A_m \bar\sigma_n^{-p} |t|^{p + 3 \cdot 2^{m-2}} e^{-t^2/6} f(t)
 \sum_{c=2}^{m} \sum_{\check I_c} E|g_{I_c n}|^p,
 \qquad 1 \le p \le 2.
\]
Now we estimate \(I_{112}\):
\[
|I_{112}| \le |t| \bar\sigma_n^{-1} \sum_{c=2}^{m} \sum_{\check I_c}
 E\Bigl| E[e^{its_n^{(2)}} \mid \alpha] \cdot
 E\Bigl[ \exp\Bigl( it \sum_{k \notin I_c} \alpha_k \bar z_k \Bigr) \Bigm| \alpha \Bigr]
 \prod_{s=1}^{c} \alpha_{i_s}\,
 E\Bigl[ \prod_{s=1}^{c} e^{it\alpha_{i_s} \bar z_{i_s}} g_{I_c n} \Bigm| \alpha \Bigr] \Bigr|.
\]
Similarly to the case of \(L_2\), we get the estimate
\[
|I_{112}| \le |t| \bar\sigma_n^{-1} \sum_{c=2}^{m} \sum_{\check I_c}
 E\Bigl| E[e^{its_n^{(2)}} \mid \alpha] \cdot
 E\Bigl[ \exp\Bigl( it \sum_{k \notin I_c} \alpha_k \bar z_k \Bigr) \Bigm| \alpha \Bigr]
 \prod_{s=1}^{c} \alpha_{i_s}
 \times E[(e^{it\alpha_{i_1} \hat z_{i_1}} - 1) \cdots (e^{it\alpha_{i_c} \hat z_{i_c}} - 1) g_{I_c n} \mid \alpha] \Bigr|.
\]
By the Hölder inequality, Proposition 2, and Lemma 2,
\[
|I_{112}| \le |t|^{m+1} \bar\sigma_n^{-1} \sum_{c=2}^{m} \sum_{\check I_c}
 (E|E[e^{its_n^{(2)}} \mid \alpha]|^2)^{1/2}
 \Bigl( E\Bigl| E\Bigl[ \exp\Bigl( it \sum_{k \notin I_c} \alpha_k z_k \Bigr) \Bigm| \alpha \Bigr] \Bigr|^2 \Bigr)^{1/2}
 \Bigl( \prod_{s=1}^{c} E\alpha_{i_s}^4 \Bigr)^{1/2}
 \times E|\hat z_{i_1} \cdots \hat z_{i_c} g_{I_c n}|
\]
\[
 \le A_m \bar\sigma_n^{-1} |t|^{(3m/4)(1 + 3 \cdot 2^{m/3-1}) + 1} e^{-t^2/6} \log^{m/2}(t)
 \sum_{c=2}^{m} \sum_{\check I_c} E|\hat z_{i_1} \cdots \hat z_{i_c} g_{I_c n}|.
\]
We decompose \(I_{12}\) as follows:
\[
I_{12} = E \exp(it(s_n^{(1)} + s_n^{(2)}))
 \Bigl( \exp\Bigl( it \sum_{c=2}^{m} \bar W_{cc} \Bigr) - 1 \Bigr)
 \Bigl( \exp\Bigl( it \sum_{c=2}^{m} \bar W_{c0} \Bigr) - 1 \Bigr)
\]
\[
 + E \exp(it(s_n^{(1)} + s_n^{(2)}))
 \Bigl( \exp\Bigl( it \sum_{c=2}^{m} \bar W_{c0} \Bigr) - 1 - it \sum_{c=2}^{m} \bar W_{c0} \Bigr)
 + it\, E \exp(it(s_n^{(1)} + s_n^{(2)})) \sum_{c=2}^{m} \bar W_{c0}
 = I_{121} + I_{122} + I_{123}.
\]
The Hölder inequality with \(1 < p \le 2\) and \(q = p/(p-1)\) and Propositions 1 and 2 imply that
\[
|I_{121}| \le \Bigl( E\Bigl| \exp\Bigl( it \sum_{c=2}^{m} \bar W_{cc} \Bigr) - 1 \Bigr|^p \Bigr)^{1/p}
 \Bigl( E\Bigl| \exp\Bigl( it \sum_{c=2}^{m} \bar W_{c0} \Bigr) - 1 \Bigr|^q \Bigr)^{1/q}
\]
\[
 \le 2^{1/q} |t|^{1+p/q} (m-1)^{p-1}
 \Bigl( \sum_{c=2}^{m} E|\bar W_{cc}|^p \Bigr)^{1/p}
 \Bigl( \sum_{c=2}^{m} E|\bar W_{c0}|^p \Bigr)^{1/q}
 \le A_m |t|^p f(t)^{2/p} \bar\sigma_n^{-p}
 \sum_{c=2}^{m} \sum_{\check I_c} E|g_{I_c n}|^p,
 \qquad 1 \le p < 2.
\]
For a fixed \(\alpha\), \(s_n^{(1)}\) does not depend on \(\bar W_{c0}\) and \(s_n^{(2)}\). The same reasoning as in the case of \(I_{111}\) shows that
\[
|I_{122}| \le A_m \bar\sigma_n^{-p} |t|^{p - 3 \cdot 2^{m-2}}
 \sum_{c=2}^{m} \sum_{\check I_c} E|g_{I_c n}|^p,
 \qquad 1 \le p \le 2.
\]
We estimate \(I_{123}\) similarly to \(I_{112}\):
\[
|I_{123}| \le |t| \bar\sigma_n^{-1} \sum_{c=2}^{m} \sum_{\check I_c}
 E\Bigl| E[e^{its_n^{(1)}} \mid \alpha] \cdot
 E\Bigl[ \exp\Bigl( it \sum_{k \notin I_c} (1 - \alpha_k) z_k \Bigr) \Bigm| \alpha \Bigr]
 \prod_{s=1}^{c} (1 - \alpha_{i_s})
 \times E[(e^{it(1-\alpha_{i_1})\hat z_{i_1}} - 1) \cdots (e^{it(1-\alpha_{i_c})\hat z_{i_c}} - 1) g_{I_c n} \mid \alpha] \Bigr|
\]
\[
 \le |t|^{1 - 9m \cdot 2^{m/3-4}} e^{-t^2 (1/6 - m 2^{-2m/3-3})} \bar\sigma_n^{-1}
 \sum_{c=2}^{m} \sum_{\check I_c} E|\hat z_{i_1} \cdots \hat z_{i_c} g_{I_c n}|.
\]
For \(I_{13}\), we use the following representation:
\[
I_{13} = E \exp\Bigl( it \Bigl( s_n^{(1)} + s_n^{(2)} + \sum_{c=2}^{m} \bar W_{c0} \Bigr) \Bigr)
 \Bigl( \exp\Bigl( it \sum_{c=2}^{m} \bar W_{cc} \Bigr) - 1 \Bigr)
 \Bigl( \exp\Bigl( it \sum_{c=2}^{m} \sum_{k=1}^{c-1} \bar W_{ck} \Bigr) - 1 \Bigr)
\]
\[
 + E \exp\Bigl( it \Bigl( s_n^{(1)} + s_n^{(2)} + \sum_{c=2}^{m} \bar W_{c0} \Bigr) \Bigr)
 \Bigl( \exp\Bigl( it \sum_{c=2}^{m} \sum_{k=1}^{c-1} \bar W_{ck} \Bigr) - 1 - it \sum_{c=2}^{m} \sum_{k=1}^{c-1} \bar W_{ck} \Bigr)
\]
\[
 + it\, E \exp\Bigl( it \Bigl( s_n^{(1)} + s_n^{(2)} + \sum_{c=2}^{m} \bar W_{c0} \Bigr) \Bigr)
 \sum_{c=2}^{m} \sum_{k=1}^{c-1} \bar W_{ck}
 = I_{131} + I_{132} + I_{133},
\]
where the values \(I_{131}\) and \(I_{132}\) can be estimated as above.
To estimate \(I_{133}\), we need a series of randomizations. Let \(\eta_1^{(s)}, \dots, \eta_n^{(s)}\), \(s = 1, \dots, m-1\), be a series of independent identically distributed random variables that are independent of \(\alpha\) and of \(X_1, \dots, X_n\) and have the Bernoulli distributions
\[
P(\eta_j^{(s)} = 1) = 1 - P(\eta_j^{(s)} = 0) = 1/2, \qquad j = 1, \dots, n, \quad s = 1, \dots, m-1.
\]
We denote this series by \(\eta\). In this case, \(\bar X_j\) can be represented in the following way:
\[
\bar X_j \stackrel{d}{=} \eta_j^{(1)} Y_j^{(1)} + (1 - \eta_j^{(1)}) \bar Y_j^{(1)},
\]
where \(Y_j^{(1)}\) and \(\bar Y_j^{(1)}\) are independent copies of \(\bar X_j\) for all \(j = 1, \dots, n\). Further,
\[
Y_j^{(1)} \stackrel{d}{=} \eta_j^{(2)} Y_j^{(2)} + (1 - \eta_j^{(2)}) \bar Y_j^{(2)},
\quad \dots, \quad
Y_j^{(m-2)} \stackrel{d}{=} \eta_j^{(m-1)} Y_j^{(m-1)} + (1 - \eta_j^{(m-1)}) \bar Y_j^{(m-1)},
\]
where \(Y_j^{(s+1)}\) and \(\bar Y_j^{(s+1)}\) are independent copies of \(Y_j^{(s)}\) for \(s = 1, \dots, m-2\).
In this case,
\[
s_n^{(1)}(\bar X) \stackrel{d}{=} s_{n1}^{(1)}(Y^{(1)}) + s_{n1}^{(2)}(\bar Y^{(1)})
 = \sum_{j=1}^{n} \eta_j^{(1)} \alpha_j z_j(Y_j^{(1)}) + \sum_{j=1}^{n} (1 - \eta_j^{(1)}) \alpha_j z_j(\bar Y_j^{(1)}).
\]
For \(s = 1, \dots, m-2\),
\[
s_{ns}^{(1)} = s_{n,s+1}^{(1)} + s_{n,s+1}^{(2)} = s_{n,s+1}^{(1)}(Y^{(s+1)}) + s_{n,s+1}^{(2)}(\bar Y^{(s+1)}).
\]
Set \(s_{n0}^{(1)} := s_n^{(1)}\) and \(s_{n0}^{(2)} := s_n^{(2)}\). The equalities
\[
s_{n0}^{(1)} + s_{n0}^{(2)}
 \stackrel{d}{=} \sum_{r=0}^{k} s_{nr}^{(2)} + s_{nk}^{(1)}
 \stackrel{d}{=} \sum_{r=0}^{s} s_{nr}^{(2)} + s_{ns}^{(1)},
 \qquad s = 1, \dots, k,
\]
hold, where
\[
s_{nk}^{(1)}(Y^{(k)}) = \sum_{l=1}^{n} \prod_{s=1}^{k} \eta_l^{(s)} \alpha_l z_l(Y_l^{(k)})
\]
and
\[
s_{nr}^{(2)}(\bar Y^{(r)}) = \sum_{l=1}^{n} \prod_{s=1}^{r-1} \eta_l^{(s)} (1 - \eta_l^{(r)}) \alpha_l z_l(\bar Y_l^{(r)}).
\]
Therefore,
\[
\bar W_{ck} \stackrel{d}{=} \bar\sigma_n^{-1}
 \sum_{\check I_c} \sum_{\check L_k,\, L_k \subset I_c}
 \prod_{s=1}^{k} \alpha_{l_s} \prod_{s=k+1}^{c} (1 - \alpha_{i_s})
 \sum_{b_1=0}^{k} \sum_{\check R_{b_1},\, R_{b_1} \subset L_k}
 \prod_{s=1}^{b_1} \eta_{r_s}^{(1)} \prod_{s=b_1+1}^{k} (1 - \eta_{l_s}^{(1)})\,
 g_{I_c n}(Y_{R_{b_1}}^{(1)}, \bar Y_{L_k \setminus R_{b_1}}^{(1)}, \tilde X_{I_c \setminus L_k}).
\]
We decompose \(\bar W_{ck}\) into summands of three types:
\[
\bar W_{ck} \stackrel{d}{=} \sum_{b_1=0}^{k} \bar W_{ckb_1}
 = \bar W_{ck0}(\bar Y_{L_k}^{(1)}, \tilde X_{I_c \setminus L_k})
 + \bar W_{ckk}(Y_{L_k}^{(1)}, \tilde X_{I_c \setminus L_k})
 + \sum_{b_1=1}^{k-1} \bar W_{ckb_1}(Y_{R_{b_1}}^{(1)}, \bar Y_{L_k \setminus R_{b_1}}^{(1)}, \tilde X_{I_c \setminus L_k}).
\]
Further,
\[
\sum_{b_1=1}^{k-1} \bar W_{ckb_1}
 \stackrel{d}{=} \sum_{b_1=1}^{k-1} \bar W_{ckB_1 b_2}(Y_{R_{b_2}}^{(2)}, \bar Y_{R_{b_1} \setminus R_{b_2}}^{(2)}, \bar Y_{L_k \setminus R_{b_1}}^{(1)}, \tilde X_{I_c \setminus L_k})
\]
\[
 = \sum_{b_1=1}^{k-1} (\bar W_{ckB_1 b_1} + \bar W_{ckB_1 0})
 + \sum_{b_2=1}^{b_1-1} (\bar W_{ckB_2 b_2} + \bar W_{ckB_2 0}) + \dots
 + (\bar W_{ckB_{k-1} b_{k-1}} + \bar W_{ckB_{k-1} 0}),
\]
where
\[
\bar W_{ckB_j b_j} =
 \sum_{\check I_c} \sum_{\check L_k}
 \sum_{\check R_{b_1},\, R_{b_1} \subset L_k} \cdots
 \sum_{\check R_{b_j},\, R_{b_j} \subset R_{b_{j-1}}}
 \prod_{s=1}^{b_j} \prod_{i=1}^{j} \eta_{r_s}^{(i)}
 \prod_{s=b_j+1}^{b_{j-1}} \Bigl( \prod_{i=1}^{j-1} \eta_{r_s}^{(i)} \Bigr) (1 - \eta_{r_s}^{(j)}) \cdots
\]