Acta Mathematica Sinica, English Series Jan., 2001, Vol.17, No.1, pp. 169–180
A Berry-Esseen Inequality for the Kaplan-Meier L-Estimator

Qi Hua WANG
Institute of Applied Mathematics, Academy of Mathematics and Systems Sciences, Chinese Academy of Sciences, Beijing 100080, P. R. China
E-mail: [email protected]

Li Xing ZHU
Institute of Applied Mathematics, Academy of Mathematics and Systems Sciences, Chinese Academy of Sciences, Beijing 100080, P. R. China, and Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong, P. R. China
E-mail: [email protected]

Abstract  Let $F_n$ be the Kaplan-Meier estimator of a distribution function $F$ and let $J(\cdot)$ be a measurable real-valued function. In this paper, a U-statistic representation for the Kaplan-Meier L-estimator, $T(F_n)=\int xJ(F_n(x))\,dF_n(x)$, is derived. The representation is then used to establish a Berry-Esseen inequality for $T(F_n)$.

Keywords  Kaplan-Meier L-estimator, U-statistic representation, Berry-Esseen inequality

1991 MR Subject Classification  62G05, 62E20
1 Introduction
Let $X_1,X_2,\ldots,X_n$ be nonnegative independent random variables representing survival times with a common distribution function $F$. In survival analysis or medical follow-up studies, one does not observe the true survival times $X_1,X_2,\ldots,X_n$ of the $n$ patients who entered the study; rather, one observes only right-censored versions of these variables, owing to patients dropping out of or withdrawing from the study. That is, associated with $X_i$ there are nonnegative censoring variables $Y_i$, independent of the $X_i$ and i.i.d. with distribution function $G$, and one observes only $(Z_1,\delta_1),\ldots,(Z_n,\delta_n)$, where $Z_i=\min(X_i,Y_i)$, $\delta_i=I[X_i\le Y_i]$, $i=1,2,\ldots,n$, and $I[\cdot]$ is the indicator function of an event.

Received September 28, 1998; accepted March 10, 1999. Research supported by the National Natural Science Foundation of China and a CRCG grant of the University of Hong Kong.
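For concreteness, a data set of this form can be simulated as follows; this is a minimal sketch, and the exponential choices for $F$ and $G$ are illustrative assumptions only, not part of the model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.exponential(1.0, n)        # survival times X_i ~ F (Exp(1) is an assumed example)
y = rng.exponential(2.0, n)        # censoring times Y_i ~ G (Exp(1/2) is an assumed example)
z = np.minimum(x, y)               # observed Z_i = min(X_i, Y_i)
d = (x <= y).astype(int)           # censoring indicators delta_i = I[X_i <= Y_i]
```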
It is noted that some important statistical characteristics can be expressed as $T(F)=\int xJ(F(x))\,dF(x)$. The same is true in survival analysis (see, for example, [1]). In the uncensored case, an L-estimator can be defined by replacing $F$ in $T(F)$ with the empirical distribution function. With censored data, however, the empirical distribution function is no longer available. [1] defined a Kaplan-Meier L-estimator $T(F_n)$ by replacing $F$ in $T(F)$ with the Kaplan-Meier estimator $F_n$ of $F$ (see [10]), where $F_n$ is defined by
$$1-F_n(t)=\begin{cases}\displaystyle\prod_{j=1}^n\Big(\frac{N^+(Z_j)}{1+N^+(Z_j)}\Big)^{I[Z_j\le t,\,\delta_j=1]},& t\le Z_{(n)},\\[1mm]0,&\text{otherwise},\end{cases}\qquad N^+(Z_j)=\sum_{i=1}^nI[Z_i>Z_j].\tag{1.1}$$
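The product in (1.1) and the plug-in statistic $T(F_n)$ are straightforward to compute; the following sketch (assuming NumPy, with $J$ accepting a vector argument, and with the convention of evaluating $J$ at the left limit $F_n(x-)$ at each jump, a choice the paper does not fix) illustrates both.

```python
import numpy as np

def kaplan_meier(z, d):
    """Kaplan-Meier estimator of formula (1.1):
    1 - F_n(t) = prod over uncensored Z_j <= t of N+(Z_j) / (1 + N+(Z_j)),
    with N+(Z_j) = #{i : Z_i > Z_j}.  Returns the ordered points and F_n there."""
    z, d = np.asarray(z, float), np.asarray(d, int)
    order = np.argsort(z)
    z, d = z[order], d[order]
    n_plus = np.array([(z > zj).sum() for zj in z])           # N+(Z_j)
    factors = np.where(d == 1, n_plus / (1.0 + n_plus), 1.0)  # censored points contribute factor 1
    return z, 1.0 - np.cumprod(factors)                       # points and F_n values

def km_L_estimator(z, d, J):
    """Plug-in L-estimator T(F_n) = int x J(F_n(x)) dF_n(x),
    computed as a sum over the jumps of F_n; J is evaluated at the left limit F_n(x-)."""
    t, Fn = kaplan_meier(z, d)
    Fn_left = np.concatenate(([0.0], Fn[:-1]))
    return float(np.sum(t * J(Fn_left) * (Fn - Fn_left)))
```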
As a special case in her paper, Reid calculated the influence curve of the Kaplan-Meier L-estimator and gave some examples. In this paper, we represent $T(F_n)$ as a U-statistic plus a remainder, and employ this representation to establish a Berry-Esseen inequality for $T(F_n)$. The main results are formulated in Section 2 and proved in Section 3. For simplicity, we denote by $c$ a generic constant and by $c_0$ a positive absolute constant, both of which may take different values at different places.
2 The Main Results
Due to the complicated structure of the representation of $T(F_n)$ described below, we first introduce some notation that will be used frequently in the proofs of the theorems.

2.1 Some Notations
Let $H(x)=P(Z_1\le x)$, $H_1(x)=P(Z_1\le x,\delta_1=1)$, $\bar H_1(x)=P(Z_1>x,\delta_1=1)$, $\bar F(x)=1-F(x)$, $\bar G(x)=1-G(x)$, and $\bar H(x)=1-H(x)=\bar F(x)\bar G(x)$. Define
$$\tilde g(Z_i,\delta_i;x)=-\bar H^{-1}(Z_i)I[0\le Z_i\le x,\delta_i=1]-\int_0^{x\wedge Z_i}\bar H^{-2}\,d\bar H_1,\qquad \xi(Z_i,\delta_i;x)=\bar F(x)\tilde g(Z_i,\delta_i;x),$$
$$\begin{aligned}\tilde\psi(Z_i,\delta_i,Z_j,\delta_j;x)&=\bar H^{-2}(Z_i)I[0\le Z_i\le x,Z_i<Z_j,\delta_i=1]+\int_0^{x\wedge Z_i\wedge Z_j}\bar H^{-3}\,d\bar H_1\\
&\quad-E\Big[\bar H^{-2}(Z_i)I[0\le Z_i\le x,Z_i<Z_j,\delta_i=1]+\int_0^{x\wedge Z_i\wedge Z_j}\bar H^{-3}\,d\bar H_1\,\Big|\,Z_i,\delta_i\Big],\end{aligned}$$
$$\tilde h(Z_i,\delta_i,Z_j,\delta_j;x)=\tilde g(Z_i,\delta_i;x)+\tilde g(Z_j,\delta_j;x)+\tilde\psi(Z_i,\delta_i,Z_j,\delta_j;x)+\tilde\psi(Z_j,\delta_j,Z_i,\delta_i;x),$$
$$h(Z_i,\delta_i;Z_j,\delta_j)=\int_0^\infty J(F(x))\bar F(x)\big[\tilde h(Z_i,\delta_i;Z_j,\delta_j;x)+\tilde g(Z_i,\delta_i;x)\tilde g(Z_j,\delta_j;x)\big]\,dx-\int_0^\infty J'(F(x))\xi(Z_i,\delta_i;x)\xi(Z_j,\delta_j;x)\,dx,$$
and the U-statistic
$$U_n=\frac1{n^2}\sum_{i<j}h(Z_i,\delta_i;Z_j,\delta_j).$$
Furthermore, let $\tau_F=\inf\{t:F(t)=1\}$; $\tau_G$ and $\tau_H$ are defined similarly. Throughout this paper we assume $\tau_F\le\tau_G$.
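Given any implementation of the kernel $h(Z_i,\delta_i;Z_j,\delta_j)$ (whose nested integrals are not spelled out in code here), the statistic $U_n$ is simply an average of the kernel over unordered pairs with the $n^{-2}$ normalization above; a minimal sketch, with `kernel_h` left as a user-supplied placeholder, not something defined in the paper:

```python
import itertools

def u_statistic(z, d, kernel_h):
    """U_n = n^{-2} * sum_{i<j} h(Z_i, delta_i; Z_j, delta_j), as in Subsection 2.1.
    `kernel_h(zi, di, zj, dj)` must implement the kernel h; note the n^{-2}
    normalization (not 1 / C(n, 2))."""
    n = len(z)
    total = sum(kernel_h(z[i], d[i], z[j], d[j])
                for i, j in itertools.combinations(range(n), 2))
    return total / n**2
```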
2.2 Main Results
Theorem 1  Assume that $F$ and $G$ are continuous and that $J(\cdot)$ satisfies:
(i) $J(\cdot)$ is a real-valued measurable function and vanishes on $[1-\beta,1]$ for some $0<\beta<1$;
(ii) $J'(\cdot)$ exists in $(0,1-\beta)$ and is a bounded function satisfying the Lipschitz condition, i.e., there exists $c_0$ such that $|J'(x_1)-J'(x_2)|\le c_0|x_1-x_2|$ for all $x_1,x_2\in(0,1-\beta)$.
Then
$$T(F_n)-T(F)=U_n+\Delta_n+R_n,\tag{2.1}$$
where $\Delta_n=-n^{-1}\int_0^\infty J(F(x))\bar F(x)\int_0^x\bar H^{-2}\,d\bar H_1\,dx$ and $R_n$ satisfies
$$P(\sqrt n\,|R_n|>cn^{-1/2})\le cn^{-1}.\tag{2.2}$$

Remark 1  Some examples of $J(\cdot)$ satisfying the conditions of Theorem 1 are given in [1, 2].
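As one concrete illustration (our own, not taken from [1, 2]), the weight $J(u)=(1-\beta-u)^2I[u<1-\beta]$ vanishes on $[1-\beta,1]$ and has $J'(u)=-2(1-\beta-u)$ on $(0,1-\beta)$, which is bounded and Lipschitz there, so (i)-(ii) hold; in code:

```python
import numpy as np

def J_example(u, beta=0.2):
    """Illustrative weight satisfying (i)-(ii) of Theorem 1:
    J(u) = (1 - beta - u)^2 for u < 1 - beta, and 0 on [1 - beta, 1];
    J'(u) = -2(1 - beta - u) is bounded and Lipschitz on (0, 1 - beta)."""
    u = np.asarray(u, float)
    return np.where(u < 1.0 - beta, (1.0 - beta - u) ** 2, 0.0)
```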
Let $\tau_0=\inf\{x:F(x)=1-\beta\}$. We now establish a Berry-Esseen inequality for $T(F_n)$.

Theorem 2  Under the conditions of Theorem 1, we have
$$\sup_x\big|P\big(\sqrt n\,\sigma^{-1}(T(F_n)-T(F))\le x\big)-\Phi(x)\big|\le c_0Bn^{-1/2},\tag{2.3}$$
where
$$\sigma^2=\int_0^{\tau_0}\int_0^{\tau_0}J(F(x))J(F(y))\bar F(x)\bar F(y)\int_0^{x\wedge y}\bar H^{-2}(s)\,dH_1(s)\,dx\,dy$$
and
$$B=\bigg(\frac{\int_0^{\tau_0}J(F(x))\bar F(x)\bar H^{-2}(x)\,dx}{\sigma}\bigg)^3+\frac{\big(\int_0^{\tau_0}J(F(x))\bar F(x)\bar H^{-2}(x)\,dx\big)^2+\big(\int_0^{\tau_0}J'(F(x))\bar F^2(x)\bar H^{-2}(x)\,dx\big)^2}{\sigma^2}+\frac{\int_0^{\tau_0}J(F(x))\bar F(x)\bar H^{-2}(x)\,dx}{\sigma}.\tag{2.4}$$
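Before turning to the proofs, the normal approximation in Theorem 2 can be eyeballed numerically. The sketch below reuses `km_L_estimator` and `J_example` from the earlier sketches and takes exponential $F$ and $G$, a fixed sample size, and studentization by the empirical standard deviation instead of the $\sigma$ above; all of these are our own illustrative choices, and the experiment only probes the approximation itself, not the rate $n^{-1/2}$ or the constant $B$.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
beta, n, reps = 0.2, 200, 2000
J = lambda u: J_example(u, beta)

# T(F) for F = Exp(1), by numerical integration of x J(F(x)) dF(x).
xs = np.linspace(0.0, 10.0, 100001)
TF = np.trapz(xs * J(1 - np.exp(-xs)) * np.exp(-xs), xs)

stats = []
for _ in range(reps):
    x = rng.exponential(1.0, n)               # X_i ~ F = Exp(1)   (assumed example)
    y = rng.exponential(2.0, n)               # Y_i ~ G = Exp(1/2) (assumed example)
    z, d = np.minimum(x, y), (x <= y).astype(int)
    stats.append(np.sqrt(n) * (km_L_estimator(z, d, J) - TF))

u = np.array(stats) / np.std(stats)           # studentize by the empirical SD
for q in (-1.0, 0.0, 1.0):
    print(q, round((u <= q).mean(), 3), round(norm.cdf(q), 3))
```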
We leave all proofs of the theorems to the next section.
3 Proofs of Theorems
Let $\bar F_n(x)=1-F_n(x)$ and
$$\hat r_n(x)=\log\bar F_n(x)-\log\bar F(x)-\hat U_n(x)-\hat\Delta_n(x),$$
where $\hat\Delta_n(x)=-n^{-1}\int_0^x\bar H^{-2}\,d\bar H_1$ and $\hat U_n(x)=\frac1{n^2}\sum_{i<j}\tilde h(Z_i,\delta_i;Z_j,\delta_j;x)$.

Lemma 1  Assume that $F$ and $G$ are continuous. Then
$$E\hat r_n^2(x)\le c_0\bar H^{-8}(x)n^{-3}\tag{3.1}$$
for any $0\le x<\tau_H$.

Proof  Let
$$\bar H_{1n}(x)=\frac1n\sum_{i=1}^nI[Z_i>x,\delta_i=1],\qquad\bar H_n(x)=\frac1n\sum_{i=1}^nI[Z_i>x],\qquad\theta_n(x)=\log\bar F_n(x)-\int_0^x\frac{d\bar H_{1n}(s)}{\bar H_n(s)}.\tag{3.2}$$
From [3], $\hat r_n(x)$ admits the following decomposition:
$$\hat r_n(x)=h_1^{(n)}(x)+h_2^{(n)}(x)+h_3^{(n)}(x)+\theta_n(x),\tag{3.3}$$
with
$$E\big(h_1^{(n)}(x)\big)^2\le c_0\bar H^{-3}(x)n^{-3},\tag{3.4}$$
$$E\big(h_3^{(n)}(x)\big)^2\le c_0\bar H^{-6}(x)n^{-3},\tag{3.5}$$
$$P\big(|h_2^{(n)}(x)|>s\big)\le c_0\Big[\exp\Big(-\frac{n^{2/3}\bar H^2(x)s^{1/3}}2\Big)+\exp\Big(-\frac{n\bar H^2(x)}2\Big)\Big],\tag{3.6}$$
where $h_1^{(n)}$, $h_2^{(n)}$ and $h_3^{(n)}$ are as defined in [3]. We omit their explicit forms here since they are not needed for our purpose. By Lemma 1 in [4], it follows that
$$P\big(|\theta_n(x)|>s\big)\le c_0\exp\Big(-\frac{n\bar H^2(x)}2\Big),\qquad s\ge\frac4{n^2\bar H^2(x)}.\tag{3.7}$$
Let $R_n^*=\bar F_n(x)-\bar F(x)-\bar F(x)\hat U_n(x)-\bar F(x)\hat\Delta_n(x)$. By the Taylor expansion for $\exp(\log\bar F_n(x)-\log\bar F(x))$, we have
$$\bar F_n(x)-\bar F(x)=\bar F(x)\hat U_n(x)+\bar F(x)\hat\Delta_n(x)+\bar F(x)\hat r_n(x)+\frac12\bar F_n^{\theta_1}(x)\bar F^{1-\theta_1}(x)\big(\log\bar F_n(x)-\log\bar F(x)\big)^2,\tag{3.8}$$
where $0<\theta_1<1$. From (3.8), $R_n^*$ then becomes
$$R_n^*=\bar F(x)\hat r_n(x)+\frac12\bar F_n^{\theta_1}(x)\bar F^{1-\theta_1}(x)\big(\log\bar F_n(x)-\log\bar F(x)\big)^2,\tag{3.9}$$
which implies that
$$\{|\bar F(x)\hat r_n(x)|>s\}\subseteq\{|R_n^*(x)|>s\}.\tag{3.10}$$
Note that $|\bar F_n(x)-\bar F(x)|<1$; hence it is easily seen from (3.8) and (3.9) that, for $s>2$,
$$\{|R_n^*(x)|>s\}\subseteq\Big\{|\bar F(x)\hat U_n(x)+\bar F(x)\hat\Delta_n(x)|>\frac s2\Big\}\subseteq\Big\{|\hat U_n(x)|>\frac s4\Big\}.\tag{3.11}$$
Recall that $n^2\hat U_n(x)\big/\binom n2$ is a U-statistic, and note that $|\tilde h(Z_1,\delta_1;Z_2,\delta_2;x)|\le6\bar H^{-2}(x)$ and $E\tilde h(Z_1,\delta_1;Z_2,\delta_2;x)=0$. From the Bernstein inequality for U-statistics (see, e.g., [2], p. 201), together with (3.10) and (3.11), it follows that
$$P\big(|\bar F(x)\hat r_n(x)|>s\big)\le P\Big(|\hat U_n(x)|\ge\frac s4\Big)\le2\exp\Big(-\frac{n\bar H^4(x)s^2}{576}\Big),\qquad s>2.\tag{3.12}$$
Note that
$$\bar F^2(x)E|\hat r_n(x)|^2=2\int_0^2sP\big(\bar F(x)|\hat r_n(x)|>s\big)\,ds+2\int_2^\infty sP\big(\bar F(x)|\hat r_n(x)|>s\big)\,ds=2[E_{n1}+E_{n2}],\tag{3.13}$$
and we can derive that
$$\begin{aligned}E_{n1}&\le16\int_0^2\frac s4\,P\Big(|h_1^{(n)}(x)|>\frac s4\Big)\,d\frac s4+16\int_0^2\frac s4\,P\Big(|h_2^{(n)}(x)|>\frac s4\Big)\,d\frac s4+16\int_0^2\frac s4\,P\Big(|h_3^{(n)}(x)|>\frac s4\Big)\,d\frac s4\\
&\quad+16\int_{4/(n^2\bar H^2(x))}^2\frac s4\,P\Big(|\theta_n(x)|>\frac s4\Big)\,d\frac s4+\int_0^{4/(n^2\bar H^2(x))}sP\Big(|\theta_n(x)|>\frac s4\Big)\,ds.\end{aligned}\tag{3.14}$$
Hence, (3.3)-(3.7) and (3.14) together yield the inequality
$$E_{n1}\le c_0\bar H^{-6}(x)n^{-3}.\tag{3.15}$$
By (3.12), it is easy to obtain that
$$E_{n2}\le288\,\bar H^{-4}(x)n^{-1}\exp\Big(-\frac{n\bar H^4(x)}{144}\Big).\tag{3.16}$$
Combining (3.13)-(3.16), Lemma 1 is proved.

Proof of Theorem 1  Let $K(u)=\int_0^uJ(t)\,dt$. We then have
$$\begin{aligned}T(F_n)-T(F)&=\int_0^\infty x\,d\big(K(F_n)-K(F)\big)\\
&=\int_0^\infty x\,d\Big[J(F(x))(F_n(x)-F(x))+\frac12J'(F(x))(F_n(x)-F(x))^2\Big]\\
&\quad+\int_0^\infty x\,d\Big[K(F_n(x))-K(F(x))-J(F(x))(F_n(x)-F(x))-\frac12J'(F(x))(F_n(x)-F(x))^2\Big]\\
&=m_n+R_T^{(n)}.\end{aligned}\tag{3.17}$$
Integrating by parts, it follows that
$$m_n=-\int_0^\infty J(F(x))(F_n(x)-F(x))\,dx-\frac12\int_0^\infty J'(F(x))(F_n(x)-F(x))^2\,dx=I+II.\tag{3.18}$$
From the three-term Taylor expansion for $\exp(\log\bar F_n(x)-\log\bar F(x))$, it follows that
$$\begin{aligned}I&=\int_0^\infty J(F(x))\bar F(x)\Big[\log\bar F_n(x)-\log\bar F(x)+\frac12\big(\log\bar F_n(x)-\log\bar F(x)\big)^2\Big]\,dx\\
&\quad+\frac1{3!}\int_0^\infty J(F(x))\bar F_n^{\theta_2}(x)\bar F^{1-\theta_2}(x)\big(\log\bar F_n(x)-\log\bar F(x)\big)^3\,dx\\
&=I_1+R_I^{(n)},\end{aligned}\tag{3.19}$$
where $0<\theta_2<1$. Let $\tilde r_n(x)=\log\bar F_n(x)-\log\bar F(x)-\frac1n\sum_{i=1}^n\tilde g(Z_i,\delta_i;x)$. Then
$$\begin{aligned}I_1&=n^{-2}\sum_{i<j}\int_0^\infty J(F(x))\bar F(x)\big[\tilde h(Z_i,\delta_i;Z_j,\delta_j;x)+\tilde g(Z_i,\delta_i;x)\tilde g(Z_j,\delta_j;x)\big]\,dx+\Delta_n\\
&\quad+\frac12n^{-2}\sum_{i=1}^n\int_0^\infty J(F(x))\bar F(x)\tilde g^2(Z_i,\delta_i;x)\,dx+n^{-1}\sum_{i=1}^n\int_0^\infty J(F(x))\bar F(x)\tilde r_n(x)\tilde g(Z_i,\delta_i;x)\,dx\\
&\quad+\frac12\int_0^\infty J(F(x))\bar F(x)\tilde r_n^2(x)\,dx+\int_0^\infty J(F(x))\bar F(x)\hat r_n(x)\,dx\\
&=I_{11}+\Delta_n+\sum_{i=1}^4R_{iI_1}^{(n)},\end{aligned}\tag{3.20}$$
where $\hat r_n$ is defined at the beginning of this section and the definitions of $\tilde h$ and $\tilde g$ are given in Subsection 2.1. For the term $II$ in (3.18) we have, from the two-term Taylor expansion for $\exp(\log\bar F_n)-\exp(\log\bar F)$,
$$\begin{aligned}II&=-\frac12\int_0^\infty J'(F(x))\bar F^2(x)\big(\log\bar F_n(x)-\log\bar F(x)\big)^2\,dx\\
&\quad-\frac12\int_0^\infty J'(F(x))\bar F_n^{\theta_3}(x)\bar F^{2-\theta_3}(x)\big(\log\bar F_n(x)-\log\bar F(x)\big)^3\,dx\\
&\quad-\frac18\int_0^\infty J'(F(x))\bar F_n^{2\theta_3}(x)\bar F^{2(1-\theta_3)}(x)\big(\log\bar F_n(x)-\log\bar F(x)\big)^4\,dx\\
&=II_1+R_{1II}^{(n)}+R_{2II}^{(n)},\end{aligned}\tag{3.21}$$
where $0<\theta_3<1$. Recalling the definition of $\tilde r_n(x)$, it follows that
$$\begin{aligned}II_1&=-n^{-2}\sum_{i<j}\int_0^\infty J'(F(x))\bar F^2(x)\tilde g(Z_i,\delta_i;x)\tilde g(Z_j,\delta_j;x)\,dx-\frac12n^{-2}\sum_{i=1}^n\int_0^\infty J'(F(x))\bar F^2(x)\tilde g^2(Z_i,\delta_i;x)\,dx\\
&\quad-n^{-1}\sum_{i=1}^n\int_0^\infty J'(F(x))\bar F^2(x)\tilde g(Z_i,\delta_i;x)\tilde r_n(x)\,dx-\frac12\int_0^\infty J'(F(x))\bar F^2(x)\tilde r_n^2(x)\,dx\\
&=II_{11}+\sum_{i=1}^3R_{iII_1}^{(n)}.\end{aligned}\tag{3.22}$$
Recalling the definition of $U_n$, we have
$$I_{11}+II_{11}=U_n.\tag{3.23}$$
Hence, combining (3.17)-(3.22), it follows that $T(F_n)-T(F)=U_n+\Delta_n+R_n$, where
$$R_n=R_T^{(n)}+R_I^{(n)}+\sum_{i=1}^4R_{iI_1}^{(n)}+\sum_{i=1}^3R_{iII_1}^{(n)}+R_{1II}^{(n)}+R_{2II}^{(n)}.$$
To reach the conclusion of Theorem 1, all we need to do is to bound $R_n$ by the inequality $P(\sqrt n|R_n|>cn^{-1/2})\le cn^{-1}$. We now bound every term in $R_n$.

(1) To bound $R_T^{(n)}$. Let $\tau_n=\inf_x\{x:F_n(x)=1-\beta\}$; $\tau_0$ is as defined right before Theorem 2. Integrating by parts in $R_T^{(n)}$, and then employing the two-term Taylor expansion for $K(F_n(x))$ as well as the Lipschitz condition on $J'(\cdot)$, it can be verified that
$$|R_T^{(n)}|\le c_0\int_0^{2(\tau_n\vee\tau_0)}|F_n(x)-F(x)|^3\,dx.\tag{3.24}$$
Hence
$$\begin{aligned}P\big(\sqrt n|R_T^{(n)}|>cn^{-1/2}\big)&\le P\big(\sqrt n|R_T^{(n)}|>cn^{-1/2},\ \tau_n\le2\tau_0\big)+P\big(\sqrt n|R_T^{(n)}|>cn^{-1/2},\ \tau_n>2\tau_0\big)\\
&\le2P\Big(\int_0^{4\tau_0}|F_n(x)-F(x)|^3\,dx>cn^{-1}\Big)+P(\tau_n>2\tau_0)=:r_{n1}+r_{n2}.\end{aligned}\tag{3.25}$$
From formula (2.50) of [5] we have, for $0\le x<\tau_H$ and $p\ge2$,
$$E|\log\bar F_n(x)-\log\bar F(x)|^p\le c_0n^{-p/2}\bar H^{-2p}(x).\tag{3.26}$$
Hence, by the Chebyshev inequality, the inequality $|x-y|\le|\log x-\log y|$ for $0<x,y<1$, and (3.26), we have
$$r_{n1}\le cn^2\int_0^{4\tau_0}E|\bar F_n(x)-\bar F(x)|^6\,dx\le cn^{-1}.\tag{3.27}$$
Note that
$$\{\tau_n>2\tau_0\}\subseteq\Big\{\text{there exist }\tau_0\le s_0<\tau_H\text{ and }0<a<1\text{ such that }\sup_{0\le x\le s_0}|F_n(x)-F(x)|>a\Big\}.$$
Hence, invoking formula (2.18) of [6], we have, for sufficiently large $n$,
$$r_{n2}\le P\Big(\sup_{0\le x\le s_0}|\log\bar F_n(x)-\log\bar F(x)|>a\Big)\le c_0\exp\{-\lambda\bar H(s_0)a^2n\},\tag{3.28}$$
where $\lambda$ and $c_0$ are absolute constants. Clearly, (3.25), (3.27) and (3.28) together yield
$$P\big(\sqrt n|R_T^{(n)}|>cn^{-1/2}\big)\le cn^{-1}.\tag{3.29}$$
(2) To bound $R_I^{(n)}$. By the Chebyshev inequality, the Cauchy-Schwarz inequality and (3.26), we have
$$P\big(\sqrt n|R_I^{(n)}|>cn^{-1/2}\big)\le cn^2\int_0^{\tau_0}\int_0^{\tau_0}J(F(x))J(F(y))\,E^{1/2}\big(\log\bar F_n(x)-\log\bar F(x)\big)^6E^{1/2}\big(\log\bar F_n(y)-\log\bar F(y)\big)^6\,dx\,dy\le cn^{-1}.\tag{3.30}$$
(3) To bound $R_{iII}^{(n)}$. Similarly to $R_I^{(n)}$, we have
$$P\big(\sqrt n|R_{1II}^{(n)}|>cn^{-1/2}\big)\le cn^{-1}.\tag{3.31}$$
The Markov inequality then immediately yields that
$$P\big(\sqrt n|R_{2II}^{(n)}|>cn^{-1/2}\big)\le cn^{-1}.\tag{3.32}$$
(4) To bound $R_{iI_1}^{(n)}$, $i=1,2,3,4$, and $R_{jII_1}^{(n)}$, $j=1,2,3$. Note that, by [3],
$$E\tilde g^2(Z_1,\delta_1;x)=-\int_0^x\bar H^{-2}(s)\,d\bar H_1(s).\tag{3.33}$$
Again using the Chebyshev inequality, we then have
$$P\big(\sqrt n|R_{1I_1}^{(n)}|>cn^{-1/2}\big)\le n^{-1}\int_0^{\tau_0}\int_0^{\tau_0}J(F(x))J(F(y))\bar F(x)\bar F(y)E^{1/2}\tilde g^4(Z_1,\delta_1;x)E^{1/2}\tilde g^4(Z_1,\delta_1;y)\,dx\,dy\le c\bar H^{-2}(\tau_0)\Big(\int_0^{\tau_0}|J(F(x))|\,dx\Big)^2n^{-1}\le cn^{-1}.\tag{3.34}$$
Similarly, we have
$$P\big(\sqrt n|R_{1II_1}^{(n)}|>cn^{-1/2}\big)\le cn^{-1}.\tag{3.35}$$
For $R_{2I_1}^{(n)}$, applying the Cauchy-Schwarz inequality and the Dharmadhikari-Jogdeo (D-J) inequality (see, e.g., [7]), it follows that
$$\begin{aligned}ER_{2I_1}^{(n)2}&\le c_0n^{-1}\int_0^{\tau_0}\int_0^{\tau_0}J(F(x))J(F(y))\bar F(x)\bar F(y)E^{1/4}\tilde r_n^4(x)E^{1/4}\tilde r_n^4(y)E^{1/4}\tilde g^4(Z_1,\delta_1;x)E^{1/4}\tilde g^4(Z_1,\delta_1;y)\,dx\,dy\\
&\le c_0n^{-1}\bar H^{-2}(\tau_0)\Big(\int_0^{\tau_0}J(F(x))\bar F(x)E^{1/4}\tilde r_n^4(x)\,dx\Big)^2,\end{aligned}\tag{3.36}$$
where $E^bf:=(Ef)^b$ for a positive constant $b$. Let $r_n^*(x)=\bar F_n(x)-\bar F(x)-\frac1n\sum_{i=1}^n\xi(Z_i,\delta_i;x)$, where $\xi(Z_i,\delta_i;x)$ is as defined in Subsection 2.1. From Taylor's expansion for $\exp(\log\bar F_n(x)-\log\bar F(x))$ and the same argument as that for (3.9), we obtain
$$\tilde r_n(x)=\frac{r_n^*(x)}{\bar F(x)}-\frac{\bar F_n^{\theta_4}(x)}{2\bar F^{\theta_4}(x)}\big(\log\bar F_n(x)-\log\bar F(x)\big)^2,\qquad0<\theta_4<1,\tag{3.37}$$
where $\tilde r_n(x)$ is as defined right after (3.19). Hence, (3.36) and (3.37) together give
$$ER_{2I_1}^{(n)2}\le c_0n^{-1}\bigg[E^{1/2}\Big(\sup_{0\le x\le\tau_0}r_n^*(x)\Big)^4\Big(\int_0^{\tau_0}|J(F(x))|\bar F(x)\,dx\Big)^2+\Big(\int_0^{\tau_0}|J(F(x))|\bar F_n^{\theta_4}(x)\bar F^{1-\theta_4}(x)E^{1/4}\big(\log\bar F_n(x)-\log\bar F(x)\big)^8\,dx\Big)^2\bigg].\tag{3.38}$$
From Theorem 1(c) in [8], we have
$$E\sup_{0\le x\le\tau_0}|r_n^*(x)|^p\le c_0n^{-p}\tag{3.39}$$
for any $p>0$. Hence, combining (3.38) and (3.39) with (3.26), it follows that
$$ER_{2I_1}^{(n)2}\le cn^{-3}.\tag{3.40}$$
Clearly, the Chebyshev inequality and (3.40) together yield that
$$P\big(\sqrt n|R_{2I_1}^{(n)}|>cn^{-1/2}\big)\le cn^{-1}.\tag{3.41}$$
Similarly for $R_{2II_1}^{(n)}$, that is,
$$P\big(\sqrt n|R_{2II_1}^{(n)}|>cn^{-1/2}\big)\le cn^{-1}.\tag{3.42}$$
Employing the Markov inequality and combining (3.37) and (3.39) with (3.26), we then have
$$P\big(\sqrt n|R_{3I_1}^{(n)}|>cn^{-1/2}\big)\le cn\bigg[E\Big(\sup_{0\le x\le\tau_0}r_n^{*2}(x)\Big)\int_0^{\tau_0}|J(F(x))|\bar F^{-1}(x)\,dx+\int_0^{\tau_0}|J(F(x))|E\big(\log\bar F_n(x)-\log\bar F(x)\big)^4\bar F^{-1}(x)\,dx\bigg]\le cn^{-1},\tag{3.43}$$
where $r_n^*$ is as defined right after (3.36). Similarly, it can be proved that
$$P\big(\sqrt n|R_{3II_1}^{(n)}|>cn^{-1/2}\big)\le cn^{-1}.\tag{3.44}$$
By virtue of the Markov inequality, the Cauchy-Schwarz inequality and Lemma 1, it follows that
$$P\big(\sqrt n|R_{4I_1}^{(n)}|>cn^{-1/2}\big)\le cn^{-1}\Big(\int_0^{\tau_0}J(F(x))\bar F(x)\bar H^{-8}(x)\,dx\Big)^2.\tag{3.45}$$
So far we have proved that
$$P\big(\sqrt n|R_n|>cn^{-1/2}\big)\le cn^{-1}.\tag{3.46}$$
This completes the proof of Theorem 1.

To prove Theorem 2, we first cite a lemma derived in [3].

Lemma 2  For any random variables $\zeta$ and $\eta$ and any nonnegative real constant $\alpha$, we have
$$\sup_x|P(\zeta+\eta\le x)-\Phi(x)|\le\sup_x|P(\zeta\le x)-\Phi(x)|+\frac\alpha{\sqrt{2\pi}}+P(|\eta|>\alpha).\tag{3.47}$$

Proof of Theorem 2  Note that
$$E\tilde g(Z_i,\delta_i;x)=0,\tag{3.48}$$
where $\tilde g(\cdot,\cdot;\cdot)$ is as defined in Subsection 2.1. Then
$$g(Z_1,\delta_1)=E[h(Z_1,\delta_1;Z_2,\delta_2)\mid Z_1,\delta_1]=\int_0^\infty J(F(x))\bar F(x)\tilde g(Z_1,\delta_1;x)\,dx,\tag{3.49}$$
and
$$\begin{aligned}\psi(Z_1,\delta_1;Z_2,\delta_2)&=h(Z_1,\delta_1;Z_2,\delta_2)-g(Z_1,\delta_1)-g(Z_2,\delta_2)\\
&=\int_0^\infty J(F(x))\bar F(x)\big[\tilde\psi(Z_1,\delta_1;Z_2,\delta_2;x)+\tilde\psi(Z_2,\delta_2;Z_1,\delta_1;x)+\tilde g(Z_1,\delta_1;x)\tilde g(Z_2,\delta_2;x)\big]\,dx\\
&\quad-\int_0^\infty J'(F(x))\bar F^2(x)\tilde g(Z_1,\delta_1;x)\tilde g(Z_2,\delta_2;x)\,dx.\end{aligned}$$
From (3.49) we can get, via some elementary calculation, that
$$\begin{aligned}\sigma^2&=Eg^2(Z_1,\delta_1)\\
&=E\int_0^\infty\int_0^\infty J(F(x))J(F(y))\bar F(x)\bar F(y)\Big[\bar H^{-1}(Z_1)I[0\le Z_1\le x,\delta_1=1]+\int_0^{x\wedge Z_1}\bar H^{-2}(s)\,d\bar H_1(s)\Big]\\
&\qquad\times\Big[\bar H^{-1}(Z_1)I[0\le Z_1\le y,\delta_1=1]+\int_0^{y\wedge Z_1}\bar H^{-2}(s)\,d\bar H_1(s)\Big]\,dx\,dy\\
&=\int_0^\infty\int_0^\infty J(F(x))J(F(y))\bar F(x)\bar F(y)\int_0^{x\wedge y}\bar H^{-2}(s)\,dH_1(s)\,dx\,dy.\end{aligned}\tag{3.50}$$
Also,
$$\begin{aligned}Eh^2(Z_1,\delta_1;Z_2,\delta_2)&\le3\bigg\{\int_0^\infty\int_0^\infty J(F(x))J(F(y))\bar F(x)\bar F(y)E\big[\tilde h(Z_1,\delta_1;Z_2,\delta_2;x)\tilde h(Z_1,\delta_1;Z_2,\delta_2;y)\big]\,dx\,dy\\
&\quad+\int_0^\infty\int_0^\infty J(F(x))J(F(y))\bar F(x)\bar F(y)E\big[\tilde g(Z_1,\delta_1;x)\tilde g(Z_1,\delta_1;y)\tilde g(Z_2,\delta_2;x)\tilde g(Z_2,\delta_2;y)\big]\,dx\,dy\\
&\quad+\int_0^\infty\int_0^\infty J'(F(x))J'(F(y))E\big[\xi(Z_1,\delta_1;x)\xi(Z_1,\delta_1;y)\xi(Z_2,\delta_2;x)\xi(Z_2,\delta_2;y)\big]\,dx\,dy\bigg\}\\
&=3(E_1+E_2+E_3).\end{aligned}\tag{3.51}$$
Note again that
$$E\tilde g(Z_1,\delta_1;\cdot)=0,\qquad E\tilde g(Z_1,\delta_1;x)\tilde g(Z_1,\delta_1;y)=\int_0^{x\wedge y}\bar H^{-2}\,dH_1,$$
$$E[\tilde\psi(Z_1,\delta_1;Z_2,\delta_2;\cdot)\mid Z_1,\delta_1]=E[\tilde\psi(Z_2,\delta_2;Z_1,\delta_1;\cdot)\mid Z_2,\delta_2]=E\tilde\psi(Z_1,\delta_1;Z_2,\delta_2;\cdot)=0.$$
Combining this with the independence of $(Z_1,\delta_1),\ldots,(Z_n,\delta_n)$, we have
$$E_2=\int_0^\infty\int_0^\infty J(F(x))J(F(y))\bar F(x)\bar F(y)\Big(\int_0^{x\wedge y}\bar H^{-2}\,dH_1\Big)^2\,dx\,dy,\tag{3.52}$$
$$E_3=\int_0^\infty\int_0^\infty J'(F(x))J'(F(y))\bar F^2(x)\bar F^2(y)\Big(\int_0^{x\wedge y}\bar H^{-2}\,dH_1\Big)^2\,dx\,dy,\tag{3.53}$$
and
$$\begin{aligned}E\tilde h(Z_1,\delta_1;Z_2,\delta_2;x)\tilde h(Z_1,\delta_1;Z_2,\delta_2;y)&=2E\tilde g(Z_1,\delta_1;x)\tilde g(Z_1,\delta_1;y)+2E\tilde\psi(Z_1,\delta_1;Z_2,\delta_2;x)\tilde\psi(Z_1,\delta_1;Z_2,\delta_2;y)\\
&\quad+2E\tilde\psi(Z_1,\delta_1;Z_2,\delta_2;x)\tilde\psi(Z_2,\delta_2;Z_1,\delta_1;y)\\
&\le2\int_0^{x\wedge y}\bar H^{-2}\,dH_1+4\int_0^{x\wedge y}\bar H^{-3}\,dH_1.\end{aligned}\tag{3.54}$$
Consequently, $E_1$ can be bounded by
$$6\int_0^\infty\int_0^\infty J(F(x))J(F(y))\bar F(x)\bar F(y)\int_0^{x\wedge y}\bar H^{-3}\,dH_1\,dx\,dy.\tag{3.55}$$
Hence, under the conditions of Theorem 2 it follows that
$$Eh^2(Z_1,\delta_1;Z_2,\delta_2)\le21\bigg[\Big(\int_0^\infty J(F(x))\bar F(x)\bar H^{-2}(x)\,dx\Big)^2+\Big(\int_0^\infty J'(F(x))\bar F^2(x)\bar H^{-2}(x)\,dx\Big)^2\bigg].\tag{3.56}$$
On the other hand,
$$E|g(Z_1,\delta_1)|^3\le\int_0^\infty\int_0^\infty\int_0^\infty|J(F(x))J(F(y))J(F(z))|\,\bar F(x)\bar F(y)\bar F(z)\,E|\tilde g(Z_1,\delta_1;x)\tilde g(Z_1,\delta_1;y)\tilde g(Z_1,\delta_1;z)|\,dx\,dy\,dz,\tag{3.57}$$
and
$$\begin{aligned}&E|\tilde g(Z_1,\delta_1;x)\tilde g(Z_1,\delta_1;y)\tilde g(Z_1,\delta_1;z)|\\
&\quad\le-\int_0^{x\wedge y\wedge z}\bar H^{-3}\,d\bar H_1+\int_0^{x\wedge y}\bar H^{-2}(s)\int_0^{z\wedge s}\bar H^{-2}\,d\bar H_1\,d\bar H_1(s)+\int_0^{x\wedge z}\bar H^{-2}(s)\int_0^{y\wedge s}\bar H^{-2}\,d\bar H_1\,d\bar H_1(s)\\
&\qquad+\int_0^{y\wedge z}\bar H^{-2}(s)\int_0^{x\wedge s}\bar H^{-2}\,d\bar H_1\,d\bar H_1(s)-\int_0^{x}\bar H^{-1}(s)\int_0^{y\wedge s}\bar H^{-2}\,d\bar H_1\int_0^{z\wedge s}\bar H^{-2}\,d\bar H_1\,d\bar H_1(s)\\
&\qquad-\int_0^{y}\bar H^{-1}(s)\int_0^{x\wedge s}\bar H^{-2}\,d\bar H_1\int_0^{z\wedge s}\bar H^{-2}\,d\bar H_1\,d\bar H_1(s)-\int_0^{z}\bar H^{-1}(s)\int_0^{x\wedge s}\bar H^{-2}\,d\bar H_1\int_0^{y\wedge s}\bar H^{-2}\,d\bar H_1\,d\bar H_1(s).\end{aligned}\tag{3.58}$$
Substituting (3.58) into (3.57), it is easy to see that
$$E|g(Z_1,\delta_1)|^3\le\Big(\int_0^{\tau_0}|J(F(x))|\bar F(x)\bar H^{-2}(x)\,dx\Big)^3.\tag{3.59}$$
From (3.59) and (3.50), we have
$$\frac{E|g(Z_1,\delta_1)|^3}{\big(Eg^2(Z_1,\delta_1)\big)^{3/2}}\le\bigg(\frac{\int_0^{\tau_0}|J(F(x))|\bar F(x)\bar H^{-2}(x)\,dx}{\sigma}\bigg)^3=:B_1.$$
From (3.56) and (3.50), we then have
$$\frac{Eh^2(Z_1,\delta_1;Z_2,\delta_2)}{Eg^2(Z_1,\delta_1)}\le\frac{21\Big[\big(\int_0^{\tau_0}J(F(x))\bar F(x)\bar H^{-2}(x)\,dx\big)^2+\big(\int_0^{\tau_0}J'(F(x))\bar F^2(x)\bar H^{-2}(x)\,dx\big)^2\Big]}{\sigma^2}=:B_2.$$
Let $\tilde B=B_1+B_2/21$. From [9], there exists an absolute constant $c_0$ such that
$$\sup_x\Big|P\Big(\sqrt n\,\sigma^{-1}\frac n{n-1}U_n\le x\Big)-\Phi(x)\Big|\le c_0\tilde Bn^{-1/2}.\tag{3.60}$$
Since
$$\mathrm{Var}\,U_n=\frac{(n-1)(n-2)}{n^3}\sigma^2+\frac{n-1}{2n^3}Eh^2(Z_1,\delta_1;Z_2,\delta_2),$$
we then have, for large $n$,
$$P\Big(\sqrt n\,\sigma^{-1}\frac1{n-1}|U_n|>\tilde Bn^{-1/2}\Big)\le cn^{-1}\le\tilde Bn^{-1/2}.\tag{3.61}$$
By (3.60), (3.61) and Lemma 2, we come to the conclusion that there exists an absolute constant $c_0$ such that
$$\sup_x|P(\sqrt n\,\sigma^{-1}U_n\le x)-\Phi(x)|\le c_0\tilde Bn^{-1/2}.\tag{3.62}$$
Recalling the definition of $\Delta_n$ in Theorem 1, from (3.62) it follows that
$$\begin{aligned}\sup_x|P(\sqrt n\,\sigma^{-1}(U_n+\Delta_n)\le x)-\Phi(x)|&\le\sup_x|P(\sqrt n\,\sigma^{-1}U_n\le x)-\Phi(x)|+\sup_x|\Phi(x-\sqrt n\,\sigma^{-1}\Delta_n)-\Phi(x)|\\
&\le c_0\tilde Bn^{-1/2}+\frac1{\sqrt{2\pi}}\,\sigma^{-1}\int_0^{\tau_0}J(F(x))\bar F(x)\bar H^{-2}(x)\,dx\;n^{-1/2}\le c_0Bn^{-1/2},\end{aligned}\tag{3.63}$$
where $B$ is as defined in Theorem 2. Combining (3.63) with Theorem 1 and Lemma 2, Theorem 2 is proved.

References
[1] N. Reid, Influence functions for censored data, Ann. Statist., 1981, 9(1): 78–92.
[2] R. J. Serfling, Approximation Theorems of Mathematical Statistics, New York: John Wiley & Sons, 1980.
[3] M. N. Chang, P. V. Rao, Berry-Esseen bound for the Kaplan-Meier estimator, Commun. Statist. Theory Meth., 1989, 18: 4647–4664.
[4] N. Breslow, J. Crowley, A large sample study of the life table and product limit estimates under random censorship, Ann. Statist., 1974, 2(2): 437–453.
[5] Q. H. Wang, Consistent estimator in random censorship semiparametric models, Science in China Ser. A, 1996, 39(2): 163–176.
[6] P. Major, L. Rejtő, Strong embedding of the estimator of the distribution function under random censorship, Ann. Statist., 1988, 16(3): 1113–1132.
[7] B. L. S. Prakasa Rao, Asymptotic Theory of Statistical Inference, New York: John Wiley & Sons, 1987, 127–128.
[8] I. Gijbels, J. Wang, Strong representations of the survival function estimator for truncated and censored data with applications, J. Multivariate Anal., 1993, 47(2): 210–229.
[9] R. Helmers, W. R. van Zwet, The Berry-Esseen bound for U-statistics, in: Statistical Decision Theory and Related Topics III, Vol. 1 (S. S. Gupta, J. O. Berger, eds.), New York: Academic Press, 1982, 497–512.
[10] E. L. Kaplan, P. Meier, Nonparametric estimation from incomplete observations, J. Amer. Statist. Assoc., 1958, 53: 457–481.