Biostatistics (2002), 3, 2, pp. 195–211 Printed in Great Britain
Multipoint linkage detection in the presence of heterogeneity

YEN-FENG CHIU
Department of Biostatistics, School of Public Health, CB #7420, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-7420, USA

KUNG-YEE LIANG
Department of Biostatistics, School of Public Health, Johns Hopkins University, Baltimore, MD 21205, USA

TERRI H. BEATY
Department of Epidemiology, School of Public Health, Johns Hopkins University, Baltimore, MD 21205, USA
SUMMARY

Linkage heterogeneity is common for complex diseases. It is well known that loss of statistical power for detecting linkage will result if one assumes complete homogeneity in the presence of linkage heterogeneity. To this end, Smith (1963, Annals of Human Genetics 27, 175–182) proposed an admixture model to account for linkage heterogeneity. For this model, however, the conventional chi-squared approximation to the likelihood ratio test for no linkage does not apply even when the sample size is large. By dealing with nuclear families and one marker at a time for genetic diseases with simple modes of inheritance, score-based test statistics (Liang and Rathouz, 1999, Biometrics 55, 65–74) and likelihood-ratio-based test statistics (Lemdani and Pons, 1995, Biometrics 51, 1033–1041) have been proposed which have a simple large-sample distribution under the null hypothesis of no linkage. In this paper, we extend their work to more practical situations that include information from multiple markers and multi-generational pedigrees while allowing for a class of general genetic models. Three different approaches are proposed to eliminate the nuisance parameters in these test statistics. We show that all three approaches lead to the same asymptotic distribution under the null hypothesis of no linkage. Simulation results show that the proposed test statistics have adequate power to detect linkage and that the performances of these two classes of test statistics are quite comparable. We have applied the proposed method to a family study of asthma (Barnes et al., 1996), in which the score-based test shows evidence of linkage with p-value [...]

[...] 0, $x = 1$. Then, for all $i = 1, \ldots, n$,
\[
\begin{aligned}
& E\{\log[g_i(y_i, x_i; \lambda, \theta, \gamma_0, \phi^*)];\ g_i(y_i, x_i; \alpha_0, \theta_0, \gamma_0, \phi_0)\} - E\{\log[g_i(y_i, x_i; \lambda, \theta_0, \gamma_0, \phi^*)];\ g_i(y_i, x_i; \alpha_0, \theta_0, \gamma_0, \phi_0)\} \\
&\quad < E\left\{ \frac{g_i(y_i, x_i; \lambda, \theta, \gamma_0, \phi^*)}{g_i(y_i, x_i; \lambda, \theta_0, \gamma_0, \phi^*)};\ g_i(y_i, x_i; \alpha_0, \theta_0, \gamma_0, \phi_0) \right\} - 1.
\end{aligned}
\]
However, under $H_0: \theta = \theta_0 = \tfrac{1}{2}$ and assuming that there is no linkage disequilibrium and no epistasis,
\[
\begin{aligned}
\mathrm{RHS} &= \sum_{y_i} \sum_{x_i} \frac{g_i(x_i \mid y_i; \lambda, \theta, \gamma_0, \phi^*) \cdot g_i(y_i; \phi^*)}{g_i(x_i; \gamma_0) \cdot g_i(y_i; \phi^*)} \cdot g_i(y_i; \phi_0) \cdot g_i(x_i; \gamma_0) - 1 \\
&= \sum_{y_i} g_i(y_i; \phi_0) \sum_{x_i} \{ g_i(x_i \mid y_i; \lambda, \theta, \gamma_0, \phi^*) \} - 1 \\
&= 0.
\end{aligned}
\]
Hence, under $H_0$, $E\{\log(g_i(y_i, x_i; \lambda, \theta, \gamma_0, \phi^*));\ g_i(y_i, x_i; \alpha_0, \theta_0, \gamma_0, \phi_0)\}$ is uniquely maximized at $\theta = \theta_0$, for all $i = 1, \ldots, n$. The results follow by the same arguments as stated in the other approaches. Note that this approach is applicable only when $\theta_0$ is equal to $\tfrac{1}{2}$.

Proof of Proposition 1. Following similar arguments in Gong and Samaniego (1981) and Liang and Rathouz (1999), and applying a Taylor expansion to the score function considered under approach 1,
\[
\begin{aligned}
S_{\hat\gamma,\hat\phi}(\hat\theta_{\lambda,\hat\gamma,\hat\phi}) &= S_{\hat\gamma,\hat\phi}(\theta_0) + \frac{\partial S_{\hat\gamma,\hat\phi}(\theta_0)}{\partial\theta} (\hat\theta_{\lambda,\hat\gamma,\hat\phi} - \theta_0) + o_p(1) \\
&= 0 + n^{-1} \sum_{i=1}^{n} \left\{ \frac{\partial \log f_i(y_i, x_i; \theta_0, \hat\gamma, \hat\phi)}{\partial\theta} \right\}^2 \cdot \sqrt{n}\,(\hat\theta_{\lambda,\hat\gamma,\hat\phi} - \theta_0) + o_p(1) \\
&= O_p(1)
\end{aligned}
\]
and
\[
T^*_\lambda = \lambda S_{\hat\gamma,\hat\phi}(\hat\theta_{\lambda,\hat\gamma,\hat\phi}) \xrightarrow{D}
\begin{cases}
\chi^2_1 & \text{if } \theta_0 \text{ is an interior point of } [0, \tfrac{1}{2}], \\
\tfrac{1}{2}\chi^2_0 + \tfrac{1}{2}\chi^2_1 & \text{if } \theta_0 \text{ is on the boundary of } [0, \tfrac{1}{2}].
\end{cases}
\]
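The same arguments apply to approaches 2 and 3, except that the score functions must be replaced by $S_{\hat\gamma,\phi^*}(\hat\theta_{\lambda,\hat\gamma,\phi^*})$ or $S_{\gamma^*,\hat\phi}(\hat\theta_{\lambda,\gamma^*,\hat\phi})$ in approach 2 and by $S(\hat\theta_\lambda, \hat\gamma_\lambda, \hat\phi_\lambda)$ in approach 3.

Proof of Proposition 2. Let $\delta = (\gamma, \phi)$, and under $H_0$ let $\hat\delta$ be the $\sqrt{n}$-consistent estimate of $\delta$. Under $H_A$, let the estimate of $\delta$ be $\hat\delta(\alpha)$. For all fixed $\alpha \in [\varepsilon, 1]$, $\varepsilon > 0$, the LRB test statistic is given by
\[
R(\hat\theta, \alpha, \hat\delta) = 2 \log \frac{L_\alpha(\hat\theta(\alpha), \alpha, \hat\delta(\alpha))}{L_0(\theta_0, \hat\delta)} = 2\{\ell_\alpha(\hat\theta(\alpha), \alpha, \hat\delta(\alpha)) - \ell_0(\theta_0, \hat\delta)\},
\]
where $\hat\theta$ is defined in Lemma 1.

Applying a Taylor series expansion of $R(\hat\theta, \alpha, \hat\delta)$ at $\theta_0$,
\[
\begin{aligned}
R(\hat\theta, \alpha, \hat\delta) &= 2\left\{ (\hat\theta(\alpha) - \theta_0) \frac{\partial \ell_\alpha(\theta_0, \hat\delta(\alpha))}{\partial\theta} + \frac{1}{2} (\hat\theta(\alpha) - \theta_0)^2 \frac{\partial^2 \ell_\alpha(\theta_0, \hat\delta(\alpha))}{\partial\theta^2} \right\} + o_p(1) \\
&= \frac{\partial \ell_\alpha(\theta_0, \hat\delta(\alpha))}{\partial\theta} \left\{ -\frac{\partial^2 \ell_\alpha(\theta_0, \hat\delta(\alpha))}{\partial\theta^2} \right\}^{-1} \frac{\partial \ell_\alpha(\theta_0, \hat\delta(\alpha))}{\partial\theta} + o_p(1) \\
&= \frac{\partial \ell_\alpha(\theta_0, \hat\delta(\alpha))/\partial\theta}{\sqrt{n}} \left\{ -\frac{\partial^2 \ell_\alpha(\theta_0, \hat\delta(\alpha))/\partial\theta^2}{n} \right\}^{-1} \frac{\partial \ell_\alpha(\theta_0, \hat\delta(\alpha))/\partial\theta}{\sqrt{n}} + o_p(1).
\end{aligned}
\]
Note that
(1) under $H_0$, $\ell_\alpha(\theta_0, \alpha, \hat\delta(\alpha)) = \ell_0(\theta_0, \hat\delta)$, for all $\alpha \in [\varepsilon, 1]$;
(2) as shown in the previous section, under $H_0$, $\hat\delta(\alpha) \xrightarrow{p} \delta_0$ as $n \to \infty$, for all $\alpha \in [\varepsilon, 1]$, where $\delta_0$ is the true value of $\delta$;
(3) under $H_0$, $o_p(1)$ holds uniformly in $\alpha \in [\varepsilon, 1]$.

According to Liang and Self (1996), under $H_0$ and as $n \to \infty$,
\[
\frac{\partial \ell_\alpha(\theta_0, \hat\delta(\alpha))/\partial\theta}{\sqrt{n}} \xrightarrow{D} N(0, I_\alpha(\theta_0, \delta_0))
\]
and
\[
\frac{-\partial^2 \ell_\alpha(\theta_0, \hat\delta(\alpha))/\partial\theta^2}{n} \xrightarrow{p} I_\alpha(\theta_0, \delta_0),
\]
where
\[
I_\alpha = \lim_{n \to \infty} \frac{\alpha^2}{n} \sum_{i=1}^{n} E_0 \left\{ \frac{\partial \log f_i(y_i, x_i; \theta_0, \delta_0)}{\partial\theta} \right\}^2.
\]
Therefore, from the results above and in Self and Liang (1987),
\[
R_\varepsilon = \sup_{\alpha \in [\varepsilon, 1]} R(\hat\theta, \alpha, \hat\delta) \xrightarrow{D}
\begin{cases}
\chi^2_1 & \text{if } \theta_0 \text{ is an interior point of } [0, \tfrac{1}{2}], \\
\tfrac{1}{2}\chi^2_0 + \tfrac{1}{2}\chi^2_1 & \text{if } \theta_0 \text{ is on the boundary of } [0, \tfrac{1}{2}].
\end{cases}
\]
The same arguments apply to approaches 1 and 3, except that $\delta_0$ must be replaced by $\delta^*$ in approach 2. Note that $\delta^* = (\gamma_0, \phi^*)$ when $\hat\delta = (\hat\gamma, \phi^*)$ and $\delta^* = (\gamma^*, \phi_0)$ when $\hat\delta = (\gamma^*, \hat\phi)$.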
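As an illustration of a supremum-type likelihood-ratio statistic and its boundary-case limit, the following is a minimal Python sketch, not the authors' implementation. It assumes a deliberately simplified setting that does not appear in the paper: phase-known, fully informative meioses, so that family $i$ contributes $k_i$ recombinants out of $m_i$ meioses and Smith's admixture likelihood reduces to $\prod_i [\alpha \theta^{k_i} (1-\theta)^{m_i-k_i} + (1-\alpha)(1/2)^{m_i}]$. There are no nuisance parameters $(\gamma, \phi)$ in this toy model, the suprema over $\alpha \in [\varepsilon, 1]$ and $\theta \in [0, 1/2]$ are approximated on grids, and all function names, grid sizes, and data are hypothetical.

```python
import math

def admixture_loglik(alpha, theta, families):
    """Log-likelihood of a toy admixture model (assumed phase-known data).

    families: list of (k, m) pairs = (recombinants, informative meioses).
    A fraction alpha of families is linked with recombination fraction theta;
    the rest are unlinked (recombination fraction 1/2).
    """
    total = 0.0
    for k, m in families:
        linked = theta ** k * (1.0 - theta) ** (m - k)
        unlinked = 0.5 ** m
        dens = alpha * linked + (1.0 - alpha) * unlinked
        if dens <= 0.0:
            return float("-inf")  # parameter value impossible for these data
        total += math.log(dens)
    return total

def sup_lr_statistic(families, eps=0.05, n_alpha=20, n_theta=51):
    """Grid approximation to sup over alpha in [eps, 1] of 2 * log LR."""
    null_ll = admixture_loglik(1.0, 0.5, families)  # theta0 = 1/2: no linkage
    best = 0.0
    for a in range(n_alpha):
        alpha = 1.0 - (1.0 - eps) * a / (n_alpha - 1)   # runs from 1 down to eps
        for t in range(n_theta):
            theta = 0.5 * t / (n_theta - 1)             # runs from 0 up to 1/2
            stat = 2.0 * (admixture_loglik(alpha, theta, families) - null_ll)
            best = max(best, stat)
    return best

def boundary_pvalue(stat):
    """Upper tail of the mixture 0.5 * chi2_0 + 0.5 * chi2_1."""
    if stat <= 0.0:
        return 1.0
    # P(chi2_1 > s) = erfc(sqrt(s / 2)); the chi2_0 half is a point mass at 0.
    return 0.5 * math.erfc(math.sqrt(stat / 2.0))

if __name__ == "__main__":
    toy = [(0, 5)] * 6 + [(2, 5)] * 4   # hypothetical families
    r_eps = sup_lr_statistic(toy)
    print(f"R_eps ~ {r_eps:.3f}, boundary p-value ~ {boundary_pvalue(r_eps):.3g}")
```

Because $\theta_0 = 1/2$ lies on the boundary of $[0, 1/2]$, the p-value here uses the $\tfrac{1}{2}\chi^2_0 + \tfrac{1}{2}\chi^2_1$ limit, which halves the usual $\chi^2_1$ tail probability: an observed statistic of about 2.71 corresponds to a p-value of about 0.05.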
REFERENCES
BARNES, K. C., NEELY, J. D. AND DUFFY, D. L. et al. (1996). Linkage of asthma and total serum IgE concentration to markers on chromosome 12q: evidence from Afro-Caribbean and Caucasian populations. Genomics 37, 41–50.
BARNES, K. C., FREIDHOFF, L. R., NICKEL, R., CHIU, Y.-F., JUO, S.-H., HIZAWA, N., NAIDU, R. P., EHRLICH, E., DUFFY, D. L., SHOU, C., LEVETT, P. N., MARSH, D. G. AND BEATY, T. H. (1999). Dense mapping of chromosome 12q13.12-q23.3 and linkage to asthma and atopy. Journal of Allergy and Clinical Immunology 104, 485–491.
DAVIES, R. B. (1977). Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika 64, 247–254.
DAVIES, R. B. (1987). Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika 74, 33–43.
DUFFY, D. L. (1995). GCONVERT. 'www.qimr.edu.au/davidD/sibpair.html#Gconvert'. Queensland Institute of Medical Research, Australia.
DUFFY, D. L. (1997). The genetic epidemiology of asthma. Epidemiologic Reviews 19, 129–143.
ELSTON, R. C. AND STEWART, J. (1971). A general model for the genetic analysis of pedigree data. Human Heredity 21, 523–542.
ELSTON, R. C. (1998). Linkage and association. Genetic Epidemiology 15, 565–576.
FARAWAY, J. J. (1993). Distribution of the admixture test for the detection of linkage under heterogeneity. Genetic Epidemiology 10, 75–83.
GREEN, P., FALLS, K. AND CROOKS, S. (1990). CRIMAP. St Louis: Washington University.
GONG, G. AND SAMANIEGO, F. J. (1981). Pseudo maximum likelihood estimation: theory and applications. Annals of Statistics 9, 861–869.
HALDANE, J. B. S. (1919). The combination of linkage values and the calculation of distances between the loci of linked factors. Journal of Genetics 8, 229–309.
HODGE, S. E. AND ELSTON, R. C. (1994). LODS, WRODS, and MODS: the interpretation of LOD scores calculated under different models. Genetic Epidemiology 11, 329–342.
HUBER, P. (1967). The behaviour of maximum likelihood estimates under nonstandard conditions. In Proceedings of the Fifth Berkeley Symposium in Mathematical Statistics and Probability, Berkeley: University of California Press.
KRAUTER, K., MONTGOMERY, K., YOON, S. J., LEBLANC-STRACESKI, J., RENAULT, B., MARONDEL, I., HERDMAN, V., CUPELLI, L., BANKS, A. AND LIEMAN, J. (1995). A second generation YAC contig map of human chromosome 12. Nature 377, 321–333.
KRUGLYAK, L., DALY, M. J., REEVE-DALY, M. P. AND LANDER, E. S. (1996). Parametric and nonparametric linkage analysis, a unified multipoint approach. American Journal of Human Genetics 58, 1347–1363.
LEMDANI, M. AND PONS, O. (1995). Tests for genetic linkage and homogeneity. Biometrics 51, 1033–1041.
LANDER, E. S. AND GREEN, P. (1987). Construction of multilocus genetic maps in humans. Proceedings of the National Academy of Sciences USA 84, 2363–2367.
LANDER, E. S. AND SCHORK, N. J. (1994). Genetic dissection of complex traits. Science 265, 2037–2048.
LIANG, K.-Y. AND RATHOUZ, P. J. (1999). Hypothesis testing under mixture models: application to genetic linkage analysis. Biometrics 55, 65–74.
LIANG, K.-Y., RATHOUZ, P. J. AND BEATY, T. H. (1996). Determining linkage and mode of inheritance: mod scores and other methods. Genetic Epidemiology 13, 575–593.
LIANG, K.-Y. AND SELF, S. G. (1996). On the asymptotic behavior of the pseudo-likelihood ratio test statistic. Journal of Royal Statistical Society Series B 59, 785–796.
MACLEAN, C. J., BISHOP, D. T., SHERMAN, S. L. AND DIEHL, S. R. (1993). Distribution of lod scores under uncertain mode of inheritance. American Journal of Human Genetics 52, 354–361.
MRAZEK, D., PAULS, D., ANDERSON, I., BROWER, A. AND KLINNERT, M. (1989). Segregation analysis of 145 asthmatic families [abstract]. American Journal of Human Genetics 45 (suppl.), A245.
OTT, J. (1991). Analysis of Human Genetic Linkage. Baltimore: The Johns Hopkins University Press.
RISCH, N. (1990). Linkage strategies for genetically complex traits. I. Multilocus models. American Journal of Human Genetics 46, 222–228.
SELF, S. G. AND LIANG, K.-Y. (1987). Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. Journal of American Statistical Association 82, 605–610.
SMITH, C. A. B. (1963). Testing for heterogeneity or recombination fraction values in human genetics. Annals of Human Genetics 27, 175–182.
WHITTEMORE, A. S. (1996). Genome scanning for linkage: an overview. American Journal of Human Genetics 59, 704–716.
WILLIAMSON, J. A. AND AMOS, C. (1995). Guess LOD approach: sufficient conditions for robustness. Genetic Epidemiology 12, 163–176.

[Received 1 March, 2000; revised 9 February, 2001; accepted for publication 9 April, 2001]