Series in Biostatistics Vol.1
Development of Modern Statistics and Related Topics In Celebration of Prof Yaoting Zhang's 70th Birthday
edited by
Heping Zhang Yale University School of Medicine, USA
Jian Huang University of Iowa, USA
World Scientific
New Jersey • London • Singapore • Hong Kong
Published by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office: Suite 202, 1060 Main Street, River Edge, NJ 07661 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.
DEVELOPMENT OF MODERN STATISTICS AND RELATED TOPICS: In Celebration of Professor Yaoting Zhang's 70th Birthday Copyright © 2003 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 981-238-395-6
Printed in Singapore.
Preface

These proceedings are dedicated to Professor Yaoting Zhang in celebration of his 70th birthday in November 2003. Professor Zhang is an internationally renowned statistician and a Chinese pioneer in statistical development, applications, and education. During the Fifth International Chinese Statistical Association Conference held in Hong Kong, August 17-19, 2001, Professor Zhang was reunited with many of his friends and former students from inside and outside China. The idea for these proceedings was conceived on this unforgettable occasion.

In this volume, Dr. Yao and Dr. Li present their interview with Professor Zhang. In their article, they highlight Professor Zhang's career as a scientist and teacher, and his influence in many fields outside statistics, including finance and geology. He is one of the statisticians in China who played a pivotal role in rebuilding statistics as a research discipline after the Cultural Revolution. He has trained and inspired several generations of Chinese statisticians.

This monograph includes two special invited papers and nineteen invited papers from Professor Zhang's friends and former students. The special invited paper by D. Siegmund and B. Yakir discusses approximations to the distribution of the maximum of certain Gaussian processes. The need for such approximations arises in multi-point linkage analysis for detecting chromosomal regions that harbor genes predisposing to a trait of interest, such as a disease in humans. The special invited paper by Z.L. Ying introduces an elegant approach for computing variances in a class of adaptive semiparametric statistics. This approach has applications in many incomplete and censored data models.

The invited papers encompass a wide range of topics. They cover the following areas: asymptotic theory and inference, biostatistics, economics and finance, statistical computing and Bayesian statistics, and statistical genetics.

In the area of asymptotic theory and inference, L.Z. Lei, L.M. Wu and B. Xie study large deviations and deviation inequalities with the L1 norm in kernel density estimation in a p-dimensional space. G. Lu investigates local sensitivity to model misspecification in likelihood inference. Y.S. Qin and Y.X. Wu study empirical likelihood confidence intervals for quantile differences. Q.W. Yao establishes exponential inequalities for spatial processes and uniform convergence rates in density estimation.

In biostatistics, F.F. Hu discusses some recent advances in response-adaptive randomized designs in clinical trials and industrial applications. L.X. Li studies estimation in linear regression models with interval-censored data. Y.C. Xia proposes a childhood epidemic model with birthrate-dependent transmission to study the effect of birthrate on the epidemic dynamics.

In economics and finance, T.J. Chen and H.F. Chen introduce a new regression model for studying stock volatility. Z.H. Chen proposes using ranked set sampling for observational economy. D.Y. Xie investigates explicit transitional dynamics in growth models. X.D. Zhu studies the link between uncertainty about the estimated probabilities of catastrophic events and their high insurance premiums. H.F. Zou studies a fiscal federalism approach to optimal taxation and intergovernmental transfers based on a dynamic model.

In statistical computing and Bayesian statistics, the paper by M.H. Chen, X. He, Q.M. Shao, and H. Xu proposes a Monte Carlo gap test for computing Bayesian highest posterior density regions. C.H. Liu investigates methods for accelerating the EM and Gibbs algorithms. The paper by M. Tan, G.L. Tian and H.B. Fang considers the problem of estimating restricted normal means using EM-type algorithms and noniterative inverse Bayes formulae sampling.

In statistical genetics, J. Huang and K. Wang propose using a semiparametric normal copula model in linkage analysis of quantitative traits. Z.H. Li, M.Y. Xie, and J.L. Gastwirth use optimal design theory to suggest ways of improving the Haseman-Elston regression for detecting quantitative trait loci. Finally, Zhu and Zhang explore structural mixture models and their applications in genetic studies.

We are grateful to Qiwei Yao and Zhaohai Li for their invaluable help in editing this monograph, and for conducting the biographical interview with Professor Zhang. We are also grateful to all the contributors for their enthusiastic support of this project. We appreciate the support and assistance of Dr. K K Phua, Dr. Ye Qiang, Ms. Tan Rok Ting, and Ms. Elaine Tham at the World Scientific Publishing Company. We thank Mr. Chang-Yung Yu for his able assistance in assembling the articles.

Lastly, but most importantly, as Professor Zhang's students, we thank Professor Zhang wholeheartedly for opening our eyes to the statistical discipline, for teaching us learning and research skills, and above all for teaching us how to mentor future generations. We are proud of you, Professor Zhang, and wish you the happiest of birthdays and good health.

Jian Huang and Heping Zhang, March 2003
Contents

Preface

An Interview with Professor Yaoting Zhang
    Qiwei Yao and Zhaohai Li

Significance Level in Interval Mapping
    David O. Siegmund and Benny Yakir

An Asymptotic Pythagorean Identity
    Zhiliang Ying

A Monte Carlo Gap Test in Computing HPD Regions
    Ming-Hui Chen, Xuming He, Qi-Man Shao and Hai Xu

Estimating Restricted Normal Means Using the EM-type Algorithms and IBF Sampling
    Ming Tan, Guo-Liang Tian and Hong-Bin Fang

An Example of Algorithm Mining: Covariance Adjustment to Accelerate EM and Gibbs
    Chuanhai Liu

Large Deviations and Deviation Inequality for Kernel Density Estimator in L1(R^d)-distance
    Liangzhen Lei, Liming Wu and Bin Xie

Local Sensitivity Analysis of Model Misspecification
    Guobing Lu

Empirical Likelihood Confidence Intervals for the Difference of Two Quantiles of a Population
    Yongsong Qin and Yuehua Wu

Exponential Inequalities for Spatial Processes and Uniform Convergence Rates for Density Estimation
    Qiwei Yao

A Skew Regression Model for Inference of Stock Volatility
    Tuhao J. Chen and Hanfeng Chen

Explicit Transitional Dynamics in Growth Models
    Danyang Xie

A Fiscal Federalism Approach to Optimal Taxation and Intergovernmental Transfers in a Dynamic Model
    Liutang Gong and Heng-Fu Zou

Sharing Catastrophe Risk under Model Uncertainty
    Xiaodong Zhu

Ranked Set Sampling: A Methodology for Observational Economy
    Zehua Chen

Some Recent Advances on Response-Adaptive Randomized Designs
    Feifang Hu

A Childhood Epidemic Model with Birthrate-Dependent Transmission
    Yingcun Xia

Linear Regression Analysis with Observations Subject to Interval Censoring
    Linxiong Li

When Can the Haseman-Elston Procedure for Quantitative Trait Loci be Improved? Insights from Optimal Design Theory
    Zhaohai Li, Minyu Xie and Joseph L. Gastwirth

A Semiparametric Method for Mapping Quantitative Trait Loci
    Jian Huang and Kai Wang

Structure Mixture Regression Models
    Hongtu Zhu and Heping Zhang
Professor Yaoting Zhang
The participants of the Statistics Training Course in Wuhan University in 1980. The course was commissioned by the State Education Commission and had significant impact on the development of statistics in China. Zhang was the fifth from the left in the front row.
In Peking University in 1961.
Professor Paolu Hsu, sitting at the center of the front row, and his students in Peking University. The second from the left in the front row is Zhang.
Professor Yaoting Zhang and his student in Wuhan University.
AN INTERVIEW WITH PROFESSOR YAOTING ZHANG

QIWEI YAO
Department of Statistics, London School of Economics, Houghton Street, London WC2A 2AE, UK
Guanghua School of Management, Peking University, 100871, Beijing, China

ZHAOHAI LI
Department of Statistics, George Washington University, 2001 G Street NW., Washington DC 20052, USA
Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, 6120 Executive Blvd., EPS, Rockville, MD, 20852
Professor Yaoting Zhang was born in Shanghai, China in 1933. He entered the Department of Mathematics at Qinghua University in 1951, and was then transferred to the Department of Mathematics at Beijing (Peking) University. After completing his undergraduate education, he studied probability and statistics under Professor Paolu Hsu in 1955. He and Professor Hsu developed a lifelong friendship. During the 'Cultural Revolution', he was sent to the Wu-Jiang Hydropower Station in Guizhou Province. After the 'Cultural Revolution', he joined the faculty at Wuhan University. Professor Zhang is a devoted teacher of statistics and an enthusiastic adviser to his students. His contributions to statistical applications in China range over a wide spectrum of areas. At present, Professor Zhang holds a Professorship in the School of Economics of Shanghai University of Finance and Economics. He is also an Adjunct Professor at both the People's University of China and Beijing University.
This interview is based on several conversations with Professor Yaoting Zhang in 2001 and early 2002. Inevitably some editing has been required. The final version was settled, with Professor Zhang's approval, on 29 June 2002 at Peking University.

Growing Up

Yao: Professor Zhang, many thanks for agreeing to talk to us. Could we start by asking you to tell us about your early days?

Zhang: I was born in 1933 into an ordinary family in Shanghai. At that time China was still a society which valued boys above girls. My parents were very happy to have me. As the youngest boy with one brother and three sisters in the family, I was treated with extra love. However, before long the whole family fled from the Japanese invasion after the notorious '8.13' event. My childhood memories are of fleeing from calamities, from the Japanese invasion, and from Japanese brutality. As it became increasingly difficult to make ends meet, my mother, my brother and two sisters went back to my parents' hometown in Wujin to live on a farm. My father, my second sister and I stayed in Shanghai so that I could finish primary school. We managed to live on a single grocery stand until the summer of 1945. After I finished primary school, we went back to Wujin to join the rest of the family.

My family went back to Shanghai after the end of the anti-Japanese war in the fall of 1945. I then entered the Shanghai Middle School, which changed my way of thinking. The Shanghai Middle School was in a suburb of Shanghai and had boarding students only. The students fell into two clusters: those from wealthy families such as bankers, senior officials and capitalists, and those from poor families. I remember clearly that the traditional Chinese-style cotton gowns and jackets with buttons in the front were forbidden in the School. For the first time I wore shirts and overcoats. When the Communist Party took power in Shanghai in 1949, some of my classmates went to Hong Kong or Taiwan. But most of us stayed in Shanghai. Out of more than 30 students in my class, 10 were admitted to Qinghua University and another 10 entered Shanghai Jiao Tong University in 1951. I entered the Mathematics Department at Qinghua. Actually I did not have a clear idea of what I should do at that time. The three preferred subjects on my application form were mathematics, foreign languages and chemical engineering. I was assigned to mathematics.
Yao: As one of the very top universities in China, Qinghua can afford to be very selective in recruiting students. You must have had a very exciting time at Qinghua.

Zhang: When I entered Qinghua, there were 20 students in our class, more than all the students in the second, third and fourth years of the Mathematics Department put together at that time. We were warmly welcomed. The teachers in the department took us to the city for a day out, visiting the Forbidden City and the Temple of Heaven. Each of us was given a lunch bag with bread, sausages and eggs, which was pretty high-class. In those days, tourists typically carried some steamed buns and salted vegetables with them. Two of my classmates later became Fellows of the Chinese Academy, and more than 10 were certified PhD student supervisors.

Professor Paolu Hsu and Statistics

Yao: As both a senior student and a young faculty member, you were very close to Professor Paolu Hsu. Would you please share with us some of your memories of Professor Hsu?

Zhang: I met Mr. Hsu Paolu in 1955. Under an arrangement made by the Department, I started to learn from him, as part of the preparation for setting up a new teaching and research section in Probability and Statistics. At that time my own wish was to go to Tibet, to serve my Motherland in the poorest and neediest place. However, I was 'ordered' to stay in the Department as a teacher after graduation. Mr. Hsu asked me to read two books: "Measure theory" by P.R. Halmos and "An introduction to probability and its applications" (1st edition) by W. Feller. Those two books gave me a foundation from which I have benefited immensely throughout my life.

At that time, probability and statistics were looked down upon within mathematical circles. At one stage even Mr. Hsu was thinking of changing to algebra. Mathematics in China would not be what it is now if not for the publication of "The National Science Outlines" in 1956, which earmarked computational mathematics, differential equations, probability and statistics as the important mathematical subjects for further development. After learning that I was studying probability with Hsu, a famous mathematician (who was also my teacher and taught me several courses) commented: "Nowadays some people are willing to learn quasi-mathematics, such as probability". This reflects the common view of probability and statistics in most people's minds at that time. In 1956, after participating in the First National Science Planning Conference, Mr. Hsu said to me that in the whole country there were only 8 people, including me, working on Probability. This was how we started.

In the beginning I learned Markov chains, as Mr. Hsu was interested in some limit theorems for Markov processes at that time. During the 'Great Leap Forward' in 1958, I realised the great potential of statistical applications from doing practical work. Mr. Zhou Hua-Zhang at Qinghua, who came back from the United States together with Qian Xue-Sen, did a lot of work in statistical applications. I was greatly influenced by him. When the Meteorology Department at Beijing University could not find anybody to teach a statistics course, I offered my services. Actually I was learning the subject while teaching it. This is how I got into statistics.
Yao: Among your various areas of expertise, you are known as an expert in Multivariate Analysis. Why did you choose this particular area?

Zhang: In 1959 Mr. Hsu led a seminar series reading "Multivariate analysis" by Roy and "Mathematical statistics" by Wilks. At that time I learned the Neyman-Pearson theory and some multivariate analysis methods more systematically. In the second half of the year, Zhang LiQian's work on the triangular scheme for experimental design caught Mr. Hsu's attention. So we changed to experimental design. In the same year the Annals of Mathematical Statistics published a long paper by Bose on this subject. My knowledge of experimental design started with that paper. In the seminar, we also read a small book entitled "Experiment design and analysis of variances" by H.B. Mann. That was the first time I came across the formula for the inverse of a partitioned matrix. I was impressed that Mr. Hsu could simplify the proof in Mann's book by using this formula.

In the erratic 1960's, life was getting harder in Beijing after the rush for ultrasonics and the political anti-rightist movement. I did a lot of practical work in factories in that period. In order to carry out a forest survey, we read "Sampling Techniques" by Cochran. The book "Sampling Theory" by Sun Shang-Ze was based on Mr. Hsu's summarised notes. In 1961 Mr. Hsu gave seminar talks on linear models, which are largely reflected in my joint book with Fang Kai-Tai on multivariate analysis. Mr. Hsu also gave talks on the derivation of exact sampling distributions in multivariate analysis, which we found difficult to master at the time. Because of this, I did a lot of exercises, including some rather tricky multiple integrals which I solved using Hsu's methods. Later, in 1980, Krishnaiah brought us the notes taken by Olkin and Deemer from Hsu's lectures at the University of North Carolina at Chapel Hill in 1947, which helped me to understand the technique properly.
But with modern stochastic simulation, deriving an exact sampling distribution is nowadays merely an intellectual challenge and is losing its practical significance.

Yao: This reminds me of the lectures and seminar talks you gave us when I was a PhD student in Wuhan. You could always find a simple and often elegant treatment for a complex mathematical problem. This amazed us most! Your exquisite matrix techniques were beyond us. What is the trick to gaining those abilities?

Zhang: I learned the matrix techniques through several research projects led by Hsu, mainly on the PBIB and BIB schemes in experimental design. We worked on matrices defined over finite fields. The key to solving many problems was to count the number of canonical forms under various transforms. The results were collectively published in Mathematics Advances in 1963 under the pen-name 'Ban-Cheng'. One of the important ideas from Hsu is to think of the invariance of a mathematical problem under a certain transformation group. The problem can then be reduced to finding the solution for the canonical form under the transforms. The canonical form of linear models is a case in point, which can be found in many books such as Lehmann's. This idea often works very well in experimental design and multivariate analysis. By incorporating this idea into my teaching and textbooks, I was able to provide simple and elegant treatments of some material in my lectures as well as my books, which were well received by students.

When we were poised for further advanced research in 1963 and 1964, political turbulence such as the socialist education movement and the Cultural Revolution started, one after another. Research was almost at a standstill for a whole ten years. When it was all over, I was already approaching fifty.

Yao: I also remember your constant yearning, on various occasions, for the advancement of Applied Statistics and Statistical Applications in China. The urgency of doing that must, at least partially, come from your own experiences. Would you like to tell us about some of your applied work?

Zhang: Since 1958 I have established wide connections with faculty members in other departments at Peking University as well as with practical sections of society.
The practical projects in which I was involved include the resource survey analysis in the Xi-Xia-Bang-Ma area and the causal analysis of yellow earth, jointly with the Geology and Geography Department; weather forecasting, especially forecasting the three key factors for plum rains (namely the length of the period, the intensity and the beginning date), and medium- and long-term forecasting, jointly with the Geophysics Department and the National Climatological Bureau; the forecasting of earthquake waves; armyworm forecasting, jointly with the Biology Department; and the survey and forecasting of water supply and waste-water treatment, jointly with the Institute of Municipal Engineering. The most effective project, with a long-lasting impact, was the foundation design blueprint for the high buildings in the QianShang-Meng area in Beijing. Many key technicians in the Institute grew out of this project. In spite of the fact that I was denounced as a counterrevolutionary in the 'Cultural Revolution', many people from various applied areas still came to me for statistical consultation. In that period, I did some projects related to military techniques. The statistical techniques involved include outlier detection, spline regression, factor analysis, etc., which also enhanced my appreciation of those techniques.
During the 'Cultural Revolution'

Li: We all know you suffered badly during the 'Cultural Revolution'. Would you mind telling us what happened during those terrible years?

Zhang: One thing I never gave up during the 'Cultural Revolution' was disseminating and popularising orthogonal design. During that period, the university students were workers, peasants and soldiers. We had to give lectures in factories, and combined our own expertise with practical problems. So orthogonal experimental design was an obvious and handy choice. We applied it successfully in practice in both the Peking Analytical Instrument Factory and the Yan-Shang Petrochemical Engineering Factory. The results were assembled in a book on orthogonal design published by the Higher Education Press.

Towards the end of the 'Cultural Revolution', I, as a designated counterrevolutionary, could not be accepted by universities. Beijing University sent me to the Wu-Jiang Hydropower Station in Guizhou where my wife was, which, as they thought, was the only place I could go. In October 1976, I left Beijing for Guizhou and ended up as a school teacher there for two years. Life was hard; we ate boiled vegetables as we had no oil to stir-fry them. But on the other hand, I got on very well with my colleagues there and was very popular among my students and pupils. In fact, with no political pressure and no discrimination, I felt relaxed and happy. During those two years I had time to go back to the work I had started in earlier years and was able to write more research papers. I visited Mao-Tai twice; once on a business trip and once during a Spring Festival holiday, at my colleague's home at his invitation. Even now I still cherish those very pleasant days in my life.

After the 'Cultural Revolution'

Li: After the 'Cultural Revolution', you joined Wuhan University, where both Qiwei and I first met you. How would you summarise your major activities at Wuhan University?
Zhang: After the Cultural Revolution, I received invitations from several universities. I went to Wuhan University and spent my next 16 years there. When I arrived there I did not feel as energetic as I used to be, but I decided to do my best to bring up some young people. At that time the State Education Commission entrusted me to organise and conduct a half-year national training course in Statistics for university teachers ranked at lecturer level or above. The course was attended by about 40 people from different universities in the country, and the invited lecturers included Chen Xiru, Ni GuoXi and Deng Weicai. The course had a significant impact on the development of statistics in China. The second thing worth mentioning is that in 1985 I organized a postgraduate class for 45 students: 15 from Wuhan University, 15 from Central China Normal University, and the other 15 sent to Wuhan by other universities. Quite a few of them are now well-accomplished statisticians.

Another thing that I am proud of was putting together the national selection examination for students going to study statistics in the United States. The project was initiated by Professor George Tiao when I visited the University of Chicago in 1983. The idea was to select 30 candidates out of the 100 students sitting the examination every year for PhD studies in American universities. Harvard, Berkeley, Chicago and Wisconsin were jointly responsible for the allocation of those candidates. The aim was to produce high quality applied statisticians for China. After two years of continuous effort, the project was finally brought into effect. Over the years dozens of students went to the United States. Some of them have now become influential figures in statistics. If I have made some contribution to Chinese statistics, it must be that I have encouraged talented young statisticians to come forth in large numbers.

Li: Your recent research interest has shifted to economics and finance. What was your motivation for such a change?

Zhang: In the early 1990's, I recognized the serious lack of expertise in quantitative economic analysis in China. The modern development of statistics has two backbones: medical science and economics. In terms of statistical applications in these two areas, we were far behind.
I decided to shift my main focus to introducing the relevant modern analytic methods and the related theory, to doing research relevant to Chinese reality, and to promoting the development of specialists in this area. China had a State-planned economy for a long time, which undermined market research and development. Obviously the old survey methods based on report forms were no longer adequate. Actually I had realised the importance of sampling survey methods in the early 1980's. Together with Wu Hui, the Director of the Foreign Affairs Office in the State Statistics Bureau, I translated Cochran's "Sampling Techniques" into Chinese. At that time the publisher thought that the book would not sell. The book was held up for almost two years before it was published in 1985 following an official intervention. In fact the publisher's initial estimate proved wrong; the second printing appeared within one year of first publication.
To meet practical needs, I published "Bayesian Statistical Inference" and "Statistical Analysis for Qualitative Data" in the early 1990's. These books presented the relevant methods and theory together with some real Chinese examples. In the late 1990's, I wrote a few economics-oriented textbooks, including "Theory and Methods for Data Ordering and Quantification", "Utility Functions and Optimisation" and "Information and Decision Theory". In these three books, I brought economics and mathematics together. Hopefully students in economics will find these new textbooks useful and helpful.

I am glad to see that my book "Statistical Analysis of Financial Market" was very well received. Some universities listed it as the standard reference book for their entrance examination for graduate studies in finance. By writing this book, I became more familiar with statistical methodologies for analysing financial markets. I read the literature from abroad and tried to acquaint myself with new developments in the world. During this process, Zhou Hengpu helped me greatly. He set up the Institute for Advanced Economics Research at Wuhan University, which imported many new books and subscribed to a number of overseas periodicals. This made it much easier for teachers and students to keep up with the newest developments in the world. Many students have graduated from this Institute.

While in the late 1980's I focused on reliability theory and its applications (organising a research group with people from six universities and undertaking dozens of projects, with the results assembled in two specialised monographs), in the late 1990's I worked on applications in taxation and finance. Together with my students, I focused on stock price data. Over the years, we carried out research on measures of stock market risk and on forecasting the market direction, and obtained interesting results. Most of the results have now been commercialised; they cannot be published in academic journals.
My plan for the rest of my life is to bring up some capable specialists in economic analysis, who can solve problems in finance, public security and social security.

I am Proud of My Students

Li: Many of your students, including myself, see you as a good friend, an excellent teacher and one of the pioneer statisticians in China. You always enjoy interacting with your students. What would you like to say to your past and current students?

Zhang: Looking back on my life, I am gratified that I have been involved in statistical applications in various areas, and have made a substantial effort to disseminate and popularize statistics in China. In spite of my small contribution on the theoretical side, I take comfort from the fact that many of my students are now well-accomplished theoretical or applied statisticians. They are working hard for our Motherland, or are doing their best to glorify her. I sincerely thank my students. Having progressed a long way in their careers, they still remember me, a teacher who happened to introduce them to statistics. I also sincerely thank my friends and teachers. Without their constant encouragement, affirmation and appreciation of what I did, I could not have pulled myself through those difficult times. Last but not least, I would like to thank my teacher Mr. Hsu Paolu wholeheartedly. Although he left us thirty years ago, his earnest words, his passion and persistent drive for science, and his heartfelt love for the Motherland have always been influential in my life. I regret deeply that I could not adequately accomplish his ideas in statistics. But I believe that the achievements of the new generation will comfort his soul in Heaven.
SIGNIFICANCE LEVEL IN INTERVAL MAPPING

DAVID O. SIEGMUND
Department of Statistics, Stanford University, Stanford, CA 94305, USA

BENNY YAKIR
Department of Statistics, Hebrew University, Jerusalem, Israel
The false positive rate of a genome scan that uses interval mapping involves the distribution of the maximum of a Gaussian process, for which there are two approximations: one relying on the Rice-Davies formula, which is accurate for relatively sparsely placed markers, and one that is accurate for closely spaced markers. In this paper we combine these two approximations to obtain an approximation that is accurate for both sparse and dense markers. We also give a new proof of the Rice-Davies formula.
1. Introduction

Genome scans lead to testing a large number of markers distributed throughout the genome for linkage to a trait of interest. Typically linkage is detected in any region of the genome where an appropriate asymptotically Gaussian stochastic process exceeds a threshold. The genomewide false positive error rate is the probability, under the null hypothesis that no markers are linked to the trait, that the maximum of the stochastic process exceeds this threshold.

The stochastic process arises from marker data, ideally placed on an equally spaced grid throughout the genome. Often one also uses interval mapping (Lander and Botstein, 1989), which is a technique based on the EM algorithm to interpolate the observed process between markers. The "markers only" process is asymptotically the discrete skeleton of a non-differentiable, locally Markovian, Gaussian process, either the Ornstein-Uhlenbeck process or a close relative, while the interpolations of interval mapping are very smooth.

For mapping quantitative traits in experimental genetics, there are two recommended approximations to control the genomewide false positive error rate. One is based on Rice's formula as applied by Davies (1987) (Rebai, Goffinet and Mangin, 1994, 1995; Dupuis and Siegmund, 1999) for the expected number of upcrossings of a smooth Gaussian process. It is appropriate when intermarker spacing is reasonably large (ca. 20 cM) and interval mapping is used to interpolate between markers. The second is based on an approximation to the distribution of the maximum of the discrete skeleton of a locally Markov Gaussian process (Feingold, Brown and Siegmund, 1993; Dupuis and Siegmund, 1999) and is appropriate for closely spaced markers. Of these two approximations the first is overly conservative when markers are closely spaced, while the second is anti-conservative when markers are widely spaced and interval mapping is employed.

The goals of this paper are (i) to suggest a combination of these two approximations, which seems to combine the best features of both, and (ii) to give a new derivation of the Davies approximation based on a likelihood ratio transformation along the lines of Yakir and Pollak (1998) and Siegmund and Yakir (2000a, 2000b).

2. Known Results

We begin with the one dimensional case of a backcross (or an intercross where the possibility of a dominance deviation is ignored). Let $X_t$ denote a mean 0 variance 1 stationary Gaussian process with covariance function that satisfies $R(t) = 1 - \beta|t| + o(t)$. A case of particular interest is the Ornstein-Uhlenbeck process, where $R(t) = \exp(-\beta|t|)$, which arises from the Haldane (no interference) model for crossovers. For markers equally spaced at distance $\Delta$, for a single chromosome of genetic length $L$, we have the approximation (Feingold, Brown and Siegmund, 1993)

$$P\Big\{\max_{0 \le i\Delta \le L} X_{i\Delta} > x\Big\} \approx 1 - \Phi(x) + \nu L x \varphi(x), \tag{1}$$
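The left-hand side of (1) can also be estimated by simulation, which gives a reference point for any analytic approximation. The following is a minimal sketch, not the authors' method: it simulates the discrete skeleton of the Ornstein-Uhlenbeck process as an AR(1) sequence with lag-one correlation $e^{-\beta\Delta}$ and counts how often the maximum exceeds $x$. The values `BETA = 0.02`, `DELTA = 20`, `LENGTH = 100` and `THRESHOLD = 2.0`, and all function names, are illustrative assumptions rather than values taken from the text.

```python
import random
from math import erfc, exp, sqrt


def simulate_max(beta, delta, length, rng):
    """Return the maximum of one OU path observed at markers 0, delta, ..., length."""
    m = int(round(length / delta))  # number of marker intervals
    r = exp(-beta * delta)          # correlation between adjacent markers
    s = sqrt(1.0 - r * r)           # innovation scale that keeps the variance at 1
    x = rng.gauss(0.0, 1.0)         # stationary start: X_0 ~ N(0, 1)
    best = x
    for _ in range(m):
        # AR(1) step: preserves mean 0, variance 1, covariance exp(-beta * |t|)
        x = r * x + s * rng.gauss(0.0, 1.0)
        best = max(best, x)
    return best


def normal_tail(x):
    """Upper tail 1 - Phi(x) of the standard normal distribution."""
    return 0.5 * erfc(x / sqrt(2.0))


# Hypothetical settings chosen only for illustration.
BETA, DELTA, LENGTH, THRESHOLD = 0.02, 20.0, 100.0, 2.0
N_REP = 50_000

rng = random.Random(0)
maxima = [simulate_max(BETA, DELTA, LENGTH, rng) for _ in range(N_REP)]
p_max = sum(v > THRESHOLD for v in maxima) / N_REP  # Monte Carlo estimate
p_single = normal_tail(THRESHOLD)                   # single-marker tail
n_markers = int(round(LENGTH / DELTA)) + 1

print(f"estimated P(max > x): {p_max:.4f}")
print(f"single-marker tail:   {p_single:.4f}")
print(f"Bonferroni bound:     {n_markers * p_single:.4f}")
```

The estimate should fall between the single-marker tail probability and the Bonferroni bound; positive correlation between neighbouring markers is what keeps it below the bound.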
[...] and adding over subintervals of $I$, we obtain from (7) the formula:

$$P\Big(\max_{s \in I} Z_s > x\Big) \approx 1 - \Phi(x) + \frac{e^{-x^2/2}}{2\pi} \int_0^{L} \rho_s \, ds. \tag{9}$$
7. Evaluation of (9)

In the case where $\{X_t\}$ is the Ornstein-Uhlenbeck process the matrix $W$ has the explicit form:

$$W = \{w_{ij}\}_{i,j=0}^{m}, \qquad w_{ij} = \begin{cases} 1/(1 - e^{-2\beta\Delta}), & i = j = 0, m, \\ (1 + e^{-2\beta\Delta})/(1 - e^{-2\beta\Delta}), & 0 < i = j < m, \\ -e^{-\beta\Delta}/(1 - e^{-2\beta\Delta}), & |i - j| = 1. \end{cases}$$
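The explicit form of $W$ can be checked numerically. The sketch below rests on two assumptions that are not stated in the passage shown here: that $W$ is the inverse of the covariance matrix of the marker values $X_0, X_\Delta, \ldots, X_{m\Delta}$, whose $(i, j)$ entry is $e^{-\beta\Delta|i-j|}$, and that $w_{ij} = 0$ whenever $|i - j| > 1$. The parameter values and function names are hypothetical.

```python
from math import exp


def ou_covariance(beta, delta, m):
    """Covariance matrix of the OU process at markers 0, delta, ..., m * delta."""
    return [[exp(-beta * delta * abs(i - j)) for j in range(m + 1)]
            for i in range(m + 1)]


def tridiagonal_w(beta, delta, m):
    """Matrix W with the three kinds of entries displayed above, zero elsewhere."""
    r = exp(-beta * delta)
    d = 1.0 - r * r  # equals 1 - exp(-2 * beta * delta)
    w = [[0.0] * (m + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        # Corner entries differ from interior diagonal entries.
        w[i][i] = 1.0 / d if i in (0, m) else (1.0 + r * r) / d
        if i > 0:
            w[i][i - 1] = w[i - 1][i] = -r / d
    return w


def matmul(a, b):
    """Plain matrix product of two square lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical values chosen only for illustration.
beta, delta, m = 0.02, 20.0, 5
product = matmul(tridiagonal_w(beta, delta, m), ou_covariance(beta, delta, m))

# Largest deviation of W * Sigma from the identity matrix.
max_error = max(abs(product[i][j] - (1.0 if i == j else 0.0))
                for i in range(m + 1) for j in range(m + 1))
print(max_error)
```

Under these assumptions the product is the identity up to rounding error. This matches the Markov property of the Ornstein-Uhlenbeck process, whose precision matrix is tridiagonal.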
Indeed, for $u \in \{1, \ldots, m - 1\}$, let