= 1/2, prove that the probability of the match being finished in ten or fewer games is 1981/2048.
*
36 In a game of skill a player has probabilities 1/3, 5/12 and 1/4 of scoring 0, 1 and 2 points respectively at each trial, the game terminating on the first realization of a zero score at a trial. Assuming that the trials are independent, prove that the probability of the player obtaining a total score of 11 points is u₁₁, where

u_n = (3/13)(3/4)^n + (4/39)(-1/3)^n,

and that the expectation of his total score is 11/4. Also, suppose the rules are changed so that the game does not end on the first realization of a zero score at a trial but the trials continue indefinitely. In this case, show that the probability of the player obtaining a score of exactly 11 points at some stage of play is

9/13 + (4/13)(-4/9)^11.
37 An urn contains a white and b black balls. After a ball is drawn, it is to be returned to the urn if it is white; but if it is black, it is to be replaced by a white ball from another urn. Show that the probability of drawing a white ball after the foregoing operation has been repeated n times is

p_n = 1 - [b/(a+b)]·[1 - 1/(a+b)]^n.
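The urn result of exercise 37 can be checked numerically by evolving the exact distribution of the number of black balls; a minimal sketch in Python, using rational arithmetic (the closed form p_n = 1 - [b/(a+b)][1 - 1/(a+b)]^n is the one asserted in the exercise):

```python
from fractions import Fraction

def p_white_exact(a, b, n):
    """P(white on the draw following n replacement operations),
    computed by evolving the exact distribution of the number of
    black balls left in the urn."""
    total = a + b
    dist = {b: Fraction(1)}          # start with b black balls
    for _ in range(n):
        new = {}
        for k, pr in dist.items():
            if k > 0:                # black drawn, replaced by a white ball
                new[k - 1] = new.get(k - 1, Fraction(0)) + pr * Fraction(k, total)
            new[k] = new.get(k, Fraction(0)) + pr * Fraction(total - k, total)
        dist = new
    return sum(pr * Fraction(total - k, total) for k, pr in dist.items())

def p_white_formula(a, b, n):
    total = a + b
    return 1 - Fraction(b, total) * (1 - Fraction(1, total)) ** n
```

The two agree exactly because the draw probability is linear in the number of black balls, so the expectation recursion is closed.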
38 Two urns contain, respectively, a white and b black, and c white and d black balls. One ball is taken from the first urn and transferred into the second, while simultaneously one ball taken from the second urn is transferred into the first. Find the probability, p_n, of drawing a white ball from the first urn after such an exchange has been repeated n times. Also obtain the limiting value of p_n as n → ∞.
39 Two players A and B start playing a series of games with £a and £b respectively. The stake is £1 on a game, and no game can be drawn. If the probability of A winning any game is a constant, p, find the initial probability of his exhausting the funds of B or his own. Also, show that if the resources of B are unlimited, then (i) A is certain to be ruined if p = 1/2; and (ii) A has an even chance of escaping ruin if p = 2^{1/a}/(1 + 2^{1/a}).
40 Two players A and B agree to contest a match consisting of a series of games, the match to be won by the player who first wins three games, with the proviso that if the players win two games each, the match is to continue until it is won by one player winning two games more than his opponent. The probability of A winning any given game is p and the games cannot be drawn. (i) Prove that f(p), the initial probability of A winning the match, is given by
PROBABILITY AND DISCRETE RANDOM VARIABLES
(ii) Show algebraically that df/dp > 0 for 0 < p < 1.
41 The telephone exchange of an establishment has m (> 2) single connections, one for each room in the establishment; and during a working day all the m connections are equally likely to contact the telephone operator for service. If P_n denotes the probability that in any sequence of n calls to the operator no room contacts the exchange three times consecutively, prove that

P_n = [(1-α₂)α₁^{n-1} - (1-α₁)α₂^{n-1}]/(α₁ - α₂),

where α₁ and α₂ are the roots of the quadratic equation in x

m²x² - m(m-1)x - (m-1) = 0.
Find the limiting values of P_n when (i) n → ∞ and m is finite; (ii) m → ∞ and n is finite; and interpret their significance.
42 In a lottery m tickets are drawn at a time out of n tickets numbered from 1 to n (m ≤ n). Find the expectation and variance of the random variable S denoting the sum of the numbers of the m tickets drawn.
43 At an office N letters are to be posted, one to each of N different addresses, and a capricious secretary decides to distribute the letters randomly amongst the N addressed envelopes. If all the N! arrangements of the letters are equally likely, show that the expectation and the variance of the number of correct postings are both unity. In a similar situation, another slightly less capricious secretary decides to make independently a guess of the correct envelope for each letter to be posted, so that the N^N choices are all equally probable. Prove that in this case the expectation and variance of the number of correct postings are 1 and (N-1)/N respectively.
44 A box contains 2^n tickets among which C(n, r) bear the number r (r = 0, 1, 2, ..., n). A group of m tickets is drawn at random from the box, and if the random variable X denotes the sum of the numbers on the tickets drawn, show that

E(X) = mn/2;    var(X) = (mn/4)·[1 - (m-1)/(2^n - 1)].
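The letter-matching results of exercise 43 (mean and variance of the number of correct postings both equal to one) can be confirmed by exhaustive enumeration for a small N; a sketch:

```python
from itertools import permutations

def fixed_point_moments(N):
    """Mean and variance of the number of fixed points over all N!
    equally likely arrangements of N letters in N envelopes."""
    counts = [sum(i == p[i] for i in range(N))
              for p in permutations(range(N))]
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / len(counts)
    return mean, var
```

For the second secretary the number of correct guesses is Binomial(N, 1/N), which gives mean 1 and variance (N-1)/N directly.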
45 By considering an example of a discrete sample space, show that the probability distribution of a random variable defined over it is a suitable reordering of the elements of the sample space and their associated probabilities.
EXERCISES IN PROBABILITY AND STATISTICS
45 A probability distribution is defined over the positive integral values from 0 to n such that P(r), the probability for the integer r, is proportional to C(n, r)/(r+1). Evaluate the proportionality factor and hence prove that the mean and variance of the distribution are

A - 1  and  A[(n+2)/2 - A]

respectively, where A = (n+1)2^n/(2^{n+1} - 1). Further, show that this probability distribution can be formally obtained from a finite sampling scheme with replacement in which at least one ball is selected randomly at a time from a total of (n+1).
46 A population consists of all the positive integers, and the probability of obtaining the integer r from this population is
P(r) = k(1-θ)^{r-1},    (r = 1, 2, 3, ...),  where 0 < θ < 1.

Determine the constant k, and the mean and mode of this population. Show also that if θ = 1 - (1/2)^{1/n}, where n is a positive integer, then the median of the distribution may be considered to be n + 1. What is the variance of the distribution?
47 A certain mathematician always carries two match-boxes, which initially contain N match-sticks each, and every time he wants a light, he selects a box at random. Obtain u_r, the probability that when he finds a box empty for the first time the other box contains exactly r match-sticks, and verify that
Σ_{r=0}^{N} u_r = 1.
Also, prove that the expectation and variance of the number of matches left in the box are

μ  and  [(2N+2) - (1+μ)(2+μ)]

respectively, where μ ≡ (2N+1)u₀ - 1.
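Exercise 47 is the classical match-box problem. Taking the standard form u_r = C(2N-r, N)(1/2)^{2N-r} (an assumption here, since the exercise leaves u_r to be derived), the stated identities can be checked exactly; a sketch:

```python
from fractions import Fraction
from math import comb

def banach_pmf(N):
    """u_r = C(2N - r, N) (1/2)^(2N - r), for r = 0, 1, ..., N
    (assumed standard solution of the match-box problem)."""
    return [Fraction(comb(2 * N - r, N), 2 ** (2 * N - r))
            for r in range(N + 1)]

N = 6
u = banach_pmf(N)
mean = sum(r * ur for r, ur in enumerate(u))
mu = (2 * N + 1) * u[0] - 1                 # expectation as stated
second = sum(r * r * ur for r, ur in enumerate(u))
var = second - mean ** 2
```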
48 There are n compartments into which identical balls are distributed one by one in such a way that each ball is equally likely to fall into any one of the n compartments. This process is continued until every compartment has at least one ball. Prove that the probability that every compartment is occupied after t balls have been used is

Σ_{m=0}^{n} C(n, m)(-1)^m (n-m)^t / n^t.

Hence deduce the probability that exactly t balls are needed for filling all the n compartments, and that the expected number of balls required is

n Σ_{m=1}^{n} 1/m.
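The inclusion-exclusion probability of exercise 48 can be verified against brute-force enumeration, and the expected number of balls against n(1 + 1/2 + ... + 1/n); a sketch for a small n:

```python
from fractions import Fraction
from itertools import product
from math import comb

def p_all_occupied(n, t):
    """Inclusion-exclusion probability that t balls occupy all n compartments."""
    return sum(Fraction(comb(n, m) * (-1) ** m * (n - m) ** t, n ** t)
               for m in range(n + 1))

def p_all_occupied_brute(n, t):
    """Same probability by enumerating all n^t equally likely assignments."""
    hits = sum(1 for w in product(range(n), repeat=t) if len(set(w)) == n)
    return Fraction(hits, n ** t)

# expected number of balls, via E(T) = sum over t >= 0 of P(T > t)
n = 3
expected = sum(1 - float(p_all_occupied(n, t)) for t in range(200))
```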
49 A box contains N varieties of objects, the number of objects of each variety being the same. These objects are sampled one at a time with replacement; and if Xr is a random variable which denotes the number of drawings
necessary to produce any r different varieties in the sample, find the expectation and variance of X_r. Also, for large N, show that

E(X_r) ≈ N log[N/(N-r+1)]  and  var(X_r) ≈ N(r-1)/(N-r+1) - N log[N/(N-r+1)].
50 In the previous example, let the N varieties be identified by being numbered from 1 to N. If X denotes the largest number drawn in n drawings when random sampling with replacement is used, find the probability of X = k. Hence obtain the mean and variance of X. Also, show that for large N and fixed n,

E(X) ≈ nN/(n+1)  and  var(X) ≈ nN²/[(n+1)²(n+2)].
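For exercise 50, the distribution of the largest number is P(X = k) = (k^n - (k-1)^n)/N^n (stated here as an assumption, since the exercise asks the reader to find it); the large-N approximation to the mean is then easy to check numerically:

```python
def max_pmf(N, n):
    """P(X = k) for the largest of n draws with replacement from 1..N:
    (k^n - (k-1)^n) / N^n (assumed form)."""
    return [(k ** n - (k - 1) ** n) / N ** n for k in range(1, N + 1)]

N, n = 1000, 5
pmf = max_pmf(N, n)
mean = sum(k * p for k, p in zip(range(1, N + 1), pmf))
approx = n * N / (n + 1)      # large-N approximation from the exercise
```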
51 Of a finite population of N animals in a region, W are caught, marked and released. Members are then caught one by one until w (preassigned) marked animals are obtained, the total number of the animals in the sample being a random variable X. Show that the probability

P(X = n) = C(n-1, w-1) C(N-n, W-w) / C(N, W),    for w ≤ n ≤ N - W + w,
and verify that this represents a proper probability distribution over the given range of variation of the random variable. Hence show that

E(X) = w(N+1)/(W+1)  and  E[X(X+1)] = w(w+1)(N+1)(N+2)/[(W+1)(W+2)].
If a new random variable Y is defined by the relation

Y = X(W+1)/w - 1,

then prove that

E(Y) = N  and  var(Y) = (N+1)(N-W)(W-w+1)/[w(W+2)].
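The inverse-sampling distribution of exercise 51 sums to one and has the stated moments; both can be verified exactly for moderate values of N, W, w:

```python
from fractions import Fraction
from math import comb

def inv_sampling_pmf(N, W, w):
    """P(X = n) = C(n-1, w-1) C(N-n, W-w) / C(N, W), w <= n <= N-W+w."""
    return {n: Fraction(comb(n - 1, w - 1) * comb(N - n, W - w), comb(N, W))
            for n in range(w, N - W + w + 1)}

N, W, w = 20, 8, 3
d = inv_sampling_pmf(N, W, w)
mean = sum(n * p for n, p in d.items())
second = sum(n * (n + 1) * p for n, p in d.items())
```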
52 A large number (N) of persons are subject to a blood test which can be administered in two alternative ways: (i) Each person is tested separately, so that N tests are required. (ii) The blood samples of k (a factor of N) persons are pooled and analysed together. If the test is negative, this one test suffices for the k individuals. If the test is positive, each of the k persons must be tested separately, and in all (k+1) tests are required for k persons. Assuming that the test responses of the N persons are statistically independent and that the probability (1-p) for a test to be positive is the same for all individuals, find the probability that the test for a pooled sample for k persons will be positive.
If S be the number of tests required for the N persons under plan (ii), prove that the mean and variance of S are

N[1 + k(1-p^k)]/k  and  Nk p^k(1-p^k)

respectively; and that the value of k which gives the minimum expected number of tests for the N persons satisfies the equation

k² + 1/(p^k log p) = 0.
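The pooled-testing mean of exercise 52 and the location of the best group size can be checked directly; a sketch (the near-optimal k is close to 1/sqrt(1-p), consistent with the stationarity equation above):

```python
def expected_tests(N, k, p):
    """E(S) under plan (ii): N/k pooled groups; each needs one test,
    plus k individual tests unless all k members are negative
    (each independently negative with probability p)."""
    return (N // k) * (1 + k * (1 - p ** k))

N, p = 1000, 0.99
divisors = [k for k in range(2, N + 1) if N % k == 0]
best_k = min(divisors, key=lambda k: expected_tests(N, k, p))
```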
53 In the simplest type of weather forecasting (rain or no rain in the next 24 hours), suppose the probability of raining is p (> 1/2), and that a forecaster scores a point if his forecast proves correct and zero otherwise. In making n independent forecasts of this type, a forecaster who has no genuine ability decides to allocate at random r days to a "rain" forecast and the rest to "no rain". Find the expectation of his total score (S_n) for the n days and show that this attains its maximum value for r = n. What is the variance of S_n? Devise an alternative scoring system which would ensure that under random forecasting the expectation and variance of S_n will be 0 and 1 respectively.
54 In the above example, suppose the forecaster has some ability to forecast the weather correctly. Accordingly, if the probability of his forecasting rain for any day is λ, and the conditional probability of raining on a day given that a "rain" forecast has been made is π, find the respective probabilities of the four possible outcomes for any day. Hence, assuming independence of weather conditions for n days, obtain a scoring system such that (i) E(S_n) = 0 for random forecasting; (ii) E(S_n) = n if the forecaster has perfect ability; and (iii) var(S_n) is a minimum when the marginal distribution of forecasting is the same as the marginal distribution of rain on any day, the two events being assumed independent. Verify that with this scoring system

E(S_n) = nλ(π - p)/pq,    (p+q = 1).
55 In a sequence of Bernoulli trials with probability of success (S) p and of failure (F) q, (p+q = 1), find the expected number of trials for the realization of the patterns (i) SSSFS; (ii) SFFS; and (iii) SSSF.
56 For a sequence of n Bernoulli trials in which the probability of success (S) is p and that of failure (F) is q, (p+q = 1), show that y_n, the probability that the pattern SF does not occur in the entire sequence, satisfies the difference equation

y_n - y_{n-1} + pq·y_{n-2} = 0  for n ≥ 2.

Hence obtain explicit expressions for y_n when p ≠ q and when p = q. Also, prove that the expected number of trials necessary for the realization of r consecutive repetitions of the pattern SF is

(1 - p^r q^r)/[p^r q^r (1 - pq)].
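Expected pattern waiting times such as those in exercises 55 and 56 can be computed from the prefix-overlap identity (a standard result assumed here, not derived in the exercise): the expected wait is the sum of 1/P(first k symbols) over every k for which the length-k prefix equals the length-k suffix. A sketch, reproducing the closed form for r repetitions of SF:

```python
from fractions import Fraction

def expected_wait(pattern, p):
    """Expected number of Bernoulli trials until `pattern` (a string of
    'S'/'F') first appears, via the prefix-overlap identity."""
    q = 1 - p
    def prob(s):
        out = Fraction(1)
        for c in s:
            out *= p if c == 'S' else q
        return out
    L = len(pattern)
    return sum(1 / prob(pattern[:k]) for k in range(1, L + 1)
               if pattern[:k] == pattern[L - k:])

p = Fraction(3, 10)
q = 1 - p
```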
57 Random drawings are made from an urn containing b black and w white balls. Each ball drawn is always replaced, and, in addition, c balls of the
colour drawn are added to the urn. If P(n, k) denotes the probability of drawing exactly k black balls in the first n drawings, show that P(n, k) satisfies the recurrence relation

P(n, k) = [b+(k-1)c]/[b+w+(n-1)c] · P(n-1, k-1) + [w+(n-k-1)c]/[b+w+(n-1)c] · P(n-1, k),
where P(n, -1) may be taken to be zero. Hence obtain P(n, n) and P(n, 0). Also, for k < n, verify that

P(n, k) = C(n, k) · [b(b+c)(b+2c)···{b+(k-1)c} · w(w+c)(w+2c)···{w+(n-k-1)c}] / [(b+w)(b+w+c)(b+w+2c)···{b+w+(n-1)c}]

satisfies the recurrence relation. Further, if p = b/(b+w), q = 1-p, and γ = c/(b+w) (γ > -1), prove that P(n, k) can be rewritten in the form

P(n, k) = C(n, k) · [p(p+γ)(p+2γ)···{p+(k-1)γ} · q(q+γ)(q+2γ)···{q+(n-k-1)γ}] / [1(1+γ)(1+2γ)···{1+(n-1)γ}].
Finally, if now n → ∞, p → 0, γ → 0 so that np → λ and nγ → 1/ρ, then show that the limiting form of P(n, k) is

Π(k) = C(λρ+k-1, k) · (ρ/(1+ρ))^{λρ} · (1/(1+ρ))^k,    (0 ≤ k < ∞),

and hence that, as ρ → ∞, Π(k) tends to the Poisson distribution with mean λ.
58 Two players A and B alternately roll a pair of unbiased dice. A wins if on a throw he obtains exactly six points before B gets seven points, B winning in the opposite event. If A begins the game, prove that his probability of winning is 30/61, and that the expected number of trials needed for A's win is approximately 6.
59 Under a newly proposed motor insurance policy, the premium is £α in the first year. If no claim is made in the first year, the premium is £λα in the second year, where λ (0 < λ < 1) is fixed. If no claim is made in the first or second years, the premium is £λ²α in the third year; and, in general, if no claim is made in any of the first r years (r ≥ 1), the premium is £λ^r α in the (r+1)th year. If in any year a claim is made, the premium in that year is unaffected, but the next year's premium reverts to £α, and this year is then treated as if it were the first year of the insurance for the purpose of calculating further reductions. Assuming that the probability that no claim will arise in any year is constant and equal to q, prove that in the nth year (n ≥ 2) of the policy, the probabilities that the premium paid is λ^{n-1}α or λ^{n-j-1}α, (1 ≤ j ≤ n-1), are q^{n-1} and (1-q)q^{n-j-1} respectively. Hence calculate the expected amount of the premium payable in the nth year and show that if this mean must always exceed kα (k > 0), then
λ > (k+q-1)/(kq).
60 A player rolls four unbiased dice, and if S is the random variable denoting the sum of points obtained in a single throw of the dice, prove that the probability P(S = n) is the coefficient of θ^{n-4} in the expansion of

(1 - θ⁶)⁴ / [6⁴(1 - θ)⁴]

for all n in the range (4 ≤ n ≤ 24). Hence, or otherwise, deduce that (i) P(S = 18) = 5/81; and (ii) E(S) = 14.
61 The probability of obtaining a 6 with a biased die is p, where (0 < p < 1). Three players A, B and C roll this die in order, A starting. The first one to throw a 6 wins. Find the probability of winning for A, B and C. If X is a random variable which takes the value r if the game finishes at the rth throw, determine the probability-generating function of X and hence, or otherwise, evaluate E(X) and var(X).
62 The six faces of an ordinary cubical die are numbered from 1 to 6. If two such unbiased dice are rolled once, find the probability distribution of the random variable X denoting the sum of points obtained. Also, find an appropriate numbering of the twelve faces of two unbiased dice which would ensure that the probability P(X = r) is the same for all r in the range (1 ≤ r ≤ 12), and show that for such a pair of dice the probability-generating function of X is
G(θ) = θ(1 - θ¹²)/[12(1 - θ)].
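One numbering that achieves the uniform sum in exercise 62 (an illustrative choice, not necessarily the book's intended one) is to keep one die as 1..6 and mark the other 0, 0, 0, 6, 6, 6; uniformity on 1..12, and hence the stated generating function, is then immediate to verify:

```python
from fractions import Fraction
from itertools import product

# A numbering that makes the sum uniform on 1..12 (one possible choice):
die1 = [1, 2, 3, 4, 5, 6]
die2 = [0, 0, 0, 6, 6, 6]

dist = {}
for a, b in product(die1, die2):
    dist[a + b] = dist.get(a + b, Fraction(0)) + Fraction(1, 36)
```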
63 From an urn containing (2n+1) tickets numbered serially, three tickets are drawn at random without replacement. Prove that the probability that the numbers on them are in arithmetical progression is 3n/(4n²-1). Further, by considering the sample space corresponding to the possible realizations of the arithmetical progressions, show that the common difference of the arithmetical progression can be regarded as a discrete random variable X with the probability distribution defined by

P(X = r) = [(2n+1) - 2r]/n²,    for r = 1, 2, 3, ..., n.
Hence deduce that the probability-generating function of X is

G(θ) = [4θ/(n(1-θ)²)] · [1 - (2n+1-θⁿ)(1+θ)/(4n)].
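Both claims of exercise 63 can be checked by exhaustive enumeration for a small n; a sketch:

```python
from fractions import Fraction
from itertools import combinations

def ap_probability(n):
    """P(three tickets drawn from 1..2n+1 are in arithmetic progression),
    by enumerating all C(2n+1, 3) equally likely triples."""
    triples = list(combinations(range(1, 2 * n + 2), 3))
    hits = sum(1 for a, b, c in triples if b - a == c - b)
    return Fraction(hits, len(triples))

def diff_pmf(n):
    """P(X = r) = (2n+1-2r)/n^2 for the common difference r = 1..n."""
    return [Fraction(2 * n + 1 - 2 * r, n * n) for r in range(1, n + 1)]
```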
64 In a game of skill a player has probability p of winning a point and probability q, (p+q = 1), of losing a point at each attempt. If the trials are independent, find the probability distribution of the random variable X giving the player's total score in n trials. Hence, or otherwise, obtain the mean and variance of X. Also, show that if a new random variable Y is defined by the relation Y= (X+n)/2,
then Y has the Bernoulli distribution with probability of success p in each of n trials. 65 At each independent trial in a game, a player has probability p of winning a point, probability q of losing a point, and probability r for no loss or gain,
where (p+q+r = 1). Find the probability-generating function of the random variable X giving the player's total score in n trials, and hence deduce the mean and variance of X. Also, show that the probability for a total score of m (≤ n) points in n trials is

P(X = m) = Σ_{j=m}^{(n+m)/2} C(n, m+n-j) C(m+n-j, j) (p/r)^j (q/r)^{j-m} r^n.
66 A particular constituency has a total of (L + C + F) registered voters of which L are by conviction Labour supporters, C are Conservative and F are the floating voters. The probability that a Labour supporter will vote for his party in a by-election is p₁, and the probability is p₂ that a Conservative will exercise his vote. The probabilities are p₃ and p₄ for a floating voter to vote either Labour or Conservative respectively. Prove that in a straight fight between a Labour and a Conservative candidate, the probability of a tie is given by the coefficient of θ^{C+F} in the function

[(1-p₁) + p₁θ]^L · [(1-p₂)θ + p₂]^C · [(1-p₃-p₄)θ + p₃θ² + p₄]^F.

Show also that for the random variable N representing the total number of votes cast in the by-election, the mean and variance are:

E(N) = Lp₁ + Cp₂ + F(p₃+p₄);
var(N) = Lp₁(1-p₁) + Cp₂(1-p₂) + F(p₃+p₄)(1-p₃-p₄).
67 In an industrial process individual items are in continuous production, and the probability of finding a defective item on inspection is a constant, p. To ensure a reasonable standard of the outgoing product, 100 per cent inspection is carried out until a sequence of r (preassigned) non-defectives is found after the detection of a defective. After this, 100 per cent inspection is discontinued and only a given fraction f (0 < f < 1) of the items is inspected.
If X₁, X₂, ..., Xₙ are n independent realizations of X, find the moment-generating functions of the random variables Y and Z defined by
Y = Σ_{i=1}^{n} X_i  and  Z = Σ_{i=1}^{n} X_i²,

and obtain their limiting values when np → m, a positive number, as n → ∞ and p → 0. Also, prove that

P(Y = 0) = P(Z = 0) · Σ_r C(n, r) C(r, r/2) (λ/2)^r,

where λ ≡ p/(1-p), and r takes even values ≤ n.
79 A discrete random variable X takes the values ±2, ±1 and 0 with probabilities p²/2, p(1-p) and (1-p)² respectively. Derive the moment-generating function of S_n, the sum of n independent observations of X. If np = m, where n is sufficiently large so that terms of order n⁻² can be considered negligible, show that in this case the moment-generating function of S_n is approximately
M(t) = exp{4m(1 + m/n) sinh²(t/2)},

and hence, or otherwise, deduce that var(S_n) = 2m(1 + m/n).
80 A discrete random variable X has a Poisson distribution with parameter
λ, where λ is known to be large. Given a small positive number b, prove that for any realization of X

√(X+b) = √(λ+b) · [1 + Σ_{s=1}^{∞} ((-1)^{s-1}(2s-2)! / (2^{2s-1} s!(s-1)!)) · ((X-λ)/(λ+b))^s].

By using this expansion, deduce that

E[√(X+b)] = √(λ+b) - 1/(8√λ) + (24b-7)/(128λ^{3/2}) - (240b²-260b+75)/(1024λ^{5/2}) + O(λ^{-7/2}),

var[√(X+b)] = (1/4)·[1 + (3-8b)/(8λ) + (32b²-52b+17)/(32λ²)] + O(λ^{-3}),

and hence verify that the variance of √(X + 3/8) is 1/4 + O(λ^{-2}).
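The variance-stabilizing shift asserted above (the variance of √(X + 3/8) is approximately 1/4 for large λ) can be checked numerically by summing the Poisson pmf directly; a sketch:

```python
from math import exp, sqrt

def var_sqrt_shift(lam, b, rmax=600):
    """Variance of sqrt(X + b) for X ~ Poisson(lam), by direct summation
    of the pmf (terms built iteratively to avoid large factorials)."""
    p = exp(-lam)                    # P(X = 0)
    m1 = p * sqrt(b)
    m2 = p * b
    for r in range(1, rmax):
        p *= lam / r                 # advance to P(X = r)
        m1 += p * sqrt(r + b)
        m2 += p * (r + b)
    return m2 - m1 * m1

lam = 50.0
```

For this λ the shifted variance is much closer to 1/4 than the unshifted one, as the expansion predicts.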
81 Two children, A and B, divide at random a bar of chocolate containing 4n (n ≥ 1) sections, so that A has r sections and B (4n-r), with neither having no chocolate. If A is known to have more than B, show that the expected number of sections in A's possession is 3n, and the variance of his number of sections is n(n-1)/3. Show also that if on three successive occasions A has more than three times as much chocolate as B, it is reasonable to doubt that the bar was broken at random.
82 A lady declares that by tasting a cup of tea made with milk she can discriminate whether the milk or the tea infusion was first added to the cup. In an experiment designed to test her claim, eight cups of tea are made, four by each method, and are presented to her for judgment in a random order. Calculate the probability of the cups being correctly divided into two groups of four if such a division is entirely random. If the experiment is repeated ten times, and a "success" in an individual experiment is scored if the lady correctly divides the cups, show that a score of two or more "successes" in ten trials provides evidence, significant at the 1 per cent level, for her claim. The lady considers this definition of "success" too stringent, so an alternative experiment is devised. In this experiment six cups of each kind are presented to her in random order, and she is now regarded as scoring a "success" if she correctly distinguishes five or more of each kind. Calculate the probability of such a "success", and also the probability of three or more successes in ten independent experiments, assuming her claim to be false. Explain the difference, if any, between the stringency of these two procedures for testing the lady's claim.
83 In an unlimited sequence of Bernoulli trials with probability p of success (S) and q of failure (F), two patterns are defined by the outcomes of three consecutive trials as (i) SFF and (ii) FSF. Prove that the expected number of trials needed for r (≥ 1) consecutive realizations of the pattern (i) is

(1 - p^r q^{2r})/[p^r q^{2r}(1 - pq²)],

and that for the pattern (ii) is

1/q + (1 - p^r q^{2r})/[p^r q^{2r}(1 - pq²)].

84 A printing machine is capable of printing any of n characters α₁, α₂, ..., αₙ and is operated by an electrical impulse, each character being, in theory, produced by a different impulse. In fact, the machine has probability p, independent of any previous behaviour, of printing the character corresponding to the impulse it receives, where (0 < p < 1). If it prints the wrong character, all such characters are, independently, equally likely to be printed. One of the n impulses, chosen at random, was fed into the machine twice and on both occasions the character αᵢ was printed. Show that the probability that the impulse chosen was the one designed to produce the character αᵢ is

(n-1)p²/(1 - 2p + np²).

If the machine had printed αᵢ on the first occasion and αⱼ (j ≠ i) on the second, determine now the probability that the impulse chosen was that designed to produce αᵢ. Does it make any difference to this probability if it is merely known that the second character printed was not αᵢ?
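The posterior probability in the printing-machine exercise follows from a direct Bayes computation, which can be checked exactly against the closed form (n-1)p²/(1 - 2p + np²); a sketch:

```python
from fractions import Fraction

def posterior_two_matches(n, p):
    """P(the chosen impulse was the one for the printed character,
    given the same character appeared on both occasions).
    Correct character is printed w.p. p; a wrong character is uniform
    over the other n - 1 characters."""
    wrong = (1 - p) / (n - 1)            # prob of one specific wrong character
    num = Fraction(1, n) * p * p         # impulse i, printed correctly twice
    den = num + Fraction(n - 1, n) * wrong * wrong
    return num / den

n, p = 8, Fraction(2, 5)
```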
85 Electronic tubes in continuous production at a factory are packed in batches of N (large) units for despatch to customers. It is known from past experience of the manufacturing process that the probability of a random batch containing k defective tubes is p_k, where (0 ≤ k ≤ m). From a randomly selected batch, a sample of n (< N) tubes is tested and of these r ≤ m are found to be defective. Find the probability that the chosen batch had, in fact, k ≥ r defective tubes.
If, according to a customer's specification of acceptable quality, batches with more than r defectives are considered to be definitely bad, prove that the total probability of the selected batch being in fact a bad one is

1 - [p_r/C(N, r)] / [Σ_{j=r}^{m} p_j C(N-n, j-r)/C(N, j)].
Discuss briefly how this result could be used for quality control of the production process.
86 In a big department store there are n separate service counters, and a customer entering the store is equally likely to go to any one of the counters for service. Find the probabilities that a particular counter X will be the first to have a run of r consecutive customers before any other counter receives s successive customers when it is known that just preceding the count only the last customer went to (i) counter X; and (ii) another counter Y. If n is finite and r = s large, verify that these two probabilities are both asymptotically equal to 1/n, and explain the significance of this limiting value.
87 Housewives shopping at a grocery store are equally likely to ask for any one of n standard brands of detergents, but due to an advertising campaign for a new detergent X, it is expected that the probability for a housewife to ask for X is a constant P (0 < P < 1), the demand for the standard brands remaining equi-probable. Assuming that all housewives coming to the store buy one and only one brand of the (n+1) detergents, find the probability that on any day the first r consecutive housewives will purchase X before a run of s shoppers buys packets of any one of the standard brands. Hence determine the limiting values of this probability for (i) n large and r, s finite; and (ii) n finite and r, s large.
88 An infinite population consists of two kinds of biological particles which may be likened to white (W) and black (B) beads. In their initial state the beads are found linked together in chains of four in all the five possible distinct colour combinations. The probability that a chain has a white bead is a constant p, and that for a black bead is q (≡ 1 - p). Each chain temporarily splits in the middle to form two doublets, and these doublets then randomly recombine to form new chains of four beads each.
This dual process of splitting up and recombination continues indefinitely without any other kind of change in the original population of beads. Determine the proportions of the three colour combinations of the doublets (WW), (WB) and (BB) immediately after the first stage of splitting. If now only proportions x₀, 2y₀ and z₀ (x₀ + 2y₀ + z₀ = 1) of the doublets are considered further in the process, because of certain accidental destruction, find xₙ, 2yₙ and zₙ, the corresponding proportions after the nth successive stage of splitting, and hence obtain their limiting values as n → ∞.
89 In an infinite population of two kinds of biological particles, which may be symbolised by white (W) and black (B) balls, the balls exist singly and in pairs of the three colour combinations (WW), (WB) and (BB). Initially, the single (W) and (B) balls are in the proportions p₀ and q₀ (p₀ + q₀ = 1), whilst the paired balls (WW), (WB), (BB) have the proportions r₀, 2s₀, t₀ respectively (r₀ + 2s₀ + t₀ = 1).
All the paired balls undergo a simultaneous disjunction, after which they are equally likely either to combine randomly with the original single balls only to form new pairs of the three types, or to remain single. This second population of paired and single balls repeats the process of separation and combination to form a third population, and this continues indefinitely. Find the proportions of the paired and single balls after the nth stage of the process, and hence deduce their limiting values for n → ∞.
90 On empirical considerations, the demand for a particular brand of breakfast cereal at a retail store may be considered to be a random variable X having a Poisson distribution with mean λ. On the first of each month the retailer arranges to have a stock of n packets of the cereal. Find the expected number of lost sales per month because of inadequate stock, and hence deduce that, as a fraction of the average monthly demand, the expected proportion of lost sales is

π_n = (1 - n/λ)[1 - F(n-1)] + p(n-1),

where the probability P(X = r) ≡ p(r) for all r ≥ 0, and F(r) is the distribution function of X. Also, find the expectation of the monthly proportion of lost sales.
97 In the simplest type of weather forecasting (rain or no rain in the next 24 hours), suppose the probability of raining is p (> 1/2), and that a forecaster scores a point if his forecast proves correct and zero otherwise. In making n independent forecasts of this type, a forecaster, who has no genuine ability, predicts "rain" with probability λ and "no rain" with probability (1-λ). Prove that the probability of the forecast being correct for any one day is

[1 - p + (2p-1)λ].
Hence derive the expectation of the total score (S_n) of the forecaster for the n days, and show that this attains its maximum value for λ = 1. Also, prove that

var(S_n) = n[p - (2p-1)λ][1 - p + (2p-1)λ],

and thereby deduce that, for fixed n, this variance is a maximum for λ = 1/2.
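The lost-sales fraction derived in the cereal-stock exercise above, π_n = (1 - n/λ)[1 - F(n-1)] + p(n-1), can be verified numerically against a direct evaluation of E[(X-n)⁺]/λ; a sketch:

```python
from math import exp

def lost_fraction_direct(lam, n, rmax=500):
    """E[(X - n)^+] / lam for X ~ Poisson(lam), by direct summation."""
    p = exp(-lam)                    # P(X = 0)
    total = 0.0
    for r in range(rmax):
        if r > n:
            total += (r - n) * p
        p *= lam / (r + 1)           # advance to P(X = r + 1)
    return total / lam

def lost_fraction_formula(lam, n):
    """pi_n = (1 - n/lam)[1 - F(n-1)] + p(n-1)."""
    pmf = exp(-lam)                  # P(X = 0)
    F = 0.0
    p_nm1 = 0.0
    for r in range(n):               # accumulate F(n-1), pick out p(n-1)
        if r == n - 1:
            p_nm1 = pmf
        F += pmf
        pmf *= lam / (r + 1)
    return (1 - n / lam) * (1 - F) + p_nm1
```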
98 In testing the quality of large batches of a mass-produced article, randomly selected items of the product are examined individually from each batch, the probability of finding a defective item in a batch being a constant p, (0 < p < 1), and that of a non-defective q (= 1-p). Each batch is inspected separately, and if an inspected item is found to be of acceptable quality, a score of one point is made, while for each defective item encountered the penalty is to subtract a points from the total score, a being a known positive integer. Initially the score for a batch is zero, and as soon as the cumulative score is M, the batch is accepted without further inspection. On the other hand, when the total score is ≤ -M for the first time the batch is rejected immediately.
If for this inspection scheme, u(x) is the probability that a batch will be accepted when the cumulative score is x, where x is taken to be in the equivalent positive range (0 < x ≤ 2M), prove that u(x) satisfies the difference equation

v(x+a+1) - v(x+a) = -λv(x),

where u(x) ≡ q^{-x} v(x) and λ ≡ pq^a. Further, assuming that a solution for v(x) is a power series of the form

v(x) = Σ_{r=0}^{∞} λ^r v_r(x),
determine explicitly the series expression for u(x).
99 In a game of skill, a trial can result in any one of m mutually exclusive results R₁, R₂, ..., Rₘ. The probability for Rᵥ to happen is proportional to p^v, where p is a constant (0 < p < 1/2). The stake for each trial is one shilling, and if Rᵥ occurs at a trial the player receives λv shillings, where (0 < λ < 1). If Sₙ denotes the amount received by a player after n independent trials, find the probability-generating function of Sₙ.
If c > 0 and as n → ∞, np → α, nc → β, where α and β are constants, prove that the limiting distribution of x is negative binomial. Determine the cumulant-generating function of this limiting distribution, and hence obtain the first four cumulants. Finally, verify that for β = 0, the negative binomial reduces to a Poisson distribution with mean α.
103 The probability that a certain type of tree has n flowers is given by (1-p)pⁿ, n = 0, 1, 2, .... Each flower has probability 3/4 of being pollinated and hence producing fruit, independently from the other flowers on the tree. Each fruit has probability 1/3 of being eaten by birds before ripening. Show that the probability of any flower producing a ripe fruit is 1/2.
A particular tree bears r ripe fruit. Show that the probability that it had initially n flowers is

C(n, r) p^{n-r}(2-p)^{r+1}/2^{n+1}.

Show also that if an orchard of k independent trees produces no ripe fruit, the probability that the total number of flowers was initially n is

C(n+k-1, k-1) pⁿ(2-p)^k/2^{n+k}.

104 Contagious distributions are applicable to situations where individuals or items are supposed initially dispersed in randomly scattered groups, such as egg masses in the case of insects or clumps in the case of bacteria, which are subject to chance fluctuation in size. It may be supposed that there occurs subsequently some spatial dispersion from the initial groups, or reduction due to attacks of predators. Such phenomena are usually poorly represented by the simple Poisson model, and a more suitable model is provided by a class of three-parameter contagious distributions defined by the probability-generating function
G(θ) = exp{λ₁[f(θ) - 1]},

where

f(θ) ≡ Γ(η+1) Σ_{k=0}^{∞} λ₂^k (θ-1)^k / Γ(η+k+1),

and λ₁, λ₂, η are parameters such that λ₁, λ₂ > 0 and η ≥ 0. Verify that G(θ) represents a probability-generating function of a discrete random variable X for integral values of X ≥ 0. Also, prove that the probabilities P(X = r) = a_r satisfy the recurrence relations

a_{r+1} = [λ₁/(r+1)] Σ_{k=0}^{r} f^{(k+1)}(0) a_{r-k} / k!,    for r ≥ 1,
where α, β ≥ 1 and 0 ≤ X ≤ 1. Show that the rth moment of X about the origin is
αβ/[(α + r)(β + r)],
and hence evaluate the mean and variance of X. If α = 2β, prove that m̃, the median of the distribution, is given by
m̃ = (1 - 1/√2)^{1/β}.
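A numerical sketch of the median claim above, under the assumption (not stated explicitly in the surviving text) that the moments αβ/((α+r)(β+r)) belong to the density f(x) = αβ(x^{α-1} - x^{β-1})/(β - α) on (0, 1); with α = 2β this gives the cdf F(x) = 2x^β - x^{2β}:

```python
beta = 2.0
alpha = 2 * beta
m = (1 - 2 ** -0.5) ** (1 / beta)      # claimed median (1 - 1/sqrt(2))^(1/beta)
F = 2 * m ** beta - m ** (2 * beta)    # cdf under the assumed density
assert abs(F - 0.5) < 1e-12

# midpoint-rule check of the first two moments against alpha*beta/((alpha+r)(beta+r))
n = 200_000
h = 1.0 / n
for r in (1, 2):
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += x ** r * alpha * beta * (x ** (alpha - 1) - x ** (beta - 1)) / (beta - alpha) * h
    assert abs(total - alpha * beta / ((alpha + r) * (beta + r))) < 1e-4
```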
5 For a continuous random variable X defined in the range (a ≤ X ≤ b), and having, for X = x, a probability density function proportional to g(x), prove that m̃, the median of the distribution, satisfies the integral relation
2 ∫_a^{m̃} g(x) dx = ∫_a^b g(x) dx.
If X has the probability density function proportional to x/(1+x)³ in the range (0 ≤ X < ∞), show that the median of the distribution is (1 + √2). Also prove that the mode is 1/2, and hence deduce the sign of the skewness of the distribution.
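A numerical sketch of the median and mode claims in Exercise 5: the normalized density is f(x) = 2x/(1+x)³, so the cdf at 1 + √2 should be exactly 1/2, and f should peak at x = 1/2:

```python
from math import sqrt

f = lambda x: 2 * x / (1 + x) ** 3   # density ∝ x/(1+x)^3; normalizing constant is 2
m = 1 + sqrt(2)                      # claimed median

# midpoint-rule integral of f over (0, m)
n = 400_000
h = m / n
F = sum(f((i + 0.5) * h) for i in range(n)) * h
assert abs(F - 0.5) < 1e-6

# the mode: f increases up to x = 1/2 and decreases beyond it
assert f(0.49) < f(0.5) > f(0.51)
```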
6 A continuous random variable X has, for X = x, the distribution function (1 - e^{-α tan x}) in the range (0 ≤ X ≤ π/2), where 0 < α < 1. Find the probability
EXERCISES IN PROBABILITY AND STATISTICS
density function of X, and hence show that 1110' the mode of the distribution is given by , mo = -! 0). P(X ~ x) =!+ ~. (2 1)2r I' V 27t r=O r+ .r .
L
Also, using integration by parts, show that an alternative form is 1
e- tx2
v 27t
X
P(X ~ x) = 1- ~.--.
[
1+
Ln (-1)'(2r)!] 2 + Rn(x), .r.x r
I
2r
r=1
where
Verify that this series expansion for the distribution function of X is asymp· totic by proving that, for any n, the remainder is less in absolute value than the last term taken into account.
CONTINUOUS RANDOM VARIABLES
39 Starting from the standard asymptotic expansion for the distribution function of a unit normal random variable X, viz.
P(X ≤ x) ~ 1 - [e^{-x²/2}/(√(2π) x)] [1 + Σ_{r=1}^∞ (-1)^r (2r)! / (2^r r! x^{2r})],
obtain Schlömilch's more rapidly convergent form of the expansion, given by
P(X ≤ x) ~ 1 - [e^{-x²/2}/(√(2π) x)] [1 - A₁ + A₂ - 5A₃ + 9A₄ - 129A₅ + 57A₆ - 9141A₇ + ...],
where, for any positive integer r,
A_r ≡ 1/∏_{k=1}^r (x² + 2k).
40 A random variable X has the probability density function f(X = x) = k sin^{2n} x cos x, defined in the range (-π/2 ≤ X ≤ π/2), where k is a function of the positive integer n. Determine k and calculate the probability that
(i) X ≤ θ and |X| ≤ θ, for any given θ; and
(ii) -π/6 ≤ X ≤ π/6.
Also, show that I_{2n+1} satisfies the recurrence relation
(2n+1)² I_{2n+1} = 1 + 2n(2n+1) I_{2n-1},  with I₁ = 1.
Obtain the distribution of Z = sin² X and identify its form.
41 If α_x denotes the distribution function of a normally distributed random variable X with zero mean and unit variance, prove that
(i) Σ_{r=0}^m (-1)^r (m choose r) A(m+r+1) = ½ Σ_{r=0}^m (-1)^r (m choose r) A(m+r);
(ii) Σ_{r=0}^m (-1)^r (m choose r) B(m+r) = ½ Σ_{r=0}^m (-1)^r (m choose r) A(m+r),
where, for n a positive integer and θ > 0,
A(n) ≡ ∫_{-θ}^{θ} α_x^n dα_x  and  B(n) ≡ ∫_{-θ}^{θ} α_x^n dx.
Also, if for positive integral values of n and s
D(n, s) ≡ ∫_{-θ}^{θ} x^n α_x^s dα_x,
then show that
(iii) Σ_{r=0}^m (-1)^r (m choose r) D(2n+1, m+r) = 0; and
(iv) Σ_{r=0}^m (-1)^r (m choose r) D(2n, m+r+1) = ½ Σ_{r=0}^m (-1)^r (m choose r) D(2n, m+r).
42 If X is a random variable with the probability density function f(x), for X = x, find the probability density function of Y = X², if
(i) f(x) = 2x e^{-x²}, for 0 ≤ X < ∞;
(ii) f(x) = (1+x)/2, for -1 ≤ X ≤ 1;
(iii) f(x) = 1/3, for -2 ≤ X ≤ 1.
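For case (i) the transformation gives g(y) = f(√y)/(2√y) = e^{-y}, i.e. Y is standard exponential. A small Monte Carlo sketch (sampling X by inverting F(x) = 1 - e^{-x²}):

```python
import math, random

random.seed(7)
n = 200_000
count = 0
for _ in range(n):
    x = math.sqrt(-math.log(1.0 - random.random()))  # X ~ f(x) = 2x e^{-x^2}
    if x * x <= 1.0:                                 # event {Y <= 1}
        count += 1
p_hat = count / n
# If Y = X^2 is standard exponential, P(Y <= 1) = 1 - e^{-1} ~ 0.632
assert abs(p_hat - (1 - math.exp(-1))) < 0.01
```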
43 If X and Yare two random variables with the joint probability density function f(x, y), for X = x and Y = y, find the probability density function of Z if (i) f(x, y) = 4xye-(X2 +)l2), for 0 ~ X, Y < co and Z = (X2 + y2)t. (ii) f(x, y) = 1, for 0 ~ X, Y ~ 1 and Z = X+ Y ifX+Y1. (iii) f(x, y) = ala2 e-(OIX+02)1), for 0 ~ X, Y < co and Z = X+ Y. 44 Each of two independent events A and B can only occur once in the future. The probabilities of their happening in the time-interval (t, t+dt) are proportional to t e-ext . dt and t 2 e- fJt . dt respectively for all t in the range (0 < t < oo~ where oc and p are positive constants. Prove that the probability that the two events occur in the order BA is P3(4oc + P}/(oc + P)4. 45 If the independent random variables X and Y have the probability density functions f(x) and g(y) defined respectively in the intervals (0 ~ X ~ oc) and (0 ~ Y ~ P), (P > oc), find the probability distribution of the random variable Z = X + Y, indicating the discontinuities in this distribution. Hence obtain the distribution of Z in the particular case when both X and Yare uniformly distributed, and determine its limiting form as oc -. p. 46 For the independent random variables X and Y having the probability density functionsf(x) and g(y), both defined in the unit interval (0 ~ X, Y~ find the probability distribution of Z = X + Yin the following cases: (i) f(x) = 1 and g(y) = 3(1- 2y)2 ;
n
(ii) f(x)
= 1r: and
g(y)
=
J-=y ;
2yx 2 l-y (iii) f(x) = 3(1-2x)2 and g(y) = 3(1-2y)2. Also, find the distribution of W = X 2 + y2 when both X and Y have a uniform distribution in the unit interval. 47 The two random variables X and Y have, for X = x and Y = y, the joint probability density function 1 f(x, y) = - 2 2' for 1 ~ X < co; llX ~ Y ~ X. xy
Derive the marginal distributions of X and Y. Further, obtain the conditional distribution of Y for X = x and also that of X given Y = y.
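The order-of-occurrence probability in Exercise 44 above can be checked by simulation; the densities t e^{-αt} and t² e^{-βt} are Gamma(2, rate α) and Gamma(3, rate β), which are assumptions read off the (garbled) original statement:

```python
import random

random.seed(1)
a, b = 1.3, 0.7
n = 200_000
count = 0
for _ in range(n):
    ta = random.gammavariate(2, 1 / a)   # event A: density ∝ t e^{-a t}
    tb = random.gammavariate(3, 1 / b)   # event B: density ∝ t^2 e^{-b t}
    if tb < ta:                          # order BA: B occurs first
        count += 1
p_exact = b**3 * (4 * a + b) / (a + b) ** 4
assert abs(count / n - p_exact) < 0.01
```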
48 The continuous random variables X and Y have a joint probability density function proportional to y^α(x - y)(1 - x)^β, for 0 ≤ X ≤ 1; 0 ≤ Y ≤ X, the parameter α being > -1 and β a positive integer.
Find the proportionality factor, and hence determine the probability distribution of the statistic u = Y/√X. Also, for any given u₀, (0 < u₀ < 1), show that
P(u ≤ u₀) = [2u₀^{α+1}/B(α+3, β+1)] Σ_{r=0}^β (β choose r)(-1)^r [ (α+2)/(α+2r+5) - (α+1)u₀/(α+2r+4) + (α+1)(α+2)u₀^{2r+5}/{2(α+r+3)(α+2r+4)(α+2r+5)} ].
49 If X and Y are independent random variables such that X is uniformly distributed in the range (1 ≤ X ≤ 3), and Y has the negative exponential distribution in the interval (Y ≥ 2), obtain the joint distribution of the random variables Z = X/Y and W = XY. Hence derive the marginal distributions of Z and W, indicating the discontinuities in their probability density functions.
50 A gun is fired at a target and the bullet strikes the target at a point P. The vertical and horizontal distances of P from the bull's-eye are independent normally distributed random variables with zero mean and variance σ². Show that the probability that the bullet strikes the target at a distance R greater than r from the bull's-eye is e^{-r²/2σ²}, and hence deduce the probability that the bullet hits the target in the annular region (r₁ ≤ R ≤ r₂). Also, find the probability that of n independent attempts at hitting the target, k (0 ≤ k ≤ n) fall in the region (r₁ ≤ R ≤ r₂) and the rest outside it.
51 Two independent realizations x₁ and x₂ are given of a random variable X which, for X = x, has the probability density function
θ^{-1} e^{-x/θ}, for 0 ≤ X < ∞.
If x₁, x₂, x₃ and x₄ are independent observations from a univariate normal population with zero mean and unit variance, obtain the sampling distributions of the following statistics:
(i) u = (x₁² + x₂²)/(x₃² + x₄²);
(ii) v = (x₁ + x₂ + x₃)/[½(x₁ - x₃)² + ⅙(x₁ - 2x₂ + x₃)² + x₄²]^{1/2}; and
(iii) w = (x₁² + x₂²)/(x₁² + x₂² + x₃²).
By using the appropriate published tables, determine the constants α₁ and α₂ such that
P(u ≥ α₁) = 0·05 and P(|v| ≥ α₂) = 0·01.
54 The continuous random variables X, Y and Z, defined in the range (0 ≤ X, Y, Z < ∞), have the joint probability distribution with the density function
f(X = x, Y = y, Z = z) = (xyz)^{-1/2} g(x + y + z).
Derive the marginal distributions of the random variables (i) U = X + Y + Z; (ii) V = Y/X; and (iii) W = Z/(X + Y).
55 A continuous random variable X has, for X = x, a probability density function f(x) defined in the doubly infinite interval (-∞ < X < ∞). If x₁ and x_n denote the minimum and maximum of the first n independent observations derived from this population, prove that the probability for the (n+1)th and (n+2)th independent observations to lie outside the interval (x₁ ≤ X ≤ x_n) is
6/(n+1)(n+2).
56 From a standard pack of playing cards having four aces in fifty-two cards, cards are dealt one by one until an ace appears. Prove that P(X = r), the probability that exactly r cards are dealt before the first ace turns up, is given by
P(X = r) = (51-r)(50-r)(49-r)/(13 · 49 · 50 · 51).
Verify that this represents a proper discrete probability distribution defined over the admissible range of values of the random variable X. Hence determine the mean and variance of X. Show, further, that by a suitable transformation P(X = r) can be considered approximately proportional to (z³ - z), where z varies continuously in the range (0 ≤ z ≤ 50). Hence calculate the approximate values of the mean and variance of X. Assuming this approximate form of the probability distribution of X as related with z, determine the probability for X ≥ 25, and thereby prove that if this event happens in two consecutive deals of the pack, it is reasonably certain that the card distribution in the pack was not random.
57 Cards are dealt out one by one from a well-shuffled standard pack until the second ace appears. Prove that the probability that exactly r cards are dealt before the second ace turns up is
p(r) = r(51-r)(50-r)/(13 · 17 · 49 · 50).
By considering r to be a realization of a discrete random variable X such that P(X = r) = p(r), verify that this defines a proper probability distribution over the possible range of values of r, and hence obtain the mean and variance of X. Also, show that by a suitable transformation p(r) can be considered approximately proportional to (50 - z)z(z + 1), where z varies continuously in the range (1 ≤ z ≤ 49). Hence calculate the approximate values of the first two moments of X and the percentage errors of approximation. Using this continuous approximation to the distribution of X, determine the probability for X ≥ 25, and thereby prove that if this event happens in four consecutive deals of the pack, it is safe to infer the non-randomness of the card distribution in the pack.
58 It is known from empirical considerations that the life of an electronic tube produced by a factory may be regarded as a random variable t having a negative exponential distribution with mean T. In an experiment designed to assess the quality of a large consignment of tubes, a random sample of n tubes is taken and put on the test rack. If t₁ and t₂ are the observed lifetimes of the first two tubes, derive the joint distribution of t₁ and t₂.
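Both deal-until-an-ace distributions (Exercises 56 and 57 above) can be checked exactly with rational arithmetic; the means 48/5 = 9.6 and 101/5 = 20.2 follow from the symmetry of ace positions in the pack:

```python
from fractions import Fraction as F

# Exercise 56: P(X = r) = (51-r)(50-r)(49-r)/(13*49*50*51), r = 0..48
p1 = [F((51 - r) * (50 - r) * (49 - r), 13 * 49 * 50 * 51) for r in range(49)]
assert sum(p1) == 1
assert sum(r * p for r, p in enumerate(p1)) == F(48, 5)            # mean 9.6 cards

# Exercise 57: p(r) = r(51-r)(50-r)/(13*17*49*50), r = 1..49
p2 = [F(r * (51 - r) * (50 - r), 13 * 17 * 49 * 50) for r in range(1, 50)]
assert sum(p2) == 1
assert sum(r * p for r, p in zip(range(1, 50), p2)) == F(101, 5)   # mean 20.2 cards
```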
If the minimum standard of approvable quality of the consignment is such that (t₁ + t₂ ≥ c), where c is a known constant, obtain P_n, the probability of a random sample of n being worse than the approvable quality. Find the limiting value of P_n as n → ∞, when c and T are finite, and explain its significance.
59 In a factory with n lamps lighted at the same time daily, a bulb is replaced immediately it burns out, the cost of replacing it being a constant γ. An alternative method is to change all n bulbs at the same time periodically with time-period T, but still change a bulb when it burns out. When all the n bulbs are changed together, the total cost is α + βn, where α and β are constants such that α + βn < γn. If the lifetime t of a bulb is a random variable having a uniform distribution in the interval (0 ≤ t ≤ k), show that the expected number of burnouts per socket in (0, T) is
e^{T/k} - 1,  (T < k).
Hence deduce that the equation for determining the optimum value of T which makes the average cost per unit of time the bulb is lit a minimum is
e^x (1 - x) = 1 - θ,
where x ≡ T/k, and θ ≡ (α + βn)/γn < 1.
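A minimal numerical sketch for the optimum-replacement equation above (assuming the cleaned-up form e^x(1-x) = 1-θ): the left side decreases monotonically from 1 at x = 0 to 0 at x = 1, so bisection finds the unique root for any 0 < θ < 1:

```python
import math

def optimal_x(theta, tol=1e-12):
    """Solve exp(x) * (1 - x) = 1 - theta for x in (0, 1) by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.exp(mid) * (1 - mid) > 1 - theta:
            lo = mid      # left side still too large: root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

x = optimal_x(0.3)
assert 0 < x < 1
assert abs(math.exp(x) * (1 - x) - 0.7) < 1e-9
```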
60 If X and Y are independent unit normal random variables, and
V(h, q) = P(0 ≤ X ≤ h; 0 ≤ Y ≤ qX/h),
l(h, q) = P(X ≥ h; Y ≥ qX/h),
and Φ(h) = P(0 ≤ X ≤ h),
q and h being positive constants, then prove that
l(h, q) = V(h, q) - ½Φ(h) + (1/2π) cot⁻¹(q/h).
Hence, by deriving Polya's approximation
V(h, q) ≈ [tan⁻¹(q/h)/2π] · [1 - e^{-qh/2 tan⁻¹(q/h)}],
obtain a suitable approximation for l(h, q).
61 For the normal bivariate integral
l(h, q) = (1/2π) ∫_h^∞ ∫_{qx/h}^∞ e^{-(x²+y²)/2} dy dx,
prove Nicholson's approximation
l(h, q) ≈ (1/2π) e^{-h²(1+w²)/2} Σ_{n=1}^∞ h^{-2n} [-(1/w)(d/dw)]^{n-1} [1/w(1+w²)],
where w ≡ q/h, h and q being positive constants. Also, verify that the general term of the series is
h^{-2n} · 2^{n-1}(n-1)! Σ_{r=0}^{n-1} (2r choose r) / [2^{2r} w^{2r+1} (1+w²)^{n-r}].
62 If x₁ and x₂ (x₁ < x₂) denote two ordered observations of a uniformly distributed random variable X defined in the range (0 ≤ X ≤ 2a), derive the joint probability distribution of
y = (x₁ + x₂)/2 and z = (x₂ - x₁)/2.
Hence show that for the marginal distribution of y the probability density function is (i) y/a², for 0 ≤ y ≤ a; and (ii) (2a - y)/a², for a ≤ y ≤ 2a. Also, prove that the marginal probability density function of z is
2(a - z)/a², for 0 ≤ z ≤ a.
63 Prove that if A > 0 and AB - C² > 0, then
(1/2π) ∫_{-∞}^∞ ∫_{-∞}^∞ exp[-½(Ax² + 2Cxy + By²)] dx dy = (AB - C²)^{-1/2}.
P(z ≥ p₀ > p) and P(z ≤ p₁ < p) for fixed p₀ and p₁.
67 If x and y are the maximum values obtained from independent samples of m and n (m ≤ n) observations respectively from a rectangular distribution in the (0, 1) interval, find the sampling distribution of the product z = xy in the two cases (i) m = n and (ii) m < n. Hence, or otherwise, deduce that for equal sample sizes the statistic
v = -2n log z
is distributed as a χ² with 4 d.f., but that for m < n the rth moment about the origin of v is
E(v^r) = (2/λ)^r Γ(r+1)(1 - λ^{r+1})/(1 - λ),  (λ ≡ m/n).
Use this result to show that for m ≠ n the distribution of v may be approximated as βχ², where χ² has ν d.f., the constants β and ν being evaluated as
β = (1 + λ²)/λ(1 + λ) and ν = 2(1 + λ)²/(1 + λ²).
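A simulation sketch of the m = n case: if x is the maximum of n uniforms, P(-n log x > t) = P(x < e^{-t/n}) = e^{-t}, so -2n log x is exactly χ²₂ and v = -2n log(xy) is exactly χ²₄ (mean 4, variance 8):

```python
import math, random

random.seed(11)
n, reps = 20, 100_000
vs = []
for _ in range(reps):
    x = max(random.random() for _ in range(n))
    y = max(random.random() for _ in range(n))
    vs.append(-2 * n * math.log(x * y))
mean = sum(vs) / reps
var = sum((v - mean) ** 2 for v in vs) / reps
assert abs(mean - 4.0) < 0.05   # E(chi2 with 4 d.f.) = 4
assert abs(var - 8.0) < 0.3     # var(chi2 with 4 d.f.) = 8
```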
68 If w₁ and w₂ are the sample ranges obtained from independent samples of size n₁ and n₂ respectively from the (0, a) rectangular distribution, derive the sampling distribution of the ratio
u = w₁/w₂.
Prove that for any positive integer r
E(u^r) = n₁(n₁-1)n₂(n₂-1)/[(n₁+r)(n₁+r-1)(n₂-r)(n₂-r-1)].
Hence, or otherwise, deduce the limiting form of the distribution of u as both n₁ and n₂ → ∞.
69 If w is the sample range obtained from a random sample of n observations from a rectangular population in the (0, 1) interval, prove that -(2n-1) log w is distributed approximately as a χ² with 4 degrees of freedom. Hence show that if z_k is the product of k independent sample ranges from the above population obtained from samples of sizes n₁, n₂, ..., n_k respectively, then for large samples a first approximation gives that -(2n̄-1) log z_k is distributed as a χ² with 4k d.f., n̄ being the average of the sample sizes n_i. Further, prove that an improved approximation can be taken as
-(2n̄-1)[1 + {1 + 4V(n_i)}/2n̄²] log z_k ~ χ² with 4k d.f.,
where kV(n_i) ≡ Σ_{i=1}^k (n_i - n̄)².
70 If x₁, x₂, ..., x_k are the maximum values obtained from independent samples of equal size n from a rectangular population in the range (0, 1), find the exact distribution of the product v_k = x₁x₂···x_k. Hence, or otherwise, prove that -2n log v_k is distributed as a χ² with 2k degrees of freedom.
Also, if the x_i are obtained from large samples of size n_i (i = 1, 2, ..., k) respectively, then show that a convenient large-sample approximation for the distribution of v_k is
-2n̄[1 - 2V(n_i)/n̄²] log v_k ~ χ² with 2k[1 - V(n_i)/n̄²] d.f.,
where n̄ is the average of the n_i and kV(n_i) ≡ Σ (n_i - n̄)².
71 Suppose u₁ is the smallest and v₁ the largest of n₁ independent observations from a rectangular population in the interval (0, 1); and u₂ is the smallest and v₂ the largest observation from another random sample of size n₂ from the same population. If it is known that u₁ ≤ u₂, then for fixed u₁ prove the following:
(i) The conditional joint distribution of v₁ and u₂ is
(n₁-1)n₂ (v₁-u₁)^{n₁-2}(1-u₂)^{n₂-1}/(1-u₁)^{n₁+n₂-1} · dv₁ du₂,  (u₁ ≤ v₁ ≤ 1; u₁ ≤ u₂ ≤ 1).
Hence derive the unconditional distribution of the statistic
T = (u₂ - u₁)/(v₁ - u₁),
and verify that for positive integral r
E(T^r) = (n₁-1)Γ(n₂+1)Γ(r+1)/[(n₁-r-1)Γ(n₂+r+1)].
(ii) The conditional joint distribution of u₂, v₂ and v₁ is
n₂(n₂-1)(n₁-1) (v₁-u₁)^{n₁-2}(v₂-u₂)^{n₂-2}/(1-u₁)^{n₁+n₂-1} · du₂ dv₂ dv₁,
(u₁ ≤ v₁ ≤ 1; u₁ ≤ u₂ ≤ v₂; u₂ ≤ v₂ ≤ 1).
Hence derive the unconditional distribution of the statistic
U = (v₂ - u₂)/(v₁ - u₁),
and use it to prove that
E(U^r) = n₂(n₂-1)(n₁-1)/[(n₂+r)(n₂+r-1)(n₁-r-1)].
(iii) The conditional joint distribution of v₁ and v₂ is
(n₁-1)n₂ (v₁-u₁)^{n₁-2}(v₂-u₁)^{n₂-1}/(1-u₁)^{n₁+n₂-1} · dv₁ dv₂,  (u₁ ≤ v₁ ≤ 1; u₁ ≤ v₂ ≤ 1).
Hence determine the unconditional distribution of the ratio
V = (v₁ - u₁)/(v₂ - u₁),
and deduce the corresponding expression for E(V^r).
72 The lifetime x (in hours) of electronic tubes mass-produced by a standard process is a random variable with a probability distribution having the density function
α²x e^{-αx},  (x ≥ 0).
Prove that the manufacturer's condition for introducing the new process is satisfied if
P < α - (1/m) log(1 + A).
73 A population of N persons is exposed to the risk of accidents. Initially there is an equal probability for any one individual to sustain an accident, and it may be assumed that an accident does not result in death, so that the population remains of constant size. In general, as suggested by Greenwood and Yule, suppose that the probability of a person having an accident is altered if he has already met with a previous accident. If then f(t, x) δt is the probability at time t (> 0) that a person who has had x accidents will have another accident in the infinitesimal time-interval δt, and v_x is the expected number of persons who have had x accidents at time t, prove that
dv_x/dt = f(t, x-1) v_{x-1} - f(t, x) v_x,
where, by definition, v_{-1} = 0. From this general differential equation deduce that
(i) if f(t, x) = kφ(t), k a constant, then v_r = N · (r+1)th term of a Poisson distribution with parameter kT; and
(ii) if f(t, x) = (b + cx)φ(t), b and c being positive constants, then v_r = N · (r+1)th term of a negative binomial distribution with probability-generating function
G(θ) = (1 - ω)^{b/c}/(1 - ωθ)^{b/c},
where
T ≡ ∫₀^t φ(u) du and ω ≡ (1 - e^{-cT}).
74 A large group of persons is exposed to the risk of accidents over a prolonged period, which may be considered to be divided into a number of time-intervals of equal length. Due to personal and occupational differtmces the accident proneness of the persons varies, and it is known from empirical considerations that if, per time-interval, the mean number of accidents for an individual is A, then the probability that he will have x accidents in a random time-interval is given by a Poisson distribution with mean ,t As suggested
CONTINUOUS RANDOM VARIABLES
45
by Green~ood ~~d Yule, it may be assumed that the probability of a person having a given IS e' nr)" e-cJ. . ..1.,-1 dA., (A. ~ 0). B considering the joint distribution of the random variables x and A., prove t:at the marginal distribution of x has the probability-generating function G(O)
= e'/(1 +e-O)'.
Hence, or otherwise, deduce that corr(x, A.) = (1 +e)-t. Also, if var(xIA.) denotes the variance of the conditional distribution of x for any given A., and var(x) is the variance of the marginal distribution of x, verify that for variation over A. E[ var(xIA)]
= r/e, whereas var(x) = 1'(1 + e)/e 2 •
Explain the difference between these two results as measures of the variability of x. Finally, derive the conditional distribution of A. for given x, and verify that E(A.lx) = (x + r)/(l + c).
75 Accidents occurring randomly in time may be classified according as they give rise to injuries to 1, 2, 3, ... persons. For the distribution of these classified accidents, considered over equal time-intervals, it may be assumed that accidents, each of which involves k injured persons, have a Poisson distribution with mean γ_k, for all integral values of k ≥ 1.
(i) Find the probability-generating function of the random variable X denoting the total number of persons injured in any one time-interval. Hence deduce that if
γ_k = λp^k/k,  (0 < p < 1),
then X has a negative binomial distribution with probability-generating function
G(z) = (1 - p)^λ/(1 - pz)^λ,
so that, per time-interval, the mean number of persons injured is λp/(1 - p), and the probability that no person is injured is (1 - p)^λ.
(ii) If Y is the random variable denoting the total number of accidents sustained by a person, and P_k is the probability of a particular person receiving injury in an accident involving k persons, derive the probability-generating function of Y. Hence, for
P_k = kρ and γ_k = λp^k/k,
verify that Y has the Poisson distribution with mean λρp/(1 - p).
to be replaced by new lamps. In order to minimize maintenance costs, two alternative plans are suggested for further procedure:
Plan I is that the replacement of the lamps as they fail is continued indefinitely.
Plan II is that the replacement as the lamps fail is continued only till time t = T. Then all the N lamps are replaced by new ones. This procedure is repeated indefinitely, all lamps being replaced at nT for all integral values of n ≥ 1. Apart from these regular replacements, irregular replacements are still made whenever lamps fail in the intervals nT < t < (n+1)T.
For comparing the costs of operation under the two plans, it is known that (i) the labour charge for an irregular replacement is u; (ii) the average labour charge per lamp of a regular replacement under Plan II is v; and (iii) w is the price of a lamp. It may also be assumed that the lifetime distribution of a lamp is f(t) dt, (0 ≤ t < ∞), with expected lifetime L > T.
Assuming that under Plan I lamps fail randomly in time, so that the probability of a lamp failing in the interval (t, t+dt) is dt/L, calculate C₁, the expected cost of maintenance in the interval (0, T). For Plan II, prove that the expected number of irregular replacements of lamps at a given post in the time-interval (0, T) is
G(T) ≡ Σ_{m=1}^∞ F_m(T),
where F_m(T) is the probability that at least m lamps fail at a post in (0, T). Hence calculate C₂, the expected cost of maintenance in (0, T), and compare C₁ and C₂, ρ being the ratio of the average total costs of regular and irregular replacement of a lamp. As particular cases of this general result, show that
(a) if f(t) = λ e^{-λt}, then under all circumstances Plan I is more economical than Plan II; and
(b) if f(t) = λ² e^{-λt} t, then Plan II is more economical than Plan I, provided that
0 < ρ < ½(1 - e^{-2λT}).
77 Electric bulbs used for street lighting have a lifetime distribution with probability density function f(t) for (0 ≤ t ≤ T₀), and zero for t > T₀. Whenever a bulb fails in a lamp post it is immediately replaced by a new bulb, and this method of replacement is continued indefinitely. Starting with all new bulbs at time t = 0, the process is considered up to a stage t = T where, in general, (r-1)T₀ < T < rT₀ for all integral values of r ≥ 1, and F_m^{(r)}(T) is the probability of making at least m replacements in a given lamp post in the interval (0, T).
Prove that if r = 1, then F_m^{(1)}(T) satisfies the integral relation
F_m^{(1)}(T) = ∫₀^T F_{m-1}^{(1)}(T-t) f(t) dt;
but that for r ≥ 2,
F_m^{(r)}(T) = ∫₀^{T-(r-1)T₀} F_{m-1}^{(r)}(T-t) f(t) dt + ∫_{T-(r-1)T₀}^{T₀} F_{m-1}^{(r-1)}(T-t) f(t) dt, for m ≥ r,
whence the expected number of replacements in the period (0, T) is
G_r(T) ≡ Σ_{m=1}^∞ F_m^{(r)}(T).
Hence determine the probability of a replacement being made in a post during the interval (t, t+dt), regardless of when the bulb was put in. Also, verify as particular cases that if the lifetime distribution of bulbs is uniform in the range (0 ≤ t ≤ T₀), then
G₁(T) = e^{T/T₀} - 1, if 0 < T < T₀;
and
G₂(T) = (e^{T/T₀} - 1) - (T/T₀ - 1) e^{T/T₀ - 1}, if T₀ < T < 2T₀.
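A simulation sketch of the uniform-lifetime case above: with T₀ = 1 and T = 0.8, the expected number of replacements in (0, T) should be close to e^{0.8} - 1 ≈ 1.226:

```python
import math, random

random.seed(5)
T0, T = 1.0, 0.8
reps = 200_000
total = 0
for _ in range(reps):
    t, count = 0.0, 0
    while True:
        t += random.uniform(0, T0)   # lifetime of the current bulb
        if t > T:
            break
        count += 1                   # bulb failed before T: one replacement
    total += count
g1 = total / reps
assert abs(g1 - (math.exp(T / T0) - 1)) < 0.01
```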
78 Electric bulbs, used individually for street lighting in a large number of posts, have a lifetime distribution with probability density function f(t) for 0 ≤ t < ∞; and a bulb is replaced immediately it burns out. If, starting from time t = 0, the process is observed till t = T, calculate the expected number of replacements in a post during the interval (0, T). Hence deduce g(t) dt, the probability of a bulb being replaced in (t, t+dt), for t < T, irrespective of when the bulb was put in.
Next, suppose that at the end of the first interval of time T, all bulbs which were put in the posts before time X < T and have not burned out are replaced by new ones, but the bulbs replaced after time X continue to be used, provided, of course, that they have not burned out. Prove that with such a mixture of old and new bulbs, the probability of a bulb having an expected lifetime > t in the second interval of length T is
S₂(t) = (1 - p)S₁(t) + ∫₀^X g(T-x) S₁(x) S₁(t+x) dx,  (t < T),
where p is the proportion of bulbs not replaced at time t = T and S₁(t) is the probability that a bulb has a lifetime > t. In the particular case when f(t) = λ e^{-λt}, verify that
S₂(t) = ½ e^{-λt}(1 + e^{-λX}).
79 In a large city a number of street-lighting posts are supplied with electric bulbs having a lifetime distribution f(t) dt for (0 ≤ t < ∞). Initially, at time t = 0, all posts carry new bulbs, and in course of time whenever a bulb burns out it is immediately replaced by a new bulb. In addition, all posts are inspected at regular intervals of time T, so that at time t = nT, (n ≥ 1),
(i) all bulbs which were replaced in the interval (nT-X, nT), (X < T), and have not burned out by t = nT, continue to be used; and
(ii) all bulbs which were replaced during the interval [(n-1)T, nT-X], and have not burned out by t = nT, are replaced by new bulbs.
Suppose that after the above replacements have been made, p_n is the proportion of bulbs not replaced at t = nT, and that of these a proportion p′ have their last replacement before t = (n+1)T in the interval [(n+1)T-X, (n+1)T]. Similarly, of the proportion (1-p_n) of bulbs actually replaced at t = nT, a proportion p have their last replacement before t = (n+1)T in the interval [(n+1)T-X, (n+1)T]. Prove that p_n satisfies the difference equation
p_{n+1} = p_n p′ + (1 - p_n)p, for n ≥ 0,
and, assuming that both p and p′ are independent of n, find an explicit expression for p_n. Also, verify that if the difference between p and p′ is small then, as a first approximation, p₂ ≈ p_∞, so that the proportion of replacements at t = nT is effectively stabilized soon after the second T interval. Hence, if g(t) dt, (0 < t < T), is the probability of a replacement being made at a given post in (t, t+dt) irrespective of when the bulb was put in, and S(t) is the probability of a bulb having a lifetime > t, show that
p = ∫₀^X g(T-x₁) S(x₁) dx₁,
and
pp′ = ∫₀^X ∫₀^X g(T-x₁) g(T+x₁-x₂) S(x₁) S(x₂) dx₁ dx₂.
Evaluate p and p′ in the particular case when f(t) = λ² e^{-λt} t, λ being a positive parameter.
80 If u₁ < u₂ < ... < u_n are a set of ordered observations from a population having a probability density function f(x) in the range -∞ < x < ∞, derive the joint distribution of the random variables p₁, p₂, ..., p_n such that
p₁ = ∫_{-∞}^{u₁} f(x) dx and p_{i+1} = ∫_{u_i}^{u_{i+1}} f(x) dx, for i = 1, 2, ..., n-1.
Next, suppose that a random sample of size m is taken from another population, also with a doubly infinite range, and that these m observations are distributed over the (n+1) intervals formed by the u_i, so that m_i observations fall in the ith interval. Find the joint distribution of the m_i and the p_i, and then prove that every random set of realizations of the m_i has, for given m and n, the probability 1/(m+n choose n).
81 A random variable X has the Laplace distribution with probability density function proportional to e^{-α|x-m|} for (-∞ < X < ∞), m being a positive parameter. If c is any given positive number ≤ m, prove that
E{|X - c|} = (1/α)[e^{-α(m-c)} + α(m - c)].
Hence deduce that the minimum value of the mean deviation of X is 1/α, which is attained for c = m.
82 Explain clearly the difference between the standard deviation and mean deviation of a continuous random variable X. In particular, if X has a negative exponential distribution with probability density function proportional to e^{-αx} for X ≥ 0, and a and b are positive constants, prove that
E(X - a)² = 1/α² + (1/α - a)²,
and
E{|X - b|} = b + (2e^{-αb} - 1)/α.
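Before the deduction that follows, the closed form for E|X - b| can be sanity-checked numerically: its minimum over b should equal (log 2)/α, i.e. log 2 times the standard deviation 1/α:

```python
import math

alpha = 1.7
md = lambda b: b + (2 * math.exp(-alpha * b) - 1) / alpha  # E|X - b| for Exp(alpha)

# grid-minimize the mean deviation over b in (0, 4)
best = min(md(i * 1e-4) for i in range(1, 40_000))
assert abs(best - math.log(2) / alpha) < 1e-6
```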
Hence deduce that for this distribution the minimum value of the mean deviation is (logₑ2) × standard deviation.
83 A dam has a total capacity of M units of water for irrigation, but if there is no rainfall during a month then the dam can only supply αM units of water, (0 < α < 1). The amount x of rainfall in a month is a random variable, and as x increases, the supply of water in the dam also increases; and since the capacity of the dam is limited, it may be assumed on empirical considerations that this increase dies off exponentially. As a simple model, it may therefore be assumed that if the amount of rainfall in a month is x, then the total supply of water in the dam is
S = M[1 - (1 - α) e^{-θx}],
where θ is a small positive parameter. If the rainfall x is distributed as the square of a unit normal variable and θ = β, prove that
E(S) = M[1 - (1 - α)/(1 + 2β)^{1/2}].
Hence, provided β is sufficiently small for β² to be negligible, show that the actual supply S will be at least equal to its expectation if x ≥ (1 - β/2)². Also, prove that the probability for this to happen is approximately 2[1 - Φ(1 - β/2)], where Φ(t) is the distribution function of a unit normal variable.
84 A random chord of length Y is drawn across a circle of radius ρ such that its perpendicular distance from the centre of the circle is ρx, where x is assumed to be a uniformly distributed random variable in the interval (0, 1). If y₁, y₂, ..., y_n are n independent realizations of Y, prove that for their mean ȳ
E(ȳ) = πρ/2 ≈ 1·57ρ and var(ȳ) = (32 - 3π²)ρ²/12n ≈ ρ²/5n.
Further, if Y_m is the largest of the n observed y_i, prove that
E(Y_m) = ρA(n) and var(Y_m) = [4n(n+3)/(n+1)(n+2) - A²(n)]ρ²,
where
A(n) ≡ π Σ_{r=0}^{n-1} (-1)^r Γ(n+1) / [2^r (r+2) Γ(n-r) {Γ((r+2)/2)}²].
Hence, or otherwise, verify that for n = 5
E(Y_m) = 7(45π - 128)ρ/48 ≈ 1·95ρ and var(Y_m) ≈ 0·00695ρ².
Also, show that for large n, a first approximation gives
var(Y_m) ≈ 4ρ²/(n+1)²(n+2)².
85 A chord of length l is drawn at random across a circle of radius ρ. Find the expectation of l if
(i) the perpendicular distance of the chord from the centre of the circle is ρx, where x is uniformly distributed in (0 ≤ x ≤ 1);
(ii) the chord makes an angle θ with a tangent through one of its extremities, where θ is uniformly distributed in (0 ≤ θ ≤ π/2);
(iii) θ, as defined in (ii), has a probability density function proportional to θ(π - θ) in (0 ≤ θ ≤ π/2).
For any given k, (0 ≤ k ≤ 2), suppose P₁, P₂ and P₃ denote the probabilities P(l ≥ kρ) in the three cases. Prove that P₁ and P₃ are both ≥ P₂. Comment on the results obtained.
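A hedged numerical sketch of Exercise 84: a chord at perpendicular distance ρx has length Y = 2ρ√(1-x²), so the single-observation mean and variance can be simulated, and the reconstructed A(n) can be compared with the stated n = 5 value:

```python
import math, random

random.seed(9)
rho = 1.0
n = 300_000
ys = [2 * rho * math.sqrt(1 - random.random() ** 2) for _ in range(n)]
mean = sum(ys) / n
var = sum((y - mean) ** 2 for y in ys) / n
assert abs(mean - math.pi * rho / 2) < 0.01                       # ~ 1.57*rho
assert abs(var - (32 - 3 * math.pi ** 2) * rho ** 2 / 12) < 0.01  # ~ rho^2/5

# A(5) from the reconstructed series vs the closed form 7(45*pi - 128)/48 ~ 1.95
A5 = math.pi * sum(
    (-1) ** r * math.gamma(6)
    / (2 ** r * (r + 2) * math.gamma(5 - r) * math.gamma((r + 2) / 2) ** 2)
    for r in range(5)
)
assert abs(A5 - 7 * (45 * math.pi - 128) / 48) < 1e-10
```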
86 If x₁ and x₂ are independent observations from a rectangular distribution in the (0, 1) interval, find the joint distribution of the statistics

u = x₁x₂ and v = (1 − x₁)(1 − x₂).

Hence determine the conditional distribution of u for given v, and the marginal distribution of v. Also, find directly the marginal distributions of u and v.

87 A random sample of n observations x₁, x₂, ..., xₙ is given from a population having mean m and variance σ². Define s², the least-squares estimate of σ², and verify that the expected value of s² is σ². Suppose that the sample average x̄ and s² are known, but the original sample observations are not available. It is then found that an (n + 1)th observation of magnitude ks (k a known constant) was erroneously ignored in the determination of x̄ and s². If S² denotes the correct least-squares estimate of σ² obtained from all the (n + 1) observations, prove that

S² = s²[(n − 1)/n + (k − x̄/s)²/(n + 1)].
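The updating formula can be confirmed numerically; a minimal sketch (the sample values below are made up, not from the text):

```python
import statistics

# Quick check (not part of the original text) of the updating formula
# S^2 = s^2 [ (n-1)/n + (k - xbar/s)^2 / (n+1) ].
xs = [2.0, 3.5, 1.0, 4.0, 2.5, 3.0]       # hypothetical original sample
n = len(xs)
xbar = sum(xs) / n
s2 = statistics.variance(xs)               # divisor n-1: the unbiased estimate
s = s2 ** 0.5
k = 1.7                                    # the ignored observation is k*s
full = xs + [k * s]
S2_direct = statistics.variance(full)      # divisor (n+1)-1 = n
S2_formula = s2 * ((n - 1) / n + (k - xbar / s) ** 2 / (n + 1))
print(S2_direct, S2_formula)
```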
88 The tails of a normal distribution with mean m and variance σ² are cut off at a distance ±kσ (k > 0) from the mean. If σ_k² is the variance of the truncated distribution, show that it is less than σ². Also, evaluate the β₂ coefficient of the truncated distribution and prove that a sufficient condition for it to be < 3 is that k > √3.
3 Estimation, sampling distributions, and inference; bivariate correlation and regression
1 A random sample containing an even number n of observations is drawn. The first n/2 observations are realizations of a random variable X such that

P(X = 1) = p, P(X = 0) = q, where p + q = 1.

The second n/2 observations are of another random variable Y such that P(Y = 1) = q and P(Y = 0) = p. If r is the number of ones in the sample, show that

E(r) = n/2 and var(r) = npq.

2 A random variable X is such that

P(X = r) = λ^r e^{−λ}/[(1 − e^{−λ}) r!], for r = 1, 2, 3, ...,

λ being an unknown parameter. A random sample of size N contains n_r observations having the value r. Show that

λ* = Σ_{r=2}^{∞} r n_r/N

is an unbiased estimate of λ.

3 In a sequence of n Bernoulli trials with a probability of success p, (q ≡ 1 − p), r successes were observed. (i) Obtain an unbiased estimate of pq, and find its variance. (ii) Find an unbiased estimate of pq². (iii) Show that p*(1 − p*)², where p* = r/n, is not an unbiased estimate of pq², but that the bias → 0 as n → ∞.

4 In tossing a coin with a probability p for a head, a sequence of r consecutive heads followed by a tail is known as a run of length r for (r = 0, 1, 2, ..., n), where (0 < p < 1). Calculate the mean and variance of the run length r. Also show that when n → ∞,

E(r) → p/(1 − p) and var(r) → p/(1 − p)²,

and find an unbiased estimate of this limiting variance.
5 In a series of Bernoulli trials with a probability of success p, a sequence of successes up to and including the first failure is known as a "turn". If S_n denotes the total number of successes in n turns, prove that the mean and variance of S_n are

np/(1 − p) and np/(1 − p)², respectively.

Hence prove that an unbiased estimate of var(S_n) is S_n(n + S_n)/(n + 1), and that a reasonable estimate of p is p* = S_n/(n + S_n).
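A simulation sketch (not part of the original text; the values of p and n below are made up) of the "turns" process and the estimate p*:

```python
import random

# Simulation sketch for Exercise 5: each "turn" is a run of successes ended
# by the first failure, so successes per turn follow a geometric law with
# mean p/(1-p).
random.seed(1)
p, n_turns, reps = 0.6, 50, 2000
sums, est_p = [], []
for _ in range(reps):
    s = 0
    for _ in range(n_turns):
        while random.random() < p:   # count successes until the first failure
            s += 1
    sums.append(s)
    est_p.append(s / (n_turns + s))
avg_S = sum(sums) / reps
avg_p = sum(est_p) / reps
print(avg_S, n_turns * p / (1 - p), avg_p)
```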
6 In a sequence of n Bernoulli trials with probability of success p, (q ≡ 1 − p), r successes were observed. It is desired to estimate the ratio p/q, and two estimates are proposed:

r/(n − r) and r/(n − r + 1).

Show that

E[r/(n − r)] = (p/q)[1 + (1/p) Σ_{t=2}^{∞} μ_t/q^t],

var[r/(n − r)] = (1/q²)[Σ_{t=2}^{∞} (t − 1)μ_t/q^t − (Σ_{t=2}^{∞} μ_t/q^t)²];

and

E[r/(n − r + 1)] = (p/q)(1 − p^n) = (p/λ)[1 + {(p + λ)/p} Σ_{t=2}^{∞} μ_t/λ^t],

var[r/(n − r + 1)] = {(p + λ)/λ}²[Σ_{t=2}^{∞} (t − 1)μ_t/λ^t − (Σ_{t=2}^{∞} μ_t/λ^t)²],

where μ_t is the tth central moment of p* ≡ r/n, (0 < p < 1), and λ ≡ q + 1/n.

7 From a large lake containing an unknown number N of fish, a random sample of M fish is taken. The fish caught are marked with red spots and released into the lake. After some time, another random sample of n fish is drawn and it is observed that m of them are spotted. Show that P(N, m), the probability that the second sample contains exactly m spotted fish, is given by

P(N, m) = C(M, m) C(N − M, n − m)/C(N, n).
By considering the ratio P(N, m)/P(N − 1, m), deduce that the maximum-likelihood estimate of N is the largest integer short of nM/m. [It may be assumed that there was no change in the fish population of the lake in the interval between the drawing of the two samples.]

8 For the logarithmico-normal distribution defined by the probability density function

f(X = x) = [1/{xσ√(2π)}] exp{−(log_e x − m)²/(2σ²)}

in the range (0 ≤ X < ∞), show that the maximum-likelihood estimates of m and σ² are

m* = g and σ*² = (1/n) Σ_{j=1}^{n} (log_e x_j − g)²,

where g is the natural logarithm of the geometric mean of the random sample observations x₁, x₂, ..., xₙ.

9 A continuous random variable X defined in the range (0 ≤ X < ∞) has, for X = x, the probability density function proportional to x e^{−x/θ}, (θ > 0). Find the mean and variance of X. If a random sample of n observations x₁, x₂, ..., xₙ is given from this population, obtain the maximum-likelihood estimate of the parameter θ and calculate the variance of the estimate. Show also that

(1/3n) Σ_{j=1}^{n} x_j²

is an unbiased estimate of var(X), but that the estimate x̄²/2 has a bias of O(n⁻¹), where x̄ is the sample average.

10 A random sample of n observations x₁, x₂, ..., xₙ is given of a random variable X having, for X = x, a probability density function proportional to x^a(1 − x) defined in the range (0 ≤ X ≤ 1), where the parameter a is unknown. Show that a*, the appropriate maximum-likelihood estimate of a, is given by the equation

a* = −[(3g + 2) + (g² + 4)^{1/2}]/(2g),

where g is the natural logarithm of the geometric mean of the sample.
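Exercise 7's maximum-likelihood recipe can be checked by brute force; a sketch (the sample sizes M, n, m below are made up):

```python
import math

# Sketch (not from the original text): brute-force the capture-recapture
# likelihood in N and compare with the stated MLE, the largest integer
# short of n*M/m.
def likelihood(N, M, n, m):
    if N < M or N - M < n - m:
        return 0.0
    return math.comb(M, m) * math.comb(N - M, n - m) / math.comb(N, n)

M, n, m = 100, 60, 13
best_N = max(range(max(M, n), 3000), key=lambda N: likelihood(N, M, n, m))
mle = n * M // m      # floor of nM/m = 461.5...
print(best_N, mle)
```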
11 A random variable X has the probability density function

f(X = x) = [1/{xσ√(2π)}] exp{−(log_e x − μ)²/(2σ²)}, for X ≥ 0.

From this population, a random sample of size n has mean x̄ and variance s². Use the method of moments to find estimates of μ and σ². Also show that the mean of the distribution is greater than the median.

12 A continuous random variable X, defined in the range (0 ≤ X ≤ π/2), has, for X = x, a distribution function proportional to (1 − e^{−α sin x}), where α > 0. Find the probability density function of X. Given a random sample of n observations from this distribution, derive the maximum-likelihood equation for α*, the estimate of α. Also, prove that the large-sample variance of α* is

4α² sinh²(α/2) / [n{4 sinh²(α/2) − α²}].

13 If g(X) is a function of a random variable X having mean m and a finite variance, prove that to a first approximation

E[g(X)] = g(m) and var[g(X)] = [g′(m)]² var(X).
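A small numerical illustration of this first-order ("delta method") approximation, not part of the original text; the function g and the two-point distribution are my own choices:

```python
import math

# Delta-method sketch: for g(X) = log(X) and X taking the values m +/- h with
# probability 1/2 each (so var(X) = h^2), compare the exact var[g(X)] with
# the approximation [g'(m)]^2 var(X); the ratio tends to 1 as h -> 0.
m = 5.0
rows = []
for h in (1.0, 0.1, 0.01):
    g_hi, g_lo = math.log(m + h), math.log(m - h)
    exact = ((g_hi - g_lo) / 2) ** 2    # variance of a symmetric two-point g(X)
    approx = (1 / m) ** 2 * h ** 2      # [g'(m)]^2 * var(X)
    rows.append((h, exact, approx, exact / approx))
print(rows)
```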
Extend these results to the case of k correlated variables X₁, X₂, ..., X_k. Hence deduce that if the X_i are observed cell frequencies in a multinomial distribution such that

Σ_{i=1}^{k} X_i ≡ N (fixed),

then

var[g(X₁, X₂, ..., X_k)] = Σ_{i=1}^{k} m_i (∂g/∂X_i)²_{X_i=m_i} − (1/N)[Σ_{i=1}^{k} m_i (∂g/∂X_i)_{X_i=m_i}]²,

where E(X_i) = m_i, for i = 1, 2, ..., k.
14 In a biological experiment the observed frequencies in the four distinct classes AB, Ab, aB, ab were found to be n₁, n₂, n₃, n₄ respectively. On a genetical hypothesis, the corresponding expected proportions are …

… is an unbiased estimate of p. Also, verify that the mode of the sampling distribution of p* lies in the range

p < p* < p[1 + (1 − p)/(r − 1 − p)],

so that the distribution has slight asymmetry. Finally, prove that

var(p*) = p² Σ_{s=1}^{∞} s! q^s / [r(r + 1) ⋯ (r + s − 1)],

and that p*²(1 − p*)/(r − 1 − p*) is an unbiased estimate of this variance.

24 The random variables X and Y have a bivariate normal distribution with parameters (m₁, m₂; σ₁, σ₂; ρ) in standard notation. Prove that for positive integral values of r and s

μ_rs ≡ E[(X − m₁)^r (Y − m₂)^s] = σ₁^r σ₂^s Σ_{j=0}^{s} C(s, j) (1 − ρ²)^{j/2} ρ^{s−j} ν_j ν_{r+s−j},

where ν_k is the kth central moment of the univariate unit normal distribution. Given a sample of n observations (x_i, y_i) for (i = 1, 2, ..., n) from this bivariate population, two possible estimates

T₁ = (1/n) Σ_{i=1}^{n} (x_i/y_i) and T₂ = Σ_{i=1}^{n} x_i / Σ_{i=1}^{n} y_i

are proposed for the parameter λ ≡ m₁/m₂. Find the expectations of T₁ and T₂. Hence show that only T₂ asymptotically converges to λ as n → ∞, and that for large samples

var(T₂) ≈ (λ²/n)[(v₁ − ρv₂)² + (1 − ρ²)v₂²],

v₁ and v₂ being the coefficients of variation of X and Y respectively.
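A simulation sketch of the two ratio estimates of Exercise 24 (not part of the original text; the parameter values are made up, and with these mild coefficients of variation T₁'s bias is small, so the contrast with T₂ is asymptotic rather than dramatic):

```python
import random, math

# Compare T1 = mean of x_i/y_i with T2 = (sum x_i)/(sum y_i) as estimates of
# lambda = m1/m2 for a simulated bivariate normal sample.
random.seed(7)
m1, m2, s1, s2, rho = 5.0, 10.0, 0.5, 0.5, 0.4
n = 50_000
xs, ys = [], []
for _ in range(n):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    xs.append(m1 + s1 * z1)
    ys.append(m2 + s2 * (rho * z1 + math.sqrt(1 - rho * rho) * z2))
T1 = sum(x / y for x, y in zip(xs, ys)) / n
T2 = sum(xs) / sum(ys)
lam = m1 / m2
print(T1, T2, lam)
```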
25 A biologist wishes to determine the effectiveness of a new insecticide when used in varying degrees of strength. It is known from certain empirical considerations that p(x), the probability of killing an insect at strength x of the insecticide, in a fixed interval of time, is given by the relation

log_e[p(x)/{1 − p(x)}] = α + x, where α is an unknown parameter.

To estimate α, the insecticide is used for a fixed period on three random groups of n insects each, and the groups are subjected to different strengths of the insecticide, which, on an appropriately chosen scale, correspond to the values x = −1, 0, 1. If the total number of insects killed in the three groups is r, show that α*, the maximum-likelihood estimate of α, is obtained from the cubic

y³ + (1 − φ)(e + 1 + e⁻¹)y² + (1 − 2φ)(e + 1 + e⁻¹)y + (1 − 3φ) = 0,

where α* = −log_e y, φ = n/r, and e is the Napierian constant.

26 In an animal-breeding experiment four distinct kinds of progeny were observed with the frequencies n₁, n₂, n₃ and n₄, (Σn_j = N). The corresponding expected proportions on a biological hypothesis are ¼(2 + p), ¼(1 − p), ¼(1 − p), ¼p, where p is an unknown parameter. Obtain p*, the maximum-likelihood estimate of p, and verify that its large-sample variance is

2p(1 − p)(2 + p)/[N(1 + 2p)].

Show further that an estimate of this variance, obtained by substituting p* for p, is not an unbiased one, but that its bias relative to unity is approximately

−2(5 + 3p + 6p² + 4p³)/[N(1 + 2p)³],

which → 0 as N → ∞.
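Exercise 25's cubic can be verified directly: with α known, the expected total kill satisfies the likelihood equation exactly, so y = e^{−α} must be a root. A sketch (the values of α and n are made up):

```python
import math

# Check (not part of the original text): at the expected total kill
# r = n*(p(-1) + p(0) + p(1)), the stated cubic vanishes at y = exp(-alpha).
alpha, n = 0.7, 100
p = lambda x: 1 / (1 + math.exp(-(alpha + x)))
r = n * (p(-1) + p(0) + p(1))          # expected total killed at x = -1, 0, 1
phi = n / r
c = math.e + 1 + 1 / math.e
y = math.exp(-alpha)
cubic = y ** 3 + (1 - phi) * c * y ** 2 + (1 - 2 * phi) * c * y + (1 - 3 * phi)
print(cubic)
```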
27 A random variable X has the probability density function

f(X = x) = [1/{a Γ(p)}] e^{−x/a} (x/a)^{p−1}, for 0 ≤ X < ∞.

Given n independent observations x₁, x₂, ..., xₙ of X, prove that the expectations of the sample arithmetic and geometric means are

ap and a[Γ(p + 1/n)/Γ(p)]ⁿ, respectively.

Hence deduce that the ratio of the population arithmetic and geometric means is

θ = p e^{−φ(p)}, where φ(p) = (d/dp)[log Γ(p)].

Also, show that θ*, the maximum-likelihood estimate of θ, is the ratio of the sample arithmetic and geometric means.

If the parameter a is known and only p is estimated, obtain the large-sample variance of p̂, the estimate of p, and thereby prove that the large-sample variance of θ̂, the estimate of θ in this case, is

θ²[p⁻¹ − φ′(p)]² / [n φ′(p)].
28 An experiment results in six independent observations y_r (r = 1, 2, ..., 6) such that

E(y_r) = α cos(2πr/6) + β sin(2πr/6); var(y_r) = σ².

Find the least-squares estimates of α and β, and verify that each of these estimates has variance σ²/3.
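The σ²/3 result comes from the orthogonality of the two regressors; a deterministic check (not part of the original text):

```python
import math

# Orthogonality sketch behind Exercise 28: the regressors cos(2*pi*r/6) and
# sin(2*pi*r/6), r = 1..6, are orthogonal and each has squared length 3,
# so each least-squares estimate has variance sigma^2/3.
cos_r = [math.cos(2 * math.pi * r / 6) for r in range(1, 7)]
sin_r = [math.sin(2 * math.pi * r / 6) for r in range(1, 7)]
sum_cc = sum(c * c for c in cos_r)                  # expected: 3
sum_ss = sum(s * s for s in sin_r)                  # expected: 3
sum_cs = sum(c * s for c, s in zip(cos_r, sin_r))   # expected: 0
print(sum_cc, sum_ss, sum_cs)
```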
29 A chord l is drawn at random across a circle of radius ρ, such that it passes through a given point on the circle and makes an angle θ with the tangent to the circle at the given point. Find the expectation and the variance of l, the length of the chord. Suppose ρ is not known, and it is desired to obtain its estimate from n independent measurements l₁, l₂, ..., lₙ of chords as drawn above. Show how these measurements can be used to obtain an unbiased estimate of ρ, and also find the variance of this estimate.

30 A target P moves randomly on the arc of a quadrant of a circle of radius r and centre O, such that OP makes an angle θ with the horizontal, where θ is known to have a probability density function proportional to θ(π − θ) in the range (0 ≤ θ ≤ π/2). If PM denotes the perpendicular on the horizontal through O, show that Δ, the area of the triangle OPM, has the expected value 3r²(π² + 4)/8π³. If r is unknown, and a random sample Δ₁, Δ₂, ..., Δₙ of n values of Δ is given, obtain an unbiased estimate of r², and hence prove that the estimated area of the quadrant is

2π⁴Δ̄ / [3(π² + 4)],

where Δ̄ is the mean of Δ₁, Δ₂, ..., Δₙ.
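A numerical check (not part of the original text) of the stated expectation in Exercise 30, by midpoint-rule integration against the density proportional to θ(π − θ):

```python
import math

# Check E[Delta] = 3 r^2 (pi^2 + 4)/(8 pi^3), with Delta the area of the
# triangle OPM, i.e. (1/2)(r cos t)(r sin t).
r = 2.0
N = 20_000                          # midpoint-rule panels
a, b = 0.0, math.pi / 2
norm = math.pi ** 3 / 12            # integral of t*(pi - t) over (0, pi/2)
total = 0.0
for i in range(N):
    t = a + (i + 0.5) * (b - a) / N
    area = 0.5 * (r * math.cos(t)) * (r * math.sin(t))
    total += area * t * (math.pi - t) / norm
E_num = total * (b - a) / N
E_formula = 3 * r * r * (math.pi ** 2 + 4) / (8 * math.pi ** 3)
print(E_num, E_formula)
```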
31 The 3n independent observations x₁, x₂, ..., xₙ; y₁, y₂, ..., yₙ; and z₁, z₂, ..., zₙ each have the same unknown variance σ², and

E(x_i) = m₁; E(y_i) = m₂; E(z_i) = m₁ + m₂, for i = 1, 2, ..., n.

Use the method of least squares to obtain the unbiased estimates of m₁ and m₂, and hence derive the best estimate of σ² based on the total available degrees of freedom. Also, show that the mean square for testing the hypothesis H(m₁ = m₂) is

(n/2)(x̄ − ȳ)²,

where x̄ and ȳ are the means of the x and y observations respectively.
32 The 3n independent observations x₁, x₂, ..., xₙ; y₁, y₂, ..., yₙ; and z₁, z₂, ..., zₙ each have the same unknown variance σ², and

E(x_i) = m₁; E(y_i) = m₂; E(z_i) = m₁ − m₂, for i = 1, 2, ..., n.

Obtain the least-squares estimates of m₁ and m₂, and hence derive the best estimate of σ² based on the total available degrees of freedom. If it is proposed to test the hypothesis H(m₁ = λm₂), λ being a known proportionality factor, then show that the appropriate generalized t statistic is

t = [(2 − λ)x̄ + (1 − 2λ)ȳ + (1 + λ)z̄]√n / {s[6(λ² − λ + 1)]^{1/2}},

where t has the t distribution with (3n − 2) d.f., s² is the least-squares estimate of σ², and x̄, ȳ, z̄ are the means of the x, y and z observations respectively.

33 The 3n independent observations x₁, x₂, ..., xₙ; y₁, y₂, ..., yₙ; and z₁, z₂, ..., zₙ have the same unknown variance σ², and their expectations depend upon three independent parameters θ₁, θ₂ and θ₃ in such a way that
E(x_i) = θ₁ + θ₂ + θ₃, E(y_i) = −θ₁ + θ₂ + θ₃, and E(z_i) = −2θ₂ + θ₃, for i = 1, 2, ..., n.

Use the method of least squares to obtain the unbiased estimates of θ₁, θ₂ and θ₃, and also the best estimate of σ² on the available degrees of freedom. Also, show that the mean square for testing the hypothesis H(θ₁ = θ₂ = θ₃) can be put in the form

(n/4)(Y₁² + 2Y₂²),

where Y₁ ≡ ȳ + z̄ and Y₂ ≡ (2x̄ − 3ȳ + 3z̄)/√22, x̄, ȳ, z̄ being the means of the x, y and z observations respectively.

34 The n observations x₁, x₂, ..., xₙ are from a population with mean m and variance σ², and the correlation between any pair of observations is constant and has coefficient ρ. If
T = Σ_{i=1}^{n} (x_i − x̄)²/(h + kρ), where h and k are unknown constants, is an unbiased estimate of σ², show that

T = Σ_{i=1}^{n} (x_i − x̄)² / [(1 − ρ)(n − 1)],

where nx̄ ≡ Σ_{i=1}^{n} x_i.
35 If y₁, y₂, ..., yₙ are independent observations such that

E(y_r) = rθ and var(y_r) = r³σ², for r = 1, 2, ..., n,

derive the least-squares estimate of the parameter θ and obtain the variance of the estimate. Hence show that for large n this variance is asymptotically σ²/(log_e n).

36 Three parcels are weighed at a post office singly, in pairs and all together, all weighings being independent and of equal accuracy. These weights are w_ijk, (i, j, k = 0, 1), the suffix 1 denoting the presence of a particular parcel and the suffix 0 denoting its absence. Obtain explicit expressions for the least-squares estimates of the weights of the parcels, giving the variances and covariances of the estimates in terms of the variance of the original observations.
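For Exercise 35, the weighted least-squares solution can be written in closed form; the formula below is my own derivation sketch (minimizing Σ(y_r − rθ)²/r³), not quoted from the text:

```python
import math

# Weighted least squares for E(y_r) = r*theta, var(y_r) = r^3 sigma^2:
# theta_hat = (sum_r y_r / r^2) / H_n with H_n = sum_r 1/r, and
# var(theta_hat) = sigma^2 / H_n ~ sigma^2 / log(n) for large n.
theta, n = 2.5, 10_000
ys = [r * theta for r in range(1, n + 1)]     # noise-free data: estimate is exact
H = sum(1 / r for r in range(1, n + 1))
theta_hat = sum(y / r ** 2 for y, r in zip(ys, range(1, n + 1))) / H
print(theta_hat, 1 / H, 1 / math.log(n))
```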
37 There are 3n independent observations x₁, x₂, ..., xₙ; y₁, y₂, ..., yₙ; and z₁, z₂, ..., zₙ, each observation having the same unknown variance σ². The mean values of the observations are given by

E(x_i) = θ₁ + 2θ₂ + 3θ₃, E(y_i) = 2θ₁ + 3θ₂ + θ₃, and E(z_i) = 3θ₁ + θ₂ + 2θ₃,

where θ₁, θ₂, and θ₃ are unknown parameters. Use the method of least squares to derive estimates of the contrasts (θ₁ − θ₂), (θ₂ − θ₃) and (θ₃ − θ₁), and hence also the unbiased estimate of σ². If it is desired to test the hypothesis H(θ₁ = θ₂/a = θ₃/b), where a and b are known constants, prove that the requisite mean square for testing this hypothesis is expressible in terms of

θ* = (λ₁x̄ + λ₂ȳ + λ₃z̄)/(λ₁² + λ₂² + λ₃²),

where x̄, ȳ, z̄ are the averages of x_i, y_i and z_i respectively, and

λ₁ ≡ (1 + 2a + 3b); λ₂ ≡ (2 + 3a + b); λ₃ ≡ (3 + a + 2b).
38 A sample of n independent observations y₁, y₂, ..., yₙ is given of a normally distributed random variable Y such that

E(y_r) = α + β(x_r − x̄) and var(y_r) = σ²

for (r = 1, 2, ..., n), where α and β are unknown parameters, the x_r are values of a non-random variable, and σ² is an independent unknown parameter. Obtain the least-squares estimates of α and β, and hence derive the best estimate of σ². Also, calculate the variance of the estimate of β and deduce the appropriate test statistic for testing the hypothesis H(β = β₀).

39 Given a random sample of n pairs of observations (x_i, y_i), (i = 1, 2, ..., n), show how the method of least squares may be modified for fitting a straight line of the type

αx + βy + 1 = 0

by minimizing the sum of the squares of the perpendicular distances of (x_i, y_i) from the straight line. Hence show that the estimates of α and β are a and b respectively, where b = 1/(mx̄ − ȳ) …
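A worked sketch of Exercise 38's centred regression and the t-type statistic for H(β = β₀); the data below are simulated (not from the text):

```python
import random

# Centred simple linear regression: beta_hat = S_xy/S_xx, alpha_hat = ybar,
# s^2 on n-2 d.f., and t = (beta_hat - beta0)/sqrt(s^2/S_xx).
random.seed(3)
alpha, beta, sigma, n = 1.0, 2.0, 0.5, 60
xs = [i / 10 for i in range(n)]
xbar = sum(xs) / n
ys = [alpha + beta * (x - xbar) + random.gauss(0, sigma) for x in xs]
Sxx = sum((x - xbar) ** 2 for x in xs)
b_hat = sum((x - xbar) * y for x, y in zip(xs, ys)) / Sxx
a_hat = sum(ys) / n
s2 = sum((y - a_hat - b_hat * (x - xbar)) ** 2
         for x, y in zip(xs, ys)) / (n - 2)      # best estimate of sigma^2
t = (b_hat - 2.0) / (s2 / Sxx) ** 0.5            # statistic for H(beta = 2)
print(a_hat, b_hat, t)
```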
where m* is the best estimate of m; and

(n₁ + n₂ − 1)s² = λ Σ_{r=1}^{n₁} (x_r − x̄)² + Σ_{r=1}^{n₂} (y_r − ȳ)² + λn₁n₂(x̄ − ȳ)²/(λn₁ + n₂),

x̄, ȳ being the sample averages of the x and y observations. Also, prove that, for λ ≠ 1, the variance of m* is less than the variance of the arithmetic mean of the (n₁ + n₂) observations.
43 If a random variable t has Student's t distribution with ν d.f., prove that

E(t²) = ν/(ν − 2).

The random variables X and Y have a bivariate normal distribution with means m_x, m_y, variances σ_x², σ_y² and correlation ρ. Given a random sample of n₁ paired observations (x_i, y_i), for (i = 1, 2, ..., n₁), and a further independent sample of n₂ observations on X only, Y not being recorded, an estimate of m_y is given by the statistic

T = ȳ₁ + b₁(x̄ − x̄₁),

where x̄₁, ȳ₁ and b₁ are the sample means and the regression coefficient of Y on X calculated from the first sample, and x̄ is the mean of the X observations in both samples. Show that T is an unbiased estimate of m_y and that

var(T) = [σ_y²/{n₁(n₁ + n₂)}][n₁ + n₂(1 − ρ²)(n₁ − 2)/(n₁ − 3)].
44 A random sample of N observations is given from a normal population with mean m and variance σ², both parameters being unknown. Suppose s² is the usual unbiased sample estimate of σ², v² any given positive constant, and n is the smallest integer satisfying

n ≥ s²/v² and n ≥ N.

If another independent sample of n observations is taken from the same population, show that by using the two sample means another unbiased estimate of m can be derived whose estimated variance is for large N asymptotically equal to v².
45 If χ² has the χ² distribution with ν d.f., show that for large ν

E[χ/√ν] ≈ 1 − 1/(4ν).

A random sample of n observations is given from a univariate normal population with coefficient of variation λ. Prove that the sample coefficient of variation (sample standard deviation/sample mean) is an asymptotically unbiased estimate of λ, with the large-sample variance

λ²(1 + 2λ²)/(2n).

46 A sample of n independent observations x₁, x₂, ..., xₙ is given from a normal distribution with mean m and variance σ², and the statistic d² is defined as

d² = Σ_{i=1}^{n−1} (x_{i+1} − x_i)²/[2(n − 1)].

Show that d² is an unbiased estimate of σ², and by determining its variance prove that its efficiency is 2(n − 1)/(3n − 4) as compared with s², the usual least-squares estimate whose variance is 2σ⁴/(n − 1).
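A simulation sketch (not part of the original text) of the mean-successive-difference estimate d² alongside s²; both should average to σ²:

```python
import random

# Compare the successive-difference estimate d^2 with the usual s^2 over
# repeated normal samples; both are unbiased for sigma^2.
random.seed(11)
m, sigma, n, reps = 3.0, 2.0, 25, 2000
d2s, s2s = [], []
for _ in range(reps):
    xs = [random.gauss(m, sigma) for _ in range(n)]
    d2s.append(sum((xs[i + 1] - xs[i]) ** 2 for i in range(n - 1)) / (2 * (n - 1)))
    xbar = sum(xs) / n
    s2s.append(sum((x - xbar) ** 2 for x in xs) / (n - 1))
mean_d2 = sum(d2s) / reps
mean_s2 = sum(s2s) / reps
print(mean_d2, mean_s2, sigma ** 2)
```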
47 If x₁, x₂, ..., xₙ are n random variables such that

E(x_i) = m, var(x_i) = σ², for i = 1, 2, ..., n,

and cov(x_i, x_j) = ρσ², for i ≠ j,

show that
(i) var(x̄) = [1 + (n − 1)ρ]σ²/n, where x̄ is the average of the x_i;
(ii) E[Σ_{i=1}^{n} (x_i − x̄)²] = (n − 1)(1 − ρ)σ²; and
(iii) −1/(n − 1) ≤ ρ ≤ 1.
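Results (i) and (ii) follow from the equicorrelated covariance matrix; a deterministic check (not part of the original text):

```python
# Check (i) and (ii) of Exercise 47 via Sigma = sigma^2 [ (1-rho) I + rho J ].
n, sigma2, rho = 6, 2.0, 0.3
Sigma = [[sigma2 * (1.0 if i == j else rho) for j in range(n)] for i in range(n)]

# (i) var(xbar) = (1/n^2) * (sum of all entries of Sigma)
var_xbar = sum(Sigma[i][j] for i in range(n) for j in range(n)) / n ** 2
claim_i = (1 + (n - 1) * rho) * sigma2 / n

# (ii) E[sum_i (x_i - xbar)^2] = sum_i var(x_i) - n var(xbar)
expected_ss = sum(Sigma[i][i] for i in range(n)) - n * var_xbar
claim_ii = (n - 1) * (1 - rho) * sigma2
print(var_xbar, claim_i, expected_ss, claim_ii)
```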
48 Given two linear functions

L₁ = Σ_{i=1}^{n} a_i x_i and L₂ = Σ_{i=1}^{n} b_i x_i,

where the a's and b's are constants and the x_i are random variables such that

E(x_i) = m_i, var(x_i) = σ², and cov(x_i, x_j) = σ²ρ_ij, (i ≠ j),

find the variances of L₁ and L₂ and also their covariance. Further, calculate the variance of the function (L₁ − L₂), and then show that if all ρ_ij = ρ, this variance reduces to

σ²[(1 − ρ) Σ_{r=1}^{n} δ_r² + ρ(Σ_{r=1}^{n} δ_r)²],

where δ_r ≡ (a_r − b_r) ≠ 0 for all r, and hence deduce that

−Σ_{r=1}^{n} δ_r² / ΣΣ_{r≠s} δ_r δ_s ≤ ρ ≤ 1.
49 If x₁, x₂, ..., xₙ are n independent random variables such that

E(x_j) = m_j, var(x_j) = σ², for j = 1, 2, ..., n,

prove that the random variables

Y₁ = Σ_{j=1}^{n} a_j x_j and Y₂ = Σ_{j=1}^{n} b_j x_j,

where the a's and b's are constants not all zero, are uncorrelated if

Σ_{j=1}^{n} a_j b_j = 0.

Hence, or otherwise, show that the arithmetic mean x̄ is uncorrelated with any deviation (x_i − x̄) from the mean. Also, prove that

var(x_i − x̄) = σ²(1 − 1/n), and cov[(x_i − x̄), (x_j − x̄)] = −σ²/n, for i ≠ j.
50 If x₁, x₂, ..., xₙ are random variables such that

E(x_j) = m_j, var(x_j) = σ², cov(x_i, x_j) = ρσ², for i ≠ j = 1, 2, ..., n,

obtain the variance of the linear function

L = Σ_{j=1}^{n} a_j x_j,

where the a_j are constants not all zero.

If the x_j are divided into two distinct groups of ν₁ and ν₂ elements (ν₁ + ν₂ = n) to define two new random variables

S₁ = Σ_{j=1}^{ν₁} x_j and S₂ = Σ_{j=ν₁+1}^{n} x_j,

prove that

corr(S₁, S₂) = ρ(ν₁ν₂)^{1/2} / {[1 + (ν₁ − 1)ρ][1 + (ν₂ − 1)ρ]}^{1/2}.

Also, when n → ∞ such that ν₁ → ∞ but ν₂ remains finite, find the limiting value of this correlation.
51 Of two discrete random variables, X can take only the values ±α and Y the values ±β, with unequal probabilities for the four possible realizations of the pairs (X, Y). In a random sample of N observations, the frequencies corresponding to (−α, −β), (α, −β), (−α, β) and (α, β) were found to be n₁, n₂, n₃ and n₄ respectively, (Σn_j = N). Prove that the sample product-moment correlation between X and Y is

(λ₃ − λ₁λ₂)/[(1 − λ₁²)(1 − λ₂²)]^{1/2},

where λ₁ and λ₂ are orthogonal linear contrasts of the observed relative frequencies associated with the sample means of X and Y, and λ₃ is the contrast orthogonal to both λ₁ and λ₂. Hence show that the sample correlation vanishes when λ₃ = λ₁λ₂. By considering the limit of the relative frequencies in this equation as N → ∞, deduce that for the joint distribution of X and Y, zero correlation in the population ensures the statistical independence of the random variables.
52 From a finite collection of N balls of which M (< N) are white and the rest black, two successive random samples of size n₁ and n₂ respectively are drawn without replacement. If the random variables X and Y denote the number of white balls in the two samples, prove that

P(X = r, Y = s) = C(n₁, r) C(n₂, s) C(N − n₁ − n₂, M − r − s) / C(N, M),

and indicate the limits of variation of r and s. By considering the appropriate array distribution, show that

E(Y|X = r) = n₂(M − r)/(N − n₁),

and hence that

corr(X, Y) = −[n₁n₂ / {(N − n₁)(N − n₂)}]^{1/2}.

Also, find var(Y|X = r) and deduce that this variance can never exceed

n₂(N − n₁ − n₂)/[4(N − n₁ − 1)].

53 Each of two packs A and B has N cards which are of t different types, the cards of any type being indistinguishable. Pack A has a cards of each type (at = N), and pack B has b_i cards of the ith type (Σb_i = N; 0 ≤ b_i ≤ N). For a random arrangement of the two packs, a "match" is said to occur in a specific position if in that position the cards in A and B are of the same type. Suppose that X and Y are random variables associated with any two cards of B, each taking the value 1 or 0 according as a match is or is not observed in the corresponding positions of the cards. By considering separately the two cases when the cards associated with X and Y are or are not of the same type, derive the bivariate distribution of X and Y. Hence prove that for the marginal distribution of X

E(X) = 1/t; var(X) = (t − 1)/t²;

and that the correlation between X and Y depends on the b_i only through V(b_i), where

(t − 1)V(b_i) ≡ Σ_{i=1}^{t} (b_i − Nt⁻¹)²,

so that the correlation is a maximum when b_i = N/t. Also, if S_N is the random variable denoting the total number of matches realized in a random arrangement of A and B, use the above results to obtain var(S_N), and then establish that the maximum value of this variance is N²(t − 1)/(N − 1)t².

54 If X and Y are correlated random variables with correlation ρ and coefficients of variation v₁ and v₂ respectively, prove that, as a first approximation,

var(X/Y) = λ²(v₁² − 2ρv₁v₂ + v₂²),

where λ is the ratio of the expectations of X and Y. Further, assuming that the joint distribution of X and Y is symmetrical, obtain an approximate expression for the bias in the value of E(X/Y) used in deriving the variance.

55 If x̄ and s² are the usual sample mean and variance based on a sample of n independent observations from a normal population with mean m and variance σ², prove that the correlation between x̄ and Student's t statistic (x̄ − m)/s is

[(n − 3)/2]^{1/2} Γ((n − 2)/2) / Γ((n − 1)/2).

Also, by using the Γ-function approximation

Γ[(ν + 1)/2] / [(ν/2)^{1/2} Γ(ν/2)] ≈ 1 − 1/(4ν) for large ν,

verify that for large samples

corr(x̄, t) ≈ 1 − 1/(4n).
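Exercise 52's joint distribution can be verified by direct enumeration; a sketch (the values of N, M, n₁, n₂ are made up):

```python
from math import comb

# Enumeration check of the two-sample hypergeometric law and the conditional
# mean E(Y | X = r) = n2 (M - r)/(N - n1).
N, M, n1, n2 = 12, 5, 4, 3

def pmf(r, s):
    if r < 0 or s < 0 or M - r - s < 0:
        return 0.0
    return comb(n1, r) * comb(n2, s) * comb(N - n1 - n2, M - r - s) / comb(N, M)

total = sum(pmf(r, s) for r in range(n1 + 1) for s in range(n2 + 1))

r = 2
cond_mean = (sum(s * pmf(r, s) for s in range(n2 + 1)) /
             sum(pmf(r, s) for s in range(n2 + 1)))
claim = n2 * (M - r) / (N - n1)
print(total, cond_mean, claim)
```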
56 From a bivariate normal distribution of the random variables X and Y, with parameters (m_x, m_y, σ_x, σ_y, ρ) in standard notation, a random sample of n observations (x_i, y_i) for (i = 1, 2, ..., n) is given. Prove that the sample product-moment correlation coefficient r is invariant under linear transformation of the observations. Hence deduce, without obtaining the sampling distribution, that the probability density function of r can only involve the parameter ρ. If X and Y are transformed to unitary orthogonal variables W and Z with the sample product-moment correlation coefficient R, then show that

r²(1 − R²)/[R²(1 − r²)] = [1 + ρf/{R(1 − ρ²)^{1/2}}]²,

where f² is the ratio of the sample sums of squares of the W and Z observations.
57 The volume (V) of a duck's egg is known empirically to be proportional to (length)(breadth)², where length (X) and breadth (Y) may be regarded approximately as having a bivariate normal distribution with coefficients of variation λ, μ for X and Y respectively, and correlation ρ. If λ and μ are sufficiently small for powers greater than the fourth to be negligible, find the coefficient of variation of V, and verify that as a first approximation its value is

(λ² + 4μ² + 4λμρ)^{1/2}.

Also, determine γ, the coefficient of variation of the ratio Y/X, to the same degree of approximation, and hence show that an alternative form of the first approximation for the coefficient of variation of V is

(3λ² + 6μ² − 2γ²)^{1/2}.
58 For the binomial distribution with constant probability of success p, prove that y_r, the probability of r successes in n trials, satisfies the finite-difference relation

(y_r − y_{r−1})/(y_r + y_{r−1}) = [(n + 1)p − r]/[(n + 1)p + r(1 − 2p)].

If 0 < p < ½, show that this equation leads to a limiting Γ-distribution approximation for the binomial distribution such that the statistic

4[r(1 − 2p) + (n + 1)p]/(1 − 2p)²

may approximately be regarded as a χ² variable with d.f.

[4(n + 1)p(1 − p)/(1 − 2p)²] + 1.
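The finite-difference relation is an exact identity for the binomial probabilities, which a direct computation confirms (a sketch, not part of the original text):

```python
from math import comb

# Check the identity (y_r - y_{r-1})/(y_r + y_{r-1})
#                    = ((n+1)p - r)/((n+1)p + r(1-2p))
# for y_r = C(n, r) p^r (1-p)^(n-r).
n, p = 20, 0.3
y = [comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(n + 1)]
max_err = max(
    abs((y[r] - y[r - 1]) / (y[r] + y[r - 1])
        - ((n + 1) * p - r) / ((n + 1) * p + r * (1 - 2 * p)))
    for r in range(1, n + 1)
)
print(max_err)
```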
59 By using a suitable approximation in the region of integration of the double integral

[Φ(a)]² = (1/2π) ∫₀^a ∫₀^a e^{−(x² + y²)/2} dy dx, (a > 0),

prove Pólya's approximation that

(1/√(2π)) ∫₀^a e^{−x²/2} dx ≈ ½(1 − e^{−2a²/π})^{1/2}.

Hence show that the sample median obtained from (2ν + 1) independent observations of a unit normal variable is approximately normally distributed with zero mean and variance π/(π + 4ν). Verify that the asymptotic efficiency of the median as compared with the sample mean is 2/π.
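A numerical look at the accuracy of Pólya's approximation (a sketch, not part of the original text), comparing it with the exact normal probability integral via the error function:

```python
import math

# Compare Polya's approximation with (1/sqrt(2 pi)) * int_0^a e^{-x^2/2} dx,
# which equals erf(a/sqrt(2))/2.
errs = []
for i in range(1, 13):
    a = 0.25 * i
    exact = 0.5 * math.erf(a / math.sqrt(2))
    polya = 0.5 * math.sqrt(1 - math.exp(-2 * a * a / math.pi))
    errs.append(abs(exact - polya))
max_err = max(errs)
print(max_err)
```

The maximum absolute error over this grid is of the order of a few parts in a thousand.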
60 Given a random sample of (2ν + 1) observations from a unit normal population, prove that the probability distribution of the sample median x is proportional to

[F(x){1 − F(x)}]^ν dF(x),

where F(x) is the distribution function of a unit normal random variable. By using the Cadwell approximation

F(z){1 − F(z)} ≈ ¼ e^{−2z²/π} [1 + {2(π − 3)/3π²} z⁴ − {(7π² − 60π + 120)/45π³} z⁶ + ⋯],

show that a scaled transform of the sample median defined by

y = [(4ν + π)/π]^{1/2} x

has the approximate normalized probability distribution

[1/{(1 + 3k)√(2π)}] e^{−y²/2} (1 + ky⁴) dy, (−∞ < y < ∞),

where k ≡ 2(π − 3)ν/[3(4ν + π)²]. Hence verify that with this approximation for the distribution of x

var(x) ≈ [π/(π + 4ν)][1 + 8(π − 3)ν/(π + 4ν)²],

and, for the kurtosis of x,

γ₂(x) ≈ 16(π − 3)ν/(π + 4ν)².
61 If x and y are independent random variables such that x has a unit normal distribution, and y is distributed as χ² with n d.f., prove that the ratio

t = x√n/√y

has Student's t distribution with n d.f. Assuming that for sufficiently large n

Γ{(n + 1)/2} / [(n/2)^{1/2} Γ(n/2)] ≈ 1 − 1/(4n),

derive Bartlett's approximation that (n − ½) log_e[1 + (t²/n)] is distributed as χ² with 1 d.f.
62 If P₁, P₂, ..., P_k are probabilities derived from the realizations x₁, x₂, ..., x_k of the independent random variables X₁, X₂, ..., X_k with probability density functions f₁(x₁), f₂(x₂), ..., f_k(x_k) such that

P_i = ∫_{−∞}^{x_i} f_i(t) dt, for i = 1, 2, ..., k,

prove that the Pearson statistic

P = −2 Σ_{i=1}^{k} log_e P_i

is distributed as χ² with 2k d.f.

63 A sample of n independent observations of a random variable Y is given, where Y has the probability density function

f(Y = y) = θ⁻¹ e^{−(y−μ)/θ}

in the interval Y ≥ μ, and zero otherwise. Obtain the joint distribution of the largest and smallest sample observations, and hence derive the distribution of the sample range w. Verify that for any given fixed value w₀ the probability

P(w ≥ w₀) = 1 − (1 − e^{−w₀/θ})^{n−1}.

Explain how this result can be used to test a specific hypothesis about the parameter θ.
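A simulation sketch of the range law in Exercise 63 (not part of the original text; the parameter values are made up):

```python
import random, math

# Compare the empirical P(w >= w0) for the range of n shifted-exponential
# observations with the stated formula 1 - (1 - exp(-w0/theta))^(n-1).
random.seed(5)
theta, mu, n, reps, w0 = 2.0, 1.0, 8, 20_000, 3.0
hits = 0
for _ in range(reps):
    ys = [mu + random.expovariate(1 / theta) for _ in range(n)]
    if max(ys) - min(ys) >= w0:
        hits += 1
empirical = hits / reps
exact = 1 - (1 - math.exp(-w0 / theta)) ** (n - 1)
print(empirical, exact)
```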
64 A random variable X has, for X = x, the probability density function f(x) defined in the interval (α ≤ X ≤ β). If a sample of n independent observations from this distribution is given, and the (algebraically) smallest and largest of these are denoted by x₁ and xₙ respectively, derive the joint distribution of x₁ and xₙ. If the sample range is defined by R = xₙ − x₁, show that the marginal distribution of R is

g(R) dR = n(n − 1)[∫_{α}^{β−R} f(x₁) f(x₁ + R) {∫_{x₁}^{x₁+R} f(x) dx}^{n−2} dx₁] dR,

where 0 ≤ R ≤ β − α. Hence derive the distribution of R in the particular case when X is uniformly distributed in the interval (0 ≤ X ≤ 1).

65 A random sample of (2ν + 2) observations of a uniformly distributed random variable X in the range (0 ≤ X ≤ 2a) is ordered, the (ν + 1)th and (ν + 2)th observations being x₁ and x₂, (x₂ > x₁). Obtain the joint probability distribution of
y = (x₁ + x₂)/2 and …

Hence prove that for the marginal distribution of y the probability density function is

[(ν + 1)/{2^{2ν} a B(ν + 1, ν + 1)}] Σ_{r=0}^{ν} C(ν, r)(−1)^r (y/a)^{ν−r}(1 − y/a)^{2ν−2r}[1 ± (1 − y/a)^{2r+1}]/(2r + 1),

according as the range of y is (0 ≤ y ≤ a) or (a ≤ y ≤ 2a). Also, verify by actual integration that P(0 ≤ y ≤ a) = ½.
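For Exercise 64's uniform special case, the range density specializes (my own substitution, not quoted from the text) to g(R) = n(n − 1)R^{n−2}(1 − R), a Beta(n − 1, 2) law with mean (n − 1)/(n + 1); a simulation sketch:

```python
import random

# Compare the simulated mean range of n uniform (0, 1) observations with
# the Beta(n-1, 2) mean (n-1)/(n+1).
random.seed(9)
n, reps = 6, 20_000
total = 0.0
for _ in range(reps):
    xs = [random.random() for _ in range(n)]
    total += max(xs) - min(xs)
mean_R = total / reps
E_R = (n - 1) / (n + 1)
print(mean_R, E_R)
```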
66 Starting from the joint distribution of x̄ and s, the sample mean and standard deviation based on a random sample of n observations from a normal population with mean m and variance σ², prove that the sampling distribution of v, the square of the sample coefficient of variation, defined by v = s²/x̄², is

f(v) dv = e^{−n/(2λ²)} Σ_{i=0}^{∞} [(2n/λ²)^i Γ((n + 2i)/2) / {√π Γ((n − 1)/2) Γ(2i + 1)}] × [(n − 1)v/n]^{(n−3)/2} [1 + (n − 1)v/n]^{−(n+2i)/2} [(n − 1)/n] dv,

for 0 ≤ v < ∞, where λ ≡ σ/m. Also, obtain the distribution of w = s/x̄.
67 If s is the sample standard deviation based on n independent observations from a normal population with mean m and variance σ², prove that E(s) = σC_ν, where

C_ν ≡ (2/ν)^{1/2} Γ[(ν + 1)/2]/Γ(ν/2) ≈ 1 − 1/(4ν) + 1/(32ν²), (ν ≡ n − 1),

and

var(s) ≈ σ²/(2ν).

Assuming that, for ν moderately large, s can be regarded as approximately normally distributed with mean σC_ν and variance σ²/2ν, show that the statistic

v = C_ν t [2ν/(t² + 2ν)]^{1/2},

t being the standard Student's t statistic with ν d.f., has asymptotically the unit normal distribution. Also, determine the central moments of v and so prove that the approximation may be regarded as effectively correct to O(ν⁻¹).

68 Assuming Stirling's asymptotic formula for the Γ function,

Γ(n + 1) ≈ (n/e)ⁿ √(2πn),

prove that, for n sufficiently large and h (> 0) fixed,

Γ(n + h) ≈ n^h Γ(n).

Hence show that if the random variable X has the Beta distribution with probability density function

f(X = x) = [1/B(p, q)] x^{p−1}(1 − x)^{q−1}, (0 ≤ X ≤ 1),

then for large p and relatively small q, a first approximation gives −2p log_e x as χ² with 2q d.f. Also prove that as an improved approximation −(2p + q − 1) log_e x is χ² with 2q d.f.
69 For a continuous random variable X having a Beta distribution in the unit interval with parameters p and q in standard notation, Bartlett has shown that if p is large and q relatively small, then −(2p+q−1) log_e x is approximately χ² with 2q d.f. By a suitable transformation of this approximation, prove that

x \approx \left(\frac{2p-1}{2p+2q-1}\right)^{\chi^2/2q}, \quad\text{where } \chi^2 \text{ has } 2q \text{ d.f.},

and hence, by setting v = χ²/(2p+2q−1) and s = q/(2p+2q−1), that

x \approx e^{-v} - s\left(1+\tfrac{4}{3}s\right)v + sv^2.
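Bartlett's χ² approximation can be checked by simulation. In this illustrative sketch the values p = 50 and q = 2 are arbitrary choices (large p, small q); the first two moments of −(2p+q−1) log_e X are compared with the χ² values 2q and 4q:

```python
import math
import random
import statistics

random.seed(1)
p, q = 50.0, 2.0                       # p large, q relatively small (arbitrary)
sample = [-(2*p + q - 1) * math.log(random.betavariate(p, q))
          for _ in range(20000)]
m = statistics.mean(sample)            # chi-square with 2q d.f. has mean 2q = 4
v = statistics.variance(sample)        # and variance 4q = 8
print(m, v)
```
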
70 The independent random variables X₁, X₂, …, X_n have Beta distributions in the unit interval (0, 1) which, in standard notation, have the parameters (α_j, β_j) for j = 1, 2, …, n respectively, where α_j ≡ α_{j+1} + β_{j+1} for j = 1, 2, …, n−1. Prove that the probability distribution of the product of the n random variables is also a Beta distribution in the same unit interval but with parameters

\left(\alpha_n,\ \sum_{j=1}^{n}\beta_j\right).
Hence derive the distribution of g, the geometric mean of the n variables.
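The closure property just stated is easy to illustrate by simulation. In the sketch below (not part of the exercise) the parameter values are arbitrary but satisfy α_j = α_{j+1} + β_{j+1}, and the product's first two moments are compared with those of a Beta(α_n, Σβ_j) law:

```python
import math
import random
import statistics

random.seed(2)
betas = [0.5, 1.5, 1.0]          # beta_1, beta_2, beta_3 (arbitrary choices)
a3 = 2.0                         # alpha_3 (arbitrary)
a2 = a3 + betas[2]               # alpha_j = alpha_{j+1} + beta_{j+1}
a1 = a2 + betas[1]
alphas = [a1, a2, a3]

prods = [math.prod(random.betavariate(a, b) for a, b in zip(alphas, betas))
         for _ in range(20000)]

a, b = a3, sum(betas)            # claimed parameters of the product's Beta law
m1 = statistics.mean(prods)                  # near a/(a+b) = 0.4
m2 = statistics.mean(x * x for x in prods)   # near a(a+1)/((a+b)(a+b+1)) = 0.2
print(m1, m2)
```
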
71 If X and Y are normal correlated random variables, each with zero mean and unit variance, prove that for positive constants h₁, h₂, k₁, k₂,

P(h_2 \le X \le h_1,\ k_2 \le Y \le k_1) = M(h_1,k_1,\rho) + M(h_2,k_2,\rho) - M(h_1,k_2,\rho) - M(h_2,k_1,\rho),

where M(a, b, ρ) = P(X ≥ a, Y ≥ b) and corr(X, Y) = ρ. Further, if α and β are negative, show that

M(\alpha,\beta,\rho) = \tfrac{1}{2} - M(\alpha,-\beta,-\rho) - \theta(-\alpha), \quad\text{where}\quad P(0 \le X \le -\alpha) \equiv \theta(-\alpha),
EXERCISES IN PROBABILITY AND STATISTICS
and hence finally deduce that

M(\alpha,\beta,\rho) = M(-\alpha,-\beta,\rho) + \theta(-\alpha) + \theta(-\beta).
72 If the random variables X and Y have a bivariate normal distribution with probability density

f(X = x, Y = y) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\left[-\frac{1}{2(1-\rho^2)}\left\{\frac{x^2}{\sigma_1^2} - \frac{2\rho xy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2}\right\}\right],

prove that the marginal distribution of X and the conditional distribution of Y for given X are both univariate normal. Hence, by considering the probability density contours of the bivariate normal surface, show that the contour of smallest area in the (x, y) plane which excludes a fraction P of the probability is given by the equation

\frac{x^2}{\sigma_1^2} - \frac{2\rho xy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2} = 2(1-\rho^2)\log_e(1/P).

73 For the random variables X and Y having a joint bivariate normal distribution with the probability density function
f(X = x, Y = y) = \frac{1}{2\pi\sqrt{1-\rho^2}}\exp\left[-\frac{x^2 - 2\rho xy + y^2}{2(1-\rho^2)}\right] \quad (-\infty < x, y < \infty),

…

in which σ₁² and σ₂² are the variances of X and Y, and the b's are constants. Show that in this case the equi-probability contours must be ellipses, and obtain the marginal distributions of X and Y, indicating the permissible range of variation of the random variables. Hence prove that the joint probability density function has the form
where ρ is the correlation between X and Y, and n ≡ 3(β₂ − 2)/(3 − β₂), β₂ being the common value of the coefficient of kurtosis of the marginal distributions of the random variables. Discuss the limiting forms of the joint distribution when β₂ → 2 and β₂ → 3.

84 For the joint distribution of the random variables X and Y with the probability density function
\frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \times \cdots, \quad n > 0,

obtain the marginal distribution of X, and hence show that σ²_{y·x}, the conditional variance of Y given X = x, lies on the ellipse

\frac{\sigma_{y\cdot x}^2}{\sigma_2^2(1-\rho^2)\dfrac{2n+4}{2n+3}} + \frac{x^2}{(2n+4)\sigma_1^2} = 1.

Further, prove that the probability is half that a random observation falls outside the equi-probability contour
85 A farmer, interested in increasing the output of his potato crop, experiments with a new fertilizer that is claimed to give an appreciable increase. He uses the new fertilizer on n equal farm plots, and obtains the individual yield figures. From past experience, the farmer knows that his average returns have been m lb per plot. Explain the method for analysing the data, stating the assumptions underlying the test of significance for the hypothesis that the new fertilizer has given no different average yield from the past.

Suppose the test of significance gives a barely significant result at the 5 per cent level, and the farmer considers that this is not sufficiently convincing evidence for him to introduce the new fertilizer. The farmer then wishes to conduct another similar experiment, and desires to know how many experimental plots he should take to ensure that differences as large as the one already observed would be significant at the 1 per cent level. State the procedure for obtaining the requisite number on the basis of the available experimental evidence.

If the farmer suspects rather large seasonal variations affecting his harvest and also considerable differences in plot fertility, suggest another experimental procedure and its statistical analysis for testing whether the new fertilizer does, indeed, give a markedly better yield. [It may be assumed that there is no practical difficulty in dividing each experimental plot into two equal parts.]
86 A laboratory regularly carries out tests for the assessment of the breaking strength of cement mortar briquettes produced in large lots of a relatively homogeneous kind; and in an investigation to compare the accuracy of technicians A and B working in the laboratory, two samples of size n₁ and n₂ respectively are obtained from tests conducted by them on randomly selected briquettes. Explain how the sample data can be analysed to test the hypothesis of no difference between the accuracy of A and B, stating explicitly the assumptions underlying the test of significance. If the observed result is significant at the 1 per cent level, show how a 95 per cent confidence interval can be obtained for the parameter measuring the relative accuracy of A and B.

Suggest another experimental procedure and its statistical analysis for comparing the accuracy of A and B when it is known that briquettes produced in different lots tend to have different breaking strengths.

87 In a single tossing of a penny the probabilities of obtaining a head or a tail are p and (1−p) respectively; and an experiment consists of tossing the penny twice. Find the probability distribution of the four possible outcomes of the experiment.

If in N independent trials of this experiment the difference between the observed relative frequencies of two heads and two tails is λ, prove that the maximum-likelihood estimate p* of p is

p* = (1 + λ)/2.

Show that p* is an unbiased estimate of p, and verify that in this case the usual formula for the large-sample variance of a maximum-likelihood estimate gives the exact variance of p*. Hence, for large N, indicate two different ways of testing the hypothesis that the penny is unbiased.

88 The probabilities of obtaining a head on a single tossing with each of two pennies are p₁ and p₂ respectively, the difference θ ≡ (p₁ − p₂) being an unknown parameter. Find the probabilities of the four possible outcomes of a single tossing of the two pennies.

If it is known that the second penny is unbiased, prove that the maximum-likelihood estimate of θ is

θ* = (1 − 2λ)/2,

where λ is the observed relative frequency with which the first penny turned up tails in N independent tossings of the two pennies. Verify that

var(θ*) = p₁(1 − p₁)/N.

On the other hand, if nothing is known about the second penny, derive an unbiased linear estimate of θ in terms of the observed relative frequencies, and show that the variance of this estimate can never exceed 1/(4N).

89 In a factory producing synthetic yarn, the amount of raw material that can be put into the plant at any one time is, on a certain quantitative scale, an integral non-random variable x which can take the values x = 1, 2, …, n. The quantity of yarn produced, y_x, depends upon the amount x of the raw material used and an efficiency factor of the plant, so that a linear regression relationship between y_x and x does not hold over all the possible values of x. It is therefore assumed that y_x is a random variable such that

E(y_x) = αx + βx², \quad var(y_x) = σ²,

and it is desired to estimate the magnitude of the linear parameter α and the damping parameter β. If a set of sample values y₁, y₂, …, y_n is obtained by independent experimental runs of the plant using the quantities x = 1, 2, …, n respectively of the
raw material, apply the method of least squares to obtain α* and β*, the estimates of α and β, and show that formally

\mathrm{var}(\alpha^*) = \frac{S_4\sigma^2}{S_2S_4 - S_3^2}, \quad\text{where } S_r \equiv \sum_{x=1}^{n}x^r.

Hence, or otherwise, derive explicit expressions for the two variances.
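As a numerical illustration (not part of the exercise; the true parameter values below are arbitrary), the normal equations for E(y_x) = αx + βx² can be solved in closed form and the empirical variance of α* compared with S₄σ²/(S₂S₄ − S₃²):

```python
import random
import statistics

def fit(ys, xs):
    # least squares for E(y_x) = alpha*x + beta*x^2 via the normal equations
    S2 = sum(x**2 for x in xs); S3 = sum(x**3 for x in xs); S4 = sum(x**4 for x in xs)
    T1 = sum(x * y for x, y in zip(xs, ys))
    T2 = sum(x * x * y for x, y in zip(xs, ys))
    d = S2 * S4 - S3 * S3
    return (S4 * T1 - S3 * T2) / d, (S2 * T2 - S3 * T1) / d

random.seed(3)
alpha, beta, sigma, n = 2.0, -0.1, 0.5, 10      # arbitrary true values
xs = list(range(1, n + 1))
ests = [fit([alpha*x + beta*x*x + random.gauss(0, sigma) for x in xs], xs)
        for _ in range(4000)]
a_hat = statistics.mean(e[0] for e in ests)     # should be near alpha
a_var = statistics.variance(e[0] for e in ests)

S2 = sum(x**2 for x in xs); S3 = sum(x**3 for x in xs); S4 = sum(x**4 for x in xs)
var_theory = S4 * sigma**2 / (S2 * S4 - S3 * S3)   # var(alpha*) from the text
print(a_hat, a_var, var_theory)
```
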
90 Suppose a and b are unbiased estimates of two parameters α and β respectively, such that

var(a) = λ₁σ²; \quad var(b) = λ₂σ²; \quad cov(a, b) = λ₃σ²,

where λ₁, λ₂ and λ₃ are known constants, and s² is an independent estimate of σ². Assuming that a and b are normally distributed and that ns²/σ² has a χ² distribution with n degrees of freedom, by considering the function a − ρb, or otherwise, obtain the appropriate 95 per cent confidence limits for the parametric ratio ρ ≡ α/β. Hence verify that in the particular case when λ₃ = 0, the limits are
\left[\frac{a}{b} \pm \frac{st_0}{b}\left\{\lambda_1 + \lambda_2\left(\frac{a}{b}\right)^2 - \lambda_1\lambda_2\frac{s^2t_0^2}{b^2}\right\}^{1/2}\right]\Bigg/\left[1 - \lambda_2\frac{s^2t_0^2}{b^2}\right],

t₀ being the 5 per cent point of Student's t distribution with n degrees of freedom.

91 Suppose x and y are independent normal variables with zero means and variances σ₁² and σ₂², and a new random variable is defined by the ratio z = (y+b)/(x+a), a and b being known constants, and a > 0. Prove that the probability distribution of z is
\left[\frac{1}{\sqrt{2\pi}}\cdot\frac{a\sigma_2^2 + \sigma_1^2bz}{(\sigma_2^2 + \sigma_1^2z^2)^{3/2}}\exp\{-(az-b)^2/2(\sigma_2^2+\sigma_1^2z^2)\} + R(z)\right]dz,

where R(z) is a positive function of z such that

\int_{-\infty}^{\infty}R(z)\,dz = P(|w| \ge a/\sigma_1),

w being a unit normal variable. Hence deduce that if a is sufficiently large compared with σ₁, then the random variable

g(z) = (az-b)\big/(\sigma_2^2+\sigma_1^2z^2)^{1/2}

is approximately normally distributed with zero mean and unit variance. Use this result to prove that if x and y are correlated normal variables with correlation ρ, then the appropriate generalization of g(z) is

h(z) = (az-b)\big/(\sigma_2^2 - 2\rho\sigma_1\sigma_2z + \sigma_1^2z^2)^{1/2}.
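The approximate normality of g(z) when a is large compared with σ₁ can be illustrated by simulation; the constants below are arbitrary choices, and the check is only a sketch:

```python
import math
import random
import statistics

random.seed(4)
a, b = 10.0, 3.0          # arbitrary constants, with a large compared to sigma1
s1, s2 = 1.0, 2.0
g = []
for _ in range(20000):
    x, y = random.gauss(0, s1), random.gauss(0, s2)
    z = (y + b) / (x + a)
    g.append((a * z - b) / math.sqrt(s2 * s2 + s1 * s1 * z * z))
m, v = statistics.mean(g), statistics.variance(g)
print(m, v)               # approximately 0 and 1
```
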
92 Saplings were planted at the corners of a square lattice in a large rectangular plantation, there being MN saplings distributed in N rows each with M plants. During the ensuing winter a certain number of saplings were killed by frost, and it was assumed, as a null hypothesis, that these deaths occurred independently, so that the contiguity of saplings did not affect their chance of survival. To test this hypothesis, a random sample of n saplings (n ≤ MN) was taken, and it was observed that amongst them there were in all d contiguous pairs of saplings that had died.

By considering the four possible types of contiguities between two points, prove that, in general, if two points are selected at random, then the probability of their being contiguous points of the lattice is

p_2 = \frac{2\{4MN - 3(M+N) + 2\}}{MN(MN-1)}.

Hence determine the expected number of contiguous pairs in the sample of n saplings. Also, assuming that the distribution of d can be approximated by a binomial distribution with parameter p₂, indicate a method of testing the hypothesis of randomness of deaths amongst the saplings of the plantation. Extend the above analysis to show that there are in all twenty different ways in which the four distinct contiguous pairs can be formed into …

93 … A₂, A₃ and A₄, the corresponding expected proportions are p₁, p₂, p₃ and p₄ respectively, where Σpᵢ = 1. A random sample of n₁ observations is taken from this population and the observed class frequencies are found to be n₁₁, n₁₂, n₁₃ and n₁₄. A second random sample of n₂ observations is taken by ignoring the A₁ class, and the observed frequencies in the A₂, A₃ and A₄ classes are n₂₂, n₂₃ and n₂₄ respectively. Finally, a third random sample of n₃ observations is obtained by ignoring both the A₁ and A₂ classes, and the frequencies in the A₃ and A₄ classes are n₃₃ and n₃₄ respectively. If the total numbers of observations in the four classes obtained from these three samples are denoted by r₁, r₂, r₃ and r₄ respectively (Σrᵢ = Σn_j = n), find the expectations of the rᵢ. Hence derive estimates of the expected proportions pᵢ and verify that these are identical with those obtained by the method of maximum likelihood.

By using the transformation

p₁ = (1 − θ₂); \quad p₂ = θ₂(1 − θ₃); \quad p₃ = θ₂θ₃(1 − θ₄); \quad p₄ = θ₂θ₃θ₄

on the joint likelihood of the p's, derive the maximum-likelihood estimates θ₂*, θ₃* and θ₄* of the θ's. Show that for large samples these estimates are uncorrelated, and also obtain their variances. Hence calculate the large-sample variance of a statistic T which is a known function of θ₂*, θ₃* and θ₄*. If n is fixed, but n₁, n₂ and n₃ may be varied, find the optimum values for the n_j which minimize var(T), and then verify that this minimum variance is

\frac{1}{n}\left[(1-\theta_2)\sqrt{\theta_2}\left|\frac{\partial T}{\partial\theta_2}\right| + (1-\theta_3)\sqrt{\theta_3}\left|\frac{\partial T}{\partial\theta_3}\right| + \sqrt{\theta_4}(1-\theta_4)\left|\frac{\partial T}{\partial\theta_4}\right|\right]^2.
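The expression for p₂ in exercise 92 can be verified exactly by enumerating all pairs of points of an M × N lattice and counting those that are contiguous (horizontally, vertically or diagonally adjacent). This is an illustrative check, not part of the exercise:

```python
from fractions import Fraction
from itertools import combinations

def p2_enumerated(M, N):
    # probability that two lattice points chosen at random are contiguous
    pts = [(i, j) for i in range(M) for j in range(N)]
    adj = sum(1 for p, q in combinations(pts, 2)
              if max(abs(p[0] - q[0]), abs(p[1] - q[1])) == 1)
    return Fraction(2 * adj, len(pts) * (len(pts) - 1))

def p2_formula(M, N):
    return Fraction(2 * (4*M*N - 3*(M + N) + 2), M * N * (M*N - 1))

for M, N in [(3, 4), (5, 5), (2, 7)]:
    print(M, N, p2_enumerated(M, N) == p2_formula(M, N))
```
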
94 If a random variable y has a lognormal distribution such that x ≡ log_e y is normally distributed with mean θ (say) and variance σ², prove that for r > 0

E(y^r) = e^{r\theta + \frac{1}{2}r^2\sigma^2},

and thereby obtain explicit expressions for μ and V², the mean and variance of y.

Given a random sample of n observations y₁, y₂, …, y_n from the lognormal population, suppose the corresponding x values x₁, x₂, …, x_n have a mean x̄ and a sample variance s² defined by

ns^2 = \sum_{i=1}^{n}(x_i-\bar x)^2.

Use the infinite series

h(z) = \sum_{k=1}^{\infty}A_k\,\frac{z^{k-1}}{(k-1)!}, \quad\text{where}\quad A_k \equiv (n-1)^{k-1}\Big/\prod_{j=1}^{k-1}(n+2j-3),

to show that

E\left[e^{r\bar x}\,h\!\left(\tfrac{1}{2}r^2s^2\right)\right] = E(y^r),

and thus deduce that efficient estimates of μ and V² are

m = e^{\bar x}\,h\!\left(\tfrac{1}{2}s^2\right) \quad\text{and}\quad v_2 = ….

Finally, by considering an asymptotic expansion of A_k in inverse powers of n, prove that to O(n⁻²)

h(z) \approx e^{z - z^2/n}\left[1 + \frac{z^2(8z+3)}{3n^2}\right],

and hence derive, correct to the same order, large-sample approximations for
m and v₂.

95 In extensive sampling of a multinomial population it occasionally happens that the expected proportion in one of the classes is exceptionally large and, as a measure of sampling economy, it is suggested that only a known fraction of the preponderant class be sampled, whereas the other classes are enumerated in the usual way. In such a situation, suppose the expected proportions in the four classes A₁, A₂, A₃ and A₄ are proportional to (2+θ), (1−θ), (1−θ) and θ respectively, where θ is a parameter which can theoretically have any value in (0 < θ < 1). If it is decided to record only a fraction p, (0 < p < 1), of the A₁ observations, but otherwise the sampling is complete, suppose that in such a censored sample of size M the observed frequencies in the four classes are z₁, z₂, z₃ and z₄ respectively (Σzᵢ = M). Derive the equation for θ*, the maximum-likelihood estimate of θ, and find its large-sample variance.

Further, prove that θ* will be as efficient an estimate of θ as θ̂, the maximum-likelihood estimate of θ obtained from an uncensored sample of N observations, if

M = \frac{(1+2\theta)\{2(1+p) - (1-p)\theta\}^2}{4\{2(1+p) + (1+7p)\theta\}}\cdot N = N\,h(\theta, p), \text{ say},
and hence deduce that, whatever be the value of the parameter θ, as a first approximation for small p,

M \approx \frac{4(1+p)(17+23p)(3+5p)^2}{(7+13p)(13+19p)^2}\cdot N.

96 In a plant-breeding experiment four distinct types of progeny AB, Ab, aB and ab are possible; and in a random sample of N₁ observations, the observed frequencies in the four classes were found to be x₁, x₂, x₃ and x₄ respectively (Σxᵢ = N₁). If on a genetical hypothesis the corresponding expected probabilities are proportional to (2+θ), (1−θ), (1−θ) and θ, determine the equation for θ̂, the maximum-likelihood estimate of the parameter θ, and derive the large-sample variance of this estimate.

Since, in general, the parameter θ lies in the range (0 < θ < 1), it is suggested that another convenient method for its estimation is to ignore completely the preponderant AB class in sampling. Suppose, then, that a total number of N₂ observations are taken from such a truncated distribution, and the observed frequencies in the remaining three classes are obtained as y₂, y₃ and y₄ respectively (Σyᵢ = N₂). Use the method of maximum likelihood to obtain θ*, the estimate of θ based on this second sample, and prove that for large samples

\mathrm{var}(\theta^*) = \frac{\theta(1-\theta)(2-\theta)^2}{2N_2}.

Hence show that θ* must have a variance ≤ var(θ̂) if N₂ = 0·5174 N₁. Further, if θ** denotes the estimate of θ obtained by using the joint likelihood of the above two independent samples, prove that

\mathrm{var}(\theta^{**}) = \frac{2\theta(1-\theta)(2-\theta)^2(2+\theta)}{(1+2\theta)(2-\theta)^2N_1 + 4(2+\theta)N_2}.
Use this result to show that if N₁ + N₂ = N, a constant, then the best allocation of the sample sizes for minimum var(θ**) is obtained for N₁ = 0. Discuss the significance of the results in the light of efficient estimation of θ by the method of maximum likelihood.

97 Suppose the ordered observations of a random sample of size n from a population having a probability density function f(x) in the range (−∞ < x < ∞) are denoted by u₁ < u₂ < u₃ < … < u_n. Further, given a random sample of m from another population, let these m observations be distributed amongst the (n+1) intervals formed by the uᵢ such that m₁ observations are < u₁ …

… If R_b is the proportion of defective items examined on the average in a batch, and R_g is the proportion of non-defective items inspected on the average, then the efficiency of the sampling inspection scheme is E ≡ R_b − R_g. Show that for corr(x, X−x) < 0 and a specified a,

E = \frac{N(N-n)}{\mu(N-\mu)}\sum_{x=0}^{a}\left[\frac{x+1}{n+1}\,P_{n+1}(x) - \frac{\mu}{N}\,P_n(x)\right],

where P_m(x) is the marginal distribution of x for samples of size m.
where Pm(x) is the marginal distribution of x for samples of size m. Assuming that the marginal distributions of both x and X are rectangular, determine the values of n and a which maximize E, and hence prove thaI. for large N, this maximum is -t. Alternatively, if the proportion of items to be inspected is fixed at p, prove that, for large N, the values of 11 and a which maximize E are 11 - JNp/(I-p) and a- Jp(l-p)/N. Finally, suppose that a batch is fully inspected if x > a, but accepted on the basis of the sample if x ::::; a. If, on the average, the initial and after· inspection proportions of defectives in the batch are 0( and [3, and the "aver· age outgoing quality" defined by the ratio [3/(0(+ [3) is fixed at p, prove thaI E is maximized for
n \approx \sqrt{N(\lambda-1)}\cdots \quad\text{and}\quad a \approx \cdots \quad (\lambda \equiv \alpha/\beta),
provided N is large compared to β⁻¹. [It may be assumed that all defective items detected are replaced by non-defectives.]

106 Suppose u₁, u₂ and u₃ are the first, second and third quartiles respectively obtained from an ordered sample of N ≡ (4n+3) (n being an integer ≥ 1) observations of a random variable x having a probability density function f(x) in the range (−∞ < x < ∞). Find the joint distribution of the random variables

P_i = \int_{-\infty}^{u_i}f(x)\,dx \quad\text{and}\quad p_i = \int_{u_i}^{u_{i+1}}f(x)\,dx \quad (i = 1, 2).

An independent sample of m observations from another population with a doubly infinite range is distributed over the four intervals formed by the uᵢ such that m₁ observations are < u₁; mᵢ lie between (uᵢ, u_{i+1}), for (i = 1, 2);
and m₃ observations are > u₃. To test the hypothesis that the two samples are, in fact, from the same population, a statistic w is defined as

w = \sum_{i=1}^{4}(4m_i - m)^2\big/9m^2.

Prove that w always lies between (0, 1), and that

E(w) = \frac{4(N+m+1)}{3m(N+2)} \quad\text{and}\quad \mathrm{var}(w) = \frac{32}{27}\cdot\frac{(m-1)(N+m+1)(N+m+2)(N+1)(N+5)}{m^3(N+2)^2(N+3)(N+4)},

so that the distribution of w may be approximated by a suitable Beta distribution. Also, compare the limiting behaviour of the distribution of w when
(i) N is fixed and m → ∞;
(ii) m is fixed and N → ∞.
Finally, suggest an approximate procedure for using the w statistic when N is large but not necessarily of the form (4n+3).
107 If u denotes the sample median of a random sample of (2n+1) observations from a population having a probability density function f(x) in the range (−∞ < x < ∞), determine the marginal distribution of the random variable

p = \int_{-\infty}^{u}f(x)\,dx.

Suppose an independent sample of m observations is taken from another continuous population with a doubly infinite range, and it is found that m₁ of these observations are < u. Find the joint distribution of m₁ and p, and hence show that the marginal distribution of m₁ satisfies the recurrence relation

P(m_1) = \frac{(m_1+1)(n+m-m_1)}{(m-m_1)(n+m_1+1)}\,P(m_1+1).

Further, use this distribution to prove that the rth factorial moment of m₁ is

E[m_1^{(r)}] = m^{(r)}(n+r)^{(r)}\big/(2n+1+r)^{(r)}.

Hence, or otherwise, determine the mean and variance of m₁. Also, if m and n increase indefinitely in such a way that m/n = λ (> 0), prove that the asymptotic variance of the statistic

y = (2m_1 - m)\Big/\left\{\tfrac{1}{2}m(\lambda+2)\right\}^{1/2}

is

1 - \frac{3\lambda+2}{2(\lambda+2)n} + O(n^{-2}).

108 If
x₁ ≤ x₂ ≤ … ≤ x_n denote the ordered observations of a random sample of size n from a rectangular distribution in the interval (0, a), find the joint distribution of the statistics

u = ½(x_n − x₁) \quad\text{and}\quad v = ½(x_n + x₁).
Hence derive the marginal distribution of v. Prove that v is an unbiased estimate of the population mean, and that its efficiency, as compared with the sample mean x̄ of the n random observations, is

\frac{(n+1)(n+2)}{6n} > 1 \quad\text{for } n \ge 3.

Also, show that the standardized random variable

w = \{v - E(v)\}\big/\sqrt{\mathrm{var}(v)}

has, as n → ∞, the limiting Laplace distribution

\frac{1}{\sqrt{2}}\,e^{-\sqrt{2}|w|}\,dw \quad (-\infty < w < \infty),

whereas the standardized sample mean x̄ has a limiting unit normal distribution. In view of this limiting distributional behaviour, comment on the use of v as an estimate of the mean of the rectangular population.
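Both the unbiasedness of the midrange v and its efficiency (n+1)(n+2)/6n relative to the sample mean are easy to illustrate by simulation; with n = 10 the ratio of variances should be near 2·2 (this sketch is illustrative only):

```python
import random
import statistics

random.seed(5)
a, n, reps = 1.0, 10, 20000
vs, xbars = [], []
for _ in range(reps):
    x = [random.uniform(0, a) for _ in range(n)]
    vs.append(0.5 * (min(x) + max(x)))    # midrange v
    xbars.append(statistics.mean(x))

eff = statistics.variance(xbars) / statistics.variance(vs)
print(statistics.mean(vs), eff)           # approx a/2 and (n+1)(n+2)/(6n) = 2.2
```
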
109 If x̄ is the sample mean obtained from n independent observations from a rectangular population in the range (0, a), prove that x̄ is an unbiased estimate of a/2 with variance a²/12n. Alternatively, suppose the sample observations are ordered as x₁ ≤ x₂ ≤ … ≤ x_n, and a linear function y is defined by

y = \sum_{i=1}^{n}c_ix_i,

where the cᵢ are arbitrary constants. Derive the joint distribution of the ordered observations and use it to establish that

E(x_i^r x_j^s) = \frac{n!\,(j+s-1)!\,(i+r+s-1)!}{(n+r+s)!\,(j-1)!\,(i+s-1)!}\,a^{r+s}

for i > j, and r, s positive integers.

Hence, or otherwise, find the mean and variance of y, and then deduce that y is an unbiased linear estimate of a/2 with minimum variance if c_n = (n+1)/2n and cᵢ = 0 for i ≠ n. Find the distribution of this best estimate of a/2, and then show that, as n → ∞, the standardized unit variable defined by t = {y − E(y)}/√var(y) has the limiting distribution e^{t−1} dt for (−∞ < t ≤ 1). Also, show that for finite n the efficiency of y as compared with the sample mean x̄ is (n+2)/3.
110 If x₁ ≤ x₂ ≤ … ≤ x_n are ordered observations from a population having a probability density function

(k+1)x^k/a^{k+1} \quad\text{for } 0 \le x \le a,

where a and k are positive parameters, find the joint distribution of the xᵢ. Hence prove that, for i > j and r, s positive integers,

E(x_i^r x_j^s) = a^{r+s}\cdot\frac{n(k+1)}{n(k+1)+s+r}\cdot\frac{\Gamma(n)\,\Gamma\!\left(j+\frac{s}{k+1}\right)\Gamma\!\left(i+\frac{r+s}{k+1}\right)}{\Gamma(j)\,\Gamma\!\left(i+\frac{s}{k+1}\right)\Gamma\!\left(n+\frac{r+s}{k+1}\right)},

and use it to evaluate cov(x_j, x_n).

Assuming k to be known, prove that

y = \frac{n(k+1)+1}{n(k+2)}\,x_n

is the best linear estimate of the population mean, and that its efficiency as compared with the sample mean x̄ of n random observations is

\frac{n(k+1)+2}{k+3} > 1 \quad\text{for } n \ge 2.

Also, prove that the standardized random variable

u = \{y - E(y)\}\big/\sqrt{\mathrm{var}(y)}

has, as n → ∞, the limiting distribution e^{u−1} du for (−∞ < u ≤ 1).
111 Given an ordered sample of n observations x₁ ≤ x₂ ≤ … ≤ x_n from a population having a probability density function f(x) for (−∞ < x < ∞), find the joint distribution of x_s and x_t (n ≥ t > s ≥ 1). Use this result to derive the joint distribution of the random variables

P = P(x_s \le x \le x_t) \quad\text{and}\quad Q = P(x \le x_s).

Hence show that the marginal distribution of P depends only on n and the difference (t − s). In particular, if E(P) = a, an assigned number, verify that P has a Beta distribution with parameters (n+1)a and (n+1)(1−a). Indicate how this result may be used to determine the smallest sample size n such that the probability of P lying in a specified interval (b, c) is at least p.

112 Given a random sample of n observations from a normal population with mean m and variance σ², suppose x̄ and s² are the usual unbiased estimates of the population parameters. For any given positive number k, a random variable Q is defined by

Q = \frac{1}{\sigma\sqrt{2\pi}}\int_{\bar x - ks}^{\bar x + ks}e^{-\frac{1}{2}(x-m)^2/\sigma^2}\,dx.

Prove that

E(Q) = \frac{1}{\sqrt{n-1}\,B\!\left(\frac{1}{2},\frac{n-1}{2}\right)}\int_{-t}^{t}\frac{d\theta}{\left(1+\dfrac{\theta^2}{n-1}\right)^{n/2}},

where t ≡ k[n/(n+1)]^{1/2}. Hence deduce that, on the average, a proportion α of the population lies between the limits

\bar x \pm s\,t_\alpha\left(\frac{n+1}{n}\right)^{1/2},

where t_α is the percentage point of Student's t distribution with (n−1) d.f. such that the probability P(−t_α ≤ t ≤ t_α) = α.
Also, show that if E(Q) = α, then to terms of O(n⁻¹)

\mathrm{var}(Q) = \frac{t_\alpha^2\,e^{-t_\alpha^2}}{\pi n}.
113 A number k of similar instruments is used for routine measurement, and the observations made by the ith instrument are x_{ij}, (j = 1, 2, …, nᵢ), for all (i = 1, 2, …, k). Assuming that the x_{ij} are independent normal variables with the same variance σ² but with differing means such that E(x_{ij}) = θᵢ, find the maximum-likelihood estimates of the θᵢ and σ². Show further that, for σ²*, the estimate of σ²,

\mathrm{var}(\sigma^{2*}) = \frac{2(N-k)\sigma^4}{N^2}, \quad\text{where } N \equiv \sum_{i=1}^{k}n_i.

Hence show that
(i) if nᵢ = n, a constant, and k → ∞, then σ²* is an inconsistent estimate of σ²; and
(ii) if k is a fixed constant and all nᵢ → ∞, then σ²* is a consistent estimate of σ².
Comment on these results in the light of R. A. Fisher's claim that "an efficient statistic can in all cases be found by the method of maximum likelihood".
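The inconsistency in case (i) is easy to exhibit numerically: with nᵢ = 2 observations per instrument and many instruments, the ML estimate converges to σ²(n−1)/n = σ²/2 rather than σ². This sketch is illustrative only, and the instrument means chosen below are arbitrary:

```python
import random

random.seed(7)
sigma, n, k = 1.0, 2, 4000        # many instruments, few readings each
rss = 0.0
for i in range(k):
    theta_i = 0.1 * i             # arbitrary per-instrument means
    x = [random.gauss(theta_i, sigma) for _ in range(n)]
    xbar = sum(x) / n
    rss += sum((xi - xbar) ** 2 for xi in x)

s2_mle = rss / (n * k)            # the ML estimate sigma^2*
print(s2_mle)                     # near sigma^2*(n-1)/n = 0.5, not 1.0
```
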
114 If y₁, y₂, …, y_n are n independent observations from a normal population with unknown mean m and variance σ², obtain explicitly the maximum-likelihood estimates ȳ and s² of m and σ² respectively. Further, suppose a new random variable v is defined by

v = s² + (ȳ − m)².

Then, assuming the known form of the joint sampling distribution of ȳ and s², prove that

(i) E\{v^r(\bar y - m)^{2p+1}\} = 0 for fixed r and all integral values of p; and

(ii) E\{v^r(\bar y - m)^{2p}\} = \left(\frac{2\sigma^2}{n}\right)^{p+r}\frac{\Gamma\!\left(\frac{n}{2}+p+r\right)\Gamma\!\left(p+\frac{1}{2}\right)}{\Gamma\!\left(\frac{n}{2}+p\right)\Gamma\!\left(\frac{1}{2}\right)}

for all integral values of p and for all r such that the Γ functions on the right-hand side have meaning.
115 Random samples of sizes n₁, n₂, …, n_k are taken respectively from k univariate normal populations having the same mean m but differing variances σ₁², σ₂², …, σ_k². If the sample observations are denoted by x_{ij} (j = 1, 2, …, nᵢ; i = 1, 2, …, k), prove that m*, the maximum-likelihood estimate of m, is obtained as a root of the equation

\sum_{i=1}^{k}\frac{n_i(\bar x_i - m^*)}{s_i^2 + (\bar x_i - m^*)^2} = 0, \quad\text{where}\quad n_i\bar x_i \equiv \sum_{j=1}^{n_i}x_{ij} \quad\text{and}\quad n_is_i^2 \equiv \sum_{j=1}^{n_i}(x_{ij} - \bar x_i)^2.

More generally, suppose an estimate m̂ of m is obtained as a solution of the equation

\sum_{i=1}^{k}\frac{w_i(\bar x_i - \hat m)}{s_i^2 + (\bar x_i - \hat m)^2} = 0,

in which the weights wᵢ are constants so chosen as to ensure that m̂ is also a consistent estimate of m. Prove that the asymptotic variance of m̂ as k → ∞ is

\mathrm{var}(\hat m) = \sum_{i=1}^{k}\frac{w_i^2}{(n_i-2)\sigma_i^2}\Bigg/\left\{\sum_{i=1}^{k}\frac{w_i}{\sigma_i^2}\right\}^2.

Hence deduce that if wᵢ = (nᵢ − 2) for all i, then

\mathrm{var}(m^*) = \mathrm{var}(\hat m) + \frac{1}{\left\{\sum_{j=1}^{k}\alpha_j\right\}^2}\sum_{i=1}^{k}\beta_i\left(\frac{\alpha_i}{\beta_i} - \theta\right)^2,

where

\alpha_i \equiv n_i/\sigma_i^2; \quad \beta_i \equiv (n_i-2)/\sigma_i^2; \quad\text{and}\quad \theta \equiv \sum_{i=1}^{k}\alpha_i\Big/\sum_{i=1}^{k}\beta_i,

so that var(m*) > var(m̂), except when the nᵢ are all equal.

Further, as another consistent estimate of m, consider the weighted average

\bar x = \sum_{i=1}^{k}n_i\bar x_i\Big/\sum_{i=1}^{k}n_i.

Find the variance of x̄ and hence verify that if the nᵢ are all equal to n, then
(i) var(x̄) < var(m̂) if the σᵢ are equal; but
(ii) for unequal σᵢ, var(x̄) > var(m̂), provided n is sufficiently large.
116 Given a random sample of (2n+1) observations from a population having a probability density function f(x) in (−∞ < x < ∞), prove that, as a first approximation, the sample median is normally distributed with mean m and variance 1/[8nf²(m)], m being the population median. Hence deduce that
(i) if x is normally distributed, then the sample median has an asymptotic efficiency of 0·64 as compared with the average of the (2n+1) observations;
(ii) if x has the Cauchy distribution, then the asymptotic efficiency of the sample median as compared with the maximum-likelihood estimate of m based on a sample of size (2n+1) is 0·81.

117 A random sample of n observations is given from a population with the probability density function

\frac{\alpha^\nu}{\Gamma(\nu)}\,e^{-\alpha x}x^{\nu-1}, \quad\text{for } 0 \le x < \infty, …

… P(u_1 > 0, v_1 > 0; u_3 > 0, v_3 > 0) = \frac{1}{4\pi^2}\left[(\sin^{-1}\rho)^2 - (\sin^{-1}\tfrac{\rho}{2})^2\right] + \frac{1}{4\pi}\left[\sin^{-1}\rho - \sin^{-1}\tfrac{\rho}{2}\right] + \text{constant},

and hence deduce that

\mathrm{cov}(C_j, C_{j+1}) = \frac{1}{36} - \cdots\left(\sin^{-1}\tfrac{\rho}{2}\right)\cdots.

Finally, use these results to verify that …
126 In the previous example, suppose an alternative scoring system is adopted such that

C_j = 1 if either (D_j > 0, d_j > 0) or (D_j < 0, d_j < 0), and C_j = −1 otherwise.

If now a new correlation statistic is defined as

g = \sum_{j=1}^{n-1}C_j\big/(n-1),

prove that

E(g) = \frac{2}{\pi}\sin^{-1}\rho,

and

\mathrm{var}(g) = \frac{1}{(n-1)^2}\left[(n-1)\left\{1 - \left(\frac{2}{\pi}\sin^{-1}\rho\right)^2\right\} + 2(n-2)\,\mathrm{cov}(C_j, C_{j+1})\right],

the covariance being that obtained in the previous example.

Further, if an estimate p* of ρ is obtained as
p* = \sin(\pi g/2),

show that for ρ = 0 the efficiency of p* as compared with the product-moment correlation coefficient r is, for large samples, ≈ 36/11π².

127 Suppose (x₁, y₁), (x₂, y₂), …, (x_n, y_n) are random observations from a bivariate normal population such that the differences uᵢ = (xᵢ − yᵢ) have mean μ and variance σ² for all (i = 1, 2, …, n). Using the signs of the uᵢ, a statistic

S = \sum_{i=1}^{n}s_i\big/n

is defined, where the sᵢ are random variables such that sᵢ = 1 if uᵢ > 0 and sᵢ = 0 otherwise. Prove that

E(S) = \tfrac{1}{2} + \phi(\tau) \quad\text{and}\quad \mathrm{var}(S) = \frac{1}{n}\left\{\tfrac{1}{4} - \phi^2(\tau)\right\},

where

\tau \equiv \mu/\sigma \quad\text{and}\quad \phi(\tau) \equiv \frac{1}{\sqrt{2\pi}}\int_0^{\tau}e^{-y^2/2}\,dy.

Hence, if an estimate τ* of τ is obtained as a solution of the equation

\phi(\tau^*) = \frac{2S-1}{2},

show that for large samples

\mathrm{var}(\tau^*) \approx \frac{2\pi}{n}\,e^{\tau^2}\left\{\tfrac{1}{4} - \phi^2(\tau)\right\}.
Further, if τ̂ denotes the usual estimate of τ (sample mean of the uᵢ divided by the sample standard deviation), prove that

E(\hat\tau) = \tau C_n, \quad\text{where}\quad C_n \equiv \left(\frac{n-1}{2}\right)^{1/2}\Gamma\!\left(\frac{n-2}{2}\right)\Big/\Gamma\!\left(\frac{n-1}{2}\right),

and

\mathrm{var}(\hat\tau) = \frac{n-1}{n(n-3)} + \tau^2\left[\frac{n-1}{n-3} - C_n^2\right] \approx \frac{\tau^2+2}{2n} \quad\text{for large } n,

so that the asymptotic efficiency of τ* as compared with τ̂ is

\frac{(\tau^2+2)\,e^{-\tau^2}}{\pi\{1 - 4\phi^2(\tau)\}}.
128 In the administration of a lethal drug to a random group of n animals the measurement of interest is x, the time to death of each animal; but for reasons of economy it is not expedient to wait till all the animals die, and the sampling procedure is terminated after a fixed time T. Suppose, then, that in the sample of n observations r of them, x₁, x₂, …, x_r, were found to be < T, there being (n−r) survivors at time T. Assuming that x is normally distributed with unknown mean μ and variance σ², determine the maximum-likelihood equations for σ̂ and θ̂, the estimates of σ and θ ≡ (T−μ)/σ respectively. Hence show that

\hat\sigma = \frac{p(T-\bar x)}{p\hat\theta + (1-p)g(\hat\theta)} \quad\text{and}\quad g(\hat\theta) = \frac{p}{2(1-p)v^2}\left[\{2(v^2+\hat\theta^2)\}^{1/2}v - (v^2-2)\hat\theta\right],

where p is the observed proportion of deaths within time T; x̄ is the average of the xᵢ;

v^2 \equiv 4\sum_{i=1}^{r}(x_i-T)^2\big/r(\bar x-T)^2; \quad\text{and}\quad g(\theta) \equiv e^{-\theta^2/2}\Big/\int_{\theta}^{\infty}e^{-z^2/2}\,dz.
129 Suppose x₁, x₂, …, x_n are random observations from a normal population with mean μ and variance σ². If it is known that the sampled population is truncated at a point T, so that all xᵢ are < T, determine the maximum-likelihood equations for estimating the parameters σ and θ ≡ (T−μ)/σ. Hence show that for their estimates σ̂ and θ̂

\hat\sigma = (T-\bar x)\big/\{\hat\theta + g(\hat\theta)\} \quad\text{and}\quad g(\hat\theta) = \frac{1}{2}\left[\{2(v^2+\hat\theta^2)\}^{1/2}v - (v^2-2)\hat\theta\right],

where x̄ is the sample mean, v² is defined as in the previous example (with r = n), and

g(\theta) \equiv e^{-\theta^2/2}\Big/\int_{-\infty}^{\theta}e^{-z^2/2}\,dz.
130 In quality-control inspection of a mass-produced article, interest centres on estimating the unknown proportion p of defective items in the product; but the product is randomly grouped into batches of N items each, and sampling inspection consists of taking and inspecting a random sample from any one batch. Suppose, then, that the product is represented by a finite population of Mp white (defective) and Mq black (non-defective) balls, where p+q ≡ 1. A random batch of N balls is taken from this population, and then a random sample of n balls is drawn from the batch. If the number of white balls in the batch is X (not known) and that observed in the sample is x, find the joint probability distribution of the random variables X and x when

(i) both the batch and the sample are obtained by sampling with replacement;
(ii) the batch is obtained by sampling with replacement, and the sample by sampling without replacement;
(iii) the batch is obtained by sampling without replacement, and the sample by sampling with replacement; and
(iv) both the batch and the sample are obtained by sampling without replacement.

Hence show that in each of the above sampling schemes p* = x/n is an unbiased estimate of p, but that the corresponding variances of p* are

(i) (pq/n)[1 + (n−1)/N];  (ii) pq/n;
(iii) (pq/n)[1 + ((n−1)/N)·((M−N)/(M−1))];  (iv) (pq/n)[1 − (n−1)/(M−1)].
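Two of these variance formulae, (ii) and (iv), can be confirmed by exact enumeration with exact rational arithmetic; the sizes M, N, n below are arbitrary illustrative assumptions for the check.

```python
from fractions import Fraction
from math import comb

def hyper(npop, k, n):
    # exact hypergeometric pmf as fractions
    lo, hi = max(0, n - (npop - k)), min(k, n)
    return {x: Fraction(comb(k, x) * comb(npop - k, n - x), comb(npop, n))
            for x in range(lo, hi + 1)}

def binom(n, p):
    # exact binomial pmf with a Fraction success probability
    return {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

def mix(batch, sample):
    # distribution of the sample count x, mixing over the batch count X
    out = {}
    for X, pX in batch.items():
        for x, px in sample(X).items():
            out[x] = out.get(x, Fraction(0)) + pX * px
    return out

def var_pstar(dist, n, p):
    # exact variance of p* = x/n about the true p (p* is unbiased)
    return sum((Fraction(x, n) - p) ** 2 * w for x, w in dist.items())

M, Mw, N, n = 12, 4, 6, 3            # illustrative sizes (assumed)
p = Fraction(Mw, M); q = 1 - p

d2 = mix(binom(N, p), lambda X: hyper(N, X, n))      # scheme (ii)
assert var_pstar(d2, n, p) == p * q / n

d4 = mix(hyper(M, Mw, N), lambda X: hyper(N, X, n))  # scheme (iv)
assert var_pstar(d4, n, p) == (p * q / n) * (1 - Fraction(n - 1, M - 1))
```

The equalities are exact, not merely numerical, since all probabilities are kept as fractions.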
131 In an investigation into consumer preference for the type of fuel used for central heating of homes, suppose that in a total population of N houses the proportions using gas, oil, and various kinds of solid fuels are p, q and (1−p−q) respectively. In order to estimate the parametric ratio p/q, a random sample of n houses is taken, the sampling being without replacement, and it is observed that the frequencies of gas and oil users are x and y respectively. Find the sampling distribution of x and y, and then show that

E[x/(y+1)] = [Np/(Nq+1)]·[1 − (N−Nq−1)^(n)/N^(n)] ∼ (p/q)[1 − (1−q)ⁿ] for large N.

Also, prove that for positive integral values of r and s

E[x^(r) y^(s)] = n^(r+s)·(Np)^(r)·(Nq)^(s)/N^(r+s).

Hence, or otherwise, deduce that if nq is large then the coefficient of variation of x/(y+1) is approximately

[(N−n)(p+q)/((N−1)npq)]^(1/2).
ESTIMATION, SAMPLING DISTRIBUTIONS, INFERENCE, ETC.

132 Suppose x and y are random variables having means m₁, m₂, equal variance σ², and correlation ρ > 0, and for any given θ two new random variables u and v are defined by

u = x cos θ + y sin θ;  v = −x sin θ + y cos θ.

Prove that corr(u, v) = ρ cos 2θ/(1 − ρ² sin² 2θ)^(1/2). Hence deduce that
(i) corr(u, v) ≤ corr(x, y); and
(ii) (x+y)/√2 and (y−x)/√2 are uncorrelated random variables.

133 Apart from the usual random errors of measurement, an instrument has two independent and additive sources of bias of unknown magnitude which cannot be totally eliminated by suitable adjustment of the instrument. However, according to the manufacturer's specifications, it is possible to set the instrument so as to make the biases either positive or negative. Four sets of n independent measurements each are made of a quantity by setting the instrument in the four possible combinations for the biases. Denoting the sets by xᵢ, yᵢ, zᵢ and wᵢ, (i = 1, 2, …, n), it may be assumed that

E(xᵢ) = α + β₁ − β₂;  E(yᵢ) = α − β₁ + β₂;  E(zᵢ) = α + β₁ + β₂;  E(wᵢ) = α − β₁ − β₂,

for all i, where α is the true magnitude of the quantity measured, β₁, β₂ are the unknown instrumental biases, and all the observations have the same variance σ².

Find the least-squares estimates of the parameters, and verify that the expectation of the error sum of squares is (4n−3)σ². Hence, assuming that the observations are normally distributed, indicate, without proof, the generalized t statistic for testing the hypothesis that the two instrumental biases are equal.
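The inequality of Exercise 132 can be checked directly on a small discrete bivariate distribution with equal variances; the four equally likely points below are an arbitrary illustration.

```python
import math

# toy bivariate population with equal variances and positive correlation (assumed)
pts = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (2.0, 0.0)]

def moments(data):
    n = len(data)
    mx = sum(a for a, _ in data) / n
    my = sum(b for _, b in data) / n
    vx = sum((a - mx) ** 2 for a, _ in data) / n
    vy = sum((b - my) ** 2 for _, b in data) / n
    cov = sum((a - mx) * (b - my) for a, b in data) / n
    return vx, vy, cov

vx, vy, cov = moments(pts)
assert abs(vx - vy) < 1e-12            # equal variances, as the exercise assumes
rho = cov / math.sqrt(vx * vy)
assert rho > 0

for k in range(1, 50):                 # sweep theta over (0, pi/2)
    th = k * math.pi / 100
    rot = [(a * math.cos(th) + b * math.sin(th),
            -a * math.sin(th) + b * math.cos(th)) for a, b in pts]
    vu, vv, cuv = moments(rot)
    assert cuv / math.sqrt(vu * vv) <= rho + 1e-12   # corr(u,v) <= corr(x,y)

# theta = pi/4 gives ((x+y)/sqrt2, (y-x)/sqrt2): uncorrelated
th = math.pi / 4
rot = [(a * math.cos(th) + b * math.sin(th),
        -a * math.sin(th) + b * math.cos(th)) for a, b in pts]
assert abs(moments(rot)[2]) < 1e-12
```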
134 The lifetime x of electric bulbs produced by a factory is known on empirical considerations to have the distribution θe^(−θx) dx, for (0 < x < ∞), where θ is an unknown parameter. Prove that the probability that a randomly selected bulb has a lifetime greater than a constant T is e^(−Tθ). To estimate θ, a random sample of n bulbs is tested, and in order to save sampling cost, the actual lifetimes of the bulbs are not observed. Instead, at time T it is noted that r < n bulbs are burnt out, so that there are (n−r) bulbs in the sample having a lifetime > T. Determine the probability distribution of the random variable r and hence show that the maximum-likelihood estimate of θ is

θ̂ = (1/T)·log[n/(n−r)].

Also, prove that the large-sample variance of θ̂ is

var(θ̂) = (e^(Tθ) − 1)/nT²,

and find an approximate value for the variance of 1/θ̂.

135 In destructive life-testing of various kinds of physical equipment like ball bearings, radio tubes, electric light bulbs etc., it is known that the lifetime x of an individual item has the negative exponential distribution

θ⁻¹·e^(−x/θ)·dx, for x ≥ 0,

where θ, the mean lifetime, is an unknown parameter. If, in general, n items are put on a test rack, it is economical to stop experimentation after the first r < n failures have been observed. It is also
theoretically advantageous for efficient estimation of θ to use the fact that the observations occurred in an ordered manner. Denoting the r observations by x₁ ≤ x₂ ≤ … ≤ x_r, prove that the maximum-likelihood estimate of θ is

θ̂ = [Σⱼ₌₁^r xⱼ + (n−r)x_r]/r.

By using the transformation y₁ = x₁; yⱼ = xⱼ − xⱼ₋₁, for 2 ≤ j ≤ r, find the joint distribution of the y's. Hence deduce that 2rθ̂/θ has a χ² distribution with 2r d.f., so that the maximum-likelihood formula gives the exact variance of θ̂ as θ²/r.

Use these results to prove that
(i) θ̂ is fully efficient as compared with θ*, the usual maximum-likelihood estimate of θ based on a completely enumerated sample of r items; and
(ii) for the random variable x_r,

E(x_r) = θ Σⱼ₌₁^r 1/(n−j+1);  var(x_r) = θ² Σⱼ₌₁^r 1/(n−j+1)².
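The expression for E(x_r) in part (ii) can be checked numerically for θ = 1 by integrating the order-statistic density with Simpson's rule; n = 8 and r = 5 below are arbitrary choices.

```python
import math

def exp_order_stat_mean(n, r, theta=1.0):
    # E(x_(r)) for the r-th smallest of n iid exponentials, by quadrature;
    # density: n!/((r-1)!(n-r)!) (1-e^{-x/th})^{r-1} e^{-(n-r+1)x/th}/th
    c = math.comb(n, r) * r                  # n!/((r-1)!(n-r)!)
    def f(x):
        return (c * (1 - math.exp(-x / theta)) ** (r - 1)
                  * math.exp(-(n - r + 1) * x / theta) / theta)
    a, b, m = 0.0, 60.0 * theta, 20000       # Simpson's rule on [0, 60*theta]
    h = (b - a) / m
    s = sum((1 if i in (0, m) else (4 if i % 2 else 2)) * (a + i * h) * f(a + i * h)
            for i in range(m + 1))
    return s * h / 3

n, r = 8, 5
expected = sum(1.0 / (n - j + 1) for j in range(1, r + 1))  # theta * sum 1/(n-j+1)
assert abs(exp_order_stat_mean(n, r) - expected) < 1e-6
```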
136 Suppose that y₁, y₂, …, y_n are n independent observations such that E(y_ν) = α + β₁x₁ν + β₂x₂ν and var(y_ν) = σ², for (ν = 1, 2, …, n), where α, β₁, β₂ and σ² are unknown parameters, and x₁, x₂ are non-random explanatory variables. If β₁* and β₂* are the usual least-squares estimates of β₁ and β₂ respectively, prove that

var(β₁*) = σ²/(1−r²)S₁₁;  var(β₂*) = σ²/(1−r²)S₂₂,

where, denoting deviations from the sample means by lower-case symbols,

Sᵢⱼ ≡ Σ_{ν=1}^n xᵢν xⱼν, (i, j = 1, 2),

and r is the sample product-moment correlation between x₁ and x₂.

Next, suppose an independent unbiased estimate b₁ of β₁ is given such that var(b₁) = σ₁², and s₁² is an unbiased estimate of σ₁². By considering the deviations (y_ν − b₁x₁ν), prove that a simple least-squares estimate of β₂ is

β̂₂ = (S₂y − b₁S₁₂)/S₂₂, where Sᵢy ≡ Σ_{ν=1}^n yν xᵢν, (i = 1, 2).

Show that β̂₂ is an unbiased estimate of β₂ and that

var(β̂₂) = (σ² + r²σ₁²S₁₁)/S₂₂,

so that β̂₂ has smaller variance than β₂* if σ₁² < σ²/(1−r²)S₁₁. Find the expectation of the residual S.S.

Σ_{ν=1}^n (y_ν − b₁x₁ν − β̂₂x₂ν)²,

and use it to verify that an unbiased estimate of σ² is

s² = [Σ_{ν=1}^n (y_ν − b₁x₁ν − β̂₂x₂ν)² − (1−r²)s₁²S₁₁]/(n−2).
Finally, under assumptions of normal theory, show how this estimate can be used for an approximate large-sample test of significance for any hypothesis about β₂.

137 In the distribution of the number of accidents per worker in a factory over a given period of time, the frequency of workers sustaining one, two or more accidents is available, but the number of persons who did not have an accident cannot be enumerated owing to the factory population fluctuating during that period. This gives rise to a discrete distribution in which the zero group is unobserved.

If it may be assumed that this truncated distribution is Poisson with an unknown mean λ, and that in a sample of N the observed frequency for x accidents per worker is f_x, for x ≥ 1, show that a simple estimate of λ based on the method of moments is

λ* = m₂′/m₁′ − 1,

where m_r′ is the rth sample moment about the origin. Prove that for large N

E[log(1+λ*)] = log(1+λ) − (1−e^(−λ))(3λ+4)/[2N(1+λ)²] + O(N⁻²),

and, as a first approximation, var(λ*) = (1−e^(−λ))(λ+2)/N. Further, as an alternative estimation procedure, show that the maximum-likelihood estimate λ̂ is obtained as a solution of

x̄ = λ̂(1−e^(−λ̂))⁻¹,

x̄ being the average of the truncated sample. Hence deduce an approximation for λ̂ when x̄ is large, and verify that for large samples the efficiency of λ* as compared with λ̂ is

λ(e^λ−1)/[(λ+2)(e^λ−λ−1)].

Discuss the behaviour of this efficiency for variation in λ.

138 The first k (≥1) classes of the lower tail of a Poisson distribution with unknown mean λ are truncated, and a random sample of N observations is obtained from the normalized distribution. If m_r′ denotes the rth sample moment about the origin, prove that a simple estimate of λ based on the method of moments is

λ* = (m₂′ − k·m₁′)/(m₁′ − k + 1).
Also, for large N, show that

E{log(λ*+k)} = log(λ+k) − λ{(3λ+k+1)(μ₁′−k+1) + 2μ₁′}/[2N(λ+k)²(μ₁′−k+1)²] + O(N⁻²),

and, as a first approximation,

var(λ*) = λ{(λ−k+1)(μ₁′−k+1) + 2μ₁′}/[N(μ₁′−k+1)²],

where μ₁′ is the mean of the truncated distribution.
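Both moment estimates, λ* = m₂′/m₁′ − 1 for the zero-truncated case of Exercise 137 and the k-truncated form above, can be checked against exact population moments; λ = 3 and the truncation points below are arbitrary test values.

```python
import math

def trunc_poisson_moments(lam, k, upper=200):
    # exact first two moments of a Poisson(lam) truncated to x >= k
    term = math.exp(-lam)          # Poisson pmf at x = 0
    norm = m1 = m2 = 0.0
    for x in range(upper):
        if x >= k:
            norm += term
            m1 += x * term
            m2 += x * x * term
        term *= lam / (x + 1)      # pmf recursion p(x+1) = p(x)*lam/(x+1)
    return m1 / norm, m2 / norm

lam = 3.0
m1, m2 = trunc_poisson_moments(lam, 1)            # zero class unobserved
assert abs(m2 / m1 - 1 - lam) < 1e-9              # lambda* recovers lambda
assert abs(m1 - lam / (1 - math.exp(-lam))) < 1e-9  # ML equation for the mean

for k in (2, 3, 4):                               # first k classes truncated
    m1, m2 = trunc_poisson_moments(lam, k)
    assert abs((m2 - k * m1) / (m1 - k + 1) - lam) < 1e-9
```

At k = 1 the general estimate reduces to m₂′/m₁′ − 1, which is the consistency the assertions exploit.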
Alternatively, if x̄ denotes the sample mean, prove that the maximum-likelihood estimate λ̂ of λ is obtained as a root of the equation

x̄ − λ̂ = 1 / [Σ_{r=0}^∞ B(k, r+1)·λ̂^r/r!],

B(k, r+1) being complete Beta functions in standard notation. Hence verify that for large samples the efficiency of λ* as compared with λ̂ is

λ(μ₁′−k+1)² / [{μ₁′ − (μ₁′−λ)(μ₁′−k+1)}·{2μ₁′ + (λ−k+1)(μ₁′−k+1)}].

139 If X and Y are independent random variables with means m₁, m₂ and variances σ₁², σ₂² respectively, prove that the variance of the product XY is

σ₁²σ₂² + m₁²σ₂² + m₂²σ₁².

Hence, or otherwise, find the variance of the product α̂β̂, where α̂ and β̂ are the usual least-squares estimates of the parameters α and β in the regression equation

E(y) = α + β(x − x̄),

based on n random observations having equal variance σ². Further, if s² is the least-squares estimate of σ², and ns_x² is the S.S. of the sample values of the explanatory variable x, show that under normal theory assumptions

S² = s²(s² + nα̂² + nβ̂²s_x²)/(n²s_x²)

is a biased estimate of var(α̂β̂), and that the bias, relative to var(α̂β̂), is

2[(n−1)/(n−2)]·σ²/(σ² + nα² + ns_x²β²).
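The product-variance identity of Exercise 139 can be verified exactly on small independent discrete distributions; the supports and weights below are arbitrary.

```python
from fractions import Fraction

# two small independent discrete distributions (toy values, assumed)
X = {0: Fraction(1, 4), 1: Fraction(1, 2), 3: Fraction(1, 4)}
Y = {1: Fraction(1, 3), 2: Fraction(1, 3), 5: Fraction(1, 3)}

def mean_var(d):
    m = sum(v * p for v, p in d.items())
    return m, sum((v - m) ** 2 * p for v, p in d.items())

m1, v1 = mean_var(X)
m2, v2 = mean_var(Y)

# exact distribution of the product XY under independence
prod = {}
for a, pa in X.items():
    for b, pb in Y.items():
        prod[a * b] = prod.get(a * b, Fraction(0)) + pa * pb
_, var_xy = mean_var(prod)

assert var_xy == v1 * v2 + m1 ** 2 * v2 + m2 ** 2 * v1
```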
140 For a discrete random variable Y having a negative binomial distribution, the point probabilities are

P(Y = r) = C(m+r−1, r)·p^r·q^(−(m+r)),

for all integers r ≥ 0, where m is a known integral parameter, and p is an unknown parameter such that q − p = 1. If this distribution is truncated by the exclusion of the zero class, prove that for the random variable X of the normalized truncated distribution, the sth factorial moment is

E[X^(s)] = (m+s−1)^(s)·p^s/(1 − q^(−m)).

Further, suppose a random sample of size N is taken from this truncated distribution and the sample mean is found to be x̄. Show that the equation for p̂, the maximum-likelihood estimate of p, is

p̂ = (x̄/m)[1 − (1+p̂)^(−m)],

and that, for large N,

var(p̂) = pq(1−q^(−m))² / {Nm[1 − (q+mp)q^(−(m+1))]}.
Alternatively, prove that

E(X²) = {1 + (m+1)p}·E(X),

and hence deduce that a moment estimate of p is

p* = (V₂ − x̄)/[(m+1)x̄],

where V₂ is the sample second moment about the origin. For large samples, verify that

(i) E[log{1+(m+1)p*}] = log{1+(m+1)p} − (m+1)q(1−q^(−m))(3mp+5q−1)/[2Nm(mp+q)²] + O(N⁻²),

and (ii) the efficiency of p* as compared with p̂ is

mp(1−q^(−m)) / [(mp+3q−1){1 − (mp+q)q^(−(m+1))}].
141 An infinite biological population consists of three kinds of distinct individuals classified according to the three possible pairs AA, Aa and aa of a doubly inherited character (A, a), with probabilities (1−θ)², 2θ(1−θ) and θ² respectively, θ being an unknown parameter. If θ̂ denotes the maximum-likelihood estimate of θ obtained from a random sample of size n from the population, prove that the variance of the estimate is exactly θ(1−θ)/2n.

However, it is not always possible to identify correctly the AA and Aa individuals in the population without further time-consuming experimentation. Accordingly, to avoid errors of misclassification, the sample data are classified only in the two distinct groups (AA or Aa) and aa. Show that in this case θ*, the maximum-likelihood estimate of θ, has the large-sample variance (1−θ²)/4n. Hence deduce the efficiency of θ* as compared with θ̂, and discuss the relative merits of the two methods for estimating θ.

142 The probability density function f(x) of a continuous random variable X is proportional to

e^(arctan(x−α))/[1 + (x−α)²]^((p+2)/2), (−∞ < X < ∞),

where α, a real number, and p ≥ 2 are parameters. Prove that the proportionality factor is 1/I(p), satisfying the recurrence relation

(p+1)(p+2)·I(p+2) = [1 + (p+2)²]·I(p), with I(0) ≡ 2 sinh π/2.

Use this result to derive the expectation of ∂² log f(x)/∂α², and hence deduce that the large-sample variance of the maximum-likelihood estimate of α obtained from a random sample of n observations of X is

[1 + (p+4)²]/[n(p+1)(p+2)(p+4)].

Also, show that the sample mean x̄ is not an unbiased estimate of α but is asymptotically consistent for p → ∞.
143 The particles emitted by a radioactive source are registered as a function of time, and it may be assumed that the probability of a particle being emitted in (t, t+dt) is λ dt, for 0 < t

(xᵢ, yᵢ), (i = 1, 2, …, n), find α̂, the maximum-likelihood estimate of α, and determine
its sampling distribution. Hence prove that the amount of information obtained from a single observation of α̂ is

(n/α²)·n/(n+1),

so that the maximum-likelihood estimate of α does not extract all the information contained in the sample.

147 Two independent random variables x and y have the joint probability density function

f(x, y) = e^(−(θx + y/θ)), for x, y ≥ 0,

θ being an unknown parameter. Prove that the information supplied by a pair of observations of x and y for the estimation of θ is 2/θ², and hence deduce the total information obtained from n independent pairs (xᵢ, yᵢ), for i = 1, 2, …, n. Further, given the sample observations, show that the maximum-likelihood estimate of θ is

t = (Y/X)^(1/2),

where X and Y are the sums of the x and y sample values respectively. Determine the joint distribution of X and Y. Use the transformation

X = u/t and Y = ut

to derive the joint distribution of u and t. Hence, or otherwise, obtain the marginal distribution of t. Show that the amount of information to be expected from a single observation of t is

(2n/θ²)·2n/(2n+1),

so that the maximum-likelihood estimate does not extract all the information in the sample for the estimation of θ. Comment on this result.
148 Show that for a single random observation from the Laplace distribution

f(x) = ½e^(−|x−θ|), (−∞ < x < ∞),

the information for the estimation of θ is unity. Suppose x₁, x₂, …, x_n, (n = 2s+1 and s > 1), are random observations from the above distribution. Prove that the sample median u is the maximum-likelihood estimate of θ, and deduce the sampling distribution of this estimate. Further, show that

∫₀¹ t^s(2−t)^s dt = 2^(2s)·B(s+1, s+1),

and use it to prove that the amount of information provided by the sample median u for estimating θ is

[(s+1)(2s+1)/(s−1)]·[1 − (2s)!/(2^(2s−1)(s!)²)].

Hence show that the amount of information lost in using this estimate is

[2(2s+1)/(s−1)]·[(s+1)(2s)!/(2^(2s)(s!)²) − 1] ∼ 4[(s/π)^(1/2)/2 − 1] for large s,

so that as s → ∞ this loss also → ∞. Discuss this result.
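The integral identity used for the median's information can be checked numerically; math.gamma supplies the Beta function, and the values of s are arbitrary.

```python
import math

def lhs(s, m=20000):
    # Simpson's rule for the integral of t^s (2-t)^s over (0, 1)
    h = 1.0 / m
    tot = 0.0
    for i in range(m + 1):
        t = i * h
        w = 1 if i in (0, m) else (4 if i % 2 else 2)
        tot += w * (t ** s) * (2 - t) ** s
    return tot * h / 3

def beta(a, b):
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

for s in (1, 2, 3, 5, 8):
    assert abs(lhs(s) - 2 ** (2 * s) * beta(s + 1, s + 1)) < 1e-9
```

For s = 1 both sides equal 2/3, which is a quick hand check of the identity.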
149 If x₁, x₂, …, x_n are independent observations from a normal distribution with mean μ and variance σ², find the joint distribution of the random variables

y₁ = x̄√n and yᵢ = Σⱼ₌₁ⁿ cᵢⱼxⱼ, for 2 ≤ i ≤ n,

assuming that the x → y transformation is unitary orthogonal. Next, by considering the transformation

yᵢ = √n·s·zᵢ, for 2 ≤ i ≤ n,

where

ns² ≡ Σⱼ₌₁ⁿ (xⱼ−x̄)² and nx̄ ≡ Σⱼ₌₁ⁿ xⱼ,

determine the joint distribution of x̄, s and z₂, z₃, …, z_{n−1}. Hence, if

m_λ ≡ Σⱼ₌₁ⁿ (xⱼ−x̄)^λ/n,

show that x̄, s and m_ν m₂^(−ν/2), (ν > 2), are independently distributed. Finally, use this result to find the mean and variance of

g₁ = m₃m₂^(−3/2) and g₂ = m₄m₂^(−2) − 3.
lit may be assumed that if u and v are correlated unit normal variables with correlation p, then E(U 3V 3) = 3p(3+2p2); E(U 4V4) = 3(3+24p2+8p4).] ISO Particles are emitted randomly from a radioactive source, and it may be assumed that the probability of a particle being emitted in the time-interval (I, I +dt) is 0- 1 • e- I / II • dt, for t ~ 0, () being the unknown decay parameter of the source. Find the probability of a particle being emitted in the interval (0, T), where T is a constant, and hence deduce that the probability of a particle, which is emitted in (0, T), being emitted in (t, t + dt) is
e- I /II dt O(1-e-.1.)'
(.:t == T/O).
If t10 t 2 , •• • , tn are the emission times of particles from independent sources under observation in (0, T), find the equation of 9, the maximum-likelihood
108
EXERCISES IN PROBABILITY AND STATISTICS
estimate of (). Also, show that the information per observation for estimatin () is g 1{ '2 -), } ()2 1A)2 •
(t-:
Further, assuming there are a total of N sources under observation in the ex~r~me":t, prove that the expected information for the whole period per umt ttme IS N [(I-e-),)2 - A,2 e-),] ()3
A,(l-e-),)
.
Hence show that the most efficient value of the time-interval T is approxi. mately 4(). 151 There are k events E l , E 2 , •• • ,Ek , and the probability of the occurrence of Er is Pro (0 < Pr < 1), for (r = 1,2, ... , k). In Ar independent trials the event Er occurred Qr times, (r = 1,2, ... , k). Further, a combined event T is defined as the joint occurrence of E l , E 2 , • •• ,Ek with probability P, and in C indepen. dent trials T occurred c times. Assuming that the events Er are independent, find the maximum-likelihood estimates of Pr and P. Hence, for large samples, obtain a Xl statistic for testing the hypothesis of independence of the events Er • 152 If Xl ~ Xl ~ ••• ~ Xn are ordered observations from a population with probability density functionJ(x) defined for ( - 00 < X < 00), prove that for the sample range w
f
E(w)
=
[1- (Xn -(I-a)"] dx,
where a ==
-00
J(t) dt.
-00
Further, denoting E(w) by E[(w-£O)']
f x
00
£0,
show that the rth central moment of w is
= -(r-I)( -(0)' + +r(r-I)
ff 00
Xn
-00
-co
[1-a:-(I-al)n+(an-al)"](W-£O)r-ldxldxn,
where the probabilities P(x ~ x.)
== al and P(x
~
xn) ==
a..
153 If Xl ~ Xl are two ordered observations from a normal distribution with mean m and variance (fl, find the joint distribution of Xl and Xl' Hence show that the distribution of the difference u = Xl -Xl is
1C . e -u1 /4a1 . du, (u
(fv 1C
~
0).
Further, if Sl is an independent unhiased f"stimate of (T2 with v dJ., show that the significance of u may be tested uy the statistic ul /2s1, which has the" lbstribution with (1, v) dJ. Suppose Xl and Xl are marks obtained by two students in a competitive examination. Discuss how, under suitable assumptions, the above result may be used to select the winning candidate for the award of a prize.
ESTIMATION. SAMPLING DISTRIBUTIONS. INFERENCE. ETC'.
109
If ser~~ose
In a large national opinion poll just prior to the budget, N randomly ted persons were questioned about their attitude towards the government. included in the poll, a proportion oc were in favour of the government od the rest against. The budget contained some controversial measures ~ich were expected to affect public opinion, but the government asserted h t its support was, on the whole, unaffected by the budget proposals. t aTo test the government's claim, the same N persons were again asked to . ress their opinion. It was found that of those who earlier supported the cX~ernment only 111 still held the same opinion, and of those who were initially ~oainst the government a number n2 confirmed their earlier view. The rest of ;I;e sampled individuals had changed their attitude. Show how these data can be analysed to test the null hypothesis that there has been no significant 'hift in government support. S Also, as a particular case, verify that for oc = 0'5, the appropriate test s[;ltistic for the null hypothesis is N(I1( -112)2/(11 1 + 112)(N -1I( -112)
having a X2 distribution with a single dJ. 155 In seismological research, the arrival time of a given wave is a normally distributed random variable with mean () and variance O'I. However. a proportion p of the observations are affected by an uncertainty arising in most cases rrom the difficulty of identifying the exact beginning of a phase when microseisl1ls are present. Observations ~ffected in this manner may be assumed to be normally distributed with mean (() + Ji) and variance O'~. Derive the probability distribution of x, the arrival time of a random wave. If from past experience p is known, and as a simplification it is assumed that "( = 0'2 = 1, derive the maximum-likelihood equations for the parameters () and I' on the basis of n random observations Xl' X2'·.·' X n · Also, in standard notation, prove that the elements of the information matrix of the estimates of () and Ji are:
_E[02~;;L]
= 1I[1-pJi2(1-rd];
_E[020010ga,lLJ = np[1-11 (rl-r2)]; and _E[02 10g L] = IIp[rt +Ji2(rt -3r2+2r3)]; 2
0112
where, for integral k,
f J2n 00
r
=
k-
1
-00
y2 2
e--/-dy(1 +A
e-P)'t'
and
A
== (1 -pP ) e- p2 / 2 .
l56 In bacterial counts with a haemacytometer, the distribution of the random variable X denoting the number of bacteria per quadrat is generally known to be of the Poisson form with an unknown mean /I, However, it is ~ometimes difficult to count correctly if there is a large number of bacteria 111 any quadrat; and in order to avoid such non-statistical errors it is convenient to record separately all quadrats having t or less bacteria each, but to pool the rrequency of all other quadrats with more than t bacteria,
110
EXERCISES IN PROBABILITY AND STATISTICS
In a random sample of N quadrats it was found that II, quadrats h bacteria each (0 ~ I' ~ t), and there was a total of IIR quadrats each ~~' more than t bacteria. For given t, derive the equation for fl, the maXill1Ulll 1!~e1ihood estimate of fl., and hence show that the large-sample variance:, fl. IS var(fl) = II/N[P,+P,{(II-t-1)+f1.p,/(l-P,)}], where the probabilities
P(X = t) == p, Also, verify that, for t
and P(X ~ t) == P,.
= 0, var(fl) = (e"-I)/N.
157 A binomial variable X has a distribution defined by the point prob. abilities
P(X = s) = (;)pSqll-S,
for 0
~
X
~
n,
p being the conventional probability of a "success" and (p + q = I J. If thi,
distribution is truncated so as to exclude both the terminal classes, prOVe that the mean of the resulting distribution is greater than lip if p < q. Further, if a random sample of II observations is taken from this truncated distribution and I' successes are observed, prove that p, the maximum. likelihood estimate of p, is obtained from the equation
P=!:..n [I_ + (l-PHpn-I_(l_p)n-I}] (I_pn-I) , A
and that for large n
pq
var(p) = - . II
(l- pn _qn)2 [(1_p"_qn)_lIpq(pn 2+qn 2_pn 2qn 2)],
-::c-:----,-------,-,:---:-'-::----;;-"---'----=-----.---::--:;~~
Alternatively, if for large II, p* = 1'/11 is used as a simple approximate estimate of p, determine the bias of this estimate and verify that exactly
pq [(1- pn _ qn) _Ilpq(pn- 2 +qn- 2 _ pn- 2qn- 2)] (1 n n)2 It -p -q
var(p*) = - .
158 An insect pest lays its eggs on the fruit blossom of mango trees, but not all eggs hatch to become larvae, as some of them are destroyed by weather hazards. The only definite indication of the presence of the pest is in wormed ripe mangoes, which show up when the fruit is sliced. In order to assess the extent of damage caused by the pest in an orchard, samples of n mangoes each were examined from different trees, and records of wormed fruit kept from k trees in whose samples there was at least one damaged mango. The total number of fruit obtained from a tree is large compared with the sample size n, so that the effect of sampling from a finite population may be ignored, and the trees in the orchard may also be treated as independent. If the probability of obtaining a wormed mango from a tree is a parameter p, find the probability of the observed distribution of 1'1>1'2, ... ,r1 wormed mangoes from the k trees. Prove that the equation for p, the maximumlikelihood estimate of p, may be put in the form
p = Po[1 - (1 - p)n],
eSTIMATION, SAMPLING DISTRIBUTIONS, INFERENCE, ETC.
111
P is the observed relative frequency of wormed fruit in the k samples. wheresh~w that for large k Also, pq(l_q")2 var(p) = nk(l" -q -npq"1)' (p+q = 1).
9 for an hereditary abnormality, the probability that a particular child
~5 family is affected depends partly on his or her birth-rank. In order to
~n astigate this effect of birth-order, a complete family (i.e. without any
~~s~arriages or stillbirths) of n children was examined, and it was found
hat r of them were abnormal. t Assuming that there were no mUltiple births in the family so that the n children can be ranked without any ties, suppose the sum of the birth-ranks of fer abnormals is A. Then, for constant II and r, show that, on the null ~~pothesis that birth-order has no effect on the occurrence of the abnormality, the probability of obtaining the sum A is
P[A;n,r] = S[A;n,r]
1(;),
where S[A; n, r] is the number of ways of selecting r of the first n natural numbers such that their sum is A. Prove that the function S[A; II, r) satisfies the following recurrence relations: (i) S[A;n,r] = S[A;n-l,r]+S[A-n;n-l,r-l); (ii) S[A;n,r) = S[r(n+l)-A;n,I']; (iii) S[A:II,I'] = S[!II(Il+1)-A;II,II-r). Hence deduce that A has a probability distribution symmetrical about its mean r(n+ 1)/2. Next. by considering the coefficient of (JA in the Taylor expansion of r
F«(J; n, r) ==
fl (Ji(l- (In-i+ 1)/(1- (Jil, i= 1
prove that the probability-generating function of A is F«(J; n,r)
IG).
Use this result to determine the cumulant-generating function of A, and then derive explicitly the cumulants of A. Hence verify that the second and fourth cumulants of A are K2
= (11+ 1)./12;
K4
= -(n+ 1).{n(n+ 1)-.}/120,
[. == r(lI-r)].
Finally, by applying a correction analogous to Yates's correction for continuity, obtain a suitable normal approximation for testing the null hypothesis of no birth-order effect on the incidence of the abnormality. 160 In the study of an hereditary disease which may develop at different ages, it is often found that there is, on the average, greater similarity between the ages of onset of the disease for a parent and child from a single family than between two affected persons chosen at random from the general population.
112
EXERCISES IN PROBABILITY AND STATISTICS
Suppose that the disease may be produced by either of two different genes, say A and B, it being assumed that there are no families in which tar! the genes occur. Furthermore, in families in which the disease is deterJrtiboi~ by gene A, suppose the age of onset is distributed with mean m 1 and vari hb; af ; and it may also be assumed that there is no correlation between the aganI), onset for parent and child within each family. Similarly, gene B gives a ~S(i age of onset m2 and variance a~. e~. If the genes A and B are indistinguishable in their effects, except as rega age of onset, prove that for such a mixed population the correlation betW::: the ages of onset for parents and children is :
_ [1 +
p -
1tlaf+1t2a~ ]-1 1tl1t2(m1 -m2 ) 2 '
where 1tl and 1t2 are the proportions of the A and B genes i~ the populatior. and 1tl +1t2 = 1. Hence deduce that for al = a2 = a, P = o·s Iflm 1 -m 21~20 As a generalization, suppose there are g different genes A, B, ... , G in proportions 1tl' 1t2, ... ,1tg respectively. Then show that under similar con. ditions and with same notation
p =
[1+ f 1t a;/:[ 1t 1t,(m -m,)2]-I, r
r=1
r
r
r,*'
the second summation extending over all pairs of genes. each pair beinl counted only once. 161 On a certain biological hypothesis, the occurrence of a rare hereditary human disease may be explained as due to the inheritance of either of two gene\ A and B, which are known to occur in the population in the ratio of I:A (>Oi The genes are indistinguishable in their effects, except as regards the age 01 onset of the disease; and, as a simple model, it may be assumed that for such a mixed population, the age of onset is a continuous random variable whose distribution is compounded of two unequally weighted univariate normal distributions such that (i) the weights are in the ratio of 1 : A; (ii) on an appropriate scale, the means are - In and m; and (iii) both the variances are equal to unity. Find the probability density function of the age of onset distribution, and hence deduce that this distribution will be bimodal if, and only if, Ilog AI < 2m(m2 _1)1 - 2 log[m + (m 2- 1)1]. As a generalization, suppose the variances of the compounding distribution are af and a~. Prove that in this case the least possible separation between the means of the compounding distributions for bimodality is
Gfa'" where a 2 is the harmonic mean of
1·840-,
at and a~.
162 An hereditary human abnormality like juvenile amaurotic idiocy is believed to be due to a single recessive gene, and it is therefore usual to find abnormal children in families produced by heterozygous parents who are both themselves normal. If p is the probability that the offspring of two heterozygous parents will be abnormal. then for the estimation of p allowance has to
ESTIMATION. SAMPLING DISTRIBUTIONS. INFERENCE. ETC.
113
d for the fact that a number of families are excluded from the observed m8 eause of the absence of abnormal children. Consequently, sampling is Jat3 : ; to families de~ived from normal parents such that each family has at ,:(ln ti e abnormal child. Ica st onpose a total of N families are sampled, there being ns families of size s, SUPI ~ s ~ c. If there are R abnormals in the total sample, prove that \I'here xi;;;um-likelihood estimate lJ of q (== 1 - p) is obtained as the real root (he: mathan unity of the equation Ill:
±
"Iher
R s.n. (1-lJ) = •= 1 (1- ~. , show that the information contained in the sample for the estimation
,\ Iso, I)f
q is c
L s. n.(1- spqS- 1 -
qS)/pq(1 _ q")2.
s= 1
A test of significance indicates that the estimate of p is significantly greater Ihan the Mendelian expectation of p = !; but, as an alternative to the rejection )f the hypothesis, it is suspected that this difference may be due to the fact :hat single abnormal cases are not always reported. If, then, families with at Icast two abnormals only are considered, modify suitably the maximumlikelihood estimation of q, and hence derive the large-sample variance of this estimate. 163 A simple method for comparing the fitnesses of organisms of two different phenotypes, which may be assumed to correspond with two different genotypes A and B, is to compare their viabilities before the age at which they are scored. Then any divergence from Mendelian expectation can be ascribed to differences of viability before this age. Suppose that at the stage of sampling, the A and B organisms in the population are in the ratio of 1 : 0, so that interest centres on the estimation of the unknown parameter 0; and that in a random sample of n organisms, r were observed to be of type B. Prove that the statistic r/(n-r+l) is an asymptotically unbiased estimate of O. Also. by considering a suitable logarithmic expansion of r/(n - r + l), show that for large samples var[r/(n-r+ 1)] =
0(1 + 0)2 [20 ] n I +n+ O(n- 2 ) •
For z > 0, prove that
1 L [/k+l k! n (z+m) ] =-. k=O m=l Z 00
If sampling is confined to samples of size n which have r 'I: n, use the above result to evaluate the expectation of 1/(n - r). Hence verify that, for n sufficiently large so that the probability of r = II is negligible.
E[r/(n _ r)] '" 0
[1 +(1: 0) + (1 + O~~2 +0) + 9(n- 3)] .
Discuss why the estimate r/(n - r + 1) of 0 is better than the maximum-likelihood estimate r/(n- r).
114
EXERCISES IN PROBABILITY AND STATISTICS
164 At a factory mass-producing units of electronic equipment, the assernbl units are tested before being packed for despatch to customers. The probabU~ that a unit will be found to be non-defective is a constant p, (0 varIX).
16 In the above example, suppose the range of X is divided into (2m+ 1) IIllcrvals each of length II, and another random variable Z is defined by the probability distribution
P(Z
=
0)
= P[ -
~ :;;; X :;;; ~],
and P(Z
(21' - 1)11
= I'll) = P(Z = -I'll) = P [ - - 2 -
~ X ~
(2,. + I)h]
2
'
(I' = I, 2, ... , /11).
Prove that the characteristic function of Z has the same form as that fur }'.
17 For a random variable X having the semi-triangular distribution with 11robability density function
I(X = x) = 2(a-x)/a 2 for 0:;;; X ~ a,
130
EXERCISES IN PROBABILITY AND STATISTICS
prove that the characteristic function of X is 2(cos at-I) 2 sin at-at (ait)2 + ait' at Hence verify that the mean and variance of X are a/3 and a 2 /18 respect"" Further, suppose the range of X is divided into m equal intervals eal~, length h, and another discrete random variable Y is defined by C P [Y=
(2r-l)h] 2 = P[(r-l)h
~
X
~
rh],
for r = L 2, ... , m.
Show that the characteristic function of Y is
2 [Q(at)Q(th) Q2(th/2)
ait
Q.(th) [Q(2at) -1] /Q3(th/2) 1] IIIQ(th/2)+ 2(mt)2 Q(at) ,
sinO where Q(O) == -0-' By a suitable expansion of this characteristic function, prove that h2
E(Y) = E(X)+ 6a'
and var(Y)
=
h 2 ( 1 + ah 22 ) • var(X)+ 36
18 Determine the characteristic function about the origin of a uniform: distributed random variable X defined in the range (0 ~ X ~ 2a). Suppose the range of X is divided into m equal intervals each of lenglh. and that another probability distribution of a random variable Y is definedh,
p[Y= (2r;l)h]= P[(r-l)h
~
X
~
rh],
for
r = 1,2, ... ,m.
If 0 find .the cha~acteristic function of log X: Hence s~ow that the probabilitj denSity functIOn of u, the sum of the loganthms of /I mdependent observationl of X, is - p+i F. PII
e
-=-.,.-;-::::-:--;-::.
2ni{r(p)l"
Je"" fr(-z)l"dz. J
l -p-i,Y.J
Evaluate the complex integral by using a suitable contour, and then deduce that the sampling distribution of v, the geometric mean of the II observations of X, is
[d
IlV"p-l'f-
1
ll -
" (_I)"r+,,+ I _ _ r(1l) [r(p)}'" r'S-O dZ,,-1 .
37
VII:]
{r(l +Z):"
:=r •
dv
(0 ~ v
< 'X).
If 11 is a positive integer, prove Euler's formula
L( r
(z + ~ )
= /J t -
(2n)(II-II/2 •
liZ.
and hence, as a particular case, derive the
r(z)r(z+t)
r
r(IIZ),
function duplication formula
In r(2z)/22Z-I.
=
The independent random variables X I' X 2, ... , X II all have tions such that the probability density function of X j is
38
I(Xj = Xj) = e-·'tJ xfJ-1/r(p),
for 0
~
r
distribu·
Xj < co, Pj > 0,
where all the Pj are unequal and also do not differ by an integer. Find the characteristic function of the statistic U
=
I
logxj ,
j= I
and then show that the probability density function of u is
J iXJ
"
1
n r (p) j= 1
1 . 2--;' 1tI
e'lZ
-ioo
n r(pj-z)dz. II
j= I
CHARACTERISTIC FUNCTIONS
137
rove that the probability distribution of v = e",n is lienee P n 00 (_1)r+ 2 vn(r+ p.)-I ~. L L r(' I) J _ . r(PA-pj-r). dv [(Pj) j=1 r=O r+ k'¢j
n
fI
(0::::; v < (0).
jd
In the previous example, if Pj = p+(j-I)/n,p > 0, for (j = 1,2, ... ,n), that the characteristic function of u is ,how
39
n-nil
r(np + nit)/r(np).
" ce obtain the probability distribution of the geometric mean v, and verify he~ it is of the same form as the distribution of the arithmetic mean of n I ~ependent observations from a population having the probability density ,n , function
f(X = x) = e- X xP-Ijr(p),
for 0::::; X < 00.
.• , Xn are n independent realizations of a random variable X having zero mean and finite higher moments Pr (r ~ 2), derive the characterislic function of X, the average of the Xj' Hence, or otherwise, deduce that the second and fourth central moments of x are
40 If XI' X2,'
P2
-
n
an
d
P4
+ 3(n -I)pi n3
respectiVely.
Use these results to prove that
where n
(n _1)s2
=
L
(Xj-
X)2.
j= I
41 A continuous random variable X has the probability density function [(xl defined for (- CI) < X < CI), with mean P and variance (52. Suppose Ihat X and S2 are the usual unbiased estimates of P and (52 respectively, based on a random sample of n observations. Let the joint characteristic function of.x and S2 be
('*1> t If.x and
S2
2)
== E[exp(itlx+it 2 s2)].
are known to be independently distributed, prove that
a4J] [at2
=
[I/I(tdn{ x [a4J2] ~
12=0
at2
, 12=0
where 1/1('1:) is the characteristic function of X and deduce that 1/1('1:) satisfies the differential equation
d (dl/l) d'l:
2 1/1 1/1 . d'l:2 -
2
+ U 2 1/1 2 =
4J2(t 2) that
of
S2.
Hence
0,
so that X must be normally distributed. 42 Given that XI' X2,' •• , Xn are n random observations from a normal population with mean m and variance u 2 , find 4J(tI, t 2 ), the joint moment-
138
EXERCISES IN PROBABILITY AND STATISTICS
generating function of the random variables A and B defined by n- I
2a 2A
= (1I-1)c5 2 =
L (x j -xj+d 2,
j= I
and n
2a 2B = IIS2 =
L (Xj_X)2, j= I
X being the sample average. Further, if Mn denotes the usual 11th-order determinant such that cP(tI' t 2) = M;;t,
show that Mn satisfies the difference equation Mn = (l-t2-2td(Mn-l-tiMn-3)+tiMn-4'
for 11 ~ 5.
Hence verify that an alternative explicit expression for Mn is Mn = nil
(211-~-I\(-td\'(I-t2)n-I-\"
.=0
J
l
Use this series representation of cP(t I, t 2) to prove that the jth moment about the origin of the ratio AlB is }- 0 and b(1-a)(l-c) > O. Hence the quadratic has positil( roots if
ab-(1-a)(a+c-l) < 0 or c > Also, z
22
=
p/(p+b) and y
=
(l-a)2+ab (I-a) .
(\-c)(p+h)/ap.
Bizley (1957). Ex. 1.5, p. 32. For Sowite the favourable cases are the partitions of 10 in groupS lik(
1
A;\SWERS AND HINTS ON SOLUTIONS: CHAPTER 04,05'
147
a6)' where Laj = 10 and aj ~ 1 for all i. These cases total
"'~';~! Ib, ~ a,!
1O! (;! + ~; +
coefficicnt of x'" in
. +
~~;r
=
coefficient of x lO in 1O! (e-l)6
=
6 10 - 6. 5 10 + 15.4 10 -20. 3 10 + 15.2 10 - 6 = 16435440,
the required probability
=
16435440/6 10
0·27181.
""
.. h~l1ce osowite, exclude all partitions in which al = 1. These total l'Of
L 1O! /1! tI2 a !,
where
j
= coefficient I
II l~'
lee
~1
aj
Ia + 1 =
and
j
10
of Xl 0 in 10! x( e-' - \)' = 8341200,
the required probability
= 8094240/6 10
""
0·13386.
" H. F. Downton-Private communication. .. Let A and B be the two observers. Denote balls by ai' a2,' .. , an, where 'S the white ball. Then Pta,) = P(a r) = l/n. 1/, I•
10-;
PtA says akiad
= P(B says akiak) =
PtA says aria.)
= P(B says arias) = 10' 11- 1 .
9
1
Iknee
I
P(al A an
_ P(al, A and B say al) d B) say a l - PtA an d B say al )
PtA and B say a1IadP(al) /I
I
PtA and B say allas)P(as )
s= 1
( 1 ) 21 10 II
11-1 - - -+ 1 as 11+80
1 )2 1 9 ) 2 I '1 -+(11-1)(- (--)" (10 II .10 11-1 II
n -+
00.
Probability of at least one of A and B telling the truth 81
19
= 1 - 100 = 100' and 19
81(20-11) >0 100(11+80)
11-1 11+80
---- =
100
for 11 < 20.
As II increases, the possibilities of distillct lies increase, and so the prob· ability of both A and B telling the some lie decreases.
24 Brookes and Dick (1958). Ex. 14. p. 89.
p(2n)
=
(2nn)/2
2"
and
p(2n + 2) = 2n + 1 < 1 p(2n) 2n +2 .
148
EXERCISES IN PROBABILITY AND STATISTICS
25 Feller (1952). Ex. 7 and 8, p. 76. Let Ai be the event that the ith individual is missing from the Hence, considering intersections,
u, =
t
(_1)k
k=O
(m) (1-'~)' -. (1-e- P)m k n
as r -.
Sa
IlIp!:
00,
since in the limit
kP)' -. e(1--;:
kP •
26 Uspensky (1937). Ex. 5, p. 168. _ (;)(N -1f-' _ ( n Nn - N
P, -
)r( 1 _.!.)n-, N . P /r ., I
where ,-1
P ==
n
(1-k/n).
k= 1
The inequality follows from the fact that for 1 ~ k
so that r)(,-I)/2 ( 1- 2,
whence
Pn where Pi =
P2
=
=
m-1) (m-1) ( ----;;;Pn-l + --;;J2 Pn-2'
1, since Uo
=
= 1. Hence the result.
Ul
(i) As m -... 00, 1X1 -... 1, 1X2 -... 0 and Pn -... 1. (ii) As n -... 00, IX'l -... 0, lXi -... 0 and Pn -... O. 42
Uspensky (1937). Ex. 5, p. 168. Let Xi be the random variable denoting the number on the ith ticket drawn. Then m
S=
LXI'
i= 1
But E(Xi) = (n+ 1)/2; E(xf) = (n+ 1)(2n+ 1)/6; and for i i= j E(xjxj)
= =
J L rsln(n-l) = L -r- [n(n+1) ---r n
r"'.
r=ln(n-l)
2
(n+ 1)(3n 2 -n-2)/12(n-l).
Hence var(xi) = (n2-1)j12 Therefore E(S) = m(n2+ 1);
and
COV(Xi'X) = -(n+1)/12.
var(S) = m(n;; 1) [1-
43
:=:].
Montmort's matching problem (1708). Let the envelopes be in the natural order 1,2, ... , N. Corresponding to this there are N! arrangements of the letters. Define random variables X, such that Xr = 1 when envelope in the rth place has correct letter and x, =0 otherwise (r = 1, 2, ... , N). Then N
S=
LX,.
r= 1
ANSWERS AND HINTS ON SOLUTIONS: CHAPTER
1
153
Eyidently, E(x,) = l/N; var(x,) = (N-1)/N2; cov(x" x,) = 1/N2(N-1), Hence E(S) = var(S) = 1. " 'I ()ihe second case, the probability of S correct postings is
(~)(~r (1- ~r-s.
[n
. 's a binomial distribution with probability of success 1/N. Hence the Ihls I . l11~an and vanance. lJspensky (1937). Ex. 13, p. 181. .s4 Let Xj be a r~ndo~ variable representing the number on the ticket drawn the ilh occaSion (, = 1,2, ... , m). Then ~
m
X=
L1 Xj'
j=
,\Iso.
E(Xj) = ,Ind for i
±d
,=0
t(n\V2/1 = n/2; E(x1) =
~ j,
r2(~) {(~) -I}
/I
E(xjx) =
,~o
2/1(2/1-1)
n2
n
±
,=0
t2(n)1/2/1 = n(n+1)/4; t ~
rs(~) (~) + ,~s 2/1(2/1-1)
- 4- 4(2/1 - i)' fhcrefore var(Xj) = n/4; cOV(Xj• Xj) = -n/4(2/1-1), whence 11111/2; var(X) =
E(X) =
mn[ 4 1- m-I] 2/1-1 .
45 UL (1962).
Per)
k- I
kC)/r+ ke: = ± = r+ 1) =
=
,=0
± r+
1
(2"+ I -1)/(11 + 1).
r(I1+I)V(2/1+I_1) = (11+1)
lienee E(r)=
,=0
E(r2) =
(11 + 1)/(11+ 1)
:)/(11+ 1) so that
1
-
±\r+
,=0
±
(11)1/(211+1-1)-1
,= 0 , '/
(11-1)2/1 + 1 2"+ I-I .
(11+1),2/(2/1+1_ 1) 1
(11 /I = [ lI(ll+I)L
=1I) -(I1+I)L
I1)] /
II (11 + +L (2"+ 1-1) ,=0 r+ I = [2/1-1(11 2 -11+2)-1]/(2/1+1-1). whence var(r) = (11+1)2/1-1[2/1+1_(11+2)]/(2"+ 1-1)2.
,=1
,
/I
(")
,=0 r
154
EXERCISES IN PROBABILITY AND STATISTICS
There are (~~ Dways of selecting (r + 1) balls out of (n + 1), and th number of ways is (2n+ I_I), excluding the possibility of not selectine to~ Hence the probability is g an!
(~: ~ )/(2 + 1_1) = n
46
UL (1960). k = 0; E(r) P(r)
mode
=
=
= 0- 1 ;
var(r)
P(r).
= (1- 0)/0 2 • < I,
0(1- 0),-1, and since (1- 0)
P(r) is a maximum for r::::: I .
II~
1. n
Sn ==
L
P(r)
,= 1
= 1-{l-Or =
t
if 0 = 1-(W /n •
Similarly, 00
L
P(r) = I-Sn
=
,=n+l
t
if 0 = 1-(W /n.
Hence, by convention, median = n +t. 47 Banach's match-box problem.
u, = CNN-r)mN
= 2- N.coefficient of x N in {l+x/2)2N+I(I-x/2)-1 _ _ 2N ~ (2N + - 2 . L. ,=0
1)
r
.
But
f (2N+l)+ r
,=0
2yl (2N+l) ,=N+I r
= 22N + 1,
so that
To find E(r). evaluate first
E(N -r) =
!.
N-I
L (2N -r)u,+
1
= [(2N + 1)(1-110)- JI]/2,
whence the result. Similarly to obtain E(r 2 ), consider N-2 E[(N-r)(N-r-l)] = L (N-r)(N-r-l)u, ,=0
= [2N(2N + 1)(1-2uo)+(4N +2)(1-lIo)+E(,.2)-(4N +3)/IV4. since Uo = u l . Hence E(,.2) = (2N+3)-3(2N+l)lIo.
1
A!'lSWERS A~D HI~TS ON SOLUTIONS: CHAPTER
i
. Ie (1957). Ex. 4.7, p. 121.
.
..
155
. "
~I~ balls and n compartments the total dlstnbutlve posslblhties are
.Ill
Wit t tal number of favourable possibilities is rhe 0
L: t!
III
ai'
where
I
ai
= t and each
ai
~
1
±m
(n)(_1)m(n_m)l,
:=:coefficientofx'int!(eX -lt=
m=O
e the stated probability. This probability depends upon all favourable .. h:Oc·ncluding those in which the tth ball completes the distribution over ,.I'~S ~ompartments. If f(t) be the probability that all compartments are not ,'~,:pied with t balls, then
±(n)(-lr(I-~)', m n
f(t) = 1.
m=O
d Ihe probability that exactly t balls are required .In I' Ihe expected va ue IS
=
\',I!(t-l)-f(t)}
I f(t) = n ±~.m (n)(_l)m+1 m
1=0
oJ
, I
= f(t-l)- f(t), whence
m= 1
1
1
= nf[l-(I-X)n]dX = nf(1-yn)dY = x
o
l-y
0
±
n/m.
m=l
49 Let Xi be the number of drawings following the selection of the ith variety
up 10 and including the drawing which shows up the (i + l)th variety, (i ~ 1).
rhen
r-1
nr
= 1 + L:
Xi'
i= 1
But 00
E(Xi)
=
I
vp(l-p)·-l
= p-1,
.=1
where p == (N - i)/N and var(xi) = (1- p)/p2. Hence
E(nr) = N
r-1 L: (N -
i) -
1
i=O
=
N
N
f. 1=.\
t- 1 '" N -r+ J
f t- 1 dt = N 10g[N/(N -r+ 1)]. N-r+ 1
Similarly,
r-1
var(nr) = N i~O i(N-i)-2 '" N 2
fN
t- 2 dt-N
N-r+1
=
fN
t- 1dt
N-r+1
N(r-l)/(N -r+ 1)-N 10g[N/(N -r+ 1)].
156 50
EX ERC'ISES IN PROBA B I LIT Y AND STATISTIC'S
The probability of X
~
kin n drawings = (k/NY'; and
Hence E(X)
= ktl
~ N-
kpk
N II
[N + II
f til dt] = nN/(lI+ 1),
1 _
o
and
f
E(X2) -- N- [N + 2-2 II
II
f N
N
til + 1 dt-
o
til
~ nN 2/(n+2),
dt]
0
whence var(X). Bailey, N. T. J. (1951). B, 38, 293. For second sampling, the population has W marked and (N - WI u: marked animals. Therefore
51
pen) = probability of (w - 1) marked and (n- w) unmarked animals in II first (11 - 1) captures x probability of a marked nth capture
C~ J(~=:')
W-w+l . N-n+l '
and the stated expression follows by recombining factorials. To prove lh,' the probabilities add up to unity, it is enough to show that
II,.
= N-IV-w (N-n)(n-l)
SeN, W, w) -
Verify that SeN, W, W+ I)-S(N, that by successive reduction
W-w
w, w) =
w-l
SeN -1, W,
=
(N) W .
w+ 1)-S(N -1, W, wI,'
SeN, W, w+l)-S(N, W, w)=S(w, W, w+l)-S(W, W, w)=o. SeN, W, W+ 1) = SeN, W, 1)
Hence
N-IV
= coefficient of X IV -
1
in
L
(I +X)N-t-I
t=O
52
Dorfman, R. (1943). AMS, 14,436, and UL (1962). Let Xi be the number of tests required for the ith group in plan (ii), Then N/k
S=
L Xi'
where
P(x i
= k+ 1)
=
l_ qk
and
P(Xi = 1)
= qk,
i= 1
Hence E(S) and var(S), using the fact that 53
Xi
are uncorrelated.
Let Xi be a random variable such that Xi = 1 if forecast is correct (F) for ith day =
0 if forecast is incorrect (F) for ith day,
(i = 1. 2, ... ,11).
I
ANSWERS AND HINTS ON SOLUTIONS: CHAPTER
",'n
P(Xi
1) == 1-P(Xi
===
/'
(II-I')
157
•
= 0) = p. ;;+q. -n-' and Sn = i~1 Xi'
X. are correlated random variables, so that for i '# j d\ll{he
I
. v.) ::: P(Xi
• I \, ,\ J
::: p. Mp·
== 1, Xj = 1)
c= ~)
+q.
G=~) ]+q. (n~r)[p. (n~ 1) +q. (II~~~ 1)].
. cl! E(S.) == np - (n - I')(P - q) < np for p > q, and var(Sn) = npq. Ji,nDenote rain by R and no rain by R. The four mutually exclusive possibilities (he weather-forecast combination for any day are RF, RF, RF, RF. If :"r 's IV,I IV2' W3 and W4 respectively are associated with these outcomes, then .,tift this == whence
Ind II
.
°
E(S.) = r(wlP+w2q-w3P-w4q)+n(w3P+w4q),
for all /' if W 2 = - w l P/q and
W4 =
1 [ 2 pI' 2 p(n - 1') ] var(X.) = - WI' -+W3' - and I II q q
_ np
var(S.) - - . q
Ihcrerore
.md this == 1 for all I' if WI
w3P/q· This ensures E(X i ) =
cov(X;, Xj)
PI' 2 +-. (WI q
.
.
= 0, (I '# J).
2
W3),
±W 3 • Thus the alternative scoring system is
= (q/np)t; W4 = -"'2 = (p/nq)t. 54 For the four possible outcomes RF, RF, RF, RF, the probabilities are WI
= -
=
2 W3
-
W3
1/1. 11(1- p), p - (1.13 and q - (1.( 1 - 13). For random allocation of {he corresponding probabilities are
IIcnce (i) gives W3 = - W4q/P, W 2 = - w IP/q; (ii) gives W4 = (l-W1P)/q, using (1. = P and 13 (iii) gives var(S.) = n[w1p2/q-(1-W1P)2/p], using the probabilities of the outcomes with when WI = q/p. Hence WI = w4'1 = q/p; W2
(1.
=
I'
days to rain,
= 1;
= 13 = p. Var(S.) is minimum W3 = -1.
55 Define xr to be the expected number of additional trials needed to realize the pattern when the first I' letters are in agreement. Difference equations then give (i)
Xo
= p-l+ q -l p -4; (ii)
Xo
= p-l+ p -2 q -2; (iii) Xo
56 UL (1962). The roots of Z2 - z + pq = 0 are p and q. Hence, for p i= q,
dnd for p
=
q, Y.
=
(n + 1)r·. These results hold for n ~ O.
=
q-l p-3.
158
EXERCISES IN PROBABILITY AND STATISTICS
Let Xm be the expected number of additional trials needed to Co the sequence when a stage has been reached where the first m letters fllpl. pattern are in agreement with the observed sequence, for (0 ~ m ~ 2r).or If Then X2k = 1 + PX2k+ 1 + qxo and x 2k+ 1 = 1 + PXl + qX2k+ 2, where x Hence 2, '" x 2k -a = (xo-a)/(pq)\
whence, putting k
= r,
a
==
[l+(l-pq)x o]/(l-pq),
the result for Xo' The inequality follows since r
Xo
=
I
(pq)-k
and
k= 1
°
~ pq ~
!.
57 Feller (1952). Ex. 33-35, p. 128. The event "exactly k balls in n drawings" can occur if (i) k black balls in (n-1) drawings and white ball in nth drawing· (ii) (k-1) black balls in (n -1) drawings and black ball in nth dra~ing Hence difference equation. 58 Huyghens' problem. UL (1963). Probability of A's winning in a trial is 5/36 and that of B's winning is 6.;, Hence probability of A's winning at the (2r+ l)th turn is /.
G!r G~r (:6)' whence total probability on summation for (0 Expected number of trials for A's win is 61 5 00 (31)r (30)r 30' 36 r~o (2r+ 1) 36 36
59
~ r
=
< (0).
5 6+ 61 '" 6.
UL (1962). Expected premium in nth year is
t- 1 +(1- q)
a[(qA
As n
~ 00,
:t:
(qAt] = a[(l-q)+q"A"-l(l-A)]/(l-q},).
the expected premium tends to limiting minimum value a(l- q)/(l- qA)
> ka,
whence the inequality. 60
UL (1962). The probability-generating function of S is G(6) = (6+6 2 + ... +( 6 )4/6 4 = 64(1-6 6 )4/6 4(1-6)4,
whence the result. E(S) 61
=
G/(l).
Probabilities for the wins of A, Band Care p/(l_q3), pq(\_q3). an:
pq2j{1_q3) respectively.
P(X = r) = pqr- 1 for r ~ 1. The probability-generating function of \ is G(6) = p6/(1-q6), whence E(X) = p-l and var(X) = qp-2. P(X = r) = (r-l}/36, for (2 ~ r ~ 7); P(X = 12-r) = (r+ 1)/36. k' (0 ~ r ~ 4). A simple numbering for one die is (1,2,3,4,5,6) and for It. other (0,0,0, 6, 6, 6). Other possibilities arise from suitable partitions of Gil'
62
ANSWERS AND HINTS ON SOLUTIONS: CHAPTER
1
159
UL (19 63 ). . ho ut any restriction the three numbers can be chosen in 11(411 2-1)/3 Wlt,( x is the first number in A.P. with common difference r, then ..IP·.•((211-1) and for any x, r ~ [(2n+l)-x]/2. Direct enumeration , r :~ -; A.P.'s in all. Rearrangement of the A.P.'s according to the magnitude I\~S "common difference gives the distribution of X, whence G(O) by sum; Ihe
~I
111011.
, rhe probability-generating function of X is G(O) = (pO+qO-l)", whence ~ n == 1I(2p - 1land var( Xl = 4npq. Also. ,I.
G(O)
=
±(n~t)p("+11/2q(n-'1/201'
I=-n
,llhat PIX ',\h~rc(-II ~
2
( n~t)
= t) = -2-
p(n+I)/2 q (n-I)/2,
t ~ n) and (n-t)/2 is an integer ~O.
h~ The probability-generating function of X is G(O) .:h~nce E(X) = n(p-q) and var(X) = n[4pq+r(1-r)].
= (pO+qO-l+ r
r,
Again,
,md PIX = m) is the coefficient of om in this expansion. Thus, putting I j -II = m, i.e. A. = n + m - j, the limits for j are obtained from n ~ n + m-j ,lIulli +111- j ~ j. I
M
The probability-generating function of the total electorate is G(O,.) = [(1-PI)+P 10] L[(l-P2)+P2-r] C[ (1-P3-P4)+P3 0 +P4-r ]f' ,
where the coefficient of Orr gives the probability of the Labour and Conserliltive candidates obtaining rand s votes respectively. The required probability lor a tie is the sum of the coefficients of orr for all 0 ~ r ~ min(L + F, C + F). lienee the result by putting -r = 0- 1 in G(O, or). The probability-generating function of N is obtained by putting 0 = or, whence the mean and variance.
or
67 Dodge, H. F. (1943). AMS, 14, 264. r-I
iii P = p
L q' =
l_qr.
1=0 r-I
lIil/J
=L
(t+l)pq'/(l-qr) = [l-qr(l+rp)]/p(1_qr).
1=0 00
(a) g =
L
t(l-P)P I = (l_qr)/q'.
1=0
(b) u
= gh + r =
(1- qr)/pqr.
160
EXERCISES IN PROBABILITY AND STATISTICS
(iii) v = f
- 1 X expected
number of items inspected before getting a dct
1Cehl
00
L
=f- 1
1=
(c) ljJ
tpq,-1 = (fp)-I.
I
= (u + fv)/(u + v) = fl[f + q'(1- f)].
(d) p = p(l -ljJ) = pq'(1- f)/[f + t((l- f)].
(iv) p is maximized for p = p* obtained from (1- f)(1- p*Y 68
= f(r + l)p* - 1]/(1 - p*). whence p*
=
[(r + l)p* -II/r.
UL (1961).
Pn(r) Using factorial moments
=
G)p'qn-,.
= (l-6pq)/npq. whence the result for
')12
69 Frisch, R. (1925). B, 17, 165, and UL (1962). The cumulant-generating function of r is ,,(t) mean and variance.
}'2 :::
O.
= n log(q + pe'), when"
" (n-l) l)] = np [ L _ p,-l q"-,+1 - nL- l ( n _p'qn-, ,=t r
= np[
(:=
1
r
,=t
!)pt-l qn-t+
1] = (:)pt q"-t+ 1.
t.
For t = 0.111(0) = O. whence E(r) = np. 70 Romanovsky, V. (1923). B, 15,410, and UL (1962). 00
,,(t)
=
nlog(q+pe' )
==
L
,= 1
",fir!
and
p = (1+e- Z)-I,
whence d,,(t)
Cit = n/[1 +e-(z+ll] == f(t), say, so that d,,(t) dt
00
L jI'l(O). t'lr!
,=0
where
", = jI,-ll(O) d,-lp
=
n[:;,~ll{l+el (z+ll}l=o dp d
{d,-2 p}
= ndz,-l = n dz ' dp dZ,-2 = or Ie, = pq. d",_ tfdp, since dpldx mines "s.
dp d",-l
dz'~'
= pq. Successive differentiation dele:
ANSWERS AND HINTS ON SOLUTIONS: CHAPTER
"I The moment-generating function about the mean
I
161
i. is
• t i,t [ t t t t4 ] exp[A(e -t-l)] = eXPT 1+3+12+60+360+ " ' , 2
2
3
the moments by expansion . .. h~nce _, P(X === r) = P(X = - r) = !pr. Hence P(X = 0) = 1 - 0 for all k.
i
pH 1)(0) = r(n+l).A.~+1
s=O
(-A,2)s. r(s+k+2) . r(s+ 1) r(n+s+k+2)
Thus for n = O,f(H 1)(0) > O. But for n =F 0,
i
pH 1)(0) = nA,~+1
(-~2Y.B(n,s+k+2)
s=O
f
s.
1
= nA,~+ 1 e-
A2%
ZH 1(1_
zt- 1 dz
> O.
o
The cumulant-generating function is ,,(t) = A,1[f(e')-lJ, whence on expansion A,1 A,2 [ 2A,2] "1 = A,1A,2/(n+ 1); "2 = 11+ 1 1 + 11+2 ;
[1 + 11+2 6A,2 6A,~ ] + (11+2)(11+3) ; A,1A,2 [1 14A,2 36A,~ A,1A,2
"3 = 11+ 1 "4 =
11+ 1
The expression for
11
A,~] + 11+2 + (11+2)(11+3) + (11+2)(11+3)(11+4) .
is obtained by eliminating A,2 from the ratios "2/K1 and
"3/"1· 105 UL (1964). (i)
(~)(~~)/( ~~) -
0·43885
~~ )/( ~~) -
0·69618
(ii) 1-( (iii)
G~)/e~) = 0·25000
175
ANSWERS AND HINTS ON SOLUTIONS: CHAPTER
13
. (48')/(52) -0.10971 12
(IV)
M(a) (b)
1-4(~~)/[C~)-(~~)] - 0·36964 6(1~)/[(~~)-(1~)]
- 0·30666.
1h~ changed probabilities are
1-(~~) / (~~) - 0·56115
106 UL
(1964). n
n
n
x=!
x=o
x=o
L X4 = L X4 = L
Ii)
(n+4)(S)
[(X
+ 3) 0 m-ku
00
(12;::,: f(x-m)2f(x)dX+ f (x-m)2f(x)dx -
m+ku
CJJ
or 0"2;:a.
f
P[\x -111\;:a. ku] x (kuf
a
(ii) E(X)
=
f 00
xf(x) dx+
xf(x) dx
a
-00
o
f (y+a)f(y+a)dy+ o
a+
(y+a)f(y+a)dy
0
- 00
=
f 00
f 00
f
yf(y+a)dy+
yf(y+a)dy
=
a.
0
-00
Similarly, 00
E(X-a)'
=
00
f y,/(y+a)dy+ f .v'f(y+a)dy
(-1)'
o
0
= 0 for odd r. (iii) For c >
m, define 00
S(c) ==
f
Ix-clf(x) dx
-00
f f C
=
=
f 00
(c-x)f(x)dx+
(x-c)f(x)dx
- 00
C
m
I
I
m
m
c
00
(c-x)f(x) dx+
-00
(x-c)f(x) dx+2
(c-x)f(x)dx.
Hence C
S(c)-S(m)
=
2
f (c-x)f(x)dx > 0, m
unless c 24
= m. Similarly for c < m.
Downton, H. F. Private communication.
Suppose 0 ~ g(X) < t/J(g). Then
00, 00
E[g(X)]
=
and let the probability density function of g Ix 00
00
f gt/J(g) dg ;::,: f gt/J(g) dg ;::,: k f t/J(g) dg, o
k
k
ANSWERS AND HINTS ON SOLUTIONS: CHAPTER
.hence
2
183
the first inequality. For the second, assume
g(X) = [( X;J1r -(l-C)r S'z1ey (1957). Ex. 7.11, p. 219.
:5
I.e: the straight line be divided into mn intervals each of length
.1
e
l/mn. Then tal number of ways for choosing m parts so that they total mn intervals !hChlO coefficient of xmn in (1 +X+X2+ ... +xmn)m
. h is A == m(m+ 1) ... (m+mn-1)/(mn)! ,.hl~or the favourable combinations each of the m parts:::; I, and so we can II
most take mnl intervals for any part. Hence the favourable number of comh ffiClent . f mn'10 0 x (1 +X+X2+ ... +xmn/)m
~Inalions is t e coe
. hich is B == A -m2(m+ 1)(m+2) ... {m+mn(1-/)-2}/{mn(1-1)-1}! Hence ;~c required probability is 1-m(1-l)m-l obtained as the limit of B/A as ~ -+ ro.
26 UL (1958).
The proportionality factor is Ir;lnsrormation y = log x.
l/(1A
which is obtained by using the
f (1fo
00
00
E(X,)=_1_fx,-le-(I08X-mll/2al dx =_1_ =
(1fo 0 exp r(m + r(12/2).
e'Ye-(y- m l l / 2c71
-00
lienee the mean and variance. mode = em- al ;
maximum ordinate =
1
M::.' e- m+ al /2.
(1v 2n
Solution of a 1 = e m+ al/2 and a2 = e 2m + al (e al -1) gives the last results.
27 The proportionality factor is 12. Hence
f ,x 1
E(e'X )
=
12
e
.x 2(1-x)dx
=
12t- 4[e' (t 2 -4t+6)-2(t+3)]
o hy successive integration by parts.
28 Pearson, K. (1929), B, 21, 370, and UL (1961). Ir E(X) = m, then set z = X - m so that
(X2+aX +b)2
=
(z2+Az+B)2,
where A == a+2m, B == m2+am+b. Hence, since E(Z2+Az+B)2 ~ 0,
P2+(J[3; +A/j"i;;y+(1+B/J12)2 ~ Pl +1, where
dy
184
EXERCISES IN PROBABILITY AND STATISTICS
But a and b are arbitrary, so the inequality follows if
I+BIJl.2 =
J7i; +AIJii; = O.
0 and
P(x) = 2-" = exp( -x log 2). Thus x has a negative exponential dj. tribution with density function
29
(log 2) exp( - x log 2),
(0
x < 00).
~
Hence the cumulant-generating function is
= -10g(1- tllog 2), t < log 2. = p"+n(n+ 1),,/2 = exp(n log p - px). Hence the normalized dens K(t)
30 P(x) function of X is
II.
pe- P"/(I-e- P),
and for t
0, the cumulant-generating function is
_
K(t) - log
{[ 1+a2 ] [1-(a-t)e-("'-t)"/2]} 1 -ae ",,,/2 1 + (a-t)2
Hence,
1 - ax/2
E(X)
2a
= e"'''/2 -a + 1 +a2 ;
(e",,,/2 -a)(x-ax 2/4)-(1-ax/2)2 2(a 2 -1) var(X) = (ett"/2 _ a)2 + (a2+ W. Therefore for a > 0 and a2 negligible E(X)"" 1-(x-3)a,.., 1-0'14a,
and
[ x 2 -4(x-l)(x-2) ] var(X) '" (x-3) ll+ 4(x-3) .a"" (x-3)(1+0·16cx). 33 If k is the proportionality factor, then 00
kf
-00
00
-~=2kf 1+x4 0
ANSWERS AND HINTS ON SOLUTIONS: CHAPTER
,.Ibere
2
185
w:== 1/(1 +X4), so that k = 2/B(i, i)· 1
f:tX):::= 0 and var(X) = E(X2) = B(;, !)'
f
W-
i (1- w)-t dw
= 1.
o
wo
P(X
~ IX) = 2B(~,!)' f w- t (1-w)-tdw,
where Wo ==
(1+1X4)-1,
o
=
2;r!)'
j
C t (1-wot)-tdt,
o II
result, using term-by-term integration after expansion of hence the t
Ii -lI'olr .
\.I A ramous result due to K. Pearson.
'.1
hence
_ [(NP-X+ 1)(r-x+ 1)] Yx+l - Yx (Nq-r+x)x Ayx = Yx+l-Yx
and
Ayx 1 dy - - '" -'-d . Yx+t Y x
For bl = b2 = 0, the normal distribution with mean a and variance bo I'
obtained .
.IS Cadwell,1. H. (1952), B, 39, 207. x
F(x) = t+cI>(x) and
cI>(x) = -1-fe- tx2 dx, o
fo
so that
.11111
lIence
4F(x)[I-F(x)] '"
-2x 2/"
,I
2X4 7x 6 3n 45n
2X2 n
1--+---+ ... 2(n-3)x4 (60-7n2)x6 3n2 + 45n 3 +
'" e
+
'" -2x 2/" e
4 (60-7n 2)x6 ... } 2X 2/,,] [1 + {2(n-3)X 3n2 + 45n + e ,
hence the stated result.
3
...
186 36
EXERCISES IN PROBABILITY AND STATISTICS
A classical result due to E. C. Molina (1915).
= 1- F(/l),
P(X ~ /l)
where
Hence, integrating by parts successively, F(!l)
=
l-e- p
•
•
L
/lr/r!
= I-P(Y ~
v).
r=O
37 F(r) = P(X
= B(
=
~ r) =
.t
qn-r 1 _)'
r+ ,n r
(:)p"qn-.
r
L
.=0
(r) p" qr-·.B(n-r,r-s+l) s
f
f 1
(r)
qn-r . Lr p' qr-. B(r+ 1, n-r) .=0 s
zr-s(1- zr-r-1 dz
o
1
_
-B(
qn-r
I
).
r+ ,n-r
(l-Z)
n-r-1
r
.(p+qz)dz,
o
by reversing the order of summation and integration, q
= =
1 . B(r+ 1, n-r)
f yn-r-1(1_ y)r dy,
y=q(l-z)
o
G(q).
f x
38
P(X
~ x) =
1
1:
1 + fo' 0
e
-x 2 /2
d
x,
and the first result is obtained by term-wise integration of the series fore-"; For the second result,
ANSWERS AND HINTS ON SOLl'TIONS: CHAPTER
',lhe Oce
2
187
successive integrations lead to the series. Finally, oc· (211 + 1) ! e - x2/2 e - (1 2 + 2Ix)/2 IR.(x)1 =
f f
2., fo, n! . 0 (t+x)2n+2 . dt 00
(2n+l)!e- x2 /2 dt (2n)!e- x2 /2 1 < 2n.n!fo '0 (t+x)2n+2= 2n,n!fo'x2n+I'
19 Eolow, E,.R. (1934), AMS, 5, 137. . Use the series (x 2 +2r)-1 = x- 2
00
L (-I)k(2rNx 2k k=O
'valuate AI' A 2 " , , ,A, as power series in inverse powers of x, The stated ::~,~It is obtained by direct substitution in the expression for P(X ~ x).
.ro Direct integration gives k = (2n + 1)/2. (i) P(X ~ 0)
= (1 +sin2n+ 10)/2. p(IXI ~ 0) = 2P(X ~ 0) = 2[ 1- P(X < 0)] = (1- sin2n+ 10). Iii) P( -7t/6 ~ X ~ n/6) = 1t)2n+ I. E(X) = 0 and var(X) = E(X2) as stated, where
f
71/2
12n + 1 ==
xsin 2n +1xdx,
o
whence integration by parts gives the recurrence relation. Z has a Beta distribution. 41 Romanovsky, V. (1933). B. 25. 195. (i) Set
Ux = a~:+ 1(l-axY"
Then
=
fo' I .~
e- x2 / 2 dx.
!a~(l-axY"+ux(!-u~y".
Expand (1 - ax)m and integrate term-wise, whence (i) by noting that
f 00
1 (I 2\111 -x 2 /2 d - 0 fo'-oo U x '4- UxJ e x - .
fo'[
(Jx
(ii) (1.Sx
=
!+u(Jx, where U(Jx =
e- I / 2 dt.
The equality follows by integration of a~(I-ax)ma(Jx
=
!a~(I-ax)m+u(Jx(!-u~y".
(iii) Integrate
(iv) Integrate x2n a~+ 1(I-axY" = tx2n (X~(1-(XxY" +x2n
ux(!-u~y".
188
EXERCISES IN PROBABILITY AND STATISTICS
42 Probability density function of Y is (i) e-",
(0 ~ Y
< (0); (ii) 1/2Jy,
(0 ~ Y ~ 1);
(iii) 1/2Jy, for 0 ~ Y ~ !; and 1/4Jy, for! < Y ~ 9/4.
43
(i) Transform to polars by using x = z cos e, y = z sin 0, whence f probability density function of Z is I
2e-: 2 • Z3, (0 ~ Z < (0). (ii) With the transformation z = x + y and x = x, the region x +y _ x = y = 0 is transformed into the region bounded by x == 0 z~. I and z - x = O. The limits of x are (0 ~ x ~ z), whence the probabil density function of Z in this region is z for (0 ~ Z ~ 1). II, In the region x + y = 1, x = 1, Y = 1 make the transformal • Ii' Z = X +Y -1 and x = x, whence the transformed regIon is x " ' z = O. z = x. The limits of x are (z ~ x ~ 1) for fixed z. Therefore' this region the probability density function of Z is 0- z), ,;" (0 ~ Z ~ 1).
The two densities may be added as there is no discontinuil)' i the range of Z. (iii) The probability density function of Z is
al aZ . [e-aIZ_e-a2Z~, az-a,
(0 ~ Z
< (0),
obtained by using the transformation z = x + y, x = x. 44 The probability density functions of A and Bare (lh e - at and respectively for 0 ~ t < 00. Probability of A not happening in (0, t) is (1 +oct)e-"'. Probability of B happening in (t, t + dt) is !p 3 t Z e -fJI dt. The required probability of occurrence in the order BA is
f
!pJ,le '
IX)
(1 + oct) e- at . !p 3 t Z e- fJI • dt.
o 45 The probability density function of Z is Jf(x)g(z - x) dx. where II" integration is over the range of x for fixed z, and the region is bounded~. x = 0, x = oc. z = x and x = z - p. Therefore the density function of Z is (i) z/oc/3, if 0 ~ Z ~ oc; (ii) 1/13, if oc ~ Z ~ 13; (iii) (oc + 13 - z)/ocp, if 13 ~ Z ~ oc + p. For oc = 13, the distribution of Z is triangular with density function: i' for 0 ~ Z ~ 13, and (2p-z)/pz for 13 ~ Z ~ 213. Baten, W. D. (1934), AMS, 5, 13. Use the method of the preceding exercise. The pro babili ty density funclill!' of Z are as follows: (i) z(4zZ-6z+3), for 0 ~ Z ~ 1; and -4z 3 +18z z -27z+14, for 1 ~ Z ~ 2.
46
") 1 ( 11 4"
1
Iog [11 _+Jz] Jz ' for 0 ~ Z ~ 1; and
4" log
[3-Z+2(2-z)i] 1 ,for z-
1 ~ Z ~ 2.
ANSWERS AND HINTS ON SOLUTIONS: CHAPTER
2
189
''') i(SzS-40z4 +80z 3 -60z 2 + 15z), for 0 ~ Z ~ 1; and i(_SzS+40z 4 -80z 3 +100z 2 -95z+46), for 1 ~ Z ~ 2. The joint distribution of u = x 2 and v = y2 is
(UI
dudv
C'
4y uv
for 0 ~ U, V ~ 1.
Hence the probability density function of w = u + v is 11:
4'
for 0 ~ W ~ 1;
1
'4 cos-
~7
1
[
-
and
w2 -8W+8] w2 for 1 ~ W '
~
2.
The probability density function of the marginal distribution of X is 1 log x, for 1 ~ X < x
2'
00.
Ihe probability density function of the marginal distribution of Y is
~, 2y
lor fixed X =
for 1 ~ Y
(h)+21nf o
whence result.
f e-(X'+Y') /2 dxdy
e- r '12r drde
(a == tan - 1qJh),
7
h qxlh V(h, Q) = 21nf f e-(x'+y2)/2 dx dy. o
0
If the region of integration in the (x, y) plane having area hQJ2 is the sam. as the area of a sector with radius R and angle a, then R 2 = IIq/a. Thercfol( a
V(II, q) =
21n
h sec9
ff o
e- rll2 r dr de,
0
and P6lya's approximation is obtained by approximating the double inlW U
«R
21n
ff o
e -r2/2 r dr de,
0
whence the approximation for /(11, q).
ANSWERS AND HINTS ON SOLUTIONS: CHAPTER
2
195
Cadwell, 1. H. (1951), B~ 38, 475. bl put x == r cos e, y = r SIn e. J(h, q)
n/2
= 2~
f e-
th2 sec 28
de,
a
w
ate by parts successively, using [_h2ze- th2 (l+z2)] as the first function Inlegr . 'ach integratIOn. 11 t The alternative expression for the general term is obtained by putting : == I so that (1/w) d/dw = 2d/dt, and then differentiating by Leibnitz's :h~orem.
~2
~,l
Obtained ~y direct integration. For fixed y In (0 ~ Y ~ a), (0 ~ z ~ y); and [or fixed y in (a ~ y ~ 2a), [0 ~ z ~ (2a- y)]. For fixed z, [z ~ y ~ (2a-z)]. Astandard result obtained by using the transformation
~_ C'1+ a , v = '1_(Ca+Ab)
u=
A
Ihe variables
II
AB-C 2 '
'
and v are independently distributed and
Ae + B'12 -
2C~'1- 2a~ - 2b'1
(Ba 2 +Ab 2 +2Cab) (AB-C 2)
(AB-C 2)V2 A'
Au 2 - - - - - ' - - - - -
"hence result. fl.I The proportionality factor k is obtained from
k
f'"
fOO
x 2e - yx 2 dx d y _ k foo
(1+lxlt
-
-000
"~I
k = (IX -
dx
_ 2k foo _d_ x_ - I (1+x)a-'
(1+lxl)a-
-00
0
0/2.
The marginal density function of X is IX-I
(-00 < X < (0),
2(1 + Ixl)a'
IIllllhat of Y is
f 00
1 (IX-)
dx (1 +x)a '
x2 e-
YX2
(0 ~ Y
< (0).
o
!'q = 0; var(X) = 2/(1X-2)(1X-3). All moments of Y diverge since
f 00
E(Y) = (IX-l)
o
dx 2 -+ 00. x (1 +x)a
196
EXERCISES IN PROBABILITY AND STATISTICS
But
since the integrand is an odd function of x. Now E(X) 1 corr(X, Y) = - Jvar(X)' [E(y2)/E2(Y)-1]t' But var( Y) = E( y2) - E2( Y) > 0, so that
E(y2)/E2(Y)-1 > O. Hence corr(X. Y)
=
O.
65 UL (1963). Marginal distribution of X is (l-X)fZ XP+ 1 dx
B(a+ I, P+2) ,
(0 ~ X ~ 1).
Marginal distribution of Y is
1(1- y)fZ+ 1 dy B(a+2, P+ 1) ,
E(X) =
(0
~
Y ~ 1).
P+2 . (ct+ 1)(P+2) ct+P+3' var(X) = (ct+p+3)2(a+p+4);
P+ 1 Y (P+ 1)(ct+2) E(Y) = a+p+3; var( ) = (a+p+3)2(Ct+P+4) Conditional distribution of X for Y = Y is
[(Ct+ l)(l-x)fZ/(l- yt+ 1] dx,
(y ~ X ~ 1).
Conditional distribution of Y for X = x is
[(P + l)//xP+ 1] dy,
(0 ~ Y ~ 1).
(Ct+ l)(P+ 1)]t corr(X, Y) = [(ct+ 2HP+ 2) , given by the product of the two regression coefficients. 66
Murty, V. N. (1955), JASA, 50, 1136. The joint distribution of x and y is (0 ~ x ~ Ct; 0 ~ y ~ fJ).
Transform to polars using the transformation
x/a
= r cos 0, yiP = r sin O.
ANSWERS AND HINTS ON SOLUTIONS: CHAPTER
2
197
Integrate out over r, noting that for fixed θ in (0 ≤ θ ≤ π/4), (0 ≤ r ≤ sec θ); and for fixed θ in (π/4 ≤ θ ≤ π/2), (0 ≤ r ≤ cosec θ).
I'
P(z :::;; Po > p) = 1P(z:::;;
(_n_) (p/po)m ; m+n
PI < p) = (~)(P/Plr. m+n
Rider, P. R. (1955), JASA, 50, 1142.

67 (i) For m = n, the distribution of z is −n² z^{n−1} log z dz, (0 ≤ z ≤ 1), whence the distribution of v. (ii) For m < n, the distribution of z is
(0 ≤ z ≤ 1), whence the distribution of t defined by z = e^{−t} is on (0 ≤ t < ∞). Therefore, for v = 2nt,

E(v^r) = (2/λ)^r Γ(r+1)(1 − λ^{r+1})/(1 − λ).
The χ² approximation is obtained by assuming v ~ βχ² with ν d.f., where β and ν are obtained by equating the first two moments.

68 Rider, P. R. (1951), JASA, 46, 502. The joint distribution of w₁ and w₂ is

[n₁n₂(n₁−1)(n₂−1)/α^{n₁+n₂}] · w₁^{n₁−2}(α − w₁) w₂^{n₂−2}(α − w₂) dw₁ dw₂,   (0 ≤ w₁, w₂ ≤ α).
Make the transformation w₁/α = r sin θ, w₂/α = r cos θ. Then, for fixed θ in (0 ≤ θ ≤ π/4), (0 ≤ r ≤ sec θ); and for θ in (π/4 ≤ θ ≤ π/2), (0 ≤ r ≤ cosec θ). Integration over r gives the distribution of θ, whence the distribution of u = tan θ is

[n₁n₂(n₁−1)(n₂−1)/((n₁+n₂)(n₁+n₂−1)(n₁+n₂−2))] × [(n₁−n₂)u^{n₁−2} − (n₁+n₂−2)u^{n₁−1}] du,   for 0 ≤ u ≤ 1,

and for 1 ≤ u
−2 log Vₖ = Σᵢ χᵢ²,

where each χᵢ² is an independent χ² variable. The approximation follows by using the technique of the preceding exercise.

71 Hyrenius, H. (1953), JASA, 48, 534.
(i) 0 ≤ T ≤ 1 if u₂ ≤ v₁ and T > 1 if u₂ > v₁. Therefore, for 0 ≤ T ≤ 1, the conditional distribution of T given u₁ is
(nl -1)n2 I . H~ I (n2 -1)(1-UIY( _ T)H2-,-1 dT x (l-ulf,+H2 ,=0 r I x
f
(z_u.)H,+H2-,-2dz.
II,
For T> 1, the limits of integration for z for fixed UI and Tare UI ~
Z
~
1 +uI(T-l) T '
whence the second part of the conditional distribution of Tis
(nl -1)n2
H~ I (n2 -1) (-1)' dT . ,=0
r
TH'(nl+r)
The conditional distribution of T does not involve u₁ explicitly, hence this is the unconditional distribution. (ii) Use the transformation V = (v₂ − u₂)/(v₁ − u₁), y = u₂ and z = v₁. Note that (0 ≤ V ≤ 1) if (v₂ − u₂) ≤ (v₁ − u₁), and (1 < V < ∞) if (v₂ − u₂) > (v₁ − u₁). For fixed z and V ≤ 1, the limits of y are u₁ ≤ y ≤ 1 − (z − u₁)V; and for fixed V, (u₁ ≤ z ≤ 1). Hence, for (0 ≤ V ≤ 1), the conditional distribution of V for fixed u₁ is
n2(n2- 1)(nl-l)
("1+ 112- 1)(III+n2- 2) '
[( nJ + n2- I)V"2- 2 -nl+n2( 2)VH2-I]dV .
For V > 1, the corresponding limits for y and z successively are

u₁ ≤ y ≤ 1 − V(z − u₁)   and   u₁ ≤ z ≤ u₁ + (1 − u₁)/V,

whence the conditional distribution of V is

[n₂(n₂−1)(n₁−1)/((n₁+n₂−1)(n₁+n₂−2))] · V^{−n₁} dV.

As before, the conditional distribution is free of u₁.
Use the polar transformation w₁ = R cos θ and w₂ = R sin θ; then, after appropriate integration over R, the distribution of V is

[n₂(n₁−1)/(n₁+n₂−1)] V^{n₂−1} dV,   (0 ≤ V ≤ 1);   and

[n₂(n₁−1)/(n₁+n₂−1)] V^{−n₁} dV,   (1 ≤ V < ∞).
72 UL (1963). The manufacturer's specification is satisfied if

e^{−βm} > (1 + λ) e^{−αm},   or   β < α − log(1 + λ)/m,

whence the inequality.
73 Irwin, J. O. (1941). Discussion following E. G. Chambers and G. U. Yule (1941), JRSS(S), 7, 89.
At time t + δt the change in v_x is

δv_x = v_{x−1} f(t, x−1) δt − v_x f(t, x) δt,

whence the differential equation

dv_x/dt = v_{x−1} f(t, x−1) − v_x f(t, x)

as δt → 0.
(i) For f(t, x) = kψ(t), we have for x = 0

dv₀/dt = −kψ(t) v₀,

and since at t = 0, v₀ = N, therefore v₀ = N e^{−kT}. Again, from

dv₁/dt + kψ(t) v₁ = kψ(t) N e^{−kT},

v₁ = N e^{−kT} · kT. In general, if

v_{r−1} = N e^{−kT}(kT)^{r−1}/(r−1)!,

then

dv_r/dt + kψ(t) v_r = kψ(t) v_{r−1}

gives

v_r = N e^{−kT}(kT)^r/r!.

(ii) A similar method to (i) leads to the negative binomial.
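A simulation sketch of case (i): with f(t, x) = kψ(t), each of N individuals accumulates accidents as a Poisson process (taking ψ ≡ 1 here, an assumption for illustration, so that T is just elapsed time), and the class sizes v_r/N should approach e^{−kT}(kT)^r/r!.

```python
import math
import random

random.seed(29)
k, T, N = 1.0, 2.0, 50000   # hypothetical rate, horizon, population size
counts = [0] * 12
for _ in range(N):
    n, t = 0, 0.0
    while True:
        t += random.expovariate(k)   # inter-event times of a rate-k process
        if t > T:
            break
        n += 1
    if n < 12:
        counts[n] += 1
emp = [counts[r] / N for r in range(4)]
theory = [math.exp(-k * T) * (k * T) ** r / math.factorial(r) for r in range(4)]
print([round(e, 3) for e in emp], [round(th, 3) for th in theory])
```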
74 Irwin, J. O. (1941). Discussion following E. G. Chambers and G. U. Yule (1941), JRSS(S), 7, 89.
Evidently, E(x|λ) = var(x|λ) = λ. The cumulant-generating function of λ is −r log(1 − t/c), so that E(λ) = r/c and var(λ) = r/c². The marginal distribution of x is

∫_0^{∞} (e^{−λ} λ^x/x!) · (c^r λ^{r−1} e^{−cλ}/(r−1)!) dλ = c^r (x+r−1)! / [(1+c)^{x+r} x! (r−1)!],   for x ≥ 0.
and

F₁(T) = ∫_0^T f(t) dt.

Then the expected number of replacements at the post is

G(T) = Σ_{m=1}^{∞} F_m(T),   and   g(t) dt = [dG(t)/dt] dt.

For the second part, there are two possibilities: (i) a bulb replaced at t = T; and (ii) a bulb not replaced at t = T.
For (i), consider a bulb replaced in (x, x + dx), where x < X is the time interval up to t = T. The probability that this replacement is not replaced at t = T is g(T−x)S₁(x) dx, so that

P = ∫_0^X g(T−x)S₁(x) dx.

Hence the probability that the lifetime of a bulb replaced at t = T is > t is (1−P)S₁(t). Also, for (ii), the probability of a replacement bulb not replaced at t = T having a lifetime > t is

∫_0^X g(T−x)S₁(x)S₁(t+x) dx.
79 Campbell, N. R. (1941), JRSS(S), 7, 110. Here p₁ = p and

pₙ = p[1 − (p′ − p)ⁿ]/(1 − p′ + p),

whence the required limit, since |p′ − p| < 1 and 1 − (p′ − p)² > 0.

The integral expression for p is obtained as for P in the preceding exercise. The expression for pp′ is obtained by considering two consecutive replacements at x₁ and x₂, where (T−X < x₁ < T) and (2T−X < x₂ < 2T). Since no change takes place at t = T, the probability for given x₁ and x₂ is

g(T−x₁) dx₁ · S(x₁) · g(T−x₁−x₂) dx₂ · S(x₂).
For f(t) = .1 2 e-).I. t. p
and pp'
= (l-e-).X)- ~ . e-).x[1 + e-l),(T-X)],
= [1-e-).X (1 + ~)] [1-e-).X (1 +
_A.:
e-).(2T-X) •
A.:) _A.:
e-).(2T-X)
(l-e- 2ATl]
[1- e- 3).X (1 +lA.X)].
80 The joint distribution of the uⱼ is

n! Π_{j=1}^{n} f(uⱼ) duⱼ,

and that of the zⱼ is

n! Π_{j=1}^{n} dzⱼ,   where   zⱼ = ∫_{−∞}^{uⱼ} f(x) dx,
and 0
or < 1 according as 11/ N < or > (n - m)/(N - M). Hence estimate
The logarithm of the sample likelihood is

log L = −(n/2) log σ² − Σ_{i=1}^{n} log xᵢ − (1/2σ²) Σ_{i=1}^{n} (log xᵢ − m)² + constant,

whence estimates by differentiation with respect to m and σ².

The distribution of X is

θ^{−2} x e^{−x/θ} dx,   (0 ≤ x < ∞).
Therefore

E(X^r) = θ^r ∫_0^{∞} e^{−x/θ}(x/θ)^{r+1} d(x/θ) = θ^r Γ(r+2),

so that E(X) = 2θ; E(X²) = 6θ²; var(X) = 2θ².

The maximum-likelihood estimate of θ is θ* = x̄/2. Therefore var(θ*) = var(x̄)/4 = θ²/2n. Finally,

E[(1/3n) Σ_{i=1}^{n} xᵢ²] = 2θ²,   and   E(x̄²) = var(x̄) + E²(x̄) = 2θ²/n + 4θ² = 4θ²(1 + 1/2n).
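The estimate θ* = x̄/2 can be checked by simulation: X with density x e^{−x/θ}/θ² is a Gamma variable with shape 2 and scale θ, so θ* should be unbiased with var(θ*) = θ²/2n. A minimal sketch (the parameter values are arbitrary):

```python
import random

random.seed(5)
theta, n, trials = 2.0, 40, 20000
ests = []
for _ in range(trials):
    xs = [random.gammavariate(2.0, theta) for _ in range(n)]
    ests.append(sum(xs) / (2.0 * n))        # theta* = xbar/2
mean_t = sum(ests) / trials
var_t = sum((e - mean_t) ** 2 for e in ests) / trials
print(round(mean_t, 2), round(var_t, 3))    # theory: theta = 2.0, theta^2/2n = 0.05
```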
10 The probability density function of X is

(α+1)(α+2) x^α (1−x),

so that α > −1. The roots of the maximum-likelihood equation are

α* = [−(3g+2) ± (g²+4)^{1/2}]/2g.

Now −∞ < g ≤ 0, and α* → −1 as g → −∞. Hence the positive root is taken for α*.
11 whence

E(X) = e^{μ+σ²/2};   var(X) = e^{2μ+σ²}(e^{σ²} − 1).

Equate the mean and variance with x̄ and s² to obtain two simultaneous
equations which give the moment estimates

μ* = ½ log[x̄⁴/(s² + x̄²)].

The median of the distribution is e^μ < E(X).

12 UL (1962). Since the distribution function must be 1 for x = π/2, the proportionality factor is (1 − e^{−α})^{−1}, and the probability density function of X is defined over (0 ≤ x ≤ π/2). The maximum-likelihood equation is

1/α* − 1/(e^{α*} − 1) = T,   or   α* = (e^{α*} − 1)/[1 + T(e^{α*} − 1)],

where T = (1/n) Σ_{i=1}^{n} sin xᵢ.
13 By Taylor's theorem, g(X) ≈ g(m) + (X−m)g′(m), neglecting higher derivatives, so that E[g(X)] ≈ g(m), and

var[g(X)] ≈ E[g(X) − g(m)]² ≈ [g′(m)]² var(X).

The generalization follows by considering an expansion of g(X₁, X₂, ..., X_k) about Xᵢ = mᵢ, for i = 1, 2, ..., k. If the Xᵢ are cell frequencies with a multinomial distribution, then E(Xᵢ) = mᵢ. Hence the result.

14
UL (1961). The equation for θ* is

n₁/(2+θ*) − (n₂+n₃)/(1−θ*) + n₄/θ* = 0.

Var(θ*) is obtained from the second derivative of the logarithm of the sample likelihood. Use the results of the preceding exercise to prove that E(θ̂) = θ and

var(θ̂) = (1 + 6θ − 4θ²)/4N.
15 UL (1963). In the first case, P(X = 0) = e^{−m}; P(X ≥ 1) = 1 − e^{−m}. Hence the sample likelihood is

L = C(N, n)(e^{−m})^n (1 − e^{−m})^{N−n},

whence m* = log(N/n) and var(m*) = (1 − e^{−m})/(N e^{−m}).
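The first case can be checked by simulation: only "X = 0" versus "X ≥ 1" is recorded for N Poisson(m) counts, n being the number of zeros. A minimal sketch with arbitrary m and N:

```python
import math
import random

random.seed(17)
m, N, trials = 1.5, 200, 20000
p0 = math.exp(-m)                   # P(X = 0)
ests = []
for _ in range(trials):
    n = sum(1 for _ in range(N) if random.random() < p0)
    if n > 0:
        ests.append(math.log(N / n))     # m* = log(N/n)
mean_m = sum(ests) / len(ests)
var_m = sum((e - mean_m) ** 2 for e in ests) / len(ests)
# theory: E(m*) ~ m = 1.5, var(m*) ~ (1 - e^-m)/(N e^-m) ~ 0.0174
print(round(mean_m, 2), round(var_m, 4))
```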
In the second case, P(X = 0) = e^{−m}; P(X = 1) = m e^{−m}; P(X ≥ 2) = 1 − (1+m)e^{−m}. Hence the sample likelihood is

L = [N!/(n₀! n₁! (N−n₀−n₁)!)] (e^{−m})^{n₀} (m e^{−m})^{n₁} [1 − (1+m)e^{−m}]^{N−n₀−n₁}.
ANSWERS AND HINTS ON SOLUTIONS: CHAPTER
var ( 111
**)
=
3
m[l- (1 + 11l) e-"'] Ne "'(1-I1l+m 2 -e Ill)
m(e'" -1-11l) =var(I11*).( e '" - 1)(1 -11l+m-e 2 Ill)'
whence the result on retaining the first power of m.

16 r* = (n₁/N)^{1/2}, since E(n₁) = Nr²; and E(n₁+n₃) = N(1−p)², so that p* = 1 − [(n₁+n₃)/N]^{1/2}, with q* obtained similarly. The approximate estimates do not add up to unity. Use Ex. 13 above to prove that

var(r*) = (1−r²)/4N;   var(p*) = p(2−p)/4N;   var(q*) = q(2−q)/4N.
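The delta-method variance for r* can be checked directly: n₁/N estimates r², and r* = (n₁/N)^{1/2} has var(r*) ≈ var(n₁/N)/(2r)² = (1−r²)/4N. A minimal simulation sketch with arbitrary r and N:

```python
import math
import random

random.seed(2)
r, N, trials = 0.6, 400, 10000
ests = []
for _ in range(trials):
    n1 = sum(1 for _ in range(N) if random.random() < r * r)
    ests.append(math.sqrt(n1 / N))
mean_e = sum(ests) / trials
var_e = sum((e - mean_e) ** 2 for e in ests) / trials
# theory: E(r*) ~ 0.6, var(r*) ~ (1 - 0.36)/(4*400) = 0.0004
print(round(mean_e, 3), round(var_e, 5))
```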
17 The maximum-likelihood equation for p̃ is

a₁/(2+p̃) − a₂/(2−p̃) − a₃/(1−p̃) + a₄/(1+3p̃) = 0.

Then p = (1−θ)², so that var(θ̃) = var(p̃)/4p.
If Q = a₁a₄/a₂a₃, then

var[log Q] ≈ Σ_{i=1}^{4} 1/E(aᵢ) = 64(5+2p−4p²)/[3N(4−p²)(1−p)(1+3p)].

But var[log Q] = var(Q)/[E(Q)]², and

var(Q) = [dQ/dp]²_{p*=p} · var(p*),

whence var(p*) = var(p̃) by appropriate substitution.
The sample likelihood from the two experiments is L=
(~J {l(4-0)}"' {i(4+9)jN,-n. x x
(N1132) {6~ (12 + 9)(4 - O)}"3 {614(4 + 9)2}NZ-II,.
213
214
EXERCISES IN PROBABILITY AND STATISTICS
e*
whence the equation for
is
n,+n 3 N,+2N 2 -n,- 2n3 n3 - 4-0* + 4+0* + 12+0* = O. For large samples, to test the hypothesis R(O
= ( 0)
use
. . 0*-0 0 S.E. (0*) - a umt normal vanable.
19
Pollard, H. S. (1934), AMS, 5, 227. The probability density function of X is
f(x)
=
1 . (1 +A)fo
[~.e-X2/2ai+~.e-(X-p)2/2(1~], (11
(-00 < X
m.
0, E(X)
=
~
var(x)
=
=
m=
O. Hence
~ (11
(12
is a sufficient condition
[Of
27(1 + ..1.)2 [1 A ]2' 4N ~. exp{ -m 2 /2(1i} +~. exp{ -(m- Jl)2/2(1~} (11
Therefore, for Jl
and /1 > O.
(12
= 0, the condition var(x)
=
var(x) reduces to
(1+AP2)(P+).V-~(1+A)3p2 This is a quartic in P, say g(p)
=
=
O.
O. Then g(O) = . 1. 2 > 0, g(oo)
-+ 00
and
g(1) = (1 +,;l(1-n/2) < O.
Therefore the quartic has two positive roots PI and P2' satisfying the statcJ condition for var(x) and var(x). 20
UL (1963). The distribution of Xr is
B
1
rl", /1-1"+
1 )
.00-nx~-1(0(-xr)n-rdx"
(0 :::;;
Xr :::;;
0:),
E[(Xr)"] = r(n+ 1)r(k+r) 0( r(r)r(n+k+ 1)'
the mean and variance of X r • = 0("/(k+1), for k ~ 1. This gives Also, E(x) = 0(/2 and var(x) = 0(2/12n.
,·~nce E(Xk)
,Ir n
"" 2v+ 1. the median is x v + 1 so that E(Xv +l) = 0(/2 and
var(xv+Il = 0(2/4(n+2).
21 Since families with no abnormal children are excluded, the total probability is (1 − q^k). Therefore

E(rᵢ) = Σ_{rᵢ=1}^{k} rᵢ C(k, rᵢ) p^{rᵢ} q^{k−rᵢ}/(1 − q^k) = kp/(1 − q^k),   for i = 1, 2, ..., n.

Therefore for the average of the rᵢ, E(r̄) = kp/(1 − q^k), whence the moment estimate of p is obtained from

r̄ = kp*/[1 − (1 − p*)^k].

Also,

n var(r̄) = var(r) = [kpq/(1 − q^k)] · [1 − kpq^{k−1}/(1 − q^k)],   and   var(r̄) = [dr̄/dp]²_{p*=p} · var(p*).

22 Haldane, J. B. S. (1943), B, 33, 222.
E(n/r) = Σ_{n=r}^{∞} (n/r) C(n−1, r−1) p^r q^{n−r} = Σ_{s=0}^{∞} C(r+s, s) p^r q^s = p^{−1}.
But

E(r/n) = r(p/q)^r ∫_0^q t^{r−1}(1 − t)^{−r} dt = rp ∫_0^1 u^{r−1}[1 − q(1 − u)]^{−1} du,   where u = pt/q(1−t),

whence the result by term-wise integration.
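The first identity can be checked by simulation: if n is the number of Bernoulli(p) trials needed for the r-th success, then E(n/r) = 1/p exactly, although r/n is a biased estimate of p. A minimal sketch with arbitrary p and r:

```python
import random

random.seed(13)
p, r, trials = 0.3, 4, 50000
total = 0.0
for _ in range(trials):
    n = succ = 0
    while succ < r:          # inverse binomial sampling to the r-th success
        n += 1
        if random.random() < p:
            succ += 1
    total += n / r
mean_nr = total / trials
print(round(mean_nr, 2))     # theory: 1/p ~ 3.33
```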
23 Finney, D. J. (1949), B, 36, 233.
E[g(a,
PH = pll+ 1 qP . coefficient of X'-Il-l =
in
L (q + PXY'-Il-P-t
n=,
p"qP. coefficient of X'-Il- 1 in (q + pXY-Il- P-l .
f
X,
1=0
whence E[g(1,0)]
=
p
and
E[g(2, 0))
=
p2.
Also,
1
= (r-1)p2
f 11- 2[1-q(1- u)r
du,
1
u
=
pt/q(1-t)
o
= p2
f tf/(r + ss - 1) .
• =0
Hence an unbiased estimate of var(p*) is p*2 _ g(2, 0) = p*2 (1- p*)/(r- 1- p*).
For the mode, consider P(n)/P(n−1) = (n−1)q/(n−r), so for maximum probability

(r−1)/p − 1 < n − 1 < (r−1)/p.
24 Use the transformation
y-m2 and - - = v. (]'2
Then
E[(X :~lr (Y~~2r]= -Q)
ff 00
= 2~
-
a)
00
exp{-!(u 2 +w2)}.u'[w(1-p2)t+pul'dudw,
-00-00
where w = (v-pU)(1_p2)-t, whence the result by term-wise integration.
Also. since
x ( u)(1+-V)-l
-=..1. 1+Y ml
m2
~r~fore
Hence V~k -
1 (2k) 2k . k !
OJ
E(Tl ) = . 1. [ 1 +(V2 - PV1) k~l
!] 1+
..1., as n --+
00.
Hul X and ji have a normal bivariate distribution with correlation P and
.fficients of variation
,(lI:
x]
E[Y
[1
vdJn, v2IJn respectively. Therefore
= . 1. 1 + In(V2 -
OJ (2k)! (V2 )2k-l] PV1)' k~l 2k. k! In
-+
..1., as n -+
00.
\\ a first approximation,
var[~]
mi 2 var(x)+mfm;:4 var(ji)-2mlm;:3 cov(x, y).
=
'11
2~ Let '2 and '3 be the number of insects killed at strengths x = -1,0,1 "spcctively so that' 1 +, 2+, 3 == ,. Hence the joint likelihood of the sample is
1.=
( ")[ '1
1'1[
1 1+e (a liJ X
]"-'1
e-(a-l) 1+e (a 1)
('3n)[ 1+e 1
(n)[ '2
X
]"[e-(a+l)]"-" 1+e (
var(S I) - var(S 2)]
0"2
= 2"[I1{I+(11-1)p}-v l {I+(v l -l)p}-v 2{1+(v 2 -1)p}] = 0"2pV\ V2, whence corr(SI, S2)'
The limiting value of the correlation as
VI ---+ 00
is
51 UL (1963). Sample mean of X values Sample mean of Y values
= =
(n2+ n4 -nl -n3)(X/N == A, 1(X. (n3+ n4 -nl -n 2)f3/N == A, 213. (1l 1-n Z-1l3+ n4)/N.
A,3 =
The correlation is obtained by evaluation of the two sums of squares and Ih sum of products. For zero correlation, t
~= and if this is satisfied as N P(X
52
=
-(X,
---+
Y
(nl ~n2) (nl ~n3),
'X),
then
= -13) =
P(X
=
-(X)P(Y
= -{i).
Pearson, K. (1924), B, 16, 172. P(X = r) =
(~)(~l-:) (~)
and P(Y= siX
= r) =
(M:r)(N-~=;I+r)/(N~n}
Hence the result for P(X = r, Y = s) is obtained by recombining the binomial coefficients. The limits are 0::::; r ::::; min (nl' M); 0::::; s ::::; min (n2' M -r) and 0::::; r+s ::::; min (n l +n 2, M). For X = r the conditional distribution of Y is hypergeometric, so Ihal E(YIX
Similarly,
= r) = n2 (M -r)/(N -nIl.
E(XIY = s)
=
nl (M -s)/(N -n2)'
The linear regression coefficients of Y on X and of X on Yare -n2/(N -11 11 and -nl/(N -n2)' whence corr(X, Y).
whence result.
53 Greenwood, J. A. (1938), AMS, 9, 56. Consider any two cards of the B pack. There are two cases. I.. Cards ali of same type. II. Cards are of different types. I and II can occur III
(~)
-(XI
== (X2 ways, respectively.
t- 1 NN-a-I j(N) -1 . IX, 2 t-l. - -N-l P(X = 0, Y= 1) = . I X , )(N) 2 1 N-a P(X = 1, Y= 0) = -.--.IX, N-l j(N)2 1 a-I P(X= I,Y= 1)=-.--.IX, N-l j(N)2 .
rhUS for I,
P(X = 0 Y = 0) = ,
t
.
a
t
t
t
for II,
'
P(X = 1, Y = 1) =
1 = ('
P(X
=
P(X
= 0, Y= 1) =
1, Y= 0)
~. N ~ 1 . 1X2j(~)
P(X=O, Y=O)=
N-a-l N-l .1X2j(N)2
[~. (~~II)+e~2)(N~I)] .1X
2
/G)
[~·(Z=~)+(t~2)(N;~~I)] .1X2/(~)'
Ihe last two probabilities are obtained by considering separately the possibilithat the first card of the A pack mayor may not belong to the type of Ihe second card of the B pack. Hence the joint distribution of X and Y. Finally, let z" Z2,'" , ZN be random variables each associated with one "f the cards of the B pack such that Ii~S
1 if a match occurs at the kth position;
Zk
=
Zk
= 0 if no match occurs at the kth position.
Ihen N
SN =
L Zk'
Zk
has the marginal distribution of X or Y,
k=l
md for i #- j, corr(Zj, Zj) = corr(X, Y). ~ Put X -m, = wand
Y-m2
=
z. Then
(1 +~) (I---=-+ Z22_"') m1 m2 m2 z) (WZ WZ2)] w ",..1. [1+ ( m 1- m2 - ml m2 - m~- mlm~ .
X = w+ml = . 1. Y z+m2
Z2
(1)
Psing the first approximation X/y", ..1.[1 + (w/ml -z/m2)] gives E(X/Y) '" . 1. Ind the approximation for var(X/Y). The approximation for the bias in E(X/Y) is obtained by considering the l1~xt terms in the expansion (1) for (X/Y) and noting that E(WZ2) = 0 for a 'Ymmetrical distribution. Hence E(X/Y)-}, '" ..1.ViV2-PVl)'
Pearson, K. (1931), B, 23, 1.
55
Since E(t) = 0, cov(,x, t)
=
E[(x-m)t]
E[(x-m)2fi/s]
=
(11-1)1 J~'r(I1;I)" -2- , by direct integration.
=
r(I1-2) -2
a
But var(t) = (11-1)/(11-3), and varIx) = a 2 /11, whence corr(x, t). With the transformation X
56
= a+bU
r = it! (Xi-X)(Yi-
=
J!
and Y
= a+{W,
jiy[ it! (Xi-X)2. it! (Yi- ji)2 r
(Ui-U)(Vi-Vlj[it! (Ui- U)2. it! (Vi-W]},
III
where (Xi-X) = b(Ui-U); (Yi-ji) = P(Vi-V). Hence the invariance ofr. Put a = m x , b = ax, a = Illy, P = a y , then r is as in (1), and the joint di,. tribution of U and V becomes
1
21t(1_p2)!'ex p
[1
2
2
]
2(1_p2)'(U +V -2puv) dudv,
(-00 < U,V 0; c := np+t. Ihererore y=
1 -(x+m)/P ( + )a pa.+1 r(ex+l)·e . x m,
(
~ X < -/11....,
)
00,
where {J:= (1-2p)/2 > 0;
ex:=
4(11 + I)p(l- p) (I_2p)2
211p + 1
> 0; /11:= 2(1-2p) > 0,
'!llhat
2(x+m)
fJ \1
2{r(1-2p)+(I1+ l)p} (I_2p)2
hence the result.
59

[φ(a)]² ≈ (1/2π) ∫_0^{π/2} ∫_0^{2a/√π} e^{−r²/2} r dr dθ,

which gives Pólya's approximation. For a standardized normal variable X, P(X ≤ x) = ½ + φ(x), and the distribution of the sample median z is

[(2ν+1)!/(ν!)²] · [¼ − {φ(z)}²]^ν · (1/√2π) e^{−z²/2} dz.

Hence the result by using Pólya's approximation.
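Pólya's approximation, φ(x)² ≈ ¼(1 − e^{−2x²/π}) with φ(x) = P(0 ≤ X ≤ x) for a standardized normal X, is easy to check against the exact value computed from the error function:

```python
import math

rows = []
for x in (0.5, 1.0, 2.0):
    exact = 0.5 * math.erf(x / math.sqrt(2.0))                  # exact phi(x)
    approx = 0.5 * math.sqrt(1.0 - math.exp(-2.0 * x * x / math.pi))
    rows.append((x, round(exact, 4), round(approx, 4)))
print(rows)   # the two columns agree to about 3 decimal places
```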
60 Cadwell, J. H. (1952), B, 39, 207. Using the Cadwell approximation to the distribution of the median gives, on normalization, the distribution of y. The moments of y are obtained from

(1/√2π) ∫_{−∞}^{∞} y^{2r} e^{−y²/2} dy = (2r)!/(2^r r!).
61 A particular case of a general result due to M. S. Bartlett. The distribution of t is

dt/[√n B(n/2, ½)(1 + t²/n)^{(n+1)/2}],   (−∞ < t < ∞).

Hence the distribution of w = (1 + t²/n)^{−1} is

w^{(n−2)/2}(1 − w)^{−½} dw/B(n/2, ½),   (0 ≤ w ≤ 1),

and the distribution of y = −n log w is proportional to

e^{−y/2} y^{−½}[1 − (y/2n) + O(n^{−2})] dy.

The result follows by approximating the quantity in the square brackets as exp(−y/2n).

62 This is K. Pearson's well-known P_λ test. Each pᵢ is an independent variable with a uniform distribution in (0, 1).
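Since each pᵢ is uniform on (0, 1), −2 log pᵢ is χ² with 2 d.f., and −2 Σ log pᵢ is χ² with 2k d.f. (mean 2k, variance 4k) — the basis of the combination test. A minimal simulation check:

```python
import math
import random

random.seed(42)
k, trials = 5, 20000
samples = [sum(-2.0 * math.log(1.0 - random.random()) for _ in range(k))
           for _ in range(trials)]           # 1 - random() avoids log(0)
mean_s = sum(samples) / trials
var_s = sum((s - mean_s) ** 2 for s in samples) / trials
print(round(mean_s, 1), round(var_s, 1))     # theory: 2k = 10 and 4k = 20
```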
63 UL (1962). Obtained by using the general result of the next exercise.

64 The joint distribution of x₁ and xₙ is

[n!/((1!)²(n−2)!)] · f(x₁) [∫_{x₁}^{xₙ} f(x) dx]^{n−2} f(xₙ) dx₁ dxₙ,   (x₁ ≤ xₙ ≤ β; α ≤ x₁ ≤ β).
For fixed x₁, make the transformation R = xₙ − x₁, and then for fixed R integrate for x₁ in (α ≤ x₁ ≤ β − R). If X is uniform in (0, 1), then the distribution of R is

R^{n−2}(1 − R) dR/B(n−1, 2),   (0 ≤ R ≤ 1).
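This Beta(n−1, 2) law for the range, with mean (n−1)/(n+1), can be checked by simulation. A minimal sketch with an arbitrary n:

```python
import random

random.seed(1)
n, trials = 5, 50000
total = 0.0
for _ in range(trials):
    xs = [random.random() for _ in range(n)]
    total += max(xs) - min(xs)        # sample range of n uniforms
mean_r = total / trials
print(round(mean_r, 3))               # theory: (n-1)/(n+1) = 4/6 ~ 0.667
```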
65 The joint distribution of x₁ and x₂ is
(2v+2)!. (XI)" (1- X2)v. dXI dx 2, (V!)2 2a 2a (2a)2 and the joint distribution of y and z is
_-:::---7"~4( v + 1)
(2a)2l'+2B(v+l,v+l)'
(y _ z)V (2a _ _ z)V d dz y y.
obtain the distribution of y integrate over z, the range being ro (i) (0 ~ z ~ y) if (0 ~ y ~ a); (ii) [0 ~ z ~ (2a- y)] if (a ~ y ~ 2a). rhus for (0 ~ Y ~ a), the probability density function of y is y
4( v + 1) (2a)2.+2B(v+1,v+1)'
f [(a-z)
2
2 •
-(a-y)] dz,
o
Ivhence the result by expansion and term-wise integration. Similarly for the range (a ~ y ~ 2a). Direct integration gives P(O
±(~
~ y ~ a) =
22 .+ I B( 1 1 1)' v+ , v+ r=O v
1
-
- 22'+ l B(v+1,v+1)
. •
I~O
(V)
t (
)(-l).-r/(2V-2r+1)
r
-11 JI )
Z21
dz
o
1
= 22>+
1
1 B(V +
f
1, v+ 1)' (1-Z2)" dz o
B(v+1,!)
1
= 2 2.+ 2 B(v+ 1, v+ 1)
= 2,
using the duplication formula √π Γ(2ν+2) = 2^{2ν+1} Γ(ν+1) Γ(ν+3/2).
xJn
66 Make the transformation = r sin (J and sJn=l = r cos (J to obtain Ihe joint distribution of rand (J as e -nm'/2a 2 - - - - - - - 1 - . cosn- 2 ede. e-r2/2a2. rn - 1 x 2(n-2)/2Jn qnr(n;
xexp
)
(mJn ..sm e) . dr, ~.I
Expansion of the last exponential and term-wise integration over r then gives Ihe distribution of e as
~"
e- n/ 2V2
- - - - . L:.
/it
r(II;1)
j)
(2n)i,2r(n + 2
vJr(j+ 1)
j=O
.cos
Rut v = [n/(n - 1)] cot2e and cot( - e) distribution of e for (0 ~ e ~ n/2) is
2 e- n/ 2v2
In r (
00
L:
-1 ..
11:2
)
J=O
(2njV2)i r(2'
1)'
n-
=
2e sini e de,
-
cot e, so that by pooling, the
r (n+2j) -2- . cosn- e sin
'.}+
lienee the () ...... v transformation gives the result.
2
.
2J
O dO.
Jv,
To obtain the distribution of w = consider the distribution separately over the ranges (-n/2 ~ () ~ 0) and (0 ~ () ~ n/2). Hence byOfhO transformation Ic
w=
C: r. 1
cot (),
the distribution of w is
e-n/2V2
co
2 "2r(n+i) (2nIV)' -2-
'L
In r ( n~ 1)
(n-l)t dw
(n-1)w 2
r(j+ 1)
j=O
}(j+2 l/ 2
n
{
{ I+
n
}(n+ jl/ 2
n (n -1)w 2
for 0 ~ w
O. With the transformation
76
x
=
rcosO, y
rsinO,
=
the region is transformed into r2:s;; 1, subject to 1- p sin 20 ~ 0 for all Ipi :s;; 1. Therefore the limits are (0 :s;; r :s;; 1) and (- n/4 :s;; 0 :s;; 7t/4). Henc the result, since u = 1- p sin 20. c Transform to u = (X-m1)/0"1 and v = (y-m 2)/0"2· Then the joint dis. tribution of z = u/v and v is
77
2 n R · exp { 2(1
~p2). V2(1-2 Pz+z 2)} ·Ivl dvdx, ( - 00
< v, z < (0).
Hence the distribution of z is (1- p2)t dz 7t . (z_p)2+(I_p2)'
( - 00
< z < (0).
78 The joint distribution of u and y is _1_. exp 27t0"10"2
{_ty2(U:+~)} .IYI dydu, 0"1 0"2
( - 00
< u, y
> 0, and 2 < β₂ < 3. For β₂ → 2, n → 0 and the distribution becomes uniform; for β₂ → 3, n → ∞ and the distribution becomes bivariate normal.
84
Pearson, K. (1923), B, 15, 231. Using the notation of the preceding exercise,
E(YIX
=
x)
0"2
= p-x. 0"1
Therefore. if f1 (x) is the marginal density function of X. a
fl(X). var(YIX =
x) =
f -a
(y-p ::
x)
f(x, y) d(y-P ::
x),
fl(X)
=
M:
r(n+2)
Y 2n . 0' 1(n + 2)tr(n + i)
3
239
]"+t
[X2 2 ' 2(n + 2)0' 1
. 1
The semi-axes of the equi-probability contour are
0' 1 I/I(2n
+ 4)t
.!~[(I_ p2)(2n + 4)]t, so that the total probability outside the contour is
f
and
1
f(x, y) . 4n(n + 2)0' 10' 2(1- p2)tl/l dl/l = (1- 1/1 2)"+ 1.
I/!
85 UL (1961).
Let x̄ and s be the sample mean and standard deviation obtained from the first experiment. Then α ≡ (x̄ − m)/s is known. If the required number of plots is (ν+1), then

α√(ν+1) = 1% value of Student's t with ν d.f.,

and so

α = [1% value of t with ν d.f.]/√(ν+1) = f(ν), say.

Inverse interpolate on the sequence of values N, f(N) obtained from the t table for varying N d.f. to determine ν such that f(ν) = α. The number of plots is the least integer containing (ν+1). To allow for seasonal variations, divide each experimental plot into two equal sub-plots. Allocate randomly one of each pair of sub-plots to the new fertilizer, using the other as a control. Test the hypothesis of no difference by Student's paired t statistic.
86 UL (1961). First part is the standard F statistic for testing the equality of two lariances. For second part, consider pairs of briquettes from the lots, and assign landomly one of each pair to A and B respectively. If (Uj, Vj) are the measurements of A and B on the ith pair, then consider new variables Xi = Uj - Vi; 1',= IIj+Vi, (i= 1,2, ... ,n). Since cOV(Xi,Yi) = var(ui)-var(vi), therefore a test of the equality of the lariances of the measurements of A and B is equivalent to testing that the pairs (x;, Yi) arise from an uncorrelated population. Thus, if R is the sample Nimate of corr(xj, yj), then the appropriate test statistic is t
= RJN-2/Jt-R 2
with (N-2) d.f.,
Vbeing the number of pairs of briquettes used.
87 The probabilities of the four outcomes HH, HT, TH and TT are p², pq, qp and q² respectively (p + q = 1). If the corresponding observed frequencies are n₁, n₂, n₃, n₄, then

p* = (N + n₁ − n₄)/2N = (1 + λ)/2;   var(p*) = (1/4N²) var(n₁ − n₄) = pq/2N.

To test the hypothesis H(p = ½), use

(i) z = (p* − ½)/(1/8N)^{1/2} as a unit normal variable;

(ii) χ² = Σ_{i=1}^{4} (4nᵢ − N)²/4N, as χ² with 3 d.f.
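The estimate and its variance can be checked by simulation: with N pairs of tosses per experiment, n₁ = #HH and n₄ = #TT, p* = (N + n₁ − n₄)/2N should have mean p and variance pq/2N. A minimal sketch:

```python
import random

random.seed(7)
p, N, trials = 0.5, 50, 20000
ests = []
for _ in range(trials):
    n1 = n4 = 0
    for _ in range(N):
        a = random.random() < p
        b = random.random() < p
        if a and b:
            n1 += 1
        elif not a and not b:
            n4 += 1
    ests.append((N + n1 - n4) / (2.0 * N))
mean_e = sum(ests) / trials
var_e = sum((e - mean_e) ** 2 for e in ests) / trials
print(round(mean_e, 3), round(var_e, 5))   # theory: 0.5 and pq/2N = 0.0025
```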
88 Since p₁ = p₂ + δ, the probabilities of the outcomes HH, HT, TH and TT are p₂(p₂+δ), (p₂+δ)(1−p₂), (1−p₂−δ)p₂ and (1−p₂)(1−p₂−δ). If the observed frequencies are n₁, n₂, n₃, n₄, then for p₂ = ½ the estimate of δ is easily obtained. In the second case, consider
T=
L ajnJN j=1
as an unbiased estimate of b, the constants aj being determined by E(T) == b. Hence
T= (n2-n3)/N, where
var(T) = 4J(l-(M/N,
and
4J == PI + P2 - 2PIP2' Thus 0
~
USill l'
4J
~
1, whence result.
89 UL (1963). The formal results follow by a direct use of least squares. Explicitly,

S₂ = n(n+1)(2n+1)/6;   S₃ = n⁴(n+1)⁴/16;   S₄ = n(n+1)(2n+1)(3n²+3n−1)/30.

Hence

var(α*) = 24(2n+1)(3n²+3n−1)σ²/[n(n²−1)(n+2)(3n²+3n+2)],   and

var(β*) = 120(2n+1)σ²/[n(n²−1)(n+2)(3n²+3n+2)].
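The closed forms used here are the standard power sums: S₂ and S₄ match Σi² and Σi⁴, while the printed S₃ equals the square of Σi³ = n²(n+1)²/4. A quick exact check:

```python
for n in (5, 12, 30):
    s2 = sum(i * i for i in range(1, n + 1))
    s3 = sum(i ** 3 for i in range(1, n + 1))
    s4 = sum(i ** 4 for i in range(1, n + 1))
    assert s2 == n * (n + 1) * (2 * n + 1) // 6
    assert s3 * s3 == n ** 4 * (n + 1) ** 4 // 16   # printed S3 = (sum i^3)^2
    assert s4 == n * (n + 1) * (2 * n + 1) * (3 * n * n + 3 * n - 1) // 30
print("power-sum identities verified")
```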
Fieller, E. C. (1932), B, 24, 428, and UL (1963). (2
is Student's
(2
= (a - pb)2/S2(A,1 + A, 2p 2- 2pA,3)
statistic with n dJ. The confidence limits are obtained from t~ =
(a-pb)2/s2(A,1 +A,2p2_2pA,3)
giving two roots for p. The stated result follows for A,3
=
O.
.
91 Geary, R. C. (1930), JRSS, 93, 442. Put u = x + a and v = y + b. Then the joint distribution of u and z = v/u is

(1/2πσ₁σ₂) exp{−½[(u−a)²/σ₁² + (uz−b)²/σ₂²]} |u| du dz,   (−∞ < u, z < ∞),

whence the distribution of z by integration over u. But since P(z) dz is a probability distribution,

∫_{−∞}^{∞} P(z) dz = 1,

which gives
f
f
a/a,
1
fo·
-a/a,
00
e- w2/ 2 dw+
-00
az-b
R(z) dz
I-p[lwl ~
or
=
1,
where
:J+ f
R(z)dz
(0"~+O"fZ2)i =
w,
00
= 1,
-00
or
f""
R(z) dz
=
p[lwl ~
:J
lienee the result for a sufficiently large compared with 0"1. For correlated lariables. x and y' = y-p(0"2/0"t)x are independent normal variables. Then z
=
y'+(b-')'a) +,)" x+a
where,),
0"2
== p--, 0"1
= z'+,),. ~'hcre
z' satisfies the conditions of z for uncorrelated variables. Therefore az'-(b-')'a) [O"~(1- p2)+ O"fz 2]i l
Ii
approximately a unit normal variable. Hence the result.
92 Todd, H. (1940), J RSS(S), 7, 78. Consider the four vertices of a square. The four kinds of contiguous pairs are: (i) endpoints of a horizontal side of the square; (ii) endpoints of a vertical side of the square; (iii) and (iv) endpoints of the two diagonals of the square. Enumeration then gives P2. Define a triplet by (r, s) where rand s denote the between-row and between(olumn intervals of the triplet. There are 20 distinct types of triplets, grouped .1\ rollows: (1, 1)-4; (2,2)-2; (0,2)-1 ; (2,0)-1 ; (2, 1)-6; (1,2)-6. I he number of ways a triplet (r, s) can occur in the field is (M - r)(N - s).
Therefore P3
=
4{5MN-7(M+N)+9}
/(~N).
For testing randomness of deaths with contiguous pairs, use
as a unit normal variable approximately. 93
Daniels, H. E. (1941), JRSS(S), 7, 146.
and n2P4 -PI
= n IP4 +1 -+1
E(r4)
n3P4 -PI-P2
Using these as estimating equations and solving,
*=
rl pf = -;
P2
nl
*
P3
=
r2(nl-rl)
.
nl(nl +n2- r d'
r3(nl-rl)(nl+n2- rl- r2 ) . , nl(nl +n2 -rl)(nl +n2 +n3 - r l -r2)
*-1
P4 -
* * *_
- PI - P2 - P3 -
r4(nl-rd(n l +n2- r ,-r2) . nl(nl +n2 -rd(nl +n2 +n3 - r, -r 2 )
The sample likelihood is 4
L
=
constant x (1- pd- n2 (1- PI - P2)-n 3
n pi',
i= ,
whence the same estimates of the p's. Transform L in terms of the (J's; thel maximization gives nl-rl. 0* _ -0*2 --' 3-
nl
r3+ r4- n 3 r2+ r 3+ r4- n 3
The asymptotic variances are:
var(T) ,...,
L4 (OT)2 ;10* . var(On u 1 6f=6,
1=2
,fill Ibe minimum variance is obtained by using Lagrange's method on the ',ocl ion ~ ~
F
=
var(T)+A.(n l +n2+n3-n).
Finney, D. J. (1941), JRSS(S), 7,155.
for P > O. E( s 2P) -_ (n -1)(n + 1)(n + p3) ... (11 + 2p- 3)-.(1 2p ,
and
=
=
11
II~oce
E[M!r 2s2)]
e[(n-I)/2n),2a 2
and E[e'XhHr 2s2)]
=
e'~+tr2,,2
E(y').
ence the estimates, their efficiency being inferred from the properties of 2 and s . By logarithmic expansion,
,h I
k(k-l) [1
I gll.k---o 1
_
11
"Ilhill
2(k-2) 2k ---+ 311
2 -6k+5 O( + n 3n 2
-3)] ,
k(2) 1 A.k = 1--+-[3k(4)+ 16k(3)+ 6k(2)] + O(n- 3), n 6n 2
.• here in factorial notation kim) == k(k-l)(k-2) ... (k-m+ 1). Therefore
lienee
9S For an uncensored sample of N observations var(O)
= 20(1-0)(2+0)/N(I+20)
by a straight application of the maximum-likelihood procedure. For a censored sample, the expected proportions in the AI, A 2 , A3 and 14 classes are proportional to p(2 + 0), (1- 0), (1- 0) and 0, with the proporI,onality factor [2(I+p)-(I-p)O]-I. Hence
var(O)
= 0(1-0)(2+0)[2(I+p)-(I-p)O]2. 2M[2(1 + p)+(l + 7p)O]
Now 1(0 1
,
) P
=
(I+ P) (1+20)(1-..1.10)2 2' 1+ . 1. 20 '
where..1.1 == (l-p)/2(1+p), ..1.2 == (1+7p)/2(I+p). Note that 0 < ..1.1 < t < ..1.2 < 2, and 3..1. 1+..1.2 = 2. Then for fix h(O,p) is maximized for 0", 2(I+p)/(13+19p), whence the inequality. cd
96
The equation for
I'
6 is
and var(O)
= 20( 1 - 0)(2 + 0)/( 1 + 20)N I'
For censored data, the expected proportions in Ab, aB and ab classes.
~1- 0)/(2- 0), (1- 0)/(2- 0) and 0/(2 - 0) respectively. Hence equation r~/I:~ IS
YI+Y2 Y3 N2 1 - 0* + 0* + 2 _ 0* = O. For full efficiency of 0* as compared with
N2
= NI·
0
(1 + 20)(2- 0)2 4(2+0) == NI·g(O),
say.
But g(O) is maximum for 0 = 0'1472, whence the inequality. Var(O·*) I, obtained simply by adding the information obtained from the two samplc~
97 The joint distribution of the
n f(u
Uj
is
n
n!
j)
du j ,
X)P(Xk-ll x k- 2, ... , XhX), .. P(xlIX)P(X) == P(xkIXk-l)P(Xk-lIXk-2)" . P(x21X l)P(X1 IX)P(X). Therefore the final result is obtained by summation over the successive ndilional distributions of Xk, Xk-l,' .. ,Xl and then lastly X . ~ 101 Mood, A. M. (1943), AMS, 14, 415. Use the argument of Ex. 100 to prove that
.
E[x(m l ) (X -x)(m2)lx]
=
n(m I) (N )(m 2 ) - n . x(m , +m2), N(m,+m2 )
.hence cov(x, X - x), var(X - x), var(x). Also 2 cov(X, x) = var(X)+var(x)-var(X -x). Now corr(x, X - x) = 0 if (12«(12 - N f.l + f.l2) = 0, which gives the two possibilities (i) E(X - f.l)2 P(X) = 0, (ii) EX(X - N)P(X) = O. Ii) gives the distribution P(X = f.l) = 1 and P(X '" f.l) = 0; (ii) leads to the distribution P(X = 0) = a constant p, P(X = N) = 1- p, and P(X) = 0 for 1 ~ X ~ (N -1). 103 Mood, A. M. (1943), AMS, 14, 415.
Σ_{r=0}^{N} C(α, r) C(β, N−r) = Σ_{r=0}^{N} C(α, r) × coefficient of z^{N−r} in (1+z)^β
= coefficient of z^N in (1+z)^{α+β} = C(α+β, N).

Similarly,

Σ_{r=0}^{N} C(α+r, r) C(β+N−r, N−r) = coefficient of z^N in (1−z)^{−α−1}(1−z)^{−β−1}
= coefficient of z^N in (1−z)^{−α−β−2} = C(α+β+N+1, N).
The mth factorial moment of X is E[X^{(m)}] = α^{(m)} N^{(m)}/(α+β)^{(m)}, and that of Y is E[Y^{(m)}] = N^{(m)}(α+m)^{(m)}/(α+β+m+1)^{(m)}.
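Both combinatorial identities above are exact and can be checked for arbitrary sample values (a, b, N chosen here purely for illustration):

```python
from math import comb

a, b, N = 7, 5, 6
# Vandermonde: sum_r C(a, r) C(b, N-r) = C(a+b, N)
lhs1 = sum(comb(a, r) * comb(b, N - r) for r in range(N + 1))
rhs1 = comb(a + b, N)
# Negative-binomial convolution: sum_r C(a+r, r) C(b+N-r, N-r) = C(a+b+N+1, N)
lhs2 = sum(comb(a + r, r) * comb(b + N - r, N - r) for r in range(N + 1))
rhs2 = comb(a + b + N + 1, N)
print(lhs1, rhs1, lhs2, rhs2)   # 924 924 27132 27132
```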
104 Mood, A. M. (1943), AMS, 14, 415. Clearly, P(X;jX i - 1) = P(x;jx j_ d, and the general results folIow rro Ex. 101 above. (iHvi) are obtained as particular cases. For (vii), n, j
E(x/xjIX)
= E[XxiIX]-
L E[xjxrIX],
(i =1= r),
r= 1
105 Mood, A. M. (1943), AMS, 14, 415. For corr(x, X -x) < 0, the number of items inspected on the average 11 0, E =
1)
N(N -n) n [(x+ fi] ( _ ). - 1 Pn+1(x)-N·P.(x) . fiN fi x=a+l n+
L
If x has a discrete rectangular distribution, P(x
=
r)
= l/(n+ 1) == P.(x),
(x
= 0, 1, 2, ... , n).
Hence E = 2(N-n)(n-a)(a+l) N(n + l)(n + 2)
if X is also rectangular. Its maximum is attained for a
= (IN +2-3)/2 and
n
=
IN +2-2,
max E = (IN +2- V/2N
-+
t,
as N
so that -+
w.
ANSWERS AND HINTS ON SOLUTIONS: CHAPTER 3 if ,::::
Np, then E is to be maximized subject to pN = n+(N -n)(n-a)/(n+ 1).
, Ihis to express E as a function of n, p and N, and maximize this with tIC '" . eet to varIatIOn In n. '"Pfinally, oX = ! and {J = (N - n)(a + 2)(2)/2N(n + 2)(2). therefore maximize E
= 2(n-a)/A(a+2)
subject to the condition that
(N -n)(a+2)(2) Np N (n+2)(2) = 1-p == T' say.
106 Mathisen, H. C. (1943), AMS, 14, 188. Denote
Yi = mi-i for (i = 1,2,3,4) and C ¢(tl' t 2, t 3, t 4) = E [ex p =
ttl f
4
= i~/r
Then
t i Yi)]
(4N+3)! (N !)4
N
m
'" (PIP2P3P4) dpi dp2 dp3,
,here the integration is over the region 4
n01 ~ 1; 0 ~ P2 ~ 1- PI; 0 ~ P3 ~ 1- PI - P2; and", ==
L Pi e'i. i= I
Ihercfore 4
E(C) 2
E(C )
[024>]
= i~1 Ct~
=
".=0'
and
4 4 [04¢] L -4 + L [ ~ 02 4> 2 ] . i= I Ct 1'.=0 ioFj (li 1'0=0 :l
(tj
i
lienee E(w) and, with some tedious simplification, var(w). Note that max C
Ii) If '"
-+
= 9m 2/16, and so (0
x, N is fixed,
4 E(w) --+ 3(N + 2)
Iii) If N -+
YJ,
32 and
(N+1)(N+5)
var(w) --+ 27' (N +2)2(N +3)(N +4)'
m is fixed, 4 3m
E(w) --+1117
~ w ~ 1).
and
var(w)
32 m-1 --+
27 .~.
Mathisen, H. C. (1943), AMS, 14, 188.
The joint distribution of ml and P is
'£11+
1) ! m ! n+ml (1 )n+m-ml d 1I1!)2"'1!(m-md!'P -p p,
(0
~ ~
1 0
~p~;
~~ml~m. ~)
For the rth factorial moment of ml' use the marginal distribution of . the point probabilities Iti. Wilt,
(n:~I) (n:::~l)/
e
n +:+ 1),
(0
~ ml ~ m).
E(ml) = m/2; var(ml) = m(2n+m+2)/4(2n+3), whence the asyrn variance of y. PIOII, 108
Craig, A. T. (1943), AMS, 14, 88.
The marginal distribution of v is (i) n·2^{n−1}·a^{−n}·v^{n−1} dv, for 0 ≤ v ≤ a/2; (ii) n·2^{n−1}·a^{−n}·(a−v)^{n−1} dv, for a/2 ≤ v ≤ a.
E(v
r)-_(~)(~)r[_I __+_r!_ ±(n+k-l)! 2r-k] 22 n+r (n+r)!'k=O k! . ,
so that E(v) = a/2; var(v) = a2/2(n+ l)(n+2). For (- co < w ~ 0), the distribution of w is
n [2(n+l)(n+2)]t·
[1- J(n+l)(n+2) lwlfi ]n-l dw
--+
_1_e-lwlfi dw
fi
By symmetry the same limiting form holds for (0 The characteristic function of x is
~
'
as n --.
'J
w < co).
and that of the standardized variable (x - a/2)J12i1/a is
cP(t)
= e - it/4j3n . [exp{ it(12/n)t} -1]n/[it(12/n)t]n, whence log cP(t)
--+
!(it)2,
as n --+ co.
109 Craig, A. T. (1943), AMS, 14, 88.
For the rectangular distribution in (O,a), E(xr ) = ar/(r+l). The distribution of the ordered observations is
n!
n
". a n dx/o j=
I
whence E(x~ xj) by successive integration. The function y is an unbiased estimate of a/2 if n
L
iC i = (n+ 1)/2,
i= 1
and var(y)
=
=
[ {n}2 a2 2 iC j -(n+ 1) n (i-l)ci+ n i(i+ l)cf(n+l)(n+2) i=2 i=2 i=2
-2 (
L
L
L
n }{ n } {n .L iCi .L (j+ l)cj +2 .~ (i+ l)ci x .~ jCj ,=2 )=2 ,-3 )-2 i -\
} ]
(n + l)a 2 a2 + 2(n+2) '4'
c₁ = (n+1)/2 − Σ_{i=2}^{n} i c_i.
Therefore, for minimum variance,

2r(r−1)c_r + 2r Σ_{j=2}^{r−1} (j−1)c_j + 2(r−1) Σ_{j=r+1}^{n} j c_j = (n+1)(r−1),  (r = 2, 3, …, n).
The unique solution is c_r = 0 for 2 ≤ r ≤ (n−1), c_n = (n+1)/2n, whence c₁ = 0. The distribution of x_n is n a^{−1}(x_n/a)^{n−1} dx_n, (0 ≤ x_n ≤ a), so that var(x_n) = n a²/(n+1)²(n+2).
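A simulation sketch of these two facts (sample size, a, and seed are arbitrary):

```python
import random

# For n uniforms on (0, a): E(x_n) = na/(n+1), var(x_n) = na²/((n+1)²(n+2)),
# so y = c_n·x_n with c_n = (n+1)/(2n) is unbiased for a/2.
random.seed(0)
a, n, reps = 2.0, 5, 100_000
maxima = [max(random.uniform(0, a) for _ in range(n)) for _ in range(reps)]
mean_max = sum(maxima) / reps
var_max = sum((m - mean_max) ** 2 for m in maxima) / reps
c_n = (n + 1) / (2 * n)
print(mean_max, n * a / (n + 1))                        # both ≈ 1.667
print(var_max, n * a**2 / ((n + 1) ** 2 * (n + 2)))     # both ≈ 0.0794
print(c_n * mean_max)                                   # ≈ a/2 = 1.0
```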
Jr.C
The distribution of t is
(n:2 )t(1+n
1)-n[
~
,hich -+e,-I . dt, for (- 00
t
]n-I
1 +t/Jn(n+2)
~
1), as n --+
. dt
00.
110 Craig, A. T. (1943), AMS, 14,88.
For any random observation x, E(xr) = (k+ l)ar/(k+r+ I), so that _ (k+ l)a _ (k+ l)a 2 E(x) = k+2 and var(x) = n(k+3)(k+2)2' Ihe distribution of the ordered observations is n!(k+l)n a"(k+ I)
I\
k
n
•
Xj
dx j ,
lienee E(xjxj) by direct integration. Therefore, n(k+ l)a 2 var(xn)= {1+n(k+l)l2{2+n(k+l)}; and
n(k+ l)a E(xn) = l+n(k+l);
r(n)rU+ {1/(k + l)}] n(k+ l)a 2 cov(xJ, xn) = {I +n(k+ 1)}{2+n(k+ I)}' r{j)r[n + {1/(k+ 1)}J , Illr J' 10
be unbiased,
"cr('l+ _1_)/ k+ 1 t...'
r(')- 1+n(k+1) I
n(k+2)'
r[n+{1/(k+l)}] = r(n)
\Iinimization of var(y) subject to the above constraint gives I ~ i ~ (II-I), and ell = {l +n(k+ 1)}/n(k+2). Hence var(y) = (k+ l)a 2/n(k+2)2{2+n(k+ I)}.
0. Cj
= 0 for
Ihe distribution of y is n(k+l) [ n(k+2) ]n(k+l) (n-I)(k+I)+k d an(k+I)' 1+n(k+1)'y . y,
o~ y
1 +n(k+ 1)] ~ [ n(k + 2) . a.
Hence, by transformation, the distribution of u is 1
~~----~~~
n(k+ 1)
~~~--~~~~x
[1 + {1/n(k + I)} ]n(k+ 1) [n(k + 1) {2 + n(k+ 1)} 1!-
u ]n(k+I)_1 [n(k+l){2+n(k+1)}]t .du,
x [ 1+~~--~----~~ whence the limiting distribution, as n 111
~
co.
Wilks, S. S. (1941), AMS, 12, 91.
f x.
Let
u=
f ~
f(x) dx
and
v
=
f(x) dx.
x,
-~
Then the joint distribution of u and v is
nn+l) .-1 n-I(1 )1-.-1 d ns)r(t-s)nn-t+ 1)' u v -u-v . udv, the region of variation being bounded by II + V = I, u = v = O. Now P = 1 - u - v and Q = u. Also for fixed P, 0 ~ Q ~ (1- Pl. Hell the distribution of P is l\
1 . pr-I (l_p)n-r dP, B(r,n-r+1) . where r = t - s is an integer 112
~
(0 ~ P ~ 1),
1.
Wilks, S. S. (1941), AMS, 12,91.
J
(x-lII+ko)/a
fo 1
Q=
e -,,2/2 d u.
(x -III-ko)/a
Therefore
E(Qls)
J 00
1
= r-c
e
y2n
-
-v2/2
1 d v. M: y2n
f
v/./ii + k./a
e -,,2/2 d u
vjjn _ ks/a
00
ks/a
fo (,,:S b:(-'C.:l)w'J·dW Hence
II I
1 ::::-~.
V 2,.
(1I-1)(n-I)/2
(11-1)
exp[-t{(/1-1)+(}2}z2].zn- 1 dzdO,
2(n - 3)/2f _ _
2
-I 0
where 0: = 'I'I! •• 1
3
'J'_
( 11 )'
II' --- -
11+1
,
rl!sull.
pul
=~,
(x-m)/u
var(Q)
s/u
=
= '/, and ct>(z) =
(8Q)2 ~~
~=O
v."
. var(~)+
*-
(OQ) ~ vII
2 ~=O
fry:; e-·I•2 / 2 . dy.
. var(I71.
113 Neyman, J. and Scott, E. L. (1948), E, 16, 1.
The logarithm of the sample likelihood is

log L = −N log(σ√2π) − (1/2σ²) Σ_{j=1}^{k} Σ_{i=1}^{n_j} (x_{ij} − ξ_j)²
     = −N log(σ√2π) − (1/2σ²) Σ_{j=1}^{k} [(n_j−1)s_j² + n_j(x̄_j − ξ_j)²],

where x̄_j = Σ_{i=1}^{n_j} x_{ij}/n_j and (n_j−1)s_j² = Σ_{i=1}^{n_j} (x_{ij} − x̄_j)².

Hence the results follow by application of the standard maximum-likelihood procedure.
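As an aside (a standard consequence of this setup, not spelled out in the text): with the group means as incidental parameters, the ML estimate of σ² is inconsistent. A small simulation with n_j = 2 per group:

```python
import random

# With ξ_j estimated by the group mean, σ̂² = (1/N)ΣΣ(x_ij − x̄_j)² converges
# to σ²(n_j − 1)/n_j; for n_j = 2 that is σ²/2, not σ².
random.seed(1)
sigma, k, n_j = 1.0, 50_000, 2
ss = 0.0
for _ in range(k):
    xi = random.uniform(-10, 10)                     # incidental group mean
    xs = [random.gauss(xi, sigma) for _ in range(n_j)]
    xbar = sum(xs) / n_j
    ss += sum((x - xbar) ** 2 for x in xs)
sigma2_hat = ss / (k * n_j)
print(sigma2_hat)   # ≈ 0.5, i.e. σ²/2
```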
114 Neyman, J. and Scott, E. L. (1948), E, 16, 1.
Transform the joint distribution of ȳ and s² to polars so that

√n(ȳ − m)/σ = R sin θ  and  s√n/σ = R cos θ.

Then (0 ≤ R < ∞) and (−π/2 ≤ θ ≤ π/2). Also, for given r and k,

s^{2r}(ȳ − m)^k = σ^{2r+k} R^{2r+k} cos^{2r}θ sin^k θ / n^{(2r+k)/2}.

Hence the results on integration.
115 Neyman, J. and Scott, E. L. (1948), E, 16, 1.
The logarithm of the joint likelihood of the sample observations is

log L = constant − Σ_{j=1}^{k} n_j log σ_j − ½ Σ_{j=1}^{k} (1/σ_j²){n_j s_j² + n_j(x̄_j − m)²},

whence the equations for m* and σ_j²*.
The equation for
mmay be formally written as
k
L WiePi(Xi,m) = 0,
where ePj(xj,m) == (x i -m)/[st+(x,-m)2].
i= I
Hence by Taylor's expansion, ePj(Xj, m) ~ ePj(X j, m)+(ln-m)eP;(x j, m),
so that the asymptotic variance of mis E
var(m)
=
[.t
wt ePt(Xj, m)+
,=1
.L. WjWjePi(X j , m)ePj(xj , m)] ,*)
k
L~, wjE{eP;(Xj, mn]
2
The expectations are evaluated directly by using the results of the preceding exercise to give var(m̂).

116 (i) Let the ordered sample observations be x₁ ≤ x₂ ≤ … ≤ x_{2n+1}, so that x_{n+1} is the sample median. Assuming that x_{n+1} > m, the distribution of x_{n+1} is

[(2n+1)!/(n!)²] [¼ − φ²(x_{n+1})]ⁿ f(x_{n+1}) dx_{n+1},

where

φ(x_{n+1}) = ∫_m^{x_{n+1}} f(x) dx ≈ (x_{n+1} − m)φ′(m) = (x_{n+1} − m)f(m).
Also, for large samples, f(x_{n+1}) ≈ f(m). Hence the approximate distribution of x_{n+1}.
(ii) The Cauchy distribution has the probability density function

f(x) = (1/π) · 1/[1 + (x−m)²],  (−∞ < x < ∞),

and

−E[∂² log f(x)/∂m²] = (2/π) ∫_{−∞}^{∞} [1 − (x−m)²]/[1 + (x−m)²]³ dx = 1/2.
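The final integral can be verified numerically (grid limits chosen so the u⁻⁴ tail is negligible):

```python
import math

# (2/π) ∫ (1 − u²)/(1 + u²)³ du over the real line should equal 1/2.
def integrand(u):
    return (1 - u * u) / (1 + u * u) ** 3

a, b, steps = -200.0, 200.0, 40_000
h = (b - a) / steps
total = 0.5 * (integrand(a) + integrand(b)) + sum(integrand(a + i * h) for i in range(1, steps))
info = (2 / math.pi) * total * h      # composite trapezoidal rule
print(info)   # ≈ 0.5
```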
117 The estimate of α is α* = ν/x̄, and the characteristic function of the distribution of the sample average x̄ is

(1 − it/nα)^{−nν}.

Inversion gives the distribution of x̄ as

[(nα)^{nν}/Γ(nν)] e^{−nαx̄} x̄^{nν−1} dx̄,  (0 ≤ x̄ < ∞).
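The distribution of x̄ here is that of the mean of n Gamma(ν, rate α) observations, so E(x̄) = ν/α and var(x̄) = ν/(nα²); a quick simulation (arbitrary values):

```python
import random

# Mean of n Gamma(ν, rate α) variables: check E(x̄) = ν/α, var(x̄) = ν/(nα²).
random.seed(7)
nu, alpha, n, reps = 2.0, 1.5, 8, 40_000
means = [sum(random.gammavariate(nu, 1 / alpha) for _ in range(n)) / n
         for _ in range(reps)]
m = sum(means) / reps
v = sum((x - m) ** 2 for x in means) / reps
print(m, nu / alpha)                  # both ≈ 1.333
print(v, nu / (n * alpha * alpha))    # both ≈ 0.111
```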
f
"" = _fol . exp{ -(u j-pf/2I1 Z } dUj = t+rb(.). 2n
11
o
lienee mean and variance of S. Also, var(S)
e::r'=t'
= var[rb('*)] -
var(r*).
Use the independence of Ii and s to obtain the moments of i by noting that for any given r such that n - r - 1 > 0,
Stirling's approximation gives
Cn
,...,
e in so that c; - 1 + 3/2n.
128 Halperin, M. (1952), J ASA, 47, 457.
fo .f
en
P(x
> T)
=
e-%2/ 2 dz,
II
.md the logarithm of the sample likelihood is
1 logL = constant-rlogl1--z ' 211
r
L j=
Or
rOz
11
2
(x j - T)2+_(T-x)-- +
1
'YO
+(n-r) 10g[f e-%2/ z dzl 8
lienee the results stated by standard procedure. 129 Halperin, M. (1952), JASA, 47, 457.
fo'
8
P(x
o.
e.
Elimination of
a- gives a quadratic in . g(O),
130 Deming, W. E. (1953), JASA, 48, 244, and UL (1964). (i) P(X) =
(~)pXqN-X;
P(xIX) = (:) (~r
(1- ~r-x ;
[O~X~N;
(ii) P(X) =
(~)pXqN-X;
p(xIX) =
O~X~n].
(~)(:=:) / (~); [0 ~ X ~ N; 0 ~ x ~ min(n, X)].
(iii)
P(X) =
(7)(:-~)/ (~);
p(xIX) =
(:)(~r(I-~r-X;
[0 ~ X ~ min(N,Mp); 0 ~ x ~ (iv) P(X) =
(7) (:-~) / (~);
(~)(:=:) / (~);
min(N,Mp); 0 ~ x ~ min(n, x)). Work in terms of factorial moments. First evaluate E[x(r)lx] and then average over X to obtain E[x(r)]. [0
~
p(xIX) =
nJ.
X
~
131 The joint distribution of x and y is
(Nq) (NI1-X-Y - Np -Nq)/ (:), (NP) x y
(0:;;:; x,
Y:;;:;I1;O:;;:;X+Y~Il).
11
Therefore
y n n- ( x ) E [- X] = L L - P(x, y) y+l y=Ox=O y+l
f (_1)(Nq) n~Y(NP-l)(N-Nq-l-NP-l)/(N-I) y x=l x-I n-y-I-x-I n-I n-I( 1 )(Nq)(N-Nq-l)/(N-l) = np Y~O y+l y+l n-y-l n-l np n-I (Nq+ 1) (N -Nq-l) / (N -1) =Nq+l'y~o y+l n-y-l n-l
= np
y=o y+l
= N::l
[(~) - (N-~q-l)]/(~~D, whence the result.
In the same way, E[x^{(r)} y^{(s)}] is obtained by repeated use of the identity

Σ_{u=0}^{n} (M choose u)(N−M choose n−u) = (N choose n).
Hence the moments of x and y. To obtain the coefficient of variation of x/(y+1), use the following result:
If X and Y are random variables with coefficients of variation v₁, v₂, and correlation ρ, then, as a first approximation, the coefficient of variation of X/Y is (v₁² + v₂² − 2ρv₁v₂)^{1/2}.

132 UL (1964).
It follows directly from first principles.

133 UL (1964).
The estimates are: α* = (x̄+ȳ+z̄+w̄)/4; β* = (x̄−ȳ+z̄−w̄)/4; γ* = (−x̄+ȳ+z̄−w̄)/4; x̄, ȳ, z̄, w̄ being the averages of the x, y, z, and w observations. The error sum of squares is

M ≡ Σ_{j=1}^{n} [(x_j−x̄)² + (y_j−ȳ)² + (z_j−z̄)² + (w_j−w̄)²] + n(x̄+ȳ−w̄−z̄)²/4,

with (4n−3) d.f. The t statistic for testing the hypothesis H(β₁ = β₂) is

(x̄−ȳ)[n(4n−3)/2M]^{1/2},

with (4n−3) d.f.
134 UL (1964).
The random variable r has a binomial distribution and

P(r) = (n choose r)(1 − e^{−Tθ})^r (e^{−Tθ})^{n−r},  (0 ≤ r ≤ n).

Hence the estimate θ̂ and its variance. For large samples,

var(1/θ̂) ≈ θ^{−4} var(θ̂) = (e^{Tθ} − 1)/n(θ²T)².
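A simulation sketch of the large-sample variance (parameter values are arbitrary): since r ~ Bin(n, 1 − e^{−Tθ}), the ML estimate is θ̂ = −log(1 − r/n)/T, with var(θ̂) ≈ (e^{Tθ} − 1)/(nT²).

```python
import math, random

random.seed(2)
theta, T, n, reps = 0.8, 1.5, 400, 5_000
p = 1 - math.exp(-T * theta)
est = []
for _ in range(reps):
    r = sum(random.random() < p for _ in range(n))      # binomial draw
    est.append(-math.log(1 - r / n) / T)
mean_est = sum(est) / reps
var_est = sum((e - mean_est) ** 2 for e in est) / reps
print(mean_est)                                          # ≈ θ = 0.8
print(var_est, (math.exp(T * theta) - 1) / (n * T * T))  # both ≈ 0.0026
```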
135 Epstein, B. and Sobel, M. (1953), JASA, 48, 486.
Since P(x ≥ x_r) = e^{−x_r/θ}, the joint distribution of the observations x₁ ≤ x₂ ≤ … ≤ x_r is

[n!/(n−r)!] θ^{−r} exp(−Σ_{j=1}^{r} x_j/θ) · [e^{−x_r/θ}]^{n−r} Π_{j=1}^{r} dx_j,

whence the estimate θ̂. The y_j are independent random variables with joint distribution

Π_{j=1}^{r} [(n−j+1)/θ] exp[−(n−j+1)y_j/θ] dy_j,
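A simulation of the classic consequence of this likelihood (an illustration, not quoted from the text): with censoring at the rth failure, θ̂ = [Σ_{j≤r} x_j + (n−r)x_r]/r is unbiased for θ.

```python
import random

random.seed(3)
theta, n, r, reps = 2.0, 10, 4, 50_000
est = []
for _ in range(reps):
    xs = sorted(random.expovariate(1 / theta) for _ in range(n))
    est.append((sum(xs[:r]) + (n - r) * xs[r - 1]) / r)   # censored-sample MLE
print(sum(est) / reps)   # ≈ θ = 2.0
```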
e-n but summing tt tn(n+ I)-A. II The function F((); n,I') satisfies F(O; n, 1')
=
F(O; n-l, I')+O"F((}; n-1, 1'-1)
whence, on equating coefficients of OA, I(A; n, 1'), the coefficient of OA in F(O; n, 1'), satisfies the same relation as S[A; n, 1']. Also for n = 1, I(A; II rJ = S[A; n, r]. Hence the probability-generating function of A. ' The cumulant-generating function ,,(t) is expanded in terms of Bernoulli numbers as ,,(t) =
-lOge) +tr(n+ 1)t+ sinh (n - j + 1) t ] + log [ 2 .. jt/2 . n-j+1 j~1 (n -; + 1) t smh(jt/2) j r
r
'h
=tl'(I'+I)t+L j=
Note that B2
= !; B4 =
(-1)'"+ I t 2m
L1m. 2 (2 )' m.
I m=
.B2m [(n-j+1)2m-lm].
lo.
160 Harris, H. and Smith, C. A. B. (1947-49), AE, 14, 307.
If x and y are the ages of onset for a random pair of parent and child, then

E(x|A) = E(y|A) = m₁;  E(x²|A) = E(y²|A) = m₁² + σ₁²;  and E(xy|A) = m₁².

Similarly for B, with E(xy|B) = m₂². Hence

E(x) = E(y) = π₁m₁ + π₂m₂;
E(x²) = E(y²) = π₁(σ₁² + m₁²) + π₂(σ₂² + m₂²);
I !-larris, H .. ~nd Smi.th. C. A: B. (1947-49), A E, 14, 307. 16 The probabIlity denSIty functIOn of the age of onset x is
fIx)
=
~[e-t(X+m)2+Ae-1-(X-III)2]/(1+A)'
(-co < x 0, B(A) > 0 and var(X)
A 1-0A(A) 1- e '«1 + A) . [1 + (1- O)B(A)j2'
=
lIence
.nd ~
var(X)
A![I-e-).(1 +,{)].
168 Cohen Jr., A. C. (1960), JASA, 55,342. 'Xl
Let
/Ix
be the number of nests with x eggs each, so that
L
nx
= (N -
n).
x=2
Ihe logarithm of the sample likelihood is logL
= constant+n log(I-0)-AN +Nx 10gA-N 10g[1-e-).(1 +AO)],
.• here x is the sample average, so that Nx = n+(N -n)x*. The equations for land (J are
x -::-l _ e - :i(1 + ~O) - ~ ~ -1 = 0 1- e - ).( 1+ AO)
A
(1)
tnd
_ n + ~
Iolving for
N e-1l = O. l-e-1 (l +AO) AA
(2)
8, (1) and (2) give A
_
0-
~-x(l-e-l) _ N~ e- 1 -n(l-e- 1) A'
A e- ,t(1-x)
-
A ' ·
(N - n),{ e-).
(3)
lienee
x*
=
[
l
-1
]
l 1+1-(1:l)e- 1 '
.hich may be rewritten as e1-l = x*A/(x* - ~). he this t9 eliminate e1 from the first equation of (3) to obtain the stated i(,uit for O.
For small A,
1-(1+A)e-). = A. 2 (1-~A+ A. 2_ ... ) '" A. 2 e- 2 )'/3 2 3 4 2'
l
whence the approximation for For large samples, var
(X) _
~ [l-(1+A)e-).][1-(1+A.O)e-).].
[(1_e- A)2_A.2 e-A]
- N'
,
(0) = (1-0)[1-(1 +AO) e- A][(1-e-).)(1-(} e-).)-A(1-(}+A.O)e- AI var NAe A[(1-e )')2_A 2 e-).] ----=; and
0) _ (1-0)(1-A.-e-).)[1-(1+AO)e-).]
A
cov(A.,
N[(1-e-A)2 _A2 e A]
-
Plackett, R. L. (1953), Bs, 9, 485.
169
00
Let A* =
L a,n,/N. Then A* is an unbiased estimate of A if
,= 1
which determines the a,. var(A*)
= -1[00 L r2p,_ N ,=2
{OO rp, }2] , ,=2
L
where p, == (
;'"
A
e -1)r!
and
_ A (l-e-)')2 var(A) = N' 1 _ (A + 1) e -).' 1 Write
T ==
-2
N
whence eft' (A *).
10
L b,n, ,=2
so that
But 00
'L
L b,p, = 2P2+ Lrp,
,= 2
=
A+,F/(e).-I),
,=2
00
L b; p, =
CfJ
12p2 +
,=2
L
,= 2
r2p,
A. + 6,1,2 /(e). - 1)+ A2 e)./(e). -1).
=
Hence var(T).
170
Plackett, R. L. (1953), Bs, 9, 485.
P(X
= s) = AS/(e). -1)s! == Ps' say, for s ;;:: 1.
00
E(T1 ) = N
L
,= I
00
P2,-1
and
E(T2 )
=
N
L
,= I
P2,'
;
Hence E(θ*) = e^{−λ}.

var(T₁) = N[Σ_{r=1}^{∞} p_{2r−1} − (Σ_{r=1}^{∞} p_{2r−1})²];
var(T₂) = N[Σ_{r=1}^{∞} p_{2r} − (Σ_{r=1}^{∞} p_{2r})²];
and
cov(T₁, T₂) = −N Σ_{r=1}^{∞} Σ_{s=1}^{∞} p_{2r−1} p_{2s},

whence var(θ*).
171 Cohen, Jr., A. C. (1960), Bs, 16, 446. The distribution of x is: P(x = 0) = (1−θ);
PIx
= r) = o;.:/(eA-l), for r
~ I.
[he estimation and the variances of the estimates are direct. Set
I/I(l) == (1-e- A)2/[l-(1 +l)e- A]. rhen lim I/I(l) = 1 and
A-+co
lim I/I(l) = 2.
A-+O
Also
d log I/I(l)
d}.
e- 2A[(2 + l)(e A + 1)- 4 eA] (l-e A)[l-(t +l)e A]'
Hence ψ′(λ) ≤ 0 for λ ≥ 0. Therefore the inequality for var(λ̂).

172 Peto, S. (1953), Bs, 9, 320.
The probability that the n_i organisms administered to a guinea-pig in the ith group do not effect a kill is (1−p)^{n_i} ≈ e^{−n_i p}. Therefore the likelihood for the whole experiment is (writing m_i for the number of animals in the ith group)

L = Π_i (m_i choose r_i)(e^{−n_i p})^{r_i} (1 − e^{−n_i p})^{m_i − r_i}.

Hence p̂ and its variance. The information obtained from n organisms per animal in a group of m guinea-pigs is

I_n = m n² e^{−np}/(1 − e^{−np}),

and for variation in n, this is maximized for a value of n given by np = 2(1 − e^{−np}), with root np = 1.5936. The most economical expected proportion of survivors is e^{−np} ≈ 0.203.

173 Cochran, W. G. (1950), Bs, 6, 105.
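The quoted optimum is easy to confirm numerically by solving x = 2(1 − e^{−x}) for x = np > 0 by bisection:

```python
import math

f = lambda x: x - 2 * (1 - math.exp(-x))   # positive root wanted
lo, hi = 1.0, 2.0                          # f(1) < 0, f(2) > 0
for _ in range(60):
    mid = (lo + hi) / 2
    if f(mid) > 0:
        hi = mid
    else:
        lo = mid
root = (lo + hi) / 2
print(round(root, 4), round(math.exp(-root), 3))   # 1.5936 0.203
```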
The equation for
J is
J=
Jl (ni-s/)~i(l-e-il)-1 ~tl
v~!c5) = 1/~1 n/x~(e,xI-l)-I, A
(~i == Jv/)
nivi>
"
1
1
~-
".... N'
(Xi
== c5v/)
1·54
(X2)
=--
N
max~l
e -
since
x 2 j(e,x-l)
is maximum for
X
= 1·5936.
174 First part is Query 78, Bs (1950), 6, 167.
Let the group means be x̄ and ȳ based on n₁ and n₂ observations respectively, so that n = n₁ + n₂ and var(x̄ − ȳ) = σ₁²/n₁ + σ₂²/n₂.
For (i) minimize σ₁²/n₁ + σ₂²/(n − n₁) with respect to n₁. Hence n₁ = nσ₁/(σ₁ + σ₂) and n₂ = nσ₂/(σ₁ + σ₂).
For (ii) minimize

n = [c n₁² + n₁(σ₂² − σ₁²)]/(c n₁ − σ₁²)  with respect to n₁.

Hence n₁ = σ₁(σ₁ + σ₂)/c; n₂ = σ₂(σ₁ + σ₂)/c.
For (iii) solve the simultaneous equations n = n₁ + n₂ and σ₁²/n₁ + σ₂²/n₂ = c. This gives the quadratic in n₁

c n₁² − (cn + σ₁² − σ₂²) n₁ + σ₁² n = 0.

A necessary condition for it to have positive roots is that cn + σ₁² − σ₂² > 0, and the roots are real if

(cn + σ₁² − σ₂²)² − 4ncσ₁² = [cn − (σ₁² + σ₂²)]² − 4σ₁²σ₂² ≥ 0.

For σ₁ = σ₂ = σ,

n₁ = (n/2)[1 ± (1 − 4σ²/cn)^{1/2}];  n₂ = (n/2)[1 ∓ (1 − 4σ²/cn)^{1/2}].
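A numeric check of allocation (i): with n fixed, σ₁²/n₁ + σ₂²/(n − n₁) is minimized at n₁ = nσ₁/(σ₁ + σ₂) (n₁ treated as continuous; the values below are arbitrary):

```python
s1, s2, n = 3.0, 1.0, 100.0
var = lambda n1: s1**2 / n1 + s2**2 / (n - n1)
grid_min = min(var(0.01 * k) for k in range(1, 10000))   # n₁ over (0, n)
n1_opt = n * s1 / (s1 + s2)
print(n1_opt)                   # 75.0
print(var(n1_opt), grid_min)    # both 0.16
```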
175 The argument is similar to that of the preceding exercise with t = III tall) instead of n = n 1 + n2' For (i),
nl = t(1.1«11 +(12~); n2 = t(12j~«11 +(12~)'
For (ii), nl = (11«11 +(12~)je; n2 = (12«11 +(12~)je~. For (iii), if et+(1~-IX(1~ > 0 and [et-«1~+IX(1~W-41X(1~(1~ ~ O. then
n1 =
~ [{I + (1~~;(1~}± {I
«11
+::~)2r {I
«11
-::~)2r].
n2 == ;rt [{
1- ut ~trtu~} +{I
(Ul +
::ja)2f{1- ::ja)2fl (u 1-
176 Bliss, C. I. (1953), Bs, 9, 176, and the accompanying note by R. A. Fisher.
The probability-generating function of X is

Σ_{x=0}^{∞} θ^x P(X = x) = (1 + p − pθ)^{−k}.

Hence κ(t). The first four cumulants of X are:

κ₁ = kp; κ₂ = kp(1+p); κ₃ = kp(1+p)(1+2p); κ₄ = kp(1+p)(1+6p+6p²).

The logarithm of the sample likelihood is

log L = constant + N x̄ log p − N(k + x̄) log(1+p) + Σ_{x=0}^{∞} f_x log[k(k+1)…(k+x−1)],
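These cumulants can be checked from the pmf implied by the generating function, P(X=0) = (1+p)^{−k} and P(X=x+1) = P(X=x)·(k+x)/(x+1)·p/(1+p), by computing raw moments and converting to cumulants:

```python
k, p = 2.5, 0.7
probs, P = [], (1 + p) ** (-k)
for x in range(400):                       # 400 terms: tail mass is negligible
    probs.append(P)
    P *= (k + x) / (x + 1) * p / (1 + p)
m1, m2, m3, m4 = (sum(x ** r * pr for x, pr in enumerate(probs)) for r in (1, 2, 3, 4))
mu2 = m2 - m1 ** 2
mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
k1, k2, k3, k4 = m1, mu2, mu3, mu4 - 3 * mu2 ** 2
print(k1, k * p)                                      # both 1.75
print(k2, k * p * (1 + p))                            # both 2.975
print(k3, k * p * (1 + p) * (1 + 2 * p))              # both 7.14
print(k4, k * p * (1 + p) * (1 + 6 * p + 6 * p * p))  # both ≈ 24.2165
```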
whence the equations for p̂ and k̂. The covariance matrix follows directly.
(i) Set p*_x = f_x/N, for all x. Then the equation for k* is
~ * - (p*)-I/k 1+ ~ k* . ;/::\ xpx 0
O
,
which may be written formally as
k*
= F(P~,
pT, p~, .. .).
Note that E(P~)
= Px ==
P(X
= x);
var(p~)
= Px(1- Px)/ N ;
and (x 1= y).
Hence var(k*). (ii) Set E(x)
= kp == J.t, and E(s2) = kp(1 + p) == It2' Then _
J.t[I+~r
[2
].
k= ( (s - I(2) - (x - II) J.t2-J.t ) 1 + (It2 - It) Hence, using a logarithmic expansion,
var[Iogk] '" (It 2 - It)-2 [(2J.t:: Ilf var(x)+ var(s2) 2(2 It;l- It) COV(.X,S2)], where varIx) = It2IN; var(s2) '" (It4-lt~)IN; cOV(X,S2) Ilr == E(X - Il)r. Hence var(i 1.
11,
An (A -1)/(An-1) '" 0 for A < I, and", (A -1) for A > 1. (iii) For A -+ I, var(T) -+ 0- 2 /n.
179 UL (1964).
Least-squares estimation gives μ* = x̄ and θ* = ȳ − x̄, and the estimate of σ² is

s² = [Σ_{i=1}^{n} (x_i − x̄)² + Σ_{i=1}^{ν} (y_i − ȳ)²]/(n + ν − 2).
To test H(θ = θ₀),

t = [(ȳ − x̄ − θ₀)/s] (nν/(n+ν))^{1/2},  with (n+ν−2) d.f.

To test H[var(x_i) = var(y_i)], use the F statistic

F = [(ν−1) Σ_{i=1}^{n} (x_i − x̄)²] / [(n−1) Σ_{i=1}^{ν} (y_i − ȳ)²],  with (n−1, ν−1) d.f.,

or its reciprocal to ensure F > 1.
180 corr(X, Y) = corr(Y, Z) = ½, and corr(X, Z) = 0.
(i) If Y = c, then x₃ = c − x₂, and so X = x₁ + x₂, Z = c − x₂ + x₄. Hence corr(X, Z|Y = c) = −½.
(ii) If Y = λZ, then x₂ = (λ−1)x₃ + λx₄, so that

X = x₁ + (λ−1)x₃ + λx₄,  and  Z = x₃ + x₄.

Hence

corr(X, Z|Y = λZ) = (2λ−1)/[2(λ² − λ + 1)^{1/2}].
181 UL(1964). E(y)
= (X+y, E(y_(X)2
=
p2+3y2,
and cov(x,y) = E(xy) =
p.
lienee corr(x, y).
182 UL (1964).
The sum of ranks for the two rankings is N(N + 1)/2. The sum of squares for the first ranking is N(N 2 -1)/12. The sum of squares for the second ranking is k-l
=i[(t2+2tf+tk )+
k
L t;ti+l(t i+ -ti)]-N(N+1)2/4 =! L 1
i=1
njtjtj_l'
j=2
This is also the sum of the products of the two rankings. Hem:e p.
183 UL (1964).
The result is obtained by simplification of the χ² in standard form; E(a_j) = A n_j/N, E(b_j) = B n_j/N. There are (k−1) d.f. The result is due to Brandt and Snedecor.
ANSWERS AND HINTS ON SOLUTIONS
Chapter 4
1  E(e^{itZ}) = (1/2π) ∫∫_{−∞}^{∞} exp{itxy − ½(x² + y²)} dx dy
           = (1/2π) ∫∫_{−∞}^{∞} exp[−½{(x − ity)² + (1+t²)y²}] dx dy = (1+t²)^{−1/2}.

For correlated variables,

itxy − [1/2(1−ρ²)](x² − 2ρxy + y²) = −[1/2(1−ρ²)][x − {ρ + (1−ρ²)it}y]² − ½[1 − 2ρit + (1−ρ²)t²]y².

Hence integration gives

E(e^{itZ}) = [1 − 2ρit + (1−ρ²)t²]^{−1/2}.

A logarithmic expansion gives for the first two cumulants of Z: κ₁ = ρ and κ₂ = 1 + ρ².
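A simulation check of the two cumulants (seed and ρ arbitrary): for standard bivariate normal (X, Y) with correlation ρ, Z = XY has E(Z) = ρ and var(Z) = 1 + ρ².

```python
import random

random.seed(4)
rho, reps = 0.6, 200_000
zs = []
for _ in range(reps):
    x = random.gauss(0, 1)
    y = rho * x + (1 - rho ** 2) ** 0.5 * random.gauss(0, 1)
    zs.append(x * y)
mean_z = sum(zs) / reps
var_z = sum((z - mean_z) ** 2 for z in zs) / reps
print(mean_z, rho)            # both ≈ 0.6
print(var_z, 1 + rho ** 2)    # both ≈ 1.36
```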
2 The characteristic function of ajXj is cos(ajt), and that of Y is n
t2)dtldt2dxdy,
00
-00-00
or, reversing the order of integration,
∂P/∂ρ = (σ₁σ₂/4π²) ∫∫_{−∞}^{∞} 2π φ(t₁, t₂) dt₁ dt₂ = 1/[2π(1−ρ²)^{1/2}],

whence the result, since P = ¼ for ρ = 0. By symmetry, the total probability in the negative quadrant is the same, and consequently in each of the mixed quadrants it is

¼ − (1/2π) sin⁻¹ ρ.
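A simulation check of the positive-quadrant probability ¼ + (1/2π) sin⁻¹ρ for standard bivariate normal variables:

```python
import math, random

random.seed(5)
rho, reps, hits = 0.5, 200_000, 0
for _ in range(reps):
    x = random.gauss(0, 1)
    y = rho * x + (1 - rho ** 2) ** 0.5 * random.gauss(0, 1)
    hits += (x > 0 and y > 0)
print(hits / reps, 0.25 + math.asin(rho) / (2 * math.pi))   # both ≈ 1/3
```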
S The density function of X and Y is f(x,y) =
4~2
ff 00
00
-')')
-tX)
-co
-Q'J
exp(-itlx-it2y)·4>(tl,t2)dtldt2
fherefore (_P)J[ 1 foo
P(X~(X, Y~P)=.L -.,)=0 J. 00
f f 00
X
21n
-2 n
a
f exp{-! 0, the integration over t 12 is simple and gives the density function
{-t(u - 2pw +~) }.u(n-4)/2 2(n-l/l2for(ll; 1)
(1- p2)(n-l)/2 exp
X
00
x2~ f
(l-2it22)-(n-2)/2exp{-it22(V-
~:2)}
dt 22 ,
which for ut' - w2 > 0 finally gives (l - p2)(n -1)/2 exp{ -!(u + v - 2pw)}. (uv _
W 2)(n- 4)/2
2n-IJn r(n; 1)r(Il;2) The (II, V, W) -4 (Sx, s)" r) transformation is straightforward, but the standard form of the constant of the joint distribution is obtained by using the duplicalion formula 2n -
3
r(Il-2)r(~) 2 2
_
r::. 1(n-2). - ....;n.
12 Elderton, W. P. (1933), B, 25, 179. For the continuous distribution of X, E(e iIX ) discrete random variable Y, PIY,= h/2)
=
(1- it/.,1,)-I. For the
= (I-e- Ah ); P[Y = (2r+ 1)h/2] = e-rhA(l_e-Ah), for,.
Hence the probability-generating function of Y is E(OY) = Oh/2(l-e- Ah )/(l-O" e- hA ),
~
1.
and the characteristic function is _ (l_e- lh) e irh / 2
CPh(t)-
_
1 -e irh-hl
_~)-I
(
1 ,I.
-
whence the result for K( /) and K,,((). K1
=
~
"I -
sinh(V.h)/(-V.h)
' Sill . h{-!(' '/11'1 - 2 A-It) '( 12h().-i/)}'
[I
1 I]
zit coth(z,l,.It) - i. '
V-It.
and the inequality follows since 0 < tanh(-V-h)
13
V.It,
whence K2 < K2'
Kupperman, M. (1952), B, 39, 429. The probability density function of X is
(a+x)/a 2, for -a ~ X ~ 0 (a-x)/a 2, for 0 ~ X ~ a.
and
Hence cp(t) by integration. For the discrete distribution, 111ft = a, and
P[(r-l)h ~ X ~ rhl
=
h{2a-(2r-l)h}/2a 2, for 1 ~ r ~
Ill.
Therefore
cph(t)
=
I~[e-ita
a =
(2r-l)heith(2r-ll/2+eita 2
h2[e-ita.~.!!..a
=
f r= 1
f
(2r-l)he- ith (2r- l l/2]
i: eith(2r-l)/2_eita.~ iJ f
10t,=1
2
r= 1
e- ith(2,-ll/2]
10t,=1
h [ .O{l_eita } . O{l_e- ita }] 2a2 e - Ita ot sin(tlt/2) + ella at sin(tlt/2) ,
whence the result. Therefore
cph(t)
=
tlt/2 ] 3 [Sin tit] cp(t) [sin(th/2) th'
and a logarithmic expansion gives the relation between K2j and K2j'
14
Kupperman, M. (1952), B, 39, 429. P(Z = 0) =
h(4a-h)
whence the result.
4
a
2
'
and
P(Z = rh) =
h(a-rlt) a
2'
for 1 ~ r ~
Ill.
Bxpansion of '" h(t) gives h2 (it)2 K2 = + - + the coefficient of -2' in
"2
12
.
•bich, on differentiation, is found to be -
h4 /32a 2.
IS Kupperman, M. (1952), B, 39, 429. a
E(e i'X )
= 2~3
f
1
(a 2- x 2) cos tx dx
=~
o
.h'oee result . \;
P [ y=
f
(1- u2) cos atu du,
0
(2,.-1)11] 311 2 = 4a 3 1!(3a 2 -h 2)-h 2,.(r-l)].
Ihererore the characteristic function of Y is
~ [eilh(2r-ll/2 + e - ilh(2r- 1 l/2] . ~ [!(3a 2- 112) - 11 2r(,. - I)] ~
40 3
" I
--
1I(3a 2-h2) [eiIIl12(eilllth_l) e ilhI2(e-illllh_I)]_ 'h + 'h 4a 3 e"-l e-II-I 311 3 ---3
16a
LIII [(2r-1)2 _1][ei,h(2r-ll/2 +e- ilh(2r-1 l/2] r= 1
11(120 2-112) sinmth + -311 (0)2 ~ Ie illl(2r-ll/2 +e -ilh(2r-l l/ 2] L 160 3 4a 3 ot r= 1 • sin(thI2) -
h(12a 2-h 2)
=
16a 3
.
sin at 3h(0)2[ sinat] sin(tIl/2) + 4a 3 ot sin(thI2)'
\\"hich may be reduced to the form h2 [sin at thl 2] 3 [Sin 2at sin til at (thI2)3] 4a 2 ----;;t. sin(th/2) - (at)2 ~. ~ . sin at . sin 3(th/2) + 3 [Sin at sin 2 til (thI2)5 ] + (at)2 ~. (tll)2 . sin 5 (tIl12) .
lienee the result by using the expansion in Bernoulli numbers. 16 Kupperman, M. (1952), B, 39, 429. 1I(12a 2 -112) 3h 3 ,.2 P(Z = rh) = 16a 3 4a 3 '
for,. ~ O.
Ihcrerore the characteristic function of Z is 1,112(/2_112) [III ] 3h3 III " - 3 - 1 + L (e ithr +e- ithr ) - - 3 L ,.2(e i1hr +e- itlrr ) 16a r=1 40 r=1 2 2 h(12a -h ) [ eilh(eillllll-l) e-i'h(e-illl'h_I)] =
16a
3
1+
'h
e"-I
+
. e""-l
(2)2
lienee the slaled result.
3h 4a ct
+--3" -..;-
+
LIII r= 1
(e i1hr +e- illrr).
Kupperman, M. (1952), B, 39, 429. The characteristic function of X is obtained by direct integration a d be rewritten as ' n rna:. 17
_2_ [Sin 2at . ~ _ 1] +~ [Sin at -1] (ait)2 2at sm at art at
==
~[et/11(tl_l] +~ [et/12(t l _ 1] , (Glt)2 Glt r/!l(t) ==
I: Bj~j~:t)j(22j_l); J.J.
where and
j=2 00
r/! 2(t) ==
L: )=2
B .(2a)j (it)j J
• .,
the B j being Bernoulli numbers.
'
J.J.
Hence the mean and variance of X.
P[Y _ (2r-l)h]
__ 2h_h 2(2r-l), a a2
2
for 1 ~ r ~
Ill,
so that the characteristic function of Y is
f eith(2r- l/2[2h _h 2(2r2-1)] l
a
r= 1
a
= 2h.eith/2. eit.hm _l_ 211 (~.~) ;, eith(2r-l)/2 e'th -1
a
a2 i at r~1
'
whence the result. The mean and variance are deduced by considering Ihl' expansion of log QUJ) in terms of the series involving Bernoulli numbers.
18
Kupperman, M. (1952), B, 39, 429. E(e itX ) = eita[(sin at)/at], so that the cumulant-generating function of X II log [e
ita
(Sin at)] _ ~ - - = L.. at r= 1
(it)' "r'-,' r .
The characteristic function of Y is
!. ki1eith(r+tl = eiat [Sin at] [ k ,=0
at
. tl1/2 ]. sm(tl1/2)
Therefore, if Kr denote the cumulants of Y, '.0
(it)'
00
_
(it)'
00
(it)i Bi hi
L: "r·-, = L ",.-.-, + L: -.-, .-. , r=1 r. r=1 I. j=2J· J the Bj being Bernoulli numbers. Hence
19
"2r+ 1
=
K2r+ 1; "2r
=
h2r K2r+ B 2r.];,
r ~ 1.
Kupperman, M. (1952), B, 39, 429. The characteristic function of X about the origin is
eith(n + 11/2 sin(ntl1/2) th/2 . nth ''2 . sin(th l '2)'
E(X) = (n + 1)h/2,
whence (t l , t 2) dt2
-oc,
Jnd
f 00
h(y)
=
2~
-00
,hence the result.
e- it2 ), 4>(0, t 2 ) dt 2 ,
27 Kenney, J. F. (1939), AMS, 10, 70.
Since E(e ilX ) = cp(t), therefore the joint characteristic function of Y a Z is nd I/I(t1>t 2 ) = E(eillY+ilZZ) = [CP(tl +t2)]V[cp(td]nl-V[cp(t2)]nz-v.
Further, if h(z) is the marginal density function of Z, then by using the Inv . Theorem, er· SlOn
f e-ilZ'[~. ~I/I] 00
E(Ylz) h(z)
1 = -2 1t
I
ut 1
11:
0
dt 2,
-00
and
.,
E(y2Iz) h(z)
J
= 21 1t
e- i1z '
-00
[~. ~2fJ I uti
dt 2· 11=0
These equations lead to E(Ylz) and var(Ylz), noting that E(X') = [;,.
o'CP~d] ott
I
1,=0
and
f i1z' 00
[cp(t 2)]n
=
e
h(z) dz.
-00
The correlation between Y and Z is obtained from the linear regression equations expressing E(Ylz) and E(Zly). 28
The joint characteristic function of u and v is CP(tl, t 2)
= E[exp(it 1 u+it 2v)] = (1-2it 1 )-nI 2 exp[ -At~/2(1-2itl)], n
where
A
=I
ai.
' 0, evaluate the integral 00
f
eit2V cf>(t 1, t 2) dt2
-00
by using a semicircular contour in the upper half of the complex plane.
fherelevant poles are at t2 = i±t1. Hence the numerator in cf>(t1Iv) is ne - v [t 1 cos vt 1 + sin vt 1 ] t 1(1 +tn
2'
.
The denominator is ne- v(1+r)/2. obtained as the limit for t 1 --+O of the numerator. 30 Bartlett, M. S. (1938), JLMS, 13,62. Use the result of Ex. 26 above.
f
00
Je- it2Y
00
cf>(t1' t 2) dt2
=
-00
J
exp{ -it 2Y+K(t 1, t2)}' dt2
-00
00
eoP[it,,-(oloy)). exp[ - it 2y -!(K20t i + 2K 11t 1t2 + K02t~)] . dt2
-00 00
=
f
eoP[it!. -(016y)) . e-tK2ott
exp['t 1( K02 t 2 - I 2Y -2 2 + 2K11 t 1 t 2 )] • dt 2
-00
whence cf>(t 1IY)· 31 The characteristic function of the sample mean
[eit;;/:
x is
I]",
and so by the Inversion Theorem the probability density function of x is f(x)
= -
1
Joo
2n
itl e- itx [e- ."-1till
1]" dt.
-00
The integrand is analytic everywhere. and so the path of integration may be changed to the contour r consisting of the real axis from - 00 to - p, the small semicircle of radius p and centre origin, and the real axis from p to 00. Thus itl f(x) = -1 e- itx - .dt 2n 1till
f
[e "_I]"
r
=
.(11) . feit[(jf,,'-X) dt.
(-1)". " (-1)) j
2n
j~O
(i/Il)"
r
t"
But ei~Z
f -Z"
dz
= 0, for 0( > 0,
r
2nO("-1 - i" . (II _ 1)!,
for 0(
t 2 )
= [1+(tl:t2rrn
v is
[1+el:~2rrn.
rhUS the characteristic functions (W 1> 0) and cP(O, t 2) of u and v are of the ..Jdle form, but cP(t I> t 2 ) 1= cP(t I> 0) cP(O, t 2 )· Expansion of 10gcP(tl> t 2 ) shows that the coefficient oftlt2 is zero. The probability density function of u is
-00
-00
rhe integral is evaluated by using a semicircular contour. Hence _
ne- n1iil
2n-1 (2n+r-1)!
g(lI) = (2 II _1)1' .
L
r = ()
-1(" _'_1)1' , • _II 1 .
[nluWn-r-1 _ ,)2n+r • (-r:o
,.= 1
f(x) being the density function of X. Therefore
[ata¢]
= i["'(ttfn))"-1 feiIOC/n.x2f(X)dX-
2 12=0
_;["'(t.ln)]n-2 [feiIIXln.Xf(X)dxr Hence the differential equation by putting ttfn = t. The solution
follows since ",(0) = 1 and ""(0) = ill.
42 Williams, J. D. (1941), AMS, 12, 239, and McGregor, J. R. (1960), B,47, 111. The variables Yj = (xj-m)/a are unit normal, and n-l
A
=!
L (Yj-Yj+l)2; j= 1
n
B
=!
L(yj-ji)2. j= 1
Therefore
But n
-!
L YJ+tlA+t2B j= 1
where a == 1-tl -[(n-1)/n]t2; b == tl -t2/n; c == 1-2tl -[(n-1}/n]t 2; d == /2/11 so that a+b+(n-2)d = 1: 2b+c+(n-3)d = 1. M n has the leading diagonal (a, c, c, . .. c, a), bordered by diagonals whose c\.cments are all h's, and the remaining elements are d's. Row operations gIve Mn = t~-1 Lln' where Lln is the nth-order determinant:
1 1
1
1 1 1
()
1 0
0 0 0
1
()
1
0 0 0
0 0
1
()
0 0 0
0 0 0
0
1
0
.......
0 0 0 0
()
4
1
1 1 (0+ 1) ,
.here 8 == (c-d)/(b-d); ()+ 1 == (a-d)/(b-d). Let D.-l denote the (n -1)th-order determinant obtained from An by xcluding the first row and column of An. Also, let C j denote the jth-order ~elerminant formed by the first j rows and columns of Dn _ 1, U < n - 1). Then Mn=t~-l(Dn_l-An_l)' C 1 =(}, C 2 =(}2-1, and for j>2 II
(.::: (}C j - 1 -C j J
2•
Hence
BUI
and /).n = Dn- 1 -An Therefore
1•
Assuming that An = AIZn+A2Z-n, the constants Al = - z/(1- Z2); A2 = z/(1- Z2). Hence An = An- 2 +S n- 1 , where S. == z'+z-', whence
Mn = t~Mn_2+~-lSn_l' Bul z, Z -
1
are the roots of Z2 - (}z + 1 = O. Therefore Sn-(}Sn-l +Sn-2 = 0,
and the difference equation for Mn follows. j
To obtain the expression for mj' put t2 =
f 00
= [ 0i(t 1;t 2)] ot 1 1,=0
A j e/2B •
-00
L
,= 1
t,
in
fIk=l -1-.e-tY~dYk' fo
and integrate for ( - ex) < t. ~ 0). Also, for the given form of (t 1> t 2)
[ oj(t~,t2)] ot 1
= [dj(tt O)] I, =0
dtl
. (1_t 2 )-tln + 2 j-l). I,
=0
43 Cochran, W. G. (1937), JRSS, 100, 69.
Let u_j = x_j − x̄ and v_j = y_j − ȳ. Then E(u_j) = E(v_j) = 0; var(u_j) = var(v_j) = (n−1)/n; and corr(u_j, v_j) = ρ. Also, since u_j and v_j have a joint bivariate normal distribution,

P(u_j > 0, v_j > 0) = P(u_j < 0, v_j < 0) = ¼ + (1/2π) sin⁻¹ρ

by Sheppard's medial dichotomy theorem. Hence E(C_j) = ½ + (1/π) sin⁻¹ρ and var(C_j) = ¼ − [(1/π) sin⁻¹ρ]². Note that, by symmetry,
> 0,
VI
> 0; U2 > 0, V2 > 0) = P(UI < 0,
VI
< 0; U2 < 0, v2 < 0).
To obtain P(UI > 0, VI > 0; U 2 < 0, V2 < 0), change the sign of t3 and I . the characteristic function. Then inversion and integration now give 4 III P(u l > 0,
VI
> 0; U2 < 0, V2 < 0)
1
1
= 4n 2[(sin- 1 p)2-{sin-1 p/(n-1W]+ 4n[sin- 1 p-sin- l p/(n-l)]+conslalll = P(u l < 0, VI < 0; u2 > 0, V2 > 0), by symmetry.
Hence cov(C I, C 2 ) = constant - [(lIn) sin -I p/(n -1)F whence var(C), the constant being evaluated since var(C) = for p = 1. For large n,
°
var(C) ≈ (1/4n)[1 − ((2/π) sin⁻¹ ρ)²],
which gives the large-sample efficiency of p*. 44 Bartlett, M. S. (1936), PRS, 154, 124. E[eit(X-m)]
= (1+t 2)-I, and y-m = (1-JL)(xl-m)+JL(x 2-m).
Therefore the characteristic function of (y-m) is E[ exp{ it(1- f/)(x 1- m)}] . E[exp{ itJL(X2 - m)}]
= [1 + t 2(1- JL)2] - I [1 + t 2p2]- I 1
= 1-2JL
[
(1- JL)2 1+t2(1-JL)2
JL2] 1+t2JL2'
But (1 + t 2 a 2 )-1 is the characteristic function of the distribution 1 - e- 1zl /C1 dz 2a' ,
( - 00
U2, ... , then by considering the product (1 + ax + {3x2)S prove that S=ao+(a\+aao)x 1 +ax+ (3x 2 Hence, as a particular case, determine the sum to infinity of the series 1-8x +28x 2-80x 3+ . .. , and indicate the range of x for which the summation is valid.
°
11 (i) For a> and Il small such that terms involving show that a root of the equation is a (1-!1l log ex). (ij) Prove that for any r>
11 2
are negligible,
°
1/[r(r+2)] =
(2r+1)/[2r(r+1)] − (2r+3)/[2(r+1)(r+2)].
By using this identity, or otherwise, find the sum to n terms of a series whose rth term is 1/r(r+2). What is the limiting sum as n → ∞? (iii) For any two numbers a and b such that a > b > 0 and a − b is small compared with a, prove that the value of the function
°
g(a, b) = (a² − b²)/2ab
− log(a/b)
lies between (a-b)3/6a 3 and (a-b)3(3a-2b)/6a 3b.
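Numerical checks of (ii) and (iii), taking the identity as 1/(r(r+2)) = (2r+1)/(2r(r+1)) − (2r+3)/(2(r+1)(r+2)) and g(a,b) = (a²−b²)/(2ab) − log(a/b) (values below are arbitrary):

```python
import math

# (ii): the identity telescopes, so Σ_{r=1}^{n} 1/(r(r+2)) = 3/4 − (2n+3)/(2(n+1)(n+2)) → 3/4.
n = 1000
s = sum(1 / (r * (r + 2)) for r in range(1, n + 1))
print(s, 0.75 - (2 * n + 3) / (2 * (n + 1) * (n + 2)))   # equal

# (iii): with a − b small, g(a, b) lies between the two stated bounds.
a, b = 1.0, 0.99
g = (a * a - b * b) / (2 * a * b) - math.log(a / b)
ok = (a - b) ** 3 / (6 * a ** 3) <= g <= (a - b) ** 3 * (3 * a - 2 * b) / (6 * a ** 3 * b)
print(ok)   # True
```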
12 (i) If f(r)
= X-I /
(x: r) where
is any positive integer, prove that
X
f(r-1)-f(r)=r- 1 /(x:r),
for r;;;'1.
Hence show that
so that, as n ~ 00, the sum of the series tends to X-I. (ii) If X is a sufficiently large number such that terms involving · 'ble, prove t h at negI19l (
(iii) If
Ixl < 1 and
X
x-4 , tlrt'
-I)! = 1-~+ 1 1 1 2x 2 - 2x3'
x+1
the rth term of an infinite series is
(-xY Ur =
prove that
r(r+ 1)
for r;;;.l,
L u =l- (1_+_x) 10g(1+x). 00
r
X
r=1
13 (i) If the rth term (r ≥ 1) of an infinite series is u_r = (1 + 2 + 2² + … + 2^{r−1})/(r+1)!, prove that
Σ_{r=1}^{∞} u_r = ½(e − 1)².
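A numeric check of 13(i): u_r = (2^r − 1)/(r+1)! sums to ½(e − 1)².

```python
import math

total = sum((2 ** r - 1) / math.factorial(r + 1) for r in range(1, 60))
print(total, (math.e - 1) ** 2 / 2)   # both ≈ 1.476246
```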
(ii) Prove that for any positive integer n ≥ 1,
"f (r-1n ) (3r
r
r=1
)
n;;;'
=
1
4"+1-1. n+1
(iii) If x is sufficiently large for terms involving x⁻⁴ to be negligible, prove that

(x² + x)^{1/2} [log(x+1) − log x] = 1 − 1/(24x²) + 1/(24x³).
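A numeric check of the expansion in (iii) at an arbitrary large x:

```python
import math

x = 50.0
lhs = math.sqrt(x * x + x) * (math.log(x + 1) - math.log(x))
rhs = 1 - 1 / (24 * x * x) + 1 / (24 * x ** 3)
print(lhs, rhs)   # agree to O(x⁻⁴)
```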
(iv) If x is sufficiently small for terms involving that
XS
to be negligible, prow
14 (i) The function f(r) = 1/(1 + r + r²) is defined for all non-negative values of r. Prove that

f(r−1) − f(r) = 2r/(1 + r² + r⁴),  for r ≥ 1.
SUPPLEMENT
Hence show that for any positive integer n,
S_n ≡ Σ_{r=1}^{n} 2r/(1 + r² + r⁴) = n(n+1)/(n² + n + 1),

so that lim_{n→∞} S_n = 1.
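A numeric check of the telescoping sum in 14(i):

```python
n = 500
s = sum(2 * r / (1 + r * r + r ** 4) for r in range(1, n + 1))
print(s, n * (n + 1) / (n * n + n + 1))   # equal; → 1 as n grows
```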
(ii) If
|x| < 1, …

29 A player throws k (> 1) unbiased cubical dice till at least one six shows up. If the event "at least one six" occurs on the rth throw of the dice, the player wins £2^{1−kr} (r ≥ 1). Prove that the probability of the player winning on the rth throw is (5/6)^{k(r−1)}[1 − (5/6)^k].
Hence deduce that his expected reward is

£ 2^{1−k} [1 − (5/6)^k] / [1 − (5/12)^k],
so that, on the average, the player can never be on the winning side if the non-returnable stake to play the game is £2 1- k • Also, show that the expected reward is a decreasing function of k.
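A simulation sketch of this game (k = 2 shown; the analytic expected reward 2^{1−k}[1 − (5/6)^k]/[1 − (5/12)^k] is always below the 2^{1−k} stake):

```python
import random

random.seed(6)
k, reps = 2, 100_000
total = 0.0
for _ in range(reps):
    r = 1
    while all(random.randint(1, 6) != 6 for _ in range(k)):   # throw k dice
        r += 1
    total += 2.0 ** (1 - k * r)                               # reward on throw r
sim = total / reps
exact = 2.0 ** (1 - k) * (1 - (5 / 6) ** k) / (1 - (5 / 12) ** k)
print(sim, exact)              # both ≈ 0.1849
print(exact < 2.0 ** (1 - k))  # True: expected reward below the stake
```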
30 An urn contains n (> 3) balls, either white or black, but otherwise identical; each ball has a probability p of being white and q of being black (p + q = 1). A random sample of k (< n) balls is drawn from the urn. Prove that the probability of the sample containing only one white ball is

k Σ_{r=1}^{n−k+1} (n−k choose r−1) p^r q^{n−r}.
Furthermore, if it is known that the urn contains one black and one white ball and the remaining n - 2 balls are equally likely to be either white or black, prove that the probability of the sample of k balls containing only one white ball is ,,-k+l (11 -
k
n(I1-1)2,,-2'
r~1
k)
r(n-r) r-1 .
31 A man plays a game of chance in which, at each turn, he rolls a pair of unbiased cubical dice whose faces are numbered from 1 to 6. He continues until the first "successful" turn, a successful turn being one in which at least one of the dice shows 6. If this is his 11th turn (11 ;;;:.1), he receives a reward of £a" (a < 1). This terminates the game and the stake for playing it is 50 pence. Prove that the player's expected reward is £l1a/(36-25a). Hence deduce that, on the average, the player will be on the winning side if a> ~~. 32 In a game of poker, five cards out of a standard pack of 52 are dealt out at random to a player. There are four suits in a pack and the 13 cards in a
suit are regarded to be in an ascending order in which the ace is taken to be the lowest card and also the highest. Any five cards (not necessarily of the same suit) which are in an unbroken sequence are said to form a run. However, if the five cards in the run belong to the same suit, then the sequence is known as a running flush. Find the probability that the player has a run but not a running flush. Also, show that there is a 4 per cent increase in this probability if it is known that the player has the jack of hearts. What is the explanation for this increase?

32a There are n alternatives in a multiple-choice test, and the probability of an examinee knowing the right answer is p. However, if he knows the right answer, then it is not absolutely certain that he will give the correct response because of the emotional tensions of the testing situation. It may therefore be assumed that the probability of the examinee giving the correct response when he knows the right answer is 1 - α, where α > 0 is small but not negligible. Alternatively, if the examinee does not know the right answer then he randomly selects one of the n alternatives.
(i) Prove that the probability of the examinee knowing the right answer when he, in fact, gives the correct response is

    \frac{np(1-\alpha)}{1 + [n(1-\alpha) - 1]p}.

Also, determine the probability of the examinee not knowing the right answer when, in fact, he gives an incorrect response.
(ii) Evaluate these probabilities on the assumption that when the examinee does not know the right answer, he knows that m < n of the alternatives are incorrect, so that he makes a random choice from the remaining n - m alternatives.
(iii) Find the limiting values of the probabilities in (i) and (ii) as n \to \infty, and comment on the results obtained.

33 As a financial inducement to Ruritanians for changing their habits of smoking, the government presented two separate proposals in parliament. The first proposal referred to a decrease in the excise duty on cigars and tobacco, and the second called for an increase in the duty on cigarettes. Two free votes were taken in parliament with the following results. The ratio of the number of M.P.s voting for a decrease in duty on cigars and tobacco to the number voting against it was one and a half times the ratio in the vote on an increase in the duty on cigarettes. Again, of those M.P.s who voted in favour of a reduction in the duty on cigars and tobacco there was a majority of 165 in favour of the increase of duty on cigarettes; and of those who voted against the decrease in duty on cigars and tobacco there was a majority of 135 in favour of an increase of duty on cigarettes. Finally, if 110 of those who voted for both government proposals had, in fact, joined those who voted against them, the number of the former would have been twice that of the latter. Assuming that there were no abstentions in the votings, find
(i) the total number of M.P.s attending the meeting;
(ii) the numbers voting for each proposal;
(iii) the numbers voting for and against both proposals; and
(iv) the numbers voting for one and against the other proposal.
EXERCISES IN PROBABILITY AND STATISTICS
34 Each of two numbers x and y is selected randomly and independently from the integers 1, 2, 3, 4 and 5. Show by suitable enumeration that the probability that x^2 + y^2 is divisible by 5 is 9/25. Further, prove that this is also the probability for 2x^2 + 3y^2 to be divisible by 5, and that the probability that both x^2 + y^2 and 2x^2 + 3y^2 are divisible by 5 is 1/25.
Extend these results to the case when x and y are chosen from the integers 1 to 5N, where N is any integer \ge 1, to show that the above probabilities are independent of N and so also applicable when x and y are selected from the set of all integers.
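The enumeration the exercise asks for can be sketched directly; the same loop run over 1..5N shows the probabilities do not depend on N.

```python
from itertools import product

# Count pairs (x, y) for which the two quadratic forms are divisible by 5.
def probs(nmax):
    total = c1 = c2 = c12 = 0
    for x, y in product(range(1, nmax + 1), repeat=2):
        e1 = (x*x + y*y) % 5 == 0
        e2 = (2*x*x + 3*y*y) % 5 == 0
        c1 += e1; c2 += e2; c12 += (e1 and e2); total += 1
    return c1/total, c2/total, c12/total

p1, p2, p12 = probs(5)
assert (p1, p2, p12) == (9/25, 9/25, 1/25)
assert probs(15) == (p1, p2, p12)   # N = 3 gives the same probabilities
```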
35 There are k similar urns numbered from 1 to k, each containing m + n (n > k) balls which are identical apart from their colour. The balls in the kth urn are all white, but in the rth urn (1 \le r \le k-1) there are m + r black and n - r white balls. An urn is selected randomly and two balls are drawn from it. If both the balls are found to be white, determine the conditional probabilities that the kth urn was selected, on the assumption that the second drawing was made (i) after replacement of the first ball drawn; and (ii) without replacement of the first ball drawn. Explain very briefly why the conditional probability in (i) should be less than that in (ii).

36 There are six unbiased dice D_1, D_2, ..., D_6, the six faces of D_i being numbered (i, i, i, 4, 5, 6) or (i, i, i, 1, 2, 3) according as i \le 3 or i > 3. If one of the dice is selected randomly and rolled twice, then prove that the probability of obtaining a double-six is 1/18.
Further, given that a double-six has been observed, prove that the probability of the selected die being D_6 is 3/4. How is this probability altered if it is known that D_3 is a biased die such that, though it is equally likely to show a three or a number greater than three, the probability of it showing a six is twice that of obtaining a four or a five?

37 On receiving an external stimulus, a certain kind of biological particle gives rise to a progeny of r (0 \le r \le n) particles with probability \binom{n}{r}p^r q^{n-r}, where p and q are positive parameters such that p + q = 1. The original particle then dies. The particles, if any, constituting the progeny act independently of each other and are subjected to another external stimulus, the behaviour of these progeny particles being the same as that of the original particle. If X_1 and X_2 are random variables denoting the number of particles in the progenies generated after the first and second stimulus respectively, prove that the joint probability-generating function of X_1 and X_2 is

    E(\theta_1^{X_1}\theta_2^{X_2}) \equiv G(\theta_1, \theta_2) = [q + p\theta_1(q + p\theta_2)^n]^n.
Hence, or otherwise, show that
(i) P(X_2 > 0) = 1 - q^n(1 + pq^{n-1})^n;
(ii) E(X_1 + X_2) = np(1 + np); and
(iii) var(X_1 + X_2) = npq[np + (1 + np)^2].
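The two moment results can be checked by exact enumeration of the two-generation process (the offspring law Bin(n, p), and the illustrative values n = 4, p = 0.3, are the assumptions of this sketch).

```python
from math import comb

# X1 ~ Bin(n, p); given X1 = x1, X2 ~ Bin(n*x1, p).  Enumerate (X1, X2)
# exactly and compare with E(X1+X2) = np(1+np), var = npq[np + (1+np)^2].
n, p = 4, 0.3
q = 1 - p

def binom_pmf(k, m, pr):
    return comb(m, k) * pr**k * (1 - pr)**(m - k)

mean = var = 0.0
for x1 in range(n + 1):
    p1 = binom_pmf(x1, n, p)
    for x2 in range(n * x1 + 1):
        w = p1 * binom_pmf(x2, n * x1, p)
        s = x1 + x2
        mean += w * s
        var += w * s * s
var -= mean**2

assert abs(mean - n*p*(1 + n*p)) < 1e-12
assert abs(var - n*p*q*(n*p + (1 + n*p)**2)) < 1e-12
```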
38 A system can be in any one of sixteen states which may be represented as the equal cells of a square divided by lines parallel to its sides. Initially the system is equally likely to be in any one of the states. The sixteen states are of three types conveniently denoted as A, B and C, and their distribution is as given in the diagram below.

    A B B A
    B C C B
    B C C B
    A B B A

After initial placement, the system is moved n times. At each move, the system is transferred to a neighbouring cell in any direction horizontally, vertically or diagonally, all such moves being equally likely.
For n \ge 1, suppose \alpha_n, \beta_n and \gamma_n respectively are the probabilities that after n moves the system is in an A-type, B-type or C-type cell. Prove that these probabilities satisfy the following difference equations:

    \alpha_n = \tfrac{1}{5}\beta_{n-1} + \tfrac{1}{8}\gamma_{n-1}
    \beta_n  = \tfrac{2}{3}\alpha_{n-1} + \tfrac{2}{5}\beta_{n-1} + \tfrac{1}{2}\gamma_{n-1}
    \gamma_n = \tfrac{1}{3}\alpha_{n-1} + \tfrac{2}{5}\beta_{n-1} + \tfrac{3}{8}\gamma_{n-1}.

Show that the solution of these difference equations is

    \alpha_n = \tfrac{1}{2} - \tfrac{3}{4}\beta_n,  where  \beta_n = \frac{10}{21}\left[1 - \frac{13}{90}\left(-\frac{9}{40}\right)^n\right].

Hence deduce the probabilities that after n moves the system will be in a specific A-type, B-type or C-type cell and verify that, as n \to \infty, these probabilities are in the ratio 3 : 5 : 8.
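The recurrences and the quoted solution can be verified exactly in rational arithmetic; the sketch below iterates from the uniform start (4 A-cells, 8 B-cells, 4 C-cells, so (1/4, 1/2, 1/4)) and checks the closed forms and the limiting 3 : 5 : 8 ratio.

```python
from fractions import Fraction as F

# Iterate the difference equations exactly and compare with
# beta_n = (10/21)[1 - (13/90)(-9/40)^n] and alpha_n = 1/2 - (3/4) beta_n.
a, b, g = F(1, 4), F(1, 2), F(1, 4)
for n in range(1, 16):
    a, b, g = (F(1, 5)*b + F(1, 8)*g,
               F(2, 3)*a + F(2, 5)*b + F(1, 2)*g,
               F(1, 3)*a + F(2, 5)*b + F(3, 8)*g)
    assert b == F(10, 21) * (1 - F(13, 90) * F(-9, 40)**n)
    assert a == F(1, 2) - F(3, 4) * b
    assert a + b + g == 1

# limiting per-cell probabilities: alpha/4 : beta/8 : gamma/4 = 3 : 5 : 8
aL, bL, gL = F(1, 7), F(10, 21), F(8, 21)
assert (aL/4, bL/8, gL/4) == (F(3, 84), F(5, 84), F(8, 84))
```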
39 Two players A and B contest a series of games. At each trial the probabilities of A and B winning a game are p and q respectively (p \ne 0, q \ne 0), and there is a non-zero probability r for it to be drawn. It is agreed that the winner of the series will be the player who is the first to win four more games than his opponent. Prove that, for p \ne q but irrespective of the value of r,
(i) the initial probability of A winning the series is

    \frac{1}{1 + (q/p)^4};

and (ii) the probability of B winning the series when he is exactly n (0 \le n \le 4) games behind A is

    \frac{1 - (p/q)^{4-n}}{1 - (p/q)^8}.

Hence deduce the values of the probabilities (i) and (ii) when a game cannot be drawn, the other conditions for the contest being unaffected.
40 A self-service store has a stock of m + n undated one-pint bottles of milk of which n are fresh and m a day old, and it may be assumed that these are placed randomly on the service counter of the store. If a customer selects k (k < m and n) bottles of milk, find the probability that she obtains only fresh milk bottles.

    \cdots\ -\log[Ap^2/R(1+p)^2].

44 In The Wind on the Moon by Eric Linklater (Macmillan, 1944), the Palfrey sisters, Dinah and Dorinda, have in all the sum of 16 shillings and 11 pence to pay for the unusual services of the barristers Hobson and Jobson. In offering the sum to the barristers, Dinah thinks that it would be rather difficult for them to divide the money equally between them. However, Mr Jobson thinks otherwise and declares "I see there are twice as many half-crowns as shillings and the number of half-pennies is the same as the number of shillings and half-crowns added together, including the very shining bob with the head of Queen Victoria on it. Then if you add to that (that is, the total number of half-crowns and half-pennies) the total number of two-shilling pieces, the total is one and a half times the number of pennies. Now let me see ..."
"There are two single shillings", said Mr Hobson, "and if you multiply the number of half-crowns by that you will get the exact number of pennies. It's perfectly easy, Jobson. We divide the 16 shillings and 11 pence into two exactly equal halves, by taking 11 coins each. There are no complaints and nothing left over."
Determine the individual face values of the 22 coins offered in payment to the barristers. Hence show that if the 22 coins are randomly divided into two groups of 11 each, then the probability that the resulting division of money will be equal is 200/4199 \approx 0.0476. It may be assumed that the symmetrical distribution of coins is the only solution for equal division of the money.
Also, show that the same distribution of the face values of the coins is obtained even if the number of shillings is not known.
[Note: For those unfamiliar with the pre-decimal British monetary system, a shilling or a bob was worth 12 old pence, a two-shilling piece 24 old pence, and a half-crown 30 old pence.]
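The puzzle yields to brute force. The composition below (4 half-crowns, 2 florins, 2 shillings, 8 pennies, 6 halfpennies) is the one implied by the clues, and the sketch confirms the probability 200/4199 by exhaustive enumeration of all C(22, 11) ways of choosing one sister's 11 coins.

```python
from itertools import combinations
from math import comb

# Values in halfpennies: half-crown 60, florin 48, shilling 24, penny 2, halfpenny 1.
coins = [60]*4 + [48]*2 + [24]*2 + [2]*8 + [1]*6
assert len(coins) == 22 and sum(coins) == 406       # 16s 11d = 203 old pence

target = sum(coins) // 2                            # each sister's equal half
favourable = sum(1 for grp in combinations(range(22), 11)
                 if sum(coins[i] for i in grp) == target)
probability = favourable / comb(22, 11)             # 33600/705432 = 200/4199
assert abs(probability - 200/4199) < 1e-12
```

The enumeration also confirms the book's closing assumption: every equal division has the symmetric composition (2 half-crowns, 1 florin, 1 shilling, 4 pennies, 3 halfpennies per group), giving C(4,2)C(2,1)C(2,1)C(8,4)C(6,3) = 33600 favourable choices.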
45 (i) Define \Phi(x), the distribution function of the unit normal variable X, and then show that for any x > 1

    \frac{1}{2\sqrt{2\pi}}\int_{-x}^{x} \exp[\,\cdots\,]\,dt = \cdots
(ii) The parliament in Ruritania consists of 2N + 1 members. On a non-party motion, a free vote is proposed. A pressure group consists of n members who agree amongst themselves that they will all vote in the same way, but it may be assumed that the remaining members will vote independently, and each is equally likely to vote for or against the motion. Assuming that there were no abstentions at the actual voting, determine the probability that more than half the total votes will be cast in the same way as those of the pressure group. If N is large compared with n, use a suitable approximation to show that, for fixed N, this probability is an increasing function of n. Further, assuming that the effective range of a unit normal variable is from -3 to 3, prove that it is practically certain that the decision on the motion will go the same way as the voting of the pressure group if

    n \ge \tfrac{1}{2}\left[\sqrt{8N + 13} - 3\right].
46 A population of wild animals consists of N members, where N is a large but unknown number, and a biologist is interested in obtaining an estimate of N. He takes a random sample of W animals, marks and then releases them. After this, he starts capturing the animals randomly and noting whether they are marked or not. He continues this sampling without replacement till this second sample contains exactly w (fixed and preassigned) marked animals. The sampling is then discontinued and it is observed that the total sample size is n. If X is a random variable denoting the size of the sample, prove that

    P(X = n) = \binom{n-1}{w-1}\binom{N-n}{W-w}\Big/\binom{N}{W},  for w \le X \le N - W + w.

By considering P(X = n) as a function g(N) of the unknown parameter N, show that

    \frac{g(N)}{g(N-1)} = \frac{(N-n)(N-W)}{N(N-W-n+w)}.

Hence deduce that the maximum-likelihood estimate of N is the largest integer contained in nW/w. Also, assuming that E(X) = w(N+1)/(W+1), evaluate a simple unbiased estimate of N.
[Note: It may be assumed that no change in the animal population occurs in between the taking of the two samples.]
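A numerical sketch of the capture-recapture model (the values N = 50, W = 12, w = 5 and the observed n = 17 are illustrative assumptions): it checks that the pmf sums to one, that E(X) = w(N+1)/(W+1), and that the likelihood is maximised at the largest integer contained in nW/w.

```python
from math import comb

N, W, w = 50, 12, 5

def pmf(n):
    return comb(n - 1, w - 1) * comb(N - n, W - w) / comb(N, W)

support = range(w, N - W + w + 1)
total = sum(pmf(n) for n in support)
EX = sum(n * pmf(n) for n in support)
assert abs(total - 1) < 1e-12
assert abs(EX - w * (N + 1) / (W + 1)) < 1e-12

# likelihood g(N) for an observed sample size n_obs, maximised at floor(nW/w)
n_obs = 17
def g(Ncand):
    return comb(n_obs - 1, w - 1) * comb(Ncand - n_obs, W - w) / comb(Ncand, W)
mle = max(range(n_obs, 400), key=g)
assert mle == (n_obs * W) // w

# E(X) = w(N+1)/(W+1) makes N_hat = X(W+1)/w - 1 unbiased for N
Nhat_mean = sum((n * (W + 1) / w - 1) * pmf(n) for n in support)
assert abs(Nhat_mean - N) < 1e-10
```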
47 A player has a constant probability p (> 1/2) of winning in an independent trial of a game of chance. The stake for each trial is a shillings. If the player wins a trial he receives (a + x) shillings (x > 0), but he loses his stake money in the contrary event. The player decides to play this game with the proviso that he will stop play immediately after his first failure. If S is the random variable denoting the net gain of the player at the end of his play, prove that

    E(S) = \frac{px}{q} - a;  var(S) = \frac{px^2}{q^2},  p + q = 1.

Also, show that the player can expect to gain at least a shillings at the end of play if x > a/p.
48 Small electrical fuses are mass-produced and then packed into large boxes by an automatic filling machine. The number of fuses put into a box by the machine cannot be controlled completely; and in order to ensure that no customer is a loser, the filling machine is so adjusted that it puts at least n fuses in a box. It may be assumed on empirical considerations that the actual number of fuses in a box is a discrete random variable X having the geometric distribution with point-probabilities

    P(X = x) = (1 - p)p^{x-n},  for X \ge n,

where p is a positive parameter such that 0 < p < 1.
Given a randomly selected box, the probability that any particular fuse selected at random from the box is usable is a constant θ, where 0 < θ < 1. If a box contains exactly x fuses, prove that the probability that it contains exactly r usable fuses is

    P(r \mid X = x) = \binom{x}{r}\theta^r(1-\theta)^{x-r},  for 0 \le r \le x,

and write down the expected number of usable fuses in a box containing exactly x fuses. Hence, or otherwise, show that the expected number of usable fuses in a randomly selected box (whatever the number of fuses it contains) is

    E(r) = n\theta + p\theta/(1-p).
49 A manufacturer of a breakfast cereal introduces a coupon scheme as a sales promotion campaign. A set of coupons consists of n (\ge 2) different ones and one coupon is placed in every packet of the cereal. A complete set of coupons can be exchanged for a stock of spring bulbs. A housewife buys one packet of the cereal every week. Assuming that it is equally likely for her to find any one of the n coupons in a packet, prove that the probability that she will have to wait r (\ge 1) weeks before obtaining the (x + 1)th coupon when she already has x different coupons is

    P(r \mid x) = \left(1 - \frac{x}{n}\right)\left(\frac{x}{n}\right)^{r-1},  for 1 \le x \le n - 1.

Hence show that, for any given x, the expectation of r is n/(n - x), and the expected total number of weeks required for the completion of one set is

    n \sum_{j=1}^{n} \frac{1}{j}.
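A small Monte-Carlo sketch of the coupon-collector mean n(1/1 + 1/2 + ... + 1/n); the set size n = 6, the seed, and the loose tolerance (the check is stochastic) are assumptions of the sketch.

```python
import random

random.seed(1)
n = 6
expected_weeks = n * sum(1 / j for j in range(1, n + 1))   # = 14.7 for n = 6

def weeks_to_complete(n):
    seen, weeks = set(), 0
    while len(seen) < n:
        seen.add(random.randrange(n))    # one equally-likely coupon per week
        weeks += 1
    return weeks

trials = 20000
avg = sum(weeks_to_complete(n) for _ in range(trials)) / trials
assert abs(avg - expected_weeks) < 0.3
```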
50 (i) If X is a random variable having the Poisson distribution with mean μ, and α, β and k are non-negative integers such that 0 < k \le

For α (> 1) and β (< 1), the profit increases with rising demand. However, the effect of an increase in price is to decrease demand due to the competition of imported steel; consequently, for given α, the total realizable profit tends to decrease somewhat because of the additional cost of holding unsold stock. Accordingly, it is suggested on empirical evidence that a reasonable approximation of this dampening effect of price increase on profit is a factor \exp[-\beta(\alpha - x)], which proportionately reduces a parabolic increase in profit with rising demand. It may therefore be assumed that, on a suitable scale, the profit of the company is

    T = \alpha x(1 + \beta x)\,e^{-\beta(\alpha - x)}.

Hence, assuming that the demand is appropriately scaled so that x is a normally distributed random variable with zero mean and variance β, prove that the expected profit is

    E(T) = \alpha\beta^2(\beta^3 + 2)\exp[-\beta(\alpha - \beta^2/2)].

Finally, suppose β is fixed by a retail price agreement amongst the steel companies, and a company has the freedom to determine the production capacity of its plant for maximizing the expected profit. Show that under such a restrictive price arrangement the company can obtain maximum profit when α = 1/β.
56 A factory produces corned beef for export which is despatched in large consignments of 12 oz. containers to customers overseas. However, due to a recent failure in the sterilization of the product, the government has imposed the restriction that a consignment should only be allowed to pass for export if a random sample of 100 containers from it is found to be completely free from infection. It is suspected that, despite improvement in the sterilization process at the factory, there is an α (small) per cent rate of infection in the consignments offered for inspection. If it may be assumed that the size of the consignments is large compared with the sample size so that the effect of sampling from a finite population is negligible, prove that the probability of a random consignment being passed as free of infection is approximately e^{-\alpha}.
Hence deduce that the probability that of a batch of ten consignments at least three will be rejected on inspection is

    1 - \sum_{w=0}^{2} \binom{10}{w}(1 - e^{-\alpha})^w (e^{-\alpha})^{10-w}.

If, over a period of time, this probability is assessed to be 1.5 per cent, verify that the rate of infection still present in the consignments is approximately five per cent.
[Note: e^{-0.05} = 0.95 correct to two decimal places.]
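Taking the note's value e^{-0.05} = 0.95, the rejection probability can be evaluated directly; at this level of rounding the tail comes out close to the assessed figure.

```python
from math import comb

# P(a consignment passes) = e^(-alpha) ~ 0.95; P(at least 3 of 10 rejected):
pass_prob = 0.95
tail = 1 - sum(comb(10, w) * (1 - pass_prob)**w * pass_prob**(10 - w)
               for w in range(3))
assert abs(tail - 0.0115) < 1e-3      # about 1.5 per cent, as assessed
```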
57 A printing house proposes to pay £a as a standard charge for proof-reading a book irrespective of the number of errors detected by the proof-reader. If no errors are detected in the proofs, then the proof-reader is paid another £e^{-\lambda}a, where λ > 0 is a parameter. On the other hand, if the proof-reader detects errors in the proofs, then he is paid £p^{r-1}a if he detects r errors, for 1 \le r

    \cdots\ - (\beta - \alpha)[1 - \Phi(\delta)],

where \delta \equiv [\log(\beta - \alpha) - \mu]/\sigma and Φ is the distribution function of the standardized normal variable. Hence deduce that the relative change in the average social security contribution is

    \Phi(\delta - \sigma) + [1 - \Phi(\delta)]\exp\sigma(\delta - \tfrac{1}{2}\sigma).
63 An owner of a large Edwardian mansion has the choice between two methods of heating the residence. He can either continue to use solid fuel in the existing fire-places or replace them by thermostatically controlled gas fires. It is estimated that the annual consumption of solid fuel would be N tons at a cost of £α per ton. On the other hand, the initial cost of installing the gas fires is substantial and it may be discounted as an annual cost of £β. Since the gas fires are thermostatically controlled, the consumption of gas would depend upon the daily temperature distribution as recorded within the house. For any one day, the gas consumption in therms may be regarded as a random variable X with a probability density function proportional to e^{-\lambda x^2}x^3, for X > 0, where λ > 0 is a parameter. Prove that the average consumption of gas per day is \tfrac{3}{4}\sqrt{\pi/\lambda} therms. Hence, assuming that the price of gas is £γ per therm and that, on the average, there are 300 days in a year when heating is at all required, determine the expected annual cost of heating the house by gas. Also, show that, on the average, solid fuel heating will be more economical as long as

    \frac{\gamma}{\sqrt{\lambda}} > \frac{\alpha N - \beta}{225\sqrt{\pi}}.
64 A refinery, located at a seaport, has a capacity of processing N gallons of crude oil during a week. The refinery is supplied with crude oil by tankers which arrive at random intervals of time; and if the refinery receives no fresh crude oil during a week then it is scheduled to process only aN gallons, where 0 < a < 1. The amount of crude oil received by the refinery during a week is a random variable X, and as X increases the scheduled processing capacity of the refinery also increases. However, since the capacity of the refinery is limited, it may be assumed on empirical considerations that this increase dies off exponentially. It may therefore be assumed that if the amount of crude oil received by the refinery in a week is X, then the scheduled processing amount is

    T = N[1 - (1-a)e^{-\beta X}],

where β (< 1) is a small positive parameter. If the probability density function of the probability distribution of X is

    f(x) = x\,e^{-\frac{1}{2}x^2},  for X \ge 0,

prove that

    E(T) = N - N(1-a)\left[1 - \sqrt{2\pi}\,\beta e^{\frac{1}{2}\beta^2}\{1 - \Phi(\beta)\}\right],

where \Phi(\cdot) is the distribution function of a unit normal variable. Hence, assuming that β is sufficiently small for β^2 to be negligible, show that in any one week the amount of crude oil scheduled for processing cannot be greater than the expected amount if

    X \le \frac{1}{\beta}\log\left[\frac{1}{1 - \beta\sqrt{\pi/2}}\right].
65 A continuous random variable X has the probability density function proportional to

    \frac{x^{\alpha-1}}{(1 + \beta x^{\alpha})^{2n}},  for X \ge 0,

where α and β are positive parameters and n > 1. Prove that if (2n-1)\alpha - r > 0, then the rth moment of X about the origin is given by

    E(X^r) = \frac{2n-1}{\beta^{r/\alpha}}\,B\!\left(1 + \frac{r}{\alpha},\; 2n - 1 - \frac{r}{\alpha}\right),

in standard notation for the beta function. Also, show that the stationary values of the probability density function of X are the roots of the equation

    \beta x^{\alpha} = \frac{\alpha - 1}{(2n-1)\alpha + 1}.

Hence deduce that for α > 1 the probability distribution of X has a mode at the point

    x = \left[\frac{\alpha - 1}{\{(2n-1)\alpha + 1\}\beta}\right]^{1/\alpha}.
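A quadrature sketch of the moment formula (the values α = 3, β = 2, n = 2, r = 2 are illustrative assumptions; the substitution u = βx^α followed by u = t/(1−t) reduces the moment integral to the unit interval).

```python
from math import gamma

# E(X^r) = (2n-1) B(1 + r/a, 2n-1 - r/a) / b^(r/a)  for the density
# proportional to x^(a-1)/(1 + b x^a)^(2n); needs (2n-1)a - r > 0.
a, b, n, r = 3.0, 2.0, 2, 2

def beta_fn(p, q):
    return gamma(p) * gamma(q) / gamma(p + q)

claim = (2*n - 1) * beta_fn(1 + r/a, 2*n - 1 - r/a) / b**(r/a)

# transformed integral: (2n-1) * int_0^1 (t/(b(1-t)))^(r/a) (1-t)^(2n-2) dt
m = 100000
num = 0.0
for i in range(m):
    t = (i + 0.5) / m                      # midpoint rule
    num += (t / (b * (1 - t)))**(r/a) * (1 - t)**(2*n - 2)
num *= (2*n - 1) / m
assert abs(num - claim) < 1e-4

# stationary point satisfies b x^a = (a-1)/((2n-1)a + 1)
mode = ((a - 1) / (((2*n - 1)*a + 1) * b))**(1/a)
f = lambda x: x**(a - 1) / (1 + b * x**a)**(2*n)
assert f(mode) > f(0.99 * mode) and f(mode) > f(1.01 * mode)
```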
66 Explain, very briefly, the concept of conditional probability.
A continuous random variable X has the beta distribution of the first kind with the probability density function

    \frac{1}{B(m, 2)}\,x^{m-1}(1 - x),  for 0 \le X \le 1,

where m is a positive parameter. Given a, a positive number less than unity, prove that

    P(X \le a) = a^m(m + 1 - ma).

Given another positive number β < a, prove that

    P(X \le \beta \mid X \le a) = \left(\frac{\beta}{a}\right)^m \cdot \frac{m + 1 - m\beta}{m + 1 - ma}.
67 A continuous random variable X has the probability density function

    \tfrac{1}{2}[1 + \lambda x(1 - x^2)],  for -1 \le X \le 1,

where λ is a parameter having some value in the interval (-1, 1). Prove that for any non-negative integer r

    E(X^r) = \frac{1}{2}\left[\frac{1 + (-1)^{r+2}}{r+1} + \frac{2\lambda\{1 + (-1)^{r+3}\}}{(r+2)(r+4)}\right].

Hence verify that

    E(X) = \frac{2\lambda}{15};  var(X) = \frac{75 - 4\lambda^2}{225}.

Also, prove that the distribution function of X is

    F(x) = \frac{1}{2}\left[1 - \frac{\lambda}{4} + x + \frac{\lambda}{2}x^2\left(1 - \frac{x^2}{2}\right)\right].

Hence show that the median of the probability distribution of X is a real root of the equation

    \lambda(x^2 - 1)^2 - 4x = 0,

and that this root is > or < or = 0 according as λ > or < or = 0.

68 A continuous random variable X has the probability density function

    f(x) = \frac{k x^2}{1 + x^4},  for -\infty < X < \infty.

Prove that, for 0 \le r < 1,

    E(|X|^r) = \frac{k}{2}\,B\!\left(\frac{r+3}{4}, \frac{1-r}{4}\right),

in standard notation of the beta function. Hence deduce that k = \sqrt{2}/\pi and that none of the even moments of X exists. Also, show that the turning values of the probability density function of X are the roots of the equation

    x^4 = 1.

Hence prove that the probability distribution of X is bimodal and determine its modes. Finally, show that the points of inflexion of the probability distribution of X are the real roots of the equation

    3x^8 - 12x^4 + 1 = 0,

and determine these roots explicitly.
[Note: It may be assumed that \Gamma(t)\Gamma(1-t) = \pi/\sin \pi t.]
69 A continuous random variable X has the probability density function

    f(x) = \frac{\alpha\beta x^{\alpha-1}}{(1 + \beta x^{\alpha})^2},  for X \ge 0,

where α and β are positive parameters. Prove that if α > r, then the rth moment of X about the origin is given by

    E(X^r) = \frac{1}{\beta^{r/\alpha}}\,B\!\left(1 + \frac{r}{\alpha},\; 1 - \frac{r}{\alpha}\right),

in standard notation for the beta function. Also, for α > 1, show that the mode of the probability distribution of X is at the point

    x = \left[\frac{\alpha - 1}{\beta(\alpha + 1)}\right]^{1/\alpha}.

Finally, prove that the median of the probability distribution of X is at the point

    x = \left(\frac{1}{\beta}\right)^{1/\alpha}.

Hence, for α > 1, infer the sign of the skewness of the probability distribution of X.

70 (i) For any two positive numbers a and ν, the incomplete gamma function is

    \Gamma_a(\nu) = \frac{1}{\Gamma(\nu)}\int_0^a e^{-x}x^{\nu-1}\,dx.
If ν > 1, then show by suitable integration by parts that

    \Gamma_a(\nu - 1) - \Gamma_a(\nu) = \frac{e^{-a}a^{\nu-1}}{\Gamma(\nu)}.

(ii) For any r \ge 0 and θ > 0, prove that

    I(r, \theta) \equiv \int_0^{\theta} e^{-\frac{1}{2}x^2} x^r\,dx = 2^{(r-1)/2}\,\Gamma_{\lambda}\!\left(\frac{r+1}{2}\right)\Gamma\!\left(\frac{r+1}{2}\right),

where \lambda = \tfrac{1}{2}\theta^2.
(iii) A continuous random variable Z has a non-normal probability distribution with the probability density function

    \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}z^2}\left[1 + \frac{z^4 - 6z^2 + 3}{n}\right],  for -\infty < Z < \infty.

79 A continuous random variable X has the probability density function

    f(x) = \frac{1}{6\lambda^4}\,x^3 e^{-x/\lambda},  for X \ge 0,
where λ is a positive parameter. Prove that

    E(X) = 4\lambda  and  var(X) = 4\lambda^2.

If x_1, x_2, ..., x_n are n independent observations of X, show that the maximum-likelihood estimate of λ is

    \hat{\lambda} = \tfrac{1}{4}\bar{x},

where \bar{x} is the average of the x_i. Hence deduce that an unbiased estimate of λ^2 is

    \frac{n\bar{x}^2}{4(4n+1)},

whereas the estimate \bar{x}^2/16 has a bias of O(n^{-1}).

80 A continuous random variable X has a normal distribution such that both the mean and variance of X are equal to θ, unknown. If x_1, x_2, ..., x_n are independent observations of X, show that the equation of \hat{\theta}, the maximum-likelihood estimate of θ, is

    \hat{\theta}(\hat{\theta} + 1) = \bar{x}^2 + s^2,

where

    n\bar{x} = \sum_{i=1}^{n} x_i  and  ns^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2.

Hence deduce that

    \hat{\theta} = \tfrac{1}{2}\left[\{1 + 4(\bar{x}^2 + s^2)\}^{\frac{1}{2}} - 1\right]

and that, for large samples,

    var(\hat{\theta}) = \frac{2\theta^2}{n(2\theta + 1)}.
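A simulation sketch of the estimator in exercise 80 (the value θ = 2.5, the sample size and the seed are illustrative assumptions): the closed form is checked against the defining equation, and the estimate lands close to the true θ.

```python
import math, random

# N(theta, theta): theta_hat solves t(t+1) = xbar^2 + s^2.
random.seed(7)
theta, n = 2.5, 200000
xs = [random.gauss(theta, math.sqrt(theta)) for _ in range(n)]
xbar = sum(xs) / n
s2 = sum((x - xbar)**2 for x in xs) / n
t = (math.sqrt(1 + 4*(xbar**2 + s2)) - 1) / 2
assert abs(t*(t + 1) - (xbar**2 + s2)) < 1e-9   # solves the ML equation
assert abs(t - theta) < 0.05                    # close to the true value
```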
81 Explain, very briefly, the concept of unbiasedness in the theory of statistical estimation.
A continuous random variable X has the probability density function

    f(x) = \frac{1}{2\theta^3}\,x^2 e^{-x/\theta},  for X \ge 0.

Prove that for any r \ge 0

    E(X^r) = \tfrac{1}{2}\theta^r\,\Gamma(r + 3).

Hence verify that

    E(X) = 3\theta;  var(X) = 3\theta^2.

Given that x_1, x_2, ..., x_n are independent realizations of X, prove that the maximum-likelihood estimate of θ is

    \hat{\theta} = \tfrac{1}{3}\bar{x},

where \bar{x} is the average of the n observations. Also, determine var(\hat{\theta}). Two functions

    L_1 = \frac{1}{4n}\sum_{i=1}^{n} x_i^2  and  L_2 = \tfrac{1}{3}\bar{x}^2

are suggested as possible estimates of var(X). Show that L_1 is an unbiased estimate of var(X) but that L_2 is asymptotically unbiased.
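A simulation sketch contrasting the two estimates (θ = 2.0, n = 50, the seed and replication count are illustrative assumptions); E(L_1) should sit at 3θ² while E(L_2) carries the θ²/n bias.

```python
import random

# X ~ Gamma(shape 3, scale theta), so var(X) = 3 theta^2 = 12 here.
random.seed(3)
theta, n, reps = 2.0, 50, 4000
L1s, L2s = [], []
for _ in range(reps):
    xs = [random.gammavariate(3, theta) for _ in range(n)]
    xbar = sum(xs) / n
    L1s.append(sum(x * x for x in xs) / (4 * n))
    L2s.append(xbar**2 / 3)
EL1 = sum(L1s) / reps
EL2 = sum(L2s) / reps
true_var = 3 * theta**2
assert abs(EL1 - true_var) < 0.5                     # unbiased
assert abs(EL2 - (true_var + theta**2 / n)) < 0.5    # bias theta^2/n
```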
82 An infinite population consists of white and black balls, and it is known that p, the proportion of white balls in the population, is rather small. In order to obtain a good estimate of p, an experimenter decides to sample the balls one at a time till he obtains exactly r (fixed in advance) white balls in the sample.
(i) If X is a random variable denoting the size of the sample which includes exactly r white balls, prove that

    P(X = n) = \binom{n-1}{r-1}p^r q^{n-r},  for X \ge r;  q \equiv 1 - p.

(ii) Prove that the probability-generating function of X is

    E(\theta^X) = (p\theta)^r(1 - q\theta)^{-r}.

Hence, or otherwise, deduce that E(X) = r/p.
(iii) Given r and the sample realization n of X, prove that the maximum-likelihood estimate of p is

    \hat{p} = \frac{r}{n},

and that for large samples

    var(\hat{p}) = \frac{p^2 q}{r}.

(iv) Also, verify that

    E\left(\frac{r-1}{X-1}\right) = p,

so that \hat{p} is not an unbiased estimate of p.
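The negative-binomial facts above can be confirmed by direct summation (r = 4, p = 0.3 are illustrative assumptions of the sketch).

```python
from math import comb

r, p = 4, 0.3
q = 1 - p

def pmf(n):
    return comb(n - 1, r - 1) * p**r * q**(n - r)

ns = range(r, 2000)
total = sum(pmf(n) for n in ns)
EX = sum(n * pmf(n) for n in ns)
Einv = sum((r - 1) / (n - 1) * pmf(n) for n in ns)
assert abs(total - 1) < 1e-10
assert abs(EX - r / p) < 1e-8      # E(X) = r/p
assert abs(Einv - p) < 1e-10       # E[(r-1)/(X-1)] = p exactly
```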
83 In the distribution of the number of police prosecutions per motorist in a large city over a given period of time, the frequency of motorists having r prosecutions is n_r for r \ge 1 and \sum_{r=1}^{\infty} n_r \equiv N, but the number of motorists who did not have a prosecution is unknown owing to the motorist population

    n\bar{x} = \sum_{j=1}^{n} x_j  and  ns^2 = \sum_{j=1}^{n} (x_j - \bar{x})^2.

Hence deduce that

    \hat{\theta} = \tfrac{1}{2}\left[\{k^2 + 4(\bar{x}^2 + s^2)\}^{\frac{1}{2}} - k\right]

and that, for large samples,

    var(\hat{\theta}) = \frac{2k\theta^2}{n(2\theta + k)}.

What is the limiting value of var(\hat{\theta}) if, for fixed n, k \to 0, \theta \to \infty, such that k\theta \to \mu, a finite number?
86 Explain, very briefly, the concept of the best estimate of a parameter in the theory of linear estimation.
Suppose that x_1, x_2, ..., x_n are n independent observations such that

    E(x_v) = \mu;  var(x_v) = \frac{\sigma^2}{n - v + 1},  for 1 \le v \le n,

where μ and σ^2 are independent parameters. Determine μ*, the least-squares estimate of μ, and show that

    var(\mu^*) = \frac{2\sigma^2}{n(n+1)}.

Another linear function suggested as an estimate of μ is

    T = \frac{6}{n(n+1)(2n+1)} \sum_{v=1}^{n} (n - v + 1)^2 x_v.

Prove that T is an unbiased estimate of μ but that, as compared with μ*,

    Eff(T) = \frac{2(2n+1)^2}{9n(n+1)} \to \frac{8}{9}  for large n.
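The variance and efficiency claims can be verified exactly in rational arithmetic (variances below are in units of σ²).

```python
from fractions import Fraction as F

# Precision-weighted least squares: weight of x_v is (n-v+1)/sigma^2.
def var_mu_star(n):
    return 1 / sum(F(n - v + 1) for v in range(1, n + 1))

def var_T(n):
    c = F(6, n * (n + 1) * (2*n + 1))
    return c**2 * sum(F(n - v + 1)**4 / (n - v + 1) for v in range(1, n + 1))

for n in (3, 10, 25):
    assert var_mu_star(n) == F(2, n * (n + 1))
    assert var_mu_star(n) / var_T(n) == F(2 * (2*n + 1)**2, 9 * n * (n + 1))

# efficiency tends to 8/9 from below
assert abs(float(var_mu_star(200) / var_T(200)) - 8/9) < 0.01
```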
87 Explain clearly the meaning of linear regression and state the assumptions underlying the standard tests of significance in linear regression analysis.
Suppose that y_1, y_2, ..., y_n are independent observations such that

    E(y_v) = \alpha + \beta(x_v - \bar{x}) + \gamma(x_v - \bar{x})^2,  var(y_v) = \sigma^2,

where the x_v are values of a non-random variable, \bar{x} = \sum_{v=1}^{n} x_v/n, and α, β, γ (\ne 0) and σ^2 are independent parameters. Prove that, in general, the linear functions

    \alpha^* = \sum_{v=1}^{n} y_v/n  and  \beta^* = \sum_{v=1}^{n} (x_v - \bar{x})y_v \Big/ \sum_{v=1}^{n} (x_v - \bar{x})^2

are biased estimates of α and β and find their expectations explicitly. What happens if the x_v are equispaced?
Also, for general x_v, determine a linear function of α* and β* whose expectation is independent of γ and find its variance.
88 Suppose that y_1, y_2, ..., y_n are n independent observations such that

    E(y_v) = \alpha + \beta(x_v - \bar{x}),  var(y_v) = \sigma^2,  for v = 1, 2, ..., n,

where the x_v are values of a non-random explanatory variable with average \bar{x}, and α, β and σ^2 are independent parameters. If α* and β* are the standard least-squares estimates of α and β respectively, then the vth residual is defined as

    r_v = y_v - \alpha^* - \beta^*(x_v - \bar{x}),  for v = 1, 2, ..., n.

Prove that the r_v satisfy the two linear constraints

    \sum_{v=1}^{n} r_v = 0  and  \sum_{v=1}^{n} (x_v - \bar{x})r_v = 0.

Also, prove that for any v

    E(r_v) = 0;  var(r_v) = \sigma^2\left[\frac{n-1}{n} - \frac{(x_v - \bar{x})^2}{X}\right],

where

    X = \sum_{v=1}^{n} (x_v - \bar{x})^2;

and

    \sum_{v=1}^{n} var(r_v) = (n - 2)\sigma^2.
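The two residual constraints are algebraic identities, holding for any data, as a quick sketch confirms (the x-values, coefficients and seed below are arbitrary illustrative choices).

```python
import random

# Fit by least squares and check sum r_v = 0 and sum (x_v - xbar) r_v = 0.
random.seed(11)
xs = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0]
n = len(xs)
xbar = sum(xs) / n
ys = [0.5 + 1.3 * (x - xbar) + random.gauss(0, 1) for x in xs]

Sxx = sum((x - xbar)**2 for x in xs)
b_star = sum((x - xbar) * y for x, y in zip(xs, ys)) / Sxx
a_star = sum(ys) / n
res = [y - a_star - b_star * (x - xbar) for x, y in zip(xs, ys)]

assert abs(sum(res)) < 1e-9
assert abs(sum((x - xbar) * r for x, r in zip(xs, res))) < 1e-9
```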
89 Suppose (x_i, y_i), for i = 1, 2, ..., n, are independent paired observations such that each y_i is normally distributed with

    E(y_i) = \alpha + \beta(x_i - \bar{x})  and  var(y_i) = \sigma^2,

where α, β and σ^2 are the parameters in standard linear regression analysis and \bar{x} is the average of the x_i, the values of the explanatory variable. If α* and β* are the usual least-squares estimates of α and β respectively, find θ*, the best estimate of the parametric function \theta = \alpha - \beta\bar{x}. Prove that

    corr(\theta^*, \beta^*) = -\bar{x}\sqrt{n}\Big/\left(\sum_{i=1}^{n} x_i^2\right)^{\frac{1}{2}}.

Also, determine the value of this correlation coefficient when, in particular, the x_i (i = 1, 2, ..., n) are the first n natural numbers, and then show that, as n \to \infty, the limiting value of the correlation coefficient is -\tfrac{1}{2}\sqrt{3}.

90 Explain briefly the concepts of unbiasedness and efficiency in the theory of estimation.
It is known that, in biological investigations carried out over prolonged periods of time, the experimenters often obtain quite accurate information concerning the coefficient of variation (standard deviation/mean) of the quantitative characteristic under study. This information can be used to plan further experimentation. Since large random variability is a typical feature of biological material, a knowledge of the coefficient of variation can be
SUPPLEMENT
345
used to obtain an asymptotically unbiased estimate of a parameter which has a variance smaller than that of the best linear unbiased estimate.
In particular, suppose a characteristic Y (such as the length of the ears of maize of a particular type) is under investigation, where

E(Y) = μ and var(Y) = σ²,

μ and σ² being independent parameters. It may be assumed that v ≡ σ/μ is known accurately from previous similar experimentation. If y₁, y₂, ..., y_n are independent observations of Y, an estimate of μ is defined by the linear function

T = C Σ_{i=1}^n y_i,

where C is so chosen that E(T−μ)² is a minimum. Prove that C = 1/(n + v²). Hence show that T has a smaller variance than the sample average ȳ, but that

E(T) = μ(1 + v²/n)^{−1},

so that T is an asymptotically unbiased estimate of μ. Also, prove that the efficiency of T as compared with ȳ for the estimation of μ is (1 + v²/n)².
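As an informal numerical check of this exercise (not part of the original text), the mean squared error E(T−μ)² of T = C·Σy_i can be written in closed form and minimised over C by a crude grid search; the minimiser should agree with C = 1/(n + v²), and the efficiency with (1 + v²/n)². The values n = 10, μ = 5, v = 0.3 below are arbitrary illustrations:

```python
# Check that C = 1/(n + v^2) minimises E(T - mu)^2 for T = C * sum(y_i),
# where E(y) = mu and var(y) = (v*mu)^2, the y_i being independent.
n, mu, v = 10, 5.0, 0.3
sigma2 = (v * mu) ** 2

def mse(C):
    # E(T - mu)^2 = C^2 * var(sum y_i) + (bias)^2 = C^2*n*sigma2 + (C*n*mu - mu)^2
    return C**2 * n * sigma2 + (C * n * mu - mu) ** 2

# Crude grid search for the minimising C over (0, 0.5).
grid = [i / 1e5 for i in range(1, 50000)]
c_star = min(grid, key=mse)
c_theory = 1 / (n + v**2)

# Efficiency of T relative to the sample mean: var(ybar)/var(T) = (1 + v^2/n)^2.
var_T = c_theory**2 * n * sigma2
var_ybar = sigma2 / n
eff = var_ybar / var_T
```
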
91 Suppose that x₁, x₂, ..., x_n are n independent observations with mean μ and variance σ². Another independent sample of n correlated observations y₁, y₂, ..., y_n is given such that for all i = 1, 2, ..., n

E(y_i) = μ; var(y_i) = σ²; and corr(y_i, y_j) = ρ, (i ≠ j).

Prove that x̄ and ȳ, the means of the x and y observations respectively, are unbiased estimates of μ, and determine their variances. If T ≡ αx̄ + βȳ is a linear function of the sample means, find α and β such that E(T) = μ and var(T) is a minimum. Verify that this minimum var(T) is

(σ²/n)·[1 + (n−1)ρ]/[2 + (n−1)ρ].

Also, show that

var[(x̄+ȳ)/2] = [1 + (n−1)²ρ²/(4{1 + (n−1)ρ})]·min var(T).

Hence, or otherwise, deduce that for n > 1 and ρ ≠ 0

var[(x̄+ȳ)/2] > min var(T),

and comment on this result.

92 If x₁, x₂, ..., x_n are independent observations of a random variable X having a normal distribution with mean μ and variance σ², find the least-squares estimates of μ and σ². Hence indicate, without proof, how an exact test of significance based on Student's distribution may be used to test any null hypothesis about μ.
346
EXERCISES IN PROBABILITY AND STATISTICS
Given the n sample observations, an experimenter wishes to obtain an interval estimate for an additional independent but unknown observation of X. Show how the argument leading to Student's distribution may be modified to derive the required 100(1−η) per cent (0 < η < 1) confidence interval for the (n+1)th observation as

x̄ − t(η; n−1)·s·((n+1)/n)^{1/2} ≤ x ≤ x̄ + t(η; n−1)·s·((n+1)/n)^{1/2},

where x is the unknown observation, x̄ and s² are the sample mean and variance of the x_i, and t(η; n−1) is the 100η per cent point of Student's distribution with n−1 d.f.
93 Explain clearly, without proof, how a confidence interval may be obtained for the regression coefficient in linear regression analysis, and state the underlying assumptions of the analysis.
In an investigation the distinct values of the explanatory variable x are x₁, x₂, ..., x_n and the corresponding observations of the dependent variable y are y₁, y₂, ..., y_n, it being assumed that the standard linear regression holds between x and y, so that

E(y_i) = α + β(x_i − x̄), var(y_i) = σ², for i = 1, 2, ..., n,

and x̄ is the average of the x_i. The experimenter wishes to make k (preassigned and ≥ 1) further independent observations of the dependent variable, each observation corresponding to the same given value x₀ ≠ x_i (i = 1, 2, ..., n) of x. If ȳ_k is the average of these k observations (yet to be made), prove that the 95 per cent confidence interval for ȳ_k obtained on the basis of the known (x_i, y_i), i = 1, 2, ..., n, is

α̂ + β̂(x₀−x̄) − t₀.₀₅·s·[1/k + 1/n + (x₀−x̄)²/X]^{1/2} ≤ ȳ_k ≤ α̂ + β̂(x₀−x̄) + t₀.₀₅·s·[1/k + 1/n + (x₀−x̄)²/X]^{1/2},

where α̂, β̂ and s² are the usual least-squares estimates of α, β and σ² respectively, X ≡ Σ_{i=1}^n (x_i−x̄)², and t₀.₀₅ is the 5 per cent point of Student's distribution with n−2 d.f. Suggest reasons, if any, which might make the above confidence interval inappropriate in a practical situation.
94 (i) Explain clearly the difference between a point estimate and an interval estimate of a population parameter in the theory of statistical inference.
(ii) Suppose that x₁, x₂, ..., x_n are n independent observations from a normal population with mean μ and variance σ². If x̄ and s² are the sample mean and sample variance respectively, indicate, without proof, how these sample quantities may be used to obtain a confidence interval for μ.
An experimenter plans to make two further independent observations x_{n+1}, x_{n+2} from the above normal population. If λ is a known constant (0 < λ < 1), prove that the 99 per cent confidence interval for the linear function

L ≡ λx_{n+1} + (1−λ)x_{n+2}

is

x̄ − t₀.₀₁·s·[1/n + 1 − 2λ(1−λ)]^{1/2} ≤ L ≤ x̄ + t₀.₀₁·s·[1/n + 1 − 2λ(1−λ)]^{1/2},

where t₀.₀₁ is the one per cent point of Student's distribution with n−1 d.f.
Also, show that, irrespective of the value of λ in its permissible range, the above confidence interval for L must be greater than the interval

x̄ ± t₀.₀₁·s·((n+2)/2n)^{1/2}.
95 If x₁, x₂, ..., x_n are independent observations of a normally distributed variable X with mean μ and variance σ², find the least-squares estimates of μ and σ². Hence indicate, without proof, how an exact test of significance based on Student's distribution may be used to test any null hypothesis about μ.
Given the n sample observations, an experimenter wishes to obtain another n independent observations of X. If x̄₀ is the mean of these n observations (yet to be made), prove by a suitable extension of the argument leading to Student's distribution that the 100(1−η) per cent (0 < η < 1) confidence interval for the difference x̄ − x̄₀ is

−t(η; n−1)·s·(2/n)^{1/2} ≤ x̄ − x̄₀ ≤ t(η; n−1)·s·(2/n)^{1/2},

where x̄ and s² are the sample mean and variance of the already observed n values of X, and t(η; n−1) is the 100η per cent point of Student's distribution with n−1 d.f.
96 In n Bernoulli trials with a constant probability of success p, w successes were observed. Show that for the probability distribution of the random variable w the third central moment is

μ₃(w) = npq(q−p), where p+q = 1.

Prove that an unbiased estimate of μ₃(w) is

T = n³p*(1−p*)(1−2p*) / [(n−1)(n−2)],

p* being the observed relative frequency of successes. Also, show that for large n

var(T) ≈ npq(1−6pq)².
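The unbiasedness of T can be verified exactly by summing over the binomial distribution of w; the following short check (an addition to the text, with n = 12 and p = 0.3 as arbitrary illustrative values) compares E(T) with μ₃(w):

```python
from math import comb

# Exact check that T = n^3 p*(1-p*)(1-2p*) / ((n-1)(n-2)) is unbiased for
# mu_3(w) = n p q (q - p), the third central moment of a binomial count w.
n, p = 12, 0.3
q = 1 - p

ET = 0.0
for w in range(n + 1):
    prob = comb(n, w) * p**w * q**(n - w)
    pstar = w / n
    T = n**3 * pstar * (1 - pstar) * (1 - 2 * pstar) / ((n - 1) * (n - 2))
    ET += prob * T

mu3 = n * p * q * (q - p)
```
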
97 In a plant-breeding experiment, the observed frequencies of progeny in the four mutually exclusive classes A₁, A₂, A₃, A₄ were n₁, n₂, n₃, n₄ respectively (Σ n_j ≡ N). On a genetical hypothesis the corresponding probabilities for the four classes are (3/16)(2+θ), (3/16)(2−θ), (3/16)(1−θ) and (1/16)(1+3θ) respectively, where θ is an unknown parameter. Derive the maximum-likelihood equation for θ, and show that the large-sample variance of the estimate θ̂ is

var(θ̂) = 4(4−θ²)(1−θ)(1+3θ) / [3N(5+2θ−4θ²)].

Further, suppose there was some error in the classification of the A₂ and A₃ progeny, though the A₁ and A₄ plants were classified correctly. Pool the observed frequencies in the A₂ and A₃ classes and then derive the equation for θ*, the maximum-likelihood estimate of θ. Verify that for large samples

var(θ*)/var(θ̂) = 4(3−2θ)(5+2θ−4θ²) / [(1−θ)(2−θ)(29+32θ)],

and hence show that var(θ*) > var(θ̂).

98 In a plant-breeding experiment the observed frequencies of the four
distinct types of progeny obtained were a₁, a₂, a₃ and a₄ respectively, where Σ a_j ≡ N. On a genetical hypothesis the expected proportions in the four classes are ¼(2−θ), ¼(1+θ), ¼θ and ¼(1−θ), where θ is an unknown parameter such that 0 < θ < 1. Find the equation for θ̂, the usual maximum-likelihood estimate of θ.
Alternatively, a simply calculable linear unbiased estimate θ* may be derived by equating the linear function

X ≡ a₁ − a₂ − 3a₃ + 3a₄

to its expectation. Show that explicitly θ* = (a₂ + 2a₃ − a₄)/N, and hence derive the exact variance of θ*. Also, prove that, irrespective of the true value of θ,

1/2N ≤ var(θ*) ≤ 3/4N.
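The unbiasedness of θ* and the two-sided bound on its variance can be confirmed numerically from standard multinomial moment formulas; this sketch (an illustrative addition, N = 100 chosen arbitrarily) scans θ over (0, 1):

```python
# Check that theta* = (a2 + 2*a3 - a4)/N is unbiased and that
# N*var(theta*) = 1/2 + theta - theta^2 lies in [1/2, 3/4] for 0 < theta < 1.
N = 100
coeffs = (0, 1, 2, -1)   # weights of (a1, a2, a3, a4) in N*theta*

def moments(theta):
    probs = ((2 - theta) / 4, (1 + theta) / 4, theta / 4, (1 - theta) / 4)
    mean = sum(c * p for c, p in zip(coeffs, probs))          # E(theta*)
    # multinomial: var(sum c_i a_i / N) = [sum c_i^2 p_i - (sum c_i p_i)^2]/N
    var = (sum(c * c * p for c, p in zip(coeffs, probs)) - mean**2) / N
    return mean, var

ok = True
for k in range(1, 100):
    theta = k / 100
    mean, var = moments(theta)
    ok = ok and abs(mean - theta) < 1e-12
    ok = ok and 0.5 / N - 1e-15 <= var <= 0.75 / N + 1e-15
```
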
99 A multinomial distribution has k (≥2) distinct classes, and in a random sample of N observations from the distribution the observed frequencies in the k classes were a₁, a₂, ..., a_k respectively. The corresponding expected frequencies are m₁, m₂, ..., m_k (Σ_{i=1}^k a_i = Σ_{i=1}^k m_i = N). Assuming that the m_i are all functions of an unknown parameter θ, prove that the equation for θ̂, the maximum-likelihood estimate of θ, is

Σ_{i=1}^k [(a_i/m_i)·(dm_i/dθ)]_{θ=θ̂} = 0.

Also, show that for the linear function

X ≡ Σ_{i=1}^k (a_i/m_i)·(dm_i/dθ),

E(X) = 0 and var(X) = Σ_{i=1}^k (1/m_i)(dm_i/dθ)².

In a particular breeding experiment with a variety of Papaver rhoeas there were four distinct classes with the expected frequencies

(N/4)(3−2θ+θ²), (N/4)θ(2−θ), (N/4)θ(2−θ), (N/4)(1−θ)².
Verify that in this case

var(X) = 2N[1 + 2(1−θ)²] / {θ(2−θ)[2 + (1−θ)²]}.
100 The serially correlated observations x₁, x₂, ..., x_n are such that

E(x_i) = 0, var(x_i) = σ², for i = 1, 2, ..., n;
corr(x_i, x_{i+1}) = ρ; corr(x_i, x_{i+k}) = 0, for k ≥ 2.

If x̄ and s² denote the mean and variance of the sample observations, then prove that

(i) var(x̄) = (σ²/n)[1 + 2ρ(n−1)/n];
(ii) E(s²) = σ²(1 − 2ρ/n); and
(iii) −½ ≤ ρ ≤ 1.

Also, if another linear function of the observations is defined as

x_w = [2/(n(n+1))] Σ_{ν=1}^n ν x_ν,

then verify that

var(x_w) = [2σ²/(3n(n+1))]·[2n(1+2ρ) + (1−4ρ)].

Hence deduce that, as n → ∞, the limiting efficiency of x_w as compared with x̄ is ¾.
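Both variance formulas can be confirmed by brute-force double summation over the covariance matrix implied by the correlation structure; a short illustrative check (parameter values arbitrary, not from the exercise):

```python
# Brute-force check of var(xbar) and var(x_w) from the covariance matrix of
# the "correlated at lag 1 only" structure of Exercise 100.
n, sigma2, rho = 8, 1.7, 0.25

def cov(i, j):
    d = abs(i - j)
    return sigma2 if d == 0 else (rho * sigma2 if d == 1 else 0.0)

# var(xbar) with equal weights 1/n
var_xbar = sum(cov(i, j) for i in range(n) for j in range(n)) / n**2
# weighted mean x_w with weights 2*nu/(n*(n+1)), nu = 1..n
w = [2 * (i + 1) / (n * (n + 1)) for i in range(n)]
var_xw = sum(w[i] * w[j] * cov(i, j) for i in range(n) for j in range(n))

f_xbar = sigma2 / n * (1 + 2 * rho * (n - 1) / n)
f_xw = 2 * sigma2 / (3 * n * (n + 1)) * (2 * n * (1 + 2 * rho) + 1 - 4 * rho)
```
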
101 If X and Y are two correlated random variables having finite means and variances, define cov(X, Y).
(i) Assuming that X and Y are positively correlated and var(X) ≥ var(Y), prove that cov(X, Y) ≤ var(X).
(ii) Prove that as a first approximation

cov(Y/X, X) = cov(X, Y)/E(X).

Hence deduce that to the same order of approximation

E(Y/X) = [E(Y)/E(X)]·[1 − cov(X, Y)/{E(X)E(Y)}].

102 Define the product-moment correlation coefficient between two continuous random variables having finite, non-zero variances, and prove that the coefficient must lie between −1 and +1.
Suppose x₁, x₂, ..., x_n are random variables such that

E(x_i) = μ, var(x_i) = σ², and corr(x_i, x_j) = ρ, for i ≠ j (i, j = 1, 2, ..., n).

If x̄ is the mean of the x_i and the y_i (y_i = x_i − x̄) are the deviations of the x_i from x̄, prove that

(i) var(x̄) = (σ²/n)[1 + (n−1)ρ];
(ii) var(y_i) = (n−1)(1−ρ)σ²/n, for i = 1, 2, ..., n; and
(iii) corr(y_i, y_j) = −1/(n−1), for i ≠ j = 1, 2, ..., n.
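Parts (ii) and (iii) follow from cov(y_i, y_j) = cov(x_i, x_j) − 2cov(x_i, x̄) + var(x̄); a numerical sketch of that computation (illustrative values only) is:

```python
# Exercise 102: for equicorrelated x_i, verify var(y_i) = (n-1)(1-rho)*sigma^2/n
# and corr(y_i, y_j) = -1/(n-1), where y_i = x_i - xbar.
n, sigma2, rho = 6, 1.0, 0.3

def cov_x(i, j):
    return sigma2 if i == j else rho * sigma2

cov_ibar = sum(cov_x(0, k) for k in range(n)) / n              # cov(x_i, xbar)
var_bar = sum(cov_x(i, j) for i in range(n) for j in range(n)) / n**2
# cov(y_i, y_j) = cov(x_i, x_j) - cov(x_i, xbar) - cov(x_j, xbar) + var(xbar)
var_y = cov_x(0, 0) - 2 * cov_ibar + var_bar
cov_y = cov_x(0, 1) - 2 * cov_ibar + var_bar
corr_y = cov_y / var_y

f_var_y = (n - 1) * (1 - rho) * sigma2 / n
f_corr = -1 / (n - 1)
```
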
103 If X₁, X₂, ..., X_N are correlated random variables with finite means and variances, prove that

var(Σ_{i=1}^N X_i) = Σ_{i=1}^N var(X_i) + Σ_{i≠j} cov(X_i, X_j).

Suppose that an unbiased coin is tossed n times and the number of times the sequence "a head followed by a tail" (HT) occurs is observed. If z_i is a random variable which takes the value unity if the sequence HT occurs at the (i−1)th and ith trials considered together, and zero otherwise, express S_n, the random variable denoting the total number of times HT occurs in the n trials, in terms of the z_i. Hence show that

(i) E(S_n) = ¼(n−1); and (ii) var(S_n) = (n+1)/16.
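Both moments can be confirmed by exhaustive enumeration of all 2ⁿ equally likely coin sequences for a small n; the following check is an illustrative addition, not part of the original exercise:

```python
from itertools import product

# Enumerate all 2^n coin sequences and count occurrences of the pattern HT,
# to confirm E(S_n) = (n-1)/4 and var(S_n) = (n+1)/16.
n = 10
mean = second = 0.0
for seq in product("HT", repeat=n):
    s = sum(1 for i in range(n - 1) if seq[i] == "H" and seq[i + 1] == "T")
    mean += s
    second += s * s
count = 2 ** n
mean /= count
var = second / count - mean**2
```
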
104 (i) If X and Y are two random variables with finite variances, prove that the random variables X+Y and X−Y are correlated unless var(X) = var(Y).
(ii) Given that x₁, x₂, ..., x_n are serially correlated observations such that

corr(x_i, x_j) = ρ^{|i−j|} and var(x_i) = σ², for i = 1, 2, ..., n,

where ρ and σ² are parameters (−1 < ρ < 1 and σ > 0), prove that the variance of x̄, the average of the x_i, is

(σ²/n)·[(1+ρ)/(1−ρ) − 2ρ(1−ρⁿ)/{n(1−ρ)²}].
105 (i) If X₁, X₂ and X₃ are uncorrelated random variables, prove that

cov(X₁+X₂, X₂+X₃) = var(X₂).

(ii) The serially correlated random variables x₁, x₂, ..., x_n, ... are such that for any s ≥ 1

E(x_s) = 0; var(x_s) = σ²; corr(x_s, x_{s+k}) = ρ^k, |ρ| < 1.

If, for r < n, the sums S₁, S₂ and S₃ are defined as

S₁ = Σ_{ν=1}^r x_ν; S₂ = Σ_{ν=r+1}^n x_ν; S₃ = Σ_{ν=n+1}^{n+r} x_ν,

prove that

cov(S₁, S₃) = σ²ρ^{n−r+1}(1−ρ^r)² / (1−ρ)².

Hence deduce that if cov(S₁, S₂) = cov(S₁, S₃), then

ρ^r = 2ρⁿ / (1 + ρⁿ).
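The covariance of the two block sums can be checked against the closed form by direct double summation over ρ^{|i−j|}; a minimal numerical sketch (arbitrary illustrative parameters):

```python
# Exercise 105: check cov(S1, S3) = sigma^2 * rho^(n-r+1) * (1 - rho^r)^2 / (1 - rho)^2
# by summing sigma^2 * rho^|i-j| over the index blocks 1..r and n+1..n+r.
sigma2, rho, n, r = 1.3, 0.6, 7, 3

def cov(i, j):
    return sigma2 * rho ** abs(i - j)

S1 = range(1, r + 1)
S3 = range(n + 1, n + r + 1)
c13 = sum(cov(i, j) for i in S1 for j in S3)
f13 = sigma2 * rho ** (n - r + 1) * (1 - rho**r) ** 2 / (1 - rho) ** 2
```
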
106 Prove that the product-moment correlation coefficient between any two random variables with finite non-zero variances must lie between −1 and +1.
The random variables x₁, x₂, x₃, ... are uncorrelated and each has zero mean and constant variance σ². Given any integer n > 1, another sequence of random variables y₁, y₂, y₃, ... is defined by the equations

y_i = (1/n) Σ_{j=0}^{n−1} x_{i+j}, for i = 1, 2, 3, ....

If r is another non-negative integer, prove that for any fixed i

corr(y_i, y_{i+r}) = (n−r)/n, if r < n; and corr(y_i, y_{i+r}) = 0, if r ≥ n.

107 A discrete random variable X has a Poisson distribution with mean μ. If x₁, x₂, ..., x_n are random observations of X, then prove that the sample mean x̄ has the moment-generating function

exp[nμ(e^{t/n} − 1)].

Hence deduce that the cumulant-generating function of the standardized random variable z = (x̄−μ)/(μ/n)^{1/2} has the series expansion

½t² + t³/{6(nμ)^{1/2}} + ···,
If XI and X 2 are two independent Poisson variables with means AI and
"-2 respectively, show that the joint probability-generating function of Xl and ¥=X I +X2 is E(8i". 8n = exp[A 1 (8 1 82 -1)+ A2(82 -1)].
352
EXERCISES IN PROBABILITY AND STATISTICS
Hence prove that the marginal distribution of Y is POisson W'th Al + A2 but that the conditional distribution of Xl given Y == j is 1
e)(l: P)'(l !py-r,
P(XI = r I Y = j) =
Ilh:"u
for 0:S;;X 1 :S;;j,
where P==Al/A2' Discuss briefly how this conditional distribution may be used to t ' equality of the means of Xl and X 2 , CSllhl'
109 A discrete random variable X denotes the number of successe' ' sequence of n Bernoulli trials with probability P for a success in an';~ 1, a Prove that the probability-generating function of X is n.11
t
E(OX) = (q + pO)",
where P + q = 1.
If Xl and X2 are i~depend~nt, random variables respectively denoting tl1l' number of successes m two dIstmct sets of nl and n2 Bernoulli trials w'lI success param.eters PI and P2, prove that the joint probability-general)') 1 , , ( Ill! functIon of Xl and Y = Xl + X 2 IS O( 01> ( 2) == E( O~· , On = (ql + PI 01 (2)n'(q2 + P2 ( 2)"2,
(PI + ql = 1; P2 +q2 == I\.
Hence show by suitable differentiation of 0(01) ( 2 ) that (i) P(Y=r)=
i
v=O
(nl)( ]1
r
~2
v
)Plq'i·-vp2-vq22-r+v,
for 0:S;;Y:S;;n 1 +11 2;
and
(nl)( n2 ) , (ii) P(Xl=sl Y=r)= r s
r-s p
L (n1)( r-]1 n2 )p v
v=O
]1
where p == Plq2/P2ql> and it is assumed that in both (i) and (ii) the binomial coefficients are zero for inadmissible values of r> nl and r> n2' Discuss very briefly how the conditional probability distribution (ii) may be used to provide a test of significance for the null hypothesis H(PI = P2)'
110 A discrete random variable X has the probability distribution defined by the system of equations

P(X = r) = p_r, for integral values of X ≥ 0.

Define the probability-generating function of X and show how this function may be used to evaluate the factorial moments of X. Conversely, if μ_{(j)} is the jth factorial moment of X about the origin, prove that

p_r = (1/r!) Σ_{s=0}^∞ [(−1)^s/s!]·μ_{(r+s)}.

Use this result to verify that if, in particular,

μ_{(j)} = (n+j−1)^{(j)} p^j,

where n, a positive integer, and p (0 < p < 1) are parameters, then

p_r = C(n+r−1, r)·p^r·(1+p)^{−(n+r)}.

... where α₁, α₂, β₁ and β₂ are positive parameters. Prove that the probability that X ≥ Y is α₁/(α₁+β₁). Hence derive the joint distribution of X and Y subject to the condition X ≥ Y, and then show that in this case (i) the marginal distribution of X has the probability density function (α₁+β₁)e^{−(α₁+β₁)x} ... φ(1) = ... Explain very briefly the significance of this value of φ(1).
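The factorial-moment inversion in Exercise 110 can be checked numerically: summing the alternating series for p_r with the stated μ_{(j)} should reproduce the negative binomial probabilities. A minimal sketch (n = 3, p = 0.4 arbitrary; the series converges since its term ratio tends to −p):

```python
from math import comb

# Recover p_r from mu_(j) = (n+j-1)^(j) p^j via
# p_r = (1/r!) * sum_s (-1)^s mu_(r+s) / s!, and compare with
# C(n+r-1, r) p^r (1+p)^(-(n+r)).
n, p = 3, 0.4

def p_r_series(r, terms=400):
    # first term (s = 0): mu_(r)/r! = C(n+r-1, r) * p^r
    t = comb(n + r - 1, r) * p**r
    total = t
    for s in range(1, terms):
        # successive-term ratio: t_s / t_{s-1} = -(n+r+s-1) * p / s
        t *= -(n + r + s - 1) * p / s
        total += t
    return total

ok = all(
    abs(p_r_series(r) - comb(n + r - 1, r) * p**r * (1 + p) ** (-(n + r))) < 1e-10
    for r in range(6)
)
```
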
138 Two gamblers A and B agree to play a game of chance with initial capitals of £a and £b respectively, the stake at each trial being £1 on the occurrence or non-occurrence of an event E. If E occurs at a trial then B pays A £1, whereas if E fails to occur at the trial, then A pays B £1. The trials of the game are independent, and the constant probabilities of the occurrence and non-occurrence of E at a trial are p and q respectively, where p+q = 1. A player wins when the capital of his opponent is exhausted. If u_n (0 < n < a+b) is the expected number of further trials required for A to win when his capital is £n, prove that u_n satisfies the difference equation

u_n = 1 + p·u_{n+1} + q·u_{n−1}.

Hence show that the expected duration of play for A to win is

a/(q−p) − (a+b)(q^a − p^a)p^b / [(q−p)(q^{a+b} − p^{a+b})], if p ≠ q,

and ab, if p = q.

139 A finite population consists of the N elements X₁, X₂, ..., X_N, and the mean and variance of the population are defined as
X̄ = (1/N) Σ_{j=1}^N X_j and S² = [1/(N−1)] Σ_{j=1}^N (X_j − X̄)²

respectively. Suppose a random sample of n (<N) observations is taken without replacement from the population. The sample mean x̄ is defined as the average of the elements of the population included in the sample, but an alternative formal definition of x̄ is

x̄ = (1/n) Σ_{j=1}^N Z_j X_j,

where Z₁, Z₂, ..., Z_N are indicator random variables associated with the individual elements of the population sampled, such that Z_j is either 1 or 0 according as X_j is or is not included in the selected sample. Use this
alternative definition of x̄ to prove that

(i) E(x̄) = X̄; and (ii) var(x̄) = (N−n)S² / (nN).
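For a small population these two results can be verified by enumerating every possible sample; the following check (an illustrative addition with an arbitrary six-element population) does exactly that:

```python
from itertools import combinations
from statistics import mean

# Exercise 139: enumerate all samples of size n drawn without replacement
# to confirm E(xbar) = Xbar and var(xbar) = (N - n) S^2 / (n N).
pop = [3.0, 7.0, 1.0, 9.0, 4.0, 6.0]
N, n = len(pop), 3
Xbar = mean(pop)
S2 = sum((x - Xbar) ** 2 for x in pop) / (N - 1)

means = [mean(s) for s in combinations(pop, n)]
E_xbar = mean(means)
var_xbar = sum((m - E_xbar) ** 2 for m in means) / len(means)

f_var = (N - n) * S2 / (n * N)
```
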
140 A continuous random variable X has the distribution function F(x) defined in the range 0 ≤ X ≤ a. If x_r < x_s (r < s) are the rth and sth smallest observations in a random sample of N observations of X, determine the joint distribution of x_r and x_s. In particular, if X has a uniform distribution in the interval (0, a), show that the probability density function of the joint distribution of u = x_r/a and v = x_s/a is

[N! / {(r−1)!(s−r−1)!(N−s)!}]·u^{r−1}(v−u)^{s−r−1}(1−v)^{N−s}, 0 ≤ u ≤ 1; v ≥ u.

Hence deduce the marginal distribution of u and the conditional distribution of v given u. Also, given u and some constant v₀ such that u < v₀ < 1, prove that

P(v > v₀) = 1 − B_λ(N−s+1, s−r)

in standard notation of the incomplete B-function, where λ ≡ (v₀−u)/(1−u).
141 Given that X and Y are independent negative exponentially distributed random variables in the range 0 ≤ X, Y < ∞ and with means 1/n₁ and 1/n₂ respectively, prove that

P(X ≥ Y) = n₂/(n₁+n₂).

Determine the joint probability distribution of X and Y given that X ≥ Y. Use this probability distribution to derive the joint probability distribution of U = X−Y and Y, given that X ≥ Y. Hence deduce the probability distribution of U given that X ≥ Y. Also, show that the unconditional distribution of |U| has the probability density function

[n₁n₂/(n₁+n₂)]·(e^{−n₁u} + e^{−n₂u}), u > 0.
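A quick simulation makes the first result of Exercise 141 concrete; this seeded Monte Carlo sketch (an illustrative addition, with n₁ = 1.5 and n₂ = 2.5 chosen arbitrarily) estimates P(X ≥ Y):

```python
import random

# Monte Carlo check: with X, Y independent exponentials of means 1/n1 and
# 1/n2, P(X >= Y) should be close to n2/(n1 + n2).
random.seed(12345)
n1, n2 = 1.5, 2.5
trials = 200_000
hits = 0
for _ in range(trials):
    x = random.expovariate(n1)   # mean 1/n1
    y = random.expovariate(n2)   # mean 1/n2
    if x >= y:
        hits += 1
p_hat = hits / trials
p_theory = n2 / (n1 + n2)
```
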
142 In a quantitative study of the spatial distribution of a plant population in an area, a square region was divided into n quadrats of equal size, and the number of plants observed in the region was s. For given s, the distribution of plants inside the square is such that a plant is equally likely to be found in any one of the quadrats, independently of the other plants. Determine the probability that a specified quadrat contains no plants. Furthermore, if X is a random variable denoting the number of empty quadrats in the square region, prove that

E(X) = n(1 − 1/n)^s

and

var(X) = n(1 − 1/n)^s + n(n−1)(1 − 2/n)^s − n²(1 − 1/n)^{2s}.

Also, obtain approximate expressions for E(X) and var(X) when s and n → ∞ in such a way that s/n = λ, a constant.
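The occupancy moments above can be verified exactly for small n and s by enumerating all nˢ equally likely placements; the following check is an illustrative addition to the exercise:

```python
from itertools import product

# Enumerate all n^s placements of s plants into n quadrats and compare the
# exact mean and variance of the number of empty quadrats with the formulas.
n, s = 4, 5
mean = second = 0.0
for placing in product(range(n), repeat=s):
    empty = n - len(set(placing))
    mean += empty
    second += empty * empty
total = n ** s
mean /= total
var = second / total - mean**2

f_mean = n * (1 - 1 / n) ** s
f_var = (n * (1 - 1 / n) ** s
         + n * (n - 1) * (1 - 2 / n) ** s
         - n**2 * (1 - 1 / n) ** (2 * s))
```
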
143 A continuous random variable X has a uniform distribution in the interval −a ≤ X ≤ a. If x_{n+1} is the median of a sample of 2n+1 independent observations of X, determine the probability distribution of x_{n+1}. Use this probability distribution to prove that, for any non-negative integer r,

E(|x_{n+1}|^r) = a^r Γ(2n+2) Γ{½(r+1)} / [2^{2n+1} Γ(n+1) Γ{n+½(r+3)}].

Hence, in particular, verify that

E(|x_{n+1}|) = C(2n+1, n)·a/2^{2n+1}.

Also, show that for large n a first approximation gives E(|x_{n+1}|) = a/√(πn).
[Note: It may be assumed that for large m > 0, Γ(m+1) = √(2πm)(m/e)^m.]
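Since (x_{n+1}+a)/2a has a Beta(n+1, n+1) distribution, the special case r = 1 can be checked by numerical integration; the sketch below (an illustrative addition, n = 4 and a = 2 arbitrary) uses composite Simpson's rule with the kink of |·| placed on a panel boundary:

```python
from math import comb

# E|median| for 2n+1 uniforms on (-a, a): integrate |a(2u-1)| against the
# Beta(n+1, n+1) density and compare with C(2n+1, n) * a / 2^(2n+1).
n, a = 4, 2.0
inv_B = comb(2 * n + 1, n) * (n + 1)       # 1/B(n+1, n+1) = (2n+1)!/(n!)^2

def integrand(u):
    return abs(a * (2 * u - 1)) * inv_B * (u * (1 - u)) ** n

m = 2000                                   # even; u = 0.5 falls on a grid node
h = 1.0 / m
total = integrand(0.0) + integrand(1.0)
for k in range(1, m):
    total += (4 if k % 2 else 2) * integrand(k * h)
E_abs = total * h / 3

f_exact = comb(2 * n + 1, n) * a / 2 ** (2 * n + 1)
```
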
144 Suppose that x₁, x₂, ..., x_n are serially correlated observations, each with zero mean and variance σ², and corr(x_i, x_{i+k}) = ρ for all i and k = 1, 2 such that 1 ≤ i < i+k ≤ n, and zero otherwise. If x̄ is the average of the x_i, prove that

(i) var(x̄) = (σ²/n)[1 + 2(2n−3)ρ/n]; and
(ii) E[Σ_{i=1}^n (x_i − x̄)²] = σ²[n − 1 − 2(2n−3)ρ/n].

Hence show that, as n → ∞, ρ ≥ −¼.
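Both expectations can be confirmed by direct double summation over the covariance matrix with correlation ρ at lags 1 and 2 only; a minimal illustrative sketch (arbitrary parameter values):

```python
# Exercise 144: check var(xbar) and E[sum (x_i - xbar)^2] from the covariance
# matrix with corr = rho at lags 1 and 2, zero beyond.
n, sigma2, rho = 9, 2.0, 0.1

def cov(i, j):
    d = abs(i - j)
    return sigma2 if d == 0 else (rho * sigma2 if d in (1, 2) else 0.0)

var_xbar = sum(cov(i, j) for i in range(n) for j in range(n)) / n**2
# since all means are zero: E[sum (x_i - xbar)^2] = n*sigma^2 - n*var(xbar)
e_ss = n * sigma2 - n * var_xbar

f_var = sigma2 / n * (1 + 2 * (2 * n - 3) * rho / n)
f_ss = sigma2 * (n - 1 - 2 * (2 * n - 3) * rho / n)
```
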
145 Suppose x₁ < x₂ < ... < x_n are the ordered observations of a continuous random variable X having the negative exponential distribution with probability density function

α⁻¹ e^{−(x−β)/α}, X ≥ β,

where α and β are positive parameters. By using the transformation

u₁ = n(x₁ − β); u_r = (n−r+1)(x_r − x_{r−1}), 2 ≤ r ≤ n,

or otherwise, prove that the u's are independent and identically distributed random variables such that, for any j, 2u_j/α has the χ² distribution with 2 d.f. Hence determine the probability distribution of the ratio

n(n−1)(x₁ − β) / Σ_{i=1}^n (x_i − x₁).

Comment very briefly on how this result may be used to test a specific null hypothesis about the parameter β.
146 State the additive property of the χ² distribution.
Suppose X and Y are two independent random variables such that 2θX and 2Y/θ are both distributed as χ²'s, each with 2 d.f., where θ is a positive parameter. If X₁, X₂, ..., X_n are random observations of X, and Y₁, Y₂, ..., Y_n those of Y, determine the joint probability distribution of the sums

u = Σ_{i=1}^n X_i and v = Σ_{i=1}^n Y_i.

Hence, by considering the transformation

u = w/z and v = wz,

or otherwise, determine the joint distribution of w and z. Finally, use this joint distribution to deduce that the marginal distribution of z has the probability density function

[2Γ(2n)/{Γ(n)}²]·(θz⁻¹ + θ⁻¹z)^{−2n}·z⁻¹, for 0 ≤ z < ∞,

and that ... where α (>−1) and β (>0) are parameters such that β > α+2. Prove that (i) the marginal distribution of Y has the probability density function (β−α−2)g(y)/[(α+1)y^{β−α}] ...
(ii) Here u_r = f(r) − f(r+1), where f(r) = (2r+1)/[2r(r+1)]. Hence

Σ_{r=1}^n u_r = f(1) − f(n+1) → ¾ as n → ∞.

(iii) Put x = a−b. Then

g(a, b) = (x/2a)[1 + (1 − x/a)⁻¹] + log(1 − x/a)
= (x³/a³)[(½−⅓) + (½−¼)(x/a) + (½−⅕)(x/a)² + ···].

Hence g(a, b) > x³/6a³ > 0, whence the result.
12 UL (1975). (i) The expression for f(r−1) − f(r) follows directly. Then the sum of the series to n terms is f(0) − f(n) → 1/x as n → ∞.
(ii) Set A = [(x−1)/(x+1)]^{1/2}. Then

log A = ½[log(1 − 1/x) − log(1 + 1/x)] = −1/x − 1/(3x³) + O(x⁻⁵).
ANSWERS AND HINTS ON SOLUTIONS: SUPPLEMENT
379
Hence

A ≈ 1 − 1/x + 1/(2x²) − 1/(2x³).

(iii) Note that

u_r = (−x)^r/r + (1/x)·(−x)^{r+1}/(r+1),

whence

Σ_{r=1}^∞ u_r = −log(1+x) − (1/x)[log(1+x) − x].

13 UL (1973). (i) Note that u_r = (2^r − 1)/(r+1)!, whence

Σ_{r=1}^∞ u_r = ½ Σ_{r=1}^∞ 2^{r+1}/(r+1)! − Σ_{r=1}^∞ 1/(r+1)! = ½ Σ_{s=2}^∞ 2^s/s! − Σ_{s=2}^∞ 1/s!,

whence the answer.
(ii) Since (1/r)·C(n, r−1) = C(n+1, r)/(n+1),

Σ_{r=1}^{n+1} (1/r)·C(n, r−1)·3^{−r} = [1/(n+1)] Σ_{r=1}^{n+1} C(n+1, r)·3^{−r},

whence the answer on summation of the binomial expansion.
(iii) The given expression can be rewritten as x(1 + 1/x)^{1/2} log(1 + 1/x), and the answer is obtained by expansion and retaining terms up to x⁻³.
(iv) The given expression

= e^{−2x²}(e^{2x} + e^{−2x}) = 2e^{−2x²}[1 + 2x² + 2x⁴/3 + ···],

whence the answer on the expansion of e^{−2x²}.
14 UL (1972). (i) The expression for f(r−1) − f(r) follows on substitution, and then S_n = f(0) − f(n). (ii) If S denotes the sum of the series, then differencing shows that

(1−x)S = 1 + 2x + (2x)² + (2x)³ + ··· = (1−2x)⁻¹, if |2x| < 1,

whence S = 1/[(1−x)(1−2x)].
15 The sum is

Σ_{k=0}^{m} [coefficient of x^k in (1−x)^{−α}] × [coefficient of x^{m−k} in (1−x)^{−β}]
= coefficient of x^m in (1−x)^{−α−β},

whence the result.

16 UL (1970). (i) On writing the expression as a single sum of binomial terms in a/(1+na), the positive and negative contributions cancel, and the expression = 0 on summation.
(ii) Here u_r = f(r) − f(r+1), where f(r) = 4/[r²(r+1)²]. Hence

Σ_{r=1}^n u_r = f(1) − f(n+1).

(iii) Here f(r+1) − f(r) = 14r⁶ − 2r², whence the expression for r⁶. Hence

Σ_{r=1}^n r⁶ = (1/7) Σ_{r=1}^n r² + (1/14)[f(n+1) − f(1)],

and the answer follows on reduction.
17 UL (1969). (i) Note that (r+s+1)^{(s+2)} − (r+s)^{(s+2)} = (s+2)·(r+s)^{(s+1)}, whence the result on summation. In particular, for s = 0,

S_r = ½r(r+1) and S_{n−r} = ½(n−r)(n−r+1).

It now follows that Σ_r S_r S_{n−r} = (n+3)^{(5)}/5!, whence the result.

The difference equation holds for n ≥ 3, with u₀ = 1, u₁ = q and u₂ = pq + q² = q. The roots of the auxiliary equation are 1 and (−½p)(1 ± i√3), whence the general solution

u_n = A + B(−p)ⁿ[cos(nπ/3) + i sin(nπ/3)] + C(−p)ⁿ[cos(nπ/3) − i sin(nπ/3)],

where A, B and C are arbitrary constants. To determine the constants use the initial conditions. This gives

A = 1/(1+p+p²); B = −[(1+p)i√3 − q]/[2√3(1+p+p²)]; C = −[(1+p)i√3 + q]/[2√3(1+p+p²)].

The stated result follows on reduction.
The stated result follows on reduction.
42 UL (1966). The total number of samples of size n is (~). In any sample of size n, x will be the largest integer if the other n - 1 integers in the sample are drawn from the x-I integers less than x. Hence the probability distribution of X. The assumed equality may be proved as folIows.
xt (=:~)= xt C~~) =
Nf' (t+n+k) 1=0
t
N-n
=
L
coefficient of
ZO
in z-l(1 + z)'+n+k
1=0
= coefficient of
N-n
ZO
in (1 + z)n+k
L {(1 + Z)/Z}I
1=0
= coefficient of ZO in (1+z)n+k[(1+z)N-n+l. z-(N-n)_z] = coefficient of zN-n in (1 + Z)N+k+l
= (N+k+l)= (N+k+l). N-n n+k+l For k = -1, we obtain the sum of the probabilities P(X = x). The expression for E[(X + r -1)(.). (ii) There are 2N + 1- n members who are not in the pressure group. If r
is a random variable denoting the number of these members who vote with the pressure group, then r is a binomial variable such that E(r) = !(2N + 1- n); var(r) =i(2N + 1- n). The required probability is P(r~N+ 1-n) = 1-P(r:!S;N-n) =1-
L
N-"
r=O
-ct>(-
(2N + 1- n)(1)2N+l-" r 2
n
)
\.J2N+1-n '
by using the normal approximation for the binomial distribution.
For fixed N this probability is an increasing function of n, and if we assume that Φ(3) ≈ 1, then P(r ≥ N+1−n) ≈ 1 if

n² ≥ 9(2N+1−n),

whence the inequality for n.
46 UL (1967). P(X = n) = [Probability that there are w−1 marked animals in the first n−1 members of the second sample] × [Probability of selecting a marked animal on the nth occasion from a total of N−n+1 animals of which W−w+1 are marked] = [...]·(W−w+1)/(N−n+1), which reduces to the stated result on recombining the binomial coefficients. On reduction,

g(N)/g(N−1) = (N−n)(N−W) / [N(N−W−n+w)].

The stated result follows by considering g(N)/g(N−1) > 1. An unbiased estimate of N is (W+1)n/w − 1.
47 UL (1965). The probability that the player loses for the first time at the (r+1)th trial is p^r q, and his net gain is then rx − a. Hence

E(S) = Σ_{r=0}^∞ (rx−a)p^r q = −a + pqx Σ_{r=1}^∞ r p^{r−1} = −a + pqx·(d/dp)[(1−p)⁻¹],

whence the result. In the same way,

E(S²) = x²q Σ_{r=0}^∞ {r(r−1)+r}p^r − 2axp/q + a²,

whence var(S). The inequality E(S) ≥ a leads to x ≥ 2aq/p; since 2q < 1, it is sufficient that x ≥ a/p.
48 UL (1975).

P(r, X = x) = C(x, r)θ^r(1−θ)^{x−r}·(1−p)p^{x−n},

and so

E(r) = (1−p)θ Σ_{x=n}^∞ x p^{x−n} Σ_{r=1}^x C(x−1, r−1)θ^{r−1}(1−θ)^{x−r}
= (1−p)θ Σ_{x=n}^∞ x p^{x−n}
= (1−p)θ Σ_{t=0}^∞ (n+t)p^t
= nθ + (1−p)θ Σ_{t=0}^∞ t p^t,

whence the result on summation.
49 UL (1972). When the housewife has x different coupons, the probability of her obtaining a new coupon is 1 − x/n and that of a duplicate coupon is x/n. Hence P(r | x), and

E(r | x) = (1 − x/n) Σ_{r=1}^∞ r(x/n)^{r−1} = n/(n−x), on summation.

Hence the expected total number of weeks is

Σ_{x=1}^{n−1} n/(n−x) = n Σ_{t=1}^{n−1} t⁻¹.

50 UL (1971).
(i) Σ_{r=a}^∞ r^{(k)} P(X = r) = μ^k Σ_{r=a}^∞ e^{−μ}μ^{r−k}/(r−k)! = μ^k Σ_{s=a−k}^∞ e^{−μ}μ^s/s!,

whence the result.
(ii) If Y denotes the number of rolls of film sold, then

P(Y = r) = P(X = r)/[1 − P(X = 0)], for Y ≥ 1.

Therefore the probability distribution of Z is

P(Z = 0) = P(Y = 1); P(Z = r) = P(Y = r), for 2 ≤ r ≤ 5; P(Z = 6) = P(Y ≥ 6).

Hence

E(Z) = Σ_{r=2}^5 r·P(X = r)/[1 − P(X = 0)] + 6 Σ_{r=6}^∞ P(X = r)/[1 − P(X = 0)]
= [μP(1 ≤ X ≤ 4) + 6{1 − P(0 ≤ X ≤ 5)}]/(1 − e^{−μ}),

whence the result. Similarly,

E(Z^{(2)}) = [μ²P(0 ≤ X ≤ 3) + 30{1 − P(0 ≤ X ≤ 5)}]/(1 − e^{−μ}),

whence var(Z). The average return per roll of film sold is (in pence)

42 − [6(1−a) + μb]/(1 − e^{−μ}).
51 UL (1970). Expected loss

= Σ_{r=0}^∞ [e^{−μ}μ^r/r!]·ar(r+1)e^{−βr}
= a e^{−μ} Σ_{r=1}^∞ e^{−βr}μ^r[(r−1)+2]/(r−1)!
= a e^{−μ}[e^{−2β}μ² Σ_{r=2}^∞ e^{−β(r−2)}μ^{r−2}/(r−2)! + 2e^{−β}μ Σ_{r=1}^∞ e^{−β(r−1)}μ^{r−1}/(r−1)!],

whence the answer on summation. Since 1−μ < 0 and μe^{−β} < 1, we have

Expected loss < 3a e^{1−μ} < 3a.
1L
< 3a.
S2 UL (1969). Assume a Poisson model for the distribution of errors with mean /-L. Then the expected cost of correcting the errors made on a stencil is
I e-IL/-Lr 2r(3r+2) r=O r! . r+1 =2e- I /-L'[3r(r+1)-(r+1)-1] =
1L
(r+1)! r r 1 r+l] 3 L _/-L__ L ~+- L _/-L_ r=l(r-1)! r=or! /-Lr=o(r+1)!'
r=O
=2e- 1L
<X>
[
<X>
whence the result on summation. The residual profit is a -2[3/-L -1
+;
<X>
(l-e- IL )]
and this will be equal to '\a if
(1-'\)a+2 2
3 /-L+1 (1-e-) IL /-L =1+~/-L+i/-L2- ... ,
----=
whence the answer by retaining only the terms in /-L.
53 UL (1968). Suppose Ny calendars are ordered. For 0 ~ x ~ y, the profit is PI = Nx x 30a + N(y -x) x 30~ -Ny x30
398
EXERCISES IN PROBABILITY AND STATISTICS
and for x > y the profit is P2=Ny x30a-Ny x 30
Hence y
00
I
G(y)=30N
L
[(a-{3)x-(1-{3)y]P(x)+30N
X~O
(a-1)yP(x)
x~y+,l
whence the result on using the relation y
00
L
P(x) = 1-
x~y+l
I
P(x).
x~o
Direct substitution shows that G(y + 1) - G(y) = 30N[a -1- (a - (3)F(y)]. Maximum profit is obtained for the smallest value of integer y which makes aG(y)-a.P(x=O)+
I
pr-1
a .p(x=r)
r=l
whence the result. O<e->-O,
and
O<e->-(l-e->-):S;;~.
Therefore
whence the inequality by using p e->- :S;;!. 58 UL (1967). The probability of finding < n defectives in the sample of N is
This is also the probability of accepting the batch after inspection, which now contains only M - N components. For each component sold, the expected net income is 1 - a6. Therefore g(n, 6) = (M - N)(l- (6)
~t~ (~6X(1- 6)N-X
400
EXERCISES IN PROBABILITY AND STATISTICS
whence 1
J
g(n) = E0[g(n, 0)] = (M - N) (1- aO)
~~: (~OX(l- O)N-X . 66(1- 0) dO
o 1
= 6(M - N)
=6(M-N)
"of (N\ J(1- aO)Ox+1(1- 0)N-x+1 dO
X~O x)
o
:~: (~[B(X+2, N-x+2)-aB(x+3, N-x+2)J
6(M-N) .. -1
L
)(4) (x + l)(N - x + l)(N + 4 - ax - 2a), on reduction N+4 X~O 6(M-N) .. = (N+4)(4) Z~1 z(N-z+2)(N+4-a-az),
= (
whence the result on summation over z. For ~ = n/ N, we have g(n)-
(M-N)e 2 [6-4(1+a)~+3ae].
It now follows that g(n) is maximized for
a~ =
1.
59 UL (1968). The expected cost of a packet before devaluation is

α + βμ − [γ/(σ√(2π))] ∫_{−∞}^∞ (x² − σ²) exp[−(x−μ)²/(2σ²) − γ(x−μ)/σ] dx.

The stated result now follows by putting u = z + γ and then integrating over u by using the properties of the normal integral. The expected cost after devaluation is

α′ + β′μ′ − e^{½γ²}(μ′ − γσ′)²,

whence, under the given conditions,

μ′ = (α + βμ − α′)/β′; σ′ = [(α − α′) − (β − β′)μ + γβ′σ]/(γβ′).

Under the modified system the proportion of underweight packets is Φ[(1−μ′)/σ′], whence the stated result.
60 UL (1972). The expected payment per page is (in £)

(4/3)×e^{−μ} + 1×μe^{−μ} + (4/3)×[1 − (1+μ)e^{−μ}],

whence the answer. If £θ is the new payment to the proof-reader for every page on which he detects no errors, then

θ×e^{−μ} + 1×μe^{−μ} + (4/3)×[1 − (1+μ)e^{−μ}] = 1,

whence θ = ⅓(4 + μ − e^{μ}).
61 UL (1973).

E(x^r) = (λ+2)(λ+3) ∫₀^∞ x^{r+1} dx/(1+x)^{λ+4} = (λ+2)(λ+3)B(r+2, λ−r+2), if λ+2 > r,

whence the result. Using the particular results for E(x) and E(x²) gives the value of E(T).

62 UL (1973). To evaluate E(x−a), use the transformation y = log(x−a), where −∞ < y < ∞. The stated result follows on integration. To obtain the moments, note that

log M₀(t) = log[1 − {1−2a(1−a)}βt/{2a(1−a)}] − log[1 − βt/(2a)] − log[1 − βt/{2(1−a)}]
= βt + {β²[1−2a(1−a)]/[2a(1−a)]}·(t²/2) + ···,

on expansion, whence E(X) and var(X). The lower limit for var(X) is obtained by noting that a(1−a) ≤ ¼.

76 UL (1966). The moment-generating function of X is E(e^{tX}) = e^{½t²}, whence the moments. The proportionality constant k for the probability density function of Z is obtained from the equation

k ∫_{−∞}^∞ e^{−½z²}[1 + az(z+1)] dz = 1, whence k = 1/[√(2π)(1+a)].

The νth moment of Z about the origin is

E(Z^ν) ≡ μ′_ν(Z) = [1/{√(2π)(1+a)}] ∫_{−∞}^∞ e^{−½z²}(z^ν + az^{ν+1} + az^{ν+2}) dz.
For ν = 2r+1 and ν = 2r, the required moments are obtained by using the integral expressions for the moments of the unit normal variable X. In particular, E(Z) = a/(1+a) and var(Z) = 2 − 1/(1+a)².

77
UL (1965). 00
J
E(e'X) = A e-(A-')x dx = (1- t/A)-I. o
The cumulants of X are obtained by the expansion of -log(1- t/A). The point-probabilities of Yare c e- Ay where 00
c
L e-A'=1
or c=1-e-A.
• =0
Hence 00
G(6)=(1-e- A)
L (6e-A)' =(1-e-A)/(1-6 e-A) . • =0
Therefore the cumulant-generating function of Y is log(1-e- A) -log(1-e- A e') whence the first two cumulants on expansion. Finally, e A -1-A Kl(X)-Kl(Y) = A(eA -1); K2(X) - KiY) =
(e A - 1f - A2 e A A2(eA _1)2
The stated approximations are now obtained by expansion. 78
UL (1972). P(X = r) = Probability that there is one ace in the first r cards x (r+ l)th card dealt is second ace
The recurrence relation is obtained by considering the ratio P(X = r + 1)/P(X = r). It then follows that P(X = r + 1) > P(X = r) if (49-r)(r+1»r(51-r) or if rO,
whence the mean and variance of X. The likelihood of the n observations is
" X~ ( 1)" exp (1" -i i~ ) Il
L = 6A 4
Xi
•
ANSWERS AND HINTS ON SOLUTIONS: SUPPLEMENT SO
407
that nx
L 3 log Xj, II
log L = constant - 4n log A- - +
A
j=t
whence the estimate A on differentiation with respect to A. We have E(x) = 4A; var(x) = 4A 2/n = E(x 2) - E2(X). Therefore E(x 2) = 16A 2(1 + 1/4n). Hence an unbiased estimate of A2 is nx 2/4(4n + 1). Also, E(x 2/16) = A2(1 + 1/4n).
80 UL (1973). The likelihood of the sample observations is
( 1)11 .exp--1 L (xj-Of II
L= - SO
that
~2wO
20
j
=t
n 1 log L = constant-llog 0 - 20 [ns 2 + n(x - Of].
Differentiation with respect to 0 leads to the stated equation for 0. The quadratic in 0 has one positive r