Univariate Discrete Distributions
Univariate Discrete Distributions THIRD EDITION
NORMAN L. JOHNSON University of North Carolina Department of Statistics Chapel Hill, North Carolina
ADRIENNE W. KEMP University of St. Andrews Mathematical Institute North Haugh, St. Andrews United Kingdom
SAMUEL KOTZ George Washington University Department of Engineering Management and Systems Engineering Washington, D.C.
A JOHN WILEY & SONS, INC., PUBLICATION
Copyright © 2005 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com. Library of Congress Cataloging-in-Publication Data: ISBN 0-471-27246-9 Printed in the United States of America. 10 9 8 7 6 5 4 3 2 1
To the memory of Norman Lloyd Johnson (1917–2004)
Contents

Preface, xvii

1 Preliminary Information, 1
  1.1 Mathematical Preliminaries, 1
    1.1.1 Factorial and Combinatorial Conventions, 1
    1.1.2 Gamma and Beta Functions, 5
    1.1.3 Finite Difference Calculus, 10
    1.1.4 Differential Calculus, 14
    1.1.5 Incomplete Gamma and Beta Functions and Other Gamma-Related Functions, 16
    1.1.6 Gaussian Hypergeometric Functions, 20
    1.1.7 Confluent Hypergeometric Functions (Kummer's Functions), 23
    1.1.8 Generalized Hypergeometric Functions, 26
    1.1.9 Bernoulli and Euler Numbers and Polynomials, 29
    1.1.10 Integral Transforms, 32
    1.1.11 Orthogonal Polynomials, 32
    1.1.12 Basic Hypergeometric Series, 34
  1.2 Probability and Statistical Preliminaries, 37
    1.2.1 Calculus of Probabilities, 37
    1.2.2 Bayes's Theorem, 41
    1.2.3 Random Variables, 43
    1.2.4 Survival Concepts, 45
    1.2.5 Expected Values, 47
    1.2.6 Inequalities, 49
    1.2.7 Moments and Moment Generating Functions, 50
    1.2.8 Cumulants and Cumulant Generating Functions, 54
    1.2.9 Joint Moments and Cumulants, 56
    1.2.10 Characteristic Functions, 57
    1.2.11 Probability Generating Functions, 58
    1.2.12 Order Statistics, 61
    1.2.13 Truncation and Censoring, 62
    1.2.14 Mixture Distributions, 64
    1.2.15 Variance of a Function, 65
    1.2.16 Estimation, 66
    1.2.17 General Comments on the Computer Generation of Discrete Random Variables, 71
    1.2.18 Computer Software, 73

2 Families of Discrete Distributions, 74
  2.1 Lattice Distributions, 74
  2.2 Power Series Distributions, 75
    2.2.1 Generalized Power Series Distributions, 75
    2.2.2 Modified Power Series Distributions, 79
  2.3 Difference-Equation Systems, 82
    2.3.1 Katz and Extended Katz Families, 82
    2.3.2 Sundt and Jewell Family, 85
    2.3.3 Ord's Family, 87
  2.4 Kemp Families, 89
    2.4.1 Generalized Hypergeometric Probability Distributions, 89
    2.4.2 Generalized Hypergeometric Factorial Moment Distributions, 96
  2.5 Distributions Based on Lagrangian Expansions, 99
  2.6 Gould and Abel Distributions, 101
  2.7 Factorial Series Distributions, 103
  2.8 Distributions of Order-k, 105
  2.9 q-Series Distributions, 106

3 Binomial Distribution, 108
  3.1 Definition, 108
  3.2 Historical Remarks and Genesis, 109
  3.3 Moments, 109
  3.4 Properties, 112
  3.5 Order Statistics, 116
  3.6 Approximations, Bounds, and Transformations, 116
    3.6.1 Approximations, 116
    3.6.2 Bounds, 122
    3.6.3 Transformations, 123
  3.7 Computation, Tables, and Computer Generation, 124
    3.7.1 Computation and Tables, 124
    3.7.2 Computer Generation, 125
  3.8 Estimation, 126
    3.8.1 Model Selection, 126
    3.8.2 Point Estimation, 126
    3.8.3 Confidence Intervals, 130
    3.8.4 Model Verification, 133
  3.9 Characterizations, 134
  3.10 Applications, 135
  3.11 Truncated Binomial Distributions, 137
  3.12 Other Related Distributions, 140
    3.12.1 Limiting Forms, 140
    3.12.2 Sums and Differences of Binomial-Type Variables, 140
    3.12.3 Poissonian Binomial, Lexian, and Coolidge Schemes, 144
    3.12.4 Weighted Binomial Distributions, 149
    3.12.5 Chain Binomial Models, 151
    3.12.6 Correlated Binomial Variables, 151

4 Poisson Distribution, 156
  4.1 Definition, 156
  4.2 Historical Remarks and Genesis, 156
    4.2.1 Genesis, 156
    4.2.2 Poissonian Approximations, 160
  4.3 Moments, 161
  4.4 Properties, 163
  4.5 Approximations, Bounds, and Transformations, 167
  4.6 Computation, Tables, and Computer Generation, 170
    4.6.1 Computation and Tables, 170
    4.6.2 Computer Generation, 171
  4.7 Estimation, 173
    4.7.1 Model Selection, 173
    4.7.2 Point Estimation, 174
    4.7.3 Confidence Intervals, 176
    4.7.4 Model Verification, 178
  4.8 Characterizations, 179
  4.9 Applications, 186
  4.10 Truncated and Misrecorded Poisson Distributions, 188
    4.10.1 Left Truncation, 188
    4.10.2 Right Truncation and Double Truncation, 191
    4.10.3 Misrecorded Poisson Distributions, 193
  4.11 Poisson–Stopped Sum Distributions, 195
  4.12 Other Related Distributions, 196
    4.12.1 Normal Distribution, 196
    4.12.2 Gamma Distribution, 196
    4.12.3 Sums and Differences of Poisson Variates, 197
    4.12.4 Hyper-Poisson Distributions, 199
    4.12.5 Grouped Poisson Distributions, 202
    4.12.6 Heine and Euler Distributions, 205
    4.12.7 Intervened Poisson Distributions, 205

5 Negative Binomial Distribution, 208
  5.1 Definition, 208
  5.2 Geometric Distribution, 210
  5.3 Historical Remarks and Genesis of Negative Binomial Distribution, 212
  5.4 Moments, 215
  5.5 Properties, 217
  5.6 Approximations and Transformations, 218
  5.7 Computation and Tables, 220
  5.8 Estimation, 222
    5.8.1 Model Selection, 222
    5.8.2 P Unknown, 222
    5.8.3 Both Parameters Unknown, 223
    5.8.4 Data Sets with a Common Parameter, 226
    5.8.5 Recent Developments, 227
  5.9 Characterizations, 228
    5.9.1 Geometric Distribution, 228
    5.9.2 Negative Binomial Distribution, 231
  5.10 Applications, 232
  5.11 Truncated Negative Binomial Distributions, 233
  5.12 Related Distributions, 236
    5.12.1 Limiting Forms, 236
    5.12.2 Extended Negative Binomial Model, 237
    5.12.3 Lagrangian Generalized Negative Binomial Distribution, 239
    5.12.4 Weighted Negative Binomial Distributions, 240
    5.12.5 Convolutions Involving Negative Binomial Variates, 241
    5.12.6 Pascal–Poisson Distribution, 243
    5.12.7 Minimum (Riff–Shuffle) and Maximum Negative Binomial Distributions, 244
    5.12.8 Condensed Negative Binomial Distributions, 246
    5.12.9 Other Related Distributions, 247

6 Hypergeometric Distributions, 251
  6.1 Definition, 251
  6.2 Historical Remarks and Genesis, 252
    6.2.1 Classical Hypergeometric Distribution, 252
    6.2.2 Beta–Binomial Distribution, Negative (Inverse) Hypergeometric Distribution: Hypergeometric Waiting-Time Distribution, 253
    6.2.3 Beta–Negative Binomial Distribution: Beta–Pascal Distribution, Generalized Waring Distribution, 256
    6.2.4 Pólya Distributions, 258
    6.2.5 Hypergeometric Distributions in General, 259
  6.3 Moments, 262
  6.4 Properties, 265
  6.5 Approximations and Bounds, 268
  6.6 Tables, Computation, and Computer Generation, 271
  6.7 Estimation, 272
    6.7.1 Classical Hypergeometric Distribution, 273
    6.7.2 Negative (Inverse) Hypergeometric Distribution: Beta–Binomial Distribution, 274
    6.7.3 Beta–Pascal Distribution, 276
  6.8 Characterizations, 277
  6.9 Applications, 279
    6.9.1 Classical Hypergeometric Distribution, 279
    6.9.2 Negative (Inverse) Hypergeometric Distribution: Beta–Binomial Distribution, 281
    6.9.3 Beta–Negative Binomial Distribution: Beta–Pascal Distribution, Generalized Waring Distribution, 283
  6.10 Special Cases, 283
    6.10.1 Discrete Rectangular Distribution, 283
    6.10.2 Distribution of Leads in Coin Tossing, 286
    6.10.3 Yule Distribution, 287
    6.10.4 Waring Distribution, 289
    6.10.5 Narayana Distribution, 291
  6.11 Related Distributions, 293
    6.11.1 Extended Hypergeometric Distributions, 293
    6.11.2 Generalized Hypergeometric Probability Distributions, 296
    6.11.3 Generalized Hypergeometric Factorial Moment Distributions, 298
    6.11.4 Other Related Distributions, 299

7 Logarithmic and Lagrangian Distributions, 302
  7.1 Logarithmic Distribution, 302
    7.1.1 Definition, 302
    7.1.2 Historical Remarks and Genesis, 303
    7.1.3 Moments, 305
    7.1.4 Properties, 307
    7.1.5 Approximations and Bounds, 309
    7.1.6 Computation, Tables, and Computer Generation, 310
    7.1.7 Estimation, 311
    7.1.8 Characterizations, 315
    7.1.9 Applications, 316
    7.1.10 Truncated and Modified Logarithmic Distributions, 317
    7.1.11 Generalizations of the Logarithmic Distribution, 319
    7.1.12 Other Related Distributions, 321
  7.2 Lagrangian Distributions, 325
    7.2.1 Otter's Multiplicative Process, 326
    7.2.2 Borel Distribution, 328
    7.2.3 Consul Distribution, 329
    7.2.4 Geeta Distribution, 330
    7.2.5 General Lagrangian Distributions of the First Kind, 331
    7.2.6 Lagrangian Poisson Distribution, 336
    7.2.7 Lagrangian Negative Binomial Distribution, 340
    7.2.8 Lagrangian Logarithmic Distribution, 341
    7.2.9 Lagrangian Distributions of the Second Kind, 342

8 Mixture Distributions, 343
  8.1 Basic Ideas, 343
    8.1.1 Introduction, 343
    8.1.2 Finite Mixtures, 344
    8.1.3 Varying Parameters, 345
    8.1.4 Bayesian Interpretation, 347
  8.2 Finite Mixtures of Discrete Distributions, 347
    8.2.1 Parameters of Finite Mixtures, 347
    8.2.2 Parameter Estimation, 349
    8.2.3 Zero-Modified and Hurdle Distributions, 351
    8.2.4 Examples of Zero-Modified Distributions, 353
    8.2.5 Finite Poisson Mixtures, 357
    8.2.6 Finite Binomial Mixtures, 358
    8.2.7 Other Finite Mixtures of Discrete Distributions, 359
  8.3 Continuous and Countable Mixtures of Discrete Distributions, 360
    8.3.1 Properties of General Mixed Distributions, 360
    8.3.2 Properties of Mixed Poisson Distributions, 362
    8.3.3 Examples of Poisson Mixtures, 365
    8.3.4 Mixtures of Binomial Distributions, 373
    8.3.5 Examples of Binomial Mixtures, 374
    8.3.6 Other Continuous and Countable Mixtures of Discrete Distributions, 376
  8.4 Gamma and Beta Mixing Distributions, 378

9 Stopped-Sum Distributions, 381
  9.1 Generalized and Generalizing Distributions, 381
  9.2 Damage Processes, 386
  9.3 Poisson–Stopped Sum (Multiple Poisson) Distributions, 388
  9.4 Hermite Distribution, 394
  9.5 Poisson–Binomial Distribution, 400
  9.6 Neyman Type A Distribution, 403
    9.6.1 Definition, 403
    9.6.2 Moment Properties, 405
    9.6.3 Tables and Approximations, 406
    9.6.4 Estimation, 407
    9.6.5 Applications, 409
  9.7 Pólya–Aeppli Distribution, 410
  9.8 Generalized Pólya–Aeppli (Poisson–Negative Binomial) Distribution, 414
  9.9 Generalizations of Neyman Type A Distribution, 416
  9.10 Thomas Distribution, 421
  9.11 Borel–Tanner Distribution: Lagrangian Poisson Distribution, 423
  9.12 Other Poisson–Stopped Sum (Multiple Poisson) Distributions, 425
  9.13 Other Families of Stopped-Sum Distributions, 426

10 Matching, Occupancy, Runs, and q-Series Distributions, 430
  10.1 Introduction, 430
  10.2 Probabilities of Combined Events, 431
  10.3 Matching Distributions, 434
  10.4 Occupancy Distributions, 439
    10.4.1 Classical Occupancy and Coupon Collecting, 439
    10.4.2 Maxwell–Boltzmann, Bose–Einstein, and Fermi–Dirac Statistics, 444
    10.4.3 Specified Occupancy and Grassia–Binomial Distributions, 446
  10.5 Record Value Distributions, 448
  10.6 Runs Distributions, 450
    10.6.1 Runs of Like Elements, 450
    10.6.2 Runs Up and Down, 453
  10.7 Distributions of Order k, 454
    10.7.1 Early Work on Success Runs Distributions, 454
    10.7.2 Geometric Distribution of Order k, 456
    10.7.3 Negative Binomial Distributions of Order k, 458
    10.7.4 Poisson and Logarithmic Distributions of Order k, 459
    10.7.5 Binomial Distributions of Order k, 461
    10.7.6 Further Distributions of Order k, 463
  10.8 q-Series Distributions, 464
    10.8.1 Terminating Distributions, 465
    10.8.2 q-Series Distributions with Infinite Support, 470
    10.8.3 Bilateral q-Series Distributions, 474
    10.8.4 q-Series Related Distributions, 476

11 Parametric Regression Models and Miscellanea, 478
  11.1 Parametric Regression Models, 478
    11.1.1 Introduction, 478
    11.1.2 Tweedie–Poisson Family, 480
    11.1.3 Negative Binomial Regression Models, 482
    11.1.4 Poisson Lognormal Model, 483
    11.1.5 Poisson–Inverse Gaussian (Sichel) Model, 484
    11.1.6 Poisson Polynomial Distribution, 487
    11.1.7 Weighted Poisson Distributions, 488
    11.1.8 Double-Poisson and Double-Binomial Distributions, 489
    11.1.9 Simplex–Binomial Mixture Model, 490
  11.2 Miscellaneous Discrete Distributions, 491
    11.2.1 Dandekar's Modified Binomial and Poisson Models, 491
    11.2.2 Digamma and Trigamma Distributions, 492
    11.2.3 Discrete Adès Distribution, 494
    11.2.4 Discrete Bessel Distribution, 495
    11.2.5 Discrete Mittag–Leffler Distribution, 496
    11.2.6 Discrete Student's t Distribution, 498
    11.2.7 Feller–Arley and Gegenbauer Distributions, 499
    11.2.8 Gram–Charlier Type B Distributions, 501
    11.2.9 "Interrupted" Distributions, 502
    11.2.10 Lost-Games Distributions, 503
    11.2.11 Luria–Delbrück Distribution, 505
    11.2.12 Naor's Distribution, 507
    11.2.13 Partial-Sums Distributions, 508
    11.2.14 Queueing Theory Distributions, 512
    11.2.15 Reliability and Survival Distributions, 514
    11.2.16 Skellam–Haldane Gene Frequency Distribution, 519
    11.2.17 Steyn's Two-Parameter Power Series Distributions, 521
    11.2.18 Univariate Multinomial-Type Distributions, 522
    11.2.19 Urn Models with Stochastic Replacements, 524
    11.2.20 Zipf-Related Distributions, 526
    11.2.21 Haight's Zeta Distributions, 533

Bibliography, 535

Abbreviations, 631

Index, 633
Preface
This book is dedicated to the memory of Professor N. L. Johnson, who passed away during the production stages. He was my longtime friend and mentor; his assistance with this revision during his long illness is greatly appreciated. His passing is a sad loss to all who are interested in statistical distribution theory.

The preparation of the third edition gave Norman and me the opportunity to substantially revise and reorganize parts of the book. This enabled us to increase the coverage of certain areas and to highlight today's better understanding of interrelationships between distributions. Also a number of errors and inaccuracies in the two previous editions have been corrected and some explanations have been clarified. The continuing interest in discrete distributions is evinced by the addition of over 400 new references, nearly all since 1992. Electronic databases, such as Statistical Theory and Methods Abstracts (published by the International Statistical Institute), the Current Index to Statistics: Applications, Methods and Theory (published by the American Statistical Association and the Institute of Mathematical Statistics), and the Thomson ISI Web of Science, have drawn to our attention papers and articles which might otherwise have escaped notice.

It is important to acknowledge the impact of scholarly, encyclopedic publications such as the Dictionary and Bibliography of Statistical Distributions in Scientific Work, Vol. 1: Discrete Models, by G. P. Patil, M. T. Boswell, S. W. Joshi, and M. V. Ratnaparkhi (1984) (published by the International Co-operative Publishing House, Fairland, MD), and the Thesaurus of Univariate Discrete Probability Distributions, by G. Wimmer and G. Altmann (1999) (published by Stamm Verlag, Essen). The new edition of Statistical Distributions, by M. Evans, N. Hastings, and B. Peacock (2000) (published by Wiley, New York), encouraged us to address the needs of occasional readers as distinct from researchers into the theoretical and applied aspects of the subject.

The objectives of this book are far wider. It aims, as before, to give an account of the properties and the uses of discrete distributions at the time of writing, while adhering to the same level and style as previous editions. The 1969 intention to exclude theoretical minutiae of no apparent practical importance has not
been forgotten. We have tried to give a balanced account of new developments, especially those in the more accessible statistical journals. There has also been relevant work in related fields, such as econometrics, combinatorics, probability theory, stochastic processes, actuarial studies, operational research, and the social sciences. We have aimed to provide a framework within which future research findings can best be understood. In trying to keep the book to a reasonable length, some material that should have been included was omitted or its coverage curtailed. Comments and criticisms are welcome; I would like to express our gratitude to friends and colleagues for pointing out faults in the last edition and for their input of ideas into the new edition.

The structure of the book is broadly similar to that of the previous edition. The organization of the increased amount of material into the same number of chapters has, however, created some unfamiliar bedfellows. An extra chapter would have had an untoward effect on the next two books in the series (Univariate Continuous Distributions, Vols. 1 and 2); these begin with Chapter 12. Concerning numbering conventions, each chapter is divided into sections, and within many sections there are subsections. Instead of a separate name index, the listed references end with section numbers enclosed in square brackets.

Chapter 1 has seen some reordering and the inclusion of a small amount of new, relevant material. Sections 1.1 and 1.2 contain mathematical preliminaries and statistical preliminaries, respectively. Material on the computer generation of specific types of random variables is shifted to appropriate sections in other chapters. We chose not to discuss software explicitly; we felt that this is precluded by shortage of space. Some of the major packages are listed at the end of Chapter 1, however. Many contain modules for tasks associated with specific distributions. Websites are given so that readers can obtain further information.

In Chapter 2, most of the material on distributions based on Lagrangian expansions is moved to Chapter 7, which is now entitled Logarithmic and Lagrangian Distributions. There are new short sections in Chapter 2 on order-k and q-series distributions, mentioning their new placement in the book and changes in customary notations since the last edition. Chapters 3, 4, and 5 are structurally little changed, although new sections on chain binomial models (Chapter 3), the intervened Poisson distribution (Chapter 4), and the minimum and maximum negative binomial distributions and the condensed negative binomial distribution (Chapter 5) are added. It is hoped that the limited reordering and insertion of new material in Chapter 6 will improve understanding of hypergeometric-type distributions.

Chapter 7 now has a dual role. Logarithmic distributions occupy the first half. The new second part contains a coherent and updated treatment of the previously fragmented material on Lagrangian distributions. The typographical changes in Chapters 8 and 9 are meant to make them more reader friendly.

Chapter 10 is now much longer. It contains the section on record value distributions that was previously in Chapter 11. The treatment of order-k distributions
is augmented by accounts of recent researches. The chapter ends with a consolidated account of the absorption, Euler, and Heine distributions, as well as new q-series material, including new work on the null distribution of the Wilcoxon–Mann–Whitney test statistic.

Chapter 11 has seen most change; it is now in two parts. The ability of modern computers to gather and analyze very large data sets with many covariates has led to the construction of many regression-type models, both parametric and nonparametric. The first part of Chapter 11 gives an account of certain regression models for discrete data that are probabilistically fully specified, that is, fully parametric. These include the Tweedie–Poisson family, the Poisson lognormal, Poisson inverse Gaussian, and Poisson polynomial distributions. Efron's double Poisson and double binomial and the simplex–binomial mixture model also receive attention. The remainder of Chapter 11 is on miscellaneous discrete distributions, as before. Those distributions that have fitted better into earlier chapters are replaced with newer ones, such as the discrete Bessel, the discrete Mittag–Leffler, and the Luria–Delbrück distributions. There is a new section on survival distributions. The section on Zipf and zeta distributions is split into two; renewed interest in the literature in Zipf-type distributions is recognized by the inclusion of the Hurwitz zeta and Lerch distributions.

We have been particularly indebted to Professors David Kemp and "Bala" Balakrishnan, who have read the entire manuscript and have made many valuable recommendations (not always implemented). David was particularly helpful with his knowledge of AMS-LaTeX and his understanding of the Wiley style file. He has also been of immense help with the task of proofreading. It is a pleasure to record the facilities and moral support provided by the Mathematical Institute at the University of St Andrews, especially by Dr. Patricia Heggie.

Norman and I much regretted that Sam Kotz, with his wide-ranging knowledge of the farther reaches of the subject, felt unable to join us in preparing this new edition.

Adrienne W. Kemp
St Andrews, Scotland
November 2004
CHAPTER 1
Preliminary Information
Introduction

This work contains descriptions of many different distributions used in statistical theory and applications, each with its own peculiarities distinguishing it from others. The book is intended primarily for reference. We have included a large number of formulas and results. Also we have tried to give adequate bibliographical notes and references to enable interested readers to pursue topics in greater depth.

The same general ideas will be used repeatedly, so it is convenient to collect the appropriate definitions and methods in one place. This chapter does just that. The collection serves the additional purpose of allowing us to explain the sense in which we use various terms throughout the work. Only those properties likely to be useful in the discussion of statistical distributions are described. Definitions of exponential, logarithmic, trigonometric, and hyperbolic functions are not given. Except where stated otherwise, we are using real (not complex) variables, and "log," like "ln," means natural logarithm (i.e., to base e).

A further feature of this chapter is material relating to formulas that will be used only occasionally; where appropriate, comparisons are made with other notations used elsewhere in the literature. In subsequent chapters the reader should refer back to this chapter when an unfamiliar and apparently undefined symbol is encountered.

1.1 MATHEMATICAL PRELIMINARIES

1.1.1 Factorial and Combinatorial Conventions

The number of different orderings of n elements is the product of n with all the positive integers less than n; it is denoted by the familiar symbol n! (factorial n),

$$n! = n(n-1)(n-2)\cdots 1 = \prod_{j=0}^{n-1}(n-j). \tag{1.1}$$
The less familiar semifactorial symbol k!! means $(2n)!! = 2n(2n-2)\cdots 2$, where k = 2n. The product of a positive integer with the next k − 1 smaller positive integers is called a descending (falling) factorial; it will in places be denoted by

$$n^{(k)} = n(n-1)\cdots(n-k+1) = \prod_{j=0}^{k-1}(n-j) = \frac{n!}{(n-k)!}, \tag{1.2}$$

in accordance with earlier editions of this book. Note that there are k terms in the product and that $n^{(k)} = 0$ for k > n, where n is a positive integer.

Readers are WARNED that there is no universal notation for descending factorials in the statistical literature. For example, Mood, Graybill, and Boes (1974) use the symbol $(n)_k$ in the sense $(n)_k = n(n-1)\cdots(n-k+1)$, while Stuart and Ord (1987) write $n^{[k]} = n(n-1)\cdots(n-k+1)$; Wimmer and Altmann (1999) use $x_{(n)} = x(x-1)(x-2)\cdots(x-n+1)$, $x \in \mathbb{R}$, $n \in \mathbb{N}$.

Similarly there is more than one notation in the statistical literature for ascending (rising) factorials; for instance, Wimmer and Altmann (1999) use $x^{(n)} = x(x+1)(x+2)\cdots(x+n-1)$, $x \in \mathbb{R}$, $n \in \mathbb{N}$. In the first edition of this book we used

$$n^{[k]} = n(n+1)\cdots(n+k-1) = \prod_{j=0}^{k-1}(n+j) = \frac{(n+k-1)!}{(n-1)!}. \tag{1.3}$$

There is, however, a standard notation in the mathematical literature, where the symbol $(n)_k$ is known as Pochhammer's symbol after the German mathematician L. A. Pochhammer [1841–1920]; it is used to denote

$$(n)_k = n(n+1)\cdots(n+k-1) \tag{1.4}$$

[this definition of $(n)_k$ differs from that of Mood et al. (1974)]. We will use Pochhammer's symbol, meaning (1.4), except where it conflicts with the use of (1.3) in earlier editions.

The binomial coefficient $\binom{n}{r}$ denotes the number of different possible combinations of r items from n different items. We have

$$\binom{n}{r} = \frac{n!}{r!(n-r)!} = \binom{n}{n-r}; \tag{1.5}$$
also

$$\binom{n}{0} = \binom{n}{n} = 1 \qquad\text{and}\qquad \binom{n+1}{r} = \binom{n}{r} + \binom{n}{r-1}. \tag{1.6}$$

It is usual to define $\binom{n}{r} = 0$ if r < 0 or r > n. However,

$$\binom{-n}{r} = \frac{(-n)(-n-1)\cdots(-n-r+1)}{r!} = (-1)^r \binom{n+r-1}{r}. \tag{1.7}$$

The binomial theorem for a positive integer power n is

$$(a+b)^n = \sum_{j=0}^{n} \binom{n}{j} a^{n-j} b^j. \tag{1.8}$$

Putting a = b = 1 gives

$$\binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{n} = 2^n$$

and putting a = 1, b = −1 gives

$$\binom{n}{0} - \binom{n}{1} + \cdots + (-1)^n \binom{n}{n} = 0.$$

More generally, for any real power k,

$$(1+b)^k = \sum_{j=0}^{\infty} \binom{k}{j} b^j, \qquad -1 < b < 1. \tag{1.9}$$

By equating coefficients of x in $(1+x)^{a+b} = (1+x)^a (1+x)^b$, we obtain the well-known and useful identity known as Vandermonde's theorem (A. T. Vandermonde [1735–1796]):

$$\binom{a+b}{n} = \sum_{j=0}^{n} \binom{a}{j} \binom{b}{n-j}. \tag{1.10}$$

Hence

$$\binom{2n}{n} = \binom{n}{0}^2 + \binom{n}{1}^2 + \cdots + \binom{n}{n}^2.$$
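These factorial conventions and binomial identities are easy to sanity-check numerically. The short Python sketch below is an illustrative aside (not part of the original text); the function names are mine. It implements the descending factorial (1.2) and the ascending factorial (1.3)/(1.4), and verifies Pascal's rule (1.6) and Vandermonde's theorem (1.10) for small integer arguments:

```python
from math import comb

def falling_factorial(n, k):
    """Descending factorial n^(k) = n(n-1)...(n-k+1); k terms, as in (1.2)."""
    result = 1
    for j in range(k):
        result *= n - j
    return result

def rising_factorial(n, k):
    """Ascending factorial n^[k] = Pochhammer (n)_k = n(n+1)...(n+k-1), as in (1.3)-(1.4)."""
    result = 1
    for j in range(k):
        result *= n + j
    return result

# n^(k) = n!/(n-k)! for integers n >= k, and n^(k) = 0 when k > n
assert falling_factorial(7, 3) == 7 * 6 * 5
assert falling_factorial(3, 5) == 0

# Pascal's rule (1.6): C(n+1, r) = C(n, r) + C(n, r-1)
for n in range(1, 10):
    for r in range(1, n + 1):
        assert comb(n + 1, r) == comb(n, r) + comb(n, r - 1)

# Vandermonde's theorem (1.10): C(a+b, n) = sum_j C(a, j) C(b, n-j)
# (math.comb conveniently returns 0 when j exceeds a, matching the convention above)
for a in range(6):
    for b in range(6):
        for n in range(a + b + 1):
            assert comb(a + b, n) == sum(comb(a, j) * comb(b, n - j)
                                         for j in range(n + 1))
```

The case a = b = n of the last check is exactly the squared-coefficient identity following (1.10).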
The multinomial coefficient is

$$\binom{n}{r_1, r_2, \ldots, r_k} = \frac{n!}{r_1!\, r_2! \cdots r_k!}, \tag{1.11}$$

where $r_1 + r_2 + \cdots + r_k = n$. The multinomial theorem is a generalization of the binomial theorem:

$$\left( \sum_{j=1}^{k} a_j \right)^{\!n} = \sum \frac{n!}{\prod_{i=1}^{k} n_i!} \prod_{i=1}^{k} a_i^{n_i}, \tag{1.12}$$

where summation is over all sets of nonnegative integers $n_1, n_2, \ldots, n_k$ that sum to n.

There are four ways in which a sample of k elements can be selected from a set of n distinguishable elements:

Order        Repetitions   Name of                           Number of Ways
Important?   Allowed?      Sample                            to Select Sample
No           No            k-Combination                     C(n, k)
Yes          No            k-Permutation                     P(n, k)
No           Yes           k-Combination with replacement    C^R(n, k)
Yes          Yes           k-Permutation with replacement    P^R(n, k)

where

$$C(n, k) = \frac{n!}{k!(n-k)!}, \qquad P(n, k) = \frac{n!}{(n-k)!}, \qquad C^R(n, k) = \frac{(n+k-1)!}{k!(n-1)!}, \qquad P^R(n, k) = n^k. \tag{1.13}$$

The number of ways to arrange n distinguishable items in a row is P(n, n) = n! (the number of permutations of n items). The number of ways to arrange n items in a row, assuming that there are k types of items with $n_i$ nondistinguishable items of type i, $i = 1, 2, \ldots, k$, is the multinomial coefficient $\binom{n}{n_1, n_2, \ldots, n_k}$. The number of derangements of n items (permutations of n items in which item i is not in the ith position) is

$$D_n = n!\left(1 - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \cdots + (-1)^n \frac{1}{n!}\right).$$
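The four sampling counts in (1.13) and the derangement number $D_n$ can be sketched directly from the formulas above. The following is an illustrative Python aside (function names and the keyword-argument interface are mine, not from the text); integer division keeps the derangement sum exact:

```python
from math import comb, factorial

def n_samples(n, k, ordered, replacement):
    """Number of ways to select k elements from n distinguishable elements, per (1.13)."""
    if not ordered and not replacement:
        return comb(n, k)                        # C(n, k)
    if ordered and not replacement:
        return factorial(n) // factorial(n - k)  # P(n, k)
    if not ordered and replacement:
        return comb(n + k - 1, k)                # C^R(n, k)
    return n ** k                                # P^R(n, k)

def derangements(n):
    """D_n = n! * sum_{i=0}^{n} (-1)^i / i!, computed exactly in integers."""
    return sum((-1) ** i * (factorial(n) // factorial(i)) for i in range(n + 1))

assert n_samples(5, 2, ordered=False, replacement=False) == 10
assert n_samples(5, 2, ordered=True, replacement=True) == 25
assert derangements(4) == 9   # 4!(1 - 1 + 1/2 - 1/6 + 1/24) = 9
```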
The signum function, sgn(·), shows whether an argument is greater or less than zero:

$$\operatorname{sgn}(x) = 1 \text{ when } x > 0; \qquad \operatorname{sgn}(0) = 0; \qquad \operatorname{sgn}(x) = -1 \text{ when } x < 0.$$

The ceiling function, $\lceil x \rceil$, is the least integer that is not smaller than x, for example,

$$\lceil e \rceil = 3, \qquad \lceil 7 \rceil = 7, \qquad \lceil -2.4 \rceil = -2.$$

The floor function, $\lfloor x \rfloor$, is the greatest integer that is not greater than x, for example,

$$\lfloor e \rfloor = 2, \qquad \lfloor 7 \rfloor = 7, \qquad \lfloor -2.4 \rfloor = -3.$$

The notation $[\cdot] = \lfloor \cdot \rfloor$ is called the integer part.

$$\pi = 4\sum_{j=0}^{\infty} \frac{(-1)^j}{2j+1} = 3.1415926536\ldots, \qquad e = \sum_{j=0}^{\infty} \frac{1}{j!} = 2.7182818285\ldots, \qquad \ln 2 = \sum_{j=1}^{\infty} \frac{(-1)^{j-1}}{j} = 0.6931471806\ldots.$$
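The three series converge at very different rates, which a few lines of Python make visible. This is an illustrative aside (not from the original text); the truncation points and tolerances are mine:

```python
from math import e, log, pi

def pi_series(terms):
    # pi = 4 * sum_{j>=0} (-1)^j / (2j + 1): converges very slowly
    return 4 * sum((-1) ** j / (2 * j + 1) for j in range(terms))

def e_series(terms):
    # e = sum_{j>=0} 1/j!: converges extremely fast
    total, term = 0.0, 1.0
    for j in range(terms):
        total += term
        term /= j + 1
    return total

def ln2_series(terms):
    # ln 2 = sum_{j>=1} (-1)^(j-1) / j: converges slowly
    return sum((-1) ** (j - 1) / j for j in range(1, terms + 1))

# For an alternating series the error is below the first omitted term
assert abs(e_series(20) - e) < 1e-12        # 20 terms already give near-full precision
assert abs(pi_series(10_000) - pi) < 1e-3   # 10,000 terms give only ~4 digits
assert abs(ln2_series(10_000) - log(2)) < 1e-3
```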
1.1.2 Gamma and Beta Functions

When n is real but is not a positive integer, meaning can be given to n!, and hence to (1.2), (1.3), (1.5), (1.7), and (1.11), by defining

$$(n-1)! = \Gamma(n), \qquad n \in \mathbb{R}^+, \tag{1.14}$$

where $\Gamma(n)$ is the gamma function. The binomial theorem can thereby be shown to hold for any real power. There are three equivalent definitions of the gamma function, due to L. Euler [1707–1783], C. F. Gauss [1777–1855], and K. Weierstrass [1815–1897], respectively:

Definition 1 (Euler):

$$\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\, dt, \qquad x > 0. \tag{1.15}$$

Definition 2 (Gauss):

$$\Gamma(x) = \lim_{n\to\infty} \frac{n!\, n^x}{x(x+1)\cdots(x+n)}, \qquad x \neq 0, -1, -2, \ldots. \tag{1.16}$$

Definition 3 (Weierstrass):

$$\frac{1}{\Gamma(x)} = x e^{\gamma x} \prod_{n=1}^{\infty} \left(1 + \frac{x}{n}\right) \exp\left(-\frac{x}{n}\right), \qquad x > 0, \tag{1.17}$$

where γ is Euler's constant

$$\gamma = \lim_{n\to\infty}\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \ln n\right) \cong 0.5772156649\ldots. \tag{1.18}$$
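Gauss's limit (1.16) and the limit (1.18) for Euler's constant both converge with error of order 1/n, so truncating at a large finite n gives a direct numerical check against a library gamma function. The sketch below is an illustrative aside (not from the original text); the truncation level and log-space evaluation, used to avoid overflow of n!, are my choices:

```python
from math import exp, gamma, log

def gamma_gauss(x, n=100_000):
    """Gauss's limit definition (1.16), truncated at finite n; error is O(1/n)."""
    # log Gamma_n(x) = x log n + log n! - sum_{k=0}^{n} log(x + k)
    log_val = (x * log(n)
               + sum(log(j) for j in range(1, n + 1))
               - sum(log(x + k) for k in range(n + 1)))
    return exp(log_val)

def euler_gamma(n=100_000):
    """Partial form of (1.18): 1 + 1/2 + ... + 1/n - ln n."""
    return sum(1.0 / k for k in range(1, n + 1)) - log(n)

assert abs(gamma_gauss(0.5) - gamma(0.5)) < 1e-3   # Gamma(1/2) = sqrt(pi)
assert abs(euler_gamma() - 0.5772156649) < 1e-4
```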
From Definition 1, $\Gamma(1) = 0! = 1$. Using integration by parts, Definition 1 gives the recurrence relation for $\Gamma(x)$:

$$\Gamma(x+1) = x\Gamma(x) \tag{1.19}$$

[when x is a positive integer, $\Gamma(x+1) = x!$]. This enables us to define $\Gamma(x)$ over the entire real line, except where x is zero or a negative integer, as

$$\Gamma(x) = \begin{cases} \displaystyle\int_0^{\infty} t^{x-1} e^{-t}\, dt, & x > 0, \\[2ex] x^{-1}\,\Gamma(x+1), & x < 0, \quad x \neq -1, -2, \ldots. \end{cases} \tag{1.20}$$

From Definition 3 it can be shown that $\Gamma(\tfrac12) = \pi^{1/2}$; this implies that

$$\int_0^{\infty} \frac{e^{-t}}{t^{1/2}}\, dt = \sqrt{\pi};$$

hence, by taking $t = u^2/2$, we obtain

$$\int_0^{\infty} \exp\left(\frac{-u^2}{2}\right) du = \sqrt{\frac{\pi}{2}}. \tag{1.21}$$

Also, from $\Gamma(\tfrac12) = \pi^{1/2}$, we have

$$\Gamma\left(n + \tfrac12\right) = \frac{(2n)!\, \pi^{1/2}}{n!\, 2^{2n}}. \tag{1.22}$$
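Relation (1.22) can be checked directly against a library gamma function. The short sketch below is an illustrative aside (not part of the original text):

```python
from math import factorial, gamma, pi, sqrt

def half_integer_gamma(n):
    """Gamma(n + 1/2) computed via (1.22): (2n)! sqrt(pi) / (n! 2^(2n))."""
    return factorial(2 * n) * sqrt(pi) / (factorial(n) * 2 ** (2 * n))

# Compare with math.gamma for n = 0, 1, ..., 9
for n in range(10):
    exact = gamma(n + 0.5)
    assert abs(half_integer_gamma(n) - exact) < 1e-9 * exact

# n = 0 recovers Gamma(1/2) = sqrt(pi)
assert abs(half_integer_gamma(0) - sqrt(pi)) < 1e-12
```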
Definition 3 and the product formula sin(πx) = πx
∞
1−
n=1
x2 n2
(1.23)
together imply that (x)(1 − x) =
π , sin(πx)
x = 0, −1, −2, . . . .
(1.24)
Legendre’s duplication formula [A.-M. Legendre, 1752–1833] is √
π(2x) = 22x−1 (x) x + 12 ,
x = 0, − 12 , −1, − 32 , . . . .
(1.25)
Gauss's multiplication theorem is

    \Gamma(mx) = (2\pi)^{(1-m)/2} m^{mx-1/2} \prod_{j=1}^{m} \Gamma\left(x + \frac{j-1}{m}\right),
        x \neq 0, -\frac{1}{m}, -\frac{2}{m}, -\frac{3}{m}, \ldots,    (1.26)

where m = 1, 2, 3, …. This clearly reduces to Legendre's duplication formula when m = 2.

Many approximations for probabilities and cumulative probabilities have been obtained using various forms of Stirling's expansion [J. Stirling, 1692–1770] for the gamma function:

    \Gamma(x+1) \sim (2\pi)^{1/2} (x+1)^{x+1/2} e^{-x-1}
        \times \exp\left[\frac{1}{12(x+1)} - \frac{1}{360(x+1)^3} + \frac{1}{1260(x+1)^5} - \cdots\right],    (1.27)

    \Gamma(x+1) \sim (2\pi)^{1/2} x^{x+1/2} e^{-x}
        \times \exp\left[\frac{1}{12x} - \frac{1}{360x^3} + \frac{1}{1260x^5} - \frac{1}{1680x^7} + \cdots\right],    (1.28)

    \Gamma(x+1) \sim (2\pi)^{1/2} (x+1)^{x+1/2} e^{-x-1}
        \times \left[1 + \frac{1}{12(x+1)} + \frac{1}{288(x+1)^2} - \cdots\right],    (1.29)

    \Gamma(x+1) \sim (2\pi)^{1/2} x^{x+1/2} e^{-x}
        \times \left[1 + \frac{1}{12x} + \frac{1}{288x^2} - \frac{139}{51{,}840x^3} - \frac{571}{2{,}488{,}320x^4} + \cdots\right].    (1.30)
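To see how sharp these expansions are even at moderate argument, here is (1.28) truncated after the 1/x⁷ term, in Python (the truncation point is our choice; since the series is asymptotic, adding terms beyond a point would eventually hurt rather than help):

```python
import math

def stirling_gamma(x):
    # Stirling's series (1.28) for Gamma(x+1), truncated after the 1/x^7 term.
    # Accuracy improves rapidly as x grows.
    series = 1/(12*x) - 1/(360*x**3) + 1/(1260*x**5) - 1/(1680*x**7)
    return math.sqrt(2*math.pi) * x**(x + 0.5) * math.exp(-x) * math.exp(series)

for x in (5.0, 10.0, 20.0):
    print(x, stirling_gamma(x), math.gamma(x + 1))
```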
These are divergent asymptotic expansions, yielding extremely good approximations. The remainder terms for (1.27) and (1.28) are each less in absolute value than the first term that is neglected, and they have the same sign. Barnes's expansion [E. W. Barnes, 1874–1953] is less well known, but it is useful for half integers:

    \Gamma\left(x + \tfrac{1}{2}\right) \sim (2\pi)^{1/2} x^x e^{-x} \exp\left[-\frac{1}{24x} + \frac{7}{2880x^3} - \frac{31}{40320x^5} + \cdots\right].    (1.31)

Also

    \frac{\Gamma(x+a)}{\Gamma(x+b)} \sim x^{a-b}\left[1 + \frac{(a-b)(a+b-1)}{2x} + \cdots\right].    (1.32)
These also are divergent asymptotic expansions. Series (1.31) has accuracy comparable to (1.27) and (1.28).

The beta function B(a, b) is defined by the Eulerian integral of the first kind:

    B(a, b) = \int_0^1 t^{a-1} (1-t)^{b-1} \, dt,    a > 0,  b > 0.    (1.33)
Clearly B(a, b) = B(b, a). Putting t = u/(1+u) gives

    B(a, b) = \int_0^\infty \frac{u^{a-1}}{(1+u)^{a+b}} \, du,    a > 0,  b > 0.    (1.34)
The relationship between the beta and gamma functions is

    B(a, b) = \frac{\Gamma(a)\,\Gamma(b)}{\Gamma(a+b)},    a, b \neq 0, -1, -2, \ldots.    (1.35)
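Identity (1.35) can be checked by pitting a brute-force evaluation of the integral (1.33) against the gamma-function ratio. A sketch (helper names and grid size are ours; the midpoint rule is adequate here because a, b > 1 keep the integrand bounded):

```python
import math

def beta_integral(a, b, n=100_000):
    # Midpoint-rule evaluation of the Eulerian integral (1.33); illustrative only.
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += t ** (a - 1) * (1 - t) ** (b - 1)
    return total * h

def beta_gamma(a, b):
    # B(a, b) through the gamma-function identity (1.35).
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

print(beta_integral(2.5, 3.0), beta_gamma(2.5, 3.0))
```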
The derivatives of the logarithm of Γ(x) are also useful, though they are not needed as often as the gamma function itself. The function

    \psi(x) = \frac{d}{dx}[\ln \Gamma(x)] = \frac{\Gamma'(x)}{\Gamma(x)}    (1.36)

is called the digamma function (with argument x) or the psi function. Similarly

    \psi'(x) = \frac{d}{dx}[\psi(x)] = \frac{d^2}{dx^2}[\ln \Gamma(x)]
is called the trigamma function, and generally

    \psi^{(s)}(x) = \frac{d^s}{dx^s}[\psi(x)] = \frac{d^{s+1}}{dx^{s+1}}[\ln \Gamma(x)]    (1.37)

is called the (s + 2)-gamma function. Extensive tables of the digamma, trigamma, tetragamma, pentagamma, and hexagamma functions are contained in Davis (1933, 1935). Shorter tables are in Abramowitz and Stegun (1965). The recurrence formula (1.19) for the gamma function yields the following recurrence formulas for the psi function:

    \psi(x+1) = \psi(x) + x^{-1}

and

    \psi(x+n) = \psi(x) + \sum_{j=1}^{n} (x+j-1)^{-1},    n = 1, 2, 3, \ldots.    (1.38)
Also

    \psi(x) = \lim_{n \to \infty} \left[\ln(n) - \sum_{j=0}^{n} (x+j)^{-1}\right]
            = -\gamma - \frac{1}{x} + \sum_{j=1}^{\infty} \frac{x}{j(x+j)}    (1.39)
            = -\gamma + (x-1) \sum_{j=0}^{\infty} [(j+1)(j+x)]^{-1}    (1.40)
and

    \psi(mx) = \ln(m) + \frac{1}{m} \sum_{j=0}^{m-1} \psi\left(x + \frac{j}{m}\right),    m = 1, 2, 3, \ldots,    (1.41)
where γ is Euler's constant (≈ 0.5772156649…). An asymptotic expansion for ψ(x) is

    \psi(x) \sim \ln x - \frac{1}{2x} - \frac{1}{12x^2} + \frac{1}{120x^4} - \frac{1}{252x^6} + \cdots,    (1.42)

and hence a very good approximation for ψ(x) is ψ(x) ≈ ln(x − 0.5), provided that x ≥ 2. Particular values of ψ(x) are

    \psi(1) = -\gamma,    \psi\left(\tfrac{1}{2}\right) = -\gamma - 2\ln(2) \approx -1.963510\ldots.
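The recurrence (1.38) combined with the asymptotic series (1.42) yields a compact digamma evaluator. This sketch (the helper name and the shift threshold of 10 are our choices) reproduces the particular values just quoted:

```python
import math

def digamma(x):
    # psi(x) via the recurrence (1.38), used to push the argument above 10,
    # followed by the asymptotic expansion (1.42).
    assert x > 0
    shift = 0.0
    while x < 10.0:
        shift -= 1.0 / x
        x += 1.0
    return shift + math.log(x) - 1/(2*x) - 1/(12*x**2) + 1/(120*x**4) - 1/(252*x**6)

EULER_GAMMA = 0.5772156649015329
print(digamma(1.0), -EULER_GAMMA)                    # psi(1) = -gamma
print(digamma(0.5), -EULER_GAMMA - 2*math.log(2))    # psi(1/2)
```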
1.1.3 Finite Difference Calculus

The displacement operator E increases the argument of a function by unity: E[f(x)] = f(x+1), E[E[f(x)]] = E[f(x+1)] = f(x+2). More generally,

    E^n[f(x)] = f(x+n)    (1.43)

for any positive integer n, and we interpret E^h[f(x)] as f(x+h) for any real h. The forward-difference operator Δ is defined by

    \Delta f(x) = f(x+1) - f(x).    (1.44)

Noting that Δf(x) = f(x+1) − f(x) = E[f(x)] − f(x) = (E − 1)f(x), we have the symbolic (or operational) relation

    \Delta \equiv E - 1.    (1.45)
If n is an integer, then the nth forward difference of f(x) is

    \Delta^n f(x) = (E-1)^n f(x) = \sum_{j=0}^{n} \binom{n}{j} (-1)^j E^{n-j} f(x)
                  = \sum_{j=0}^{n} \binom{n}{j} (-1)^j f(x+n-j).    (1.46)

Also, rewriting (1.45) as E = 1 + Δ, we have

    f(x+n) = (1+\Delta)^n f(x) = \sum_{j=0}^{n} \binom{n}{j} \Delta^j f(x).    (1.47)
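The binomial expansion (1.46) is easy to check mechanically. A small Python sketch (the helper name is ours):

```python
from math import comb

def forward_difference(f, x, n):
    # n-th forward difference via the binomial expansion (1.46).
    return sum((-1)**j * comb(n, j) * f(x + n - j) for j in range(n + 1))

f = lambda x: x**3
# The third difference of a cubic is the constant 3! = 6;
# the fourth difference vanishes.
print(forward_difference(f, 0, 3), forward_difference(f, 0, 4))  # prints 6 0
```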
Newton's forward-difference (interpolation) formula [I. Newton, 1642–1727] is obtained by replacing n by h, where h may be any real number, and using the interpretation of E^h[f(x)] as f(x+h):

    f(x+h) = (1+\Delta)^h f(x) = f(x) + h\,\Delta f(x) + \frac{h(h-1)}{2!}\,\Delta^2 f(x) + \cdots.    (1.48)
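A direct implementation of (1.48) shows the interpolation at work. The sketch below (our helper; it is exact when f is a polynomial whose degree does not exceed the number of available differences) recovers f(0.5) for f(x) = x²:

```python
def newton_interpolate(values, h):
    # Newton's forward-difference formula (1.48): interpolate f(x0 + h)
    # from values f(x0), f(x0+1), ... given at unit intervals.
    diffs = list(values)
    result = diffs[0]
    coeff = 1.0
    for j in range(1, len(values)):
        diffs = [diffs[i + 1] - diffs[i] for i in range(len(diffs) - 1)]
        coeff *= (h - (j - 1)) / j          # builds h(h-1)...(h-j+1)/j!
        result += coeff * diffs[0]
    return result

# Exact for a quadratic: f(x) = x^2 sampled at x = 0, 1, 2, 3
vals = [0.0, 1.0, 4.0, 9.0]
print(newton_interpolate(vals, 0.5))  # f(0.5) = 0.25
```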
The series on the right-hand side need not terminate. However, if h is small and Δ^n f(x) decreases rapidly enough as n increases, then a good approximation to f(x+h) may be obtained with but few terms of the expansion. This expansion may then be used to interpolate values of f(x+h), given values f(x), f(x+1), …, at unit intervals. The backward-difference operator ∇ is defined similarly, by the equation

    \nabla f(x) = f(x) - f(x-1) = (1 - E^{-1}) f(x).    (1.49)
Note that ∇ ≡ ΔE^{-1} ≡ E^{-1}Δ. There is a backward-difference interpolation formula analogous to Newton's forward-difference formula. The central-difference operator δ is defined by

    \delta f(x) = f\left(x + \tfrac{1}{2}\right) - f\left(x - \tfrac{1}{2}\right) = (E^{1/2} - E^{-1/2}) f(x).    (1.50)

Note that δ ≡ ΔE^{-1/2} ≡ E^{-1/2}Δ. Everett's central-difference interpolation formula [W. N. Everett, 1924– ]

    f(x+h) = (1-h)f(x) + h f(x+1) - \tfrac{1}{6}(1-h)[1-(1-h)^2]\,\delta^2 f(x)
             - \tfrac{1}{6}h(1-h^2)\,\delta^2 f(x+1) + \cdots

is especially useful for computation. Newton's forward-difference formula (1.48) can be rewritten as

    f(x+h) = \sum_{j=0}^{\infty} \binom{h}{j} \Delta^j f(x).    (1.51)
If f(x) is a polynomial of degree N, this expansion ends with the term containing Δ^N f(x). Applying the difference operator to the descending factorial x^{(N)} gives

    \Delta x^{(N)} = (x+1)^{(N)} - x^{(N)}
                   = (x+1)x(x-1)\cdots(x-N+2) - x(x-1)(x-2)\cdots(x-N+1)
                   = [(x+1) - (x-N+1)]\,x(x-1)\cdots(x-N+2)
                   = N x^{(N-1)}.    (1.52)

Repeating the operation, we have

    \Delta^j x^{(N)} = N^{(j)} x^{(N-j)},    j \le N.    (1.53)

For j > N we have Δ^j x^{(N)} = 0.
Putting x = 0, h = x, and f(x) = x^n in (1.51) gives

    x^n = \sum_{k=0}^{n} \binom{x}{k} \Delta^k 0^n = \sum_{k=0}^{n} \frac{S(n,k)\,x!}{(x-k)!},    (1.54)

where Δ^k 0^n / k! in (1.54) means Δ^k x^n / k! evaluated at x = 0 and is called a difference of zero. The multiplier S(n, k) = Δ^k 0^n / k! of the descending factorials in (1.54) is called a Stirling number of the second kind. Equation (1.54) can be inverted to give the descending factorials as polynomials in x with coefficients called Stirling numbers of the first kind:

    \frac{x!}{(x-n)!} = \sum_{j=0}^{n} s(n,j)\,x^j.    (1.55)
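The "difference of zero" characterization of S(n, k) translates directly into code; the sketch below (helper names are ours) also verifies identity (1.54) itself:

```python
from math import comb, factorial

def stirling2(n, k):
    # S(n, k) = Delta^k 0^n / k!, a "difference of zero", from the
    # finite-difference expansion (1.46) evaluated at x = 0.
    return sum((-1)**j * comb(k, j) * (k - j)**n for j in range(k + 1)) // factorial(k)

def falling(x, k):
    # descending factorial x^(k) = x(x-1)...(x-k+1)
    out = 1
    for i in range(k):
        out *= x - i
    return out

# Check (1.54): x^n = sum_k S(n,k) x^(k)
x, n = 7, 5
print(x**n, sum(stirling2(n, k) * falling(x, k) for k in range(n + 1)))
```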
These notations for the Stirling numbers of the first and second kinds have won wide acceptance in the statistical literature. However, there are no standard symbols in the mathematical literature. Other notations for the Stirling numbers are as follows:

    First Kind                          Second Kind                     Reference
    s(n, j)                             S(n, k)                         Riordan (1958)
    \binom{n-1}{j-1} B_{n-j}^{(n)}      \binom{n}{k} B_{n-k}^{(-k)}     Milne-Thompson (1933)
    S_n^{(j)}                           \Delta^k 0^n / k!               David and Barton (1962)
    S_n^{(m)}                           \mathfrak{S}_n^{(m)}            Abramowitz and Stegun (1965)
    S_n^j                               \mathfrak{S}_n^k                Jordan (1950)
    S_n^j                               \sigma_n^k                      Patil et al. (1984)
    S(n, j)                             Z(n, k)                         Wimmer and Altmann (1999)
Both sets of numbers are nonzero only for j = 0, 1, 2, …, n, k = 0, 1, 2, …, n, n > 0. For given n or given k, the Stirling numbers of the first kind alternate in sign. The Stirling numbers of the second kind are always positive. An extensive tabulation of the numbers and details of their properties appear in Abramowitz and Stegun (1965) and in Goldberg et al. (1976). The numbers increase very rapidly as their parameters increase. Useful properties are

    [\ln(1+x)]^j = j! \sum_{n=j}^{\infty} \frac{s(n,j)\,x^n}{n!},    (1.56)
    (e^x - 1)^k = k! \sum_{n=k}^{\infty} \frac{S(n,k)\,x^n}{n!}.    (1.57)

Also

    s(n+1, j) = s(n, j-1) - n\,s(n, j),    (1.58)

and

    S(n+1, k) = k\,S(n, k) + S(n, k-1),    (1.59)

    \sum_{j=m}^{n} S(n,j)\,s(j,m) = \sum_{j=m}^{n} s(n,j)\,S(j,m) = \delta_{m,n},    (1.60)

where δ_{m,n} is the Kronecker delta [L. Kronecker, 1823–1891]; that is, δ_{m,n} = 1 for m = n and zero otherwise.

Charalambides and Singh (1988) have written a useful review and bibliography concerning the Stirling numbers and their generalizations. Charalambides's (2002) book deals in depth with many types of special numbers that occur in combinatorics, including generalizations and modifications of the Stirling numbers and the Carlitz, Carlitz–Riordan, Eulerian, and Lah numbers.

The Bell numbers are partial sums of Stirling numbers of the second kind,

    B_m = \sum_{j=0}^{m} S(m, j).

The Catalan numbers are

    C_n = \frac{1}{n+1} \binom{2n}{n}.

The Fibonacci numbers are

    F_0 = F_1 = 1,  F_2 = F_0 + F_1 = 2,  F_3 = F_1 + F_2 = 3,  F_4 = F_2 + F_3 = 5, \ldots.

Their generating function is g(t) = 1/(1 - t - t^2). The Narayana numbers are

    N(n, k) = \frac{1}{n} \binom{n}{k} \binom{n}{k-1}.
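The recurrences (1.58) and (1.59) make it easy to tabulate both kinds of Stirling numbers and to confirm the orthogonality relation (1.60); the same table gives the Bell numbers as row sums. A sketch (all helper names are ours; the Fibonacci convention F₀ = F₁ = 1 follows the text):

```python
from math import comb

def stirling_tables(nmax):
    # s: signed first kind via (1.58); S: second kind via (1.59)
    s = [[0] * (nmax + 1) for _ in range(nmax + 1)]
    S = [[0] * (nmax + 1) for _ in range(nmax + 1)]
    s[0][0] = S[0][0] = 1
    for n in range(nmax):
        for k in range(nmax + 1):
            prev_s = s[n][k - 1] if k > 0 else 0
            prev_S = S[n][k - 1] if k > 0 else 0
            s[n + 1][k] = prev_s - n * s[n][k]
            S[n + 1][k] = k * S[n][k] + prev_S
    return s, S

s, S = stirling_tables(8)

# Orthogonality (1.60): sum_j S(n,j) s(j,m) = delta_{m,n}
for n in range(1, 9):
    for m in range(1, 9):
        total = sum(S[n][j] * s[j][m] for j in range(m, n + 1))
        assert total == (1 if m == n else 0)

bell = [sum(row) for row in S]                        # B_m = sum_j S(m, j)
catalan = [comb(2 * n, n) // (n + 1) for n in range(6)]
narayana = lambda n, k: comb(n, k) * comb(n, k - 1) // n

fib = [1, 1]                                          # F_0 = F_1 = 1
while len(fib) < 8:
    fib.append(fib[-1] + fib[-2])

print("Bell:", bell[:6])        # [1, 1, 2, 5, 15, 52]
print("Catalan:", catalan)      # [1, 1, 2, 5, 14, 42]
print("Fibonacci:", fib[:6])    # [1, 1, 2, 3, 5, 8]
```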
1.1.4 Differential Calculus

Next we introduce from the differential calculus the differential operator D, defined by

    D f(x) = f'(x) = \frac{df(x)}{dx}.    (1.61)

More generally,

    D^j x^N = N^{(j)} x^{N-j},    j \le N.    (1.62)
Note the analogy between (1.53) and (1.62). If the function f(x) can be expressed in terms of a Taylor series, then the Taylor series is

    f(x+h) = \sum_{j=0}^{\infty} \frac{h^j}{j!} D^j f(x).    (1.63)

The operator D acting on f(x) formally satisfies

    \sum_{j=0}^{\infty} \frac{(hD)^j}{j!} \equiv e^{hD}.    (1.64)
Comparing (1.48) with (1.63), we have (again formally)

    e^{hD} \equiv (1+\Delta)^h    and    e^{D} \equiv 1 + \Delta.    (1.65)
Although this is only a formal relation between operators, it gives exact results when f(x) is a polynomial of finite order; it gives useful approximations in many other cases, especially when D^j f(x) and Δ^j f(x) decrease rapidly as j increases. Rewriting e^D ≡ 1 + Δ as D ≡ ln(1 + Δ), we obtain a numerical differentiation formula

    f'(x) = D f(x) = \Delta f(x) - \tfrac{1}{2}\Delta^2 f(x) + \tfrac{1}{3}\Delta^3 f(x) - \cdots.    (1.66)
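Formula (1.66) can be tried out on unit-spaced function values. The sketch below (our helper; the number of retained terms is an arbitrary choice) is exact for polynomials whose degree does not exceed the number of terms kept:

```python
def derivative_from_differences(f, x, terms=6):
    # Numerical differentiation formula (1.66):
    # Df = Delta f - (1/2) Delta^2 f + (1/3) Delta^3 f - ...
    # built on unit-spaced values f(x), f(x+1), ...
    row = [f(x + i) for i in range(terms + 1)]
    result = 0.0
    for n in range(1, terms + 1):
        row = [row[i + 1] - row[i] for i in range(len(row) - 1)]
        result += (-1) ** (n + 1) * row[0] / n
    return result

f = lambda x: x**3 - 2*x
print(derivative_from_differences(f, 2.0), 3 * 2.0**2 - 2)  # both 10.0
```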
(This is not the only numerical differentiation formula. There are others that are sometimes more accurate. This one is quoted as an example.) Given a change of variable, x = 1 + t, we have

    [D^k f(x)]_{x=1+t} = D^k f(1+t).    (1.67)
Consider now the differential operator θ, defined by

    \theta f(x) = x D f(x) = x f'(x) = x \frac{df(x)}{dx}.    (1.68)
This satisfies

    \theta^k f(x) = \sum_{j=1}^{k} S(k,j)\,x^j D^j f(x)    (1.69)

and

    x^k D^k f(x) = \theta(\theta-1)\cdots(\theta-k+1) f(x).    (1.70)

Also

    [\theta^k f(x)]_{x=e^t} = D^k f(e^t),    (1.71)

    e^{-ct}[\theta^k f(x)]_{x=e^t} = (D+c)^k [e^{-ct} f(e^t)],    (1.72)

and

    x^c \theta^k [x^{-c} f(x)] = [e^{ct} D^k \{e^{-ct} f(e^t)\}]_{e^t=x}
                               = [(D-c)^k f(e^t)]_{e^t=x} = (\theta - c)^k f(x).    (1.73)
The D and θ operators are useful for handling moment properties of distributions.

Lagrange's expansion [J. L. Lagrange, 1736–1813] for the reversal of a power series assumes that if (1) y = f(x), where f(x) is regular in the neighborhood of x_0, (2) y_0 = f(x_0), and (3) f'(x_0) \neq 0, then

    x = x_0 + \sum_{k=1}^{\infty} \frac{(y-y_0)^k}{k!} \left[\frac{d^{k-1}}{dx^{k-1}} \left(\frac{x-x_0}{f(x)-y_0}\right)^k\right]_{x=x_0}.    (1.74)

More generally,

    h(x) = h(x_0) + \sum_{k=1}^{\infty} \frac{(y-y_0)^k}{k!} \left[\frac{d^{k-1}}{dx^{k-1}} \left\{h'(x) \left(\frac{x-x_0}{f(x)-y_0}\right)^k\right\}\right]_{x=x_0},    (1.75)

where h(x) is infinitely differentiable. (This expansion plays an important role in the theory of Lagrangian distributions; see Section 2.5.)

L'Hôpital's rule [G. F. A. de L'Hôpital, 1661–1704] is useful for finding the limit of an indeterminate form. If f(x) and g(x) are functions of x for which lim_{x→b} f(x) = lim_{x→b} g(x) = 0, and if lim_{x→b} [f'(x)/g'(x)] exists, then

    \lim_{x \to b} \frac{f(x)}{g(x)} = \lim_{x \to b} \frac{f'(x)}{g'(x)}.    (1.76)
The use of the O, o notation (Landau's notation) [E. Landau, 1877–1938] is standard. We say that

    f(x) = o(g(x))  as  x \to \infty    if    \lim_{x \to \infty} \frac{f(x)}{g(x)} = 0
and

    f(x) = O(g(x))  as  x \to \infty    if    \left|\frac{f(x)}{g(x)}\right|  remains bounded as x \to \infty.