Developments in Dual System Estimation of Population Size and Growth
Edited by Karol J. Krotki
The University of Alberta Press 1978
First published by The University of Alberta Press, Edmonton, Alberta, Canada, 1978

This book has been published with the help of a grant from the Humanities Research Council of Canada, using funds provided by the Canada Council.

Copyright © 1978 The University of Alberta Press

Canadian Shared Cataloguing in Publication Data
Main entry under title: Developments in dual system estimation of population size and growth
Includes bibliographies.
ISBN 0-88864-017-X
1. Demography - Methodology - Addresses, essays, lectures. 2. Population forecasting - Addresses, essays, lectures. I. Krotki, Karol J.
HB881.D49 301.32 C77-002073-9
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form, or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the copyright owner.

Design: P. Bartl, Dept. of Art and Design
Cover design: P. Bartl, Chi Lee
Printed in Canada by Printing Services of The University of Alberta
Contents
Acknowledgments
Preface, by W. Parker Mauldin, Acting President, the Population Council, New York
Introduction

Chapter 1 The Role of PGE/ERAD/ECP Surveys Among the Endeavours to Secure More and Improved Demographic Data, by Karol J. Krotki
  1.1 The need for improved demographic data
  1.2 An overview of the available techniques
  1.3 The theory behind the PGE/ERAD/ECP technique
  1.4 The PGE handbook and some essential PGE/ERAD/ECP features
  1.5 An ideal PGE/ERAD/ECP exercise
  1.6 An overview of the world uses of the PGE/ERAD/ECP technique
  1.7 Modes of securing the advantages of the technique
  1.8 Inadvertent modes of destroying the advantages of the technique
    1.8.a Matters affecting independence
    1.8.b Matters affecting matching procedure
    1.8.c Matters affecting samples
    1.8.d Matters affecting organization
  1.9 Quasi PGE/ERAD/ECP estimates
  1.10 The paucity and importance of costing data
  1.11 The past history and the probable future of the technique
  Discussion by Lee L. Bean

Chapter 2 The State of the Art in Dual Systems for Measuring Population Change, by H. Bradley Wells and Daniel G. Horvitz
  2.1 Introduction
  2.2 Sample design
    2.2.a Some general sampling considerations
    2.2.b The need for realistic parameters
    2.2.c Rotation designs
    2.2.d Some aids to improved estimation
  2.3 The data collection elements of the dual collection system
    2.3.a The essential elements of a dual collection system
    2.3.b The continuous recording procedure
    2.3.c The household survey procedure
    2.3.d Matching procedures
    2.3.e Field identification procedures and other factors common to both collection procedures
  2.4 Conclusion
  Discussion by Ivan P. Fellegi

Chapter 3 Dual Estimation in Demography Employing Time Series and Cross Section Data, by P. Krishnan
  3.1 Introduction
  3.2 Pooling cross section and time series data
  3.3 Best estimator
  3.4 Suggested estimation strategies
    Estimator 1
    Estimator 2
    Estimator 3
  An appendix to chapter 3: Correlation pattern in Canadian data on births and deaths
  Discussion by Eli S. Marks

Chapter 4 Dual System Estimators Based on Multiplicity Surveys, by Monroe G. Sirken
  4.1 Introduction
  4.2 Counting rules
  4.3 Multiplicity estimators
  4.4 Dual system multiplicity estimators
  4.5 Variance of sample estimators
  4.6 Relative precision of different rules
  4.7 Conclusion
  Appendix A to chapter 4: Variance of DSM sample estimators
  Appendix B to chapter 4: Simplification of Hr
  Discussion by Ivan P. Fellegi

Chapter 5 The Collection of Demographic Data in Francophone Africa and Liberia Using the PGE/ERAD/ECP System, by Francois Pradel de Lamaze
  5.1 Dual collection
  5.2 Some observations
  5.3 Some undertakings in Madagascar
  5.4 The Tunisian experiment
  5.5 The case of Senegal and Cameroun
  5.6 An endeavour in Algeria
  5.7 Morocco: an extended experiment
  5.8 Dual collection in Liberia
  Discussion by William Seltzer

Chapter 6 The Egyptian Study to Measure Vital Rates: Some Estimates by Dual Collection, by K. E. Vaidyanathan
  6.1 Introduction
  6.2 Study design
  6.3 Findings of the study
  6.4 An evaluation of the estimates
  6.5 Conclusion
  Discussion by William Seltzer

Chapter 7 Some Practical Problems Suggested by the Application of the PGE/ERAD/ECP System in Morocco, by Mohamed Rachidi
  7.1 Moroccan demography
  7.2 The periodic household survey and its rotating objective
  7.3 Cluster size and some consequences
  7.4 The efficiency of personnel: resident and outside
  7.5 The numbering of structures
  7.6 Organization in the field
    7.6.a Interval between two household surveys
    7.6.b Relation between rounds in periodic surveys
    7.6.c Events in medical institutions
    7.6.d The problem of supervision in the field
  Discussion by Charles Nobbe

Chapter 8 PGE/ERAD/ECP Matching Experiences in Morocco, by El Arbi Housni, Frances Notzon, and Marie-Daniele Picket
  8.1 Features of field work most relevant to matching
  8.2 The purpose of matching
  8.3 Experimental matching: procedure used
  8.4 Experimental matching: problems encountered
  8.5 Production matching
    8.5.a Vital rates
    8.5.b Field verification
  8.6 Conclusion
  8.7 An appendix to chapter 8: The use of an experimental study for reaching decisions on matching rules, by Gad Nathan
    8.A.1 General
    8.A.2 Data available from an experimental matching study
    8.A.3 The variances of the estimator and its estimation
    8.A.4 Comparison of matching rules
    8.A.5 A proposal for a decision process

Chapter 9 The PGE/ERAD/ECP System of Data Collection in Africa and a Comparison of its Results With Those of Analytic Techniques, by Roderic P. Beaujot
  9.1 Introduction
  9.2 Some African experiences with the PGE/ERAD/ECP
  9.3 Analytic techniques
  9.4 Some African experiences with analytic techniques
  Discussion by Ansley J. Coale

Chapter 10 The Role of Dual System Estimation in Census Evaluation, by Eli S. Marks
  10.1 The importance of publishing a census evaluation
  10.2 Methods of census evaluation and their biases and variances
  10.3 Internal and external consistency analysis
  10.4 Dual system estimation
  10.5 The two types of PES (post-enumeration survey)
  10.6 A PES with a PGE/ERAD/ECP approach
  10.7 Out-of-scope error in a PES
  10.8 Widening the available options for census evaluation
  Discussion by William Seltzer
    PGE/ERAD/ECP issues
    Census evaluation issues
    General survey research issues
  An appendix to chapter 10: PGE/ERAD/ECP evaluation of the Korean and Paraguayan results
    10.A.1 Introduction to the Korean and Paraguayan results
    10.A.2 Overall results: Korea
    10.A.3 Overall results: Paraguay
    10.A.4 Differences in completeness between migrants and non-migrants
    10.A.5 Differences in completeness between types of area
    10.A.6 Completeness differences correlated with sex and age
    10.A.7 Differences in completeness related to household composition and migration status
    10.A.8 Post-stratification
    10.A.9 Handling of migrants with "insufficient information for matching"

Chapter 11 The 1974 Post-Enumeration Survey of Liberia: A New Approach, by Eli S. Marks and John C. Rumford
  11.1 The traditional PES approach
  11.2 Dual system estimation
  11.3 Independence
  11.4 Other biases
  11.5 Handling of migrants
  11.6 Use of one-way matching
  11.7 Elimination of field verification
  11.8 PES results
  11.9 Some defects of the Liberian PES
  11.10 Conclusion

Chapter 12 The Problem of Independence and Other Issues
  12.1 C. Scott on Sources of error in the dual system approach
  12.2 V.H. Muhsam on The bias of the PGE/ERAD/ECP estimates due to overenumeration
  Discussion by Eli S. Marks

Appendix International Association of Survey Statisticians Meeting, Vienna, 1973: Organizer's Report
Glossary
About the Authors
References
Index
List of Tables
1.1 An overview of the known world involvement of PGE/ERAD/ECP techniques
1.2 Summary of world involvement in PGE/ERAD/ECP
1.3 Survey costs divided by office/field and by primary/secondary units
1.4 Total and proportionate survey costs
1.5 Costs with varying types of surveys
2.1 Estimated completeness rates and crude birth rates by area, periods under different recording operations, switch-back trial; Colombia, 1971-73
4.1 Estimates of 6,0,0,and fl based on alternative counting rules for enumerating deaths
4.2 Proportion of deaths that were missed by type of housing unit in the survey experiment, Los Angeles, July-October 1969
5.1 Estimated completeness rates in two Madagascar dual collections, 1967-70
5.2 Estimated completeness rates in two dual collection Sheikhdoms of Tunisia, 1968-69
5.3 Estimated completeness rates in dual collection allowing for delayed registration, two Sheikhdoms of Tunisia, 1968-69
5.4 Vital rates in Liberia, 1969
5.5 Migrants found by the procedures of a PGE/ERAD/ECP enquiry. Liberia: District of Voinjama, 1969
6.1 Dual collection estimation of vital events in the Lower Egypt Survey, 1965-66
6.2 Estimated vital rates from the two dual system procedures and the percentage completeness, the Lower Egypt Survey, 1965-66
6.3 A comparison of the dual collection vital rates with those of civil registration
7.1 Results of matching 1972 births and deaths in Morocco
8.1 Experimental matching of birth documents by characteristics and tolerance limits in the CeRED region of Morocco, 1972-73
8.2 Experimental matching of death documents by characteristics and tolerance limits in the CeRED region of Morocco, 1972-73
8.3 Matching outcomes of combinations of characteristics for birth documents in the CeRED region of Morocco, 1972-73
8.4 Matching outcomes of combinations of characteristics for death documents in the CeRED region of Morocco, 1972-73
8.5 Production matching of vital events in the CeRED region of Morocco, 1972-73
8.6 Crude vital rates per 1,000 population in the CeRED region of Morocco, 1972-73
8.7 PGE/ERAD/ECP estimates of vital events in the CeRED region of Morocco, 1972-73, according to various homogeneous groupings
8.8 The outcome of field verification of vital events reported in the urban and rural parts of the CeRED region of Morocco, 1972-73
9.1 Background information of PGE/ERAD/ECP studies conducted in seven African countries
9.2 Estimated completeness of reported births and deaths for each collection procedure in six African studies
9.3 Percentage distribution of estimated total births, by PGE/ERAD/ECP category, for six African studies
9.4 Percentage distribution of estimated total deaths, by PGE/ERAD/ECP category, for six African studies
9.5 Stable population and PGE/ERAD/ECP estimates for Egypt
9.6 Stable population estimates for Morocco, 1960, 1965, 1971
9.7 Analytic and PGE/ERAD/ECP estimates for Liberia
10.1 Estimates of completeness in the 1972 census of Paraguay
10.2 Estimates of completeness in the 1970 census of Korea
10.3 Completeness of census enumeration by age, sex, and area for Paraguay 1972 PES
10.4 Estimated census completeness by relationship to household head for Paraguay 1972 PES
10.5 Estimated census completeness by migration status for Paraguay 1972 PES
11.1 Persons recorded in both the census and the post-enumeration survey by age and sex: Liberia, 1974
11.2 Persons recorded in the post-enumeration survey by age and sex: Liberia, 1974
11.3 Estimated completeness of the census by age and sex: Liberia, 1974
List of Figures
1.1 Schematic presentation of the categories of events obtained by a dual collection system
1.2 Hypothetical examples of the outcome of three sets of matching rules: average, strict, relaxed
2.1 The four types of registration procedure
2.2 Minimum questionnaire content for record keeping and matching operations
2.3 Relative differences in average number of children ever born and proportion surviving for self and proxy reporters, ever married women, urban and rural areas, Mindanao Center for Population Studies, July 1972
2.4 Estimated completeness rates by reported month of birth, recording, and survey procedures, by area, Dual Record Study, Mindanao Center for Population Studies, January-June 1972
2.5 Interrelations between error and sample size for different types of survey
4.1 Counting rules tested in a multiplicity survey experiment for enumerating deaths, Los Angeles, July-October 1969
7.1 Summary results of most useful personnel by area and type of event, Morocco, July 1971-December 1972
8.1 Characteristics and tolerance limits used in experimental matching of birth records
8.2 Characteristics and tolerance limits used in experimental matching of death records
8.3 Events with reports in both sources by matching status in each source
8.4 Breakdown of reports by matching status
8.5 Hypothetical example: data on matching rules
8.6 Hypothetical example: relationship between rule A(1) and A(2)
11.1 The questionnaire used in the Liberian post-enumeration survey, Form PES2
Acknowledgements
The first thanks are due to Dr. Ivan P. Fellegi, Assistant Chief Statistician of Canada, who, in his capacity as organizer of the first meeting of the International Association of Survey Statisticians in Vienna, August 1973, invited the editor of this book to organize a session on dual system estimation in demography. The original four contributors of invited papers, the two authors of contributed papers, and the four discussants were happy to be joined later by eight other contributors and four discussants. We were all united in our interest in producing high quality data and obtaining the greatest possible analytic use from them. We may have disagreed in the pursuit of this objective, but we can only hope that in the process we helped to clarify some of the issues.

Mr. Benson Morah, then a graduate student at the University of Alberta, acted as assistant editor for all the parts of the book. Miss A. Candace Fedoruk, then a graduate student at the University of Alberta and currently a Civil Servant in Ottawa, translated chapters 5 and 7 from their original French. Mr. Khalid Siddiqui, a graduate student at the Population Studies Center of the University of Michigan, worked out the tables in chapter 1. Figures 2.3, 2.4 and 2.5 were produced in the Population Research Laboratory at the University of Alberta by Mrs. Ilze Hobin. William Seltzer edited chapter 7.

Finally, the appreciation of the whole international community of statisticians and demographers is expressed to the countless field workers in all five continents who laboured, often under trying conditions, to obtain more and better demographic data and in this manner increased our understanding of and knowledge about humanity, its present, and future.
Preface W. Parker Mauldin
Many countries of the world do not have statistical systems that generate reasonably accurate data on population size and rate of population growth. Censuses are often one-time affairs with no continuing professional growth for planning and supervising data collection and subsequent analysis. Even so, data on population size are generally thought to be moderately good, in spite of the fact that we do not know the population of Nigeria within a figure of 10 million persons, nor that of China within 50, or possibly 100, million. But there has been an appreciable improvement in census data during the past 25 years and there is enough interest to ensure that further improvements are likely. But much time is yet to pass before adequate attention is given to systematic and scientific assessment of the accuracy of census data. Vital statistics systems are less complete and far less well maintained than are censuses. Most current estimates of the numbers and rates of births and deaths and of natural increase in less developed countries are based on sample surveys or are derived from censuses; they are not based on actual counts of births and deaths. The prospects for upgrading vital statistics systems to a moderately high level within a decade or so are not promising and, therefore, we shall continue to be dependent on census data and surveys. During the past decade or two there have been a number of promising new theories and techniques backed with some empirical data for the collection of vital statistics by one-time sample surveys. Indeed, the largest effort in man's history to collect current information on levels of and changes in fertility, the World Fertility Survey, is based on a single survey in each of about 40 less developed countries with about 19 of the more developed countries also participating in this important undertaking. 
Such surveys have been the norm for almost 40 years, and with recent improvements and with the care that is going into this massive effort, undoubtedly a great deal of useful information will be generated.

This book explores a different philosophy, a different approach: that of collecting vital statistics and estimating population size by two independent systems, and comparing the results on a name-by-name basis. Such systems are more complicated than single systems, and typically are more costly. They are subject to a number of problems. But they have one striking advantage for the statistician who seeks certainty of knowledge as to the accuracy of results. If the two systems give somewhat different results, the statistician knows that there are deficiencies in at least one of the systems, and a field check will disclose whether the deficiencies are confined to one system or if they exist in both systems. Such a result is not comforting but it reveals the need for further improving the systems. Sometimes technicians are acutely aware that there are problems with a single system, but less rigorous technicians often do not see or understand the existence and nature of such defects.

This book discusses a number of theoretical issues related to dual systems of data collection, practical problems that arise in carrying out such systems, reports in detail on selected surveys (particularly in Africa where vital statistics systems are notably weak), and summarizes actual surveys as well as the state of the art. It is an important and timely book because as the world seeks a new economic order it must have basic information about population as well as about the economy in order to measure changes, and to judge the process of progress.

New York, N.Y. January 1976
W. Parker Mauldin Acting President The Population Council
Introduction
The ink was not yet dry on the final draft of the PGE handbook (Population Growth Estimates: A Manual of Vital Statistics Measurement, Marks et al., 1974) when the organizer of the first conference of the International Association of Survey Statisticians, Vienna, August 1973, set up a session on dual system estimation. Five of the 12 chapters in this book were presented at the conference (Wells and Horvitz, Sirken, Pradel, Vaidyanathan, and Marks); two were presented orally in general outline (Krotki and Scott-Muhsam-Marks); and five were invited after the conference (Krishnan, Rachidi, Housni et al., Beaujot, and Marks-Rumford) to complete the subject matter. An important appendix has also been added (Nathan).

The purpose of this collection of 12 chapters is to present the next stage in the development of the vastly expanding field of dual system estimation. The previous stage was wound up with the publication of the PGE handbook. At various points, but particularly at the end of chapter 1 and throughout chapters 3 (Krishnan), 4 (Sirken), and 10 (Marks), we attempt to glean an indication of future developments that are likely to take place in this field.

A further purpose is to continue the clarification required to preserve the advantages of the method. The broth of the PGE method [as it is known among English-speaking professionals for Population Growth Estimation, ERAD among francophones (Estimation du Rythme d'Accroissement Démographique), ECP among Spanish speakers (Estimación del Crecimiento de la Población)] runs the risk of being spoiled by too many cooks itching to interfere with the ideal recipe and unable to buckle down to producing a superior but standard product. For example, several well-documented errors have been committed in the practice of social surveying when lists of households (of increasing staleness) were used as the sampling frame instead of area samples.
The PGE/ERAD/ECP technique counsels the numbering of structures as an extension and strengthening of the detailed mapping required by area sampling. Yet the critical distinction between structures and households is blurred, and alleged "refinement" is proposed in the form of listing of families and individuals (Scott, 1973b: 412).

The 12 chapters fall conveniently into three groups. After the background information and articulation of some of the critical issues posed before collectors of demographic data in chapter 1, the next three chapters picture the present and some possible developments in the future state of the art in the field. The middle four chapters (five through eight) report on recent empirical evidence and actual experiences. The four final chapters deal with particularly topical and troublesome issues, on the solution of which much of the future development of the method will depend. They also report some practical application of the innovative techniques suggested theoretically in other parts of the book.

Not all the topics covered in this book are of equal importance, nor have all the contributors been brainwashed by the PGE/ERAD/ECP gospel to the same extent. Consequently not all the chapters have the same status and not all of them have been included with the same degree of editorial imprimatur. However, we thought that there would be a distinct advantage in assembling declared partisans of the technique, sceptical spectators, partisans of other techniques who are downright hostile to dual estimation, and also those who tend to walk off at a tangent while professing sympathy; the latter kind is likely in the process to destroy the essence of the technique, but may, who knows, stumble into further fruitful developments.

The content of the 12 chapters can be conveniently summarized. First the background story is given and the key issues are highlighted in chapter 1, in a somewhat partisan fashion for the sake of brevity of presentation. In chapter 2, Wells and Horvitz state carefully some of the more recent results achieved and indicate some of the work being done in the area that will, it is hoped, show some promise. In chapter 3 (Krishnan) the classical textbook approach is taken of aiming at the minimum variance and/or maximum likelihood. The focus on variance is at variance with a principal PGE handbook concern to worry at all times about biases as well. The mathematics will be of no immediate utility to demographers and statisticians but they suggest some interesting possibilities in the future, and the emphasis on regression techniques in combining past and present data is innovative. Chapter 4 (Sirken) continues the well-known work of the author in multiplicity estimators and takes up their role in dual systems.
The four chapters on the general review of francophone Africa and Liberia (5: Pradel), on Egypt (6: Vaidyanathan), the detailed report on Moroccan fieldwork (7: Rachidi), and the presentation of the matching procedures worked out for Morocco (8: Housni et al.) give some data that were never published in accessible sources, some data that were never published in anglophone media, and much experience and many conclusions based on actual endeavours in the field. The empirical derivation of matching rules of the Moroccan practitioners in the field reported in chapter 8 is sharply contrasted in the appendix (Nathan) with the complex and demanding principles of such a derivation based on theoretical requirements and experimental consideration. Chapter 9 (Beaujot) continues these contributions with rare data and rare experiences but, more importantly, poses a problem quite critical for the future of the PGE/ERAD/ECP: how far can we hope to go with analytic techniques while using grossly imperfect data in the estimation of vital rates. In chapter 10 Eli Marks makes a contribution in the all important field of census evaluation. As stated in section 10.4 "the theory and practice of dual system estimation for census evaluation has exhibited very little development during the past 10 to 15 years" even though increasingly competent exercises were carried out, particularly in Canada and the United States. For purposes of census evaluation all single methods of evaluation have weaknesses and a combination of dual system evaluation with demographic techniques is proposed. In an appendix to chapter 10 and in chapter 11 (Marks and Rumford) practical applications of some of the new advances carried out in Korea, Paraguay, and Liberia are reported upon and supported with original data not otherwise generally available, and not otherwise conveniently accessible. 
Chapter 12 concludes, somewhat unsatisfactorily, with some fireworks that are, we hope, more illuminating than heat generating. Some of the issues are left hanging in the air but they will not go away. They will stay with us for a long time, until they are solved. This unsatisfactory state of the technique, and for that matter of much of the field of data collection, or rather the incomplete state of the art, has consequences. Many of our pages read like papers at an unfinished series of seminars at a university, but they cannot be finished seminar-like through further theoretical clarification. They can only be clarified through further field work on the basis of the issues defined in this volume and in the previous writings.

What the editor did attempt was to clarify the avowed differences of opinions and differences of experiences among the contributors either in the form of additional endnotes in chapters or through direct textual intervention (sanctioned in each case by the contributor concerned). One non-negotiable textual intervention concerned terminology. This editor is not an academic prima donna and tries to follow previous examples whenever possible, but the situation is difficult. The POPLAB dictionary (Chanlett, 1974) does not follow the terminology of the PGE handbook. Not all POPLAB writers follow the POPLAB dictionary. French language literature tempts the reader with expressive words like "exploitation" for analysis and "confrontation" for matching, but one resists these temptations for the sake of uniformity. Our choices are given in the glossary. They fall halfway between the PGE handbook and the POPLAB dictionary.

It might be useful at the outset to admit the necessity for certain peculiarities essential to the PGE/ERAD/ECP success. One of them, unreasonable to an outsider on commonsense grounds, is the insistence on not correcting data, especially not in the field. Hence the insistence on separating supervision from evaluation, on allowing no reconciliation, etc. In section 10.7 we suggest that even in census evaluation "Reconciliation is not necessary with the newer PES techniques" (post-enumeration survey). We, therefore, note with some concern that Parker Mauldin in the preface hopes that field checks "will disclose whether deficiencies are confined to one system or if they exist in both systems."
He knows much about PGE/ERAD/ECP (that is why he was invited to write the preface) and he must remember that this is a sure way to invite collaboration between the two systems.

Some of the elaborations proposed by previous writers in this field have since been found, either empirically or on further reflection, to be not very fruitful pursuits in the practical world. While in theory (Chakraborthy, 1963; Das Gupta, 1964) there is no difference between matching documents from two sources or from three and more sources, there seems to be no actual advantage in using more than two sources (Marks et al., 1974: 401-402).

Probably the largest PGE/ERAD/ECP enquiry ever undertaken is that of the Indian Sample Registration Scheme. Falling under the incomplete type of a dual system enquiry discussed in section 1.8, it raises the fundamental question of how to evaluate such incomplete PGE/ERAD/ECP enquiries. The objection to the complete exercise apparently is that not all biases can be removed (there is always some lingering suspicion that some biases remain). The reasoning then continues that it is better in such circumstances not to remove any.

Neither the PGE handbook nor the present volume pretends to be a definitive work. "Such a work could only be attempted after many of the design alternatives discussed had been systematically tested" (Marks et al., 1974: 5). When clean and definite progress has been made in almost every direction in the field of population studies and in an understanding of their problems, one segment refuses stubbornly to yield: that of vital statistics registration. It is different from the others inasmuch as for results it requires a sustained effort. PGE/ERAD/ECP also requires sustained attention and long nursing, but being more selective, and operating on a more reduced scale, it provides a viable alternative.

Ann Arbor, Michigan, January 1975
KJK
Chapter 1 The Role of PGE/ERAD/ECP Surveys Among Endeavours to Secure Improved Demographic Data
Karol J. Krotki
1.1 The need for improved demographic data

The inadequacy of demographic data for the majority of humanity, owing to the paucity and recency of population censuses, and even more to the non-existence or incompleteness of the registration of vital events, has caused much ingenious searching for substitute data. Another stimulus to such a search is given by the need for data required by formulators of population policies, and by designers and evaluators of family planning programmes. Both groups have been brought into existence because of the emergence of the so-called "population problem" in precisely those countries with inadequate demographic data.

Demographers, particularly those working under the auspices of the United Nations, have made important contributions to the development of analytic techniques (e.g. United Nations Population Studies, 39 and 42) for the assessment and correction of faulty and unreliable data. Such assessment has often resulted in an estimation of demographic parameters, possibly and probably closer to reality than the reported data. While we are very much interested in and in sympathy with these endeavours, we show in chapter 9 the strong subjective element that enters into the estimation process and which throws into question these analytic techniques when in untried hands.

There is no substitute for real data. Hence the ingenuity in the development of new analytic techniques aimed at increasing the usefulness of such unreliable data as are made available traditionally was paralleled by ingenuity in the collection of new types of data and new methods of collecting data.
The sources of demographic data for purposes of population studies can be grouped in four categories: (i) the population census, often combined with a census of housing, sometimes agriculture, occasionally fertility or some other specific, population-related topics; (ii) civil registration of vital events, such as births, deaths, stillbirths, adoptions, marriages, divorces, separations; (iii) various kinds of demographic and socio-economic surveys, either sample censuses, if this contradiction in terms can be permitted, or intensive enquiries to fill in gaps left by insufficient content of census questionnaires; (iv) recent and often unorthodox attempts to obtain demographic data in the absence of traditional data or to meet new needs usually involving special surveys, particular record keeping, and new observational techniques. This chapter is concerned with the last category, but it does not deal with such sources as electoral lists, poll-tax lists, administrative records with a demographic content, or health records.1 The chapter deals with surveys or other related arrangements specifically made to secure more and improved demographic data. It ignores situations where demographic data become available as a by-product of other activities. Among the many new survey and non-survey sources of data, several distinct groups have become important in recent years: the service statistics from family planning programmes, KAP surveys (knowledge, attitude, practices in family planning), a group of intensive and expensive population observations, usually out of reach of underdeveloped countries (such as the national and local fertility surveys in the United States, Canada, Great Britain, and other countries, often involving panels of women pursued for years in the quest for data); highly specialized surveys such as those based on photogrammetry, called by some writers sociogrammetry; multiround surveys employing retrospective questions; multiround surveys employing the household change technique; and finally PGE/ERAD/ECP surveys which are the subject of this book.

1.2 An overview of the available techniques

At the conclusion of the previous section in the listing of types of surveys we did not mention the single round survey. Although the most popular and most frequently used kind of survey, it has become by now somewhat discredited and it is common ground among many professionals working in the field that it should be used only in exceptional circumstances. As a rule the results are of such quality that it is probably better not to have them. One is less misled in their absence. The main reason why single round surveys give unsatisfactory results when evaluated is the unreliability of human recollection. Events of interest to the survey are forgotten or the most relevant respondents are no longer there or the events are placed in the wrong time scale. These disadvantages are compounded by the ease with which such a survey can be launched.
Even when there is the weakest statistical organization, the smallest sum of money, when the purpose is not clearly understood, some kind of survey can be put together, some kind of group of interviewers can be trained and let loose onto the population. It is true that the particularly dismal results obtained from single round surveys, which will be quoted later in this section, are made to look particularly bad because of this compounding aspect. Let it, therefore, be repeated that they suffer from an inherent disadvantage, irrespective of the organization; namely the frailty of the human memory, unchecked and unsupported in conditions of a single round survey. The single round survey is also called ad hoc with the slightly pejorative implication of the phrase. "It is generally accepted that this cannot give reliable information on births and deaths" (Scott and Coker, 1971: 253). "Single retrospective surveys cannot be depended upon to provide valid or reliable estimates of births and deaths" (Mauldin, 1966: 652). "On the basis of evidence a number of those with wide experience in the developing world have concluded that single ... [round] household surveys using simple retrospective questions on births are particularly vulnerable to error" (Seltzer, 1973: 23). Similar experiences have been reported from francophone Africa. "L'humilité est certes nécessaire car les difficultés sont considérables et inédites; il semble en particulier que les résultats obtenus se prêtent difficilement à un traitement d'ensemble; l'importance et la diversité des erreurs d'observation qui paraissent les affecter rendent d'autre part fort illusoire toute tentative d'ajustement général" (Blanc, 1964: 85). The first Population Growth Estimation seminar held at the Population Council offices in New York on 23 January, 1969 gathered a group of international experts and practitioners together to consider the matter.
Its conclusion was that "a single system is unsatisfactory in the present state of the art in underdeveloped countries; a balanced
dual system is essential" (Mauldin and Bean, 1969). The language of numerous reports emanating from or through the United Nations is often diplomatic, but unequivocal is the ultimate meaning that single round surveys with retrospective questions are no good (e.g. United Nations, 1971: 157). A carefully sifted collection of evidence for the purposes of the PGE handbook (Marks et al., 1974: 54, table 2.11) has shown that the completeness of reporting deaths in single rounds in a large number of Asian surveys is hardly higher on the average (51 percent) than that of the official civil registration (49.5). It is precisely because of the inadequacy of the civil registration system that the need for alternative sources of data arises. What is the use of the alternatives if they cannot do better? For births the completeness seems to be better (67 percent) than in registration (56 percent). The range of reported completeness in the instances for which data are available is markedly wider for single round surveys than in civil registration. The reported range was 68 and 67 percentage points for the completeness of births and deaths respectively in single round surveys as against 43 and 49 percentage points in civil registration. In other words, not only was the completeness virtually the same as in the unsatisfactory civil registration, but the "product" was of a more uneven standard. "An attempt was made during the 1960 census of Morocco to obtain information on births and deaths in the last twelve months. 
An examination of a number of census returns suggested to census officials that these data were very incomplete" (Sabagh and Scott, 1967: 760, note 2).2 In endnote 8 to our chapter 9, the survey in North Carolina is recalled where among 3,000 households selected from birth and death registrations, only 92 percent reported to the survey the previously registered births and only 83 percent the previously registered deaths (Horvitz, 1966).3 Still, our evidence suggests that some single round surveys do less poorly than others. In other words, by trying hard some better results can be obtained in spite of the disadvantages inherent in the single rounds. At the end of section 10.5 we discuss the meaning, effectiveness, and operationality of doing a "better" job, selecting "best" enumerators, giving them "intensive training," and "reconciliation" of confusing answers and the like. This belief in the ultimate improvement of human endeavour is, of course, endearing, but less realistic than the approach of using all sources and a readiness to live in error, but measure it. To go back again to the end of the previous section, the choice between methods of collecting more and improved data for demographic purposes is, thus, limited to multiround surveys employing retrospective questions, multiround surveys employing household change technique, and the PGE/ ERAD/ ECP surveys. Multiround surveys with retrospective questions collect data more than once from the same and/ or from rotating households; their main advantage stems from the long term commitment of the organizers and the consequent cumulation of experiences and presumably gradual improvement in data. Multiround surveys recording household changes collect data at least twice from the same households; their main advantage lies in the record of household structures from the earlier round being available to the interviewers during the more recent round. 
The PGE/ERAD/ECP surveys consist of two independent means of collecting data, hence "dual collection system" and a case-by-case comparison of the reports collected; these surveys are unique in having a built-in means of evaluating the completeness rates of each of the two procedures in the dual system and, consequently, the joint omissions in both procedures. The corollary of the advantage of multiround surveys (accumulated experience and increased quality) is the risk of losing with time the financial and administrative support necessary for continuing existence. This disadvantage does not affect the consideration because, if the commitment is really that light, then the results are probably not worth having. The advantage of experience on the other hand is considered essential in countries where the evaluation of quality of data is taken seriously.
Statistics Canada was told in the early 1970s to augment in size and content its monthly survey at the cost of — it is said — $8 million (Canadian). It was thought necessary to take many monthly rounds before the survey would be ready and public data released. In the context of our interest, such surveys have also the advantage of overlapping recall periods when taken sufficiently frequently. For example, during the original Pakistan PGE exercise, the periodic survey was taken quarterly with a recall period of 12 months. Thus each vital event had the theoretical chance of being reported four times. The "extra" information was useful in getting a solid grip on the household and helping in the matching operations (about which more later). Some of this additional value might have been due to the frequency of the survey rounds as opposed to the number of times the same event was reported. The Pakistan exercise had the benefit of two tricks of trade, small but important, as will be argued later: for three of each four visits the interviewer had the benefit of the household composition as reported at previous interviews, but not the records of the reported vital events; every fourth interview the chain was broken and a virgin listing of households and their composition had to be started. The National Sample Survey (NSS) of India tried for a long time to obtain retrospectively data on births and deaths. When each survey was taken as a "stand alone" survey, then their results were not different from the discouraging results elsewhere in the world with single round surveys. As elsewhere, the reported rates were so low that arbitrary adjustments had to be made so that published data would not offend too much against commonsense.4 In the 7th, 14th, and 15th rounds of the NSS interesting innovations were attempted (Som, 1973). 
The interviews were held monthly with a reference period of one year: each vital event thus had a chance of being reported 12 times, and 12 monthly consecutive interviews produced 144 observations. Each month of the year was represented 12 times. The averaging proceeded by the 12 months for which the recall period was one month, for which the recall period was two months, for which the recall period was three months and so on. An estimate of memory decay was in this way obtained, the celebrated-in-literature "Som curve" was created, and the events with "zero" recall period were estimated. The experiment was extended into a two-year recall period so that data became available for "last year" and "the year before last". If one believes that some respondents tend to advance the dates of vital events, that is, bring events from the 13th and 14th month into the 12th month then the Som curve is underestimating memory decay; and vice versa if respondents stretch out the recall period and put events further back. The two examples of Pakistan and India highlight two uses to which multiround surveys can be put usefully: strengthening the grip of the survey organization on the population, and multiplying the amount of data available for averaging purposes. A third legitimate use, matching events reported during different rounds for the purposes of the PGE/ ERAD/ ECP technique, will be discussed later on in this section. The attempt to collect data through the household change technique in multiround surveys raises other questions. The faith in this method rests on the list of members of households collected during the first round and the changes recorded in this list during subsequent rounds. It is clear that the method reduces some omissions. Persons who have died since the previous round, even the smallest babies, should be reported as having died, without fail, provided they were entered onto the list of members of the household during the previous round. 
The method also reduces some of the problems of dating. For some of the events that have taken place between visits the error of dating can be at most equal to the length of the interval between rounds.5 It is clear that a method "in which the enumerator merely has to check a list of names provided, obviously offers a strong temptation to the enumerator who is less than fully conscientious" (Scott, 1973: 7). Actually the problems with the household change
technique are more fundamental than merely tempting lazy enumerators. The first complaint is that it ties the survey to the listed households and their composition as reported at the time of the first round.6 Partly for this reason, and partly because of the interviewer's tendency to take the list of members of the household seriously, the survey works with a "panel". Such a panel starts as a not very representative sample (most samples suffer some selective omissions) and gradually becomes less and less representative of the current, live population. The problem of the de facto and the de jure definitions and their application to mobile members of the society are severe whatever the procedure adopted, but the household change technique is probably more clumsy in dealing with them than others. Similarly the reason for rotating a sample and avoiding staleness arises in all methods, but the household change technique has the additional disadvantage of "tying" the interviewer to whatever happens to have been listed in the first round. The problem is not the development of "significant respondent resistance" (Scott, 1973:7) with repeated calls on the same household. The technique tends to bias the sample, in place of throwing freely and probabilistically the enumeration net onto the population and its vital events in each round afresh. The long history of unsatisfactory results delivered by the household change technique (India, 1958-59; Cambodia, 1958-59; Brazil, 1961; Morocco, 1961-62; Indonesia, 1961-62; Nigeria, 1965-66; Tunisia, 1968-69) was not sufficiently discouraging and it was further applied in Algeria, 1970; Senegal, 1970-71; Lesotho, 1971-72; Haiti, 1972; Honduras, 1972; and Saudi Arabia, 1972. 
For countries where reports are available the experience was no different.7 For a probably atypical experience of overenumeration, more significant than underenumeration in births and deaths reporting, see the Moroccan multi-purpose survey of 1961-62 (Sabagh and Scott, 1967: 768). The experience is most sobering in that it dissects the complexities and unreliability of retrospective reporting, whatever the method.8 The high priest of the household change technique tried at the last moment to rescue something of it by suggesting that the changes between rounds could be done actually "in the blind" (Scott, 1973: 7).9 Having then obtained two or more listings of household members these are matched on a case-by-case basis, that is individual by individual, not just totals. However, the suggestion that the PGE/ ERAD/ ECP technique has "something in common with multiround surveys" is dismissed after a page of thinking aloud as "generally... impracticable" (ibid). The technique, should there ever be a field worker and researcher unrealistic enough to try it, involves "reconciliation" in the field. Obviously the differences between the two rounds: the original and the "blind" could not be accepted at the face value. They could merely be differences of recording and not real differences. As will be shown in sections 1.7 and 1.8 of this chapter reconciliation is a dirty word in PGE/ ERAD/ ECP parlance. For this and for other reasons the change of household technique can have nothing in common with a true PGE/ ERAD/ ECP survey. The fact that some PGE/ ERAD/ ECP surveys used the three months round is "analagous to the multiround surveys" (Scott, 1973:6) using the household change approach formally but without much expectation of meaning. We come now to the PGE/ ERAD/ ECP technique proper. Its arithmetical justification is given in the next section of this chapter. 
Some of its underlying principles are explained in other sections, particularly section 1.4 that summarizes the PGE handbook (Marks et al., 1974). Its history is sketched in section 1.11. Here we limit ourselves to an introduction comparable to the introductions of the other methods. Briefly, the technique consists of two procedures of collecting data and a case-by-case comparison of individual reports. From this comparison or matching of records, three kinds of records result. There are records in respect of events reported by both procedures (two records for each event, one from each procedure); records for events reported by one procedure only (one record for each event, all from one procedure);
and records for events reported by the other procedure only (one record for each event, all from the other procedure). Provided that the two procedures and sets of events were independent of each other we are then justified, in accordance with section 1.4, in making an estimate of events with regard to which neither procedure produced a record. This picture is summarized in figure 1.1.

Figure 1.1 Schematic presentation of the categories of events obtained by a dual collection system

                            Procedure R (continuous recording)
Procedure S                 Events observed    Events not observed    All events
Events observed             N(RS) = M          U(S)                   N(S)
Events not observed         U(R)               U(R)U(S)/M = Z         V(S)
All events                  N(R)               V(R)                   N
Independence between the lines and independence between the columns implies the following relationships:

For procedure R: Pr(R) = N(R)/N = M/N(S) = U(R)/V(S)
For procedure S: Pr(S) = N(S)/N = M/N(R) = U(S)/V(R)

In short, the probability of inclusion is the same whether one talks about all events, or events caught by the other procedure, or events omitted. The problem of independence is sounded repeatedly throughout this chapter and in fact throughout all the other chapters. Particularly in sections 1.6 and 1.7 we show how simple devices can increase independence, and in section 1.8 how easily thoughtlessness can destroy it. Pradel in chapter 5 seems to be less convinced of the possibilities of ensuring independence and right to the very end of chapter 12 we are concerned with it.

The matching problem is simpler and is negotiable. It is obvious that if we work with very strict matching rules we will end up with few cases in the first category, many in the other two categories, and consequently with a large estimate of the fourth category. On the other hand, if we set up lax matching rules, we will finish with a large first category, two small lots in the second and third category, and consequently a small estimate for the fourth category. The outcome of a hypothetical case with three sets of matching rules has been summarized in figure 1.2.

Many features of carrying out a survey are applicable to PGE/ERAD/ECP surveys in the realm of sampling theory, sampling practice, questionnaire design, field work, office procedure and the like. The two needs of high independence and matching rules with a zero net error are peculiar to this technique. For that price, we obtain a method that is selfchecking. None of the others are. While further features of the PGE/ERAD/ECP technique are elaborated in other sections of this chapter and, indeed, in other chapters, it will be useful at this stage to return to the question of matching between rounds.
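The arithmetic behind figures 1.1 and 1.2 can be sketched in a few lines of Python. This is a minimal illustration and not part of the original text: the function name `dual_system_estimate` and its argument names are hypothetical, and the inputs are the hypothetical counts quoted in figure 1.2. The sketch assumes the two procedures are independent, as the relationships above require.

```python
def dual_system_estimate(matched, only_first, only_second):
    """Return (Z, N) for one dual collection exercise.

    matched      -- events reported by both procedures (M in figure 1.1)
    only_first   -- events reported by one procedure only
    only_second  -- events reported by the other procedure only
    Z is the estimated number of events missed by both procedures,
    N the estimated total number of events.
    """
    if matched <= 0:
        raise ValueError("at least one matched event is needed")
    missed_by_both = only_first * only_second / matched  # Z = U(R)U(S)/M
    total = matched + only_first + only_second + missed_by_both
    return missed_by_both, total


# Hypothetical counts from figure 1.2: (matched, one source only, other source only).
scenarios = {
    "average": (790, 60, 50),
    "strict": (750, 100, 90),
    "relaxed": (820, 30, 20),
}

for name, counts in scenarios.items():
    z, n = dual_system_estimate(*counts)
    print(f"{name:8s} Z = {z:5.2f}  N = {n:7.2f}")
```

Stricter rules shrink the matched cell and inflate the estimate, while laxer rules do the reverse, which is why the text asks for matching rules with a zero net error.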
The PGE handbook presents in chapter 4, and in some detail, three alternative systems. One of these is a comparison between two field procedures very similar to each other, namely single round surveys.

Figure 1.2 Hypothetical examples of the outcome of three sets of matching rules: average, strict, relaxed. (One procedure observed 840 events, the other 850)

Average                     Recording
                            Observed    Not observed
Observed                    790         60
Not observed                50          4
estimate of N = 904 (± 4 at 95%)

Strict                      Recording
                            Observed    Not observed
Observed                    750         100
Not observed                90          12
estimate of N = 952 (± 8 at 95%)

Relaxed                     Recording
                            Observed    Not observed
Observed                    820         30
Not observed                20          1
estimate of N = 871 (± 2 at 95%)

Single round surveys follow each other at, say, every six months with a recall period of, say, twelve months. In this manner, each vital event should be reported twice, once in each of two neighbouring surveys. Other combinations of frequencies and recall periods are possible as long as sufficient overlaps occur. The question then arises, why are single round surveys acceptable in this case? Why is not the sauce of the goose of PGE/ERAD/ECP good enough for the gander of the "stand alone" single round survey or at least for the household change technique? Primarily, the third example of the PGE handbook leans over backwards in stressing the need for independence and suggesting innumerable tricks of the trade to achieve it (Marks et al., 1974, e.g. 250-253, 425-428). Even so, between-round comparisons in the Pakistan PGE would not have been worthwhile because of the lack of independence between the rounds of the survey.10 In fact, we do not have enough experience in the world to talk with confidence about such an approach (survey with survey). The frequent and popular examples of
PGE/ERAD/ECP systems combine typically survey with civil registration (first example in the PGE handbook: 227-228) or survey with special registration (second example in the PGE handbook: 238-250). The Director-General of the next Central Statistical Office that will undertake a PGE/ERAD/ECP survey, can perform a real service to the international fraternity of statisticians, and probably to himself and his country, by trying a really well-founded third example suggested in the PGE handbook. We are aware of many other possible approaches in the field of measuring vital events. Those particularly hurtful are taken up in section 1.8. There is merit in others, but for the given outlay per hour of time and per unit of money they repay less than a correct PGE/ERAD/ECP survey. Among other approaches handicapped from the start we would list the following: pregnancy histories, expensively elaborate in their interview dynamics and resulting in longitudinal data unwieldy for traditional methods of analysis which are mostly cross sectional (Bogue and Bogue, 1967; Bogue and Bogue, 1970); maternity histories, less troublesome and somewhat more realistic than pregnancy histories, but also resulting in data surplus to available means of analysis; sample registrations following civil registration principles (Hauser, 1954; Cavanaugh, 1963) because of the basic incompatibility between the type of personnel, attitudes, finance, and a host of other considerations relevant to the establishment of a legal system and a statistical system (Linder, 1971); and panels of respondents (Vaidyanathan, 1973) because of the staleness of the sample increasing with time.

1.3 The theory behind the PGE/ERAD/ECP technique

The comparative novelty of the PGE/ERAD/ECP technique does not lie in a new principle. In fact, the technique is based on a proposition in probability that has been with us for a long time.
If two events take place independently of each other then the probability that they will take place simultaneously is equal to the probability of one of the events happening multiplied by the probability of the other event happening. In terms of the notation used in figure 1.1, if R takes place independently of S then the probability of R and S both taking place simultaneously is equal to the probability of R times the probability of S. The algebraic formulations are well known (e.g. Krotki, 1969; Krotki, 1971):

probability of (R and S) = probability of R times probability of S    (1.1)

where

probability of R = R/(R + non R)    (1.2)
probability of S = S/(S + non S)    (1.3)

Let it be recalled once more that equation (1.1) is true on the condition that the occurrence and frequency of R has no effect on the occurrence and frequency of S and vice versa. Equation (1.1) can then be rewritten as follows:

Pr(R) = [Pr(R and S)]/[Pr(S)]    (1.4)

The three terms of equation (1.4) can be redefined as follows: Pr P2 > .8, it still might be advisable to select a subsample of the k1 clusters for the periodic survey even if CR = Cs.
2.2.b
H. Bradley Wells and Daniel G. Horvitz
The optimum subsampling rate f = h/H (within clusters) for the periodic survey is also given by the right side of equation (2.1) above, under the additional assumptions that k2 = k1, that full-time resident recorders must be used with the continuous recording procedure in order to reduce costs and to maintain better statistical control, and that the intracluster correlation is zero. The effect of a positive intracluster correlation is to reduce the optimum within cluster subsampling rate for the periodic surveys. For P1 = P2 = .7, equation (2.1) suggests that within cluster subsampling would be desirable if Cs is somewhat larger than CR (when k2 = k1). The gain in precision brought about by subsampling within clusters and using the savings to increase the number of clusters should be weighed against the added design complexity. The above results, while providing some guidance, are still restricted to particular parameter combinations (i.e., h = H in the first instance and k2 = k1 in the second) and therefore yield only locally optimum designs. For countries or cultures for which estimates of the variances and covariances for N1, N2 and M for one or more cluster sizes are available, and for which an appropriate cost function and reasonably accurate variable cost coefficients are known, unconditional solutions for the optimum combination of the design parameters k1, k2, h and H should be computed. Other considerations enter into the choice of cluster size than merely that value of H which, together with appropriate choices of k1, k2 and h minimizes the relvariance of the estimated number of births (or the estimated birth rate) for a given survey budget. The cluster size must be such that boundaries for the area containing the cluster can be specified and easily identified in the field. As H decreases, the boundary identification problem increases.
This can lead to the inclusion of events which do not belong to sample clusters and to the exclusion of events which do belong. Also migration or local moves can become a significant factor for small clusters. Further, the variation in the adjusted number of events n may be somewhat greater for small clusters since the number matched, M, might then be too small for stable estimates of N resulting in a larger ratio bias. Thus, dual collection systems for estimating vital events and vital rates require somewhat larger cluster sizes than would ordinarily be appropriate in household surveys.

2.2.c Rotation designs

An important objective in most national vital event dual collection systems is to be able to detect changes in fertility and mortality, as well as to estimate their magnitude accurately. If an estimate of change from the preceding year were the only objective, then the same sample of clusters should be used for both years. If an estimate of average level for each of the two years were the only objective, then a fresh sample of clusters should be selected for the second year. Since there is interest in both change and level, a rotation sample in which some clusters are retained and other clusters are dropped each year would be most appropriate. The argument for some rotation of sample clusters is strengthened further by the need to minimize respondent conditioning effects and respondent fatigue, both of which can lead to unknown biases of response (Marks et al., 1974: 396 and 397).

2.2.d Some aids to improved estimation

Despite a sound sample design and appropriate field procedures with adequate controls for quality, the dual system estimate of the number of vital events may still be subject to unknown biases. Correlation bias occurs when the two procedures are not operating independently. Completeness bias occurs when vital events are included or
excluded erroneously in either or both procedures due to errors in reporting the time of the event or the geographic location of the event. When completeness of reporting is decidedly different for subgroups of the population for both procedures, correlation bias will result. This can be reduced by dividing the sample into appropriate groups (post-stratification), such as by age of mother or type of event (i.e. infant death and non-infant death) and estimating the number of events separately for each group. This procedure will reduce the correlation bias provided that each group is homogeneous with respect to completeness of reporting. Care should be taken that the groups are not so small as to increase the ratio bias substantially (Chandrasekaran and Deming, 1949). Completeness bias can be reduced in a similar fashion, that is by grouping the population or the events into homogeneous strata with respect to the probability of out-of-scope errors, preparing separate estimates for each group, and then combining the group estimates. For example, to adjust for boundary errors, events reported for households in areas on or near the boundaries of the cluster, or events reported in the first and last months of the reference period can be classified into a single stratum for events with a "high probability of out-of-scope error" and the remaining events into a second stratum. The usual matching and adjusted estimate can be computed for the latter stratum. The estimate for the first stratum would not be based on matching, but rather on an average of the events observed by the two procedures. Rules for classifying out-of-scope events are necessary, such that the number included in the first stratum balances (approximately) the estimated number of in-scope omissions (Marks et al., 1974). Completeness bias may be introduced by migration.
The continuous recording procedure may miss births which occurred to in-migrants prior to moving to the sample cluster, but during the survey period. The periodic survey may pick up such births, but miss births to out-migrants that the continuous recording picked up. In order to obtain an improved estimate, the sample population is divided into two groups. Those households which resided in a sample cluster for the entire period are placed in the first group. All migrant households are placed in the second stratum. Again, the usual dual system estimate n is computed for the first stratum. Births (Nn) for the households that moved out of the cluster can be estimated by multiplying the number reported by the reciprocal of the completeness rate for the periodic survey for the non-migrant households (i.e. N2/M). Births (N22) for the households that moved into the cluster can be estimated by multiplying the number reported by the reciprocal of the completeness rate for the continuous recording procedure for the non-migrant households (i.e. N1/M). The average of these latter two estimates for the migrant households is used for the second stratum estimate.
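The post-stratification idea used in this subsection (separate estimates per homogeneous group, then combined) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the counts are invented for the example, and the per-group estimate is taken to be N1 x N2 / M, the form implied by the completeness ratios N1/M and N2/M quoted above.

```python
def dual_system_total(n1, n2, matched):
    """Dual system estimate N1 * N2 / M for one group of events.

    n1, n2   -- events reported by the recording and the survey procedure
    matched  -- events reported by both (M)
    """
    if matched <= 0:
        raise ValueError("at least one matched event is needed")
    return n1 * n2 / matched


def post_stratified_total(strata):
    """Sum of separate dual system estimates, one per stratum."""
    return sum(dual_system_total(n1, n2, m) for n1, n2, m in strata)


# Invented counts (N1, N2, M) for two strata of 1,000 true events each:
# completeness is 0.9 for both procedures in the first stratum
# and 0.5 for both procedures in the second.
strata = [(900, 900, 810), (500, 500, 250)]

pooled = dual_system_total(
    sum(s[0] for s in strata),
    sum(s[1] for s in strata),
    sum(s[2] for s in strata),
)
stratified = post_stratified_total(strata)

print(f"pooled estimate     = {pooled:.1f}")
print(f"stratified estimate = {stratified:.1f}")
```

With these invented figures the pooled estimate falls short of the 2,000 events assumed, while the stratified estimate recovers them, because completeness differs between the groups for both procedures at once. In practice the strata must not be made so small that the ratio bias grows.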
2.3 The data collection elements of a dual collection system

2.3.a The essential elements of a dual collection system

The first three essential data collection elements of a dual collection system are:

i. A (continuous) recording procedure. This is the first source for recording births and deaths and it may be an existing record source such as civil registration. More commonly it is a special procedure somewhat different in nature from either the civil registration system or the usual sample survey.

ii. A (periodic) survey procedure. This almost always is designed specifically as the second source of data to be matched with the recording procedure. This may be a single
H. Bradley Wells and Daniel G. Horvitz
visit survey but more often it will be a longitudinal periodic (multiple visit) household survey. iii. Procedures for the matching of recording and survey events. These will include designs for record flow, data processing, and matching of records from both procedures as well as procedures for field recheck or verification of unmatched and doubtfully matched events. The dual system of data collection for measuring births and deaths is much more complex than the simple sum of the complexities due to two single procedures producing the same reports. It requires co-ordinating the design, definitions, training of field and office staff, scheduling of field work, and at the same time maintaining as much independence as is practicable between the two procedures. In designing data collection procedures for each of the procedures, a number of basic questions must be answered bearing in mind the objectives and the cost factors in relation to the precision required. These include questions concerning the content of event records and interview schedules; the wording of items; the field procedures, including frequency of visits to informants or households; the use of full-time versus part-time staff (part-time continuous or full-time intermittent employment); staff qualifications in terms of education and experience; residency of the field staff (local or non-local residents); training and supervisory procedures; and the flow of records and reports. Clearly the answers to these questions are interrelated. They are also related to sample design factors including cluster size, subsampling, and whether the sample is fixed or rotating, and to the levels of literacy of the survey population, and to the availability of qualified persons to staff the system at all levels. Whether or not subsampling is used, a relatively fool-proof field identification system is required as a fourth essential element in a dual collection system. iv. Field identification procedures. 
These are necessary in order that the sample units can be consistently and unambiguously located by field workers of either procedure. In the event that only a single survey is to be conducted for identifying events to be matched with a recording procedure that already exists (for example, measuring the completeness of civil registration), the field identification procedure must be capable of correctly assigning records from either procedure to the same (small) geographic areas in the matching process. Field identification procedures are crucial in minimizing the biases that can result if area boundaries of the two procedures do not correspond. Even when a good field identification system has been developed there will be errors in using it. See Cooke (1971) for a general description and Madigan et al. (1974) for a specific illustration of approaches to the field identification system.

2.3.b The continuous recording procedure

The major objective of a recording procedure is to obtain as complete a list as possible of births and deaths in the sample population. Recording procedures can encompass a wide range but may generally be classified into four groups based upon the combination of legal status and effort required of field staff as depicted in figure 2.1. Most civil registration systems fall into group i. Examples of dual collection efforts related to this are the U.S. birth registration tests of 1940, 1950, and 1968; Thailand, 1964-66; Singur 1946-47; and Vaso Town 1966-68 (Marks et al., 1974). We are not aware of any recording procedures in group ii nor of purely group iii procedures. Dual systems in which active efforts by field staff supplement the usual passive civil registration are relatively few but are illustrated by the Peruvian effort (Cavanaugh, 1963) and a large national system now operated in the Philippines by the Bureau of Census and Statistics.2 Such efforts are a combination of i and iii.
State of the art
Figure 2.1 The four types of registration procedure

                          Effort of field staff
Legal status              Passive      Active
Civil registration        i            iii
Special registration      ii           iv
Most recording procedures, including the current Indian system and the early Pakistan and Turkish efforts, fall into group iv. The dual collection systems being tested by POPLAB use group iv recording procedures. There are two important variations in the activities followed in recording procedures of group iv. These two recording variations can also be used together, resulting in three types ranked in terms of the intensity of recording contact with households: i. Informant or "routine round contact" procedure (contact household only if vital event is reported). Pakistan and early Indian experience were based upon this method. ii. Periodic household visits (whether or not vital event is reported). The early Turkish and the Liberian efforts were of this type. iii. Combination of i and ii. The POPLABs at Xavier University in Mindanao and Kenya are experimenting with this approach and India is now using it in some states. The recording procedure based on routine round (RR) contact must at a minimum include the following four activities by the recorder designed specifically to minimize household visits: i. Setting up a group of local persons (routine round contacts) who are expected by virtue of their professional or social status to be aware of vital events occurring in the sample area and who are willing to report them to the recorder. ii. Visiting the RR contacts on a regular basis (weekly or bi-weekly) and asking about families which experienced vital events. iii. Visiting those families for which events are reported and completing a detailed vital event report for each event. iv. Transmitting individual vital event reports to headquarters. In addition the recorder might optionally be required to maintain a special register of vital events, update a household listing,3 and up-date maps periodically. 
The recording procedure based upon repeated household visits is, in effect, a periodic survey since every household is supposed to be contacted at intervals, whether or not they have had a vital event. The household visit approach requires that the recorder, at a minimum: (i) visit each household at specified intervals and inquire directly about whether any vital events have occurred; (ii) complete a vital event report for each reported event; (iii) transmit the reports to headquarters. In addition the recorder may, as in the informant system, be required to maintain a register of vital events, and up-date household listings, maps, dwelling lists, and the numbering system. The recorder may also be required to report migrations, for example, as in Turkey and Liberia. In India, after the same sample units had been covered for a number of years, estimated completeness rates for the informant-based recording procedure began to decline. Recorders are now required to conduct quarterly household visits in addition to their usual visits to informants (India, 1972).
Preliminary results from part of a rather complicated experiment done in the Colombian POPLAB (table 2.1) show higher recording completeness rates in both rural and urban samples for household visits than for the informant system. Overall birth rates are also higher where the recording procedure used household visits. This suggests a lack of independence between the recording and survey procedures. The results are difficult to interpret since survey completeness was better than recording completeness when household visits were used in the first period, but poorer than recording completeness when household visits were used in the second period. There was an overall decline in survey completeness in the second period which was expected, since two three-month surveys were used in period I and one six-month survey in period II. On the other hand overall recording completeness increased in the second period. Taken at face value these preliminary Colombian results are disturbing because the estimated birth rates vary so greatly with the different factors and combination of factors. It is possible that unknown correlations may be responsible for the observed differences. If this is borne out in the more detailed analyses now underway it would suggest that more intensive efforts are required to develop data collection and estimation techniques appropriate for quasi-independent procedures rather than continuing to use the independence assumption.
Table 2.1 Estimated completeness rates and crude birth rates by area, periods under different recorder operations, switch-back trial*, Colombia, 1971-72.

                        Sequence of recorder activities and period
                        Household visits Period I,      Informants Period I,
                        informants Period II            household visits Period II
Area and procedure      Period I  Period II  Change     Period I  Period II  Change

Completeness rates (percent)
Rural area
  Recorder                 81        88        +7          68        90       +22
  Survey                   86        68       -18          73        65        -8
Urban area
  Recorder                 76        77        +1          62        87       +25
  Survey                   82        60       -22          70        70         0

Crude birth rates per 1,000
Rural area                 53        42       -11          41        58       +17
Urban area                 28        25        -3          29        33        +4

* At the end of Period I recorders in sample areas where monthly household visits had been used were trained to use the informant system in Period II and vice versa for the other half of the sample areas.
Note: Period I is 1 October, 1971 - 31 March, 1972; Period II is 1 April, 1972 - 30 September, 1972.
Source: Adapted from data in table 4 shown by permission of Departamento Administrativo Nacional de Estadística in Myers and Lingner (1973).
2.3.c The household survey procedure

Experience with survey techniques and procedures for measuring population phenomena is much more widespread and well known than experience with the recording procedure. There is, however, a much wider variety of survey options which can be used as a component in a dual collection system. The first objective of the survey procedure is to obtain reports of a high proportion of the vital events which occurred in the sample during a time period which overlaps completely with that of the recording procedure. Except in instances where the population base for vital rates is obtained from an outside source, such as a census, the second objective of the survey is to provide the population base. When the household survey is repeated periodically, a third objective can be to collect substantive data of special interest and this content may be varied from time to time.4

A major feature of any survey design is the frequency of interview. This can be crucial in a periodic multiround survey. On the one hand frequent interviewing may cause respondent conditioning and/or fatigue. On the other hand, interviews spaced too far apart may result in many survey households being lost because of migration and in under-reporting due to recall lapse about events of interest. A dual collection system requires a minimum of one survey to collect vital event reports for the same period prior to interview for which events were covered by the recording procedure. Theoretically there is no limit to the maximum frequency of survey interview, but the practical constraints of costs and respondent fatigue force limits to be set. The optimum survey frequency is not known, either for dual collection systems or for single surveys. Furthermore, what is optimal for one country may be less than optimal in another. The most popular survey frequency in continuing dual collection studies has been six months, although Pakistan and Thailand used three-month intervals.
Experimental testing of three, six and 12 month survey frequencies with nonoverlapping recall periods has been carried out in the Colombia POPLAB. Preliminary results are shown in table 2.2. Three-month frequencies were investigated only in Santander. With the exception of period II survey results for births in Santander (lines d and e) the results support the conclusion that survey coverage is more complete for more frequent visits; two three-month surveys are more complete than one six-month survey, and similarly two six-month surveys are more complete than one twelve-month survey. Differences are more marked for deaths than for births. Also the estimated completeness rates in Santander increased rather markedly from period I to period II, perhaps due to better training and practice or to increasing dependence between procedures or to a combination of these. It is also possible that the frequency of survey results have been confounded with other factors in the design (see table 2.1 and the discussion in section 2.3.b). The effect of using different frequencies with overlapping recall periods is being tested in India and in several other POPLABs. A closely related question is the length of the recall period for which respondents are asked to report vital events. A recall period which is longer than the interval between repeated interviews is called an overlapping period. In Pakistan interviews were conducted every three months but respondents at each interview were asked to report all events for the previous 12 months. Thus each event had four chances to be reported in the survey. In Turkey and Liberia interviews were done at six-month intervals, in January and July. 
The July interview used a non-overlapping recall period of only six months while the January interview used a recall period of 12 months, a six-month overlap.5 A major virtue of overlapping reference periods in a single system periodic multiround survey is that it minimizes the effects of time telescoping of events, provided
Table 2.2 Estimated completeness rates for births and deaths by recording and survey procedures by period and frequency of survey, Santander and Bolivar, Colombia, 1971-73.

                              Number and frequency            Births             Deaths
Study period and dates        of survey(s)                    Recorder  Survey   Recorder  Survey

Santander
Period I, Oct. 1971 to        a) Two 3-month                     71       85        68       65
March 1972                    b) One 6-month                     77       73        58       41
Period II, April 1972 to      c) One 6-month, Oct. 1972
March 1973                       to March 1973                   83       82        83       53
                              d) Two 6-month                     84       75        81       58
                              e) One 12-month                    85       76        85       54
Bolivar
Period II, April 1972 to      f) Two 6-month                     81       77        83       58
March 1973                    g) One 12-month                    80       65        75       46

Source: Data adapted from Colombia Departamento Administrativo Nacional de Estadística, 1973, "Primeros Resultados del Estudio ERED". Centro de Investigaciones en Métodos Estadísticos para Demografía, Bogotá, Junio 1973.
records from different rounds are matched and duplicates are eliminated. It is not clear that this virtue of a single system multiround survey will be desirable in a dual collection system. If it is used, the record keeping and matching operation within the survey itself increases in volume and quite likely in complexity. A compromise approach uses the overlapping recall period for the interview and then processes only those event records which are reported in the most recent nonoverlapping reference period of interest. This approach may improve time reporting and will minimize the number of reports for processing. Further research on the value of overlapping periods in a dual collection system is needed.

Measurement bias is a major consideration in determining optimal survey frequency and length of recall periods. The manner in which recall lapse, telescoping errors, respondent fatigue, respondent conditioning, and the changing composition of the sample over time influence measurement errors has not been systematically studied in dual collection systems, nor indeed in single systems, for measuring population change. As indicated by Marks et al. (1974) the range of choices for survey procedure designs depends so directly upon the recording procedure that it is unwise to make recommendations without considering a fairly wide range of feasible alternatives. If the objectives of the survey are to provide both birth and death events for matching and population data for the denominator, we would be inclined to recommend six-month survey intervals, provided the budget can support them.
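The compromise approach mentioned in this section (interview with an overlapping recall period, but process only the reports that fall in the most recent nonoverlapping reference period) amounts to a simple date filter. The sketch below is a minimal illustration under assumptions: each report is represented as a dictionary with a hypothetical "event_date" field, and the reference period is supplied by the analyst. Neither the field names nor the sample reports come from any of the studies cited.

```python
from datetime import date


def reports_to_process(reports, period_start, period_end):
    """Keep only reports whose event date lies in the reference period.

    Reports collected under a longer (overlapping) recall period but
    dated outside the reference period are set aside, not processed.
    """
    return [r for r in reports if period_start <= r["event_date"] <= period_end]


# Hypothetical reports from one interview round with a 12-month recall.
reports = [
    {"name": "A", "event_date": date(1971, 9, 14)},
    {"name": "B", "event_date": date(1972, 2, 10)},
    {"name": "C", "event_date": date(1972, 5, 3)},
]

# Assumed reference period of interest: the most recent six months.
kept = reports_to_process(reports, date(1972, 1, 1), date(1972, 6, 30))
print([r["name"] for r in kept])
```

Only the reports dated within the reference period are passed on, which is why the approach keeps the volume of records for matching close to that of a nonoverlapping design.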
2.3.d Matching procedures

The objective of the matching operation in a dual collection system is to yield minimum matching bias and matching variance as defined by Seltzer and Adlakha (1974), and in Marks et al. (1974). On the basis of experience to date Marks et al. (1974) recommend, and we agree, that matching be done by hand rather than machine or computer even in a large scale, country-wide dual collection system, especially in the developmental stages. It is also considered essential that matching procedures be developed specifically for the two field procedures as they actually operate in a given setting. A set of vital reports whose "true" match status is known is required to assess the value of using different characteristics in alternative matching rules. This is essentially a "boot-strap" operation as recommended by Marks et al. (1974) and summarized as follows by Madigan and Wells (1973).

"Developing matching procedures for a new dual collection system involves a number of steps:

i. Select or create a sample of 300-600 reports of births or deaths from each of the two procedures; these could come from a pretest or from the first stages of an actual collection program. The completed reports would of course have to contain the characteristics which are to be considered in developing the matching rules. Different characteristics will usually be required for deaths than for births.

ii. Match the events from the two procedures as accurately and carefully as possible. At this stage all information on the reports, and knowledge of cultural characteristics, and interviewing procedures should all be utilized implicitly in arriving at decisions as to what constitutes a match. It is recommended that this be done independently by three junior or senior professionals. After the three have independently agreed on a match or a non-match these are considered as 'true or correct'. If only two of the three agree on a report then all are asked to reconsider. After reconsideration the remaining reports on which the three disagree as well as the non-matches are sent back to the field for verification or additional information.6 After the field check some out-of-scope events may be eliminated and final decisions are made regarding the 'true' match status of the remaining reports thus creating the standard against which alternative matching rules can be judged.

iii. Erroneous matches and non-matches for individual matching characteristics are then compared over the feasible range of tolerance limits by matching only on single characteristics. The sum of erroneous matches and erroneous non-matches is defined as the gross matching error and their difference is defined as the net matching error for a characteristic.

iv. The data on gross and net matching error [serve] as an aid in choosing the characteristic set or subsets and the corresponding weight for each characteristic which will become the basis for the matching rules either explicit or implicit to be applied in the subsequent matching operation. There are usually an extremely large number of possible characteristic sets but a reasonably good explicit matching procedure can usually be developed by examination of the gross errors of individual characteristics. For an individual characteristic gross error shows the relative ability of a specific cutting point or tolerance limit to place an event in the correct class and tolerance limits should be chosen to minimize gross matching error for the characteristic. Net error, taking account of sign, indicates the direction of bias for the individual characteristic and therefore helps determine the relative weight which should be given to a match or non-match on that characteristic and vice versa for a variable which yields a high erroneous match rate".

One approach to choosing a set of characteristics with minimum net matching error is to rank the individual characteristics (at chosen tolerance limits) from low to
high in terms of erroneous matching error (reverse of discriminating power) and consider the first, the first two, and first three, etc. in combination and choose the set which minimizes net matching error for low gross matching error. Another approach is to assign different weights to the different characteristics so as to achieve the same goal. Weights may be based upon judgement and empirical evidence, i.e. the gross and net errors, and can be derived approximately without considering all possible characteristic sets. Alternatively, weights may be derived through statistical methods such as discriminant or cluster analysis, ideally by considering all combinations of characteristics and cutting points through computer analysis.

Although the above procedures have been proposed, no application has yet been reported in which these procedures have been compared within the same study against alternative rules. Madigan and his co-workers are conducting such studies (Madigan and Wells, 1973). Myers and Lingner (1973) described the characteristics and tolerance limits being used for matching in the various POPLABs and there were few similarities other than the use of names of decedents and newborns and their parents. There is great need for studies in which results of such "approximate" rules are compared with "optimum" rules based upon consideration of all combinations of characteristics.

2.3.e Field identification procedures and other factors common to both collection procedures

Several other factors which influence the design of interview, editing, and supervisory procedures are relevant. These all relate more or less directly to the content of one or both collection procedures. For convenience we include them in this section although most are discussed only briefly.
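Since the ability to match reports underlies the content requirements discussed in this section, a small numerical illustration of the gross and net matching error defined in section 2.3.d may be useful here. The sketch below is illustrative only. It assumes a single numeric characteristic (for example, the difference in days between the reported dates of an event in the two procedures), a rule that declares a match when that difference is within the tolerance limit, and a sign convention for net error of erroneous matches minus erroneous non-matches; the text says only "their difference". The function names and the sample pairs are hypothetical.

```python
def matching_errors(pairs, tolerance):
    """Gross and net matching error for one characteristic.

    pairs     -- list of (difference, truly_matched) for report pairs
                 whose 'true' match status is already known
    tolerance -- a pair is declared a match when difference <= tolerance
    """
    erroneous_matches = sum(
        1 for diff, true_match in pairs if diff <= tolerance and not true_match
    )
    erroneous_nonmatches = sum(
        1 for diff, true_match in pairs if diff > tolerance and true_match
    )
    gross = erroneous_matches + erroneous_nonmatches
    # Assumed sign convention: positive net error means too many matches.
    net = erroneous_matches - erroneous_nonmatches
    return gross, net


def best_tolerance(pairs, candidates):
    """Tolerance limit with the smallest gross matching error."""
    return min(candidates, key=lambda t: matching_errors(pairs, t)[0])


# Hypothetical standard set: (difference in days, true match status).
pairs = [(0, True), (1, True), (3, True), (5, True),
         (2, False), (6, False), (10, False)]

print(matching_errors(pairs, 3))
print(best_tolerance(pairs, [0, 1, 3, 5, 7]))
```

Ranking several characteristics by their gross error at the chosen limits, as the text suggests, would repeat this calculation once per characteristic.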
Individual vital event reports in each procedure must, as a minimum, include sufficient detail to determine whether the event matches or corresponds to an event for the same time period in the other procedure. If the survey is repeated periodically the same basic detail would be required in each round to match within the survey and eliminate duplicate survey event reports. Elimination of duplicate survey events is especially critical when there are overlapping reference periods for the survey (Sabagh and Scott, 1967). Minimum essential items for event reports in any survey which is part of a dual collection system to provide reports of births and deaths and denominator data for estimation of vital rates are shown in figure 2.2. Abernathy and Lunde (1972) described the record content in dual collection systems in India, Pakistan, Turkey, and Liberia. Marks et al. (1974) include a detailed checklist of suggested items for inclusion and an indication of how useful each item is for the different purposes of matching, demographic analyses, and survey control.

Figure 2.2 Minimum questionnaire content for record keeping and matching operations
Columns: Survey only (household schedule); Survey and recorder (birth report; death report).
Content rows: Geographic identification; Details on household members; Births in report period; Deaths in report period; Place event occurred; Particulars: infant/decedent; Mother/father; Father/wife.
[The X entries showing which content belongs to which schedule or report are not recoverable from this copy.]

Decisions must be made at the outset regarding counting rules for the numerator events, birth or death, and for the denominator, total population or women exposed to risk. Two common ways of calculating vital rates are: (i) on the basis of residence (de jure) i.e. events to residents during the year divided by the number of persons residing in the area at midyear, and (ii) on the basis of occurrence (de facto) i.e. events which occur in the area during the year divided by the number of persons occupying (living in) the area at midyear. Usually the denominators for crude rates, the total populations at midyear, are not very different for the two rates; however, for age specific rates or sex-age specific rates "residents" and "occupants" might differ more in some groups than others, depending upon migration rates and the definitions of migrants. Further, when vital events and population data are from civil registration and census (or projections of the census) respectively, the denominators often are identical in either case. Dual collection systems usually collect data for both numerator and denominator because they attempt to follow residents/occupants in sample areas over time. Migration, changes in household composition, and travel outside the sample area for delivery of an infant or for hospitalization before death create special problems. Decisions must also be made on handling vital events which occur to inmates of institutions such as prisons, hotels, and long-term hospitals.
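The two counting rules can be made concrete with a short sketch. It is only an illustration under assumptions: each event record carries two hypothetical flags, "to_resident" (the event occurred to a usual resident of the area) and "occurred_in_area" (the event took place inside the area), and the midyear counts of residents and occupants are supplied separately. The field names and figures are invented for the example.

```python
def crude_rates(events, midyear_residents, midyear_occupants, per=1000):
    """Crude vital rates under the de jure and de facto counting rules."""
    # De jure: events to residents, wherever they occurred.
    de_jure_events = sum(1 for e in events if e["to_resident"])
    # De facto: events occurring in the area, whoever they occurred to.
    de_facto_events = sum(1 for e in events if e["occurred_in_area"])
    return {
        "de_jure": per * de_jure_events / midyear_residents,
        "de_facto": per * de_facto_events / midyear_occupants,
    }


# Hypothetical events: three to residents inside the area, one to a
# resident who delivered elsewhere, two to visitors inside the area.
events = (
    [{"to_resident": True, "occurred_in_area": True}] * 3
    + [{"to_resident": True, "occurred_in_area": False}]
    + [{"to_resident": False, "occurred_in_area": True}] * 2
)

rates = crude_rates(events, midyear_residents=200, midyear_occupants=200)
print(rates)
```

In this invented case the denominators are equal and the rates differ only because visitors' events outnumber residents' events occurring elsewhere, which is the kind of difference the counting rules are meant to settle in advance.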
The continuous recording procedure, especially if it relies upon RR contacts, is more likely than the periodic survey procedure to find and record events occurring to non-residents or visitors in the sample area. The periodic survey on the other hand is perhaps more likely to find events which occurred elsewhere to newcomers or to residents temporarily absent; it also is more likely to miss early infant deaths, or deaths which caused the breakup and migration of a family which has been resident in the area. The de facto approach was used in the Pakistan country system (Marks et al., 1974), Philippines dual collection system, and in the Moroccan POPLAB. India and the MCPS collect both de facto and de jure vital events but after matching and field verification,7 analysis is based upon de jure events. The Turkish Demographic Survey, the Liberian, the Colombian, Kenyan, and the proposed new Turkish systems use the de jure approach to counting vital events.

Determination of "residence" classification for the population requires at a minimum that a question on "usual" residence be asked for each individual at the time of survey. It is possible to ask a whole series of additional questions to establish residence more accurately, such as split places of abode, student status, intent to reside here, and duration of stay here. It is recommended however that the simplest counting rules consistent with the usual statistical definitions of the country be used by both procedures. Both de facto and de jure approaches should yield the same result for a representative country-wide sample. In either case it is usually best to use other means to measure vital events occurring to the institutional population. In the pre-test phases of developing a system it would be wise to test procedures for collecting both de facto and de jure events and to develop and test matching and field reverification procedures which would assist in arriving at "the most probable"
classification status of each questionable event. This would provide a basis for deciding whether analysis should be based upon a de jure or a de facto approach. Whatever approach is used the procedures and subject matter content must be designed to include questions in sufficient detail to permit residence classification of an event or a person. It should also minimize possible multiple counting of events for those who move within the sample area.

Some POPLABs are collecting data on children ever born and children surviving, in addition to children born last year, so that estimates of fertility obtained by stable population or Brass estimation procedures (United Nations Population Studies 42) can be compared with dual collection estimates. Data on children ever born and children surviving can be obtained in a number of ways. As a minimum, for each woman (or ever married woman) age 15 years to some age beyond childbearing, two questions must be asked: (i) How many live births have you (she) had, and (ii) how many are still alive? At the other extreme, the data can be obtained by reconstructing a complete pregnancy history for each woman in the childbearing ages. The timing of each live birth from puberty onward should be asked for and the subsequent survival status of each live birth. As well, there should be an accounting for any long inter-live-birth intervals by probing for the use of contraceptives or foetal death. Because there may be differential levels of completeness in reporting births by sex, survival status, and whether the child is living in the home, it is sometimes recommended that three bits of data be collected separately by sex of offspring for each woman: children born alive and (i) living here; (ii) living elsewhere; (iii) now dead. Because of concern with reporting errors Madigan and Herrin (1973) also recorded whether or not the data on children ever born and children surviving were self or proxy reported. Some results are shown in figure 2.3.
Figure 2.3 Relative differences in average number of children ever born and proportion surviving for self and proxy reporters, ever married women, urban and rural areas, Mindanao Center for Population Studies, July 1972
Self-reporting ever married women in the urban area reported higher average numbers of children ever born at every age than proxy reported women. The differences between self and proxy reporters were greatest between 20-29 years of age, somewhat lower under age 20, and decreased to roughly less than 10 percent at ages 40 and above. For rural women a similar age trend in differences in the average of children ever born was observed, but the relative differences between self and proxy reporters were quite a bit lower than the urban ratios at every age above 20 years. For rural women above age 40 there were no clear cut differences between self and proxy reporters. Ratios of the proportion of children surviving show much smaller differences between self and proxy reporters. Considered as a whole these ratios decline with age for both urban and rural women. At almost every age the ratio of self to proxy reports of the proportion surviving is lower for urban than rural women. Taken at face value it would appear that something more than simple proxy under-reporting of children born and children surviving may be responsible for such results. One can speculate that:

i. Under-reporting is related to familial relationship and the degree of association of proxy reporters and the women for whom they report. These would be closer in the rural than urban areas.

ii. Cultural factors, such as reluctance to report deaths, might also be related to the age of the respondent. Proxy respondents would tend to be older for young mothers and younger for old mothers; for example daughters might report more often for women over age 40. This might be a possible explanation of the decline with age in the reported survival rates.

Further research into the possible effects of proxy versus self-reporting should be conducted because this obviously has potential for introducing bias into any survey regardless of the method of analysis used.
Additional interesting preliminary results provided by Madigan et al. (1973) from the Philippine POPLAB experience shown in figure 2.4 illustrate negative correlation. In almost every month, both for rural and urban areas, the completeness rate for one procedure increases while the rate for the other decreases. Baseline household listings were done in the summer of 1971 and the continuous recording procedure was started immediately thereafter. Retrospective surveys were conducted in January/February 1972 with a recall period from time of interview to Christmas 1970, and again in July/August 1972 using essentially a non-overlapping recall period from date of interview to Christmas 1971. Hence events reported in the January/February survey had a second chance of being reported again in the July/August survey. In addition, the recorder or interviewer who first reported an event was paid a bonus. Recorders were instructed not to make calls in their area during the period when survey interviewers were in the field. Analysis and matching, however, were based upon the period 1 January - 30 June, 1972. Events reported as occurring outside of this period were excluded.

The fall and rise in survey completeness and the rise and fall of recording completeness over the reference period suggest that the observed changes occur because of what happens near the end points of the reference period. Taking the estimated completeness rates at face value the survey pattern might well be due to a combination of double coverage in January and the recall lapse phenomenon. The low completeness levels for the recorder at the start and end of the period may be due to a combination of the incentive effect and time delays in reporting. Similar analyses in studies without incentives, for different interview frequencies, overlapping recall periods and for informant based recording procedures will help to throw light upon the influence of some of these factors.
2.3.e
H. Bradley Wells and Daniel G. Horvitz
Figure 2.4 Estimated completeness rates by reported month of birth, recording and survey procedures, by area, Dual Record Study, Mindanao Center for Population Studies, January-June 1972
Innovative advances in single system survey methods for measuring population change have also been made in recent years. Procedures developed are useful also in dual collection systems for control of field work and processing records from the two procedures. Sabagh and Scott (1967) describe the results and limitations of the multiple round 1961-63 Moroccan survey. The matching and reconciliation procedures followed in that survey are recommended reading for anyone faced with the task of designing field and office procedures for matching and maintaining household and individual records for multiple-round periodic surveys, whether or not they are part of a dual collection system. Similar procedures are described for the Population Growth Survey in Liberia (Liberia, 1969). The longitudinal periodic survey and person-years of exposure accounting procedures for denominators developed by CELADE (Arretx and Somoza, 1965), and applied on a national scale in Honduras (1972), bear further investigation as an alternative or supplementary approach to the usual periodic survey component in dual collection systems.

Further study, through analytical and simulation models, is needed to explore the magnitude of possible biases in various estimation procedures such as Brass, and other stable population procedures, and Chandrasekaran-Deming in the presence of different types of nonsampling errors such as age-misreporting, telescoping of vital events in time and space (location), memory lapse or biases in reporting of children born alive but now dead, and differential reporting by sex or other characteristics.
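To indicate what a simulation model of this kind might look like, the sketch below studies a single source of error, namely dependence between the two procedures, and its effect on the Chandrasekaran-Deming estimate (recorder total times survey total, divided by the number of matched events). It is an illustrative sketch under stated assumptions rather than a procedure taken from any of the studies cited: the function names, the parameter values in the usage note, and the simple model in which an event's chance of being recorded depends only on whether the survey caught it are all hypothetical.

```python
import random


def cd_estimate(n_recorder: int, n_survey: int, matched: int) -> float:
    """Chandrasekaran-Deming estimate of the total number of events."""
    if matched == 0:
        raise ValueError("no matched events; the estimate is undefined")
    return n_recorder * n_survey / matched


def simulate_once(true_events, p_survey, p_rec_given_survey,
                  p_rec_given_missed, rng):
    """Simulate one dual collection and return its CD estimate.

    Assumed model: each event is caught by the survey with probability
    p_survey; the recorder catches it with a probability that depends on
    whether the survey caught it (equal probabilities mean independence).
    """
    n_recorder = n_survey = matched = 0
    for _ in range(true_events):
        in_survey = rng.random() < p_survey
        p_rec = p_rec_given_survey if in_survey else p_rec_given_missed
        in_recorder = rng.random() < p_rec
        n_survey += in_survey
        n_recorder += in_recorder
        matched += in_survey and in_recorder
    return cd_estimate(n_recorder, n_survey, matched)


def mean_cd_estimate(true_events, p_survey, p_rec_given_survey,
                     p_rec_given_missed, replications=200, seed=1):
    """Average the CD estimate over repeated simulated collections."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    total = 0.0
    for _ in range(replications):
        total += simulate_once(true_events, p_survey, p_rec_given_survey,
                               p_rec_given_missed, rng)
    return total / replications
```

With 2,000 true events, `mean_cd_estimate(2000, 0.8, 0.9, 0.9)` (independent procedures) should come out close to 2,000, whereas `mean_cd_estimate(2000, 0.8, 0.9, 0.5)` (events missed by the survey are also harder to record) should fall noticeably short, near 2000 × 0.82 / 0.9, or about 1,822. Other error types named above, such as telescoping or memory lapse, would need additional modelling assumptions of their own.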
State of the art
2.4 Conclusion

We have attempted to examine some of the inherent limitations as well as problems yet to be solved in the design of dual collection systems. It certainly was not a profound discussion. In all fairness we probably have devoted more effort to the strong or potentially strong features of dual collection systems than we have to enumerating the difficulties. On balance we do feel that some form of the dual collection approach is essential to cope adequately with measurement errors in population surveys.

Although concrete results are indeed sparse, the upsurge in the use of the method to obtain substantive demographic data, coupled with the increased interest in experimental testing of the method, would seem to indicate that the dual collection era is either here or is just around the corner. The dual collection system movement certainly appears to be growing in popularity very rapidly. Nevertheless, the state of the art probably should still be characterized as developing.

In spite of our optimism about its potentials, we are very much aware of the practical difficulties in achieving the degree of technical and administrative control that is absolutely essential in operating a dual collection system. Other proponents of the dual collection approach are also well aware of these difficulties (Krotki, 1972; Marks et al., 1974). The basic question, for which there probably is a long list of possible answers, is: How good do the data have to be? It seems clear that dual collection systems are making significant contributions in helping to determine how good the data are.

Discussion by Ivan P. Fellegi
The authors are to be congratulated for their bravery in agreeing to undertake a brief review of the "state of the art" relating to a topic as broad as dual collection systems. It is just recently that Marks, Seltzer, and Krotki have written a book of several hundred pages covering the same topic; and it is only recently that I, in commenting on that book, found myself writing well over 100 pages in discussion. The present discussion will be much shorter. I will start with some general comments on dual collection systems, prompted by the present paper but not exclusively related to it.

At several places in the paper sweeping statements are made about the superiority of estimates derived from dual collection systems over those obtained from single source surveys. For example, the statement has been made, without qualification, that "while the variances may be greater for the PGE estimates the mean square error will be less". It is a fact [shown in Marks (1971) and Fellegi (1974)] that the variance of PGE estimates will be greater than that of the corresponding estimate derived from the survey alone. However, this result is derived without taking into account the fact that if the survey was conceived of as a "stand-alone" operation, rather than one part of a dual collection system, it would have available to it a significantly larger budget and it would be free of some significant design constraints (e.g. it could typically utilize considerably smaller clusters, leading to increased sampling efficiency). Thus a stand-alone survey at the same budget (rather than same sample size) as that of a corresponding PGE survey would probably have its sampling variance reduced by a factor of up to two or three. Of course, it would be subject to the well known biases of under-reporting. This discussion highlights some important points:
i. One is not justified in assuming that under all circumstances the best strategy is to trade off the variance in favour of reducing the bias.
This should be the result of a deliberate examination of the likely parameters of any given situation. If the budget is
small, resulting in very small sample sizes, the improvement of the sampling variance may outweigh the potential reduction in bias.
ii. Even where the overall strategy calls for the PGE approach, the estimates derived from the survey alone have a smaller sampling error. At the national level (or other large aggregation) the PGE estimate, because of its smaller bias, presumably has a smaller mean squared error (otherwise we should not have designed a dual collection system), so the PGE estimate would be preferred. As we pass to more detailed disaggregations, the variance will inevitably increase while the order of magnitude of the bias will remain more or less the same. There must be a cross-over point beyond which the variance begins to dominate the bias and therefore the survey estimate alone, or better still the average of survey and recorder estimates, is preferable to the PGE estimate. The argument is illustrated in figure 2.5 (assuming that the PGE estimate is unbiased): clearly for higher aggregates (to the right of the "cross-over point") the PGE estimate is to be preferred; for finer disaggregations the single source or average estimates are preferred.
iii. The "best of both worlds" might prevail if we could use the single source estimate but reduce its bias through means other than the PGE estimate. It is not inconceivable that using a dual collection system, but using it differently from the usual PGE estimation, such an "ideal" could be approximated. What I have in mind is the use of the recording procedure as a direct quality control activity with feedback to the survey, rather than an independent operation. This would involve a careful analysis of the types of vital events uncovered by continuous recorders but missed by the survey interviewers, tightening the periodic survey operations accordingly, pointing out to survey interviewers the specific vital events they missed and analysing the reasons, etc.
Clearly, using the continuous recording procedure in this way would introduce a positive correlation between continuous recording and the periodic survey data but probably (although generally not known) positive correlations exist anyway (except in
Figure 2.5 Interrelations between error and sample size for different types of survey
situations involving some specific incentive system, such as that reported in the discussion centering on figure 2.4 regarding the experience in the Philippines).

In connection with point (iii) above, the material in table 2.1 is very instructive. Let us denote by $P_r$ the overall completeness rate of continuous recording, and by $P_{r|s}$ the completeness rate of continuous recording of those vital events which are also covered (i.e., not missed) by the household survey. Then the expected value of the PGE estimate is (approximately) equal to

$$\frac{P_r}{P_{r|s}}\,X \tag{2.2}$$

where $X$ is the number of vital events. If $P_r = P_{r|s}$ then registration and survey are independent. More often (and at least in the case of the Colombian experience reported in table 2.1) some positive correlation exists, in which case $P_r$