Surveying the Social World: Principles and practice in survey research
Understanding Social Research Series Editor: Alan Bryman
Published titles
Surveying the Social World - Alan Aldridge and Ken Levine
Ethnography - John D. Brewer
Unobtrusive Methods in Social Research - Raymond M. Lee
Biographical Research - Brian Roberts
Alan Aldridge and Ken Levine
Open University Press Buckingham • Philadelphia
For Meryl, Eileen, Alice and Max

Open University Press
Celtic Court
22 Ballmoor
Buckingham MK18 1XW
email: [email protected]
world wide web: www.openup.co.uk
and
325 Chestnut Street
Philadelphia, PA 19106, USA

First published 2001
Copyright © Alan Aldridge and Ken Levine, 2001

All rights reserved. Except for the quotation of short passages for the purpose of criticism and review, no part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher or a licence from the Copyright Licensing Agency Limited. Details of such licences (for reprographic reproduction) may be obtained from the Copyright Licensing Agency Ltd of 90 Tottenham Court Road, London, W1P 0LP.

A catalogue record of this book is available from the British Library
ISBN 0 335 20240 3 (pb) 0 335 20241 1 (hb)

Library of Congress Cataloging-in-Publication Data
Aldridge, Alan (Alan E.)
Surveying the social world : principles and practice in survey research / Alan Aldridge and Ken Levine.
p. cm. -- (Understanding social research)
Includes bibliographical references and index.
ISBN 0-335-20241-1 -- ISBN 0-335-20240-3 (pbk.)
1. Social surveys--Methodology. 2. Questionnaires. I. Levine, Kenneth, 1945-. II. Title. III. Series.
HM538.A53 2001
300'.723--dc21 00-068921

Typeset by Type Study, Scarborough
Printed and bound in Great Britain by Marston Book Services Limited, Oxford
Contents

Series editor's foreword
Preface

1 Why survey?
  Our approach to this book
  What is a survey?
  Methods of data collection in surveys
  Surveys and other research strategies
  The success of the survey
  Critiques of surveys
  Response to the critiques of surveys
  The social context of surveys
  Why are people willing to take part in surveys?
  Why are people reluctant to take part in surveys?
  Research ethics
  An invitation to survey research
  Further reading

2 Theory into practice
  The components of the modern social survey
  The survey as a research strategy
  Types of survey design
  Relations between theory and research
  Incorporating a theoretical dimension into surveys
  Reliability and validity
  Further reading

3 Planning your project
  Reviewing your assets
  Setting the timetable
  Computing and software resources
  Gaining access to organizations
  Three methods of gathering data
  Email and interactive surveys
  Diaries
  Choosing a method of gathering data
  Combining methods of data gathering
  Further reading

4 Selecting samples
  Introduction
  Theoretical populations
  Probability sampling strategies
  Accuracy, precision and confidence intervals
  Sample size and sampling error
  Other types of error that affect surveys
  Sampling strategies: non-probability sampling
  Further reading

5 Collecting your data
  Doing it yourself
  Commissioned research
  Covering letters for postal questionnaires
  Approaching respondents for an interview
  Piloting
  Distribution and return of questionnaires
  Further reading

6 Designing the questions: what, when, where, why, how much and how often?
  The sociological imagination
  Understanding what matters to respondents
  Recognizing differences between respondents
  Using unambiguous language sensitively
  The role of open-ended questions
  Tackling the social desirability problem
  Questions about respondents' knowledge
  Avoiding overlapping categories
  Asking about age
  Avoiding double-barrelled questions
  Avoiding negatives, double negatives and worse
  The main things that go wrong in designing questions, and how to prevent them
  The most frequently raised problems, and our answers
  Questionnaire layout
  Designing interview schedules
  Setting up for coding
  Further reading

7 Processing responses
  Introduction
  Manual, semi-automated and automated data input
  Data file formats and data types
  Constructing the codebook
  Levels of measurement
  Pre-coding and post-coding
  Missing data
  Multiple responses
  Checking and cleaning the data
  Further reading

8 Strategies for analysis
  Introduction
  Dimensions of analysis
  Analysis of open-ended responses
  Examining single variables
  Measures of central tendency, dispersion, spread and shape
  Standardizing variables
  Statistical inference and sampling error
  Cross-tabulation
  Testing hypotheses and statistical significance
  Measures of association for nominal variables
  Measures of association for ordinal variables
  Measures of association for ratio variables - correlation
  Multivariate analysis
  Further reading

9 Presenting your findings
  Writing for an audience
  Characteristics of the classic research report
  Use of tables, figures and diagrams
  Reporting on the research methods
  Advantages of the classic approach
  Disadvantages of the classic approach
  Writing up
  Writing
  Writing up and writing
  Further reading

Glossary
Appendix 1: The Travel Survey questionnaires
Appendix 2: Websites of professional associations
References
Index
Series editor's foreword
This Understanding Social Research series is designed to help students to understand how social research is carried out and to appreciate a variety of issues in social research methodology. It is designed to address the needs of students taking degree programmes in areas such as sociology, social policy, psychology, communication studies, cultural studies, human geography, political science, criminology and organization studies and who are required to take modules in social research methods. It is also designed to meet the needs of students who need to carry out a research project as part of their degree requirements. Postgraduate research students and novice researchers will find the books equally helpful.
The series is concerned to help readers to 'understand' social research methods and issues. This will mean developing an appreciation of the pleasures and frustrations of social research, an understanding of how to implement certain techniques, and an awareness of key areas of debate. The relative emphasis on these different features will vary from book to book, but in each one the aim will be to see the method or issue from the position of a practising researcher and not simply to present a manual of 'how to' steps. In the process, the series will contain coverage of the major methods of social research and will address a variety of issues and debates. Each book in the series is written by a practising researcher who has experience of the techniques or debates that he or she is addressing. Authors are encouraged to draw on their own experiences and inside knowledge.
This new book on surveys by Alan Aldridge and Ken Levine is very much in tune with the aims of the series. It is concerned to bring out not just the principles that are involved in survey research but also a host of practical issues. However, in survey research there are different contexts to what might be meant by a term like 'practical issues'. Quite rightly, Aldridge and Levine refer quite often to large, frequently complex exercises in survey research to illustrate some of their main points. But for many, if not most, readers of this book such a context is very far from the reality they will be facing if they wish to carry out a social survey. It is this second scenario with which this book is largely concerned. Students, whether undergraduate or postgraduate, are likely to have limited resources and invariably limited time at their disposal. Texts on survey research that focus primarily on large, lavishly funded national surveys are hardly pertinent to such a situation. Aldridge and Levine's book is full of advice on how to devise survey research in the kind of environment that typically confronts a student: namely, having a fairly tightly focused set of research questions that are to be answered using a survey approach, but with limited resources. Aldridge and Levine bring their experience of conducting a relatively small-scale survey on a highly focused topic - travel to work decisions and behaviour of staff and students at their university - to put some flesh on the bones of the principles of survey research.
They bring out the kinds of issue that need to be taken into account when conducting such research. In the process, they identify crucial decisions about the conduct of surveys: what kind of sample to select, whether to interview or to use a self-completion questionnaire, how to design survey questions, and so on. In addition, they address various hardware and software issues and provide a helpful overview of approaches to quantitative data analysis. But it is the sense of being in on the reality of what it is like to do a survey that distinguishes this book from others on the survey approach and that will prove indispensable to future survey researchers. Social surveys are rarely if ever perfect. However, there are numerous traps that can ensnare the unwary and this book will alert readers to ways of avoiding them, as well as introducing the realities of survey research.
Alan Bryman
Preface
In an era in which 'social inclusion', 'active citizenship' and 'customer-centred' feature among the popular buzz words, it is not surprising that an increasing number of individuals and institutions are attracted to the social survey as a way of consulting interest groups, audiences and clients. Surveys abound, but many of the people who will carry them out lack any formal training in social research methods and need guidance about principles as well as practical know-how.
There is no shortage of existing textbooks that deal with social surveys and some of them have established worthy reputations. However, for the purposes of the non-professional people mentioned above and also for students being introduced systematically to the method for the first time, many existing works have one or both of two drawbacks. First, they fail to distinguish between the possibilities open to an individual or small group conducting a modest survey on a limited budget, and what is possible for a research centre commanding a sizeable team sustained on the basis of a substantial research grant. A variety of strategies and techniques are ruled out if the resources and staff to implement them are lacking, and we have tried to signal throughout what is feasible in small-scale and solo projects.
Second, some textbooks make the successful completion of a social survey appear extraordinarily unlikely. There is a tendency to counsel perfection and to appeal to ideals without practical workarounds being offered. There seem to be so many traps, hazards and obstacles that only an Indiana Jones, propelled by massive determination and superhuman powers of foresight, could overcome them all. While it is true that there are a variety of factors that have to be entertained, we set out to reassure readers that surveys can indeed be conducted by ordinary mortals. We have not neglected the problems and pitfalls but we have tried to offer alternatives and remedies wherever possible. Beyond that, we have sought to strike a positive note and to offer reassurance at the few points it is likely to be needed.
Part of the editorial brief we were given was to avoid a heavily statistical approach. The analysis of survey data, even for small-scale investigations, necessarily involves the selection and use of statistical tools, so this is not an easy task. We have concentrated on the general role played within surveys by descriptive and inferential statistics, seeking to maintain a focus on how they fit in with the other dimensions of survey analysis and referring readers to other sources for the step-by-step detail of procedures.
Both of the authors have been associated with the Survey Unit at the University of Nottingham, UK, and one of the investigations it conducted, the Travel Survey, is used as a running example throughout the book. We would like to take this opportunity to thank the present and past staff of the Unit, Jan Wagstaff, Beth Rogers, Nerys Anthony, Dr Nicola Hendey, Helen Foster and Becky Nunn for their hard work and good humour in innumerable projects.
The undergraduates from the School of Sociology and Social Policy (formerly the School of Social Studies), together with postgraduates from various departments taking the Quantitative Methods module, also deserve our thanks. They have confirmed once again that teaching and learning are always two-way processes. We acknowledge the contribution of Sue Parker in the School of Sociology and Social Policy, an ever-helpful source of support and encouragement to student learning in modules involving surveys and statistics. Our thanks are due to Paddy Riley of Academic Computing Services, University of Nottingham, for the benefit over many years of his expertise with SPSS and other computer packages. Finally, we are indebted to Professor Alan Bryman, the series editor, for his many helpful suggestions. All of the above remain entirely blameless for any errors of omission or commission.
1 Why survey?
Our approach in this book
Every book on social surveys is trying to be helpful. Despite the good intentions, it is all too easy to be unrealistic and off-putting. Why is this? We suggest the following reasons:
• Checklists of do's and don'ts: The don'ts always seem to outnumber the do's. Survey research sounds like a minefield.
• Counsels of perfection: Any failure to abide by the do's and don'ts appears to invalidate the whole survey. Many readers sense they will never match up to this ideal, so why bother trying?
• Too much technique, not enough imagination: The design and analysis of surveys involves technicalities - hence the do's and don'ts. But if that were all there is to it, it would be very dull. Luckily, it does not have to be like that. Successful surveys involve an exercise of the sociological imagination, as well as skilful use of techniques. Survey research is a craft, like throwing a pot, and brings much the same satisfactions (and frustrations).
• Statistics: Statistical analysis is a powerful instrument, and it is foolish to attack it. But statistics are tools, not an end in themselves. The hard part is usually not the statistics, but the sociological imagination.
If these are the problems with books on social surveys, how have we dealt with them?
Using an extended example
We use a recent, real-life example of a survey, which we refer to throughout the book, to illustrate the practical and theoretical issues which arise at each stage. Thus throughout you will find discussions of the Travel Survey. The purpose is to examine the planning and execution of a single real survey that you can follow step by step to see how the different aspects and activities that make up a social survey fit together. Many chapters contain a box focusing on features of the Travel Survey relevant to the topics dealt with in that chapter. The Travel Survey questionnaires are reproduced in Appendix 1.
Box 1.1 The Travel Survey
The Travel Survey was commissioned from the Survey Unit at the University of Nottingham, UK, early in 1998 by the administrative department responsible for buildings, parking and transport facilities. They needed information on the commuting habits of students and staff, so that they could fulfil the commitments they had given to the local authority to minimize the traffic congestion likely to be caused by the construction of a new satellite campus about half a mile from the main site. They also wanted to preserve the parkland character of the main campus by encouraging 'environmentally friendly' forms of commuting such as buses and bicycling. The survey was intended to generate detailed data on commuting patterns and related attitudes among staff and students that would enable transport consultants to advise the university on a variety of 'green' policies. Thus its objective was primarily descriptive rather than analytic: the task was to describe variations in commuting patterns rather than to offer explanations of them.
The sociological imagination
Like all methods of social research, surveys call for an exercise of the sociological imagination. In surveys, as in fieldwork, we have to 'take the role of the other' (George Herbert Mead's phrase); that is, we make an imaginative leap into the roles of our respondents, trying to get inside their experiences, their private troubles, their joys and aspirations, and their ways of thought and expression. We have to be sensitive to nuances of language, to the wider culture, and often to the organizational and occupational setting. We have to avoid stereotypes and stereotyped thinking.
Box 1.2 The sociological imagination: sensitive topics
In the 1960s, a team of sociologists at the University of Cambridge, UK, conducted an investigation into the values, beliefs and social activities of relatively well paid working-class people: the Affluent Worker studies. As part of their survey, they asked a sample of respondents to keep a diary logging their weekly social and leisure activities. Some respondents were embarrassed that most of their leisure time was spent on everyday activities like mowing the lawn, cleaning the car and going shopping. They were worried that the researchers would think their lives were dull - an example of the social desirability* problem. This example reminds us to be imaginative about what the potentially sensitive topics are likely to be. Looking at it positively, sensitive issues also tend to be the most interesting sociologically and the most important socially.
* The first use of a term included in the glossary is printed in bold.

Being realistic
Every researcher knows that compromises have to be made and desirable things left undone. We often have the simple choice: make the best of it, or do nothing.

Box 1.3 Being realistic: no time to do a pilot
In 1993, Aldridge was approached by a senior administrator at the University of Nottingham, UK. The university's Management Group was debating whether or not to build a day nursery on campus for the children of students and staff. They were not sure what the level and pattern of demand would be. Could Aldridge help by conducting a survey of staff and mature students? This was half-way through October. The Vice-Chancellor wanted a report and recommendations by mid-December. After discussion, it was agreed that this could be put back to early January at the latest.
Strictly, Aldridge did not have the time and resources to do the survey 'properly', in textbook fashion. But it seemed a very important project. Better that the university have some objective information to go on than none at all. Aldridge therefore went ahead, but had to make some compromises. He decided that there was no time to conduct a pilot survey to test the question wording for all the problems that can arise. All surveys are supposed to be carefully piloted; to omit this is risky. (It is not, despite the impression sometimes given, unethical.) Aldridge decided to do the following:
• undertake crash reading about nursery provision, to identify the key issues (Aldridge knew very little about the topic);
• show the draft questionnaire to a few friends and colleagues, asking them to be extremely critical and pull no punches;
• keep the questionnaire as simple as possible, covering only the key issues and avoiding anything fancy;
• spend a lot of time on the covering letter, to try to ensure that the questionnaire would be well received by a very diverse group of respondents: not just academic staff and students, but secretaries, porters, cleaners, ground staff and so on; not just people with infant children, but childless people, childfree people, and people who would have been desperate for a nursery but for whom it was too late because their children were grown up;
• dispense with a follow-up (reminder) letter, even though it would certainly have boosted the response rate;
• keep the analysis straightforward and the final report short and to the point.
Happily, it turned out well. The response rate was reasonable, respondents were very cooperative, and the report was written on time. The Vice-Chancellor was pleased. And the university decided to build the nursery.
Our readers' experience and resources
We are writing mainly for readers who have had very little experience, if any, of doing a survey. Some readers will have taken part in a survey as a respondent - which may or may not have been a stimulating experience. We are also assuming that, in most cases, the sort of survey the reader will be likely to undertake, at least to begin with, will be a relatively small-scale one with limited resources. These can be very worthwhile - size is not the most important thing. The reader may well be working solo or, if not, in a small team. The reader may be a student, or someone wanting to do a survey on behalf of an organization. Although our book does sometimes refer to large-scale surveys like the General Household Survey or the Census, we are not primarily writing about those. After all, if you are working on such a survey you will receive training and be told what to do!
Hints and examples
Each survey is unique. Therefore, lists of do's and don'ts are too inflexible. A solution in one survey may not work in another. We provide general hints, not inflexible rules. We also give real examples of how we have tried to solve problems in our own research. We use our Travel Survey for the University of Nottingham as an extended example running throughout the book.
Statistics
Our aim is to introduce the broad principles of statistical analysis, to clarify which statistics are appropriate when, and to indicate what statistics can and cannot do. We provide suggestions for further reading on the technical aspects.
What is a survey?
A social survey is a type of research strategy. By this we mean that it involves an overall decision - a strategic decision - about the way to set about gathering and analysing data. The strategy involved in a survey is that we collect the same information about all the cases in a sample. Usually, the cases are individual people, and among other things we ask all of them the same questions. This is the type of survey we concentrate on in this book.
The items of information we gather from our respondents are the variables. Variables can be classified into three broad types, depending on the type of information they provide:
• attributes - that is, characteristics such as age, sex, marital status, previous education
• behaviour - questions such as what? when? how often? (if at all)
• opinions, beliefs, preferences, attitudes - questions on these four characteristics are probing the respondent's point of view.
We shall examine the nature of variables more fully in Chapter 2. For the moment, the key point is that a survey aims to gather standard information in respect of the same variables for everyone in the sample.
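As a rough illustration of what 'the same variables for everyone in the sample' can look like once responses are recorded, here is a minimal sketch in Python. The variable names, the three cases and the helper functions are all invented for the example; they are not taken from the Travel Survey or from any particular software package.

```python
from dataclasses import dataclass, fields


@dataclass
class Response:
    """One case in the sample. Every case records the same variables."""
    age: int                      # attribute
    sex: str                      # attribute
    commuting_days_per_week: int  # behaviour
    supports_more_buses: str      # opinion


# Hypothetical classification of each variable into the three broad types.
VARIABLE_TYPES = {
    "age": "attribute",
    "sex": "attribute",
    "commuting_days_per_week": "behaviour",
    "supports_more_buses": "opinion",
}

# Three invented cases; a real sample would be far larger.
sample = [
    Response(34, "female", 5, "agree"),
    Response(21, "male", 3, "disagree"),
    Response(58, "female", 0, "agree"),
]


def variables_by_type(kind):
    """List the variable names that belong to one of the three types."""
    return [f.name for f in fields(Response) if VARIABLE_TYPES[f.name] == kind]


def column(cases, variable):
    """Collect the values of one variable across every case in the sample."""
    return [getattr(case, variable) for case in cases]


if __name__ == "__main__":
    print(variables_by_type("attribute"))
    print(column(sample, "commuting_days_per_week"))
```

Because every case is built from the same record type, a case with a missing or extra variable cannot be created by accident, which mirrors the requirement that standard information is gathered for everyone in the sample.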
Methods of data collection in surveys
Social surveys employ a variety of methods to gather information, such as questionnaires, face-to-face interviews, telephone interviews and observation.
Questionnaires
These are forms containing sets of questions which the respondent completes and returns to the researcher. One main type is the postal (or mail) questionnaire, which is sent and returned through the post. Questionnaires may also be completed and returned on the spot, for example in a classroom or dentist's waiting-room. The rapid growth of email has opened up another interesting possibility for the distribution and return of questionnaires.
Face-to-face interviews
In this book we do not refer to questionnaires when talking about interviews. Rather, we say that the interviewer has an interview schedule (for use in a structured interview) or an interview guide (for use in an unstructured or semi-structured interview). (Some sociologists use 'questionnaire' more broadly, to include interview schedules. When they want to make the distinction, they use the term self-completion questionnaire.)
Face-to-face interviews can be classified into three types: structured, unstructured, and semi-structured.
1 In a structured interview, the questions and the question order are pre-set. The interviewer aims to be in control of the interaction, and the respondent is just that - someone who responds to questions that are put to him or her. The interview schedule is like a questionnaire, except it is read out and filled in by the interviewer.
2 In unstructured interviews, neither the questions nor the question order are predetermined. Unstructured interviews are exploratory, and in principle non-directive: it is more like a focused conversation. The aim is to enable people to express themselves in their own words, highlighting their own feelings, preferences and priorities rather than those of the researcher. Although there is no interview schedule the interviewer may well have an interview guide, consisting of a set of prompts to remind them what main topics need to be covered.
3 A semi-structured interview is one which aims to have the best of both worlds. Parts of the interview are structured, with a set of questions directed in sequence to the respondent, while other parts of the interview are relatively unstructured explorations of particular or general issues.
Unstructured interviews are widely used in therapy and counselling. They clearly do not meet the requirement, intrinsic to the survey method, that standardized information is gathered systematically from all respondents. A survey, by definition, cannot be wholly based on unstructured interviews. This does not mean that survey researchers and non-directive interviewers have to be at loggerheads. Throughout this book we point to the advantages of multi-method research strategies. Questionnaires, unstructured interviews, focus groups, participant observation, diaries: all these methods, and others besides, can be combined in imaginative and innovative ways.
Telephone interviews
The nature of telephone interactions with strangers implies that telephone interviews are invariably of the structured variety.
Observation
Examples of surveys based on observation are traffic censuses, and studies of pedestrian flows through city centres (very useful commercially to anyone wanting to know where to set up shop).
Surveys and other research strategies
The survey is one of the three broad research strategies available in social research. The others are the experiment and the case study.

The experiment
Within the social sciences, experiments have tended to be conducted almost exclusively by psychologists. Some experiments are carried out in the laboratory, others in natural settings, 'in the field' - though it is far harder for field experiments in the social sciences to match the ideal type of the so-called 'true' experimental design. Amid the wide variety of types of experimental design we can distinguish the following key features.
• Experiments are usually designed to test hypotheses (tentative explanations and predictions) about the causal relations between variables.
• The researcher carefully controls the independent variables (the potential causes) in order to measure their impact on the dependent variables, the effects.
• The people taking part in the experiment, the subjects, are divided into two or more groups, on the basis of a random assignment of individuals to groups. These groups are exposed to different experimental treatments.
• Statistical tests are used to determine the extent to which any differences in the measurement of outcomes (dependent variables) are due to each independent variable.

In an experiment, the researcher deliberately introduces a difference between the people taking part. For example, take a clinical trial in which some subjects receive a new drug designed to relieve headaches, while others receive no treatment at all. Very often, no treatment means being given a placebo - a harmless preparation which has no medical value or pharmacological effects. At the outset, each subject stands an equal chance of being in the experimental group receiving the drug or in a control group receiving no treatment or the placebo.

Clearly, it would be hopeless if all the men were put into one group and all the women into the other, because then we would not know whether differences in outcome were due to the drug or to the sex of the participant. Random assignment of individuals to groups is a statistically derived technique for addressing the problem that other independent variables, in this example the subject's sex, might be causing the differences in the dependent variables, in this example headache relief.

Equally clearly, it would be no good if subjects knew whether they were receiving a drug or a placebo. If they did know, it might well affect their response to the treatment, thereby invalidating the experiment. For this reason, active placebos are sometimes used - that is, placebos which mimic the side-effects of the drug but without its hypothesized therapeutic benefits. Very often, it is also desirable that the researchers themselves do not know, at the time they are administering a treatment, whether it is a drug or a placebo. If they did know, they might unintentionally communicate their feelings and expectations to their subjects, subtly implying that the drug would work whereas the placebo would not. An experiment or clinical trial in which neither the subjects nor the researchers know, during the experiment, to which group the subjects have been assigned, is known as a double blind procedure.

In surveys, by contrast, the researcher is dealing with differences between respondents that are given, not experimentally created. Men and women, smokers, people who have given up smoking and people who have never smoked, car drivers, motorcyclists, cyclists and pedestrians: we do not experimentally create these differences, our respondents present them to us.
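The random assignment and double blind procedure described above can be illustrated with a short Python sketch. It is only one possible way of doing it, not a procedure taken from any particular trial: the function name `assign_double_blind`, the neutral labels "A" and "B", and the idea of a "sealed key" held back until the experiment ends are all assumptions made for the illustration. An even number of subjects is assumed so that the two groups are the same size.

```python
import random


def assign_double_blind(subject_ids, seed=None):
    """Randomly split subjects into a drug arm and a placebo arm.

    Returns (blinded, sealed_key):
      blinded    -> subject id mapped to a neutral label, "A" or "B"
      sealed_key -> subject id mapped to the real arm, "drug" or "placebo"
    """
    rng = random.Random(seed)  # a seed makes the allocation reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)  # every ordering of the subjects is equally likely

    half = len(shuffled) // 2
    sealed_key = {
        sid: ("drug" if position < half else "placebo")
        for position, sid in enumerate(shuffled)
    }

    # Decide at random which neutral label stands for the drug, so that the
    # labels themselves carry no hint about the treatment.
    drug_label = rng.choice(["A", "B"])
    placebo_label = "B" if drug_label == "A" else "A"
    blinded = {
        sid: (drug_label if arm == "drug" else placebo_label)
        for sid, arm in sealed_key.items()
    }
    return blinded, sealed_key


if __name__ == "__main__":
    blinded, sealed_key = assign_double_blind(range(1, 11), seed=2001)
    print(blinded)  # what subjects and researchers are allowed to see
```

In this sketch the researchers administering treatments would work only from `blinded`; `sealed_key` would be consulted once the outcomes have been recorded.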
The case study
As the name implies, a case study involves an in-depth investigation into a particular example of a social phenomenon or institution. Two areas of sociology in which case studies have played a prominent part are the sociology of education, where detailed work has focused on social interactions in classrooms, staffrooms, playgrounds and so on, and the sociology of religion, where studies have focused on minority religious movements such as the Moonies (Barker 1984), examining, for example, the relationship between Moon and his followers, and probing the question, have members exercised choice or are they brainwashed?
Case studies typically involve a wide range of research techniques, including observation, participant observation, interviews, documentary analysis, and asking people to keep a diary. They may also involve some survey work - case studies and surveys are not incompatible.
The success of the survey
Modern survey research is the fruit of a long and complex history of social, scientific and philosophical development. We tend to take surveys for granted, but viewed historically they are an achievement. Survey research today is underpinned by discoveries in sampling theory, multivariate analysis and scaling methods. Readily obtainable computer packages make sophisticated analytical tools widely available. Fundamental ideas such as the concept of the respondent - a person who is both the object of enquiry and an informant - were very slow to develop (Marsh 1982: 19). These advances took place in a range of disciplines - sociology, psychology, demography, geography, marketing, organization research, statistics - and this contributed to their success, since no one discipline had a monopoly on the survey.

In First World countries, surveys are found everywhere, and are conducted by all manner of organizations, both large and small, from government agencies through large business corporations to small voluntary organizations. If surveys were as hopeless as some of their more extreme critics suggest, it is hard to explain why they are so widespread and so enduring. Box 1.4 gives an example of a survey whose impact has been incalculable.
Box 1.4
Smoking and lung cancer
In the first half of the twentieth century, lung cancer death rates increased sharply in several countries. By the 1950s, there was evidence from both laboratory work and studies of hospital patient records that appeared to implicate smoking as a factor. However, the tobacco companies and some doctors remained unconvinced, arguing that atmospheric pollution and improved diagnosis were plausible alternatives and that the causal processes underlying respiratory cancers had not been identified.

In 1951, Richard Doll and A. Bradford Hill (with the later collaboration of Richard Peto) embarked on a major epidemiological study of smoking and cancer. They arranged for the British Medical Association (BMA) to send questionnaires about smoking behaviour to every doctor on the Medical Register in Britain at four points over the next 21 years (eliciting responses from over 34,000 individuals). They also traced and analysed the death certificates of 10,072 doctors who died over the period.
The results (see, for example, Doll and Hill 1952 and Doll and Peto 1976) showed that the lung cancer death rate of those doctors under 70 years who smoked was twice that of lifelong non-smokers of comparable age, with increased death rates for other respiratory tract conditions and degenerative heart disease. Although the research did not attempt to explain what it was about smoking that caused lung cancer and the other associated conditions, it did provide large-scale evidence of a link between smoking and ill health. This evidence was hard to refute and impossible to ignore. The report's publication marked a significant turning point in official and public awareness of the dangers of tobacco smoking.

Some other noteworthy features of the study are listed below.
• Doctors were selected not because of any especially high or low levels of smoking or any suspected special susceptibility to cancer, but mainly because they were a population likely to be interested in the research, motivated to cooperate and capable of reporting their smoking accurately and honestly.
• Another reason for choosing doctors was the existence of an accurate and ready-made sampling frame, the Medical Register, which meant there would be less difficulty in tracing doctors than a sample from the general population.
• The study was able to show that the risks of death increased steadily with the number of cigarettes smoked.
• It also revealed significant reductions in the death rate of the group of doctors who gave up smoking compared to those who continued to smoke. The overall death rate from lung cancer declined over the course of the study as many doctors gave up smoking, while other non-respiratory cancer rates remained stable.
• The consistency of the lung cancer death rate among doctors across different areas cast doubt on both atmospheric pollution and diagnostic improvement as major contributory factors. If these two had been operating, a differential between rural and urban rates would have been detected.
• The study put the onus on those sceptical of the smoking-cancer link to find a factor that varied simultaneously with the incidence of smoking (the independent variable) and disease rates (the dependent variables).
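The kind of comparison reported in Box 1.4 (a death rate among smokers that is a multiple of the rate among non-smokers, and a risk that rises with the amount smoked) rests on simple arithmetic, which the following Python sketch illustrates. The function names and every number in it are invented for the illustration; they are not the Doll and Hill figures. The sketch also simplifies by dividing deaths by the number of people in each group, ignoring how long each person was followed.

```python
def death_rate(deaths, population, per=1000):
    """Deaths per `per` people in the group."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths * per / population


def rate_ratio(exposed, unexposed):
    """Death rate of the exposed group divided by that of the unexposed group.

    Each argument is a (deaths, population) pair. The unexposed group must
    contain at least one death, otherwise the ratio is undefined.
    """
    return death_rate(*exposed) / death_rate(*unexposed)


def rises_steadily(rates):
    """True if each rate is higher than the one before it."""
    return all(later > earlier for earlier, later in zip(rates, rates[1:]))


if __name__ == "__main__":
    # Invented counts for illustration only; NOT the Doll and Hill data.
    smokers = (40, 20000)
    non_smokers = (10, 10000)
    print(rate_ratio(smokers, non_smokers))  # 2.0

    # Invented rates for groups ordered from lightest to heaviest smokers.
    print(rises_steadily([1.0, 1.5, 2.5, 4.0]))  # True
```

A ratio of 2.0 corresponds to a death rate 'twice that' of the comparison group; `rises_steadily` is a crude check for the pattern in which risk increases with the number of cigarettes smoked.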
Among the most potentially important but also problematic surveys are those which involve international comparisons. One example is discussed in Box 1.5.
Box 1.5
An international survey of adult literacy
Comparative survey research may seek the collection of data from respondents who belong to different ethnic groups, cultures or nation states. The design of such studies requires both methodological and administrative problems to be addressed. Without sacrificing a standardized approach, the data collection instruments may need to be translated into different languages and to use quite different forms of expression to reflect divergent cultural perspectives. Practical considerations can rule out the use of self-completion questionnaires (rural postal services may be inadequate). The idea that certain low status social categories (children, unmarried women) will give their opinions freely to strangers in private interviews may be locally unfamiliar or unacceptable.

The International Adult Literacy Survey, conducted on behalf of the Organization for Economic Cooperation and Development in 1994, was an attempt to establish the comparative levels of adult literacy and numeracy in eight developed societies (Canada, Germany, the Netherlands, Poland, Sweden, Switzerland, the United States and Eire) using a suite of common tests and schedules (OECD 1995). A research team from each nation conducted a probability sample that was designed to be representative of its non-institutionalized population aged 16-65. In total, well over 20,000 individuals were involved. In Canada, respondents were given a choice of English or French test materials; in Switzerland, the sample was restricted to French-speaking and German-speaking cantons with respondents required to use the corresponding language. Respondents completed a test booklet and their demographic and employment details were gathered in an interview lasting, on average, one hour, conducted in their homes. People with very low levels of literacy were screened out of the samples by initial test questions.
Among the general findings were marked national differences: for example, Sweden had large proportions at the top levels of the numeracy and literacy scales with small proportions at the lowest level, while the position in Poland was the reverse. There were also strong links in all countries between being currently unemployed and low levels of literacy. However, the significance and implications of the findings of such complex surveys are open to challenge. The measurement procedures used rest on assumptions that can be contested, while the nature of the causal links lying behind the observed differences, in this case links between individual skills, employment and economic development, may be contentious. For a critique of the IALS, see Levine (1999).
Critiques of surveys
Over the years there have been numerous criticisms of social surveys as a research strategy. We can classify them into two broad, diametrically opposed types: scientific critiques and humanistic critiques.
Scientific critiques of surveys
Critiques of this kind are often mounted by people for whom the experimental method is the only valid means of arriving at scientific findings. According to them, surveys may display some of the trappings of science - reliance on statistical analysis, use of jargon, the appearance of objectivity - but all this is superficial.

The charge is that surveys cannot be scientific because the variables are not properly controlled. In experiments, the researcher makes strenuous efforts to control for the possible effects of extraneous independent variables. In a randomized clinical test, for example, everything is geared to measuring the effects of the drug. That and that alone is what interests us. Experiments are designed to isolate a very small number of key variables so as to measure the causal relations between them. Surveys, in contrast, are sprawling constructions, typically involving a large number of variables covering a respondent's attributes, behaviour and opinions.

No valid causal inferences can be drawn from survey research, it is said. If we find a correlation between, say, respondents' religious affiliation and their level of educational attainment, we have no way of knowing what the causal mechanisms are or even, in many cases, in which direction the causality runs. The most that can be hoped for from a survey is some descriptive material that may suggest hypotheses which can be scientifically tested through an experimental design.
Humanistic critiques of surveys
In this perspective, the problem with surveys is not that they fail to be scientific, but that the aim to be scientific is misconceived. This critique has a number of dimensions.

One major objection is that surveys are atomistic: they treat society and culture as no more than the sum of the individuals within it. The sociology of religion provides an example. Can we really measure the religiosity of a society by asking a sample of the population about their own religious beliefs and behaviour? Arguably, we should assess the social and cultural importance of religion by examining the influence of religion on the education system, on the law, on the political process, and on the commercial decisions of business corporations (Aldridge 2000). If our investigation shows that religion has little influence, then society has been secularized - religion has lost social significance - even if our surveys show that the majority of people say they believe in God.

Paradoxically, although surveys are atomistic they are not really concerned with individuals at all. The thrust is to produce aggregate data: 80 per cent are this, 55 per cent think that, 2 per cent do the other. The language of survey research betrays its lack of concern with the individual: respondents, samples, cases. And what are statistics, if not a means of analysing aggregate data? Focus on the individual human being, and the statistician is silent.

One strand in the humanistic critique of survey research has been trenchantly expressed by Blumer (1956) in his attack on the limitations of 'variable analysis' - by which he means the reduction of social processes to the correlation between variables. These variables, he argues, are not generic: they do not stand for abstract categories, and so cannot be generalized beyond the specific context of the survey. They are locked into what Blumer calls the 'here and now', which, we may note, soon becomes the 'there and then'. The depressing conclusion is that variable analysis results in knowledge which is neither generalizable nor cumulative. Nor does it offer any insight into the interpretive processes through which social reality is constructed.

According to the humanistic critique, surveys are only marginally less artificial than experiments.
Surveys cannot overcome the problem of the reactivity of research instruments, because they are by their very nature a crashing intrusion into the normal flow of social life. Respondents are self-consciously behaving as respondents. One obvious and ineradicable expression of this is the problem of social desirability. Respondents' answers are influenced by their desire to be helpful and to live up to their own self-image or to an ideal which they think will look good to the researcher. Respondents will therefore over-report their virtuous acts and play down or ignore their failings and foibles. They will also try to appear consistent, with the result that their opinions and beliefs will seem more coherent than they really are.

Part of the artificiality of surveys, according to critics, is that they are driven by the concerns of the researcher rather than the respondent. The essence of a social survey is to put questions to respondents. Whatever efforts we make to allow respondents to express themselves in their own words, we cannot go very far. It is simply not possible in a survey design to have a large number of open-ended questions, where respondents are free to answer in whatever words they choose. Most of our questionnaire or interview will inevitably consist of closed questions, where we present a series of choices from which respondents are asked to choose.

It follows from this, say the critics, that we shall find it almost impossible to gauge the salience of issues to our respondents. It is we, after all, who are raising the issues in the first place.
We can of course ask respondents how important given issues are for them. Even so, this is hardly a solution to the social desirability problem. Admitting to a researcher that you have no interest in issues apparently deemed to be important is a difficult thing to do.

Some critics conclude from all this that the only valid use of surveys is to gather basic factual information, as in national censuses. Market researchers can use surveys to find out what products we buy and what possessions we own. This, however, is hardly the stuff of a vibrant social science. It is mundane, untheorized fact-grubbing - what C. Wright Mills (1970) called abstracted empiricism. It shows, what is more, that the basic function of social surveys is to provide useful information to people who have power over us. Some draw the devastating conclusion that surveys are an instrument used by Orwellian Big Brothers to keep tabs on the proles.
Response to the critiques of surveys
Few sociologists nowadays see sociology as a hard science on a par with nuclear physics or microbiology. Most people agree. For that reason, the scientific critique of surveys is less pressing than the humanistic critique. Despite the scientific critique, we believe that surveys have a part to play in establishing causal relations, as we shall explain. But causality is always complex because society is complex. Decades of research into the effects of the mass media, including a host of true experiments, have produced very little hard evidence. The truth is - though pressure groups find it impossible to accept - we simply do not know much about the effects of the mass media, and perhaps we never will.

For us, for our students, and we suspect for most of our readers, it is the humanistic critique that is potentially the more damaging. Our response to it, developed throughout the book, is, in essence, this.
Poor surveys
It is sadly true that too many surveys are poorly designed, badly executed and incorrectly analysed. They yield nothing of value. Clearly, though, exactly the same is true of ill-conceived experiments and botched fieldwork. 'Rubbish in, rubbish out' applies to all research strategies. Our aim is to promote the cause of good surveys.
A multi-method approach
Surveys can be fruitfully combined, in all sorts of imaginative ways, with unstructured interviews, observational fieldwork, documentary analysis, focus groups and so on. Using more than one research strategy enables us to triangulate data, that is, to use a variety of methods to test the validity and reliability of our findings. We give examples in Chapter 3. We do not accept that surveys cannot address sensitive and subtle issues. In our view, it is disastrous to erect a sectarian barrier between surveys and fieldwork, quantitative and qualitative methods. As Oakley has argued (1998), one danger is the creation of a gendered hierarchy of knowledge in which quantitative research is represented as objective, hard-edged and masculine, while qualitative research is subjective, sensitive and feminine. The apparently sharp opposition between quantitative and qualitative research is a social construct that perpetuates patriarchy; upon serious examination, all social research turns out to have quantitative and qualitative elements.
The role of social theory
Social surveys can play a significant part in the development and testing of sociological theory. Surveys do not have to be fact-grubbing. It is worth adding that in many cases so little is known about a topic that a few facts would not go amiss.
Servants of power
It is true that survey research is useful to commercial organizations and to the state. On the other hand, survey research can give a voice to the general public, to consumers, and to disadvantaged and disprivileged groups. This brings us to the social context of surveys.
The social context of surveys
Social surveys as we understand them are a modern phenomenon. They developed during the period of industrialization, and came to full fruition in the twentieth century. The British Census began in 1801, and has been carried out every ten years since that date with the exception of 1941, at the height of the Second World War. Similarly, the decennial (ten yearly) Census of Population in the United States began in 1790. Two of the most important surveys ever carried out in the UK were Charles Booth's Life and Labour of the People in London (published 1889-1902 in seventeen volumes) and Seebohm Rowntree's study of York, Poverty: A Study of Town Life (1902). A number of influential surveys were carried out by Mass Observation, which was founded in 1936 and which had a keen sense of a mission to inform the general public about the state of the nation. In our own times, major national surveys include the General Household Survey and the Labour Force Survey. In addition to these large-scale affairs, there are countless small surveys taking place every week of the year (though they slacken off during major holidays). There is every sign that surveys will continue to flourish in the twenty-first century and beyond.
Social surveys are also a feature of First World societies. They depend upon strong central institutions and advanced communications infrastructures. The Third World cannot always afford them, and the command economies of the communist societies had less need of them. At the time of writing, the former communist countries are experiencing profound conflicts in their transition to a market economy, a liberal-democratic polity and a consumer society. Their problems are not just economic but social and cultural. The neglect of serious survey work was characteristic of their lack of responsiveness to consumer interests.

Reading texts on social survey design and analysis written in the 1960s and 1970s can be revealing. At times it seems almost another world. The social scientist typically comes across as an authority demanding cooperation from respondents. A cultural gulf lies between us and them. These respondents are incompetent. They will misunderstand our interview questions and mess up our questionnaires unless we give precise instructions and spell everything out in minute detail. Similarly, the survey director, a man, will have to give lengthy training and detailed instructions to his hired-hand interviewers, who are women. These women, like the respondents, tend to get things wrong unless their work is closely monitored. The alienated labour of the mass production assembly-line is thus reflected in the closely controlled routines of the hired-hand researcher. The advice given in these texts is not so much poor as out of date.
The language of this era carries over into contemporary research, and much of it can feel uncomfortable. The people who take part in our research are conventionally called 'respondents', which may suggest a stimulus-response model of our relationship with them. Some researchers think it would be better to speak of 'informants', as the social anthropologists do, to acknowledge the point that people are supplying us with information which they have and we want. At least, unlike many psychologists and medical researchers, we do not refer to people as 'subjects'.

Another unfortunate word is 'instructions'. When we ask people to fill out questionnaires, we need to give them guidance on what we are looking for, as well as some explanation of the rationale underlying our questions. This guidance is conventionally termed 'instructions', even though we do not have the power of command that the term appears to imply. People can refuse to be interviewed, or put the phone down on us, or throw our questionnaire into the waste paper basket. They can also complain to us, our sponsors or our employers, as we have both discovered.

In some ways, this language of 'respondents' and 'instructions' does not matter. After all, we do not use it in talking to the people we are surveying; it is an occupational discourse we employ among ourselves to talk about them. The point is, we need constantly to remind ourselves about our relationship with our respondents.
Just as a commercial firm which treats its customers as 'punters' is likely to lose business, so a survey which sees respondents as ignorant dimwits is a survey scarcely worth doing.

Today, more than ever before, people are uneasy about the way in which social surveys use aggregate data. In surveys we are typically comparing men with women, smokers with non-smokers, and car drivers with cyclists and pedestrians. Individuals are submerged into a category - which we may find objectionable. People have complained for years about sociologists' alleged obsession with social class. In the words of Number 6, the lead character in the 1960s cult TV series The Prisoner: 'I am not a number, I am a human being!' The more we can persuade our respondents that their own individual experience and opinions count for something, the better.

Reflecting on the socio-cultural context of surveys can help us to identify the reasons why people are willing to take part in them and the main sticking-points. From these reflections, we can draw some broad conclusions about basic principles which can guide us in designing our research.
Why are people willing to take part in surveys?

Helping the researcher
This has always been one of the most powerful motives for filling out questionnaires and agreeing to be interviewed. People want to be helpful. Unfortunately, their help may take the form of telling us what they think we want to hear. This is another example of the social desirability problem, one of the main challenges that confront the social researcher.

Altruism
As well as helping the researcher, respondents are often motivated by the hope that the research will promote social progress. People volunteer for all sorts of social activities in order to make the world a better place. Richard Titmuss's classic work (1970), The Gift Relationship, uses the UK's voluntary blood donor system as a case study of the power of altruism.

Citizenship
Responding to surveys can be a way of expressing one's democratic right as a citizen to have a voice in public affairs. This is probably the main reason why people turn out to vote in elections. Even if our vote is unlikely to make any difference to the outcome, we may still hold it important to have our say in the democratic process. So it is with surveys. This implies that people's motivation to take part in a survey will be strengthened if they believe that their expressions of opinion will count for something.

Let's talk about us
Often, we survey not the general public but a particular group within it: students, clergy, people with literacy problems, members of an ethnic minority and so on. In such surveys it is normally clear to respondents that the reason they have been selected is that they are members of a particular group or stratum in society. Our survey gives them the chance to be representatives of their group. Taking part in survey research is one way a group of people can gain a hearing for their opinions, experiences and ideas. This motive can be very powerful when a group feels a sense of grievance that its point of view has been misunderstood and its problems ignored. Luckily for us, most groups feel that way.

Let's talk about me
Given appropriate safeguards, people like talking about themselves. It may not always be the noblest motive, but if we expect our respondents to be saints we should consider an alternative career to survey research.
In the next section, we discuss some of the reasons why people may be reluctant or unwilling to take part in surveys. Very often, motives are mixed. People may have strong reasons both to participate and not to do so. Box 1.6 gives one instance of this ambivalence.
Box 1.6
Establishing the 'dark figure' of unrecorded crimes
In the late 1960s, the US government sought to expand its intelligence about crime and criminals beyond the information available in the Uniform Crime Reports (UCR), the standard format in which official police and court statistics are presented in the USA (of similar status to the Home Office's Criminal Statistics in the UK). One objective was to estimate the size of the dark figure, the volume of crimes that had actually been committed but which, for various reasons, went unrecorded in the UCR. Researchers adopted an ambitious design which included surveys of selected US cities and of businesses in order to gather information on levels of white collar crimes like fraud. However, the most influential strand was a survey of the general public, based on eliciting details from a representative sample of US households about the occasions on which household members had been the victims of eight major types of crime, defined in the same way as they were in the UCR (Coleman and Moynihan 1996: 71-2).
The first US nationwide victimization survey in 1972 suggested crime rates three to five times those of the UCR. Despite a variety of methodological problems, including doubts about the accuracy of respondent recall and the capacity of the researchers to translate the respondents' common sense definitions of offences into the legalistic framework of the UCR, the findings commanded considerable public and official attention. Although the use of victimization surveys spread quickly beyond the USA, the British government waited until 1981 before commissioning the first national UK survey which was conducted in 1983. Over time, the methodology used has been refined and the research objectives expanded to cover the processes underlying the non-reporting of incidents and the variety of roles victims can play in precipitating crimes. Victimization surveys are now generally accepted as an important complement to orthodox criminal and judicial statistics, though the initial claim that they could establish the 'true' prevalence of crime is now regarded sceptically.
Self-report surveys, in which samples of the population are asked about their own offending, are a further way in which the orthodox official statistics can be supplemented. Inviting respondents to admit they have broken the law, or even that they have engaged in lesser kinds of deviant conduct, necessitates very careful question design and interviewing technique. Self-report surveys have been used extensively in connection with so-called victimless crimes like drug abuse, and also with schoolchildren regarding smoking, substance abuse and under-age alcohol consumption.
Why are people reluctant to take part in surveys?

Decline of deference
It used to be said popularly that Britain was a class-conscious society, riven by class distinction and snobbery. In sociological terms, what was being referred to was not class but status: looking up to or down on people depending on their social background, occupation, education and style of life compared to your own. Profound social and cultural changes are eroding these status-conscious patterns of thought and behaviour, as demonstrated by the Affluent Worker studies of the 1960s (Goldthorpe et al.). One consequence is that social researchers can no longer expect deferential cooperation from 'ordinary' people. What is true of Britain is true elsewhere.

Scepticism about experts
Linked to the decline of deference is a growing scepticism about the expertise of scientists and professionals. Highly publicized scandals have reduced public confidence in the pronouncements of experts. The BSE crisis in the UK is one dramatic example. One practical consequence is that simply mentioning a university or scientific affiliation in a covering letter is no longer accepted as a guarantee of honourable intentions in the way it once was.

Consumerism
The rise of consumer society is one of the key issues in contemporary sociology. Consumerism implies choice, including the choice of exit - in this case, refusal to take part. It may also imply an orientation towards cost-benefit analysis: why should I take part? What will I gain, and what will it cost me in time, effort, or frustration?

Competition from market research and salespeople
Social scientists are not the only people conducting surveys. The fact that commercial market research relies on surveys as a principal source of information is a sign of the power of the survey method. Salespeople sometimes pretend that they are conducting a survey when they are really trying to sell us something (a tribute vice pays to virtue). Telephone sales pitches routinely begin with the false assurance, 'Don't worry, Mr Aldridge, I'm not trying to sell you anything'. We need to distinguish our own research from these other activities.

Survey fatigue
Arguably, there are just too many surveys going on. People get fed up (technically, survey fatigue), and are not willing to take part in yet another survey unless it is well designed and seems especially worthwhile.

Intensification of social life
The society of leisure, once predicted in the 1960s, has not yet arrived. Many people feel under increasing pressure at work. All sorts of therapies are available to help people cope with the stress of modern living. Our telephone call, our ring on the doorbell, our questionnaire on the doormat, may seem like yet another intrusion into people's precious free time. Our surveys need to come across as part of the solution, not part of the problem.

Dislike of form-filling
One source of stress is filling out official forms. We have yet to meet anyone who enjoys doing their tax return. A questionnaire which feels like an official form is probably not one that will achieve a high response rate. So, too, an interview that is experienced as an interrogation is unlikely to yield rich information or deep insights.

Privacy
The concept of 'the information society' has received a lot of attention from sociologists (Webster 1995). People are concerned about the data that commercial and public agencies hold on them; hence many societies have passed laws on data protection and freedom of information. In general, people are nowadays far more suspicious about the uses to which data are put than they were in the past. This means that any guarantee of confidentiality has to be seen to be watertight. If the researcher can guarantee anonymity, so much the better, even though it can raise problems for the researcher. The nature of the guarantee of confidentiality or anonymity should be realistic and crystal clear.
Box 1.7
Encouraging people to take part in surveys: general lessons
• Value of the research: We presumably think that our work will be valuable. The more we can convince respondents that this is true, the better. Wherever possible, we should find ways of feeding the main findings back to our respondents and to people like them.
• Value of respondents' contribution: Why should a respondent bother to answer our questions, when they have plenty of other things to do? What difference will their participation make to the value of our research? Some respondents are worried that they have nothing original and interesting to say, or that they don't know much about the topic. We need to convince people that their own individual response is important.
• Being explicit: In modern societies, respondents are increasingly sophisticated and critical. They are familiar with surveys, and alert to deceptive techniques of persuasion. Many people are concerned about the researchers' hidden agenda, their sources of funding, and the uses to which the findings will be put. We need to make the rationale of our research as explicit as possible.
• A humanistic approach: Many respondents, and most sociologists, do not believe that sociology is a hard science like nuclear physics or inorganic chemistry. If our style of research - for example, rigidly structured questionnaires and interviews, with little opportunity for respondents to express their own views in their own words - suggests that we are treating people as the objects of scientific research, we are likely to encounter resistance. People should have the opportunity to express their views in their own words.
Research ethics
Professional research ethics can be seen in the context of the wider cultural factors we have just been reviewing. The fundamental principles of research ethics flow from the nature of the social relationship between researcher and respondent, a relationship which is necessarily embedded in a set of cultural values, norms and codes of conduct.
All of the major professional bodies such as the British Sociological Association (BSA), the British Psychological Society (BPS) and the American Sociological Association publish guidelines on research ethics to which their members are expected to adhere. Appendix 2 suggests a few web addresses where these may be viewed.
The general principles of research ethics impact somewhat differently, depending on the research strategy chosen. Despite its different applications, the core of research ethics is due respect for the integrity of people participating in our research. Respect for our respondents can be broken down into three key components: informed consent, confidentiality and sensitivity.

Informed consent
Compared to fieldwork observations, one potential virtue of surveys is that they are relatively overt. The problems of covert research are far less pressing for the survey researcher than they may be for the ethnographer working in the field. Even so, as survey researchers we need to be as open as we reasonably can be about the purposes of our research, the sources of funding, and the potential audiences for and uses of our findings.
We should make it easy for respondents to raise any queries they may have. In some cases it may be desirable to give the name of a responsible person whom they can contact if they want to verify who we are and the nature of our research. In interviews, we should have proof of our identity readily available. It may also be desirable to indicate that our research has the approval or support of a relevant person or body - a trade union, say, or a charity.
We also need to consider ways in which we can make a summary of our findings available to our respondents, so that informed consent comes to fruition in an informed outcome.

Confidentiality
Respondents are usually offered an assurance of confidentiality. In some cases this extends further to anonymity, which is the stronger guarantee that not even the researchers will be able to identify who the respondent is - something which is only easily achieved in the case of self-completion questionnaires. Our assurances need to be as clear as possible, so that people are not misled.
We also need to be aware that, in some cases, it is all too possible for a knowledgeable reader to identify a respondent even if we have given them a pseudonym and apparently concealed their identity. This is a particular problem when the researcher is surveying the members of an organization: there may be very few women or members of ethnic minorities, particularly in senior positions. How are we going to represent their responses while concealing their identity from their fellow workers and their bosses?
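
One common response to the question just posed is to withhold any breakdown in which a category contains too few respondents to be reported safely. The sketch below is a minimal illustration in Python and not a procedure set out in this book: the function name `suppress_small_cells`, the threshold of five respondents and the example organization are all assumptions made for the purpose of illustration.

```python
from collections import Counter

def suppress_small_cells(records, keys, min_cell_size=5):
    """Count respondents in each combination of `keys`, hiding small cells.

    Cells with fewer than `min_cell_size` respondents are returned as None
    so that rare combinations (for example, a lone senior woman) are not
    published. The threshold of 5 is an illustrative assumption.
    """
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return {cell: (n if n >= min_cell_size else None)
            for cell, n in counts.items()}

# Hypothetical organization: 40 junior men, 12 junior women,
# 9 senior men and only 1 senior woman.
staff = (
    [{"grade": "junior", "sex": "M"}] * 40
    + [{"grade": "junior", "sex": "F"}] * 12
    + [{"grade": "senior", "sex": "M"}] * 9
    + [{"grade": "senior", "sex": "F"}] * 1
)

table = suppress_small_cells(staff, keys=("grade", "sex"))
for cell, n in sorted(table.items()):
    print(cell, "suppressed" if n is None else n)
```

Suppression on its own is not a complete answer: if row and column totals are also published, a reader may be able to recover a hidden cell by subtraction, so merging categories (for instance, reporting all senior staff together) may be the safer choice.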

Sensitivity
One important area in which sensitivity needs to be exercised is in the use of language, particularly as regards 'race' and ethnicity, sex and gender, age, and disability. Examples of good practice can be found in other people's published work. A particularly useful source is The Question Bank, a resource centre funded by the ESRC (Economic and Social Research Council) and run by the National Centre for Social Research, and the Universities of Southampton and Surrey in the UK. Its internet address is: http://www.natcen.ac.uk/cass/. Language evolves, and varies cross-culturally in the English-speaking world, so it is important to keep up to date on acceptable usage in the culture in question.
While encouraging respondents to take part in a survey is entirely appropriate, attempting to put pressure on them is not. As citizens they have the right to refuse (except when there is a legal requirement to respond, as in the decennial Census). The potential for undue pressure is greatest in an organizational context, where people may feel that they will be judged 'uncooperative' if they decline to participate.
An invitation to survey research
Our book is an invitation to survey research. The word invitation implies joining in something worthwhile and enjoyable. Surveys can be both, we believe. There are, of course, problems and frustrations, and we have tried to be open about them. At the same time, we are positive. The problems are there to be overcome, and a successful survey can contribute to understanding social life in the hope of making things better.
Key summary points
• Surveys are a ...
• They involve the ...
• They have to ...
• They can be ...

Points for reflection
Further reading
Marsh (1982) The Survey Method: The Contribution of Surveys to Sociological Explanation is an excellent account of the survey as a research strategy. It examines the major critiques and vigorously defends surveys against them. For a comprehensive account of the survey method see Babbie (2001) The Practice of Social Research.
2 Theory into practice
The components of the modern social survey
In Chapter 1, the social survey was defined as a strategy in which the same information was collected from all the cases in a sample (or for the whole population of interest). This definition now requires some elaboration. Because it is broad, it covers a great many of the exercises conducted throughout history by agents of ruling elites in order to establish, for example, the population numbers of key ethnic, religious or occupational groups, the scale of enterprises for tax collection purposes, or the manpower available for military service. In this book, however, we are primarily concerned with the survey in its contemporary forms. The modern survey is a synthesis of certain ideas and methodological innovations that were available to be used together only by the middle of the twentieth century. These components are discussed below and a partial explanation is offered to a question that may have arisen in the mind of some readers - why is the systematic social survey such a recent development?

Respondent/informant orientation
As suggested in Chapter 1, the idea that the provider of the information in a survey is a respondent or an informant is an important conceptual development, in itself reflecting changing ideas of citizenship and social participation. Informants deserve to be treated with respect as knowledgeable and, within limits, reliable; their cooperation has to be carefully sought and their rights acknowledged (for example, to have the information they provide treated confidentially). There are exceptions to the definition of respondent offered in Chapter 1: some are proxies for the real subjects of the inquiry (parents for small children, members for the organizational teams of which they are part); some censuses and other official surveys do impose legal sanctions for non-cooperation. Nevertheless, the modern survey revolves around identifying strategic informants, persuading them to cooperate, and painstakingly constructing questionnaires and interview schedules containing questions that will be meaningful to them. Information collection is the principal aim with other objectives set aside or made subsidiary (these might include promoting awareness of goods and services or recruiting potential followers to a cause or interest group).
In contrast, pre-modern surveys were often exercises in compulsion conducted on groups unable to refuse.
In some cases, such as British nineteenth-century research on the poor, direct contact with the group itself was often minimized and the principal resort was to testimony from various expert intermediaries such as inspectors, the police or employers. The aims of such exercises were often very blurred and concern for the interests of participants was rarely paramount.

Standardized data collection instruments
The questions posed in pre-modern surveys were often ad hoc and ill considered, unselfcritically reflecting the social worlds, assumptions and language of the authors (or the bureaucracies in which they worked) rather than the informants. There were additional obstacles to be overcome. Long after the inauguration of nation-wide mail deliveries, the use of postal questionnaires with samples of the general public was hampered by low levels of mass literacy and by widespread suspicion that the information being requested was an extension of surveillance by agencies of social control. Structured questionnaires and interview schedules with standardized wording and explicit definitions, wherever possible tested in pilot exercises on small groups of respondents, could only become an integral part of the modern social survey when the social infrastructure facilitated their use.
A significant part of the effort in designing contemporary instruments is devoted to making the resulting schedule or questionnaire comprehensive, applicable to every respondent or situation that might be encountered. At the same time, user-friendliness is a major consideration: respondents need to feel at ease during interviews, while response rates for self-completion questionnaires are promoted if the forms are made as straightforward as possible by clear layout and helpful graphics.

Systematic selection procedures
The presumption that scientific thoroughness obliged social researchers to collect information from every member of a community or every household in the parish was an enduring one and probably delayed the social science application of the statistical ideas underpinning probability (random) sampling, which had been in circulation since the middle of the nineteenth century. The practical application of sampling theory to social surveys took place in the first decades of the twentieth century and the five towns inquiry by Bowley and Burnett-Hurst, published in 1915, was possibly the first British sociological study to include estimates of the reliability of findings based on samples (Marsh 1982: 26). Among other things, the advent of sample surveys solved the problem of having to secure the resources to fund large teams to process massive amounts of information from large numbers of respondents. This had previously locked surveying into fields where charitable or state support was forthcoming: sampling procedures helped to open the door for small groups and individuals to use the survey as a tool.
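
To make the idea of an estimate of reliability concrete, the following sketch draws a simple random sample from a made-up population and attaches a standard error and an approximate 95 per cent confidence interval to the resulting proportion. It is a minimal illustration under stated assumptions, not a reconstruction of any study cited here: the town of 10,000 households, the 30 per cent poverty figure, the sample of 400 and the function name `estimate_proportion` are all hypothetical, and the finite population correction is ignored for simplicity.

```python
import math
import random

def estimate_proportion(population, sample_size, seed=0):
    """Estimate a population proportion from a simple random sample.

    `population` is a list of 0/1 values (1 = has the characteristic).
    Returns (p, standard_error, (low, high)), where the interval is an
    approximate 95% confidence interval using the normal approximation.
    The finite population correction is deliberately left out.
    """
    rng = random.Random(seed)                     # fixed seed: repeatable draw
    sample = rng.sample(population, sample_size)  # sampling without replacement
    p = sum(sample) / sample_size
    se = math.sqrt(p * (1 - p) / sample_size)
    low, high = max(0.0, p - 1.96 * se), min(1.0, p + 1.96 * se)
    return p, se, (low, high)

# Hypothetical town of 10,000 households, 3,000 of them below a poverty line.
households = [1] * 3000 + [0] * 7000

p, se, (low, high) = estimate_proportion(households, sample_size=400)
print(f"estimated proportion: {p:.3f}")
print(f"standard error:       {se:.3f}")
print(f"approx. 95% interval: {low:.3f} to {high:.3f}")
```

Because the standard error shrinks with the square root of the sample size, quadrupling the sample only halves it. This is one reason a few hundred well-chosen cases can stand in for a complete enumeration.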

Multivariate analysis
The final core component of the modern survey and the most recent to be integrated with the others is multivariate analysis; that is, statistical procedures for analysing the relations between sets of variables whose values are varying simultaneously. Adequate descriptive statistics dealing with numerical observations have long been available. However, a key task in surveys with explanatory goals is to unravel causal processes after they have operated in the real worlds of respondents and the cultures and social structures in which they are located. Multivariate techniques can introduce statistical controls that eliminate complicating variables and enable answers to 'what if?' questions to be formulated. These techniques are crucial for unravelling complex issues where many factors are in play - how different social classes pass on differential advantages to their offspring over generations, the precise extent to which academic achievement is the product of home background, individual ability and school characteristics.
The widespread use of multivariate statistics for the analysis of survey data only developed after the Second World War. One of the factors that arrested its diffusion was the time and drudgery taken up by the elaborate statistical calculations involved. The advent of first mainframe then later desktop computers running software applications specifically designed for social surveys represented major advances. Investigators now possess an unprecedented capacity to explore survey data thoroughly using exploratory and analytic statistical tools.
Listing these four components may present an idealized picture of the modern social survey. Not every instance employs systematic sampling, while the objectives in some contemporary descriptive surveys render multivariate analysis superfluous. The point, however, is that the components are available for use in those research situations able to exploit them, and their joint use creates a tool of social inquiry of exceptional potential. However, effective synthesis does not take place by itself. Each component has a slightly different logic and requirements all of which require harmonization. The task in designing surveys is to make these four components fit together as seamlessly as possible.
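
The notion of a statistical control, mentioned under multivariate analysis above, can be shown with the simplest possible device: comparing groups within levels of a third variable. The sketch below uses invented figures for 200 pupils to show how an apparent effect of school type on examination success can disappear once home background is held constant. All names (`make_rows`, `pass_rate`, `rate_gap`) and all counts are hypothetical, and stratification is used here only as an elementary stand-in for the more elaborate techniques the text has in mind.

```python
def make_rows(home, school, n, n_passed):
    """Build n hypothetical pupil records, n_passed of whom passed."""
    return [{"home": home, "school": school, "passed": i < n_passed}
            for i in range(n)]

def pass_rate(rows):
    """Proportion of pupils in `rows` who passed."""
    return sum(r["passed"] for r in rows) / len(rows)

def rate_gap(rows):
    """Pass rate in selective schools minus pass rate in non-selective ones."""
    selective = [r for r in rows if r["school"] == "selective"]
    other = [r for r in rows if r["school"] == "non-selective"]
    return pass_rate(selective) - pass_rate(other)

# Invented data: home background drives both school type and success.
pupils = (
    make_rows("advantaged", "selective", 80, 64)
    + make_rows("advantaged", "non-selective", 20, 16)
    + make_rows("disadvantaged", "selective", 20, 8)
    + make_rows("disadvantaged", "non-selective", 80, 32)
)

# Crude comparison, ignoring home background.
crude_gap = rate_gap(pupils)

# Controlled comparison: the same gap within each home-background stratum.
controlled_gaps = {
    home: rate_gap([r for r in pupils if r["home"] == home])
    for home in ("advantaged", "disadvantaged")
}

print(f"crude gap between school types: {crude_gap:.2f}")
for home, gap in controlled_gaps.items():
    print(f"gap among {home} pupils: {gap:.2f}")
```

In these invented data the crude gap of 24 percentage points vanishes within each stratum, because advantaged pupils are both more likely to attend selective schools and more likely to pass. Real survey data rarely behave so neatly, which is why techniques that control for several variables at once are needed.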
The survey as a research strategy
This section considers some of the main characteristics of the survey as a research strategy, taking account both of its strengths and its limitations. As well as offering guidance on the choice between strategies, this discussion aims to help the reader assess how a survey could be linked to and coordinated with other types of research strategy so as to circumvent the limitations of each.

Extensive/intensive
Surveys are the prime example of an extensive research technique in the social sciences, one capable of gathering comparable information from respondents across a wide range of different social groups. One frequently-used tactic is to employ a survey in the first phase of a project to establish what the general outlines of the researchable problem are and then to use the data collected to design a more intensive second phase using case studies or other intensive approaches.
For example, a hypothetical investigation into homeworking might use a survey to map the sectors of business and industry in which it was most prevalent and to establish what types of employee were working from home with what general levels of success and satisfaction. A second stage could then adopt a narrower but more detailed focus. On the basis of the results of phase one, it would probably be possible to locate organizations that use homeworking intensively and also some that have tried and abandoned it, and to gather more information about the arrangements in these contrasting instances. Alternatively, it might be possible to set up a programme of in-depth interviews with individuals who have homeworking experience that would expand the data available on matters such as domestic management problems and communication with work colleagues.
This is merely an illustration of one major strength of the survey: it is also entirely possible, provided the topic and setting are suited, to design a survey that is aimed at a relatively small group of respondents and which collects detailed data from them in a single collection operation.

Naturalness/artificiality/intrusiveness
Surveys stand in an intermediate position between a highly naturalistic strategy such as participant observation and the clearly artificial laboratory experiment. A well-conducted interview has some of the character and familiarity of a normal conversation (and it may take place in the respondent's home, workplace or other familiar setting) but it is nevertheless a conversation with an interviewer who is normally a stranger. Although there are advantages in speaking to respondents on their home 'ground', interviewers are usually entering an environment that has, to a greater or lesser extent, been prepared to receive them, so the situation cannot be regarded as entirely authentic. Indeed, if there is limited space and privacy, an entirely 'normal' and authentic situation can undermine the possibility of conducting any kind of interview because of interruptions from other family members, colleagues or the telephone. Street interviewing is self-evidently highly intrusive and refusals are common. Self-completion instruments such as email or postal questionnaires are low on naturalism but they have the balancing advantage of allowing the respondent to select the time for their completion, an option that significantly reduces their intrusiveness.
Qualitative/quantitative
Surveys are often characterized as a pre-eminently quantitative research strategy but this is a misperception. A prime advantage of surveys is precisely that they allow the simultaneous collection of both types of data. Open-ended questions are not simply devices to deal with the cases not covered by the closed categories offered in a previous question. The material they elicit can open up important insights into respondent motivation and perceptions. There are, of course, limits on the extent to which the respondents to postal questionnaires can be expected to write lengthy essays on their views or preferences, and these are situations in which personal interviews may be preferable. In general, the capacity of surveys to deliver mutually supporting qualitative and quantitative data should not be neglected.
Causal inference
In some textbooks, the social survey is compared unfavourably with the experiment and is portrayed as a poor and logically deficient relation. As noted on page 8, this is largely because the classical laboratory experimenter has the advantage of being able to manipulate the key independent variables in 'real time'. In addition, confounding factors can be controlled through the laboratory isolation of the subjects and their random allocation to the experimental and control groups. In terms of making causal inferences from data, the laboratory experiment appears to have a major advantage over the survey which, as we have seen, has to reconstruct naturally-occurring causal processes after they have taken place (ex post facto) through statistical manipulation of the data. One of the several problems this creates is ambiguity about the precise sequence of changes and thus potential uncertainty over whether the data imply causation or only co-variation. There are two points to be made here. The first is that a properly designed survey should be able to reconstruct causal relationships, but it requires careful design and it necessitates a sample large enough to permit the use of sufficiently sophisticated statistical tools. Second, the analysis above neglects the role of theory: without an adequate theoretical framework in play, neither the experimenter nor the surveyor is in a position to identify which are the salient variables to include in the design or to make appropriate inferences about any patterns in the observed data.
Flexibility/rigidity
A frequently overlooked difference between research strategies is the different degrees of flexibility they permit the researcher. Some essentially qualitative research strategies allow a preliminary analysis of the first wave of data so that the outcome can be used to determine the venues and the topics to be pursued in the second wave, and so on. This alternation between data collection and analysis is especially useful in preliminary research where there are many parallel avenues that could be explored, in the light of which the researcher wants as many options left open as possible. Surveys do not lend themselves to such a rolling strategy. They may be thought of as 'front-loaded' in the sense that a series of major interlocking decisions covering all the main components mentioned under the previous key heading need to be made before any data collection can begin. Once testing and piloting have been completed, the standardizing logic of surveys prohibits changes to the definition of the target population, the sample design or the contents of the questionnaire/schedule. It is not possible to change tack if early responses do not live up to expectations. This rigidity reinforces the emphasis that should be placed on thorough preparation and pre-testing.
Types of survey design
There are three basic designs for surveys that reflect the main directions of comparison that will be made at the data analysis stage.

Cross-classificatory (cross-sectional)
In some senses, this is the fundamental survey design. There is a single stage of data collection (sometimes referred to as 'single shot') and the unit of analysis is a case with all of its characteristics (variables). Although a case is frequently equivalent to a respondent, this is not inevitable (the case might actually be a household and the respondent simply a member providing information about its expenditure patterns or leisure activities). The main focus is the comparison of aggregate groups of cases characterized by different values on key variables rather than the profile of characteristics possessed by any particular case. The objective is to see if groups of cases have co-varying values on other, dependent, variables. The Travel Survey is an example of a cross-sectional survey. The aim is to see what characteristics go with the choice of different modes of travel for the journey to work and what attitudes are associated with, for example, car use as against the use of bicycles or public transport. The analysis of stand-alone cross-classificatory surveys revolves around the construction and comparison of such subgroups. Part of this design's strength lies in the way an analyst can chop up a sample into many quite different sub-groups to explore the separate dimensions of the research topic.
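The subgroup comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cases, the variable names (`mode`, `supports_parking_charges`) and the function `proportion_by_group` are invented for the example and are not taken from the Travel Survey.

```python
from collections import defaultdict

# Hypothetical cases: one dict per case, holding that case's variables.
cases = [
    {"mode": "car", "supports_parking_charges": False},
    {"mode": "car", "supports_parking_charges": False},
    {"mode": "car", "supports_parking_charges": True},
    {"mode": "car", "supports_parking_charges": False},
    {"mode": "bicycle", "supports_parking_charges": True},
    {"mode": "bicycle", "supports_parking_charges": True},
    {"mode": "public transport", "supports_parking_charges": True},
    {"mode": "public transport", "supports_parking_charges": False},
]


def proportion_by_group(cases, group_var, outcome_var):
    """Share of cases in each subgroup for which outcome_var is true."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for case in cases:
        group = case[group_var]
        totals[group] += 1
        if case[outcome_var]:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


# Compare subgroups defined by the key variable on the dependent variable.
shares = proportion_by_group(cases, "mode", "supports_parking_charges")
for group, share in shares.items():
    print(f"{group}: {share:.0%}")
```

The same function can be called with a different grouping variable, which is the sense in which a single sample can be chopped up into many different sub-groups.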
Longitudinal and panel studies
In a longitudinal survey, data collection takes place repeatedly in order to monitor the operation of social processes over time (the data generated are known as time series). In the special case of a panel study, the same respondents are involved at each stage (allowing for drop-outs). The British Household Panel Survey (see Box 2.1) is a large-scale example of a panel study. The presence of time as an explicit dimension in these research designs makes certain kinds of causal inference much easier. There are various statistical techniques tailored to the requirements of longitudinal studies: these include those based on ARIMA (AutoRegressive Integrated Moving Average, also known as Box-Jenkins models) and actuarial techniques dealing with the differential survival of cases in a population.
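What distinguishes panel data is that each respondent can be matched with themselves across waves, while drop-outs have to be identified and set aside. A minimal sketch of that matching follows, assuming two waves keyed by a respondent identifier. The identifiers, the income figures and the function `wave_changes` are hypothetical and do not describe how any particular panel study processes its data.

```python
def wave_changes(wave1, wave2):
    """Match respondents across two waves.

    Returns (changes, dropouts): the wave-to-wave change for everyone
    present in both waves, and the ids seen in wave 1 but not wave 2.
    """
    changes = {rid: wave2[rid] - wave1[rid] for rid in wave1 if rid in wave2}
    dropouts = sorted(rid for rid in wave1 if rid not in wave2)
    return changes, dropouts


# Hypothetical weekly household income recorded at two waves.
wave1 = {"r01": 310, "r02": 455, "r03": 280}
wave2 = {"r01": 330, "r03": 265}  # r02 did not respond at wave 2

changes, dropouts = wave_changes(wave1, wave2)
print(changes)
print(dropouts)
```

Because the change is computed within each respondent, the temporal order of the two measurements is explicit, which is what makes some causal inferences easier than in a single-shot design.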
Hierarchical
In this more complex design, the main line of comparison is between the characteristics of a case and the characteristics of a collectivity in which the case is a unit or member. One of the principal research aims in hierarchical designs is to trace the influence of the collectivity on its members. An illustration of the application of hierarchical analysis would be a study of the recidivism - relapsing into crime - of the ex-inmates of a prison. The research question might be whether recidivism was closely connected to the characteristics of the prison such as its physical location, staff characteristics and the average length of sentences, and whether any of these factors interacted with features of the personal biography of the individual inmates (such as the number and type of offences they had previously committed) so as to increase the chances of further offending. Notice that there are two logically different kinds of variable involved in this design: prisons have some properties as institutions that cannot be inferred from the aggregation of the characteristics of individual inmates (and vice versa). Multilevel statistical models are available to facilitate the analysis of this kind of two-level data.

As well as showing that the nominated independent variables co-vary with the nominated dependent variables, all three types of design must include the capacity to detect the operation of rival independent variables that could lead to misinterpretation of the findings. The identity of these rival variables may be self-evident, or alternatively the main candidates will have been proposed in the research literature. One example of a potentially confounding variable is differential exposure to atmospheric pollutants as an alternative cause of cancer in the study discussed in Box 1.4. This factor was controlled in this study by the ability of the investigators to demonstrate that there was a higher cancer prevalence for smoking doctors than non-smoking doctors across both rural and urban locations, where atmospheric pollutants would be absent and present respectively. A further example is the need to control for individual ability while studying the impact of class size and teaching styles on pupil academic achievement (see Bennett et al. 1976). Confounding variables need to be measured and statistically controlled, or alternatively be excluded entirely from a design (for example, by defining the target population narrowly).
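The logic of controlling a confounding variable by comparing within strata can be sketched as follows. The counts below are invented purely for illustration and are not the figures from the study in Box 1.4. The function names are likewise hypothetical.

```python
# Hypothetical counts: (location, smoker) -> (cases with the outcome, total).
# These numbers are made up for the sketch; they are not study data.
counts = {
    ("rural", True): (30, 1000),
    ("rural", False): (10, 1000),
    ("urban", True): (45, 1000),
    ("urban", False): (15, 1000),
}


def prevalence_gap_by_stratum(counts):
    """Prevalence among smokers minus prevalence among non-smokers,
    computed separately within each level of the confounder (location)."""
    gaps = {}
    for location in sorted({loc for loc, _ in counts}):
        exposed_cases, exposed_total = counts[(location, True)]
        unexposed_cases, unexposed_total = counts[(location, False)]
        gaps[location] = (exposed_cases / exposed_total
                          - unexposed_cases / unexposed_total)
    return gaps


def holds_in_every_stratum(gaps):
    """True if the exposed group has the higher prevalence in all strata."""
    return all(gap > 0 for gap in gaps.values())


gaps = prevalence_gap_by_stratum(counts)
print(gaps)
print(holds_in_every_stratum(gaps))
```

If the gap between smokers and non-smokers persists in both the rural and the urban stratum, location (and whatever varies with it, such as pollution) cannot by itself account for the association.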
Box 2.1 Panel studies
Some of the largest scale surveys use a panel design. One of the most important in the UK is the British Household Panel Survey (BHPS) administered by the Institute of Social and Economic Research at the University of Essex. In surveys of this type, a designated group of households are monitored over periods as long as a decade with repeated waves of data collection. This enables their response to general shifts that have occurred in the economic and social environment to be examined. Such research also provides insights into the manner in which the impact of macro factors (for instance, changes in the labour market opportunities for women) depends on micro factors such as the age and generational composition of a household. Some studies attempt to interview all the adult members in the household, while others use a key informant to supply information on behalf of themselves and the others. It is common for a core set of basic questions to be used with every household, complemented by a selection from additional sets put only to households of a particular type (low income, single parent, gross income above a threshold). In some studies, the second generation households set up by the children from first generation families are tracked and incorporated into the research as they are formed. Such activities require large teams and substantial budgets. For further information on the BHPS, see http://www.iser.essex.ac.uk/bhps/doc/index.htm

Relations between theory and research
The natural way to launch this topic is to define 'theory', but this must be done in a preliminary fashion so as to avoid discussions that would go beyond the scope of this book.
• There is a wide measure of agreement throughout the social and natural sciences that theories are the most important and the most intellectually rigorous means of producing explanations of phenomena (and the most satisfactory basis for predictions). Observations that lack a theoretical underpinning cannot provide a basis for explanation or prediction.
• There is no consensus on what the precise technical specifications for a theory should be and only partial agreement over what constitutes an adequate explanation.
• Every systematic discipline possesses a changing set of concepts that organizes knowledge within that field and identifies the entities in the world with which inquiry is concerned. At least some of these concepts have an abstract and idealized character - their existence cannot be directly substantiated but must be inferred from their effects on what is observable. Some social science examples, more or less at random, include perfect competition, the self, the Schumpeterian workfare state, governmentality.
• There is an on-going debate between two main philosophical camps over the status of theoretical entities: realists believe they represent mechanisms and processes that do exist in the world, instrumentalists see them as
simplifying devices, helpful to make sense of research data but not to be credited with an independent existence (Chalmers 1999).
• A theory is a linguistic construction that, critically among a range of functions, states the existence of a general law-like relationship between two or more abstract concepts. Theories are frequently developed discursively by their authors so that the presentation is mixed up with extraneous illustration, comment and criticism: a schematic reconstruction of a theory strips it down to the essential propositions and it is these propositions that could be incorporated into a research design or might be invoked to interpret research findings.
• The orthodox perspective on explanation (the covering-law model) portrays it as having the form of a deductive argument, that is, one in which the truth of the premises guarantees the truth of the conclusion. The phenomenon to be explained is described within a statement that forms the conclusion of the argument: at least one of the premises has to be a statement formulating a law-like generalization borrowed from a theory (for example, 'revolutions occur during periods of rising mass expectations', 'suicide rates are positively associated with the degree of individualism in society'). The other premise states a list of 'initial conditions' or limiting states of affairs that have to be satisfied for the theoretical generalization to hold.
• All empirical research rests on some theoretical assumptions about what entities exist and are capable of being investigated. In some cases, the assumptions are made explicit and the perspectives acknowledged, in others they are implicit and need to be teased out. For example, even a thoroughly descriptive inquiry like the Travel Survey rests on a variety of theoretical premises. By asking about home addresses, number of cars in the household and work patterns, the assumption is being made that the selection of a mode of commuting is an essentially rational choice based largely on individual assessments of time, cost and convenience. However, choice of mode of transport could arguably rest on quite different and less calculative considerations. A preference for car use might reflect, at least in part, a desire for privacy and the feelings of invulnerability that some individuals associate with motoring. For others, car use is likely to carry strongly negative connotations because it is perceived as environmentally destructive. By omitting questions that could tap these considerations, the Travel Survey is effectively incorporating one theoretical perspective in preference to others. In summary, some kind of theoretical thinking is always an input to any empirical research.
• At the same time, research may have a theoretical output. The analysis of survey data, as Chapter 8 will argue, is never simply a matter of identifying statistically significant associations. The substantive and theoretical significance of such associations has to be established in the light of the theoretical perspectives built into the research design, or those available beyond it in the discipline or disciplines parenting the research.
• Textbooks on research methods tend to highlight the special and quite rare instances in which a research project revolves around the testing of specific theoretical hypotheses. It is more realistic, however, to acknowledge that a great deal of empirical research is eclectic, drawing on whatever bodies of theoretical thinking seem relevant. Rather than setting out to test a theory, most survey research is either exploratory or developmental in that it seeks tentatively to establish, or modestly advance, theoretical thinking on some topic. Such advance can be achieved in a variety of ways: one is by elaborating key concepts so as to allow a more refined application of a crudely articulated theory; another is to develop the measurement apparatus associated with an existing theory and thereby extend its scope into new areas of application. It is true that a great deal of research has exclusively descriptive goals, though even here secondary analysis can draw out explanatory possibilities unrecognized by the original investigators.
Incorporating a theoretical dimension into surveys
In the light of the discussion in the previous section, it should be apparent that creating a survey that incorporates theoretical thinking is an exercise that demands some creative thought. However, a checklist always helps!
1 Will the designated target population have the appropriate attributes for exploring or testing the theoretical perspective(s) of interest? (see Chapter 4, page 63)
2 Will the chosen survey design permit the relevant logical comparisons to be made that exploring or testing the theories requires? Will the temporal order of changes in the values of variables be clear? (see the previous section)
3 Will the questions posed enable the derivation of all the variables that are central to a theory? Will the key theoretical concepts be adequately operationalized?
4 Even though they probably do not belong to the theoretical perspectives on which a survey is based, will there be data available on potentially confounding variables? (see the previous section)
Most of the items on this checklist are dealt with elsewhere in the book so the rest of this section concentrates on the issues raised in point 3 about how to operationalize theoretical concepts within surveys. The task is to find adequate indicators and measurable effects for theoretical mechanisms and processes which are not themselves directly observable or measurable. A leading American innovator in social science research methods, Paul Lazarsfeld, developed a strategy for breaking general theoretical constructs down into their measurable dimensions that has been influential (Lazarsfeld 1958), but this is a challenging task even for the experienced investigator.
There are three main types of measuring device that are used regularly in social surveys: derived measures; ready-made indicators; and psychometric, educational and other tests and scales.
Derived measures
These are simple indicators devised by investigators themselves and built from the responses to a series of questions posed in the interview or questionnaire. A derived variable will usually be constructed in the first stages of data analysis by some form of mathematical summation from several other variables contained in the codebook (see Chapter 7, pages 128-9). Demographic characteristics such as age, sex and family and household size are exceptional in that relatively few other variables of theoretical consequence lend themselves to measurement via the responses to a single direct question. Box 2.2 provides an example of derived variables used in the Travel Survey.
Ready-made indicators
There is a vast range of social, economic, health, educational, social psychological and other types of indicator that have been used in social science research. Some have established positions as standard measures: although not necessarily flawless, their widespread use in the past guarantees further use in the future as new projects seek direct comparability with older ones. The following examples are chosen more or less at random:
• The proportion of pupils in a school eligible for free school dinners is widely used in Britain as a rough and ready comparative measure of the level of social deprivation in the school's catchment area.
• The proportion of households not owning a car, available from British Census data, is similarly used as a simple indicator of the socio-economic character of urban neighbourhoods.
• The Retail Prices Index (RPI) is a key measure of inflation in the UK economy as it affects consumers.
• Deaths per million passenger miles travelled is used in studies of accident risk that wish to compare different modes of transport (planes versus cars) or different environments (motorways versus other trunk roads).
• Box 2.3 examines the use of social class as an indicator.
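Two of the indicators listed above are simple rates, and the arithmetic can be made explicit. The sketch below uses made-up input figures and hypothetical function names. It shows only how such a rate is formed, not how any official statistic is compiled.

```python
def deaths_per_million_passenger_miles(deaths, passenger_miles):
    """Accident-risk rate: deaths divided by passenger miles in millions."""
    return deaths / (passenger_miles / 1_000_000)


def share_of_households_without_car(households_without_car, households_total):
    """Proportion of households in an area that do not own a car."""
    return households_without_car / households_total


# Hypothetical figures chosen only to keep the arithmetic easy to follow.
rate = deaths_per_million_passenger_miles(deaths=12,
                                          passenger_miles=4_000_000_000)
share = share_of_households_without_car(350, 1000)
print(rate, share)
```

Expressing both quantities as rates rather than raw counts is what allows comparison between modes of transport or neighbourhoods of very different size.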
Psychometric, educational and other tests and scales
There is a huge variety of pencil and paper tests dealing with personality characteristics, social attitudes, social psychological factors related to groups and team memberships, and other topics that could potentially be included in interview schedules or questionnaires. Selecting a single topic from the many possibilities, there are several measures of anxiety, depression and suicidal ideation available including the Hospital Anxiety and Depression questionnaire (HAD) (Zigmond and Snaith 1983) and the Beck Anxiety Inventory (Beck et al. 1988). The inclusion of such tests in research may pose a variety of difficulties. With the illustrations cited, there could be ethical problems in asking respondents whether they feel depressed or suicidal if there was any possibility that this could actually trigger such feelings. There are also practical considerations - will completing a lengthy test be boring or take up too much of the respondent's time? Will a fixed test procedure fit in with the rest of the interview or questionnaire? Finally, there are the issues of the adequacy of the measures themselves that are discussed in the next section.
Box 2.2 Derived variables in the Travel Survey
Respondents indicated a large variety of different combinations of main (question 3) and alternative (question 4) modes of commuting, partly because question 4 allowed multiple responses. There was a need to classify commuters into a small number of commuting groups for the analysis. This was done partly on the basis of whether and how the car (including motor bikes) featured in an individual's commuting pattern. Eight derived variables were created, each representing a mode of commuting that reflected a distinctive combination of responses to the two questions:
• Group 1 Exclusive car users: respondents who ticked one from boxes 5, 6, 7 and 8 in response to question 3 and also either 1 or any of 6, 7, 8 or 9 for question 4.
• Group 2 Car users with some public transport: ticked one from 5, 6, 7 and 8 for question 3 and also 3 and/or 4 from question 4.
• Group 3 Exclusive users of foot or pedal bike: ticked 1 or 2 for question 3 and also 1 or 2 and/or 3 for question 4.
• Group 4 Exclusive users of public transport: ticked 3 or 4 for question 3 and also 1 or 4 and/or 5 for question 4.
• Group 5 Car users with some foot or pedal bike: ticked one from 5, 6, 7 and 8 for question 3 and also 2 and/or 3 for question 4.
• Group 6 Foot or pedal bike with some public transport: ticked 1 or 2 for question 3 and also ticked 4 and/or 5 in question 4.
• Group 7 Public transport with some foot or pedal bike: ticked 3 or 4 for question 3 and also 2 and/or 3 for question 4.
• Group 8 Foot or pedal bike with some car use: ticked 1 or 2 for question 3 and also ticked any of 6, 7, 8 or 9 for question 4.
Only a handful of cases did not fit into any of these eight groups. No case could belong to more than one.
The resulting groups did reveal interesting associations with other variables including, fairly obviously, the availability of cars to a household (question 22). Although the Travel Survey did not gather respondents' views about the environment, this might be an additional direction in which associations could be found. This is a very simple example of how, generally, classifications based on derived variables can play a role in bridging the gap between the response to a specific question and constructs which are closer to the realm of theory.
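The general idea of deriving one classification from a main-mode question and a multiple-response alternatives question can be sketched in code. The sketch deliberately does not reproduce the box codes from the Travel Survey questionnaire. It assumes responses have already been recoded into three hypothetical mode labels, and the priority rule used when several alternatives are ticked is also an assumption made for the example.

```python
# Hypothetical recoded mode labels (not the questionnaire's box numbers).
MODES = ("car", "public transport", "foot or pedal bike")


def commuting_group(main_mode, alternative_modes):
    """Derive a single commuting group from a main mode (one answer) and
    a collection of alternative modes (a multiple-response question)."""
    if main_mode not in MODES:
        raise ValueError(f"unknown main mode: {main_mode!r}")
    others = set(alternative_modes) - {main_mode}
    unknown = others - set(MODES)
    if unknown:
        raise ValueError(f"unknown alternative modes: {sorted(unknown)}")
    if not others:
        return f"exclusive {main_mode}"
    # Assumption: if several alternatives were ticked, the first one in
    # MODES order decides the group, so each case lands in exactly one.
    for mode in MODES:
        if mode in others:
            return f"{main_mode} with some {mode}"


print(commuting_group("car", []))
print(commuting_group("foot or pedal bike", ["public transport"]))
```

Whatever rule is chosen, the design goal is the one the box states: every case should fall into at most one group, so the derived variable can be used like any other single categorical variable in the analysis.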
Box 2.3 Measuring social class
Social class is a key concept in both classical and contemporary social theory. There is a voluminous debate, on the one hand over how the abstract conception should be formulated, and on the other about what empirical indicators are appropriate. Since 1911, the Registrar General's classification of what were originally called 'social grades', based on industrial group, occupation and level of skill, has been used in the United Kingdom as one of the principal empirical indicators of social class, particularly in officially-sponsored research (Rose and O'Reilly 1997: 1). This classification was originally devised by a medical statistician to examine differentials in mortality and fertility rates. It was renamed 'social class based on occupation' (SC) in 1990 by which point it had been used in innumerable research studies. The categories are as follows (OPCS 1991: 12):
I Professional occupations
II Managerial and technical (formerly 'Intermediate')
III Skilled occupations
    (N) Non-manual
    (M) Manual
IV Partly-skilled occupations
V Unskilled occupations
Over time, SC became increasingly incongruent with the prevailing theoretical approaches to social class and was also criticized for lacking reliability and validity (see page 39). A variety of alternatives, such as the twenty category classification by socio-economic group (SEG - OPCS 1991: 13-14), the Goldthorpe classification based on employment relations (Rose and O'Reilly 1997: 40-8), the Institute of Practitioners in Advertising's Social Grade Scheme (A, B, C1, etc), widely used in market research, and Erik Olin Wright's schema based on Marxian class theory (Wright 1985), have been devised and applied in empirical research.
In 1994, the Office for National Statistics (ONS), the UK government agency responsible for SC and SEG, commissioned a review of existing class classifications with the intention of producing a revised scheme. The 'collapsed', eight category, interim version of the revised socio-economic classification (SEC) which resulted, based largely on the Goldthorpe approach, is set out below:
1 Higher professionals/senior managers
2 Associate professionals/junior managers
3 Other administrative and clerical workers
4 Own account non-professional
5 Supervisors, technicians and related workers
6 Intermediate workers
7 Other workers
8 Never worked/other inactive
Unlike SC, SEC has been subjected to extensive testing to establish its validity using data collected by ONS from its Omnibus Survey and the Labour Force Survey. SEC also has much more explicit links with theory than the SC. A version was used in the 2001 Census in Britain. In order to use the SEC in a survey, questions will be needed that elicit the following three characteristics from a respondent:
• occupation
• size of employing establishment (if any)
• employment status (employer, employee, self-employed, not active)
Reliability and validity U s i n g pre-developed indicators a n d tests gives p r o m i n e n c e t o the issues o f r e l i a b i l i t y a n d v a l i d i t y . R e l i a b i l i t y is a measure o f the extent t o w h i c h the results o f an i n d i c a t o r or test are consistent over t i m e . T h i s consistency can itself be measured i n the f o r m o f a statistical coefficient o f r e p r o d u c i b i l i t y , o f t e n Cronbach's alpha, w h i c h is s i m i l a r t o a c o r r e l a t i o n coefficient (see page 152). There are several d i f f e r e n t comparisons t h a t can be made t o examine r e l i a b i l i t y : • Test-retest:
respondents
complete
the same i n s t r u m e n t o n d i f f e r e n t
occasions. • Internal
consistency:
i f a psychometric or other pencil a n d paper test con-
sists o f m a n y items t a p p i n g the same u n d e r l y i n g concept, s p l i t - h a l f m e t h ods can be used t o compare the consistency o f results between (say) o d d and even n u m b e r e d items.
• Inter-observer reliability: will different interviewers using the same schedule produce equivalent responses from the interviewee?

A rule of thumb often quoted is that reliability coefficients should be at least 0.7 though, as with many rules of thumb, it is precise but arbitrary.

Validity raises the question of whether a measuring device is actually connected adequately to the theoretical mechanism, process or construct it was intended to capture. Do, for example, high scores on the HAD questionnaire correlate with cases of clinically-definable depression? Once again there is a variety of approaches to justifying an instrument:

• Content validity: this is decided by a panel of experts who review whether a measure does everything it should: it is clearly a pretty flimsy test and raises 'who validates the validator' questions!
• Concurrent validity: this measures a construct's validity against an unimpeachable standard, another form of measurement which has itself demonstrable validity but which may be complex, expensive or have other restrictions on its use: such a standard is obviously not always available.
• Predictive validity: can the measure successfully identify outcomes and consequences? Do respondents scoring highly on HAD but without symptoms at the time of testing subsequently get diagnosed as clinically depressed? Because many factors may intervene after testing to prevent or delay outcomes, predictive validity is often hard to establish with certainty.
• Construct validity: this looks back at the performance of a measure over time, preferably covering a wide range of studies, to see if it has produced fruitful findings. Thus, it would be possible to review the use of (say) the SC measure of social class to see whether it was, and remains, an effective means of explaining health differentials, voting patterns and other phenomena which are theoretically linked to class membership. In fact, such a review of construct validity was conducted for the SC measure by an academic panel on behalf of the ONS. They concluded that SC was not valid and recommended its replacement by SEC (see Box 2.3).

One of the significant advantages of using established measuring instruments is that the burden of establishing reliability and validity has already fallen on another's shoulders.
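The internal consistency checks described above can be illustrated with a short calculation. The following is a minimal sketch in Python rather than a definitive procedure: the scores are invented for illustration, the function names are hypothetical, and the Spearman-Brown correction applied to the odd/even correlation is a common convention that we are assuming here, not something prescribed in the discussion above.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                                    # number of items
    item_variances = [variance(col) for col in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


def split_half(rows):
    """Odd/even split-half reliability with the Spearman-Brown correction."""
    odd = [sum(row[0::2]) for row in rows]     # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in rows]    # items 2, 4, 6, ...
    r = pearson_r(odd, even)
    return 2 * r / (1 + r)


# Hypothetical data: five respondents answering four items scored 1 to 5.
scores = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]

alpha = cronbach_alpha(scores)
halves = split_half(scores)
print(f"Cronbach's alpha: {alpha:.2f}")
print(f"Split-half (Spearman-Brown): {halves:.2f}")
print("Meets the 0.7 rule of thumb:", alpha >= 0.7)
```

With a real instrument the rows would come from the survey data file, and a coefficient computed from only five respondents would be far too unstable to report.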
Key summary
2 The use of standardized questionnaires and interview schedules designed with an emphasis on respondents' understanding and convenience
Further reading

An excellent guide to the evolution of social surveys and also a defence of their utility, which is also referred to in the further reading for Chapter 1, is Marsh (1982) The Survey Method. The first three chapters of Hughes (1976) Sociological Analysis: Methods of Discovery are a good introduction to the connections between theory and research. Hughes and Sharrock (1997) The Philosophy of Social Research, 3rd edition, is a guide to different styles of research and the kind of philosophical underpinnings each has. There are several excellent introductions to the philosophy of science including Chalmers (1999) What is This Thing Called Science?, 3rd edition. Miller's (1991) Handbook of Research Design and Social Measurement, 5th edition and Murphy et al. (1994) Tests in Print (IV) are among the guides to published tests and scales. Litwin (1995) How to Measure Survey Reliability and Validity provides a brief guide to the topic indicated by the title. (This latter volume is part of Fink, ed. (1995) The Survey Handbook, which contains small guides on most aspects of surveying.)
Planning your project
Reviewing your assets

In Chapter 2 (page 30), it was suggested that surveys were less flexible than some other research strategies because the need for standardization leaves little scope for modifications to be introduced in mid-course. Careful advance planning is therefore essential. A realistic initial appraisal of the resources that you can bring to bear to carry out the survey will help to confirm that the operation envisaged is viable. It will also provide a basis for constructing an outline timetable that covers the whole of the project.
Background information and library resources

Even for small-scale community and in-house organizational surveys, preliminary research is desirable to confirm that the information being sought has not already been collected in other exercises. This will mean approaching personnel in the appropriate organizations and it will probably be necessary to reveal your intentions, at least in outline, in order to secure their cooperation. Some library preparation is important for nearly all surveys:

• to establish what information is already available about the proposed target population;
• for surveys with an explicit theoretical dimension, to investigate how other research has applied the concepts and perspectives in which you are interested;
• to track down comparable previous research, especially previous surveys, even though they may have taken a different tack;
• to check any indicators or scales that you propose to employ.

Only the largest public libraries in major cities will be able to offer much help with some of these. Smaller public libraries in the UK should have Social Trends, an annual digest of official statistics published by the Office for National Statistics, and this is a very useful starting point in the search for statistical information published by and on behalf of government. The National Statistics website is at http://www.statistics.gov.uk/.
For US information, the following are good starting points: http://www.whitehouse.gov/news/fsbr.html for links to statistical reports from US Federal agencies; http://www.gla.ac.uk/Library/Depts/MOPS/, a guide to the US Government statistics that are available on the Internet; http://www.fedstats.gov/ for links to over one hundred Federal agencies; http://www.census.gov/, US Bureau of the Census Site. For Australia, see http://www.abs.gov.au, Australian Bureau of Statistics information service. For Canada, see http://www.statcan.ca, Statistics Canada site.

Some of the background material will be in academic journals to which only university and research institute libraries subscribe. Inter-library loans can be arranged but they can be slow and a cover charge will be applied for each item. The findings and the instrumentation for many surveys are never published in an orthodox fashion for a variety of reasons: the documentation for them may have to be obtained by contacting the institutional sponsors of the research.
The University of Essex at Colchester, UK, has several resources of potential value to investigators planning surveys.

• The Data Archive is a collection of over 4000 datasets from surveys conducted in Britain and other countries, especially Europe and North America. These datasets are available to academic researchers as computer database files. The URL is: http://www.data-archive.ac.uk/
• Qualidata is the Economic and Social Research Council's archive of data from qualitative research projects which is itself searchable on-line. The URL is: http://www.essex.ac.uk/qualidata/
• The Institute for Social and Economic Research (ISER) is responsible for running several large scale surveys and much of the material related to them, including instrumentation and findings, is available on-line. The URL is: http://www.iser.essex.ac.uk/
Human resources

Even if you are conducting a survey in a solo capacity, there may be additional help you can call on for some of the labour-intensive tasks. Table 3.1 identifies suitable tasks that can be allocated to volunteers or casually paid helpers. Some tasks like interviewing and coding should be given only to suitably trained and responsible assistants capable of working independently. Even an essentially routine task like preparing a mailshot can require supervision to deal with any complications. Some kinds of error that could occur at this early stage are not easily reversible and could jeopardize the entire survey. Always be cautious in your estimates of the productivity of inexperienced volunteers or paid helpers: they may face steep learning curves and they may not be as committed as you to quality and sustained performance.

Another kind of human resource is assistance from experts. Two kinds are mentioned later in the book. The first is the 'insider', a member or associate of the group that you are researching, or an employee within an organization in which you are carrying out data collection. Insiders can play an invaluable role particularly if no one in the research team has first-hand knowledge about the research groups or the venue. This kind of associate can help you before research begins with your request for access and your data collection instruments, and again after data collection with the interpretation and presentation of findings.
The second kind of expert is a statistician, or anyone with substantial experience of research design and data handling, whom you may possibly need to consult regarding unusual or complex sample design and analysis problems.

One way to make the most of limited human resources is to buy in commercial research services for parts of a project. In principle, almost any element of the research process can be purchased from suitable agencies: in practice, costs are considerable and your budget, if it exists, may not stretch (although inquiries about approximate costs will do no harm). The most cost-effective element to sub-contract may well be data-entry: the keyboarders employed by specialized 'data capture' agencies can transfer all the responses from questionnaires or interviewer-completed schedules into a computer data file, offering the additional benefit of validated data entry (see Chapter 7). The charges are usually calculated from the total number of key depressions plus an initial 'set-up' fee. For instruments without a large number of open-ended responses, the costs can be very reasonable and the time-saving considerable.

Table 3.1 Delegable tasks in surveying

Task                                   Skills required
Checking databases, labels             Clerical, basic computer
Stuffing and unstuffing mailshots      Manual/clerical
Data entry                             Basic computer
Telephone follow-ups                   Basic communication skills
Transcription of taped interviews      Audio-typing
Financial resources

Many people without experience of social research express shock and dismay when they are given a quotation for conducting what they consider to be a small-scale research project. There is an enduring belief that social research (unlike 'real' scientific research) can be conducted on a shoestring. Unfortunately, the fear that lay sponsors will baulk at the costs sometimes leads first-time researchers, anxious for support for a cherished project, to produce 'optimistic' estimates that underestimate the true costs and simply help to perpetuate belief in the shoestring. The list below gives some rough estimates of the costs, current at the time of publication, of selected elements of conducting a survey.

• It will cost at least £700 for an audio-typist to produce transcripts from twenty 45-minute taped personal interviews.
• Ignoring the design process, to distribute 2000 copies of a two-sided A4 postal questionnaire together with a one-sided A4 covering letter, and get back 1000 responses, allowing for photocopying, stationery, paying stuffers, and second class postage out and back, will cost around £1000.
• It will cost about £450 to have a data capture agency process the 1000 questionnaires, assuming there was an average of 250 key depressions per response.
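The rough figures in the list above imply unit costs that can be useful when scaling a budget up or down. The following Python sketch simply does that arithmetic. The function names are hypothetical, and the assumption that costs scale linearly with volume (with any agency set-up fee folded into the per-key rate) is ours for illustration only: real quotations will not behave so neatly.

```python
from math import ceil

# Rough figures quoted in the list above (pounds sterling, at time of publication).
TRANSCRIPTION_COST = 700     # twenty 45-minute taped interviews
INTERVIEWS = 20
MINUTES_PER_INTERVIEW = 45
POSTAL_COST = 1000           # 2000 questionnaires sent, 1000 returned
SENT = 2000
RETURNED = 1000
CAPTURE_COST = 450           # 1000 questionnaires keyed in
KEYS_PER_RESPONSE = 250


def unit_rates():
    """Unit costs implied by the quoted figures."""
    tape_hours = INTERVIEWS * MINUTES_PER_INTERVIEW / 60
    return {
        "transcription_per_interview": TRANSCRIPTION_COST / INTERVIEWS,
        "transcription_per_tape_hour": TRANSCRIPTION_COST / tape_hours,
        "postal_per_questionnaire_sent": POSTAL_COST / SENT,
        "postal_per_response": POSTAL_COST / RETURNED,
        "capture_per_key_depression": CAPTURE_COST / (RETURNED * KEYS_PER_RESPONSE),
    }


def postal_survey_estimate(target_responses, response_rate=RETURNED / SENT):
    """Hypothetical linear scaling of the postal and data capture costs."""
    rates = unit_rates()
    to_send = ceil(target_responses / response_rate)
    postal = to_send * rates["postal_per_questionnaire_sent"]
    capture = target_responses * KEYS_PER_RESPONSE * rates["capture_per_key_depression"]
    return {"to_send": to_send, "postal": postal, "capture": capture,
            "total": postal + capture}


for name, value in unit_rates().items():
    print(f"{name}: {value:.4f} pounds")
print(postal_survey_estimate(500))
```

The default response rate of one in two is simply the ratio in the quoted example (1000 back from 2000 sent). A more cautious estimate would pass a lower rate and so send out more questionnaires for the same number of responses.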
Technical resources

Small-scale survey research fortunately rarely requires sophisticated equipment (beyond computers which are dealt with on page 47). One major exception is that projects based on tape-recorded interviews need to borrow or invest in a good quality audio-cassette recorder (preferably one with an auto-reverse on record facility so that the interviewer does not have to manually stop proceedings and turn the cassette over after 45 minutes). If a large number of interviews have to be transcribed, a dedicated cassette transcriber is preferable to an ordinary tape-recorder. These machines have a 'go back' facility that rewinds the tape a little and replays so that the typist can listen again to an inaudible phrase: they also come with a foot control and earphones.
Setting the timetable

Precise answers about how long any phase of a survey should take to conduct are difficult in the absence of hard information regarding the scale of the research and the resources available. Nevertheless, it is possible to provide some general pointers.

• You clearly need to work backwards from immovable deadlines (the delivery of the project report fixed by a sponsor, or a limited window of opportunity for data collection set by holidays or other restrictions on your access to the target population).
• It is usual to give respondents a minimum of two weeks to complete postal questionnaires.
• An extensive programme of personal interviews can be very time-consuming, especially if travel is involved: the overhead of arrangements (and re-arrangements) can be difficult to manage at the same time as actually conducting other interviews (can someone back at base do this?). Do not over-estimate the total number of interviews that a small team is capable of conducting, nor under-estimate the time it will take to complete them.
• Transcribing tape-recorded interviews is also time-consuming: an experienced audiotypist under optimum conditions might take four times the length of the interview for a full transcription of a personal interview. Focus groups are considerably more difficult because of multiple and overlapping speakers.
• Analysis expands to fill the time allotted for its completion: in most instances, it is best to settle in advance on a finite period for the analysis, together with a clear understanding of the minimum you hope to achieve within these limits.
• The overheads for collaborating can be considerable. Keeping each other informed, getting people back on track after problems and a variety of other coordination tasks can eat into the productivity gains of having several hands. A team of three people rarely produces three times the output of one member.
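Working backwards from an immovable deadline, as the pointers above recommend, is easy to sketch in code. The Python example below is only an illustration: the phase names, their lengths and the deadline are invented, and it counts calendar days rather than working days. Only the two-week postal window and the 'four times the interview length' transcription guide are taken from the pointers above.

```python
from datetime import date, timedelta


def work_backwards(deadline, phases):
    """Schedule phases back from a fixed deadline.

    phases is a list of (name, days) pairs in the order the work will be done.
    Returns (name, start, end) tuples in that same order.
    """
    schedule = []
    end = deadline
    for name, days in reversed(phases):
        start = end - timedelta(days=days)
        schedule.append((name, start, end))
        end = start
    schedule.reverse()
    return schedule


def transcription_hours(interviews, minutes_each, ratio=4):
    """Typing time, using the 'four times the interview length' guide."""
    return interviews * minutes_each * ratio / 60


# Hypothetical project: only the two-week postal window comes from the text.
phases = [
    ("Questionnaire design and pilot", 14),
    ("Postal fieldwork", 14),
    ("Analysis", 21),
    ("Report writing", 7),
]
plan = work_backwards(date(2001, 6, 29), phases)
for name, start, end in plan:
    print(f"{start} to {end}  {name}")
print("Transcribing twenty 45-minute interviews:", transcription_hours(20, 45), "hours")
```

If the first start date produced in this way has already passed, either the deadline or the scope of the survey has to change, and it is better to discover this at the planning stage than in mid-course.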
Computing and software resources

Although it is possible to conduct a small-scale survey without computer assistance, it would be perverse to attempt it. Access to a desktop or laptop computer and suitable software is almost certainly the single most important resource that can aid the investigator in a survey project. A computer not only takes the drudgery out of numerical and textual data analysis but with suitable software it can help to design a professional looking questionnaire and ensure accurate data entry. Even a basic machine, equipped only with word processing and spreadsheet applications, can at the minimum help with project record keeping and correspondence. If there are suitable levels of computer literacy in play, such a machine can play a useful role in all phases of a survey.
Hardware

How ancient and limited a machine you can get by on depends very much on the scale of the project and the kind of software you envisage employing. Unless a machine has a hard disc and a central processor (CPU) of roughly equivalent power to an Intel 80486, it can probably be ruled out. (This is not meant to suggest that the computer must be a PC with an Intel or similar processor or running a version of Microsoft's Windows operating system.) The more sophisticated survey analysis packages tend to be demanding in terms of both processor power and memory requirements. Older machines with poorer specifications will certainly struggle to run them and may perform so slowly as to be unusable.

If there are several investigators and/or hired hands collaborating over a project, it is preferable to keep all the data files and other information in one location, invariably on a networked system, where access to them can be maximized. Equally important, there are grave dangers of generating unsynchronized and diverging versions of the same key files when different members of a team store their work in progress in different places.
Software

There are three main types to consider: standard business applications, one-stop applications, and dedicated applications.
Standard business applications

As implied above, a standard word processing application can be useful to create covering letters and to execute mailmerges for address labels. Self-completion questionnaires can be designed on word processors but it requires a little trial and error to get finely tuned results and there is rarely any help on how to get some of the unusual layouts that may be required. A standard office spreadsheet like Microsoft Excel is also extremely useful and can be used to store the data file (see Chapter 7) as well as to produce tables and charts. Although there are fewer ready-made solutions and less convenience than with a specialized survey package, it is nevertheless possible to carry out quite advanced data analysis and to produce excellent charts from a spreadsheet. Business databases such as Microsoft Access can also store the data file but they are often less intuitive to use and do not always have as extensive a range of graphical presentation facilities.
One-stop applications

A one-stop application offers a comprehensive and integrated suite of facilities for conducting surveys including questionnaire design, data entry, data analysis and presentation tables and graphics. There is clearly an advantage in terms of convenience in only having to cope with the interface for a single application, while there may also be cost savings to be had. At the same time, the statistical procedures available in one-stop packages tend to be more limited than those in the leading dedicated packages, though this may not be a problem for a small-scale project. Examples of one-stop packages are:

• PinPoint and its successor KeyPoint, both published by Longman Software (the URL is http://www.longman.net/keypoint/). The latter incorporates facilities for conducting web surveys.
• Sphinx Survey, published by Sage Publications Software and distributed by Scolari (the URL is http://www.scolari.co.uk/). Some versions support the lexical analysis of open-ended responses.
Dedicated applications

A dedicated application concentrates on supporting a specific aspect of survey research, particularly data analysis. As a result, there tend to be more user options and a higher degree of sophistication than in a one-stop package. This can present problems for first time users who can get lost in a sea of difficult choices. On the other hand, there is a very good chance that exactly the facility you are seeking will be available 'off the shelf'. Dedicated software also exists for data entry and charting, but the choice in the data analysis field is especially extensive. Examples of data analysis packages include:
• SPSS (Statistical Package for the Social Sciences), published by SPSS Inc (http://www.spss.com/). The grandfather of survey analysis software with a track record stretching back over 35 years, a very large international user base and associated with a wide range of linked products and services. Supported by extensive documentation and several textbooks (see further reading for Chapter 8).
• GB Stat (distributed by Scolari): offers good import facilities, graphing and export to word processors.

There are several general issues relevant to the selection of software packages for use in survey projects.

• Is it one-stop or dedicated? Can it deliver all the forms of analysis you expect to employ?
• Does it have good import and export facilities? In other words, can it recognize data files that have been constructed in other programmes and computer environments? If it lacks a particular facility, you may want to export some or all of the data to another package that does have it.
• Is the package well supported? This covers on-screen support, technical help-lines, web sites, and paper documentation. Does any of this help come at a price you cannot afford?
• Can it deal with the number of variables and all the types of data you expect to collect? In particular, if you have used open-ended questions, what facilities does it offer for processing the responses?
• Does it provide good presentation facilities (especially tables and charts) that can be imported directly into the word processor you are going to use to produce the report?
• Are there licence restrictions on the way it can be used which will conflict with the way you are planning to run the research? Is there an educational or charity price tariff?
Gaining access to organizations

In Chapter 5 (pages 86 and 90) we consider ways of approaching individual respondents and encouraging their participation. How we introduce and characterize our research on the telephone or in person, or in a covering letter accompanying a questionnaire, is crucial in securing participation and establishing rapport. In many cases, however, our research is conducted in an organizational context. Our survey is not of the general public but of members of an organization, whether a commercial company, a union or professional association, a religious movement or a voluntary agency. In such cases the organization mediates our research relationship with our respondents.
On the face of it, the problem we face is this: gaining access to the organization so that we can survey its members. Whose authorization do we need, and what do we have to say and do to get in? This, though, is only one aspect of access. Access is not a one-off achievement but an ongoing process, with four dimensions (Buchanan et al. 1988): getting in, getting on, getting out and getting back. As Hornsby-Smith (1993) remarks, it can be helpful if we think of our involvement with the organization as an 'access career'.

Getting on points to the need to secure willing cooperation once we have been granted formal rights of access to respondents. Just because senior people have authorized our research does not necessarily mean that the rank-and-file will be enthusiastic. (When Aldridge arrived for one interview with a clergyman, he was immediately asked whether he was a spy from the bishop.) We need to win people over. In a complex organization we may well need to negotiate with a wide range of gatekeepers who can facilitate or hinder our research.

Getting out is also a sensitive process, despite the insensitive language. Getting out is not escape. Once we have gathered our data, we should not simply rush out of the organization breathing a sigh of relief. We have built up obligations to our respondents and to the people who have helped us carry the research forward. We will probably be feeding back our findings to them in one way or another, so we need to maintain our links to them.
One reason for handling getting out sensitively is the need we or others may have to get back. We may, for example, want to conduct follow-up interviews with key informants, or gather more information from company archives. The organization may itself ask us to carry out further work. Even if we have no intention of conducting further work in the organization, others may, so we owe it to them not to leave a sour atmosphere behind.

There cannot be a set formula for securing access in this fourfold meaning of the term. Patience and quiet persistence are often needed, as is the willingness to seize opportunities when they arise - they are seldom predictable. Nor is it always easy to predict which organizations will be the most difficult to access for the purpose of conducting a survey, nor what barriers will need to be overcome. In virtually all cases, confidentiality will have to be negotiated carefully. The researchers may well need to demonstrate their competence to sceptics, since not everyone is convinced of the value of social surveys or the expertise of social scientists. Even when they are not sceptical, many organizations are so time-pressured that they need to be persuaded that their members can spare the time to participate. In some cases, ascribed qualities of the researcher such as gender, age, social status, ethnicity, religious affiliation can be a formidable and even impenetrable barrier.
One advantage of using self-completion questionnaires (see Box 3.1) is their impersonality: the problem of interviewer effects does not arise.
Three methods of gathering data

We can distinguish three vehicles for gathering data from respondents:

1 The self-completion questionnaire
In this method, respondents fill out the questionnaire themselves. It may be a postal (mail) questionnaire, which they complete and return by post. It may be a questionnaire handed to them, for example by a teacher in class or a receptionist in a waiting room, which they are asked to complete on the spot and hand in. Or it may be an email questionnaire which they complete and return electronically.

2 The face-to-face interview
Here the researcher interviews the respondent in person, either in the respondent's home, or in the researcher's office, or in some 'neutral' location. In social research, most interviews are one-to-one, though group interviews are also possible - interviewing the adult partners in a household, for example.

3 The telephone interview
Here the interview is conducted over the telephone - or, in future, the videophone.

We set out in the boxes below the main advantages and disadvantages of each method, before considering the key choices to be made.
Box 3.1
Self-completion questionnaires: pros and cons

Advantages
Cost: The cost of reproducing and distributing questionnaires is relatively low.
Time to collect data: Questionnaires can be distributed and returned quickly.
Large samples: Because costs are low and data collection is fast, it is feasible to survey large samples of the population. The method benefits from economies of scale.
Geographical distribution: Since the researcher is not present, the sample can be drawn from a wide geographical area.
No interviewer bias: There is no interviewer to introduce unauthorized comments about the research, the questions or the respondent.
No interviewer effects: Respondents do not have to relate to characteristics of the researcher such as their age, sex, ethnicity, dress or accent.
Handling sensitive topics: Since the researcher is not present, respondents may find it easier to handle sensitive questions, particularly if their responses are anonymous.

Disadvantages
Questionnaire length: Self-completion questionnaires need to be short and also look short, or the response rate will be low.
Simple questions: Complex questions are cumbersome to ask and take too long to answer.
Few open questions: Since written answers to open questions can take a long time, only a few such questions can be asked.
Response rate: Even with good design, response rates can be low unless respondents have strong reasons to participate. Response rates will be underestimated if questionnaires have been sent to people who are not part of the target population or who have moved address. Unless they let us know, we shall count them as refusals when they are not.
Control of context of response: The researcher often has no control over who fills out the questionnaire, nor the spirit in which they do so. Respondents can scan the whole questionnaire first, rather than follow the desired sequence of questions.
Response bias: People who experience literacy problems, or whose mobility is restricted, will be less likely to respond.
Salience: Gauging the salience of items to the respondent can be difficult.
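The point made in Box 3.1 about response rates being underestimated can be illustrated with a little arithmetic. The sketch below is only an illustration: the function name and all the figures are invented for the example, and it assumes we have somehow learned how many questionnaires went to people who had moved or were never in the target population.

```python
def response_rate(returned, sent_out, known_ineligible=0):
    """Proportion of eligible sample members who returned a questionnaire."""
    eligible = sent_out - known_ineligible
    if eligible <= 0:
        raise ValueError("no eligible sample members")
    return returned / eligible


# Hypothetical figures: 1000 questionnaires posted, 400 returned.
crude = response_rate(400, 1000)

# Suppose we later discover that 200 addressees had moved away or were
# never part of the target population. They were not refusals at all.
adjusted = response_rate(400, 1000, known_ineligible=200)

print(f"crude: {crude:.0%}, adjusted: {adjusted:.0%}")
```

With these invented figures the crude rate counts the 200 ineligible addressees as refusals, so it understates the true rate among people who could actually have taken part.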
Box 3.2
Face-to-face interviews: pros and cons

Advantages
Length of interview schedule: Because responses are verbal, it is possible to ask more questions than in a self-completion questionnaire. The appearance of the interview schedule is not relevant to the interviewee.
Complex questions: The presence of the interviewer enables complex questions to be explained, if needed, to the interviewee.
Question skips: As long as they are clear to the interviewer, question skips raise no problems for the respondent.
Open questions: Since respondents do not have to write their answer, open questions can be used more freely.
Salience: The use of open questions, and non-verbal cues from the respondent, enable the interviewer to gauge which items are salient to the respondent and which are of no concern.
Visual aids: Show cards can be used to help respondents frame their answer.
Ranking and rating questions: Relatively complex ranking and rating exercises are possible. For example, occupational titles can be written on cards, and respondents asked to rank them, or sort them into categories, based on criteria such as social status.
Control over context of response: In contrast to self-completion questionnaires, the researcher has control over who responds to questions and the sequence of questions. By establishing good rapport the researcher can ensure that questions are taken seriously.
Rapport: The interviewer's success in achieving a good relationship with the respondent will improve the quality of the answers.
Group interviews: Sometimes we would like responses from more than one person, for example, from the adult members of a household. This is only feasible in a face-to-face interview.

Disadvantages
Cost: Interviews are costly in money and time.
Sample size: Because of the time and money involved, one interviewer can conduct a limited number of interviews each day. There are no economies of scale.
Geographical restrictions: The cost of travel and the time it takes may limit the geographical reach of surveys carried out by interviews.
Time to collect data: Given that interviewing can be taxing for the interviewer, especially when interviews are not wholly structured, any one researcher can only undertake a few interviews each day - often four is the maximum.
Interviewer bias: Interviewers can introduce bias by offering unauthorized comments on the questions, the research or the interviewee, which can lead the respondent in a particular direction.
Interviewer effects: Personal characteristics of the interviewer - such as age, sex, ethnicity, dress or accent - can affect the way in which the interviewee responds.
Leading questions: Even without interviewer bias, leading questions can easily be introduced unwittingly into the less structured part of an interview.
Social desirability: The presence of the interviewer makes it even more likely that the respondent will seek to give socially desirable answers.
Anonymity: Although confidentiality can be guaranteed, anonymity clearly cannot.
Safety: Attention needs to be given to the physical safety of the interviewer, especially if interviews are conducted by one interviewer in the respondent's home.
Box 3.3
Telephone interviews: pros and cons

Advantages
Cost: Costs are far lower than with face-to-face interviews.
Large samples: Because costs are low and data collection is fast, it is more feasible to survey larger samples of the population than if interviews are face-to-face.
Geographical distribution: Since the researcher is not present, the sample can be drawn from a wide geographical area.
Time to collect data: There is no travel time, and the respondent's agreement to participate is quickly obtained.
Question skips: As in face-to-face interviews, provided they are clear to the interviewer, question skips raise no problems for the respondent.
Fewer interviewer effects: Although personal characteristics of the interviewer - such as age, sex, ethnicity, or social class - may be inferred, they are less obvious and intrusive than in face-to-face interviews.
Safety: The physical safety of the interviewer is not an issue.

Disadvantages
Simple questions: Because strong rapport is hard to achieve, and because show cards are not possible, complex questions have to be avoided.
Response bias: Unless great care is taken, socially disadvantaged groups will be underrepresented.
Sensitive questions: Telephone conversations are an unsuitable medium for asking sensitive questions.
Open questions: Open questions are less effective than in face-to-face interviews. On the telephone, respondents' answers are usually brief, and probes have a limited effect.
Limited response categories: Respondents cannot be expected to memorize a long list of response categories. Visual aids such as show cards cannot be used to help respondents frame their answer, as they can in face-to-face interviews.
Contamination by telesales: The telephone is widely used for selling goods and services, often with an initial pretence of conducting research. Genuine research is not always easily distinguished from these other activities.
Cold call: The telephone call comes 'out of the blue'; not having been prepared, respondents may be less likely to agree to take part.
Anonymity: Although confidentiality can be guaranteed, anonymity clearly cannot.
Self-completion questionnaires, face-to-face interviews and telephone interviews are the three main methods of gathering data in social surveys. We may add a fourth, as mentioned already in chapter one: observation. We should also note two variants on self-completion questionnaires: email and interactive surveys, and diaries.
Email and interactive surveys
These are electronic variants of the postal questionnaire, and they offer advantages along with some problems. In an email survey, the questionnaire is distributed and returned electronically. This has advantages over paper questionnaires:
• we can pre-program the order of questions, so that respondents progress through the questionnaire in the sequence we desire without skipping ahead or going back;
• because of this, the problem of question skips does not arise - the program automatically moves to the next relevant question;
• the program can prompt respondents, and alert them to the fact that they have made a mistake - for example, if they try to tick several boxes where only one is required;
• there is no intermediate stage of inputting data; the data are available for immediate analysis;
• there are no problems about how to arrange for the questionnaires to be returned, and no intermediaries to intervene in the process of distribution and return;
• we can gain access through newsgroups to minorities who are hard to reach by other means.
There are, though, some major drawbacks:
• there is a sampling bias towards affluent, well educated, young, white, male citizens of First World countries - though this will become less acute as more people gain access to the internet;
• we need programming skills, or a programmer, to design the questionnaire;
• respondents need familiarity with and access to a computer equipped with the necessary software;
• respondents may lack confidence in the security of data sent over the internet and stored on a remote server or computer;
• anonymity cannot convincingly be guaranteed.
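To give a flavour of what pre-programming a questionnaire involves, here is a minimal sketch of automatic question skips and of a prompt when several boxes are ticked where only one is allowed. Everything in it is hypothetical: the three questions, their identifiers, the routing rule and the function names are invented for illustration and do not describe any particular survey package.

```python
# Hypothetical three-question instrument; ids, wording and routing are invented.
QUESTIONS = {
    "q1": {"text": "Do you own a bicycle?",
           "options": ["yes", "no"], "multiple": False},
    "q2": {"text": "How often do you cycle to work?",
           "options": ["daily", "weekly", "rarely"], "multiple": False},
    "q3": {"text": "Which of these do you use to travel to work?",
           "options": ["car", "bus", "train", "walk"], "multiple": True},
}


def next_question(current_id, answer):
    """Return the id of the next relevant question, or None at the end."""
    if current_id == "q1":
        # Question skip: respondents without a bicycle never see q2.
        return "q2" if answer == ["yes"] else "q3"
    if current_id == "q2":
        return "q3"
    return None


def validate(question_id, answer):
    """Return a prompt for the respondent, or None if the answer is acceptable."""
    question = QUESTIONS[question_id]
    if not answer:
        return "Please choose an answer."
    if any(choice not in question["options"] for choice in answer):
        return "Please choose from the options shown."
    if len(answer) > 1 and not question["multiple"]:
        return "Please tick one box only."
    return None


def run(responses):
    """Follow the programmed sequence, recording only the questions actually asked."""
    record, current = {}, "q1"
    while current is not None:
        answer = responses[current]
        problem = validate(current, answer)
        if problem:
            raise ValueError(f"{current}: {problem}")
        record[current] = answer
        current = next_question(current, answer)
    return record


# A respondent without a bicycle is routed straight from q1 to q3.
completed = run({"q1": ["no"], "q2": ["daily"], "q3": ["bus", "walk"]})
```

Because the answers are already held in a data structure, there is no separate stage of inputting data. A real instrument would, of course, need far more than this, which is one reason programming skills appear among the drawbacks.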
Diaries
Asking respondents to keep a daily record of their actions can be a useful part of a social survey. Potentially, it enables us to gather a large amount of data, much of which would be hard to obtain by other means. It can stand in for observation where observation is not possible. We point out in Chapter 6 (see Box 6.6) that one of the most difficult problems facing the survey researcher is asking about periodic behaviour. Suppose we are interested in respondents' cinema-going. While a minority of them may have extremely regular behaviour - they go to the cinema every Saturday, as part of a regular night out - most respondents will not be like this. Their cinema attendance will be far more variable and hard to summarize. One answer to the problem is to ask respondents to keep a diary of their activities day by day. We shall not have to depend on the respondents' fallible memory. Nor will we face the social desirability problem of respondents over-reporting socially approved behaviour and under-reporting socially stigmatized activities. A diary will provide us with a reliable record of their actual behaviour. Or will it? Diaries are not a panacea for social desirability effects. Not only may respondents misreport their behaviour, they may alter their behaviour to conform with social norms. Keeping a diary makes people more self-conscious, and so may itself affect their actions.
It may be tempting to think of the diary as a low cost way of gathering data. The respondent does all the work and we reap the benefit. Of course it is not as simple as this. Diaries are difficult to design and to analyse. In survey research the diary is a form of self-completion questionnaire, one renewed every day. We should design it as such. Respondents need to know what to record and when to do so. As with questionnaires and interview schedules, we have to be selective, and as helpful as possible in explaining to the respondent what we are asking them to do and why. We need to be clear about the use to which the data will be put, including the issue of anonymity or confidentiality and any feedback we intend to give on the results of our analysis. We should contact the respondents in person or by telephone during the course of their record-keeping, to thank them for their participation, answer any queries, and encourage them to carry on. We should also, ideally, arrange to call to collect the diary in person. Diaries are, then, labour-intensive for the researcher too. Sometimes, to keep the data manageable, the researcher asks only a subsample of respondents to keep a diary. One problem here is response bias: perhaps certain kinds of respondent will be more willing to do so than others.
Choosing a method of gathering data
In surveys conducted by a solo researcher or a small team, practical considerations often limit the options open to us. If a large sample is required, or if respondents are geographically scattered, face-to-face interviews are normally impossible because they consume too much time and money. If we need to ask a lot of questions, and if the format is complex, with multiple question skips, then a self-completion questionnaire is unsuitable, unless it can be distributed electronically. The more questions there are, the more a face-to-face interview becomes appropriate. If we need to ask a lot of open questions, face-to-face interviews are to be preferred (see Boxes 3.1, 3.2 and 3.3).
Combining methods of data gathering
The methods of data-gathering we have been discussing are not mutually exclusive. It is often possible to combine them, with beneficial results.
Using a questionnaire to generate follow-up interviews
The rationale is that the questionnaire will provide basic information about the sample from which generalizations can be made to the whole population. Interviewing is a tool which will allow us to probe more deeply into the respondents' feelings, attitudes, orientations, hopes and fears. Interviews yield rich evidence that complements the generalizable but thin data from a questionnaire. If the questionnaire has produced unexpected or puzzling findings, we can explore them in depth through interviews.
Using a questionnaire to identify a subset of respondents for interview can raise problems over anonymity of findings. If the questionnaire is anonymous, how can we maintain our guarantee while identifying people who are willing to be interviewed? There are two possibilities. First, we can breach the anonymity while offering reassurances. At the end of the questionnaire, after we have thanked them for their participation, we can include a brief statement that we intend to conduct follow-up interviews, and ask whether they would be willing to participate. If so, we will need to ask for a name and address or telephone number. Although this is relatively straightforward, it obviously means that our questionnaire is no longer anonymous. A variation of this is illustrated by the Travel Survey. As an incentive to complete the questionnaire, we offered respondents the opportunity to enter a prize draw for a bicycle, which we thought an appropriate reward! The questionnaire had a tear-off strip, on which respondents were asked to give their name and details of how we could contact them if they had won. We reassured them that the slip would be detached from the questionnaire as soon as we had received it, and that all data would remain confidential. If we decide that the questionnaire must remain anonymous, we shall have to supply respondents with two envelopes, one for their questionnaire and one for their personal details.
If the questionnaire is not anonymous, we are able to tie up each interview with the corresponding questionnaire. Should we do so? Should we refer back, in the interview, to the responses made on the questionnaire? At times, this can be fruitful, but care must be taken. We risk provoking socially desirable responses. If we continually remind respondents what they said before, they may adjust their replies to be consistent with that.
Using interviews, focus groups or diaries to suggest items for a questionnaire
Here the interviews, focus groups or diaries are being used less for their own sake than to help us formulate salient, meaningful questions for use in the questionnaire.
Triangulation
In the literature on research methods, triangulation refers to the use of a variety of research strategies, or of data from a variety of sources, to test an hypothesis. The term triangulation comes from surveying. We calculate the position of an object, C, by taking bearings on it from two positions, A and B. If we measure the distance between A and B, we know the length of one side of the triangle defined by points A, B, and C. We use an instrument such as a theodolite to measure the angle at apexes A and B. From this we can calculate the exact location of C. The point is, we need to make two or more independent measurements to do so.
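For readers curious about the surveyor's calculation behind the metaphor, the sketch below works it through using the law of sines. It is an illustration only: the function name, the choice of coordinates (A at the origin, B along the horizontal axis) and the example measurements are our own assumptions.

```python
import math


def locate_c(baseline, angle_a_deg, angle_b_deg):
    """Locate point C from the baseline AB and the interior angles at A and B.

    Assumes A is at (0, 0) and B is at (baseline, 0); angles are in degrees.
    """
    angle_a = math.radians(angle_a_deg)
    angle_b = math.radians(angle_b_deg)
    angle_c = math.pi - angle_a - angle_b  # angles of a triangle sum to 180 degrees
    if angle_c <= 0:
        raise ValueError("the two sight lines never meet")
    # Law of sines: AC / sin(B) = AB / sin(C)
    distance_ac = baseline * math.sin(angle_b) / math.sin(angle_c)
    return distance_ac * math.cos(angle_a), distance_ac * math.sin(angle_a)


# Hypothetical measurements: A and B are 100 metres apart and
# the object is sighted at 45 degrees from each end of the baseline.
x, y = locate_c(100, 45, 45)
print(round(x, 2), round(y, 2))
```

Neither angle fixes C on its own. Each gives only a line along which C must lie, and it is the second, independent measurement that pins the point down.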
But does this analogy stand up when dealing with social data? If we regard a questionnaire as a measurement from point A, and an interview as a measurement from point B, can we now determine point C - namely, data about a respondent? Are social data like that? If we ask a respondent about her opinions on field sports in an interview as well as a questionnaire, have we determined her opinions as definitively as we can calculate the location of an object in the landscape? Perhaps not. One facet of social data, like quantum mechanics, is that the act of measurement affects the thing being measured. The accounts that people give in questionnaires, interviews, focus groups or diaries are just that: accounts. Whether it is possible to construct one objective, definitive statement out of these varying accounts is a contested philosophical question. Returning to the practical matters, we may simply conclude that an open-minded use of a variety of methods will do no harm, and will tend to enrich our understanding of the social world.
Further reading
Devine and Heath (1999) Sociological Research Methods in Context is particularly useful in discussing the ways in which different research methods can be combined. Hornsby-Smith's chapter in Gilbert's (1993a) collection, Researching Social Life, provides a good account of the problems of access. Czaja and Blair (1995) Designing Surveys: A Guide to Decisions and Procedures is readable and wide ranging.
Selecting samples
Introduction
Sampling is the process of choosing in a systematic fashion a sub-set of cases from which data will be collected from the pool of all those potentially relevant to the research being conducted. The sub-set selected is the sample, the pool is the target population. This terminology is used whatever the cases in question are - they will often be human individuals, but other possibilities in social science research include collectivities (households within a defined area, the stores within a retail chain), relationships (couples in the process of divorcing, doctors with patients who have a particular condition), events (inmate releases from prisons, patient 'episodes' in hospitals), or slices of space-time (urban intersections monitored over a period of time for possible accidents).
The need to make any type of selection always reflects researchers' limited resources. In an ideal world, data could be collected from every case in a target population (a situation sometimes referred to as complete enumeration). This is the objective, though never the achieved result, in some censuses conducted by nation states into the condition of their human populations. One hundred percent coverage of small target populations may be practical, but time constraints and finite budgets frequently render complete enumeration of large target populations out of the question. The researcher is then obliged to introduce some sort of selection of the cases to be included within the study.
There is a fundamental choice to be made between the two major types of selection procedure. If data from the selected cases are to be used as the basis for generalizations about an entire target population then probability (or random) methods of sampling should be employed using the principles set out in the section devoted to this, below (page 62). If, on the other hand, data from the selected cases can stand in their own right and there is no requirement to generalize from them, the procedures set out in the section on non-probability sampling (page 79) will be adequate.
Whatever selection procedures are adopted, they need to be consistent with the overall project research design and should be developed in conjunction with the latter. Specifically, where the research design requires a comparison to be made between particular groups or time periods, then adequate quantities of cases with the appropriate attributes must be made available for the analysis stage by any selection procedures adopted.
It is wise not to lose sight of the fact that even the most sophisticated sample design represents no more than an attempt to reach a rational compromise between rigour on the one hand and an effective application of time and money on the other. A further point is that even when 'scientific' sampling procedures are used under optimum conditions, all resulting generalizations derived from sample data are inevitably subject to a degree of error. The key advantage that probability sampling possesses over the alternative selection procedures is that it allows the likely size of this error to be calculated.
Sampling in a systematic fashion rests on relatively straightforward principles whose application in simple surveys is usually unproblematic. However, where a research problem seems to necessitate the construction of a complex sample design, the implications for the data analysis stage need to be checked out in advance with a statistician familiar with social surveys.
Theoretical populations
A consideration which is sometimes glossed over in methods texts but which needs to be given early attention in any but the purely descriptive survey concerns the definition of the target population. If a research project sets out to test a theoretical hypothesis or even, more modestly, to apply and explore theoretical concepts, there is then a need to consider what kinds of target population are relevant to the particular hypothesis or concepts. To ensure the adequate exploration of a theory, the empirical target population selected by the researcher needs to be included within the theoretical population, the usually infinite domain of empirical populations which any general theory addresses. This is essentially a conceptual consideration which needs to be dealt with at the research design stage.
Box 4.1
Ensuring a 'theoretically relevant' target population

In Delinquent Boys: the Culture of the Gang (1955), Cohen argued that gang delinquency was a response to the problems encountered by (largely) working class adolescents adjusting to a system of status evaluation operating in American society through which it was impossible for them to earn self-respect. The delinquent subculture represents an alternative status system built on an inversion of key middle class values, particularly respect for legitimately acquired possessions. The 'cavalier misappropriation or destruction of property' (Cohen 1955:134) that it is argued is characteristic of much juvenile delinquency is interpreted as a rejection of middle class acquisition through diligence, self-discipline and sobriety and a celebration of their opposites.
A piece of research setting out to explore Cohen's ideas would need to begin by identifying a target population that is theoretically relevant to Cohen's thesis. A researcher based in Britain might be tempted to employ a target population of juveniles convicted for the offence of criminal damage and then move on to analyse the perpetrators' membership of peer groups and their class and school backgrounds. However, consideration would have to be given to whether the legal definition of criminal damage in the UK is sufficiently close to the 'cavalier misappropriation or destruction of property' for this population to be appropriate. An alternative approach might be to use the records of schools from a Local Education Authority area or areas to identify a target population of pupils who had committed acts of so-called 'mindless vandalism' in connection with school premises and property. In either case, an additional issue is whether Cohen's theory was formulated in intrinsically culture-specific terms so that any fair test would have to be on an American target population.
In general, the cases making up a target population should demonstrably possess the characteristics appropriate to the theory under scrutiny.
The discussion in Box 4.1 raises the general issue of the extent to which it is possible to test a theoretical hypothesis or explore theoretical concepts in a survey (or other research) which is not specifically designed for the task. Sometimes a theoretical framework can give rise to broad predictions or have corollaries that are sufficiently general to be confirmed or disconfirmed by the findings from general purpose survey or similar research. It could be the case in the example from Box 4.1 that data on the distribution of juvenile property crime across classes derived from existing official crime statistics are found to be at odds with Cohen's propositions. However, it is also possible that the outcome of this form of empirical test will be uncertain and contested because of (say) debate over whether the definitions of 'class' and 'property crime' that underpin the data are congruent with those offered by the theory. Specifically designed research generally offers the best chance of a decisive and rigorous test of a theory.
Probability sampling strategies The m a i n task i n the remainder o f this chapter is t o o u t l i n e the variety o f strategies available f o r selecting a sample f r o m the target p o p u l a t i o n . P r o b a b i l i t y strategies w i l l be dealt w i t h first, f o l l o w e d by n o n - p r o b a b i l i t y strategies. P r o b a b i l i t y or r a n d o m s a m p l i n g is an integral feature o f m o d e r n 'scientific' survey research (indeed, scientific sampling was an alternative name f o r i t , t h o u g h its use has n o w w a n e d ) . P r o b a b i l i t y s a m p l i n g is designed m a i n l y t o assist the accurate estimation o f the values of characteristics o f p o p u l a t i o n s ( p o p u l a t i o n parameters) based o n data obtained f r o m a sample. Examples o f t y p i c a l parameters i n w h i c h social researchers m i g h t be interested i n r e l a t i o n t o p a r t i c u l a r p o p u l a t i o n s are the p r o p o r t i o n o f households t h a t o w n a m o t o r vehicle, a n d the average gross w e e k l y income o f economically active i n d i viduals. T h e t e r m ' p r o b a b i l i t y s a m p l i n g ' is a reference t o the a d o p t i o n o f selection procedures w h i c h a l l o w the use o f inferences derived f r o m the m a t h ematical theory o f p r o b a b i l i t y . T h e l i n k w i t h this theory gives a l l p r o b a b i l i t y selection strategies some u n i q u e features n o t enjoyed by any other ways o f choosing the cases t o supply data, t h o u g h this is n o t t o say i t is universally appropriate o r superior. Its special features are as f o l l o w s : • Probability selection cannot offer any cast-iron guarantee t h a t a particular sample selected w i l l be 'representative' of the m i x of cases i n the target p o p u l a t i o n . 
Instead, i t offers the researcher the possibility o f calculating the level o f likely sampling e r r o r associated w i t h an estimate o f a p o p u l a t i o n value. Sampling error, w h i c h is discussed further b e l o w (page 76), can be t h o u g h t o f as the v a r i a b i l i t y between every logically possible potential sample o f a given size a n d type: the researcher (normally) selects just one.
• If the researcher can specify a desired level of accuracy for estimates of a key population value, the minimum necessary sample size to achieve it can be calculated. In other words, if the key population value the researcher wishes to know is (for example) the mean size of households within plus or minus 1 person, it is possible to work out how large the sample must be to deliver an estimate of this precision.
• It is a necessary condition for the use of a wide variety of statistical tests and measures at the data analysis stage.

Probability sampling requires the researcher to organize a lottery to determine which cases in the target population will make up the sample. There are, in fact, a few elaborations and qualifications on this but the essence of probability sampling is that it is designed to rule out the handpicking of individual cases. 'Lottery' may well conjure up in the reader's mind images of flashing lights and a mechanical apparatus with revolving drums containing numbered celluloid balls. However, its use in connection with surveys is partly metaphorical: it indicates that the logical selection conditions present in a fair lottery must be simulated. No equipment is required and only a very limited form of lottery actually takes place.

As it applies in probability sampling, the selection lottery has to meet two general requirements. The first, and the more binding, states that every case in the target population should have a calculable and finite chance of inclusion in the sample. This implies that no case in the target population should be completely excluded and no case can be guaranteed inclusion in the sample in advance of selection. 'Finite' means here more than zero but less than 1: any case excluded from the outset would have a zero probability of inclusion in the sample - because an event that cannot happen has a probability of zero - while a case with a pre-set place in the sample would have a probability of 1, which in probability theory is the value associated with inevitable events. Surprise is sometimes expressed that the first principle is not more demanding. It does not state, as intuition expects, that every case be given an equal chance in the lottery. Why this is unnecessary is covered below, in the section on stratified random sampling. (Sample designs in which every case in the population has an equal chance of inclusion in the sample are, in fact, a special case and known as epsem samples as an abbreviation of the phrase 'equal probability selection method'.)

The second principle requires that the selection of any case or group of cases takes place independently of the selection of any other individual case or group of cases. Drawing case number 128 in the target population for inclusion in the sample should have no bearing on the chances of number 252 (or any other case) getting selected subsequently. This requirement is less stringent than the first and probability theory as it applies to social surveys is sufficiently robust to allow it to be cautiously violated in tried and tested ways.
Two components are required to implement the selection lottery. The first is a sampling frame, which is a listing of all the cases in the relevant target population. Sometimes suitable listings are available ready-compiled; in other instances the compilation must be carried out by the researcher. One example of a pre-compiled sampling frame which has often been used to sample the 'general public' in the UK is the Electoral Register, which is a list of eligible voters listed by house number within each street and ward of a parliamentary constituency. It is prepared by local authority officials from returns submitted by householders and is held in each town hall. A second example is the PAF (Postcode Address File) which lists all the addresses in the United Kingdom by postcode and is produced by the Post Office. Since this is available in digital format, it has the advantage of being computer-searchable. Special variants of the PAF cover private households and institutions separately (although more detailed listings of business enterprises, specifying characteristics such as location, number of employees and sector of activity, can be purchased from business information brokers like Dun & Bradstreet Ltd). Other pre-compiled but more specialized sampling frames include the lists of registered members controlled by professional bodies, trade unions and enthusiast groups of various kinds.

The physical format of the sampling frame listing is incidental. In many cases it can be a 'virtual' list (thus the numbers 1 to 10 could stand for the ten administrative divisions of a state or region when listed in alphabetical order). The key requirements for any sampling frame are that it is comprehensive, accurate and up to date. However, despite the apparently straightforward character of a sampling frame, it needs to be emphasized that the task of creating one from scratch can be considerable for particular target populations: consider, for example, the problem of getting a national sample of discharged bankrupts or manufacturing organizations with exceptionally high degrees of workforce absenteeism. Sometimes the obstacles will be insuperable, in which case the research will have to be re-cast to employ a different target population, or it will have to be re-designed using a non-survey methodology.

The second requirement for implementing the lottery is a procedure for drawing the cases from the sampling frame. The procedure which corresponds most closely to probability theory is to use tables of random numbers. These are frequently included in the back of statistics texts and are simply collections of random digits where any digit is as likely to occur as any other and all combinations of digits are also equally likely. Computers can also generate random numbers for these purposes and most statistical packages include a convenient facility that allows you to specify a range within which you require the numbers to fall.
The cases in the listing need to be numbered consecutively from 1 up to the total in the target population. Box 4.2 explains how to use printed random number tables.

Box 4.2
Using random number tables
Tables of random numbers are often presented as blocks of digits with intervening gaps to assist identification, but there is no significance to the number of digits included in the rows and columns. If the desired size of sample is (say) 550 cases, the task will be to select 550 instances of three consecutive digits between 001 and 550 inclusive from a numbered version of the sampling frame list. (You might actually need to select some 'spare' cases to cover contingencies such as refusals, failure to contact, and errors in the sampling frame.) The consecutive digits do not all have to be within a block of numbers. Note that the numbers assigned to each case in the sampling frame should actually or notionally be padded with leading zeros because within the random number tables, case 6 will be represented by the combination 006. So, in the example above, a three digit combination 044 in the table would be within range and the case assigned this number would be selected for the sample, but the combination 611 would be ignored as out of range and you would move on to the next set of three digits. You should nominate a starting position randomly (that is, without knowing what the first combinations of digits are). Then you can work through the random 'numbers' backwards or forwards by page, up or down blocks, including adjacent numbers or leaving gaps in any way that takes your fancy (provided you proceed consistently and accept and reject mechanically).

Systematic selection is an alternative way to draw cases for the sample. If you require a sample of 500 cases from a population of 10,000, you would with this method simply select every twentieth case from the sampling frame. (The gap between selected cases is termed the sampling interval, and the corresponding sampling fraction is 1 in k: if the size of the population is represented by N and the size of the sample is represented by n, then the sampling interval k = N/n. In the example cited, k = 10000/500 = 20.) There are some additional considerations that surround the use of systematic selection. Your initial case should be selected at random (by, for example, using the random number generator of a computer statistics package with the range set to the sampling interval). If your sample is a large one (say, over 1000 cases), you should occasionally stop and make a new start with a case randomly chosen in the same manner as the first case. The reason for this is that a sampling frame could embody a concealed period or cycle within the listing of cases which coincides with your sampling interval. Suppose, for example, you are conducting a survey of residents on a housing estate and are sampling addresses. Unknown to you, every tenth address is associated with a corner plot containing a much larger house than the others. A sampling interval of ten could consistently catch every corner
house and the resulting sample would significantly exaggerate the average income and overall affluence of households on the estate. A random starting point and periodic re-starts are designed to restrict the extent of any synchronization between the interval and the list. More generally, whenever a pre-compiled sampling frame is employed, it is important to know what principle has governed the ordering of cases on the list. Alphabetical order is usually 'neutral' for most research purposes (that is, unlikely to be connected with the key variables in a study), but the implications of chronological, geographical and 'exotic' ordering criteria need to be carefully examined.

If random re-starts are not used, systematic selection breaches the second (independence) principle of the selection lottery in that the choice of the first case effectively determines the identity of all the rest of the cases that make up a particular sample. However, it is often the most convenient procedure to implement and provided that the sampling frame is ordered on a neutral principle and periodic re-starts are employed, selection of cases by sampling interval approximates reasonably closely to the rigour of full random selection.

In the sub-sections that follow, particular probability sample designs and refinements are examined in sufficient detail to assist the reader to recognize the appropriateness of each variant to different kinds of research situation. Details of more comprehensive treatments of sample design are given in the further reading section at the end of the chapter.
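The systematic selection procedure described above can be sketched in a few lines of Python. This is an illustration rather than a recommended implementation: the function name `systematic_sample` and its `restart_after` option are hypothetical, the frames are simply lists of case numbers, and the fixed seed is there only to make the sketch repeatable. It shows the interval k = N/n, the random starting point, and the way a list with a hidden cycle of ten (the corner houses) interacts with an interval of ten.

```python
import random

def systematic_sample(frame, n, rng, restart_after=None):
    """Select every k-th case (k = N // n) from a random starting point.

    restart_after is a hypothetical option: the number of selections made
    before a fresh random start is drawn (the periodic re-start).
    """
    N = len(frame)
    k = N // n                                   # sampling interval
    block = N if restart_after is None else restart_after * k
    chosen = []
    for block_start in range(0, N, block):
        start = rng.randrange(k)                 # random position within the interval
        chosen.extend(frame[block_start + start:block_start + block:k])
    return chosen

rng = random.Random(2001)                        # seeded only for repeatability

# 500 cases from a numbered frame of 10,000: interval k = 20
sample = systematic_sample(list(range(1, 10001)), 500, rng)

# Housing estate: every tenth address is a large corner house
addresses = list(range(1, 1001))

def is_corner(address):
    return address % 10 == 0

plain = systematic_sample(addresses, 100, rng)   # interval 10, no re-starts
restarted = systematic_sample(addresses, 100, rng, restart_after=10)

# Without re-starts the sample holds either no corner houses or nothing else
corner_share_plain = sum(map(is_corner, plain)) / len(plain)
corner_share_restarted = sum(map(is_corner, restarted)) / len(restarted)
```

With re-starts the corner-house share can take intermediate values, because each block of ten selections gets its own starting point; it does not guarantee a share close to the true one in ten.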
Simple random sampling (SRS)

In this design, the cases that will make up the sample are chosen in a single process of selection from the sampling frame that covers the entire target population. If the cases are numbered from 1 to N in advance, selection can be based on random number tables. The question of how to determine the appropriate sample size is dealt with on page 77.
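Where a computer is used instead of printed tables, a simple random sample can be drawn as in the following minimal sketch. It borrows the figures from the utility company example in Box 4.3 (16,400 customers, a sample of 2000); the variable names are illustrative and the seed is fixed only so that the result can be reproduced.

```python
import random

rng = random.Random(2001)              # fixed seed: for repeatability only

N, n = 16400, 2000                     # population and sample sizes from Box 4.3
case_numbers = range(1, N + 1)         # cases numbered consecutively from 1 to N

# Draw n distinct case numbers; every case has the same chance of inclusion
sample_ids = sorted(rng.sample(case_numbers, n))

# Zero-padded labels, matching the 00001 ... 16400 numbering in the box
labels = [f"{case:05d}" for case in sample_ids]
```

`random.sample` selects without replacement, so no customer can appear twice in the sample.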
Box 4.3
A simple random sample of customers
A utility company decides to survey its 16,400 private customers in one of its operating regions to establish the effectiveness of its customer relations. It is decided that an SRS sample of roughly 2000 will be adequate. The comprehensive list of customer names for the region is extracted from the computer database used for billing and the names are transferred to a spreadsheet and allocated consecutive numbers starting from 00001 and ending at 16400. A random sample of 2000 numbers falling in the range between 1 and 16400 inclusive is generated by the spreadsheet and the corresponding customers are selected. This sample of customers will provide a systematic basis for making generalizations about all the customers in the region. However, there will inevitably be variation between the very large number of different samples of 2000 cases that could be generated by this selection procedure. As a result, generalizations based on the sample data will be subject to a probable degree of error that can be calculated. Conducting an SRS of all the customers offers no guarantee, as was made clear on page 64, that the resulting sample will be representative of particular types that exist in the target population. It is unlikely to contain (for instance) exactly the same proportion of female customers, new customers, small volume users, or late payers as there are in the region. If it is important to ensure that the sample reflects the proportions of cases with key characteristics in the target population, or that it contains a minimum number of such cases, refinements to the sample design are needed. An additional point with large target populations is that paper listings will be bulky and difficult to handle, so some form of 'virtual' computer listing will usually be preferable.

Stratified random sampling

Stratification is an important refinement of the selection process which, operating under suitable conditions, can improve upon the effectiveness of SRS sampling. It requires the identification of a 'stratifying variable' (occasionally several variables) on the basis of whose values the target population can be divided into distinctive 'strata' or groupings. A separate random sample is then taken from each stratum. Normally, it has to be possible to determine the value of every case in the target population on the stratifying variable(s) before selecting the sample. Thus, stratification requires the researcher to acquire some information about the target population under study in advance of data collection (for instance, from previous research) and it exploits this additional information to build a sample that mirrors the composition of the target population on the chosen characteristic(s).

In choosing a stratifying variable, the essential requirement is that cases in each 'stratum' (that is, with the same value of the stratifying variable) should have similar relationships with the dependent variable(s) of interest in the study, but that as much variability as possible should exist between strata in respect of the dependent variable(s). The gains from introducing stratification increase to the extent that this requirement is met. Consider an example. In an organizational study of job satisfaction (the dependent variable), job grade would be an appropriate stratifying variable provided that membership of different job grades significantly affected levels of job satisfaction. (There might be evidence on this in the literature.) If the values of job grade were 'Management', 'Technical and supervisory staff' and 'Shop floor staff', the target population would in this case be divided into
these three strata. Lists of the members in the groupings could be obtained from the Personnel Department and would constitute three sampling frames from each of which cases would be chosen in turn by a random procedure. Some variation between the levels of job satisfaction of individuals within the shop floor stratum and within the other strata would, of course, remain - if there was none left at all, there would be little for the research to reveal! The requirement for stratification to work beneficially is simply that there should be less within-stratum variation than between-stratum variation.

The next matter that needs to be settled is what sampling fraction should be used for each stratum. In the simplest case, where the design is referred to as proportionate stratified sampling (PSS), a uniform fraction is used for all the strata. This results in a sample which is a mirror of the target population with respect to the stratifying variable. Each grouping of the stratifying variable constitutes the same proportion of the sample as it does of the population. The alternative, disproportionate stratified sampling (DSS), varies the sampling fraction for different strata. There are several reasons for doing this. One is simply to increase the representation in the final sample of small strata which otherwise might contribute only a handful of cases from which no sound inferences could be drawn. Another reason is to deliberately divert extra research resources to strata known in advance to have highly variable relationships with respect to the key dependent variable, at the expense of strata known in advance to have relatively homogeneous relationships with this variable. In the research on job satisfaction, for example, it might be known from a previous survey that technical and supervisory staff were, as a group, characterized by especially variable and fluctuating levels of job satisfaction in comparison to shop floor staff where the levels were more consistent between individuals and over time. Directing extra sample cases to the most variable stratum in a sample design from less variable strata can be an extremely effective way of reducing overall sampling error.

The DSS sample in column (4) of Table 4.1 now provides a better basis for making generalizations about Management and Technical and supervisory staff than the SRS design in column (2) or the PSS design in column (3) without using a larger sample. As it stands, however, the composition of the DSS sample in terms of strata might now seem to be in danger of giving a distorted picture of the target population as a whole. The remedy is to give each case in the sample a numerical weighting (based on the sampling fraction) which returns both over-sampled strata and under-sampled strata to their true proportions in the population: this weighting can be used throughout the data analysis stage. This capacity to re-weight data explains why the lottery principles do not need to insist on giving all the cases (and groups of cases) in the target population equal chances of inclusion in the sample. The choice between feeding weighted or unweighted data into analyses is an option provided in most computer survey packages.
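The re-weighting step can be illustrated with a short Python sketch that uses the stratum sizes and the DSS allocation from Table 4.1. The weight given to each case is the number of population cases it stands for, that is, the stratum size divided by the number sampled from it. The mean satisfaction scores are invented purely for illustration; they are not taken from any study.

```python
# Stratum sizes and DSS sample sizes from Table 4.1
population = {"Management": 250, "Technical": 500, "Shop floor": 4250}
dss_sample = {"Management": 30, "Technical": 40, "Shop floor": 30}

# Weight per sampled case = population cases represented by that case
weights = {s: population[s] / dss_sample[s] for s in population}

# Hypothetical mean job satisfaction score observed in each stratum
sample_means = {"Management": 7.0, "Technical": 5.0, "Shop floor": 6.0}

# Unweighted mean: treats the over-sampled strata as if they were typical
n_total = sum(dss_sample.values())
unweighted = sum(dss_sample[s] * sample_means[s] for s in population) / n_total

# Weighted mean: each stratum is returned to its true share of the population
weighted_total = sum(weights[s] * dss_sample[s] * sample_means[s] for s in population)
weighted = weighted_total / sum(weights[s] * dss_sample[s] for s in population)
```

In this invented case the unweighted mean is 5.90 and the weighted mean 5.95. The gap is small only because the assumed stratum means are close together; wider differences between strata would produce a larger distortion in the unweighted figure.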
Box 4.4
A comparison of SRS, PSS and DSS
This box explores sampling designs for the organizational study of job satisfaction introduced on page 69. A study of job satisfaction seeks to conduct interviews with a cross-section of all the personnel in an organization with the available resources keeping the limit to 100 interviews. Column (1) in Table 4.1 below shows the size of each stratum and the target population. Column (2) indicates the situation using a simple random sample (SRS) with a sampling fraction of 1 in 50. The question marks indicate that the strata play no role in the SRS selection process and indicate the uncertainty over exactly how many cases there will actually be in the sample from each category. Column (3) shows the results of using a proportionate stratified sample (PSS) with the same 1 in 50 sampling fraction as column (2) but selected from each stratum separately. One obvious problem with adopting a uniform sampling fraction in this instance is that it would provide too few cases for study from the smaller strata. Column (4) shows the results from a disproportionate stratified sample (DSS) that varies the sampling fraction for the different strata. The objective in this instance would be partly to increase the numbers in the final sample from both the management and technical staff strata, but also to deliberately over-sample technical staff where job satisfaction is likely to be highly variable. The shop floor staff are 'under-sampled' in comparison to their proportion in the population while the other two groups are 'over-sampled' at the expense of the former. Provided the assumptions about variations in job satisfaction are correct, the sampling error for (4) should be less than that for either (2) or (3) despite the sample size remaining constant.
Table 4.1  Comparison of SRS, PSS and DSS designs

              (1) Population   (2) SRS 1:50   (3) PSS 1:50   (4) DSS
Stratum       N                n              n              proportion    n
Management     250             ?                5            1:8.3          30
Technical      500             ?               10            1:12.5         40
Shop floor    4250             ?               85            1:141.6        30
Total         5000             100            100            1:50          100
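The closing claim of Box 4.4, that the DSS design should have the smallest sampling error provided the assumptions about variability hold, can be checked with a rough calculation. The Python sketch below uses the standard textbook approximations for the standard error of a mean under SRS and under stratified sampling, ignoring the finite population correction. The stratum means and standard deviations are hypothetical values chosen only to represent a stable shop floor and a volatile technical grade; the second set of values shows what happens when that assumption fails.

```python
from math import sqrt

population = {"Management": 250, "Technical": 500, "Shop floor": 4250}
designs = {
    "PSS": {"Management": 5, "Technical": 10, "Shop floor": 85},
    "DSS": {"Management": 30, "Technical": 40, "Shop floor": 30},
}
N = sum(population.values())
share = {s: size / N for s, size in population.items()}   # stratum shares W_h

def stratified_se(n_h, sd_h):
    """Standard error of the stratified mean: sqrt(sum W_h^2 * S_h^2 / n_h)."""
    return sqrt(sum(share[s] ** 2 * sd_h[s] ** 2 / n_h[s] for s in population))

def srs_se(n, mean_h, sd_h):
    """Standard error under SRS, from within- plus between-stratum variance."""
    overall = sum(share[s] * mean_h[s] for s in population)
    variance = sum(share[s] * (sd_h[s] ** 2 + (mean_h[s] - overall) ** 2)
                   for s in population)
    return sqrt(variance / n)

# Hypothetical stratum means and standard deviations of a satisfaction score
mean_h = {"Management": 7.0, "Technical": 5.0, "Shop floor": 6.0}
sd_h = {"Management": 2.0, "Technical": 3.0, "Shop floor": 0.5}

se_srs = srs_se(100, mean_h, sd_h)
se_pss = stratified_se(designs["PSS"], sd_h)
se_dss = stratified_se(designs["DSS"], sd_h)

# If shop floor satisfaction is in fact more variable, the ranking changes
sd_alt = {**sd_h, "Shop floor": 1.0}
se_pss_alt = stratified_se(designs["PSS"], sd_alt)
se_dss_alt = stratified_se(designs["DSS"], sd_alt)
```

Under the first set of assumptions the DSS design gives the smallest standard error, followed by PSS and then SRS. Doubling the assumed shop floor standard deviation is enough to make the DSS design worse than PSS, which is why the proviso in the box matters.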
Part of the importance of stratification in probability sampling is that it is the principal means by which the researcher can engineer what is popularly thought of as a 'representative' sample. Unless there is stratification by gender, household size, or average rent etc., then there is no control in probability sampling of the manner in which specific characteristics and attributes of cases appear in the sample. It follows that for practical reasons representativeness is something that can only be achieved for a small number of specific characteristics and not globally for the sample as a whole.
Box 4.5
Combining stratification and systematic selection
A straightforward way to implement stratification in conjunction with systematic selection is to sort the cases in the sampling frame into order by ascending or descending value of the stratifying variable (or variables). The table below shows an extract from a sampling frame adapted from the UK Post Office's Postcode Address File.

Sampling unit            % owner occupiers
Region 1
  Postal sector 1356     64
  Postal sector 1456     60
  Postal sector 1567     56
  etc . . .
Region 2
  Postal sector 2345     48
  Postal sector 2456     47
  Postal sector 2567     47
  etc . . .

Postal sectors are areas made up of adjacent postal codes. In order to achieve national coverage, this sample is stratified by region. In order to achieve a range of socio-economic backgrounds, it is also stratified by percentage of owner occupier households, a figure available for small areas from census data. Provided a suitable sampling interval is chosen, selection of sectors from each region and a range of socio-economic backgrounds can be guaranteed. This box can be read in conjunction with Box 4.6.
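The sort-then-select approach of Box 4.5 can be sketched as follows. The six rows are the ones shown in the extract above, entered in a jumbled order to show the effect of sorting; the interval of three and the seed are arbitrary choices made for this illustration.

```python
import random

# (region, sampling unit, % owner occupiers): rows from the Box 4.5 extract
frame = [
    ("Region 2", "Postal sector 2567", 47),
    ("Region 1", "Postal sector 1456", 60),
    ("Region 2", "Postal sector 2345", 48),
    ("Region 1", "Postal sector 1567", 56),
    ("Region 1", "Postal sector 1356", 64),
    ("Region 2", "Postal sector 2456", 47),
]

# Sort by region, then by descending % owner occupiers (sector name breaks ties)
ordered = sorted(frame, key=lambda row: (row[0], -row[2], row[1]))

k = 3                                  # sampling interval chosen for this sketch
rng = random.Random(5)                 # seeded only for repeatability
start = rng.randrange(k)               # random start within the first interval
selected = ordered[start::k]           # every k-th sector from the sorted list
```

Because each region occupies exactly one interval of the sorted list here, any starting point yields one sector from each region. With a real frame the interval would need to be chosen with the sizes of the strata in mind.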
Multi-stage sampling

A second major refinement of SRS designs entails the possibility of conducting several stages of selection in sequence. An example will simplify the explanation. A research project may necessitate data collection from a very large, perhaps national, target population. It may well be the case that no national sampling frame exists (and even if one did, it would be extremely time-consuming to set up an SRS design in conjunction with it). It is also possible, however, that adequate sampling frames may be available for localities. Large scale populations are invariably organized into a variety of hierarchical units. In the case of a nation state like the United Kingdom, one set of hierarchical units could be administrative region, parliamentary constituency, borough, ward, street address, and household. The units are hierarchical in as much as lower units are 'nested' within higher ones: every street address 'belongs to' a ward, and every ward fits into a borough, and so on. Sampling procedures are able to make use of this arrangement by moving down the hierarchy making selections from each unit in turn until they are able to exploit existing sampling frames or it is feasible to create them.

In the example given, starting at the 'top' of the system, the primary sampling unit or PSU would be the administrative region. A sub-set of regions would be chosen using random procedures and the parliamentary constituencies they contained would be listed to make up the sampling frame for the second stage (so that the secondary sampling unit or SSU would be the constituency). Random selection would again take place and the boroughs in the chosen constituencies would be listed to form the sampling frame for the third stage, and so on. An important 'economy' in the procedure is that only selected units are passed on from each stage, reducing the size of the next sampling frame that has to be constructed. There is no theoretical limit to the number of stages in a design although each process of selection adds cumulatively to the overall sampling error. It is possible to combine multi-stage sampling with stratification and other refinements to produce sophisticated designs.
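The 'economy' of multi-stage selection, in which each new sampling frame is built only from the units chosen at the previous stage, can be seen in a small Python sketch. The hierarchy below is entirely made up (six regions, four constituencies in each, five boroughs in each constituency), as are the numbers selected at each stage.

```python
import random

rng = random.Random(7)                 # seeded only for repeatability

# Hypothetical nested hierarchy: region -> constituency -> list of boroughs
hierarchy = {
    f"Region {r}": {
        f"Constituency {r}.{c}": [f"Borough {r}.{c}.{b}" for b in range(1, 6)]
        for c in range(1, 5)
    }
    for r in range(1, 7)
}

# Stage 1: select regions (the primary sampling units)
regions = rng.sample(sorted(hierarchy), 2)

# Stage 2: list constituencies in the selected regions only, then select
stage2_frame = [(r, c) for r in regions for c in sorted(hierarchy[r])]
constituencies = rng.sample(stage2_frame, 3)

# Stage 3: list boroughs in the selected constituencies only, then select
stage3_frame = [b for r, c in constituencies for b in hierarchy[r][c]]
boroughs = rng.sample(stage3_frame, 6)
```

The second-stage frame lists 8 constituencies rather than all 24, and the third-stage frame lists 15 boroughs rather than all 120, so far less listing work is needed than a single-stage design would require.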
Box 4.6
The Family Expenditure Survey (FES) - a complex national sampling design
The FES is an annual sample survey of private households' spending and saving that has been conducted in the UK by the Office of Population Censuses and Surveys (OPCS) and the Office for National Statistics (ONS) since 1957. The achieved sample size is about 7000 and respondents complete expenditure diaries as well as taking part in interviews. The key features from a sampling viewpoint are as follows:
• the design is a two-stage, stratified and clustered, random sample;
• the sampling frame is the Post Office's small users' Postcode Address File;
• there are various exclusions including offshore islands (owing to the expense of collection), members of the US armed forces, Roman Catholic priests living in parish accommodation, and households containing members of the diplomatic service of other countries - though non-British households are not generally excluded: Northern Ireland is covered but the sampling arrangements differ from those described here;
• the PSU is a postal sector - ward-sized areas which provide the clustering element in the design: 672 sectors are selected at stage 1, 10,000 addresses are selected at stage 2;
• sectors are stratified by (1) Government Office region to give a geographical spread, (2) whether an area is officially classified as urban or not to cover urban-rural differences, (3) the proportion of owner occupiers and the proportion of renters according to the last census to ensure a socio-economic spread.
For further details, see ONS (annually) Family Spending. The URL is: http://www.statistics.gov.uk/products/
Cluster samples are a specialized a d a p t a t i o n o f multi-stage designs. A m a j o r c o m p o n e n t o f the expense o f surveys based o n h o u s e h o l d or w o r k based interviews is the t r a v e l costs i n c u r r e d c o n v e y i n g interviewers t o respondents, especially where the target p o p u l a t i o n is w i d e l y d i s t r i b u t e d geographically. A procedure f o r r e d u c i n g costs by c o n c e n t r a t i n g the data collection o p e r a t i o n is t o divide the area covered b y the target p o p u l a t i o n i n t o a n u m b e r o f clusters o f adjacent cases, perhaps circumscribed by d i s t r i c t or other geographical boundaries f o r w h i c h s a m p l i n g frames exist. The list o f clusters is t h e n sampled r a n d o m l y a n d the chosen clusters are translated i n t o the final s a m p l i n g u n i t , usually addresses. A l l the cases i n chosen clusters (or, at a m i n i m u m , a substantial p r o p o r t i o n o f t h e m ) are i n c l u d e d i n the sample. W i t h the addresses o f selected i n d i v i d u a l s or households b u n c h e d together, field w o r k e r s need t o t r a v e l t o fewer disparate locations a n d can c o n d u c t m o r e interviews per t r i p . T h e logic o f cluster s a m p l i n g means t h a t it w o r k s best w h e r e each cluster is as heterogeneous as possible. Ideally, each cluster w o u l d a p p r o a c h the diversity o f the target p o p u l a t i o n as a w h o l e a n d i n this respect desirable characteristics f o r a cluster are the opposite o f those i n s t r a t i f i c a t i o n w h e r e there is a p r e m i u m o n h o m o g e n e i t y w i t h i n a s t r a t u m . 
Clustering is an example of a refinement that is designed primarily to reduce costs in a way that minimizes the impact on sampling errors, but it will still normally be the case that sampling error is higher in a cluster sample than in a SRS of the same size.
A potential problem with multi-stage sampling is that it can lead to difficulties producing a sample of the desired size owing to potential differences in the size of the PSUs. A useful device that is commonly employed to deal with this is to arrange for the selection of PSUs with probability proportional to size (PPS). The chance of selection for each PSU is adjusted so that it is proportional to the number of cases in the target population that each PSU contains. Further details are given in Box 4.7.
Box 4.7 Implementing probability proportional to size for a multi-stage design
A convenient way to apply PPS is to construct a table for all the PSUs in the target population with the cumulative number of cases they represent. In this example, the PSUs are nine regions with the following populations:

PSU         Size         Cumulative size
Region 1    1 100 000     1 100 000
Region 2      250 000     1 350 000
Region 3      980 000     2 330 000
Region 4      190 000     2 520 000
Region 5      490 000     3 010 000
Region 6    3 600 000     6 610 000
Region 7    1 300 000     7 910 000
Region 8    1 000 000     8 910 000
Region 9    1 090 000    10 000 000
Assuming random number tables are to be used and four regions need to be selected for stage 2, the task becomes one of drawing four lots of four digits ranged between 0001 and 1000 (the final four zeros in the cumulative size column can be ignored). Any set of four digits drawn from the tables between 0001 and 0110 inclusive will select Region 1, between 0111 and 0135 inclusive will select Region 2, between 0136 and 0233 inclusive will select Region 3, etc., etc. Region 3 is represented by 98 sets of digits, as against the 49 sets available for Region 5 which has half its population. Thus the number of sets of digits that correspond to a region (and therefore the chances of any region being selected) is proportional to the size of its population. Note that care needs to be taken to ensure that the selection units and procedures employed in subsequent stages do not 'undo' the proportionality created in stage 1.
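The cumulative-table procedure in Box 4.7 can be sketched in Python, with a pseudo-random number generator standing in for random number tables. The region sizes are those given in the box; the function names are hypothetical. One assumption to note: the sketch draws with replacement, so the same region can come up more than once, and the box does not say how such repeats should be handled.

```python
import bisect
import random

# Populations of Regions 1-9, as listed in Box 4.7.
REGION_SIZES = [1_100_000, 250_000, 980_000, 190_000, 490_000,
                3_600_000, 1_300_000, 1_000_000, 1_090_000]


def cumulative_bounds(sizes, unit=10_000):
    """Upper bound of each region's range, ignoring the final four zeros."""
    bounds, total = [], 0
    for size in sizes:
        total += size
        bounds.append(total // unit)
    return bounds


def region_for_draw(draw, bounds):
    """Return the 1-based region number whose range contains `draw`."""
    return bisect.bisect_left(bounds, draw) + 1


def select_regions(bounds, k=4, seed=None):
    """Draw k numbers between 1 and the final bound and map each to a region."""
    rng = random.Random(seed)
    return [region_for_draw(rng.randint(1, bounds[-1]), bounds) for _ in range(k)]


bounds = cumulative_bounds(REGION_SIZES)
print(bounds)
print(select_regions(bounds, k=4, seed=2001))
```

Because Region 6 owns 360 of the 1000 possible draws, it is selected far more often than Region 4, which owns only 19.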
Accuracy, precision and confidence intervals
Little mention has so far been made of sample size. The question of how large a particular sample needs to be cannot always yield a simple direct answer. To be clear about the issues surrounding size, it is necessary to return again to the probability basis of sampling and, as a preliminary, to introduce
the distinction between accuracy and precision. As we have seen, random selection procedures generate samples the data from which can be used to estimate the value of selected population characteristics. It is not possible, however, to establish exactly the accuracy of an estimate, that is, how closely a specific estimate based on an executed sample coincides with the true population value. What the statistics of probability sampling do instead is to make it possible to calculate the precision of an estimate. Precision indicates how closely the estimates derived from all the samples of a given size and design that could possibly be selected from the target population cluster around the population value being predicted. Precision is measured by the family of standard error statistics, one of which exists for every individual estimator (thus, there is a standard error of the mean, a standard error of a proportion, etc.). Calculations of precision take the form of confidence intervals. The interval element is a range of values centred on the sample estimate within which the population value is predicted to fall: the confidence element refers to a level of certainty attached to the prediction, conventionally 95 per cent or 99 per cent. If the researcher wishes to be 99 per cent confident in the prediction, the range of values the interval covers will be larger (and therefore the estimate will be less precise) than if he or she settles for the 95 per cent level. Several of the textbooks cited at the end of the chapter set out the statistics for calculating confidence intervals in detail.
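As a rough illustration of how the confidence level widens the interval, the following Python sketch computes a confidence interval for a proportion. It assumes a simple random sample and the usual normal approximation, with multipliers of 1.96 and 2.576 standard errors for the 95 and 99 per cent levels; the function name and the figures passed to it are illustrative only.

```python
import math

# Standard normal multipliers for the two conventional confidence levels.
Z_VALUES = {95: 1.96, 99: 2.576}


def proportion_ci(p_hat, n, level=95):
    """Confidence interval for a sample proportion (SRS, normal approximation)."""
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = Z_VALUES[level] * standard_error
    return p_hat - margin, p_hat + margin


# Illustrative figures: 52 per cent observed in a sample of 1000.
for level in (95, 99):
    low, high = proportion_ci(0.52, 1000, level)
    print(f"{level} per cent level: {low:.3f} to {high:.3f}")
```

The 99 per cent interval is wider than the 95 per cent one for the same data, which is the sense in which greater confidence is bought at the price of a less precise estimate.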
Sample size and sampling error
It has been noted (page 64) that sampling error was a measure of the overall variability between every possible sample of a particular size and design that could be selected from a target population. The greater the sampling error associated with a sample, the lower the precision of the estimates produced from it. Increasing the sample size represents one way of reducing sampling error and improving precision, but it is rather inefficient because sampling error varies inversely with the square root of sample size (in SRS samples). In other words, in order to halve the level of sampling error, the sample size must be increased four times. Modification of the overall research design and/or refinement of the sample design (through, for example, stratification) may be preferable to conducting a large and potentially expensive sampling and data gathering operation.
It is possible to obtain a concrete indication of the scale of sampling error for a particular type of sample design. Consider a dichotomous (yes/no) variable such as 'households who have spent a holiday abroad within the last five years' which available evidence might suggest is split about 50-50 in the population, the worst case from a sampling viewpoint. The sampling error for this attribute would be just over ±3 per cent in SRS samples of 1000 at the 95 per cent level of confidence. This means that if there were 52 per cent
in the sample who had taken foreign holidays, we could be 95 per cent sure that there were between 49 per cent and 55 per cent in the population. If the sample size could be increased to 2500, the sampling error would fall to 2 per cent and the confidence interval would shrink to between 50 per cent and 54 per cent.
How can the appropriate size for a sample be determined? If a project has as a key objective the estimation of the value of a particular population parameter with a particular level of precision and confidence (say, for example, the average household income in a particular target population plus or minus £10 at 95 per cent confidence), then it is relatively straightforward in SRS samples to work out exactly how large the sample needs to be to produce this (see, for example, the calculations in Kalton 1966, pp. 24-5, or Moser and Kalton 1971, section 7.1). In many cases (as above with foreign holidays), such calculations require a preliminary estimate of the variability of the key parameter within the population. However, because most projects have diffuse objectives, the appropriate level of precision to set and the variables to prioritize are not always self-evident. Some analysis will invariably be based on selected sub-groups where the numbers will be smaller and the sampling errors will be higher than when the sample as a whole is under consideration.
The conventional strategy is to err on the side of caution and base sampling error considerations on the least favourable variables, those likely to have the highest variability in the target population. A rule of thumb also sometimes offered is not to permit the size of any sub-group which will be the basis of analysis to fall below 50.
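The arithmetic behind the foreign holidays example, and the reverse calculation of the sample size needed for a target level of precision, can be sketched as follows. The sketch assumes a simple random sample, a proportion as the estimate and the normal approximation at the 95 per cent level; the function names are hypothetical, and the worst-case 50-50 split is used as the default.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Sampling error (half-width of the interval) for a proportion in an SRS."""
    return z * math.sqrt(p * (1 - p) / n)


def required_sample_size(margin, p=0.5, z=1.96):
    """Smallest SRS size whose sampling error does not exceed `margin`."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


for n in (1000, 2500, 4000):
    print(n, f"{margin_of_error(n):.4f}")

# Sample size needed for plus or minus 3 percentage points at 95 per cent.
print(required_sample_size(0.03))
```

The figures for 1000 and 2500 match those quoted in the text (just over 3 per cent and about 2 per cent), and the sampling error at 4000 is half that at 1000, illustrating why quadrupling the sample is needed to halve the error.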
Other types of error that affect surveys
A distinction is often drawn between the different types of inaccuracy or error that can affect surveys depending on their source because what needs to be done about them varies. As their name implies, selection errors originate in the selection process itself while non-selection errors are the residual category which come from anywhere else in survey methodology (and which will not be discussed further here). The most serious types of selection and non-selection errors are listed in Table 4.2 together with likely responses to them (though the optimum response depends on the point at which a problem comes to light).
Sampling error was discussed in the last section. Considering the other types of selection error in Table 4.2, a degree of sampling frame inaccuracy may be inevitable and where a list is compiled by an external agency, a researcher may have no option other than to accept its limitations and take them into account in the research design. The incorrect inclusion of cases which are not properly a genuine part of the target population may subsequently come to light and be self-correcting (for instance, on contact with
Table 4.2 Types of error in surveys

Selection problems                      Possible responses
Sampling error too large                Increase sample size, refine sample design
Sampling frame flaws                    Checks to establish extent of problems
Non-response                            Reminders (postal surveys) or recalls (telephone and household interviews)

Non-selection problems                  Possible responses
Use of incorrect or biased estimator    Consult statistician
Interviewer mistakes                    Simplify interview schedule, re-instruct interviewers
Coding errors                           Use computer-verified data entry; revise coding schemes
respondents), whereas the omission from the frame of cases which should have been included is more serious because it is less likely to be discovered automatically.
Non-response is a fundamental problem that affects most surveys to a greater or lesser extent. It refers to the failure of research efforts to gather data from all the cases that genuinely belong in the sample. Reasons for non-response include the refusal or inability of respondents to participate and cases which turn out to be uncontactable (on account of the death or relocation of individuals, their change of status, or the closure of businesses, etc). If non-response reaches high levels, it can threaten the statistical validity of survey findings. It is quite distinct from sampling error: by definition, sampling error is a random variation between possible samples, but non-response is highly unlikely to be random. Respondent refusals to participate in surveys, for example, are likely to come disproportionately from certain social groups. These include those who have unorthodox views on the topics of the research (or who simply believe their views to be unorthodox) and who can be especially reluctant to reveal them. Individuals who are socially excluded or have conflictual relationships with agencies of social control are especially likely to refuse irrespective of the nature of the research in question. This element of self-selection means that the responders/achieved cases cannot be taken as representative of the non-responders/non-achieved cases.
In much the same way, individuals who prove difficult to contact will possibly have occupations and life-styles substantially different to those of respondents.
The classic remedy for dealing with the non-contact element of non-response in postal questionnaires is postal (or telephone) reminders. In household interviewing, the procedures can require a fixed number of call-backs to an address at different times to the original visit. Refusals can be dealt with by a variety of methods including incentive payments or other
rewards and careful prior attention to the design of covering letters and preparatory information. Another tack that can be adopted when persuasion to participate has failed is to try to get at least one piece of non-contentious information from refusers (such as age) so that it is possible to compare their profile with that of respondents on a variable common to both.
Non-selection errors are included in Table 4.2 for completeness and are discussed in the sections on data collection and coding. The variety of sources of error discussed above underlines the reluctance of experienced survey researchers to rely wholly on large samples to deliver high precision estimates since increasing sample size reduces only sampling error but does not deal with the other sources.
Box 4.8 The Travel Survey: sample design
The Travel Survey addressed two different target populations of commuters, students and staff. The student target population was restricted to those in the second or subsequent year of their courses since a very high proportion of first years lived in halls of residence on the campus itself. Only staff working on the main campus were included. It was decided to use a DSS approach: the sampling fraction for most of the staff was set to 1:4 as against 1:5 for students because staff commuting was a more critical problem. Staff from three departments scheduled to move to a new campus were oversampled with a 1:3 fraction. The designated target populations were 4763 staff and 7995 students. The designated sample size was 1220 staff and 1998 students. The achieved sample sizes were 590 staff (48%) and 282 students (14%).
Sampling strategies: non-probability sampling
Non-probability selection methods do not implement a random selection lottery. They cannot therefore make use of inferences from probability theory and, in consequence, they do not provide equivalent guarantees of precision to the procedures discussed in the previous sections. They nevertheless may have a specialist role to play at particular stages of the survey process.
Convenience sampling
A convenience sample, as the name implies, is based on a selection of cases which are easily accessible to the researcher for the expenditure of relatively
little effort. Examples of high accessibility are households located in neighbourhoods close to the researcher's residence, or students in the researcher's own classes. The element of deliberate selection by the researcher and the fact of his or her association with the chosen cases seriously compromise these types of selection. Even where the cases are neither local nor personally known to the researcher, the 'convenience' of selecting them may be connected to the fact that they are celebrated or long-established instances of their class and, in these respects, atypical of the target population as a whole. The use of convenience samples should properly be restricted to feasibility studies and pilot research. Even here, their utility is problematic unless they are made up of cases with similar attributes to the target population. If not, the information they produce will be of little use even to test out the suitability of survey arrangements or instrumentation.
Snowball sampling
In this variant, the researcher relies on each case to supply details of the location of further cases, so that the sample grows steadily in extent (metaphorically, like a snowball rolled along the snowy ground). It is appropriate in somewhat specialized circumstances which may be summarized as follows:
• no sampling frame exists;
• cases are rare and are geographically widely distributed;
• cases are likely to know of each other;
• cases are willing to supply information about each other.
Circumstances in which snowball sampling might prove useful are where there is a need to gather together a collection of organizations offering a very new service or product (and which are likely to be aware of the competitors), or patients (or possibly relatives of patients) suffering from rare medical conditions who may be in contact with fellow sufferers. In situations where the condition or characteristic is socially undesirable, however, referrals may not be forthcoming. Snowball samples suffer from the same main limitations as convenience samples and their use is generally limited to exploratory studies.
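The way a snowball sample grows from referrals can be sketched in Python. This is a toy illustration rather than a fieldwork procedure: the referral data, the case labels and the function name snowball_sample are hypothetical, and the sketch assumes that every case contacted is willing to name the others it knows.

```python
from collections import deque


def snowball_sample(referrals, seeds, max_cases):
    """Follow referrals outwards from the seed cases until max_cases is reached.

    `referrals` maps each case to the further cases it can name.
    """
    sampled = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sampled) < max_cases:
        case = queue.popleft()
        sampled.append(case)
        # Each contacted case supplies details of further cases.
        for referred in referrals.get(case, []):
            if referred not in seen:
                seen.add(referred)
                queue.append(referred)
    return sampled


# Hypothetical referral chains among five cases.
referrals = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": [],
    "E": ["A"],
}

print(snowball_sample(referrals, seeds=["A"], max_cases=4))
```

Any case that nobody in the chain knows or names can never enter the sample, which is one way of seeing why the method offers no guarantee of representativeness.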
Purposive sampling
Purposive sampling is employed mainly in exploratory and in qualitative research. The logic of this kind of selection is not based on typicality but on locating cases with attributes of particular interest to the researcher. An implementation of purposive sampling is contained within the 'grounded theory' approach discussed in Glaser and Strauss (1967), Strauss (1987) and Strauss and Corbin (1993). Together with some of the other purposive
selection procedures, grounded theory is preoccupied with the creation of explanatory categories and, through them, with building theoretical systems, rather than with demonstrating that cases are representative of their empirical populations. In order to construct such categories, the researcher seeks a collection of paradigmatic or 'ideal' instances, extreme examples, recent or old instances, instances where x occurs with y or in the absence of z, etc. From the viewpoint of selection, the two key elements in grounded theory are (i) theoretical sampling, '... whereby the analyst decides on analytic grounds what data to collect next and where to find them' (Strauss 1987, 38); and (ii) sampling to saturation, where data from cases with the desired attributes are collected at a research site up to the point at which no new insights or further information is uncovered. Such an approach allows data gathered early to be analysed in time to influence subsequent data selection and gathering strategies.
The objectives of purposive sampling are radically different from those of probability sampling and it cannot be judged by the same criteria. Clearly, however, like convenience and snowball techniques, it does not allow the calculation of levels of precision.
Quota sampling
Quota sampling is widely used in market research and opinion polling in circumstances where probability sampling would also be appropriate. It requires researchers to be able to estimate in advance how key variables (usually demographic attributes like age and sex) are distributed in the target population. From this information, quotas of interlocking attributes that respondents must satisfy are devised and given to interviewers. The accumulated totals of all the quotas reflect the proportions of the characteristics in the population. The interviewers then have some discretion about finding suitable respondents to fulfil the quota within their allocated neighbourhoods. Some assignments allow for street interviewing, others for household interviewing only. An example quota that could be assigned to an interviewer is given in Table 4.3. In this example, the interviewer needs to find (among others) three women who are in the 45-64 age bracket and are all in the lower occupational class. If household interviews were being conducted, some restrictions like one person per household, no adjacent addresses, might be applied.
A quota sample can be regarded as an attempt to combine the advantages of stratification (introduced via the interlocking characteristics) with a degree of clustering (the result of each interviewer operating within a particular location or neighbourhood).
Their attraction to commercial organizations is that they are relatively easy to set up quickly with the clustering offering savings on overheads like travel and subsistence. Comparisons of probability and quota samples suggest that the latter can, in knowledgeable
Table 4.3 Interlocking sex, age and occupational class characteristics of respondents
[Table: rows are age bands (20-29, 30-34, 45-64, 65+) and totals; columns are occupational class (Prof/Managerial, Intermediate, Lower), each split into Male and Female, and totals. Row totals are 4, 6, 7 and 3, giving an overall quota of 20.]
hands, offer equivalent accuracy despite the fact that the interviewer's greater discretion in quota sampling is an additional potential source of errors. However, quota samples do not permit sampling errors to be calculated in the same way they are for probability samples and, overall, the technique is better suited to team research conducted by experienced practitioners than it is to solo or novice surveyors.
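One way of turning estimated population proportions into interlocking quotas for an interviewer is sketched below in Python. The proportions, the cell labels and the function name allocate_quotas are hypothetical, and the rounding rule (largest remainders first) is simply one reasonable choice; it is not claimed to be the method used by any particular polling organization.

```python
def allocate_quotas(proportions, total):
    """Split `total` interviews across cells in line with population shares.

    `proportions` maps each interlocking cell, e.g. (sex, age band), to its
    estimated share of the target population; the shares should sum to 1.
    """
    raw = {cell: share * total for cell, share in proportions.items()}
    # Start from the whole-number part of each cell's entitlement.
    quotas = {cell: int(value) for cell, value in raw.items()}
    shortfall = total - sum(quotas.values())
    # Hand the remaining interviews to the cells with the largest remainders.
    by_remainder = sorted(raw, key=lambda cell: raw[cell] - quotas[cell], reverse=True)
    for cell in by_remainder[:shortfall]:
        quotas[cell] += 1
    return quotas


# Hypothetical population shares for four sex-by-age cells.
shares = {
    ("Male", "20-44"): 0.24,
    ("Male", "45+"): 0.25,
    ("Female", "20-44"): 0.24,
    ("Female", "45+"): 0.27,
}

quotas = allocate_quotas(shares, total=20)
for cell, count in quotas.items():
    print(cell, count)
```

The interviewer would then be free to choose whom to approach, provided the completed interviews fill each cell exactly.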
Further reading
Chapters 5 and 6 of Moser and Kalton (1971) Survey Methods in Social Investigation (2nd edn) are written at an introductory level and include only the minimum of statistical theory. Kalton's (1983) Introduction to Survey Sampling (out of print but still available in academic libraries) offers a compact, intermediate level, treatment. A more recent alternative to Kalton is Barnett (1991). Kish (1965) offers an advanced theoretical handbook on sampling principles.
Collecting your data
Doing it yourself
In large-scale surveys, data collection is typically contracted out to an agency which employs hired hands. They conduct the interviews, if there are any. They code the responses and enter them into the computer. In contrast to this supposed drudgery, the creative work of design and analysis is done by the researchers.
This book is addressed to people who are collecting the data themselves, either individually or as a member of a small research team. Doing it yourself has a number of advantages, particularly for interviewing, as Saunders (1990: 383) argues in his survey of home owners.
First, doing it yourself gives you a far better 'feel' for the data than if a hired interviewer had simply delivered the findings to you. You will know not only what was said but how it was said. You will have an insight into what areas respondents found sensitive, and why this was so. You will also be better able to judge which items were the most salient to respondents.
Second, hired interviewers are not necessarily interested in the research. Why should they be, especially if we have defined them as the muscle and ourselves as the brains? The pay is poor, and it is piecework - so the quicker they can get through an interview the better it will be for them. To make the interview go smoothly, they may say things which the researchers certainly would not have sanctioned. Aldridge recently had the experience of being interviewed by someone who complimented him on his taste in classical music, which she deduced from the CDs on display in his sitting room. The interviewer proceeded to agree with some of his answers! Flattering, perhaps, but very damaging to validity. Whether or not hired interviewers deviate from our script, their situation encourages an instrumental and calculative approach to the interviews. Because of this, researchers on large-scale surveys have to spend days on the recruitment and training of interviewers, and on making the interview schedule's instructions watertight.
Third, using hired interviewers is feasible only where interviews are highly structured.
If we want to ask searching open-ended questions, it is better to do so ourselves.
We would add two more points, implicit in what Saunders says. Doing it yourself is deeply satisfying. It is also more conducive to the exercise of the sociological imagination.
Commissioned research
Many readers will be carrying out research for someone else: an employer, a voluntary association, a church or charity. Even though you are doing it yourself, and whether or not you are being paid, you do not have a free hand. It may be the sponsor's view that they have the aim and the vision, while you have the technical know-how. Paradoxically, however, sponsors are usually surprisingly vague about what they want to know. Nor do they necessarily have a clear strategy for the research - they simply want to commission 'a survey'.
This means that you will be involved in discussion with your sponsors to establish not just the practical details of the survey but its objectives. Your role is to help the sponsors clarify what it is they want to know. Sponsors usually recognize this soon after negotiations begin. At this stage, they are open to your proposals about how to define and achieve the objectives of the research.
Difficulties with sponsors tend to arise later. The survey method can be a victim of its own virtue, its openness to public scrutiny. Sponsors will ask to see your draft questionnaire or interview schedule. Two things tend to happen. They will ask you to modify or omit some questions as too sensitive, and they will present you with questions they want you to include. These requests may come very late, and just at the time when you are ready to launch the survey having completed your piloting.
If you are asked to modify or omit questions, it shows that the sponsor is probably afraid of the answers. This is a sign that we should be asking exactly those questions. Even though sponsors say their objective is to improve their service to their clients, they may be frightened by the prospect of a barrage of criticism. In a large organization, one section - the catering department, say - may feel it is being unduly exposed to criticism. They may well say that they have done their own 'survey' already, and know all they need to know.
When you are asked to include questions, the problem is that the sponsor usually expects you to include them word for word. This is what they want to know, and this is how they want you to ask it. If the questions are well designed there will be no problem; but what chance is there of that? One or two badly worded questions can seriously damage the overall quality of the responses and the response rate itself.
How to handle this? Clearly, the answer depends on the precise situation. We suggest one principle: emphasize the technicalities.
You have been asked to carry out a survey because you have expertise. Even if you are a beginner, reading this book will give you far more knowledge about surveys than your sponsors have. They have asked for your advice, and you should not be apologetic about it. If you helped them in the early stages to clarify their objectives, they are likely to follow your advice now.
Questionnaires and structured interview schedules are documents that sponsors can ask to vet. Such scrutiny of the details is not so easy in the case of unstructured interviews and focus groups. This is one reason for including them in our research strategy: they are less vulnerable to the sponsor's attentions.
Covering letters for postal questionnaires
A postal questionnaire must be accompanied by a covering letter. There is no formula for such letters - as ever, the sociological imagination comes into play. How the letter is worded will depend on the topic of the research, the respondents, your relationship with the respondents, and what you are able and willing to promise as regards feedback. Box 5.1 offers some guidelines.
If the resources are available, it is worth sending a reminder letter to those who have not responded. Doing so invariably generates a significant number
Collecting your data
Box 5.1
Guidelines for a covering letter
Style: The letter should be clear, straightforward, businesslike and fairly formal, but not pompous. Headed writing paper is helpful. An informal chatty style will be off-putting to some respondents, who will read it as frivolous. On the other hand, the days are happily gone when we could address respondents self-importantly, as though they were obliged to take part.
Spelling and grammar: Like it or not, many people interpret mistakes in spelling and grammar as signs that the writer is careless, ill-educated or unintelligent. However unfair, these judgements will be made, to the detriment of the response rate and the quality of responses. Do proof read carefully, and ask for advice if you need to.
Purposes of the research: We need to say as much as we can about this - but as briefly as possible - in order to persuade respondents that participation is worthwhile. We may need to mention sponsorship or funding, and should also give a concise statement about our position as researchers.
How the respondent was selected: Unless it is obvious from the context of the research, we should explain briefly and non-technically how the respondent was selected for inclusion in the study.
Why the respondent can help: Sometimes respondents fear they cannot help us because they are not experts and do not know enough about our research topic. We may need to reassure them about this, for example by saying that we are interested in their opinions and experience, and that we wish to have a broad coverage of all shades of opinion.
Confidentiality and anonymity: We need to be clear about what guarantees we are giving, and to be alert to the problem that some respondents may take confidentiality to mean anonymity. If the questionnaire has a serial number, we should explain its significance.
Feedback: We may decide to offer feedback individually to respondents, though this can be costly. Alternatively, we may indicate to them where our findings will be published. One possibility is to use the world wide web.
Answering queries: It may be desirable to give a telephone number or email address which respondents can use if they have any queries.
Thanks: We should thank the respondent in anticipation of their taking part.
Surveying the social world
of extra responses. It works more easily when the questionnaires are not anonymous, since we can target the reminders to non-respondents. As with a covering letter, there is no formula for a reminder. Clearly, people are entitled to refuse, so we cannot be accusatory in tone. What we should do is refer to the value of the research and of the respondent's participation.
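Targeting reminders at non-respondents is a simple bookkeeping exercise once questionnaires carry serial numbers. The following is a minimal sketch in Python, assuming a hypothetical sample list in which each person has a `serial` field, and a separate list of the serial numbers on questionnaires that have come back; the names and data are illustrative only.

```python
def reminder_list(sample, returned_serials):
    """Return the people in the sample whose questionnaire has not come back."""
    returned = set(returned_serials)
    return [person for person in sample if person["serial"] not in returned]


# Hypothetical sample: serial numbers printed on the questionnaires sent out.
sample = [
    {"serial": 101, "name": "Respondent A"},
    {"serial": 102, "name": "Respondent B"},
    {"serial": 103, "name": "Respondent C"},
]

# Serial numbers logged as questionnaires are returned.
returned_serials = [102]

for person in reminder_list(sample, returned_serials):
    print(person["serial"], person["name"])
```

If serial numbers are used in this way, the covering letter should say so, since respondents may otherwise take a promise of confidentiality to mean anonymity.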
Box 5.2
Guidelines for a reminder letter
Keep it short: A reminder letter should be even shorter than the original covering letter.
Content: Refer to the value of the respondent's participation.
Facilitation: Enclose another copy of the questionnaire and another stamped addressed envelope.
Appreciation: Acknowledge that the respondent's reply may be in the post, and thank the respondent.
Box 5.3 shows the text of the covering letter used in the Travel Survey. As is normal in real life research, the outcome was a compromise. Although drafted by the Survey Unit it went out over the signature of the senior academic responsible for traffic on campus. Given the context of the research, it was thought essential to explain the reasons why the university was conducting a survey.
Box 5.3
An example of a covering letter
Professor David Greenaway
Pro-Vice-Chancellor and Professor of Economics
Department of Economics
University Park
Nottingham NG7 2RD
Dear Colleague/Student,
Travel to Work Survey 1998
Thank you for finding time to complete this questionnaire.
Why are we doing it?
The University is committed to traffic management policies aimed at reducing vehicle dependency, encouraging the use of public transport and managing vehicular movements within and between our campuses. This commitment is part of a wider environmental strategy for the University. The survey will identify the travel patterns of staff and students that will assist us in formulating policies to:
• tackle the problems of increasing demand for vehicular access and parking;
• make recommendations for developing and supporting viable and accessible transport alternatives.
The findings of the survey will be reported to Transport Consultants who will advise us in producing a Commuter Plan for the University, which is expected in October 1998.
About the survey
The survey is confidential. If you complete and return the questionnaire accompanied by the slip below you can be entered into a prize draw to win a bicycle. There will be two prizes, one each for staff and students. The questionnaires and the slip, which will be separated immediately on receipt, should be returned by 12.5.98 in the envelope provided. Winners will be announced on 20.5.98.
Many thanks in anticipation of your cooperation.
Yours faithfully,
Professor D. Greenaway
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
BICYCLE PRIZE
If you wish to be entered for the draw, please fill in the details below and tear off this slip. Enclose the slip with your completed questionnaire in the envelope provided and return it through the internal mail. The slip will be separated from the questionnaire immediately on receipt. All personal details will remain absolutely confidential.
Name
Department
Contact telephone number
Please tear off and return with the questionnaire
Approaching respondents for an interview
It is helpful, if possible, to send potential respondents a letter first, including the same kinds of points that would be in a covering letter for a postal questionnaire. Here the letter is functioning as a kind of letter of introduction. Then, when contacting the person by telephone or in person, we can refer back to the letter. We should not assume that respondents will remember the detailed contents of the letter or even having received it - for someone else in the household may have opened it and not mentioned it. The point is, it serves as an introduction and shows our good faith in wishing to elicit informed and willing cooperation.
If calling in person, we will obviously want to look respectable, and not be mistaken for a salesperson or evangelist. We should carry identification with a photograph, and an official letter explaining who we are and giving a contact address and telephone number for verification. As with a covering letter, we need to explain the purposes of the research, how the findings will be used, whether any summary report will be available, how the respondent was chosen, and our guarantees of confidentiality.
One awkward problem is when one member of a household acts as a gatekeeper and tries to refuse participation on behalf of another ('My wife won't want to take part in your survey'). We should do our best, politely, to try to speak to the potential respondent in person.
Piloting
Piloting is essential, but is often skimped and hurried. In our experience sponsors rarely allow for it; they want you to get on with the survey and produce the results.
A pilot survey is a dummy run of the survey proper, in which we aim to test all the key aspects of the survey, including access to respondents, design of the research instrument, and gathering the data. The pilot survey may be preceded by one or more pretests, in which we investigate particular aspects of our survey, such as a specific set of questions we consider problematic. The pretests and the pilot survey are all part of the overall process of piloting.
Textbooks on surveys often propose an elaborate and costly programme of pretests, followed by a large-scale pilot survey. It is an ideal impossible to live up to in most research carried out on a limited budget by a solo researcher or a small research team. The answer is not to despair, but to focus on the essentials. We suggest the following guidelines:
• Try to get it right first time. A pilot survey should be as good as you can make it. Piloting enables us to refine our survey, not to transform a hopeless mess into a perfect instrument. Perfection is not attainable anyway.
• Quality is more important than quantity. Small-scale but intensive piloting is far better than large-scale crude piloting.
• Imaginative use of small-scale pretests is very productive. It enables us to get detailed comments and suggestions about how to improve our research instrument.
• Make the pilot survey as similar as possible to the survey proper. In principle, we should be testing the effectiveness of all aspects of the research design.
• Using your judgement about the target population, choose a representative range of respondents for the piloting. Friends or colleagues will not be representative of the target population, and if we rely on them ambiguous, sensitive or offensive questions may not be picked up.
In the piloting process, we need to be attuned to the signs that warn us that something is wrong - as set out in Boxes 5.4 and 5.5.
Box 5.4
Warning signs in pilot self-completion questionnaires
Giving several answers to a question where only one was required: This means we need to make our instructions clearer - for example, please ✓ one only.
Giving one answer to a question where several were possible: Again, the instructions need to be clarified - for example, please ✓ all that apply.
Failure to answer the question: This may mean that the question is awkward or offensive. Alternatively, something may have gone wrong with our question skips.
Open questions are left blank: If hardly anyone answers them, do they have any value?
A question asking respondents to rank items is not completed properly: Ranking is a complex task. We should simplify it, usually by reducing the number of items to be ranked.
Respondents write comments in the margins: This is a straightforward sign that something is amiss.
The questionnaire takes a long time to complete: Even if people do not complain, this is a clear warning. Participants in a pilot survey may be more generous with their time than respondents to the survey proper will be.
Almost everyone gives the same answer: This is a warning sign of possible social desirability effects.
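Two of these warning signs, questions left blank and questions where almost everyone gives the same answer, can be checked mechanically once the pilot returns have been keyed in. The following is a minimal sketch in Python. The data layout (one dictionary per respondent, with `None` for a blank) and the two thresholds are our own illustrative assumptions, not established cut-offs; with a small pilot, the flags are prompts for judgement rather than verdicts.

```python
from collections import Counter


def pilot_warnings(responses, questions, max_blank=0.2, max_same=0.9):
    """Flag questions that are often left blank or answered almost identically.

    responses: list of dicts mapping question id -> answer (None = left blank).
    max_blank, max_same: illustrative thresholds, chosen only for this sketch.
    """
    warnings = {}
    n = len(responses)
    for q in questions:
        answers = [r.get(q) for r in responses]
        given = [a for a in answers if a is not None]
        flags = []
        # Share of pilot respondents who left the question blank.
        blank_rate = 1 - len(given) / n if n else 0.0
        if blank_rate > max_blank:
            flags.append("often left blank")
        # Share of those answering who chose the single most common answer.
        if given:
            top_count = Counter(given).most_common(1)[0][1]
            if top_count / len(given) >= max_same:
                flags.append("almost everyone gives the same answer")
        if flags:
            warnings[q] = flags
    return warnings


# Hypothetical pilot returns from five respondents.
pilot = [
    {"q1": "yes", "q2": None, "q3": "bus"},
    {"q1": "yes", "q2": None, "q3": "car"},
    {"q1": "yes", "q2": "too expensive", "q3": "bike"},
    {"q1": "yes", "q2": None, "q3": "car"},
    {"q1": "yes", "q2": "no buses", "q3": "walk"},
]

print(pilot_warnings(pilot, ["q1", "q2", "q3"]))
```

The other signs in the box, such as comments written in the margins, still call for reading each returned questionnaire.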
Box 5.5
Warning signs in pilot interviews
The interviewer has to clarify or expand on a question: Presumably the question is unclear, and needs to be reworded.
The interviewer has to apologize for a question: This is an extreme form of the first point. In our experience of being interviewed, it is common in hired hand research. We should never have to apologize for a question.
Interviewees appear reluctant or embarrassed: Something clearly is wrong. Some questions may be more sensitive than we realized, or perhaps our self-presentation is unintentionally inhibiting.
The interviews are significantly longer than expected: The simple remedy is to cut the number of questions, and perhaps to reduce the amount of probing.
The interviews are significantly shorter than expected: This is a warning that rapport may not have been achieved, that respondents have doubts about the research, or that question probes are not operating properly.
There are items where respondents want to say more than we expected: This is a sign that these items are salient to the respondent. We should consider asking more questions about them, perhaps with deeper probes.
Respondents have difficulty with response categories: We may need to use show cards.
Interviewers have difficulty with instructions: Interviewers have problems too. Instructions, especially question skips, are often hard to follow.
Distribution and return of questionnaires
If we are sending questionnaires through the post, or through the internal mail of an organization, we need to make sure that they reach the right people. First, we need up to date addresses. Second, we must make it easy for respondents to return their questionnaire to us. If they are to use the post, then it is desirable to supply them with a stamped addressed envelope, which appears less impersonal than a business reply envelope. In some cases, it may be more convenient for respondents to use the internal mail of their organization, provided it is efficient and provided that respondents are confident it is secure.
Sometimes, questionnaires are handed out personally by an intermediary - for example, by a teacher in a classroom or a receptionist in a waiting area. Relying on intermediaries is, however, very dangerous. Unless they have been fully briefed, and unless we can be quite confident that they will do as instructed, it will probably not turn out well. Good intentions are not enough. Intermediaries may not be fully aware of the nature and purposes of the research, nor are they necessarily knowledgeable about the research process. Sometimes they may go too far, implying that participation is required and refusal not an option. They may give inaccurate instructions, or put an inappropriate gloss on the purposes of the research. In other cases, intermediaries may not pursue the matter at all vigorously, but will simply leave questionnaires lying around for people to complete if they feel inclined. Few will do so.
Self-completion questionnaires are often used in audience research in theatres and like venues. A common problem is that the respondent is given no instructions whatever about where to return the questionnaire. People may be reluctant to leave it behind on their seat. So we see people leaving the theatre at the end of the performance, clutching a questionnaire while vainly looking for the box to put it in or an official to hand it to. Most of these questionnaires finish their life in a litter bin or the gutter. The fault lies with the researchers, for having given no thought to how the questionnaires are to be returned.
Further reading
To pursue these questions in more depth, we suggest that the best way is to read about how researchers have tackled them in their own work. Devine and Heath (1999) provide a good starting point in their Sociological Research Methods in Context. Some of the articles in Hammond's (1964) classic collection, Sociologists at Work, deal with researchers adapting survey methods to particular settings and problems (see especially the chapters by Lipset, Coleman and Davis).
Designing the questions: what, when, where, why, how much and how often?
Key elements in this chapter * Asking meaningful questions
The sociological imagination
Formulating the questions to include in a questionnaire or interview schedule, designing the layout of questionnaires and planning the sequence of questions: all these lie at the heart of survey work and are one of its most enjoyable aspects. There are technicalities to be taken into account and pitfalls to be avoided, as we explain. But the technicalities stem from something more fundamental, the sociological imagination.
Professional sociologists do not have a monopoly on the sociological imagination. It is grounded in social life - above all, in the lives of our respondents. We use our sociological imagination to try to identify the links between public issues and private concerns, between the great issues of our society such as poverty and social exclusion, disability, job insecurity, and the personal experiences of people engaged with them. We ground our imagination by preliminary work such as reading about the topic, talking to people, observing them, piloting our questions and so on.
Question design calls our sociological imagination into play in a number of ways. We need to frame questions that are meaningful, sensitive, precise, searching, and salient to our respondents. We need to construct the questions in such a way that respondents will want to answer them as fully and truthfully as they can.
Understanding what matters to respondents
Surveys are often criticized for being driven entirely by the interests of the researcher. How do we know that what interests us also interests our respondents? This is the problem of salience. Respondents' helpful cooperation does not necessarily show that we have engaged with their real concerns.
Box 6.1
Gauging salience
Open-ended questions: We examine the significance of open-ended questions later in this chapter. For the moment, we simply say that two of the most productive questions the Survey Unit has asked of first-year undergraduate students at the University of Nottingham, UK, are the following:
What would you say you have most liked about being an undergraduate student at the University of Nottingham?
and
What would you say you have most disliked about being an undergraduate student at the University of Nottingham?
Ranking questions: These are closed versions of the open-ended questions given above. We present our respondents with a list of alternatives, and ask them to choose a small number that are the most important to them. Sometimes we ask respondents to rank their selection in order of importance. This technique can be revealing, though it will be very cumbersome if a ranking is required and the list is long. It can also seem somewhat artificial.
Direct questions on salience: We present respondents with a list, asking them to indicate for each item how important it is to them. This approach is blunt, but can be effective. One very common approach is through a Likert scale, thus:
                                        Strongly agree   Agree   Neutral   Disagree   Strongly disagree
Catering on campus is excellent
Halls of residence are well equipped
and so on. An alternative way of presenting the response categories is like this:
Strongly agree 1 2 3 4 5 Strongly disagree
For each item, respondents are asked to put a ring round the appropriate number. We suggest later (page 112) that in most cases it is desirable to have an odd numbered scale, normally with five categories, so that there is a middle category. This middle category may be labelled 'neutral', or 'uncertain', or 'neither agree nor disagree'.
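When the responses come to be entered, a five-category Likert item is usually stored as a number. The following is a minimal sketch in Python of one way to do this, assuming the 1 to 5 numbering shown above (1 for 'Strongly agree', 5 for 'Strongly disagree') and treating a blank as missing rather than as 'Neutral'. The function names and the sample answers are hypothetical.

```python
from collections import Counter

# Codes follow the numbered version of the scale: 1 = Strongly agree.
LIKERT_CODES = {
    "Strongly agree": 1,
    "Agree": 2,
    "Neutral": 3,
    "Disagree": 4,
    "Strongly disagree": 5,
}


def code_item(answers):
    """Convert Likert labels to 1-5 codes; blank or unrecognized answers become None."""
    return [LIKERT_CODES.get(a) for a in answers]


def tabulate(codes):
    """Count how many respondents chose each category, ignoring missing answers."""
    counts = Counter(c for c in codes if c is not None)
    return {code: counts.get(code, 0) for code in sorted(LIKERT_CODES.values())}


# Hypothetical answers to 'Catering on campus is excellent'.
answers = ["Agree", "Strongly agree", "Neutral", None, "Agree"]
codes = code_item(answers)
print(codes)
print(tabulate(codes))
```

Keeping blanks separate from the middle category matters: a respondent who skipped the item has not told us that they are neutral.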
Recognizing differences between respondents
An essential reason for doing a survey is to draw comparisons between respondents. If they all thought and acted alike there would clearly be no point in a survey, since we could simply take one case and generalize from it. Variations between respondents can cause technical difficulties, as we illustrate through the Travel Survey, but they are what make a survey worthwhile. In our experience, making false or dubious assumptions about respondents is one of the most common problems to be overcome.
Box 6.2
Avoiding unjustified presuppositions and false assumptions
Assumptions and presuppositions are similar but not quite the same.
False assumptions: By an 'assumption', we mean something that is taken for granted. All arguments are built on assumptions, but assumptions can be false. For example, in a postal questionnaire sent to a sample of Church of England clergymen (this was before the church ordained women priests), Aldridge asked respondents the following question:
Is the fundamentalist approach to the Bible valid today?
• Yes
• No
• Uncertain
A very significant minority of respondents objected to this question, on the grounds that the term 'fundamentalist' was not only ambiguous but offensive. Aldridge had falsely assumed that the term was clear and neutral!
Another example known to us is a questionnaire on cremation and burial, which was delivered by post without even a covering letter and which caused distress to many respondents, not least to people who had been recently bereaved. The researchers presumably assumed, falsely, that the topic was not particularly sensitive, and that it could be treated as an unproblematic area of academic enquiry.
Unwarranted presuppositions: By a presupposition, we mean taking the existence of something for granted. The standard philosophical example of this is the question:
Is the present King of France bald?
The point is, of course, that since France is a republic there is no King of France. It is not true that the present king is bald, but nor is it false. In order to ask fruitful questions in our surveys, we need to know what there is and what there isn't in the social world in question.
Which of the following posts exist at the University of Nottingham, UK?
Deputy Pro-Vice-Chancellor
Director of Finance
Dean of the Medical School
Proctor
Answer: the second and third exist, the first and last do not. A well-placed member of the university would know this and could have told you if you had asked. Finding out what exists out there is a vital component of all social research. We can enquire about the King of France's hair, or the Proctor's policy on student discipline, only after we have established that these beings actually exist.
In seeking to eliminate false assumptions and unwarranted presuppositions there are no easy answers and no simple tactics. We are at the heart of the sociological imagination. Knowing about the social and organizational context is critically important, and piloting plays a key role. Here is one possibility:
Consulting key informants: We have to be clear what we are doing here. Oppenheim (1992: 62-3) warns us against relying on 'experts'. If our questions are sloppy and ill thought-through, it would not take an expert to tell us so, and the expert probably would not waste her or his precious time trying to rescue us from disaster. No 'expert' knows everything. Experts in survey design can help us with technicalities, as Oppenheim says, but they cannot do our thinking for us.
Instead of relying on experts, we should think in terms of making use of key informants. By this we mean people who can help in alerting us to false assumptions and unwarranted presuppositions. They can also warn us about problems in the use of language.
Using unambiguous language sensitively
Obviously, we want questions that are meaningful, clear, unambiguous, sensitive and revealing. Given the variation between respondents, this is not so easy. There are well-known and not so well-known differences in language use depending on social factors such as age, region, and social class. An issue that has to be dealt with is the social standing of different usages. In contemporary British usage, supposedly 'correct' usages include these:
• the midday meal is lunch, not dinner
• the room in which the family gathers (if it does!) is the sitting room, not the living room or the lounge
• a magazine should never be called a book
• the loo or lavatory is not a toilet
Many people are highly sensitive to these variations in usage, regarding some of them as impolite, vulgar, or incorrect. How do we avoid ambiguity without patronizing our respondents or 'correcting' their use of English? Some answers are given in Box 6.3.
Box 6.3
Tactics for dealing with ambiguous or unclear terms
Avoidance: The most commonly used terms for a midday meal are lunch or dinner, and for an evening meal dinner, supper, or tea. One tactic is to use alternatives such as midday meal, main evening meal, or main meal of the day.
Glossing: Another possibility is to gloss the term, that is, to give a brief explanation of what we mean by it. Here are two interview questions [asked only of those respondents who think their soul will live on after death] taken from the Religion and Politics Survey, 1996, conducted by Princeton Survey Research Associates and accessible on the American Religious Data Archive: http://www.arda.tm
Do you think there is a heaven, where people who have led good lives are eternally rewarded?
Yes (Believe in heaven)
No (Don't believe in heaven)
Don't know/Refused (Don't know if believe in heaven)
Do you think there is a hell, where people who have led bad lives and die without being sorry are eternally damned?
Yes (Believe in hell)
No (Don't believe in hell)
Don't know/Refused (Don't know if believe in hell)
These two questions gloss the meaning of heaven and hell. After all, other meanings are common in western culture. For example, some people believe that hell does not entail eternal damnation, others that damnation not only sounds spiteful but also fails to convey the desolation of being cut off from God.
Glossing involves an imposition of a meaning. Hence, it is desirable to convey that we are simply saying what we mean by the term, not what the term means. We do not want to give the impression that we are instructing our respondents in Standard English. Our glossary to this book provides an example. We use questionnaire exclusively to refer to a form completed by the respondent; other writers use it inclusively to cover interview schedules as well. We are not claiming that our usage is correct or better, but are simply glossing our use of the term to deal with the ambiguity.
Clarification: This is a form of glossing in which we explicitly clarify potential ambiguity. Here is an example from the Travel Survey:
On occasions when you travel to the campus by car, where do you park?
Not applicable
In the Science City area
In the central area (including Highfields House, West Drive and Education)
On the periphery (including Halls, History and the Sports Centre)
In this case, the location in our categories of Highfields House and the other examples is explicitly clarified. We are in effect glossing what we mean by the central area and the periphery.
Giving examples: In a Survey Unit questionnaire sent to Pre-Registration House Officers in England - people in their final year of basic medical training - respondents were asked about specific formal educational sessions, and then were asked:
Have any other formal educational meetings been arranged (for example, lectures, journal club, X-ray meetings, etc.)?
Giving examples is far more friendly than issuing instructions, but carries the danger of suggesting some answers while possibly distracting attention from others. It is best used when we know that the examples either cover all the main possibilities or send an unambiguous message about what we have in mind. (Incidentally, the question breaks a rule we were taught at school: you do not say 'etc.' if you have already said 'for example'. However, we think that being clear and helpful is more important than being formally 'correct'. On the other hand, many respondents will be shocked to see misspellings, so it is important to check spelling carefully, running a spellcheck program if possible.)
Indirectly eliminating unwanted meanings: This is sometimes possible, though perhaps risky. It depends on respondents picking up cues from the context. Consider the following example:
Over the past seven days, have you bought any of the following? Please tick all that apply.
A comic paper
A newspaper
A magazine
A book
In this example, book and magazine are listed separately, with magazine appearing before book. The researcher expects the reader to infer from this that book is used exclusively of magazine.
The role of open-ended questions Some books w a r n against using open-ended questions at a l l i n surveys, w h i l e others say t h a t open-ended questions s h o u l d be k e p t t o a strict m i n i m u m . W h y is this? Three m a i n reasons are given. 1
Open-ended questions are m o r e d i f f i c u l t t o answer, because respondents or interviewees are called u p o n t o t h i n k t h r o u g h (or t h i n k u p ) their answer f r o m scratch, w i t h o u t help f r o m the researcher. T h i s is p a r t i c u l a r l y p r o b l e m a t i c w i t h questionnaires, since w r i t i n g a n answer requires m o r e t i m e a n d e f f o r t t h a n g i v i n g i t verbally. I f respondents suspect t h a t the reason f o r open-ended questions is t h a t the researcher has n o t t a k e n the t r o u b l e t o t h i n k a b o u t response categories, this m a y w e l l affect the response rate a n d the q u a l i t y o f responses.
2 T h e responses t o open-ended questions are m o r e d i f f i c u l t t o code, u n l i k e closed questions, w h e r e the response categories are pre-coded. 3 T h e responses t o open-ended questions are harder t o analyse. Partly, this is because o f c o d i n g p r o b l e m s . I n a d d i t i o n , a n u m b e r o f respondents w i l l s i m p l y s k i p over open-ended questions. Open-ended questions t y p i c a l l y have a higher rate o f non-response t h a n closed questions d o . Despite these real difficulties, open-ended questions can p l a y a n i m p o r t a n t p a r t i n survey w o r k , b o t h i n questionnaires a n d i n t e r v i e w s . T h e y can be used f o r a n u m b e r o f purposes. T o introduce variety Questionnaires a n d i n t e r v i e w s w h i c h rely o n a very s m a l l n u m b e r o f types o f q u e s t i o n a n d response - a Yes/No/Don't K n o w f o r m a t , f o r example - m a y be s t r a i g h t f o r w a r d , b u t are also l i k e l y t o be seen as t e d i o u s . O n e w a y o f i n t r o d u c i n g v a r i e t y is t h r o u g h the c a r e f u l use o f open-ended questions.
To tap salience
As discussed above (Box 6.1), open-ended questions can be very useful in helping us to assess the salience of an issue to a respondent.

To show a humanistic approach
Surveys are sometimes thought to be inevitably mundane, boring and insensitive. By using open-ended questions as well as closed ones, we are able to send a clear signal that we approach our research in a humanistic spirit. Our respondents are informants, with their own individual points of view, which they are quite capable of expressing in their own words.
To acknowledge that researchers are not omniscient
In some cases, we have so little idea of what answers might be forthcoming, or the possibilities are so vast, that it is simply not possible to provide respondents with a sensible list of the main alternatives. In the Travel Survey, people who cycle to work were asked the open-ended question: How do you think facilities for cyclists could be improved? Members of the Survey Unit are not cyclists themselves, and could not easily anticipate what the answers from cyclists would be.

To generate quotations
A few well-chosen quotations from our respondents can convey the flavour of responses far better than any other rhetorical device. We are delivering our promise to give people a voice. If our survey is being undertaken on behalf of a sponsor, direct quotation from respondents - who may be customers - can have an immediate impact. There will be good news as well as bad. First, the good news, from postgraduate students:

What would you say you have most liked about being a postgraduate student at the University of Nottingham?
'High quality - the experienced teachers, good courses.'
'My lovely friends from all around the world, UK, Taiwan, Turkey, Germany, Greece, Spain, Denmark . . .'
'The safe and beautiful campus.'

The bad news:

What would you say you have most disliked about being a postgraduate student at the University of Nottingham?
'Catering is grossly over-priced, especially sandwiches/hot drinks. The overall feel is of a monopolized market.'
'Lack of proper union facilities, emphasis on halls to exclusion of postgraduates.'

Used judiciously, direct quotations can bring home to readers the salient issues for respondents - an important aspect of the writing of a research report, which is covered in chapter nine.

Occasionally, an open-ended question can produce an unexpected response which can set the researcher thinking more deeply about the issue. A startling example is the following, from a programme of interviews in Islington, London, in 1968 (Abercrombie et al. 1970):

Do you believe in God?
'Yes.'
Do you believe in a God who can change the course of events on earth?
'No, just the ordinary one.'

This is the only survey question we know of that has given rise to a poem: Donald Davie's 'Ordinary God' (Davie 1988).⁵
Box 6.4 Making the best use of open-ended questions

Use them sparingly
Open-ended questions require more time and effort on the part of the respondent, particularly in self-completion questionnaires. They are also more difficult to code. As Oppenheim warns (1992: 113), open-ended questions 'are often easy to ask, difficult to answer, and still more difficult to analyse'.

Do not begin with them
It is usually desirable to begin with closed questions, so that the respondent is drawn into the study and rapport is established before the more difficult open-ended format is introduced.

Use them to probe the respondents' view of salient issues
In the survey of postgraduates cited above, two open-ended questions were used to tap into students' best and worst experiences of the university.

Allow an appropriate space for the response
As a general guide, we suggest a space equivalent to three or four lines. Any less, and respondents may conclude that their opinions are not really being taken seriously. Any more, and respondents may feel intimidated or annoyed that an unreasonable effort is being required of them.
Tackling the social desirability problem

A major challenge for all overt forms of social research is the social desirability problem. Respondents tend to give socially approved answers to our questions, to over-report their virtuous actions and under-report their vices, and to engage in socially approved behaviour when they know we are observing them.

The problem of social desirability has a number of dimensions. Respondents may be trying to do one or more of the following things:
• being helpful and cooperative to the researcher by giving the answer they think the researcher wants;
• giving answers that appear to show that they are cultivated people, morally decent, and good citizens;
• demonstrating that they are rational by giving answers that are logical and consistent.
Box 6.5 Tactics for dealing with social desirability effects

• Be specific, asking neither about hypothetical behaviour (what would you do if?), nor about regular behaviour (how often do you?), but about a specific time period (what did you do in the last seven days?).
• Ask indirect questions instead of addressing a sensitive issue head-on.
• Avoid leading questions.
• Make clear - for example in a covering letter - that our research is scientific and ethically neutral.
• Consider using self-completion questionnaires that are completely anonymous and that do not involve personal interaction with a researcher.
Questions about respondents' knowledge

Surveys frequently include questions which tap a respondent's knowledge about a given issue - food hygiene, say, or the effects of smoking on health. This is not the same as asking people for their opinions. In a democratic society, a range of opinion is to be expected, but lack of knowledge equates to ignorance, which is socially undesirable. If respondents feel that they are facing some kind of test designed to expose their ignorance, they may be unwilling to participate. In any case, science is always advancing, so we can never be sure we have the complete truth about these questions, and the line between knowledge and opinion is often less clear than we may like to think.

One way of dealing with the potentially intimidating character of knowledge questions is to present them as questions about respondents' opinions. Phrases such as 'in your opinion', 'in your view' and 'from your own experience' may be used to signal this. We can also provide respondents with a 'Don't know' category.
Avoiding overlapping categories

Very often, we ask respondents to indicate where they fall in a particular range. This can help to soften questions that might otherwise be too
sensitive, such as asking about the respondent's age or level of income. Take this example:

Please state your age last birthday:
Under 20    20-30    30-40    40-50    50-60    Over 60

The problem here, obvious once mentioned, is that a respondent aged 30 falls into two categories, namely 20-30 and 30-40. The same problem applies to respondents aged 40 and 50. The categories overlap. The response categories need to be reformulated. For example:

Please state your age last birthday:
Under 20    20-29    30-39    40-49    50-59    60 and over
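The overlap described above can also be checked mechanically before a questionnaire goes out. The following is a minimal sketch, not a procedure from the text: the function name `audit_bands`, the encoding of each category as an inclusive pair of whole-year ages, and the upper limit of 120 are all assumptions made for illustration. It tests every age in turn and reports any that fall into more than one category (an overlap) or into none (a gap).

```python
def audit_bands(bands, lowest=0, highest=120):
    """Return (overlaps, gaps) for a list of inclusive (low, high) age bands.

    Hypothetical helper: 'Under 20' is encoded as (0, 19) and the open-ended
    top category is closed at an assumed maximum age of 120.
    """
    overlaps, gaps = [], []
    for age in range(lowest, highest + 1):
        hits = sum(1 for low, high in bands if low <= age <= high)
        if hits > 1:
            overlaps.append(age)   # age fits two or more categories
        elif hits == 0:
            gaps.append(age)       # age fits no category at all
    return overlaps, gaps


# The flawed categories: Under 20, 20-30, 30-40, 40-50, 50-60, Over 60
FLAWED = [(0, 19), (20, 30), (30, 40), (40, 50), (50, 60), (61, 120)]
# The reformulated categories: Under 20, 20-29, ..., 50-59, 60 and over
REVISED = [(0, 19), (20, 29), (30, 39), (40, 49), (50, 59), (60, 120)]

print(audit_bands(FLAWED))
print(audit_bands(REVISED))
```

For the flawed set the check flags ages 30, 40 and 50, the same cases identified in the text; the reformulated set produces neither overlaps nor gaps.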
Asking about age

Presenting respondents with a set of categories, as above, is one common way in which we can minimize people's sensitivities about age. Instead of asking them exactly how old they are, we ask them to indicate into which age range they fall. For many purposes, this will be all we need. If, however, we need to know respondents' age more precisely, there are some technical problems to overcome.

Consider a child aged 5 years 9 months. How many years old is he? Most respondents will say 5 years, but some, a significant minority, will round the age up to 6. This is a problem when asking about the age of children, but it applies to adults too.

One possibility is to ask respondents to state their age in years and months. This may work reasonably well when asking about young children, though even here there is a small technical problem, in that respondents may round to the nearest month. In any case, we often do not need such precision, and adults typically do not think in these terms about their own age.

Another possibility is to ask for date of birth. This is very precise, but can sound excessively bureaucratic and official. A more common solution to avoid the ambiguity is to ask people for their 'age last birthday'.
Avoiding double-barrelled questions

A double-barrelled (or worse, multiple-barrelled) question is one where more than one question is being asked at the same time. For example: 'Do you own a camcorder or video recorder?' is asking about two separate items. A more subtle example is: 'How often are you in contact with your parents?' - here, two people are involved, and the respondents' relations with them may be very different. One tactic for detecting this problem is to look for the tell-tale words 'and' and 'or', and the use of the slash, as in cinema/theatre.

In general, the problem of double-barrelled questions is more likely to occur in informal interviews than in structured interviews and self-completion questionnaires. The good news is that such questions are also less of a problem in an informal interview, since any difficulty they cause can easily be repaired. Even so, they are better avoided. Consider the following question:

Do you know if your employer has an equal opportunities policy?

If a respondent says 'Yes', we are right to infer that, unless they are being facetious, they mean that the employer does have such a policy. If they are being facetious, their 'Yes' may mean 'Yes I know the answer, but I'm not going to tell you what it is until you ask the question properly'. So much for facetiousness. What, then, if the respondent replies 'No'? Here there is a serious doubt: does the respondent mean he does not know, or is he telling us that his employer does not have an equal opportunities policy?

In conversation, phrases such as 'do you know if?' are used to allow room for people not to know the answer to a question without any implication that they are ignorant and should know. In social research, even in informal interviews, we should find other ways of making it easy for respondents to say that they don't know.
Avoiding negatives, double-negatives and worse

Suppose you are opposed to the policy that students should contribute to their university tuition fees. What is your response to the following question?

Tuition fees should not be abolished
Strongly agree    Agree    Neutral    Disagree    Strongly disagree
Working out your own position on a negative statement such as this can be perplexing. It is a particularly acute problem for people who disagree with the negative; in this example, they do not agree that tuition fees should not
be abolished - a double negative. It is far simpler to present respondents with a positive statement, such as:

Tuition fees should be abolished
Strongly agree    Agree    Neutral    Disagree    Strongly disagree
The main things that go wrong in designing questions, and how to prevent them

Questionnaires that are too long
We should resist the temptation to ask questions out of idle curiosity. Other things being equal, the longer a questionnaire is the lower the response rate will be. The Travel Survey is quite short, with a total of 23 questions for staff and 21 for students - and even here, the maximum number of questions any respondent has to answer is only 19. As well as being as concise as possible, the questionnaire needs to be laid out in such a way that it looks manageable.

The same point applies to interviews, which should not be prolonged unnecessarily. In arranging an interview, it is normal to provide respondents with an estimate of how long it is expected to take. Common examples are: around three-quarters of an hour; no more than an hour; between an hour and an hour and a half. It is necessary to provide such estimates so that respondents can set aside enough time for them.

But do we need to provide a similar estimate for a self-completion questionnaire? Should we say things like: 'this questionnaire will take around ten minutes to complete'? If we do, we need to make sure that our estimate is accurate, or our false reassurance will be counterproductive. In any case, whatever we say, our respondents will judge for themselves whether or not the questionnaire looks worth their time and trouble. On balance, therefore, we think it is normally better to avoid such promises.
Ranking questions that are too complicated
Ranking questions appear to offer an excellent means for gauging the relative salience of items to the individual respondent. It appears very attractive to offer respondents a list of items, asking them to rank them according to their importance. Surely this will yield a rich body of data for analysis?

Suppose we wish to ask a sample of postgraduate students from other countries about their orientation to their studies in Britain. We might decide to ask a question such as the following:

Students tend to have priorities in what they hope to gain from postgraduate study. Judging by what you feel at the moment, please rank the following factors in order of importance to you, putting a 1 next to the factor
which is most important and so on down to 9 for the factor which is least important.

To be able to cultivate a wide range of interests
To experience a different culture
To interact with different kinds of people
To develop intellectually
To acquire knowledge and skills to base your career on
To have a full social life
To make new friendships
To develop your sporting abilities
To develop your language skills

What could possibly go wrong with this? Hard experience suggests that a lot can, and probably will:

• Many respondents will not rank all nine items. Instead, they will rank a few - perhaps three or four - and leave the rest blank.
• Some respondents will want to have tied items, and it is very hard to stop them. For example, they may decide that 'to interact with different kinds of people' and 'to develop intellectually' rank equal second. How will you analyse their response?
• Some respondents will not treat it as a ranking exercise. Instead, they will place an X or a ✓ against the items that matter to them, leaving all the rest blank.
• Some respondents will write in 'all of them'.

There are two ways of dealing with the problems of ranking. One possibility is to simplify the task. In the example above, it would be more straightforward to ask respondents to put a ✓ (a tick or a check mark) against the three items that are most important to them. Even so, they would have a long list of complex items to contend with. As another possibility, we could produce a much shorter list - three items, say - and invite respondents to rank them 1, 2, 3. We recommend that five is the maximum number of items that respondents be asked to rank.

Alternatively, we can change the ranking into a rating. The Survey Unit presented the items as follows:

Students tend to have priorities in what they hope to gain from postgraduate study. Judging by what you feel at the moment, please rate how important the following are to you.
                                                     Very         Fairly       Not
                                                     important    important    important
To be able to cultivate a wide range of interests
To experience a different culture
. . . and so on.
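The ways in which a ranking question goes wrong, as listed above, become concrete at the data-entry stage, when each returned questionnaire has to be sorted into usable and unusable responses. The sketch below is an illustration only: the function `classify_ranking`, its category labels, and the representation of each answer as the text written beside an item are assumptions, not part of the Survey Unit's procedure.

```python
def classify_ranking(answers):
    """Classify one respondent's answers to a ranking question.

    `answers` holds the text written against each item, in list order,
    with an empty string where the respondent wrote nothing.
    """
    n_items = len(answers)
    filled = [a.strip() for a in answers if a.strip()]
    if not filled:
        return "blank"
    if not all(a.isdigit() for a in filled):
        return "ticks or other marks"      # e.g. X or a tick, not a rank
    ranks = [int(a) for a in filled]
    if len(set(ranks)) < len(ranks):
        return "tied ranks"                # the same rank used twice
    if len(ranks) < n_items:
        return "partial ranking"           # only some items ranked
    if sorted(ranks) == list(range(1, n_items + 1)):
        return "complete ranking"
    return "out-of-range ranks"


# Hypothetical responses to the nine-item question
print(classify_ranking(["3", "1", "2", "4", "5", "9", "8", "7", "6"]))
print(classify_ranking(["1", "", "2", "3", "", "", "", "", ""]))
print(classify_ranking(["", "", "2", "2", "1", "", "", "", ""]))
print(classify_ranking(["X", "", "X", "", "", "", "X", "", ""]))
```

Only the first of these four hypothetical responses can be analysed as a full ranking. A rating format of the kind shown above avoids most of these cases, since each item is answered independently of the others.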
Lack of variety
A very common failing in questionnaire design is to adopt the same format for all or most of the responses. Often, this takes the form of a long series of statements to each of which the response categories are: strongly agree - agree - neutral - disagree - strongly disagree. The layout of such questionnaires may be neat and tidy, but they run the risk of being tedious to complete. A bored respondent is seldom a good informant.
Vague questions about frequency of actions
It is very common, in all types of social survey, to gather information about periodical actions. We want to know how often respondents do things. For example, how often do they go to the theatre? We might envisage the following:

Do you go to the theatre?
Often    Sometimes    Rarely    Never
But what does this tell us? Suppose a respondent goes to the theatre roughly once a week. Is that often, or sometimes? If they go once a month, is that often, sometimes or rarely? The problem is, of course, that different respondents will interpret the categories differently, so we shall have only the vaguest idea of the frequency of attendance among our respondents.

Because the response categories are vague, the danger of social desirability effects is particularly acute. Going to the theatre is a relatively high-status activity, suggesting an active interest in the arts and the intellectual life. Over-reporting may be a problem. In the case of more socially dubious activities - going to the dogs, perhaps? - under-reporting is more likely.

One way of dealing with periodical behaviour is to offer more specific categories of response, such as:

How often on average do you go to the theatre?
More than once a week    Once a week    Once a month    Once a year    Never

A difficulty with this is that the response categories, though commonsensical, are not exhaustive. What about someone who goes to the theatre on average every other week - that is, twice a month - or six times a year? We have no category for her, and for others whose periodicity does not fit into our categories.

Another problem with this approach is that it assumes that the behaviour in question is regular, an assumption which may be false. Some people go to the theatre several times during holiday periods, but not at all at other times. We have introduced the phrase 'on average' into the question, to try to deal with this difficulty, but a difficulty it remains.
Perhaps we should tighten up the response categories. Thus, for example:

How often do you go to the theatre?
Never    1-5 times a year    6-10 times a year    11-20 times a year    Over 20 times a year
The gain in precision has been bought at the cost of extreme artificiality.

A wholly different approach is to ask people about their behaviour over a specified time period. We might ask them how often they have been over the last week, or the last month, or the last year. This has the advantage of being specific. There are, however, a number of problems to be tackled if this approach is adopted.

To start with, there is considerable ambiguity in asking about weeks or months or years. Imagine a respondent filling out a questionnaire on Friday 18 November. What will she or he understand by the phrase, 'over the last week'? Does it mean the period since Sunday 13 November (Sunday being the first day of the Christian week)? Does it mean the period since Monday 14 November (Monday being for many people the first day of the working week)? Or does it mean the seven days since Saturday 12 November? In many cases, researchers probably mean seven days - in which case we need to say so.

If we use the phrase 'over the last month', this might mean the period since the beginning of the month, or the last 30/31 days, or, more roughly, the last four weeks.

As for years, 'over the last year' might mean the period since the beginning of the year, or the last 365/366 days. In some situations, it will not be clear whether the year referred to is the calendar year beginning 1 January or some other year, such as the financial year or the academic year.
In the case of 'over the last twelve months', the fact that we might be a few days short of a full year is unlikely to matter - the period is long enough for it to be a trivial issue.

In contrast, asking people what they did 'yesterday' is not ambiguous. It minimizes the problems of memory recall. The longer the time period the greater the opportunity for memory to be coloured by self-image. A short time period therefore helps to combat social desirability effects.

One potential problem with asking about behaviour 'yesterday' is that it may have been an unusual day. A respondent who, say, has two glasses of wine every day may not have had a drink on that particular day for some special and not often to be repeated reason. In some cases this will not matter. If we have a large sample of respondents, and if we are interested in aggregate data rather than in individuals, these variations will be very minor and will probably be cancelled out (another respondent will, unusually for her, have drunk two glasses of wine on a special occasion).

What will matter is if the time period is exceptional for a significant number of respondents. If we interview people on 2 January about their
eating and drinking over the last seven days, we have chosen a period which in many societies is a major feast, and not typical of the rest of the year. Unless over-indulgence at Christmas and New Year is the object of our research, we should choose another period. This is an obvious example, but there are many others where we need to be careful: holidays, the holy days of faiths other than our own, and the beginning and end of cycles such as academic terms. Selecting a sensible and meaningful time frame is not impossible, but requires some thought and often a little research.

The most effective way of asking about periodical actions will vary from case to case. Given that point, Box 6.6 lists some general guidelines, all based on the need to be as specific and unambiguous as possible.
Box 6.6 Asking about periodical actions

• Avoid 'often - sometimes - occasionally - never' and variants on the theme. Such terms are vague, and mean different things to different people.
• Do not ask about 'the last week', ask about 'the last seven days'.
• If asking about a year, be clear what period is meant - for example, 'since 1 January', 'since the start of the academic year', 'over the last twelve months'.
• Keep the time period as short as you sensibly can, to minimize problems of memory recall and social desirability effects.
• Make sure the time period is meaningful, and sensibly matches the periodicity of the behaviour in question.
• Make sure the time frame is not an unusual one - unless that is the point of the research.
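The ambiguity of 'the last week' can be made concrete with a short calculation. The sketch below is illustrative only: the function name `readings_of_last_week` is hypothetical, and because the text gives the date as Friday 18 November without a year, the year 1994 (in which 18 November fell on a Friday) is an assumption chosen to match it. The code works out where the reference period begins under each of the three readings discussed in this section.

```python
from datetime import date, timedelta


def readings_of_last_week(today):
    """Return the start date implied by three readings of 'over the last week'."""
    days_since_monday = today.weekday()             # Monday counts as 0
    days_since_sunday = (today.weekday() + 1) % 7   # Sunday counts as 0
    return {
        "since Sunday": today - timedelta(days=days_since_sunday),
        "since Monday": today - timedelta(days=days_since_monday),
        # seven days counting the day of completion itself
        "last seven days": today - timedelta(days=6),
    }


# Assumed year: the text specifies only 'Friday 18 November'.
completed_on = date(1994, 11, 18)
starts = readings_of_last_week(completed_on)
for label, start in starts.items():
    print(f"{label}: from {start:%A %d %B}")
```

The three readings give three different starting points for the same respondent on the same day, which is why the guideline is to name the period explicitly as 'the last seven days'.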
Lack of clarity about confidentiality and anonymity
If we tell respondents that our questionnaire is anonymous, it means that we have no way of identifying which questionnaire belongs to which respondent. This is a strong reassurance, and obviously impossible in interview situations. Even in a self-completion questionnaire, anonymity can be problematic, as discussed on page 23. For example, consider a survey of university staff that asks respondents to state their sex, their rank, and their academic department. Clearly, the researchers could more confidently guarantee anonymity to a male lecturer in a large engineering department than they could to a female professor in a small department of economics. Since anonymity is an absolute categorical guarantee, we need to be sure we can genuinely deliver it.
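One rough way of testing whether an anonymity guarantee can be delivered is to count how many respondents share each combination of identifying answers. The sketch below is an illustration under stated assumptions: the function `small_cells`, the threshold of five respondents per combination, and the staff records are all hypothetical and are not taken from any survey described here.

```python
from collections import Counter


def small_cells(respondents, keys=("sex", "rank", "department"), k=5):
    """List the combinations of answers shared by fewer than k respondents.

    The threshold k=5 is an assumed rule of thumb, not a fixed standard.
    """
    counts = Counter(tuple(r[key] for key in keys) for r in respondents)
    return sorted(cell for cell, n in counts.items() if n < k)


# Hypothetical returns: six male lecturers in a large engineering
# department and one female professor in a small economics department.
staff = [
    {"sex": "male", "rank": "lecturer", "department": "engineering"}
    for _ in range(6)
]
staff.append({"sex": "female", "rank": "professor", "department": "economics"})

print(small_cells(staff))
```

Any combination the check reports identifies respondents who could in principle be recognized from their answers alone. In such a case the options are to drop or coarsen one of the questions, or to promise confidentiality rather than anonymity.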
If we offer our respondents confidentiality, we need to be clear what is involved. The researchers, after all, know who has said what. Confidentiality means that we will not disclose this information to anyone else. Guarantees of confidentiality typically involve the following:

• the use of pseudonyms to disguise the names of respondents, places and organizations;
• changing minor and irrelevant details in order to disguise these names;
• keeping the data securely;
• not allowing access to the data to anyone outside the research team;
• destroying the data at the end of the project, or anonymising it and placing it in an archive.
The most frequently raised problems, and our answers

In our experience, there are a number of issues that repeatedly trouble people when designing questions. The issues most commonly raised with us, and our responses to them, are these.
Should I include a middle category?
For example, in asking respondents about their level of agreement or disagreement with a statement, which of the following Likert scales is preferable?

Strongly agree    Agree    Disagree    Strongly disagree

Strongly agree    Agree    Neutral    Disagree    Strongly disagree
Some researchers worry that if they include the middle category it will be too attractive. Respondents will simply duck the question and take the easy way out. Therefore, so the argument runs, it is better to force respondents into giving either a positive or a negative answer.

Against this is the point that respondents may legitimately be neutral. Forcing them into either the pro- or the anti-camp is artificial, and can be extremely annoying to people who are genuinely neutral on an issue.

Another version of this problem arises in the use of semantic differential scales, where respondents are asked to rate their views on a bipolar numerical scale, with opposing adjectives at each pole. Here is an example, taken from the Survey Unit's omnibus questionnaires. Students are presented with a series of terms describing the university, and asked to put a ring round the number which comes closest to their own view, thus:

lively      1  2  3  4  5    dull
friendly    1  2  3  4  5    unfriendly
When using such scales, we recommend having either five or seven ratings - more is too complicated and adds very little. With an odd number of response categories there is a middle position - here, response category 3. But if we presented respondents with an even number of categories, thus:

lively      1  2  3  4  5  6    dull
friendly    1  2  3  4  5  6    unfriendly

we would deny them the middle option. This way of approaching the problem is a little less obvious than the earlier example, but even so it is artificial and potentially annoying.

In short, we recommend that a middle category be provided unless there are compelling reasons for not doing so. If questionnaires are well designed, respondents will not give a neutral answer merely because they are bored or intimidated. If respondents are neutral or indifferent to an item, that is a worthwhile finding.

Should I include a 'don't know' category?
This is a similar problem with a similar answer. Whenever respondents can sensibly be thought not to know about an item, or to be uncertain about it, we should allow them to express their doubt or uncertainty. Suppressing the possibility of legitimate 'don't knows' and 'uncertains' merely distorts the social reality, and may be very off-putting to respondents. If a substantial percentage of our respondents say they don't know about an item, that is not a problem but a finding.
Should it be sex, or gender?

Among the basic information we gather in a survey, we usually want to know whether our respondents are male or female. In an interview we do not need to ask, and it would be absurd to do so. But what about a self-completion questionnaire? Should we label this variable sex, or gender?

This is an extremely complex issue. At first sight, the answer is clear-cut: it should be sex. Until recently, sex was invariably the label used on questionnaires. The distinction drawn by sociologists is between sex as biologically given (male and female), and gender as socially constructed (masculine and feminine). But this raises a host of theoretical, philosophical and ideological issues. Sociologists have become increasingly concerned about the implication of biological determinism that is frequently read into the term 'sex'. The same holds true of 'race', a term which is now usually placed inside inverted commas to show that we repudiate all bogus theories of race and racial superiority, recognizing instead our common humanity. Nowadays we rightly ask about ethnicity, not 'race'. In a similar fashion, rightly or wrongly, 'sex' is coming to be a suspect term as far as questionnaires are concerned. Some people prefer 'gender', but others dismiss this as misplaced 'political correctness'.

The answer is definitely not to duck the question completely. So much of social life is structured by gendered inequalities between the sexes that to fail to record the sex of our respondents is to capitulate to ignorance. Fortunately, we can deal with the problem by formulating our question as follows:

Are you:   Female □   Male □
Questionnaire layout

As well as framing individual questions that are as accurate, searching and sensitive as we can make them, we also need to ensure that the overall layout of a questionnaire (and the structure of an interview schedule) is clear, coherent and sensitive. A few simple guidelines should help. In Box 6.7, we comment on how we applied these in the Travel Survey.

Introduction

A few sentences briefly introducing the questionnaire, including any overall guidance on its completion, are a must. It does not matter if they repeat some of the matters dealt with in the covering letter.

Instructions

As well as any overall guidance on the completion of the questionnaire, we need to make clear to respondents how individual questions are to be answered. In the Travel Survey, you will see that we have given instructions such as 'Please ✓ one box' or 'Please ✓ all that apply'.
Sections

It is often helpful to respondents, particularly if the questionnaire is fairly long, to divide it into sections, each with a brief introduction to set the scene. Our omnibus survey of postgraduate international students contained the following sections:

• Section A - Before you came
  In this first section we are interested in your reasons for choosing the University of Nottingham and your particular postgraduate research/course.
• Section B - Now you are here
  In this section we are interested in your opinions and experiences of the University as a postgraduate.
• Section C - Use of the Internet
  Increasing numbers of students are now using the Internet and the University of Nottingham's Web site for information. In the next few questions we want to ask you about your use of this technology.
• Section D - Research students only
  This section is for students registered for MPhil and PhD degrees. If you are studying for any other postgraduate qualifications please go directly to section E.
• Section E - Background details
  In this section we ask a few questions about yourself and your degree course/research.
Use of columns

If possible, it is a good idea to divide each sheet of the questionnaire into two columns. This uses the space efficiently: it cuts down on unnecessary blank space, and it prevents questions from straggling across the page. Questions should be numbered vertically down the columns, as in the Travel Survey:

like this:

1   4
2   5
3   6

not like this:

1   2
3   4
5   6
Question numbering

Numbering the questions is essential in order to avoid confusion. Some writers suggest that questions can have subletters (3a, 3b, 3c and so on). Their main reason for this is that it makes the total number of questions seem less than it is. We believe, however, that it is possibly confusing and tends to look fussy, so we recommend that questions are numbered 1, 2, 3 and so on without any sublettering. Where an overall question has a number of particular examples - as in questions 13 and 14 on the Travel Survey - there is no need for separate numbering; they can all be presented under the one question.
Sequence of questions

If possible, we begin with relatively straightforward questions that will be easy to answer. These will 'break the ice', building up the respondent's confidence in the survey. More complex and subtle questions are introduced later. If we wish to ask questions about personal matters such as age, sex, ethnicity, income, and marital status, these are normally placed at the end, by which time we hope to have gained the respondent's full confidence. An alternative is to put these questions at the very beginning, which is why they are sometimes known as 'face-sheet data'. Although bureaucratically neat, in that it 'gets them out of the way', we believe that starting this way is potentially off-putting, and we do not recommend it.
Question skips

It is often impossible to devise a questionnaire in such a way that all the questions are appropriate for everyone to answer. We may well need to have filter questions: depending on your answer to a filter question, you either go on to the next question or skip to a later one. Too many question skips can be extremely confusing, and can make a questionnaire look cluttered. We therefore try to keep the number of filter questions to the absolute minimum, and to be as clear as possible about where we expect respondents to skip to. This is one of the reasons why questions should be numbered.

Filter questions often take the following form. Question 3 asks respondents if they play a musical instrument. If yes, they go on to answer questions 4, 5 and 6, which ask which instruments they play, how often they do so, and how much they enjoy it. If the answer to question 3 is no, then clearly questions 4 to 6 are irrelevant, so we need to instruct respondents: 'If no, please go to question 7'. Too much of this can be confusing, and sometimes rather annoying: if respondents have to keep skipping questions, it may seem as if the questionnaire is not really designed for them at all.

One way to minimize the confusion and potential annoyance is to include the skip within the question. Here is an example from the omnibus survey of international postgraduates:

Have you received any information about postgraduate training courses since you began your degree?
Yes □   No □   Can't remember □

If yes, who was offering to provide the courses advertised?
Training body outside the University   □
The University Graduate School         □
Faculty/Department                     □
Unsure who the provider was            □
Other (please specify the provider)    □
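The filter-question pattern described above (question 3 sending some respondents straight to question 7) can also be expressed as a small routing table, for instance when a questionnaire is administered on screen. The following is only a minimal sketch under stated assumptions: the names `SKIP_RULES`, `next_question` and `route` are hypothetical and do not come from any survey package, and the eight-question length is invented for the illustration.

```python
# Hypothetical routing table: (question number, answer) -> question to go to next.
# Question 3 is the filter question from the text: "Do you play a musical instrument?"
SKIP_RULES = {
    (3, "no"): 7,  # 'If no, please go to question 7'
}


def next_question(current, answer):
    """Return the number of the question the respondent should answer next."""
    # Follow a skip instruction if one exists; otherwise simply move on by one.
    return SKIP_RULES.get((current, answer), current + 1)


def route(answers, first=1, last=8):
    """List the questions a respondent sees, given their answers by question number."""
    path = []
    question = first
    while question <= last:
        path.append(question)
        question = next_question(question, answers.get(question))
    return path


# A respondent who plays no instrument never sees questions 4 to 6.
print(route({3: "no"}))
# A respondent who does play one works through every question.
print(route({3: "yes"}))
```

Keeping all the skips in one table makes it easy to count them: if the table grows beyond one or two entries, that may be a warning that the questionnaire is trying to serve too many different groups.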
Another possibility is to require respondents to skip to a whole new section, as a way of eliminating confusion.

If a questionnaire has more than one or two question skips, it may well be a sign that something is wrong. Probably we are trying to survey too many different groups of people about too many different things. Perhaps we can cut out some of the questions? If not, then perhaps we can send different questionnaires to different groups? For example, the Survey Unit devised two versions of the Travel Survey questionnaire, one for students and one for staff. The main questions were exactly the same; all that differed were questions about the respondents' work.

Sending different questionnaires to different categories of respondent can obviously only be done if we can identify in advance the category to which a respondent belongs. What if we cannot do so? One possibility is to abandon the idea that we can conduct the research by self-completion questionnaire, which is simply too inflexible an instrument for our purposes. We should consider interviewing respondents. One advantage of the interview format is that it is the interviewer, not the respondent, who has to do the skipping.
Conclusion

Here we thank our respondents for their cooperation. We may invite them to offer any further comments at the end of the questionnaire or on an additional sheet. We must also remember to let them know or remind them how to return the questionnaire to us. For example:

Thank you for taking the time to complete this questionnaire. If you would like to make any further comments please attach a piece of paper. Please return the questionnaire in the FREEPOST envelope provided either in the internal or external mail.
Box 6.7 The layout of the Travel Survey

The Travel Survey illustrates these principles in action. Each survey has its own particular difficulties to be overcome. We have already mentioned, on this page, the problem of designing a questionnaire suitable for both students and staff. In the end, a separate questionnaire was sent to each.

The sequence of questions was straightforward. We began with factual questions about respondents' journey to work, such as the distance from home to work, the time taken and the means of transport used. We then moved to questions which ask respondents for suggestions about how facilities could be improved. These questions require a little more reflection, but they may also appeal to respondents since they give them an opportunity to have an influence on improving the University's provision. Finally, we ask a series of more personal questions. We hope that, having completed the earlier parts of our questionnaire, respondents will have confidence in us and our research. We offer the guarantee that 'under no circumstances will attempts be made to identify individuals'. Even though the questionnaire is anonymous this reassurance is still necessary, particularly so for people in a minority - a woman technician, for example.

If the sequence of questions was easy to decide, a far more troublesome issue was the sheer complexity of many people's journeys to work. People do not necessarily use the same means of transport every day of the week or every week of the year. Some variations follow a regular pattern, others are unpredictable. A car driver may use the bus on Fridays, when her partner has the car to visit his parents. A pedestrian or cyclist may take the bus or a taxi if it is raining heavily. A member of staff may use the car during school term time, in order to drop children off at school; outside school term, the parent may cycle to work. Some people have long and complicated journeys to work: they drive or walk to the railway station in Derby, depending on the weather, catch the train to Nottingham (or Beeston, if that particular train stops there), and then take a bus or a taxi to the university depending on the time and the state of their finances. The Survey Unit's questions had to be sensitive to all these possibilities, while keeping the questionnaire short and straightforward. Hence, for example, question 3: What mode of transport do you use most often for the longest stage of your journey to campus?

We also faced the problem of question skips. We wanted to ask people for their suggestions about how facilities could be improved. However, we did not want to ask car drivers to speculate about the needs of cyclists, or pedestrians to address what they guessed might be the concerns of people using public transport. Instead, we wished to ask people about problems with which they were themselves familiar. Hence the format of questions 8 to 14. We used a combination of visual and verbal signals (the instructions, the shading, and enclosing questions 8 to 14 within a border) to indicate who should answer which questions. The outcome, which appeared to be successful, is deceptively simple - but it took time to get it right.
Designing interview schedules

So far, we have concentrated on the design of self-completion questionnaires. What about interview schedules?

Essentially, the same principles apply. Even though the interview has the advantage that the researcher can explain any unexpected difficulties and try to smooth over any sensitive items, this is no excuse for poor design. As much thought needs to go into an interview schedule as into a self-completion questionnaire. This is true even when the interview is informal and unstructured. The interviewer needs to become very familiar with the interview schedule or guide, so that the interview can proceed smoothly without the distraction of the interviewer fumbling for the next item.

The principles governing the sequence of items are the same as for self-completion questionnaires. Questions about personal details are usually held back until the end.

As with questionnaires, it is helpful to indicate to interviewees any significant changes of topic within the interview. Here is the way in which Saunders (1990) introduced the various sections of his interviews with home owners in the UK:

I would like to begin by asking a few questions about your past and present housing.
I'm interested in getting some idea of how you spend your spare time.
I would now like to ask a few questions about your household's income and outgoings.
Finally, returning to the theme of your house and home . . .

Among the specific issues arising in the planning and execution of interviews, and not covered by our discussion of self-completion questionnaires, the most important are: using probes; using show cards, including prompts; recording the responses; and responding to interviewees' queries. We deal with each of these in turn.
Probes

Probes may be classified into two types: probes seeking more detailed factual information, and probes designed to encourage respondents to elaborate on their opinions or accounts of their own experience.

In a structured interview, we may need to probe respondents for fuller or more detailed information. In order to decide on what probes to use, we need to know exactly what information we require.

Unless an interview is entirely structured, there are likely to be occasions when we want to draw our interviewees out, asking them not merely for more information but also to expand on their thoughts, feelings and experiences. We do not, however, wish our interview to seem like an interrogation or inquisition. Box 6.8 gives a list of ways in which we can probe for a fuller response. We list them in order of intrusiveness, with the least intrusive first.
Box 6.8 Some sample probes for eliciting a fuller response

1 An expectant pause.
2 An encouraging sound: 'mm hmm', 'uh-huh'.
3 Repeating part or all of the interviewee's reply: 'So, you switched to sociology after your first year at university?'
4 Summarizing their response: 'So, your reason for switching to sociology was that you were aiming for a career in market research?'
5 Asking for an example: 'Can you give me an example of the problems you had with economics?'
6 Asking for clarification: 'I'm not quite sure I've understood why you were unhappy with economics. Could you tell me a little more?'
It is desirable to make this kind of probe as unthreatening as possible. An expectant pause is often enough. Silences, if they appear in danger of becoming embarrassing, can be filled with 'mm hmm's and 'uh-huh's, perhaps nodding the head to indicate encouragement. Repeating part or all of what the interviewee has said is often very effective in moving the interviewee to elaborate their earlier response. These ways of probing are typically more effective than bluntly asking for 'more details'.

Although it is tempting to probe by asking, 'Is there anything more you would like to say?' or 'Is there anything you would like to add?', these are as much ways of bringing a topic to an end as they are of opening it up. They invite responses such as, 'No, that's about it', or 'I can't think of anything else, no'.
Show cards

We often want to ask a series of questions which have the same response categories. One way of handling this would be to ask: 'Would you say that you are very satisfied, satisfied, neutral, dissatisfied or very dissatisfied with the following?', followed by reading out each item in turn. This can be awkward, because it relies on the respondent's remembering what the response categories were. The longer the list, the greater the problem is likely to be. Respondents may say things like: 'I'm not very satisfied, no.' The problem will be that we do not know whether to record this as 'dissatisfied' or 'very dissatisfied', so we will have to ask, 'Does that mean that you are dissatisfied or very dissatisfied?' The respondent may feel that he is being corrected in some way for having failed to remember what the appropriate response categories were.

This difficulty is avoided by having the response categories written out on a show card, which we hand to the respondent. In this example, the response card is acting as a prompt or reminder to the respondent.

As well as acting as prompts, show cards can be used to present lists of items to respondents, for example asking respondents to indicate which items they possess from a list of consumer goods. Similarly, a list of age bands or income brackets is typically given to respondents on a show card.

Show cards are also used to present material to respondents for comment. Vignettes are typically presented in this way. Interesting examples of vignettes may be found in Finch and Mason (1993) Negotiating Family Responsibilities.

Responding to interviewees' queries

In an interview, it is not uncommon for respondents to raise queries and questions. If these are points of clarification, or reassurance about confidentiality, we should be in a position to reply to them openly and straightforwardly.

In some cases, particularly in less structured interviews, respondents may ask questions about the interviewers' own beliefs and experiences. For example, in his interviews with clergy Aldridge was asked about his own religious beliefs or lack of them. His response was to say that he would be very happy to talk about those issues at the end of the interview, but would like for the moment to concentrate on the interviewee's own beliefs and experiences. Respondents were invariably happy to proceed in that way.
Recording the responses

In structured and semi-structured interviews, the researcher will typically be recording the responses as she goes along. Just as self-completion questionnaires have to be easy for the respondent to complete, so interview schedules have to be clear and straightforward for the interviewer. Things will be made easier if there is a clear visual distinction between the questions to be asked of the respondent, and instructions to the interviewer about question skips and probes. Very often, the instructions to the interviewer are printed in bold capital letters.

Unstructured interviews will usually be tape recorded, with the interviewees' agreement, whereas there is no point in taping a fully structured interview. What about semi-structured interviews? If there is a large number of open-ended questions, it may be worth recording the interview and transcribing the relevant parts of it. Full transcription is time-consuming: even with a transcription machine, for every hour's worth of recording you should allow five hours for transcription.
Setting up for coding

When designing a questionnaire or schedule, we need to look ahead to the stage at which we will be analysing the data. All but the simplest surveys will call for inputting the data into a computer.
In self-completion questionnaires and structured interview schedules, all or most of the items will be pre-coded questions. We decide in advance what all the categories of response will be. In order to simplify and speed up the process of entering data into the computer, it is helpful to have numbers, in clear but unobtrusive typeface, by the side of each of the response categories. The Travel Survey shows how this is done. Question 3, for example, looks like this:

3 Which mode of transport do you use most often for the longest stage of your journey to work? Please ✓ all that apply

Walk                     □ 1
Bicycle                  □ 2
Rail                     □ 3
Bus                      □ 4
Car as driver            □ 5
Car as passenger         □ 6
Motorbike as driver      □ 7
Motorbike as passenger   □ 8
Even where respondents are presented with an open-ended question, we sometimes precode the responses. This is only possible where we have a reasonably clear idea of what the responses are likely to be. Where we have little idea, or where the range of potential answers is unmanageably large, then precoding is not possible. The open-ended questions on the Travel Survey (questions 8, 9, 12, and the last part of 13) were not precoded.

With open-ended questions, the problem of coding can be acute. Respondents express themselves in their own unique way, and we have to classify their responses under a predetermined heading. Where more than one person is involved in coding, consistency is even more difficult to achieve. Consistency is a problem, but the need to use our imagination and insight to interpret respondents' answers is not a problem but essential to the sociological imagination.
• ... you have designed ... or taking part in the interview whose schedule ...
• Are there any questions you are asking merely ...
Further reading

Oppenheim (1992) Questionnaire Design, Interviewing and Attitude Measurement (new edition) is a clear and thorough guide. Devine and Heath (1999) Sociological Research Methods in Context examines eight major sociological studies, six of which used a survey as one part of their research strategy.
7 Processing responses
Key elements in this chapter
• Manual and autom...
• Formats for data fi...
Introduction

The information collected from respondents will normally need to be translated into a digital format in preparation for subsequent analysis by computer, a process that can be termed response processing. For large samples and lengthy questionnaires, this translation will involve an extensive amount of routine work, but it also requires some decisions to be taken that will shape the way data analysis can be conducted. In more detail, response processing entails the following elements:
1 The selection of a format for the digital data file within which cases and responses can be recorded, checked, analysed and, if necessary, transformed. The chosen format determines the framework within which elements 2 and 4 will be carried out.
2 The construction of a codebook - a paper or computerized list identifying and labelling the set of variables that the researcher decides to derive from the questions and the responses, together with a sequence of (normally) numeric codes and textual labels that represents all the possible types of response and non-response for each variable.
3 In conjunction with the format of specific questions, the codes chosen in element 2 will determine the level of measurement of each variable: this consideration has major implications for data analysis and is discussed on page 129.
4 Coding - the selection of an appropriate code from the codebook for each case/question and its entry (keyboarding, data input) onto the computer data file: although it is better reserved for this specific process, coding is sometimes used as an equivalent to response processing as a whole.
5 Checking and cleaning the data file.

Some of the methods that can be adopted to handle the response-processing phase deal with particular elements in this list automatically, but it is important for the researcher to appreciate what is happening 'behind the scenes' to prevent unwanted options being adopted by default.
The following sections cover each of the five listed items, starting with data input because the scale of this labour-intensive task in medium and large projects can overshadow the other elements in response processing.
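To make the idea of a codebook more concrete, here is a minimal sketch of how one variable might be held in computerized form. The response categories are those of question 3 of the Travel Survey; everything else (the variable name `q3_mode`, the numeric codes 1 to 8, the missing-value code 99 and the helper functions) is an assumption made for illustration, not the Survey Unit's actual scheme.

```python
# A hypothetical codebook entry for one variable. The response categories come
# from question 3 of the Travel Survey; the variable name, the numeric codes and
# the missing-value code are illustrative assumptions.
CODEBOOK = {
    "q3_mode": {
        "label": "Mode of transport used most often for longest stage of journey",
        "level": "nominal",  # codes are labels only; their order carries no meaning
        "codes": {
            1: "Walk",
            2: "Bicycle",
            3: "Rail",
            4: "Bus",
            5: "Car as driver",
            6: "Car as passenger",
            7: "Motorbike as driver",
            8: "Motorbike as passenger",
        },
        "missing": {99: "No answer"},
    },
}


def code_response(variable, answer):
    """Coding: look up the numeric code for a respondent's answer."""
    entry = CODEBOOK[variable]
    for code, label in entry["codes"].items():
        if label.lower() == answer.strip().lower():
            return code
    # Anything not matched is recorded with the missing-value code.
    return next(iter(entry["missing"]))


def decode(variable, code):
    """Turn a stored code back into its textual label."""
    entry = CODEBOOK[variable]
    return entry["codes"].get(code) or entry["missing"].get(code, "Undefined code")


print(code_response("q3_mode", "Bus"))
print(decode("q3_mode", 5))
print(code_response("q3_mode", ""))
```

The sketch keeps the variable label, the level of measurement, the response codes and the non-response code together in one place, which mirrors elements 2 and 3 of the list above; the `code_response` function corresponds to element 4.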
Manual, semi-automated and automated data input

The use of a computer to assist response processing and subsequent phases has been presumed principally because manual data analysis is slow, inflexible, limited to basics and error-prone. Very few surveys are now analysed by hand (the exceptions are mainly informal, 'in-house' exercises based on handfuls of respondents and pilot projects where the substantive results may be irrelevant). Computer packages in general are discussed in Chapter 3 but here it is worth noting the variety of possibilities for computer-assisted response processing, starting with the highest levels of automation.
Electronic surveys

Email surveys, web surveys, and surveys conducted using some computer packages (for example, KeyPoint) allow respondents to generate and complete an electronic questionnaire or form on a computer screen. The codebook, codes and levels of measurement are all determined at the point the form is designed so that when a case is returned via email or disk, the responses are already in a digital format ready for analysis. A major attraction of this option to researchers is that the labour of data input is performed by the respondent! A downside is that all respondents require computer access.
Optical Mark Readers (OMR)

An intermediate degree of automation of data entry is possible through the use of OMR equipment. This technique employs custom-printed paper questionnaires in which pen marks made by respondents in pre-designated areas are detected in a scan of the form. The responses that are detected are electronically compiled into a data file. OMR comes in two variants: very expensive dedicated machines that are designed for heavy duty and high throughput (these will be beyond the budgets of most one-off projects), and software applications (such as Remark OMR, published by Principia Products) that run on an ordinary desktop computer and read the input from a conventional flatbed scanner linked to the PC. In both cases, the codebook, codes and levels of measurement are all determined at the point the questionnaire is designed.

A relatively simple cost-benefit calculation should indicate whether OMR is an attractive option. In addition to the PC, scanner and software package, the costs need to include printing the custom questionnaires, whose layout is constrained by the requirements of the software. In addition, there are some less obvious costs and limitations. Forms need to be fed into the scanner by hand (time-consuming) or mechanically (an additional expense); open-ended textual responses cannot be dealt with via OMR and have to be filtered out to a different technology; there is always a finite error rate in the scanning process caused by ambiguous respondent marks which have to be resolved by human inspection; the process of scanning OMR forms generally monopolizes the capacity of a desktop computer which cannot be used simultaneously for other purposes.
Specialized data entry software

Computer-assisted telephone interviewing (CATI) and computer-assisted personal interviewing (CAPI) software in which the appropriate questions and prompts appear on an interviewer's screen, dedicated data entry programmes (such as SPSS Data Entry II) and many general survey packages offer facilities designed to accompany and assist the manual entry of data via a computer keyboard. The functions available include:

• Data-entry screens: Some software (particularly the dedicated packages) allows the design of customized computer screen displays for data entry (a
choice may be offered between ticking boxes on an on-screen facsimile of the questionnaire/interview schedule, or entry of values into the cells of a spreadsheet where each row represents a case and each column a variable). Supplementary information to assist coding decisions can be supplied to inputters via windows or boxes. Such devices are particularly useful where data input is sub-contracted to hired hands.
• Constrained entry fields: Software can prevent too few or too many characters being entered for a specific response.
• Double entry: To ensure input accuracy, each data item is typed twice with the software testing for consistency.
• Routing: When a response to a question with a filter takes a branch that skips ahead, the software can ensure that the point of entry for inputting the next response automatically jumps ahead also, minimizing input errors.
• Bounds checking: Numerical limits can be set to ensure that impossible values are not entered as a result of keyboard errors (for instance, in a survey with a target population of employees, attempts to enter a respondent age outside the range 16-65 inclusive could be prohibited).
• Consistency tests: These ensure that empirically unlikely or inconsistent combinations of characteristics within a case are automatically detected and highlighted (for example, a household which appears to own 20 cars or an individual whose responses to different questions suggest a current status of both employed and unemployed). Such cases could be the result of interviewer mistakes or respondents misunderstanding questions.
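The bounds checking, consistency tests and double entry described in the list above can be illustrated with a minimal Python sketch. It is not taken from any of the packages named; the variable names (age, cars, employed, unemployed) and the threshold of 20 cars are assumptions made purely for illustration.

```python
def check_case(case):
    """Return a list of problems found in one keyed-in case (a dict)."""
    problems = []
    # Bounds check: ages outside 16-65 inclusive are treated as keying errors.
    if not 16 <= case["age"] <= 65:
        problems.append("age out of range")
    # Consistency tests: flag unlikely or contradictory combinations.
    if case["cars"] >= 20:  # assumed threshold, for illustration only
        problems.append("implausible number of cars")
    if case["employed"] == 1 and case["unemployed"] == 1:
        problems.append("both employed and unemployed")
    return problems


def double_entry_mismatches(first_pass, second_pass):
    """Compare two independent keyings of a case; return the variables that differ."""
    return [name for name in first_pass if first_pass[name] != second_pass.get(name)]


first = {"age": 34, "cars": 2, "employed": 1, "unemployed": 0}
second = {"age": 43, "cars": 2, "employed": 1, "unemployed": 0}  # digits transposed

suspect = {"age": 70, "cars": 1, "employed": 1, "unemployed": 1}
print(check_case(suspect))
print(double_entry_mismatches(first, second))
```

A real data entry package would normally refuse the entry at the keyboard; here the checks simply report problems so that a person can resolve them.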
Standard office application software

If dedicated software is unavailable, a standard office spreadsheet or database application will provide a more than adequate manual data entry environment that may offer several of the facilities mentioned above. Failing these, a word processor or text editing application will suffice, but some manual way of checking the integrity of the data entered (or a sample of it) should be considered.
Data file formats and data types

If an integrated survey software package has been selected for use in a project, this choice will probably have determined the file format in which the input data is held for analysis and presentation purposes. Such packages tend to employ their own proprietary file formats that are very unlikely to be directly compatible with any of the others. However, many packages allow a range of additional formats to be recognized and imported semi-automatically (and also, to a lesser degree, to be exported).
If data input has to take place before a computer analysis package has been earmarked, the safest strategy is to input the data using a widely recognized file format such as CSV (comma separated value) or TSV (tab separated value). These are very simple text-only files containing a series of data items (the numeric value or series of alphanumeric characters representing each response) separated by a comma or tab character acting as a delimiter (or 'punctuation') between each item. A carriage return character punctuates each respondent's data (each case). A CSV file will consist entirely of printable characters (including spaces) plus carriage returns, and a TSV file of printable characters plus tabs and carriage returns. Such files can be created easily in word processors and spreadsheets (exporting in text only mode) and text editors. A useful option in CSV files is to include as the first 'row' of the file (up to the first carriage return) a list of the names (in quotation marks and separated by commas) of each variable in turn that the data items represent. Any application capable of importing the data file should automatically recognize these as variable names and display them appropriately.

Survey computer packages are capable of recognizing a finite range of data types (that is, variable values). It is obviously crucial to ascertain that the package you select can handle all the types that are relevant to the project in hand. Nearly all packages can handle integers (whole numbers), floating-point numbers (decimals), strings (or alphanumerics, with values made up of letters, numbers and spaces) and dates. Some have additional data types for currency and freetext (continuous prose). Packages also differ over matters such as limits on the total number of variables, the maximum and minimum permitted values of integer and floating-point variables, the maximum number of decimal places that can be processed, the maximum length of strings, and the characters that can be included within strings (such as punctuation marks and accented characters). A crucial consideration if a questionnaire contains open-ended questions is whether a package can handle lengthy comments. Text handling in several packages is restricted to string variables with a maximum of 255 characters. If this is the case, lengthy responses will either need to be split between several variables, or open-ended comments may have to be directed to an alternative package for analysis.
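As a rough illustration of the CSV layout described above, the following Python sketch writes a small file with the variable names in the first row and then reads it back. It uses Python's standard csv module rather than any survey package, and the variable names and cases are invented for the example.

```python
import csv
import io

# Hypothetical variable names and cases, invented for illustration.
variable_names = ["case_id", "age", "fare", "comment"]
cases = [
    [1, 34, 3.50, "Bus is often late, but cheap"],
    [2, 51, 1.75, ""],
]

# Write: the first row holds the quoted variable names, then one row per case.
buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC)
writer.writerow(variable_names)
writer.writerows(cases)
csv_text = buffer.getvalue()
print(csv_text)

# Read back: DictReader treats the first row as the variable names.
# A CSV file stores only text, so each value is converted to its data type.
reader = csv.DictReader(io.StringIO(csv_text))
data = [
    {
        "case_id": int(row["case_id"]),
        "age": int(row["age"]),
        "fare": float(row["fare"]),
        "comment": row["comment"],
    }
    for row in reader
]
print(data[0])
```

Quoting the string values matters here because the first comment itself contains a comma, which would otherwise be read as a delimiter.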
Constructing the codebook

Before packages for the survey process became available for desktop computers, a codebook listing the sequence of variables with their code assignments had to be written out by hand. Each software package now has its own procedures for eliciting the codebook information which enables it to find every variable in the data file, store the values internally and display the
data in a helpful format on screen (providing the names of the variables in a CSV file, as described above, is one way of communicating a vital part of the codebook). Here is a list of items of information that a package might require to construct its database:

• Variable location: In some types of file format, the computer package needs to know where within the columns of data in the data file the values for a particular variable begin and end.
• Variable type: As indicated above, survey software may recognize a range of types and require them to be distinguished.
• Numeric format: The character of numeric variables may need to be further specified, for example as integer, specific floating (decimal) point format, or scientific notation.
• Variable name/label: Packages may operate with internal reference names that are constrained in length and starting character, but may permit more intelligible variable labels for display purposes.
• Display formats: It may be possible to control the way the value of a variable is displayed on screen independently of the format in which it is stored within the database.
• Value labels: These supplement the actual values for display purposes with explanations of the meaning of each kind of response/code.

Although some of this information will be mandatory, other elements such as the labels are normally optional. However, it is usually well worth the time involved to label fully any database file. It is particularly important if there is a team conducting the analysis or reviewing the results in order to prevent misunderstandings over how variables or codes have been constructed. Even solo researchers returning to a file after a long interval can forget the detail of decisions taken months or years previously: a fully labelled database file reduces the researcher's dependence on memory.
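The items of codebook information listed above can be pictured as a simple data structure. The Python sketch below is only an illustration of the idea, not the internal format of any package; the variable names, labels and codes are all hypothetical.

```python
# A hypothetical codebook: every name, label and code below is invented.
codebook = {
    "staffgrp": {
        "label": "Staff group",
        "type": "integer",
        "level": "nominal",
        "value_labels": {1: "Academic", 2: "Technical", 99: "Missing"},
    },
    "fare": {
        "label": "Daily public transport fare (pounds)",
        "type": "float",
        "level": "ratio",
        "value_labels": {},
    },
}


def describe(variable, value):
    """Return a display string using the variable label and, if one exists, the value label."""
    entry = codebook[variable]
    shown = entry["value_labels"].get(value, value)
    return f"{entry['label']}: {shown}"


print(describe("staffgrp", 2))
print(describe("fare", 3.5))
```

Keeping labels alongside the codes in this way is what allows a colleague, or the same researcher months later, to read the data without relying on memory.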
Levels of measurement

As Chapter 2 made clear, quantitative analysis of any kind requires the key theoretical concepts in an area of inquiry to be operationalized, that is, to be associated with empirical observations or measurements. In survey research, the question design, the selection of categories to classify the responses and the choice of codes to represent the categories jointly determine the exact manner in which a concept will be transformed into one or more corresponding variables. The choice of categories to classify the responses is fundamental within operationalization because it is the relations that exist between the different categories that set limits on what sort of measurement will be possible and how sophisticated any statistical analysis of the resulting data will be. Four types of relations, or levels of measurement, are
conventionally distinguished. They are presented below in ascending order of the sophistication of measurement that can be conducted.
Nominal level

Where the set of categories for a variable possesses no intrinsic order or scale, classification rather than measurement proper applies. Consider, as an illustration, question 20 on the Travel Survey questionnaire. There is no intrinsic order for the different working areas within the University. 'Libraries' could have been listed first and coded '1' with the Arts Faculty listed last and given code '10' without anything being upset. Alphabetical orderings of categories are conventional, not intrinsic, orders, so '1' is merely being assigned here as a convenient numerical label (the categories could have been assigned any set of numeric or non-numeric codes as long as each was different; starting at 1 and counting up is simply a matter of convention). Where codes are arbitrary, they are not numbers in mathematical terms and it is not legitimate to carry out arithmetic operations on them (in the above example, there is no sense in which a case in the Arts Faculty is nine more or less than one in the Libraries). As a result, the types of statistical analysis that are possible on nominal variables are severely restricted.

Ordinal level

In a variable at an ordinal level of measurement, the categories do have an intrinsic order. Consider Travel Survey question 14. There is clearly a scale of likelihood of change within the categories so that responses in the 'very likely' categories (coded 1) represent greater likelihood of change in commuting patterns than those in the 'possibly' boxes (coded 2), and the latter stand for greater degrees of likely change than those in the 'unlikely' boxes (coded 3). Ideally, 'very likely' should be coded 3 and 'unlikely' 1, so that higher codes stand for greater likelihood, although either direction preserves the order. An assignment such as 1 for 'possibly', 2 for 'unlikely' and 3 for 'very likely' does not preserve it and is intuitively wrong. Even though the numbers assigned reflect the ordinal nature of the categories, there are still restrictions on how the codes can be manipulated arithmetically, although a wider range of statistics can be used at the ordinal than at the nominal level of measurement.
Interval level

There are very few variables with interval scales in common use in the social sciences, although temperature measured in Fahrenheit provides a natural science example. A unit of measurement is introduced into the picture that enables the 'interval' or quantitative difference between any two measured cases to be established. Because the bottom of a temperature scale like
Fahrenheit is not anchored in the real world constant, absolute zero, but at an arbitrary (but convenient) reference point (on the Celsius scale, for instance, zero is the point at which water freezes), it is not legitimate to multiply or divide temperatures, only to add and subtract them. Thus, if Manchester is 80°F and Montevideo is 40°F, you can say it is 40°F hotter in Manchester, but you cannot say that it is twice as hot.
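The Manchester and Montevideo example can be checked with a little arithmetic. The short Python sketch below is an illustration added here, not part of the Travel Survey analysis: it converts both temperatures to kelvins, a scale whose zero is absolute zero, and shows that the apparent ratio of 2 does not survive the change of scale, whereas the 40-degree difference is a meaningful statement on the Fahrenheit scale.

```python
def fahrenheit_to_kelvin(f):
    """Convert degrees Fahrenheit to kelvins, a scale whose zero is absolute zero."""
    return (f - 32) * 5 / 9 + 273.15


manchester_f, montevideo_f = 80, 40

difference = manchester_f - montevideo_f   # meaningful on an interval scale
naive_ratio = manchester_f / montevideo_f  # 2.0, but an artefact of the arbitrary zero
kelvin_ratio = fahrenheit_to_kelvin(manchester_f) / fahrenheit_to_kelvin(montevideo_f)

print(difference, naive_ratio, round(kelvin_ratio, 2))
```

On the absolute scale the ratio is only a little above 1, which is why 'twice as hot' is not a legitimate reading of 80°F against 40°F.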
Ratio level

In the highest level of measurement, the bottom of the scale is grounded in a 'non-arbitrary zero' point and there are fewer restrictions on the ways values can be legitimately transformed. Social scientific examples include age, income and population density. Consider also question 11 on the Travel Survey. Each respondent who travels by public transport enters a sum in pounds sterling that becomes the value for that case on that variable. It is legitimate to carry out any basic arithmetic operation on these values, so if one case has a value of £3.50 then it truly represents twice the expenditure of another with a value of £1.75.

To summarize, in nominal variables, classification rather than measurement has taken place (although it is still possible to make some quantitative statements about the distribution of cases between the categories). In ordinal variables (such as those with Likert-type categories), only the positions of cases relative to each other on the variable have been measured. In ratio variables, a number of units have been assigned to each case in a way that establishes their absolute positions on the dimension being measured.
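One common way of seeing what the levels of measurement permit is to look at which summary statistic makes sense at each level. The pairing used in the Python sketch below (mode for nominal, median for ordinal, mean for ratio) is a widely used convention rather than something laid down in this chapter, and all the data values are invented.

```python
import statistics

staff_area = [3, 1, 3, 7, 3, 10]      # nominal codes: arbitrary labels
likelihood = [1, 2, 2, 3, 3, 3, 1]    # ordinal codes: 1 very likely ... 3 unlikely
fares = [3.50, 1.75, 2.20, 4.00]      # ratio values in pounds

mode_area = statistics.mode(staff_area)            # most frequent category
median_likelihood = statistics.median(likelihood)  # middle-ranked response
mean_fare = statistics.mean(fares)                 # arithmetic mean
fare_ratio = fares[0] / fares[1]                   # a genuine ratio: twice the expenditure

print(mode_area, median_likelihood, mean_fare, fare_ratio)
```

Taking the mean of the nominal codes would run without error, but the result would be meaningless, which is the sense in which the codes are 'not numbers in mathematical terms'.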
Pre-coding and post-coding

It should be clear from the last section that the value a case or respondent possesses on a variable measured at the ordinal level or above is not strictly speaking a code at all but a non-arbitrary measurement that has to be respected. This is true whether the measurement has been carried out by the respondent themselves, as in question 11 on the Travel Survey, or by (or on behalf of) the researcher as, for example, where psychological or clinical tests are administered and a score or measure is extracted by the researcher. Of course, it may be necessary and entirely legitimate to transform the original values in ratio variables by, for instance, converting imperial to metric units or dates to a unit of elapsed time. Variables will frequently also need to be linked and integrated in various ways and this is another respect in which computer survey packages can assist and make the researcher's life a lot simpler.

For nominal and ordinal variables, it is useful to assist data entry by
printing the codes to be assigned beside the appropriate categories, choices and options on the questionnaire. This should be done discreetly in a small typeface so that it does not distract respondents. If the responses to a question are finite and all of them can be anticipated in advance (as in the question, 'Which of the following products have you bought in the last month?: Product A, Product B, Product C') then the question can be pre-coded, that is, all the codes for each response can be determined and printed on the questionnaire in advance of distribution. In other cases, respondents will return some values that cannot be predicted in advance, or there may be so many possibilities that they cannot be listed. (A question eliciting the titles of recently viewed films would present this problem.) A set of responses that cannot be anticipated must be post-coded after data collection. Thus, in the film example above, once the titles have been supplied, it is then possible to code them in any appropriate way, for example, into genres (action films, romances, comedies, and so on). A common question format (see the Travel Survey, question 20) is to present a set of pre-coded categories intended to cover the majority of respondents, followed by a residual 'Other' category (which will be post-coded) for a minority of special cases. All open-ended questions, by definition, must be post-coded.
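The film example can be sketched in Python as follows. This is a minimal illustration of post-coding, assuming a coding frame drawn up after the responses have been read; the genre codes, the residual code 8 and the lookup table of titles are all hypothetical.

```python
# Hypothetical coding frame drawn up after the responses were read.
GENRE_CODES = {"action": 1, "romance": 2, "comedy": 3}
OTHER_CODE = 8  # residual category for titles the frame does not cover

# Hypothetical lookup from normalized film title to genre.
TITLE_TO_GENRE = {
    "die hard": "action",
    "notting hill": "romance",
    "airplane!": "comedy",
}


def post_code(title):
    """Return the numeric genre code for a free-text film title."""
    genre = TITLE_TO_GENRE.get(title.strip().lower())
    return GENRE_CODES.get(genre, OTHER_CODE)


responses = ["Die Hard", " notting hill ", "Solaris"]
codes = [post_code(title) for title in responses]
print(codes)
```

Normalizing the text (stripping spaces, ignoring case) before the lookup reflects the fact that respondents rarely write the same title in exactly the same way.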
Missing data

A general principle in response processing is to avoid using blanks in the data file to stand for cases/questions where there are no values to enter. This is because it is hard to know whether a blank is a deliberate coding decision or an accidental coding slip. It is better practice to reserve dedicated codes for missing data values, chosen to be distinct from possible substantive values. Conventionally, codes such as '99' and '-1' are used for these purposes and some computer survey packages can include or exclude them from statistical processing according to the user's preference. In some situations, it may be desirable to identify the reasons, where they are known, for data being missing. In an interview-based inquiry, for instance, it could be helpful to use three different codes to distinguish data that is missing because the question was 'not applicable' to the respondent from data missing through 'respondent refusal' or because of 'question not put'.
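A minimal Python sketch of reserved missing-data codes is given below. The three negative codes are an assumption chosen for the example (a fare can never be negative, so they cannot be confused with substantive values); a real project would choose codes to suit its own variables.

```python
NOT_APPLICABLE, REFUSED, NOT_PUT = -1, -2, -3  # hypothetical reserved codes
MISSING_CODES = {NOT_APPLICABLE, REFUSED, NOT_PUT}

# Hypothetical fares in pounds, with missing-data codes mixed in.
fares = [3.50, NOT_APPLICABLE, 1.75, REFUSED, 2.25, NOT_PUT, NOT_APPLICABLE]

# Exclude the reserved codes before any statistical processing.
valid = [value for value in fares if value not in MISSING_CODES]
mean_fare = sum(valid) / len(valid)

# Count how often each reason for missing data occurs.
reasons = {code: fares.count(code) for code in MISSING_CODES}

print(mean_fare, reasons)
```

Had the codes been left in, the mean would have been dragged down by values that are not fares at all, which is why packages offer the option to exclude them.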
Multiple responses

In some question formats, such as question 4 on the Travel Survey, a respondent can legitimately 'tick' several different response categories. There are three alternative methods of assigning codes to these multiple response questions. Each method preserves all the information supplied by the respondent
and which one is most convenient depends mainly on how the data is going to be analysed.

Multiple dichotomies

The researcher creates several dichotomous variables in the codebook, one for each tick box a question contains. Each variable has one of its values (by convention, '1') representing the situation where a respondent has ticked that box, and the other value (by convention, '0') representing the situation where the respondent has not selected the category. For question 4, eight dichotomous variables will be created (Walk, Bicycle, Rail, etc.), one for each tick box. For analysis, the multiple dichotomies usually need to be recombined in some manner into more inclusive variables.
Ordinal choices

Suppose, for illustration, that the respondent can make three selections within the same question. The researcher creates three variables in the codebook, called say Choice1, Choice2 and Choice3, representing the first, second and third respondent selections respectively. Each of these variables has a set of values corresponding to all the tick boxes. Choice1 may or may not represent a first preference depending on the question wording.
Binary coding

Only one variable is created in this solution. Each tick box within a question is given in turn a numerical code in the sequence 1, 2, 4, 8, 16, 32 . . . The variable as a whole is assigned a value equivalent to the arithmetic total of all the codes corresponding to ticked boxes. Any total represents a distinct collection of ticked choices: if this system was used in question 4 of the Travel Survey, a code 'total' of 14 for the question would mean that the respondent had checked 'Walk', 'Bicycle' and 'Rail' as alternative modes of transport.
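Multiple dichotomies and binary coding can be compared in a short Python sketch. The order of the eight tick boxes below is an assumption made for illustration (the questionnaire itself is not reproduced here); it has been chosen so that 'Walk', 'Bicycle' and 'Rail' are the second, third and fourth boxes, carrying the codes 2, 4 and 8, which is what a total of 14 implies.

```python
# Hypothetical tick-box order; only Walk, Bicycle and Rail are named in the text.
BOXES = ["Car", "Walk", "Bicycle", "Rail", "Bus", "Taxi", "Motorcycle", "Other"]


def to_dichotomies(ticked):
    """Multiple dichotomies: one 0/1 variable per tick box."""
    return {box: int(box in ticked) for box in BOXES}


def to_binary_code(ticked):
    """Binary coding: box i carries the code 2**i (1, 2, 4, 8, ...); sum the ticked ones."""
    return sum(2 ** i for i, box in enumerate(BOXES) if box in ticked)


def from_binary_code(total):
    """Recover the list of ticked boxes from a binary-coded total."""
    return [box for i, box in enumerate(BOXES) if total & (2 ** i)]


ticked = {"Walk", "Bicycle", "Rail"}
print(to_dichotomies(ticked))
print(to_binary_code(ticked))
print(from_binary_code(14))
```

Because each code is a distinct power of two, no two different sets of ticked boxes can produce the same total, which is why a single variable is enough to hold the whole response.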
Checking and cleaning the data

There are two key points within response processing where checking and cleaning the data is advisable. One is immediately after data collection. Particularly in cases where a postal survey has been carried out and the data entry is being done by hired hands or an agency, it is worth going through the forms before they are passed on, looking for problem cases. These may be ones where key fields lack responses (is it worth processing these cases?) or where there are data input dilemmas - perhaps respondents have made comments which were not catered for, or have provided multiple responses
where they were not anticipated, or answers which are simply puzzling. It is best to assess the scale of such problems early on and to resolve them before they can affect the confidence and productivity of any personnel hired to carry out data input and coding.

The second strategic point for checking is when the computer survey package has read all of the data available. A few simple checks here can confirm that there are no serious codebook errors. Does the number of cases the software sees match what you anticipated? A frequency count (see page 139) of some key variables will show whether there are any 'impossible' values in key variables which could not correspond to valid responses.
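The two checks just described, a case count and a frequency count, might look like this in Python. The valid codes, the expected number of cases and the keyed-in values are all hypothetical, made up to show one 'impossible' value being caught.

```python
from collections import Counter

VALID_CODES = {1, 2, 3, 99}  # hypothetical: three substantive codes plus a missing code
EXPECTED_CASES = 8           # hypothetical number of returned questionnaires

q14 = [1, 2, 2, 3, 99, 1, 7, 2]  # hypothetical keyed-in values; 7 is a keying slip

frequencies = Counter(q14)
impossible = sorted(value for value in frequencies if value not in VALID_CODES)
case_count_ok = len(q14) == EXPECTED_CASES

print(dict(frequencies))
print("case count ok:", case_count_ok)
print("impossible values:", impossible)
```

Any value reported as impossible should be traced back to the original questionnaire rather than simply deleted.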
Key summary Unless you are c< fui of respondent
file and possibilit y that m
Further reading

Although levels of measurement are dealt with in nearly all survey and statistics texts, response processing gets relatively little coverage. Exceptions are Rose and Sullivan (1996), Chapter 3, de Vaus (1991), Chapter 14, and Bourque and Fielder (1995), section 5.
Strategies for analysis
Key elements in this chapter
...: notes and definitions required to understand the table fully are supplied as footnotes linked to specific headings or cells in the table.
8 Because of rounding, some percentage totals may be 99 or 101. Normally it is best to display the totals as 100 and to explain the rounding error disparities in a footnote in the first table.
9 Tables which have been reproduced from an existing publication always need a precise source: a reference that includes a page and/or table number is essential.
10 Uncluttered tables work best: all superfluous material should be suppressed (unnecessary decimal places, redundant totals); rules (lines) between columns are not essential although they can aid clarity in complex tables.

Table 8.2 shows a cross-tabulation from the Travel Survey data from the responses to questions 19 and 22. The direction of percentaging implies that staff group is the independent variable and the number of cars available to the household dependent. (Clearly, how many cars you owned could not be the determinant of which staff category you belong to.) The title reinforces this by naming the dependent variable first - see point 7 in the list of cross-tabulation conventions above.

Table 8.2 reveals only modest percentage differences between the categories of staff. However, there are three interesting features that stand out. First, although the academic and academic related groups are on salary scales that go to higher levels than the other groups, they do not appear to have any greater access to cars. Second, for no obvious reasons a smaller proportion of technical staff is in car-less households than the other groups; and third, more secretarial, clerical and junior administrators are in multi-car owning households than the other groups.
Since each staff group contains individuals at very different levels of seniority and stages of their work careers, these findings need to be treated with caution. For example, some members of the secretarial group are young employees who will be in households that possess a car or cars but may not have the use for commuting of vehicles owned by siblings or parents.

There are two simple but important lessons about survey analysis to be learned at this point. The first is that it is hard to make sense of survey findings as an analyst if all you know about the social groups or situations you are investigating is the findings themselves. If this is the case, you need a collaborator or participant to work with you who can provide contextual insights - somebody, for example, who could explain the significance of the different staff groups in the Travel Survey and their respective earnings. The second lesson is that it is relatively unusual, especially in descriptive investigations like the Travel Survey, for any one finding to be of shattering significance or to stand alone as the finding in the project. More commonly, you will be pursuing leads and hunches and hopefully gradually piecing together a developing picture of the slice of the social world that is of interest.
Table 8.2 Cars available to household (q22) by staff group (q19)

                                 Staff group
                 Academic     Academic-    Secretarial,    Technical    Manual and
                              related      clerical,                    ancillary
                                           junior admin
                 N     %      N     %      N     %         N     %      N     %
None             17    10     13    10     18    14        3     4      13    18
One              65    40     60    45     42    32        35    50     40    54
More than one    80    49     59    45     70    54        32    46     21    28
Total            162   100    132   100    130   100       70    100    74    100
Testing hypotheses and statistical significance

In addition to the technique for estimating population values, discussed on pages 75-7, inferential statistics provides a capacity to establish whether differences observed between sub-groups within a sample on a particular variable (differences, for instance, between means or proportions) are likely to indicate equivalent differences in the population. The same procedures can be adapted to cover situations where the means or proportions to be compared belong to cases that come from different random samples.

The central element in the procedures is the testing of a null hypothesis, so-called because it always asserts that there is no difference between a sample and a population or between one sample and another. Normally, the researcher hopes to reject the null hypothesis and by so doing to confirm the existence of empirical differences. If it cannot be rejected, the implication is that the evidence is insufficiently strong to infer any real world differences or relationships.

The test of a null hypothesis calculates the probability that the recorded differences in the sample(s) could have occurred entirely by chance on an assumption of no real world relationship between the variables concerned. (Chance, in this context, means simply sampling error, and the tests rely on sampling distributions that are theoretical plots of the results from every sample that could possibly be obtained.) If the observed differences are substantial, it will be extremely unlikely that they have occurred by chance.
If the observed differences are less substantial, then sampling error becomes more plausible as an explanation and it may not be possible to reject the null hypothesis. By convention, a probability level of .05 (5 per cent) is generally taken as the minimum threshold of statistical significance. In other words, if there is a more than 5 in 100 chance that sampling error could account for the observed difference, the researcher is obliged to retain (that is, fail to reject) the null hypothesis. Occasionally, where there is reason to be more stringent, a .01 (1 per cent) level is also employed. Thus, the outcome of the test of a hypothesis involves two key components: a measure of difference in the form of the calculated value of one member of a large family of test statistics, and a probability statement about the criterion level at which the null hypothesis has been rejected (usually indicated in summaries of results as p < .05 or p < .01 if the finding is significant, or as NS, for not significant, if the finding has p > .05).

Which sampling distributions are appropriate as the bases for making inferences from different kinds of data is a relatively technical matter about which statistics and data analysis texts give guidance. In general terms, tests which make no, or only easily-satisfied, assumptions about the form of the underlying distribution are known as non-parametric and sources describing them can be found at the end of the chapter.

A test of significance commonly employed with bivariate tables containing two nominal variables is the chi square test for independence. Symbolized by the Greek character chi (χ, rhyming with 'try') and written as χ², it is named after the theoretical distribution it employs.
The point of the test is to establish whether a case's membership of a particular category on one of the variables has a bearing on which category of the other variable it will be in (and whether any such effect is distinct from sampling error). The test produces a value that is then compared to a table of chi square distribution values (often reproduced in the back of statistics textbooks). The null hypothesis is rejected if the calculated chi square value is greater than the threshold value in the table for a particular level of significance, which takes into account the number of cells in the table (through its degrees of freedom). An above-threshold value for the chi square statistic may also be interpreted as a sign that there is an association between the variables concerned. The chi square value is not, however, proportional to the strength of an association (see the next section).
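The comparison of a calculated chi square value with a tabulated threshold can be sketched in a few lines of Python. This is a minimal illustration rather than a full procedure: the 2 x 2 counts are hypothetical (they are not Travel Survey data), the function name `chi_square` is introduced here for convenience, and the sketch assumes that every row and column total is greater than zero. The threshold values are the standard .05-level entries found in textbook chi square tables.

```python
# Critical chi square values at the .05 level, keyed by degrees of freedom
# (standard textbook table values).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def chi_square(observed):
    """Return (chi square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (count - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Hypothetical 2 x 2 table of counts (illustrative only).
observed = [[30, 20],
            [20, 30]]
statistic, df = chi_square(observed)
reject_null = statistic > CRITICAL_05[df]
print(statistic, df, reject_null)  # 4.0 1 True
```

Here every expected count is 25, so each cell contributes 1.0 and the statistic is 4.0 on one degree of freedom. Because 4.0 exceeds the tabulated 3.841, the null hypothesis of independence would be rejected at the .05 level for these invented counts.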
The same broad logic applies to the analysis of variance test of significance (often abbreviated to ANOVA). This is frequently applied to situations where there is a ratio level dependent variable, such as a test score, size or income measure, and a categorical independent variable, such as gender, religious affiliation or occupation (although it can also handle continuous independent variables). In either case, analysis of variance addresses the issue of whether differences between the averages of the dependent variable within the different categories of the independent variable are greater than sampling error would suggest. It does so by comparing the total amount of variation between the categories of the independent variable with the total amount within categories. The greater the amount by which the between-category variation exceeds the within-category variation, the more likely it is that the null hypothesis can be rejected and an association between independent and dependent variables inferred. Analysis of variance uses the F ratio as its test statistic and the F distribution as its sampling distribution. There are constraints affecting its application where the numbers in the categories of the independent variable differ greatly.

Statistical texts discuss more fully which test statistics and sampling distributions are appropriate for particular kinds of data and set out in detail the steps in the testing procedures. It is worth emphasizing here, however, two general aspects of testing. First, since the logic of calculating statistical significance involves the attempt to discount the effects of sampling error, there is no point in testing the significance of data that are not derived from probability sampling. Secondly, statistical significance is not the same thing as substantive or research significance. It is possible to establish any number of statistically significant results that have no bearing on the objectives of the research (or indeed on the field of inquiry as a whole). The reverse is also possible. A real-world difference or relationship may fail to achieve statistical significance (the sample size may have been too small to permit sampling error to be dismissed). This second point highlights a general limitation of significance tests: outcomes are sensitive to sample size as well as the size of recorded differences.
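The between-category versus within-category comparison that underlies analysis of variance can be made concrete with a short Python sketch. It assumes the standard one-way form of the F ratio, in which each sum of squared deviations is divided by its degrees of freedom before the two are compared. The function name `f_ratio` and the three groups of scores are hypothetical and serve only as an illustration.

```python
def f_ratio(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(group) / len(group) for group in groups]

    # Variation of the category means around the grand mean.
    ss_between = sum(len(group) * (mean - grand_mean) ** 2
                     for group, mean in zip(groups, means))
    # Variation of individual cases around their own category mean.
    ss_within = sum((x - mean) ** 2
                    for group, mean in zip(groups, means)
                    for x in group)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical scores for three categories of an independent variable.
scores = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
f, df_between, df_within = f_ratio(scores)
print(f, df_between, df_within)  # 21.0 2 6
```

In this invented example the between-category sum of squares is 42 and the within-category sum is 6, giving F = 21 on 2 and 6 degrees of freedom. The calculated value would then be compared with the threshold in a table of the F distribution for those degrees of freedom, in the same way as a chi square value is compared with its table.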
Measures of association for nominal variables

The calculation and inspection of table percentages is a mainstay of survey analysis in basic surveys. In the case of simple tables, this approach has the advantage of being straightforward from the viewpoint of both production and consumption. However, the comparison of percentages between categories has a number of limitations that are not adequately dealt with by the responses to complexity suggested on pages 144-5 (collapsing categories or examining a series of bivariate sub-tables if there is more than one independent variable). A severe limitation of all the bivariate procedures reviewed so far is that they do not provide a straightforward measure of the strength of the relationship between the variables. Measures of association address this deficiency.

For nominal variables, there are two main families of simple measures of association. The first family is based on the chi square statistic and derives additional measures from it, the most common of which are phi (for 2 × 2 tables, written as φ) and its close relation, Cramér's V. Both of these (and many other indices of association) take values between zero (for no association) and 1.00 (for a perfect association). Both are relative indices in the sense that the closer to zero the weaker and the closer to 1.00 the stronger the association. However, neither measure indicates the proportion of the overall variation within a table that is attributable to the association. Both chi square-based measures are normally accompanied by an indication of the outcome of the test of a null hypothesis.
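A brief Python sketch may help to show why a chi square-based index of association is needed in addition to the chi square value itself. It assumes the usual definition of Cramér's V, the square root of chi square divided by the product of the sample size and one less than the smaller of the number of rows and columns; for a 2 × 2 table this reduces to phi. The function names and both tables of counts are hypothetical.

```python
import math


def chi_square_statistic(observed):
    """Chi square statistic for a table of counts (non-zero marginals assumed)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return sum((count - row_totals[i] * col_totals[j] / n) ** 2
               / (row_totals[i] * col_totals[j] / n)
               for i, row in enumerate(observed)
               for j, count in enumerate(row))


def cramers_v(observed):
    """Cramer's V; equal to phi when the table is 2 x 2."""
    n = sum(sum(row) for row in observed)
    smaller_dim = min(len(observed), len(observed[0]))
    return math.sqrt(chi_square_statistic(observed) / (n * (smaller_dim - 1)))


# Two hypothetical tables with identical percentages but different sample sizes.
small = [[30, 20], [20, 30]]
large = [[60, 40], [40, 60]]
print(chi_square_statistic(small), cramers_v(small))
print(chi_square_statistic(large), cramers_v(large))
```

Doubling every count doubles the chi square value (from 4 to 8 in this invented case) while V stays at 0.2. This illustrates the earlier point that the chi square value reflects sample size as well as the pattern in the table, whereas the derived index does not.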
Table 8.3  Staff group (q19) by main mode of commuting (q3)

                               Academic  Academic-  Secretarial,    Technical  Manual and
                                         related    clerical,                  ancillary
                                                    junior
                                                    administrator
                               %         %          %               %          %
Walk                           11        7          8               4          13
Bike                           15        8          7               16         8
Bus or rail                    8         6          18              7          14
Car (driver/passenger)         65        77         68              69         61
Motorbike (driver/passenger)   1         2          0               4          4
Total                          100       100        100             100        100
The second family, proportional reduction in error (PRE) measures, such as lambda (written as λ), is based on a comparison between two attempts at prediction. In the first attempt, the value of the dependent variable is predicted in a state of ignorance about the values of the independent variable; in the second, it is based on the values of the independent variable. If there is a sufficient improvement in the accuracy of prediction between the first and the second attempt, an association between the two variables can be inferred. The results of a PRE test are also presented as an index between zero and 1.00, but an important advantage of a PRE index over a chi square-based equivalent is that it has a more intuitively meaningful interpretation. A lambda of .25 is an indication that knowing the values of the independent variable reduces the error factor in predicting dependent values by 25 per cent. Because lambda is an asymmetrical measure, results will be different depending on which variable is identified as independent. Under some circumstances, lambda is ultra-conservative and produces results indicating no association when other tests result in positive outcomes.

To see the application of a PRE measure, consider the Travel Survey issue of whether there is a relationship between membership of a staff group and main mode of travel to work. The percentages in Table 8.3 are not easy to interpret. The manual and ancillary group appears slightly less dependent on cars than the others and it contains a larger proportion of walkers and a smaller one of cyclists. There are also slight divergencies from the 'average' among the other groups, but it is difficult to judge whether there is an overall association between the two variables. Here the measures of association can play a valuable role. Table 8.4 reports the results of two measures of association calculated by the SPSS package for the data contained in Table 8.3.
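The two attempts at prediction that define a PRE measure can be sketched in Python. The sketch assumes the usual Goodman and Kruskal definition of lambda: errors made by always guessing the overall modal category of the dependent variable are compared with errors made by guessing the modal category within each category of the independent variable. The function names and all counts are hypothetical and are not taken from the Travel Survey; rows stand for the independent variable and columns for the dependent variable.

```python
def pre_lambda(table):
    """Lambda for predicting the column variable from the row variable."""
    n = sum(sum(row) for row in table)
    col_totals = [sum(col) for col in zip(*table)]
    # First attempt: always guess the overall modal column category.
    errors_without = n - max(col_totals)
    # Second attempt: guess the modal column category within each row.
    errors_with = sum(sum(row) - max(row) for row in table)
    return (errors_without - errors_with) / errors_without


def transpose(table):
    """Swap rows and columns, so the other variable is treated as independent."""
    return [list(col) for col in zip(*table)]


# Hypothetical counts (illustrative only).
table = [[40, 10],
         [20, 30]]
print(pre_lambda(table))             # 0.25
print(pre_lambda(transpose(table)))  # 0.4

# The same modal column in every row: knowing the row does not change the guess.
same_mode = [[40, 10],
             [30, 20]]
print(pre_lambda(same_mode))         # 0.0
```

The first table gives a lambda of .25 in one direction and .40 in the other, which shows the asymmetry described above. In the second table the first column is the modal category in both rows, so lambda is zero even though the row percentages differ (80 against 60 per cent), and a chi square calculated by hand for the same counts comes to about 4.76, above the .05 threshold of 3.841 for one degree of freedom. This is the kind of situation in which lambda behaves ultra-conservatively.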
[Table 8.4: results of two measures of association calculated by SPSS for the data in Table 8.3; the table contents are not recoverable from this copy.]