Psychological Testing: BPS Occupational Test Administration Open Learning Programme

David Bartram and Patricia A. Lindley



BPS Blackwell


© 2006 by David Bartram and Patricia A. Lindley

A BPS Blackwell book

BLACKWELL PUBLISHING
350 Main Street, Malden, MA 02148–5020, USA
9600 Garsington Road, Oxford OX4 2DQ, UK
550 Swanston Street, Carlton, Victoria 3053, Australia

The right of David Bartram and Patricia A. Lindley to be identified as the Authors of this Work has been asserted in accordance with the UK Copyright, Designs, and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs, and Patents Act 1988, without the prior permission of the publisher.

First published 2006 by The British Psychological Society and Blackwell Publishing Ltd

1 2006

Library of Congress Cataloging-in-Publication Data
Bartram, David.
Psychological testing : the BPS Occupational Test Administration Open Learning Programme / David Bartram and Patricia A. Lindley.
p. cm.
ISBN-13: 978–1–4051–3107–0 (hardcover : alk. paper)
1. Psychological tests -- Study guides. I. Lindley, Patricia A. II. British Psychological Society. III. Title.
BF176.B38 2006
150.28′7 -- dc22
2005033630

A catalogue record for this title is available from the British Library.

Set in 10.5/12pt Cheltenham by Graphicraft Limited, Hong Kong
Printed and bound in the UK by Athenaeum Press

The publisher’s policy is to use permanent paper from mills that operate a sustainable forestry policy, and which has been manufactured from pulp processed using acid-free and elementary chlorine-free practices. Furthermore, the publisher ensures that the text paper and cover board used have met acceptable environmental accreditation standards.

For further information on Blackwell Publishing, visit our website: www.blackwellpublishing.com


CONTENTS

List of Figures, Tables and Exercises

INTRODUCTION AND STUDY GUIDE
  Introduction to test administration
    What do test administrators do?
    Limits on the range of tests that may be administered
    Keeping up to date
    Summary
  The BPS Certificate in Test Administration
    The BPS test administration standards
    What does the Certificate provide?
    How is the Certificate obtained?
  Becoming a test administrator
    Acquire the necessary competence
    Have your competence assessed
    Checklist of materials in the BPS Open Learning Modules pack
    What you will need to meet the test administration assessment requirements
  How to use the Open Learning Modules
    General hints on self-study methods
    How to organize your work
    Build up your portfolio of work for assessment

MODULE 1: AN INTRODUCTION TO PSYCHOLOGICAL TESTING FOR TEST ADMINISTRATORS
  OVERVIEW
  KEY AIMS
  INTRODUCTION
    About these modules
  1.1 WHAT ARE PSYCHOLOGICAL TESTS?
    What is a ‘psychological test’?
    Test manuals
    What do tests measure?
    Summary
  1.2 MAIN TYPES OF PSYCHOLOGICAL TEST
    Measures of typical performance
    Measures of maximum performance
    Differences between ability and attainment tests
    Work samples, trainability tests and job simulations
    Dexterity tests
    Apparatus tests: issues for test administration
    Computer Adaptive Testing
  1.3 CONTROLLING THE QUALITY OF PSYCHOLOGICAL TESTING
    Scope: norms and the process of referencing scores
    Acceptability
    Practicality
    Fairness
  SUMMARY OF MODULES 1.1 TO 1.3
  1.4 SCALES AND MEASUREMENT
    Raw scores
    Scales
    Absolute and relative scores: raw scores and normative scores
    Norm-referenced scores -- comparing people with other people
    Self-referenced or ipsative tests -- comparing people with themselves
  1.5 UNDERSTANDING AND USING TEST NORMS
    Frequency distributions
    Percentiles
    Descriptive measures based on percentiles
    Other standard scores
    T-scores and sten scores
  SUMMARY OF MODULES 1.4 AND 1.5
  MODULE 1: ANSWERS TO EXERCISES AND SELF-ASSESSMENT QUESTIONS
    Answers to Exercises
      Exercise 1.5.1: Converting scores to percentiles and grades
      Exercise 1.5.2: Using norm tables
    Answers to Self-Assessment Questions
      SAQ 1.2.1
      SAQ 1.2.2
      SAQ 1.4.1
      SAQ 1.5.1: Converting raw scores to percentiles

MODULE 2: TEST ADMINISTRATION AND SCORING
  OVERVIEW
  KEY AIMS
  2.1 TEST ADMINISTRATION
    High-stakes and low-stakes testing
    What are the functions of the test administrator role?
    Modes of test administration
      Open Mode
      Controlled Mode
      Supervised Mode
      Managed Mode
      The International Test Commission (ITC) Guidelines
    The four stages of test administration
    Stage 1: Preparation
      Issues to consider
      Dealing with candidates who have specific problems
      Planning the session
      Preparation of yourself as administrator
      Planning Schedule
      Preparation for computer-based testing sessions
    Stage 2: Administration
      The Test Session Log
      Introducing the test session
      Administration of the test
      Computer-based test (CBT) administration and administration of tests over the internet
      Detailing the level of control over the test conditions
    Stage 3: Scoring
      Checking answer sheets
      Hand-scoring
      Other scoring procedures
      Converting raw scores to standard scores and percentiles
      Summary of points to note when scoring tests
    Stage 4: Completing the administration procedures
      Paper-and-pencil materials
      Computer materials
      Record-keeping, monitoring and follow-up
  2.2 ISSUES OF CONFIDENTIALITY AND SECURITY
    Maintain the confidentiality of test-taker results
    Security of test materials
    Data Protection Act 1998
      Confidentiality of test data
  2.3 PUTTING IT INTO PRACTICE
    Overview and practical task
  2.4 FEEDBACK AND REPORTING
    Preparation of scores for feedback
      The test conditions
      Comparisons with the performance of relevant others
    Making arrangements for the feedback session
    Helping in the preparation of reports
      Generating computer-based reports
      Creating summary reports
  ENDPIECE
  MODULE 2: ANSWERS TO EXERCISES AND SELF-ASSESSMENT QUESTIONS
    Answers to Exercises
      Exercise 2.1.1: Planning the session
      Exercise 2.1.2: Inviting the candidate to the test session
    Answers to Self-Assessment Questions
      SAQ 2.1.1
      SAQ 2.3.1

GLOSSARY


LIST OF FIGURES, TABLES AND EXERCISES

Figures
  1.2.1  Example items from verbal, spatial, abstract and numerical tests
  1.2.2  The Crawford Small Parts Dexterity Test
  1.2.3  Examples of computer-based tests
  1.5.1  Correspondence between the normal distribution and a number of standard score-based scales

Tables
  1.1.1  Some of the major differences between a psychological test and a set of questions
  1.2.1  Types of item used in personality and interest inventories
  1.2.2  Examples of the sort of items used in tests of maximum performance
  1.2.3  Examples of apparatus tests
  1.4.1  Example items for a mood inventory
  1.4.2  Example items for a general ability test
  1.4.3  Typical self-report inventory items
  1.5.1  Example of a raw score to percentile conversion table
  1.5.2  Commonly used standard score scales
  2.1.1  Checklist of actions for the four stages
  2.1.2  Checklist for planning a test session

Exercises
  1.4.1  Self-administration of Test A
  1.5.1  Converting raw scores to percentiles and grades
  1.5.2  Using norm tables
  2.1.1  Planning the test session
  2.1.2  Inviting the candidate to the test session
  2.1.3  Familiarizing yourself with the Test Pack materials
  2.1.4  Introducing the session to the candidates
  2.1.5  Checking the answer sheets
  2.1.6  Scoring
  2.1.7  Converting raw scores into percentiles and standard scores
  2.3.1  Administering Tests A and B
  2.4.1  Preparing information for the test user: case study 1
  2.4.2  Preparing information for the test user: case study 2


Introduction and Study Guide

Before you start work on the Modules, please read through this introductory material. If you have not yet made any arrangements for being assessed for the British Psychological Society’s (BPS’s) Certificate of Competence in Occupational Testing (Test Administration) and wish to do so, then it is best to make these arrangements before you get too far into studying (see below for details).

This introduction provides you with some guidelines on how to study, how to have your competence as a test user assessed and how to go about obtaining tests once you have attained competence.

Introduction to test administration


Good test administration is of vital importance in ensuring the quality of psychological testing and maximizing the value of the information obtained from tests. In the past the role of the test administrator was relatively straightforward. He or she would make the preparations for the test session, provide the instructions for testing to one or more people and supervise them while they completed the tests. He or she would then carry out various checks on the responses, do the scoring and produce the materials needed by the qualified test user to produce a report or provide feedback on the candidates’ performance.

While this still represents much of what test administrators do, with increasing use of computer-based testing and online delivery of tests, the role of the test administrator is changing and becoming more diverse. Not only do you need to know how to administer paper-and-pencil tests, but these days you need to be familiar with computer systems and with the internet. Many tests no longer need a test administrator to carry out scoring, but instead they need someone who can manipulate the scores from one computer database into another, or follow the procedure for generating a report from a set of test scores.


It is assumed throughout these Modules that these basic operational skills either exist or can be learnt elsewhere. It is also assumed that any training for the specific skills and competences needed to operate particular software packages will be provided with that package. Online-based testing systems are generally easier to use than desktop systems, and assume that the user simply has general internet and web-browser operating skills.

What do test administrators do?

It is very important to know what, as a test administrator, you can do and what you cannot do. Understanding the limits of your competence is vital. Qualification as a test administrator can be sought either as an end in itself or as a step on the road to qualification as a test user. In these units, when we refer to the ‘qualified test user’, we mean a person who has obtained a BPS Level A or B qualification, or has an equivalent level of competence, and who is therefore qualified to use a certain range of types of tests. For Level A this range is limited; for full Level B it can be extensive.

We can think of testing as involving a number of stages. In the table below you can see where the roles of the test administrator and test user differ.

Stage                                                                          Test user   Test administrator
Choosing whether to test or not                                                Yes         No
Choosing which test or tests to use                                            Yes         No
Managing the administration process                                            Yes         Yes, under direction of the test user
Scoring the results and producing the materials necessary for interpretation  Yes         Yes, under direction of the test user
Interpreting the results                                                       Yes         No
Providing written or oral feedback of the results                              Yes         No
Evaluating the utility of the test in the longer term                          No          No

As you can see, the test administrator’s role fits within a larger process and is one that is directed by the test user, who has ultimate responsibility for the testing. In effect, a qualified test user can delegate certain aspects of the testing process to test administrators, but the test user retains overall responsibility. Nevertheless, the role of test administration is important in two key respects:

1. It is the test administrator who is at the interface between the actual test and the test taker. How people perform on the test will have a lot more to do with how the test administrator operates than with the test user.

2. For the same reason, it is the test administrator who is the public face of the testing process. It is the test administrator who meets candidates for selection into an organization, and who has the task of representing the organization in a positive way.



Limits on the range of tests that may be administered

The BPS test administration qualification is not intended to cover all possible types of tests. There are tests which require a high degree of knowledge of the test itself or of psychology to administer. Such tests are often quite interactive in the way in which they are administered. Examples would include batteries of ability measures like the British Ability Scales or the Wechsler Adult Intelligence Scale. You will need to be guided by the qualified test user in terms of what tests you can administer for them, and what they will need to administer themselves. However, even for tests that require them to be involved in the actual administration, there are many supporting processes that you would still be able to manage – arranging the sessions, welcoming candidates, assisting with scoring and so on.

The BPS Occupational Testing standards are intended for test users who are working with ‘normal adults’.

• By ‘normal’ we simply mean people being assessed in the normal run of things – for a new job, for guidance in career choice, for personal development and so on. Test administration in the areas of mental health or forensic testing requires specific skills and considerations that are not covered here – though the general test administration skills you will learn underpin all of these.

• By ‘adult’ we are excluding the assessment of children (which comes under the general headings of either educational testing or clinical testing, depending on the purpose). We are also excluding testing of the elderly where that is being carried out for clinical or health reasons rather than simply in relation to work or lifestyle.

Keeping up to date

These Modules have been written with the future in view. They are based on the 2005 updates to the BPS Level A and Level B standards of competence in test use. The main respect in which they differ from the original standards is in the diversification of the role of test administration brought about by the increasing use of computer-based testing and testing on the internet.

Summary

Test administration is an essential part of competent test use. But it does not cover stages such as test choice, understanding the technical qualities of tests, or the interpretation and feedback of results. Consequently, as a test administrator you will not be responsible for these aspects of test use and will be expected to use tests only under the supervision of a qualified test user. If you reach Level A, you will be considered competent by the BPS to use certain types of tests on your own. From Level A onwards, you can start to increase the breadth and depth of your skills and understanding of testing and extend the range of tests you can use fully. These Modules will enable you to set out on the path to test user qualification and take you along to the first important point: test administration.



The BPS Certificate in Occupational Test Administration

These Modules have been designed and are carefully structured to provide you with all the material and information you need to develop your basic competence as an administrator of psychological tests in occupational settings. For practical purposes, the BPS has divided its specification of test user competence in occupational testing into two ‘levels’: Levels A and B. While the test administration qualification forms a meaningful qualification in its own right, it can also be used as a stepping stone to acquiring Level A and Level B test user qualifications.

Level A defines the basic foundation skills and competence needed for the use of a limited range of types of test (those which are easier to interpret), including test administration. Level B extends this to cover the competences required for using most of the other psychological tests employed in occupational assessment (including measures of personality). Level A is the starting point for progress on to Level B.

As with all professional development, there is no well-defined end point and there are many alternative routes one can take to achieving competence. Levels A and B mark points along a general developmental path. Some people may follow this path on beyond Level B, while others may choose to progress no further than test administration.

The BPS test administration standards

The Test Administration Certificate is based on a set of standards which relate to an individual’s ability:

• to administer certain types of psychological test fairly and effectively within occupational settings (such as personnel selection, vocational guidance, management development);

• to adhere to the codes of practice and professional conduct defined by the BPS and other relevant bodies (for example, the Chartered Institute of Personnel and Development).

The standards are defined by a detailed Checklist of Competences in Occupational Testing (available from the BPS in the Test Administration General Information Pack which you can download at: http://www.psychtesting.org.uk), which specifies a range of knowledge and skills relating to the administration of a permitted range of types of psychological test in occupational settings. It covers the following areas:

• Relevant underpinning knowledge – especially concerning the nature of psychological testing.

• Task skills – relating to the performance of test administration related activities.

• Task management skills – required to achieve overall functional competence: organizing assessment procedures, control and security of materials, etc.

• Contingency management skills – needed to deal with problems and difficulties, breakdowns in routine, candidates’ questions during test administration, etc.

• Instrumental skills – relating to specific test administration modes and procedures, such as the use of computer-based test administration, remote administration over the internet, etc.



What does the Certificate provide?

Possession of the Test Administration Certificate provides evidence of your basic competence in certain areas of occupational testing. With a Test Administration Certificate you are qualified to administer a wide range of attainment, ability, aptitude, personality, motivation and other tests under the supervision of a suitably qualified test user. Publishers generally classify their test materials in terms of the competence level required for their use. Most now use the BPS classification into Levels A and B, though this is not universal. You will not be eligible to purchase Level A or Level B tests in your own right. It is important to note that the Test Administration Certificate does not constitute a qualification in psychology and does not confer any ‘psychologist’ status on the holder.

How is the Certificate obtained?

Any person who can provide sufficient evidence that they meet the standards required for all the items on the checklist of competences will be eligible to apply for a BPS Certificate of Competence in Occupational Testing (Test Administration). To obtain the Certificate, your competence has to be assessed by someone who is recognized by the BPS as qualified to assess people for the Test Administration Certificate. Assessment of competence is subject to a verification process carried out by the BPS which is designed to ensure that assessment is fair and that different assessors are not making very different demands. The BPS holds a register (the Register of Assessors) of these people, and you can obtain from the BPS a list of those who operate in your area. In some cases these are individual consultants, in others people working in consultancy companies. The BPS provides guidelines on assessment for the Certificate and all registered assessors are subject to monitoring and quality checks by the BPS through its verification scheme. All assessors are Chartered Psychologists who themselves hold at least the Level A Certificate and who have expertise in occupational testing and assessment. You will need to contact one or more of the people on the Register of Assessors in your area to ask about their costs, what they can offer you and their availability.

Becoming a test administrator

The traditional route to becoming a test administrator has been to attend a one- or two-day course in test administration. This is still the fastest route. However, training which is more spread out in time provides you with more time to absorb new ideas and concepts and to practise the skills and techniques. While the BPS Test Administration Open Learning Modules can be used as teaching materials for conventional training courses, they have also been designed to offer an alternative option: flexible learning through self-study. This provides you with the freedom to pace your learning, it does not require as much time away from work, and it is far cheaper than a conventional training course.

There are two steps to becoming a competent test administrator through the BPS self-study route:

1. Acquire the necessary competence.

2. Have your competence assessed.



Acquire the necessary competence

You should be able to do most of this working on your own using these Open Learning Modules and associated materials. However, you may wish to have some extra guidance and help. Many of those in the Register of Assessors, who carry out competence assessments, also provide training and tutoring services. At any point in your study, you can ask one of these people to be your tutor. Some offer a telephone ‘helpline’ and face-to-face sessions (tutorials, small group workshops, back-up training sessions and so on). Naturally there will be a charge for any services, and you should ask for a clear statement of the charges before you proceed.

Have your competence assessed

As already stated, this has to be carried out by someone who is recognized by the BPS as a verified assessor. Typically, the assessment will involve attending an assessment workshop for the assessment of your test administration skills and the submission of various items of evidence of your competence (your ‘portfolio’). The assessment process used by the authors of these Modules, for example, has included:

1. Written exercises to be completed in your own time and submitted for assessment.

2. Completion of all the Self-Assessment Questions and Exercises in these Open Learning Modules for inclusion in your portfolio.

3. A test administration assessment workshop in which you are observed carrying out one or more test administrations.

These are all items that can make up your portfolio (see below).

Checklist of materials in the BPS Open Learning Modules pack

The complete Open Learning pack should contain:

• Introduction to the Modules

• Module 1: An Introduction to Psychological Testing for Test Administrators

• Module 2: Test Administration and Scoring

• Glossary of Terms

• The Test Pack

• The Assessment Portfolio

In addition, a copy of the General Information Pack for the Test Administration Certificate is included. This provides detailed information about test administration and includes the Checklist of Competences in Occupational Testing (Test Administration). The Test Pack and the Assessment Portfolio contain sufficient copies of material for all your course work. All the material is protected by copyright and should not be reproduced by any means without the prior permission of the publisher. The authors and the publisher do permit the making of additional copies of the Test A and P5 Booklets, Test Session Logs and Candidate Evaluation Forms, as long as they are for your personal use only.



What you will need to meet the test administration assessment requirements

To meet part of the assessment evidence requirements for the Test Administration Certificate you will need to have completed your Assessment Portfolio. This contains all the SAQs and Exercises in the Modules. If you want to obtain the Test Administration Certificate and are planning to use the Open Learning Modules for self-study, then you should consider registering with a verified assessor before you start your studies.

How to use the Open Learning Modules

General hints on self-study methods

This course covers all that you would normally cover in a two-day test administration course – and quite a lot more! Hence, the minimum amount of study time needed to master the essential parts of the material presented here is likely to be at least 12 or 18 hours. For most people, it would be more realistic to plan for about 15 to 20 hours. You should expect to master all of the materials within this time and be ready for your final assessment. You will need to set aside time to work so you can concentrate well.

If you are not used to self-study, then there are a number of points to note. The key to successful self-study is self-discipline, planning your time and establishing a clear ‘contract’ with those around you who will be affected by your studying.

• If possible, work in a room which is away from telephones, televisions and other people.

• Come to an agreement with those living with you that you are to be left alone during study periods. Agree when these are to be.

• Plan your time. Work out a timetable now and treat it as if it were an evening class or some other formal commitment. Don’t expect to work by just grabbing odd opportunities when they arise – they won’t and you will become frustrated at your lack of progress.

• It is better to have a regular time slot which you use whether you feel like it or not! Lots of very short periods or one or two very long ones are not so good. Psychologists have shown that learning is most efficient when study and practice periods are distributed across time in reasonable-sized chunks.

• Ideally, each study period should be between one and three hours. Within each period, try to change the type of activity you are engaged in as much as possible: reading, making notes, working out examples, etc. Everyone finds it difficult to keep their attention on their work for more than about half an hour at a stretch if there is no variety in what they are doing.

If you can, plan to spend two or three evenings during the week (say five hours in total) with half a day at the weekend (three hours). You should then be able to cover the full course in one to two weeks.



How to organize your work

Psychological testing involves some quite complex and technical concepts. While you will not need to go into these in any detail for this qualification, some aspects are included in boxes to give you a good background in testing and to help you differentiate between test administration and test use.

1. Read through each Module quickly to get an overview – do not do the exercises or answer the questions.

2. Go back and work through carefully, making notes as you go.

3. Carry out the practical work.

4. Complete the exercises and answer the questions.

It is best to work through the two Modules in sequence.

Build up your portfolio of work for assessment

All the self-assessment questions (SAQs) and exercises given in the Modules are duplicated in the Assessment Portfolio. Work through them in the order in which they occur in the Modules, but use the Assessment Portfolio as a workbook for writing in. All the work you do should be kept in your Assessment Portfolio as part of the evidence of competence you will need to provide for assessment purposes. Your portfolio should contain all the completed SAQs, exercises, reports and feedback from people you have used in your test administration. Keep all this information together. Your assessor will want to see it as evidence of your competence.

Enjoy your learning and developing your practical skills.


Module 1

An Introduction to Psychological Testing for Test Administrators

OVERVIEW Module 1 introduces two general categories of assessment instrument: ability and attainment tests on the one hand, and measures of interests and personality on the other.

KEY AIMS

Having completed this Module you should be able to:

• Describe the characteristics of a psychological test

• Distinguish between tests of attainment, ability and aptitude, personality questionnaires and interest inventories

• Give examples of each type of test used in occupational assessment


Introduction

As a test administrator, you should always be working under the direction of a qualified test user. A qualified test user is someone who has demonstrated competence in a range of skills associated with the use of tests and is registered with one or more suppliers of tests as someone who can purchase tests. Test users may have various different levels of qualification, but the minimum level recognized by the British Psychological Society (BPS) for test use in occupational settings is Level A. More details of Level A and other BPS qualifications can be found on the BPS’s Psychological Testing Centre website: www.psychtesting.org.uk

All the decisions that require a detailed knowledge of tests and testing (such as which test to use and how to make use of the results) will be the responsibility of the qualified test user. Your role as a test administrator is to assist and support the work of the test user. In order to understand the role of the test user and how you can support that, it is helpful if you know a bit about psychological tests and testing.


About these modules

Most of the information in these modules is written in a ‘normal’ format in the same way as this paragraph is written. This format is used for both

• information that will help you to put your role into context and understand the key role of good, professional, test administration in the testing process and

• the areas of knowledge and skills for which you will be expected to demonstrate competence.

Other information is shown in grey boxes. These boxes contain information that is not needed for the BPS qualification. If you do read it, either now or later, you should be better able to understand some of the terms that the test user may use when talking about testing. However, the materials presented in these units do not provide the level or amount of knowledge and understanding needed for a Level A qualification. Examples of questions from tests are placed in tables in the modules, and these should give you a clearer idea of what different types of test item look like.


1.1 What Are Psychological Tests?

Testing is probably the area of psychology which has had the greatest impact on members of the general public. Most people will, at some time in their life, have completed some form of psychological test. Increasingly, they are likely to encounter such tests as part of the process of getting a job, in career guidance and in the assessment of their career development and training needs. The proper use of well-developed tests in these situations can provide considerable benefits – both for the organizations using them and for the individuals being assessed.

However, poor use of good tests or the use of badly designed tests can create a whole range of problems. These range from bias and unfair discrimination in selection to the giving of bad advice to people seeking help in their search for employment. Poor administration of tests and the accompanying tasks of scoring and converting to standard scores can introduce another layer of bias into the process and thus reduce the value of good tests and compound the problems of badly designed tests. This means that a test administrator needs skills and knowledge to prevent this from happening. These Modules and the accompanying test pack are designed to provide you with the knowledge and skills you need to administer and score tests in a fair and competent manner.

What is a ‘psychological test’?


To help put your role into context and understand the critical importance of professional test administration it is important to understand what a ‘psychological test’ is. In ordinary conversation, a test is something that you take; something that you pass or fail, the results of which are used to make judgements of worth about you. This usage is unfortunate for those who work in psychological testing, as in psychological testing the word ‘test’ refers to a much broader range of assessment procedures. It is probably a good idea to try to avoid using the word ‘test’ if possible, because of the connotations it has; assessment is a preferable term. However, the terminology psychological test has become very well established and so conflict between the two uses of the words is almost unavoidable. This conflict is particularly great when we look at personality or interest assessment. For instance, it is clearly misleading to talk of personality testing since the common meaning of testing leads us to assume that you can pass or fail a personality test and that therefore some personalities are better or worse than others. Of course, this is not so. A personality ‘test’ is called a test only because it has been constructed according to the principles of psychological test theory.

In these Modules we will use the term test in its technical sense to refer to an instrument that has been developed using psychometric principles. This is how the word is used in most of the testing literature and in any technical documentation you will come across. As a test administrator, you need to get clear in your own mind this distinction between the common and the psychological use of the word ‘test’. But you must also remain aware that most test takers will have only the common meaning in their minds when you talk of giving them tests or of testing their interests and personality. For that reason it is important to be very careful how you describe assessment procedures to those who don’t have a technical understanding of psychometrics.

OPEN QUESTION: Pause and reflect for a moment on the use of the word ‘test’.

1. What does the word ‘test’ mean to you personally? What images or memories does it conjure up?

2. Can you think of any other terms which you could use instead to describe a psychological test to someone who was about to have a test administered to them?

Write down your answers.

These questions are aimed at making you think about the notion of testing people, and how people might react to the idea of being ‘tested’. You will have a chance to look back at these ideas when you come to Module 2, which deals with practical test administration.

A psychological test consists of a collection of questions or tasks. These are known as test items. In a simple test, the test taker’s answer to each item is scored and the item scores added up to provide a single measure called a raw score or raw scale score. In more complex tests, there may be several scales. For example, the Sixteen Personality Factor Questionnaire (16PF) has 16 scales; the Occupational Personality Questionnaire (OPQ32) has 32 scales. In each of the above examples, each scale is used to measure some specific aspect of personality. Each scale has its own scale score or raw score. In most tests each item or question counts towards the score on only one scale.

So, what makes a test so different from a list of questions that anyone might devise? You cannot judge whether a list of questions is a psychological test simply by looking at it. While most tests of ability contain a list of questions of some sort or other, potentially the tasks presented to the test taker range widely and can be anything from which a measurement can be taken. For example, some tests assess the speed and accuracy with which the test taker can move pegs from one hole to another; others look at how well people can rotate images of shapes in their mind; yet others may be based on solving anagrams. However, all will have standardized administration instructions, whether these are to be read out by the administrator or presented on a computer screen, and a test administrator needs to be aware of the wide range of instruments and tests that are available.

Test manuals

The things which distinguish whether a list of questions is or is not a psychological test are the technical information and the procedures laid down for its use. This information will be contained in the test’s manual. The technical documentation accompanying a psychological test (usually called a test manual or test user manual) will tell you how to administer and score the test and will tell the test user what conclusions can reasonably be drawn from the results. Sometimes, test manuals are split into a user manual, which focuses on test administration, scoring and interpretation, and a technical manual, which provides all the technical background on the test’s development, its reliability and validity. ‘Norms’, which are used for converting scores obtained on the test into measures on a standard scale, may be contained within the manual or are often provided as separate supplements. As a test administrator you may need to use norms when you come to scoring tests (see Module 2). However, the choice of which norms to use is the responsibility of a qualified test user. The issue of norms is discussed further below and in more practical detail in Module 2. Nowadays it is quite common for user manuals, technical manuals and norms to be available from the test publisher’s website. This is so that the documents can be more easily kept up to date.

Tests differ from more informal assessment procedures in having standardized administration procedures (i.e. each time the test is administered, exactly the same standard procedure is followed). This is done to ensure that all test takers are provided with the same opportunity for doing well. It is crucially important for the test administrator to understand that standardized instructions provided in the test materials must be adhered to and given in the same way for every test taker and to ensure that the instructions are not skipped through or ignored altogether when they appear on a screen in a supervised computer administration.

The test’s documentation provides normative information: this enables us to see how people’s performance on the test compares with that of others. Such information might include data on the average scores for different groups (or ‘populations’) such as males under 30; blue-collar workers; female undergraduates, and so on. As well as providing the average scores for such groups, a test manual will also describe how scores are distributed: that is, in general how likely people are to obtain particular scores on the test. This information enables test users to decide what sort of score is an above average one, what average is, and what is below average.
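One simple way to picture what normative information makes possible is a percentile rank: the percentage of a norm group scoring below a given raw score. The sketch below is only an illustration of that idea. The norm scores are invented, the function name is hypothetical, and counting tied scores as half is one common convention that is assumed here; in practice you would read such values from the published norm tables rather than compute them.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(raw_score, norm_scores):
    """Percentage of the norm group scoring below raw_score.

    Scores tied with raw_score count as half (an assumed convention).
    """
    ordered = sorted(norm_scores)
    below = bisect_left(ordered, raw_score)            # people scoring lower
    ties = bisect_right(ordered, raw_score) - below    # people with the same score
    return 100.0 * (below + 0.5 * ties) / len(ordered)


# Invented raw scores for a hypothetical norm group of ten people.
norm_group = [12, 15, 15, 18, 20, 21, 21, 23, 25, 30]

print(percentile_rank(21, norm_group))  # 60.0
```

Here a raw score of 21 sits above five of the ten people and ties with two, so it is placed at the 60th percentile of this made-up group. The same raw score could fall at a very different percentile in another norm group, which is why the choice of norms matters.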


As a test administrator, you should discuss with the test user which set of norms will be used in any testing session. This will ensure that when the test takers’ scores are converted to standard scores (see Module 2 for a discussion of converting scores) they are being compared with the relevant group.
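The two scoring steps mentioned so far, adding up item scores to get a raw scale score and then converting it against a norm group, can be sketched as follows. This is a minimal illustration under stated assumptions, not any publisher's procedure: the scoring key, the responses and the norm group's mean and standard deviation are all invented, and the z-score and T-score (mean 50, standard deviation 10) are simply two widely used standard scales. The manual for a real test defines the key, the norm tables and the scale to use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Norms:
    """Raw-score mean and standard deviation for one norm group (invented values)."""
    group: str
    mean: float
    sd: float


def raw_scale_score(responses, scoring_key):
    """Add one point for each item answered as the (hypothetical) key specifies."""
    return sum(1 for item, answer in responses.items()
               if scoring_key.get(item) == answer)


def z_score(raw, norms):
    """Distance of the raw score from the norm group mean, in SD units."""
    return (raw - norms.mean) / norms.sd


def t_score(raw, norms):
    """The z-score re-expressed on a scale with mean 50 and SD 10."""
    return 50 + 10 * z_score(raw, norms)


# Hypothetical five-item ability scale: item number -> keyed answer.
scoring_key = {1: "b", 2: "a", 3: "d", 4: "c", 5: "a"}
responses = {1: "b", 2: "a", 3: "c", 4: "c", 5: "a"}

# Norm group agreed with the test user before the session (values invented).
norms = Norms(group="hypothetical graduate applicants", mean=3.0, sd=1.0)

raw = raw_scale_score(responses, scoring_key)
print(raw, z_score(raw, norms), t_score(raw, norms))  # 4 1.0 60.0
```

Swapping in a different `Norms` object changes the standard score without changing the raw score, which is the practical reason for confirming the norm group with the test user first.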

The technical section of a test manual will also include information about how accurate or reliable the scores are. It is vital for test users to know what the margin of error is on a score, if they are to make appropriate use of it. Test users also need to know about the validity of a test. This tells the test user how well the test is measuring what it says it is measuring and what sort of conclusions can be drawn from the scores. The concepts of standardization, reliability and validity lie at the heart of psychometrics and it is essential for test users to know about these.
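For readers who want a feel for how a margin of error relates to reliability, one standard textbook formula is the standard error of measurement, SEM = SD × √(1 − reliability). The sketch below applies it with invented figures. The function names are hypothetical, and the band of plus or minus one SEM is only one common choice; interpreting such bands is the test user's responsibility, and the figures for a real test come from its manual.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), with reliability between 0 and 1."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def score_band(score, sd, reliability, n_sems=1.0):
    """Return (lower, upper) limits of a band of n_sems SEMs around a score."""
    sem = standard_error_of_measurement(sd, reliability)
    return score - n_sems * sem, score + n_sems * sem


# Invented example: T-score scale (SD 10) and a reliability of 0.84.
sem = standard_error_of_measurement(sd=10.0, reliability=0.84)
lower, upper = score_band(score=60.0, sd=10.0, reliability=0.84)
print(round(sem, 2), round(lower, 2), round(upper, 2))
```

With these made-up values the SEM is about 4 points, so an obtained score of 60 is better read as "somewhere around 56 to 64" than as an exact figure. A more reliable test gives a narrower band.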

What do tests measure?

The processes that psychological tests are designed to measure are not concrete things like height or weight. You cannot measure anxiety, spatial ability or motor co-ordination directly in the same sense as you measure your shoe size. Anxiety is not something you can get hold of and measure. Rather, it is a word which provides a useful way of talking about a range of related types of feelings and behaviours. As such, anxiety is a useful idea that enables us to make sense of the way in which a number of different behaviours, signs and symptoms tend to be associated with each other.

TABLE 1.1.1: Some of the major differences between a psychological test and a set of questions

A psychological test

1. The scientific rationale for the test is presented.

2. Its method of construction is described – how the items were created and the criteria adopted for their selection.

3. For each scale, questions are selected which measure just one characteristic of a person.

4. Test administration procedures are documented and standardized.

5. Information is provided on the scores obtained by a large sample of people from a well-defined population.

6. Measures are provided of the accuracy of the test scores; on the degree of error present in any score you obtain.

7. Evidence is provided about the validity of the test: what it does measure and what it does not.

8. Guidance in interpretation is provided which is linked to evidence of validity.

A set of questions

1. No scientific rationale is presented.

2. The method of test construction is not described – someone just thought up the questions.

3. There is no evidence to show whether the final score is a measure of just one aspect of ability or a composite of various different ones.

4. Test administration procedures are undocumented and unstandardized.

5. No information is provided on the scores obtained by other people.

6. No indication is provided of the accuracy or stability of the test scores.

7. No evidence is provided about the validity of the test. Validity is assumed on face value (‘face validity’) or what the test looks to be measuring.

8. Guidance in interpretation may be provided, but it is not linked to evidence of validity. Often, arbitrary ‘cut-off’ scores are provided with text explanations (e.g. ‘If you score between 10 and 20 you will find some tasks difficult to deal with’).

Collect examples of ‘tests’ from Sunday newspapers and other magazines. Look at them carefully to see how close they come to meeting the criteria for a psychological test. Are they just lists of questions?

Summary

• Information about reliability, norms and validity, as set out in the test manual, provides information about what the margin of error is on scores, how groups of people have performed on this test in the past and what their scores mean. This allows us to see how the score of the person being tested compares with those of others and what inferences we can draw from the test scores about other aspects of the person’s behaviour and performance.

• Tests comprise a series of standardized tasks. They are designed so that everyone is given the same, or carefully matched, tasks to do and everyone is given a standard set of instructions for doing them. In administration of a test, these standardized instructions must be followed carefully.

• As a result, competent test users using well-designed tests can make reasonably accurate judgements about people’s capacities or potential to act or behave in certain ways. For instance, this might include the likelihood that an individual would be able to cope with the demands of a particular training course and their potential for success in certain types of work.

The technology underlying the development and use of psychological tests may seem complex and difficult to understand at first – but it is the body of information and statistics about people’s responses to its content that gives a test its particular value and differentiates it from a set of questions.


1.2 Main Types of Psychological Test

In general, tests fall into two broad categories:

1. There are those designed to assess characteristics or attributes, such as personality, beliefs, values, and interests, and to measure motivation or ‘drive’. These are known as measures of typical performance.

2. There are those designed to measure ability, aptitude or attainment. These are known as measures of maximum performance.

This Module is concerned with psychological testing in general and with making you aware of the diversity of psychological tests. It deals with maximum performance tests (ability, aptitude or attainment) and certain types of measures of typical performance (personality, beliefs, values, interests and motivation). These are the sort of tests suitable for group or individual administration and for assessing differences between people relating to occupational issues. In-depth, individual mental ability testing – carried out for diagnostic reasons – and some forms of personality or motivational assessment can involve far more complex processes of administration and require additional specialist training if they are to be used effectively. These are not dealt with in detail in these Modules. The material presented here will provide a basic foundation in test administration in psychological testing on which you could later build further skills and knowledge in test use, and it will cover the requirements for the BPS Occupational Test Administration qualification.


Measures of typical performance


Measures of typical performance generally fall into three main categories: measures of Personality, measures of Vocational or Occupational Interests and measures of Drive, Motivation or Need (see Table 1.2.1 for some examples). Personality concerns the way we characteristically respond to other people and situations: how we relate to other people, how we tackle problems, our emotionality and responsiveness to stress, and so on. While interests are also related to personality, measures of interests focus more on what sort of activities we find attractive and which we would rather avoid. Measures of motivation and need focus on the factors which drive us to action (such as the need for success) or cause us to refrain from action (such as the fear of failure). Many personality and interest measures – either directly or indirectly – also provide measures of need.


(a) Personality inventories. Personality inventories are good examples of tests that assess our preferred or typical ways of acting or thinking. Items that test these characteristics do not have right or wrong answers. Rather, they attempt to measure how much or how little we possess of specific characteristics or sets of characteristics (e.g. gregariousness, empathy, decisiveness). Most instruments designed to measure such characteristics are administered without a time limit and stress the need for people to answer honestly and openly. But, in some situations, such openness may be difficult to achieve (for example, if it is perceived that one’s chances of being selected for a job depend on the results). However, in other situations such problems are less likely to arise where one can be sure that it is in the test taker’s best interests to co-operate and be honest (e.g. in vocational guidance).

(b) Interest inventories. Interest inventories are designed to assess in a systematic manner people’s likes and dislikes for different types of work or leisure activity. Satisfaction at work requires not only possessing the necessary skills to do the job competently but also having sufficient interest in it. Like tests of personality, these are not tests in the sense of having right or wrong answers, and hence they are very different from measures of maximum performance (discussed later). Interest inventories have an obvious application in guidance and in staff development assessment situations, where people may need help in sorting out what they do or do not want to do. They provide a means of exploring new options with people, of suggesting areas of work that they would not otherwise have considered. As with personality assessment, assessing interests may provide a useful, positive way of opening new doors for people in a career guidance context.
Both forms of assessment are less well suited to judgemental situations where any task a test taker is asked to complete will be perceived as a test of whether he or she has the ‘right’ qualities. Both personality and interest assessment inventories are essentially different in kind from ability tests, even though the same psychometric principles apply (the need for reliability, validity and standardisation). Such inventories are the means of providing a more qualitative description of people. Most of the available personality and interest tests are self-report or self-description instruments. That is, they are like a highly-structured, written interview that has been standardised and subjected to psychometric analysis. Hence, if properly used they can provide valuable sources of data about personality and interests to supplement information obtained from other sources (references, interviews, and the like).

(c) Measures of drive, motivation and need. People’s levels of drive or motivation can be thought of as having both state (moods) and trait (permanent characteristics) components. Some people are characteristically more driven than others: some people always seem to be on the go, seeking more and more work or responsibility, while others are the opposite. At the same time, any individual will vary in their level of drive from time to time. On some days they will feel they


have more get up and go than on others. Many personality inventories measure aspects of trait motivation. These are often called needs:

• the need for achievement
• the need to be with other people
• the need to have approval from others
• the need to avoid failure

and so on. Needs motivate us in that they tend to establish our priorities and our goals. Interest measures also provide some indication of motivation. Generally, people strive hardest at those things that interest them most.

TABLE 1.2.1: Types of item used in personality and interest inventories

The following are typical of the form and content of items you might find in personality inventories and interest inventories. (Note: they are not taken from any actual inventories. In practice, the range of item formats and, of course, content is far wider than shown here.)

For each statement, choose (a), (b) or (c)

1. Even when people are trying to be constructive, I find I get upset by their criticisms.
   (a) Yes (b) In between (c) No

2. If I could live my life again, I would do things very differently.
   (a) Yes (b) In between (c) No

For each pair of statements, tick the one most true of you: (a) or (b)

1. (a) I tend to speak my mind whatever the consequences.
   (b) I like to think through the effect of what I might say before I speak.

2. (a) I would rather work for a company with a long history of steady growth.
   (b) I would rather work for an innovative new company that was prepared to take a few risks.

Rate each activity using the scale: (1) Would dislike it a great deal (2) Would dislike it (3) In between (4) Would like it (5) Would like it a great deal

                                                1    2    3    4    5
Working closely with people as part of a team  [ ]  [ ]  [ ]  [ ]  [ ]
Using a computer to analyse data               [ ]  [ ]  [ ]  [ ]  [ ]
Arranging delivery schedules                   [ ]  [ ]  [ ]  [ ]  [ ]

Measures of maximum performance

Measures of maximum performance (ability, aptitude or attainment) measure how well people can do things, how much they know, how great their potential


is, and so on. Many of these measure general, rather abstract, characteristics (e.g. verbal fluency, spatial orientation, numerical reasoning) while others may seem more concrete and functional (clerical speed and accuracy, programming aptitude). The distinguishing feature about such ability tests is that they tend to contain questions, problems or tasks for which there are right and wrong (or good and bad) answers or solutions.

In addition, while tests of typical performance (personality, beliefs, values, interests, and motivation) are usually administered without any time limit on their completion, tests of maximum performance are usually timed. In some cases the time limitation is very strict and the emphasis is placed on how quickly a person can respond to the items. Tests which contain relatively easy items but have a strict time limit are called speed tests. In other cases, the time limit is designed to allow most people to complete all the test items, and the focus is on how many they are able to get right. If the score you get is mainly affected by your ability to answer the questions – rather than your speed – the test is a power test.

If the test is not automatically timed (as it would be in a computer-delivered test), it is the responsibility of the test administrator to ensure that the timing is strictly adhered to (see Module 2 for practical ways of ensuring the timing is accurate). For speed tests in particular, making sure everyone has exactly the same time to do the test is critical if they are to be fairly evaluated.

Some examples of the enormous variety of item types used in maximum performance tests are shown in Table 1.2.2.

TABLE 1.2.2: Examples of the sort of items used in tests of maximum performance

The following examples give some idea of the sort of items used in tests of maximum performance. These are taken from tests of aptitude and ability. In all of these, the illustrations provided are those given in tests as examples or practice items for the test taker. For reasons of confidentiality, we have not included any that would count towards a person’s score on any test. As a test administrator, test users will provide you with access to such items as well as to information about how they are scored. Part of your responsibility as a test administrator is to ensure that the confidential nature of such information is maintained.

The tests illustrated in Figure 1.2.1, in order of presentation, are:

Two verbal tests:
1. The General Ability Tests (GAT) Verbal
2. The Advanced Managerial Tests (AMT) Verbal Analysis

Two spatial (three-dimensional manipulation) tests:
3. The General Ability Tests (GAT) Spatial
4. The Information Technology Test series (ITT) Spatial Reasoning

Two ‘abstract’ reasoning tests:
5. The General Ability Tests (GAT) Non-Verbal
6. The Graduate and Managerial Assessment (GMA) Abstract

Two numerical tests:
7. The Advanced Managerial Tests (AMT) Numerical Analysis
8. The Graduate and Managerial Assessment (GMA) Numerical Reasoning

FIGURE 1.2.1: Example items from verbal, spatial, abstract and numerical tests

1. The General Ability Tests (GAT) Verbal scale example items. Published by NFER-NELSON.
2. The Advanced Managerial Tests practice leaflet (AMT) Verbal Analysis item. © SHL Group plc. Reproduced by permission.
3. The General Ability Tests (GAT) Spatial example item. Published by NFER-NELSON. Reproduced by permission.
4. The Information Technology Test series practice leaflet (ITT) Spatial Reasoning item. © SHL Group plc. Reproduced by permission.
5. The General Ability Tests (GAT) Non-Verbal example item. Published by NFER-NELSON.
6. The Graduate and Managerial Assessment (GMA) Abstract example item. Published by NFER-NELSON. Reproduced by permission.
7. The Advanced Managerial Tests practice leaflet (AMT) Numerical Analysis item. © SHL Group plc. Reproduced by permission. [Chart labels from the item image: Administration (44%), Production (56%); 0–5 (16%), 6–15 (28%), 16–25 (24%), 26–35 (24%), 36+ (8%).]
8. The Graduate and Managerial Assessment (GMA) Numerical Reasoning example item. Published by NFER-NELSON. Reproduced by permission.

1.2.1 SAQ

Pause and reflect for a moment on the differences between measures of maximum performance and measures of typical performance. (The following questions will help to structure your thoughts.)

1. What sorts of attributes are assessed by maximum performance measures and what sorts by typical performance measures?

2. How do they differ in the way in which they are timed?

Answers to SAQs and exercises can be found at the end of the Module.

The tests we have talked about so far all fall into the category of ‘paper-and-pencil’ tests – tests where the questions are printed and answers are in written form. Nowadays, you will find increasingly that such tests are also available in computerized (or partially computerized) forms.

Not all maximum performance tests are about measuring speed of mental operations. There are also a number of different tests that are concerned with psycho-motor ability. For example, hand–eye co-ordination tests are commonly used to select people for training as pilots. Psycho-motor co-ordination tests and other newer types of maximum performance test cannot be carried out using paper-and-pencil technology. They require the use of various items of specialized apparatus, often in combination with a computer. Table 1.2.3 shows examples of these apparatus tests.

TABLE 1.2.3: Examples of apparatus tests

Some apparatus tests are designed to measure dexterity and hand–eye co-ordination. Figure 1.2.2 shows the Crawford Small Parts Dexterity Test (CSPDT). Computer-based tests can provide a wide range of new types of assessment not possible with paper-and-pencil technology. The two screen displays in Figure 1.2.3 are from MICROPAT – a series of tests used for selecting pilots for training. The first (called LANDING) is like a simple computerized flight simulator. The second (called SCHEDULE) requires the test taker to keep track of a changing complex display of information and to make quick and effective decisions.

Another large category of apparatus tests is that of work samples. These are described later in this Module.

FIGURE 1.2.2: The Crawford Small Parts Dexterity Test
© The Psychological Corporation. Reproduced by permission.

FIGURE 1.2.3: Examples of computer-based tests
The two tests illustrated are called LANDING and SCHEDULE and form part of the MICROPAT battery of tests developed for use in the selection of people for training as pilots. Crown copyright.

Differences between ability and attainment tests

Tests of maximum performance can be divided into those which assess what we have learned and those which assess our potential for learning new things in the future. We call those which assess what we know or what we can do, tests of achievement and attainment, while those which assess our potential are called ability and aptitude tests.

Ability and aptitude tests are designed to provide an indication of a person’s potential to succeed in a wide range of different activities (e.g. coping with the academic demands of a degree course or being able to acquire the competences needed for a new job). Although such measures will depend somewhat on a person’s previous experience and learning they are used to draw inferences about the person’s potential. Attainment and achievement tests, on the other hand, specifically assess what people have learned and the skills they have acquired (e.g. shorthand and typing tests; knowledge of motor mechanics). What they have learned will, of course, depend partly on their ability – so scores on the two types of test are often related. However, the focus of attainment tests is on what has been learned and not on how or why this learning was acquired. School, college and university examinations are all methods of assessing achievement, and while people (such as potential employers) may draw inferences from them about a person’s ability or suitability for a job, the measures themselves are not designed or intended as measures of ability. The main difference between ability and attainment tests lies in the way scores are used rather than in the actual test items. Many aptitude tests contain items that look very similar to those one could find in attainment measures (e.g. vocabulary tests and mental arithmetic). However, attainment tests are retrospective: they look back at what has been learned, what is known, what skills people have acquired. Ability tests, on the other hand, are prospective: they look forward to what people are capable of achieving in the future. However some abilities cannot be measured until the test taker has a certain level of attainment. For example, the ability to reason using words (verbal reasoning) cannot be measured until the person is able to read. 
While ability is needed in order to attain new knowledge or skills, a test which shows that someone has reached a certain level of attainment does not tell us much directly about their ability. For example, writing an essay on Roman Britain requires the attainment of relevant knowledge and essay-writing skills. The ease with which this knowledge and these skills are attained will depend on a person’s ability. The quality of the essay, however, while providing a good indicator of the writer’s attainment, would not be of much use as an ability or aptitude measure. In the same way, tests of ability may provide very little direct information about a person’s level of attainment. For example, detecting regular patterns embedded in a background of confusing lines would be unlikely to serve a useful purpose as an attainment test, but it can provide very useful information about spatial ability.


People can only do verbal reasoning tests or numerical reasoning tests if they have learned the relevant language or number system. This might suggest that they are attainment tests, but this is not so, as they are not designed to measure how well we have learned our language but how good we are at reasoning with it. While they are dependent on the effects of experience, their function is to provide an estimate of potential, or in other words, ability. Test users have to be careful with such tests that they are fair to all those who are being tested by ensuring that the people they test have a sufficient knowledge of a particular language. Just as testers would want to ensure they could understand the test instructions, so they need to be confident that differences between the test takers on the test are due to differences in their abilities to reason with language. We don’t want to confuse differences in reasoning ability with differences due to problems of basic literacy or lack of fluency in the language being used for the test. To summarize:

• Attainment tests measure what has been achieved; ability tests measure what can be achieved.
• Tests of attainment and tests of ability sometimes use the same items or content, but the scores are used differently.
• Certain items are more relevant to attainment than to ability tests, and vice versa.
• Some abilities cannot be measured until there is a certain level of attainment.

Work samples, trainability tests and job simulations

These require separate mention as they are of special interest in occupational testing and test administration. They cover the range from aptitude to attainment.

A work sample test is one in which the task has been taken from a job. All work samples assume that you are selecting experienced people. The task is done under standardized conditions. A typing test used for the selection of secretarial staff is an example. It assumes that the applicant has some measure of typing skill, and sets out to see how much. So, it is clearly an attainment test. Another example is the far more complex flight deck simulator check rides used for selecting pilots. These are only usable for the selection of qualified pilots – to select people for initial training as pilots you would have to use aptitude tests, not work sample tests.

Trainability tests, on the other hand, are designed to see whether someone is likely to be able to cope with the training required to do a job. Typically these consist of a highly structured short training course with a test of performance at the end. The test that comes after the training is very much like a work sample test. Trainability tests have been developed for a range of occupations, from fork-lift truck driving and sewing machine operation to air crew training.

Job simulation exercises are typically met with in the multi-method, multidimensional, multi-assessor procedures that come under the general heading of the Assessment Centre Method. These kinds of procedure often form the basis of assessment for management selection and development and are widely used for selection into the military or government service. The job simulations may take the form of in-tray exercises, analysis of complex


management situations, group problem-solving exercises and so on. They start from the assumption that the applicant does not yet possess the requisite knowledge or skill, but that the underlying ability will manifest itself when he or she works through an exercise that simulates the broad demands of the job in question. Job simulation exercises are usually designed to be accessible to all applicants, no matter what their specific work experience. Thus they are designed to assess aptitude rather than attainment. Where they differ from other aptitude measures is in the assumptions that have to be made about the shared experience of the applicants. For a verbal ability test, we assume a shared experience of the relevant language; for a job simulation exercise we may have to assume some degree of shared work experience – for example, that all applicants have worked in an office and dealt with memoranda.

Dexterity tests

Mention was made in Table 1.2.3 of another category of test: dexterity tests. These are designed to assess the speed of movement, precision of fine motor control and so on that are aspects of motor co-ordination. These are properly considered as aspects of ability, as they underlie the attainment of many actual performance skills (e.g. driving a car, assembling electrical components on a circuit board or operating a lathe). The key difference between dexterity tests and other tests of ability is that they concern different domains: action and performance on the one hand, thinking and cognition on the other.

Apparatus tests: issues for test administration

It is clear from the descriptions above that administration of these tests will be more complex and require more from the test administrator than will the paper-and-pencil or the computer-administered tests. Such tests often involve one-to-one interaction with the person who is taking the test, and careful observation and checklists for scoring on the part of the test administrator. Although the rules of using standardized instruction and scoring will be the same, such tests will require careful familiarization and practice on the part of the test administrator before they are ready for the test takers.

Ability versus aptitude

Aptitude tests are designed to be job-related and they tend to have names that include job titles (Programmer Aptitude Test Battery; General Clerical Test, etc.). Ability tests, on the other hand, tend to be named after the abilities or processes that they are designed to measure (Spatial Orientation, Diagrammatic Reasoning, etc.). The difference in terminology reflects mainly a difference in the use of the tests. The differences lie in the types of item that are collected together to produce the tests, rather than in the content of the items. Thus we tend to talk of programmer aptitude tests but not programmer ability tests; verbal ability tests but not verbal aptitude tests. In practice there is considerable overlap between these categories. Most aptitude tests contain items that are very similar to those used in tests of specific ability. Many ability test manuals provide information on their job relevance that is just as good as that available for aptitude tests.


Specific versus general measures of ability and aptitude

While measures of ability can be very general (indicating a person’s tendency to be successful in most areas of life), they can also be quite specific (for example, measures of verbal reasoning ability). Aptitude tests, on the other hand, tend to be quite specific – relating to the aptitude for attaining some particular type of knowledge or skill. We can think about the differences between general and specific tests by imagining looking at a scene through a camera with a zoom lens:

– With a wide-angle shot, you get a good general impression of what is in the scene and what sort of scene it is – but you don’t get any of the details.
– By zooming in on some particular part of the picture, you get a lot more detail about that part – but lose sight of the complete picture.

General ability tests give you the wide-angle shots. Specific ability tests provide a more detailed close-up set of pictures. Aptitude tests, because they tend to be job-related, also tend to focus on specific abilities. Often they are brought together in aptitude batteries that provide a set of detailed snapshots covering selected parts of the ability domain. As always, the distinction is not clear-cut: you have to administer each test or battery of tests in accordance with the instructions given.

1.2.2 SAQ

1. What is the key difference between attainment and aptitude tests?

2. Why is it unhelpful only to look at the items that make up these tests?

Computer Adaptive Testing

Computers have provided the means for developing many novel types of test including adaptive tests. These are tests where the items are selected from a large database or bank of items held on the computer. Each person who takes the test may be given a different selection of items as the computer picks just those items that provide most information about that particular person’s level of ability. This means that very able people don’t waste time doing easy items and less able people are not put off by being given very difficult items. Therefore adaptive tests tend to be more efficient than conventional ones as it takes less time to get the same answer. However, they are more difficult to produce. These tend to be most widely used by large public sector organizations, but use of this technology is becoming increasingly common now, and many customized tests can be produced.


Computer Adaptive Tests are based on a rather different psychometric theory from that used to make traditional paper-and-pencil tests. This theory (known as Item Response Theory) is also often used to construct non-adaptive tests. Using large banks of questions, lots of comparable tests can be produced with the same characteristics, so that everyone does not have to be given the same test.
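To make the idea of picking "those items that provide most information" concrete, here is a minimal sketch in Python. It assumes the simplest Item Response Theory model (the one-parameter or Rasch model), in which each item has a single difficulty value on the same scale as the person's ability. The item bank, the function names and the crude ability update are all hypothetical illustrations; operational adaptive tests use more elaborate estimation methods and are not built this way by any particular publisher.

```python
import math

def p_correct(ability, difficulty):
    """Rasch-model probability that a person answers an item correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def item_information(ability, difficulty):
    """Information a Rasch item gives at this ability: p * (1 - p).
    It is highest when the item difficulty matches the ability."""
    p = p_correct(ability, difficulty)
    return p * (1.0 - p)

def pick_next_item(ability, bank, used):
    """Choose the unused item that is most informative at the current estimate."""
    candidates = [name for name in bank if name not in used]
    return max(candidates, key=lambda name: item_information(ability, bank[name]))

def update_ability(ability, difficulty, correct, step=1.0):
    """Simplified update (an assumption for illustration only): move the
    estimate up after a right answer and down after a wrong one."""
    observed = 1.0 if correct else 0.0
    return ability + step * (observed - p_correct(ability, difficulty))

# Hypothetical four-item bank: item name -> difficulty.
bank = {"easy": -2.0, "medium": 0.0, "medium_hard": 1.0, "hard": 2.0}

ability = 0.0                                  # start at an average estimate
first = pick_next_item(ability, bank, used=set())
ability = update_ability(ability, bank[first], correct=True)
second = pick_next_item(ability, bank, used={first})
print(first, second, ability)
```

In this sketch a test taker who answers the first item correctly is next given a harder item, which is the behaviour the text describes: able people are not kept on easy items, and less able people are not pushed onto very difficult ones.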

1.3 Controlling the Quality of Psychological Testing

For any particular assessment problem, a range of assessment methods may be available. How do you control the quality and effectiveness of the assessment procedure? For many ‘informal’ assessment procedures, for instance an unstructured interview, it is very difficult to assess, let alone control, the quality of the process. However, psychometrics provides a technology for measuring the quality and effectiveness of assessment. When choosing between various assessment tools, test users have six main factors to consider:

• The SCOPE, including the range of attributes covered by the test and the range of people with whom the test can be used.
• The RELIABILITY or ACCURACY of the measures.
• The VALIDITY or RELEVANCE of the measures.
• The ACCEPTABILITY to potential users.
• The PRACTICALITY of the test regarding cost, equipment and facilities.
• The FAIRNESS of the test to various groups of people.


The issues of scope, reliability, validity and practicality are matters associated with choosing the right test for the purpose, and this is a function carried out by qualified test users. As a test administrator, however, you need to be familiar with issues affecting acceptability and fairness.

Scope

Scope is concerned with the range of attributes the method covers. This aspect of scope has been covered in Sections 1.1 and 1.2. We have already discussed this issue in some detail when talking about the differences between general and specific ability. When we are looking for an assessment tool we need to know if it will:

• measure how well a person can carry out a particular type of activity (e.g. use a word processor) or measure the person’s ability to deal with a wider range of activities (e.g. operate electronic equipment). That is, we will need to know the breadth of the measure.
• provide just a general assessment or a detailed breakdown of skills and abilities. That is, we need to know how specific the measure is.

Assessment methods vary in both their breadth and their specificity. A test of general mental ability that samples several domains may be regarded as ‘broad’ and ‘general’. A single specific ability test that measures a single ability would be narrow and specific. A comprehensive battery of specific ability tests (like the GATB, DAT, or the Personnel Test Battery) that covers several domains but provides separate measures of each one would provide both ‘breadth’ and ‘specificity’. Using the zoom lens analogy mentioned earlier, the coverage of the test or test battery refers to how much of the total picture it covers. If it is a specific aptitude battery, it will provide a set of detailed pictures that cover various parts of the whole scene as well as providing a general measure of ability. If all we require is a measure of general ability, then it is more economical to obtain it with a short general ability test rather than use a whole test battery. If we only need to know about one specific area of ability, it is best to use a single specific test rather than a general ability test or a whole battery of specific tests.

Scope: norms and the process of referencing scores

Scope also refers to the range of people for whom the test is applicable. A common use of tests is to compare one person’s performance against that of other people. The quality and quantity of the information we have about such reference or norm groups will, clearly, affect the value of such comparisons. If we want to see how well Jim Smith has performed on a mechanical reasoning test in comparison with engineering training applicants we would want to compare his particular score with those of a ‘typical’ sample of engineering training applicants. Our norm group would need to be representative of such applicants and be based on a large enough sample of people to be reliable. In practice, this means it would need to contain a few hundred people rather than just 20 or 30. The choice of norm group is the responsibility of the test user, but the test administrators will need to ascertain which group is to be used when they convert the scores.

Norms

General population norms are intended to represent everyone in the country. To give this degree of representation, they need to be quite large groups and are typically based on 500 or so up to many thousands of people. In practice, the populations on which norms are based tend to be more specific. For example, they may be for middle-aged adults only, they may exclude special groups with disabilities, or they may not include adequate ethnic minority group representation.


Good normative information is vital for any test if it is to be used in a descriptive fashion, for example, ‘John’s performance was well above average in comparison with other craft apprentices’. Where tests are used predictively, and where there are known relationships between raw scores on the test and job performance, norms are not needed. However, this is the exception rather than the rule. Tests for general use all require good normative information. The development of norms and how raw scores on a test are transformed into other ‘normed’ measures are described in detail in Section 1.4.
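As a rough illustration of what referencing a raw score to a norm group involves, the sketch below computes a percentile rank, which is one common kind of ‘normed’ measure. It is not the procedure of any particular test manual: the function name is hypothetical, the scores are invented, and the group of ten is far smaller than the few hundred people a real norm group would need.

```python
def percentile_rank(raw_score, norm_scores):
    """Percentage of the norm group scoring below raw_score,
    counting people with the same score as half below."""
    below = sum(1 for s in norm_scores if s < raw_score)
    equal = sum(1 for s in norm_scores if s == raw_score)
    return 100.0 * (below + 0.5 * equal) / len(norm_scores)

# Invented raw scores for a (much too small) norm group of applicants.
norm_group = [12, 15, 15, 18, 20, 21, 23, 25, 27, 30]

# A test taker with a raw score of 21 is placed relative to this group.
print(percentile_rank(21, norm_group))
```

The same raw score would produce a different percentile against a different norm group, which is why the administrator needs to know which group the test user has chosen before converting scores.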

Reliability

What reliance can you place on the score somebody obtains? How accurate is it? For psychological tests, the precision or accuracy with which they measure whatever they are intended to measure is called reliability. In many ways it is the central idea in psychometrics. Reliability is assessed in two main ways: by measuring the consistency of the score and its stability.

• The items that make up a test scale are all designed to measure the same trait or attribute. We can say that a test provides a consistent measure of an attribute if the responses given to each item or question are related.
• A test is stable if people tend to get the same score every time they take the test.

Both consistency and stability are expressed mathematically as correlation coefficients (ranging from zero to 1). Zero means you cannot place any reliance on what the test score tells you; a coefficient of 1 means that the score a person obtains is a perfectly accurate measure of the amount of the attribute that they possess.
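The two ideas can be sketched numerically. The Python below computes a test-retest correlation as an index of stability and Cronbach's alpha as an index of consistency. Alpha is named here as one widely used consistency coefficient, not as the method this text prescribes, and all of the scores are invented for illustration. A correlation can in principle be negative, but reliability coefficients are normally reported in the zero-to-1 range described above.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def sample_variance(values):
    """Variance with the usual n - 1 denominator."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

def cronbach_alpha(people):
    """people: one list of item scores per person (same items for everyone)."""
    k = len(people[0])
    item_vars = [sample_variance([p[i] for p in people]) for i in range(k)]
    total_var = sample_variance([sum(p) for p in people])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Stability: invented scores for five people tested on two occasions.
first_sitting = [10, 12, 15, 18, 20]
second_sitting = [11, 12, 14, 19, 20]
print("test-retest correlation:", pearson(first_sitting, second_sitting))

# Consistency: invented responses of four people to three items.
responses = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5]]
print("Cronbach's alpha:", cronbach_alpha(responses))
```

A high value on the first index means people keep roughly the same rank order from one sitting to the next; a high value on the second means the items rise and fall together across people.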

Validity

Clearly, there is little point getting a very accurate measure of something if what you are measuring lacks relevance. In psychometrics, the relevance of a measure – what you are able to infer from or about it – is called its validity. Questions about the validity of a test are essentially questions about whether it measures what it claims to measure. Validity is assessed in a variety of ways. Primarily, these include examining a test’s relationship with other tests of various types and exploring its ability to explain or predict people’s performance or future attainment. If the measures provided by a test are unreliable (i.e. inaccurate), then there is no point asking about their relevance. However, accurate measurement does not necessarily imply relevant measurement.


Accuracy or reliability of measurement is a property of the test – or the assessment procedure. In measuring someone’s height, the accuracy of the measurement is a result of the sort of ruler you use and how you do the measurement. The validity of the measure, however, is not really an intrinsic property of the test. Validity depends on why the test is being used. A test will be valid for some purposes and not for others.

Acceptability

The best assessment procedure in the world will be of little use if people refuse to take part in it, so a test user has to consider whether the test taker can be expected to co-operate in the procedure. In general people find tests acceptable when:

• the reasons for taking the tests have been carefully explained to them
• the test appears to be relevant
• they are given feedback about their results.

While tests that appear to be highly job-related (e.g. work samples) are generally more acceptable to people who are being selected for work or undergoing vocational guidance, their value depends very much on the purpose of the assessment. The person who requires general vocational guidance will, quite rightly, not be impressed by being given a programmer aptitude battery – as it would be seen as prejudging his or her suitability. However, they can readily be shown the value of obtaining a measure of their general ability level. This is another issue that the administrator would be advised to discuss with the test user. Answering questions from test takers involves having good factual information with which to respond, and reasons for test choice is an example of the information that may be needed. There are many ethical and practical issues associated with the use of tests in industry and commerce. These all have an impact on the acceptability of tests to test takers and to the organization using the tests.

Practicality As with acceptability, an impractical test is as useless as an unacceptable one. Practicality concerns issues such as what it costs; what training is required to use it; how long it takes to administer, score and interpret; what equipment is needed. The quality of the information that an assessment procedure provides must be weighed against the cost of obtaining that information. When compared against other assessment procedures:

• psychological tests are cost-effective as they provide a lot of accurate, relevant information in a short time;
• psychological tests provide information that it is very difficult to obtain using other methods.


However, there are costs:

• There is a significant ‘start-up’ cost for training, purchase of initial sets of materials and the establishment of facilities for testing.
• There are recurrent costs – test materials, staff time and so on.

People who are not likely to use psychological assessment regularly will find it is probably more cost-effective to make use of outside qualified assessors. Where there is a need for regular use of tests, then the initial costs of training and purchasing materials will soon be recovered.

Fairness It is critically important that any assessment procedure is fair. For the person being assessed, the need for fairness is obvious. However, it is equally important to the assessor or assessing organization. First, the adoption of unfair practices can make you liable to prosecution. While it is still a fairly rare occurrence in this country, prosecution is quite common in the United States. Second – and perhaps more importantly – unfair use of tests in selection can result in a cost to your organization if it produces a drop in the general quality of your workforce. The simple fact that individuals or groups of people have different scores on a test is not in itself unfair. Indeed the whole point of testing for selection is to discriminate between people. You can only do this if there are differences. The issue of fairness arises when a difference between people that is not relevant for the job is used as a basis for selection. The key question in evaluating fairness is: are the results for different groups of people (e.g. males and females; people from differing ethnic backgrounds; people who vary in age) likely to differ systematically for reasons that have nothing to do with the relevance (validity) of the test? If the answer is ‘yes’, the test is unfair. In practice, establishing whether a test – or any other assessment procedure – is ‘fair’ or not is very difficult. This is one of the issues the test user must consider in choosing whether to use tests or not, and in deciding which test or tests to use. Module 2 discusses fairness in so far as it relates to test administration.



Summary of modules 1.1 to 1.3


• Tests comprise a series of standardized tasks. They are designed so that everyone is given the same, or carefully matched, tasks to do and everyone is given a standard set of instructions for doing them.

• Test documentation provides information about how groups of people have performed on this particular test in the past. This allows us to see how the score of the person being tested compares with that of others and what inferences we can draw from the test scores about other aspects of the person’s behaviour and performance. As a result, competent test users using well-designed tests can make reasonably accurate judgements about people’s capacities or potential to act or behave in certain ways.

• In general, tests fall into two broad categories: measures of typical performance (designed to assess dispositions, such as personality, beliefs, values, and interests, and to measure motivation or ‘drive’) and measures of maximum performance (designed to measure ability, aptitude or attainment).

• Attainment tests measure what has been achieved; ability tests measure what can be achieved. Tests of attainment and tests of ability sometimes use the same items or content, but the scores are used differently. Some abilities cannot be measured until there is a certain level of attainment.

• What a test measures cannot be deduced simply by looking at the items. You have to look at the technical documentation as well to see what the function of the test is, what scores on the test imply, and what inferences can be drawn from them.

• For the measurement of attainment we make direct inferences from a person’s test performance to conclusions about their attainment of the skills and knowledge needed to complete the test. For the measurement of ability we make indirect inferences from a person’s test performance to conclusions about the ability that may underlie that performance.

• Tests of ability may be either speed or power tests. Speed tests contain a lot of very easy items, and the measure of performance is the number of items completed in a fixed time. Power tests either have no time limit or a very generous one.


• Several different aptitude tests are often brought together to form what is known as a test battery.

• There are six main factors to consider in relation to quality control in testing:
    • The SCOPE, including the range of attributes covered by the test and the range of people with whom the test can be used.
    • The RELIABILITY or ACCURACY of the measures.
    • The VALIDITY or RELEVANCE of the measures.
    • The ACCEPTABILITY to potential users.
    • The PRACTICALITY of the test regarding cost, equipment and facilities.
    • The FAIRNESS of the test in relation to various groups of people.


1.4

Scales and Measurement

In this section you will learn about scales and measurement. Psychological tests produce scores – for example, number of questions answered correctly or number of errors – that are used to indicate the amount of some underlying attribute the person has. As a test administrator you will be involved in scoring tests to obtain these scores and with carrying out various procedures on the scores to render them more meaningful for the test user. Before reading any further, carry out Exercise 1.4.1.

EXERCISE 1.4.1: Self-administration of Test A Take Test A from the Test Pack and complete it yourself following the instructions given in Section 3 of the Test Pack. Make sure you keep exactly to the time limit: use a stop watch or a watch with a second hand. (Module 2 will deal with administering tests to other people.) When you have completed this, use the scoring key from Section 3 of the Test Pack and give one point for every correct answer (nothing for incorrect or unanswered items).


Raw scores


As Test A contains 25 questions and one point is allocated for each correct answer given, the total score should be somewhere between 0 and 25. This is called the raw score or raw scale score. In the case of Test A, the raw score is the total number of correct answers. A raw score is the absolute score a person gets on a test. It is absolute in that it does not depend on the scores other people might get, or on who else takes or has taken the test. Scores which are related to how other people perform on a test are called relative scores. Normative scores (or norms) are relative scores.

For other types of test, each question may have alternative answers which are worth different amounts. For example, many personality inventories provide three alternatives for each item, with the alternatives counting 0, 1 or 2 points. Thus, if you had a test containing 20 items of this sort, the minimum score would be zero and the maximum would be 40. Scales can even be designed where the raw scores range from negative values through zero to positive values. Look at example items for a Mood Inventory in Table 1.4.1.

TABLE 1.4.1: Example items for a Mood Inventory

The following are six items, which might be used in a Mood Inventory, to show one way in which a scale might be constructed when items represent opposite ends of a scale. People are asked to rate how they feel now in relation to each adjective. They circle ‘0’ if it is quite unlike how they are feeling through to ‘3’ if it is very much like the way they feel. (In a genuine inventory of this type, these items would be mixed up with others relating to other aspects of mood.)

                   Unlike me              Like me
1. SAD                 0       1       2       3
2. DEPRESSED           0       1       2       3
3. DOWN                0       1       2       3

4. HAPPY               0       1       2       3
5. ELATED              0       1       2       3
6. OPTIMISTIC          0       1       2       3

Items 1 to 3 are ‘negative’ ones, while items 4 to 6 are ‘positive’ ones. To produce a scale score which takes this into account, scores for negative items are subtracted and those for positive ones added. Thus, we would expect someone who was in a very positive mood to circle ‘0’ for the first three items and ‘3’ for the next three. Someone who was feeling very upset would circle ‘3’ for the first three and ‘0’ for the others.
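The scoring rule just described can be sketched in a few lines of Python. This is only an illustration: the function name `mood_raw_score` and the sample ratings are invented for the example and do not come from any published inventory.

```python
# Illustrative sketch of the scoring rule for the six items in Table 1.4.1.
# Ratings for positive items are added; ratings for negative items are subtracted.
NEGATIVE_ITEMS = ("SAD", "DEPRESSED", "DOWN")
POSITIVE_ITEMS = ("HAPPY", "ELATED", "OPTIMISTIC")


def mood_raw_score(ratings):
    """Return the raw scale score for a dict mapping each item to a rating of 0 to 3."""
    for item in NEGATIVE_ITEMS + POSITIVE_ITEMS:
        if ratings[item] not in (0, 1, 2, 3):
            raise ValueError(f"rating for {item} must be 0, 1, 2 or 3")
    positive_total = sum(ratings[item] for item in POSITIVE_ITEMS)
    negative_total = sum(ratings[item] for item in NEGATIVE_ITEMS)
    return positive_total - negative_total


# Made-up ratings for one person.
example = {"SAD": 1, "DEPRESSED": 0, "DOWN": 2,
           "HAPPY": 2, "ELATED": 1, "OPTIMISTIC": 3}
print(mood_raw_score(example))  # (2 + 1 + 3) - (1 + 0 + 2) = 3
```

Because the negative items are subtracted, a person with a mixed set of ratings can finish with a score close to zero.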

1.4.1 SAQ

Looking at Table 1.4.1: What is the maximum raw score obtainable? What is the minimum raw score obtainable?

Scales It is common to talk of measuring along a scale. We talk of ability being a scale which goes from low to high scores. Thus scores obtained on a test of some characteristic are generally referred to as scale scores. The number of items you got correct on Test A is your raw scale score. For some tests, all the correct items are counted together to produce just one raw scale score. For other tests, the scoring procedure may divide the items into two or more groups, with each group of items being used to produce a raw scale score. Look at the example in Table 1.4.2. This contains items designed to measure verbal ability and numerical ability.

• If these items are all mixed together, we tend to say that we have a single test which produces two scale scores (one for verbal and one for numerical ability).

• If the two sets of items (verbal and numerical) are presented as two separate tests we would call it a test battery (containing two tests: one of verbal ability and one of numerical ability).

• On the other hand, we might use the test to provide a measure of general ability. In that case, all the items would be used to produce a single scale score.

In practice, real tests would have a lot more than three items for each scale. With only three items we would not expect to get a very accurate picture of someone’s ability.

TABLE 1.4.2: Example items for a general ability test

The following six items are typical of the sort of items you might find in a general ability test. The first three are VERBAL ABILITY items and the last three are NUMERICAL ABILITY items. For each question, circle the correct answer:

1. HAND is to ARM as FOOT is to:
   SHOE    GLOVE    TOE    BODY    WRIST    LEG

2. Which of the following means the same as GLASS:
   TUMBLER    WOOD    CUP    WINDOW    DOOR

3. The following two words are alike in some way: VIOLET and ROSE. Which of the following is unlike both of the above words:
   FUCHSIA    PRIMROSE    FLOWER

4. 3, 6, 10, 15, 21 . . . Which number comes next?
   25    36    28    33    22

5. 6 is to 18 as 4 is to:
   16    8    12    4

6. Which of the following cannot be divided exactly by 7:
   49    35    27    56    77

The answers to the above questions are:
1. LEG   2. TUMBLER   3. FLOWER   4. 28   5. 12   6. 27

In Module 1.2 we saw how tests can be classified under two main headings – measures of maximum performance and measures of typical performance. Measures of maximum performance include both general ability tests and tests of specific abilities and aptitudes:

• General ability tests tend to mix different types of item together (for example, verbal, numerical) and produce a single score for the whole test. They tend not to make use of separate scores for each item type.

• Specific ability or aptitude tests (or aptitude test batteries) tend to keep the items for each type of ability in separate tests or ‘subtests’, with a score for each one. These scale scores may sometimes be added together to produce a measure of general ability.

Tests of typical performance:

• Personality and interest inventories tend to mix all the items together in one questionnaire. These are then marked using scoring keys which separate the items and produce one score for each of the scales. (See Table 1.4.3 for some examples of the sort of items you might find in personality inventories.)

TABLE 1.4.3: Typical self-report inventory items

Have a look at the P5 Personality Inventory from the Test Pack Section 2 for an example of a complete inventory. This inventory is designed to measure five major aspects of personality. Items typical of those you would find in self-report personality inventories are:

1. I would rather attend:
   (a) courses which teach facts
   (b) courses which teach theories

2. When travelling by public transport, I prefer:
   (a) to talk to other passengers
   (b) to keep myself to myself

3. When I have some task I must do, I prefer:
   (a) to plan and organize everything before I begin
   (b) to sort things out as necessary as I go along

4. When I make a decision, I usually let:
   (a) my heart rule my head
   (b) my head rule my heart

5. I would rather spend time with:
   (a) people who have creative ideas
   (b) people with practical skills

Absolute and relative scores: raw scores and normative scores The score a person gets on a test is known as the raw score. Raw scores are absolute quantities. For ability tests they are usually the number of items someone gets right, but they could also be the time taken to complete some activity or some other more complex measure of performance.

We have to be very careful when we interpret raw scores on a test. The raw score is important for what it tells us about a person’s ability. If someone gets a raw score of zero on a test, it does not mean that they have no ability – only that they have failed to reach the lowest point scored on that test. Imagine you lived in a subzero climate, and only had a thermometer which read from zero degrees upwards. According to your thermometer it would always be zero degrees. This does not mean that there is never any temperature at all, only that the temperature is too low to be measured by this scale. The problem lies in having the wrong thermometer, so you are unable to measure below zero degrees.


In addition to raw scores, we tend to make great use of normative scores or normed scores. These are derived from raw scores and provide a way of describing how well a person has done relative to other people. This is an important part of the process of the test development and standardization. In relation to height and weight, it is like saying: John Smith is 5ft 8in (raw score) which is very tall for his age (normative statement). Mary James weighs 8 stone (raw score) which is underweight for her age (normative statement). In both examples, the raw score is an absolute value and provides one sort of information. The normative statement, on the other hand, provides additional information which helps us interpret the implications of the absolute score. For many everyday measures, we ‘know’ what they mean because we are familiar with the scale. We know that 6ft 3in is tall for people and that 4ft 6in is short; we know that a four-mile walk will take about an hour; that 80 degrees F is quite hot. We have implicit normative information about these scales and can ‘think’ in them. When the UK currency changed from the complex £sd system (12 pence to one shilling and 20 shillings to one pound sterling) to the decimal £p system (100 ‘new’ pence to one pound), people had to go through an extensive period of translating the new money into ‘old’ money in order to know how much it was worth. Similarly, adjusting to the change from Fahrenheit to Centigrade for weather reports has been difficult for many people in the UK.

Norm-referenced, self-referenced, criterion-referenced and domain-referenced measures Most psychological measures are carried out using raw score scales which have no implicit normative meaning. We have nothing we can directly refer them to in order to make sense of them. Therefore we have to relate them to something else. There are four main ways we get round this problem of assigning ‘meaning’ to a score:

• norm-referencing
• self-referencing
• criterion-referencing
• domain-referencing

For test administration we will only consider the first two of these.

Norm-referenced scores – comparing people with other people A norm-referenced score defines where a person’s raw score lies in relation to the scores obtained by other people (that is, the norm group). The reason for using norm-referenced scores is to see whether the person is below average, average, or above average. Such scores are relative measures as they depend on who the ‘other people’ are. A given ability score may be low when compared against a university graduate norm group and high when compared against a sample of people drawn from the general population. Typically norm-referenced scores are expressed either as percentiles or on one of a number of standard score scales. We will look in some detail at these two types of score in a later section of this Module.



Self-referenced or ipsative tests – comparing people with themselves Self-referenced tests are those where people are asked to make choices between items from different scales. For example, you may be asked to say which of two statements is ‘Most like you’ or which of four statements are ‘Most like you’ and ‘Least like you’. There are some examples in the box below. This type of inventory is quite complicated to score, and in many cases you will find that hand-scoring keys are not published and that responses are scored electronically – see Module 2 for further details about different methods of scoring.

Self-referenced or ipsative tests

Self-referenced tests are sometimes referred to as ipsative tests. In an ipsative test, the scores on each scale are dependent on each other to some degree. In a fully ipsative test, the degree to which scores on one scale are dependent on scores on the others depends simply on the number of scales. For example, if you have only two ipsative scales, then whatever you score on one scale fixes what the other scale score must be. What you score on one reduces the freedom for what you might score on the other scales to vary. As the number of scales increases, so this reduction in ‘freedom to vary’ decreases.

Number of items attempted on a test and number of items not attempted are two ipsative scales. As items must either be attempted or not, then these two scales are totally dependent on each other – as the score on one goes up, so the other must go down. Ipsative tests will always have at least two – usually more – scales.

Ipsative tests are quite common amongst personality and interest measures where typical rather than maximum performance is being looked at. Ipsative tests are a useful complement to non-ipsative ones: each tells us different things about a person. However, considerable caution needs to be exercised when interpreting the results from ipsative tests. As the interpretation of results is the responsibility of the test user, we do not need to go into this in any detail here.

Self-referenced scores use the person taking the test as their own ‘norm’. To do this, the test scores have to be derived in a special way. Let us first clarify how norm-referenced measures can be used to talk about both differences between people and differences within people. Compare the following statements about two people’s scores on two personality scales (‘need for achievement’ and ‘need for other people’):

1. John’s need for achievement and his need for other people are both below average.

2. John’s need for achievement is stronger than his need to be with other people.

3. Huda’s need for achievement and her need for other people are both above average.

4. Huda’s need for achievement is stronger than her need to be with other people.


The first and third are norm-referenced statements. They tell us that John’s scores on the two scales are lower than the average scores on those scales for other people in the population, while Huda’s are higher. The second and fourth are making comparisons between scales for the two people. They say that both John and Huda are more achievement-oriented than they are people-oriented.


1.5

Understanding and Using Test Norms

Frequency distributions As we have said, norms provide a means of relating a person’s raw score to those of other people. In psychometrics, we talk of the reference group as being a population. A ‘population’ contains all the people who conform to some specification, such as UK adult females; male craft apprentices; university arts graduates; Smith & Jones trainee managers. Depending on the specification, the population may be very large (millions of people) or very small – Smith & Jones may have only 40 trainee managers. The standardization data presented in a test manual are obtained from a sample of people drawn from a specific ‘population’. Those data can be used to estimate how other samples (either groups or single individuals) drawn from the same population should perform on the test.

Percentiles If we want to compare an individual’s score with those of other people in some group (for example, Smith & Jones trainee managers), then we want to know what proportion of people in that group do less well on the test. The proportion of people scoring less than a particular score is called the percentile rank of the score. More commonly we refer to this as just the percentile. To say that a score has a percentile of X means that the score exceeds that obtained by X% of the group. Different test manuals will use different terms for percentiles: percentile score; percentile rank; percentile; centile. These all mean the same thing. Be careful not to confuse percentiles and percentage scores. Someone who got 15 out of 20 items right got 75% correct. That is simply another way of describing their raw score. Percentiles, however, refer to how a person’s score compares with those of others. Thus a score of 75% might be at the 65th percentile, or the 90th percentile, or anywhere else, depending on how the raw scores are distributed in the norm group.
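The difference between a percentage score and a percentile can be illustrated with a short Python sketch. The two norm groups below are invented for the example, and the function simply counts the proportion of the group scoring below the given raw score; published norm tables may treat tied scores differently, so this is an illustration of the idea rather than a replacement for a test manual.

```python
def percentile_rank(raw_score, norm_scores):
    """Percentage of the norm group scoring below raw_score, as a whole number."""
    below = sum(1 for score in norm_scores if score < raw_score)
    return round(100 * below / len(norm_scores))


# Two made-up norm groups of 20 people each, for a 20-item test.
group_a = [8, 9, 10, 10, 11, 12, 12, 13, 13, 14,
           14, 15, 15, 16, 16, 17, 17, 18, 19, 20]
group_b = [4, 5, 6, 6, 7, 8, 8, 9, 9, 10,
           10, 11, 11, 12, 12, 13, 14, 15, 16, 18]

raw = 15
print(100 * raw / 20)                 # percentage score: 75.0
print(percentile_rank(raw, group_a))  # 55
print(percentile_rank(raw, group_b))  # 85
```

The percentage score stays at 75% whichever group is used, while the percentile changes with the norm group.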


Descriptive measures based on percentiles For any scale, given enough data, we can translate ‘raw scores’ into percentile scores in this manner. Many test manuals contain such tables. Others use scales based on percentiles. (By convention, percentiles are only ever used as whole numbers. That is, we would not talk of someone being at the 43.67th percentile but as being at the 44th percentile, the nearest whole number). It is quite common for percentile conversion tables to provide only a limited number of percentile equivalents. Table 1.5.1 shows a typical example. Here the percentiles represent the mid points of bands of scores. The lower and upper percentile points of each band are shown in the right-hand column. Note how, for each percentile, there is a whole range of possible raw scores in some cases and none in others. This is a result of the way people’s scores are distributed. Most people score near the middle of the raw score range while few score at each end. So there is a relatively large percentile change between raw scores near the middle of the range and a much smaller one for raw scores near each end of the range. Tables like Table 1.5.1 are designed to be used in only one direction: that is, you convert a raw score to a percentile equivalent, but not vice versa.

TABLE 1.5.1: Example of a raw score to percentile conversion table Raw score

Percentile

Range of percentiles covered

0–5 6 7 8–9 10 11 12 – 13 14 – 15 16 17 – 18 19 20 21 22–23 24–25 26–27 28–33

1 3 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 97 99

1 2–3 3–7 8–12 13–17 18–22 23–27 28–32 33–37 38–42 43–47 48–52 53–57 58–62 63–67 68–72 73–77 78–82 83–87 88–92 93–96 97–98 99


1.5.1 SAQ

Converting raw scores to percentiles

Using the information in Table 1.5.1, answer the following questions:

(a) What is the percentile for someone who scored 29? [ ]

(b)


(c)


Other common percentile-based scoring systems include:

1. The five-point grading scheme where the top 10% of scores are classed as grade A; the next 20% as grade B; the next 40% as grade C; the next 20% as grade D and the lowest 10% as grade E.

2. Decile scales where the raw scores are divided into ten categories each containing 10% of the distribution.

3. Quartiles where the raw scores are divided up into four categories each containing 25% of the distribution (referred to as the first, second, third and fourth quartiles).
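These three schemes can be sketched in Python as simple conversions from a whole-number percentile. Several details here are assumptions made for the illustration: percentiles are taken to run from 1 to 99, a percentile exactly on a boundary (for example 90) is placed in the lower category, and deciles and quartiles are numbered from the lowest scores upwards. In practice, the conversion tables supplied with a test should always be used.

```python
def _check(percentile):
    # Assumption: percentiles are whole numbers from 1 to 99.
    if not 1 <= percentile <= 99:
        raise ValueError("percentile must be a whole number from 1 to 99")


def grade(percentile):
    """Five-point grade: top 10% A, next 20% B, next 40% C, next 20% D, lowest 10% E."""
    _check(percentile)
    if percentile > 90:
        return "A"
    if percentile > 70:
        return "B"
    if percentile > 30:
        return "C"
    if percentile > 10:
        return "D"
    return "E"


def decile(percentile):
    """Decile from 1 (lowest 10%) to 10 (highest 10%)."""
    _check(percentile)
    return (percentile - 1) // 10 + 1


def quartile(percentile):
    """Quartile from 1 (lowest 25%) to 4 (highest 25%)."""
    _check(percentile)
    return (percentile - 1) // 25 + 1


for p in (5, 44, 95):
    print(p, grade(p), decile(p), quartile(p))
# 5 E 1 1
# 44 C 5 2
# 95 A 10 4
```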

Exercise 1.5.1 gives you the chance to practise converting raw scores to percentiles and grades using different sets of norms. Before you continue, work through Exercise 1.5.1.

EXERCISE 1.5.1: Converting raw scores to percentiles and grades

Use the norm tables for Test A contained in the Test Pack. Find the appropriate Test A norm table to convert raw scores to percentiles and then use the General Purpose Conversion Tables to convert the percentile scores to grades. Convert the following raw scores:

               11-year-olds           Craft apprentices      University students
Raw score      Percentile   Grade     Percentile   Grade     Percentile   Grade
19             ..........   .....     ..........   .....     ..........   .....
18             ..........   .....     ..........   .....     ..........   .....
16             ..........   .....     ..........   .....     ..........   .....
15             ..........   .....     ..........   .....     ..........   .....
14             ..........   .....     ..........   .....     ..........   .....
12             ..........   .....     ..........   .....     ..........   .....
10             ..........   .....     ..........   .....     ..........   .....


Other standard scores A number of scales have been developed over the years. Each of these scales is ‘standard’ in that it has a predefined average score for a given population (its ‘mean’) and it also has a standard ‘spread’ of scores around the mean. The statistic used to define the extent to which scores are spread out about the mean is called the Standard Deviation (or SD). Different standard scales are defined in terms of what their means and SDs are. Table 1.5.2 lists some of the most common standard scales that you will encounter as a test administrator. Sten scores and Stanine scores are commonly used with personality inventories; T-scores are commonly used with ability and aptitude measures. You will see in Table 1.5.2 reference to ‘z-scores’. These are what all the other scores are derived from. You need not worry about how these are obtained, but there is some more detail in Table 1.5.2 if you are interested.

TABLE 1.5.2: Commonly used standard score scales

Name        Mean    SD    Comments
z-score     0       1     The basis of all the other scales
T-scores    50      10    Usually limited to range from 20 to 80
Stens       5.5     2     Standard TEN: range from 1 to 10
Stanines    5       2     Standard NINE: range from 1 to 9
IQ          100     15    Used for many IQ tests
            100     20    US Employment Services aptitude tests

The relationships between these scales are shown in Figure 1.5.1. Notice the shape of the distribution in Figure 1.5.1. This is the well-known ‘bell curve’. Technically it is called a ‘normal distribution’. It is used as the reference distribution for converting raw scores on tests into standard scores. In its standard form, the normal distribution is a smooth, symmetrical, bell-shaped distribution with a mean of 0 and an SD of 1. We call scores with a mean of 0 and SD of 1 z-scores. For most purposes, we can assume that the normal distribution represents the shape of the distribution of the population of scores which underlies any sample we obtain. While the raw scores may not always be distributed as symmetrically as this, it is generally possible to convert them to this shape through the process of converting raw scores into standard scores.

Just as there are a number of scores based on percentiles (quartiles, deciles, the five-grade system), so z-scores provide the basis for a range of ‘standard scores’. The problem with z-scores is that all below-average scores are negative. People do not like dealing with negative scores. In ability testing it could be especially awkward – as people who didn’t understand psychometrics might think that a score of –0.6 meant that they were lacking in ability!

A second problem with z-scores is the size of the intervals on the scale. The differences between 0, 1 and 2 are very large in terms of percentile equivalents. We can get round this by using decimal values (1.15, 2.36 and so on) to provide as ‘fine’ a scale as we want. But this is still not ideal, as people generally prefer to work with whole numbers rather than decimals for scores. In addition, working scores out to three or four decimal places might make them look far more accurate than they really are. What we need are scales which are more ‘user-friendly’ than z-scores.
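As a rough illustration of how the scales in Table 1.5.2 relate to z-scores, the Python sketch below applies a simple linear conversion: the z-score is multiplied by the SD of the target scale and the mean of that scale is added. The norm-group mean and SD used in the example (14 and 5) are invented, halves are rounded upwards as an assumption, and the result is kept within the range quoted for each scale. Test manuals may instead use normalized scores based on percentiles, so the published look-up tables always take precedence.

```python
import math

# (mean, SD, lowest value, highest value) for each scale, taken from Table 1.5.2.
SCALES = {
    "T-score": (50, 10, 20, 80),
    "sten": (5.5, 2, 1, 10),
    "stanine": (5, 2, 1, 9),
}


def z_from_raw(raw, norm_mean, norm_sd):
    """Express a raw score as a z-score relative to a norm group."""
    return (raw - norm_mean) / norm_sd


def standard_score(z, scale):
    """Linear conversion of a z-score to a whole-number standard score."""
    mean, sd, lowest, highest = SCALES[scale]
    value = math.floor(mean + sd * z + 0.5)  # round to the nearest whole number
    return max(lowest, min(highest, value))


z = z_from_raw(11, norm_mean=14, norm_sd=5)  # hypothetical norm group
print(z)                             # -0.6
print(standard_score(z, "T-score"))  # 44
print(standard_score(z, "sten"))     # 4
print(standard_score(z, "stanine"))  # 4
```

The awkward z-score of –0.6 becomes a T-score of 44 or a sten of 4: positive whole numbers that carry the same information.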


FIGURE 1.5.1: Correspondence between the normal distribution and a number of standard score-based scales Each score really represents an interval or range of values along the scale. This is shown for the sten and stanine scales by depicting each value as a box. The score (for example 5) is really that associated with the centre of the box, so the point where the ‘5’ and ‘6’ boxes meet would be a score of 5.5. In practice we only use whole number stens and stanines. Both the sten and stanine scales have SDs of two. However, the mean of the stanine scale is 5 whereas the mean of the sten scale is 5.5. Each T-score is also really an interval of values, each interval being one tenth of an SD wide. So a T-score of 50 covers the interval from –0.05 to +0.05 z-scores, 51 covers the interval +0.05 to +0.15 z-scores and so on.

T-scores and sten scores For the remainder of this section, we will focus on the two most common standard score scales: T-scores and sten scores (see Figure 1.5.1). For aptitude and ability tests, T-scores are by far the most common. For personality questionnaires, sten scores are the most widely used standard score scale. Being able to convert between these scales is important as it helps to provide a feel for the fact that they are really all the same; they simply have different means and SDs (just like converting between inches and centimetres). It is purely a matter of convenience that we tend to report test results in stens and T-scores rather than in z-scores. The important point to remember is that they are all derived from z-scores. Most test manuals provide tables which enable you to look up T-scores directly from raw scores. They may also provide percentile equivalents of each T-score for you. These conversion tables are often called ‘look-up tables’. The conversion tables included in the Test Pack can be used for any test where you are provided with either a percentile table or some form of standard score table.
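The percentile equivalent of a T-score can be approximated from the normal distribution, which is what a look-up table does for you. The Python sketch below assumes that scores in the norm group are normally distributed and uses `NormalDist` from Python's standard `statistics` module; the function name and the limits of 1 and 99 are choices made for the illustration, and the values in a published look-up table may differ slightly.

```python
from statistics import NormalDist


def percentile_from_t(t_score):
    """Approximate percentile for a T-score, assuming normally distributed scores."""
    z = (t_score - 50) / 10                    # back to a z-score
    proportion_below = NormalDist().cdf(z)     # area under the bell curve below z
    percentile = round(100 * proportion_below)
    return max(1, min(99, percentile))         # keep within the usual 1 to 99 range


for t in (40, 50, 60):
    print(t, percentile_from_t(t))
# 40 16
# 50 50
# 60 84
```

A T-score of 60 is one SD above the mean, which puts it at about the 84th percentile.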


MODULE 1.5

You should now be able to try Exercise 1.5.2.

EXERCISE 1.5.2: Using norm tables

(a) Julian Barnes is 16 years old. He obtains a raw score of 15 on Test A. How does this score compare with the scores of university students and craft apprentices? Use the tables in the Test Pack.

Mary James is studying for a degree at university. She obtained a T-score of 65 on Test A (university student norms). What is her approximate percentile rank score in the general population?

Look at the raw score you yourself obtained when you completed Test A at the start of this Module. How would you interpret that score in terms of the Test A norms provided in the Test Pack? (Remember that this is not a ‘real’ test – it has been constructed for training purposes only and all the normative data have been artificially computer-generated.)

(b) Test B is a test of verbal reasoning. It has 60 items, and raw scores can range from zero (all wrong) to 60 (all right). Norm tables for Test B are included in your Test Pack. Using these norm tables, convert the following raw scores to percentiles, grades, sten scores and normalized T-scores. Ensure you use the indicated norm group for each one. Use the General Purpose Conversion Tables in the Test Pack to convert the percentile scores to grades and Table 4 to convert them to stens.

Person   Raw score   Norm group   Percentile   Grade     T-score   Sten
1        45          Z            ..........   .......   .......   .......
2        35          Z            ..........   .......   .......   .......
3        35          W            ..........   .......   .......   .......
4        28          X            ..........   .......   .......   .......
5        24          Y            ..........   .......   .......   .......
6        20          Y            ..........   .......   .......   .......
7        15          Z            ..........   .......   .......   .......
8        12          Z            ..........   .......   .......   .......
9        10          W            ..........   .......   .......   .......
10       10          X            ..........   .......   .......   .......

How would you describe the performance of the two people (#2 and #3) who obtained raw scores of 35 and the two (#9 and #10) who obtained raw scores of 10?

Some tests have a large number of different norm tables supplied with them; others may have very few – possibly just one. How do you decide which table to use and whether the test has adequate normative information? Again you need to clarify this with the test user.


Summary of modules 1.4 and 1.5

• Raw scores provide an absolute measure of how a person has performed on a test.

• Tests with more than one scale where the score a person gets on one scale is necessarily related to the scores he or she gets on the others are called ipsative tests. Measures on such tests are self-referenced. Ability tests are not self-referenced. Some personality and some interest measures are.

• Norms provide a means of giving measures of performance relative to other people. These are known as norm-referenced measures.

• Norm-referenced measures can be classified into two main types:
  • Percentile scores
  • Standard scores

• Percentile scales include percentiles (or centiles) themselves, and also deciles, quartiles and the 5-grade system. The score corresponding to the 50th percentile is known as the median.

• Percentile scores must not be added to or subtracted from each other.

• Standard scores are all based on z-scores (which have a mean of 0 and an SD of 1).

• Common standard score scales include: T-scores (mean = 50, SD = 10), stens (mean = 5.5, SD = 2) and stanines (mean = 5, SD = 2).


Module

1

Answers to Exercises and Self-Assessment Questions

Answers to Exercises

EXERCISE 1.5.1: Converting scores to percentiles and grades

             11-year-olds          Craft apprentices     University students
Raw score    Percentile   Grade    Percentile   Grade    Percentile   Grade
19           100.0        A        91.8         A        65.9         C
18           99.7         A        86.7         B        52.2         C
16           98.0         A        73.8         B        23.4         D
15           95.5         A        64.3         C        13.3         D
14           93.5         A        53.5         C        6.8          E
12           83.8         B        28.7         D        1.2          E
10           62.2         C        13.8         D        0.0          E


Notice how the same raw score can range from an A grade to an E grade depending on which norm group we use. For example, a raw score of 14 is well above average for an 11-year-old (and hence has an A grade) while it is well below average for a university student (and hence has an E grade).


EXERCISE 1.5.2: Using norm tables

(a) Julian Barnes. Julian Barnes’ score of 15 is a bit above the average for the craft apprentice group, but well below that of the university student group. For the former group it is a percentile of 64 (T-score 54) while for the latter it would be only a percentile of 13 (T-score 39). From the T-scores we can see that these represent scores about one-half an SD above the mean for the craft apprentice group and about one SD below the mean for the university student group.

Mary James. Finding Mary James’ percentile involves a bit of playing around with the tables. As we know her T-score was 65, we can translate that to a percentile using Table 1. This shows that a T-score of 65 corresponds to a percentile of 93.32. Now go to the norm table for Test A showing the university student norms. The nearest value to 93.32 is 92.7. This is the percentile corresponding to a raw score of 22. So, given her T-score and knowing what norms were used, we can get back to her raw score. Her approximate percentile rank in the general population can now be found by looking up her raw score of 22 in the general population table for Test A (norm table 2). We see from this that 22 would give a percentile of 99 (an approximate T-score of 73).

Interpreting your own Test A score. Look at it in relation to each of the norms groups and the general population table. Is it average, above average, below average? If above or below, how much? For which norms?

(b) Your completed table should look like this:

Person   Raw score   Norm group   Percentile   Grade   T-score   Sten
1        45          Z            75           B       57        7
2        35          Z            30           D       45        5
3        35          W            65           C       54        6
4        28          X            85           B       60        7
5        24          Y            25           D       43        4
6        20          Y            10           E       40        3
7        15          Z            1            E       27        1
8        12          Z            1            E       27        1
9        10          W            3            E       31        2
10       10          X            10           E       37        3

Using different norms has the effect of person #3 getting a C grade while person #2 only gets a D. For the other two people, both get E grades as the score of 10 is well below average for both norm groups.
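The Table 1 look-up used in the Mary James answer (T-score 65 to percentile 93.32) follows directly from the normal distribution, so it can be checked with a short Python sketch. The function names here are our own, and the sketch assumes normalized T-scores, that is, scores that follow the normal curve exactly. The norm tables in the Test Pack remain the authority for real conversions, since actual norm groups rarely fit the curve perfectly.

```python
from statistics import NormalDist

# The standard normal distribution: z-scores with mean 0 and SD 1.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def t_to_percentile(t_score):
    """Percentage of the norm group expected to score below this T-score."""
    z = (t_score - 50) / 10
    return 100 * STANDARD_NORMAL.cdf(z)


def percentile_to_t(percentile):
    """T-score at which the given percentage of people score lower.

    The percentile must lie strictly between 0 and 100.
    """
    z = STANDARD_NORMAL.inv_cdf(percentile / 100)
    return 50 + 10 * z


print(round(t_to_percentile(65), 2))  # 93.32 (Mary James, Table 1)
print(round(percentile_to_t(99)))     # 73 (Mary James, general population)
print(round(percentile_to_t(64)))     # 54 (Julian Barnes, craft apprentices)
print(round(percentile_to_t(13)))     # 39 (Julian Barnes, university students)
```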

Answers to Self-Assessment Questions

SAQ 1.2.1

1. Tests of maximum performance measure ability, aptitude and achievement. Tests of typical performance examine preferences in areas such as social activity, methods of dealing with problems, work-related areas of interest and the like.

2. Tests of maximum performance measure cognitive and psycho-motor abilities. Tests of typical performance measure dispositions, interests and drive.

3. Tests of maximum performance usually have a time limit and are sometimes designed to be carried out under considerable time pressure (speed tests). Tests of typical performance are usually not timed.

4. Tests of maximum performance score right answers to items. So, the higher the score the better it is. Tests of typical performance score preferences. There are no right or wrong answers, so scores are not good or bad.


If you are not sure about these differences, reread Section 1.2 and look carefully at the examples in Tables 1.2.1 and 1.2.2.

SAQ 1.2.2

1. Attainment tests are designed to measure what has been learned or acquired. Their content is usually determined by the objectives of the training course or academic syllabus. How well people perform on an attainment test will be a complex function of their ability, the training they have received, their interest in the topic and the amount of effort they have put into learning it. Aptitude (and ability) are concerned with the potential people have to attain new knowledge and skills. While aptitude tests rely on a certain degree of attainment – that necessary to understand what the test requires, to read the items etc. – their purpose is not to assess attainment. The less they rely on attainment and previous experience the closer they come to assessing a person’s ‘pure’ potential.

2. Attainment and aptitude tests can contain very similar items.

SAQ 1.4.1

• The maximum possible score would be obtained by scoring 3 on all the positive items and zero on the negative ones. That would give +3 + 3 + 3 + 0 + 0 + 0 = +9

• The minimum possible score would be obtained by scoring 0 on all the positive items and minus 3 on the negative ones. That would give −0 − 0 − 0 − 3 − 3 − 3 = −9

So the scores for the set of six items could range from: −9 to +9.
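The same arithmetic can be written as a small Python sketch, which makes it easy to see how the range would change for a scale with a different number of items. The names and the item ranges (0 to +3 for positive items, −3 to 0 for negative items) are taken from the worked answer above and are assumptions about this one scale, not a general scoring rule.

```python
# Assumed item ranges, taken from the worked answer to SAQ 1.4.1.
POSITIVE_ITEM_RANGE = (0, 3)    # lowest and highest score on a positive item
NEGATIVE_ITEM_RANGE = (-3, 0)   # lowest and highest score on a negative item


def score_range(n_positive, n_negative):
    """Return the (minimum, maximum) possible total score for the scale."""
    lowest = (n_positive * POSITIVE_ITEM_RANGE[0]
              + n_negative * NEGATIVE_ITEM_RANGE[0])
    highest = (n_positive * POSITIVE_ITEM_RANGE[1]
               + n_negative * NEGATIVE_ITEM_RANGE[1])
    return lowest, highest


print(score_range(3, 3))  # (-9, 9)
```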

SAQ 1.5.1: Converting raw scores to percentiles

To use the table, find the raw score in the column on the left and then look across the table to find its percentile equivalent in the central column.

(a) What is the percentile rank of someone who scored 29? The score of 29 lies in the raw score range 28–33 and so the percentile is 99.

(b) What is the percentile rank of someone who scored 15? A raw score of 15 converts to a percentile of 50.

(c) What is the percentile rank of someone who scored 6? A raw score of 6 converts to a percentile of 33.


Module

2

Test Administration and Scoring

OVERVIEW This Module is concerned with administering tests and dealing with scoring procedures. It also covers maintaining security and confidentiality of the test materials and the test data. It is very much a skill-based module with the stress on people’s competence to follow good professional practices in test administration, ensuring the maintenance of standard conditions and fairness. This is explored through a series of practical exercises. For these you will need to obtain the co-operation of a few friends or relatives. They need not have any prior experience of tests or of being tested, but will need to be prepared to take a dummy test and to appraise your performance as an administrator using the form in Section 3 of the Test Pack. This Module also covers the differences between various modes of test administration (for example, paper-and-pencil and computer-based, offline and online or web-based testing) and issues of security, control and confidentiality. These are of central importance in the professional relationship between you as test administrator, the test user, the test taker or candidate and the client or person who will be using the results of the test.


Because this Module is mainly concerned with practical skills, it is advised that you:

(a) First, read through the whole Module BEFORE you begin any of the exercises.

(b) Next, read through the Module again, completing the paper-and-pencil exercises as you go.

(c) Then, assemble all your testing materials for Test A and the P5 Personality Inventory and become familiar with them.

(d) Finally, carry out the practical testing sessions and get feedback from your subject(s). Repeat this process until you feel confident and your ‘appraisal’ is positive.

There are a number of stages connected with the administration and scoring of tests: preparation, administration of the tests themselves, scoring and translating the test scores into standard scores, and completion of the administration procedures. These are dealt with in Section 2.1. There are also a number of other issues related to the use of tests which are discussed in Section 2.2. These are issues of confidentiality and security and, specifically, awareness of the requirements of the Data Protection Act.

KEY AIMS

Having completed this Module, you should be able to:

• Make adequate preparations for a test session.
• Administer a test fairly paying due regard to specific test instructions.
• Deal with problems which may arise during test administration.
• Carry out scoring and results analysis procedures accurately.
• Show due regard for matters of confidentiality and test security.


2.1

Test Administration


The British Psychological Society’s list of competences relating to test administration contains largely practical skills. They can be grouped into four stages (see Table 2.1.1 for a detailed checklist of the actions associated with each stage).

1. The first stage is concerned with preparation: advance preparation before the testing begins, that is, advance preparation of yourself as test administrator, preparation of your materials and test room, and preparation of your candidates for the test day.

2. The second stage is the administration of the test itself. This involves building up a rapport with the candidate(s); briefing them on the purpose of the assessment; using the standard instructions to administer the tests; allowing sufficient time to complete and deal with example questions and questions raised by the candidates; maintaining the test conditions to the standard given in the test manual and, finally, keeping strictly to the instructions for the timing of the tests.

3. The third stage is concerned with scoring the tests. This involves checking the answer sheets for ambiguous marks; careful use of scoring keys; checking the total scores; translating the raw scores into standard scores and transcribing the results to score sheets; and, finally, completing a record of the testing session in a test log.

4. The fourth stage is completing the procedures once the test has been carried out: thanking the candidates and explaining the next stage in their assessment and what feedback they will receive from the testing session; checking that the materials are all collected in, that they are clean and unmarked and replaced in secure storage; and ensuring that the data is kept secure and that the requirements of the Data Protection Act are observed.

Other related issues, such as interpreting the scores and relating them to the overall context of assessment for selection or guidance and giving feedback to the candidate, are not dealt with in this set of modules, as these are the responsibility of the qualified test user. It is the test administrator’s responsibility to ensure that the data provided to the test user has been collected under optimal conditions, as required in the test administration instructions.
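The checks described for the scoring stage (looking for ambiguous marks, applying the scoring key, and totalling the score) can be pictured with a short Python sketch. Everything in it is hypothetical: the scoring key, the answer sheet and the function name are invented for illustration, and the rule that an item with more than one mark is set aside for checking rather than scored is an assumption. The test manual always defines how such cases must actually be handled.

```python
def score_answer_sheet(responses, scoring_key):
    """Return (raw_score, ambiguous_items) for one answer sheet.

    responses:   dict mapping item number -> set of options the candidate marked
    scoring_key: dict mapping item number -> the single correct option
    """
    raw_score = 0
    ambiguous_items = []
    for item, correct_option in sorted(scoring_key.items()):
        marked = responses.get(item, set())  # an unanswered item has no marks
        if len(marked) > 1:
            # More than one option marked: flag for checking, do not score.
            ambiguous_items.append(item)
        elif marked == {correct_option}:
            raw_score += 1
    return raw_score, ambiguous_items


# Hypothetical four-item key and one candidate's sheet (item 4 left blank).
key = {1: "A", 2: "C", 3: "B", 4: "D"}
sheet = {1: {"A"}, 2: {"B", "C"}, 3: {"B"}}

print(score_answer_sheet(sheet, key))  # (2, [2])
```

Returning the flagged items alongside the raw score mirrors the practice of checking the sheet before relying on the total.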


MODULE 2.1

In many assessment situations (for example, selection for jobs) test administration may be your only interaction with the candidates. It needs careful and professional handling so that the candidate is comfortable and fully conversant with what is happening. The key to this is a good system and good organization. Preparing well in advance enables the administrator to be professional and the candidate to be well informed in the brief, face-to-face interaction of the test session.

High-stakes and low-stakes testing

Before we consider the various different modes of administration, it is important to think about testing from the point of view of the impact the test may have on the people involved, in particular the test taker and the person who will be making use of the test results. Where these two are the same people (that is, someone is doing a test for their own interest), we tend to define the situation as a ‘low-stakes’ one. Where the test results will be used by a third party in making a decision (for example, whether or not to hire someone for a job), the stakes are clearly higher. In particular, if we think about the effect any cheating, bias or other form of intentional distortion might have, it is clear that their impact is only likely to be an issue where the test takers think this might influence some decision in their favour. Such behaviours make little sense if you are completing a test in order to find out more about yourself.

What are the functions of the test administrator role?

The test administrator has six main functions to perform:

1. To authenticate the identity of test takers. That is, to ensure that the people who present themselves for testing are actually who they say they are. For high-stakes testing, the test administrator should ask for some form of identification to confirm this. This should be some form of photo identification, not something that might have been given to an accomplice. There have been some remarkable cases of people taking tests for others. Perhaps the most remarkable is that of the person who took driving tests for more than half a dozen people before being caught!

2. To establish rapport with the test taker. It is important to set the right atmosphere for taking a test and to be clear about why the test is being carried out. The test administrator has an important role in setting the right ‘tone’ and in assisting the candidate(s) to give their optimal performance.

3. To ensure the standardized instructions are followed. Test manuals define exactly how tests should be administered. As a test administrator it is vital that you follow these instructions carefully in order to ensure fairness.

4. To ‘validate’ results (prevent cheating and collusion). By being present during the test, you are able to check that people do not cheat or collude. People may be tempted in high-stakes testing to get help or cheat. That is why you must always make sure that mobile phones and other communication devices are switched off and, ideally, removed before a supervised test session takes place.

5. To deal with unexpected conditions or problems. Things can go wrong during a test. A computer may crash, the fire alarm may go off, or someone may fall ill. You need to be able to deal with these events coolly and calmly and to know what subsequent actions must be taken.

6. To ensure security of materials. Finally, you need to ensure that test materials are not compromised by people taking question booklets away or taking notes of the questions out with them when they leave.

OPEN QUESTION: Make some notes on your thoughts on each of these questions before reading further.

• Which of the six test administration functions is the most important in high-stakes testing?

• How would you manage each of these functions if a test is being administered on a computer?

• Which of these could you manage if the test is being administered remotely, online, to the test candidate?

Modes of test administration

Before looking in detail at the stages of test administration (preparation, administration, scoring and completing), it will be useful to review the various different ways in which tests can be administered. These different modes of administration require more or less involvement from you as a test administrator. The International Test Commission defines four main modes of administration. These cover all types of tests and all the various different media that can be used in testing (paper, apparatus or computer-based tests; online and offline administration). The four modes are Open, Controlled, Supervised and Managed.

Open Mode

Open Mode is where the test taker has direct access to the test materials. So there is no involvement of a test user or test administrator. Such tests include the books of tests you might buy in the local bookshop or the tests you can find on the internet that are directly accessible to everyone. Often the only requirement is that you pay some money before you can access the test. However, no qualifications are required from you either in terms of test use or test administration. In effect the test taker takes on the roles of test administrator and test user. They may also take on the role of client, as many Open Mode tests are designed for self-interest or self-assessment.

Most of the materials available in Open Mode would not be considered to be proper tests by psychologists (see the definition of tests in Module 1). They may have no technical documentation to support their measurement claims, there may be no information about who produced them, when, or for what purpose. In addition, there would be a need to be very careful of any maximum performance tests that are encountered in this mode. As the mode is open, the test questions would be of little value in any formal high-stakes assessment (e.g. job selection) as it would be relatively easy for anyone to obtain the answers to the questions prior to taking the test.

In general, Open Mode assessment is for low-stakes assessments, where test security is not an issue. On the positive side, it provides an easy access way of giving people some experience of testing or pre-assessment familiarization with tests. On the negative side, it may add to people’s confusion about the difference between proper psychological testing and the ‘fun’ use of the sort of questionnaires you get in popular magazines and on the web. From the point of view of test administration, you need to know about Open Mode administration as it is becoming increasingly common to provide people with pre-test practice or familiarization materials in this mode. This is a good example of low-stakes testing, as the impact of doing or not doing these tests is largely confined to the test taker. Indeed, it is in the test taker’s best interests to complete such practice materials, in order to put them on a level playing field with other candidates when the real test is taken. Making sure that candidates are aware of the opportunity for legitimate practice and familiarization is important, and will be discussed in more detail below in the section dealing with Preparation.

You can see some examples of what is available as practice tests by following the links on the BPS Psychological Testing Centre’s website: www.psychtesting.org.uk. To see some examples of the range of so-called ‘psychological tests’ that are openly available on the web, go to www.google.com and enter ‘psychological tests’ as the search string.

Controlled Mode

Controlled Mode administration is best described with an example:

You have applied for a new job and have been shortlisted for the second stage of the selection process. You receive an email containing the description of the process that will follow and a request for you to complete some online tests before attending an assessment centre. The email contains your unique user name and password and a link to a website. You click on the link and, having entered your user name and password, you are then taken through the online tests. When you have finished, the results are automatically transferred to the test user. Having done the tests, you think that you might do better if you have another go. So you click the link again from your email, but find that your login no longer works – you are politely told that you have already done the tests once.

This is called ‘controlled’ because there is much more control over who accesses the tests and how they access them. This mode is becoming the most common mode of administration for most personality inventories and other self-report measures that are used either in a career guidance and advice setting or for assessment for personal development. Where the stakes are high, tests administered in this mode should always be part of a larger process that contains the necessary checks and balances to ensure results obtained in this mode are checked or validated in some way before any final decisions are made. Clearly, whilst Controlled Mode is more secure than Open Mode, it is still possible for someone to pass their login to another person or to complete the tests with other people helping. It is also not a secure mode from the point of view of test materials. Anything appearing on the screen in the remotely supervised situation could be copied by the candidate. Even where software makes it impossible to copy screens, the screen images could be photographed. So Controlled Mode provides an important degree of control, but it would be unwise to rely on information obtained in this way as the basis for any high-stakes decisions.

As a test administrator, you are not directly involved in administering the tests in either Open or Controlled Mode. However, you will be responsible in Controlled Mode for managing the process whereby people are registered for the tests selected by the test user, dealing with any support issues that might arise, and so on.

Supervised Mode

This is the mode in which you, as test administrator, have direct face-to-face involvement with the test takers. The test takers will come to a location where you, as test administrator, will be able to supervise them taking the test. This mode generally involves you in all four stages of test administration: preparation, administration, scoring and completion. If the test being used is computer-based or a machine-scoreable paper answer sheet is used, then scoring may be carried out automatically. Most of this module focuses on helping you develop the skills you need for Supervised Mode administration. In so far as you get involved in Controlled Mode administration, you will find the skills required for that are really just a subset of what is required for Supervised Mode. From the test provider’s point of view, there is no direct control over the conditions under which tests are stored or used in this mode. They rely wholly on the test user and test administrators to ensure that the test is administered in an appropriate environment, that materials are kept secure, and so on.
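The single-use login in the Controlled Mode example (a unique user name and password that stops working once the tests have been completed) can be sketched in Python as follows. This is a minimal illustration with invented names, not a description of how any real test delivery system works; a real system would also need secure storage, expiry dates and protection against a login being passed to someone else, which this sketch does not attempt.

```python
import secrets


class ControlledModeAccess:
    """Hypothetical register of single-use logins for an online test."""

    def __init__(self):
        self._status = {}  # login token -> "issued" or "completed"

    def register_candidate(self):
        """Create a unique, hard-to-guess login token for one candidate."""
        token = secrets.token_urlsafe(16)
        self._status[token] = "issued"
        return token

    def start_test(self, token):
        """Decide whether this login may be used to take the tests."""
        status = self._status.get(token)
        if status is None:
            return "Login not recognised."
        if status == "completed":
            return "You have already completed these tests."
        return "Access granted."

    def finish_test(self, token):
        """Mark the login as used once the results have been transferred."""
        if self._status.get(token) == "issued":
            self._status[token] = "completed"


access = ControlledModeAccess()
login = access.register_candidate()
print(access.start_test(login))   # Access granted.
access.finish_test(login)
print(access.start_test(login))   # You have already completed these tests.
```

Note that the sketch controls only how many times a login is used. It cannot tell who is sitting at the keyboard, which is exactly the limitation of Controlled Mode described above.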
Managed Mode

Managed Mode is really very similar to Supervised Mode, but it assumes not only that there is direct control by an administrator over the test administration, but also that the location and equipment used can be controlled. Managed Mode generally applies to dedicated ‘test centres’ where, for example, computer-based tests can be administered on equipment with a known specification in terms of screen size, resolution, speed of processor, multi-media capability, and so on. The environment can be controlled in terms of lighting, noise and other potential distracters. Dedicated test centres tend to be required where special pieces of test apparatus are needed. For example, most of the major airlines will have dedicated testing centres where they assess people for selection as pilots using computer-based tests with a range of additional peripheral devices (pedals, joysticks, headphones) which all have to be to a common specification. Test centres are also commonly used for occupational licensing and certification examinations. Providers of these assessments know that they can securely download them to these dedicated centres as there is good physical and virtual security. A provider of a computer-based adaptive test, for example, would be very unlikely to risk downloading an item bank for Supervised Mode administration, as they would have no reassurance over the security of their intellectual property.

As a test administrator in a Managed Mode setting, you are likely to be involved in testing as a major part, or the whole part, of your job. Typically, people administering tests in Supervised Mode are doing so as one of a number of clerical or administrative functions that they have to perform.

The International Test Commission (ITC) Guidelines

The ITC Guidelines on Computer-Based Testing (CBT) and Testing on the Internet provide advice on best practice in test administration in relation to some of the issues arising from the use of different modes of administration for high- and low-stakes assessment. In relation to exercising control over a test-taker’s authenticity and the possibility that they might cheat on a maximum performance test, the Guidelines say that you need to:

1. Ensure test takers provide the appropriate level of authentication before testing begins. Remind test takers (in the Controlled Mode) of the need to obtain a password and user name to access the test. In controlled testing conditions, test takers should be required to provide authentic, government-approved picture identification.

2. For moderate or high-stakes testing confirm that procedures are in place to reduce the opportunity for cheating. Technological features may be used where appropriate and feasible (e.g. closed-circuit television (CCTV)) but it is likely that such testing will require the presence of a test administrator, a follow-up supervised assessment, or a face-to-face feedback session (e.g. for post-sift assessment in job selection situations).

Furthermore the Guidelines note that:

3. Test takers should be informed in advance of these procedures and asked to confirm that they will complete the tests according to instructions given (e.g. not seek assistance, not collude with others, etc.).

4. This agreement may be represented in the form of an explicit honesty policy which the test taker is required to accept.

You might also:

5. Provide test takers with a list of expectations and consequences for fraudulent test-taking practices, and require test takers to sign the agreement form indicating their commitment.

There are already developments of these modes of administration that bring Controlled Mode and Supervised Mode closer together. The technology already exists to:

1. Check the identity of people who are logging on, using chip-and-PIN cards, thumbprints, retinal eye patterns, etc.

2. Monitor their behaviour during the test with remote TV monitoring. This not only provides control over potential cheating, but also offers a support option where needed.

3. Control the security of test materials through techniques such as test and item generation, which provide each person with their own unique test.

Such a mode might be described as ‘Remotely Supervised’, as it has more in common with Supervised Mode than Controlled Mode from the test administrator’s point of view.

While authentication and prevention of cheating are important, you should not become preoccupied with these issues. A great deal of testing takes place in low-stakes settings where these problems are far less of an issue.

2.1.1 SAQ

List the main advantages and disadvantages of each mode of administration for (a) low-stakes and (b) high-stakes testing.

Open Mode: ........................................
Controlled Mode: ........................................
Supervised Mode: ........................................
Managed Mode: ........................................

Define what the role of a test administrator is for each of the four modes of administration.

Open Mode: ........................................
Controlled Mode: ........................................
Supervised Mode: ........................................
Managed Mode: ........................................


The four stages of test administration

As noted earlier, there are four stages in test administration. Depending on the mode of administration you may be involved in one, all or none of these. Table 2.1.1 provides a checklist of the main actions required by the test administrator for each stage. You will find a copy of this in Section 3 of your Test Pack.

TABLE 2.1.1: Checklist of actions for the four stages

Some items apply only to Supervised or Managed Modes of administration; others apply to all modes. Some or all of the items marked with an asterisk (*) may not be necessary for computer-based assessment systems.

Stage 1: Preparation

Supervised and Managed Modes

1. Plan test sessions with due regard to the maximum number of candidates who can be assessed in one session and the maximum duration of each session.
2. Ensure that any items of equipment (for example, computers) are operating correctly and that sufficient test materials are available for use by the candidates.
3. Ensure, where reusable materials are being used, that they are carefully checked for marks or notes which may have been made by previous candidates.
4. Arrange a suitable, quiet location for carrying out the testing and arrange the seating and desk space to maximize comfort and minimize the possibilities of cheating. Make sure that lighting conditions are controlled, especially where computer screens will have to be used.
5. Inform the candidates of the time and place well in advance.

All Modes

6. Ensure adequate advance briefing.
7. Ensure that potential test candidates are not provided with prior access to test materials other than those specifically designed to help them prepare for their assessment.

Stage 2: Administration

All Modes

8. For Supervised or Managed Modes, brief candidates on the purpose of the test session and put them at their ease while maintaining an appropriately businesslike atmosphere. Give clear descriptions to the candidate(s) prior to their assessment concerning:
   – how their results are to be used
   – who will be given access to them
   – how long they will be retained for
   For Controlled Mode, ensure that the computer-based administration includes a briefing that covers the above issues.

Supervised and Managed Modes

9. Check the identity of the candidates and enter their personal details in the Test Session Log, together with relevant details of what assessment instruments are being used.
10. Check that all candidates have the necessary materials.
*11. Use standard test instructions and present them clearly and intelligibly to the candidates.
*12. Provide the candidates with sufficient time to work through example test items.
*13. Make careful checks to ensure proper use of the answer sheet and response procedures.
14. Deal appropriately with any questions which arise without compromising the purpose of the test.
*15. Explain any time limits and the need to maintain silence during the test and make clear that once the test has begun no further questions can be answered.
*16. Adhere strictly to test-specific instructions concerning pacing and timing.
17. Collect in all materials when testing has been completed and carry out a careful inventory of materials.
18. Thank the candidates for their participation when the final test has been completed, and explain the next stage (if any) in their assessment.
19. Make final entries in the Test Session Log – including notes on any particular problems which arose during the session.

Stage 3: Scoring

Supervised or Managed Modes

*20. Visually check answer sheets for ambiguous markings which could be obscured by scoring keys or cause problems with machine-scoring systems.
*21. Make accurate use of the relevant scoring key.
*22. Accurately transfer raw score marks to candidates’ records.

All Modes

*23. Use norm tables to find relevant percentile and/or standard scores and complete candidates’ records.

Stage 4: Completing

All Modes

24. Keep all test materials and test data in a secure place and ensure that access is not given to unauthorized personnel.
25. Ensure that all mandatory requirements relating to candidates’ and clients’ rights and obligations under the Data Protection Act have been clearly explained to all parties (i.e. clients and candidates).
26. Ensure that data are stored according to the requirements of the Data Protection Act.


Stage 1: Preparation

Issues to consider

OPEN QUESTION

Imagine that you are invited to take part in a testing session as a candidate. The session may be a part of your professional development or it may be a part of a job selection process. Whichever you have imagined, use a few seconds now to jot down some of the things you would like to know, in advance, about the session.

1. ........................................
2. ........................................
3. ........................................
4. ........................................
5. ........................................

• Perhaps you thought about practical things. Will I be able to do the test online at home, or will I have to go somewhere to be tested? Where will the session be held? How long will it take? Will I need to do any preparation before the test? If I have to go somewhere, will there be a need to bring anything to the session?

• Or maybe you thought about the tests themselves. What will they be about? Will they need any special knowledge? Will they be relevant to what they are being used for? Who will see the answer sheets afterwards?

• Or, as you now know a bit more about testing, maybe you thought about some of the technical properties of the tests. Will the tests be reliable, valid and have appropriate norms?

Whatever questions have occurred to you, there is no reason to believe that anyone you test in future will not also be asking these, and more, questions before they take their test. It is the responsibility of the test user to choose which tests are to be used and to see that prospective candidates are given as much information beforehand as it is possible and ethical to give. As test administrator, you are likely to be involved in helping the test user with the second of these responsibilities.

The preparations that the test user has to make begin at the point at which the decision is made to administer the test. There are a number of issues that test users have to consider when deciding whether or not to use psychometric tests in a particular context:

1. Decide what information is needed for, or from, the candidates and whether or not psychometric testing will provide some of that information.

2. If it will, look to see which tests have the potential to provide the information you need.

3. Look at the technical properties and the practicalities associated with each of the possible tests and decide which is most appropriate for your purposes. The test manual is the most likely source of this information.

As test administrator, on behalf of the test user, you will need to:

1. Check what materials you will need for the session and see whether there are sufficient supplies in stock if this is a test you already use; if not, decide what you need to order. Do not make photocopies of test materials: it is illegal. If you are using computer-based tests that require release codes or certain numbers of administration units to be pre-purchased, you will need to ensure that adequate provision has been made or submit the necessary orders in good time.

2. Check what equipment (from stopwatches and pens to computers and workbenches, depending on the test and its mode of administration) and conditions are required and whether these can be made available. This information will be available in the test manual.

3. On the basis of the information you have collected, decide how many candidates you are able to test in one session, how many staff you may need to support you, and how long each test session will last. Be aware that if you are testing more than 15 people in one group session you will probably need some assistance in distributing and collecting materials and in monitoring the candidates during the test.

4. Plan the session, book the rooms you will require, order any materials you need, and book the staff time that will be needed.

5. Write to the candidates to invite them to the session and to inform them about it. Write to brief them about any requirement for online Controlled Mode testing, explaining the log-in procedures and support availability.
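
The sizing decision in step 3 is simple arithmetic, and a short sketch may help when several room or equipment options are being compared. The Python below is illustrative only: the function name `plan_sessions`, the `places` parameter and the `SessionPlan` record are hypothetical, and the default threshold of 15 simply restates the rule of thumb given above. A particular test manual may specify a lower limit, which should always take precedence.

```python
import math
from dataclasses import dataclass


@dataclass
class SessionPlan:
    sessions: int          # number of separate sittings needed
    group_size: int        # largest number of candidates in any one sitting
    needs_assistant: bool  # True when a sitting exceeds the unaided limit


def plan_sessions(candidates: int, places: int, unaided_limit: int = 15) -> SessionPlan:
    """Split candidates into sittings that fit the available places.

    places: seats or workstations usable at once (hypothetical input).
    unaided_limit: largest group one administrator handles alone; 15 follows
    the rule of thumb in the text, but a test manual may set a lower figure.
    """
    if candidates < 1 or places < 1:
        raise ValueError("candidates and places must be positive")
    sessions = math.ceil(candidates / places)
    group_size = math.ceil(candidates / sessions)  # spread candidates evenly
    return SessionPlan(sessions, group_size, group_size > unaided_limit)
```

For instance, `plan_sessions(40, 20)` gives two sittings of 20 candidates and flags that assistance is likely to be needed, whereas `plan_sessions(12, 3)` gives four small sittings that one administrator could run alone.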

There are, in addition, a number of issues that need to be considered before using tests with any individual candidate.

Dealing with candidates who have specific problems

Psychometric tests are instruments designed to measure the strength or amount of a person’s specified attributes or traits. It is important to make sure that the measurement of these attributes is not being adversely affected by difficulties in coping with the ‘mechanics’ of the test, for example in understanding the instructions or in recording the responses.

Before you use tests on anybody you should check that the person has the necessary numeracy and literacy skills to understand what is required and to undertake the test, and that they do not have any disabilities that might make it difficult or impossible for them to complete the test.

As a test administrator you should try to ensure that all candidates are provided with sufficient information about the tests they will be required to take and the modes of administration. If they are, then they can be asked to notify you of any conditions which they believe might make taking the tests unfair or unreasonable for them. You should then raise these issues with the qualified test user to see whether any dispensations or special provisions need to be made.

• You should never make modifications to a test without the prior authority of the qualified test user.

• You should never deviate from the standardized procedures without authority from the qualified test user.

• Not every disability will affect the standardized testing conditions. Where one does, every circumstance will be different.

• Any authorized modifications to the test or the conditions of administration and the reasons for them should be recorded in the Test Session Log.

Candidates with numeracy, literacy and language difficulties

Many tests assume a reasonably high level of literacy on the part of the test taker. This can affect both tests of language-related skills and the measurement of skills which do not require literacy: for example, tests of manual dexterity with written instructions. Those for whom English is a second language may have particular problems with the language used in tests and test instructions. The qualified test user may be able to make provision for foreign language versions of the tests to be used and for a foreign language-speaking person to assist with the administration. Clearly, such tests should not be used with people who have literacy or language problems, unless the job requires verbal reasoning skills and this is what is being tested. The same is true of numeracy difficulties: if a test is designed to measure some attribute other than numeracy, but requires the test taker to be numerate in order to complete the test, then it would have an unfair impact on those with numeracy problems.

Candidates with physical or sensory disabilities

As with language, literacy and numeracy problems, it is important when testing candidates with physical or sensory disabilities to make sure that the ability being tested is not affected by difficulties in physically coping with the instructions, or by problems in recording their responses. For example, a candidate with a hearing impairment may be disadvantaged with instructions that are given orally, whilst a candidate with visual impairment may be disadvantaged by written instructions. Written responses might cause similar difficulties for those with visual impairments. Clients with hand–eye co-ordination problems, hand tremors, etc. may have difficulty in writing their responses or making accurate use of a multiple-choice format answer sheet.
Such problems may result in them taking longer than other candidates physically to complete their responses, with the consequence that they may be especially disadvantaged in speeded tests.

Sometimes the disability can be coped with within the standardized administration procedures. For example, if the instructions are to be read aloud by the administrator and simultaneously read by the candidate, provided there is not also a performance or recording problem, the candidate with a hearing impairment or a visual impairment may not be disadvantaged. But it is sometimes impossible to get a true measure of the candidate’s ability within the standardized administration procedure of the test. If a candidate does present a problem because of disability, there are a number of things that the test user needs to consider: whether special provision can be made in terms of equipment, whether some adjustment in terms of time is appropriate and permissible, whether there are alternative forms of the test available (for example in Braille for blind people).

• If the disability is likely to affect the standardized conditions, the test user should take advice from the test publishers about the extent to which changes can be made and how the interpretation of the results might be affected.

For computer-based testing or tests delivered over the internet you need to check whether it is appropriate to use the various ‘accessibility’ functions built in to some operating systems. It is often a relatively simple matter to increase font size, for example, but doing this might have adverse effects on the test. As in all other cases, do not make adjustments unless authorized to do so by the responsible test user.

Planning the session

When planning a supervised test session there are lots of things to do and details to remember, so you need to be systematic and well organized. Exercise 2.1.1 will give you practice in planning a supervised test session. The checklist in Table 2.1.2 describes what you will need to do prior to the test session. It is planned as if you had three weeks in which to make the arrangements. Ideally that would be the case, but if you have less time these points still need to be covered in the time that you do have available. (This is a general-purpose checklist. In practice you may need to tailor it to fit your specific situation.)

If you have read through the whole of Module 2 once, attempt Exercises 2.1.1 and 2.1.2 now.

TABLE 2.1.2: Checklist for planning a test session

About 3 weeks before:
– If you have not administered the test before, study the Manual, familiarize yourself with it and check the administration procedures.
– Check the Test Manual(s) to see what materials are required.
– Check to see if there are sufficient supplies of psychometric materials (for example, answer sheets, test booklets, release codes for computer administration of tests, etc.) in stock, and order any necessary extras.
– Check that the other materials (pencils, erasers, computers, stopwatches, etc.) are in stock and in working order.
– Book the rooms and equipment needed.
– Arrange for additional clerical and administrative support if required.

About 2 weeks before:
– Write to candidates inviting them to the test session. Include any practice materials (or information about how to access online practice materials) that the test publisher recommends (see Exercise 2.1.2).
– Retain a list of candidates to refer to when preparing the test log and name tags.
– Plan the timetable for the day.

About 3 days before:
– Assemble all materials and check everything is present (see the Test Pack, Section 3, for a general checklist of materials).
– Check that the room and equipment are booked.
– Organize the reception arrangements for the candidates.
– Brief any assistant(s) about their role on the day, making sure that they understand the importance of maintaining standard test conditions within the test session.

The day before the session:
– Prepare the log and the score sheet (if required).
– Refamiliarize yourself and your assistant(s) with the materials.

On the day:
– Prepare the room, the tables and materials and check arrangements for the candidates’ arrival.
– Put ‘Test in Progress’ signs on the door.
– Unplug any telephones and divert calls.
– Welcome candidates and follow the timetable.
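
If you keep your planning notes on a computer, the headings of Table 2.1.2 can be turned into calendar dates by counting back from the session date. The following Python sketch is one possible way of doing this and is not part of the Test Pack. The names `MILESTONES` and `planning_dates` are hypothetical, the session date in the example is invented, and each ‘about’ in the table is treated as an exact offset, which you may well want to adjust around weekends or holidays.

```python
from datetime import date, timedelta

# Offsets taken from the headings of Table 2.1.2; "about" is treated as exact.
MILESTONES = [
    ("About 3 weeks before", timedelta(weeks=3)),
    ("About 2 weeks before", timedelta(weeks=2)),
    ("About 3 days before", timedelta(days=3)),
    ("The day before", timedelta(days=1)),
    ("On the day", timedelta(0)),
]


def planning_dates(session_day: date) -> list[tuple[str, date]]:
    """Return (heading, calendar date) pairs counted back from the session."""
    return [(label, session_day - offset) for label, offset in MILESTONES]


if __name__ == "__main__":
    # Hypothetical session date, used only to show the output format.
    for label, due in planning_dates(date(2024, 6, 24)):
        print(f"{due.isoformat()}  {label}")
```

If you have less than three weeks available, the earliest dates will fall in the past; as noted above, the corresponding tasks still need to be covered in the time that remains.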

Do not attempt these exercises until you have read through the whole of Module 2 once.

EXERCISE 2.1.1: Planning the test session

Scenario: A test session is being planned: it will be held in three weeks’ time to test 12 candidates. There is a room which is large enough to accommodate 20 tables and which is quiet and can be reserved for the session. Three computers can be accommodated in the room and can be free for the test session. A full day is available in which to arrange the session. All the test materials can be ordered at short notice if they are not in stock. It has been decided that the tests are fair and suitable for the purpose for which they are to be used. What has not been decided is which of the tests to use. Using the administration instructions from two different manuals given below, plan two separate sessions, each for 12 candidates using the two tests described as the basis for the sessions.

TEST 1. This is a paper-and-pencil test of ability. It requires that candidates sit at individual tables placed at least three feet apart and all facing towards the administrator. The test has inbuilt practice questions which require that the administrator walks around the room and looks at the candidates’ answers to see that they are using the right boxes. The answers are read out to check that the examples are understood and questions are allowed up to this point. The Manual recommends that no more than eight candidates are tested in any one session unless there is a second administrator present to assist. The test takes approximately 45 minutes: the administration of materials and practice session takes approximately 15 minutes and the timed test element is exactly 25 minutes long.

TEST 2. This ability test is administered by computer. Practice tests are available on the test publisher’s website (http://www.publishername.com/practicetests). These may be openly accessed by candidates to help prepare them for the test by familiarizing them with the kinds of items that the test contains. It is recommended that the candidate takes this practice test before coming to the test session. The test itself has two example items to familiarize the candidate with the question format and the use of the computer keys. The test is timed and runs for 20 minutes exactly. It is suitable for group administration.

Plan sessions for each of these tests. Assume that materials will have to be ordered from the publisher. At this stage, do not write to the candidates, but include it in your planning.

EXERCISE 2.1.2: Inviting the candidate to the test session

As a result of your planning, the time scale and staff availability, it is decided that only the computer-based test will be used. Write to one of the candidates inviting him or her to the first session and giving the following information:

• Time, date and venue of the session.

• Length of time the session will take.

• As much information as you can to prepare the candidate for the tests: send out practice leaflets or inform the test taker about available practice tests or direct test takers to appropriate internet testing practice sites.

• Advance notice that they will not be permitted to receive or make calls or text messages during the test session.

• A reminder to bring glasses if they need to wear them.

• Information about the confidentiality of the tests – who will see the answers, what will happen to the data, and who will know the results (including the candidate); how feedback will be given.

• A request for any special requirements that may be needed.

• A request for the candidate’s signed permission to provide information from the test to the people it is intended for.

Now that you have thought about the advance preparation that is necessary for a testing session, you can plan for the administration of Test A in the Test Pack. Whilst this will be on an individual basis or for a small group, there is still a need to plan ahead. A test administrator needs to be calm, friendly but business-like. Good advance preparation and planning ensures that the administrator is in control. A second aspect is to consider the personal preparation that you need to be a good test administrator.


Preparation of yourself as administrator

One of the best preparations for administering a test is to have done the test yourself. This is essential if you are about to administer a test that is new to you. Once you have administered the test and you are familiar with it, it is still good practice to re-read the instructions and think about the requirements before the test administration.

If you have read through the whole of Module 2, attempt Exercise 2.1.3 now.

EXERCISE 2.1.3: Familiarizing yourself with the Test Pack materials

Turn to the Test Pack and take out one of the Test A test booklets (Test Pack, Section 2) and the Administration Instructions (Test Pack, Section 3). Test A is an ability test. Turn to the Administration Instructions. Provide yourself with the relevant materials from the General Checklist of Materials (Test Pack, Section 3). Then read through the Administration Instructions. When you are certain about the examples, check your pencils, start the stopwatch and begin the test. Do not worry about the timing. The stopwatch will give you an indication of how long the test takes for you to complete and should reassure you about the timing. Complete the test and leave the scoring for the moment.

Turn to the Test Pack and take out one of the P5 Personality Inventory test booklets (Test Pack, Section 2). The administration instructions are part of the test booklet. Provide yourself with the relevant materials from the General Checklist of Materials (Test Pack, Section 3). Then read through the Administration Instructions. When you are certain about the examples, check your pencils and begin the test. This is an untimed test, but you might want to set the stopwatch running when you start to see how long you take to complete the questionnaire. Complete the test and leave the scoring for the moment.

Now you can begin to plan your test session. To do this you can use the blank planning schedule below. Choose three people to act as test takers and plan the testing session for them. You may wish to test all three together or use three separate administrations; much will depend (as in any testing session) on your situation. Do remember that these tests have been written for training purposes only and that you cannot give your candidates meaningful feedback. Make this clear to the candidates both in your advance preparation and later in your preamble to the testing session.
Do not invite your candidate(s) to a testing session until you have read through this Module at least once, and you have thoroughly familiarized yourself with the materials and procedure.

Planning schedule

Use the following headings to plan your schedule (see Table 2.1.1 and Exercise 2.1.1).

About 3 weeks before:
About 2 weeks before:
About 3 days before:
The day before:
On the day:

Preparation for computer-based testing sessions

Computer-based test administration, whether over the internet or offline, is becoming increasingly popular not only for Controlled Mode, but also for Supervised and Managed Mode testing. Becoming familiar with computer-based tests is just as important as it is for paper-based ones. You need to be aware of how the test operates, what sort of issues or problems might arise for test takers and how to handle those. You need to become familiar with the technical support documentation provided with the test and how to access additional technical support (by phone or email) when needed.

1. Be familiar with the screen design requirements of the test (e.g. screen size and screen resolution) and ensure that such requirements are compatible with the systems being used.

2. Where test takers are operating remotely (over the internet), ensure that there are instructions presented with the test that inform them of screen design and layout conventions, including where instructional text and prompts are placed, and how instructions can be accessed once testing begins.

3. Be familiar with how items are presented, how the test taker can navigate between screens or items, how the test taker is required to respond and how they can change responses, if that is permitted.
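
The compatibility check in point 1 amounts to comparing the figures in the test’s technical documentation with those of each workstation. The Python sketch below shows one way of recording that comparison. The `Screen` record and the `screen_shortfalls` function are hypothetical names, and the sketch assumes that the published figures are minimum requirements. Some tests may instead require an exact specification, in which case the comparison would need to test for equality; the test documentation is the authority on this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Screen:
    width_px: int       # horizontal resolution in pixels
    height_px: int      # vertical resolution in pixels
    diagonal_in: float  # physical screen size in inches


def screen_shortfalls(required: Screen, actual: Screen) -> list[str]:
    """List each way a workstation falls short of the stated minimum.

    Assumption: the required values are minimums, not exact specifications.
    """
    problems = []
    if actual.width_px < required.width_px or actual.height_px < required.height_px:
        problems.append(
            f"resolution {actual.width_px}x{actual.height_px} is below "
            f"{required.width_px}x{required.height_px}"
        )
    if actual.diagonal_in < required.diagonal_in:
        problems.append(
            f"screen size {actual.diagonal_in} in is below {required.diagonal_in} in"
        )
    return problems
```

An empty list means that the workstation meets the stated minimum. Any entries returned should be raised with the test user or the technical support contact before the session, not worked around on the day.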

Computer-based tests tend to be designed to work on a limited range of systems or hardware configurations. You therefore need to ensure that you have sufficient understanding of the technical and operational requirements of the test (i.e. hardware and software), as well as the necessary hardware, software and human resources to obtain, use, and maintain the CBT on an ongoing basis.

We all know that computer systems sometimes ‘crash’, connections on the internet can be lost and servers can go offline. For these reasons it is important to take account of the robustness of the computer-based or internet-delivered test when making preparations for a test session.

1. Ensure processes are in place to log and resolve problems that may arise during testing.

2. Check the availability of the information necessary for contacting the provider of technical support and for using technical support services as necessary.

3. Inform test publishers/developers where problems occur with the responsiveness of the computer to the test takers’ input.

4. For internet testing, know the recommended procedures for dealing with hang-ups, lost connections and slow downloads, and advise test takers accordingly.

5. Provide the test taker with the technical support specified in the test documentation if any routine problems occur.

As with paper-based tests, it is important to provide help, information, and practice items for people who are going to have to complete a computer-based or internet-delivered test. If a test is being delivered in Controlled Mode then you need to ensure that the test taker can:

• Log in and off the system (e.g. the use of passwords).

• Access information on the test and the testing process before beginning the test and access on-screen help while completing the test.

You also need to provide sufficient opportunity for the test taker to become familiar with the testing software and the hardware, and provide the information they need to contact you or someone else for help and support if they have a problem and are working remotely. Where appropriate:

• Inform the test taker about available practice tests.

• Direct test takers to appropriate internet testing practice sites.

Stage 2: Administration

The second stage in test administration is the administration of the test itself. This involves building up a rapport with the candidate(s); briefing them on the purpose of the assessment; using the standard instructions to administer the tests; allowing sufficient time to complete example questions and deal with questions raised by the candidates; maintaining the test conditions to the standard given in the Test Manual and, finally, keeping strictly to the instructions for the timing of the tests. It also requires keeping a record (the log) of the testing session.

The Test Session Log

Recording the session is an important part of any testing session. You will find a number of copies of an example Test Session Log in Section 3 of the Test Pack. These are for you to use when you are carrying out the practice test administration exercises. Take one out now and have a look at it. Log sheets are included with some tests, but you will probably find it more convenient to design your own using a computer spreadsheet. Use the examples provided as a guide. The Test Session Log:

• Allows you to record the names of the candidates and the tests they have taken. This is useful when you are collating information for the test user to use in feedback from the testing session. It is particularly useful when you do not know the candidates in advance as it can act as a register. Test scores can be recorded on this sheet after scoring is complete (page 1).
• Provides a means of recording details of the timing. Noting the exact time that the session begins is a double safeguard should your stopwatch fail (page 2).
• Provides a checklist against which you can count in and count out the materials you have used. This is vital for test security (page 2).
• Allows you to record anything that may cause the test results to be questioned and which thus needs to be treated with caution when interpreting the test (page 3).
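If you prefer to keep the log electronically, the four functions listed above can be sketched as a small data structure. This is only an illustration, written in Python rather than a spreadsheet: the class name, the field names, the 20-minute time limit and the candidate labels are all assumptions made for the example and do not come from the Test Pack.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class SessionLog:
    """Hypothetical electronic version of a Test Session Log."""
    candidates: list = field(default_factory=list)      # doubles as a register
    timings: dict = field(default_factory=dict)         # test -> (start, due finish)
    materials_out: dict = field(default_factory=dict)   # item -> number handed out
    materials_in: dict = field(default_factory=dict)    # item -> number collected
    incidents: list = field(default_factory=list)       # anything affecting results

    def start_test(self, test: str, start: datetime, minutes: int) -> datetime:
        """Record the exact start time and work out when the test must stop."""
        finish = start + timedelta(minutes=minutes)
        self.timings[test] = (start, finish)
        return finish

    def missing_materials(self) -> dict:
        """Return every item whose count in differs from its count out."""
        return {
            item: handed_out - self.materials_in.get(item, 0)
            for item, handed_out in self.materials_out.items()
            if handed_out != self.materials_in.get(item, 0)
        }


log = SessionLog(candidates=["Candidate 1", "Candidate 2"])
log.materials_out = {"question book": 2, "answer sheet": 2, "pencil": 4}

# Assumed 20-minute limit; the real limit is the one given in the Test Manual.
finish = log.start_test("Test A", datetime(2006, 7, 10, 10, 0), minutes=20)
log.incidents.append("10:07 someone entered the room briefly")

log.materials_in = {"question book": 2, "answer sheet": 2, "pencil": 3}
print("Stop the test at", finish.strftime("%H:%M"))
print("Not yet returned:", log.missing_materials())
```

Recording the start time and the due finish time together gives the same safeguard against a failed stopwatch as the paper log, and the count of unreturned items is the electronic equivalent of counting materials in and out.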

Fill in page 1 and page 2 of the log sheet before the session. Fill in the time (page 2) as each test is administered. Fill in page 3 as necessary. Sign and date the log sheet and keep it secure with the test data until the data are destroyed.

Introducing the test session

Testing is a very formal procedure. As you have learned in the preceding sections, tests have been standardized under stringent conditions and, in order to be fair to your candidates, you must stick to the standardized instructions and conditions for the test so that the test user can compare the candidates with relevant norm groups or other criterion-referenced groups. This means following the test instructions rigidly. However, it does not mean that the whole session need be overly formal and rigid.

The initial introduction is the opportunity to welcome candidates, tell them about the length of the test and offer them the chance to have a final cigarette or a visit to the lavatory. Also ensure that they have all turned off any mobile phones, PDAs or other devices that might create an interruption or distraction. Make sure they know in advance that they will not be permitted to receive or make calls or text messages during the test session.

It is also a chance to set them at ease and to tell them about the purpose of the testing, what will happen to the results, and when they will be able to get feedback. It is a good opportunity to talk about what will happen to the data and (if applicable) to outline briefly how your client or your company complies with the Data Protection Act (see Section 2.2 of this Module). Do not, in your efforts to put the candidates at ease, underplay the importance of the tests and the necessity for them to do their best.

Think for a moment about the session you are planning for Test A:

• Why are your candidates doing this test?
• What will happen to their answers?
• Will you be storing information about them?
• If so, why?

You need to think about this in two ways: (i) as a role play (where you and your candidates are pretending that this is a real test situation); and (ii) as a training situation (where you are gaining experience of how to administer a test, and the candidates are there to help you in your training). The role play will involve working out an elaborate scenario and preparing information accordingly in order to script your introduction – just as you would if you were testing real candidates in a job selection situation. However, you need to make clear to your candidates – before moving into your role play – what the real conditions are of the training situation. That is:

• Your ‘candidates’ are doing the test only to help you.
• The candidates can put a fictitious name on their answered question book, and therefore the data will not be connected to them.
• There will be no feedback to them as there are no normative data on the test and you, in any case, are not qualified as a test user.
• The completed question book will be kept by you as part of your portfolio for assessment purposes.

If you have read through the whole of Module 2, attempt Exercise 2.1.4 now.

EXERCISE 2.1.4: Introducing the session to the candidates

Make notes now for the introduction to your session. Do not make a ‘tight’ script as it will defeat the object of the introduction, which is to put candidates at ease. Remember, you need to:

• first, brief your candidate on the real purpose of the session (that it is part of your learning);
• second, provide a ‘role-play’ introduction.

Administration of the test

The next step is giving the instructions to your candidate. This usually requires the administrator to read the instructions aloud, but be guided by the Manual for this. For some computer-based tests, you will simply be asked to tell the candidates to work through the instructions and examples at their own pace, as they appear on the screen. Test A requires you to read the instructions to the candidate. The transition between the informal introduction and the formal reading of instructions can be bridged by saying something like:

‘I am now about to administer the test. To be fair to every candidate, it is important that every person is given the same instructions and that those instructions are the same as the ones given to the group that you will be compared with. To be fair in this way, I will read the instructions to you. Are there any further questions before I begin?’

From this point onwards, the test should proceed exactly as is stated in the Manual. Examples, if they are provided, should be carried out according to the instructions and for timed tests the timing should be precise. Any minor interruptions, such as someone entering the room, should be noted in the log. Major interruptions, such as a fire drill or a power cut in a room that relies on artificial lighting, mean that the test session should be abandoned. If a candidate has to leave the room he or she can be given no extra time, but a note should be made in the log against the candidate’s name. Candidates who arrive late cannot be admitted into the group once the formal administration has begun.

When the test time is up, stop the test promptly and collect in the materials. Sometimes there are instructions in the Manual for doing this. It is important for test security reasons that you collect in all the printed materials that you have handed out. Check these in the log. Thank your candidates and remind them about the next stage in the procedure. Ask that they do not remove anything from the room that they did not bring in with them – including scrap paper etc.

When the candidates have left, collect in any other materials and attend to any machines (for example, computers) that have been used. Carry out all necessary procedures for saving data and making them secure before switching them off. If separate question books have been used, check these for marks before returning to stock.

Computer-based test (CBT) administration and administration of tests over the internet

The ITC’s Guidelines for computer-based testing and testing over the internet cover a range of issues associated specifically with computer-based testing – whether delivered online or offline. In particular they emphasize the need to:

• Only use the test in those modes of administration for which it has been designed (e.g. do not use a test in an unsupervised mode when it is specified for use only in supervised modes).
• Verify that test takers know how to interact with an internet testing system (e.g. basic browser operation, use of access passwords).
• Provide a contact point (e.g. email or phone) for those who do not understand the purpose of the test when testing is being carried out in Controlled Mode.

Detailing the level of control over the test conditions

The ITC Guidelines emphasize the need to ensure that the right equipment is used and is checked before the session starts. Test takers must be comfortably seated, and work positions, lighting and other conditions should be in accordance with health and safety requirements. For testing over the internet, where direct supervision of conditions is not possible, test takers should be provided with advice about this.

1. When administering the test, adhere to the standard hardware, software, and procedural requirements specified in the test manual. Before testing, ensure that software and hardware are working properly.
2. When testing at a specific test centre, ensure that the test taker is comfortable with the work station and work surface (e.g. that the ergonomics are suitable). For example, test takers should:
   a. be encouraged to maintain proper seating posture,
   b. be able to easily reach and manipulate all keys and controls,
   c. have sufficient leg room, and
   d. not be required to sit in one position for too long.
3. When testing via the internet, provide instructions to test takers that specify the best methods of taking the test.
4. Ensure that the facilities, conditions, and requirements of the testing conform to national health and safety, and union rules. For example, there may be rules governing the length of time a person should work at a monitor before having a break, or rules as to adequate lighting, heating, and ventilation.
5. When testing over the internet, inform test takers of such rules and regulations.


Stage 3: Scoring

The third stage is concerned with scoring the tests. This involves checking for ambiguous marks, careful use of scoring keys, checking the total scores, translating the raw scores into standard scores (see Module 2.1), transcribing the results to score sheets and finally completing a record of the testing session in the test log.

Tests vary in the way they are designed and the methods used for scoring. The Test Manual should always describe the scoring procedure. For some paper-based tests there will be ‘scoring keys’ with instructions printed on them. Always follow these instructions very carefully.

Most tests these days tend to have separate question books and answer sheets. This enables the books to be reused. However, there are some tests where answers have to be written into the question book. These books should not be reused. They should be destroyed after the assessment process has been completed, as long as there is no legal requirement to retain them. The principles of scoring are exactly the same for scoring answers in question books as they are for separate answer sheets, though the design of the scoring keys may differ. Some of the different types of scoring key are described later. We will now go through the process of scoring Test A and then consider some of the variations you are likely to encounter in test design and scoring methods.

Checking answer sheets

Before beginning to score any test, the answers should be checked for ambiguities, multiple responses to the same item, and so on. If you have read through the whole of Module 2, attempt Exercise 2.1.5 now.

EXERCISE 2.1.5: Checking the answer sheets

Take your own completed booklet for Test A and check whether you have ringed two or more answers to any one question. If you have, mark the multiple answers with a cross using a red pen. (When you score your candidates’ booklets, do the same thing.)

Take your own completed booklet for the P5 Personality Inventory and follow the instructions for scoring to check that there is a tick in one and only one of the five response columns for each statement. If there are multiple answers, mark them with a cross using a red pen. (When you score your candidates’ booklets, do the same thing.)

Hand-scoring

For Test A the correct answers are listed in the Test Pack. There is also one clear acetate scoring key in the Test Pack. Carefully follow the instructions on how to use it. If you have read through the whole of Module 2, attempt Exercise 2.1.6 now.

EXERCISE 2.1.6: Scoring

Using the acetate key from the Test Pack, score your answers for Test A. Give no points to any item that has been marked with a red cross even if one of the answers is the correct one. Count up the total number correct and write this in the space provided on the front cover of the Test A booklet. Then add up the number of omissions and the number of wrong answers. If no option has been chosen for a question, it is an omission. If one incorrect option or if two or more options have been chosen, it is wrong. Write these totals into the other spaces provided on the front cover. Add the three totals. They should add up to 25. If they do not, you will need to check the scoring as you will have made a mistake.

Now follow the scoring instructions provided in the Test Pack for scoring your responses to the P5 Personality Inventory. Record the total scale scores on the form provided.

When scoring tests by hand always check that the number of right answers, the number of wrong answers and the number of omissions add up to the total number of items.
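The scoring rules in Exercise 2.1.6 and the final cross-check can be written out as a short routine. The sketch below is for illustration only: the function name, the 25-item answer key and the candidate's responses are all invented, and the real key is the one in the Test Pack. Each response is held as the set of options the candidate marked, so an empty set is an omission and a set with two or more options is a multiple response.

```python
from typing import Sequence, Set


def tally_scores(responses: Sequence[Set[str]], key: Sequence[str]) -> dict:
    """Classify each item as right, wrong or omitted and return the totals."""
    if len(responses) != len(key):
        raise ValueError("one set of marked options is needed per item")
    right = wrong = omitted = 0
    for marked, correct in zip(responses, key):
        if not marked:               # no option chosen: omission
            omitted += 1
        elif marked == {correct}:    # exactly one option, and it is the keyed one
            right += 1
        else:                        # one incorrect option, or two or more options
            wrong += 1
    totals = {"right": right, "wrong": wrong, "omitted": omitted}
    # Cross-check from the exercise: the three totals must equal the item count.
    assert sum(totals.values()) == len(key)
    return totals


# Hypothetical 25-item key and one candidate's marks (invented for illustration).
key = ["a", "b", "c", "d", "e"] * 5
responses = [{k} for k in key]    # start with every item answered correctly
responses[3] = set()              # item 4 left blank
responses[7] = {"a"}              # item 8: one incorrect option
responses[10] = {"a", "b"}        # item 11: two options ringed, one of them correct

totals = tally_scores(responses, key)
print(totals)
```

Note that item 11 is counted as wrong even though the keyed answer is among the options ringed, which mirrors the instruction to give no points to any item marked with a red cross.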

Other scoring procedures

As mentioned earlier, not all tests are designed like Test A. Sometimes tests come with a separate question and answer book; sometimes the questions and answers are on a single sheet; and sometimes both are in the same book. When answers are written into a book, scoring is usually carried out with a key in the form of a strip of card which is aligned with the book so that the correct answers line up with the spaces where the candidate will have indicated their responses. You can then compare the two to see which answers are right, which ones are wrong and which were omitted.

Test items tend to be of two main formats: open-ended or multiple-choice. For open-ended items, candidates have to write down what they think the correct answers are. For multiple-choice items they select one of a number of possible answers (usually by placing a cross or tick in a box). Tests may use just one format, or they may mix the formats. Typically, open-ended test item formats are scored with keys which are laid beside the answers, while multiple-choice formats use keys which are laid over the answer boxes.

Scoring open-ended response items is rather more difficult than scoring multiple-choice ones as you may have problems reading people’s writing. Always follow the scoring key exactly. If more than one answer is permitted, it will be indicated on the key.

Multiple-choice format items often use keys made of clear or coloured acetate with rings or crosses on them to indicate the right choice. These are laid on top of the page and checked to see if the marks on the acetate correspond to the choices made by the candidate. An alternative to the acetate overlay sheet is the card overlay with holes punched in it. These holes are positioned to align with the right answer boxes on the page underneath.


In multiple-choice tests, candidates may be asked to circle answers, mark or tick boxes, blacken a square or circle, or enter a chosen number in a box. Checking the answer sheet carefully before scoring is particularly important if card punched overlays are used. Once the card is laid over the page you can only see what was marked in the ‘right’ boxes. So someone who had marked every option would appear to have got the item right!

Integral answer sheets and scoring keys. These consist of two sheets of paper fixed together with a tear-off strip round the edge for separating them. The answer sheet is printed on the top and scoring instructions on the second sheet. The candidate’s marks on the top sheet are automatically transferred through onto the bottom sheet by means of carbon. Scoring is carried out by removing the tear-off strips and following the scoring instructions printed on the sheet underneath. When using this type of answer sheet, always check any marks that appear just outside the key on the second sheet, to see whether they are actually in a box on the front sheet. If they are, give the appropriate mark as the two sheets of paper have simply shifted slightly.

Answer-until-correct format. This is a variant of the multiple-choice format. The candidate is given a number of alternatives and can then ‘reveal’, for each alternative, whether it is correct or not. This is either done by scraping a layer of silver off the answers or by opening a tear-off tag. For computer-based tests (see below), the process is much simpler: you continue to select answers until you find the correct one, and then the next item is presented. The candidate is told to keep revealing answers until the correct one is found. The score is based on the number of answers revealed for each item – the fewer the better.

Machine scoring. Some test answer sheets are designed for computer scoring. Some of these can be read using a standard desktop A4 scanner with the publisher’s software. Others may require the use of special machines known as optical mark readers (OMR). While an OMR can be purchased, most organizations would not find the expense of having their own machine worthwhile. It is far more common to make use of a commercial bureau scoring service for machine-scoreable answer sheets. You send off a bundle of answer sheets to the bureau and they are returned with the scores. For machine scoring, it is particularly important that answer sheets are first checked to ensure that there are no ambiguous marks on the page, and that responses are clearly marked in the correct locations inside the response boxes. You also need to be careful about using the correct grade of pencil. Candidates need to be instructed to make sure that they fully blacken each response box and that, if they change their mind, they must completely erase the first response.

Computer scoring. Scanners and OMRs provide a means of getting the responses on an answer sheet into a computer so that it can do the scoring. However, many tests can be obtained in computer-administered formats or may be administered online (in Supervised and Managed Modes as well as Open and Controlled Modes). In many cases the whole test is presented on computer; in others a conventional test book may be used, with either a desktop computer or a PDA acting as the ‘answer sheet’. In both cases, the scoring will be carried out by the computer. Most software will also convert raw scores into standard scores and percentiles using a range of norm group options, and may also provide more sophisticated analysis and interpretation options.


Converting raw scores to standard scores and percentiles

When choosing a test (see Module 1.2), one of the criteria test users have to consider is whether the test has norm group(s) that are appropriate for their requirements. Assume for the moment that an appropriate norm group was selected by the test user at the outset of the testing procedure. If you have read through the whole of Module 2, attempt Exercise 2.1.7 now.

EXERCISE 2.1.7: Converting raw scores into percentiles and standard scores

Test A

Turn to Section 4 of the Test Pack, and using the general population norms, Table 2, locate your Test A score in the raw score column, read across to the percentile score column and the T-score column. The following example uses two raw scores (17 and 13):

Column 1        Column 2          Column 3
T-score 63      raw score = 17    percentile score = 90
T-score 53      raw score = 13    percentile score = 60

(NB These are ‘manufactured’ data for the purposes of training only.)

Now turn to Table 1, column 4. Check the raw score of 17 against university students. You should obtain a percentile score 36.6. Use the General Purpose Conversion Tables (Section 1 in the Test Pack) to convert the percentile to a T-score. From Table 2, we find that a percentile 36.5 is equivalent to a T-score of 47.

Now do the same for your own score. First find the general population percentile and T-score and then find the percentile and T-score based on the university student norms. Enter the general population and university student T-scores in the space provided at the bottom of the front page of the question booklet.

P5 Personality Inventory

Now you can convert your P5 scale raw scores into sten scores. Use the P5 norm table provided to look up the sten equivalents for each of your five raw scores. Enter the values in the spaces on the form provided. Here are some examples. Check to see if you get the same answers:

Scale                 Raw score   Sten
Extravert             36          7
Agreeable             15          2
Conscientious         29          5
Emotionally Stable    42          8
Open                  42          9
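The percentile-to-T-score step in the Test A example can also be checked by calculation. The sketch below assumes that the conversion tables give normalized T-scores, that is, each percentile is converted to a z-score under the normal curve and then rescaled to a mean of 50 and a standard deviation of 10. That assumption and the function name are ours, not the Test Pack's, so always use the published tables for real scoring.

```python
from statistics import NormalDist


def percentile_to_t(percentile: float) -> int:
    """Convert a percentile (strictly between 0 and 100) to a normalized T-score."""
    if not 0 < percentile < 100:
        raise ValueError("percentile must lie strictly between 0 and 100")
    z = NormalDist().inv_cdf(percentile / 100)   # z-score under the normal curve
    return round(50 + 10 * z)                    # T-scores: mean 50, SD 10


# The three percentiles quoted in the Test A example.
for percentile in (90, 60, 36.5):
    print(percentile, percentile_to_t(percentile))
```

Under this assumption the percentiles 90, 60 and 36.5 come out as T-scores of 63, 53 and 47, matching the values quoted in the exercise.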


Summary of points to note when scoring tests

1. Check the answer sheets for ambiguities.
2. Check the Manual and scoring keys for specific scoring instructions.
3. Check that your scoring is correct.
4. If required, convert the raw scores into standard scores and/or percentiles using the norm group selected by the test user (if there is a choice of more than one).

Stage 4: Completing the administration procedures

The fourth stage is completing the procedures once the test has been carried out: checking that all the materials are collected in, that they are clean and unmarked and replaced in secure storage; entering the scores on the Test Session Log (if these have not automatically been entered into a computerized record format) and ensuring that the data are kept secure and that the requirements of the Data Protection Act are observed.

Paper-and-pencil materials

• Check all the reusable materials for marks. If there are marks, erase them. If they are indelible then dispose of the materials securely (by shredding or burning them) and remember to reorder stocks.
• Return the clean reusable materials to secure storage (see Section 2.2).
• Enter the test scores onto the Test Session Log.
• Place answer sheets and Test Session Log (or printouts of the Log if it was computer-based) in secure storage.

Computer materials

• Obtain a printout of the results.
• Complete data back-up procedures.
• Ensure that the raw scores, and derived scores (e.g. percentile scores and T-scores) are recorded on the Test Session Log.
• Make sure any security procedures are carried out.
• Store printouts of the results and the Test Session Log securely.

Record-keeping, monitoring and follow-up

To maintain quality control over any procedure, you need to record what happens and follow through decisions which have been made to see how effective they were. While it is important to keep records, it is also important to ensure that the information they contain remains confidential. You should keep a record of which tests you have used, when you used them, and why.

2.2 Issues of Confidentiality and Security

Maintain the confidentiality of test-taker results

The ITC Guidelines on CBT and Testing on the Internet say that you need to:

1. Know how confidentiality will be maintained when data are stored electronically.
2. Adhere to country-specific data-protection laws/regulations governing the collection, use, storage and security of personal data.
3. Protect all material via the use of encryption or passwords when storing sensitive personal data electronically on test centre facilities.
4. Apply the same levels of security and confidentiality to back-up data as to the data on the live system when back-ups are used to store personal data.


Security of test materials

The rationale underlying testing is that its usage will be fair to all candidates and allow them an equal chance to demonstrate their abilities using standardized materials under standardized conditions. It is for these reasons that the photocopying of materials (as well as being illegal) can be unfair: distorted, faded, colour-changed materials change the standard and the equality of opportunity for those candidates who receive them. Prior access to the materials by some candidates could also change the equality of opportunity by giving unfair advantage over the other candidates. It is the responsibility of the person registered as the test purchaser to see that all test materials are kept secure and that only original materials are used.

• To maintain security of the materials, all test materials (manuals, answer sheets, question booklets, profile sheets, administration instructions) should be kept in secure, locked storage to which only registered test users have access.
• Materials that are used in a test session should be checked out and checked back in. Candidates should not be allowed to take any materials from the room in which testing takes place.
• Any materials that are spoiled during a testing session should either be destroyed by shredding or incinerating or returned to the suppliers if they are leased materials. Materials should not be put into waste paper baskets for ordinary disposal.
• It should be impressed upon non-trained staff who order materials, that these are confidential controlled materials and copying for any purpose is not allowed, nor can materials be lent to anyone. Any requests to borrow materials should be directed to the registered test user.

In relation to maintaining the security of test materials, the ITC Guidelines on CBT and Testing on the Internet say that you need to:

1. Know the features that have been developed to ensure the security of test materials, and develop procedures that reduce unauthorized access to such materials.
2. Respect the sensitive nature of test materials and intellectual property rights of test publishers/developers.
3. Protect test materials from being copied, printed, or otherwise reproduced without the prior written permission of the holder of the copyright.
4. Protect passwords and user names from becoming known to others who are not authorized or qualified to have them.
5. Inform the internet service provider/publisher of any breach in security.

Data Protection Act 1998

The Data Protection Act 1998 implements a European Directive on data protection and privacy. It has far-reaching implications for testing and test administration. It gives individuals rights concerning personal data stored on computer or in any other filing system. Personal data is defined as information about living, identifiable individuals. This need not be particularly sensitive information, and can be as little as a name and address.

The Information Commissioner publishes guidance on the use of personal data in employment, recruitment and selection and other relevant areas. Further information about the Data Protection Act and copies of this guidance can be obtained from http://www.informationcommissioner.gov.uk/ or by post from:

Information Commissioner’s Office
Wycliffe House
Water Lane
Wilmslow
Cheshire SK9 5AF

The qualified test user will have to ensure that your organization is registered under the Data Protection Act and that steps have been taken to ensure that personal data is obtained and managed in accordance with the responsibilities of data controllers as set out under the Act.

The DPA represents the more general European Union Directive on data privacy in terms of eight key principles:

1. Personal data shall be processed fairly and lawfully and, in particular, shall not be processed unless either consent has been given or one of the conditions of necessity outlined in the Schedules to the Act has been met.
2. Personal data shall be obtained only for one or more specified and lawful purposes, and shall not be further processed in any manner incompatible with that purpose or those purposes.
3. Personal data shall be adequate, relevant and not excessive in relation to the purpose or purposes for which they are processed.
4. Personal data shall be accurate and, where necessary, kept up to date.
5. Personal data processed for any purpose or purposes shall not be kept for longer than is necessary for that purpose or those purposes.
6. Personal data shall be processed in accordance with the rights of data subjects in this Act.
7. Appropriate technical and organizational measures shall be taken against unauthorized or unlawful processing of personal data and against accidental loss or destruction of, or damage to, personal data.
8. Personal data shall not be transferred to a country or territory outside the European Economic Area, unless that country or territory ensures an adequate level of protection for the rights and freedoms of data subjects in relation to the processing of personal data.

Confidentiality of test data

As a rule of thumb, it is a good idea to keep in mind that any information you might obtain using a psychological test belongs to the test candidate in the first instance. Whatever you do with it should, therefore, be done only with their permission. It is always best to be as explicit as possible about rights of access to data. It needs to be made clear to candidates whether their results will be confidential to you and the test user you are working for, or whether they will be passed on in some form to some other party. You need to make sure that you have the candidate’s permission to provide this information to the people it is intended for.

The form in which information is passed on is also important. As a test administrator, you should always refer any request for information about candidates or their scores to the qualified test user. You should never provide this information without authorization. The test user will be responsible for deciding who is given information and in what form. As a rule, actual test scores should only be passed on to people who are able to interpret them. This includes the candidate and other recipients of the results.

The ITC Guidelines on CBT and Testing on the Internet say that where data are being transferred over the internet, you also need to:

1. Prior to test administration, have knowledge of and inform test takers of the security procedures used to safeguard data transmitted over the internet.
2. Confirm with the internet service provider that they frequently back up data.
3. Verify that the internet service provider is able to allow test users and authorized others to discharge their responsibilities as data controllers under local data protection and privacy legislation (e.g. the European Union’s Directive on Data Protection).

If possible, test users should store test scores and general information about the person’s age, gender, ethnic background, training, etc. for possible future use by psychologists for test development, production of new norms and validation against training outcome and job performance. However, if test information is stored for scientific purposes then, for the protection of test candidates, the test user should be very careful to ensure that they do not keep any information which might enable individual people to be identified.
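As an illustration of the last point, a record kept for research purposes might have its directly identifying fields removed before it is stored. The sketch below is a minimal example only: the field names and the sample record are invented, and removing names and contact details is a first step rather than a guarantee, since a combination of the remaining details could still point to one person in a small group.

```python
# Assumed names for the directly identifying fields; adjust to your own records.
IDENTIFYING_FIELDS = {"name", "address", "email", "candidate_id"}


def strip_identifiers(record: dict) -> dict:
    """Return a copy of the record without the directly identifying fields."""
    return {
        field_name: value
        for field_name, value in record.items()
        if field_name not in IDENTIFYING_FIELDS
    }


# Invented example record, for illustration only.
record = {
    "name": "A. Candidate",
    "email": "a.candidate@example.org",
    "age": 34,
    "gender": "F",
    "test": "Test A",
    "raw_score": 17,
}

research_record = strip_identifiers(record)
print(research_record)
```

The original record is left unchanged, so the identifiable version can be held securely, and destroyed when no longer needed, quite separately from the research copy.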


2.3 Putting it into Practice

As this is largely a practical Module, you might like to think of a number of practical problems you might face. So this section replaces the usual Self-Assessment Questions (SAQs) and supplements the practical task you will carry out at the end of this Module. Each answer refers you back to Sections 2.1 and 2.2, where you will find information to help solve your problem.

2.3.1 SAQ

Dealing with problems

How would you deal with the following problems during (i) a supervised paper-and-pencil testing session and (ii) a supervised computer-based testing session?

(a) A thunderstorm caused a 3-second power cut?
(b) A candidate became ill?
(c) A candidate finished five minutes before time and asked if they could leave?
(d) A candidate asked you (quietly and discreetly) for further information about a test question?
(e) A candidate asked if he/she can take home a question book to finish questions not finished in the test?


What would you do if at another time (i.e. outside the testing session):

(f) A candidate asked for the results of a computer-administered test that she/he had completed earlier in the week?
(g) The managing director asked you for a copy of a test for a friend whose offspring is undergoing tests for graduate selection into another company?
(h) A new edition of a test was purchased and the old materials became redundant?
(i) A junior member of staff photocopied answer sheets to save money?


Overview and practical task

You are now almost ready to begin your own test administration. There is one final task which is peculiar to this Module and does not generally form part of the test administration process. Because you may be carrying out this work as distance learning, you will need to get feedback from your candidates to ensure you got it right or to help you to improve. To standardize this feedback and make it objective, please give a Candidate Evaluation Questionnaire to each of your candidates so that they can give you constructive criticism on your performance. Copies of the Candidate Evaluation Questionnaire are contained in the Test Pack, Section 3. Ask your candidate(s) to complete the questionnaire at the close of the test session. If you have just read Module 2 for the first time, then re-read the first section of Module 2 on Test Administration and complete the exercises. Then do Exercise 2.3.1. If you have re-read Module 2 and completed all the exercises in Section 2.1, then, using those exercises as a guide, carry out Exercise 2.3.1.

EXERCISE 2.3.1: Administering Tests A and B

Do not attempt this exercise until you have read through the whole of Module 2 once and have completed Test A yourself. Prepare for your training testing session. Make sure you have fully familiarized yourself with the Test A and P5 Personality Inventory materials and have completed them both yourself before you first administer them to a candidate. Note that for training purposes you can ‘collapse’ the normal two- or three-week planning period into a few days.

– Plan it.
– Write to, or ring, or speak to, your own candidate(s) at the appropriate time in your plan. Give them all the information you have outlined.
– Review the instructions.
– Prepare your checklist and materials.
– Timetable the session.
– Prepare a quiet room.
– Carry out the session as planned.
– Ask candidates to complete the appraisal forms.
– Score the answers.
– Translate into percentiles and T scores or stens as appropriate.
– Give feedback to candidate.

Repeat this procedure with other people, until you are confident about your administration and you have dealt with any problems noted in your Candidate Evaluation Questionnaires.
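To see how percentiles, T scores and stens relate to one another, the following minimal sketch converts a raw score using a norm group mean and standard deviation. It assumes the norm group's scores are roughly normally distributed, and the mean and SD used are hypothetical. In practice you should read converted scores from the norm tables in the test manual rather than calculate them yourself.

```python
import math
from statistics import NormalDist


def z_score(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Express a raw score in standard deviation units from the norm mean."""
    return (raw - norm_mean) / norm_sd


def t_score(z: float) -> float:
    """T scores have a mean of 50 and a standard deviation of 10."""
    return 50 + 10 * z


def sten(z: float) -> int:
    """Stens have a mean of 5.5 and an SD of 2, reported as whole numbers 1 to 10."""
    nearest = math.floor(5.5 + 2 * z + 0.5)  # round half up to a whole sten
    return max(1, min(10, nearest))


def percentile(z: float) -> float:
    """Percentage of a normally distributed norm group scoring below z."""
    return 100 * NormalDist().cdf(z)


if __name__ == "__main__":
    # Hypothetical norm group: mean raw score 24, SD 8 (not taken from any manual).
    z = z_score(30, norm_mean=24, norm_sd=8)
    print(z, t_score(z), sten(z), round(percentile(z)))
```

All three scales carry the same information about where the candidate stands relative to the norm group; they differ only in the units used to express it.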


2.4 Feedback and Reporting

There are a number of stages involved in feeding back test results which might involve you as a test administrator:

1. Preparation of scores for feedback.
2. Making arrangements for the test feedback provider to meet with the candidate and, where appropriate, any third party or client.
3. Helping in the preparation or generation of a written report.

Prior to testing, your candidate should have been briefed either by you or the test user about the reasons for using psychological tests and how the information obtained from them will be used. At that time, the candidates should have been informed about what provision has been made for giving them feedback on their results and on who else will be provided with the information. The provision of feedback to candidates is the test user’s responsibility. As a test administrator you should not be asked to do this. However, it is helpful to be aware of some of the ethical and professional issues involved in providing feedback.


Ethical and professional issues

A number of further ethical and professional issues are raised by the need to provide information about the candidate’s performance to third parties (e.g. those making selection decisions). There has been much debate in professional circles about this. However, there is agreement on certain fundamental points concerning the rights of test candidates and the ‘ownership’ of their results. These points are consistent with good practice and are now embodied in legislation, such as the UK Data Protection Act 1998, referred to earlier.

• Testing should only take place with the informed consent of the person being tested. ‘Informed’ means that they understand what the tests are being used for and who will have access to the data. It does not mean that they have a right to look over the test materials prior to being assessed.

• Their consent should be obtained for the intended use of the test results and the test results should never be used for purposes other than those to which the candidate has consented. Candidates should know how the results are to be stored, who will have access to them, and for how long they will be retained in an identifiable form.

• Identifiable records and results of a candidate should not be released to a third party without the prior knowledge and consent of the candidate.

• For purposes of validation, norm development and so on, it is necessary to retain test scores for considerable periods of time. However, it may not be necessary to retain them in a form which allows the individuals involved to be identified. Any organization which carries out testing needs to develop policies and procedures for handling test data and for removing any personal identification once the information has been used for its primary agreed function.

• Test scores should be considered to be the property of the person who generated them, that is, the candidate. As such, candidates have a right to know what their scores are. However, this raises a real problem. Unless they are a qualified test user they will not be able to understand what their results mean.

Preparation of scores for feedback

Let us assume that you have scored the tests and have converted the raw scores to appropriate standard scores or percentiles. These now need to be put into context in preparation for feedback.

The test conditions

One context is provided by the conditions under which the tests were taken. Look at the Test Session Log. Did you record any problems with test conditions which may have affected the candidates? Were there any interruptions or disturbances such as fire alarms or sudden noises? Was the testing session held late in the day? Did any candidates sit the tests after a long and tiring journey or on a day when weather conditions made travel difficult? Did candidates have time to relax before the test session started, or were they rushed straight in to do the tests when they arrived? Such events will not necessarily affect the score that a candidate obtains. However, they act counter to the ‘standardization’ of conditions which is assumed when test scores are interpreted, so it is possible that they may have had some effect on their performance. It is therefore important for the test user to consider such factors, not as a basis for ‘adjusting’ a person’s score, but so that they can interpret how that score arose and assess the reliance that can be placed on it as a fair and accurate assessment of the individual’s aptitude or ability.

Comparisons with the performance of relevant others

For most ability and aptitude assessment purposes, an individual’s scores need to be considered in relation to those of one or more comparison groups. Each norm group provides a context within which one can consider the individual person’s performance. The test user will have advised you of which norm or norms to use and you should be able to provide them with the necessary percentile or standard scores.

EXERCISE 2.4.1: Preparing information for the test user

CASE STUDY 1

A candidate has taken the VeNuS battery of tests. He is applying for a post as an engineering apprentice in a large company and is currently awaiting the results of his GCSE examinations in which it is predicted he will do well in Maths, English and Craft, Design & Technology. The job requires that the candidate has ‘a good standard of written English and will be able to cope well with the Maths and Engineering drawing at further education level’. On the day of the tests he leaves the room three minutes before the end of the Spatial test. He returns later to collect his jacket and says he left because he was feeling ill. His scores on the tests are:

V    N    S    G
25   34   22   81

Using the VeNuS manual to help you, prepare these scores for feedback. Prepare the notes in this order:

(i) Convert the raw scores to standard scores and percentiles. Describe these in lay terms – use the 5-point grading system (see Module 1 – percentiles) to help you choose consistent labels for above- and below-average performance.
(ii) Prepare a note for the test user regarding the candidate’s behaviour during the test session.

Making arrangements for the feedback session

Making arrangements with the candidate for a feedback session is a matter of making sure they understand the purpose of the session and what they will get from it. The feedback itself will be provided by the test user or other qualified person. Your job as test administrator is to ensure the necessary arrangements are in place, that appointments have been scheduled and that the feedback provider has been provided with all the information he or she needs in order to conduct the session. The candidate will want to know approximately how long the feedback will take, who will be providing it, what the purpose of it is (for example, is it just giving information or is it a developmental feedback process with action planning involved?), and in what form it will be given (for example, face to face or by telephone). You need to be aware that some candidates may be apprehensive about receiving feedback. As with the test session itself, you should try to put them at their ease.


Putting the candidate at ease

Being assessed is not an everyday occurrence. Candidates can react by feeling threatened or challenged by the situation. Feedback can be perceived as confirmation of one’s greatest hopes or worst fears. It is important to keep this uppermost in mind especially when test administration becomes a routine procedure for you. For candidates it is rarely a routine; it can be an extraordinary event in their life and its outcome can be seen as having very important consequences for them. It is important to treat it in this way.

Feedback also has a public relations dimension. The mere fact that feedback is being provided to candidates will be seen as a positive feature of your organization. The way in which you treat the candidates and the impression you create will also reflect upon the reputation of your organization. Remember that, in selection situations, it is not just organizations that choose applicants – applicants also make decisions which will be affected by all the things that happen to them during the selection process.

Helping in the preparation of reports

Whether oral feedback is provided or not, it is generally necessary to prepare one or more written reports on the results of a test session. As a test administrator you may be involved in reporting in two ways:

1. Generating computer-based reports on individual candidates for the test user.
2. Putting together summary reports for a group of candidates for the test user.

Generating computer-based reports

The test user will have identified which report or reports they require. Ensuring that these are correctly produced in time is important. Computer-generated reports may be PC-based, in which case you will have direct control over the report production, or they may be web-based, in which case the reports may be emailed to you as attachments or downloaded from a website. In some instances, such reports will be sent directly to the test user. For some systems you, as the system user, will have to confirm a number of pieces of information: which report or reports to generate, which norms to use, whom to send the reports to. All of these matters should have been clarified with you by the test user, as they are all the responsibility of the test user, not the administrator.

Creating summary reports

If a large number of people have been tested it is often useful to provide a single summary report for the test user of the people’s test scores. This can be done using a computer spreadsheet, such as Excel, on which you represent each person by a row and each scale by a column. When all the information is on this sheet, the test user can quickly inspect the overall pattern of results – including ordering the people on different attributes. They are also able to calculate other derived scores, if they want to, if the information is in this accessible sort of format. Remember, though, that personal data and test scores presented like this are just as confidential as the more detailed, descriptive individual reports. If you are asked to produce this sort of report, it is a good idea to include one column for comments, in which you can indicate anything about the test session that the test user should be aware of in reviewing the results.
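If you prefer to build the summary sheet with a short script rather than typing it into a spreadsheet by hand, the layout described above (one row per person, one column per scale, plus a comments column) could be sketched as follows. The function name, scale names, scores and comments are all hypothetical, and the output is a CSV file of the kind that spreadsheet programs such as Excel can open.

```python
import csv
import io
from typing import Dict, List, TextIO


def write_summary(candidates: List[Dict], scales: List[str], out: TextIO) -> None:
    """Write one row per candidate and one column per scale, plus comments."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Candidate"] + scales + ["Comments"])
    for person in candidates:
        # A missing score is left blank rather than guessed.
        scores = [person["scores"].get(scale, "") for scale in scales]
        writer.writerow([person["candidate"]] + scores + [person.get("comments", "")])


if __name__ == "__main__":
    # Hypothetical scale names, scores and comments, for illustration only.
    scales = ["Scale 1", "Scale 2"]
    candidates = [
        {"candidate": "P1", "scores": {"Scale 1": 30, "Scale 2": 22}, "comments": ""},
        {"candidate": "P2", "scores": {"Scale 1": 18, "Scale 2": 27},
         "comments": "Session interrupted by a sudden noise outside the room"},
    ]
    buffer = io.StringIO()
    write_summary(candidates, scales, buffer)
    print(buffer.getvalue())
```

To save the sheet to disk, pass an open file instead of the in-memory buffer, and store the resulting file as securely as any other test results.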

EXERCISE 2.4.2: Preparing information for the test user

CASE STUDY 2

A group of five candidates has completed the P5 Personality Inventory. Their raw scores are shown below. Candidate B left the test session early, feeling unwell, but had already completed all the items in the inventory. Candidate D seemed very anxious and seemed to spend a lot more time over each item than the other candidates. She was only just over half-way through the inventory by the time the others had finished. Candidate E mentioned that they had done this inventory before for another job application. Prepare a simple summary table (use a computer spreadsheet if you want) for giving to the test user, which gives the candidates’ raw scores and sten scores for each scale, and note any issues relating to behaviour in the test session.

Candidate   Extravert   Agreeable   Conscientious   Emotionally Stable   Open
A           35          33          35              21                   38
B           20          25          12              36                   13
C           27          28          27              21                   32
D           24          24          48              33                   20
E           44          45          41              18                   38


Endpiece

You have now completed all the work you should need to do for qualification as a test administrator. These training modules provide you with sample materials to practise with. With the knowledge and skills obtained by following these through – and by doing all the exercises – you should be ready to carry out test administrations for real. Initially, you should do this under the direct supervision of a qualified test user. They should be able to judge if you need any further training or development. You may want to have your competence formally recognized and apply for the British Psychological Society’s Certificate of Competence in Test Administration. If you do, remember to keep a copy of all your work as part of your ‘portfolio’ of evidence to support assessment for the Certificate.

Having obtained this Certificate you can, of course, go on to obtain further training and experience to qualify for test user certificates from the British Psychological Society. For more information, consult the website: www.psychtesting.org.uk.


Module 2: Answers to Exercises and Self-Assessment Questions

Answers to Exercises

EXERCISE 2.1.1: Planning the session

Test 1: Paper-and-pencil session

About 3 weeks before:

– Check the Test Manual to see what materials are required.
– Check to see if the following materials are in stock:
    12 test booklets
    12 answer sheets
    36 pencils (2 for each candidate and one spare each)
    12 erasers
    12 pencil sharpeners
    scrap paper
    1 stop watch (in working order)
    Scoring Keys
    Administration Instructions
– Prepare a checklist of materials required for the session.
– Check if help will be available. (Plan the session on the assumption that there will be assistance available.)
– Book the room with 12 tables and plan how you and the assistant will arrange it.
– Ask for the candidate list.
– Timetable the session for yourself (note that some of these times will be approximate):
    9.05    Arrange the room and materials
    10.15   Candidates arrive – have tea/coffee available
    10.30   Introductions
    10.40   Begin practice session
    10.55   Begin testing
            25 minutes later finish testing
            Collect and check materials
            Thank candidates and inform them about the next stage
    11.30   Candidates leave
– Write to the candidates.


About 3 days before:

– Check all materials against checklist and assemble them (keep secure).
– Check the room booking is confirmed, that 12 tables will be available and that the stopwatch works.
– Brief assistant – make sure he or she has read the standard instructions and understands what should be checked and what help can be given for the practice examples.

The day before:

– Prepare log sheet (see Test Pack).

On the day:

– Prepare the room, the tables and materials.
– Put ‘Test in Progress’ sign on the door.
– Unplug any telephones and divert calls.
– Welcome candidates and follow the timetable.

Test 2: Computer session

About 3 weeks before:

– Check test to see what materials are required.
– Check to see if the following materials are in stock:
    12 practice test booklets
    Test Manual
    Administration Instructions
    Pencils
    Scrap paper
    At least four test administrations available on each computer
– Book the room and the three computers: Allow time for four sessions.
– Plan the outline timetable.
– Ask for candidate list.
– Timetable the session for yourself:
    9.00    Final check on computers, additional materials and room layout
    9.30    First three candidates arrive – coffee for first testing session
    9.45    Introductions
    9.50    Practice session and computer familiarization
    10.00   Testing begins
    10.20   Candidates thanked and informed of next stage
    10.25   Candidates leave
    10.30   Second testing session . . .
    11.30   Third testing session . . .
    12.30   Fourth testing session . . .
    1.30    Finish
– Write to candidates.

About 3 days before:

– Check computers and room are booked.
– Check and assemble materials:
    Computers prepared
    Scrap paper and pencils available
    Test Manual and Administration Instructions



The day before:

– Prepare Test Log.

On the day:

– Check computers and software are prepared.
– Put ‘Test in Progress’ sign on the door.
– Divert calls and unplug telephones.
– Welcome candidates and follow the timetable.
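The arithmetic behind the computer-session timetable (three computers and four sessions, which suggests 12 candidates in line with the 12 practice booklets) can be checked with a short sketch like the one below. The function name and parameters are hypothetical, and the one-hour slot is simply the gap between session start times in the timetable above.

```python
import math
from datetime import datetime, timedelta
from typing import List, Tuple


def plan_sessions(n_candidates: int, n_computers: int,
                  first_start: str, slot_minutes: int) -> List[Tuple[str, int]]:
    """Return (start time, number of candidates) for each testing session."""
    if n_computers < 1:
        raise ValueError("At least one computer is needed")
    start = datetime.strptime(first_start, "%H:%M")
    n_sessions = math.ceil(n_candidates / n_computers)
    sessions = []
    remaining = n_candidates
    for index in range(n_sessions):
        seats = min(n_computers, remaining)  # last session may be part-filled
        slot_start = start + timedelta(minutes=index * slot_minutes)
        sessions.append((slot_start.strftime("%H:%M"), seats))
        remaining -= seats
    return sessions


if __name__ == "__main__":
    # 12 candidates on 3 computers, first arrivals at 9.30, one-hour slots.
    for start_time, seats in plan_sessions(12, 3, "09:30", 60):
        print(start_time, seats)
```

Changing the number of candidates or computers shows at once how many sessions to allow for when booking the room.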

EXERCISE 2.1.2: Inviting the candidate to the test session

Your letter needs to cover the following points:

– The time, date and venue of the test session.
– Instructions to guide the candidate to the venue, a map and details of public transport and car parking.
– A name and telephone number to contact if there is a problem with the arrangements.
– Information about the length of the session and the number of assessments that the candidate will be doing.
– Inclusion of the practice sheet and recommendation to complete it well in advance.
– Information that the test will be administered by computer and that familiarization with the keyboard will be given before the test begins.
– Why the test is being used, who will have access to the results, and how the results will be used.
– How and when feedback will be given to the candidate.

It is also a good idea to ask candidates to confirm that they will be attending – by sending a reply-paid card with your letter.

Specimen letter to a candidate

Dear [Name],

Thank you for applying for the post of ******, which we advertised recently. So that we can consider your application further, we are inviting you and several other candidates to attend an assessment session. In this session you will be given a test of your aptitude for the work you have applied for.

The test, which is taken on a computer, will take approximately 45 minutes. The test contains practice items so you will have the opportunity to familiarize yourself with the type of questions you will be asked and with the method of entering your answers. You do not need computer experience to take this test and you will be given the opportunity to familiarize yourself with the keys you will need to use.

There is also a separate practice test which gives you an idea of the kind of questions contained in the test. A copy of this test is enclosed. I suggest that you complete this test well in advance of the actual testing session. Taking the practice test should ensure that you are well prepared.

All the information gained from the test is confidential. The results will only be seen by the test administrator and the personnel officer who will be making the selection decision. The data will be retained so that we can monitor and evaluate the effectiveness of the test, but any details which could relate the data to you personally will be removed. It is our policy to give candidates feedback on their test results and we will give you further information on how this will be arranged when you attend the session.


The session will be held at the above address at 10.30 a.m. on Thursday, 17 October. Please arrive at reception for 10.15 a.m. on this day. The reception desk is located in the main entrance. I enclose a street map of the area with directions from the railway station and the motorway. The station is a five-minute walk from the premises. Please let us know if you would like us to reserve a parking space for you.

Should you wish to contact us about the session please ring Elaine Roberts on 278432. Please return the enclosed reply-paid card to confirm that you will be attending.

Yours sincerely

EXERCISE 2.4.2: Preparing information for the test user

CASE STUDY 2

The correct sten scores are shown below (raw score = sten).

Candidate   Extravert   Agreeable   Conscientious   Emotionally Stable   Open
A           35 = 6      33 = 8      35 = 7          21 = 2               38 = 7
B           20 = 2      25 = 5      12 = 1          36 = 6               13 = 1
C           27 = 4      28 = 6      27 = 5          21 = 2               32 = 6
D           24 = 3      24 = 5      48 = 10         33 = 5               20 = 2
E           44 = 9      45 = 10     41 = 9          18 = 1               38 = 7

NOTE: There are no model answers provided for Exercises 2.1.3, 2.1.4, 2.1.5, 2.1.6, 2.1.7, 2.3.1, or 2.4.1.

Answers to Self-Assessment Questions

SAQ 2.1.1

List the main advantages and disadvantages of each mode of administration for (a) low-stakes testing and (b) high-stakes testing.

• Open Mode. The main advantage of Open Mode is its accessibility. Because of this, it is useful for providing familiarization with tests and for low-stakes testing so long as the test materials do not need to be kept secure. In Open Mode people can access tests as and when they choose, and they do not have to reveal any confidential personal information. However, these benefits are also limitations when it comes to high-stakes testing. We cannot be sure who has taken a test or under what conditions they took it, or how often they have taken it if it is accessible in Open Mode.

• Controlled Mode. Controlled Mode provides a greater degree of control over conditions than does Open Mode. By restricting access to those with log-on rights, it is possible to know who was provided with access to the test and to control how many times they can access it. It is also possible to control the time of day or date they can access it and, if desired, to limit access to specific locations (by only permitting a log on from certain IP addresses). However, this mode is not directly supervised by a test administrator, and so the possibilities remain of cheating or colluding with others over the test, or of taking the test in suboptimal conditions. Where the stakes are high, assessments carried out in this mode should always be followed up and verified in some independent way.



• Supervised Mode. This is like the traditional paper-and-pencil administration mode, in that the test administrator has to be present to log on and start the test, and to ‘sign off’ at the end. The administrator can therefore carry out all the normal procedures required for ensuring testing is carried out properly.

• Managed Mode. In Managed Mode the test is delivered under very strictly controlled conditions. As well as being supervised there is control over the hardware, the staffing and the location. This is the best mode to use for very high-stakes testing or where the security of the test content is paramount. It is an expensive option, though, and is probably not necessary for most of the routine use of tests in the occupational assessment field.

Define what the role of a test administrator is for each of the four modes of administration.

• Open Mode. Providing test takers with information about the tests and how to access them. Providing support through email or by phone as needed. Normally, tests carried out in this mode will be for the interest or development of the test taker, and there will be no requirement for the test administrator to follow up on the results or process these in any way. Where this is required, the administrator may need to assist the test user in retrieving the results and generating any required reports. The administrator may also need to follow up test takers who have not completed tests.

• Controlled Mode. Carrying out all the normal preparation functions including providing candidates with log-on instructions and guidance on the conditions under which they should do the test. The administrator will need to monitor progress of the testing and deal with following up any candidates who were expected to take the test but have not done so. The administrator will also be responsible for assisting the test user in retrieving the results and generating any required reports.

• Supervised Mode. The test administrator is directly involved in all four stages of administration, though the detailed administration and scoring may be carried out by the computer.

• Managed Mode. As for Supervised Mode.

SAQ 2.3.1: Dealing with problems

(a) If conducting a paper-and-pencil test, continue the test but record the event in the log sheet and bring it to notice in the test interpretation. If administering a computer test, the test may have to be abandoned. However, some software will enable the test to be resumed from the point at which the break occurred.
(b) Abandon their test, destroy the answer sheet and, if feasible, allow them to take a different form of the test, or a different test, at a later date.
(c) Leave all test materials on their table and, if necessary, quietly remind them that the test time is still running and request they remain quiet and do not disturb the others.
(d) Repeat the information (usually stated in the Manual) that no further questions can be asked once the test begins.
(e) Say ‘No!’
(f) The information should be given in a form that the candidate can understand. This is not usually in raw scores, percentiles or standard scores, but a plain English interpretation which is norm-, content-, or criterion-referenced (see Section 2.2).
(g) Say ‘No!’ and explain that, as a registered test user, you are not allowed, ethically, to do so.


(h) In some cases, publishers will take back old materials when a new edition is brought out and may credit users for what has not been used. In all other cases, either destroy the old material by shredding or burning or store securely as archive material.
(i) Destroy the photocopies (by shredding or burning), and remind all staff about the ‘no photocopying’ rule. Order some answer sheets as soon as possible. Most test publishers will respond quickly when the situation is urgent.


Glossary

This glossary includes terms used in the test administration modules. It is intended to be used as a source of reference. Test administrators would not be expected to be familiar with all the terms in this glossary, but they may wish to refer to them as they come across them in their work.

A

ABILITY Ability describes the degree to which someone can carry out certain types of mental operations -- generally operations which involve ‘reasoning’ of some form.

ABILITY DOMAIN The ability domain is defined by the range of types of mental operation required to answer the items in a test. The three major domains are defined in terms of the ‘mental languages’ involved in responding to items: spatial, verbal and numerical.

ABILITY HIERARCHY Abilities can be considered to overlap in that there is a tendency for people who have high ability in one DOMAIN also to have high ability in others (and vice versa). This allows us to describe the structure of ability as a hierarchy, with the most general, broad domain at the top (‘g’) and narrow, task-specific abilities at the bottom.

ABILITY MEASURE see ABILITY TEST

ABILITY TEST Ability tests vary in the types of operation they involve and the types of material they contain. Typical operations include ‘analogies’ (A is to B as C is to ?) and ‘series completion’ (1, 2, 4, 8, ?). The content of ability test ITEMS tends to concern words and sentences, numbers or shapes. Ability tests are generally designed to assess what people are capable of rather than what they have learned or what they know.

ABSOLUTE SCORE A score which is independent of the score obtained by other people or on another scale. RAW SCORES obtained in non-IPSATIVE tests are examples of absolute scores.


ACCEPTABILITY Acceptability concerns those factors which make a test acceptable to the test taker and the test user. The notion of acceptability is important in the design and administration of assessment procedures and should be taken into account to ensure that the test taker co-operates in the procedure. Acceptability is affected by the FACE VALIDITY of the test and by the test user’s faith in it.

ACCURACY The precision with which some attribute is measured. Accuracy is a function of RELIABILITY or freedom from measurement error.

ACHIEVEMENT TEST see ATTAINMENT TEST

ADAPTIVE TEST A TEST where the items are selected from a large database or bank of items held on the computer. Each person who takes the test may be given a different selection of items, as the computer picks just those items that provide most information about that particular person’s level of ability.

ADMINISTRATION The second of four stages of test administration, following PREPARATION. This covers the actual presentation of the test to the test taker and their completion of the TEST SESSION.

ADMINISTRATOR The person who administers a psychological TEST.

ANALOGIES Test ITEMS having the format: A is to B as C is to ?

APPARATUS TEST A TEST that requires the manipulation of various items of specialized apparatus -- peg-boards, typing tests, etc.

APTITUDE see APTITUDE TEST

APTITUDE BATTERY A sequence of APTITUDE TESTS which can be used to provide both detailed and general overall measures.

APTITUDE MEASURE see APTITUDE TEST

APTITUDE TEST An aptitude is a potential to succeed at something in particular. ABILITY is assumed to underlie aptitude. Aptitude tests are those which have been designed to measure those mental operations (or abilities) which affect the likelihood of someone acquiring some particular skill (for example, computer programming or TV repairing).

ABILITY TESTS differ from aptitude tests in that the former are designed to assess general reasoning skills while the latter tend to contain ITEMS with content which is more specifically related to the aptitude concerned. However, the difference is largely one of function or use. In many cases, the same actual test may be used either as an ability test (to measure a person’s general intellectual functioning) or an aptitude test (to assess their potential for success in some occupation). In practice, measures of GENERAL ABILITY can usually be drawn from aptitude test scores (especially APTITUDE BATTERIES).


ARITHMETIC RULES A TEST item form which involves the application of the arithmetic rules of addition, subtraction, multiplication or division. ASSESSMENT The process of appraising or estimating some attribute or set of attributes of a person. ATTAINMENT The outcomes, for example, of education, training and work experience. ATTAINMENT TEST The focus of attainment or achievement TESTS is knowledge and proficiency: what has been learned rather than the ability to learn. These tests specifically assess what people have learned and the skills they have acquired, for example shorthand and typing tests. AUTHENTICATION One of the functions of test administration is to authenticate the identity of test takers, that is, to ensure that the person presenting themselves for testing is actually who they say they are. For high-stakes testing, the test administrator should ask for identification to confirm this. This should be some form of photo identification, not something that might have been given to an accomplice. AVERAGE see MEAN

B

BATTERY see APTITUDE BATTERY; TEST BATTERY BIAS Bias occurs in TESTS whenever people’s responses vary in some systematic way which is related to some characteristic which the test was not intended to measure. Factors which can produce bias in scores (at either the item level or overall test score level) include differences in sex, age, culture, educational background and literacy. BREADTH Assessment methods vary in both their breadth and their SPECIFICITY. A TEST of GENERAL ABILITY that samples several ABILITY DOMAINS may be regarded as ‘broad’ and ‘general’.

C

CANDIDATE A person who has taken part in the SELECTION process for a particular job or training course. CENTILE see PERCENTILE RANK CLASSICAL TEST THEORY A theoretical approach used in PSYCHOMETRICS which regards all observed scores as fallible and defines the relationship between observed scores and the TRUE SCORES which are assumed to underlie them. CLIENT The person or persons who request a service (in this case, testing) either for themselves or as a representative of an organization.


COGNITION The internal processes and operations involved in perception, memory, thinking, reasoning and problem-solving. COMPLETION The final stage of test administration. This involves assisting the test user with preparation of reports, generating reports from computer packages, and other activities. COMPOSITE TEST SCORE A score produced by adding together scores of two or more tests or SUBTESTS. In some cases, these may be differentially weighted before they are added. Composite test scores are frequently produced by simply summing the RAW SCORES for each part of a TEST. Composite scores derived from BATTERIES of tests are often produced to provide a general measure of suitability in a selection situation. COMPOSITES see COMPOSITE TEST SCORE CONFIDENTIALITY Any information obtained using a psychological test should be considered as belonging to the test candidate in the first instance. Whatever is done with that information should be done only with their permission. CONTROLLED MODE This is a mode of test administration in which control is exercised over who can access a test on the internet and how often they can access it. It may also include controls over the location they can access it from and the time or date it is available. CORRELATION The degree of relationship between two measures. If scores on one measure tend to increase as scores on the other increase, then the correlation is positive. If scores on one decrease when scores on the other increase, then the two are negatively correlated. Correlation is also the name of the statistical procedure used to calculate a CORRELATION COEFFICIENT. CORRELATION COEFFICIENT A means of quantifying the degree of a relationship or CORRELATION between two measures. CORRELATION COEFFICIENTS can vary from −1 through 0 to +1. Minus one indicates a perfect negative correlation, plus one a perfect positive correlation, and zero indicates no relationship. 
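As an illustration of the CORRELATION COEFFICIENT entry, the sketch below computes one common coefficient, the Pearson product-moment correlation. The glossary does not name a specific coefficient, so the choice of Pearson's r, the function name `pearson_r` and the score lists are assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length score lists.

    Assumes both lists show some variation; a constant list would make the
    denominator zero.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from each mean.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each measure.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical scores for three people on two measures.
print(pearson_r([1, 2, 3], [2, 4, 6]))  # scores rise together
print(pearson_r([1, 2, 3], [6, 4, 2]))  # one rises as the other falls
```

The first call describes a perfect positive relationship and the second a perfect negative one, matching the two ends of the range given in the entry.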
CRITERION REFERENCING In criterion referencing a person’s score on a TEST is used to predict or anticipate how they will perform on types of task not directly sampled by the test but which have been shown to be correlated with test performance. CRYSTALLIZED ABILITY Crystallized ability is that which is most dependent on direct experience and learning, such as is measured in ATTAINMENT testing and in ABILITY tests which depend on acquired skills (e.g. language).

D

DATA PROTECTION ACT 1998 This Act gives individuals rights concerning personal data that is stored on computer or in other filing systems, and it requires those who store and use data about individuals to register with the Information Commissioner.


DECILE A percentile-based scoring system where the RAW SCORES are divided into ten categories each containing 10% of the distribution. DEXTERITY TEST Dexterity tests are designed to assess various aspects of motor co-ordination, such as speed of movement, precision of fine motor control, etc. DIFFICULTY OF TESTS see LEVEL OF DIFFICULTY DIRECTIONS TEST items where the test taker is required to follow one or more directions, such as: ‘Write down the highest number in the sequence.’ DISPOSITION The temperament, personality, or characteristic mode of operating of a person. DISTRIBUTION In psychometric testing terms, the apportionment of scores obtained by people across all possible values of a variable. See also FREQUENCY DISTRIBUTION. DOCUMENTATION see TEST MANUAL DOMAIN A universe, sphere or province of objects which meet some criteria. In psychometrics people talk about ability TEST ITEMS being drawn, for example, from the domain of Verbal Reasoning, or from the domain of Spatial Ability. The term is also applied to areas of achievement. For example, mathematical attainment is a domain containing a range of skills involving arithmetic, algebra, and so on. DOMAIN-REFERENCED MEASURE This is where content-related VALIDITY data are used as the basis for interpreting a test score. The logic is to relate performance on a TEST to the level of performance required in a job by using a common quality standard. Judgements are then made about what level of performance on the test would be required for adequate performance of the job. The weakness of domain-referencing is that it relies on expert judgement. As such it really provides a way of generating hypotheses about how test performance should relate to job performance: it does not actually prove that the two are related. Where possible it needs to be backed up by CRITERION-related validity studies. 
DOMAIN REFERENCING The process of relating a person’s score on a TEST to levels of competence within some DOMAIN of knowledge or performance. DRIVE An attribute or need of a person which is considered to cause them to act in a certain way or to motivate them to action.

E

EQUATION METHOD A method used to turn raw scores into standard scores.


ETHICAL (ISSUES) Issues concerned with the rights, responsibilities and obligations of those involved in testing -- the test taker, the test user, and the test user’s client.

F

FACE VALIDITY What, to the test taker, the TEST appears to measure. The superficial appropriateness of a test (see ACCEPTABILITY). FAIR SELECTION The application of the principles of FAIRNESS to the selection process. FAIRNESS Fairness in testing is a relative term. A TEST is fair or unfair depending on how it is used and on whom it is used. Its use is fair if it is not BIASED with respect to the groups with which it is used and if it can be shown to be valid. Thus, using either an unbiased test of mechanical reasoning or a biased clerical aptitude test in a clerical selection situation would be unfair. In both cases, differences between people would not be relevant for the job and hence it would be unfair to select people on the basis of such differences. Where a test is known to be valid but shows between-group bias, and the degree and type of test bias is known with respect to some selection procedure, then it is possible to practise fair selection by using different cut-off scores for each group. FEATURES IN COMMON A form of TEST item where the test taker has to identify what feature two or more things have in common with each other and then use that feature to select another with that same feature from a set of alternatives. FEEDBACK see REPORTING BACK FIVE-POINT GRADING SCHEME A common percentile-based scoring system where the top 10% of scores are classed as grade A; the next 20% as grade B; the next 40% as grade C; the next 20% as grade D, and the lowest 10% as grade E. FREQUENCY DISTRIBUTION The number of people who obtained each of the various values which could be obtained on a particular VARIABLE. A frequency distribution shows how people’s scores are distributed across all possible values. Frequency distributions are often used to examine the number of people obtaining each of the possible RAW SCORES on a TEST. FREQUENCY POLYGON A frequency polygon is a graphical representation of a FREQUENCY DISTRIBUTION. 
It is constructed by joining together points representing the tops of the bars in a HISTOGRAM to form a continuous line. FUNCTIONS OF TEST ADMINISTRATION The test administrator has six main functions to perform: (1) AUTHENTICATION; (2) Establishing RAPPORT; (3) Ensuring standard conditions are followed; (4) VALIDATION OF RESULTS; (5) Dealing with problems that might arise; (6) Ensuring security of materials.
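The FIVE-POINT GRADING SCHEME entry above can be sketched as a simple lookup from a percentile score to a grade. The function name `five_point_grade` is hypothetical, and the treatment of scores falling exactly on a boundary (for example, the 90th percentile counted as grade A) is an assumption, since the entry does not specify it.

```python
def five_point_grade(percentile):
    """Map a percentile score (0-100) to a grade from A to E.

    Bands follow the glossary entry: top 10% A, next 20% B, next 40% C,
    next 20% D, lowest 10% E. Boundary handling is an assumption.
    """
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must lie between 0 and 100")
    if percentile >= 90:
        return "A"
    if percentile >= 70:
        return "B"
    if percentile >= 30:
        return "C"
    if percentile >= 10:
        return "D"
    return "E"


# Hypothetical percentile scores for five test takers.
for p in (95, 75, 50, 20, 5):
    print(p, five_point_grade(p))
```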

G

GENERAL ABILITY ABILITY TESTS vary from those designed to give an overall measure of general intellectual functioning (GENERAL ABILITY TESTS) through those designed to assess broad areas of ability (for example, Verbal, Numerical or Spatial) to those focusing on specific MENTAL OPERATIONS (for example, three-dimensional spatial rotations). The latter tend to be used for aptitude assessment. General ability tests, in order to properly cover the full range of mental operations, tend to include ITEMS or SUB-TESTS dealing with each of the main areas of ability. When general ability is tested using a battery of ability TESTS, SPECIFIC ABILITY scores as well as an overall general ability measure can be obtained. GENERAL ABILITY TEST A TEST designed to give an overall measure of general intellectual functioning. See also GENERAL ABILITY GENERAL INTELLIGENCE The ability to perform on TESTS and in tasks which involve the understanding of relationships. The capacity to meet new situations, or to learn to do so by new adaptive responses. General intelligence is one of the seven points that make up the SEVEN-POINT PLAN. GENERAL NORMS General norms are intended to be representative of a large and diverse POPULATION. For example, UK general population; UK adult males; UK 16- to 18-year-old school leavers. See also NORMS; SPECIFIC NORMS GENERAL POPULATION NORMS NORMS suitable for use with most people for converting RAW SCORES to either PERCENTILE- or STANDARD SCORE-based measures. Such norms are usually based on a large representative sample of people (in terms of age, sex and other VARIABLES). GENERATING REPORTS The process of creating a report on a person’s test results by using computer software. Computer-generated reports may be obtained either from stand-alone PC applications or from service providers on the internet. GRADES see FIVE-POINT GRADING SCHEME GUIDANCE The process of giving information to a person to help initiate, support and clarify their decisions.

H

HISTOGRAM A FREQUENCY DISTRIBUTION represented in a graphical form. Each score is represented by a bar, the height of which is equal to the number of people who obtained that score.

I

INSTRUMENT A psychological TEST or other procedure for measuring differences between people. INTEREST INVENTORY An interest inventory is designed to assess, in a systematic manner, people’s likes and dislikes for different types of work or leisure activity. INTERESTS Attitudes towards various types of activity, either in relation to work (vocational interests) or outside work.


IPSATIVE TEST An ipsative test constrains a person’s score on one scale with their score on another scale or scales. As a result, the scores on each scale are dependent on each other to some degree. Sometimes referred to as a self-referenced TEST. ITEM An item is the smallest element within a test to which a score is assigned. Generally a question within a TEST, or a statement in a personality questionnaire. ITEM RESPONSE THEORY A theory defining the relationship between test items and the likelihood of people giving correct responses to those items in terms of the level of the ‘latent trait’ (ability) they possess. The theory forms the basis for many ADAPTIVE TESTING systems. ITEM SCORE The numerical score given to a test taker’s answer to an individual TEST item. For example, a correct response to an item in a numerical reasoning test may be scored as 1 and an incorrect response may be scored as 0.

J

JOB-RELATED KNOWLEDGE ATTAINMENT TEST A TEST designed to measure what a person knows in relation to a particular job. JOB SIMULATION Job simulation exercises are often used in the procedures that come under the general heading of the Assessment Centre Method. Job simulations may take the form of in-tray exercises, group problem-solving exercises, and so on. They start from the assumption that the candidate does not yet possess the requisite knowledge or skill, but that the underlying ability will manifest itself when he or she works through an exercise that simulates the broad demands of the job in question. Job simulations are generally designed to assess APTITUDE rather than ATTAINMENT, though they may rely on the acquisition of some general job competences.

L

LEVEL OF DIFFICULTY The level of difficulty concerns the degree of ABILITY required to answer a test ITEM. TESTS may either be designed to contain items of a similar difficulty level, or the level of difficulty may be increased as the test taker progresses through the test. The idea behind this approach is that it should provide a wider range of discrimination between people, with the more able people getting further into the test. LOCAL NORM A particular type of specific NORM GROUP. This is a sample of people local to an organization. See also NORMS LOG SHEET A written record of a testing session which includes: the names of candidates; the TESTS they have taken; and a record of the timing and of any problems or unusual occurrences.

M

MANAGED MODE A mode of administration in which there is both direct supervision and control over the equipment being used, and other conditions. Typically managed mode administration refers to the use of dedicated testing centres.


MANUAL see TEST MANUAL MAXIMUM PERFORMANCE Measures of maximum performance measure how well people can do things, how much they know and how great their potential is. Measures of maximum performance include TESTS of ABILITY, APTITUDE and ATTAINMENT. These measures are usually distinguished from measures of TYPICAL PERFORMANCE which assess personality, vocational or occupational interests, needs, drives and levels of motivation. MEAN The mean is the arithmetic average of a set of values. The mean is obtained by adding all the values together and dividing the sum by the number of values. The mean tells us where the distribution of values is located along a measurement scale. The mean is also called the average. MEASUREMENT The procedure used to obtain a score from a person on some SCALE. MEASUREMENT ERROR Inaccuracies arising from the measurement process, the sources of which are due to extraneous factors which affect scores on a TEST. See also RANDOM ERROR; SYSTEMATIC BIAS MEDIAN The median is the middle value of a set of scores. For any set which has been rank-ordered (for example, the heights of a sample of people) the median is the point (or height in this case) which corresponds to the 50th PERCENTILE. Also known as the median rank. MENTAL OPERATIONS The internal processes and manipulations that have to be carried out to answer a TEST ITEM. MODE The most frequently occurring value in a distribution, for example the score on a test which is obtained by the largest number of people in a sample. MODES OF TEST ADMINISTRATION Classification by the International Test Commission of test administration modes into four types: OPEN, CONTROLLED, SUPERVISED and MANAGED. MOOD INVENTORY An instrument designed to assess a person’s mood. MOTIVATION Factors which affect a person’s likelihood of action and the choices they make between alternative courses of action. The reasons given as to why people perform certain acts. 
Motivation is often defined in terms of the goals which people seek to attain through their actions. MULTIPLE CHOICE (ITEM FORMAT) For multiple-choice items, test takers have to select one of a number of possible answers. See also OPEN-ENDED (ITEM FORMAT)
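The MEAN, MEDIAN and MODE entries in this section can be illustrated with Python's standard `statistics` module. The raw scores below are made up for the example and do not come from any real test.

```python
import statistics

# Hypothetical raw scores for six test takers.
raw_scores = [3, 5, 5, 6, 8, 9]

# MEAN: the sum of the values divided by the number of values.
mean_score = statistics.mean(raw_scores)

# MEDIAN: the middle value of the rank-ordered scores; with an even
# number of scores, the module averages the two middle values.
median_score = statistics.median(raw_scores)

# MODE: the most frequently occurring value.
mode_score = statistics.mode(raw_scores)

print(mean_score, median_score, mode_score)
```

With an even number of scores, as here, the median falls halfway between the two middle scores rather than on an obtained score.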

N

NEED A requirement which can act as the driving force or motivation for action. Needs include nurturance, affiliation, social approval, sex, self-actualization, etc.


NORM GROUP The sample of people from whom NORMS are derived. Also referred to as a reference group. NORM TABLES see NORMS NORMAL CURVE see NORMAL DISTRIBUTION NORMAL DISTRIBUTION A symmetrical bell-shaped distribution with certain specific properties: the MEAN, MODE and MEDIAN are all equal to each other; the proportion of the values falling within any interval along the scale is known from the mathematical properties of the distribution. For example, about 68% of the values always lie between −1 and +1 STANDARD DEVIATIONS of the MEAN. This form of distribution is found for a wide variety of both physical and psychological traits. Also called the ‘normal frequency distribution’ or the ‘normal curve’. NORMAL FREQUENCY DISTRIBUTION see NORMAL DISTRIBUTION NORMATIVE Normative information included in a TEST’s DOCUMENTATION enables the test user to see how a person’s performance on the test compares with that of others. NORMATIVE SCORE see NORM-REFERENCED MEASURE NORM-REFERENCED MEASURE A NORM-REFERENCED MEASURE defines where a person’s RAW SCORE lies in relation to the scores obtained by other people (that is, a NORM GROUP). NORM-REFERENCED SCORE The score obtained on a NORM-REFERENCED MEASURE. Such scores are expressed either as some form of PERCENTILE SCORE or STANDARD SCORE. NORM-REFERENCING see NORM-REFERENCED MEASURE NORMS Information, usually in the form of a table, which enables RAW SCORES to be converted into PERCENTILE SCORES or STANDARD SCORES (or both). Also see NORMATIVE, GENERAL NORMS and SPECIFIC NORMS.
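The figure of roughly 68% quoted in the NORMAL DISTRIBUTION entry can be checked from the mathematics of the curve. The sketch below uses the standard identity that the proportion of a normal distribution lying within k standard deviations of the mean equals erf(k/√2); the function name `proportion_within` is a hypothetical label for the example.

```python
import math


def proportion_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))


# Proportions within one, two and three standard deviations of the mean.
for k in (1, 2, 3):
    print(k, round(proportion_within(k), 4))
```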

O

OCCUPATIONAL PSYCHOLOGY An area of psychology which is concerned with the performance of people at work and in training, and with developing an understanding of how organizations function and how individuals and groups behave at work. The aim is to increase effectiveness, efficiency and satisfaction at work. OPEN-ENDED (ITEM FORMAT) For open-ended items, test takers have to write down their own responses to the items; alternatives are not given. See also MULTIPLE CHOICE (ITEM FORMAT) OPEN MODE Open Mode is where the test taker has direct access to the test materials, so there is no involvement of a test user or test administrator. Such tests include the books of tests you might buy in the local bookshop or the tests you can find on the internet that are directly accessible to everyone. Often the only requirement is that you pay some money before you can access the test. However, no qualifications are required from you either in terms of test use or test administration. ORAL FEEDBACK see REPORTING BACK

P

PERCENTILE see PERCENTILE RANK; PERCENTILE SCORE PERCENTILE METHOD A method used to convert RAW SCORES into normally distributed Z-SCORES or other STANDARD SCORES. PERCENTILE RANK The value on the RAW SCORE scale below which a given percentage of the sample’s scores lie. For example, if the 85th percentile rank is 16, then 85% of the sample will have scored less than 16. The PERCENTILE RANK is more commonly referred to as just the PERCENTILE or in some cases the centile. PERCENTILE SCORE A number between 0 and 100 expressing a test taker’s RAW SCORE in terms of the percentage of the norm group who scored less. See also PERCENTILE RANK PERSONALITY INVENTORY Psychological TESTS that assess DISPOSITION, that is, preferred or typical ways of acting or thinking. Personality inventories attempt to measure how much or how little a person possesses of a specified TRAIT or set of traits. PHYSICAL MAKE-UP For example, health, physique, appearance, bearing and speech. POPULATION A population contains all the people who conform to some specification. In PSYCHOMETRICS, normative reference groups are populations: for example, UK adult females; university arts graduates; general population. Psychometrics involves making inferences about people who come from some population on the basis of information known about the behaviour of a representative sample from that population. POTENTIAL A capacity or capability to perform or acquire the skills to perform some class of actions. POWER TEST The focus of a power test is on how many items a person is able to answer correctly. The time limit is designed to allow most people to complete all of the test ITEMS. If a person’s score is mainly affected by their ability to answer the questions correctly -- rather than their speed -- the test is a power test. 
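The PERCENTILE SCORE entry above defines a percentile score as the percentage of the norm group who scored less than the test taker. A minimal sketch of that definition follows. The function name `percentile_score` and the norm group scores are invented for illustration; in practice the conversion is read from the published NORMS, which may handle tied scores differently.

```python
def percentile_score(raw_score, norm_group_scores):
    """Percentage of the norm group scoring strictly less than raw_score."""
    if not norm_group_scores:
        raise ValueError("norm group must contain at least one score")
    below = sum(1 for score in norm_group_scores if score < raw_score)
    return 100.0 * below / len(norm_group_scores)


# Hypothetical norm group of ten raw scores.
norm_group = [8, 10, 11, 12, 12, 13, 14, 15, 16, 18]

# Six of the ten norm-group scores are below a raw score of 14.
print(percentile_score(14, norm_group))
```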
PRACTICALITY The notion of practicality in TEST use concerns issues of cost-efficiency, such as what the test costs; what training is required to use it; how long it takes to administer, score and interpret; what equipment is needed.


PREPARATION One of the four stages in the process of TEST ADMINISTRATION. PROBLEMS TEST items which involve the complex application of arithmetic rules, or logic or deductive reasoning of some form. PROFESSIONAL (ISSUES) Issues associated with professional practice and codes of professional conduct in relation to test use. See also ETHICAL (ISSUES) PSYCHOLOGICAL TEST see TEST PSYCHOLOGICAL TESTING The use of psychological TESTS in the process of assessment. PSYCHOMETRICS Literally, the measurement of mental processes. Psychometrics is the technology that underlies TESTS and their development.

Q

QUARTILE A percentile-based scoring system where the RAW SCORES are divided up into four categories each containing 25% of the distribution (referred to as the first, second, third and fourth quartiles).

R

RANDOM ERROR A type of MEASUREMENT ERROR that is unpredictable. The amount of random error in psychological measurement is represented by the STANDARD ERROR OF MEASUREMENT of a score. RAPPORT Establishing positive rapport with the test taker is one of the functions of test administration. It is important to set the right atmosphere for taking a test and to be clear about why the test is being carried out. The test administrator has an important role in setting the right ‘tone’ and in assisting the test taker(s) to give their optimal performance. RAW SCALE SCORE see SCALE SCORE RAW SCORE The raw score is the sum of the scores given to all the items or questions which a person obtains on completing a TEST. A raw score is the ABSOLUTE SCORE a person gets on a test. REFERENCE GROUP see NORM GROUP REFERENCING SCORES To compare a person’s RAW SCORES on a SCALE against some other measure. The comparison may be with other people’s scores on the same scale, the person’s own scores on other scales, known relationships with other performance measures, or expected levels of attainment in the domains from which the test items were drawn. See also CRITERION REFERENCING; DOMAIN REFERENCING; NORM-REFERENCED; SELF-REFERENCED RELATIONSHIPS TEST ITEMS where a test taker has to identify the relationship between two or more things and then use that relationship to select from a set of alternatives.


RELATIVE SCORE A SCALE SCORE which describes a person’s performance relative to that of other people or in terms of some other measure. RELEVANCE Another word used to describe the concept of VALIDITY. RELIABILITY The extent to which one can rely on the obtained TEST score being an accurate measure of a person’s TRUE SCORE, rather than a measure of incidental random factors. REPORTING BACK The process of feeding back the interpretation (see TEST INTERPRETATION) of TEST scores to a CLIENT or CANDIDATE. This can be in the form of oral feedback or a written report.

S

SAME/OPPOSITE TEST ITEMS that require the test taker to choose an alternative response that is either the same as or the opposite of one given. SAMPLE STANDARD DEVIATION The square root of the SAMPLE VARIANCE. SAMPLE VARIANCE A measure of the amount of variation between scores within a sample. SAMPLING The selection of a limited number of people (or other objects) from a defined POPULATION. SAMPLING ERROR Random sampling errors are related to the size of sample: the smaller the sample, the larger the sampling error. Systematic sampling error, or sample bias, arises when the way in which a sample is selected results in it being not truly representative of the POPULATION it is taken from. See also SAMPLING SCALE In testing it is common to talk of measuring characteristics along a scale. Ability, for example, is a scale which goes from low to high scores. Thus scores obtained on a TEST of some characteristic are generally referred to as SCALE SCORES. SCALE SCORE The numerical scores attributed to a test taker’s answers to individual TEST items are added up to provide a single measure called a RAW SCALE SCORE. Scores obtained on a test are generally referred to as scale scores. SCALING see SCALE SCOPE In relation to psychological TESTS, the range of attributes covered by a test (the test DOMAIN) and the range of people with whom the test can be used. SCORES see TEST SCORE


SCORING The process of marking the answers to a psychological TEST, including conversion of RAW SCORES into STANDARD SCORES. SECURITY (OF TEST MATERIALS) Guarding against the unfair and illegal use of TEST MATERIALS. SELECTION The process of choosing people with the best chances of succeeding in a job or on a training course. SELF-REFERENCED MEASURES Self-referenced measures involve comparing a person’s scores on one scale with their scores on other scales. See also IPSATIVE TEST SELF-REFERENCING see SELF-REFERENCED MEASURES SELF-REPORT Self-report measures are instruments that ask the respondent to answer a structured set of questions about themselves. Also called self-description instruments. Most personality and interest TESTS are self-report. SERIES TEST ITEMS where you have to find the next item in a sequence, such as: 1, 2, 3, ? SPECIAL APTITUDE Another way of describing a SPECIFIC APTITUDE, for example, mechanical, dexterity and so on. SPECIFIC ABILITY A particular ABILITY (for example, spatial ability). SPECIFIC APTITUDE A particular APTITUDE (for example, clerical accuracy). SPECIFIC NORMS NORMS which are based on some specific sample, for example public sector engineering workers; clerical staff from a number of different companies. See also GENERAL NORMS SPECIFICITY The degree to which a test assesses specific as opposed to general ability or aptitude. SPEED TEST A TEST which contains relatively easy ITEMS but which has a strict time limit. The measure of performance stresses the number of items attempted within the fixed time. See also POWER TEST SPEEDED It is common to see references to TESTS as more or less speeded. The more the standard time limit results in people failing to attempt some items, the more the test is said to be ‘speeded’. See also POWER TEST; SPEED TEST STAGES OF TEST ADMINISTRATION Test administration can be divided into four stages: PREPARATION, ADMINISTRATION, SCORING, COMPLETION.


STANDARD DEVIATION (SD) The standard deviation is the square root of the VARIANCE. The standard deviation indicates how far scores are spread out around the MEAN. The standard deviation and the mean are the two most important statistics used to describe a distribution of scores. STANDARD ERROR OF MEASUREMENT (SEm) The amount of error associated with making inferences about a person’s TRUE SCORE from their obtained score. STANDARD SCORE In PSYCHOMETRICS, the ‘standard’ scale developed for measuring psychological characteristics is called the Z-SCORE scale or sometimes simply the STANDARD SCORE SCALE. One unit on the z-score scale is equal to one STANDARD DEVIATION (SD) of the distribution. STANDARD SCORE SCALE The most commonly used standard score scales, with the size of one scale unit shown in brackets, are Z-SCORES (one SD), T-SCORES (one-tenth of an SD), STENS (half an SD) and STANINES (half an SD). See also STANDARD SCORE. STANDARDIZATION The procedure of establishing the initial set of NORMS for a TEST, defining the conditions under which it should be used, and of assessing its RELIABILITY and VALIDITY. STANDARDIZED In PSYCHOMETRICS, a standardized measure, TEST or testing procedure is one which has known characteristics. For example, a CORRELATION COEFFICIENT is a standardized score (like a Z-SCORE) which is known to come from a distribution of possible values which range between zero and plus or minus one. STANINES A stanine is a STANDARD SCORE scale with a MEAN of 5, a STANDARD DEVIATION of 2 and a range from 1 to 9. STANINES are used a lot in the USA for personality questionnaire scales. STANINE is an abbreviation of Standard Nine. STATES States are concerned with how a person is feeling or performing at a particular moment in time (for example, a current mood), rather than how they generally feel or typically perform. States are often distinguished from TRAITS, which are more stable and enduring psychological characteristics. 
STEN SCORE A STANDARD SCORE scale with a MEAN of 5.5, a STANDARD DEVIATION of 2 and a range from 1 to 10. Sten scores are the most widely used STANDARD SCORE SCALE for personality questionnaires. Sten is an abbreviation of ‘standard ten’. SUB-TEST A distinct part of a TEST. A set of similar ITEMS within a test, responses to which are added to produce a sub-test score. Such scores are generally only derived as part of the process of obtaining the overall TEST SCORE and are not interpreted on their own. SUPERVISED MODE This is the mode in which the test administrator has direct face-to-face involvement with the test taker. The test takers will come to a location where the test administrator is able to supervise them taking the test.
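The STANDARD SCORE, STANINES and STEN SCORE entries, together with the T-SCORE entry, describe scales that differ only in their mean and unit size. The sketch below converts a raw score using the means and standard deviations stated in those entries. It is a simplified linear conversion: the function names, the norm group mean and SD, and the rounding of stens and stanines to the nearest whole number are assumptions for the example, and it skips the normalization step that the T-SCORE entry says is usually applied.

```python
import math


def z_score(raw, norm_mean, norm_sd):
    """Distance of a raw score from the norm group mean, in SD units."""
    return (raw - norm_mean) / norm_sd


def t_score(z):
    """T-score scale: mean 50, SD 10."""
    return 50 + 10 * z


def sten(z):
    """Sten scale: mean 5.5, SD 2, whole numbers limited to 1-10."""
    value = math.floor(5.5 + 2 * z + 0.5)  # round half up
    return max(1, min(10, value))


def stanine(z):
    """Stanine scale: mean 5, SD 2, whole numbers limited to 1-9."""
    value = math.floor(5 + 2 * z + 0.5)  # round half up
    return max(1, min(9, value))


# Hypothetical norm group: mean raw score 24, SD 4; the test taker scored 30.
z = z_score(30, norm_mean=24, norm_sd=4)
print(z, t_score(z), sten(z), stanine(z))
```

`math.floor(x + 0.5)` is used instead of Python's built-in `round`, which rounds exact halves to the nearest even number and would therefore place some boundary scores in a different band.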


SYSTEMATIC BIAS A source of MEASUREMENT ERROR which is predictable and can lead to possible unfair BIAS in the use of TESTS. Systematic bias is potentially measurable.

T

TECHNICAL MANUAL The part of a TEST MANUAL that covers all the technical details of the test, such as design and development, RELIABILITY, VALIDITY, BIAS and STANDARDIZATION.

TEST An assessment procedure designed to provide objective measures of one or more psychological characteristics. These include ABILITIES, APTITUDES, ATTAINMENTS, INTERESTS, beliefs, personality and so on. The important feature of psychological tests is that they produce measures obtained under standardized assessment conditions which have known RELIABILITY and VALIDITY. They provide a way of comparing a person’s performance against that of others. A test is an instrument that has been developed using psychometric principles. The term test is used as shorthand for psychological test or psychometric test, which includes various inventories and questionnaires.

TEST ADMINISTRATION The process of administering a psychological TEST to one or more people.

TEST BATTERY A sequence of TESTS. See also APTITUDE BATTERY.

TEST DATA The information about candidates and their scores resulting from the taking of a psychological TEST.

TEST INTERPRETATION The process of attributing a meaning to a test taker’s score on a psychological TEST, by reference to information about the test’s VALIDITY, RELIABILITY and NORMS.

TEST ITEMS see ITEMS

TEST LENGTH The number of ITEMS a TEST contains.

TEST MANUAL The technical documentation accompanying a psychological TEST which tells the test user how to use and administer the test and what conclusions can be drawn from the results. Variously referred to as TECHNICAL MANUAL, user manual or test documentation.

TEST MATERIALS The materials needed to administer and interpret a psychological TEST. These generally include a TEST MANUAL, answer sheets, question booklets, profile sheets, administration instructions and so on.

TEST NORMS see NORMS

TEST RELIABILITY The RELIABILITY of a TEST.


TEST SCORE The score (RAW SCORE or STANDARD SCORE) obtained on a TEST.

TEST SESSION The period during which a psychological TEST is administered.

TEST SOPHISTICATION A level of awareness and knowledge of TESTS or testing without which a person’s scores may be negatively biased. Test sophistication may arise from prior exposure, from the process of testing, or through the use of practice tests and information describing testing procedures. This should be distinguished from active coaching in how to do a particular test which will unfairly inflate a person’s scores.

TRAINABILITY TEST A trainability test is designed to see whether a person is likely to be able to cope with the training required to do a job. Typically, it consists of a highly structured short training course with a test of performance at the end.

TRAIT Traits are those relatively stable and enduring characteristics of people that make them predictable. Traits are usually distinguished from STATES, which are a more changeable form of psychological characteristic, for example, current moods and feelings.

TRUE SCORE A person’s true score is the amount of a characteristic the person really has. As measurement involves some degree of added random error, the score a person obtains will usually be a bit larger or smaller than their true score. Over repeated measures, the average of the obtained scores will tend towards the true score.

T-SCORE A STANDARD SCORE scale usually based on normalized Z-SCORES. T-scores have a mean of 50 and a STANDARD DEVIATION of 10. Normally T-scores are only used in the range from 20 to 80.

TYPICAL PERFORMANCE Measures of typical performance are designed to assess disposition, such as personality, beliefs, values and interests, and to measure motivation or ‘drive’. Measures of typical performance are usually distinguished from measures of MAXIMUM PERFORMANCE which are designed to assess how well people can do things and measure ABILITY, APTITUDES or ATTAINMENT.

U

USER MANUAL see TEST MANUAL

UTILITY Utility concerns the benefits which accrue from the use of a psychological TEST. Utility is a function of the balance between issues of PRACTICALITY (the ‘costs’ associated with using a test) and RELEVANCE and FAIRNESS (the ‘benefits’ associated with the test). The benefits, in turn, are limited by the RELIABILITY of the test and its SCOPE.

V

VALIDATION The process of building up evidence about what can and cannot be inferred from TEST SCORES.


VALIDATION OF RESULTS In the context of test administration, this refers to the need to ensure that the results have been obtained under proper conditions and are not the outcome of cheating and collusion.

VALIDITY Information on the validity of a TEST tells the user what is being measured by a test and therefore what inferences can be drawn about the person who has produced the score on the test. RELIABILITY concerns how dependable a score is, that is how free from error it is, while validity concerns what the score is a measure of. Validity is demonstrated by showing how a score relates to the score people get on other similar or different types of tests; how it can be used to explain people’s behaviour; and how it can be used to make predictions about various real-world criteria (such as job success).

VARIABLE Variables are the characteristics which we attempt to measure with psychological TESTS. They are so named because they vary from person to person, or for the same person from time to time. The defining characteristic of a variable is that a given person can have only one value of it at any one time (for example, they cannot be 5ft. 5in. AND 5ft. 10in. at the same time). There are three main classes of variable: nominal, ordinal, and scalar.

VARIANCE A measure of variability. Variance indicates how much variation there is in a distribution of scores for one particular variable, for example, scores on a TEST.

VISUALIZATION A general ability to manipulate and reason with spatial relationships.

W

WORK SAMPLE see WORK SAMPLE TEST

WORK SAMPLE TEST A work sample test is one in which the task has been taken from a job. The task is done under STANDARDIZED assessment conditions. Work samples are essentially ATTAINMENT TESTS. They presuppose that the test taker has acquired some measure of a particular skill and set out to see how much.

Z

Z-SCORE A STANDARD SCORE scale with a MEAN of zero and a STANDARD DEVIATION of one.
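The standard score scales defined in this glossary (Z-SCORE, T-SCORE, STEN SCORE, STANINES) are linear rescalings of one another, so a z-score can be turned into any of them from the stated means and standard deviations alone. The Python sketch below illustrates this. The function names, the rounding of stens and stanines to whole numbers (halves rounded up) and the clipping to the ranges 1 to 10 and 1 to 9 are assumptions made for the example, not rules taken from the text.

```python
import math


def t_score(z: float) -> float:
    """T-score scale: mean 50, standard deviation 10."""
    return 50 + 10 * z


def _round_half_up(x: float) -> int:
    # Assumption: halves are rounded upwards; the glossary states no rule.
    return math.floor(x + 0.5)


def sten(z: float) -> int:
    """Sten scale: mean 5.5, standard deviation 2, limited to 1..10."""
    return min(10, max(1, _round_half_up(5.5 + 2 * z)))


def stanine(z: float) -> int:
    """Stanine scale: mean 5, standard deviation 2, limited to 1..9."""
    return min(9, max(1, _round_half_up(5 + 2 * z)))
```

Scores that fall exactly on a band boundary depend on the rounding rule chosen, so a published norm table should take precedence over a formula of this kind.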


PSYCHOLOGICAL TESTING: THE BPS OCCUPATIONAL TEST ADMINISTRATION OPEN LEARNING PROGRAMME

TEST PACK Section 1 General-Purpose Conversion Tables David Bartram Patricia A. Lindley

© 2006 by David Bartram and Patricia A. Lindley. All rights reserved. A BPS Blackwell book


CONTENTS
TABLE 1: Conversion tables for z-score and T-scores to percentiles
TABLE 2: Conversion tables for percentiles to z-score and T-scores
TABLE 3: Converting percentiles to deciles and grades
TABLE 4: Converting percentiles or T-scores to sten and stanine scores

The British Psychological Society


FIGURE: The normal curve. [Horizontal axis: z-scores from −2.5 to 2.5. A cut-off is marked at z = 1.175, the 88th percentile: 88% of the area under the normal curve lies to its left and 12% to its right.]

Tables 1 and 2 give the area under the normal curve which lies to the left of the z-score cut-off point. As the figure shows, the area to the left of a z-score of 1.175 is 88% of the total. Thus, z = 1.175 lies at the 88th percentile. The tables allow for conversion in either direction: from z-score to percentile or from percentile to z-score. In either case, equivalent T-scores are also given.
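The conversion between z-scores and percentiles can also be computed directly from the normal distribution. The following is a minimal Python sketch using the standard library's `statistics.NormalDist`; the function names are chosen for this illustration and are not part of the test pack.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
_STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def percentile_from_z(z: float) -> float:
    """Percentage of the area under the normal curve to the left of z."""
    return 100 * _STANDARD_NORMAL.cdf(z)


def z_from_percentile(percentile: float) -> float:
    """z-score below which the given percentage of the area lies."""
    if not 0 < percentile < 100:
        raise ValueError("percentile must lie strictly between 0 and 100")
    return _STANDARD_NORMAL.inv_cdf(percentile / 100)


if __name__ == "__main__":
    print(percentile_from_z(1.175))
    print(z_from_percentile(88))
```

For z = 1.175 this gives a value within rounding error of 88, in line with the figure. Computed values may differ from the printed tables in the last decimal place.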

TABLE 1: Conversion tables for z-score and T-scores to percentiles

z-score  T-score  Percentile      z-score  T-score  Percentile
−3.50    15        0.02           0.00     50       50.00
−3.40    16        0.03           0.10     51       53.98
−3.30    17        0.05           0.20     52       57.93
−3.20    18        0.07           0.30     53       61.79
−3.10    19        0.10           0.40     54       65.54
−3.00    20        0.13           0.50     55       69.15
−2.90    21        0.19           0.60     56       72.57
−2.80    22        0.26           0.70     57       75.80
−2.70    23        0.35           0.80     58       78.81
−2.60    24        0.47           0.90     59       81.59
−2.50    25        0.62           1.00     60       84.13
−2.40    26        0.82           1.10     61       86.43
−2.30    27        1.07           1.20     62       88.49
−2.20    28        1.39           1.30     63       90.32
−2.10    29        1.79           1.40     64       91.92
−2.00    30        2.28           1.50     65       93.32
−1.90    31        2.87           1.60     66       94.52
−1.80    32        3.59           1.70     67       95.54
−1.70    33        4.46           1.80     68       96.41
−1.60    34        5.48           1.90     69       97.13
−1.50    35        6.68           2.00     70       97.72
−1.40    36        8.08           2.10     71       98.21
−1.30    37        9.68           2.20     72       98.61
−1.20    38       11.51           2.30     73       98.93
−1.10    39       13.57           2.40     74       99.18
−1.00    40       15.87           2.50     75       99.38
−0.90    41       18.41           2.60     76       99.53
−0.80    42       21.19           2.70     77       99.65
−0.70    43       24.20           2.80     78       99.74
−0.60    44       27.43           2.90     79       99.81
−0.50    45       30.85           3.00     80       99.87
−0.40    46       34.46           3.10     81       99.90
−0.30    47       38.21           3.20     82       99.93
−0.20    48       42.07           3.30     83       99.95
−0.10    49       46.02           3.40     84       99.97
 0.00    50       50.00           3.50     85       99.98

TABLE 2: Conversion tables for percentiles to z-score and T-scores

Percentile  z-score  T-score      Percentile  z-score  T-score
 0.5        −2.570   24           50.5        0.012    50
 1.0        −2.324   27           51.0        0.025    50
 1.5        −2.170   28           51.5        0.038    50
 2.0        −2.054   29           52.0        0.050    50
 2.5        −1.958   30           52.5        0.063    51
 3.0        −1.880   31           53.0        0.075    51
 3.5        −1.812   32           53.5        0.088    51
 4.0        −1.750   33           54.0        0.100    51
 4.5        −1.695   33           54.5        0.113    51
 5.0        −1.644   34           55.0        0.126    51
 5.5        −1.598   34           55.5        0.138    51
 6.0        −1.555   34           56.0        0.151    52
 6.5        −1.514   35           56.5        0.164    52
 7.0        −1.476   35           57.0        0.176    52
 7.5        −1.439   36           57.5        0.189    52
 8.0        −1.405   36           58.0        0.202    52
 8.5        −1.372   36           58.5        0.214    52
 9.0        −1.341   37           59.0        0.227    52
 9.5        −1.311   37           59.5        0.241    52
10.0        −1.282   37           60.0        0.253    53
10.5        −1.254   37           60.5        0.266    53
11.0        −1.227   38           61.0        0.279    53
11.5        −1.201   38           61.5        0.292    53
12.0        −1.175   38           62.0        0.305    53
12.5        −1.150   38           62.5        0.319    53
13.0        −1.126   39           63.0        0.332    53
13.5        −1.103   39           63.5        0.345    53
14.0        −1.080   39           64.0        0.358    54
14.5        −1.058   39           64.5        0.372    54
15.0        −1.036   40           65.0        0.385    54
15.5        −1.015   40           65.5        0.399    54
16.0        −0.995   40           66.0        0.413    54
16.5        −0.974   40           66.5        0.426    54
17.0        −0.954   40           67.0        0.440    54
17.5        −0.935   41           67.5        0.454    55
18.0        −0.915   41           68.0        0.468    55
18.5        −0.896   41           68.5        0.482    55
19.0        −0.878   41           69.0        0.496    55
19.5        −0.860   41           69.5        0.510    55
20.0        −0.842   42           70.0        0.525    55
20.5        −0.824   42           70.5        0.539    55
21.0        −0.807   42           71.0        0.553    56
21.5        −0.789   42           71.5        0.568    56
22.0        −0.772   42           72.0        0.583    56
22.5        −0.755   42           72.5        0.598    56
23.0        −0.739   43           73.0        0.613    56
23.5        −0.722   43           73.5        0.628    56
24.0        −0.706   43           74.0        0.643    56
24.5        −0.690   43           74.5        0.659    57
25.0        −0.674   43           75.0        0.674    57
25.5        −0.659   43           75.5        0.690    57
26.0        −0.643   44           76.0        0.706    57
26.5        −0.628   44           76.5        0.722    57
27.0        −0.613   44           77.0        0.739    57
27.5        −0.598   44           77.5        0.755    58
28.0        −0.583   44           78.0        0.772    58
28.5        −0.568   44           78.5        0.789    58
29.0        −0.553   44           79.0        0.807    58
29.5        −0.539   45           79.5        0.824    58
30.0        −0.525   45           80.0        0.842    58
30.5        −0.510   45           80.5        0.860    59
31.0        −0.496   45           81.0        0.878    59
31.5        −0.482   45           81.5        0.896    59
32.0        −0.468   45           82.0        0.915    59
32.5        −0.454   45           82.5        0.935    59
33.0        −0.440   46           83.0        0.954    60
33.5        −0.426   46           83.5        0.974    60
34.0        −0.413   46           84.0        0.995    60
34.5        −0.399   46           84.5        1.015    60
35.0        −0.385   46           85.0        1.036    60
35.5        −0.372   46           85.5        1.058    61
36.0        −0.358   46           86.0        1.080    61
36.5        −0.345   47           86.5        1.103    61
37.0        −0.332   47           87.0        1.126    61
37.5        −0.319   47           87.5        1.150    62
38.0        −0.305   47           88.0        1.175    62
38.5        −0.292   47           88.5        1.201    62
39.0        −0.279   47           89.0        1.227    62
39.5        −0.266   47           89.5        1.254    63
40.0        −0.253   47           90.0        1.282    63
40.5        −0.241   48           90.5        1.311    63
41.0        −0.227   48           91.0        1.341    63
41.5        −0.214   48           91.5        1.372    64
42.0        −0.202   48           92.0        1.405    64
42.5        −0.189   48           92.5        1.439    64
43.0        −0.176   48           93.0        1.476    65
43.5        −0.164   48           93.5        1.514    65
44.0        −0.151   48           94.0        1.555    66
44.5        −0.138   49           94.5        1.598    66
45.0        −0.126   49           95.0        1.644    66
45.5        −0.113   49           95.5        1.695    67
46.0        −0.100   49           96.0        1.750    68
46.5        −0.088   49           96.5        1.812    68
47.0        −0.075   49           97.0        1.880    69
47.5        −0.063   49           97.5        1.958    70
48.0        −0.050   50           98.0        2.054    71
48.5        −0.038   50           98.5        2.170    72
49.0        −0.025   50           99.0        2.324    73
49.5        −0.012   50           99.5        2.570    76
50.0         0.000   50



TABLE 3: Converting percentiles to deciles and grades

Percentiles
Range      Midpoint   Deciles
0–10       5          1
11–20      15         2
21–30      25         3
31–40      35         4
41–50      45         5
51–60      55         6
61–70      65         7
71–80      75         8
81–90      85         9
91–100     95         10

Grades   Description
E        Well below average
D        Below average
C        Average
B        Above average
A        Well above average
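The decile bands in Table 3 follow a simple rule that can be written as a short calculation. The Python sketch below is one way of expressing it for whole-number percentiles; the function names are hypothetical, and treating a percentile of 0 as decile 1 follows the first row of the table.

```python
import math


def decile_from_percentile(percentile: float) -> int:
    """Decile band (1..10) for a percentile, following the ranges in Table 3."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must lie between 0 and 100")
    # 0-10 -> 1, 11-20 -> 2, ..., 91-100 -> 10
    return max(1, math.ceil(percentile / 10))


def decile_midpoint(decile: int) -> int:
    """Midpoint percentile of a decile band: 5, 15, ..., 95."""
    if not 1 <= decile <= 10:
        raise ValueError("decile must lie between 1 and 10")
    return 10 * decile - 5
```

The grades are not computed here, because the sketch covers only the percentile ranges, midpoints and deciles.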

TABLE 4: Converting percentiles or T-scores to sten and stanine scores

Percentiles  Sten  T-scores      Percentiles  Stanine  T-scores
0–2          1     20–30         0–4          1        20–33
3–6          2     31–35         5–10         2        34–37
7–15         3     36–40         11–22        3        38–42
16–30        4     41–45         23–40        4        43–47
31–50        5     46–50         41–60        5        48–52
51–69        6     51–55         61–77        6        53–57
70–84        7     56–60         78–89        7        58–63
85–93        8     61–65         90–96        8        64–67
94–97        9     66–70         97–100       9        68–80
98–100       10    71–80
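A table lookup of this kind can be sketched in Python by storing the upper limit of each percentile band from Table 4 and searching for the first band that contains the score. The names below are hypothetical, and the sketch assumes whole-number percentiles, as printed in the table.

```python
from bisect import bisect_left

# Upper percentile limit of each band, copied from Table 4.
STEN_UPPER_LIMITS = (2, 6, 15, 30, 50, 69, 84, 93, 97, 100)
STANINE_UPPER_LIMITS = (4, 10, 22, 40, 60, 77, 89, 96, 100)


def _band(percentile: int, upper_limits: tuple) -> int:
    """1-based index of the first band whose upper limit is >= percentile."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must lie between 0 and 100")
    return bisect_left(upper_limits, percentile) + 1


def sten_from_percentile(percentile: int) -> int:
    return _band(percentile, STEN_UPPER_LIMITS)


def stanine_from_percentile(percentile: int) -> int:
    return _band(percentile, STANINE_UPPER_LIMITS)
```

For example, the 88th percentile falls in the 85–93 band for stens and the 78–89 band for stanines, so the functions return 8 and 7 respectively.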



TEST PACK Section 2 TEST A Test Booklet David Bartram Patricia A. Lindley




This test is for training use only


TEST A: TEST BOOKLET

This test is designed to be used as part of a training course in the use of psychological tests. The results obtained from this test should not be treated as reliable indicators of ability and should not be used for any purpose other than training.

Candidate’s Name:

Other information
Age (in years):
Gender: Male/Female

Results:   Correct   Incorrect   Omitted   TOTAL

RAW SCORE:

Standard scores
T-score (General population norms):
T-score (University student norms):

TEST A: INSTRUCTIONS

In the test you are about to do there are two types of item. Examples are given of both so that you can be quite clear what you will be expected to do when you begin the test. Read through this page, completing the examples as you go.

The first type of item asks you to carry out some arithmetical operations and check your answer against the five possible answers. When you have decided which is the correct answer, put a circle around it. Example E1 is done for you.

Examples:

Circle the correct answer
E1   6 + 7 − 5 = ?        (8)   9   10   11   12

Circle the correct answer
E2   (8/2) × 4 = ?        12   16   20   24   28

The second type of item consists of numbers which go together in some way to form a series. Decide how they go together and then work out the next number in the series. Check your answer against the four possible answers. When you have decided which is the correct answer, put a circle around it. Example E3 is done for you.

Circle the next number in the series
E3   2, 4, 6, 8, ?        8   (10)   12   16

Each number in the series increases by two. So, 8 + 2 = (10)

Circle the next number in the series
E4   9, 27, 81, 243, ?    886   486   729   244

If you are uncertain about the types of items you will be asked to do, read through the examples again, checking the answers and the reasoning behind them, before you begin the timed test. Put your pencil down when you have completed and understood the items and the answers. The test administrator will read the instructions for the timed test when you are ready to go on.

TEST A: INSTRUCTIONS

This test has 25 items. Each of the items will be one of the two types you have completed in the examples and you will be asked to answer them in the same way. The test is timed. You will have exactly 5 minutes to complete the test. Answer each item carefully before you move on to the next. Do not miss out any items unless you are really stuck. If you wish to change any answers please ensure that the incorrect answer is erased completely. There is only one correct answer to each item. The correct answer always appears among the possible answers. Please do not talk once the test has begun. When you are told to begin, please turn over the page and work quickly and carefully through the items. Continue to work until you are told to stop. If you are not clear about what you have to do, please ask now.

DO NOT TURN THE PAGE UNTIL YOU ARE TOLD TO DO SO.


*

Circle the correct answer
1.   9 × 2 + 1 =             13    15    17    19    21
2.   (7 − 2) × 7 =           33    35    39    41    45
3.   17 × 2 − 16 =           14    18    24    32    34
4.   26/2 + 17 =             20    26    30    36    42
5.   (17 − 3) × 4 =          21    25    51    56    72

Circle the next number in the series
6.   6, 10, 16, 24, ?        26    30    34    38    42
7.   3, 6, 12, 24, ?         28    32    36    42    48
8.   7, 14, 21, 28, ?        35    37    42    49    54
9.   5, 9, 17, 33, ?         49    65    87    99    129
10.  0.5, 1, 2, 4, ?         5     6     8     10    12

Circle the correct answer
11.  (28 − 14) × 3 =         21    25    42    48    56
12.  (16 + 6)/11 =           1     2     3     4     5
13.  13 × 3 + 6 =            33    39    43    45    49
14.  19 − 7 − 5 =            5     6     7     9     11
15.  (57 − 9)/3 =            16    18    25    29    48

Circle the next number in the series
16.  12, 25, 51, 103, ?      125   154   206   207   209
17.  10, 10, 20, 30, ?       30    40    50    60    70
18.  11, 34, 103, 310, ?     413   630   631   931   933
19.  3, 3, 6, 9, ?           12    13    15    18    27
20.  8, 18, 40, 82, ?        100   104   144   164   168

Circle the correct answer
21.  22 × 4 + 88 =           156   166   176   186   196
22.  17 × 3 + 97 =           138   146   148   156   158

Circle the missing number in the series
23.  89, 78, ?, 56           70    68    67    66    64
24.  104, 52, ?, 13          22    24    25    26    28
25.  475, 237, 118, ?        57    57.5  58    58.5  59

*

END OF TEST.



TEST PACK Section 2 P5 Personality Inventory Test Booklet David Bartram Patricia A. Lindley





P5 PERSONALITY INVENTORY BOOKLET

This inventory is designed to be used as part of a training course in the use of psychological tests. The results obtained from this test should not be treated as reliable indicators of personality and should not be used for any purpose other than training.

Candidate’s Name:

Other information
Age (in years):
Gender: Male/Female

P5 PERSONALITY INVENTORY¹: ADMINISTRATION INSTRUCTIONS

On the following pages, there are statements describing people’s behaviours. Please use the rating scale below to describe how accurately each statement describes you. Describe yourself as you generally are now, not as you wish to be in the future. Describe yourself as you honestly see yourself, in relation to other people you know of the same sex as you are, and roughly the same age. So that you can describe yourself in an honest manner, your responses will be kept in absolute confidence. Please read each statement carefully, and then place a tick in the column that corresponds to the number on the scale, as shown in the example below. Keep your ticks small so that they do not cross into other columns or rows.

Response options
1. Very Inaccurate
2. Moderately Inaccurate
3. Neither Inaccurate nor Accurate
4. Moderately Accurate
5. Very Accurate

Example (one tick per statement, placed in the column for the chosen response, 1 to 5)
Statement
A   I make friends easily.   ✓
B   I am hard to get to know.   ✓
C   I often forget to put things back in their proper place.   ✓

Try to avoid using the ‘in between’ response (3) too often. There is no time limit, but do not take too long thinking about each statement. It is generally best to respond in terms of your first reaction. There are 50 statements in all. If you make a mistake, carefully erase the incorrect tick and then put a tick in the intended location.

¹ The items contained in this inventory are from the International Personality Item Pool (2001). [A Scientific Collaboratory for the Development of Advanced Measures of Personality Traits and Other Individual Differences (http://ipip.ori.org/). Internet Web Site.] These items are public domain materials and are available free for use.


Statements 1 to 25

Response Options
1. Very Inaccurate
2. Moderately Inaccurate
3. Neither Inaccurate nor Accurate
4. Moderately Accurate
5. Very Accurate

No.  Statement                                          1   2   3   4   5
1    I am the life of the party.
2    I feel little concern for others.
3    I am always prepared.
4    I get stressed out easily.
5    I have a rich vocabulary.
6    I don’t talk a lot.
7    I am interested in people.
8    I leave my belongings lying around.
9    I am relaxed most of the time.
10   I have difficulty understanding abstract ideas.
11   I feel comfortable around people.
12   I insult people.
13   I pay attention to details.
14   I worry about things.
15   I have a vivid imagination.
16   I keep in the background.
17   I sympathize with others’ feelings.
18   I make a mess of things.
19   I seldom feel blue.
20   I am not interested in abstract ideas.
21   I start conversations.
22   I am not interested in other people’s problems.
23   I get chores done right away.
24   I am easily disturbed.
25   I have excellent ideas.

Page 1 of 2


Statements 26 to 50

Response Options
1. Very Inaccurate
2. Moderately Inaccurate
3. Neither Inaccurate nor Accurate
4. Moderately Accurate
5. Very Accurate

No.  Statement                                          1   2   3   4   5
26   I have little to say.
27   I have a soft heart.
28
29   I get upset easily.
30   I do not have a good imagination.
31   I talk to a lot of different people at parties.
32   I am not really interested in others.
33   I like order.
34   I change my mood a lot.
35   I am quick to understand things.
36   I don’t like to draw attention to myself.
37   I take time out for others.
38   I shirk my duties.
39   I have frequent mood swings.
40   I use difficult words.
41   I don’t mind being the centre of attention.
42   I feel others’ emotions.
43   I follow a schedule.
44   I get irritated easily.
45   I spend time reflecting on things.
46   I am quiet around strangers.
47   I make people feel at ease.
48   I am exacting in my work.
49   I often feel blue.
50   I am full of ideas.

Page 2 of 2



TEST PACK Section 3 TEST A and P5 Administrator’s Pack David Bartram Patricia A. Lindley



CONTENTS
Checklist of actions for the four stages of test administration
Administration instructions for Test A and P5 Personality Inventory
General checklist of materials
Scoring instructions for Test A and P5 Personality Inventory
Test Session Log
Candidate evaluation questionnaire
Scoring key for Test A
Scoring keys for the P5 Personality Inventory



CHECKLIST OF ACTIONS FOR THE FOUR STAGES OF TEST ADMINISTRATION
© No photocopying allowed

Some items apply only to Supervised or Managed Modes of administration, others apply to all modes. Some or all of the items marked with an asterisk may not be necessary for computer-based assessment systems.

Stage 1: Preparation

Supervised and Managed Modes

1. Plan test sessions with due regard to the maximum number of candidates who can be assessed in one session and the maximum duration of each session.

2. Ensure that any items of equipment (for example, computers) are operating correctly and that sufficient test materials are available for use by the candidates.

3. Ensure, where reusable materials are being used, that they are carefully checked for marks or notes which may have been made by previous candidates.

4. Arrange a suitable, quiet location for carrying out the testing and arrange the seating and desk space to maximize comfort and minimize the possibilities of cheating. Make sure that lighting conditions are controlled, especially where computer screens will have to be used.

5. Inform the candidates of the time and place well in advance.

All Modes

6. Ensure adequate advance briefing.

7. Ensure that potential test candidates are not provided with prior access to test materials other than those specifically designed to help them prepare for their assessment.

Stage 2: Administration

All Modes

8. For Supervised or Managed Modes, brief candidates on the purpose of the test session and put them at their ease while maintaining an appropriately businesslike atmosphere. Give clear descriptions to the candidate(s) prior to their assessment concerning:
   – how their results are to be used
   – who will be given access to them
   – how long they will be retained for
   For Controlled Mode, ensure that the computer-based administration includes a briefing that covers the above issues.

Supervised and Managed Modes

9. Check the identity of the candidates and enter their personal details in the test session log, together with relevant details of what assessment instruments are being used.

10. Check that all candidates have the necessary materials.

*11. Use standard test instructions and present them clearly and intelligibly to the candidates.

*12. Provide the candidates with sufficient time to work through example test items.

*13. Make careful checks to ensure proper use of the answer sheet and response procedures.

14. Deal appropriately with any questions which arise without compromising the purpose of the test.

*15. Explain any time limits and the need to maintain silence during the test and make clear that once the test has begun no further questions can be answered.

*16. Adhere strictly to test-specific instructions concerning pacing and timing.

17. Collect in all materials when testing has been completed and carry out a careful inventory of materials.

18. Thank the candidates for their participation when the final test has been completed, and explain the next stage (if any) in their assessment.

19. Make final entries in the test session log, including notes on any particular problems which arose during the session.

Stage 3: Scoring

Supervised or Managed Modes

*20. Visually check answer sheets for ambiguous markings which could be obscured by scoring keys or cause problems with machine-scoring systems.

*21. Make accurate use of the relevant scoring key.

*22. Accurately transfer raw score marks to candidates’ records.

All Modes

*23. Use norm tables to find relevant percentile and/or standard scores and complete candidates’ records.

Stage 4: Completing

All Modes

24. Keep all test materials and test data in a secure place and ensure that access is not given to unauthorized personnel.

25. Ensure that all mandatory requirements relating to candidates’ and clients’ rights and obligations under the Data Protection Act have been clearly explained to all parties (i.e. clients and candidates).

26. Ensure that data are stored according to the requirements of the Data Protection Act.

TEST A: ADMINISTRATION INSTRUCTIONS
© No photocopying allowed

For this test, each candidate will need a test booklet, two pencils and an eraser (see the General Checklist of Materials for full details). The following instructions are the same as those attached to the front of the test booklet. Read these aloud to the candidate.

Before you open your test booklet, please write your name [or other identifier] in the space on the front cover.

Ask candidates to complete any other information on the front cover as necessary. When they have done this, ask them to open the booklet at the first page and to follow through the instructions as you read them aloud.

In the test you are about to do there are two types of item. Examples are given of both so that you can be quite clear what you will be expected to do when you begin the test. Read through this page, completing the examples as you go. The first type of item asks you to carry out some arithmetical operations and check your answer against the five possible answers. When you have decided which is the correct answer, put a circle around it. Example E1 has been done for you. The answer is 8 because 6 plus 7 equals 13 and 13 minus 5 equals 8. Now do example E2.

Pause and allow time for the candidates to complete the example.

Examples:

Circle the correct answer
E1   6 + 7 − 5 = ?        (8)   9   10   11   12

Circle the correct answer
E2   (8/2) × 4 = ?        12   16   20   24   28

When the candidates have finished, say:

The correct answer to question E2 is 16, because 8 divided by 2 equals 4, and 4 times 4 equals 16. The second type of item consists of numbers which go together in some way to form a series. Decide how they go together and then work out the next number in the series. Check your answer against the four possible answers. When you have decided which is the correct answer, put a circle around it. Example E3 has been done for you. The answer is 10 as each number in the series increases by two. So, 8 plus 2 equals 10. Now do example E4.

Circle the next number in the series
E3   2, 4, 6, 8, ?        8   (10)   12   16

Circle the next number in the series
E4   9, 27, 81, 243, ?    886   486   729   244

Pause and allow time for the candidates to complete the example. When the candidates have finished, say:

The correct answer to question E4 is 729 because each number has been multiplied by 3 to give the next number in the series: 243 times 3 equals 729. If you are uncertain about the types of items you will be asked to do, read through the examples again, checking the answers and the reasoning behind them, before you begin the timed test. Put your pencil down when you have completed and understood the items and the answers. The test administrator will read the instructions for the timed test when you are ready to go on.

Wait until candidates have put down their pencils and are ready to begin. Then read out the following final instructions, which are on the second page of the candidate’s test booklet.

This test has 25 items. Each of the items will be one of the two types you have completed in the examples and you will be asked to answer them in the same way. The test is timed. You will have exactly 5 minutes to complete the test. Answer each item carefully before you move on to the next. Do not miss out any items unless you are really stuck. If you wish to change any answers please ensure that the incorrect answer is erased completely. There is only one correct answer to each item. The correct answer always appears among the possible answers. Please do not talk once the test has begun. When you are told to begin, please turn over the page and work quickly and carefully through the items. Continue to work until you are told to stop. If you are not clear about what you have to do, please ask now.

When you have dealt with any final questions, say:

Turn to the first page and begin.

Start your stopwatch.

...........................................................................................................................................................

After exactly 5 minutes, say:

Please stop working, put down your pencils and close your test booklets. Please remain seated while I collect the test booklets and other materials.

As you collect each one check that the candidate’s name (or other identifier) is on the front. Finally, thank the candidates for their participation.

P5 PERSONALITY INVENTORY: ADMINISTRATION INSTRUCTIONS © No photocopying allowed

For this test, each candidate will need a test booklet, 2 pencils and an eraser (see the General Checklist of Materials for full details). Ask candidates to complete the information on the front cover as necessary. When they have done this, ask them to open the booklet at the first page and to follow through the instructions as you read them aloud.

The following instructions are reproduced on the candidate’s test booklet. You should read through these with the candidates. At the end of the instructions, check that everyone is clear about what they have to do and deal with any questions.

As this is an untimed test, you will need to monitor progress but not expect everyone to finish at the same time. If you are administering to a group of people, ask people to sit quietly when they have finished so as not to disturb others. Most people will take no more than four or five minutes to complete the inventory. If after three minutes people are still working through the first page of 25 questions, you should remind everyone:

Remember. Do not take too long over each statement. Your initial reaction is generally the right one.

...........................................................................................................................................................

On the following pages, there are statements describing people’s behaviours. Please use the rating scale below to describe how accurately each statement describes you. Describe yourself as you generally are now, not as you wish to be in the future. Describe yourself as you honestly see yourself, in relation to other people you know of the same sex as you are, and roughly the same age. So that you can describe yourself in an honest manner, your responses will be kept in absolute confidence.

Please read each statement carefully, and then place a tick in the column that corresponds to the number on the scale, as shown in the example below. Keep your ticks small so that they do not cross into other columns or rows.



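| | Statement | Very Inaccurate (1) | Moderately Inaccurate (2) | Neither Inaccurate nor Accurate (3) | Moderately Accurate (4) | Very Accurate (5) |
|---|---|---|---|---|---|---|
| A | I make friends easily. | | | | | |
| B | I am hard to get to know. | | | | | |
| C | | | | | | |

Example ticks: ✓ ✓ ✓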

Try to avoid using the ‘in between’ response (3) too often. There is no time limit, but do not take too long thinking about each statement. It is generally best to respond in terms of your first reaction. There are 50 statements in all. If you make a mistake, carefully erase the incorrect tick and then put a tick in the intended location.


GENERAL CHECKLIST OF MATERIALS © No photocopying allowed

Paper-and-pencil tests

For every candidate in every session you may need the following materials. Check with the Test Manual to see which are required; add any additional materials to this list:

• 2 pencils
• 1 ballpoint pen
• eraser
• pencil sharpener
• scrap paper
• test booklet
• answer sheet

For the administrator:

• stopwatch
• Test Manual
• Test Session Log
• list of candidates
• ‘Test in Progress’ signs
• administration instructions
• spare pencils, pens, erasers, pencil sharpener

Computer tests

For every candidate in every session you may need the following materials. Check with the Test Manual to see which are required; add any additional materials to this list:

• computer (or terminal)
• a sufficient number of test administrations on the computer
• pencil
• scrap paper
• test booklet

For the administrator:

• stopwatch
• Test Manual
• Test Session Log
• list of candidates
• ‘Test in Progress’ signs
• administration instructions
• spare pens, pencils


SCORING INSTRUCTIONS FOR TEST A

The correct answers for each question are listed below.

1. Place the acetate Scoring Key on the page of answers.
2. Ensure that the circles on the key overlay the asterisks printed on the page.
3. Give one point for each question where the circled answer is in the square brackets marked on the key.
4. Give zero for questions where either the wrong answer was circled, more than one answer was circled or no answer was circled.
5. The total raw score will be a number between 0 and 25. Write the total raw score in the space provided on the front cover of the booklet.

For information, the correct answers are:

| Question | Answer |
|---|---|
| 1 | 19 |
| 2 | 35 |
| 3 | 18 |
| 4 | 30 |
| 5 | 56 |
| 6 | 34 |
| 7 | 48 |
| 8 | 35 |
| 9 | 65 |
| 10 | 8 |
| 11 | 42 |
| 12 | 2 |
| 13 | 45 |
| 14 | 7 |
| 15 | 16 |
| 16 | 207 |
| 17 | 50 |
| 18 | 931 |
| 19 | 15 |
| 20 | 168 |
| 21 | 176 |
| 22 | 148 |
| 23 | 67 |
| 24 | 26 |
| 25 | 58.5 |
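The scoring rule above (one point for a single correct circled answer, zero otherwise) can be sketched in a few lines of Python. This is only an illustration for checking hand scoring, not part of the test materials: the function name `score_test_a` and the `responses` layout (question number mapped to the list of answers the candidate circled) are assumptions made for the example.

```python
# Answer key copied from the table of correct answers for Test A
# (question number -> correct answer).
ANSWER_KEY = dict(zip(range(1, 26), [
    19, 35, 18, 30, 56, 34, 48, 35, 65, 8, 42, 2, 45,
    7, 16, 207, 50, 931, 15, 168, 176, 148, 67, 26, 58.5,
]))


def score_test_a(responses):
    """Return the total raw score (0 to 25).

    `responses` is a hypothetical structure: question number -> list of
    the answers the candidate circled for that question.
    """
    total = 0
    for question, correct in ANSWER_KEY.items():
        circled = responses.get(question, [])
        # One point only when exactly one answer is circled and it is correct;
        # a wrong answer, several answers or no answer all score zero.
        if len(circled) == 1 and circled[0] == correct:
            total += 1
    return total
```

For instance, a candidate who circles 19 for question 1, circles both 35 and 18 for question 2, and circles 17 for question 3 would score 1 under this rule.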


SCORING INSTRUCTIONS FOR THE P5 PERSONALITY INVENTORY © No photocopying allowed

Check each answer sheet to ensure there is just one tick for each statement and that the tick is clearly in one of the five columns. Where this is not the case, but it is clear what was intended, make the tick clearer in the appropriate column so that it will be visible when you come to do the scoring.

There are four scoring keys:

1. Positively scored items on Page 1
2. Negatively scored items on Page 1
3. Positively scored items on Page 2
4. Negatively scored items on Page 2

Place the scoring keys over the relevant answer sheets and align them. Each scoring key has a column headed ‘Scale’ which contains a number: 1, 2, 3, 4 or 5. Each number refers to one of the five personality scales. Add up the values of the ticks for each of the five scales in turn. That is, add up all the ratings given for items that are keyed for scale 1, then for scale 2 and so on. Be careful with the keys for statements that are negatively scored. As shown on the scoring key, for these the scores for each item go 5, 4, 3, 2, 1 and not 1, 2, 3, 4, 5 as you read from left to right.

• Enter the scores in the P5 Score Table (in this pack) as you use each key.
• Add the scores for the four keys together and place them in the Total Raw Score column.

There are 10 statements for each scale, so check that:

• The total scores are no lower than 10 and no higher than 50.

If any total falls outside this range, you have made a mistake and need to check the scoring. When you have checked that the scores are correct:

• Convert the five raw scores into standard scores using the Sten Conversion Table provided in the Test Pack.
• Finally, transfer the raw score total and the sten score to the candidate’s entry on the Test Session Log.
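The arithmetic behind these steps can be sketched as follows. The sketch assumes a tick is recorded as its column number (1 to 5, left to right) and that a key maps each statement number to a scale and a positive/negative flag; these data layouts and the function names are illustrative assumptions, not part of the P5 materials.

```python
def item_score(column, negative):
    """Score one tick. `column` is 1-5, counted from left to right."""
    if column not in (1, 2, 3, 4, 5):
        raise ValueError("each statement needs exactly one tick in columns 1-5")
    # Negatively scored items run 5, 4, 3, 2, 1 from left to right.
    return 6 - column if negative else column


def scale_totals(ticks, key):
    """Sum item scores per scale.

    `ticks` maps statement number -> ticked column (1-5).
    `key` maps statement number -> (scale, negative).
    Both layouts are hypothetical, chosen only for this example.
    """
    totals = {}
    for statement, (scale, negative) in key.items():
        totals[scale] = totals.get(scale, 0) + item_score(ticks[statement], negative)
    return totals


def check_totals(totals, items_per_scale=10):
    """Raise if any total is outside the possible range (10 to 50 for 10 items)."""
    low, high = items_per_scale, 5 * items_per_scale
    for scale, total in totals.items():
        if not low <= total <= high:
            raise ValueError(
                f"scale {scale}: total {total} is outside {low}-{high}; re-check the scoring"
            )
```

With a two-item key such as `{1: (1, False), 6: (1, True)}` and ticks `{1: 4, 6: 2}`, the positive item contributes 4 and the negative item contributes 6 - 2 = 4, giving a running total of 8 for scale 1.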


P5 PERSONALITY INVENTORY SCORE TABLE

Test Administrator: ………………………………………     Date:

Candidate Name:

| Scale | Page 1 Positive | Page 1 Negative | Page 2 Positive | Page 2 Negative | Total raw score | Sten score |
|---|---|---|---|---|---|---|
| 1 Extravert | | | | | | |
| 2 Agreeable | | | | | | |
| 3 Conscientious | | | | | | |
| 4 Emotionally Stable | | | | | | |
| 5 Open | | | | | | |


TEST SESSION LOG © No photocopying allowed

Test Administrator ...............................................
Qualified Test User [*] .........................................
Date .........................................

Names of tests administered:

Test 1 .........................................................
Test 2 .........................................................
Test 3 .........................................................
Test 4 .........................................................

TEST RESULTS (RAW SCORES)

| Candidate name/identifier | Test 1 | Test 2 | Test 3 | Test 4 |
|---|---|---|---|---|
| 1 | | | | |
| 2 | | | | |
| 3 | | | | |
| 4 | | | | |
| 5 | | | | |
| 6 | | | | |
| 7 | | | | |
| 8 | | | | |
| 9 | | | | |
| 10 | | | | |
| 11 | | | | |
| 12 | | | | |

* The Qualified Test User is the person responsible for the session, for the accuracy of scoring, security of materials and so on. He/she is the person who is authorized to obtain test materials and supervise their administration.


TEST SESSION LOG

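Test Administrator ..............................................
Qualified Test User ..............................................
Date ..............................................

TEST SESSION

| | Test 1 | Test 2 | Test 3 | Test 4 |
|---|---|---|---|---|
| Number in group: | | | | |
| Session start time: | | | | |
| Timed part – start: | | | | |
| Timed part – finish: | | | | |
| Session finish time: | | | | |

Materials

For each test, record the number of items set OUT, the number collected IN and the number CHECKED as suitable for return to stock. Explain any discrepancies on Page 3.

| Materials | Test 1 OUT | Test 1 IN | Test 1 CHECK | Test 2 OUT | Test 2 IN | Test 2 CHECK | Test 3 OUT | Test 3 IN | Test 3 CHECK | Test 4 OUT | Test 4 IN | Test 4 CHECK |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| test booklets | | | | | | | | | | | | |
| answer sheets | | | | | | | | | | | | |
| Test Manual | | | | | | | | | | | | |
| scoring keys | | | | | | | | | | | | |
| computer disks | | | | | | | | | | | | |
| stopwatch | | | | | | | | | | | | |
| pencils | | | | | | | | | | | | |
| pencil sharpeners | | | | | | | | | | | | |
| rubbers | | | | | | | | | | | | |
| scrap paper | | | | | | | | | | | | |
| other materials | | | | | | | | | | | | |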


TEST SESSION LOG

Test Administrator …………………………………
Qualified Test User …………………………………
Date …………………………………

Use this report form to describe:

1. Information on any candidate’s previous test-taking experience if relevant (for example, the same test having been taken recently).
2. Any problems encountered with particular candidates (for example, disabilities, problems with understanding instructions, illness, etc.).
3. Any unusual events or disturbance which occurred during the test session.
4. Any other information which might need to be taken into consideration in the interpretation of one or more candidates’ results.
5. Loss of or damage to materials: note what has been done (a) to dispose of damaged materials and (b) to ensure their replacement.

1.
2.
3.
4.
5.

Report completed by:

Name .....................................
Signature ................................


CANDIDATE EVALUATION QUESTIONNAIRE © No photocopying allowed

Would you please give feedback to your test administrator by filling in this appraisal sheet. Please be as objective and constructive as possible in your comments. This is solely for the benefit of the administrator to help him/her to learn from the session. It will not be used by anyone else for assessment or any other purpose.

For all your responses, enter a 1, 2, 3, 4, or 5 in the rating box using the rating scale shown. Use the numbers listed below to rate each of the aspects of the testing environment:

1 = Excellent
2 = Good
3 = Satisfactory
4 = Not satisfactory
5 = Poor

A. The test environment

How do you rate the following aspects of the environment in which the testing took place?

| Environment | Rating |
|---|---|
| Lighting | |
| Temperature | |
| Quietness | |
| Comfort of seating | |
| Privacy | |

B. Performance of the test administrator

How do you rate the following aspects of the performance of the test administrator?

| Performance | Rating |
|---|---|
| Explaining why the test was to be done | |
| Making me feel comfortable and relaxed about doing the test | |
| Administrator’s familiarity and ease with the test instruction | |
| Answering my questions about the test | |
| Stopping the test promptly on time | |
| Explaining to me what would happen to my results | |
| Feedback of results | |

C. Other comments

Were there any things the administrator failed to do which you think might have made things clearer for you?

Were there any things he/she did or said which you found unclear or confusing?

Please add any comments which you think might be helpful.

Please give this completed feedback sheet to your test administrator

PT_S03b(acetates).qxd 10/07/2006 12:12 Page 23

Align circle on key with asterisks in test booklet

Test A Scoring key


P5 PERSONALITY INVENTORY SCORING KEY: STATEMENTS 1 TO 25: POSITIVE ITEMS

Score by column: Very Inaccurate = 1, Moderately Inaccurate = 2, Neither Inaccurate nor Accurate = 3, Moderately Accurate = 4, Very Accurate = 5.

| Statement no. | Scale | Statement |
|---|---|---|
| 1 | 1 | I am the life of the party. |
| 3 | 3 | I am always prepared. |
| 5 | 5 | I have a rich vocabulary. |
| 7 | 2 | I am interested in people. |
| 9 | 4 | I am relaxed most of the time. |
| 11 | 1 | I feel comfortable around people. |
| 13 | 3 | I pay attention to details. |
| 15 | 5 | I have a vivid imagination. |
| 17 | 2 | I sympathize with others’ feelings. |
| 19 | 4 | I seldom feel blue. |
| 21 | 1 | I start conversations. |
| 23 | 3 | I get chores done right away. |
| 25 | 5 | I have excellent ideas. |


P5 PERSONALITY INVENTORY SCORING KEY: STATEMENTS 1 TO 25: NEGATIVE ITEMS

Score by column: Very Inaccurate = 5, Moderately Inaccurate = 4, Neither Inaccurate nor Accurate = 3, Moderately Accurate = 2, Very Accurate = 1.

| Statement no. | Scale | Statement |
|---|---|---|
| 2 | 2 | I feel little concern for others. |
| 4 | 4 | I get stressed out easily. |
| 6 | 1 | I don’t talk a lot. |
| 8 | 3 | I leave my belongings lying around. |
| 10 | 5 | I have difficulty understanding abstract ideas. |
| 12 | 2 | I insult people. |
| 14 | 4 | I worry about things. |
| 16 | 1 | I keep in the background. |
| 18 | 3 | I make a mess of things. |
| 20 | 5 | I am not interested in abstract ideas. |
| 22 | 2 | I am not interested in other people’s problems. |
| 24 | 4 | I am easily disturbed. |


P5 PERSONALITY INVENTORY SCORING KEY: STATEMENTS 26 TO 50: POSITIVE ITEMS

Score by column: Very Inaccurate = 1, Moderately Inaccurate = 2, Neither Inaccurate nor Accurate = 3, Moderately Accurate = 4, Very Accurate = 5.

| Statement no. | Scale | Statement |
|---|---|---|
| 27 | 2 | I have a soft heart. |
| 31 | 1 | I talk to a lot of different people at parties. |
| 33 | 3 | I like order. |
| 35 | 5 | I am quick to understand things. |
| 37 | 2 | I take time out for others. |
| 40 | 5 | I use difficult words. |
| 41 | 1 | I don’t mind being the centre of attention. |
| 42 | 2 | I feel others’ emotions. |
| 43 | 3 | I follow a schedule. |
| 45 | 5 | I spend time reflecting on things. |
| 47 | 2 | I make people feel at ease. |
| 48 | 3 | I am exacting in my work. |
| 50 | 5 | I am full of ideas. |


P5 PERSONALITY INVENTORY SCORING KEY: STATEMENTS 26 TO 50: NEGATIVE ITEMS

Score by column: Very Inaccurate = 5, Moderately Inaccurate = 4, Neither Inaccurate nor Accurate = 3, Moderately Accurate = 2, Very Accurate = 1.

| Statement no. | Scale | Statement |
|---|---|---|
| 26 | 1 | I have little to say. |
| 28 | 3 | |
| 29 | 4 | I get upset easily. |
| 30 | 5 | I do not have a good imagination. |
| 32 | 2 | I am not really interested in others. |
| 34 | 4 | I change my mood a lot. |
| 36 | 1 | I don’t like to draw attention to myself. |
| 38 | 3 | I shirk my duties. |
| 39 | 4 | I have frequent mood swings. |
| 44 | 4 | I get irritated easily. |
| 46 | 1 | I am quiet around strangers. |
| 49 | 4 | I often feel blue. |

PT_S04.qxd 06/12/2006 15:06 Page 1


TEST PACK Section 4 TEST A, TEST B and P5 Norm Tables David Bartram Patricia A. Lindley






The tables on this page are derived from simulated data. They are for training purposes only.

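Norm Table 1

TEST A: Exact percentiles, specific groups

| Raw score | 11-year-olds (n = 500) | Craft apprentices (n = 500) | University students (n = 500) |
|---|---|---|---|
| 0 | 0.0 | 0.0 | 0.0 |
| 1 | 0.3 | 0.0 | 0.0 |
| 2 | 1.2 | 0.0 | 0.0 |
| 3 | 2.7 | 0.1 | 0.0 |
| 4 | 5.0 | 0.4 | 0.0 |
| 5 | 8.0 | 0.9 | 0.0 |
| 6 | 13.1 | 1.5 | 0.0 |
| 7 | 22.4 | 2.7 | 0.0 |
| 8 | 37.0 | 4.8 | 0.0 |
| 9 | 51.3 | 8.9 | 0.0 |
| 10 | 62.2 | 13.8 | 0.0 |
| 11 | 73.6 | 19.9 | 0.2 |
| 12 | 83.8 | 28.7 | 1.2 |
| 13 | 90.2 | 40.5 | 3.2 |
| 14 | 93.5 | 53.5 | 6.8 |
| 15 | 95.5 | 64.3 | 13.3 |
| 16 | 98.0 | 73.8 | 23.4 |
| 17 | 99.0 | 81.2 | 36.6 |
| 18 | 99.7 | 86.7 | 52.2 |
| 19 | 100.0 | 91.8 | 65.9 |
| 20 | 100.0 | 95.3 | 76.9 |
| 21 | 100.0 | 97.2 | 86.6 |
| 22 | 100.0 | 98.6 | 92.7 |
| 23 | 100.0 | 99.5 | 96.6 |
| 24 | 100.0 | 99.9 | 98.8 |
| 25 | 100.0 | 100.0 | 99.6 |

| Statistic | 11-year-olds | Craft apprentices | University students |
|---|---|---|---|
| Minimum | 1.00 | 3.00 | 11.00 |
| Maximum | 18.00 | 24.00 | 25.00 |
| Mode | 8.00 | 13.00 | 18.00 |
| Median | 9.00 | 14.00 | 18.00 |
| Mean | 9.13 | 13.86 | 17.96 |
| Variance | 9.30 | 12.67 | 7.05 |
| SD | 3.05 | 3.56 | 2.66 |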
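Norm Table 1 shows that the same raw score corresponds to very different percentiles depending on the comparison group. A minimal lookup sketch makes this concrete; the percentile values are copied from the table, while the dictionary layout and the function name `percentile` are assumptions made for illustration.

```python
# Percentiles from Norm Table 1, indexed by raw score 0..25 (simulated data).
NORM_TABLE_1 = {
    "11-year-olds": [
        0.0, 0.3, 1.2, 2.7, 5.0, 8.0, 13.1, 22.4, 37.0, 51.3, 62.2, 73.6, 83.8,
        90.2, 93.5, 95.5, 98.0, 99.0, 99.7, 100.0, 100.0, 100.0, 100.0, 100.0,
        100.0, 100.0,
    ],
    "Craft apprentices": [
        0.0, 0.0, 0.0, 0.1, 0.4, 0.9, 1.5, 2.7, 4.8, 8.9, 13.8, 19.9, 28.7,
        40.5, 53.5, 64.3, 73.8, 81.2, 86.7, 91.8, 95.3, 97.2, 98.6, 99.5,
        99.9, 100.0,
    ],
    "University students": [
        0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2, 1.2,
        3.2, 6.8, 13.3, 23.4, 36.6, 52.2, 65.9, 76.9, 86.6, 92.7, 96.6,
        98.8, 99.6,
    ],
}


def percentile(group, raw_score):
    """Look up the percentile for a Test A raw score in one norm group."""
    if not 0 <= raw_score <= 25:
        raise ValueError("Test A raw scores run from 0 to 25")
    return NORM_TABLE_1[group][raw_score]


# The same raw score of 14 read against each group.
for group in NORM_TABLE_1:
    print(group, percentile(group, 14))
```

A raw score of 14 sits at the 93.5th percentile for 11-year-olds, the 53.5th for craft apprentices and the 6.8th for university students, which is why the choice of norm group has to be recorded alongside any percentile.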




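Norm Table 2

TEST A: General population norms (n = 5000), 5% percentile bands

| T-score | Raw score | Percentile | Percentile band |
|---|---|---|---|
| 27 | 0–3 | 1 | 1 |
| 31 | 4 | 3 | 2–3 |
| 34 | 5–6 | 5 | 4–7 |
| 37 | 7 | 10 | 8–12 |
| 40 | 8 | 15 | 13–17 |
| 42 | 9 | 20 | 18–22 |
| 43 | – | 25 | 23–27 |
| 45 | 10 | 30 | 28–32 |
| 46 | – | 35 | 33–37 |
| 47 | 11 | 40 | 38–42 |
| 49 | – | 45 | 43–47 |
| 50 | 12 | 50 | 48–52 |
| 51 | – | 55 | 53–57 |
| 53 | 13 | 60 | 58–62 |
| 54 | – | 65 | 63–67 |
| 55 | 14 | 70 | 68–72 |
| 57 | 15 | 75 | 73–77 |
| 58 | | 80 | 78–82 |
| 60 | 16 | 85 | 83–87 |
| 63 | 17 | 90 | 88–92 |
| 66 | 18–19 | 95 | 93–96 |
| 69 | 20 | 97 | 97–98 |
| 73 | 21–25 | 99 | 99 |

| Statistic | Value |
|---|---|
| Mean | 11.98 |
| Median | 12.00 |
| Mode | 12.00 |
| Std Dev | 4.00 |
| Variance | 16.02 |
| Minimum | .00 |
| Maximum | 25.00 |
| Valid Cases | 5000 |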
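The T-score column in Norm Table 2 is numerically consistent with normalised T-scores, that is, 50 + 10z where z is the standard normal deviate for the percentile. The table itself does not state how the T-scores were produced, so the sketch below is an observation about the numbers rather than a description of the authors' method; the function name is an assumption.

```python
from statistics import NormalDist


def normalised_t_score(percentile):
    """T-score for a percentile under a normal model: 50 + 10 * z, rounded."""
    z = NormalDist().inv_cdf(percentile / 100.0)
    return round(50 + 10 * z)


PERCENTILES = [1, 3, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50,
               55, 60, 65, 70, 75, 80, 85, 90, 95, 97, 99]

t_scores = [normalised_t_score(p) for p in PERCENTILES]
print(t_scores)
```

Under this assumption the 50th percentile maps to T = 50 and the 1st and 99th percentiles map to T = 27 and T = 73, matching the first and last rows of the table.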




Norm Table 3

TEST B: Verbal Reasoning

Percentile conversion tables for four occupational samples

Samples:

W: Mixed sample of managers and supervisors from a range of small and medium-sized companies.
X: Technical supervisor grades from a range of public sector organizations.
Y: Middle management sample from a range of medium-sized companies.
Z: Graduate-entry, middle-management sample from a large multinational.

| T-score | W (n = 700) | X (n = 700) | Y (n = 700) | Z (n = 700) | Percentile | Percentile band |
|---|---|---|---|---|---|---|
| 27 | 0–6 | 0–3 | 0–12 | 0–23 | 1 | 1 |
| 31 | 7–10 | 4–5 | 13–15 | 24–26 | 3 | 2–3 |
| 34 | 11–14 | 6–8 | 16–18 | 27–28 | 5 | 4–7 |
| 37 | 15–17 | 9–10 | 19–20 | 29–30 | 10 | 8–12 |
| 40 | 18–19 | 11–12 | 21–22 | 31–32 | 15 | 13–17 |
| 42 | 20–21 | 13 | 23 | 33 | 20 | 18–22 |
| 43 | 22 | 14 | 24 | 34 | 25 | 23–27 |
| 45 | 23–24 | 15 | 25 | 35–36 | 30 | 28–32 |
| 46 | 25–26 | 16 | 26 | 37 | 35 | 33–37 |
| 47 | 27 | 17 | 27 | 38 | 40 | 38–42 |
| 49 | 28–29 | 18 | 28 | 39 | 45 | 43–47 |
| 50 | 30 | 19 | 29 | 40 | 50 | 48–52 |
| 51 | 31–32 | 20 | 30 | 41 | 55 | 53–57 |
| 53 | 33 | 21 | 31 | 42 | 60 | 58–62 |
| 54 | 34–35 | 22 | 32 | 43 | 65 | 63–67 |
| 55 | 36–37 | 23–24 | 33–34 | 44 | 70 | 68–72 |
| 57 | 38–39 | 25 | 35 | 45 | 75 | 73–77 |
| 58 | 40 | 26–27 | 36 | 46–47 | 80 | 78–82 |
| 60 | 41–43 | 28–29 | 37–38 | 48–49 | 85 | 83–87 |
| 63 | 44–46 | 30–31 | 39–41 | 50–51 | 90 | 88–92 |
| 66 | 47–49 | 32–33 | 42–44 | 52–53 | 95 | 93–96 |
| 69 | 50–55 | 34–36 | 45–48 | 54–57 | 97 | 97–98 |
| 73 | 56–60 | 37–60 | 49–60 | 58–60 | 99 | 99 |

| Statistic | W | X | Y | Z |
|---|---|---|---|---|
| Mean | 30.40 | 19.80 | 29.74 | 40.17 |
| SD | 11.44 | 7.87 | 8.28 | 7.80 |
| Minimum | 0 | 0 | 1 | 14 |
| Maximum | 60 | 47 | 55 | 60 |
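Reading Norm Table 3 means finding the raw-score band that contains the candidate's score in the chosen sample's column and then reading across to the T-score and percentile. A minimal sketch of that lookup is given below. The band upper limits are copied from the table; storing only the upper limits and searching them with `bisect_left` is a design choice for this example, and the names `BAND_UPPER` and `convert` are hypothetical.

```python
from bisect import bisect_left

T_SCORES = [27, 31, 34, 37, 40, 42, 43, 45, 46, 47, 49, 50,
            51, 53, 54, 55, 57, 58, 60, 63, 66, 69, 73]
PERCENTILES = [1, 3, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50,
               55, 60, 65, 70, 75, 80, 85, 90, 95, 97, 99]

# Upper raw-score limit of each band, one list per sample (Norm Table 3).
BAND_UPPER = {
    "W": [6, 10, 14, 17, 19, 21, 22, 24, 26, 27, 29, 30,
          32, 33, 35, 37, 39, 40, 43, 46, 49, 55, 60],
    "X": [3, 5, 8, 10, 12, 13, 14, 15, 16, 17, 18, 19,
          20, 21, 22, 24, 25, 27, 29, 31, 33, 36, 60],
    "Y": [12, 15, 18, 20, 22, 23, 24, 25, 26, 27, 28, 29,
          30, 31, 32, 34, 35, 36, 38, 41, 44, 48, 60],
    "Z": [23, 26, 28, 30, 32, 33, 34, 36, 37, 38, 39, 40,
          41, 42, 43, 44, 45, 47, 49, 51, 53, 57, 60],
}


def convert(sample, raw_score):
    """Return (T-score, percentile) for a Test B raw score in one sample."""
    if not 0 <= raw_score <= 60:
        raise ValueError("Test B raw scores run from 0 to 60")
    # Index of the first band whose upper limit is >= the raw score.
    row = bisect_left(BAND_UPPER[sample], raw_score)
    return T_SCORES[row], PERCENTILES[row]


for sample in "WXYZ":
    print(sample, convert(sample, 30))
```

A raw score of 30 is at the 50th percentile in sample W, the 90th in X, the 55th in Y and the 10th in Z, so the interpretation depends entirely on which sample is the relevant comparison group.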



P5 PERSONALITY INVENTORY STEN CONVERSION TABLE

WARNING: The information contained in this table has been created for training purposes. Scores obtained using this table should not be interpreted as genuine test scores.

| Scale | Mean | SD | Sten 1 | Sten 2 | Sten 3 | Sten 4 | Sten 5 | Sten 6 | Sten 7 | Sten 8 | Sten 9 | Sten 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Extravert | 32.21 | 6.82 | 10–18 | 19–21 | 22–25 | 26–28 | 29–32 | 33–35 | 36–39 | 40–42 | 43–45 | 46–50 |
| Agreeable | 25.57 | 5.73 | 10–14 | 15–16 | 17–19 | 20–22 | 23–25 | 26–28 | 29–31 | 32–34 | 35–37 | 38–50 |
| Conscientious | 29.32 | 6.35 | 10–16 | 17–19 | 20–22 | 23–26 | 27–29 | 30–32 | 33–35 | 36–38 | 39–42 | 43–50 |
| Emotionally Stable | 34.29 | 7.13 | 10–20 | 21–23 | 24–27 | 28–30 | 31–34 | 35–37 | 38–41 | 42–44 | 45–48 | 49–50 |
| Open | 31.46 | 6.72 | 10–18 | 19–21 | 22–24 | 25–28 | 29–31 | 32–34 | 35–38 | 39–41 | 42–44 | 45–50 |
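Converting a P5 raw score to a sten is a band lookup of the same kind. The sketch below stores the upper raw-score limit of each sten band from the training table above; the names `STEN_UPPER` and `sten` are assumptions for illustration, and the values apply only to this simulated training table.

```python
from bisect import bisect_left

# Upper raw-score limit of sten bands 1..10 for each scale (training data only).
STEN_UPPER = {
    "Extravert":          [18, 21, 25, 28, 32, 35, 39, 42, 45, 50],
    "Agreeable":          [14, 16, 19, 22, 25, 28, 31, 34, 37, 50],
    "Conscientious":      [16, 19, 22, 26, 29, 32, 35, 38, 42, 50],
    "Emotionally Stable": [20, 23, 27, 30, 34, 37, 41, 44, 48, 50],
    "Open":               [18, 21, 24, 28, 31, 34, 38, 41, 44, 50],
}


def sten(scale, raw_score):
    """Return the sten (1-10) for a raw scale total between 10 and 50."""
    if not 10 <= raw_score <= 50:
        raise ValueError("raw scale totals must lie between 10 and 50")
    # The first band whose upper limit is >= the raw score gives the sten.
    return bisect_left(STEN_UPPER[scale], raw_score) + 1
```

For example, a raw total of 32 converts to sten 5 on Extravert but sten 8 on Agreeable, because the two scales have different means in the table.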


PT_S05.qxd 06/12/2006 15:11 Page 1


TEST PACK Section 5 VeNuS Manual David Bartram Patricia A. Lindley





The VeNuS General Ability Test Battery

User’s Manual

WARNING This manual is only for use with the BPS Open Learning Programme. All the data were produced by computer simulation. While the information described has been designed to be realistic and to match that which one would expect from ability tests of this type, the VeNuS tests are fictitious.


Manual for the VeNuS General Ability Test Battery

The VeNuS (Verbal, Numerical and Spatial) general ability test battery contains three specific ability subscales: Verbal (V), Numerical (N) and Spatial (S). Each consists of 40 items of various different types. All three subscales include 10 analogies and 10 series completion items. The remaining 20 items for each scale are:

• Verbal: semantic category membership and 10 reasoning items
• Numerical: two types of computation
• Spatial: pattern completion and addition or subtraction

An overall measure of general ability (G) is obtained from the sum of the raw scores on the three scales. The tests are designed as power rather than speed tests. Though administered with a time limit of 30 minutes for each test, most people from the relevant population should have little difficulty completing all items in this time. The tests have been designed for use with the general population and are standardized on the age range 15 to 28 years.

1 Standardization and norms

Standardization was carried out on a sample of 600 people, in the age range 15 to 28 years. The sample contained approximately equal numbers of males and females for each of three age bands: 15 to 17; 18 to 21; 22 to 28. There were 200 people in each age band. All those in the 15 to 17 band were in school. The 18 to 21 band included 100 people in higher education; the rest were in employment. All those in the older band (22 to 28) were in full-time employment. In all, there were 320 males and 280 females. Thirty of the sample described their ethnic origin as ‘Black’, 550 as ‘White’ and 20 gave no response.

General population norms and measures of reliability (Cronbach’s alpha) were based on this sample (see Annex 1). Additional occupational norm groups have also been obtained. Details of these are contained in Annex 2.

2 Reliability

Internal consistency measures (Cronbach’s alpha) were obtained from the item data for the full sample of 600 people. These, together with raw score SEMs and T-score SEMs, are shown in Table 1.

TABLE 1: Alpha reliability and SEMs for the three subscales and general ability scale (N = 600)

| Scale | Raw score Mean | Raw score SD | Alpha | Raw score SEM | SEM/SD | T-score SEM |
|---|---|---|---|---|---|---|
| Verbal | 20.74 | 7.35 | 0.89 | 2.44 | 0.33 | 3.30 |
| Numerical | 20.89 | 7.01 | 0.92 | 1.98 | 0.28 | 2.80 |
| Spatial | 21.13 | 6.81 | 0.90 | 2.15 | 0.32 | 3.20 |
| General | 62.77 | 17.98 | 0.95 | 4.02 | 0.22 | 2.20 |
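The SEM values in Table 1 are consistent with the usual classical formula SEM = SD × √(1 − reliability), with alpha used as the reliability estimate. The manual does not spell the formula out, so the sketch below is a check under that assumption. The T-score SEMs in the table appear to be the rounded SEM/SD ratio multiplied by 10 (the SD of T-scores); that too is an inference from the numbers, not something the manual states.

```python
import math


def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


# (raw-score SD, alpha) for each scale, copied from Table 1.
TABLE_1 = {
    "Verbal": (7.35, 0.89),
    "Numerical": (7.01, 0.92),
    "Spatial": (6.81, 0.90),
    "General": (17.98, 0.95),
}

for scale, (sd, alpha) in TABLE_1.items():
    raw_sem = round(sem(sd, alpha), 2)
    ratio = round(sem(1.0, alpha), 2)      # SEM expressed in SD units
    t_sem = round(10 * ratio, 2)           # assumed: T-scores have SD 10
    print(f"{scale}: raw SEM {raw_sem}, SEM/SD {ratio}, T-score SEM {t_sem}")
```

For the Verbal subscale this gives 7.35 × √0.11 ≈ 2.44 raw score points, in line with the table.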



A subset of 120 of the 600 people returned within two to three weeks of the first test session and completed a retest. Their results are shown in Table 2. It can be seen that this group was somewhat above average and slightly restricted in range relative to the standardization sample. The retest scores show an average increase of three or four raw score points together with a reduction in variance. This is typical of the sort of short-term retest effects found for ability tests. In general the retest correlations are high, despite the reduced range of the sample and the further reduction in range on the retest scores.

TABLE 2: Test-retest reliability (N = 120)

| Scale | First test Mean | First test SD | Retest Mean | Retest SD | Correlation |
|---|---|---|---|---|---|
| Verbal | 23.21 | 6.57 | 27.33 | 6.24 | 0.83 |
| Numerical | 22.14 | 6.47 | 25.56 | 6.03 | 0.81 |
| Spatial | 24.65 | 6.36 | 29.15 | 6.11 | 0.79 |
| General | 70.00 | 16.88 | 82.04 | 15.99 | 0.92 |

The evidence suggests that the measures are stable, with SEMs of about one-third of an SD for the subscales and about one-quarter of an SD for the general ability measure.

3 Gender and ethnic group differences

The mean subscale and scale scores for the standardization sample, broken down by gender, are shown in Table 3. It is apparent that the overall ability score (G) is slightly higher for females than for males. This is due mainly to the higher average scores for females on Verbal Ability. Males show a smaller advantage on the Numerical and Spatial subscales.

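TABLE 3: Male (N = 320) and female (N = 280) mean scores for the three subscales and general ability scale

| Scale | Male Mean | Male SD | Female Mean | Female SD |
|---|---|---|---|---|
| Verbal | 18.50 | 7.28 | 23.30 | 6.56 |
| Numerical | 21.45 | 6.87 | 20.27 | 7.12 |
| Spatial | 21.87 | 7.14 | 20.28 | 6.33 |
| General | 61.82 | 18.51 | 63.85 | 17.33 |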



The standardization sample contains 30 people who classified themselves as ‘Black’ in ethnic origin and 550 who classified themselves as ‘White’. The remaining 20 were not classified. Scores for these two ethnic groups are shown in Table 4. It is apparent that there is a large difference on the Verbal subscale. The Spatial subscale shows the smallest difference of all three.

TABLE 4: ‘White’ (N = 550) and ‘Black’ (N = 30) mean scores for the three subscales and general ability scale

| Scale | ‘White’ Mean | ‘White’ SD | ‘Black’ Mean | ‘Black’ SD |
|---|---|---|---|---|
| Verbal | 21.04 | 7.15 | 14.97 | 8.73 |
| Numerical | 21.01 | 7.01 | 18.70 | 6.74 |
| Spatial | 21.18 | 6.79 | 20.10 | 7.27 |
| General | 63.24 | 17.79 | 53.77 | 19.55 |

There was no evidence to show that effects of gender and ethnic origin interacted with each other.

4 Validity

4.1 Correlations between the scales

As expected, the three subscales have moderately high correlations with each other (see Table 5). The very high correlations between each subscale and G are to be expected as each forms part of the G scale.

TABLE 5: Correlations between the subscales and with the general ability scale (N = 600)

| | V | N | S |
|---|---|---|---|
| Numerical | 0.58 | | |
| Spatial | 0.52 | 0.65 | |
| General | 0.83 | 0.87 | 0.84 |

Principal components analysis of the three subscales showed only one component with an eigenvalue greater than one (2.17), which accounted for 72.2% of the variance. This factor, which can be considered to correspond to G, includes equal amounts of each subscale. The communalities are 0.67, 0.77 and 0.73 for V, N and S respectively.
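The first principal component can be checked approximately from the three subscale intercorrelations in Table 5. The sketch below uses plain power iteration, which is a stand-in technique chosen to keep the example dependency-free; it is not necessarily how the manual's analysis was run, and because it starts from correlations rounded to two decimals, small differences from the published figures are to be expected. Communalities are taken as the eigenvalue multiplied by the squared eigenvector weights.

```python
def largest_eigenvalue(matrix, iterations=200):
    """Power iteration for the dominant eigenvalue and unit eigenvector
    of a symmetric matrix."""
    n = len(matrix)
    vector = [1.0] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in product) ** 0.5
        vector = [x / norm for x in product]
        # Rayleigh quotient of the normalised vector.
        eigenvalue = sum(
            vector[i] * sum(matrix[i][j] * vector[j] for j in range(n))
            for i in range(n)
        )
    return eigenvalue, vector


# Correlations between V, N and S from Table 5.
R = [
    [1.00, 0.58, 0.52],
    [0.58, 1.00, 0.65],
    [0.52, 0.65, 1.00],
]

eigenvalue, vector = largest_eigenvalue(R)
proportion = eigenvalue / len(R)                 # share of total variance
communalities = [eigenvalue * v * v for v in vector]
print(round(eigenvalue, 2), round(proportion, 3), [round(c, 2) for c in communalities])
```

With these inputs the first eigenvalue comes out at about 2.17, roughly 72% of the variance, and the communalities are close to the 0.67, 0.77 and 0.73 reported for V, N and S.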



4.2 Correlations with other tests A series of studies was carried out to assess the construct validity of the three subscales.

TABLE 6: Correlations of VeNuS with some other tests

| Test | V | N | S | G |
|---|---|---|---|---|
| Study A, N = 65: | | | | |
| Ravens SPM | 0.43 | 0.54 | 0.71 | 0.63 |
| AH4 Part 1 Verbal/Numerical | 0.58 | 0.56 | 0.36 | 0.62 |
| AH4 Part 2 Diagrammatic | 0.37 | 0.41 | 0.50 | 0.56 |
| AH4 Total | 0.50 | 0.54 | 0.45 | 0.67 |
| Study B, N = 87: | | | | |
| GAT Numerical | 0.55 | 0.70 | 0.47 | 0.61 |
| GAT Verbal | 0.68 | 0.51 | 0.44 | 0.65 |
| GAT Non-verbal | 0.51 | 0.37 | 0.56 | 0.59 |
| GAT Spatial | 0.38 | 0.28 | 0.72 | 0.58 |
| Study C, N = 58: | | | | |
| DAT Verbal Reasoning | 0.61 | 0.43 | 0.50 | 0.57 |
| DAT Numerical Ability | 0.45 | 0.51 | 0.39 | 0.49 |
| DAT Abstract Reasoning | 0.27 | 0.33 | 0.43 | 0.45 |
| DAT Space Relations | 0.18 | 0.21 | 0.52 | 0.38 |
| Study D, N = 93: | | | | |
| SHL TTB Verbal VT5 | 0.51 | 0.35 | 0.40 | 0.46 |
| SHL TTB Numerical NT6 | 0.34 | 0.66 | 0.36 | 0.53 |
| SHL TTB Spatial ST7 | 0.23 | 0.15 | 0.45 | 0.36 |
| SHL TTB Diagrammatic DT8 | 0.21 | 0.25 | 0.33 | 0.29 |

4.3 Relationships with academic criteria A sample of 215 first-year psychology undergraduate students was tested early in their first term. Of these, 212 subsequently obtained third-class or better honours degrees. Their final degree results, with average VeNuS test scores, are shown in Table 7.

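TABLE 7: Variations in VeNuS scores with degree classification

| Degree classification | Number | V | N | S | G |
|---|---|---|---|---|---|
| First Class | 25 | 34.70 | 35.37 | 28.41 | 98.48 |
| Upper Second Class | 105 | 31.82 | 32.23 | 27.76 | 91.81 |
| Lower Second Class | 68 | 30.51 | 31.28 | 26.52 | 88.31 |
| Third Class | 14 | 26.26 | 28.43 | 25.10 | 79.79 |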



5 Job-related validity

Data are available from three job-related validity studies. Norm tables are provided for each sample in Annex 2.

5.1 Engineering apprentice training

A sample of 256 applicants for apprentice training was tested during selection (though the results of the tests were not used in the selection procedure). Subsequently 68 were taken into training, and correlations were obtained with performance in college assignments and examinations and supervisor ratings of practical work (see Table 8).

TABLE 8: Correlations with engineering training outcome (adjusted for indirect range restriction effects of selection)

| Outcome | V | N | S |
|---|---|---|---|
| Final college examination | 0.42 | 0.36 | 0.03 |
| College assessed coursework | 0.41 | 0.33 | 0.16 |
| Ratings of practical work | 0.11 | 0.21 | 0.29 |

By the end of their training, about 70% of the intake were considered to be satisfactory. Based on these data, Table 9 was produced. This table provides raw score cut-offs for various selection ratios. For each, the percentage of those selected who are expected to be successful is given for each of three validities (0.30, 0.35 and 0.40).

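TABLE 9: Expectancy table (N = 68)

| Percent selected | Minimum raw score cut-off: V | N | S | G | Validity 0.30 | Validity 0.35 | Validity 0.40 |
|---|---|---|---|---|---|---|---|
| 20% | 27 | 31 | 32 | 87 | 81% | 86% | 88% |
| 40% | 22 | 27 | 28 | 77 | 78% | 82% | 83% |
| 60% | 20 | 24 | 25 | 69 | 76% | 78% | 79% |
| 80% | 15 | 20 | 22 | 60 | 73% | 75% | 75% |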



5.2 Secretarial training course

This was a sample of 420 applicants (all female) to FE college secretarial courses. Applicants were not selected on the basis of their tests. About 75% of the applicants were taken on to the courses following a career guidance interview. The only limitation on intake was the availability of places and the applicant’s decision about the course following the interview. Subsequent performance measures (a composite measure based on competence in the use of office equipment, office procedures, filing and typing) indicate that the Verbal scale was the best single predictor of training performance (r = 0.38), with both the other scales showing some relationship (r = 0.29 for Numerical and r = 0.18 for Spatial).

5.3 Supervisor ratings of 190 experienced cruiseline cabin staff

Cabin staff in a large cruiseline company were tested with VeNuS, together with a number of other measures, and their scores were related to supervisor ratings (on a 1–6 scale) of a number of characteristics. Three of the rating scales showed correlations with the VeNuS scales.

TABLE 10: Correlations between subscale scores and supervisor ratings

| Supervisor ratings | V | N | S |
|---|---|---|---|
| Coping with problems | 0.25 | 0.13 | 0.34 |
| Attention to detail | 0.19 | 0.31 | 0.26 |
| General efficiency | 0.33 | 0.15 | 0.11 |



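ANNEX 1: VeNuS NORMS FOR THE STANDARDIZATION SAMPLE

N = 600 in all cases. The values in the table are the raw scores corresponding to each percentile cut-off. Thus, 35% of the population score below 19 on Verbal, 80% score below 79 on the total score, and so on.

| Grade | Percentile | Verbal (raw) | Numerical (raw) | Spatial (raw) | General (raw) | T-Score |
|---|---|---|---|---|---|---|
| E | 1 | 2 | 5 | 5 | 19 | 27 |
| E | 3 | 4 | 8 | 9 | 27 | 31 |
| E | 5 | 8 | 9 | 10 | 34 | 34 |
| E | 10 | 11 | 12 | 12 | 40 | 37 |
| D | 15 | 14 | 14 | 14 | 44 | 40 |
| D | 20 | 15 | 15 | 15 | 48 | 42 |
| D | 25 | 16 | 16 | 16 | 50 | 43 |
| D | 30 | 17 | 17 | 18 | 53 | 45 |
| C | 35 | 18 | 18 | 19 | 56 | 46 |
| C | 40 | 19 | 19 | 20 | 58 | 47 |
| C | 45 | 20 | 20 | – | 60 | 49 |
| C | 50 | 21 | 21 | 21 | 63 | 50 |
| C | 55 | 22 | 22 | 22 | 66 | 51 |
| C | 60 | 23 | 23 | 23 | 68 | 52 |
| C | 65 | 24 | 24 | 24 | 70 | 54 |
| C | 70 | 25 | 25 | 25 | 73 | 55 |
| B | 75 | 26 | 26 | 26 | 77 | 57 |
| B | 80 | 27 | 27 | 27 | 79 | 58 |
| B | 85 | 28 | 28 | 28 | 82 | 60 |
| B | 90 | 30 | 30 | 30 | 87 | 63 |
| A | 95 | 32 | 32 | 32 | 91 | 66 |
| A | 97 | 34 | 34 | 33 | 94 | 69 |
| A | 99 | 39 | 37 | 35 | 101 | 73 |

| Statistic | Verbal | Numerical | Spatial | General |
|---|---|---|---|---|
| Mean | 20.74 | 20.90 | 21.13 | 62.77 |
| SD | 7.35 | 7.01 | 6.81 | 17.98 |
| n | 600 | 600 | 600 | 600 |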



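ANNEX 2: VeNuS NORMS FOR THE JOB-RELATED SAMPLES

Applicants for an engineering apprentice training scheme: N = 256 in all cases

| Grade | Percentile | Verbal (raw) | Numerical (raw) | Spatial (raw) | General (raw) | T-Score |
|---|---|---|---|---|---|---|
| E | 1 | 6 | 12 | 14 | 35 | 27 |
| E | 3 | 8 | 13 | 15 | 42 | 31 |
| E | 5 | 11 | 15 | 16 | 46 | 34 |
| E | 10 | 13 | 17 | 20 | 52 | 37 |
| D | 15 | 14 | 18 | 21 | 56 | 40 |
| D | 20 | 15 | 20 | 22 | 60 | 42 |
| D | 25 | 17 | 21 | 23 | 62 | 43 |
| D | 30 | 18 | 23 | 24 | 65 | 45 |
| C | 35 | 19 | – | 25 | 67 | 46 |
| C | 40 | 20 | 24 | – | 69 | 47 |
| C | 45 | – | 25 | 26 | 72 | 49 |
| C | 50 | 21 | 26 | – | 74 | 50 |
| C | 55 | 22 | 27 | 27 | 76 | 51 |
| C | 60 | – | – | 28 | 78 | 52 |
| C | 65 | 23 | 28 | 29 | 79 | 54 |
| C | 70 | 24 | 29 | 30 | 81 | 55 |
| B | 75 | 24 | 30 | 31 | 85 | 57 |
| B | 80 | 27 | 31 | 32 | 87 | 58 |
| B | 85 | 28 | 32 | 33 | 91 | 60 |
| B | 90 | 30 | 34 | 35 | 95 | 63 |
| A | 95 | 33 | 37 | 36 | 104 | 66 |
| A | 97 | 35 | 38 | 37 | 106 | 69 |
| A | 99 | 37 | 39 | 40 | 112 | 73 |

| Statistic | Verbal | Numerical | Spatial | General |
|---|---|---|---|---|
| Mean | 21.11 | 25.60 | 26.74 | 73.45 |
| SD | 6.67 | 6.40 | 5.71 | 16.56 |
| N | 256 | 256 | 256 | 256 |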



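Applicants for a national secretarial training course: N = 420 in all cases

| Grade | Percentile | Verbal (raw) | Numerical (raw) | Spatial (raw) | General (raw) | T-Score |
|---|---|---|---|---|---|---|
| E | 1 | 11 | 6 | 2 | 26 | 27 |
| E | 3 | 14 | 9 | 5 | 32 | 31 |
| E | 5 | 15 | 10 | 6 | 37 | 34 |
| E | 10 | 17 | 13 | 9 | 41 | 37 |
| D | 15 | 19 | 15 | 10 | 46 | 40 |
| D | 20 | 20 | 16 | 12 | 50 | 42 |
| D | 25 | 21 | 17 | 13 | 52 | 43 |
| D | 30 | 22 | 18 | 14 | 55 | 45 |
| C | 35 | – | 19 | 15 | 57 | 46 |
| C | 40 | 23 | 20 | 16 | 60 | 47 |
| C | 45 | 24 | 21 | 17 | 62 | 49 |
| C | 50 | – | – | 18 | 64 | 50 |
| C | 55 | 25 | 22 | 19 | 66 | 51 |
| C | 60 | 26 | 23 | 20 | 68 | 52 |
| C | 65 | 27 | 24 | – | 69 | 54 |
| C | 70 | 28 | 25 | 21 | 71 | 55 |
| B | 75 | – | 26 | 22 | 74 | 57 |
| B | 80 | 29 | 27 | 23 | 77 | 58 |
| B | 85 | 31 | 28 | 24 | 81 | 60 |
| B | 90 | 32 | 30 | 26 | 85 | 63 |
| A | 95 | 35 | 32 | 29 | 90 | 66 |
| A | 97 | 37 | – | 30 | 94 | 69 |
| A | 99 | 38 | 37 | 34 | 102 | 73 |

| Statistic | Verbal | Numerical | Spatial | General |
|---|---|---|---|---|
| Mean | 24.61 | 21.20 | 17.64 | 64.46 |
| SD | 5.80 | 6.50 | 6.68 | 16.30 |
| N | 420 | 420 | 420 | 420 |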



Cruiseline cabin staff: N = 190 in all cases

| Grade | Percentile | Verbal (raw) | Numerical (raw) | Spatial (raw) | General (raw) | T-Score |
|---|---|---|---|---|---|---|
| E | 1 3 5 10 | 5 6 7 10 | 7 9 11 13 | 10 14 15 17 | 27 34 38 42 | 27 31 34 37 |
| D | 15 20 25 30 | 11 13 14 | 14 15 17 18 | 18 19 20 21 | 45 50 53 55 | 40 42 43 45 |
| C | 35 40 45 50 55 60 65 70 | 15 16 17 18 19 20 21 | 20 21 22 23 24 25 | 22 23 24 25 26 27 – | 56 58 62 63 66 68 69 72 | 46 47 49 50 51 52 54 55 |
| B | 75 80 85 90 | 22 23 24 | 26 27 29 – | 28 29 31 32 | 74 76 79 81 | 57 58 60 63 |
| A | 95 97 99 | 26 28 31 | 31 32 34 | 34 35 37 | 87 89 92 | 66 69 73 |

| Statistic | Verbal | Numerical | Spatial | General |
|---|---|---|---|---|
| Mean | 17.21 | 21.19 | 24.14 | 62.54 |
| SD | 5.68 | 6.21 | 5.68 | 14.85 |
| N | 190 | 190 | 190 | 190 |

Histogram frequencies

| Midpoint | Verbal Freq | Numerical Freq | Spatial Freq |
|---|---|---|---|
| 2 | 3 | 4 | 0 |
| 4 | 7 | 1 | 4 |
| 6 | 10 | 5 | 4 |
| 8 | 6 | 14 | 12 |
| 10 | 15 | 14 | 17 |
| 12 | 26 | 34 | 31 |
| 14 | 24 | 36 | 45 |
| 16 | 52 | 52 | 43 |
| 18 | 63 | 63 | 45 |
| 20 | 71 | 71 | 73 |
| 22 | 71 | 58 | 70 |
| 24 | 61 | 62 | 67 |
| 26 | 59 | 50 | 54 |
| 28 | 54 | 43 | 42 |
| 30 | 36 | 37 | 41 |
| 32 | 19 | 27 | 32 |
| 34 | 12 | 17 | 12 |
| 36 | 4 | 8 | 5 |
| 38 | 1 | 3 | 2 |
| 40 | 6 | 1 | 1 |

| Midpoint | General Freq |
|---|---|
| 6 | 0 |
| 12 | 2 |
| 18 | 2 |
| 24 | 13 |
| 30 | 12 |
| 36 | 25 |
| 42 | 40 |
| 48 | 58 |
| 54 | 66 |
| 60 | 78 |
| 66 | 71 |
| 72 | 62 |
| 78 | 64 |
| 84 | 46 |
| 90 | 40 |
| 96 | 12 |
| 102 | 6 |
| 108 | 3 |
| 114 | 0 |
| 120 | 0 |

Valid Cases: 600. Missing Cases: 0.

PT_S06.qxd 06/12/2006 15:07 Page 1


ASSESSMENT PORTFOLIO David Bartram Patricia A. Lindley





ASSESSMENT PORTFOLIO

Introduction The Student Portfolio has been developed alongside Psychological Testing: The BPS Occupational Test Administration Open Learning Programme. It contains in one book all the questions that are presented in the modules. The questions take the form of 1.

Self-assessment questions (SAQs)

2.

Open questions (OQs)

3.

Exercises

In the modules they are used for the following purposes: 1.

SAQs: These are to enable the learner to review his/her learning and check his/her knowledge. Model Answers are provided in the text for each of these questions.

2.

OQs: These are to allow the learner to pause and reflect or speculate on the implications in a situation. There are no model answers for these.

3.

Exercises: These are to enable the reader to develop and practise skills on the basis of data provided in the Modules or the related material.

If the Portfolio is used in conjunction with the Modules, it offers a convenient method of collating all the learner’s answers to SAQs, makes provision for the Modules to be kept in an unmarked state, and can be used to give to an assessor as evidence that this work has been undertaken. All of the materials in this Portfolio will provide supporting evidence of competence for the Certificate of Competence in Test Administration, but the assessor will require additional evidence and this will need to be discussed with the individual assessor. It is clear that a demonstration of practical skills, such as test administration, will also be required.


MODULE 1

OPEN QUESTION

Pause and reflect for a moment on the use of the word ‘test’.

1. What does the word ‘test’ mean to you personally? What images or memories does it conjure up?
2. Can you think of any other terms to use instead, which you could use to describe a psychological test to someone who was about to have a test administered to them?

Write down your answers.

............................................................
............................................................
............................................................

These questions are aimed at making you think about the notion of testing people, and how people might react to the idea of being ‘tested’. You will have a chance to look back at these ideas when you come to Module 2, which deals with test administration.

Collect examples of ‘tests’ from Sunday newspapers and other magazines. Look at them carefully to see how close they come to meeting the criteria for a psychological test. Are they all just lists of questions?


SAQ 1.2.1

Pause and reflect for a moment on the differences between measures of maximum performance and measures of typical performance. (The following questions will help to structure your thoughts.)

1. What sorts of attributes are assessed by maximum performance measures and what sorts by typical performance measures?

............................................................
............................................................

2. How do they differ in the way in which they are timed?

............................................................
............................................................

SAQ 1.2.2

1. What is the key difference between attainment and aptitude tests?

............................................................
............................................................

2. Why is it unhelpful only to look at the items that make up these tests?

............................................................
............................................................

EXERCISE 1.4.1: Self-administration of Test A

Take Test A from the Test Pack and complete it yourself following the instructions in Section 3 of the Test Pack. Make sure you keep exactly to the time limit: use a stopwatch or a watch with a second hand. When you have completed this, use the scoring key in Section 3 of the Test Pack and give one point for every correct answer (nothing for incorrect or unanswered items).

SAQ 1.4.1

Looking at Table 1.4.1: What is the maximum raw score obtainable? What is the minimum raw score obtainable?



SAQ 1.5.1: Converting raw scores to percentiles

Using the information in Table 1.5.1, answer the following questions:

(a)
(b)
(c)

EXERCISE 1.5.1: Converting raw scores to percentiles and grades

Use the norm tables for Test A contained in the Test Pack. Find the appropriate Test A norm table to convert raw scores to percentiles and then use the General Purpose Conversion Tables to convert the percentile scores to grades. Convert the following raw scores:

Total raw score   11-year-olds Percentile   Grade     Grade     Grade
19                ..........                ........  ........  ........
18                ..........                ........  ........  ........
16                ..........                ........  ........  ........
15                ..........                ........  ........  ........
14                ..........                ........  ........  ........
12                ..........                ........  ........  ........
10                ..........                ........  ........  ........

EXERCISE 1.5.2: Using norm tables

(a) Julian Barnes is 16 years old. He obtains a raw score of 15 on Test A. How does this score compare with the scores of university students and craft apprentices? Use the tables in the Test Pack.

Mary James is studying for a degree at university. She obtained a T-score of 65 on Test A (university student norms). What is her approximate percentile rank score in the general population?

Look at the raw score you yourself obtained when you completed Test A at the start of this Module. How would you interpret that score in terms of the Test A norms provided in the Test Pack? (Remember that this is not a ‘real’ test -- it has been constructed for training purposes only and all the normative data have been artificially computer-generated.)
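As a rough cross-check on table look-ups of this kind, a normalized T-score can be turned back into a percentile within its own norm group. The sketch below is not part of the Test Pack: it assumes T-scores are normally distributed with a mean of 50 and a standard deviation of 10, and the function name is purely illustrative. It says nothing about the general population; that step still needs the Test A norm tables.

```python
from statistics import NormalDist


def t_score_to_percentile(t_score: float) -> float:
    """Percentile rank within the norm group for a normalized T-score.

    Assumes T-scores have mean 50 and standard deviation 10.
    """
    z = (t_score - 50) / 10          # convert the T-score to a z-score
    return 100 * NormalDist().cdf(z)  # area under the normal curve below z


# A T-score of 65 is 1.5 standard deviations above the norm-group mean.
print(round(t_score_to_percentile(65), 1))  # 93.3
```

Under these assumptions a T-score of 65 sits at about the 93rd percentile of the group whose norms were used, here university students.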



(b) Test B is a test of Verbal Reasoning. It has 60 items, and raw scores can range from zero (all wrong) to 60 (all right). Norm tables for Test B are included in your Test Pack. Using these norm tables, convert the following raw scores to percentiles, grades, sten scores and normalized T-scores. Ensure you use the indicated norm group for each one. Use the General Purpose Conversion Tables in the Test Pack to convert the percentile scores to grades and Table 4 to convert them to stens.

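Person   Raw Score   Norm Group   Percentile   Grade     T-score   Sten
1        45          Z            ........     ........  ........  ........
2        35          Z            ........     ........  ........  ........
3        35          W            ........     ........  ........  ........
4        28          X            ........     ........  ........  ........
5        24          Y            ........     ........  ........  ........
6        20          Y            ........     ........  ........  ........
7        15          Z            ........     ........  ........  ........
8        12          Z            ........     ........  ........  ........
9        10          W            ........     ........  ........  ........
10       10          X            ........     ........  ........  ........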

How would you describe the performance of the two people (#2 and #3) who obtained raw scores of 35 and the two (#9 and #10) who obtained raw scores of 10?
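The percentile-to-sten step can be sketched in code, which may help when checking table look-ups. This is an illustration only, not the Test Pack's Table 4: it uses the standard definition of the sten scale (mean 5.5, standard deviation 2, values limited to 1 to 10) and assumes normally distributed scores. The function name is hypothetical.

```python
from statistics import NormalDist


def percentile_to_sten(percentile: float) -> int:
    """Convert a percentile (strictly between 0 and 100) to a sten score.

    Stens have mean 5.5 and standard deviation 2 and run from 1 to 10.
    """
    z = NormalDist().inv_cdf(percentile / 100)  # z-score for this percentile
    sten = round(5.5 + 2 * z)                   # rescale and round to a whole sten
    return max(1, min(10, sten))                # keep within the 1-10 range


for p in (10, 40, 60, 90):
    print(p, percentile_to_sten(p))
# 10 3
# 40 5
# 60 6
# 90 8
```

The same raw score can land on different stens for different norm groups because the percentile it maps to changes with the group, which is the point of comparing persons #2 and #3, and #9 and #10.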


MODULE 2

OPEN QUESTION

Make some notes on your thoughts on each of these questions before reading further.

• Which of the six test administration functions is the most important in high-stakes testing?
• How would you manage each of these functions if a test is being administered on a computer?
• Which of these could you manage if the test is being administered remotely, online, to the test candidate?

SAQ 2.1.1

List the main advantages and disadvantages of each mode of administration for (a) low-stakes and (b) high-stakes testing.

Open Mode
............................................................
............................................................

Controlled Mode
............................................................
............................................................

Supervised Mode
............................................................
............................................................

Managed Mode
............................................................
............................................................

Define what the role of a test administrator is for each of the four modes of administration.

Open Mode
............................................................
............................................................

Controlled Mode
............................................................
............................................................

Supervised Mode
............................................................
............................................................

Managed Mode
............................................................
............................................................

OPEN QUESTION

Imagine that you are invited to take part in a testing session as a candidate. The session may be a part of your professional development or it may be a part of a job selection process. Whichever you have imagined, use a few seconds now to jot down some of the things you would like to know, in advance, about the session.

1. ............................................................
2. ............................................................
3. ............................................................
4. ............................................................
5. ............................................................



EXERCISE 2.1.1: Planning the test session

Scenario: A test session is being planned for three weeks’ time to test 12 candidates. There is a room which is large enough to accommodate 20 tables and which is quiet and can be reserved for the session. Three computers can be accommodated in the room and can be free for the test session. A full day is available in which to arrange the session. All the test materials can be ordered at short notice if they are not in stock. It has been decided that the tests are fair and suitable for the purpose for which they are to be used. What has not been decided is which of the tests to use.

Using the administration instructions from two different manuals given below, plan two separate sessions, each for 12 candidates, using the two tests described as the basis for the sessions.

TEST 1. This is a paper-and-pencil test of ability. It requires that candidates sit at individual tables placed at least three feet apart and all facing towards the administrator. The test has inbuilt practice questions which require that the administrator walks around the room and looks at the candidates’ answers to see that they are using the right boxes. The answers are read out to check that the examples are understood, and questions are allowed up to this point. The Manual recommends that no more than eight candidates are tested in any one session unless there is a second administrator present to assist. The test takes approximately 45 minutes: the administration of materials and practice session takes approximately 15 minutes and the timed test element is exactly 25 minutes long.

TEST 2. This ability test is administered by computer. Practice tests are available on the test publisher’s website (http://www.publishername.com/practicetests). These may be openly accessed by candidates to help prepare them for the test by familiarizing them with the kinds of items that the test contains. It is recommended that the candidate takes this practice test before coming to the test session. The test itself has two example items to familiarize the candidate with the question format and the use of the computer keys. The test is timed and runs for 20 minutes exactly. It is suitable for group administration.

Plan sessions for each of these tests. Assume that materials will have to be ordered from the publisher. At this stage, do not write to the candidates, but include it in your planning.
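One piece of the planning is simple arithmetic: how many sittings each test needs for 12 candidates. The sketch below works only from the limits stated in the scenario. It assumes one candidate per computer for Test 2 and that a second administrator would allow all 12 candidates to sit Test 1 together in the 20-table room; the function name is illustrative.

```python
from math import ceil


def sessions_needed(candidates: int, max_per_session: int) -> int:
    """Smallest number of sessions that seats every candidate."""
    return ceil(candidates / max_per_session)


CANDIDATES = 12

# Test 1 with a single administrator: the Manual's limit is 8 per session.
print(sessions_needed(CANDIDATES, 8))   # 2
# Test 1 with a second administrator (assumed): all 12 in one sitting.
print(sessions_needed(CANDIDATES, 12))  # 1
# Test 2: three computers, assuming one candidate per computer.
print(sessions_needed(CANDIDATES, 3))   # 4
```

The counts feed directly into the timetable for the day and into the staffing decision for Test 1.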

EXERCISE 2.1.2: Inviting the candidate to the test session

As a result of your planning, the time scale and staff availability, it is decided that only the computer-based test will be used. Write to one of the candidates inviting him or her to the first session and giving the following information:

• Time, date and venue of the session.
• Length of time the session will take.
• As much information as you can to prepare the candidate for the tests: send out practice leaflets or inform the test taker about available practice tests or direct test takers to appropriate internet testing practice sites.
• Advance notice that they will not be permitted to receive or make calls or text messages during the test session.
• A reminder to bring glasses if they need to wear them.
• Information about the confidentiality of the tests -- who will see the answers, what will happen to the data, and who will know the results (including the candidate); how feedback will be given.
• A request for any special requirements that may be needed.
• Obtain the candidate’s signed permission to provide information from the test to the people it is intended for.


EXERCISE 2.1.3: Familiarizing yourself with the Test Pack materials

Turn to the Test Pack and take out one of the Test A test booklets (Test Pack, Section 2) and the Administration Instructions (Test Pack, Section 3). Test A is an Ability test. Turn to the Administration Instructions. Provide yourself with the relevant materials from the General Checklist of Materials (Test Pack, Section 3). Then read through the Administration Instructions. When you are certain about the examples, check your pencils, start the stopwatch and begin the test. Do not worry about the timing. The stopwatch will give you an indication of how long the test takes for you to complete and should reassure you about the timing. Complete the test and leave the scoring for the moment.

Turn to the Test Pack and take out one of the P5 Personality Inventory test booklets (Test Pack, Section 2). The administration instructions are part of the test booklet. Provide yourself with the relevant materials from the General Checklist of Materials (Test Pack, Section 3). Then read through the Administration Instructions. When you are certain about the examples, check your pencils and begin the test. This is an untimed test, but you might want to set the stopwatch running when you start to see how long you take to complete the questionnaire. Complete the test and leave the scoring for the moment.

Now you can begin to plan your test session. To do this you can use the blank planning schedule below. Choose three people to act as test takers and plan the testing session for them. You may wish to test all three together or use three separate administrations; much will depend (as in any testing session) on your situation. Do remember that these tests have been written for training purposes only and that you cannot give your candidates meaningful feedback. Make this clear to the candidates both in your advance preparation and later in your preamble to the testing session.

Do not invite your candidate(s) to a testing session until you have read through this Module at least once, and you have thoroughly familiarized yourself with the materials and procedure.

Planning schedule

Use the following headings to plan your schedule (see Table 2.1.1 and Exercise 2.1.1).

About 3 weeks before:
About 2 weeks before:
About 3 days before:
The day before:
On the day:

EXERCISE 2.1.4: Introducing the session to the candidates

Make notes now for the introduction to your session. Do not make a ‘tight’ script as it will defeat the object of the introduction, which is to put candidates at ease. Remember, you need to:

• first, brief your candidate on the real purpose of the session (that it is part of your learning);
• second, provide a ‘role-play’ introduction.



EXERCISE 2.1.5: Checking the answer sheets

Take your own completed booklet for Test A and check whether you have ringed two or more answers to any one question. If you have, mark the multiple answers with a cross using a red pen. (When you score your candidates’ booklets, do the same thing.)

Take your own completed booklet for the P5 Personality Inventory and follow the instruction for scoring to check that there is a tick in one and only one of the five response columns for each statement. If there are multiple answers, mark them with a cross using a red pen. (When you score your candidates’ booklets, do the same thing.)
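If responses have been keyed into a computer, the same check can be sketched in a few lines. This is an illustration only and no part of the Test Pack procedure: it assumes each item's response is stored as a set of the options chosen, and the function and variable names are hypothetical.

```python
def flag_items(responses):
    """Split 1-based item numbers into blanks and multiple answers.

    `responses` holds one set of chosen options per item.
    """
    blank = [n for n, chosen in enumerate(responses, start=1) if len(chosen) == 0]
    multiple = [n for n, chosen in enumerate(responses, start=1) if len(chosen) > 1]
    return {"blank": blank, "multiple": multiple}


# Hypothetical P5-style sheet: response columns numbered 1 to 5.
sheet = [{3}, {1, 2}, set(), {5}]
print(flag_items(sheet))  # {'blank': [3], 'multiple': [2]}
```

Items listed under "multiple" are the ones that would receive a red cross on the paper booklet.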

EXERCISE 2.1.6: Scoring

Using the acetate key from the Test Pack, score your answers for Test A. Give no points to any item that has been marked with a red cross even if one of the answers is the correct one. Count up the total number correct and write this in the space provided on the front cover of the Test A booklet. Then add up the number of omissions and the number of wrong answers. If no option has been chosen for a question, it is an omission. If one incorrect option or if two or more options have been chosen, it is wrong. Write these totals into the other spaces provided on the front cover. Add the three totals. They should add up to 25. If they do not, you will need to check the scoring as you will have made a mistake.

Now follow the scoring instructions provided in the Test Pack for scoring your responses to the P5 Personality Inventory. Record the total scale scores on the form provided.
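The Test A scoring rules can be written out as a short sketch, which makes the "three totals must add up to 25" check explicit. The data layout (one set of chosen options per item) and the answer key shown are invented for illustration; the real key is the acetate in the Test Pack.

```python
def score_test_a(responses, key):
    """Return (correct, omitted, wrong) counts for one booklet.

    An item with no option chosen is an omission; an item with one
    incorrect option, or with two or more options, is wrong.
    """
    if len(responses) != len(key):
        raise ValueError("responses and key must cover the same items")
    correct = omitted = wrong = 0
    for chosen, answer in zip(responses, key):
        if not chosen:
            omitted += 1
        elif len(chosen) == 1 and answer in chosen:
            correct += 1
        else:
            wrong += 1
    # The three totals must account for every item (25 for Test A).
    assert correct + omitted + wrong == len(key)
    return correct, omitted, wrong


# Hypothetical 25-item key and booklet, for illustration only.
key = ["A"] * 25
booklet = [{"A"}] * 20 + [set()] * 2 + [{"B"}] * 2 + [{"A", "B"}]
print(score_test_a(booklet, key))  # (20, 2, 3)
```

The last item in the example has the right answer ringed alongside a second option, so it counts as wrong, matching the red-cross rule.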

EXERCISE 2.1.7: Converting raw scores into percentiles and standard scores

Test A

Turn to Section 4 of the Test Pack and, using the General Population Norms, Table 2, locate your Test A score in the raw score column, then read across to the percentile score column and the T-score column. The following example uses two raw scores (17 and 13):

Column 1 (T-score)   Column 2 (raw score)   Column 3 (percentile score)
63                   17                     90
53                   13                     60

(N.B. These are ‘manufactured’ data for the purposes of training only.)

Now turn to Table 1, column 4. Check the raw score of 17 against university students. You should obtain a percentile score of 36.6. Use the General Purpose Conversion Tables (Section 1 in the Test Pack) to convert the percentile to a T-score. From Table 2, we find that a percentile of 36.5 is equivalent to a T-score of 47.

Now do the same for your own score. First find the general population percentile and T-score and then find the percentile and T-score based on the university student norms. Enter the general population and university student T-scores in the space provided at the bottom of the front page of the question booklet.
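The worked figures above are consistent with a normal-curve conversion, which can be sketched as follows. This assumes the General Purpose Conversion Tables follow the usual normalized T-score definition (mean 50, standard deviation 10); the printed tables remain the authority, and the function name is illustrative.

```python
from statistics import NormalDist


def percentile_to_t_score(percentile: float) -> int:
    """Normalized T-score for a percentile strictly between 0 and 100."""
    z = NormalDist().inv_cdf(percentile / 100)  # z-score at this percentile
    return round(50 + 10 * z)                   # rescale to mean 50, SD 10


for p in (90, 60, 36.5):
    print(p, percentile_to_t_score(p))
# 90 63
# 60 53
# 36.5 47
```

These reproduce the three percentile and T-score pairs quoted in the exercise.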



P5 Personality Inventory

Now you can convert your P5 scale raw scores into sten scores. Use the P5 norm table provided to look up the sten equivalents for each of your five raw scores. Enter the values in the spaces on the form provided. Here are some examples. Check to see if you get the same answers:

Scale                Raw score   Sten
Extravert            36          7
Agreeable            15          2
Conscientious        29          5
Emotionally Stable   42          8
Open                 42          9

SAQ 2.3.1: Dealing with problems

How would you deal with the following problems during (i) a supervised paper-and-pencil testing session and (ii) a supervised computer-based testing session?

(a) A thunderstorm caused a 3-second power cut?
(b) A candidate became ill?
(c) A candidate finished five minutes before time and asked if they could leave?
(d) A candidate asked you (quietly and discreetly) for further information about a test question?
(e) A candidate asked if he or she could take home a question book to finish questions not finished in the test?

What would you do if at another time (i.e. outside the testing session):

(f) A candidate asked for the results of a computer-administered test that she or he had completed earlier in the week?
(g) The managing director asked you for a copy of a test for a friend whose offspring was undergoing tests for graduate selection into another company?
(h) A new edition of a test was purchased and the old materials became redundant?
(i) A junior member of staff photocopied answer sheets to save money?



Exercise 2.3.1: Administering Tests A and B

Do not attempt this exercise until you have read through the whole of Module 2 once and have completed Test A yourself. Prepare for your training testing session. Make sure you have fully familiarized yourself with the Test A and P5 Personality Inventory materials and have completed them both yourself before you first administer them to a candidate. Note that for training purposes you can ‘collapse’ the normal two- or three-week planning period into a few days.

– Plan it.
– Write to, or ring, or speak to, your own candidate(s) at the appropriate time in your plan. Give them all the information you have outlined.
– Review the instructions.
– Prepare your checklist and materials.
– Timetable the session.
– Prepare a quiet room.
– Carry out the session as planned.
– Ask candidates to complete the appraisal forms.
– Score the answers.
– Translate into percentiles and T scores or stens as appropriate.
– Give feedback to candidate.

Repeat this procedure with other people, until you are confident about your administration and you have dealt with any problems noted in your Candidate Evaluation Questionnaires.

EXERCISE 2.4.1: Preparing information for the test user

CASE STUDY 1

A candidate has taken the VeNuS battery of tests. He is applying for a post as an engineering apprentice in a large company and is currently awaiting the results of his GCSE examinations in which it is predicted he will do well in Maths, English and Craft, Design & Technology. The job requires that the candidate has ‘a good standard of written English and will be able to cope well with the Maths and Engineering drawing at further education level’. On the day of the tests he leaves the room three minutes before the end of the Spatial test. He returns later to collect his jacket and says he left because he was feeling ill. His scores on the tests are:

V   25
N   34
S   22
G   81

Using the VeNuS manual to help you, prepare these scores for feedback. Prepare the notes in this order:

(i) Convert the raw scores to standard scores and percentiles. Describe these in lay terms -- use the 5-point grading system (see Module 1 -- percentiles) to help you choose consistent labels for above and below average performance.
(ii) Prepare a note for the test user regarding the candidate’s behaviour during the test session.



EXERCISE 2.4.2: Preparing information for the test user

CASE STUDY 2

A group of five candidates has completed the P5 Personality Inventory. Their raw scores are shown below. Candidate B left the test session early, feeling unwell, but had already completed all the items in the inventory. Candidate D seemed very anxious and seemed to spend a lot more time over each item than the other candidates. She was only just over half-way through the inventory by the time the others had finished. Candidate E mentioned that they had done this inventory before for another job application.

Prepare a simple summary table (use a computer spreadsheet if you want) for giving to the test user, which gives the candidates’ raw scores and sten scores for each scale, and note any issues relating to behaviour in the test session.

Candidate   Extravert   Agreeable   Conscientious   Emotionally Stable   Open
A           35          33          35              21                   38
B           20          25          12              36                   13
C           27          28          27              21                   32
D           24          24          48              33                   20
E           44          45          41              18                   38
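For anyone preparing the summary in a spreadsheet, the layout can be sketched as a small script that writes comma-separated text. The raw scores and session notes come from the case study above; the sten columns are left blank because they must be looked up in the P5 norm table, which is not reproduced here. The `sten_lookup` argument is a hypothetical hook for a function that performs that look-up.

```python
import csv
import io

SCALES = ["Extravert", "Agreeable", "Conscientious", "Emotionally Stable", "Open"]
RAW_SCORES = {
    "A": [35, 33, 35, 21, 38],
    "B": [20, 25, 12, 36, 13],
    "C": [27, 28, 27, 21, 32],
    "D": [24, 24, 48, 33, 20],
    "E": [44, 45, 41, 18, 38],
}
NOTES = {
    "B": "Left early feeling unwell; had completed all items.",
    "D": "Seemed very anxious; just over half-way when the others finished.",
    "E": "Said they had done this inventory before for another application.",
}


def build_summary(sten_lookup=None) -> str:
    """Return the summary table as CSV text, one row per candidate."""
    header = ["Candidate"]
    for scale in SCALES:
        header += [f"{scale} raw", f"{scale} sten"]
    header.append("Session notes")

    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    for candidate, scores in RAW_SCORES.items():
        row = [candidate]
        for scale, raw in zip(SCALES, scores):
            # Leave the sten blank unless a look-up function is supplied.
            sten = sten_lookup(scale, raw) if sten_lookup else ""
            row += [raw, sten]
        row.append(NOTES.get(candidate, ""))
        writer.writerow(row)
    return buffer.getvalue()


print(build_summary())
```

The output can be pasted or imported into any spreadsheet, and the blank sten cells filled in by hand from the norm table.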
