FIFTH EDITION
ESSENTIALS OF EDUCATIONAL MEASUREMENT ROBERT L. EBEL DAVID A. FRISBIE
ESSENTIALS OF EDUCATIONAL MEASUREMENT ROBERT L. EBEL DAVID A. FRISBIE University of Iowa
Prentice-Hall of India Private Limited
New Delhi-110001 1991
ThL lndbn F.pl|nl4a 7f..o0 (Oli8inalU,S.Edition-Rr.I 347.m)
ESSENTIALS OF EDUCATIONAL MEASUREMENT, 5th Ed. by Robert L. Ebel and David A. Frisbie
PRENTICE-HALL INTERNATIONAL, INC., Englewood Cliffs. PRENTICE-HALL INTERNATIONAL, INC., London. PRENTICE-HALL OF AUSTRALIA, PTY. LTD., Sydney. PRENTICE-HALL CANADA, INC., Toronto. PRENTICE-HALL OF JAPAN, INC., Tokyo. PRENTICE-HALL OF SOUTHEAST ASIA (PTE.) LTD., Singapore. EDITORA PRENTICE-HALL DO BRASIL LTDA., Rio de Janeiro. PRENTICE-HALL HISPANOAMERICANA, S.A., Mexico City. © 1991 by Prentice-Hall, Inc., Englewood Cliffs, N.J., U.S.A. All rights reserved. No part of this book may be reproduced in any form, by mimeograph or any other means, without permission in writing from the publishers.
ISBN 0-87692-700-2 The export rights of this book are vested solely with the publisher.
Reprinted in India by special arrangement with Prentice-Hall, Inc., Englewood Cliffs, N.J., U.S.A.
Printed by Bhuvnesh Seth at Rajkamal Electric Press, B-35/9, G.T. Karnal Road Industrial Area, Delhi-110033 and Published by Prentice-Hall of India Private Limited, M-97, Connaught Circus, New Delhi-110001.
Contents

Preface

1 The Status of Educational Measurement
The Prevalence of Testing
Some Chronic Complaints about Testing
Some Current Issues and Developments
The Principal Task of the School
The Potential Value of Testing in Education
Summary Propositions
Questions for Study and Discussion

2 Measurement and the Instructional Process
Evaluation, Measurement, and Testing
Evaluation in the Teaching Process
Functions of Achievement Tests
Limitations of Achievement Tests
Interpreting Measurements
Summary Propositions
Questions for Study and Discussion

3 Measuring Important Achievements
The Cognitive Outcomes of Education
Using Instructional Objectives
Summary Propositions
Questions for Study and Discussion

4 Describing and Summarizing Measurement Results
Frequency Distributions
Describing Score Distributions
Score Scales Describe Performance
Correlation Coefficients
Summary Propositions
Questions for Study and Discussion

5 The Reliability of Test Scores
The Meaning of Reliability
Methods of Estimating Score Reliability
Using Reliability Information
Factors Influencing Score Reliability
Criterion-Referenced Score Reliability
Summary Propositions
Questions for Study and Discussion

6 Validity: Interpretation and Use
The Meaning of Validity
Evidence Used to Support Validity
Applying Validity Principles
Summary Propositions
Questions for Study and Discussion

7 Achievement Test Planning
Establishing the Purpose for Testing
Alternative Types of Test Tasks
Test Specifications
Item Format Selection
Number of Items
Level and Distribution of Difficulty
Summary Propositions
Questions for Study and Discussion

8 True-False Test Items
Merits of the True-False Format
Common Misconceptions about True-False Items
Writing Effective True-False Items
Multiple True-False Items
Summary Propositions
Questions for Study and Discussion

9 Multiple-Choice Test Items
The Popularity of the Multiple-choice Format
The Content Basis for Creating Multiple-choice Items
The Multiple-choice Item Stem
Preparing the Response Choices
Summary Propositions
Questions for Study and Discussion

10 Other Objective-Item Formats
Short-answer Items
Matching Items
Numerical Problems
Summary Propositions
Questions for Study and Discussion

11 Essay-Test Items
The Prevalence of Essay Testing
The Value of Essay Testing
Reliability of Essay-test Scores
Preparing Essay Items
Scoring Essay Items
Summary Propositions
Questions for Study and Discussion

12 Test Administration and Scoring
Preparing the Students
Test-preparation Considerations
Test-administration Considerations
Scoring Procedures and Issues
Computer-assisted Test Administration
Summary Propositions
Questions for Study and Discussion

13 Evaluating Test and Item Characteristics
Test Characteristics to Evaluate
Item-analysis Procedures
Selection of the Upper and Lower Groups
Index of Difficulty
Index of Discrimination
Item Selection
Item Revision
Other Criterion-referenced Procedures
Posttest Discussions
Summary Propositions
Questions for Study and Discussion

14 Nontest and Informal Evaluation Methods
Observational Techniques
Informal Inventories
Oral-questioning Techniques
Summary Propositions
Questions for Study and Discussion

15 Grading and Reporting Achievements
The Need for Grades
Some Problems of Grading
The Meaning Conveyed by Grades
Establishing a Grading System
Threats to the Validity of Grades
Grading Course Assignments
Combining Grade Components
Methods of Assigning Grades
Grading Software
Summary Propositions
Questions for Study and Discussion

16 The Nature of Standardized Tests
Characteristics of Standardized Tests
Types of Standardized Test Scores
Norms
Selection of Standardized Tests
Summary Propositions
Questions for Study and Discussion

17 Using Standardized Achievement Tests
The Status of Standardized Achievement Testing
Uses of Achievement-test Results
Interpreting Scores of Individuals
Interpreting Scores of Classes
Reporting to Students and Parents
Some Interpretation Problems
School Testing Pr

[...] idiosyncrasies tend to reduce the validity of grades (Stiggins, Frisbie, and Griswold, 1989). One outcome of this second shortcoming is that the grades tend to be unreliable. Another is that grades can be inflated: their face value is higher than their actual value.

The absence of explicit definitions for each grade permits teachers to be influenced, either consciously or unknowingly, by extraneous factors in assigning grades. Research on this point from three or more decades ago probably is characteristic of present practice (Carter, 1952; Hadley, 1954; Palmer, 1962). Some teachers deliberately use high grades as rewards and low grades as punishments for behavior unrelated to the attainment of instructional objectives.

The studies of Starch and Elliott (1912, 1913a, 1913b) on the unreliability of teachers' grades on examination papers are classic demonstrations of the instability of judgments based on presumably absolute standards. Identical copies of an English test paper were given to 142 English teachers, with instructions to score it on the basis of 100 percent for a perfect paper. Since each teacher looked at only one paper, no relative basis for judgment was available. The scores assigned to the same paper ranged all the way from 98 to 50 percent. Similar results were obtained with test papers in geometry and in history.

Typically, grades such as those Starch and Elliott collected for single examination papers are not highly reliable. For semester grades, however, reliability estimates in the range of 0.70 to 0.80 should be common. Semester grades are based on much more extensive and comprehensive observations of student attainments, perhaps as many as 80 hours of observation. Even so, one hour of intensive "observation" under the controlled conditions of a well-standardized achievement test can yield measures with reliability estimates in excess of 0.90. If the tools of performance assessment are not well designed, their collective worth over a semester may be exceeded by a reliable and valid commercially prepared instrument that takes no time for the teacher to prepare and a small fraction of class time to administer. Our purpose here is not to argue for replacing teacher-made evaluation tools with standardized measures, but to dramatize the unfortunate state of affairs in which some teachers find themselves at grade assignment time. We are not facing utter chaos, but considerable room for improvement exists.
268
GRADING AND REPORTING ACHIEVEMENTS
THE MEANING CONVEYED BY GRADES
A grading system is primarily a method of communicating measurements of achievement. It involves the use of [...] the degree that the grading symbols have the same [...] is it possible for grades to serve the purposes of communication meaningfully and precisely. The meaning of a grade should depend as little as possible on the instructor who issued it or the course to which it pertains. This means that the grading [...] an instructor, of a department, or indeed of an entire institution are matters of legitimate [...] departments, and other institutions. [...]

A particular grade carries [...] the comparison of [...] absolute standard or a relative standard [...] group. Second, a grade represents either [...] amount of effort expended [...]. Finally, a grade represents either the [...] instruction or the amount of learning [...]. The remainder of this section is a discussion [...] between the alternative meanings [...] to the overall meaning of a grade.

Absolute and Relative Standards
A grade represents a teacher's [...] has performed a set of tasks in one [...] units. These judgments of goodness [...] of comparison. Performance that is [...]lent, or inferior obtains its quali[...] performance in question with a [...] absolute or relative. [...] systems used in the United States since [...] either absolute or relative grading standards.

A definite percent of "perfection," usually [...] was regarded as the minimum passing [...] students' performances [...] knowledge, skills, and understanding that were
[...] grade presumably is assigned independently of the grades of other students in the course, percent grading is characterized as absolute grading.

The other major type of grading system is based on the use of a small number of letter grades, often five, to express various levels of achievement. In the five-letter A-B-C-D-F system, truly outstanding performance is assigned a grade of A. A grade of B indicates above-average achievement. C is the average grade; D indicates below-average achievement, and F is used to report failure, achievement insufficient to warrant credit for completing a course. The relative standard to which the student's performance is referenced is the distribution of performance of other students in the class. Thus, letter grading is characterized as relative grading.

Of course, the letters in the grading system can be defined in absolute terms instead of relative terms. A grade of D may indicate achievement of the minimum essential knowledge and understandings; C may represent adequate rather than average achievement; B may indicate a level of advanced achievement with respect to course content; and A may be used to represent exceptional or meritorious achievement. Such general definitions would require further elaboration to communicate the absolute levels of achievement in any particular course or grade, or specific subject matter. The point is, it is not the use of letter symbols that distinguishes absolute and relative grading; it is the nature of the standard against which performance is compared that differentiates the two.

The decision to use either an absolute or a relative grading standard is [...] the most important [...] a teacher must make with regard to [...] assessment. When the absolute standard is chosen, all [...] and tools of [...] must be designed to yield criterion-referenced interpretations. A [...] standard for grading must be established for each component that is to contribute to the course grade: tests, papers, quizzes, presentations, projects, and other assignments. If the decision is to use a relative standard, all grading components must be designed to provide norm-referenced interpretations. Of course, in both [...] decisions need to be made [...] available. The basis for determining the cutoff points [...] referenced in one case and norm-referenced in the other.

Though the majority of institutions now use letter grading with relative standards, percent grading is by no means obsolete. Some institutions still convert to letter grades from percent [...] still prefer to define passing scores [...] in some cases, the [...] grading methodology. Some instructors [...] over relative grading for philosophical [...] standards [...] or, in some cases, [...].

Achievement and Effort
After the decision has been made about the use of absolute or relative standards, the instructor must decide which performance [...] forms of achievement will be included in the grade. Undoubtedly [...] base some of the grades they issue on factors other than the degree of achievement of instructional objectives [...] likely will continue to do so because [...] control in the class and because [...] teaching. But the use of grades [...] leads to distorted meaning of the [...] that social behavior [...] of their school program.
We have argued that grades [...] and reward student learning. [...] demonstrate greater desire to learn [...] some form of recognition and reward [...]

Status and Growth
Some instructors believe that the amount of improvement students [...] achievement they demonstrate at the [...] on other preliminary observations. [...] initial status. The differences between [...]

[...] subtracting these scores from [...] of errors rather than a cancellation [...] more error laden than either of the [...] may consist mainly of errors of measurement [...] provide reliable scores, instructors may [...] and posttest mean [...]. But few classroom achievement tests [...]
suremcn,s orsho.rern, sil;i;
:fi t:::,i.:J,.J,-il:"ffilii","..1,f:1",*,
In addition to the reliability concern, there are other problems with growth measures. One is that, for most educational purposes, knowledge that a student's achievement is good, average, [...] is more useful than knowledge that [...] than others during a grading period. [...] on the pretest have a considerably [...] gains in achievement than their [...]. Students are quick to learn that under circumstances of grading on the basis of growth, their pretest scores should be as low as possible to permit the greatest possible observable gain.

It is true that status grading seems to condemn some students to low grades in most subjects, semester after semester. Low grades discourage effort, which in turn increases the probability of more low grades. So the vicious cycle continues, bringing dislike of learning and, possibly, early withdrawal from school. If students are taught to dislike school by constant reminders of their low achievement, the remedy probably is not to try to persuade them that their rate of growth toward achievement is more important than status achieved, for that may be an apparent falsehood. The remedy is probably to provide varied opportunities to excel in several kinds of worthwhile activities. The planning and implementation of such [...] certainly would require an alert, versatile, and dedicated teacher. When it is accomplished, though, grading on the basis of status achieved will no longer mean that some students must always win while others must always lose. Instead, some students will be able to enjoy some of the rewards of excellence in their own specialties. Cohen (1983), for example, has described alternative procedures for grading the achievement of exceptional students who have been "mainstreamed."
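The reliability concern about growth scores can be made concrete with a small calculation. The sketch below uses the classical test theory formula for the reliability of a difference score. That formula is not stated in this chapter; it is brought in here as an assumption, together with the simplifying condition that pretest and posttest scores have equal variances. The function name and the example values are hypothetical.

```python
def difference_score_reliability(r_pre: float, r_post: float, r_pre_post: float) -> float:
    """Reliability of (posttest - pretest) gain scores.

    Assumes the classical test theory result for two measures with
    equal variances:
        r_diff = ((r_pre + r_post) / 2 - r_pre_post) / (1 - r_pre_post)
    """
    if not -1.0 < r_pre_post < 1.0:
        raise ValueError("pretest-posttest correlation must be strictly between -1 and 1")
    return ((r_pre + r_post) / 2 - r_pre_post) / (1 - r_pre_post)


# Hypothetical classroom tests: each has reliability 0.80, and the
# pretest and posttest scores correlate 0.70.
gain_reliability = difference_score_reliability(0.80, 0.80, 0.70)
print(round(gain_reliability, 2))  # prints 0.33
```

Under these assumed values, two tests that are each fairly reliable yield gain scores with a reliability of only about one third, which is consistent with the point that difference scores are more error laden than either score alone.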
ESTABLISHING A GRADING SYSTEM
The Grade Scale

[...] Many instructors seemed to agree with this view. Nonetheless, from time to time there has been increased or renewed interest in refining the grading scale by adding plus and minus signs to the basic letters, or decimal fractions to the basic numbers (for example, 4.0, 3.5, 3.0, 2.5).

The notion that grading problems can be simplified and grading errors reduced by using fewer categories is an attractive one. Its weakness can be ex-
In^.I'""":j"y:,-ll::der g',ai"g
categories in grading does indeed reduce the
, ;.";;;:':.i",',:;:i;:";:l':; .rh,i r,.i a.,.,,.i.,,,i.i;;;; ,.,j;:,:;:;.''j.i^:"'-,,f..,"':';1,;',;J';;;j':;? ".;;f:j:'j,T.jilllll: :1i1":::[t":',1::; ::,fli::'j:,:;:tl-:1:1.:::r r' ll;::::..f:,:::;::l
., u.-.."ri,";i:|FJi:lli';';T: l:i:;:: ;X:,;'::;'J; :::. :::"t:".1*,." ;;';;;;;;;'i,ii'i:'iill"i;i"'i;';". fJl::l".1':,,::':.1 ::_p.,.;",. ;;:;':' ;:l:.IJi1; *,;:".' ;:l':;I.flfitl;t ::iq.:::i:': 1.."!,ri;ir r,,.f ,".,8.;i;,;;;:.ji,i i:,.:?::[l ?:lj]:::::j."r.a..,""ra F:l::J',;: ,,;l:;is';, ;;;;;";.,8":':";111,.".,;11:li; ;:i"..":::,:.:t:::.1,.:i::_jlir .''
;'lilii.: ;#;:.'" Y,;1,,i': k;';:l g;,;:::-*;1.1';"_, 1r1 ".]",; ;;"; ;;.;:';;:;i.
f;) ::;:::Jlt J;;:li::J,::t:5:i.::::::tr:;"i , ,,.go, i l:l .,;::"::1ot:lll,l'. '.au,i'g''r," r on ( o n v € )e d b ) rh r g ra d e .
",;r.. "
slli:r';;
i es redutes the i ntor ni
Letters versus Numbers
pdcent.sradins
wasa,dedbv rhe subsriru []l:.'H::'j,:j.:::ff:rn$ .,_#,, r.i.,..iii.;;;i;;,:t:;." j;t;:l:ji:: i:.:T"):.,;, # fli_*.-y
l,:::llil:;ltJ::;;:.,a:1x.:1.::r,r::..:r;':;;;;;J;::;i;:, ff:.1::j;.'i: ffi:i:#::-,'jilff:, :L:;::t-:rf--;,6.;;:;;:;'#,::ll'
ir,.,.t.,J".;; :t J;..,;Jl:;iltii: Il;ii;:11]:1j"ha,reueh t::?:1, impr), 4-al,,;,,, ., r,i.. -r.r"* ;;;";i 1i,",',illllili;"'"llJi:li: F o r b o rh rh e s ere a s o n srt", "",_, .,. tU n ,"
i:.ifl"it;-ff::Jlj:ilt:::x;.-",ii:T;,:i4;;'j;:l;:iii:r':,11 r.'Ji"'i'",'','ii ;,;.;'li:]"'x lfili"? [::l,i::"li: :]t:'-y:l:,',:lr ":iJ;;':x: l:l,?'lil ruili";li:* [:ij"j;ii:l*,::::r:;'lilr j1i.fl"",',iij.ll"l,T;i ,;.J'",,"r,"ii," "i ;,ltti",l)'J more f;#:,ll :':::":: subrle rta s e o u \th i q ,.l !_ -,,^ ,,ri a trvmbot\i ng,adi " " -" " ngboutd
".r.r.
changes
Single or Multiple Grades
[...] are more explicit in communicating what students can do than are letter [...]
[...] and that sufficient evidence [...] for determining the separate grades [...] grades reported at one time, the [...] influenced by considerable halo effect. [...] student may influence each of the [...] to the separate aspects of achievement [...] shortcomings of the single-symbol [...].

There is an eclectic grading [...] that has promise for satisfying [...]. If minimum competency [...] referenced measures to make [...] passed. (One version of this method [...]) Only a single grade is assigned to each [...] grade (A, B, C, or D) are referenced to [...] have been identified as minimally [...] the relative standings of students in [...] regards as "beyond basics" or important to success in studying more advanced [...] the subject matter. The eclectic system [...]

1. [...] likely to assign a barely passing grade (D) to students who have not mastered [...] basic skills than they might be [...] grading system.
2. Students who fail at first can be retested to improve their grade after they improve their skills. They are not relegated to [...] on one occasion they demonstrated less learning than [...].
3. Students who excel are rewarded according to their level of achievement [...] are incentives to [...] beyond the minimum passing [...].
4. The system represents a [...]

THREATS TO THE VALIDITY OF GRADES
A distinction should be made between [...] aspects of performance that a teacher evaluates and the subset of those that [...] components that [...] grades [...] student's competence with respect to the instructional objectives. The components of a grade should be academically oriented; grades should not be tools of discipline or rewards for pleasant personalities or good attitudes. A student who is assigned an A grade should have a firm grasp of the skills and knowledge taught. If the student is merely marginal academically but very industrious and congenial, an A grade would be misleading and would render a blow to the motivation of the excellent students in class. Instructors can and should give feedback to students on [...] traits and characteristics, but only performance based on academic achievement should be used to determine grades. In their recommendations regarding [...], the National Commission on Excellence in Education (1983) stated that "grades should be indicators of academic achievement so they can be relied on as evidence of a student's readiness for further study." Grades contaminated by other factors give students a false sense of readiness and provide misinformation to those who seek to guide students in their future educational endeavors.

Several aspects of student performance have been labeled as potentially invalid grading components because they represent behaviors that do not reflect directly the attainment of the important objectives of instruction (Frisbie, 1977). Though some exceptions could be noted, these variables generally should not be used in determining course grades.

Neatness in written work, correctness in spelling and grammatical usage, and organizational ability are all worthy traits and are assets in most vocational endeavors.
To this extent, it seems appropriate that teachers evaluate these aspects of performance and provide students with constructive comments about them. However, unless the course objectives include instruction in these skills, students should not be graded on them in the course. For example, students' essay examination scores should not be influenced directly by their spelling ability, and neither should their course grades. Students whose skills in written expression are weak can and do learn the important knowledge of science, social studies, literature, and other academic subjects. Their writing skills can and should be evaluated in such courses, but their course grades should not suffer directly because of their writing deficiencies. To the extent that they do, these grades are misleading to both students and parents and serve to moderate rather than stimulate interest in the subject area.

Most instructors are attracted to students who are agreeable, friendly, industrious, and kind. They try to ignore or may even reject those who display opposite characteristics. When it appears that certain personalities may interfere with classwork or have limited chances for employment in their field of interest, constructive feedback from the instructor may be necessary. But an argumentative or misbehaving student who receives a C grade should have only a moderate amount of knowledge about the course content. The C should not reflect the student's disposition or disruptive behavior directly (Bartlett, 198?).
Most small classes and college seminars depend on student participation to some degree for their success. When participation is an important ingredient in learning, participation grades may be appropriate. In such cases the instructor should ensure that all students have sufficient opportunity to participate and should maintain systematic notes regarding frequency and quality of participation. (See Chapter 14 for sample recording forms.) Waiting until the end of the grading period and relying strictly on memory causes a relatively subjective task to be even more subjective and unreliable. Participation probably should not be graded in most classes, however. Dominating [...] students tend to win, and introverted or shy students tend to lose. Instructors may want to provide evaluative information to students about various aspects of their personalities, including willingness to participate, but grading should not be the means of doing so.

Students at all levels should be encouraged to attend classes because the lectures, demonstrations, and discussions presumably have been designed to facilitate their learning. If students miss several classes, then their performance on papers, [...] and projects likely will suffer. If the instructor reduces their grade because of absence, such students are submitted to a form of double jeopardy. For example, a college instructor may say that class attendance counts 10 percent of the course grade, but for students who miss several classes, the effect [...] amount to 20 percent. Teachers who experience [...] in their classes probably need to examine their classroom environment and instructional procedures to determine if changes are needed. There ought to be more productive means of encouraging students to attend classes than to threaten to lower their grade.

Some instructors are more generous in their grading than they ought to be because [...] lower grades might hurt their students [...]. However, [...] this [...] is not defensible. The [...] that any negative reaction is [...] and (2) that evaluating a performance [...] a person as a person. Not every [...] and diligent ones, is good. [...]

Judgments about writing and speaking skills, personality traits, effort, and [...] teachers constantly as they interact with their students. [...] judgments made about [...] promise is no easy task. But accurate and meaningful grades depend on [...]
GRADING COURSE ASSIGNMENTS
[...] and seventh days he missed none. He missed only one out of twenty on the test given the eighth day. Which grade best describes Freddy's level of achievement? Though he may not have caught on as rapidly as some of his peers, Freddy appears to be able to identify prepositions. Some form of grading might be used to motivate and direct Freddy and his classmates, but all such grades need not enter into determining the final course or term grade.

Perhaps the most frequent shortcoming associated with grading assignments such as papers, reports, presentations, and projects is the failure of the teacher to specify and describe in advance what the important aspects of the final product should be like. The lack of "feed forward," as Sadler (1983) has labeled it, produces two undesirable outcomes: (1) some students present incomplete assignments because they misunderstood the teacher's intent, and (2) grading becomes a chore for the teacher because the criteria that distinguish better assignments from poorer ones have not been explicated. The grading guide that seems so logical to prepare for scoring essay items is equally beneficial to the teacher for grading assignments. It can help to accomplish these things:

1. When presented to the students at the time the assignment is made, potential misunderstandings about what to do can be overcome. The nature of the final product can be described completely, and the relative importance of various aspects of it can be presented. Often an example of an A assignment from a previous class is a helpful model.
2. Opportunities for extraneous factors to influence grading are reduced because the relevant elements have been defined. Grading variables and evaluation variables can be separated so that a conscious effort can be made by the teacher to make comments about the nongraded aspects of the work.
3. Grading can be done efficiently because little time is needed to decide which parts of the assignment to weight most heavily. Less time is needed to judge completeness as well.
4. Feedback to students can be somewhat diagnostic because missing segments and student misconceptions are more readily identified.

We discussed in Chapter 12 the importance of preparing students for examinations so that they know what to expect and can prepare themselves further. A grading guide, like the checklist shown in Figure 14-2, can serve this same useful function for assignments, and it also can contribute to more valid and reliable measures of achievement if used wisely by the grader.
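One way to picture such a grading guide is as a small data structure that keeps graded elements apart from comment-only aspects and records the relative importance of each graded element. The sketch below is only an illustration: the class name, field names, element names, and point values are all hypothetical and do not come from the text or from Figure 14-2.

```python
from dataclasses import dataclass, field


@dataclass
class GradingGuide:
    """A hypothetical grading guide for one assignment."""
    # Graded elements mapped to their maximum points (relative importance).
    graded: dict[str, int]
    # Aspects that receive written comments only and never affect the grade.
    comment_only: list[str] = field(default_factory=list)

    def score(self, earned: dict[str, int]) -> float:
        """Return the percent of available points earned on graded elements."""
        unknown = set(earned) - set(self.graded)
        if unknown:
            raise ValueError(f"not a graded element: {sorted(unknown)}")
        for name, points in earned.items():
            if not 0 <= points <= self.graded[name]:
                raise ValueError(f"points for {name!r} out of range")
        # Elements absent from `earned` contribute zero points.
        return 100 * sum(earned.values()) / sum(self.graded.values())

    def missing(self, earned: dict[str, int]) -> list[str]:
        """List graded elements the student did not address at all."""
        return [name for name in self.graded if name not in earned]


guide = GradingGuide(
    graded={"thesis": 10, "evidence": 20, "conclusions": 10},
    comment_only=["spelling", "neatness"],
)
earned = {"thesis": 8, "evidence": 15}
print(guide.score(earned))    # prints 57.5
print(guide.missing(earned))  # prints ['conclusions']
```

Listing spelling and neatness as comment-only mirrors the earlier recommendation that such traits be evaluated and commented on without entering the grade, and the `missing` list supports the somewhat diagnostic feedback described in item 4.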
COMBINING GRADE COMPONENTS
When teachers determine a course grade by combining grades or scores from tests, papers, demonstrations, and projects, each component may carry more or less weight than the others in determining the final grade. To obtain grades of maximum validity, teachers must give each component the proper weight, not too much and not too little. How can they determine what those weights ought to be and what they actually turn out to be? And if these two sets of figures are disparate, what can instructors do? It is not easy to give a firm, precise answer to the question of how much influence each component ought to have in determining the composite grade. [...] grading principles [...]. In general, the use of several [...] is better than use of only one, provided [...] instructional objectives and provided [...] reasonable degree [...]. Other [...] the most reliable scores should [...]
,,1i.,,,,,,, ""un, ,hd,r ,.m Pon.,,,ur r
:Tii:i'il: ' ;l .1,;l"l:;;'i,l#1'-i;
enr quire diflicutr ro assess.As a firsr aD. renr rhc \rdn.l dro deri . ron ot i r, \or;s rqr,e.. \Jri rbtc J. anurher, rhc Ii r,r rer of rhe se.ond rn rheir roral.
on a secoDd, and lowesr score on rhe rh th€ raDks of rheir roral scores on rhc rhree rcsts are rhe same as rheir ranks oD rh i rd s e rri o n o l rh e ra bte gr\evhc mdxi mum pu,si bte \ ures oraj _, . .n, s.r ith po, rh.e m e a n s (o re !. a n d rh e s ,Jnd:,d d,v,al i " " , ,r,i * " * , ,r," i i ,* " r fhas otal pornN ..fesr the hi" "qhest m€an r vanabihty
Table 15-1. Weighted Test Scores
illustrated in the last section of the table. Scores on test X are multiplied by 4, to change their standard deviation from 2.5 to 10, the same as on test Z. Scores on test Y are multiplied by 2, to change their standard deviation to 10 also. With equal standard deviations the tests carry equal weight, and the students, having the same average rank on the tests, have the same total scores.

When the whole possible range of scores is used, score variability is closely related to the extent of the available score scale. This means that scores on a 40-item objective test are likely to carry about four times the weight of scores on a 10-point essay test question, provided that scores extend across the whole range in both cases. But if only a small part of the possible scale of scores is actually used, the length of that scale can be a very misleading guide to the variability of the scores.
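The relation between a multiplier and the standard deviation can be checked numerically. The sketch below uses made-up scores (they are not the values of Table 15-1) chosen so that the standard deviation is 2.5, as it is for test X; the helper name `rescale` and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import pstdev

def rescale(scores, factor):
    """Multiply every score by a constant; the standard deviation scales by the same constant."""
    return [s * factor for s in scores]

# Hypothetical scores with a population standard deviation of exactly 2.5
test_x = [17.5, 22.5, 17.5, 22.5]

# Multiplying by 4 is the weighting step described for test X
weighted_x = rescale(test_x, 4)

sd_before = pstdev(test_x)      # standard deviation of the original scores
sd_after = pstdev(weighted_x)   # standard deviation after weighting
```

Because the multiplier changes only the spread, the rank order of students on the test is untouched; what changes is how strongly that test pulls on the rank order of the totals.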
[...] 20%, 20%, 10%, 30%, 20% [...] the T-scores of each component can be multiplied by 2, 2, 1, 3, and 2, respectively, to achieve the desired weighting. (Oosterhof (1987) has described the weighting procedures for both criterion-referenced and norm-referenced grading situations.)
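The T-score weighting just described can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the raw scores are invented, the function names `t_scores` and `composite` are hypothetical, and the population standard deviation is assumed when standardizing. Only the multipliers 2, 2, 1, 3, and 2 come from the text.

```python
from statistics import mean, pstdev

def t_scores(raw):
    """Convert raw scores to T-scores (mean 50, standard deviation 10)."""
    m, sd = mean(raw), pstdev(raw)
    return [50 + 10 * (x - m) / sd for x in raw]

def composite(components, multipliers):
    """Sum each student's T-scores across components, each multiplied by its weight."""
    standardized = [t_scores(c) for c in components]
    n_students = len(components[0])
    return [
        sum(w * comp[i] for w, comp in zip(multipliers, standardized))
        for i in range(n_students)
    ]

# Hypothetical raw scores for three students on five grading components
components = [
    [40, 50, 60],
    [12, 15, 18],
    [7, 8, 9],
    [70, 80, 90],
    [20, 25, 30],
]
multipliers = [2, 2, 1, 3, 2]  # from the text: desired weights of 20%, 20%, 10%, 30%, 20%

totals = composite(components, multipliers)
```

Standardizing first gives every component the same standard deviation, so the multipliers alone determine the relative weights, whatever the original score scales were.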
One final admonition regarding relative grading and combining scores: It is a mistake to convert test scores to letter grades, record these in a grade book, and then reconvert the letter grades to numbers (A = 4, B = 3) for purposes of computing final averages. A better procedure is to record the test scores and other numerical measures directly. These can then be added, with whatever weighting has been adopted, to obtain a composite score that can be converted to a final grade. When letter grades are reconverted, each grade, regardless of the score on which it was based, is given the same value in the reconversion process (for example, B = 3.0). Some of the reliability the teacher struggled to achieve in developing each measure is lost in the process. For this reason it is desirable to record raw scores or standard scores rather than letters or their numerical equivalents.
METHODS OF ASSIGNING GRADES
The procedures a teacher follows for assigning term grades are dictated largely by the meaning the teacher has chosen to attribute to the symbols. The multitude of methods used in practice generally can be categorized in terms of their dependence on either absolute or relative standards (Frisbie, 1978). The popular variations of these two types and their corresponding strengths and weaknesses are described in this section.

Relative Grading Methods

One popular variety of relative grading is called grading on the curve. The curve referred to usually is the normal distribution curve or some symmetric variant of it. The norm-referenced basis for this type of grading is complicated
[...] are merely among the top group, including those who may have scored 20 points lower. The bottom 5 percent may each be assigned an F, even though the bottom 15 percent may be indistinguishable in achievement. Regardless of the quota-setting strategy used, this relative grading method seldom carries a defensible [...]

The distribution gap method, another relative grading variation, is based on the relative ranking of students in the form of a frequency distribution of the composite scores. The frequency distribution is examined carefully for gaps, that is, several consecutive scores that no students obtained. A horizontal line is drawn at the top of the first gap ("Here are the A's!") and a second gap is sought. The
process continues until all possible grade ranges (A-F) have been identified. The major fallacy with this technique is the dependence on chance to form the gaps. The size and location of gaps may depend as much on random measurement error as on actual achievement differences between students. If the scores from an equivalent set of measures could be obtained from the same group of students, the gaps might appear in different locations or the larger gaps might turn out to be somewhat small. Errors of measurement from different measures do not necessarily cancel each other out as they are expected to do in repeated measurement with the same instrument.

The major attraction of the distribution gap method is that when grades are assigned, few students appear to be right on the borderline of receiving a higher grade. Consequently, teachers receive fewer student complaints and fewer requests to reexamine test papers to search for "that extra point" that would, for example, change a grade from B to A. When [...] scores are highly variable, this grading method is likely to yield grades that are similar to those assigned by some other relative grading methods. However, when scores are relatively homogeneous, the gap distribution method actually may be as inequitable to some students as it is to [...]

[...] relative grading procedure might be labeled the standard deviation method. [...] standard deviation in determining the grade cutoff points [...] of the composite scores. When the median and standard deviation of the composite scores are computed,
the cutoff points for the range of C's [...] performance are determined by adding one-half of the standard deviation to the median and subtracting one-half of the standard deviation from the median. Then add one standard deviation to the upper cutoff of the C's to find the A-B cutoff score. Subtract the same amount from the lower cutoff of the C's to find the D-F cutoff. Review borderline cases by using [...] quality of assignments or some other relevant achievement data to decide if any borderline grades should be raised or lowered. [...] composite scores, also. A variation of this method that describes the use of relative grading on an institutional basis has been illustrated in considerable detail by Ebel (197?).

Absolute Grading Methods

Various methods that depend on percent scores as their basis have a long grading history, but their popularity has diminished [...] Percent scores from tests, papers, and other projects are interpreted as the percent of content, skills, or knowledge over which students have command, a domain-referenced interpretation. For example, a test score of 83 percent means that the student knows 83 percent of the content represented by the instructional objectives from which the test items were prepared and sampled. Percent scores [...] the scores with performance standards [...] to percent scores using arbitrary standards [...] curve. That is, students with scores in the [...] to 92 is a B, 78 to 84 is a C, and so on. The
restriction here is on the score ranges rather than on the number of students eligible to receive each of the possible grades. But what rationale should be used to determine each grade category cutoff score? Why should the cutoff for an A be 93 rather than 94 or 90?

A major limitation of percent grading as used by some teachers is the use of fixed cutoff points that are applied to each grading component in the course. It seems indefensible to set grade cutoffs that remain constant throughout the course and over several consecutive offerings of the course. What does seem defensible is for the instructor to establish cutoffs for each grading component, independent of the others, depending on the content of each component. For example, the range for an A might be 93 to 100 for the first test, 88 to 100 for a term paper, 87 to 100 for the second test, and 90 to 100 for the final exam.

Those who use percent grading find themselves in a bind when the highest score obtained on a test was only 68 percent, for example. Was the test much too difficult or did students prepare too little? Was instruction relatively ineffective? Some instructors proceed to adjust the scores by replacing the perfect score, 100 percent, with the highest score, 68 percent in this case. For example, if the highest score was 34 out of 50 points, each student's percent score would be recomputed using 34 as the maximum rather than 50. Though such an adjustment may cause all concerned to breathe easier, the new score can no longer be interpreted as originally intended: the proportion of the content domain the student knows, as sampled by the test. A new domain has been established. What useful interpretation can be made of the new scores? How can the new domain [...]

A final shortcoming of percent grading should be noted. The range of percent scores usually is limited to 70 to 100 because the passing score generally is 70 percent.
The test constructor must exhibit great skill to prepare items that will yield scores distributed in this narrow range and that, at the same time, will measure relevant learning as reflected by the instructional objectives. Methods that allow for a lower passing score would permit a greater potential range of scores, likely would yield more reliable scores, and likely would result in more reliable grade assignments, assuming the full range of grades (A to F) is to be [...]

A second method of absolute grading, called here the content-based method, depends heavily on the judgments of the teacher in deciding the type and amount of knowledge students must display to earn each grade on the A to F scale. It is the method most compatible with mastery or quasi-mastery teaching and learning strategies, but it need not be limited to pass-fail or satisfactory-unsatisfactory grading scales. The procedural steps for establishing performance standards and cutoff scores are outlined below for a 50-item test built to measure achievement in two units of instruction.

1. First, the grade to be assigned to those who demonstrate minimum passing achievement must be established. We will use D for illustration purposes, but it could be C, as is common in graduate-level courses. The teacher must develop a description, preferably in writing, of the type of knowledge and understanding a student who barely passes should possess. Similar descriptions must be developed to describe C, B, and A performances.
2. With the descriptions [...] and decides if a student [...] only [...] answer it correctly. If so, a D is recorded [...] more than a single point, [...] the minimum number of points [...] category.
3. This process continues [...] the estimated cutoff score for D [...] symbols preceding the items. As [...] the number of C symbols is tallied [...] performance. This process continues [...] grade has been determined. The resulting [...]

A = 48-50
B = 40-47
C = 29-39
D = 17-28
F = 0-16

[...] can be obtained by adjusting the estimated [...] depending on test length. The adjustment [...] measures are less than [...]
The content-based method [...] must exercise subjectivity in determining [...] example, must display. Instructors in [...] instructors are willing and able to define [...] able to supply a defensible rationale for [...] approach has been described by [...]

A final method relates to the use [...] over 100 reports describing contract grading and concluded that "contract grading appears to have a permanent place among the most appropriate current methods of assigning grades to students." However, studies [...] contracting generally showed that students like it, teachers assigned more high grades than when conventional methods were used, and student achievement was no higher than with conventional grading. Contracting appears [...] suited [...] small classes [...] independent study [...] given the flexibility to pursue individual interests. In such cases a written agreement should be mandatory so that no misunderstanding [...] in regard to what must be accomplished, [...] by what deadline [...]
GRADING SOFTWARE

The time-consuming tasks associated with recording test scores in gradebooks and combining scores for final grades can be handled readily by a microcomputer and any of the numerous software packages available for grading. Some teachers use a spreadsheet program and design their own grading application programs; others find the unique features of many of the commercial packages worth their relatively low cost.

Because software and hardware both change more rapidly than most other textbook content, we have chosen not to describe or evaluate specific grading software. However, current information can be located using such references as [...] In addition, software reviews and lists of new releases are printed frequently in such journals as Electronic Learning, inCider, Classroom Computer Learning, and The Computing Teacher. Here are some questions to raise when assessing the utility of a gradebook program:

1. How many students and grading components per student can be accommodated on a single data diskette?
2. For the elementary school level, can the system handle multiple classes for a single group of students?
3. How convenient is it to change grades or to replace scores?
4. Can test scores be imported as a data file so they do not need to be key entered one by one?
5. Does the variety of reporting-printing options satisfy basic needs?
6. Can the data be stored on a data diskette separate from the program diskette so that backup copies can be made of the data files?
7. Are there any unusual hardware requirements regarding memory, drives, or [...]
8. Can the program be returned for full refund after a reasonable trial period?

Of course, the main question to ask about any grading software is, "Will the program allow me to use the grading procedures and philosophy I have adopted?" For example, software that will not accommodate a teacher's criterion-referenced grading practices should not be considered for adoption, no matter how friendly the package seems to be.
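For a teacher who designs a grading application rather than buying one, the standard deviation method described earlier in the chapter is simple to script. The following is only a sketch under stated assumptions: the function names are hypothetical, the composite scores are invented, the population standard deviation is used, and a score exactly on a cutoff is given the higher grade. None of these choices is specified in the text.

```python
from statistics import median, pstdev

def sd_method_cutoffs(composites):
    """Lowest composite score for each grade under the standard deviation method.

    The C range is the median plus or minus one-half standard deviation;
    the A-B and D-F cutoffs lie one further standard deviation out.
    """
    med = median(composites)
    sd = pstdev(composites)  # assumption: population standard deviation
    c_low, c_high = med - 0.5 * sd, med + 0.5 * sd
    return {"A": c_high + sd, "B": c_high, "C": c_low, "D": c_low - sd}

def assign_grade(score, cutoffs):
    """Return the highest grade whose cutoff the score reaches (assumed tie rule)."""
    for grade in ("A", "B", "C", "D"):
        if score >= cutoffs[grade]:
            return grade
    return "F"

# Hypothetical composite scores for a small class
composites = [60, 70, 80, 90, 100]
cutoffs = sd_method_cutoffs(composites)
grades = [assign_grade(s, cutoffs) for s in composites]
```

Borderline cases would still need the teacher's review, as the method requires; a script of this kind only proposes the initial cutoffs.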
SUMMARY PROPOSITIONS
[...]
4. Grading is frequently the subject of educational controversy because the grading process is [...]
[...] to represent summative [...]
16. The weight carried by each component measure [...] components is too small.
17. The selection of either absolute or relative standards as a basis for grading will be influenced more by philosophical considerations than by [...]
[...]
21. The use of contract grading may be advantageous in individual instructional situations but not for grading classes of students.
22. [...] associated with grading.
QUESTIONS FOR STUDY AND DISCUSSION

1. For what uses are high school course grades most valid?
2. Under what circumstances could it be appropriate [...] grades issued by teachers [...]
3. What is grade inflation and what kind of evidence is needed to show that it has or has not [...]
4. When letter grades are used on report cards at the middle school level, what information should be furnished to communicate the meaning of each grade symbol?
5. What are some effective means of rewarding students for their superb effort to learn, particularly in the face of relatively low achievement?
6. Why is the use of plus and minus letter grading likely to yield more valid grades than simple [...]
7. What advantages does letter grading have over numerical grading?
8. What shortcomings, if any, are inherent in the eclectic grading system described in this [...]
9. What incentives, other than grades, can teachers use to motivate students to participate in class activities and to complete homework or practice exercises?
10. What are the disadvantages of the "feed forward" concept that is recommended for use in grading assignments? How could each disadvantage you identified be overcome?
11. If the scores from three 90-point tests are added together to form a composite for grading, why would each test not necessarily have equal influence (weight) in determining the rank order of individuals in the composite?
12. Under what circumstances might grading on the curve be particularly appropriate?
13. What drawbacks does the standard deviation method of grading have for relatively small [...]
14. Why might 50 be a more appropriate passing score than 75 in a percent grading system?
15. What are the ideal characteristics of a computer gradebook system for use at each of these grade levels: elementary, middle school, high school, college?
The Nature of Standardized Tests

CHARACTERISTICS OF STANDARDIZED TESTS

The term standardized test refers to a test that has been expertly constructed, usually with tryout, analysis, and revision; includes explicit instructions for uniform (standard) administration and scoring; and provides tables of norms for score interpretation purposes, derived from administering the test in uniform fashion to a defined sample of persons. Used [...] published test or inventory, whether [...] or not. Most precisely, [...] tests or measures [...] means for making score comparisons [...] tasks under the same testing conditions and [...] with the same procedures. Of course, no[...] intended to yield norm-referenced comparisons. Criterion-referenced [...] and domain-referenced achievement tests and some personality [...], all of which may be commercially prepared and [...] usually provide tables of norms.

Standardized measures serve the same function in education and psychology as standard weights and measures do in commerce and science. If each market had its own type of scale and concept of how much a pound is, we could not be sure that a pound of ground beef purchased at one market would be [...] a pound obtained at another. The same problem would face the consumer at the gas station, the fabric shop, and the candy counter. Without standardized tests, the achievements and abilities of students from different classrooms and schools could not be assessed readily with a common yardstick. For example, if each fifth-grade teacher in a district were to develop a geography test to measure student achievement, we would likely find tests that varied markedly in the breadth and depth of tasks required, the number of items, the amount of testing time allowed, the quality of test items, and the reliability of the scores obtained. Certainly, it would be illogical and inappropriate to make score comparisons among students from different classrooms and schools under such circumstances.

The distinction made in Chapter 2 between tests and measures will be followed here in detailing the characteristics of test batteries and single-subject tests. In addition, because standardized personality measures and inventories are used so rarely by most teachers and administrators, we have chosen to limit our treatment of standardized instruments to tests in the areas of achievement, cognitive ability, and aptitude.

Test Batteries

Some standardized tests are developed, published, and administered in coordinated sets known as test batteries. The number of tests in the set may vary from 3 or 4 to 10 or more, the number of items per test may vary from as few as 20 to 100 or more, and the administration time per test may range from about 10 minutes to more than an hour. The administration of batteries like the Iowa Tests of Educational Development or the Differential Aptitude Tests may take as many as five separate test sessions.
A primary advantage of using a battery over a collection of separate tests, whether for achievement or aptitude measurement, is that the battery provides comparable scores from the same norm group for all its tests. This is important, for example, if Mindy's achievement in mathematics is to be compared with her achievement in reading, language, and science. Her relative strengths and weaknesses cannot be assessed unless norm-referenced scores using a single reference group are available. If separate tests were used, Mindy might seem to do better on the reading test than on the math test simply because students of lower achievement were more prominent in the norm group of the reading test. This illustration explains why aptitude batteries are used so frequently in employment and vocational counseling to help the client understand his or her areas of strength and weakness. The use of separate tests would not permit useful intraindividual [...]

An achievement battery is a survey of the subject matter covered by each test; coverage is broad and, therefore, relatively shallow. A battery can provide comprehensive coverage of most of the important aspects of achievement at the elementary school level, many at the secondary level, and some at the college level. The more uniform the educational programs of all students are, the more suitable a test battery will be for all of them.

A very practical advantage of a battery is that the scores from a battery are reported together on a single report. When separate tests are used, a score report is generated for each separate test, creating a most cumbersome accumulation of paper for the user.
The use of a battery of tests that was developed as an integrated whole thus offers substantial advantages. The main disadvantage is the lack of flexibility it allows. A battery may include some tests that are of little interest to a user and may omit others the user would have preferred. But this is part of the price that must be paid sometimes for the advantages of convenience in use, comprehensiveness of coverage, and comparability of scores. Most achievement, [...]
Single-subject Tests

Tests that measure achievement in one content area, or that measure a [...] a single-subject test. And because such [...] than the corresponding test found in a battery, they will contain more total items and more items per skill.

Single-subject tests tend to be used for particular purposes, to make a specific kind of instructional decision, rather than simply to describe students' relative achievement or aptitude levels. For example, readiness tests used at the primary level might help the teacher group students of similar reading or arithmetic achievement levels for instructional purposes. A mathematics test might be used to decide which seventh graders are most likely candidates for eighth-grade algebra. Reading tests are used to help select reading materials that would be most appropriate for developing the reading skills of each student [...] Proficiency tests and graduation competency [...] are used to make promotion, reten[...]
Some single-subject tests resem[...] or they provide skill scores. Some [...] separate scores on vocabulary, spelling, capitalization. A reading test may yield a [...] comprehension score, and a total score. Often [...] with one another that their separate dia[...] the total score is probably a comprehensive indicator of achievement in the broad content domain defined by the test specifications.

Most of the standardized tests of cognitive abilities (intelligence) to be described more fully in Chapter 18 are most appropriately classified as single-subject tests. That is, the trait these tests attempt to measure generally is a single, unitary characteristic. Despite the differences among "intelligence" tests in what they purport to measure and in the theory on which they are based, and despite the fact that some yield subtest scores, most intelligence tests are less like batteries and more like single-subject tests.
TYPES OF STANDARDIZED TEST SCORES

Seldom are the raw scores (number correct) obtained by students on standardized tests interpreted directly. Instead, raw scores are converted to some other score scale to facilitate interpretation. These new score scales are designed to permit direct norm-referenced interpretations by referring to a single reference group (status scores) or to several reference groups that have been linked to the same score scale (developmental scores).

Status Scores

Status scores indicate how a student's test performance compares with those of others in a single reference group: a class, school, school district, or national group. Relative position in the group is the focus. Status or standing in the group generally is expressed as a percentile rank, but standard scores like those described in Chapter 4 frequently are used as well. In most cases stanines, T-scores, or normal curve equivalents (NCEs) are normalized standard scores derived from percentile ranks. The standard age scores or deviation IQ scores that come from cognitive abilities tests are status scores also.

The primary purpose of status scores is to help in identifying intraindividual differences in achievement (or ability) across tests in a battery. For example, Vic's percentile rank of 14 in vocabulary indicates a relative weakness compared with a reading percentile rank of 42. Science might be considered a strength for Vic and math a weakness if his science stanine score is 7 and his math stanine is 4. Of course, such comparisons are legitimate only when the same reference group has been used.

Note that the use of status scores to monitor year-to-year progress can mask growth. For example, a student whose reading percentile rank is 87 this year will obtain a similar score next year if normal growth occurs. The sameness conveyed by status scores in this situation could be misinterpreted to mean that no change occurred. In fact, a score of about 87
next year would indicate the student's achievement changed as much as the achievements of others in the norm group. (See the guidelines shown in Chapter 17 for interpreting percentile [...])
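Since stanines are derived from percentile ranks, the conversion can be sketched directly. The example below assumes the conventional stanine percentages (4, 7, 12, 17, 20, 17, 12, 7, 4) and the common convention that a percentile rank equal to a boundary falls in the higher stanine; publishers' tables may treat boundary values differently, and the function name is hypothetical.

```python
from bisect import bisect_right

# Lowest percentile rank belonging to stanines 2 through 9, from the
# cumulative percentages 4, 11, 23, 40, 60, 77, 89, 96 (assumed convention)
STANINE_LOWER_BOUNDS = [4, 11, 23, 40, 60, 77, 89, 96]

def stanine_from_percentile_rank(pr):
    """Map a percentile rank (1-99) to a stanine (1-9)."""
    if not 1 <= pr <= 99:
        raise ValueError("percentile rank must be between 1 and 99")
    return bisect_right(STANINE_LOWER_BOUNDS, pr) + 1

# Vic's percentile ranks from the example in the text
vocabulary_stanine = stanine_from_percentile_rank(14)
reading_stanine = stanine_from_percentile_rank(42)
```

The coarse stanine scale preserves the direction of Vic's vocabulary and reading difference while discouraging overinterpretation of small percentile rank differences.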
Developmental Scores

Developmental scores indicate how a student's test performance compares with those of others in a series of related reference groups (Hoover, 1983). These groups differ systematically and developmentally in average achievement and are defined in terms of school grade or chronological age. Score scales most frequently used to express developmental level include grade equivalents, age equivalents, and developmental standard scores (sometimes called expanded standard [...]

Grade equivalent[...] appropriately used in grades K [...] with school subjects that are studied continuously over several years at increasing levels of skill and complexity. To obtain a table of grade equivalents, the test must
be given to a large number of students in each of the several grades for which it is intended. Then the median raw score of students in each grade is determined. The raw score is assigned a grade-equivalent score that expresses the grade level [...]
a grade equivalent of 3.2 would be assigned to that raw score. If the median raw score obtained by fourth graders on the same test at the same time was 30.3, then a grade equivalent of 4.2 would be assigned to that raw score. (Does it make sense that a raw score of 26.0 would be assigned a grade equivalent of 3.7?) Grade equivalents usually are expressed to the nearest tenth, each tenth corresponding roughly to one month of schooling in a school year of approximately 10 months. A grade equivalent of 7.4, for example, represents the median performance of seventh graders at the end of the fourth month.¹ Table 16-1 shows the grade equivalents assigned to the typical student in each grade for each of three testing times. Note the average growth rate from year to year is 1.0 and that the same uniform growth is assumed throughout each year.

Grade-equivalent scores can be used to describe a student's developmental level, in terms of school grades, and to measure growth from year to year. But they are less useful for examining relative strengths and weaknesses because, as Table 16-2 illustrates, variability in each test area is different for a given grade group. For example, all sixth graders whose raw scores are at the median in the fall have GE = 6.2, no matter which test area we consider. But performance at the 95th percentile corresponds to a GE of 9.2 for spelling and 8.0 for math computation. If we looked only at grade equivalents to make judgments about strengths and weaknesses, in this example we would erroneously consider spelling a strength, relative to math computation. Because sixth graders are more homogeneous in math computation achievement than in spelling, the range of grade equivalents needed to describe the bulk of this grade group is 4.2 to 8.0 and 3.2 to 9.2, respectively.
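The assignment of grade equivalents to raw scores that fall between grade medians can be illustrated with linear interpolation. The sketch below is hedged in two ways: the third-grade median of 21.7 is an assumed value (it is the one consistent with a raw score of 26.0 receiving a grade equivalent of 3.7), and the function name and the choice of straight-line interpolation are illustrative rather than a publisher's actual scaling procedure.

```python
def grade_equivalent(raw, anchors):
    """Interpolate linearly between (median raw score, grade equivalent) anchor points."""
    anchors = sorted(anchors)
    for (r0, g0), (r1, g1) in zip(anchors, anchors[1:]):
        if r0 <= raw <= r1:
            ge = g0 + (g1 - g0) * (raw - r0) / (r1 - r0)
            return round(ge, 1)  # grade equivalents are reported to the nearest tenth
    raise ValueError("raw score lies outside the anchors; extrapolation would be required")

# (median raw score, grade equivalent); 21.7 is an assumed third-grade median,
# while 30.3 and the grade equivalents 3.2 and 4.2 come from the text
anchors = [(21.7, 3.2), (30.3, 4.2)]

ge_for_26 = grade_equivalent(26.0, anchors)
```

A raw score halfway between the two medians lands halfway between the two grade equivalents, which is why 3.7 is a sensible answer to the question posed in the text under these assumptions.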
Devetopment,l standard scorcs arc similar to grade cquivaleD$ in lunc Iio and have the same advantages and disadlantages of Dosr orher rrpes oI derived s.orcs Thc d€velopmental standard scores shou'n rn Table 16 3 hale average growrh rates that d€crease as students progress through rhe Fades 'r'hese Table16-1. GradeoquivalentScoreslor MedianPerformanc€ at Eachot Thr6eTimes o f Ye a rl n E a c hGfa d e
Grade   Fall   Winter   Spring
K       K.2    K.5      K.8
1       1.2    1.5      1.8
2       2.2    2.5      2.8
3       3.2    3.5      3.8
4       4.2    4.5      4.8
5       5.2    5.5      5.8
6       6.2    6.5      6.8
7       7.2    7.5      7.8
8       8.2    8.5      8.8
1 As will be seen in the examples used later, some publishers drop the decimal point when reporting a student's grade equivalent. For example, ta and tl would be interpreted in
THE NATURE OF STANDARDIZED TESTS
Table 16-2. Differences in Grade-equivalent Distributions by Test Areas
GRADE-EQUIVALENT SCORE
a7
95 6o 50
92 67 62 57 32
66 62 51 35
5
80 65 62 59
Performance in grade 6 on the Iowa Tests of Basic Skills

particular scores, used with the Iowa Tests of Basic Skills, illustrate a significant limitation of all developmental standard score scales: there is no meaning or interpretation built in to a score. What does a score of 120 mean for a fourth grader tested in April? Without access to a chart like Table 16-3, we would need to know these things: (1) median performance in fall of grade 3 is defined as 100, (2) median performance in fall of grade 8 is defined as 160, and (3) average annual growth for grades 3 to 8 is 12. Developmental standard scores are not widely used because of the extra baggage required to interpret them and because they are so unfamiliar to teachers and parents.

Grade-equivalent scores are fairly easy to interpret because they are defined on a scale that is understood by individuals who have little sophistication with measurement or statistics. There are misinterpretations, just as there are with status scores, but there is no convincing evidence that developmental scores are more grossly misused or misinterpreted than are status scores (Hoover, 1983). The use of common sense and some basic knowledge about developmental scales are the requirements for responsible interpretation of grade-equivalent scores. An example will illustrate. If Jonnette, a bright fifth-grade girl, gets a grade-equivalent score of 8.4 on an arithmetic test designed for grades 5 and 6, how should her score be interpreted? Chances are this test was not administered to eighth graders, so the value 8.4 is the estimated grade equivalent (by the process of extrapolation). The typical

Table 16-3. Developmental Standard Scores for Median Performance at Each of Three Times of Year in Each Grade
Grade   Fall   Winter   Spring
K        56     60       64
1        73     77       81
2        87     91       95
3       100    104      108
4       112    116      120
5       124    128      132
6       136    140      144
7       148    152      156
8       160    164      168
student in the eighth grade, fourth month would score about the same as Jonnette did on this test. However, this does not mean Jonnette can do the same arithmetic as the typical eighth grader. She would need to take a test designed for eighth graders for us to know how she would perform on arithmetic content studied by eighth graders. Students who obtain grade-equivalent scores significantly above or below their own grade level should be retested with a higher or lower test form if the user wishes to obtain more precise indications of their developmental levels. Often the percentile rank, a status indicator, is helpful in making judgments about the value of out-of-level testing for a particular student.

Score Profiles

Only if scores on the several tests used are comparable is a profile of student scores meaningful. Scores will be comparable if they are expressed as the same status scores (all percentile ranks, or all the same type of standard score) and if the same reference (norm) group is used for each one. An example of one student's score profile is shown in Figure 16-1. The horizontal lines on the chart represent various percentile ranks, spaced as they would be if the trait being measured by the scores was normally distributed. There is a vertical line on the chart for each test in the battery. The percentile rank values shown across the top of the chart for each test are marked as dots on the corresponding vertical scales and connected by lines to form the profile. Larry Hill's performance is about average, overall. (His percentile rank for the total test is 52.)
His highest achievement levels (relative strengths) are in reading, vocabulary, and work-study skills. His lowest (relative weaknesses) are in language and mathematics. Profiles are most useful for identifying individual needs of students and for vocational and educational planning. A profile also might be used to identify students who should be tested more extensively or to determine if impressions formed from classroom testing and observation are confirmed. Profiles represent a very compact form of visual communication that makes them convenient for reporting and explaining test results to both students and parents. (Additional examples of profiles can be found in Chapter 17.)

Percentile Bands

In an attempt to stress the fact that test scores are subject to error, some test publishers choose not to report an exact percentile rank for each test score. Instead, they provide a range of values within which the "true" percentile rank probably lies. This range is called a percentile band. For example, the test manual may show that the percentile rank for a test score of 63 is between the values 28 and 57; it may go on to stress that the exact percentile rank equivalent is unknown, since it depends on the unknown size and sign (positive or negative) of the error of measurement in the individual's score.

The principle employed in computing percentile bands is the same one involved in using the standard error of measurement (Chapter 5) to find the raw score range in which the true score probably lies. The width of the percentile band depends on two factors, the reliability of the scores and the degree of certainty that the band includes the true value. Low score reliability or high degrees
Figure 16-1. Sample Student Profile Chart (Iowa Tests of Basic Skills, Form G, H, or J)
of certainty lead to wide percentile bands. Unfortunately, the broader these percentile bands are, the less useful is the information the test provides.

One use of percentile bands in a battery of tests is in deciding whether or not a difference between any two scores of an examinee is large enough to
... be due solely to errors of measurement. The score report in Figure 16-2 demonstrates the use of percentile bands on scores for Alison Babka from the Iowa Tests of Basic Skills. In the upper-right ... vocabulary, for example, Alison's ... which means there is a 50 percent probability ... is in the range ... In the bottom of the ... scores from each test area. Why ... of 100 always have a percentile rank band at the top of ... the "HIGH" area?

There is a possibility of underinterpreting test scores using percentile
... selecting high level ... be better off relying ... for decision making ... of confidence for inte... on the percentile rank ... Generally, the larger a score difference, the more confident the user can be that a corresponding achievement difference actually exists. But usually ... are made in a decision-making context using other ...
are more likely to help than to hinder the process.

Subtest Scores

Just as tests that constitute a test battery provide separate measures of different aspects of achievement, so it is possible to subdivide a single test into separately scored parts to obtain measures of several unique skills. The report in Figure 16-2 demonstrates this. The desire to obtain as much information as possible from a test sometimes leads the test developer to offer a large number of skill scores, each of which may be based on only a few test items. There are two cautions to note with regard to interpreting such skill or subtest scores. First, as the number of separate scores increases, the reliability of each probably diminishes. On many tests, a subtest score based on as few as 10 or 15 items may measure sampling error more than it does true achievement. The percentile bands in Figure 16-2 help alert the user to this possibility.
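The dependence of percentile-band width on score reliability and on the chosen degree of certainty can be made concrete with a short Python sketch. It is an illustration only, not any publisher's procedure. It assumes normally distributed scores in the norm group and the conventional standard error of measurement, SD times the square root of one minus the reliability; the function name, the mean of 50, the SD of 10, and the reliability values are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def percentile_band(score, mean, sd, reliability, confidence=0.68):
    """Return (low, high) percentile ranks for the band around `score`.

    Assumes normally distributed scores in the norm group and the usual
    standard error of measurement, SEM = SD * sqrt(1 - reliability).
    """
    sem = sd * sqrt(1.0 - reliability)
    # Two-sided z multiplier for the requested degree of certainty.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    norm_group = NormalDist(mean, sd)
    low = norm_group.cdf(score - z * sem) * 100
    high = norm_group.cdf(score + z * sem) * 100
    return round(low), round(high)


# Hypothetical test: mean 50, SD 10. Compare a reliable score scale with
# an unreliable one, and a modest degree of certainty with a high one.
print(percentile_band(50, 50, 10, reliability=0.90))
print(percentile_band(50, 50, 10, reliability=0.60))
print(percentile_band(50, 50, 10, reliability=0.90, confidence=0.95))
```

Running the three calls side by side shows the band widening when reliability drops or when more certainty is demanded, which is the situation a short subtest of 10 or 15 items is likely to be in.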
The second caution relates to the validity of subtest scores. When subtest scores are provided, as is the case with many single-subject reading tests, the developer should provide subtest intercorrelations in the test manual to show how similar or different the subtests actually are. If the correlations are too high, for example, the subtests are all likely to be measuring the same unitary trait or skill. The responsible test user should focus interpretations on total test scores in such cases and ignore the availability of the subtest scores.
NORMS

Norms, which report how students actually do perform, should not be confused with standards, which represent estimates of how well they should perform. For example, the standard of correctness in arithmetic calculation in most classes is 100 percent, but the norm (average) of student achievement on a given computation test may be only 85 percent. Often the average performance takes on the function of a standard. That is, the average becomes the criterion against which the scores of individuals are judged to determine the score meaning and value. Consequently, few students are regarded as failures in an area of study if their performance is above the norm (or average), and few are regarded as successes if their performance is below it.

Norms are sometimes confused with the various types of scores that are used to report them. Percentile ranks, stanines, grade equivalents, and standard scores are all types of scores, derived from raw scores, to report normative performance; they are not norms themselves. Norms are differentiated by certain characteristics of the reference group that comprise them. There are age norms and grade norms, local norms and national norms, group norms and individual norms, to name only a few. It is possible to combine the characteristics of a norm group in a variety of ways in an attempt to build highly differentiated norm groups. For example, the National Assessment of Educational Progress (NAEP) reports normative performance based on age, geographic
region, race, gender, and community type. We could find out how white nine-year-old boys from the rural West score on test exercises, but seldom is it worthwhile to use so many variables in combination to describe test performance. (And, fortunately, NAEP does not use that many classification variables in a single comparison.)

...tions, and who take the test as seriously as will other students for whom the norms are needed. The three R's most often used to judge the appropriateness of a set of norms for a given testing situation are representativeness, relevance, and recency.

Norms obviously must be obtained from students in schools that are willing to take time out from their other responsibilities to help with the norming administration. That very willingness may make them somewhat atypical of the national population of schools and students. To get enough participation from schools to provide a reasonably large norm group is a difficult undertaking. To make it a representative sample is even harder. First, the developer must decide
stuclenrs.fhar is, a morc relev sentdrive sampleui pri!are his Easr.S^urh,Mid(esi, ind 11.i
The adminisrradon of rcstsro ob
Figure 16-3. The effect of dated norms for school averages on standardized achievement tests.
Individual versus Group Norms
A somewhat serious error that some test users make is to use norms for individual student scores to interpret average scores from school buildings, school divisions, or some other aggregate. Although the mean of the school averages should be about the same as the mean of the student scores, the school averages are less variable than the student scores. For example, the average score in a truly excellent school may be lower than the scores obtained by a sizable share of the students in the norms group, and the average score in a very poor school may be better than the scores of a sizable share of those students. If such inappropriate interpretations are made, the percentile ranks of the school
averages are not likely ever to be lower than 20 or higher than 80. The most extreme degrees of excellence or deficiency are likely to be underestimated drastically. The most ideal basis for evaluation of school averages, a type of treatment-referenced interpretation, is a separate table of norms for school averages. And the quality of school norms should be judged by the same criteria of relevance, representativeness, and recency as were recommended for individual student norms. Figure 16-3 exemplifies the predicament that can develop when dated norms for school averages are used.
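The compression of school-average percentile ranks under student norms can be illustrated with a small Python sketch. It rests on assumptions that are not in the text: both student scores and school averages are treated as normally distributed around the same mean, and the spread of school averages is taken to be 40 percent of the spread of student scores. That ratio and the function name are hypothetical, chosen only to show the direction and rough size of the effect.

```python
from statistics import NormalDist


def student_norm_pr(school_pr, sd_ratio):
    """Percentile rank a school average would get from student norms.

    `school_pr` is the school's true percentile rank among school averages.
    `sd_ratio` is the SD of school averages divided by the SD of student
    scores (a value below 1, since averages vary less than individuals).
    Both distributions are assumed normal with the same mean.
    """
    unit = NormalDist()
    z_among_schools = unit.inv_cdf(school_pr / 100)
    z_among_students = z_among_schools * sd_ratio
    return round(unit.cdf(z_among_students) * 100)


# Hypothetical ratio: school averages vary 40% as much as student scores.
for pr in (3, 25, 50, 75, 97):
    print(pr, student_norm_pr(pr, sd_ratio=0.4))
```

Under the assumed ratio, schools near the 3rd and 97th percentiles among schools are reported in the low 20s and high 70s when student norms are misapplied, which is the underestimation of extremes described above.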
SELECTION OF STANDARDIZED TESTS

Sources of Information

For those who wish to identify published and unpublished tests that measure a particular trait, or those who seek descriptive information or critical reviews of existing measures, a wide variety of sources is available. Most information will be found in print, but much of it is accessible through computer retrieval.

The Mental Measurements Yearbook (MMY) generally is regarded as the most comprehensive source of information about published tests. The tenth edition (Conoley and Kramer, 1990), the most current printed edition at the time of this writing, includes such descriptive information about each test as author, publication date, number of forms and levels, number of scores reported, administration time required, and prices for tests and scoring services. In addition, critical reviews by testing specialists and a bibliography identifying research studies in which the measure was used are provided. Tests in Print III (Mitchell, 1983) is a summary reference to information detailed in all the MMYs published previously. (The fourth edition is scheduled to be published in 1991.)

The Buros Institute of Mental Measurements has made several changes to reduce the publication lag that plagued earlier volumes of the MMY. First, a biennial publication schedule has begun, and paperback supplements are provided in the alternate years. This means information about current tests is updated in print every year. Second, all MMY information is accessible on-line using Bibliographic Retrieval Services (BRS). The system (MMYD) is updated continually so that a computer search will uncover the most recent descriptive and evaluative information about available tests.

Another source of descriptive information about published tests, Tests: A Comprehensive Reference for Assessments in Psychology, Education, and Business,
is a cumulative listing that contains over 3100 entries (Keyser and Sweetland, 1990). A companion publication, Test Critiques, provides comprehensive reviews that include recommended applications, technical information, and an overall critique. At the date of this writing, seven volumes had been published (Keyser and Sweetland, 1985-1990).

The most current information about standardized tests is in publisher catalogs and the tests themselves. Those who are charged with selecting tests for a school testing program should review a specimen set for each test under consideration. For a nominal fee, the publisher will provide a copy of one form of the
test and the accompanying test manuals to individuals who are authorized to use standardized tests. Publisher representatives can answer questions about their tests and processing services through either telephone inquiries or school visits requested by the test-selection committee.

Some tests have been reviewed by measurement specialists in professional publications like the Journal of Educational Measurement or Measurement and Evaluation in Guidance. These reviews, as well as validity studies published in Educational and Psychological Measurement, can be identified readily through a computerized literature search of Current Index to Journals in Education (CIJE) for a very ...

Finally, college and university faculty members in education and psychology departments often are available to consult with school personnel regarding test selection and use. Some universities are willing to provide consultation and test-scoring services to school districts through their campus measurement and test-scoring centers. The same services are supplied by some state education departments through area or regional centers established throughout the state to serve school districts.

Selection Criteria

Sources of information available to committees or individuals responsible for selecting standardized tests were described in the previous section. But what information should be sought from these sources and how should the information be weighed in arriving at a selection decision?

Validity. Without question, test content (that is, what the items or test tasks require the examinee to know) is the most important factor to assess. How well the tests or subtests match the curriculum in terms of content coverage and emphasis must be determined in selecting achievement tests. For tests of aptitude and intelli
make them easy for students to use. ... should be legible in terms of size and clarity.

Technical adequacy. A test that has been judged to be sufficiently valid to allow the district to accomplish its purposes for testing should be scrutinized further for technical adequacy. The reliability of test and subtest scores should be assessed from data supplied in the technical manual and comments made by reviewers of the test. Data should be provided in the manual about the equivalency of alternate test forms that may be available. When developmental scores are
are satisfactory.

Practical considerations. Tests that survive a validity and technical screening should be evaluated in terms of certain other important considerations. Schools
SUMMARY PROPOSITIONS
a sranoard' serecrine '' lli'i"l;'." 'o validllY p"onlv'o g"e h'sl" ', *o"E i,i. .""*. '..".."' ticatesrre srden.s and5uchoracllcal .ui"""". i""""'",'
re r q e n q ar ew'"" ordc owem o l a sw e l l a sh s o r wearnesses and ;oeciiicslrenglhs ro rle o'esenceoi il'*""" ","'J' "" ""t"n
"o"o"""v andcosr' requiremenls as lirne conslderations
QUESTIONS FOR STUDY AND DISCUSSION

1. Which characteristics of standardized tests are most significant for tests that will be used to make criterion-referenced interpretations?
2. Why might the national norm groups of two different publishers yield different scores (means and standard deviations) if the groups were given a common test?
3. Why are achievement batteries generally less useful at the high school than elementary level?
4. If a math test providing three separate scores has intercorrelations of 0.79, 0.85, and 0.83 between subtests, what are the implications for score interpretation?
5. Why should T-scores be regarded as status scores rather than developmental scores?
6. Why might grade-equivalent scores be less useful for interpreting scores of high school seniors than those of third graders?
7. If a sixth grader is working at the level of the typical fourth grader, is it more appropriate to test the student with a test battery designed for grade 4 or grade 6? Explain your response.
8. What is the meaning of a score profile, using percentile ranks, that forms a straight horizontal line?
9. Why are percentile-rank bands particularly useful in interpreting skill or subtest scores?
10. What are the differences between national stanine norms and local percentile rank norms?
11. During a period of national achievement score decline, what impact will the use of dated norms have on the interpretations of students' scores?
12. What does it mean when a school's average score of 3 3 5 for grade 2 science has a national percentile rank of 97?
13. In what sources are you likely to find the most current critical review of a standardized achievement test?
Using Standardized Achievement Tests

There are good reasons why this chapter focuses on using, rather than on selecting tests, describing sample test content, or administering and scoring tests. These are all important aspects of standardized achievement testing, but none is so critical as the use of tests and the scores derived from them. The climate of the 1990s threatens valid test use because there is too much testing for purposes for which tests were not designed, and there is too little appreciation for the limited precision with which we are able to measure educational attainments. Consequently, attention in this chapter will be directed toward (1) an analysis of the legitimate uses of standardized achievement tests, (2) illustrations and explanations of score interpretation, (3) planning topics for in-service work in test selection, administration, and interpretation, and (4) an exploration of some issues that affect the quality of a school testing program.
THE STATUS OF STANDARDIZED ACHIEVEMENT TESTING

The great irony of standardized achievement testing today is that, while these tests are being overused to fulfill state and local legislative mandates, their results are being underutilized in serving the instructional needs of teachers and students. To be sure, many school districts have carefully developed testing programs with systematic procedures for interpreting and reporting their test results. But in too many cases the schools are required to give certain tests so that results can be made available for such purposes as interstate and interdistrict comparisons, teacher and administrator personnel decisions, and pupil retention
judgments. The many information seekers (legislators, school board members, parent groups, business leaders, and school administrators and teachers) do not share the same agenda and do not, therefore, have need for the same types of ...

Teachers and administrators generally make less use of standardized achievement-test results than they could for two reasons. First, educators tend to understand far less about tests and test scores than would be desirable. Their educational preparation programs and in-service education seldom address the essentials of testing and evaluation. Consequently, teachers often can devote less ... consequences because of low scores on mandated tests must invest the bulk of their energy and instructional time to preparing their students to do well in the areas (usually reading and math) covered by the accountability assessment. As a result, even if other test scores point to areas of weakness, these teachers cannot afford to split their efforts away from the "high-stakes" mandated assessment.
... has become so uncritically accepted among many educators and the general public that its validity as a measure of educational accomplishment is virtually unquestioned. To some extent, unfortunately, he may be correct. We
... expectations. For example, when the bathroom scale gives a higher reading than we expect, how many of us first wonder if the scale is functioning properly? When stopped for a speeding violation, how many drivers first question the accuracy of the radar equipment? But when achievement ... frequently try to explain substandard test results in terms of teacher quality, funding, or physical resources, and simply assume the appropriateness of the measurement that provided those results.

Another explanation for the increased use of standardized achievement tests, particularly mandated assessment, is that the celebrated National Commission on Excellence in Education recommended more. In its report, A Nation at Risk (1983), a specific framework with explicit purposes was given:
In outlining the evidence for concluding that we are a nation at risk, the commission reported 13 "indicators of risk," 11 of which depend on the use of standardized test scores as criteria.
Whether or not the achievement test has become an institution is debatable. What seems certain, however, is that good standardized achievement tests will continue to be needed to help educators monitor the effectiveness of their efforts and report the outcomes of their efforts to local boards and parents. Careful test selection and wise test-score interpretation and use can make positive contributions to fulfilling these needs.
USES OF ACHIEVEMENT-TEST RESULTS

Standardized achievement-test scores provide a special kind of information on the extent of student learning. It is special because it is based on a consensus of expert teachers with respect to what ought to be learned in the study of a specific subject, a consensus external to and independent of the local teachers. It thus provides a basis for comparing local achievements with external norms of achievement in similar classes. It is useful information because it helps to inform students, teachers, administrators, and the public at large of the effectiveness of the educational efforts in their schools.

Schools sometimes have been criticized for setting up testing programs, giving and scoring the tests, and then doing nothing with the test scores except to file them in the principal's office. If the school faculty and the individual teachers do not study the test results to identify levels and ranges of achievement in the school as a whole and within specific classes; if they do not single out students of high and low achievement; and if the scores are not reported and interpreted to students, parents, and the public, these criticisms are justifiable. But if the critics mean that no coherent program of action triggered specifically by the test results and designed to "do something" about them emerged from the testing program, then the criticisms probably are not justified.

What a good school faculty "does" about standardized test scores is something like what good citizens do with information they glean from a newspaper. Having finished the evening paper they do not lay it aside and ask themselves, "Now what am I going to do about all this, about the weather, the accidents, the crimes, the legislative decisions, the clothing sales, the stock market reports, the baseball games won and lost, and all the rest?" They may, of course, plan specific actions in response to one or two items.
But most of what is memorable they simply add to their store of latent knowledge. In hundreds of unplanned ways it will affect the opinions they express later, the votes they cast, and the other decisions they make. Information can be very useful ultimately, even when it triggers no immediate response.

Educators who properly deplore judging teacher competence solely on the basis of students' test scores sometimes fail to see that it is equally unwise to take action on school or student problems solely on the basis of those same test scores. Seldom do standardized test scores by themselves provide sufficient guidance for wise and effective educational actions. It follows that these test scores should be regarded primarily as sources of useful information, not as major stimuli and guides to immediate action.

A school faculty or teacher who sees the need and has the opportunity should not hesitate to develop a program for action based partly on the scores
provided by standardized tests of achievement. But neither should feel that the testing was a waste of time unless such a program is developed. The immediate purpose to be served by standardized test scores is the provision of instructional information, information that can contribute to the wisdom of a host of specific actions stimulated by other educational needs and developments.

Purposes for Testing

All achievement tests, whether standardized or teacher made, are mainly tools of instruction. That is, they are designed on the basis of the goals of instruction, and their results are intended to show the extent of progress toward those goals. Standardized achievement test batteries provide surveys of the extent of learning in each of several curricular areas; their scores are expected to improve the decisions teachers make about students. The assumption is that teachers will make better instructional decisions about students with such test scores than they would without them (Hieronymus and Hoover, 1986). Scores are not intended to supplant teacher judgments. Instead, they may help to confirm suspicions and expectations, they may provide conflicting information that should trigger reassessment, or they may point out the need for further, more detailed information.

The purposes outlined by the authors of the Iowa Tests of Basic Skills indicate the importance of serving instructional needs and, by their absence, the inappropriateness of using such tests for a variety of accountability functions (Hieronymus and Hoover, 1990, p. 1):

1. Describe the developmental level of students so that instructional materials and procedures can be adapted to individuals.
2. Diagnose individual strengths and weaknesses in educational development across subject areas and skills within subject areas.
3. Determine the extent of readiness to begin instruction, to proceed in an instructional sequence, or to move to an accelerated level of instruction.
4. Inform administrative decisions in grouping individuals to accommodate individualized instruction.
5. Diagnose group strengths and weaknesses for adjusting curricular content, emphasis, or approach.
6. Determine the relative effectiveness of alternate methods or programs of instruction.
7. Determine the effectiveness of innovative programs or experimental approaches.
8. Provide a means for developing reasonable expectations for student achievement and for describing progress toward such goals.
9. Describe student achievement in terms that are meaningful to parents, students, and the general public.

Examples of some of these specific purposes will describe how achievement test scores can be used to select students for remedial attention or for enrichment opportunities, for readiness for planned instruction, or for diagnosing difficul
Chapter 1 Selection and Evaluation

Talented and Gifted Selection
nationarpercenorerank of ar reaste5 on rhe sk /,rd
programs depends on such nonacademic variables as interest, motivation, persistence, and independence. The nature of the program, the demand for participation, and the extent of local resources may vary enough over time to warrant setting separate criteria for the various TAG programs ... and ... available.

Kindergarten Readiness
... kindergarten. Such placement decisions should be made using information that ... provide. Children who are not ready for kindergarten are those who have some kind of developmental deficit (physical, emotional, social) that needs expert attention or time ... that the regular kindergarten program does not ... offer
[...] have not had a preschool or home environment that nourished such skills. But these are cognitive abilities that can be learned relatively quickly, given some concentrated effort by student and teacher. The main value of readiness scores is to provide a picture of student strengths and weaknesses, at the skill level, and to describe the readiness in each subject area of the class of students. Thus, readiness tests are most useful when given in the mid-fall, a time that allows plenty of [...]

Finally, for reasons similar to those cited immediately above, readiness scores obtained in the spring of kindergarten are not useful for making first-grade placement decisions. Decisions to retain, to place in a transitional program, or to promote are not likely to be aided by readiness-test scores. Shepard and Smith (1986) have argued that retention at this decision point has only negative consequences for most students, no matter what criterion is used. Students who show severe weaknesses in prereading skills (listening, letter recognition, letter-sound association) and language relational concepts may need temporary individual-program plans to remediate their deficiencies. But certainly other considerations should inform the retention decisions, not solely the academic deficiencies of kindergartners.

Diagnosis of Learning Difficulties

Standardized achievement batteries are not designed to be diagnostic aids that provide detailed information for work with individual students. However, most do provide considerable group diagnostic information, particularly those that display results in special reports that show average test-item scores or average skill scores within tests.
Instructional planning for a class can be enhanced by taking such data into account; instructional materials can be selected or developed to improve learning in deficient areas, and time can be reallocated from topics on which students have demonstrated higher levels of accomplishment.
Any achievement test can provide "diagnostic" information of value to individual students if they are told which items they missed. With the teacher's help, these students can then correct the mistakes or misconceptions that led them astray. Highly specific "diagnosis" and "remediation" of this sort can be effective and is often accomplished with classroom achievement tests. But such feedback and discussion are impractical, if not impossible, with standardized tests.

One reason for the lack of success in educational diagnosis in most fields other than elementary reading and arithmetic is that most learning difficulties are not attributable to specific or easily correctable disorders. Instead, they usually result from accumulations of incomplete learning and of distaste for learning. Neither of these causes is hard to recognize; neither is easy to cure. Diagnosis is not the real problem, and diagnostic testing can do little to solve that problem.

Another reason for this lack of success in educational diagnosis is that effective diagnosis and remediation take a great deal more time than most teachers have or most students would be willing to devote. The diagnosing of reading difficulties is a well-developed skill, and remedial treatments can be very effective. Because reading is so basic to other learning, the time required for diagnosis and remediation is often spent ungrudgingly. But where the subject of study is more advanced and more specialized, the best solution to learning difficulties in an area, say algebra, physics, or German, may be to put off study in that area and cultivate learning in other areas that present fewer problems.

Standardized diagnostic tests in both reading and math are achievement tests used by reading or math specialists to gain information about the learning problems of individual students.
These tests are built to allow test takers to demonstrate certain kinds of errors or misconceptions held by students who are having difficulties in reading (or arithmetic computation). Often the results of the subject-matter test in a battery indicate a general problem, and the diagnostic test is administered to ascertain the specific deficits in terms of skills and subskills. Unfortunately, diagnostic tests, like other achievement tests, help to identify problem areas, but they seldom provide reasons for the difficulties and cannot prescribe solutions to overcome them. A major challenge to the teacher is to synthesize the entering behavior information about a student so that the instructional strategies and materials can be selected that will optimize that student's conditions for learning.
INTERPRETING SCORES OF INDIVIDUALS

Most test publishers offer such a wide variety of score reports and scoring services that schools sometimes have difficulty deciding which ones they should order.
[...] counselors, administrators, parents, and what kind of information is needed: pupil test and skill scores, building average scores, system-wide averages, classroom percent scores, and so on. The list of needs may seem almost endless, but the review process will help to rule out many reports that are either similar to one another or simply not needed. There is good reason to believe that the
underutilization of scores by teachers is due in part to the inconvenient format in which the scores are reported to them. Of course, part of the reason for the inadequate reporting is that teachers are seldom consulted about report formats that would be most helpful to them.

When test results come back to a district, nearly every teacher will receive a list report, an alphabetical listing of students and their corresponding scores. At the middle school and high school levels, reports might be arranged by class period for each English, math, science, and social studies teacher. Figure 17-1 is a sample list report showing scores for Mrs. Newton's fifth-grade class on the Iowa Tests of Basic Skills. A review of the scores of Alison Babka will illustrate how scores of individual pupils might be interpreted. Here are some statements that might be made about Alison's performance in the fall of fifth grade:

1. Her Complete Composite grade equivalent score (the average of the five main scores) is 55, the same as the typical student at the end of the fifth month of fifth grade.
2. Her Complete Composite percentile rank of 60 means that 60 percent of fifth graders nationally have composite scores lower than hers.
3. In sum, Alison's overall achievement seems about average compared with other fifth graders nationally.
4. Alison's relative strengths are in areas in which her percentile rank is noticeably above her Complete Composite percentile rank: punctuation and math computation.
5. Alison's relative weaknesses are in areas in which her percentile rank is noticeably lower than her Complete Composite percentile rank: language usage, math problem solving, social studies, and science.
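The comparisons in the statements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 15-point margin used to stand in for "noticeably," and every area percentile rank in the example are hypothetical. Only the grade equivalent of 55 and the composite percentile rank of 60 come from the text.

```python
from typing import Dict, List, Tuple


def split_grade_equivalent(ge: int) -> Tuple[int, int]:
    """Split a grade equivalent written without a decimal point (e.g., 55)
    into (grade, month of the school year)."""
    return divmod(ge, 10)


def relative_standing(
    percentile_ranks: Dict[str, int], composite_pr: int, margin: int = 15
) -> Tuple[List[str], List[str]]:
    """Return (strengths, weaknesses): areas whose percentile rank is at least
    `margin` points above or below the composite percentile rank.
    The margin is an assumed cutoff, not a value given in the text."""
    strengths = sorted(
        area for area, pr in percentile_ranks.items() if pr - composite_pr >= margin
    )
    weaknesses = sorted(
        area for area, pr in percentile_ranks.items() if composite_pr - pr >= margin
    )
    return strengths, weaknesses


# Hypothetical area percentile ranks, invented for illustration only.
example_prs = {
    "punctuation": 85,
    "math computation": 80,
    "vocabulary": 62,
    "language usage": 38,
    "science": 40,
}

grade, month = split_grade_equivalent(55)
strengths, weaknesses = relative_standing(example_prs, composite_pr=60)
print(grade, month)   # grade 5, month 5
print(strengths)      # areas well above the composite
print(weaknesses)     # areas well below the composite
```

A school would choose its own margin; a wider one flags fewer areas and guards against reading too much into small differences between percentile ranks.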
At this point we should be interested in the particular skills that may have contributed most to the strengths and weaknesses identified by the test scores. A report that makes such analysis possible and that provides percent correct scores for criterion-referenced interpretation is the Student Skills Analysis report in Figure 17-2. (Note that the Individual Performance Profile report, Figure 16-2, also could be used nicely for this purpose.) Across the top of Alison's skills report, the test scores (grade equivalents and percentile ranks) from the list report are reproduced for easy reference. The fourth column of numbers in the bottom section of the report is the percent correct score for Alison, and the next two columns, averages for the class and the nation, permit norm-referenced comparisons. Here are some statements that might be made about Alison's performance based on skill scores:

1. Punctuation is a relative strength of Alison's, partly because of her performance with terminal punctuation and use of commas. Her other skill scores are much like those of the average student in her class.
2. Alison's math computation performance was bolstered by perfect scores on addition/subtraction of whole numbers and decimals. But whole number multiplication/division seems to be a weak skill within this generally strong area.
3. The language usage and expression weakness seems to be explained mainly by usage skills, all of which need improvement.
4. The weakness in math problem solving [...] subtraction. There may be some [...] problem-solving test items to deter[...]
[...] studies as anything.)
6. Performance in science [...] in physics and chemistry [...] but particularly with respect to topics [...]
7. Alison also had some trouble with the reading of graphs and tables in the visual [...]
8. Though reference materials was not [...] reveal the specific problem [...]

[...] aspect of interpreting the scores of an individual student in[...] typical progress was made in each area, [...] same on the two occasions. This table [...] (Feldt, Forsyth, and Alnot, [...]).
85-99   65-84   35-64   15-34   [...]

[...] one year or 10 months (or 1.0 when the [...] below average might gain only 6 to 8 months [...] expected to gain 12 to 14 months. A [...] several successive years can provide a [...] adequacy of growth, overall and in the areas previously noted as strengths or weaknesses.
INTERPRETING SCORES OF CLASSES
[...] bottom row of the list report shows the average grade-equivalent score and the corresponding [...] like these can be made:

1. [...] finished the second month. The percentile rank of 51 verifies this interpretation.
2. The relative strengths of the class are in areas in which the percentile rank is [...]
3. The relative weaknesses are in the reference materials and reading areas.
4. [...] unexpected test performance by certain students.)
Thus far, in addition to identifying areas of group strength and weakness, we have tried to verify that the student scores most responsible for these extreme group performances are not due to incomplete testing, random responding, or some other erroneous factor. Students who were not motivated to take the tests seriously could have responded in unpredictable ways that would cause their scores to be incongruent with their typical classroom attainments. Such scores should be ignored temporarily so that subsequent group analysis and instructional planning will not be distorted.

The next step is to determine the skills that might explain the relative strengths and weaknesses noted above. The Group Item Analysis report in Figure 17-3 shows the Visual Materials and Reference Materials skill scores and corresponding item scores for Mrs. Newton's class. Since reference materials was a weakness previously identified, we should look at the scores in the right column to gauge performance in that area. The first column of numbers shows the test item number, and each of the next four columns shows the average percent correct score for (1) all fifth graders in the nation, (2) Mrs. Newton's class, (3) all fifth graders in the building, and (4) all fifth graders in the school system. The last column, Diff, is the class average minus the national average. It is this column that can help isolate skill deficiencies and the particular item content that contributed [...] Mrs. Newton's class seems to have had some trouble with alphabetizing [...]
seem to be problematic because, with the exception of item 58, they are sizable. Mrs. Newton will need to decide if she should plan some instruction in this skill; if she should incidentally introduce alphabetizing tasks in the course of presenting science, social studies, or other such lessons; or if she thinks her upcoming plans will deal with this skill sufficiently. The weakness in general references might have been expected if these fifth graders had not been instructed in the use of atlases, almanacs, and certain book parts. The district curriculum guide would be a useful reference for deciding about reasonable expectations and possible needs for remediation in situations like this.

When schools are departmentalized, as they usually are for middle school and high school grades, score reports can be prepared separately for each [...] the one in Figure 17-4 [...] This latter report shows the average skill scores of 14 grade 12 students on the Iowa Tests of Educational Development. It can be used much like the group item analysis report (Figure 17-3) to find skills that help explain areas of strength and weakness. For example, the Sources of Information class average percent correct score of 58 was only 3 points less than the national average score.
One skill, use of encyclopedias and almanacs, was a weak area and another, use of the Readers' Guide, was a strong area. In science, Mr. White's major area of interest, these students performed slightly below the national average, but no skill seemed to be particularly weak or strong. In Quantitative Thinking, another area of interest to a physics teacher, these students performed slightly better than the national average, particularly in the skills of probability/statistics and exponents.

If teachers in departmentalized schools are expected to review the standardized achievement scores of their students, as they should be, reports by class period should be provided for them. It is unreasonable, for example, for middle school teachers to pore over a list report of 240 eighth graders to find the ones [...]
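The Diff column of the group item analysis report described earlier in this section (class average minus national average, item by item) is simple arithmetic and can be sketched as follows. The function names, the 10-point threshold for a "sizable" difference, and all percent correct values are assumptions made up for illustration; none of them are taken from Figure 17-3.

```python
from typing import Dict, List


def diff_column(class_pct: Dict[int, float], nation_pct: Dict[int, float]) -> Dict[int, float]:
    """For each item number, Diff = class average percent correct
    minus national average percent correct."""
    return {item: class_pct[item] - nation_pct[item] for item in class_pct}


def sizable_deficits(diffs: Dict[int, float], threshold: float = 10.0) -> List[int]:
    """Item numbers on which the class trails the nation by at least `threshold`
    percentage points. The threshold is an assumed cutoff, not one from the text."""
    return sorted(item for item, d in diffs.items() if d <= -threshold)


# Hypothetical average percent correct scores for three alphabetizing items.
class_avg = {56: 50.0, 57: 48.0, 58: 70.0}
nation_avg = {56: 65.0, 57: 62.0, 58: 72.0}

diffs = diff_column(class_avg, nation_avg)
print(diffs)                    # negative values mean the class is below the nation
print(sizable_deficits(diffs))  # items that may deserve instructional attention
```

Whether a flagged item calls for new instruction remains a teacher's judgment, informed by the curriculum guide and upcoming lesson plans, as the text notes.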
REPORTING TO STUDENTS AND PARENTS

The most basic use of test scores is to report them to all who need to know, along with a simple interpretation of what they mean. They should be reported to students as well as to their parents because both are key ingredients in school learning. Parents who are informed are likely to be more involved, at home and at school, with their children's learning and are more likely to work cooperatively with the teacher.

Students, too, must be informed about their own test results because they make countless decisions about their own instructional involvement: whether to participate, how much to participate, how much effort to devote, and what kind of personal standards to adopt. And unless students are made aware of their [...]