Identification and Assessment, Volume 16 (Advances in Learning and Behavioral Disabilities)


Author: Margo A Mastropieri | Thomas E Scruggs


ISSUES IN THE IDENTIFICATION OF LEARNING DISABILITIES

Thomas E. Scruggs and Margo A. Mastropieri

ABSTRACT

This chapter reviews problems in the identification of learning disabilities, with particular reference to issues involving discrepancy between IQ and achievement as a criterion for definition. Alternatives to present procedures for identification of learning disabilities are described. It is concluded that no presently proposed alternative meets all necessary criteria for identification of learning disabilities, and that radically altering or eliminating present conceptualizations of learning disabilities may be problematic. The major problems of identification of learning disabilities – including overidentification, variability, and specificity – can be addressed, it is suggested, by increasing specificity and consistency of state criteria and strict adherence to identification criteria on the local implementation level. However, further research in alternative methods for identifying learning disabilities is warranted.

ADDRESSING THE PROBLEMS OF IDENTIFICATION OF LEARNING DISABILITIES

The phenomenon of learning disabilities (LD) has intrigued and challenged professionals for over a century, since its early descriptions by clinicians of individuals of apparently normal ability, who nevertheless revealed pronounced

Identification and Assessment Advances in Learning and Behavioral Disabilities, Volume 16, 1–36 © 2003 Published by Elsevier Science Ltd. ISSN: 0735-004x/doi:10.1016/S0735-004X(03)16001-6


difficulties with typical academic tasks. Nineteenth-century writers, including Broadbent, Kussmaul and Morgan, identified some of the first cases (Hallahan, 2002). Hinshelwood (1917) described a case of a 10-year-old boy who exhibited severe reading difficulties. Despite documenting these severe problems with reading, Hinshelwood recorded that the boy “was apparently bright, and in every respect an intelligent boy . . . his progress in arithmetic has been regarded as quite satisfactory . . . his visual acuity is good” (pp. 46–47; see Hallahan, 2002; Torgesen, 1998, for historical overviews).

More recently, Mastropieri and Scruggs (2002) described the case of “Andrew,” a third grader in a middle-class suburban school with severe reading problems. Presented with an age-appropriate text, he read only six words per minute; on a much easier text, he read only eight words per minute. A free writing sample revealed preschool-level skills. Nevertheless, on an IQ test, Andrew received a Full Scale score of 104. His perceptual-motor functioning was average; his vocabulary was above average; his listening comprehension and verbal expression test scores were average, as were his behavior rating scales. His reading scores, however, were considerably below average (standard scores of 82 or 74, depending on the test used). Math and spelling scores were similarly low.

Andrew and the Hinshelwood case are representative of students with learning disabilities: students who, although apparently normal otherwise, exhibit unexpected underachievement in specific academic areas, usually reading, writing, and spelling.

Since the inclusion of learning disabilities as a category in the Education for All Handicapped Children Act (PL 94-142, 1975), concerns have been expressed over appropriate identification procedures. LD is far from alone in searching for better definitions; in fact, most disability categories face issues in identification.
For example, the American Association on Mental Retardation (formerly the American Association for the Study of the Feebleminded) has revised its 1921 definition of mental retardation nine times, in 1933, 1941, 1957, 1959, 1961, 1973, 1977, 1983, and 1992 (Beirne-Smith, Ittenbach & Patton, 2002). Nevertheless, the field of learning disabilities has experienced some very specific problems in identification (Bradley, Danielson & Hallahan, 2002; Kavale & Forness, 1995; Scruggs & Mastropieri, 1994–1995). Problems in identification are not always thought to be subject to simple remediation. As a consequence of these problems, it has sometimes been argued that the category of learning disabilities should be eliminated or significantly altered (Aaron, 1997; Algozzine, 1985; Lyon et al., 2001). As argued by Aaron (1997), . . . the concept of learning disability belongs in the history of science, not at the forefront of contemporary educational practice and research. When the [IQ-achievement] discrepancy


formula disappears from the educational scene, so will the concept of LD. After 40 years of wandering in the wilderness of learning disabilities, we are beginning to get a glimpse of the promised land (Aaron, 1997, p. 489; see also Spear-Swerling & Sternberg, 1996).

In this chapter, we describe the common characterizations of learning disabilities and review problems in identification. We then discuss proposed alternatives to the present procedures for identifying learning disabilities. Although a number of alternatives have been proposed, we suggest that none has been adequately tested, and that none meets all the criteria for identification that current conceptualizations of learning disabilities require. Finally, we suggest that the problems commonly identified can be addressed by other means, particularly by adopting more specific state criteria and applying these criteria carefully and faithfully to identification at the school level. This chapter is based largely on a previous work on issues in identification of learning disabilities (Scruggs & Mastropieri, 2002), in addition to information gained from presentation at and participation in the LD Summit recently held in Washington, DC (Mastropieri, 2001), and recently published summaries of papers and responses presented at that meeting (Bradley, Danielson & Hallahan, 2002).

WHAT ARE THE PROBLEMS IN IDENTIFICATION OF LEARNING DISABILITIES?

Since the development of the concept of learning disabilities and its inclusion in federal special education legislation, concerns have been raised about the legitimacy of the category. Although numerous and varied concerns have been raised, they can be generally organized as concerns with:

(a) overidentification of learning disabilities;
(b) variability in rates of identification;
(c) specificity of populations of students;
(d) conceptual considerations;
(e) issues involving IQ-achievement discrepancy formulae;
(f) early identification of learning disabilities; and
(g) problems in local implementation of state and federal criteria.

Too Many Students are Identified as LD

Over the past quarter century, the population of individuals identified with learning disabilities has increased about 150% to a level that represents over half



of all students with disabilities and over 5% of all students in school, a percentage far higher than that found in, for example, mental retardation or serious emotional disturbance (U.S. Department of Education, 2000). The rapid increase in identification rates is frequently the first fact noted in arguments that current definitions are problematic (e.g. Algozzine, 1985; Fuchs, Fuchs & Speece, 2002; Lyon et al., 2001). On one hand, the high identification rates may simply reflect the fact that more students have LD than have other types of disabilities. In general, more individuals are identified with milder forms of disabilities. For example, the number of students identified with mild mental retardation is much higher than the number of students identified with severe mental retardation (Beirne-Smith, Ittenbach & Patton, 2002). As the “mildest” category of disability, learning disabilities might be expected to contain the largest number of students. On the other hand, there may be other reasons for the high identification rate. High identification rates may in fact result from imprecision in federal and state definitions of learning disabilities. Other researchers have suggested that a confounding of different high-incidence disabilities has resulted in classifying as having learning disabilities individuals who previously might have been classified as having mental retardation (MacMillan, Siperstein & Gresham, 1996). Wong (1996) suggested that if the concept of learning disabilities has been overgeneralized, perhaps it reflects “teachers’ noble goal of teaching as many problem learners as possible, and not restricting instructional help only to students with learning disabilities” (p. 8; see also Torgesen, 1999, p. 108; Zigmond, 1993). Clearly, learning disabilities are not the only reason for school failure, and overidentification may occur when programs for other underachievers are not available.
Regardless of the reason, many consider that prevalence rates for learning disabilities have been “alarmingly high” (Algozzine & Ysseldyke, 1987). Wagner and Garon (1999) shared the view of many when they suggested, “the prevalence of this disability is likely to be closer to 1–3% of school-age children as opposed to recent estimates of 20–30%” (p. 100).

There is Too Much Variability in LD Identification

Proportions of individuals identified with learning disabilities have varied considerably across various agencies, including state departments of education and local educational authorities. Finlan (1992) reported considerable variation in state identification rates, with the lowest rate in Georgia (2.10%) and the highest in Rhode Island (8.66%). Coutinho (1995) examined more recent data and reported similar variability, from a low of about 2% (Wisconsin) to a rate of over 7% (Massachusetts) (see also Frankenberger & Fronzaglio, 1991; Lester



& Kelman, 1997). Although there has been some debate whether the variability in identification exceeds that of other disability areas (Algozzine & Ysseldyke, 1987; Hallahan, Keller & Ball, 1986), there is little doubt that the variability is considerable. This variability may reflect lack of consistency or precision in identification procedures. Finlan (1992), for example, observed a systematic relation between identification rate and whether or not states employed a specific discrepancy requirement, a relation suggesting that identification may be largely dependent on identification policies. Finlan reported that seven of the lowest 10 states in rates of identification employed a specific method of assessing a discrepancy, while only two of the highest 10 states in rates of identification employed a specific discrepancy requirement. This finding reflects the position of Dawes, Faust and Meehl (1989) that more objective procedures in identification may lead to more consistent outcomes. More recently, Lester and Kelman (1997) evaluated state identification rates and concluded that demographic and sociopolitical factors were related to some aspects of state prevalence of learning disabilities, although the same was not true of prevalence of physical disabilities. It does not necessarily follow, however, that there is no reason for actual prevalence rates to vary by geographical region, or that future procedures must lower variability in order for identification procedures to be valid. Characteristics of residents can vary by jurisdiction, and these characteristics can lead to lower vs. higher prevalence of any specific condition. For example, the mean percentage of low-birthweight children is 7.03% nationally, but individual percentages by state range from 4.7 to 15.4%.
This observation does not in itself suggest that procedures for identification of low birthweight should be modified so that percentages are the same in each state (see Lester & Kelman, 1997). Nevertheless, variability in identification must be understood far better than it is today, so that judgments can be made regarding whether such variability is the result of variability in the characteristics of state populations, or whether it is the result of inconsistently applied identification procedures. A true and complete understanding of the observed variability in identification can lead directly to proposals for policies to reduce such variability, or to a better understanding of existing variability in learning disabilities.

Learning Disabilities Lack Specificity

One of the perennial arguments against the learning disabilities category is that individuals with learning disabilities cannot be reliably distinguished from individuals with generally low achievement (LA) (e.g. Algozzine,



1985; Ysseldyke, Algozzine, Shinn & McGue, 1982). More specifically, it is argued that students with reading disabilities cannot be distinguished from “garden variety” poor readers (Fletcher et al., 1994; Spear-Swerling, 1999; Wagner & Garon, 1999). Shaywitz, Escobar, Shaywitz, Fletcher and Makuch (1992) suggested that “dyslexia” simply reflects the lower tail of a normal distribution of reading ability, in contrast to the findings of earlier research (Rutter & Yule, 1975). Ysseldyke et al. (1982) compared students who had been identified as having learning disabilities with students who had not been identified but had scored lower than the 25th percentile on achievement tests, and concluded that “there were no psychometric differences in the performances of the two groups of students” (p. 83). Citing this and the results of related investigations, Algozzine (1985) concluded that “the learning disabilities category has outlived its usefulness” (p. 73). In response, some researchers, re-analyzing research in which comparisons between learning disabilities and low achievement were made, have concluded that differences are reliable and substantial (e.g. Fuchs, Fuchs, Mathes & Lipsey, 2000; Kavale, Fuchs & Scruggs, 1994). Algozzine, Ysseldyke and McGue (1995) maintained that, even if some differences can be identified, the two groups can benefit from qualitatively similar instructional approaches, and therefore specific identification of learning disabilities is unnecessary (see also Lyon, 2001; Spear-Swerling, 1999). Fletcher et al. (2002) suggested, “the question is not so much whether children defined as IQ discrepant and low achieving are different, but how much they differ and whether the differences are meaningful for research and practice” (p. 205).

Learning Disabilities Lacks Conceptual Clarity

It has been argued that problems in identification arise in part from problems in conceptualizing the definition of learning disabilities. Kavale and Forness (2000) evaluated previous definitions and suggested,

. . . the failure to produce a unified definition has meant that LD lacks two critical elements: understanding – a clear and unobscured sense of LD – and explanation – a rational exposition of the reasons why a particular student is LD (p. 240).

Problems in identification will continue if there is not general agreement on the concept of learning disabilities (Kavale & Forness, 1995; Kavale, Forness & Lorsbach, 1991). Part of the difficulty arises from the lack of commonly agreed-upon positive measures of learning disability; that is, there is no test for learning disabilities, as there are IQ tests to be used as an important measure for identification of mental retardation. Another source of difficulty is the partial reliance on exclusionary criteria – that is, things that learning disabilities cannot be, including mental



retardation, emotional disturbance, or cultural disadvantage (Kavale & Forness, 2000).

IQ-Achievement Discrepancies are Problematic

IQ-achievement discrepancy criteria in the identification of learning disabilities have been widely employed (Schrag, 2000); however, discrepancy criteria have been criticized by many, for a variety of conceptual and statistical reasons (Aaron, 1997; Mastropieri, 1987; Scruggs, 1987; Spear-Swerling, 1999; Stanovich, 1991). Technical problems include the degree of measurement error associated with discrepancy methods, in that a difference score carries the measurement error of both of its component measures. Another problem is statistical regression (Cone & Wilson, 1981; Shepard, 1980), whereby individuals who have been identified for low performance on one measure (e.g. academic achievement) can be expected to score higher on a second measure correlated with the first (e.g. IQ). Fletcher et al. (2002), cited previously, referring generally to research suggesting considerable overlap between IQ-discrepant samples and low achieving samples, concluded, “consistent with the call of many researchers, the viability of the IQ-discrepancy classification hypothesis must be questioned” (p. 205). Implicit within arguments against use of discrepancy criteria are criticisms of IQ tests. It has been argued that students exhibiting discrepancies do not differ from poor readers with lower IQs on many reading, spelling, language, or memory tests (Siegel, 1999). It has also been suggested that IQ does not predict reading achievement (Fletcher et al., 1998; Vellutino, Scanlon & Lyon, 2000), although this position has been challenged (Naglieri, 2001). Stanovich (1991) argued that discrepancy definitions as applied to reading disability are threatened by findings that literacy acquisition itself develops the cognitive skills revealed on aptitude measures such as IQ tests.
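
The two technical problems noted above, additive measurement error and statistical regression, can be made concrete with a small numerical sketch. The Python below uses the standard classical test theory expressions for the reliability and standard error of a difference score, and the usual linear-regression expectation for a correlated second measure. The reliabilities (0.95 and 0.90), the IQ-achievement correlation (0.60), and the standard-score scale (mean 100, SD 15) are illustrative assumptions, not values taken from the studies cited here, and the function names are hypothetical.

```python
import math


def difference_score_reliability(r_xx, r_yy, r_xy):
    """Reliability of X - Y when X and Y share the same SD.

    r_xx, r_yy: reliabilities of the two tests; r_xy: their correlation.
    """
    return ((r_xx + r_yy) / 2 - r_xy) / (1 - r_xy)


def sem_of_difference(sd, r_xx, r_yy):
    """Standard error of measurement of X - Y.

    The error variances of both tests add, so the difference is noisier
    than either score taken alone.
    """
    return sd * math.sqrt(2 - r_xx - r_yy)


def expected_second_score(first_score, r_xy, mean=100.0):
    """Expected score on a correlated second measure (same mean and SD).

    Shows regression toward the mean for a student selected on the
    first measure.
    """
    return mean + r_xy * (first_score - mean)


# Illustrative (assumed) values, not taken from the chapter.
r_iq, r_ach, r_between = 0.95, 0.90, 0.60

print(difference_score_reliability(r_iq, r_ach, r_between))
print(sem_of_difference(15, r_iq, r_ach))
print(expected_second_score(70, r_between))
```

Under these assumed values, the difference score is less reliable than either test alone, and a student selected for an achievement score of 70 would be expected to obtain an IQ score of about 82, closer to the mean, even in the absence of any specific disability. This is the regression effect described by Cone and Wilson (1981) and Shepard (1980).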
Some research has suggested that IQ-achievement discrepancies are not predictive of academic growth rates (Lyon, 2001; Vellutino, Scanlon & Lyon, 2000, but see Speece & Case, 2001, for a different perspective). Embedded in this argument is a concern for the value of IQ testing in special education. Fletcher et al. (2002) maintained, “The concept of IQ as it is applied to LD is outmoded and reflects an obsolete practice . . .. IQ tests do not measure aptitude for learning or provide an index of response to intervention” (p. 234). Fletcher et al. (2002) also maintained that IQ score does not appear to be an indicator of the slow learner, and that “there is not natural subdivision that demarcates mental deficiency from LD” (p. 234). Speece (2002) reported on the



results of a recent survey of members of editorial boards in reading and learning disabilities. Fewer than half of the respondents agreed that IQ should be used as a component of a definition of LD, and none of the respondents felt that IQ was the most important consideration. However, respondents who considered that exclusion clauses were important most often chose mental retardation as a criterion, a category of exceptionality generally identified as having low IQ. According to Kavale and Forness (1995), the major problem with discrepancy is how it has been applied – that is, that “the prominence of discrepancy in conceptualizations of LD has resulted in its reification and deification” (p. 162). In many instances, discrepancy (underachievement) has come to be seen as the same thing as learning disabilities, rather than one possible component of a conceptual understanding of learning disabilities. In fact, there are many reasons why a student may exhibit IQ-achievement discrepancies without having learning disabilities. One of the present authors recalls a new student who was suspected of having a learning disability, based on a substantial discrepancy between IQ and achievement. However, this student was observed to make remarkable progress in reading during the first few weeks of the school year. After an examination of the student’s record, it became clear that the observed IQ-achievement discrepancy was probably the result of chronic nonattendance at school in previous years. In other instances, discrepancies have been observed in students of presumed socio-cultural disadvantage (e.g. Scruggs & Cohn, 1983). In such instances, identification procedures rely on exclusionary considerations to determine that adequate educational opportunity has not been provided, and that the characterization of learning disabilities therefore cannot be applied.
Criticisms of discrepancy criteria are commonly made. Aaron (1997) argued passionately against the use of discrepancy criteria in the identification of learning disabilities. Lyon et al. (2001) reviewed literature relevant to discrepancy procedures and concluded, “The IQ-achievement discrepancy, when employed as the primary criterion for the identification of LD, may well harm more children than it helps” (p. 266; see also Fletcher et al., 2002). Such concerns suggest that other procedures for identification may be preferable.

Students with Learning Disabilities are Not Identified Early Enough

Keogh and Becker (1973) many years ago alerted the field to the importance of early identification and treatment and the dangers of misidentification (see also Keogh, 1986). Haring et al. (1992) asked how, if the condition of learning disabilities is defined as a lack of academic progress, such a determination could be made at the preschool level. Because of this, different characterizations should be



applied. Others have suggested that it may be better not to wait for failures to occur prior to implementing interventions. Fletcher and Foorman (1994) suggested that learning problems are more easily remediated in the earlier grade levels, and concluded that “the focus should be on prevention and early intervention for children at risk for developing learning difficulties” (p. 187). However, models of identification that emphasize observed academic failure may lose valuable time needed for treatment (Hallahan, Kauffman & Lloyd, 1999). Lyon et al. (2001) argued that the present IQ-achievement discrepancy criterion requires a “wait to fail” model: many children cannot be reliably identified as having learning disabilities until they have conclusively “failed” in academic learning. This determination might not be made before third grade because of psychometric limitations in discrepancy criteria (p. 269), and, in fact, a large number of students are identified at about third grade. It is true that grade-level discrepancies, which require, for example, academic functioning two years below grade level, may be problematic. For a two-year discrepancy to be observed, it is necessary that students be enrolled in school at least two years; this may in some cases eliminate referral before 3rd grade. However, grade-level discrepancies are rarely the dominant consideration in learning disabilities identification today. IQ-achievement discrepancies based on standard scores, which are more generally employed, do not necessarily require long periods of academic failure, as students can exhibit severe discrepancies between ability and standardized academic achievement scores early in the 1st grade. For example, Horn and O’Donnell (1985) found that discrepancy scores, even in the first few weeks of 1st grade, were better predictors of future learning disabilities classification than was a more general low-achievement criterion.
One reason this “wait to fail” argument against discrepancy criteria has been so generally used may be that many students in fact may not be identified until the third grade, and valuable instruction time is consequently lost. However, the reluctance to identify students with learning disabilities may not be due to the discrepancy criteria, as much as simply a general reluctance to refer students to special education at very early ages. This is unfortunate, however, since early treatment can be effective (Mastropieri, 1988).

Identification Procedures are Not Implemented Appropriately on the Local Level

A critical component of the identification process, often lost in purely conceptual or technical analyses of the issues, is the individual or group judgment of professionals who are part of the referral and eligibility process. These judgments can be critical in identifying students who might otherwise be missed, and can help avoid the false



negatives or false positives that may result from blindly applying the quantitative criteria. MacMillan, Gresham and Bocian (1998) maintained, “the assignment of responsibility to a committee clearly conveys the desire to receive input from varied perspectives and not rely exclusively on test scores” (p. 322). On the other hand, more systematic “actuarial” procedures can result in much more consistent and systematic identification methods, and can be key factors in reducing variability in identification (Dawes, Faust & Meehl, 1989; Ysseldyke, Algozzine, Richey & Graden, 1982). In response to such issues, Meehl (1973) asked, “When should we use our heads instead of the formula?” (p. 81). Clearly there is a need for both judgment and objective measures, but care must be applied. At present, it appears that the identification process is employed less systematically than is desirable, and that perhaps the process itself contributes to overidentification and inappropriate variability. Gresham (2001) described the identification process as involving inconsistent criteria, including local criteria for school failure and national norms for assessment. Decisions are frequently based upon the degree to which the student could be expected to profit from special education, rather than on established criteria. Investigations of school practices have revealed that precise identification criteria are often not employed. Gottlieb, Alter, Gottlieb and Wishner (1994) reported that a sample of urban students identified with learning disabilities had an average IQ of 81.4, and concluded, “these children today are classified as learning disabled, when in fact most are not.” MacMillan et al.
(1998), using a sample from California, concluded that fewer than half of the students identified as having learning disabilities met state requirements, and concluded that “public school practices for diagnosing children with learning disabilities bear little resemblance to what is prescribed in federal and state regulations” (p. 123). In an earlier investigation, MacMillan et al. (1996) had identified a number of students who had been classified as having learning disabilities, with IQ scores lower than 75, and as low as 58. McLeskey and Waldron (1990) examined the records of 1,742 Indiana students referred for evaluation for learning disabilities eligibility, and reported that students identified as having learning disabilities (52%) differed markedly from students referred but not identified (48%). However, over one third of students identified as having learning disabilities did not meet existing state criteria. MacMillan and Siperstein (2002) reviewed relevant literature and concluded: It is evident that the “concept” of LD used by the schools deviates markedly from the original concept of LD articulated in authoritative definitions. We have no doubt the SI [school identified] LD population reflects a group of children who do, in fact, need assistance; however, among the children identified as LD by the schools are subsets never considered in previous descriptions of LD that acknowledged the heterogeneity present in the original conception of LD (problems



in reading, writing, mathematics, verbal expression, etc.). Today we find children classified as LD who would more appropriately be classified as MR or ED if diagnostic criteria were applied rigorously. As long as the LD category absorbs children with IQ scores in the 70–85 range, as well as those with scores below 70, we will never clean up the LD category (p. 319).

Although identification procedures may be more carefully applied in different regions (see Wilson, 1985, for a more positive example), evidence to date suggests that a very substantial source of overidentification of learning disabilities is the application (or misapplication) of state-established criteria at the local level. These data suggest that numbers of students identified could be reduced by as much as one-third or more simply by consistently and systematically applying state criteria at the local level.

HOW CAN WE CHARACTERIZE LEARNING DISABILITIES?

Problems in identification have led some to argue that learning disabilities do not exist as a viable condition. Some have argued that learning disabilities have been socially or organizationally constructed to serve the interests of certain aspects of society rather than children (Christensen, 1999; Coles, 1987; Skrtic, 1999; Sleeter, 1995; see Kavale & Forness, 1998, for a review). Others have suggested the concept is not useful, in that students characterized as having learning disabilities are not distinguishable in any meaningful way from other poor readers or low achievers not characterized as having learning disabilities (Aaron, 1997; Fletcher et al., 1994; Shaywitz, Fletcher et al., 1992; Ysseldyke et al., 1982). These issues have led researchers in two different directions over the years (Lester & Kelman, 1997). Some, on the one hand, have suggested that improvements in precision of identification procedures will lead to resolution of these issues, and support treatment of those who “truly” have learning disabilities. For example, Reynolds (1984) argued:

The tremendous disparities in measurement models adopted in the various states in their respective learning disability guidelines and the varying levels of expertise are obvious, major factors contributing to the difference in the proportion of children served as LD in the various states. Lack of a specific definition, improper or lack of application of the severe discrepancy criterion, and the failure to develop appropriate mathematical models . . . are the primary, certainly interrelated causes of these disparities (p. 455; see also Keogh, 1994).

On the other hand, others have suggested that the concept of learning disabilities has little or no practical utility and should be abandoned:


It is time to quit viewing eligibility decision making as a technical problem. It means putting an end to efforts to try to find better ways of defining concepts and conditions that cannot be defined and may not exist (Algozzine & Korinek, 1985, pp. 392–393).

Regardless, any accounting of the existence of learning disabilities, positive or negative, must in some way consider the existence of descriptive reports dating back to the 19th century or earlier (see Anderson & Meier-Hedde, 2001; Berlin, 1887; Dejerine, 1892; Hinshelwood, 1896; Kussmaul, 1877), the proliferation of professional and advocacy organizations, such as the Council for Learning Disabilities, the Division for Learning Disabilities of the Council for Exceptional Children, and the Learning Disabilities Association of America, and the thousands of service providers who not only testify to the existence of learning disabilities from their own experience, but who also advocate for improved services (see Scruggs & Mastropieri, 1988). Although issues in identification are far from resolved, a consistent set of observations characterizing learning disabilities has been reported over the years. Wong (1996) suggested, . . . those very characteristics observed by parents, educators, psychologists, and medical professionals about children with learning disabilities in 1963 are the very same characteristics that we see today in children, adolescents, and adults with learning disabilities! (Wong, 1996, p. 22).

As an example, Mastropieri and Scruggs (2002) described the case of “Andrew,” a middle third-grader in a middle-class, suburban school, referred to previously. His reading rate was painfully slow. For a “free writing” activity, he drew a picture which seems to be a comet, and wrote below the picture the single sentence, “It vush the oue wun” [it was the only one]. Psycho-educational testing revealed considerable variability: a Full-Scale IQ of 104, but a reading standard score of 72, with similar deficits in math and spelling. His vocabulary is above average; his listening comprehension and verbal expression scores are almost exactly average. His teachers described him as a good-natured student with good verbal ability, but who exhibited significant problems in reading and math concepts. As such, Andrew provides a classic example of a student who exhibits many of the characteristics considered “typical” of learning disabilities:

In the case of Andrew, failure of this alert, cooperative third grader to perform adequately in academic areas was painfully obvious, as was his need for special help. Analysis of his test scores revealed that he exhibited discrepancies in reading of 22 or 30 points, depending on which measure is used, 15 points in spelling, and 31 points in math. His mental ability was normal, as were his vision, hearing, and motor abilities. He was not deprived of necessary cultural exposure, and had received normal educational experiences. Are such considerations inadequate or misguided in identifying Andrew as having a learning disability? We think not . . . (Mastropieri & Scruggs, 2002, p. 453).



Kavale and Reese (1991) surveyed 547 teachers of students with learning disabilities in the State of Iowa, and found substantial agreement on their conceptualizations of the nature of learning disabilities. Over 80% of the teachers agreed that the condition of learning disabilities was associated with the following: (a) an ability-achievement discrepancy; (b) intra-individual differences resulting in learning strengths as well as learning weaknesses; (c) academic strengths as well as academic weaknesses; (d) a processing deficit that interferes with learning ability; (e) average or above intelligence; (f) a need for special materials and instructional techniques; and (g) the ability to learn at a different rate than students with mental retardation (pp. 146–147). These statements correspond closely with the three components of learning disabilities consistently mentioned over the past 30 years: “(1) unexpected low achievement relative to aptitude or ability; (2) deficits and uneven profiles in specific perceptual or cognitive processes; and (3) evidence of within-child, presumably causal, neurological condition(s)” (Keogh, 1994, p. 16). However, Keogh acknowledged, “although they may have validity on a construct level, these definitional criteria present serious problems of measurement when making diagnostic decisions or assigning individuals to classes” (p. 16). Wong (1996) provided an overview of many commonly reported features of learning disabilities, including the following: Learning disabilities are associated with unexpected underachievement. This unexpected nature of learning disabilities is a commonly cited characteristic (Keogh, 1994). It is not completely separate from other considerations, because the unexpected or unanticipated nature of learning disabilities has much to do with the concept of deficits in cognitive processing in spite of normal intelligence. Learning problems associated with, e.g. 
low intelligence, sensory impairments, or cultural disadvantage, in contrast, are “expected,” and provide a direct conceptual contrast with learning disabilities (Kavale & Forness, 1995). The idea of underachievement in spite of normal general ability has been an essential component of the concept of learning disabilities since the earliest reports of over a century ago. Morgan (1896, p. 1378) described a case of “word-blindness” in a 14-year-old boy with severe reading problems as being “bright and of average intelligence in conversation.” Hinshelwood’s (1917) 10-year-old boy with severe reading problems was similarly described as intelligent. More recently, Mastropieri



and Scruggs’ (2002) case of a student with average abilities in intelligence and general language skills, who nevertheless exhibited very considerable difficulty in math and literacy tasks, provides an example of this unexpected underachievement. That this low level of achievement was not expected from the student’s known intellectual or linguistic functioning is integrally linked to a general understanding of learning disabilities. Learning disabilities are multifaceted in nature, in that they may be manifest in a variety of areas of academic and/or psychological functioning. Although most students identified as having learning disabilities exhibit problems in the literacy areas of reading, spelling, and written language, areas for which considerable comorbidity has been identified (Beitchman, Cantwell, Forness, Kavale & Kauffman, 1998), specific problems in mathematics are often identified (American Psychiatric Association, 2000; Baroody & Ginsburg, 1991; Fleischner, 1994; Kosc, 1981; McLeskey & Waldron, 1990). In some cases these problems may be associated with problems in reading or spelling (Scruggs & Mastropieri, 1986); in other cases, there may be no such association (Rourke, 1989; Strang & Rourke, 1983). In addition, problems in more than one academic area may be associated with deficits in specific cognitive processes, such as memory (Cooney & Swanson, 1987; Siegel & Ryan, 1988; Swanson, 1993) or attention (Hallahan & Cottone, 1997). Even among students who exhibit reading disabilities, it is unclear whether observed deficits in phonological awareness (Torgesen, 1999) underlie all reading disorders, and are also responsible for the deficits in reading comprehension that are so common in this population (Mastropieri & Scruggs, 1997; Sternberg, 1999). 
A multivariate analysis of a large data set (Kavale & Nye, 1985–1986, 1991), revealed that learning disabilities is most consistently related to deficits in reading and math, as related to deficits in linguistic (e.g. semantic, syntactic, phonological), neuropsychological (e.g. selective attention, memory, cognitive style), and social/behavioral functioning (e.g. interpersonal perception, intrapersonal perception). These results suggest that learning disabilities are multifaceted, rather than unitary in nature. In response to this multifaceted nature of learning disabilities, attempts have been made to identify “subtypes” of learning disabilities (e.g. Lyon, 1985; McKinney, Short & Feagans, 1985). However, general agreement has not been reached on the nature of learning disabilities subtypes, or how they should be identified (Kavale & Forness, 1987). Learning disabilities involve intraindividual differences or deficits, and result in uneven profiles in specific perceptual or cognitive processes. Individuals with learning disabilities are frequently observed to have relative patterns of strengths and weaknesses (Keogh, 1994). Kirk (1971) described these patterns



as intraindividual differences, while Gallagher (1966) referred to them as “imbalances.” These intraindividual differences are not necessarily revealed in large group summaries of subtype scores (e.g. WISC profiles, Kavale & Forness, 1984), since group averages reduce overall intraindividual differences. However, researchers and practitioners have frequently noted relative strengths and weaknesses within the patterns of functioning of individuals with learning disabilities (Wong, 1996). Learning disabilities are associated with within-child, presumably neurological conditions. As such, learning disabilities are not thought to be primarily due to sensory, motor, or intellectual deficits, or cultural disadvantage. This characteristic is included in many definitions of learning disabilities (Kavale & Forness, 2000), and has contributed both to conceptual models of learning disabilities, and to the most severe criticisms of the concept of learning disabilities. Cognitive deficits of known neurological origin have been difficult to measure directly, and problems of reliability and validity of psychological “process” measures called into question the issue of neurological functioning (e.g. Larsen & Hammill, 1975). Mercer, Forgnone and Wolking (1976) suggested, “due to the vague nature of the concept of a process disability, a description of it for the purpose of analyzing definitions of LD is subject to criticism” (p. 378). However, a consideration of problems in cognitive processing was helpful in redirecting previous assumptions that learning disabilities generally reflect visual perceptual processing difficulties, and require perceptual motor training (Kavale & Mattson, 1983). 
Modern techniques, including functional magnetic resonance imaging (fMRI) techniques, have provided support for a neuropsychological basis for learning disabilities (Hynd, Clinton & Hiemenz, 1999), and for the view that “both anatomical and physiological signatures of dyslexia exist,” for example, in decreased activation of the left temporoparietal and superior temporal cortex during phonological processing (Rumsey, 1996, p. 72). In addition, research from autopsy studies (Galaburda, 1991), positron emission tomography (PET) scans (Wood, Flowers, Buchsbaum & Tallal, 1991) and genetic studies (Olson, 1999) has tended to support these findings. Hynd et al. (1999) suggested, “learning disabilities are most appropriately viewed from a neuropsychological perspective” (p. 60). Whether or not such a characteristic is a useful component of a definition of learning disabilities (Spear-Swerling, 1999), the idea that learning disabilities involve neuropsychological dysfunction – which may be manifest as disorders of cognitive processes – has been prominent in definitions of learning disabilities. Further, processing deficits in students with learning disabilities have been observed in measures of cognitive processing, including attention, memory, and linguistic processes (Hallahan, Kauffman & Lloyd, 1999).



WHAT ALTERNATIVE PROCEDURES FOR IDENTIFICATION HAVE BEEN PROPOSED?

Proposed Alternatives

Issues in the identification of learning disabilities have tended toward two very different perspectives. On the one hand are those who have argued that the concept of learning disabilities is problematic, and should be abandoned entirely. Others, on the other hand, acknowledge the problems in identification, but nevertheless support the overall concept of learning disabilities. Thirty years ago, McCarthy (1971) wrote:

The most important decision you will make is that of definition – because your definition will dictate for you the terminology to be used in your program, the prevalence figure, your selection criteria, the characteristics of your population, and the appropriate remedial procedures (p. 14).

Issues of definition, identification, and assessment are of critical importance to the field of learning disabilities. Several alternatives to the present identification procedures have been proposed, many of which overlap in some of their considerations. These alternative identification procedures include the following:

The Double Deficit

Wolf and Bowers (1999) proposed criteria for identification of learning disabilities that were based on observed deficits on phonological analysis tasks and rapid continuous naming of digits and letters. These two criteria, referred to as a double deficit, have also been seen to discriminate between students with reading disabilities and normally achieving readers in the early grades. Allor (2002) reviewed 16 studies of phonemic awareness and rapid naming conducted between 1990 and 1997, and concluded that performance on each of these two tasks contributes uniquely to development of word reading. Continued efforts to evaluate the double-deficit hypothesis were recommended. However, Ackerman, Holloway, Youngdahl and Dykman (2001) investigated a sample of 101 elementary school children with and without learning disabilities, and found that these children differed on a number of tasks in addition to the double deficit criteria, including orthographic tasks, attention, arithmetic achievement, and most WISC-III factors. They reported that students who met double deficit criteria were no more limited in reading and spelling than those with a single deficit in phonological analysis.

Phonological Process Core Difference Model

Stanovich (1988) suggested that differences in phonological processes could discriminate between “dyslexic” and “garden-variety poor readers,” particularly



during the early grade levels. Torgesen and Wagner (1998), similarly, have proposed tests of phonological awareness, rapid automatic naming, and verbal short-term memory in the identification of reading disabilities at the early grade levels (Wagner, Torgesen & Rashotte, 1999). Fletcher, Francis, Shaywitz, Lyon, Foorman, Stuebing and Shaywitz (1998) suggested that learning disabilities classification could be considered for any student failing to reach the 25th percentile for specific reading skills, including core reading process measures. They suggested that this could result in higher numbers identified, but may have greater relevance than present policy-based decisions which, in their view, merely fulfill a “gatekeeping function” (p. 199), that is, to reduce the number of referrals. Torgesen (2002) recommended a procedure that involved process-marker variables for early identification, and response to treatment (discussed subsequently) variables for later diagnosis. According to this model, all students in early grades exhibiting problems with phonological awareness and rapid automatic naming would receive special instruction and periodic assessment of progress to determine the need for continued specialized interventions. If problems persisted through 2nd or 3rd grade (or higher), students could be identified as having learning disabilities.

Chronological Age Definitions

Siegel (1989) suggested that IQ is not relevant to the concept of learning disabilities. She maintained that the influence of IQ and achievement is bidirectional, in that lack of achievement may in time exert a negative influence on IQ. She suggested that students should be identified as having learning disabilities if they score below age expectancy in achievement (particularly, in reading of pseudowords) and are not found to be mentally retarded. 
Lyon and Fletcher (2001) also suggested that discrepancy criteria be abandoned and replaced by comparing achievement in academic areas with age and grade levels.

Bayesian Procedures

A number of years ago, Alley, Deshler and Warner (1979) proposed a method for identification based upon earlier models proposed by Wissink, Kass and Ferrell (1975). Bayes’ formula combines prior information with current information to determine the probability of learning disabilities. General education teachers complete a checklist for students who are demonstrating learning problems. A numerical weight is assigned to each factor on the checklist, the weight depending on each factor’s odds of defining learning disability. A formula then combines the probability data. For example, although reading decoding considered independently had a probability of only 0.21 of prediction of learning disabilities, decoding in conjunction with reading recognition in conjunction with detecting



spelling errors in conjunction with problems with math algorithms resulted in a 0.96 probability of prediction of learning disabilities. It was suggested that additional testing be done to verify these conclusions.

Component Model of Reading Disability

Aaron (1997) proposed a model of identification that was “not intended to classify, categorize, or label poor readers” (p. 476). Aaron suggested that students be given standardized tests of listening comprehension and reading comprehension. Students who score in the average or high range on listening comprehension but fail to achieve at this level in reading comprehension “are usually hampered by poor decoding skills” (p. 477), which could be confirmed on a measure of reading decoding. Similarly, students can be evaluated for weak comprehension skills, or weakness in both word attack and comprehension. Programs, then, can be implemented that emphasize decoding, comprehension, or both.

Neuropsychological Assessment

Rourke (1993) described a battery of neuropsychological assessments for the identification of learning disabilities, including tactile-perceptual tests, visual-perceptual tests, auditory-perceptual and language related tests. He suggested that these measures could also discriminate among subtypes of learning disabilities, such as nonverbal learning disabilities.

Assessment of Cognitive Processing

Several researchers proposed using tests of cognitive processing for providing direct evidence for identification of learning disabilities. Naglieri (2001) described the Cognitive Assessment System (Naglieri & Das, 1997), and Swanson (1993) described use of the Swanson Cognitive Processing Test as a dynamic assessment measure. Each of these measures is intended to identify individual differences in information processing, and could be potentially useful in the identification of learning disabilities. 
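The way a Bayesian checklist of the kind described earlier under Bayesian Procedures combines prior information with several indicators can be illustrated with a minimal sketch. It is not the formula or the weights used by Alley, Deshler and Warner (1979): the prior, the indicator names, and the likelihood ratios below are hypothetical, and the sketch assumes the indicators are conditionally independent, which the source does not state. It does not attempt to reproduce the reported probabilities of 0.21 and 0.96.

```python
def posterior_probability(prior, likelihood_ratios):
    """Odds form of Bayes' rule: posterior odds = prior odds x product of ratios.

    Assumes (hypothetically) that the indicators are conditionally independent.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    odds = prior / (1.0 - prior)          # convert probability to odds
    for ratio in likelihood_ratios:
        odds *= ratio                     # each indicator rescales the odds
    return odds / (1.0 + odds)            # convert odds back to probability


# Hypothetical checklist weights (likelihood ratios); illustration only,
# not values taken from the chapter or from the cited studies.
CHECKLIST_WEIGHTS = {
    "reading_decoding": 2.0,
    "reading_recognition": 3.0,
    "detecting_spelling_errors": 2.5,
    "math_algorithms": 4.0,
}


def checklist_probability(prior, observed_indicators):
    """Probability after combining the prior with the indicators a teacher checked."""
    ratios = [CHECKLIST_WEIGHTS[name] for name in observed_indicators]
    return posterior_probability(prior, ratios)


if __name__ == "__main__":
    single = checklist_probability(0.1, ["reading_decoding"])
    combined = checklist_probability(0.1, list(CHECKLIST_WEIGHTS))
    print(f"one indicator: {single:.2f}; all four indicators: {combined:.2f}")
```

Under these assumptions a single indicator moves the probability only modestly, while several indicators in conjunction move it much further, which is the qualitative pattern the checklist approach relies on.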
Operational Interpretation

Kavale and Forness (1995, 2001) proposed a five-level process of identification that included: (1) underachievement or discrepancy to be used as a necessary but not sufficient first criterion; (2) significant deficits in basic skills, focusing specifically on the major academic areas of reading, writing, language, and math; (3) deficits in the efficiency of learning, including strategy use and learning rate;



(4) psychological process deficits, including attention, memory, linguistic processing, and metacognition; (5) an exclusionary clause, ruling out alternative causes of learning failure, such as mental retardation, sensory impairment, emotional disturbance, or inadequate instruction. Only when all five operational criteria are met should students be identified as having learning disabilities.

Failure to respond to validated treatment protocols, or dual discrepancy criteria.

One of the major problems in identification of learning disabilities is in discrimination between constitutional and experiential deficits. Berninger and Abbott (1994) argued that sufficient “opportunities to learn” are often assumed rather than demonstrated, and proposed development of validated protocols of which treatment approaches work best for which learning characteristics (based upon static assessment of learning in 11 domains). When this information is available, learning disabilities can be diagnosed on the basis of treatment nonresponding; that is, when a student fails to make expected gains in spite of educational treatments that have demonstrated positive results for similar learners. Fuchs and Fuchs (1998; see also Speece & Case, 2001) suggested that lower academic functioning could be evaluated dynamically, through identification of deficits in level and slope of learning, as evaluated by curriculum-based measurement (CBM) procedures. Resistance to instruction would be demonstrated by slopes on probes of academic skills, including reading and math. Speece and Case (2001) implemented these procedures on students in the area of reading, grades K-2, and identified samples that differed in some respects (e.g. they were lower in age and mean IQ) from students identified according to traditional discrepancy formulas. 
Al-Otaiba (2000) suggested, given widespread and converging evidence that phonological processing deficits often lead to learning disabilities, unresponsiveness to effective treatment protocols can provide an alternative to discrepancy-based formulae currently used to identify students with learning disabilities (p. 12; see also Al-Otaiba, this volume; Berninger & Abbott, 1994; Fuchs & Fuchs, 1998; Vellutino et al., 1996).
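The idea of evaluating both level and slope of learning from CBM probes can be made concrete with a minimal sketch. The decision rule below is an assumption made for illustration, not a rule given in the studies cited above: a student is flagged only when both the final probe score (level) and the least-squares growth rate (slope) fall more than one standard deviation below the corresponding classmate means. The function names, the cutoff, and the weekly scores are all hypothetical.

```python
from statistics import mean, pstdev


def ols_slope(scores):
    """Least-squares slope of equally spaced probe scores (gain per probe)."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two probes are needed to estimate a slope")
    week_mean = (n - 1) / 2
    score_mean = mean(scores)
    numerator = sum((w - week_mean) * (s - score_mean) for w, s in enumerate(scores))
    denominator = sum((w - week_mean) ** 2 for w in range(n))
    return numerator / denominator


def shows_level_and_slope_deficit(student_scores, peer_scores, cutoff_sd=1.0):
    """Flag a student whose level AND slope both fall below the peer cutoffs.

    cutoff_sd is a hypothetical threshold: the number of peer standard
    deviations below the peer mean that counts as a deficit.
    """
    peer_levels = [scores[-1] for scores in peer_scores]       # level = final probe
    peer_slopes = [ols_slope(scores) for scores in peer_scores]
    level_cutoff = mean(peer_levels) - cutoff_sd * pstdev(peer_levels)
    slope_cutoff = mean(peer_slopes) - cutoff_sd * pstdev(peer_slopes)
    low_level = student_scores[-1] < level_cutoff
    low_slope = ols_slope(student_scores) < slope_cutoff
    return low_level and low_slope


if __name__ == "__main__":
    # Hypothetical weekly words-read-correctly scores for four classmates.
    classmates = [
        [20, 22, 24, 26, 28],
        [22, 24, 27, 29, 31],
        [18, 21, 23, 25, 28],
        [25, 27, 29, 32, 34],
    ]
    print(shows_level_and_slope_deficit([10, 10, 11, 11, 12], classmates))
```

Requiring both conditions means that a student who starts low but grows at a rate comparable to classmates is not flagged, which is the distinction between low achievement and resistance to instruction that this approach is meant to capture.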

Fuchs, Fuchs and Speece (2002) referred to unsatisfactory level of performance in addition to inadequate rate of growth as a “dual discrepancy” (p. 34). They described in detail the “treatment validity model” and how it could be used in the identification process. In Phase I, the regular classroom environment is assessed to determine whether it is sufficiently supportive to warrant identification of individual students as having learning disabilities. If this is not the case, the intervention is conducted at the classroom level to increase the student academic



growth rate to a level comparable with the school, district, or nation. If classroom instruction is considered appropriately supportive, Phase II assessment is used to identify students with dual discrepancies, that is, students who are functioning dramatically below peers in level and slope of performance on, e.g. reading rate. In Phase III assessment, a determination is made whether adaptations in the general education classroom can produce acceptable learning for individual students. If these adaptations are unsuccessful, CBM is used, in Phase IV, to determine whether learning disability classification and special education placement is effective for a given student. If not, labeling and placement in special education is not considered justifiable for that student. Recently, representatives of the National Center for Learning Disabilities (NCLD), the National Association of School Psychologists (NASP), and the International Reading Association (IRA) proposed a similar approach, applying CBM to identify students with learning disabilities (Horowitz, Lichtenstein & Roller, 2002). They recommended that students at risk for academic failure should be provided with a “research-based, general education intervention” (p. 2) for 8–10 weeks, monitored by CBM. Students who do not display appropriate gains in slope and level of learning progress on CBM measures are considered potential candidates for special education. This intervention process is expected to eliminate the necessity of IQ-achievement discrepancies. It was also suggested that “at risk” status could be determined by failure to meet a 20th percentile cutoff on standardized, norm-referenced academic measures, and that failure to make “meaningful gains” could be determined by failure to achieve growth at the 20th percentile or higher on CBM measures. Horowitz et al. suggested that there is “sufficient evidence” (p. 
3) to suggest that these procedures could be generally implemented in school settings, and that they would have substantial impact. It was argued, rather implausibly, that these procedures would not change the numbers of children served as LD, although referrals to special education would be reduced by almost one-half (p. 3). However, direct empirical tests of this specific model appear to be lacking at present. Published research applications would be useful in determining whether this model should replace current procedures. Al-Otaiba and Fuchs (2002) recently reviewed 23 research reports on the characteristics of students who are unresponsive to early literacy intervention, and found that most unresponsive students were characterized by deficits in phonological awareness. Other characteristics identified less consistently included deficits in phonological retrieval or encoding, verbal ability, behavior problems, or developmental delays. Al-Otaiba and Fuchs suggested that future research efforts should attempt to address a common definition of “treatment unresponsiveness.” For example, Good, Simmons and Smith (1998) suggested reading fluency



below 40 words per minute at the end of first grade, while Torgesen (2000) suggested the 30th percentile on standardized reading tests. Al-Otaiba and Fuchs also suggested that more study be given to characteristics such as phonological memory and low IQ, and to the training and fidelity of treatment implementation of trainers. Finally, they highlighted the difficulty in describing a “typical” treatment-non-responder, due to the considerable variation both between and within treatment-non-responders: Our review of the non-responder literature suggests that it may be difficult to characterize a “typical” non-responder because: (a) she or he is likely to have a relatively complex profile of strengths and weaknesses; and (b) her or his learner profile will probably be different from that of other non-responders . . . In short, as research on treatment non-responsiveness continues, we anticipate that unresponsive students will be characterized by considerable intra- and interindividual variation, which would seem to defy precise applications of treatments to learner strengths and weaknesses (Al-Otaiba & Fuchs, 2002, p. 313, see also Al-Otaiba, this volume).

At present, the Bush administration is promoting this type of approach to identification of learning disabilities, in place of identification criteria which include a discrepancy component (Al-Otaiba & Fuchs, 2002). Recently, in a report sponsored by the U.S. Department of Education, Office of Special Education Programs (2002), it was recommended that discrepancy criteria be eliminated from the identification of learning disabilities, that renewed attention be placed on early identification of learning disabilities, and that alternative procedures for identification be considered. One highly recommended alternative was the response-to-intervention approach, where general education teachers implement scientifically-based practices and use curriculum-based measurement to document student progress on a regular basis. Students who prove to be “treatment resisters” may be eligible for more intensive interventions, or referral to special education. It seems possible that criteria such as these may be employed in the near future.

How Should Alternative Approaches be Evaluated?

No alternative procedure for identification of learning disabilities to date has achieved general acceptance. It is possible, however, to specify criteria to be met in any valid, generally accepted identification procedure. These could include the following:

Does the alternative identification procedure address the multifaceted nature of learning disabilities? Is it appropriate for identifying disabilities in reading comprehension as well as decoding skill areas, and other possible areas, including for example math, writing, and spelling; or memory, attention, and study/organizational skills? If learning disabilities is a multifaceted condition,



measures of a single area of functioning (e.g. phonological core processes) will be of limited use for identification of all learning disabilities.

Is the alternative appropriate across the age spectrum of students with learning disabilities? That is, although early identification is of great importance, can the procedures also be used (or modified) to identify students at other age levels? Alternatives that focus entirely on early identification of learning disabilities must assume either that all students can be identified early and remediated, or else that no cases of learning disabilities will emerge after the primary grades.

Does the alternative identification procedure possess technical adequacy? That is, have reliability, validity, or administration issues been addressed adequately? No procedure, however conceptually adequate, is of value if it cannot be shown to be both reliable and valid. Further, conceptually adequate procedures are of little value if they cannot be implemented faithfully on a large-scale basis.

Will the alternative procedure reduce overidentification of learning disabilities from present procedures? Since overidentification is commonly cited as a major problem of learning disabilities, any alternative that does not reduce the proportion of students identified will not be an improvement, according to this commonly expressed concern.

Will the alternative procedure reduce inappropriate variability in rates of identification across both state and local educational authorities? If the alternative procedure does not meet this criterion, it will not be an improvement in this regard.

Will the alternative procedure be faithful to present and traditional conceptualizations of learning disabilities? If an alternative procedure identifies populations that are different from those commonly regarded as having learning disabilities, it cannot be considered an improvement over present procedures for identifying learning disabilities. 
For example, many students with mental retardation do not perform well on tests of phonemic awareness, and many do not show response to instruction on a level with normally-achieving peers. If an alternative procedure identifies students with mental retardation as having learning disabilities, then the present conceptualization of learning disabilities will have changed appreciably.

Will the alternative procedure produce fewer false positives (identifying students who do not “really” have learning disabilities) or false negatives (failing to identify students who “really” do have learning disabilities)? If not, it will not be an improvement over present methods of identification.

Although many concerns have been expressed regarding existing practice in identification of learning disabilities, there is to date no alternative to existing practice that has general acceptance. Clearly, any modification in present identification



procedures must address the criticisms that have previously been raised. For example, any substantial study of identification procedures must address the issue of across-state and within-state variability in identification rates. With particular reference to the “resistance to validated treatments” approach, several questions remain unanswered. It is unquestionably true that empirically validated treatments ought to be implemented generally (Mastropieri & Scruggs, 2000), and that curriculum-based measurement (CBM) is a most effective method of monitoring progress and identifying those who exhibit learning problems (Fuchs, Fuchs, Hamlett, Phillips & Bentz, 1994). And certainly, documentation of a student’s learning failure in spite of valid and systematically-applied treatments should be a necessary component of LD identification (and in fact, many states do attempt to address this issue with their use of pre-referral intervention teams). Nevertheless, there are reasons for questioning the validity of this “treatment resister” model as a psychometric instrument for identification of learning disabilities. Given the reluctance of general educators to use CBM, it seems unlikely that it will be universally applied, and even less likely that it can be applied in a standardized manner, so that all students receive the same instruction and measurement under the same conditions. Since virtually all of the attention to date has been paid to early reading problems, how this model could be applied to different academic areas and different grade levels remains to be seen. All of these problems speak to a problem of reliability and validity of “treatment resistance” as a psychometric measure. How these procedures will discriminate those with learning disabilities from those with, e.g. mental retardation, autism, or hearing impairments remains unclear. 
Finally, how such procedures to identify learning disabilities are to be used in conjunction with procedures for identification of other categories of exceptionality remains to be seen.

WHAT ARE OTHER WAYS OF ADDRESSING PROBLEMS IN IDENTIFICATION?

Over the years, and most particularly in recent years, there have been criticisms of procedures for the identification of learning disabilities. Several alternative identification procedures have been proposed. However, none of these to date have sufficient research to document the likely consequences of their implementation. Additional research could help determine which, if any, of these proposed alternatives could result in improved identification of learning disabilities. But before alternative procedures are widely implemented, they must be shown to be generally superior to present methods.

Any modification in identification, however, should address effectively the aspects of identification that are presently most widely criticized. A review of the literature critical of learning disabilities identification reveals that these problems include:

(a) overidentification;
(b) variability;
(c) specificity;
(d) conceptual considerations;
(e) discrepancy issues;
(f) early identification; and
(g) local implementation.

Following is a set of suggestions that can be employed to address all of these concerns.

(1) Reduce overidentification at the school or district level. Overidentification is often assumed because the proportions of students identified as having learning disabilities have been consistently growing, and because the present percentage of identified students (over 5%) seems to many higher than is reasonable for a “disability” status. Research in local and state identification procedures reveals that many students – perhaps as many as one-third to one-half – are identified as having learning disabilities without meeting state criteria. One reason this is done is that school personnel realize that many students require special attention that in many cases can be provided only by referral to special education. Another commonly described reason for overidentification is that other categories of exceptionality – particularly mental retardation and emotional disturbance – are seen to be more stigmatizing than the category of learning disabilities. Therefore, students are identified as having learning disabilities so that they can be provided with special services without use of a more stigmatizing label (MacMillan & Siperstein, 2001, 2002). Overidentification, then, can be significantly reduced not by provision of a new definition on the federal level, but by requiring local educational authorities to employ strict adherence to state definitional criteria. Such a requirement could potentially reduce the proportion of students identified as having learning disabilities. If this is done, however, some provision must be made for other students who need assistance but do not meet learning disabilities criteria. For students who are simply low achievers, or those who were formerly referred to as “slow learners” (e.g. IQs from 70 to 85), schools should provide, and states should support, some form of additional assistance. Since these students could include
over 15% of the school population, however, scarce special education funds should not be used to support these learners. Better education for general education teachers, emphasizing educational methods for low achieving students, could also do much to reduce referrals for learning disabilities. For students who are identified as having learning disabilities because more appropriate categories are thought to be stigmatizing, perhaps these other categories could be described differently, using terminology with more neutral connotations.

(2) Reduce variability. With more consistently applied local implementation procedures, variability will also likely decrease. States can reduce variability by employing more consistent and specific criteria (perhaps with the support of the U.S. Department of Education). All states could employ a specific discrepancy formula (or other objective criteria for identifying unexpected underachievement), and discrepancy formulae could be made more consistent. Since regression formulae are generally considered technically superior (Kavale & Forness, 2000), states could specify that regression formulae be employed, and provide specific input on the types of assessment measures that are acceptable for identification of learning disabilities.

(3) Establish specificity. It appears that earlier concerns that learning disabilities could not be distinguished reliably from general low achievement were overstated. Nevertheless, it is important to demonstrate that students identified as having learning disabilities are reliably different from general low achievers, and that there is reason to infer a deficit in psychological processing that results in a relative inability to learn. 
This can be accomplished by strict adherence to explicitly stated state criteria, and the careful and consistent exclusion of alternative explanations of discrepant functioning, including the lack of adequate learning opportunities, cultural disadvantage, emotional disturbance, mental retardation, or sensory impairments (see Wilson, 1985).

(4) Address conceptual considerations. Conceptual considerations can be addressed by careful adherence to operational state criteria, and general application of the understanding that IQ-achievement discrepancy is not equivalent to learning disabilities, but is only one of the considerations used as evidence of the existence of learning disabilities. Determination that students have received adequate opportunities to learn should be carefully made. Systematic pre-referral interventions should be implemented in the general education classroom, and shown to be inadequate, before students are identified. The nature of pre-referral interventions should be made systematic, and reflect validated practices. If, as a result, all students identified as having learning disabilities demonstrate average general abilities, but demonstrate significant problems in areas relevant to academic functioning in spite of adequate and even intensified general education, many of the conceptual concerns will have been met.

(5) Address discrepancy issues. Discrepancy remains a significant issue. According to Lyon et al. (2001), “no definitional element of LD has generated as much controversy as the use of IQ-achievement discrepancy in the identification of students with LD” (p. 265). In fact, some individuals clearly will never accept the idea of discrepancy as a criterion for identification (e.g. Aaron, 1997). Nevertheless, at present discrepancy is a most objective indicator of learning disabilities, and its elimination would likely result in overidentification and variability at even higher levels than those at present. In addition, it is difficult to understand current characterizations of learning disabilities without some notion of intraindividual differences between ability and achievement (Mastropieri & Scruggs, 2002). More careful application of discrepancy criteria, in conjunction with carefully documented exclusion criteria, can greatly improve present identification practices. For example, the process suggested by Kavale and Forness (2001) includes specific discrepancy criteria that are supported by documentation of pervasive basic skills deficits, deficits in learning efficiency, deficits in psychological processes, and the exclusion of alternative causes of learning failure. Carefully implemented, such a procedure could address many concerns with present identification practices. Data from the state of Iowa provide evidence that large scale implementation of state criteria can be accomplished to a reasonable degree of fidelity, and that identified students can be demonstrated to be substantially different from students who are not identified (Wilson, 1985). With additional safeguards, a highly appropriate and systematic procedure for identification of learning disabilities can be implemented.

(6) Early identification. 
It is clear that students with learning disabilities who are identified and treated early have a brighter future than students with learning disabilities who are not identified early. It is likely that in many cases lower identification rates during the primary years are the result not of inferior assessment measures and identification, but rather of the reluctance of school personnel to classify at early ages. Early identification can be greatly improved by encouragement of early identification practices by the state and federal governments. Nevertheless, it does not logically follow that “the idea that special education funds can be used for early identification and prevention is critical” (Lyon et al., 2001, p. 281). Special education funds were never fully provided by the federal government, and are too limited to be employed arbitrarily in general education to identify students who are in need of assistance but who do not consistently meet specific criteria for learning disabilities. Because of the number of concerns raised, a number of alternative identification methods have been proposed. These include observed deficits in level and slope of academic performance as measured by CBM, deficits
in phonological core processes, and failure to respond to validated treatment protocols. Each of these alternative procedures has addressed some important aspects of learning disabilities, and although some seem promising, none has been demonstrated to provide results superior to present procedures. For any alternative approach to be effective, ultimately it must address conceptual, technical, and implementation aspects of the identification process, across grade levels and skill areas, and it must identify individuals who share characteristics commonly associated with learning disabilities, including intraindividual differences and unexpected underachievement. For example, for a “failure to respond to validated treatment” approach to ultimately be of general utility, several criteria must be met. It must be demonstrated that it can be implemented systematically in a standardized way across classrooms, that it can identify “treatment non-responders” in areas other than reading fluency, that it can be implemented from K-12 grade levels, and that the students identified by these methods will meet the characteristics commonly identified with learning disabilities. If this alternative method does not address any of these criteria, additional appropriate procedures must be specified. We suggest that radically altering or eliminating the concept of learning disabilities – without thorough evaluation – could be counterproductive, creating at least as many problems as it addresses. Prevention efforts (efforts to address the academic needs of the lower 25% in academic achievement) can do much to limit referral to special education, but these efforts should be provided through general education funds. The more limited special education funds should be reserved specifically for support of intensive, high-quality interventions for students meeting strict disability standards.

(7) Local implementation. 
Improvements in local implementation of state and federal criteria in the identification of learning disabilities may be the key to addressing many other problem areas in identification. If state criteria for learning disabilities were carefully implemented by all local school districts, many if not most of the criticisms of identification of learning disabilities would be addressed. However, as long as local practices remain subjective and inconsistent, any change in definition or federal policy will be unsuccessful.
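
Suggestion (2) above refers to regression discrepancy formulae without stating one. The following is a minimal sketch of the regression approach as it is commonly described in the measurement literature; it is not the formula of any particular state, nor one specified in this chapter. It assumes that IQ and achievement are both reported as standard scores (mean 100, SD 15). The function name `regression_discrepancy`, the IQ-achievement correlation of 0.6, and the cutoff of 1.5 standard errors of estimate are hypothetical choices made only for illustration.

```python
import math


def regression_discrepancy(iq, achievement, r_xy, mean=100.0, sd=15.0, cutoff_z=1.5):
    """Illustrative regression discrepancy check (hypothetical helper).

    Assumes IQ and achievement share one standard-score scale
    (same mean and SD). r_xy is the assumed IQ-achievement correlation.
    Returns (predicted, discrepancy, threshold, is_severe).
    """
    if not -1.0 < r_xy < 1.0:
        raise ValueError("r_xy must lie strictly between -1 and 1")

    # Expected achievement regresses toward the mean as r_xy falls below 1.
    predicted = mean + r_xy * (iq - mean)

    # Standard error of estimate for predicting achievement from IQ.
    see = sd * math.sqrt(1.0 - r_xy ** 2)

    # Positive values mean achievement is below what IQ predicts.
    discrepancy = predicted - achievement

    # Hypothetical severity cutoff, expressed in standard errors of estimate.
    threshold = cutoff_z * see
    return predicted, discrepancy, threshold, discrepancy >= threshold


# Two illustrative students, both using the assumed correlation of 0.6.
high_iq = regression_discrepancy(iq=115, achievement=85, r_xy=0.6)
low_iq = regression_discrepancy(iq=85, achievement=70, r_xy=0.6)
print(high_iq)
print(low_iq)
```

Under these assumptions, the second student (IQ 85, achievement 70) shows a simple difference of only 15 points, but a regression discrepancy of about 21 points against a threshold of about 18. This is one way of seeing why simple-difference and regression formulae can classify the same student differently, and why consistency in the choice of formula bears on variability.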

SUMMARY AND CONCLUSION: WHAT DO WE KNOW TO DATE ABOUT IMPROVING IDENTIFICATION?

In this chapter, we have reviewed criticisms of present practices in identification of learning disabilities. Much of the criticism found in the literature involves
concerns with overidentification, variability in identification, and specificity in reliably discriminating learning disabilities from general low achievement. The definition of learning disabilities has also been challenged on conceptual grounds. Particularly common are concerns about discrepancy criteria, which, it has been suggested, suffer from problems in measurement, conceptualization, and reliance on IQ testing, and exert an inhibiting influence on early identification. Finally, identification procedures as implemented by local school authorities have been characterized as subjective and inconsistent. Many of the commonly heard criticisms of identification of learning disabilities can be addressed by careful attention to the local implementation aspect of identification. Improving local identification practices could reduce numbers of students identified by one-third or more, and could increase the homogeneity of identified populations. Improving local identification practices would result in identification procedures that are more within the spirit of IDEA legislation. It is also important that general education programs be implemented for the students who would not be served as having learning disabilities, but who nonetheless may need some additional assistance in acquiring important academic skills. Such programming would be very beneficial to general education students, and would likely reduce referrals to special education. However, scarce special education funds should not be used for this purpose. Federal and state general education funds could provide for low achieving students in need of academic assistance, while special education funds can continue to be reserved for students identified as having disabilities. As improved procedures are implemented and numbers of students identified as having learning disabilities diminish, federal support, never fully provided, can begin to pay a greater share of the costs of special education. 
It is clear that further research is needed in alternative identification procedures. Such research should examine all reasonable alternatives, in addition to examining how present procedures could be improved. Through additional research efforts, and careful consideration of possible alternatives that maintain the conceptual underpinnings of learning disabilities, much needed progress in identification can be made.

REFERENCES Aaron, P. G. (1997). The impending demise of the discrepancy formula. Review of Educational Research, 67, 461–502. Ackerman, P. T., Holloway, C. A., Youngdahl, P. L., & Dykman, R. A. (2001). The double-deficit of reading disability does not fit all. Learning Disabilities Research & Practice, 16, 152–160. Algozzine, R. (1985). Low achiever differentiation: Where’s the beef? Exceptional Children, 52, 72–75. Algozzine, R., & Korinek, L. (1985). Where is special education for students with high prevalence handicaps going? Exceptional Children, 51, 388–394.


Algozzine, R., & Ysseldyke, J. E. (1987). In defense of different numbers. Remedial and Special Education, 8, 53–56. Algozzine, R., Ysseldyke, J. E., & McGue, M. (1995). Differentiating low-achieving students: Thoughts on setting the record straight. Learning Disabilities Research & Practice, 10, 140–144. Alley, G. R., Deshler, D. D., & Warner, M. M. (1979). Identification of learning disabled adolescents: A Bayesian approach. Learning Disability Quarterly, 2, 83–86. Allor, J. H. (2002). The relationships of phonemic awareness and rapid naming to reading development. Learning Disability Quarterly, 25, 47–57. Al-Otaiba, S. D. (2000). Children who do not respond to early literacy intervention: A longitudinal study across kindergarten and first grade. Unpublished doctoral dissertation, Vanderbilt University, Nashville. Al-Otaiba, S. D., & Fuchs, D. (2002). Characteristics of children who are unresponsive to early literacy intervention: A review of the literature. Remedial and Special Education, 23, 300–316. American Psychiatric Association (2000). Diagnostic and statistical manual of mental disorders (4th ed.). Text Revision. Washington, DC: Author. Anderson, P. L., & Meier-Hedde, R. (2001). Early case reports of dyslexia in the United States and Europe. Journal of Learning Disabilities, 34, 9–21. Baroody, A. J., & Ginsburg, H. P. (1991). A cognitive approach to assessing the mathematical difficulties of children labeled “learning disabled.” In: H. L. Swanson (Ed.), Handbook on the Assessment of Learning Disabilities: Theory, Research and Practice (pp. 177–228). Austin, TX: PRO-ED. Beirne-Smith, M., Ittenbach, R. F., & Patton, J. R. (2002). Mental retardation (6th ed.). Columbus, OH: Merrill. Beitchman, J. H., Cantwell, D. P., Forness, S. R., Kavale, K. A., & Kauffman, J. M. (1998). Practice parameters for the assessment and treatment of children and adolescents with language and learning disorders. 
Journal of the American Academy of Child and Adolescent Psychiatry, 37(Supplement), 46S–62S. Berlin, R. (1887). Eine besondere Art der Wortblindheit: Dyslexia [A special type of wordblindness: Dyslexia]. Wiesbaden: Bergmann. Berninger, V. W., & Abbott, R. D. (1994). Redefining learning disabilities: Moving beyond aptitude-achievement discrepancies to failure to respond to validated treatment protocols. In: G. R. Lyon (Ed.), Frames of Reference for the Assessment of Learning Disabilities: New Views on Measurement Issues (pp. 163–183). Baltimore: Brookes. Bradley, R., Danielson, L., & Hallahan, D. P. (Eds) (2002). Identification of learning disabilities: Research to practice. Mahwah, NJ: Lawrence Erlbaum. Christensen, C. A. (1999). Learning disability: Issues of representation, power, and the medicalization of school failure. In: R. J. Sternberg & L. Spear-Swerling (Eds), Perspectives on Learning Disabilities: Biological, Cognitive, Contextual (pp. 227–249). Boulder, CO: Westview Press. Coles, G. S. (1987). The learning mystique: A critical look at learning disabilities. New York: Pantheon. Cone, T. E., & Wilson, L. R. (1981). Quantifying a severe discrepancy: A critical analysis. Learning Disability Quarterly, 4, 359–371. Cooney, J. B., & Swanson, H. L. (1987). Memory and learning disabilities: An overview. In: H. L. Swanson (Ed.), Memory and Learning Disabilities: Advances in Learning and Behavioral Disabilities (Supplement 2, pp. 1–40). Stamford, CT: JAI Press. Coutinho, M. (1995). Who will be learning disabled after the reauthorization of IDEA? Two very distinct perspectives. Journal of Learning Disabilities, 28, 664–668. Dawes, R. M., Faust, D., & Meehl, P. E. (1989). Clinical versus actuarial judgment. Science, 243, 1668–1674.

Dejerine, J. (1892). Contribution a l’etude anatomo-pathologique et clinique des differentes varietes de cecite verbale. Memoriale Societe Biologique, Fev. 27, pp. 61–63. Finlan, T. G. (1992). Do state methods of quantifying a severe discrepancy result in fewer students with learning disabilities? Learning Disability Quarterly, 15, 129–135. Fleischner, J. E. (1994). Diagnosis and assessment of mathematics learning disabilities. In: G. R. Lyon (Ed.), Frames of Reference for the Assessment of Learning Disabilities: New Views on Measurement Issues (pp. 441–458). Baltimore: Brookes. Fletcher, J. M., & Foorman, B. R. (1994). Issues in definition and measurement of learning disabilities: The need for early intervention. In: G. R. Lyon (Ed.), Frames of Reference for the Assessment of Learning Disabilities: New Views on Measurement Issues (pp. 185–200). Baltimore: Brookes. Fletcher, J. M., Francis, D. J., Shaywitz, S. E., Lyon, G. R., Foorman, B. R., Stuebing, K. K., & Shaywitz, B. A. (1998). Intelligent testing and the discrepancy model for children with learning disabilities. Learning Disabilities Research & Practice, 13, 186–203. Fletcher, J. M., Lyon, G. R., Barnes, M., Stuebing, K. K., Francis, D. J., Olson, R. K., Shaywitz, S. E., & Shaywitz, B. A. (2002). Classification of learning disabilities: An evidence-based evaluation. In: R. Bradley, L. Danielson & D. P. Hallahan (Eds), Identification of Learning Disabilities: Research to Practice (pp. 185–250). Mahwah, NJ: Erlbaum. Fletcher, J. M., Shaywitz, S. E., Shankweiler, D. P., Katz, L., Liberman, I. Y., Stuebing, K. K., Francis, D. J., Fowler, A. E., & Shaywitz, B. A. (1994). Cognitive profiles of reading disability: Comparisons of discrepancy and low achievement definitions. Journal of Educational Psychology, 86, 6–23. Frankenberger, W., & Fronzaglio, K. (1991). A review of states’ criteria and procedures for identifying children with learning disabilities. Journal of Learning Disabilities, 24, 495–500. Fuchs, L. 
S., & Fuchs, D. (1998). Treatment validity: A unifying concept for reconceptualizing the identification of learning disabilities. Learning Disabilities Research and Practice, 13, 204–220. Fuchs, L. S., Fuchs, D., Hamlett, C. L., Phillips, N. B., & Bentz, J. (1994). Classwide curriculum-based measurement: Helping general educators meet the challenge of student diversity. Exceptional Children, 60, 518–527. Fuchs, D., Fuchs, L. S., Mathes, P. G., & Lipsey, M. W. (2000). Reading differences between low-achieving students with and without learning disabilities: A meta-analysis. In: R. Gersten, E. P. Schiller & S. Vaughn (Eds), Contemporary Special Education Research: Syntheses of the Knowledge Base on Critical Instructional Issues (pp. 81–105). Mahwah, NJ: Erlbaum. Fuchs, L. S., Fuchs, D., & Speece, D. L. (2002). Treatment validity as a unifying construct for identifying learning disabilities. Learning Disability Quarterly, 25, 33–45. Galaburda, A. (1991). Anatomy of dyslexia: Argument against phrenology. In: D. D. Duane & D. B. Gray (Eds), The Reading Brain (pp. 119–131). Parkton, MD: York Press. Gallagher, J. J. (1966). Children with developmental imbalances: A psychoeducational definition. In: W. Cruickshank (Ed.), The Teacher of Brain-Injured Children (pp. 20–34). Syracuse, NY: Syracuse University Press. Good, R. H., Simmons, D. C., & Smith, S. B. (1998). Effective academic interventions in the United States: Evaluating and enhancing the acquisition of early reading skills. School Psychology Review, 27, 56–70. Gottlieb, J., Alter, M., Gottlieb, B. W., & Wishner, J. (1994). Special education in urban America: It’s not justifiable for many. Journal of Special Education, 27, 453–465.


Gresham, F. (August, 2001). Responsiveness to intervention: An alternative approach to the identification of learning disabilities. Paper presented at the U.S. Department of Education LD Summit, Washington, DC. Hallahan, D. P. (2002). Learning disabilities: Historical perspectives. In: R. Bradley, L. Danielson & D. P. Hallahan (Eds), Identification of Learning Disabilities: Research to Practice (pp. 1–67). Mahwah, NJ: Lawrence Erlbaum. Hallahan, D. P., & Cottone, E. A. (1997). Attention deficit hyperactivity disorder. In: T. E. Scruggs & M. A. Mastropieri (Eds), Advances in Learning and Behavioral Disabilities (Vol. 11, pp. 27–68). Stamford, CT: JAI Press. Hallahan, D. P., Kauffman, J. M., & Lloyd, J. W. (1999). Introduction to learning disabilities (2nd ed.). Boston: Allyn & Bacon. Hallahan, D. P., Keller, C. E., & Ball, D. W. (1986). A comparison of prevalence rate variability from state to state for each of the categories of special education. Remedial and Special Education, 7, 8–14. Haring, K. A., Lovett, D. L., Haney, K. F., Algozzine, B., Smith, D. D., & Clarke, J. (1992). Labeling pre-schoolers as learning disabled: A cautionary position. Topics in Early Childhood Special Education, 12, 151–173. Hinshelwood, J. (1896). A case of dyslexia: A peculiar form of word-blindness. Lancet, 2, 1451–1454. Hinshelwood, J. (1917). Congenital word-blindness. London: H.K. Lewis. Horn, W. F., & O’Donnell, J. P. (1985). Early identification of learning disabilities: A comparison of two methods. Journal of Educational Psychology, 76, 1106–1108. Horowitz, S. H., Lichtenstein, B., & Roller, C. (February, 2002). An intervention-oriented, multi-tiered approach for identifying and serving students with learning disabilities. Straw man prepared for the Learning Disabilities Roundtable “Finding Common Ground” Initiative, Washington, DC. Hynd, G. W., Clinton, A. B., & Hiemenz, J. R. (1999). The neuropsychological basis of learning disabilities. In: R. J. Sternberg & L. 
Spear-Swerling (Eds), Perspectives on Learning Disabilities: Biological, Cognitive, Contextual (pp. 60–82). Boulder, CO: Westview Press. Kavale, K. A., & Forness, S. R. (1984). A meta-analysis assessing the validity of Wechsler Scale profiles and recategorizations: Patterns or parodies? Learning Disability Quarterly, 7, 136–156. Kavale, K. A., & Forness, S. R. (1987). The far side of heterogeneity: A critical analysis of empirical subtyping research in learning disabilities. Journal of Learning Disabilities, 20, 374–382. Kavale, K. A., & Forness, S. R. (1995). The nature of learning disabilities: Critical elements of diagnosis and classification. Mahwah, NJ: Erlbaum. Kavale, K. A., & Forness, S. R. (1998). The politics of learning disabilities. Learning Disability Quarterly, 21, 245–273. Kavale, K. A., & Forness, S. R. (2000). What definitions of learning disability say and don’t say: A critical analysis. Journal of Learning Disabilities, 33, 239–256. Kavale, K. A., Forness, S. R., & Lorsbach, T. C. (1991). Definition for definitions of learning disabilities. Learning Disability Quarterly, 14, 257–266. Kavale, K. A., Fuchs, D., & Scruggs, T. E. (1994). Setting the record straight on learning disabilities and low achievement. Learning Disabilities Research & Practice, 9, 70–77. Kavale, K. A., & Mattson, P. D. (1983). “One jumped off the balance beam”: Meta-analysis of the efficacy of perceptual motor training. Journal of Learning Disabilities, 16, 165–173. Kavale, K. A., & Nye, C. (1985–1986). Parameters of learning disabilities in achievement, linguistic, neuropsychological, and social/behavioral domains. Journal of Special Education, 19, 443–458.

Kavale, K. A., & Nye, C. (1991). The structure of learning disabilities. Exceptionality, 2(3), 141–156. Kavale, K. A., & Reese, J. H. (1991). Teacher beliefs and perceptions about learning disabilities: A survey of Iowa’s practitioners. Learning Disability Quarterly, 14, 141–160. Keogh, B. K. (1986). Future of the LD field: Research and practice. Journal of Learning Disabilities, 19, 455–460. Keogh, B. K. (1994). A matrix of decision points in the measurement of learning disabilities. In: G. R. Lyon (Ed.), Frames of Reference for the Assessment of Learning Disabilities: New Views on Measurement Issues (pp. 15–26). Baltimore: Brookes. Keogh, B., & Becker, L. D. (1973). Early detection of learning problems: Questions, cautions, and guidelines. Exceptional Children, 40, 5–11. Kirk, S. A. (1971). Psycholinguistic learning disabilities: Diagnosis and remediation. Urbana, IL: University of Illinois Press. Kosc, L. (1981). Neuropsychological implications of diagnosis and treatment of mathematical learning disabilities. Topics in Learning and Learning Disabilities, 1, 19–30. Kussmaul, A. (1877). Disturbances of speech. In: H. von Ziemssen (Ed.), Cyclopedia of the Practice of Medicine. J. McCreery (Trans.). New York: William Wood. Larsen, S. C., & Hammill, D. D. (1975). The relationship of selected visual perceptual abilities to school learning. Journal of Special Education, 9, 281–291. Lester, G., & Kelman, M. (1997). State disparities in the diagnosis and placement of pupils with learning disabilities. Journal of Learning Disabilities, 30, 599–607. Lyon, G. R. (1985). Identification and remediation of learning disability subtypes: Preliminary findings. Learning Disabilities Focus, 1, 21–35. Lyon, G. R. (March, 2001). Measuring success: Using assessments and accountability to raise student achievement. Statement to Subcommittee on Education Reform, Committee on Education and the Workforce, U.S. House of Representatives, Washington, DC http://edworkforce. 
house.gov/hearings/107th/edr/account3801/lyon.htm Lyon, G. R., & Fletcher, J. M. (2001). Early warning system. Education Next (Summer), 23–29. http://www.educationnext.org/20012/22.html Lyon, G. R., Fletcher, J. M., Shaywitz, S. E., Shaywitz, B. A., Torgesen, J. K., Wood, F. B., Schulte, A., & Olson, R. (2001). Rethinking learning disabilities. In: C. E. Finn, Jr., A. J. Rotherham & C. R. Hokanson, Jr. (Eds), Rethinking Special Education for a New Century (pp. 259–287). Washington, DC: Thomas B. Fordham Foundation. MacMillan, D. L., Gresham, F. M., & Bocian, K. M. (1998). Discrepancy between definitions of learning disabilities and school practices: An empirical investigation. Journal of Learning Disabilities, 31, 314–326. MacMillan, D. L., Gresham, F. M., Siperstein, G. N., & Bocian, K. M. (1996). The labyrinth of I.D.E.A.: School decisions on referred students with subaverage general intelligence. American Journal on Mental Retardation, 101, 161–174. MacMillan, D. L., & Siperstein, G. N. (August, 2001). Learning disabilities as operationally defined in schools. Paper presented at the U.S. Department of Education LD Summit, Washington, DC. MacMillan, D. L., & Siperstein, G. N. (2002). Learning disabilities as operationally defined by schools. In: R. Bradley, L. Danielson & D. P. Hallahan (Eds), Identification of Learning Disabilities: Research to Practice (pp. 287–333). Mahwah, NJ: Lawrence Erlbaum Associates. MacMillan, D. L., Siperstein, G. N., & Gresham, F. M. (1996). A challenge to the viability of mild mental retardation as a diagnostic category. Exceptional Children, 62, 356–371. Mastropieri, M. A. (1987). Statistical and psychometric issues surrounding severe discrepancy: A discussion. Learning Disabilities Research, 30, 29–31.


Mastropieri, M. A. (1988). Learning disabilities in early childhood. In: K. A. Kavale (Ed.), Learning Disabilities: State of the Art and Practice (pp. 161–179). Boston: College-Hill/Little Brown. Mastropieri, M. A. (August, 2001). Discrepancy models in the identification of learning disabilities. Paper presented at the U.S. Department of Education LD Summit, Washington, DC. Mastropieri, M. A., & Scruggs, T. E. (1997). Best practices in promoting reading comprehension in students with learning disabilities. Remedial and Special Education, 18, 197–213. Mastropieri, M. A., & Scruggs, T. E. (2000). The inclusive classroom: Strategies for effective instruction. Upper Saddle River, NJ: Prentice Hall. Mastropieri, M. A., & Scruggs, T. E. (2002). Discrepancy models in the identification of learning disabilities: A response to Kavale. In: R. Bradley, L. Danielson & D. P. Hallahan (Eds), Identification of Learning Disabilities: Research to Practice (pp. 449–456). Mahwah, NJ: Erlbaum. McCarthy, J. M. (1971). Learning disabilities: Where have we been: Where are we going? In: D. Hammill & N. Bartel (Eds), Educational Perspectives in Learning Disabilities (pp. 10–19). New York: Wiley. McKinney, J. D., Short, E. J., & Feagans, L. (1985). Academic consequences of perceptual-linguistic subtypes of learning disabled children. Learning Disabilities Research, 1, 6–17. McLeskey, J., & Waldron, N. L. (1990). The identification and characteristics of students with learning disabilities in Indiana. Learning Disabilities Research, 5, 72–78. Meehl, P. (1973). Psychodiagnosis: Selected papers. Minneapolis: University of Minnesota Press. Mercer, C. D., Forgnone, C., & Wolking, W. D. (1976). Definitions of learning disabilities used in the United States. Journal of Learning Disabilities, 9, 376–386. Morgan, W. P. (1896). A case of congenital word blindness. The British Medical Journal, 2, 1378. Naglieri, J. (2001). Do ability and reading achievement correlate? 
Journal of Learning Disabilities, 34, 304–306. Naglieri, J., & Das, J. P. (1997). Cognitive assessment system. Itasca, IL: Riverside. Olson, R. K. (1999). Genes, environment, and reading disabilities. In: R. J. Sternberg & L. Spear-Swerling (Eds), Perspectives on Learning Disabilities: Biological, Cognitive, Contextual (pp. 3–21). Boulder, CO: Westview Press. Reynolds, C. R. (1984). Critical measurement issues in learning disabilities. The Journal of Special Education, 18, 451–476. Rourke, B. (1989). Non-verbal learning disabilities: The syndrome and the model. New York: Guilford Press. Rourke, B. (1993). Arithmetic disabilities, specific and otherwise: A neuropsychological perspective. Journal of Learning Disabilities, 26, 214–226. Rumsey, J. M. (1996). Neuroimaging in developmental dyslexia: A review and conceptualization. In: G. R. Lyon & J. M. Rumsey (Eds), Neuroimaging: A Window to the Neurological Foundations of Learning and Behavior in Children (pp. 57–78). Baltimore: Brookes. Rutter, M., & Yule, W. (1975). The concept of specific reading retardation. Journal of Child Psychology and Psychiatry, 16, 181–197. Schrag, J. (2000). Discrepancy approaches for identifying learning disabilities. Alexandria, VA: National Association of State Directors of Special Education. Scruggs, T. E. (1987). Theoretical issues surrounding severe discrepancy: A discussion. Learning Disabilities Research, 3, 21–23. Scruggs, T. E., & Cohn, S. J. (1983). A university-based summer program for a highly able but poorly achieving Indian child. Gifted Child Quarterly, 27, 90–93.

Scruggs, T. E., & Mastropieri, M. A. (1986). Academic characteristics of behaviorally disordered and learning disabled children. Behavioral Disorders, 11, 184–190. Scruggs, T. E., & Mastropieri, M. A. (1988). Legitimizing the field of learning disabilities: Does research orientation matter? Journal of Learning Disabilities, 21, 219–222. Scruggs, T. E., & Mastropieri, M. A. (1994–1995). Assessing students with learning disabilities: Current issues and future directions. Diagnostique, 20, 17–31. Scruggs, T. E., & Mastropieri, M. A. (2002). On babies and bathwater: Addressing the problems of identification of learning disabilities. Learning Disability Quarterly, 25, 155–168. Shaywitz, S. E., Escobar, M. D., Shaywitz, B. A., Fletcher, J. M., & Makuch, R. (1992). Evidence that dyslexia may represent the lower tail of a normal distribution of reading ability. The New England Journal of Medicine, 326, 145–150. Shaywitz, B. A., Fletcher, J. M., Holahan, J. M., & Shaywitz, S. E. (1992). Discrepancy compared to low achievement definitions of reading disability: Results from the Connecticut Longitudinal Study. Journal of Learning Disabilities, 25, 639–648. Shepard, L. A. (1980). An evaluation of the regression discrepancy method for identifying children with learning disabilities. Journal of Special Education, 14, 79–91. Siegel, L. S. (1989). I.Q. is irrelevant to the definition of learning disabilities. Journal of Learning Disabilities, 25, 618–629. Siegel, L. (1999). Learning disabilities: The roads we have traveled and the path to the future. In: R. J. Sternberg & L. Spear-Swerling (Eds), Perspectives on Learning Disabilities: Biological, Cognitive, Contextual (pp. 159–175). Boulder, CO: Westview Press. Siegel, L. S., & Ryan, E. B. (1988). Development of working memory in normally achieving and subtypes of learning disabled children. Child Development, 60, 973–980. Skrtic, T. M. (1999). Learning disabilities as organizational pathologies. In: R. J. Sternberg & L.
Spear-Swerling (Eds), Perspectives on Learning Disabilities: Biological, Cognitive, Contextual (pp. 193–226). Boulder, CO: Westview Press. Sleeter, C. E. (1995). Radical structuralist perspectives on the creation and use of learning disabilities. In: T. Skrtic (Ed.), Disability and Democracy: Reconstructing (Special) Education for Postmodernity (pp. 153–165). New York: Teachers College Press. Spear-Swerling, L. (1999). Can we get there from here: Learning disabilities and future education policy. In: R. J. Sternberg & L. Spear-Swerling (Eds), Perspectives on Learning Disabilities: Biological, Cognitive, Contextual (pp. 250–276). Boulder, CO: Westview Press. Speece, D. (2002). Classification of learning disabilities: Convergence, expansion, and caution. In: R. Bradley, L. Danielson & D. P. Hallahan (Eds), Identification of Learning Disabilities: Research to Practice (pp. 279–285). Mahwah, NJ: Erlbaum. Speece, D. L., & Case, L. P. (2001). Classification in context: An alternative approach to identifying early reading disability. Journal of Educational Psychology, 93, 735–749. Stanovich, K. E. (1988). Explaining the differences between the dyslexic and the garden-variety poor reader: The phonological core-variable difference model. Journal of Learning Disabilities, 21, 590–604. Stanovich, K. E. (1991). Conceptual and empirical problems with discrepancy definitions of reading disability. Learning Disability Quarterly, 14, 269–283. Sternberg, R. J. (1999). Epilogue: Toward an emerging consensus about learning disabilities. In: R. J. Sternberg & L. Spear-Swerling (Eds), Perspectives on Learning Disabilities: Biological, Cognitive, Contextual (pp. 277–282). Boulder, CO: Westview Press.


Strang, J. D., & Rourke, B. P. (1983). Concept-formation/non-verbal reasoning abilities of children who exhibit specific academic problems with arithmetic. Journal of Clinical Child Psychology, 24, 28–37. Swanson, H. L. (1993). Theoretical, technical, and practical aspects of the S-cognitive processing test: The development of a dynamic assessment measure. In: T. E. Scruggs & M. A. Mastropieri (Eds), Advances in Learning and Behavioral Disabilities (Vol. 10, Part A, pp. 135–208). Stamford, CT: JAI Press. Swerling, S. L., & Sternberg, R. J. (1996). Off track: When poor readers become learning disabled. Boulder, CO: Westview Press. Torgesen, J. K. (1998). Learning disabilities: An historical and conceptual overview. In: B. Y. L. Wong (Ed.), Learning about Learning Disabilities (2nd ed., pp. 3–34). New York: Academic Press. Torgesen, J. K. (1999). Phonologically-based reading disabilities: Toward a coherent theory of one kind of learning disability. In: R. J. Sternberg & L. Spear-Swerling (Eds), Perspectives on Learning Disabilities: Biological, Cognitive, Contextual (pp. 106–135). Boulder, CO: Westview Press. Torgesen, J. K. (2000). Individual differences in response to early interventions in reading: The lingering problem of treatment resisters. Learning Disabilities Research and Practice, 15, 55–64. Torgesen, J. K. (2002). Empirical and theoretical support for diagnosis of learning disabilities by assessment of intrinsic processing weaknesses. In: R. Bradley, L. Danielson & D. P. Hallahan (Eds), Identification of Learning Disabilities: Research to Practice (pp. 565–652). Mahwah, NJ: Erlbaum. Torgesen, J. K., & Wagner, R. K. (1998). Alternative diagnostic approaches for specific developmental reading disabilities. Learning Disabilities Research and Practice, 13, 220–232. U.S. Department of Education (2000). Twenty-second annual report to Congress on the implementation of the Individuals with Disabilities Education Act. Washington, DC: Author. U.S. 
Department of Education, Office of Special Education Programs (2002). Specific learning disabilities: Finding common ground. Washington, DC: Author. Vellutino, F. R., Scanlon, D. M., & Lyon, G. R. (2000). Differentiating between difficult-to-remediate and readily remediated poor readers: More evidence against the IQ-discrepancy definition of reading disability. Journal of Learning Disabilities, 33, 223–238. Wagner, R. K., & Garon, T. (1999). Learning disabilities in perspective. In: R. J. Sternberg & L. Spear-Swerling (Eds), Perspectives on Learning Disabilities: Biological, Cognitive, Contextual (pp. 83–105). Boulder, CO: Westview Press. Wagner, R. K., Torgesen, J. K., & Rashotte, C. A. (1999). Comprehensive test of phonological processes. Austin, TX: PRO-ED. Wilson, L. R. (1985). Large-scale learning disability identification: The reprieve of a concept. Exceptional Children, 52, 44–51. Wissink, J. F., Kass, C. E., & Ferrell, W. R. (1975). A Bayesian approach to the identification of children with learning disabilities. Journal of Learning Disabilities, 8, 58–66. Wolf, M., & Bowers, P. G. (1999). The double-deficit hypothesis for the developmental dyslexias. Journal of Educational Psychology, 91, 415–438. Wong, B. Y. L. (1996). The ABCs of learning disabilities. New York: Academic Press. Wood, F., Flowers, L., Buchsbaum, M., & Tallal, P. (1991). Investigation of abnormal left temporal functioning in dyslexia through CBF, auditory evoked potentials, and positron emission tomography. Reading and Writing: An Interdisciplinary Journal, 3, 379–393. Ysseldyke, J. E., Algozzine, B., Richey, L., & Graden, J. P. (1982). Declaring students eligible for learning disability services. Why bother with the data? Learning Disability Quarterly, 5, 37–44.

Ysseldyke, J. E., Algozzine, B., Shinn, M. R., & McGue, M. (1982). Similarities and differences between low achievers and students classified learning disabled. Journal of Special Education, 16, 73–85. Zigmond, N. (1993). Learning disabilities from an educational perspective. In: G. R. Lyon, D. B. Gray, J. R. Kavanagh & N. A. Krasnegor (Eds), Better Understanding Learning Disabilities: New Views from Research and their Implications for Education and Public Policies (pp. 251–272). Baltimore, MD: Brookes.

STARTING AT THE BEGINNING FOR LEARNING DISABILITIES IDENTIFICATION: RESPONSE TO INSTRUCTION IN GENERAL EDUCATION

Deborah L. Speece, Dawn Eddy Molloy and Lisa Pericola Case

ABSTRACT

Most definitions of learning disabilities (LD) include a qualification that adequate general education instruction was received and the child with LD did not benefit. Rarely is this tenet assessed in either practice or research before a diagnosis is made. In this chapter we review three studies that investigated children’s responsiveness to general education reading instruction as an indicator of the need for more intensive interventions. Adequacy of instruction was quantified by children’s level and rate of progress as measured by curriculum-based measures of oral reading fluency. This model of identification was based on Fuchs and Fuchs’ (1998) treatment-validity model, wherein children who do not respond to interventions provided in the general education classroom are potential candidates for special education services. The results of the studies reviewed indicate that the model is valid in that: (a) children who differ from their peers on level and slope of performance have more severe academic and behavioral problems than children who have IQ-achievement discrepancies or low achievement; (b) children who demonstrate persistent non-responsiveness over three years differ from other at-risk children on reading, reading-related, and behavioral measures; and (c) at-risk children who participated in specially-designed general education interventions had better outcomes than at-risk children who did not participate.

Identification and Assessment Advances in Learning and Behavioral Disabilities, Volume 16, 37–50 Copyright © 2003 by Elsevier Science Ltd. All rights of reproduction in any form reserved ISSN: 0735-004x/doi:10.1016/S0735-004X(03)16002-8

STARTING AT THE BEGINNING FOR LEARNING DISABILITIES IDENTIFICATION: RESPONSE TO INSTRUCTION IN GENERAL EDUCATION

Twenty-five years ago, the U.S. Office of Education offered the following criteria for identifying learning disabilities (LD): (a) failure to benefit from adequate instruction; (b) a severe discrepancy between achievement and intellectual ability; and (c) exclusion of sensory impairments, mental retardation, emotional disturbance, or environmental, cultural, or economic disadvantage (U.S. Office of Education, 1977). Since that time, accumulated research calls into question the validity of the second criterion, IQ-achievement discrepancy (e.g. Fletcher et al., 1994; Share, McGee & Silva, 1989; Speece & Case, 2001; Stanovich & Siegel, 1994). These data have fueled interest in alternative methods of identifying LD, most of which center on responsiveness to instruction (e.g. Fuchs, 1995; Fuchs & Fuchs, 1998; Speece & Case, 2001; Vellutino et al., 1996). Further evidence that instructional responsiveness may be pertinent to LD identification was provided by a survey of experts that asked for components of a reading disabilities definition (Speece & Shekitka, 2002). A majority of these experts, not surprisingly, selected reading achievement (83.7%) and phonemic awareness (79.4%). Of note is that 67.3% of the respondents also selected responsiveness to instruction. The focus on instruction in reconceptualizing LD is refreshing. Prior to this point, LD was viewed strictly as a within-child deficit despite the fact that degree of learning is influenced by the context in which learning occurs (Carroll, 1963; Deno, 1989; Keogh & Speece, 1996). Response-to-instruction models make no assumptions about the underlying cause of the learning difficulty. Instead, such models recognize that the difficulty may lie within the child, within the instruction, or both.
Only by systematically strengthening the quality of instruction and measuring a child’s response to that instruction can inferences be made about the possibility that child deficits contribute to learning difficulties. Thus, the long-held assumption that “processing” or “neurological” deficits play a role in LD is not

ruled out but rather set aside, awaiting the development of valid measurement procedures. The treatment-validity model (Fuchs, 1995; Fuchs & Fuchs, 1998) is the most fully developed conceptualization of how response to instruction may be applied to LD identification. In this model, curriculum-based measures (CBM) are used in general education classrooms to identify children whose level and rate (slope) of performance are below those of their classmates. This “dual discrepancy” of level and slope becomes the marker by which to judge responsiveness to instruction. Dually-discrepant children then receive instruction from their general education teachers that is reformulated to meet their needs. This instruction and continued placement in general education would be “treatment valid” for children who demonstrate improvement. For children who continue to demonstrate a dual discrepancy, the instruction and placement would not be valid, leading to a trial placement in special education to determine responsiveness, and hence, treatment validity, under more intensive instructional parameters. This model is not without difficulty or controversy (Fuchs, Fuchs & Speece, 2002). Full implementation would require weekly assessment of all children in all academic areas addressed by CBM (reading, mathematics, spelling, writing), development of instructional plans that are based on validated instructional protocols and that are feasible for general education classrooms, and fidelity of implementation across all interventions. 
Despite the enormity of the task, there are many benefits, including: (a) less reliance on teacher referral, thereby reducing false-negative identification and possible referral bias; (b) a focus on academic behavior rather than difficult-to-measure processing weaknesses; (c) a focus on growth in addition to absolute level of performance; (d) elimination of IQ-achievement discrepancies and, correspondingly, the need to administer intelligence tests; (e) elimination of false-positive identification; and (f) the potential to improve general education environments for all children. This new perspective also addresses a criterion in the federal regulations that is generally disregarded in diagnoses of LD: the child has received adequate instruction in general education but failed to benefit from it. In the model proposed by Fuchs and Fuchs (Fuchs, 1995; Fuchs & Fuchs, 1998), classrooms whose mean level and slope of performance are substantially lower than those of other classrooms would first receive classroom-level intervention geared to improving the academic performance of all children. Thus, children could not be identified as dually discrepant unless there was evidence that most children

in the classroom were responding to the curriculum, that is, that instruction was adequate.
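As an illustration only, the dual-discrepancy check described above can be sketched in Python. The sketch assumes weekly CBM scores for each child in one classroom, estimates each child's slope by ordinary least squares, and flags a child only when both level (mean score) and slope fall below a classroom-referenced cutoff. The one-standard-deviation cutoff, the function names, and the data layout are assumptions made for this example; the model as described here says only that level and slope are "below those of their classmates."

```python
from statistics import mean, stdev


def ols_slope(scores):
    """Least-squares slope of scores regressed on week index 0..n-1."""
    weeks = list(range(len(scores)))
    week_mean = mean(weeks)
    score_mean = mean(scores)
    numerator = sum((w - week_mean) * (s - score_mean)
                    for w, s in zip(weeks, scores))
    denominator = sum((w - week_mean) ** 2 for w in weeks)
    return numerator / denominator


def dually_discrepant(classroom, cutoff_sd=1.0):
    """Return names of children low on BOTH level and slope.

    classroom: dict mapping child name -> list of weekly CBM scores.
    cutoff_sd: hypothetical cutoff, in classroom standard deviations
               below the classroom mean (an assumption, not a value
               taken from the chapter).
    """
    levels = {name: mean(s) for name, s in classroom.items()}
    slopes = {name: ols_slope(s) for name, s in classroom.items()}
    level_values = list(levels.values())
    slope_values = list(slopes.values())
    level_cut = mean(level_values) - cutoff_sd * stdev(level_values)
    slope_cut = mean(slope_values) - cutoff_sd * stdev(slope_values)
    return sorted(name for name in classroom
                  if levels[name] < level_cut and slopes[name] < slope_cut)
```

A child who starts low but grows at the classroom rate is not flagged by this sketch, which mirrors the model's emphasis on growth in addition to absolute level.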

ANALYSIS OF THE TREATMENT VALIDITY MODEL

There is much to recommend a treatment validity approach to identifying learning disabilities and much to study regarding its implementation. We investigated several aspects of the model in a three-year longitudinal study of reading (Case, Speece & Molloy, in press; Speece & Case, 2001; Speece, Case & Molloy, 2002). Our primary objective was to assess the validity of the model from multiple perspectives. In a series of three studies we asked the following questions: (a) do first- and second-grade children who are identified as dually discrepant in reading differ from children who have IQ-reading achievement discrepancies or who have low reading achievement?; (b) across three years, do dually-discrepant children who are persistently nonresponsive to specially-designed general education instruction experience greater reading, reading-related, and behavioral problems than dually-discrepant children who do respond? Further, are there differences in the instructional contexts?; and (c) does the implementation of a modified treatment validity model result in better achievement and behavioral outcomes for dually-discrepant children compared to dually-discrepant children who do not participate in the model? This final issue pertains to outcomes associated with an improved pre-referral system that is buttressed by specific and research-based intervention components. The remainder of this chapter summarizes the major results of these studies and ends with implications for research.

Overview of the Sample and Measures

As noted, the goal of the three-year study was to investigate the validity of dual discrepancy and responsiveness to treatment as a method of operationalizing reading disability. Three suburban schools from one school district in the Mid-Atlantic states participated and were selected based on comparability of school size, strength of school leadership, ethnic representation, reading scores on district-administered tests, and mobility rate. From a fall CBM population screen of two cohorts of first-grade children and one cohort of second-grade children, we identified the longitudinal sample of at-risk (AR) children. Children who performed in the lowest 25% of their class on Letter Sound Fluency (LSF, first grade) or Oral Reading Fluency (ORF, second grade) constituted the longitudinal sample. Each year we identified a Purposive Sample (PS) from each classroom to estimate ORF classroom slope and level, information necessary to identify children as dually discrepant. PS children represented the range of reading skill in each classroom. The measures administered varied for the AR and PS children. All children completed three reading subtests from the Woodcock–Johnson Psychoeducational Battery-Revised (WJ-R; 1989) and an abbreviated battery of the WISC-R (Wechsler, 1974) to estimate Full Scale IQ. The AR group, which was the focus group for the study, also received phonological processing tasks (phonological awareness, RAN, and word reading efficiency) and teacher ratings of their classroom behavior (social skills, problem behavior, and academic competence). At the end of every year, we collected data on services received by all children, including additional instruction outside the general education classroom and interaction with the special education system.
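The fall screening rule, selecting children in the lowest 25% of their class on a fluency measure, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, the cut is rounded up so that at least one child is selected in any non-empty class, and ties are broken by the order in which children appear. None of these details is specified in the chapter.

```python
import math


def at_risk_screen(class_scores, fraction=0.25):
    """Return the children in the lowest `fraction` of one class.

    class_scores: dict mapping child name -> fluency score
                  (e.g. LSF in first grade, ORF in second grade).
    Rounding the cut up and breaking ties by input order are
    assumptions made for this sketch.
    """
    ranked = sorted(class_scores, key=class_scores.get)  # lowest first
    n_selected = math.ceil(len(ranked) * fraction)
    return ranked[:n_selected]
```

Because the rule is applied within each class, the resulting sample is at risk relative to classmates rather than relative to a fixed national norm.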

Is the Dually Discrepant Classification Valid?

Based on data from the study’s first year (Speece & Case, 2001), we compared children who were dually discrepant (DD, N = 47) on ORF at the end of the year to children who exhibited IQ-reading achievement discrepancies (IQRDG, N = 17) and to children who exhibited low achievement (LA, N = 28), defined as a standard score at or below 90 on the Basic Reading Cluster Score of the WJ-R. We also compared these groups to children in both the Purposive Sample (PS, N = 123) and at-risk sample (AR, N = 86) who were not identified as members of any of the problem-reader groups. Importantly, all groups were defined by the researchers and were not based on teacher referrals. If dual discrepancy is a valid marker for reading disability, then children so identified should be distinguishable from PS and AR non-classified children on reading, reading-related, and behavioral measures. Further, they should exhibit skills that are comparable to or lower than children who compose the other problem-reader groups (IQRDG, LA). We were also interested in differences on age as well as gender and ethnic representation to address issues of early identification and overrepresentation. The results supported the validity of the dual-discrepancy classification. The DD, AR, and PS groups could be compared on IQ, reading, and age. The DD children had lower reading scores than the other two groups, lower IQ than the PS group but not the AR group, and the groups did not differ on age. DD and AR groups could be compared on phonological processing and classroom

behavior. The DD group exhibited lower performance on phonological awareness, word reading efficiency, and ratings of academic competence. The DD and AR groups did not differ on RAN, social skills, or problem behaviors. Thus, the DD classification demonstrated construct validity with respect to two critical groups: children who were not expected to demonstrate reading problems and children who demonstrated initial difficulty (like the DD group) but who kept pace with their classmates. With respect to the other two poor reader groups, the DD group was either comparable or experienced more difficulty. We expected that the DD and IQRDG groups would differ on IQ based on the criteria used for classification, and they did. However, despite an approximate 22 point mean difference on IQ, the two groups did not differ on word reading skills (ES = 0.10). This finding supports other evidence that IQ-achievement reading discrepancies do not capture all children who experience reading difficulty (e.g. Fletcher et al., 1994). Not all DD-IQRDG differences on phonological processing reached statistical significance but, in all cases, the DD group had lower performance, with effect sizes ranging from 0.33 to −1.39. These small to large effect sizes suggest that the differences were educationally meaningful. The two groups also differed on classroom behavior (ES range = 0.54–0.71), with a significant difference on academic competence. We observed fewer differences in comparing the DD and LA groups. We found significant differences for second-grade children on phonological awareness, favoring the LA group, and on teacher ratings of classroom behavior, also favoring LA children (ES = 0.21 to −0.29). Although modest, these differences are not trivial given the selection criteria for LA. These groups did not differ on either IQ or reading. The comparisons of age, gender, and ethnic proportions were revealing. 
For both sets of group comparisons (DD-IQRDG, DD-LA), the DD children were significantly younger (ES = 1.16 and 0.69, respectively). The proportions of females and males did not differ across any of the groups, contrasting with findings for school-identified samples. With respect to ethnicity, the DD and LA groups reflected the proportions of majority-minority representation in the school system from which the sample was drawn. Over 80% of the IQRDG group was from the majority group (Caucasian). Thus, only the DD group reflected all three aspects of social consequential validity (Messick, 1995) in that they were younger and reflected gender and racial equity. The overall picture is that the DD group can be differentiated from normally-achieving children and from children with equivalent scores on screening measures early in the school year. Moreover, meaningful differences existed between the DD group and the two other poor-reader groups and, importantly, the DD group did not reflect the negative social consequences associated with current school identification procedures. Thus, by focusing on both level and growth in reading achievement as indexed by CBM, a valid group of children who experienced reading problems was identified. Although simpler identification methods would be preferred, other analyses indicated that single indicators of reading difficulty (LSF, ORF, phonological awareness) were not sensitive indicators of either DD or status as problem readers (DD, IQRDG, and LA) (Speece & Case, 2001). The dual-discrepancy method would require major changes in the way children are identified; however, our initial evidence suggests that benefits may outweigh the cost of change.
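The group comparisons in this section are reported as effect sizes (ES). The chapter does not state which formula was used, so the following sketch should be read as an assumption: it computes a standardized mean difference with a pooled standard deviation (Cohen's d), one common way to express such comparisons. The function name and argument order are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference (group_a minus group_b), pooled SD.

    Assumes each group has at least two scores and that the pooled
    variance is non-zero. Whether the chapter's ES values follow this
    formula is not stated in the text.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = ((n_a - 1) * variance(group_a)
                       + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_variance)
```

With this definition the sign of the effect depends only on which group is subtracted from which, so swapping the arguments flips the sign without changing the magnitude.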

Does Persistent Non-responsiveness Signal Disability?

The previous study represented the first phase of the treatment-validity model: the identification of children as dually discrepant. In that study, DD was based on reading performance across most of a school year. In the treatment validity model (Fuchs, 1995; Fuchs & Fuchs, 1998), this phase and ensuing phases are conceptualized as eight-week time frames. Thus, it is likely that our DD group represented children with more severe reading problems than would be experienced by children identified within a shorter time span. Our second study (Case, Speece & Molloy, in press) examined phases two and three of the treatment validity model during which general education instruction is modified for children as soon as they are identified as DD. This study took place in one of our participating schools across three years. DD identification procedures were conducted three to four times a year, roughly corresponding to eight-week intervals. When a child was identified as DD, the authors met with the child’s general education teacher to develop an intervention that was based on the child’s strengths and weaknesses, reflected research-based principles of effective reading instruction, and was considered feasible by the general education teacher. The plans were implemented for eight weeks, at which time the team met to review progress and design new instruction as needed. During the eight weeks, a researcher met with the teacher to discuss implementation problems, offer assistance, and discuss progress. A researcher also made unannounced visits to check fidelity of implementation. The focus of the study was to determine if DD children who were persistently non-responsive to general education interventions experienced more severe reading, reading-related, and behavioral problems than either DD children who were more responsive or at-risk children who were never identified as DD.
The “persistent non-responders” would be equivalent to children who enter phase four of the treatment validity model, a trial placement in special education. Thus, these children should have more severe problems if a response-to-instruction model is valid for LD identification. Our sample consisted of 36 children who were in the study for three years. These children were classified into three groups based on degree of responsiveness. The frequently-dually-discrepant group (FDD, N = 7) was composed of children who were identified four or more times as DD over the course of the study and were considered the persistently non-responsive group. The infrequently-dually-discrepant group (IDD, N = 17) was identified three or fewer times, and the never-dually-discrepant group (NDD, N = 12) was composed of at-risk children defined by the fall screening (like the other two groups) but never identified as dually discrepant. These groups did not differ on age, IQ, or mother’s years of education. To address the primary research question, the groups were compared on the measures described previously: reading, phonological awareness, RAN, word reading efficiency, and teacher ratings of academic competence, social skills, and problem behaviors in a series of repeated measures ANOVAs. The FDD children performed significantly worse than the NDD children on Letter Word Identification, Word Attack, Word Reading Efficiency, Academic Competence, Social Skills, and Problem Behaviors and significantly worse than the IDD children on Letter Word Identification, Word Reading Efficiency, Academic Competence, and Problem Behaviors. The sensitivity of the responsiveness groupings was further apparent in the differences between the IDD and NDD groups on Academic Competence and Problem Behaviors. Slope estimates provided by Deno, Fuchs, Marston and Shin (2001) supported our proposal that children in the FDD group would be candidates for special education services. Deno et al. estimated that oral reading fluency slopes for children in special education were 0.71 in first and second grades and 0.58 in third and fourth grades.
The FDD children’s mean slopes were 0.28 and 0.67, respectively, which approximate the Deno et al. figures. Although the FDD children would, by definition, have the lowest slopes of the children in this study, the similarity of their slopes to a separately diagnosed sample provides further evidence supporting criterion-related validity in addition to that found in the group analyses reported above. We considered whether differences in instructional context may explain the academic and behavioral differences between the groups’ responsiveness. If the FDD group was in less-effective classrooms, then the assertion that they are candidates for special education services is less tenable in the treatment-validity model. We compared the groups on a measure of instructional effectiveness that was based on two-yearly classroom observations, teachers’ years of experience, and mean classroom slopes. No group differences or interactions with year


were obtained. Thus, the hypothesis that FDD children received poorer quality instruction, had less experienced teachers, or had peers who demonstrated lower growth in ORF could not be supported. The differences observed between the three groups were not a function of differential classroom environment, lending further credence to the validity of the classification defined as degree of responsiveness to instruction.
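The three responsiveness groups described above can be expressed as a simple rule on the number of times an at-risk child was identified as dually discrepant over the three years. The thresholds follow the text (four or more identifications for FDD, none for NDD); reading "three or fewer" as one to three for IDD is an inference from the separate never-identified group, and the function name is hypothetical.

```python
def responsiveness_group(times_identified_dd):
    """Classify an at-risk child by how often they were identified as DD.

    FDD: four or more identifications (persistently non-responsive).
    IDD: one to three identifications (assumed reading of "three or
         fewer", given that never-identified children form NDD).
    NDD: never identified as dually discrepant.
    """
    if times_identified_dd < 0:
        raise ValueError("count of identifications cannot be negative")
    if times_identified_dd >= 4:
        return "FDD"
    if times_identified_dd >= 1:
        return "IDD"
    return "NDD"
```

With DD identification run three to four times a year for three years, a child could in principle be identified up to about twelve times, so the FDD threshold of four marks repeated rather than isolated non-response.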

Does an Improved Pre-referral System Result in Better Outcomes for Dually-Discrepant Children?

The previous two studies provided evidence that the dual-discrepancy approach is a valid index of children’s responsiveness to general education instruction. We demonstrated that dual-discrepancy status, based on reading performance across most of the school year, equated with severe reading and behavioral difficulties. Also, we found that children who consistently fail to respond to redesigned general education intervention have considerably poorer academic and behavioral outcomes over the early elementary school years. The final study to be reviewed compared dually-discrepant children who received general education intervention to dually-discrepant children in two other schools who did not receive special interventions (Speece, Case & Molloy, 2002). This study asked whether there was any benefit to implementing a pre-referral system that was strengthened by explicit intervention plans built on research-based principles with attention to fidelity of implementation. These plans were developed collaboratively by the authors and teachers. Our operationalization of pre-referral differs considerably from earlier research on the topic. Past studies typically relied on teacher referral, did not collect data on interventions within classrooms, and the primary dependent variable was frequency of special education placement, not student achievement. The intervention school was called the Full Model (FM) because interventions were implemented in general education classrooms and progress was monitored weekly. Two schools, Assessment Feedback (AF) and Assessment Only (AO), provided comparisons. Children in these schools received the same assessments, but received less frequent oral reading fluency monitoring (approximately 10 data points per year).
In the AF condition, researchers met with teachers twice a year, providing assessment data and explaining the concept of dual discrepancy as needed. Researchers provided a written report that identified children who were dually discrepant, including graphs of performance, and discussed the children and project in general with teachers. We did not offer any advice on instruction, but rather emphasized that the DD children may require additional assistance. The

AO school was a true control condition. The research team did not meet with the teachers during the study to provide information on DD status. The AF and AO conditions were included to control, respectively, for the possibilities that: (a) feedback to teachers on CBM performance may affect student achievement independent of a specific intervention plan (Fuchs & Fuchs, 1986; Fuchs, Fuchs & Hamlett, 1989a); and (b) measurement alone may affect student achievement (Fuchs & Fuchs, 1986; Fuchs, Fuchs & Hamlett, 1989b; Jenkins, Mayhall, Peshka & Townsend, 1974; Wesson et al., 1988). Only children identified as DD at any point were included in the analyses. Analyses were conducted within year to include all three cohorts of children (Year one N = 75, Year two N = 110, Year three N = 107). The number of DD children within schools was roughly equivalent. As is often the case with school-based research, our design met with some problems. In the summer following Year 1 of the study, the participating school district made plans to intensify general education literacy instruction in first and second grades. In the modified program, two certified teachers staffed each first- and second-grade classroom to deliver 90 minutes of literacy instruction every day. Two of our schools, FM and AO, participated in the modified program, which included two weeks of summer training that emphasized phonological awareness, the alphabetic principle, writing, and assessment using running records. Our third school, AF, participated in the school district’s modified program in Year 3. Thus, our Full Model condition faced a stiff test, and the research question became: What is the added value of the FM condition over the intensified general education literacy program? The most stringent comparisons were between FM and AO in Years 2 and 3 since they both participated in the new program for the same amount of time.
Although this was a daunting prospect, we found that the FM/DD children received, over time, an academic benefit above and beyond the new literacy program. In Year 1, prior to the new program, the only difference between the conditions was for social skills and the effect favored AF over FM. Although not significant, the AF teachers also rated their DD children as exhibiting fewer problem behaviors compared to FM (ES = 0.72). In Year 2, the significant comparisons were for Letter Sound Fluency level and slope (first grade children) with FM > AF but not AO. Small to moderate effect size estimates indicated higher performance by FM on WJ-R reading cluster scores compared to both of the control conditions (ES range = 0.31–0.51). For Year 3, we found three significant effects: ORF level, ORF slope, and school attention. “School attention” referred to the sum of additional services (e.g. instruction from a reading specialist) and meetings held on behalf of the students (e.g. school assistance teams, IEP). For ORF level, FM read significantly


more words correctly than AF but not AO (ES = 0.48 and 0.29, respectively). For ORF slope, FM grew at a significantly faster rate than either of the two comparison conditions (ES = 0.79 and 0.51, respectively). On school attention, FM had significantly lower scores than AO (ES = 1.33) but not AF (ES = 0.75). The ORF slope and school attention results are especially important because they demonstrate that the FM condition, representing an enhanced pre-referral intervention strategy, did improve outcomes for DD children over and above any effects of the new literacy program implemented in the AO school. It is also true that, by Year 3, several variables (phonological awareness, RAN, WJ-R reading scores, and teacher ratings of classroom behavior) yielded no differences. Thus, the significant findings must be placed in the context of all the variables tested. Nonetheless, these findings represent the first demonstration that an enhanced pre-referral model results in improved academic outcomes for at-risk children.

DISCUSSION AND IMPLICATIONS Although the number of children is relatively small in the studies reviewed, they are representative of children in the participating schools because they were initially identified based on a population screen (approximately 1,000 children in the first two years to identify at-risk children) and further identified based on objective criteria for classification. These design features are critical in understanding alternative methods of identifying learning disabilities that are not confounded by sampling bias. Further research with larger samples of AR and DD children in more academic areas will be necessary to test and extend the reliability of our findings. We established that the dual-discrepancy criterion, based on CBM oral reading fluency measures, identifies a valid, younger group of poor readers free of gender and ethnic bias. We also demonstrated that DD children who are persistently non-responsive to general education interventions have poorer academic and behavioral outcomes and are in need of more intensive interventions than were provided in their general education classrooms. Finally, we found that DD children who received specially-designed general education instruction had better outcomes than DD children who did not participate and required fewer services beyond their general education classrooms. These findings set the foundation for further work with the treatment-validity model to identify LD. A major difference between our implementation and the conceptualization of the treatment-validity framework (Fuchs & Fuchs, 1998) is the length of time to identify candidates for special education. In our second study (Case, Speece & Molloy, in press), we reviewed DD status over a three-year

period to examine degree of responsiveness. This was due to several constraints that included time required to secure parent permission and the fact that our project co-existed with, rather than supplanted, the special education identification system. When the model is fully functional and operating on well-defined time frames for identification and intervention, a child who is DD early in the school year could be deemed eligible for special education services before the end of the school year. Research investigating this more efficient model is critical to understanding the characteristics of children identified and their instructional requirements. Another research need is determining the dual discrepancy cut points for level and slope. We used one standard deviation below classroom mean performances, consistent with findings for math (Fuchs & Fuchs, 1998). These criteria may need adjustment, depending on the child’s age. To capture older children who require assistance, reducing the size of the discrepancy may be required. In any event, the issue needs to be studied systematically. Finally, closer examination is needed of research-based interventions that can be realistically and reliably implemented by general education teachers. We relied on the extant literature on phonological awareness, the alphabetic principle, and peer tutoring (e.g. Adams, Foorman, Lundberg & Beeler, 1998; Fuchs, Fuchs, Mathes & Simmons, 1987) for the majority of the intervention plans. Intervention plans were tailored to the child’s needs and negotiated with the teacher. For the most part, the teachers were faithful implementers of the plans and worked diligently to improve student performance. We need to understand more about this process: the critical elements of research-based general education interventions, the professional development necessary to insure faithful implementation, and the relationship to child outcomes. It is clear that different methods to identify LD are needed. 
The response to treatment paradigm is in the forefront of recommended approaches at the national level (e.g. Donovan & Cross, 2002; President’s Commission on Excellence in Special Education, 2002). This represents a sea change in how identification in the schools will be accomplished and represents a new frontier in the history of the learning disabilities field. The work presented in this chapter indicates there is good reason to pursue such models. It also indicates how much remains to be learned, especially in the realm of research to practice.
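The dual-discrepancy rule discussed above (a child falls one standard deviation below the classroom mean on both level and slope) can be made concrete with a short sketch. The Python below is illustrative only and is not the procedure used in the studies: the function names, the use of a child's mean score as "level," the ordinary least-squares slope, and the classroom scores are all assumptions introduced here.

```python
from statistics import mean, stdev


def ols_slope(scores):
    """Least-squares slope of scores against measurement occasion (0, 1, 2, ...)."""
    xs = list(range(len(scores)))
    x_bar, y_bar = mean(xs), mean(scores)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, scores))
    return sxy / sxx


def dual_discrepant(classroom, cut_sd=1.0):
    """Names of children whose level AND slope both fall more than
    cut_sd standard deviations below the classroom means.

    Assumption: 'level' is taken as the child's mean score across occasions.
    """
    levels = {name: mean(s) for name, s in classroom.items()}
    slopes = {name: ols_slope(s) for name, s in classroom.items()}
    level_cut = mean(list(levels.values())) - cut_sd * stdev(list(levels.values()))
    slope_cut = mean(list(slopes.values())) - cut_sd * stdev(list(slopes.values()))
    return sorted(
        name for name in classroom
        if levels[name] < level_cut and slopes[name] < slope_cut
    )


# Hypothetical words-read-correctly scores on four occasions (not study data).
classroom = {
    "A": [40, 44, 48, 52],
    "B": [38, 41, 44, 47],
    "C": [42, 46, 50, 54],
    "D": [36, 40, 44, 48],
    "E": [39, 42, 45, 48],
    "F": [10, 10, 11, 11],
}

print(dual_discrepant(classroom))
```

With these invented scores, only child F falls below both cut points. Changing `cut_sd` shows how sensitive the classification is to the size of the discrepancy, which is the question raised above for older children.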

ACKNOWLEDGMENTS The work reported in this chapter was supported by Grant HO23F97008 from the U.S. Department of Education, Office of Special Education Programs. This


chapter is based on a forthcoming article: Speece, D. L., Case, L. P. & Molloy, D. E. Responsiveness to general education instruction as the first gate to learning disabilities identification. Learning Disabilities Research & Practice. Correspondence concerning this chapter may be addressed to Deborah L. Speece, Department of Special Education, 1308 Benjamin Building, University of Maryland, College Park, MD 20742. Electronic mail may be sent to [email protected].

REFERENCES
Adams, M. J., Foorman, B. R., Lundberg, I., & Beeler, T. D. (1998). Phonemic awareness in young children. Baltimore: Paul H. Brookes.
Case, L. P., Speece, D. L., & Molloy, D. E. (in press). The validity of a response to instruction paradigm to identify reading disabilities: A longitudinal analysis of individual differences and contextual factors. School Psychology Review.
Carroll, J. B. (1963). A model of school learning. Teachers College Record, 64, 321–339.
Deno, S. L. (1989). Curriculum-based measurement and alternative special education services: A fundamental and direct relationship. In: M. R. Shinn (Ed.), Curriculum-Based Measurement: Assessing Special Children (pp. 1–17). New York: Guilford Press.
Deno, S. L., Fuchs, L. S., Marston, D., & Shin, J. (2001). Using curriculum-based measurement to establish growth standards for students with learning disabilities. School Psychology Review, 30, 507–524.
Donovan, S., & Cross, C. (2002). Minority students in gifted and special education. Washington, DC: National Academy Press.
Fletcher, J. M., Shaywitz, S. E., Shankweiler, D. P., Katz, L., Liberman, I. Y., Stuebing, K. K., Francis, D. J., Fowler, A. E., & Shaywitz, B. A. (1994). Cognitive profiles of reading disability: Comparisons of discrepancy and low achievement definitions. Journal of Educational Psychology, 86, 6–23.
Fuchs, L. S. (1995, May). Incorporating curriculum-based measurement into the eligibility decision-making process: A focus on treatment validity and student growth. Paper presented at the Workshop on IQ Testing and Educational Decision Making, National Research Council, National Academy of Sciences, Washington, DC.
Fuchs, L. S., & Fuchs, D. (1986). Effects of systematic formative evaluation on student achievement: A meta-analysis. Exceptional Children, 53, 199–208.
Fuchs, L. S., & Fuchs, D. (1998). Treatment validity: A unifying concept for reconceptualizing the identification of learning disabilities. Learning Disabilities Research and Practice, 13(4), 204–219.
Fuchs, L. S., Fuchs, D., & Hamlett, C. L. (1989a). Monitoring reading growth using student recalls: Effects of two teacher feedback systems. Journal of Educational Research, 83, 103–111.
Fuchs, L. S., Fuchs, D., & Hamlett, C. L. (1989b). Effects of alternative goal structures within curriculum-based measurement. Exceptional Children, 55, 229–238.
Fuchs, D., Fuchs, L. S., Mathes, P. M., & Simmons, D. (1987). Peer-assisted learning strategies: Making classrooms more responsive to student diversity. American Educational Research Journal, 34, 174–206.
Fuchs, L. S., Fuchs, D., & Speece, D. L. (2002). Treatment validity as a unifying construct for identifying learning disabilities. Learning Disability Quarterly, 25, 33–46.

Jenkins, J. R., Mayhall, W., Peshka, C., & Townsend, V. (1974). Using direct and daily measures to increase learning. Journal of Learning Disabilities, 10, 604–608.
Keogh, B. K., & Speece, D. L. (1996). Learning disabilities within the context of schooling. In: D. L. Speece & B. K. Keogh (Eds), Research on Classroom Ecologies: Implications for Inclusion of Children with Learning Disabilities (pp. 1–14). Mahwah, NJ: Lawrence Erlbaum.
Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. American Psychologist, 50, 741–749.
President’s Commission on Excellence in Special Education (2002). A new era: Revitalizing special education for children and their families. Retrieved July 10, 2002 from PCESE Web site: http://www.ed.gov/inits/commissionsboards/whspecialeducation
Share, D. L., McGee, R., & Silva, P. (1989). IQ and reading progress: A test of the capacity notion of IQ. Journal of American Academy of Child and Adolescent Psychiatry, 28, 97–100.
Speece, D. L., & Case, L. P. (2001). Classification in context: An alternative approach to identifying early reading disability. Journal of Educational Psychology, 93, 735–749.
Speece, D. L., Case, L. P., & Molloy, D. E. (2002). Effects of an early identification/general education intervention program on children at risk for reading failure. Manuscript in preparation.
Speece, D. L., & Shekitka, L. (2002). How should reading disabilities be operationalized? A survey of experts. Learning Disabilities Research & Practice, 17, 118–123.
Stanovich, K. E., & Siegel, L. S. (1994). Phenotypic performance profile of children with reading disabilities: A regression-based test of the phonological-core variable-difference model. Journal of Educational Psychology, 86, 24–53.
U.S. Office of Education (1977). Definition and criteria for defining students as learning disabled. Federal Register, 42(250) (p. 65083). Washington, DC: U.S. Government Printing Office.
Vellutino, F. R., Scanlon, D. M., Sipay, E. R., Small, S. G., Chen, R., Pratt, A., & Denckla, M. B. (1996). Cognitive profiles of difficult-to-remediate and readily remediated poor readers: Early intervention as a vehicle for distinguishing between cognitive and experiential deficits as basic causes of specific reading disability. Journal of Educational Psychology, 88, 601–638.
Wechsler, D. (1974). Wechsler intelligence scale for children – revised. New York: Psychological Corporation.
Wesson, C., Deno, S. L., Mirkin, P. K., Maruyama, G., Skiba, R., King, R., & Sevcik, B. (1988). A causal analysis of the relationships among ongoing curriculum-based measurement and evaluation, the structure of instruction, and student achievement. The Journal of Special Education, 22, 330–343.
Woodcock, R. W., & Johnson, M. B. (1989). Woodcock–Johnson psychoeducational battery – revised. Chicago: Riverside Publishing Company.

IDENTIFICATION OF NON-RESPONDERS: ARE THE CHILDREN “LEFT BEHIND” BY EARLY LITERACY INTERVENTION THE “TRULY” READING DISABLED? Stephanie Al Otaiba ABSTRACT The primary purpose of this chapter is to synthesize the existing research that describes children who are unresponsive to generally effective early literacy interventions. Studies were selected in which: (a) children ranged from preschoolers to third graders and were at-risk for reading disabilities; (b) treatments targeted early literacy; (c) outcomes reflected reading development; and (d) students’ unresponsiveness to intervention was described. The search yielded 23 studies, eight of which were designed primarily to identify characteristics of unresponsive students; the remaining 15 studies focused on treatment effectiveness, but also identified and described unresponsive students. A majority of unresponsive students had phonological awareness deficits; additional characteristics included phonological retrieval or encoding deficits, low verbal ability, behavior problems, and developmental delays. Methodological issues are discussed that complicate comparisons of non-responders across studies. A secondary purpose of this chapter is to

Identification and Assessment Advances in Learning and Behavioral Disabilities, Volume 16, 51–81 Copyright © 2003 by Elsevier Science Ltd. All rights of reproduction in any form reserved ISSN: 0735-004x/doi:10.1016/S0735-004X(03)16003-X


describe findings from recent longitudinal studies that support the hypothesis that non-responders may be the truly reading disabled. Implications for future research and practice are discussed.

INTRODUCTION At the start of this chapter, I want to emphasize two points. First, not all poor readers have reading disabilities. That is to say, reading difficulties are not the same thing as reading disabilities. Reading disabilities are neurologically based, are pervasive across the life span, are not caused by a lack of opportunity to learn or by mental disability, and are difficult to remediate. By contrast, reading difficulties can be prevented and may occur primarily due to a lack of instruction (Vellutino et al., 1996). Second, many students with reading disabilities can be helped, but only through ongoing intervention, or treatment, which is more intensive than classroom instruction. Although 38% of all fourth grade students are reading below the basic level of reading (National Center for Educational Statistics, 1996), current estimates suggest that only 5% of school-age students are actually reading disabled. Nonetheless, reading difficulties are the most common reason for referral for special education (e.g. Mastropieri, Lenhart & Scruggs, 1999) because research has shown that reading trajectories are established early, and once established, are difficult to change (Good, Simmons & Kame’enui, 2001). This gap between poor and strong readers widens over the elementary years (Stanovich, 1986), particularly after third grade (Fletcher & Foorman, 1994; Kennedy, Birman & Demaline, 1986; Lyon, 1985). Even when poor readers continue to grow in reading, they rarely catch up to strong readers because poor readers develop a negative attitude toward reading (Juel, 1988) and, subsequently, practice reading dramatically less than strong readers (Allington, 1994). Specifically, poor readers may read as few words in one year as strong readers read in two days (Cunningham & Stanovich, 1998). Given rising expectations for literacy in our increasingly technological workplace, the well-documented long-term costs of early reading difficulties are a societal concern (e.g. 
Adams, 1990). Preventing reading difficulties has become an urgent national priority.

HELPING POOR READERS CRACK THE CODE Over the last two decades, researchers have shown that most poor readers lack phonological awareness at the level of the phoneme: phonemic


awareness. Without understanding that words are made of individual sounds that can be blended, segmented, and manipulated, children do not understand how to apply phonics, or letter-sound strategies, to sound out words. In other words, they do not understand the system of linkage between sounds, or phonemes, and letters, or graphemes. If children cannot identify words, they cannot comprehend written material, which is, of course, the ultimate goal of reading (Share & Stanovich, 1995). Researchers have also shown that explicit and systematic instruction in phonemic awareness and phonics can help prevent reading difficulties for many children. Explicit instruction, in contrast with implicit or incidental teaching, follows a conscious and logical sequence from easier to more difficult. For example, teaching children to blend syllables like “bat” and “man” to make “batman” would happen before teaching that “bat” is made up of three sounds: /b/, /a/, and /t/. Systematic instruction aims to teach the alphabetic code, that is, the system of relationships between sounds in speech and letters. Two recent influential reports, Teaching Children to Read (National Reading Panel, 2000) and Preventing Reading Difficulties in Young Children (Snow, Burns & Griffin, 1998), summarize converging and convincing evidence that early literacy intervention that includes explicit and systematic instruction in phonological awareness and phonemic decoding is more effective than implicit approaches. Beyond phonological awareness and phonics training, both reports identify three further components of effective early literacy instruction: vocabulary, fluency, and comprehension. Findings from these two reports, known collectively either as “Scientifically-Based Reading Research” (SBRR) or as “evidence-based instruction,” are cited throughout the No Child Left Behind Act of 2001 (PL 107–110), especially in Part B, also known as Reading First.
To achieve the stated goal of all children reading by third grade, all schools receiving funding through Reading First will be required to provide SBRR instruction and more intensive intervention which includes all five core training components (phonological awareness, phonics, vocabulary, fluency, and comprehension) in order to reduce the proportion of students inappropriately identified for special education. In other words, good classroom instruction is necessary, but may not be sufficient, for all students. Therefore, schools have a financial incentive for selecting reading instructional materials including basal reading curricular programs as well as supplemental intervention materials that are consistent with SBRR. This brings me to the second point: Remediation of reading disabilities is not impossible. With sustained individualized intervention, or treatment, beyond classroom reading instruction some older students can catch up to their peers. In recent years, researchers have conducted very intensive interventions that have successfully remediated reading problems for a number of older children


(Alexander, Anderson, Heilman, Voeller & Torgesen, 1991; Lovett, 1997). These studies demonstrated that it took more than 60 hours of focused one-to-one, intensive (i.e. one to two hours per day) intervention by highly trained reading teachers to help children catch up to their peers (as evidenced by standard scores on word identification tasks). However, even when their word identification problems were successfully remediated, these students still lagged behind in fluency, or oral reading rates, and continued to experience difficulty with reading comprehension (Torgesen, 2000). Fluency may be constrained by lack of motivation, lack of practice, lack of vocabulary, or simply by slower processing. Some have questioned whether such “heroic” interventions can be delivered in general education or in special education classrooms (see Fuchs & Fuchs, 1998; O’Connor, 2000). Unfortunately, special education programs are often unable to provide intensive and individualized reading intervention (Vaughn, Moody & Schumm, 1998). In special education, teachers’ caseloads are high because many children placed in special education are not truly reading disabled but are rather instructional casualties (Simmons, Fuchs & Fuchs, 1991). In addition, the context of special education is changing. Under the Individuals with Disabilities Education Act (IDEA, 1997), an increasing number of students with reading disabilities are primarily served in the general education classroom. Moreover, neither special educators nor their general education peers reportedly feel prepared to teach reading to students with such diverse abilities (Moats & Lyon, 1996). A growing number of reading and special education researchers and educators have expressed concern that as many as 30% of children at risk for reading difficulties (e.g.
Blachman, 1994, 1997; Brown & Felton, 1990; Juel, 1994; Mathes, Howard, Allen & Fuchs, 1998; Shanahan & Barr, 1995; Smith-Burke & Jaggar, 1994; Torgesen, Morgan & Davis, 1992) and as many as 50% of children with special needs (e.g. O’Connor, Jenkins, Leicester & Slocum, 1993; O’Connor, Jenkins & Slocum, 1995) may not benefit from generally effective early literacy interventions. These students have been called “treatment-resistors” or “non-responders” (Blachman, 1994, 1997; Torgesen, Wagner & Rashotte, 1994). So, how many children are still being “left behind”? Torgesen (2000) recently estimated that only 2–6% of the school population might be unresponsive to intervention efforts. By Torgesen’s definition, treatment resistors are children who have had access to preventative programs but who still have not acquired word reading skills and remain below the 30th percentile on the Word Attack and Word Identification subtests of the Woodcock Reading Mastery Test-Revised (WRMT-R; Woodcock, 1987). If Torgesen’s estimates are accurate, they are very similar to federal incidence figures for children with reading disabilities.
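Torgesen’s working definition lends itself to a simple check. The Python sketch below is a hypothetical illustration and is not part of Torgesen (2000): it assumes subtest standard scores normed to a mean of 100 and a standard deviation of 15, converts them to percentiles under a normal curve, and reads the definition as requiring both subtests to fall below the 30th percentile. The function names are invented for this example.

```python
from statistics import NormalDist

# Assumption: standard scores follow a normal curve with mean 100 and SD 15.
NORM = NormalDist(mu=100, sigma=15)


def percentile(standard_score):
    """Percentile rank (0-100) of a standard score under the assumed norms."""
    return 100 * NORM.cdf(standard_score)


def is_treatment_resistor(word_attack, word_identification, cutoff=30.0):
    """True when BOTH subtest scores remain below the cutoff percentile.

    Requiring both subtests is this sketch's reading of the definition.
    """
    return (percentile(word_attack) < cutoff
            and percentile(word_identification) < cutoff)


print(is_treatment_resistor(85, 90))   # both scores fall below the 30th percentile
print(is_treatment_resistor(95, 85))   # Word Attack is above the cutoff
```

Under these assumed norms the 30th percentile corresponds to a standard score of roughly 92, so a child scoring 85 and 90 would be flagged while a child scoring 95 and 85 would not.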



Overview The purpose of this chapter is to describe the extant knowledge about the characteristics of children who do not respond to well-implemented and generally effective early literacy interventions. A further purpose is to examine two recent longitudinal studies and a long-term follow-up investigation that provide tentative evidence that students unresponsive to treatment are indeed reading disabled. In a “leave no child behind” world, this emerging area of research is particularly important because response to intervention, or treatment, offers an alternative to the IQ-achievement discrepancy-based formulas currently used to identify students with reading disabilities (Berninger & Abbott, 1994; Fuchs & Fuchs, 1998; Lyon et al., 2001; Vellutino et al., 1996). This alternative approach or paradigm is more proactive than the discrepancy paradigm because it does not require students to fall behind in order to qualify for help. In the primary, or prevention stage, of this response to intervention paradigm (see Fuchs & Fuchs, 1998), all students are guaranteed the opportunity to learn through good classroom reading instruction, reflecting use of SBRR. Then, in a secondary prevention stage, children who do not respond to instruction are identified for one or more intensive classroom interventions or treatments. During this secondary prevention stage, student progress would be monitored and diagnostic information gathered to tailor the intervention to meet the students’ needs. This stage is akin to the current pre-referral stage. Only those students who do not make adequate progress following good instruction and individualized intensive intervention would then be considered “truly reading disabled.” A response to intervention approach is consistent with IDEA (1997) because special educators and reading specialists would support classroom teachers in their prevention efforts. This approach is clearly more proactive than the current discrepancy approach. 
Children with reading disabilities would not have to wait for reading support until they fail. Currently many students do not qualify for special services until they fall far behind their expected reading achievement when remediation becomes more difficult (Adams, 1990). Moreover, special educators would have access to student progress monitoring reports during prevention, which may help them design more effective intensive interventions. Using the response to intervention approach could reduce current special education case-loads and allow more individualized reading instruction for the truly reading disabled (Vaughn, Moody & Schumm, 1998). In the next three sections of the chapter, I will present a summary of the findings from a recent review of the non-responder literature (Al Otaiba & Fuchs, 2002a). Next, I will present findings from two longitudinal studies that provide


tentative confirmation that non-responders eventually are referred for special education. Finally, I will discuss implications and a state-wide model to reduce the incidence of non-responders by translating SBRR research into policy and practice.
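The staged response-to-intervention process outlined in this overview can be summarized as a small decision rule. The Python sketch below is a simplified illustration under stated assumptions: the stage names and the `next_stage` function are invented here, and treating a child who responds to the secondary intervention as continuing in that stage under monitoring is an assumption, since the chapter does not specify what follows a successful response.

```python
from enum import Enum


class Stage(Enum):
    PRIMARY = "good classroom reading instruction"
    SECONDARY = "intensive classroom intervention with progress monitoring"
    CONSIDER_RD = "considered for identification as reading disabled"


def next_stage(responded_to_classroom, responded_to_intervention=None):
    """Return the stage a child would be in, given observed responsiveness.

    responded_to_intervention is None when the secondary stage has not
    yet been completed.
    """
    if responded_to_classroom:
        return Stage.PRIMARY
    if responded_to_intervention is None or responded_to_intervention:
        # Assumption: responders stay in (or are monitored after) this stage.
        return Stage.SECONDARY
    return Stage.CONSIDER_RD


print(next_stage(True).value)
print(next_stage(False).value)
print(next_stage(False, responded_to_intervention=False).value)
```

The point of the sketch is the ordering: a child reaches the identification step only after failing to respond to both classroom instruction and individualized intervention.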

WHAT IS KNOWN ABOUT THE CHARACTERISTICS OF NON-RESPONDERS? Al Otaiba and Fuchs (2002a) conducted a thorough review to synthesize the literature describing children unresponsive to generally effective early literacy interventions. Because this is a relatively new area of research, the authors included not only the eight most recent studies that were designed primarily to describe non-responders (Berninger et al., 1999; Hatcher & Hulme, 1999; Schneider, Ennemoser, Roth & Kuspert, 1999; Torgesen & Davis, 1996; Torgesen et al., 1999; Uhry & Shepherd, 1997; Vellutino et al., 1996; Vellutino, Scanlon & Lyon, 2000), but also 15 studies that were designed primarily to explore treatment effectiveness that provided limited information about non-responders (Ehri & Robbins, 1992; Fazio, 1997; Foorman, Francis, Fletcher, Schatschneider & Mehta, 1998; Foorman, Francis, Winikates, Mehta, Schatschneider & Fletcher, 1997; Fox & Routh, 1976; Hurford, 1990; Kasten, 1998; O’Connor et al., 1993; O’Connor, Notari-Syverson & Vadasy, 1996, 1998a, b; O’Shaughnessy & Swanson, 2000; Peterson & Haines, 1992; Snider, 1997; Vadasy, Jenkins, Antil, Wayne & O’Connor, 1997; Vandervelden & Siegel, 1997). In all, 23 studies were selected in which:

(a) participants ranged in grade level from preschool to third grade;
(b) children were included who were at-risk for reading disabilities;
(c) interventions targeted early literacy;
(d) outcomes reflected reading development; and
(e) students unresponsive to treatment were described.

Table 1 describes study participants, the definitions of unresponsiveness, the percentage and characteristics of unresponsive students. Table 2 provides background information about types of interventions, their intensity, duration, and effectiveness. I want to emphasize that the findings from the 23 studies reviewed by Al Otaiba and Fuchs (2002a) indicated that early literacy interventions helped most students, including many students with disabilities. However, depending on the individual study, sample selection criteria, and outcome measure, between 8 and 80% of students made little or no improvement.
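The wide range reported above (8 to 80%) partly reflects how unresponsiveness is defined. As a hedged illustration, the Python sketch below applies three criteria of the kind listed in Table 1 (no gain, less than half the mean gain, and a gain of fewer than 8 points) to a single set of gain scores. The scores and function name are invented for the example; in the reviewed studies each criterion was applied to a different measure and sample.

```python
from statistics import mean


def percent_unresponsive(gains, is_unresponsive):
    """Percentage of children whose gain meets the given unresponsiveness rule."""
    flagged = sum(1 for g in gains if is_unresponsive(g))
    return 100 * flagged / len(gains)


# Hypothetical pretest-to-posttest gains for ten children (not study data).
gains = [0, 1, 2, 3, 5, 8, 9, 12, 15, 20]
half_mean_gain = mean(gains) / 2  # 3.75 for these scores

criteria = {
    "no gain": lambda g: g <= 0,
    "less than half the mean gain": lambda g: g < half_mean_gain,
    "gained fewer than 8 points": lambda g: g < 8,
}

rates = {name: percent_unresponsive(gains, rule) for name, rule in criteria.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.0f}%")
```

For these invented scores the three rules flag 10%, 40%, and 50% of the same children, which is why percentages of non-responders cannot be compared directly across studies that define the term differently.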

Article

Demographics

Intervention studies conducted to describe unresponsive students Berninger et al. M age = 7 yrs., 2% Black, 8% (1999) Hispanic, 2% Asian, 4% Native American, M Verbal IQ = 91.60 Hatcher and M age = 7 yrs., M IQ = 68–122 Hulme (1999) Schneider et al. M age = 5 yrs. 7 mos., German (1999) Torgesen and Age: 5–6 yrs., 73% Black, low Davis (1996) SES, M Verbal IQ approximately 91 Torgesen et al. Age: 5–6 yrs., 26% Black, 2.1% (1999) Other, Verbal IQ > 75 Uhry and Shepherd (1997)

Age: 5–8 yrs., 17% Black, middle SES, IQ > 90

Vellutino et al. (1996, 2000)

Age: 5–8 yrs., Mostly white, middle SES, IQ > 90

Definition and Percentage of Unresponsive Students

Characteristics of Unresponsive Students

Growth slopes not different from 0. No growth on words trained: 52%; no growth on WRMT-R WA: 63%; no growth on WRMT-R WI: 75%. Not defined; no percentage reported.

Low PA, slow RAN, poor orthographic skills, and low verbal IQ. Low PA.

No gains during treatment, no percentage reported.

Slow RAN among students with lowest PA. Poor invented spelling, slow RAN, and low verbal ability.

No gains during treatment, segmenting: 30%; blending: 10%.

WRMT-R standard score 85 Did not significantly improve word-reading: 33% overall; (1992) 100% of poor segmenters. Vandervelden and Age: 5–7 yrs., low SES No gains in PA: 13% overall; 18% of students with low PA. Siegel (1997) Could not read more than one word: 27% overall; 36% of students with low PA.


Table 1. Child Characteristics.


Table 1. (Continued ) Article

Demographics

Definition and Percentage of Unresponsive Students

Studies exploring treatment effectiveness: Pre-literate readers with disabilities Fazio (1997) Age: 4–6 yrs., M non-verbal IQ: Difficulty learning and recalling a rhyming poem: percentage 85–115 not reported. Kasten (1998) M age = 5 yrs., low SES, White Did not display significant reading growth on WJ-R subtests. No percentage reported. O’Connor et al. (1993) Age: 4–6 yrs. Did not learn to identify rhyming oddities: 8%. Did not learn to blend onset-rime: 36%; Did not learn to segment first sound: 46%. O’Connor et al. (1996, Age: 5–7 yrs., 56% Black, Made less than half the mean gain in PA: general education 1998a, b) M Verbal IQ = 67 students, 18%; students with disabilities, 33%; mild mental retardation, 66%; learning disabilities, 38%; behavior disorders: 50%. Studies exploring treatment effectiveness: Older students Foorman et al. (1997) Age: 7–9 yrs., 32% Black, 24% low SES, Verbal IQ > 79 Foorman et al. (1998) Grades 1–2, 60% Black, 20% Hispanic, low SES Age: 7–9 yrs., IQ > 90

Snider (1997)

Age: 7–9 yrs., 10% low SES

O’Shaughnessy and Swanson (2000)

M age = 7 yrs., 8 mos., 4.4% Black, 2.2% Asian, 28.9% Hispanic, low SES, M IQ = 89.9 Age: 5–8 yrs., 50% low SES

Vadasy et al. (1997)

Learned fewer than 2.5 words on a 50 word list: Implicit code, researchers, 46% implicit code, teachers, 38% embedded code, 44%; and direct code, 16%. Post-treatment segmentation skills are poorer than students without disabilities. No percentage reported. Did not significantly improve reading rate and accuracy on oral reading fluency: 10%. Did not significantly improve rate and accuracy on oral reading fluency: phonological training, 20%; word analogy training, 27%. Gained less than 8 points on the Reading and Spelling subtests of the WRAT-R: 35%.

Low PA and poor verbal ability. Low IQ. Low PA.

Low PA and low IQ.

Low PA poor spelling, Spanish ethnicity, and low verbal IQ.

Low PA and younger children. Poor attention. Poor attention.

Poor attention.

Notes: WRMT-R = Woodcock Reading Mastery Tests-Revised (Woodcock, 1989). WA = Word Attack; WI = Word Identification; PA = Phonological Awareness; RAN = Rapid Automatized Naming; SES = Socioeconomic Status; PPVT = Peabody Picture Vocabulary Test; WJ-R = Woodcock–Johnson-Reading subtests; WRAT-R = Wide Range Achievement Test-Revised.

STEPHANIE AL OTAIBA

Hurford (1990)

Not defined, no percentage reported.

Characteristics of Unresponsive Students

Table 2. Treatment Characteristics.

Article | Treatment Description | Treatment Intensity, Duration, and Fidelity | Treatment Effectiveness

Intervention studies conducted to describe unresponsive students

Berninger et al. (1999) | Graduate students trained children to read words using whole word instruction, phonemic decoding, or whole word + phonemic decoding. | Individual tutorials in eight 30 min sessions. Total = 4 hrs. No fidelity reported. | All three groups made significant growth.

Hatcher and Hulme (1999) | Teachers (not the children’s) trained students using either phonological awareness training (P), Reading Recovery (R), or both (P + R). | Individual tutorials in forty 30 min sessions for 20 weeks. Total = 20 hrs. No fidelity reported. | P group made greater gains on phonological skills than other groups. Only P + R group made greater reading gains than controls and differences continued at 9-mo. follow-up.

Schneider et al. (1999) | Classroom teachers trained their own students using phonological awareness training. | Whole classroom instruction conducted daily in ten 15 min sessions for 6 mo. Total = 20 hrs. No fidelity reported. | Treatment students outperformed controls on phonological awareness at posttest, and on reading at end of Grade 2.

Torgesen and Davis (1996) | Graduate students conducted rhyming, blending, and segmenting training. | Small group format in four 20 min sessions per week for 3 mo. Total = 16 hrs. No fidelity reported. | Treatment students made significantly more growth in blending and segmenting than control students.

Torgesen et al. (1999) | Research staff and instructional aides trained students in phonological awareness + synthetic phonics (PASP) or embedded phonics (EP). A third group received regular classroom support (RCS). | Individual tutorial in four 20 min weekly sessions (two with staff and two with instructional aide). Total = 88 hrs. No fidelity reported. | PASP group scored higher than the other three groups on WRMT-R, WA, and WI; PASP group also scored higher than controls and RCS on WRMT-R, PC.

Uhry and Shepherd (1997) | Graduate students trained students in letter-sound correspondence, segmentation and guided reading in phonics-controlled texts and narrative texts and writing. | Individual tutorial two 1 hr sessions per week for 5 mos. Total = 32 hrs. No fidelity reported. | Most students made significant growth in WRMT-R, WI and WA.

Vellutino et al. (1996, 2000) | Research staff (certified teachers) trained students in phonemic awareness, phonetic decoding, reading in connected text, and writing. | Individual tutorial five 30 min sessions per week for 1–2 semesters. Total = 35–40 hrs. No fidelity reported. | 67% of tutored students improved beyond lowest 30th percentile on WRMT-R, WI and WA.


Table 2. (Continued ) Article

Treatment Description

Treatment Intensity, Duration, and Fidelity

Studies exploring treatment effectiveness: Beginning readers without disabilities Individual tutorial in four 15 min sessions Ehri and Robbins Research staff trained experimental students to across 1 mos. Total = 1 hr. No fidelity (1992) read words by analogy (e.g. cave, save) and reported. trained control students to read unrelated words (e.g. rain, save). Individual tutorial in two 30 min sessions for Fox and Routh Research staff trained students to read short (1976) words by blending two sounds together (e.g. 1 week. Total = 1 hr. No fidelity reported. me). Peterson and Research staff trained experimental students to Individual tutorial in seven 15 min sessions Haines (1992) read words by analogy (e.g. /b/ /all/, /ball/; /f/ across 1 mos. Total = 2 hrs. No fidelity /all/, /fall/). reported. Vandervelden and Research staff trained students in phonological Individual or small group training. In one Siegel (1997) awareness and reading and spelling using initial 30–45 min session per week. For 3 mos. sounds and rimes. Total = 6–9 hrs. No fidelity reported.

O’Connor et al. (1993) O’Connor et al. (1996, 1998a, b)

Research staff trained students in either rhyming, blending, or segmenting using direct instruction. Classroom teachers trained their students in phonological and print awareness.

Significant differences favored experimental students on word reading.

Significant effects on decoding for experimental students who could segment. Significant growth in decoding for experimental students with mid- to high-level segmentation skill. Significantly greater growth on phonological awareness and reading for experimental students.

Individual tutoring in four 15 min sessions across 1 mos. Total = 1 hr. No fidelity reported.

Children with speech or language impairments in the hand motions condition outperformed children in other conditions.

3-yr duration. No total reported. No fidelity reported.

Student’s scores on Reading Miscue Inventory showed he was becoming a developing reader. Experimental students made significantly more growth on rhyming, blending, and segmenting. Experimental students with and without disabilities outperformed controls on blending, segmenting, and word reading on WRMT. Effects continued to favor experimental students with disabilities 1-yr after training.

Small group training four 10 min sessions per week for 7 weeks. Total = 4.5 hrs. No fidelity reported. Whole classroom instruction 100–281 sessions over 6 mos. No total reported. No fidelity reported.


Studies exploring treatment effectiveness: Preliterate children with disabilities Fazio (1997) Research staff taught children to recall and retell a poem using direct instruction under four conditions: with or without hand motions and with or without singing the poem (phonological encoding). Kasten (1998) Phonics training in resource room setting; whole language classroom instruction.

Treatment Effectiveness

O’Shaughnessy and Swanson (2000) Vadasy et al. (1997)

Instructional aides conducted either phonological awareness and phonemic decoding training (PAT), word analogy training (WAT), or math training (MAT). Community volunteers trained students on phonological awareness, letter – sound correspondence, phonogram activities with magnetic letters, reading, and writing.

Whole classroom instruction 1 hr daily for 6 mos. Total = 120 hrs. No fidelity reported.

Students in synthetic phonics condition performed better than students in sight word condition in segmentation and reading.

Whole classroom instruction was 90 min daily; small group or individual tutorial was 30 min daily. 6 mos. Duration No fidelity reported.

Positive effects favored the direct code group on WJ-R Basic Skills and Passage Comprehension subtests.

Sessions were 30–45 min per day for 3–4 days. Total = 2.5–4 hrs. No fidelity reported.

Experimental students displayed significant improvement in segmentation.

Grouping format not reported. 30–45 min. daily across 9 mos. Total = 90 hrs. No fidelity reported. Small-group format. Three 30-min sessions across 6 weeks. Fidelity reported (94–97%).

Students improved their reading rate and accuracy on oral reading fluency.

Individual tutorial for 100 sessions across 6 mos. Total = 54 hrs. No fidelity reported.


Studies exploring treatment effectiveness: Older children Foorman et al. Resource room teachers taught their students (1997) using synthetic phonics (multi-sensory and systematic at the phoneme level), analytic phonics (implicit and at the onset-rime level), or sight word (whole word) approaches. Foorman et al. Classroom teachers used either direct code, (1998) embedded code, or implicit code. Research staff also conducted an implicit code training group. In addition, Title 1 teachers provided students with additional tutoring that was implicit. Hurford (1990) Research staff trained students to use a computer program that tried to help students discriminate between phonemes (e.g. /ri /li/). Snider (1997) Students’ resource room teacher used direct instruction.

PAT & WAT groups outperformed the MAT group on PA, phonological memory, and on word attack (WRMT-R). Effect sizes for segmentation and reading on WRMT favored experimental students.

Notes: WRMT-R = Woodcock Reading Mastery Test-Revised (Woodcock, 1987). WA = Word Attack; WI = Word Identification; PC = Passage Comprehension; WJ-R = Woodcock–Johnson-Reading subtest.


Variation in Treatment-related Characteristics Complicates Interpretation

Table 2 demonstrates the extreme variation in treatments from one investigation to another along a number of dimensions, including duration, intensity, and background of trainers. In seven studies, treatments were brief (1–9 hours); in seven additional investigations, the duration was longer (20–55 hours); and in five studies, treatments ran for more than 80 hours (up to three years). Although more than half of the investigators conducted one-to-one tutorials, researchers in five studies used small groups (Fazio, 1997; O’Connor et al., 1993; O’Shaughnessy & Swanson, 2000; Torgesen & Davis, 1996; Vandervelden & Siegel, 1997), and in six studies, they used whole-classroom formats (Foorman et al., 1997; Kasten, 1998; O’Connor et al., 1996, 1998a, b; Schneider et al., 1999; Snider, 1997). Whereas most trainers were graduate students or research staff, O’Shaughnessy and Swanson used paraprofessionals and Vadasy et al. (1997) used community volunteers as trainers. In only four studies (Foorman et al., 1997; Kasten, 1998; O’Connor et al., 1996; Snider, 1997) did special educators conduct the treatments, and general educators conducted the treatment in only one study (Schneider et al., 1999). This small number of interventions conducted by teachers in “real classrooms” undermines generalizability of the findings. Furthermore, the large variation in treatment-related characteristics complicates interpretations of the 23 studies as a group. Consequently, the percentage of non-responders may be related to treatment duration, treatment intensity, or the background or skill of the trainers rather than to the characteristics of unresponsive students. In addition, only one of the 23 studies reported fidelity of treatment data (see Table 2).
Without information detailing how faithfully treatments were conducted, we cannot assume that treatments were responsible for observed positive change in reading behavior.

A Closer Look at a Subset of Eight of the Twenty-three Studies

To provide further information about possible relationships between treatment-related issues and the percentage of non-responders, a subset of eight studies in which interventions lasted more than 10 hours and that reported the percentage of non-responders was selected (Foorman et al., 1998; O’Connor et al., 1996; Snider, 1997; Torgesen & Davis, 1996; Torgesen et al., 1999; Uhry & Shepherd, 1997; Vadasy et al., 1997; Vellutino et al., 1996). These studies were used to explore the relationship between percentage of non-responders and four treatment-related issues: (a) who implemented treatment; (b) the total hours of treatment; (c) the explicitness of the treatment; and (d) grouping size. First, each of these four variables was coded and entered into a database. Studies with more than one treatment were coded separately. The only treatment-related issue that was statistically significantly correlated with the percentage of non-responders was the explicitness of treatment (r = 0.80, p < 0.001). The mean percentage of non-responders in implicit approaches was 43.80% and, in explicit approaches, this percentage was 28.00%. This finding underscores the importance of providing explicit evidence-based interventions.

Child Characteristics of Non-responders

A preliminary examination of Table 3 shows the seven clusters of child characteristics that were associated with unresponsiveness to treatment:

(a) phonological awareness;
(b) phonological memory;
(c) rapid naming;
(d) general intelligence (note that this category subsumes verbal ability measured by PPVT-R);
(e) attention and behavior problems;
(f) orthographic awareness; and
(g) demographic information (e.g. SES, parent education).

Most notably, all but two teams of investigators explored the relationship between phonological awareness and treatment unresponsiveness, and in 70% of the studies phonological awareness was found to be a clear and important correlate. The importance of intelligence to treatment responsiveness is less clear: 22% of researchers reported a relationship, but 30% did not. This relationship, however, has only been explored within a limited range of intelligence (i.e. >75), with five notable exceptions (Kasten, 1998; O’Connor et al., 1993, 1996, 1998; Snider, 1997). All reported high frequencies of unresponsiveness, and among O’Connor et al.’s students with mild mental retardation, the percentage of non-responders reached 66%. Because the language ability of students in these studies was not reported, it is impossible to know whether treatment unresponsiveness was related specifically to low verbal ability or to a more general developmental disability.

The connections among the remaining five characteristics and treatment unresponsiveness have been explored even less consistently. Sixty percent of research teams did not address the importance of phonological memory, and 70, 61, 70, and 80% of the research teams did not explore rapid naming, attention or behavior, orthographic processing, or demographics, respectively (see Table 3). No study provided a direct test of the dual (or multiple) deficit hypothesis (Wolf & Bowers, 1999), which posits that students with combined deficits are more likely to be unresponsive than students with a single deficit. Thus, although there is suggestive evidence of the importance of this last set of characteristics, future research is obviously needed.

Table 3. Number of Studies in which Learner Characteristics were Significantly Associated with Non-responding Students.

Findings | Phonological Awareness (n = 21) | Phonological Memory (n = 7) | Rapid Naming (n = 7) | IQ or Verbal Ability (n = 15) | Attention (n = 9) | Orthography or Spelling (n = 7) | Demographics (n = 4)
Yes | 16 | 4 | 5 | 5 | 7 | 3 | 2
No | 1 | 2 | 1 | 7 | 2 | 4 | 0
Mixed | 4 | 1 | 1 | 3 | 0 | 0 | 2

Notes: The “phonological awareness” column should be read this way: In 16 of 21 studies exploring the importance of phonological awareness to treatment non-responsiveness, a statistically significant relationship was obtained; in one study, the relationship was non-significant; in four studies, results were mixed; and in two studies (23 − 21 = 2), the researcher did not explore the relationship. Total number of studies = 23. Yes = Statistically significant relationship. No = Non-significant relationship. Mixed = Combination of significant and non-significant relationships.

Limitations of the Current Non-responder Database

An even closer inspection of Tables 1–3 shows additional concerns about the current knowledge base. First and foremost, no common definition of the construct treatment unresponsiveness emerged from this review. Currently, children may be termed unresponsive because they make no growth, low growth, or because they do not catch up to average-achieving peers. This lack of an agreed-upon absolute standard or criterion for what it means to be a non-responder makes it difficult to make meaningful comparisons across studies and limits researchers’ ability to guide policy and practice. I would like to illustrate these differences in criteria by contrasting the definitions used in two of the studies. Torgesen et al.
(1999), as previously described, defined unresponsiveness as standard scores below 85 on word attack and word identification subtests of the WRMT-R. Defining unresponsiveness in terms of performance level may have an important drawback, especially for very low performing students because performance level is insensitive to students’ growth. For example, it may not be realistic to expect a first grade student with mental disabilities to catch up to her typically-developing peers, or to have a standard score above 85 on the WRMT-R. However, it is realistic to expect her to show progress, for example, to show improvement on her reading goals on her IEP (e.g. decoding a word like “cat” or recognizing letter-sounds, or sight words). O’Connor et al. (1996, 1998a, b), on the other hand, defined unresponsiveness as growth of less than half the mean gain of treatment students. Defining unresponsiveness exclusively in terms of growth may also be problematic. Higher



performing students may show little growth, but still demonstrate acceptable performance levels; lower performing students, including many with disabilities, may make relatively impressive growth, but have unacceptably low performance levels. When it comes to identifying non-responders, policy makers and educators will need a more meaningful approach to defining unresponsiveness that provides valid benchmarks that relate to future reading performance (e.g. success or failure).

A second concern apparent upon close inspection of Table 1 is that participants were drawn using very different sampling procedures and are, therefore, a heterogeneous group. These differences make it difficult to compare proportions and to compare the characteristics of non-responders across studies. Five investigations (Foorman et al., 1997; Hurford, 1990; O’Shaughnessy & Swanson, 2000; Snider, 1997; Uhry & Shepherd, 1997) included students with reading disabilities. Five additional studies included children with speech and language impairments (Fazio, 1997), cognitive delays, or behavior disorders (Kasten, 1998; O’Connor et al., 1993, 1996, 1998a, b). As previously reported, children with disabilities had much higher percentages of unresponsiveness than typically-developing peers (see also Fuchs, Fuchs & Thompson et al., 2002). Further, not all researchers specified their “cut-scores” for at-risk criteria; those who did reported a range from the lowest 35th percentile to the lowest 10th percentile of the distribution on either reading or phonological awareness outcomes. Moreover, these percentiles were based on study samples, not on normative populations. Thus, the lowest 10th percentile from the respective samples of two studies could define different groups of children. To date, the knowledge base about non-responders is still emerging.
Recent research has shown that with explicit phonemic awareness and phonics training, many, but still not all, children can be helped to “catch up” to peers in terms of word recognition. The next section of this chapter will describe longitudinal investigations that provide tentative confirmation that children who do not catch up are indeed reading disabled.
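The contrast drawn earlier in this section between a performance-level definition of unresponsiveness (standard scores below 85) and a growth definition (less than half the mean gain of treatment students) can be made concrete with a small sketch. The four students and all of their scores below are hypothetical, invented only for illustration, and the use of a single standard score per student is a simplifying assumption (Torgesen et al. used two WRMT-R subtests).

```python
from statistics import mean

# Hypothetical data, not from any reviewed study:
# (student, pretest raw score, posttest raw score, posttest standard score)
students = [
    ("A", 10, 30, 95),  # strong growth, adequate level
    ("B", 28, 31, 92),  # little growth, adequate level
    ("C", 2, 20, 78),   # strong growth, low level
    ("D", 3, 6, 70),    # little growth, low level
]

LEVEL_CUTOFF = 85  # standard-score criterion used in the performance-level definition


def level_nonresponders(data, cutoff=LEVEL_CUTOFF):
    """Performance-level definition: posttest standard score below the cutoff."""
    return {name for name, _, _, standard in data if standard < cutoff}


def growth_nonresponders(data):
    """Growth definition: gain smaller than half the mean gain of the group."""
    gains = {name: post - pre for name, pre, post, _ in data}
    threshold = mean(gains.values()) / 2
    return {name for name, gain in gains.items() if gain < threshold}


by_level = level_nonresponders(students)
by_growth = growth_nonresponders(students)
print("Level definition: ", sorted(by_level))
print("Growth definition:", sorted(by_growth))
print("Flagged by only one definition:", sorted(by_level ^ by_growth))
```

In this invented sample, the two definitions agree only on student D. Student B (adequate level, little growth) and student C (low level, strong growth) are each flagged by one definition and not the other, which is the comparability problem the section describes.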

ARE NON-RESPONDERS THE “TRULY” READING DISABLED?

The findings from two recent longitudinal studies and a long-term follow-up investigation extend the current data base about non-responders. The purpose of O’Connor’s (2000) study was to determine whether the percentage of non-responders could be reduced by providing layers of intervention that increased in intensity over kindergarten and first grade. Al Otaiba (2001) tracked non-responders for two years, across kindergarten and first grade, and then conducted


a follow-up study (Al Otaiba & Fuchs, 2002b) at the end of third grade. Together, the objectives of these two studies were: (a) to compare the percentages of non-responders to implicit classroom instruction vs. explicit intervention; (b) to examine the child characteristics of non-responders in order to develop a tentative set of markers; and (c) to determine how well non-responder status predicted school-identified reading difficulties by third grade.

O’Connor (2000)

The participants in O’Connor’s study were 189 children, 59 of whom were considered at-risk for reading difficulties. These at-risk students named fewer than 15 letters and fewer than four segments correctly in one minute, and scored below 86 on the combined letter-word and dictation subtests of the Woodcock Johnson (WJ; Woodcock & Johnson, 1990). By January of kindergarten, non-responders were identified who still scored below 86 on the WJ literacy subtests (letter-word identification and dictation) and who made limited growth on segmentation and letter-naming. In first grade, O’Connor adapted the non-responder criteria by adding the WJ word attack subtest, eliminating rapid letter naming, and using limited growth on blending and segmenting. During Layer 1, classroom teachers conducted whole-class explicit phonological awareness instruction using the Ladders to Literacy activities (O’Connor et al., 1998a, b). Participating students included children with and without disabilities. In Layer 2, the 25 kindergarten students who did not respond to Layer 1 received more intensive intervention in one-to-one tutorials. These 12-minute sessions were conducted three times a week for three months. In Layer 3, which began in first grade, 20 students were identified who either responded poorly to Layer 2 or whose standard scores on the Woodcock Johnson reading measures fell below 86. Then, these students received 14 weeks of small group instruction (four 30-minute sessions per week) that continued phonological training but also included explicit phonetic blending and decoding. In Layer 4, children were tutored individually by O’Connor herself for four weeks (four 15-minute sessions per week) using a similar intervention. By the end of second grade, all six of the non-responders to this fourth layer of intervention were identified for special education. O’Connor (2000) reported that 7% of children overall were non-responders.
Among children with disabilities, this percentage was 30%. Only 1% of the initially at-risk students performed in the normal range of the WJ subtests without



receiving O’Connor’s interventions. Moreover, O’Connor reported that across the two years, some children responded well to intensive treatment, but when they returned to regular classroom instruction, they lost ground. These findings suggest the need for ongoing and sustained intensive intervention (especially for children with disabilities) and the need to track children’s progress over time. She also cautioned that although the school reported that as a result of classroom instructional changes the proportion of children in the lowest quartile on the high-stakes Iowa Test of Basic Skills dropped from 40 to 17%, the number of children identified as needing special education for reading problems did not decrease. O’Connor did not focus on the characteristics of her non-responders.
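To give a rough sense of how much instructional time each of O’Connor’s supplementary layers involved, the session lengths and frequencies reported above can be converted to total contact hours. This is a back-of-envelope sketch, not a figure reported by O’Connor (2000). The conversion of “three months” to 12 weeks for Layer 2 is an assumption, and Layer 1 is omitted because no session schedule is given for it here.

```python
WEEKS_PER_MONTH = 4  # assumption: the text gives Layer 2 duration only as "three months"

# Session length (minutes), sessions per week, and number of weeks, as described in the text
layers = {
    "Layer 2": {"minutes": 12, "per_week": 3, "weeks": 3 * WEEKS_PER_MONTH},
    "Layer 3": {"minutes": 30, "per_week": 4, "weeks": 14},
    "Layer 4": {"minutes": 15, "per_week": 4, "weeks": 4},
}


def contact_hours(layer):
    """Total contact time in hours for one layer of intervention."""
    return layer["minutes"] * layer["per_week"] * layer["weeks"] / 60


hours = {name: contact_hours(spec) for name, spec in layers.items()}
for name, total in hours.items():
    print(f"{name}: about {total:.1f} hours")
```

Under these assumptions, a child who passed through all three supplementary layers would have received on the order of 39 hours of additional instruction, most of it in the small-group Layer 3.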

Al Otaiba (2001) and Al Otaiba and Fuchs (2002b)

These studies were part of a larger investigation conducted in a southern metropolitan school district that was designed to examine the two-year effects of early literacy intervention on the reading acquisition of 312 children (see Fuchs et al., 2001, 2002). The students’ teachers in kindergarten and first grade were assigned randomly to treatment conditions; therefore, students received one of the following:

(a) two years of treatment;
(b) only one year of treatment in kindergarten;
(c) only one year of treatment in first grade; or
(d) typical classroom instruction with no additional treatment.

All teachers used the district adopted grade-level basal reading programs from Harcourt Brace (e.g. Farr & Strickland, 1995). These basal reading programs took an implicit approach to phonological awareness and to phonics instruction (Stein, Johnson & Gutlohn, 1999). By contrast, the kindergarten treatments were explicit and systematic interventions. They consisted of either teacher-directed phonological awareness activities derived from Ladders to Literacy (O’Connor, Notari-Syverson & Vadasy, 1998a) or a combination of these teacher-directed activities and peer-mediated phonological awareness and decoding practice (Peer-Assisted Learning Strategies for Kindergarten; K-PALS). K-PALS sessions were conducted three times per week for 16 weeks. Each session lasted approximately 30 minutes. The first grade treatment was also explicit and peer-mediated. It consisted of Peer-Assisted Learning Strategies (PALS), which was conducted three times a week for approximately 20 weeks. First-Grade PALS lessons include phonological awareness,


decoding, and sight word training as well as reading short stories (i.e. reading connected text).

Identification of Non-responders

At the end of kindergarten, unresponsiveness to kindergarten treatment was defined as performing in the lowest 30th percentile of treatment students in terms of amount of growth on letter-sound and segmentation fluency measures. The Rapid Letter-Sound test (RLS) from the work of Levy and Lysynchuk (1997) was adapted to assess the number of letter-sounds a student named correctly in one minute. Students were shown a sheet with lower case letters and asked to tell each letter sound. The Yopp-Singer segmentation test (1994) required children to say the sounds in words (e.g. say the sounds in cat). This test strongly correlated with the Dynamic Indicators of Basic Early Literacy Skills (DIBELS; Kaminski & Good, 1996) phoneme segmentation fluency measure (r = 0.77; Good, Simmons & Kame’enui, 2001). None of the unresponsive students could segment more than 12 phonemes in one minute or identify more than 11 letter-sounds per minute. Recently, Good, Simmons and Kame’enui (2001) reported that students who named fewer than 10 segments per minute on the DIBELS phoneme segmentation fluency measure at the end of kindergarten were at-risk for poor reading outcomes in first grade. Unresponsiveness to treatment in first grade was also defined in terms of fluency: oral reading fluency. However, evaluating students’ growth on this measure was not possible because oral reading fluency measures are too difficult in the fall of first grade. Therefore, a criterion or benchmark of 40 words per minute was selected because reading at or above 40 words per minute on unseen first grade text predicts good future reading (Good et al., 1999). Oral reading fluency on text was assessed by having students read aloud for one minute on three grade-level reading passages.
To calculate the number of words read correctly in one minute, examiners averaged the number of correct words read across the three samples. Test-retest reliability on oral reading fluency measures ranges from 0.93 to 0.96 (Fuchs, Deno & Marston, 1983). By contrast, students were identified as either sometimes or always responsive if they were responsive to one or to both years of treatment, respectively. Students were considered responsive to kindergarten treatment if their growth on the segmentation and rapid letter sound fluency measures was at or above the treatment mean, and were considered responsive to first grade treatment if their oral reading fluency was at or above the treatment mean.

What Proportion of Students were Non-responders?

Of the 312 students in the larger study, 132 students met either the unresponsive or the responsive criteria. However, 28 students were lost to the study because



they moved, leaving a total of 104 participants. At the end of first grade, these 104 students were classified into three groups in terms of their responsiveness to kindergarten and first grade treatment: never responsive, sometimes responsive, and always responsive. Typical classroom control students were also categorized according to the same unresponsiveness/responsiveness criteria used for treatment students in order to compare proportions of non-responders across conditions. Of the students who received one or two years of treatment in the context of the larger study, only 7.05% were never responsive to treatment. Recall that this is essentially the same percentage of non-responders reported by O’Connor (2000). A higher proportion (i.e. 25.35%) of the students in the typical classroom control group, by contrast, was never responsive. This finding supports Vellutino et al.’s (1996) assertion that many children who appear to have reading difficulties may not have received adequate instruction or practice on important pre-reading skills. However, among children with disabilities (mostly speech or language disorders) the proportion of never responsive students was higher. Twenty-seven percent of children with disabilities were non-responders, slightly lower than the 30% O’Connor (2000) reported and much lower than the 67% of non-responders in the typical classroom control condition. While these findings should be interpreted with caution due to the small sample size, they do suggest that children with speech and language disorders who are unresponsive to kindergarten treatment may require more intensive intervention than whole class phonological training or peer-mediation. This is congruent with findings from a best evidence synthesis of the literature on the efficacy of peer tutoring for students with high-incidence disabilities (Mathes & Fuchs, 1994).
Mathes and Fuchs (1994) reported that although peer tutoring was more effective than typical reading instruction, it was less effective than one-to-one teacher tutoring or teacher-led small group explicit interventions. Children unresponsive in kindergarten rarely changed status in first grade. Most (i.e. 91.89%) children, and all children with IEPs who were identified as unresponsive in kindergarten, remained unresponsive in first grade. Thus, it appears that unresponsiveness to treatment is relatively stable. This finding is consistent with Vellutino et al.’s (1996) suggestion that unresponsiveness to early intervention may be an important indicator of reading disability.

Measures Selected to Describe Characteristics of Non-responders

Measures were included to reflect characteristics associated with unresponsiveness by previous research (Al Otaiba & Fuchs, 2002a, b). Fidelity of teacher and student implementation was also measured to determine whether differences


in how accurately treatments were implemented affected responsiveness to treatment. Phonological encoding was evaluated using the Word Sequences and Sentence Imitation subtests of the Detroit Tests of Learning Aptitude-3 (DTLA-3; Hammill, 1991) in order to assess two types of phonological encoding: words that are unrelated (e.g. cold, late, full) and words that are in the context of a sentence (e.g. I saw a fire on the way to school). Phonological discrimination was evaluated using the Word Discrimination subtest of the Test of Language Development 2-Primary (TOLD-P:2; Hammill & Newcomer, 1988). In this subtest, students are asked to tell whether pairs of words are the same or different (e.g. chop-shop). For naming speed, the Rapid Letter Naming Test (RLN) was used to determine the number of letters a child can name in one minute, which is a strong predictor of future reading ability (Adams, 1990; Juel, 1988). In this test, students are shown upper and lower case letters and asked to name them as quickly as they can. Three types of verbal ability were evaluated. General verbal ability was measured with the Peabody Picture Vocabulary Test-Revised (PPVT-R; Dunn & Dunn, 1981), which is a widely-used, individually administered, norm-referenced test of receptive vocabulary. Syntactic knowledge was also included because students unresponsive to treatment were differentiated on syntax in at least one study. In the Grammatic Completion subtest of the TOLD-P:2 (Hammill & Newcomer, 1988), students are asked to supply the missing word in a sentence (e.g. John likes to cook every day. Yesterday, he ____. [“cooked”]). The Achenbach Child Behavior Checklist (Achenbach, 1994) was administered by the classroom teacher to address students’ attention and classroom behavior. Teachers also reported if students were receiving special education services, in which case students’ records were examined for goals on students’ Individualized Education Plans (IEPs).
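The first-grade responsiveness criteria and the never/sometimes/always grouping described under Identification of Non-responders can be sketched as a pair of small functions. This is an illustrative reading of the prose, not the studies’ actual scoring procedure. The function names, the "unclassified" label for students who met neither criterion, and the rule that a student counts as never responsive when no year is responsive and at least one is unresponsive are all assumptions made for the example, as are the scores used.

```python
from statistics import mean

ORF_BENCHMARK = 40  # words correct per minute; the first-grade benchmark named in the text


def orf_score(passage_scores):
    """Average words read correctly in one minute across the three passages."""
    return mean(passage_scores)


def first_grade_status(passage_scores, treatment_mean):
    """Classify one student's first-grade response (hypothetical helper)."""
    score = orf_score(passage_scores)
    if score < ORF_BENCHMARK:
        return "unresponsive"
    if score >= treatment_mean:
        return "responsive"
    return "unclassified"  # assumption: met neither criterion


def overall_status(kindergarten, first_grade):
    """Combine yearly statuses into never/sometimes/always responsive."""
    yearly = [kindergarten, first_grade]
    responsive_years = yearly.count("responsive")
    if responsive_years == 2:
        return "always responsive"
    if responsive_years == 1:
        return "sometimes responsive"
    if "unresponsive" in yearly:
        return "never responsive"  # assumption: no responsive year, at least one unresponsive
    return "unclassified"


# Hypothetical student: unresponsive in kindergarten, three invented passage scores in first grade
g1 = first_grade_status([35, 38, 41], treatment_mean=55)
print(g1, "->", overall_status("unresponsive", g1))
```

Because the benchmark is an absolute level while the responsive criterion is relative to the treatment mean, some students fall between the two, which is consistent with only 132 of the 312 students meeting either criterion.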
Which Characteristics Differentiated Non-responders?

Findings from a discriminant function analysis suggest that the variables that reliably differentiated never, sometimes, and always responsive students were: naming speed, attention/behavior, verbal ability, and phonological encoding. This combination of markers, in conjunction with access to treatment, correctly identified 82.4% of the non-responders. Although these markers correctly identified most never responsive students, 17.6% of students were not predicted to be never responsive using these markers. These false negatives included four never responsive students predicted to be sometimes responsive and one student predicted to be always responsive. Thus, such students would not have been identified as likely to need more intensive treatment. In addition, the markers identified eight students as never responsive who actually were sometimes responsive and four students as never responsive who were, in fact, always responsive. The practical implications of such false positives would be of less concern than those of false negatives, however: such children might receive more intensive treatment (conceivably at a higher cost) than needed.

There is an important caveat: Non-responders were more often in kindergarten classrooms where teachers had lower ratings on instructional adaptations (adapting level of difficulty for students or allowing lower-performing students to respond) and in first grade classrooms where PALS lessons were conducted with lower treatment fidelity. These issues raise concerns about whether students were unresponsive due to poor treatment fidelity. Moreover, a large proportion of never responsive students were in the classroom control group. No data were collected regarding the quality or type of classroom instruction received by these students.

The characteristics that distinguished non-responders from responders were relatively consistent with other investigations and with current reading theory. However, these studies’ findings may extend the literature because classroom teachers, rather than research staff, conducted treatment in their own real-world (diverse) classrooms. This is important given that the general education classroom is the default placement for increasing numbers of young children with disabilities as a result of the Individuals with Disabilities Education Act (IDEA). Other researchers who provided clinical or pull-out interventions have also reported that non-responders were characterized by naming speed deficits (e.g. Berninger et al., 1999; Torgesen & Davis, 1996; Torgesen et al., 1999; Uhry & Shepherd, 1997; Vellutino et al., 1996). This finding is also consistent with the theory that deficits in naming speed may interact with other child characteristics to produce intractable reading disabilities (e.g. Wolf, 1991).
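As a hedged illustration of how the classification figures reported for the discriminant function analysis relate to one another, the sketch below tabulates actual against predicted group membership and derives the hit rate, the false negatives, and the false positives for the never responsive group. The off-diagonal counts for that group (four and one false negatives; eight and four false positives) come from the text. The remaining cells are not reported in this passage, so the values marked as placeholders are assumptions, and the computed hit rate need not match the reported 82.4%.

```python
# Rows are actual groups; inner keys are predicted groups.
# Cells marked "placeholder" are assumptions, not values from the chapter.
confusion = {
    "never":     {"never": 28, "sometimes": 4,  "always": 1},   # 28 is a placeholder
    "sometimes": {"never": 8,  "sometimes": 18, "always": 0},   # 18 and 0 are placeholders
    "always":    {"never": 4,  "sometimes": 0,  "always": 40},  # 0 and 40 are placeholders
}


def hit_rate(table, group):
    """Proportion of students actually in `group` who were predicted to be in it."""
    row = table[group]
    return row[group] / sum(row.values())


def false_negatives(table, group):
    """Students actually in `group` who were predicted to be in another group."""
    return sum(n for predicted, n in table[group].items() if predicted != group)


def false_positives(table, group):
    """Students predicted to be in `group` who actually belong to another group."""
    return sum(row[group] for actual, row in table.items() if actual != group)


if __name__ == "__main__":
    print("false negatives:", false_negatives(confusion, "never"))
    print("false positives:", false_positives(confusion, "never"))
    print(f"hit rate: {hit_rate(confusion, 'never'):.1%}")
```

Keeping false negatives and false positives as separate functions mirrors the argument in the text that the two kinds of error carry different practical costs.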
Similarly, poor attention and conduct have been attributed to non-responders (Snider, 1997; Uhry & Shepherd, 1997; Vadasy et al., 1997; Vellutino et al., 1996). General verbal ability also differentiated non-responders from responders. Prior findings with regard to the impact of general verbal ability on responsiveness to treatment have been equivocal. On the one hand, several investigations of diverse populations reported that low general verbal ability was associated with treatment unresponsiveness (e.g. Fazio, 1997; Foorman et al., 1997; Torgesen & Davis, 1996). However, Vellutino et al. (1996) did not find PPVT-R, general verbal intelligence, or semantic deficits to be related to treatment responsiveness with mostly Caucasian and middle class participants. Thus, for children from lower income and multicultural families, verbal ability may be associated with responsiveness to treatment. Other researchers (e.g. Adams, 1990; Wolf & Bowers, 1999) have suggested that vocabulary may play a role in reading fluency as it is easier to decode familiar rather than unfamiliar words.


In comparison with responders, non-responders had more difficulty with remembering words in sentences, or phonological encoding. Children unable to recognize and make use of syntactic information in oral language would likely fail to use syntactic or grammatical context clues while reading text, which in turn may have implications for their reading fluency. On the other hand, there were no significant differences among groups on phonological encoding of abstract words. Thus, it appears that students’ memory span for words in abstract, or unrelated, lists did not differentiate responsiveness to treatment, whereas their memory for sentences did. This finding differs from Vellutino et al. (1996), who reported differences between responsive vs. unresponsive students using a measure that assessed immediate and delayed recall of abstract words. Nor were there any significant differences among never, sometimes, and always responsive students on demographic variables such as race, age, or socioeconomic status. This finding differs from Foorman et al. (1997), but the two populations differed because Foorman et al.’s participants included students with English as a second language.

Study 2: Third Grade Follow-up

As indicated in the prior section, a follow-up study was conducted at the end of third grade to determine whether at this critical time non-responders differed significantly from responders in their reading ability, retention rates, and participation in special education. To my knowledge, this was the first follow-up study to confirm whether non-responders were in fact identified by schools as having reading difficulties or reading disabilities. At the end of third grade, despite a good deal of detective work and assistance from the school district, only 51 of the original 104 Study 1 participants were located. The remaining students were not accessible because they had moved out of the school district, were home-schooled, or attended private school.
Of the 51 participants located, only one child’s parent did not consent for her to be tested. The 50 follow-up participants were tested on reading performance using the WRMT-R word attack and word identification subtests, as well as an oral reading fluency measure. In addition, participants’ classroom teachers were interviewed to determine whether students had an IEP with reading goals, had received tutoring services in reading, had been retained at grade level, or had required a student study team and subsequent pre-referral reading intervention. All second and third grade teachers used the same district-adopted implicit basal reading program as did the kindergarten and first grade teachers: Harcourt Brace (e.g. Farr & Strickland, 1995). It was not possible, however, to observe classroom instruction or special education instruction. Table 4 describes the attrition rates, responsiveness status, and the third grade educational placement history for each of the remaining 50 participants. The significant differences among never, sometimes, and always responsive students were still evident at the end of third grade in terms of word identification, word attack, and oral reading fluency. Table 4 shows four never responsive students were not identified by schools as having reading difficulties. However, all four were in the no-treatment group and had only received typical classroom instruction. By contrast, every non-responder who received treatment in either kindergarten or first grade was identified by schools as having reading difficulties by the end of third grade. Because of their reading difficulties, non-responders were retained or referred for special education. Thus, the predictive accuracy of responsiveness status was slightly lower (92%) when students who received only the more implicit typical classroom instruction were included.

Table 4. Attrition, Number of Remaining Participants, and Third Grade Educational Placement by Responsiveness Status.

5/99–5/01 | Never (n = 34–21) | Always (n = 44–21) | Sometimes (n = 26–12) | Total (n = 104–54)
Attrition | 19 moved; 1 LD private; 1 no-consent | 16 moved; 2 no consent; 3 home school | 11 moved; 1 home school |
Third Grade | 4 | 23 | 13 | 39
IEP reading goals | 5 | 0 | 1 | 6
Retained/Referred for reading diff. | 4 | 0 | 0 | 5
Total | 13 | 23 | 14 | 50
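The arithmetic behind Table 4 can be sketched as follows. The group sizes and attrition counts are taken from the table headings (for example, "n = 34–21" is read as 34 students at the start and 21 lost). How the 92% figure was derived is not spelled out in the text, so the last step rests on an assumption: that the four never responsive students who were not identified by schools are the only mismatches among the follow-up participants.

```python
# Counts read from the Table 4 headings: (students at start, students lost to attrition).
groups = {
    "never": (34, 21),
    "always": (44, 21),
    "sometimes": (26, 12),
}


def attrition_rate(start, lost):
    """Proportion of the starting sample that could not be followed up."""
    return lost / start


remaining = {name: start - lost for name, (start, lost) in groups.items()}
total_start = sum(start for start, _ in groups.values())
total_remaining = sum(remaining.values())

# Assumption (not stated in the chapter): the four unidentified never responsive
# students are the only cases where responsiveness status and school identification disagree.
unidentified_never = 4
accuracy = (total_remaining - unidentified_never) / total_remaining

if __name__ == "__main__":
    for name, (start, lost) in groups.items():
        print(f"{name}: {attrition_rate(start, lost):.1%} attrition, {remaining[name]} remaining")
    overall = attrition_rate(total_start, total_start - total_remaining)
    print(f"overall: {overall:.1%} attrition")
    print(f"agreement under the stated assumption: {accuracy:.0%}")
```

Under that assumption the agreement works out to 46 of 50 students, which is consistent with the reported 92%. The per-group attrition rates also make concrete why the findings call for cautious interpretation.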

Summary and Some Limitations of the Studies

On the one hand, the O’Connor (2000) and the Al Otaiba and Fuchs (2002b) findings demonstrate that the percentages of non-responders can be reduced. On the other hand, they also suggest how intensive interventions must be for the most difficult-to-reach children, including many students with disabilities. For instance, Al Otaiba and Fuchs reported that non-responders had many characteristics of students with reading disabilities and that their responsiveness status rarely improved. Moreover, all of the kindergarten non-responders with IEPs (for speech and language) remained unresponsive in first grade. O’Connor expressed surprise and concern that two of her non-responders had not yet been identified by the school as having reading difficulties by the end of first grade. Had she been able to track students through second and third grade, however, these students might eventually have qualified for services. By contrast, all of the non-responders to treatment identified by Al Otaiba and Fuchs were identified by schools as having reading difficulties. Their findings, however, must be interpreted with caution because of the high attrition rates. Clearly, the schools that participated in Al Otaiba and Fuchs (2002b) attempted to provide non-responders with additional help: by providing them with tutoring, by retaining them in grade 1 or grade 2, by conducting pre-referral meetings, or by identifying them for special education services. Unfortunately, limited resources precluded observations of these additional services or detailed descriptions of classroom instruction. The same limitation is true in O’Connor’s (2000) study. Further research is needed to see if the characteristics of non-responders can provide guidance in differentiating or individualizing interventions. For example, non-responders with behavior problems might benefit from motivational support. In a similar vein, research is needed to explore how much instructional time should be allotted to each of the five components and how instruction can be differentiated. For example, a logical research question might be: Do non-responders with very low general verbal ability improve in reading when they receive specialized vocabulary instruction?

USING WHAT WE HAVE LEARNED: A MODEL OF RESEARCH-TO-POLICY-TO-PRACTICE

Although unresponsiveness to treatment is still a relatively new area of research, it has become highly visible in current efforts to prevent reading difficulties. This nascent research has already influenced policy, and is embodied in the Leave No Child Behind Act of 2001, which aims to help every child read by third grade. In Florida, the Florida Center for Reading Research is helping coordinate efforts to use this research. Policy makers, informed by research, will provide financial incentives for educational practice to change. In my conclusions, I will focus on this model.

Providing Explicit and Systematic Instruction

Clearly, the good news is that we know that explicit and systematic reading instruction will help the vast majority of students learn to read by the end of third grade. Moreover, findings from the review of the literature (Al Otaiba & Fuchs, 2002b) suggest that classroom teachers could lower the percentage of non-responders simply by providing more explicit classroom reading instruction. To that end, teachers need explicit basal reading programs for instruction as well as supplemental materials for intervention. Basal programs should include aligned teacher and student materials that are clearly organized and that include all five essential SBRR components: phonemic awareness, phonics, fluency, vocabulary, and comprehension. Teachers in Florida will also receive professional development to help them learn why the research shows explicit and systematic instruction is critical for at-risk students. The strongest basal reading programs could serve as an effective support to teacher professional development by providing a concise update of the current research. Teachers and researchers will work together to improve instructional strategies, to intensify instruction, and to individualize instruction for struggling readers.

Identifying Non-responders

Although no precise definition of non-responder emerged from the review of the literature, it is clear that a combination of early screening and progress monitoring is needed to identify children who are not responding. In Florida, students will be tested four times a year to select students for more intensive intervention. Good, Simmons and Kame’enui (2001) have recently shown that benchmarks or cut-off scores for DIBELS and oral reading fluency measures have good predictive validity for high stakes reading tests at grades three and four. Al Otaiba (2001) and O’Connor (2000) also have shown that phoneme segmentation fluency measures and rapid letter naming may be sensitive enough to distinguish non-responders from responders, even among young children with disabilities. Florida will join other states in conducting large-scale research that is needed to confirm the predictive and concurrent validity of these measures with high stakes testing. Progress monitoring on alternate forms of the same curriculum-based measure will allow teachers and principals to decide whether students are catching up with their peers.
Additional diagnostic measures will then be used as needed to further guide intervention. These might include teacher-made tests, behavior observations, or standardized individual tests of phonological awareness or vocabulary. All three types of measures (screening, progress monitoring, and diagnostic measures) need to be valid and reliable and relatively easy and time-efficient for teachers to administer.
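One way to make the "catching up with their peers" decision concrete is to compare a student's rate of growth across repeated assessments with a reference growth rate. The sketch below is only an illustration under stated assumptions: the chapter does not specify a decision rule, the least-squares slope is one common choice, and the assessment weeks, scores, and peer growth rate are hypothetical values.

```python
def growth_slope(weeks, scores):
    """Ordinary least-squares slope of scores regressed on weeks (score units per week)."""
    if len(weeks) != len(scores) or len(weeks) < 2:
        raise ValueError("need at least two paired observations")
    n = len(weeks)
    mean_w = sum(weeks) / n
    mean_s = sum(scores) / n
    sxx = sum((w - mean_w) ** 2 for w in weeks)
    if sxx == 0:
        raise ValueError("assessment weeks must not all be identical")
    sxy = sum((w - mean_w) * (s - mean_s) for w, s in zip(weeks, scores))
    return sxy / sxx


def is_catching_up(weeks, scores, peer_slope):
    """True when the student's growth rate exceeds the reference (peer) growth rate."""
    return growth_slope(weeks, scores) > peer_slope


# Hypothetical example: four assessments in a school year, scores on alternate
# forms of the same measure, and an assumed peer growth rate of 0.5 per week.
weeks = [0, 9, 18, 27]
student_scores = [10, 16, 25, 31]
peer_slope = 0.5

if __name__ == "__main__":
    print(f"student growth per week: {growth_slope(weeks, student_scores):.2f}")
    print("catching up:", is_catching_up(weeks, student_scores, peer_slope))
```

A slope comparison only shows whether the gap is narrowing; whether a student has actually reached a benchmark would still need to be checked against the relevant cut-off score.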

Reducing the Percentage of Non-responders Through Immediate and Individualized Intervention

We do not yet know the exact instructional conditions that must be available for schools to leave no child behind. However, we can “stack the cards” by focusing on strategies to reduce the percentage of non-responders. First, we need to maximize the intensity of instruction and intervention for these children for as long as it takes for them to succeed. Even when students are reading successfully and may not need close monitoring of their progress, routine screenings should still be used to ensure they do not fall behind once they return to less-intensive instructional programs. Future research like O’Connor’s (2000) investigation is needed to examine the degree of intensity of instruction non-responders need and then to find effective and efficient ways to make that possible in classrooms with diverse students. In addition, because researchers generally report findings “on average,” such as the NRP report which suggested small group instruction is as effective as one-to-one tutoring, these findings may not necessarily generalize to non-responders. Therefore, non-responders to less-intensive interventions should be identified early as they are not likely to benefit from “more of the same.” This recommendation appears particularly important for students with disabilities. For example, while PALS was successful on average for most children with disabilities and dramatically reduced the proportion of non-responders among children with disabilities (from two thirds to one quarter), it is important to keep in mind that another year of PALS did not help those children who did not benefit from kindergarten treatment. Future research could identify which children benefit from: (a) explicit classroom instruction; (b) explicit peer-mediated intervention; (c) explicit individual or small-group intervention; and which children might need all three. To date, although relatively low-intensity tutorial interventions using peers (Al Otaiba & Fuchs, 2002b) or volunteers (Vadasy et al., 1997) appear more effective than typical implicit classroom instruction, this comparison could be misleading.
It is important to recognize that both interventions were explicit, and have not yet been compared directly to classroom instruction that is also explicit. This point is relevant because for non-responders, without such a comparison, it is not yet certain where and how to direct schools’ limited resources. Second, we need more research to identify ways to enhance interventions by differentiating or individualizing based on children’s characteristics. For example, providing extra emotional support (praise or motivation) or scaffolding (breaking down more difficult tasks into easier steps) could reduce the number of non-responders. Juel (1996) has reported that the level of support offered by tutors during beginning reading instruction is an important variable in evaluating treatment effectiveness. Al Otaiba (2001) found that teachers who used more instructional adaptations (i.e. adapted the level of difficulty for students, gave more explicit help, and allowed more low-performing students to respond) had fewer non-responders than teachers who did not use such adaptations. In conclusion, perhaps the most important implication of this chapter is that even with financial incentives for change that will be funded by Reading First, we have a long way



to go to create the necessary conditions and commitment in all schools to provide the kind of intensive, individual, and sustained support needed for all children to read by third grade.

ACKNOWLEDGMENTS

This research was supported in part by Grant #H324D000033 and Grant #H324B0049 from the Office of Special Education Programs in the U.S. Department of Education to Vanderbilt University. The paper does not necessarily reflect the position or policy of the funding agency and no official endorsement by it should be inferred. Portions of this chapter were presented at the annual meetings of the Society for the Scientific Study of Reading in Boulder, CO, and Council for Exceptional Children in Kansas City, MO, both in 2001. Correspondence should be addressed to Stephanie Al Otaiba, Florida State University, Department of Special Education, 205 Stone Building, Tallahassee, FL 32306-4459.

REFERENCES

Achenbach, T. M. (1994). Child behavior checklist. Burlington, VT: University Medical Education Associates. Adams, M. J. (1990). Beginning to read: Thinking and learning about print. Cambridge, MA: MIT Press. Alexander, A., Anderson, H., Heilman, P. C., Voeller, K. S., & Torgesen, J. K. (1991). Phonological awareness training and remediation of analytic decoding deficits in a group of severe dyslexics. Annals of Dyslexia, 41, 193–206. Allington, R. L. (1994). The schools we have. The schools we need. The Reading Teacher, 48, 14–29. Al Otaiba, S. (2001). IRA outstanding dissertation award for 2001: Children who do not respond to early literacy instruction: A longitudinal study across kindergarten and first grade. (Abstract). Reading Research Quarterly, 36, 344–345. Al Otaiba, S., & Fuchs, D. (2002a). Characteristics of children who are unresponsive to early literacy intervention: A review of the literature. Remedial and Special Education, 23, 300–316. Al Otaiba, S., & Fuchs, D. (2002b). Non-responder: A synonym for reading disabled? Can third grade reading disabilities be predicted by responsiveness to early literacy intervention? Paper presented at the annual meeting of the Society for the Scientific Study of Reading, Chicago, IL. Berninger, V. W., & Abbott, R. D. (1994). Redefining learning disabilities: Moving beyond aptitude-achievement discrepancies to failure to respond to validated treatment protocols. In: G. R. Lyon (Ed.), Frames of Reference for the Assessment of Learning Disabilities: New Views on Measurement Issues (pp. 163–184). Baltimore, MD: Paul Brookes. Berninger, V. W., Abbott, R. D., Zook, D., Ogier, S., Lemos-Britton, Z., & Brooksher, R. (1999). Early intervention for reading disabilities: Teaching the alphabet principle in a connectionist framework. Journal of Learning Disabilities, 32, 491–503.


Blachman, B. A. (1994). What we have learned from longitudinal studies of phonological processing and reading, and some unanswered questions: A response to Torgesen, Wagner and Rashotte. Journal of Learning Disabilities, 27, 287–291. Blachman, B. (1997). Early intervention and phonological awareness: A cautionary tale. In: B. Blachman (Ed.), Foundations of Reading Acquisition and Dyslexia (pp. 408–430). Mahwah, NJ: Erlbaum. Brown, I. S., & Felton, R. H. (1990). Effects of instruction on beginning reading skills in children at risk for reading disability. Reading and Writing: An Interdisciplinary Journal, 2, 223–241. Cunningham, A. E., & Stanovich, K. E. (1998). What reading does for the mind. American Educator, 22(Spring/Summer), 8–15. Dunn, L. M., & Dunn, L. M. (1981). Peabody picture vocabulary test – revised. Circle Pines, MN: American Guidance Service. Ehri, L. C., & Robbins, C. (1992). Beginners need some decoding skill to read words by analogy. Reading Research Quarterly, 27, 13–26. Farr, R. C., & Strickland, D. S. (1995). Treasury of literature: First street. Orlando, FL: Harcourt Brace & Co. Fazio, B. B. (1997). Learning a new poem: Memory for connected speech and phonological awareness in low-income children with and without specific language impairment. Journal of Speech, Language, and Hearing Research, 40, 1285–1297. Fletcher, J. M., & Foorman, B. R. (1994). Issues in definition and measurement of learning disabilities: The need for early intervention. In: G. R. Lyon (Ed.), Frames of Reference for the Assessment of Learning Disabilities: New Views on Measurement Issues (pp. 185–200). Baltimore, MD: Brookes. Foorman, B. R., Francis, D. J., Fletcher, J. M., Schatschneider, C., & Mehta, P. (1998). The role of instruction in learning to read: Preventing reading failure in at-risk children. Journal of Educational Psychology, 90, 37–55. Foorman, B. R., Francis, D. J., Winikates, D., Mehta, P., Schatschneider, C., & Fletcher, J. M. (1997). 
Early interventions for children with disabilities. Scientific Studies of Reading, 1, 255–276. Fox, B., & Routh, D. K. (1976). Phonemic analysis and synthesis as word-attack skills. Journal of Educational Psychology, 68, 70–74. Fuchs, L. S., Deno, S. L., & Marston, D. (1983). Improving the reliability of curriculum-based measures of academic skills for psychoeducational decision making. Diagnostique, 6, 135–149. Fuchs, L. S., & Fuchs, D. (1998). Treatment validity: A unifying concept for reconceptualizing the identification of learning disabilities. Learning Disabilities Research and Practice, 13(4), 204–219. Fuchs, D., Fuchs, L. S., Thompson, A., Al Otaiba, S., Yen, L., Yang, N., Braun, M., & O’Connor, R. (2001). Is reading important in reading-readiness programs? A randomized field trial with teachers as program implementers. Journal of Educational Psychology, 93, 251–267. Fuchs, D., Fuchs, L. S., Thompson, A., Al Otaiba, S., Yen, L., Yang, N., Braun, M., & O’Connor, R. (2002). Exploring the importance of reading programs for kindergartners with disabilities in mainstream classrooms. Exceptional Children, 68, 295–311. Good, R. H., Simmons, D. C., & Kame’enui, E. J. (2001). The importance and decision-making utility of a continuum of fluency-based indicators of foundational reading skills for third-grade high-stakes outcomes. Scientific Studies of Reading, 5, 257–288. Good, R. H., Simmons, D. C., & Smith, A. (1999). Effective academic interventions in the United States: evaluating and enhancing the acquisition of early reading skills. Educational and Child Psychology, 15, 56–70.



Hammil, D. D. (1991). Detroit tests of learning aptitude – 3. Austin, TX: ProEd. Hammil, D. D., & Newcomer, P. L. (1988). Test of language development-primary. Austin, TX: ProEd. Hatcher, P. J., & Hulme, C. (1999). Phonemes, rhymes, and intelligence as predictors of children’s responsiveness to remedial reading instruction: Evidence from a longitudinal study. Journal of Experimental Child Psychology, 72, 130–153. Hurford, D. P. (1990). Training phonemic segmentation ability with a phonemic discrimination intervention in second- and third-grade children with reading disabilities. Journal of Learning Disabilities, 23, 564–569. Juel, C. (1988). Learning to read and write: A longitudinal study of fifty-four children from first through fourth grades. Journal of Educational Psychology, 80, 437–447. Juel, C. (1994). At-risk university students tutoring at-risk elementary school children: What factors make it effective? In: E. H. Hiebert & B. M. Taylor (Eds), Getting Reading Right from the Start: Effect Early Interventions (pp. 39–62). Boston, MA: Allyn & Bacon. Juel, C. (1996). What makes literacy tutoring effective? Reading Research Quarterly, 31, 268–289. Kaminsky, R. R., & Good, R. H. (1996). Toward a technology for assessing basic early literacy skills. School Psychology Review, 25, 215–227. Kasten, W. C. (1998). One learner, two paradigms. Reading and Writing Quarterly, 14, 335–353. Kennedy, M. M., Birman, M., & Demaline, B. (1986). The effectiveness of Chapter 1 services. Second interim report from the National Assessment of chapter 1. Office of Educational Research and Improvement (OERI), Washington, DC. Leave No Child Behind Act of 2001. Pub. L. No. 107–110 (H. R. 1). Levy, B. A., & Lysynchuk, L. (1997). Beginning word recognition: Benefits of training by segmentation and whole word methods. Scientific Studies of Reading, 1(4), 359–387. Lovett, M. W. (1997). 
The effectiveness of remedial programs for reading disabled children of different ages: Does the benefit decrease for older children? Learning Disability Quarterly Sum 1997, 20(3), 189–210. Lyon, G. R. (1985). Identification and remediation of learning disability subtypes: Preliminary findings. Learning Disability Focus, 1, 21–35. Lyon, G. R., Fletcher, J. M., Shaywitz, S. E., Shaywitz, B. A., Torgesen, J. K., Wood, F. B., Schulte, A., & Olson, R. (2001). Rethinking learning disabilities. Washington, DC: Hudson Institute. Mastropieri, M. A., Leinart, A., & Scruggs, T. E. (1999). Strategies to increase reading fluency. Intervention in School and Clinic, 34, 278–283, 292. Mathes, P. G., & Fuchs, L. (1994). The efficacy of peer tutoring in reading for students with mild disabilities: A best-evidence synthesis. School Psychology Review, 23, 59–80. Mathes, P. G., Howard, J. K., Allen, S. H., & Fuchs, D. (1998). Peer-assisted learning strategies for first grade readers: Responding to the needs of diverse learners. Reading Research Quarterly, 33, 62–94. Moats, L. C., & Lyon, G. R. (1996). Wanted: Teachers with knowledge of language. Topics in Language Disorders, 16, 73–86. National Reading Panel (2000). Teaching children to read: An evidence-based assessment of the scientific research literature on reading and its implications for reading instruction. National Institute of Child Health and Human Development, Washington, DC. O’Connor, R. E. (2000). Increasing the intensity of intervention in kindergarten and first grade. Learning Disabilities Research and Practice, 15, 43–54. O’Connor, R. E., Jenkins, J., Leicester, N., & Slocum, T. (1993). Teaching phonological awareness to young children with learning disabilities. Exceptional Children, 59, 532–546.


O’Connor, R. E., Jenkins, J., & Slocum, T. (1995). Transfer among phonological tasks in kindergarten: Essential instructional content. Journal of Educational Psychology, 87, 202–217. O’Connor, R. E., Notari-Syverson, A., & Vadasy, P. F. (1996). Ladders to literacy: The effects of teacher-led phonological activities for kindergarten children with and without disabilities. Exceptional Children, 63, 117–130. O’Connor, R. E., Notari-Syverson, A., & Vadasy, P. F. (1998a). Ladders to literacy: A kindergarten activity book. Baltimore, MD: Paul Brookes. O’Connor, R. E., Notari-Syverson, A., & Vadasy, P. F. (1998b). First-grade effects of teacher-led phonological activities in kindergarten for children with mild disabilities: A follow-up study. Learning Disabilities Research & Practice, 13, 43–52. O’Shaughnessy, T. E., & Swanson, H. L. (2000). A comparison of two reading interventions for children with reading disabilities. Journal of Learning Disabilities, 33, 257–277. Peterson, M. E., & Haines, L. P. (1992). Orthographic analogy training with kindergarten children: Effects on analogy use, phonemic segmentation, and letter-sound knowledge. Journal of Reading Behavior, 24, 109–127. Schneider, W., Ennemoser, M., Roth, E., & Kuspert, P. (1999). Kindergarten prevention of dyslexia: Does training in phonological awareness work for everybody? Journal of Learning Disabilities, 32, 429–436. Shannahan, T., & Barr, R. (1995). Reading recovery: An independent evaluation of the effects of an early intervention for at-risk learners. Reading Research Quarterly, 30, 958–996. Share, D. L., & Stanovich, K. E. (1995). Cognitive process in early reading development: A model of acquisition and individual differences. Issues in Education: Contributions from Educational Psychology, 1, 1–57. Simmons, D. C., Fuchs, D., & Fuchs, L. S. (1991). Instructional and curricular requisites of mainstreamed students with learning disabilities. Journal of Learning Disabilities, 24, 354–360. Smith-Burke, M.
T., & Jaggar, A. M. (1994). Implementing reading recovery in New York: Insights from the first two years. In: E. H. Hiebert & B. M. Taylor (Eds), Getting Reading Right From the Start: Effect Early Interventions (pp. 63–84). Boston, MA: Allyn & Bacon. Snider, V. E. (1997). Transfer of decoding skills to a literature basal. Learning Disabilities Practice, 1, 54–62. Snow, C. E., Burns, M. S., & Griffin, P. (Eds). (1998). Preventing reading difficulties in young children. Washington, DC: National Academy Press. Stanovich, K. E. (1986). Matthew effects in reading: Some consequences of individual differences in the acquisition of literacy. Reading Research Quarterly, 21, 360–407. Stein, M., Johnson, B., & Gutlohn, L. (1999). Analyzing beginning reading programs: The relationship between decoding instruction and text. Remedial and Special Education, 20(5), 275–287. Torgesen, J. (2000). Individual differences in response to early interventions in reading: The lingering problem of treatment resisters. Learning Disabilities Research and Practice, 15, 55–64. Torgesen, J. K., & Davis, C. (1996). Individual difference variables that predict response to training in phonological awareness. Journal of Experimental Child Psychology, 63, 1–21. Torgesen, J. K., Morgan, S., & Davis, C. (1992). The effects of two types of phonological awareness training on word learning in kindergarten children. Journal of Educational Psychology, 84, 364–370. Torgesen, J. K., Wagner, R. K., & Rashotte, C. A. (1994). Longitudinal studies of phonological processing and reading. Journal of Learning Disabilities, 27, 276–286. Torgesen, J. K., Wagner, R. K., Rashotte, C. A., Lindamood, P., Rose, E., Conway, T., & Garvan, C. (1999). Preventing reading failure in young children with phonological processing disabilities: Group and individual responses to instruction. Journal of Educational Psychology, 91, 579–593.



Uhry, J. K., & Shepherd, M. (1997). Teaching phonological recoding to young children with phonological processing deficits: the effect on sight word acquisition. Learning Disability Quarterly, 20, 104–125. Vadasy, P. F., Jenkins, J. R., Antil, L. R., Wayne, S. K., & O’Connor, R. E. (1997). Community-based early reading intervention for at-risk first graders. Learning Disabilities Research & Practice, 12, 29–39. Vandervelden, M. C., & Siegel, L. S. (1997). Teaching phonological processing skills in early literacy: A developmental approach. Learning Disability Quarterly, 20, 63–81. Vaughn, S. R., Moody, S. W., & Shuman, J. S. (1998). Broken promises: Reading instruction the resource room. Exceptional Children, 64, 211–225. Vellutino, F. R., Scanlon, D. M., & Lyon, G. R. (2000). Differentiating between difficult-to-remediate and readily remediated poor readers. Journal of Learning Disabilities, 33, 223–238. Vellutino, F. R., Scanlon, D. M., Sipay, E. R., Small, S., Chen, R., Pratt, A., & Denckla, M. B. (1996). Cognitive profiles of difficult-to-remediate and readily remediated poor readers: Early intervention as a vehicle for distinguishing between cognitive and experiential deficits as basic causes of specific reading disability. Journal of Educational Psychology, 88, 601–638. Wolf, M. (1991). Naming speed and reading: The contribution of the cognitive neurosciences. Reading Research Quarterly, 26, 123–141. Wolf, M., & Bowers, P. G. (1999). The double-deficit hypothesis for the developmental dyslexias. Journal of Educational Psychology, 91, 415–438. Woodcock, R. (1987). Woodcock reading mastery test – revised. Circle Pines, MN: American Guidance Service. Woodcock, R. W., & Johnson, M. B. (1990). Woodcock–Johnson tests of achievement. Allen, TX: DLM. Yopp, H. K. (1994). A test for assessing phonemic awareness in young children. The Reading Teacher, 49, 20–29.

THE ROLE OF READING INTERVENTION RESEARCH IN THE IDENTIFICATION OF CHILDREN WITH READING DIFFICULTIES: A META-ANALYSIS OF THE LITERATURE FUNDED BY THE NICHD Denise M. Necoechea and H. Lee Swanson ABSTRACT There has been much discussion in the literature in recent years on the problems involved in the identification of children with reading disabilities. One of the most influential sources of knowledge in the field of learning disabilities is the National Institute of Child Health and Human Development (NICHD). This agency has typically been a major funding source for methodologically rigorous reading intervention research. Further, such research has contributed significantly to the validity of identifying children suspected of learning disabilities as “treatment resistors” (e.g. Vellutino et al., 1996). Yet, the NICHD has recently been the focus of some controversy. The purpose of this chapter was to synthesize NICHD funded research conducted over the

Identification and Assessment Advances in Learning and Behavioral Disabilities, Volume 16, 83–161 Copyright © 2003 by Elsevier Science Ltd. All rights of reproduction in any form reserved ISSN: 0735-004x/doi:10.1016/S0735-004X(03)16004-1


past 10 years via a meta-analysis to determine what can be generalized from this body of research that can be applied to the identification of students with learning disabilities in reading. The results of the synthesis were that a prototypical intervention study has a mean effect size (ES) of 0.67 (SD = 0.42), indicating that most interventions designed to increase reading skills were effective. The overall ES ranged, however, from 0.19 to 1.76, and therefore some criterion could be established for identifying treatment resistors. Performance below an overall ES of 0.25 was suggested as one of several criteria for identifying children with potential reading disabilities. However, this suggestion must be put in the context of intervention outcomes. The synthesis indicated that: (a) performance was more pronounced on skill or process measures (e.g. ES varies from 0.45 to 1.28 on measures of segmentation and pseudoword reading) than on measures of actual reading (ES varies from 0.17 to 0.60 on real word and comprehension measures); (b) the magnitude of effect sizes was more related to instructional activity (e.g. explicit instruction/practice) than to the content of instruction (e.g. type of phonics instruction); and (c) the bulk of intervention studies focused on a narrow range of reading behaviors (i.e. phonological awareness). Implications related to identification and sound teaching practice versus content training of reading instruction (e.g. phonological skills, comprehension skills) are discussed.

INTRODUCTION Research funded by the National Institute of Child Health and Human Development (NICHD) has had a profound effect on the field of learning disabilities. One area that has been strongly influenced by NICHD research is the assessment and identification of children with or at-risk for reading disabilities. Specifically, researchers affiliated with NICHD (among others funded from other sources, for example, Share, McGee, McKenzie, Williams & Silva, 1987; Stanovich & Siegel, 1994) have raised serious questions about the current methodology (e.g. discrepancy, heterogeneity, exclusionary criteria) used to identify children with learning disabilities (Fletcher et al., 1994; Shaywitz, Fletcher, Holahan & Shaywitz, 1992; Vellutino, Scanlon & Lyon, 2000; Vellutino, Scanlon & Tanzman, 1994; Vellutino et al., 1994). This body of research has directed the field’s attention to investigating the critical cognitive components of reading acquisition (i.e. phonological awareness, rapid naming, verbal memory, phonological processing) to accurately identify those individuals with reading deficits. Hence, a wealth of evidence now indicates that reading disability is a language-based disorder,


manifested when phonological processing and phonological awareness skills are lacking (Adams, 1990; Foorman, Francis, Fletcher & Lynn, 1996; Liberman, Shankweiler & Liberman, 1989; Share & Stanovich, 1995; Stanovich, 1988; Stanovich & Siegel, 1994) and that measures of these components are better indicators of reading ability than is the traditional use of measures of intelligence. Despite the significant base of research from which we can derive information regarding the identification of reading disabled learners, a critical area that has not been addressed adequately in the current literature is the learner’s response to reading treatment. Recent NICHD research findings indicate that many reading disabled individuals can improve in achievement with appropriate intervention efforts (Lovett et al., 1996). Yet, intervention research supported by NICHD has suggested that some children with learning disabilities may be conceptualized as treatment resistors (Vellutino et al., 1996), typically defined as those individuals who fail to demonstrate adequate improvement in reading achievement despite having access and exposure to effective and intense intervention procedures (Torgesen, 2000, 2002; Torgesen, Wagner, Rashotte & Heron, 2001). What remains unclear is how responsive (or resistant) reading deficits are to various reading intervention treatments. The current research base is also unclear as to how certain reading treatments can add to our ability to establish early and accurate identification procedures of students struggling with reading skills. Given the suggestion that alternative diagnostic tools should be developed, knowledge about response to reading treatments may provide the field of learning disabilities with a basis for identifying children at-risk for reading disabilities. That is, the field may benefit from a baseline of outcomes related to concentrated reading interventions. 
Such baselines can serve as a reference point for identifying children with learning disabilities. This chapter addresses the question of what can be generalized from reading intervention studies funded by NICHD. Clearly, a number of well-designed reading intervention studies have been funded by other sources (see Swanson, 1999 for a review) and more comprehensive syntheses of reading interventions have been recently published (e.g. Ehri, Nunes, Stahl & Willows, 2002; Swanson, 1999). However, the current synthesis provides a much more comprehensive and exhaustive analysis of instructional components and parameters than has been previously published. Also, in the current synthesis we limit our review to reading intervention research funded exclusively by the NICHD. This was proposed for a number of reasons. First, NICHD reading intervention research has been used as an informative tool in important decision making, specifically that which involves changes in policy (for a discussion of this issue, see Allington & Woodside-Jiron, 1999; Taylor, 1998; Taylor,



Anderson, Au & Raphael, 2000). Second, the NICHD reading research program advocates a highly rigorous and methodologically sound research agenda (Lyon, 1999a, b). This agency utilizes a thorough and rigid peer-review system to evaluate research proposals seeking funds for reading research (see Lyon, 1999c for a description of this process). Third, NICHD is unique in that it funds studies that wish to establish reliability of previous findings (replication efforts) across a broad range of areas of literacy development. Moreover, NICHD actively supports research looking at reading development over time (those projects that employ longitudinal designs). Finally, NICHD studies have not only looked at normal reading development, but have included a myriad of studies focusing on disabled readers and the problems they encounter (Lyon, 2002). Recently, commendations were given to the NICHD by Congress for the precedence placed on reading development research and its labors to explicate reading difficulties and reading disabilities (Maloney, 2001). How do children learn to read? Why do some individuals have difficulty with the reading process? Which teaching approaches are necessary for which type of student? Since 1965, the NICHD has attempted to address these questions, and in doing so, has arrived at many conclusions. Over 2,000 refereed journal articles and books have been published in the past 30 years addressing these topics (Adams, 1997). More than 100 researchers from 18 different university-based sites across the nation are involved in the efforts. Yet, the NICHD has arrived at some conclusions not well accepted by others in the scientific community (e.g. Pressley & Allington, 1999a, b; Strauss, 2001). Some research scientists state that NICHD has given priority to word recognition studies, and in doing so, has promoted a narrow conceptualization of reading research (e.g. 
Pressley & Allington, 1999a; see Issues in Education: Contributions of Educational Psychology, vol. 5, issue 1, for several reviews on these issues). Arguments have been made that substantial monies have been allocated to research focusing on phonemic awareness and word recognition models of reading performance, at the expense of an immensely rich and diverse body of literature on literacy development. Moreover, some critics assert that NICHD has put undue emphasis on research focusing on disabled learners, rather than deriving conclusions based on proficient readers (Pressley & Allington, 1999a). Therefore, some research results obtained from NICHD-funded studies may be too limited in scope to be applied in a catholic manner (Taylor et al., 2000) and should not be used to generate unequivocal conclusions as to how children learn to read. Other researchers also contend that literacy research should espouse a broad perspective of reading development, perhaps employing both qualitative and quantitative methodology, rather than the narrow focus adopted by the NICHD (Alexander & Buehl, 1999; Borkowski, 1999; Morrow, 1999; Pressley & Allington, 1999a; Williams, 1999). Critics of the NICHD reading research program



claim that federal funding continues to be channeled down to scientists studying these overemphasized areas (i.e. phonological awareness) of reading (Hoffman, 1999; Pressley & Allington, 1999b) and suggest that NICHD-funded researchers emphasize word level skills because they insist that comprehension abilities will surface once recognition skills are improved (Pressley & Allington, 1999b). Furthermore, Pressley and Allington (1999a) also question the methodological quality of some of the reading studies funded by NICHD (e.g. Foorman, Francis, Novy & Liberman, 1991; Foorman, Francis, Winikates, Mehta, Schatschneider & Fletcher, 1997; Lovett, Borden, DeLuca, Lacerenza, Benson & Brackstone, 1994; Olson, Wise, Johnson & Ring, 1997; Torgesen, Wagner & Rashotte, 1997; Vellutino et al., 1996; Wise & Olson, 1995). Specifically, these studies are criticized for external and internal validity imperfections (Pressley & Allington, 1999a; Troia, 1999). Given the high visibility of NICHD-funded research in reading, as well as criticisms about what can be generalized from such studies, a synthesis of this research is necessary. The intent of this chapter is to synthesize some of the empirical evidence derived from recent NICHD reading prevention and intervention research. In the present review, the research literature is synthesized through meta-analytic procedures. Previous meta-analyses suggest that not all forms of interventions in reading work equally well (e.g. Swanson, 1999; Swanson & Hoskyn, 1998). In this synthesis, we were interested in isolating the instructional interventions yielding the largest effects, as well as determining what types of individuals benefit from which intervention programs. For example, phonemic awareness instruction has recently become a popular pedagogy adopted by several teachers to train individuals in early reading skills. 
Yet, meta-analytic effect sizes for phonemic awareness interventions indicate that these treatments have rather modest long-term effects (Bus & van IJzendoorn, 1999). The intent of this synthesis, then, is to analyze the research results from several reading intervention studies in an effort to facilitate our understanding about what can be generalized from the NICHD research findings and the contribution of reading interventions to the identification of reading disabled or at-risk learners. Another purpose of the synthesis was to evaluate the NICHD reading research program in light of the criticisms recently voiced (see Issues in Education, vol. 5, issue 1, for example). We were interested in looking at the breadth and depth of NICHD-funded studies in the area of reading interventions and evaluating the distribution of the research foci. Further, we wanted to analyze the methodological worth of the studies; methodological differences between studies could account for some of the variance and disparity in reading intervention research. Specifically, we wanted to determine if the quality of research methodology common to each study impacted the size of the effect of the reading outcomes.



The current synthesis addresses five main questions: (1) What conclusions can be drawn from the NICHD reading research literature that have application to the identification of learning disabilities?; (2) What are the domains of reading (e.g. phonemic awareness, comprehension) that are most susceptible and resistant to change?; (3) What instructional components, or combinations of components, yield the best reading outcomes, and therefore should be included in instructional programs to better identify those children less responsive to treatment?; (4) What methodological variations influence the magnitude of reading outcomes?; and (5) What can we determine regarding the distribution of the NICHD reading research foci?

METHOD Data Collection The online databases of PsycINFO, ERIC, and MEDline were systematically reviewed for studies that were published from 1990 to September of 2000. The databases were searched using the following descriptors: reading, learning disabled (disabilities), reading disabled (disabilities), dyslexia, and at-risk combined with each of the following terms or word pairs: reading remediation, reading instruction, reading strategies, reading failure, word recognition, phonological awareness, reading and treatment, reading and improvement, reading and training, and reading and prevention. In addition, two lists of researchers affiliated with NICHD, as cited in Fletcher and Lyon (1998) and Lyon (1999c) were used for descriptors (author’s last name) for searches on the databases. This search yielded approximately 1,325 citations, from journal articles, book chapters, dissertations, and technical reports. The pool of relevant literature was narrowed down to studies that were funded by the NICHD, as is indicated in the “article acknowledgement” section or footnote of each study. Funding may have been directed to a primary investigator or to a research center. Each study was reviewed and evaluated according to six additional criteria that had been previously established for inclusion in the meta-analysis:

(1) Each study included an experimental and control-group design in which participants received treatment to augment their reading performance. Treatment was defined as reading instruction or assistance, given over a minimum of three sessions, which extended beyond what would normally be encountered during regular classroom periods.

(2) Each study provided treatment for children, ranging from preschool to high school grade level. Studies with participants older than 18 years of age were excluded.

(3) Each study emphasized reading instruction and included at least one dependent measure of reading achievement (i.e. any measure of phonological skills, word recognition, and/or reading comprehension). Studies that focused on other academic domains (i.e. spelling or writing instruction), without directly focusing on reading were excluded.

(4) Each study reported sufficient information to calculate the ES between the control and treatment group(s) on reading achievement measures. These were calculated from the means and standard deviations of the performance outcomes for the treatment and control groups, or from tests of the significance of the differences in performance between the instruction conditions (e.g. t or F tests, χ²).

(5) Each study included original data not reported elsewhere. Overlapping studies that reported on the same data were excluded. Studies included were either the most recent publication of the same data set or the study that reported the broadest number of outcome measures.

(6) Each study had to be published in a refereed journal.

Twenty-one studies met the selection criteria.1 Those studies funded by NICHD that were excluded from the analyses reported duplicate findings (identical studies), failed to include a comparison control group, or failed to include an intervention specifically focused on reading.
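The screening logic described above can be sketched as a simple filter. This is only an illustration: the `CandidateStudy` record and all of its field names are hypothetical and do not come from the chapter, and the actual screening was done by human reviewers rather than by code.

```python
from dataclasses import dataclass


@dataclass
class CandidateStudy:
    """Hypothetical record for one screened citation (field names are illustrative)."""
    nichd_funded: bool          # funding acknowledged in the article
    has_control_group: bool     # experimental and control-group design
    n_sessions: int             # number of treatment sessions delivered
    max_participant_age: float  # oldest participant, in years
    has_reading_measure: bool   # at least one reading achievement measure
    es_computable: bool         # enough statistics reported to compute an ES
    is_original_data: bool      # data not reported in another included study
    refereed_journal: bool      # published in a refereed journal


def meets_inclusion_criteria(study: CandidateStudy) -> bool:
    """Apply the funding requirement plus the six inclusion criteria."""
    return (
        study.nichd_funded
        and study.has_control_group and study.n_sessions >= 3  # criterion 1
        and study.max_participant_age <= 18                    # criterion 2
        and study.has_reading_measure                          # criterion 3
        and study.es_computable                                # criterion 4
        and study.is_original_data                             # criterion 5
        and study.refereed_journal                             # criterion 6
    )
```

A study that fails any single check, for example one with no comparison control group, is excluded.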

Coding Each of the 21 studies was coded for the following information: (a) sample characteristics (see Appendix A), (b) components of treatment intervention and control condition (see Appendix A), (c) treatment assignment procedures (see Appendix B), (d) instructional and curricular materials (see Appendix B), (e) treatment sessions and length, including overall time exposed to intervention (see Appendix B), (f) internal validity (see Appendix B), and (g) sample variation (see Appendix B). Interrater agreement for the coding of study items exceeded 95% for some components of the articles (dependent measures, sample characteristics, statistical data) and 80% for other areas (internal validity and sample variation).
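The chapter does not say which agreement index was used. Assuming simple percent agreement between two coders, the calculation could look like the sketch below; the function name and the input format are illustrative.

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of items on which two coders assigned the same code.

    Assumes simple percent agreement; chance-corrected indices such as
    Cohen's kappa would give different values.
    """
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must rate the same number of items.")
    if not coder_a:
        raise ValueError("At least one coded item is required.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)
```

For example, two coders who agree on three of four items have 75% agreement.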



Categorization of Dependent Measures Dependent measures were classified as one of five reading categories: (1) reading recognition skills, which included all measures of real-word identification (e.g. WRMT-R word identification subtest; WRAT-3 reading recognition subtest); (2) non-word and pseudo-word reading (e.g. WRMT-R word attack subtest; Goldman-Fristoe-Woodcock non-word reading subtest); (3) phonological skills, which included all measures of blending (synthesis), manipulation, rhyme, and phonemic awareness (e.g. TOPA; Lindamood Auditory Conceptualization test); (4) phonological analysis/word segmentation (e.g. Rosner Test of Auditory Analysis; Torgesen-Wagner Battery of Phonological Analysis); and (5) reading comprehension, which included all measures of silent reading or oral reading comprehension (e.g. WRMT-R passage comprehension subtest; WJ-R passage comprehension subtest). Several studies reported multiple dependent measures within one or more categories. To partially control for statistical independence within the meta-analysis, the effect sizes within each category were averaged within categories before further analysis occurred (see Gleser & Olkin, 1994). For example, if a study provided results for more than one reading recognition measure in the real-word reading measure category, then the effect sizes were averaged across all real-word measures prior to subsequent statistical analyses. Categorization of Treatment Components One hundred fifty-five instructional components were identified (coded 1 for occurrence of components and 0 for non-occurrence) within the 21 studies (see Appendices E and F) in the present study. Instructional components were divided into two broad categories: those reflecting curricular and instructional variables and those reflecting organizational and instructional management variables. 
Curricular and instructional components were defined as variables involving teaching techniques, content of instruction and practice, instructional methods, and teacher/learner behaviors during which the intervention was administered. Organizational and instructional management components were defined as variables involving the size of the instructional group, the setting of the instruction, and the personnel or mechanism by which the instruction was introduced. By coding instructional components across the studies, we hoped to identify those that were influential in producing larger effect sizes.
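The within-category averaging of effect sizes described in this section can be illustrated with a short sketch. The tuple layout and category labels are assumptions made for the example, not the coding format used by the authors.

```python
from collections import defaultdict
from statistics import mean


def average_within_category(effect_sizes):
    """Average effect sizes per (study, category) pair.

    `effect_sizes` is an iterable of (study_id, category, es) tuples.
    Returns a dict mapping (study_id, category) to the mean ES, so each
    study contributes one value per dependent-measure category.
    """
    grouped = defaultdict(list)
    for study_id, category, es in effect_sizes:
        grouped[(study_id, category)].append(es)
    return {key: mean(values) for key, values in grouped.items()}
```

With two real-word measures of 0.4 and 0.6 in one study, that study contributes a single real-word ES of 0.5 to later analyses.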



Based on previous research that has identified instructional components that influenced student outcomes (Adams, 1990; Berninger & Traweek, 1991; Byrne & Fielding-Barnesley, 1991; Lovett et al., 1994; Lundberg, Frost & Peterson, 1988; Pikulski, 1994; Rosenshine, 1995; Slavin, Karweit & Madden, 1989; Swanson, 1999; Swanson & Hoskyn, 1998), as well as patterns that emerged in the current study, we clustered a subset of the 155 individual instructional activities into 20 cluster categories used for further analyses (see Appendix G). We coded the occurrence of instructional component clusters, as follows:

(1) Articulatory awareness activities. Statements in the procedural description reporting instructional activities showing how words are produced by articulatory gestures of the oral region.

(2) Phonological skills. Statements in the procedural description incorporating the alphabetic principle, word/syllable/phoneme segmentation (analysis), letter-sound correspondence, manipulation of word parts, phonological/phonemic/syllable awareness, phonics instruction, word family training, and/or blending sounds (synthesis).

(3) Questioning/summarizing. Statements in the procedural description reporting answering questions, retelling information, summarizing text, and/or using comprehension strategies.

(4) Orthographic skills. Statements in the procedural description incorporating orthographic awareness activities, code-name correspondence, and/or spelling instruction and practice.

(5) Reading activities. Statements in the procedural description reporting reading real- and pseudowords, connected and controlled text, and activities involving choral, guided, repeated, independent, shared, silent and/or oral reading of stories.

(6) Metacognitive/strategy instruction. Statements in the procedural description reporting instructional “rules,” self-monitoring, error detection, and/or the use of strategies or mnemonic devices.

(7) Technology. Statements in the procedural description reporting computerized instruction or practice and/or the use of media to facilitate presentation of material and feedback of performance.

(8) Explicit instruction and practice. Statements in the procedural description reporting explicit instruction, detailed instructional steps, feedback of performance, distributed review and practice, monitoring of practice, fluency-building exercises, and/or overlearning of skills.

(9) Teacher/student interchange. Statements in the procedural description about directing students to ask questions, teacher/student dialogue, and/or teacher eliciting questions.

(10) One-to-one instruction. Statements in the procedural description reporting independent practice, tutoring, individually-paced instruction, and/or instruction that was individually tailored.

(11) Small group instruction. Statements in the procedural description reporting a small group instructional setting (2–5 individuals).

(12) Large group instruction. Statements in the procedural description reporting a large group instructional setting (>5 members).

(13) Regular classroom instruction. Statements in the procedural description indicating that instruction occurred in the regular classroom (where instruction normally occurs for the student).

(14) Alternative instructional setting. Statements in the procedural description indicating that instruction occurred outside of the regular classroom (e.g. office area, library).

(15) Regular classroom teacher-led instruction. Statements in the procedural description indicating that the regular classroom teacher (the person who normally delivers instruction) introduced the instruction.

(16) Non-regular classroom teacher instruction. Statements in the procedural description indicating that a researcher or other adult introduced the instruction, rather than the regular classroom teacher.

(17) Computer-mediated instruction. Statements in the procedural description indicating that a computer program introduced the instruction.

(18) Skill-modeling. Statements in the procedural description that reported modeling of skills by a teacher, peer, adult, or computer, and/or requiring the student to repeat processes, responses, and/or actions first imitated by the instructor.

(19) Control difficulty. Statements in the procedural description reporting the use of systematic training (from simple to complex) of instructional steps, providing assistance, fading prompts or cues, adjusting the level of difficulty to the student, and conducting probes of learning.

(20) Advanced organizers. Statements in the procedural description about directing students to attend to the material, explaining the benefits of instruction, pre-training or prerequisite skill building prior to instruction, and/or stating the objectives of instruction.

Each study was coded twice for the treatment and control instructional components; this was done to refine code descriptions and to ensure accuracy of the coding procedure. A few studies reported more than one experiment within the same article. Hence, when treatment procedures from a subsequent experiment were described as being similar to the preceding experiment, unless otherwise specified, it was assumed that components in both experiments were alike and were coded identically. If a targeted study referenced treatment procedures in a



different article, however, without providing details, then only the components listed in the target study were coded. Interrater agreement exceeded 88% for instructional components common to experimental and control conditions across all studies. Coding time for each article ranged from 3.0 to 5.5 hours.

Control Group Parameters In most studies, the control group was identified according to the study’s description. When the study failed to indicate the control group directly, we referred to the study’s hypothesis statement to assign condition. If no hypothesis was stated (N = 1), the condition that least resembled a phonological treatment was designated as the control group. When multiple control groups were available within a study, only one group was chosen for comparison; the condition that was most removed from reading instruction was designated the control group. The coding of the control condition was similar to the procedures used for treatment conditions in that all curricular, instructional, and organizational variables were coded as present or not present.

Effect Size Calculation Cohen’s d (Hedges & Olkin, 1985) was the primary index of ES. Cohen’s d was calculated as the difference between control and experimental treatment post-test mean scores (partialed for the influence of pre-test scores if that information was available) divided by the average standard deviation (see Appendix C for ES calculations computed for each reading measure within each of the 21 experiments and Appendix D for ES calculations on follow-up studies). To provide a common standard deviation across various studies, we calculated the pooled (average) standard deviation from the post-test performance of the treatment and control conditions. When post-test standard deviations could not be calculated (such as with repeated measures, covariance, and gain scores), adjustments were made in standard deviations (see Rosenthal, 1994, p. 241, for formula). The primary unit of analysis was the average ES for each study (N = 21). Each study was weighted by its precision, giving more weight to the more precise estimates, with each study-level estimate weighted according to the inverse of the sampling variance (attaching more weight to studies with larger samples). The dependent measure for the weighted estimate of ES was defined as $\mathrm{est} = d/(1/v)$, where $v$ is the inverse of the sampling variance, $v = \frac{N_{trt} + N_{ctrl}}{N_{trt} \times N_{ctrl}} + \frac{d^{2}}{2(N_{trt} + N_{ctrl})}$.
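The sketch below computes a standardized mean difference, the variance expression given above, and a conventional inverse-variance weighted mean. It rests on several assumptions that are not confirmed by the chapter: the "average standard deviation" is taken literally as the arithmetic mean of the two post-test standard deviations, the difference is signed as treatment minus control so that positive values favor the treatment, and no pre-test adjustment is applied. All function names are illustrative.

```python
def cohens_d(mean_trt, mean_ctrl, sd_trt, sd_ctrl):
    """Standardized mean difference (treatment minus control).

    Assumption: the denominator is the arithmetic mean of the two
    post-test standard deviations.
    """
    average_sd = (sd_trt + sd_ctrl) / 2.0
    return (mean_trt - mean_ctrl) / average_sd


def sampling_variance(d, n_trt, n_ctrl):
    """Variance expression from the text:
    (N_trt + N_ctrl) / (N_trt * N_ctrl) + d^2 / (2 * (N_trt + N_ctrl)).
    """
    return (n_trt + n_ctrl) / (n_trt * n_ctrl) + d ** 2 / (2 * (n_trt + n_ctrl))


def weighted_mean_es(studies):
    """Inverse-variance weighted mean of effect sizes.

    `studies` is an iterable of (d, n_trt, n_ctrl) tuples. Each study is
    weighted by 1 / sampling_variance, so larger samples count for more.
    """
    weighted = [(1.0 / sampling_variance(d, nt, nc), d) for d, nt, nc in studies]
    total_weight = sum(w for w, _ in weighted)
    return sum(w * d for w, d in weighted) / total_weight
```

Because the weight is the reciprocal of the variance, a study with 100 participants per group pulls the weighted mean toward its own effect size much more strongly than a study with 10 per group.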



Effect sizes vary in value. An ES of +1.00 indicates that the performance of the treatment group exceeded the performance of the control condition by one standard deviation, while an ES of −1.00 indicates the reverse. Positive effect sizes imply that the performance of the treatment group was favorable compared to the control group; negative effect sizes imply that the performance of the control group was favorable compared to the treatment group. The magnitude of the ES may vary as well. Effect sizes less than 0.40 indicated a small effect, between 0.40 and 0.79 indicated a moderate effect, and 0.80 or greater indicated a large effect.
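The magnitude thresholds above translate into a small helper. Two choices here are assumptions rather than statements from the chapter: the absolute value is used so that negative effect sizes are classified by size, and values between 0.79 and 0.80 are treated as moderate.

```python
def classify_effect_size(es):
    """Label an effect size as small, moderate, or large.

    Thresholds follow the text: below 0.40 is small, 0.40 up to (but not
    including) 0.80 is moderate, and 0.80 or greater is large.
    """
    magnitude = abs(es)  # assumption: sign is ignored when judging magnitude
    if magnitude < 0.40:
        return "small"
    if magnitude < 0.80:
        return "moderate"
    return "large"
```

Under this scheme the mean study-level ES of 0.67 reported in the abstract falls in the moderate band.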

RESULTS Prototypical Study Twenty-one studies yielded 300 effect sizes. When the unit of analysis was aggregated for each study (N = 21), the mean ES across all studies was 0.67 (SD = 0.42), with a range of 0.19–1.76. When the number of reading-related dependent measures was used as the unit of analysis (N = 300), the mean ES across all measures was 0.70 (SD = 0.67), with a range of −0.85 to 3.16. The participants’ chronological ages ranged from 5.4 to 11.5 years old, with a mean of 8.21 (SD = 2.03). Because of the mixture of grade levels in the participants, a mean grade level was not computed. For the 16 studies that reported participants’ gender, there were 964 male participants and 605 female participants. The mean male sample size was 60.3 (SD = 44.0) and the mean female sample size was 37.8 (SD = 29.4). Of those studies that reported information regarding the intelligence level of the participants (N = 16), the average intelligence score across all of the samples was 97.5 (SD = 7.6) and the mean real word reading achievement standard score (N = 8) was 76.7 (SD = 6.5). Instructional sessions were conducted an average of 4.0 (SD = 1.13) times per week and lasted for a mean of 35.1 (SD = 15.8) minutes. The mean total number of sessions across all studies was 74.3 (SD = 78.4), with a range of 4–300 sessions.

Article Characteristics Articles were published between 1991 and 2000 in the United States, with an average year of publication as 1995 (SD = 2.9). The most frequent publication outlets were the Journal of Educational Psychology (29%), Annals of Dyslexia (19%), Journal of Learning Disabilities (9.5%), and Reading and Writing: An Interdisciplinary Journal (9.5%). The articles were primarily coauthored with



two to seven authors, with an average of 4.09 authors. Seventy-one percent of the primary authors were female. Six primary authors (Berninger, Foorman, Lovett, Olson, Torgesen & Wise) had multiple studies used in the current meta-analysis. Methodological Variations The total sample across the 21 studies was comprised of 1,859 subjects, ranging from 12 to 285 subjects in the combined treatment and control groups. This yielded a mean of 88.5 (SD = 65.8) participants per study. The treatment groups ranged from 6 to 119 participants, with a mean of 34.9 (SD = 28.1) subjects. The control groups ranged from 6 to 60 subjects, having an average of 26.1 (SD = 14.2) per group. Sixty percent of the studies selected participants from public school programs. Ten percent of the studies acquired sample participants from university clinic settings, and another 10% indicated that samples originated from clinic/hospital sources. An additional 5% of the studies used participants who were derived from an early childhood care facility. Approximately 15% of the studies failed to specify the source of their sample. Settings where the treatments were administered included the regular classroom (14.29%), resource room (4.76%), pullout area different from the resource room (19.05%), hospital/clinic (14.29%), and university settings (9.52%). A combination of two or more settings was used in 4.76% of the studies. No information concerning instructional setting for treatment was reported in 33% of the studies. The reader is directed to Table 1 for a complete listing of study characteristics. Comparisons were noted between treatment and control group participants relative to instructional parameters and curricular materials. Along with the widely varied treatments used as interventions, 85.7% of the studies made variations in the materials used in the experimental treatment(s) when compared to the control conditions. 
The sample size varied between treatment and control conditions in 61.9% of the studies. Approximately 33% of the 21 studies reported differences in the proportion of males to females who participated. A different teacher/experimenter was used to administer treatment and control conditions in 61.9% of the studies. A total of 10 studies, or 47.6%, reported variations in the setting where the treatment and control conditions were administered. Only a small number of studies reported substantial variations in sample age, ethnicity, intelligence scores, and reading achievement scores (range of N = 1–3). Approximately 9.5% of the studies indicated that the experimental group/condition did not differ in any way from the control group/condition. A summary of variations across all studies can be found in Table 2.



Table 1. Description of Study Characteristics.

Description                   N       %       Range       M       SD
Study sample size             1859    –       12–285      88.5    65.8
Treatment group               664     –       6–119       34.9    28.2
Control group                 495     –       6–60        26.1    14.2
Intervention settings
  Regular classroom           3       14.29   –           –       –
  Resource room               1       4.76    –           –       –
  Pullout area                4       19.05   –           –       –
  Clinic/Hospital             3       14.29   –           –       –
  Other                       2       9.52    –           –       –
  Combination                 1       4.76    –           –       –
  No information              7       33.3    –           –       –
Session length (minutes)      –       –       18–65       35.1    15.8
Sessions per week             –       –       2–5         4.0     1.1
Number of sessions            –       –       4–300       74.3    78.4
Subjects
  Males                       964     51.9    6–174       60.3    44.0
  Females                     605     32.5    5–111       37.8    29.4
  Not cited                   290     15.6    –           –       –
Age (in years)                –       –       5.4–11.5    8.2     2.02

Table 2. Summary of Sample Variations between Control and Treatment Group.

Description                        No. of Studies Reporting (a)    % of Studies
Sample size                        13                              61.9
Proportion of males to females     7                               33.3
Age                                3                               14.3
Ethnicity                          2                               9.5
Intelligence                       1                               4.8
Achievement (reading scores)       3                               14.3
Materials                          18                              85.7
Person administering treatment     13                              61.9
Setting of instruction             10                              47.6
Socioeconomic status               0                               0.0

(a) Number of studies that show variations in sample characteristics (N = 21).
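The percentages in Table 2 follow directly from the study counts and the total of 21 studies. As a minimal sketch of that arithmetic (the dictionary and function names are illustrative and are not taken from the chapter), the counts reported in Table 2 can be converted as follows:

```python
# Counts of studies reporting each treatment/control variation, as listed in Table 2.
N_STUDIES = 21

variation_counts = {
    "Sample size": 13,
    "Proportion of males to females": 7,
    "Age": 3,
    "Ethnicity": 2,
    "Intelligence": 1,
    "Achievement (reading scores)": 3,
    "Materials": 18,
    "Person administering treatment": 13,
    "Setting of instruction": 10,
    "Socioeconomic status": 0,
}


def percent_of_studies(count: int, n_studies: int = N_STUDIES) -> float:
    """Return count as a percentage of n_studies, rounded to one decimal place."""
    return round(100 * count / n_studies, 1)


percentages = {
    name: percent_of_studies(count) for name, count in variation_counts.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

Each printed value can be checked against the "% of Studies" column of Table 2.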



Internal validity ratings were calculated for each study. Internal validity was rated with a numeric scale on 11 items, which included controls for: (1) participant mortality; (2) Hawthorne effects and selection bias; (3) comparable exposure between treatment and control groups in terms of materials; (4) instruction time between treatment and control groups; (5) instructional delivery (same teachers) between treatment and control groups; (6) fidelity of treatment administration; (7) practice effects on the dependent measure; (8) floor and ceiling effects; (9) interrater reliability; (10) regression to the mean ruled out as an alternative explanation of findings; and (11) homogeneity of variance between groups (see Lysunchuk, Pressley, D’Ailly, Smith & Cake, 1989; Swanson & Hoskyn, 1998, for further coding detail). In reviewing each study, a score of one indicated that the internal validity control was present, two indicated that it was absent, and three indicated that the article provided no (or unclear) information about it. The possible range in ratings was 11–33. The lower the point score, the higher the internal validity of the study. The data generated by the scale indicated a total score range from 7 to 28 points. The mean point total for internal validity was 23.33 (SD = 2.69). This finding is similar to the mean of 22.16 that Swanson (1999) found in a synthesis of 93 reading intervention studies in the field of learning disabilities. Thus, the methodological rigor of studies reported in this synthesis does not differ from others that have been previously analyzed.

The most common internal validity item controlled for was the amount of time that the control and experimental groups were exposed to the intervention conditions (66.7% of the studies reported this feature). However, only 48% of the studies reported using checks to determine whether the teachers and the students did as they were instructed (treatment fidelity) during the treatment session. The percentage of studies reporting interrater reliabilities was 33.3%. Approximately 9% of the studies reported that the controls were exposed to the same instructional materials as trained subjects. Only 28.6% of the studies controlled for teacher effects (employing the same experimenter to provide treatment for all conditions). Only one study reported ceiling or floor effects; that is, mean scores at or above 90% (0.90) for the ceiling, or between 0 and 10% (0.10) for the floor. Summary results of internal validity across all of the studies are found in Table 3.



Table 3. Summary of Internal Validity Rating.

                                                                        No. of Studies Reporting (a)
Internal Validity Criteria                                              1      2      3
1. Subject mortality equal in control and experimental groups           10     10     1
2. Control subjects believed receiving a treatment                      2      19     –
3. Control group exposed to same materials as treatment                 3      18     –
4. Control group exposed to equal amount of time as treatment           14     1      6
5. Same experimenter provided treatment for all conditions              6      13     2
6. Checks were employed to determine treatment fidelity                 10     2      9
7. Alternate forms were used if a dependent measure was repeated        3      1      17
8. There were ceiling or floor effects                                  1      2      18
9. Interrater reliabilities were reported                               7      8      6
10. Regression to the mean ruled out as an alternative explanation      1      1      19
11. Correlations were computed within groups                            7      1      13

(a) 1 = yes; 2 = no; 3 = could not be determined.
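The composite internal validity score described above is the sum of the 11 item codes, so it must fall between 11 and 33, with lower totals indicating stronger control. The sketch below illustrates that scoring rule only; the function name and the example codes are hypothetical and do not correspond to any study in the synthesis.

```python
from typing import Sequence

N_ITEMS = 11
# Coding scheme from the text: 1 = yes, 2 = no, 3 = could not be determined.
VALID_CODES = {1, 2, 3}


def validity_composite(item_codes: Sequence[int]) -> int:
    """Sum the 11 internal validity item codes (possible range 11-33)."""
    if len(item_codes) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item codes, got {len(item_codes)}")
    if any(code not in VALID_CODES for code in item_codes):
        raise ValueError("each item code must be 1, 2, or 3")
    return sum(item_codes)


# Hypothetical item codes for a single study, for illustration only.
example_codes = [1, 2, 2, 1, 2, 1, 3, 3, 2, 3, 3]
example_score = validity_composite(example_codes)
print(example_score)
```

Because unreported information is coded 3, a study that omits methodological detail receives a higher (weaker) composite than one that reports the absence of a control.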

Effect Size

Effect sizes were separated into five broad categories of dependent measures. These included: (a) phonological skills; (b) segmentation; (c) real word reading; (d) reading comprehension; and (e) pseudoword reading. One hundred and seventy-one standardized dependent reading measures and 134 experimental measures were administered across all studies. Table 4 shows the unweighted and weighted means and standard deviations of Cohen’s d effect sizes as a function of the type of dependent measure across studies. These categories were further subgrouped into either standardized (norm-referenced) or experimental (researcher-developed) measures. As seen in Table 4, the most frequent dependent measures across the various studies were measures of phonological skills (e.g. phonemic awareness, phonological synthesis, phonological deletion, rhyme). Also frequently used were measures of real word reading (e.g. word identification, regular words, exception words). One of the most striking findings in the analysis is the degree to which the effect sizes varied across categorical domains, with a weighted ES range of 0.17–1.28. As shown in Table 4, marginal effect sizes emerged in standardized measures of reading comprehension, as well as experimental measures of real word reading, reading comprehension, and pseudoword reading. Moderate weighted effect sizes occurred on standardized measures of phonological skills (0.66), real word reading (0.60) and pseudoword reading (0.59). Moreover, experimental measures of phonological skills (0.57) had moderate ES estimates. Based on the weighted



Table 4. Weighted Mean Effect Sizes for Studies as a Function of Dependent Measure Category (Treatment vs. Control).

                                      Effect Size d     Effect Size d   95% CI for Weighted Effects   Standard   Homogeneity
Dependent measure        N     K      Unweighted (SD)   Weighted        Lower       Upper             Error      (Q)
Phonological skills
  Standardized           9     48     1.04 (0.79)       0.66            0.46        0.86              0.10       37.95***
  Experimental           11    60     0.60 (0.60)       0.57            0.38        0.77              0.09       28.82***
Segmentation
  Standardized           3     9      0.58 (0.92)       0.79            0.31        1.26              0.24       11.92**
  Experimental           2     3      1.42 (0.68)       1.28            0.87        1.70              0.21       7.81**
Real word reading
  Standardized           13    66     0.76 (0.56)       0.60            0.46        0.75              0.07       22.66*
  Experimental           6     38     0.15 (0.34)       0.17            −0.06       0.40              0.12       5.24
Reading comprehension
  Standardized           3     4      0.39 (0.06)       0.41            0.15        0.66              0.13       0.23
  Experimental           1     4      0.36 (0.21)       0.36            −0.29       1.01              0.33       0.30
Pseudoword reading
  Standardized           6     31     0.89 (0.62)       0.59            0.39        0.80              0.10       10.51
  Experimental           2     7      0.52 (0.45)       0.40            0.08        0.79              0.20       3.40

Note: N = number of studies; K = number of dependent measures; values in parentheses are standard deviations. * p < 0.05. ** p < 0.01. *** p < 0.001.

effect sizes, the areas that approached or exceeded Cohen’s (1988) threshold of 0.80 for a large effect were standardized (0.79) and experimental (1.28) measures of segmentation. Because the homogeneity tests were significant for a subset of the dependent measures, further analyses were necessary to identify some of the variables that moderate the effect sizes. Table 5 shows the correlation between aggregated effect sizes (N = 21) and study characteristics. The relationships between ES and publication date, sample characteristics, treatment session data, and the composite score for internal validity were considered. As shown in Table 5, the important finding was that no significant correlations emerged (alpha set at 0.01) for participant, intervention, or internal validity characteristics.

Instructional Component Categories

As shown in Appendix E, 155 instructional components were coded. The mean number of instructional components reported in the experimental condition was



Table 5. Means, Standard Deviations, and Correlation Coefficients between Effect Size and Selected Study Characteristics.

Study Characteristics             N     M       (SD)      r        p
Year of publication               21    1996    (2.88)    −0.39    0.07
Participants
  Male                            16    60.3    (44.0)    −0.40    0.15
  Female                          16    37.8    (29.4)    −0.27    0.31
  Mean IQ score                   16    97.5    (7.6)     0.30     0.27
  Mean real word reading          8     76.7    (6.5)     −0.55    0.15
  Mean age                        17    8.2     (2.0)     −0.27    0.29
Instructional intervention
  Number of sessions (total)      21    74.3    (78.4)    −0.35    0.57
  Length of session               21    35.1    (15.8)    −0.13    0.57
  Number of sessions per week     15    4.0     (1.13)    −0.18    0.53
  Method composite (a)            21    23.3    (2.69)    0.08     0.72

(a) Method was computed from the sum of internal validity items 1 through 11.

30.71 (SD = 16.18) and the mean number reported in the control condition was 14.27 (SD = 14.39). Table 6 shows the mean number of instructional components as a function of the 20 cluster categories for the treatment conditions. For this analysis, all treatment conditions (N = 36) across the 21 studies served as the unit of analysis. The first column shows the component category, the second column shows the mean number of components reported in the treatment condition, the third column shows the possible range in component scores, and the fourth column shows the correlation between the number of components in a category and the ES. As shown in column two, infrequently reported components were those associated with instruction from the regular teacher (M = 0.13, SD = 0.34), instruction in the regular class (M = 0.13, SD = 0.34), large group instruction (M = 0.18, SD = 0.39), articulatory awareness activities (M = 0.24, SD = 0.59), and reading comprehension (M = 0.29, SD = 0.73). Frequently represented categories include phonological skills (M = 5.8, SD = 4.7), reading activities (M = 2.8, SD = 2.1), explicit instruction/practice (M = 2.3, SD = 1.3), technology (M = 2.2, SD = 3.1), one-to-one instruction (M = 1.3, SD = 1.2), and control of task difficulty (M = 1.2, SD = 1.38). The magnitudes of the correlations between the reported frequency of the instructional components and ES are reported in column four. As shown, all correlations ranged from low to moderate. Because of the multiple comparisons, alpha was set at 0.01. As shown in Table 6, reported frequency positively affected the ES of the condition. That is, the more certain components were found in a



Table 6. Frequency and Effect Size of Component Cluster Categories for Treatment Condition (N = 36).

                                                                          Effect Size:             Effect Size:
                                                                          Reported                 Not Reported
Component Categories                      M (SD)         Range   r        N     LSM (SD)           N     LSM (SD)           χ2
1. Articulatory awareness                 0.24 (0.59)    0–2     0.14     5     0.75               16    0.56               1.40
2. Phonological skills                    5.8 (4.7)      0–17    −0.27    19    0.56               2     0.96               3.05
3. Reading comprehension                  0.29 (0.73)    0–4     0.27     4     0.88               17    0.54               3.90
4. Orthographic skills                    0.74 (0.95)    0–3     −0.006   8     0.80               13    0.53               3.30
5. Reading activities                     2.8 (2.1)      0–7     0.27     18    0.59               3     0.62               0.05
6. Metacognitive/strategy                 1.1 (1.4)      0–4     0.26     10    0.80               11    0.44               8.04*
7. Technology                             2.2 (3.1)      0–8     0.20     6     0.66               15    0.57               0.45
8. Explicit instruction/practice          2.3 (1.3)      0–5     0.41     20    0.62               1     0.26               1.97
9. Teacher-student interchange            0.53 (0.73)    0–2     0.21     9     0.71               12    0.49               3.17
10. One-to-one instruction                1.3 (1.2)      0–3     0.35     14    0.66               7     0.45               2.38
11. Small group instruction               0.45 (0.50)    0–1     0.22     7     0.61               14    0.59               0.008
12. Large group instruction               0.18 (0.39)    0–1     −0.15    5     0.66               16    0.59               0.18
13. Instruction in regular class          0.13 (0.34)    0–1     0.34     3     1.06               18    0.53               8.00*
14. Instruction in alternative setting    0.63 (0.49)    0–1     0.29     11    0.60               10    0.59               0.03
15. Instruction from teacher              0.13 (0.34)    0–1     0.07     3     0.83               18    0.57               1.78
16. Instruction from other adult          0.71 (0.46)    0–1     −0.07    15    0.58               6     0.63               0.09
17. Instruction from computer             0.34 (0.48)    0–1     0.13     5     0.61               16    0.60               0.02
18. Skill modeling                        1.00 (1.09)    0–3     −0.21    10    0.56               11    0.64               0.45
19. Control difficulty                    1.2 (1.38)     0–5     0.14     12    0.65               9     0.54               0.82
20. Advanced organizers                   0.74 (0.69)    0–2     −0.22    10    0.48               11    0.82               6.36*

Note: An asterisk (*) denotes significance at the 0.01 level.

treatment condition, the higher the effect of the treatment. Specifically, having reported more statements in the treatment condition related to explicit instruction/practice (r = 0.41), one-to-one instruction (r = 0.35), and/or instruction in the regular class (r = 0.34) increased the magnitude of the effect for the particular treatment condition. The only significant correlation was related to the explicit instruction/practice component, r(36) = 0.41, p < 0.01. One limitation of this analysis is that studies vary greatly in the detail reported on instructional components. Further, multiple treatment conditions within studies are not independent (Gleser & Olkin, 1994). Thus, we collapsed instructional components across treatments and compared those studies (N = 21) that reported components related to each category and those that did not. A weighted least squares analysis was used to compare the effect sizes of studies that reported instructional components related to each category (no limitations were placed on the frequency) and those studies that did not report any instructional components related to a particular category. Because there were 20 comparisons, a prespecified alpha was set at 0.01. Table 6 shows that treatments



which include metacognitive/strategy instruction, advanced organizers, and instruction in a regular class setting yielded significantly larger effect sizes than those studies which do not include such components. No other significant effects emerged. Although effect sizes for most categories did not differ significantly between studies that reported a component and those that did not, the magnitude of the ES was high for a number of components. As shown in column six of Table 6, the majority of categories had at least moderate effect sizes. According to Cohen’s criteria, those treatment conditions reporting the inclusion of orthographic skills (ES = 0.80), metacognitive/strategy instruction (ES = 0.80), instruction from a teacher, rather than another adult (e.g. peer) (ES = 0.83), reading comprehension (ES = 0.88), and instruction in the regular class (ES = 1.06) show large effects.
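The weighted means, standard errors, confidence intervals, and homogeneity statistics of the kind reported in Table 4 can be illustrated with a short sketch. It assumes the standard inverse-variance (fixed-effect) approach described by Hedges and Olkin (1985), which appears in the reference list; the exact computational procedure used for this synthesis is not restated in this section, so the code should be read as an illustration under that assumption. The function names and the example data are hypothetical.

```python
import math
from typing import NamedTuple, Sequence


class WeightedSummary(NamedTuple):
    mean: float   # inverse-variance weighted mean effect size
    se: float     # standard error of the weighted mean
    lower: float  # lower bound of the 95% confidence interval
    upper: float  # upper bound of the 95% confidence interval
    q: float      # homogeneity statistic Q (chi-square with k - 1 df)


def d_variance(d: float, n_treatment: int, n_control: int) -> float:
    """Large-sample variance of a standardized mean difference d."""
    n_total = n_treatment + n_control
    return n_total / (n_treatment * n_control) + d ** 2 / (2 * n_total)


def weighted_summary(effect_sizes: Sequence[float],
                     variances: Sequence[float]) -> WeightedSummary:
    """Combine effect sizes using weights equal to 1 / variance."""
    if len(effect_sizes) == 0 or len(effect_sizes) != len(variances):
        raise ValueError("need equal-length, non-empty inputs")
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    mean = sum(w * d for w, d in zip(weights, effect_sizes)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    q = sum(w * (d - mean) ** 2 for w, d in zip(weights, effect_sizes))
    return WeightedSummary(mean, se, mean - 1.96 * se, mean + 1.96 * se, q)


# Hypothetical effect sizes and group sizes, for illustration only.
example_d = [0.2, 0.6, 1.0]
example_n = [(20, 20), (30, 25), (15, 15)]
example_var = [d_variance(d, nt, nc) for d, (nt, nc) in zip(example_d, example_n)]
print(weighted_summary(example_d, example_var))
```

The confidence limits in Table 4 are consistent with this form: for standardized measures of phonological skills, 0.66 ± 1.96 × 0.10 gives approximately 0.46 to 0.86. A significant Q, as in the phonological skills and segmentation rows, indicates more variability among effect sizes than sampling error alone would predict, which is what motivated the moderator analyses above.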

DISCUSSION

This chapter provides a synthesis of reading intervention research conducted over the past decade by researchers funded by the NICHD. Across the 21 studies included in the analysis, the average ES was 0.67, and when effect sizes were averaged across dependent measures (N = 300), the mean ES was 0.70. These findings suggest that the instructional treatments designed to improve reading skills in NICHD-funded studies were effective. Five important additional findings, however, qualify these results.

First, given that the typical effect size across reading treatments is 0.67, the results of the synthesis suggest that treatment resistors would exhibit outcomes below this level on reading measures. When trying to identify children with or at risk for learning disabilities in the reading domain, professionals could therefore use response to treatment as a guide. Current practice suggests that IQ scores greater than 80 and word recognition scores below the 25th percentile (standard score of 90) serve as an initial starting point in the identification of children at risk for reading disabilities (e.g. Fletcher et al., 1994; Stanovich & Siegel, 1994). Assuming the 25th percentile is an appropriate cut-off point for identifying children at risk for reading disabilities, we would suggest that children whose aggregate reading scores are below an ES of 0.36 are suffering from a reading disability. An ES of 0.36 reflects the first quartile (25th percentile) in this data set. Thus, a criterion of 0.36 ES may be an appropriate cut-off score with which to identify those students with reading deficits. However, an ES of 0.36 approaches Cohen’s criterion for a moderate effect size. Clearly a more rigorous cut-off score should be applied, such as 0.25 (an ES of 0.25 reflects the 10th percentile in this data set). This criterion approximates that of Swanson (1999, see footnote 1, p. 525), who suggested that a criterion of 0.20 ES be used for judging processing difficulties after intervention. It is important to note, however, that we are not arguing that response to treatment below a particular outcome effect (ES), alone, should qualify a child as reading disabled. Rather, we believe that treatment response has practical utility, unlike information derived from IQ tests, in that this indicator can provide additional information to make accurate and sound decisions regarding the instruction and placement needs of the individual.

Second, the magnitude of change related to treatment is greater in some reading measures than others. Effect sizes considered marginal according to Cohen’s criteria (effect sizes near or below 0.40) occurred in the domains of reading comprehension and experimental measures of real word recognition and pseudoword reading. Those areas that approached Cohen’s (1988) threshold of 0.80 for a large effect were standardized measures of phonological skills (0.66) and segmentation (0.79). Experimental measures of segmentation exceeded Cohen’s large effect criterion (1.28). Studies that demonstrated effect sizes in the moderate range included standardized measures of pseudoword reading and reading comprehension (0.59 and 0.41, respectively). Therefore, these results suggest that treatment resistance is less likely to be seen in measures of the phonological processes (e.g. segmentation) than on measures of actual reading (e.g. real word reading and reading comprehension).

Third, only a few key instructional components moderated the magnitude of effect sizes. We found a positive moderate correlation between ES and the number of explicit instruction/practice components reported in the treatment. These include statements in the treatment description related to distributed review and practice, repeated practice, sequenced reviews, daily feedback, and/or weekly reviews. Other meta-analyses in intervention research corroborate these findings (see Vaughn, Gersten & Chard’s, 2000, review of outcomes from research syntheses, for example). Therefore, treatment conditions that provide explicit, detailed instructional steps, with distributed review and practice, and in conjunction with continual feedback on performance, yield higher reading outcomes. This finding is important because only one of the 20 cluster categories contributed to ES estimates, suggesting that not all treatments or instructional components are equally effective (additive) in predicting ES estimates. Thus, the multiple component approaches that are typically found in a number of intervention studies must be carefully contrasted with a component analysis approach that involves the systematic combination of instructional components known to have an additive effect on performance. Finding that extended practice enhances intervention outcomes is consistent with the existing literature. Long-term retention of all kinds of information and skills, for example, is greatly enhanced by distributed practice plus corrective feedback at different time periods.
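The cut-off scores proposed under the first finding above are percentiles of the effect size distribution in this data set (0.36 at the 25th percentile, 0.25 at the 10th). The sketch below shows how such a percentile cut-off can be derived and applied. The effect sizes and identifiers are hypothetical, since the study-level values are not listed in this section, and the inclusive quantile method is an assumption; as noted above, falling below a cut-off would be one source of information and not a sufficient basis for classification.

```python
import statistics
from typing import Dict, List, Sequence


def percentile_cutoff(effect_sizes: Sequence[float], percentile: int) -> float:
    """Return the effect size at the given percentile (1-99) of the sample."""
    if not 1 <= percentile <= 99:
        raise ValueError("percentile must be between 1 and 99")
    # quantiles(..., n=100) returns the 99 cut points between percentiles.
    cut_points = statistics.quantiles(effect_sizes, n=100, method="inclusive")
    return cut_points[percentile - 1]


def below_cutoff(responses: Dict[str, float], cutoff: float) -> List[str]:
    """List the identifiers whose response-to-treatment ES falls below cutoff."""
    return [name for name, es in responses.items() if es < cutoff]


# Hypothetical effect sizes, for illustration only.
example_es = [0.10, 0.25, 0.36, 0.50, 0.62, 0.70, 0.85, 1.00, 1.30]
first_quartile = percentile_cutoff(example_es, 25)

# Hypothetical response-to-treatment outcomes for three children.
example_responses = {"child_a": 0.20, "child_b": 0.55, "child_c": 0.30}
print(first_quartile, below_cutoff(example_responses, first_quartile))
```

Choosing a lower percentile, such as the 10th, yields a stricter criterion and flags fewer children, which is the trade-off discussed under the first finding.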



When data were simply analyzed for reporting any component related to a particular category (e.g. a category may include six activities, but the study is given credit for the complete category even if only one activity is reported), larger effects were associated with treatment conditions that incorporated one-to-one instruction and instruction in the regular class setting when compared to studies that did not report such information. Further, two pedagogical features unique to large effects were metacognitive/strategy instruction and advanced organizers. We also found ES estimates to be significantly related to an organizational management feature, namely instruction given in the regular class setting. Most importantly, curricular components, such as phonological skills, spelling, and oral/silent reading activities, failed to contribute significantly to positive student outcomes when conditions reporting these components were compared with conditions not reporting these components. Therefore, no outcomes in reading achievement were related to the content of instruction. Thus, it appears that the positive effects of reading intervention research are more related to effective instructional practices than to the specific skills emphasized during the instructional session. This is not to say that instructional content is unimportant. Rather, the results suggest that children may best be taught how to read when teachers focus on good teaching in the context of reading skills instruction.

Fourth, variations in research methodology emerged. Given that stringent criteria were used to qualify a reading intervention study for inclusion in the current analysis, it was not expected that the methodological variability of the studies would be great. However, the results show that the 21 studies are heterogeneous in methodological features. Specifically, approximately 68% of the studies failed to control for teacher effects and almost 90% failed to control for exposure to novel materials. Thus, when generalizing from the findings it is important to note whether the same teacher administered both the control and treatment conditions, or whether the same materials were involved during the treatment and control conditions. The implication of these findings is that treatments for reading cannot be interpreted accurately unless methodological variations are taken into consideration. Artifacts related to methodology have a profound influence on treatment outcomes (also see Swanson & Hoskyn’s, 1998, discussion of various methodological variables influencing outcomes).

Lastly, the distribution of NICHD reading intervention research is narrowly focused on word decoding skills. More than 90% (N = 19) of the 21 studies included standardized dependent measures of real word reading, and another 38% (N = 8) used experimental measures in this instructional domain. In addition, across the 21 studies included in the analysis, approximately 95% (N = 20) included standardized or experimental dependent measures of phonological skills. Unfortunately, measures of reading comprehension (or precursors to reading comprehension, such as vocabulary, oral comprehension) were represented in only 19% (N = 4) of the studies. When dependent measure categories were analyzed as a function of total number of dependent measures (N = 300), 36% (N = 108) of the dependent measures assessed phonological skills, while fewer than 3% (N = 8) focused on reading comprehension skills. Given that the average student studied was eight years old, it is not surprising that the focus of intervention has been on word recognition skills rather than comprehension. Nevertheless, even some of the precursors to comprehension (e.g. vocabulary) were under-represented. Therefore, the results suggest that rather few interventions funded by NICHD in the last decade are focusing on comprehension or older students (e.g. adolescents). This is interesting given the National Reading Panel’s (2000) recommendation that instruction in early reading include phonological processing, as well as text processing, reading comprehension strategies, oral language vocabulary, and other skills instruction. Moreover, previous research points to the emerging evidence on effective reading comprehension treatments. For example, in a previous meta-analysis, Mastropieri, Scruggs, Bakken and Whedon (1996) found a mean ES of 0.98 for treatments designed to augment comprehension skills in students with learning disabilities.

With regard to application of these findings in the classroom, the question arises as to where these explicit instruction, advanced organizers, metacognitive/strategy, and small regular classroom instruction components fit within the context of instruction. Several authors in both regular (e.g. Slavin, Karweit & Madden, 1989) and special education (e.g. Elbaum, Vaughn, Hughes, Moody & Schumm, 2000; Kucan & Beck, 1997; Pressley & Harris, 1994) suggest that effective instruction includes the following:

(1) Provide opportunities for students to learn and work in small groups.
(2) State the learning objectives and orient the students to what they will be learning and what performance will be expected of them.
(3) Review the skills necessary to understand the concept.
(4) Present the new information, give examples, and demonstrate the concepts/materials.
(5) Teach procedures that encourage students to use strategies and “think aloud” during problem solving.
(6) Pose questions (probes) to students and assess their level of understanding and correct misconceptions.
(7) Provide extended practice. Give students an opportunity to demonstrate new skills and learn the new information on their own.
(8) Assess performance and provide feedback. Review the independent work and give a quiz. Give feedback for correct answers and re-teach skills if answers are incorrect.
(9) Provide distributed practice and review.

No doubt, the above components have variations within instructional models. The majority of these aforementioned steps are encapsulated in what we have defined as explicit instruction/practice, metacognitive/strategy instruction, advanced organizers, and small group instruction.

There are, of course, caveats to our analysis. One obvious caveat is that we collapsed our analysis across all dependent measures in evaluating the importance of instructional components. This is because the measures are correlated (not independent) with other reading measures within a particular study (see Gleser & Olkin’s, 1994, discussion on stochastically dependent effect sizes). Unfortunately, because of the limited number of studies, we did not investigate whether a particular instructional component interacts with a particular category of dependent measure in predicting the magnitude of treatment outcomes. Another caveat is that the components we coded might not have matched the components emphasized by the primary authors. For example, some authors, although providing detailed descriptions of certain treatment procedures, failed to give adequate information about the details of the context (i.e. within a “balanced” literacy curriculum). Further, descriptions of the same teaching practice may vary considerably by authors of different theoretical orientations, thereby introducing some artifacts in our coding procedure.

In conclusion, the magnitude of effect sizes suggests that some treatments are effective in improving reading. However, most of the studies are directed toward reading processes and therefore outcomes are more isolated to reading skills than to actual reading. Further, only a few instructional components are related to the magnitude of treatment outcomes. These components appear more related to instructional activities (i.e. explicit practice) than to the type of instruction.

NOTE 1. After analysis had been completed, we realized that a study (Torgesen & Davis, 1996) had been overlooked in our search, and thus was not included in the final analysis. In addition, because we relied on footnotes as a basis for determining NICHD as a funding source, some studies may have been inadvertently included. For example, the study by Fuchs et al. (1997) lists NICHD as a funding source although funding is tied to a research center rather than directly to their study. Further, the footnote in the Fuchs et al. study notes other funding sources. However, for consistency of coding, the study was included in the analysis.



ACKNOWLEDGMENTS This chapter is an expanded version of the analysis on the same data set as recorded in Necoechea and Swanson (2003, manuscript in preparation). Portions of the results of this study have been presented at the California Learning Disabilities Association Conference (November, 2001) and the National Learning Disabilities Association Conference (February, 2002), which were supported in part by Peloy Endowment funds awarded to the second author.

REFERENCES

∗ Abbott, S. P., & Berninger, V. W. (1999). It’s never too late to remediate: Teaching word recognition to students with reading disabilities in grades 4–7. Annals of Dyslexia, 49, 223–250.
Adams, M. J. (1990). Beginning to read: Thinking and learning about print. Cambridge, MA: MIT Press.
Adams, M. J. (1997). About the NICHD program of research on reading development and disorders. Perspectives: The International Dyslexia Association, 23, 3–37.
Alexander, P. A., & Buehl, M. M. (1999). An Escherian perspective: The recursive nature of reading research. Issues in Education, 5(1), 37–43.
Allington, R. L., & Woodside-Jiron, H. (1999). The politics of literacy teaching: How “research” shaped educational policy. Educational Researcher, 28(8), 4–13.
∗ Berninger, V. W., Abbott, R. D., Zook, D., Ogier, S., Lemos-Britton, Z., & Brooksher, R. (1999). Early intervention for reading disabilities: Teaching the alphabet principle in a connectionist framework. Journal of Learning Disabilities, 32(6), 491–503.
Berninger, V. W., & Traweek, D. (1991). Effects of two-phase intervention on three orthographic-phonological code connections. Learning and Individual Differences, 3, 323–338.
Borkowski, J. G. (1999). Finding the right balance for research on reading instruction. Issues in Education, 5(1), 55–58.
∗ Brady, S., Fowler, A., Stone, B., & Winbury, N. (1994). Training phonological awareness: A study with inner-city kindergarten children. Annals of Dyslexia, 44, 26–59.
Bus, A. G., & van IJzendoorn, M. H. (1999). Phonological awareness and early reading: A meta-analysis of experimental training studies. Journal of Educational Psychology, 91(3), 403–414.
Byrne, B., & Fielding-Barnsley, R. (1991). Evaluation of a program to teach phonemic awareness to young children. Journal of Educational Psychology, 83, 451–455.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd Ed.). New York: Academic Press.
Ehri, L. C., Nunes, S., Stahl, S., & Willows, D. (2002). Systematic phonics instruction helps students learn to read: Evidence from the National Reading Panel’s meta-analysis. Review of Educational Research, 71, 393–448.
Elbaum, B., Vaughn, S., Hughes, M., Moody, S. W., & Schumm, J. S. (2000). How reading outcomes of students with disabilities are related to instructional grouping formats: A meta-analytic review. In: R. Gersten, E. Schiller & S. Vaughn (Eds), Contemporary Special Education Research (pp. 105–135). Mahwah, NJ: Lawrence Erlbaum Associates.



Fletcher, J. M., & Lyon, G. R. (1998). Reading: A research-based approach. In: W. Evers (Ed.), What’s Gone Wrong in America’s Classrooms (pp. 49–90). Stanford, CA: Hoover Institution Press.
Fletcher, J. M., Shaywitz, S. E., Shankweiler, D., Katz, L., Liberman, I., Stuebing, K., Francis, D., Fowler, A., & Shaywitz, B. A. (1994). Cognitive profiles of reading disability: Comparisons of discrepancy and low achievement definitions. Journal of Educational Psychology, 86, 6–23.
Foorman, B. R., Francis, D. J., Fletcher, J. M., & Lynn, A. (1996). Relation of phonological and orthographic processing to early reading: Comparing two approaches to regression-based, reading-level-match designs. Journal of Educational Psychology, 88(4), 639–652.
∗ Foorman, B. R., Francis, D. J., Fletcher, J. M., Schatschneider, C., & Mehta, P. (1998). The role of instruction in learning to read: Preventing reading failure in at-risk children. Journal of Educational Psychology, 90(1), 37–55.
∗ Foorman, B. R., Francis, D. J., Novy, D. M., & Liberman, D. (1991). How letter-sound instruction mediates progress in first-grade reading and spelling. Journal of Educational Psychology, 83(4), 456–469.
∗ Foorman, B. R., Francis, D. J., Winikates, D., Mehta, P., Schatschneider, C., & Fletcher, J. M. (1997). Early interventions for children with reading disabilities. Scientific Studies of Reading, 1(3), 255–276.
∗∗ Foster, K. C., Erickson, G. C., Foster, D. F., Brinkman, D., & Torgesen, J. K. (1994). Computer administered instruction in phonological awareness: Evaluation of the Daisyquest program. The Journal of Research and Development in Education, 27(2), 126–137.
∗ Fuchs, D., Fuchs, L. S., Mathes, P. G., & Simmons, D. C. (1997). Peer-assisted learning strategies: Making classrooms more responsive to diversity. American Educational Research Journal, 34(1), 174–206.
Gleser, L. J., & Olkin, I. (1994). Stochastically dependent effect sizes. In: H. Cooper & L. V. Hedges (Eds), The Handbook of Research Synthesis (pp. 339–355). New York: Russell Sage Foundation.
∗ Hart, T. M., Berninger, V. M., & Abbott, R. D. (1997). Comparison of teaching single or multiple orthographic-phonological connections for word recognition and spelling: Implications for instructional consultation. School Psychology Review, 26(2), 279–297.
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. San Diego, CA: Academic Press.
Hoffman, J. V. (1999). What do reading teacher educators want from reading research? A call from the hall. Issues in Education, 5(1), 77–83.
Kucan, L., & Beck, I. L. (1997). Thinking aloud and reading comprehension research: Inquiry, instruction, and social interaction. Review of Educational Research, 67, 271–299.
Liberman, I. Y., Shankweiler, D. P., & Liberman, A. M. (1989). The alphabetic principle and learning to read. In: D. P. Shankweiler & I. Y. Liberman (Eds), Phonology and Reading Disability: Solving the Reading Puzzle (pp. 1–22). Ann Arbor, MI: University of Michigan Press.
Lovett, M. W., Borden, S. L., DeLuca, T., Lacerenza, L., Benson, N. J., & Brackstone, D. (1994). Treating the core deficits of developmental dyslexia: Evidence of transfer of learning after phonologically- and strategy-based reading training programs. Developmental Psychology, 30, 805–822.
Lovett, M. W., Borden, S. L., Warren-Chaplin, P. M., Lacerenza, L., DeLuca, T., & Giovinazzo, R. (1996). Text comprehension training for disabled readers: An evaluation of reciprocal teaching and text analysis training programs. Brain and Language, 54, 447–480.
∗ Lovett, M. W., Lacerenza, L., Borden, S. L., Frijters, J. C., Steinbach, K. A., & De Palma, M. (2000). Components of effective remediation for developmental reading disabilities: Combining phonological and strategy-based instruction to improve outcomes. Journal of Educational Psychology, 92(2), 263–283.
∗ Lovett, M. W., Steinbach, K. A., & Frijters, J. C. (2000). Remediating the core deficits of developmental reading disability: A double-deficit perspective. Journal of Learning Disabilities, 33(4), 334–358.
Lundberg, I., Frost, J., & Peterson, O. (1988). Effects of an extensive program for stimulating phonological awareness in preschool children. Reading Research Quarterly, 23, 263–284.
Lyon, G. R. (1999a, July). Hearing on Title 1 of the Elementary and Secondary Education Act: Congressional testimony to the U.S. House of Representatives Committee on Education and the Workforce. Washington, DC.
Lyon, G. R. (1999b, October). Education research: Is what we don’t know hurting our children. Congressional testimony to the U.S. House of Representatives House Science Committee and Subcommittee on Basic Research. Washington, DC.
Lyon, G. R. (1999c). In celebration of science in the study of reading development, reading difficulties, and reading instruction: The NICHD perspective. Issues in Education, 5(1), 85–115.
Lyon, G. R. (2002). Reading development, reading difficulties, and reading instruction: Educational and public health issues. Journal of School Psychology, 40(1), 3–6.
Lysynchuk, L. M., Pressley, M., D’Ailly, H., Smith, M., & Cake, H. (1989). A methodological analysis of experimental evaluations of comprehension strategy instruction. Reading Research Quarterly, 24, 458–470.
Maloney, J. (2001). Washington update. LDA Newsbriefs, 36(1), 5–6.
Mastropieri, M. A., Scruggs, T. E., Bakken, J. P., & Whedon, C. (1996). Reading comprehension: A synthesis of research in learning disabilities. In: T. E. Scruggs & M. A. Mastropieri (Eds), Advances in Learning and Behavioral Disabilities (pp. 201–227). Greenwich, CT: JAI Press.
Morrow, L. M. (1999). Where do we go from here in early literacy research and practice. Issues in Education, 5(1), 117–124.
National Reading Panel (2000). Teaching children to read: An evidence-based assessment of the scientific research literature on reading and its implications for reading instruction. Washington, DC: National Institute of Child Health and Human Development.
∗ Olson, R. K., & Wise, B. W. (1992). Reading on the computer with orthographic and speech feedback: An overview of the Colorado remediation projects. Reading and Writing: An Interdisciplinary Journal, 4, 107–144.
Olson, R. K., Wise, B., Johnson, M., & Ring, J. (1997). The etiology and remediation of phonologically based word recognition and spelling disabilities: Are phonological deficits the “hole” story? In: B. Blachman (Ed.), Foundations of Reading Acquisition. Mahwah, NJ: Erlbaum & Associates.
Olson, R. K., Wise, B., Ring, J., & Johnson, M. (1997). Computer-based remedial training in phoneme awareness and phonological decoding: Effects on the post-training development of word recognition. Scientific Studies of Reading, 1(3), 235–253.
Pikulski, J. J. (1994). Preventing reading failure: A review of five effective programs. The Reading Teacher, 48, 30–39.
Pressley, M., & Allington, R. (1999a). What should reading instructional research be the research of? Issues in Education, 5(1), 1–35.
Pressley, M., & Allington, R. (1999b). Concluding reflections: What should reading instructional research be the research of? Issues in Education, 5(1), 165–175.
Pressley, M., & Harris, K. R. (1994). Increasing the quality of educational intervention research. Educational Psychology Review, 6, 191–208.

Rosenshine, B. (1995). Advances in research on instruction. Journal of Educational Research, 88(5), 262–268.
Rosenthal, R. (1994). Parametric measures of effect size. In: H. Cooper & L. V. Hedges (Eds), The Handbook of Research Synthesis (pp. 231–244). New York: Russell Sage Foundation.
Share, D. L., McGee, R., McKenzie, D., Williams, S., & Silva, P. A. (1987). Further evidence relating to the distinction between specific reading retardation and general reading backwardness. British Journal of Developmental Psychology, 5, 35–44.
Share, D. L., & Stanovich, K. E. (1995). Cognitive processes in early reading development: A model of acquisition and individual differences. Issues in Education: Contributions from Educational Psychology, 1, 1–57.
Shaywitz, B. A., Fletcher, J. M., Holahan, J. M., & Shaywitz, S. E. (1992). Discrepancy compared to low achievement definitions of reading disability: Results from the Connecticut Longitudinal Study. Journal of Learning Disabilities, 25, 639–648.
Slavin, R. E., Karweit, N. L., & Madden, N. A. (1989). Effective programs for students at risk. Boston: Allyn & Bacon.
Stanovich, K. E. (1988). Explaining the differences between the dyslexic and the garden-variety poor reader: The phonological core-variable difference model. Journal of Learning Disabilities, 21, 590–604.
Stanovich, K., & Siegel, L. S. (1994). Phenotypic performance profile of children with reading disabilities: A regression-based test of the phonological core difference model. Journal of Educational Psychology, 86, 24–53.
Swanson, H. L. (1999). Reading research for students with LD: A meta-analysis of intervention outcomes. Journal of Learning Disabilities, 32(6), 504–532.
Swanson, H. L., & Hoskyn, M. (1998). A synthesis of experimental intervention literature for students with learning disabilities: A meta-analysis of treatment outcomes. Review of Educational Research, 68, 277–322.
Strauss, S. L. (2001, June/July). An open letter to Reid Lyon. Educational Researcher, 26–33.
Taylor, D. (1998). Beginning to read and the spin doctors of science: The political campaign to change America’s mind about how children learn to read. Urbana, IL: National Council of Teachers of English.
Taylor, B. M., Anderson, R. C., Au, K. H., & Raphael, T. E. (2000). Discretion in the translation of research to policy: A case from beginning reading. Educational Researcher, 29(6), 16–26.
Torgesen, J. K. (2000). Individual differences in response to early interventions in reading: The lingering problem of treatment resistors. Learning Disabilities Research and Practice, 15, 55–64.
Torgesen, J. K. (2002). The prevention of reading difficulties. Journal of School Psychology, 40(1), 7–26.
Torgesen, J. K., & Davis, C. (1996). Individual difference variables that predict response to training in phonological awareness. Journal of Experimental Child Psychology, 63, 1–21.
∗ Torgesen, J. K., Morgan, S. T., & Davis, C. (1992). Effects of two types of phonological awareness training on word learning in kindergarten children. Journal of Educational Psychology, 84(3), 364–370.
Torgesen, J. K., Wagner, R. K., & Rashotte, C. A. (1997). Prevention and remediation of severe reading disabilities: Keeping the end in mind. Scientific Studies of Reading, 1, 217–234.
Torgesen, J. K., Wagner, R. K., Rashotte, C. A., & Herron, J. (2001). A comparison of two computer assisted approaches to the prevention of reading disabilities in young children. Manuscript in preparation.


∗ Torgesen, J., Wagner, R. K., Rashotte, C. A., Lindamood, P., Rose, E., Conway, R., & Garvan, C. (1999). Preventing reading failure in young children with phonological processing disabilities: Group and individual responses to instruction. Journal of Educational Psychology, 91(4), 579–593.
Troia, G. A. (1999). Phonological awareness intervention research: A critical review of the experimental methodology. Reading Research Quarterly, 34, 28–52.
Vaughn, S., Gersten, R., & Chard, D. J. (2000). The underlying message in LD intervention research: Findings from research syntheses. Exceptional Children, 67(1), 99–114.
Vellutino, F. R., Scanlon, D. M., & Lyon, G. R. (2000). Differentiating between difficult to remediate and readily remediated poor readers: More evidence against the IQ-achievement discrepancy definition of reading disability. Journal of Learning Disabilities, 33(3), 223–238.
∗ Vellutino, F. R., Scanlon, D. M., Sipay, E. R., Small, S. G., Pratt, A., Chen, R. S., & Denckla, M. B. (1996). Cognitive profiles of difficult-to-remediate and readily remediated poor readers: Early intervention as a vehicle for distinguishing between cognitive and experiential deficits as basic causes of specific reading disability. Journal of Educational Psychology, 88, 601–638.
Vellutino, F. R., Scanlon, D. M., & Tanzman, M. S. (1994). Components of reading ability: Issues and problems in operationalizing word identification, phonological coding, and orthographic coding. In: G. R. Lyon (Ed.), Frames of Reference for the Assessment of Learning Disabilities: New Views on Measurement Issues (pp. 279–324). Baltimore, MD: Brookes.
∗ Warrick, N., Rubin, H., & Rowe-Walsh, S. (1993). Phoneme awareness in language-delayed children: Comparative studies and intervention. Annals of Dyslexia, 43, 153–173.
Williams, J. (1999). Random observations on reading research. Issues in Education, 5(1), 161–164.
∗ Wise, B. W., & Olson, R. K. (1992). How poor readers and spellers use interactive speech in a computerized spelling program. Reading and Writing: An Interdisciplinary Journal, 4, 145–163.
∗ Wise, B. W., & Olson, R. K. (1995). Computer-based phonological awareness and reading instruction. Annals of Dyslexia, 45, 99–122.
∗ Wise, B. W., Ring, J., & Olson, R. K. (1999). Phonological awareness with and without explicit attention to articulation. Journal of Experimental Child Psychology, 72, 271–304.
∗ Wise, B. W., Ring, J., Sessions, L., & Olson, R. K. (1997). Phonological awareness with and without articulation: A preliminary study. Learning Disability Quarterly, 20, 211–225.

∗ Denotes studies included in the present meta-analysis.
∗∗ Two experiments within this study were included in the present meta-analysis.

APPENDIX A

Author/Date/Journal

Sample Size (M/F)

Sample Characteristics

Description of Intervention for Treatment Group

Description of Intervention for Control Group

Abbott and Berninger (1999) Annals of Dyslexia

20 (13/7)

Subjects were identified as being low achievers in reading. Average verbal IQ was 102.55. Average WRMT-R Word Attack was 83.90 and average WRMT-R Word Identification was 81.25. The age range was 9.58–13.16 years, with a mean of 11.54 years. Ninety percent of the sample was European-American and 10% was Asian-American. Mothers’ level of education (SES determinant) included high school (25%), community college or technical school (50%), college (15%), and graduate school (10%). Ninety percent of the students received resource room services prior to the study and 70% continued to receive services during the study. Ten percent of the sample did not receive any services. Nineteen of 20 children reported reading outside of school; one child reported no reading outside of school.

Structural Analysis Group: Students received training in phonological and orthographic awareness, alphabetic principle, phonological decoding, and oral reading, through direct instruction. Students received 15 minutes of explicit instruction in syllable types and morpheme patterns according to word origin. Students were also encouraged to check for affixes, roots, syllables and letter-sound correspondences during reading. During 24 minutes of oral reading, students were told to divide words into syllables, sound out the syllables, and pronounce the word.

Study Skills Group: Students received 21 minutes of training in phonological and orthographic awareness, alphabetic principle, phonological decoding, and oral reading through direct instruction. Students received 15 minutes of instruction in study skills, including outlining, writing paragraphs, note-taking, and using an index. During 24 minutes of oral reading, students were taught to use letter-sound correspondences to sound out difficult words.

Berninger, Abbott, Zook, Ogier, Lemos-Britton and Brooksher (1999) Journal of Learning Disabilities

48 (Not stated)

Subjects were referred by teachers as performing at the bottom of their class in reading. Mean verbal IQ was 91.60. Mean WRMT-R scores were 78.50 on Word Identification and 82.27 on Word Attack. All students scored 1 SD below the mean on phonological processing, 2 SDs below on rapid naming, and 1/2 to 1 SD below on orthographic processing. Mean age was 89.25 months. Forty students were Caucasian, 1 was Asian-American, 1 was African-American, 4 were Hispanic, and 2 were Native American. SES was indicated by educational level of subject’s parents; 5 parents had less than high school; 21 parents had high school education; 31 parents completed some community college or vocational training; 22 parents had college education; and 12 parents had graduate degrees.

48 words were taught using one of three treatments. Following each treatment session, children had reading practice in connected text material. Subword: Attention given to single-letter or multi-letter unit while associated sounds of each unit are produced. Whole word name is also taught for each word following the subword component. Combination: Words were taught using a combination of both whole word and subword training procedures.

48 words were taught using one of three treatments. Following each treatment session, children had reading practice in connected text material. Whole Word: Attention given to name of whole word and explicit naming of individual letters. Use of a reminding procedure was incorporated to assist memory of words.

Brady, Fowler, Stone and Winburg (1994) Annals of Dyslexia

42 (Not stated)

Subjects were in one of four inner-city public school kindergarten classes. Schools were selected by site administrators and were serving a largely working class, lower SES community. All subjects had PPVT-R IQ scores about 80. Mean age of subjects was 5 years, 4 months. Treatment group consisted of 15 Caucasians, 4 African-Americans, and 2 Asian-Americans. Control group consisted of 20 Caucasians and 1 African-American.

The study was designed with three different components, given in stages during the intervention. Phase 1 (Weeks 1–4): Emphasis of training was on achieving phonological awareness above the level of the phoneme by focusing on sound through rhyming, segmentation, categorization and identification. Phase 2 (Weeks 5–10): Emphasis was on isolating the phoneme and focusing on the articulatory characteristics of the phonemes. Phoneme deletion and phoneme identification tasks were also introduced. Phase 3 (Weeks 11–18): Emphasis was on segmenting and ordering of phonemes within the structure of the syllable. Analysis was supplemented with manipulatives (tiles, markers, etc.).

Subjects continued in regular classroom, which emphasized a “whole language” approach to language arts.

Foorman, Francis, Fletcher, Schatschneider and Mehta (1998) Journal of Educational Psychology

285 (174/111) Grade 1 Classes: 36 Grade 2 Classes: 29

Subjects were in grades 1 and 2 enrolled in 8 of 19 elementary schools in an urban school district. Participants were receiving Title 1 services and represented the lowest 18% of scores on the district’s emergent literacy exam. Sixty percent of the sample was African American and 20% was Hispanic. The remaining 20% of the sample was Caucasian. The school used for the control classroom condition (IC-S) was selected by district administrators, and was noted to have the largest total enrollment, the largest number of students enrolled in the federal lunch program, and the lowest statewide achievement test results for Grade 3 students.

Direct-Code (DC): Focused on letter-sound correspondences practiced in decodable text, taught through direct instruction. Emphasized phonemic awareness, phonics, blending and literature activities within a literature-rich environment. Embedded Code (EC): Focused on systematic spelling patterns (onset-rimes) embedded in predictable, connected text, taught through less direct instruction. Emphasized phonemic awareness and spelling patterns within a literature-rich environment. Implicit Code/Research Implemented Curriculum (IC-R): Focused on meaning-centered literature activities. Taught the alphabetic code embedded in connected text, using indirect, incidental instruction. Emphasized the integration of reading, spelling and writing, within a literature-rich environment.

Implicit Code/Standard District Curriculum (IC-S): Focused on meaning-centered literature activities. Taught the alphabetic code embedded in connected text, using indirect, incidental instruction. Emphasized the integration of reading, spelling and writing, within a literature-rich environment.

Foorman, Francis, Novy and Liberman (1991) Journal of Educational Psychology

80 (40/40)

Students were enrolled in public schools or parochial schools in Houston, Texas. Students were from lower-middle-class and middle-class populations. None of the participants received instruction in English as a second language. Public school classrooms emphasized words in meaningful contexts (less LS group). Students in the less LS group had a mean age of 81.83 months, a mean IQ equivalent score of 100 and a mean Gates-MacGinitie Reading Test grade equivalent score of 1.45. Of the less LS group, 32.5% was of ethnic minority. Parochial school classrooms emphasized letter-sound correspondences (more LS group). Students in the more LS group had a mean age of 79.78 months, a mean IQ equivalent score of 107 and a mean Gates-MacGinitie Reading Test grade equivalent score of 1.60. Thirty percent of the more LS group was of ethnic minority.

More LS Group: In reading, 45 minutes of daily instruction focused on rules for relating letters and sounds, through basal series exercises, sequenced spelling patterns, and all-class instruction. Children were expected to sound out unknown words and blend sounds together to form whole words. In spelling, students were given lists of words from which they completed practice exercises, spelling bees, and spelling tests. Instruction lasted 20 minutes per day. Total letter-sound instruction per day was 45 minutes, occurring during reading instructional time.

Less LS Group: In reading, 45 minutes of daily instruction focused on language experience and on whole words in meaningful contexts. Activities included reading stories, thematic unit exercises, and vocabulary development. In spelling, teachers spent 20 minutes daily displaying words, pronouncing whole words, pronouncing each syllable (onset-rime), and pronouncing each letter sound. Total letter-sound instruction per day was 15 minutes, occurring during spelling instructional time.

Foorman, Francis, Winikates, Mehta, Schatschneider and Fletcher (1997) Scientific Studies of Reading

114 (76/38)

Participants were students in grades 2 and 3, with reading disabilities, enrolled in 13 of 19 elementary schools in an urban school district in the southwestern United States. None of the subjects had emotional disorders, sensory deficits, neurological disorders, or were classified at the lowest level of English as a second language. Students scored at or below the 25th percentile on reading recognition. Students had IQ scores above 79. Students in the synthetic phonics group had higher verbal IQ scores and initial decoding skills than both of the other groups. Sixty-three percent of the sample population was ethnic minority. In terms of SES, 37% of the sample was middle-class, 20% was working-class, and 29% was lower-class (23 students did not respond to this question). The analytic phonics treatment group had a sizable proportion of lower class students.

Synthetic Phonics: The instructional unit of focus was the phoneme. Instruction emphasized letter names, letter sounds, and blending skills, using a systematic approach incorporating multisensory modalities. Analytic Phonics: The instructional unit of focus was the onset-rime. Instruction emphasized manipulating letters, reading and writing, using a direct instruction approach.

Sight Word: The instructional unit of focus was the whole word. Instruction emphasized whole word identification, reading of stories, writing and spelling of whole words.

Foster, Erickson, Foster, Brinkman and Torgesen (1994) The Journal of Research and Development in Education Experiment 1

27 (17/10)

Subjects were children attending Kinderland Child Care Center. This center typically serves lower to middle socioeconomic groups, with many of the students coming from single parent and disadvantaged homes. Subjects were selected on the basis of their scores on Peabody Picture Vocabulary Test-Revised (PPVT-R) and Phonological Awareness Test (PAT). Average scores on PPVT-R for treatment and control groups were 97.6 and 93.2, respectively. Average PAT scores for treatment and control groups were 16.7 and 17.8, respectively. Average age for treatment group was 65.1 months. Average age for the control group was 63.4 months.

DaisyQuest, a computer program, was used to teach and provide practice in using both synthetic and analytic phonological skills. The program consisted of six different instructional activities. The program assists students with recognizing words that rhyme; recognizing words that have the same beginning, middle, and ending sounds; recognizing words that can be formed from a series of separately presented phonemes; and counting the number of sounds in words. All subjects began with a tutorial of a skill area, practiced the skill until mastery, and then moved on to another skill area, until all six activities were completed. A motivational component was incorporated into the program: mastery of each activity elicited clues as to where “Daisy, the dragon” was hiding.

Students participated in the regular preschool program.

Foster, Erickson, Foster, Brinkman and Torgesen (1994) The Journal of Research and Development in Education Experiment 2

70 (Not stated)

Subjects were 2nd semester kindergarten students from four classrooms in a suburban elementary school. Subjects were selected on the basis of their scores on Peabody Picture Vocabulary Test-Revised (PPVT-R). Average scores on PPVT-R for treatment and control groups were 109.2 and 107.0, respectively. Average age for treatment group was 73.2 months, while the average age for the control group was 73.4 months.

DaisyQuest I and II are computer programs used to teach and provide practice in using both synthetic and analytic phonological skills. The entire program consisted of seven different instructional activities. This version of DaisyQuest included more practice items than the program used in Experiment 1. It also included an additional instructional activity, the onset-rime component.

Students participated in the regular kindergarten program.

Fuchs, Fuchs, Mathes and Simmons (1997) American Educational Research Journal

120 students (66/54) 40 classes

Twelve elementary and middle schools were ranked as high, middle, or low level groups based on mean standardized statewide reading scores and the proportion of students receiving free lunch. Six schools were part of a large urban district; six were in two suburban districts. Teachers in the study were those with one or more LD students in their reading class. Students were identified by teachers as being learning disabled (LD), low performing (LP) and in the lowest 25% of class in reading, or average achieving (AA).

Twelve schools were equally divided between PALS and No-PALS conditions, and among high-, mid-, and low-level achievement. Treatment group consisted of 2 high-level, 2 mid-level, and 2 low-level PALS schools. Students were paired together, according to high-low level of performance (e.g. strong with weak) and assigned to read from text material. Each student in each pair rotated the teacher/student roles during instruction. During instruction, students engaged in 10 minutes of partner reading with retell, 10–20 minutes of paragraph summary, and 10 minutes of prediction relay. Comprehension strategies were emphasized. Text material for each group was individualized at the level of the poorer reader.

Control group consisted of 2 high-level, 2 mid-level, and 2 low-level No-PALS schools. Students were taught reading through the usual instructional manner, including silent reading of basal texts and teacher-led, large group discussions.

Hart, Berninger and Abbott (1997) School Psychology Review

12 (6/6)

Subjects were ages 7 years, 0 months to 10 years, 11 months, who had severe problems in reading recognition. Participants were about 2 SD below on WRMT-R Word Identification and Word Attack. All subjects were European-Americans and were enrolled in classrooms that focused on whole language literature activities.

Subjects received 16 sessions of treatment, divided as follows: Multiple Connections: Sessions 1–16 included 7 minutes of phonics (letter-phoneme or letter cluster-phoneme training), 7 minutes of word families (letter cluster-rime training), and 7 minutes of sight words (whole written word/whole spoken word training on individualized list of words). All sessions were followed by 9 minutes of treatment probes.

Subjects received 16 sessions of treatment, divided as follows: Single Connection: Sessions 1–4 included 10 minutes of orthographic coding instruction (whole word, letter, and letter cluster) and 10 minutes of phonological coding (syllable and phoneme segmentation); Sessions 5–8 included 7 minutes each of orthographic and phonological coding and 7 minutes of phonics (letter-phoneme or letter cluster-phoneme training); Sessions 9–16 included 7 minutes of phonics and 7 minutes of word families (letter cluster-rime training). All sessions were followed by 9–10 minutes of treatment probes.



Lovett, Lacerenza, Borden, Frijters, Steinbach and DePalma (2000) Journal of Educational Psychology

85 (61/24)

Subjects exhibited severe deficits in reading. On average, subjects scored more than 2 SD below normal on standardized reading and spelling measures. Mean age of sample was 9.7 years, with a range from 6 years, 9 months to 13 years, 9 months. Verbal and Performance IQ range was 60–122 and 64–130, respectively. All subjects spoke English as their first language. None of the participants had significant behavior disorders, hearing impairment, brain damage, serious emotional disorders, or chronic medical conditions. Twenty-nine percent of the subjects came from low-income families.

Phonological Analysis and Blending/Direct Instruction (PHAB/DI) × 2: Focused on remediating phonological analysis and blending deficits in the context of teaching concurrent letter-sound knowledge and word-attack skills. Involved instruction in word segmenting and blending skills. Training was given at both the oral level and print level. Students were taught letter-sound and letter-cluster-sound correspondences. Content was introduced in direct instruction format. Word Identification Strategy Training (WIST) × 2: Taught students metacognitive phonics to help them use what they already know to decode unfamiliar words. Decoding strategies focused on word identification by analogy, locating familiar parts of words, pronouncing vowels in different variations, and “peeling off” prefixes and suffixes. Students were taught onset-rime segmentation. PHAB/DI to WIST: One cycle of PHAB/DI followed by one cycle of WIST. WIST to PHAB/DI: One cycle of WIST followed by one cycle of PHAB/DI.

Classroom Survival Skills (CSS)/MATH: Provided training in academic survival skills and self-help strategies. Organizational strategies, academic problem-solving, and study skills were part of the focus. Instructional lessons included school behaviors, study strategies, test-taking strategies, and reference skills. Older students were taught different strategies than younger students, based on the needs of the different level classrooms. Generalization of survival skills was developed through a final project requiring students to apply knowledge to project completion. Math cycle taught basic math concepts, computational skills, number facts, and strategies for word problems through direct instruction format.

Lovett, Steinbach and Frijters (2000) Journal of Learning Disabilities

166 (113/53)

Subjects were severely reading disabled students between the ages of 7 and 13. Participants were referred to the research department at The Hospital for Sick Children to remediate specific reading acquisition problems. Subjects were in the lower 20th percentile of reading achievement on screening measures and averaged more than 2 SD below age-norm expectations at referral. Half of the population consistently scored below the 1st percentile for age on standardized achievement measures. Mean verbal IQ was 92.0 and mean performance IQ was 98.7. Average age of subjects was 9.9 years. All children in the study spoke English as their first language. Students were categorized into one of three treatment subgroups according to their type of reading disability: Phonological-only (PHON) deficit, Visual naming-speed (RAN) deficit, or Double Deficit (DD).

Phonological Analysis and Blending/Direct Instruction (PHAB/DI): Focused on remediating phonological analysis and blending deficits in the context of teaching concurrent letter-sound knowledge and word-attack skills. Involved instruction in word segmenting and blending skills. Training was given at both the oral level and print level. Students were taught letter-sound and letter-cluster-sound correspondences. Content was introduced in direct instruction format. Word Identification Strategy Training (WIST): Taught students metacognitive phonics to help them use what they already know to decode unfamiliar words. Decoding strategies focused on word identification by analogy, locating familiar parts of words, pronouncing vowels in different variations, and “peeling off” prefixes and suffixes. Students were taught onset-rime segmentation.

Control: Provided training in academic survival skills and self-help strategies. Organizational strategies, academic problem-solving, and study skills were part of the focus. Older students were taught different strategies than younger students, based on the needs of the different level classrooms.


149 (92/57)

Participants were disabled and non-disabled students in grades two to six who were referred by teachers based on poor classroom performance and low standardized test scores. None of the subjects displayed sensory or neurological deficits or irregular school attendance, and none were learning English as a second language. All students had IQ scores of at least 90, with an average IQ and PIAT ratio of 103.3 and 78, respectively. Average age was 9.8 and average grade was 3.8.

Children read stories on the computer and were trained to “target” difficult words, which would allow them to ask the computer for help with the words. When feedback for a word was requested, the word or its segments were initially highlighted, according to treatment condition, without speech and the subject was encouraged to attempt to decode the word. After a brief delay, the word or its orthographic segments were highlighted, according to treatment condition, in unison with the production of the matching speech segments. Syllable Feedback: Using a combination of BOSS and stress rules, this feedback divided syllables by preserving morphemic units in multimorphemic words (e.g. read/er). Onset-Rime Feedback: This feedback divided syllables between the initial consonant cluster and the vowel-consonant group. Whole Word Feedback: This feedback gave assistance with the whole word.

Students received the normal course of reading instruction in their language arts or special education classroom. Activities included reading, writing, and spelling.

Torgesen, Morgan and Davis (1992) Journal of Educational Psychology

48 (23/25)

Subjects were in kindergarten classes within schools serving students predominately from working class families. Mean age for groups AB, B, and C was 71.1, 71.0, and 70.1, respectively. Participants were of average verbal knowledge and were considered nonreaders. Mean vocabulary score for

AB Group: Students received phonological awareness training that included analytic and synthetic activities. In terms of analysis, students were taught how to pronounce all sounds separately. Subjects were taught to identify and pronounce beginning, middle, and ending sounds in two- and

C Group: Students received exposure to a variety of meaning-based language experiences and activities. Subjects listened to stories, discussed illustrations, answered comprehension questions, dramatized events from the story, and related personal experiences to the story’s events.


Olson and Wise (1992) Reading and Writing

Torgesen, Wagner, Rashotte, Rose, Lindamood, Conway and Garvan (1999) Journal of Educational Psychology

138 (69/69)

Subjects were kindergarten students at 13 elementary schools selected on the basis of their scores on screening measures of letter-name knowledge and phonological awareness. Verbal IQ scores ranged from 78 to 126, with means of 92.4, 92.7, 90.7, and 91.8 for PASP, EP, RCS, and NTC groups, respectively. Mean ages of PASP, EP, RCS, and control groups were 65.8, 65.2, 64.9, and 66.0, respectively. Fifty-four percent of the sample was ethnic minority. None of the subjects spoke English as a second language, were repeating kindergarten, or were receiving extensive special education services.

No treatment control (NTC)


three-phoneme words. In terms of synthesis, students were taught to pronounce words after hearing the phonemes presented in sequence. B Group: Students received phonological awareness training that focused on synthesis only. Subjects were taught to identify words represented by sequences of separately presented phonemes.

Subjects were provided one-on-one instructional support in reading, in addition to their regular classroom instruction, according to each treatment condition. Phonological Awareness Plus Synthetic Phonics (PASP): Systematic training was carried out at an oral and motor level (articulatory awareness training). Seventy-four percent of the treatment session was spent on phonological awareness activities, letter-sound correspondences, and phonemic decoding; 6% and 20% were spent on sight word building and reading/writing connected text, respectively. Embedded Phonics (EP): Systematic, less intensive training in phonological awareness skills was carried out in the context of writing activities. Twenty-six percent of the treatment session was spent on phonological awareness activities, letter-sound correspondences, and phonemic decoding; 17% and 57% were spent on sight word building and reading/writing connected text, respectively. Regular Classroom Support (RCS): Individual instruction aimed at supporting the reading curriculum in the regular classroom. Twenty-four percent of the treatment session was spent on phonics activities, 24% on sight word building, and 43% on meaningful experiences with print. The remaining 9% was spent on spelling activities.


groups AB, B, and C were 94.1, 96.7, and 93.8, respectively. None of the participants had poor attendance records or emotional disorders, and none attended special classes.


Appendix A.





Vellutino, Scanlon, Sipay, Small, Pratt, Chen and Denckla (1996) Journal of Educational Psychology

102 (Not stated)

Individualized Tutoring Group: Students received 30 minutes of one-on-one instruction each day for approximately 70–80 sessions. Tutoring sessions were individualized for each student, but included reading connected text, word identification strategy instruction, sight word building, phonemic awareness training, alphabetic principle awareness, phonetic coding, and writing skills instruction.

Small Group: Students received one-on-one, small group remediation assistance from their home school. Components varied from school to school, but could have included similar techniques as the tutored group, or a structured basal approach to reading.

Warrick, Rubin and Rowe-Walsh (1993) Annals of Dyslexia

28 (Not stated)

Language-Delayed Training: Students were exposed to syllable awareness activities including clapping, counting, and categorizing syllables in words. Children were taught to hear initial sounds in words through prolongation, iteration, and repetition of phonemes. Rhyming games and activities were used to train manipulation of phonemes. Segmentation training (using the say-it-and-move-it techniques) was delivered using manipulatives.

Language-Delayed Control: No training provided. Not stated how children were taught.

Wise and Olson (1992) Reading and Writing

28 (20/8)

Subjects were in kindergarten or first grade classrooms in middle-to-upper-class school districts in New York. All participants had either a Verbal or Performance IQ score at or above 90. Subjects were classified as poor readers and scored at or below the 15th percentile in reading achievement, based on standardized screening measures. None of the subjects had sensory disorders, severe emotional disorders, frequent ear infections, limited intellectual ability, or neurological disorders. All children spoke English as their first language and were not administered medication on a daily basis.

Subjects were kindergarten students who were identified as language-delayed according to screening measures. All students scored in the average range in non-verbal intelligence and were nonreaders. None of the subjects spoke English as a second language, displayed sensory deficits, or had physical or emotional disorders. All students were judged as having similar socio-economic backgrounds. None of the participants had ever received formal reading instruction.

Subjects were ages 7–14 who volunteered for clinical studies at the University of Colorado for reading and/or spelling difficulties. On WRAT spelling subtest, all students tested at least one grade level below national averages. All but three of the participants tested at least one grade level below national averages on WRAT reading word recognition subtest. Average age and

Interactive Feedback: Subjects used the “Spello” computer program, designed to provide interactive speech feedback. The computer pronounced the word to be spelled, followed by the student attempting to type the word correctly. Word lists were individualized based on pre-test measures. The program showed students which letters were correct in their spelling exercises

Word-Only Feedback: Subjects used the “Spello” computer program, designed to provide interactive speech feedback. The computer pronounced the word to be spelled, followed by the student attempting to type the word correctly. Word lists were individualized based on pre-test measures. The program showed students which letters were correct in their spelling exercises



(orthographic feedback). The computer additionally provided intermediate whole-word speech feedback for children’s spelling attempts, so subjects could explicitly hear how their attempts sounded and could compare it with the target word. Subjects also had the additional option of interactive speech feedback for any spelling attempt that included a vowel, so that subjects could hear how their attempt in progress sounded without having to complete the entire word to receive feedback.

(orthographic feedback). The computer additionally provided intermediate whole-word speech feedback for children’s spelling attempts, so subjects could explicitly hear how their attempts sounded and could compare it with the target word.

103 (69/34)

Participants were in grades two to five who were assessed by their teachers as being in the lowest 10% in reading. Students had verbal or performance IQ estimates at or above 90, with a mean of 102 and 101 for PA and CS groups, respectively. Average age for the CS group was 8.96. Average age for the PA group was 8.97.

Phonological Awareness: Students were taught sounds, letters, and articulatory training using computer programs in groups of three students. Small group sessions were supplemented with individualized practice on the computer. Subjects were able to practice phonological awareness skills using analysis, non-word reading, phonological decoding and sound-spelling computer activities. Students also used ROSS, a storybook reading program, designed to supply target word assistance and reading comprehension feedback.

Comprehension Strategy: Students were taught four comprehension strategies, based on Reciprocal Teaching (Palincsar & Brown) in groups of three students. Small group sessions were supplemented with individualized practice on the computer. Students also used ROSS, a storybook reading practice program, designed to supply target word assistance and reading comprehension feedback.

Wise, Ring and Olson (1999) Journal of Experimental Child Psychology

153 (89/64)

Subjects were students in grades two to five who had significant problems in word recognition and were ranked in the lower 10% of their class. Verbal IQ scores for all subjects were at or above 85. No subjects exhibited emotional problems or sensory deficits. All subjects spoke English as their first language. Articulation-only group students' mean age, grade, and IQ were 8.7, 3.1, and 103.1, respectively. Sound manipulation group students' mean age, grade, and IQ were 8.7, 3.1, and 103.7, respectively. Combination group students' mean age, grade, and IQ were 9.1, 3.5, and 103.7, respectively. Control group

All students received equivalent time in instruction in small groups, on the computer, and one-on-one with the teacher. Articulation-Only: 13.8 hours of small group instruction in explicit articulatory training and phonics instruction, followed by 8 hours of articulatory computer practice and 15.7 hours of ROSS story reading. Sound Manipulation: 14.1 hours of small group instruction in explicit manipulation of sounds within syllables and phonics instruction. 10.7 hours of phonological analysis and sound/letter manipulation computer practice. 12.7 hours

Control students spent time in regular language arts class activities.


Wise and Olson (1995) Annals of Dyslexia


grade for treatment group was 11.20 and 6.50, respectively. Average age and grade for control group was 10.40 and 5.60, respectively. Average score on WRAT reading for Interactive Feedback and Word-Only Feedback groups was 4.97 and 4.68, respectively.



43 (36/5)



students' mean age and grade were 8.5 and 2.7 (IQ was not measured).

of ROSS story reading. Combination: 13.6 hours of small group instruction in explicit articulatory training, explicit manipulation of sounds and letters, and phonics instruction. Nine hours of articulatory computer practice, 6.5 hours of phonological analysis and sound/letter manipulation computer practice. 10.3 hours of ROSS story reading. Articulatory Phonological Awareness: Four sets of students received articulatory-based training in phonological analysis skills. Children were encouraged to use mirrors while learning gestures in a small group setting. Children individually worked on computer programs practicing analysis and manipulation of sounds within syllables. Programs used included PAL, NON, Spello, and ROSS. Subjects occasionally used “Marvin” computer program to further practice articulatory awareness.

Subjects were in grades two to five and were the lowest readers in their grades based on screening measures. All students spoke English as their first language, had no primary sensory disabilities, and displayed no emotional disorders. Verbal and Performance IQ scores were at or above 90. Mean age for the non-articulatory PA and the articulatory PA group was 9.0 and 9.6, respectively. Mean grade of nonarticulatory PA and articulatory PA group was 3.4 and 4.1, respectively.


Nonarticulatory Phonological Awareness: Four sets of students received nonarticulatory phonological analysis training. Subjects were trained to manipulate first syllables, rhymes, and phonemes in all positions of single syllable words. They reviewed onset-rime patterns in a small group setting. Children worked individually on computers reviewing consonant and vowel sounds. Programs used were PAL, NON, Spello, and ROSS.


Wise, Ring, Sessions and Olson (1997) Learning Disability Quarterly


Control Group N

Research Design Type

Treatment Assign.

Instructional Materials

Treatment Length

Internal Validity

Sample Variation


10

10

Pretest/Posttest design with control and exper. group

Intact sample with random assign to treatment

Words program (Henry, 1990); vocabulary list of story words in Decoding Strategies Student Book (Engelmann, Meyer, Johnson & Carnine, 1988); Talking Letters (Berninger, 1998); Social Skills Program (Drumm, 1986); colored disks for sound segmentation; stories read from Corrective Reading Skill Applications (Engelmann et al., 1988)

One hour each week for 16 weeks Total treatment time was 16 hours

1-2-2-1-3-13-3-3-3-3

a, g, h


Subword, 16 Combo, 16

Whole, 16

Pretest/Posttest design with control and two exper. groups


Lists of high-frequency words (Graham, Harris & Loynachan, 1993, 1994); preprimer level storybook; 3 × 5 blank cards

30 minutes for 8 sessions (over 4–8 weeks) Total treatment time was 4 hours

1-2-1-1-1-13-3-3-3-3-3

None

Brady, Fowler, Stone and Winbury (1994) Annals of Dyslexia

21

21

Pretest/Posttest design with control and exper. group

Not stated

Segmentation training procedures (Fox & Routh, 1976; Rosner & Simon, 1971); ADD (Lindamood & Lindamood, 1975); Syllable structure training (Blachman, 1987); Categorization activity (“odd one out”); rhyming games; articulatory awareness games; manipulatives.

20 minutes three times per week for 18 weeks Total treatment time was 18 hours

2-2-2-3-2-31-3-3-3-3

a, d, f, g, h, i


Treatment Group N


APPENDIX B Authors/Date/ Journal



DC: Grade 1: 9 classes Grade 2: 4 classes EC: Grade 1: 11 classes Grade 2: 3 classes IC-R: Grade 1: 11 classes Grade 2: 14 classes

IC-S: Grade 1: 5 classes Grade 2: 8 classes

Pretest/Posttest design with control and three exper. groups

Not stated

Collections for Young Scholars (Open Court Reading, 1995); general literature found in meaning-centered classrooms; writing workshop activities; predictable books; common list of sequenced spelling patterns with referenced books that use such patterns; whole class reading and writing activities; magnetic letters and acetate boards

60 minutes of classroom instruction and 30 minutes of one-on-one or small group tutoring (instructional approach matched treatment or control condition) per day over 7 months

1-2-2-1-2-11-2-1-3-1

f, g, h


40 3 classes in 2 private schools

40 3 classes in 1 public school


Intact sample with random assign to treatment group

65 minutes daily (45 minutes of reading and 20 minutes of spelling) for eight months

1-2-2-1-2-11-2-1-3-3

g, h, i

Foorman, Francis, Winikates, Mehta,

Not stated

Not stated

Pretest/Posttest design with control and two exper. groups

Not stated

HBJ Reading (Harcourt Brace Jovanovich, 1987); Scott, Foresman Reading (Scott, Foresman, 1985); Phonics Practice Readers, Series B (Modern Curriculum Press, 1986); Modern Curriculum Press Phonics Program workbook (Elwell, Murray & Kucia, 1988); 40 regular and 20 exception single-syllable words with consistent level of word frequency and familiarity (Carroll, Davies, and Richman, 1971) Synthetic phonics program (Cox, 1991); reading and spelling decks; Sight word

60 minutes per day, for 1 year

3-2-2-1-2-13-3-2-3-1

d, e, f, g, h, i




12

15

Pretest/Posttest design with control and exper. group


Apple IIGS computers with CPU, RGB color monitor, mouse, earphones, and external 3.5 disk drive; DaisyQuest I computer program (Erickson, Foster, Foster & Torgesen, 1992)

20–25 minutes; 20 sessions; 400–500 total treatment minutes

2-2-2-3-2-33-3-1-3-3

a, g, h, i


34

35

Pretest/Posttest design with control and exper. group


Four Macintosh LC computers each with an internal hard drive, mouse, and headphones. DaisyQuest I and II computer programs (Erickson, Foster, Foster & Torgesen, 1992)

Session length not stated 16 sessions; 4–5.3 total treatment hours. Average time was 4.9 hours

2-2-2-3-2-23-1-3-3-3

a, g, h, i

Fuchs, Fuchs, Mathes and Simmons (1997) American Educational Research Journal

60 students 20 teachers

60 students 20 teachers


Stratified sampling with random assign to treatment

Basic classroom reading materials including basal texts, library books, novels, weekly readers, and content area textbooks; score and cue cards; instructional plan sheets

35 minutes 3 times per week for 15 weeks Total treatment time was about 26 hours

1-2-2-1-2-13-3-1-3-1

g, h


program activities (Edmark Reading Program, 1984); onset-rimes word list with supporting game and stories; phonetic readers; teacher script for analytic phonics component; whole word direction cards; storybook lessons

Schatschneider and Fletcher (1997) Scientific Studies of Reading


Hart, Berninger and Abbott (1997) School Psychology Review

6 3/3

6 3/3



List of instant words (Fry, Polk & Fountoukidis, 1984); 3 × 5 cards; Phonics Pictionary; List of common phonograms (Stahl, 1992)

30 minutes daily, 4 days per week, for 4 consecutive weeks Total treatment time was 8 hours

1-1-2-1-1-13-3-3-3-3

None

Lovett et al. (2000) Journal of Educational Psychology

PHAB/DI to WIST: 15 WIST to PHAB/DI: 10 WIST × 2: 20 PHAB/DI × 2: 18

CSS to MATH: 22

Pretest/Posttest design with control and four exper. groups

Intact sample with pseudo-random assign to treatment and control groups

Reading Mastery Fast Cycle I/II and Teacher’s Guide (Engelmann & Bruner, 1988); Corrective Reading Program (Engelmann, Carnine & Johnson, 1988); Benchmark program metacognitive phonics strategies (Gaskins, Downer & Gaskins, 1986); Skills for School Success (Archer & Gleason, 1991); Connecting Math Concepts (Engelmann & Carnine, 1992); math manipulatives

1 hour per day over 70 sessions Total treatment time was 70 hours

1-2-2-1-2-33-3-2-3-3

a, b, g, h


PHAB/DI: DD: 27 PHON: 12 VNS: 13 WIST: DD: 32 PHON: 8 VNS: 13

DD: 17 PHON: 11 VNS: 7

Pretest/Posttest design with control and two exper. groups

Intact sample with random assign to treat groups

Reading Mastery Fast Cycle I/II (Engelmann & Bruner, 1988); Corrective Reading (Engelmann, Carnine & Johnson, 1988); Skills for School Success (Archer & Gleason, 1991); metacognitive phonics strategies based on those developed by Gaskins, Downer and Gaskins (1986)

60 minutes, 4 sessions per week Total treatment time was 35 hours

1-2-2-1-1-33-3-2-1-3

a, g



119

42


Intact sample with pseudo-random assign to treatment group

Computer PC/XT clone systems with serial ports, monochrome monitor, mouse, 40 MB hard disk, 360K floppy, and 640K main memory; DECtalk speech synthesizer; Hercules graphics video adaptor; large number of short stories and books entered onto the computer.

Mean reading time was 8.1 hours with 14.2 total hours on the system.

2-2-2-1-2-33-3-1-3-1

g, h, i


AB: 16 7/9 B: 17 9/8

C: 15 7/8

Pretest/Posttest design with control and two exper. groups


Word sets; phonological awareness activities that allow analysis of words and synthesis of phonemes to create words; meaning-based language arts curriculum

20 minutes 3 times per week AB: 8 weeks of training B & C: 7 weeks of training Total treatment time was 7–8 hours

2-2-2-2-1-33-3-1-3-3

a, b, g

Torgesen, Wagner, Rashotte, Rose, Lindamood, Conway and Garvan (1999) Journal of Educational Psychology

PASP: 33 19/14 EP: 36 15/21 RCS: 37 21/16

32 14/18


Intact sample with random assign to treatment condition

ADD (Lindamood & Lindamood, 1984); limited vocabulary reading books; Poppin Readers (Smith, 1992); Early Literacy Series (Hannah, 1993); list of high frequency words; word-level drill activities and word games; picture-word cards; regular basal text series (Harcourt Brace Jovanovich Bookmark Series, 1983).

80 minutes per week for 5 semesters Total treatment time was about 88 hours

2-2-2-3-2-13-3-2-2-2

a, b, g, h, i

Vellutino et al. (1996) Journal of Educational Psychology

74

26


Intact sample with random assign to treatment group

Regular reading recognition remediation materials

30 minutes per day for about 15 weeks Total treatment time was about 37 hours, 30 minutes

2-2-2-3-2-12-3-1-3-1

a, g, h, i




Warrick, Rubin and Rowe-Walsh (1993) Annals of Dyslexia

14

14


Stratified sampling but assign to treatment not stated

20 minutes two times per week for eight weeks Total treatment time was 5 hours and 20 minutes

1-2-2-3-2-33-3-3-3-3

g, h, i

Wise and Olson (1992) Reading and Writing

15

13


Volunteer sampling but with pseudo-random assign to treatment group

Stories prepared with several examples of a single phoneme; children’s literature, including rhymes; manipulatives for phoneme manipulation exercises; Say-it-and-Move-it (Ball & Blachman, 1988) IBM-XT compatible computer with mouse and DECtalk speech synthesizer; “Spello” interactive spelling program

15–40 minutes per session Interactive feedback average treatment time was 27 and 24.8 minutes; Word-only feedback average treatment time was 25.2 and 28.8 minutes 30 minutes for 14 sessions;

1-2-1-1-1-33-3-2-3-3

a, b, c


58 39/19

45 30/15

Pretest/Posttest design with control and exper. group


PA: Total treatment time was 7 hours of PA training and 18.4 hours computer practice CS: Total treatment time was 7 hours CS training and 18 hours computer practice

2-2-2-1-3-23-3-2-3-1

a, b, g

Wise, Ring and Olson (1999) Journal of Experimental

Articulation-Only: 43 Sound Manipulation:

31


Intact sample but pseudo-random assign to treatment groups

ROSS computer program; ADD (Lindamood & Lindamood, 1975); ADD computerized training (Lindamood-Bell Learning Processes); NON (non-word reading program); Spello computer program (Wise & Olson, 1992) IBM compatible Pentium-based computer with DECtalk speech boards; Sound Blaster

30 minutes daily for 6 months Total treatment time about 14 hours of

2-2-2-1-2-13-3-2-3-1

a, c, g, h, i



42 Combination: 37


24 21/3

17 15/2


Not stated

boards; Computer training programs & games (Lexia Learning Systems, 1994); ADD computer program (Lindamood-Bell Learning Processes, 1997); ROSS, PAL, Non, & Spello programs; ADD program (Lindamood & Lindamood, 1975); nonsense word reading program: Marvin computer program (Lindamood-Bell)

instruction and about 24 hours of individual work on the computer

IBM-compatible 486 with DECtalk speech synthesizer and Sound-blaster boards; articulatory awareness computer program (Lindamood-Bell Learning Processes); PAL, NON, Spello, and ROSS programs; nonarticulatory training programs (Lexia Learning Systems); ADD manual (Lindamood & Lindamood, 1975); Marvin computer program (Lindamood-Bell); colored strips and squares; keyword chart of words.

Total treatment time was 9 hours of small group instruction and 21 hours of computer practice time

2-1-2-1-1-33-3-2-3-3

a, b, c, g


Child Psychology


APPENDIX C Dependent Measures


Modified Rosner test of auditory analysis skills (Berninger, Thalberg, DeBruyn & Smith, 1987) Comprehensive assessment of phonological skills (Wagner & Torgesen, 1999) WRMT-R (Woodcock, 1986)

Pre-test Treatment X


Qualitative reading inventory (Leslie & Caldwell, 1990) Real word and non-word reading efficiency tests (Wagner & Torgesen, 1999) WRMT-R (Woodcock, 1986)

SD

Syllable segmentation 6.20 2.20 Phoneme segmentation 21.50 4.97 Phonological segmentation 12.30 4.06 Nonword memory 16.60 3.66 Word identification 85.40 16.29 Word attack 84.10 16.80 Passage comprehension 87.40 24.16 3.80 1.98

Real word reading efficiency 50.40 14.27 Non-word reading efficiency 22.89 8.61 Word identification: subword 10.87 7.59 Combined 16.31 9.38 Word attack: subword 3.82 3.33 Combined 4.12 2.99 Subword 10.75 9.31

Pre-test Control X

SD

5.60

2.37

20.60

3.10

8.80

2.62

17.10

3.35

77.10

11.15

83.70

9.02

80.60 2.78

17.32 1.77

42.89

5.90

16.57

7.21

Whole word 11.38

7.97

Whole word 3.75

Whole word 9.38

4.33

6.19

Post-test Treatment X

SD

Syllable segmentation 7.40 2.01 Phoneme segmentation 25.10 5.71 Phonological segmentation 18.40 4.77 Nonword memory 18.20 3.85 Word identification 90.30 15.85 Word attack 89.40 12.18 Passage comprehension 95.90 20.64 4.80 2.04

Real word reading efficiency 53.30 13.90 Non-word reading efficiency 24.56 9.57 Word identification: subword 21.67 11.74 Combined 21.15 11.27 Word attack: subword 8.0 5.66 Combined 6.25 5.57 Subword 38.31 15.04

Post-test Control

Effect Size

X

SD

6.80

1.40

0.35

25.50

2.46

−0.10

17.80

4.24

0.13

20.00

0.39

−0.85

82.30

9.50

0.63

87.30

9.46

0.19

88.70 3.33

15.76 1.82

0.40 0.76

45.89

8.22

0.67

17.71

8.10

0.78

Whole word 18.00

9.28

0.35 0.31

Whole word 5.38

4.52

0.52 0.17

Whole word 33.69

16.52

0.29



Brady, Fowler, Stone and Winbury (1994) Annals of Dyslexia

PPVT-R (Dunn & Dunn) K-ABC triangles (Kaufman & Kaufman, 1983) WRMT-R (Woodcock, 1986)

94.16

8.76

0.40 −0.26

9.83

2.82

10.07

3.06

−0.08

8.77

17.60

8.41

17.60

9.87

0.00

3.06

4.70

9.88

9.45

11.25

8.19

−0.16

0.00

0.00

0.00

0.24

0.54

0.04

0.30

0.48

0.00 1.06

0.00 0.70

0.00 1.43

0.00 3.90

0.00 1.81

0.00 1.48

0.00 1.79

0.00 1.34

0.00

0.00

0.00

0.76

1.55

0.13

0.44

0.63

Phoneme deletion (DELET) 0.05 0.22

0.41

0.74

3.46

2.93

2.67

2.42

0.30

Letter ID (upper) 9.12 Letter ID (lower) 3.74 Word identification 0.0 Word attack 0.0 0.70

0.00

84.98

16.39

2.40

8.70

2.34

10.47

9.86

6.72

Combined 39.75 91.77

13.65 9.13

8.62

10.42 11.81

8.92

8.02

5.92

5.16

13.03

9.78

9.93

6.32

0.39

7.18

2.10

6.47

2.39

8.51

2.89

7.36

2.60

0.42

6.20

2.29

6.93

2.05

7.80

1.67

7.27

1.95

0.29

15.23

3.53

14.57

2.94

18.40

3.52

16.95

4.26

0.37


Experimental rhyme generation (RHYME) Experimental phoneme segmentation (SEG) Auditory analysis test (Rosner & Simon, 1971) Memory for word strings (MEM) (Rapala & Brady, 1990) Perception (PERCEPT) (Brady, Shankweiler & Mann, 1983) Experimental tongue twister (PRODUC) Boston naming test (Goodglass & Kaplan, 1983)

Combined 12.25 82.96


Reading inventory

Dependent Measures



Torgesen-Wagner battery (Wagner, Torgesen & Rashotte, 1994)

Phonological processing Grade 1 DC 0.68 0.54 Grade 2 DC 1.74 0.80 Grade 1 EC 0.37 0.36 Grade 2 EC 1.38 0.74 Grade 1 IC-R 0.51 0.55 Grade 2 IC-R 1.58 0.62 Grade 1 DC 0.20 0.51 Grade 2 DC 5.73 6.66 Grade 1 EC 0.18 0.88 Grade 2 EC 4.75 4.92 Grade 1 IC-R 0.07 0.32 Grade 2 IC-R 5.12 5.24

N/A

N/A

Post-test Treatment

Post-test Control

X

X

X

Grade 1 IC-S 0.43 Grade 2 IC-S 1.48


Woodcock-Johnson-Revised N/A

N/A


Pre-test Control SD

0.50 0.70

0.61 4.90

SD

Phonological processing Grade 1 DC 2.16 0.83 Grade 2 DC 2.51 0.60 Grade 1 EC 1.59 0.77 Grade 2 EC 2.18 0.71 Grade 1 IC-R 1.53 0.88 Grade 2 IC-R 2.21 0.73 Grade 1 DC 12.68 10.21 Grade 2 DC 19.43 10.03 Grade 1 EC 5.00 8.15 Grade 2 EC 18.29 12.02 Grade 1 IC-R 5.23 7.20 Grade 2 IC-R 16.16 14.32 Basic DC 96.1 14.6 EC 88.6 11.2 IC-R 89.6 12.7 Passage comprehension DC 96.7 15.9


Effect Size

SD

0.86

1.11

0.64

0.98 0.45 0.41 0.36 0.45


2.81

1.65

9.35

0.53 0.56 0.37 0.66 0.16

IC-S 84.5

9.7

0.96 0.39 0.46

IC-S 89.0

12.1

0.55


Experimental word reading

SD


Appendix C. Authors/Date/ Journal

Formal reading inventory (Wiederholt, 1986)

N/A

91.4 IC-R 92.0 DC 81.8 EC 80.8 IC-R 81.5 N/A N/A

N/A

Reading Regular and exception word reading

N/A N/A

Torgesen-Wagner battery (Wagner, Torgesen & Rashotte, 1994)

Phonological analysis SP 2.08 0.89 AP 1.66 0.88 SP 7.72 9.17 AP 5.84 7.95


PAT (Torgesen & Bryant, 1993)

16.7

STOPA-E (Torgesen & Bryant, 1990)

14.8

Foster, Erickson, Foster, Brinkman and Torgesen (1994) The Journal of Research and

Undersea challenge

47.7

3.8

46.8

3.9

STOPA-E (Torgesen & Bryant, 1990)

25.3

5.2

22.9

8.2

Experimental word-reading

N/A N/A

SW 1.55

0.82

SW 5.29

7.81

4.1

17.8

2.6

7.2

11.4

6.8

0.19

14.8 9.4

0.22 IC-S 83.1

6.9

−0.30

8.3

−0.21 1.35b 0.89b

8.7 N/A N/A

Phonological analysis SP 2.63 0.80 AP 2.32 0.79 SP 20.78 15.30 AP 15.39 13.15 22.2 Adjusted 22.4 14.8 Adjusted 18.5

3.1

53.5 Adjusted 53.3 27.0 Adjusted 26.2

9.0

7.2

4.5

−0.16

SW 2.08

1.01

0.60 0.27

SW 18.13

15.96

0.17 −0.19


3.5

0.88

6.5

0.97 0.53


7.4

0.91

7.4

0.87 0.42

0.89

0.15


Foorman, Francis, Novy and Liberman (1991) Journal of Educational Psychology Foorman, Francis, Winikates, Mehta, Schatschneider and Fletcher (1997) Scientific Studies of Reading

12.7


EC

Dependent Measures


Development in Education Experiment 2

Hart, Berninger and Abbott (1997) School Psychology Review Lovett et al. (2000) Journal of Educational Psychology

X

SD

Production test of segmenting

7.2

4.0

5.8

4.5

Production test of blending

9.1

5.4

8.6

5.5

243.08

100.99

233.72

4.31

1.92

9.57

5.04

Comprehensive reading assessment battery (Fuchs, Fuchs & Maxwell, 1988) Experimental reading comprehension Experimental maze choices WRMT-R word attack subtest pseudoword reading (Woodcock, 1987) Trained content – keywords

Trained content – letter-sound knowledge

N/A

PHAB/DI to WIST 65.3 WIST to PHAB/DI 57.5 WIST × 2 69.2 PHAB/DI × 2 63.5 PHAB/DI to WIST 30.9 WIST to PHAB/DI 27.3

4.35

1.73

10.08

4.57

SD 2.6

98.71

5.8 Adjusted 6.0 10.7 Adjusted 10.8 269.18

5.98

1.79

12.93

4.50

1.9

N/A

68.0

33.2

41.7 32.8

3.5

X

101.68

23.2

10.7

Post-test Control

X 12.0 Adjusted 11.8 13.6 Adjusted 13.4 298.68

N/A

33.7

Post-test Treatment

31.3

11.8

PHAB/DI to WIST 115.7 6.4 WIST to PHAB/DI 104.1 20.4 WIST × 2 108.0 19.4 PHAB/DI × 2 97.8 18.9 PHAB/DI to WIST 57.4 3.8 WIST to PHAB/DI 51.1 4.8

Effect Size

SD 4.0

1.88

3.8

1.76 1.02

96.07

0.91 0.30

5.18

1.66

0.46

11.15

3.88

0.42 1.44b

N/A

85.5

30.6

1.63 0.73 0.90 0.50

38.0

8.5

3.16 1.97


Fuchs, Fuchs, Mathes and Simmons (1997) American Educational Research Journal

(Continued )

Pre-test Control

SD



Transfer to real words – test of transfer

Transfer of real words – challenge words

Test of transfer – regular words

Test of transfer – exception words

11.6 13.4 9.3

18.2

9.8

6.8 12.7 10.0 22.8

25.0

20.9

12.9 28.4 31.1 9.9

6.7

9.4

5.9 14.0 13.7 9.9

9.4

9.6

5.9 14.0 13.7 4.7 3.8

4.4

2.03 1.63 23.2

7.3

0.28 0.12 0.61 0.43

41.2

29.6

1.07 0.86 0.71 0.46

15.5

16.5

1.51 0.81 1.17 0.51

18.0

15.4

0.49 0.04 0.42 0.13

8.0

7.1

0.21 −0.09


2.6

WIST × 2 54.1 7.4 PHAB/DI × 2 49.4 5.5 PHAB/DI to WIST 25.6 10.0 WIST to PHAB/DI 24.0 6.1 WIST × 2 28.9 11.3 PHAB/DI × 2 26.4 7.5 PHAB/DI to WIST 69.3 23.0 WIST to PHAB/DI 61.1 16.8 WIST × 2 62.2 30.0 PHAB/DI × 2 55.3 32.1 PHAB/DI to WIST 43.1 20.1 WIST to PHAB/DI 30.1 19.5 WIST × 2 41.4 27.8 PHAB/DI × 2 26.4 26.3 PHAB/DI to WIST 25.5 15.2 WIST to PHAB/DI 18.5 13.0 WIST × 2 25.1 18.5 PHAB/DI × 2 20.1 16.5 PHAB/DI to WIST 9.7 9.4 WIST to PHAB/DI 7.3 8.0


WRMT-R passage comprehension

WIST × 2 35.5 PHAB/DI × 2 31.1 PHAB/DI to WIST 16.6 WIST to PHAB/DI 14.0 WIST × 2 19.9 PHAB/DI × 2 17.4 PHAB/DI to WIST 25.1 WIST to PHAB/DI 15.6 WIST × 2 34.1 PHAB/DI × 2 24.8 PHAB/DI to WIST 9.2 WIST to PHAB/DI 5.8 WIST × 2 15.8 PHAB/DI × 2 9.5 PHAB/DI to WIST 9.2 WIST to PHAB/DI 5.8 WIST × 2 15.8 PHAB/DI × 2 9.5 PHAB/DI to WIST 3.9 WIST to PHAB/DI 1.8

Dependent Measures

WRMT-R word attack

Goldman-Fristoe-Woodcock sound analysis

Goldman-Fristoe-Woodcock sound blending

(Continued )

Pre-test Treatment

Pre-test Control

Post-test Treatment

Post-test Control

X

X

X

X

SD

WIST × 2 7.3 8.5 PHAB/DI × 2 4.9 7.0 PHAB/DI to WIST 5.7 4.6 WIST to PHAB/DI 2.8 2.8 WIST × 2 6.9 7.7 PHAB/DI × 2 6.4 6.5 PHAB/DI to WIST 6.8 6.7 WIST to PHAB/DI 2.4 1.0 WIST × 2 9.1 10.1 PHAB/DI × 2 7.3 10.6 PHAB/DI to WIST 14.5 9.4 WIST to PHAB/DI 16.0 6.5 WIST × 2 15.8 6.5 PHAB/DI × 2 18.2 5.9 PHAB/DI to WIST 7.7 7.0 WIST to PHAB/DI 5.1 2.0 WIST × 2 10.2 7.7

6.7

6.5

20.3

7.9

SD

4.6

6.0

6.1

5.3

SD

WIST × 2 12.8 12.1 PHAB/DI × 2 9.4 9.2 PHAB/DI to WIST 18.0 5.6 WIST to PHAB/DI 14.9 6.1 WIST × 2 15.5 6.8 PHAB/DI × 2 14.9 7.9 PHAB/DI to WIST 27.8 10.8 WIST to PHAB/DI 18.4 11.2 WIST × 2 21.5 13.8 PHAB/DI × 2 17.4 13.7 PHAB/DI to WIST 25.0 2.0 WIST to PHAB/DI 21.5 4.8 WIST × 2 22.1 5.8 PHAB/DI × 2 22.1 6.0 PHAB/DI to WIST 17.5 8.4 WIST to PHAB/DI 13.4 7.7 WIST × 2 17.4 9.3

Effect Size

SD 0.50 0.17

11.2

6.7

1.11 0.58 0.64 0.51

13.2

10.3

1.38 0.48 0.69 0.35

21.1

6.1

0.96 0.07 0.17 0.17 0.39

14.5

7.1

−0.15 0.35


Goldman-Fristoe-Woodcock nonwords


WRAT-3/WRAT-R reading


WRAT-R/WRAT-3 Reading (Jastak & Wilkinson, 1984; Wilkinson, 1993)

WIST DD 61.7 PHON 68.6 VNS 74.0 WRMT-R word identification (Woodcock, 1987)

13.5

34.5

12.8

9.4 22.2 13.9 3.8

23.4

4.5

3.0 6.0 3.7 DD 6.7 11.0 9.9

70.2 PHON 72.5 VNS 78.3

9.0 6.4 7.1

WIST DD 64.5 PHON 75.7 VNS 79.5

9.3 11.3 10.9

DD 12.0 12.1

PHAB/DI × 2 14.2 6.7 PHAB/DI to WIST 47.1 10.3 WIST to PHAB/DI 42.8 8.8 WIST × 2 48.3 17.6 PHAB/DI × 2 46.5 10.4 PHAB/DI to WIST 27.8 3.6 WIST to PHAB/DI 26.4 2.8 WIST × 2 28.5 5.4 PHAB/DI × 2 26.4 4.4 PHAB/DI DD 74.9 8.9 PHON 72.8 8.2 VNS 80.2 8.8

63.8 PHON 72.2

13.3 9.1

PHAB/DI DD 65.9 PHON 68.8

−0.04 42.3

13.5

0.40 0.05 0.39 0.35

25.8

4.4

0.50 0.17 0.55 0.14

DD 69.5 PHON 74.0 VNS 77.9

11.0

0.54

8.3

−0.15

6.8

0.30 0.23*

10.6

−0.46

9.5

0.19

9.8

0.19 −0.08* DD

12.2 11.5

65.5 PHON 74.5

16.6

0.03

7.9

−0.59



4.6


WRMT-R word identification

PHAB/DI × 2 5.9 PHAB/DI to WIST 33.3 WIST to PHAB/DI 27.1 WIST × 2 34.9 PHAB/DI × 2 33.8 PHAB/DI to WIST 22.4 WIST to PHAB/DI 21.4 WIST × 2 23.6 PHAB/DI × 2 22.7 PHAB/DI DD 68.5 PHON 68.5 VNS 73.2

Dependent Measures

Pre-test Control

Post-test Treatment

Post-test Control

X

X

X

X


PHAB/DI DD 62.9 PHON 62.7 VNS 73.3

SD 9.3

VNS 70.7

SD 7.3


9.8 10.8 13.1

DD 4.6 4.3 3.9

33.2 PHON 34.5 VNS 38.8

5.4 5.1 4.4

4.8 3.6

DD

10.3 10.8

PHAB/DI DD 39.1 PHON 40.3 VNS 43.8 WIST DD 34.3 PHON 39.3 VNS 41.2

3.6

11.1

VNS 75.9

66.5 PHON 66.5 VNS 69.0

12.7 12.6 10.2


SD 10.3

VNS 76.1

Effect Size

SD 6.6

−0.02 −0.19*

10.4

−0.53

15.3

−0.42

12.1

−0.25 −0.40* DD

5.4 3.6 5.3

34.9 PHON 36.6 VNS 41.0

6.4

0.71

4.5

0.91

4.2

0.59 0.74*

4.1

−0.11

6.8

0.48

5.2

0.04 0.14* DD

11.1 5.5 9.0

68.0 PHON 66.2 VNS 78.0

14.7

0.29

10.0

1.56

9.2

0.67 0.84*



WRMT-R word attack (Woodcock, 1987)

(Continued )

Pre-test Treatment

VNS 69.8

Goldman-Fristoe-Woodcock reading of symbols (Goldman, Fristoe & Woodcock, 1974)

Goldman-Fristoe-Woodcock sound analysis (Goldman, Fristoe & Woodcock, 1974)

Goldman-Fristoe-Woodcock sound blending (Goldman, Fristoe & Woodcock, 1974)


13.7 13.9 10.9

DD 5.6 6.5 2.9

14.3 PHON 18.6 VNS 23.0

7.4 2.4 3.1


6.2 4.0 4.1

DD 4.7 4.1 6.3

5.3 3.9 8.5


5.0 PHON 4.3 VNS 14.1

4.6 3.0 7.6


13.4

−0.43

14.2

0.68

10.3

−0.01 0.08* DD

6.9 1.6 3.5

18.1 PHON 17.2 VNS 21.7

6.7

0.40

9.4

1.62

3.9

1.00 1.01*

7.0

−0.12

8.3

0.50

3.9

0.44 0.27* DD

6.3 4.9 7.8

7.4 PHON 8.9 VNS 20.4

7.3

0.79

6.6

0.56

5.9

−0.09 0.45*

6.9

0.49

5.3

0.10

8.8

−0.25 0.11*






Dependent Measures Transfer to real words – test of transfer

Pre-test Control

Post-test Treatment

Post-test Control

X

X

X

X


PHAB/DI DD 25.0

SD

SD

DD 63.9 69.9 88.5

62.4 PHON 63.3 VNS 100.8

88.2 53.5 48.2

WIST DD 105.7 PHON 200.6 VNS 211.0

67.4 94.6 93.7

DD 11.1 12.2 17.9

10.4 PHON 4.9 VNS 8.1

22.0 6.4 10.0


9.3 17.9 17.7

DD 4.4


21.3

7.8

PHAB/DI DD 33.1

SD

Effect Size

SD

DD 73.6 62.6 94.5

73.4 PHON 84.3 VNS 116.8

90.3

0.59

63.0

2.17

45.4

1.03 1.26*

76.4

0.39

94.7

1.48

90.3

1.39 1.09* DD

17.8 20.0 20.9

11.3 PHON 7.5 VNS 19.0

25.2

0.17

10.4

1.59

22.3

0.49 0.75*

19.1

0.60

25.8

1.98

21.9

1.14 1.24* DD

2.6

26.3

4.0

2.06



Trained content – letter sounds

(Continued )

Pre-test Treatment

WIST DD 57.1 PHON 83.8 VNS 143.4 Transfer to real words – challenge words



WIST DD 23.8 PHON 20.3 VNS 28.8 Trained content – letter-cluster sounds


Goldman-Fristoe-Woodcock keywords (Goldman, Fristoe & Woodcock, 1974)


5.8

PHON 24.8 VNS 28.6

6.1 5.1


6.2 6.5 5.0

DD 4.1 4.4 5.3

6.9 PHON 7.3 VNS 12.1

5.2 4.1 3.9

3.6 6.2

DD

24.5 29.7

33.8


4.5

31.6

PHON 34.3 VNS 35.4

46.6 PHON 59.6 VNS 74.9

35.2 28.7 24.2

PHAB/DI DD 79.3 PHON 107.7 VNS 100.7 WIST DD 86.1

1.6 1.2

PHON 27.8 VNS 31.3

3.9

2.36

2.0

2.56 2.33*

2.9

1.83

3.3

1.25

1.6

1.78 1.62* DD

5.2 5.3 6.0

7.9 PHON 9.8 VNS 10.4

5.6

0.50

4.3

1.29

3.5

1.41 1.07*

4.5

1.55

5.0

1.16

5.9

1.53 1.41* DD

26.7 10.3 14.6

33.9

55.8 PHON 72.5 VNS 88.4

34.5

0.77

26.7

1.90

18.0

0.76 1.14*

0.89


WIST DD 49.9

8.8


PHON 18.6 VNS 29.5

Dependent Measures

Pre-test Control

Post-test Treatment

Post-test Control

X

X

X

X

PHAB/DI DD 23.8 PHON 50.9 VNS 59.9 WIST DD 27.9 PHON 41.8

SD

SD

PHON 116.4 VNS 106.0

23.0 35.0

DD 4.6 4.4 6.6

5.4 PHON 4.3 VNS 18.3

7.1 3.6 3.9


4.8 9.3 10.6

DD 28.5 26.5 31.3

24.2 16.5

PHAB/DI DD 13.4 PHON: 23.7 VNS: 23.9

27.9 PHON 33.8 VNS 30.0

35.5 23.6 11.0

PHAB/DI DD 41.1 PHON 81.8 VNS 83.4 WIST DD 44.8 PHON 84.3

SD

Effect Size

SD

4.2

2.84

23.9

0.84 1.52* DD

9.6 8.4 11.5

8.5 PHON 6.7 VNS 15.2

13.4

0.43

4.9

2.56

8.0

0.89 1.29*

7.0

0.05

14.5

1.61

11.6

0.50 0.72* DD

33.4 22.1 28.7

34.4 PHON 44.4 VNS 46.7

35.8

0.19

24.6

1.60

5.5

2.15 1.31*

34.3

0.30

21.1

1.74



Transfer to nonwords – regular word inventory

(Continued )

Pre-test Treatment

PHON 76.5 VNS 78.2 Goldman-Fristoe-Woodcock non-word reading (Goldman, Fristoe & Woodcock, 1974)


Transfer to nonwords – exception word inventory


WRMT-R word attack (Woodcock, 1987)


WRMT-R non-word reading

DD 16.8 16.2 19.9

22.7 PHON 27.7 VNS 22.3

29.8 17.6 9.0


20.9 10.1 21.8

DD 11.1 10.3 10.8

66.5 PHON 66.5 VNS 69.0

12.7 12.6 10.2

13.9 10.9

DD

4.4


13.7

4.2


6.1 PHON 4.7

6.8 4.1


36.8

1.32 1.12* DD

16.9 18.9 17.4

28.9 PHON 31.8 VNS 32.3

28.0

−0.20

17.2

0.80

7.2

1.35 0.65*

25.7

0.09

10.9

1.33

21.8

1.07 0.83* DD

11.1 5.5 9.0

68.0 PHON 66.2 VNS 78.0

14.7

0.29

10.0

1.56

9.2

0.67 0.84*

13.4

−0.43

14.2

0.68

10.3

−0.01 0.08* DD

5.5 3.9

8.1 PHON 5.1

8.1

0.22

3.2

3.13



VNS 74.6

35.5


VNS 53.4

Dependent Measures

Pre-test Control

Post-test Treatment

Post-test Control

X

X

X

X

SD 5.6

Experimental oral nonword, computerized reading test

Olson, Wise, Ring and Johnson (1997) Scientific Studies of Reading

Experimental phoneme deletion Nonword Time-limited word recognition PIAT grade equivalent

SD

VNS 5.9

2.7

6.7 6.9

30.5

95.8

33.6

35.4 36.1

Syllable 37.0 Onset-rime 42.1 Whole word 39.9

18.3

VNS 17.3

6.4


4.1

Syllable 84.6 Onset-rime 78.3 Whole word 91.6

SD

42.9

18.8

22.6 16.3

2.4

N/A

2.5

N/A

24.0

N/A

25.4

N/A

32.8

N/A

32.8

N/A

5.3

N/A

5.3

N/A

Gain scores Syllable 14.9 Onset-rime 16.8 Whole word 17.1 Gain scores Syllable 10.7 Onset-rime 7.6 Whole word 9.9 Gain score 1.1 Gain score 12.5 Gain score 19.9 Gain score 5.2

VNS 12.6

5.7

−0.13

8.6

1.46

6.3

0.18 0.50*

10.1

9.7

0.41+

10.2

0.45+

2.5

8.3

0.51+

10.0

N/A

0.54+ 0.38+

8.6

N/A

0.29+

10.8

10.9

N/A

0.78 1.38*

5.3

11.6

N/A

Effect Size

SD

Gain score 0.9 Gain score 17.1 Gain score 6.0 Gain score 1.8

N/A

0.19

N/A

0.33

N/A

0.48

N/A

0.63



Experimental timed, computerized word recognition

(Continued )

Pre-test Treatment

VNS 8.9





25.3 N/A

N/A

24.5 N/A

0.64 0.30

N/A

N/A

0.40

N/A

N/A

N/A

N/A

0.51

N/A

N/A

N/A

N/A

0.29

AB

C 2.9

1.3

AB 1.9

1.9

C 9.7

1.6

6.8

2.6

4.0

3.0

2.48

B 3.4

1.7

AB

C 6.4

1.2

AB 4.0

2.9

1.04 C

12.1

5.0

4.8

3.0

1.83

B 10.7

10.8

WRMT-R (Woodcock, 1989) N/A

N/A

Real word efficiency test

N/A

N/A

B Phoneme blending

Gain score 14.9 N/A

N/A

N/A

B Torgesen, Wagner, Rashotte, Lindamood, Rose, Conway and Garvan (1999) Journal of Educational Psychology

Gain score 27.0 N/A

N/A

N/A

N/A

N/A

N/A

26.8 11.0 Word attack RCS 0.14 0.53 PASP 0.76 1.7 EP 0.28 1.3 Word identification RCS 1.6 2.0 PASP 1.2 1.6 EP 2.7 3.1 4.5 5.3

0.13

0.50

0.02 0.57 0.17

1.1

2.5

0.22 0.05 0.57

4.5

5.4

0.22 0.11


5.6 PASP 5.1

3.14



PIAT standard score ROSS monitored semester tests ROSS monitored daily reading comprehension ROSS reading analogous nonwords ROSS reading independent error-detection Phoneme segmentation

Dependent Measures



(Continued )

Pre-test Treatment

Pre-test Control

Post-test Treatment

Post-test Control

X

X

X

SD

X

7.5

7.0

0.57 PASP 1.8 EP 0.61 N/A

1.4

SD

SD

Effect Size

SD

EP Nonword efficiency test

Wise and Olson (1992) Reading and Writing Wise and Olson (1995) Annals of Dyslexia

WRMT-R basic skills cluster (Woodcock, 1986) Experimental phoneme repair Experimental phoneme manipulation Experimental rhyme Experimental phoneme segmentation Experimental word segmentation Experimental non-word reading

N/A

N/A

N/A

N/A

0.48

1.3

3.1

0.07 0.60

1.9

0.08 0.19c

N/A

57.1

17.7

54.3

28.2

76.4

19.8

65.0

25.0

0.51

42.9

37.5

35.7

35.5

72.1

30.4

40.7

38.9

0.91

31.4

36.8

22.1

31.1

69.3

30.0

20.7

31.7

1.58

10.5

0.0

13.0

21.3

6.4

8.4

0.72

11.7 6.3

6.4 2.9

12.2 6.1

24.7 13.4

5.0 2.1

10.9 5.8

1.04 0.45

Initial 12.1 Final 8.6 3.6 N/A

N/A

Initial 17.1 Final 23.6 6.4 N/A

N/A

0.73a

ROSS monitored semester tests

0.30a

ROSS monitored daily reading comprehension

0.40a


Vellutino et al. (1997) Journal of Educational Psychology Warrick, Rubin and Rowe-Walsh (1993) Annals of Dyslexia

0.48

RCS



2.4

N/A

2.5

N/A

Gain score 1.1

N/A

Gain score 0.9

N/A

0.19a

Experimental time-limited word recognition

24.0

N/A

25.4

N/A

Gain score 12.5

N/A

Gain score 17.1

N/A

0.33a

Experimental phoneme deletion

32.8

N/A

32.8

N/A

Gain score 19.9

N/A

Gain score 6.0

N/A

0.48a

LACii (Lindamood & Lindamood, 1979)

5.3

N/A

5.3

N/A

Gain score 5.2

N/A

Gain score 1.8

N/A

0.63a

Experimental non-word reading ROSS monitored daily reading tests ROSS monitored monthly tests ROSS reading independent error-detection Experimental time-limited word recognition

25.3

N/A

24.5

N/A

Gain score 27.0

N/A

Gain score 14.9

N/A

0.64a

PIAT untimed word recognition (Dunn & Markwardt, 1970)

LACii (Lindamood & Lindamood, 1979)

N/A

N/A

N/A

N/A

0.42a

N/A

N/A

N/A

N/A

0.60a

N/A

N/A

N/A

N/A

0.29a

17.3

4.6

12.7

2.4

Articulation only 36.3 18.7 Sound manipulation 37.7 18.6 Combination 39.0 20.9 Articulation only 7.0 3.2 Sound manipulation 10.0 3.5 Combination 10.1 3.4

28.5

14.5

0.47 0.56 0.59

5.1

3.0

0.61 1.51 1.56


Articulation only 18.6 14.4 Sound manipulation 21.4 15.5 Combination 22.5 15.5 Articulation only 4.5 2.7 Sound manipulation 4.7 3.0 Combination 4.9 2.8


0.51a

ROSS reading analogous nonwords



Dependent Measures Experimental phoneme deletion

a F ratio.
b t-Score.
c Chi square.
∗ ES ave.
+ ES gain (0.632).

WRAT reading (Jastak & Jastak, 1978) Experimental phoneme deletion Experimental computerized untimed non-word reading

Pre-test Control

X

X

SD

Articulation only 31.4 17.4 Sound manipulation 32.4 22.0 Combination 35.3 23.4 2.5

30.8

N/A

2.3

SD 23.5

N/A

Post-test Treatment

Post-test Control

X

X

SD

33.9

19.7

SD

Articulation only 44.7 22.8 Sound manipulation 54.4 22.2 Combination 53.6 25.7 Gain score 1.1 N/A

Gain score 1.4

Effect Size

0.51 0.98 0.87

N/A

0.38a

N/A

N/A

N/A

N/A

0.44a

N/A

N/A

N/A

N/A

0.41a


Wise, Ring, Sessions and Olson (1997) Learning Disabilities Quarterly

(Continued )

Pre-test Treatment

Dependent Measures


Brady, Fowler, Stone and Winbury (1994) Annals of Dyslexia 1-year follow-up

WRMT-R (Woodcock, 1986)

Olson, Wise, Ring and Johnson (1997) Scientific Studies of Reading 10-month follow-up

PIAT untimed word recognition (Dunn & Markwardt, 1970)

N/A

Pre-test Control X

SD

N/A

Standard scores 88.6 N/A Grade equiv. 2.3

Olson, Wise, Ring and Johnson (1997) Scientific Studies of Reading 22-month follow-up

SD

N/A

8.9

N/A

2.3

N/A

Experimental time-limited word recognition Experimental phoneme deletion

N/A N/A

25.5

N/A

Experimental non-word reading PIAT untimed word recognition (Dunn & Markwardt, 1970)

25.2 N/A Standard scores

20.5

N/A

88.6 N/A Grade equiv.

88.9

N/A

2.3

N/A

26.1

2.3 Experimental time-limited word recognition Experimental phoneme deletion Experimental non-word reading LAC II (phoneme awareness) WRAT

N/A

N/A

N/A

N/A N/A

25.5

N/A

25.2

N/A

20.5

N/A

5.2 N/A Standard scores

4.6

N/A

77.3 N/A Grade equiv.

75.8

N/A

X

SD

Word identification 27.8 19.11 Word attack 10.4 12.4 Standard scores Gain score 4.3 7.7 Grade equiv. Gain score 1.4 0.54 Gain score 23.3 10.3 Gain score 25.6 16.0 Gain score 32.8 19.6 Standard scores Gain score −0.10 14.5 Grade equiv. Gain score 2.1 0.99 Gain score 38.7 16.9 Gain score 32.7 13.7 Gain score 34.5 22.2 Gain score 4.4 4.5 Standard scores Gain score 9.7 12.3 Grade equiv.

Post-test Control X

SD

16.6

14.6

3.5 Gain score 0.9 Gain score 1.1 Gain score 20.9 Gain score 13.9 Gain score 16.6 Gain score 0.9 Gain score 1.9 Gain score 35.8 Gain score 23.6 Gain score 27.7 Gain score 3.4 Gain score 10.9

Effect Size

0.66

7.03

0.71

8.1

0.27

0.84

0.28

11.4

0.14

14.8

0.48

14.6

0.60

7.8

−0.06

0.81

0.14

13.2

0.12

15.5

0.39

17.8

0.22

4.2

0.15

7.4

−0.08


26.1

Post-test Treatment


APPENDIX D Authors/Date/Journal

Dependent Measures

2.5 Torgesen, Wagner, Rashotte, Rose, Lindamood, Conway and Garvan (1999) Journal of Educational Psychology 6–12-month follow-up

SD N/A

Pre-test Control X 2.4

WRMT-R (Woodcock, 1989) N/A

N/A

Real word efficiency test N/A

N/A

Non-word efficiency test N/A

N/A

SD N/A

Post-test Treatment X

SD

Gain score 2.5 1.2 Word attack RCS 4.9 6.6 PASP 9.8 8.1 EP 4.8 5.2 Word identification RCS 21.4 14.7 PASP 25.6 15.8 EP 20.3 13.4 Passage comprehension RCS 9.4 6.8 PASP 10.7 7.7 EP 8.4 7.1 RCS 50.5 35.8 PASP 65.8 34.9 EP 48.6 33.2 RCS 8.3 11.2 PASP 19.2 14.4 EP

Post-test Control X

Effect Size

SD

Gain score 2.5

2.8

0.85

0.00

4.0

0.40 1.16 0.43

16.6

14.4

0.33 0.60 0.27

6.8

7.1

0.37 0.53 0.23

40.8

36.8

0.27 0.70 0.22

5.9

9.2

0.24 1.13


Torgesen, Wagner, Rashotte, Rose, Lindamood, Conway and Garvan (1999) Journal of Educational Psychology 6–12-month follow-up

(Continued )



7.5

7.3

0.19

3.1

3.3

3.8

3.4

0.46

4.8

0.36

RCS N/A

N/A

2.2

3.5

0.27

PASP EP Torgesen, Wagner, Rashotte, Rose, Lindamood, Conway and Garvan (1999) Journal of Educational Psychology 18–24-month follow-up

Torgesen, Wagner, Rashotte, Rose, Lindamood, Conway and Garvan (1999) Journal of Educational Psychology 18–24-month follow-up

3.7 Word attack

WRMT-R (Woodcock, 1989)

RCS N/A

N/A

Real word efficiency test N/A

N/A

Non-word efficiency test N/A

N/A

12.2 10.6 PASP 21.3 11.1 EP 12.0 8.8 Word identification RCS 39.6 16.0 PASP 47.9 16.8 EP 40.9 14.4 Passage comprehension RCS 18.9 9.6 PASP 23.2 9.6 EP 19.5 9.0 RCS 51.1

8.7

0.19 1.10 0.18

37.2

17.2

0.14 0.63 0.23

19.3

9.1

−0.04 0.42 0.02

103.9

51.5

0.11

43.0

0.72

43.1

0.27

22.9

23.2

22.4

0.00

20.0

0.94

20.8

0.14


109.7 PASP 137.9 EP 116.5 RCS 23.1 PASP 43.2 EP 26.3

10.4


GORT-III comprehension


Dependent Measures

(Continued )

Pre-test Treatment

Pre-test Control

X

X

SD

X

SD

Post-test Control X

Effect Size

SD

RCS N/A

Warrick, Rubin and Rowe-Walsh (1993) Annals of Dyslexia 1-year follow-up

Post-test Treatment

N/A

Experimental phoneme repair

N/A

N/A

8.1 PASP 10.9 EP 11.0 87.3

Experimental phoneme manipulation Experimental rhyme Experimental phoneme segmentation Experimental real-word reading Experimental non-word reading

N/A N/A N/A N/A N/A

N/A N/A N/A N/A N/A

90.0 87.3 36.2 17.1 4.0

5.4

8.8

6.6

7.2

−0.12 0.30

7.6 16.2

86.9

11.1

0.31 0.03

11.8 11.0 40.1 12.9 4.9

72.3 56.9 10.6 4.3 0.38

31.1 41.1 6.9 5.3 0.65

0.82 1.16 1.07 1.40 1.30


GORT-III comprehension

SD


APPENDIX E
Instructional Components

1. Phonological analysis
2. Segmentation of sentences into phrases
3. Segmentation of phrases into words
4. Segmentation of syllables into morphemes
5. Segmentation of syllables into phonemes
6. Segmentation of words into affixes
7. Segmentation of words into onset-rimes
8. Segmentation of words into syllables
9. Segmentation of words into phonemes/subsyllabic units
10. Word segmentation
11. Articulatory awareness activities (showing how words are produced by the mouth)
12. Assistance given with reading of text
13. Basal reading series
14. Answering comprehension questions
15. Comprehension strategies
16. Oral language comprehension
17. Partner reading with retell
18. Summarizing text
19. Computer program asks questions
20. Computer program demonstrates
21. Computer program monitors practice
22. Computer program presents new material
23. Computer program provides feedback on student performance
24. Conduct probes of learning (intermittent test)
25. Cumulative review
26. Decoding/word attack
27. Diagram or pictorial presentation
28. Discussing stories and their meaning
29. Explicit instruction/activities
30. Fading of prompts or cues
31. Feedback – articulatory
32. Feedback – orthographic
33. Feedback – computerized speech
34. Feedback given at the level of the phoneme
35. Feedback given at the level of the onset-rime

36. Feedback given at the level of the syllable/subsyllable
37. Feedback given at the level of the whole word
38. Fluency-building exercises
39. Homework
40. Independent practice
41. Individually paced
42. Instruction individually
43. Instruction in small group (2–5)
44. Instruction in large group (>5)
45. Instruction in regular classroom
46. Instruction in alternative setting
47. Instructional games
48. Instructional “rules”
49. Intervention introduced by computer program
50. Intervention introduced by regular teacher
51. Intervention introduced by other personnel
52. Learning centers
53. Letter names
54. Level of difficulty applied to each student
55. Listening to stories
56. Literature-rich environment
57. Mastery criteria
58. Mathematical skills instruction
59. Modeling from computer program
60. Modeling from peers
61. Modeling from teachers or trainers
62. Multisensory teaching methods
63. Matching of letter symbols to pronounced words
64. Orthographic awareness
65. Orthographic coding – letter
66. Orthographic coding – letter-cluster
67. Orthographic coding – whole word
68. Orthographic code-name code correspondence
69. Oral spelling
70. Spelling instruction/practice
71. Spelling patterns
72. Overlearning of skills
73. Peer tutoring


74. Alphabetic principle
75. Letter-sound correspondence (sound-symbol relationships)
76. Letter-cluster-sound correspondence (sound-symbol relationships)
77. Manipulation of letters
78. Manipulation of mouth pictures (articulatory gestures)
79. Manipulation of onset-rimes
80. Manipulation of phonemes (deletion, addition, substitution, repetition, etc.)
81. Manipulation of syllables
82. Phonological awareness
83. Phoneme counting
84. Phoneme identification
85. Phonemic awareness
86. Phonics instruction
87. Rhyme
88. Syllable awareness (clapping, counting, categorizing)
89. Syllable identification
90. Word family training
91. Positive and negative examples given of concept
92. Practice of material
93. Predicting story outcomes
94. Pre-training or prerequisite skill building prior to intervention
95. Choral reading (echo reading)
96. Guided reading (small group with like ability)
97. Independent reading
98. Oral reading
99. Reading connected text
100. Reading decodable text (controlled vocabulary)
101. Reading practice – nonwords
102. Reading practice – real words
103. Reading stories
104. Repeated reading
105. Shared reading (all students read at same time, same pacing, on same text)
106. Silent reading
107. Reminder/encouragement to use certain strategies or procedures
108. Review of material
109. Reward and reinforcers
110. Root words (affixes, prefixes, suffixes)
111. Scripted lessons
112. Self-help skills instruction

113. Self-monitoring/error detection
114. Sight word instruction/practice
115. Spelling-sound connections
116. Structural analysis training
117. Student able to choose between reading or being read to
118. Student able to choose stories of interest
119. Student acts as tutor or trainer
120. Student asks questions
121. Student repeats computer program action or response
122. Student repeats teacher response or action
123. Study skills instruction
124. Blending sounds
125. Onset-rime blending
126. Phoneme blending
127. Subsyllabic unit blending
128. Syllable blending
129. Systematic training (complex tasks taught after simpler ones)
130. Teacher (or trainer) and student talk back and forth
131. Teacher (or trainer) asks questions
132. Teacher (or trainer) demonstrates
133. Teacher (or trainer) directs student to attend to material
134. Teacher (or trainer) monitored practice
135. Teacher (or trainer) points to material
136. Teacher (or trainer) presents benefits of instruction
137. Teacher (or trainer) presents new material
138. Teacher (or trainer) provides feedback on student performance
139. Teacher (or trainer) provides only necessary assistance
140. Teacher (or trainer) provides support and encouragement
141. Teacher (or trainer) states learning objectives
142. Teaching to generalize skills (promote carryover)
143. Teacher-directed instruction
144. Teacher-guided discovery instruction
145. Thematic instruction
146. Use of manipulatives
147. Use of mirrors
148. Use of reference charts (vowel charts, word charts)
149. Use of visual aids


150. Using media for instruction/practice
151. Use of strategies or mnemonic devices
152. Vocabulary instruction/practice
153. Weekly review
154. Whole language/meaning-centered activities
155. Writing activities

APPENDIX F Author/Date/Journal

Reading Intervention Components for Treatment Group

Reading Intervention Components for Control Group


Structural Analysis Group: 6, 8, 9, 26, 29, 42, 51, 61, 64, 69, 70, 74, 75, 76, 80, 82, 85, 86, 98, 99, 100, 102, 103, 104, 107, 110, 111, 115, 116, 122, 124, 128, 132, 133, 135, 137, 138, 143, 146, 149


Subword: 12, 37, 42, 51, 61, 75, 76, 92, 98, 99, 102, 103, 105, 115, 122, 129, 132, 135, 137, 138, 143

Study Skills Group: 8, 9, 26, 29, 32, 37, 42, 51, 61, 64, 68, 69, 70, 74, 75, 76, 80, 82, 85, 86, 98, 99, 100, 102, 103, 104, 111, 115, 122, 123, 124, 126, 128, 132, 133, 135, 137, 138, 143, 146, 149, 155

Whole Word: 12, 29, 32, 37, 41, 42, 51, 61, 67, 68, 69, 70, 92, 98, 99, 102, 103, 105, 108, 122, 129, 132, 134, 135, 137, 138, 143

Brady, Fowler, Stone and Winbury (1994) Annals of Dyslexia

Foorman, Francis, Fletcher, Schatschneider and Mehta (1998) Journal of Educational Psychology

Combination: 12, 29, 32, 37, 41, 42, 51, 61, 67, 68, 69, 70, 75, 76, 92, 98, 99, 102, 103, 105, 108, 115, 122, 129, 132, 134, 135, 137, 138, 143

1, 2, 3, 4, 5, 8, 9, 11, 27, 31, 47, 50, 55, 61, 62, 80, 82, 84, 85, 87, 89, 108, 129, 131, 133, 137, 147, 149

Direct-Code (DC): 13, 16, 42, 43, 44, 45, 48, 50, 56, 70, 75, 85, 86, 92, 100, 103, 115, 124, 137, 143, 155


Embedded-Code (EC): 15, 42, 43, 44, 45, 50, 56, 71, 80, 85, 92, 95, 96, 97, 99, 100, 102, 105, 108, 137, 143, 146, 155

Implicit-Code/Research Implemented Curriculum (IC-R): 28, 42, 43, 44, 45, 50, 52, 56, 70, 74, 86, 87, 96, 100, 105, 137, 144, 151, 154, 155

More LS Group: 9, 13, 26, 40, 42, 44, 45, 48, 50, 52, 71, 75, 90, 108, 124, 126, 131, 137, 153, 155

Foster, Erickson, Foster, Brinkman and Torgesen (1994) The Journal of Research and Development in Education

Fuchs, Fuchs, Mathes and Simmons (1997) American Educational Research Journal

Hart, Berninger and Abbott (1997) School Psychology Review

Synthetic Phonics: 44, 46, 51, 53, 62, 70, 75, 86, 102, 108, 124, 126, 129, 137, 155

Analytic Phonics: 44, 46, 47, 51, 77, 95, 96, 97, 100, 105, 108, 111, 125, 137, 152, 155

Experiment #1: 19, 20, 21, 22, 23, 38, 40, 41, 42, 46, 49, 57, 80, 83, 84, 85, 87, 91, 92, 94, 109, 124, 126, 129, 150

Experiment #2: 19, 20, 21, 22, 23, 38, 40, 41, 42, 46, 49, 57, 80, 83, 84, 85, 87, 91, 92, 94, 109, 124, 126, 129, 150

12, 13, 14, 15, 17, 18, 28, 37, 38, 42, 45, 51, 54, 55, 60, 73, 92, 93, 98, 99, 103, 104, 106, 108, 109, 113, 119, 120, 122, 143

24, 37, 42, 46, 51, 54, 57, 61, 75, 76, 84, 86, 90, 92, 98, 102, 108, 114, 122, 132, 133, 134, 135, 137, 138, 148, 149

Implicit-Code/Standard District Curriculum (IC-S): 28, 42, 43, 44, 45, 50, 52, 56, 70, 74, 86, 87, 96, 100, 105, 137, 144, 151, 154, 155

Less LS Group: 7, 8, 9, 13, 44, 45, 50, 61, 70, 84, 89, 103, 113, 115, 121, 130, 131, 132, 137, 138, 145, 149, 150, 152, 154, 155

Sight Word: 26, 27, 39, 44, 46, 47, 51, 52, 57, 70, 102, 103, 114, 137, 49, 152, 153, 155

45, 50

45, 50

13, 28, 44, 45, 50, 106, 137, 143

8, 9, 24, 37, 42, 46, 47, 51, 54, 57, 61, 65, 66, 67, 68, 69, 70, 75, 76, 80, 81, 84, 86, 88, 89, 90, 92, 98, 102, 108, 114, 122, 131, 132, 133, 134, 135, 137, 138, 148, 149, 155


Foorman, Francis, Winikates, Mehta, Schatschneider and Fletcher (1997) Scientific Studies of Reading

50, 154



Torgesen, Morgan and Davies (1992) Journal of Educational Psychology

Torgesen, Wagner, Rashotte, Rose, Lindamood, Conway and Garvan (1999) Journal of Educational Psychology

Vellutino, Scanlon, Sipay, Small, Pratt, Chen and Denckla (1996) Journal of Educational Psychology

Warrick, Rubin and Rowe-Walsh (1993) Annals of Dyslexia

Wise and Olson (1992) Reading and Writing

PHAB/DI × 2: 1, 7, 9, 10, 25, 26, 43, 46, 51, 57, 72, 75, 76, 87, 92, 98, 102, 108, 124, 125, 126, 129, 137, 142, 143, 149

WIST × 2: 7, 26, 29, 43, 46, 51, 61, 71, 87, 92, 94, 110, 111, 113, 114, 115, 128, 129, 132, 137, 143, 148, 151

PHAB/DI to WIST: 1, 7, 9, 10, 25, 26, 29, 43, 46, 51, 57, 61, 71, 72, 75, 76, 87, 92, 94, 98, 102, 108, 110, 111, 113, 114, 115, 124, 125, 126, 128, 129, 132, 137, 142, 143, 148, 149, 151

WIST to PHAB/DI: 1, 7, 9, 10, 25, 26, 29, 43, 46, 51, 57, 61, 71, 72, 75, 76, 87, 92, 94, 98, 102, 108, 110, 111, 113, 114, 115, 124, 125, 126, 128, 129, 132, 137, 142, 143, 148, 149, 151

PHAB/DI: 1, 7, 9, 10, 25, 26, 43, 46, 51, 57, 72, 75, 76, 87, 92, 98, 102, 108, 124, 125, 126, 129, 137, 142, 143, 149

WIST: 7, 26, 29, 43, 46, 51, 61, 71, 87, 92, 94, 110, 111, 113, 114, 115, 128, 129, 132, 137, 143, 148, 151

Syllable Feedback: 8, 12, 14, 19, 20, 21, 22, 23, 24, 26, 32, 33, 36, 37, 40, 41, 42, 49, 51, 54, 59, 94, 97, 98, 102, 103, 106, 113, 117, 118, 120, 134, 140, 150

Onset-Rime Feedback: 7, 12, 14, 19, 20, 21, 22, 23, 24, 26, 32, 33, 35, 37, 40, 41, 42, 49, 51, 54, 94, 97, 98, 102, 103, 106, 113, 117, 118, 120, 124, 125, 134, 140, 150

Whole Word Feedback: 12, 14, 19, 20, 21, 22, 23, 24, 26, 32, 33, 37, 40, 41, 42, 49, 51, 54, 94, 98, 102, 103, 106, 113, 117, 118, 120, 134, 140, 150

AB Group: 1, 9, 25, 43, 46, 47, 51, 57, 82, 84, 87, 94, 108, 124, 126, 137

B Group: 25, 30, 43, 46, 47, 51, 57, 82, 87, 94, 108, 124, 126, 137

PASP: 11, 26, 27, 28, 29, 38, 42, 46, 51, 57, 75, 82, 92, 98, 99, 100, 102, 103, 107, 114, 118, 137, 143, 144, 146, 149, 154, 155

EP: 13, 18, 26, 27, 28, 29, 42, 46, 47, 51, 75, 82, 84, 92, 98, 99, 100, 102, 103, 114, 130, 137, 143, 149, 154, 155

RCS: 26, 28, 29, 42, 46, 51, 70, 75, 82, 86, 99, 114, 137, 154, 155

26, 42, 51, 74, 75, 85, 99, 114, 137, 145, 149, 151, 154, 155

1, 8, 9, 25, 29, 44, 51, 55, 80, 84, 85, 87, 88, 94, 103, 108, 131, 137, 142, 146, 149

12, 20, 21, 23, 27, 32, 33, 34, 37, 40, 41, 42, 46, 49, 59, 67, 70, 92, 94, 97, 103, 108, 109, 115, 121, 131, 139, 144, 149, 150

CSS to MATH: 43, 46, 47, 51, 112, 123, 137, 142, 143, 146, 150, 151

CSS: 43, 46, 51, 112, 123, 137, 142, 151

Normal Reading Instruction: 46, 50, 70, 99, 137, 155

C Group: 14, 28, 43, 46, 51, 55, 103, 137, 154

NTC: Not stated

13, 43


Lovett, Lacerenza, Borden, Frijters, Steinbach and De Palma (2000) Journal of Educational Psychology

Not stated


12, 20, 21, 22, 23, 27, 32, 33, 37, 40, 41, 42, 46, 49, 59, 67, 70, 92, 94, 97, 103, 108, 109, 115, 121, 131, 139, 144, 149, 150



1, 7, 8, 10, 11, 12, 14, 20, 21, 22, 23, 24, 26, 27, 32, 33, 34, 35, 36, 37, 40, 41, 42, 43, 46, 49, 51, 53, 54, 57, 59, 62, 63, 70, 71, 75, 77, 78, 80, 82, 92, 97, 101, 103, 107, 113, 118, 124, 129, 131, 134, 137, 138, 139, 144, 146, 147, 149, 150

Articulation-Only: 1, 11, 23, 26, 27, 29, 32, 33, 35, 36, 37, 40, 42, 43, 46, 47, 48, 51, 57, 62, 63, 75, 76, 82, 86, 97, 98, 103, 106, 107, 108, 109, 113, 118, 120, 124, 125, 128, 131, 134, 136, 137, 138, 139, 141, 144, 147, 148, 149, 150, 151

Sound Manipulation: 1, 7, 8, 21, 22, 23, 26, 27, 29, 32, 33, 35, 36, 37, 40, 41, 42, 43, 46, 47, 48, 49, 51, 54, 57, 59, 63, 70, 75, 76, 77, 79, 80, 81, 82, 84, 85, 86, 87, 88, 92, 97, 98, 101, 102, 103, 106, 107, 108, 109, 113, 118, 120, 121, 124, 128, 129, 131, 134, 136, 137, 138, 139, 141, 144, 146, 148, 149, 150, 151

Combination: 1, 7, 8, 11, 12, 19, 20, 21, 22, 23, 26, 27, 29, 30, 31, 32, 33, 34, 35, 36, 37, 40, 41, 42, 43, 46, 48, 49, 51, 54, 57, 59, 62, 63, 70, 75, 76, 77, 78, 80, 82, 85, 86, 92, 97, 98, 101, 102, 103, 106, 107, 108, 109, 113, 115, 118, 120, 121, 124, 125, 128, 129, 131, 134, 136, 137, 138, 139, 141, 144, 147, 148, 149, 150, 151, 153

Articulatory Phonological Awareness: 1, 11, 20, 21, 22, 23, 27, 31, 32, 33, 34, 40, 41, 42, 43, 48, 49, 54, 57, 59, 62, 63, 70, 71, 75, 77, 78, 80, 86, 92, 97, 101, 103, 107, 108, 109, 113, 115, 120, 121, 131, 134, 136, 137, 138, 139, 141, 144, 147, 148, 150, 151, 153

7, 8, 10, 12, 14, 15, 18, 20, 21, 22, 23, 24, 26, 28, 32, 33, 35, 36, 37, 40, 41, 42, 43, 44, 46, 49, 51, 59, 61, 92, 93, 97, 98, 103, 107, 113, 118, 119, 120, 124, 125, 126, 128, 131, 134, 135, 137, 142, 144, 149, 150, 151 45, 50



Nonarticulatory Phonological Awareness: 1, 20, 21, 22, 32, 33, 34, 40, 41, 42, 43, 47, 48, 49, 54, 57, 59, 63, 70, 71, 75, 77, 79, 80, 81, 86, 87, 92, 97, 101, 103, 107, 108, 109, 113, 115, 120, 121, 131, 134, 137, 138, 139, 141, 143, 146, 148, 149, 150, 151, 153, 155


Author/Date/Journal



APPENDIX G

Twenty Cluster Component Category Groups, with Component Numbers (as Listed in Appendix E)

1. Articulatory awareness activities (11, 78)
2. Phonological skills (1–10, 26, 74–90, 124–128)
3. Questioning/summarizing (14–18)
4. Orthographic skills (63–71)
5. Reading activities (95–106)
6. Metacognitive/strategy instruction (48, 107, 113, 151)
7. Technology (19–23, 33, 49, 59)
8. Explicit instruction/practice (25, 29, 38, 72, 92, 108, 134, 153)
9. Teacher/student interchange (120, 130, 131)
10. One-to-one instruction (40–42, 73)
11. Small group instruction (43)
12. Large group instruction (44)
13. Regular classroom instruction (45)
14. Alternative instructional setting (46)
15. Regular classroom teacher-led instruction (50)
16. Non-regular classroom teacher-led instruction (51)
17. Computer-mediated instruction (49)
18. Skill-modeling (59–61, 121, 122, 132)
19. Control difficulty (12, 24, 30, 54, 91, 129, 139)
20. Advanced organizers (94, 133, 136, 141)

CURRENT ADVANCES IN ASSESSMENT AND INTERVENTION FOR CHILDREN WITH LEARNING DISABILITIES

Jack A. Naglieri

Identification and Assessment Advances in Learning and Behavioral Disabilities, Volume 16, 163–190 Copyright © 2003 by Elsevier Science Ltd. All rights of reproduction in any form reserved ISSN: 0735-004x/doi:10.1016/S0735-004X(03)16005-3

INTRODUCTION

The chapter begins by presenting a case study of a 4th grade student who has been referred by his teacher for an evaluation. However, before this case can be completely understood, it is necessary to understand the limitations associated with the general intelligence approach to assessment. The chapter provides an overview of these limitations and suggests using a processing-based approach instead of a general intelligence approach. The second section outlines the Planning, Attention, Simultaneous, and Successive (PASS) theory and approach toward assessment, which is supported by neuropsychological research. The final section returns to the case study and demonstrates how the information gathered using the PASS theory and Cognitive Assessment System (CAS) can be used to guide interventions for various learning disabilities.

The Case of Louis

Louis is a sociable and active 4th grade student who is popular with his classmates, likes his teachers, and seems to fit in well at school. In general, Louis works hard


in class and turns in all his work; however, his grades do not reflect the effort he puts in. As a result, Louis does not like school or schoolwork very much, and is getting more and more discouraged. Louis’ teacher noticed that he has difficulty following directions that are not written down. Louis’ biggest problem, however, is with reading and spelling; he has poor word analysis skills and struggles to sound out new words. Louis’ teacher initiated an evaluation, and several tests were administered, among them an ability test and an achievement test. On the ability test, he earned a Verbal IQ score of 92 and a Performance score of 108. Both of these scores are within the average range and consistent with those of his agemates. In contrast, Louis earned a score of 78 on a test of basic reading, 85 on reading comprehension, and 82 in spelling, which are below average scores compared to peers his age. Based on Louis’ test scores, it is apparent that he has a discrepancy between his IQ and achievement scores in reading and writing. These findings, along with the observations of Louis’ teachers, suggest that Louis may have a learning disability.

Although Louis’ performance on ability and achievement tests suggests that he ultimately could be identified as a child with a learning disability, the ability/achievement discrepancy finding provides limited information about the possible reasons for the problems he is experiencing. Additionally, while the discrepancy may help qualify a child for services, it yields little information that is useful for the development of interventions to help the child with the reading problem. Later in this chapter, additional information will be provided about Louis that helps us understand the nature of his cognitive characteristics and how additional information can be useful for diagnostic and intervention purposes. However, before this information is provided, a discussion of current intelligence testing technology and alternatives to these traditional methods will be presented.
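The arithmetic behind an ability/achievement discrepancy can be sketched as follows. This is only an illustration under stated assumptions: the case description reports Verbal and Performance scores but no Full Scale IQ, so the midpoint of the two is used here as a stand-in ability estimate, and the 15-point cutoff (one standard deviation) is a hypothetical criterion that does not come from this chapter.

```python
# Illustrative sketch only: a simple-difference discrepancy check.
# ASSUMPTIONS (not from the chapter): the ability estimate is the midpoint of
# the Verbal and Performance scores, and the cutoff is a hypothetical 15 points.

VERBAL_IQ = 92
PERFORMANCE_IQ = 108
ACHIEVEMENT = {"basic reading": 78, "reading comprehension": 85, "spelling": 82}
HYPOTHETICAL_CUTOFF = 15


def discrepancies(ability, achievement):
    """Return ability-minus-achievement differences for each achievement area."""
    return {area: ability - score for area, score in achievement.items()}


ability_estimate = (VERBAL_IQ + PERFORMANCE_IQ) / 2  # stand-in, see assumptions
gaps = discrepancies(ability_estimate, ACHIEVEMENT)

for area, gap in gaps.items():
    flag = "meets" if gap >= HYPOTHETICAL_CUTOFF else "does not meet"
    print(f"{area}: gap of {gap:.0f} points {flag} the hypothetical cutoff")
```

Whatever cutoff is chosen, the result is only a set of difference scores; as noted above, such numbers say little about why the reading problem exists or how to intervene.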

Traditional IQ Tests

For the past 50 years the general intelligence approach, defined by the Wechsler scales, has dominated the field of intellectual assessment (Wilson & Reschly, 1996). As a result, most professionals in education and psychology readily accept that there are two types of intelligence: verbal and non-verbal. It is important to consider, however, that the Wechsler approach to measuring intelligence represents a tradition in psychological assessment that began in 1939, with the publication of the Wechsler-Bellevue Scales, which were developed based on


methods used by the U.S. military in the early 1900s (Yoakum & Yerkes, 1920). Thus, the Wechsler scales represent the predominant pre-World War I notions of how to assess intelligence. Moreover, Wechsler’s view of intelligence was not that verbal and non-verbal were two types of intelligence, but rather that non-verbal tests helped to “minimize the over-diagnosing of feeble-mindedness that was, he believed, caused by intelligence tests that were too verbal in content . . . and he viewed verbal and performance tests as equally valid measures of intelligence and criticized the labeling of performance [non-verbal] tests as measures of special abilities” (p. 396; Boake, 2002).

The general intelligence approach served to initiate a major contribution made by the field of psychology to society, but the continued reliance on this model over the last century must make one stop and wonder just how well the technology works. Many have begun to ask how effective the general intelligence approach is, and indeed to wonder about the limitations of this approach (Das, Naglieri & Kirby, 1994; Naglieri, 1999; Sternberg, 1988). The verbal/non-verbal approach to conceptualizing intelligence has considerable limitations, especially for culturally and linguistically diverse populations, those with limited English language skills, and children who are experiencing academic problems, such as a learning disability (Naglieri, 2000). The limited utility of the verbal/non-verbal model for evaluation of specific intellectual problems associated with learning disabled (LD) children’s academic failure has led some to argue that intelligence tests are irrelevant to the diagnosis of learning disabilities (Siegel, 1989). In fact, after careful review of the research, Kaufman and Lichtenberger (2000) concluded that WISC-III subtest profiles “do not have adequate power on which to base differential diagnosis” (p. 205) for LD or Attention Deficit/Hyperactivity Disorder (ADHD). 
This should not be a surprise to anyone who reflects on the developmental history of the Wechsler scales and recognizes that the test was not built to identify LD or ADHD children (the concepts were not yet developed). Instead, it should be recognized that it is unreasonable to expect a verbal/non-verbal model, used to measure general intelligence, to show sensitivity to the cognitive problems these children experience. Nevertheless, it is consistent with the research to conclude that scores on a verbal/non-verbal test of intelligence have not been especially helpful for diagnosis of LD or ADHD (Kaufman & Lichtenberger, 2000; Kavale & Forness, 1984). Some authors who have noted the limitations of a general intelligence model have embraced alternative perspectives (Das, Naglieri & Kirby, 1994; Kaufman & Kaufman, 1983; Sternberg, 1988). The elimination of the concept of intelligence is ill advised, and instead, an examination of other modern and reconceptualized views, based heavily on important advances in psychology (especially


cognitive and neuropsychology) and which have relevance to the evaluation and instruction of children with learning problems, will be reviewed in the following sections.

Winds of Change

One of the most important developments in the field of psychology that has relevance to the evaluation and instructional planning of children with learning disabilities is the growing body of research in cognitive and neuropsychology. Perhaps one of the most important contributions of cognitive psychology is the understanding that a child’s cognitive processing competence provides a means of conceptualizing what intelligence could be. In addition, the emphasis on cognitive strategy use and planning provides a new way to conceptualize human functioning. For example, the importance of strategic behavior was amply described in the book, Plans and the Structure of Behavior by Miller, Galanter and Pribram (1960). More recently, Goldberg (2001) provided an excellent discussion of the value of strategic thinking, brain functioning, and exceptional children in his book The Executive Brain: Frontal Lobes and the Civilized Mind. Miller et al. and Goldberg emphasize the importance of strategic thinking on the part of the child or adult and the relationships between such thinking and specific neuropsychological constructs, as well as success or failure in a wide variety of areas.

These ideas are reflected in the practical suggestions of researchers who have argued for the value of cognitive strategy instruction. Pressley and Woloshyn (1995), in their book Cognitive Strategy Instruction that Really Improves Children’s Academic Performance, describe the components of strategy use in which students are explicitly encouraged to discover and use methods of doing things, monitor their performance, generalize their use of strategies, be aware of the importance of strategies, achieve self-regulated strategy use, and become thoughtful, planful, and evaluative as they work. These instructional goals are actually teaching children a type of cognitive processing referred to as plans and strategies by Miller et al. 
(1960), frontal lobe functioning by Goldberg (2001), and planning by Naglieri (1999). There is an important connection between the strategy training instructional methods advocated by educators who have focused on the importance of being strategic, and the neuropsychological writings of those who have recognized the importance of, for example, frontal lobe functioning. The recognition that strategy use on the part of the child is closely tied to a type of intellectual cognitive process provides an important connection between the cognitive characteristics of a child and the cognitive demands of academic



tasks presented by the teacher. Naglieri and Pickering (2003) illustrate that this approach can have a positive influence on children’s academic performance and that this approach is very different from processing approaches that were tried in the late 1970s, particularly the modality based methods.

Is This the Same as ATI?

When information about a child’s cognitive characteristics is used to guide the development or selection of academic interventions, the concept of an aptitude-treatment interaction (ATI) is invoked. The essence of this approach is intuitively attractive and logical: to take individual differences in aptitude (ability) or underlying cognitive processes (a more modern term) into account when interventions or treatments are being planned (Cole, Dale, Mills & Jenkins, 1993; Snow, 1991). Snow (1991) defined aptitude or ability as “a complex of personal characteristics identified before and during treatment that accounts for a person’s end state after a particular treatment” (p. 205). That is, an interaction between aptitude and treatment is present when a child’s intellectual characteristics influence the extent to which he or she benefits from one type of intervention over another. Although the term aptitude is not limited to intelligence (it could include variables such as personality, motivation, etc.), in this chapter aptitude is defined as an intellectual (cognitive processing) attribute of a child. In this discussion, the way in which the aptitude of intelligence is defined takes on critical importance.

Practicing school psychologists have attempted to obtain information that can be used within an ATI conceptualization by evaluating information from the Wechsler Intelligence Scales. To do so, they have interpreted the Wechsler subtests, scales, and indices in many ways to extract meaning out of this test of general intelligence. Unfortunately, school psychologists have used the Wechsler scales in ways that go well beyond their capabilities because intervention design demands more information than the IQ or subtest scores provide.

Moving from IQ to Cognitive Processes

In the past 15 years, researchers have become interested in reformulating the concept of intelligence using a cognitive processing perspective. Luria is perhaps the leading cognitive and neuropsychological researcher to have influenced test developers. In fact, he is the “most frequently cited Soviet scholar in American, British, and Canadian psychology periodicals” (Solso & Hoffman, 1991, p. 251). Luria’s most influential works include Higher Cortical Functions in Man (1966a), Human Brain and Psychological Processes (1966b), The Working Brain


(1973), and Language and Cognition (1982). These, and his other works, have helped stimulate an increased awareness of the relationships between cognitive processing and human performance. Luria has influenced how intelligence is conceptualized and measured.

The Kaufman Assessment Battery for Children (K-ABC; Kaufman & Kaufman, 1983) was the first test to be influenced by Luria’s cognitive processing theory of human functioning. The K-ABC reflected the authors’ conceptualization of intelligence according to cognitive and neuropsychological perspectives, rather than the general intelligence model that had dominated the field since the early part of the last century. Kaufman and Kaufman based their view of intelligence on Luria’s theory as well as the theories of Gazzaniga (1975), Kinsbourne (1978), Jensen (1980), Neisser (1967), and Das, Kirby and Jarman (1975, 1979). The K-ABC model was based on the finding that many different theories of intelligence had two basic processes in common: Sequential and Simultaneous processes. This approach was conceptually very different from the verbal/non-verbal intelligence model used in most individual and group tests of ability. The K-ABC was, in particular, based on two very important concepts. First, verbal IQ is not intelligence, but is better conceptualized as achievement. Second, intelligence is best redefined as basic cognitive processes. Kaufman and Kaufman’s idea that IQ tests could be improved through modification and redefinition using a cognitive processing theory was, in the mid-1980s, a revolutionary concept.

The successes and limitations of the K-ABC formed the background for the development of another approach to redefine ability from a cognitive processing theory. This theory, the Planning, Attention, Simultaneous, and Successive (PASS) theory of cognitive processes (Naglieri & Das, 1997a), is based largely on the neuropsychological work of Luria (1966a, b, 1973, 1980, 1982). The PASS theory was used as the underlying framework of the Cognitive Assessment System (CAS; Naglieri & Das, 1997a). The CAS uses a theory-based view of cognitive processing that puts emphasis on basic psychological processes that are related to performance, rather than a general intelligence verbal/non-verbal IQ model. The four PASS scales represent the kinds of basic psychological processes described in the Individuals with Disabilities Education Act Amendments of 1997 (IDEA’97, see Naglieri & Sullivan, 1998) that are used, for example, in the definition of a specific learning disability. The four basic psychological processes can be used: (1) to gain an understanding of how well the child thinks; (2) to discover strengths and needs of children that can then be used for effective differential diagnosis and instructional development; and (3) to select or design appropriate interventions.



THE PASS THEORY: AN ALTERNATIVE TO GENERAL INTELLIGENCE

PASS Theory

PASS cognitive processes are the basic building blocks of human intellectual functioning (Naglieri, 1999). The PASS processes form an inter-related system of cognitive processes or abilities that interact with an individual’s base of knowledge and skills. The four constructs are defined as follows: Planning is a mental activity that provides cognitive control, use of processes, knowledge and skills, intentionality, and self-regulation; Attention is a mental activity that provides focused, selective cognitive activity over time and resistance to distraction; Simultaneous is a mental activity by which the child integrates stimuli into groups; and Successive is a mental activity by which the person integrates stimuli in a specific serial order to form a chain-like progression.

Planning

This process provides the means to solve problems of varying complexity and may involve control of attention, simultaneous, and successive processes, as well as acquisition of knowledge and skills. Planning is critical to all activities where the child or adult has to determine how to solve a problem. This includes self-monitoring and impulse control as well as generation, evaluation, and execution of a plan. Planning can be measured using the CAS planning tests that require the child to develop a plan of action, evaluate the value of the method, monitor its effectiveness, revise or reject a plan to meet the demands of the task, and control the impulse to act without careful consideration. All of the CAS planning subtests require the use of strategies for efficient performance and the application of these strategies to novel tasks of relatively reduced complexity (Naglieri & Das, 1997b).

Attention

Attention is a mental process by which the person selectively focuses on particular stimuli and inhibits responses to competing stimuli. Attention is involved when there is a demand for focused, selective, sustained, and effortful activity. Focused attention involves directed concentration toward a particular activity and selective


attention is important for the inhibition of responses to distracting stimuli. Sustained attention refers to the variation of performance over time, which can be influenced by the differing amounts of effort required to solve the test. All CAS attention subtests present children with competing demands on their attention and require sustained focus.

Simultaneous Processing

Simultaneous processing is a type of mental process that gives the child the means to integrate separate stimuli into a single whole or group. An essential aspect of simultaneous processing is the need to recognize how the separate elements of a stimulus array are interrelated into a whole. For this reason, simultaneous processing tests have strong spatial aspects. The spatial aspect of simultaneous processing includes perception of stimuli as a whole. For example, simultaneous processing is involved in grammatical statements that demand the integration of words into a whole idea. This integration involves comprehension of word relationships, prepositions, and inflections so the person can obtain meaning based on the whole idea. Simultaneous processing can be measured using CAS tasks that require integration of parts into a single whole and understanding of logical and grammatical relationships. These processes vary on the basis of non-verbal and verbal content, but the essential requirement is simultaneous processing.

Successive Processing

Successive processing is a mental process by which the person works with stimuli in a specific serial order that forms a chain-like progression. Successive processing is required when a person must arrange things in a strictly defined order where each element is only related to those that precede it and these stimuli are not interrelated. This process involves both the perception of stimuli in sequence and the formation of sounds and movements in order. For this reason, successive processing is involved with activities such as phonological awareness (Das, Naglieri & Kirby, 1994) and the syntax of language. This process can be measured using the CAS successive tests, which demand use, repetition, or comprehension based on order.

PASS Processes

The four PASS processes are inter-related constructs that function as a whole, as described by Luria (1973), who wrote, “each form of conscious activity is always a complex functional system and takes place through the combined working of all three brain units, each of which makes its own contribution” (p. 99). This conception means that the four PASS processes can be



thought of as a “working constellation” (Luria, 1966b, p. 70) of cognitive activity. This means that a child may perform the same task with various contributions of the PASS processes along with the application of the child’s knowledge and skills. Although effective functioning is accomplished through the integration of all PASS processes as demanded by the particular task, not every process is equally involved in every task. For example, tests like math calculation may be heavily weighted, or influenced, by a single PASS process such as planning, while reading decoding is strongly related to successive processing. Because of the inter-related nature of the processes and their interaction with achievement based upon the particular demands of the task, a thorough understanding of a child’s competence in all these areas is important for addressing educational problems.

Description of the CAS

In order to operationalize the PASS theory, Naglieri and Das (1997a) developed the CAS following a systematic and empirically based method to obtain efficient measures of the PASS processes that could be individually administered. The PASS theory was used as the foundation of the CAS, so the content of the test was not constrained by previous approaches to intelligence. The CAS reflects the merging of the best in psychometric test development methods with a theory of intelligence redefined as cognitive processing within the context of a user-friendly practical test. Several assumptions and goals were used during the development of the CAS (see Naglieri & Das, 1997b for more details), which are as follows:

(1) a theory should precede a test of ability;
(2) a test of intelligence should be based on a sound theory;
(3) the concepts of IQ, intelligence, aptitude, ability, or any other similar terms should be replaced with the concept of cognitive processes;
(4) before being considered as the foundation for a test, a possible theory of cognitive processing should be based on a sizable research base and have been proposed, tested, modified, and shown to have several types of validity;
(5) a theory of cognitive processes should inform the user about those specific abilities that are related to academic successes and failures, have relevance to differential diagnosis, and provide guidance to the selection and/or development of effective programming for intervention;
(6) a test of cognitive processing should evaluate an individual using items that are as free from acquired knowledge as possible.


Development of CAS

Subtests for the CAS were developed specifically to operationalize the PASS theory over a period of about 25 years (summarized in three sources: Das et al., 1994; Das, Kirby & Jarman, 1979; Naglieri & Das, 1997b). The sole criterion for inclusion was each subtest’s correspondence to the theoretical framework of the PASS theory. This means that selection of subtests was not constrained by the content of traditional tests of intelligence nor was the method used one that relies on factorial approaches to the development of theories of human abilities (e.g. Carroll, 1993). Development of the CAS subtests was accomplished following a carefully prescribed sequence of item generation, experimental research, test revision, and re-examination until the instructions, items, and other dimensions were refined. Following a careful and thorough period of pilot tests, research studies, national tryouts, and national standardization, the instrument was finalized. This process allowed for the identification of subtests that provide an efficient way to measure each of the processes (Das et al., 1994; Naglieri & Das, 1997b).

The PASS Theory was used as the organizational plan for the CAS and for that reason the test’s structure includes four scales. The Planning, Attention, Simultaneous, and Successive Scale standard scores are derived from the sum of subtests included in each respective scale. Like the Full Scale score (derived from the sum of all subtests), each PASS Scale has a normative mean of 100 and a standard deviation of 15. The PASS Scales represent a child’s cognitive functioning in each of the four theoretical areas and are used in identification of specific strengths and weaknesses in cognitive processing. Information about a child’s PASS characteristics can then be used when making diagnostic as well as instructional decisions for a child.
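Because each PASS Scale and the Full Scale are reported on a metric with a mean of 100 and a standard deviation of 15, a standard score can be re-expressed as a z score and an approximate percentile rank. The sketch below assumes normally distributed scores, which is an assumption made here for illustration only; the function names are hypothetical, and in practice the published norm tables, not this formula, would be consulted.

```python
from statistics import NormalDist

NORM_MEAN = 100  # normative mean stated for the PASS and Full Scale scores
NORM_SD = 15     # normative standard deviation stated in the text


def z_score(standard_score):
    """Distance from the normative mean in standard deviation units."""
    return (standard_score - NORM_MEAN) / NORM_SD


def approx_percentile(standard_score):
    """Approximate percentile rank, ASSUMING a normal distribution of scores."""
    return 100 * NormalDist(NORM_MEAN, NORM_SD).cdf(standard_score)


for score in (85, 100, 115):
    print(f"score {score}: z = {z_score(score):+.2f}, "
          f"percentile ~ {approx_percentile(score):.0f}")
```

Expressing all four scales on the same metric is what allows a child's scores to be compared with one another when looking for relative strengths and weaknesses.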
CAS Standardization

The CAS was standardized on a large representative sample of children aged 5–17 years, who closely match the U.S. population on a number of important demographic variables. The CAS standardization sample was stratified on the basis of: Age (5 years 0 months through 17 years 11 months); Gender (Female, Male); Race (Black, White, Asian, Native American, Other); Hispanic origin (Hispanic, Non-Hispanic); Region (Midwest, Northeast, South, West); Community Setting (Urban/Suburban, Rural); Classroom Placement (Full-time Regular Classroom, Part-time Special Education Resource, Full-time Self-Contained Special Education); Educational Classification (Learning Disability, Speech/Language Impairment, Social-Emotional Disability, Mental Retardation, Giftedness, and Non-special Education); and Parental Educational Attainment Level (less than high school degree, high school graduate or equivalent, some college or technical



school, four or more years of college). For details on the representativeness of the sample see the CAS Interpretive Handbook (Naglieri & Das, 1997b). Additionally, children from both regular education and special education settings were included in their appropriate proportions. During the standardization and validity study data collection phase a total of 3,072 children were administered the CAS (2,200 for the normative sample and 872 in reliability and validity studies). Further, a portion (1,600) of the standardization sample was also administered a group of achievement tests.

Validity of PASS

Naglieri and Das (1997b) and Naglieri (1999) provide considerable information about the validity of CAS that suggests the approach may offer many advantages for professionals working to improve educational outcomes for children. In this section several important points will be covered. First, research will be summarized that suggests that different PASS profiles have been found for children with Reading Disabilities and Attention Deficit Hyperactivity Disorders (ADHD). Second, the CAS is more strongly related to achievement than similar tests (Naglieri, 1999). Third, research has found the CAS to be useful with diverse populations, and thus fairer than traditional measures of intelligence (Naglieri & Rojahn, 2001; Wasserman & Becker, 2000). Fourth, the CAS has been shown to have strong links to intervention (Naglieri, 1999). Each of these points will be more fully discussed below.

PASS Profiles

Several studies of the performance of children with ADHD and the PASS theory have now been completed. Paolitto (1999) studied matched samples of ADHD and normal children and found that the group of children with ADHD earned significantly lower scores on the Planning scale. He concluded that his results supported the view of Barkley (1997, 1998) that ADHD involves problems with behavioral inhibition and self-control, which is associated with poor executive control (e.g. planning from PASS). Paolitto also concluded “the CAS was able to successfully identify about three of every four children having ADHD” (p. 4). Similarly, Dehn (2001), Naglieri, Goldstein and Iseman (in press), and Naglieri, Salter and Edwards (2002) found that groups of children who met diagnostic criteria for ADHD earned significantly lower mean scores on measures of planning. 
Importantly, Naglieri, Goldstein and Iseman (in press) also found that children with ADHD had a different PASS profile than those with anxiety disorders and Naglieri, Salter and Edwards (2002) found that children with ADHD had a different PASS profile than those with specific reading difficulties. The averaged mean


Fig. 1. PASS Processing Scale Profiles for Students with ADHD and LD.

PASS scores across these studies are graphically presented along with a sample of children with reading disabilities (Naglieri & Das, 1997b) in Fig. 1. The figure illustrates the differences that have been found for these populations.

Relationships to Achievement

One way to test the validity of a theory like PASS is to examine the extent to which the PASS scales relate to some important outcome variable like achievement. To examine this question, Naglieri (1999) summarized several investigations involving large samples of children and several important tests of ability into one table. To that table the Naglieri Nonverbal Ability Test (NNAT, Naglieri, 1997) has been added as an additional point of reference (a traditional test of ability that does not contain verbal/achievement based subtests). Each of the data sets used to obtain these correlations was large (greater than 500) and all included children



Table 1. Relationships between Achievement and Ability as Measured by Several Intelligence Tests.

Ability Test                  N        Correlation   Variance
WISC-III                      1,284    0.59          35%
NNAT                          24,108   0.63          40%
Woodcock–Johnson cognitive    888      0.63          40%
K-ABC                         2,636    0.63          40%
CAS                           1,600    0.70          49%

from all regions of the country, who differed in racial and ethnic composition and varied on the basis of community characteristics, as well as parental educational levels. See Naglieri (1999) for details about how these data were obtained. The results are provided in Table 1. The findings of the relationships between ability, defined in a number of different ways, and achievement are quite enlightening. First, the correlation between the NNAT and Stanford Achievement Test (SAT9) scores of 0.63 (N = 24,108) is similar to the correlation of 0.59 between the WISC-III (Wechsler, 1991) Full Scale IQ and all WIAT achievement scores (Wechsler, 1992). This suggests that a 38-item progressive matrix test that is completely nonverbal (NNAT) can correlate with achievement as well as a test that contains both nonverbal and verbal content. Thus, verbal tests are not necessarily needed to predict achievement. Interestingly, the results for the seven-scale Woodcock–Johnson Revised Broad Cognitive Ability Extended Battery (0.63) are about the same as these two correlations. This suggests that the WJ-R, a cognitive test that also contains verbal achievement, but has nearly two times as many scales as the WISC-III, does not predict achievement much better and, in fact, the correlation is the same as the NNAT/SAT9. Most importantly, the correlation of 0.63 between the K-ABC (Kaufman & Kaufman, 1983) and the SAT9 suggests that a cognitively based measure of ability that does not contain verbal achievement can correlate with achievement. Similarly, the correlation between the CAS and WJ-R achievement of 0.70 shows that the PASS processes are important for predicting academic success and failure. The correlations between the various ability tests and achievement presented in Table 1 illustrate that the CAS is a powerful predictor of achievement, accounting for considerably more variance in achievement than traditional tests of intelligence.
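The Variance column in Table 1 follows from squaring each correlation, since the proportion of variance in achievement accounted for by a predictor is r squared. A minimal sketch that reproduces the column from the reported correlations; the dictionary and function names, and the rounding to whole percentages, are choices made here for illustration:

```python
# Reproduce the "Variance" column of Table 1 as the squared correlation (r**2).
# The correlations are those reported in Table 1; names are illustrative.

CORRELATIONS = {
    "WISC-III": 0.59,
    "NNAT": 0.63,
    "Woodcock-Johnson cognitive": 0.63,
    "K-ABC": 0.63,
    "CAS": 0.70,
}


def variance_explained(r):
    """Percentage of achievement variance accounted for by a correlation r."""
    return round(r ** 2 * 100)


variance_pct = {test: variance_explained(r) for test, r in CORRELATIONS.items()}

for test, pct in variance_pct.items():
    print(f"{test}: r = {CORRELATIONS[test]:.2f}, variance = {pct}%")
```

Squaring makes the gap between tests easier to see: a difference of 0.07 in correlation (0.63 versus 0.70) corresponds to about nine percentage points of variance.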
These findings in particular cast doubt on statements by McGrew, Keith, Flanagan and Vanderwood (1997) that the Gf-Gc theory used for the WJ-R is the “most useful framework for understanding cognitive functioning” (p. 1994). Instead, these data illustrate that seven Gf-Gc scales are needed to do as well as the two (Sequential and Simultaneous) K-ABC scales. Finally, these results are particularly important for two reasons. First, one of the most important dimensions of validity for a test of cognitive ability is the relationship to achievement (Brody, 1992; Cohen, Swerdlik & Smith, 1992). Second, the CAS and K-ABC, unlike the Wechsler scales, do not have subtests that are highly reliant on acquired knowledge (e.g. Arithmetic, Information, Vocabulary).

Fairness

The changing characteristics of the U.S. population have made fair assessment of children increasingly important in recent years. One way to ensure appropriate and fair assessment of diverse populations is to reduce the amount of knowledge needed to correctly answer the questions on tests of intelligence. However, it is common for traditional IQ tests to include items that measure vocabulary, general information, similarities between two words, and math word problems. It is also, of course, common to have vocabulary, information, word analogies, and math word problems on tests of achievement. This overlap in content is considered undesirable by some test developers (Kaufman & Kaufman, 1983; Naglieri & Das, 1997a) and is amply noted by Kaufman and Lichtenberger (1999), who wrote that on the most commonly used IQ test, the Wechsler, the “Verbal Scale does measure achievement” (p. 133). This simple conclusion is a very important admission that the inclusion of tests that are very dependent upon knowledge, a problem not unique to the Wechsler scales, places persons with limited verbal knowledge at a significant disadvantage. Children from disadvantaged populations, those who have had limited or insufficient educational instruction, and those who are culturally and especially linguistically different (non-English) are at a considerable disadvantage. This is one of the reasons that some have argued that traditional IQ tests are biased. The Wechsler scales have been criticized for being biased against minority children (e.g. Hilliard, 1979) for a variety of reasons.
Of considerable concern is that African-Americans have consistently earned lower mean Full Scale IQ scores than whites (Kaufman, Harrison & Ittenbach, 1990; Prifitera & Saklofske, 1998). Although most psychometric experts reject the use of mean score differences as evidence of test bias (Reynolds & Kaiser, 1990), there has been overrepresentation of African-American students in special education classes for children with mental retardation (Reschly & Bersoff, 1999). Some would take this as evidence of test bias because elements of any IQ test that (1) are irrelevant to the construct being measured and (2) systematically cause differences between groups are problematic. Further, Messick (1995) argued that because the consequences of test scores may contribute to issues such as overrepresentation of minorities in classes for children with mental retardation and under-representation of minorities in programs for the gifted, the validity of the instruments is called into question. How big are the differences between race groups, and are they influenced by the nature of the ability test that is used?

Wasserman and Becker (2000) addressed this question in an excellent study of race differences on several different IQ tests, conducted for a symposium on fair assessment at the American Psychological Association annual convention. These investigators used or conducted studies of race differences for all major intelligence tests that employed a matched group design. This means that samples of Black and White children who were similar on as many demographic variables as available (e.g. age, sex, parent education, community setting, and region) were compared. Group mean scores were then compared and effect sizes (differences between the means divided by the groups’ average standard deviation) were computed. Wasserman and Becker examined the Wechsler Intelligence Scale for Children – Third Edition (WISC-III; Wechsler, 1991); Woodcock–Johnson Tests of Cognitive Ability (WJ-R; Woodcock & Johnson, 1989); Stanford-Binet Fourth Edition (SB-IV; Thorndike, Hagen & Sattler, 1986); Universal Nonverbal Intelligence Test (UNIT; Bracken & McCallum, 1998); and the CAS (Naglieri & Das, 1997a). Results from two additional studies (Naglieri, 1986; Naglieri & Ronning, 2000) were added to their results to include the K-ABC (Kaufman & Kaufman, 1983) and the Naglieri Nonverbal Ability Test (NNAT; Naglieri, 1997), respectively, both of which measure ability without inclusion of traditional verbal and arithmetic tests. The results of this summary are presented in Table 2. The findings in Table 2 should be considered in light of the fact that the concepts used to define and measure intelligence across these tests are very different. The difference in how intelligence is defined by these various tests provides a way to examine differences between race groups.

Table 2. Ability Test Total or Full Scale Standard Scores by Race.

Test                 Blacks   Whites   N       Difference   Effect Size
WISC-III FSIQ        89.9     100.9    252     11.0         0.73
WJ-R cognitive       90.9     102.6    854     11.7         0.69
Stanford-Binet IV    98.0     106.1    364     8.1          0.54
UNIT                 91.6     99.1     222     7.5          0.54
K-ABC                91.5     97.6     172     6.1          0.59
CAS                  95.3     98.8     238     3.5          0.26
NNAT                 99.3     95.1     4,612   4.2          0.25

Note: Sample sizes are for both White and Black groups combined.

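The effect sizes in Table 2 follow the definition given above: the difference between the group means divided by the average of the two groups' standard deviations. The sketch below applies that definition to the WISC-III means from Table 2. The group standard deviations are not reported in this chapter, so the value of 15 used for each group is an assumption made purely for illustration, and the function name is hypothetical.

```python
def effect_size(mean_a: float, mean_b: float, sd_a: float, sd_b: float) -> float:
    """Mean difference divided by the average of the two group standard deviations."""
    average_sd = (sd_a + sd_b) / 2
    return (mean_a - mean_b) / average_sd


# WISC-III Full Scale means from Table 2.
white_mean, black_mean = 100.9, 89.9

# ASSUMPTION: group standard deviations are not given in the chapter;
# 15 is used for both groups purely as an illustrative value.
assumed_sd = 15.0

d = effect_size(white_mean, black_mean, assumed_sd, assumed_sd)
print(f"Difference = {white_mean - black_mean:.1f}, effect size = {d:.2f}")
```

Under that assumed standard deviation the result rounds to the 0.73 listed for the WISC-III, but the agreement should not be read as confirming the actual sample values.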
What is striking about these results, and consistent with the conclusions provided by Wasserman and Becker (2000), is the following:

(1) The size of the race differences varies with the particular test;
(2) The size of the differences is related to the degree to which the test includes measures that are achievement-like;
(3) Tests that rely heavily on verbal achievement (WISC-III, WJ-R, SB-IV) yielded larger race differences;
(4) Measures of cognitive processing (CAS and K-ABC) that make fewer verbal achievement demands yield smaller race differences;
(5) Non-verbal tests (e.g. NNAT and UNIT) that require minimal verbal achievement yield smaller race differences.

Some might argue that ability tests that do not contain verbal achievement tests are somehow less valid measures of ability and that the differences between race groups are therefore reduced. However, as addressed earlier, tests like the K-ABC, NNAT, and CAS correlate with achievement as well as or better than traditional IQ tests that contain verbal achievement subtests. It is, therefore, reasonable to conclude that redefining intelligence in terms of basic cognitive processes or using non-verbal tests is a viable option for fair assessment. The shortcoming of using non-verbal tests for identification of children with learning disabilities is that such tests are general measures of ability and do not measure multiple forms of ability – something that is very important for differential diagnosis and treatment planning. Additionally, research suggests that tests with academic content (arithmetic, general information, word knowledge, for example) should be avoided in a test of ability, if for no other reason than to eliminate the verbal/achievement component of a test of ability. Following these guidelines will result in a more equitable system for evaluating diverse populations of children.

Interventions Related to PASS Theory

Two approaches that have been successfully used to translate CAS results into interventions for children with learning problems will be discussed in the next section.
The first is the PASS Reading Enhancement Program (PREP; Das, 1999) and the second is the Planning Facilitation Method described by Naglieri (1999). These approaches are based on the PASS theory and use the information gained about students’ processing abilities to build a cognitively based intervention method. The following section presents both interventions and provides empirical support for each.

PREP Remedial Program

The PREP program is based on research by Brailsford, Snart and Das (1984), Kaufman and Kaufman (1979), and Krywaniuk and Das (1976). These researchers showed that students could be trained to use simultaneous and successive processes more efficiently and thereby improve “their performance on that process and some transfer to specific reading tasks also occurred” (Ashman & Conway, 1997, p. 169). The current version of PREP (Das, 1999) makes the connection between successive and simultaneous cognitive processes and reading more explicit and includes more tasks that focus on successive processing than on simultaneous processing. The PREP program includes tasks that are non-academic and academic in content to illustrate the successive concept behind reading. For example, Fig. 2 shows two conceptually related successive tasks in PREP. In the first task, the child is taught about a two-step sequence using the beginnings and endings of pictures of animals. The second task extends this to the beginnings and endings of words. Similar tasks are used to teach the children to work effectively with longer sequences.

Fig. 2. Illustration of PREP Global and Bridging Tasks.

Carlson and Das (1997) and Das, Mishra and Pool (1995) conducted studies of the effectiveness of PREP for children with reading decoding problems. Carlson and Das (1997) studied Chapter 1 children who received PREP (n = 22) in comparison to a regular reading program (control n = 15). The samples were tested before and after intervention using two WJ-R subtests: Word Attack and Word Identification. The intervention was conducted in two 50-minute sessions each week for 12 weeks. Similarly, Das et al.’s (1995) study involved 51 Reading Disabled children who were divided into PREP (n = 31) and control (n = 20) groups. There were 15 PREP sessions given to small groups of four children. Word Attack and Word Identification tests were administered pre- and post-treatment. In both studies the PREP groups outperformed the control groups. These findings, summarized in Fig. 3, “suggest that process training can assist in specific aspects of beginning reading” (Ashman & Conway, 1997, p. 171).

Fig. 3. Research Report of Two Experiments on the Effectiveness of PREP.

Planning Facilitation

Several research studies have examined how PASS scores can be used to select effective interventions for children with learning disabilities. These intervention studies focused on planning and math and were based on similar research by Cormier, Carlson and Das (1990) and Kar, Dash, Das and Carlson (1992). Cormier et al. and Kar et al. used a method that stimulated children’s use of planning, which was shown to have positive effects on performance. In this approach children are taught to discover the value of strategy use without being specifically instructed to do so. The children were encouraged to examine the demands of the task in a strategic and organized manner. Cormier et al. (1990) and Kar et al. (1992) demonstrated that students differentially benefited from the technique that facilitated planning: children who performed poorly on measures of planning showed significantly greater gains than those with good scores in planning. The results indicated that those children with low planning scores (the ones who needed this technique the most) were significantly helped by the planning facilitation.

Naglieri and Gottling (1995, 1997) and Naglieri and Johnson (2000) used these studies as the basis for their work on improving math calculation performance. The two studies by Naglieri and Gottling (1995, 1997) demonstrated that planning facilitation led to improved performance on multiplication problems for those with low scores in planning, but not for those with high planning scores. In other words, learning disabled students benefited differentially from the instruction based on their cognitive processing status. Thus, it is important to match the instruction to the cognitive weakness of the child.

In the studies by Naglieri and Gottling (1995, 1997) and Naglieri and Johnson (2000), students completed mathematics work sheets in a sequence of baseline and intervention sessions over about a two-month period. The method used to indirectly teach planning was applied to individual children or groups of children about 2–3 times per week in half-hour blocks of time. In the intervention phase, the students were given a 10-minute period for completing a mathematics page, a 10-minute period for facilitating planning, and another 10-minute period for mathematics. All students were exposed to the intervention sessions that involved the three 10-minute segments of mathematics/discussion/mathematics in 30-minute instructional periods. During the discussion periods, students were encouraged to recognize the need to plan and use strategies when completing mathematics problems. The teachers provided probes that facilitated discussion and encouraged the children to consider various ways to be more successful. When a student provided a response, this often became the beginning point for discussion and further development of the strategy. The teachers used probes like “How did you do the math,” “What could you do to get more correct,” or “What will you do next time,” but they made no direct statements like “That is correct” or “Remember to use that same strategy,” nor did they provide feedback on the accuracy of previous pages, and they did not give mathematics instruction. The role of the teacher was to facilitate self-reflection and, therefore, encourage the students to plan so that they could complete the work sheets. The students made statements such as “I have to remember to borrow,” “I have to keep the columns straight or I get the wrong answer,” and “Be sure to get them right not just get it done.”

The relationship between the Planning Facilitation method and PASS profiles was studied by Naglieri and Johnson (2000). The purpose of their study was to determine if children with cognitive weaknesses in each of the four PASS processes would show different rates of improvement when given the Planning Facilitation method.
In this study children were selected to form groups based on their PASS scores. Children with a cognitive weakness (an individual PASS score significantly lower than the child’s mean and below 85) on the Planning, Attention, Simultaneous, or Successive scale were used to form contrast groups. In addition, a no cognitive weakness group was identified. The importance of this study was that the five groups of children responded very differently to the intervention. Naglieri and Johnson (2000) found that children with a cognitive weakness in Planning improved considerably over baseline rates, while those with no cognitive weakness improved only marginally. Children with cognitive weaknesses in Simultaneous, Successive, or Attention processing also showed substantially lower rates of improvement than the Planning weakness group. The results of this study, together with those of the earlier studies, are provided in Table 3 and illustrate that PASS processes are relevant to intervention for children with learning disabilities.
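The grouping rule described above (a PASS score significantly below the child's own mean and also below 85) can be expressed as a simple check. In the sketch below the profile is invented for illustration, the function name is hypothetical, and the critical difference of 10 points is a placeholder assumption, since the values actually required for significance are not given in this chapter.

```python
from statistics import mean


def cognitive_weaknesses(scores: dict[str, float],
                         critical_difference: float,
                         cutoff: float = 85.0) -> list[str]:
    """Return scales that fall below the child's own mean by at least
    `critical_difference` points and also fall below `cutoff`."""
    child_mean = mean(scores.values())
    return [
        scale for scale, score in scores.items()
        if (child_mean - score) >= critical_difference and score < cutoff
    ]


# HYPOTHETICAL profile, invented for this example.
profile = {"Planning": 82, "Attention": 100, "Simultaneous": 102, "Successive": 98}

# ASSUMPTION: 10 points is a placeholder, not a published critical value.
weak_scales = cognitive_weaknesses(profile, critical_difference=10.0)
print(weak_scales)
```

Both conditions matter in this rule: a uniformly low profile has no score far enough below the child's own mean, and a relatively low score of 85 or above does not meet the normative cutoff.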

Table 3. Summary of Research Investigations of the Percentage of Change from Baseline to Intervention for Children with Good or Poor Planning Scores.

Study                                High Planning   Low Planning
Cormier, Carlson and Das (1990)      5%              29%
Kar, Dash, Das and Carlson (1992)    15%             84%
Naglieri and Gottling (1995)         26%             178%
Naglieri and Gottling (1997)         42%             80%
Naglieri and Johnson (2000)          11%             143%
Median values across all studies     15%             84%
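The summary row of Table 3 can be checked directly from the five study values. This is a minimal sketch using Python's standard library; the variable names are illustrative.

```python
from statistics import median

# Percentage change from baseline to intervention, as listed in Table 3
# (one value per study, in the order the studies appear).
high_planning = [5, 15, 26, 42, 11]
low_planning = [29, 84, 178, 80, 143]

median_high = median(high_planning)
median_low = median(low_planning)

print(f"Median change: high planning = {median_high}%, low planning = {median_low}%")
```

The medians reproduce the final row of Table 3 (15% and 84%).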

How PASS Can be Used for LD Diagnosis

At the beginning of this chapter the case of Louis was presented. His ability scores were within the average range (Verbal IQ score of 92 and Performance score of 108), but his achievement scores were below average (basic reading score of 78, reading comprehension score of 85, and written expression score of 82). Based on this information it was clear that there was an ability/achievement discrepancy, but no intellectual problem was detected. That is, the general intelligence model based on the Verbal/Performance organization did not inform us of any cognitive difficulty. In contrast, the child’s performance on PASS tests does offer additional information that has both diagnostic and instructional relevance. Louis’ performance on the PASS tests clearly indicated that the young man has a cognitive weakness that is related to his academic weakness. Louis earned a CAS Planning score of 104, Attention score of 98, Simultaneous score of 92, and Successive score of 84. Louis’ Successive score is 15 points below his PASS mean of 99 and his Successive score is below average when compared to the normative mean of 100 – making it a “cognitive weakness.” This failure in a basic psychological process, along with poor scores in reading (78), reading comprehension (85), and spelling (82) achievement, has utility for eligibility as well as instruction. IDEA’97 defines a Specific Learning Disability (SLD) as “a disorder in one or more of the basic psychological processes involved in understanding or in using language, spoken or written, that may manifest itself in an imperfect ability to listen, think, read, write, spell, or to do mathematical calculations.” Louis has a documented disorder in Successive processing that underlies his academic failure in reading and spelling. The difficulty with Successive processing has made attempts to teach him ineffective and the need for some type of specialized instruction more obvious.
It is best to identify a disorder of basic psychological processes using a standardized instrument (which was accomplished with the PASS theory and CAS). This provides evidence of an ability/achievement discrepancy and an ability/achievement consistency, as graphically illustrated in Fig. 4. The differences between the scores Louis earned on each PASS scale and his achievement scores demonstrate that some of the scores are similar and others very different. Louis’ achievement scores in reading (78), reading comprehension (85), and spelling (82) are significantly different from his Planning, Attention, and Simultaneous scores (104, 98, and 92, respectively), but not significantly different from his Successive score (values needed for significance are provided by Naglieri, 2002). In other words, Louis’ cognitive weakness in Successive processing is consistent with his poor academic scores. Note that at the base of the diagram in Fig. 4 are the two areas of concern – low processing and low achievement. This association allows for the formulation of instruction that can be used to help Louis with his reading and spelling problems.

Fig. 4. CAS Discrepancy/Consistency Method Using PASS and Achievement Scores for Louis.
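The comparisons behind the Discrepancy/Consistency Method come down to the difference between each PASS score and each achievement score. The sketch below tabulates those differences for Louis using the scores reported in the text. It does not test significance, because the required critical values (Naglieri, 2002) are not reproduced in this chapter; the variable names are illustrative.

```python
# Louis' scores as reported in the text.
pass_scores = {"Planning": 104, "Attention": 98, "Simultaneous": 92, "Successive": 84}
achievement = {"Reading": 78, "Reading comprehension": 85, "Spelling": 82}

# Difference = PASS score minus achievement score for every pairing.
differences = {
    area: {scale: score - ach for scale, score in pass_scores.items()}
    for area, ach in achievement.items()
}

for area, row in differences.items():
    cells = ", ".join(f"{scale} {diff:+d}" for scale, diff in row.items())
    print(f"{area}: {cells}")
```

In every row the smallest difference falls in the Successive column, which is the pattern the text describes as consistency between the cognitive weakness and low achievement.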

Fig. 5. Segmenting Words for Reading, Decoding and Spelling Handout.

Fig. 6. Story Maps for Reading Comprehension Handout.

Fig. 7. Story Maps Worksheet.

Louis’ low score in Successive processing provides an explanation as to why he is having reading problems. Successive processing allows a child to organize incoming information in a proper order, which is important for remembering information in order as well as for the formation of sounds and movements in order. For this reason, Successive processing is involved in the blending of sounds to form words as well as in the syntax of language. Successive processing is important for reading decoding because this academic skill requires making sense out of printed letters and words. Knowing what order letters, letter sounds, and words must be in to make sense requires careful examination of the successive series or order of the sounds. Louis needs instruction with reduced successive processing demands. For example, Louis would likely benefit from Segmenting Words for Reading and Spelling, an intervention suggested by Naglieri and Pickering (2003). This intervention can provide Louis with a strategic way to approach reading and spelling that does not rely on his problem area (Successive processing), but rather focuses on Planning. The goal of the intervention is to teach students that words can be broken down into smaller parts and to help them understand how words are constructed and how the various parts are related to one another (see Fig. 5). If Segmenting Words for Reading and Spelling does not help Louis with his reading and spelling, then the PREP intervention discussed earlier is recommended.

Louis is also having a difficult time with reading comprehension and with remembering the order in which the various events of a story unfold. Story Maps is an intervention that focuses on teaching students how all the facts of the story are related to the main idea (Naglieri & Pickering, 2003). This intervention can help Louis organize what he reads by having him graphically represent the important parts of the story and the relationships among these parts (see Figs 6 and 7).

CONCLUSIONS

This chapter began with the assumption that intelligence tests have not changed appreciably since the beginning of the 20th century and that advances in cognitive psychology and neuropsychology have provided the opportunity for change in this field. Tests like the K-ABC and CAS offer cognitive processing alternatives to the general intelligence model. The CAS, which is based on the PASS theory, offers a strong alternative to traditional tests, as evidenced by three important findings. First, children’s PASS profiles are relevant to differential diagnosis and especially helpful for those with learning disabilities and attention deficits. Second, the CAS is an excellent predictor of achievement despite the fact that it does not contain verbal and achievement-based tests like those found in traditional measures of IQ. Third, the PASS theory provides information that is relevant to intervention and instructional planning. A case study was presented to illustrate how the CAS can help practitioners evaluate students consistent with state and Federal (IDEA’97) guidelines and can provide valuable information for intervention planning.

REFERENCES

Ashman, A. F., & Conway, R. N. F. (1997). An introduction to cognitive education: Theory and applications. London: Routledge.
Barkley, R. A. (1997). ADHD and the nature of self-control. New York, NY: Guilford Press.
Barkley, R. A. (1998). Attention-deficit hyperactivity disorder: A handbook for diagnosis and treatment (2nd ed.). New York, NY: Guilford Press.
Boake, C. (2002). From the Binet-Simon to the Wechsler-Bellevue: Tracing the history of intelligence testing. Journal of Clinical & Experimental Neuropsychology, 24(3), 383–405.
Bracken, B. A., & McCallum, R. S. (1998). The universal non-verbal intelligence test. Itasca: Riverside Publishing Company.
Brailsford, A., Snart, F., & Das, J. P. (1984). Strategy training and reading comprehension. Journal of Learning Disabilities, 17, 287–290.
Brody, N. (1992). Intelligence. San Diego: Academic Press.
Carlson, J., & Das, J. P. (1997). A process approach to remediating word decoding deficiencies in Chapter 1 children. Learning Disabilities Quarterly, 20, 93–102.
Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. New York: Cambridge University Press.
Cohen, R. J., Swerdlik, M. E., & Smith, D. K. (1992). Psychological testing and assessment. Mountain View, CA: Mayfield Publishing.
Cole, K. N., Dale, P. S., Mills, P. E., & Jenkins, J. R. (1993). Interaction between early intervention curricula and student characteristics. Exceptional Children, 60(1), 17–28.
Cormier, P., Carlson, J. S., & Das, J. P. (1990). Planning ability and cognitive performance: The compensatory effects of a dynamic assessment approach. Learning and Individual Differences, 2, 437–449.
Das, J. P. (1999). PASS reading enhancement program. Deal, NJ: Sarka Educational Resources.
Das, J. P., Kirby, J. R., & Jarman, R. F. (1975). Simultaneous and successive syntheses: An alternative model for cognitive abilities. Psychological Bulletin, 82, 87–103.
Das, J. P., Kirby, J. R., & Jarman, R. F. (1979). Simultaneous and successive cognitive processes. New York: Academic Press.
Das, J. P., Mishra, R. K., & Pool, J. E. (1995). An experiment on cognitive remediation of word-reading difficulty. Journal of Learning Disabilities, 28, 66–79.
Das, J. P., Naglieri, J. A., & Kirby, J. R. (1994). Assessment of cognitive processes. Needham Heights, MA: Allyn & Bacon Publishers.
Dehn, M. (2001). Cognitive assessment system performance of children with ADHD. Manuscript submitted for publication.
Gazzaniga, M. S. (1975). Recent research on hemispheric lateralization of the human brain: Review of the split-brain. UCLA Educator, 17, 9–12.
Goldberg, E. (2001). The executive brain: Frontal lobes and the civilized mind. New York, NY: Oxford University Press.
Hilliard, A. G. (1979). Standardization and cultural bias as impediments to the scientific study and validation of “intelligence”. Journal of Research and Development in Education, 12, 47–58.
Jensen, A. R. (1980). Bias in mental testing. New York: Free Press.
Kar, B. C., Dash, U. N., Das, J. P., & Carlson, J. S. (1992). Two experiments on the dynamic assessment of planning. Learning and Individual Differences, 5, 13–29.
Kaufman, A. S., Harrison, P. L., & Ittenbach, R. F. (1990). Intelligence testing in the schools. In: T. B. Gutkin & C. R. Reynolds (Eds), Handbook of School Psychology (pp. 289–327). New York: Wiley.
Kaufman, D., & Kaufman, P. (1979). Strategy training and remedial techniques. Journal of Learning Disabilities, 12, 63–66.
Kaufman, A. S., & Kaufman, N. L. (1983). Kaufman assessment battery for children. Circle Pines, MN: American Guidance Service.
Kaufman, A. S., & Lichtenberger, E. O. (1999). Essentials of WAIS-III assessment. New York: Wiley.
Kaufman, A. S., & Lichtenberger, E. O. (2000). Essentials of WISC-III and WPPSI-R assessment. New York: Wiley.
Kavale, K. A., & Forness, S. R. (1984). A meta-analysis of the validity of the Wechsler scale profiles and recategorizations: Patterns or parodies? Learning Disability Quarterly, 7, 136–151.
Kinsbourne, M. (1978). Asymmetrical function of the brain. Cambridge, MA: Cambridge University Press.
Krywaniuk, L. W., & Das, J. P. (1976). Cognitive strategies in native children: Analysis and intervention. Alberta Journal of Educational Research, 22, 271–280.
Luria, A. R. (1966a). Higher cortical functions in man (2nd ed., revised and expanded). New York: Basic Books.
Luria, A. R. (1966b). Human brain and psychological processes. New York: Harper & Row.
Luria, A. R. (1973). The working brain: An introduction to neuropsychology. New York: Basic Books.
Luria, A. R. (1980). Higher cortical functions in man (2nd ed.). New York: Basic Books.
Luria, A. R. (1982). Language and cognition. New York: Wiley.
McGrew, K. S., Keith, T. Z., Flanagan, D. P., & Vanderwood, M. (1997). Beyond g: The impact of Gf-Gc specific cognitive abilities research on the future use and interpretation of intelligence tests in the schools. School Psychology Review, 26, 189–210.
Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons’ responses and performances as scientific inquiry into score meaning. American Psychologist, 50(9), 741–749.
Miller, G., Galanter, E., & Pribram, K. (1960). Plans and the structure of behavior. New York: Henry Holt and Company.
Naglieri, J. A. (1986). WISC-R and K-ABC comparison for matched samples of Black and White children. Journal of School Psychology, 24, 81–88.
Naglieri, J. A. (1997). Naglieri non-verbal ability test. San Antonio: Psychological Corporation.
Naglieri, J. A. (1999). Essentials of CAS assessment. New York: Wiley.
Naglieri, J. A. (2000). Can profile analysis of ability test scores work? An illustration using the PASS theory and CAS with an unselected cohort. School Psychology Quarterly, 15(4), 419–433.
Naglieri, J. A. (2002). CAS rapid score. Centreville, VA: NL Associates.
Naglieri, J. A., & Das, J. P. (1997a). Cognitive assessment system. Itasca: Riverside Publishing Company.
Naglieri, J. A., & Das, J. P. (1997b). Cognitive assessment system interpretive handbook. Chicago: Riverside Publishing Company.
Naglieri, J. A., Goldstein, S., Iseman, J. S., & Schwebach, A. (in press). Performance of children with attention deficit hyperactivity disorder and anxiety/depression on the WISC-III and cognitive assessment system (CAS). Journal of Psychoeducational Assessment.
Naglieri, J. A., & Gottling, S. H. (1995). A cognitive education approach to math instruction for the learning disabled: An individual study. Psychological Reports, 76, 1343–1354.
Naglieri, J. A., & Gottling, S. H. (1997). Mathematics instruction and PASS cognitive processes: An intervention study. Journal of Learning Disabilities, 30, 513–520.
Naglieri, J. A., & Johnson, D. (2000). Effectiveness of a cognitive strategy intervention to improve math calculation based on the PASS theory. Journal of Learning Disabilities, 33, 591–597.
Naglieri, J. A., & Pickering, E. B. (2003). Helping children learn. Baltimore, MD: Brookes Publishing.
Naglieri, J. A., & Rojahn, J. (2001). Gender differences in planning, attention, simultaneous, and successive (PASS) cognitive processes and achievement. Journal of Educational Psychology, 93, 430–437.
Naglieri, J. A., & Ronning, M. E. (2000). The relationships between general ability using the NNAT and SAT reading achievement. Journal of Psychoeducational Assessment, 18, 230–239.
Naglieri, J. A., Salter, C. J., & Edwards, G. H. (2002). Performance of children with assessment of ADHD and reading disabilities using the PASS theory and Cognitive Assessment System. Manuscript submitted for publication.
Naglieri, J. A., & Sullivan, L. (December, 1998). IDEA and identification of children with specific learning disabilities. Communiqué.
Neisser, U. (1967). Cognitive psychology. New York: Appleton-Century-Crofts.
Paolitto, A. W. (1999). Clinical validation of the Cognitive Assessment System with children with ADHD. ADHD Report, 7, 1–5.
Pressley, M. P., & Woloshyn, V. (1995). Cognitive strategy instruction that really improves children’s academic performance (2nd ed.). Cambridge: Brookline Books.
Prifitera, A., & Saklofske, D. (1998). WISC-III clinical use and interpretation: Scientist-practitioner perspectives. New York: Academic Press.
Reschly, D. J., & Bersoff, D. N. (1999). Law and school psychology. In: C. R. Reynolds & T. B. Gutkin (Eds), The Handbook of School Psychology (3rd ed., pp. 1077–1112). New York: Wiley.
Reynolds, C. R., & Kaiser, S. M. (1990). Bias in assessment of aptitude. In: C. R. Reynolds & R. W. Kamphaus (Eds), Handbook of Psychological & Educational Assessment of Children: Intelligence and Achievement (pp. 611–653). New York: Wiley.
Siegel, L. S. (1989). IQ is irrelevant to the definition of learning disabilities. Journal of Learning Disabilities, 22, 469–479.
Snow, R. E. (1991). Aptitude-treatment interaction as a framework for research on individual differences in psychotherapy. Journal of Consulting and Clinical Psychology, 59, 205–216.
Solso, R. L., & Hoffman, C. A. (1991). Influence of Soviet scholars. American Psychologist, 46, 251–253.
Sternberg, R. J. (1988). The triarchic mind: A new theory of human intelligence. New York: Viking.
Thorndike, R. L., Hagen, E. P., & Sattler, J. M. (1986a). The Stanford-Binet intelligence scale, Fourth edition: Guide for administering and scoring. Chicago: Riverside.
Wasserman, J. D., & Becker, K. A. (August, 2000). Racial and ethnic group mean score differences on intelligence tests. In: J. A. Naglieri (Chair), Making Assessment More Fair – Taking Verbal and Achievement out of Ability Tests. Symposium conducted at the annual meeting of the American Psychological Association, Washington, DC.
Wechsler, D. (1991). Wechsler intelligence scale for children (3rd ed.). San Antonio, TX: Psychological Corporation.
Wechsler, D. (1992). Wechsler individual achievement test. San Antonio, TX: Psychological Corporation.
Wilson, M. S., & Reschly, D. J. (1996). Assessment in school psychology training and practice. School Psychology Review, 25, 9–23.
Woodcock, R. W., & Johnson, M. B. (1989). Woodcock-Johnson revised tests of achievement: Standard and supplemental batteries. Itasca, IL: Riverside Publishing.
Yoakum, C. S., & Yerkes, R. M. (1920). Army mental tests. New York: Henry Holt and Company.

INDIVIDUALS WITH MENTAL RETARDATION AND A SENSORIMOTOR DISORDER: ASSESSMENT OF DISABILITY

Giulia Balboni and Patrizia Ceccarani

ABSTRACT

The purpose of this investigation was to examine the utility of the Vineland Scales-Expanded Form for an assessment of disabilities in individuals with Mental Retardation (MR) and a sensorimotor disorder. The Vineland score profiles of individuals with MR and a sensorimotor disorder were compared with those of matched peers with MR but without the associated disorder. The disorder group exhibited lower scores only in the adaptive areas relating to the sensorimotor disorder. The Vineland Scales can therefore evaluate the adaptive area deficits of individuals with MR and a sensorimotor disorder. As the Vineland Scales measure everyday living skills, they can be used for an assessment of sensorimotor disabilities. The utility of disability assessment in individuals with MR is discussed.

Identification and Assessment
Advances in Learning and Behavioral Disabilities, Volume 16, 191–204
Copyright © 2003 by Elsevier Science Ltd. All rights of reproduction in any form reserved
ISSN: 0735-004x/doi:10.1016/S0735-004X(03)16006-5

INTRODUCTION

According to the 2002 definition of the American Association on Mental Retardation (Luckasson et al., 2002), Mental Retardation (MR) is a state of impaired functioning; it manifests itself in the difficulties an individual has in meeting the performance expectations of his/her personal and social environment (Greenspan, 1999; Schalock, 1999a, b). MR is the result of the interaction of the person’s deficits in conceptual, practical, and social intelligences (Thompson, McGrew & Bruininks, 1999) with the demands of the environment (Luckasson et al., 2002). Many individuals with MR, and in particular those with severe or profound MR, have an associated sensorimotor disorder (American Psychiatric Association, 1994; Carvill, 2001). The term sensorimotor disorder refers here to a loss or abnormality of the sensorimotor system that follows illness or trauma of the visual or auditory sensory system or of the neuromotor system (Dunn, 1999). The sensorimotor system comprises the input (sensation) and the output (motor) that operate to enable a person to notice stimuli and to react to them. Thus, sensorimotor disorders make it more difficult for persons with MR to interact with their environment.

To plan interventions that improve the life functioning of individuals with both MR and a sensorimotor disorder, psychometric instruments are needed which evaluate everyday living skill deficits. The instruments must discriminate everyday living skill deficits which are intrinsic to MR from those which are instead related to the sensorimotor disorder. Psychological tests like the Peabody Developmental Motor Scales (Folio & Fewell, 1983), the Bruininks-Oseretsky Test of Motor Proficiency (Bruininks, 1978), the Bender Visual Motor Gestalt Test (Bender, 1938), and the Luria-Nebraska Neuropsychological Battery (Golden, 1987) are generally used to evaluate perceptual and motor skills (Dunn, 1999; Fuller, Awadh & Vance, 1998; Handen, 1997). These instruments are standardized laboratory tests which allow for the evaluation of basic competencies such as gross and fine motor skills or visual and auditory perceptual skills.
These instruments can therefore detect impairments; an impairment is the gap between individual performance and the range considered normal for a human being (World Health Organization, 1992). Impairments may affect the individual’s daily living skills in different ways. Thus, the evaluation of impairments is insufficient for understanding their effects on the life functioning of persons with MR. It is therefore very important also to use instruments which allow for the evaluation of disabilities; a disability is any restriction or lack of ability to perform a daily life activity (WHO, 1992). The instruments have to determine how and to what extent impairments actually affect the carrying out of everyday life activities.

In the field of MR there is a well-known construct concerning everyday living skills; it is called adaptive behavior or adaptive skills. Adaptive behavior refers to the skills that people typically exhibit when adjusting to and dealing with the physical and socio-cultural environmental demands they must confront (Thompson et al., 1999). According to the new AAMR definition of adaptive behavior (Heal & Tassé, 1999; Luckasson et al., 2002), adaptive behavior is expressed in conceptual, social, and practical adaptive areas and refers to a set of twelve adaptive skills to be evaluated in home, school/work, and community settings: communication expressive, communication receptive, functional academic literacy, functional academic mathematics, social skills, leisure, self-care, health, safety, home living, work, and community. On the other hand, empirical investigations have revealed four general dimensions of adaptive behavior: communicative, daily living, social, and motor skills (Thompson et al., 1999; Widaman, Borthwick-Duffy & Little, 1991; Widaman & McGrew, 1996). Researchers believe that any adaptive behavior scale should assess these four dimensions (Heal & Tassé, 1999; Schalock, 1999a, b).

Sensorimotor disorders are generally associated with deficits in adaptive behavior or everyday living skills. Adaptive areas which require motor competence (e.g. motor skills, self-care and home living skills) are compromised in persons with a motor or a visual disorder (e.g. Pérez-Pereira & Conti-Ramsden, 1999). Communication areas may be compromised in persons with a hearing disorder (e.g. Marschark, 1993). Social areas may be compromised in persons with any motor, visual, or hearing sensorimotor disorder (e.g. Carvill, 2001; Marschark, 1993; Pérez-Pereira & Conti-Ramsden, 1999). Adaptive behavior thus relates to everyday living skills and is compromised in persons with a sensorimotor disorder. Therefore, the adaptive behavior scales might be used for the evaluation of disabilities in individuals with MR and a sensorimotor disorder. For this purpose, the adaptive behavior scales must be able to reveal the adaptive behavior deficits associated with sensorimotor disorders.
Among all the adaptive behavior scales, the Vineland Adaptive Behavior Scales (Sparrow, Balla & Cicchetti, 1984) are probably the most widely used, for both clinical and research purposes (Heal & Tassé, 1999; Luckasson et al., 1992). The Vineland Scales allow for the evaluation of all developmental levels of the four adaptive behavior general domains. There are four Vineland Scales, each made up of sub-scales: Communication (Receptive, Expressive, and Written), Daily Living Skills (Personal, Domestic, and Community), Socialization (Interpersonal Relationships, Play and Leisure Time, and Coping Skills), and Motor Skills (Gross and Fine). The Vineland Scales are completed by the individual’s caregiver. They can therefore also be used when the direct administration of a test would be particularly complex, such as with individuals with severe-profound MR or with sensorimotor impairments. Many researchers have verified that the Vineland Scales exhibit high reliability and validity (Balboni, Pedrabissi, Molteni & Villa, 2001; Carter et al., 1998; Sparrow et al., 1984).

Recently, Balboni et al. (2001) verified that the Vineland Scales-Expanded Form reveal the adaptive behavior deficits of persons with MR and an associated communication, social, or motor disorder. Three matched groups of persons with MR, with or without an associated communication, social, or motor disorder, were compared. Significant differences were found only in the areas of adaptive behavior that were affected by the associated disorder. Moreover, it was revealed that the discriminant utility of the Vineland Scales was independent of the participants’ level of mental retardation (Balboni et al., 2001).

The present investigation was intended to verify whether the Vineland Scales can also reveal the adaptive behavior deficits of persons with MR and an associated sensorimotor disorder. For this purpose, the Vineland score profiles of individuals with MR and an associated sensorimotor disorder were compared with those of matched peers with MR but without the associated disorder. It was predicted that the group with a sensorimotor disorder would obtain lower scores than controls only in the adaptive areas which require motor, communicative, and social competences. There is no direct relation between the severity of an impairment and the type of disability. Therefore, this investigation was intended to determine whether the Vineland Scales could reveal which adaptive areas are most compromised among all the adaptive areas that require motor, communicative, or social competences. Finally, this investigation was also intended to determine whether the discriminant utility of the Vineland Scales is the same in evaluating persons with mild-moderate MR as in evaluating individuals with severe-profound MR. For this purpose, sensorimotor disorder and control groups at different levels of mental retardation were compared, to examine whether any interaction effect existed between the group (sensorimotor disorder vs. control) and MR level (mild-moderate vs. severe-profound) variables on the Vineland score profiles.

METHOD

Participants

The participants were 94 individuals (72 males, 77%), including 10 (11%) with mild, 30 (32%) with moderate, 48 (51%) with severe, and 6 (6%) with profound MR, with a chronological age between 6 and 51 years (M = 23/10, SD = 13/7). Forty-six (49%) lived in residential facilities for persons with disabilities; 48 (51%) lived in the community and attended school (n = 34, school age participants) or daily facilities for persons with disabilities (n = 14, adults). The diagnoses of mild, moderate, severe, or profound MR had been made by the psychologists of the individuals’ facilities according to the DSM-IV (APA, 1994) on the basis of IQ (50–55 to approximately 70; 35–40 to 50–55; 20–25 to 35–40; below 20–25, respectively) and a clinical judgment of adaptive functioning within the margins overlapping the IQ. Intelligence tests had been employed for 28 participants (30%). These individuals had been evaluated within 12 months of the Vineland’s compilation, with the WPPSI (n = 3), WISC-R (n = 4), Stanford-Binet (n = 5), Colored Progressive Matrices (n = 12), or Progressive Matrices 38 (n = 4). For the other 66 (70%), testing was not possible because the participants were found to be too impaired; their MR level had therefore been ascertained on the basis of a clinical judgment of life functioning. In particular, the psychologists had evaluated sensorimotor, communication, academic, self care, social and vocational skills, as well as the need of supervision; then they had ascertained the MR level on the basis of the description of these dimensions’ functioning level proposed by the DSM-IV.

All the participants were selected from among the 1197 individuals that make up the standardization sample of individuals with disabilities of the Vineland Scales-Expanded Form, Italian version (Balboni & Pedrabissi, 2003). A previous interview with the staff of the individuals’ facilities had been used to exclude persons who had had particular experiential histories (e.g. had recently arrived in the residential facilities, or had attended a non-standard rehabilitative program). The participants were selected in order to create two matched groups, a disorder group and a control group, with or without a sensorimotor disorder associated with MR. All the persons with MR and with a visual, hearing, or motor deficit caused by trauma or illness in the sensorial or neuromotor system were selected first; they did not have any other major physical or mental disorder. Then, for each of them a peer was selected with MR and no other major disorder.
The peer had to live in the same type of residence as the sensorimotor disorder participant (community or residential facility), have the same MR level and gender, and differ in age by no more than 12 months (if younger than 16 years) or 24 months (if older). In the main sample of 1197 individuals there were 90 persons with MR and a sensorimotor disorder; however, only 47 were selected because a matched control was not found for the others. The types (with frequencies) of sensorimotor disorders are shown in Table 1. Sensorimotor disorders had been diagnosed by the physicians of the individuals’ facilities, and no form of the Vineland Scales had been used for the diagnosis. Table 2 displays the characteristics of the sensorimotor disorder and control groups. There were no statistically significant differences in age, gender, residence type, or MR level between the two groups.
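The matching criteria described above can be sketched as follows. This is only an illustrative sketch, not the authors' procedure: the `Person` record, the field names, and the greedy closest-age pairing are assumptions introduced here, and the age tolerance is assumed to depend on the age of the sensorimotor disorder participant.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Person:
    """Hypothetical participant record (field names are assumptions)."""
    pid: str
    residence: str      # "community" or "residential"
    mr_level: str       # "mild", "moderate", "severe", "profound"
    gender: str         # "M" or "F"
    age_months: int
    sensorimotor: bool  # True if a sensorimotor disorder is present


def age_tolerance(age_months: int) -> int:
    """Allowed age gap in months: 12 if younger than 16 years, else 24."""
    return 12 if age_months < 16 * 12 else 24


def is_match(target: Person, peer: Person) -> bool:
    """Same residence type, MR level and gender, and age within tolerance."""
    return (
        not peer.sensorimotor
        and peer.residence == target.residence
        and peer.mr_level == target.mr_level
        and peer.gender == target.gender
        and abs(peer.age_months - target.age_months)
        <= age_tolerance(target.age_months)
    )


def match_pairs(people: List[Person]) -> List[Tuple[Person, Person]]:
    """Greedily pair each disorder participant with the closest-aged unused control."""
    targets = [p for p in people if p.sensorimotor]
    pool = [p for p in people if not p.sensorimotor]
    pairs = []
    for target in targets:
        candidates = [c for c in pool if is_match(target, c)]
        if not candidates:
            continue  # no control found: the participant is not included
        best = min(candidates, key=lambda c: abs(c.age_months - target.age_months))
        pool.remove(best)  # each control is used at most once
        pairs.append((target, best))
    return pairs


# Hypothetical records, for illustration only
sample = [
    Person("s1", "community", "severe", "M", 120, True),
    Person("s2", "residential", "moderate", "F", 300, True),
    Person("c1", "community", "severe", "M", 130, False),
    Person("c2", "residential", "moderate", "F", 330, False),
    Person("c3", "community", "severe", "M", 122, False),
]
pairs = match_pairs(sample)
```

In this hypothetical sample, the second disorder participant remains unmatched because the only candidate differs in age by 30 months, which mirrors how participants without a suitable control were left out of the study groups.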

Table 1. Sensorimotor Disorders (n = 47).

Type                                                      No. (%)
Visual (20, 42%)
  Visual deficit                                           6 (13)
  Blindness                                               14 (30)
Visual and motor (14, 30%)
  Visual deficit, leg impairment                           3 (6)
  Visual deficit, leg and arm impairment                   7 (15)
  Blindness, leg impairment                                3 (6)
  Blindness, leg and arm impairment                        1 (2)
Visual and hearing (6, 13%)
  Visual and auditory deficit                              4 (8)
  Blindness, deafness                                      2 (4)
Hearing and motor (4, 8%)
  Auditory deficit, leg and arm impairment                 2 (4)
  Deafness, leg impairment                                 1 (2)
  Deafness, leg and arm impairment                         1 (2)
Visual, hearing, and motor (3, 6%)
  Blindness, deafness, leg impairment                      2 (4)
  Blindness, auditory deficit, leg and arm impairment      1 (2)

Table 2. Characteristics of the Sensorimotor Disorder and Control Groups.

                     Sensorimotor (n = 47)    Control (n = 47)
Age, mean (SD)       23/10 (13/8)             23/10 (13/8)
Gender (M–F)         36–11                    36–11
Residence type
  Community          24                       24
  Institution        23                       23
MR level
  Mild               5                        5
  Moderate           15                       15
  Severe             24                       24
  Profound           3                        3

Instrument

There are three different versions of the Vineland Scales: the Expanded Form, which allows for a detailed analysis of adaptive behavior; the Survey Form, which allows for a quick preliminary evaluation; and the Classroom Edition, for the evaluation of adaptive behavior in school. The Classroom Edition is filled out directly by the student’s teacher. The Expanded and Survey Forms are completed through a semi-structured interview with a person who knows the individual well. In this investigation, the Italian version of the Expanded Form was used, since it provides a more detailed evaluation and is more suitable than the other editions for rehabilitation program planning. The Italian version of the Expanded Form exhibits high validity and reliability (Balboni & Pedrabissi, 2003) and provides norms based on a regular sample (e.g. age equivalent scores) and on a sample with disabilities (e.g. standard scores and adaptive levels).

The Vineland Scales had been completed by interviewers who questioned, via a semi-structured interview, the individual’s parent (n = 14), school special teacher (n = 18), or closest facility staff member (for all the 48 participants who lived in residential facilities and for 14 participants who attended daily facilities). Previous investigations (Balboni & Pedrabissi, 2003) have revealed that special education teachers and staff members are as reliable as parents in evaluating individuals’ adaptive behavior with the Italian version of the Vineland Scales. Some of the interviewers were staff members of the individuals’ facilities; others were psychologists with a specialization in the field of MR. All the interviewers had been previously trained in the use of tests and of the Vineland Scales in particular. This investigation is an a-posteriori investigation, developed after the completion of the standardization program of the Italian version of the Vineland; thus the data collection could not have been influenced by prior knowledge of the aims of the investigation.

Research Design

We employed a 2 × 2 factorial multivariate research design, with Group (sensorimotor disorder vs. control) as the within-subjects controlled variable and MR Level (mild-moderate vs. severe-profound) as the between-subjects controlled variable. The observed variables were the adaptive areas measured by the Vineland Scales or sub-scales; age equivalent normative scores were used. The Shapiro-Wilk test (Bray & Maxwell, 1993; Stevens, 1986) was conducted, and revealed that the observed variables were not always normally distributed. Repeated measures Multivariate Analyses of Variance (MANOVAs) and Analyses of Variance (ANOVAs) were employed to compare, simultaneously and separately respectively, the Vineland scale or sub-scale scores at different levels of the controlled variables. In accordance with the Bonferroni procedure for multiple comparisons (Silverstein, 1986), the significance threshold was determined by dividing the chosen p value (0.05) by the number of comparisons. Box tests (Bray & Maxwell, 1993; Stevens, 1986) and Levene tests (Howell, 1997) were carried out; they allowed us to verify, respectively, that generally there was not homogeneity in the covariance matrix (in the case of MANOVAs), nor in the variance (in the case of ANOVAs) of individual cells.
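The Bonferroni adjustment mentioned above amounts to a single division. The following minimal sketch illustrates it; the function names and the family of four comparisons are hypothetical, since the text does not state how many comparisons entered each family.

```python
def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold under the Bonferroni procedure."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


def is_significant(p_value: float, alpha: float, n_comparisons: int) -> bool:
    """True if p_value falls below the Bonferroni-adjusted threshold."""
    return p_value < bonferroni_threshold(alpha, n_comparisons)


# Hypothetical family of four comparisons (for example, one per Vineland Scale)
threshold = bonferroni_threshold(0.05, 4)
print(f"adjusted threshold: {threshold}")
```

With a chosen p value of 0.05 and four comparisons, each individual test would have to reach p < 0.0125 to count as significant.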

RESULTS

Comparisons Within Sensorimotor Disorder and Control Groups

The within-subjects MANOVAs revealed that the sensorimotor disorder participants had a significantly (p ≤ 0.01) different profile from controls on the Vineland Scales, Wilks’ λ = 0.54, F(4, 43) = 9.04, and on the sub-scales of Daily Living Skills, Wilks’ λ = 0.79, F(3, 44) = 3.95, Socialization, Wilks’ λ = 0.76, F(3, 44) = 4.51, and Motor Skills, Wilks’ λ = 0.56, F(2, 45) = 17.79; no differences were observed in the Communication sub-scales. As can be seen in Table 3, the group with sensorimotor disorders had significantly lower scores on the Vineland domains of Socialization and Motor Skills and on the sub-domains of Personal, Coping Skills, Gross Motor, and Fine Motor. Moreover, the sensorimotor participants had lower scores approaching statistical significance on the sub-domains of Written Communication and Play and Leisure Time.

Table 3. Age Equivalent Mean (and Standard Deviation) on the Vineland Scales and Sub-Scales for the Sensorimotor Disorder and Control Groups.

                                Sensorimotor    Control        F value
Communication                   2.78 (1.92)     3.36 (2.64)     2.62
  Receptive                     1.52 (2.07)     1.81 (2.43)     0.60
  Expressive                    2.60 (1.99)     2.91 (2.27)     0.88
  Written                       3.87 (1.40)     4.64 (2.36)     5.54†
Daily living skills             2.59 (1.33)     3.41 (2.35)     5.43
  Personal                      2.12 (1.31)     3.14 (2.13)     9.58**
  Domestic                      3.66 (1.31)     4.47 (3.41)     2.59
  Community                     3.05 (1.30)     3.46 (2.08)     1.90
Socialization                   1.97 (1.02)     2.50 (1.53)     7.33**
  Interpersonal relationships   1.59 (1.35)     2.03 (1.84)     2.27
  Play and leisure time         1.32 (0.98)     1.92 (1.71)     5.71†
  Coping skills                 3.56 (0.85)     4.04 (1.49)     9.03**
Motor skills                    1.56 (0.77)     2.63 (1.24)    36.06***
  Gross                         1.52 (0.87)     2.62 (1.23)    31.24***
  Fine                          1.73 (0.80)     2.73 (1.39)    30.51***

* p < 0.05. ** p < 0.01. *** p < 0.001. † p < 0.06.
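The multivariate statistics reported above can be cross-checked against one another. The sketch below assumes the standard exact relation between Wilks' lambda and F that holds when the effect has a single hypothesis degree of freedom (here, two matched groups): F = ((1 − λ) / λ) × (df2 / df1). The function names are introduced for illustration only, and because λ is reported to two decimals, the check asks whether each reported F falls inside the range implied by the rounding.

```python
def f_from_wilks(wilks_lambda: float, df_hyp: int, df_err: int) -> float:
    """Exact F for Wilks' lambda when the effect has one hypothesis df."""
    return (1.0 - wilks_lambda) / wilks_lambda * df_err / df_hyp


def f_range_from_rounded(wilks_lambda: float, df_hyp: int, df_err: int,
                         half_width: float = 0.005) -> tuple:
    """Range of F values compatible with a lambda reported to two decimals."""
    low = f_from_wilks(wilks_lambda + half_width, df_hyp, df_err)
    high = f_from_wilks(wilks_lambda - half_width, df_hyp, df_err)
    return low, high


# (label, reported lambda, df1, df2, reported F), copied from the Results text
reported = [
    ("Vineland Scales", 0.54, 4, 43, 9.04),
    ("Daily Living Skills", 0.79, 3, 44, 3.95),
    ("Socialization", 0.76, 3, 44, 4.51),
    ("Motor Skills", 0.56, 2, 45, 17.79),
]

consistent = {}
for label, lam, df1, df2, f_reported in reported:
    low, high = f_range_from_rounded(lam, df1, df2)
    consistent[label] = low <= f_reported <= high
    print(f"{label}: F from lambda = {f_from_wilks(lam, df1, df2):.2f}, "
          f"compatible range [{low:.2f}, {high:.2f}], reported {f_reported}")
```

Under this assumption, the degrees of freedom also follow the expected pattern for 47 matched pairs: df1 equals the number of scales or sub-scales compared, and df2 equals 47 minus that number.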

Comparisons between Participants with Either Mild-Moderate or Severe-Profound MR Level

The MANOVAs highlighted a significant main effect (p ≤ 0.01) of the MR level on the scores obtained on the four Vineland Scales, Wilks’ λ = 0.42, F(4, 42) = 14.68, and on the sub-scales of the Communication, Wilks’ λ = 0.46, F(3, 43) = 16.92, Daily Living Skills, Wilks’ λ = 0.46, F(3, 43) = 12.27, Socialization, Wilks’ λ = 0.43, F(3, 43) = 19.33, and Motor Skills, Wilks’ λ = 0.50, F(2, 44) = 21.89, Scales. As can be seen in Table 4, the participants with mild or moderate MR, compared to those with severe or profound MR, revealed significantly (p < 0.001) higher scores on all the Vineland Scales and sub-scales except the Domestic and Gross Motor sub-scales.

Table 4. Age Equivalent Mean (and Standard Deviation) on the Vineland Scales and Sub-Scales for the Participants with Either Mild-Moderate or Severe-Profound MR.

                                Mild-moderate   Severe-profound   F value (a)
Communication                   4.71 (1.60)     1.86 (1.12)       51.47
  Receptive                     3.01 (1.84)     0.66 (1.11)       29.53
  Expressive                    4.26 (1.34)     1.65 (1.22)       48.07
  Written                       5.45 (1.75)     3.37 (0.58)       33.33
Daily living skills             3.88 (1.30)     2.35 (1.25)       16.49
  Personal                      3.44 (1.23)     2.03 (1.14)       16.33
  Domestic                      4.75 (1.32)     3.56 (2.15)        4.73
  Community                     4.20 (1.51)     2.56 (0.77)       23.69
Socialization                   3.13 (0.99)     1.57 (0.62)       43.62
  Interpersonal relationships   2.87 (1.16)     1.03 (0.60)       49.77
  Play and leisure time         2.35 (1.15)     1.07 (0.68)       22.69
  Coping skills                 4.65 (1.09)     3.18 (0.52)       37.75
Motor skills                    2.52 (0.80)     1.77 (0.71)       11.44
  Gross                         2.30 (0.90)     1.91 (0.74)        2.66
  Fine                          2.93 (0.72)     1.71 (0.74)       32.44

(a) All statistically significant (p < 0.001) except for the Domestic and Gross Motor sub-domains.



Comparisons between Sensorimotor Disorder and Control Sub-Groups at Different MR Levels

MANOVAs and ANOVAs did not reveal any significant interaction effects between group and MR level on the Vineland domain and sub-domain scores. A significant multivariate interaction effect was found only in the Communication sub-domains, Wilks’ λ = 0.77, F(3, 43) = 4.23, p < 0.01, but ANOVAs did not reveal any significant univariate differences.
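To make the logic of the Group × MR Level interaction concrete: for a single score, when the within-subjects factor has two levels, the interaction test is equivalent to comparing the matched-pair difference scores (control minus disorder) between the two MR levels, and the squared pooled-variance t statistic equals the interaction F. The sketch below illustrates this on invented numbers; the difference scores are hypothetical and are not data from this study.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def pooled_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample t statistic with pooled variance (equal-variance form)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))


# Hypothetical control-minus-disorder differences (age equivalents), one per matched pair
diff_mild_moderate = [1.2, 0.8, 1.5, 0.9, 1.1]
diff_severe_profound = [1.0, 1.3, 0.7, 1.2, 0.8]

t_stat = pooled_t(diff_mild_moderate, diff_severe_profound)
f_interaction = t_stat ** 2  # F(1, n_pairs - 2) for the interaction

print(f"t = {t_stat:.3f}, interaction F = {f_interaction:.3f}")
```

A small F, as in this invented example, means that the gap between disorder and control participants is of similar size at both MR levels, which is the pattern the absence of an interaction effect describes.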

DISCUSSION

The primary purpose of the present investigation was to verify whether the Vineland Scales could reveal the adaptive area deficits of persons with MR and a sensorimotor disorder. Compared with peers without the associated disorder, participants with MR and a sensorimotor disorder had a different profile of adaptive behavior. The sensorimotor disorder group obtained lower scores only in the adaptive behavior areas which are generally compromised in persons with a sensorimotor disorder: areas which require motor, communicative, and social competences (e.g. Marschark, 1993; Pérez-Pereira & Conti-Ramsden, 1999).

Participants with a sensorimotor disorder may be differently compromised in the different adaptive areas which require the same competence. Thus, the second purpose was to verify whether the Vineland Scales allow for the identification of the adaptive areas that are most compromised among all the areas that require motor, communicative, or social competences. It was revealed that participants with an associated sensorimotor disorder had different deficits in the different adaptive areas that require the same competences. In particular, regarding deficits in areas which require motor competence, the sensorimotor disorder group had lower skills on the Vineland Motor Scale and on the Vineland Gross Motor, Fine Motor, and Personal sub-scales. Conversely, no difference was observed on the Vineland Domestic sub-scale, although it requires motor competence. The Domestic sub-scale evaluates skills (e.g. housecleaning and food preparation) which are probably not exhibited by controls either; this may happen because the Italian social environment frequently does not require these skills of persons with MR. Regarding deficits in areas which require communicative competence, the sensorimotor disorder participants had lower skills on the Vineland Written sub-scale.
No differences were found on the Vineland Receptive, Expressive, and Community sub-scales, which require verbal competence. Verbal competence is generally compromised by hearing disorders. However, of the 47 participants with a sensorimotor disorder, only 6 (13%) were deaf and 7 (15%) had a hearing deficit. Conversely, written competence is generally compromised not only in individuals with hearing disorders but also in those with motor or visual disorders. Finally, regarding deficits in adaptive areas which require social competence, sensorimotor disorder participants achieved lower scores on the Vineland Socialization Scale and on the Vineland Coping Skills and Play and Leisure Time sub-scales. Only on the Vineland Interpersonal Relationships sub-scale were differences not observed; this result is in agreement with studies (e.g. Carvill, 2001; Pérez-Pereira & Conti-Ramsden, 1999) which have revealed that individuals with a sensorimotor disorder may develop alternative ways of relating with others.

Participants with mild-moderate MR obtained higher Vineland scores than those with severe-profound MR; thus, the discriminant validity of the Vineland Scales could be dependent upon MR level. Comparisons between sensorimotor disorder and control groups at different MR levels were therefore carried out. No statistically significant group × MR level interaction effects were observed on the Vineland Scale and sub-scale scores. The Vineland Scales are therefore able to reveal the adaptive area deficits in participants with mild and moderate MR as well as in those with severe and profound MR.

It is important to note that the assumptions of normality and of homogeneity of variance and covariance matrix were not always met. Several researchers have shown that both ANOVA and MANOVA are relatively robust to violations of these assumptions in cases, such as the present investigation, of equal or almost equal group size (Bray & Maxwell, 1993; Stevens, 1986). Moreover, the effect of the group variable on the Vineland scores was also verified with the non-parametric Wilcoxon test, which yielded the same results. Unfortunately, intelligence tests had not been employed with many participants of the present investigation because they had been found to be too severely disabled.
Their MR level had therefore been determined via clinical judgment of life functioning. Future research should replicate this study with individuals with an AAMR classification of MR, which is based on need of supports and not on IQ. It would also be of interest to verify the discriminant utility of the Vineland Scales with more homogeneous participants, with either a visual or a hearing disorder. On the other hand, the present investigation revealed that the Vineland Scales have a discriminant capability even though the disorder participants were heterogeneous regarding the sensorimotor disorder type.

The Vineland Scales measure the adaptive behavior deficits in persons with MR and a sensorimotor disorder. The construct of adaptive behavior comprises everyday living skills. Thus, the Vineland Scales can be used to evaluate the disability in individuals with MR and a sensorimotor disorder. There is no direct relation between disability and severity of impairment. Therefore, only a disability evaluation can reveal the most compromised living skills. The Vineland Scales are therefore useful, especially as they allow for the assessment of sensorimotor disabilities of persons with severe and profound MR. Moreover, in persons with severe and profound MR, sensorimotor disorders have a high incidence and are frequently overlooked (Carvill, 2001).

Generally, informal assessment procedures are used to evaluate disabilities. Examples of these procedures include direct and systematic observation of individuals while they are interacting with their physical and social environments, and interviews of the individual’s caregiver about the individual’s skills in natural life environments (Dunn, 1999; Moore, 1999; Mullen, 1999). If based on knowledge about the development of sensorimotor and perceptual skills, these informal assessments may be useful for a preliminary evaluation. On the other hand, informal procedures may not be reliable and valid. Thus, informal assessment may not be useful to plan personalized interventions. In contrast, the Vineland Scales can be used to obtain a valid and reliable measure of the strengths and weaknesses of an individual’s skills. The Vineland Scales may therefore be used to plan personalized interventions intended to improve the general life functioning of persons with MR.

REFERENCES

American Psychiatric Association (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.
Balboni, G., & Pedrabissi, L. (2003). Adattamento italiano delle Vineland Adaptive Behavior Scales [Vineland Adaptive Behavior Scales-Italian adaptation]. In: S. S. Sparrow, D. A. Balla & D. V. Cicchetti (Eds), Vineland Adaptive Behavior Scales. Florence, Italy: Organizzazioni Speciali.
Balboni, G., Pedrabissi, L., Molteni, M., & Villa, S. (2001). The discriminant validity of the Vineland Scales: Score profiles of individuals with mental retardation and a specific disorder. American Journal on Mental Retardation, 106(2), 162–172.
Bender, L. (1938). A visual motor test and its clinical use. Research Monograph, 3. New York, NY: American Orthopsychiatric Association.
Bray, J. H., & Maxwell, S. E. (1993). Multivariate analysis of variance. In: M. S. Lewis-Beck (Ed.), Experimental Design and Methods (pp. 337–408). London: Sage Publications Ltd.
Bruininks, R. H. (1978). Bruininks-Oseretsky test of motor proficiency. Circle Pines, MN: American Guidance Service.
Carter, A. S., Volkmar, F. R., Sparrow, S. S., Wang, J. J., Lord, C., Dawson, G., Fombonne, E., Loveland, K., Mesibov, G., & Schopler, E. (1998). The Vineland adaptive behavior scales: Supplementary norms for individuals with autism. Journal of Autism and Developmental Disorders, 28, 287–302.
Carvill, S. (2001). Sensory impairments, intellectual disability and psychiatry. Journal of Intellectual Disability Research, 45(6), 467–483.
Dunn, W. (1999). Assessment of sensorimotor and perceptual development. In: E. V. Nuttall, I. Romero & J. Kalesnik (Eds), Assessing and Screening Preschoolers (pp. 240–261). Boston, MA: Allyn and Bacon.



Folio, R., & Fewell, R. (1983). Peabody developmental motor scales. Allen, TX: DLM Teaching Resources.
Fuller, G. B., Awadh, A. M., & Vance, H. B. (1998). Assessing perceptual-motor skills. In: H. B. Vance (Ed.), Psychological Assessment of Children: Best Practices for School and Clinical Settings (pp. 277–296). New York, NY: John Wiley & Sons, Inc.
Golden, C. J. (1987). Luria-Nebraska neuropsychological battery: Children’s revision. Los Angeles, CA: Western Psychological Services.
Greenspan, S. (1999). What is meant by mental retardation? International Journal of Psychiatry, 11(1), 6–13.
Handen, B. L. (1997). Mental retardation. In: E. J. Mash & L. G. Terdal (Eds), Assessment of Childhood Disorders (pp. 369–407). New York, NY: Guilford Press.
Heal, L. W., & Tassé, M. J. (1999). The culturally individualized assessment of adaptive behavior: An accommodation to the 1992 AAMR definition, classification, and systems of support. In: R. L. Schalock (Ed.), Adaptive Behavior and its Measurement (pp. 185–205). Washington, DC: American Association on Mental Retardation.
Howell, D. C. (1997). Statistical methods for psychology (4th ed.). Washington, DC: Duxbury Press.
Luckasson, R., Borthwick-Duffy, S., Buntinx, W. H. E., Coulter, D. L., Craig, E. M., Reeve, A., Schalock, R. L., Snell, M. E., Spitalnik, D. M., Spreat, S., & Tassé, M. J. (2002). Mental retardation: Definition, classification, and systems of supports (10th ed.). Washington, DC: American Association on Mental Retardation.
Luckasson, R., Coulter, D. L., Polloway, E. A., Reiss, S., Schalock, R. L., Snell, M. E., Spitalnik, D. M., & Stark, J. A. (1992). Mental retardation: Definition, classification, and systems of supports (9th ed.). Washington, DC: American Association on Mental Retardation.
Marschark, M. (1993). Psychological development of deaf children. London: Oxford University Press, Inc.
Moore, M. C. (1999). Assessing the preschool child with visual impairment. In: E. V. Nuttall, I. Romero & J. Kalesnik (Eds), Assessing and Screening Preschoolers (pp. 360–380). Boston, MA: Allyn and Bacon.
Mullen, Y. (1999). Assessment of the preschool child with a hearing loss. In: E. V. Nuttall, I. Romero & J. Kalesnik (Eds), Assessing and Screening Preschoolers (pp. 340–259). Boston, MA: Allyn and Bacon.
Pérez-Pereira, M., & Conti-Ramsden, G. (1999). Language development and social interaction in blind children. Hove, East Sussex: Psychology Press Ltd.
Schalock, R. L. (1999a). The concept of adaptive behavior. In: R. L. Schalock (Ed.), Adaptive Behavior and its Measurement (pp. 1–5). Washington, DC: American Association on Mental Retardation.
Schalock, R. L. (1999b). Adaptive behavior and its measurement: Setting the future agenda. In: R. L. Schalock (Ed.), Adaptive Behavior and its Measurement (pp. 209–222). Washington, DC: American Association on Mental Retardation.
Silverstein, A. B. (1986). Statistical power lost and statistical power regained: The Bonferroni procedure in exploratory research. Educational and Psychological Measurement, 46, 303–307.
Sparrow, S. S., Balla, D. A., & Cicchetti, D. V. (1984). The Vineland adaptive behavior scales. Circle Pines, MN: American Guidance Service.
Stevens, J. (1986). Applied multivariate statistics for the social sciences. Hillsdale, NJ: Erlbaum.
Thompson, J. R., McGrew, K. S., & Bruininks, R. H. (1999). Adaptive and maladaptive behavior: Functional and structural characteristics. In: R. L. Schalock (Ed.), Adaptive Behavior and its Measurement (pp. 15–42). Washington, DC: American Association on Mental Retardation.



Widaman, K. F., Borthwick-Duffy, S. A., & Little, T. D. (1991). The structure and development of adaptive behaviors. International Review of Research in Mental Retardation, 17, 1–54. Widaman, K. F., & McGrew, K. S. (1996). The structure of adaptive behavior. In: J. W. Jacobson & J. A. Mulick (Eds), Manual of Diagnosis and Professional Practice in Mental Retardation (pp. 97–110). Washington, DC: American Psychological Association. World Health Organization (1992). International classification of diseases (10th ed.). Geneva, Switzerland: Author.

DOES IQ AND READING LEVEL INFLUENCE TREATMENT OUTCOMES? IMPLICATIONS FOR THE DEFINITION OF LEARNING DISABILITIES H. Lee Swanson ABSTRACT This chapter summarizes the quantitative literature on whether intervention outcomes for students with learning disabilities (LD) are influenced by variations in IQ and reading level. The analysis shows that a significant intelligence × reading level interaction emerges in treatment outcomes. Across a broad array of interventions, studies that include samples with reading and IQ scores between the 16th and 25th percentiles (standard scores between 84 and 91) yield significantly higher effect sizes than studies that include samples in the same low reading range but with higher IQ scores. Analyses of subsets of these data yield similar findings. Implications for definitions of learning disabilities that include measures of intelligence are discussed.



INTRODUCTION Since the inception of the field of learning disabilities, the classification of children with learning disabilities has been partly based on the presence of an aptitude (IQ)-achievement discrepancy (Reynolds, 1981). The implicit assumption behind the inclusion of discrepancy scores in the classification of learning disabilities was that children who experience reading, writing and/or math difficulties, unaccompanied by a low IQ, are distinct in cognitive processing from the general “run of the mill” poor, garden variety, slow, or low achievers (otherwise referred to as a low achieving non-discrepancy group). This assumption, however, is equivocal (e.g. Aaron, 1991, 1997). For example, a number of studies have compared children with discrepancies between IQ and reading with non-discrepancy defined poor achievers (i.e. children whose IQ scores are in the same low range as their reading scores) and found that these groups are more similar than different in processing difficulties (e.g. Fletcher, Francis, Rourke, Shaywitz & Shaywitz, 1992; Stanovich & Siegel, 1994). As a result, some researchers have advocated abandoning the concept of reading disability, or at least the requirement of average intelligence, in favor of a view in which children with reading problems are best conceptualized as existing at the extreme end of a continuum from poor to good readers (Fletcher, Shaywitz, Shankweiler, Katz, Liberman, Stuebing, Francis, Fowler & Shaywitz, 1994; Stanovich & Siegel, 1994). In addition, some researchers have argued that IQ is irrelevant to the definition of reading disabilities and that poor readers share cognitive deficits, irrespective of general cognitive abilities (Siegel, 1992). Hoskyn and Swanson (2000) analyzed the published literature comparing poor readers whose IQ scores were higher than their reading scores with poor readers whose IQ scores were commensurate with their reading scores. The synthesis failed to reject the null hypothesis of no overall difference between the two groups.
The findings were consistent with previous studies outside of the domain of reading which report on the weak discriminative power of discrepancy scores (e.g. Cronbach & Furby, 1970; Johns, 1981; Wall & Payne, 1973; Wanous & Lawler, 1972). Although the outcomes of Hoskyn and Swanson’s synthesis generally supported current notions about comparable processing among the discrepancy and non-discrepancy groups, they did find that IQ moderated effect sizes between the two groups. That is, although the degree of discrepancy between IQ and reading was irrelevant in predicting effect sizes, the magnitude of differences in performance (effect sizes) between the two groups was related to IQ. For example, they found that when the effect size difference between the discrepancy group (the reading disabled group) and the non-discrepancy group (low achievers in this case) on verbal IQ measures was greater than 1.00 (the mean verbal IQ of the reading disabled (RD) group was approximately 100 and the


verbal IQ mean of the low achieving (LA) group was approximately 85), the approximate mean effect size on various cognitive measures was 0.29. In contrast, when the effect size for verbal IQ was less than 1.00 (the mean verbal IQ for the RD group was approximately 95 and the verbal IQ mean for the LA group was approximately 90), estimates of effect size on various cognitive measures were less than 0 (M = −0.06). Thus, the further the RD group moved from IQs in the 80 range (the cut-off score used to select RD samples), the greater the chances that their overall performance on cognitive measures would differ from that of the low achievers. In short, although Hoskyn and Swanson’s (2000) synthesis supports the notion that differences between IQ and achievement are unimportant in predicting effect size differences on various cognitive variables (see Kavale & Forness, 1994, for a review), the magnitude of differences in IQ between these two ability groups did moderate general cognitive outcomes. Differences on measures between the two groups have also been found by Fuchs, Fuchs, Mathes and Lipsey (2000), Kavale and Forness (1994), and Kavale, Fuchs and Scruggs (1994). Given these findings, one may ask whether there is any conceptual or valid basis for maintaining IQ scores in the identification of children with learning disabilities. The question is critical because defining children with RD by performance in the normal range of intelligence has been considered one of the fundamental criteria for classification in the field. One obvious test that has been overlooked in the literature on whether IQ should be considered in the classification of learning disabilities is related to treatment outcomes. Unfortunately, there has been no comprehensive analysis of this issue in the literature to date. Although some studies have found very little relevance related to IQ levels within studies (e.g. 
Vellutino, Scanlon & Lyon, 2000), the question of whether IQ has relevance across an array of intervention studies has not been comprehensively studied. Responsiveness to instruction seems to be a missing test in the majority of studies comparing discrepancy and non-discrepancy groups (see Fuchs, Fuchs & McMaster, in press, for a review). Thus, the question this chapter explores is whether variations in “IQ” in poor readers are related to treatment outcomes. For example, does it matter, in terms of treatment outcomes, whether samples with learning disabilities have high or low IQ scores, or have large or minimal discrepancies between IQ and reading, or if such children are merely defined by cut-off scores? Quite simply, do variations in how samples with learning disabilities are defined in terms of intelligence and reading have any relationship to treatment outcomes? This is not a trivial question because recent efforts have been made to abandon the notion of a discrepancy between IQ and reading in defining learning disabilities. It would seem that efforts to completely disband the use of discrepancy definitions would be premature if groups of children meeting such criteria respond differently (quantitatively or qualitatively) to treatment. For example, it could be argued that


current practices of defining children by discrepancies in intelligence and achievement are relevant to treatment outcomes, and therefore are a valid means of classification. Perhaps one means of evaluating whether aptitude variations in learning disabled samples interact with treatment is to examine the relationship between treatment outcomes and multivariate data that include different configurations of how samples with learning disabilities are defined. This can be done by placing studies on the same metric (e.g. effect size) and comparing the magnitude of these outcomes as a function of variations in the sample definition (e.g. on measures of intelligence and reading). In this chapter, I review our findings on the relationship between definition and treatment outcome. In terms of treatment, we have consistently found that strategy and direct instruction models are the most robust in bolstering the performance of students with learning disabilities (Swanson & Hoskyn, 1998). Here, however, we review our findings related to studies that include samples with large differences in intelligence and reading. These two aptitude measures (intelligence and reading) were isolated for the present synthesis because they are the most frequently reported psychometric measures across all studies (Swanson et al., 1996; Swanson & Hoskyn, 1999).

OVERVIEW OF METHODS We will briefly review the methods for article analysis because they have been described in detail previously (e.g. Swanson & Hoskyn, 1998). The PsycINFO, MEDline, ERIC, and dissertation on-line data bases were systematically scanned for studies from 1963 to 1997 which met the inclusion criteria described below. In addition, every state department was sent a letter requesting technical reports on intervention studies for children and adolescents with learning disabilities. The pool of relevant literature was narrowed down to studies that utilized an experimental design in which children or adults with learning disabilities received treatment to enhance their academic, social, and/or cognitive performance. After a review of these studies, each databased report was evaluated on five additional criteria for study inclusion. These criteria include: (1) the study includes at least one control condition that includes participants with learning disabilities; (2) the study provides sufficient quantitative information to permit the calculation of effect sizes;



(3) the study reports mean IQ scores at average or higher (>84) or states that group IQ scores were in the average range; (4) the treatment group receives instruction, assistance, or therapy over at least three days which is over and above what they would have received during the course of their typical classroom experience; (5) the study is written in English. The domains of the dependent measures were coded into one of 17 general categories (e.g. reading comprehension, mathematics, writing). We drew upon the literature to operationalize the direct instruction (DI) and strategy instruction (SI) approaches. Based on a set number of instructional components, studies were classified into one of four models: Strategy + Direct Instruction (referred to as the Combined Model), DI-alone (DI), SI-alone (SI), and a Non-DI + Non-SI model that failed to reach a critical threshold of “reported” information. As a validity check on our classifications, we compared our classification of the treatment conditions with the primary author’s general theoretical model and/or the label attached to the treatment condition.

Overview of Results The primary index of effect size was Cohen’s (1988) d, weighted by the reciprocal of the sampling variance. The analyses yielded 180 group design studies which included 1,537 effect sizes comparing students with learning disabilities in the experimental condition with students with learning disabilities in the control condition. The mean effect size pooled within each study was 0.79 (SD = 0.52). The mean effect size for published studies was 0.83 (N = 155, SD = 0.54, weighted mean = 0.76) and the mean effect size for unpublished studies (dissertations, technical reports) was 0.57 (N = 25, SD = 0.32, weighted mean = 0.42). Because there was a significant difference in the magnitude of effect size related to publication outlet, χ² (1, N = 180) = 21.29, p < 0.001, we calculated the number of studies potentially left out of our synthesis (i.e. studies that report null results) that would threaten the overall magnitude of the effect size (i.e. the File Drawer Problem). We found that 971 group design studies reporting null results would have to have gone unretrieved before one could conclude that our selection of studies reflects a sampling bias. A prototypical intervention study includes 22.47 minutes (SD = 29.71) of daily instruction, 3.58 times a week (SD = 1.58), over 35.72 (SD = 21.72)


sessions. The mean sample size per study is 27.06 (SD = 40.15). The mean age of treated participants is 11.16 years (SD = 3.22).
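To make the effect size metric concrete, the following minimal Python sketch computes Cohen's d from group summary statistics, a common large-sample approximation to its sampling variance, and the mean weighted by the reciprocal of that variance. The function names and the three example studies are hypothetical and are not taken from the synthesis. The fail-safe calculation uses Orwin's formula as one possible approach; the chapter does not state which File Drawer procedure was used, so that choice is an assumption.

```python
import math


def cohens_d(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)


def sampling_variance(d, n_t, n_c):
    """Large-sample approximation to the sampling variance of d."""
    return (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))


def weighted_mean_effect(effects, variances):
    """Mean effect size weighted by the reciprocal of each sampling variance."""
    weights = [1.0 / v for v in variances]
    return sum(w * d for w, d in zip(weights, effects)) / sum(weights)


def orwin_fail_safe_n(k, mean_effect, criterion_effect):
    """Number of unretrieved null-result studies needed to pull the mean
    effect of k studies down to criterion_effect (Orwin's formula, assumed
    here; not necessarily the procedure used in the chapter)."""
    return k * (mean_effect - criterion_effect) / criterion_effect


# Hypothetical studies:
# (treatment mean, control mean, treatment SD, control SD, n_t, n_c)
studies = [
    (52.0, 45.0, 10.0, 10.0, 15, 15),
    (30.0, 27.0, 6.0, 6.0, 40, 40),
    (18.0, 12.0, 8.0, 8.0, 10, 10),
]
effects = [cohens_d(*s) for s in studies]
variances = [sampling_variance(d, s[4], s[5]) for d, s in zip(effects, studies)]

print("effect sizes:", [round(d, 2) for d in effects])
print("unweighted mean:", round(sum(effects) / len(effects), 2))
print("weighted mean:", round(weighted_mean_effect(effects, variances), 2))
print("fail-safe N:", round(orwin_fail_safe_n(k=10, mean_effect=0.6, criterion_effect=0.2)))
```

In this hypothetical data the largest study has the smallest effect, so the weighted mean falls below the unweighted mean. That is the same direction as the gap between the unweighted and weighted means reported for published and unpublished studies above.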

Psychometric Information Although the majority of studies had samples identified as learning disabled, studies varied tremendously on the criteria and detail for participant selection. In terms of reporting group mean scores on psychometric information, only 104 studies reported group mean scores for intelligence, 84 studies reported group mean scores on achievement scores in reading, and 22 studies reported group mean scores in mathematics. Beyond IQ, reading, and mathematics scores, psychometric information on other characteristics of the sample was infrequently reported ( DI Alone (LSM = 0.68, N = 47) = SI Alone (LSM = 0.72, N = 28) = Non-DI & Non-SI (LSM = 0.62, N = 43)).

Reporting of IQ and Reading Table 1 shows the variations in effect size as a function of sample characteristics. The table shows the total sample of students with learning disabilities, number of studies in each category (K), unweighted effect size, standard deviation and weighted effect size, averaged within each study. A Chi-square, based on the Weighted Least Squares (WLS) analysis (see Hedges & Olkins, 1985), was computed for each category. Because the degrees of freedom were greater than 1,

Table 1. Effect Size Estimates as a Function of Sampling Variables.

                                                        Sample Size     K   Mean     SD   Weighted Mean
All studies (outliers removed)                                4,871   180   0.79   0.52            0.61

Amount of reported psychometric information on sample
  1. No information on intelligence & reading                 2,560    73   0.83   0.50            0.82
  2. Intelligence                                             1,111    55   0.80   0.58            0.62
  3. Intelligence & reading                                     849    39   0.76   0.54            0.63
  4. Intelligence & reading & mathematics                       349    13   0.66   0.28            0.60

Intelligence
  1. > 84 & < 92                                              1,464    69   0.77   0.57            0.63
  2. No information                                           2,822    86   0.82   0.50            0.77
  3. > 91                                                       584    25   0.79   0.48            0.66

Reading severity
  1. < 85                                                       771    35   0.86   0.52            0.71
  2. > 84 & < 91                                                127     9   0.57   0.39            0.51
  3. No information                                           3,629   122   0.80   0.54            0.73
  4. > 90                                                       293    14   0.69   0.44            0.55

Notes: Some columns do not add up to the total because sample size varies because of attrition (mortality), the number of subjects in each condition cannot be clearly distinguished, and/or a missing cell related to the methodological composite score emerged. K = number of studies.


a Scheffé test was used to make comparisons between the estimates of the mean effect size. As shown in Table 1, studies were categorized by the amount of psychometric information reported. Four categories were developed for comparisons (no information, standardized intelligence test scores, standardized intelligence test scores + standardized reading test scores, and standardized intelligence test scores + reading scores + mathematics scores). A significant difference in the weighted effect size was found between the categories, χ² (3, N = 180) = 13.50, p < 0.01. A Scheffé test indicated that those studies which provided no psychometric information on the learning disabled sample produced significantly ( p < 0.01) larger effect sizes than those studies that reported intelligence, reading, and/or mathematics scores. No significant differences (all ps > 0.05) were found between those studies that reported only intelligence scores and those that reported intelligence scores together with reading and/or mathematics scores. The general pattern was that studies that failed to report psychometric information on participants with learning disabilities yielded significantly higher effect sizes than those studies that reported psychometric information. Our best explanation for this pattern is that poorly defined samples inflated treatment outcomes by introducing greater heterogeneity into the sample when compared to studies that selected samples based on psychometric criteria.
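The between-category chi-square can be illustrated with a short sketch. It assumes the standard fixed-effects formulation in which each study is weighted by the reciprocal of its sampling variance and the statistic sums the weighted squared deviations of the category means from the grand mean. The chapter does not spell out its exact computation, so this is an illustration rather than a reproduction. The category data are hypothetical; the critical values are standard tabled chi-square values.

```python
# Tabled chi-square critical values, keyed by (degrees of freedom, alpha).
CHI2_CRITICAL = {
    (1, 0.001): 10.828,
    (2, 0.05): 5.991,
    (3, 0.01): 11.345,
    (6, 0.05): 12.592,
}


def weighted_mean(pairs):
    """Inverse-variance weighted mean of (effect size, sampling variance) pairs."""
    weights = [1.0 / v for _, v in pairs]
    return sum(w * d for w, (d, _) in zip(weights, pairs)) / sum(weights)


def q_between(categories):
    """Between-category homogeneity statistic.

    categories maps a label to a list of (effect size, sampling variance)
    pairs. Under the null hypothesis of equal category means the statistic
    is approximately chi-square with len(categories) - 1 degrees of freedom.
    """
    all_pairs = [p for pairs in categories.values() for p in pairs]
    grand_mean = weighted_mean(all_pairs)
    q = 0.0
    for pairs in categories.values():
        category_weight = sum(1.0 / v for _, v in pairs)
        q += category_weight * (weighted_mean(pairs) - grand_mean) ** 2
    return q


# Hypothetical effect sizes and variances for two reporting categories.
example = {
    "no psychometric information": [(0.90, 0.05), (0.80, 0.04)],
    "intelligence reported": [(0.55, 0.05), (0.65, 0.04)],
}
q = q_between(example)
print(f"Q_between = {q:.2f} on {len(example) - 1} df")

# Statistics reported in the chapter, checked against the tabled values.
for label, df, alpha, value in [
    ("publication outlet", 1, 0.001, 21.29),
    ("psychometric information", 3, 0.01, 13.50),
]:
    print(label, "exceeds critical value:", value > CHI2_CRITICAL[(df, alpha)])
```

Both reported statistics exceed their tabled critical values at the stated significance levels, which is consistent with the p values given in the text.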

IQ and Reading Levels in Isolation Given that the reporting of psychometric information was related to effect size, the sample characteristics were further categorized by the reported range of intelligence scores and the reported range of reading scores. Three categories for comparison were created for intelligence: those studies that reported mean standardized intelligence scores between 85 and 91, those that reported mean standardized intelligence scores greater than 91, and those that did not report standardized information. If studies provided multiple IQ scores (verbal, performance, nonverbal, etc.), these scores were averaged within studies. The weighted mean effect size as a function of intellectual category was significant, χ² (2, N = 180) = 7.43, p < 0.05. As shown in Table 1, a Scheffé test indicated that the highest effect sizes occurred when no information about IQ was presented ( ps < 0.05) when compared to the other conditions. No significant differences in effect size emerged between studies that reported high-average or low-average IQs ( ps > 0.05). Thus, the results showed that whether IQ scores were reported, rather than the level of the reported IQ, moderated treatment outcomes. The next category considered in our sample analysis was reading severity. The majority of studies that reported reading scores included measures of word



recognition. If multiple standardized reading measures were provided in the study, reading scores were averaged across word recognition and reading comprehension. Four categories of reading level were created for comparisons: scores below 85, scores above 84 and less than 91, scores greater than 90, and no standardized scores reported. The weighted mean effect size as a function of reading category was significant, χ² (3, N = 180) = 6.36, p < 0.05. A Scheffé test indicated that effect sizes for studies which reported scores below 85 were comparable to those for studies that reported no scores ( p > 0.05). The lowest effect sizes occurred for studies that reported reading scores above 84 and less than 91 and for studies that reported scores above 90 when compared to the other conditions ( ps < 0.05). The importance of this finding for readers in the 84–91 range will be clarified below.
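The four reading categories amount to a simple classification rule. The sketch below is one way to express it, assuming that each study's mean standardized reading scores are averaged and rounded to a whole standard score before the cut-offs are applied (the rounding step is an assumption; the chapter does not say how fractional means were handled). The function name and category labels are hypothetical.

```python
from statistics import mean


def reading_category(scores):
    """Assign a study to one of the four reading-level categories.

    scores is a list of the study's mean standardized reading scores
    (for example word recognition and reading comprehension), or an
    empty list / None when no standardized scores were reported.
    """
    if not scores:
        return "no standardized scores reported"
    average = round(mean(scores))  # assumption: round to a whole standard score
    if average < 85:
        return "below 85"
    if average < 91:
        return "above 84 and less than 91"
    return "greater than 90"


print(reading_category([82, 80]))
print(reading_category([88]))
print(reading_category([95, 91]))
print(reading_category(None))
```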

Interaction of IQ × Reading Level Although reported intelligence and reading level each had minimal influence on the magnitude of effect size in isolation (see Table 1), the two aptitude variables in combination did influence treatment outcome. A 3 (intelligence: > 90, 85–90, no information) × 4 (reported reading severity: < 85, 85–90, > 90, and no information) weighted least squares (WLS) regression analysis (analogous to an ANOVA or ANCOVA) was computed. The unit of analysis was the aggregated effect size for each study. When the methodology composite score was partialed from the analysis, significant effects were isolated to the severity of reading × intelligence interaction, χ² (6, N = 179) = 13.22, p < 0.05. The WLS means are provided below. To simplify the analysis, we dropped studies that provided no information on IQ since it has already been established that higher effect sizes emerge when IQ is not reported. The important results related to the interaction are as follows: (1) IQ was irrelevant when reading scores were not reported. That is, within the reading category of no score information (no reported information), the LSMs were 0.63 and 0.64 for the intelligence categories of IQ > 90 and IQ 85–90, respectively. No significant differences emerged between means ( p > 0.05). (2) IQ was irrelevant at the severest reading levels ( p > 0.05). Within the severe reading category (reading score < 85), the WLS means for the intelligence categories of IQ > 90 and IQs between 85 and 90 did not differ significantly. (3) Intelligence was relevant when reading disabilities were defined as occurring between the 16th and 25th percentile ( p < 0.001). That is, when reading levels were between a standard score of 84 and lower than 91 the WLS means


were 0.52 and 0.95 for intelligence categories of IQ > 90 and IQs 85–90, respectively. (4) Intelligence was relevant for high readers. The combination of average reading (reading scores above the 25th percentile) and low intelligence (IQ scores less than a standard score of 90 but higher than 84) produced the lowest outcomes ( p < 0.001). For the average reading category (reading score > 90) the WLS means were 0.68 and 0.05 for intelligence categories of IQ > 90 and IQ 85–90, respectively. In summary, our results show that, across an array of studies, variations in IQ and reading moderate treatment outcomes. Most critically, the results show that treatment outcomes for groups of children whose aggregated reading scores are in the 16th to 25th percentile range (standard scores between 85 and 90) are influenced by variations in IQ.
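The correspondence between percentile ranks and standard scores used in this section can be checked with a short sketch. It assumes the usual standard score scale with a mean of 100 and a standard deviation of 15 and normally distributed scores; the chapter does not state the scale explicitly, so this is an assumption, and the function names are hypothetical.

```python
from statistics import NormalDist

# Assumed standard-score scale: mean 100, standard deviation 15.
STANDARD_SCORES = NormalDist(mu=100, sigma=15)


def percentile_to_standard_score(percentile):
    """Standard score at the given percentile rank (0 < percentile < 100)."""
    return STANDARD_SCORES.inv_cdf(percentile / 100)


def standard_score_to_percentile(score):
    """Percentile rank of a standard score under the assumed normal scale."""
    return 100 * STANDARD_SCORES.cdf(score)


for pct in (16, 25):
    print(pct, round(percentile_to_standard_score(pct)))
```

Under these assumptions the 16th and 25th percentiles round to standard scores of 85 and 90, which matches the range quoted in the summary above.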

Data Subsets Clearly the data set in the aforementioned analysis was diverse across age and academic domains. Thus, we briefly report on two analyses on subsets of the data. One subset of this data has been reported on adolescents (Swanson, 2001). In this analysis, studies with samples that yield a mean chronological age SI-only = Non-SI & Non-DI). Effect sizes for the DI-only model were significantly (p < 0.05) higher when compared to studies that included the combination model when cut-off scores could not be computed (DI-only > Non-DI & Non-SI = SI-only > Combined).



For reading comprehension studies, the regression analysis indicated that significant effects occurred for high IQ discrepancy defined samples (Mean IQ = 97.77; SD = 2.60; Mean reading score = 79.83, SD = 6.68) vs. studies in which discrepancies were in the low IQ range (Mean IQ = 90.28; SD = 2.13; Mean reading score = 81.32, SD = 8.22) or could not be computed, χ² (2, N = 58) = 6.54, p < 0.05. (Note: the large standard deviation for reading reflected the fact that mean scores for IQ were in the same range as mean reading scores.) The results indicated that those studies in which discrepancies between IQ and reading could not be computed (N = 42, M = 0.84) yielded larger effect sizes than those where group discrepancies could be computed. However, a Scheffé test indicated that, among studies in which discrepancies could be computed, effect sizes were significantly (p < 0.05) larger (N = 8, M = 0.67) for studies that reported low average range IQs than for studies that reported IQs > 90 (N = 8, WLS M = 0.49). No significant interaction emerged related to type of treatment. In summary, the contrast variables that did not yield a significant relationship ( ps > 0.05) to the magnitude of effect size were (1) those that compared the degree of discrepancy (studies with mean scores in IQ and reading >15 and those 91) but reading scores in the same low range (scores between 84 and 91). Although these findings are not related to a particular type of treatment, they support the notion that greater responsiveness to treatment emerges in studies whose samples have mean intelligence and reading scores in the same low range. These groups may be referred to, for lack of a better term, as non-discrepancy or low discrepancy groups. Two other important findings emerge when we consider subsets of the Swanson and Hoskyn (1998) data set. 
First, we find that adolescent samples with discrepancies in intelligence and reading are more likely to yield lower effect sizes than those studies that report aggregated IQ and reading scores in the same low range. This puts a new wrinkle on the literature which has called for the elimination of “discrepancy” criteria in classifying learning disabled students by suggesting that discrepancies may be important in predicting treatment outcomes. Second, we find that treatment measures related to reading recognition and comprehension vary as a function of IQ. Effect sizes for word recognition studies were significantly related to samples defined by cut-off scores (IQ > 85 and reading < 25th percentile), whereas the magnitude of effect size for reading comprehension studies were sensitive to discrepancies between IQ and reading when compared to competing definitional criteria. What are the implications of our findings to definitional issues within the field of learning disabilities? The obvious implication is that IQ has relevance to any definition of learning disabilities. Our results indicate that groups of students with learning disabilities who have aptitude profiles similar to generally poor achievers or slow learners (low IQ and low reading), produced higher effect sizes than those samples with a discrepancy between IQ and reading. Discrepancy in this context is defined as those studies that report aggregated sample mean IQ scores at a higher level than aggregated sample mean low reading (70% of the studies) stated by the primary authors was that the participants exhibit a discrepancy between academic performance and intellectual performance. However, we found this variable problematic because most studies report discrepancies across all sorts of achievement domains. Unfortunately, we did not test whether broadly defined samples vs. 
narrowly defined discrepancy defined samples provide greater external validity for intervention effectiveness (although the data are available to us). Instead, we sorted studies by a priori sample characteristics. Based on the literature, we imposed our own definitions of learning disabilities for comparisons between studies rather than relying on federal, state, or particular school district definitions. Although there should have been greater homogeneity in our sample selection in terms of matching subject characteristics to the specific treatments (e.g. children with learning disabilities exhibit clear deficits in reading as opposed to mathematics performance when placed in a reading intervention program), we


assume those learning difficulties in the present studies bear a logical relationship to the target intervention. Our synthesis suggests, however, that what appears to moderate treatment outcomes are levels of severity related to intelligence and reading. In conclusion, the results indicated that studies that include samples in the slow learner range yield higher effect sizes than studies that include samples with the same level of reading scores but higher IQs. No doubt, future research attempts to validate the usefulness of potential-achievement discrepancies in the identification of children with learning disabilities rest on meeting some assumptions. These assumptions, as described in Hoskyn and Swanson (2000), focus on construct integrity, stochastic independence, and face validity. However, one critical assumption they mention in testing the validity of a definition relates to how children who fit a discrepancy definition respond to treatment. We now have evidence on this issue. If it can be independently confirmed that children with relatively high IQ scores but low reading scores are less responsive to treatment than children who have IQ and reading scores in the same low range, then the removal of IQ from the classification of children with learning disabilities is clearly premature.

ACKNOWLEDGMENTS The data analysis reported in this chapter was supported by a U.S. Department of Education Grant (H023E40014), the Chesapeake Institute, and Peloy Endowment Funds. The views of this report do not necessarily reflect the U.S. Office of Education or the Chesapeake Institute. This data set is part (part 3: influence of definitions on treatment outcomes) of a 700 page report submitted as a final report to the U.S. Department of Education. A general overview of the synthesis and a detailing of the instructional components has been reported elsewhere (Swanson & Hoskyn, 1998, 2000). I am most indebted to the assistance of two Research Associates: Carole Sachse-Lee and Maureen Hoskyn for data entry and coding. Author address H. Lee Swanson, School of Education, University of California, Riverside, CA, 92521.

REFERENCES Aaron, P. G. (1991). Can reading disabilities be diagnosed without using intelligence tests? Journal of Learning Disabilities, 24, 178–186. Aaron, P. G. (1997). The impending demise of the discrepancy formula. Review of Educational Research, 67, 461–502.



Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). New York: Academic Press. Cronbach, L., & Furby, L. (1970). How we should measure “change” – or should we? Psychological Bulletin, 74, 68–80. Fletcher, J. M., Francis, D. J., Rourke, B. P., Shaywitz, S. E., & Shaywitz, B. A. (1992). The validity of discrepancy-based definitions of reading disabilities. Journal of Learning Disabilities, 25, 555–561. Fletcher, J. M., Shaywitz, S. E., Shankweiler, D. P., Katz, L., Liberman, I. Y., Stuebing, K. K., Francis, D. J., Fowler, A. E., & Shaywitz, B. A. (1994). Cognitive profiles of reading disability: Comparisons of discrepancy and low achievement definitions. Journal of Educational Psychology, 86, 1, 6–23. Fuchs, D., Fuchs, L. S., Mathes, P. G., & Lipsey, M. W. (2000). Reading differences between low achieving students with and without learning disabilities. In: R. Gersten, E. P. Schiller & S. Vaughn (Eds), Contemporary Special Education Research: Synthesis of Knowledge Base on Critical Instructional Issues. Mahwah, NJ: Erlbaum. Fuchs, D., Fuchs, L. S., & McMaster, K. N. (in press). Identifying children at risk for reading failure: Curriculum-based measurement and the dual discrepancy approach. In: H. L. Swanson, K. Harris & S. Graham (Eds), Handbook of Learning Disabilities. NY: Guilford. Hedges, L. V., & Olkins, I. (1985). Statistical methods for meta-analysis. San Diego, CA: Academic Press. Hoskyn, M., & Swanson, H. L. (2000). Cognitive processing of low achievers and children with reading disabilities: A selective review of the published literature. School Psychology Review, 29, 102–119. Johns, G. (1981). Difference score measures of organizational behavior variables: A critique. Organizational Behavior and Human Performance, 27, 44–463. Kavale, S., & Forness, S. R. (1994). Learning disabilities and intelligence: An uneasy alliance. In: T. Scruggs & M. Mastropieri (Eds), Advances in Learning and Behavioral Disabilities (Vol. 8, pp. 
1–64). Greenwich, CT: JAI press. Kavale, K., Fuchs, D., & Scruggs, T. E. (1994). Setting the record straight on learning disability and low achievement: Implications for policy making. Learning Disabilities Research & Practice, 9, 70–77. Mercer, C. D., Jordan, L., Allsopp, D. H., & Mercer, A. R. (1996). Learning disabilities definitions and criteria used by stated education departments. Learning Disability Quarterly, 19, 217–232. Reynolds, C. R. (1981). The fallacy of two years below grade level for age as a diagnostic criterion for reading disorders. Journal of School Psychology, 11, 250–258. Siegel, L. S. (1992). An evaluation of the discrepancy definition of dyslexic. Journal of Learning Disabilities, 25, 618–629. Stanovich, K. E., & Siegel, L. S. (1994). Phenotypic performance profile of children with reading disabilities: A regression-based test of the phonological-core variable-difference model. Journal of Educational Psychology, 91, 24–53. Swanson, H. L. (1991). Operational definitions of learning disabilities. Learning Disability Quarterly, 14, 242–254. Swanson, H. L. (1999). Reading research for students with reading disabilities: A meta-analysis of intervention outcomes. Journal of Learning Disabilities, 32, 504–532. Swanson, H. L. (2001). Research on the intervention for adolescents with learning disabilities: A meta-analysis of outcomes related to high-order processing. Elementary School Journal, 101, 331–348.


Swanson, H. L., Carson, C., & Sachse-Lee, C. M. (1996). A selective synthesis of intervention research for students with learning disabilities. School Psychology Review, 25, 370–391. Swanson, H. L., & Hoskyn, M. (1998). Experimental intervention research on students with learning disabilities: A meta-analysis of treatment outcomes. Review of Educational Research, 68, 277–321. Swanson, H. L., & Hoskyn, M. (1999). Definition × treatment interactions for students with learning disabilities. School Psychology Review, 28, 644–658. Vellutino, F. R., Scanlon, D. M., & Lyon, G. R. (2000). Differentiating between difficult-to-remediate and readily remediated poor readers: More evidence against the IQ-achievement discrepancy definition of reading disability. Journal of Learning Disabilities, 33, 223–238. Wall, T. D., & Payne, R. (1973). Are deficiency scores deficient? Journal of Applied Psychology, 58, 322–326. Wanous, J. P., & Lawler, E. E., III (1972). Measurement and meaning of job satisfaction. Journal of Applied Psychology, 56, 95–105.

TEST ANXIETY, PERCEIVED COMPETENCE, AND ACADEMIC ACHIEVEMENT IN SECONDARY SCHOOL STUDENTS

Daniela Lucangeli and Thomas E. Scruggs

ABSTRACT

This investigation was intended to examine the relationships among perceived competence, anxiety, and mathematical and verbal achievement in a population of male and female Italian secondary school students. One hundred and eighty students completed a measure of trait anxiety, and measures of state anxiety were administered immediately before achievement tests in math and literature. In addition, students were administered six subscales of a perceived competence scale. Analyses of these data yielded a moderate negative correlation between mathematics achievement and state anxiety for the math test, and a descriptively smaller negative correlation between the literature scores and state anxiety for the literature test. Significant correlations were also observed between achievement and perceived competence for academic ability. The two state anxiety measures were found to be highly correlated; however, trait anxiety was not statistically related to academic achievement in either math or literature. A moderate negative correlation was observed between perceived competence for academic ability and state anxiety for math and a somewhat lower correlation between perceived competence for academic ability and literature achievement. Males scored higher than females on the test of trait anxiety; however, females and males did not differ on any other anxiety or academic measures, including perceived competence for academic ability, math achievement, or literature achievement. Implications for future research are discussed.

INTRODUCTION

The relations among achievement and affective variables such as perceived competence and anxiety, and the interactions of these variables with grade level and gender, have long interested researchers (e.g. Harter, 1999). Previous literature (e.g. Norwich, 1987; Skaalvik & Rankin, 1990) has suggested that academic assessment in math can have a strong effect on anxiety and self-esteem of adolescents (Zohar, 1998). These effects may be more pronounced in female adolescent students, who typically score somewhat higher on anxiety measures, and somewhat lower on math achievement and self-esteem measures (Harter, 1999). Skaalvik (1986), for example, reviewed 29 studies on global self-esteem, and concluded that males generally had higher scores on self-esteem measures. Race/ethnicity and disability status, as well as gender, have also been found to be associated with self-esteem (e.g. Scruggs & Mastropieri, 1983).

Skaalvik (1990) studied achievement and self-esteem among 6th grade Norwegian boys and girls. The girls exhibited higher achievement in English and Norwegian subjects, but did not differ from boys in success expectations for, or achievement in, mathematics. Further, a difference in general academic self-esteem was not observed. Skaalvik and Rankin (1994) investigated gender differences in math achievement, verbal self-concept, perception of ability, and motivation in a sample of sixth and ninth grade Norwegian students, in order to determine whether differences in non-academic areas were larger than could be explained by achievement differences. Skaalvik and Rankin (1994) reported that boys had higher math self-concept and perceptions of math ability than girls, but did not score significantly higher on math achievement tests.

The relation between achievement and situational anxiety such as test anxiety has also long interested researchers (Spielberger & Vagg, 1995). Recently, Zohar (1998) suggested that test anxiety could be studied as a within-subjects factor (e.g. anxiety for tests of different subjects) as well as a between-subjects factor (e.g. males vs. females), and provided evidence that anxiety before tests can be composed of an additive function of situational and dispositional factors.


Clearly, achievement in mathematics and other subject areas is influenced by such variables as development (Girelli, Lucangeli & Butterworth, 2000), motivation (Lucangeli & Pedrabissi, 1997), and metacognition (Lucangeli & Cornoldi, 1997; Lucangeli, Tressoldi & Cendron, 1998). The present investigation, however, was concerned with the relations among mathematics and verbal achievement, anxiety, and perceived competence, across gender and grade level in a sample of Italian adolescents. Results of this investigation were intended to provide additional information relevant to international studies of mathematics and other areas of academic achievement, perceived competence, and anxiety.

METHOD

Participants

Participants were 180 Italian students (110 male, 70 female) in secondary schools (“scuola superiore”), with normal IQ and without specific learning disabilities. Sixty-six were in year I (7th grade, 14 years old), 53 were in year II (9th grade, 16 years old), and 61 were in year III (11th grade, 18 years old). Students were administered tests of state anxiety immediately prior to administration of mathematics tests and literature tests. Students were also administered a variety of other measures.

Measures

The demographic variables gender and grade level were included in the analysis. Other data sources included the following:

Anxiety measures. The trait anxiety measure was an Italian translation of the Trait Anxiety form of the Spielberger State-Trait Anxiety Inventory (STAI; Spielberger, Gorsuch & Lushene, 1970). Sample items include: “I tire easily,” “I have negative thoughts,” and “I’m proud of myself,” to which students answer, “never,” “sometimes,” “often,” or “always.” Alpha reliability of the scale was 0.76. The state anxiety measure was an Italian translation of the State Anxiety form of the STAI. This measure includes such items as: “I feel relaxed,” “I feel anxious,” “I feel confused,” “I feel proud,” to which students reply, “no,” “a little,” “some,” or “a lot.” This measure has frequently been used in research on test anxiety and has been found to be a sensitive indicator of anxiety during the test-taking experience (Zohar, 1998). In the present investigation, alpha reliability was 0.78 for the math test and 0.80 for literature. The state anxiety tests were administered immediately prior to administration of math and literature tests.

Perceived competence. Individual self-esteem measures were obtained from an Italian translation of the Perceived Competence Scales for Children (PCSC; Harter, 1982), consisting of six subscales measuring academic ability, athletic ability, social behavior, physical characteristics, social abilities, and general self-esteem. On this scale, students characterize as “like me” or “unlike me” statements such as “Some students have a lot of friends” (social ability), “Some students are proud of themselves” (general), and “Some students often forget what they learn” (academic ability). These scales have been used extensively in measuring perceived competence and self-esteem in students (Harter, 1999). Alpha reliabilities of the subscales used in this investigation ranged between 0.65 and 0.71.

Academic achievement. Academic achievement was assessed by means of individual performance on standardized Italian mathematics achievement tests. These tests contain content that is specific to each grade level. Individual performance on literature was assessed by means of standardized tests involving both written and oral examinations.

RESULTS

Correlations among all achievement, self-esteem, and anxiety measures are given in Table 1. A moderate negative correlation was observed between mathematics achievement and state anxiety for the math test, r = −0.549, p < 0.001. That is, lower math achievement scores were associated with higher state anxiety scores for math. A similar, but descriptively lower, correlation was also observed between the literature scores and state anxiety for the literature test, r = −0.463, p < 0.001. The two state anxiety measures, for math and for literature, were found to be highly correlated, r = 0.832, p < 0.001, suggesting that most of the variance in the two measures was shared. However, the dimension of trait anxiety was not significantly correlated with academic tests in math or literature. Significant but lower correlations also were observed between math achievement and perceived competence in academic ability, r = 0.384; and math achievement and general self-esteem, r = 0.214, ps < 0.05. A significant correlation was also observed between the literature achievement scores and perceived competence in academic ability, r = 0.331, p < 0.05. These results suggest that there is a relationship between academic ability and perceived competence in academic ability, as well as a relationship between academic achievement and the state dimension of anxiety. A moderate negative correlation was found between perceived competence in academic ability and state anxiety for math (r = −0.679), and a descriptively lower correlation between perceived competence in academic ability and state anxiety for literature (r = −0.531), both ps < 0.001.

Table 1. Correlation Matrix.

Variable           1       2       3       4       5       6       7       8       9      10      11
1. Ath.            –
2. Abil.       0.057       –
3. Behav.      0.040   0.277       –
4. Phys.       0.050   0.202   0.508       –
5. Self        0.120   0.420   0.236   0.310       –
6. Social      0.135   0.323   0.135   0.247   0.750       –
7. Ach.–L      0.013   0.331   0.215   0.098   0.151   0.141       –
8. Ach.–M      0.037   0.384   0.105   0.043   0.214   0.115   0.313       –
9. State–L    −0.059  −0.531  −0.263  −0.162  −0.281  −0.271  −0.612  −0.463       –
10. State–M    0.013  −0.679  −0.230  −0.140  −0.312  −0.292  −0.515  −0.549   0.832       –
11. Trait     −0.038  −0.272  −0.589  −0.698  −0.329  −0.274  −0.116  −0.137   0.227   0.277       –

Note: N = 180 for all correlations. Correlations of 0.140 or greater in absolute value are statistically significant (p < 0.05). Ath. = Harter Athletics Subscale; Abil. = Harter Ability Subscale; Behav. = Harter Behavior Subscale; Phys. = Harter Physical Subscale; Self = Harter Self Esteem Subscale; Social = Harter Social Subscale; Ach.–L = Literature Achievement; Ach.–M = Math Achievement; State–L = Spielberger State Anxiety (Literature); State–M = Spielberger State Anxiety (Math); Trait = Spielberger Trait Anxiety.
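The statement that most of the variance in the two state anxiety measures was shared rests on squaring the correlation coefficient. As a minimal sketch (the function name and dictionary are illustrative, and the values are simply the correlations reported in the text, not recomputed from raw data), the proportion of shared variance can be obtained as follows:

```python
# Correlations copied from the Results text; nothing here is recomputed from raw data.
reported_r = {
    "state anxiety (math) vs. state anxiety (literature)": 0.832,
    "math achievement vs. state anxiety (math)": -0.549,
    "math achievement vs. perceived academic competence": 0.384,
}


def shared_variance(r: float) -> float:
    """Return r squared, the proportion of variance two measures share."""
    return r * r


for label, r in reported_r.items():
    print(f"{label}: r = {r:+.3f}, r^2 = {shared_variance(r):.3f}")
```

On this reading, r = 0.832 corresponds to roughly 69% shared variance, whereas the correlation between math achievement and math state anxiety corresponds to about 30%.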

Gender Differences

Males scored higher on the test of trait anxiety (M = 43.0, SD = 9.2) than females (M = 39.7, SD = 9.6), t(178) = −2.30, p = 0.023. However, females and males did not differ on any other measures, including state anxiety for math or literature, academic achievement, or any of the six self-esteem subscales (all ps > 0.05). Descriptive data and relevant t statistics are presented in Table 2.
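The gender comparison on trait anxiety can be checked from the summary statistics alone. The sketch below assumes an independent-samples t-test with pooled (equal) variances, which the chapter does not state explicitly; the function name is illustrative, and the means, standard deviations, and group sizes are those given in Table 2.

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t from summary statistics (equal variances assumed)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Trait anxiety, females (n = 70) versus males (n = 110), values from Table 2.
t, df = pooled_t(39.70, 9.59, 70, 42.99, 9.21, 110)
print(f"t({df}) = {t:.2f}")
```

Under this assumption the calculation gives 178 degrees of freedom and a t value that agrees, to rounding, with the −2.30 reported in Table 2.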

Grade Level Differences

A one-way analysis of variance (ANOVA) across grade levels revealed significant differences only in the perceived competence dimensions of athletics, F(2, 177) = 3.46, p = 0.034, and social abilities, F(2, 177) = 3.94, p = 0.021, but these variables were not found to be significantly correlated with academic achievement. Contrary to much previous research, females and males did not differ on any perceived competence dimension, and perceived competence for females considered independently did not decline across grade levels.

Table 2. Gender Differences.

Variable             Females M (SD)    Males M (SD)      t(178)
Mathematics Ach.      5.87 (0.98)       6.11 (0.91)      −1.67
Literature Ach.       6.20 (1.00)       6.26 (0.93)      −0.44
Harter scales
  Athletics          12.86 (6.00)      12.76 (5.89)       0.10
  Academic           12.90 (5.41)      12.30 (5.18)       0.75
  Behavior           14.77 (4.14)      13.55 (4.31)       1.87
  Physical           14.48 (3.83)      13.69 (4.33)       1.26
  Self-esteem        12.67 (4.34)      13.71 (4.79)      −1.48
  Social             12.64 (4.93)      13.30 (4.96)      −0.87
State Anx.–Math      49.24 (18.16)     49.34 (15.80)     −0.04
State Anx.–Lit.      44.23 (18.61)     43.64 (15.59)      0.23
Trait anxiety        39.70 (9.59)      42.99 (9.21)      −2.30*

Note: For all comparisons, N(females) = 70; N(males) = 110. * p < 0.05.

Math versus Literature Anxiety

To further explore the role of mathematics anxiety, scores on the state anxiety tests administered before the math test and before the written essay test in literature were compared. A t-test revealed that, across all students, state anxiety was higher prior to the math task (M = 49.30, SD = 16.7) than prior to the written literature test (M = 43.87, SD = 16.7), t(179) = 35.06, p < 0.001.

“Good” versus “Poor” Achievers

Math and literature scores were divided into “good” (scores of 6 or above, considered criterion performance in Italy) and “poor” (scores lower than 6). Students with poor scores in math had lower perceived competence scores in academics [M(good) = 13.40, SD = 5.22; M(poor) = 10.00, SD = 4.57], t(178) = 3.94, p < 0.001, had lower self-esteem [M(good) = 13.87, SD = 4.70; M(poor) = 11.70, SD = 4.08], t(178) = 2.79, p < 0.01, and were more anxious before the math test, as revealed by the state anxiety test [M(good) = 44.96, SD = 15.32; M(poor) = 61.96, SD = 14.02], t(178) = −6.63, p < 0.001. There was no statistical interaction with gender.

Students with good literature scores had higher perceived competence for academic ability [M(good) = 13.16, SD = 5.31; M(poor) = 9.94, SD = 4.19], t(178) = 3.34, p = 0.001, as well as for social abilities [M(good) = 13.42, SD = 5.15; M(poor) = 11.49, SD = 3.62], t(178) = 2.10, p = 0.037. Participants with poor scores were also more anxious before the literature test [M(good) = 40.13, SD = 15.03; M(poor) = 59.34, SD = 14.84], t(178) = −6.80, p < 0.001. Again, there was no statistical interaction with gender.
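The grouping rule described above (scores of 6 or above are “good”, lower scores are “poor”) can be sketched as follows. The function names are illustrative and the scores are hypothetical values chosen only to show the mechanics; they are not data from the study.

```python
from statistics import mean

CRITERION = 6.0  # scores of 6 or above count as "good", as stated in the text


def achievement_group(score):
    """Classify an achievement score against the criterion."""
    return "good" if score >= CRITERION else "poor"


def group_means(scores, outcome):
    """Mean of `outcome` within each achievement group defined by `scores`."""
    groups = {"good": [], "poor": []}
    for score, value in zip(scores, outcome):
        groups[achievement_group(score)].append(value)
    return {name: mean(values) for name, values in groups.items() if values}


# Hypothetical values for illustration only; these are not the study's data.
math_scores = [4.5, 6.0, 7.5, 5.5, 8.0]
state_anxiety = [62, 45, 40, 58, 38]
print(group_means(math_scores, state_anxiety))
```

Group means obtained this way are what the bracketed M(good) and M(poor) values summarize for each outcome measure.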

DISCUSSION

Results of the present investigation support the notion that academic performance in both math and literature is associated with state anxiety and perceived competence for academic ability. Although state anxiety for math was higher overall than state anxiety for literature, an affective dimension specific to mathematics was not apparent. In this investigation also, academic achievement was not found to be associated with trait anxiety.

Results of comparisons between males and females were also of interest. Surprisingly, trait anxiety was higher for males, although this variable was not found to be associated with academic achievement. Anxiety specific to mathematics or literature achievement did not differ across gender, nor did math or literature achievement itself. There were no grade level interactions with gender, nor were there interactions between gender and achievement (“good” versus “poor” achievers in literature and math). The results of the present investigation of 180 Italian students suggest that the generally reported findings of lower mathematics achievement and higher mathematics anxiety for females may not in fact be accurate in all cases.

Finally, males and females were not seen to differ on any of the six perceived competence measures, and other than the dimensions of athletics and social abilities, these scores did not appear to change appreciably across the three age groups. Perceived competence for athletics and social abilities was not related to achievement. These findings reflect those of Skaalvik and Rankin (1994), who found no gender differences in math achievement, but higher verbal achievement on the part of girls in a sample of Norwegian students. In that investigation differences in self-perceptions of ability were obtained, but they could not be explained by differences in achievement.

Data from this and related investigations underline the complex interrelationships among academic achievement, gender, age, and personal psychological variables. Through such research efforts we can hope to increase our understanding of the complex interaction of variables that moderate academic achievement.

REFERENCES

Girelli, L., Lucangeli, D., & Butterworth, B. (2000). The development of automaticity in accessing number magnitude. Journal of Experimental Child Psychology, 76, 104–122.
Harter, S. (1982). The perceived competence scale for children. Child Development, 53, 87–97.
Harter, S. (1999). The construction of the self. New York: Guilford.
Lucangeli, D., & Cornoldi, C. (1997). Mathematics and metacognition: What is the nature of the relationship? Mathematical Cognition, 3, 121–139.
Lucangeli, D., & Pedrabissi, L. (1997). Componenti cognitivo-motivazionali del successo in matematica: Un’indagine esplorativa. [Cognitive-motivational components of success in mathematics: An exploratory investigation]. Ricerche di Psicologia [Research in Psychology], 21, 59–74.
Lucangeli, D., Tressoldi, P., & Cendron, M. (1998). Cognitive and metacognitive abilities involved in the solution of mathematical problem solving: Validation of a comprehensive model. Contemporary Educational Psychology, 23, 257–275.
Norwich, B. (1987). Self-efficacy and mathematics achievement: A study of their relation. Journal of Educational Psychology, 79, 384–387.
Scruggs, T. E., & Mastropieri, M. A. (1983). Self-esteem differences by sex and ethnicity. Journal of Instructional Psychology, 10, 177–179.
Skaalvik, E. M. (1986). Sex differences in global self-esteem: A review. Scandinavian Journal of Educational Research, 30, 167–179.
Skaalvik, E. M. (1990). Gender differences in general academic self-esteem and in success expectations on defined academic problems. Journal of Educational Psychology, 82, 593–598.
Skaalvik, E., & Rankin, R. J. (1990). Math, verbal, and general academic self-concept: The internal/external frame of reference model and gender differences in self-concept structure. Journal of Educational Psychology, 82, 546–554.
Skaalvik, E., & Rankin, R. J. (1994). Gender differences in mathematics and verbal achievement, self-perception and motivation. British Journal of Educational Psychology, 64, 419–428.
Spielberger, C. D., Gorsuch, R. L., & Lushene, R. E. (1970). Manual for the State-Trait Anxiety Inventory. Palo Alto, CA: Consulting Psychologists Press.
Spielberger, C. D., & Vagg, P. R. (1995). Test anxiety: Theory, assessment, and treatment. Washington, DC: Taylor & Francis.
Zohar, D. (1998). An additive model of test anxiety: Role of exam-specific expectations. Journal of Educational Psychology, 90, 330–340.

THE ASSESSMENT OF SELF-REGULATION IN COLLEGE STUDENTS WITH AND WITHOUT ACADEMIC DIFFICULTIES Cesare Cornoldi, Rossana DeBeni and Maria Chiara Fioritto ABSTRACT This chapter examines the problems involved in evaluating the cognitive and motivational skills of college students of different ability and academic success. A battery is presented which examines students’ self-regulation and some factors underlying it. A study with 240 undergraduates at the University of Padua shows some implications in the use of the battery and proposes a causal model of self-regulation. Self-regulation, defined with reference to the basic competencies of elaboration, organization and self-evaluation, appears critical for student success and is related to students’ implicit theories, self-attribution, academic self-efficacy and motivation to use strategies.

INTRODUCTION Metacognition seems critical for the success in studying (Borkowski, Milstead & Hale, 1988). Typically, metacognitive variables include two types of components.

Identification and Assessment Advances in Learning and Behavioral Disabilities, Volume 16, 231–242 Copyright © 2003 by Elsevier Science Ltd. All rights of reproduction in any form reserved ISSN: 0735-004x/doi:10.1016/S0735-004X(03)16009-0


The first type is related to the student’s reflections and state of awareness concerning his or her own study process. The second type concerns the control the student operates on this process: this control is related to a series of operations which supervise and govern cognitive processes, such as evaluating task difficulty and progress in learning, planning, guiding the elaboration, organizing and monitoring the learning process. These various and partially heterogeneous control processes have been differently defined and considered within different theoretical perspectives, but seem to share a common reference to the subject’s ability to control his own learning process (self-regulation) and to use his metacognitive knowledge in order to do so adequately (Cornoldi, 1995). Self-regulation can be defined as the overall subject’s ability to control (regulate) his or her mental activity. In this chapter, we will examine the particular case represented by self-regulation during study. We will present an assessment procedure devised for the evaluation of self-regulation in study, and we will show how this aspect can be critical in the consideration of the learning process. Finally, we will test a model concerning a series of metacognitive variables that can affect the development of adequate self-regulation. A large body of research on cognitive development and student learning has shown how metacognition can be critical in learning. For example, appropriate selection and monitoring of strategies can improve school achievement (Wolters, 1998), while appropriate self-evaluation may increase student’s confidence (Zimmerman, 2000). In particular, self-regulation is critical for the achievement of late adolescents, including college students (Ley & Young, 1998; Stoynoff, 1996; Wolters, 1998). In this context, Moè and De Beni (2000) have illustrated the particular role of three components of self-regulation, i.e. 
the abilities to control elaboration, to organize/control the learning process, and to evaluate its progress. Furthermore, research examining aspects underlying self-regulation and factors affecting it has illustrated the role of the motivational variables (e.g. Schunk & Ertmer, 2000). For example, interest in the text and a desire for personal growth may increase motivation and success in studying (Schiefele & Schreyer, 1994). Within a broad metacognitive perspective (e.g. Borkowski & Muthukrishna, 1994), self-regulation is affected by a series of interconnected factors, including self-awareness and motivation. In this perspective, the nature of the student’s learning goals is critical. In particular, during learning, a student can be motivated either to increase his competence, even independent of the external outcomes, and consequently have real learning goals, or to obtain a high level of performance, even independently from the substantial degree of learning and consequently have performance goals (Dweck, 1999; Dweck & Leggett, 1988). It has been


demonstrated that these two different categories of goals (learning vs. performance) are inherent to two different implicit metacognitive theories of intelligence, respectively incremental and static. An incremental theory of intelligence assumes that intelligence, like other individual characteristics, can be modified and thus improved, for example through experience, education, or practice. A static theory of intelligence assumes that intelligence is given by nature and cannot be substantially modified.

Goals and implicit theories affect self-attribution (Dweck, 1986). Self-attribution refers to the type of explanation an individual offers for the outcomes of his or her behavior, within the human need to understand the world and the rules underlying it (Heider, 1958). In particular, it has been shown that attribution can include factors which are either internal or external, either modifiable (controllable) or non-modifiable (Weiner, 1985). Students with an incremental theory of intelligence tend to have a self-attributional pattern more inclined to attribute the causes of their successes and failures to controllable factors, and in particular to personal effort. This type of attribution can have a direct role in increasing self-regulation. In fact, we predict that students who believe that their learning process is affected by their effort will be more motivated to spend cognitive resources to control their learning.

In our view, another factor affecting the student’s self-regulation is the student’s perception of self-efficacy, i.e. the perception that his or her purposeful actions during studying will achieve the desired effects (see also Bandura, 1986). It has been shown that an individual perception of self-efficacy derives from a series of different sources, including successful experiences, models, metacognitive reflection, education, and the interpretation of signs of emotional activation. We assume that self-efficacy will affect self-regulation as it gives fundamental support to the related attributional pattern. In fact, an effort attribution must be substantiated and strengthened by a specific belief in the efficacy of our own behavior. Furthermore, these two aspects will have a relevant impact on self-regulation if integrated with a high strategic attitude, defined as the knowledge of a large number of strategies, and the inclination to use them extensively (Cornoldi, 1995). This view is consistent with models of good students. For example, Borkowski and Muthukrishna (1994) define the good student as a good strategy user with a motivational profile directed toward reaching learning goals and improving metacognitive control. Moè, Cornoldi and De Beni (2001) illustrated how different components of the strategic attitude, i.e. strategic knowledge and evaluation, strategy use and strategic coherence, can be critical for the academic achievement of late adolescents.

In summary, at different ages, and in particular for late adolescents, self-regulation seems critical for student success. Furthermore, the development of good self-regulatory skills depends on a series of cognitive, metacognitive and motivational variables: in this chapter, we focused on the role of effort self-attribution (hypothesized to depend on students’ goals and implicit theories of intelligence), perception of self-efficacy in studying and strategic attitude.

THE STUDY

Participants

Participants were 240 randomly selected students attending the second year in different Faculties of the University of Padua, assumed to represent the population of undergraduate students at this University. Seventy of them followed courses in the Faculties (Facoltà) of Economics (Economia), 79 in Engineering (Ingegneria), 44 in Medicine (Medicina e Chirurgia) and 47 in Social and Political Sciences (Scienze Politiche); 137 (57.1%) were males and 103 (42.9%) females; mean age was 21.5 years (SD = 3.28). These students represented a range of academic achievement, having passed between 18 and 100% of the examinations which could be passed at the time of the study (spring term, second year) at the different Faculties (see Fig. 1). (Note that only a few students had reached the 100% value, because in each Faculty some advanced examinations are typically postponed by the students.)

Fig. 1. Distribution of the Sample of Students with Reference to the Percentages of Passed Examinations.



THE SELF-REGULATION QUESTIONNAIRE (SRQ)

The SRQ is a 30-item questionnaire, with each item requiring a reply on a 5-point scale (1 = never, 5 = always). Students rate themselves with reference to three dimensions: elaboration (e.g. “When I study, I do not repeat the text verbatim”), organization (e.g. “At the beginning of my homework, I review the activities I must do”), and self-evaluation (e.g. “Usually the ratings I receive from the teachers coincide with my own impressions of my performance”). The Questionnaire was derived from the corresponding areas of the Study Questionnaire devised by Cornoldi, De Beni et al. (1993) and adapted by Moè and De Beni (2000), whose version was used here. A partially modified version is presented in the battery of De Beni, Moè and Cornoldi (in press), which provides information about the strong validity and other psychometric properties of the Questionnaire, including Cronbach’s alpha for the three subscales (0.48, 0.79 and 0.48 for elaboration, organization and self-evaluation, respectively). On the basis of the preceding studies, a composite score including the dimensions of elaboration, organization and self-evaluation appeared to be a good index of the student’s overall self-regulatory level. Therefore, in this study only the summed overall SRQ score was used.
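Because the choice of the composite score is argued from Cronbach’s alpha, a brief sketch of how that coefficient is usually computed may help. The code below uses the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores); the chapter does not say how alpha was computed in practice, the function name is illustrative, and the item responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha.

    `items` is a list of per-item score lists, each holding one value per
    respondent, with respondents in the same order across items.
    """
    k = len(items)
    n_respondents = len(items[0])
    totals = [sum(item[i] for item in items) for i in range(n_respondents)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical 5-point responses (3 items, 5 respondents); not data from the study.
toy_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(toy_items)
print(f"alpha = {alpha:.2f}")
```

All else being equal, alpha tends to rise as more items are combined, which is one common reason a summed score over 30 items can be more reliable than a short subscale.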

OTHER QUESTIONNAIRES

The Self-Regulation Questionnaire was presented together with five other Questionnaires in a booklet entitled “Indagine per l’individuazione e l’analisi di fattori strategici e motivazionali in studenti universitari” [Study for the identification and analysis of strategic and motivational factors in university students]. These Questionnaires are adaptations of batteries we have already presented (De Beni et al., in press; Fioritto, Cornoldi, De Beni & Fabris, in press). We present them here with reference to the assessed constructs, as also briefly defined in the Introduction of this chapter.

Strategic attitude (SA). The “Strategy Questionnaire” (Questionario sulle Strategie di Studio; Moè et al., 2001) examined the student’s knowledge of study strategies and inclination to use them. The Questionnaire presents a list of 39 study strategies concerning different aspects of studying for an examination, such as underlining and note-taking (see Moè et al., 2001, for a complete presentation of the 39 strategies). For the goals of the present study we limited ourselves to asking students to rate on a seven-point scale to what extent they knew and used the strategies. The strategies described in items 5, 9, 33 and 38 are of modest efficacy and served only as a control (see Moè et al., 2001); they were not included in the summed strategic attitude score.

Effort Self-attribution (ESA). Eight academic situations (4 successes and 4 failures) were described to the students. For each situation, students were asked to specify, supposing the situation had happened to them, what percentage of the event could be attributed to each of several factors. For example, one item was: “You failed a written examination. To what extent in percentage was this due to lack of effort ( %), bad luck ( %), test difficulty ( %), lack of ability ( %), lack of help ( %).” From the responses to the eight questions, an ESA score was derived, based on the sum of the percentage attributions concerning the effort dimension. Other scores were also derived for the other four types of attribution; a distinction was also made between attributions for successes and failures.

Intelligence theory and learning goals (ITLG). Students were asked to rate, on a 5-point scale, their degree of agreement with 10 statements concerning their theory of intelligence and their learning goals (see Dweck, 1999).

Self-efficacy in study (SES). This is a short questionnaire of 10 items describing study situations. Students were asked to rate on a 5-point scale how effective they perceived themselves to be and how able to persist in difficult contexts (e.g. “When I have to study a difficult topic, I continue until I master it”).
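The attribution scoring described above for the ESA measure can be sketched as follows. This is an illustration under stated assumptions rather than the authors’ scoring routine: the factor labels, the function name `attribution_scores`, the data layout and the demonstration percentages are all hypothetical. The sketch sums the percentages given to each factor, keeps successes and failures apart, and takes the ESA score as the effort total over all eight situations.

```python
FACTORS = ("effort", "luck", "task", "ability", "help")


def attribution_scores(situations):
    """Sum percentage attributions per factor, split by success vs. failure.

    `situations` is a list of (outcome, percentages) pairs, where outcome is
    "success" or "failure" and percentages maps each factor to a 0-100 value.
    """
    totals = {"success": dict.fromkeys(FACTORS, 0.0),
              "failure": dict.fromkeys(FACTORS, 0.0)}
    for outcome, pct in situations:
        for factor in FACTORS:
            totals[outcome][factor] += pct[factor]
    return totals


# Hypothetical respondent: 4 success and 4 failure situations.
success = {"effort": 50, "luck": 10, "task": 10, "ability": 25, "help": 5}
failure = {"effort": 40, "luck": 15, "task": 30, "ability": 10, "help": 5}
answers = [("success", success)] * 4 + [("failure", failure)] * 4

totals = attribution_scores(answers)
# ESA score: effort attributions summed over all eight situations.
esa = totals["success"]["effort"] + totals["failure"]["effort"]
```

Dividing each total by the number of situations of that type would give a mean percentage per factor instead of a sum.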

RESULTS

Table 1 presents the mean scores (positive direction) for each item of the SRQ. Cronbach’s alpha for the SRQ overall score was higher (0.73) than that observed in a preceding study for the three sub-scales, supporting the suggestion to rely mainly on the overall score. With reference to the sub-scales, the highest score was for self-evaluation (mean score = 37.4), and the lowest was for elaboration (mean score = 32.5). Indeed, students found it easier to rate their degree of learning and success appropriately than to engage in a deep process of elaboration. The analysis of single items indicates that students are typically involved in the basic elements of deep text elaboration: they report that they read for understanding (item 24, M = 4.39) and do not repeat verbatim what is written in the text (item 4, M = 4.33). Furthermore, students take notes during the teachers’ lectures (item 6, M = 4.12). In the area of self-evaluation, students report that they are good at evaluating the memorability of information (item 14, M = 4.07), that they are aware when they have not studied enough (item 9, M = 4.07), and that any negative feedback they receive was expected (item 8, M = 4.02). For the Organization sub-scale, the highest score concerns avoiding being caught unprepared by an unexpected examination (item 7, M = 4.55). Items that receive a particularly low rating concern personal elaboration requiring greater effort, such as self-questioning during study (item 21, M = 2.26) and the creation of associations with other study materials and topics (item 5, M = 2.57).

Table 1. Self-Regulation Questionnaire (Mean Ratings and Standard Deviations for Each Item of the Questionnaire).

SRQ Items | Mean | SD
1. At the beginning of my homework I review the activities I have to do (O) | 2.76 | 1.27
2. Usually the ratings I receive from the teachers coincide with my own impressions concerning my performance (A) | 2.69 | 0.91
3. I avoid delaying the time for studying (O) | 3.87 | 1.05
4. When I study, I do not repeat the text verbatim (E) | 4.33 | 0.98
5. During the teacher’s lectures, I like to think of related topics (E) | 2.57 | 1.07
6. During lectures, I take notes in order to remember better (E) | 4.12 | 1.09
7. It doesn’t happen that the examination time is unexpected (O) | 4.55 | 0.80
8. Typically, if I receive negative feedback, I had predicted it (A) | 4.02 | 0.79
9. I am aware of when I did not study enough (A) | 4.07 | 1.05
10. When I’m studying, I am aware of what I have not well understood (A) | 3.98 | 0.94
11. It doesn’t happen that I have too much homework left to do (O) | 3.27 | 1.17
12. I am aware that I did poor work (A) | 4.20 | 0.71
13. Usually I am able to anticipate the outcomes of my oral examinations (A) | 3.67 | 0.99
14. I discriminate well the memorability of parts of the material (A) | 4.07 | 0.86
15. I am prepared in time for the examinations (O) | 3.07 | 1.06
16. I like to look back for associated information (E) | 3.00 | 1.09
17. It is not necessary to read everything with exactly the same degree of attention (E) | 2.83 | 1.27
18. First I do my homework and then do what I like to do (O) | 3.92 | 0.95
19. I keep in mind my academic engagements (O) | 3.85 | 0.13
20. After an oral examination I know how well I did (A) | 4.07 | 0.96
21. While reading, I question myself (E) | 2.26 | 1.05
22. While studying, I repeat the content with my own words (E) | 3.29 | 1.28
23. While studying, I remain faithful to the text (E) | 2.82 | 1.02
24. When I study, I want to be sure I understood the text (E) | 4.39 | 0.89
25. Each day I begin my studying with the assignments closest in time (O) | 3.61 | 1.24
26. I am immediately aware of the task difficulty (A) | 3.41 | 1.03
27. When I study, I do not have frequent breaks for fun (O) | 3.66 | 1.01
28. I am able to organize my time to include both study and hobbies (O) | 3.14 | 1.03
29. After written examinations, I know how well I did (A) | 3.66 | 1.01
30. I like to reorganize my study material in a personal way (E) | 2.86 | 1.19

Notes: All the ratings were given on a 5-point scale (1 = never, 5 = always); for the negative items, scores are presented reversed, as they are subsequently used for obtaining the overall SRQ score. Letters in parentheses specify the analysed dimension: E = Elaborazione (Elaboration); A = Autovalutazione (Self-evaluation); O = Organizzazione (Organization).
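The scoring implied by Table 1 and its notes (reverse the negative items, then sum by dimension and overall) can be sketched as below. The subscale key is transcribed from the letters in Table 1. The function name `score_srq` is hypothetical, as is the `reversed_items` argument: the chapter does not list which items are negatively worded, so the set used in the example is purely illustrative.

```python
# Subscale key transcribed from Table 1 (E = elaboration, O = organization,
# A = self-evaluation).
SUBSCALES = {
    "E": [4, 5, 6, 16, 17, 21, 22, 23, 24, 30],
    "O": [1, 3, 7, 11, 15, 18, 19, 25, 27, 28],
    "A": [2, 8, 9, 10, 12, 13, 14, 20, 26, 29],
}


def score_srq(responses, reversed_items=frozenset()):
    """Return subscale sums and the overall sum for one respondent.

    `responses` maps item number (1-30) to a rating from 1 to 5.
    `reversed_items` is a hypothetical set of negatively worded items;
    the chapter does not list which items were reversed.
    """
    def keyed(item):
        rating = responses[item]
        if not 1 <= rating <= 5:
            raise ValueError(f"item {item}: rating must be between 1 and 5")
        # On a 1-5 scale, reversing maps 1<->5 and 2<->4, leaving 3 fixed.
        return 6 - rating if item in reversed_items else rating

    scores = {name: sum(keyed(i) for i in items)
              for name, items in SUBSCALES.items()}
    scores["total"] = sum(scores.values())
    return scores


# Hypothetical respondent who answers 4 to every item; item 23 is treated
# as reverse-keyed purely for illustration.
demo = score_srq({i: 4 for i in range(1, 31)}, reversed_items={23})
```

Each subscale has 10 items, so subscale sums range from 10 to 50 and the overall score from 30 to 150.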

COMPARISON BETWEEN STUDENTS WITH HIGH AND LOW SELF-REGULATION

In order to evaluate the discriminative power of the SRQ, we considered the distribution of the overall score and excluded students whose Questionnaire score fell within the range of typical values, between 96 and 114 (approximately 0.8 standard deviations below and above the mean). This left a group of 50 students with low self-regulation (LSR) and a group of 57 students with high self-regulation (HSR). Table 2 reports the scores obtained by the two groups on the different tested dimensions. In Table 2 we also report the mean scores for other dimensions measurable through the

Table 2. Mean Scores (Standard Deviations in Parentheses) Obtained by a Group with Low Self-Regulation (LSR) and a Group with High Self-Regulation (HSR) (The Values of the Student’s t Obtained in the Comparison of the Mean Values for Each Variable are also Presented).

Variable | LSR (n = 50) | HSR (n = 57) | t (107) | p
% examinations | 67.56 (15.12) | 73.59 (13.50) | −2.18 | 0.031
Incremental intelligence theory | 15.34 (4.00) | 16.68 (3.44) | −1.87 | 0.064
Learning goals | 13.92 (3.26) | 15.88 (2.96) | −3.25 | 0.002
Study self-efficacy | 33.44 (4.91) | 39.47 (3.78) | −7.17 |
Strategic attitude | 135.40 (25.10) | 155.05 (25.92) | −1.92 |
Attribution S-effort | 46.54 (19.55) | 57.09 (19.80) | −2.76 |
Attribution S-ability | 16.06 (13.10) | 24.60 (19.06) | −2.66 |
Attribution S-task | 16.62 (12.47) | 9.84 (10.23) | 3.09 |
Attribution S-luck | 15.92 (13.46) | 5.82 (5.96) | 5.12 |
Attribution S-help | 4.90 (5.77) | 2.65 (4.13) | 2.34 |
Attribution F-effort | 42.12 (19.92) | 50.74 (21.58) | −2.14 |
Attribution F-ability | 10.80 (8.98) | 10.63 (12.08) | n.s. |
Attribution F-task | 30.06 (15.26) | 29.84 (18.27) | n.s. |
Attribution F-luck | 10.80 (9.20) | 5.95 (8.38) | 2.85 |
Attribution F-help | 6.12 (6.40) | 2.84 (3.53) | 3.33 |
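The group selection and the group comparison reported in Table 2 can be illustrated with a short sketch. It makes several assumptions: the function names `split_groups` and `pooled_t` are hypothetical, the cutoffs 96 and 114 are treated as belonging to the excluded middle range, and the t statistic is the classical equal-variance (pooled) Student’s t computed from the tabled means, standard deviations and group sizes. Applied to the “% examinations” row, the pooled formula gives −2.18 after rounding, the value shown in the table.

```python
from math import sqrt


def split_groups(total_scores, low_cut=96, high_cut=114):
    """Drop scores inside [low_cut, high_cut]; return (LSR, HSR) score lists."""
    lsr = [s for s in total_scores if s < low_cut]
    hsr = [s for s in total_scores if s > high_cut]
    return lsr, hsr


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t (equal variances assumed) from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# "% examinations" row of Table 2: LSR 67.56 (15.12), HSR 73.59 (13.50).
t, df = pooled_t(67.56, 15.12, 50, 73.59, 13.50, 57)
print(round(t, 2), df)

# Hypothetical overall scores, only to show the exclusion rule.
lsr, hsr = split_groups([80, 96, 105, 114, 115, 130])
```

Because the tabled means and standard deviations are themselves rounded, recomputed t values for other rows may differ from the table in the second decimal place.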
