Testing v. Testing: 2

§ May 3rd, 2010 § Filed under Educate, Excerpts/Quotes

The Committee on Appropriate Test Use of the National Research Council stated in an authoritative report in 1999 that “tests are not perfect” and “a test score is not an accurate measure of a student’s knowledge or skills.” Because test scores are not an infallible measure, the committee warned, “an educational decision that will have a major impact on a test taker should not be made solely or automatically on the basis of a single test score.” ….

Psychometricians are less enthusiastic than elected officials about using tests to make consequential judgments, because they know that test scores may vary in unpredictable ways. Year-to-year changes in test scores for individuals or entire classes may be due to random variation. Student performance may be affected by the weather, the student’s state of mind, distractions outside the classroom, or conditions inside the classroom. Tests may also become invalid if too much time is spent preparing students to take them.

Robert Linn of the University of Colorado, a leading psychometrician, maintains there are many reasons why one school might get better test scores than another. NCLB, he says, assumes that if school A gets better results than school B, it must be due to differences in school quality. But school A may have students who were higher achieving in earlier years than those in the other school. Or school A might have fewer students who are English-language learners or fewer students with disabilities than school B. School A, which is presumably more successful, may have a homogeneous student body, while the less successful school B may have a diverse student body with several subgroups, each of which must meet a proficiency target. Linn concludes, “The fact that the school that has fewer challenges makes AYP [adequate yearly progress] while the school with greater challenges fails to make AYP does not justify the conclusion that the first school is more effective than the second school. The first school might very well fail to make AYP if it had a student body that was comparable to the one in the second school.”

State testing systems usually test only once each year, which increases the possibility of random variation. It would help, Linn says, to administer tests at the start of the school year and then again at the end of the school year, to identify the effectiveness of the school. Even then, there would be confounding variables: “For example, the students at the school with the higher scores on the state assessment might have received more educational support at home than students at school B. The student bodies attending different schools can differ in many ways that are related to performance on tests, including language background, socioeconomic status, and prior achievement.” The professional organizations that set the standards for testing—such as the American Psychological Association and the American Educational Research Association—agree that test results reflect not only what happens in school, but also the characteristics of those tested, including such elusive factors as student motivation and parental engagement. Because there are so many variables that cannot be measured, even attempts to match schools by the demographic profile of their student body do not suffice to eliminate random variation.

___

Ravitch, Diane. The Death and Life of the Great American School System. New York: Basic/Perseus, 2010: 153-4.
