The Problem with Standardized Achievement Tests
Brought to FEN by the Association for Supervision and Curriculum Development
The companies that create and sell standardized achievement tests are all owned by large corporations. Like all for-profit businesses, these corporations attempt to produce revenue for their shareholders.
Recognizing the substantial pressure to sell standardized achievement tests, those who market such tests encounter a difficult dilemma that arises from the considerable curricular diversity in the United States. Because different states often choose somewhat different educational objectives (or, to be fashionable, different content standards), the need exists to build standardized achievement tests that are properly aligned with educators' meaningfully different curricular preferences. The problem is exacerbated in states where different counties or school districts can exercise more localized curricular decision making.
At a very general level, the goals that educators pursue in different settings are reasonably similar. For instance, you can be sure that all schools will give attention to language arts, mathematics, and so on. But that's at a general level. At the level where it really makes a difference to instruction--in the classroom--there are significant differences in the educational objectives being sought. And that presents a problem to those who must sell standardized achievement tests.
In view of the nation's substantial curricular diversity, test developers are obliged to create a series of one-size-fits-all assessments. But, as most of us know from attempting to wear one-size-fits-all garments, sometimes one size really can't fit all.
The designers of these tests do the best job they can in selecting test items that are likely to measure all of a content area's knowledge and skills that the nation's educators regard as important. But the test developers can't really pull it off. Thus, standardized achievement tests will always contain many items that are not aligned with what's emphasized instructionally in a particular setting.
To illustrate the seriousness of the mismatch that can occur between what's taught locally and what's tested through standardized achievement tests, educators ought to know about an important study at Michigan State University reported in 1983 by Freeman and his colleagues. These researchers selected five nationally standardized achievement tests in mathematics and studied their content for grades 4-6. Then, operating on the very reasonable assumption that what goes on instructionally in classrooms is often influenced by what's contained in the textbooks that children use, they also studied four widely used textbooks for grades 4-6.
Employing rigorous review procedures, the researchers identified the items in the standardized achievement tests that had not received meaningful instructional attention in the textbooks. They concluded that between 50 and 80 percent of what was measured on the tests was not suitably addressed in the textbooks. As the Michigan State researchers put it, "The proportion of topics presented on a standardized test that received more than cursory treatment in each textbook was never higher than 50 percent" (p. 509).
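The arithmetic behind such a comparison can be sketched in a few lines of Python. This is only an illustration, not the researchers' procedure: the function name `coverage` and both topic lists are hypothetical, invented for this example, and the sketch simplifies "more than cursory treatment" to whether a topic appears in the textbook at all.

```python
def coverage(test_topics, textbook_topics):
    """Return the share of tested topics that also appear in the textbook."""
    tested = set(test_topics)
    if not tested:
        raise ValueError("test_topics must not be empty")
    return len(tested & set(textbook_topics)) / len(tested)


# Hypothetical topic lists, invented for illustration only.
test_topics = ["fractions", "decimals", "geometry", "estimation", "graphs"]
textbook_topics = ["fractions", "decimals", "multiplication", "division"]

share = coverage(test_topics, textbook_topics)
print(f"{share:.0%} of tested topics are covered; {1 - share:.0%} are not")
```

With these made-up lists, two of the five tested topics appear in the textbook, so 60 percent of what the test measures would go unaddressed. A real analysis would need reviewers to judge how thoroughly each topic is treated, which a simple membership check cannot capture.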
Well, if the content of standardized tests is not satisfactorily addressed in widely used textbooks, isn't it likely that in a particular educational setting, topics will be covered on the test that aren't addressed instructionally in that setting? Unfortunately, because most educators are not genuinely familiar with the ingredients of standardized achievement tests, they often assume that if a standardized achievement test asserts that it is assessing "children's reading comprehension capabilities," then it's likely that the test meshes with the way reading is being taught locally. More often than not, the assumed match between what's tested and what's taught is not warranted.
If you spend much time with the descriptive materials presented in the manuals accompanying standardized achievement tests, you'll find that the descriptors for what's tested are often fairly general. Those descriptors need to be general to make the tests acceptable to a nation of educators whose curricular preferences vary. But such general descriptions of what's tested often permit assumptions of teaching-testing alignments that are way off the mark. And such mismatches, recognized or not, will often lead to spurious conclusions about the effectiveness of education in a given setting if students' scores on standardized achievement tests are used as the indicator of educational effectiveness. And that's the first reason that standardized achievement tests should not be used to determine the effectiveness of a state, a district, a school, or a teacher. There's almost certain to be a significant mismatch between what's taught and what's tested.
A Psychometric Tendency to Eliminate Important Test Items