# The Problem with Standardized Achievement Tests

Brought to FEN by the Association for Supervision and Curriculum Development

The third reason that students' performances on these tests should not be used to evaluate educational quality is the most compelling. Because student performances on standardized achievement tests are heavily influenced by three causative factors, only one of which is linked to instructional quality, asserting that low or high test scores are caused by the quality of instruction is illogical.

To understand this confounded causation problem clearly, let's look at the kinds of test items that appear on standardized achievement tests. Remember, students' test scores are based on how well students do on the test's items. To get a really solid idea of what's in standardized tests, you need to grub around with the items themselves.

The three illustrative items presented here are mildly massaged versions of actual test items in current standardized achievement tests. I've modified the items' content slightly, without altering the essence of what the items are trying to measure.

The problem of confounded causation involves three factors that contribute to students' scores on standardized achievement tests: (1) what's taught in school, (2) a student's native intellectual ability, and (3) a student's out-of-school learning.

What's taught in school. Some of the items in standardized achievement tests measure the knowledge or skills that students learn in school. In certain subject areas, such as mathematics, children learn in school most of what they know about a subject. Few parents spend much time teaching their children about the intricacies of algebra or how to prove a theorem.

So, if you look over the items in any standardized achievement test, you'll find a fair number similar to the mathematics item presented in Figure 1, which is a mildly modified version of an item appearing in a standardized achievement test intended for 3rd grade children.

This mathematics item would help teachers arrive at a valid inference about 3rd graders' abilities to choose number sentences that coincide with verbal representations of subtraction problems. Or, along with other similar items dealing with addition, multiplication, and division, this item would contribute to a valid inference about a student's ability to choose appropriate number sentences for a variety of basic computation problems presented in verbal form.

If the items in standardized achievement tests measured only what actually had been taught in school, I wouldn't be so negative about using these tests to determine educational quality. As you'll soon see, however, other kinds of items are hiding in standardized achievement tests.

A student's native intellectual ability. I wish I believed that all children were born with identical intellectual abilities, but I don't. Some kids were luckier at gene-pool time. Some children, from birth, will find it easier to mess around with mathematics than will others. Some kids, from birth, will have an easier time with verbal matters than will others. If children came into the world having inherited identical intellectual abilities, teachers' pedagogical problems would be far simpler.

Recent thinking among many leading educators suggests that there are various forms of intelligence, not just one (Gardner, 1994). A child who is born with less aptitude for dealing with quantitative or verbal tasks, therefore, might possess greater "interpersonal" or "intrapersonal" intelligence, but these latter abilities are not tested by these tests. For the kinds of items that are most commonly found on standardized achievement tests, children differ in their innate abilities to respond correctly. And some items on standardized achievement tests are aimed directly at measuring such intellectual ability.

These sorts of items, because they tap innate intellectual skills that are not readily modifiable in school, do a wonderful job in spreading out test-takers' scores. The quest for score variance, coupled with the limitation of having few items to use in assessing students, makes such items appealing to those who construct standardized achievement tests.
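To make the score-spreading point concrete, here is a minimal sketch, assuming every item is scored simply right or wrong. For such an item, the variance of the item scores is p(1 - p), where p is the proportion of test-takers who answer correctly. The function name and the proportions below are hypothetical, chosen only for illustration; they are not taken from any actual test.

```python
def item_variance(p_correct: float) -> float:
    """Variance of a right/wrong (0/1) item that a proportion p_correct answers correctly."""
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must be between 0 and 1")
    return p_correct * (1.0 - p_correct)


# Hypothetical proportions of students answering an item correctly.
for p in (0.95, 0.75, 0.50):
    print(f"proportion correct {p:.2f} -> item variance {item_variance(p):.4f}")
```

Under this assumption, an item that nearly every student answers correctly contributes very little spread, while an item that splits students more evenly contributes far more. That is one way to see why items on which children differ sharply are attractive when only a few items are available.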

But items that primarily measure differences in students' in-born intellectual abilities obviously do not contribute to valid inferences about "how well children have been taught." Would we like all children to do well on such "native-smarts" items? Of course we would. But to use such items to arrive at a judgment about educational effectiveness is simply unsound.

Out-of-school learning. The most troubling items on standardized achievement tests assess what students have learned outside of school. Unfortunately, you'll find more of these items on standardized achievement tests than you'd suspect. If children come from advantaged families and stimulus-rich environments, then they are more apt to succeed on standardized achievement test items than will other children whose environments don't mesh as well with what the tests measure.

One particular question from a 6th grade science test makes clear what's actually being assessed by a number of items on standardized achievement tests. This item first tells students what an attribute of a fruit is (namely, that it contains seeds). Then the student must identify what "is not a fruit" by selecting the option without seeds. The choices are: A) orange, B) pumpkin, C) apple, D) celery. As any child who has encountered celery knows, celery is a seed-free plant. The right answer, then, for those who have coped with celery's strings but never its seeds, is clearly choice D.

But what if, when you were a youngster, your folks didn't have the money to buy celery at the store? What if your circumstances simply did not give you the chance to have meaningful interactions with celery stalks by the time you hit the 6th grade? How well do you think you'd do in correctly answering the item in Figure 3? And how well would you do if you didn't know that pumpkins were seed-carrying spheres? Clearly, if children know about pumpkins and celery, they'll do better on this item than will those children who know only about apples and oranges. That's how children's socioeconomic status gets mixed up with children's performances on standardized achievement tests. The higher your family's socioeconomic status is, the more likely you are to do well on a number of the test items you'll encounter in such a test.

Suppose you're a principal of a school in which most students come from genuinely low socioeconomic situations. How are your students likely to perform on standardized achievement tests if a substantial number of the test's items really measure the stimulus-richness of your students' backgrounds? That's right, your students are not likely to earn very high scores. Does that mean your school's teachers are doing a poor instructional job? Of course not.

Conversely, let's imagine you're a principal in an affluent school whose students tend to have upper-class, well-educated parents. Each spring, your students' scores on standardized achievement tests are dazzlingly high. Does this mean your school's teachers are doing a super instructional job? Of course not.
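The two principals' situations can be put in rough numerical terms. What follows is only a sketch under a loudly stated assumption: that a school's average score is a simple sum of the three factors named earlier. Real scores certainly do not decompose this neatly, and every number and name below is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class School:
    """Hypothetical average number of items answered correctly, split by source."""
    name: str
    taught_in_school: int   # items answered correctly because of instruction
    native_ability: int     # items answered correctly because of in-born aptitude
    out_of_school: int      # items answered correctly because of home environment

    def total_score(self) -> int:
        # Assumption: the three contributions simply add up.
        return self.taught_in_school + self.native_ability + self.out_of_school


# Both schools are given the SAME instructional contribution on purpose.
low_ses = School("low-SES school", taught_in_school=20, native_ability=10, out_of_school=5)
affluent = School("affluent school", taught_in_school=20, native_ability=10, out_of_school=15)

score_gap = affluent.total_score() - low_ses.total_score()
instruction_gap = affluent.taught_in_school - low_ses.taught_in_school

for school in (low_ses, affluent):
    print(f"{school.name}: total score {school.total_score()}")
print(f"gap in total scores: {score_gap}")
print(f"gap due to instruction: {instruction_gap}")
```

In this made-up case the affluent school outscores the other by a wide margin even though instruction contributes exactly the same amount in both. Anyone who saw only the two totals could not tell how much of the gap, if any, came from teaching.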

One of the chief reasons that children's socioeconomic status is so highly correlated with standardized test scores is that many items on standardized achievement tests really focus on assessing knowledge and/or skills learned outside of school--knowledge and/or skills more likely to be learned in some socioeconomic settings than in others.

Again, you might ask why on earth standardized achievement test developers would place such items on their tests. As usual, the answer is consistent with the dominant measurement mission of those tests, namely, to spread out students' test scores so that accurate and fine-grained norm-referenced interpretations can be made. Because there is substantial variation in children's socioeconomic situations, items that reflect such variations are efficient in producing among-student variations in test scores.

You've just considered three important factors that can influence students' scores on standardized achievement tests. One of these factors was directly linked to educational quality. But two factors weren't.

