Is it true that “what you test is what you get”?
I’ve heard a wide variety of educators, policymakers, and education researchers say that “what you test is what you get.” Lauren Resnick, then a professor at the University of Pittsburgh and director of the Learning Research and Development Center, made the statement many years ago. For a time, she devoted most of her professional energies to building K-12 tests meant to be good targets for instruction, with the goal of education reform driven by teaching to the test. Similar statements are made in other fields; for example, writer Jennifer Fullweiler says “you get what you measure,” concluding that what you measure reflects your priorities. Some skeptics modify the statement to “what you measure is the most that you will get,” but that’s just putting a negative spin on the same idea.
In contrast, standards-based reform in education uses content standards to describe what teachers are to teach and students are to learn. The United States’ long-standing commitment to standards-based reform continues in the new Every Student Succeeds Act (ESSA). The act uses state content standards as the targets for teaching and learning and requires student achievement testing in grades three through eight, and once in high school, with tests aligned to the state content standards. In “The Challenge of Alignment,” however, my colleague Morgan Polikoff pointed out that state tests are generally found not to be well-aligned to state content standards. Does this mean that teachers are more likely to teach, and students therefore more likely to learn, what is tested rather than what the state content standards intend? Probably.
But much more needs to be known about the state education policies and practices that guide teacher instruction and student learning before we can judge how effective they are likely to be. The policy attributes theory that guides C-SAIL’s work hypothesizes that standards-based reform policies will have greater influence on instruction and learning the more they are specific in stating what is wanted, consistent with aligned supporting policies and practices (including assessments), perceived as authoritative by teachers and administrators, powerful in that they are backed by rewards and sanctions, and stable in that they remain in place over time (find an explanation of the policy attributes theory here).
Why is it that student achievement tests are not tightly aligned (consistent) to their state content standards?
In answering this question, a good place to start is the standards themselves. The full set of standards for a subject and grade level is not as clear and specific as one might wish. For example, the standards don’t address what teachers are not to teach and what students need not learn. Not surprisingly, teachers are reluctant to drop from their instruction content that is not in the standards; they attempt to teach not only what they have been teaching but also all the content in the new standards. Adding new content to old content without deleting anything produces a disjointed and shallow instructional program for students. And student achievement tests cannot be tightly aligned to standards that are not specific.
At the same time, even if the standards were sufficiently specific, building good tests is not easy. Some content is much more difficult to test than other content, even when it is equally or more important. Furthermore, tests are built against test blueprints, which are meant to be well-aligned to the content standards but typically are not. Thus, a test may be very well-aligned to the blueprint that drove its construction and still not be well-aligned to the content standards.
Perhaps even more important, no formal, replicable strategy is used in the test construction process to ensure that each item written is aligned to one or more content standards. This creates complications in two ways. First, some test items are not well-aligned to any of the content standards. Second, items may focus on a subset of the content standards but collectively fail to represent the full set. As a result, tests do not cover the full breadth of the standards.
Adding yet another complication, testing time must be kept within reason, putting a cap on the number of test items. Across years, the goal should be for tests to be updated to become more tightly aligned to the full set of standards. Unfortunately, the process for building new tests rarely takes into account the goal of increasing alignment across the collection of test forms within a subject over the years. If “what you test is what you get,” then only the subsample of the standards that is tested year after year will be what is taught and what is learned!
Much could be said about building policies with the policy attributes of appropriate authority and power to add weight to the specificity and consistency of the policies, but the education sector must also prioritize the fifth and final attribute, stability. Common sense, experience, and research show that policies that don’t stay in place over time have no real and lasting impact on teaching and learning. Unfortunately, during my lifetime I have never seen such a lack of stability in education policies as at the present. In many states, standards are being changed almost yearly. Each time the standards change, the tests change. Even in states where the standards haven’t changed from one year to the next, the tests have changed because legislatures and courts have ruled that the state can no longer be a part of the multistate testing consortium of which it had been a member (see our interactive map). Clearly, more stability is needed to give standards-based reform any hope of effectiveness.