If policymakers are interested in using value-added models to evaluate teachers, they should take into account the policy caveats of using them, as well as how those caveats can be dealt with. Because of the vigorous debate around value-added scores, policymakers need to be aware of even minor concerns. Below is a list of the different concerns that have been raised, as well as how some districts deal with them.
Lack of buy-in. A number of teachers do not believe value-added models can accurately measure the impact they have on their students. If teachers do not trust the data or how it will be used, they are unlikely to use the data to evaluate or improve their performance. For this reason, it’s vitally important to include teachers when developing the evaluation system, including the design of the value-added model. Another way to build buy-in, which is also simply good practice, is to use the same value-added model to evaluate principals and other administrators, especially the superintendent.
Statistical complexity vs. transparency. Value-added models are statistically complex, and the way results are calculated is not easily understood by non-statisticians. Unfortunately, the complexity of a value-added model is related to its accuracy. The less complex the model, the less accurately it will evaluate teachers. Conversely, the more complex it is, the less transparent it will be. Policymakers should be aware that if teachers, administrators, and the public don’t understand how a teacher is rated, they may not trust or use the results (Economic Policy Institute 2010, Sass 2008).
Even so, not everyone agrees that there must be a trade-off between accuracy and transparency. For example, most people don’t know the complexities of how a computer works, but they still trust a computer’s output. To be transparent, however, the calculations should be available to outside expert review, one that teachers will trust.
Not all teachers are included. Another caveat of value-added models is that they can only be used for teachers who teach tested subjects that have at least one previous test score. For the majority of states and districts, this limits value-added models to math and reading teachers, since those are typically the only subjects tested annually. In addition, in most states only teachers in 4th through 8th grade would have a previous year’s test score, since testing typically starts at 3rd grade. Developing value-added models for high school teachers is also difficult, since students do not always take similar courses in sequence from year to year. Therefore, under current assessment systems, just 25 to 35 percent of all teachers would likely have individual value-added scores, although that share is growing (Goldhaber 2010).
To overcome this, some evaluation systems replace individual teacher value-added scores with school-wide value-added scores for teachers who teach untested subjects. Other techniques include replacing value-added scores with teacher portfolios, student surveys, or some other measure of how a teacher has impacted students’ performance.
Teacher effectiveness is relative. Most, if not all, value-added models rate teachers on a curve relative to the effectiveness of their fellow teachers (typically within the school or district) who teach similar students. Therefore, by design, half of the teachers will be below average effectiveness and half will be above average, even if the difference between “worst” and “best” is slight. Theoretically, such relative comparisons, on their own, could lead to false identification of an ineffective teacher in a district with highly effective teachers (and vice versa). Looking carefully at the data, and choosing more appropriate category names, could avert this problem.
Although the results are relative, they can still provide valuable information, since research shows that teacher quality varies greatly within schools. But the relative nature of the comparison is an important reason why value-added scores should be used in concert with other measures of teacher performance.
Cross-district comparisons may also provide needed balance. However, using value-added models across districts has its own caveats, which are outside the scope of this paper.
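The point about rating on a curve can be illustrated with a minimal Python sketch. The teacher names and scores below are hypothetical, invented only to show the effect, and the median split is a simplifying assumption rather than the method of any particular value-added model. Even when the gap between the lowest and highest score is tiny, a relative comparison still labels half the group as below the midpoint.

```python
from statistics import median

def label_relative(scores):
    """Label each teacher 'above' or 'below' the group median."""
    mid = median(scores.values())
    return {name: ("above" if s > mid else "below") for name, s in scores.items()}

# Hypothetical value-added scores; note how small the spread is.
scores = {"A": 0.50, "B": 0.51, "C": 0.52, "D": 0.53}

labels = label_relative(scores)
spread = max(scores.values()) - min(scores.values())

for name in scores:
    print(name, scores[name], labels[name])
print("spread between lowest and highest:", round(spread, 2))
```

Here two of the four teachers are labeled "below" even though all four scores are nearly identical, which is why the spread of the data and the choice of category names both deserve a careful look.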
Impact on team teaching. Implementing a value-added model might lead to an unintentional negative effect on team teaching. A model that does not consider the effects that other teachers have could discourage teachers from working together. To combat this, a measure of the school’s overall performance could be included in each teacher’s evaluation. By basing a teacher’s effectiveness at least partially on the success of the school as a whole, teachers may have more incentive to work with other teachers within the school.
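As a rough illustration of basing an evaluation partially on school-wide success, the Python sketch below blends an individual value-added score with a school-wide score. The function name and the 30 percent school weight are hypothetical, chosen only for illustration; the appropriate weighting is a local policy decision and is not specified here.

```python
def composite_score(individual, school, school_weight=0.3):
    """Blend an individual score with a school-wide score.

    school_weight is a hypothetical share (0 to 1) given to the
    school-wide component; the rest goes to the individual score.
    """
    if not 0.0 <= school_weight <= 1.0:
        raise ValueError("school_weight must be between 0 and 1")
    return (1 - school_weight) * individual + school_weight * school

# Hypothetical scores: a teacher with a modest individual score
# in a school with strong overall growth.
blended = composite_score(individual=0.2, school=0.6, school_weight=0.5)
print(round(blended, 2))
```

The larger the school weight, the more a teacher's rating depends on colleagues' results, which strengthens the incentive to collaborate but weakens the link to the teacher's own classroom.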
Unable to accurately rank teachers. Value-added models are not precise enough to rank teachers from best to worst, as if they were runners finishing a race. Value-added models most accurately identify the most effective and the least effective teachers, but are unable to distinguish any significant difference in effectiveness between the majority of teachers in the middle (Goldhaber 2010). Research has shown that value-added models identify all but about a third of teachers (the top 15 and bottom 15 percent) as average (McCaffrey et al. 2003). But identifying the least effective teachers can have a tremendous impact. As stated elsewhere, a very ineffective teacher could subtract nearly $800,000 a year from the labor market. Improving or eliminating the least effective teachers could have a huge impact on our economy, as well as on millions of children nationwide. Identifying the most effective teachers could also provide an opportunity to reward excellence and help other teachers learn from them.
If your district decides to tie teacher evaluations to compensation, consult the joint statement by the American Association of School Administrators, the American Federation of Teachers, the National Education Association and the National School Boards Association on the 11 guiding principles for teacher incentive plans. According to the guidelines, examining the research is just the first step in building a stable and well-accepted process. The guidelines encourage school boards, administrators and unions/associations to work together, and they offer specific recommendations such as sustainable funding, multiple measures of evaluation, and collaboration among teachers.
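One way to see why most teachers end up labeled average is to treat each value-added score as an estimate with a margin of error. The Python sketch below assumes, for illustration only, that scores are centered so that zero is the average teacher and that each estimate comes with a standard error; the teacher names, numbers, and the 1.96 multiplier (a conventional 95 percent interval) are assumptions, not details taken from the research cited above.

```python
def classify(estimate, std_error, z=1.96):
    """Classify a teacher relative to the average (assumed to be 0).

    A teacher is labeled above or below average only when the whole
    interval estimate +/- z * std_error lies on one side of zero.
    """
    low = estimate - z * std_error
    high = estimate + z * std_error
    if low > 0:
        return "above average"
    if high < 0:
        return "below average"
    return "average"

# Hypothetical (estimate, standard error) pairs.
teachers = {
    "A": (0.30, 0.10),
    "B": (0.05, 0.10),
    "C": (-0.08, 0.10),
    "D": (-0.35, 0.10),
}

results = {name: classify(est, se) for name, (est, se) in teachers.items()}
for name, label in results.items():
    print(name, label)
```

In this sketch only the teachers at the extremes (A and D) can be distinguished from average; B and C have different point estimates, but the difference is within the margin of error, so ranking one above the other would not be justified.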
This report was written by Jim Hull, Center for Public Education Senior Policy Analyst.
Posted: March 31, 2011