Building a better evaluation system: At a glance

The push to change teacher evaluation systems, and especially to include statistical measures of teachers’ effect on student learning, is here. In 2005, 13 states were able to link teachers to their students’ performance data; in 2010, 35 states were able to do so and the number is expected to grow. The Obama administration’s Race to the Top effort urged states and districts to use this teacher-student link in teacher evaluations in order to be eligible for grants. In response, 17 states reportedly changed their evaluation systems to improve their chances of receiving RTTT funds. Private foundations like the Gates Foundation have also used their resources to examine teachers’ effectiveness and encourage the use of such measures. Clearly, this is a fast-moving train that will likely affect many if not most school districts eventually. In order to prepare, here’s what you should know:

  1. The current system is lacking. Current evaluation systems fail to identify the true variation in teacher effectiveness because they rate all but a few teachers as “satisfactory.” One study of teacher evaluation systems nationwide found that only 1 percent of teachers are evaluated as “unsatisfactory.” In districts that use multiple evaluation levels, only 6 percent of teachers rate below the top two categories. Other research shows that there is huge variability among teachers, even within schools, but it is hidden by inadequate evaluation tools. Until now, most evaluation systems have not been linked to any measure of students’ real learning, partly because data systems have not been in place. Randi Weingarten, head of the American Federation of Teachers, summed it up recently: “As important as evaluation is to assessing teacher performance, what passes for teacher evaluation in many districts frankly isn’t up to this important task.”

  2. Improving teacher effectiveness can dramatically impact student learning. Research has shown that teachers have the single greatest impact on students’ performance, more than family background, socioeconomic status, or school. By improving teacher effectiveness, districts could improve student achievement and save money at the same time, because they would be able to identify ineffective teachers early and provide them with appropriate support, rather than having to replace struggling teachers who leave the profession because of a lack of assistance. Designing and implementing a quality teacher evaluation system, one that identifies strong teaching where it exists and targets interventions where they’re needed for improvement, would take additional funds and careful thought, but the benefits would be significant.

  3. Value-added models have flaws, but are much better than the system we have now. The fairest way to identify strong teaching is through a system that looks at student gains. Value-added models, which work to isolate the impact a teacher has on his or her students’ achievement from other factors, are the latest refinement of such a system. However, value-added models have come under intense scrutiny and criticism, and the criticism needs to be considered. Most importantly, value-added scores, while better than other measures, still fluctuate enough that people question their precision. For instance, multiple studies have found that among teachers ranked in the top 20 percent of effectiveness one year, about a third of them were still in the top 20 percent the following year, although the vast majority stayed in the top half. The wide fluctuation shows that some of the difference in year-to-year scores was due to statistical imprecision instead of an actual change in the teacher’s effectiveness.

    However, while imprecision is a concern, the variation in scores should be considered against the current evaluation system, which almost certainly misidentifies many ineffective teachers as “satisfactory.” One study that compared teachers’ instructional practices to value-added scores concluded “[Value-added scores]…seem to be capturing important differences in the quality of instruction.” Another study found that value-added scores were useful in predicting which teachers would be successful in the future. As long as they are used in concert with other methods of evaluation, value-added scores provide a useful insight into teachers’ impact.

  4. Statistical measures are used effectively to evaluate people in other industries. Using imprecise statistical measures in evaluations is a generally accepted practice in fields outside of teaching. Major League Baseball, for instance, bases its million-dollar salary decisions largely on a player’s statistics, which can vary from year to year about as much as teachers’ value-added scores do. Others evaluated on similarly imprecise year-to-year measures include realtors, investors (on their rate of return), and utility company repairmen. Value-added models should not be judged against a standard of perfection. The question is whether including them as part of a comprehensive teacher evaluation system would be an improvement over what is in place now.

  5. There are ways to improve value-added models. The more years of data are used, the more precise value-added models become. For instance, the chance of misidentification drops by 10 percentage points when three years of data are used instead of one. Better state assessments, and aligning the assessments to what is taught, could also improve value-added models.

  6. Multiple measures are the way to go. Virtually all researchers advocate using value-added data as one of multiple measures when making decisions about teachers. Using traditional measures, such as classroom observation, along with value-added data will present a fuller, more accurate picture of a teacher’s true effectiveness. In current formulas that use value-added models, the value-added score generally accounts for 25 to 50 percent of the total rating. Which measures to use and how much weight to put on each are decisions best made locally based on data, resources available, and the district’s goals for the teacher evaluation system.
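
To make the weighting in point 6 concrete, here is a minimal sketch of how a composite rating could be computed. It assumes both measures have already been placed on the same 0-100 scale; the function name, the two-measure setup, and the example scores are hypothetical illustrations, not a formula from any particular district.

```python
def composite_rating(value_added, observation, value_added_weight):
    """Blend a value-added score with a classroom observation score.

    Assumption: both scores are on the same 0-100 scale, and the
    observation score receives whatever weight is left over.
    """
    if not 0.0 <= value_added_weight <= 1.0:
        raise ValueError("value_added_weight must be between 0 and 1")
    observation_weight = 1.0 - value_added_weight
    return value_added_weight * value_added + observation_weight * observation


# Hypothetical teacher: value-added score of 60, observation score of 80.
# Compare the two ends of the 25-to-50-percent range mentioned above.
for weight in (0.25, 0.50):
    print(weight, composite_rating(60, 80, weight))
```

The same teacher receives a different overall rating depending on the weight chosen, which is one reason the choice of weights is a local policy decision rather than a purely technical one.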

Questions for school board members

When implementing a teacher evaluation system that includes value-added data, school boards should consider how the data will be used, district technical capabilities, and the evaluation design.

If your district decides to tie teachers’ compensation to their evaluations, check out the joint statement by the American Association of School Administrators, the American Federation of Teachers, the National Education Association and the National School Boards Association on the 11 guiding principles for teacher incentive plans.

Following are some questions school boards could ask of themselves, the superintendent, consultants, or other service providers.

Policy questions:

  • Why do we want to use value-added results?
  • How will the results of the teacher evaluation be used?
  • Who will have access to the value-added data?
  • How will it be disseminated?
  • How will the evaluation help to improve personnel decisions?
  • How will the evaluation help improve a teacher’s performance?
  • Will principals’ evaluations include value-added scores?
  • Will value-added models be part of the superintendent’s goals?
  • Do the people affected understand and support it?

Technical questions:

  • Are we able to connect teachers to student test scores?
  • Who will design the value-added model?
  • Where can we look for advice on designing a system that would work best for us?
  • What other data can we include in the value-added model?
  • How will the value-added model account for missing student data?

Design questions:

  • What percent of a teacher’s evaluation will be based on value-added scores?
  • What measures will be used to evaluate teachers without value-added scores?
  • What other measures (observation, portfolios, etc.) will be used to evaluate teachers in concert with value-added scores?
  • How will this affect multi-year tenure (for instance, two-year tenure) if the accuracy of value-added scores improves with three years’ worth of data?
  • How will the evaluation account for team teaching, or will it?
  • Should value-added scores be averaged over multiple years?
  • Should the value-added model compare teachers within a single school or compare teachers across the district?
  • How will the value-added model account for differences in student populations and resources across schools?
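
One of the design questions above asks whether value-added scores should be averaged over multiple years. The sketch below illustrates, under a simplifying assumption, why averaging can reduce imprecision: if each year's score carries independent statistical noise of the same size, the noise in the average shrinks with the square root of the number of years. The function name and the single-year figure of 10 points are hypothetical, and the sketch is not intended to reproduce the misidentification rates cited in this report.

```python
import math


def standard_error_of_average(single_year_se, years):
    """Standard error of a score averaged over several years.

    Assumption: each year's error is independent and equally large,
    so the error of the average is single_year_se / sqrt(years).
    """
    if years < 1:
        raise ValueError("years must be at least 1")
    return single_year_se / math.sqrt(years)


# Hypothetical single-year imprecision of 10 points on the score scale.
for years in (1, 2, 3):
    print(years, round(standard_error_of_average(10.0, years), 2))
```

Real value-added errors are not perfectly independent from year to year, so the actual gain from extra years of data may be smaller than this idealized calculation suggests.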

This report was written by Jim Hull, Center for Public Education Senior Policy Analyst.

Posted: March 31, 2011
