Sunday, September 05, 2010

Value Added: Grading the graders

Sep 3rd 2010, 13:46 by R.M. | WASHINGTON, DC

A FEW weeks ago I wrote a post about the use of value-added statistical
analysis to evaluate teacher effectiveness. Briefly, I think it's a step in
the right direction, but teachers deserve a more comprehensive evaluation
system. Since then the Los Angeles Times, whose reporting on the subject
prompted my initial comment, has published a database with the ratings of
about 6,000 elementary-school teachers based on the paper's value-added
analysis. So, if you are a parent in the Los Angeles area, you can now find
out if your child's teacher is "least effective", "less effective",
"average", "more effective", or "most effective", based on the standardised
test scores of their students. Predictably, the teachers' unions threw a fit,
with the local union planning a protest in front of the Times building
(having already called for a boycott of the paper).

The unions have resisted using student test scores for evaluation purposes
for some time, though they are finally coming round to the idea. It has never
been quite clear why teachers or their unions consider their occupation
uniquely difficult to evaluate. There are many jobs that lack simple metrics
with which to gauge effectiveness, and most make do with some combination of
evaluation procedures. The results are not perfect, but Conor Friedersdorf
explains why it is so odd for teachers, especially, to resist using
unavoidably inexact methods, like value-added testing.

[E]very week they read student assignments and use their fallible judgment
to assign a letter grade, often based on opaque, somewhat arbitrary
standards. This process culminates in a report card sent home at the end of
every semester. It typically assesses achievement on an A to F scale that
presumably doesn't capture every nuance of student mastery over a subject.
High school teachers who give out these grades do so knowing that for many
students they'll one day be scrutinized by college admissions officers,
who'll admit or deny applicants largely based on the average of these
somewhat arbitrary grades that don't capture every nuance of a student's
academic abilities.

Despite their imperfections, I haven't seen many teachers eager to do away
with grades, and while I've seen a lot of teachers complain about being
evaluated based on test scores (a complaint with which I sympathize), I've
never seen a
persuasive defense of "masters degrees earned" or "years worked" as a better
metric of quality. Yet teachers unions champion a status quo that relies on
these very measures.

For all their harrumphing, the Los Angeles teachers union and the local
school district have agreed to negotiate a new evaluation system. "Top
district officials have said they want at least 30% of a teacher's review to
be based on value-added," reports the Times. "But they have said the
majority of the evaluations should depend on observations." That's a start.
The Obama administration is pushing for greater transparency
and more use of value-added analysis across the country. The next step is
agreeing on what to do with teachers who are deemed "least effective",
because heaven forbid we fire any of them.

When Does Holding Teachers Accountable Go Too Far?
The start of the school year brings another one of those nagging, often unquenchable worries of parenthood: How good will my child’s teachers be? Teachers tend to have word-of-mouth reputations, of course. But it is hard to know how well those reputations match up with a teacher’s actual abilities. Schools generally do not allow parents to see any part of a teacher’s past evaluations, for instance. And there is nothing resembling a rigorous, Consumer Reports-like analysis of schools, let alone of individual teachers. For the most part, parents just have to hope for the best.
That, however, may be starting to change. A few months ago, a team of reporters at The Los Angeles Times and an education economist set out to create precisely such a consumer guide to education in Los Angeles. The reporters requested and received seven years of students’ English and math elementary-school test scores from the school district. The economist then used a statistical technique called value-added analysis to see how much progress students had made, from one year to the next, under different third- through fifth-grade teachers. The variation was striking. Under some of the roughly 6,000 teachers, students made great strides year after year. Under others, often at the same school, students did not. The newspaper named a few teachers — both stars and laggards — and announced that it would release the approximate rankings for all teachers, along with their names.
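The "value-added" technique described above can be illustrated with a deliberately simplified sketch: predict each student's gain from one year to the next, then credit a teacher with the average amount by which her students beat that prediction. The toy model below uses the class-wide average gain as the prediction and entirely invented scores; a real analysis, like the one the Times commissioned, would use a regression with many more controls.

```python
# Toy value-added sketch. A teacher's score is the average amount by which
# her students' year-over-year gains exceed the gain predicted for them.
# All names and scores below are invented for illustration.

def value_added(students, teacher_of):
    """students:   dict of student id -> (prior_score, current_score)
       teacher_of: dict of student id -> teacher name
       Returns a dict of teacher name -> average residual gain."""
    # Predicted gain = the average gain across all students, a stand-in
    # for the regression-based prediction a real model would produce.
    gains = [cur - pri for pri, cur in students.values()]
    expected_gain = sum(gains) / len(gains)

    residuals = {}
    for sid, (pri, cur) in students.items():
        teacher = teacher_of[sid]
        residuals.setdefault(teacher, []).append((cur - pri) - expected_gain)
    return {t: sum(r) / len(r) for t, r in residuals.items()}

students = {1: (60, 72), 2: (55, 63), 3: (70, 74), 4: (65, 80)}
teacher_of = {1: "Smith", 2: "Smith", 3: "Jones", 4: "Jones"}
print(value_added(students, teacher_of))
```

Even this crude version makes the article's later caveats concrete: with only two students per teacher, a single unusual score swings the teacher's rating, which is why single-year estimates are noisy.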
The articles have caused an electric reaction. The president of the Los Angeles teachers union called for a boycott of the newspaper. But the union has also suggested it is willing to discuss whether such scores can become part of teachers’ official evaluations. Meanwhile, more than 1,700 teachers have privately reviewed their scores online, and hundreds have left comments that will accompany them.
It is not difficult to see how such attempts at measurement and accountability may be a part of the future of education. Presumably, other groups will try to repeat the exercise elsewhere. And several states, in their efforts to secure financing from the Obama administration’s Race to the Top program, have committed to using value-added analysis in teacher evaluation. The Washington, D.C., schools chancellor, Michelle Rhee, fired more than 100 teachers this summer based on evaluations from principals and other educators and, when available, value-added scores.
In many respects, this movement is overdue. Given the stakes, why should districts be allowed to pretend that nearly all their teachers are similarly successful? (The same question, by the way, applies to hospitals and doctors.) The argument for measurement is not just about firing the least effective sliver of teachers. It is also about helping decent and good teachers to become better. As Arne Duncan, the secretary of education, has pointed out, the Los Angeles school district has had the test-score data for years but didn’t use it to help teachers improve. When the Times reporters asked one teacher about his weak scores, he replied, “Obviously what I need to do is to look at what I’m doing and take some steps to make sure something changes.”
Yet for all of the potential benefits of this new accountability, the full story is still not a simple one. You could tell as much by the ambivalent reaction to the Los Angeles imbroglio from education researchers and reform advocates. These are the people who have spent years urging schools to do better. Even so, many reformers were torn about the release of the data. Above all, they worried that because the data allow teachers to be sorted and ranked, they offer the promise of clear and open accountability, and would become gospel even though they do not paint a complete picture.
Value-added data is not gospel. Among the limitations, scores can bounce around from year to year for any one teacher, notes Ross Wiener of the Aspen Institute, who is generally a fan of the value-added approach. So a single year of scores — which some states may use for evaluation — can be misleading. In addition, students are not randomly assigned to teachers; indeed, principals may deliberately assign slow learners to certain teachers, unfairly lowering their scores. As for the tests themselves, most do not even try to measure the social skills that are crucial to early learning.
The value-added data probably can identify the best and worst teachers, researchers say, but it may not be very reliable at distinguishing among teachers in the middle of the pack. Joel Klein, New York’s reformist superintendent, told me that he considered the Los Angeles data powerful stuff. He also said, “I wouldn’t try to make big distinctions between the 47th and 55th percentiles.” Yet what parent would not be tempted to?
One way to think about the Los Angeles case is as an understandable overreaction to an unacceptable status quo. For years, school administrators and union leaders have defeated almost any attempt at teacher measurement, partly by pointing to the limitations. Lately, though, the politics of education have changed. Parents know how much teachers matter and know that, just as with musicians or athletes or carpenters or money managers, some teachers are a lot better than others.
Test scores — that is, measuring students’ knowledge and skills — are surely part of the solution, even if the public ranking of teachers is not. Rob Manwaring of the research group Education Sector has suggested that districts release a breakdown of teachers’ value-added scores at every school, without tying the individual scores to teachers’ names. This would avoid humiliating teachers while still giving a principal an incentive to employ good ones. Improving standardized tests and making peer reports part of teacher evaluation, as many states are planning, would help, too.
But there is also another, less technocratic step that is part of building better schools: we will have to acknowledge that no system is perfect. If principals and teachers are allowed to grade themselves, as they long have been, our schools are guaranteed to betray many students. If schools instead try to measure the work of teachers, some will inevitably be misjudged. “On whose behalf do you want to make the mistake — the kids or the teachers?” asks Kati Haycock, president of the Education Trust. “We’ve always erred on behalf of the adults before.”
You may want to keep that in mind if you ever get a chance to look at a list of teachers and their value-added scores. Some teachers, no doubt, are being done a disservice. Then again, so were a whole lot of students.
David Leonhardt is an economics columnist for The Times and a staff writer for the magazine.
