Monday, September 15, 2008

Leonie Haimson Questions Jim Dwyer on F Grade at PS 8

It is almost impossible to keep up with the enormous body of work Leonie produces. I'm just throwing it up on this blog so we can reference it at some point. One day, whoever writes the history of the BloomKlein stewardship of the NYC school system will, if they don't shrink away in horror, find some of this stuff incredibly useful.


Does this school report card have important information about the school, or is it merely an artifact of an absurd evaluation system?

The latter. In addition to all the other statistical problems -- basing 85% of the grade on the results of two high-stakes exams, with gains and losses that are up to 80% random -- the tests themselves are not “equated” or aligned to support the sort of cross-year comparisons that Liebman uses them for.



Check out eduwonkette -- actually, today's guest commentator at EdWeek is Aaron Pallas, professor of sociology at Columbia University, who posts under the name Skoolboy: http://blogs.edweek.org/edweek/eduwonkette/2008/09/let_the_spin_begin.html#comments

"You drop them off at the beginning of the year, and on average, by the end of the year, your child lost ground in proficiency," Dwyer quotes Liebman as saying. "Where was the child last year, and where is the child this year?" Liebman asked. "You're comparing them to themselves."

A gentle reminder to Mr. Liebman, who was hired in January 2006: the state math and ELA tests which children take, and which are the primary basis for assigning these lovely letter grades, are not vertically equated. (See skoolboy's testing primer here.) This means that there is no basis for comparing performance on the fourth-grade test with performance on the fifth-grade test. For each test, there is a subjective judgment about what level of performance constitutes proficiency, but the tests are independent. There is no basis for claiming that children are going backward; there is no justification for claiming that a child "lost ground in proficiency," since proficiency doesn't exist in the abstract, only in grade-specific skills; and the children are not being compared to themselves. Rather, their location in the distribution of children's performance in one year is being compared to their location in that distribution the following year.

Actually, the first person to point this out to me was Bob Tobias, former head of testing at the Board of Ed and now a professor at NYU. You might like to contact him about the chronic misuse of test scores by this administration.
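
To make skoolboy's equating point concrete, here's a toy simulation -- all numbers invented, nothing to do with the state's actual scaling -- of why subtracting scores across two non-equated tests is meaningless, and why the only coherent comparison, a child's location in each year's distribution, can never show the average child "losing ground":

```python
import numpy as np

rng = np.random.default_rng(0)

# Two tests that are NOT vertically equated: each has its own arbitrary
# scale and its own subjectively chosen proficiency cut score.
grade4 = rng.normal(loc=650, scale=40, size=500)  # hypothetical 4th-grade scale
grade5 = rng.normal(loc=662, scale=35, size=500)  # hypothetical 5th-grade scale

# Naive (invalid) comparison: subtracting across two unrelated scales.
print(f"cross-scale 'change': {grade5.mean() - grade4.mean():+.1f} (meaningless)")

def percentile_rank(scores):
    """Each score's position within its own test's distribution, 0-100."""
    return 100.0 * (scores.argsort().argsort() + 0.5) / len(scores)

# Coherent comparison: location in each year's distribution. Mean rank is
# 50 in both years by construction, so rank changes are zero-sum across the
# population: individual children move up or down, but "the average child
# lost ground" is not a statement this kind of data can support.
shift = percentile_rank(grade5) - percentile_rank(grade4)
print(f"mean change in percentile rank: {shift.mean():+.1f}")
```

Individual children can rise or fall in rank, and one school's kids can fall relative to another school's; what unequated scores cannot show is an absolute loss of proficiency from one grade to the next.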



I'd like someone to explain to me why consistently poor results are either mere numerical illusions or not relevant to the mission of the school. Or both.


If the school had consistently poor results, why did the Mayor and the Chancellor applaud it only a few weeks ago and call it a model for the city? Why did they not see the consistently poor results that are now so obvious to Liebman?

The truth is that the system is nonsensical: the grades are either random or, if by chance they happen to be meaningful in this case, the school is being penalized because it doesn’t spend eight months doing test prep, as most schools now do -- or, worse yet, because it doesn’t engage in the cheating that has become common.

This is what the DOE’s high-stakes reign of terror has led us to: rampant cheating that the administration doesn’t even bother to investigate when whistleblowers complain.

FYI, there’s a relatively simple statistical test that can be used to identify schools where widespread cheating is occurring, but the DOE doesn’t want to use it, because they don’t want to know the truth: their entire house of cards of claimed achievement gains would then collapse.
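
Leonie doesn't say which test she means, but one well-known candidate is the screen Brian Jacob and Steven Levitt applied to Chicago classrooms in their 2003 study of teacher cheating: artificially inflated scores show up as a large one-year spike that evaporates the following year, since the cheating cohort's "gains" don't carry forward. A minimal sketch of that idea -- toy data and an arbitrary threshold, not the published methodology:

```python
import numpy as np

def flag_gain_then_drop(scores_by_school, z_cut=1.96):
    """Flag school-years whose mean score spikes sharply and then falls
    right back -- the gain-then-drop signature associated with cheating.
    `scores_by_school` maps school -> list of annual mean scores."""
    # Pool all year-over-year changes to estimate a typical fluctuation size.
    changes = [b - a
               for scores in scores_by_school.values()
               for a, b in zip(scores, scores[1:])]
    spread = np.std(changes)

    flagged = []
    for school, scores in scores_by_school.items():
        for t in range(len(scores) - 2):
            gain = scores[t + 1] - scores[t]
            drop = scores[t + 2] - scores[t + 1]
            # A big standardized gain immediately undone by a big drop.
            if gain > z_cut * spread and drop < -z_cut * spread:
                flagged.append((school, t + 1))  # the suspicious year
    return flagged

# Toy data: school B spikes in year 2 and falls straight back in year 3.
data = {"A": [660, 662, 661, 663],
        "B": [655, 656, 690, 657],
        "C": [670, 668, 671, 669]}
print(flag_gain_then_drop(data))  # -> [('B', 2)]
```

In practice Jacob and Levitt also checked for suspicious blocks of identical answer strings; a score screen alone will flag some honest schools that simply had a lucky year.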

Really, Mr. Dwyer, you cannot say you oppose high-stakes testing but then support this version of school reform; the entire system the administration has built rests on high-stakes tests used to reward or punish schools, principals, teachers, and the students themselves.

That’s all they have offered us; and that’s all they appear to believe education consists of.

Thanks,



Leonie Haimson
Executive Director
Class Size Matters
124 Waverly Pl.
New York, NY 10011
212-674-7320
classsizematters@gmail.com
www.classsizematters.org
http://nycpublicschoolparents.blogspot.com/



Please make a tax-deductible contribution to Class Size Matters now!



From: Jim Dwyer [mailto:dwyer@nytimes.com]
Sent: Sunday, September 14, 2008 8:49 PM
To: leonie@att.net; Leonie Haimson
Cc: nyceducationnews@yahoogroups.com; 'kaneri'; 'Leonie Haimson'
Subject: Re: errors in your column today





Sorry, I mistook your comment to be a comparison of Anderson to PS 8.

The statistics you cite have quite a range, and I didn't respond to them because their relevance to what happened at PS 8 seems unclear to me.

At PS 8, 55% of the kids who were at 3 or 4 went down in proficiency on those tests. On average, they declined 0.17. Why aren't they improving? Is this an utterly random result? Why did fewer than half the kids who were at proficiency levels 1 and 2 in math go up a grade?

Does this school report card have important information about the school, or is it merely an artifact of an absurd evaluation system?

I don't put a lot of faith in high-stakes tests. And I'm not an authority on statistical relevance by any means. But if I had a kid in that school, I'd certainly like to know about those numbers and I'd like someone to explain to me why consistently poor results are either mere numerical illusions or not relevant to the mission of the school. Or both.

Thank you for writing.

Jim Dwyer


At 11:22 AM 9/14/2008, leonie@att.net wrote:

Dear Mr. Dwyer:

thanks for your prompt reply. The relevant sentence in your column is the following:

"Ms. Hirschmann said that test results ­ which make up 60 percent of a school's grade ­ did not illuminate the quality of work."

Nothing here indicates that you were referring merely to the measure of test-score growth from the year before -- and indeed, test results determine 85% of a school's grade. The growth measure relies on the results of only two standardized tests, because the other two merely provide the baseline.

Moreover, when I referred to the Anderson school, I wrote that it had been in the peer group of PS 35 last year -- the school in Staten Island that received an "F" -- not of PS 8 this year.

Here is the quote: "Also, the peer school to which PS 35 was compared was the Anderson School, a highly selective “gifted and talented” school that children must score above the 95th percentile to be admitted to, while PS 35 is a school that any child in the neighborhood can attend."

Unfortunately, as I'm just a public school parent, and not a NY Times columnist, I have no idea what PS 8's peer group schools were this year, since the DOE has not yet released this information publicly.

You didn't respond to the statistic I cited, which is that 30-80% of the gains or losses in a school's test scores are believed to be essentially random.

In any case, don't rely on my opinion or Liebman's -- neither one of us is an expert in the area. Instead, I challenge you to find one national expert in statistics or testing, without a contract with the DOE, who believes that the NYC system of school grading has any reliability or validity; if you do, I promise I'll buy you a drink!

thanks,
--
Leonie Haimson
Class Size Matters
124 Waverly Pl.
New York, NY 10011
212-674-7320
leonie@att.net
www.classsizematters.org


-------------- Original message from Jim Dwyer: --------------

Thank you for your note. With some trepidation, I must disagree with your arithmetic.

I mentioned the 60% factor following Liebman's comment on measuring how much kids improve after a year, which involves not just the two tests you mentioned, but a total of four tests, taken a year apart; it is the factor of change in test results that is counted as 60 percent of the progress report.

My understanding is that the additional 25% factor you cite involves the two tests taken in the same year, and simply uses the gross scores.

It does seem that comparing a highly selective school with a neighborhood zoned school would be unwise. But on a list of about 40 schools that are said to be peers of PS 8, I didn't see Anderson anywhere. Perhaps it's there, but even if it is, I don't think it's accurate to list, as you do, Anderson as the sole peer of PS 8.

Thank you.

Jim Dwyer



At 03:38 PM 9/13/2008, Leonie Haimson wrote:

Dear Mr. Dwyer:



Thank you for your column on PS 8, but it has several inaccuracies:



1- Test results make up 85% of a school’s grade according to the system devised by the NYC Dept. of Education – not 60%, as you wrote. Sixty percent of the school grade is based on the change in test scores from one year to the next, and the scores themselves determine another 25% of the grade.



Though Jim Liebman may have argued that “the school reports are not the product of a single test…, but a blend of measurements that take into account how each child progresses and how well -- or poorly -- similar schools are able to help students move forward”, all these comparisons are made on the basis of the results of two high-stakes tests – one an English Language Arts exam taken in January, the other a math exam taken in March.



Peer groups are then constructed, against which each school’s scores are compared; the methodology used to construct these peer groups is itself very controversial.



Researchers have found that 34% to 80% of the annual fluctuations in a typical school's scores are random or due to one-time factors alone – unrelated to the actual learning taking place. (See Thomas J. Kane, Douglas O. Staiger, “The Promise and Pitfalls of Using Imprecise School Accountability Measures,” The Journal of Economic Perspectives, Vol. 16, No. 4. (Autumn, 2002), pp. 91-114. http://links.jstor.org/sici?sici=089)



The higher level of randomness applies to schools with small enrollments, like PS 8. Thus, under this grading system, a school's grade will be based more on chance than anything else.
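
The arithmetic behind the small-school point is plain sampling error: a school's score is an average over one cohort, and the variance of an average shrinks as 1/n, so small cohorts swing hardest. Here is a toy simulation -- invented numbers, purely illustrative -- of how much a school's mean can move from year to year when nothing about the school changes:

```python
import numpy as np

rng = np.random.default_rng(1)

def mean_score_swing(cohort_size, years=1000):
    """Std. dev. of the year-to-year change in a school's mean score when
    each year's cohort is a fresh draw from the SAME ability distribution
    (i.e., the school itself never changes)."""
    yearly_means = rng.normal(0.0, 1.0, size=(years, cohort_size)).mean(axis=1)
    return np.diff(yearly_means).std()

for n in (25, 100, 400):
    print(f"cohort of {n:3d}: annual swing sd ~{mean_score_swing(n):.3f} "
          f"(theory sqrt(2/n) = {np.sqrt(2 / n):.3f})")
```

A cohort of 25 swings four times as much as a cohort of 400, which is consistent with Kane and Staiger finding the randomness worst at small schools like PS 8.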



Not surprisingly, this system has arrived at many anomalous results. Many schools on the state or federal govt. failing lists have received “A”s or “B”s, and many high-achieving schools have received “D”s or “F”s.



See, for example, PS 35 in Staten Island, which received an “F” last year – despite being a neighborhood school where more than 95% of students reached standards. Why? The school had made big gains the year before, and thus was likely to drop the year after.
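
That "likely to drop" is regression to the mean, and it also bears on Mr. Dwyer's question about why PS 8's level 3 and 4 students declined: when each year's score is true ability plus test-day noise, the students (or schools) that scored highest last time will fall back on average, and the lowest scorers will gain, even if no one's actual ability moves. A toy illustration with invented numbers:

```python
import numpy as np

rng = np.random.default_rng(2)

# Each student has a stable true ability; each year's score adds fresh noise.
ability = rng.normal(0.0, 1.0, size=10_000)
year1 = ability + rng.normal(0.0, 0.7, size=ability.size)
year2 = ability + rng.normal(0.0, 0.7, size=ability.size)

top = year1 > np.quantile(year1, 0.75)     # stand-ins for "level 3/4" kids
bottom = year1 < np.quantile(year1, 0.25)  # stand-ins for "level 1/2" kids

# Top scorers "decline" and bottom scorers "gain" even though no student's
# true ability changed at all -- pure regression to the mean.
print(f"top quartile:    mean change {(year2[top] - year1[top]).mean():+.2f}")
print(f"bottom quartile: mean change {(year2[bottom] - year1[bottom]).mean():+.2f}")
```

None of this proves PS 8 taught anyone less; it is what the numbers would do on their own.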



Also, the peer school to which PS 35 was compared was the Anderson School, a highly selective “gifted and talented” school that children must score above the 95th percentile to be admitted to, while PS 35 is a school that any child in the neighborhood can attend.



IS 89 received a “D” last year, despite being the only middle school in NYC to have just received a No Child Left Behind Blue Ribbon award from the federal govt. as one of the best schools in the country at improving achievement for disadvantaged students. http://www.downtownexpress.com/de_231/is89earnsnational.html



PS 8 is far from the only school to have received an undeservedly low grade.



See the oped I wrote for the Daily News here:

http://www.nydailynews.com/opinions/2007/11/07/2007-11-07_why_parents__teachers_should_reject_new_.html?print=1



I hope you will reconsider and correct the statements in your column.



Thanks,



Leonie Haimson

Executive Director

Class Size Matters

124 Waverly Pl.

New York, NY 10011

212-674-7320

classsizematters@gmail.com

www.classsizematters.org

http://nycpublicschoolparents.blogspot.com/



Please make a tax-deductible contribution to Class Size Matters now!



From: nyceducationnews@yahoogroups.com [mailto:nyceducationnews@yahoogroups.com] On Behalf Of norscot@aol.com

Sent: Saturday, September 13, 2008 6:54 AM

To: nyceducationnews@yahoogroups.com

Subject: [nyceducationnews] Fwd: [ice-mail] At P.S. 8, Image Didn’t Match Performance



http://www.nytimes.com/2008/09/13/nyregion/13about.html

September 13, 2008

About New York

At P.S. 8, Image Didn't Match Performance

By JIM DWYER

How could a red-hot school in Brooklyn Heights -- with surging enrollment from middle-class and wealthy families, with test scores that are above average, and with extras paid for by parents' fund-raisers -- be declared a failure?

For some people, news that Public School 8 on Hicks Street will be getting an F on its upcoming report card cinches the case that education officials have lost their test-taking marbles.

And yet there is a strong argument that the F grade is just the sort of blunt truth-telling needed for schools that are highly regarded in the vaporous, unchallenged esteem of conventional wisdom.

More than 80 percent of the kids at P.S. 8 passed a standardized math test. Two-thirds passed the language arts test. In 2006, the mayor said the school should be imitated. In July, the schools chancellor announced that an annex would be built to accommodate the demand in what he said was a "very successful school."

In reality, children who start the year at P.S. 8 with decent or good scores in math and English actually have gone backward, said James S. Liebman, the chief accountability officer for the city's Department of Education.

"You drop them off at the beginning of the year, and on average, by the end of the year, your child lost ground in proficiency," Mr. Liebman said.

Children on the lower end of the scale -- the ones who had the most room for improvement -- made only the slightest gains compared with those at similar schools, Mr. Liebman said, while at most schools across the city, there were big improvements.

"Where was the child last year, and where is the child this year?" he asked. "You're comparing them to themselves."

For many people, the F grade for P.S. 8 ratifies their skepticism about standardized tests. If all the children, like those in Lake Wobegon, are above average, how could the school be failing?

"The whole formula is ridiculous," argued Jane R. Hirschmann, who founded the Parents' Coalition to End High Stakes Testing. "They have turned our institutions of learning into institutions of test-taking." It is only after the standardized English test in January and the math test in March that real teaching begins, she said.

She noted that in a 2003 journal article, Mr. Liebman himself bluntly criticized tests.

"High-stakes testing turned out to be an unreliable measure of the performance of individuals or institutions," Mr. Liebman and his co-author, Charles F. Sabel, wrote in The New York University Review of Law and Social Change. "It often created perverse incentives -- to teach to the test, or to exclude from the testing pool the students most in need of help."

At the time the article was published, Mr. Liebman, a law professor at Columbia and one of the country's leading authorities on the death penalty, was not yet working for the Department of Education.

"When he wasn't in the shoes he is in now, he certainly understood that high-stakes tests are not beneficial to students or schools," Ms. Hirschmann said.

Mr. Liebman does not dispute that point at all. The school reports are not the product of a single test, he said, but a blend of measurements that take into account how each child progresses and how well -- or poorly -- similar schools are able to help students move forward. Children from better-off families typically do much better on standardized tests than those from lower-income households.

"If you use high-stakes tests and nothing else, you're measuring ZIP code, race, socioeconomic status," Mr. Liebman said. "Most importantly, you need to measure how much kids improve after a year at their school."

Ms. Hirschmann said that test results -- which make up 60 percent of a school's grade -- did not illuminate the quality of work. "This has nothing to do with education," she said. "It only tells us whether the child can take a test."

On average, Mr. Liebman said, the higher-performing students at P.S. 8 lost a little more than one-tenth of 1 percent in proficiency in English and math; the lower performing students gained about a tenth of a percent. Why should anyone care about such numbers?

They make a major difference by the time the students are 18, Mr. Liebman said. Students get scores between 1 and 4. Of those who finish eighth grade with a 3.0 proficiency in math and English, just 55 percent graduate from high school four years later. For those with 3.5 scores, the graduation rate is 75 percent.

"I know it's troubling to people in the neighborhood, and it should be troubling," he said of the F grade. "The point is, compared to any other school in the city, this school is off the charts on the low end.

"We're trying to move away from a school that gets by on its reputation."

E-mail: dwyer@nytimes.com

