The Quick and the Ed: 11/30/08

Friday, December 05, 2008

Finland Bound

I'm leaving tomorrow for a week-long ~~junket~~ fact-finding mission to Finland along with some other think tank people and journalists who will be learning why the Finns are beating the world on PISA and other measures of education success. We'll be in Helsinki, so if you have any suggestions about places to go and people to see--education-related or otherwise--send me an email, kcarey at educationsector.org. And if you actually live in Helsinki, we're staying in the Sokos Hotel Vaakuna Helsinki--drinks are on me.

One of the nice things about ~~vacationing in~~ visiting Finland in an official capacity is that you get invited to the Finnish embassy here in DC for a pre-trip orientation / dinner / sauna bath. The embassy is just as you'd expect, all blond wood, glass, steel, and elegant Nordic design. The dinner included meatballs and I know just enough about Finnish history not to ask if they were Swedish. Afterwards, we retired to the embassy's extensive subterranean sauna facility--the only "diplomatic sauna" in Washington, FYI--where our host, a cheerful broadcast journalist, regaled us with stories of his youth spent in a special army ranger outfit based near the Arctic Circle whose mission was to travel via cross-country skis to positions on the Eastern border and repel attacking Soviet helicopters with shoulder-fired missiles. Why this hasn't replaced the comparatively-much-less-exciting biathlon as an official Winter Olympics sport is a mystery.

Earlier in the evening, he suggested that I should ask Finnish education officials why, if the Finns rank so high on PISA, the University of Helsinki is only the 68th best university in the world. But while I hope to learn about many things next week--early childhood education, recruiting high-quality teachers, and the Bologna process, among others--that one I already know.

In primary and secondary education, success is defined in terms of how much students learn. We can--and do--contest the hows and whys of it, but no one really debates the principle. In higher education, by contrast, success is defined in ways that have little or nothing to do with student learning. The world rankings he cited--which place 17 American universities among the top 20--are based entirely on research measures like publications, citations and prizes. A university could literally not enroll undergraduates and it wouldn't affect its position a bit.

OECD is now in the process of piloting a higher education version of PISA. Countries that lack world-renowned research institutions may find their standing improved when the focus shifts to student learning. But Americans won't know if our unchallenged reputation for having the world's best higher education system is deserved, because we have--sadly--chosen not to participate. I guess when one measure says you're the best, there's no reason to support any others.

Are Value-Added Effectiveness Measures Good Enough to Use for Compensation Decisions?

There’s a great deal of attention being given these days to using test scores to measure teacher performance, and recent announcements from the Gates Foundation ensure this will be high on the national agenda in coming years. But recent studies show that value-added measures contain a significant amount of error. That raises questions: how can imperfect measures be incorporated into high-stakes decisions like teacher pay? How good is good enough?

Reformers have been waiting for longitudinal data systems to be implemented to provide value-added data to support improvements to compensation and retention decisions. The data is now there in several states, but the quality of the new information may not be as good as many of us had hoped. Just before Thanksgiving, two new studies were released that show the lack of stability of value-added measures of teacher effectiveness over time. The first, by Dan Goldhaber, looks at North Carolina data to see if pre-tenure teacher effectiveness (measured by the value-added gain of a teacher’s students) is a good predictor of effectiveness post-tenure (here). The study showed that a teacher ranked in the bottom quintile of effectiveness pre-tenure has a 32 percent chance of being in the bottom quintile post-tenure. While this is better than random (random would be around 20 percent), it is not much better than random. At the same time, 11 percent of the poor performers pre-tenure (bottom quintile) end up in the top quintile post-tenure. The measure is a little more consistent at identifying top performers – 46 percent of pre-tenure top performers are top performers post-tenure (see Table 1 for all measures of pre- and post-tenure value-added effectiveness). Goldhaber also looked at using the first three years of data to predict outcomes, and the predictive power does not change much.

Table_1a.pdf
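
For anyone who wants to check the "not much better than random" claim, here is a minimal sketch in Python. It uses only the three percentages quoted above; the 20 percent baseline assumes teachers land in post-tenure quintiles purely by chance, and the function and variable names are mine, not Goldhaber's.

```python
# Shares quoted above from the Goldhaber study (pre-tenure -> post-tenure quintile).
# The baseline is an assumption: with five equal quintiles, pure chance
# puts a teacher in any given quintile 20 percent of the time.
RANDOM_BASELINE = 1 / 5

observed = {
    "bottom -> bottom": 0.32,
    "bottom -> top": 0.11,
    "top -> top": 0.46,
}


def lift_over_random(share, baseline=RANDOM_BASELINE):
    """Return how many times more likely than chance an observed share is."""
    return share / baseline


for label, share in observed.items():
    print(f"{label}: {share:.0%} observed, {lift_over_random(share):.2f}x chance")
```

A ratio near 1 means the measure tells you little more than a coin flip would; a ratio well above 1 means it carries real signal. The point of the sketch is only to put the three quoted figures on the same scale.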

A second paper by Tim Sass shows similar results from California and Florida studies (here). This paper focuses on whether value-added measures of teacher quality are stable enough to use for compensation decisions. It shows results similar to those of the Goldhaber study over time. The lack of stability over time may not be surprising given that the group of students a teacher gets each year is random. The data cannot measure whether a teacher had a particularly disruptive class in the first year and a better group of students the next. So, the randomness of classroom make-up may have a lot to do with these results. The Sass study shows that measurable student characteristics explain some of the differences in value-added effectiveness, but most of the differences across time are unexplained (see Table 2 for complete effectiveness measures).

figure_2.pdf

The part of the Sass study that caused me the greatest concern was how inconsistent these value-added measures are across tests. Students in Florida take two tests annually. They take a low stakes norm-referenced test and a high stakes standard-aligned test. Sass looks at how stable these value-added measures are across these two tests. So for this comparison, the random draw of students is the same for any given teacher. While these results look a little more stable (43% of bottom quintile teachers remain in the bottom quintile on the other exam), they are not as stable as you would hope. If just switching the exam moves 5 percent of teachers from the bottom of the distribution to the top, it would likely make teachers question the validity of the measure reflecting true effectiveness.

These papers and a few others suggest that value-added measures are not very consistent over time, and may not be the panacea for which some reformers have been hoping.

How Good is Good Enough? Now you would think that the bar for improving teacher compensation and tenure decisions would be pretty low. The current compensation structure is based almost exclusively on a teacher’s years of experience and college credits/advanced degrees. Advanced degrees have been consistently shown to have no impact on teacher effectiveness. As for experience, teachers appear to improve their craft slightly over the first two to three years, but additional experience does not seem to have any impact. Clearly, moving to value-added compensation could reward effective teachers more accurately than the current system. However, if a compensation system were based partially on these value-added measures, I think that teachers would perceive the outcomes above as too arbitrary. It also makes me think that principals and mentor teachers could do a better job of predicting effectiveness than last year’s test results. (See Brian Jacobs on this question – principals seem to do pretty well at identifying teachers at the top and bottom of the distribution, but their measurement is less predictive than the prior year’s value-added (here).) Of course, this is not an either-or choice. Can principals armed with value-added test results do an even better job than either one alone? What about a combination of principal evaluations, mentor teacher evaluations and value-added? Are more rigorous evaluation methods, like those of the Teacher Advancement Program or others, better predictors than the value-added measures? (See Ed Sector Report on Teacher Evaluation here.) As with all good research, it leads to more research. And with Gates interested in these topics, more research is likely to be on its way.

Thursday, December 04, 2008

Two Steps Back

Two pieces of bad news today for those working to build a quality supply of public schools:

First, the 1st District Court of Appeals in Florida ruled that a 2006 law creating the Florida Schools of Excellence Commission conflicts with the Florida state constitution. According to the court, the Commission, which would be an independent, statewide office established to approve and support charter schools, was unconstitutional because the Florida constitution limits oversight of charter schools to local school districts.

Well, it looks like it's time to change the Florida constitution. A growing body of research shows that having, in addition to local school districts, one or more professional authorizers whose sole focus is approving and overseeing charter schools makes for a healthier and higher quality charter school sector. In fact, a 2006 ES report on Florida charter schools stated that the proposed Schools of Excellence Commission, "will likely reduce the number of appeals to the State Board and relieve unwilling sponsors of their chartering responsibilities while significantly improving the quality and transparency of authorizing across the state."

And the second piece of bad news is the bailout bill passed in the House Education Committee in Michigan, which allowed Detroit to keep its "first class school district" status (by reducing the enrollment threshold from 100,000 students to 60,000 students) and thereby limited the opening of new charter schools in the district. As we've reported, maintenance is required among Michigan's charter schools, but a blunt limit on opening new charter schools does nothing to improve the quality of charter schools or the quality of Detroit Public Schools.

Darling-Hammond Unbound

Score one for the KAPPAN magazine. The edu-magazine has a very timely piece on school accountability in its just-mailed December issue by top Obama policy advisor Linda Darling-Hammond. Anonymous "reformers," some of whom also have ties to the Obama administration-in-waiting, have been taking shots at the Stanford professor during the transition, in part because she has been tough on the quality of state testing under NCLB. They have declared her to be "anti-accountability."

The KAPPAN piece provides a valuable window on her thinking. She's indeed not a fan of NCLB-brand multiple-choice testing. "NCLB reinforced using test-based accountability to raise achievement, yet the US has fallen further behind on international assessments of student learning since the law was passed in 2001," she declares at the top of the article.

Darling-Hammond has spent a lot of time studying the teaching and testing systems of high-achieving industrialized countries and likes them better than ours. Among other things, she says, they teach fewer topics in greater depth; focus more on reasoning skills and applications of knowledge than on coverage of content; and rely heavily on open-ended questions "that require students to analyze, apply knowledge, and write extensively," in contrast to US tests that "rely primarily on multiple-choice items that evaluate recall and recognition of discrete facts." She's right about that.

Darling-Hammond points approvingly to a "growing emphasis" in high-performing countries on "project-based, inquiry-oriented learning" that has led "to an increasing prominence for school-based tasks, which include research projects, science investigations, development of products and reports or presentations about these efforts"--so-called performance tests. The bulk of the article (written with co-author Laura McClosky) describes approvingly locally administered performance assessment in countries ranging from Finland to Australia, Hong Kong, Sweden, and the UK.

There's little doubt that LDH would push to introduce these kinds of assessments into US public education if she were to have a senior role in the Obama administration: We need, she writes, "a new vision of assessment" in American education. To the extent that testing drives teaching, that would be a good thing.

The question is whether she would push to incorporate performance tests into NCLB-style statewide testing systems, or try to move testing down to the local level.

In writing that "the policy community has little understanding about how systems of assessment for learning might be constructed and managed at scale," she is acknowledging the challenges of using performance testing under the NCLB system, cost and scoring reliability chief among them. One thing she might do if she goes to work for Obama would be to have the federal government sponsor an effort to address the difficulties of doing performance testing at scale, and give states financial incentives for using such tests. Given the growing consensus that well-crafted performance assessments would represent a big step towards teaching students the higher-order thinking skills that they need today (Darling-Hammond points out that US students score lower on problem-solving than their international counterparts), this would be a smart investment--and a refreshing change from the Bush administration's hear-no-evil, see-no-evil stance on test quality.

Another possible policy solution, she implies elsewhere in the article, would be to incorporate performance-based local assessments into "overall examination scoring systems." That's what several of the countries she has studied do.

The big question for supporters of NCLB's statewide standardized testing systems is whether performance assessments would be used for holding educators accountable for student achievement. Much edu-blood was spilled over that question a year ago, when Rep. George Miller included the notion of local assessments in a draft NCLB reauthorization bill. The combatants eventually withdrew from the field and the Miller draft was decommissioned.

But it's clear that Darling-Hammond is ambivalent about using performance testing to hold educators accountable for student achievement. She notes that the countries she has studied "do not use their examination systems to rank or punish schools or to deny diplomas to students." Finland, she writes, "has no external standardized tests to rank students or schools." Instead, she writes approvingly, the testing systems in Finland and other countries are closely linked to efforts to develop teachers' ability to teach higher-level skills to their students; they are part of the countries' human capital strategies.

So, if Barack Obama gives Linda Darling-Hammond a major role in his administration, we're going to have a big policy debate over testing in American education and whether we should move beyond NCLB accountability to something potentially very different. Such a debate wouldn't be a bad thing.

Harvard's Endowment Falls to $29 Billion

Even after large stock market losses, if Harvard paid out five percent of its endowment--a requirement for all private foundations except those of colleges and universities--it would increase the school's budget by $214 million.

They've suffered a large financial loss on paper, but so have the rest of us, and lawmakers shouldn't let the economic downturn curtail efforts for endowment sanity.
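
The arithmetic behind that $214 million can be sketched quickly. The $29 billion endowment, the five percent rate, and the $214 million increase come from above; the "implied current payout" is my own back-of-envelope inference, assuming the $214 million is simply the gap between a five percent payout and what Harvard draws from the endowment now.

```python
# Figures taken from the post above.
ENDOWMENT = 29_000_000_000      # dollars, after the market losses
PAYOUT_PERCENT = 5              # the private-foundation payout requirement
BUDGET_INCREASE = 214_000_000   # dollars, the increase cited above


def required_payout(endowment, percent):
    """Dollars paid out at a given percentage of the endowment."""
    return endowment * percent / 100


five_percent_payout = required_payout(ENDOWMENT, PAYOUT_PERCENT)

# Assumption (not stated in the post): the increase equals the five percent
# payout minus the current payout, so the current payout can be backed out.
implied_current_payout = five_percent_payout - BUDGET_INCREASE
implied_current_rate = implied_current_payout / ENDOWMENT

print(f"Five percent payout:    ${five_percent_payout:,.0f}")
print(f"Implied current payout: ${implied_current_payout:,.0f}")
print(f"Implied current rate:   {implied_current_rate:.2%}")
```

If the assumption holds, the implied current rate lands a bit under five percent, which is why the required increase is measured in hundreds of millions rather than billions.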

The Rachel Maddow Show

I'll be on Rachel Maddow's Air America radio show today at ~~1:15~~ 6:00PM, talking about why college keeps getting more expensive. Rest assured, state legislatures will get their share of blame. But I'll also be pointing the finger at people like SUNY-Buffalo president John Simpson, who apparently sees the current economic crisis as the perfect opportunity to raise student tuition in order to fund a grand agenda of local economic development and institutional status-promotion. “It’s easier to push a conversation about this kind of substantive change today than it was a year or two ago, because the world wasn’t in such crisis,” Simpson said earlier this week. Call it the "shock doctrine" theory of making college less affordable.

Some might say that SUNY would still be relatively cheap even if Simpson's plan to raise tuition by 63 percent over the next decade were implemented. But that depends on the student. A new report from the National Center for Public Policy and Higher Education includes the following statistics about how the net cost of attending a four-year public university has changed over time as a percentage of the median income of families in the lowest and highest income quintiles.

Lowest quintile, 1999-00: 39%

Lowest quintile, 2007-08: 55%

Highest quintile, 1999-00: 7%

Highest quintile, 2007-08: 9%

Over the last eight years--and for quite a while before that--the rich have gotten much richer in America at the same time that college has gotten much more expensive. The two trends roughly cancel out. And since people in the highest quintile run things, they don't really see a huge problem. For low-income families, by contrast, college has always been a stretch, even after taking into account higher levels of financial aid. Middle- and upper-income families will probably be able to absorb tuition hikes. Low-income students at the margin, by contrast, will be increasingly priced out of the four-year sector, or have their studies compromised by the need to work, or be shouldered with unmanageable debt. I'm not saying there's never cause to raise tuition, but it would be nice if students actually got something back in return--better support services and more well-paid instructors, for example. Instead they're being asked to pay for research and other things that help everyone but those paying the bills.

Tuesday, December 02, 2008

Oxygen

A co-worker and I were discussing today the oft-repeated education reformer line that schools should be for students and not for adults. See Joel Klein use a version here and Michelle Rhee's iterations here, here, here, or here. It's a good line, if for no other reason than that it puts traditional powers in education policy (read: teachers unions) on the defensive, as if they do not consider student concerns, and it does so without even naming them.

As we talked about this trend, she suggested a metaphor that might counter the line. Whenever you board a plane, she pointed out, they go through a long safety ritual. "Please note your nearest exit rows" and "fasten your belt by inserting the buckle and pulling tight on the remaining cord." They also insist that, in the case of an emergency, adults secure their own oxygen masks before attending to their children. Despite the best instincts of parents, this policy actually makes some sense. There's no point in having a bunch of adults trying to help out their children first and fainting in the process. Better to secure their own safety in order to be in a position to help those who need it.

It's an extreme metaphor to be sure, but it actually makes some sense in the context of long-struggling urban education systems. They're bad, they've been bad, and they're crashing for whole segments of the population. But it isn't just the students that need help--district finance, curricula, infrastructure, technology, etc. are all suffering--and adults who try to rush in without fixing some of these problems first will just faint and flounder. They'll have no air.

I'm generally sympathetic to what Rhee is trying to do: she's going after the adults in the system who have long settled for complacency and demographics to explain why DC's public schools have been so bad. But in an effort to test the all-publicity-is-good-publicity theory, she's lobbing fireballs like this one, from last week's Time piece:

Rhee is, as a rule, far nicer to students than to most adults. In many private encounters with officials, bureaucrats and even fundraisers--who have committed millions of dollars to help her reform the schools--she doesn't smile or nod or do any of the things most people do to put others at ease. She reads her BlackBerry when people talk to her. I have seen her walk out of small meetings held for her benefit without a word of explanation. She says things most superintendents would not. "The thing that kills me about education is that it's so touchy-feely," she tells me one afternoon in her office. Then she raises her chin and does what I come to recognize as her standard imitation of people she doesn't respect. Sometimes she uses this voice to imitate teachers; other times, politicians or parents. Never students. "People say, 'Well, you know, test scores don't take into account creativity and the love of learning,'" she says with a drippy, grating voice, lowering her eyelids halfway. Then she snaps back to herself. "I'm like, 'You know what? I don't give a crap.' Don't get me wrong. Creativity is good and whatever. But if the children don't know how to read, I don't care how creative you are. You're not doing your job."

Kevin doesn't like the magazine cover's title ("How to Fix America's Schools"). But it's not just the title. The cover itself is Rhee looking stern in a classroom, dressed in black, holding a broom, suggesting she'll sweep away problems. Quotes like the one above and the cover photo--two things Rhee had complete control over--are the things that test the publicity theory. As in the plane metaphor, "interests of children" advocates need to be careful how much they say and do, or else they may find they're lacking air.

Competition on Quality, Service, and Price

A couple of weeks ago I was at a meeting where a higher education spokesperson flat-out stated there was no market demand for student learning data. His group had done focus groups, he said, and it just wasn't as high on their list as other things. His point would be fair, if true, if there were real, meaningful data available on student outcomes in higher education. Instead, his argument is a chicken-and-egg conundrum. We have no meaningful data, so students and parents don't request it. Students and parents don't request it, so colleges don't provide it.

Luckily, consumer preferences can change. Automobile companies told Ralph Nader that safety didn't sell. He shamed them into understanding it can, and now car commercials are as much about side-impact safety beams as they are about horsepower. The automotive industry wasn't about fuel economy when gas prices were cheap, but all of a sudden we're seeing advertisements extolling a vehicle's miles per gallon. Ford went to Congress today hat in hand promising to develop electric vehicles and sell the Hummer brand.

All this is to say that we don't always have to accept $6 million buildings only for tutoring athletes or $55 million dorms complete with Coldstone Creamery, 7/11, state-of-the-art gym, grocery delivery, room cleaning, and laundry service as the sole basis on which colleges compete for students. They could, you know, compete on quality, service, and price.

For more on this topic and many others, listen to today's Education Sector event, "Is Technology the Answer to Rising College Costs," below:

Generalizability

The Michelle Rhee story continues to percolate ever-upward through layers of media, landing on the cover of Time this week. While there are many sound policy-based reasons for supporting her reform efforts, I have to admit one of the things I like is that she talks the way I talk. Not just about education, but in general. Which isn't surprising; we're the same age and both graduated from upstate New York universities in the same year, 1992. (Mine, SUNY-Binghamton, was where people tended to enroll when they couldn't get into hers, Cornell). To wit: "What I'm finding is that our principals are ridiculously--like ridiculously--conflict-averse," Rhee says. When I was in high school one English teacher went on at length about how the ubiquitous improper use of the word "like" as a means of emphasis was a clear sign of the linguistic apocalypse. Now famous Ivy League-educated Time-cover-gracing people talk this way. (Although there's still some loss in translation--I suspect the second "ridiculously" should have been italicized.)

The story is well-written and worth reading. My biggest qualm--and it's amazing how often this happens--is with the magazine cover. Inside, the piece is titled "Rhee tackles classroom challenge," which is fair enough. But the cover title is "How to Fix America's Schools." And that's wrong. It would be a mistake to over-generalize about the lessons of DCPS, which is (thankfully) unusual. Most school districts haven't been systematically degraded by three decades of often corrupt one-party rule. Most districts don't employ significant numbers of truly incompetent teachers. Most districts are not unequivocally the worst in the nation when compared to similar districts. Most districts get much less funding per student. Most high-poverty districts aren't funded at levels similar to the surrounding wealthier suburbs. And so on.

The educational challenges in DC are unusual and, compared to most districts, extreme. The needed changes are of commensurate severity. Seeing DC as the definitive proving ground for larger questions about tenure, management style, etc. is not going to serve anyone's interests in the long run. The issues themselves will become over-politicized and thus harder to solve. And inferences drawn about what makes sense for other districts will be distorted by the differences with DC.

Monday, December 01, 2008

Brainstorm

Hope everyone had a great Thanksgiving break. Mine was excellent, including a trip to New York to see The Seagull, right up until the part that involved being on the southbound New Jersey Turnpike at 11PM in the middle of a rainstorm, stop-and-go traffic, etc. Then, not so much.

Announcement: Starting today I'm going to be a regular contributor to Brainstorm, the group blog at the Chronicle of Higher Education. It will include a fair amount of cross-posting from Q&E along with new material. You'll also be seeing my byline in the print edition, starting with this column about the biennial "Measuring Up" state report cards and how they were right all along.

Edison-Go-Round

EdisonLearning, the school management company founded by Chris Whittle in the early 1990s, has fired its CEO, Terry Stecz, two years after he replaced Whittle. In a memo to Edison staffers, Michael Stakias, the president of Liberty Partners, a New York-based private equity firm that is Edison’s majority owner, announced that Jeff Wahl, Edison’s chief operating officer since 2007, would replace Stecz immediately. Before joining Edison, Wahl spent 15 years in various management roles at General Electric. Maybe Liberty's hoping he's got the right background to illuminate a path to profitability at a company named for the inventor of the lightbulb.

Stecz apparently didn't. He was working in management at Pharmacia, the $14-billion healthcare company with products that included Celebrex and Nicorette, when Liberty recruited him to be Edison’s COO in 2004. He became Edison’s CEO when Liberty pushed Whittle out of the company’s management in early 2007. Stecz struggled, with little apparent success, to shake off Edison’s troubled legacy as a controversial, unprofitable school management company, going so far as to change the company’s name earlier this year from Edison Schools, Inc., to EdisonLearning and announce a move into on-line education.

Whittle hasn’t fared much better than his successor. The high-living entrepreneur set out to launch an international network of high-end for-profit private schools when he left Edison in early 2007, only to depart his new company, Nations Academy, last summer in the wake of a falling out with his major investor, Sunny Varkey of Dubai.

Whittle is reportedly planning a new for-profit private school business. And he’s trying to raise some cash. He’s put up for sale, for $27 million, a guest house and a third of the property on his 11-acre estate on Georgica Pond in the Hamptons. Six years ago, when Edison’s stock crashed and the company nearly went under, Whittle sought to sell the entire estate, where the neighbors include Steven Spielberg and Martha Stewart, for $45 million, before taking the property off the market when Liberty bought Edison.

18th Century Skills

Before there were “21st Century Skills,” there were “18th Century Skills.” None other than Benjamin Franklin identified 13 virtues to which he aspired. But Franklin knew that simply trying to embody his virtues without keeping track of his performance wouldn’t be enough. So he created a series of tables to record daily transgressions.

Quantifying his data this way made it possible for Franklin to track his progress over time. While he never achieved perfection on the scale he created, he “was by the endeavor a better and happier man” and “had the satisfaction of seeing [marks representing transgressions] diminish” over time.

Not unlike Mr. Franklin, teachers gather quasi-quantitative data about students every day. Doing so can be as simple as a system of behavior checks and minuses. But unless that data is captured consistently over time and communicated to all of the adults who work with that student, we miss opportunities to identify trends and correlations that can help us serve students better and assess the impact of our interventions. Technology has the potential to make such information easier to capture and quantify as well as to provide improved tools of analysis and communication. Wireless Generation, for example, develops simple ways for teachers, students and others to capture quasi-quantitative information about academic and social indicators on handheld devices. Having the information digitized makes it easy to identify trends and correlations.

Imagine if digital tools that could highlight a correlation between lapses in Temperance and Chastity had been available to Mr. Franklin.
