Monday, November 10, 2008

g(t)?

Writing in the Boston Globe (per Matt Yglesias), Harvard economist Edward Glaeser cites Tom Kane's research on teacher quality, saying:

The first step toward improving teacher quality is to attract more talented teachers. The second step is to improve teacher selection on the job, promoting the best and encouraging the worst to help society in some other way.
The key words are "attract" and "selection." But just as important are the missing words: "certify" and "train."

The conventional system for bringing teachers to classrooms relies on pre-service training and certification to ensure quality. Students are required to complete a state-approved training program, usually offered at a university, and then obtain state certification, which increasingly involves passing some kind of exam.

Tom Kane's research shows that the training and certification model is near-worthless as a means of guaranteeing or differentiating teacher effectiveness. The paper he wrote with Robert Gordon and Douglas Staiger for the Hamilton Project in 2006 is well worth reading in full, but you can get the gist just by looking at Figure 1, on page 8.

We have three bell curves, one made of a solid line, one a dashed line, and one a dotted line. Each of the lines shows the distribution of a group of elementary teachers in Los Angeles based on classroom effectiveness, as measured by their students' math test score gains, after controlling for baseline scores, student demographics, and program participation. The solid line shows the population of traditionally certified teachers, the dashed line alternatively certified teachers, and the dotted line uncertified teachers.
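The "controlling for" step amounts to regressing score gains on baseline scores and student characteristics alongside per-teacher indicators; the teacher coefficients are the "value-added" estimates. Here is a minimal sketch with synthetic data -- every number, covariate, and sample size below is invented for illustration, not the paper's actual specification:

```python
import numpy as np

rng = np.random.default_rng(0)
n_teachers, n_students = 50, 20            # hypothetical: 50 teachers, 20 students each
teacher_effect = rng.normal(0, 5, n_teachers)  # "true" effects in percentile points

rows = []
for t in range(n_teachers):
    baseline = rng.normal(50, 15, n_students)       # prior-year percentile score
    poverty = rng.binomial(1, 0.5, n_students)      # stand-in demographic control
    gain = (teacher_effect[t] + 0.1 * (50 - baseline)
            - 2 * poverty + rng.normal(0, 8, n_students))
    rows.extend((t, b, p, g) for b, p, g in zip(baseline, poverty, gain))

data = np.array(rows)
t_idx = data[:, 0].astype(int)

# Design matrix: one dummy per teacher, plus the student-level controls.
X = np.zeros((len(data), n_teachers + 2))
X[np.arange(len(data)), t_idx] = 1.0
X[:, n_teachers] = data[:, 1]       # baseline score
X[:, n_teachers + 1] = data[:, 2]   # demographic indicator
y = data[:, 3]

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
est = coef[:n_teachers]             # estimated teacher "value-added"
print(np.corrcoef(est, teacher_effect)[0, 1])  # estimates track the true effects
```

The point of the sketch is only that, once baseline scores and demographics are partialed out, what's left attached to each teacher dummy is the effectiveness estimate the bell curves are drawn from.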

[Figure 1 from the paper, reproduced in the original post: three overlapping bell curves of teacher effectiveness, one each for traditionally certified, alternatively certified, and uncertified teachers.]

The graph shows two crucial things. First, the bell curves are wide--there is a lot of variation in effectiveness among teachers within each population. Students with the most effective teachers gain ten percentile points or more in one year; students with the least effective teachers lose ten percentile points. To put that in perspective, the cumulative black-white achievement gap nationwide is about 35 percentile points.

Second, the curves are, for all intents and purposes, exactly the same. There's a tiny bit of daylight between the certified teacher curve and the other two. But it's dwarfed by vast differences in effectiveness within the populations. The long, tedious argument within the education community about the virtues of training and certification amounts to debating how many angels can dance on the head of a pin. As near as anyone can tell, there is no way to figure out ahead of time who is going to be an effective teacher and who is not via traditional training and filtering processes. The best way--arguably, the only way--to figure out who is an effective teacher turns out to be letting people teach and seeing if they're effective.
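The within-versus-between point can be made concrete with simulated draws. The parameters below are invented to mimic the figure's general shape--a sliver of daylight between routes, wide spread within each--and are not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical effectiveness draws (percentile-point gains) for three routes.
certified   = rng.normal(0.5, 5.0, 10_000)   # tiny edge for certification
alternative = rng.normal(0.0, 5.0, 10_000)
uncertified = rng.normal(0.0, 5.0, 10_000)

between = certified.mean() - uncertified.mean()  # gap between routes
within = certified.std()                         # spread within one route
print(f"between-route gap ~{between:.1f} points; within-route SD ~{within:.1f} points")
```

Under these assumptions, knowing a teacher's certification route tells you almost nothing about where on the curve that teacher will land--which is the figure's whole argument.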

The logic of this quickly leads to a very different approach toward teacher quality--not train and certify, but, as Glaeser writes, attract and select. Do whatever you can to attract smart, motivated, talented people into the profession, and then impose a ruthless screening process based on results in the first few years. That doesn't mean you shouldn't try to make those first years as supportive as possible. Yet, seen through the lens of those overlapping bell curves, recent disappointing results from a study of induction models aren't surprising. Some students will suffer in the classrooms of the least effective teachers (academically at-risk students should be protected), but that's just as true of the current system.
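A toy simulation shows why screening on even a noisy early estimate raises the average quality of the retained workforce. Both the spread of true effects and the estimation noise below are invented for the sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
true_effect = rng.normal(0, 5, n)             # hypothetical true teacher effects
observed = true_effect + rng.normal(0, 3, n)  # noisy estimate after a few years

cutoff = np.percentile(observed, 25)          # screen out the observed bottom quartile
retained = true_effect[observed > cutoff]

print(f"mean true effect, all hires: {true_effect.mean():+.2f}")
print(f"mean true effect, retained:  {retained.mean():+.2f}")
```

Because the observed estimate is correlated with the true effect, selecting on it shifts the retained distribution upward even though individual estimates are noisy--the statistical core of the "attract and select" approach.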

The interesting thing is that the Kane/Staiger/Gordon paper came out two-and-a-half years ago. I'm not aware of any serious challenges to the data, methods, or conclusions. As far as education policy papers go, it received a lot of attention, getting written up in the Wall Street Journal here and being name-checked by Nicholas Kristof here and Bob Herbert here. And of course it garnered hyperbolic praise on this blog, way back when.

Yet the education world keeps right on going with the traditional system. There's a lesson here in human psychology: when evidence is revealed that pushes people completely outside of their established frame of reference, a little mental "na-na-na-na-na-" device turns on and they simply proceed as if it doesn't exist. I suspect the dissonance is particularly jarring for education professionals, since the implication is that, paradoxically, there are severe limits on the extent to which good teaching can be taught.

The findings--which are supported by a host of other teacher effectiveness studies--also raise an intriguing question: If variation in teacher effectiveness isn't attributable to anything we can currently observe, then what's driving it? Is there some kind of innate teaching ability, a sort of g(t), that combines with more generalized qualities like motivation, intelligence, and the experience of having been well-taught?

I don't pretend to know the answer, but I know that the empirical justification for continuing the current way of doing business in teacher policy is quickly eroding, and that researchers would serve education better by focusing more of their attentions on exploring the source of large differences rather than small ones.

10 comments:

Anonymous said...

Like Robert Gordon says, we need to get rid of the bottom quartile of teachers, and it must be done in a multi-measured way (and I would add that we don't yet have the ability to do that primarily using performance models). Gordon also noted that we need to build a competent corps of principals for the evaluation and the school empowerment to work. How long would it take for that "chicken and egg" routine to be reconciled? In the interim, would you be willing to have your career ruined by an incompetent principal or statistical model, hoping that someday the system will work?

But the CAP also notes that many schools are "lucky to get even two applicants per opening per year."

Gordon calculated the results of removing the bottom quartile. But if you just had subs taking over those classrooms, you haven't gained anything. Unless you play your cards skillfully, his proposals could do far more harm than good to poor students.
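A quick back-of-the-envelope simulation (all numbers hypothetical) makes this concrete: cutting the bottom quartile only helps if the replacements are at least roughly average.

```python
import numpy as np

rng = np.random.default_rng(3)
teachers = rng.normal(0, 5, 100_000)      # hypothetical effectiveness distribution
cutoff = np.percentile(teachers, 25)
kept = teachers[teachers > cutoff]

def workforce_mean(replacement_mean):
    """Mean effectiveness after replacing the bottom quartile."""
    reps = rng.normal(replacement_mean, 5, len(teachers) - len(kept))
    return np.concatenate([kept, reps]).mean()

print(f"status quo:              {teachers.mean():+.2f}")
print(f"replaced with weak subs: {workforce_mean(-8.0):+.2f}")
print(f"replaced with average:   {workforce_mean(0.0):+.2f}")
```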

If we want to help kids, the devil is in the details. If you want to know enough about the details to make things better, you have to work with the only people who have a detailed knowledge of actual schools. That means listening to real live inner city teachers and our unions.

But I agree with the main thrust of your post.

KC said...

Aren't you focusing too narrowly on the issue of certification? Don't the data also suggest a need to change the nature of teaching?

In order to be more selective, teaching cannot continue to be a solo performance where one teacher gets a room full of kids to themselves. The prior commenter suggests we need more and better principals to weed out the bad 'uns. But I'd suggest we need to look at ways of building feedback and evaluation into the job by changing the way the job is structured.

If teaching were more collaborative, if teamwork were more ingrained into the job, not only would 'selection' become easier, but 'development' and 'training' would be a lot easier.

I'm not well-read enough in the research to know if this idea has been explored sufficiently. But from my lay perspective it seems to me that teachers are left way too much to their own devices and really could benefit from more teamwork and collaboration.

Anonymous said...

Doug Harris has addressed this issue in part by pointing out the fragility of value-added measures. (That's what Gordon et al. rely on with the brief, not-well-described use of the L.A. data.) He talks about the big-picture issue as well.

This graph appears in the context of a policy brief, without the methods/data description usually in a formal paper. I know that they've published data on NYC with different questions, but have they published the L.A. stuff with more detail? In research, silence is not consent.

But back to the larger question: I think that the larger point is correct, that variation within categories is pretty large. That's not surprising -- whether you think standard certification works or not, no one is going to say that all programs -- or all alternative-cert programs/routes -- are the same. How to deal with that huge variation is a challenge.

And, yes, the Mathematica study on induction is depressing.

Anonymous said...

Of course that body of research has been ignored. Who among the education professoriate wants to admit that their own programs are useless?

TurbineGuy said...

Once again, I will try and get a straight answer.

Are there any studies that show the relative effectiveness of education majors from different programs?

I don't mean subjective Newsweek college rankings, but value added rankings based on which schools produce the most effective beginner teachers.

Everyone talks about alternate certification vs traditional routes, but I refuse to believe that all traditional routes are the same.

I am willing to wager that if measurable school rankings are published, education schools will start to make some dramatic changes in their curricula and programs.

Kevin Carey said...

Short answer: no. But some states are moving in this direction. I believe (someone correct me if I'm wrong) that Louisiana has already compiled this kind of information but hasn't released it to the public. In the long run, it's unavoidable -- accountability policies have created annual testing information, state data systems are making it easier to connect test scores to teachers, and the methods have already been established. The hard part will be, as it often is, the politics, as schools of education will object to being compared to one another on these terms. But it will happen.

AldeBeer said...

Kevin, Louisiana has an ongoing process to re-design teacher education programs and evaluate them as they go. The latest report is from 2004-2006 and can be found here:
http://tinyurl.com/62ny4f

Three programs (Louisiana College, Northwestern State University, and The New Teacher Project) have evidence that their teachers are more effective than experienced teachers. Other schools produce a lot more teachers who are a lot less effective.

Parry Graham said...

kc,

The professional learning community (PLC) model is an attempt to increase collaboration happening within schools. There is very little direct research on its benefits, but it is a popular approach right now in K-12. The tenets of the model speak to the point you make: rather than leaving it up to individual teachers to sink or swim, have them formally work together and share best practices across classrooms.

For more info on the model, just google "Rick DuFour", one of the main proponents.

Parry

TurbineGuy said...

Thank you Kevin.

It seems to me that finding a way to fairly evaluate the effectiveness of different schools would do more to improve teacher quality than any other initiative out there.

TurbineGuy said...

Chad.

You rock. That's exactly what I was looking for.