Thursday, February 07, 2008

Don't Be Questioning My Bill James-itude

Calling my Daily News editorial on the NYC value-added controversy "outlandish and mathematically inept," Steve Koss says:


True baseball aficionados -- those familiar with the work of Bill James, for example -- also understand that these now-famous analytical models are almost exclusively multivariate regression models. In other words, baseball general managers like Billy Beane use mathematical models that predict a player’s value or performance from many different variables simultaneously, each variable clearly measurable and each contributing a portion of the total “value added.” These models are mathematically complex, fraught with issues of relevance, cross-interference among variables, and time series interdependencies (respectively called statistical significance, multicollinearity, and autoregressive conditional heteroskedasticity) that must be carefully considered in their formation and use.

Contrast this approach with the DOE’s under Chancellor Klein, where a teacher’s ostensible “value added” is derived entirely from a single variable, standardized test scores, that is itself an arguably spurious measure. Imagine baseball owners paying their players on the basis of just one variable, such as number of home runs. Within a few years, it would be hard to tell the New York Yankees from the New York Giants – every Yankee would be 6’6”, weigh 275 pounds, bench press 500 pounds, and hit 40+ home runs per year. With players judged and rewarded on any single variable, the game of baseball would be rendered unrecognizable, grossly perverted from the multiple-skill game it is today.

Okay, people can say what they like about my credentials, education policy papers, or what have you, but I started buying the Bill James Baseball Abstract in the mid-1980s. These accusations will not stand.

Moreover, Koss doesn't know what he's talking about. The NYC value-added measures are not "derived from a single variable"; they're exactly the kind of complicated multivariate measure he describes. As the NY Times reported:

The city’s pilot program uses a statistical analysis to measure students’ previous-year test scores, their numbers of absences and whether they receive special education services or free lunch, as well as class size, among other factors. Based on all those factors, that analysis then sets a “predicted gain” for a teacher’s class, which is measured against students’ actual gains to determine how much a teacher has contributed to students’ growth.

The NYC model uses something like 12 discrete variables, and the HLM version of value-added pioneered by Bill Sanders is so complicated that you need a PhD in statistics and a special computer at SAS headquarters to run it. It's more complicated than anything Bill James does, as it should be.
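
For the curious, here is a rough sketch of the "predicted gain" idea from the Times description, in Python. To be clear, this is not the DOE's model or Sanders' HLM: it is plain least-squares regression standing in for the real thing, with only three of the factors the Times names, and every number and function name in it is made up for illustration.

```python
from statistics import mean

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_ols(rows, gains):
    """Least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * g for x, g in zip(X, gains)) for i in range(k)]
    return solve(xtx, xty)

def predict(coefs, row):
    """Predicted test-score gain for one student."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))

def value_added(coefs, class_rows, class_gains):
    """Class's actual average gain minus its predicted average gain."""
    return mean(class_gains) - mean(predict(coefs, r) for r in class_rows)

# Invented citywide data: (previous-year score, absences, free lunch 0/1).
students = [(600, 2, 0), (650, 5, 1), (700, 0, 0),
            (620, 8, 1), (680, 3, 0), (640, 1, 1)]
gains = [9.5, 3.0, 5.5, 2.5, 4.0, 6.5]
coefs = fit_ols(students, gains)

# One teacher's (invented) class and the gains her students actually posted.
her_class = [(610, 1, 0), (660, 4, 1), (690, 2, 0)]
her_gains = [10.0, 6.0, 7.0]
print(f"value added: {value_added(coefs, her_class, her_gains):+.2f}")
```

The point of the exercise is that the teacher is measured against what students like hers would be expected to gain, not against a raw test score. A real model would use far more data and, as noted, far more machinery.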

As for baseball, yeah, imagine if the Yankees started throwing untold millions of dollars at players based primarily on their home run totals, leading to players shooting themselves full of steroids and turning into musclebound, home-run-producing freaks. It's a good thing that never happened! Instead, the Yankees continue to dominate the American League East and add to their historic World Series victory total by sticking to the tried-and-true Yankee tradition of paying players strictly on the basis of the number of years since they left the minor leagues, regardless of what position they play, how well they hit, or the number of games they win.
