26 September 2010

Statheads on Crack


Sabermetrics have transformed baseball analysis and helped regular fans like us understand the game differently and better. Computer-fueled statistical analysis has punched holes in old shibboleths and established new, research-supported truths. Often, they tell us that our eyes deceive us.

The stat geeks, spreadsheet-munchers and number nerds responsible for this revolution deserve our appreciation not just for devising new ways to measure things, but for continuing to devise better ones. Ten years ago, when seamheads were gestating the concept of dissociating pitching from defense, the results were often best ignored. They have improved steadily, thanks to MLB's ability to catalog every single pitch, to the point where we could, in broad strokes, predict the decline of Ubaldo Jimenez in the second half of the season.

Sometimes, however, seamheads forget that their work is a journey and not a destination. They are beguiled by half-conceived statistical formulae whose rough edges have yet to be smoothed by years of inquiry, testing and polishing. They treat the results as gospel, rather than as suggestions tempered by the adolescent state of the art.

Two years ago, Oriole farmhand Matt Wieters was touted by Baseball Prospectus (BP) as the greatest catching prospect since Zeus. (Lightning bolt for an arm, immense two-strike power.) Using its proprietary projection system to examine Wieters' minor league and college careers, BP projected him to hit .311/.395/.544 with 31 home runs in 2009 and be worth 60 runs more than a replacement-level catcher. The system, called PECOTA, compares a player's performance to that of similar players of the past and sets upper and lower boundaries on expected future results. What I've listed are the weighted mean -- or most likely -- projected rookie numbers for Wieters.

What seemed obvious to me at the time, and what you've no doubt realized, is that such expectations for any rookie are patently absurd. They fail the smell test. While we should all be open to revealed truths, we must trust our eyes until the contradictory facts congeal into a proof.

In fact, Wieters showed great promise as a 23-year-old in 2009 with a .288/.340/.412 line and nine home runs in half a season. That's still above average for a receiver, but a far cry from the projection that BP writers defended as if it were the 11th commandment. (Honor thy PECOTA projection and have no other conjurings before me.) Wieters has sagged even further from that unwarranted esteem this year to .253/.326/.385 and 11 home runs in full-time duty.

I love Baseball Prospectus enough to pay good money for it and read it every day. But they treat PECOTA as destiny and use it to project team performance -- via a Monte Carlo simulation of a million permutations -- at the start of each season. Then they update their projections as the season unfolds.
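For anyone wondering what a Monte Carlo simulation looks like in miniature, here is a toy sketch in Python. To be clear, this is not BP's model, which I have never seen. The function name `simulate_race` and all of its parameters are my own inventions, and it rests on two big simplifying assumptions: every game is an independent coin flip with a fixed win probability, and a dead heat at the end is scored as half a playoff berth (a stand-in for a one-game tiebreaker).

```python
import random


def simulate_race(lead, games_left, p_a=0.5, p_b=0.5, trials=20_000, seed=0):
    """Estimate the chance that team A finishes ahead of team B.

    lead       -- A's current lead over B in wins (hypothetical input;
                  assumes both teams have played the same number of games)
    games_left -- games remaining for each team
    p_a, p_b   -- assumed per-game win probabilities (pure assumptions)
    trials     -- number of simulated finishes
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    score = 0.0
    for _ in range(trials):
        # Play out the rest of each team's schedule as weighted coin flips.
        wins_a = sum(rng.random() < p_a for _ in range(games_left))
        wins_b = sum(rng.random() < p_b for _ in range(games_left))
        margin = lead + wins_a - wins_b
        if margin > 0:
            score += 1.0   # A finishes ahead outright
        elif margin == 0:
            score += 0.5   # tie: treated as a 50-50 tiebreaker
    return score / trials


if __name__ == "__main__":
    # A two-game cushion with 10 games to play, evenly matched teams.
    print(simulate_race(lead=2, games_left=10))
```

Nudge `p_a` and `p_b` up or down a few points and watch how far the answer moves. The output is only as trustworthy as the win probabilities fed into it, which is rather the point.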

The opening season projections amount to on-screen enlightenment. Often, BP sees things I don't. They correctly projected declines for the Dodgers, White Sox, Tigers and Angels. They're a good guide to the upcoming season.

It's when Dr. Jekyll starts reading his own press releases that he becomes Mr. Hyde. Entering the 2010 campaign, BP projected that the Cardinals had something like an 80% chance of making the playoffs. My reaction then was, how could that possibly be? I understand that the Redbirds were favorites, but no non-Yankee contingent can ever be an 80% lock before a pitch is thrown. Imagine a season in which Albert Pujols and Chris Carpenter are cut down by injuries and you can see that such a projection is patent nonsense. But the BP brain trust wrote as if October in St. Louis were a fait accompli.

Two weeks ago, BP had the Braves as an 89.6% lock to play in mid-October, despite a three-game deficit to the surging Phils and just a two-game cushion over San Francisco for the Wild Card. Once again my olfactory nerves shuddered. Atlanta then took three pops to the chin from its division rival and fell into a flat-footed tie for the Wild Card with nine or 10 games left. BP was still offering nearly 3-1 that the Braves would take the fourth playoff spot. (Their remaining schedule is weaker than San Diego's and San Francisco's.)

I understand why BP thinks the Braves should be favored, but I doubt Atlanta residents are booking their playoff tickets quite yet. As I write this, Chief Noc-A-Homa's nine is a game back of San Diego in the loss column with six to play, having just lost two of three to execrable Washington. How is it possible for a team to be a 90% lock for the playoffs if a 4-6 stretch puts them on the outside looking in?

Perhaps some day the statxperts at Baseball Prospectus will improve their tools and better project playoff odds. Or (more likely, in my view) predicting the future will continue to fall outside the purview of mere humans, regardless of their statistical acumen. In any case, I'll continue to treat sabermetrics as a toolbox necessary to understanding the game I love, not as perfection itself.

So I'll be watching the games this last week and ignoring the predictions.