08 January 2014

Predicting the Future Is Easy. Getting It Right Is Hard.

The hip new toys in baseball analysis can magnify and sharpen our understanding of the game to levels of resolution we didn't know enough to dream about in the past. But they can't foretell changes in players' hearts, minds, eyes or habits. They can describe how unlikely an event is, but not whether unlikely event X is the unlikely event that will actually occur this year.

That's how the most advanced statistical models of the Predictolator, which ran 50,000 seasonal simulations, projected before the 2013 season got underway that the Angels, Blue Jays and Nationals would make the playoffs, that Washington would play in the World Series, that the Reds would win the NL Central, that the Red Sox and Pirates would lose more games than they won, that the Giants and White Sox would contend for playoff spots and that the Phils would have a better record than the Orioles. Back then, who could have argued?

Everything anyone could reasonably know in advance about the teams was factored in. The sharpest fielding-independent pitching analysis, the most cutting-edge defensive efficiency metrics, the most newfangled catcher framing valuations, and the most atomized BABIP-affected hitting trends -- all ingredients in this statistical bouillabaisse. And yet, the Boston projection was off by 17 wins.

How could anyone predict that Josh Hamilton would suddenly hit like Alexander Hamilton? Or that Jeff Locke, who had pitched all of 51 innings with a 5.82 ERA in his career, would throw 166 innings of 10-7, 3.52 for Pittsburgh? Or that four-fifths of San Francisco's vaunted rotation would blow up like a middle-school science project (ERA 30% worse than average)? Or that the Yankees would go on the disabled list and stay there?

The models couldn't project that four rookie hurlers would catapult St. Louis to 12 extra wins, or that everything going right in Pittsburgh would reduce the Pirates' losses by 19 and propel them into the playoffs. They were off by 15 games in their Cleveland projection because Indians catchers (31 home runs) and second basemen (33 home runs) hit like right fielders and their first four pen jockeys (2.77 ERA in 271 innings) turned out the lights.

Individuals eluded the grasp of the best minds too. Baseball Prospectus's PECOTA system pegged Roy Halladay for 12 wins in 183 innings with a 2.82 ERA. Ineffectiveness and injury led to a 4-5, 6.82 campaign and retirement. BP had known-commodity Max Scherzer at 12 wins, 184 strikeouts and a 3.84 ERA, a far cry from his Cy Young performance of 21 wins, 240 strikeouts and a 2.90 ERA.

In short, the best systems in the world are still just a little bit better than mere mortals at predicting the fickle movements of humans who throw and catch a round, white ball and smack it with a stick. If you know a handful of broad rules -- for example, hitters gain power and lose speed as they age; lefty batters boost their homer totals in Yankee Stadium; a pitcher with a very low BABIP-against in year X is likely to regress in year X+1 -- you can predict player performance approximately as badly as the Predictolator.

Or, you can just enjoy the surprises that the game doles out annually.
