DIPS and Data

With slow news days in the baseball world lately (the Yanks are spending money; an idiot ump is on the loose) and plenty of chaos at my j-o-b, I’ve been retreating to the serenity of my spreadsheets. It looks as though I’ll be spending a bit more time there, as I’ve taken it upon myself to run all of the 2002 Major League pitching statistics through my DIPS 2.0 spreadsheet.

I don’t take credit for the DIPS (Defense Independent Pitching Statistic) system. It was invented by a man named Voros McCracken, and he’s presented DIPS numbers for the 1999-2001 seasons via his web site while explaining the system on Baseball Primer and Baseball Prospectus. The gist of it is that McCracken did some studies on pitching statistics involving balls in play and concluded that major-league pitchers do not differ greatly in their ability to prevent hits on those balls hit into play (that is, anything that’s not a home run, a strikeout, a walk or a hit-by-pitch). The rate at which a pitcher allows hits on balls in play is due more to the defense playing behind him than to his own skill, and can vary greatly from year to year.
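For anyone who would rather see the balls-in-play idea as arithmetic, here is a minimal sketch in Python. It only computes the hit rate on balls in play using the definition above (everything that isn't a home run, strikeout, walk or hit-by-pitch); it is not McCracken's DIPS formula. The `PitchingLine` class, its field names, and the sample numbers are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class PitchingLine:
    """A hypothetical season line; field names are illustrative only."""
    bfp: int  # batters faced pitching
    h: int    # hits allowed (including home runs)
    hr: int   # home runs allowed
    bb: int   # walks allowed (including intentional walks)
    hbp: int  # hit batsmen
    so: int   # strikeouts


def balls_in_play(p: PitchingLine) -> int:
    """Batters faced minus the outcomes the defense never touches."""
    return p.bfp - p.hr - p.bb - p.hbp - p.so


def hits_per_ball_in_play(p: PitchingLine) -> float:
    """Non-homer hits divided by balls in play."""
    bip = balls_in_play(p)
    if bip <= 0:
        raise ValueError("no balls in play for this pitching line")
    return (p.h - p.hr) / bip


# Invented numbers, not any real pitcher's 2002 line.
example = PitchingLine(bfp=900, h=200, hr=20, bb=60, hbp=5, so=175)
print(balls_in_play(example))          # 640
print(hits_per_ball_in_play(example))  # 0.28125
```

If McCracken's finding holds, that last rate should say more about the gloves behind the pitcher than about the pitcher himself.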

This is somewhat counterintuitive, but it’s also a very helpful way of looking at pitching stats. DIPS takes the elements of a pitcher’s record that are not affected by the defense — walks, strikeouts, hit-by-pitches, homers — and places them in a neutral context for park, league and defense. The result is a translated line of Defense Independent Pitching Statistics, including a DIPS ERA; that is, an ERA based on defense-independent pitching performance. An important thing about this DIPS ERA, McCracken found, is that it correlates better with the following season’s ERA than the pitcher’s actual ERA does.

For one reason or another, Voros decided not to publish DIPS numbers this year, leaving a sizeable void in the sabermetric universe. But he’s already published fairly coherent instructions on how to calculate DIPS (and he encouragingly answered questions about some of the less coherent aspects of it), so I built a spreadsheet that would do the job. I used it for a few pieces about the Yankee pitchers and this year’s crop of relievers, figuring that the sheet would give me a jump in the analysis department but that it was only a matter of time before somebody published complete DIPS for 2002, and more power to them.

Insert sound of crickets chirping.

Nobody’s done so, including myself — mainly because I was never able to get my hands on the raw data in a spreadsheet. But via a rather mundane Primer thread, I managed to find somebody (“mathteamcoach” is his handle) who had most of what I needed. We’ve joined forces to share the tedium of entering Intentional Base on Balls and Batters Faced Pitching data for EVERY SINGLE PITCHER in the service of this project. It’s a dirty job but somebody’s got to do it, and between the two of us we’re about 2/3 done. The results should be finished later this week.
