Note: The following was retrieved from the Internet Archive Wayback Machine after it disappeared from its previous home at It's Voros McCracken's work, not mine, but he has granted me permission to display it here to keep it in circulation. — Jay Jaffe


Part Two: DIPS and its correlation with ERA for the following year.

James Fraser's Note:  Voros McCracken has contributed another article detailing the predictive value of his Defense Independent Pitching Stats.  Enjoy!

In my previous article I outlined a system of evaluating pitchers that ignores those stats which can be affected by the pitcher's defense. I outlined a few of the reasons I did that, and also described how the statistic is derived. In this article I'm going to put the method to work.

One of the first suggestions I usually get from people goes something like this: "Wow, I would have never guessed Hits correlated so poorly. If you could show that your system correlated with ERA the following year better than others, that would be worth something," or something to that effect. This is an obvious step for me to take, but with it come a few problems. First and foremost, DIPS adjusts for environment, but ERA, Component ERA and the following year's ERA do not. Since the majority of pitchers pitch for the same team with essentially the same defense behind them from year to year, these environments often remain consistent from year to year. Therefore the problem is that in comparing DIPS to following year ERA, we are comparing DIPS to a flawed figure.

There are two possible workarounds for this. First, we could un-flaw the following year's ERA by making these adjustments for league, park and team defense. There are two problems with this. One, while park and league can be done reasonably well, adjusting the ERA for team defense is pretty tricky. The second, and more damaging problem is that the previous year's ERA and Component ERA would have to have these adjustments as well for comparison purposes. Essentially, I'd have to create three new stats to judge the worth of a fourth.

The other thing we can do is purposely flaw (or skew) DIPS to match the environment in which the pitcher pitched. This is what I will do.

I designed the following study to measure whether adjusting the defense-affected stats for fluctuations above and beyond the overall team effects will increase the correlation with following year ERA over that of ERA and Component ERA. The idea is to adjust only for the very high degree of random fluctuation in the stats that can be affected by defense. We won't adjust for OVERALL team defense, but we will adjust for fluctuations from this defensive figure.

For example, Los Angeles' defense gives up hits 10% more often than the average team. Joe Blow pitches for Los Angeles and gives up hits 20% more often than the average player. We won't adjust for that first 10% from Los Angeles' defense, but we will adjust for that second 10% that comes strictly from Blow's stats. After the adjustment, Blow will give up hits 10% more than average, as will the rest of Los Angeles' pitchers. Therefore the individual hit total of the pitcher won't be used, but the cumulative total of all pitchers on his team will be.

Here's what we'll do:

First we need a BFP figure. I don't have them in my database, but I can make a pretty good estimation with:


This is the best estimate I've seen for BFP. It is rarely off by more than 20 BFP and is very often within 10 BFP.

We'll use Kevin Millwood's 1998 season, for which the estimate comes to 742 BFP.

(note: Millwood's actual BFP total was 748).

Next we need to figure out our SO, BB and HR figures. Well this is easy. Since we're not adjusting for league or park and since these stats can't be affected by defense, these figures remain the same. That is SO=SO, BB=BB, HR=HR:

             BFP  BB  SO  HR
98-Millwood  742  56 163  18

So far so good. Now we need to come up with our hits total. Now as I pointed out in my last article, this is where the key to the DIPS method lies. We are not going to use the pitcher's individual H total in any way. If we did use it here, our method would really be doing the same thing that component ERA does, only in a slightly different way. As such, there'd be no reason why it would correlate any differently than component ERA. Instead of using the pitcher's individual H figures, we'll use that of his entire team's defense.

1998 Atlanta Braves Pitching Stats

  IP    H  HR  BB   SO
1438 1291 117 467 1232

So Atlanta's BFP estimate would be 5973.

(note: Atlanta's actual BFP was 5967. Our BFP estimate is pretty accurate.)

Now what we want is a hit factor we can apply to Millwood's numbers and those of all other Atlanta pitchers that year. We'll call it $H and define it as this:

$H=(H-HR)/(BFP-BB-SO-HR)

Which for Atlanta is:

(1291-117)/(5973-467-1232-117)=.282 (rounded)
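The Atlanta calculation can be checked with a few lines of Python. This is only a sketch of the arithmetic shown above; the function and variable names are mine, not the author's, and the BFP input is the estimated 5973 rather than the actual total.

```python
def hits_per_ball_in_play(h, hr, bfp, bb, so):
    """$H: hits that stayed in the park divided by estimated balls in play."""
    balls_in_play = bfp - bb - so - hr
    return (h - hr) / balls_in_play

# 1998 Atlanta Braves totals as quoted in the article (BFP is the estimate).
atlanta_h_rate = hits_per_ball_in_play(h=1291, hr=117, bfp=5973, bb=467, so=1232)
print(round(atlanta_h_rate, 3))  # 0.282
```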

We now multiply Millwood's (BFP-BB-SO-HR) total by that number and add his home runs back to it to get Hits:

H=$H*(BFP-BB-SO-HR)+HR=.282*(742-56-163-18)+18=161

(I'll round here, but maintain decimals throughout. Notice that this has removed 14 hits from Millwood's actual total.)
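As a quick check on this step, here is a minimal Python sketch that applies the team hit rate to Millwood's line. It assumes the rounded BFP estimate of 742 from the table, so the unrounded result may differ slightly from the author's internal figure; the function name is hypothetical.

```python
def team_based_hits(h_rate, bfp, bb, so, hr):
    """Hits implied by the team's $H on the pitcher's balls in play, plus his HR."""
    return h_rate * (bfp - bb - so - hr) + hr

# Team rate from the 1998 Braves totals quoted in the article.
atlanta_h_rate = (1291 - 117) / (5973 - 467 - 1232 - 117)

# Millwood's 1998 line, using the rounded BFP estimate (an assumption here).
millwood_hits = team_based_hits(atlanta_h_rate, bfp=742, bb=56, so=163, hr=18)
print(round(millwood_hits))  # 161
```

The 14 removed hits mentioned in the note are the gap between this figure and Millwood's actual hit total.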

             BFP  BB  SO  HR   H
98-Millwood  742  56 163  18 161

Now let's get an IP total. Again we'll use team averages from the 1998 Atlanta Braves, rather than Millwood's individual stats.

Important Note: Because of the way we estimated BFP, the following equation will yield the same result for every team in the study, due to things canceling out. If we used actual BFP totals, that would change, but probably not very much. Anyway we now calculate this:


As I mentioned above this will be 1.033 for everybody. So Millwood's IP would look like:


             BFP  BB  SO  HR   H  IP
98-Millwood  742  56 163  18 161 179

Ok we're certainly getting there. All we have to do now before we get to a run estimate is come up with 2B and 3B totals. Well, we'll just come up with these two for the 1998 Braves:


And apply them back to Millwood:


Now we need to make a run estimate so we can compare this to the next year's ERA (along with actual ERA and component ERA). Here's what we need to do. We'll make a run estimate for the 1998 Atlanta Braves using Extrapolated Runs. We need an estimate for AB for extrapolated runs:


This will work fine. So here's the Braves' estimate:


Then we will take that figure and divide it into the Braves' actual Earned Runs allowed to get a factor to apply to Millwood's estimate:

$ER=ER/XR

For the Braves this factor is .892. When we calculate Millwood's XR:

XR=75 and then ER=$ER*XR=.892*75=67

And the finale:


(Remember I've been maintaining decimals)
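The last two steps can be sketched in Python. This assumes the standard ERA definition (nine times earned runs divided by innings) and uses the rounded inputs quoted in the text (.892, 75 and 179), so the result is only approximate and will not necessarily match the author's figure, which was computed with unrounded values. The function names are mine.

```python
def scaled_earned_runs(er_factor, xr):
    """ER = $ER * XR, scaling the run estimate by the team's ER/XR factor."""
    return er_factor * xr

def era(earned_runs, innings):
    """Standard ERA: earned runs per nine innings."""
    return 9 * earned_runs / innings

# Rounded values from the article; the author carried more decimals.
millwood_er = scaled_earned_runs(er_factor=0.892, xr=75)
millwood_era = era(millwood_er, innings=179)

print(round(millwood_er))  # 67
print(round(millwood_era, 2))
```

With these rounded inputs the estimate lands between 3.3 and 3.4.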

Now we can compare this ERA figure to the following year's ERA and see how well it correlates. We'll also do the same for:

Actual ERA=ERA
Component ERA=ERC

Millwood's 1998 figures are

ERA =4.08
ERC =3.88

And in 1999:

ERA =2.68

Now what I did was do these estimates for every pitcher from 1993-1999 who pitched 100 Innings in consecutive seasons using the exact same methods as above. I then compared the ERA estimates (ERA, ERC and my $ERA) from the first year with the actual ERAs the following year, and measured the correlation between each stat and the following year's ERA. The results:

Correlation with ERA the following season:

ERA =.407
ERC =.441
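The article does not say which correlation measure was used. Assuming it is the ordinary Pearson coefficient, the comparison could be sketched as below. The sample numbers are hypothetical and are there only to make the snippet runnable; they are not data from the study.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical illustration data, NOT taken from the study:
# a first-year estimate for five pitchers and their ERA the next season.
year1_estimate = [3.10, 4.25, 3.80, 5.00, 2.95]
year2_era = [3.40, 4.10, 4.30, 4.60, 3.20]

r = pearson(year1_estimate, year2_era)
print(round(r, 3))
```

Running the same function once per estimator (ERA, ERC, $ERA) against the same following-year ERA column gives directly comparable coefficients.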

Bill James has stated that his Component ERA often predicts future ERA better than ERA itself. Well, in my study it did, but not by a lot. Still, there is a difference in its favor. However, the system I just laid out above posted significant advantages over both ERA and Component ERA in predicting future ERA. What's more, I also compared the system above and Component ERA to ERA from the season in which they were calculated. That is, how similar are the two to the ERA from the season for which they were calculated:

Correlation with ERA the current season:

ERC =.906

The low correlation for $ERA is actually better, because this indicates that $ERA tells you things ERA does not, much more often than ERC does. ERC and ERA correlate pretty well, therefore in most cases ERC won't be of much more use than ERA. But there is significant discrepancy between ERA and $ERA AND $ERA does significantly better at predicting future ERA than ERA does. $ERA therefore is far more valuable at analyzing future performance than is ERC.

There's more. I divided the 503 Pitchers in the study up into five groups. They were grouped by their first year BFP estimate in the following manner:

Group1: 923-1109 BFP  = 100 Pitchers
Group2: 826-922  BFP  = 101 Pitchers
Group3: 723-825  BFP  = 100 Pitchers
Group4: 599-721  BFP  = 102 Pitchers
Group5: 414-598  BFP  = 100 Pitchers

I ran correlations for each group seeing how well the various estimates correlate with following year ERA depending on how many batters they faced the current year. The results are below:

      Grp1  Grp2  Grp3  Grp4  Grp5
ERA = .524  .481  .486  .300  .191
ERC = .535  .494  .501  .349  .277
$ERA= .594  .535  .519  .473  .413
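A breakdown like this one can be sketched by bucketing pitchers on their first-year BFP estimate and computing a correlation inside each bucket. The group boundaries below are the ones listed in the article; everything else (function names, the tuple layout, the sample rows, and the choice of Pearson correlation) is an assumption for illustration.

```python
from math import sqrt

# (group, low, high) on first-year BFP estimate, inclusive, as listed in the article.
BFP_GROUPS = [(1, 923, 1109), (2, 826, 922), (3, 723, 825), (4, 599, 721), (5, 414, 598)]

def bfp_group(bfp):
    """Return the group number for a (rounded) BFP estimate, or None if outside all ranges."""
    for group, low, high in BFP_GROUPS:
        if low <= bfp <= high:
            return group
    return None

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))

def correlation_by_group(rows):
    """rows: (bfp, first_year_estimate, next_year_era) tuples -> {group: r}."""
    buckets = {}
    for bfp, estimate, next_era in rows:
        group = bfp_group(round(bfp))
        if group is not None:
            buckets.setdefault(group, []).append((estimate, next_era))
    return {
        group: pearson([e for e, _ in pairs], [n for _, n in pairs])
        for group, pairs in sorted(buckets.items())
        if len(pairs) >= 2  # a correlation needs at least two pitchers
    }

# Hypothetical rows, NOT data from the study.
sample_rows = [
    (742, 3.40, 2.70), (800, 4.00, 4.20), (730, 4.50, 4.40),
    (500, 3.90, 4.80), (560, 4.70, 4.10), (450, 5.20, 5.00),
]
print(correlation_by_group(sample_rows))
```

Millwood's 742 BFP estimate, for instance, would place him in Group 3.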

$ERA posts the highest correlation in each group. What should really jump out at you is Group 5 (the lowest BFP group). Below 600 BFP, ERA really does not correlate well at all and Component ERA has suffered a good deal as well. $ERA, however, has held up much better at these lower BFP figures.

It seems to me that as BFP drops, and more and more pitchers have some of their IP become relief innings, the advantage $ERA has over the other two starts to grow quite a bit. Now remember, these pitchers all logged 100 IP each season. I would assume if we delved into 60 and 70 IP relievers, $ERA's advantage would continue to grow. I believe $ERA's correlation for relief pitchers would beat the others quite handily.

So where does that leave us? We now have a stat that has been shown to be as good as, and most likely better than, ERA and Component ERA at estimating ERA for the following season. This stat removes the pitcher's hits allowed total from the equation. Why would that be?

There are two answers to this question. The first answer is "That a pitcher's ability to prevent hits on balls in play varies to an extremely large degree from season to season." The second answer is "That the ability for pitchers to prevent hits on balls in play doesn't exist, or if it does, it doesn't really amount to much for almost all pitchers." I suppose it all depends on what you think "ability" is.  I would side with the latter answer. I just don't think actual abilities vary in that way, especially considering that they don't vary like that at all for SO and BB and only to a degree for HR.

Now realize, as I stated way back when, that $ERA and the following year's ERA are flawed. What DIPS does is take that $ERA and make the further necessary adjustments for the overall quality of the team's defense, the park he pitches in and the league he pitches in. These adjustments are actually much easier to do in $ERA than ERA and a little easier than Component ERA. DIPS uses more data than $ERA (Actual BFP, HP, IBB), adjusts for park, league, and overall team defense. These are all adjustments I think most would agree would improve a statistic.

So there you have it. I think DIPS at the very least contributes something valuable to the discussion about pitching statistics. Regardless of what you think of DIPS, I urge you to realize that a WHOLE LOT of what's on a pitcher's record is the responsibility of his defense. If those parts look bad, but the rest look good, you might want to look closer at the circumstances that pitcher pitches in. It's a hell of a lot easier to pitch for the Yankees, a good defensive team in a good pitcher's park, than it is for the Rangers, a bad defensive team in a hitter's park (note: The Rangers have turned over their lineup in 2000 and in each instance of turnover it appears they are upgrading defensively. It appears Sele won't have the opportunity to try them out, though).

I think when you look at a pitcher, and try to determine how well he pitched, it is vital to look at these effects. As we have seen above, it helps.

Voros McCracken can be reached at
