
All contents of this web site © Jay Jaffe, 2001-2003 except where indicated. Please contact me for any questions or comments regarding this site.

       A R O U N D   T H E   B A S E S

 
Welcome to my web log, published via Blogger Pro. Below are some links to recent baseball-related articles I found of interest, with my own two cents thrown in. Feel free to chime in via the comments link at the bottom of each post (powered by YACCS), or use my Contact page, or my email address, jay@futilityinfielder.com.

Here are the weekly archives of this blog, assuming Blogger hasn't screwed up again. If an archive appears to be missing, you can try hunting for it via the subdirectory. Please note that because of repeated difficulties I've had with Blogger, I no longer recommend their service and will be taking steps to switch to a new one in the near future.

Saturday, November 23, 2002

Nice Move, DIPShit

I went back to playing around with the pitching spreadsheet I used to calculate my DIPS numbers referred to in the article below, and discovered that I made an error in calculating the Yankee park factor. Basically, I transposed two digits for the Yankee home at-bat total, which should have been 2907 instead of 2097. Needless to say, those missing 810 ABs change things a bit. In this case they take the Yankee Stadium Park HR Factor from 1.054 to 0.968, and slightly lower the Yankee starters' DIPS ERAs by a few points. So I'll rerun that last chart:
            ERA   dERA  BABIP
Pettitte    3.27  3.28  .317
Clemens     4.35  3.34  .316
Mussina     4.05  3.73  .290
Hitchcock   5.49  3.81  .373
Wells       3.75  3.87  .284
Hernandez   3.64  3.91  .264
Weaver      4.04  4.11  .283  (NY only)
Weaver      3.52  3.81  .280  (TOT)
Yankees     3.87  3.71  .293
League      4.46  4.46  .290
The change is most apparent in the more homer-prone Yankee pitchers; Mussina, Weaver, and Hernandez all lower their dERA by more than a tenth of a run. And the team's dERA goes down to 3.71, compared to their 3.87 ERA, showing a bit more clearly how the below-average Yankee defense against balls in play cost the team a few runs here and there. Also note that I calculated full-season dERA for Weaver because I finally got around to calculating the Comerica Park HR Factor.

These new figures don't alter any of the other observations I made, but I might as well add a few other notes about the process, while I'm revisiting the calculations:

1. Team stats for the Yanks were taken from ESPN.com. League stats were taken from ESPN.com and MLB.com, with the latter figures used in case of any discrepancy.

2. I used actual Batters Faced Pitching numbers from those sources, rather than estimating them as Voros McCracken does in his step-by-step instructions.

3. When I calculated the Yankee team DIPS, I adjusted for lefties by multiplying the LH correction factor by the percentage of innings pitched by Yank southpaws. This wasn't explicit in Voros' instructions, but it seems like the correct approach.

4. When I calculated the "League" DIPS for that chart, I DID NOT make any adjustment for lefthanded pitchers or knuckleballers, or for park factors (which would probably end up being close to 1, but not exactly so, depending upon the AL/NL balance of homers in interleague games). Thus, that number is probably off by a few points.

5. Voros' method for computing the Park Factor, which he graciously walked me through via email and has given me permission to pass along:
Here's a simple method using Home Runs per AB-SO:

HAB = Home at bats
RAB = Road at bats
HSO = Home Strikeouts
RSO = Road Strikeouts
HHR = Home Home Runs
RHR = Road Home Runs
AB = Total At Bats
SO = Total Strike Outs
HR = Total Home Runs.

The first thing you do is calculate the "actual" rate of home runs per AB-SO:

Actual Rate = HR/(AB-SO).

This is the rate to compare to, and how you'll get the factor. After you get that, and let's say it is .041, you now make the more complicated calculation:

Note: For NL and AL teams there is a difference. The below will be for AL teams. For NL teams change the "7"s to "8"s and the "13"s to "15"s. The reason is that what you're doing is estimating what the stats would be if the Yankees played 1/14th of their games in Yankee Stadium instead of 1/2 (1/14 divided by 1/2 equals 1/7):

Adjusted Rate = ((HHR*(1/7))+(RHR*(13/7)))/(((HAB-HSO)*(1/7))+((RAB-RSO)*(13/7)))

Now you'll get a number like the simple calculation above but different. Let's say it is .043.

The rest is easy. If you want to adjust numbers from Yankee Stadium to a neutral park then:

Park Factor = Adjusted Rate/Actual Rate

In this case it would be .043/.041 = 1.0488.

So if you have a pitcher whose rate of HR/(BFP-HP-BB-SO) is .025 you multiply .025 by 1.0488 for his Park Neutral rate or:

1.0488 * .025 = .0262
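
For anyone who would rather script this than build a spreadsheet, here is a minimal Python sketch of the steps quoted above. The function names and the sample inputs are my own illustrative assumptions, not part of Voros' instructions; only the formulas and the AL/NL weights come from the quoted method.

```python
def hr_park_factor(hab, rab, hso, rso, hhr, rhr, league="AL"):
    """Home run park factor, following the quoted steps.

    hab/rab: home/road at bats, hso/rso: home/road strikeouts,
    hhr/rhr: home/road home runs. The weights are 1/7 and 13/7 for
    AL teams and 1/8 and 15/8 for NL teams, per the note in the method.
    """
    if league not in ("AL", "NL"):
        raise ValueError("league must be 'AL' or 'NL'")
    n = 7 if league == "AL" else 8
    home_w = 1 / n
    road_w = (2 * n - 1) / n

    # "Actual" rate: total HR per (AB - SO), home and road combined.
    actual = (hhr + rhr) / ((hab - hso) + (rab - rso))

    # Adjusted rate: home and road totals re-weighted as in the quoted formula.
    adjusted = (hhr * home_w + rhr * road_w) / (
        (hab - hso) * home_w + (rab - rso) * road_w
    )
    return adjusted / actual


def park_neutral_rate(pitcher_rate, factor):
    """Scale a pitcher's HR/(BFP-HP-BB-SO) rate to a neutral park."""
    return pitcher_rate * factor


# The worked example from the text: factor 1.0488, pitcher rate .025.
print(round(park_neutral_rate(0.025, 1.0488), 4))  # 0.0262

# Made-up splits (NOT real team data): identical home and road HR rates
# should give a factor of 1, and a homer-friendly home park a factor below 1.
even = hr_park_factor(hab=2900, rab=3000, hso=500, rso=600, hhr=96, rhr=96)
friendly = hr_park_factor(hab=2900, rab=3000, hso=500, rso=600, hhr=120, rhr=96)
```

With the made-up splits, `even` comes out at essentially 1 and `friendly` comes out below 1, which is the direction you'd expect when the factor is used to scale a pitcher's home run rate toward a neutral park.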
I'm working on calculating DIPS numbers for some of the select free-agent starters as well as other names that have popped up in Yankee trade talks. Anybody with a line on an easy way to gather home and road splits for AB, HR, and SO (all needed to calculate Park Factors) or the willingness to do so themselves, drop me a line.
--posted by Jay Jaffe at 5:02 PM Link

Friday, November 22, 2002

Free-Agent Fiesta

Baseball Primer is holding its second-annual Free-Agent Fiesta. The object is to guess the destinations of 20 free agents:

Jim Thome
Greg Maddux
Jeff Kent
Tom Glavine
Ivan Rodriguez
Roger Clemens
Cliff Floyd
John Olerud
Jamie Moyer
Steve Finley
Paul Byrd
Edgardo Alfonzo
Frank Thomas
Woody Williams
David Justice
Bill Mueller
Ramiro Mendoza
Tom Gordon
Todd Hollandsworth
Omar Daal

As the defending champion (I "predicted" 9 out of 23 correctly), I fully intend to complete my entry by next Wednesday's deadline, and probably sooner. I'll post my entry here when I do.
--posted by Jay Jaffe at 10:52 AM Link

Tuesday, November 19, 2002

Remaking the Yankees for 2003, Part II: Assessing the Rotation

The 2002 Yankees were a shining example of the axiom, "You can't have too much pitching." With the luxury to spend (and spend, and spend), the Yanks ensured themselves a starting rotation that remained among the game's elite, while having the depth to withstand injuries that might have crippled another staff. Though it was often remarked throughout the season that the Yanks were no longer winning on the strength of their pitching (a perception due in part to Roger Clemens' and Mike Mussina's relatively sub-standard performances), in truth the staff turned in its best performance since 1999, relative to the league. Here's a chart showing the Yanks' ERA relative to the AL during the Torre era:
       NYY    AL     rel
2002   3.87   4.46   -0.59
2001   4.02   4.47   -0.45
2000   4.76   4.91   -0.15
1999   4.13   5.18   -1.05
1998   3.82   4.65   -0.83
1997   3.84   4.57   -0.73
1996   4.65   5.00   -0.35
The Yanks were fourth in the league in ERA, and led the league in strikeouts and fewest walks.
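
The "rel" column is just team ERA minus league ERA. A minimal sketch that rebuilds it from the chart's own numbers (the variable and function names are illustrative assumptions):

```python
# (season, Yankees ERA, AL ERA), copied from the chart above.
torre_era = [
    (2002, 3.87, 4.46),
    (2001, 4.02, 4.47),
    (2000, 4.76, 4.91),
    (1999, 4.13, 5.18),
    (1998, 3.82, 4.65),
    (1997, 3.84, 4.57),
    (1996, 4.65, 5.00),
]


def relative_era(team_era, league_era):
    """Negative means the team allowed fewer earned runs than the league."""
    return round(team_era - league_era, 2)


rel = {season: relative_era(nyy, al) for season, nyy, al in torre_era}

# The season with the largest edge over the league.
best_season = min(rel, key=rel.get)

for season, value in rel.items():
    print(season, f"{value:+.2f}")
```

Sorting on this column is what backs up the "best since 1999" claim: 2002's margin is wider than 2000's or 2001's, but narrower than 1997 through 1999.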

The team entered spring training with a logjam of seven starters and ended it in similar fashion, needing every one of them along the way. Roger Clemens served his annual stint on the DL, injuries to mainstays Andy Pettitte and Orlando Hernandez caused both to miss about 1/3 of the season, and Sterling Hitchcock pitched poorly even when healthy. These circumstances made George Steinbrenner's hamburgling David Wells away from the Arizona Diamondbacks at the last moment look like a stroke of sheer genius. Wells led the team in wins (19) and was second in innings pitched, and while his surgically repaired back was balky at times (particularly in cold weather and whenever he had to retake the mound after a lengthy Yankee rally), he missed only one start all season.

It's difficult to laud the Yankee rotation's performance, however, without recalling its unseemly demise. The Anaheim Angels beat the vaunted Yankee starters mercilessly in the Division Series, to the tune of a 10.38 ERA and 32 hits in 17.1 innings (a mere 4.1 per start). Yankee pitching had spent September feasting on lesser teams (19-8, 3.07 ERA during the month, with no opponents over .500 after September 4), perhaps lulling them into a false sense of security. But against the Angels they looked old and vulnerable. A bad week, or a portent of deeper problems? Let us take a closer look.

Having picked up Andy Pettitte's option last Friday, the Yanks currently have four starters -- Pettitte, Mussina, Wells, and Jeff Weaver -- under contract, five if you count Sterling Hitchcock and his 37 MPH fastball. Roger Clemens is a free agent, and Orlando Hernandez is arbitration-eligible. Below is a chart showing their performances during 2002, along with their 2003 ages and contract amounts (in millions, including signing bonuses but not incentive bonuses):
            Age  Cont   W   L   IP     ERA   K/9   K/W   WHIP  HR/9  AVG   OBP   SLG   OPS
Mussina     34   12.0   18  10  215.7  4.05  7.60  3.79  1.19  1.13  .253  .295  .413  .708
Wells       40   3.0    19  7   206.3  3.75  5.98  3.04  1.24  0.92  .259  .298  .406  .704
Clemens     40   FA     13  6   180.0  4.35  9.60  3.05  1.31  0.90  .250  .315  .397  .712
Hernandez   37   ARB    8   5   146.0  3.64  6.97  3.14  1.14  1.05  .236  .289  .378  .666
Pettitte    31   11.5   13  5   134.7  3.27  6.15  2.88  1.31  0.40  .272  .316  .365  .681
Weaver NY   26   4.1    5   3   78.0   4.04  6.58  3.80  1.23  1.38  .260  .299  .437  .736
Weaver TOT              11  11  199.7  3.52  5.95  2.75  1.21  0.72  .250  .300  .383  .683
Hitchcock   32   6.0    1   2   39.3   5.49  7.09  2.07  1.83  0.92  .326  .378  .457  .835
Unlike last season, when Mussina, Clemens, and Pettitte significantly outperformed the back end of the rotation, this is a MUCH stronger group across the board, with the exception of Hitchcock (whom I'll exclude from the next several generalizations because he stunk). Everybody except Clemens was at least 0.42 runs below the league ERA, and Clemens was at least below league average. Everybody had much better-than-average control (the league average K/W ratio was 1.92), prevented baserunners at a better-than-average clip (league average WHIP -- walks plus hits per inning -- was 1.38), had average-to-good strikeout rates, and -- except for Mussina and the pinstriped Weaver -- stayed away from the long ball (league average 1.10 per 9 innings).
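
For readers who want to check the rate columns, here is a small sketch of how they are typically computed. The helper names are my own assumptions; the one detail worth noting is that the chart lists innings as true decimals (199.7), while box scores write thirds of an inning after the point (199.2 means 199 and 2/3).

```python
def innings_to_decimal(ip):
    """Convert box-score innings text ('199.2' = 199 and 2/3) to a decimal."""
    whole, _, outs = ip.partition(".")
    return int(whole) + int(outs or 0) / 3


def per_nine(count, innings):
    """Rate per nine innings, used for K/9 and HR/9."""
    return 9 * count / innings


def k_per_w(strikeouts, walks):
    """Strikeout-to-walk ratio."""
    return strikeouts / walks


def whip(walks, hits, innings):
    """Walks plus hits per inning pitched."""
    return (walks + hits) / innings


# Weaver's 199.2 box-score innings match the 199.7 shown in the chart.
print(round(innings_to_decimal("199.2"), 1))  # 199.7

# Pettitte: 6 homers in 134.7 innings matches the chart's 0.40 HR/9.
print(round(per_nine(6, 134.7), 2))  # 0.4
```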

A few notes on these performances:

• Though his ERA was high (for reasons we'll get to in a moment), Clemens' high strikeout rate and excellent control indicate that he's still a force to be reckoned with, if no longer a Cy Young candidate (and let's face it, Mussina should have won last year's award anyway). He was up and down through 2002, never putting together two good months in a row; here are his ERAs by month in sequence: 4.62, 2.98, 5.04, 2.70, 6.10, 3.86.

• Coming off of a season in which he should have won the Cy Young Award, Mussina had a much worse year, disguised by better run support. His ERA rose by 0.9 runs, but his run support rose by about double that, and his W-L record was almost identical. But his strikeout rate fell 0.8 per 9 innings, and his home run rate rose by over 40 percent. Additionally, his slugging percentage allowed with men on base rose by over 100 points. Had it not been for a late-season surge against some of the league's weaker teams, his stats would look even worse.

• Once he recovered from his early-season elbow troubles, Pettitte picked up where he left off in 2001 and was the team's most consistent starter. In the second half, he was 11-2 with a 2.70 ERA, 6.61 K/9, 4.18 K/W and averaged almost 7 innings per start. His home run rate was minuscule; he allowed only 6 all season. Pettitte is a classic Tommy John family pitcher, meaning he gives up lots of hits but survives because he gets good double-play support, controls the running game, and allows few walks or homers. He fell back to his career norm in strikeout rate after spiking in 2001, but his second-half performance gives some reason for optimism that it will climb yet again.

• Wells' W-L record was propped up by stellar run support (7.46 per game); as I mentioned above, this may actually have caused him problems. He still ate innings like a horse, a good sign for his health, and he made more in incentive bonuses than he did in base salary. His strikeout rate has been slowly dropping for the last several years, and his pinpoint control isn't what it once was (no more 5+ K/W ratios).

• Weaver is eight months younger but considerably more polished than the man he was traded for, Ted Lilly, with four full seasons under his belt. His low home run rate in Detroit was no doubt helped by spacious Comerica Park, and he gave up a flurry upon coming to New York. But he settled down and adjusted admirably to his role as the swingman (4-2, 1.94 ERA in 51 innings -- seven relief appearances and four starts -- in August and September), biding his time for his big shot in the rotation this coming season. He allowed only one stolen base in 199.2 innings -- is this a misprint?

• Though his season was once again interrupted by injury and shadowed by communication problems with the Yankee brass, Hernandez rebounded solidly from a disappointing 2001. I've said before that I felt he got a raw deal with regard to the postseason rotation. Ironically enough, he ended up pitching more innings than any Yankee starter in the Division Series, and -- those two solo homers notwithstanding -- better than any of them. Joe Torre should have listened.

One area of the staff's performance is worth a closer look: balls in play. The cutting edge of sabermetric thought with regard to pitching is that major-league pitchers do not differ greatly in their ability to prevent hits on balls hit into play (that is, anything that's not a home run, a strikeout, a walk or a hit-by-pitch). The rate at which a pitcher allows hits on balls in play is due more to the defense playing behind him than to his own skill, and can vary greatly from year to year; it does not correlate well from one year to another, the way a statistic influenced by a player's ability should.

This is counterintuitive and somewhat controversial, but a man named Voros McCracken has demonstrated this phenomenon and developed a set of tools called Defense Independent Pitching Stats (DIPS) around it. DIPS takes the elements of a pitcher's record that are not affected by the defense -- walks, strikeouts, hit-by-pitches, homers -- and places them in a neutral context for park, league and defense. The result is a DIPS ERA; that is, an ERA based on defense-independent pitching performance. An important thing about this DIPS ERA (dERA), McCracken found, is that it correlates better with the following season's ERA than the pitcher's actual ERA does. What this is saying, essentially, is that if we control for the results of balls in play, we've got a better indication of what to expect from each pitcher the following season than his actual ERA. I'll leave the proof to McCracken, whose writings on the subject are very detailed but worth your time.
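
To make the "balls in play" bookkeeping concrete, here is a minimal sketch that follows the definition above: a ball in play is any batter faced that did not end in a homer, a strikeout, a walk or a hit-by-pitch. This is a simplification, and it is not McCracken's full DIPS procedure; other published BABIP formulas work from at bats instead. The function names and the sample stat line are made up for illustration.

```python
def balls_in_play(bfp, hr, bb, hbp, so):
    """Batters faced minus the outcomes the defense never touches."""
    return bfp - hr - bb - hbp - so


def babip(hits, hr, bfp, bb, hbp, so):
    """Batting average allowed on balls in play (homers excluded from hits)."""
    return (hits - hr) / balls_in_play(bfp, hr, bb, hbp, so)


# Hypothetical stat line, NOT a real pitcher's numbers.
line = dict(hits=190, hr=20, bfp=800, bb=50, hbp=5, so=150)

print(balls_in_play(line["bfp"], line["hr"], line["bb"], line["hbp"], line["so"]))  # 575
print(round(babip(**line), 3))  # 0.296
```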

McCracken has published DIPS numbers for all pitchers in each of the past three seasons, 1999-2001. However, he hasn't done so for 2002 and will be unable to himself (for reasons he can't disclose yet, wink wink). But he's outlined his step-by-step calculation methods very clearly, and he was kind enough to walk me through the one area he wasn't explicit about -- park-adjustment for home run rate -- via a couple of emails. So I took it upon myself to create a spreadsheet which would calculate DIPS numbers for the Yankee starters. Here they are, along with their actual ERAs (again) and the batting average allowed against balls in play (these are revised numbers made after an error in my spreadsheet was discovered a few days after initial publication):
            ERA   dERA  BABIP
Pettitte    3.27  3.28  .317
Clemens     4.35  3.34  .316
Mussina     4.05  3.73  .290
Hitchcock   5.49  3.81  .373
Wells       3.75  3.87  .284
Hernandez   3.64  3.91  .264
Weaver      4.04  4.11  .283  (NY only)
Weaver      3.52  3.81  .280  (TOT)
Yankees     3.87  3.71  .293
League      4.46  4.46  .290
If your eyes haven't glazed over from looking at all these stats, this is what you might take from the above charts:

• The Yankees as a team were slightly below average when it came to converting balls in play into outs, something I've touched on before; the flip side of batting average on balls in play (roughly one minus BABIP) is a team's Defensive Efficiency Rating. The team's dERA reflects this, as it's a few points lower than its actual ERA.

• The two pitchers who pitched the most innings for the Yanks had BABIP rates closest to the team's rate. The two pitchers who missed about 1/3 of the season were about 25 points off in either direction, and the one who pitched the fewest was waaaaay off. This reflects McCracken's observation that BABIP tends to even out over time.

• Clemens pitched considerably better than his 4.35 ERA would indicate, and didn't get much support from the Yankee defense or Lady Luck when it came to balls in play.

• Pettitte didn't get much help either, but it didn't seem to hurt him, probably because he was able to avoid the long ball; note that in the first chart, he allowed the lowest slugging percentage of all Yankee starters and had a home run rate less than half that of all others except Weaver.

• Hernandez was helped the most by the Yankee defense and Lady Luck, but even given normal support, his performance wouldn't be out of line at the back of the rotation with Wells.

• Hitchcock was particularly hurt by defense and luck. Now, this is where my own faith in DIPS is tested, because watching Hitchcock pitch at times in 2002 felt like watching batting practice -- he wasn't just hit, he was hit HARD. But the theory behind DIPS says that this represents a shortcoming of the Yankee defense and/or the small sample size of Hitchcock's opportunities. It is true that 39.1 innings isn't a lot to base sweeping conclusions on.
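
The observations above boil down to two differences per pitcher: ERA minus dERA, and BABIP minus the team's BABIP. Here is a small sketch that ranks the starters on the first, using only the numbers from the revised chart (the names `chart`, `gaps` and `ranked` are illustrative assumptions):

```python
# (pitcher, ERA, dERA, BABIP) from the revised chart above.
chart = [
    ("Pettitte", 3.27, 3.28, 0.317),
    ("Clemens", 4.35, 3.34, 0.316),
    ("Mussina", 4.05, 3.73, 0.290),
    ("Hitchcock", 5.49, 3.81, 0.373),
    ("Wells", 3.75, 3.87, 0.284),
    ("Hernandez", 3.64, 3.91, 0.264),
]
TEAM_BABIP = 0.293


def gaps(rows, team_babip):
    """Return (name, ERA - dERA, BABIP - team BABIP), largest ERA gap first."""
    out = [
        (name, round(era - dera, 2), round(babip - team_babip, 3))
        for name, era, dera, babip in rows
    ]
    return sorted(out, key=lambda row: row[1], reverse=True)


ranked = gaps(chart, TEAM_BABIP)
for name, era_gap, babip_gap in ranked:
    # A positive ERA gap means the actual ERA was worse than the dERA.
    print(f"{name:<10} {era_gap:+.2f} {babip_gap:+.3f}")
```

Hitchcock and Clemens land at the top of the list (hurt the most), Hernandez at the bottom (helped the most), and Pettitte and Hernandez sit roughly 25 points on either side of the team BABIP.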

Whew! Give a monkey a spreadsheet and he'll throw a lot of numbers at you. I've spent a lot of time taking the Yankee rotation apart here; in my next installment, I'll put it back together.
--posted by Jay Jaffe at 12:35 AM Link
