In response to Tuesday's piece about the Dodgers, reader Andy Vogel piped up in the comments to call me on the following statement about Cesar Izturis: "batting him first or second, as Dodger manager Jim Tracy did all year, is a pretty awful idea." Quite reasonably, Vogel asked the following:
How do you square this statement with data suggesting that batting order doesn't make much of a difference in team scoring? Is it just the extra at bats for Izturis you want to avoid, or is there something more to it? I'm not saying Izturis should bat leadoff, but I'm interested in your take.
My quick response was that it's still a bad idea to throw outs away when it's not necessary. Dropping a player one spot in the lineup takes away about 20 plate appearances over the course of a season. Dropping a guy from 1-2 to 7-8 means eliminating about 120 PA. Those 120 PA get redistributed to guys who are, in all likelihood, better hitters than the one dropped, especially when we're talking about a guy with a career OBP below .300. It's not a huge amount, but it does add up, especially in a low-run environment like the Dodgers typically play in.
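For the arithmetically inclined, the plate-appearance math above can be sketched in a few lines of Python. The 20-PA-per-slot figure is the rough estimate quoted above; the function name `pa_lost` is just an illustrative label, not anything from a real library.

```python
# Rough rule of thumb from the post: each lineup slot is worth about 20 PA per season.
PA_PER_SLOT = 20


def pa_lost(from_slot, to_slot, per_slot=PA_PER_SLOT):
    """Approximate season PA lost by moving a hitter down from from_slot to to_slot."""
    return (to_slot - from_slot) * per_slot


# Moving from 1st to 7th (or 2nd to 8th) is a six-slot drop.
print(pa_lost(1, 7))  # 120
print(pa_lost(2, 8))  # 120
```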
Ignoring the fact that complicated research has been done on this problem, I decided to take a quick stab at modeling this via my spreadsheet. I grabbed the Dodgers' 2004 splits by batting position (1-9) from ESPN.com. For convenience's sake, I gave these positions names based on which player hit there the most or whose stat line they most closely resembled on the team, more or less. Here's what the Dodgers had:
#  Player   AVG   OBP   SLG
1  Cesar   .276  .333  .394
2  Jason   .291  .349  .466
3  Milton  .255  .335  .421
4  Adrian  .306  .384  .563
5  Shawn   .296  .355  .492
6  Juan    .245  .304  .409
7  Alex    .248  .311  .427
8  Dave    .191  .284  .297
9  Pitch   .178  .219  .258
Yeeech. In firing off my first response to Andy, I neglected to consider just how bad the Dodger hitting was at various spots in the lineup. The #8 spot, much of it taken up by light-hitting catchers Dave Ross and Brent Mayne, was far worse than "Cesar" in the leadoff and "Juan" (Encarnacion) in the #6 spot. Of course, there's the automatic out in the pitcher's spot, which will stay at #9 because neither Jim Tracy nor I resemble Tony LaRussa.
I then set about creating a new batting order using the same nine "players." Each batting slot's rate stats (AVG/OBP/SLG) and per-plate-appearance frequency of events were held constant, but the totals were scaled up or down based on the proportion of plate appearances between old and new lineup positions. That done, I added up the team's totals and used a linear run estimator (a simple version of Bill James' Runs Created) and compared the new lineup to the old one. I lacked hit-by-pitch and sacrifice data, but that stuff tends to come out in the wash anyway.
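For anyone who wants to play along at home, here is a minimal Python sketch of that kind of exercise. It is not my actual spreadsheet. It assumes the basic Runs Created formula, (H + BB) x TB / (AB + BB), and, since I haven't listed plate appearances per slot, it assumes a hypothetical 760 PA for the leadoff spot with each later slot getting 20 fewer. Walks are backed out of AVG and OBP, ignoring hit-by-pitches and sacrifices.

```python
# Slot lines (AVG, OBP, SLG) as listed in the post.
SLOTS = {
    "Cesar": (.276, .333, .394), "Jason": (.291, .349, .466),
    "Milton": (.255, .335, .421), "Adrian": (.306, .384, .563),
    "Shawn": (.296, .355, .492), "Juan": (.245, .304, .409),
    "Alex": (.248, .311, .427), "Dave": (.191, .284, .297),
    "Pitch": (.178, .219, .258),
}
OLD_ORDER = ["Cesar", "Jason", "Milton", "Adrian", "Shawn",
             "Juan", "Alex", "Dave", "Pitch"]
NEW_ORDER = ["Jason", "Milton", "Adrian", "Shawn", "Alex",
             "Cesar", "Juan", "Dave", "Pitch"]

# ASSUMPTION: hypothetical PA per slot, not the real 2004 totals.
LEADOFF_PA = 760
PA_STEP = 20


def slot_pa(position):
    """Assumed season PA for a 1-based lineup position."""
    return LEADOFF_PA - PA_STEP * (position - 1)


def components(avg, obp, slg, pa):
    """Convert rate stats to counting stats, ignoring HBP and sacrifices."""
    bb = pa * (obp - avg) / (1 - avg)  # solves OBP = (H + BB) / PA for BB
    ab = pa - bb
    return {"AB": ab, "H": avg * ab, "BB": bb, "TB": slg * ab}


def team_runs_created(order):
    """Basic Runs Created on team totals: (H + BB) * TB / (AB + BB)."""
    totals = {"AB": 0.0, "H": 0.0, "BB": 0.0, "TB": 0.0}
    for position, name in enumerate(order, start=1):
        line = components(*SLOTS[name], slot_pa(position))
        for key in totals:
            totals[key] += line[key]
    return (totals["H"] + totals["BB"]) * totals["TB"] / (totals["AB"] + totals["BB"])


old_rc = team_runs_created(OLD_ORDER)
new_rc = team_runs_created(NEW_ORDER)
print(f"old: {old_rc:.1f}  new: {new_rc:.1f}  gain: {new_rc - old_rc:.1f}")
```

Because each "player" keeps his rate stats and only his share of plate appearances changes, any difference between the two totals comes purely from who gets the extra trips to the plate. The exact numbers depend on the assumed PA figures.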
Here is the new lineup:
#  Player   AVG   OBP   SLG
1  Jason   .291  .349  .466
2  Milton  .255  .335  .421
3  Adrian  .306  .384  .563
4  Shawn   .296  .355  .492
5  Alex    .248  .311  .427
6  Cesar   .276  .333  .394
7  Juan    .245  .304  .409
8  Dave    .191  .284  .297
9  Pitch   .178  .219  .258
I simply dropped Cesar down to #6, moved Juan down to #7, and then shifted everybody up to the next available slot -- not an incredibly scientific method, but hardly as disconnected from reality as a lineup that leads off with the top two sluggers. Then again, batting the keystone duo of Alex and Cesar fifth and sixth is no great shakes either.
Adding it all up, this "team" has almost exactly the same totals -- three more homers, most notably -- and saves themselves literally a couple of outs. For my trouble, they gain a quarter of a point of OBP and one-and-a-third points of SLG. By the Runs Created formula all of this adds up to the whopping total of...
That's it. Two measly, stinkin' runs. I tried more complicated run-estimation formulas -- a technical Bill James as well as Extrapolated Runs, neither of them exactly appropriate because of the missing data -- and the most I added was another 0.2 runs. Of course, the gains would be more if you buried "Juan" in the landfill of some coastal state -- wait, the Dodgers actually tried that one -- and found a catching tandem that could hit. That move alone could easily gain you ten times the number of runs my suggested lineup adjustment might reap.
The bottom line is that it's far more important to have the right players out there than to spend a lot of time worrying about their optimal order. That said, if better options than a leadoff hitter with a career OBP below .300 exist, they should be taken, because that's more times your top hitters come up with men on base. It's still elementary.
I would be remiss if, having gotten such a meager return on my inquiry, I didn't mention more rigorous studies which tried a lot harder, only to come up with essentially the same answer. In The Numbers Game, Alan Schwarz recounts valiant attempts by proto-sabermetricians Earnshaw Cook (whose estimate yielded a whopping 11-run difference) and Art Peterson (whose FORTRAN game simulations yielded "negligible" differences).
More recently Mark Pankin took a swing at the problem using a mathematical concept called a Markov Chain model coupled with some strong baseball reasoning ["1) Getting on base is everything. To much lesser extent, home run hitters should not lead off. Stolen base ability is irrelevant"]. The maximum improvement he found was a total of 16 runs, with most of the teams within 10 runs, about one full win. Nothing to sneeze at if it's the difference between golf and baseball in October, but otherwise small potatoes.
So there you have it. The next time I'm tempted to rail about batting order, I'll hold my tongue, or kvetch about why the Yankees even signed Tony Womack in the first place, let alone allowed Joe Torre to put him at the top of the... wow, I'm feeling queasy already.
The short version of all of this mayhem is that the third-party solution I had in place to display my RSS feed decided to go "pro". Lacking the will to pay $10/month for what I once got for free, I set about banging my head against the wall for a few weeks until I came across an excellent tutorial which walked me through almost exactly what I needed to do, using a piece of software called Magpie. See, there are legions of ways to generate an RSS or XML feed (most blogs have them as built-in options) but precious few means of parsing said feeds -- that is, bringing them into other HTML pages converted into readable English rather than coded gobbledygook. I have to thank Ashley Bennett for his patience and guidance with the aforementioned tutorial.
The upshot is that now when you visit my home page, you should be able to see the most recent blog entries at a glance, without any lag time. Elsewhere on the site, I renamed the dorkiest department to "Field Trips" and made a few other tweaks here and there. Don't sweat it if you don't notice...
My recent chat with Jon Weisman caught the attention of the L.A. Observed blog, which is focused mainly on "[m]edia, culture, books and the politics of Los Angeles and California". In particular, our discussion of L.A. Times writers Bill Plaschke and T.J. Simers seems to have whetted the Observed's appetite for more. Cool.
Meanwhile, the Dodgers generated more headscratch-inducing headlines on Monday with their three-year, $9.9 million contract to shortstop Cesar Izturis. In the discussion following our Big Blue Bull Session, I had noted a couple of things about Izturis:
• He'll be 25 this year, he took a great leap forward as far as his hitting goes last year, and is still at an age where he might continue to improve in that department. Using Baseball Prospectus' numbers, he went from being an average of 10 runs under replacement over the previous two years to 18 runs above last year -- a huge turnaround. While he might regress a bit, he also might continue to improve given his age.
• From a defensive standpoint, BP's numbers put him at +1 run last year, +11 the year before, and -8 the year before that. Those numbers seem a little low given the perception of him as a Gold Glove-worthy defensive whiz. The Ultimate Zone Rating numbers, which are based on play-by-play data, put him at +5 in 2000-2003 (of which he played about 2 seasons total) -- again solid but not sterling. Unfortunately, UZR numbers for 2004 aren't publicly available because their creator, Mitchel Lichtman, has granted his employers, the Cardinals, exclusive access to them, but per Moneyball, it's likely that Dodger GM DePodesta has similar numbers to UZR that tell him a similar thing.
I then stuck my foot in my mouth by declaring that this slightly below-average hitter and slightly above-average defender was still relatively affordable and not yet arbitration eligible. I was clearly dead wrong on that last note, as this contract is a product of DePodesta avoiding arbitration and adding cost certainty. D'oh!
I don't think it's a great contract by any stretch of the imagination, but it does cover the shortstop's age 25-27 years, which are likely to be his best from a hitting standpoint, and from a market standpoint, it can be argued that he's a bargain when the following contracts are considered:
Player           WARP  Age  Contract       Team
Orlando Cabrera   3.2   30  4/$32 mil      ANA
Cristian Guzman   5.7   27  4/$16 mil      WAS
David Eckstein    4.2   30  3/$10.25 mil   STL
Edgar Renteria    3.7   29  4/$40 mil      BOS
Jose Valentin     5.0   35  1/$3.5 mil     LOS
Omar Vizquel      6.3   38  3/$12.25 mil   SFO
Cesar Izturis     5.5   25  3/$9.9 mil     LOS
WARP is Wins Above Replacement Player, a stat that takes into account both offense and defense and is normalized for park, league, and era -- in other words, the playing field has been leveled. Izturis is the youngest of these players, the only one, with the possible exception of Guzman, who's not past his statistical peak age (25-29), and he's also the cheapest on this list. The two most expensive players here, Renteria and Cabrera, had off years but were still rewarded with contracts only slightly more reasonable than the Derek Lowe pact. In that market, Izturis doesn't look like the worst idea in the world.
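To put "cheapest" on a per-year footing, here is a quick Python sketch that divides each contract listed above by its length. The figures are taken straight from that list. Reading "cheapest" as average annual value is my assumption for this sketch, and it ignores how the money is actually spread across the years.

```python
# (player, age, years, total value in $ millions), from the contract list in the post.
CONTRACTS = [
    ("Orlando Cabrera", 30, 4, 32.0),
    ("Cristian Guzman", 27, 4, 16.0),
    ("David Eckstein", 30, 3, 10.25),
    ("Edgar Renteria", 29, 4, 40.0),
    ("Jose Valentin", 35, 1, 3.5),
    ("Omar Vizquel", 38, 3, 12.25),
    ("Cesar Izturis", 25, 3, 9.9),
]


def annual_value(years, total):
    """Average annual value in $ millions (simple total / years)."""
    return total / years


# Sort from cheapest to most expensive per year.
by_aav = sorted(CONTRACTS, key=lambda c: annual_value(c[2], c[3]))
for name, age, years, total in by_aav:
    print(f"{name:16s} age {age}  ${annual_value(years, total):.2f} mil/yr")

cheapest = by_aav[0][0]
youngest = min(CONTRACTS, key=lambda c: c[1])[0]
```

On this reading, Valentin's one-year deal is smaller in total dollars, but the per-year figure is what makes Izturis the low man on the list.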
That said, this is a guy with a .293 career OBP (.330 last year), some speed (25/34 steals last year) and no power (career SLG of .342, with a .381 last year). Baseball Prospectus' PECOTA forecasting system puts his 2005 weighted mean projection at .261/.304/.353, with a zero percent chance of breakout (improving his per-plate appearance productivity, in Equivalent Runs, by 20% above his three-year baseline) and only a 9.9 percent chance of improvement. In other words, we may well have seen the best of what he has to offer with the stick.
Furthermore, batting him first or second, as Dodger manager Jim Tracy did all year, is a pretty awful idea -- about as bad as Tracy, who often used catcher Paul Lo Duca in that role a few years back, is capable of mustering. He and DePodesta should know better. Still, the Dodgers outperformed their Pythagorean projection by 3.4 games and beat their second-order win projection (which examines the team's performance based on run elements) by 5.8, so it's tough to argue that the strategy truly hampered them.
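For reference, a Pythagorean projection can be sketched as below. This assumes the classic Bill James form with an exponent of 2, which may differ from the exact version behind the 3.4-game figure above, and the run totals in the example are hypothetical round numbers, not the Dodgers' actual 2004 totals.

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins from run totals using the Pythagorean formula.

    ASSUMPTION: exponent of 2 (the classic form); other exponents are also in use.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


# Hypothetical team: 750 runs scored, 700 allowed over 162 games.
expected = pythagorean_wins(750, 700)
print(f"expected wins: {expected:.1f}")
```

A team's actual win total minus this expected figure is the "outperformed by X games" number.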
It seems clear that this move -- locking up the team's defensive anchor -- has a lot to do with the decision to invest heavily in groundball pitchers such as Odalis Perez (career G/F of 1.68) and the extreme wormkiller Lowe (career G/F of 3.34), a controversial move that at best appears to have the Dodgers overpaying 2-3 times what they should for something resembling a League-Average Inning Muncher (LAIM). But as tied together as the two players' fates are, it's likely that the Izturis signing will be hailed by the same L.A. media that's unwilling to cut DePodesta some slack for his other moves. Funny how that works.
This rather counterintuitive way of looking at pitching statistics has its advantages. The chief one is that it's been shown that we can do a better job of evaluating a pitcher's future performance by concentrating on the defense-independent things he does -- strike batters out, walk them, plunk them, and give up homers -- than we can by considering the effects of the defense playing behind him. The vehicle for this is the DIPS ERA (or dERA), which has been shown to correlate better with the following season's ERA than that pitcher's actual ERA.
If you've followed this site for the past couple of years, you've heard all of this before. DIPS has generated no shortage of controversy, but the work that's been done in its wake does far more to validate McCracken's central finding than to discredit it. It should be noted that McCracken is not saying major league pitchers do not control their ability to prevent hits on balls in play, just that they have less control than was assumed in a darker age.
The DIPS 2.0 system is a little long in the tooth, having been used for four years now as McCracken, who currently works as a consultant for the Boston Red Sox, is no longer updating it. Nonetheless, it's handy and straightforward enough (if not exactly simple) to merit keeping it in circulation. My annual preparation of the numbers is a project that yields equal parts awed fascination and spreadsheet-induced blindness at each stage. In some measure, the blissful tedium involved in their preparation tickles my opiate receptors; in the dead of winter, staring at spreadsheets of endless reams of baseball stats late at night is still pretty damn fun and addictive.
And I'm a pretty big geek, but what the hell -- this stuff is useful. So have at it.
As an aside, a few links pertaining to McCracken's work are temporarily being hosted on my site because they were lost in the server move from Baseball Primer to Baseball Think Factory. While I have McCracken's permission to do so and none of them will be mistaken for my own work, I am hopeful they will be restored to their rightful place in due time. DIPS is groundbreaking work that deserves better than to be lost in some "404 Not Found" shuffle.