Chicago, We Have a Problem (or Maybe Not)
The other day I bought The Hardball Times 2009 Baseball Annual. I haven’t finished it yet, but so far I would say save your money.
In it, though, there is an article titled “Fielding Breeds Winning” by a certain John Dewan, which goes into some detail about how the Fielding Bible’s Plus/Minus system helps determine which teams have a real shot at the title. I’ve argued in the comments about inherent flaws in the Plus/Minus system, but a lot of people, including famous baseball statistical analysts, think it’s the greatest thing since high socks. My main point of contention is this: we know that people are biased when it comes to evaluating fielding, based on some of the ridiculous Gold Glove awards handed out, yet we go from the 500-person sample for those awards to a 3-person sample for the Plus/Minus system and somehow conclude that the latter is more accurate. Really quickly, the system looks at every play and says: if any fielder in the majors missed a play you made, you get a Plus, and if you missed a play that any fielder made, you get a Minus. Add up your Pluses and your Minuses and you get your Plus/Minus score for the season.
Well, you might be thinking, isn’t this article supposed to be about a Chicago baseball team, and preferably the Cubs, since that’s who I come here to read about? Well, yeah, it is about the Cubs, because according to the Plus/Minus system, the 2008 Cubs were living on borrowed time.
Something you can find in this same Annual, or at various other baseball statistics sites around the net, is a stat called Defensive Efficiency Rating (DER). It’s the reverse of BABIP (batting average on balls in play): how often do PAs which don’t end in walks, home runs or strikeouts get turned into outs? The 2008 Cubs, Jim Edmonds and all, led the league, and did it pretty easily, with a .706 DER. That’s six tenths of a percent ahead of second-place Milwaukee, which I think we as fans, and definitely the pitching staff, would agree is a good thing. But here’s the rub. The highly touted Plus/Minus system ranked the Cubs as the league’s 13th best team, at minus 27 plays. So an ‘average’ defensive team would have given up 27 fewer hits than the Cubs. Only the Reds, Rockies and Pirates managed worse team totals.
Now, if you take the highly touted Plus/Minus system at its word, that leads to my titular problem. It is possible, in theory, that the Cubs turned balls into outs at a higher rate than all of their competition merely by virtue of good luck. Yes, they had the best rate, but maybe they were merely fortunate that, of the roughly 5600 times hitters were kept in the park by the Cubs staff, many of those balls were hit softly or right at Cubs defenders. If they had such good luck in 2008, then odds are they’ll regress to average luck in 2009 and have some ERA problems. 5600 is a pretty big number, but let’s dig a little deeper.
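To put the two measures on the same scale, here is a rough back-of-the-envelope sketch in Python. It assumes the round figure of 5600 balls in play applies to both clubs, and that Milwaukee’s DER was .700 (that is, .706 minus the six tenths of a percent). Both are simplifications for illustration, not exact totals.

```python
# Rough figures from the article; 5600 balls in play is an approximation.
BALLS_IN_PLAY = 5600
CUBS_DER = 0.706
BREWERS_DER = 0.700  # assumed: .706 minus six tenths of a percent


def outs_on_balls_in_play(der, balls_in_play):
    """Estimate outs recorded on balls in play from a DER figure."""
    return round(der * balls_in_play)


cubs_outs = outs_on_balls_in_play(CUBS_DER, BALLS_IN_PLAY)
brewers_outs = outs_on_balls_in_play(BREWERS_DER, BALLS_IN_PLAY)
extra_outs = cubs_outs - brewers_outs

print(cubs_outs, brewers_outs, extra_outs)  # 3954 3920 34
```

Under those assumptions, DER has the Cubs turning roughly 34 more balls into outs than the second-best defense in the league, while Plus/Minus has them 27 plays worse than average. That is the gap that needs explaining.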
Usually, if you’re going to give up a batted ball, the best kind is a ground ball. The second best kind is a fly ball that stays in the park, and the worst thing to do is give up line drives. During one of our many off-season debates, Rob G dug up the fact that Marmol gave up very few line drives in 2008, which accounted for some of his luck with BABIP (which, remember, is essentially the flip side of DER). So maybe it wasn’t the Cubs fielders; maybe their pitchers gave up fewer line drives than other staffs (staves?) and their position players and DER were merely happy beneficiaries of this skill. That would certainly explain at least a large part of the gap between the two measures.
Line drives: Cubs 20%, league average 20.7%, with the best being 19% and the worst being 22%. The Cubs were roughly middle of the pack.
Well, fly balls are not as bad, but they can be bad. How did the Cubs and their diving-for-balls-that-Carlos-Beltran-gets-to-in-a-jog pitching staff do? 39%. League average is 35.2%.
So we can see that the Cubs staff gave up slightly fewer line drives, and a pretty good chunk more fly balls, than the average NL team. With my amazing math skills I can now conclude, without serious number crunching, that the Boys in Blue gave up fewer of the best outcome (ground balls) than the average pitching staff.
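For anyone who does want the number crunching, here is a minimal sketch in Python. It assumes every ball in play is a line drive, a fly ball or a ground ball, so the three shares add up to 100% (bunts and pop-ups are folded in, which real batted-ball data may not do).

```python
def ground_ball_pct(line_drive_pct, fly_ball_pct):
    """Treat whatever is not a line drive or a fly ball as a ground ball."""
    return round(100.0 - line_drive_pct - fly_ball_pct, 1)


# Percentages quoted in the article
cubs_gb = ground_ball_pct(20.0, 39.0)
league_gb = ground_ball_pct(20.7, 35.2)

print(cubs_gb, league_gb)  # 41.0 44.1
```

So, under that assumption, the Cubs staff got about 41% ground balls against a league average of about 44%.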
But there are still those three guys with their tapes of the games and their computer screens, and they say the Cubs fielders were no good.
Well, maybe the Cubs advance scouts and coaches were just really good at determining where the fielders should set up, you might say. And that may be true, but the Plus/Minus system goes by fielding zones, so if the right fielder is told to stand on the foul line, and a screecher is hit right to him (which he catches), he gets a +, just like some poor schmuck who had to make a diving grab and slam into the wall. So, that cannot be it. That can explain the good DER score, but it cannot explain away the bad Plus/Minus score.
At this point, I am starting to call ‘bullshit’. But maybe it’s true. Maybe even with a sample size of 5600 plays, the Cubs pitchers just happened to have a knack for getting the hitter to hit it at the fielders. (Totally different study, but if there are league and park factors to determine how good leagues are and how easy parks are to hit in, then why aren’t there any factors to determine if different teams are better at hitting away from fielders?) I’ve got 5600 plays, and I’ve got three guys with a VHS and a laptop. Call it intuition, but I am going to lean towards the 5600 plays.
But still, it’s nagging me. Luck, I know it exists. Can it exist for that many plays? Or could it be that my suspicion about whether the Fielding Bible reviewers can really be objective is valid?
We know that with Gold Gloves, the voters (MLB players and coaches) have problems being objective. They’re heavily influenced by an occasional flashy play and, more bizarrely, hitting prowess. Why are these guys, professional players and former professional players, more likely to be biased than the three guys with the VHS machine? The answer is, of course, simple: they’re not. The Fielding Bible Plus/Minus system tries to be objective. They make the people chart very narrowly where a ball is hit, and they enter how hard it is hit. The ‘where’ part bothers me a little, because cameras can be positioned slightly differently at each ballpark, and a play that looks like a zone 3Z at Wrigley could be recorded as a 3X at Busch. The inherent subjectivity of how hard the ball is hit bothers me a lot more. Some guy may decide it takes a 75 MPH line drive to be a hard-hit ball, while another guy evaluating a different team may set his ‘hard’ line at 90 MPH. Then add the fact that these guys know that Albert Pujols is a supreme player and that Rafael Palmeiro is a Gold Glove deserving first baseman, so when one of them dives to make a play, it must have been a hard-hit ball.
Let’s say, just for kicks, that reputation plays a part in the Plus/Minus system just like it does in the flawed Gold Glove evaluations. How can that be shown?
In statistics there is something called the ‘variance’. Essentially it gives us a number that describes how spread out a set of numbers is; here, how much teams’ rankings change from one year to the next. If you believe that team fielding is relatively constant from one year to the next, then it stands to reason that a fielding rating system which has a lower variance will ‘feel’ more accurate than a system with a higher variance. The Hardball Times Baseball Annual, again courtesy of the guys over at The Fielding Bible, gives the Plus/Minus team rankings for 2007 and 2008. The average variance between teams’ rankings (1 to 30) for those seasons is about 118.
118! That’s it? No, obviously something is needed to compare it to. By referring to Baseball Prospectus you can find the DER for 2007 and 2008. The variance for those two years is about 109, less than for Plus/Minus. The variance from 2006 to 2007 for DER is also less, at 86. Those aren’t really huge differences, but they’re what I have freely available.
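The exact formula behind those figures isn’t spelled out here, so the following Python sketch is only one plausible reading: the average squared change in a team’s rank from one season to the next. The function name, the team labels and the ranks are all made up for illustration; they are not the real 2007 or 2008 rankings.

```python
def mean_squared_rank_change(ranks_a, ranks_b):
    """Average of squared season-to-season rank changes across teams.

    Assumed interpretation of the 'variance' discussed in the article.
    """
    teams = ranks_a.keys() & ranks_b.keys()
    return sum((ranks_a[t] - ranks_b[t]) ** 2 for t in teams) / len(teams)


# Hypothetical five-team league, for illustration only
ranks_year_1 = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}
ranks_year_2 = {"A": 2, "B": 1, "C": 5, "D": 3, "E": 4}

stability = mean_squared_rank_change(ranks_year_1, ranks_year_2)
print(stability)  # 1.6
```

Whatever the exact formula, the idea is the same: the lower the number, the less the rankings jump around from year to year.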
I really wasn’t comfortable with the three guys and their VCRs overcoming the 5600 data points that DER gives me. But now, to believe the Plus/Minus system, I’ve got to ignore some 500,000 data points (all plays from 2006 to 2008). This isn’t by any means conclusive proof that the Plus/Minus system is bad, but it’s definitely something to think about.
Then there’s the final kicker. How did the Cubs do in 2006? In DER, 4th in MLB. In 2007, they were 2nd, and they were 2nd again last year. So breathe easy, Cubs fans and Carlos Zambrano. The Cubs can field noticeably better than The Fielding Bible can guesstimate fielding prowess, and you can be pretty confident they’ll carry that forward into 2009.