Are Fantasy Baseball Pitchers Correctly Valued In Player Raters?

(Part 2 of How Valid Is the ESPN Player Rater?)

If you’ve ever seen the ESPN Player Rater (or, for that matter, other quantitative player rankings for fantasy baseball), you’ve likely asked yourself:

How could there be so many starting pitchers at the top? (13 in top 20, 19 in top 30) Is that valid or just faulty weighting?

This looks even more peculiar when reviewing qualitative rankings (i.e., someone subjectively lists players) or reviewing pre-draft rankings.

Before creating our own Player Rater, we assumed that the preponderance of starters in the top ranks of ESPN’s Player Rater was due to faulty methodology rather than the true value of starters vs. hitters.

So it came as a mild surprise to find that 12 of our top 20 were starters (and we also agreed with ESPN that JJ Putz deserved top 20 inclusion). We were somewhat relieved to find only 2 starters in our 21-30 ranks, so our 14 starters out of the top 30 was fewer than ESPN’s 19.

But those were gut reactions. Now that we’ve gone through the exercise, is there truth to ESPN’s (and our) pitcher-heavy top of the rankings? Are they eerily prescient or is this a broken clock scenario?

[It’s important to differentiate this exercise – the proper valuing of player statistics – from the projection of future statistics, which is done by folks like Baseball Prospectus and Ron Shandler and is used for drafting purposes. Their analysis has shown that projecting hitter stats is more accurate than projecting pitcher stats, which makes hitters less risky for drafting. While it’s extremely rare to see top 10 draft results with more than 2 pitchers, this does NOT mean pitchers are less valuable. Draft position is based on a risk/value assessment – our analysis just focuses on the ‘value’ part of the equation.]

We’re going to look at this as two separate subquestions: 2A) Should there be a lot of starters in the final season top 20? & 2B) Is ESPN’s ranking of starting pitchers correct?

For question 2A, let’s first look at the players in our top 20 as well as that of ESPN. We shared 19 of the same 20 players, albeit in different order. Ours includes David Ortiz at #19 while they have Cole Hamels included at #20. (Still, agreeing so much with ESPN feels unclean.)

Our Ranking. Name – Pos (ESPN Ranking)

1. Jake Peavy – SP (2)

2. Alex Rodriguez – 3B (1)

3. C.C. Sabathia – SP (3)

4. Johan Santana – SP (4)

5. Matt Holliday – OF (8)

6. Hanley Ramirez – SS (7)

7. Brandon Webb – SP (6)

8. Josh Beckett – SP (5)

9. Magglio Ordonez – OF (11)

10. John Lackey – SP (9)

11. Jimmy Rollins – SS (14)

12. David Wright – 3B (15)

13. Erik Bedard – SP (13)

14. J.J. Putz – RP (10)

15. Aaron Harang – SP (12)

16. Dan Haren – SP (16)

17. Fausto Carmona – SP (19)

18. John Smoltz – SP (18)

19. David Ortiz – 1B (22)

20. Javier Vazquez – SP (17)

22. Cole Hamels – SP (20)

To understand the impact of each category on these players’ total points, we looked at the mean and median points per category for the 13 starters and 6 hitters (excluding Putz).

Category: Mean; Median

R: 4.4; 3.1

HR: 5.1; 3.1

RBI: 4.9; 2.0

SB: 3.5; 0.4

AVG: 5.3; 2.4

W: 4.4; 4.9

SV: -0.3; -0.3

ERA: 4.8; 5.4

WHIP: 4.4; 4.8

SO: 4.4; 4.8

While the means per category look very consistent across the hitting and pitching stats (aside from Saves), the medians per category are smaller for hitter stats. This is because even the greatest hitters are rarely great in more than 2-3 categories. While the numbers may average out high, it’s because some players dominate a category (HR=A-Rod) and others are merely very good or good. A-Rod led the majors in R, HR and RBI but was outside the top 20 in AVG and SB. Hanley Ramirez was top 10 in R, SB and AVG but outside the top 20 in HR and the top 50 in RBI. David Wright isn’t in the top 10 for any category – his true value is being very good across the board. David Ortiz wouldn’t steal a base if you stuffed it with pork and deep-fried it.
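To make the mean-versus-median point concrete, here is a small Python sketch that subtracts the median from the mean for each category, using only the figures in the table above. The function name and dictionary layout are our own choices for illustration, not part of either rater.

```python
# (mean, median) points per category, copied from the table above.
hitter_cats = {
    "R": (4.4, 3.1),
    "HR": (5.1, 3.1),
    "RBI": (4.9, 2.0),
    "SB": (3.5, 0.4),
    "AVG": (5.3, 2.4),
}
# Saves are left out because they are not a starter category.
starter_cats = {
    "W": (4.4, 4.9),
    "ERA": (4.8, 5.4),
    "WHIP": (4.4, 4.8),
    "SO": (4.4, 4.8),
}


def mean_minus_median(cats):
    """Return {category: mean - median}, rounded to one decimal."""
    return {name: round(mean - median, 1) for name, (mean, median) in cats.items()}


hitter_gaps = mean_minus_median(hitter_cats)
starter_gaps = mean_minus_median(starter_cats)

print("hitters: ", hitter_gaps)
print("starters:", starter_gaps)
```

Every hitting category has a mean well above its median (a few dominant players pull the average up), while every starter category has a median slightly above its mean.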

Looking at the top 20 in batting average, only 5 of these players were in the top 20 for HRs (Holliday, Ortiz, Pujols, Wright, and M. Cabrera). (Note: Braun didn’t have enough ABs to qualify for average but was in top 20 for HRs)

The best pitchers, on the other hand, tend to be great or at least very good in all four categories. Looking at the top 20 in the MLB for Wins, Strikeouts, and ERA, there are 19 pitchers who are in at least two of the categories – 5 of those pitchers (Peavy, Webb, Lackey, Sabathia, and Beckett) are in the top 20 across all three.

This leads to a rather straightforward theory – starters are more likely to populate the top 20 in a player rater because the top pitchers tend to get high points in all of their categories, whereas hitters are great in only a couple of categories.

We tested this larger theory of “pitching stats are more connected” by doing a correlation analysis on all the hitters and starters to see how closely the performance in one stat is correlated with another.

A perfect 100% would mean that the stats are absolutely correlated – say, purchases at a store and a store’s sales tax receipts (every $1 in purchases would be x% in taxes). A -100% would mean that the stats are completely inverse – say, the amount of total salary a baseball team can afford and their likelihood of picking up Jose Lima.
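For readers who want to run this kind of check themselves, here is a minimal Python sketch of a correlation calculation. It assumes the standard Pearson coefficient (we don't spell out above which measure was used), and the purchase amounts and the 8% tax rate are made-up numbers that mirror the sales tax example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical store purchases and an assumed 8% sales tax rate.
purchases = [100.0, 250.0, 40.0, 900.0, 515.0]
tax_receipts = [p * 0.08 for p in purchases]
# A made-up, perfectly inverse series for the -100% case.
inverse = [-p for p in purchases]

print(f"purchases vs. tax receipts: {pearson(purchases, tax_receipts):.0%}")
print(f"purchases vs. inverse:      {pearson(purchases, inverse):.0%}")
```

Swapping in two columns of player stats (say, HR and RBI for every hitter) would give the kind of percentage reported in the results.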

Below are the results of the correlation analysis.

Analysis:

90-100%: (none)

80-89%: HR/RBI, ERA/WHIP

70-79%: (none)

60-69%: R/RBI, W/K

50-59%: R/HR, W/ERA, K/ERA, K/WHIP

40-49%: R/SB, R/AVG, W/WHIP

30-39%: RBI/AVG

20-29%: (none)

10-19%: HR/AVG, SB/AVG

0-9%: (none)

Negative: HR/SB, RBI/SB, all stats with saves for starters

Let’s start with the highest correlating stats for hitters and pitchers: HR/RBI and ERA/WHIP. The fact that these stats correlate highest should be rather self-evident.

Those that do well in HR and RBI correlate positively with Runs but poorly with SB and AVG. There is barely any correlation on HR/AVG and a mild one for RBI/AVG (which makes sense since it does require hits generally to drive in runs). SBs are negatively correlated to HR and RBI – not a surprise to anyone who has ever drafted Juan Pierre, Scott Podsednik, or Willy Taveras.

Runs prove to be an interesting category as, besides HR/RBI, they also correlate well with SBs and AVG. This is likely due to high SB and AVG players being on base a lot, at the top of the lineup, and getting driven in by the HR/RBI guys.

So what we tend to have are two types of valuable hitters: Power/middle of the lineup guys who provide strong R/HR/RBI or Speed/top of the lineup guys who provide strong R/SB/AVG. Players like Magglio Ordonez hitting .360 while providing solid power numbers or Hanley Ramirez hitting 29 home runs while providing great R/SB/AVG numbers are the EXCEPTIONS and not the rule.

With ERA/WHIP, these stats positively correlate at 40-60% with Wins and K’s. While there are those that do well in just one of these categories (say Wang in Wins or Kazmir in K’s, alliteration unintentional), a great starter doing well in most, if not all, of the categories is more common than with hitters. Since there are more successful 3-4 category pitchers than hitters (where you generally have to trade off strengths against weaknesses like Ichiro’s R/SB/AVG vs. HR/RBI), it makes sense that starters are disproportionately valuable.

The final point here – which was covered in the Peavy vs. A-Rod comparison – is the fact that starting pitchers have more influence over a team’s total stats than a hitter. This is particularly true in ERA and WHIP where a top starter may represent around 20% of your innings. Compare this to a hitter who is lucky to represent 8-9% of your ABs.

An illustrative comparison is looking at the stats of the 20th player in our rater – Javier Vazquez. His stat line of 15W/3.74/1.14/213K looks pretty good but what would the equivalent be in value for a hitter? If we link up Runs to Wins, HR to ERA, RBI to K’s, SBs to SV, and AVG to WHIP, our model would require a 120/21/137/0/.345 hitter. (fyi, if you want to see how SBs would play a role, switching the SB and AVG values would net .278 and 36 SBs)

So the answer to Question 2A is yes. We do feel that starters for any particular season should represent a majority of the top 20 value slots – unless, of course, a breed of power/speed guys starts cropping up that racks up RBIs and doesn’t suck at AVG (see Mike Cameron, Chris B. Young).

On to Question #2B, is ESPN’s ranking of starting pitchers correct?

Let’s take a look at the pitchers just outside the top 20 in ESPN’s Player Rater and compare them to our totals.

ESPN Rank – Player Name – Our Rank

20 – C. Hamels – 22

21 – J. Verlander – 26

23 – K. Escobar – 36

24 – T. Lilly – 34

27 – J. Shields – 43

28 – T. Hudson – 42

29 – S. Kazmir – 55

Scott Kazmir’s stat line of 13W/3.48/1.38/239Ks netted him 3.25/3.58/2.55/4.98 in ESPN Player Rater points for a total of 14.36. Turning that into percentages, we’re looking at 22% for Wins, 25% for ERA, 18% for WHIP, and 35% for K’s.
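The percentage breakdown is just each category's points divided by the total. A short Python sketch using the ESPN figures quoted above (the function name is ours, purely for illustration):

```python
# Kazmir's ESPN Player Rater points per category, from the paragraph above.
espn_points = {"W": 3.25, "ERA": 3.58, "WHIP": 2.55, "K": 4.98}


def category_shares(points):
    """Return (total, {category: share of total}) for a points breakdown."""
    total = sum(points.values())
    return total, {cat: pts / total for cat, pts in points.items()}


total, shares = category_shares(espn_points)

print(f"total: {total:.2f}")
for cat, share in shares.items():
    print(f"{cat}: {share:.1%}")
```

Computed to one decimal place, the Wins share lands between 22% and 23%, so whole-number rounding can differ by a point from the figures quoted.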

In our rater, Scott Kazmir totaled 10.5 points, split 2.2/3.2/-1.4/6.8 across the four categories, or 21% for Wins, 31% for ERA, -14% for WHIP, and 64.5% for K’s.

Why the negative in WHIP? Because the Best Available Option (BAO) pitcher had a WHIP of 1.32 which bests Kazmir’s 1.38. His WHIP hurts your team, but you’ll take it because he does well in the other stats – especially K’s where he’s truly excellent.

The Kazmir comparison highlights several flaws in ESPN’s ranking of pitchers:

1) Capping High Points at 5 – Kazmir’s contribution in strikeouts has a greater impact on a team than, say, A-Rod’s run total. Treating them both at around 5 distorts Kazmir’s one special thing. It’s as if he’s Dirk Diggler and his “one special thing” has been shortened a couple of inches. (We’ll explore this concept further in another post – the capping, not fictional schlongs.)

2) Positive ERA/WHIP Contributions Are Undercredited – Kazmir’s 206.2 IP at a 3.48 ERA warrants slightly more credit than ESPN doles out since it is well below the BAO ERA of 3.96. If you estimate Kazmir represents between 1/6 and 1/7 of a team’s innings (figure 4 more starters = 800 IP, 4 relievers = 300 IP, for a total of about 1300 IP, of which Kazmir’s total is about 1/6.5), that 0.48 difference in ERA nets out to about a 0.07-0.08 drop in team ERA. This is equivalent to the impact of a player hitting about .337 or driving in 116 runs. (This concept is also covered in the A-Rod vs. Peavy post.)

3) Negative Contributions Aren’t Penalized – Kazmir’s WHIP of 1.38 is worse than the 1.32 WHIP of the Best Available Option pitcher (the top FA starter if the best 50 starters and 40 relievers were taken). How could this net positive points? He’s hurting your team. The WHIP is a tradeoff for his other stats. It’s like going out with a girl because she’s hot AND crazy instead of because she’s hot and IN SPITE of the fact she’s crazy. This ‘tradeoff’ cost is also present in counting stats for second-tier pitchers like C. Wang (gets fewer than average strikeouts) or Chris “Tall San Diego Pitcher” Young (his 9 wins are below average).

A 4th issue related to #3 – the overcrediting of slightly above average performance – is more apparent with Ted Lilly’s ERA (3.83 vs. BAO 3.96) and WHIP (1.14 vs. BAO 1.32). While Lilly’s ERA should get some positive credit, it is not worth nearly as much as his WHIP. ESPN’s system doesn’t value the two that differently – Lilly gets 3.12 points in ERA and 4.30 in WHIP. On the other hand, we credit Lilly with 1.3 ERA points and 4.8 WHIP points.
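The team ERA arithmetic in issue #2 can be checked with a few lines of Python. This is a sketch under stated assumptions: the rest of the staff throws the 800 + 300 innings estimated above, and swapping the Best Available Option pitcher for Kazmir changes nothing else. The helper names are ours.

```python
def innings(ip_notation):
    """Convert box-score innings (206.2 = 206 and 2/3) to a decimal value."""
    whole = int(ip_notation)
    outs = round((ip_notation - whole) * 10)
    return whole + outs / 3


def team_era_drop(pitcher_ip, pitcher_era, baseline_era, other_ip):
    """Team ERA improvement from replacing a baseline pitcher with this one,
    assuming the same innings and an unchanged rest of the staff."""
    share = pitcher_ip / (pitcher_ip + other_ip)
    return share * (baseline_era - pitcher_era)


kazmir_ip = innings(206.2)   # 206 and 2/3 innings
other_ip = 800 + 300         # 4 more starters + 4 relievers, per the estimate above
share = kazmir_ip / (kazmir_ip + other_ip)
drop = team_era_drop(kazmir_ip, 3.48, 3.96, other_ip)

print(f"share of team innings: 1/{1 / share:.1f}")
print(f"team ERA drop: {drop:.3f}")
```

With these inputs Kazmir works out to a bit more than 1/6.5 of the staff's innings, and the drop falls inside the 0.07-0.08 range quoted above.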

The summary of all these ESPN Player Rater flaws is the following:

Wins – Slightly undercredits great performance (issue #1), Doesn’t penalize below average performance (issue #3)

ERA – Undercredits great ERAs, overcredits slightly above average to bad ERAs

WHIP – Undercredits great WHIPs, overcredits slightly above average to bad WHIPs

Strikeouts – Similar to Wins but more pronounced.

For the pitchers in the top 20, these effects are minor – Harang and Vazquez are slightly inflated given their so-so ERAs, Carmona is deflated given his great ERA (3.06), and Beckett and Webb switch places because Webb’s ERA superiority trumps Beckett’s 2 extra wins. Outside the top 20, the results become more pronounced since issues #3 and #4 play a greater role in the distorted value.

Thus the verdict for Question #2B – is ESPN’s ranking of starting pitchers correct? The answer is no. Their rater does a fair job at the top, but it gets continually distorted as you move outside the top 20 players because it doesn’t properly penalize mediocre to below average performance.

(Note: Unfortunately, the ESPN Player Rater’s improper penalizing of below average performance has a lot in common with the internal review of Baseball Tonight anchors – please tell me John Kruk, Orestes Destrade and Eric Young aren’t coming back…)

Come March, when you’re preparing for your draft and trying to decide between pitchers, you can avoid the above mistakes by just comparing the two pitchers’ projected stats and crediting a point for each of the following increments: 1.5 Wins = 0.19 ERA = 0.04 WHIP = 18 Ks. It might take a little more calculating but it could be the difference between taking Jeremy Bonderman over John Lackey (that decision still haunts me from last year…)
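If you'd rather not do that by hand, the rule of thumb is easy to script. The Python sketch below is our own illustration of the increments, not the full rater: it just divides each stat difference by its increment, flipping the sign for ERA and WHIP since lower is better. The two stat lines are the Vazquez and Kazmir lines quoted earlier, standing in for projections.

```python
# Size of a one-point gap in each category, from the rule of thumb above.
POINT_INCREMENTS = {"W": 1.5, "ERA": 0.19, "WHIP": 0.04, "K": 18}
# Lower is better in these categories, so their differences are flipped.
LOWER_IS_BETTER = {"ERA", "WHIP"}


def point_edge(pitcher_a, pitcher_b):
    """Per-category points by which pitcher_a beats pitcher_b (negative = trails)."""
    edge = {}
    for cat, step in POINT_INCREMENTS.items():
        diff = pitcher_a[cat] - pitcher_b[cat]
        if cat in LOWER_IS_BETTER:
            diff = -diff
        edge[cat] = diff / step
    return edge


# Stat lines quoted earlier in the article, used here in place of projections.
vazquez = {"W": 15, "ERA": 3.74, "WHIP": 1.14, "K": 213}
kazmir = {"W": 13, "ERA": 3.48, "WHIP": 1.38, "K": 239}

edge = point_edge(vazquez, kazmir)
total = sum(edge.values())

for cat, pts in edge.items():
    print(f"{cat}: {pts:+.1f}")
print(f"total edge for Vazquez: {total:+.1f}")
```

On these lines Vazquez comes out roughly 4.5 points ahead, with nearly all of the edge coming from WHIP, while Kazmir wins ERA and strikeouts by a little over a point each.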