With great pride and bland post titling, I’d like to announce a Beta release of our fantasy baseball in-season player rater as well as two charts that highlight the differences between pitcher FIP vs. ERA and batter BABIP vs. AVG.
The player rater work is an adaptation of the Point Shares methodology I’ve used the last couple of years for pre-season and post-season player estimates. Here is a link to a favorable test I did earlier this year vs. ESPN’s player rater methodology. After some trial and error plus assistance from a variety of folks (Eric K at my favorite fantasy baseball escort service – EliteFantasyPlayers.com - and Doug at Dougstats.com among others), we now have a fairly automated system for updating in-season player rankings on a daily basis.
ESPN Roster format (C/1B/2B/SS/3B/5 OF/CI/MI/UTIL/9 P) – 10 Team / 12 Team / 14 Team / 15 Team / 16 Team MLB
Yahoo! Roster format (C/1B/2B/SS/3B/3 OF/2 UTIL/2 SP/2 RP/4 P) – 10 Team / 12 Team / 14 Team / 15 Team / 16 Team MLB
AL-Only (2 C/1B/2B/SS/3B/5 OF/CI/MI/UTIL/9 P) – 10 Team / 12 Team
NL-Only (2 C/1B/2B/SS/3B/5 OF/CI/MI/UTIL/9 P) – 10 Team / 12 Team
The table is ranked based on a player’s projected Point Shares (a player’s value in standings points vs. the average player with some factoring in of position). Dollar estimates are provided both for in-season as well as comparisons vs. pre-season estimates. Take the dollar estimates with a grain of salt for now – they should become more stable as the season goes on. You can filter by position (P for all pitchers, -P for all hitters) and sort by any of the columns (one click ascending, two clicks descending).
There are two pages focused on popular hitter/pitcher stats outside of 5×5 (popularity based on our pre-season poll results). These tables are filterable/sortable as well.
Hitting – OBP, SLG, OPS, Hits, Total Bases
Pitching – Quality Starts, Holds, Losses
Lastly, there are two tables that highlight differences between pitcher FIP vs. ERA and hitter BABIP vs AVG.
The pitcher table is sorted based on the ‘luckiest’ pitchers – i.e. pitchers ranked in descending order based on the difference between their FIP and their ERA. For those wondering why I chose FIP over xFIP, I do not have access to the league-average HR:FB ratio nor pitcher HR:FB ratios. You may also find that my FIP estimates are slightly off from other sources – this is mainly because I cannot currently separate intentional from non-intentional walks, but it can also be due to how the ‘constant’ (typically in the 3.20 range) is applied to bring league-average FIP in line with league-average ERA.
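As a rough illustration of the sort described above, here is a minimal sketch using the commonly published FIP formula. The `PitcherLine` class, the function names, and the 3.20 constant are assumptions made for the example, not the actual code or constant behind the table; walks are left unseparated (intentional plus unintentional), matching the limitation noted above.

```python
from dataclasses import dataclass

# Assumed placeholder; the real constant is re-derived from league totals each season.
FIP_CONSTANT = 3.20


@dataclass
class PitcherLine:
    name: str
    ip: float  # innings as a true decimal (6.333...), not box-score "6.1" notation
    er: int
    hr: int
    bb: int    # all walks; intentional walks are not separated out
    hbp: int
    k: int


def era(p: PitcherLine) -> float:
    return 9 * p.er / p.ip


def fip(p: PitcherLine, constant: float = FIP_CONSTANT) -> float:
    # Commonly published form: (13*HR + 3*(BB+HBP) - 2*K) / IP + constant
    return (13 * p.hr + 3 * (p.bb + p.hbp) - 2 * p.k) / p.ip + constant


def luckiest_first(pitchers):
    """Sort so the largest FIP-minus-ERA gap (the 'luckiest' pitcher) comes first."""
    return sorted(pitchers, key=lambda p: fip(p) - era(p), reverse=True)
```

A pitcher whose ERA sits well below his FIP lands at the top of the list; one whose ERA is well above his FIP lands at the bottom.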
The hitter table is sorted based on the ‘luckiest’ hitters - i.e. hitters ranked in descending order based on the difference between their current AVG and their expected AVG. A hitter’s expected AVG is calculated by applying a hitter’s 3-year BABIP to their in-season performance. I used 3-year BABIP because this stat varies per hitter based on various factors (line drive rate, speed, GB to FB ratio, etc.) but tends to be steady for a given hitter in the long run. Hitters with fewer than 100 AB in the previous 3 years are given the league average BABIP of .300.
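One plausible way to apply a 3-year BABIP to in-season stats is sketched below. The post does not spell out the exact calculation, so treat the function names and the balls-in-play arithmetic as assumptions: it uses the standard BABIP denominator (AB - K - HR + SF), gives the hitter his historical BABIP on those balls in play, and adds his actual home runs back in.

```python
LEAGUE_BABIP = 0.300  # fallback from the text for hitters with little history
MIN_PRIOR_AB = 100    # threshold from the text


def expected_avg(ab, k, hr, sf, prior_babip, prior_ab):
    """Expected AVG if in-season balls in play fell for hits at the hitter's 3-year rate."""
    if ab <= 0:
        return 0.0
    rate = prior_babip if prior_ab >= MIN_PRIOR_AB else LEAGUE_BABIP
    balls_in_play = ab - k - hr + sf
    expected_hits = rate * balls_in_play + hr  # home runs are not balls in play
    return expected_hits / ab


def luck(avg, ab, k, hr, sf, prior_babip, prior_ab):
    """Positive means the hitter is batting above his expected AVG."""
    return avg - expected_avg(ab, k, hr, sf, prior_babip, prior_ab)
```

Sorting hitters by `luck` in descending order would reproduce the ordering described above, under these assumptions.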
I’ll do my best to keep these tables updated daily (generally by 10 AM EST). The last column of each chart reflects the last games included so it will be transparent when it has not been updated for a couple of days. While I will do my best to keep on top of the moving pieces, please do not hesitate to provide the following information in the comments section of Grey’s and/or my posts:
Any missing players from the tables (for now, I’m including any hitter/pitcher with 1+ AB or 0.1+ IP. That minimum threshold will likely increase as the year goes on.)
Any position eligibility changes based on 10 game in-season eligibility (I know Yahoo! is 5 games but prefer to make one change across both). For hitters where position eligibility seems imminent (e.g., Jesus Montero at catcher), I include the additional position and add an asterisk at the end of it.
Any wonky data or functionality
Other potential FAQ:
Will you ever have a ‘rest of season’ player rater?
Maybe. Would be dependent on a respected projection source providing an uploadable file that is 1) updated on a regular basis, 2) accounts for expected playing time, and 3) is free.
Will you create a dynamic player rater to reflect any conceivable league format?
Not planning on it.
Was your wife turned on by this accomplishment?
Nope. She prefers it when I go Don Draper and fix shit around the house in a white t-shirt.
One of the biggest challenges facing the fantasy baseball fanatic is how to value and rank players. This is felt most acutely during draft season when nearly every fantasy sports site/expert has their own Top 200/300 rankings and each manager has to decide which source(s) to believe. This challenge is also felt – albeit to a lesser extent – during the season when managers are looking for a ‘player rater’ to determine trade values.
When we started Razzball a couple of years back, I decided to leverage whatever limited math and Excel skills I had to come up with the best source of player ranking/valuation. This eventually led me to create Point Shares, which estimate the difference in an average team’s points if they were to substitute a given player for the average player at his position. This differs from the Standings Gain Point (SGP) concept, which is more akin to VORP/WAR methodology and uses something close to the replacement player for valuing players. I prefer using the ‘average player’ as a benchmark vs. a ‘replacement player’ but I am not going to delve into methodology in this post (you can reference this post if this topic tickles your curiosity).
There are three major components when it comes to developing preseason rankings:
Playing Time Estimates (PA/AB and IP)
Statistical Projections (HRs per AB, K/9, etc.)
Methodology For Converting #1 and #2 Into Player Rankings/Value
I assume the majority of ‘expert’ rankings (Grey’s included) do not split out each of these three components. Instead, they leverage their past fantasy baseball experience plus playing time/statistical expectations (likely based on third-party projections and, increasingly, component stats like BABIP and FIP) and create what I would call a ‘curated’ rankings list and/or auction $ values. Mathy types might initially dismiss this as non-scientific but, in my eyes, a well-curated set of player rankings beats a poorly-architected quantitative system every day of the week.
Testing curated rankings, though, is a challenge. Unlike with projection systems where each statistic can be broken out and measured (here is a recent test done on FanGraphs), one would need to design a test that used only the order in which the players were ranked (note: $ estimates are easier). A test design based on simulating draft results creates both logistical and methodological challenges. For the past three years, I have taken part in a ‘Forecaster’s Challenge’ run by Tom Tango (co-author of The Book, creator of the Marcel projection system, and prolific blogger on InsideTheBook.com) which simulated drafts based on submitted rankings and credited victories based on total points (like in a ‘points’ league, each statistic is worth a certain amount of points). I think Tom did a great job at creating an impartial test but the conceits with any test of this type are difficult to overcome (e.g., simulated snake draft vs. actual, reducing 2B/SS/3B to one position, 20 teams vs. the standard 12 teams, inability to factor how well a team could overcome certain draft disappointments vs. others (say, Dunn vs. Hanley) through FA replacements, etc.).
Note: I believe I finished around the middle of the pack in each challenge. My dissatisfaction with some of the test conceits is unrelated to my performance.
When my brain awoke this February/March from my annual winter hibernation from baseball, I hit upon the construct for a test that I believe can remove many of the conceits of past tests. It could be used to test all of the following:
Player Ranking/Value Methodologies – aka ‘Player Raters’ (component #3)
Pre-Season Rankings (baking in components #1, #2, #3)
Statistical Projections (component #2 – specifically, how well do projection systems project stats relevant for fantasy baseball)
Playing Time Estimates (component #1)
In the process, it could also determine how much the final standings are impacted by one’s draft selections as well as the reliability (or lack thereof) of pre-season standings (as in using projections to determine who looks best in the pre-season).
Here is the test:
Take the draft results by team from the 38 Razzball Commenter Leagues in 2011 (hosted on ESPN, 12 team, MLB, 5×5, C/1B/2B/SS/3B/CI/MI/UTIL/9P/3 Bench/1 DL, 180 Games Started, Daily Roster Changes, Unlimited FA/Waiver pickups). This amounts to 456 teams’ worth of draft data.
Create a team total based on ‘expert’ rankings/$ totals/other arbitrary metric (like ESPN Player Rater Total Points)
See how these team totals correlate with each team’s final Total Standings Points
Notes:
For testing individual components, the other two components must be kept constant (e.g., to test Playing Time Estimates, use the same Statistical Projections and Player Ranking/Value Methodology).
Rankings need to be converted into $ because the difference in value between picks progressively gets smaller as the draft progresses.
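The test steps above boil down to summing a value per drafted player for each team and correlating those sums with final standings points. Here is a minimal sketch of that idea; the function names and the data layout (dicts keyed by team and player) are hypothetical, and a plain Pearson correlation is assumed since the post does not name the exact statistic used.

```python
from math import sqrt


def team_totals(rosters, player_values):
    """Sum each team's drafted-player values; players with no value count as 0."""
    return {
        team: sum(player_values.get(player, 0.0) for player in players)
        for team, players in rosters.items()
    }


def pearson(xs, ys):
    """Pearson correlation; assumes both lists vary (non-zero spread)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def draft_vs_standings(rosters, player_values, standings_points):
    """Correlate team draft totals with final standings points, team by team."""
    totals = team_totals(rosters, player_values)
    teams = sorted(totals)
    return pearson([totals[t] for t in teams], [standings_points[t] for t in teams])
```

Swapping in a different `player_values` dict (a different player rater, or pre-season $ values) while keeping the rosters and standings fixed is what lets one source be compared against another.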
The key benefits of such a test vs. a simulated test are:
These are ACTUAL draft results based on real fantasy baseball manager behavior.
The team standings points reflect ACTUAL in-season fantasy baseball manager behavior such as replacing poor-performing draft picks, using replacement players when players are injured, etc.
I’m going to focus my first test on Player Ranking/Value Methodologies (aka Player Raters) because it is the easiest one to do. Why? Because there is a sure-fire, uncontroversial source for Playing Time Estimates and Statistical Projections to act as the ‘constant’ – 2011 Final Season Statistics.
I tested the following free public sources as part of the test:
ESPN Player Rater (It’s one player rater so no way to customize for league format. Note: This link will likely be overwritten with 2012 data once the season starts. I have archived the results.)
Last Player Picked - 12 Team League, $260, ‘Optimal Hitter/Pitcher Mix’, using same roster format as listed above with 6 SP/3 RP (most representative split of pitchers based on league behavior)
In addition, I tested a total points formula that’s primarily based on the one that Tom Tango created for the Forecaster Challenge: HR+SB+(R+RBI)/3 + (H-(0.27*AB)) + 2*W + 1.5*SV + K/5 + (IP-(H+BB+ER)/2). The one difference was to multiply Saves by 1.5 vs. 1 to better reflect RP value.
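For anyone who wants to reproduce the points formula, here it is transcribed as two small functions (the split into hitter and pitcher functions and their names are just for illustration). The last pitching term is read literally as IP minus half of (H + BB + ER).

```python
def hitter_points(hr, sb, r, rbi, h, ab):
    """HR + SB + (R+RBI)/3 + (H - 0.27*AB)"""
    return hr + sb + (r + rbi) / 3 + (h - 0.27 * ab)


def pitcher_points(w, sv, k, ip, h, bb, er):
    """2*W + 1.5*SV + K/5 + (IP - (H+BB+ER)/2), with saves at 1.5 rather than 1."""
    return 2 * w + 1.5 * sv + k / 5 + (ip - (h + bb + er) / 2)
```

A player who both hits and pitches would simply have the two results added together.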
Since these leagues are ostensibly populated by Razzball readers, I first wanted to test to see if there might be any bias in draft behavior. Below are the correlation percentages between the Average Draft Positions (ADP) for players in the Razzball Commenter Leagues (RCL) vs. Grey’s pre-season 2011 rankings, the pre-season 2011 Point Share rankings, and the ESPN Top 300 for 12-Team leagues. I included all players drafted in at least 30 of the 38 leagues. I broke out the ADP for the top 100 teams vs. all 456 teams to see if there might be a ‘Razzball’ bias amongst only the top teams.
| Correlation (%) | RCL ADP – Top 100 Teams | RCL ADP – All Teams | Grey’s Rankings | Point Shares (3/8) | Point Shares (Late March) | ESPN Top 300 (12-Team) |
| --- | --- | --- | --- | --- | --- | --- |
| RCL ADP – Top 100 Teams | - | 99.5 | 91.8 | 79.5 | 82.7 | 96.7 |
| RCL ADP – All Teams | 99.5 | - | 92.6 | 80.4 | 81.5 | 96.7 |
| Grey’s Rankings | 91.8 | 92.6 | - | 72.5 | 72.9 | 85.5 |
| Point Shares (3/8) | 79.5 | 80.4 | 72.5 | - | 96.9 | 78.3 |
| Point Shares (Late March) | 82.7 | 81.5 | 72.9 | 96.9 | - | 79.9 |
| ESPN Top 300 (12-Team) | 96.7 | 96.7 | 85.5 | 78.3 | 79.9 | - |
I assume the extremely high correlation with ESPN’s Top 300 for 2011 (96.7%) is driven by the default ADP used in the draft software. Interestingly, Grey’s rankings and my Point Share rankings differ from ESPN’s (78-85% correlation) but differ more from each other (~73% correlation). Given these correlations, I think it’s fair to assume that the Razzball Commenter League (RCL) draft results are fairly indicative of standard ESPN drafts.
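For reference, the ADP inputs behind this comparison could be assembled along these lines. This is only a sketch: the function name and the one-dict-per-league layout (player mapped to overall pick number) are assumptions, with the "drafted in at least 30 of the 38 leagues" cutoff exposed as a parameter.

```python
from collections import defaultdict


def average_draft_positions(drafts, min_leagues=30):
    """ADP per player, keeping only players drafted in at least `min_leagues` leagues.

    `drafts` is a list with one dict per league mapping player -> overall pick number.
    """
    picks = defaultdict(list)
    for league in drafts:
        for player, pick in league.items():
            picks[player].append(pick)
    return {
        player: sum(found) / len(found)
        for player, found in picks.items()
        if len(found) >= min_leagues
    }
```

Running it once on all leagues and once on only the drafts of the top 100 teams would give the two ADP columns to correlate against each set of rankings.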
Here are the correlation % results for the Player Ranking/Value Methodologies (links to each were provided above, here is an aggregated view). I tested both my actual Point Shares as well as my conversion to dollars. ESPN Player Rater is based on their Total Points in their Player Rater. Last Player Picked is based on their $ estimates.
Other notes:
Players not found in a player rater (usually based on injuries/missed playing time) are set at $0 for Point Shares/LPP and zero for ESPN Player Rater.
Any player with < $0 in Point Shares/LPP is set to $0 as players that bad (or who missed that much time) were likely excised from a team roster before they could do a full season’s worth of damage (and, remember, ‘replacement value’ is at $0). For instance, Brian Matusz was drafted in every league. His value was -$31 in Point Shares $, -$25 in LPP, and -7.08 in ESPN Player Rater points. All are now set to zero. For raw Point Shares, I set the floor at -2.64, which is the equivalent of $0.
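The two adjustments in these notes amount to a floor plus a default for missing players. A minimal sketch, with a hypothetical function name; treating a missing player as sitting exactly at the floor in the raw Point Shares case is an assumption, since the notes only spell out the $0 default.

```python
def floored_value(player, values, floor=0.0):
    """Return a player's value, never below `floor`; missing players get `floor`."""
    return max(floor, values.get(player, floor))


# Dollar values from the Matusz example above: a negative value is raised to $0.
dollars = {"Brian Matusz": -31.0}
matusz_dollars = floored_value("Brian Matusz", dollars)

# Raw Point Shares use -2.64 (the equivalent of $0) as the floor instead.
raw_floor = -2.64
```

Calling `floored_value(name, raw_point_shares, floor=raw_floor)` applies the same rule to raw Point Shares.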
| Source | Correlation With Team Standing Points |
| --- | --- |
| Point Shares | 63.8% |
| Point Shares (converted to $) | 63.7% |
| ESPN Player Rater | 56.7% |
| Last Player Picked | 55.2% |
| Points Formula | 49.7% |
Based on the above results, I would answer the question of “How much are the final standings impacted by one’s draft selections?” as probably somewhere in the 60-65% range. I can’t say for sure since it’s unclear what the actual ceiling is for player rater accuracy. Please note that this is a wholly different question than “How much are the final standings impacted by one’s draft selections as valued by preseason rankings/projections?” That will be answered in my next test.
The minute difference between Point Shares and my $ values isn’t that surprising since my dollar conversion formula is just a calculation from the Point Shares. If it resulted in significantly different results, it would be a sign that my calculation was flawed.
I cannot say for sure why Point Shares beats ESPN and Last Player Picked as I do not know all the details behind their methodologies. ESPN and LPP correlate with each other at 93.5%, which isn’t markedly higher than their correlations vs. Point Shares (89.9% for ESPN, 92.4% for LPP). Last Player Picked is the more transparent of the two in terms of methodology and it’s clear that Mays @ LPP uses ‘replacement level’ as the foundation of his analysis (vs. me using ‘average player level’). No idea if that really plays a role here. If I had to guess what drives ESPN’s Player Rater, I’d venture some application of Z-Scores per category.
I also really don’t care to spend too much energy researching why my Point Shares methodology appears to be superior. One thing I can say for sure is that my position factors (e.g. a catcher w/ same stats as an OF is worth more) have no measurable impact. I ran Point Shares with no positional adjustments and got a 63.73% correlation instead of 63.78% (z-score of -0.1).
If you compare each of the three rankings/$ estimates, you could potentially deduce some of the methodology differences. For instance, it probably comes as no shock to anyone familiar with the ESPN Player Rater that – when comparing it to Point Shares – some of the largest differences come into play with players whose primary value comes from stolen bases. Here are some examples (Point Share Rank / ESPN Rank): Michael Bourn (47/14), Elvis Andrus (77/51), Coco Crisp (109/60), and Brett Gardner (118/66).
If Bourn were really the 14th most valuable player, you’d think that teams who drafted him received great value (ESPN had him ranked 89th in the pre-season) and performed disproportionately better vs. teams that did not draft him in the Razzball Commenter Leagues. As you can see in this spreadsheet, teams who drafted Bourn finished almost exactly in the middle of the pack. While this test isn’t perfectly conclusive of player value (e.g., Granderson only ranked #89), the results seem to correlate fairly well with expectations. The following are in the top 10%: Kemp, Ellsbury, Weaver, Verlander, and Bautista. The following are in the bottom 10%: Hanley Ramirez, Carl Crawford, Jayson Werth, Joe Nathan, Kendrys Morales, Chase Utley. I wonder if this potential issue with the ESPN Player Rater is driving Matthew Berry’s 2012 love for Michael Bourn (note: even at his Point Shares rank of #47, you could argue that Bourn warrants a 3rd round pick if you felt confident he could repeat his 2011 stats. I’d consider him a 5th round pick at best.)
It also should be noted that both Last Player Picked and ESPN Player Rater have significant usability advantages vs. my Point Shares. Last Player Picked can customize $ estimates based on just about any league permutation imaginable. While ESPN Player Rater doesn’t allow for league customization, it is updated throughout the season which is a huge advantage vs. Point Shares/LPP.
I tried to be as transparent and unbiased as possible with this analysis. The one piece of information that I didn’t link to is the actual draft selections per team. I will provide that once I’ve completed my next analysis. Please feel free to comment with questions and/or to point out ways I may have screwed up the analysis.
My next test will be testing 2011 Player Rankings against team results. I will only use free, publicly available rankings unless authorized by someone at the company behind the subscription-based rankings. All player rankings must have a date stamp prior to the beginning of the 2011 season. If you see a notable omission below, please provide me with a link to the rankings. Thanks to FantasyPros.com who helped me gather some of the below rankings:
Razzball – Point Shares (1 version on March 8th, one done around end of March)
Razzball – Grey’s Rankings
ESPN – Matthew Berry’s Top 200
ESPN – Pre-Season Top 300
FantasyPros.com Aggregated Top 300
FoxSports Top 300
Hardball Times (Jeffrey Gross) – not public but permission-provided
KFFL Top 200
Last Player Picked – using 2011 Composite Stats
RotoChamp Top 300
RotoExperts Top 300
SI.com Top 300
USAToday.com Top 200
Note: CBSSports.com uses a static link for its free pre-season guide so the link now points to 2012 rankings (if someone has a saved download of the 2011 PDF, please e-mail it to me at rudy@razzball.com). Our pals at Yahoo! (perhaps wisely) do not publish pre-season rankings.
Point Shares are our proprietary methodology for ranking players. See here for a primer. If you’re in a rush or don’t care to read a methodology post, these rankings estimate a player’s impact on a team’s points vs. the average drafted player at that position. Ever wonder what the value is of, say, Carl Crawford’s SBs? Our estimate in a 12-team league is 3.3 points – so if you have a team with average speed and Crawford, you’ll finish close to 10 points in SB (the 6.5 average + 3.3). How bad does Jacoby Ellsbury’s HR/RBI hurt you vs. an average OF? He costs you 1.6 points in HR and another 1.5 points in RBI (that’s why he comes in at #102 vs. the very high pick in other rankings).
I don’t recommend that you use these as de facto draft rankings. You have to factor in how the other teams in your league will value players. No reason to draft a player a few rounds before anyone else will.
You’ll find that Point Shares value pitchers more than any set of drafters ever would. Before you go all crazy and draft 3 pitchers in the first 5 rounds, remember that these estimates are based on adding a player to the average team. Once you add a stud pitcher like Lincecum, your pitching staff is way above average. Any additional pitcher will have less incremental value. Unlike Spinal Tap’s amplifiers, you can’t go past 10 in a category if you’re in a 10 team league no matter how much you dominate.
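To make the ceiling concrete, here is a deliberately simplified sketch. It assumes Point Shares in a category simply add on top of the league-average finish and then clamps the result to the range a roto category allows (1 point up to one point per team). That additive shortcut is an assumption for illustration only, not the Point Shares methodology itself, and the function name is made up.

```python
def projected_category_points(point_shares, n_teams=12):
    """Average finish plus the listed Point Shares, clamped to [1, n_teams]."""
    average = (n_teams + 1) / 2  # 6.5 in a 12-team league
    return min(float(n_teams), max(1.0, average + sum(point_shares)))


# Crawford's 3.3 SB Point Shares on an otherwise average 12-team roster.
crawford_sb = projected_category_points([3.3])

# Stacking shares past the ceiling gains nothing; the extra values are made-up inputs.
stacked = projected_category_points([3.3, 3.0, 2.5], n_teams=10)
```

The first call lands at about 9.8 points; the second hits the 10-point ceiling of a 10-team league even though the raw sum is well above it, which is the diminishing return described above.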
In the next week or so, we’ll be posting 8-team, 10-team, and 12-team AL-only and NL-only.