
It has been almost 2 years since we launched our first daily fantasy baseball tool (Streamonator in 2012).  Since then, we have launched several other tools such as Rest of Season Player Rater + Hittertron in 2013 and DFSBot in 2014.

Razzball Nation has been a huge part of these tools from the start – both in encouraging us to create them and providing ongoing feedback to make them better (e.g., we now report ‘next week’ data on Fridays to assist those in weekly roster leagues, added game time, etc).

But one valid ‘ask’ that we have not been able to deliver until now is:  “How accurate are Razzball daily projections?”

We are no strangers to accuracy testing.  We have run accuracy tests of Fantasy Baseball Preseason Rankings for three straight years and have run accuracy tests on baseball projections as well.  But it has still been a challenge to come up with a format that 1) could be updated on a daily basis and 2) could make some sense without an advanced statistical degree.

Our first working version is now available.  Meet the Ombotsman.  We have appointed the Ombotsman to provide transparency on the accuracy of our various daily fantasy baseball ‘bots (Streamonator, Hittertron, DFSBot).

The testing is done on every day’s worth of data for Streamonator, Hittertron, and DFSBot.  The actual player stats/DFS points/actual $ values for every starting player are compared to our projections via correlation testing.  Two data sets can have a correlation anywhere between -100% and 100%.  100% would mean that they are perfectly correlated.  This could mean they are exactly equal or that a single linear formula can take the first as an input to calculate the second (easy example: Celsius and Fahrenheit temperatures are 100% correlated but obviously are not equal).  0% means there is no linear relationship between the two, i.e., the projections do no better than random guesses.  Negative correlations would indicate an inverse relationship – e.g., strikeouts and batting average (needless to say, if projections for a stat are negatively correlated with the actual results, the projections are worthless).
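
For readers who want to see the mechanics, here is a minimal Python sketch of a correlation test.  It assumes the standard Pearson correlation coefficient, which this post does not name explicitly, and the strikeout numbers are made up purely for illustration (they are not Razzball data).

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Celsius vs Fahrenheit: never equal (except at -40), but one is a linear
# function of the other, so the correlation is 1.0 (i.e., 100%).
celsius = [-10.0, 0.0, 15.0, 25.0, 40.0]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]
r_temp = pearson(celsius, fahrenheit)

# Hypothetical projected vs actual strikeouts for five starts (invented data).
projected_k = [4.5, 6.0, 7.2, 5.1, 8.0]
actual_k = [3, 7, 6, 8, 9]
r_k = pearson(projected_k, actual_k)

print(f"Celsius vs Fahrenheit: {r_temp:.0%}")
print(f"Projected vs actual K: {r_k:.0%}")
```

Multiplying the coefficient by 100 gives the percentage form used throughout this post.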

The tests are summarized by month and also broken out by day.  When viewed by the day, you will see a lot of volatility because of the smaller sample size.  All it takes is for Strasburg to give up 8 ER to the Astros or a 1-0 game at Coors to put a significant dent into a day’s projection accuracy.  The monthly averages provide a better, less volatile gauge of projection accuracy.  The accuracy tests include:

  • DFSBot Hitter and Pitcher Testing - This is the clearest test IMO.  I correlate my projected salaries/DFS points against actual DFS points (for hitters, this includes only those who were in starting lineups) and then run the same correlation using each DFS service’s salaries for that day.
    • Key takeaway #1:  Our projected salaries/DFS points consistently correlate better with actual results for both hitters and pitchers than the salaries of all three DFS services covered in DFSBot.  These are companies valued in the millions of $ and are making efforts to adjust player values based on opponent, park, recent performance, etc.  This is no small feat in my opinion.
    • Key takeaway #2:  It appears that FanDuel’s point structure makes it a little harder to project than DraftKings and DraftStreet.  Both my results and FanDuel’s results are slightly lower than for the other two services.  This means bupkis to me but might mean something to a hard-core DFS player.
    • Key takeaway #3:  Pitcher results are more predictable than hitter results.  (Now, whether this means one should gamble that much more on non-aces who project as nice values vs reliably awesome aces is a different topic)
  • Streamonator Testing – There are two types of tests done for Streamonator.  The first is a correlation test against various statistics (W, L, IP, H, ER, K, BB, ERA, WHIP) as well as the estimated $ value of the start.  The second is a distribution that shows the average results of starts based on various projected $ ranges (e.g., $0 – $3.50) as well as how often starts fall into the various ranges.
    • Key takeaway #1:  There are large differences in the correlations by pitcher stat.  For June, K projections correlated at 43% while Wins/Losses correlated at only 9.2% and 6.9% respectively.  ERA/WHIP fell in between at 20-25%.  While this produced an initial idea of weighting each category for Streamonator $ calculations, the reality is that this already happens to an extent.  The low correlation on wins/losses is tied to the fact that my W/L projection model is very conservative, so most pitchers’ gameday Win/Loss estimates are bunched together, whereas the model shows more differentiation for K’s.  (The model uses projected ERA for the starter and the opponent.  You’d think that would be a strong predictor of winning % but, as this testing shows, it is weak.  And I think any other variable – e.g., bullpen strength, IP per SP, etc. – is even weaker.)
    • Key takeaway #2:  Now we can see proof that Streamonator dollar values provide reliable results in the long run.  How?  If you look at the Streamonator Distribution table, you will see that, on average, the Streamonator projections by range match up very well with the average return per $ range.  For example, for the 30 days ending with July 6th, there were 95 starts estimated as worth $10.5 to $14.  The average estimated value of these was $12.1.  The average actual value of those starts came out to $12.0.  Below is the full snapshot for those 30 days.  The SON $ Average and the Actual Average $ columns correlate at 96% (r^2 of 91%).
    Stream-o-Nator Actual $ Averages by Projected $ Range (June 7-July 6)
    $ Range        Count   Stream-o-Nator AVG $   Actual Average $
    <-$7              59                  -11.9              -12.6
    -$7 to $0        139                   -3.0               -0.8
    $0 to $3.5       111                    1.7                3.1
    $3.5 to $7       131                    5.1                3.5
    $7 to $10.5      124                    8.7               10.6
    $10.5 to $14      98                   12.1               13.5
    $14 to $17.5      65                   15.3               17.8
    $17.5 to $21      41                   18.8               21.5
    $21 to $28        30                   24.0               27.8
    $28+              11                   31.2               54.9
    • Key takeaway #3:  All pitchers have great and bad days but, the better the projected $, the greater the chance of a great day ($28+) and the lower the chance of a bad day (below -$7).
  • Hittertron Testing – There are two types of tests done for Hittertron.  The first is a correlation test done against the following statistics:  PA, AB, H, R, HR, RBI, SB, BB, SO, AVG, OBP and SLG.  The second is a distribution that shows the average results for hitters based on various projected $ ranges (e.g., $0 – $3.50) as well as how often hitter days fall into various point ranges.
    • Key takeaway #1: Strikeouts are the easiest stat to project, with most other stats having similar correlations.  (Stolen bases are misleading as so many players have 0.  You can project zero SB every day for Miggy Cabrera and be right 99% of the time.)
    • Key takeaway #2:  There is proof that Hittertron provides accurate results in the long run.  The Hittertron distribution grid shows that the average results (measured in points) match up very well with each projected $ range.  Aside from two slight differences, the actual average hitter points increase for every range.  The correlation between the averages per range for Hittertron $ and the Actual Points comes in at 98% (r^2 of 96%).
    Hittertron Actual Point* Averages by Projected $ Range (June 9-July 8)
    $Range         Count   HON_AVG$   ACT_AVG_PTS
    <-$7             683      -11.4          1.56
    -$7 to $0       1099       -3.4          2.63
    $0 to $3.5       731        1.8          4.14
    $3.5 to $7       729        5.2          4.28
    $7 to $10.5      646        8.6          4.46
    $10.5 to $14     594       12.1          5.06
    $14 to $17.5     480       15.6          4.93
    $17.5 to $21     351       19.0          6.17
    $21 to $28       450       24.0          6.94
    $28 to $35       227       31.1          8.83
    $35+             194       44.4          7.98
    * Points calculated as 10 * (HR + SB + (R + RBI)/3 + H - 0.265*AB)
    • Key takeaway #3:  As one would expect, hitters are not as reliable on a day-by-day basis as starting pitchers.  Even for the highest projection range ($35+), only 20% fall into the top 3 hitter points buckets while 19% fall under the worst bucket (think 0-for-4).  When streaming hitters, it is imperative to take the ‘long view’ vs. the ‘short view’ as even the best hitter matchups are going to deliver goose eggs more often than those multi-Hit/R/RBI days with a HR or SB.
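
As a rough way to check the two range tables above, the sketch below recomputes the hitter points formula from the table footnote and runs an unweighted Pearson correlation across the range averages.  The function names are hypothetical, and since the exact calculation behind the published percentages is not spelled out here, the output may differ slightly from the 96% and 98% figures quoted.

```python
import math


def hitter_points(hr, sb, r, rbi, h, ab):
    """Hitter points per the table footnote: 10*(HR+SB+(R+RBI)/3+H-(.265*AB))."""
    return 10 * (hr + sb + (r + rbi) / 3 + h - 0.265 * ab)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Range averages copied from the Stream-o-Nator table (June 7-July 6).
son_projected = [-11.9, -3.0, 1.7, 5.1, 8.7, 12.1, 15.3, 18.8, 24.0, 31.2]
son_actual = [-12.6, -0.8, 3.1, 3.5, 10.6, 13.5, 17.8, 21.5, 27.8, 54.9]

# Range averages copied from the Hittertron table (June 9-July 8).
hon_projected = [-11.4, -3.4, 1.8, 5.2, 8.6, 12.1, 15.6, 19.0, 24.0, 31.1, 44.4]
hon_actual_pts = [1.56, 2.63, 4.14, 4.28, 4.46, 5.06, 4.93, 6.17, 6.94, 8.83, 7.98]

r_son = pearson(son_projected, son_actual)
r_hon = pearson(hon_projected, hon_actual_pts)

print(f"Streamonator ranges: r = {r_son:.2f}, r^2 = {r_son ** 2:.2f}")
print(f"Hittertron ranges:   r = {r_hon:.2f}, r^2 = {r_hon ** 2:.2f}")

# An 0-for-4 day scores 10 * (0 - 0.265 * 4) = -10.6 points.
print(f"0-for-4: {hitter_points(hr=0, sb=0, r=0, rbi=0, h=0, ab=4):.1f} points")
```

Each range counts equally here regardless of how many starts or hitter days it contains, so a count-weighted version could give a somewhat different answer.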

While we would love to directly compare our accuracy results vs other sites, that is impossible for several reasons, ranging from “Almost every other site that projects daily data charges a subscription” to “This type of automated testing requires a data feed vs manual data pulls”.  So we cannot state that we are the most accurate.  But we do feel comfortable in stating that, with the release of the Ombotsman, we are the most transparent of all daily fantasy baseball projection services.

Please feel free to suggest additional tests – though it may take me a while to implement the good ideas (I’ll try to gently swat away your idea if it’s bad/redundant).

 


  1. SMLV1 says:
    (link)

    Hey Rudy, why does SON like Fister @BAL $14.5 but No love for Gio @BAL $3.2 tomorrow?

    • Fister projected @PHI not @BAL. Gio @BAL is bad matchup. A lot of tough RH bats in a hitter’s park

      • SMLV1 says:
        (link)

        @Rudy Gamble: Fister just pitched @BAL tonight did good 1 Win, 3Ks, 2.57 ERA and 1.29 Whip. Gio is pitching @BAL 7/10…..

      • SMLV1 says:
        (link)

        @Rudy Gamble:

        My Stats – 4 Wins, 6 Saves, 32 Ks, 2.75 ERA and 1.02 Whip
        Opponents Stats – 2 Wins, 0 Saves, 20 Ks, 2.28 ERA and 1.31 Whip

        Im in 1st place, he is in 2nd

        Im starting Gio @BAL
        Opponent starting, Smyly @KC, Bailey @Home VS. CHC and S. Miller @Home VS. PIT

        Bench Gio or start Gio?

        • For h2h, I start gio. Otherwise, you may be forfeiting wins and k’s. But I expect it to be a mediocre start. HRs will be key.

  2. J-FOH says:
    (link)

    I thought test for Hitter-Tron would include Syphlis and Gonorrhea

    • Hittertron sprays McAfee on his picks just to be safe

      • Carnac says:
        (link)

        Anything/one sprayed by John McAfee should definitely be tested.

    • Maxi says:
      (link)

      @J-FOH: I agree, we need more STD testing due to Hitter-Tron’s promiscuouity concerning all things corrugated metal.

  3. Hawk says:
    (link)

    Rudy – it’s stuff like this that makes Razzball the best Fantasy site on the web. Outstanding.

  4. Heraldo says:
    (link)

    This is absolutely amazing! Thanks Rudy.

  5. Nico

    XxAznBayBeexX Nico says:
    (link)

    Transparency leads to trust. Good stuff, Rudy. You had me at Accuracy.

    • Crapshoot Kershaw says:
      (link)

      @XxAznBayBeexX Nico: it does, but try telling that to a woman (or if you are a woman, try telling it to a man) RE: relationship stuff.

  6. James says:
    (link)

    Rudy you are simply amazing. Everyone else on this site knows baseball, is funny, and a good writer… you are all those things, and a SABR beast.

    I think i followed about 80% of that, now i wish i had gone into a math field so i could understand what is going on there.

  7. Simply Fred

    simply fred says:
    (link)

    Rudy, you da’ man!

    When I compared your hitter streaming results late last year vs others, you came out on top. still, and a small sample size, the streaming hitter results weren’t (at that time) as good as say carrying a bench bat. You have a feel for this, or measurement?

    For example, i am ‘guessing’ that rostering a Calhoun vs. righties, and/or Dyson for spot steals, is more productive than just streaming hitters on a daily basis…? So, I guess a measure would be what are you getting from streaming a hitter slot comparative to Calhoun and Dyson cumulative stats (and their relative ABs)?

    • Calhoun is a top 40 OF so he is someone who should be owned 100%. I think having 1-2 lefty hitters who only start vs RHP(e.g. joyce) are fine to carry for stretches when they are hot and/or hitting in top 5 of lineup. All bets off though when games at coors!

      • Simply Fred

        simply fred says:
        (link)

        @Rudy Gamble: ok, how about average player stats vs. average streamer stat?

        • Simply Fred

          simply fred says:
          (link)

          @simply fred: my bad, in stats above (i guess? tough for my old brain to connect)

        • Player vs streamer is relative. Depends on league format and the owners. Stats in article plus ombotsman page indicate that my $ estimates do a solid job across all hitters.

          • Simply Fred

            simply fred says:
            (link)

            @Rudy Gamble: opened my eyes, dug in, and it only took five minutes to see what the results mean. clearly, hitter-tron is kicking it. have seen others in my league living by it. now i see why they are at/near top of the league. awesome! (and, again, my apologies)

  8. Simply Fred

    simply fred says:
    (link)

    ha! there goes Calhoun again!

  9. Simply Fred

    simply fred says:
    (link)

    and, yes, i had goebbert streaming today along with you (the other side of the coin… :-) )

  10. Andrew says:
    (link)

    Are bullpen strengths calculated in the SON? Is that a reason that the W/L correlations are so low? Since W’s count as such a big part of DFS scoring it would be nice to see that correlation improve.

    Love all that you do!

    • Thanks Andrew. So far, the only factors I have found that help predict Wins are pitcher projected ERA and opposing pitcher ERA. Bullpen strength – which is tricky to apply – likely has a negligible impact at best.

      • Andrew says:
        (link)

        @Rudy Gamble:

        Interesting, but I think the bullpens have to be considered somehow. If let’s say the A’s SP and the Astros SP both go into the 9th with a 1 run lead (and both SP are assumed to be out of the game) – the A’s SP W % is higher than the Astros SP WP% at this point. Astros bullpen ERA is over 5 while the A’s is under 3 – which makes a huge difference when factoring in for the W.

        I know there is a lot of volatility and unpredictability with the bullpen, but shouldn’t there be slight adjustments for it?

        • I would need to run a regression test off past games to see if that ends up being a factor. Will test it at some point.

  11. Baezaworldseries says:
    (link)

    Who do you think you are? You’re letting us see how accurate/inaccurate your projections are.? For shame. I want fluff. I want to hear how infallible you are. I want to hear how it wasn’t your fault the Mets blew up my #1 sp. We pay you good money to make us feel better. We don’t want the truth.
    Just joking around. This is probably the best/ballsiest tron/bot in the history of fantasy baseball. Ground breaking stuff. I can’t believe it’s all free!!!! It is still free right?

  12. Bombo Rivera says:
    (link)

    Not basing this on consistent observation, but it seems like Hittertron can be a bit wonky when considering park effects. I know Coors is a good place to hit, but it seems to go a bit overboard at times. It also seems to go a bit crazy when looking at handedness splits. Could those two things push the overall HT values away from actual values?

    • Simply Fred

      simply fred says:
      (link)

      @Bombo Rivera: really?
      Jake Goebbert 0 HR in 23 AB for SD this season. at COL…dinger!

    • If incorrectly applied, park factors and handedness factors could certainly do more harm than good. With some help from Steamer, I think my application of these factors is correct. Generally following a lot of standards developed by SABR community like Tom Tango and pizzacutter. But it was a steep learning curve

  13. Barenrewn says:
    (link)

    Great read Rudy. I use the streamonator religiously and just began using the hittertron b/c of injuries and lack of depth at the position in my league. I’ve had some mixed results so far, but better good than bad. Nice job.

  14. Simply Fred

    simply fred says:
    (link)

    so, r.weeks is ranked #39 for tomorrow. hitter-tron properly has him facing a righty. i believe, he only gets in the lineup facing lefties. is this a tweak that could enhance the program (not ranking him high when he likely won’t be in the lineup)? or, what to make of this?

    • The %St column estimates likelihood to start. He is at 5%. If u want to avoid unlikely starters, just filter that column.

      • Simply Fred

        simply fred says:
        (link)

        @Rudy Gamble: give me a break. if he is only 5% to start, one would surmise that he should NOT be the 39th ranked player for tomorrow. the 5% should be incorporated into the projected value for the day. his $U should be 5% of the 17.7=.89, ranking him 158. doesn’t seem too sophisticated an add…

        • Playing time projections factor into RoS and weekly projection because they are useless without it. For daily, I strive to estimate every hitter’s value should he start. Thus, if/when a player does start, you know his value. Key for evaluating for today and lineups post. I could multiply value * %ST to adjust default rankings but I think current solution is preferable

          • Simply Fred

            simply fred says:
            (link)

            @Rudy Gamble: well, thought i would go right to trying it out. went to ‘tomorrow’. went down the list until i found ‘weeks’ at 39. i dropped a player and added ‘weeks’ based on hitter-tron’s high ranking for the day. after discovering week’s was facing a righty, i had to drop him and look for another (of course the player i dropped went to waivers and he was not available).

            just thought you should get a perspective of the ‘feel good’ from that.

            shoot, weeks could have been listed low based upon his projected not start against righties, then when he does, i never know to pick him up. somehow, losing him for an unexpected start doesn’t balance adding/dropping players unecessarily…just me i guess…

            of course, it makes sense, to expect new users to have figured out that they should sort by ‘%St’ first, especially when that isn’t called for weekly or rest of season…

            your tool, you get to do what feels good to you…

            • My Minor Likes Female Pujols says:
              (link)

              @simply fred: dude, it’s a free tool that’s had countless hours put in to help you with your pool and you’re picking it apart. Quit complaining and don’t use it if you don’t agree with all the aspects….. There’s paid sites you can use too ya know

              • Simply Fred

                simply fred says:
                (link)

                @My Minor Likes Female Pujols: dude2: re-read. try to take a look at the upside. yes, countless hours put in, with tremendous results and superb tool! that doesn’t preclude the opp to add a little to make it slightly better yet. Rudy, sees that.

                • Crapshoot Kershaw says:
                  (link)

                  @simply fred: problem is that fred here isn’t some newbie that should even have any doubts about when weeks starts. we ALL know from reading comments who starts at 2B on righty days. He simply forgot to link that known info when he quickly grabbed weeks (probably did it so fast that he didn’t have time to notice this obv point). It’s our jobs to see who is and who isn’t playing. The lineup page is linked above.

            • @simply fred: i’ve added a bolded note about this above the grid. appreciate the feedback.

  15. troy says:
    (link)

    LOVED IT this is the kind of stuff that makes u guys my first read of the day, and my last. Great stuff man… can’t wait to play around with it now. As a big dfs player on fanduel and fantasy aces I’m really enjoying all that u guys r doing. You guys and fantasy alarm are really all a person needs. Thanks again rudy

    • Thanks for the kind words!

      • troy says:
        (link)

        @Rudy Gamble: u guys deserve all the praise in the world u guys rock

  16. DrEasy says:
    (link)

    Fantastic work, Rudy! A quick question: if the correlation between the bot and reality was negative, wouldn’t the predictions still be useful? I’d simply do the opposite of what the prediction recommends and would do well.

    As for ideas for possible tests, why not compare your predictions with simpler “baseline” sabermetric tests? We know from Pizza Cutter which rates stabilize quickly (BB%, K%, etc), and so some naive formula that would use them to estimate AVG and other fantasy categories might be a good start (then again, maybe that’s how your bots work already?). Or for example using the Sh-h (“should hit”) discussed on FanGraphs based on BABIP.

    • @DrEasy: Yes, if the correlation was consistently negative then, hypothetically, you could do the opposite of the projection. I think that’s doubtful though.

      I’ve seen some of the Pizza Cutter tests on when stats stabilize. But not sure how I’d apply this. I guess it would be testing based on ‘Season to Date’ likelihood for each of the stats – e.g., if Nelson Cruz has a HR per 10 AB, then I give him 0.4 HR for the day and see how that compares. This wouldn’t work, obviously, in April but it’s not a bad idea at this point in the season. Could also use last 7 or 30 day data to test ‘hot/cold’ hitters.

      I could also, besides using AVG, use the adjusted AVG estimates I have (based on expected BABIP).

      Will have to give it some thought…

      FYI, I have shied away from using my ROS results because they have a number of baked in assumptions/factors for playing time (which impact LHP/RHP PA mix).

      • goodfold2 says:
        (link)

        @Rudy Gamble: that’d be a hilarious marketing strategy for your bots, “our bots give negative correlations, so doing the opposite will lead to the power of fantasy winnings!”. like Costanza when he does the opposite of every impulse he has and becomes super successful.

  17. Oliver says:
    (link)

    Based on this analysis, the best way to stream using Hitter-tron would be on a week-long basis. Am I interpreting that correctly?

    I have not been using the Hitter-tron because I often find myself in this quandary. I am curious how other players handle it. Let’s project out next week, and see that among 3B, Carlos Santana is projected to outperform Evan Longoria. Do I bench Longoria for Carlos for that week? Are folks actually doing this? Another example, when you have Joey Bats, Adam Jones, Brantley, Khrush Davis, and Alex Gordon at OF, do you bench one of these guys if a lower-tiered OF is projected to outperform? Or do players generally stick to their “tried-and-true” performers and only use the Hitter-tron when they lose someone as a result of injury, etc.?

    • @Oliver: No, that’s not the case. Streaming by the day is much more preferable since it opens up the option to platoon players – especially RH bats against LHP. In weekly leagues, ‘starters’ have an additional advantage of playing time that usually makes them better options than platoon players.

      In daily leagues, I never sit my best hitters (in RCL, that’d be Trout, Beltre, Rizzo, Altuve) but will consider benching lefty hitters against LHP based on the Hittertron results. For a guy like Gordon, he’s likely only benched against a Sale/Kershaw but Brantley would be against league average or better LHP. Also worth noting that LHPS are harder to run against so it’s a double blow for LH where speed drives a lot of their value.

      • Spammer Jay says:
        (link)

        @Rudy Gamble: Thanks Rudy, I had the same question.

      • Oliver says:
        (link)

        @Rudy Gamble: Very interesting. Thanks for responding! Not at all how I envisioned using it. The analysis is awesome, BTW!

        • goodfold2 says:
          (link)

          @Oliver: using the weekly is most useful for those in leagues where you can only set lineup once (weeklies, the fantasy baseball setting for spelunkers/lumberjacks that can only get to the interwebs once a week), or for player short term pickups where you have limited weekly moves.

  18. frankgrimes says:
    (link)

    Rudy! It’s clear how much work you put into these tools
    seems to me some peeps expect you to be able predict every litte thing for them haha
    Like you said the most transparent and non subscription info out there
    you left out the most entertaining and cool peeps around as well

      • Mordacious Levator says:
        (link)

        @Rudy Gamble: true, but most of the time it’s just people whining about how “SON said pitcher X was a top 10 option today so i grabbed him and now i’ll lose ERA/WHIP for the week, how do i trust such a thing?”

        • And now with the Ombotsman, the easy response is, “Because SON’s $ estimates are matching actual $ on average. As the chart shows, $7-$14 starts have a 25% chance of being very poor (worse than -$7). So shit will happen in the short-term but it averages out in the long-term.”

  19. RicoSuave says:
    (link)

    Wow! awesome work Rudy! The RazzCrew rocks!
    Thanks guys

  20. pull the trigger?? says:
    (link)

    Non keepers like Tulo Trout miggy zobrist grienke hammels wainright reyes trout rios price A.Sanchez desmond J Upton Samardzija Are on the lower teams who should want my keepers i have listed below.
    With tanaka going down my First place won’t remain till I trade off some keeper chips.
    Next years prices in a 260 12 team mixed 5×5
    K Bryant $6
    Alcanterra$4
    Hanley R $13
    Puig $8
    C.mart $5
    CAN YOU and our readers COME UP WITH Some TRADE SCENARIOS considering my roster below.
    RUNS SB AVE ARE TIGHT and WITHOUT TANAKA K’s/W will be an issue
    C Santana
    C D’Arnaud
    1b Abreu
    COR Glldschmidt
    2B kiPnis
    SS Hanley
    Mid Lastella
    Of Puig
    Of Bruce
    Of pence
    Of gardener
    Of eaton
    U betts
    P loshe
    P bumgardnr
    P Zimmerman
    P C Martinez
    P eovaldi
    P Duffy
    PRosenthal
    P Rodney
    P Casilla
    B deGrom
    B Nelson
    DL wacha
    DL tanaka

    Thanks all!

  21. Cheese Eating Surrender Monkey says:
    (link)

    any of us that’ve play daily fantasy know Razzball’s predictive sources are at least ahead of the curve compared to Draftkings (which is the best one, seemingly)’s salaries for some players. particularly those that are young/injury replacement types. you can go weeks getting a good player on the cheap basing it on Hitter T at draftkings, eventually the salaries go up there, but you’ve had weeks of value by that point.

    • @Cheese Eating Surrender Monkey: Thanks for the feedback. I’ve also seen the occasional underpricing of a role player or rookie. Best case is finding them (DFSBot makes it easier) AND when the masses don’t. Last year, I remember catching Y. Petit was pitching home @SF for the minimum $5000. Almost threw a perfect game for me!

      • Cheese Eating Surrender Monkey says:
        (link)

        @Rudy Gamble: i also streamed that start in a 16 teamer. but i believe i streamed him for a horrible one after that too.
