Fantasy baseballers are becoming increasingly analytical. Estimating a player’s future batting average now reflexively leads to checking his BABIP, batted ball profile (GB/LD/FB), and K%. Any discussion of a pitcher’s ERA will likely reference his FIP/xFIP/SIERA/etc. Aside from the whole bastardization of baseball outcomes for the illusion of empowerment and erosion of professional productivity, the average fantasy baseballer is much closer in perspective to a sabermetrician than the average fan watching from his/her couch (or the remaining indoctrinated baseball journalists that still roam the land).
One area where I admit I have a tough time reconciling my analytic side with my fantasy baseball instincts is the value of a hitter’s recent performance.*
* I know many fantasy baseball players look at hitter/pitcher matchup data. I think the chapter on this in The Book clearly drives home the point that these results are not predictive because of small sample sizes. I always ignore matchup data and think it is a total waste of time…well, except for Goldschmidt vs Lincecum
It seems perfectly natural to gravitate towards hitters with recent success (say, multiple hits in the past couple games, 2 HR in past 3 games, etc.) versus someone who has thrown up a couple of 0-fers. But could 10-20 PA produce any semblance of a statistical signal with all that small sample noise? Even if there are cases where a player is truly ‘hot’ and outperforming his skill level because of some combination of health/confidence/mojo, small samples also bring cases where a player just flukes into a couple 2-hit games or takes advantage of a couple of hanging sliders.
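To put a rough number on that noise, here is a minimal sketch. It assumes every at-bat is an independent coin flip with a fixed hit probability (a simplification, and it uses AB rather than PA), and the .270 hitter is made up for illustration.

```python
import math

def batting_avg_std_error(true_avg: float, at_bats: int) -> float:
    """One standard deviation of observed AVG, assuming independent at-bats."""
    return math.sqrt(true_avg * (1.0 - true_avg) / at_bats)

# Hypothetical .270 hitter over a few games, a few weeks, and a full season.
for at_bats in (12, 60, 550):
    se = batting_avg_std_error(0.270, at_bats)
    print(f"{at_bats:>3} AB: one standard deviation of AVG is {se:.3f}")
```

Under those assumptions, a dozen at-bats carry a standard deviation of well over 100 points of batting average, so a ‘hot’ three games and a ‘cold’ three games can easily come from the same hitter.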
As a tie-breaker between two equally projected hitters, recent performance is, at worst, a harmless impulse. The harder question to answer is, “At what point should you choose a hitter with weaker projections because they hit better in the past couple of games?” Given I manage our daily hitter projections (Hittertron and the hitter portion of DFSBot) – which adjust Steamer Rest of Season projections with various gameday adjustments like opposing pitcher and park factors – the key question rattling in my brain is:
Is recent game performance properly accounted for in our Hittertron/DFSBot daily hitter projections?
Our daily projections incorporate two key elements:
- Estimated Talent/Skill from Rest of Season Steamer Projections, which weigh in-season performance against previous year performance and are updated daily.
- Game-Specific context such as Opposing pitcher, Park factors, Home/Away, likelihood to start, and expected spot in lineup.
If ‘hot streaks’ and ‘cold streaks’ influence a player’s next game performance (and, thus, a player performs above/below their general talent level), that would represent a third element. I believe most fantasy baseball players THINK like this but is it correct thinking?
If you google ‘hitter streakiness‘, you will find some wonky-ass analyses. Probably all more brilliant than I could have thought up. But they do not answer my question directly nor will they likely make much sense to anyone reading this rag of a blog.
Here is the overview of the analysis (note: I have done the same for starting pitchers vs Streamonator and will release it in a future post):
- Create two hitter data sets based on 2014 data: 1) every instance where a hitter started 4 straight team games (for a ‘last 3 game’ vs ‘next game’ analysis) and 2) every instance where a hitter started at least 4 of his team’s previous 5 games plus the next game (for a ‘last 5 game’ vs ‘next game’ analysis)
- Note: I ended up limiting this to May-August 2014 and players with 500+ PA in 2014 due to data size challenges
- Compile the hitter’s stats for the ‘last 3 games’ and ‘last 5 games’ (if they started 4 of last 5 games, only count the games they started and divide by 4)
- Compile the Hittertron projections for the ‘last 3 games’ and ‘last 5 games’ (if they started 4 of last 5 games, only add the projections for the games they started)
- Determine the correlations between a player’s ‘next game’ stats versus: 1) their Hittertron projections for that game, 2) their ‘last 3 game’ and ‘last 5 game’ stats, and 3) the difference between their Hittertron ‘last 3 game’ / ‘last 5 game’ projections and their actual ‘last 3 game’ / ‘last 5 game’ stats.
- Determine the correlation using a regression based on Hittertron AND the ‘last 3 game’ / ‘last 5 game’ stats to see if recent game performance provides additional predictive value.
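The correlation steps above can be sketched in a few lines of Python. This is not the code behind the analysis, just an illustration under stated assumptions: the data is simulated (every made-up hitter performs exactly at his projected HR rate, so no streakiness exists by construction), the function names are hypothetical, and the ‘regression on both variables’ is taken to mean the multiple correlation R from a least-squares fit on two predictors.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def multiple_r(y, x1, x2):
    """Multiple correlation R for a least-squares fit of y on x1 and x2.

    Uses the two-predictor identity
    R^2 = (r_y1^2 + r_y2^2 - 2*r_y1*r_y2*r_12) / (1 - r_12^2).
    """
    r_y1, r_y2, r_12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    r_squared = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return math.sqrt(max(r_squared, 0.0))

def simulate(n_instances=5000, seed=2014):
    """Made-up instances: each hitter homers at exactly his projected rate."""
    rng = random.Random(seed)
    proj, last3_minus_proj, last3, next_game = [], [], [], []
    for _ in range(n_instances):
        rate = rng.uniform(0.05, 0.25)  # hypothetical projected HR per game
        recent = sum(rng.random() < rate for _ in range(3)) / 3  # last 3 game avg
        proj.append(rate)
        last3.append(recent)
        last3_minus_proj.append(recent - rate)
        next_game.append(1.0 if rng.random() < rate else 0.0)
    return proj, last3_minus_proj, last3, next_game

proj, diff, last3, nxt = simulate()
r_proj = pearson(nxt, proj)
r_both = multiple_r(nxt, proj, diff)
print(f"projection only:         {r_proj:.3f}")
print(f"last 3 minus projection: {pearson(nxt, diff):.3f}")
print(f"last 3 games:            {pearson(nxt, last3):.3f}")
print(f"regression on both:      {r_both:.3f}")
print(f"improvement:             {r_both - r_proj:.3f}")
```

Since the simulated hitters have no hot or cold streaks baked in, the improvement should land near zero. The interesting test is whether real game logs behave any differently.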
A huge advantage of leveraging the Hittertron projections as part of this analysis is that it neutralizes the ‘context’ variables when comparing previous game performance. For example, Hittertron factors in the ballpark and home/away status. If a Rockie played his three previous games in Colorado and is now on the road at Petco, it is very likely he will hit worse. That could look like a false dip in performance when it is comparable performance once you subtract the higher projected stats expected at Coors vs Petco. The same goes for opposing pitcher strength and facing runs of RHP or LHP.
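A tiny worked example of that Coors-to-Petco point, with hits per game and projections that are invented purely for illustration (the helper name is hypothetical too):

```python
def performance_vs_projection(actual, projected):
    """Average per-game gap between actual and projected stats over the same games."""
    if len(actual) != len(projected):
        raise ValueError("need one projection per game")
    return sum(a - p for a, p in zip(actual, projected)) / len(actual)

# Hypothetical hits per game for one hitter: 3 games at Coors, then 3 at Petco.
coors_actual, coors_projected = [2, 1, 2], [1.4, 1.3, 1.5]
petco_actual, petco_projected = [1, 1, 1], [0.7, 0.8, 0.7]

# Raw averages make the Petco series look like a slump...
print(sum(coors_actual) / 3, sum(petco_actual) / 3)
# ...but relative to what was projected, the two series are about the same.
print(performance_vs_projection(coors_actual, coors_projected))
print(performance_vs_projection(petco_actual, petco_projected))
```

Subtracting the projection is what lets a ‘last 3 games’ number be compared across parks and opposing pitchers.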
Here are the results of the analysis:
|Game #4 Correlations Of Daily Stat Projections For Batters Who Started Previous 3 Team Games (149 unique batters, 7,169 instances)|
|Stat||Hittertron||Prev 3 Games Minus Hittertron Proj||Previous 3 Games||Hittertron, Prev 3 Games Minus Proj (regression on both variables)||Improvement (Column 4 minus Column 1)|
|Game #6 Correlations Of Daily Stat Projections For Batters Who Started At Least 4 of Previous 5 Team Games (149 unique batters, 10,236 instances)|
|Stat||Hittertron||Prev 5 Games Minus Hittertron Proj||Previous 5 Games||Hittertron, Prev 5 G Avg Minus Proj (regression on both variables)||Improvement|
The analysis shows that previous 3 and 5 game performance provides zero to negligible (and likely not statistically significant) improvement upon the Hittertron results. The previous game averages (the ‘Previous 3 Games’ and ‘Previous 5 Games’ columns) have minor correlations with ‘next game’ results but most of that is a reflection of player skills that are already accounted for in the Hittertron results (hence, the correlations in the ‘Minus Hittertron Proj’ columns are virtually all 0%).
Still not convinced? The below tables show the actual vs projected performance for hitters based on how many stolen bases and home runs each player had in the past 3 games. If streakiness had statistical significance, players who had multiple HR or SB should beat their projections by more than those with zero HR/SB. But that is not the case. While there is a positive correlation between last 3 game SBs and next game SB, the fact that the projections skew in the same way indicates this is a reflection of general skill level and not a short-term burst. (Note: The reason why HR and SB projections are higher than actual in all cases is that 2014 was a lower offensive environment than expected. The inverse should be true for pitchers in my next analysis, with the actual results outperforming the projections.)
|Game #4 Stolen Bases For Batters Who Started Previous 3 Team Games (149 unique batters, 7,169 instances)|
|Last 3 Game SBs||# of Instances||Next Game Avg SB (Actual)||Next Game Avg SB (Projected)||Next G Avg SB Per 162 Games (Actual)||Next G Avg SB Per 162 Games (Projected)|
|Game #4 Home Runs For Batters Who Started Previous 3 Team Games (149 unique batters, 7,169 instances)|
|Last 3 Game HRs||# of Instances||Next Game Avg HR (Actual)||Next Game Avg HR (Projected)||Next G Avg HR Per 162 Games (Actual)||Next G Avg HR Per 162 Games (Projected)|
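For anyone who wants to build the same kind of table from their own game logs, here is a minimal sketch. It assumes each instance is a tuple of (last 3 game count, next game actual, next game projected). The function name, the row layout, and the sample numbers are all hypothetical.

```python
from collections import defaultdict

def bucket_summary(instances):
    """Group instances by last-3-game count and compare actual vs projected."""
    totals = defaultdict(lambda: [0, 0.0, 0.0])  # instances, sum actual, sum projected
    for last3_count, actual, projected in instances:
        bucket = totals[last3_count]
        bucket[0] += 1
        bucket[1] += actual
        bucket[2] += projected
    rows = []
    for last3_count in sorted(totals):
        n, actual_sum, projected_sum = totals[last3_count]
        avg_actual = actual_sum / n
        avg_projected = projected_sum / n
        rows.append({
            "last3": last3_count,
            "instances": n,
            "avg_actual": avg_actual,
            "avg_projected": avg_projected,
            "actual_per_162": avg_actual * 162,        # scale to a full season
            "projected_per_162": avg_projected * 162,
        })
    return rows

# Invented rows, for illustration only.
sample = [(0, 0, 0.02), (0, 0, 0.04), (0, 1, 0.03),
          (1, 0, 0.10), (1, 1, 0.12), (2, 0, 0.20)]
for row in bucket_summary(sample):
    print(row)
```

The thing to look for is whether the gap between actual and projected grows as the last 3 game count goes up. If both columns rise together, the bucket is just sorting players by skill.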
I am a little surprised that recent SBs provide no additional benefit given that SB attempts are driven at least somewhat by confidence (so a number of successful attempts would embolden the manager or player) and that slight changes in a player’s health throughout the year would lead to stretches where a speedster will not attempt SBs. I do think that SBs, like male elephants, come in bunches but a lot of this is influenced by game-by-game variables like the pitcher, the catcher, and score (close game, blowout, etc) so previous game SBs are just statistical noise.
There is nothing to gain from adding recent game performance into the hitter projection mix and one should not choose a lesser-projected player over another because they have done well in previous games. Assuming equal Hittertron (or DFSBot) projections and both hitters are starting that next day, choosing the ‘hotter’ player is perfectly fine. As is choosing the hitter who has performed better against the opposing pitcher. Or choosing the hitter whose last name comes first alphabetically. Or choosing the hitter who is on your favorite team. Because each of these approaches will net out to about the same results in the long run.
Is it possible that recent 3-5 game performance helps improve other daily projection systems? Possibly, but IMHO this would be a flashing neon warning sign that said projection system is inferior to Hittertron/DFSBot. Based on this analysis, a projection system that learns something new from a hitter’s recent 3-5 game performance is as comforting as hearing your doctor say, “Hmm, this Wikipedia article on <fill in malady> has got me thinking…”.
Is it possible that there is some interval of games greater than ‘last 5’ that provides additional predictive value to Hittertron? I do not think so. Certainly greater in-season intervals (e.g., last 30 games) are better predictors than smaller in-season intervals but I think the increased sample size improves the ‘skill/talent’ measurement that is already properly weighted by the Steamer projections (e.g., AVG does not have predictive power until ~900 AB).
One last thought for Daily Fantasy players. Recent game performance may well be a market inefficiency that one could exploit. If DFS participants are biased towards ‘hot’ hitters (and DFS $ values may reflect recent game performance), it may pay to target ‘cold’ players as they may be more affordable and produce more unique lineups.