Dissecting When Streaks Become The Norm For Fantasy Baseball

Note: If you don’t want the story, just scroll down to the bottom to see the statistical markers that tell a hot or cold streak apart from a real improvement or erosion of skills. But Santa is watching. And so am I. All the time.

Welcome friends and family! Actually, you’re right. Hi Mom. Thanks for being my only reader. I’m not sure if you remember who Chris Shelton is, but I certainly do. A week or so ago, I was actually thinking back to the best hot streaks to open the season, and whether any of them were produced by players that I had owned. After all, the season is only a week old. Hope springs eternal and what not. So, of course I’m expecting Mike Morse to hit 489 homeruns. And yes, Yu Darvish will finish the season with 56,284 strike-outs. Stop looking at me like that. I was thinking back and reminiscing about all those seasons I’ve been playing fantasy baseball, which is 16 years if you needed to know. And since I deemed it necessary that you know that, I also, while somewhat ego driven, deem it necessary that you know I am not an old fogey. I’m actually only 30. Which to me feels old, but in fact, really is not that old. If you need proof, ask anyone over the age of 30 how they feel. I assume they will say they feel older than me. And then roll their eyes in disgust. And also, while we’re on this tangent, I’m not fat either. My OkCupid profile says I’m ‘average’, so therefore, it is the truth. And no, you’re not getting a link. Unless you are a hot female that resides in the greater metro area.

But back to reminiscing and stuff, before all of you get lost in the swooning over myself, the stuff of legends. This thought process immediately brought me to the aforementioned Chris Shelton. I bring him up because the second year I owned him, 2006, ended up being the season where I learned that there is more to fantasy baseball than meets the eye. Yeah, you got the Transformers theme stuck in your head now. Thank me later. As much as I believed that I was always one of the smartest in the room (secretly, I’m actually quite humble on the outside), with Shelton, I found out that there was always more to learn.

During the first month of the 2006 season, Shelton was all over the headlines, being talked about by the baseball punditry over and over again. Why, might you ask? In that first month of play, he led the league in homeruns with 10. Ryan Howard, who eventually finished the year with the most homeruns at 58, only had half that total in the same span.

This is where I got in trouble. I picked him off of waivers, as touched upon earlier, the year before. He had catcher eligibility, and scouting reports showed a player that could have power and took a walk every once in a while. I grabbed him in a sneaky-ninja-like fashion, and enjoyed a .299/.360/.510 season, albeit in only limited action. Based on what he had done, and what I already knew about him, I simply assumed that he would get better. To me, at the time, player curves were static. Basically, I had the mindset that if you were this young, had this much success, you were only going to go up. I didn’t understand things like regression, or that curves fluctuate, or anything about sample sizes. My static ‘bell-curve’ was science, it was law, it was set in stone from whatever process I used at the time. We actually saw this exact line of thinking after Mike Trout had that unparalleled 2012 season. Many out there were touting that this was only the beginning, he could only get better. But it doesn’t work like that. Regression is a cruel beast, and is not to be used as a blunt tool. It has to be used thoughtfully and carefully. In terms of Trout, there simply must be regression to the mean. And before you say, well, we don’t know Trout’s mean yet. Sure we don’t. But we know what an MLB player’s mean is, and that illuminates a lot. I always say, in baseball, like most things, context is everything. Now, don’t take this the wrong way, like I think Trout suxorz and derped his way through it all. He didn’t. But that type of season just doesn’t improve upon itself. Everything we know about the game at this point tells us that the season we saw was historic, and the chances of repeating are so low, that it’s statistically impossible to fathom a season that is better.

What does that have to do with Shelton? Good question! We left off at the point in the story where he was living in the same spotlight as the top sluggers of that year, and at that moment, everyone was wildly describing him as the next Cecil Fielder. Heck, I was busy fastening my smarty pants to strut with, spouting to my league mates and anyone who would listen that I nabbed a guy that would hit 50+ homeruns, all from one uncontested waiver claim. The impetuousness of youth.

Do you know how many homeruns he hit over the course of the next five months? Six. He only hit six more homeruns for the rest of the year. And out of the total of 16 homeruns he hit, zero came in the second half of the season. What did I not understand? What did I miss? I still remember, every day, looking at his 0’fer box scores and saying, “Well, he’ll snap out of it.” Let me tell you, that’s exhausting to do for 84 straight games. 145 MLB plate appearances and 3 years later, he was out of baseball.

It’s funny, because the things we take for granted now, like HR/FB, K%, BABIP, and all the other advanced stats that are now so easily accessible, could have told me an entirely different story. And so that brings us to the point of this post. After thinking back to the fall of Chris Shelton, I began to wonder, is there a way to know if what we are seeing is an illusion or real? It might have been easy to say what Jason Hammel did last year was a fluke, until he produced an entire season outside what was expected. And then you find real evidence for the change. How could you have seen Jose Bautista coming? Or Edwin Encarnacion for that matter? This line of thinking was intriguing. What if we could find out the booms and busts before everyone else did? Or, at the very least, pinpoint markers so we can at least have a starting point for data that could tell us such a thing? So I sought out any information I could on the subject. And now, I bring to you, the Razzball readership, the Holy Grail of my work.

The answer is yes, you can figure it out. Well, at least with a 0.70 correlation. What, you didn’t take AP Stats? Get more Asian in you son! Basically, 1.0 correlation is perfection for multiple data sets, but highly unlikely. It is widely concluded that a 0.70 correlation of data is the ultimate benchmark when doing numbers work such as this. Did that make any sense to you at all? No? Good! That’s why you need me! Or so I say every night before I go to bed…
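For those who did skip AP Stats, here is a minimal sketch of what a correlation number actually is. It computes the standard Pearson correlation coefficient between two lists of numbers. Whether the research behind the benchmarks used exactly this calculation is an assumption on my part, and the `first_half` and `second_half` strikeout rates are made-up numbers for illustration, not anyone's real stats.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 long")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How much the two lists move together, and how much each moves alone.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a list never varies")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical K% for five hitters in two halves of the same sample.
first_half = [0.15, 0.22, 0.18, 0.30, 0.25]
second_half = [0.17, 0.20, 0.21, 0.27, 0.26]
print(pearson_r(first_half, second_half))
```

A result near 1.0 means the two halves tell the same story about each hitter, and a result near 0 means they have nothing to do with each other. Squaring 0.70 gives 0.49, which is one way to read the benchmark: roughly half of what you see is signal.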

Anyhow, let’s just get to it. How can you tell what’s real and what isn’t? Make sure you put this in your favorites bar, you’ll want to reference this material throughout the year. Reading the chart is easy. No, seriously, it is. Once a player reaches a certain benchmark (Plate Appearances or Batters Faced), that statistic stabilizes, and you can say, with some measure of confidence, whether any change the player shows is real or not.

Batter Benchmarks

STAT	Contact%	K%	BB%	HR/FB	OBP	SLG	OPS	ISO
#PA	100	150	200	300	500	500	500	550

Pitcher Benchmarks

STAT	K/PA	GB/FB	K/BB	BB/PA
#BF	150	200	500	550

A couple of things to quickly go over. While not as useful indicators as the offensive benchmarks above, Swing% stabilizes at 50 PA, while GB/FB rates have a magic number of 250 PA.
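If you would rather let a computer read the chart, here is a minimal sketch in Python. The numbers are copied straight from the benchmarks above (including Swing% and the batter GB/FB figure). The names `BATTER_PA`, `PITCHER_BF`, `stabilized_stats` and `remaining` are my own inventions for illustration.

```python
# Stabilization benchmarks copied from the tables in this post.
BATTER_PA = {
    "Swing%": 50, "Contact%": 100, "K%": 150, "BB%": 200, "GB/FB": 250,
    "HR/FB": 300, "OBP": 500, "SLG": 500, "OPS": 500, "ISO": 550,
}
PITCHER_BF = {"K/PA": 150, "GB/FB": 200, "K/BB": 500, "BB/PA": 550}


def stabilized_stats(sample_size, benchmarks):
    """Return the stats whose benchmark is at or below the sample size."""
    return [stat for stat, needed in benchmarks.items() if sample_size >= needed]


def remaining(sample_size, stat, benchmarks):
    """How many more PA (or BF) until the stat reaches its benchmark."""
    return max(0, benchmarks[stat] - sample_size)


# A hitter with 160 plate appearances so far.
print(stabilized_stats(160, BATTER_PA))    # ['Swing%', 'Contact%', 'K%']
print(remaining(160, "HR/FB", BATTER_PA))  # 140
```

So at 160 PA you can start trusting the swing, contact and strikeout numbers, but the power numbers still need a few more months.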

Now, how do you use these benchmarks? Well, simply mosey on over to your favorite stat provider, take a look at what the player’s career norms are, and then take note of any changes. In the case of Mike Morse, we already know he has power, but does he have this much power? Is this a pace that can continue? Obviously, we are still in small sample size territory, so I would recommend waiting more than a week or two before doing this. But to help the sample size, simply go back to last year to fill in the PA gap. After all, Jose Bautista’s offensive 2010 doesn’t seem as surprising when you apply this analysis to the results of the 2009 season. In the case of Morse, look over his Swing%. Is he swinging at more pitches? This could mean he’s changed his approach. What about his HR/FB rate? Is it above his career norm? If it is, maybe this new approach is working. Check his K% and BB%. Are they trending upwards? Then maybe his ‘MOAR’ power is here to stay.
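Filling in the PA gap with last year's numbers can be sketched like this. It is a rough approach under one big assumption: last year's events were spread evenly across the season, so borrowing a share of last year's PA borrows the same share of its strikeouts or walks. With real game logs you would take the most recent plate appearances instead. It fits per-PA stats like K% and BB%. The function name `fill_gap_rate`, the player numbers and the career norm below are all hypothetical.

```python
def fill_gap_rate(cur_events, cur_pa, prev_events, prev_pa, benchmark_pa):
    """Rate over this year's PA plus enough of last year's PA to reach the benchmark.

    Assumes last year's events are spread evenly, so a share of last
    year's PA carries the same share of its events.
    Returns (rate, total_pa_used).
    """
    gap = max(0, benchmark_pa - cur_pa)
    borrowed_pa = min(gap, prev_pa)
    borrowed_events = prev_events * borrowed_pa / prev_pa if prev_pa else 0.0
    total_pa = cur_pa + borrowed_pa
    if total_pa == 0:
        raise ValueError("no plate appearances to work with")
    return (cur_events + borrowed_events) / total_pa, total_pa


# Hypothetical hitter: 12 K in 40 PA this year, 100 K in 500 PA last year.
# K% needs 150 PA, so 110 PA are borrowed from last season.
rate, used_pa = fill_gap_rate(12, 40, 100, 500, 150)
career_k_rate = 0.20  # made-up career norm
print(f"K% over {used_pa} PA: {rate:.3f} (career {career_k_rate:.3f})")
```

If the blended rate sits well away from the career norm once you clear the benchmark, that is your cue to dig into whether something actually changed.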

As you can see, there is much less data on the pitching side of things. This is to be expected, as all things pitching are usually volatile. It’s just the nature of the hairy beast. Yes, I have added hair to the beast. So I said, so it shall be done. And yes, there is a bit of mixing and matching to do here with your own research. But hopefully, this tool allows you to make a better judgement on what a player is actually doing and perhaps give you an edge.

I can’t take most of the credit here. When researching this topic, I came across data that goes back as far as five years, but all of it was still very relevant. Unless you are a huge math geek, you’re probably better off reading a Dan Brown novel than this. But for those who like this kind of scientific endeavor, I want to make sure to give credit where credit is due. Please check out where my research led me if you want to take the same path. (Here, here, here and here. There are many other places where I gathered information, but these are the main instigators of this post.)