There has been a lot of interest on the site and in the forums in streaming pitchers.  As many of you know, I value SP more than the average drafter but I am not completely averse to streaming – particularly on teams in shallower league formats (10-12 MLB) with small benches and when I have lost SPs to injury (like RCL where Beachy and Colby are toast).  This is about the time of year when I like streaming best as most teams are at or ahead of pace for innings/games started so there is less competition in free agency.

Inspired by the season-to-date success of frequent commenter/forum poster and occasional contributor Fred, I went about creating a tool and testing/developing a methodology for identifying viable streaming candidates.  The tool is at razzball.com/streamers and we will add a link to it on the homepage.

The variables I ended up using are as follows:

• FIP Index – This is just the league average FIP (4.00) divided by the pitcher’s FIP, times 100.  So a pitcher with a 3.00 FIP has an index of 133 (4/3*100) and is 33% better than average.  Going into the analysis, I assumed that the primary variable would likely be a measure of the pitcher’s skill.  As many of you are aware, FIP (Fielding Independent Pitching) has proven to be a better predictor of future ERA than past ERA as it focuses on the aspects that pitchers control most (HRs, K, BB) and excludes the areas where pitchers have lesser control (BABIP).  Furthermore, it is simple enough to calculate that I can automate it easily.  xFIP (which replaces a pitcher’s HR/FB rate with league average) and SIERA (a more complicated variation of FIP) were considered but I do not think they provide sufficient improvement over FIP to justify all the complexity.  (I’m not alone in that sentiment.)  In cases where the pitcher has thrown fewer than 100 IP, I have credited them with an index of 80 for the IP difference and taken the weighted average of the two (e.g., a pitcher with a FIP index of 200 in 10 IP would receive 90 IP of 80 and average out to 92).  This is probably too conservative but I’d rather be too conservative with pitchers with lesser track records than too liberal.
• Last 20 Game FIP Index – This takes the pitcher’s FIP for the past 20 games.  While this metric correlates fairly highly with FIP Index (most good/bad pitchers are going to be high/low in both), it does account for pitchers who are on hot/cold streaks.  If a pitcher hasn’t pitched in the last 20 games, I just credit them with their season FIP.  To avoid this index becoming too distortive, I put a ceiling at 200 (e.g., any Last 20 Game FIP of 2.00 or under gets a 200 whereas, uncapped, a FIP of 0.50 would get an 800).
• Park Factor – This index isolates the hitting environment for the park in which the start is taking place.  The higher the index, the worse it is for the pitcher.  I have calculated these indexes using a weighted average of 2008-2011 Runs/Game in each park (from ParkFactors.com) and 2012 Runs/Game from ESPN.  I planned on just using the 2012 Runs/Game but the smaller sample size (~50 games) seems too volatile.  I tested the 2008-2011 factor, the 2012 factor, and the weighted factor against a 4 game test data set and the weighted factor correlated best to pitcher ERA/WHIP.
• Opponent Home Park Adjusted Run Index – This index aims to isolate the offensive prowess of the pitcher’s opponent.  It is key to isolate the team’s offense from their home park to avoid double-counting Park Factor.  Since I do not have access to a data feed for importing Home/Road offense splits, I did this by dividing each team’s Runs/Game by the following factor (% of games at home * Park Factor + % of games away * 100).  So teams in offensive parks like the Rockies and Yankees have their Runs/Game decreased and vice versa.  The team indices should mirror their Road stats (as this usually averages out to close to a neutral-offense park).  The reason I chose Runs/Game vs. OPS or another offensive metric is to remain consistent with the other three indices (which helps since indices are driven by the denominator).
• Home Start vs. Road Start – Pitchers generally pitch better at Home vs. Away – even after factoring in the park.  Courtesy of ESPN, here are the stats:  2011 Home/Road/Average ERA:  3.82/4.07/3.94, 2012 Home/Road/Average ERA:  3.80/4.21/4.00.  So pitchers do roughly 3-5% better than average at Home and 3-5% worse than average on the Road.
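
The index arithmetic in the bullets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's actual code: the function names are made up, treating a zero or negative recent FIP as hitting the cap is an assumption, and so is the final step of dividing the park-neutral Runs/Game by a league average to express it as an index around 100.

```python
LEAGUE_AVG_FIP = 4.00      # league-average FIP quoted in the post
MIN_IP = 100.0             # below this, the index is padded toward 80
PADDING_INDEX = 80.0       # index credited to the "missing" innings
RECENT_INDEX_CAP = 200.0   # ceiling for the Last 20 Game FIP Index


def fip_index(fip: float) -> float:
    """League-average FIP divided by the pitcher's FIP, times 100."""
    return LEAGUE_AVG_FIP / fip * 100.0


def padded_fip_index(fip: float, innings: float) -> float:
    """IP-weighted blend of the raw index and an index of 80 for innings short of 100."""
    raw = fip_index(fip)
    if innings >= MIN_IP:
        return raw
    missing = MIN_IP - innings
    return (raw * innings + PADDING_INDEX * missing) / MIN_IP


def recent_fip_index(recent_fip: float) -> float:
    """Last 20 Game FIP Index with the ceiling of 200 applied."""
    if recent_fip <= 0:
        # Assumption: a zero or negative FIP simply gets the cap.
        return RECENT_INDEX_CAP
    return min(fip_index(recent_fip), RECENT_INDEX_CAP)


def park_adjusted_run_index(team_rpg: float, league_rpg: float,
                            home_share: float, home_park_factor: float) -> float:
    """Strip the home park out of a team's Runs/Game, then index it.

    home_share is the fraction of games played at home (0 to 1) and
    home_park_factor is on the usual 100 = neutral scale.
    """
    blended_factor = home_share * home_park_factor + (1.0 - home_share) * 100.0
    neutral_rpg = team_rpg / (blended_factor / 100.0)
    # Assumption: the result is expressed relative to league-average Runs/Game.
    return neutral_rpg / league_rpg * 100.0


if __name__ == "__main__":
    print(round(fip_index(3.00)))          # the post's example: 133
    print(padded_fip_index(2.00, 10))      # the post's example: 92
    print(recent_fip_index(0.50))          # capped at 200 instead of 800
    print(round(park_adjusted_run_index(5.0, 4.5, 0.5, 120.0), 1))
```

The last call uses hypothetical numbers (a team scoring 5.0 Runs/Game, half its games in a park with a factor of 120) purely to show the direction of the adjustment: the team's offense is marked down once its home park is removed.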

The formula for determining a pitcher’s score came from doing a multiple regression analysis of these variables (the four indices plus the Home/Road start) against WHIP (which I considered the most predictive stat to gauge a stream’s success).

The correlations to ERA/WHIP for each stat, based on a 4-game test, are in parentheses (see here for the worksheet).  The higher the correlation, the better the variable is at predicting ERA/WHIP:

FIP Index (19.4%/20.1%)
Last 20 Game FIP Index capped at 200 (10.8% / 16.5%)
Park Factor (11.7%/7.2%)
Opponent Home Park Adjusted Run Index (3.8%/5.3%)
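
For readers who want to try this kind of check on their own data, here is a minimal sketch of a plain Pearson correlation between one index and WHIP. It assumes the percentages above are ordinary correlation coefficients reported as magnitudes, which the post does not spell out. The function name and the sample numbers are invented for illustration and are not taken from the worksheet.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (2 or more values)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up illustrative values, NOT data from the worksheet.
    fip_indexes = [133, 110, 95, 80, 120]
    whips = [1.05, 1.20, 1.35, 1.50, 1.30]
    r = pearson(fip_indexes, whips)
    # A higher index should go with a lower WHIP, so the raw r is negative;
    # its absolute value is the strength of the relationship.
    print(f"r = {r:.3f}, strength = {abs(r):.3f}")
```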

The formula from the regression analysis is really mathy – if you really really care, I explain it in more depth on the ‘Formula’ tab of the worksheet linked above.  The key points:

• The correlation between a pitcher’s resulting index and their ERA/WHIP is 23.1%/25.1% which does improve upon using FIP alone (19.4%/20.1%).  This is by no means fantastic – you can read this as the score helped explain only 25% of a pitcher’s WHIP with the other 75-80% driven by chance/luck.  But this type of skill/luck split is common in Fantasy Baseball and one that keeps me really humble as an analyst (see here to see my analysis of pre-season rankings and how they correlate with RCL Team success – the predictiveness of rankings is about as humbling as it gets).
• Interestingly, despite the fact that FIP Index is the best single indicator, the Park Factor is the index with the highest weight in the formula.  Ignoring the Home/Road start variable (which is not an index and thus is on a different scale when determining the coefficient), the weights are:  31% FIP Index, 14% Last 20 Game FIP Index, 39% Park Factor, and 17% Opponent Home Park Adj. Run Index.

Variables I considered but did not use include:

• Home/Road splits for Pitchers – In the roundups, we often reference pitchers with extreme home/road splits like Tommy Milone in 2012 or Clayton Richard in 2011.  The reality is that – once you adjust for the park – these splits are not very predictive because pitching at Home/Away really isn’t a skill.  So, yes, a Padre pitcher fares better at Home vs. Away but that’s reflected in the park factors as well as the Home/Road adjustment.  Here is an analysis I did looking at 2011 vs. 2012 Home/Road pitcher splits.  If a pitcher has shown consistent and considerable splits in Home/Road over multiple seasons, I can see this as a potentially valuable criterion – I just do not think it has been shown to be consistent across enough pitchers to warrant inclusion.
• Day/Night splits for Pitchers – Same rationale as why I disregard Home/Road splits for pitchers.  Fun example – this year, Hiroki Kuroda leads the majors in day game ERA among pitchers with 30+ IP – an amazing 0.00 ERA (and 0.67 WHIP).  His night ERA is 4.23.  It’s as if he has the opposite of Josh Hamilton’s daylight-sensitive blue eyes!  Given this huge split, you would think that Kuroda is a better day pitcher vs. night pitcher.  For 2009-2011, in 108 IP Kuroda’s day ERA is 3.56 which is WORSE than his night ERA of 3.29.  Either Kuroda has suddenly picked up a daytime pitching skill upon donning Yankee pinstripes or this is just an anomaly.  If you believe the former, feel free to overlay that as your own personal variable for considering streaming pitchers.
• Left/Right splits for a Team’s Offense – This one is interesting.  I do not have these stats at my disposal but there could be something here given my assumption that some batters exhibit consistent left/right matchup splits.
• Pitcher K-rate, O-Swing % – Certainly valuable for identifying pitcher success but K’s are accounted for in FIP.  No reason to double count this.
• Pitcher BB-rate – Certainly valuable but BBs are accounted for in FIP.
• Pitcher vs. Opponent Previous In-Season Matchups – I do not think this holds up as a predictive variable mainly because 1) the sample sizes are small and 2) there are too many cases where it is the first matchup so it cannot be applied equally across all starts.  This FanGraphs post by Dave Cameron talks about the fallacy of reading too much into pitcher/batter matchup data – I think pitcher/team matchups are just an extension of this.  That said, I can see using this subjectively – no reason to go forward with streaming a pitcher if you do not feel comfortable.  Personally, I do not like it when a pitcher has to face a team twice in quick succession.  Is this statistically valid?  Probably not.
• % Owned – This would not be a variable – it is just useful reference data that could help in identifying pitchers more likely to be available for streaming.  Unfortunately, this is not something that we can automate and I do not think we have the requisite rights to this data.  Maybe one day we will have a true techie on staff who can figure this out with Yahoo’s API.

Given the relatively low correlations between the Stream-o-nator’s Score (~25%) and pitcher ERA/WHIP, there would seem to be significant room for improvement.  Perhaps in time, we end up adding extra variables, tweaking the current variables, or tweaking the weights of the current variables.  But I would guess that any additional gains will be marginal given that four positively correlated variables only add an additional 5% to Season FIP’s 20% correlation.

My general advice on pitcher streaming is that there are ways to improve your chances of successfully streaming but there are no sure bets.  Some are in your control (picking pitchers with the best chance of success), some are not (whether other people in the league are also streaming).  If you are expecting to lead the league in ERA/WHIP via streaming, you are banking on a statistical long-shot.  It’s possible (as Fred has achieved season to date) but statistically improbable nonetheless.  If you can manage to get production equivalent to average in your fantasy baseball league with streaming, I think you are ahead of the game.

Hope this is helpful and thanks to Fred and Awesomus Maximus for your feedback during the testing stages.

## 62 Responses

1. Drew says:

@Rudy this is the greatest thing ever, kudos to you

2. elaw77 says:

Start Volquez @CIN? Scherzer @BOS?

3. Coco Krispie says:

Thanks this is great. Rudy how does your differ from others around the web and on other mainstream sites?

• OaktownSteve says:

@fred barker, basically a very, very slight improvement on league average.

4. fred barker says:

@Rudy, you have done a TREMENDOUS job of producing the MOST ACCURATE predictive tool for streaming. I have scrapped my ‘gut’ formula. I immediately grabbed the top three pts. guys available in my league for the next 3 days: Parker/Beavan/Diamond–all 120 and above Scores (looks to me that anything 115+ is gold).

Razzers: expect the occasional ‘not-as-predicted’ performance. But, all-in-all, I believe, that this will get you close to 3.00 ERA and 1.20 WHIP over a season–dependent upon how fast they get scooped in your league. :-)

Rudy, I will try to dig into the formulas, but KNOW, that I am a BELIEVER!

Thank you!!!!!!!!

5. OaktownSteve says:

It can’t be done. If it could it would already have been done by Vegas or the people trying to beat Vegas. Fred’s streaming methodology is pretty easy to explain. He’s looking for pitchers with the lowest rates (ERA, WHIP) versus the poorest hitting teams. The rest of the analysis is just noise which won’t correlate, as you see yourself above. In fact, I think the best way to pick streamers is to look at the Vegas line and the O/U.

The RCL format is particularly stream friendly. Daily line ups. Not particularly deep leagues. Unlimited roster moves. Also, because there’s a GS limit rather than an innings cap, you can really benefit from extra relievers adding stats and bringing down rates.

Employing a streaming strategy that opens up additional roster spots for extra relievers is, in my opinion, the optimal strategy for the RCL format.

• fred barker says:

@OaktownSteve, it HAS been done. My season so far: 3.107 ERA/1.195 WHIP thru 153 GS. Over 75 SP and 65 RP streamed. 1 SP drafted and traded 1st month.

The problem with the Vegas line is that it just weighs Win probability, nada for peripherals. It could have Yanks over Sox at 10 to 5, but the game could be very high scoring. Not a good indicator for streaming.

• fred barker says:

@fred barker, Rudy’s will be even better.

• OaktownSteve says:

@fred barker, I’m going to predict that Rudy’s methodology will not do well.

I was thinking about Econ Nobel Daniel Kahneman while I was out getting tacos. He proved that simple scoring systems with few inputs outperform “gut” predictions and also more mathematically complicated predictive systems for events or situations that are difficult to predict. I think that Rudy’s model has too many inputs that he’s already indicated don’t mathematically correlate.

I’m going to hypothesize that instead of more complexity, you want less. I offer as a model a two-input scoring system:

FIP with a score of 3, 2, or 1 with 3 being best. Say FIP under 3.75, between 3.75 and 4.50 and 4.50+

Opponent runs scored (road or home depending on venue) with a score of 3,2,1 for bottom 1/3, middle 1/3, top 1/3 of runs per game league average for that night.

Add the two scores and take the highest scores among streamable pitchers plus ties. I’d put that system against anything I’ve read here.
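
A minimal Python sketch of the two-input scoring system described in this comment, for anyone who wants to try it. The function names, the sample pitchers, and the runs-per-game figures are all hypothetical. It assumes the opponent's runs per game passed in is already the right home or road split for the venue, and that "bottom/middle/top 1/3" is judged by how many teams in the supplied list score fewer runs.

```python
def fip_points(fip):
    """3 points for FIP under 3.75, 2 for 3.75 up to 4.50, 1 for 4.50 and above."""
    if fip < 3.75:
        return 3
    if fip < 4.50:
        return 2
    return 1


def opponent_points(opp_rpg, all_team_rpg):
    """3 points for a bottom-third offense, 2 for middle third, 1 for top third."""
    n = len(all_team_rpg)
    teams_below = sum(1 for rpg in all_team_rpg if rpg < opp_rpg)
    if teams_below < n / 3:
        return 3
    if teams_below < 2 * n / 3:
        return 2
    return 1


def pick_streamers(candidates, all_team_rpg):
    """Return the names with the highest combined score, including ties.

    candidates is a list of (name, fip, opponent_runs_per_game) tuples.
    """
    scored = [
        (fip_points(fip) + opponent_points(opp_rpg, all_team_rpg), name)
        for name, fip, opp_rpg in candidates
    ]
    if not scored:
        return []
    best = max(score for score, _ in scored)
    return [name for score, name in scored if score == best]


if __name__ == "__main__":
    # Hypothetical league of six offenses and four streamable pitchers.
    league_rpg = [3.5, 4.0, 4.5, 5.0, 5.5, 6.0]
    streamable = [
        ("Pitcher A", 3.50, 3.5),
        ("Pitcher B", 4.00, 4.0),
        ("Pitcher C", 3.60, 4.0),
        ("Pitcher D", 5.00, 6.0),
    ]
    print(pick_streamers(streamable, league_rpg))
```

In this made-up example, Pitcher A and Pitcher C both score 6 (good FIP against a bottom-third offense), so both are returned as ties.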

• fred barker says:

@OaktownSteve, roll up your sleeves, put your system to the test next year. if it produces better results, i’ll be happy to jump to it.

• OaktownSteve says:

@fred barker, Vegas doesn’t just give the win probability, they also give the O/U.

Your results do not validate your methodology. You have a small sample size relatively speaking.

More importantly you have no mathematical correlation between input and results. I am speculating that your success in streaming is due primarily to the availability of pitchers pitching against the league’s weakest offenses. Other inputs are subject to your own confirmation bias.

I do have a question though…do you have your ERA and ratios for just the starters you’ve streamed? The overall ERA and WHIP numbers are the reason I brought up the RCL format. When you stream pitchers you can maximize the number of roster spots that can be used for relievers (facilitated by the no innings cap) and that’s going to drive down your ratios relative to competitors that use only starters and closers.

• VinWins says:

@OaktownSteve, Vegas also takes into account gamblers, as they set the line to get equal betting on both sides of any bet, correct?

Fred’s stats at the all star break:

68 SP have recorded 55 wins and 648 strikeouts with an ERA of 3.13 and 1.22 WHIP in 797 IP.

• MattTruss223 says:

@VinWins, You are correct. Ideally they have equal bets on each side. Vegas banks on the juice. I wouldn’t trust Vegas with my fantasy team.

• OaktownSteve says:

@MattTruss223, I mentally factor in the Vegas bias. I’m just looking for guidelines.

• Sweeney says:

@VinWins, I’ll give you one guess as to the best way to get equal betting on both sides of any bet.

• Sweeney says:

@Sweeney,

P.S. — if your answer involves the assumption that you could retire today and get rich by just perpetually betting against the Yankees at your local sportsbook, you are an idiot.

• VinWins says:

@Sweeney, By changing it when they get too much on one side?

• OaktownSteve says:

@VinWins, My sense is that those are not all streams though. If I recall correctly, Fred only drafted one pitcher but he has kept many of those pitchers for multiple starts, which is not really a true stream. In the original post he references RA Dickey, James McDonald, Lance Lynn and Jarod Parker all of whom have had stretches of excellence. Looking at the overall stats would not validate the proffered streaming strategy if that were the case. It’s simply a form of the strategic ploy of not drafting starting pitchers and then banging the waiver wire intra-season and riding a hot hand. If those pitchers are picked up and started multiple times using different criteria for deciding when to start them than the criteria used to add new streamers, your numbers and methodology don’t really mean anything.

By the way, it’s interesting that the SP ERA is lower than the overall ERA. The SP ERA is 3.13 and the team ERA is 3.107 as of today. The relievers are bringing the ERA up if those numbers are accurate.

• VinWins says:

@OaktownSteve, At the time, his RPs had a 2.75 ERA.

I know my RPs have blown up my ratios far too often, though.

• OaktownSteve says:

@VinWins,

How do his RPs have an ERA of 2.75 and his starters have an ERA of 3.13 and his team ERA is 3.17?

• MattTruss223 says:

@OaktownSteve, 3.107 not 3.17

• VinWins says:

@OaktownSteve, As I said, this was at the all star break, when the original post was published.

• OaktownSteve says:

@VinWins, got it. But these are not really streaming pitchers’ numbers, correct? This is just all starters, including those starting multiple times based on recent performance rather than by the streaming criteria being discussed? This question might be for Fred and not for you.

• VinWins says:

@OaktownSteve, You may be right. If the definition of streaming is keeping for only 1 start, then not all these starts would be considered streaming.

• OaktownSteve says:

@VinWins, I think the key thing is not really what the definition of streaming is but rather whether the factors Fred offered (R v L, home v away, FIP, day night, OPS or whatever) when used as part of an analysis provided any meaningful predictive powers for a decision to start a pitcher on a particular day whether that pitcher be already rostered or he be a daily pick up. If a pitcher is picked up and then started in subsequent starts using different criteria (i.e. becomes an auto-start for some period of time) then the entire premise has no validity.

More to the point, Fred’s criteria are a real hodgepodge of things including quantitative criteria (stats) indiscriminately applied and qualitative stuff like “I go by Grey.” I am pretty much calling bullshit on the whole idea and looking for other explanations, including the above distinction about streaming and also luck/sample size. If there is a way to predict individual game performance, this ain’t it.

• OaktownSteve says:

@OaktownSteve, also, it’s fair to say that if Fred had a 3.13 starter ERA at the ASB and a 2.75 reliever ERA and now has a 3.17 team ERA then either the relievers have been really crappy post-ASB or the starter ERA has already regressed.

• MattTruss223 says:

@OaktownSteve, It’s 3.1 ZERO 7

3.107, not 3.17.

• OaktownSteve says:

@MattTruss223, yeah…not really relevant to the point but thanks for the correction. I’m on a conference call and toggling back and forth.

• MattTruss223 says:

@OaktownSteve, It is relevant. You’re saying his ERA has gone up since the ASB, when in fact it has lowered by .023. So, it’s pretty relevant to the point you’re trying to make. You’re saying it’s raised .04 instead of lowered .023.

Now, it’s not relevant to your claim that streaming successfully can’t be done.

• MattTruss223 says:

@VinWins, Kinda seems like semantics to me. Some of the guys he kept on for awhile (Lynn), but others, like McDonald I think he added and dropped for starts so he could fill the slot with RPs in between.

@OaktownSteve, I agree, the RCL seems best suited for this. The Games Started loophole makes it easier than if it were an innings cap league.

Either way, he only drafted 1 SP and traded him early on. It’s pretty sweet if you can do it.

• OaktownSteve says:

@MattTruss223, I rolled with two starters last year (Verlander, Haren so good picks) and daily streamed almost every other start. Ended up 6th in the RCL overalls. So I agree that there are strategic elements mixed up in the conversation here that have to do with format.

I’m really not buying the premise that we’ve got something as far as concerns predicting individual game performance of a pitcher. I know for a fact that the numbers aren’t there to back it mathematically (not yet) and I’m predicting that Rudy’s methodology, which can be validated mathematically, will not have much success. I get the sense that Rudy knows this too if you look at his concluding words in the post.

• MattTruss223 says:

@OaktownSteve, Agreed, I’ve done basically the same thing this year with great results (MadBum and Cliffly – Just traded MadBum).
Rudy isn’t a fan, he said so in the podcast. But, if you’re going to stream, at least there’s a smidge of analysis behind it, rather than throwing darts and going on ‘gut’ instinct.

• OaktownSteve says:

@MattTruss223, I think one of the beauties in this system is that you don’t even really need to do particularly well with your streamers because of all the extra quality innings you get from your relievers. Yeah it’s nice to think that you can do some analysis and beat the averages, but I think it’s really indulging in some fallacies to do so. But it doesn’t really matter because what you’re really doing is gaming the system/format.

• I laid out the stats for it – the best fit equation based on the variables was about 25 percent for WHIP. This is better than FIP alone (around 20 percent) with park factors adding the most of the other variables.

You can take the spreadsheet I put together and see if you can improve on that. Doubt it. But FIP alone will come pretty close.

I’m not a fan of large-scale streaming as a similar-minded leaguemate or two can kill any potential edge (if there is one). But the aim of the tool – including future iterations – is to help readers make the most educated decisions.

6. fred barker says:

@Rudy, I have been using the opponent’s team hitting OPS vs. my pitcher’s left/right. Will add a column to your table to try to see if we can determine a correlation to improving the model.

7. El Famous Burrito says:

Will it tell me if I should drop Napoli?

8. TheTinDoor says:

Really awesome. I’ve been doing a rudimentary version in my 12-team mixed weekly league; this format is really ideal. Shallow league with NO bench, so plenty of viable arms. And since it’s a weekly league, I really only need to identify 1-2 great matchups a week.

I think Opponent lefty/righty splits would be a welcome addition to your pool of data. What I pull each week is very simple: Team runs scored vs. Lefties, and vs. righties (on a per-AB basis, of course). Some of the differences are really stark; here’s the ranking (lower = more friendly matchup)

| TEAM | Right | Left | Diff |
|---|---|---|---|
| Minnesota | 11 | 28 | -17 |
| Arizona | 13 | 30 | -17 |

While a righty may work, I would NOT be streaming a lefty against these teams. Is there some overlap between lefty/right splits and home/away? Probably…as a quick-and-dirty method, though, this has worked very well for me.

9. Mike says:

Good work Rudy! Glad I could help jumpstart the engine.

10. Prezii says:

The stream-o-nater is here to stay… Very good addition!

11. Sweeney says:

@Rudy —

This is just begging for a follow-up post with the following analysis. I would do it myself, but then again I don’t write for a popular fantasy baseball blog.

(1) Separate SPs into about 10-12 tiers based on pre-season valuation (expert rank or ADP).
(2) Determine 2012-to-date average per-start performance for each tier.
(3) Determine the per-start performance of each tier using only the top 1/3 of streamonator-scored starts.

This would allow your readers to decide whether or not to stream using their particular league’s settings. For example, let’s say I own 4 SP, and I’m streaming a 5th spot. My SPs are tier 2, 3, 4, and 5 SPs. If I streamed, I’d have to stream tier 7 or tier 8 guys because I know tier 6 and above are almost all owned in my league. Should I drop my tier 5 SP to stream a bunch of tier 7/8 guys? How good would a pitcher have to be in order for me to pick him up at the expense of continuing to stream? Those are the questions that are important to those of us in your audience who are not already RCL uber-streaming devotees.

12. kaiser soze says:

Rudy – Awesome job on the Stream-o-nater. I will be making use of this tool fo-shiggity.

On a different note, what are your thoughts on starting pitchers at the beginning of the week in H2H?

For example, I saw that Grey recommended sitting Scherzer @Bos today in a Roto league. Should he also be benched in a 10-team H2H? Or is his K potential worth the risk? I expect all pitching cats to be close this week…

13. later tater says:

Rudy, this is awesome!

started *two* pitchers i ordinarily wouldn’t have, but they had great ranks on the stream-o-nator; both were great with good ks and two wins!

when streaming, i just make an educated guess based on what i know about the pitcher, the park and the opponent; it’s really nice to be able to have some numbers on the guys i don’t know so well

well done!

14. TheNewGuy says:

Brothers got a shot at snagging saves? Trying to see if he’s worth dropping a bench bat (Craig/Lucroy) for.

15. TheNewGuy says:

And again to subscribe to the post!

16. Steve says:

Great job on this Rudy (and Fred & Awesomus Maximus). Harrell, Millwood and Tilman all did good work for me last week

One little wrinkle – has anyone else noticed that it doesn’t seem to filter the date by one-digit numbers? If I put in a ‘2’ to get streamers for this Thursday, it doesn’t filter at all. I assume it’s giving me *every* date that has a ‘2’ in it.

Which will be the case for pretty much the next thousand years…

;-)

• Ha – just put in 8/2 and it’ll filter. Just for you, I’ll get rid of the 2012 at some point (before 2013).

• Steve says:

@Rudy Gamble, Nice one – thanks. As an aside, we write the date differently down here – 2/8 rather than 8/2 is August 2nd for us, but I can make the adjustment for the Stream-O-Nator ;-)

17. Anthony says:

@Rudy – thanks for this! Few questions though – Does this tool update itself each for streamers or ? the higher the score the better the streamer ya ?? Medlen is 133 im tempted to take him … What are your thoughts on him ??

• It updates every day or so (sporadic on weekends) and includes the next 4 days. Higher the score the better. Converted relievers (like Medlen) and pitchers with minimal IP (Germano) are more volatile. I like Medlen but his score is overrated because his FIP was from relief which usually leads to a FIP a run lower than as a starter.

• dingbat says:

@Rudy Gamble, Looks like the S-O-N is in need of an update. It’s still showing pitchers for 7/30. Medlen and Harvey worked out great, by the way!

• It’s updated this morning. Golfed last morning and never made it back from the 19th hole.

18. Anthony says:

each week*

19. Anthony says:

@Rudy thanks for the info. And FIP I believe is just another word for what pitchers ERA should be right? Sometimes pitchers get bad luck/ball park/ etc and this just tells u what the ERA should be for a pitcher? Or am i completely off.

• Yup, that’s about right. Mostly on :)

20. Abdoozy says:

I told myself earlier this season I was never streaming Harang again. Then I saw he was highly rated tonight and decided to give him a shot at home.

Dammit.

21. Melvin Emanuel says:

@Rudy

As an avid streamer, one of the biggest factors I check on fangraphs is opponents’ last 14 or 30 day offensive stats. Generally I focus on wOBA, bb%, k%, and ISO. In certain situations checking how a team hits LHP can help, as well as home/away splits.

22. Fuij (Fausto or Roberto? RCL) says:

I actually thought of doing something similar with variables based on pitch fx and plate umpires.
If I can get it together by next spring, I will send it to you.

23. Chupacabra says:

Harang broke the Stream O Nator.

24. tenacean says: