Oh, sure, there was a pretender to the title. That guy who pointed to the stands and delivered HIS home run.  But, hey, it was just ONE prediction.  Sure, a bunch of you can make one prediction.  One projection.  But I, Simply Fred, thought we should put the pundits to the test.  All the fantasy baseball ‘perts — or at least the ones whose preseason predictions are still readily available.  Why not measure the ACCURACY of their preseason PREDICTIONS to the actual, end-of-year RESULTS?  FanGraphs provided us with the projections of the contenders: FANGRAPHS, BILL JAMES, MARCEL, ROTOCHAMPS, ZIPS.  We threw in our own dark horse, Grey.

Methodology (reliability and accountability a requirement):

Refined our list of targeted players for which all had provided projections (an even playing field) — 177 hitters. (Pitchers not evaluated, because I’m doing this in my own free time.  You want to know the pitchers; I say to you gently and encouragingly, you do it in your free time).

Needed an impartial, hopefully fair, and representative standard for comparison.  The formula for a player’s value:

(R x 1) + (HR x 3.5) + (RBI x 1) + (SB x 2) + (AVG x 290)  (Gets at a player’s relative worth.)  (You may have a different formula, but remember our task is to measure end results relative to beginning projections.  And I can only go on my own formula, ’cause that’s the one I know.)

So, we establish a player’s preseason ‘value’ by applying the formula to a given set of projections.  Then we apply the SAME formula to the end of year actual results.
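For anyone who would rather see the formula as code, here is a minimal Python sketch. The function name `player_value` is made up for illustration; the weights are the ones in the formula above, and the sample line is the Fangraphs projection for Ichiro used in the article's example.

```python
def player_value(r, hr, rbi, sb, avg):
    """Single-number value from the article's formula.

    Weights: runs x 1, home runs x 3.5, RBI x 1, steals x 2, average x 290.
    """
    return r * 1 + hr * 3.5 + rbi * 1 + sb * 2 + avg * 290


# Fangraphs' preseason line for Ichiro: 110 R, 8 HR, 69 RBI, 36 SB, .329 AVG
fangraphs_ichiro = player_value(110, 8, 69, 36, 0.329)
print(round(fangraphs_ichiro))
```

The same function would be applied to the end-of-year stat line, so both numbers are on the same scale before they are compared.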

Example:  Ichiro
Fangraphs:    110 + (8 x 3.5) + 69 + (36 x 2) + (.329 x 290) = 374, which is your preseason projection.  End of the year production for Ichiro came out to 303, so Fangraphs’ percent difference (preseason divided by end of year) was 1.23 between the two.  Here’s the others:

Rotochamps:  Preseason:  342; End of the Year:  303; Percent difference:  1.13
James:  Preseason:  310; End of the Year:  303; Percent difference: 1.02
Marcel:  Preseason:  290; End of the Year:  303; Percent difference:  0.96
Zips:  Preseason:  303; End of the Year:  303; Percent difference:  1.00
Grey:  Preseason:  299; End of the Year:  303; Percent difference:  0.99

Ideally, you want to come closest to 1.00.  For this player, ZIPs was spot on!  (Grey 2nd closest, and so on.)
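The Ichiro comparison can be reproduced in a few lines of Python. This is only a sketch: the preseason values and the 303 end-of-year value are the ones listed above, while the variable names are made up.

```python
# Preseason values for Ichiro from each source, as listed above
preseason = {
    "Fangraphs": 374,
    "Rotochamps": 342,
    "James": 310,
    "Marcel": 290,
    "Zips": 303,
    "Grey": 299,
}
actual = 303  # Ichiro's end-of-year value

# "Percent difference" in the article is preseason divided by end of year
ratios = {name: value / actual for name, value in preseason.items()}

# Rank by how far each ratio sits from a perfect 1.00
ranked = sorted(ratios, key=lambda name: abs(ratios[name] - 1.0))

for name in ranked:
    print(f"{name:<11}{ratios[name]:.2f}")
```

Sorting by distance from 1.00 (rather than by the ratio itself) is what lets a 0.99 outrank a 1.02.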

We then measure the average of all percent differences for each prognosticator. The initial results for all 177 common players:

Marcel:    1.17
Zips:        1.23
Rotochamps:    1.34
James:        1.36
Grey:        1.36
Fangraphs:    1.41

Wow!  Our Swami seems to be sinking in swampville.  Though all of the projections were far too high.  Hold it.  What’s that we see lurking in the mud?  Injuries!

Let’s limit our pool to those players that had 525 AB or more (to reduce the impact of unforeseen injuries).  If you do that, you get:

Grey:        1.00
Rotochamps:     1.01
James:        1.02
Zips:        0.97
Fangraphs:    1.05
Marcel:    0.88
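To show why the 525 AB cut moves the averages so much, here is a small Python sketch. The four players and all of their numbers are hypothetical, made up purely for illustration; only the idea (average the ratios, then average again after dropping low-AB players) comes from the article.

```python
from statistics import mean

# Hypothetical rows: (player, at-bats, preseason value, end-of-year value)
rows = [
    ("Player A", 620, 310, 300),
    ("Player B", 540, 280, 290),
    ("Player C", 210, 300, 120),  # missed most of the season
    ("Player D", 480, 260, 200),
]


def average_ratio(rows, min_ab=0):
    """Mean of preseason / end-of-year for players with at least min_ab at-bats."""
    ratios = [pre / post for _, ab, pre, post in rows if ab >= min_ab]
    return mean(ratios)


overall = average_ratio(rows)              # every player counts
healthy = average_ratio(rows, min_ab=525)  # only the 525+ AB players

print(f"all players: {overall:.2f}")
print(f"525+ AB:     {healthy:.2f}")
```

One injured player with a ratio of 2.50 is enough to drag the overall average far above 1.00, which is the same effect the cutoff is meant to reduce.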

What does it mean that Grey had a ratio of 1.00?  It means that his overall projections, for players with 525 AB or more, came out at the exact, overall end-of-year production.  It does not mean perfection.  Some projections a little low.  Some a little high.  However, overall, of all the ‘perts, he came closest to end-of-year production. Okay, now feel free to poke holes in this.

## 32 Responses

1. Rank Stank says:

What’s the variance for ‘perts on 525+ ABs?

2. DrEasy says:

Nice work, Fred! I wonder if there’s an easy way to compare projections without “flattening” a player’s value into a single number. Maybe using something like the Euclidean distance between a player (a 5-dimensional data point) and the given projection. You’d probably have to normalize the distance for each dimension.

Then there would be all sorts of fun things to do. One thing would be to do the total sum of the distances for all players for each projection. Another would be to see which projection was “closest” to the player in question, and then see which projection got the most players right.
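A rough Python sketch of that idea, under loud assumptions: the Fangraphs line is the Ichiro projection from the article, but the "System X" projection, the actual stat line, and the per-stat scales used for normalizing are all hypothetical numbers chosen for illustration.

```python
import math

STATS = ("R", "HR", "RBI", "SB", "AVG")

# Assumed per-stat scales (e.g. the spread of each stat across the player
# pool); these values are hypothetical
scale = {"R": 20, "HR": 10, "RBI": 25, "SB": 12, "AVG": 0.03}


def normalized_distance(projection, actual, scale):
    """Euclidean distance after dividing each stat's miss by that stat's scale."""
    return math.sqrt(
        sum(((projection[s] - actual[s]) / scale[s]) ** 2 for s in STATS)
    )


projections = {
    # Fangraphs' Ichiro line from the article
    "Fangraphs": {"R": 110, "HR": 8, "RBI": 69, "SB": 36, "AVG": 0.329},
    # Hypothetical second system
    "System X": {"R": 90, "HR": 6, "RBI": 50, "SB": 35, "AVG": 0.300},
}
# Illustrative end-of-year line (an assumption, not taken from the article)
actual = {"R": 80, "HR": 5, "RBI": 47, "SB": 40, "AVG": 0.272}

distances = {
    name: normalized_distance(line, actual, scale)
    for name, line in projections.items()
}
closest = min(distances, key=distances.get)
print(closest, round(distances[closest], 2))
```

Summing these distances over all players gives one score per system, and counting how often each system is `closest` gives the "most players right" tally.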

3. Big Mike says:

Fred, what about on the pitching side? Can you post a similar review?

4. Jeff says:

If you do a pitching one make sure you do just enough analysis to get your score to 1.00 then stop. Also make sure you pick the exact number of pitcher innings that make your score 1.00. (why did you pick 525 ABs instead of 400, 500, 550, 600, etc?)

5. Tony says:

@Big Mike: ha he said it was a lot of work and if you wanted something like that check it out yourself.

@Fred: nice work: did you just make that formula up or borrow it from somewhere? Maybe I missed that part. Interesting article. I don’t know how conclusive it is though. It looks like all the “perts” were pretty good in their calls?

6. Red Sox Talk says:

Comment #1 is right – without seeing the variance or standard deviation of these results, that 1.00 is essentially meaningless. I could predict 50% too high on one player and 50% too low on another player, and it would even out to a svelte 1.00 rating.
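The two-player case in this comment is easy to check in Python. The ratios 1.5 and 0.5 are the commenter's own example; the spread measures shown (standard deviation and mean absolute error) are just two common choices, not something the article computed.

```python
from statistics import mean, pstdev

# One player projected 50% too high, one 50% too low
ratios = [1.5, 0.5]

average = mean(ratios)                                # looks perfect
spread = pstdev(ratios)                               # but the misses are large
mean_abs_error = mean([abs(r - 1.0) for r in ratios])

print(f"average ratio:       {average:.2f}")
print(f"standard deviation:  {spread:.2f}")
print(f"mean absolute error: {mean_abs_error:.2f}")
```

An average ratio near 1.00 only says the highs and lows cancel; the spread is what says how big they were.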

7. Kwaz says:

Would love to see an SSE analysis for this.

8. Matt L says:

This post reminds me that I wish projections would not collapse injury risk and output. We should get two sets of projections: (1) the expected output for a player if he has 535+ healthy ABs, and (2) the percentage chance that player will get 535+ healthy ABs. Instead of listing these two separately, projections always collapse them (maybe discounting projection), but that is not helpful b/c a productive player who gets injured can be DL’d or replaced.

9. Dabulls says:

Any chance this analysis can be done retrospectively to other years? Is Grey consistently “good” (pending release of deviation figures) or was it a fluke year?

10. Wake Up says:

I did a similar ‘pert evaluation test. But, instead, I used Archimedes’ Principle. So far, Verducci is in the lead, since I was able to hold him under water for the longest amount of time. At least, I think, that’s how the Archimedes’ Principle works, right?

11. simply fred says:

@Rank Stank: Sorry, didn’t calc variance. Anyone who wants the spreadsheet just email me at:

fred_barker@comcast.net

The options to attempt a valid comparison are innumerable. My goal was to try to estimate which pert’s projections would be most beneficial to use in projecting end of year production for my draft. I, personally, would rather use projections as if players will be healthy, rather than deal with unpredictable injuries.

@DrEasy: I made an initial attempt last year using a strategy similar to the one you propose. I ranked each pert’s projection for each stat relative to the final stat. I only got so far as 1B (it was a lot of work). The perts were different, but Grey ranked 2nd (I believe; I didn’t save those stats). I will try to repeat again annually. Again, anyone welcome to the spreadsheet. One sheet shows an example of comparing stat-by-stat.

@Big Mike: Sorry, Mike. I am a streamer. I find the pitching too unpredictable and not on my agenda.

@Jeff: My initial cut was at 500 AB. Much more reasonable results than the overall. Then moved up to 525. Felt that level even more solid. POSition is id’d in the spreadsheet. I sorted such. One or two extremely high/low player performances can skew those maybe 5%. For example, 2nd base had an overall ratio of .92 with Grey at .95. The first thought would be for perts to up their projections. However, looking at the data, Kinsler had ratios from .68 to .84 (all under-projected him). So, again, the overall ratio of all positions helps to reduce the skewing. I did calc median and mean in the spreadsheet.

@Tony: I made the formula up myself. I played a points league for years and the formula comes close. I made a few adjustments to project closer to real outcomes. For example, HRs usually get 4 pts. 3.5 is closer to RCL outcomes. The 290 factor for AVG is my best attempt to measure it at a level relative to the other stats.

@Red Sox Talk: I have provided the raw data. I don’t claim to be the end-all evaluator. Encourage you to analyze as you see fit. I had done the work for my own projections. Thought others might be interested in the results.

@Matt L:

Personally, I am kinda tapped out regarding the time I am willing to give to further analysis. The results have me at a spot where I am happy to just run with Grey’s projections. I have annually spent a lot of time crunching projections. I don’t think I can get better than an overall 1.00 to end of year production.

BTW: I will calculate ‘injuries’ in some measure in applying it to newcomers. The average of the top 12 by position for ABs last year were:

C-474
1B-571
2B-586
3B-500
OF-535 (top60)
SS-581

I would guess the 500 for 3B reflects more injuries at that position. So, I will project a newbie SS at a top of 581 AB. For 3B, they get the 500 AB.

12. Tony says:

@Dabulls: he’s good man, you new? i’ve been following razzball for 3-4 yrs now…. Obviously he’s wrong on some guys, no one’s ever right, but he shoots straight and values players very well. He doesn’t just look at what a guy did last year and expect him to repeat or improve.

There’s lots of knowledgeable people on here as well. If you’re new, if not my bad….

13. simply fred says:

If one used Marcel (who predicted lower overall, perhaps taking into account injuries?), he would be applying projections that fell 12% short of end-of-year actual production for ‘healthy’ players (those with 525+ AB).

14. Tony says:

@simply fred: cool man thanks.

I also believe that if you’re a smart fantasy baseballer, you’ll use this site, and check many others. The other perts you listed are good sources, better to be well researched than just go with one source. Also as long as you stay away from ESPN, you’ll do alright lol

15. Tony says:

i dont know about James tho, he’s kind of a whack job if u ask me

16. Brian Lynch says:

He’s as good as they come, especially with his humor. He has had some huge whiffs though the last few years with Mini Donkey, and the ESPN guys were more right about Ryan Howard than he was and he made a pretty big deal of it preseason.

17. simply fred says:

@Tony: That’s been my thinking, and I am sure I will still check ‘em out. The problem I am now having is: even if you look at the other perts, how do you improve on 1.00?

18. simply fred says:

@Brian Lynch: Given the whiffs, he still came closer to end production. Others had huge whiffs as well. (I didn’t set out to prove Grey the best. Just tried to provide a standard to measure all against.)

19. OaktownSteve says:

http://www.insidethebook.com/ee/

Tango Tiger runs a forecasting challenge every year. If you search forecasting on the blog linked you’ll get a flood of information that might be interesting to you.

I will say I would not be surprised if Grey did in fact beat all these forecasts. If you look at the data sets for the mechanized forecasts you can always find some obvious outliers that for whatever reason find flaws in the algorithms and produce some really funky projections. I remember PECOTA had Jake Fox hitting 30 home runs for the A’s a few years back for example. A human is always able to include more factors in an evaluation than the limited inputs available to a forecasting system.

Another thing that throws your analysis off is that projection systems incorporate past injuries into their playing time projections. Take Ian Kinsler’s 2011 for instance: Bill James projects him for 609 PAs, Marcel 494, ZiPS 568. The actual 2011 total is 723.

Grey on the other hand, can project a player out for the full season. Because injuries really cannot be predicted at all, Grey gets a serious advantage in getting to predict a full season worth of production for most players. Even if he tinkers with a few projections for “injury prone” players, the systems are reducing playing time for non-injury prone players who happened to have injuries in prior years.

20. OaktownSteve says:

Oh…one last thing…the real utility of projection systems is to look at player comparisons within the context of that system itself, in my opinion. In other words, not how close to Ian Kinsler’s true numbers will Marcel be, but how does Marcel view Ian Kinsler relative to Robbie Cano.

21. simply fred says:

@OaktownSteve: Agree, comparisons within a system are valuable. Really, the value of these comparisons is limited pretty much just to draft day. Unless one is playing in a league where he has the same fixed roster at the end of the season that he started with at the beginning (who does that?), there are just too many variables (injuries/adds/drops/trades/daily moves) for projections to contribute beyond a certain level. Nuancing analysis much beyond what I have just doesn’t seem as if it would return results worth the effort. Perhaps, “Simply” Fred is deserved.

22. sean says:

I always get a chuckle when Marcel tops the experts because it’s the least complicated projection system ever. Just goes to show you that if it looks like a duck and quacks like a duck it’s probably a duck.

Grey has an exceptional talent of evaluating a player’s worth compared to the general market perception, which in the game that we play, can be even more valuable than projecting year-end stats. The difference between an average projection on a player and an elite one usually isn’t as valuable as being able to navigate through draft day and the player pool with a level of sophistication.

23. tggq21 says:

Fred,

Just wanted to thank you and Grey for War Room once again. Very accurate and led to a championship! Hope you can continue with it!

24. simply fred says:

@OaktownSteve: Of course I have perused Tango (and Others) evaluations. Clearly, any system that rates Marcel high has flaws for me. Also, haven’t seen any that evaluated Grey along with the other perts.

@sean: You nailed it!

@tggq21: we’ll see how the available time presents itself. Thank you!

25. SwaggerJackers says:

For any diehard fantasy baseball fans, I created a slow mock auction baseball draft over at CouchManagers.com. I participated in one of these last year but this is the first time I’ve set one up.

The idea is that we all nominate two players to start. You have $260 to spend total and each time a player’s price goes up, their clock gets 6 hours added back on. I’ll probably start it on Monday. If we don’t have enough people then bots will fill in the gaps.

26. 616Gambit says:

The hole is that you can be way off both on the high and low end but still come out averaging even. A wild swinger is way less helpful than someone that would overvalue everyone but would correctly compare them to each other.

27. simply fred says:

@616Gambit: Agree. You are welcome to the raw data. Just email at above. There is a sheet that has a sample for comparing each pert’s projection per individual stat. I think that may correlate to your point.

28. Joel says:

A more interesting metric would be how far each projection deviated from a player’s numbers. Injuries do matter, which is why Marcel is so good in aggregate.

29. Mitch Moreland says:

Correlation coefficient is a measure of precision. The prior analysis was of accuracy, which is different. Plugging in the spreadsheet that was emailed to me, I calculated the following results. If points measure overall forecast, then Grey was the most precise for the 69 players where everyone had a forecast and where players had 525 or more AB.

Correlation Coefficients for 177 players
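| System | R | HR | RBI | SB | AVG | pts |
| --- | --- | --- | --- | --- | --- | --- |
| Fangraphs | 0.48 | 0.67 | 0.56 | 0.82 | 0.50 | 0.58 |
| Rotochamps | 0.48 | 0.64 | 0.56 | 0.82 | 0.51 | 0.57 |
| James | 0.54 | 0.67 | 0.57 | 0.79 | 0.45 | 0.58 |
| Marcel | 0.52 | 0.64 | 0.55 | 0.79 | 0.46 | 0.55 |
| ZIPS | 0.45 | 0.65 | 0.52 | 0.81 | 0.54 | 0.54 |
| Grey | 0.52 | 0.63 | 0.54 | 0.78 | 0.45 | 0.56 |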

Correlation Coefficients (525 AB) for 69 players
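| System | R | HR | RBI | SB | AVG | pts |
| --- | --- | --- | --- | --- | --- | --- |
| Fangraphs | 0.47 | 0.79 | 0.66 | 0.87 | 0.53 | 0.66 |
| Rotochamps | 0.34 | 0.76 | 0.67 | 0.86 | 0.52 | 0.60 |
| James | 0.63 | 0.79 | 0.69 | 0.86 | 0.52 | 0.66 |
| Marcel | 0.31 | 0.71 | 0.58 | 0.84 | 0.43 | 0.43 |
| ZIPS | 0.56 | 0.80 | 0.70 | 0.88 | 0.60 | 0.65 |
| Grey | 0.51 | 0.79 | 0.73 | 0.85 | 0.49 | 0.67 |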

30. simply fred says:

@Mitch Moreland: A number of you CORRECTLY asked for MORE in order to get a more TRUE evaluation. Fortunately, Mitch Moreland stepped to the plate. Mitch was kind enough to bring me up to speed. Thank you Mitch! He provided the following light:

“Accuracy (my attempt): on average the darts hit the bullseye. Maybe none of the darts land on the bullseye, but on average they do.

“Precision: when you want to throw higher, the dart generally goes higher.

For forecasts, precision is the key.”
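The dart analogy can be made concrete with a small Python sketch. All of the numbers are hypothetical: one made-up forecaster is 20% too high on every player, another is all over the place but averages out near 1.00. The `pearson_r` helper is an ordinary correlation coefficient written out by hand.

```python
import math
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mean_ratio(projected, actual):
    """Average of projected / actual, the article's accuracy measure."""
    return mean([p / a for p, a in zip(projected, actual)])


# Hypothetical end-of-year values for four players
actual = [300, 250, 200, 150]

always_high = [a * 1.2 for a in actual]  # 20% high on everyone
wild = [200, 350, 150, 200]              # big misses in both directions

for name, proj in (("always high", always_high), ("wild", wild)):
    print(f"{name:<12} ratio={mean_ratio(proj, actual):.2f} "
          f"r={pearson_r(proj, actual):.2f}")
```

Here the "always high" forecaster looks worse on the average ratio but orders the players perfectly, while the "wild" one looks accurate on average and ranks them poorly. That is the distinction between accuracy and precision being drawn in the quote.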

After a little give and take between us, he moved to the following in place of his correlation coefficients above. This was done primarily to achieve an overall, reliable, value for each pert.

Mitch calculated r-squared values and averaged them for a total rating for the precision of each pert. Further, he provided my original spreadsheet with his work at the bottom of the summary page. This is the place you want to look for the measure of the precision of the perts’ projections.
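One plausible reading of "r-squared values averaged" can be sketched in Python from the numbers already posted. This is an assumption about the method, not Mitch's actual spreadsheet work: it squares the rounded per-stat correlation coefficients from the 525 AB table in comment 29 and averages them across the five categories, so the results are approximate.

```python
from statistics import mean

# Per-stat correlation coefficients (R, HR, RBI, SB, AVG) copied from the
# 525 AB table in comment 29; the "pts" column is left out
r_by_system = {
    "Fangraphs":  [0.47, 0.79, 0.66, 0.87, 0.53],
    "Rotochamps": [0.34, 0.76, 0.67, 0.86, 0.52],
    "James":      [0.63, 0.79, 0.69, 0.86, 0.52],
    "Marcel":     [0.31, 0.71, 0.58, 0.84, 0.43],
    "ZIPS":       [0.56, 0.80, 0.70, 0.88, 0.60],
    "Grey":       [0.51, 0.79, 0.73, 0.85, 0.49],
}

# Square each coefficient, then average across the five stats
avg_r_squared = {
    name: mean([r * r for r in rs]) for name, rs in r_by_system.items()
}

# Highest average r-squared first
ranked = sorted(avg_r_squared, key=avg_r_squared.get, reverse=True)
for name in ranked:
    print(f"{name:<11}{avg_r_squared[name]:.3f}")
```

Averaging across stats weights all five categories equally, which is a different choice from the points formula in the article, where the categories carry different weights.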

Avg of R-sq (over 525 AB):

ZIPS

31. Grey says: