Evaluating the Preseason MLB Forecasts

Vegas Watch published its third annual review of preseason predictions today and again found that the computer projections are more accurate than human predictions. CAIRO comes out on top this season with a root-mean-square error (RMSE) of 9.24 games; its closest competitor had an RMSE of 9.60.

The FEIN projections I released in March and April fell right in between the CHONE and THT projections in terms of accuracy, with an RMSE of 9.85. I had also said at the time that the Royals’ forecast was optimistic by one or two games, since I had accidentally projected them as an NL team. Applying the 1.5-win drop I estimated for Kansas City based on that error would lower FEIN’s RMSE to 9.76, good for third place, just ahead of Marcel.
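For anyone who wants to check numbers like these at home, here is a minimal sketch of how an RMSE over team win totals could be computed. The win totals are made up for illustration (they are not the FEIN, CAIRO, or any other real projections), and the `rmse` helper is a hypothetical name.

```python
import math

def rmse(projected, actual):
    """Root-mean-square error between two equal-length lists of win totals."""
    if len(projected) != len(actual):
        raise ValueError("projected and actual must have the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(projected, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up win totals for four hypothetical teams (illustration only).
projected_wins = [90, 81, 75, 68]
actual_wins = [95, 78, 75, 60]

print(round(rmse(projected_wins, actual_wins), 2))  # 4.95
```

Because the errors are squared before averaging, one badly missed team moves the RMSE more than several small misses, which is why a single correction like the Royals adjustment can shift the total.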

After the season, I’ll compare the individual hitter and pitcher projections to what actually happened on the diamond.

Team-by-team forecasted and actual results after the jump.

Read More »

Fantasy Points and Playing Time

I’ve had a suspicion for some time that a player’s value in fantasy football is nothing more than the number of touches he gets, no matter how good or bad he is. A running back with 250 attempts would have to be 20 percent better per carry than a back with 300 carries just to match his production. If the 300-attempt back averaged 4.2 yards per carry (YPC), the 250-attempt rusher would need about 5.0 YPC to gain the same number of rushing yards.
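The arithmetic behind that comparison can be sketched in a few lines. The function name `required_ypc` is hypothetical; the carries and YPC figures are the ones from the example above.

```python
def required_ypc(high_volume_carries, high_volume_ypc, low_volume_carries):
    """YPC the lower-volume back needs to match the other back's rushing yards."""
    target_yards = high_volume_carries * high_volume_ypc
    return target_yards / low_volume_carries

needed = required_ypc(300, 4.2, 250)
print(round(needed, 2))            # 5.04
print(round(needed / 4.2 - 1, 2))  # 0.2, i.e. 20 percent better per carry
```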

I looked at all players since 1980 with a minimum number of touches (pass and rush attempts plus receptions; 200 for QBs, 100 for RBs, and 40 for WRs) and found the correlation between playing time and total fantasy points.

The numbers shown below are r-squared values, which give the share of the variance in fantasy points that playing time explains. In other words, if the r-squared is .50, then playing time explains 50 percent of the variance in players’ fantasy points, and skill, luck, defenses faced, and other factors explain the other 50 percent.
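As a rough sketch of where an r-squared like this comes from, the snippet below computes the Pearson correlation between touches and fantasy points and squares it. The five data points are invented for illustration and are not from the study; `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)

# Invented sample: touches and season fantasy points for five hypothetical backs.
touches = [120, 180, 240, 300, 350]
fantasy_points = [70, 130, 140, 230, 210]

r = pearson_r(touches, fantasy_points)
r_squared = r ** 2  # share of variance in fantasy points explained by touches
print(round(r_squared, 2))
```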

Read More »

When to Go for It on Fourth Down

Here’s a great series of articles by Brian Burke. He concludes that you should go for it when you have two yards or less to go to the first-down marker everywhere on the field except between your own five- and 10-yard-lines.

As he notes in the comments,

Here’s another thought: Imagine there is no punt in the rulebook, and then one day it’s invented. A guy like me comes up to a coach and says, ‘Kick the ball on every 4th down and the other team gets it 35 yds further down the field.’

The coach would think I’m crazy. “Wait, you want me to give up 25% of my opportunities for a first down on every series…just for 35 yards of field position? Do you realize how much that’s going to kill our chances of scoring?”

2008 Median Fantasy Points

The 2009 season has already started, but I thought it would be fun to look at last year’s median fantasy points.

The median is a better predictor of future success because, as Wikipedia notes, it’s "also the central point which minimizes the average of the absolute deviations." In other words, if you had to choose one number to retrodict a player’s fantasy points in any given game, the median would miss by less, on average, than the player’s average fantasy points.

The median also holds up better when game-by-game totals are skewed by an outlier: if a player has 15 games with eight fantasy points and one game with 40, his average is 10 fantasy points per game, yet his median is eight, which is what he scored in all but one game.
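The 16-game example can be checked directly. This is a small sketch using Python's standard `statistics` module; `mean_abs_error` is a hypothetical helper that measures how far a single guess is from each game's total on average.

```python
from statistics import mean, median

# 15 games with eight fantasy points and one game with 40.
game_points = [8] * 15 + [40]

def mean_abs_error(guess, values):
    """Average absolute miss when using one number to guess every game."""
    return sum(abs(v - guess) for v in values) / len(values)

avg = mean(game_points)    # the average works out to 10
med = median(game_points)  # the median works out to 8

print(mean_abs_error(avg, game_points))  # 3.75
print(mean_abs_error(med, game_points))  # 2.0
```

Guessing the median misses by two points per game on average, while guessing the mean misses by 3.75, which is the sense in which the median "minimizes the average of the absolute deviations."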

Read More »

NFL: Regression to the Mean, Sample Size, and In-Season Projections

What if I told you that Adrian Peterson isn’t as good as his stats say?

My reasoning is the Curse of the Leading Rusher. You’ve never heard of it before, but it’s an obvious trend. Since 1980, the NFL’s leading rusher has seen his rushing yards fall by an average of 489 yards and his YPC by almost half a yard just one season later. Only six of the 31 leading rushers even increased their rushing yards the following season, and nine had fewer than 1,000 yards.

Convinced? You shouldn’t be. Their decline is nothing more than regression to the mean and a lack of sample size. Let me explain.
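A toy simulation makes the point without any real players. In the sketch below, every number is an assumption chosen for illustration: 32 hypothetical backs each have a fixed "true talent" in rushing yards, and each season's total is that talent plus random luck. Nothing about any player changes between seasons, yet the Year 1 leader still declines on average.

```python
import random

def simulate_leader_change(seasons=2000, players=32, seed=0):
    """Average Year 2 minus Year 1 yards for the simulated Year 1 leader."""
    rng = random.Random(seed)
    total_change = 0.0
    for _ in range(seasons):
        # Assumed talent spread and luck spread; both are made-up parameters.
        talent = [rng.gauss(1000, 150) for _ in range(players)]
        year_one = [t + rng.gauss(0, 200) for t in talent]
        year_two = [t + rng.gauss(0, 200) for t in talent]
        leader = max(range(players), key=lambda i: year_one[i])
        total_change += year_two[leader] - year_one[leader]
    return total_change / seasons

average_change = simulate_leader_change()
print(average_change < 0)  # True: the leader drops off with no change in talent
```

The leader is usually a good player who also had good luck, and only the talent carries over to the next season.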

Read More »

What is the Best Predictor of Fantasy Points?

I looked at all quarterbacks with 250 attempts with the same team in two straight years and found the correlation between their various year-one stats and their year-two fantasy points and fantasy points per attempt. Here are the results:

Read More »

2009 NFL Win Predictions

It’s hard to make accurate preseason predictions. Last year, Sports Illustrated’s record predictions had a root-mean-square error (RMSE) of 3.50 wins. Heck, if you had projected each team to win eight games, you’d have been off by 3.27 wins each. And if you had used the method below, you’d have been 3.14 wins away on average.

(DISCLAIMER: I don’t think that these predictions are superior to or any more correct than anybody else’s.)

Looking at all teams since 1994, I ran a regression on their Year X stats to predict their Year X+1 wins. The formula is shown below. The r-squared was .107, with an RMSE of 2.84.
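To be clear, the following is not the formula from this post. It is only a minimal sketch of how a one-variable least-squares regression, its r-squared, and its RMSE fit together. The single predictor (a Year X point differential) and all six data points are hypothetical, as are the helper names `fit_line` and `fit_quality`.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def fit_quality(xs, ys, slope, intercept):
    """Return (r_squared, rmse) of the fitted line on the same data."""
    mean_y = sum(ys) / len(ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot, math.sqrt(ss_res / len(ys))

# Hypothetical Year X point differentials and Year X+1 win totals.
year_x_point_diff = [-120, -60, -10, 30, 80, 150]
year_x1_wins = [6, 5, 9, 7, 10, 9]

slope, intercept = fit_line(year_x_point_diff, year_x1_wins)
r_squared, fit_rmse = fit_quality(year_x_point_diff, year_x1_wins, slope, intercept)
print(round(r_squared, 3), round(fit_rmse, 2))
```

An RMSE measured on the same seasons used to fit the line will generally look better than the error on a new season.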

Read More »

NFL: Weighting Game-by-Game Stats (Part 2)

Saturday, I examined how much more late-season stats should be weighted than Week One or Two stats. I found that first-week stats are about 20 percent less significant in future projections than Week 17 stats.

The table below shows the 15 most positively and negatively affected players when we weight their 2008 game-by-game performance.

wtFanPt = weighted fantasy points
FanPt = total unweighted fantasy points
Diff = difference between weighted and unweighted fantasy points. A positive value means the player performed better late in the season than early in the year.
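Here is one way the three columns could be computed. This is only a sketch under stated assumptions: the weights rise in a straight line from 0.8 in Week 1 to 1.0 in Week 17 (matching the roughly 20 percent gap mentioned above), and they are rescaled to average 1 so that a perfectly steady player has a Diff of zero. The actual weighting curve and scaling are not given here, and the function names are hypothetical.

```python
def week_weight(week, first_week_weight=0.8, last_week=17):
    """Assumed linear ramp from first_week_weight in Week 1 to 1.0 in the last week."""
    return first_week_weight + (1.0 - first_week_weight) * (week - 1) / (last_week - 1)

def weighted_points(games):
    """games is a list of (week, fantasy_points); returns (wtFanPt, FanPt, Diff)."""
    weights = [week_weight(week) for week, _ in games]
    scale = len(weights) / sum(weights)  # rescale so the weights average 1
    wt_fan_pt = sum(w * scale * pts for w, (_, pts) in zip(weights, games))
    fan_pt = sum(pts for _, pts in games)
    return wt_fan_pt, fan_pt, wt_fan_pt - fan_pt

# Two invented players with the same season total but opposite trends.
late_bloomer = [(week, 5 if week <= 8 else 15) for week in range(1, 17)]
fast_starter = [(week, 15 if week <= 8 else 5) for week in range(1, 17)]

for name, games in [("late bloomer", late_bloomer), ("fast starter", fast_starter)]:
    wt, total, diff = weighted_points(games)
    print(name, round(wt, 1), total, round(diff, 1))
```

The late bloomer ends up with a positive Diff and the fast starter with a negative one, even though both scored 160 unweighted points.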


Read More »

NFL: Weighting Game-by-Game Stats

While I was working on my NFL projections, I used a custom weight for each stat, chosen by testing seasons since 1980 to find which weight gave the lowest error between projected and actual results. The yearly weight for each stat varies with that stat’s year-to-year (y-t-y) correlation; for instance, interception percentage, which has a low y-t-y correlation, has a weight of .9, while receiving yards per catch, which has a high y-t-y correlation, has a weight of .36.

Coincidentally, fantasy points at every position have weights at or around .5 (.5 for QBs and .48 for both RBs and WRs).

During the season, those weights mean nothing. Last year’s stats may have a weight of .5, but what about two weeks ago? Ten weeks ago?

Read More »

The Next Popular Sabermetric Stat

Here’s Joe Posnanski:

I continue to look for an extremely simple one-stop-shopping stat that could replace OPS. I would LOVE to get behind one. Of course I love Base Runs because it’s so mind-boggling accurate, but it’s complicated*. Even simple runs created is a really good stat, obviously, but it just seems to scare people.

*Of course, so is passer rating and for some reason people cite that all the time.

Wins Above Replacement is the most obvious choice, as it encompasses both offense and defense and includes a positional adjustment. It’s easy to explain that WAR equals batting runs above average plus fielding runs above average plus a positional adjustment plus replacement level.
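That sum is simple enough to write down directly. In this sketch the four components are in runs, as described above, and the conversion of roughly 10 runs per win is a common rule of thumb that I am assuming here rather than something stated in this post. The player's numbers are hypothetical.

```python
def wins_above_replacement(batting_runs, fielding_runs, positional_runs,
                           replacement_runs, runs_per_win=10.0):
    """Add the four run components and convert runs to wins.

    runs_per_win=10.0 is an assumed rule-of-thumb conversion.
    """
    total_runs = batting_runs + fielding_runs + positional_runs + replacement_runs
    return total_runs / runs_per_win

# Hypothetical player: +25 batting, +5 fielding, -2.5 positional, +20 replacement level.
war = wins_above_replacement(25.0, 5.0, -2.5, 20.0)
print(war)  # 4.75
```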

Read More »