Sunday, I released my MLB projections based on a computer model. The individual projections can then be added up by team to create projected standings.
Before I add up any projections, I first adjust any player’s projected playing time that may be off. As I mentioned before, the projected playing time only looks at a player’s past playing time and adjusts for age. So a player such as Jose Reyes, who had about 700 plate appearances in each of his four seasons as a starter before missing 126 games last year, may be projected for less playing time than he will actually get. I also look for players like Rangers OF Julio Borbon, who will receive a starting role for the first time in his major league career.
Once each player is assigned their proper role and playing time forecast, I add up the projected stats for each team. If a team comes up over 4120 outs (AB minus H plus CS) or 1440 IP, I prorate their stats down to those levels; if a team falls short of those benchmarks, I assign replacement-level production for the remaining outs.
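Here is a minimal sketch of that normalization step for hitters, assuming team totals are kept in a plain dictionary. The function names and the replacement-level per-out rates are hypothetical placeholders, not the values actually used in the projections; the rates are chosen only so that each added "out" adds exactly one AB minus H plus CS.

```python
TARGET_OUTS = 4120  # team benchmark for batting outs, from the text


def team_outs(stats):
    """Outs as defined in the text: AB minus H plus CS."""
    return stats["AB"] - stats["H"] + stats["CS"]


def normalize_to_target(stats, repl_per_out, target=TARGET_OUTS):
    """Prorate a team down to the target, or pad it with replacement-level stats."""
    outs = team_outs(stats)
    if outs > target:
        # Too much projected playing time: scale every stat down proportionally.
        scale = target / outs
        return {k: v * scale for k, v in stats.items()}
    # Too little: fill the missing outs with replacement-level production.
    missing = target - outs
    return {k: v + repl_per_out.get(k, 0.0) * missing for k, v in stats.items()}


# Hypothetical replacement-level rates per out (1.32 - 0.33 + 0.01 = 1 out).
REPL_PER_OUT = {"AB": 1.32, "H": 0.33, "CS": 0.01, "HR": 0.02}

over = {"AB": 5700, "H": 1500, "CS": 50, "HR": 180}   # 4250 outs
under = {"AB": 5400, "H": 1400, "CS": 40, "HR": 150}  # 4040 outs

print(team_outs(normalize_to_target(over, REPL_PER_OUT)))
print(team_outs(normalize_to_target(under, REPL_PER_OUT)))
```

Both teams should come out at 4120 outs, up to floating-point rounding. The same idea would apply to pitchers with the 1440 IP benchmark.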
I use a Base Runs equation to calculate projected runs scored for offense, and for the pitching staff I simply look at runs allowed. Win percentage is found using Pythagenpat. Then I adjust each team’s win percentage by their schedule to come up with a final tally.
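To make the runs-to-wins step concrete, here is a rough sketch. The exact Base Runs equation used for these projections isn't given, so this assumes one commonly cited basic version of the formula, plus a Pythagenpat exponent of 0.287; the team totals are made up, and the schedule adjustment is left out.

```python
def base_runs(ab, h, tb, bb, hr):
    """One basic Base Runs version: BsR = A * B / (B + C) + D (an assumed variant)."""
    a = h + bb - hr                                          # baserunners
    b = (1.4 * tb - 0.6 * h - 3.0 * hr + 0.1 * bb) * 1.02    # advancement
    c = ab - h                                               # outs
    d = hr                                                   # automatic runs
    return a * b / (b + c) + d


def pythagenpat(runs, runs_allowed, games=162, k=0.287):
    """Win percentage with an exponent that floats with the run environment."""
    exponent = ((runs + runs_allowed) / games) ** k
    return runs ** exponent / (runs ** exponent + runs_allowed ** exponent)


# Hypothetical team: projected batting line and projected runs allowed.
runs_scored = base_runs(ab=5500, h=1450, tb=2300, bb=550, hr=170)
win_pct = pythagenpat(runs_scored, runs_allowed=720)
print(round(runs_scored), round(win_pct * 162, 1))
```

A team that scores exactly as many runs as it allows comes out at .500 regardless of the exponent, which is a handy sanity check.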
Here are the results:
In every aspect, these projections are better than last year’s. Why?
1. Custom weights. Each stat is weighted based on error tests from 1970 onward; for instance, BABIP for hitters is weighted at 0.88, while strikeout rate, which is more stable year-to-year than BABIP, is weighted at 0.49 (i.e., 2009 has a weight of 1, 2008 has a weight of 0.49, etc.). I use the past four years’ stats for hitters and three years for pitchers.
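As an illustration of how a per-stat weight changes the blend, here is a small sketch. It assumes the "etc." means the weight decays geometrically (1, w, w squared, and so on) and it ignores playing time and regression to the mean, so treat it as a simplification rather than the model itself. The sample rates are invented.

```python
def weighted_rate(rates_recent_first, decay):
    """Blend yearly rates, weighting the year k seasons back by decay ** k."""
    weights = [decay ** k for k in range(len(rates_recent_first))]
    total = sum(w * r for w, r in zip(weights, rates_recent_first))
    return total / sum(weights)


# Hypothetical hitter whose rate jumped in the most recent season.
rates = [0.300, 0.200, 0.200, 0.200]  # most recent year first

# A 0.49 weight leans hard on the latest season; 0.88 spreads the weight
# almost evenly across all four years.
print(round(weighted_rate(rates, 0.49), 3))
print(round(weighted_rate(rates, 0.88), 3))
```

With the lower weight, the blended rate lands closer to the most recent season; with the higher weight, it sits closer to the plain four-year average.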
Vegas Watch published its third annual preseason predictions review today and again found that the computer projections are more accurate than human predictions. CAIRO comes out on top this season, with an average root-mean-square-error of 9.24 games, its closest competitor having an RMSE of 9.60.
The FEIN projections I released in March and April fell right in between the CHONE and THT projections in terms of accuracy, with a 9.85 RMSE. But I also said that the Royals’ forecast was optimistic by one or two games since I had accidentally projected them as an NL team; I had estimated a drop of 1.5 wins for Kansas City based on the error, which would lower FEIN’s RMSE to 9.76, good for third place just ahead of Marcel.
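For anyone unfamiliar with the metric, RMSE is just the square root of the average squared miss across teams. A quick sketch, using invented win totals rather than the actual projections:

```python
import math


def rmse(projected, actual):
    """Root-mean-square error between projected and actual win totals."""
    squared_errors = [(p - a) ** 2 for p, a in zip(projected, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical three-team league.
projected_wins = [92, 81, 70]
actual_wins = [95, 75, 78]

print(round(rmse(projected_wins, actual_wins), 2))

# Correcting one team's projection toward its actual total lowers the RMSE.
corrected = [92, 81, 71.5]
print(round(rmse(corrected, actual_wins), 2))
```

Because the errors are squared, big misses on one or two teams move the number much more than small misses spread across the league.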
After the season, I’ll compare the individual hitter and pitcher projections to what actually happened on the diamond.
Team-by-team forecasted and actual results after the jump.
Here’s Joe Posnanski:
I continue to look for an extremely simple one-stop-shopping stat that could replace OPS. I would LOVE to get behind one. Of course I love Base Runs because it’s so mind-boggling accurate, but it’s complicated*. Even simple runs created is a really good stat, obviously, but it just seems to scare people.
*Of course, so is passer rating and for some reason people cite that all the time.
Wins Above Replacement is the most obvious choice, as it encompasses both offense and defense and includes a positional adjustment. It’s easy to explain that WAR equals batting runs above average plus fielding runs above average plus a positional adjustment plus replacement level.
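That sum can be sketched in a few lines. The components are in runs, so the last step converts to wins; the 10 runs per win used here is a common rule of thumb and an assumption on my part, and the player's numbers are invented.

```python
def war(batting_runs, fielding_runs, positional_runs, replacement_runs,
        runs_per_win=10.0):
    """Add up the run components, then convert runs to wins."""
    total_runs = batting_runs + fielding_runs + positional_runs + replacement_runs
    return total_runs / runs_per_win


# Hypothetical player: +20 batting, +5 fielding, -2.5 positional adjustment,
# and +20 runs for playing a full season above replacement level.
print(war(20, 5, -2.5, 20))
```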
Do pitchers play better under pressure? Chris Jaffe looked at whether pitchers perform better when facing their 20th loss and found no large uptick in performance (hits allowed, strikeouts, win percentage, etc.).
He concludes that, in all starts when a pitcher has 19 losses, hits, strikeouts, walks, and home runs stay more or less the same, but their ERA falls almost a quarter of a point. Looking at just the first start in which a pitcher has 19 losses, however, there’s no difference in ERA:
Rate    Key    All    Adj
H/9     9.29   9.16   9.21
W/9     3.33   3.15   3.17
K/9     5.07   5.17   5.14
HR/9    0.89   0.90   0.90
R/9     4.48   4.60   4.66
ER/9    4.03   4.04   4.10
John Benson over at THT calculates Marcel projections as if a human controlled the weighting—a yearly weighting of 80%/15%/5%. He says that Jason Bartlett’s projected OPS would be 80 points higher with the more nearsighted weighting.
Another guy with a career year is Ben Zobrist, who is third in the AL with a .961 OPS. Benson calculates a 49-point oversight with the human weighting as opposed to Marcel’s weighting.
Baseball America’s Aaron Fitt says that No. 1 overall pick Stephen Strasburg signed a four-year deal worth $15.67 M with the Washington Nationals.
At about $3.9 million per year, Strasburg would have to perform as a 0.9 WAR player per year to be worth the money, assuming a rate of $4.4 million per win. That equates to an ERA of 4.77 in 150 innings or a 4.87 ERA in 180 innings in the NL, benchmarks the Nationals are certainly expecting from Strasburg.
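The break-even arithmetic is simple enough to check directly. This sketch uses only the contract figures and the dollars-per-win rate quoted above; the variable names are mine.

```python
contract_total = 15.67   # millions of dollars, per the reported deal
contract_years = 4
dollars_per_win = 4.4    # millions of dollars per win, as assumed in the text

salary_per_year = contract_total / contract_years
break_even_war = salary_per_year / dollars_per_win

print(round(salary_per_year, 2))   # about 3.92
print(round(break_even_war, 2))    # about 0.89
```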
A reader on Rob Neyer’s blog said that Todd Helton deserved the 2000 NL MVP because his road stats were better than every other candidate’s road splits. Neyer brings up the fact that Helton finished the year with eight fewer Win Shares than Jeff Kent, and “the right man won the award.”
Of course, Win Shares may not be more accurate than WAR, or wins above replacement. Devil_fingers points out that Helton finished that year with the most WAR in the NL (but by such a trivial margin above Barry Bonds—one run—that it’s too close to call) and that Jeff Kent ended up nine runs back of Helton.
On a similar note, Dave Studeman ran the numbers and found a .96 r-squared between WAR and Win Shares since 1900. But it’s the other four percent that is the difference between Kent being eight Win Shares (two and two-thirds wins) above Helton, and Helton being .9 WAR above Kent.
With a 10-7 record, Vazquez has not grabbed many headlines. However, you can make a legit case for him being the NL’s best pitcher not named Tim Lincecum. His 2.62 FIP is second best in the NL behind Lincecum’s freakish 1.96, and third in the majors behind Lincecum and Zack Greinke (his xFIP is also second to Lincecum in the majors). His sensational K/BB of 5.34 is actually ahead of Lincecum and Greinke, and ranks third best in baseball behind Dan Haren and Roy Halladay. He is the only NL pitcher outside of Lincecum to rank in the top five in FIP, tRA, tRA* and xFIP.
I think Vazquez might be my favorite major league pitcher, simply because he’s been so good yet no one notices. He had a 2.95 ERA, 1.05 WHIP, and a 5.91 K/BB ratio in the first half but wasn’t elected to the All-Star game!
Good post at The Book Blog by MGL, pointing out that it’s impossible for any player to be “consistent”:
If his true K rate is 1 per 10 PA and he is so consistent that that never changes, he is still subject to a binomial standard deviation around that “p.” Same for HR rate, hit rate, BB rate, etc. There is nothing that he can do about it, no matter how consistent he may be skill-wise. If a player actually has been incredibly consistent over the course of his career or some long time period, it HAS to be a fluke.
Indeed, a simple binomial distribution in Excel shows us that a true .300 hitter will get three hits in 10 at-bats only 27 percent of the time, hitting .200 or .400 43 percent of the time.
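The same numbers fall out of a few lines of Python instead of Excel. This sketch treats each at-bat as an independent trial with a fixed .300 chance of a hit, which is exactly the simplification the binomial model makes.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


true_avg = 0.300
at_bats = 10

exactly_three = binom_pmf(3, at_bats, true_avg)
two_or_four = binom_pmf(2, at_bats, true_avg) + binom_pmf(4, at_bats, true_avg)

print(round(exactly_three, 3))  # about 0.267
print(round(two_or_four, 3))    # about 0.434
```

Stretching the sample to 100 at-bats narrows the spread in batting average, but hitting exactly .300 becomes even less likely, which is the point: some scatter is unavoidable even with perfectly constant skill.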
