Sportsimulacrum

Post Info TOPIC: ZiPS Projections Information


ZiPS Projections Information


Starting with the 2011 MLB season, Diamond Mind Baseball's official projection disk has been produced using Dan Szymborski's ZiPS projections.

Note about ZiPS at Baseball Think Factory (BBTF)

ZiPS projections are Dan Szymborski's computer-based projections of performance. Performances have not been allocated to predicted playing time in the majors - many of the players listed above are unlikely to play in the majors at all in 2009. ZiPS is projecting equivalent production - a .240 ZiPS projection may end up being .280 in AAA or .300 in AA, for example. Whether or not a player will play is one of many non-statistical factors one has to take into account when predicting the future. Players are listed with their most recent teams unless Dan has made a mistake. This is very possible as a lot of minor-league signings are generally unreported in the offseason. ZiPS is projecting based on the AL having a 4.46 ERA and the NL having a 4.41 ERA. Players that are expected to be out due to injury are still projected. More information is always better than less information and a computer isn't what should be projecting the injury status of, for example, a pitcher with Tommy John surgery. Positional offense is ranked by RC/27 and divided into quintiles based on what the most frequent starting players at each position did in 2007-2009. Excellent is the top quintile, Very Good the 2nd quintile and so on.

ZiPS FAQ (2008); BBTF link.
-- As noted, starting with the 2011 season, Dan Szymborski will be producing the official projection disk for Diamond Mind Baseball, where the disk will be available for purchase.

Dan Szymborski of BaseballThinkFactory.org puts ZiPS out annually. They're based on three or four years of weighted data depending on a player's age and he uses various 'growth and decline' curves based on the type of player. 'I don't try to find particularly similar players but instead large groups with similar characteristics, such as K rate for pitchers, Speed Score for batters, [batting average on balls in play] BABIP for batters, handedness, and a lot of other stuff.' Pitching projections do take DIPS theory into account by not only regressing BABIP toward the mean but also by taking into account handedness, knuckleballs, and groundball-to-fly ball ratios. It's worth noting that ZiPS does not attempt to project playing time and of the four projection systems, it has the most players with 995 batters and 989 pitchers, many of whom have yet to play in the majors. [from SI.com]

ZiPS Projections at FanGraphs

In-Season ZiPS Projections at FanGraphs

 

ZiPS Projection Downloads

2012 ZiPS Projections (first build and final build)

-- 2012 ZiPS Projections (final build)

-- 2012 ZiPS Projections for Office 2007 .xlsx format (first build)

-- Comma-delimited batters (first build)

-- Comma-delimited pitchers (first build)

* 2011 ZiPS Projection Disk is the official Diamond Mind Baseball Projection Disk, available for purchase from Diamond Mind.

2011 ZiPS Projections, Final Edition  (Link to spreadsheet is inactive.)

2011 ZiPS Projections, 3/25/11 Build  (Link to spreadsheet is inactive.)

** Instructions for installing ZiPS Projection Disk

2010 ZiPS Projection Disk (converted to v10); note at Baseball Think Factory (link to file inactive)

2010 ZiPS Projection Disk (v9) - Final Build
-- 2010 ZiPS Projections - Build 2
-- 2010 ZiPS Projections Spreadsheet, Build 2
-- 2010 ZiPS Projections - Build 1
-- 2010 ZiPS Projection Spreadsheet, Build 1 (spreadsheet removed)

2009 ZiPS Projection Disk
-- 2009 ZiPS Projections Spreadsheet

2008 ZiPS Projection Disk
-- 2008 ZiPS Projections, Final Spreadsheet

2007 ZiPS Projection Disk and Spreadsheet

2006 ZiPS Projection Disk and Spreadsheet

2005 ZiPS Projection Disk

2005-2009 ZiPS Projection Disks (converted to v10)

 

-- Dan Szymborski on ZiPS and platoon splits:

It's a question of math. If I take the 500 players that have the longest careers that I did not give splits to, I would expect 2009 platoon splits to be closer to generic platoon splits than platoon splits generated from their careers for 79% of players. The odds that percentage drops below 50% are 1-in-86, so I'm pretty confident that I'm likely to be closer with more players. Jed Lowrie has a better chance of hitting 45 home runs than his platoon splits from 2009 being, over a full season, as large as they were in 2008. I'm sure you wouldn't take his projection seriously if I projected Lowrie to hit 45 home runs, so why would you like it if I gave him splits that were even less likely to be an accurate representation of his abilities? It's probably a philosophical difference that we can't resolve, but I don't know what's so fun about codifying randomness into fact. Carlos Zambrano threw no-hitters in 3.3% of his starts last season; would you really want there to be a 3.3% chance of him throwing a no-hitter in his DMB starts? It's false precision. I know when I play a projection disk, I want to be dealing with the same questions that a manager would face. Projected platoon splits for the majority of players, because they're turning randomness into a reality, provide a layer of exploitation that nobody would have available to them in real life.

 

-- 2011 ZiPS, now with (regressed) platoon splits for all:

 

"What's changed? I've added projected platoon splits for everyone. These are heavily regressed (as platoon splits ought to be). There's also about a dozen new players, guys who are on 40-man rosters and are projectable (sorry, Bryce Harper fans), like Trystan Magnuson and Joe Paterson. These dozen players do not have projected splits as they were last minute additions."

"... Not so much a shift in my thinking, but a more rigorous model. Before, I was simply doing players over a certain threshold. This year, I've included splits regressed towards generic (the underlying fact that generic platoon splits are better predictors of future platoon splits than actual platoon splits remains true). I wrote a generalized regression model so there's more shading. Instead of either/other, a player with 1200 professional PA will have his platoon splits move most of the way to generic while a player with 3000 PA will be about 50/50 while guys like Jim Thome will have platoon splits that more reflect their actual than generic. Now that I have the ability to get all those platoon splits onto the disk (well, Luke does), I can do what I believe is the best possible solution. Short-term extreme platoon splits are still going to be ironed out for obvious reasons (we're looking forward, not back) and some are going to be unhappy there's less opportunity to exploit these short-term extreme platoon splits, but I feel this is the most intellectually honest way to do it. Getting switch-hitters right was the hard part - I had to get the splits for every switch-hitter going back to the start of the retrosheet era to develop a probabilistic model since you don't have anything easy to regress towards."
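
The shading described in the quote above can be sketched as a simple shrinkage toward the generic split. This is only an illustration, not Dan's actual regression model: the weight formula PA / (PA + k), the constant k = 3000 (picked so that a 3000-PA player lands at 50/50, as the quote says), and the example split values are all assumptions.

```python
def own_weight(pa, k=3000.0):
    """Share of the blend that comes from the player's own split.

    Hypothetical shrinkage form: pa / (pa + k). k=3000 is an assumption
    chosen so that 3000 PA gives an even 50/50 blend.
    """
    if pa < 0:
        raise ValueError("pa must be non-negative")
    return pa / (pa + k)


def regressed_split(observed_split, generic_split, pa, k=3000.0):
    """Blend a player's observed platoon split with the generic split."""
    weight = own_weight(pa, k)
    return weight * observed_split + (1.0 - weight) * generic_split


if __name__ == "__main__":
    generic = 0.020   # made-up generic platoon split
    observed = 0.080  # made-up extreme observed split
    for pa in (1200, 3000, 10000):
        blended = regressed_split(observed, generic, pa)
        print(pa, round(own_weight(pa), 3), round(blended, 4))
```

Under these assumptions a 1200-PA player keeps roughly 29% of his own split and a 10,000-PA player roughly 77%, which matches the direction of the quote (short careers move most of the way to generic, very long careers mostly keep their own splits). Switch-hitters would need separate handling, as the quote notes.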

 

Defensive and other "subjective" ratings in ZiPS Projection Disks
-- From old DMB Forum:

"I use a combination of UZR, Dial's LWZR, and PMR, wherever available, with scouting reports breaking close ties between tiers. For minor leagues, scouting is a little more important because of the lack of quality defensive data, and that's combined with a minor league DR estimator from play-by-play data from Jeff Sackmann (both Sean Smith and I made our own, almost identical systems, in November '07 when Sackmann had stopped calculating his). I tend to be very conservative at assigning defensive ratings. At first, only Pujols gets an EX for range and no other position has more than 4 given out."

-- Also from Dan Szymborski:

"I evaluate the defensive ratings every single season. I'm conservative about arm rating as arms don't really change all that quickly unless there's an injury. For the defensive ratings, I use a combination of three year Dial and Lichtman ZR translations and +/- with scouting reports 'breaking ties' and for minor leaguers, a combination of scouting reports and a rough ZR I developed from PBP data from Jeff Sackmann. For the running and bunting, ZiPS actually spits out the tiers for me with the projection. I use a modified speed score for the running rating and apply EX/VG/AV/FR/PR divided among the population in percentages of 10/20/40/20/10, as with jump. I use a mix of SB% and jump to calculate steal success rates, simply because I don't want a bunch of PR jumpers with EX steal."
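
The 10/20/40/20/10 split mentioned in the quote above can be sketched as a rank-based tier assignment. This is a rough illustration rather than what ZiPS actually does: the function name `assign_tiers`, the input format (a dict of player to score, higher being better), and the rounding of the cutoffs are all assumptions.

```python
# Tier labels and population shares taken from the quote: 10/20/40/20/10.
TIERS = [("EX", 0.10), ("VG", 0.20), ("AV", 0.40), ("FR", 0.20), ("PR", 0.10)]


def assign_tiers(scores):
    """Map {player: score} to {player: tier}, best scores first.

    Rounding the cumulative cutoffs to whole players is an assumption
    of this sketch; ties are broken by input order.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    result = {}
    start = 0
    cumulative = 0.0
    for tier, share in TIERS:
        cumulative += share
        end = round(cumulative * n)
        for player in ranked[start:end]:
            result[player] = tier
        start = end
    # Guard against float rounding leaving anyone unassigned.
    for player in ranked[start:]:
        result[player] = TIERS[-1][0]
    return result


if __name__ == "__main__":
    # Made-up speed scores for ten hypothetical players.
    demo = {f"player{i}": float(i) for i in range(10)}
    for name, tier in sorted(assign_tiers(demo).items()):
        print(name, tier)
```

With ten players this yields one EX, two VG, four AV, two FR and one PR. The quote's last point, mixing SB% with jump so that PR jumpers do not end up with EX steal ratings, is not modeled here.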

ZiPS Utilities

In-season Projection Tool; Download*

Start-Relief Projection Toy; Download*

 

Minor League Translations ("zMLE" or "ZiPS MLE")

2011 ZiPS Minor League Translations Final; CSV Download*

2008 ZiPS Minor League Translations - CSV Download*

3 Decades of Minor League Translations; Downloads*: zMLE for Excel 2007; zMLE for Excel 2003 (there are more than 65-thousand rows, so data is split into extra sheets); Minor League Park Factors, Real and Estimated; zMLE for Pitchers & Hitters (2 files, CSV format)

 

-- SG on differences between official DMB projection disks (through 2008) and ZiPS:

"One of the biggest differences that I am aware of between the two projection systems is that ZiPS uses Voros McCracken's controversial DIPS theory when projecting pitchers. DIPS basically focuses on a pitcher's strikeouts, walks, and home runs allowed, and assumes that their control of hits on balls in play (non-homer hits and outs) is minimal. Tom Tippett of Diamond Mind did his own research on this theory, and concluded that pitchers 'have more influence over in-play hit rates than McCracken suggested', so he uses a pitcher's hits allowed totals in his projections. ZiPS also uses comparisons with similar players in building its projections, whereas Diamond Mind uses a Marcel-type projection system which only focuses on what a player himself has done. I am also pretty sure that ZiPS is harsher to older players than Diamond Mind...."

 

Miscellaneous

FanGraphs interview

FanGraphs audio interview

Dan Szymborski at ESPN Insider (subscription necessary)

Dan Szymborski on Twitter

Search ZiPS at Baseball Think Factory

ZiPS Archives at BBTF

Link to thread on old DMB forum regarding splits and projections disks

Link to another thread on old DMB forum regarding splits and projections disks



-- Edited by VKRatliff on Saturday 5th of January 2013 10:29:09 PM




__________________


2014 Team-by-Team ZiPS

AMERICAN LEAGUE
Baltimore Orioles
Boston Red Sox
Chicago White Sox
Cleveland Indians
Detroit Tigers
Houston Astros
Kansas City Royals
Los Angeles Angels
Minnesota Twins
New York Yankees
Seattle Mariners
Tampa Bay Rays
Texas Rangers
Toronto Blue Jays
NATIONAL LEAGUE
Arizona Diamondbacks
Atlanta Braves
Cincinnati Reds
Los Angeles Dodgers
Miami Marlins
Milwaukee Brewers
New York Mets
Philadelphia Phillies
Pittsburgh Pirates
St. Louis Cardinals
San Diego Padres

-------------------------------------------------------------
2013 Team-by-Team ZiPS
AMERICAN LEAGUE
Baltimore
Boston
Chicago White Sox
Cleveland
Detroit
Houston
Kansas City
Los Angeles Angels
Minnesota
New York Yankees
Oakland
Seattle
Tampa Bay
Texas
Toronto
NATIONAL LEAGUE
Arizona
Atlanta
Chicago Cubs
Cincinnati
Colorado
Los Angeles Dodgers
Miami
Milwaukee
New York Mets
Philadelphia
Pittsburgh
St. Louis
San Diego
San Francisco
Washington

-------------------------------------------------------------
2012 Team-by-Team ZiPS
AMERICAN LEAGUE
Baltimore
Boston
Chicago White Sox
Cleveland
Detroit
Kansas City
Los Angeles Angels
Minnesota
New York Yankees
Oakland
Seattle
Tampa Bay
Texas
Toronto

NATIONAL LEAGUE
Arizona
Atlanta
Chicago Cubs
Cincinnati
Colorado
Houston
Los Angeles Dodgers
Miami
Milwaukee
New York Mets
Philadelphia
Pittsburgh
San Diego
San Francisco
St. Louis
Washington

-------------------------------------------------------------
2011 Team-by-Team ZiPS
AMERICAN LEAGUE
Baltimore
Boston
Chicago White Sox
Cleveland
Detroit
Kansas City
Los Angeles Angels
Minnesota
New York Yankees
Oakland
Seattle
Tampa Bay
Texas
Toronto

NATIONAL LEAGUE
Arizona
Atlanta
Chicago Cubs
Cincinnati
Colorado
Florida
Houston
Los Angeles Dodgers
Milwaukee
New York Mets
Philadelphia
Pittsburgh
St. Louis
San Diego
San Francisco
Washington

-------------------------------------------------------------
2010 Team-by-Team ZiPS
AMERICAN LEAGUE
Baltimore
Boston
Chicago White Sox
Cleveland
Detroit
Kansas City
Los Angeles Angels
Minnesota
New York Yankees
Oakland
Seattle
Tampa Bay
Texas
Toronto

NATIONAL LEAGUE
Arizona
Atlanta
Chicago Cubs
Cincinnati
Colorado
Florida
Houston
Los Angeles Dodgers
Milwaukee
New York Mets
Philadelphia
Pittsburgh
St. Louis
San Diego
San Francisco
Washington

 
 

 


