Hi everybody! Long time, no see!
I recently got the wild idea to redo my 1918 season disk, now that so much more research and information is available in an easily readable form than in c.2000-2004, when I did most of the original research for it. The biggest addition is the inclusion of PLATOON SPLITS for both batters and pitchers. I wrote a VBA script in Excel that reads the box score event files from Retrosheet and compiles the splits for each player based on the distribution of events and the handedness of the players on the other team. For pitcher complete games, it's pretty easy, since every batter faced the same pitcher. When relievers are involved, I apportion the event counts on each side based on the hands of the batters and pitchers so the totals balance for each side and for each L/R/S split.
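To give a rough idea of what the apportioning step can look like, here is a minimal Python sketch. It is not the VBA script itself, and the function name `apportion` and the largest-remainder rounding are assumptions made for illustration. It splits one integer total (say, a batter's AB in a game) across pitcher hands in proportion to integer weights (say, batters faced by each hand), and guarantees the pieces add back up to the total.

```python
from fractions import Fraction

def apportion(total, weights):
    """Split an integer total across buckets in proportion to integer weights.

    Uses largest-remainder rounding so the parts always sum to `total`.
    Hypothetical helper for illustration, not the author's actual method.
    """
    weight_sum = sum(weights.values())
    if weight_sum == 0:
        raise ValueError("at least one weight must be positive")
    # Exact proportional shares, kept as fractions to avoid float drift.
    exact = {k: Fraction(total * w, weight_sum) for k, w in weights.items()}
    # Start from the floor of each share (values are non-negative).
    parts = {k: int(v) for k, v in exact.items()}
    leftover = total - sum(parts.values())
    # Hand out the remaining units to the largest fractional remainders.
    by_remainder = sorted(exact, key=lambda k: exact[k] - parts[k], reverse=True)
    for k in by_remainder[:leftover]:
        parts[k] += 1
    return parts
```

With made-up numbers, a batter with 4 AB against a staff where the righty starter faced 27 batters and a lefty reliever faced 9 would come out as `apportion(4, {"R": 27, "L": 9})`, which gives 3 AB vs. RHP and 1 AB vs. LHP. Balancing across both sides and all L/R/S splits at once takes more bookkeeping than this, but the proportional split is the core idea.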
Other improvements are a better and more consistent method for defensive ratings (DRA from Michael Humphreys), actual park factors from Seamheads.com, Pitcher GB% based on their distribution of infield assists / outfield putouts, actual hold ratings for pitchers based on SB allowed / runners on 1st, pitcher durability ratings based on Tom Tango's pitch count estimator applied to every pitching appearance, catcher's arm ratings based on SB/CS numbers at baseball-reference.com, and there's probably more.
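Two of those ratings start from simple ratios, sketched below in Python. The function names are made up for illustration, and treating infield assists and outfield putouts as the only two inputs to the ground-ball share is an assumption. How the raw rates get converted into DMB's rating scales is not shown here.

```python
def gb_share(infield_assists, outfield_putouts):
    """Rough ground-ball share: infield assists over assists plus outfield putouts.

    Returns None when there are no fielded balls to work with.
    """
    total = infield_assists + outfield_putouts
    if total == 0:
        return None
    return infield_assists / total

def steal_rate(sb_allowed, runners_on_first):
    """Steals allowed per runner on first, the raw input to a hold rating.

    Returns None when no runners reached first.
    """
    if runners_on_first == 0:
        return None
    return sb_allowed / runners_on_first
```

For example, a pitcher with 60 infield assists and 40 outfield putouts behind him would get a ground-ball share of 0.6 under this assumption.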
My biggest gripe, though, has to be how DMB changed the player creation sequence in v11. I wrote a sweet script in AutoIt that would read from Excel and populate all the player creation numbers in sequence, at about 2 seconds per player. Well, now that script doesn't work, so I'm having to enter all the numbers manually. It's going to take a bit longer, but I've got the players created for 6 teams now, with 10 to go!
One thing to keep in mind is that since these are compiled from individual box scores, there will be slight discrepancies between the totals in the season disk and the official totals, especially in fielding stats, but also in the batting and pitching stats. Red Sox catcher Sam Agnew, for example, officially has 8 2B in 1918, but there are only 7 in the box scores. I am not attempting to reconcile the compiled box score stats with the official totals. The compiled box score stats are probably more accurate anyway, at least for seasons that long ago.
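For anyone who wants to see where the compiled and official numbers differ, a small Python sketch like the one below would list the mismatches. The function name and the `(player, stat)` key layout are assumptions for illustration; the only real data in it is the Agnew doubles example above.

```python
def find_discrepancies(compiled, official):
    """Return (key, compiled_value, official_value) for every stat that differs.

    Both inputs map (player, stat) to a count; only keys present in both
    are compared.
    """
    diffs = []
    for key in sorted(compiled.keys() & official.keys()):
        if compiled[key] != official[key]:
            diffs.append((key, compiled[key], official[key]))
    return diffs

# The one example from the post: 7 doubles in the box scores, 8 officially.
compiled = {("Sam Agnew", "2B"): 7}
official = {("Sam Agnew", "2B"): 8}
print(find_discrepancies(compiled, official))
```

This only reports the differences; it makes no attempt to decide which source is right.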
If anyone else wants to tackle any of the other seasons where Retrosheet has box score event files, I'll be glad to run my parser on the files for any season. That takes less than a minute and outputs the data in the sequence now needed for player entry.
But hey, enough of my yakking. How about some screenshots? (No ratings entered yet, just stats. I still need to correct RBI and defensive innings too, since Retrosheet has them for most games but not every game.)