Four years ago, I started doing some very simple charting for each Georgia Tech football game. According to the commonly accepted definition of a successful play (gaining 50% of the needed yards on 1st down, 70% on 2nd, and 100% on 3rd or 4th), was each play for and against GT successful or not? This simple exercise helped me see when the score in the middle of a game was out of line with this underlying efficiency measurement, and it usually shaped my expectations for the rest of the game. It gave me more perspective in evaluating game-by-game performance.
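That success-rate rule is simple enough to express in a few lines of code. Here's a minimal sketch (a hypothetical helper, not my actual charting sheet; shown in Python even though the project itself lives in R):

```python
# "Successful play" per the common definition: a play succeeds if it
# gains at least 50% of the yards to go on 1st down, 70% on 2nd down,
# and 100% on 3rd or 4th down.
def is_successful(down: int, yards_to_go: float, yards_gained: float) -> bool:
    thresholds = {1: 0.5, 2: 0.7, 3: 1.0, 4: 1.0}
    return yards_gained >= thresholds[down] * yards_to_go

# e.g. a 5-yard gain on 1st & 10 is successful (5 >= 5.0),
# but a 6-yard gain on 2nd & 10 is not (6 < 7.0).
```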
I charted that way for two years and then began adding in the play’s actual yardage. Simply combining success rate and yards per play provided a helpful heuristic to gauge team performance, and I started assessing how well GT did in those numbers compared to their opponents’ average performance.
I told a friend, Mark, about what I was doing two years ago, and he latched on to the idea. He knew far more about Excel than I did and took things in some new directions. By last season, he (for Alabama) and I (for GT) were charting about ten different things on every play and producing some pretty sophisticated statistical analysis of each game. That's the kind of thing you find in my game-by-game advanced stats reviews. I also started to use GT's underlying statistics relative to their opponents to make projections for future games. I was limited in applying this, though, because I only had GT data.
Around the same time, Mark and I found out about the work that CFB_Data, Saiem Galani, and others were doing to make play-by-play data available for every FBS team. I started to play around with this data using the R programming language and Saiem's cfbfastR (formerly cfbscrapR) package. Based on the work I had done analyzing GT games, I knew that an index combining EPA/play margin, success rate margin, and yards-per-play margin had a good track record of predicting future team outcomes. Mark helped me build a prototype model last season that we jokingly called The Binion Index. This offseason, I didn't come up with a better name, but I did improve the model.
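The post doesn't spell out the exact formula behind the index, so here's one plausible sketch of how three per-team margins could be rolled into a single number: z-score each metric across all teams, then average the z-scores with equal weights. Both the z-scoring and the equal weighting are my assumptions, not the actual Binion Index math:

```python
from statistics import mean, pstdev

def combined_index(teams: dict) -> dict:
    """teams maps a team name to (epa_margin, sr_margin, ypp_margin).

    Z-scores each metric across all teams, then averages the three
    z-scores per team (equal weights are an assumption).
    """
    cols = list(zip(*teams.values()))          # one tuple per metric
    means = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]     # guard against zero spread
    return {
        name: sum((v - m) / s for v, m, s in zip(vals, means, sds)) / 3
        for name, vals in teams.items()
    }
```

Because every metric is standardized, a team one standard deviation above average on all three margins scores exactly 1.0, regardless of the metrics' raw scales.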
In its current form, The Binion Index rates each FBS team by projecting a point differential against an average FBS team. The preseason rating uses historical play-by-play data, measuring EPA/play margin, success rate margin, and yards-per-play margin to evaluate recent team strength. It combines that data with the 247 Team Talent Index and Bill Connelly's returning production calculations.
During the season, the model updates with current play-by-play data, again measuring EPA/play margin, success rate margin, and yards-per-play margin. Each week, the season-to-date data plays a progressively larger role in the model, with the weight on the preseason rating decreasing accordingly. Starting in week 3, a schedule adjustment will be added for each team.
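The in-season updating described above amounts to a weighted blend of two ratings. A minimal sketch follows; the linear ramp and the week at which season-to-date data takes over completely are my assumptions, since the post doesn't publish the actual weighting schedule:

```python
def blended_rating(preseason: float, in_season: float,
                   week: int, full_weight_week: int = 8) -> float:
    # Weight on season-to-date data grows linearly each week,
    # reaching 1.0 at full_weight_week (an assumed value).
    w = min(week / full_weight_week, 1.0)
    return (1 - w) * preseason + w * in_season

# Halfway to the assumed full-weight week, the blend is an even split
# between the preseason rating and the season-to-date rating.
```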
The preseason rating correctly picked winners against the spread in 54.9% of week 0 and week 1 games. That would have led all of the models featured on The Prediction Tracker.
Throughout the season, we’ll be posting weekly updates here, either on Tuesday or Wednesday.
I built this model to have fun, better understand the college football landscape, and grow in my ability to work with data. I hope it might add to your weekly college football analysis diet. A few caveats before you yell at me about Washington:
- The model is trained not to overreact to Week 1.
- Play-by-play data from @ESPNCFB was not available for a few games, so a few teams still carry their preseason rating (including GT).
- I'm not adding any schedule adjustments until week 3.
Without further ado, here’s Week 1 of The Binion Index.
The Binion Index Week 1
| Rank | Team | Rating |
|------|------|--------|
| 70 | San Diego State | -0.66 |
| 93 | UT San Antonio | -4.27 |
| 113 | San José State | -11.22 |
| 129 | New Mexico State | -24.89 |