This is the second post in our new series at From The Rumble Seat introducing our readers to sports analytics and what we can find out using data. You can view all other Intro to Football Analytics posts in the story stream. Also, I would be remiss if I didn't link to the Visualizing Quarterbacks under Paul Johnson post from Thursday, in case you didn't see it over the holidays.
Turnovers are a huge deal in college football, and in professional football as well. They can swing a game, and teams that win the turnover margin win 73% of the time. So judging a team by its turnovers is usually a good way to go. However, if you judge a team by the number of fumbles it recovers, you will be in for a letdown later on. Recovering fumbles is a complete crapshoot, and most people know that without using data to prove it. But that is what the football analytics movement is all about: using data to either prove or disprove commonly held beliefs about the game of football. In this post, we'll show that recovering fumbles is not a skill.
A skill is something that teams should exhibit in each game, and especially over a full season. An easy way to test whether a team can do something consistently is to compare its stats from the first half of the season to the second half. If teams are truly better at something, they should be better at it in both halves of the season. To do this, I looked at all runs that occurred in games between FBS opponents from the 2008 to 2013 seasons. I then split each season's games into two groups: games in the first half of the season and games in the second half. From those groups I found each team's first-half and second-half fumble recovery rates when it had the ball. The following plot shows each team's first-half fumble recovery rate (the % of its own fumbles it recovers while on offense) on the x-axis and its second-half fumble recovery rate on the y-axis.
What this shows is that a team's 1st half fumble recovery percentage has practically nothing to do with its 2nd half fumble recovery percentage. The solid blue line in the middle is the best-fit linear regression line. It shows almost no change in the predicted 2nd half fumble recovery rate no matter what the 1st half recovery rate is. For comparison, here is a plot showing each team's 1st half yards per carry vs 2nd half yards per carry:
There is still some variation and noise, but there is a significant linear trend. This means that teams with higher yards per carry in the 1st half will probably continue to gain more yards than average on runs in the 2nd half of the season. With fumble recovery rates, there is almost no relationship between the 1st and 2nd halves. We can measure this using the correlation between the two variables. (If you recall, correlation measures the strength of the linear relationship between two variables on a scale from -1 (perfectly inverse relationship) to 1 (perfectly in-line relationship).) The correlation between each team's 1st half and 2nd half fumble recovery rates is .02, nearly 0. This means that, statistically, there is essentially no relationship between how well a team recovers its fumbles in the 1st half of the season and in the 2nd half.
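For readers who want to try the split-half comparison themselves, here is a rough Python sketch of the idea. It is not the R code used for this post. The record layout, the team names, the `games_per_season` cutoff, and the function names `split_half_rates` and `pearson` are all assumptions made for illustration, and the handful of rows is made up just to exercise the code.

```python
import math
from collections import defaultdict

# Hypothetical fumble records: (team, season, game_number, offense_recovered).
# These rows are invented for illustration; they are not the post's data.
fumbles = [
    ("Team A", 2013, 2, True), ("Team A", 2013, 5, False),
    ("Team A", 2013, 8, True), ("Team A", 2013, 11, True),
    ("Team B", 2013, 3, False), ("Team B", 2013, 9, True),
    ("Team C", 2013, 1, True), ("Team C", 2013, 7, False),
    ("Team C", 2013, 12, False),
    ("Team D", 2013, 4, True),  # no 2nd-half fumbles, so it is dropped
]


def split_half_rates(fumbles, games_per_season=12):
    """Return {(team, season): (first_half_rate, second_half_rate)}.

    Assumption: a game belongs to the first half of the season when its
    game number is at most games_per_season // 2.
    """
    # counts[(team, season)][half] = [recovered, total]
    counts = defaultdict(lambda: [[0, 0], [0, 0]])
    for team, season, game_number, recovered in fumbles:
        half = 0 if game_number <= games_per_season // 2 else 1
        counts[(team, season)][half][0] += int(recovered)
        counts[(team, season)][half][1] += 1
    rates = {}
    for key, (first, second) in counts.items():
        if first[1] and second[1]:  # need at least one fumble in each half
            rates[key] = (first[0] / first[1], second[0] / second[1])
    return rates


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


rates = split_half_rates(fumbles)
first_half = [r[0] for r in rates.values()]
second_half = [r[1] for r in rates.values()]
print(pearson(first_half, second_half))
```

With only three usable teams the printed number means nothing. The point is the shape of the calculation: one pair of rates per team-season, then one correlation across all of the pairs.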
Another way to see if there is a repeatable skill in recovering fumbles is to group each team's plays into even-numbered and odd-numbered plays (the first play of the game is play number 1, the 2nd play of the game is play number 2, etc.). There should be no bias between even plays and odd plays for a team, unlike splitting by half of the season, which can be influenced by teams playing better or players getting injured. So teams that are better at something should be better at it on both even-numbered and odd-numbered plays. Well, the correlation between a team's fumble recovery rate on even-numbered plays and its fumble recovery rate on odd-numbered plays is .007, even less than the correlation between halves of the season. This shows that no matter how you split the data up, there is no repeatable skill in recovering fumbles.
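To get a feel for what "no skill" looks like in an even/odd split, here is a small Python simulation sketch. It does not use the real play-by-play data. It assumes every fumble is an independent 50/50 coin flip, and the number of team-seasons (720), the fumbles per team (20), the play-number range, and the function names are round numbers and labels picked only for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_even_odd(n_teams=720, fumbles_per_team=20, seed=0):
    """Correlate even-play and odd-play recovery rates for pure-luck teams."""
    rng = random.Random(seed)
    even_rates, odd_rates = [], []
    for _ in range(n_teams):
        recovered = {0: 0, 1: 0}  # key 0 = even-numbered plays, 1 = odd
        total = {0: 0, 1: 0}
        for _ in range(fumbles_per_team):
            # Assumption: the fumble happens on a random play number.
            parity = rng.randint(1, 150) % 2
            total[parity] += 1
            # Assumption: every recovery is a fair coin flip.
            if rng.random() < 0.5:
                recovered[parity] += 1
        if total[0] and total[1]:  # need fumbles in both groups
            even_rates.append(recovered[0] / total[0])
            odd_rates.append(recovered[1] / total[1])
    return pearson(even_rates, odd_rates)


print(simulate_even_odd())
```

Teams built from nothing but coin flips should give a correlation close to zero, and it should bounce around a little from seed to seed. That is the same kind of number the real even/odd split produced.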
- Historically, teams do not show a repeatable skill at recovering fumbles over time. Comparing fumble recovery rates between different groupings of plays for teams in the same season shows little to no evidence that any team reliably recovers fumbles at better than a 50% rate.
- Here is the dropbox link to the CSV file containing the 1st half and 2nd half season fumble data. Here is the dropbox link to the R file that contains the code I used to generate the data for the analysis.
Next Steps or Questions:
- How should we judge a team for playing well while recovering a high percentage of fumbles, or in other words, getting lucky with turnovers?
- How many points is an average turnover worth?
- Regression to the mean: Teams with more fumbles have recovery rates closer to 50% than teams with fewer fumbles (it's harder to get lucky the longer you play).
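The regression to the mean point can be made concrete with a little arithmetic. If we assume each fumble is an independent coin flip with a 50% recovery chance (the conclusion of this post, treated here as an assumption), the typical spread of a team's recovery rate shrinks as the number of fumbles grows. The function name `recovery_rate_sd` in this Python sketch is made up for illustration.

```python
import math


def recovery_rate_sd(n_fumbles, p=0.5):
    """Standard deviation of a recovery rate over n_fumbles coin flips."""
    return math.sqrt(p * (1 - p) / n_fumbles)


for n in (4, 16, 64):
    print(n, recovery_rate_sd(n))
# 4 0.25
# 16 0.125
# 64 0.0625
```

Quadrupling the number of fumbles cuts the expected spread in half, which is why teams with more fumbles tend to sit closer to 50%.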