College Football Chaos -- Examining college football teams' propensities for upsets

Anyone who knows me knows that I’m a huge college football fan. Aside from watching my beloved Penn State Nittany Lions every weekend, I have a four screen setup for watching all of the other big games, pulling for Team Chaos all the way. With respect to college football, chaos is best described as higher ranked teams losing to lower ranked (or unranked teams). Each week, the teams looking to pull of these upsets are collectively the members of Team Chaos.

While there are plenty of good stats out there on teams’ records against ranked opponents, I’d never come across data on the specifics of how teams fair against higher or lower ranked opponents. Through the first few weeks of the 2016 season, it was starting to feel like Team Chaos was having a particularly good run, so I set out to pull together data to see if this was indeed the case. I wrote a simple Python script utilizing a headless browser to pull the source code for weekly top 25 game results from ESPN’s site, render the pages, scrape the pertinent data, and aggregate it into a workable format.

Using D3.js, I’ve pulled together a couple of basic visualizations for exploring the data. It’s still a work in progress, but I update the data weekly and am working on building a more comprehensive and compelling experience around the stories that this data tells. You can check it out here.

Frequently Asked Questions

What is “Team Chaos”?

If you want to descend down the rabbit hole that is /r/cfb (I highly recommend it), this thread captures the range of nuanced opinions on what chaos is. Here we’ll be working with the simplest definition: upsets of ranked teams. When a ranked team is upset, it’s a win for Team Chaos. Teams that defeat higher ranked opponents are “agents” of chaos (and de facto members of Team Chaos that week), while ranked teams losing to lower ranked (or unranked) opponents are “victims” of chaos.

What’s the point?

Some people just want to watch the world burn. But you don’t need to be a maniac to enjoy chaos. It’s always fun to pull for an upset, especially when it throws the rankings (and playoff contenders) into disarray. Whether you don’t have a favorite team to pull for, or your team is having a rough season, chaos is always the answer. It makes every game more exciting to watch. And nothing is sweeter than watching your hated rivals fall victim to the underdog.

Ok, so how do you quantify chaos?

This is another place where you could go down a rabbit hole developing complex equations to output a master chaos metric. And any solution would be incomplete and contentious. To keep things simple, we’re considering chaos over time – ranked upsets per week – and chaos by team – counting how often a given team is an agent or victim of chaos and their win percentages when in a position to take on either role.

Interpretting Results

While it’s hard to draw definitive conclusions from the data, it’s definitely interesting to explore. For example, is a team that often falls victim to chaos consistently overrated? Underperforming? Both? Neither? That’s up to you to decide. This data is meant to start conversations and grant visibility into some stats that aren’t often discussed. This project is also very much a work in progress and will be updated as I find more effective ways to summarize and communicate the data. If you have feedback or suggestions, feel free to tweet at me. Note: the data visualizations, while responsive, are not yet properly optimized for small screens and mobile devices.