[tl;dr - It is Ohio State, Alabama, and Oklahoma in the top 3 just about any way you cut it]
As the spring semester starts, so too does the offseason for college football. Inevitably this brings lots of prognostication about what is to come in the fall, but also a look back to assess the past--recent and long ago. And with that come a million arguments about which college football programs are the greatest and who among the elite can claim Blue Blood status.
The website College Football News (CFN) says the best of them all is the Oklahoma Sooners. As much as I love that idea, I can see reasonable minds disagreeing. Undoubtedly any such list involves some hair-splitting. What I really like about their approach is that it has a definitive methodology behind it. It is not just some "experts" giving us their feel for the answers as if they could divine the truth free from bias.
Of course all approaches will have bias, and it comes in two varieties: assumption based and disposition based. One can create a formal algorithm (that is what I have done below), or one can derive a list from an informal algorithm, weighing factors mysteriously within one's mind. While any method can be logically sound, generally speaking the less formal the process, the more subject it is to bad reasoning or bad facts. Assumption-based bias would be something like assigning too much weight to a certain factor. Disposition-based bias would be something like favoring a team for a reason not meaningful to the ranking itself.
Many attempts at this barstool debate are prone to bad math, such as overcounting a metric, since commonly used measures tend to be highly correlated (e.g., national championships and winning percentage). To prevent this, simpler is better, provided a simple approach can yield the desired effect.
Back to CFN's approach: they use a very common and logical method, scoring teams by the inverse of the final AP Top 25 poll each year. A first-place team receives 25 points, on down to 1 point for finishing #25. Summing the points by team produces the ordered list. I do not disagree with their top 3 (OU, Alabama, and Ohio State), but the list can be criticized for its obvious shortcomings. For example, teams score nothing in years they finish just outside the top 25, which creates artificial distortion, making the separation within the list look greater than it actually is. Of course there is no easy way to fix this. Additionally, the AP poll is not itself without bias, as some teams, especially those not typically perceived as elite, may be systematically underrated. All of this makes this list, like so many others, subject to folly as one works down it. Technically speaking, our confidence in the outcome diminishes at an increasing rate as we move down from the top.
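For concreteness, here is a minimal sketch of that scoring method in Python. The poll data shown is hypothetical and truncated to three spots; an actual tally would run all 25 spots for every poll year.

```python
from collections import defaultdict

# Hypothetical final AP poll results: year -> teams in rank order
# (index 0 is #1). A real run would use all 25 spots for every year.
final_polls = {
    2020: ["Alabama", "Ohio State", "Clemson"],
    2021: ["Georgia", "Alabama", "Michigan"],
}

scores = defaultdict(int)
for year, ranking in final_polls.items():
    for rank, team in enumerate(ranking, start=1):
        scores[team] += 26 - rank  # #1 earns 25 points, #25 earns 1

# Highest cumulative total first
all_time = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
print(all_time)
```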
The CFN ranking covers the "greatest programs of all time". As interesting as that is, it isn't necessarily what we commonly have in mind when we set out to rank programs. Specifically, when we talk about the so-called "Blue Bloods", we are thinking of the best programs with emphasis, to one degree or another, on where they stand today. This opens up one additional criticism that this list, and basically every list like it, suffers from: reverse-recency bias. Maybe we would term it "old-timer bias". These lists give equal weight to success in the distant past and to recent outcomes, and that is true whether they are derived from algorithms (assumption based) or expert opinion (disposition based). Of course many expert opinions have the traditional recency-bias problem (favoring the recent over the past), but often it is the traditional teams that get more love than they might deserve--I'm looking right at you, Texas A&M and Michigan.
To get around this problem, I have created the model below. Borrowing from the foundational concept of asset valuation, where future cash flows are discounted back to the present day (a dollar of earnings in ten years is worth less than a dollar today), the model gives more weight to recent performance than to the same performance achieved in the past.
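For reference, this is the standard present-value formula: a cash flow $CF_t$ received $t$ years from now, at discount rate $r$, is worth

$$PV = \frac{CF_t}{(1+r)^t}$$

today. The model applies the same formula looking backward, dividing a season's result from $t$ years ago by $(1+r)^t$.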
Model
I believe the most straightforward way to evaluate teams is the win-loss record. The only enhancement to this might be to include margin of victory*--a technique I have used and will update soon in a separate post.
My model looks at each team's winning percentage by year and then discounts it by a factor for each year back it falls, so a win% in a given year is worth less and less the longer ago it happened. Notice that I am using winning percentage so that there is essentially no impact from the fact that teams play, and have played, a different number of games within a year and throughout the years--typically more games recently.
I also included a starting-year cutoff to stop counting results before a certain date, which is a changeable variable in the model (see below for the link). Even though a discount factor makes the past less and less valuable in assessing a total score, it might be that football changed so fundamentally that we don't want any results before a certain date, and the discount factor necessary to otherwise achieve this would be too big--it would make results fade from importance too quickly.
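Putting the two pieces together, a team's score as of present year $T$, with discount rate $r$ and cutoff year $y_0$, is

$$S = \sum_{y = y_0}^{T} \frac{W_y}{(1+r)^{T-y}}$$

where $W_y$ is the team's winning percentage in season $y$.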
I settled on a discount factor of 4% with a cutoff year of 1946. My reasoning: at a 4% discount rate, a 100% win-percentage season 18 years ago is worth only about 50% today--the factor cuts it in half. Eighteen years is the typical age of an incoming college freshman football player, so there is some relevance, maybe, to the people playing the game.
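Here is a minimal sketch of that scoring in Python, assuming a simple mapping of season to win% (the function name and data are my own illustration, not the spreadsheet itself):

```python
def program_score(win_pct_by_year, present_year=2021,
                  discount_rate=0.04, cutoff_year=1946):
    """Sum each season's win%, discounted by how many years ago it happened."""
    return sum(
        win_pct / (1 + discount_rate) ** (present_year - year)
        for year, win_pct in win_pct_by_year.items()
        if year >= cutoff_year  # seasons before the cutoff don't count at all
    )

# Sanity check on the half-life claim: a perfect season 18 years back
# is worth about half of a perfect current season at a 4% rate.
print(program_score({2003: 1.0}))  # ~0.494
print(program_score({2021: 1.0}))  # 1.0
```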
My starting-year cutoff is 1946, which has historically been marked as a beginning point of modern college football. However, as I've said before, I am not sure how valid that is: one-platoon football was the rule through most of the 1950s and into the 1960s, and racial integration did not meaningfully arrive in college football until the 1970s.
These choices of discount factor and starting-year cutoff are both quite arbitrary, but as you can see there is some logic to them. Importantly, the results do not seem to be sensitive to reasonable changes in either the discount rate or the cutoff date--the top three remain the same no matter what reasonable parameters are used.
One additional limitation of this model is that I did not look at all college football teams in creating it. Yet this is not the problem it may seem to be, at least at the top end of the list (yes, this is the same type of criticism I made of the CFN list above). Because I had to calculate each team's annual winning percentages from raw data, I limited the database to the top 30 teams in winning percentage over the past 50 years (1972-2021). So, to be sure, there are teams that under certain (high) discount factors would otherwise find themselves on the list but are excluded; this effect, though, is confined to the very bottom of the list. Sorry, Oklahoma State: your recent success would not get you very high in this ranking even if you had been good enough to make the cut (OkState is 31st in win% over the 1972-2021 span among teams that were in Division I-A (now FBS) football the entire time). That brings up another excluded team, Boise State. They have had phenomenal winning teams since joining top-level college football in 1996, but I decided to disallow them because of this limited time in the sample (the strength of their historic schedule might be another reason).
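For what it's worth, the sample restriction amounts to something like the following (the `season_records` data and its shape are hypothetical; the real calculation lives in the linked spreadsheet):

```python
# Hypothetical season records: team -> {year: (wins, losses)}
season_records = {
    "Oklahoma":       {2020: (9, 2), 2021: (11, 2)},
    "Oklahoma State": {2020: (8, 3), 2021: (12, 2)},
}

def win_pct(seasons, start=1972, end=2021):
    """Aggregate winning percentage over the window from win-loss tuples."""
    wins = sum(w for yr, (w, _) in seasons.items() if start <= yr <= end)
    losses = sum(l for yr, (_, l) in seasons.items() if start <= yr <= end)
    return wins / (wins + losses) if wins + losses else 0.0

# Keep only the 30 best teams by 50-year winning percentage
sample = sorted(season_records, key=lambda t: win_pct(season_records[t]),
                reverse=True)[:30]
print(sample)
```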
Some Results
Using a discount factor of 4% and a starting year of 1946:
Using a discount factor of 4% and a starting year of 1972 (last 50 years):
Check out the model for yourself, including changing the parameters as you see fit; the results above reflect just a couple of the possible parameter choices.
https://drive.google.com/file/d/1M-3q6zWtTAyCFDHEuU5z3eA6jIwRplxq/view?usp=sharing
*MoV isn't completely stable over time, a potential criticism of that model, but since it has tended to increase over the past 50 years, I believe there is some natural recency premium built into it. Regardless, it would be interesting to add a discounting factor to that model as well, which I will do before publishing the updated results.