Sport

Which Two Best Teams?

Shy Elf.

Posted to Sport on Tue Nov 27, 2007 at 11:34:53 AM EST (promoted by port1080).

Most major sports settle their championships by having the best teams battle to decide the championship...

NCAA Division I "Bowl Subdivision" (formerly I-A) football has 119 teams, only two of which play for the championship. It is alone among NCAA sports in not having a playoff, although the single championship game may be considered a one-game playoff.  Because of the toll that playing week after week takes on the body, football schedules have always had fewer games than other major sports.  Excluding the (unused) Hawaii exemption, the conference championship games, and one bowl game each, teams are allowed to play only 12 games, even though the final two teams in the "Championship Subdivision" (I-AA) play 15 games each.

If we were to start the season with a single-elimination playoff to find the best team, it would take 7 games -- over half of the regular season -- to determine the winner.  Many of the games played by the top teams are against highly inferior teams, so that the favorite has a very high probability of winning.  At the end of the season, we wind up with teams with vastly different schedule strengths, and we have to compare their results to pick two who deserve to play in the championship game.  If we are comparing only on who beat whom, we have a severe lack of data.
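The 7-game figure can be checked with a small sketch. The function name `rounds_needed` is made up for illustration, and it assumes a standard bracket in which the strongest seeds get first-round byes when the field is not a power of two.

```python
def rounds_needed(teams: int) -> int:
    """Rounds a single-elimination bracket needs for `teams` entrants.

    Each round at most halves the field, so this is the smallest r
    with 2**r >= teams, computed with integer arithmetic only.
    """
    return (teams - 1).bit_length()


if __name__ == "__main__":
    field = 119
    rounds = rounds_needed(field)
    byes = 2 ** rounds - field  # teams that would sit out the first round
    print(rounds, "rounds,", byes, "first-round byes")
    print("share of a 12-game season:", rounds / 12)
```

With 119 teams the champion would need seven wins (or six after a bye), which is where "over half of the regular season" comes from.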

So who are the strongest teams this year?  According to the most widely quoted computer ranking, that of Jeff Sagarin (JS in your BCS computer ranking column), they are, in order, Oklahoma, Florida, West Virginia, Ohio State, Oregon, Kansas, LSU, USC, Missouri, and Arizona State.  This has BCS #9 Oklahoma listed as being stronger than BCS #1 Missouri, and that can't be right.  Want to bet on it?  You can, with the betting line being Oklahoma by 3 points at a neutral site this Saturday.  Last week, BCS #2 Kansas was an underdog to BCS #4 Missouri, again at a neutral site.

BCS rankings have little to do with which team is the strongest.  Instead, they rank which team, judged solely by which games it has won and which it has lost (and the strength of those opponents), should be judged as the best team.  Since each team has only a few games against quality opponents and these results can be highly affected by luck, the two kinds of ranking can be very different.  Missouri, with a tougher schedule than Oklahoma and one fewer loss, is clearly the better team in terms of BCS ranking.  Oklahoma's median margin of victory of 31.5 points isn't relevant as far as the BCS is concerned.

The computer polls used to give a larger role to victory margin, but they were out of step with human voters. Citing concerns about teams running up the score at the end of games, the NCAA chose computer polls that make less use of victory margin and are much less accurate at picking game winners than the polls it had previously used.  There has been no competition to determine which computer polls work well, and the selection of the ones actually used seems completely arbitrary.  It's hard to believe that the Billingsley (RB) system is actually part of determining the national champion.  Not only does a large part of this system's ranking come from games played the previous year, but rating changes are based on binary non-zero-sum computations involving only the rankings of the two teams playing, so that whichever team plays the most games enjoys a large advantage.  The Anderson & Hester system appears to be mostly a simple addition of winning percentage and opponents' winning percentages, with the "Colley Matrix" just a more sophisticated version of the same, and there is no reason to assume that these systems should work well.

The other three appear to be more reasonable. All three set the probability of a win for each team equal to the strength of that team divided by the sum of the strengths of both teams, and solve for the team strengths which maximize the probability of the observed wins and losses.  This requires a preexisting team strength distribution in order to converge.  This distribution is far too wide in the case of Peter Wolfe, who for example predicts a 70% chance of a Missouri victory this week.  Sagarin's results using this algorithm are listed in the "Elo-Chess" column, which is different from the team strengths given in the "Predictor" column.  Massey uses a preconditioner which considers margin of victory, so it does use this information despite claims to the contrary elsewhere.

The BCS used to use the AP poll, until the AP sued them to make them stop using it, citing the conflict of interest of their writers as their reason.  In response, the BCS increased the importance of the human polls including the coaches' poll the next year.

Two thirds of the BCS ranking is now based on human polls.  Humans can take account of additional factors beyond just game scores, but, seemingly, don't.  When top ranked USC had their quarterback John David Booty break the middle finger of his throwing hand during the first half of the game against Stanford, they elected to leave him in the game.  He continued to play, throwing four interceptions in the second half, and USC lost.  USC lost one more game with their backup quarterback, but with Booty back playing, mauled BCS #6 Arizona State 44-24, for a respectable 9-2 record.  With Booty once again healthy, USC may well be the strongest team in college football, but even this story couldn't convince BCS voters, who left them mired at BCS #8.

When LSU gave up 466 yards to the nation's 98th ranked offense, Ole Miss, it didn't seem to make voters question their status as the nation's #1.  There seems to be a "ladder" mentality at play; you keep the spot on the ladder that you have until you lose, and can move up slightly and only by beating another team at the very top.  This has worked out to the advantage of Ohio State, which has been moving up the ladder without playing, having finished its season early (at the traditional time).  In Ohio State's case, this has largely canceled the bias towards dropping teams too far for losing one of their last games.  Does anyone else believe that any other team would be sitting at #3 after losing their next to last game?

This brings up the point that Ohio State's last game was on Nov. 17th, and the championship game, where they have a better than even chance of playing, is to be held on Jan. 7th.  Excuse me, but isn't this a bit like playing the World Series in January?  After a layoff of over 7 weeks, do we really expect them to be in any condition to play?  If the argument against playoffs is that they would make the season too long, wouldn't they be shorter than the season now?  As usual, the game will be indoors in 70 degree weather, which is very hot in football gear, and will further stress the endurance of a northern team used to cold weather in January.  If Ohio State leads West Virginia for the first half, but is blown out in the second half as they run out of energy, won't this be taken as further proof that the Big Ten is awful, rather than confirmation that it's impossible to play well in hot weather after a layoff of over 7 weeks in cold weather?

Finally, let us consider Hawaii.  With Michigan paying to get out of their scheduled game with Hawaii to play Appalachian State instead (and lose), and with Hawaii claiming they can't find anyone to play a 13th game with, their extremely weak schedule might not be their fault, but they still don't deserve to be ranked higher.  If they want to play for the national championship, they should beat Fresno State by more than 7 points or Nevada by more than 2.  But just imagine if they had beaten all of their opponents by 60 points.  With the BCS as it is now, the computers are not allowed to take account of that, and they would still have been ranked near their current 14th by the computers, and they still would not play in the championship game.  Their hunt for the football championship was over before it began.

Many people believe that the regular season is the playoffs for College Football, but if you really wanted the regular season to be the playoffs wouldn't it be organized as a Swiss tournament?  The national champion would usually have 2 or 3 losses because they would be playing only good teams the last half of the year.

Of course, selecting only two teams will be free of controversy only when it is a "Goldilocks year", and there are two teams people can agree on.  This year is sure to cause controversy no matter who is picked for the championship game.

Tags: edited by Port1080, written by Shy Elf, NCAA, BCS, college, football, sports (all tags)

This story: 17 comments (1 from subqueue)
11

Maximize Postdictive Accuracy.

3fingerspointback.

Wed Nov 28, 2007 at 07:16:19 PM EST

5.00 (interesting)

A few years ago I was curious about ranking systems, and actually maintained my own for a few years, for my own amusement.  The idea was to generate a directed graph of Division I teams, with arrows representing games pointing from winners to losers.  A team's score was calculated by figuring how many other teams it could "beat" by traversing wins, then subtracting how many teams could "beat" it the same way.  So if Notre Dame never played Oklahoma, it might be ranked higher than Oklahoma if it beat team A which beat Oklahoma, and Oklahoma could only travel back by beating B which beat C.  I wrote a generator that used my method, and also wrote a Colley Matrix implementation, since he'd published his method in a white paper and I wanted a good comparison.
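A minimal sketch of the win-chain score described above, not the original generator (which isn't shown). The function names, the toy schedule, and the choice not to count a team as "beating" itself when it sits on a cycle are all assumptions made for illustration.

```python
from collections import defaultdict


def reachable(graph, start):
    """Every team reachable from `start` by following edges, excluding `start`."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def win_chain_scores(games):
    """games: iterable of (winner, loser) pairs.

    Score = teams you can "beat" through chains of wins
            minus teams that can "beat" you the same way.
    """
    beat = defaultdict(set)      # winner -> losers
    lost_to = defaultdict(set)   # loser  -> winners
    teams = set()
    for winner, loser in games:
        beat[winner].add(loser)
        lost_to[loser].add(winner)
        teams.update((winner, loser))
    return {t: len(reachable(beat, t)) - len(reachable(lost_to, t)) for t in teams}


if __name__ == "__main__":
    # Hypothetical schedule loosely following the Notre Dame / Oklahoma example.
    games = [("Notre Dame", "A"), ("A", "Oklahoma"),
             ("Oklahoma", "B"), ("B", "C")]
    scores = win_chain_scores(games)
    for team in sorted(scores, key=scores.get, reverse=True):
        print(team, scores[team])
```

In this toy schedule Notre Dame never plays Oklahoma, yet it outranks Oklahoma because its win over A chains through to Oklahoma and everything Oklahoma beat.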

A lot of people seem to be judging ranking systems by how well they predict the outcome of future games, but my conclusion is that this is exactly backwards.  What needs to be done is to maximize postdictive accuracy, so that a ranking system minimizes the number of times a team gets ranked higher than a team that defeated it.  In a system where there are no league-wide playoffs, this is the only fair way to compare multiple teams.  As it happened, Colley's system was a bit better at postdiction than mine, by about 10 games over the course of a season.

On the assumption that it is "unfair" to rank a team higher than another that beat it, I resolved to write a new ranking system that would work by explicitly minimizing this "unfairness", kind of a dynamic bubble-sort algorithm in which comparisons of ranking were done by testing how much less "unfair" the list would get after each swap.  Unfortunately, I could not get the sorting process to stabilize, and my foolish foray into trying my ranking system on the 4800 games of Division I basketball put me off the whole effort.  Maybe this is how some of the closed-source computer polls are working, in which case more power to them.

(is 3fingerspointback)

12

^ 11

Postdictive Accuracy Maximized

Shy Elf.

Wed Nov 28, 2007 at 11:13:39 PM EST

5.00 (interesting)

Consider 5 teams who have played two round-robin tournaments, with the following results (A beat B once and C and D twice each, etc.)
   A  B  C  D  E
A  -  1  2  2  0
B  1  -  2  2  0
C  0  0  -  1  2
D  0  0  1  -  2
E  2  2  0  0  -

Records: A and B 5-3; E 4-4; C and D 3-5.
Postdictive accuracy is maximized by the following ranking sets, which get 6 games wrong each:
AB > CD > E and E > AB > CD.  Everything else is worse.
The intuitive AB > E > CD gets 10 games wrong.

So, if we believe that we need to maximize postdictive accuracy, E is either the very best team or the very worst team, and can't be anything else.  This is clearly nonsense.  Maximizing postdictive accuracy is a terrible rating system.
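The counts above can be checked by brute force over all 120 strict orderings of the five teams. This is only a sketch: the `wins` dictionary is transcribed from the table (row team beat column team that many times), the helper name `games_wrong` is made up, and grouped rankings such as "AB" are treated as either order of A and B.

```python
from itertools import permutations

# wins[x][y] = number of times x beat y, transcribed from the table above
wins = {
    "A": {"B": 1, "C": 2, "D": 2, "E": 0},
    "B": {"A": 1, "C": 2, "D": 2, "E": 0},
    "C": {"A": 0, "B": 0, "D": 1, "E": 2},
    "D": {"A": 0, "B": 0, "C": 1, "E": 2},
    "E": {"A": 2, "B": 2, "C": 0, "D": 0},
}


def games_wrong(ranking):
    """Count games won by the lower-ranked team (ranking is best-first)."""
    pos = {team: i for i, team in enumerate(ranking)}
    return sum(n
               for winner, row in wins.items()
               for loser, n in row.items()
               if pos[winner] > pos[loser])


results = {"".join(r): games_wrong(r) for r in permutations("ABCDE")}
best = min(results.values())
best_rankings = sorted(r for r, wrong in results.items() if wrong == best)

if __name__ == "__main__":
    print("fewest games wrong:", best)
    print("rankings achieving it:", best_rankings)
    print("intuitive A B E C D:", games_wrong("ABECD"))
```

Every ordering that reaches the minimum of 6 has E either first or last, while the intuitive ordering with E in the middle gets 10 wrong.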

That college football ranking systems based only on wins and losses get less than half as many games between top 25 teams wrong postdictively as predictively even in the last half of the season is a strong statement that there simply aren't enough games for a reasonably accurate rating system when considering only wins and losses.

13

^ 12

Re: Postdictive Accuracy Maximized

thefadd.

Thu Nov 29, 2007 at 03:49:21 PM EST

none

While you both make strong points, I'm more convinced by your argument than by 3fingers'. Your example, however, contemplates a round robin which is not something that happens in college football--at most teams might play twice in a season before the bowl games, once in their conference regular season and once in their conference championship game. Could a postdictive example be at least more plausible in the college football situation?

It is easy to buy small plaster models of what you think life is like.

15

^ 13

Re: Postdictive Accuracy Maximized

Shy Elf.

Thu Nov 29, 2007 at 10:35:41 PM EST

none

The double round-robin was just to make the absurdity of the result more obvious.  Three team cyclic victories are common, where team A beats team B who beats team C who beats team A.  Maximizing postdictive accuracy says that the ranking sets A > B > C, B > C > A and C > A > B are good while the sets C > B > A, B > A > C, and A > C > B are bad.  This is just silly as well, though a bit less obviously so.

The reasonable win/loss-only ranking systems that use a single strength variable work by defining, for each pair of teams, a probability of victory based on the strengths of the two teams.  For the three computer polls of this type used in the BCS, with P(A) the probability of victory of team A, S(A) the strength of team A and S(B) the strength of its opponent, the formula used is P(A) = 1/(1+exp(S(B)-S(A))).  (I gave a different formula in the writeup, but it's equivalent to this if you take the log of the team strength ratings.)

You then adjust the team strength ratings until the probability of the observed result occurring is maximized.  You need to add an initial probability distribution for the team strength ratings, as otherwise the strength of undefeated teams is calculated as infinite.  The three BCS schemes differ on this initial distribution and on how to handle home field advantage, but that's basically all there is to their rating systems.

For the cyclic case, it results in the intuitive result S(A)=S(B)=S(C), and if additional games are played, these three cyclic games will still pull the rankings of A, B, and C towards being the same.
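Here is a rough sketch of that fitting procedure, not any of the actual BCS systems. It uses the P(A) formula quoted above; the zero-mean Gaussian prior, the plain gradient ascent, and the names `fit_strengths`, `prior_sd`, `steps` and `lr` are all assumptions chosen for illustration, and home field advantage is ignored.

```python
import math


def win_prob(s_a, s_b):
    """P(A beats B) = 1 / (1 + exp(S(B) - S(A)))."""
    return 1.0 / (1.0 + math.exp(s_b - s_a))


def fit_strengths(games, prior_sd=1.0, steps=5000, lr=0.05):
    """Maximize log-likelihood of the results plus a Gaussian log-prior.

    games: list of (winner, loser) pairs.
    The prior (an assumption here) keeps undefeated teams finite.
    """
    teams = sorted({t for game in games for t in game})
    s = {t: 0.0 for t in teams}
    for _ in range(steps):
        # Gradient of the log-prior pulls every strength toward 0.
        grad = {t: -s[t] / prior_sd ** 2 for t in teams}
        for winner, loser in games:
            p = win_prob(s[winner], s[loser])
            grad[winner] += 1.0 - p   # d/dS(winner) of log p
            grad[loser] -= 1.0 - p    # d/dS(loser)  of log p
        for t in teams:
            s[t] += lr * grad[t]
    return s


if __name__ == "__main__":
    cycle = fit_strengths([("A", "B"), ("B", "C"), ("C", "A")])
    print("cyclic case:", cycle)

    unbeaten = fit_strengths([("H", "X"), ("H", "Y")])
    print("undefeated H:", unbeaten)
```

In the cyclic case the three strengths come out equal, and the undefeated team ends up with a finite rating only because of the prior; a wider `prior_sd` lets it drift higher, which is the sense in which the choice of initial distribution matters.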

This probability of victory function has not been checked against actual results, and in fact statistics tells us that the formula we should use is not this but, using the Error function Erf(), P(A) = (1+Erf (S(A)-S(B)))/2.  
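For a feel of how the two curves differ, here is a small comparison using only the standard library. The function names are made up, and both formulas are applied to the same strength difference with no rescaling, which is a simplification: strengths fitted under one formula would not be on the same scale as strengths fitted under the other.

```python
import math


def p_logistic(diff):
    """Logistic form: P(A) = 1 / (1 + exp(S(B) - S(A))), diff = S(A) - S(B)."""
    return 1.0 / (1.0 + math.exp(-diff))


def p_erf(diff):
    """Error-function form: P(A) = (1 + erf(S(A) - S(B))) / 2."""
    return (1.0 + math.erf(diff)) / 2.0


if __name__ == "__main__":
    for diff in (0.0, 0.5, 1.0, 2.0):
        print(f"diff={diff:.1f}  logistic={p_logistic(diff):.3f}  erf={p_erf(diff):.3f}")
```

Both give an even game at zero difference and are symmetric about it; the error-function curve has thinner tails, so it treats large strength gaps as closer to a sure thing.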

There are not enough games to determine relative team strengths with reasonable reliability until the second best team has about 4 losses, so determining team strengths by only wins and losses will never work well for college football.  Game wins and losses only cannot rule out Hawaii being so good that they would have a 99% chance of beating the next best team, and the only reason that the BCS computers have them listed around 12th is the assumption that they're probably close to average strength except to the extent that they prove otherwise.

16

^ 12

Re: Postdictive Accuracy Maximized

3fingerspointback.

Fri Nov 30, 2007 at 07:54:14 PM EST

none

Elements like that are probably the reason my system didn't stabilize.  I did try to also account for win percentages and a bit of strength of schedule as well, but it obviously wasn't enough.

However, I can't agree that margin of victory should be included in any automated poll, because the simple implementation will be naturally biased against strong teams that win low-scoring games through good defense, and even methods that account for this will lead to a prisoner's dilemma situation in which high-scoring teams would be compelled to always run up the score against their opponents whether or not they want to.

(is 3fingerspointback)

17

^ 16

Re: Postdictive Accuracy Maximized

Shy Elf.

Fri Nov 30, 2007 at 09:30:35 PM EST

none

These are valid objections, but with the small number of college games between top teams, going to win/loss only is so much worse than victory margin at estimating team strengths that going back to it is really throwing the baby out with the bathwater.

These are good ways to try improving a victory-margin based system, however.   A system which maintained separate offensive and defensive strengths, maybe separately for rushing and passing, would probably be an improvement.  A system which used statistics on success rates by possession would compensate for passing based teams getting more possessions simply by not burning as much of the clock.  It would probably help to have a system which discards points in "junk time" when the winner is obvious and one team (usually Florida, in my experience) is trying to run up the score at the end of the game.  Maybe taking the lowest lead in the 4th quarter would help.  Discounting turnover effects, which are more affected by luck, might help.  (Betting against the team with the best turnover margin thus far in the season I intuitively think ought to work, since this team has generally been lucky so far in the season, though there is a real talent effect involved in generating turnovers.)

But my main point is that there are so few games between the top teams that any system good at predicting results must distinguish in some way between easy victories and tough-fought close victories.

14

^ 11

Re: Maximize Postdictive Accuracy.

thefadd.

Thu Nov 29, 2007 at 03:51:08 PM EST

none

...which brings up the question of course of whether polls ought to be postdictive or predictive. Should a team that is 0-2 but has just gotten back a star player be ranked based on what they've done or on "the team they are now" with that star player back?

It is easy to buy small plaster models of what you think life is like.

1

It'll Never Change

keta.

Tue Nov 27, 2007 at 01:41:16 PM EST

4.00 (astute)

It's a racket in which the major programs make huge amounts of money.  Why would they want a change?  If you go to a playoff system, lesser ranked teams will have success, gain popularity in recruiting the best talent, and suddenly the big boys aren't as successful and aren't making as much dough.

The last thing college football movers and shakers want is anything akin to the NCAA basketball set up.  They don't want to share, and they don't give a fuck about the fans or the players themselves.

10

^ 1

Re: It'll Never Change

Shy Elf.

Wed Nov 28, 2007 at 12:19:03 AM EST

5.00 (astute)

Everyone knows that a playoff system would make more money.  What the major programs don't believe is that it would make more money for them, since their share would be smaller.  In particular it's the Pac 10 and Big 10 that oppose any changes, since they seem convinced that they would make more money by sending their champions to the Rose Bowl for one game.

There's a natural playoff system for college football.  You already have conferences which do a reasonable job of determining the best team in each conference, so give the champions of the strong Big 10, Pac 10, Big 12 and SEC conferences a bye, add the other 7 conference champions and the strongest independent and you have a 12 team tournament.  This is the last thing the major programs would like to see, since it would naturally lead to a more equitable distribution of the money.  It actually isn't necessary at all, since the only non-BCS conference anyone cares about is the WAC, but this revenue redistribution is what the major schools are afraid of.

There's a lot of excessive tax-free spending going on here.  How can the Big 10 spend $10.95M on expenses for 7 games?  Wouldn't it be cheaper to buy a few houses every year for the players to stay in?  What are they doing, eating nothing but caviar?

To some extent it's fine that the football money is going to the BCS schools who produce the games that people actually want to watch.  But what if you win a bid as a non-BCS school?  You get $4.5M before splitting with your conference.  To keep up with a BCS conference school, your school needs to get a bid almost every other year, which is just impossible.  Even BCS schools don't manage that.  So we have a two-tier system, with the non-BCS schools with no real hope of competing, no matter how good they are.

It's the championship game which is used as an excuse to justify this monopoly on the good bowl payouts.  Everyone has to participate in order to guarantee that the top two teams play, and that is used to justify everyone getting together and parceling out the payouts before we know who is deserving of them.

I see a better than even chance that we'll have a "+1" system starting in 2010 with the next TV contract, when the TV networks throw money at the other 4 BCS conferences to make it happen.  There's also a good chance that the Pac 10 and Big 10 would opt out of the (4 team) national playoff then.

2

^ 1

Here Here

thefadd.

Tue Nov 27, 2007 at 02:09:26 PM EST

none

It will only change if they can figure out a way to bleed more money out of their indentured servants, I mean student-athletes.

I always liked the old system, it had tradition and controversy and allowed for lots of teams to end the season with a win. A playoff system would of course make sporting sense, although as a college sport I don't really care so much that it make sporting sense. What they have now of course is a meaningless mishmash that determines nothing.

It is easy to buy small plaster models of what you think life is like.

4

^ 2

Re: Here Here

Steve Urkel.

Tue Nov 27, 2007 at 03:43:01 PM EST

none

I liked the old system too, where teams would play for the traditional bowls of their conferences. I find the claim "We need to determine who is number one" weird. Why do we need to do that? It's college football.

5

^ 4

Re: Here Here

keta.

Tue Nov 27, 2007 at 06:21:55 PM EST

none

Steve, add, "...in America" to the end of your last sentence and it answers your query.

6

^ 5

Re: Here Here

thefadd.

Tue Nov 27, 2007 at 06:32:10 PM EST

none

It's funny, though. Are there other sporting organizations in the world that decidedly don't work their way to a single un-debated #1?

It is easy to buy small plaster models of what you think life is like.

7

^ 6

Re: Here Here

Steve Urkel.

Tue Nov 27, 2007 at 06:55:49 PM EST

none

Cycling?

I'm not sure what year it stopped, but in college football they used to do the polls for #1 before some teams were finished, which is even weirder.

9

^ 7

Re: Here Here

Shy Elf.

Tue Nov 27, 2007 at 10:23:04 PM EST

4.00 (astute)

No, Cycling works their way to a single undebated #1, and then disqualifies him for doping violations, at least the last two years.  It's not the same thing as not trying.

If you go far enough back, they did the final polls before all the bowl games.

3

Filthy Lucre

uncarved block.

Tue Nov 27, 2007 at 02:18:56 PM EST

4.00 (interesting)

   As just a casual sports fan- and one with a bit of a head cold right now- I'll just throw out a few comments, and ask for a little sympathy if they don't all pull together.

    Is the BCS system a shambles in many ways? Well, if the goal is to determine a clear #1 team* at the end of the year, sure. You could even add to the list of complaints that teams are ranked in the polls before playing a single game; dubious at best considering how much a team can change from one year to the next, and clearly a method that encourages regional and/or conference bias. (Yet another reason Hawaii never stood a chance.) College basketball has a similar problem, but with a rather famous tournament at season end, and many more games, this is not much of a problem.
    But if you look at the current system as a way to maximize viewer attention and commercial interest, it's about the best that you could come up with. Controversy helps when the subject isn't life threatening, and having a little spice every year certainly hasn't harmed college football much. The ranking systems, flawed as they are, still manage to create the hope for parity in the seemingly boundless number of bowl games-- that close games don't always happen could be explained by the extended layoff, but that's pure conjecture. As long as there's a constant demand for something to watch during the holiday break, I have doubts the current bowl system will be replaced entirely by a playoff system any time soon.
    So why is there so much discussion about the system being broken? Two things spring immediately to mind: the first is that there are a hell of a lot more sports reporters than there used to be, and they all have to have something to write about to keep getting paid-- and writing a column about the BCS rankings doesn't require a whole lot of legwork. The other factor- more subjective and personal- is the American dislike for controversy, for not having one right answer to the problem at hand. Just because we viewers participate in, and really drive, the solution that's been reached, doesn't mean we won't complain about the outcome anyway. What can I say? It seems to be a strong American trait to bitch about anything and everything, given the chance . . .

     *Interesting to consider is just how it all plays out further down the list as well.   A difference between #10 and #14 could mean a big difference in the financial payoff as well, but I haven't noticed that the rankings are any more precise in that range either.

Ex ignorantia ad sapientiam; e luce ad tenebras

8

This sport needs a playoff.

MayorBob.

Tue Nov 27, 2007 at 07:04:08 PM EST

4.00 (interesting)

All other divisions of the NCAA have playoffs to establish a football champion.  Delaware is still alive in Division IAA but I fear soon to fall to Northern Iowa.  Collegiate basketball has one made up of a field of 65 teams at the end of the year.  They even have more if you count the NIT, and who does?  Baseball, gymnastics, lacrosse, and so on.  Yet because the schools became wedded to the bowl concept at the end of the year, Division IA doesn't have one.  It's all about money, but I would think the NCAA and the schools would more than make up their money with a playoff series.

Take the top twenty teams at the end of the year (as determined by one of the major polls).  The first weekend, the bottom ranked eight teams play play-in games.  The four winners join up with the top 12 teams and they begin the single elimination tournament.  The top seven bowl sites could serve as the sites for the final three rounds of the tournament (and trade off which one gets to be the championship game).  All the other bowls are basically just companies looking to endorse a bowl, so let them pick and choose which one of the precursor games is going to get to be called the Autozone Quarter Final Classic.

Illegitimi non carborundum.
