# The Perfect March Madness Bracket, Part I

There are lots of posts out there on the probability of predicting a perfect 66-team tournament bracket. The logic does not seem terribly refined, though; the simplest approach misleads us into thinking a perfect bracket is far less likely than it really is, as do many of the other methods. qb’s approach will be to come up with the maximum conceivable probability based on seed-to-seed matchups. The simplest method requires only 5 keystrokes on an RPN calculator, but it is the methodological equivalent of asking what the probability is of the proverbial monkey sitting down at a typewriter and hammering out _King Lear_. A perfect bracket is actually at least a million times more probable than that, because humans know how to use historical data reasonably well.

**The Simplest Approach**

These days, there are two play-in games to fill a 64-team bracket, so there are actually 66 teams involved. In order to get one champion out of that mess, we have to play – and therefore predict the outcomes of – 65 games. (Prove it: 2+32+16+8+4+2+1 = 65.)

Assumption #1. If we assume that in every game the teams are evenly matched, the probability of a pool participant (“bettor”) picking the winner correctly is 1/2.

Assumption #2. If we further assume that the outcome of any game is completely independent of the outcomes of any or all of the rest of the games, probability theory says we can simply multiply all of the individual probabilities together to obtain the overall probability of the entire list of predictions being 100% correct.

If there were only 3 games (i.e., if the NCAA tourney were only the Final Four), then the probability of a single person picking the bracket perfectly under those same assumptions would be (1/2) x (1/2) x (1/2) = 1/8, or 12.5%. In the case of a 65-game bracket, the probability is (1/2)^65 = 2.7105 x 10^-20; in other words, on average, one bettor in roughly 36,893,500,000,000,000,000 (36.9 QUINTILLION!) will get it right, picking evenly-matched teams at random.
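For anyone who would rather check the arithmetic than trust a calculator display, here is a minimal Python sketch of the coin-flip model. It assumes exactly the two assumptions above (every game is a 50/50 pick, and games are independent); the names `games_per_round` and `perfect_bracket_probability` are made up for illustration.

```python
from fractions import Fraction

# Games per round in the 66-team format described above:
# play-in, then 32, 16, 8, 4, 2, and the title game.
games_per_round = [2, 32, 16, 8, 4, 2, 1]
total_games = sum(games_per_round)


def perfect_bracket_probability(n_games, p_correct=Fraction(1, 2)):
    """Chance of picking n_games independent games correctly,
    each with the same probability p_correct (Assumptions #1 and #2)."""
    return p_correct ** n_games


p = perfect_bracket_probability(total_games)

print(total_games)                     # 65
print(perfect_bracket_probability(3))  # 1/8 (the Final Four-only example)
print(f"1 in {p.denominator:,}")       # 1 in 36,893,488,147,419,103,232
print(f"{float(p):.4e}")               # 2.7105e-20
```

Using `Fraction` keeps the result exact, which is why the denominator comes out as the full 2^65 rather than a rounded figure.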

Incidentally, two things. First, you’ll see some folks saying the odds are one in NINE quintillion. That number is correct if you’re only picking the 64-team bracket of yesteryear. Second, in these days of Obamacare and limp-spined RINOs, the quantity “trillion” is starting to sound ordinary, so “quintillion” – a million trillions – just causes the ol’ melon to go *TILT*. But let’s say each person in the United States (~315 million of us) picked a bracket of evenly-matched teams every year until, on average, we finally got a perfect bracket from someone. The odds say that it would take us 117 BILLION years to get one; *or a more rigorous interpretation would be that if we ever got one, it would probably take 117 billion more years until we got another one.* Given that the scientists say the earth is about 4.5 billion years old, give or take…well, you get the picture.
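The two numbers in that paragraph can be reproduced in a few lines. This is only a sketch under the same coin-flip assumptions: the population figure is the rough ~315 million used above, and `expected_years` is a hypothetical helper that treats each bracket as an independent try, so the average wait is simply the odds divided by the number of brackets filled out per year.

```python
ODDS_65_GAMES = 2 ** 65      # 66-team format: 65 games to pick
ODDS_63_GAMES = 2 ** 63      # old 64-team format: 63 games to pick
US_POPULATION = 315_000_000  # rough figure from the text, not a census value


def expected_years(odds, brackets_per_year):
    """Average number of years until one perfect bracket shows up,
    assuming independent brackets that each succeed with chance 1/odds."""
    return odds / brackets_per_year


years = expected_years(ODDS_65_GAMES, US_POPULATION)

print(f"1 in {ODDS_63_GAMES:,}")            # 1 in 9,223,372,036,854,775,808
print(f"{years / 1e9:.0f} billion years")   # 117 billion years
```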

**Surely It’s Not That Hard!**

Those odds just seem absurdly long, don’t they? I certainly think so. A lot of people get pretty close. And the main reason has to do with Assumption #1; we know, intuitively, that the games are NOT evenly matched, and further, we have a pretty good idea which of the two teams in each matchup is the better team, even though it gets tougher to predict as we reach the later rounds in the tournament. In other words, we’re intelligent beings, most of us (current White House occupants and their 2008 and 2012 voters excluded).

That’s what “seeding” is about. The NCAA seeding procedure is based upon the assumption that the better a team does in the regular season, the more the odds should be stacked in that team’s favor to reach the Final Four, where the real barnburners take place. There’s a financial aspect to that, of course, because no TV network exec in his right mind wants Duke, North Carolina, Kentucky, Kansas, Ohio State, Florida, UCLA, and Michigan knocking each other out by the end of the Sweet Sixteen, leaving UNM vs. Colorado State and Michigan vs. Central Florida to battle it out in the Final Four. (Well, maybe @Sam Smeaton would want such an outcome.)

But there’s also an incentive aspect: If a team’s position in the tournament is going to be chosen at random, what’s the incentive to play hard in the conference tournaments or the late regular season if that team knows its early-season record is enough to score an invite to the dance? But if the most dominant teams during the regular season are assured the easier paths to the Final Four, then there’s a powerful incentive to finish strong.

So the upshot is that the brackets are designed to reward the top seeds: in the first round of each region, the #1 seed plays #16, #2 plays #15, and so forth, all the way to #8 vs. #9. If the seeders have done their job well, we ought to expect that #1 is highly likely to beat #16 but that the #8-#9 tussle will be a knuckle-biter. Assuming the higher-seeded team wins each game in the first round, the next round pits #1-#8, #2-#7 and so forth. Again, #1 ought to beat #8 handily, and #4-#5 should be very closely contested. The same principle applies all the way to the four regional finals, where we expect each region’s #1 seed to play #2 for the right to go to the Final Four.

It turns out that, historically speaking, the seeders have done pretty well, if we measure that by the relative success of each seed in the first round. Number 16 has NEVER beaten #1, #2 beats #15 about 95% of the time, and (surprisingly or not) #8 beats #9 only about 47% of the time.

So we’re on pretty safe ground substituting a different first assumption, represented by the graph below.

The horizontal (x) axis represents the difference between the seeds of any two opposing teams; for the #13 vs. #4 game in the first round, this number is 13-4=9. The vertical (y) axis is the probability that the higher-seeded team (the one with the smaller seed number) will win. According to this chart, the #4 seed has historically won 78% of its first-round games.
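To make the x-axis concrete, here is a small sketch that lists the first-round pairings in one region and the seed difference for each. It relies only on the pairing rule described earlier (#1 plays #16, #2 plays #15, and so on down to #8 vs. #9); the function names are invented for illustration, and no win probabilities are included because those have to be read off the chart.

```python
def first_round_matchups(seeds_per_region=16):
    """Pair the best seed with the worst, second-best with second-worst, etc."""
    half = seeds_per_region // 2
    return [(seed, seeds_per_region + 1 - seed) for seed in range(1, half + 1)]


def seed_difference(seed_a, seed_b):
    """Difference between the two seed numbers (the chart's x-axis value)."""
    return abs(seed_a - seed_b)


for high, low in first_round_matchups():
    print(f"#{high} vs. #{low}: seed difference {seed_difference(high, low)}")
```

Note that first-round seed differences are always odd (15, 13, ..., 1), since the two seeds in each pairing add up to 17.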

Caveat: for the Final Four games, of which there are three, all of the contestants are expected to be #1 seeds, so we should probably assume these games are evenly matched so that the probability of predicting each game correctly in the Final Four is 1/2 or 50%.

In the play-in round, there are two games, each involving evenly matched but likely inferior opponents. Let’s assume the probabilities here are 50% as well.

To recap thus far, the odds of picking both play-in games correctly are 1/4, and the odds of picking the Final Four results perfectly are 1/8. Multiply the two together, and we see that already we’re down to a 1/32 chance, and we haven’t even dealt with the First, Second, Third, and Fourth rounds, a whole 60 games!
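The recap is easy to verify. A quick sketch, assuming (as above) that the two play-in games and the three Final Four games are all 50/50 picks; the variable names are just for illustration.

```python
from fractions import Fraction

p_even = Fraction(1, 2)        # assumed chance of calling an evenly matched game

p_play_in = p_even ** 2        # two play-in games
p_final_four = p_even ** 3     # two semifinals plus the title game
p_so_far = p_play_in * p_final_four

total_games = 65
remaining_games = total_games - 2 - 3   # everything between play-in and Final Four

print(p_play_in, p_final_four, p_so_far)  # 1/4 1/8 1/32
print(remaining_games)                    # 60
```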

Part II to come, with the payoff.