It's that time again: Raymond comes up with an absurd, arbitrary criterion for filling out his NCAA bracket.
This time, I studied all the games played in the NCAA men's basketball tournament since 1985 and computed how many of the games were won by the favorite and how many were upsets, broken down by the numerical difference between the seeding of the two teams.
Seed Difference   Winner: Favorite   Winner: Underdog   Upset Rate
 1                121                105                46%
 2                 14                 18                56%
 3                110                 76                41%
 4                 55                 23                29%
 5                113                 45                28%
 6                  9                  2                18%
 7                100                 42                30%
 8                114                 40                26%
 9                 81                 19                19%
10                  3                  2                40%
11                 86                 15                15%
12                  3                  0                 0%
13                 84                  4                 5%
14                  0                  0                N/A
15                 88                  0                 0%
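The Upset Rate column is underdog wins divided by total games played at that seed difference. Here is a minimal Python sketch that recomputes it from the counts in the table. The names RESULTS and upset_rate are made up for illustration and are not part of the original analysis.

```python
# (favorite wins, underdog wins) keyed by seed difference, copied from the table.
RESULTS = {
    1: (121, 105), 2: (14, 18), 3: (110, 76), 4: (55, 23), 5: (113, 45),
    6: (9, 2), 7: (100, 42), 8: (114, 40), 9: (81, 19), 10: (3, 2),
    11: (86, 15), 12: (3, 0), 13: (84, 4), 14: (0, 0), 15: (88, 0),
}

def upset_rate(favorite_wins, underdog_wins):
    """Fraction of games won by the underdog, or None if no games were played."""
    total = favorite_wins + underdog_wins
    if total == 0:
        return None  # seed difference 14 has no games, hence "N/A"
    return underdog_wins / total

for diff, (fav, dog) in sorted(RESULTS.items()):
    rate = upset_rate(fav, dog)
    print(diff, "N/A" if rate is None else f"{rate:.0%}")
```

Seed difference 2 is the only row where the rate comes out above one half (18 upsets in 32 games).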
I found it interesting that when the teams are seeded N and N+2, you get an upset more than half the time!
If the probability of the favorite winning is p and I choose the favorite with probability q, then the prediction would be correct with probability pq + (1 − p)(1 − q) = (2p − 1)q + (1 − p). This is linear in q, so it is largest at q = 1 when p > 1/2 and at q = 0 when p < 1/2.
Therefore, to maximize the number of correct predictions, I should always choose the favorite, unless the two teams are seeded N and N+2, in which case I always choose the underdog. But that makes for a boring bracket. Consequently, I went for the suboptimal algorithm of choosing q = p.
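To see how much the "boring" strategy gives up, here is a small Python sketch of the reasoning above. It assumes the pick is made with a random number generator, which the post does not actually say. The function names accuracy and pick_favorite are invented for illustration.

```python
import random

def accuracy(p, q):
    """Chance a pick is correct when the favorite wins with probability p
    and the favorite is picked with probability q."""
    return p * q + (1 - p) * (1 - q)

def pick_favorite(p, rng=random):
    """The q = p rule: pick the favorite as often as favorites have won.
    Using an RNG here is an assumption, not something the post states."""
    return rng.random() < p

# Seed difference 2 from the table: favorites won 14 of 32 games.
p = 14 / 32
print(accuracy(p, 1.0))  # always pick the favorite: 0.4375
print(accuracy(p, 0.0))  # always pick the underdog: 0.5625
print(accuracy(p, p))    # q = p: 0.5078125
```

With q = p the accuracy is p² + (1 − p)², which never beats the better of the two all-or-nothing choices. That is the sense in which the algorithm is suboptimal.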
Update:
 Correct predictions are in green.
 Incorrect predictions are in red.
 (!) marks upsets correctly predicted.
 (*) marks upsets predicted but did not take place.
 (x) marks actual upsets not predicted.
Opening Round Game 

16  Mount St. Marys  Mount St. Marys  
16  Coppin St.  
East bracket 

1  North Carolina  North Carolina  North Carolina  North Carolina  Louisville  
16  Mount St. Marys  
8  Indiana  Arkansas (!)  
9  Arkansas  
5  Notre Dame  Notre Dame  Washington St.  
12  George Mason  
4  Washington St.  Washington St.  
13  Winthrop  
6  Oklahoma  Oklahoma  Louisville  Louisville (!)  
11  St. Joes  
3  Louisville  Louisville  
14  Boise St.  
7  Butler  South Alabama (*)  Tennessee  
10  South Alabama  
2  Tennessee  Tennessee  
15  American  
Midwest bracket 

1  Kansas  Kansas  Kansas  Kansas  Gonzaga  
16  Portland St.  
8  UNLV  UNLV  
9  Kent St.  
5  Clemson  Villanova (!)  Villanova  
12  Villanova  
4  Vanderbilt  Siena (!)  
13  Siena  
6  USC  Kansas St. (!)  Wisconsin  Gonzaga  
11  Kansas St.  
3  Wisconsin  Wisconsin  
14  Cal St. Fullerton  
7  Gonzaga  Gonzaga (x)  Gonzaga  
10  Davidson  
2  Georgetown  UMBC (*)  
15  UMBC  
South bracket 

1  Memphis  Memphis  Oregon  Michigan St.  Michigan St.  
16  Texas Arlington  
8  Mississippi St.  Oregon (x)  
9  Oregon  
5  Michigan St.  Michigan St.  Michigan St. (!)  
12  Temple  
4  Pittsburgh  Pittsburgh  
13  Oral Roberts  
6  Marquette  Marquette  Stanford  Stanford (*)  
11  Kentucky  
3  Stanford  Stanford  
14  Cornell  
7  Miami Fla  Miami Fla  Texas  
10  St. Marys  
2  Texas  Texas  
15  Austin Peay  
West bracket 

1  UCLA  UCLA  UCLA  UCLA  Xavier  
16  Miss Valley St.  
8  BYU  Texas A&M (!)  
9  Texas A&M  
5  Drake  Drake (x)  Drake  
12  W. Kentucky  
4  Connecticut  San Diego (!)  
13  San Diego  
6  Purdue  Purdue  Xavier  Xavier  
11  Baylor  
3  Xavier  Xavier  
14  Georgia  
7  West Virginia  West Virginia  Duke  
10  Arizona  
2  Duke  Duke  
15  Belmont  
Finals 

3  Louisville  Michigan St.  Michigan St.  
5  Michigan St.  
7  Gonzaga  Xavier  
3  Xavier  
Well, Raymond, as of today you are 17 for 17. You even got the opening game right.
You got skin in the game or just pontificating? Surely your coworkers set up a $5 bracket or some such.
I’d say Raymond probably doesn’t have any "skin" in the game. If you check out this post < http://blogs.msdn.com/oldnewthing/archive/2006/03/16/552822.aspx > you’ll see he doesn’t know squat about basketball and just does this for fun. The first year the teams were ranked based on whose president served the longest, and the next year it was based on the pay of the head coach.
I gave a copy of Raymond’s brackets to an NCAA freak here at work and his jaw dropped when he saw Raymond had predicted the UMBC upset of Georgetown. I dunno, Raymond, you might have found a good method here!
WHOO! Go Retrievers!
I graduated from UMBC last year, and I can’t remember any sort of basketball team on campus. I have no idea how they managed to make it to the NCAA tournament, much less beat the #2 team.
Shouldn’t they play the game before a winner is declared? It’s starting on 3/21 at 3:10…
OK, I’ll bite: why are so few matches played between teams with an even seed difference?
You ought to read Isaac Asimov’s ‘The Machine That Won The War’ for the most effective statistical algorithm ever created to make the right choices in situations like this.
All the first round games are played with an odd seed difference (1 plays 16, etc). More than half the games in any bracket are first round games, so assuming a 50/50 distribution of even and odd seed differences after the first round means that 75% of the games are odd.
Raymond, K St. over USC, impressive.
Because of the way the tournament is structured, the best team plays the worst team. If all the favorites win (they often do), you will always have an odd number for the seed difference (only an upset produces an even seed difference in the next round). The only exception is when the 8th seed plays the 9th seed in the first round; it’s a virtual coin toss. Therefore, it’s 50/50 whether a first seed meets an 8th or a 9th seed in the second round of regional action. (Look at rows 7 and 8: nearly identical.)
So far you’ve only missed two… looks like you have a pretty good method.
Go Michigan St.!
I like the looks of your bracket, Michigan State out front like they should be!
Go Green!
I have nothing to add other than the fact that you have two South brackets and no West bracket. I nitpick at thee!
You did extremely well with your first round picks. Nice job. I’m trying to understand your methodology. You wrote that you "went for the suboptimal algorithm of choosing q = p." Can you explain that more? I don’t understand but am fascinated by what you came up with.
@patrick — it means that instead of always picking the favorite, he’ll pick the favorite most of the time: for seed difference 1 (upset rate 46%), he’s 46% likely to choose the upset (probably by RNG).
@ScottB & Raymond. I don’t understand… Take, for example, the choices when #4 vs. #13. Twice the #4 seed is picked to win, twice the #13 seed is picked, but the upset rate for this differential is not 50%. And the two times the #13 was picked (San Diego over UCONN, Siena over Vanderbilt), Raymond was correct. Why were these particular #13’s chosen and not the other two #13s? Just trying to figure out whether this was random chance.