# Raymond’s highly scientific predictions for the 2008 NCAA men’s basketball tournament

It's that time again: Raymond comes up with an absurd, arbitrary criterion for filling out his NCAA bracket.

This time, I studied all the games played in the NCAA men's basketball tournament since 1985 and computed how many of the games were won by the favorite and how many were upsets, broken down by the numerical difference between the seeding of the two teams.

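| Seed difference | Winner: favorite | Winner: underdog | Upset rate |
|---|---|---|---|
| 1 | 121 | 105 | 46% |
| 2 | 14 | 18 | 56% |
| 3 | 110 | 76 | 41% |
| 4 | 55 | 23 | 29% |
| 5 | 113 | 45 | 28% |
| 6 | 9 | 2 | 18% |
| 7 | 100 | 42 | 30% |
| 8 | 114 | 40 | 26% |
| 9 | 81 | 19 | 19% |
| 10 | 3 | 2 | 40% |
| 11 | 86 | 15 | 15% |
| 12 | 3 | 0 | 0% |
| 13 | 84 | 4 | 5% |
| 14 | 0 | 0 | N/A |
| 15 | 88 | 0 | 0% |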
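For anyone who wants to check the arithmetic, the upset rate is just the underdog's wins divided by the total number of games at that seed difference. Here is a minimal sketch in Python; the names `COUNTS` and `upset_rate` are made up for illustration, and the numbers are copied from the table above.

```python
# Win counts copied from the table: seed difference -> (favorite wins, underdog wins).
COUNTS = {
    1: (121, 105), 2: (14, 18), 3: (110, 76), 4: (55, 23), 5: (113, 45),
    6: (9, 2), 7: (100, 42), 8: (114, 40), 9: (81, 19), 10: (3, 2),
    11: (86, 15), 12: (3, 0), 13: (84, 4), 14: (0, 0), 15: (88, 0),
}


def upset_rate(favorite_wins, underdog_wins):
    """Fraction of games won by the underdog, or None if no games were played."""
    total = favorite_wins + underdog_wins
    if total == 0:
        return None  # the table shows this case as N/A
    return underdog_wins / total


for diff, (fav, dog) in COUNTS.items():
    rate = upset_rate(fav, dog)
    shown = "N/A" if rate is None else f"{rate:.0%}"
    print(diff, fav, dog, shown)
```

Seed difference 14 has no games at all, so the sketch returns `None` there rather than dividing by zero.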

I found it interesting that when the teams are seeded N and N+2, you get an upset more than half the time!

If the probability of the favorite winning is p and I choose the favorite with probability q, then the prediction would be correct pq + (1−p)(1−q) = (2p−1)q + (1−p) of the time. If you hold p constant, then this is maximized when q = 0 if p < ½, or when q = 1 if p > ½. (If p = ½, then it doesn't matter what you pick for q.)
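The formula is easy to play with numerically. The following is only a sketch: `accuracy` and `best_q` are hypothetical helper names, and the two values of p are taken from the seed-difference 1 and 2 rows of the table.

```python
def accuracy(p, q):
    """Chance of a correct pick when the favorite wins with probability p
    and I pick the favorite with probability q."""
    return p * q + (1 - p) * (1 - q)


def best_q(p):
    """The q that maximizes accuracy for a fixed p.
    At p == 0.5 every q ties, so returning 1.0 there is an arbitrary choice."""
    return 1.0 if p >= 0.5 else 0.0


# Favorite win rates from the table: seed difference 1 and seed difference 2.
for label, fav, dog in [("difference 1", 121, 105), ("difference 2", 14, 18)]:
    p = fav / (fav + dog)
    print(label,
          "always favorite:", round(accuracy(p, 1.0), 3),
          "always underdog:", round(accuracy(p, 0.0), 3),
          "mixed, q = p:", round(accuracy(p, p), 3))
```

Since the expression is linear in q, the best value always sits at one end of the range, and any mixed choice of q lands somewhere between the two extremes.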

Therefore, to maximize the number of correct predictions, I should always choose the favorite, unless the two teams are seeded N and N+2, in which case I always choose the underdog. But that makes for a boring bracket. Consequently, I went for the suboptimal algorithm of choosing q = p. Here is the result:

Update:

• Correct predictions are in green.
• Incorrect predictions are in red.
• (!) marks upsets correctly predicted.
• (*) marks upsets that were predicted but did not take place.
• (x) marks actual upsets not predicted.

#### Opening Round Game

16 Mount St. Marys Mount St. Marys
16 Coppin St.

#### East bracket

1 North Carolina North Carolina North Carolina North Carolina Louisville
16 Mount St. Marys
8 Indiana Arkansas (!)
9 Arkansas
5 Notre Dame Notre Dame Washington St.
12 George Mason
4 Washington St. Washington St.
13 Winthrop
6 Oklahoma Oklahoma Louisville Louisville (!)
11 St. Joes
3 Louisville Louisville
14 Boise St.
7 Butler South Alabama (*) Tennessee
10 South Alabama
2 Tennessee Tennessee
15 American

#### Midwest bracket

1 Kansas Kansas Kansas Kansas Gonzaga
16 Portland St.
8 UNLV UNLV
9 Kent St.
5 Clemson Villanova (!) Villanova
12 Villanova
4 Vanderbilt Siena (!)
13 Siena
6 USC Kansas St. (!) Wisconsin Gonzaga
11 Kansas St.
3 Wisconsin Wisconsin
14 Cal St. Fullerton
7 Gonzaga Gonzaga (x) Gonzaga
10 Davidson
2 Georgetown UMBC (*)
15 UMBC

#### South bracket

1 Memphis Memphis Oregon Michigan St. Michigan St.
16 Texas Arlington
8 Mississippi St. Oregon (x)
9 Oregon
5 Michigan St. Michigan St. Michigan St. (!)
12 Temple
4 Pittsburgh Pittsburgh
13 Oral Roberts
6 Marquette Marquette Stanford Stanford (*)
11 Kentucky
3 Stanford Stanford
14 Cornell
7 Miami Fla Miami Fla Texas
10 St. Marys
2 Texas Texas
15 Austin Peay

#### West bracket

1 UCLA UCLA UCLA UCLA Xavier
16 Miss Valley St.
8 BYU Texas A&M (!)
9 Texas A&M
5 Drake Drake (x) Drake
12 W. Kentucky
4 Connecticut San Diego (!)
13 San Diego
6 Purdue Purdue Xavier Xavier
11 Baylor
3 Xavier Xavier
14 Georgia
7 West Virginia West Virginia Duke
10 Arizona
2 Duke Duke
15 Belmont

#### Finals

3 Louisville Michigan St. Michigan St.
5 Michigan St.
7 Gonzaga Xavier
3 Xavier

1. Tom says:

Well, Raymond, as of today you are 17 for 17.  You even got the opening game right.

2. Nathan_works says:

You got skin in the game or just pontificating? Surely your coworkers set up a $5 bracket or somesuch.

3. Tom says:

I’d say Raymond probably doesn’t have any "skin" in the game. If you check out this post < http://blogs.msdn.com/oldnewthing/archive/2006/03/16/552822.aspx > you’ll see he doesn’t know squat about basketball and just does this for fun. The first year the teams were ranked based on whose president served the longest, and the next year it was based on the pay of the head coach.

I gave a copy of Raymond’s brackets to an NCAA freak here at work and his jaw dropped when he saw Raymond had predicted the UMBC upset of Georgetown. I dunno, Raymond, you might have found a good method here!

4. Josh says:

WHOO! Go Retrievers!

I graduated from UMBC last year, and I can’t remember any sort of basketball team on campus. I have no idea how they managed to make it to the NCAA tournament, much less beat the #2 team.

5. SM says:

Shouldn’t they play the game before a winner is declared?  It’s starting on 3/21 at 3:10…

6. Neil says:

OK, I’ll bite: why are so few matches played between teams with an even seed difference?

7. Euro says:

You ought to read Isaac Asimov’s ‘The Machine That Won The War’ for the most effective statistical algorithm ever created to make the right choices in situations like this.

8. Craig says:

OK, I’ll bite: why are so few matches played between teams with an even seed difference?

All the first round games are played with an odd seed difference (1 plays 16, etc). More than half the games in any bracket are first round games, so assuming a 50-50 distribution of even and odd seed differences after the first round means that about 75% of the games have an odd seed difference.

9. Daniel says:

Raymond, K St. over USC, impressive.

OK, I’ll bite: why are so few matches played between teams with an even seed difference?

Because of the way the tournament is structured, the best team plays the worst team. If all the favorites win (they often do), you will always have an odd number for the seed difference (only an upset produces an even seed difference in the next round). The only exception is when the 8th seed plays the 9th seed in the first round; it’s a virtual coin toss. Therefore, it’s 50-50 whether a first seed meets an 8th or a 9th seed in the second round of regional action. (Look at rows 7 and 8: nearly identical.)

10. Julia says:

So far you’ve only missed two… looks like you have a pretty good method.

Go Michigan St.!

11. I like the looks of your bracket, Michigan State out front like they should be!

Go Green!

12. john says:

I have nothing to add other than the fact that you have two South brackets and no West bracket.  I nitpick at thee!

13. patrick says:

You did extremely well with your first round picks. Nice job.  I’m trying to understand your methodology. You wrote that you "went for the suboptimal algorithm of choosing q = p."  Can you explain that more? I don’t understand but am fascinated by what you came up with.

14. ScottB says:

@patrick — it means that instead of always picking the favorite, he’ll pick the favorite most of the time: for seed difference 1 (upset rate 46%), he’s 46% likely to choose the upset (probably by RNG).

15. patrick says:

@ScottB & Raymond.  I don’t understand… Take, for example, the choices when #4 vs. #13.  Twice the #4 seed is picked to win, twice the #13 seed is picked, but the upset rate for this differential is not 50%.  And the two times the #13 was picked (San Diego over UCONN, Siena over Vanderbilt), Raymond was correct. Why were these particular #13’s chosen and not the other two #13s?  Just trying to figure out whether this was random chance.

[You found me out. It was not random chance. I used my time machine. -Raymond]