April means the end of March Madness, which marks the end of my latest science experiment: my March Madness Algorithm. As I have detailed in previous posts, I am in a March Madness pool where participants get points for making correct picks that other users did not make. The more users who did not make the same pick as you, the more points you get. You can read my previous posts for details.
This is my second year in the pool, and this year I focused most of my energy on trying to predict what the other users would pick. I ended up concluding that none of my machine learning models predicted those picks better than the raw data I found on ESPN, which gives the percentage of ESPN users picking each team to win in each round. So I used the same model as last year.
After a roller coaster tournament that saw my runner-up Virginia losing in the first round, my bracket ended up looking like this:
This figure gives all of my picks. If a pick is highlighted in green, it is correct; if it is highlighted in red, it is incorrect. For example, I incorrectly picked Virginia to be the runner-up, so Virginia is highlighted in red. The key drivers of my point total were Villanova winning it all, two of my four “underdog” elite eight picks being correct, and correctly picking sweet sixteen upsets.
Villanova Wins It All
This was the most boring but probably the most important thing that had to happen for me to succeed. Even with my pool’s different scoring system, having the winner is still crucial. Villanova netted me nearly one third of my overall points. This fraction would be even higher in an ESPN league, where Villanova’s point values don’t decrease due to everyone picking them. Additionally, having the correct champion “stops” other contestants from having one, which lowers their scores. I expected to get 172 points from Villanova but actually got 426 points.
A more subtle but interesting point about picking Villanova in my pool is that it is almost always worth it to pick the favorite as the champion, despite getting extra points for picking underdogs. Consider this year: FiveThirtyEight gave Villanova an 18% chance to win it all and UVA a 14% chance. If we ignore rounds 1-5 and just look at the championship game, who should I pick? Let’s say 80 out of 100 people (80%) do not pick Villanova to win it all; then Villanova as champion has an expected value of .18 x .8 x 6 = .864. Let’s say 90 out of 100 people (90%) do not pick Virginia to win it all; then UVA as champion has an expected value of .14 x .9 x 6 = .756. We can drop the round multiplier, six, because it appears in both equations. That leaves Villanova at .18 x .8 = .144 and UVA at .14 x .9 = .126. I’d rather pick Villanova here. Interestingly, even if nobody else picks UVA, Villanova’s .144 would still be higher than UVA’s .14 x 1 = .14. Even in my pool, where only 66% of people did not pick Villanova and 84% of people did not pick UVA, the expected value for Villanova is slightly higher: .1188 vs. .1176.
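Here is a minimal Python sketch of that comparison, just to make the arithmetic easy to rerun with different numbers. It assumes the simplified scoring used in this paragraph (win probability times the share of the pool that did not make the pick, times a round multiplier), not my pool’s exact rules, and the function name champion_value is made up for illustration.

```python
def champion_value(win_prob, share_not_picking, round_multiplier=1.0):
    """Simplified expected value of a champion pick.

    Assumption: points scale with the share of the pool that did NOT
    make the same pick, times a per-round multiplier.
    """
    return win_prob * share_not_picking * round_multiplier


# Hypothetical pool from the paragraph above: 80% skip Villanova, 90% skip UVA.
hypo_villanova = champion_value(0.18, 0.80)
hypo_uva = champion_value(0.14, 0.90)

# Best case for UVA: nobody else in the pool picks them.
uva_ceiling = champion_value(0.14, 1.00)

# Shares observed in my pool: 66% skipped Villanova, 84% skipped UVA.
pool_villanova = champion_value(0.18, 0.66)
pool_uva = champion_value(0.14, 0.84)

for label, value in [
    ("Villanova (hypothetical)", hypo_villanova),
    ("UVA (hypothetical)", hypo_uva),
    ("UVA (nobody else picks them)", uva_ceiling),
    ("Villanova (my pool)", pool_villanova),
    ("UVA (my pool)", pool_uva),
]:
    print(f"{label}: {value:.4f}")
```

Leaving round_multiplier at its default of 1 mirrors dropping the six from both sides; passing 6 reproduces the .864 and .756 figures.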
Elite Eight Underdogs
My algorithm chose four “underdog” elite eight teams: Gonzaga, Houston, Michigan State, and Texas Tech. I am referring to any team with a seed higher than two as an underdog. My algorithm determined that these teams had a good chance of making the elite eight and were underpicked by other users. Together, they represented 276 expected points. To win, I probably needed to exceed this.
The first three teams, Gonzaga, Houston, and Michigan State, did not make it to the elite eight and scrounged a measly 43 points. However, Texas Tech did hit for me, adding a whopping 190 points. This left me short of my expected value by 43 points. However, Kansas, one of my correct final four picks, helped make up this shortfall. Despite an expected value of 80 points, Kansas scored me 188 points.
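For anyone who wants to check the tally, here is a small Python sketch using only the point totals quoted above. The variable names are mine, and treating the Kansas surplus as a direct offset to the underdog shortfall is just a convenient way to read the numbers.

```python
# Expected points for the four underdog elite eight picks combined.
underdog_expected = 276

# Actual points: the three misses combined, plus Texas Tech.
underdog_actual = 43 + 190

# How far the underdog picks fell short of expectation.
underdog_shortfall = underdog_expected - underdog_actual

# Kansas: actual points minus expected points.
kansas_surplus = 188 - 80

# Net result once Kansas is set against the underdog shortfall.
net_vs_expected = kansas_surplus - underdog_shortfall

print(f"Underdog shortfall: {underdog_shortfall}")
print(f"Kansas surplus: {kansas_surplus}")
print(f"Net vs. expected: {net_vs_expected}")
```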
Sweet Sixteen Upsets
Looking at my round-by-round scores, I did well in most rounds, finishing in the top ten of scoring in my pool, but my second round stood out: I came in second place for that round. I scored 418 points, a whopping 131 points above my expected value. My correct underdog picks of Florida State and Texas A&M netted me 190 points. My two other, more conservative underdogs, Clemson and Kentucky (both five seeds), netted me an additional 138 points. This round accounted for nearly a third of my points.
Conclusion
In the end, I scored 1,405 points, well above my expected value of 1,098. I finished in first place in my pool of fifty-one people. It certainly did not feel like a smooth ride until Villanova was up by 20 with 10 minutes to go in the title game. Winning is exciting but not a great measure of success because of how much luck is involved. In a pool of 50 people, the average person has a two percent chance of winning. Even with my algorithm, I estimate my chance is no higher than five percent. Still, I’ll take the prize money.