Monday, March 14, 2016

NCAA Tourney Predictor 2.0

If you're a regular reader of the blog or a sentient web crawler, you may have thought my crowning analytical achievement in this space was the college football playoff predictor.  You would be wrong.  Four months in advance of creating a different kind of life, I have birthed a different entity with one very specific purpose: Picking the NCAA Men's Basketball Tournament. 

Why does this exist?

The simple answer is that winter is boring.  The complicated answer can be found in my post about last year's beta version.  Basically, I wanted to do something different (and thus worthwhile) by accounting for the importance that "matchups" supposedly play in March.  While I think that word is often thrown around to explain simple random variation, I still think it's possible that certain teams are constructed in ways that make them more adept at beating certain other teams.  By breaking down teams to their key components and simulating games against the teams they actually face, I hope to glean some more subtle insights that other models may miss.

How does it work?

The basic explanation for this model is very simple: I simulate the tournament 10,000 times, and base the win percentages for each level on the results of those simulations.  The backbone that makes this actually representative of the teams involved is far more complicated.  Let's break the main concepts I wrestled with into separate sections.
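At a high level, the loop looks something like this.  Note that `simulate_game` here is just a coin-flip stand-in for the play-by-play engine described in the sections that follow, and the bracket handling is simplified to a single-elimination list of team names:

```python
import random

def simulate_game(team_a, team_b):
    """Stand-in for the play-by-play engine: a coin flip.
    The real model simulates individual plays using team statistics."""
    return team_a if random.random() < 0.5 else team_b

def simulate_tournament(bracket):
    """Play out one single-elimination tournament; return the champion."""
    teams = list(bracket)
    while len(teams) > 1:
        # Pair off adjacent teams and advance each game's winner.
        teams = [simulate_game(teams[i], teams[i + 1])
                 for i in range(0, len(teams), 2)]
    return teams[0]

def championship_odds(bracket, n_sims=10_000):
    """Tally how often each team wins across n_sims simulated tournaments."""
    titles = {team: 0 for team in bracket}
    for _ in range(n_sims):
        titles[simulate_tournament(bracket)] += 1
    return {team: wins / n_sims for team, wins in titles.items()}
```

The same tallying applies at every round, not just the title game; tracking how far each team advances in each simulation yields the full round-by-round table at the end of the post.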

1. Team Statistics

To simulate the outcomes of games for the teams involved, I need to pull in several metrics to describe those teams' performance.  To avoid overly complicating things, I focus on the four factors, and add in a couple of wrinkles.  The full list of stats I use to simulate a game (for both offense and defense) is:

Two-point shooting percentage
Three-point shooting percentage
Percentage of shots taken from three
Free throw shooting percentage
Offensive rebounding percentage
Turnover percentage
Foul percentage
Adjusted Tempo

For every statistic except for the foul rate, I get my metrics from kenpom.com.  While Pomeroy does report on foul rate, he uses the metric FT/FGA.  There are reasons that this is a better measure of a team's true talent in this area, but it's kind of useless in a simulation.  Thus, I pull foul rate information from Team Rankings.  All data is up to date through the end of Championship Fortnight.

2. Team Statistic Adjustments

Most of the statistics above can be used out of the box.  If a team shoots 40 percent on three-pointers, I can plug that directly into the simulation (there are, of course, opponent adjustments...I'll get to those in a bit).  That said, two metrics (turnover percentage and adjusted pace) need a little work to be able to integrate with everything else.  The reason for this is a seemingly semantic difference that is actually quite important: the difference between possessions and plays.

The concept of the possession, which describes one team's trip to its end of the court, is the main underlying concept of mainstream basketball analytics.  Understanding the possession, and teams' performance relative to the number of possessions, helps analysts tease out non-obvious truths.  Slow-paced teams such as Virginia and Wisconsin may not put up huge points-per-game numbers, but when we look at scoring per possession, we see a different story.  In the metrics I pull from Pomeroy's site, the adjusted tempo and turnover rate metrics both share a denominator of possessions.  From an analytical standpoint, there is no problem with this.

But when I turn to my simulation, I need to view these in a slightly different light - that is, through the concept of the "play".  Each individual team possession ends with a made shot, defensive rebound, or turnover.  Within each of these possessions can be one or more plays.  A team can shoot, miss, grab an offensive rebound, and shoot again.  This counts as one possession, but two plays.  From the standpoint of my simulation, I am looking to simulate one play at a time.  Thus, I need to know the likelihood that a team turns the ball over on each given play, as opposed to over the whole possession.
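To make the possession/play distinction concrete, here's a minimal play-by-play sketch (it ignores fouls for simplicity, and the probabilities are illustrative placeholders, not any team's real numbers):

```python
import random

def simulate_possession(shoot_pct=0.48, orb_pct=0.30, to_per_play=0.16):
    """Run one possession play by play.

    Returns a ('score' | 'turnover' | 'miss') outcome and the number of
    plays the possession took.  Each play ends in a turnover, a made
    shot, or a missed shot; a miss that is offensive-rebounded adds
    another play to the same possession.
    """
    plays = 0
    while True:
        plays += 1
        if random.random() < to_per_play:
            return "turnover", plays
        if random.random() < shoot_pct:
            return "score", plays
        if random.random() >= orb_pct:  # defensive rebound ends the possession
            return "miss", plays
        # offensive rebound: loop around for an extra play
```

A possession that goes miss, offensive rebound, make would return `("score", 2)` - one possession, two plays.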

So how do we do this?  Well, from a high level, it's pretty easy to grasp.  The ratio of turnovers per possession to turnovers per play is as follows:

TORATE(POSS)/TORATE(PLAY) = 1/(1-EP)

Simple enough, right?  We simply take the ratio of turnovers per possession to turnovers per play, and set it equal to a multiplier based on the number of extra plays (EP) we would expect from that team.  Where does the term on the right come from?  Two explanations.  One, we can simply look at the right-hand side of the equation as Plays/Possessions.  The ratio of plays to possessions will always be plays/(plays - extra plays), which is roughly the form of that equation.  The second explanation uses infinite sums (fun!).  Let's suppose a team gets an "extra" play 10% of the time.  This means that the ratio of plays to possessions will be 1*(0.1)^0 + 1*(0.1)^1 + 1*(0.1)^2 + ....  If you're inclined towards math, you may recognize this as a geometric series with ratio EP; since EP is less than 1, the series converges to 1/(1-EP), which is the term on the right side of the above equation.
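That geometric-series claim is easy to check numerically:

```python
def plays_per_possession(ep, terms=50):
    """Sum the series 1*ep^0 + 1*ep^1 + ... for a team that earns an
    'extra' play with probability ep on any given play."""
    return sum(ep ** k for k in range(terms))

# For EP = 0.1, the partial sums converge to 1/(1 - 0.1), about 1.111
# plays per possession.
```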

The next step in this journey is to calculate the value of our EP metric.  To do this, we need to identify every time that a play is added to a possession.  These times are:

1. An offensive rebound after a missed field goal
2. A non-shooting defensive foul
3. An offensive rebound after a missed free throw

If we combine all of these situations, we get the following equation (the TO% you see has to be the per-play number, not the per-possession number we start out with.  Remember this in a couple paragraphs):

Extra Plays = (((1-FG%)*OR%)*(1-FOUL%-TO%)) + (FOUL%*NONSHOOT%) + (FOUL%*SHOOTFOUL%*(1-FT%)*OR%)

That's a lot of stuff.  Let's re-write that in words so it hopefully makes a little more sense:

Extra Plays = (Odds of rebounding a missed shot*Odds of taking a shot) + (Odds of drawing a non-shooting foul) + (Odds of rebounding a missed free throw)
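Here is that equation as a small function.  The inputs below are made-up illustrative values, and (as noted above) the turnover rate passed in must be the per-play figure:

```python
def extra_plays(fg_pct, or_pct, foul_pct, to_play_pct,
                nonshoot_pct, shootfoul_pct, ft_pct):
    """Expected extra plays per play, from the three sources listed above."""
    # 1. Offensive rebound after a missed field goal (on plays that are shots)
    missed_shot_reb = (1 - fg_pct) * or_pct * (1 - foul_pct - to_play_pct)
    # 2. A non-shooting defensive foul
    nonshooting_foul = foul_pct * nonshoot_pct
    # 3. Offensive rebound after a missed free throw
    missed_ft_reb = foul_pct * shootfoul_pct * (1 - ft_pct) * or_pct
    return missed_shot_reb + nonshooting_foul + missed_ft_reb

# Illustrative inputs: 48% shooting, 30% OR%, 18% foul rate, 15.5% per-play
# TO%, a .387/.613 non-shooting/shooting foul split, and 70% FT shooting.
ep = extra_plays(0.48, 0.30, 0.18, 0.155, 0.387, 0.613, 0.70)  # ≈ 0.183
```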

There is one final wrinkle.  We need the extra plays number in order to calculate the per-play turnover rate.  But we need the per-play turnover rate to be able to calculate the extra plays number.  This initially seemed like a breaking point, such that I wouldn't be able to move forward with the metrics at hand.  Luckily, some math happened.  Specifically, I noticed that my original equation could be re-stated as follows:

F(TO1)/G(TO2) = 1/(1-H(TO2))

Where F, G, and H are functions representing turnover rate per possession, turnover rate per play, and extra plays, respectively (TO1 is the per-possession metric, while TO2 is the per-play metric).  If we plug in the values for each of these functions, we can actually simplify everything to find a function X such that TO2 = X(TO1).  This function is:

TORATE(PLAY) = TORATE(POSS) * (1 - x*y - z) / (1 - (TORATE(POSS) * x))

In this equation, x is the odds of rebounding a missed shot, y is the odds of not drawing a foul, and z is the odds of a foul adding an extra play (either a non-shooting foul, or a rebounded missed free throw).  I could hardly believe this actually worked, so I had to demonstrate it through an example:
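A sketch of that conversion in code, with made-up inputs.  Here z bundles both foul-related extra-play terms (non-shooting fouls plus rebounded missed free throws), under which reading the round trip closes exactly:

```python
def to_play_rate(to_poss, fg_pct, or_pct, foul_pct,
                 nonshoot_pct, shootfoul_pct, ft_pct):
    """Closed-form conversion of turnover rate from per-possession to per-play."""
    x = (1 - fg_pct) * or_pct                 # odds of rebounding a missed shot
    y = 1 - foul_pct                          # odds of not drawing a foul
    z = (foul_pct * nonshoot_pct              # extra plays from fouls
         + foul_pct * shootfoul_pct * (1 - ft_pct) * or_pct)
    return to_poss * (1 - x * y - z) / (1 - to_poss * x)

# Round trip: recompute extra plays using the per-play rate, and confirm
# that per-play / (1 - EP) recovers the original per-possession rate.
to1 = 0.19
to2 = to_play_rate(to1, 0.48, 0.30, 0.18, 0.387, 0.613, 0.70)
x = (1 - 0.48) * 0.30
ep = x * (1 - 0.18 - to2) + 0.18 * 0.387 + 0.18 * 0.613 * (1 - 0.70) * 0.30
assert abs(to2 / (1 - ep) - to1) < 1e-9
```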


So yeah, all that work just to convert a metric from per-possession to per-play.  But it works, so that's all that really matters.

3. Foul Odds and Ends

In the previous section, you may have noticed the NONSHOOT% and SHOOTFOUL% metrics, and wondered what those were.  Well, hopefully the names are relatively straightforward: NONSHOOT% represents the percentage of fouls that are called without incurring free throws, while SHOOTFOUL% is simply its complement.  In my previous screenshot, you'll see I used values of .387 and .613 for these, respectively.  Why?  Well, it involves more fun math.

To start my journey, I needed to find a couple of basic stats and work from there.  First, I determined that the average number of fouls a team commits per half is 9.2 (This was based on 2014-2015 data.  I may need to revisit this now that more fouls are called).  As I don't have game logs for every game, I then approximated a distribution of the specific occurrences of each distinct number of fouls, ranging from 1-15.  This gave me a rough idea of how many times teams get into the bonus and the double bonus.  I then compared this to the number of free throws taken per team per half (roughly 10), and determined that the average foul creates the opportunity to shoot 1.1 free throws.  Finally, the pre-tournament free throw rate for 2014-2015 was 69.1%.  Now, with a data model that approximates the typical reality for a college team, I needed to find the following metrics:

% of fouls resulting in shooting a free throw (mentioned above)
% of fouls coming during the act of shooting
% of those shooting fouls where the field goal is made (aka and-ones)

The first metric is used only in my possession-to-play adjustment from the last section.  My simulation uses team-specific foul rates, and then counts up the number of fouls as you would in a regular game, which means the simulation itself has no use for a general number like that.  What it does need in order to function properly is an idea of the second and third figures.  Once again, I did not have individual game logs that would help me determine these two figures, so I needed to use all of the knowledge gained from the previous paragraph to estimate them for college basketball at large.  I did so using the following set-up in Excel:


The "Free Throws" and "Free Throws Per Foul" rows were based on the frequencies in the "Percentage of Time" row and the two goal values.  I then used Solver to set those two goal values such that my "Total FT Per Foul" metric equaled 1.1 (that metric was calculated by taking a SUMPRODUCT of the "Percentage of Time" row and the "Free Throws Per Foul" row).  Running Solver gave me the values you see in the screenshot.  I felt good about the results, as the ~20% mark for making the field goals matches up pretty well with what I've read on the subject.  Once I calculated those values, I was able to calculate the % of fouls resulting in a free throw quite simply, by taking another SUMPRODUCT, this time on the "Percentage of Time" row and the "Percentage of Shooting Fouls" row.

The one caveat to all of this is that applying one value for every team probably ignores some inherent differences between the teams.  Change any one of those three inputs for an individual team, and you're likely to see those numbers move a bit.  That said, I don't feel too bad about keeping it simple for now, for a few reasons.  One, I doubt there is a wide range of true talent when it comes to drawing and-ones, and even if there is, it's not going to have a huge impact on the outcome of a game (and-ones are pretty rare).  Two, calculating these figures for each individual team would take a fair amount of computational time, which might slow down my simulation.  Sure, I might be able to find a way to approximate these numbers by running a regression on the inputs or something like that, but that would take more research.  I do hope to improve this in the long term, but for now, my model does incorporate teams' foul rates on both offense and defense, and that should be a representative enough input to reasonably model reality.

4. Opponent Adjustments (out of game)

As mentioned earlier, the inputs to this model consist of several statistics that measure each team's talent in a number of areas.  However, I cannot use these measures as they are, because the numbers are not opponent-adjusted.  It's much harder to perform well against a tougher schedule, and that must be factored in if I am to recreate a realistic simulation.   

Luckily, opponent strength metrics are easy to come by.  The front page of Pomeroy's site has adjusted ratings for both opponents' offense and defense.  Of course, these numbers only apply to offenses and defenses at a high level.  What's the best way to apply these high-level metrics to each of the skill-specific metrics I'll be adjusting?

As it turns out, this is not a question with an easy answer, nor one with readily available research that I could use.  Long term, this is probably the number one opportunity for improvement in my model.  But for now, I feel I was able to come up with a decent approximation.

Using the data for all 351 teams, I took the five metrics I am adjusting for opponent strength (2FG%, OR%, TO%, Foul Rate, and 3FG%), and found the means and standard deviations for the population.  I did the same for the adjusted opponents' offensive and defensive ratings.  My first thought was to simply add these standard deviations together and be done.  So, if the Arizona defense was three standard deviations above average at preventing offensive rebounds, and if their opponent's offenses were one standard deviation above average, then the opponent-adjusted metric would show Arizona to be four standard deviations above average.  However, when I ran simulations based on this (using 2015 data), Kansas became the title favorite.  2015 Kansas was a good team, but was well below the other top-line teams in overall efficiency.  I expect my model to spit out some different results (otherwise what's the point?), but this was a bit too weird.  I quickly discerned that the reason for this was that this aggressive application of opponent adjustment was over-rewarding Kansas' supremely difficult schedule.

What I soon realized was that I had made a math mistake.  If a common set of opponents is one standard deviation above average on offense, we should not expect them to be one standard deviation above average in each individual component of that offense.  Rather, we would expect (on average) that these offenses are somewhere between 0 and 1 standard deviation above average in each individual component.  How then should I weight each component without any information as to opponent's proclivities in each area?  Well, recall that when we talk about the four factors, we generally consider their contribution towards making a good offense in the following proportions:

Shooting: 40%
Preventing Turnovers: 25%
Offensive Rebounds: 20%
Drawing Fouls: 15%

Given this, I did the following:  I created a spreadsheet with four random number columns, corresponding to each of the four factors (so the first column had random numbers from 1 to 40, and so on).  I then created a fifth column that summed the first four columns.  I then found the standard deviations of each column, and divided the standard deviation of the individual columns by the standard deviation of the summation column.  This gave me the following figures:

Shooting: .619
Turnover: .428
Offensive Rebounds: .351
Fouls: .235
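A sketch of that spreadsheet exercise in Python.  The exact ratios will vary with the random draws (and with how the columns were generated in the original spreadsheet), but the relative ordering tracks the four-factor weights:

```python
import random

def component_sd_ratios(weights=(40, 25, 20, 15), n_rows=10_000, seed=0):
    """For each factor, draw uniform random values scaled to its weight,
    sum across the factors, and return each column's standard deviation
    divided by the standard deviation of the summed column."""
    rng = random.Random(seed)
    cols = [[rng.uniform(0, w) for _ in range(n_rows)] for w in weights]
    totals = [sum(vals) for vals in zip(*cols)]

    def sd(xs):
        mean = sum(xs) / len(xs)
        return (sum((v - mean) ** 2 for v in xs) / len(xs)) ** 0.5

    total_sd = sd(totals)
    return [sd(col) / total_sd for col in cols]
```

With independent columns, each ratio comes out below 1 and in the same order as the weights - shooting largest, fouls smallest - which is the property the adjustment relies on.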

After approximating the proportional differences in standard deviations by metric, I did one final thing before applying this to my opponent adjustments.  As it is unlikely that all of these skills are independent of each other (as they were in my crude spreadsheet goofery), I regressed all of these figures 50% towards one (so I added 1 to each, and divided by two).  Teams with good athletes are likely to be able to apply those skills across multiple areas.  Yes, some teams have wildly different skillsets and some teams openly choose not to excel in all areas, but I felt regressing 50% of the way was a good starting point.

To conclude this section, let's apply this modified adjustment to the earlier example.  Arizona is still credited with being three standard deviations above average at preventing offensive boards.  But now, we assume that the offenses they faced were only 1*(.675) = .675 standard deviations above average.  This means that Arizona's true talent metric now says they are 3.675 standard deviations above average instead of the original 4.  This led to better results in the simulations I performed afterward, so I will keep this logic for now.
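The Arizona example in code (the function names are mine; the inputs are the figures from the text):

```python
def regress_half(ratio):
    """Regress a factor's SD ratio 50% toward 1: add 1 and divide by two."""
    return (ratio + 1) / 2

def opponent_adjusted_sd(team_sd, opponent_sd, factor_ratio):
    """Combine a team's own z-score in a skill with its opponents' overall
    z-score, scaled by the regressed share attributed to that factor."""
    return team_sd + opponent_sd * regress_half(factor_ratio)

# Arizona: 3 SDs above average defending the offensive glass, facing
# offenses 1 SD above average, with an OR% factor ratio of .351.
adjusted = opponent_adjusted_sd(3.0, 1.0, 0.351)  # → 3.6755, i.e. ~3.675
```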

5. Opponent Adjustments (in game)

The final item of note concerns what happens when I actually match up two teams against each other for a game.  Let's say Team A is a 30% true-talent offensive rebounding team, and that their opponent (Team B) only allows 20% of opponents' misses to be rebounded.  How many offensive rebounds should we expect Team A to grab in this matchup?  My original model last year simply took the average of the two, but I knew this wasn't sufficient.  We know full well that things like 3FG% defense are tenuous at best, and so I wasn't sure how much credit we should give defenses for other things as well.  Luckily, Mr. Pomeroy spent the offseason tracking down exactly what I needed for my model.  This allowed me to apply the following weights to offense and defense:

Two-point shooting:  50% offense
Three-point shooting:  83% offense
Foul rate:  36% offense
Offensive rebounding:  73% offense
Turnover rate:  49% offense

Thus, for our earlier example, we would weight Team A's performance by 73% and Team B's by 27%.  This gives us an opponent-adjusted number of 27.3% for Team A, which is what I would use in the actual simulation of a game between these two teams.
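A minimal sketch of that blend (the function name is my own):

```python
def blend(offense_value, defense_value, offense_weight):
    """Weight an offense's rate against the rate its opponent's defense
    allows, using the offense-control weight for that statistic."""
    return offense_weight * offense_value + (1 - offense_weight) * defense_value

# Team A rebounds 30% of its misses; Team B allows 20%; OR% is 73% offense.
matchup_or_pct = blend(0.30, 0.20, 0.73)  # → 0.273
```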

You will notice in the linked article that defenses have very little control over opponents' possession lengths and free throw shooting.  Thus, I left those un-adjusted.  I will likely delve into possession length a bit more in the future, but as it's unlikely to move the needle much, it's not a top priority.

What I fixed from last year

If you clicked through to last year's post and looked at the first simulation, you probably noticed that something was off.  Namely, the highest-seeded teams weren't anywhere near as dominant as they should have been.  Given last year's top-heavy nature, the best teams should have been at least 95% to win their first game, instead of the 80-90% they were in my first run of the model.  After poring over the model, I noticed one key mistake: my opponent adjustment for turnovers was wrong.  The reason: Excel made me stupid.  I had copied the opponent adjustment formula from two-point field goals across all of the other metrics.  The problem is that while all of the other metrics improve in a positive direction (the higher your offensive rebounding rate, the better you are, etc...), committing turnovers is the opposite.  Thus, I simply needed to change a sign in a formula in my master spreadsheet, and everything started to line up with what I would expect.

Things that are still outstanding

I have already mentioned many of the things I would like to fix in future versions of this simulation, but I thought I would summarize them, for transparency's sake.  Here are the items, in a rough order of importance:

1. Opponent adjustments (before game)
2. Individual team foul metrics
3. Possession length adjustments
4. Additional statistics (e.g. block rate)

While I got to a place that I'm reasonably happy with for the opponent adjustments, it's still mostly guesswork.  My main project for next season will be conducting a little bit of research into determining the best way to adjust team statistics for their level of competition.  I also think it's worth investigating the foul issues I spoke about earlier, even if I don't think the changes will move the needle all that much.  On the smaller end, my algorithm to determine play length (I used a gamma distribution, so I wouldn't get a bunch of 1- and 2-second plays) is good but not perfect.  Really slow teams get sped up a bit, while faster teams get slightly slowed down.  One could argue that it's a good thing that I'm building a little incidental regression into my model, but still, I can probably improve this.  Finally, there are a few scenarios (jump balls, blocks, end-of-game weirdness) that my model completely disregards at this point.  I don't consider these minor aspects of gameplay super important, but they might be worth a little future investigation.

The Tourney, Predicted

Finally, here's what you came for: The output.  Each value in the table represents the percentage of the 10,000 simulations in which a team (the row) reached a certain round (the column).  For example, Kansas made the Elite Eight a smidge over 56% of the time.  I have ordered the table by championship probability.



The first thing that probably sticks out to you is the four Big 12 teams in the top six.  This made me worry that I still hadn't calibrated my opponent adjustments correctly (remember that the Big 12 was the strongest conference this year). That said, there are other Big 12 teams whose fates line up with their seed/ratings as well as other strong-scheduled teams (Virginia) whose championship odds don't seem exaggerated.  I am guessing that things are skewed slightly towards the power-conference teams (one example - I have the average 2-seed with about a 92% chance of winning in the first round, while KenPom averages around 88%), but this is starting to get closer to the truth.

I wanted to take a minute to point out a few odd results, and offer potential explanations.  The one that sticks out the most to me is Cincinnati receiving a 73% chance of beating Saint Joseph's in the first round.  These teams feature fairly equal efficiency ratings and strengths of schedule, so it's surprising that the result would differ so much from 50-50.  As it turns out, the main differentiator between the teams is three-point defense.  Saint Joseph's efficiency rating benefits greatly from being top 15 in the nation in preventing three-point makes, while Cincinnati has been quite poor at this.  But as we know, there is little predictive value in this, and my model compensates accordingly.  As a result, Saint Joseph's main strength is effectively neutralized, which gives the advantage to the Bearcats.

The second odd result is West Virginia, who has an 89% chance of beating a really good Stephen F. Austin team, and the second best chance of winning the whole tournament.  The key here seems to be West Virginia's excellence in certain areas (1st in raw offensive rebounding and 2nd in raw defensive turnovers) combined with their amazing strength of schedule (5th).  This combination of traits serves to make the Mountaineers number one by a distance in the opponent-adjusted versions of those measures (4 percentage points above the next best team in turnovers!)  As I said earlier, I think that I may be over adjusting in some instances, but the performances of teams like West Virginia will be a clear barometer of just how much I need to tip the scales in the future.
