NBA Bubble Sim: A Retrospective

One thing that I really enjoy as an analyst is creating new models – and expanding them. I made a version of the Bubble sim with 1m+ scenarios, for example (that will turn into a blog post here at some point). But I rarely maintain the focus or energy to look at a model after the fact and determine “how good was it at actually predicting the future?”[1] I’m aiming to change that with this real-life example, my NBA Bubble model. So with that said, let’s dive in.

Predicting individual games

In theory, using ELO to predict individual games should massively improve the predictive ability of the model versus, say, coin flips. However, as we will see, that was really not the case.
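For anyone unfamiliar with how ELO turns two ratings into a game prediction, here is a minimal sketch. It assumes the standard logistic formula with a 400-point scale and no home-court adjustment; the ratings are made up, and the exact variant used in my sim may differ.

```python
def elo_win_probability(rating_a: float, rating_b: float) -> float:
    """Probability that team A beats team B under the standard Elo curve.

    Assumption: 400-point scale, no home-court or rest adjustments.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Hypothetical ratings, for illustration only.
favorite, underdog = 1600.0, 1500.0
p = elo_win_probability(favorite, underdog)
print(f"Favorite win probability: {p:.3f}")
```

A 100-point edge works out to roughly a 64% win probability, so most matchups between similar teams sit much closer to a coin flip than the ratings might suggest.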

quality of prediction for individual games

Ultimately, we were just slightly better than coin flips. Sort of disappointing if I’m honest. I do think there is some context that ELO is particularly bad at explaining, which we can distill into the statement “ELO overstates the relative strength of teams that have clinched a playoff berth.”
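The quality metrics used throughout this post (accuracy, precision, recall and F1) all come from comparing predicted winners against actual winners. Here is a rough sketch of that calculation; the function name and the sample data are hypothetical and are not results from the sim.

```python
def classification_metrics(predicted: list[bool], actual: list[bool]) -> dict[str, float]:
    """Compute accuracy, precision, recall and F1 from paired boolean outcomes."""
    tp = sum(p and a for p, a in zip(predicted, actual))          # predicted win, won
    fp = sum(p and not a for p, a in zip(predicted, actual))      # predicted win, lost
    fn = sum(not p and a for p, a in zip(predicted, actual))      # predicted loss, won
    tn = sum(not p and not a for p, a in zip(predicted, actual))  # predicted loss, lost

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up example: did the model pick the home team, and did the home team win?
predicted = [True, True, False, True, False, False]
actual = [True, False, False, True, True, False]
print(classification_metrics(predicted, actual))
```

A coin flip lands around 50% accuracy on balanced outcomes, which is the baseline the game-level predictions are being compared against.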

I’ll dive into this at the end, as I think some faulty modeling by the NBA around this assumption led to some crappy basketball being played.

Predicting which teams made playoffs

When I look at the 1000 scenarios in aggregate (instead of a game by game basis), a much clearer picture of the model and its effectiveness is painted.
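To make "scenarios in aggregate" concrete, here is a simplified sketch of the idea: simulate the remaining games many times and count how often each team ends up with the final spot. Everything in it is an assumption for illustration. The team data is invented, each team's remaining games are simulated independently (even when contenders play each other), and ties are broken at random rather than by the NBA's actual play-in rules.

```python
import random


def elo_win_probability(rating_a: float, rating_b: float) -> float:
    """Standard Elo win probability (400-point scale, no home-court term)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def simulate_final_spot(
    teams: dict[str, tuple[float, int, list[float]]],
    n_scenarios: int = 1000,
    seed: int = 0,
) -> dict[str, float]:
    """Estimate each team's odds of finishing with the most wins.

    teams maps name -> (elo, current_wins, elo ratings of remaining opponents).
    """
    rng = random.Random(seed)
    spot_counts = {name: 0 for name in teams}
    for _ in range(n_scenarios):
        totals = {}
        for name, (elo, wins, opponents) in teams.items():
            # Each remaining game is an independent weighted coin flip.
            new_wins = sum(rng.random() < elo_win_probability(elo, opp) for opp in opponents)
            totals[name] = wins + new_wins
        best = max(totals.values())
        leaders = [name for name, total in totals.items() if total == best]
        spot_counts[rng.choice(leaders)] += 1  # random tie-break (simplification)
    return {name: count / n_scenarios for name, count in spot_counts.items()}


# Hypothetical ratings, records and schedules -- not the real bubble data.
hypothetical_teams = {
    "MEM": (1480.0, 32, [1550.0, 1600.0, 1520.0, 1650.0]),
    "POR": (1510.0, 29, [1500.0, 1560.0, 1490.0, 1530.0]),
    "NOP": (1520.0, 28, [1470.0, 1540.0, 1500.0, 1480.0]),
}
print(simulate_final_spot(hypothetical_teams))
```

Aggregating like this is what turns noisy game-level coin flips into a playoff probability per team.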

quality of prediction for making playoffs

Looks pretty good! A damn good model. HOWEVER – given that, for all intents & purposes, 15 out of 16 playoff spots were already guaranteed, these numbers really overstate the effectiveness of the model.

Reducing scope to measure uncertain outcomes

For the purpose of this analysis, I will take a look at the quality of the model as it relates to 3 teams – the New Orleans Pelicans (NOP), the Memphis Grizzlies (MEM), and the Portland Trail Blazers (POR). These are the 3 teams competing for the final playoff spot, so by getting better at predicting these teams, we improve the efficacy of the entire model.

predicting outcomes for POR, MEM & NOP

I can’t say these updated stats are particularly great. We are more accurate here than we were for predicting specific games, but far from certain enough to do something like gamble on this model reliably. Even knowing what we did going into the NBA bubble, the model gave Portland, who ultimately made the playoffs, only a 29% chance of getting there.

Incorporating some modifications

One obvious observation as the bubble games continued was that “ELO overstated the relative strength of teams that have clinched a playoff berth.” With this knowledge, I started tweaking my model to accommodate this new information. Ultimately what I landed on was to reduce the ELO for teams that have already clinched by 20%. This number is totally arbitrary and based on gut feel. I also assumed the Eastern Conference was de facto clinched based on the players who opted out or were injured for the Wizards.
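Here is a rough sketch of what that handicap could look like in code. It takes "reduce the ELO by 20%" literally, as multiplying a clinched team's rating by 0.8, which is an assumption about the spreadsheet logic rather than a copy of it. The function names and ratings are hypothetical.

```python
def elo_win_probability(rating_a: float, rating_b: float) -> float:
    """Standard Elo win probability (400-point scale, no home-court term)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def handicapped_rating(rating: float, clinched: bool, handicap: float = 0.20) -> float:
    """Scale down the rating of a team that has already clinched.

    Assumption: the handicap is applied multiplicatively to the raw rating.
    """
    return rating * (1.0 - handicap) if clinched else rating


def adjusted_win_probability(
    rating_a: float, clinched_a: bool,
    rating_b: float, clinched_b: bool,
    handicap: float = 0.20,
) -> float:
    """Win probability for team A after handicapping any clinched team."""
    return elo_win_probability(
        handicapped_rating(rating_a, clinched_a, handicap),
        handicapped_rating(rating_b, clinched_b, handicap),
    )


# Hypothetical matchup: a clinched favorite against a team still fighting for a spot.
before = elo_win_probability(1650.0, 1500.0)
after = adjusted_win_probability(1650.0, True, 1500.0, False)
print(f"favorite win probability before: {before:.3f}, after: {after:.3f}")
```

Applied this way, a 20% cut is a very large swing: it can flip a clear favorite into a clear underdog, which is worth keeping in mind when reading the adjusted odds.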

Given the relatively poor performance of the model, I was seeking to explain the following data points:

  • The Bucks & Lakers were playing very poorly.
  • The Suns & Blazers looked unstoppable.

With the modification of the model to reduce ELO for qualified teams by 20%, the new playoff odds looked like this:

playoff odds with ELO reduction for clinched teams

Of course, simply buffing Portland’s playoff odds massively increases the accuracy of the prediction, so this might be a bit too reductionist. Furthermore, with some clever configuration of Excel to leverage the solver, the exact handicap percentage could be tweaked to maximize the odds of Portland making the playoffs.[2] That being said, let’s take a look at how model quality changes with this adjustment:

prediction quality post adjustment

This is MUCH better. Obviously, the updated model has the benefit of some hindsight here. But with a small, targeted change, the model was able to increase accuracy from 54.7% to 69.2%. Precision & recall increased by similar margins. I think there is something here that can be applied to future models of NBA outcomes.

Conclusion

Overall, I am satisfied with the outcomes of this process of exploring the model in the context of the metrics above. The key learning for me is that certainty of outcomes does impact the quality of play, at least in the NBA bubble. After accounting for that, we were able to increase model accuracy by more than 25% in relative terms (from 54.7% to 69.2%). To get more accurate, my analysis would need to be more surgical in approach.
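A quick check of that improvement, using the two accuracy figures reported above, separates the gain in percentage points from the relative gain:

```python
baseline_accuracy = 0.547  # accuracy for POR/MEM/NOP before the handicap
adjusted_accuracy = 0.692  # accuracy after the 20% handicap for clinched teams

absolute_gain = adjusted_accuracy - baseline_accuracy
relative_gain = absolute_gain / baseline_accuracy

print(f"{absolute_gain * 100:.1f} percentage points")  # 14.5 percentage points
print(f"{relative_gain:.1%} relative improvement")     # 26.5% relative improvement
```

So the "more than 25%" figure is the relative improvement; in absolute terms the gain is about 14.5 percentage points.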

My biggest take-away is that I will be designing future models to enable rapid analysis using the metrics herein. I didn’t do that in this case because I hadn’t planned on actually doing this analysis. Building accuracy testing in up front would have meant I could backtest assumptions and model changes across a much broader data set. Without that, I didn’t have an easy way to test my updated assumption of the 20% ELO discount down at the game level. I’m certain that applying better data science techniques could result in an even more accurate model.

I do find it super interesting that there was a huge miss on the New Orleans Pelicans’ performance vis-a-vis their ELO rating. This entire process was arguably designed to maximize the odds of the Pelicans (& Zion) making the playoffs, and in that regard, the NBA’s experiment failed completely. Conversely, one thing that could have been anticipated based on the 20% ELO handicap is that the Phoenix Suns had around a 35% chance to get 7 or 8 wins. Given that, it probably would have made more sense for the NBA to open a mini-tournament at the bottom of the bracket for 7/8/9/10. It would have increased the quality of play and led to a more exciting finish to the regular season. And I think the NBA, which certainly has modelers far more sophisticated than I am, should have anticipated the drop in play associated with teams that have already clinched.

footnotes

[1] I’m using the assessment framework found here on towardsdatascience.com for accuracy, precision, recall (also called sensitivity or true positive rate), and F1 score. You can find the definitions within that link – it’s worth the read.

[2] After writing this, I did some Excel tweaking to allow the solver to optimize the handicap for clinched teams. It was 20.00001%. Bizarre.

The Many Wandering Paths to Analytics

If we treated careers more like dating, nobody would settle down so quickly.

David Epstein
Range: Why Generalists Triumph in a Specialized World

I consistently receive the same questions from people seeking an Analytics career: What classes should I take? What certifications should I get? Should I learn SQL, Python or R?

Behind those questions there’s a consistent assumption: “There must be a clear path to an analytics career.”

I’m here to challenge that assumption. There isn’t one clear path to work in analytics – most of us got there through a winding, wandering series of career moves. My story is one of many – ask someone in Analytics and you’ll hear something similar.

Typical Wandering Path

(1) Get a college degree or other training – not super relevant
(2) Work for a while in some non-analytics job
(3) Recognize interest in analytics
(4) Start doing basic analytics at work (ideally) or on your own time
(5) Leverage that experience into first analytics job

I’ll call out each step as it happened in my career journey.

My Wandering Path

Initial Career (Years 1-4)

Coming out of college I shared the assumption that careers were linear. After all, life to that point was linear, so why wouldn’t careers be the same?

Except, my linear plans fell apart two days before my wedding in 2011. I’d studied International Economic Development (Step 1), interned in Latin America, become fluent in Spanish and was planning to move with my soon-to-be wife to Bolivia. In one phone call and several subsequent conversations, that potential life and career ended. I was sitting in a dead-end job I thought I’d be leaving and had to figure out Plan B.

At first it wasn’t obvious – what else should I do? I was a Customer Success Manager (Step 2) but didn’t really want to do that as a career. I’d worked in sales departments, but didn’t really want to be a salesperson. But then I had an epiphany – there was a part of each of my first few roles that I loved that never was part of my job description.

I was consistently making little analytics & reports (Step 4 – which ironically for me came before step 3!). I’d turn 2,000 customer emails into a digestible summary for the product team. I’d make Salesforce forecasts & dashboards for the executive team. I made a Google Sheet for my Rosetta Stone team to help management track & manage renewal rates for their teams. This stuff was fun! I liked it! (Step 3) But what now?

The Great Filter: Landing First Data Job

Have you heard of the concept of the ‘Great Filter’? It’s part of the Fermi Paradox, which ponders why we have seen no evidence of extraterrestrial life given the seemingly high probability that it should exist in the universe. One commonly proposed candidate for the filter is the step from non-living matter to living matter (abiogenesis). The Great Filter is a catchall for “it’s hard to get past this point.”

I argue there is a Great Filter for those trying to get into Analytics – getting your first job. In fact, I’m devoting my next blog post to this topic, so consider this a lead-in to next week.

Passing Through the Great Filter

I realized I had an uphill climb ahead. Perhaps this is where many of you are – how do you get a company to take a chance on you?

I asked lots of current analysts via informational interviews at local coffee shops. They all said “I got here via a pretty random series of events.” Sound familiar?

They gave me a breadcrumb trail, though: “You have to get enough experience together and communicate about it well enough to get someone to try you out.” Easier said than done, but I did have some experience already in my current role.

I applied everywhere. I was told by recruiters/HR multiple times “Hey, I guess you could be an analyst but I think your future is in sales based on your resume.” Leads fizzled out until one day I got a call.

The Meeting That Changed Everything

“Can you show up at the office in an hour? The CFO and SVP of Sales want to talk.”

I got that call from Jacob — the same Jacob here at DataDuel. He was working at Funko, a quickly-growing collectibles company north of Seattle. They didn’t have a position open yet, but there was interest in getting analytics going. Before going into the meeting, here is all I knew:

  • The position was planned to be part-time Analytics and part-time something else until analytics skills were proven
  • The position was planned to be a contract position (not great – I’d just bought my first house and wasn’t looking for a contract spot)
  • There would be minimal support since no data team existed, so a self-starter attitude was needed

Given those three bullet points, I had three goals going in:

  • Communicate my potential to be great at analytics if given the chance
  • Sell them that I was worth a shot as an employee & not a contractor
  • Demonstrate self-starter attitude to analytics from previous roles

I quickly threw on a dress shirt, re-learned how to tie a tie on YouTube and flew out to the car.

The rest is history and full of the content I’ll fill this weekly series with. The conversation went really well and they decided to take a chance on me a couple weeks later (Step 5!). While I got a full-time position, I took a 10% pay cut because I needed to prove myself. I knew the temporary sacrifice would be worth it – I just needed my first position to get past the Great Filter.

In Conclusion

There’s no one path to analytics – there are many. I’ve used my path as an anecdote for the infinite options out there.

The general path, though, is to start doing some analytics in any fashion you can, and leverage that experience to get your first position. It isn’t easy – there’s a Great Filter out there which prevents many from getting in.

Come back next week and I’ll dive into the Analytics Great Filter in more detail, and provide some practical options of how to overcome it.

New Weekly Series: Everything Analytics

Do you enjoy working with data in your current role? Are you interested in a Data Analytics career? Are you currently a Data Analyst?

Good news! This weekly series is for you. It’ll cover all sorts of topics within analytics, including advice for aspiring analysts, best practices, key skills/tools and industry updates.

Initial blog topics include:

  • The Many Wandering Paths to Analytics
  • Analytics Job/Role Types
  • Key Skill Sets for Analysts
  • Visualization Best Practices
  • Measuring Success of Analysts
  • How to Prioritize Your Work Backlog
  • …and more!

Much of this will be written from my perspective as an Analyst. There are other perspectives out there for distinct roles like Data Scientists and Data Engineers, and while I’ll touch on those regularly (and will write an entire post on the difference between those roles), the focus here will be Data Analysts.

See you in a week!