The Problem with Power Rankings
Posted on Tue 30 October 2012 in Features by Alex Dewey
You're reading your favorite basketball blog. Today (finally!) they've got some power rankings up. You scroll down the page with glee, immediately searching for your favorite team. Then, with horror -- 50 wins, ranked 10th? "No, this can't be!" you shout to the heavens. "Why?" The writer (that cad!) has a reasonable response: your best player is, in fact, injured for the opener. That will probably cost your team some games and slow the development of your team's chemistry. Rough going. "Fair enough," you shrug and acknowledge. After all, this guy knows what he's doing. You move on with your day.
... but not before checking the rest of the list. Huge mistake. Because now you begin to notice that your favorite team's divisional rival is listed at 58 wins, even though their best player is also injured. What's the explanation? Well, this injury may be just as harmful, but at least it will force their uncreative coach into small ball lineups, which are far more effective with such a roster. You huff. You puff. And then you get mad. You state strongly to your computer screen a lot of uncomfortably valid objections:
- "But my coach is actually creative with lineups! Why should we expect a bad coach forced into small ball to be more effective than my coach, who has been using small ball correctly since the days of Don Nelson in Dallas?"
- "Even if that's true, what happens when their short small ball lineup has to sub out for the 8th-12th spots in the rotation, because those are the only players left? Why is this kind of forced situation, replete with borderline D-leaguers, remotely preferable to having 7-9 rotation-level players to choose from on any given night? Why is their forced situation better than my forced situation, considering mine doesn't substantially affect the minutes allocation for anyone beyond the 10th man?"
Most importantly:
- "Why does their team get the benefit of the broken-window fallacy while my team is presumed to take the full chemistry-and-efficiency loss right on the chin? What is up with this writer's pernicious, unstated double-standard?"
So your day went from a happy one to a sad one in about two minutes, and all the unfairness of life has come back to mystify you again. "Who's this human trashcan, and why does he like the <EXPLETIVE DELETED> Los Angeles Clippers so much?" you ask. You suppose sadly that there will be other articles that you'll read, someday, but never again will these articles be read by one so innocent as you were before the reading. Weeks later, you remain an avid visitor to the site. Unfortunately, your visits are now tinged with pure spite and furious disdain instead of unbridled joy and the desire for knowledge and informed opinions. You leave vicious comments. You have officially become a troll. This "Choose Your Own Adventure" story is complete.
How could this situation have been prevented?
• • •
Baroque Standards & the Rank Problems
One thing that surprised me a few years ago is that it's totally possible for me to improvise half-decent fugues and canons on my personal piano. Certainly nothing to write home about, in my case, but it was still surprising. I've been playing piano for quite a long time, but I'm no savant -- just a mediocre player. Still, I'd read and heard about all these elaborate fugues and thought, "Wow, how could anyone be that brilliant, just to get started?" Well, it's not as hard as I thought. I sat down and tried it. Used some basic harmonic and rhythmic tricks to keep the piece driving, you know? Focus on the bass, focus on the themes, and put it together. Simple. The composers of the Baroque era had a few themes that they were able to get a handle on, a few short phrases, and then they'd set the metronome working and see what fit. They did it long enough that the structures grew at once more complex and more direct, over a few hundred years. It was a process.
• • •
The Door's Locked -- a New Way In
Power rankings are constructed to be easy to improvise -- you go down the list in a few sittings. You could design a half-credible flow chart for how most people compose them: "Here's the top team in the league." "Now who's worse?" "How much worse are they?" "Did I miss a team?" And so on. The point of power rankings and their structure is to quickly get something written about every team. It's a sports-writing gimmick, not that there's anything wrong with that. That problem with divisional rivals getting the baroque treatment? It's a totally valid concern. But it comes from the fact that no one can keep in their heads 30 teams and the subtle balances of power that something like an injury will affect. And it's a zero-sum season: you think the Spurs will win 55 games this season? Alright, but that also means there are 55 fewer wins up for grabs for the rest of the teams, 5 fewer available than for someone else predicting the Spurs win 50. No one has the entire schedule in their heads. No one knows the minor detail that the Warriors play the Suns in the first game of the season, so an injury probably won't affect their win/loss record... or, if they do, no one can balance that detail with the schedules of the Spurs and Mavs and Lakers and Clippers, etc., and engineer the perfect mental model to explicate the subtle calculus of injuries and limited options and just plain unknowns that are so endemic to sports.
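The zero-sum bookkeeping, at least, is easy to pin down even when the projections themselves aren't: a full 82-game season with 30 teams produces exactly 1,230 wins, so every win you hand one team comes out of everyone else's pool. Here's a minimal sketch of that sanity check -- the team names and win totals are hypothetical placeholders, not anyone's actual projections.

```python
# A minimal sketch of the zero-sum constraint on win projections.
# The teams and totals below are hypothetical placeholders.

GAMES_PER_TEAM = 82
NUM_TEAMS = 30
TOTAL_WINS = NUM_TEAMS * GAMES_PER_TEAM // 2  # every game produces exactly one win: 1230

projected_wins = {
    "Spurs": 55,
    "Thunder": 57,
    "Lakers": 50,
    # ...the other 27 teams would go here...
}

def check_projections(projections):
    """Report how many wins a set of projections leaves for everyone else,
    and flag a full-league projection that doesn't sum to 1,230."""
    claimed = sum(projections.values())
    remaining = TOTAL_WINS - claimed
    if len(projections) == NUM_TEAMS and claimed != TOTAL_WINS:
        print(f"Inconsistent: {claimed} wins projected, but the league only produces {TOTAL_WINS}.")
    else:
        print(f"{claimed} wins claimed by {len(projections)} teams; {remaining} left for everyone else.")

check_projections(projected_wins)
```

The arithmetic is trivial; the hard part -- which teams those remaining wins actually belong to -- is exactly what no one can hold in their head.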
Power rankings offer a quick, elegant, reductive way out. Go team by team, and if you don't like it, change up some numbers. In a roundabout way, it's also the famous problem of stats vs. intuition. It's not usually a strict "stats vs. eyes" debate as some would like to believe; instead, it's usually a debate between one person's pet set of statistics and interpretations versus another's pet set. You might be able to project that Eric Gordon is a better player than last season, yes. Valid qualitative observation. But how do you balance that with a lot of other shooting guards also getting better? How do you weigh the statistical effect of rule changes, like the rip-through foul getting altered before last season? How do you take into account Gordon's changed role on a (vastly) changed team? How do you compare how his improvement affects his team to how other similar shooting guards' improvements will affect theirs? What about the risk of uncertainty -- did you scout him last season? Did you document him last season? Even if you did -- how do you know it's not your eyes that changed? How do you know the stats he has put up so far mean the same thing they did last season?
This isn't necessarily to advocate for more statistics -- common interpretations of stats suffer similar problems. As soon as you pretend otherwise, as soon as you try to run without shoes in the winter, you start to stub your toes and feel the frostbite. SRS for Miami does not have the same meaning as SRS for Boston, even though the calculation is made the same way, even if you adjust for pace. They play a different way qualitatively, and SRS can't necessarily capture that, especially when you get down to the level of matchups. All SRS does is give you a good and immediate sense of a team's quality. So you make an internal decision. Is Miami better than their SRS? Is Boston better than theirs? Let's look at more stats! But then, infuriatingly, you're back where you started. And you're back to intuition, and looking at secondary and tertiary stats and trying to glean significance. "How bad will the Clippers (or Spurs) be at protecting the rim? Is this an epochal concern, or is it going to amount to 1 point per 100 possessions? Will teams plan to get to the rim more often, making it a much worse flaw than last year's stats suggest? Or are those stats already taking that into account?"
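For what it's worth, the mechanics behind an SRS number are simple enough to sketch: a team's rating is its average point margin plus the average rating of its opponents, solved by iterating until the numbers settle. The sketch below uses a handful of invented game results purely for illustration, and it ignores the pace and home-court wrinkles a real implementation might bother with -- the point is that the formula is identical for every team, which is exactly why it can't tell you how differently two teams arrived at the same number.

```python
from collections import defaultdict

# Invented game results, purely for illustration: (home, away, home_pts, away_pts).
games = [
    ("MIA", "BOS", 105, 98),
    ("BOS", "NYK", 96, 92),
    ("NYK", "MIA", 88, 101),
    ("MIA", "NYK", 99, 94),
]

margins = defaultdict(list)    # team -> point margins in its games
opponents = defaultdict(list)  # team -> opponents faced, one entry per game

for home, away, hp, ap in games:
    margins[home].append(hp - ap)
    margins[away].append(ap - hp)
    opponents[home].append(away)
    opponents[away].append(home)

mov = {t: sum(m) / len(m) for t, m in margins.items()}  # raw average margin of victory
ratings = dict(mov)                                      # starting guess

# Iterate toward the fixed point: rating = own margin + average opponent rating.
for _ in range(100):
    new_ratings = {}
    for team in ratings:
        sos = sum(ratings[opp] for opp in opponents[team]) / len(opponents[team])
        new_ratings[team] = mov[team] + sos
    mean = sum(new_ratings.values()) / len(new_ratings)
    ratings = {t: r - mean for t, r in new_ratings.items()}  # keep ratings centered on zero

for team, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {rating:+.2f}")
```

Fed a full season of games, something like this settles into familiar SRS-style numbers. What it will never do is tell you whether Miami's number and Boston's number mean the same thing on the floor.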
The point here is to recognize that when people (be they writers or fans or soulless statisticians [Ed. Note: Hey, I heard that.]) make these sorts of advanced projections about hundreds of unknown variables, these are the pitfalls. Baroque reasoning inevitably becomes endemic to how we deal with that much data. We need to realize that self-consistent stat tables on the one hand and (totally baroque) presentations like power rankings on the other don't offer a way out. They offer a way in, for writers to check the thermostat while setting it until the room feels comfortable. We all know that setting the thermostat doesn't mean everyone else in the room will be comfortable. It's a complicated problem, and instead of admitting that we are cosmically adrift trying to make sense of something unknowable, we still want to boldly insist that we know the standings of the West five months from now, or that the only unknowns relevant to our prediction are simple, bite-sized factoids like whether Manu will be healthy in March.
You may be totally right about the Clippers being worse than those power rankings suggest. Or you may simply be right that self-consistency is missing from the rankings -- but try doing the power rankings yourself. You'll find them exceedingly easy to begin, and then, if you're that kind of daring soul, you'll end up where all of us end up in the land of power rankings: going absolutely nuts trying to get them right and perfectly balanced. Eventually you'll settle on some sort of mind-shattering movie set to Bach as you indifferently hit "Publish," sending the rankings out for all the world to see, as though your mouse were the trigger of a gun. You know not how much fury you'll cause your readers, and then you'll laugh yourself to a restless sleep. You wake up in a cold sweat, wishing to God you'd ranked the Clippers lower. I mean, after all. Your ranking is based on Lamar Odom being healthy and willing to play 20+ mpg, and Ryan Hollins being an asset, and 40-year-old players providing serious contributions, and just this huge myriad of --
... I mean, seriously, it's just not happening. Sorry.