The Fruitless Pursuit of Objective Optimality

Posted on Fri 04 October 2013 in Features by Aaron McGuire


There are a few cardinal rules in statistics. Correlation is not causation (although it often portends it). There is rarely a single cause behind a complex event (although one is often more important than the others). Then there's the big one: you simply can't model a binary outcome with a linear regression model. If you're modeling to a zero/one output (think wins/losses, hits/outs, makes/misses), logistic regression is clearly superior to linear regression. There's no scenario where linear regression is acceptable for that kind of data. You are doing your data a gross disservice and breaking all assumptions of your model. To put it in layman's terms: if you use the wrong model with your data, you f**ked up. That's the one unimpeachable truth in all of statistics. Right?

As my uncanny vehemence on the point might imply, that's not actually the case. Linear regression is often sub-optimal in cases of binary outcomes, it's true. And it's important to teach first-year statisticians to always take care in picking their model. Taking a raw linear regression model and expecting it to produce results fitting expectations on a binary outcome is doomed to fail -- you'll get outputs beyond your expected values and coefficients that honestly don't make sense. But I was recently privy to a talk that made me realize something important. The clever statistician can actually get around that problem. Completely side-step it, in fact. It takes a little bit of post-run tinkering to adjust your linear model to a logistic scale -- I won't give you the gory details, but: you need to convert the coefficients through a surprisingly simple transformation (arrived at by equating the derivatives of your respective loss functions) to apply proper bounding to your outputs. Then you need to convert the intercept using a more complicated integral. But that's all math you can do by hand.
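For the curious, here's a rough sketch of the general idea in Python. Fair warning: this is not the exact transformation from the talk. It's a common back-of-the-envelope stand-in, under my own assumptions. The slope is rescaled by dividing the linear coefficient by p(1 - p), where p is the overall rate of ones (that's what you get by matching the slope of the logistic curve to the slope of the line at the average outcome). The intercept is then found by a simple numerical search rather than an integral, so that the average predicted probability matches the observed rate. Every function name here (`fit_ols`, `linear_to_logit`, `simulate`) is hypothetical, and the data is simulated.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_ols(xs, ys):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def linear_to_logit(xs, ys):
    """Approximate logistic (intercept, slope) from a linear fit on 0/1 data.

    Assumption: slope is rescaled by 1 / (p_bar * (1 - p_bar)); the intercept
    is solved by bisection so the mean predicted probability equals p_bar.
    """
    _, linear_slope = fit_ols(xs, ys)
    p_bar = sum(ys) / len(ys)
    logit_slope = linear_slope / (p_bar * (1.0 - p_bar))

    lo, hi = -50.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        mean_p = sum(sigmoid(mid + logit_slope * x) for x in xs) / len(xs)
        if mean_p < p_bar:
            lo = mid  # predictions too low on average: raise the intercept
        else:
            hi = mid
    return (lo + hi) / 2.0, logit_slope


def simulate(n, true_intercept, true_slope, seed=0):
    """Simulated 0/1 outcomes from a known logistic model (illustration only)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [1 if rng.random() < sigmoid(true_intercept + true_slope * x) else 0
          for x in xs]
    return xs, ys


if __name__ == "__main__":
    xs, ys = simulate(5000, true_intercept=-0.5, true_slope=0.5)
    intercept, slope = linear_to_logit(xs, ys)
    print("approximate logistic intercept:", intercept)
    print("approximate logistic slope:", slope)
```

On simulated data with a modest effect, the recovered slope should land in the neighborhood of the true one. The approximation tends to get worse as effects get larger, so treat it as an illustration of the trick rather than a recipe.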

Linear regression DOES break the assumptions of a binary outcome. But when you apply the necessary transformations to compare apples to apples rather than apples to oranges, the cost of breaking that assumption can be negligible. In fact, in certain datasets, the binary outcome reflects a normal distribution just enough that a transformed linear regression is actually slightly superior to a logistic model. And even in cases where it ISN'T the optimal path, logistic regression models take quite a lot more processing power than linear regressions on even the most modern servers. Hence, modeling data in a linear regression framework with the proper transformations can be significantly more computationally efficient. When you're dealing with data orders of magnitude above the kinds you examine in college (think datasets over 500 gigabytes, which I work with surprisingly often), understanding link functions and ways to convert linear regression estimates to logistic approximations can save you days of processing time and get you quicker results that are nearly as good. The moral: even a discipline's most sacred rules can be broken by a clever, intuitive agent who's playing even a slightly different game.

The rules are the rules. Until they aren't.

• • •

The NBA is almost back. It's close. So close you can taste it. Close your eyes and put your ear to a basketball. Can you hear it? The squeak of the hardwood, the squeal of new Jordans, the swoosh of the net? ... alright, honestly, I can't hear it either. And I probably look really silly right now sitting in my office holding a basketball to my ear. If basketballs were seashells we'd definitely be able to hear it, though. And that's what matters. The disparate agents on your favorite team are collecting. The old and the new, the wizened and the precocious, the Juwan and the Jrue. We're all rapt in anticipation, I tell you what. At this stage of the game nobody really knows what's going to happen. That's the real beauty of the preseason. Every team that wants to be is a playoff team -- every team that's punted the year has the first overall pick in their sights. Nobody's mediocre. Nobody's adrift. We're a winner, damnit!

And so the fans and players enter the NBA's new season with high hopes and a fervent desire to get things right. But it's useful to take a step back and really ruminate on what that means. There are a few rules that the mass commentariat generally agrees on. Contested long two pointers are the worst. Dunks and threes are the greatest. Efficiency reigns. Wins are valuable -- a title, priceless. Sports is a binary exultation of right and wrong. Play the "right" way, you win. Play the "wrong" way, you lose. I'd like to refute that, if only just. Because efficiency, wins, titles are all optimal in a certain frame of thought. But that's the key, isn't it? It's a certain frame of thought.

Sports, like art, is a pursuit of what you value. One must bear in mind the obvious -- any given fan chooses the parameters of their own optimality. And any given player chooses the parameters of THEIR own optimality. Some fans and players have their own deep-seated appreciation for raw efficiency and the calculus of the ideal. But to pretend that those fans and players are the only game in town is to miss the forest for the trees -- there are fans who don't give a moment's thought to the efficiency of the game before them, and there are players who don't really give a flip that the corner three is almost always superior to a fruitless top-of-the-key chuck. There are people who couldn't live without a hyper-efficient basketball team and there are people who couldn't care less. Variety is the spice of life.

• • •

For me, it boils down to this. We can look for what makes a winning basketball player. It's a valuable search, and it's one I'll join in often throughout this year's action. I don't mean to nag, or prescribe, or wag my finger. I'll be right there in the trenches with you, scouring for efficiency and looking for the next big innovation in pursuit of eternal wins. There's always going to be more to learn about the game and the agents that enact it. It's not that we should STOP looking for that. The search for a sort of basketball ideal -- that perfect play, that perfect game, that perfect moment -- is the kind of holy grail quest that can captivate for lifetimes. But sometimes I wonder if the lay basketblogger has overvalued efficiency to the point of incomprehensible lust. I point you to one of the most maligned statements from media day:

Is he wrong? Not factually, although his implication here is somewhat tragic from an efficiency perspective. It's classic Monta behavior. He's being intransigent. He could take better shots if he wanted to. He could be less of a drag on his offense. And he could be "better", by the normal definition of the word. But from a devil's advocate perspective, there's something to be said for remaining true to one's game and sticking to one's guns in the face of mountains of evidence to the contrary. Is it always going to work out for the best? Obviously not, if his goal is to win games. But anyone who's enjoyed their fair share of Cervantes and Camus should be intimately familiar with the idea of a tragic hero. And that's essentially the role Monta's playing here. He's conceding that he takes bad shots and conceding that he could be better. But he's gotten where he is today by playing a certain brand of basketball. Perhaps he likes feeling control over his destiny. Perhaps he feels that success would hardly taste as sweet if he gave up his guns to get there. Perhaps he just likes it better.

Although it's difficult to write a story commending him for that, it's not particularly hard to feel a faint tug away from a bleak world of black and white outcomes. You don't need to be Mick Jagger to feel sympathy for Monta's efficiency-forsaken devil. There's more than one way to play the game and there's more than one way to feel like a winner. There are "better" ways to win, certainly, if winning is your only goal. But basketball is a game of feelings and desires as much as it is a stark pursuit of the angular "W." If Monta feels better when he wins a game his way, that's his prerogative. If a fan prefers to watch Allen Iverson and Kobe Bryant chuck prayers in pursuit of a heroic victory in a hard-fought game, that's their bag. If a coach overvalues an inefficient oldie because he plays the game in a way that fits the coach's style, that's their deal. Et cetera, et cetera.

At the end of the day, I'm a fan who values efficiency and the tenets of winning above many things. I appreciate watching a pinpoint Popovich offense predicated on every player's perfect pass. I appreciate a defense where no man misses their cue. But I can also appreciate the allure of the tragic hero. One can value the sharp report of the pistol as the gunner shoots his team in the foot without denying the dread inefficiency of the play. And as we enter a new season full of hope and wonder, it's useful to remind oneself of the many different ways to love our favorite game, and to appreciate the league's Don Quixotes. Those merry players that aren't anywhere near the best that they can be, but are comfortable enough to own up to their foibles and win or lose in their own tragic way.

They do not value efficiency and wins above all things. They are imperfect and improper without regret or regard for convention. And their steadfast devotion to that which popular thought considers outmoded and discarded can be the incomprehensible dash of spice that makes the NBA so enthralling, if only you chance to let it.

• The 2014 season begins in 26 days. •

Monta Ellis have it all (credit to USA Today for the photo)