The Hard Problems AI Can’t (Yet) Touch

It's tempting to consider the progress of AI as though it were a single monolithic entity, advancing towards human intelligence on all fronts. But today's machine learning only addresses problems with simple, easily quantified objectives.

Who's better, LeBron James or Stephen Curry? Which country is more powerful, the United States or China? How advanced is modern AI compared to humans? Of the three ridiculous questions above, which appears most underspecified? However misguidedly, we often discuss complex, multi-faceted issues as though they were easily reduced to single scalar quantities.

In the past few years, modern AI techniques have made great strides, accomplishing a number of feats traditionally associated with human intelligence. In particular, machine learning systems using deep neural networks now perform speech recognition and transcription at a human level. They also caption images, annotating them with reasonable natural language descriptions. Just a couple of months ago, a system marrying reinforcement learning with deep learning beat a top human professional at Go, long thought to be the hardest board game for computers to play at a human level.

Professional Go Player Lee Se-dol Set To Play Google AlphaGo

Amid this success, it's tempting to consider the progress of AI as though it were a single monolithic entity, advancing equally on all fronts. And if we restrict attention to supervised learning and reinforcement learning for games, this seems believable. But we should pause before giving in to Singularitarian magical thinking about the advance of AI. While some pop-futurists suggest otherwise, AI doesn't progress uniformly in all directions as cost per calculation falls. Instead, we have made rapid progress on some classes of problems while stagnating on others. Our successes have come on problems where objectives are easily and uncontroversially quantified. Problems with fuzzier objectives, such as personal assistants, conversation agents, and medical decision-making, have seen comparatively little advance towards human abilities.

Which Problems are Hard?

One recurring theme throughout the history of AI and machine learning research is that we are terrible at determining exactly which aspects of human intelligence are remarkable. Thirty years ago, many thought that beating a grandmaster at chess would constitute a greater feat of intelligence than matching the cognitive feats of a high-schooler. And yet while current champion Magnus Carlsen can't withstand a MacBook Pro running tree search, modern dialog bots can't match the conversational prowess of an average 10-year-old. Perhaps our values regarding various types of intelligence are rooted in economic thinking. All else equal, scarce skills seem more valuable. Chess champions may appear to possess greater intelligence than socialites simply because there are fewer of them.

Machine learning problems can be hard for many reasons. Most publicly visible AI addresses well-formed pattern matching problems. These include supervised learning problems like image classification and reinforcement learning problems like playing chess, Go, or Atari games. Some of these problems are harder than others on account of a paucity of training data, a large action space or many classes, or because the patterns to be learned are especially complex, as with raw audio or video data. On all these fronts, machine learning research has made considerable progress.

However, machine learning problems can also be difficult for another, overlooked reason. For many settings in which we might want to introduce an AI agent, the problems are not well-formed. That is, it's not at all obvious what objective should be optimized.

Nearly all machine learning tasks consist of some kind of optimization. For image classification, our goal is to maximize the percentage of images that are correctly classified. For language translation, our goal is to output a string that closely agrees with a set of ground truth candidate translations. For games like chess or Go, our goal is simply to win. In short, both supervised learning and reinforcement learning assume a priori knowledge of a single scalar quantity whose maximization equates to success.
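As a rough illustration of what "a single scalar quantity" means in practice, here is a minimal Python sketch of the three objectives just mentioned. The function names and the toy inputs are assumptions made up for this example, and the word-overlap score is only a crude stand-in for real translation metrics, not any library's implementation.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Image classification: fraction of examples labeled correctly."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def unigram_overlap(candidate: str, references: Sequence[str]) -> float:
    """Translation (crude stand-in): best fraction of candidate words
    that also appear in any one of the reference translations."""
    words = candidate.lower().split()
    if not words:
        return 0.0
    best = 0.0
    for ref in references:
        ref_words = set(ref.lower().split())
        matched = sum(w in ref_words for w in words)
        best = max(best, matched / len(words))
    return best


def game_reward(winner: str, player: str) -> float:
    """Board games: 1 for a win, 0 otherwise."""
    return 1.0 if winner == player else 0.0


# Each task, however different, is reduced to one number to maximize.
print(accuracy(["cat", "dog", "cat", "bird"], ["cat", "dog", "dog", "bird"]))
print(unigram_overlap("the cat sat", ["the cat sat down", "a cat was sitting"]))
print(game_reward(winner="black", player="black"))
```

In every case, someone decided in advance what the number should be; the learning algorithm only ever sees the score.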

But for many real-world settings where we might want to insert an AI agent, no one can say at present what the objective function should be. For example, what precisely is the goal of a doctor? How can we distill the success of a medical professional to a single reward that is doled out as outcomes become known? How precisely do we measure quality of life? What is the trade-off between limbs and longevity? How much should the doctor value revenue vs. patient health? (If the value of life is infinite, then all patients should be seen for free.) Sometimes the objective function for the doctor might vary from patient to patient depending on their preferences. Human doctors implicitly evaluate these trade-offs constantly, but until we learn to codify our objectives, current AI may remain confined to more isolated, low-level pattern recognition problems.
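The difficulty can be made concrete with a toy sketch in Python. Everything here is hypothetical: the two treatment options, the outcome scores, and the weight vectors are invented for illustration and carry no medical meaning. The sketch only shows that once competing goals are collapsed into one weighted sum, the "optimal" decision is determined by weights that nobody knows how to set.

```python
from typing import Dict

# Hypothetical outcome estimates for two treatment options,
# each scored from 0 (worst) to 1 (best). Invented numbers.
TREATMENTS: Dict[str, Dict[str, float]] = {
    "amputate": {"longevity": 0.9, "mobility": 0.3, "cost_savings": 0.5},
    "conservative": {"longevity": 0.6, "mobility": 0.9, "cost_savings": 0.7},
}


def scalarize(outcome: Dict[str, float], weights: Dict[str, float]) -> float:
    """Collapse several outcome scores into one reward via a weighted sum."""
    return sum(weights[k] * outcome[k] for k in weights)


def best_treatment(weights: Dict[str, float]) -> str:
    """Return the option that maximizes the scalarized reward."""
    return max(TREATMENTS, key=lambda name: scalarize(TREATMENTS[name], weights))


# Two patients (or two doctors) with different priorities.
longevity_first = {"longevity": 0.8, "mobility": 0.1, "cost_savings": 0.1}
mobility_first = {"longevity": 0.2, "mobility": 0.7, "cost_savings": 0.1}

print(best_treatment(longevity_first))  # amputate
print(best_treatment(mobility_first))   # conservative
```

The optimizer is trivial; choosing the weights is the unsolved part, and that choice may legitimately differ from patient to patient.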


Similarly, what is the scalar quantity optimized by a conversant in dialog? Is it simply to maximize engagement? What if engagement is maximized by trolling? Is the goal to be kind? To be informative? To build lasting relationships? Should there be a penalty for being an ass, and if so, how is that quantified?

While we have made dramatic strides in pattern recognition, making headway on difficult problems in supervised and reinforcement learning, our entire paradigm for learning requires well-formed objectives that are determined a priori. And yet for the complicated settings in which we envisage "hard AI" or human-like intelligence, determining these objectives presents a formidable obstacle.

Moreover, it may be precisely the uncontroversial indicators of success (test set accuracy, chess rating achieved) that have enabled progress in the field. Unlike other areas that are stymied by subjectivity, machine learning has benefited from the objectivity conferred by these metrics. How can the community achieve similarly harmonious progress towards more general AI if the objectives are only nebulously defined?

Machine learning has entered its golden age. It's fascinating intellectually and impactful economically. As progress appears to roll in steadily, it's tempting to view this as a one-dimensional march towards a super-intelligence that matches or exceeds human capacity in all endeavors. However, we might remember our atrocious track record for recognizing what precisely about our intelligence is interesting, valuable, or difficult to replicate. Perhaps it's our ability to operate effectively despite hazily defined objectives that is most remarkable, and that is an area where machine learning has yet to make much progress.

Zachary Chase Lipton is a PhD student in the Computer Science Engineering department at the University of California, San Diego. Funded by the Division of Biomedical Informatics, he is interested in both theoretical foundations and applications of machine learning. In addition to his work at UCSD, he has interned at Microsoft Research Labs and as a Machine Learning Scientist at Amazon, and is a Contributing Editor at KDnuggets.