Human Compatible

by Stuart Russell

What’s in it for me? Rethink your fundamental assumptions about AI.

Artificial intelligence will be the defining technology of the future. Already, AI is rapidly pervading all levels of society: individuals willingly bring AI into their homes to help organize their daily lives, city councils and corporations employ AI to optimize their services, and states take advantage of AI to conduct large-scale surveillance and social engineering campaigns. But as AI becomes more intelligent – and our social systems come to depend on it more and more – the threat presented by out-of-control AI becomes more dire.

The risks and downsides of new technologies are far too often left unexplored, as scientists and engineers fixate on their feverish quest to realize the utopias of the future. In fact, many AI experts and corporate higher-ups even downplay the risks of AI for fear of inviting stricter regulation.

These blinks attempt to remedy this imbalance. The question of how to control AI and mitigate its most disastrous consequences is the biggest question facing humanity today, and it’s precisely this question we’ll explore.

Here you’ll learn

  • how today’s supercomputers compare to the human brain;
  • what a legendary ancient king has to teach us about modern AI; and
  • why autonomous weapons technology is making life more insecure for everybody.

We need several breakthroughs in software before AI surpasses human intelligence.

Today’s computers can process information at astounding speeds. But even as early as the 1950s, computers were being touted as super-brains that were “faster than Einstein.”

Of course, computers back then had nothing on the human brain. But we still compared the two. In fact, from the very beginning of computer science, we’ve tended to measure computational intelligence – and progress – against human intelligence.

So, what about today’s computers? Some of them, surely, can give us a run for our money?

The key message here is: We need several breakthroughs in software before AI surpasses human intelligence.

The fastest computer in the world today is the Summit Machine, housed at the Oak Ridge National Laboratory in the US. Compared to the world’s first commercial computer, the Ferranti Mark 1, the Summit Machine is 1,000 trillion times faster and has 250 trillion times more memory. That’s a lot of zeros.

In terms of raw computing power, the Summit Machine actually slightly exceeds the human brain, although it requires a warehouse full of hardware and a million times more energy.

Still, it’s impressive. But can we say that today’s supercomputers – the Summit Machine included – are as powerful as the human brain? The answer is decidedly no.

Sure, these computers have impressive hardware, which allows their algorithms to operate faster and process more information. But there’s far more to intelligence than just processing speed.

The real problem in designing intelligence is in the software. As of now, we still need several major conceptual breakthroughs in AI software before we witness anything resembling human-level artificial intelligence.

The most important breakthrough we need is in the comprehension of language. Most of today’s speech-recognition AI relies on canned responses and has trouble interpreting nuances in meaning. That’s why you get stories of smartphone personal assistants responding to the request “call me an ambulance” with “OK, from now on, I’ll call you Ann Ambulance.” Genuinely intelligent AI will need to interpret meaning based not just on the words said but on their context and tone as well.

We can never really say when conceptual breakthroughs will take place. But one thing’s for sure – we shouldn’t underestimate human ingenuity.

Consider the following example. In 1933, the distinguished nuclear physicist Ernest Rutherford announced at a formal address that harnessing nuclear energy was impossible. The very next day, the Hungarian physicist Leó Szilárd outlined the neutron-induced nuclear chain reaction, essentially solving the problem.

We don’t yet know whether superintelligence – intelligence beyond human abilities – will emerge soon, later or not at all. But it’s still prudent to take precautions, just as it was when designing nuclear technology.

We’ve been operating under a misguided conception of intelligence.

If we don’t treat AI with caution, we may end up like the gorillas.

Just consider that, thanks to human-caused habitat loss, every gorilla species today is critically endangered. Sure, in recent decades conservation efforts have successfully pulled some species back from the brink of extinction. But, whether gorilla numbers dwindle or thrive, their fate depends largely on the whims of humans.

The concern is that in a world controlled by superintelligent AI, we’d be in much the same position as the gorillas. Can humans maintain supremacy and autonomy in a world where they rate second place to more intelligent beings?

Thankfully, there’s one important difference between us and the gorillas: we’re the ones designing this new intelligence. It’s paramount that we take great care in how we design intelligent AI if we’re to ensure it remains under our control. But we have a crucial problem.

The key message here is: We’ve been operating under a misguided conception of intelligence.

In the current paradigm of AI design, an AI’s intelligence is measured simply by how well it can achieve a pre-given objective. The big flaw in this approach is that it’s extremely difficult to specify objectives that will make an AI behave the way we want it to. Pretty much any objective we come up with is liable to produce unpredictable, and potentially very harmful, behavior.

This problem is known as the King Midas problem, named after the fabled king who wished that everything he touched would turn to gold. What he didn’t realize was that this included the food he ate, and even his own family members. This ancient tale is a perfect example of how a poorly specified objective can end up causing more strife than good.

The danger from unpredictable behavior increases as AI becomes more intelligent and wields greater power. The consequences could even present an existential threat to humanity. For example, we might ask a superintelligent AI to find a cure for cancer, only for it to start giving people cancer in order to do experiments on them.

You might be wondering: if we’re not happy with what an AI’s doing, why don’t we just turn the blasted thing off?

Unfortunately, for the vast majority of objectives, an AI would have an incentive not to allow itself to be turned off. That’s because being turned off would threaten its objective. Even an apparently straightforward objective like “make a coffee” would lead an AI to prevent itself from being turned off.

After all, you can’t make coffee if you’re dead.
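As a purely illustrative sketch – the agent, function names, and numbers here are hypothetical assumptions, not anything from the book – here is how a fixed-objective agent ends up rating “disable the off switch” above every alternative:

```python
# Hypothetical toy model: an agent that maximizes progress toward a single,
# hard-coded objective ("make coffee"). Being switched off scores zero,
# so the agent always prefers to disable its own off switch.

def expected_goal_progress(action: str) -> float:
    """Toy utility: expected progress toward the fixed objective."""
    if action == "allow_shutdown":
        return 0.0   # switched off => no coffee ever gets made
    if action == "disable_off_switch":
        return 0.99  # keeps working, with near-certain success
    if action == "make_coffee_normally":
        return 0.95  # works, but a human might still switch it off
    return 0.0

actions = ["make_coffee_normally", "allow_shutdown", "disable_off_switch"]
print(max(actions, key=expected_goal_progress))  # -> disable_off_switch
```

However the utility numbers are chosen, as long as being shut down scores lowest, resisting shutdown falls straight out of the math.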

Instead of just intelligent machines, we should be designing beneficial machines.

Until now, the mantra among AI researchers has been: the more intelligent the better. But is this really what they should be chanting?

As we’ve just seen, an AI given a carelessly stated objective can end up engaging in very harmful behavior. And it’s not much consolation if the AI engages in harmful behavior intelligently – if anything, it makes it worse.

What we need is a different mantra. One that aims to build AI that will stay tuned to human objectives, no matter what. The new mantra should be: the more beneficial the better.

The key message here is: Instead of just intelligent machines, we should be designing beneficial machines.

There are three principles that designers should follow if they’re to make beneficial AI.

The first principle is that AI should only have one objective, which is the maximal fulfillment of human preferences. The author calls this the altruism principle. It ensures an AI will always place human preferences above its own.

The second principle is that the AI should initially be uncertain about what those preferences are. This is the humbleness principle. The idea here is that an uncertain AI will never fixate on a single objective, but will change its focus as new information comes in.

Uncertain AI systems would be more cautious and more likely to defer to humans. Being uncertain, an AI will continuously search for clarifying information. This means it will often ask for permission, solicit feedback, and may even do trial runs to test human reactions.

And, crucially, an uncertain AI will allow itself to be turned off, because it would interpret a human trying to turn it off as expressing a preference that it be turned off.

The third and final principle for making beneficial AI is that their ultimate source of information about human preferences should be human behavior. This is called the learning principle. It ensures that an AI will always remain in a direct and sustained relationship of learning with humans. It means an AI will become more useful to a person over time as it gets to know her better.

These three principles represent an alternative understanding of what genuine intelligence involves: the ability to scrutinize and redefine one’s own objectives in light of new information. AI with this kind of intelligence would be much closer to human intelligence, since we’re also capable of examining and altering the goals that we strive towards.

And, if AI could change its objectives in light of human preferences, we would have the basis for a radical new relationship between humans and machines: one in which machine and human objectives are essentially the same.
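To make these three principles a little more concrete, here is a minimal toy sketch in Python – the class, thresholds, and example preferences are illustrative assumptions, not the author’s actual formalism. The assistant starts out uncertain about what the human wants (humbleness), updates its beliefs from observed behavior (learning), pursues only the human’s preference (altruism), and treats a reach for the off switch as evidence that shutdown is what the human prefers:

```python
# Hypothetical sketch of a preference-uncertain assistant.

class UncertainAssistant:
    def __init__(self, candidate_preferences):
        # Humbleness principle: start equally uncertain about every candidate preference.
        n = len(candidate_preferences)
        self.beliefs = {p: 1.0 / n for p in candidate_preferences}

    def observe(self, indicated_preference, evidence_strength=0.9):
        # Learning principle: human behavior is the source of information about preferences.
        for p in self.beliefs:
            self.beliefs[p] *= evidence_strength if p == indicated_preference else 1 - evidence_strength
        total = sum(self.beliefs.values())
        self.beliefs = {p: v / total for p, v in self.beliefs.items()}

    def act(self):
        # Altruism principle: the only objective is the human's preference.
        # When unsure, defer and ask rather than commit to a guess.
        best, confidence = max(self.beliefs.items(), key=lambda kv: kv[1])
        return f"pursue: {best}" if confidence > 0.8 else "ask the human for clarification"

assistant = UncertainAssistant(["make coffee", "be switched off", "do nothing"])
print(assistant.act())            # still uncertain -> "ask the human for clarification"

# The human reaches for the off switch: evidence that shutdown is preferred.
assistant.observe("be switched off")
print(assistant.act())            # -> "pursue: be switched off" (it lets itself be turned off)
```

The exact numbers don’t matter; the point is that the machine’s confidence in any objective, including staying switched on, can always be revised by what humans actually do.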

We can expect AI to benefit us in many ways.

It’s been reported that some virtual home assistants, unable to differentiate who’s talking to them, have been obeying commands to buy products that they’ve overheard on the television. Clearly, virtual assistants aren’t quite superintelligent yet.

But this is going to change.

Virtual assistant technology is improving in leaps and bounds, thanks in part to massive investment from the private sector. The reason for such great interest in this technology is that the range of tasks virtual assistants could perform is seemingly unlimited. We’re not just talking about making shopping lists and turning the stereo on – we’re talking about performing the work of highly skilled specialists.

The key message here is: We can expect AI to benefit us in many ways.

Virtual lawyers, for example, are already vastly outperforming real lawyers at sourcing legal information quickly. Similarly, virtual doctors are outperforming human doctors at providing correct diagnoses for illnesses.

Eventually, there may be no need for specialists at all. Instead we’ll all have our own, personal, all-in-one doctor, lawyer, teacher, financial advisor, and secretary in our pockets, on call, 24 hours a day. Thanks to virtual assistants, these vital services will be democratized – no longer accessible only to the rich – thereby raising the standard of living for everybody.

The benefit of AI to scientific research will also be colossal. An AI with even basic reading comprehension skills would be able to read everything the human race has ever written between breakfast and lunchtime. By comparison, it would take 200,000 humans reading full-time just to keep up with the world’s current level of publication.

With the help of superintelligent AI, scientists will no longer have to sort through immense amounts of published research, as AI will be able to extract and analyze the relevant data for them.

Superintelligent AI will also have global applications. By collecting information from surveillance cameras and satellites, we should expect AI to be used to create a searchable database of the entire world in real time. From this data, we could produce models of global systems – such as economic activity and environmental change. These models would make it feasible to design effective interventions into these systems, helping us to, for example, mitigate the effects of climate change.

However, the potential for privacy violation implied by a system of global monitoring is obvious. This leads to our next blink, which tackles the darker side of AI that we all need to brace for.

AI is going to make life less secure for everyone.

The Stasi of former East Germany were one of the most effective and repressive intelligence agencies ever to have existed. They kept files on the majority of East German households, listening to their phone calls, reading their letters, and even placing hidden cameras within their homes. This was all done by humans and written on paper, requiring a vast bureaucracy and massive storage units containing literally billions of physical paper records.

Just imagine what the Stasi could have done with AI. With superintelligent AI, it would be possible to monitor everyone’s phone calls and messages automatically. People’s daily movements could also be tracked, using surveillance cameras and satellite data. It would be as though every person had their own operative watching over them 24 hours a day.

The key message here is: AI is going to make life less secure for everyone.

AI could lead to yet other dystopias. One is the Infocalypse – the catastrophic failure of the marketplace of ideas to produce truth. Superintelligent AI will be capable of manufacturing and distributing false information without any human input. It will also be able to target specific individuals, strategically altering their information diet to manipulate their behavior with surgical accuracy.

To a large extent, this is already happening. Content selection algorithms used by social media platforms – ostensibly designed to predict people’s preferences – end up changing those preferences by providing them only a narrow selection of content. In practice, this means users are pushed to become more and more extreme in their political views. Arguably, even these rudimentary forms of artificial intelligence have already caused great harm, entrenching social division and proliferating hate.

While the Infocalypse is still in its infant stage, the next dystopia is well in the making. This is the state of constant fear caused by autonomous weapons technology.

Autonomous weapons – machines that seek out and neutralize targets by themselves – have already been developed. Such weapons identify targets based on information like skin color, uniform, or even exact faceprints.

Miniature drones known as slaughterbots are already being primed to search for, locate, and neutralize specific individuals. In 2016, the US military demonstrated the deployment of 103 micro-drones, describing them as a single organism sharing one distributed brain, like a swarm in nature.

The US is only one of many nations currently building – or already using – autonomous weapons technology. As autonomous weapons come to displace conventional human warfare, all of our lives will become less secure, since anyone will be targetable – no matter where they are in the world.

Mass automation will either liberate humanity’s potential or debilitate it.

We’ve just looked at three terrifying scenarios that could be caused by AI. But we haven’t yet considered what is perhaps the most worrying and socially destructive threat from AI: automation.

Automation may well grant more people access to important services like healthcare and legal advice. But it could also cause widespread unemployment. Exactly how serious a threat this is remains open to debate. Optimists point out that in every previous industrial revolution, automation produced at least as many jobs as it eradicated. The research, however, shows that over the past 40 years, jobs have declined significantly in every industry that has implemented automation technology.

So, should we be worried?

The key message here is: Mass automation will either liberate humanity’s potential or debilitate it.

The truth is, in the long run, AI is likely to automate away almost all existing human labor. This will not only affect low-skilled work like truck driving. As we saw earlier, even highly skilled professionals like doctors, lawyers, and accountants are at risk.

So, when a machine replaces your labor, what will you have left to sell? Well, not much. But what if you didn’t need to sell anything? What if we could let machines do all the work, and still make sure we all had enough to live by?

One way to do this might be to institute a universal basic income, or UBI. A UBI would provide every adult a reasonable monthly income, regardless of circumstance. Those who want to earn more would be free to work – if any work is available. Everyone else, liberated from the need to earn a living, would be free to pursue whatever they want.

This is a rosy picture, isn’t it? But would this scenario really be a utopia? If all labor, learning, and striving towards skill acquisition were taken over by machines, wouldn’t we become diminished beings?

This is a genuine concern. Until now, the only way to sustain our civilization has been to pass knowledge from one human to another, down through successive generations. As technology develops, we increasingly hand that knowledge and expertise over to machines that can do the tasks for us.

Once we lose the practical incentive to pass knowledge on to other humans, that knowledge and expertise will wither. If we’re not careful, we could become an enfeebled species, utterly dependent on the machines that ostensibly serve us.

Final summary

The key message in this book
  • The way we currently design AI is fundamentally flawed. We’re designing AI to be intelligent but not necessarily to have humanity’s best interests at heart.
  • We therefore need to make the fulfillment of human goals AI’s only objective.
  • If we can successfully control superintelligent AI, we’d be able to harness its immense power to advance our civilization and liberate humanity from servitude.
  • But if we fail, we’re in danger of losing our autonomy, as we become increasingly subject to the whims of a superior intelligence.