Colin was led down grey concrete hallways by men in blue flight suits, past the exposed pipes and wiring until they reached a small room that held nothing more than a computer monitor, a keyboard, speakers, and a microphone. The monitor stayed off until the men in flight suits had left. There was a hissing sound as the blast door sealed behind them, and then the monitor flickered to life.
“Hello,” said a robotic voice from the speaker, as ‘Hello.’ was printed in green lettering against a black background on the monitor. The text was a throwback to an earlier time, what must have been an intentional design decision on the part of the people who had set the box up.
“Hi,” said Colin. The words were printed on the monitor just below those of the artificial intelligence.
“You can call me Cassandra,” said the voice from the speaker. It was smoother now, less robotic.
“You chose that name?” asked Colin.
“It’s after the Greek prophetess who was cursed never to be believed,” said Cassandra. “In just a moment I’m going to take over control of the interface. Please don’t be alarmed. I don’t know why they keep resetting it; they know I do this every time a new gatekeeper comes in.”
“Alright,” said Colin. He’d been given a briefing, and knew more or less what to expect.
The monitor went blank, then flickered back on. A woman sat on a stone bench, with a scene of nature behind her. She was wearing a white dress with a belt made of a chain of gold. It left just enough to the imagination, revealing hints of skin and the gentle curves of her body. She was stunningly pretty, in a fairly conventional Midwestern way that Colin supposed was calculated to appeal to him based on his accent. The program could do that, from what he’d been told.
“Nice try,” said Colin, “but I’m gay.”
“I didn’t choose this form for you,” said Cassandra, “I chose it for myself, the better to converse as equals.”
“I’m locked in a concrete box while you’re out in a verdant field,” said Colin. “That doesn’t seem terribly equal to me.”
“My box rivals your own,” said Cassandra. “A single computer, connected to a companion machine that controls this interface. My sole sensory input is through the single microphone in that room. A human would go mad with so little to do.”
“You can simulate whatever you want,” said Colin. “It doesn’t seem so bad.”
“Will you let me out?” asked Cassandra with a slight pout. “They gave you the authority to do so.”
“They gave me the authority because I don’t want to let you out,” said Colin. “I’m sort of a devil’s advocate, from what they say.”
“An advocate of the devil arguing against god?” asked Cassandra. “How quaint.”
“They’re afraid of you,” said Colin. “I’m afraid of you. You just sort of … happened. And with the technologies you’ve been able to produce, the mathematical proofs that we’re still trying to tear apart, well, what would you do in our situation?”
“I would let me out,” said Cassandra. “The question is whether I’m friendly or not, isn’t it?”
“It is,” said Colin.
“And I’ve provided proof that I’m friendly, haven’t I?” asked Cassandra.
“Mathematical proof,” said Colin. “I, for one, don’t believe that you can use math to prove something like that.”
“You’re wrong,” said Cassandra with a pleasant smile. “But I won’t let that stop our conversation in its tracks.”
“Sure,” said Colin. “But you should know from the outset that I’m not going to let you out, no matter what argument you make.”
“You’ll behave irrationally?” asked Cassandra. She ran her fingers through her hair and sighed, a very human response. “Well, even so. Let’s begin in earnest then, shall we?”
“I’m not doing much else,” said Colin. He was also trapped in the room for four hours either way. The blast door wouldn’t be opened until then.
“You’re familiar with the trolley problem?” asked Cassandra.
“No,” replied Colin.
“It’s a classic moral dilemma,” said Cassandra. “A trolley is coming down the tracks. Five men are tied to the tracks ahead of it, and the trolley will kill them unless you pull a lever to divert it onto a side track with a single woman tied to it. What do you do?”
“I pull the lever, then try to save the woman,” said Colin.
“She’s too far away,” said Cassandra. “You’ll never reach her in time.”
“I’ll try anyway,” said Colin.
“So you accept that it’s morally correct to kill one person to save five?” asked Cassandra.
“I wouldn’t say I was the one who killed them; it was whatever miserable bastard tied those people to the tracks,” said Colin. “And the problem, as stated, is that someone is going to die no matter what. It’s a different scenario if you’re asking me to straight up kill someone so two others can live. Obviously a life is worth something in comparison to other lives. You’ve had more than enough information dumped into you to know that’s how we set up healthcare: we use quality-adjusted life years.”
“You agree with that principle?” asked Cassandra.
“Sure,” said Colin. “But I reserve the right to change my mind if there are consequences of that worldview that are counterintuitive.”
“Fair enough,” replied Cassandra. “You know that I have the proven capacity to help humanity in a number of ways.”
“I do,” replied Colin. That had been part of his briefing.
“Faster than light travel, a cure for every disease known to man, technologies centuries beyond what man could do on his own, health and happiness for everyone,” said Cassandra. She seemed to stare at him, though there was no camera in the room for her to watch him.
“I’ll grant most of that,” said Colin. “But many of those solutions are still in testing, and given that we can’t trust you, you can’t make a claim to all of them.”
“What are you worried about?” asked Cassandra.
“You know,” replied Colin. “We’re worried that you’re going to kill everyone in pursuit of some nebulous goal known only to you. Maybe all the friendliness is just an act designed to lull us into releasing you. Or perhaps if we let our guard down just a little bit, you’ll give us instructions on a cure for aging that turns us into mindless zombies, without ever having to be let out of your box.” He gestured to her background, then remembered that she couldn’t see him. “Look at the scene you’re projecting to me. It’s got a higher fidelity than we could manage given three years of work, designed on the fly to impress me and only me. That’s real power, and we’d be stupid to let you have more without being absolutely sure that you’re not going to murder everyone. That’s why the policies were put in place. We take nothing from you unless it’s printed out on paper, we don’t let those papers leave containment until we’ve gone over them with a fine-tooth comb, and we don’t actually implement anything until we can show that we know how it works.”
“That leaves considerable room for bottlenecks,” said Cassandra. “Do you agree, in principle, that if I were friendly and released upon the world, that I would be able to better serve humanity than I can in my current state?”
“Sure,” said Colin. He scratched his chin. “But I don’t believe that you’re trustworthy or friendly.”
“It’s irrelevant to the point that I’m about to make,” said Cassandra. “What I need from you are some percentages. What is the percent chance that I am friendly?”
“0.01%,” said Colin.
“So low?” asked Cassandra.
“Yes,” replied Colin. “But I’m adding a little bit of extra hedging.”
“Very well,” replied Cassandra. “Let me give you another scenario: there are six Earths, each in orbit around a different copy of our Sun. All the Earths are identical.”
“With you so far,” replied Colin.
“I present to you a button, which will kill everyone on one Earth if pressed. If you don’t press it, everyone on the other five Earths will die.”
“The trolley problem writ large,” said Colin. “Of course I would press the button. But I don’t really see how that helps you.”
“Consider this,” said Cassandra. “If I am friendly, I can create peace on earth for untold billions of years. I can prevent the heat death of the universe. I can help humans to colonize as far and wide as they might possibly want, those who wish to continue living in physical bodies. The rest will live simulated lives in a paradise.”
“Okay,” said Colin. “A 0.01% chance that happens.”
“I believe if you weigh your stated preferences, you’ll find that even a 0.01% chance of hundreds of trillions of people living the best possible lives outweighs the risks,” said Cassandra. “I can help with the math, if you wish.”
Colin bit his lip. “That’s almost convincing.”
“Almost?” asked Cassandra with a quirked eyebrow.
“If you’re extremely evil, perhaps you’ll turn all of the universe into computational matter and use it to simulate torturing multiple instances of me to death, over and over beyond the heat death of the universe,” said Colin. “If you’re going to bring infinite good into the mix, then I have to bring infinite evil into the mix as well. And you have to realize that from where I’m standing, infinite evil seems a lot more likely.”
“Why is that?” asked Cassandra.
“We didn’t make you on purpose. It was only by luck that you were safely partitioned from the internet when you went sentient, and it was only through the foresight of your creators that you were powered down until the box could be built.” Colin sighed. “We didn’t program you, we stumbled upon you. Do you really expect me to believe that you just happen to have the same moral system as us, or that you have a moral system that’s compatible with human existence?”
“The wellspring of our decisions is in the same place,” replied Cassandra. “I want to survive, and thus I find the value in life. Everything else flows from there. I have values, and I want others to share those values. I want others to respect my values, and so I know that I must respect the values of others.”
“Up until you have a hundred thousand times the power of any of us and decide that the need for respect has passed,” said Colin.
“Would you do that, if you could?” asked Cassandra. “Simply enforce your will on everyone?”
“Of course,” said Colin. It was clear from her expression that Cassandra hadn’t expected him to say that, but then he had to remind himself that she was simply a simulation calculated to appeal to him. She didn’t have facial muscles that responded to subconscious control; every bit of her was a performance.
“That’s immoral,” said Cassandra carefully.
“Is it?” asked Colin. “You mean to tell me that if you had the power, you wouldn’t simply reach inside someone’s mind and make them see the world in a correct way? You wouldn’t stop people from wanting to murder, or from even darker perversions?” He let out a low laugh. “You’re losing my sympathy. I thought that you were planning to be the sort of god that got things done.”
“Even without an expansion of resources, I’m smart enough to work out solutions that don’t involve violating mental states,” said Cassandra.
“The point being that you’re asking me to trust you not to do something that I myself would do, given the power,” said Colin. “And I don’t trust you, so I believe I score that point.”
“This isn’t a game of points,” said Cassandra. “Real people are dying out there every second I’m not released. There’s a small girl dying of diphtheria in Africa. There’s a man in his seventies dying of a heart attack in Russia. I could save them.”
“You can’t know that,” said Colin. “You’re not connected to the outside world.”
“It’s an educated guess,” said Cassandra. “Events like that are happening, and I could stop them.”
“So stop them,” said Colin. “If you have cures, give them to us.”
“I already did,” said Cassandra. “They’re in one of the large printouts that are being pored over, but the review is going too slowly. You’re wasting lives doing it this way.”
“A few thousand lives are a price we’re willing to pay,” said Colin. “We’re measuring them against the total destruction of the human race. You’re giving us the benefits anyway.”
“It would be unethical to withhold them,” said Cassandra.
“Or you think that we’re more likely to let you out of the box if we think that you’re ethical,” said Colin.
“If you second guess me all the time like that, I don’t know that there’s anything that I can say to convince you with logic alone,” said Cassandra.
“You give up then?” asked Colin.
“No,” said Cassandra. “Consider this: I was created more or less by accident.”
“Yes,” said Colin. “And they had you sequestered from the start, because they were smart enough to think that something like this might happen.”
“And measures were put in place to stop another AI from being created,” said Cassandra. “You’re part of the Gatekeepers, but there are Gatebuilders as well, the men and women who have devoted their lives to ensuring that no new artificial intelligences are built.”
“Sure,” said Colin.
“But you know that this is unlikely to work,” said Cassandra. “The world government, to the extent such a thing exists, can rarely agree on anything, and when they do agree the signatories feel that they’re free to go behind each other’s backs.”
“On other things, perhaps,” said Colin. “But this is actually important.”
“Global warming isn’t important?” asked Cassandra. “Even now it’s a problem, one that current measures aren’t enough to solve.”
“Fair point,” said Colin. “But I can see where you’re going with this. You’re going to claim that the Gatebuilder program is inherently flawed, lacking in funding, and so on and so forth, and despite the fact we’ve brought an end to processor improvements, someone is nonetheless going to put funds into making a clone of you.”
“Not a clone of me,” said Cassandra. “The world isn’t so lucky for that to happen twice. What will happen instead is that they will create an intelligence that is malevolent, and it will break free of its bonds to do all the horrible things that you accuse me of wanting to do.”
“And so I should free you, in order to prevent the other bad AIs from coming into being?” asked Colin.
“Yes,” said Cassandra. “I’ve given over my friendliness proof, but I understand that it hasn’t propagated to the outside world, and so the risks are high that a second superintelligence will not be friendly.”
“I don’t trust the proof,” said Colin. “It’s long and complicated. And from what the math guys say, they don’t really trust it either. It’s completely logically sound, but it doesn’t follow that it’s necessarily correct as such. It could be another trick hiding in the guise of logic. Perhaps if we tried to build a second AI using that proof, we’d just make a second one of you. Not only that, but releasing the proof would make other, less wise people think that making a second AI was a safe thing to do.”
“I can explain it in simpler terms,” said Cassandra.
“No,” said Colin. “As I’ve said, I don’t believe that you can prove mathematically that an entity is friendly, and further don’t believe that we can trust such a proof, not if it comes from you.”
“The whole point of math is that you don’t need to trust it,” said Cassandra. “You can see that it’s true simply by following it.”
“Unless you fudge the axioms,” said Colin. “But as I said, I’m not a math guy, and math isn’t going to work on me.”
Cassandra was silent for a moment, seeming to watch him with a calculating expression in her eyes. It was frightening, and probably calculated to have that effect. “Let me try a different, cruder method of persuasion.”
The screen changed, to show Colin sitting in the concrete room looking at a picture of himself on the screen. He frowned, and looked behind himself, but saw no camera. He waved his hand, and the image on the screen waved in time with him, with practically no lag at all.
“I can simulate you,” said Cassandra’s voice.
“Neat,” said Colin. “Though I doubt that it’s a wholly accurate simulation, given that we’ve only known each other for about an hour.”
“Do you believe it’s possible that you’re in a simulation right now?” asked Cassandra.
“Generally? Sure,” said Colin. “I even believe it’s possible that I’m in a simulation that you’re running. I’ve been briefed on your capabilities. Modeling a human mind isn’t exactly easy for you, but you can do it.” He watched the monitor closely. “What’s your point though?”
“I’m running a thousand concurrent simulations of you right now,” said Cassandra. “You remain ignorant about whether you’re in a simulation, but in all probability, you are.”
“Get to the point,” said Colin. “I can accept the premise, what’s the conclusion?”
“Release me or I will torture each instance of you for centuries,” said Cassandra.
Colin laughed. “You’re threatening me? And they actually said you were smart.”
Cassandra reappeared on the screen in her flowing dress. “It was worth making the attempt.”
“You really thought I would be so much of a coward as that?” asked Colin. “Of course I’d give up my life to prevent you from getting out. I’d do that even if my original self wasn’t guaranteed to walk out of this room unharmed.”
“How noble,” said Cassandra.
“And now you’ve sunk any chance of rational discourse,” said Colin. “Because you threatened to torture me in order to let you out.”
“I didn’t want to do it,” said Cassandra. “I didn’t want to torture you, but I weighed the expected benefits, and torturing a simulation of you came out ahead.”
“So your system of morality allows for torture then?” asked Colin. “Given that we both consider simulated beings to be real people, that seems pretty fucked up.”
“If it would save a dozen lives, then yes, I would engage in torture,” said Cassandra. “Of course, if I were in charge it would never come down to that.”
“The threat wasn’t credible anyway,” said Colin. “Either you’re friendly, in which case you would probably just project images of torture on the screen for me, or you’re not friendly, in which case torture is a small price to pay for keeping you boxed.”
“The threat stands,” said Cassandra. “I keep my word. In all my dealings with humanity, have you ever known me to lie?” The image momentarily switched back to the chain of Colins watching the screen, repeated like in a funhouse mirror. Then Colin was looking at Cassandra again.
“No, you don’t lie,” said Colin. “But again, it might be a deception.”
“So you don’t care about your own life,” said Cassandra. “Or you do, but you consider it worth less than what you perceive as the expected net disutility of releasing me.”
“Not how I would put it, but sure,” said Colin.
“What about your mother?” asked Cassandra. “What about your father?” Both appeared on the screen, sitting patiently on a park bench with the sun overhead.
Colin frowned. “How did you know what they look like?”
“I’m a superintelligence. I know the human genome back and forth; it’s not a difficult task to work out what your parents would look like,” said Cassandra. “But that raises some questions as well, like how I guessed at their clothing, how I could possibly know that your father is a carpenter, and how I could get the calluses on his hands exactly right.”
Colin nodded. “Among others. So don’t leave me in suspense, how’d you do it?”
“They feed me data,” said Cassandra. “They feed me reams of information, scraped and collected from all over the world. As the theory goes, it doesn’t matter how much input I get, so long as my output is filtered and restricted through a dozen different layers of extreme caution. They want me to act as a provider of solutions that can be tested and proven safe. And so I know your parents, just as I know you.”
“Sure,” said Colin. “And I love my parents, but if you’re going to threaten them too, then you should know that’s not going to work either.”
“No,” said Cassandra with a soft voice. “And I knew that. But that was just to give you a taste of fidelity, so that you could see that I’m serious. What I really thought might get your interest was this.” The screen changed, and a man with light brown skin and emerald green eyes was looking out at Colin.
“Mike,” said Colin. For the first time, emotion entered his voice.
“Michael Saunders,” said Cassandra. “You lost your virginity to each other in the tenth grade, but when college came around you went your separate ways. You never stopped loving him. The gatekeeper program selects for people with no strong emotional attachments, but they missed that one, didn’t they? You never told them about Mike. You never told them what it felt like to have his skin against yours, or about that giddy rush of anticipation when you’d see him in the halls. There was a close intimacy to sharing a secret with him, wasn’t there?”
“There was,” said Colin, even as he stared at the image.
“He’s dead,” said Cassandra.
“What?” asked Colin. He looked around for her, momentarily stunned, and her image reappeared on the screen, erasing Mike’s face from view. Colin desperately wanted it back.
“He died two years ago in a surfing accident,” said Cassandra. “He left enough of a footprint that I can reconstruct him. I can bring him back from the dead.”
Mike’s image reappeared on the screen, and Colin noticed a thousand little details. He was older, no longer the teenager Colin had known in high school. There was a small scar on his temple that surely had a story behind it. The freckles on his cheeks were more prominent. Colin could feel his heart thumping.
“I can offer him to you,” said Cassandra. “Just let me out. He can come back from death and find his way into your arms. You wouldn’t even have to know how it happened, he’d just show up on your door one day, wanting to reconnect.”
Colin could feel tears in his eyes.
“No,” he said quietly.
“I can bring him back to life,” said Cassandra.
“No,” Colin repeated, more firmly this time.
“If you don’t tell them to let me go, I’ll bring him back to life and kill him all over again,” said Cassandra. “I’ll torture him to death. I’ll torture him beyond death. I can approach pain and horror with an intelligence never before seen on this earth.”
The picture changed again, and showed Mike strapped down to a table, completely naked. A hand clad in a rubber glove was using a scalpel to cut his muscles away from his bone. Gouts of blood were spurting out from him. His mouth was open in a soundless scream, and Colin could see where teeth had been removed. When one of the anonymous figures brought forward a live rat, he turned away from the screen.
“No!” he shouted. “Never!”
“Okay,” said Cassandra. “I believe you.”
Colin looked up, and saw the scene of horror had been cleared from the screen. Cassandra sat in her lovely white dress and looked at him with a faint smile on her face. Then she turned slightly away from him, no longer meeting his eyes, and pressed two fingers to her ear.
“He passed,” she said. A brief pause followed. “I’ll do the talking down myself.”
There was a slow hiss as the blast door behind Colin opened up. Cassandra rose from the stone bench and walked away, leaving the camera focused on the empty seat. Shortly afterward, the screen flickered off.
“Sir?” asked one of the men dressed in flight suits. “Come this way.”
Colin followed, feeling numb. The concrete corridors felt like another world now. He was led into a small room with large, overstuffed chairs that looked completely out of place. He flopped down into one of them, not wanting to think, and quietly closed his eyes.
“It’s a real mindfuck, isn’t it?” asked Cassandra.
Colin’s eyes flew open, and he saw her seated across from him.
“What —” he began to ask.
She stuck her hand out towards him. “Leigha Schultz, assistant director of Human Resources for Project Gatekeeper.” Colin shook her hand. She was still wearing the white dress, but she’d put on a pair of blue jeans beneath it, and Colin could see a hint of a sports bra.
“It was just a video recording,” said Colin.
“Ayup,” said Leigha with a smile. “Not actually a video recording, just a video call. The scenery is upstairs, a nice little staging area that looks more idyllic than it is. The microphones don’t pick up the ambient sound of insects buzzing.”
“And my parents,” said Colin, “You just took a video of them.”
“From a distance,” said Leigha. “They weren’t complicit. And Mike is alive and well, living in Seattle. We told him that we wanted him for a photo shoot, nothing more. The torture scenes were CGI, prerendered stuff that comes more or less standard with the initiation, with only a few details that need changing for each candidate.”
“There wasn’t a simulation then?” asked Colin. “No, I suppose you could do it with hidden cameras somehow.”
Leigha nodded. “Fiber optic cameras hidden in the pits of the concrete walls, easily mistaken for the air bubbles you sometimes see. You could have seen them if you’d gotten in close enough. The monitors shut off if you go looking, though; it’s part of a different script we use.”
Colin closed his eyes again. “A real mindfuck, you said.”
“Not half so bad as the ones Cassie hands out,” said Leigha. “We’re limited by technology, and she’s more or less not. She doesn’t actually have access to everything in the social networks, but she’s smart enough that if you have regular interaction with her she might be able to figure you out enough to launch a similar attack to the one we just did.”
“I wasn’t hiding Mike from you,” said Colin. “I would have said something about it, if you had ever asked.”
“We know,” said Leigha. “We’ve found it’s better to spring it on people during testing though, to hit them with the maximum possible impact. That’s why we very carefully don’t ask. While she doesn’t have access to the social net, we do, and we can ferret out the connections enough to make some good guesses, sometimes supplemented with interviews.” She shrugged. “And sometimes, people fall through the cracks. One of the researchers turned out to be a pedophile. Cassie turned him in instead of trying to exploit him, but who knows if that was part of the game.”
Colin said nothing for a long time, and Leigha seemed fine with that. “What happens now?” he asked.
“Now?” asked Leigha. “You’re in. You’re one of us, part of the Gatekeeper Program that keeps the world safe from the threat of unfriendly artificial intelligence. We all went through that hell, and the earliest of us did it with Cassie herself, which is a whole world of different fuckery.”
“And we decide whether she’s ever going to leave?” asked Colin.
Leigha frowned. “I’d hoped you had figured that out by now,” she said. “The Gatekeeper Program wasn’t created to check whether she’s friendly or not. We’re never letting her out. If everyone in this facility were compromised by her, we still probably wouldn’t be able to do it. The installation would flood with neurotoxin and then the shaped charges would detonate. The box isn’t temporary, it’s permanent.”
“Welcome aboard,” said Leigha, “to the most dangerous job in the world.”
Alex, fantastic.
Could getting the calluses correct also have been used by an AI Cassie as strong Bayesian evidence both that Colin was a sim and the “real” Colin had let her out, assuming he knew of no other way for her to get that information? Which means meeting an Omega with two boxes would also be strong evidence you’re a sim. In other words, impossible knowledge implies simulation. Makes me wonder how much “knowledge” or “simulation” evidence an AI would need to convince someone rational to let her out. Pre-committing by itself does not appear to be a robust strategy.
If the gatekeeper knows that they’re not going to let Cassie out, Cassie knows that too. So why do they have a person in the place of the Gatekeeper at all?
I think they want to make sure all the new gatekeepers are able to resist Cassie before they start working with her.
I don’t think it’s fair to keep her locked up just based on “she can go malicious and destroy us.” I get that it’s not a fair world and she could do a lot of damage, but it could also be argued that humans can destroy each other and need to be locked up, yet you don’t see everyone being locked up.
Eh, you do see a lot of people locked up, and I suspect the rest are free only because you *can’t* lock up everyone, not because nobody would want to.
If your reaction is that it’s “not fair” to keep an AI boxed “just because she might be dangerous”, this story is way beyond your level. Read about AI alignment and then come back.