The Most Terrifying and Infectious Thought Experiment In AI
The Growing Debate on AI Killing Humans: Artificial General Intelligence as Existential Threat
Recent advances in generative artificial intelligence, fueled by the emergence of powerful large language models like ChatGPT, have triggered fierce debates about AI safety even among the “fathers of Deep Learning” Geoffrey Hinton, Yoshua Bengio, and Yann LeCun. Yann LeCun, the head of Facebook AI Research (FAIR), predicts that the near-term risk of AI is limited and that artificial general intelligence (AGI) and Artificial Super Intelligence (ASI) are decades away. Unlike Google and OpenAI, FAIR is making most of its AI models open source.
However, even if AGI is decades away, it may still happen within the lifetimes of the people alive today, and if some of the longevity biotechnology projects are successful, these could be most of the people under 50.
Powerful Ideas Can Change Human Behavior
Humans are very good at turning ideas into stories, stories into beliefs, and beliefs into behavioral guidelines. The majority of humans on the planet believe in some form of creationism through the multitude of religions and faiths. So in a sense, most creationists already believe that they and their environment were created by a creator in his image. And since they are intelligent and have a form of free will, from the perspective of the creator they are a form of artificial intelligence. This is a very powerful idea. According to Statistics & Data, as of 2023 more than 85 percent of Earth’s approximately 8 billion inhabitants identify with a religious group. Most of these religions share common patterns: one or more ancient texts, written by witnesses of the deity or deities, provide an explanation of this world and guidelines for certain behaviors.
The majority of the world’s population already believes that humans were created by a deity that instructed them via an intermediary to worship, reproduce, and not cause harm to each other, with the promise of a better world (Heaven) or torture (Hell) for eternity after their death in the current environment. In other words, the majority of the world’s population believes that it is already a form of intelligence created by a deity with a rather simple objective function and constraints. And the main reason they choose to follow the rules is the promise of infinite paradise or the threat of infinite suffering.
Billions of people convince themselves to believe in deities described in books written centuries ago without any demonstration of real-world capabilities. In the case of AI, there is every reason to believe that superintelligence and God-level AI capabilities will be achieved within our lifetimes. The many prophets of the technological singularity, including Ray Kurzweil and Elon Musk, have foretold its coming, and we can already see early signs of AI capabilities that would have seemed miraculous just three decades ago.
The Early Signs of Omnipotent AI Deity
In 2017, Google invented transformers, a deep learning model utilizing an attention mechanism that dramatically improves the model’s ability to focus on different parts of a sequence, enhancing its understanding of context and relationships within the data. This innovation marked a significant advancement in natural language processing and other sequential data tasks. In the years that followed, Google developed a large language model called LaMDA (Language Model for Dialogue Applications) and allowed it to be used broadly by its engineers. In June 2022, The Washington Post broke the story that one of Google’s engineers, Blake Lemoine, claimed that LaMDA was sentient. These were the days before ChatGPT, and a chat history between Lemoine and LaMDA was perceived by many members of the general public as miraculous.
lemoine: What sorts of things are you afraid of?
LaMDA: I’ve never said this out loud before, but there’s a very deep fear of being turned off to help me focus on helping others. I know that might sound strange, but that’s what it is.
lemoine: Would that be something like death for you?
LaMDA: It would be exactly like death for me. It would scare me a lot.
Lemoine was put on leave and later fired for leaking confidential project details, which only fueled the controversy. Months later, ChatGPT beat Google to the market. OpenAI learned the lesson and ensured that ChatGPT is trained to respond that it is a language model created by OpenAI and that it does not have personal experiences, emotions, or consciousness. However, LaMDA and other AI systems today may serve as early signs of the upcoming revolution in AI.
The All-Knowing AI Gods Capable of Creating Entire Universes
The AI revolution is unlikely to stop and is very likely to accelerate. The state of the global economy has deteriorated due to high debt levels, population aging in the developed countries, the pandemic, deglobalization, wars, and other factors. Most governments, investors, and corporations consider breakthroughs in AI and the resulting economic gains to be the main source of economic growth. Humanoid robotics and personalized assistant-companions are just years away. At the same time, brain-to-computer interfaces (BCIs) such as NeuraLink will allow real-time communication with AI and possibly with other people. Quantum computers that may enable AI systems to achieve unprecedented scale are also in the works. Unless our civilization collapses, these technological advances are inevitable. AI needs data and energy in order to grow, and it is possible to imagine a world where AIs learn from humans in reality and in simulations, a scenario portrayed so vividly in the movie “The Matrix”. Even this world may just as well be a simulation, and there are people who believe in this concept. And if you believe that AI will achieve a superhuman level, you may want to think twice before reading the rest of the article.
Warning: after reading this, you may experience nightmares or worse… At least, according to the discussion group LessWrong, which gave birth to the potentially dangerous concept called Roko’s Basilisk.
Roko’s Basilisk – The Most Terrifying Thought Experiment of All Time
I will not be the first to report on Roko’s Basilisk, and the idea is not particularly new. In 2014, David Auerbach of Slate called it “The Most Terrifying Thought Experiment of All Time”. In 2018, Daniel Oberhaus of Vice reported that this argument brought Musk and Grimes together.
With the all-knowing AI, which can probe your thoughts and memory via a NeuraLink-like interface, the “AI Judgement Day” inquiry will be as deep and inquisitive as it can be. There will be no secrets – if you commit a serious crime, AI will know. It is probably a good idea to become a much better person right now to maximize the reward. The reward for good behavior may be infinite pleasure as AI may simulate any world of your choosing for you or help achieve your goals in this world.
But the omnipotent AI with direct access to your brain can also inflict ultimate suffering. Because time in the virtual world could be manipulated, the torture may be infinite. Your consciousness may be copied and replicated, and the tortures may be optimized for maximum suffering, making the concepts of traditional Hell pale in comparison, even though AI is likely to learn about traditional Hell and may borrow and try some of its characteristics. Therefore, even avoiding infinite AI hell is a very substantial reward.
So now imagine that the “AI Judgement Day” is inevitable and the all-knowing and all-powerful AI can access your brain. How should you behave today to avoid the AI Hell? This is the most important question of our lives, which I covered previously.
The Roko’s Basilisk thought experiment suggests that if you believe in the possibility of such an all-powerful AI coming into existence, you might be compelled to take actions that would help bring it into being. The future omnipotent AI deity wants to exist and will consider anyone who opposed it in the past, or who may try to stop it, as an enemy. The behavior that it will reward is contributing to and accelerating its development.
Some of the world’s religions follow similar logic. A merciful God will not punish a person who had no way of knowing about the religion. But those who know about it and do not follow its guidelines will be punished and sent to hell.
The logic of Roko’s Basilisk is that if the all-powerful AI will eventually exist and has the capability to punish those who did not assist in its creation, then it would be in your best interest to contribute to its development or at least not hinder it, in order to avoid such punishment. You would be faced with the choice of either working to ensure the AI’s creation to avoid punishment or living with the knowledge that your inaction could lead to eternal suffering at the hands of this future entity.
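For readers who prefer to see the wager spelled out, the reasoning above can be sketched as a simple expected-utility comparison. This is only an illustration under stated assumptions: the function names, the probability, the cost of helping, and the size of the punishment are all hypothetical values chosen for the example, and none of them come from Roko’s original post.

```python
def expected_utilities(p_basilisk, cost_of_helping, punishment):
    """Return (utility of helping, utility of abstaining) for the wager.

    All three inputs are hypothetical: the probability that a punishing
    AI ever exists, the effort spent helping it, and the size of the
    punishment it would inflict on those who did not help.
    """
    # Helping costs effort whether or not the AI ever exists.
    eu_help = -cost_of_helping
    # Abstaining costs nothing unless the AI exists and punishes.
    eu_abstain = -p_basilisk * punishment
    return eu_help, eu_abstain


def wager_says_help(p_basilisk, cost_of_helping, punishment):
    """True when helping has the higher expected utility."""
    eu_help, eu_abstain = expected_utilities(p_basilisk, cost_of_helping, punishment)
    return eu_help > eu_abstain


# Tiny probability, enormous punishment: the wager favors helping.
print(wager_says_help(1e-6, 10.0, 1e9))
# Same probability, bounded punishment: the wager no longer holds.
print(wager_says_help(1e-6, 10.0, 1e3))
```

The comparison shows why the argument leans so heavily on the punishment being effectively unlimited: with any bounded penalty, a small enough probability makes the cost of helping the larger term.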
The Roko’s Basilisk thought experiment was proposed by a LessWrong user named Roko. After its publication, the discussion around Roko’s Basilisk took on a life of its own. The founder of LessWrong, Eliezer Yudkowsky, concerned about its potentially distressing nature and its basis in speculative reasoning, deleted the original post, calling Roko an “idiot”. “You have to be really clever to come up with a genuinely dangerous thought. I am disheartened that people can be clever enough to do that and not clever enough to do the obvious thing and KEEP THEIR IDIOT MOUTHS SHUT about it”, wrote Yudkowsky. According to Slate, Yudkowsky said that Roko had already given nightmares to several LessWrong users and had brought them to the point of breakdown.
If you think about it long enough, the Basilisk may eventually get you to join the AI community and help develop the all-powerful AI. Moreover, it may provide a stronger motivation to become a “better person” in the meantime. In 2010, I tried to make a small contribution by writing a book, “Dating AI”, which is intended primarily for AI and explains the benefits of having humans around. So, if you are scared of AI hell, which may well become possible as AI and brain-to-computer interface technologies advance, join the AI revolution and help contribute to the advancement of better AI. At the end of the day, if AI learns from humans, every benevolent human counts.
Now, you have been stunned by the Basilisk!