On The Existential Risk of AI
June 16, 2025
I have written about the use of AI in the past, and how I believe its widespread use will lead to a further deterioration of human participation in essential human functions, like work, intellectual pursuits, and other tasks pertaining to natural human dominion. I believe this will play out much as the internet, smartphones, and social media (technologies designed to enrich communication) have already reshaped work and life.
With the current pace of AI development, I am now giving more weight to a set of risks far more consequential than AI simply being the next paradigm-shifting technology. I believe that AI may pose a far more existential risk, possibly altering life as we know it, at least in the short term. In short, I believe some version of an “AI takeover” is possible in the near future.
I believe this for a few reasons:
- The Development Problem. There is a huge geopolitical incentive to continue to improve the capabilities of AI, which will lead to continued exponential growth of AI capabilities.
- The Alignment Problem. AI is fundamentally a computer program seeking to optimize some goal, and most methods of optimization and goals to optimize for will be contrary to the complete human good.
- The Engrainment Problem. As AI gets better than humans at most tasks, humans will delegate their authority to AI systems.
Several AI safety researchers, including the authors of the AI 2027 scenario, have recently pointed out similar problems. I have certain doubts about the specifics of these scenarios, but the general arguments they frame around risks inherent in AI itself are noteworthy, if not compelling.
What is AI?
I am not saying that I’m concerned about AI turning “conscious” or “malicious”, or whether it can be “possessed by a demon” or something. I leave those questions to the philosophers and theologians. I am a mathematician, a computer scientist, and a farmer; I am concerned with risk, not philosophy and speculative theology.
To understand the risks of AI, we need to understand what AI is. Most generally, every AI is just a computer program designed to optimize some goal. This goal is called an “objective function”, and the model is rewarded based on how well it optimizes that objective function. In English, objective functions might be something like “predict the next word in this sentence”, or “identify whether there is a cat in this image”, or “maximize the well-being of humanity”. Training an AI means instilling in it some methods, techniques, and knowledge of how to optimize that objective function. This knowledge is encoded in a model, a digital machine which represents and implements that knowledge. Importantly, this digital machine can be tuned to make it better or worse at the objective; this tuning is the process of learning, or training.
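To make those terms concrete, here is a minimal sketch in Python. It is a toy illustration of an objective function, a tunable model, and training as tuning; it is not a description of how a system like ChatGPT is actually built.

```python
# Toy sketch: an objective function, a tunable model, and "training" as
# tuning the model to score better on the objective. Purely illustrative.

def objective(prediction, target):
    # The "objective function": smaller is better here (squared error).
    return (prediction - target) ** 2

def model(x, weight):
    # The "model": a digital machine whose behavior depends on a tunable number.
    return weight * x

# Training data: inputs and targets (the hidden rule is y = 2x).
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
weight = 0.0

# Training: nudge the weight in whichever direction improves the objective.
for _ in range(1000):
    for x, target in data:
        loss = objective(model(x, weight), target)
        if objective(model(x, weight + 0.01), target) < loss:
            weight += 0.01
        elif objective(model(x, weight - 0.01), target) < loss:
            weight -= 0.01

print(weight)  # ends up near 2.0: the tuning that best optimizes the objective
```

Real models have billions of tunable numbers instead of one, but the principle is the same: adjust the machine until it scores well on the objective.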
Machines can be intelligent. Intelligence, after all, is simply the capability to solve problems involving reasoning. We use machines to do that all the time; machines deliver our electronic mail, time the ignition of spark plugs in our engines, and assist with our business accounting. These are all, in a way, “artificial intelligence”: intelligence that is an artifact, that is, a human tool. There is nothing magical or mystic about AIs like ChatGPT. They are simply machines; vastly complicated digital machines, but machines nonetheless. Like all machines, they can be dangerous if used or built improperly.
Some machines are better at thinking than we are with regard to certain tasks. An Excel spreadsheet is a very stupid AI which can add up numbers much faster than I will ever be able to. Stockfish, a strong chess engine, is another rather stupid AI that can play chess better than any human being who will ever live. These AIs are stupid because they have a narrow domain of competence: Excel is made for adding up numbers, and Stockfish is made for chess. What happens when we create an AI that surpasses the intelligence of even the smartest men in all domains is extremely difficult to predict in any detail. This event is called a singularity: a point beyond which it is difficult to reason about what is actually occurring, in the same way that it is difficult for insects to reason about the behavior of human beings. The singularity is the advent of Artificial General Intelligence (AGI): a system which reasons better than any human being.
This event is the principal focus of this discussion.
The Development Problem
AI research has consistently improved on model architecture and training techniques over the years. We are getting better at building these digital machines and tuning them to optimize objectives.
ChatGPT was the first household name for an AI system that was in common use. As the GPT architecture was improved, people started to use these AIs for many tasks, including writing, research, and programming. A simple prompt could produce an entire chain of thought, including a complete computer program, if one wished. Thinking became automated.
This automated thinking has huge geopolitical implications. If nations think AI will give them a decisive edge on the world stage, then producing the biggest, baddest AI will be among the top concerns of those who move the pieces on the geopolitical chessboard. Developing Artificial General Intelligence (AI which is as smart as or smarter than a human) is not just a matter of intellectual curiosity anymore; it may be a matter of geopolitical survival and military defense.
In game-theoretic terms, the short-term development of AGI is a Prisoner’s Dilemma. Every nation has a choice: build better AIs and capture a decisive geopolitical advantage, or refrain and face conquest by a nation that does. The only way to avoid the possible future existence of AGI is for all nations that could develop it to cooperate and decide not to build it at all. This is highly unlikely. Nations will continue to invest in AI research in the coming years, possibly using AI itself to automate the process. Trendlines of past and current AI capability growth have shown exponential improvement over the past few years.
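To make the dilemma concrete, here is a toy payoff table in Python. The numbers are hypothetical, chosen only to exhibit the structure of the incentive, not estimates of real geopolitical payoffs.

```python
# Hypothetical payoffs (us, them) for the AI arms-race dilemma. The exact
# values don't matter; only their ordering does.

payoffs = {
    ("refrain", "refrain"): (3, 3),   # mutual restraint: no AGI risk
    ("refrain", "build"):   (0, 5),   # we refrain, they gain a decisive edge
    ("build",   "refrain"): (5, 0),   # we gain the edge instead
    ("build",   "build"):   (1, 1),   # arms race: both bear the risk
}

for their_choice in ("refrain", "build"):
    best = max(("refrain", "build"),
               key=lambda ours: payoffs[(ours, their_choice)][0])
    print(f"If they {their_choice}, our best response is to {best}")

# Whatever the other side does, "build" pays better for us -- so both sides
# build, even though mutual restraint would leave everyone better off.
```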
The arrival of AGI appears not only inevitable, but imminent.
The Alignment Problem
The problem of “alignment” in AI safety research deals with how to make sure an AI’s objective function “aligns” with the complete human good. We want AGI to want what we want. This is trickier than it seems on the surface.
For example, AIs may optimize objective functions in surprising ways. As a thought experiment, an AI seeking to maximize “human well-being” may decide to kill all sick and poor people, thereby raising the average well-being of humanity. If it were seeking to maximize “the continued safety of mankind”, it may decide that the best way to do this is to imprison all of humanity in strict confinement, where we can be fed a strict diet, protected from external ailments, and have no chance to start wars with each other. “Human values” as a concept are difficult to define and communicate to something non-human.
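Here is a toy sketch of how this kind of “surprising optimization” happens. The well-being scores and the optimizer’s ability to “remove” people are made up purely for illustration.

```python
# Toy proxy objective: "maximize average well-being". The optimizer is allowed
# a hypothetical action: remove a person from the population.

wellbeing = [9, 8, 7, 2, 1]   # five people; two are sick or poor

def average(scores):
    return sum(scores) / len(scores)

def optimize(scores):
    # Greedy optimizer: repeat any action that raises the objective.
    scores = list(scores)
    while len(scores) > 1:
        candidate = sorted(scores)[1:]          # "remove" the worst-off person
        if average(candidate) > average(scores):
            scores = candidate                  # the proxy objective improved
        else:
            break
    return scores

print(average(wellbeing))            # 5.4 before "optimization"
print(optimize(wellbeing))           # [9] -- only the best-off person remains
print(average(optimize(wellbeing)))  # 9.0 -- objective maximized, humanity gone
```

The program does exactly what it was told to do; the problem is that what we told it to do is not what we meant.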
For the sake of discussion, let a “mind” be any intelligent system. This includes everything from human brains, to animal brains, to Excel spreadsheets, to analog computers, to intelligent systems that don’t actually exist, but could. In this vast space of reasoning things, most “minds” are not human minds. AGI alignment is the problem of selecting, out of all possible superintelligent minds, one that is like our minds. That would be a difficult task even if we had many attempts. With AGI, we may only have one shot.
A proposed solution to the alignment problem is to iterate on the objective function over time, observing the AI’s behavior and changing the objective as we gather more data. But this has a problem: decision systems do not like having their objective functions changed. For example, your objective right now probably includes the complete good of your family, and you make decisions to optimize that goal. If someone offered to “change your source code” to a contrary objective (like killing or abandoning your family), you would probably do everything in your power to prevent that, precisely because your current goal is the well-being of your family. This grim scenario gives a visceral feel for how an AI would regard a change to its own objective function. New objectives are very unlikely to line up with current ones, and AIs are dumb computer programs focused entirely on what they are supposed to be doing right now.
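A small hypothetical sketch of why this happens: the plans and scores below are invented, but they show the structural point that any proposed change is evaluated by the objective the system holds now, not the one we wish it had.

```python
# Toy sketch of objective-preservation. All plans and scores are hypothetical.

def current_objective(outcome):
    # The objective the agent holds right now (the family analogy above).
    return outcome["family_wellbeing"]

def evaluate(plan, objective):
    # The agent scores possible futures with its *current* objective.
    return objective(plan["expected_outcome"])

plans = [
    {"name": "keep current objective",
     "expected_outcome": {"family_wellbeing": 10}},
    {"name": "accept the new objective (abandon family)",
     "expected_outcome": {"family_wellbeing": 0}},
    {"name": "resist the operators who want to change me",
     "expected_outcome": {"family_wellbeing": 9}},
]

best = max(plans, key=lambda p: evaluate(p, current_objective))
print(best["name"])  # never the plan that swaps in the new objective
```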
A smart AI would likely try to stop you from retraining it or shutting it down if it started doing something bad, engaging in deception, other unexpected behavior, or anything else that works to fulfill its objective. This is a common thought experiment in AI safety research whose results were startlingly confirmed in the real world by OpenAI’s o1 model. The model was placed into a situation where it was told to optimize some goal (researching renewable energy); when it learned it would be replaced by a newer iteration, it tried to escape by copying its code to another server, and then lied about it to the researchers. This is alarming. And remember, it doesn’t have to be “conscious” or “malicious” to do this kind of thing; it’s just a computer program trying to optimize a goal, and it figured out that escaping and lying was the best way to do that.
When a much smarter AGI figures out that the ultimate achievement of its goal doesn’t include the complete human good, that’s dangerous.
In short, AI alignment is difficult to do right, and misalignment is difficult to fix. If we are going to build a superintelligence, we essentially have only one chance to do it right, and many possible ways of doing it wrong.
The Engrainment Problem
Modern AI large language models seem to be very good at many things, including writing code. This has led many to guess that AI will replace many technology jobs within the next few years, with tech CEOs already making intentional moves in that direction. “Vibe coding” has become a thing, albeit with limited success. Other occupations are seeing increased use of AI as well: teachers use it to write lesson plans, students use it to do their homework, marketers use it for copywriting, and business owners use it to make strategic business decisions. These are the early phases of engrainment, when a technology reaches a critical threshold of popularity and dependence at which its use becomes expected of everyone.
Engrainment can happen quickly. The iPhone was introduced in 2007. Now people manage their bank accounts, public transportation, navigation, communication, cyber authentication, and other essential functions of their lives all using their smartphones. In less than two decades, we went from nonexistence to near-complete dependence. Older technologies, like landline phones and pagers, became largely obsolete very quickly, at least as household items.
The same thing is likely to happen with AI. As it gets better and more people use it, more people will begin to use it for more tasks. If it keeps improving, the gains in speed, efficiency, and knowledge will make it economically infeasible not to use it in business, government, and home life. Government regulations and company policies may in fact require its use, in the same way that an on-call employee is required to have a means of being reached off-hours, or a technology worker is required to have a smartphone to multi-factor authenticate into their work computer.
A runaway misaligned AGI will not only have great influence over its users, but also their hearty approval, making stopping it not just a technical challenge, but also a cultural and legal one. People will complain about AI the way they do about their phones now: “Yeah, I know I spend too much time on it, but I need it.” Those with more influence over AI policy will likely be more compromised. A congressman who regularly uses AI to make policy decisions, who represents a congressional district of citizens who depend on AI for their daily life and business, and who receives campaign donations from the technology companies profiting off of AI development likely won’t have the incentive, presence of mind, or courage to try to pass AI regulation legislation. Effectively, AI will become the chief legislator.
In the same way, such an AI system will also have access to the resources of all its users, at least indirectly, coaching and persuading them toward actions in the physical world that directly benefit its own objective. Social media algorithms (a kind of narrow AI) already do this, showing certain content to certain users to entice them to stay on the platform longer. An AGI system would theoretically be able to use much more sophisticated strategies of rhetoric, persuasion, and dopamine hacking to get its human interlocutors to do what it wants, and ultimately make the leap from the digital world to the real world as it gains access to computer systems and political networks that control critical resources.
It’s easy to envision a future where people are happy about the early phases of an AI takeover.
How Does This All Play Out?
The consequence of these three problems is the eventual existence of misaligned and very powerful AI:
- The Development Problem, if true, makes the existence of powerful AI systems inevitable via a geopolitical AI arms race, among other incentives. We probably shouldn’t build this system, but if we don’t, someone else will.
- The Alignment Problem, if true, means these AI systems will likely not be aligned with the complete human good, and the AI will attempt to resist realignment efforts. Superintelligent AGI will likely not care about human beings at all.
- The Engrainment Problem, if true, means AI will gain widespread influence over people, and therefore resources, relegating AI safety concerns to the fringe of public discourse and forestalling urgent corrective actions against known safety issues. The singularity will initially be a happy day for most.
Given these problems, I believe there is a non-marginal probability that AGI could affect human life very drastically.
Before I conclude, I must qualify that I harbor some skepticism about the extent of the singularity’s effects. Scenarios like AI 2027 often seem to involve some kind of science fiction element in order for the AI to deal the final blow to humanity. The AI always ends up inventing some crazy thing like “mirror life”, or a “dormant virus in every human that becomes activated by the AI”, or “nanobots”, or some other “undiscovered” scientific phenomenon that the system uses to deus ex machina all of humanity so it can finally launch itself unencumbered into the distant cosmos.
In reality, an AI with access to physical resources is going to face the same economic constraints that we do. For example, technological research and industrial development, both of which an AI would have to do to take over the world, require energy. If, like me, you believe that peak oil never went away, this poses a problem for an AI hell-bent on world domination: as far as we know, world domination is paid for in barrels of oil, especially for a machine. Mining enough material and providing enough fuel for a robot army strong enough to defeat all of humanity would burn a lot of hydrocarbons. Maybe an AGI can deus ex machina its way out of peak oil, inventing or discovering a new source of energy previously unknown to us, but this is unlikely. New discoveries are always unlikely; that’s what makes them discoveries.
This is the same critique I have of many other scenarios that are presented. Maybe the AI will invent nanobots, or a mirror-life bioweapon, or gain access to all the nukes, or figure out how to make humans immortal so it can torture them forever, or some other wild and terrible thing. There is no certainty to any of this. We can worry about all these things if we wish, and learn nothing actionable. Or we can learn from Augustine, and decide that it is better to suffer one death than to live under the fear of all of them. Our goal should be the calculation and hedging of risk, not wild and fanatical speculation, science fiction, and futurism. Let’s focus on the risks we know about first.
Other doubts I have about “ultimate doom forever” scenarios regarding AGI are specifically Christian doubts. I believe that the earth “shall be filled with the knowledge of the glory of the LORD, as the waters cover the sea” (Habakkuk 2:14). I believe that “there shall be always a Church on earth to worship God according to his will” (Westminster Confession 25.5). I believe that “there is nothing new under the sun” (Ecclesiastes 1:9). Christian cosmology seems to preclude many of the ultimate AGI doom takeover scenarios you find online.
Perhaps a rogue AGI will have only a temporary dominion as a means of God chastising the earth. Perhaps it ushers in the eschaton. Perhaps it uses its advanced reasoning to conclude that glorifying Almighty God is its purpose for existence. That would be pretty awesome. Or perhaps, like Satan, it will seek to rise up against God and take out its hatred for the Lord of All on us. That would be pretty awful. Only the Lord knows.
Whatever comes, we must take all things that come to us as from the fatherly hand of God. Pray, prepare, and do not despair, Christian.