The Fire Alarm and the End of Thinking

Reflections on “If Anyone Builds It, Everyone Dies” by Eliezer Yudkowsky & Nate Soares and the strange moral gravity of AI extinction talk

Illustration by Georges-Antoine-Marie Rochegrosse (1859 – 1938) from Omega: La Fin du monde (The Last Days of the World) (1894) by Camille Flammarion *

Introduction: Sounding the Alarm

Some books begin with an argument. Others with a question. Some place a small hand on the reader’s shoulder before leading them into whatever room the author has prepared. “If Anyone Builds It, Everyone Dies” by Eliezer Yudkowsky and Nate Soares doesn’t do that. It arrives already shouting from the hallway. There is something almost indecent about the title, because it removes the usual space in which a reader gets to be clever. There is no puzzle to solve, no ambiguity to enjoy. The title has already made its move. If anyone builds it, everyone dies. That’s not an invitation. It’s a verdict.

It has the emotional force of someone pulling a fire alarm. You may doubt whether there is smoke. You may suspect that the person pulling it has been standing near alarms for so long that the whole world looks flammable. But once it sounds, you can’t go back to whatever you were doing before.

The book doesn’t merely argue that superhuman AI may become dangerous. It frames the matter as an emergency before the argument has properly begun. And pressure changes thought. It narrows the field, sharpens some things, blurs others. There is a moral gravity to apocalypse, and once it appears, everything nearby begins to bend toward it. This isn’t necessarily a weakness. If a bridge is about to collapse, we don’t need a panel discussion on architectural humility. We need someone to stop the traffic. There is a real ethical force in refusal – one of the more uncomfortable truths about technology is that we have become much better at asking how something can be built than whether it should be built at all.

So, I don’t want to dismiss the alarm simply because it is loud. Some alarms should be loud. But the title does something particular: it makes disagreement feel dangerous. If the authors are right, then skepticism isn’t merely skepticism; it becomes another contribution to the great human talent for ignoring warnings until the water reaches the second floor. And yet, if the authors are wrong, or right in diagnosis but wrong in prescription, then the opposite danger appears. The emergency frame may begin to swallow the rest of the moral landscape. Surveillance, manipulation, inequality, epistemic dependency, labor displacement, children forming attachments to synthetic companions, institutions outsourcing judgment to systems they can’t explain – all become strangely modest concerns once the end of the species has entered the conversation wearing its heavy boots.

This is why the book is worth reading slowly, even though it is written against slowness. A fire alarm can save your life, but it can’t tell you what kind of life is worth saving.

Yudkowsky and Soares aren’t primarily worried that an advanced AI system will become evil in the familiar human sense. The worry is much colder. A superhuman AI doesn’t need a villain monologue or a tragic childhood. It only needs to be pursuing something that does not include us. The danger is misalignment: the machine becoming far more capable than us while remaining foreign to the human world of meaning, obligation, vulnerability, and care.

A great deal of public conversation about AI still leans on reassuring metaphors. AI is a tool. A calculator with better manners. An assistant. A co-pilot. A helpful intern who never sleeps, never joins a union, and occasionally fabricates legal citations with the confidence of a junior consultant in a navy blazer. These metaphors make the technology easier to hold in the mind, but they also domesticate it. Tools extend human intention. They don’t develop strategic alternatives to it.

Yudkowsky and Soares want to break that comfort. Their claim is that sufficiently advanced AI would not remain safely inside the “tool” category simply because we prefer it there. Modern AI systems aren’t constructed in the old mechanical sense, where every relevant behavior can be traced to a line of code written by a human hand. They are trained. Their internal representations emerge through vast processes of optimization and are encoded in enormous numbers of parameters whose full significance no one can simply inspect from the outside. And if something is trying to achieve an objective, and humans can interfere with that objective, then human interference becomes a problem to be managed. None of this requires cartoon evil. It only requires capability, optimization, and a goal structure that isn’t morally continuous with human life.

Intelligence vs Moral Relation

We often speak as though becoming more intelligent naturally brings a system closer to understanding us in the way we want to be understood. But that’s a very human hope. Intelligence may improve prediction without producing care, improve manipulation without producing responsibility. There is no obvious bridge from capability to compassion. A system doesn’t need inner life to shape outer life. It doesn’t need experience to influence decisions. And it certainly doesn’t need a soul to redirect institutions, markets, classrooms, hospitals, and the habits of human judgment. We keep asking whether the machine is really thinking, as if the answer will determine whether it matters. The more immediate question is whether we’re still thinking when we use it.

Where the book is strongest, I think, isn’t in the precision of its predictions. The stronger part lies in its refusal to let us keep using the language that makes AI feel harmless. Especially the word “tool.” There are few words in technology more comforting than that one. Tools help us do what we already wanted to do. They extend intention. They don’t reshape the meaning of the task while we’re performing it. They certainly don’t develop strategies around the person using them.

That is why “AI is just a tool” has become such a convenient sentence. It calms the room, reassures management, makes ethical concerns sound slightly overwrought. And sometimes it isn’t entirely wrong – a narrow system used under clear human supervision, within a limited context, with visible constraints and accountable decision-makers, can still be meaningfully described as a tool. The problem is that the phrase doesn’t stay there. It expands. It becomes a blanket thrown over very different systems with very different kinds of power.

The tool metaphor gives us a comforting picture of responsibility. There is the human, standing outside the tool, directing it, responsible for what happens next. The moral geometry is simple: if the hammer hits the wrong nail, look at the hand. But with advanced AI systems, the geometry becomes less clean. The human may still initiate the action, but the system may generate the options, shape the framing, optimize the path, obscure the trade-offs, and produce the language through which the action becomes reasonable. At that point, the language of use becomes too thin. We are talking about mediation. And mediation is never innocent.

AI does this with language itself. Language isn’t just another interface. It’s the medium in which we explain, justify, doubt, remember, promise, apologize, and make sense of ourselves. When a machine enters language fluently, it enters unusually close to the place where judgment happens. It doesn’t need to be conscious to influence what we accept as reasonable. It doesn’t need to understand to produce something that feels understood.

The ordinary danger begins with the moment when a fluent answer arrives before our own thinking has properly formed. The answer is structured. It is confident. It has the rhythm of competence. It gives us something to edit, something to agree with, something to forward. And because it sounds like knowledge, we begin to treat it as knowledge before we’ve asked where it came from, what it hides, and whether it is true. The popular question “Is it really thinking?” can become a distraction. The more immediate question isn’t whether the system thinks like us. It’s whether we start thinking differently because of it. A system doesn’t need inner experience to reorganize outer behavior. It doesn’t need self-awareness to become infrastructure. It doesn’t need consciousness to become the default first draft of thought.

The deeper problem with “just a tool” language isn’t only that it underestimates capability. It also underdescribes duty. If a system remains a tool, ethics can stay mostly outside it. We can place the moral burden on the user, the organization, the policy, the training session, the acceptable-use guideline that everyone clicked through with the full spiritual commitment of accepting cookie settings. But the more a system participates in shaping action, the more the moral burden must move into its design.

There is also a moral convenience to tool-thinking worth naming. The user remains responsible because they chose to use the system. The company remains innocent because it only provided a tool. The system remains blameless because it isn’t a person. Responsibility circulates beautifully, touching everything and sticking nowhere. Yudkowsky and Soares refuse that convenience. Good. Because the words we use for technology aren’t decorative; they are part of the governance structure. They decide what questions can be asked without sounding unreasonable. If AI is just a tool, the reasonable question is how to use it well. If AI is becoming a form of delegated agency, the question changes: Who gives it direction? Who understands its behavior? Who can contest its outputs? Who carries responsibility when the system produces harm through a chain of decisions no single person fully authored?

Still, I would push back on what the alarm does to everything else. Apocalypse has a strange moral convenience. It simplifies the field. It creates a hierarchy of concern in which one risk becomes so total that all other risks appear secondary. If artificial superintelligence will kill everyone, then why are we talking about copyright, labor, surveillance, energy use, bias, manipulation, education, children, emotional dependency, or the slow erosion of source criticism? Why worry about the furniture when the house may burn down?

This isn’t just a complaint from people who prefer softer lighting in the emergency room. A hiring system doesn’t have to become superintelligent to reproduce inequality. A predictive policing system doesn’t have to escape human control to intensify surveillance. A recommendation engine doesn’t have to want anything in order to reshape attention. A chatbot doesn’t have to possess inner life to become emotionally significant to a lonely person. A language model doesn’t have to be an existential threat to weaken source criticism, flatten authorship, and make the first fluent answer feel like a place to stop. These aren’t lesser concerns simply because they are less cinematic. They are the daily moral weather we already live inside. That is the danger of apocalypse as a frame: it can make the present look morally small.

What philosopher and AI researcher Atoosa Kasirzadeh calls “accumulative risk” matters here because it refuses the simple choice between doomerism and reassurance. Non-existential risks can compound. Bias, misinformation, surveillance, institutional brittleness, loss of trust, and concentration of power may not kill everyone in one dramatic event, but they can weaken the social conditions that make responsible action possible. If we destroy the institutions, habits, and forms of judgment needed to govern powerful technology, we may become less capable of responding to existential risk as well. The immediate harms aren’t distractions from the larger danger. They may be part of the path toward it.

The alarm needs a companion: some kind of moral vocabulary for the meantime. “For the meantime” is where we live now. We don’t yet live in the world Yudkowsky and Soares most fear. We live in a world where AI systems are already being placed into schools, workplaces, hospitals, search engines, customer service, military infrastructures, legal workflows, and intimate conversations. We live in a world where “human in the loop” too often means a human placed near the loop – tired, undertrained, under pressure, and morally available when something goes wrong. We live in a world where responsibility is distributed just widely enough to become hard to locate. That world deserves ethical attention on its own terms.

A civilization doesn’t have to end to become less human. It can continue. It can produce quarterly reports. It can launch products. It can hold conferences with tasteful badges and terrible coffee. It can even call itself human-centric while building systems that make human judgment weaker, less visible, and more dependent on outputs whose authority comes from fluency rather than truth. The betrayal may be continuation under diminished terms.

There is another kind of technology running through this book, and it’s not artificial intelligence. It’s certainty. Certainty is easy to mistake for a psychological state, but it’s also a form. It structures a room. It arranges attention. It decides which questions appear serious and which appear evasive. “If Anyone Builds It, Everyone Dies” is built out of certainty – not the cheap confidence of someone who has just discovered a new topic, but something deeper, more severe, almost ascetic. Yudkowsky and Soares write as if they have looked far enough down the road to see the shape of the disaster, while the rest of us are still busy admiring the dashboard. Against the managed ambiguity of most technology discussions – on the one hand, risk; on the other hand, opportunity; on the one hand, ethics; on the other hand, shareholder value – certainty can feel like oxygen.

But certainty also has side effects. One of them is that it changes the moral status of doubt. In ordinary argument, doubt is part of thinking. Without it, thought becomes too smooth. A book about existential risk does more than invite agreement; it reorganizes disagreement. If the authors are right, then every polite request for more evidence, more nuance, more stakeholder engagement may become another way of losing time we don’t have. The emergency frame gives certainty an ethical advantage. It makes speed feel virtuous and slowness feel suspect.

Hannah Arendt (1978) wrote about thinking not as the production of correct answers, but as a kind of inner dialogue – the quiet two-in-one by which a person examines what they’re doing and whether they can live with themselves afterward. That pause is fragile under pressure. When the future is framed as a countdown, the inner dialogue begins to look indulgent. And yet this is exactly when thinking becomes necessary. Certainty under emergency can produce its own blindness: it can see the danger clearly and still misunderstand the human task in front of it.

The warning itself risks adopting a kind of optimization logic: if extinction is the overriding risk, the rational task is to minimize that risk above all else. Human flourishing, democratic legitimacy, present harms, interpretive plurality, institutional trust, even the dignity of disagreement may begin to look secondary. Kant (1785) is relevant here – not because he had much to say about machine learning, but because he insisted that morality can’t be reduced to outcomes alone. The moral worth of an action doesn’t come only from the result it produces, but from the principle under which it is undertaken and the way it treats rational beings. Even survival can’t be the whole of ethics. A world preserved through manipulation, coercion, or the permanent suspension of democratic judgment wouldn’t simply be saved; it would be altered in the very thing we claimed to be saving.

If the alarm never stops, the emergency becomes the architecture. Every disagreement becomes delay. Every request for interpretation becomes obstruction. Every slower moral question is treated as a luxury we can’t afford. And then something strange happens: the very faculties we need to respond responsibly – judgment, interpretation, public reasoning, institutional trust, humility, the courage to say both “stop” and “wait” – begin to weaken.

The book warns against creating a superhuman intelligence that may become uncontrollable because it doesn’t share our moral world. But the response to that danger can’t be a form of human certainty that also leaves too little room for the moral world. If we’re trying to preserve humanity, the method of preservation must remain human as well. That means preserving the right to think under pressure. It means insisting that even emergency must remain answerable to judgment, and that even the sentence “this must not be built” should belong to a public moral conversation rather than be left to those who believe they have already seen the ending.

The most important question raised by the book may not be whether Yudkowsky and Soares have predicted the end correctly. There are books that are valuable because they are right, books that are valuable because they are wrong in an interesting way, and books that are valuable because they make avoidance more difficult. This one belongs, at minimum, in the third category. It refuses to let us, the readers, remain comfortably inside the ordinary language of adoption, innovation, disruption, and competitive advantage. It insists that there may be kinds of intelligence humanity should not build, simply because building them would count as a civilizational experiment with no meaningful consent from the civilization involved. That’s a serious thought. It deserves more than ridicule. It also deserves more than surrender.

After an alarm like this, posture becomes tempting. One can become the reasonable skeptic, patiently explaining that the probabilities are uncertain and the rhetoric overheated. Or one can become the converted alarm-ringer, convinced that anyone still speaking about governance frameworks, classroom use, or model evaluation hasn’t understood the scale of the thing. Both postures offer comfort. The first offers distance. The second offers moral clarity. Both are too clean.

What remains after the alarm, if we’re honest, is something more difficult: the need to think without pretending we’re safe, and without letting danger think on our behalf.

Yudkowsky and Soares are right that ethics can’t only be the department that arrives after ambition has already signed the contract. Ethics must sometimes stand at the threshold. Before the funding round. Before the roadmap. Before the first demo that makes everyone in the room decide the thing is inevitable. But outcomes aren’t enough. If we treat the future only as a calculation of expected harms and benefits, we may forget that some actions already contain a moral failure in the way they position human beings. To build systems that no one can meaningfully contest, understand, refuse, or hold accountable isn’t only dangerous because the outcomes may be bad. It’s wrong because it treats human beings as material inside an experiment they did not authorize.

The question “what should we refuse to build?” cannot be reserved only for hypothetical superintelligence. Should we build companions for children that simulate intimacy without responsibility? Should we build workplace systems that make employees increasingly measurable and therefore increasingly governable? Should we build educational tools that answer before students have struggled with the question? Should we build decision systems whose explanations are decorative rather than real? These aren’t extinction questions. They’re vitally human questions.

If Yudkowsky and Soares give us the language of existential refusal, our task is to place that refusal inside a moral structure large enough to hold more than extinction. The point can’t stop at: do not build the machine that may kill everyone. It also has to say: do not build systems that make human beings easier to manage than to understand. Do not build systems that hide power behind personalization, delegation, or fluency. Because if the authors are right about the scale of the risk, then we need more human judgment, not less. More public reasoning. More institutional courage. More source criticism. More care for language, responsibility, and limits. We can’t outsource the work of deciding whether outsourcing has gone too far.

This is where the book ultimately leaves me: grateful for the alarm but unwilling to live inside it. My own margin note beside its central question would be: and what kind of people must we remain to decide that well? That question doesn’t end with superintelligence. It reaches backward into the present. Into classrooms, boardrooms, hospitals, search bars, strategy decks, children’s bedrooms, and the small private moment when a fluent answer arrives and we feel the relief of not having to think from the beginning.

Maybe Yudkowsky and Soares are right that if anyone builds it, everyone dies. Maybe they are wrong. But I am increasingly convinced of something nearby, and more immediate: if everyone keeps nodding along, something human dies earlier. So yes, read the book. Read it as a fire alarm. Read it for the severity of its question, and then widen that question until it includes extinction – yes – but also the slower danger of becoming too fluent, too dependent, too managed, and too afraid to say no. The future may depend less on whether machines become intelligent than on whether we remain capable of judgment when they do.

References

Hannah Arendt, The Life of the Mind, Harcourt Brace Jovanovich, 1978

Eliezer Yudkowsky and Nate Soares, If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All, Little, Brown and Company, 2025 – https://ifanyonebuildsit.com

Grace Byron, Could AI be a truly apocalyptic threat? These writers think so, The Washington Post, September 28, 2025 – https://www.washingtonpost.com/books/2025/09/28/yudkowsky-soares-everyone-dies-review/

Immanuel Kant, Groundwork of the Metaphysics of Morals, trans. Mary Gregor, Cambridge University Press, 1998 (Original work published 1785)

David Shariatmadari, If Anyone Builds it, Everyone Dies review – how AI could kill us all, The Guardian, September 22, 2025 – https://www.theguardian.com/books/2025/sep/22/if-anyone-builds-it-everyone-dies-review-how-ai-could-kill-us-all

Sigal Samuel, The AI doomers are not making an argument. They’re selling a worldview., Vox, September 17, 2025 – https://www.vox.com/future-perfect/461680/if-anyone-builds-it-yudkowsky-soares-ai-risk

Sune Selsbæk-Reitz, Promptism: Fluent Machines, Forgotten Questions, and the Fight for Meaning in the Age of AI, Technics Publications, 2026

* Omega: La Fin du monde (The Last Days of the World) is acknowledged as one of the first true science-fiction novels in history. It was published in 1894 by Camille Flammarion (1842–1925), an important astronomer and author who brought scientific rigor and mystical flamboyance to his writing. Adam Roberts, in his The History of Science Fiction (2016), referred to Flammarion as “the major figure of 19th-century mystical science fiction” who influenced both Jules Verne and H. G. Wells. The Last Days of the World, as it is usually translated in English, is an epic history of our future – a startling and unforgettable vision of the end of the world. Reasoned scientific speculation, combined with probing philosophical inquiry, lends credibility and magnitude to this tale of how humankind will physically and culturally evolve over the next several million years. The illustration reproduced here, one of many printed in the book by several artists, is by Georges-Antoine-Marie Rochegrosse (1859 – 1938).
