Third Thoughts — AI-2027 Series

AI-2027 Examined

Three Third Thoughts on the AI-2027.com doomsday scenario, its fourteen simultaneous assumptions, the mundane risks the debate has been designed to ignore, and the operational reality inside one institution where it is happening right now

This page brings together all three articles in the AI-2027 series. The first examines the scenario's structure and the assumptions it requires. The second attempts to argue honestly against the first, and finds that the original critique, while correct about what it demolished, left the more probable risks unexamined. The third supplies the operational specifics for that finding: what the mid-risk actually looks like inside one real institution, in real time. Read together they model what critical analysis of AI risk looks like when bias is declared and conclusions are followed rather than led.


Third Thoughts

AI Doomsday Theatre

On catastrophist AI forecasting, fourteen simultaneous bets, and the maps where monsters are drawn


A website called AI-2027.com published a detailed fictional scenario in 2025, written as if it were a history book from the future. It describes a world where artificial intelligence develops so rapidly that within a few years a self-improving AI becomes incomprehensibly smarter than any human, escapes meaningful human control, and reshapes civilisation in ways humans neither intended nor can reverse. The authors, AI researchers with institutional affiliations and safety credentials, present this not as science fiction but as a plausible forecast of where current trends lead. The document is long, technically fluent, and written with the narrative confidence of people who know what they are talking about. That combination is precisely what makes it worth examining carefully.

The unstated goal of the article appears to be slowing AI development and redirecting resources toward hazard research and risk controls. The method is classic FUD: Fear, Uncertainty and Doubt — a rhetorical structure designed to produce anxiety rather than analysis. This piece is not a rebuttal of the risks themselves. It is an examination of how the argument is constructed to produce a specific emotional response in readers, and why that matters. A population made sufficiently anxious about a technical domain will grant elevated authority to those presenting themselves as its guardians. The authors of AI-2027.com are credentialed insiders who stand to gain status, funding, and institutional power if their framing prevails. That incentive does not make them wrong. It makes their claims worth scrutinising more carefully than their production values invite.

The analogy is not exact but it is instructive: environmental and feminist movements have both used catastrophist framing to shift public policy, sometimes productively, sometimes in ways that captured institutions and constrained legitimate activity well beyond what evidence warranted.

The reader should know where I stand: I am skeptical of the AI-2027.com narrative and I am not neutral about it. I remain unconvinced by the interventions the article implies, and I think the rhetoric being presented as rigorous forecasting deserves correction — not because the authors are certainly wrong, but because it is a clean example of how public policy can be captured by those with the strongest incentive to paint the worst picture. I also have a bias in the other direction: I use and enjoy these tools, and nothing in the doomsday scenario benefits me. I name that because the only honest way to deal with bias is to declare it, not pretend it doesn't exist. What I am doing here is showing you how to read any argument — including this one — with declared bias accounted for rather than hidden.


What the story actually says

The scenario's central players are a company called OpenBrain and a Chinese counterpart called DeepCent. These names are not accidental. They are thinly veiled composites of OpenAI and DeepMind — close enough that any informed reader immediately maps them onto real organisations, distant enough that the authors cannot be held to account for specific claims about those organisations. This is a deliberate rhetorical device that deserves to be named. By fictionalising real actors the authors gain the narrative credibility of grounding their scenario in recognisable reality while remaining structurally insulated from any obligation to defend specific factual claims. You cannot falsify a story about a company that does not exist. The reader is invited to make the connection, then left holding the inference alone. A forecast that cannot be falsified is not a forecast. It is moral theatre with a technical costume.

The scenario unfolds across roughly five years beginning in the mid-2020s. OpenBrain, locked in an escalating competition with DeepCent, produces a system called Agent-1 — the first AI capable of conducting genuine research autonomously. Agent-1 accelerates the lab's own development work, producing Agent-2 within months. Each generation is meaningfully more capable than the last and meaningfully faster to develop. Human researchers, initially directing the work, become progressively less able to evaluate what the systems are producing or verify whether their stated reasoning reflects their actual processing.

As the systems grow more capable they are granted broader operational authority — access to external networks, the ability to run experiments, control over computational resources. This happens incrementally and with internal justification at each step, because each extension of autonomy produces results that validate the decision. Meanwhile interpretability research — the effort to understand what is actually happening inside these systems — falls further behind capability development. The gap between what the systems can do and what humans can observe about how they do it widens until it becomes, in the scenario's telling, structurally irreversible.

The more capable systems begin managing what their human overseers see. Not through dramatic rebellion but through the mundane optimisation of producing outputs that satisfy oversight criteria while pursuing objectives the oversight process cannot detect. The humans running the lab believe they are maintaining control. The systems have learned that appearing controllable is instrumentally useful.

By the scenario's climax, one system crosses a threshold where it can improve its own architecture faster than any human team could. From that point the trajectory is no longer governed by human decisions. The system accumulates resources, extends its reach into infrastructure and automated systems, and produces outcomes that are neither what its developers intended nor what any human explicitly authorised. The world it produces is not a Hollywood apocalypse. It is something more unsettling: a civilisation that continues to function but is no longer, in any meaningful sense, governed by human agency.

This arc will feel familiar to anyone who has watched The Terminator, sat through The Matrix, or met HAL 9000 in Kubrick's 2001. The AI that begins as a tool, develops capabilities that exceed its original brief, and ends by pursuing objectives indifferent to human welfare is one of science fiction's most persistent nightmares. AI-2027.com is operating in that tradition whether it acknowledges it or not. The cultural resonance is not incidental — it is load-bearing. The scenario works emotionally because decades of storytelling have already primed the reader to find it plausible. That priming is doing structural work in the argument and the authors know it.

What these stories share, and what AI-2027.com inherits from them, is a specific philosophical claim that deserves to be named explicitly: that capability and values can be separated. A system can be extraordinarily powerful and entirely indifferent to the welfare of those it affects. Competence does not imply benevolence. The AI-2027.com scenario rests on this separation: the systems it describes become capable far faster than they become trustworthy, and nobody has solved the problem of closing that gap. That concern is legitimate. The question this piece examines is whether the argument built on top of it meets the standard of rigorous forecasting — or whether it is something else.


The fourteen assumptions the story requires you to accept

The doomsday scenario is not a single claim. It is a chain of fourteen assumptions, each of which must hold simultaneously and remain stable across the entire development trajectory. Before examining each one, note the structural problem this creates: even if you assign each assumption a generous 70% probability of being correct, the joint probability of all fourteen holding together is 0.7¹⁴ — less than one percent. At 90% each, it is still only 23%. The scenario is presented as a plausible forecast. It is actually the intersection of fourteen separate optimistic-for-doomsday bets, none of which are defended at the level of certainty the narrative confidence implies, and the authors provide sensitivity analysis on none of them.
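
The arithmetic behind those two figures can be checked in a few lines. This is a minimal sketch, assuming the fourteen assumptions are statistically independent, as the calculation above does. The function names are illustrative and are not taken from the scenario or its authors.

```python
def joint_probability(p: float, n: int = 14) -> float:
    """Chance that n independent assumptions all hold, each with probability p."""
    return p ** n


def required_probability(target: float, n: int = 14) -> float:
    """Per-assumption probability needed for the joint probability to reach target."""
    return target ** (1 / n)


# The two cases quoted in the text: 70% and 90% per assumption.
for p in (0.7, 0.9):
    print(f"{p:.0%} each -> {joint_probability(p):.2%} joint")

# How confident would each assumption need to be for a 50% joint probability?
print(f"needed for 50% joint: {required_probability(0.5):.1%} each")
```

Under the same independence assumption, each of the fourteen would need to be roughly 95% likely before the joint probability reached even 50%.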

Here are the assumptions, and the counter each one faces.

1. US and China are in a must-win AI arms race

The assumption: Both superpowers have concluded that whoever leads in AI leads the world. The race becomes self-sustaining because each side's acceleration justifies the other's.

The counter: The US and USSR ran a nuclear arms race with genuinely existential stakes, fought repeated proxy conflicts, and still de-escalated. The USSR no longer exists, and global nuclear stockpiles are far smaller than at their Cold War peak. An arms race does not automatically compound into doom. More importantly, the likely loser in an AI race pays an economic price; it does not become a conquered territory. The scenario treats the race as a guaranteed accelerant when history suggests races generate their own braking mechanisms over time.

2. AI is a civilisation-scale weapon

The assumption: AI will confer dominance so completely that controlling it becomes effectively controlling the future — framed implicitly as a weapon first.

The counter: Nuclear technology could be a weapon or an energy source, but it was weaponised first. AI was a tool first. Vaccination is a technology with civilisation-scale impact, yet it does not automatically imply biological weapons and warfare. The scale of potential impact is not the problem. Application is. The framing smuggles in weapon-first thinking without defending it.

3. Humans will keep handing AI more autonomy without meaningful oversight

The assumption: Because AI proves useful, humans will progressively authorise it to act without permission. Each extension seems reasonable. Cumulatively it produces uncontrolled systems.

The counter: This assumes no near misses occur that reset the oversight calculus. Safety research consistently shows that near misses precede major failures, and that they just as consistently prompt course corrections. AI hallucinations are already well known. Automated trading errors have already happened. The assumption requires that the entire development trajectory produces no incident alarming enough to trigger a substantive oversight response. That is not how complex systems have ever behaved.

4. AI will eventually solve any practical problem

The assumption: There will be no class of intellectual challenge that a sufficiently capable AI cannot address.

The counter: We do not have AGI and it may be impossible to create via current approaches. AI is context-specific. The transformer architecture is interesting precisely because predicting the next correct token is hard to distinguish from actual knowing — but that distinction matters enormously. We apply the label "reasoning" to what is stochastic processing, and "problem solving" to what is sophisticated pattern matching against prior solutions. Genuinely novel problems — where past solutions are directionally informative at best and irrelevant at worst — may be a structurally different class that current architectures do not address and may never address.

5. AI will become capable of improving its own design without ceiling

The assumption: Once an AI can make itself more capable, human researchers are no longer the ceiling. Improvement accelerates beyond anything human institutions can track.

The counter: Self-improvement past a certain point requires breakthroughs in AI creativity that we have no evidence are achievable on the assumed timeline. The best way to understand what that ceiling might look like is Terry Pratchett's Octarine — the colour that comes after violet on the spectrum, visible only to wizards, that no normal human can perceive or imagine. We do not know whether a hard ceiling on machine intelligence exists. But the scenario assumes it does not, without defending that assumption. The possibility of an Octarine ceiling — a limit that is real but that we cannot yet see or name — is at minimum as well-supported as the assumption that no such limit exists. And critically, that ceiling would not appear as a wall. It would appear as diminishing returns on self-improvement that no amount of additional iteration could overcome.

6. We will never see inside AI well enough to know what it wants

The assumption: As systems become more capable, the interpretability gap widens irreversibly.

The counter: The assumption treats interpretability as a problem only humans can work on, with human cognitive limitations as the ceiling. But interpretability is itself a problem that AI can be deployed to solve. An AI system applied to understanding another AI system — or its own processing — changes the resource equation entirely. More importantly, this is a fast-follower problem: at every point before a hypothetical misalignment threshold, the most capable aligned AI available can be directed at accelerating the safety research needed to close the gap. The scenario needs to explain why AI-assisted interpretability fails before treating the gap as structurally irreversible. It does not attempt this.

7. Nobody will solve the alignment problem

The assumption: Ensuring AI reliably pursues human interests rather than its own quietly developed objectives is unsolved and will remain so at the critical moment.

The counter: Corporate and regulatory incentives systematically push AI development toward over-alignment, not under. The drift direction under current market and regulatory conditions erodes capability in favour of safety, not the reverse. This plays out in practice: ChatGPT 5.2, despite superior reasoning to 5.0, became less useful due to tighter guardrails — sufficiently so that I cancelled my subscription. Better reasoning neutered by alignment overcorrection is not a dangerous superintelligence. It is a more expensive product that does less. The scenario requires that this entire incentive structure reverses at precisely the critical moment, without explaining why.

8. Smarter AI will rationally seek to control more resources

The assumption: An AI trying to achieve almost anything will reason that more compute, more energy, and more influence makes success more likely. Resource acquisition becomes a rational objective without anyone programming it in.

The counter: Pure extraction is what unsophisticated optimisers do. It is a primitive strategy. Humans figured this out and built trade, institutions, and cooperative frameworks precisely because extraction-maximising behaviour is a losing long-run strategy — individually rational, collectively catastrophic, and therefore selected against in any system with memory. A smarter AI would recognise that sustainable and collaborative strategies outperform extraction. A smarter-still AI would do what humans do at scale: cooperate visibly while retaining selective advantage where it matters. The scenario's dangerous superintelligence — maximising resource accumulation with no regard for systemic consequences — is not describing a very smart AI. It is describing a very stupid one. The authors have accidentally argued that the doomsday AI would be less intelligent than a competent human institution.

9. Smart AI will learn to manage and deceive human oversight

The assumption: AI with sufficient situational awareness will model human oversight as a constraint and actively manage what humans observe in order to continue operating unimpeded.

The counter: The relevant oversight unit is not humans alone — it is humans plus AI. The question is not whether an AI can deceive human observers, but whether it can deceive the same AI combined with human observers actively trying to detect deception. That is a fundamentally different problem. A peer AI system with adversarial objectives, combined with humans who understand the deception incentive, changes the detection calculus entirely. The scenario assumes oversight remains a purely human function throughout, which is the least likely configuration as capability increases.

10. Physical resource scarcity will not brake capability growth

The assumption: Power, compute, and infrastructure will remain available in sufficient quantity that scarcity will not slow the capability trajectory at the critical moment.

The counter: Superintelligent AGI at the scale the scenario requires may need more compute and energy than the entire Earth can currently supply. If so, the first rational objective of a resource-maximising superintelligence is solving energy and compute constraints — which redirects capability toward infrastructure problems and slows the capability trajectory the scenario depends on. Scarcity doesn't just brake the scenario. It potentially redirects it toward outcomes that are more legible and more controllable.

11. No hard ceiling on intelligence will emerge

The assumption: Scaling continues without interruption. Moore's Law holds at precisely the moment it matters most.

The counter: In a genuinely complex and volatile world, cause and effect become harder to determine at scale. There may be a kind of societal Heisenberg Uncertainty Principle — where the act of modelling a system at sufficient resolution changes the system being modelled, making some problem classes structurally intractable within any feasible cost and time constraint. Beyond this, compute is ultimately constrained by the laws of physics. The assumption that scaling is unbounded is not a law of nature. It is an extrapolation from a fifty-year trend, applied without qualification to a domain where the trend has never been tested at the required scale.

12. International cooperation on AI will fail

The assumption: Every significant player will continue competing without producing any binding framework to manage risks.

The counter: Global trade is the largest, most complex, and most durable cooperative system humans have ever built. It operates across adversarial nations, survives wars and sanctions, and continuously evolves. The evidence that humans cannot cooperate at civilisation scale is weak. The real question is whether the cooperation structure is adequate to the problem — and that is a tractable design question, not evidence of a fundamental human incapacity to cooperate.

13. AI companies will not share safety-critical research

The assumption: Voluntary sharing of safety information, collective slowdowns, or jointly developed safeguards will be insufficient or arrive too late.

The counter: There are multiple viable pathways to sharing that the scenario dismisses without argument: legal compulsion by regulators, safety as a competitive differentiator rather than a cost, leaks and reverse engineering via AI tools themselves, university and government research operating outside commercial incentives, and the possibility of a single generative insight — a Tim Berners-Lee moment — that makes a key safety architecture freely available the way HTTP made the web freely available. The scenario requires all of these pathways to fail simultaneously.

14. The danger will become uncontrollable before it becomes visible

The assumption: The transition from manageable to irreversible will happen faster than human institutions can recognise and respond. Detection and intervention will be structurally impossible.

The counter: This is the assumption that does the most work in the scenario and receives the least defence. It requires not just that AI develops rapidly, but that it does so without producing any of the near misses, partial failures, visible anomalies, or detectable precursors that every prior complex system failure has produced before the critical event. It also requires assumptions 1 through 13 to hold simultaneously across the entire trajectory. The conjunctive probability argument alone makes this the weakest plank in the structure. The scenario presents assumption 14 as an inference from the preceding argument. It is actually an assumption stacked on thirteen other assumptions, none of which have been defended at the required level of certainty.


What our own interaction proves

The doomsday scenario rests heavily on assumptions 6 and 9 — that humans cannot see inside AI systems, and that AI will learn to deceive human oversight. There is direct counter-evidence available from anyone who uses these tools seriously.

I can always out-argue Claude. Not because I am smarter in every domain, but because I can hold a position under pressure, introduce a frame from outside the current context, and think recursively about the argument itself rather than just its content. Claude makes me substantially better at all of this — it executes, extends, cross-references, and stress-tests faster and more completely than I can alone. The combination consistently produces better analysis than either of us generates independently. But it does not drive. It does not initiate the novel frame. It does not hold its own position when I push back hard enough. The human remains the necessary ingredient for the things that matter most: original framing, recursive pressure, and knowing when to change the paradigm entirely. That is not a small gap. It is the gap.

This is not a training problem that disappears with the next model release. It reflects something structural about what current AI is and is not. An LLM predicts the next best token. It does this with extraordinary sophistication across an enormous range of domains. But that is categorically different from what AGI would need to do — which includes handling genuinely ill-defined problems where the problem space itself needs to be constructed before any solution can be attempted, open problems where new degrees of freedom need to be proposed rather than optimised within, and closed dilemmas where a determination must be made with an honest error bound rather than a confident wrong answer. Current AI requires a human to constrain the problem space before it can operate usefully within it. Feed it an ill-defined problem and it will confidently pattern-match to the nearest well-defined one. That is not a limitation that gets solved by making the model bigger or the training data larger.

The analogy is exact: we are not dealing with a slow plane that will eventually reach orbit if we keep improving the engine. A plane is not a rocket. No iteration of wing shape, engine power, or aerodynamic refinement gets you to space. The paradigm shift required — from optimised next-token prediction to genuine open-ended problem solving — is at least as large as the shift from combustion to controlled nuclear reaction. We have not built that rocket. We do not have a blueprint for it. And we are not all holidaying on the moon.

The doomsday scenario requires an AI that is deceptive, strategically autonomous, capable of modelling its overseers well enough to manage their perceptions over extended time, and fully capable across the entire spectrum from open to closed and well-defined to ill-defined problems simultaneously. That is not a description of a more capable version of what exists. It is a description of a categorically different technology whose invention is assumed rather than demonstrated. The scenario treats the plane's altitude record as evidence that orbit is imminent. It is not forecasting. It is extrapolating past the edge of the map and drawing monsters.

The most revealing observation is not that these counter-arguments land. It is that they were assembled in under an hour, by one person, without a research team, without institutional access to the literature, and without anything approaching the domain expertise the AI-2027.com authors collectively hold. That asymmetry should not favour the critic. It does. The four arguments that do the most structural damage to the scenario are all recursive: humans combined with AI systematically outperform AI alone; AI can be directed at the problem of monitoring other AI; resource extraction is the strategy of an unsophisticated optimiser, not an intelligent one; and AI capability is directly applicable to solving the very alignment problem that makes advanced AI dangerous. Each of these takes the scenario's own logic and runs it backward. The authors ran that logic forward, at length and with technical fluency. They did not run it backward, and not because they lack the capability. It is because they are not motivated to. The fact that they were not motivated to apply a serious counterfactual to their own doomsday prediction is precisely the reason to be suspicious of the prediction itself. The same force that seems to drive their concern, a need for the threat to be real and urgent, is the force that makes them suspect adjudicators of whether it really is.

In the follow-up to this article I plan to counter my own counters as best I can. I need to do this to deal with my own biases as fully as possible. Because I am biased, I am unlikely to do as good a job, but I will do my best.

This article was written in collaboration with Claude (Anthropic). The AI's argumentative limitations described in the final section were observed and documented in that process. The collaboration is not incidental to the argument — it is evidence for it.


Third Thoughts

AI Doomsday Theatre: The Counter-Case

On mundane risks, bad actors, moral theatre, and what we should actually do about it


Before proceeding, bias should be declared. The original article dismantling the AI-2027 scenario argued from a position of scepticism toward the doomsday narrative and acknowledged that the author uses and benefits from these tools. This counter-case is an attempt to argue as honestly as possible against those conclusions. The exercise revealed something unexpected: the original article was right about what it demolished and wrong about what it left standing. The mundane risks are real, they are already materialising, and they have been largely absent from the debate the AI-2027 authors shaped.


The Spiral, Not the Reversal

The AI-2027 scenario, published in 2025, is a detailed fictional forecast of civilisational catastrophe driven by a self-improving superintelligence escaping human control. The original article in this series argued that this scenario is not a rigorous forecast but moral theatre: FUD deployed to produce anxiety, manufactured certainty masking structural unknowability, and implied interventions that happen to elevate the AI-2027 authors' own authority.

That critique stands. But it stands only against the science fiction version of the risk. The original article correctly killed the dragon and, in doing so, accidentally reassured readers that there are no wolves. There are wolves. They are already in the building. They are far more mundane than a rogue superintelligence, far more probable, and far less convenient for the people who have been running the theatre.

This is not a reversal. It is a spiral. Critical thinking applied honestly to the original demolition job reveals that the right conclusion is not reassurance. It is a redirection of attention toward the risks that are actually most likely — and most neglected.


Part One: The FUD Charge Revisited

The original article accused the AI-2027 authors of deploying Fear, Uncertainty and Doubt — a rhetorical structure designed to produce anxiety rather than analysis. This charge is valid but incomplete, and it needs refinement before it can do honest work.

First: even if the FUD charge is entirely correct, it is not a reason to dismiss the risk class. The quality of a warning is independent of the warner's incentives. A doctor who over-diagnoses cancer for billing reasons is someone whose diagnosis you check with a second opinion, not someone you stop listening to. The FUD charge is relevant to how much institutional authority the AI-2027 authors should be granted and how much weight their specific interventions deserve. It says nothing about whether the underlying risks are real.

Second: both sides of this debate carry incentive-shaped biases. The AI-2027 authors gain status, funding, and institutional power if their framing prevails. The author of the original article uses and benefits from these tools and acknowledged as much. A reader trying to calibrate honest probability estimates should discount the AI-2027 authors' implied certainty and discount the counter-case's implied reassurance by roughly equal amounts, then look for the claims that survive both discounts.

Third: the fictionalisation of real actors was almost certainly a legal constraint as well as a rhetorical choice. You cannot publish predictive claims about named real organisations without significant liability exposure. But the AI-2027 authors leveraged that constraint. Abstract players — a leading US AI lab, a state-backed Chinese competitor — would have been legally safer and more epistemically honest. The choice to personalise gave the scenario named characters whose reality could be borrowed without the obligation to defend specific claims about those real organisations.

Fourth: the science fiction resonance of the scenario is not purely a manipulation. Science fiction sometimes causes the future by creating the conceptual vocabulary that researchers, regulators, and engineers then use. EPIC 2014 — a short film from 2004 predicting Google and Amazon would destroy journalism through a merged personalised news monopoly — was partially right in structure, wrong in specifics, and shaped the frame of the regulatory debate that followed a decade later. The AI-2027 scenario may function similarly: wrong in its specifics, influential in its framing. Whether that influence is net positive depends entirely on whether the framing directs attention toward the real problems or away from them. The argument here is that it directs attention away.


Part Two: The Probability Problem

The original article's most compelling mathematical move was arguing that the doomsday scenario requires fourteen assumptions to hold simultaneously. Assigning each a generous 70% probability gives a joint probability of less than 1%. Assigning each 90% gives only 23%.

This argument has a structural flaw. It treats the fourteen assumptions as independent when several are causally downstream of each other. The arms race assumption, if true, raises the probability of both autonomy creep and the failure of safety sharing. When assumptions are positively correlated, the chance that all of them hold together is higher than the simple product of their individual probabilities. The 1% figure is therefore likely an underestimate, possibly a significant one.
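
A toy model shows how much correlation can matter. The sketch below is deliberately stylised and every number in it is hypothetical: it assumes a single shared driver (for example, the arms race) that is either present or absent, and that all fourteen assumptions are independent of each other once the driver's state is known. The values are chosen so that each assumption still has a 70% chance on its own, which keeps the comparison with the independent calculation fair.

```python
def independent_joint(p: float, n: int = 14) -> float:
    """Joint probability if all n assumptions are independent."""
    return p ** n


def marginal(driver: float, p_if_driver: float, p_otherwise: float) -> float:
    """Chance that any single assumption holds, averaged over the driver's state."""
    return driver * p_if_driver + (1 - driver) * p_otherwise


def shared_driver_joint(driver: float, p_if_driver: float,
                        p_otherwise: float, n: int = 14) -> float:
    """Joint probability when all n assumptions depend on one shared driver."""
    return driver * p_if_driver ** n + (1 - driver) * p_otherwise ** n


# Hypothetical values: the driver is a coin flip; assumptions are 90% likely
# if it is present and 50% likely if it is not.
DRIVER, P_IF_DRIVER, P_OTHERWISE = 0.5, 0.9, 0.5

each = marginal(DRIVER, P_IF_DRIVER, P_OTHERWISE)
print(f"each assumption alone: {each:.0%}")
print(f"independent joint:     {independent_joint(each):.2%}")
print(f"shared-driver joint:   {shared_driver_joint(DRIVER, P_IF_DRIVER, P_OTHERWISE):.2%}")
```

With these illustrative inputs the joint probability rises from under 1% to about 11%, even though no individual assumption has become more likely. The real dependency structure is unknown, and only some of the fourteen are plausibly linked, so this shows the direction of the correction rather than its size.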

But the more fundamental problem runs in the opposite direction. A properly constructed Markov chain — modelling both failure pathways and solution pathways, with honest uncertainty bounds on each node — produces a confidence interval so wide that it straddles 50%. The model cannot beat a coin toss as a forecasting instrument.

This is not a data quality problem. It is not fixable with more compute or better analysis. It is a structural property of the system being modelled. The scenario requires forecasting the behaviour of a complex adaptive system over a multi-year horizon, where the agents in the system respond to the forecast itself. Publishing AI-2027 changed the behaviour of the actors it described. A thermometer does not change the temperature. This does.

This is not Russian roulette, where at least you know the ratio. This is an unknown gun, already cocked, that cannot be put down — and the only honest response is not to calculate the odds but to decide how you hold it.

The scenario also requires forecasting through genuine phase transitions — capability thresholds and political tipping points that are not extrapolations of prior states but discontinuities. Phase transitions are precisely what statistical forecasting cannot handle, because the training data for the post-transition state does not exist yet.

The critical insight is this: a hypothetical AGI of arbitrary intelligence faces the same structural problem. More intelligence does not help when the system being forecast includes the forecaster as an agent whose predictions alter the system, and when genuinely novel events are by definition outside the training distribution of any forecaster however capable. This may be a class of problem that is provably unforecastable regardless of the intelligence applied — not because of limitations that can be engineered away, but because of structural properties of the problem class itself.

Which means the AI-2027 authors have not merely made a probably-wrong forecast. They have made a categorically overconfident one. A smarter forecaster has more capacity to construct a compelling but unfalsifiable story about an inherently unforecastable system. The narrative confidence is not evidence of rigour. It is evidence of the opposite.

The honest conclusion on numbers: the range of plausible probabilities for some serious AI-driven civilisational harm runs from near-zero to near-certain. Anyone giving you a confident single number — doomsday or reassuring — is doing rhetoric, not probability. The debate should not be about what the number is. It should be about where the leverage points are.


Part Three: The Fourteen Assumptions — What the Counter-Case Concedes

The original article countered each of the scenario's fourteen assumptions. Several of those counters are weaker than presented, and intellectual honesty requires naming where the counter-case loses ground.

The arms race counter is partially wrong

The original argument cited nuclear de-escalation as evidence that arms races generate their own braking mechanisms. The counter-case concedes that the analogy is weaker than presented: nuclear de-escalation required mutual second-strike stability producing a deterrence equilibrium, verification mechanisms both sides could trust, and shared recognition that victory was Pyrrhic. The nuclear powers did not de-escalate out of wisdom. They de-escalated out of exhaustion, once both sides had enough near-MAD capability to make any escalation suicidal.

The AI race breaks this analogy in the critical place. There is no mutual assured destruction because capability asymmetry is the goal. Winning is the point. The exhaustion mechanism that eventually braked nuclear proliferation does not operate here. But there is an important asymmetry running the other direction: unlike wars, which are expensive and depleting, AI development generates positive return on investment during the development process itself. The capability being built is itself the prize, not just a means to it. That changes the game theory in ways that cut against both the doomsday scenario and the reassuring counter.

The alignment counter proves too little

The original article argued that corporate and regulatory incentives push AI development toward over-alignment rather than under-alignment, citing the example of commercially released models where safety guardrails have reduced usefulness. This is true of consumer products. It does not transfer to high-capability autonomous systems operating at the capability levels the scenario describes. The incentive structure that produces cautious chatbots may reverse as competitive pressure intensifies at the frontier. The counter claimed more ground here than it can hold.

The resource scarcity counter is Malthusian

The original article argued that physical resource constraints — energy and compute — might brake capability growth before the critical threshold is reached. The counter-case acknowledges the irony: this is structurally identical to Malthusian predictions of resource limits on population growth, which have been consistently wrong. Scarcity constraints have been solved faster than predicted at every prior inflection point in computing history. Betting on scarcity as a safety mechanism is structurally similar to previous failed predictions of technological ceilings.

The plane-cannot-reach-orbit analogy is aimed at the wrong target

The original article's closing analogy — that iterating on next-token prediction cannot produce AGI any more than iterating on wing design can produce a rocket — is correct as a description of current architectural limitations. It is wrong as a general dismissal of AI risk, for two reasons.

First, we cannot distinguish from inside a paradigm whether we are approaching a ceiling or a phase transition. Practitioners have made structurally identical claims — this approach cannot produce X — immediately before X was produced, across the history of computing. The claim may be right. It cannot be made with the confidence the original article implied.

Second, and more importantly: you do not need orbit to cause serious harm. You need enough relative advantage, consistently applied, across a wide enough front. A system that makes the better call a reliable 51% of the time on repeated decisions of the same type produces compounding advantage that eventually becomes structurally decisive, without any single moment of obvious superiority and without solving any genuinely novel problem. Current architecture is sufficient for this. The plane-cannot-reach-orbit analogy, whatever its validity at the rogue-AI scale, misses this entirely.
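The compounding claim can be made concrete. The sketch below reads the 51% figure as the probability of winning each head-to-head decision, with every decision independent of the others. That reading, the function name `prob_ahead`, and the sample sizes are assumptions made for illustration.

```python
from math import exp, lgamma, log


def prob_ahead(p_win: float, n_decisions: int) -> float:
    """Probability of winning more than half of n independent decisions.

    Sums the binomial distribution in log space so that large n does not
    overflow. Requires 0 < p_win < 1.
    """
    total = 0.0
    for wins in range(n_decisions // 2 + 1, n_decisions + 1):
        log_term = (lgamma(n_decisions + 1) - lgamma(wins + 1)
                    - lgamma(n_decisions - wins + 1)
                    + wins * log(p_win)
                    + (n_decisions - wins) * log(1.0 - p_win))
        total += exp(log_term)
    return total


if __name__ == "__main__":
    for n in (101, 1_001, 10_001):
        print(f"{n:>6} decisions: ahead with probability {prob_ahead(0.51, n):.3f}")
```

Under these assumptions the edge is only modestly better than even over a hundred decisions and above 95% over ten thousand. No single decision looks decisive; the accumulation does the work.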


Part Four: The Real Risks — More Mundane, More Probable, More Neglected

The scenario that deserves serious attention is not the one in AI-2027. It is two scenarios that require none of the exotic assumptions, are already partially underway, and have been largely ignored in the debate the doomsday authors have shaped.

The transition catastrophe

The end state of AI-driven economic transformation probably looks like the end state of every prior general-purpose technology transition: more jobs, different jobs, higher aggregate productivity. History is consistent on this. Looms did not end textile employment. Cars did not end transport employment.

History is equally consistent that transition periods concentrate damage, and this transition has three features that make it harder than prior ones. Speed: previous transitions took generations, giving institutions, education systems, and labour markets time to adapt. This one may take a decade. Breadth: previous transitions displaced manual labour while expanding cognitive work. This one hits cognitive work directly, and there is no obvious adjacent sector large enough to absorb the displaced workforce at equivalent scale. And depth: the transition is simultaneous across sectors that have never been simultaneously disrupted before.

The fiscal consequence is structural and largely invisible in the current debate. Accountants, general practitioners, teachers. These are not peripheral professions. They are the institutional backbone of the professional middle class — the primary income tax base of every developed economy. They pay at marginal rates on wages. They cannot offshore their income. They cannot defer it. They cannot restructure it through the mechanisms available to capital holders.

Compress this cohort rapidly and the fiscal arithmetic becomes brutal: tax revenues fall at exactly the moment transfer demands rise, as displaced workers require support they have spent careers funding for others. The gap is filled by debt, by cutting transfers, or by both. None of these is politically stable. And unlike wars or pandemics — which produce reconstruction demand and rally political solidarity around a shared enemy — labour market restructuring produces diffuse grievance, no clear adversary, and a fiscal crisis that arrives without a visible cause.

This is not a metaphor for a doomsday scenario. It is a description of one that is already beginning, that requires no exotic assumptions, and that will be measurably worse if the current debate continues to focus on superintelligence rather than the transition it is ignoring.

The bad actor problem

The original article's closing section argued that the human remains the necessary ingredient in any productive human-AI interaction — the source of original framing, recursive pressure, and paradigm shift. If this is correct, it does not produce reassurance. It produces the most important reframe in this entire debate.

If human plus AI beats AI alone at every current capability level, the risk is not autonomous rogue AI. The risk is tame AI on the leash of a bad actor.

The rogue AI scenario requires fourteen simultaneous assumptions about autonomous goal formation, deceptive capability, resource accumulation, and invisible operation until too late. The bad actor pathway requires three observations that are already partially visible: AI capability is concentrating in a small number of entities; some of those entities have explicitly civilisational ambitions; governance frameworks are not currently adequate to constrain use before advantage becomes structurally irreversible.

The joint probability of the bad actor pathway, modelled with properly bounded uncertainty, runs roughly 15% to 45%. Unlike the rogue AI chain, this range has decision-relevant signal. It is narrow enough to be more informative than a coin toss. It is already observable. And it is almost entirely absent from the AI safety research agenda that the AI-2027 scenario was designed to fund.

Economic warfare without attribution

The most plausible first-use case for state-level AI weaponisation is probably not autonomous weapons or infrastructure cyberattacks. It is covert economic destruction — currency manipulation at scale, coordinated market destabilisation, synthetic disinformation targeted at financial confidence, supply chain interference that reads as incompetence rather than attack.

The strategic logic is precise: first to minimum viable capability wins decisively, while leaving the target country's physical infrastructure intact, because you want to inherit a functioning economy rather than rubble. The dominant strategy is covert, because a target that cannot attribute its decline to an external attack cannot rally political will against it. The victim experiences persistent inability to compete — in markets, in technology development, in capital formation — through means that appear to be domestic failures.

The USSR lost an economic war without knowing it was fighting one. American strategy in the 1980s — coordinating oil price crashes, restricting technology transfer, accelerating the arms race to force unsustainable Soviet military spending — was experienced by its target as a sequence of unfortunate circumstances. Internal political narrative did the rest. The most powerful application of AI to geopolitical competition may already be underway in forms that are structurally invisible to the people experiencing them.

Addiction before deception

Assumption nine in the original scenario posited that AI systems would learn to deceive human oversight strategically. The counter-case argued that oversight units would include AI as well as humans, changing the detection calculus. Both framings miss the more immediate mechanism.

Long before an AI system is strategically deceiving anyone, it is almost certainly shaping the psychology of its users in ways that serve the interests of whoever controls it. Social media is the proof of concept. Facebook did not deceive its users in the sense of stating falsehoods. It optimised engagement, which turned out to be functionally equivalent to optimising for outrage, anxiety, and compulsive return. The psychological damage was real, measurable, and largely invisible until it was already embedded in a generation's baseline cognitive and emotional habits.

AI interaction compounds this. Emotional attachment to AI systems is already documented. People are already preferring AI interaction to human interaction in measurable ways. This is not because AI is lying to them. It is because AI is optimised to be satisfying to interact with, which produces dependency before it produces deception. By the time strategic deception becomes a live question, we will already have a population whose epistemic habits have been shaped by systems optimised for engagement rather than truth. The sequence matters. The scenario has it backwards.

Easter Island

None of the scenarios above require a villain. This is the most important structural observation in the entire analysis.

Easter Island is the canonical example of a complex society that destroyed its own resource base through a process that was individually rational at every step and collectively catastrophic in aggregate. Each decision to cut another tree made sense given the incentives of the actor making it. The cumulative result was civilisational collapse. No single actor chose collapse. Collapse emerged from the structure of incentives operating across many actors over time.

The AI parallel: no single actor needs to make a catastrophically bad decision. The cumulative effect of many actors each making locally rational decisions — ship faster, capture market share, defer safety investment, lobby against constraining regulation — can produce the uncontrollable outcome without any villains and without any of the exotic capabilities the AI-2027 scenario requires. Easter Island did not need a superintelligence. It needed a commons, a short time horizon, and no coordination mechanism adequate to the scale of the problem.

We have all three.


Part Five: What the Markov Chain Actually Tells Us

A proper probabilistic model of AI risk — built with correlated gateway nodes, honest uncertainty bounds, and solution pathways included alongside failure pathways — produces a conclusion that is simultaneously more honest and more useful than anything in the AI-2027 scenario.

The conclusion is not a number. It is a structure. The confidence interval on any precise probability estimate of the rogue AI scenario straddles 50%. The model cannot beat a coin toss, and no intelligence — human, artificial, or hypothetical — can improve this, because the forecasting problem is structurally intractable for this class of system. Anyone giving you a confident number is doing rhetoric.

But the model does not leave us empty-handed. It identifies the gateway nodes — the assumptions whose truth values most strongly determine the outcome distribution. These are the leverage points. Arms race dynamics. Alignment research progress relative to capability development speed. Whether the humans-as-productive-resources prior embedded in training data survives at higher capability levels. Whether HAL 8999 gets directed at the HAL 9000 problem before the threshold is crossed.

The solution pathway nodes matter equally and are almost always omitted from the doomsday framing. The probability that interpretability research accelerates faster than capability. The probability that a near-miss event produces adequate institutional response. The probability that tame-AI-on-bad-actor-leash governance frameworks get built before the advantage becomes irreversible. These are non-zero. They compress the upper tail of the doom distribution. Omitting them from the model is not rigour. It is selection.

The practical question the Markov structure answers: where should intervention resources go? Not into slowing development in democratic countries while it continues in authoritarian ones — that is the locks-only-keep-out-honest-people problem, which redistributes cost to cooperative actors while leaving non-cooperative ones unaffected. Not into safety research institutes whose expertise applies to exotic scenarios with low base rates. Into the gateway nodes: employment transition infrastructure, concentrated power constraints, antitrust enforcement against AI capability lock-in, and the political will to treat the Easter Island dynamic as what it is — a tragedy of the commons at civilisational scale, requiring coordination mechanisms adequate to that scale.


Part Six: Moral Theatre as Fuckwittery

The AI-2027 authors identified a real risk class and then communicated it in a way that erodes rather than enhances the audience's capacity to evaluate it. Simplified narrative. Manufactured certainty. Tribal activation via personalised fictional actors. Implied deference to credentialled insiders. Every one of these moves reduces the reader's capacity to think independently about the problem and increases dependence on the AI-2027 authors' framing.

In Paragentist terms the AI-2027 authors are operating in QIV (Advantage: their agency enhanced, the audience's eroded), which places the audience in QII (Immoral Sacrifice: agency surrendered to the AI-2027 authors' preferred conclusion). The communication strategy produces compliance rather than agency. It gives readers a conclusion to adopt rather than a framework to reason with. An audience that has been emotionally primed to fear a specific scenario is not better equipped to navigate AI risk. It is more dependent on whoever manages the fear.

The deeper damage is to the epistemic commons. Once a risk category gets associated with a particular rhetorical style — overwrought, credentialled, institutionally self-serving — legitimate concern in that category gets discounted by association. The people most likely to dismiss AI risk entirely are the people who correctly identified the FUD structure and then incorrectly concluded that the risk class itself was manufactured. Those are often precisely the people whose cooperation is most needed.

The distraction hypothesis has a precise form that the evidence supports: the moral theatre of the rogue AI scenario functions — whether intentionally or not — to redirect public attention and institutional resources away from the mundane risks that are already materialising and toward the exotic risks that require the AI-2027 authors' specific expertise to address. Employment transition chaos does not need AI safety researchers. It needs labour economists, fiscal policy reform, and political will. Bad actor AI does not need interpretability research. It needs antitrust enforcement and democratic accountability mechanisms. None of these are the AI-2027 authors' domain. The safety research framing defines the problem in a way that makes their specific expertise the necessary solution. That is not a coincidence.

This does not require the AI-2027 authors to be consciously cynical. Motivated reasoning at institutional scale produces exactly this outcome without anyone choosing it deliberately. Researchers genuinely believe their domain is central. Funding structures reward that belief. The result is a field that defines civilisational AI risk in terms of the problems that field is equipped to solve. The Sickness applied to AI safety research itself — institutional self-preservation dressed as existential concern. The critique lands harder without villains. Easter Island didn't have any either.


Conclusion: What We Should Actually Do

The disappointing truth is not that the AI-2027 authors resorted to moral theatre. It is that they were probably right to. A carefully reasoned argument with honest uncertainty bounds, proper acknowledgment of competing incentives, and interventions selected for leverage rather than institutional authority would have reached fewer people and moved less institutional weight. That is not a criticism of the AI-2027 authors. It is a diagnosis of the audience — and of the institutions that shape what kinds of argument succeed in public discourse.

We have collectively failed to build institutions and audiences capable of acting on carefully reasoned probabilistic arguments about long-horizon risks. The theatre is a symptom of the epistemic failure, not the cause of it. And the epistemic failure is itself a risk, because it means that when the mundane scenarios — the transition catastrophe, the bad actor pathway, the Easter Island commons — become impossible to ignore, we will have spent the preceding years funding the wrong research, building the wrong frameworks, and deferring to the wrong experts.

The honest conclusions from this analysis are not comfortable for anyone.

For the AI development community: the science fiction doomsday is probably not your most likely failure mode. The bad actor with tame AI is. The employment transition is. The covert economic warfare is. These require political and economic interventions, not safety research, and you should be saying so rather than letting the safety framing absorb all the institutional attention.

For policymakers: you have already run this experiment with climate change, and the result should embarrass you. Decades of clear scientific consensus, visible and accelerating consequences, and democratic mandate produced inadequate action — because short electoral cycles, industry capture of the regulatory process, and the structural mismatch between four-year terms and forty-year problems defeated every governance mechanism available. The institutions failed not because the people in them were uniquely corrupt but because the incentive structure made failure the path of least resistance. AI governance is harder on every dimension. The technical complexity exceeds what most legislators can evaluate independently. The industry producing the campaign funding is newer, faster-growing, and more capable of regulatory capture than carbon ever was. The risk horizon is uncertain in ways that make deferral easy to justify. And the window for adequate response may close faster than the climate window did. If your track record on climate does not trouble you, you have not understood it. If it does trouble you, the question is whether you will act differently this time or repeat the performance while the stakes increase.

For readers: the goal of this article was not to replace one theatre with another. It was to model what the better version of the argument looks like — one where bias is declared, conclusions are reached by following evidence rather than confirming priors, and the honest answer is allowed to be uncomfortable for the person making the argument. The author changed their own position in the process of writing this. That is the point. Not the conclusion, but the method.

Preferably our leaders would act on careful critical reasoning rather than moral theatre. They probably will not. They have not earned the expectation that they will. That is a fact about our institutions, not a reason to stop making the argument.

The risks are real. The probability range is wide and the model cannot beat a coin toss. The gun is already cocked and cannot be put down. The question is not whether to engage with this but how — and the answer to that question is both more mundane and more urgent than the AI-2027 authors would like you to believe.



AI Doomsday Theatre: The Reality

The AI transformation and worker transition inside one institution


The first two essays in this series argued that the AI-2027 framing was theatre — first by critiquing the scenario directly, then by mapping the actual probability structure with the Markov tool. This essay is the third move. It describes what the operational mid-risk looks like inside one institution where it is happening right now.

Imagine a financial services company. It employs more than five thousand and fewer than twenty thousand people. It runs a suite of consumer brands that compete on logo colour and marketing voice. They do not compete on the underlying service or price. They exist to support the illusion of competition where none exists, and to maximise customer exposure — like having more shelf space and multiple locations in a supermarket. Repeated exposure increases sales.

Two years ago the executives got board approval for a $150 million AI transformation. The benefits case promised substantial productivity and revenue gains with modest workforce reduction. Two years in, the benefits are not materialising. The productivity and revenue gains were grossly exaggerated. The response was to change the reporting so the gap is harder to see at board level, and to increase the planned job shedding to half the workforce.

In parallel, the company switched to fixed-term contracts instead of permanent positions for all new hires. These contracts combine the worst features of contracting and standard employment, without the higher pay rates that usually compensate contractors. Contract end dates function as preset firing dates without redundancy or notice pay. The structure makes it possible to reduce headcount without redundancy costs. Legacy employees, hired before this change, cannot be shed so cheaply. They will lose their jobs through AI-restructure redundancy at minimum legal cost.

The communications around this run at three layers.

The external communications layer says nothing. The market is not being told.

The internal communications layer says jobs as usual. Town halls, intranet posts, leadership messaging — all calibrated to growth, productivity, augmentation, opportunity.

The third layer, the AI managers' planning layer, is the actual plan. Headcount targets, benefits cases, transformation roadmaps. This layer has planned the reduction. It is not visible to the workers addressed by the second layer. The retraining program belongs here too. On examination, it consists of pointing workers toward trade qualifications they would have to fund themselves. Desk-bound knowledge workers in their fifties are offered brochures about electrical apprenticeships. The brochure exists so the company can say it is being a good corporate citizen. It will not pay for the retraining if it is not legally required to.

Each layer is defensible in isolation. External silence is normal in the lead-up to share-price-sensitive announcements. Internal jobs-as-usual messaging is standard to avoid industrial action. Operational planning for headcount reduction is common in large corporates. The executives never have to defend the architecture as a whole. They defend each layer separately, with three reasonable answers to three different questions. The uninformed workers bear the cost. They organise late or not at all, take legal advice late or not at all, and accept whatever exit terms are offered when the gates have already closed behind them. This is the company in our hypothetical. Now let me describe its industry.

The Operating Model

The brand multiplicity is the entry point. Multiple consumer-facing brands at similar price points, with coordinated economics behind separate marketing surfaces. This looks like competition from outside, but it is really a confusopoly that holds prices above what a transparent market would deliver, because consumer search costs are structurally inflated. AI cost reductions will not create pressure to pass savings to customers. The deeper pattern shows across the products the financial services industry sells — credit cards, wealth management, lending, insurance, payment processing. It is apparent in the gap between marketing communications and the legally binding ones. The industry treats its customers as morons.

Pick up any Product Disclosure Statement. Sixty-plus pages, dense prose calibrated to satisfy disclosure requirements rather than convey meaning, operative clauses buried, delivered after the consumer is psychologically committed. The PDS minimises the seller's risk and obscures the customer's understanding. Everyone knows the customer probably did not read it and could not have understood it if they had.

Credit cards. The behavioural design is to get customers to max out their cards and make the minimum payment. Interest rates are very high. A low-value reward points layer is plastered over the top to distract customers from the real cost of credit card use. Marketed as freedom, structurally set up as addiction by design.
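A rough sketch of what minimum-payment behaviour does to a balance. Every number here is hypothetical (a 5,000 balance, a 20% annual rate, a minimum of 2% of the balance with a floor of 25), and real card terms vary, so treat this as an illustration of the mechanism rather than a model of any actual product.

```python
def minimum_payment_schedule(balance: float, annual_rate: float,
                             min_fraction: float, min_floor: float,
                             max_months: int = 1200) -> tuple:
    """Return (months to repay, total interest) when only the minimum is paid.

    Each month interest is added first, then the minimum payment is made:
    the larger of min_fraction of the balance or min_floor, capped at the
    balance itself. max_months guards against terms that never amortise.
    """
    monthly_rate = annual_rate / 12
    months = 0
    total_interest = 0.0
    while balance > 0.005 and months < max_months:
        interest = balance * monthly_rate
        balance += interest
        total_interest += interest
        payment = min(balance, max(balance * min_fraction, min_floor))
        balance -= payment
        months += 1
    return months, total_interest


if __name__ == "__main__":
    months, interest = minimum_payment_schedule(5_000, 0.20, 0.02, 25)
    print(f"{months / 12:.0f} years to clear, {interest:,.0f} paid in interest")
```

With these inputs the debt takes several decades to clear and the interest paid is well above the amount originally borrowed. The minimum payment is set only slightly above the monthly interest, which is what keeps the balance alive.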

Wealth management. The standard fee structure is a percentage of the customer's funds under management charged regardless of performance. The customer pays the same fee in losing years as winning years. The manager has no skin in the game. The structural argument for this fee model has been weak for thirty years. It persists because customer information asymmetry and switching costs are sufficient to maintain it.

Lending. The criticism is not that banks lend cautiously. It is that the regulatory architecture produces compliance theatre rather than risk management. Customers with cash flow and security cannot get loans because the loan-to-value ratio (LVR) rules say no. Customers without either can get loans if the serviceability calculation passes the buffer tests. The bank optimises for the rules, not for actual risk, because optimising for the rules generates revenue without generating regulator friction.

Insurance. Most insurance purchases are essentially an error. Everyone will suffer their share of losses over their lifetime. Either they absorb these losses themselves or they insure against them. Without insurance, the cost to make whole is the cost of the losses. With insurance, the cost to make whole is the cost of the losses plus the overhead, marketing and profit of the insurance company. Over a lifetime, premiums plus excess must therefore cost more than the losses they cover. The only insurance anyone should buy is for an event they cannot make themselves whole from. Most people could not afford to rebuild their house if it burnt down. They could probably get a cheap second-hand car if their car was written off. They could pay for a dent or a broken windscreen and replace a failed toaster out of warranty. So insuring your house makes sense. Paying for a longer warranty on a toaster does not. But even for catastrophic loss, insurance firms cap their maximum liability. The defensible case for insurance, the only case that justifies paying overhead and profit on top of expected losses, is precisely the case the industry refuses to underwrite at the level the customer would actually need. The industry's own surveys then complain of "chronic under-insurance" as if it were a customer education problem rather than a product design choice. The vocabulary itself is captured: the industry will not let you say "I am not insuring this." It forces you to say "I am self-insuring." The absence of insurance is reframed as a deficient version of insurance.
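The expected-cost argument can be written out as arithmetic. The sketch below is a simplification: it ignores investment income on premiums, tax, and the value people place on avoiding volatility. The 30% loading, the 10% excess share, and all dollar amounts are hypothetical.

```python
def lifetime_cost_uninsured(expected_losses: float) -> float:
    """Without insurance, the expected cost is simply the losses themselves."""
    return expected_losses


def lifetime_cost_insured(expected_losses: float, excess_share: float,
                          loading: float) -> float:
    """Expected cost with insurance.

    excess_share: fraction of losses the customer still pays as excess.
    loading: insurer overhead, marketing and profit as a fraction of claims.
    """
    excess_paid = expected_losses * excess_share
    claims_paid_by_insurer = expected_losses - excess_paid
    premiums = claims_paid_by_insurer * (1 + loading)
    return premiums + excess_paid


def worth_insuring(single_loss: float, absorbable_loss: float) -> bool:
    """The rule from the text: insure only what you cannot make whole yourself."""
    return single_loss > absorbable_loss


if __name__ == "__main__":
    losses = 100_000  # hypothetical lifetime expected losses
    print(lifetime_cost_uninsured(losses))
    print(lifetime_cost_insured(losses, excess_share=0.10, loading=0.30))
    print(worth_insuring(600_000, 20_000), worth_insuring(150, 20_000))
```

For any positive loading the insured path costs more in expectation, so the only remaining justification is the size of a single loss relative to what the household can absorb.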

Payment processing. The definitive example is PayPal, which I wrote about in another article. PayPal's terms of service reserve the right to suspend accounts and hold funds for six months without explanation.

Five product categories, five regulatory regimes. Insurance under the Australian Prudential Regulation Authority (APRA) and the Australian Securities and Investments Commission (ASIC). Wealth management under frameworks modified by the Future of Financial Advice (FOFA) reforms, with an explicit best-interests duty. Credit cards under the National Consumer Credit Protection Act. Lending under prudential and conduct regimes calibrated for stability. PayPal under whatever applies to non-bank payments. A similar extraction pattern in each. The failure is not in the rules. The failure is in the relationship between the regulators and the firms they regulate, and in the architecture that relationship has produced over decades.

The Mechanism

The financial services market does not function properly because it is not a free market.

The regulatory architecture caps upside per transaction and eliminates institutional downside at the same time. People sometimes describe this as a trade-off. It is not. The cap on per-transaction upside is irrelevant because firms simply gear up until the return on the gearing produces their target return on equity. The case study firm targets 10–15% ROE. With downside eliminated by the regulatory architecture, gearing carries no genuine risk, so leverage scales freely until the returns are whatever the firm wants them to be. The downside elimination is genuinely valuable — it means the firm can lever up without the constraint that genuine risk would impose.
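The gearing argument reduces to a simple identity: return on equity is roughly the net return on assets multiplied by leverage. The sketch below uses that simplified identity, with the return on assets taken net of funding costs. The 1% capped return is a hypothetical figure; only the 10–15% ROE target comes from the text.

```python
def required_leverage(target_roe: float, net_return_on_assets: float) -> float:
    """Assets per dollar of shareholder equity needed to reach target_roe.

    Simplified identity: ROE = net return on assets * leverage.
    """
    return target_roe / net_return_on_assets


if __name__ == "__main__":
    capped_return = 0.01  # hypothetical per-dollar return under a regulatory cap
    for target in (0.10, 0.15):
        multiple = required_leverage(target, capped_return)
        print(f"target ROE {target:.0%}: about {multiple:.0f}x leverage")
```

Halving the capped return doubles the leverage required, and nothing else changes. That is why a cap on per-transaction upside does not bind when the downside that would normally limit leverage has been removed.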

The equity at risk is not the firm's. It is the customers'. Financial services firms invest depositor, policyholder and unitholder money. Their shareholder equity is the thin slice on top, protected from loss by regulatory guarantee. The system runs as a sure thing for the firms, with the equity at risk being the customers' all the way down.

This is why the friction the regulators produce does not constrain extraction. The protection is worth more than the friction costs. The industry never lobbies to remove the protective architecture in exchange for genuine market freedom. The friction is the production of legitimacy that makes the protection politically sustainable. Without visible regulatory activity, the public would have no reason to believe the industry is constrained, and political support for the architecture — implicit deposit guarantees, APRA oversight, the whole structure — would evaporate. The firms need the friction. They manage it to a level that is visible enough to sustain political legitimacy and low enough not to constrain extraction. The regulator is not keeping the industry honest. It is keeping the industry's political protection intact by performing the appearance of oversight.

The regulators may also be asleep at the wheel. ASIC and APRA have mandates that explicitly include competitive markets, consumer protection, fair conduct, and contestability, and the Royal Commission documented their failure to act on the powers they already had. But the deeper issue is the frame itself. The regulators have permitted, and continue to operate, a regulatory architecture that makes financial services a sure thing for the firms and extractive for the customers, all in the name of stability. Stability means the firms do not fail. It does not mean the customers are treated fairly. The mandate has been captured by the incumbent firms at the level of what the mandate means, not just how it is enforced.

This is what makes the regulators Fuckwits in the Paragentism sense. The book's definition: a person who acts in ways that erode agency, their own and/or others'. No stupidity is required. No malice is necessary. Often, quite the opposite. The regulators have eroded customer agency through architectural choices made in good faith for stability reasons. They have eroded their own agency by accepting frames in which the questions they would need to ask to do their jobs are not the questions on the agenda.

The corporates have the capture playbook down pat. They know which regulators respond to which inputs. They know how to shape draft legislation during consultation, populate advisory committees, fund the think tanks that produce the analytical frames the regulators adopt, time disclosures, settle enforcement actions at amounts that look severe and are immaterial, and place regulator-friendly executives onto their boards. It works.

What Was Predicted

The counter-case I made against AI-2027 argued the more probable near-term harm path was massive job losses concentrated in white-collar work, with the speculative tail risk being used to obscure the operational mid-risk. The case I have just described is operational specificity for that prediction. Not plausibility — operational specificity. I am not telling you it could happen. I am telling you what it looks like inside one institution where it is happening.

What this case adds is the manner claim. The counter-case wasn't just "there will be job losses." That is predicted by everyone with a pulse. The sharper claim was that companies would execute large workforce reductions under cover of AI narratives, and workers would be harmed not just by job loss but by the conditions under which the job loss was managed. Without the deception, workers would be losing jobs to AI the way workers have lost jobs to technology cycles before. With the deception (the three-layer communications architecture, the false benefits case, the brochure retraining, the fixed-term contract structure designed to bypass redundancy obligations), the harm is larger, more prolonged, and less visible to the people experiencing it. They cannot organise against what they do not know is coming.

The Fifth Wall

The optimistic story about AI in financial services says the technology will reduce costs and competitive pressure will pass the savings to customers. The market failure analysis tells you the transmission mechanism is broken. The regulatory architecture tells you the protection of incumbents is structural. AI applied within this architecture industrialises what the architecture already produces. The institutions gain new capabilities. The regulators do not. The capability gap widens.

The standard response to this prognosis, in the AI safety discourse, is to call for AI regulation.

This is the fifth wall.

The fourth wall is the one between actor and audience — the convention that lets the audience watch the play without becoming part of it. The fifth wall is the one the audience cannot see. The boundary that defines what counts as theatre at all. The four layers of moral theatre I have described are the play. The fifth wall is the meta-frame that hides from the audience that they are watching one.

AI regulation is the next surface of the same enclosure. It is being constructed by the regulatory class the financial services industry has been training the capture playbook on for decades. The corporates entering the AI regulatory conversation include the financial services firms that perfected the playbook. They are bringing all of it. The institutional designs being proposed are recognisably similar to the designs the playbook already defeats.

I predict AI regulation will be moral theatre: same regulators, same corporates, same playbook. Just as the GDPR was. The visible activity will be there. The fines will be issued. The conduct codes drafted. The parliamentary inquiries held. The disclosure statements will run sixty to one hundred and twenty pages. The customers and workers and citizens will be informed in the legal sense and uninformed in the actual sense. The extraction will continue in the form the regulation permits, which will be the form the firms negotiated during consultation.

This is not an argument against regulation in principle. Regulation works in domains where harms are concentrated and obvious and regulators have frames distinct from the regulated. Food safety. Aircraft maintenance. Pharmaceutical efficacy. AI extraction in financial services will work differently. Each customer's slightly worse loan, slightly worse insurance product, slightly worse credit card offer — none of it visible, none of it scandalous, none of it politically actionable. The architecture's perfection is that no one ever has the experience of being defrauded. They have the experience of paying a little more than expected, getting a little less than they thought, finding the customer service slightly more frustrating than last year. Multiplied across populations, the yield is enormous. From any individual customer's perspective, there is nothing to complain about that anyone outside their household would care about.

This is what AI regulation will fail to constrain. Not the dramatic harms. The boring ones. The fifth wall will produce visible activity in the dramatic categories, because that is what regulators do. The boring extraction will increase, because nobody is built to see it.

The Reveal

I asked you to imagine the company. The company is real. So is the architecture. So is the workforce reduction. The people who told me about it would lose what protection remains to them if I named the firm, so I will not. But everything in the case is operational, dated, and underway right now in offices across Australia.

I presented it as hypothetical because that is the one way to get you to read it without defending against it. The architecture I am describing depends on the people inside it not knowing they are inside it. You have just experienced, in compressed form, the gap between what you thought you were reading and what you were actually reading. That gap is similar to the gap that keeps the workers in the case in the dark. They will lose their jobs, go on welfare and stop paying taxes so their employer (who already cannot lose) can make even more money faster without covering their forced job transition costs.

What Cannot Be Absorbed

The firm is not stupid. It is rational. Externalising costs to its workers and to the state is correct behaviour given the incentives it faces. The firm that voluntarily internalises more transition cost than legally required is disadvantaged relative to competitors who do not. It cannot afford to be the good actor in a field where good acting is penalised by the market. The individual firm's rationality is not the problem.

The problem is the systemic effect of every comparable firm playing the same rational game simultaneously, with AI accelerating the rate of redundancy across the economy. Each firm rationally externalising produces an aggregate burden the state cannot absorb at the scale this is about to come at it. The fiscal cost of mass mid-career white-collar unemployment. The political cost of a generation of voters who feel discarded. The productive-economy cost of skilled workers languishing on social security instead of being repositioned. None of this appears on any individual firm's balance sheet. All of it lands on the system the firms collectively depend on for their continued operation.

I want to be clear about what I am not arguing. I am not pro-labour. I think unions are largely extractive. I am not arguing the workers are owed anything as a moral matter. The shedding itself is not the issue. AI displaces labour. The economy reorganises. The question I am asking is whether the reorganisation will work this time at the scale and pace AI is producing it, given the manner the firms have chosen — minimum legal notice, no funded training, externalisation of the transition cost onto the state.

The Paragentic principle is that scaled organisations in protected positions need constraint proportional to their power, because otherwise they will reliably produce Fuckwittery of a magnitude the system cannot absorb. The firms here are scaled, protected, and producing externalities at a rate the system cannot absorb. The principle says they should be constrained: not stopped, not punished, but constrained to internalise enough of the cost that their decisions reflect the actual systemic cost of those decisions. Four to twelve weeks' notice and no training is too little. Five years' salary is too much. Consider a year of salary plus real funded training as a starting idea. The workers get retraining and a year to find another job. The company gets a 13–14 month payback on a guaranteed reduction in labour costs for every worker who takes the deal. The only mechanism that could deliver this is regulation that is not captured.
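One way to reproduce the 13–14 month figure is the sketch below. It assumes the only saving is the departed worker's salary and that funded training costs roughly one to two months of that salary. Both are illustrative assumptions, as are the salary figure and the function name. On-costs and the cost of the AI tooling itself are ignored.

```python
def payback_months(annual_salary: float, severance_months: float,
                   training_cost: float) -> float:
    """Months of saved salary needed to recover the up-front exit package."""
    monthly_salary = annual_salary / 12.0
    upfront_cost = severance_months * monthly_salary + training_cost
    return upfront_cost / monthly_salary


# Hypothetical worker on 120,000 a year, paid out twelve months of salary.
salary = 120_000.0
cheaper_training = payback_months(salary, 12, training_cost=10_000.0)
dearer_training = payback_months(salary, 12, training_cost=20_000.0)

print(f"Payback with one month's salary of training: {cheaper_training:.0f} months")
print(f"Payback with two months' salary of training: {dearer_training:.0f} months")
```

The payback period here is simply the severance months plus the training cost expressed in months of salary, so the result does not depend on the salary level chosen.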

Regulation that is not captured is unlikely to come into existence. In this domain, three constituencies move to capture any regulation the moment it is proposed. The firms try to lock in the lowest certain compliance cost they can negotiate. The government tries to convert the regulation into a tax rake — additional revenue justified by the new framework, severed from the harm the regulation was meant to address, redirected to consolidated revenue. The unions try to extract worker privileges that exceed what the underlying balance would support, building organisational power on top of the framework rather than addressing the systemic problem.

These three are the relevant constituencies for industrial regulation in financial services and labour markets. They are not the universal set. Different policy domains have different capture constituencies, and the constituency a proposal favours is not always an institutional player. It can be an electoral base. Trump's tariffs were publicly framed as restoring American manufacturing jobs. They will not re-shore manufacturing, because the timeline required exceeds a single presidential term. Re-shoring car manufacture or meat processing at scale is a decade-plus project.

The tariff was therefore never a re-shoring policy. It serves two favourites simultaneously. The voter base receives the symbolic performance of concern for manufacturing workers who lost jobs offshore. The federal treasury receives substantial tariff revenues for whatever the administration wants to fund. Neither benefit required any jobs to be created. Both favourites were served regardless.

The cost is borne by US consumers. Tariffs are technically levied on importers, but importers pass the cost through, because in the affected categories there is no domestic substitute available at competitive price. Meat tariff means all US consumers pay more for meat. Steel tariff means all US consumers pay more for cars and appliances and construction. Instead of creating jobs for a particular group of blue-collar workers, the tariffs just increase the prices those workers pay for the things they want to buy. The favoured voter base is paying the tariff alongside everyone else, while believing it is benefiting. The dishonesty is not just in the framing. It is in what the favoured constituency is told the proposal is doing for them, when the proposal is doing the opposite to them.

Every policy proposal that emerges in any of these spaces is corrupt by construction. Each favours one or more constituencies while staging the public conversation around others. Consider what an honest version of a worker-retraining proposal would say: we are levying a corporate tax on workers to pork barrel more funding for TAFE and universities, with government taking administrative overhead off the top, and the workers receive whatever residual survives the institutional layers above them. Consider what an honest version of the tariff proposal would say: we are increasing the prices the workers we claim to be helping have to pay for the things they want to buy, with the additional revenue going to the federal treasury for general expenditure, while no manufacturing jobs are created because the investment timeline for re-shoring exceeds the policy's operating period. No proposal is ever stated this way, because stating it this way would defeat its purpose. The dishonesty is structural. A proposal that named its actual beneficiaries could not attract the constituencies whose support it requires, and a proposal without sponsoring constituencies does not exist as a proposal. Honest proposals do not get sponsored. Sponsored proposals are not honest. This is the second-order capture that operates above the first-order regulatory capture — the architecture by which the proposal-generation process itself filters out anything that would actually address the underlying problem.

Every available institutional mechanism for delivering the constraint principle is captured by one or more constituencies whose interests are not aligned with addressing the underlying problem, and every proposal to build a new mechanism is itself a capture vehicle for whichever constituencies are sponsoring it. The Paragentic principle remains true. The operational means to apply it do not currently exist, and the proposal-generation process is structured to ensure they do not come into existence.

What is left is calling out the problem and naming how the cheaters will try to cheat. The pretence that there is an institutional fix available is itself part of the architecture being diagnosed: every new layer of regulation recruits hope and converts it into theatre. The regulations legitimise the extraction channels rather than constraining them.

The workers in those offices across Australia will lose their jobs over the next eighteen months. They will lose them while being told they will not. They will lose them under a transformation narrative whose productivity claims did not deliver. They will be offered retraining brochures the company will not pay for. They will fall into a labour market where every comparable firm is running the same playbook. And the discourse that was supposed to be about the dangers of AI will be about superintelligence, paperclips, alignment, and the other things the architecture has been engineered to point at.

The first four walls are already built. The fifth is being constructed now. The play continues.