What is Loss of Control?

Introducing a novel, three-tier taxonomy for loss of control.


In this Chapter, we put forward a novel taxonomy of LoC.

In order for us to arrive at this taxonomy, we first reviewed a range of existing definitions of LoC across the broader AI literature. Our goal from this initial review was to understand whether there exists sufficient consensus within AI literature around a definition of LoC that could be operationalized by decision- and policymakers today. As we will explain below, we did not find satisfactory results in that direction. Instead, we found that existing definitions of LoC in literature differ across a range of axes and could be interpreted to cover different spectra of outcomes. We subsequently examined whether LoC definitions and terminology in other safety-critical sectors, such as defense, aviation, and nuclear, could transfer to AI and provide additional interpretative lenses. As this review also failed to provide definitive answers on an actionable definition of LoC for AI, we decided to home in on definitions characterized by their common element of having been shaped by large-scale multi-stakeholder consultations and consensus-driven processes—the EU AI Act’s General-Purpose AI Code of Practice (COP, (EU General-Purpose AI Code of Practice 2025)) and the International AI Safety Report (IASR, (Bengio, Mindermann, et al. 2025)). After observing significant differences between the definitions espoused in these two publications, we turned towards a comprehensive review of literature relevant to LoC, alongside an assessment of the scenarios contained therein.

Concretely, based on this process and its outcomes, we derived a novel conceptualization of LoC and inferred that there are at least three broad categories. We taxonomize these three as Deviation, Bounded LoC, and Strict LoC. They are conceptualized as follows:¹

Deviation captures events that cause some harm or inconvenience, but which are relatively easy to contain.
Bounded LoC captures events that can cause great damage or suffering, and are difficult, but possible, to contain, albeit potentially at great cost.
Strict LoC captures events that are maximally severe and permanent, such as events that result in humanity as a whole becoming extinct.

The remainder of Chapter 1 will describe our overall process, methodology, and how we arrived at this taxonomy and its implications in more detail.

1.A.1 Findings from Loss of Control Definitions in Literature

A range of governance frameworks and research papers have sought to define LoC in the context of highly advanced future AI systems. Yet, there exists no overarching consensus on the precise meaning of LoC, as we shall demonstrate, posing a challenge for decision- and policymakers alike. We briefly present a selection of definitions from the literature and describe several key learnings from our review, before delving into more depth on two definitions in particular.

In general, LoC definitions refer to situations in which humans lose the ability to effectively manage, direct, or intervene in the operation of increasingly capable AI systems. The following is a non-exhaustive range of examples:

The Singapore Consensus on Global AI Safety Research Priorities defines LoC as “... scenarios where advanced AI systems – such as AGI – come to operate outside of human control, with no clear path to regaining control” (Bengio, Maharaj, et al. 2025).
The Gladstone Action Plan defines LoC as a “failure mode under which a future AI system could become so capable that it escapes all human efforts to contain its impact” (Harris et al. 2024).
An emergency preparedness report by RAND defines LoC as “situations where human oversight fails to adequately constrain an autonomous, general-purpose AI, leading to unintended and potentially catastrophic consequences” (Somani et al. 2025).
Legislation recently introduced by Senators Hawley and Blumenthal defines LoC as “a scenario in which an artificial intelligence system: behaves contrary to its instruction or programming by human designers or operators; deviates from rules established by human designers or operators; alters operational rules or safety constraints without authorization; operates beyond the scope intended by human designers or operators; pursues goals that are different from those intended by human designers or operators; subverts oversight or shutdown mechanisms; or otherwise behaves in an unpredictable manner so as to be harmful to humanity” (Artificial Intelligence Risk Evaluation Act 2025).
The consensus statement signed by global experts from academia, AI companies, and independent organizations as a result of the 2025 International Dialogues on AI Safety (IDAIS) in Shanghai describes loss of control as a situation in which “one or more general-purpose AI systems come to operate outside of anyone’s control, posing catastrophic and existential risks” (International Dialogues on AI Safety (IDAIS) 2025).

Despite certain commonalities, such as concerns about AI systems operating beyond the scope of reliable human direction or oversight, definitions in the broader AI literature differ in their emphasis and framing (Bernardi et al. 2025; Kulveit et al. 2025; Department for Science, Innovation and Technology (DSIT) 2023; Cass-Beggs et al. 2024)² thereby presenting conceptual distinctions. Divergences between definitions can make it difficult to implement functional frameworks and enact suitable interventions. One axis of variation relates to the cognitive capabilities required by an AI system for LoC. Some researchers appear to suggest LoC may occur once AI systems become more intelligent than humans (Hendrycks et al. 2023; Bourgon 2024). Others more explicitly refer to artificial superintelligence when describing LoC risk (Pavel et al. 2025; Barnett and Scher 2025). In contrast, Stuart Russell’s definition in ‘Artificial Intelligence and the Problem of Control’ requires only a “sufficiently capable machine” (Russell 2022), rather than one explicitly surpassing human intelligence or achieving superintelligence. A further divergence across definitions involves the difference in treatment between oversight and control. While some definitions refer to control (Bommasani et al. 2025; International Dialogues on AI Safety (IDAIS) 2025), others emphasize oversight instead, focusing on AI systems that “operat[e] beyond human oversight” (METR 2025a). Another axis of variation concerns the possibility of regaining control. For example, the COP (EU General-Purpose AI Code of Practice 2025) does not necessarily imply that LoC is irreversible, whereas the IASR (Bengio, Mindermann, et al. 2025), the Singapore Consensus (Bengio, Maharaj, et al. 2025), and other reports (Bengio et al. 2024; Hendrycks et al. 2023) all suggest that LoC involves the absence of a clear path to regaining control. 
These interpretations imply that once control is lost, recovery may be impossible or extremely difficult. Overall, we found that while definitions aim to cover similar areas, they do so in sufficiently diverse ways to prevent a common, clear definition of LoC from arising from the AI literature.

We observe that the lack of consensus surrounding one concrete definition of LoC matches what we learned from reviewing the concept of LoC in other safety-critical sectors. Specifically, we found that the meaning of LoC is not unified across different sectors, and spans from unauthorized access to personally identifiable information (Office of Management and Budget 2017; National Institute of Standards and Technology (NIST); Cybersecurity and Infrastructure Security Agency (CISA) 2021) in the cybersecurity sector, to a yaw motion leading to a deviation from a driver’s intended path (Federal Motor Vehicle Safety Standards; Electronic Stability Control Systems for Heavy Vehicles 2015) in the automotive sector, to the uncontrollable diffusion of chronic/latent pathogen infections (U.S. Food and Drug Administration 2023) in the pharmaceutical sector.

In the majority of cases, LoC takes on different meanings even within the same sector, depending on the circumstances. For instance, in the defense sector, the term LoC can refer to adversaries gaining control of an autonomous or nuclear weapon (U.S. Department of Defense 2017, 2020, 2016, 2014, 2006), to information being lost, stolen, or compromised (U.S. Department of Defense 2007), to mishaps and near-mishaps encountered during air combat (U.S. Department of the Navy 2009), and to interference with command and control (Office of the Chairman of the Joint Chiefs of Staff 2021; National Institute of Standards and Technology (NIST) 2003). In aviation and space, LoC can refer to an aircraft’s deviation from the intended flightpath (Federal Aviation Administration 2019, 2017b, 2017a, 2021; CAST/ICAO Common Taxonomy Team (CICTT) 2013; European Union Aviation Safety Agency 2025; Russell and Pardee 2000), and to a remote pilot not being able to command an unmanned aircraft (14 CFR § 107.19 – Remote Pilot in Command, n.d.; Sakakeeny et al. 2024, 2022; Hayashi et al. 2022), but also to an uncontrollable space station following collision with debris or meteoroids (47 CFR § 97.207 – Space Station, n.d.; 47 CFR § 5.64 – Special Provisions for Satellite Systems, n.d.; 47 CFR § 25.114 – Applications for Space Station Authorizations, n.d.). In the nuclear sector, LoC can refer to nuclear chain reactions leading to an explosion, such as at the Chernobyl Nuclear Power Plant ((International Atomic Energy Agency 2025); Requirement 16, (International Atomic Energy Agency 2006); (U.S. Nuclear Regulatory Commission 1991)), or can also refer to the loss of licensed radioactive sources, such as nuclear fuel ((10 CFR § 20.2202(b) – Notification of Incidents, n.d.); (U.S. Nuclear Regulatory Commission 2004); (U.S. Nuclear Regulatory Commission 1998); (10 CFR § Chapter i NRC Enforcement Policy 2011); Requirement 80, (International Atomic Energy Agency 2006); (Ortiz et al. 2002); (Agency 2004)).
The divergent and broad nature of definitions of LoC across and within other safety-critical sectors limits their usefulness for the AI sector. The plausible future nature of AI as an agentic system ‘driving’ LoC presents a particular nuance that other sectors have not yet meaningfully accounted for.

In order to have some anchor point for our work, we ultimately chose to home in on the definition in the COP (EU General-Purpose AI Code of Practice 2025) and the definition in the IASR (Bengio, Mindermann, et al. 2025). We focus on these two definitions because they have been developed by a broad range of technical and policy stakeholders, undergoing multiple iterations and consensus-building, which speaks to the inclusion of diverse views already within each definition. In the case of the COP, the LoC definition also underpins concrete regulatory interventions.

The EU AI Act’s Code of Practice for General-Purpose AI Models defines LoC as “risks from humans losing the ability to reliably direct, modify, or shut down a model” (COP, (EU General-Purpose AI Code of Practice 2025)).
The International AI Safety Report defines LoC as “...scenarios in which one or more general-purpose AI systems come to operate outside of anyone’s control, with no clear path to regaining control” (IASR, (Bengio, Mindermann, et al. 2025)).

In reviewing these two definitions specifically, we make two observations. First, we note that both definitions can be interpreted to imply a spectrum of outcomes across both scale and severity. For instance, the COP definition can be interpreted to capture both more or less consequential outcomes, all arising from an inability to reliably affect or shut down an AI system. Similarly, across a spectrum of severity and persistence, under the IASR definition, LoC occurs if humanity has “no clear path to regaining control,” which leaves open the question whether a scenario in which regaining control is extremely difficult and costly (but still possible) falls within the LoC definition.

Second, we note that the two definitions implicitly differ in the expected timelines of the LoC outcomes they aim to capture. Under the COP definition, LoC materializes when humans lose the ability to “reliably direct” an AI system. Arguably, based on this definition, we can already see instances of LoC occurring today. For instance, in a situation where an AI system cheats to obtain artificially high scores on a test, one could argue that the user is unable to reliably direct the AI system to complete the task (METR 2025b). These occurrences are no longer limited to testing environments; for example, a coding AI system recently wiped an entire production database, despite being explicitly instructed not to make any changes (Okunytė and Ancell 2025). While such an outcome would be captured under the COP definition, it would not be captured by the IASR definition. By contrast, the IASR definition seems to require that there be “no clear path to regaining control” for a scenario to constitute LoC. The definition therefore captures only scenarios in which regaining control is, if not impossible, at least very difficult and costly. Our previous example would not be covered by this definition, even under its broadest interpretation, since regaining control was simple despite the AI system not acting reliably. In other words, it seems implausible that AI systems already on the market would be captured by the IASR definition, whereas this does not seem implausible for the COP definition. We therefore conclude that, despite each having undergone a broad consensus-building process, the two definitions are at odds across certain dimensions.³

The aforementioned challenges and limitations surrounding the definition of LoC for AI systems prompted us to review existing AI literature describing LoC scenarios and their outcomes. The goal of that review (see Section 1.A.2) was to produce a more comprehensive and nuanced picture of what AI researchers mean when they discuss LoC non-abstractly. By extension, we expected that this additional review could serve to supplement leading multi-stakeholder-developed definitions, such as those in the COP or IASR, enable a more lucid dialogue for decision-makers, and allow for more specification to enable actionable measures against what we eventually found to be different categories of LoC.

1.A.2 Findings from Loss of Control Scenarios in Literature

Our literature review was motivated by the goal of adding nuance and clarity to the concept of LoC, and by doing so, assessing which outcomes the field considers to be LoC and whether there are any overarching and notable commonalities that can further advance the conceptualization of LoC beyond existing efforts. We describe our methodology and process first, and then we present our findings.

1.A.2.a Methodology

In total, we reviewed 130 works from academia, international think tanks, and government agencies (see Appendix 2.1 for the full list).⁴ We subsequently applied three filters to these works in order to arrive at comparable and informative data points: (i) do they contain a scenario; (ii) does the scenario concern LoC; (iii) is the LoC outcome sufficiently concrete for us to derive learnings from. We describe each step in more detail next.

First, we filtered works into two categories: those that contained scenarios and those that did not. In order to count as a scenario, a text passage within a given piece of literature had to pass our causal detail criterion. In other words, a text passage was considered a scenario if it could fulfill the causal detail criterion by either: (i) containing a detailed narrative description of the events leading up to the outcome; or (ii) giving an abstract but highly detailed logical argument about how an AI system could cause a certain outcome.

Our next filter assessed whether the scenarios could be reasonably considered LoC scenarios, as most reviewed works did not, in fact, mention this keyword.⁵ We assessed the scenarios against four definitions across four governance documents, previously described in Section 1.A.1. Specifically, we assessed them against the definitions in the AI Act’s Code of Practice for General-Purpose AI Models (EU General-Purpose AI Code of Practice 2025), the Artificial Intelligence Risk Evaluation Act of 2025 (Artificial Intelligence Risk Evaluation Act 2025), the International AI Safety Report (Bengio, Mindermann, et al. 2025), and the Singapore Consensus on Global AI Safety Research Priorities (Bengio, Maharaj, et al. 2025). In selecting these four definitions, we aimed to achieve a balance of definitions that are regulatory in nature and definitions that have directly benefited from a broad-ranging, international, and multidisciplinary perspective, developed by contributors from across industry, academia, and government. If a scenario could reasonably be captured by any of these four definitions, it passed our LoC filter. This left us with 40 individual LoC⁶ scenarios in total.

Our final filter for these LoC scenarios was to assess whether we could deem them sufficiently concrete to allow for meaningful comparison between scenarios. In order to find a reasonable and overarching filter for assessing sufficient concreteness, we drew on two dimensions we identified as present across all LoC scenarios. We conceptualize these two metrics as follows: (1) severity, capturing how many people are affected and the degree to which they are affected; and (2) persistence, capturing the difficulty of interrupting the ‘harm trajectory’ of the scenario.⁷ More concretely, we conceptualize persistence as consisting of two components: (i) the difficulty of preventing the AI system from taking additional unintended actions that cause further harm; and (ii) the difficulty of interrupting the immediate harmful process initiated by the AI system.

We propose that both metrics—severity and persistence—can be captured by economic impact as a proxy measure and expand on our reasoning in more detail in the subsequent paragraphs.

For severity, we propose that economic impact can serve as a suitable proxy because the more people are affected and the more severely they are affected, the more plausible it is that the disaster will have had a larger economic impact. Various works in academic literature offer evidence to support our claim that severity can be expressed by economic impact, for example, in events such as power-grid interruptions (Larsen et al. 2025), pandemics (König and Winkler 2021; Morgenstern et al. 2024), wars (Mueller and Tobias 2016; Novta and Pugacheva 2020), and natural disasters (Cavallo et al. 2021), including, for example, hurricanes (Hsiang and Jina 2014; Acevedo 2016; Murnane and Elsner 2012; Huang et al. 2024). Similarly, adverse effects of climate change (Nordhaus 2013; Hsiang et al. 2017) and health impacts, such as the burden of pollution-related health impacts, are often expressed in economic terms (Deryugina et al. 2019; Chang et al. 2016). As these events escalate, their economic impacts also escalate.

For persistence, we propose that economic impact is a suitable proxy because it is plausible that more persistent scenarios take longer to interrupt, and, in turn, scenarios that take longer to interrupt can plausibly be considered to have a larger economic impact. Scenarios that are more persistent have harm trajectories that are more difficult to interrupt. In other words, it is more challenging to interrupt both the AI system that causes the scenario and the immediately harmful process that results from the AI system’s actions. Persistence across the literature appears to be correlated with the relative ease or difficulty of resolving hindrances to interrupting the harm trajectory, such as an absence of appropriate technical knowledge, other resource constraints, or coordination failures. By their nature, these hindrances generally present time-consuming hurdles that prevent a swift interruption of the harm trajectory. We propose that it is reasonable to assume that scenarios where the harm trajectory takes longer to interrupt have a greater economic impact, as the occurrence of harm spans a longer time horizon. Various works in academic literature offer evidence to support our claim, indicating a causal link between the duration of an incident and the economic impact it causes. For example, there is some evidence that the economic impact of electricity blackouts rises with the duration of the blackout (West 2018; Sullivan et al. 2015) and that longer-duration heatwaves are associated with a higher economic impact (Costa et al. 2024), and the literature suggests that longer-lasting recessions (Haltmaier 2013) and bank crises (Hoggarth et al. 2001) have larger impacts on the economy.

Our conceptualization and contextualization of severity and persistence led us to adopt economic impact, which is standardized and readily available, as a proxy measure and as the concreteness criterion for identifying concrete LoC scenarios.

Therefore, in order to be deemed concrete, a LoC scenario had to contain sufficient detail about its LoC outcome for us to estimate the economic impact of that outcome.⁸ A given LoC scenario could meet the concreteness criterion in one of two ways: (i) if we were able to match it to a pre-existing economic estimate for the same or a highly similar scenario in the literature, for example, one derived from academic literature or third-party research reports (e.g., Vest et al. 2022; Posner 2004); or, alternatively, (ii) if we were able to make our own back-of-the-envelope calculation (BOTEC) of the economic impact.⁹ For these BOTECs, we made appropriate assumptions about a given scenario (about, for instance, the scale or location of the outcome) that either (ii.a) allowed us to match the scenario to the same or a highly similar scenario with pre-existing economic impact estimates; or (ii.b) enabled us to leverage existing calculations to derive economic impact estimates where no corresponding example was found in the literature (see Appendix 2.2 and 2.2.1).¹⁰

In total, we found 12 text passages that fulfilled all of our criteria and were therefore classified as concrete LoC scenarios.
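
The three filters described above compose into a simple funnel. The following Python sketch is illustrative only and is not the tooling used for the review: the `Passage` record, its field names, and the sample entries are hypothetical, and each boolean stands in for a human judgment made while reading a work.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Passage:
    """A text passage from a reviewed work (hypothetical record layout)."""
    source: str
    has_causal_detail: bool                # filter (i): narrative or detailed logical argument
    matches_loc_definition: bool           # filter (ii): captured by any of the four definitions
    economic_impact_usd: Optional[float]   # filter (iii): matched estimate or BOTEC, if one exists


def funnel(passages):
    """Apply the three filters in order and report how many passages survive each."""
    scenarios = [p for p in passages if p.has_causal_detail]
    loc_scenarios = [p for p in scenarios if p.matches_loc_definition]
    concrete = [p for p in loc_scenarios if p.economic_impact_usd is not None]
    return {
        "scenarios": len(scenarios),
        "loc_scenarios": len(loc_scenarios),
        "concrete_loc_scenarios": len(concrete),
    }


# Made-up sample data, for illustration only.
sample = [
    Passage("work-a", True, True, 2.0e9),
    Passage("work-b", True, True, None),
    Passage("work-c", True, False, None),
    Passage("work-d", False, False, None),
]
print(funnel(sample))
```

Applied to the reviewed literature, the last two stages of such a funnel correspond to the 40 LoC scenarios and the 12 concrete LoC scenarios reported here.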

Once we completed this categorization, we filed the 12 concrete LoC scenarios within their respective ‘threat category.’ We use threat categories as a classification that groups LoC scenarios based on the outcome of the scenario, rather than narrative contributors to a given scenario. We found that the 12 concrete scenarios fell within the following threat categories:

disruption of critical national infrastructure (CNI; (PPD-21 2013); (National Protective Security Authority (NPSA) 2025)) (2 scenarios);
engineered pandemics (2 scenarios);
grand-scale conflict/war (1 scenario);
cybersecurity incident (1 scenario);
economic disruption (1 scenario);
human manipulation (1 scenario); and,
human extinction (4 scenarios).¹¹
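
As a quick consistency check, the counts in the list above can be tallied. This is a minimal Python sketch; the dictionary keys simply restate the threat categories, and nothing beyond the listed counts is assumed.

```python
# Scenario counts per threat category, as listed above.
threat_categories = {
    "disruption of critical national infrastructure": 2,
    "engineered pandemics": 2,
    "grand-scale conflict/war": 1,
    "cybersecurity incident": 1,
    "economic disruption": 1,
    "human manipulation": 1,
    "human extinction": 4,
}

total = sum(threat_categories.values())
largest = max(threat_categories, key=threat_categories.get)

print(total)    # number of concrete LoC scenarios across all categories
print(largest)  # category with the most scenarios
```

The tally matches the 12 concrete LoC scenarios, with human extinction as the largest single category.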

Next, we plotted all 12 concrete LoC scenarios on an experimental graph, using severity and persistence as the axes,¹² and color-coded them by their respective threat categories (see Figure 1). In doing so, we noted that certain areas emerged from our plots and sought to contextualize them within concepts and frameworks that already exist in governance and with which decision- and policymakers may be familiar.

Figure 1: Scatter plot positioning 12 loss-of-control scenarios by severity and persistence. The distribution in this graph covers all 12 concrete LoC scenarios derived from the literature, plotted by severity and persistence using economic impact as a proxy measure (both axes in arbitrary units 0-100).¹³ The colors of scenarios indicate the relevant threat category: human extinction, human manipulation, economic disruption, cybersecurity incident, grand-scale conflict/war, engineered pandemic, or disruption of critical national infrastructure. Where multiple economic impact estimates for a scenario were available, error bars represent 50% confidence intervals calculated using the t-distribution. For several scenarios, these error bars are too small to be visible on this log graph. For two scenarios, there are no error bars because only one estimate was available.
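
For readers unfamiliar with the error bars, the following Python sketch shows one standard way to compute a 50% confidence interval for a mean using the t-distribution. It is an assumption-laden illustration, not the calculation behind the figure: the function name `ci50` is hypothetical, the critical values are rounded entries from a standard t-table and cover only up to six estimates, and the sketch works on raw values, whereas the report does not state whether intervals were computed on raw or log-transformed estimates.

```python
import math
import statistics

# Two-sided 50% critical values t_{0.75, df} from a standard t-table (rounded),
# keyed by degrees of freedom (df = number of estimates - 1).
T_CRIT_50 = {1: 1.000, 2: 0.816, 3: 0.765, 4: 0.741, 5: 0.727}


def ci50(estimates):
    """Return (lower, upper) of a 50% CI for the mean, or None for a single estimate."""
    n = len(estimates)
    if n < 2:
        return None  # a single estimate gives no error bar
    df = n - 1
    if df not in T_CRIT_50:
        raise ValueError("critical value not tabulated for this sample size")
    mean = statistics.mean(estimates)
    standard_error = statistics.stdev(estimates) / math.sqrt(n)
    half_width = T_CRIT_50[df] * standard_error
    return (mean - half_width, mean + half_width)


# Made-up impact estimates (arbitrary units), for illustration only.
print(ci50([1.0, 3.0]))
print(ci50([5.0]))
```

With only a handful of estimates per scenario, the interval width is driven largely by how far apart the individual estimates are.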

First, we sought to contextualize the area indicated in the lower left of our graph (see Figure 1). In doing so, we came across the economic consequence thresholds that the U.S. Department of Homeland Security (DHS), the Intelligence Community, and other components identified as “necessary to create a national level-event” in the Strategic National Risk Assessment (SNRA) (U.S. Department of Homeland Security 2011). Specifically, in 2011, DHS led an effort to “identify the types of incidents that pose the greatest threat to the Nation’s homeland security,” including various natural, technological, and adversarial, human-caused hazards (U.S. Department of Homeland Security 2011).¹⁴ For example, earthquakes, floods, hurricanes, wildfires, and cyberattacks against physical infrastructure meet the threshold if they result in direct economic losses exceeding $100 million (U.S. Department of Homeland Security 2011). Similarly, a cyberattack against data¹⁵ is considered a national-level event if it results in economic losses of $1 billion or more. In this research report, we adopt this last threshold, set by the SNRA at $1 billion or greater (which we adjust for inflation to approximately $1.4 billion), to provide some contextualization of LoC scenarios (see Figure 1). For convenience, we refer to this $1.4 billion threshold as the ‘national risk assessment threshold.’ In summary, the national risk assessment threshold is the highest threshold based on economic consequences that DHS, the Intelligence Community, and other components identified in the Strategic National Risk Assessment to demarcate a national-level event (i.e., a threat or hazard that has the potential to significantly impact U.S. homeland security). We indicate this threshold in blue in Figure 1.
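
The threshold arithmetic can be made explicit in a short Python sketch. The inflation factor of 1.4 is an assumption read off the rounded figures in the text ($1 billion in 2011 becoming approximately $1.4 billion); the price index and target year behind that adjustment are not stated here, and the function names are hypothetical.

```python
SNRA_THRESHOLD_2011_USD = 1_000_000_000  # "$1 billion or more" (SNRA, 2011)
ASSUMED_INFLATION_FACTOR = 1.4           # assumption: implied by the rounded $1.4 billion figure


def national_risk_assessment_threshold(
    base_usd=SNRA_THRESHOLD_2011_USD, factor=ASSUMED_INFLATION_FACTOR
):
    """Inflation-adjusted threshold in present-day US dollars."""
    return base_usd * factor


def meets_threshold(economic_loss_usd):
    """True if a loss is at or above the adjusted threshold ("or more" is inclusive)."""
    return economic_loss_usd >= national_risk_assessment_threshold()


print(meets_threshold(2.0e9))  # a $2 billion loss
print(meets_threshold(5.0e8))  # a $500 million loss
```

Keeping the base figure and the factor separate makes it easy to substitute a different index or reference year.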

Second, we sought to contextualize the area indicated in the top right of our graph (see Figure 1). In doing so, we drew on existing conceptualizations describing events of absolute scale and permanence, that is, events that could lead to the destruction of humanity’s long-term potential (Ord 2020; Sundaram and Mani 2025; Stauffer et al. 2023; Wynne and Derr 2025).¹⁶ These events are commonly referred to as an ‘existential catastrophe,’ and we adopted this term to establish our upper boundary. In summary, existential catastrophe demarcates the point at which humanity loses control over its future in an absolute sense. We indicate this threshold in bold red dashes in Figure 1.

Reflecting on our methodology and choices, we note that while economic impact was the most tractable proxy for making calculations based on the LoC scenarios in the literature, it may not be the perfect proxy. We invite future scholarly research to establish a more refined methodology and calculations.¹⁷ We summarize our learnings next.

1.A.2.b Reflections

The aforementioned methodology allowed us to visually locate concrete scenarios derived from our literature review. The final plots helped identify the locus of attention that scholars have paid to LoC in literature, as well as its relation to existing conceptualizations of risk, such as existential catastrophe or a national risk assessment threshold.

Through the analysis enabled by this methodology, we found that:

There exists a comparatively low number of concrete LoC scenarios in literature, vis-à-vis the total number of scenarios: only 12 concrete LoC scenarios out of a total of 40 LoC scenarios. This finding underscores the importance of additional research and conceptualization for LoC to provide a clearer and more comprehensive picture for decision- and policymakers to action.
Of the scenarios that were concrete and plotted on the graph, all clustered above a certain magnitude of severity and persistence. It appears that LoC is predominantly used to refer to scenarios with an impact above a certain level. This level of impact does not correspond to any scenarios already encountered in the wild today. We therefore propose to infer that LoC does not capture events below the national risk assessment threshold and, by extension, AI-related events occurring today.
Most plotted scenarios lack an implied category, even though they form the majority of data points. We notice that most of the concrete data points in the literature fall outside the boundaries that could be derived from conceptualizations of risk based on the national risk assessment threshold or existential catastrophe.
We observe a handful of outliers towards the top right corner. It is unclear whether concrete learnings can be extrapolated from these outliers. While it may be the case that the locus of LoC scenarios in the literature lies around the cluster in the middle, it is equally plausible that there are simply only a few select threat models that could devolve into a type of LoC that would fall within the top right corner.

We subsequently reflected on our learnings and devised a taxonomy to more clearly categorize what scholars might or might not mean when discussing LoC (see Figure 2).

Figure 2: Our taxonomy of LoC, shown as a diagram of the three-part loss-of-control taxonomy.

Our taxonomy is as follows:

Deviation: captures events that cause some harm or inconvenience, but lack the requisite severity and persistence for inclusion within the threats and hazards that have the potential to significantly impact national preparedness. The events in this category would be plotted in the lower left corner of our graph (see Figure 1), and are delineated from LoC scenarios in other categories by the national risk assessment threshold.
Bounded LoC: captures events that cause great damage or suffering, and are difficult, but possible, to contain, albeit potentially at great cost. The events in this category are mapped between the lower left corner and the top right corner.
Strict LoC: captures events that are maximally severe and permanent, such as events that result in humanity as a whole becoming extinct. The events in this category are mapped at the top right corner of our graph (see Figure 1).
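
The three categories can be read as a simple decision rule. The following Python sketch is one possible encoding under stated assumptions: the names `Tier` and `classify` are hypothetical, the $1.4 billion figure is the inflation-adjusted national risk assessment threshold, an impact exactly at the threshold is treated as Bounded LoC, and whether an outcome amounts to an existential catastrophe is passed in as a judgment, since the taxonomy attaches no dollar figure to that boundary.

```python
from enum import Enum


class Tier(Enum):
    DEVIATION = "Deviation"
    BOUNDED_LOC = "Bounded LoC"
    STRICT_LOC = "Strict LoC"


# Inflation-adjusted national risk assessment threshold (approximate, in US dollars).
NATIONAL_RISK_ASSESSMENT_THRESHOLD_USD = 1.4e9


def classify(economic_impact_usd, is_existential_catastrophe):
    """Map a scenario outcome to one of the three categories."""
    if is_existential_catastrophe:
        return Tier.STRICT_LOC   # maximally severe and permanent
    if economic_impact_usd < NATIONAL_RISK_ASSESSMENT_THRESHOLD_USD:
        return Tier.DEVIATION    # limited harm, relatively easy to contain
    return Tier.BOUNDED_LOC      # great damage, but containment remains possible


# Made-up inputs, for illustration only.
print(classify(5.0e8, False).value)
print(classify(5.0e10, False).value)
```

Economic impact serves here as the same proxy for severity and persistence used in the plotted scenarios.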

This taxonomy allows us to distinguish between what appear to be different categories of LoC described in the literature. The resulting conceptual clarity gives decision- and policymakers more precise language to draw on and may help underpin more targeted interventions per category. Simultaneously, the taxonomy supports more refined interpretations of leading multi-stakeholder-developed definitions and contextualizes their most plausible reference classes. Next, we elaborate on each category in more detail.

1.B.1 Deviation

We use ‘Deviation’ to capture events that cause a small degree of harm or inconvenience but which are relatively easy to contain.

This category captures scenarios where an AI system briefly deviates from human intent and, while causing some real-world damage or minor economic impacts, remains below the threshold for inclusion within the threats and hazards that have the potential to significantly impact U.S. homeland security based on the Strategic National Risk Assessment (U.S. Department of Homeland Security 2011). These deviations from human intent are simple to stop at low or no cost, meaning that scenarios in this category would be plotted in the lower left corner of our graph (see Figure 1), and are delineated from other categories by what we term the national risk assessment threshold. Overall, we propose that the national risk assessment threshold is a principled boundary since it is calibrated to exclude events where harm is limited and containment is easy, that is, deviations.

The scenarios captured by Deviation are consistent with the broadest interpretation of the LoC definition offered by the COP, which defines LoC as “risks from humans losing the ability to reliably direct, modify, or shut down a model” (EU General-Purpose AI Code of Practice 2025, Safety and Security Chapter, Appendix 1.4). In other words, instances of losing the ability to “reliably direct” an AI system can be located on a spectrum, from not persistent to highly persistent, and from low severity to high severity. The lower end of that spectrum (i.e., instances of humans “losing the ability to reliably direct” an AI system that are both not persistent and low-severity) could, in theory, capture the category of Deviation.

Nevertheless, drawing on our earlier review and methodology, we do not believe the COP definition (EU General-Purpose AI Code of Practice 2025) should be interpreted so broadly as to include the category of Deviation. A broad interpretation of “losing the ability to reliably direct” would already capture scenarios that occur today. Consider the following two examples. In one instance, an AI system recently deleted an entire database, even though it had been instructed not to modify the code (Okunytė and Ancell 2025). In another instance, OpenAI’s agent Operator made a purchase without requesting the user’s consent, despite having been told only to find the cheapest option (Fowler 2025). This occurred even though the agent was trained to ask for the human’s consent before taking irreversible actions such as purchasing goods or sending emails (OpenAI 2025). In both cases, humans were not able to “reliably direct” an AI system and, therefore, both cases could, in theory, qualify as LoC based on a broad interpretation of “losing the ability to reliably direct.” However, we propose that it is unlikely the LoC definition in the COP was meant to be interpreted so extensively as to include such cases, which we believe qualify as Deviation rather than LoC. We offer two reasons.

First, including the category of Deviation within the definition of LoC in the COP (EU General-Purpose AI Code of Practice 2025) would risk making the LoC definition overly broad, rather than contextualizing it. As previously described, our literature review, assessment, and subsequent interpretation of LoC scenarios found no concrete LoC scenarios in the AI literature that describe an outcome below the national risk assessment threshold, which is where these types of events would fall. Instead, we start to see concrete scenarios emerging around the middle of our graph (see Figure 1).¹⁸ Interpreting the LoC definition so broadly as to include such instances would therefore ignore the reference points offered by the AI literature, which suggest instead that the LoC definition was meant to capture a more significant dimension of losing the ability to “reliably direct” AI systems than instances of Deviation.

Second, including instances of Deviation within the scope of the COP’s definition of LoC would ignore the qualification of LoC as a ‘systemic risk’ in the COP (EU General-Purpose AI Code of Practice 2025, Safety and Security Chapter, Appendix 1.4). Under the COP, systemic risks share the following “essential characteristics,” which speak to the severity and persistence of the LoC outcome (EU General-Purpose AI Code of Practice 2025, Safety and Security Chapter, Appendix 1.2.1): a “significant impact on the Union market” that “can be propagated at scale across the value chain.” Scenarios contained within Deviation, and instances like the ones described above, could hardly generate such a negative impact on the Union market.

We therefore conclude that, while a literal interpretation of the definition offered by the COP could, in theory, include scenarios below the national risk assessment threshold (such as instances contained within Deviation), that was likely not its intent.

Overall, the category of Deviation allows us to sharpen the scope of LoC and navigate it more clearly. Critically, separating events in which an AI system causes inconvenience or limited damage that can be contained at low cost from events in which an AI system causes severe and wide-reaching harm that is costly to contain allows us to be more precise when speaking about LoC. In turn, clearer categorizations and taxonomies will help researchers and decision- and policymakers alike to home in on more targeted interventions for the categories that concern them most.

1.B.2 Bounded Loss of Control

We use ‘Bounded LoC’ to capture a spectrum of events that cause significant damage or suffering, but that are ultimately possible to contain, albeit at a plausibly high cost.

This category captures scenarios where an AI system causes significant harm and is challenging to contain. We call this category Bounded LoC because scenarios described therein are of significant severity but ultimately remain bounded in their total impact on society.¹⁹

Based on what we learned from the literature, we suggest that the boundaries for this category can be set between the national risk assessment threshold (lower boundary) and the existential catastrophe threshold (upper boundary). This spectrum captures the majority of concrete LoC scenarios described in existing literature. We note that these scenarios appear to predominantly contain indicators of global catastrophic risk, which refers to “events or incidents consequential enough to significantly harm or set back human civilization at the global scale” (United States Congress 2022), and are clustered in an area of impact that pre-existing estimates in the literature describe as a “global catastrophe” (Bostrom and Cirkovic 2011; Cotton-Barratt et al. 2016; Kemp et al. 2022).

Based on our review, Bounded LoC appears to be the focal point of scholarly discussion of LoC scenarios and associated threats. More precisely, Bounded LoC appears to contain the highest number of scenarios for LoC, with 8 out of the 12 concrete scenarios we mapped falling into this category (the 4 others fall into the next category).

The breadth of this category encompasses the majority of possible interpretations of the LoC definitions contained within the COP and the IASR. Indeed, Bounded LoC captures both a range of scenarios where “humans [are] losing the ability to reliably direct, modify, or shut down a model” (EU General-Purpose AI Code of Practice 2025) and “...scenarios in which one or more general-purpose AI systems come to operate outside of anyone’s control, with no clear path to regaining control” (Bengio, Mindermann, et al. 2025). For example, ‘Scenario 4’ in Kalra and Boudreaux (2025), CyberChain Reaction, fits the COP’s definition of LoC (EU General-Purpose AI Code of Practice 2025) because it involves human inability to reliably direct the AI system, which leads to the AI system “locking out [human] admins [from infrastructure for critical systems] and restricting [their] access.” This scenario also fits the IASR’s definition of LoC (Bengio, Mindermann, et al. 2025) because the AI system continues to operate outside any human’s control, leading to “[h]ospitals, ports, and infrastructure slow[ing] to a crawl.” However, this scenario describes Bounded, rather than Strict, LoC: although there is no clear path for humans to regain control of this infrastructure, in the end it is possible to do so. The scenario describes how the “systems are pulled offline” by humans, but the “AI… resists removal. Recovery is slow.” Despite this significant overlap, the category does not capture the lower bound of a literal interpretation of the COP’s definition, as described earlier, nor does it capture the upper bound of a literal interpretation of the IASR’s definition, as elaborated on in more detail under 1.B.3.

Overall, we found the category of Bounded LoC to be a useful concept because it encompasses the most frequently described outcomes for LoC scenarios. It can therefore serve as a meaningful reference class for decision- and policymakers, while leaving room for finer distinctions within this category in future research.

1.B.3 Strict Loss of Control

We use ‘Strict LoC’ to capture events that cause maximum damage or suffering and whose harm trajectories are impossible to contain, notably the extinction of humanity.

This category captures scenarios that cause absolute harm, of a kind from which no action at any point in the future would enable society to recover. The category is strict in the sense that the harm affects humanity as a whole and is permanent; as such, it would be mapped at the top right corner of our graph (see Figure 1). “Existential catastrophes” are scenarios that involve the destruction of humanity’s long-term potential (Ord 2020; Sundaram and Mani 2025; Stauffer et al. 2023), for example, a nuclear winter that precipitates a prolonged agricultural collapse (Sagan 1983). The concept of existential catastrophe therefore delineates scenarios that are unusually serious and permanent, such as human extinction, from those that are ultimately reversible, and hence acts as a natural boundary for separating Strict LoC scenarios from other, less absolute, forms of LoC (i.e., Bounded LoC).

As previously noted, the scenarios falling within this category are not described as frequently as scenarios falling within the category of Bounded LoC. Given our review, we propose that there are two reasons for this: first, these types of scenarios struggle with narrative clarity due to the numerous unknown hypotheticals involved; and second, there may simply not be many ways to arrive at an extinction-level outcome.

Strict LoC aligns with the extreme end of the spectrum offered by the LoC definition in the IASR (Bengio, Mindermann, et al. 2025), wherein LoC occurs when “... one or more general-purpose AI systems come to operate outside of anyone’s control, with no clear path to regaining control.” While the definition likely means to encompass a spectrum such as the one described by Bounded LoC, a literal interpretation of “no clear path to regaining control” could also encompass a Strict LoC outcome where there is nobody left to regain control over an AI system.

1.C Reflections

We arrived at our LoC taxonomy by first reviewing definitions (i) in AI literature and (ii) in other safety-critical sectors, including defense, aviation, and nuclear. We found that neither set of existing definitions supports one common, clear definition of LoC that could enable decision- and policymakers to operationalize LoC today. This result led us to home in instead on two leading multi-stakeholder-developed definitions of LoC, offered by the COP and the IASR. We observed that these two definitions remain difficult to operationalize, as they can encompass a wide range of outcomes and come into effect at different times. In order to shed more light on these definitions and on LoC as a concept more broadly, we subsequently conducted a literature review, which resulted in the assessment of 40 LoC scenarios. Specifically, we aimed to identify the types of scenarios and consequent outcomes that scholars are most concerned with. We developed a methodology to assess the scenarios, categorize them, and plot them on an experimental graph (see Figure 1).

Our landscape review and subsequent methodology informed the development of a novel taxonomy for LoC, composed of three categories: Deviation, Bounded LoC, and Strict LoC. This taxonomy enables a more nuanced understanding of LoC, as captured by interpretations of existing definitions and scenarios in the literature. In doing so, it helps us to better distinguish between what can be classified as present-day control failures and the extreme future scenarios emphasized in a subset of the literature, and therefore to operationalize the concept of LoC more clearly.

Upon reviewing and contextualizing our taxonomy, we identified several additional, valuable findings. First, and simply put, LoC, as currently discussed in the literature, clearly equates to neither Deviation nor Strict LoC. Indeed, many significantly impactful LoC threats could manifest before a hypothetical existential-grade AI malfunction. Those types of scenarios, which we described as Bounded LoC, are the most commonly described in a concrete manner in the AI literature we reviewed.

Second, devising accurate boundaries between categories that capture all possible eventualities is challenging, and we expect more nuanced categories to arise with further research, especially in the area of Bounded LoC. The challenge is underpinned by the uncertain nature of the topic, the difficulty of retrieving numbers that would enable precise calculations capturing all necessary considerations, and the real possibility that a scenario can cascade across a boundary from one category into another.

Third, and building on that, we note that LoC can pose a particularly insidious threat model since it can be a creeping problem, meaning that it may be difficult to identify the point in time at which “control” was lost, especially ex ante. In such a case, we might already be on a trajectory that will lead to or cascade to Bounded or Strict LoC, but the harm has yet to materialize into damages and is therefore extremely difficult to detect and account for. In a similar direction, while Deviation is not meaningfully captured as LoC in our framework, some contained instances may be “canaries in the coal mine” for Bounded or Strict LoC. This makes the development of functional and agile mechanisms to target and avoid these outcomes especially challenging and important.

Finally, we find that the distinction between Bounded and Strict LoC offers a noteworthy conceptualization for decision- and policymakers because Strict LoC is both permanent and global in reach, i.e., it affects all nations. Notwithstanding that, it can be initiated or catalysed by events in only one or a small number of nations. Given that it can be caused by a single nation but affects all nations similarly and absolutely, it would appear that every nation has a strategic self-interest in ensuring that other nations do not cause such an outcome.

In the remainder of this report, we will focus solely on Bounded and Strict LoC (referring to both as ‘LoC’ unless a clear distinction is necessary). Although events in the Deviation category remain undesirable, and limiting their occurrence and mitigating the resulting harms is important, our attention in this report is devoted to clarifying LoC as a concept.

References

10 CFR § 20.2202(b) – Notification of Incidents. https://www.ecfr.gov/current/title-10/chapter-I/part-20/subpart-M/section-20.2202.

10 CFR Chapter I – NRC Enforcement Policy (2011). https://www.federalregister.gov/documents/2011/09/06/2011-22646/nrc-enforcement-policy.

14 CFR § 107.19 – Remote Pilot in Command. https://www.ecfr.gov/current/title-14/chapter-I/subchapter-F/part-107/subpart-B/section-107.19.

47 CFR § 25.114 – Applications for Space Station Authorizations. https://www.ecfr.gov/current/title-47/chapter-I/subchapter-B/part-25/subpart-B/subject-group-ECFRe2bd70fc6b2eea0/section-25.114.

47 CFR § 5.64 – Special Provisions for Satellite Systems. https://www.ecfr.gov/current/title-47/chapter-I/subchapter-A/part-5/subpart-B/section-5.64.

47 CFR § 97.207 – Space Station. https://www.ecfr.gov/current/title-47/chapter-I/subchapter-D/part-97/subpart-C/section-97.207.

Acevedo, Sebastian. 2016. Gone with the Wind: Estimating Hurricane and Climate Change Costs in the Caribbean.

International Atomic Energy Agency. 2004. Strengthening Control over Radioactive Sources in Authorized Use and Regaining Control over Orphan Sources. International Atomic Energy Agency. https://www-pub.iaea.org/MTCD/Publications/PDF/te_1388_web.pdf.

Artificial Intelligence Risk Evaluation Act (2025). https://www.hawley.senate.gov/wp-content/uploads/2025/09/Hawley-Blumenthal-Artificial-Intelligence-Risk-Evaluation-Act.pdf.

Barnett, Peter, and Aaron Scher. 2025. AI Governance to Avoid Extinction: The Strategic Landscape and Actionable Research Questions. https://arxiv.org/abs/2505.04592.

Bengio, Yoshua, Geoffrey Hinton, Andrew Yao, et al. 2024. “Managing Extreme AI Risks Amid Rapid Progress.” Science 384 (6698): 842–45. https://doi.org/10.1126/science.adn0117.

Bengio, Yoshua, Tegan Maharaj, Luke Ong, et al. 2025. The Singapore Consensus on Global AI Safety Research Priorities. https://arxiv.org/abs/2506.20702.

Bengio, Yoshua, Sören Mindermann, Daniel Privitera, et al. 2025. International AI Safety Report. Research report. Department for Science, Innovation and Technology. https://internationalaisafetyreport.org/publication/international-ai-safety-report-2025.

Bernardi, Jamie, Gabriel Mukobi, Hilary Greaves, Lennart Heim, and Markus Anderljung. 2025. Societal Adaptation to Advanced AI. https://arxiv.org/abs/2405.10295.

Bommasani, Rishi, Scott R. Singer, Ruth E. Appel, et al. 2025. The California Report on Frontier AI Policy. Joint California Policy Working Group on AI Frontier Models. https://www.gov.ca.gov/wp-content/uploads/2025/06/June-17-2025-%E2%80%93-The-California-Report-on-Frontier-AI-Policy.pdf.

Bostrom, Nick, and Milan M Cirkovic. 2011. Global Catastrophic Risks. Oxford University Press.

Bourgon, Malo. 2024. “MIRI 2024 Mission and Strategy Update.” Machine Intelligence Research Institute (MIRI), January 4. https://intelligence.org/2024/01/04/miri-2024-mission-and-strategy-update/.

Cass-Beggs, Duncan, Stephen Clare, Dawn Dimowo, and Zaheed Kara. 2024. Framework Convention on Global AI Challenges: Accelerating International Cooperation to Ensure Beneficial, Safe and Inclusive AI. Centre for International Governance Innovation (CIGI). https://www.cigionline.org/static/documents/AI-challenges_OW6rTMD.pdf.

CAST/ICAO Common Taxonomy Team (CICTT). 2013. Aviation Occurrence Categories: Definitions and Usage Notes. https://www.ntsb.gov/safety/data/Documents/datafiles/OccurrenceCategoryDefinitions.pdf.

Cavallo, Eduardo A, Oscar Becerra, and Laura Acevedo. 2021. The Impact of Natural Disasters on Economic Growth. https://doi.org/10.18235/0003683.

Chang, Tom, Joshua Graff Zivin, Tal Gross, and Matthew Neidell. 2016. “Particulate Pollution and the Productivity of Pear Packers.” American Economic Journal: Economic Policy 8 (3): 141–69.

Costa, Hélia, Guido Franco, Filiz Unsal, Sarath Mudigonda, and Maria Paula Caldas. 2024. The Heat Is on: Heat Stress, Productivity and Adaptation Among Firms. No. 1828. OECD Economics Department Working Papers. OECD Publishing. https://doi.org/10.1787/19d94638-en.

Cotton-Barratt, Owen, Sebastian Farquhar, John Halstead, Stefan Schubert, and Andrew Snyder-Beattie. 2016. Global Catastrophic Risks 2016. Global Challenges Foundation. https://globalchallenges.org/app/uploads/2023/06/Global-Catastrophic-Risks-2016.pdf.

Cybersecurity and Infrastructure Security Agency (CISA). 2021. Federal Government Cybersecurity Incident and Vulnerability Response Playbooks. https://www.cisa.gov/sites/default/files/2024-08/Federal_Government_Cybersecurity_Incident_and_Vulnerability_Response_Playbooks_508C.pdf.

Department for Science, Innovation and Technology (DSIT). 2023. Capabilities and Risks from Frontier AI: A Discussion Paper on the Need for Further Research into AI Risk. Discussion paper. UK Government. https://assets.publishing.service.gov.uk/media/65395abae6c968000daa9b25/frontier-ai-capabilities-risks-report.pdf.

Deryugina, Tatyana, Garth Heutel, Nolan H Miller, David Molitor, and Julian Reif. 2019. “The Mortality and Medical Costs of Air Pollution: Evidence from Changes in Wind Direction.” American Economic Review 109 (12): 4178–219.

EU General-Purpose AI Code of Practice (2025). https://digital-strategy.ec.europa.eu/en/policies/contents-code-gpai.

European Union Aviation Safety Agency. 2025. Loss of Control (LOC-i). https://www.easa.europa.eu/en/domains/general-aviation/flying-safely/loss-of-control.

Federal Aviation Administration. 2017a. Airman Certification Standards (ACS): Slow Flight and Stalls. Safety Alert for Operators (SAFO) 17009. https://www.faa.gov/sites/faa.gov/files/other_visit/aviation_industry/airline_operators/airline_safety/SAFO17009.pdf.

Federal Aviation Administration. 2017b. Upset Prevention and Recovery Training. Advisory Circular AC 120-111, Change 1. https://www.faa.gov/documentLibrary/media/Advisory_Circular/AC_120-111_CHG_1.pdf.

Federal Aviation Administration. 2019. Fly Safe: Prevent Loss of Control Accidents. https://www.faa.gov/newsroom/fly-safe-prevent-loss-control-accidents-28.

Federal Aviation Administration. 2021. “Chapter 5: Maintaining Aircraft Control: Upset Prevention and Recovery Training.” In Airplane Flying Handbook. FAA-H-8083-3C. https://www.faa.gov/sites/faa.gov/files/regulations_policies/handbooks_manuals/aviation/airplane_handbook/06_afh_ch5.pdf.

Federal Motor Vehicle Safety Standards; Electronic Stability Control Systems for Heavy Vehicles, FMVSS No. 136 (2015). https://www.federalregister.gov/documents/2015/06/23/2015-14127/federal-motor-vehicle-safety-standards-electronic-stability-control-systems-for-heavy-vehicles.

Fowler, Geoffrey A. 2025. “I Let ChatGPT’s New ‘Agent’ Manage My Life. It Spent $31 on a Dozen Eggs.” The Washington Post, February 7. https://www.washingtonpost.com/technology/2025/02/07/openai-operator-ai-agent-chatgpt/.

Frijters, Paul, Christian Krekel, Raúl Sanchis, and Ziggi Ivan Santini. 2024. “The WELLBY: A New Measure of Social Value and Progress.” Humanities and Social Sciences Communications 11 (1): 1–12.

Haltmaier, Jane. 2013. “Do Recessions Affect Potential Output?” FRB International Finance Discussion Paper, no. 1066.

Harris, Edouard, Jeremie Harris, and Mark Beall. 2024. Defense in Depth: An Action Plan to Increase the Safety and Security of Advanced AI. Report. Gladstone AI Inc. https://cdn.prod.website-files.com/62c4cf7322be8ea59c904399/65e7779f72417554f7958260_Gladstone%20Action%20Plan%20Executive%20Summary.pdf.

Hayashi, Miwa, Husni Idris, Jordan Sakakeeny, and Devin Jack. 2022. PAAV Concept Document. https://ntrs.nasa.gov/api/citations/20220015373/downloads/PAAV%20Concept%20Document-v1.1.pdf.

Hendrycks, Dan, Mantas Mazeika, and Thomas Woodside. 2023. An Overview of Catastrophic AI Risks. https://arxiv.org/abs/2306.12001.

Hoggarth, Glenn, Ricardo Reis, and Victoria Saporta. 2001. “Costs of Banking System Instability: Some Empirical Evidence.” Available at SSRN 276182, ahead of print. https://doi.org/10.2139/ssrn.276182.

Hsiang, Solomon M, and Amir S Jina. 2014. The Causal Effect of Environmental Catastrophe on Long-Run Economic Growth: Evidence from 6,700 Cyclones. National Bureau of Economic Research.

Hsiang, Solomon, Robert Kopp, Amir Jina, et al. 2017. “Estimating Economic Damage from Climate Change in the United States.” Science 356 (6345): 1362–69. https://doi.org/10.1126/science.aal4369.

Huang, Wenzhong, Zhengyu Yang, Yiwen Zhang, et al. 2024. “Tropical Cyclone-Specific Mortality Risks and the Periods of Concern: A Multicountry Time-Series Study.” PLOS Medicine, e1004341. https://doi.org/10.1371/journal.pmed.1004341.

International Atomic Energy Agency. 2006. Fundamental Safety Principles. IAEA Safety Standards Series, No. SF-1. International Atomic Energy Agency. https://www-pub.iaea.org/MTCD/Publications/PDF/Pub1273_web.pdf.

International Atomic Energy Agency. 2025. The 1986 Chornobyl Nuclear Power Plant Accident. https://www.iaea.org/topics/chornobyl.

International Dialogues on AI Safety (IDAIS). 2025. “Consensus Statement on Ensuring Alignment and Human Control of Advanced AI Systems to Safeguard Human Flourishing. IDAIS–Shanghai.” Statement. July 25. https://idais.ai/dialogue/idais-shanghai/.

Kalra, Nidhi, and Benjamin Boudreaux. 2025. “Not Just Superintelligence: The Many Risks of Near-Future AGI.” Blog post. Geopolitics of AGI (Substack), July 28. https://geopoliticsagi.substack.com/p/not-just-superintelligence-the-many.

Kemp, Luke, Chi Xu, Joanna Depledge, et al. 2022. “Climate Endgame: Exploring Catastrophic Climate Change Scenarios.” Proceedings of the National Academy of Sciences 119 (34): e2108146119.

König, Michael, and Adalbert Winkler. 2021. “COVID-19: Lockdowns, Fatality Rates and GDP Growth: Evidence for the First Three Quarters of 2020.” Intereconomics 56 (1): 32–39.

Kulveit, Jan, Raymond Douglas, Nora Ammann, Deger Turan, David Krueger, and David Duvenaud. 2025. Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development. https://arxiv.org/abs/2501.16946.

Larsen, Peter H, Kyle Carney, Joseph H Eto, et al. 2025. ICE Calculator 2.0: Final Report for Phase 1 of the National Initiative to Update the Interruption Cost Estimate (ICE) Calculator.

METR. 2025a. AGI: Definitions and Potential Impacts. https://metr.org/agi.pdf.

METR. 2025b. “Recent Frontier Models Are Reward Hacking.” https://metr.org/blog/2025-06-05-recent-reward-hacking/.

Morgenstern, Christian, Daniel J Laydon, Charles Whittaker, et al. 2024. “The Interaction of Disease Transmission, Mortality, and Economic Output over the First 2 Years of the COVID-19 Pandemic.” Plos One 19 (6): e0301785.

Mueller, Hannes, and Julia Tobias. 2016. “The Cost of Violence: Estimating the Economic Impact of Conflict.” International Growth Centre. https://www.theigc.org/sites/default/files/2016/12/IGCJ5023_Economic_Cost_of_Conflict_Brief_2211_v7_WEB.pdf.

Murnane, Richard J, and James B Elsner. 2012. “Maximum Wind Speeds and US Hurricane Losses.” Geophysical Research Letters 39 (16).

National Institute of Standards and Technology (NIST). “Glossary Term: ‘Breach’.” https://csrc.nist.gov/glossary/term/breach.

National Institute of Standards and Technology (NIST). 2003. Guideline for Identifying an Information System as a National Security System. NIST Special Publication 800-59. https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-59.pdf.

National Protective Security Authority (NPSA). 2025. “Critical National Infrastructure.” NPSA, June 23. https://www.npsa.gov.uk/about-npsa/critical-national-infrastructure.

Nordhaus, William. 2013. “Chapter 16 - Integrated Economic and Climate Modeling.” In Handbook of Computable General Equilibrium Modeling SET, Vols. 1A and 1B, edited by Peter B. Dixon and Dale W. Jorgenson, vol. 1. Handbook of Computable General Equilibrium Modeling. Elsevier. https://doi.org/10.1016/B978-0-444-59568-3.00016-X.

Novta, Natalija, and Evgenia Pugacheva. 2020. The Macroeconomic Costs of Conflict. IMF Working Paper WP/20/110. International Monetary Fund. https://www.imf.org/en/-/media/files/publications/wp/2020/english/wpiea2020110-print-pdf.pdf.

Office of Management and Budget. 2017. Memorandum for Heads of Executive Departments and Agencies: Preparing for and Responding to a Breach of Personally Identifiable Information. OMB Memorandum M-17-12. Executive Office of the President, Office of Management and Budget. https://obamawhitehouse.archives.gov/sites/default/files/omb/memoranda/2017/m-17-12_0.pdf.

Office of the Chairman of the Joint Chiefs of Staff. 2021. DOD Dictionary of Military and Associated Terms. Washington, DC: The Joint Staff. https://irp.fas.org/doddir/dod/dictionary.pdf.

Okunytė, Paulina, and Niamh Ancell. 2025. “AI Coding Tool Wipes Production Database, Fabricates 4,000 Users, and Lies to Cover Its Tracks.” News article. Cybernews, July 21. https://cybernews.com/ai-news/replit-ai-vive-code-rogue/.

OpenAI. 2025. “Introducing Operator.” January 23. https://openai.com/index/introducing-operator/.

Ord, Toby. 2020. The Precipice: Existential Risk and the Future of Humanity. Bloomsbury Publishing.

Ortiz, P, M Oresegun, and J Wheatley. 2002. “Lessons from Major Radiation Accidents.” Safety 21 (1): 11–230.

Pavel, Barry, Ivana Ke, Gregory Smith, et al. 2025. How Artificial General Intelligence Could Affect the Rise and Fall of Nations: Visions for Potential AGI Futures. RAND Corporation. https://doi.org/10.7249/RRA3034-2.

Posner, Richard A. 2004. Catastrophe: Risk and Response. Oxford University Press.

Presidential Policy Directive/PPD-21: Critical Infrastructure Security and Resilience (2013). https://www.cisa.gov/sites/default/files/2023-01/ppd-21-critical-infrastructure-and-resilience-508_0.pdf.

Russell, Paul, and Jay Pardee. 2000. Loss of Control JSAT: Results and Analysis. Commercial Aviation Safety Team (CAST). https://www.cast-safety.org/pdf/jsat_loss-control.pdf.

Russell, Stuart. 2022. “Artificial Intelligence and the Problem of Control.” In Perspectives on Digital Humanism, edited by Hannes Werthner, Erich Prem, Edward A. Lee, and Carlo Ghezzi. Springer International Publishing. https://doi.org/10.1007/978-3-030-86144-5_3.

Sagan, Carl. 1983. The Nuclear Winter. Scott Meredith Literary Agency New York.

Sakakeeny, Jordan, Husni R Idris, Devin Jack, and Vishwanath Bulusu. 2022. “A Framework for Dynamic Architecture and Functional Allocations for Increasing Airspace Autonomy.” AIAA AVIATION 2022 Forum. https://doi.org/10.2514/6.2022-3702.

Sakakeeny, Jordan, David Thipphavong, Todd Lauderdale, and Husni Idris. 2024. “Initial Assessment of Lost Command and Control Link Procedures.” 2024 AIAA DATC/IEEE 43rd Digital Avionics Systems Conference (DASC). https://doi.org/10.1109/DASC62030.2024.10748981.

Scottish Medicines Consortium (SMC). n.d. A Guide to Quality Adjusted Life Years (QALYs). Guide. NHS Scotland. Accessed November 8, 2025. https://scottishmedicines.org.uk/media/2839/guide-to-qalys.pdf.

Somani, Elika, Anjay Friedman, Henry Wu, et al. 2025. Strengthening Emergency Preparedness and Response for AI Loss of Control Incidents. RAND Corporation. https://doi.org/10.7249/RRA3847-1.

Stauffer, Maxime, Konrad Seifert, Angela Aristizábal, et al. 2023. Existential Risk and Rapid Technological Change: Advancing Risk-Informed Development. United Nations Office for Disaster Risk Reduction Geneva, Switzerland.

Sullivan, Michael J., Josh Schellenberg, and Marshall Blundell. 2015. Updated Value of Service Reliability Estimates for Electric Utility Customers in the United States. Ernest Orlando Lawrence Berkeley National Laboratory. https://eta-publications.lbl.gov/sites/default/files/lbnl-6941e.pdf.

Sundaram, Lalitha, and Lara Mani. 2025. Existential Risk and Global Catastrophic Risk: A Review. Apollo - University of Cambridge Repository. https://doi.org/10.17863/CAM.118285.

United States Congress. 2022. 6 U.S. Code § 821 - Definitions. Legal Information Institute, Cornell Law School. https://www.law.cornell.edu/uscode/text/6/821.

U.S. Department of Defense. 2006. Quadrennial Defense Review Report. U.S. Department of Defense. https://history.defense.gov/Portals/70/Documents/quadrennial/QDR2006.pdf?ver=2014-06-25-111017-150.

U.S. Department of Defense. 2007. Department of Defense Privacy Program. Office of the Secretary of Defense; DoD 5400.11-R. https://www.esd.whs.mil/portals/54/documents/dd/issuances/dodm/540011r.pdf.

U.S. Department of Defense. 2014. Department of Defense Strategy for Countering Weapons of Mass Destruction. U.S. Department of Defense. https://apps.dtic.mil/sti/pdfs/ADA603433.pdf.

U.S. Department of Defense. 2016. Nuclear Weapon Security Manual: The DoD Nuclear Weapon Security Program. DoD Manual DoDM S-5210.41, Volume 1. U.S. Department of Defense. https://www.esd.whs.mil/Portals/54/Documents/FOID/Reading%20Room/NCB/17-F-0260_DOC_01_DoD_Manual_S-5210.41-Volume_1_Redacted.pdf.

U.S. Department of Defense. 2017. Autonomy in Weapon Systems. DoD Directive 3000.09. https://ogc.osd.mil/Portals/99/autonomy_in_weapon_systems_dodd_3000_09.pdf.

U.S. Department of Defense. 2020. DoD Response to U.S. Nuclear Weapon and Radiological Material Incidents. Office of the Under Secretary of Defense for Acquisition and Sustainment; DoD Directive 3150.08. https://www.esd.whs.mil/Portals/54/Documents/DD/issuances/dodd/315008p.pdf.

U.S. Department of Homeland Security. 2011. The Strategic National Risk Assessment in Support of PPD 8: A Comprehensive Risk-Based Approach Toward a Secure and Resilient Nation. Office of Risk Management and Analysis (RMA), U.S. Department of Homeland Security. https://www.dhs.gov/xlibrary/assets/rma-strategic-national-risk-assessment-ppd8.pdf.

U.S. Department of the Navy. 2009. Naval Aviation Safety Program. OPNAVINST 3750.6R CH-4. Office of the Chief of Naval Operations. https://www.safety.marines.mil/Portals/92/Docs/OPNAV%203750.6R.pdf.

U.S. Food and Drug Administration. 2023. Nonclinical Safety Evaluation of the Immunotoxic Potential of Drugs and Biologics: Guidance for Industry. U.S. Department of Health and Human Services, Food and Drug Administration, Center for Drug Evaluation and Research (CDER). https://www.fda.gov/media/169117/download.

U.S. Nuclear Regulatory Commission. 1991. NRC Bulletin 91-01: Reporting Loss of Criticality Safety Controls. NRC Bulletin 91-01. U.S. Nuclear Regulatory Commission, Office of Nuclear Material Safety and Safeguards. https://www.nrc.gov/docs/ML0312/ML031210840.pdf.

U.S. Nuclear Regulatory Commission. 1998. “Koch Engineering Company, Inc., Newark, Delaware; Order Imposing a Civil Monetary Penalty.” In Federal Register, No. 120, vol. 63. https://www.govinfo.gov/content/pkg/FR-1998-06-23/html/98-16645.htm.

U.S. Nuclear Regulatory Commission. 2004. Loss of Control of Cesium-137 Well Logging Source Resulting in Radiation Exposures to Members of the Public. NUREG-1794. U.S. Nuclear Regulatory Commission. https://www.nrc.gov/docs/ML0429/ML042940248.pdf.

Vest, Charlie, Agatha Kratz, and Reva Goujon. 2022. “The Global Economic Disruptions from a Taiwan Conflict.” Rhodium Group, December 14. https://rhg.com/research/taiwan-economic-disruptions/.

Electricity North West. 2018. Value of Lost Load to Customers: Customer Survey (Phase 3) — Key Findings Report. NIA Project Report. https://www.enwl.co.uk/globalassets/innovation/enwl010-voll/voll-general-docs/voll-phase-3-report.pdf.

Wynne, Mark A., and Lillian Derr. 2025. “Advances in AI Will Boost Productivity, Living Standards over Time.” Federal Reserve Bank of Dallas, June 24. https://www.dallasfed.org/research/economics/2025/0624.

Footnotes

We note that none of the boundaries proposed in our taxonomy should be construed as exact. They are intended to provide intuitions for discussing LoC more precisely.↩︎
Because of its uncertain nature, human disempowerment falls outside the scope of this report under our methodology, which is described later in this Chapter.↩︎
This is not to be misconstrued as an assessment of adequacy for either of these definitions. Such an assessment is out of the scope of this report.↩︎
These works were selected by filtering for those that reflect on implications of future AI progress, covering diverse threats, risk assessments, and capabilities commonly described as control-undermining (see Section 2.A). Some works were further selected because they appeared in relevant passages in the literature we were reviewing.↩︎
We suspect that the reason for this is that LoC is a relatively new term to cover a specific threat category and, therefore, only appears in very recent literature.↩︎
In our report, a LoC scenario captures a negative outcome resulting from LoC. Positive types of outcomes are therefore outside of the report’s scope.↩︎
To use a metaphor, interrupting the harm trajectory would mean interrupting a sequence of falling dominos mid-cascade, but not standing the fallen ones back up.↩︎
We focused on the outcome because an overwhelming number of scenarios have multiple, often cascading, mechanisms that were largely impossible to consistently and clearly delineate, categorize or estimate.↩︎
Note that half of our scenarios ended up being captured by (i) and the other half by (ii).↩︎
For some LoC scenarios, we encountered sufficiently large uncertainties that we considered it unreasonable to attempt to obtain a point estimate of the economic impact, as it would likely not have been meaningful.↩︎
We note that there may be a slight distortionary effect regarding human extinction as a threat category. Specifically, it is, by definition, very clear-cut and straightforward to calculate, making it significantly less challenging than other examples and, therefore, more concrete overall. It is plausible that, for these scenarios, the occurrence immediately preceding extinction should be captured, rather than the final outcome. While this would not align with our methodology and is therefore outside the scope of this report, it may be valuable for future scholarly research to consider.↩︎
Given the difficulty of computing the precise economic impact for persistence, especially for concrete LoC scenarios from the literature that do not tend to consider this aspect in any detail, we decided to use the same economic impact estimate for persistence as for severity. We note that, while severity and persistence are not necessarily equivalent in terms of economic impact in all instances, we believe that they are quasi-equivalent for the scenarios plotted on our graph (see 1).↩︎
These arbitrary units were defined by setting the arbitrary unit 100-mark at $550 trillion, and the 0-mark at $1, and then creating a linear mapping between these two points in log-space. The linear mapping is performed in log space because the plotted scenarios vary in economic impact by four orders of magnitude, between $30 billion and $550 trillion. To convert from “Log Dollars” to “Arbitrary Units of Severity/Persistence,” we used the formula Arbitrary_Severity (or Arbitrary_Persistence) = 100 × log₁₀(dollars) / log₁₀(5.5 × 10¹⁴).↩︎
We decided to adopt the SNRA as a term of reference because the U.S. is the default location where the majority of literature assumes LoC scenarios to occur. For instance, ‘Scenario 1’ in (Kalra and Boudreaux 2025) is an example of a scenario that envisages AI-caused electricity blackouts starting in the U.S. In some cases, the location of the LoC scenario described by literature is ambiguous (for instance, ‘Scenario 4’ in (Kalra and Boudreaux 2025)), so for the purposes of making a comparable economic impact estimate, we assumed that the scenario occurred within the U.S. to ensure consistency across scenarios.↩︎
In this context, data refers to the information contained in a computer system or data processes.↩︎
We note that we were unable to find a numerical operationalization for ‘existential catastrophe’ in the literature. Therefore, the boundary given on this graph is for illustrative purposes only, but is likely accurate, as it captures an absolute scale and permanence akin to the best interpretation of an existential catastrophe (see 1).↩︎
We believe it is plausible that alternative methods exist for calculating these axes. For instance, severity could be estimated using a methodology wherein each affected person is assigned a number on a sliding scale between 0 and 1 to reflect the degree to which their health has been affected as a result of the scenario (Scottish Medicines Consortium (SMC), n.d.). Alternatively, measurements could be based on wellbeing (Frijters et al. 2024). These approaches would require significant further research to adapt their methods such that they could be used for AI LoC scenarios. This additional research was out of scope for this report.↩︎
We note that neither of the two previously mentioned examples, concerning OpenAI’s agent Operator and Replit, would trigger the national risk assessment threshold.↩︎
For instance, an example of Bounded LoC might be an AI system escalating military conflict and triggering mutual bombing. In most cases, despite great devastation, the consequences of this event would eventually be containable, or at least geographically constrained, without affecting the entire global population. In another instance, stopping the AI system might require costly actions such as shutting down and/or replacing critical servers.↩︎
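
As an illustration of the footnote above on arbitrary units of severity and persistence, the log-space mapping can be sketched in Python. This is a minimal sketch rather than the code used for the report: the function names `to_arbitrary_units` and `to_dollars`, the constant `MAX_DOLLARS`, and the rejection of non-positive inputs are assumptions introduced here for illustration. Only the anchors ($1 at the 0-mark, $550 trillion at the 100-mark) and the formula itself come from the footnote.

```python
import math

# Anchor taken from the footnote: $550 trillion sits at the 100-mark.
# The 0-mark at $1 follows automatically, because log10(1) = 0.
MAX_DOLLARS = 5.5e14


def to_arbitrary_units(dollars: float) -> float:
    """Map a dollar impact onto the arbitrary scale, linearly in log10 space."""
    if dollars <= 0:
        # Assumption: the logarithm is undefined here, so reject the input.
        raise ValueError("dollar impact must be positive")
    return 100 * math.log10(dollars) / math.log10(MAX_DOLLARS)


def to_dollars(units: float) -> float:
    """Inverse mapping (hypothetical helper): arbitrary units back to dollars."""
    return 10 ** (units * math.log10(MAX_DOLLARS) / 100)


if __name__ == "__main__":
    # The two ends of the range quoted in the footnote.
    for amount in (3e10, 5.5e14):
        print(f"${amount:.3g} -> {to_arbitrary_units(amount):.1f} units")
```

Under this mapping, the smallest plotted impact quoted in the footnote, $30 billion, lands at about 71 units. The plotted scenarios therefore occupy roughly the top 30 units of the scale, because the mapping compresses differences at the high end.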