Conclusion

In this research report, we offered a novel conceptualization and taxonomy of LoC as a threat and examined its future implications. Our work was motivated by recent increases in AI system capabilities and rising attention to the topic in policy, industry, and legal discourse.

In Chapter 1, we set out to clarify what LoC is. We began by reviewing relevant definitions in the AI literature and in other safety-critical sectors, including defense, aviation, and nuclear. We found that neither set of existing definitions supports one common, clear definition of LoC that decision- and policymakers could leverage today. We then conducted a comprehensive literature review and developed a methodology to assess and categorize existing LoC scenarios, ultimately plotting them on an experimental graph for comparison. From this graph we derived a novel taxonomy for conceptualizing LoC: Deviation, Bounded LoC, and Strict LoC.

We believe that this taxonomy will help structure discussions around LoC, as each category entails different outcomes in terms of permanence and severity and may require different mitigation strategies and actionable levers. Moreover, our methodology shows which category is most commonly and concretely referenced in scholarly works: Bounded LoC.

In Chapter 2, we reflected on what can be done to mitigate LoC threats today. We argued that, in the absence of a consensus on the capabilities and thresholds needed to accurately assess and capture LoC risk, decision- and policymakers should focus on simple, actionable steps that can be taken now. We therefore proposed a complementary approach that offers clear levers while largely sidestepping existing uncertainties. Specifically, instead of focusing on AI systems’ intrinsic factors (i.e., capabilities and propensities), we proposed focusing on extrinsic factors that can raise the overall risk of LoC occurring. To that end, we put forward a framework inspired by the research presented in Chapter 1, which centers on assessing and limiting the deployment context, affordances, and permissions of an AI system (the ‘DAP framework’).

In Chapter 3, we focused on the future implications of LoC threats in light of the assumed rise in AI capabilities and increased strategic and economic competition, which could undermine the DAP framework. We argued that under these conditions, society would eventually find itself living in a ‘state of vulnerability’: a state in which a sufficiently capable future AI system has acquired (through humans or independently), or could independently acquire, sufficient access to resources, affordances, and permissions (or the means to acquire further access), along with sufficient capabilities, to cause LoC when a catalyst materializes. We speculated on multiple pathways that the future could take once the catalyst is triggered, and proposed that it is implausible for society not to eventually face a LoC outcome; therefore, preemptive controls and mechanisms should be developed and implemented early on.