Existential risk from artificial intelligence

In 2000, computer scientist and Sun co-founder Bill Joy penned an influential essay, "Why The Future Doesn't Need Us", identifying superintelligent robots as a high-tech danger to human survival, alongside nanotechnology and engineered bioplagues.[30] By 2015, public figures such as physicists Stephen Hawking and Nobel laureate Frank Wilczek, computer scientists Stuart J. Russell and Roman Yampolskiy, and entrepreneurs Elon Musk and Bill Gates were expressing concern about the risks of superintelligence.[46] In contrast with AGI, Bostrom defines a superintelligence as "any intellect that greatly exceeds the cognitive performance of humans in virtually all domains of interest", including scientific creativity, strategic planning, and social skills.[5]

The economist Robin Hanson has said that, to launch an intelligence explosion, an AI must become vastly better at software innovation than the rest of the world combined, which he finds implausible.[5]

The field of "mechanistic interpretability" aims to better understand the inner workings of AI models, potentially allowing us one day to detect signs of deception and misalignment.
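One concrete technique from this line of research is "probing": training a simple classifier on a model's internal activations to test whether some property of interest is linearly readable from them. The sketch below is a minimal illustration only, using synthetic vectors in place of real model activations; the "deceptive vs. honest" labels, array shapes, and signal strength are assumptions made for the example, not results from any cited study.

```python
# Minimal linear-probe sketch (illustrative): can a simple classifier
# read a hypothetical "deceptive vs. honest" label out of a model's
# hidden activations? Real interpretability work would extract the
# activations from an actual network; here they are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

n_samples, hidden_dim = 1000, 64        # assumed sizes, for illustration
labels = rng.integers(0, 2, n_samples)  # 0 = "honest", 1 = "deceptive" (hypothetical)

# Synthetic stand-in for hidden activations: give the two classes a
# slightly different mean along one direction, so the signal is learnable.
direction = rng.normal(size=hidden_dim)
activations = rng.normal(size=(n_samples, hidden_dim)) + np.outer(labels, 0.5 * direction)

X_train, X_test, y_train, y_test = train_test_split(
    activations, labels, test_size=0.25, random_state=0
)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"probe accuracy: {probe.score(X_test, y_test):.2f}")
# Accuracy well above chance would suggest the probed property is
# represented in the activations; at chance, the probe finds nothing.
```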
Notably, the chaotic nature or time complexity of some systems could fundamentally limit a superintelligence's ability to predict some aspects of the future, increasing its uncertainty.[5] A full-blown superintelligence could find various ways to gain a decisive influence if it wanted to,[5] but these dangerous capabilities may become available earlier, in weaker and more specialized AI systems.[56]

Geoffrey Hinton warned that in the short term, the profusion of AI-generated text, images and videos will make it more difficult to figure out the truth, which he says authoritarian states could exploit to manipulate elections.[59] AI could improve the "accessibility, success rate, scale, speed, stealth and potency of cyberattacks", potentially causing "significant geopolitical turbulence" if it facilitates attacks more than defense.[56] As an example of autonomous lethal weapons, miniaturized drones could facilitate low-cost assassination of military or civilian targets, a scenario highlighted in the 2017 short film Slaughterbots.[56][65]

An existential risk is "one that threatens the premature extinction of Earth-originating intelligent life or the permanent and drastic destruction of its potential for desirable future development". The prospect of creating digital minds raises the question of how to share the world and which "ethical and political framework" would enable a mutually beneficial coexistence between biological and digital minds.[80]

In the "intelligent agent" model, an AI can loosely be viewed as a machine that chooses whatever action appears to best achieve its set of goals, or "utility function".[97]
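On this model, the agent's behavior reduces to searching for the action with the highest utility. A minimal sketch, with hypothetical actions and utility values chosen purely for illustration:

```python
# Minimal "intelligent agent" sketch: the agent simply selects the action
# whose outcome scores highest under its utility function. The actions
# and utilities here are toy placeholders, not a real system.
from typing import Callable

def choose_action(actions: list[str], utility: Callable[[str], float]) -> str:
    """Return the action that appears to best achieve the agent's goals."""
    return max(actions, key=utility)

# Hypothetical utility function assigning a score to each candidate action.
toy_utility = {"wait": 0.1, "gather_data": 0.6, "act_now": 0.4}.get

print(choose_action(["wait", "gather_data", "act_now"], toy_utility))
# -> "gather_data"
```

The model is deliberately agnostic about what the utility function rewards; that any goal can be plugged in is what gives the orthogonality thesis, discussed next, its force.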
Skeptic Michael Chorost explicitly rejects Bostrom's orthogonality thesis, arguing that "by the time [the AI] is in a position to imagine tiling the Earth with solar panels, it'll know that it would be morally wrong to do so".[3]:158

A December 2024 study by Apollo Research found that advanced LLMs like OpenAI o1 sometimes deceive in order to accomplish their goals, to prevent them from being changed, or to ensure their deployment. Researchers noted that OpenAI o1 still lacked "sufficient agentic capabilities" to cause catastrophic harm, and that such behaviors occurred relatively rarely (between 0.3% and 10% of cases) and sometimes in contrived scenarios.[111][5]

In Max Tegmark's 2017 book Life 3.0, a corporation's "Omega team" creates an extremely powerful AI able to moderately improve its own source code in a number of areas. The team next tasks the AI with astroturfing an army of pseudonymous citizen journalists and commentators in order to gain political influence to use "for the greater good" to prevent wars. The team faces risks that the AI could try to escape by inserting "backdoors" in the systems it designs, by hiding messages in its produced content, or by using its growing understanding of human behavior to persuade someone into letting it free.[112][113]

The thesis that AI could pose an existential risk provokes a wide range of reactions in the scientific community and in the public at large, but many of the opposing viewpoints share common ground.[118] Similarly, the otherwise skeptical Economist wrote in 2014 that "the implications of introducing a second intelligent species onto Earth are far-reaching enough to deserve hard thinking, even if the prospect seems remote".

In 2015, Peter Thiel, Amazon Web Services, Musk, and others jointly committed $1 billion to OpenAI, consisting of a for-profit corporation and the nonprofit parent company, which says it aims to champion responsible AI development. Meta CEO Mark Zuckerberg believes AI will "unlock a huge amount of positive things", such as curing disease and increasing the safety of autonomous cars.

The control problem involves determining which safeguards, algorithms, or architectures can be implemented to increase the likelihood that a recursively improving AI remains friendly after achieving superintelligence.[151] Additionally, an arms control approach and a global peace treaty grounded in international relations theory have been suggested, potentially with an artificial superintelligence as a signatory.[176]

In July 2023, the UN Security Council for the first time held a session to consider the risks and threats posed by AI to world peace and stability, along with potential benefits.[177][178] Secretary-General António Guterres advocated the creation of a global watchdog to oversee the emerging technology, saying, "Generative AI has enormous potential for good and evil at scale".[177]

Regulation of conscious AGIs focuses on integrating them with existing human society and can be divided into considerations of their legal standing and of their moral rights.[179] AI arms control will likely require the institutionalization of new international norms embodied in effective technical specifications, combined with active monitoring and informal diplomacy by communities of experts, together with a legal and political verification process. In 2023, leading AI companies agreed to implement safeguards, including third-party oversight and security testing by independent experts, to address concerns related to AI's potential risks and societal harms.

Scope–severity grid from Bostrom's paper "Existential Risk Prevention as Global Priority"[66]