AI Extinction Risk
We are on track to kill all humans with superintelligent AI systems.
I do not mean it indirectly, as in "AI will trigger World War III, which will then kill 80% of the global population".
I mean it literally. I am talking about human extinction from AI.
My main priority at the moment is to avert this.
Top academics and AI CEOs have warned us about human extinction risks from AI.
They have signed the following public statement:
Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.
To make it clear that they are not talking about abstract risks, they specifically mention pandemics and nuclear war as other societal-scale risks.
You can find the full list of signatories here. Most notably, it includes the CEOs of DeepMind, OpenAI and Anthropic, the three corporations at the forefront of AI research.
From my point of view, the situation is more dire than the statement suggests. It is not "risks" of extinction from AI. It is that the default course of action is to build superintelligent AI systems that kill us all.
My main argument is outlined in this article.
This is not a long-term concern. This is a short-term emergency.
Sam Altman, CEO of OpenAI, mentions on a new page on his site that superintelligence may be built in a few thousand days (i.e. less than 10 years).
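To put "a few thousand days" in perspective, here is a back-of-the-envelope conversion. The 2,000-3,500 day range below is my own reading of "a few thousand", not a figure Altman gives:

```python
# Back-of-the-envelope: convert "a few thousand days" into years.
# The 2,000-3,500 day range is an assumption, not Altman's own figure.
DAYS_PER_YEAR = 365.25

for days in (2_000, 3_000, 3_500):
    print(f"{days:,} days ≈ {days / DAYS_PER_YEAR:.1f} years")

# Output:
# 2,000 days ≈ 5.5 years
# 3,000 days ≈ 8.2 years
# 3,500 days ≈ 9.6 years
```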
Shane Legg, cofounder of DeepMind, estimates in a recent Dwarkesh podcast that we will reach AGI by 2028 (less than 5 years from now!). This is consistent with his long-standing prediction of AGI by the mid-2020s.
Dario Amodei, CEO of Anthropic, said in a November 2024 Lex Fridman podcast (conveniently placed at the beginning of the video) that it makes sense to expect AGI by 2026-2027.
The reason why extinction is the default outcome is that companies are in charge of AI research and of deciding what to do with it. And as it stands, companies have no direct plan for making superintelligent AI systems, which can outsmart us, not kill us.
Their current plan is a leap of faith: that regular AI systems will be able to "align" more powerful ones, all the way up to superintelligence. This is based on neither theory nor evidence, as no one has built a superintelligent AI system yet.
OpenAI called this problem "superalignment" (as in "aligning superintelligence") in their article from July 2023.
They write there:
Currently, we don't have a solution for steering or controlling a potentially superintelligent AI, and preventing it from going rogue.
Our goal is to build a roughly human-level automated alignment researcher. We can then use vast amounts of compute to scale our efforts, and iteratively align superintelligence.
For DeepMind, the best I could find publicly is their "Responsibility & Safety" page, which is literally just slop. If you can get some clear information from it, please let me know. As far as I can tell, they have no plan for superalignment at all.
Anthropic is much more weaselly: their website mentions neither AGI (which even DeepMind does!), nor extinction, nor superalignment. The closest thing I could find is this article on their website from March 2023.
They clearly state:
We do not know how to train systems to robustly behave well. So far, no one knows how to train very powerful AI systems to be robustly helpful, honest, and harmless.
And then, when they describe their approach, it is just slop about being "empirical".
Fortunately, Chris Olah (a co-founder of Anthropic) wrote some clarifying thoughts in June 2023. He confirms that Anthropic does not know whether superalignment is hard or not. As he says:
It would be very valuable to reduce uncertainty about the situation.
If we were confidently in an optimistic scenario, priorities would be much simpler.
If we were confidently in a pessimistic scenario (with strong evidence), action would seem much easier.
In the meantime, while they do not know, what is their plan? Their plan is to act as if superalignment were easy and to keep racing.
The situation is fucked.
We are facing AGI in the next few years, superintelligence in possibly less than 10 years, and extinction from it after that.
Right now, Big Tech companies have free rein over AI research. They can do whatever they want, and they are not held accountable for their actions.
They do not face anything close to the regulation of nuclear or biological technology. Nor even that of the chemical or pharmaceutical industries. Not even that of the automotive industry. And finally, not even that of hair products and sandwiches.
If you want to deeply understand the situation and how we got here, we have written a report about it.
To summarise:
- Shane Legg (DeepMind): AGI by 2028
- Sam Altman (OpenAI): superintelligence in less than 10 years
- Dario Amodei (Anthropic): AGI by 2026-2027
- All of them: extinction risk from AI
- Me: default course is extinction from AI
This page is not about a solution; it is about establishing the problem.
If you want a solution, you can read a separate policy proposal. In the short term, this policy proposal amounts to "stop AGI research ASAP, you dumb fucks", but with more thought and legalese.