OpenAI forms new Superalignment team to stop AI going rogue
The new team's goal: steer and control AI systems much smarter than us.
OpenAI is assembling a team of top machine learning researchers, dubbed 'Superalignment', to tackle one of the biggest challenges in the development of superintelligent AI: ensuring it does not go rogue and threaten humanity.
The San Francisco AI lab announced plans for the new research team, saying superintelligent AI is likely to arrive this decade and would be the "most impactful technology" ever created. However, OpenAI warns that without proper alignment and control methods, a superintelligent AI could pose risks to human existence.
OpenAI currently has no method for steering or controlling a superintelligent AI that would be far smarter and faster than humans. The new Superalignment team aims to close this gap, with four objectives:
Developing scalable training methods
Validating the resulting AI model
Stress testing the alignment pipeline
Aligning the eventual superintelligent system with human values and interests
The team will have access to 20% of the compute OpenAI has secured to date and a four-year timeline to tackle superintelligence alignment. Although OpenAI acknowledges the odds are long, it describes the Superalignment effort as its "chief bet" for mitigating the risks posed by the technology.
OpenAI stresses that a superintelligent AI would not be inherently malicious, but that without proper alignment and control it would still pose safety risks. The lab is recruiting top machine learning talent, including research engineers, scientists and managers, to join the new team.