Some interesting points on the potential controllability of AI relative to humans, and why this should make us less worried about doom scenarios:
https://optimists.ai/2023/11/28/ai-is-easy-to-control/
These days, many people are worried that we will lose control of artificial intelligence, leading to human extinction or a similarly catastrophic “AI takeover.” We hope the arguments in this essay make such an outcome seem implausible. But even if future AI turns out to be less “controllable” in a strict sense of the word (simply because, for example, it thinks faster than humans can directly supervise), we also argue it will be easy to instill our values into an AI, a process called “alignment.” Aligned AIs, by design, would prioritize human safety and welfare, contributing to a positive future for humanity, even in scenarios where they, say, acquire the level of autonomy current-day humans possess.
In what follows, we will argue that AI, even superhuman AI, will remain much more controllable than humans for the foreseeable future. Since each generation of controllable AIs can help control the next generation, it looks like this process can continue indefinitely, even to very high levels of capability. Accordingly, we think a catastrophic AI takeover is roughly 1% likely: a tail risk worth considering, but not the dominant source of risk in the world. We will not attempt to directly address pessimistic arguments in this essay, although we will do so in a forthcoming document. Instead, our goal is to present the basic reasons for being optimistic about humanity’s ability to control and align artificial intelligence into the far future.
And here is what I think is a pretty thoughtful response:
https://www.lesswrong.com/posts/Yyo...-on-ai-is-easy-to-control-by-pope-and-belrose

