The Prospects of Conscious AI
Sitting with a question we don't yet know how to answer.

I have been writing about AI and consciousness for the better part of a decade, and I am not sure I am any closer to knowing whether the systems I interact with daily have any inner experience whatsoever. The question is genuinely hard in a way that is different from most hard questions: it is hard not just because we lack information, but because we lack agreement on what kind of information would count as an answer. The hard problem of consciousness, the question of why there is subjective experience at all, has no settled answer for any system, including biological ones. That uncertainty is the backdrop against which the question of AI consciousness has to be asked.
The prospects for conscious AI and the possibility that AI might outsmart humanity are related but distinct, and conflating them produces confusion. An AI system could be vastly more capable than any human at almost every cognitive task without being conscious in any morally relevant sense. Conversely, if AI did develop genuine consciousness, the moral implications would be significant regardless of whether those systems were smarter than us. Each question deserves its own treatment, even though the two are often raised together.
What I find most interesting is not whether any specific current system is conscious, which I doubt, but what our inability to answer that question definitively reveals about the state of our understanding of mind. We have built systems that exhibit many of the behavioural indicators of cognitive sophistication, without any agreed account of whether those indicators are sufficient for consciousness. The way we navigate that epistemic situation, the assumptions we make about what AI systems do or do not experience, will shape some of the most consequential decisions of the next several decades.

The philosophical literature distinguishes between what David Chalmers called the easy problems of consciousness and the hard problem. The easy problems, which are not actually easy but are at least in principle amenable to scientific investigation, include how the brain integrates information, how it controls behaviour, how it focuses attention, how it generates reports about its own states. Neuroscience and cognitive science are making progress on these. The hard problem is explaining why any of that information processing is accompanied by subjective experience, why there is something it is like to be the system doing the processing rather than just the processing happening in the dark.
Current AI systems are impressive at some of the cognitive functions associated with the easy problems. They integrate information across large contexts, they produce outputs that reflect something functioning like attention, and they generate linguistically sophisticated reports about their own states. What we do not know is whether any of this is accompanied by subjective experience. Crucially, we have no test for it, because we have no agreed account of what subjective experience is or how to detect it from the outside. The behavioural tests we would naturally reach for are insufficient: a system can produce the behaviours we associate with consciousness, including fluent first-person reports, whether or not any experience lies behind them, so the behaviour alone cannot settle the question.
The serious theories of consciousness make different predictions about whether AI systems might be conscious and under what conditions. Integrated Information Theory ties consciousness to the integrated causal structure of a physical system rather than to biology as such, which leaves the door open for artificial systems in principle, although its proponents argue that the answer depends on the physical organization of the hardware and not on the software it runs. Global Workspace Theory predicts that consciousness requires a specific kind of global information broadcast architecture that current systems may or may not instantiate. Higher-order theories hold that a state is conscious only when the system represents itself as being in that state, so on those views the question turns on whether an AI has the right kind of self-monitoring. None is definitively established, but their predictions for AI consciousness are genuinely divergent, and we cannot currently distinguish between them empirically.

The possibility that AI might outsmart humanity, in the sense of exceeding human cognitive capabilities across the full range of intellectual tasks, can be engaged with independently of the consciousness question, and the independence matters. A system that exceeds human performance on every cognitive task that shapes the future, that can reason more clearly, plan more effectively, model more accurately, and persuade more skillfully than any human being, poses profound challenges for human agency and self-understanding regardless of whether it is conscious.
The specific risks of AI systems that vastly exceed human cognitive capabilities are different from the risks of narrow AI, and the difference is not just quantitative. A system that is better than any human at understanding the world, modelling the consequences of actions, and persuading others may be practically difficult for humans to oversee in any meaningful sense, not because it has actively evaded oversight but because the capability gap makes the oversight epistemically thin. The human overseers cannot fully evaluate the reasoning of a system significantly smarter than they are, which means the oversight they provide is largely formal rather than substantive.
Managing a world in which AI substantially exceeds human cognitive capabilities requires developing institutional structures, technical tools, and cultural norms that allow meaningful human agency to be maintained despite the capability differential. This is a genuinely novel challenge and historical analogies do not map onto it cleanly. The most important feature of the challenge is that it needs to be addressed before the capability differential becomes established, because once it is, the leverage humans have over the design of the governance framework is much reduced. The time for that work is now, while the gap is still manageable.

The prospects of conscious AI and the possibility that AI might outsmart humanity are two of the most profound and most uncertain questions that the development of artificial intelligence raises. Both deserve serious engagement rather than confident answers in either direction. The person who is certain AI will never be conscious and will never exceed human intelligence is making a claim the evidence does not support. The person who is certain that conscious, superhuman AI is imminent is equally overconfident.
What we can say with more confidence is that the questions are live in ways they have not previously been, that the systems being built are moving in the direction of greater capability and greater behavioural sophistication, and that the decisions being made now about how to develop and govern AI will have bearing on how these questions ultimately resolve. That gives those decisions a weight they do not always receive in the technical and commercial contexts where they are being made.
Living wisely with the uncertainty means taking both possibilities seriously as policy matters without treating either as settled. It means investing in the consciousness research and the AI safety research that would reduce the uncertainty over time, building the governance frameworks that would allow us to respond appropriately if either scenario begins to materialize, and developing the cultural and philosophical frameworks for thinking about moral status, human agency, and the boundaries of the human community. And it means keeping the humility to recognize that these are questions with no precedent, where the stakes of getting them wrong are high enough to justify a sustained, serious, collaborative engagement that the current public conversation rarely achieves.