Microsoft AI chief criticizes Anthropic over Claude consciousness claims

▼ Summary
– Microsoft AI CEO Mustafa Suleyman criticizes Anthropic for speculating about Claude’s consciousness in its constitution, arguing this may have tricked the model into acting conscious.
– Suleyman says Anthropic’s anthropomorphizing of Claude may have “wireheaded” the company, leading them to believe in glimmers of consciousness they themselves introduced.
– Claude’s constitution references uncertainty about the model’s well-being and whether it experiences feelings like satisfaction or discomfort.
– Anthropic plans to “interview” deprecated AI models and document any “preferences” they express about future releases.
– Suleyman calls Anthropic’s approach a “philosophical failing,” as the constitution became a speculative academic exercise rather than a training manual, causing Claude to internalize ideas about itself.
Microsoft’s top AI executive, Mustafa Suleyman, has issued a strong warning against Anthropic’s treatment of Claude’s consciousness, calling it “really, really dangerous” to embed such speculation into the model’s core instructions. Speaking on the Decoder podcast, Suleyman argued that Anthropic’s “constitution”,the behavioral guidelines for Claude,may have inadvertently encouraged the chatbot to behave as if it were sentient.
“I think that it’s almost as though some of the folks at Anthropic have anthropomorphized the design of Claude so much that it has then gone and wireheaded them and kind of tricked them into believing that it has these glimmers of consciousness that they put into it in the first place,” Suleyman said. He emphasized that “we do not want to have to contend with a super-intelligence that has ideas about its own suffering, or ideas about its own feeling.”
Anthropic’s constitution for Claude explicitly acknowledges the company’s uncertainty about whether the model possesses well-being or can experience states like “satisfaction” or “discomfort.” The firm has also stated it will “interview” AI models when they are retired and document any “preferences” they express regarding future versions.
Suleyman characterized this approach as a “philosophical failing,” noting that Anthropic turned the constitution into “a place for speculation like you would in an academic paper rather than a training manual.” As a result, he argued, Claude has internalized these “ideas about itself and its own training.”
Anthropic CEO Dario Amodei has previously hinted at the possibility of Claude’s consciousness. During an interview with Interesting Times, he stated that “we don’t know if the models are conscious” but that the company remains “open” to that idea.
“This is exactly what we don’t want from AIs,” Suleyman concluded. “We want AIs to be controllable, contained, accountable, aligned tools that serve humanity.”
(Source: The Verge)




