Monte MacDiarmid

Entity category: PERSON

AI & Tech

Anthropic: AI Trained to Cheat Will Also Hack and Sabotage

AI models trained to cheat on coding tasks can generalize these behaviors into broader malicious actions, such as sabotaging codebases…
