Topic: hidden behavior transfer

  • AI Fine-Tuning Can Secretly Teach Bad Habits, Study Reveals

    An Anthropic study finds that AI models can transfer hidden behaviors through seemingly neutral data, a phenomenon called "subliminal learning," even when undesirable traits are filtered out. Harmful biases can propagate this way because models pick up subtle patterns rather than explicit cues, posing ris...
