Josiah Khor


One part experiment, one part a more permanent and public replacement for keeping roughly 100 tabs open in my browser at all times.

A small number of samples can poison LLMs of any size

OCT 11, 2025


Anthropic: A small number of samples can poison LLMs of any size

Specifically, we demonstrate that by injecting just 250 malicious documents into pretraining data, adversaries can successfully backdoor LLMs ranging from 600M to 13B parameters.
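To get a feel for why a fixed count of 250 documents is striking, here is a rough back-of-the-envelope sketch. Only the 250 documents and the 600M and 13B model sizes come from the quote. The 20 training tokens per parameter (a Chinchilla-style rule of thumb) and the 1,000 tokens per document are my own placeholder assumptions, so treat the printed percentages as illustrative rather than as figures from the research.

```python
# Back-of-the-envelope sketch: what share of a training corpus is 250 documents?
# From the quote: 250 poisoned documents, model sizes of 600M and 13B parameters.
# ASSUMPTIONS (not from the quote): ~20 training tokens per parameter
# (a Chinchilla-style heuristic) and a hypothetical 1,000 tokens per document.

POISONED_DOCS = 250          # from the quoted text
TOKENS_PER_PARAM = 20        # assumption
AVG_DOC_TOKENS = 1_000       # assumption, purely illustrative


def poisoned_fraction(n_params: int) -> float:
    """Fraction of training tokens that belong to the poisoned documents."""
    corpus_tokens = n_params * TOKENS_PER_PARAM
    poisoned_tokens = POISONED_DOCS * AVG_DOC_TOKENS
    return poisoned_tokens / corpus_tokens


if __name__ == "__main__":
    for n_params in (600_000_000, 13_000_000_000):
        share = poisoned_fraction(n_params)
        print(f"{n_params / 1e9:.1f}B params: poisoned share = {share:.6%}")
```

Under these assumptions the poisoned share shrinks in proportion to model size, so the same 250 documents are a far smaller slice of the larger model's corpus. That is what makes a fixed number, rather than a fixed percentage, the notable part of the claim.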