Frontier alignment research to ensure the safe development and deployment of advanced AI systems.
Exposing the Systematic Vulnerability of Open-Weight Models to Prefill Attacks