Writings
Samurais & Evil Robots: Adversarial Direction Ablation in LLMs
A mechanistic interpretability study of cultural framing, refusal-direction geometry, and the limits of multi-axis ablation.
Research
2026-05-18