Donald Ye

AI Interpretability Researcher · New York

I build tools to understand what language models actually do when they reason: not what their weights suggest, but what causally drives their outputs.

I develop mechanistic methods that move beyond gradient-based attribution to identify what models are genuinely doing when they produce outputs, and where that process breaks down. The goal: interpretability tools grounded in causal rather than correlational evidence, so AI reasoning can be audited, trusted, and improved.

Currently at The Associated Press (Election Data Engineer) and Algoverse AI Research, mentored by Jonas Rohweder at LMU Munich.


Research Interests
  • Mechanistic Interpretability
  • Causal Attribution in Transformers
  • Chain-of-Thought Faithfulness
  • Training Dynamics
  • AI Safety & Auditability
Education
  • M.S. Computer Science, TBD (Fall 2026)
  • B.Sc. Mathematics & Computer Science, Fordham University

Selected Work
Publications & Preprints