I’m an AI safety researcher focused on few-shot catastrophe prevention — understanding what happens when we catch AI misbehavior and how best to use that information to prevent future catastrophes.

Research Focus

My research asks: When we catch AI doing something dangerous, what’s the most effective way to use that information?

I’m building systematic benchmarks to compare different prevention techniques (probes, fine-tuning, monitoring) and understand the fundamental tradeoffs between control and alignment.

Current Work

I’m currently working on Paper 1: A benchmark framework for evaluating catastrophe prevention techniques. This work builds on pilot studies that validated the research direction and identified a 4-paper research arc.

Background

  • AI Safety Research (Few-Shot Catastrophe Prevention)
  • Teaching Fellow at University of Warwick
  • SFHEA (Senior Fellow of the Higher Education Academy) candidate

Contact
This site shares research progress, pilot studies, and mini-articles on AI safety. All work is pre-publication and subject to revision.