Mark Henry

markhenry.software@gmail.com

Hi. I'm looking for work as an AI alignment research engineer or ML engineer.

Repro: Network Dissection (Bau et al., 2017)

Reproduced a mechanistic interpretability technique for automatically identifying channels that detect visual concepts. PyTorch, convolutional networks.

I reproduced a classic mech interp paper

Reproduced the original deconvolutional visualization technique from Zeiler and Fergus (2013). PyTorch, computer vision.

Can I implement a transformer by just reading the paper?

Built a transformer from scratch (43M parameters), reaching a perplexity of 3 on WikiText. PyTorch, attention mechanisms, training optimization.

SPAR Summer 2024: Steering for Censorship of LLM Cognition

Co-authored research on activation steering in LLMs. Created contrastive datasets, ran benchmarks on Gemma 2 and Llama models. Python, HuggingFace, GPU computing.

Steering Gemma 2 with nrimsky's CAA

Extended an open-source activation steering library to support Gemma 2 and Llama 3. Python, model architectures, layer-wise interventions.

SPAR Spring 2024: Evaluating Stability of Unreflective Alignment

Co-authored a research paper on LLM alignment stability. Designed multi-armed bandit experiments and built an evaluation framework. Python, OpenAI API, Jupyter.