Research collective · infrastructure · public data audit
Data Provenance Initiative
A 50+ member research initiative auditing the licensing, attribution, consent, and transparency of the data that powers AI systems.

Bio
I am a Member of Technical Staff at Anthropic and an MIT PhD studying data for AI systems, evaluation, open models, and AI's public impact.
I founded the Data Provenance Initiative. More broadly, my work moves between technical AI research, empirical audits, and public arguments about how AI should be built, measured, and governed. It has received five best or outstanding paper awards and coverage in NYT, WaPo, The Atlantic, and MIT Tech Review.
2026 · Joined Anthropic as Member of Technical Staff.
2026 · Open model ecosystem data featured in the Stanford AI Index Report.
2025 · ATLAS released, with practical scaling laws for multilingual transfer.
2025 · FlexOlmo accepted to NeurIPS as a Spotlight and featured in Wired.
2025 · Leaderboard Illusion accepted to NeurIPS and covered by TechCrunch, Ars Technica, 404 Media, and others.
Dashboards, audits, reports, and papers built for technical scrutiny and public use.
Live dashboards · open intelligence · concentration of power
Empirical work on open model economies, open-weight model diffusion, and the institutions shaping global AI capability access.
Flaw disclosure · safe harbor · accountability
Research and policy work arguing for robust independent AI evaluation, coordinated disclosure, and legal protections for public-interest auditing.