Hey, I’m a PhD Candidate at the MIT Media Lab. My research focuses on training and evaluating large language models, as well as their social impact and governance.
News:
2023.10: Launched the Data Provenance Initiative, covered by the Washington Post and VentureBeat.
2023.09: New paper on the Foundation Model Transparency Index, covered by the NYT, The Atlantic, and VentureBeat.
2023.05: New paper on A Pretrainer’s Guide to Training Data.
2023.01-05: Invited talks on ‘Effective Instruction Tuning: Data, Methods, & New Abilities’ at Apple, Oracle, Kailua Labs, Databricks, and Amazon.
2023.02-06: Co-instructor for MIT’s Generative AI course MAS.S68.
2023.03: Co-lead for Cohere for AI’s (C4AI) community research effort on multilingual instruction tuning.
2023.01: New paper on The Flan Collection. See the Google AI Blog post.
2022.10: New paper on Scaling Instruction-Finetuned Language Models. See the 7-minute video.