Eddie Zhang

San Francisco, CA

I study reinforcement learning for social good.


About

I am currently working at a new lab. I was previously at OpenAI, working on safety, alignment, and applying AI for social good. I dropped out of a CS PhD at Harvard. Other areas of interest include political philosophy, macroeconomics, and the principles of intelligence.

I am grateful to have worked closely with Tyna Eloundou at OpenAI, Professor Milind Tambe at Harvard, Professor Chuang Gan at MIT-IBM Watson, Professor Amy Zhang at Meta, and Professor William Wang at UCSB.


Selected Research

Why language models hallucinate

Adam Kalai, Ofir Nachum, Santosh Vempala, Edwin Zhang

  • Popularized the idea of abstention during post-training of large language models.
  • Helped argue that modern LLM evals should account for expressions of uncertainty, so that they incentivize mitigating hallucinations.

OpenAI Blog, 2025

Collective alignment: public input on our Model Spec

*Tyna Eloundou, *Mitchell Gordon, *Edwin Zhang, Sandhini Agarwal

  • Surveyed people in more than 19 countries to gather feedback on OpenAI's Model Specification, a document outlining the ideal behavior of OpenAI's models
  • Integrated changes gathered from this collective feedback into the specification.

OpenAI Blog, 2025

Transcendence: Generative Models Can Outperform The Experts That Train Them

Edwin Zhang, Vincent Zhu, Naomi Saphra, Anat Kleiman, Benjamin L. Edelman, Milind Tambe, Sham Kakade, Eran Malach

  • Demonstrated theoretically and empirically that generative models can outperform the experts that train them through low-temperature sampling
  • Ran experiments in three domains: a toy Gaussian setting, chess, and NLP (SQuAD v2)

NeurIPS 2024

Towards Generalist Agents Through Scaling Offline Reinforcement Learning

Edwin Zhang

  • Introduced new perspectives on pursuing Artificial General Intelligence (AGI) under the modern data-driven regime
  • Proposed a computability hypothesis regarding the potential and limits of applying RL to the real world

Master's Thesis

Education

Harvard University

PhD Student
Computer Science

September 2023 - Present

University of California Santa Barbara

Master of Science
Computer Science

Led three disparate research projects: CFPI, LCD, and an unreleased hierarchical RL project
Teaching Assistant for CS 165B (Machine Learning)

June 2022 - June 2023

University of California Santa Barbara

Bachelor of Science
Computer Science

GPA: 3.96

High Honors
Regents Scholar (top 2.5% of school)
Relevant coursework: Convex Optimization, Game Theory, Advanced Linear Algebra, Differential Geometry, Statistical Machine Learning, Special Topics in Deep Learning

September 2019 - June 2022

Employment History

Safety Research

OpenAI
July 2024 - Present

Visiting Researcher

MIT-IBM Watson AI

  • Started an ongoing hierarchical RL project with Chuang Gan, with the potential to solve extremely long-horizon control problems such as crafting a diamond in Minecraft
  • Won 3rd place out of 19 in the Interactive Grounded Language Understanding (IGLU) Challenge at NeurIPS, receiving a $1,500 cash prize

December 2022 - May 2023

Research Intern

Meta

  • Proposed, analyzed, and deployed a new group page configuration that reduced misinformation by 4%, improving the experience of 3 million daily active users
  • Created a new Facebook post ranking model with a 17% gain on offline engagement area under the curve (AUC) metrics
  • Started work on Language Control Diffusion with Amy Zhang

July 2022 - September 2022

Computer Vision and Software Engineering Intern

Plato Systems

  • Developed a multi-view calibration pipeline using planar homographies and OpenCV.
  • Created the setup process and capture script for the NVIDIA Jetson platform, working with multiple third-party imaging providers.
  • Designed and led benchmarking of several candidate imaging devices in low-light, high-light, and no-light settings.
  • Refactored and contributed to the primary user-facing web application, built with VueJS and Express.

June 2021 - June 2022

Lead Full Stack Engineer/First Hire

Allthenticate

  • Led development of the cloud platform at an early-stage startup, collaborating directly with the CEO to architect and implement a proprietary API.
  • Took complete responsibility for each step of development and delivered a full web application, while leading two other interns on the same project and teaching them advanced Vue.js.
  • Built a 27,000-line Python backend and deployed it on Elastic Beanstalk, implementing a Dockerized development process that sped up iteration cycles by 25%.
  • Gained experience with emerging web technologies such as JWT, ProtoBuf, and Nuxt.js.

January 2020 - June 2021

Founder and Lead Tutor

Yaitea

  • Identified a need to teach children coding and critical thinking, as demand for programming skills rose and traditional tutoring services struggled to keep up
  • Built lasting relationships with several students and their parents
  • Learned basic marketing on the fly and applied it to give sales pitches for the tutoring service
  • Organized an extensive programming curriculum of 24 lessons
  • Taught over 200 hours of coding and critical thinking to students
  • Gained comprehensive experience with Google Cloud, Nginx, WordPress, and frontend web development by building the tutoring business's website, yaitea.com

August 2018 - September 2019

Projects

AlphaGo Zero Reimplementation

Graph Theory w/ UCSB

BERT Lecture Summarization

Predicting Winners in League

3D graphics with React

It's like LinkedIn but Tinder

Green Uber

Connecting HS Students w/ College Students


Invited Talks & Teaching

Teaching is one of my passions. I really love it.

Invited Talks
Teaching
  • UCSB CMPSC 16 (C++) Learning Assistant, Winter '20
  • UCSB CMPSC 165B (ML) Teaching Assistant, Spring '23

Interests

I really like learning, and thinking about learning. I like spending time with people even more.

I also like to run and play tennis.


Awards & Certifications

  • Google Cloud Research Grant ($87,200), 2023
  • Harvard AI Safety Technical Fellowship, 2023
  • Harvard Effective Altruism Precipice Fellowship, 2023
  • Winner of MIT Energy Hackathon, Foothill Ventures/Koidra Division, 2023 (slides)
  • Second out of 10 in Amazon Alexa Simbot Challenge ($100,000 cash prize), 2023
  • Third out of 19 in Interactive Grounded Language Understanding (IGLU) Challenge at NeurIPS ($1,500 cash prize), 2022
  • First out of 16 in React Category at SBhacks, 2022
  • UCSB Distinction in the Major: Research Track, 2022
  • UCSB High Honors (Top 8.5% at graduation), 2022
  • First out of 78 in Startup Category at SDhacks, 2021
  • Best use of Google Cloud out of 71 at SBhacks, 2021
  • First overall out of 6 at Santa Barbara Startup Weekend, 2020
  • First out of 70 in Database Category at SBhacks, 2020
  • Second out of 85 in AI classification competition at UCSB, 2020
  • Google Cloud Cybersecurity Grant Winner ($1,000), 2020
  • Regents Scholar UCSB ($24,000), 2019
  • Google Cloud Startup Grant Winner ($3,000), 2019
  • AP Scholar with Distinction, 2019

Miscellaneous

On credit assignment

The credit assignment problem is an extremely interesting problem that appears in Reinforcement Learning and AI in general. Let's say that I play a game of chess, and make n moves in succession. At the end of the game, I get just one discrete feedback signal: the outcome of the game. How does one attribute the importance of each move to the outcome of the game? This is the credit assignment problem. For a more in-depth introduction to the topic I would recommend this paper from Minsky, starting from part 3 on page 10.
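To make the chess example concrete, here is a minimal sketch of one simple heuristic: spread the single end-of-game outcome back over the moves, giving later moves more weight through a discount factor. This is only an illustration under stated assumptions, not a solution to the problem and not the method from the Minsky paper; the function name `discounted_credit` and the parameter `gamma` are hypothetical names chosen for this example.

```python
def discounted_credit(num_moves: int, outcome: float, gamma: float = 0.9) -> list[float]:
    """Give each move a share of the final outcome.

    The move at index t (0-based) receives outcome * gamma ** (moves remaining after t),
    so the last move gets the full outcome and earlier moves get less.
    With gamma = 1.0, every move receives equal credit.
    """
    return [outcome * gamma ** (num_moves - 1 - t) for t in range(num_moves)]


# A won game (outcome = 1.0) with four moves and an aggressive discount.
credits = discounted_credit(num_moves=4, outcome=1.0, gamma=0.5)
print(credits)  # [0.125, 0.25, 0.5, 1.0]
```

The heuristic assumes that moves closer to the end matter more, which is often false in chess: a blunder on move 10 can decide a game that ends on move 60. That gap between a cheap attribution rule and the true importance of each move is exactly what makes the problem hard.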

I mention this here because very little of the credit for my career should be attributed to me. I am eternally grateful to the following people for their kindness, support, and guidance. Without them, I would have nothing. In order of recency (not importance): Michael Ovitz, CJ Reim, Sam Altman, Jiachen Li, Chad Spensky, Shou Chaofan, Derren Slinde.

Student mentoring
Parentheses denote first position after mentorship. I try to work closely with students for at least half a year, and get them to either a workshop or conference paper. In the best case scenario, my students will exceed me. One of them may already have :).
*why the domain name eddie.win?
my mom used to call me 'ai da win', a
bastardization of my actual name edwin.
my friends thought this was hilarious and
so they started calling me that too:
the domain name is just a massive joke.