Eddie Zhang

150 Western Avenue · Boston, MA 02134 · ezhang [at] g.[school name].edu

I study reinforcement learning for social good.


About

I am currently a resident at OpenAI, interested in applying RL for social good. I am also a co-founder of the nonprofit Humanity Unleashed, and am on leave from a CS PhD at Harvard. Other areas of interest include alignment and the principles of intelligence.

I am currently mentored by Tong Mu and Alec Helyar at OpenAI, and work on Lillian Weng's Safety and Alignment team. I am also very fortunate to be advised by the wonderful Professor Milind Tambe, and grateful to have worked closely with Chuang Gan at MIT-IBM Watson, Amy Zhang at Meta, and William Wang at UCSB.


Selected Research

Transcendence: Generative Models Can Outperform The Experts That Train Them

Edwin Zhang, Vincent Zhu, Naomi Saphra, Anat Kleiman, Benjamin L. Edelman, Milind Tambe, Sham Kakade, Eran Malach

  • Theoretically and empirically demonstrates that generative models can outperform the experts that train them through low-temperature sampling
  • Ran experiments in three domains: a toy Gaussian setting, chess, and NLP (SQuAD v2)

NeurIPS 2024

Social Environment Design

Edwin Zhang, Sadie Zhao, Tonghan Wang, Safwan Hossain, Henry Gasztowtt, Stephan Zheng, David C. Parkes, Milind Tambe, Yiling Chen

  • Introduces a research agenda towards using Generative AI for social good
  • Sits at the intersection of EconCS, MARL, Computational Social Choice, and Mechanism Design

ICML 2024, Position Paper

Towards Generalist Agents Through Scaling Offline Reinforcement Learning

Edwin Zhang

  • Introduced new perspectives on pursuing Artificial General Intelligence (AGI) under the modern data-driven regime
  • Proposed a computability hypothesis regarding the potential and limits of applying RL to the real world

Master's Thesis

Language Control Diffusion

Edwin Zhang, Yujie Lu, William Yang Wang, Amy Zhang

  • Proposed and created language-conditioned diffusion RL models, enabling generalization in control through large language models
  • Ran several experiments comparing baselines and the proposed method on a distributed FAIR cluster through SLURM

ICLR 2024

Education

Harvard University

PhD Student
Computer Science

September 2023 - Present

University of California Santa Barbara

Master of Science
Computer Science

Led three disparate research projects: CFPI, LCD, and an unreleased hierarchical RL project
Teaching Assistant for CS 165B (Machine Learning)

June 2022 - June 2023

University of California Santa Barbara

Bachelor of Science
Computer Science

GPA: 3.96

High Honors
Regents Scholar (top 2.5% of school)
Relevant coursework: Convex Optimization, Game Theory, Advanced Linear Algebra, Differential Geometry, Statistical Machine Learning, Special Topics in Deep Learning

September 2019 - June 2022

Employment History

Resident

OpenAI
July 2024 - Feb 2025

Visiting Researcher

MIT-IBM Watson AI

  • Started work on an ongoing hierarchical RL project with Chuang Gan, with potential for solving extremely long-horizon control problems such as Minecraft diamond crafting
  • Won 3rd place out of 19 in the NeurIPS Interactive Grounded Language Understanding (IGLU) Challenge, receiving a $1500 cash prize

December 2022 - May 2023

Research Intern

Meta

  • Proposed, analyzed, and deployed a new group page configuration reducing misinformation by 4%, improving the experience of 3 million daily active users
  • Created a new Facebook post ranking model with a 17% gain on offline engagement area under the curve (AUC) metrics
  • Started work on Language Control Diffusion with Amy Zhang

July 2022 - September 2022

Computer Vision and Software Engineering Intern

Plato Systems

  • Developed a multi-view calibration pipeline using planar homographies and OpenCV.
  • Created the setup process and capture script for the NVIDIA Jetson platform with multiple third-party imaging providers.
  • Designed and led benchmarking of several potential imaging candidates in low-light, high-light, and no-light settings.
  • Refactored and contributed to the primary user-facing web application, using Vue.js and Express.

June 2021 - June 2022

Lead Full Stack Engineer/First Hire

Allthenticate

  • Led development of the cloud platform at an early-stage startup, collaborating directly with the CEO to architect and implement a proprietary API.
  • Took complete responsibility for each step of the development phase: delivered a full web application while teaching advanced Vue.js to, and leading, two other interns working on the same project.
  • Built a 27,000-line Python backend and deployed it on Elastic Beanstalk, implementing a Dockerized development process that sped up iteration cycles by 25%.
  • Gained experience with emerging web technologies such as JWT, ProtoBuf, and Nuxt.js.

January 2020 - June 2021

Founder and Lead Tutor

Yaitea

  • Identified a need for tutoring children in coding and critical thinking, as demand for programming skills rose and traditional tutoring services struggled to keep up.
  • Collaborated with several students and parents to create lasting relationships
  • Picked up basic marketing on the fly and applied it to give sales pitches for the tutoring service
  • Organized an extensive programming curriculum of 24 lessons
  • Taught over 200 hours of coding and critical thinking to students
  • Gained comprehensive experience with Google Cloud, Nginx, WordPress, and frontend web development by creating the tutoring business's website, yaitea.com

August 2018 - September 2019

Projects

AlphaGo Zero Reimplementation

Graph Theory w/ UCSB

BERT Lecture Summarization

Predicting Winners in League

3D graphics with React

It's like LinkedIn but Tinder

Green Uber

Connecting HS Students w/ College Students


Invited Talks & Teaching

Teaching is one of my passions. I really, really love it.
Invited Talks
Teaching
  • UCSB CMPSC 16 (C++) Learning Assistant, Winter '20
  • UCSB CMPSC 165B (ML) Teaching Assistant, Spring '23

Interests

I really like learning, and thinking about learning. I like spending time with people even more.

I love playing tennis (and losing miserably at it to my superior roommate), riding the BART, hating on Apple (sometimes while riding the BART), watching anime, and hunting dinosaurs. Haha just kidding on that last one

or am i?


Awards & Certifications

  • Google Cloud Research Grant ($87,200), 2023
  • Harvard AI Safety Technical Fellowship, 2023
  • Harvard Effective Altruism Precipice Fellowship, 2023
  • Winner of MIT Energy Hackathon, Foothill Ventures/Koidra Division, 2023
  • Second out of 10 in Amazon Alexa Simbot Challenge ($100,000 cash prize), 2023
  • Third out of 19 in Interactive Grounded Language Understanding (IGLU) Challenge at NeurIPS ($1500 cash prize), 2022
  • First out of 16 in React Category at SBhacks, 2022
  • UCSB Distinction in the Major: Research Track, 2022
  • UCSB High Honors (Top 8.5% at graduation), 2022
  • First out of 78 in Startup Category at SDhacks, 2021
  • Best use of Google Cloud out of 71 at SBhacks, 2021
  • First overall out of 6 at Santa Barbara Startup Weekend, 2020
  • First out of 70 in Database Category at SBhacks, 2020
  • Second out of 85 in AI classification competition at UCSB, 2020
  • Google Cloud Cybersecurity Grant Winner ($1000), 2020
  • Regents Scholar UCSB ($24000), 2019
  • Google Cloud Startup Grant Winner ($3000), 2019
  • AP Scholar with Distinction, 2019

Miscellaneous

Professional Skills
On credit assignment

The credit assignment problem is an extremely interesting problem that appears in Reinforcement Learning and AI in general. Let's say that I play a game of chess, and make n moves in succession. At the end of the game, I get just one discrete feedback signal: the outcome of the game. How does one attribute the importance of each move to the outcome of the game? This is the credit assignment problem. For a more in-depth introduction to the topic I would recommend this paper from Minsky, starting from part 3 on page 10.
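
To make the chess example concrete, here is a minimal sketch of one naive way to spread a single end-of-game outcome over n moves: weight each move by how close it was to the end of the game. The function name `discounted_credit` and the discount factor `gamma` are illustrative assumptions, not taken from the Minsky paper, and this heuristic is only one possible answer to the question rather than a solution to the problem.

```python
def discounted_credit(n_moves, outcome, gamma=0.9):
    """Split one end-of-game outcome across n_moves moves.

    Move t (0-indexed) receives outcome * gamma ** (n_moves - 1 - t),
    so moves closer to the end of the game receive more credit.
    This weighting is an assumption made purely for illustration.
    """
    return [outcome * gamma ** (n_moves - 1 - t) for t in range(n_moves)]


# A won game (outcome = 1.0) with four moves and a steep discount.
credits = discounted_credit(n_moves=4, outcome=1.0, gamma=0.5)
print(credits)  # [0.125, 0.25, 0.5, 1.0]
```

The obvious weakness is that the opening move that actually decided the game gets the least credit, which is exactly why the problem is hard.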

I mention this here because very little of my career credit should be attributed to me. I am eternally grateful to the following people for their kindness, support, and guidance. Without them, I would have nothing. In order of recency (not importance): Jiachen Li, Chad Spensky, Shou Chaofan, Derren Slinde.

Student mentoring
Parentheses denote first position after mentorship. I try to work closely with students for at least half a year, and get them to either a workshop or conference paper.
  • Vincent Zhu (Currently mentoring, 2024)
  • Henry Gasztowtt (Currently mentoring, 2024)
  • Ben Smith (Currently mentoring, 2023)
  • Matthew Ho (UCSD PhD, 2024)
  • Peiyang Song (Caltech BS, 2024)
  • Lauren Cooke (Harvard BS, 2023)
  • Shinda Huang (UCSB MS, 2023)
  • Katelyn Zhang (Google SWE, 2022)
  • Yuhao Zhang (Amazon SWE, 2021)
*why the domain name eddie.win?
my mom used to call me 'ai da win', a
bastardization of my actual name edwin.
my friends thought this was hilarious and
so they started calling me that too:
the domain name is just a massive joke.