Aloha! I'm a Lead Research Scientist at Salesforce Research, heading the AI Economist team. We use reinforcement learning and economic simulations to design economic policies that improve social welfare. My research has been covered in US media, e.g., MIT Technology Review, and international media, e.g., the Financial Times (UK), het Financieele Dagblad, and de Volkskrant (NL).

I'm broadly interested in (artificial) brains and branes: multi-agent systems, deep learning, imitation learning, reinforcement learning and AI in general. Besides that, you can always wake me up for superstrings and topological branes.

I grew up in the Netherlands (gezellig!), and have lived in the UK and the US.

I did machine learning research during my PhD (Physics, 2013-2018) at Caltech, advised by Yisong Yue. I interned at Google in 2015, improving the robustness of deep neural networks for computer vision, with Yang Song, Thomas Leung and Ian Goodfellow. In 2016, I worked on question-answering with deep learning at Google Brain with Andrew Dai and Samy Bengio.

I studied math and theoretical physics at Utrecht University from 2006 to 2011, going down the rabbit hole on superstring theory. My master's thesis covered exotic dualities and geometry in supersymmetric quantum field theory, advised by Robbert Dijkgraaf and Stefan Vandoren. I received the 2011 Lorentz Graduation Prize from the Royal Netherlands Academy of Arts and Sciences [1 2]. I spent 2010 at Harvard University, and in 2011-2012, did Part III Mathematics at the University of Cambridge. Because honestly, who doesn't want to pretend to be at Hogwarts for a little bit...


The AI Economist: Improving Equality and Productivity with AI-Driven Tax Policies
Stephan Zheng, Alexander Trott, Sunil Srinivasa, Nikhil Naik, Melvin Gruesbeck, David C. Parkes, Richard Socher.

[Press Release] [Blog] [PDF]

Using reinforcement learning to learn economic policies in AI simulations. We show that the AI Economist finds tax policies that are at least 16% better than prominent tax models. It also shows promising results in human studies, in which real people earn real money in our economic simulation.

ESPRIT: Explaining Solutions to Physical Reasoning Tasks
Nazneen Fatema Rajani, Rui Zhang, Yi Chern Tan, Stephan Zheng, Jeremy Weiss, Aadit Vyas, Abhijit Gupta, Caiming Xiong, Richard Socher, Dragomir Radev, ACL 2020.

AI can talk about physics! We train deep language models to generate descriptions of physics. 👉👉 Next stop: describing string theory :).


Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards
Alex Trott, Stephan Zheng, Richard Socher, NeurIPS 2019.

Sibling Rivalry dynamically trades off exploration and exploitation, automatically avoids local optima, and helps RL policies converge to the true optimal policy more easily.


On the Generalization Gap in Reparameterizable Reinforcement Learning
Huan Wang, Stephan Zheng, Caiming Xiong, Richard Socher, ICML 2019.

We show that under the reparameterization trick, one can derive generalization bounds for reinforcement learning.


NAOMI: Non-autoregressive Missing Value Imputation
Yukai Liu, Rose Yu, Stephan Zheng, Eric Zhan, Yisong Yue, NeurIPS 2019.


Interpolating and extrapolating sequence data is better when done non-autoregressively! NAOMI uses a multi-resolution GAN approach for significantly better trajectory generation, demonstrated on physics and multi-agent trajectory data.

Generating Multi-Agent Trajectories using Programmatic Weak Supervision
Eric Zhan, Stephan Zheng, Yisong Yue, Patrick Lucey, ICLR 2019.

[PDF] [Demo]

Long-term multi-agent trajectory extrapolation is more realistic with hierarchical recurrent VAEs that use weak supervision in the form of long-term goals. More pretty basketball play trajectories!

Detecting Adversarial Examples via Neural Fingerprinting
Stephan Zheng*, Sumanth Dathathri*, Richard Murray, Yisong Yue. (* equal contribution)

[PDF] [Code]

How can we detect adversarial examples? Neural Fingerprinting detects state-of-the-art adversarial attacks with near-perfect detection rates on MNIST, CIFAR and MiniImagenet!

Long-term Forecasting using Tensor-Train RNNs
Rose Yu*, Stephan Zheng*, Animashree Anandkumar, Yisong Yue, in submission. (* equal contribution)


How can we predict the weather in 7 days? Such long-term forecasting problems are hard, because the dynamics of the sequential data can be highly complex. We show that by explicitly modeling higher-order interactions of the dynamics using Tensor-Train RNNs, we obtain state-of-the-art long-term prediction performance on real-world datasets.

Multiresolution Tensor Learning for Efficient and Interpretable Spatial Analysis
Jung Yeon Park, Kenneth Theo Carr, Stephan Zheng, Yisong Yue, Rose Yu, ICML 2020.


Multi-resolution Tensor Learning for Large-Scale Spatial Data
Stephan Zheng, Rose Yu, Yisong Yue.


How do you quickly learn interpretable tensor models from high-resolution spatiotemporal data? We explain how to do this using multi-resolution learning, with examples on predicting when basketball players shoot at the basket and how fruit flies behave in pairs.

Structured Exploration via Deep Hierarchical Coordination
Stephan Zheng, Yisong Yue.


We demonstrate a structured exploration approach to multi-agent reinforcement learning, in which agents learn to coordinate with each other.

Novel deep learning methods for track reconstruction
HEP.TrkX project: Steven Farrell, Paolo Calafiura, Mayur Mudigonda, Prabhat, Dustin Anderson, Jean-Roch Vlimant, Stephan Zheng, Josh Bendavid, Maria Spiropulu, Giuseppe Cerati, Lindsey Gray, Jim Kowalkowski, Panagiotis Spentzouris, Aristeidis Tsaris.

[PDF] [PDF] [Link]

We evaluate modern neural networks on the task of predicting particle tracks, including forecasting uncertainty margins.

Generating Long-term Trajectories Using Deep Hierarchical Networks
Stephan Zheng, Yisong Yue, Patrick Lucey, Neural Information Processing Systems (NIPS) 2016.

[PDF] [Data]

How can models learn to move like professional basketball players? By training hierarchical policy networks using imitation learning from human demonstrations! Our paper shows how predicting a macro-goal significantly improves track generation and fools professional sports analysts.

Improving the Robustness of Deep Neural Networks via Stability Training
Stephan Zheng, Yang Song, Ian Goodfellow, Thomas Leung, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016.


Neural network predictions can be unstable and make errors, even if only small noise (such as JPEG compression noise) is present in the input. In this paper, we show a simple stochastic data augmentation technique that makes neural networks more robust to input perturbations.


Exotic path integrals and dualities, Stephan Zheng, 2011.


My (bulky!) master's thesis, in which I applied a novel class of exotic dualities to topological strings. Introduced by Edward Witten, these exotic dualities use an extension of Morse theory with complex variables to relate quantum field theories on boundary-bulk geometries, by interpreting quantum fields as Morse functions. I show that this duality can be applied to topological string theory to relate different classes of topological branes.

Screening of Heterogeneous Surfaces: Charge Renormalization of Janus Particles
Niels Boon, Elma C. Gallardo, Stephan Zheng, Eelco Eggen, Monique Dijkstra, Rene van Roij, Journal of Physics: Condensed Matter 22 (2010) 10410.


Janus particles are synthetic particles that are actively investigated for applications such as localized drug delivery in cancer treatment. In this paper, we study the electromagnetic screening effects of such particles in plasmas by (numerically) solving the non-linear Poisson-Boltzmann equation.