Aloha! I'm currently a senior research scientist at Salesforce Research in Palo Alto, CA, leading a team on reinforcement learning. I'm broadly interested in multi-agent systems, deep learning, imitation learning, reinforcement learning and AI in general. Besides that, two lifelong passions are physics and math.

Want to share something? Drop me a line!

I grew up in the Netherlands (gezellig!), and have lived in the UK and the US. Some highlights of my space-time path:

I did research on machine learning in my PhD (Physics) from 2013-2018 at Caltech, in sunny Los Angeles, advised by Yisong Yue. I primarily worked on hierarchical deep learning for spatiotemporal data and reinforcement learning.

2015: As a Google research intern, I worked on improving the robustness of deep neural networks for image classification, together with Yang Song, Thomas Leung and Ian Goodfellow.

2016: I spent another summer doing question-answering in natural language processing (NLP) with deep learning at Google Brain with Andrew Dai and Samy Bengio.

I studied math and theoretical physics (BSc, MSc) at Utrecht University from 2006-2011. I went down the rabbit hole on superstring theory and wrote my master's thesis on exotic dualities and geometry in supersymmetric quantum field theory, advised by Robbert Dijkgraaf and Stefan Vandoren.

I spent all of 2010 visiting Harvard University, where I dove deep into the intersection of geometry and physics.

In 2011-2012, I did Part III Mathematics at the University of Cambridge. Because honestly, who doesn't want to pretend to be at Hogwarts for a little bit...

machine learning

NAOMI: Non-autoregressive Missing Value Imputation
Y. Liu, R. Yu, S. Zheng, E. Zhan, Y. Yue, NeurIPS 2019.


Generating Multi-Agent Trajectories using Programmatic Weak Supervision
E. Zhan, S. Zheng, Y. Yue, P. Lucey, ICLR 2019.

[PDF] [Demo]

Detecting Adversarial Examples via Neural Fingerprinting.
S. Zheng*, S. Dathathri*, R. Murray, Y. Yue, in submission. (* equal contribution)

[PDF] [Code]

How can we detect adversarial examples? Neural Fingerprinting detects state-of-the-art adversarial attacks with near-perfect detection rates on MNIST, CIFAR and MiniImagenet!

Long-term Forecasting using Tensor-Train RNNs.
R. Yu*, S. Zheng*, A. Anandkumar, Y. Yue, in submission. (* equal contribution)


How can we predict the weather 7 days from now? Such long-term forecasting problems are hard, because the dynamics of the sequential data can be highly complex. We show that by explicitly modeling higher-order interactions of the dynamics using Tensor-Train RNNs, we obtain state-of-the-art long-term prediction performance on real-world datasets.

Structured Exploration via Deep Hierarchical Coordination.
S. Zheng, Y. Yue, in submission.


We demonstrate a structured exploration approach to multi-agent reinforcement learning, in which agents learn to coordinate with each other.

Generating Long-term Trajectories Using Deep Hierarchical Networks.
S. Zheng, Y. Yue, P. Lucey, Neural Information Processing Systems (NIPS) 2016

[PDF] [Data]

How can models learn to move like professional basketball players? By training hierarchical policy networks using imitation learning from human demonstrations! Our paper shows how predicting a macro-goal significantly improves track generation and fools professional sports analysts.

Improving the Robustness of Deep Neural Networks via Stability Training.
S. Zheng, Y. Song, I. Goodfellow, T. Leung, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016


Neural network predictions can be unstable and make errors, even if only small noise (such as JPEG compression noise) is present in the input. In this paper, we show a simple stochastic data augmentation technique that makes neural networks more robust to input perturbations.

theoretical physics

Exotic path integrals and dualities.
S. Zheng


My (rather bulky) master's thesis! I discuss a novel class of dualities that relates quantum field theories that live on adjacent geometries (such as boundary-bulk) to each other. This duality relies on a complex version of Morse theory, which relates the geometry of curved manifolds with the extremal point spectrum of their Morse functions. I show that this duality can be analogously used in topological string theory to relate different classes of topological branes to each other.

Screening of Heterogeneous Surfaces: Charge Renormalization of Janus Particles.
N. Boon, E. C. Gallardo, S. Zheng, E. Eggen, M. Dijkstra, R. van Roij, J. Phys.: Cond. Matter 22 (2010) 10410


Janus particles are synthetic particles that are actively investigated for applications such as localized drug delivery in cancer treatment. In this paper, we study the electromagnetic screening effects of such particles in plasmas by (numerically) solving the non-linear Poisson-Boltzmann equation.