Stephan Zheng

stephan.zheng [at]
CV GitHub Google Scholar Twitter


I'm a research scientist at Salesforce Research, working on machine learning, making AI work and bringing it into the real world. Broadly speaking, I work on machine learning for multi-agent systems, deep learning, imitation and reinforcement learning.

In 2018, I got my PhD in the Machine Learning group at Caltech, advised by Professor Yisong Yue. During my PhD, I was also fortunate to intern with Google Research (working with Yang Song, Thomas Leung and Ian Goodfellow) and Google Brain (working with Andrew Dai and Samy Bengio) over two summers.

Before machine learning, I studied mathematics and theoretical physics during my BSc and MSc at Utrecht University in the wonderful Netherlands, and earned an MASt from the University of Cambridge. I was also a visiting student at Harvard University. My main focus then was superstring theory and geometric structures in supersymmetric quantum field theory, the topic of my master's thesis, advised by Professor Robbert Dijkgraaf and Professor Stefan Vandoren.

What's new?

  • I successfully defended my PhD thesis! My defense presentation is available here
  • March 2018: our paper on Detecting Adversarial Examples via Neural Fingerprinting is on arXiv!
  • February 2018: New paper on multi-resolution learning for spatiotemporal tensor models on arXiv!
  • Check out the new Basketball AI Demo for our paper Generative Multi-agent Behavioral Cloning!
  • Our paper Long-term Forecasting using Tensor-Train RNNs won the best paper award at the NIPS 2017 Time-series Workshop!
  • November 2017: Talk at UCLA CS colloquium!
  • November 2017: 2 papers submitted and 5 workshop papers accepted!
  • July 2017: Talks at Beijing University, Shanghai Jiaotong University, Zhejiang University and Didi Chuxing!
  • June 2017: Connecting The Dots paper accepted.
  • June 2017: ICML workshop paper accepted.
  • October 2016: NIPS paper accepted!
  • Summer 2016: interning at Google Brain!
  • Jan 2016: CVPR paper accepted!

Machine Learning Conference Papers

Generative Multi-Agent Behavioral Cloning.
E. Zhan, S. Zheng, Y. Yue, P. Lucey, in submission.

[PDF] [Demo]

Detecting Adversarial Examples via Neural Fingerprinting.
S. Zheng*, S. Dathathri*, R. Murray, Y. Yue, in submission. (* equal contribution)

[PDF] [Code]


How can we detect adversarial examples? Neural Fingerprinting detects state-of-the-art adversarial attacks with near-perfect detection rates on MNIST, CIFAR and MiniImagenet!

Long-term Forecasting using Tensor-Train RNNs.
R. Yu*, S. Zheng*, A. Anandkumar, Y. Yue, in submission. (* equal contribution)


How can we predict the weather in 7 days? Such long-term forecasting problems are hard, because the dynamics of the sequential data can be highly complex. We show that by explicitly modeling higher-order interactions of the dynamics using Tensor-Train RNNs, we obtain state-of-the-art long-term prediction performance on real-world datasets.

Multi-resolution Tensor Learning for Large-Scale Spatial Data.
S. Zheng, Y. Yue, in submission.


How do you quickly learn interpretable tensor models from high-resolution spatiotemporal data? We show how to do this using multi-resolution learning, with examples on predicting when basketball players shoot at the basket and how fruit flies behave in pairs.

Structured Exploration via Deep Hierarchical Coordination.
S. Zheng, Y. Yue, in submission.


We demonstrate a structured exploration approach to multi-agent reinforcement learning, in which agents learn to coordinate with each other.

Novel deep learning methods for track reconstruction.
Steven Farrell, Dustin Anderson, Paolo Calafiura, Giuseppe Cerati, Lindsey Gray, Jim Kowalkowski, Mayur Mudigonda, Prabhat, Panagiotis Spentzouris, Maria Spiropoulou, Aristeidis Tsaris, Jean-Roch Vlimant, S. Zheng.


We evaluate modern neural networks on the task of predicting particle tracks, including forecasting uncertainty margins.

HEP.TrkX project: DNNs for HL-LHC online and offline tracking.
Steven Farrell, Dustin Anderson, Paolo Calafiura, Giuseppe Cerati, Lindsey Gray, Jim Kowalkowski, Mayur Mudigonda, Prabhat, Panagiotis Spentzouris, Maria Spiropoulou, Aristeidis Tsaris, Jean-Roch Vlimant, S. Zheng, Connecting The Dots / Intelligent Trackers 2017

[PDF] [Link]

We evaluate modern neural networks on the task of predicting particle tracks, including forecasting uncertainty margins.

Generating Long-term Trajectories Using Deep Hierarchical Networks.
S. Zheng, Y. Yue, P. Lucey, Neural Information Processing Systems (NIPS) 2016

[PDF] [Data]

How can models learn to move like professional basketball players? By training hierarchical policy networks using imitation learning from human demonstrations! Our paper shows how predicting a macro-goal significantly improves track generation and fools professional sports analysts.

Improving the Robustness of Deep Neural Networks via Stability Training.
S. Zheng, Y. Song, I. Goodfellow, T. Leung, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016


Neural network predictions can be unstable and make errors, even if only small noise (such as JPEG compression noise) is present in the input. In this paper, we show a simple stochastic data augmentation technique that makes neural networks more robust to input perturbations.

Machine Learning Workshop Papers

Learning chaotic dynamics using tensor recurrent neural networks.
R. Yu*, S. Zheng*, A. Anandkumar, Y. Yue, Time-series workshop NIPS 2017. Best paper award.



Generating multi-agent trajectories from expert demonstrations.
E. Zhan, S. Zheng, Y. Yue, Bayesian Deep Learning, NIPS 2017



Measuring the robustness of neural networks via minimal adversarial examples.
S. Dathathri, S. Zheng, S. Gao, R. Murray, Deep Learning: Bridging Theory and Practice, NIPS 2017



Particle Track Reconstruction with Deep Learning.
S. Farrell, ..., S. Zheng, Deep Learning in Physical Sciences, NIPS 2017


A status report on the use of deep learning for high-energy particle tracking.

Multi-Agent Counterfactual Regret Minimization for Partial-Information Collaborative Games.
M. Hartley, S. Zheng, Y. Yue, NIPS 2017


Playing poker and bridge is hard because you can't see all the cards that the other players are holding. How do you learn an optimal strategy in this case? We study multi-agent counterfactual regret minimization in the 4-player setting, where few theoretical guarantees exist.

Learning Chaotic Dynamics with Tensor Recurrent Neural Networks.
R. Yu*, S. Zheng*, Time-series workshop, ICML 2017
(* equal contribution)


Long-term forecasting is a central issue in science. We show how tensor-RNNs can do much better forecasting than other recurrent neural networks that are currently popular.

Learning Long-term Planning in Basketball Using Hierarchical Memory Networks.
S. Zheng, Y. Yue, Large-scale sports analytics workshop, KDD 2016


Learning realistic movement policies for AIs playing basketball is hard. We show how hierarchical policies can predict macro-goals dynamically and generate much more realistic behavior than 'flat' policies.

Scalable Training of Interpretable Spatial Latent Factor Models.
S. Zheng, Y. Yue, Workshop on Non-convex Optimization for Machine Learning: Theory and Practice at NIPS 2015


Spatiotemporal tensor models can scale badly with increasing resolution. We propose a novel adaptive optimization scheme to learn flexible multi-resolution tensor models.

Exotic path integrals and dualities.
S. Zheng


My (rather bulky) master's thesis! I discuss a novel class of dualities that relates quantum field theories that live on adjacent geometries (such as boundary-bulk) to each other. This duality relies on a complex version of Morse theory, which relates the geometry of curved manifolds with the extremal point spectrum of their Morse functions. I show that this duality can be analogously used in topological string theory to relate different classes of topological branes to each other.

Screening of Heterogeneous Surfaces: Charge Renormalization of Janus Particles.
N. Boon, E. C. Gallardo, S. Zheng, E. Eggen, M. Dijkstra, R. van Roij, J. Phys.: Cond. Matter 22 (2010) 10410


Janus particles are synthetic particles that are actively investigated for applications such as localized drug delivery in cancer treatment. In this paper, we study the electromagnetic screening effects of such particles in plasmas by (numerically) solving the non-linear Poisson-Boltzmann equation.