**Aloha!**
I'm currently a research scientist at Salesforce Research, working on machine learning and AI.
I'm broadly interested in multi-agent systems, deep learning, and imitation and reinforcement learning.

2013-2018: PhD (Physics), Caltech. I started doing machine learning here, advised by Professor Yisong Yue. I primarily worked on hierarchical deep learning for spatiotemporal data and reinforcement learning.

2015, 2016: As a research intern, I worked on computer vision with Google Research (with Yang Song, Thomas Leung, and Ian Goodfellow) and NLP with Google Brain (with Andrew Dai and Samy Bengio).

Two earlier and ongoing passions are **physics and mathematics**.
I spent some time studying superstring theory and geometric structures in supersymmetric quantum field theory.

2006-2011: BSc, MSc, Utrecht University (in the wonderful Netherlands). My master's thesis was advised by Professor Robbert Dijkgraaf and Professor Stefan Vandoren.

2011-2012: Part III Mathematics, University of Cambridge.

2010: Visiting student at Harvard University.

**NAOMI: Non-autoregressive Missing Value Imputation**

Y. Liu, R. Yu, **S. Zheng**, E. Zhan, Y. Yue, *in submission*.

[PDF]

**Detecting Adversarial Examples via Neural Fingerprinting.**

**S. Zheng***, S. Dathathri*, R. Murray, Y. Yue, *in submission*. (* equal contribution)

How can we detect adversarial examples? Neural Fingerprinting detects state-of-the-art adversarial attacks with near-perfect detection rates on MNIST, CIFAR and MiniImagenet!

**Long-term Forecasting using Tensor-Train RNNs.**

**R. Yu*, S. Zheng***, A. Anandkumar, Y. Yue, *in submission*. (* equal contribution)

[PDF]

How can we predict the weather in 7 days? Such long-term forecasting problems are hard, because the dynamics of the sequential data can be highly complex. We show that by explicitly modeling higher-order interactions of the dynamics using Tensor-Train RNNs, we obtain state-of-the-art long-term prediction performance on real-world datasets.

**Multi-resolution Tensor Learning for Large-Scale Spatial Data.**

**S. Zheng**, Y. Yue, *in submission*.

[PDF]

How do you quickly learn interpretable tensor models from high-resolution spatiotemporal data? We explain how to do this using multi-resolution learning, with examples on predicting when basketball players shoot at the basket and how fruit flies behave in pairs.

**Structured Exploration via Deep Hierarchical Coordination.**

**S. Zheng**, Y. Yue, *in submission*.

[PDF]

We demonstrate a structured exploration approach to multi-agent reinforcement learning, in which agents learn to coordinate with each other.

**Novel deep learning methods for track reconstruction.**

**HEP.TrkX project: DNNs for HL-LHC online and offline tracking.**

Steven Farrell, Dustin Anderson, Paolo Calafiura, Giuseppe Cerati, Lindsey Gray, Jim Kowalkowski, Mayur Mudigonda, Prabhat, Panagiotis Spentzouris, Maria Spiropoulou, Aristeidis Tsaris, Jean-Roch Vlimant, **S. Zheng**

We evaluate modern neural networks on the task of predicting particle tracks, including forecasting uncertainty margins.

**Generating Long-term Trajectories Using Deep Hierarchical Networks.**

**S. Zheng**, Y. Yue, P. Lucey, Neural Information Processing Systems (NIPS) 2016

How can models learn to move like professional basketball players? By training hierarchical policy networks using imitation learning from human demonstrations! Our paper shows how predicting a macro-goal significantly improves trajectory generation and fools professional sports analysts.

**Improving the Robustness of Deep Neural Networks via Stability Training.**

**S. Zheng**, Y. Song, I. Goodfellow, T. Leung, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016

[PDF]

Neural network predictions can be unstable and make errors, even if only small noise (such as JPEG compression noise) is present in the input. In this paper, we show a simple stochastic data augmentation technique that makes neural networks more robust to input perturbations.

**Exotic path integrals and dualities.**

**S. Zheng**

[PDF]

My (rather bulky) master's thesis! I discuss a novel class of dualities that relates quantum field theories that live on adjacent geometries (such as boundary-bulk) to each other. This duality relies on a complex version of Morse theory, which relates the geometry of curved manifolds with the extremal point spectrum of their Morse functions. I show that this duality can be analogously used in topological string theory to relate different classes of topological branes to each other.

**Screening of Heterogeneous Surfaces: Charge Renormalization of Janus Particles.**

N. Boon, E. C. Gallardo, **S. Zheng**, E. Eggen, M. Dijkstra, R. van Roij, J. Phys.: Cond. Matter 22 (2010) 10410

[PDF]

Janus particles are synthetic particles that are actively investigated for applications such as localized drug delivery in cancer treatment. In this paper, we study the electromagnetic screening effects of such particles in plasmas by (numerically) solving the non-linear Poisson-Boltzmann equation.