*Zhaoran Wang*

Optimistic Exploration with Learned Features Provably Solves Markov Decision Processes with Neural Dynamics Sirui Zheng, Lingxiao Wang, Shuang Qiu, Zuyue Fu, Zhuoran Yang, Csaba Szepesvári, Zhaoran Wang International Conference on Learning Representations (ICLR), 2023

Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes Miao Lu, Yifei Min, Zhaoran Wang, Zhuoran Yang International Conference on Learning Representations (ICLR), 2023

Latent Variable Representations for Reinforcement Learning Tongzheng Ren, Chenjun Xiao, Tianjun Zhang, Na Li, Zhaoran Wang, Sujay Sanghavi, Dale Schuurmans, Bo Dai International Conference on Learning Representations (ICLR), 2023

Offline RL without OOD Actions: Enforcing In-Sample Learning via Implicit Value Regularization Haoran Xu, Li Jiang, Jianxiong Li, Zhuoran Yang, Zhaoran Wang, Victor WK Chan, Xianyuan Zhan International Conference on Learning Representations (ICLR), 2023

Represent to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang International Conference on Learning Representations (ICLR), 2023

Adaptive Barrier Smoothing for Policy Gradient with Contact Dynamics Shenao Zhang, Wanxin Jin, Zhaoran Wang International Conference on Machine Learning (ICML), 2023

Achieving Hierarchy-Free Approximation for Bilevel Programs with Equilibrium Constraints Jiayang Li, Jing Yu, Boyi Liu, Marco Y Nie, Zhaoran Wang International Conference on Machine Learning (ICML), 2023

Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning Yulai Zhao, Zhuoran Yang, Zhaoran Wang, Jason D Lee International Conference on Machine Learning (ICML), 2023

Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments Yixuan Wang, Simon Zhan, Ruochen Jiao, Zhilu Wang, Wanxin Jin, Zhuoran Yang, Zhaoran Wang, Chao Huang, Qi Zhu International Conference on Machine Learning (ICML), 2023


RORL: Robust Offline Reinforcement Learning via Conservative Smoothing Rui Yang, Chenjia Bai, Xiaoteng Ma, Zhaoran Wang, Chongjie Zhang, Lei Han Advances in Neural Information Processing Systems (NeurIPS), 2022

Relational Reasoning via Set Transformers: Provable Efficiency and Application to MARL Fengzhuo Zhang, Boyi Liu, Kaixin Wang, Vincent YF Tan, Zhuoran Yang, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2022

A Unifying Framework of Off-Policy General Value Function Evaluation Tengyu Xu, Zhuoran Yang, Zhaoran Wang, Yingbin Liang Advances in Neural Information Processing Systems (NeurIPS), 2022

Inducing Equilibria via Incentives: Simultaneous Design-and-Play Ensures Global Convergence Boyi Liu, Jiayang Li, Zhuoran Yang, Hoi-To Wai, Mingyi Hong, Marco Y Nie, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2022

Exponential-Family Model-Based Reinforcement Learning via Score Matching Gene Li, Junbo Li, Nathan Srebro, Zhaoran Wang, Zhuoran Yang (alphabetical order) Advances in Neural Information Processing Systems (NeurIPS), 2022

Sequential Information Design: Markov Persuasion Processes and Efficient Reinforcement Learning Jibang Wu, Zixuan Zhang, Zhe Feng, Zhaoran Wang, Zhuoran Yang, Michael I Jordan, Haifeng Xu Economics and Computation (EC), 2022

Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning Chenjia Bai, Lingxiao Wang, Zhuoran Yang, Zhihong Deng, Animesh Garg, Peng Liu, Zhaoran Wang International Conference on Learning Representations (ICLR), 2022

Learning from Demonstrations: Provably Efficient Adversarial Policy Imitation with Linear Function Approximation Zhihan Liu, Yufeng Zhang, Zuyue Fu, Zhuoran Yang, Zhaoran Wang International Conference on Machine Learning (ICML), 2022

Reinforcement Learning from Partial Observations: Linear Function Approximation with Provable Sample Efficiency Qi Cai, Zhuoran Yang, Zhaoran Wang International Conference on Machine Learning (ICML), 2022

Human-In-The-Loop: Provably Efficient Preference-Based Reinforcement Learning with General Function Approximation Xiaoyu Chen, Han Zhong, Zhuoran Yang, Zhaoran Wang, Liwei Wang International Conference on Machine Learning (ICML), 2022

Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning Shuang Qiu, Lingxiao Wang, Chenjia Bai, Zhuoran Yang, Zhaoran Wang International Conference on Machine Learning (ICML), 2022

Welfare Maximization in Competitive Equilibria: Reinforcement Learning for Markov Exchange Economies Zhihan Liu, Miao Lu, Zhaoran Wang, Michael I Jordan, Zhuoran Yang International Conference on Machine Learning (ICML), 2022

Adaptive Model Design for Markov Decision Processes Siyu Chen, Donglin Yang, Jiayang Li, Senmiao Wang, Zhuoran Yang, Zhaoran Wang International Conference on Machine Learning (ICML), 2022


Provably Efficient Causal Reinforcement Learning with Confounded Observational Data Lingxiao Wang, Zhuoran Yang, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2021

A Near-Optimal Algorithm for Stochastic Bilevel Optimization via Double-Momentum Prashant Khanduri, Siliang Zeng, Mingyi Hong†, Hoi-To Wai†, Zhaoran Wang†, Zhuoran Yang† (†: alphabetical order) Advances in Neural Information Processing Systems (NeurIPS), 2021

Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning Yingjie Fei, Zhuoran Yang, Yudong Chen, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2021

Offline Constrained Multi-Objective Reinforcement Learning via Pessimistic Dual Value Iteration Runzhe Wu, Yufeng Zhang, Zhuoran Yang, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2021

Dynamic Bottleneck for Robust Self-Supervised Exploration Chenjia Bai, Lingxiao Wang, Lei Han, Animesh Garg, Jianye Hao, Peng Liu, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2021

Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic Yufeng Zhang, Siyu Chen, Zhuoran Yang, Michael I Jordan, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2021

Design-While-Verify: Correct-by-Construction Learning and Control with Verification in the Loop Yixuan Wang, Chao Huang, Zhilu Wang, Zhaoran Wang, Qi Zhu Design Automation Conference (DAC), 2021

Provably Efficient and Safe Exploration via Primal-Dual Policy Optimization Dongsheng Ding, Xiaohan Wei, Zhuoran Yang, Zhaoran Wang, Mihailo R Jovanović International Conference on Artificial Intelligence and Statistics (AISTATS), 2021

Sample Elicitation Jiaheng Wei, Zuyue Fu, Yang Liu, Xingyu Li, Zhuoran Yang, Zhaoran Wang International Conference on Artificial Intelligence and Statistics (AISTATS), 2021

Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy Zuyue Fu, Zhuoran Yang, Zhaoran Wang International Conference on Learning Representations (ICLR), 2021

Principled Exploration via Optimistic Bootstrapping and Backward Induction Chenjia Bai, Lingxiao Wang, Lei Han, Jianye Hao, Animesh Garg, Peng Liu, Zhaoran Wang International Conference on Machine Learning (ICML), 2021

Learning While Playing in Mean-Field Games: Convergence and Optimality Qiaomin Xie, Zhuoran Yang, Zhaoran Wang, Andreea Minca International Conference on Machine Learning (ICML), 2021

Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach Yingjie Fei, Zhuoran Yang, Zhaoran Wang International Conference on Machine Learning (ICML), 2021

Infinite-Dimensional Optimization for Zero-Sum Games via Variational Transport Lewis M Liu, Yufeng Zhang, Zhuoran Yang, Reza Babanezhad, Zhaoran Wang International Conference on Machine Learning (ICML), 2021


Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework Wanxin Jin, Zhaoran Wang, Zhuoran Yang, Shaoshuai Mou Advances in Neural Information Processing Systems (NeurIPS), 2020

Upper-Confidence Primal-Dual Optimization: Stochastically Constrained Markov Decision Processes with Adversarial Losses and Unknown Transitions Shuang Qiu, Xiaohan Wei, Zhuoran Yang, Jieping Ye, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2020

Can Temporal-Difference and Q-Learning Learn Representations? A Mean-Field Theory Yufeng Zhang, Qi Cai, Zhuoran Yang, Yongxin Chen, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2020

Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoffs in Regret Yingjie Fei, Zhuoran Yang, Yudong Chen, Zhaoran Wang, Qiaomin Xie Advances in Neural Information Processing Systems (NeurIPS), 2020

End-to-End Learning and Intervention in Games Jiayang Li, Jing Yu, Marco Y Nie, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2020

Provably Efficient Neural Estimation of Structural Equation Models: An Adversarial Approach Luofeng Liao, You-Lin Chen, Zhuoran Yang, Bo Dai, Mladen Kolar, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2020

Neural Policy Gradient Methods: Global Optimality and Rates of Convergence Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang International Conference on Learning Representations (ICLR), 2020

Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games Zuyue Fu, Zhuoran Yang, Yongxin Chen, Zhaoran Wang International Conference on Learning Representations (ICLR), 2020

Provably Efficient Exploration in Policy Optimization Qi Cai, Zhuoran Yang, Chi Jin, Zhaoran Wang International Conference on Machine Learning (ICML), 2020

Deep Reinforcement Learning with Robust and Smooth Policy Qianli Shen, Yan Li, Haoming Jiang, Zhaoran Wang, Tuo Zhao International Conference on Machine Learning (ICML), 2020

Breaking the Curse of Many Agents: Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning Lingxiao Wang, Zhuoran Yang, Zhaoran Wang International Conference on Machine Learning (ICML), 2020

On the Global Optimality of Model-Agnostic Meta-Learning: Reinforcement Learning and Supervised Learning Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang International Conference on Machine Learning (ICML), 2020

Generative Adversarial Imitation Learning with Neural Network Parameterization: Global Optimality and Rates of Convergence Yufeng Zhang, Qi Cai, Zhuoran Yang, Zhaoran Wang International Conference on Machine Learning (ICML), 2020


Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy Boyi Liu, Qi Cai, Zhuoran Yang, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2019

On the Global Convergence of Actor-Critic: A Case for the Linear-Quadratic Regulator with Ergodic Costs Zhuoran Yang, Yongxin Chen, Mingyi Hong, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2019

Convergent Policy Optimization for Safe Reinforcement Learning Ming Yu, Zhuoran Yang, Mladen Kolar, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2019

Statistical-Computational Tradeoffs in Single-Index Models Lingxiao Wang, Zhuoran Yang, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2019

Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy Yuan Xie, Boyi Liu, Qiang Liu, Zhaoran Wang, Yuan Zhou, Jian Peng International Conference on Learning Representations (ICLR), 2019

Accelerating Nonconvex Learning via Replica Exchange Langevin Diffusion Yi Chen, Jinglin Chen, Jing Dong, Jian Peng, Zhaoran Wang International Conference on Learning Representations (ICLR), 2019

On the Statistical Rate of Nonlinear Recovery in Generative Models with Heavy-Tailed Data Xiaohan Wei, Zhuoran Yang, Zhaoran Wang International Conference on Machine Learning (ICML), 2019

Multi-Agent Reinforcement Learning via Double-Averaging Primal-Dual Optimization Hoi-To Wai, Zhuoran Yang, Zhaoran Wang, Mingyi Hong Advances in Neural Information Processing Systems (NeurIPS), 2018

Provable Gaussian Embedding with Only One Observation Ming Yu, Zhuoran Yang, Tuo Zhao, Mladen Kolar, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2018

Contrastive Learning from Pairwise Measurements Yi Chen, Zhuoran Yang, Yuchen Xie, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2018

Minimax-Optimal Privacy-Preserving Sparse PCA in Distributed Systems Jason Ge, Zhaoran Wang, Mengdi Wang, Han Liu International Conference on Artificial Intelligence and Statistics (AISTATS), 2018

Edge Density Barriers: Computational-Statistical Tradeoffs in Combinatorial Inference Hao Lu, Yuan Cao, Zhuoran Yang, Junwei Lu, Han Liu, Zhaoran Wang International Conference on Machine Learning (ICML), 2018