Optimistic Exploration with Learned Features Provably Solves Markov Decision Processes with Neural Dynamics Sirui Zheng, Lingxiao Wang, Shuang Qiu, Zuyue Fu, Zhuoran Yang, Csaba Szepesvári, Zhaoran Wang International Conference on Learning Representations (ICLR), 2023
Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes Miao Lu, Yifei Min, Zhaoran Wang, Zhuoran Yang International Conference on Learning Representations (ICLR), 2023
Latent Variable Representations for Reinforcement Learning Tongzheng Ren, Chenjun Xiao, Tianjun Zhang, Na Li, Zhaoran Wang, Sujay Sanghavi, Dale Schuurmans, Bo Dai International Conference on Learning Representations (ICLR), 2023
Offline RL without OOD Actions: Enforcing In-Sample Learning via Implicit Value Regularization Haoran Xu, Li Jiang, Jianxiong Li, Zhuoran Yang, Zhaoran Wang, Victor WK Chan, Xianyuan Zhan International Conference on Learning Representations (ICLR), 2023
Represent to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang International Conference on Learning Representations (ICLR), 2023
Adaptive Barrier Smoothing for Policy Gradient with Contact Dynamics Shenao Zhang, Wanxin Jin, Zhaoran Wang International Conference on Machine Learning (ICML), 2023
Achieving Hierarchy-Free Approximation for Bilevel Programs with Equilibrium Constraints Jiayang Li, Jing Yu, Boyi Liu, Marco Y Nie, Zhaoran Wang International Conference on Machine Learning (ICML), 2023
Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning Yulai Zhao, Zhuoran Yang, Zhaoran Wang, Jason D Lee International Conference on Machine Learning (ICML), 2023
Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments Yixuan Wang, Simon Zhan, Ruochen Jiao, Zhilu Wang, Wanxin Jin, Zhuoran Yang, Zhaoran Wang, Chao Huang, Qi Zhu International Conference on Machine Learning (ICML), 2023
RORL: Robust Offline Reinforcement Learning via Conservative Smoothing Rui Yang, Chenjia Bai, Xiaoteng Ma, Zhaoran Wang, Chongjie Zhang, Lei Han Advances in Neural Information Processing Systems (NeurIPS), 2022
Relational Reasoning via Set Transformers: Provable Efficiency and Application to MARL Fengzhuo Zhang, Boyi Liu, Kaixin Wang, Vincent YF Tan, Zhuoran Yang, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2022
A Unifying Framework of Off-Policy General Value Function Evaluation Tengyu Xu, Zhuoran Yang, Zhaoran Wang, Yingbin Liang Advances in Neural Information Processing Systems (NeurIPS), 2022
Inducing Equilibria via Incentives: Simultaneous Design-and-Play Ensures Global Convergence Boyi Liu, Jiayang Li, Zhuoran Yang, Hoi-To Wai, Mingyi Hong, Marco Y Nie, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2022
Exponential-Family Model-Based Reinforcement Learning via Score Matching Gene Li, Junbo Li, Nathan Srebro, Zhaoran Wang, Zhuoran Yang (alphabetical order) Advances in Neural Information Processing Systems (NeurIPS), 2022
Sequential Information Design: Markov Persuasion Processes and Efficient Reinforcement Learning Jibang Wu, Zixuan Zhang, Zhe Feng, Zhaoran Wang, Zhuoran Yang, Michael I Jordan, Haifeng Xu ACM Conference on Economics and Computation (EC), 2022
Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning Chenjia Bai, Lingxiao Wang, Zhuoran Yang, Zhihong Deng, Animesh Garg, Peng Liu, Zhaoran Wang International Conference on Learning Representations (ICLR), 2022
Learning from Demonstrations: Provably Efficient Adversarial Policy Imitation with Linear Function Approximation Zhihan Liu, Yufeng Zhang, Zuyue Fu, Zhuoran Yang, Zhaoran Wang International Conference on Machine Learning (ICML), 2022
Reinforcement Learning from Partial Observations: Linear Function Approximation with Provable Sample Efficiency Qi Cai, Zhuoran Yang, Zhaoran Wang International Conference on Machine Learning (ICML), 2022
Human-In-The-Loop: Provably Efficient Preference-Based Reinforcement Learning with General Function Approximation Xiaoyu Chen, Han Zhong, Zhuoran Yang, Zhaoran Wang, Liwei Wang International Conference on Machine Learning (ICML), 2022
Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning Shuang Qiu, Lingxiao Wang, Chenjia Bai, Zhuoran Yang, Zhaoran Wang International Conference on Machine Learning (ICML), 2022
Welfare Maximization in Competitive Equilibria: Reinforcement Learning for Markov Exchange Economies Zhihan Liu, Miao Lu, Zhaoran Wang, Michael I Jordan, Zhuoran Yang International Conference on Machine Learning (ICML), 2022
Adaptive Model Design for Markov Decision Processes Siyu Chen, Donglin Yang, Jiayang Li, Senmiao Wang, Zhuoran Yang, Zhaoran Wang International Conference on Machine Learning (ICML), 2022
Provably Efficient Causal Reinforcement Learning with Confounded Observational Data Lingxiao Wang, Zhuoran Yang, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2021
A Near-Optimal Algorithm for Stochastic Bilevel Optimization via Double-Momentum Prashant Khanduri, Siliang Zeng, Mingyi Hong†, Hoi-To Wai†, Zhaoran Wang†, Zhuoran Yang† (†: alphabetical order) Advances in Neural Information Processing Systems (NeurIPS), 2021
Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning Yingjie Fei, Zhuoran Yang, Yudong Chen, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2021
Offline Constrained Multi-Objective Reinforcement Learning via Pessimistic Dual Value Iteration Runzhe Wu, Yufeng Zhang, Zhuoran Yang, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2021
Dynamic Bottleneck for Robust Self-Supervised Exploration Chenjia Bai, Lingxiao Wang, Lei Han, Animesh Garg, Jianye Hao, Peng Liu, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2021
Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic Yufeng Zhang, Siyu Chen, Zhuoran Yang, Michael I Jordan, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2021
Design-While-Verify: Correct-by-Construction Learning and Control with Verification in the Loop Yixuan Wang, Chao Huang, Zhilu Wang, Zhaoran Wang, Qi Zhu Design Automation Conference (DAC), 2021
Provably Efficient and Safe Exploration via Primal-Dual Policy Optimization Dongsheng Ding, Xiaohan Wei, Zhuoran Yang, Zhaoran Wang, Mihailo R Jovanović International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
Sample Elicitation Jiaheng Wei, Zuyue Fu, Yang Liu, Xingyu Li, Zhuoran Yang, Zhaoran Wang International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy Zuyue Fu, Zhuoran Yang, Zhaoran Wang International Conference on Learning Representations (ICLR), 2021
Principled Exploration via Optimistic Bootstrapping and Backward Induction Chenjia Bai, Lingxiao Wang, Lei Han, Jianye Hao, Animesh Garg, Peng Liu, Zhaoran Wang International Conference on Machine Learning (ICML), 2021
Learning While Playing in Mean-Field Games: Convergence and Optimality Qiaomin Xie, Zhuoran Yang, Zhaoran Wang, Andreea Minca International Conference on Machine Learning (ICML), 2021
Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach Yingjie Fei, Zhuoran Yang, Zhaoran Wang International Conference on Machine Learning (ICML), 2021
Infinite-Dimensional Optimization for Zero-Sum Games via Variational Transport Lewis M Liu, Yufeng Zhang, Zhuoran Yang, Reza Babanezhad, Zhaoran Wang International Conference on Machine Learning (ICML), 2021
Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework Wanxin Jin, Zhaoran Wang, Zhuoran Yang, Shaoshuai Mou Advances in Neural Information Processing Systems (NeurIPS), 2020
Upper-Confidence Primal-Dual Optimization: Stochastically Constrained Markov Decision Processes with Adversarial Losses and Unknown Transitions Shuang Qiu, Xiaohan Wei, Zhuoran Yang, Jieping Ye, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2020
Can Temporal-Difference and Q-Learning Learn Representations? A Mean-Field Theory Yufeng Zhang, Qi Cai, Zhuoran Yang, Yongxin Chen, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2020
Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoffs in Regret Yingjie Fei, Zhuoran Yang, Yudong Chen, Zhaoran Wang, Qiaomin Xie Advances in Neural Information Processing Systems (NeurIPS), 2020
End-to-End Learning and Intervention in Games Jiayang Li, Jing Yu, Marco Y Nie, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2020
Provably Efficient Neural Estimation of Structural Equation Models: An Adversarial Approach Luofeng Liao, You-Lin Chen, Zhuoran Yang, Bo Dai, Mladen Kolar, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2020
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang International Conference on Learning Representations (ICLR), 2020
Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games Zuyue Fu, Zhuoran Yang, Yongxin Chen, Zhaoran Wang International Conference on Learning Representations (ICLR), 2020
Provably Efficient Exploration in Policy Optimization Qi Cai, Zhuoran Yang, Chi Jin, Zhaoran Wang International Conference on Machine Learning (ICML), 2020
Deep Reinforcement Learning with Robust and Smooth Policy Qianli Shen, Yan Li, Haoming Jiang, Zhaoran Wang, Tuo Zhao International Conference on Machine Learning (ICML), 2020
Breaking the Curse of Many Agents: Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning Lingxiao Wang, Zhuoran Yang, Zhaoran Wang International Conference on Machine Learning (ICML), 2020
On the Global Optimality of Model-Agnostic Meta-Learning: Reinforcement Learning and Supervised Learning Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang International Conference on Machine Learning (ICML), 2020
Generative Adversarial Imitation Learning with Neural Network Parameterization: Global Optimality and Rates of Convergence Yufeng Zhang, Qi Cai, Zhuoran Yang, Zhaoran Wang International Conference on Machine Learning (ICML), 2020
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy Boyi Liu, Qi Cai, Zhuoran Yang, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2019
On the Global Convergence of Actor-Critic: A Case for the Linear-Quadratic Regulator with Ergodic Costs Zhuoran Yang, Yongxin Chen, Mingyi Hong, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2019
Convergent Policy Optimization for Safe Reinforcement Learning Ming Yu, Zhuoran Yang, Mladen Kolar, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2019
Statistical-Computational Tradeoffs in Single-Index Models Lingxiao Wang, Zhuoran Yang, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2019
Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy Yuan Xie, Boyi Liu, Qiang Liu, Zhaoran Wang, Yuan Zhou, Jian Peng International Conference on Learning Representations (ICLR), 2019
Accelerating Nonconvex Learning via Replica Exchange Langevin Diffusion Yi Chen, Jinglin Chen, Jing Dong, Jian Peng, Zhaoran Wang International Conference on Learning Representations (ICLR), 2019
On the Statistical Rate of Nonlinear Recovery in Generative Models with Heavy-Tailed Data Xiaohan Wei, Zhuoran Yang, Zhaoran Wang International Conference on Machine Learning (ICML), 2019
Multi-Agent Reinforcement Learning via Double-Averaging Primal-Dual Optimization Hoi-To Wai, Zhuoran Yang, Zhaoran Wang, Mingyi Hong Advances in Neural Information Processing Systems (NeurIPS), 2018
Provable Gaussian Embedding with Only One Observation Ming Yu, Zhuoran Yang, Tuo Zhao, Mladen Kolar, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2018
Contrastive Learning from Pairwise Measurements Yi Chen, Zhuoran Yang, Yuchen Xie, Zhaoran Wang Advances in Neural Information Processing Systems (NeurIPS), 2018
Minimax-Optimal Privacy-Preserving Sparse PCA in Distributed Systems Jason Ge, Zhaoran Wang, Mengdi Wang, Han Liu International Conference on Artificial Intelligence and Statistics (AISTATS), 2018
Edge Density Barriers: Computational-Statistical Tradeoffs in Combinatorial Inference Hao Lu, Yuan Cao, Zhuoran Yang, Junwei Lu, Han Liu, Zhaoran Wang International Conference on Machine Learning (ICML), 2018