Runlong Zhou
Paul G. Allen School of Computer Science & Engineering, University of Washington
Verified email at cs.washington.edu
Title
Cited by
Year
Stochastic shortest path: Minimax, parameter-free and towards horizon-free regret
J Tarbouriech, R Zhou, SS Du, M Pirotta, M Valko, A Lazaric
Advances in neural information processing systems 34, 6843-6855, 2021
Cited by 31, Year 2021
Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes
R Zhou, R Wang, SS Du
International Conference on Machine Learning, 42698-42723, 2023
Cited by 7*, Year 2023
Sharp variance-dependent bounds in reinforcement learning: Best of both worlds in stochastic and deterministic environments
R Zhou, Z Zhang, SS Du
International Conference on Machine Learning, 42878-42914, 2023
Cited by 7, Year 2023
Understanding curriculum learning in policy optimization for solving combinatorial optimization problems
R Zhou, Y Tian, Y Wu, SS Du
arXiv preprint arXiv:2202.05423, 2022
Cited by 4*, Year 2022
Free from Bellman completeness: Trajectory stitching via model-based return-conditioned supervised learning
Z Zhou, C Zhu, R Zhou, Q Cui, A Gupta, SS Du
arXiv preprint arXiv:2310.19308, 2023
Cited by 1, Year 2023
Reflect-RL: Two-Player Online RL Fine-Tuning for LMs
R Zhou, SS Du, B Li
arXiv preprint arXiv:2402.12621, 2024
Year 2024