Runlong Zhou

Citations per year: 2021: 3, 2022: 12, 2023: 27, 2024: 8

Public access (based on funding mandates): 2 articles available, 0 articles not available

Simon Shaolei Du: Assistant Professor, School of Computer Science and Engineering, University of Washington. Verified email at cs.washington.edu
Michal Valko: Llama @ Meta Paris & Inria & MVA - Ex: Gemini and BYOL @ Google DeepMind. Verified email at meta.com
Matteo Pirotta: Research Scientist, Meta (FAIR). Verified email at fb.com
Jean Tarbouriech: Google DeepMind. Verified email at google.com
Alessandro Lazaric: Research Scientist, Facebook Artificial Intelligence Research. Verified email at inria.fr
Ruosong Wang: PhD Student, Carnegie Mellon University. Verified email at andrew.cmu.edu
Yuandong Tian: Research Scientist, Meta AI (FAIR). Verified email at fb.com
Yi Wu: Institute for Interdisciplinary Information Sciences, Tsinghua University. Verified email at mail.tsinghua.edu.cn
Zhang Zihan: Tsinghua University. Verified email at mails.tsinghua.edu.cn

Runlong Zhou

Paul G. Allen School of Computer Science & Engineering, University of Washington

Verified email at cs.washington.edu


Title	Cited by	Year
Stochastic shortest path: Minimax, parameter-free and towards horizon-free regret. J Tarbouriech, R Zhou, SS Du, M Pirotta, M Valko, A Lazaric. Advances in neural information processing systems 34, 6843-6855, 2021	31	2021
Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes. R Zhou, R Wang, SS Du. International Conference on Machine Learning, 42698-42723, 2023	7*	2023
Sharp variance-dependent bounds in reinforcement learning: Best of both worlds in stochastic and deterministic environments. R Zhou, Z Zhang, SS Du. International Conference on Machine Learning, 42878-42914, 2023	7	2023
Understanding curriculum learning in policy optimization for solving combinatorial optimization problems. R Zhou, Y Tian, Y Wu, SS Du. arXiv preprint arXiv:2202.05423, 2022	4*	2022
Free from Bellman completeness: Trajectory stitching via model-based return-conditioned supervised learning. Z Zhou, C Zhu, R Zhou, Q Cui, A Gupta, SS Du. arXiv preprint arXiv:2310.19308, 2023	1	2023
Reflect-RL: Two-Player Online RL Fine-Tuning for LMs. R Zhou, SS Du, B Li. arXiv preprint arXiv:2402.12621, 2024		2024

