Masatoshi Uehara
Publications, sorted by citation count. An asterisk on a count indicates citations to merged versions of the article.
Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes
N Kallus, M Uehara
Journal of Machine Learning Research 21, 167:1-167:63, 2020
Cited by 75 · 2020
Minimax weight and q-function learning for off-policy evaluation
M Uehara, J Huang, N Jiang
International Conference on Machine Learning, 9659-9668, 2020
Cited by 71 · 2020
Generative adversarial nets from a density ratio estimation perspective
M Uehara, I Sato, M Suzuki, K Nakayama, Y Matsuo
arXiv preprint arXiv:1610.02920, 2016
Cited by 71 · 2016
Efficiently breaking the curse of horizon: Double reinforcement learning in infinite-horizon processes
N Kallus, M Uehara
arXiv preprint arXiv:1909.05850, 2019
Cited by 49* · 2019
Intrinsically efficient, stable, and bounded off-policy evaluation for reinforcement learning
N Kallus, M Uehara
Advances in Neural Information Processing Systems 32, 2019
Cited by 37 · 2019
Off-policy evaluation and learning for external validity under a covariate shift
M Uehara, M Kato, S Yasui
Advances in Neural Information Processing Systems 33, 2020
Cited by 20* · 2020
Statistically efficient off-policy policy gradients
N Kallus, M Uehara
Proceedings of the 37th International Conference on Machine Learning, 5089-5100, 2020
Cited by 13 · 2020
Causal Inference Under Unmeasured Confounding With Negative Controls: A Minimax Learning Approach
N Kallus, X Mao, M Uehara
arXiv preprint arXiv:2103.14029, 2021
Cited by 10 · 2021
Finite sample analysis of minimax offline reinforcement learning: Completeness, fast rates and first-order efficiency
M Uehara, M Imaizumi, N Jiang, N Kallus, W Sun, T Xie
arXiv preprint arXiv:2102.02981, 2021
Cited by 9 · 2021
A unified statistically efficient estimation framework for unnormalized models
M Uehara, T Kanamori, T Takenouchi, T Matsuda
International Conference on Artificial Intelligence and Statistics, 809-819, 2020
Cited by 8* · 2020
Analysis of noise contrastive estimation from the perspective of asymptotic variance
M Uehara, T Matsuda, F Komaki
arXiv preprint arXiv:1808.07983, 2018
Cited by 8 · 2018
Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage
M Uehara, W Sun
arXiv preprint arXiv:2107.06226, 2021
Cited by 7* · 2021
Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies
N Kallus, M Uehara
Advances in Neural Information Processing Systems 33, 2020
Cited by 7 · 2020
Optimal off-policy evaluation from multiple logging policies
N Kallus, Y Saito, M Uehara
International Conference on Machine Learning, 5247-5256, 2021
Cited by 6 · 2021
Localized debiased machine learning: Efficient inference on quantile treatment effects and beyond
N Kallus, X Mao, M Uehara
arXiv preprint arXiv:1912.12945, 2019
Cited by 6* · 2019
Fast Rates for the Regret of Offline Reinforcement Learning
Y Hu, N Kallus, M Uehara
arXiv preprint arXiv:2102.00479, 2021
Cited by 4 · 2021
Imputation estimators for unnormalized models with missing data
M Uehara, T Matsuda, JK Kim
International Conference on Artificial Intelligence and Statistics, 831-841, 2020
Cited by 4 · 2020
Mitigating Covariate Shift in Imitation Learning via Offline Data Without Great Coverage
JD Chang, M Uehara, D Sreenivas, R Kidambi, W Sun
arXiv preprint arXiv:2106.03207, 2021
Cited by 3 · 2021
Information criteria for non-normalized models
T Matsuda, M Uehara, A Hyvärinen
Journal of Machine Learning Research, 2021
Cited by 2 · 2021
Double reinforcement learning for efficient and robust off-policy evaluation
N Kallus, M Uehara
International Conference on Machine Learning, 5078-5088, 2020
Cited by 2 · 2020
Showing articles 1–20.