Title | Authors | Venue | Cited by | Year |
Deep reinforcement learning from human preferences | PF Christiano, J Leike, T Brown, M Martic, S Legg, D Amodei | Advances in Neural Information Processing Systems 30, 4299-4307, 2017 | 377 | 2017 |
AI Safety Gridworlds | J Leike, M Martic, V Krakovna, PA Ortega, T Everitt, A Lefrancq, L Orseau, ... | arXiv preprint arXiv:1711.09883, 2017 | 162 | 2017 |
Learning to Understand Goal Specifications by Modelling Reward | D Bahdanau, F Hill, J Leike, E Hughes, P Kohli, E Grefenstette | arXiv preprint arXiv:1806.01946, 2018 | 64* | 2018 |
Reward learning from human preferences and demonstrations in Atari | B Ibarz, J Leike, T Pohlen, G Irving, S Legg, D Amodei | Advances in Neural Information Processing Systems, 8011-8023, 2018 | 60 | 2018 |
Ranking Templates for Linear Loops | J Leike, M Heizmann | Logical Methods in Computer Science, 2015 | 57 | 2015 |
Linear ranking for linear lasso programs | M Heizmann, J Hoenicke, J Leike, A Podelski | Automated Technology for Verification and Analysis, 365-380, 2013 | 52 | 2013 |
Scalable agent alignment via reward modeling: a research direction | J Leike, D Krueger, T Everitt, M Martic, V Maini, S Legg | arXiv preprint arXiv:1811.07871, 2018 | 50 | 2018 |
Ultimate Automizer with array interpolation | M Heizmann, D Dietsch, J Leike, B Musa, A Podelski | International Conference on Tools and Algorithms for the Construction and …, 2015 | 28 | 2015 |
Ultimate automizer with two-track proofs | M Heizmann, D Dietsch, M Greitschus, J Leike, B Musa, C Schätzle, ... | International Conference on Tools and Algorithms for the Construction and …, 2016 | 27 | 2016 |
Thompson sampling is asymptotically optimal in general environments | J Leike, T Lattimore, L Orseau, M Hutter | Conference on Uncertainty in Artificial Intelligence, 2016 | 26 | 2016 |
Bad universal priors and notions of optimality | J Leike, M Hutter | Conference on Learning Theory, 1244-1259, 2015 | 26 | 2015 |
Geometric nontermination arguments | J Leike, M Heizmann | International Conference on Tools and Algorithms for the Construction and …, 2018 | 22* | 2018 |
Universal Reinforcement Learning Algorithms: Survey and Experiments | J Aslanides, J Leike, M Hutter | arXiv preprint arXiv:1705.10557, 2017 | 16 | 2017 |
Nonparametric general reinforcement learning | J Leike | PhD thesis, Australian National University, 2016 | 14 | 2016 |
On the computability of Solomonoff induction and knowledge-seeking | J Leike, M Hutter | International Conference on Algorithmic Learning Theory, 364-378, 2015 | 13 | 2015 |
Learning human objectives by evaluating hypothetical behavior | S Reddy, A Dragan, S Levine, S Legg, J Leike | International Conference on Machine Learning, 8020-8029, 2020 | 10 | 2020 |
A formal solution to the grain of truth problem | J Leike, J Taylor, B Fallenstein | Conference on Uncertainty in Artificial Intelligence, 2016 | 10 | 2016 |
Active reinforcement learning: Observing rewards at a cost | D Krueger, J Leike, O Evans, J Salvatier | Future of Interactive Learning Machines, NIPS Workshop, 2016 | 10* | 2016 |
Sequential extensions of causal and evidential decision theory | T Everitt, J Leike, M Hutter | International Conference on Algorithmic Decision Theory, 205-221, 2015 | 10 | 2015 |
Synthesis for polynomial lasso programs | J Leike, A Tiwari | International Conference on Verification, Model Checking, and Abstract …, 2014 | 7 | 2014 |