Esin Durmus
Anthropic
Verified email at stanford.edu - Homepage
Title
Cited by
Year
On the opportunities and risks of foundation models
R Bommasani, DA Hudson, E Adeli, R Altman, S Arora, S von Arx, ...
arXiv preprint arXiv:2108.07258, 2021
4923 · 2021
Holistic evaluation of language models
P Liang, R Bommasani, T Lee, D Tsipras, D Soylu, M Yasunaga, Y Zhang, ...
arXiv preprint arXiv:2211.09110, 2022
1248 · 2022
Benchmarking large language models for news summarization
T Zhang, F Ladhak, E Durmus, P Liang, K McKeown, TB Hashimoto
Transactions of the Association for Computational Linguistics 12, 39-57, 2024
483 · 2024
FEQA: A question answering evaluation framework for faithfulness assessment in abstractive summarization
E Durmus, H He, M Diab
ACL, 2020
435 · 2020
Whose opinions do language models reflect?
S Santurkar, E Durmus, F Ladhak, C Lee, P Liang, T Hashimoto
International Conference on Machine Learning, 29971-30004, 2023
391 · 2023
Easily accessible text-to-image generation amplifies demographic stereotypes at large scale
F Bianchi, P Kalluri, E Durmus, F Ladhak, M Cheng, D Nozza, ...
Proceedings of the 2023 ACM Conference on Fairness, Accountability, and …, 2023
285 · 2023
WikiLingua: A new benchmark dataset for cross-lingual abstractive summarization
F Ladhak, E Durmus, C Cardie, K McKeown
arXiv preprint arXiv:2010.03093, 2020
216 · 2020
Scaling monosemanticity: Extracting interpretable features from claude 3 sonnet
A Templeton
Anthropic, 2024
202 · 2024
Towards understanding sycophancy in language models
M Sharma, M Tong, T Korbak, D Duvenaud, A Askell, SR Bowman, ...
arXiv preprint arXiv:2310.13548, 2023
179 · 2023
Towards measuring the representation of subjective global opinions in language models
E Durmus, K Nyugen, TI Liao, N Schiefer, A Askell, A Bakhtin, C Chen, ...
arXiv preprint arXiv:2306.16388, 2023
163 · 2023
The gem benchmark: Natural language generation, its evaluation and metrics
S Gehrmann, T Adewumi, K Aggarwal, PS Ammanamanchi, ...
arXiv preprint arXiv:2102.01672, 2021
163 · 2021
Marked personas: Using natural language prompts to measure stereotypes in language models
M Cheng, E Durmus, D Jurafsky
arXiv preprint arXiv:2305.18189, 2023
142 · 2023
Studying large language model generalization with influence functions
R Grosse, J Bae, C Anil, N Elhage, A Tamkin, A Tajdini, B Steiner, D Li, ...
arXiv preprint arXiv:2308.03296, 2023
129 · 2023
Evaluating human-language model interaction
M Lee, M Srivastava, A Hardy, J Thickstun, E Durmus, A Paranjape, ...
arXiv preprint arXiv:2212.09746, 2022
116 · 2022
Measuring faithfulness in chain-of-thought reasoning
T Lanham, A Chen, A Radhakrishnan, B Steiner, C Denison, ...
arXiv preprint arXiv:2307.13702, 2023
103 · 2023
Many-shot Jailbreaking
C Anil, E Durmus, M Sharma, J Benton, S Kundu, J Batson, N Rimsky, ...
92* · 2024
Exploring the role of prior beliefs for argument persuasion
E Durmus, C Cardie
NAACL, 2018
83 · 2018
Faithful or extractive? on mitigating the faithfulness-abstractiveness trade-off in abstractive summarization
F Ladhak, E Durmus, H He, C Cardie, K McKeown
arXiv preprint arXiv:2108.13684, 2021
75 · 2021
Question decomposition improves the faithfulness of model-generated reasoning
A Radhakrishnan, K Nguyen, A Chen, C Chen, C Denison, D Hernandez, ...
arXiv preprint arXiv:2307.11768, 2023
65* · 2023
Evaluating and mitigating discrimination in language model decisions
A Tamkin, A Askell, L Lovitt, E Durmus, N Joseph, S Kravec, K Nguyen, ...
arXiv preprint arXiv:2312.03689, 2023
50 · 2023
Articles 1–20