Aleksandar Makelov
Independent
Verified email at mit.edu
Title
Cited by
Year
Towards deep learning models resistant to adversarial attacks
A Madry, A Makelov, L Schmidt, D Tsipras, A Vladu
arXiv preprint arXiv:1706.06083, 2017
Cited by: 11589
Year: 2017
Towards deep learning models resistant to adversarial attacks
A Mądry, A Makelov, L Schmidt, D Tsipras, A Vladu
stat 1050, 9, 2017
Cited by: 22
Year: 2017
Expansion in lifts of graphs
AA Makelov
Cited by: 7
Year: 2015
Rethinking backdoor attacks
A Khaddaj, G Leclerc, A Makelov, K Georgiev, H Salman, A Ilyas, A Madry
International Conference on Machine Learning, 16216-16236, 2023
Cited by: 3
Year: 2023
Is this the subspace you are looking for? An interpretability illusion for subspace activation patching
A Makelov, G Lange, A Geiger, N Nanda
The Twelfth International Conference on Learning Representations, 2023
Cited by: 2
Year: 2023
Backdoor or Feature? A New Perspective on Data Poisoning
A Khaddaj, G Leclerc, A Makelov, K Georgiev, A Ilyas, H Salman, A Madry
2022
Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control
A Makelov, G Lange, N Nanda
ICLR 2024 Workshop on Secure and Trustworthy Large Language Models, 0
Articles 1–7