Contact: esindurmus AT cs DOT stanford DOT edu
[Google Scholar] [Semantic Scholar] [CV]
Hi! I am Esin Durmus. I am a Research Scientist on the Societal Impacts team at Anthropic. Previously, I was a Postdoctoral Scholar in the Stanford NLP Group, working with Tatsunori Hashimoto and Dan Jurafsky. I received my PhD from Cornell University, where I was advised by Claire Cardie.
My research interests lie at the intersection of Natural Language Processing, Machine Learning, and Computational Social Science. I am interested in developing evaluation methods and metrics to study the reliability and social impact of NLP/AI systems.
Publications
-
Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models
Myra Cheng, Esin Durmus, Dan Jurafsky
To appear at ACL 2023.
[paper]
-
Tracing and Removing Data Errors in Natural Language Generation Datasets
Faisal Ladhak, Esin Durmus, Tatsunori Hashimoto
To appear at ACL 2023.
[paper]
-
Whose opinions do language models reflect?
Shibani Santurkar, Esin Durmus, Faisal Ladhak, Cinoo Lee, Percy Liang, Tatsunori Hashimoto
To appear at ICML 2023.
[paper]
-
Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale
Federico Bianchi, Pratyusha Kalluri, Esin Durmus, Faisal Ladhak, Myra Cheng, Debora Nozza, Tatsunori Hashimoto, Dan Jurafsky, James Zou, Aylin Caliskan
To appear at FAccT 2023.
[paper]
-
Benchmarking Large Language Models for News Summarization
Tianyi Zhang, Faisal Ladhak, Esin Durmus, Percy Liang, Kathleen McKeown, Tatsunori B Hashimoto
Preprint, 2023.
[paper]
-
When Do Pre-Training Biases Propagate to Downstream Tasks? A Case Study in Text Summarization
Faisal Ladhak, Esin Durmus, Mirac Suzgun, Tianyi Zhang, Dan Jurafsky, Kathleen Mckeown, Tatsunori B Hashimoto
EACL, 2023.
[paper]
-
Evaluating Human-Language Model Interaction
Mina Lee, Megha Srivastava, Amelia Hardy, John Thickstun, Esin Durmus, Ashwin Paranjape, Ines Gerard-Ursin, Xiang Lisa Li, Faisal Ladhak, Frieda Rong, Rose E Wang, Minae Kwon, Joon Sung Park, Hancheng Cao, Tony Lee, Rishi Bommasani, Michael Bernstein, Percy Liang
Preprint, 2022.
[paper]
-
Holistic Evaluation of Language Models
Preprint, 2022.
[paper]
-
Improving Faithfulness by Augmenting Negative Summaries from Fake Documents
Tianshu Wang, Faisal Ladhak, Esin Durmus, He He
EMNLP 2022.
-
Spurious Correlations in Reference-Free Evaluation of Text Generation
Esin Durmus, Faisal Ladhak, Tatsunori Hashimoto
In Proceedings of ACL 2022 (main conference).
[paper]
-
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code
2022.
[paper]
-
Faithful or Extractive? On Mitigating the Faithfulness-Abstractiveness Trade-off in Abstractive Summarization
Faisal Ladhak, Esin Durmus, He He, Claire Cardie, Kathleen McKeown
In Proceedings of ACL 2022 (main conference).
[paper]
-
Language Modeling via Stochastic Processes
Rose E Wang, Esin Durmus, Noah Goodman, Tatsunori Hashimoto
In Proceedings of ICLR 2022.
[paper]
-
On the Opportunities and Risks of Foundation Models
[paper] [bib]
-
Towards Understanding Persuasion in Computational Argumentation
PhD Dissertation
[paper] [bib]
-
Leveraging Topic Relatedness for Argument Persuasion
Xinran Zhao, Esin Durmus, Hongming Zhang, Claire Cardie
In Findings of ACL, 2021.
[paper] [bib]
-
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
[Team] [paper] [bib] [website]
-
WikiLingua: A New Benchmark Dataset for Cross-Lingual Abstractive Summarization
Faisal Ladhak, Esin Durmus, Claire Cardie and Kathleen McKeown.
In Findings of EMNLP, 2020.
[paper] [data] [bib]
-
Exploring the Role of Argument Structure in Online Debate Persuasion
Jialu Li, Esin Durmus and Claire Cardie.
In Proceedings of EMNLP, 2020.
[paper] [bib]
-
FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization
Esin Durmus, He He and Mona Diab.
In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2020.
[paper] [code] [bib]
-
The Role of Pragmatic and Discourse Context in Determining Argument Impact
Esin Durmus, Faisal Ladhak and Claire Cardie.
In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019.
[paper] [bib]
-
Determining Relative Argument Specificity and Stance for Complex Argumentative Structures
Esin Durmus, Faisal Ladhak and Claire Cardie.
In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL), 2019.
[paper] [bib]
-
A Corpus for Modeling User and Language Effects in Argumentation on Online Debating
Esin Durmus and Claire Cardie.
In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL), 2019.
[paper] [bib] [dataset]
-
Persuasion of the Undecided: Language vs. the Listener
Liane Longpre, Esin Durmus and Claire Cardie.
In Proceedings of the 6th Workshop on Argument Mining, 2019.
[paper] [bib] [dataset]
-
Modeling the Factors of User Success in Online Debate
Esin Durmus and Claire Cardie.
In Proceedings of the World Wide Web Conference (WWW), 2019.
[paper] [bib] [dataset]
Cornell Chronicle Story
-
Exploring the Role of Prior Beliefs for Argument Persuasion
Esin Durmus and Claire Cardie.
In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL), 2018.
[paper] [bib] [dataset]
-
Understanding the Effect of Gender and Stance on Opinion Expression in Debates on "Abortion"
Esin Durmus and Claire Cardie.
In Proceedings of the PEOPLES 2018 Workshop (co-located with NAACL) on Computational Modeling of People's Opinions, Personality, and Emotions in Social Media.
[paper] [bib]
-
Cornell Belief and Sentiment System at TAC 2016
Vlad Niculae, Kai Sun, Xilun Chen, Yao Cheng, Xinya Du, Esin Durmus, Arzoo Katiyar and Claire Cardie.
Text Analysis Conference (TAC), 2016.
[paper] [bib]
Published Datasets
- WikiLingua
- DDO (Debate.org) corpus
- Kialo Dataset: available upon request via email.
Teaching
- Instructor for Introduction to Natural Language Processing, Cornell University. Fall 2020.
- Teaching Assistant for Introduction to Natural Language Processing, Cornell University. Fall 2016, Fall 2017, Fall 2019.
- Teaching Assistant for Machine Learning for Data Science, Cornell University. Spring 2016.
- Teaching Assistant for Introduction to Web Design, Cornell University. Fall 2015.
Industry Experience
- Research Intern at Google AI Research. Summer 2020 - December 2020.
- Applied Scientist Intern at Amazon AWS. Summer 2019 - December 2019.
- Applied Scientist Intern at Amazon Alexa. Summer 2017.