This corpus consists of a collection of 78,376 debates from October 2007 until November 2017, drawn from debate.org, along with comprehensive information about 45,348 users who have participated in debates either as debaters or as audience members (i.e. by voting or commenting).
The DDO dataset is described in detail in [1] and [2]. To recreate the dataset, you’ll need to download the following two files:
- debates.json: This JSON file contains a Python dictionary that maps each debate's unique name to comprehensive information about that debate.
- users.json: This JSON file includes a Python dictionary representation of each user in the dataset.
The dataset includes comprehensive information about both the debates and the users. For more information about the content of the dataset and how to use it, please refer to the readme file.
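As a minimal sketch of how the two files might be loaded in Python, assuming they have been downloaded locally and that each file holds a single JSON object at the top level (as the descriptions above suggest): the helper name `load_json_dict` is hypothetical, and no field names inside the entries are assumed here, since those are documented in the readme file.

```python
import json
from pathlib import Path


def load_json_dict(path):
    """Load a JSON file and check that its top level is a dictionary.

    Hypothetical helper for illustration; it is not part of the dataset.
    """
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, dict):
        raise TypeError(f"{path}: expected a JSON object at the top level")
    return data


# Example usage, assuming both files are in the current directory:
# debates = load_json_dict("debates.json")
# users = load_json_dict("users.json")
# print(len(debates), "debates;", len(users), "users")
```

The type check is only a guard against a truncated or mismatched download; it can be dropped if the files are known to be intact.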
References
- [1] Esin Durmus and Claire Cardie. 2019. A Corpus for Modeling User and Language Effects in Argumentation on Online Debating. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy. Association for Computational Linguistics.
- [2] Esin Durmus and Claire Cardie. 2018. Exploring the Role of Prior Beliefs for Argument Persuasion. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL).
License
The dataset is licensed under the Creative Commons
Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a
copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/.
If you use any of this data in your work, please credit debate.org and cite the references listed above.