Chenhao Tan, Hao Peng, and Noah A. Smith.
In Proceedings of The Web Conference (WWW'2018).
Abstract:
Political speeches and debates play an important role in shaping the images of politicians, and the public often relies on media outlets to select bits of political communication from a large pool of utterances. It is an important research question to understand what factors impact this selection process.
To quantitatively explore the selection process, we build a three-decade dataset of presidential debate transcripts and post-debate coverage. We first examine the effect of wording and propose a binary classification framework that controls for both the speaker and the debate situation. We find that crowdworkers can only achieve an accuracy of 60% in this task, indicating that media choices are not entirely obvious. Our classifiers outperform crowdworkers on average, mainly in primary debates. We also compare important factors from crowdworkers' free-form explanations with those from data-driven methods and find interesting differences. Few crowdworkers mentioned that "context matters", whereas our data show that well-quoted sentences are more distinct from the previous utterance by the same speaker than less-quoted sentences. Finally, we examine the aggregate effect of media preferences towards different wordings to understand the extent of fragmentation among media outlets. By analyzing a bipartite graph built from quoting behavior in our data, we observe a decreasing trend in bipartisan coverage.
[PDF] [Supplementary material] [Data (README)] [Slides]
@inproceedings{tan+peng+smith:18,
author = {Chenhao Tan and Hao Peng and Noah A. Smith},
title = {``You are no Jack Kennedy'': On Media Selection of Highlights from Presidential Debates},
year = {2018},
booktitle = {Proceedings of WWW}
}