The effect of wording on message propagation: Topic- and author-controlled natural experiments on Twitter

Chenhao Tan, Lillian Lee, Bo Pang
In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014)

Media coverage:
The New York Times: A 25-Question Twitter Quiz to Predict Retweets
The Washington Post: This is the perfect tweet, according to science
The Physics arXiv Blog: Data Mining Reveals How Wording Influences Tweet Propagation
engadget: Crafting the perfect tweet is one-third talent, two-thirds science
The Daily Dot: A data miner's guide to crafting the perfect tweet
Yahoo! Small Business Advisor: 6 Scientific Tips to Get More Retweets
and Slashdot, Brandwatch, Daily Tech Whip ...

Consider a person trying to spread an important message on a social network. They can spend hours trying to craft the message. Does it actually matter? While there has been extensive prior work on predicting the popularity of social-media content, the effect of wording per se has rarely been studied, since it is often confounded with the popularity of the author and the topic. To control for these confounding factors, we take advantage of the surprising fact that there are many pairs of tweets containing the same URL and written by the same user but employing different wording. Given such pairs, we ask: which version attracts more retweets? This turns out to be a more difficult task than predicting popular topics. Still, humans can answer this question better than chance (but far from perfectly), and the computational methods we develop can do better than both an average human and a strong competing method trained on non-controlled data.
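
To make the notion of a topic- and author-controlled pair concrete, here is a minimal Python sketch that groups tweets by user and URL and keeps pairs whose wording differs. It is only an illustration: the dictionary fields (`user`, `url`, `text`, `retweets`), the function name `controlled_pairs`, and the sample tweets are all hypothetical, and the sketch makes no claim to reproduce the authors' actual pipeline or any additional filtering it may apply.

```python
from collections import defaultdict
from itertools import combinations


def controlled_pairs(tweets):
    """Return (more_retweeted, less_retweeted) pairs of tweets that share
    the same user and the same URL but differ in wording.

    Pairs with identical retweet counts are skipped, since they give no
    answer to "which version attracts more retweets?" (a simplification
    made for this sketch).
    """
    # Bucket tweets by (author, URL): within a bucket, author and topic are fixed.
    groups = defaultdict(list)
    for tweet in tweets:
        groups[(tweet["user"], tweet["url"])].append(tweet)

    pairs = []
    for group in groups.values():
        for a, b in combinations(group, 2):
            if a["text"] == b["text"]:
                continue  # same wording: nothing to compare
            if a["retweets"] == b["retweets"]:
                continue  # tie: no winner
            better, worse = (a, b) if a["retweets"] > b["retweets"] else (b, a)
            pairs.append((better, worse))
    return pairs


# Hypothetical sample data, invented for illustration only.
tweets = [
    {"user": "alice", "url": "http://example.com/a",
     "text": "Read our new report", "retweets": 3},
    {"user": "alice", "url": "http://example.com/a",
     "text": "New report is out, please retweet!", "retweets": 10},
    {"user": "bob", "url": "http://example.com/a",
     "text": "Interesting report", "retweets": 50},
    {"user": "alice", "url": "http://example.com/b",
     "text": "Another link", "retweets": 1},
]

pairs = controlled_pairs(tweets)
for better, worse in pairs:
    print(f"{better['text']!r} beat {worse['text']!r}")
```

In this toy input, only the two tweets by the hypothetical user `alice` that link to the same URL form a pair. The tweet by `bob` is excluded even though it shares the URL, because a different author would reintroduce the confound the setup is meant to remove.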

We have put a demo online where you can enter two tweets on the same topic and see which one our algorithm thinks will be retweeted more, or play a short quiz game to see how good you are at telling which tweet will be retweeted more. [Try it yourself!]

[Data(README)] [PDF] [Slides]

     author = {Chenhao Tan and Lillian Lee and Bo Pang},
     title = {The effect of wording on message propagation: Topic- and author-controlled natural experiments on Twitter},
     year = {2014},
     booktitle = {Proceedings of ACL}

This work was supported in part by NSF grant IIS-0910664 and a Google Research Grant. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or other sponsors.