Tracing Community Genealogy: How New Communities Emerge from the Old

Chenhao Tan.
In Proceedings of the 12th International AAAI Conference on Web and Social Media (ICWSM'2018).

The process by which new communities emerge is a central research issue in the social sciences. While a growing body of research analyzes the formation of a single community by examining social networks between individuals, we introduce a novel community-centered perspective. We highlight that the context in which a new community emerges contains numerous existing communities. We reveal how communities emerge by tracing their early members' previous community memberships.

Our testbed is Reddit, a website that consists of tens of thousands of user-created communities. We analyze a dataset that spans more than a decade and includes the posting history of users on Reddit from its inception to April 2017. We first propose a computational framework for building genealogy graphs between communities. We then present the first large-scale characterization of such genealogy graphs. Surprisingly, basic graph properties, such as the number of parents and the maximum parent weight, converge quickly even though the number of communities increases rapidly over time. Furthermore, we investigate the connection between a community's origin and its future growth. Our results show that strong parent connections are associated with future community growth, confirming the importance of the existing community structures in which a new community emerges. Finally, we turn to the individual level and examine the characteristics of early members. We find that a diverse portfolio across existing communities is the most important predictor of becoming an early member of a new community.
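To make the idea of a genealogy graph concrete, here is a minimal sketch in Python. It is not the paper's implementation: the abstract does not define "early member" or "parent weight", so the sketch assumes that the early members of a community are its first `k` distinct posters, and that the weight of a parent is the fraction of those early members who had posted in the parent before joining. The function names, the `(timestamp, user, community)` input format, and the toy data are all hypothetical.

```python
from collections import defaultdict


def build_genealogy(posts, k=2):
    """Build a toy genealogy graph from (timestamp, user, community) tuples.

    Assumption (not from the paper): early members are the first k distinct
    posters, and parent weight is the fraction of early members who had
    already posted in the parent community.
    Returns {community: {parent: weight}}.
    """
    seen_by_user = defaultdict(set)     # communities each user has posted in so far
    early_members = defaultdict(list)   # first k distinct posters per community
    parent_counts = defaultdict(lambda: defaultdict(int))

    for _, user, community in sorted(posts):  # process posts in time order
        members = early_members[community]
        if user not in members and len(members) < k:
            members.append(user)
            # Every community this user posted in earlier is a candidate parent.
            for previous in seen_by_user[user]:
                parent_counts[community][previous] += 1
        seen_by_user[user].add(community)

    return {
        community: {
            parent: count / len(members)
            for parent, count in parent_counts[community].items()
        }
        for community, members in early_members.items()
    }


def summarize(graph):
    """Return {community: (number of parents, maximum parent weight)}."""
    return {
        community: (len(parents), max(parents.values(), default=0.0))
        for community, parents in graph.items()
    }


# Hypothetical toy posting history.
posts = [
    (1, "ann", "pics"),
    (2, "bob", "pics"),
    (3, "bob", "news"),
    (4, "cat", "news"),
    (5, "ann", "gifs"),
    (6, "cat", "gifs"),
]

graph = build_genealogy(posts, k=2)
summary = summarize(graph)
```

In this toy history, `pics` has no parents, `news` inherits one of its two early members from `pics`, and `gifs` draws one early member each from `pics` and `news`. The two per-community statistics returned by `summarize` correspond to the graph properties named above, the number of parents and the maximum parent weight, under the assumed definitions.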



     author = {Chenhao Tan},
     title = {{Tracing Community Genealogy: How New Communities Emerge from the Old}},
     year = {2018},
     booktitle = {Proceedings of ICWSM}