TY - JOUR T1 - Quantifying and contextualizing the impact of bioRxiv preprints through automated social media audience segmentation JF - bioRxiv DO - 10.1101/2020.03.06.981589 SP - 2020.03.06.981589 AU - Jedidiah Carlson AU - Kelley Harris Y1 - 2020/01/01 UR - http://biorxiv.org/content/early/2020/03/10/2020.03.06.981589.abstract N2 - Engagement with scientific manuscripts is frequently facilitated by Twitter and other social media platforms. As such, the demographics of a paper’s social media audience provide a wealth of information about how scholarly research is transmitted, consumed, and interpreted by online communities. By paying attention to public perceptions of their publications, scientists can learn whether their research is stimulating positive scholarly and public thought. They can also become aware of potentially negative patterns of interest from groups that misinterpret their work in harmful ways, either willfully or unintentionally, and devise strategies for altering their messaging to mitigate these impacts. In this study, we collected 331,696 Twitter posts referencing 1,800 highly tweeted bioRxiv preprints and leveraged topic modeling to infer the characteristics of various communities engaging with each preprint on Twitter. We agnostically learned the characteristics of these audience sectors from keywords each user’s followers provide in their Twitter biographies. We estimate that 96% of the preprints analyzed are dominated by academic audiences on Twitter, suggesting that social media attention does not always correspond to greater public exposure. We further demonstrate how our audience segmentation method can quantify the level of interest from non-specialist audience sectors such as mental health advocates, dog lovers, video game developers, vegans, bitcoin investors, conspiracy theorists, journalists, religious groups, and political constituencies. Surprisingly, we also found that 10% of the highly tweeted preprints analyzed have sizable (>5%) audience sectors that are associated with right-wing white nationalist communities. Although none of these preprints intentionally espouse any right-wing extremist messages, cases exist where extremist appropriation comprises more than 50% of the tweets referencing a given preprint. These results present unique opportunities for improving and contextualizing research evaluation as well as shedding light on the unavoidable challenges of scientific discourse afforded by social media. ER -