Raincloud plots: a multi-platform tool for robust data visualization

Micah Allen; Davide Poggiali; Kirstie Whitaker; Tom Rhys Marshall; Jordy van Langen; Rogier A Kievit

doi:10.12688/wellcomeopenres.15191.2

Wellcome Open Res. 2021 Jan 21;4:63. doi: 10.12688/wellcomeopenres.15191.2. eCollection 2019.

Authors

Micah Allen¹ ² ³, Davide Poggiali⁴ ⁵, Kirstie Whitaker⁶, Tom Rhys Marshall⁷ ⁸, Jordy van Langen⁹, Rogier A Kievit⁹ ¹⁰ ¹¹

Affiliations

¹ Aarhus Institute of Advanced Studies, Aarhus University, Aarhus, Denmark.
² Department of Psychiatry, University of Cambridge, Cambridge, UK.
³ Centre of Functionally Integrative Neuroscience, Aarhus University Hospital, Aarhus, Denmark.
⁴ Department of Mathematics, University of Padova, Padova, Italy.
⁵ Padova Neuroscience Center, University of Padova, Padova, Italy.
⁶ Alan Turing Institute, London, UK.
⁷ Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford, UK.
⁸ Department of Experimental Psychology, University of Oxford, Oxford, UK.
⁹ Donders Institute for Brain, Cognition and Behavior, Radboud University, Nijmegen, The Netherlands.
¹⁰ Max-Planck Centre for Computational Psychiatry and Aging, UCL/MPI Berlin, London, UK.
¹¹ MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, UK.

Abstract

Across scientific disciplines, there is a rapidly growing recognition of the need for more statistically robust, transparent approaches to data visualization. Complementary to this, many scientists have called for plotting tools that accurately and transparently convey key aspects of statistical effects and raw data with minimal distortion. Previously common approaches, such as plotting conditional mean or median barplots together with error-bars, have been criticized for distorting effect size, hiding underlying patterns in the raw data, and obscuring the assumptions upon which the most commonly used statistical tests are based. Here we describe a data visualization approach that overcomes these issues, providing maximal statistical information while preserving the desired 'inference at a glance' nature of barplots and other similar visualization devices. These "raincloud plots" can visualize raw data, probability density, and key summary statistics such as median, mean, and relevant confidence intervals in an appealing and flexible format with minimal redundancy. In this tutorial paper, we provide basic demonstrations of the strength of raincloud plots and similar approaches, outline potential modifications for their optimal use, and provide open-source code for their streamlined implementation in R, Python and Matlab (https://github.com/RainCloudPlots/RainCloudPlots). Readers can investigate the R and Python tutorials interactively in the browser using Binder by Project Jupyter.
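
As a rough illustration of the three ingredients named in the abstract (raw data, probability density, summary statistics), the sketch below computes them for a single group using only the Python standard library. It is not the authors' implementation and does not use the RainCloudPlots packages: the helper names `gaussian_kde` and `raincloud_components`, the normal-reference bandwidth rule, the jitter width, and the simulated data are all assumptions made for this example.

```python
import math
import random
import statistics


def gaussian_kde(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]


def raincloud_components(samples, n_grid=200, jitter_width=0.1, seed=0):
    """Return the ingredients of a raincloud plot for one group.

    Hypothetical helper for illustration; not part of the RainCloudPlots code.
    """
    rng = random.Random(seed)

    # Assumption: normal-reference rule of thumb for the kernel bandwidth.
    bandwidth = 1.06 * statistics.stdev(samples) * len(samples) ** (-1 / 5)

    # "Cloud": density evaluated on a grid padded by 3 bandwidths per side.
    lo = min(samples) - 3 * bandwidth
    hi = max(samples) + 3 * bandwidth
    step = (hi - lo) / (n_grid - 1)
    grid = [lo + i * step for i in range(n_grid)]
    density = gaussian_kde(samples, grid, bandwidth)

    # "Rain": every raw observation paired with a small random offset,
    # so that overlapping points remain visible when drawn.
    rain = [(x, rng.uniform(-jitter_width, jitter_width)) for x in samples]

    # Summary statistics for the boxplot overlay.
    q1, median, q3 = statistics.quantiles(samples, n=4)

    return {
        "grid": grid,
        "density": density,
        "rain": rain,
        "q1": q1,
        "median": median,
        "q3": q3,
        "mean": statistics.fmean(samples),
    }


# Simulated data, for demonstration only.
data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(40)]
parts = raincloud_components(data)
```

Any plotting library could then draw `density` against `grid` as the half-violin, scatter the `rain` points beside it, and overlay a box spanning `q1` to `q3`. The published R, Python and Matlab code handles the drawing, grouping, and styling that this sketch leaves out.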

Keywords: Matlab; Python; R; barplots; data visualization; raincloud plots.

Grants and funding

MA is supported by a Lundbeckfonden Fellowship (R272-2017-4345), the AIAS-COFUND II fellowship programme that is supported by the Marie Skłodowska-Curie actions under the European Union’s Horizon 2020 (Grant agreement no 754513), and the Aarhus University Research Foundation, and thanks Lincoln Colling for insightful statistical discussions. KW is funded by the Alan Turing Institute under the EPSRC grant EP/N510129/1. RAK is supported by the Wellcome Trust (grant number 107392/Z/15/Z).