RT Journal Article
SR Electronic
T1 Public human microbiome data dominated by highly developed countries
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2021.09.02.458641
DO 10.1101/2021.09.02.458641
A1 Richard J. Abdill
A1 Elizabeth M. Adamowicz
A1 Ran Blekhman
YR 2021
UL http://biorxiv.org/content/early/2021/09/02/2021.09.02.458641.abstract
AB The importance of sampling from globally representative populations has been well established in human genomics. In human microbiome research, however, we lack a full understanding of the global distribution of sampling in research studies. This information is crucial to better understand global patterns of microbiome-associated diseases and to extend the health benefits of this research to all populations. Here, we analyze the country of origin of all 444,829 human microbiome samples that have been collected to date and are available from the world’s three largest genomic data repositories, including the Sequence Read Archive (SRA). We show that more than 71% of publicly available human microbiome samples with a known origin come from Europe, the United States, and Canada, including 46.8% from the United States alone, despite the country representing only 4.3% of the global population. We also find that central and southern Asia is the most underrepresented region: Countries such as India, Pakistan, and Bangladesh account for more than a quarter of the world population but make up only 1.8 percent of human microbiome samples. These results demonstrate a critical need to ensure more global representation of participants in microbiome studies. Competing Interest Statement: The authors have declared no competing interest.