Abstract
Although single cell RNA sequencing studies have begun providing compendia of cell expression profiles, it has proven more difficult to systematically identify and localize all molecular cell types in individual organs to create a full molecular cell atlas. Here we describe droplet- and plate-based single cell RNA sequencing applied to ∼75,000 human lung and blood cells, combined with a multi-pronged cell annotation approach, which have allowed us to define the gene expression profiles and anatomical locations of 58 cell populations in the human lung, including 41 of 45 previously known cell types or subtypes and 14 new ones. This comprehensive molecular atlas elucidates the biochemical functions of lung cell types and the cell-selective transcription factors and optimal markers for making and monitoring them; defines the cell targets of circulating hormones and predicts local signaling interactions including sources and targets of chemokines in immune cell trafficking and expression changes on lung homing; and identifies the cell types directly affected by lung disease genes and respiratory viruses. Comparison to mouse identified 17 molecular types that appear to have been gained or lost during lung evolution and others whose expression profiles have been substantially altered, revealing extensive plasticity of cell types and cell-type-specific gene expression during organ evolution including expression switches between cell types. This atlas provides the molecular foundation for investigating how lung cell identities, functions, and interactions are achieved in development and tissue engineering and altered in disease and evolution.
Footnotes
The revised manuscript includes 16 more quantified single molecule fluorescence in situ hybridization (smFISH) experiments of human lungs (Figs. 2, 4c, 6c-e, 7f,g, S4, S5), beyond the 10 such experiments in the original. These experiments confirmed and localized all of the new human lung molecular cell types we discovered in single cell RNA sequencing (the only new molecular type not examined is the OLR1+ classical monocyte, presumably a new circulating cell type); the added experiments also localized and quantified many of the known human lung cell types, subtypes and states. We paired the smFISH experiments with textbook-quality micrographs documenting the normal classical histology of the human lung tissue in the subjects we profiled (Fig. S1). Table S2 summarizes all 58 human lung molecular types and their locations, which are diagrammed in Fig. 2b. The revision also includes 5,400 additional human lung cells profiled by SmartSeq2 (SS2), more than doubling our SS2 dataset to over 9,400 cells and bringing our total number of profiled cells to over 75,000. Our SS2 dataset now includes 13 additional cell types we previously captured only with 10x Chromium, all of which are concordant with the previous 10x datasets while providing deeper expression coverage. We have publicly released all of our computer code and data (gene count/UMI tables, scanpy objects, and Seurat objects), and we are preparing for release of the FASTQ sequence files. We have also developed and opened an online portal of our atlas (https://hlca.ds.czbiohub.org/). Finally, and perhaps most significantly during the coronavirus pandemic, we curated from the literature a list of all human genes known to encode a receptor for a human virus, including receptors for Covid-19, SARS (Severe Acute Respiratory Syndrome), and MERS (Middle East Respiratory Syndrome) coronaviruses and other respiratory viruses. We provide the lung cell expression profiles for all of these receptors (Figs. 6b, S8a), predicting lung cell tropisms of the viruses and where they initiate infection (p. 17). Our analysis predicts that Covid-19 (as well as SARS and MERS) infects alveolar cells, most prominently alveolar AT2 cells (Fig. S8b). Indeed, these data, along with other emerging data and patient clinical data, suggest that virus-mediated AT2 cell dysfunction or destruction is the key pathogenic event in Covid-19.