Anatomical Structures, Cell Types, and Biomarkers Tables Plus 3D Reference Organs in Support of a Human Reference Atlas

This paper reviews efforts across 16 international consortia to construct human anatomical structures, cell types, and biomarkers (ASCT+B) tables and three-dimensional reference organs in support of a Human Reference Atlas. We detail the ontological descriptions and spatial three-dimensional anatomical representations together with user interfaces that support the registration and exploration of human tissue data. Four use cases are presented to demonstrate the utility of ASCT+B tables for advancing biomedical research and improving health.

A major challenge in constructing a human reference atlas is combining the data generated by the different consortia without a common "language" shared across them for describing and indexing the data. For example, different cell types can be assigned using existing ontologies/nomenclatures based on genetic, protein, or other biomarker expression profiles. Rapid progress on single-cell technologies has led to an explosion of cell-type definitions, but no standards exist for the naming of anatomical structures, cell types, and biomarkers across organ systems (but see 20 ). Furthermore, information on what cell types are commonly found in which anatomical structures and what biomarkers best characterize certain cell types is scattered across hundreds of ontologies (e.g., Uberon multi-species anatomy ontology, Foundational Model of Anatomy Ontology [FMA], Cell Ontology [CL], or HUGO Gene Nomenclature) and thousands of publications (e.g., see atlas efforts for brain 21 , heart 22 , lung 22 , and kidney 12 ) on cells identified during human development, disease, and across multiple species. Some critically important details (e.g., shape and distribution of microanatomical structures or the spatial layout of functionally interdependent cell types) are captured only in hand-drawn figures, not digitally. This state of affairs impedes progress in biomedical science and practice because the data are difficult or impossible to manage, compare, harmonize, and use.
The initial tables and reference objects are not complete, but they demonstrate how existing knowledge can be captured and reorganized in support of a human reference atlas. Entities and relationships in the tables are linked to major ontologies, and existing source publications are referenced. The tables capture data and expertise that is mandatory for compiling a comprehensive human reference atlas, and they are an important tool for facilitating collaboration among the 16 consortia. Specifically, the tables and associated 3D reference objects: (1) provide an agreed-upon framework for experimental data annotation across organs and scales (i.e., from whole body to organs, tissues, cell types, and biomarkers); (2) make it possible to compare and integrate data from different assay types (e.g., scRNAseq and MERFISH data) for spatially equivalent tissue samples; (3) are a semantically and spatially explicit reference for "healthy" tissue and cell identity data that can then be compared against disease settings; and (4) can be used to evaluate progress on the semantic naming and definition of cell types and their spatial characterization. Importantly, they help communicate data structures and user needs to programmers in support of user interfaces that support the construction and usage of a human reference atlas.
The remainder of this paper details the data format, design, and usage of ASCT+B tables and associated 3D reference objects. It presents ten initial ASCT+B tables interlinked via a vasculature table together with a reference library of major AS. We then discuss four concrete use cases that showcase the utility of the initial 11-organ human reference atlas for tissue registration and exploration, data integration, understanding disease, and measuring progress toward a more complete human reference atlas. We conclude with a discussion of next steps and an invitation to collaborate on the construction and usage of a reference atlas for healthy human adults.

ASCT+B Tables
In 2019, the Kidney Precision Medicine Project (KPMP) published a first version of the ASCT+B tables to serve as a guide to annotate structures and cell types across multiple technologies to appreciate cellular diversity in the kidney 12 . At their core, the tables represent three entity types (AS, CT, B) and five relationship types (in bold italics in Fig. 1c). AS are connected via part_of relationships creating a partonomy tree, CTs are linked via is_a relationships (e.g., T cell is an immune cell), and biomarkers can be of different types indicated by is_a (e.g., gene, protein, lipid, metabolite). Two bimodal networks link CT and AS (i.e., one and the same CT might be located_in multiple AS, while a single AS may have multiple CT) and CT and B (i.e., one and the same B might be used to characterize different CT, and multiple B might be required to uniquely characterize one CT).
The current ASCT+B v1.0 table format captures the AS partonomy (unlimited number of levels), one level of CT (but see extensive CT typology discussed in 127 ), and two B types: gene markers (BG) and protein markers (BP). Each row in the table represents one CT located_in a specific AS together with all B commonly used to identify this CT. In addition, the tables include citations to relevant work documenting not only AS, CT, and B and their interlinkages, but also AS-CT and CT-B relationships.
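The row structure described above can be illustrated with a minimal Python sketch. It is not the official ASCT+B schema: the class name `Row`, its field names, and the helper `build_networks` are assumptions made for illustration. The podocyte/NPHS1 row follows the kidney example in this paper; the T cell rows are toy entries without markers.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One table row: a CT located_in an AS, plus the B used to identify it."""
    as_path: tuple               # AS partonomy, root first (any number of levels)
    cell_type: str               # one level of CT
    gene_markers: tuple = ()     # BG
    protein_markers: tuple = ()  # BP


def build_networks(rows):
    """Derive part_of edges and the two CT-AS and CT-B networks from rows."""
    part_of = set()                   # (child AS, parent AS)
    located_in = defaultdict(set)     # CT -> set of AS
    characterizes = defaultdict(set)  # B  -> set of CT
    for row in rows:
        for parent, child in zip(row.as_path, row.as_path[1:]):
            part_of.add((child, parent))
        # The CT is located in the most specific AS of the row.
        located_in[row.cell_type].add(row.as_path[-1])
        for marker in row.gene_markers + row.protein_markers:
            characterizes[marker].add(row.cell_type)
    return part_of, dict(located_in), dict(characterizes)


# Illustrative rows only (hypothetical hand-written excerpt, not table data).
rows = [
    Row(("kidney", "glomerulus"), "podocyte", gene_markers=("NPHS1",)),
    Row(("kidney",), "T cell"),
    Row(("lung",), "T cell"),
]
part_of, located_in, characterizes = build_networks(rows)
```

In this sketch, one CT can map to several AS and one B to several CT, which mirrors the many-to-many relationships described for the tables.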
The initial set of 11 tables was authored using templated Google Sheets. For each AS, CT, and B entity, authors completed (1) its preferred name, (2) an ontology name/label, if available, and (3) a unique, universally resolvable ontology ID, if available. A lookup table of organ-specific human, non-developmental data captured in existing formalized ontologies (initially Uberon for AS, Cell Ontology [CL] for CT, and HUGO for B) was provided to authors. Data validation was performed by human experts and algorithmically, testing expert-curated relationships for validity in Uberon by querying Ubergraph 128 , a knowledge graph combining mutually referential OBO ontologies including CL and Uberon and featuring precomputed classifications and relationships. This setup makes it possible to feed data from the ASCT+B tables back to correct and expand Uberon and CL, and also to align data modelling efforts across ASCT+B tables, Uberon, and CL. Algorithmic testing and validation are ongoing. Most, but not all, ASCT+B table relationships in the initial 11 tables described here do validate. Like the first maps of our world, the first ASCT+B tables are imperfect and incomplete (see Limitations section). However, they exemplify the utility of ASCT+B tables to standardize and digitize existing data and expertise by pathologists, anatomists, and surgeons at the gross anatomical level; biologists, computer scientists, and others at the single-cell level; and chemists, engineers, and others at the biomarker level.
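The idea of algorithmic validation can be sketched as follows. This is not the Ubergraph-based procedure itself: instead of querying a knowledge graph, the sketch checks expert-asserted part_of relationships against a small in-memory set of reference edges by following them transitively. The function names and the toy reference edges are assumptions for illustration.

```python
def is_part_of(child, ancestor, reference_edges):
    """Return True if `child` reaches `ancestor` via one or more part_of edges."""
    parents = {}
    for c, p in reference_edges:
        parents.setdefault(c, set()).add(p)
    seen, stack = set(), [child]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent == ancestor:
                return True
            if parent not in seen:  # guard against cycles in the reference
                seen.add(parent)
                stack.append(parent)
    return False


def validate(asserted, reference_edges):
    """Return the asserted (child, ancestor) pairs the reference does not support."""
    return [pair for pair in asserted if not is_part_of(*pair, reference_edges)]


# Toy reference partonomy (hypothetical stand-in for an ontology such as Uberon).
REFERENCE = {
    ("glomerulus", "renal corpuscle"),
    ("renal corpuscle", "nephron"),
    ("nephron", "kidney"),
}

failures = validate([("glomerulus", "kidney"), ("glomerulus", "lung")], REFERENCE)
```

Pairs returned by `validate` are candidates for review: either the table entry needs correcting or the reference ontology is missing a relationship.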

3D Reference Object Library
The spatial location of CT within AS matters as does the context (e.g., the number and type of other CT within the same AS). A 3D reference object library was compiled to capture the size, shape, position, and rotation of major AS in the organ-specific ASCT+B tables. To create reference organs for the 11 initial organs, experts collaborated closely with medical designers to develop anatomically correct, vector-based objects that correctly represent human anatomy, and are properly labelled using the ontology terms captured in the ASCT+B tables.
All 3D reference organs, except for the brain, intestine, and lymph node, use data from the Visible Human Project (VH) male and female dataset made available by the National Library of Medicine 130 . The brain uses the 141 AS of the "Allen Human Reference Atlas - 3D, 2020" representing one half of the human brain 5 . The AS were mirrored to arrive at a whole human brain (as intended) and resized to fit the VH male and female bodies. A 3D model of the male large intestine was kindly provided by Arie Kaufman, Stony Brook University, modified to fit into the VH male body, and used to guide the design of the female large intestine. The lymph node was created using mouse data and the clearing-enhanced 3D method developed by Weizhe Li, Laboratory of Immune System Biology, National Institute of Allergy and Infectious Diseases, NIH 131 . While the size and cellular composition of mouse and human lymph nodes vary, the overall anatomy is well-conserved between species.
The resulting 26 organ objects (male/female and left/right versions exist for kidney and lymph nodes) that are properly positioned in the male and female VH bodies are freely available online 132 . They capture a total of 1,185 unique 3D structures (e.g., the left female kidney has 11 renal papillae; see the complete listing at 132 and in the Supplemental Materials) with 557 unique Uberon terms (including the name of the organ). Files are provided in GLB format for easy viewing in Babylon.js in a web browser 133 and in user interfaces (see examples in Tissue Registration and Exploration).

Reference Atlas Usage
The ASCT+B tables and associated 3D reference organs provide a starting point for building a human reference atlas with common nomenclature for major entities and relationships plus cross-references to existing ontologies and supporting literature. Although the current table format is rather basic, the tables have proven valuable for addressing the originally stated four challenges: (1) to support experimental data registration and annotation across organs and scales (see next subsection); (2) to compare and integrate data from different assay types (see Data Integration and Comparison); (3) to compare healthy and disease data (see Understanding Disease); and (4) to evaluate progress toward the compilation of the human reference atlas (see Measuring Progress).

Tissue Registration and Exploration
Like any atlas, a human reference atlas is a collection of maps that capture a 3D reality. Like Google Maps, the human reference atlas needs to support panning and zooming: from the whole body (macro scale, meters), to the organ level (meso scale, centimeters), to the level of functional tissue units (e.g., alveoli in lung, crypts in colon, glomeruli in kidney, millimeters), down to the single-cell level (micro scale, micrometers). To be usable, the maps in a human reference atlas must use the same index terms and a unifying topological coordinate system so that cells and anatomical structures in adjacent overlapping maps or at different zoom levels can be uniquely named and properly aligned.
The ASCT+B tables and 3D reference organs constitute an agreed-upon framework for experimental data annotation and exploration across organs and scales (i.e., from the entire body down to organs, tissues, cell types, and biomarkers). For example, they are used to support spatially and semantically explicit registration of new tissue data as well as spatial and semantic search, browsing, and exploration of human tissue data. The HuBMAP Registration User Interface (RUI, Fig. 2a) 134 and Exploration User Interface (EUI, Fig. 2b) 135 are available via the HuBMAP portal 136 . The code for both user interfaces is freely available on GitHub, and several consortia have registered human tissue samples and are using the exploration user interface to explore their tissue datasets in the context of human anatomy.
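To make the notion of spatially and semantically explicit registration concrete, the following Python sketch tags a tissue block with the anatomical structures it collides with and serializes the result as JSON. It is a simplified illustration, not the RUI implementation: the JSON field names, the `Box` class, and the coordinates are assumptions, and collision is tested on axis-aligned boxes only, so the stored rotation is recorded but ignored in the overlap test.

```python
import json
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned box: center position and size, both in millimeters."""
    center: tuple
    size: tuple

    def overlaps(self, other):
        # Two boxes collide when, on every axis, the distance between their
        # centers is less than half the sum of their extents.
        return all(
            abs(c1 - c2) * 2 < s1 + s2
            for c1, c2, s1, s2 in zip(self.center, other.center, self.size, other.size)
        )


def register(block, rotation_deg, reference_structures):
    """Return a JSON record of the block and the AS it collides with."""
    hits = sorted(
        name for name, box in reference_structures.items() if block.overlaps(box)
    )
    record = {
        "position_mm": list(block.center),
        "size_mm": list(block.size),
        "rotation_deg": list(rotation_deg),  # stored, not used for collision here
        "anatomical_structures": hits,
    }
    return json.dumps(record, indent=2)


# Hypothetical bounding boxes for two kidney structures (made-up coordinates).
reference = {
    "renal cortex": Box((0, 0, 0), (10, 10, 10)),
    "renal pelvis": Box((20, 0, 0), (4, 4, 4)),
}
registration_json = register(Box((4, 0, 0), (4, 4, 4)), (0, 0, 0), reference)
```

A production system would test collisions against the actual 3D meshes and account for rotation; the bounding-box shortcut is used here only to keep the example short.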

Data Integration: Comparing Cell States Across Different Tissues and in Disease
The ASCT+B framework with controlled ontology vocabulary provides a "lookup table" for AS and their CT composition across organs: for cells formed within and resident in a specific tissue (e.g., epithelia and stroma) as well as cells that migrate across tissues (e.g., immune cells). Immune cells originate primarily in the bone marrow in postnatal life with adaptive lymphocytes subsequently differentiating and maturing in lymphoid tissues such as the thymus and spleen prior to circulating to non-lymphoid tissues and lymph nodes (see Fig. 3a). Therefore, these cells recur across the ASCT+B tables in both the lymphoid (bone marrow, thymus, spleen, lymph node) and non-lymphoid (brain, heart, kidney, lung, skin) tissue tables. Existing data support a more nuanced and tissue-specific, ontology-based assignment of blood and immune cells. For example, scRNAseq data enables deep phenotyping of hematopoietic stem cells (HSCs) and progenitor cells (HSPCs) and their differentiated progenies across various tissues. When comparing fetal liver vs. thymic cell states (Fig. 3a, right), a small region of the HSPCs highlighted as "lymphoid progenitors" is shared across the two organs, indicating cells that have migrated from liver to thymus 36,110 . Data integration defines molecules (e.g., chemokine receptors) that determine tissue residency versus migratory properties. These biomarkers ("B") in turn define tissue-resident vs. migratory cell states, which can be added to the ASCT+B tables to further refine cellular ontology.
A well-annotated healthy human reference atlas can then be used to understand the molecular and cellular alterations in response to perturbations such as infection. Data from several single-cell multi-omics studies of patients' blood can be federated to compute cellular response during COVID-19 pathogenesis, including HSC progenitor states emerging during disease 137 (Fig. 3b).

Understanding Disease
The ASCT+B tables are a semantically and spatially explicit reference for healthy tissue data that can be used to identify changes in molecular states in normal aging or disease. For example, the kidney master table links relevant AS, CT, B to disease and other ontologies for increasing our understanding of disease states. The top significant and specific biomarkers in each reference cell/state cluster might differ in disease or in a cell undergoing repair, regeneration, or in a state of failed or maladaptive repair. Loss of expression or alteration in cellular distribution of an AS-specific biomarker may also provide clues to underlying disease. The KPMP is working towards ASCT+B tables that characterize disease by focusing on biomarkers that have important physiological roles in maintaining cellular architecture or function or reveal shifts in cell types associated with acute and chronic diseases. Changes in marker genes in healthy and injured cells provide information on underlying biological pathways and genes that drive these shifts and thus provide critical insights into pathogenetic mechanisms. For example, the NPHS1 gene, which codes for nephrin in the kidney, is one of the top markers of healthy podocytes and is essential for glomerular function. Mutation in NPHS1 may be found in patients with proteinuria. The kidney ASCT+B table records that BG NPHS1 (see lower right in Fig. 1c) is expressed in the podocytes of the kidney, and ontology suggests injury to podocytes and glomerular function may cause proteinuria (Fig. 4). Ontology IDs provided for AS, CT, B facilitate linkages to clinicopathological knowledge and help provide broader insights into disease 138 . For example, the ASCT+B kidney master table and snRNAseq atlas data 54 have been used to reveal the cellular identity of diabetic nephropathy genes by distinguishing the healthy interstitium from a diabetic one 139 .
Note that some of the existing data are not at the single-cell level; in these cases, regional data (e.g., data bounded by tissue blocks registered within reference organs with known AS, CT, and B; see the RUI and EUI discussion above) can be compared to the kidney master table. In sum, ASCT+B tables interlinked with existing ontologies provide a foundation for new data analysis and the functional study of diseases.

Measuring Progress
The ASCT+B tables provide a framework to help track progress towards an accurate and complete human reference atlas. Given a scholarly publication that includes a new ASCT+B table, that table can be compared with existing master tables and the number and type of identical (confirmatory) and different (new) AS, CT, and B as well as their relations can be determined. Analogously, the value of a new data release for reference atlas design can be evaluated in terms of the number and type of (new) AS, CT, B and their relations. The ASCT+B Reporter 140 supports the visual exploration and comparison of ASCT+B tables. Table authors and reviewers can use this online service to upload new tables, examine them visually (see Fig. 2), and compare them to existing master tables. When adopted by publishers and editors, the ASCT+B tables provide objective measures and incentives for computing progress towards a human reference atlas. Power analysis methods can be run to assess the coverage and completeness of cell states and/or types and decide what tissues and cells should be sampled next 141 in support of a data-driven experimental design.
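The comparison of a new table against a master table reduces, in its simplest form, to set operations. The sketch below is an illustration under stated assumptions, not the ASCT+B Reporter's method: tables are represented as dictionaries with the hypothetical keys "AS", "CT", "B", and "relations", and entities are matched by exact name (in practice, ontology IDs would be the more robust key).

```python
def compare_tables(new, master):
    """Split each entity/relation set of `new` into confirmatory vs. new items."""
    report = {}
    for kind in ("AS", "CT", "B", "relations"):
        new_items = set(new.get(kind, ()))
        master_items = set(master.get(kind, ()))
        report[kind] = {
            "confirmatory": sorted(new_items & master_items),  # already in master
            "new": sorted(new_items - master_items),           # not yet in master
        }
    return report


# Toy tables; "hypothetical cell type X" is a made-up placeholder.
master = {
    "AS": {"kidney", "glomerulus"},
    "CT": {"podocyte"},
    "B": {"NPHS1"},
    "relations": {("podocyte", "located_in", "glomerulus")},
}
new = {
    "AS": {"glomerulus"},
    "CT": {"podocyte", "hypothetical cell type X"},
    "B": {"NPHS1"},
    "relations": {
        ("podocyte", "located_in", "glomerulus"),
        ("hypothetical cell type X", "located_in", "glomerulus"),
    },
}
report = compare_tables(new, master)
```

The counts of confirmatory and new items per entity type give a simple, reproducible measure of what a publication or data release adds to the atlas.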
As the number of published and data-derived ASCT+B tables grows, estimates can be run to determine the likely accuracy of AS, CT, B and relations. Entities and relations with much high-quality evidence are more likely to be correct than those with limited or no evidence. Incomplete data can be easily identified and flagged (e.g., AS with no linkages to CT and CT with no B indicate missing data). Given author information for publications and data releases, experts on AS, CT, B, and their relations can be identified and invited to improve tables further.
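Flagging incomplete data as described above can be sketched in a few lines of Python. The function name, argument layout, and toy entries are assumptions for illustration; the two checks correspond to the examples given in the text (AS with no linked CT, and CT with no B).

```python
def flag_missing(all_as, all_ct, located_in, characterized_by):
    """Return AS without any linked CT and CT without any B.

    located_in:       dict mapping CT -> set of AS it is located in
    characterized_by: dict mapping CT -> set of B that characterize it
    """
    as_with_ct = {a for structures in located_in.values() for a in structures}
    return {
        "AS_without_CT": sorted(set(all_as) - as_with_ct),
        "CT_without_B": sorted(ct for ct in all_ct if not characterized_by.get(ct)),
    }


# Toy input: the renal papilla has no CT and the T cell has no B recorded.
flags = flag_missing(
    all_as=["glomerulus", "renal papilla"],
    all_ct=["podocyte", "T cell"],
    located_in={"podocyte": {"glomerulus"}, "T cell": {"glomerulus"}},
    characterized_by={"podocyte": {"NPHS1"}},
)
```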
Using the ASCT+B tables, biases in sampling with respect to donor demographics (e.g., from convenience samples as opposed to using sampling strategies that reflect global demographics), organs (e.g., based on availability of funding), or cell types (e.g., loss of CT due to differential viability or capture efficiency) can be determined and need to be proactively addressed to arrive at an atlas that truly captures healthy human adults.

Limitations
We designed the format of the ASCT+B tables and 3D reference organs so that experts across many different domains can easily author and use them. They are intended to be intuitive and to achieve their purpose without extensive training. Unfortunately, this means that the current set of tables and reference organs does not capture the complete complexity of the human body. For each organ table, experts recorded the process they used to construct the table, which often included simplifying the anatomy to fit within a strict partonomy, making subjective decisions about which cell types and biomarkers had sufficient evidence to be included in the table, or ignoring normal dynamic changes that occur in the organ over time. For several organs, the B are preliminary and are expected to improve in coverage and robustness in the future.

Outlook
ASCT+B tables in combination with the 3D reference library provide a unified framework for experimental data annotation and exploration across different levels (i.e., organ, tissue, cell type, and biomarker). The construction and validation of the tables are iterative. Initially, ontology and publication data, along with expertise by organ experts, are codified and unified. Later, experimental datasets will be compared with existing master tables, confirming a subset of all entities and relations captured in them and adding new ones as needed to capture healthy human tissue data. In the near future, the CT typology will be expanded from one level to multiple, making it possible to compare AS partonomy and CT typology datasets at different levels of resolution. New organs will be added to the 3D reference library and micro-anatomical structures such as glomeruli in the kidney, crypts in the large intestine, and alveoli in the lung will be included.
The number of AS, CT, and robust B is likely to increase as new single-cell technologies and computational workflows are developed. Thus, the tables and associated reference objects are a living "snapshot" of the status of the collective work toward a human reference atlas, against which experimentalists can calibrate their data and ultimately contribute to the atlas by expanding or refining it. Future uses of the human reference atlas might include cross-species comparisons or cross-species annotations, cross-tissue/organ comparisons, comparisons of healthy versus common or rare genetic variations, and usage in teaching, expanding widely used anatomy books 62, 120 to the single-cell level.
It will take much effort and expertise to arrive at a consensus human reference atlas and to develop methods and user interfaces that utilize it to advance research and improve human health. Experts interested in contributing to this international and interdisciplinary effort are invited to register via https://iu.co1.qualtrics.com/jfe/form/SV_bpaBhIr8XfdiNRH to receive more information and meeting invites.

Acknowledgements
We are grateful to Blue Lake from University of California, San Diego, for assistance with annotations and analyzing the snRNAseq HuBMAP data for several of the markers in the kidney ASCT+B tables, Becky Steck and Rachel Dull from University of Michigan for assistance with the nomenclature and curation of kidney partonomy, and Seth Winfree, IUPUI for valuable discussions regarding the kidney ASCT+B

Fig. 1. a, Alphabetical listing of 16 human reference atlas construction efforts (left) linked to 30 human organs they study (right). See legend for color and size coding. The lung is studied by 10 consortia; see orange links. This review focuses on 10 organs (bolded) plus vasculature. b, 3D reference objects for major anatomical structures were jointly developed for 11 organs. c, Exemplary ASCT+B table showing all AS and CT and some B for the glomerulus in kidney; annotated with names of the three entity types (AS, CT, B) and four relationship types (in italics).
Fig. 2. a, Registration and semantic annotation of tissue data (blue block) in 3D via collision detection in the Registration User Interface (RUI): A user sizes, positions, and rotates tissue blocks and saves results in JSON format. b, Semantically and spatially explicit search, browsing, and filtering of tissue data (white blocks in spleen and kidney) in the Exploration User Interface (EUI): RUI-registered tissue data can be explored semantically using the AS partonomy on left and spatially using the anatomy browser in middle; a filter in top left supports subsetting by sex, age, tissue provider, etc. Clicking on a tissue sample on the right links to the Vitessce image viewer 142 .
Fig. 3. a, HSCs migrate from liver to the thymus during embryonic and fetal development. The transcriptomic identity of these HSCs changes between the first and second trimester of pregnancy, as shown by the maroon versus blue shading of the HSCs in the left (embryo) and right (fetal) part of the figure. The scRNAseq data in the UMAP plots are from 110 ; the top plot shows liver cells (blue) and thymus cells (orange) overlapping, labelled "lymphoid progenitor" in bottom plot. b, The nature of HSC subsets in adult blood shifts in health versus COVID-19. HSCs in the blood of COVID-19 patients (top-left) show a megakaryocyte priming bias when compared to healthy (top-right). This is quantified in the histogram from 110 of relative HSPC contributions for different donor/patient cohorts.
Fig. 4. Kidney ASCT+B table linked to processes, disease, clinicopathologic information, and ontology data.