1 - ABSTRACT
Digital light microscopy provides powerful tools for quantitatively probing the real-time dynamics of subcellular structures. While the power of modern microscopy techniques is undeniable, rigorous record-keeping and quality control are required to ensure that imaging data may be properly interpreted (quality), reproduced (reproducibility), and used to extract reliable information and scientific knowledge which can be shared for further analysis (value). Keeping notes on microscopy experiments and quality control procedures ought to be straightforward, as the microscope is a machine whose components are defined and the performance measurable. Nevertheless, to this date, no universally adopted community-driven specifications exist that delineate the required information about the microscope hardware and acquisition settings (i.e., microscopy “data provenance” metadata) and the minimally accepted calibration metrics (i.e., microscopy quality control metadata) that should be automatically recorded by both commercial microscope manufacturers and customized microscope developers. In the absence of agreed guidelines, it is inherently difficult for scientists to create comprehensive records of imaging experiments and ensure the quality of resulting image data or for manufacturers to incorporate standardized reporting and performance metrics. To add to the confusion, microscopy experiments vary greatly in aim and complexity, ranging from purely descriptive work to complex, quantitative and even sub-resolution studies that require more detailed reporting and quality control measures.
To solve this problem, the 4D Nucleome Initiative (4DN) (1,2) Imaging Standards Working Group (IWG), working in conjunction with the BioImaging North America (BINA) Quality Control and Data Management Working Group (QC-DM-WG) (3), here propose light Microscopy Metadata specifications that scale with experimental intent and with the complexity of the instrumentation and analytical requirements. They consist of a revision of the Core of the Open Microscopy Environment (OME) Data Model, which forms the basis for the widely adopted Bio-Formats library (4–6), accompanied by a suite of three extensions, each with three tiers, allowing the classification of imaging experiments into levels of increasing imaging and analytical complexity (7,8). Hence these specifications not only provide an OME-based comprehensive set of metadata elements that should be recorded, but they also specify which subset of the full list should be recorded for a given experimental tier. In order to evaluate the extent of community interest, an extensive outreach effort was conducted to present the proposed metadata specifications to members of several core-facilities and international bioimaging initiatives including the European Light Microscopy Initiative (ELMI), Global BioImaging (GBI), and European Molecular Biology Laboratory (EMBL) - European Bioinformatics Institute (EBI). Consequently, close ties were established between our endeavour and the undertakings of the recently established QUAlity Assessment and REProducibility for Instruments and Images in Light Microscopy global community initiative (9). 
As a result, this flexible 4DN-BINA-OME (NBO namespace) framework (7,8) represents a turning point towards achieving community-driven Microscopy Metadata standards that will increase data fidelity, improve repeatability and reproducibility, ease future analysis and facilitate the verifiable comparison of different datasets, experimental setups, and assays; it also demonstrates a method for future extensions. Such universally accepted microscopy standards would serve a purpose similar to that of the ENCODE guidelines successfully adopted by the genomics community (10,11). The intention of this proposal is therefore to encourage participation, critiques and contributions from the entire imaging community and all stakeholders, including research and imaging scientists, facility personnel, instrument manufacturers, software developers, standards organizations, scientific publishers and funders.
2 - INTRODUCTION
The reproducibility crisis affecting the biological sciences is well-documented (12–16). In the field of light microscopy, it can only be addressed if all published images are accompanied by complete descriptions of experimental procedures, biological samples, microscope hardware specifications, image acquisition settings, image analysis parameters and metrics detailing instrument performance and calibration (4,9,13,17–20). This complete description, also known as Image Metadata, consists of any and all information about an imaging experiment that ensures its rigorous interpretation, reproducibility and re-use, and should be recorded alongside the actual image data in the file header or in supplemental files (21). A fully developed metadata model would provide for consistent tracking of crucial information pertaining to the quality, reproducibility and scientific value of image data, and would allow the communication and comparison of such information in a Findable, Accessible, Interoperable, and Reusable (FAIR) manner (see also Text Box 1 in: Huisman et al., 2021) (21,22). However, as microscopy has evolved from a tool that generates purely descriptive or illustrative data to one that generates primary quantitative data acquired with ever more sophisticated and complex instruments, our practices to record this quantitative data and metadata faithfully and reproducibly have not kept up. Moreover, while the Open Microscopy Environment (OME) consortium (5,6,23–25) has made significant advances with the development of the OME Data Model which, together with the widely-used Bio-Formats library (4), serves as the only available de facto specification for accessing and exchanging image data, the field of light microscopy still lacks universally accepted standards for imaging data and specifications for metadata. This lack of consensus has resulted in an uncontrolled proliferation of proprietary and/or incompatible image file formats and metadata capture practices.
TEXT BOXES
Text Box 1
Charting a solid path towards next-generation storage mechanisms for community-driven, OME-based Microscopy Image Data Standards
Microscopy Metadata stored following the Open Microscopy Environment (OME) Data Model (4,5) (Figure 1, HOW yellow and WHERE blue bubbles) is represented in the form of OME-Extensible Markup Language (OME-XML), which is typically stored in the header of OME-TIFF files. Consequently, the XML Schema Definition (XSD) formalism is used to represent the model schema in a machine-readable manner. However, despite its advantages, XSD is not ideally suited to allow the OME Data Model to serve as the foundation for the community development and maintenance of globally accepted light microscopy standards (Figure 1). Because XSD does not support the storage of novel types of information within the core of the model, the capture of ever-evolving microscopy technologies and modalities requires the periodic release of new versions of the OME XSD schema (6), accompanied by Extensible Stylesheet Language (XSL) based templates to keep legacy documents up to date. This burdensome process is ultimately unsustainable. As a consequence, it is necessary to develop new strategies with a more open paradigm.
Under this new paradigm, one would assume that no single authority exists to decide which information has to be recorded in metadata models, making it necessary for commonly used concepts to be incorporated over time into community-driven standards. In this context, agreement has to be reached not so much on WHAT concepts have to be recorded for the documentation of imaging experiments (Figure 1, magenta bubble), but rather on the development of shared mechanisms defining HOW new types of (meta)data have to be recorded (Figure 1, yellow bubble) and associated with the Image data file format (Figure 1, WHERE blue bubble) (35).
In this context, the OME consortium, in collaboration with RIKEN, has started experimenting with the idea of utilizing Resource Description Framework (RDF) (106,107) triples conforming to the Web Ontology Language (OWL) (108) to describe OME-compatible image metadata (32,33,101,102), which could then be incorporated into the Next-Generation File Formats (NGFF) currently being developed by the OME consortium (35). By employing this method, it would be possible for users to produce, find and access quality-controlled image data for re-analysis and integration. Specifically, this approach would provide two major advantages:
Individual groups specializing in different aspects of the imaging world will have equal status and a shared path to develop new areas of the model (28). In turn, this will provide a method for different communities to collectively develop a complete picture (Figure 2) of all the information required to ensure rigor and reproducibility for modern imaging experiments.
At the same time, community-driven standards could evolve gradually over time by incorporating novel concepts into the core as they are developed peripherally from the core, vetted by the community, and commonly adopted.
As a proof-of-concept, an implementation of the OME Data Model was built in RDF/OWL (104), and applied to the modeling of specifications defined by the 4D Nucleome Imaging Standards Working Group (1,2) for the exchange of image data and integration with genomics datasets (31,105). This demonstrated the potential utility of this approach, laying the foundation for ongoing community discussions to identify the path of choice for modern Light Microscopy Image Data Standards (Figure 1).
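To make the triple-based idea in this text box more concrete, the following is a minimal sketch of how metadata statements can be held as subject-predicate-object triples. It uses plain Python tuples rather than an RDF library, and the `http://example.org/microscopy#` prefix, every term name and every value are hypothetical placeholders, not terms from the OME or NBO vocabularies.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical prefix and terms, for illustration only.
EX = "http://example.org/microscopy#"

triples = [
    Triple(EX + "Objective_0", EX + "isPartOf", EX + "Microscope_0"),
    Triple(EX + "Objective_0", EX + "nominalMagnification", "60"),
    Triple(EX + "Image_0", EX + "acquiredWith", EX + "Objective_0"),
    # A peripheral group can add a new kind of statement (placeholder value)
    # without waiting for a new release of a central schema.
    Triple(EX + "Objective_0", EX + "measuredFieldFlatness", "0.92"),
]


def describe(subject: str, store: List[Triple]) -> Dict[str, List[str]]:
    """Collect every predicate and its objects recorded for one subject."""
    out: Dict[str, List[str]] = {}
    for t in store:
        if t.subject == subject:
            out.setdefault(t.predicate, []).append(t.obj)
    return out


if __name__ == "__main__":
    for predicate, objects in describe(EX + "Objective_0", triples).items():
        print(predicate, objects)
```

In this sketch a new concept is simply one more triple, which is the property that would let extensions be developed peripherally and later be adopted into the core.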
This manuscript is intended to launch a community-driven way forward to break the impasse. Specifically, it puts forth scalable specifications for light Microscopy Metadata developed jointly by the 4D Nucleome Initiative (4DN) (1,2) Imaging Standards Working Group (IWG) and by the BioImaging North America (BINA) Quality Control and Data Management Working Group (QC-DM-WG) (3,7,8). In order to foster widespread adoption of this 4DN-BINA-OME (NBO namespace) (26) specifications framework (Figure 1A, magenta bubble), parallel work is being conducted to develop user-friendly and, when possible, automated metadata collection tools (13,27–30); shared metadata storage best practices (Figure 1A, yellow bubble) (31–34); as well as sustainable specifications to switch from proprietary formats for image data to common, cloud-ready OME Next-Generation File Formats (NGFF, Figure 1A, blue bubble) (35,36). Importantly, all of these activities are carried out in the context of the newly launched QUAlity Assessment and REProducibility for Instruments and Images in Light Microscopy (QUAREP-LiMi) initiative (quarep.org) (9,37) and involve several key members of the community, including microscope users, custodians, and manufacturers, imaging scientists, national and global bioimaging organizations, bio-image informaticians, standards organizations, and scientific publishers. The light Microscopy Metadata guidelines proposed here are articulated along three orthogonal axes (Figure 1B):
Guideline Tiers - Metadata specification (7,8): A system of adaptable Tiers that specifies which specific subset of metadata information should be included depending on experimental intent and technical complexity.
Core model and Extensions - Metadata extension (26): An initial suite of extensions that expand the core of the OME Data Model (4,5) to better capture state-of-the-art transmitted light and widefield fluorescence microscopy (Basic Extension), and confocal and advanced fluorescence modalities (Advanced and Confocal Extension). Importantly, to improve the management of quality control, a novel data model for capturing instrument calibration procedures (Calibration and Performance Extension) was developed in close collaboration with QUAREP-LiMi (9,37).
Requirement Levels - Metadata inclusion: Inherent flexibility in the inclusion of metadata so that specific pieces of information will be considered as “required” (essential for rigor and reproducibility), and “recommended” (to improve image quality and to maximize scientific and sharing value).
The framework of this model is inherently adaptable while providing microscope users and custodians, instrument makers and microscope vendors with a clear and enforceable community-driven mandate for the necessary information to ensure scientific rigor, experimental reproducibility and maximal scientific value.
2.1 The metadata challenge in microscopy imaging: the great variability of data formats and metadata reporting practices
The capture and analysis of microscopy images closely depend on the technique utilized to record and measure light. The introduction of photography, fluorescence, computers and digital light detectors drastically improved the objectivity of observations made through light microscopy and changed light microscopy in three profound ways. First, it has allowed the increasingly accurate recording of progressively lower amounts of light, enabling the visualization and quantitative measurement of sub-cellular and single-molecule (SM) events and molecular interactions with high specificity and temporal resolution. Second, the advent of digital image formation and processing has enabled new imaging modalities, such as Confocal Laser Scanning Microscopy (CLSM), and super-resolution (SR) imaging techniques that allow high-resolution imaging of live samples in three dimensions. Third, digital imaging has led to signal processing and computational methods that enable the extraction of quantitative information from images. Despite these advances, the emergence of light (and in particular fluorescence) microscopy as a key quantitation tool for biomedical research, and the employment of ever more sophisticated and complex instruments, practices to record this quantitative data and metadata faithfully and reproducibly have not kept up.
When performing imaging experiments, scientific rigor is inextricably tied to image quality, the reproducibility of experimental results and the extent to which image datasets have a scientific value. Not only can a high-value dataset be used to answer the scientific questions it was intended to address, but it can also be shared, merged with other datasets, integrated with other data types and further analyzed to answer new questions. Deriving valuable information from images is completely dependent on the consistent recording and storage of “data provenance” information that captures sample preparation and labeling, microscope hardware specification and image acquisition details, on the quantitative assessment of the optical, excitation, detection, and mechanical properties (including error estimation) of the microscope, and on an intimate knowledge of the analysis procedures used to extract quantitative information from the images. Hence the proliferation of quantitative light microscopy techniques has not only opened new scientific landscapes but has also exacerbated the existing challenges of quality control and reproducibility.
A typical light microscopy experiment includes three (sometimes integrated) major steps (Figure 2): 1) Sample Preparation, i.e., all preparative steps resulting in the samples to be imaged; 2) Image Acquisition, i.e., image formation and recording; and 3) Image Analysis, i.e., the post-acquisition processing and quantification of images. Each procedure within these steps can add considerable variability to the final data. Thus, to document all possible sources of uncertainty, microscopy images need to be accompanied by Image Metadata (21) describing all information about the imaging experiment that allows the evaluation, interpretation, reproducibility, and comparison of the actual image data (i.e., quantitative values associated with the image pixels; Figure 2, Pixel Image Data). This will include comprehensive information about all aspects of the microscopy experiment from experimental treatment and sample preparation (Figure 2A, Experimental and Sample Metadata) to microscope hardware specifications, image acquisition settings, image structure, and instrument performance metrics (Figure 2A, Microscopy Metadata), as well as details about any image analysis procedures that were subsequently employed (Figure 2A, Analysis Metadata) (38–40). Although the OME Data Model, coupled with the ubiquitous Bio-Formats image file format conversion library (4,5), has served as the de facto exchange format for image data, it has not yet evolved into a much-needed, community-mandated Microscopy Image Data Standard (4,5). In addition to support for extensions to capture technological advancements, a mature standard would include: 1) universal image data formats defining the container where data is stored (35); 2) specifications of what information about the imaging experiment should be captured; and 3) standards for how the metadata should be captured, managed and stored (Figure 1).
Additional challenges in record-keeping for microscopy experiments include the following:
Light microscopy is employed to address a diverse range of complex biological questions. This has led to the development of a vast array of adaptable microscopy modalities, each requiring different metadata to be reported as well as diverging quality control approaches, which poses a notable challenge when choosing the correct validation method for the experiment at hand.
The working conditions, theoretical performance, and capabilities of the microscope are difficult to assess and are often unknown by the average user.
The relevant hardware or software metadata can be difficult to retrieve from available documentation, and the user is often unaware of how varying specific parameters might affect the imaging results.
The paucity of automation and intuitive software tools makes record-keeping unduly burdensome, forcing experimental biologists to choose between scientific rigor and productivity.
The lack of universal and enforceable image file format and metadata standards results in an unacceptable variability in the information provided by microscope manufacturers alongside images. In addition, the need for raw data files to be converted into other formats prior to interpretation and comparison often yields a significant loss of metadata, or, worse still, inadvertently compromises the data during the conversion process.
It is worth noting that among all experimental steps described in Figure 2, the image acquisition step constitutes the most manageable and quantifiable stage, as long as the microscope and imaging system are properly documented, maintained, and operated. Consequently, the development of community-sanctioned standards for the compilation of Microscopy Metadata, encompassing Microscopy data Provenance Metadata (MPM), which documents the process of image acquisition and the structure of the resulting image (Figure 2A, Provenance Metadata), and Microscopy Quality-control Metadata (MQM), which captures calibration procedures and metrics (Figure 2, Quality-Control Metadata), is not only essential for image data quality, reproducibility, and scientific and sharing value, but should also be readily attainable (21).
2.2 The importance and potential pitfalls of standardization
The value of Microscopy Image Data Standards (Figure 1) has been widely recognized, resulting in important efforts to establish best performance testing and instrument calibration practices (41–52), to unify data-submission requirements from journals (53) and to produce the exchange format between image data and metadata that forms the basis for this current work (4,5,54,55). Nonetheless, because the existing efforts lack normative value, it remains challenging to determine which parameters are relevant to a given technique and imaging experiment, and best-practice recommendations are often ignored because they are perceived as too expensive, complicated and cumbersome.
Much would thus be gained from harmonizing the reporting standards in light microscopy. First, it would facilitate the documentation of any microscopy-based protocol, minimize error, and quantify residual uncertainty associated with each step of the procedure (Figure 2). This, in turn, would provide a wealth of valuable contextual information, collectively referred to as “data provenance” (56,57), that would greatly increase the scientific and sharing value of the data. Such details would enable the reliable evaluation of scientific claims based on imaging data, facilitate comparisons within and between experiments, allow for reproducibility, and maximize the likelihood that data can be collated and analyzed by other scientists using current and future image processing and analysis methods. Furthermore, the increasing availability of public image repositories (e.g., Movincell (58), Image Data Resource - IDR (55), Electron Microscopy Public Image Archive - EMPIAR (59), Bioimage Archive (60), Allen Cell Explorer (61), the Cochin Image Database (62), the Cell Image Library (63), the RIKEN Systems Science of Biological Dynamics database - SSBD (64), and the NIH CELL Image Library (65)) will undoubtedly increase the need for community-wide documentation and quality control standards that can adapt to new technologies. As a first step in this direction (20), the Recommended Metadata for Biological Images (REMBI) (66) guidelines were recently proposed to maximize the possibility of making bioimaging datasets available to other researchers in a timely manner, consistent with the FAIR principles (Findable, Accessible, Interoperable and Reusable) (22), and thus amenable to reuse.
Despite offering innumerable advantages, standardization also has its pitfalls. First, it can massively increase the administrative burden associated with imaging experiments. Second, a lack of flexibility can severely limit the type of data that can be stored. Since it is impossible to know a priori the complexity and diversity inherent to experimental details and imaging modalities that are yet to be developed, it is critical that any proposed set of guidelines that are designed to serve as a basis for a sustainable community standard are defined utilizing technology that meets strict extensibility requirements. Because of its inherent extensibility and the solid plans for modernization (see Text Box 1), the OME Data Model (4,5) provides a robust foundation for Microscopy Metadata (Figure 2B) that can be extended by introducing information that is not yet covered (including experimental specific metadata, modality specific metadata, quality control metadata and analysis-specific metadata). As these extensions (39,67,68) become more commonly used, they too can then be incorporated into the core using community announcements and related vetting processes to meet expanding community needs.
3 - A THREE-DIMENSIONAL MATRIX OF 4DN-BINA-OME MICROSCOPY METADATA GUIDELINES
Since a one-size-fits-all solution for Microscopy Metadata requirements is clearly not tenable, we propose a framework in which microscopy documentation and quality control requirements are organized along three axes that are largely orthogonal to each other (Figure 1B). The first axis is based on the observation that different types of experiments have different reporting and quality control requirements based on technical complexity, experimental design, and image analysis needs. Hence, requirements along this axis are subdivided into Tiers depending on the three criteria listed above (Figure 1B, Guideline Tiers; Figure 3, Table I and Supplemental Table I). The second axis starts with and extends the OME Data Model (4,5) with additional metadata components that are introduced based on the microscopic modality (e.g., epifluorescence vs. confocal microscopy) and accommodates expansion as new technologies are developed that are covered neither by the core nor by the currently proposed extensions (Figure 1B, OME Core vs. Extensions; Figure 5). Last but not least, the third axis grades documentation requirements based on whether each piece of information is essential for rigor and reproducibility (Must use), or recommended to improve image quality and to maximize scientific and sharing value (Should use; Figure 1B, Requirement Levels; Figure 6). The existence of these three axes will allow institutions, funding agencies, consortia and scientific publishers to define best practices for light microscopy experiment documentation while concomitantly allowing individual scientists to find an appropriate position on the guideline matrix that both matches their needs and remains compatible with community-mandated guidelines. As an example, Table II lists where representative experiments would fall within the Microscopy Metadata guideline matrix (Figure 1B) (7,8).
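The three axes described above can be pictured as a small data structure in which every metadata field carries a Tier, an origin (Core or Extension) and a Requirement Level. The sketch below is illustrative only: the field names and their placement on the matrix are hypothetical, and the rule that a field applies from a given Tier upward is a simplifying assumption, not a statement of the published specifications (7,8).

```python
from dataclasses import dataclass
from enum import Enum, IntEnum
from typing import List, Optional


class Tier(IntEnum):
    TIER_1 = 1  # Descriptive
    TIER_2 = 2  # Analytical
    TIER_3 = 3  # Analytical


class Extension(Enum):
    CORE = "OME Core"
    BASIC = "Basic Extension"
    ADVANCED_AND_CONFOCAL = "Advanced and Confocal Extension"
    CALIBRATION_AND_PERFORMANCE = "Calibration and Performance Extension"


class Requirement(Enum):
    MUST = "Must use"
    SHOULD = "Should use"


@dataclass(frozen=True)
class FieldSpec:
    name: str
    extension: Extension
    first_tier: Tier  # assumption: the field applies from this Tier upward
    level: Requirement


# Hypothetical fields and placements, for illustration only.
SPECS = [
    FieldSpec("objective_magnification", Extension.CORE, Tier.TIER_1, Requirement.MUST),
    FieldSpec("pinhole_size", Extension.ADVANCED_AND_CONFOCAL, Tier.TIER_2, Requirement.MUST),
    FieldSpec("stage_drift_measurement", Extension.CALIBRATION_AND_PERFORMANCE,
              Tier.TIER_2, Requirement.SHOULD),
    FieldSpec("detector_firmware_version", Extension.BASIC, Tier.TIER_3, Requirement.SHOULD),
]


def fields_for(specs: List[FieldSpec], tier: Tier,
               level: Optional[Requirement] = None) -> List[str]:
    """Names of the fields that apply at `tier`, optionally filtered by level."""
    return [s.name for s in specs
            if s.first_tier <= tier and (level is None or s.level is level)]


if __name__ == "__main__":
    print(fields_for(SPECS, Tier.TIER_2, Requirement.MUST))
```

Under these assumptions, a Tier 2 experiment would be asked for every "Must use" field introduced at Tier 1 or Tier 2, regardless of which extension defines it.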
3.1 The first axis: a tier-based system of guidelines for light Microscopy Metadata
To achieve rigor and reproducibility, increasingly elaborate imaging experiments require additional metadata on top of those required for more basic experiments. On this account, a graded system for metadata requirements is not only appropriate, but it also minimizes the burden of collecting metadata for each experiment whilst maximizing the opportunities for rigor, reproducibility, evaluation, analysis, and comparison. We envision a flexible system in which different imaging communities (i.e., individual research institutions, individual fields of knowledge or research consortia) would define their own sets of criteria whereby microscope hardware and imaging experiments are classified in Tiers based on experimental and image complexity, microscope technology and imaging modality, and analytical requirements. Hence the tiered system of guidelines presented here (Figure 3; Table I and Supplemental Table I) (8) should be considered as an example of how different imaging experiment types could be placed on a complexity scale to facilitate the collection of the most appropriate minimum set of metadata required for reproducibility and comparison of each category. We expect that this system will evolve organically to incorporate new imaging modalities.
A robust, maximally useful, and efficient metadata standard would be tailored around the different reporting requirements of experiments of increasing complexity. We suggest here a system composed of one Descriptive (Tier 1) and two Analytical (Tiers 2 and 3) tiers (Figure 3; Table I and Supplemental Table I) (8), in which imaging instrumentation and datasets are classified based on the following sets of criteria:
Are results amenable to visual interpretation or is advanced image analysis (e.g., sub-pixel spot localization) required for the full understanding of results?
Are biological samples fixed or alive during acquisition?
Are any parts of the quantitative microscopy pipeline (microscope instrument, acquisition modality and image analysis) relying on novel, rather than fully established technology?
Is the data provenance and quality control metadata tracked, documented and reported by hardware manufacturers or instrument developers?
Consistent with minimum information principles, the system represents a minimal set of metadata required for each tier, covering only the information relevant for the interpretation of the specific imaging experiment (while more complete information is always allowed and encouraged). As an example, while the proposed standard encompasses information about the sample that directly impacts the imaging conditions (e.g., labeling method, mounting medium), it is not intended to replace the complete description of the experimental procedure from start to end.
3.1.1 Minimum Information required for full documentation → Tier 1
Tier 1 covers experiments that fall into two general sub-categories:
Qualitative evaluation: Experiments that require only qualitative assessment of image data for meaningful conclusions to be drawn. Examples include transfection controls, viability tests, or other experiments that serve as minor supporting evidence in a project or manuscript.
Simple image processing and analysis on fixed samples: Experiments that are performed on fixed specimens and require simple processing and analyses to support conclusions. This category includes studies that require the identification, counting, intensity and morphometric analyses of features whose size is above the limit of resolution of the system. Examples include cell-counting, the measurement of reporter intensity, the localization of reporters in the nucleus vs. cytoplasm or the estimation of the size and shape of individual cells.
Hence, this descriptive tier does not require metadata describing advanced hardware features of the microscope or quantifying microscope performance. The complete description of metadata fields to be included in Tier 1 is available on GitHub (7,8,26,69).
3.1.2 Advanced image analysis, live-imaging and super-resolution → Tier 2
Advances in microscope technology have been accompanied by an increased dependence on complex image analysis methods. Some imaging techniques require digital image processing and image analysis for the very construction of the images (e.g., assembling individual pixel intensities into the pixel array that constitutes an image for CLSM, Structured Illumination Microscopy - SIM, and stochastic SR methods like Photo-Activated Localization Microscopy - PALM, and Stochastic Optical Reconstruction Microscopy - STORM). Others use model-driven data-processing to enhance the resolution of the data (e.g., deconvolution) and improve quantitative accuracy and reliable interpretability by combatting issues such as illumination shading, nonspecific background, limited signal, and optical aberrations. In addition, many imaging experiments require advanced image analysis for extracting quantitative information from the raw data. These techniques increase the usefulness of microscopy data, but they often require that the data meet certain criteria for the analysis to be useful and reliable, thus requiring more detailed data provenance and performance calibration information for correct interpretation. Specifically, Tier 2 experiments fall into two general sub-categories:
Advanced quantification: Experiments that aim to draw conclusions about features that are near or below the limits of resolution, as well as experiments in which conclusions require post-acquisition reconstitution (i.e., deconvolution). Imaging techniques that use probability-based detection frameworks to function in signal-starved conditions also fall in this category, since they often rely on advanced processing and quantitative analysis. Typical examples include single-molecule (SM) localization microscopy (e.g., SM Fluorescence In Situ Hybridization - FISH), single-particle tracking (SPT) and distance/distribution measurements that aim to achieve SR precision. Because these sophisticated imaging techniques require the optical system to be performing at its theoretical best and often take into account the photophysical behaviors of the fluorophores and the detector, Optical and Intensity Calibration (21) are an essential requirement.
Live cell imaging: Experiments that use transmitted and fluorescence light microscopy on live specimens whether or not they are destined for advanced processing and analysis. These experiments require detailed information about the environmental conditions, phototoxicity data, and focal and stage stability measurements. Typical examples include applications following the real-time dynamics of cellular events and real-time viral SPT experiments. Because the stability of the system across time is often crucial for reproducibility, Mechanical Calibration (21) is also recommended for these experiments.
The capacity of microscope users to provide bona fide documentation of microscopy experiments is often limited by the degree to which they are made aware of all parts of the instrument by the manufacturer. Accordingly, Tier 2 represents the most demanding requirement for data obtained by microscope users using well-established microscope instrumentation, processing algorithms and analysis procedures that have been shown to be quantifiable across a range of conditions. The complete description of metadata fields to be included in Tier 2 is available on GitHub (7,8,26,69).
3.1.3 Manufacturing, technical development and full documentation → Tier 3
Tier 3 is intended to be used by manufacturers of microscope hardware components and by developers of pioneering technologies. While it is inherently impossible to design a comprehensive metadata standard for future technologies, instrumentation developers and manufacturers should make every effort to provide any piece of information required, implicitly or explicitly, to interpret a given image formation or data processing method. As such, Tier 3 furnishes manufacturers and developers of new microscope modalities with clear community-sanctioned specifications about what provenance reporting and quality control information should be provided to microscope users to ensure scientific rigor, full reproducibility, and re-use value. The complete description of metadata fields included in Tier 3 is available on GitHub (7,8,26,69). Additional method-specific information is expected to be required for most applications.
3.2 The second axis: a system of 4DN-BINA-sponsored community-driven OME extensions
In its simplest form, metadata can be represented as lists of key-value pairs, where the key is a descriptive term for a specific attribute and the value is the content of that attribute, including units for numerical values. However, lists of key-value pairs are often not sufficient to define rich metadata guidelines because they cannot capture the often complex relationships between the different real-world components and situations to be described. A better method is to develop an abstract model of the data that represents the scenario to be described. Ideally, such a data model would account for the components of the system, the attributes that need to be recorded for each component to be fully documented, and the relationships between components (Figure 4A). A useful formalism for developing, describing, and viewing an appropriate data model is the Entity-Relationship (ER) diagram (70), which subsequently has to be translated into formalized schemas (Figure 4B) to facilitate the implementation of metadata capture and management tools.
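The contrast between flat key-value pairs and an entity-based model can be illustrated with a minimal sketch. The class names, attribute names and values below are hypothetical and chosen only for illustration; they are not taken from the OME or 4DN-BINA-OME schemas.

```python
from dataclasses import dataclass, field

# Flat key-value metadata: easy to write, but the link between the channel
# and the objective that was used to image it is not recorded anywhere.
flat_metadata = {
    "ObjectiveMagnification": "60",
    "ObjectiveNA": "1.4",
    "Channel0ExcitationWavelength": "488 nm",
}


@dataclass
class Objective:
    id: str
    magnification: float
    lens_na: float


@dataclass
class Channel:
    name: str
    excitation_wavelength_nm: float
    objective_id: str  # explicit relationship to an Objective entity


@dataclass
class Microscope:
    objectives: dict = field(default_factory=dict)
    channels: list = field(default_factory=list)

    def add_objective(self, objective):
        self.objectives[objective.id] = objective

    def objective_for(self, channel):
        # Follow the relationship from a channel to the hardware it used.
        return self.objectives[channel.objective_id]


scope = Microscope()
scope.add_objective(Objective("Objective:0", 60.0, 1.4))
scope.channels.append(Channel("GFP", 488.0, "Objective:0"))
used = scope.objective_for(scope.channels[0])
print(used.magnification, used.lens_na)
```

In the entity-based version the relationship is part of the data itself, which is the property an ER diagram makes explicit.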
3.2.1 OME: an informatics framework for biological-image data
Since its launch in 2003 (54), the OME initiative has played a leading role in the foundation and development of bioimage informatics (71), a new field of biological informatics which concentrates explicitly on building technology for the sharing and dissemination of reproducible and quality-controlled biomedical image data. Specifically, the OME Data Model, initially published in 2005 (4,5), and its related software implementations (4,72) were developed to provide an informatics framework for the accurate capture, storage and management of rich metadata representations of all the information required for interpreting, comparing, reproducing, sharing and publishing biological microscopy data (Figure 2, Microscopy Metadata) independent of any proprietary file format. The OME Data Model is first and foremost implemented in the Bio-Formats library (4), which allows the interpretation of more than 150 different proprietary file formats and their translation into the OME-TIFF and OME-NGFF (35,36) open file formats for biological imaging. This facilitates the transfer of image data between different commercial vendors and open-source software tools, thus reducing the barriers between different analytical and data management tools. Finally, the OME Data Model is also implemented in OMERO, a relational database and application server to import, store, process, view and export data (72).
One of the OME Data Model’s main purposes is to describe the different hardware components of the microscope, define the light path of each channel and document the settings used for individual image acquisition sessions (e.g., laser power, exposure times and detector gain). Since the physical setup of microscopes tends to be fixed, while imaging settings are typically adjusted for different samples and acquisition sessions, the OME Data Model subdivides Microscopy Metadata into two main sections:
The <Instrument> core element (Figure 2B) describes the imaging instrument and is used to store the relatively static description of a given microscope and its hardware components (e.g., objectives, illumination sources, filters, detectors).
The <Image> core element (Figure 2A) stores the specific instance of an acquired image and, in addition to describing the structure of the (possibly multi-dimensional) image, documents the image acquisition settings utilized when that image was acquired. To this aim, the <Image> data element stores references to specific hardware components defined in the <Instrument> alongside any necessary configurations and parameter settings utilized for a given image dataset (e.g., excitation power, filter set and detector gain).
While this structure is very robust and makes it possible to document core microscopy concepts such as <LightSource>, <Objective>, <Filter> and <Detector>, the OME specifications have not kept pace with the wide range of technologies that are now routinely used in the life and biomedical sciences.
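The split between a static <Instrument> description and a per-acquisition <Image> description that refers back to it can be sketched as follows. This is a simplified illustration and not a schema-valid OME-XML document: the XML namespace is omitted, the attribute values are invented, and placing the settings elements directly under <Image> is an assumption made to keep the example short.

```python
import xml.etree.ElementTree as ET

# Static description: the microscope and its hardware components.
ome = ET.Element("OME")
instrument = ET.SubElement(ome, "Instrument", ID="Instrument:0")
ET.SubElement(instrument, "Objective", ID="Objective:0",
              NominalMagnification="60", LensNA="1.4")
ET.SubElement(instrument, "Detector", ID="Detector:0", Type="CCD")

# Per-acquisition description: the image refers back to the hardware by ID
# and adds the settings that were used for this particular dataset.
image = ET.SubElement(ome, "Image", ID="Image:0")
ET.SubElement(image, "InstrumentRef", ID="Instrument:0")
ET.SubElement(image, "ObjectiveSettings", ID="Objective:0")
ET.SubElement(image, "DetectorSettings", ID="Detector:0", Gain="2.0")


def resolve_hardware(root, component_id):
    """Return the hardware element under <Instrument> with the given ID."""
    for inst in root.findall("Instrument"):
        for component in inst:
            if component.get("ID") == component_id:
                return component
    return None


# Follow the reference stored in the image back to the objective definition.
objective = resolve_hardware(ome, image.find("ObjectiveSettings").get("ID"))
print(ET.tostring(ome, encoding="unicode"))
```

Because the hardware is defined once and referenced by ID, many images acquired on the same microscope can share a single instrument description while recording their own settings.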
3.2.2 The 4DN-BINA proposal: a suite of three extensions of the core OME ontology
Due to its status as the de facto exchange specifications for imaging experiments, the robustness of its design, and the solid path forward toward modernization (Figure 4B and Text Box 1), the OME Data Model (i.e., OME core ontology) represents the ideal starting point for the suite of 4DN-BINA extensions presented here (Figure 5). As such, the 4DN-BINA-OME specifications proposal consists of three extensions of the OME Core (4,5) that incorporate the concept of graded documentation requirements based on a tiered-system of guidelines (Figure 3; Table II). To achieve this goal, the 4DN-BINA-OME Microscopy Metadata Specifications (7,26,73) extend the core OME elements <Instrument> (Figure 5A) and <Image> (Figure 5B) to reflect the technological advances and the Quality Control requirements associated with state-of-the-art transmitted light, widefield- and confocal-fluorescence microscopy. Specifically:
The Basic extension is designed to better capture the technical complexity of transmitted light microscopy and wide-field fluorescence, including sub-pixel single-molecule localization and single-particle tracking experiments (Figure 5, blue and grey elements).
The Advanced and Confocal extension is designed to better capture experiments requiring tunable optics and confocal microscopy (Figure 5, green elements).
The Calibration and Performance extension introduces specifications for the capture of metrics required for microscope calibration and quantitative instrument performance assessment (Figure 5, maroon elements).
In order to facilitate understanding of the 4DN-BINA-OME specifications by all relevant members of the community regardless of their information science expertise, while at the same time ensuring machine readability, formal representations of the 4DN-BINA-OME extensions are maintained on GitHub (7) in three formats: (1) a set of graphical ER schemas facilitates an overall understanding of the model structure (73); (2) an Excel spreadsheet expresses the details of the model in a human-readable form (73); and (3) an XML Schema Definition (XSD) represents the model schema in a machine-readable manner (26).
3.2.2.1 Basic 4DN-BINA-OME extension
The Basic 4DN-BINA extension of the OME Data Model was designed to better capture the technical complexity of transmitted light and wide-field fluorescence microscopy and is graphically presented in Figure 5, and Supplemental Figures 1A and 2 (7,26,73). This extension puts forth several types of modifications:
Extension of already existing elements such as <Microscope>, <Laser>, <Objective>, and <Filter> by the introduction of additional attributes (Figures 5, 6 and Supplemental Figure 1, blue and blue/red elements).
Introduction of novel elements such as <StageInsert>, <SamplePositioning> and <FocusStabilizationDevice> to capture the complexity of microscope hardware components commonly encountered in the field (Figures 5, 6 and Supplemental Figure 1, grey elements). These elements combine with the new <PlaneTransformMatrix> affine transform element to encode locations in real-world coordinates.
Mimicking the hierarchical structure of <LightSource>, introduction of several additional Abstract Parent Elements (APE) to describe hardware components, such as <LightSourceCoupling>, <Filter>, <MirroringDevice>, and <Detector>, that can be subdivided into specialized categories to streamline the structure of the model and avoid data duplication (e.g., the <Detector> category can be subdivided into <Camera> and <PointDetector>).
Establishment of the concept of individual <WavelengthRange> to facilitate the description of multi-pass excitation sources, filters, dichroic-mirrors and detectors.
Introduction of additional concepts to better capture the settings of individual hardware components that are employed during the acquisition of a specific image and are stored in the core OME <Image> element (e.g., <MicroscopeSettings>, <CameraSettings> and <PMTSettings>).
Expansion of the concept of <LightPath> to better capture the complexities of the light path that are typically encountered in different modern light microscopy modalities. The expanded <LightPath> uses a new <LightPathMap> element to describe the order of the optical components, other than the filter and dichroic, that might be placed between the excitation source and the detector, such as <LightSourceCoupling>, <Prism>, <PolarizationOptics>, <Lens>, and <OpticalAperture>.
Introduction of the <AdditionalDimensionMap> and associated elements to flexibly handle dimensions such as fluorescence lifetime, polarization angle, and lambda beyond the five canonical X, Y, Z, T (time), and C (color) dimensions.
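To illustrate how an affine transform such as the one stored in <PlaneTransformMatrix> can encode locations in real-world coordinates, the sketch below maps a pixel position to stage coordinates. The matrix layout (a 3x3 homogeneous matrix combining pixel size, in-plane rotation and stage origin), the function names and the numerical values are assumptions made for this example and are not taken from the 4DN-BINA-OME schema.

```python
import math


def make_affine(pixel_size_um, rotation_deg, origin_um):
    """Build a 3x3 homogeneous matrix: scale by pixel size, rotate, translate."""
    theta = math.radians(rotation_deg)
    c, s = math.cos(theta), math.sin(theta)
    origin_x, origin_y = origin_um
    return [
        [pixel_size_um * c, -pixel_size_um * s, origin_x],
        [pixel_size_um * s, pixel_size_um * c, origin_y],
        [0.0, 0.0, 1.0],
    ]


def pixel_to_world(matrix, col, row):
    """Apply the affine transform to a pixel position (col, row)."""
    x = matrix[0][0] * col + matrix[0][1] * row + matrix[0][2]
    y = matrix[1][0] * col + matrix[1][1] * row + matrix[1][2]
    return x, y


# Hypothetical camera: 0.1 um pixels, no rotation, plane origin at the
# stage position (1000 um, 2000 um).
matrix = make_affine(0.1, 0.0, (1000.0, 2000.0))
x_um, y_um = pixel_to_world(matrix, 50, 20)
print(x_um, y_um)
```

Storing the full matrix rather than only a pixel size allows rotation between the camera and the stage axes to be recorded in the same element.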
3.2.2.2 Advanced and Confocal 4DN-BINA-OME Extension
This extension is designed to better capture experiments requiring the use of tunable optics and confocal microscopy (Figure 5, green elements). As depicted graphically in Figure 5, and Supplemental Figures 1B and 3 (7,26,73), this extension consists primarily in the introduction of novel concepts required to capture hardware components and image acquisition settings that are needed for confocal microscopy and other advanced acquisition modalities that require tunable excitation and emission light selection such as for example, <ConfocalScanner>, <AcoustoOpticalBeamSplitter>, <LiquidCrystalTunableFilter>, and <PinHole>.
3.2.2.3 Calibration and Performance 4DN-BINA-OME Extension
The specifications for the capture of metrics required for light microscope calibration and quality control captured in this extension were developed in collaboration with QUAREP-LiMi (quarep.org; Figure 5, and Supplemental Figures 1C and 4) (8,9,26,37,73) and are described in detail in an accompanying manuscript (21). A diverse set of metrics (41–51,74–86) can be used to measure microscope performance and control image quality depending on the type of experiment being performed and the questions being asked. Together, these measurements increase the depth and reliability of a variety of assessments, analyses, and comparisons performed on light microscopy images. Such calibration metrics can be subdivided into four categories: 1) optical; 2) intensity/excitation; 3) intensity/detector, and 4) mechanical (see also Table III in: Huisman et al., 2021) (21). Metrics in the first three categories evaluate the great majority of image data. Mechanical calibration metrics become most useful in experiments that involve time-lapse imaging or the tiling of multiple Fields Of View (FOV). In order to capture these metrics categories, the Calibration and Performance extension introduces the following new elements:
<IntensityCalibrationTool> and <LightSensor> represent hardware tools (e.g., power meter) used for performing specific intensity calibration procedures and are represented as sub-elements of <Instrument>.
<OpticalCalibration>, <ExcitationCalibration>, <DetectorCalibration> and <MechanicalCalibration> store information that describes each respective calibration procedure, which might be performed to document individual image datasets, and the resulting metrics. <OpticalCalibration> and <DetectorCalibration> are connected with the <Image> element. In turn, <ExcitationCalibration> and <MechanicalCalibration> are associated with the <Channel> and the <Plane> elements, respectively.
<CalibrationStandardSlide>, <ColoredBeadsSlide>, <DNAOrigami>, and <FluorescenceReferenceSlide> belong to the <OpticalStandard> group and store information that describes the reference standards used for <OpticalCalibration> and other procedures, including <ChromaticRegistrationEvaluation> and <FieldUniformityEvaluation>.
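The associations listed above (optical and detector calibrations attached to the image, excitation calibrations to the channel, mechanical calibrations to the plane) can be sketched with plain data classes. The class and attribute names, the free-text descriptions and the helper function are hypothetical simplifications and do not reproduce the actual extension schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Calibration:
    kind: str      # "Optical", "Excitation", "Detector" or "Mechanical"
    standard: str  # reference standard or tool used (free text in this sketch)


@dataclass
class Plane:
    index: int
    mechanical: List[Calibration] = field(default_factory=list)


@dataclass
class Channel:
    name: str
    excitation: List[Calibration] = field(default_factory=list)


@dataclass
class Image:
    name: str
    optical: List[Calibration] = field(default_factory=list)
    detector: List[Calibration] = field(default_factory=list)
    channels: List[Channel] = field(default_factory=list)
    planes: List[Plane] = field(default_factory=list)


def calibration_kinds(image):
    """Collect every calibration kind documented anywhere in the image record."""
    kinds = {c.kind for c in image.optical + image.detector}
    for channel in image.channels:
        kinds.update(c.kind for c in channel.excitation)
    for plane in image.planes:
        kinds.update(c.kind for c in plane.mechanical)
    return kinds


image = Image(
    name="example",
    optical=[Calibration("Optical", "colored beads slide")],
    detector=[Calibration("Detector", "dark frames")],
    channels=[Channel("GFP", [Calibration("Excitation", "power meter")])],
    planes=[Plane(0, [Calibration("Mechanical", "stage drift measurement")])],
)
print(sorted(calibration_kinds(image)))
```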
3.3 The third axis: model elements and attributes requirement levels
Along the third axis (Figure 6), individual metadata fields in these specifications are classified based on requirement level as described by the Request for Comment (RFC) document 2119 (87). The keyword MUST, or the terms “REQUIRED” or “SHALL,” means that the field is an absolute requirement to validate experimental claims and ensure reproducibility. The keyword SHOULD, or the adjective “RECOMMENDED,” means that while there may exist valid reasons in particular circumstances to ignore a particular field, recording it is highly recommended to maximize Image Quality, scientific value and FAIRness (22). Two examples of the use of the third dimension to add flexibility to the proposed 4DN-BINA-OME Microscopy Metadata specifications are presented below:
Example 1) OME Core and 4DN-BINA Basic extension element <Objective> (Figure 6).
While the Manufacturer, Model, Magnification and Numerical Aperture (LensNA) of an objective are required to be able to interpret microscopy results and for reproducibility, other attributes such as a hardware component’s Lot Number, a Lens’s Back Focal Length and the Calibrated Magnification of an Objective are recommended to maximize Image Quality and scientific value but they are not required because they are not essential for reproducing the experiment.
Example 2) 4DN-BINA Calibration and Performance extension element <ColorBeads>
When using multicolored beads to prepare a colored beads slide to use for the optical calibration of a microscope, the Manufacturer, Catalog Number, and Concentration of the beads preparation alongside the Diameter of the beads are essential for the interpretation of the calibration results and for reproducibility. However, the bead’s Type and Material may be omitted because it can be argued that while that information improves the completeness of the data, they are not absolutely required for the correct interpretation of the results of the Optical calibration procedure in which the beads are utilized.
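The distinction between MUST and SHOULD fields in Example 1 can be expressed as a simple check. This is a minimal sketch: the field keys are hypothetical spellings of the attributes named in the text, the record values are invented, and the function is not part of any existing tool.

```python
# Requirement levels for the <Objective> attributes named in Example 1.
OBJECTIVE_SPEC = {
    "Manufacturer": "MUST",
    "Model": "MUST",
    "Magnification": "MUST",
    "LensNA": "MUST",
    "LotNumber": "SHOULD",
    "BackFocalLength": "SHOULD",
    "CalibratedMagnification": "SHOULD",
}


def check_requirements(record, spec):
    """Return (errors, warnings): missing MUST fields and missing SHOULD fields."""
    errors, warnings = [], []
    for name, level in spec.items():
        if record.get(name) in (None, ""):
            if level == "MUST":
                errors.append(name)
            elif level == "SHOULD":
                warnings.append(name)
    return errors, warnings


# Hypothetical record that omits one required and two recommended fields.
record = {
    "Manufacturer": "ExampleOptics",
    "Model": "PlanApo 60x",
    "Magnification": 60,
    "LotNumber": "A123",
}
errors, warnings = check_requirements(record, OBJECTIVE_SPEC)
print("missing MUST fields:", errors)
print("missing SHOULD fields:", warnings)
```

A record with an empty error list satisfies the required fields; warnings flag recommended information that would increase the value of the record without blocking its acceptance.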
3.4 Model implementation: Material and Methods recommendations
A recent examination of the quality of published Methods sections in scientific articles containing images obtained with advanced microscopes found that the quality of reporting was poor, with some articles containing no information about how images were obtained, and many articles lacking important basic details (13). Nonetheless, there is ample evidence that the publication of full details about how each image was obtained is vital for rigor, reproducibility and maximal scientific and sharing value (14–16,88–90). In this context, the 4DN-BINA-OME Microscopy Metadata specifications presented here are intended to provide a major contribution towards the development of community-driven criteria for which information should be included in the Methods sections of scientific publications.
As a first step, in close agreement with the proposal presented in parallel efforts (23,89), we propose that Microscopy Metadata appropriate for Tier 1 should always be included in the Material and Methods section of any journal publication to meet minimal rigor and reproducibility criteria (13). As such, the generalized and automated availability of Tier 1 metadata could save considerable effort both for authors, who would not need to search for information scattered across different data files, hardware setups and lab notebooks in preparation for publication, and for readers, who would not need to search the various sections of publications for information that may or may not have been included.
3.5 Model implementation: Facilitated metadata collection
The importance of rich metadata to ensure the quality, reproducibility, as well as scientific and sharing value of image data cannot be overstated. However, the collection of rich sets of microscopy metadata is time-consuming and, in the absence of active participation from hardware manufacturers, imposes an unfair burden on experimental scientists and is therefore difficult to enforce. Appropriate community-validated software tools and data management practices are essential to streamline and automate the documentation of microscopy experiments. In this context, in parallel with this proposal for Microscopy Metadata guidelines, a suite of three complementary and interoperable software tools is being developed and is presented in related manuscripts. 1) OMERO.mde (28) focuses on facilitating the consistent handling of image metadata ahead of data publication and deposition based on shared community Microscopy Metadata specifications and according to the FAIR principles. In addition, OMERO.mde promotes the early development of Image Metadata extension specifications to allow testing and validation before incorporation in community-accepted standards. 2) Micro-Meta App (30) provides an easy-to-use, Graphical User Interface (GUI)-based platform that interactively guides users through the process of building tier-based records of microscope hardware, accessories and image acquisition settings containing all relevant Microscopy Metadata as sanctioned by community specifications such as the ones described here. Because of its graphical nature, Micro-Meta App is particularly suited for imaging scientists who need to enter all microscope metadata, and it can also be used for teaching trainees about microscopy and for training microscope users in imaging facilities. 3) Finally, MethodsJ2 (29) focuses on automating the process of writing Microscopy Metadata guidelines-compliant Methods and Acknowledgment sections for scientific publications utilizing microscopy experiments. By design, MethodsJ2 operates in concert with Micro-Meta App, from which it automatically imports Microscopy Metadata.
3.6 Model implementation: Information required for basic image interpretability
To ensure the basic interpretability of image data acquired before the adoption of community-sanctioned guidelines, any data that might be shared or published should, at the very least, contain the required metadata fields stipulated by the intersection between Tier 1 and the Core of the OME Data Model. Thus, Tier 1/Core sanctions the baseline metadata requirements for any light microscopy experiment to be interpretable, utilized and shared for scientific purposes. Specifically, this includes minimal microscope Hardware Specifications (i.e., microscope, light source and objective manufacturer information and essential description), and essential information about the Image structure (i.e., number of planes, channels and time-points, pixel size, fluorophore name, emission and excitation wavelength, etc.).
4 - CONCLUSION
Light Microscopy images need to be accompanied by thorough documentation of the microscope hardware and imaging settings to ensure a correct interpretation of the results. A significant challenge with the reproducibility of microscopy results and their integration with other data types, such as chromatin folding maps generated by the 4DN consortium (1,2), lies in the lack of standardized reporting guidelines for microscopy experiments as well as instrument performance and calibration standards (13–15,88). Despite a growing consensus that such standards for light microscopy are desirable, previous efforts to develop shared microscopy data models and application programming interfaces (4,5,54) have not yet succeeded in the establishment of a universal set of norms. In this manuscript, a framework to extend the OME Data Model is put forth to help address this challenge. In addition to aligning the OME Data Model with current technological developments, the specifications advanced here aim to maximize usability through three features: a tiered system of documentation requirements; an expandable suite of model extensions, including the first available data model for quality-control metadata for light microscopy imaging; and the flexible use of required and recommended fields.
Microscopy is not the only field in which recent technological advances have resulted in increasingly rich datasets. Recent examples are genomic DNA and transcriptomics RNA sequencing, which are, in fact, much younger fields than microscopy. While protocols varied substantially in their early days (the original images from the sequencer were kept with the determined sequence), it took only about a decade to establish metadata requirements. One factor that helped establish such metadata criteria was the NIH Encyclopedia of DNA Elements (ENCODE) consortium (10,11). The development of Standard Operating Procedures (SOPs) and shared benchmarks (i.e., gold-standards) within this group was pivotal for the establishment of agreeable standards for practical day-to-day use. In the interest of scientific progress and making data FAIR, data and metadata standards should not be dictated by individual laboratories or microscope manufacturers. Instead, they should emerge organically from discussions involving all members of the community who can benefit from standardization and be subjected to evaluation before adoption.
In this spirit, the initial draft Microscopy Metadata specifications put forth by the 4DN (1,2) IWG were evaluated and revised by the BINA QC-DM-WG (3), resulting in the current proposal. Furthermore, this process is being carried out in alliance with the QUAREP-LiMi initiative (9,37) to ensure that all participating imaging community stakeholders (importantly including microscope and software tool manufacturers, who are ultimately responsible for providing the information to be recorded in microscopy metadata) are involved from the ground up and provide timely feedback. Because it is inherently impossible to predict all future changes the light microscopy field might undergo, and in order to ensure rigor and reproducibility for image data now and in the future, it is essential that the 4DN-BINA (as well as future) extensions of the OME Data Model for bioimaging metadata proposed here are capable of gradually evolving to capture any future technical development while supporting FAIR data principles (22,28). This is particularly important in the face of the establishment of a growing number of public image data resources (20) such as the IDR (55), EMPIAR (59), and BioImage Archive (60) hosted at the EMBL - EBI; the Japanese SSBD hosted by RIKEN (64); and, in the USA, the NIH CELL Image Library (65) and the BRAIN initiative’s brain tissue imaging resources (17). These resources offer the opportunity to emulate for light microscopy the successful path that has led to community standards in the field of genomics (91–95). To this aim, in addition to our work in the context of QUAREP-LiMi, the further development of the Microscopy Metadata Specifications is being coordinated with other parallel initiatives including: 1) the OME community development of general criteria and procedures to capture and store metadata in OME-NGFF (Text Box 1). 
The OME-NGFF effort (35,36) is implementing storage approaches to hold the binary pixel data and the metadata described herein in standardized, shareable, long-lived, efficient, and performant containers (e.g., files). 2) The EMBL-EBI development of the REMBI recommendations for metadata to be included with imaging datasets deposited to BioImage Archive (60,66,96). 3) The development of the International Organization for Standardization (ISO) 23494-1 standard that will include the 4DN-BINA-OME (NBO) Microscopy Metadata specifications as part of a Provenance information model for biological material and data (34,97). 4) The development of educational material in collaboration with Global BioImaging to increase awareness about the importance of metadata standards to ensure image data quality, reproducibility and re-use value.
In conclusion, we are convinced that because of its strong roots in the community, and because it is closely linked with the parallel development of easy-to-use interactive tools to facilitate metadata collection (28–30), the flexible model framework presented here will provide a significant step forward towards the establishment of robust and future-proof light microscopy metadata standards. In turn, this will help increase rigor and reproducibility in imaging data, rewarding everyone involved with improved trust in published results.
5 - AUTHOR CONTRIBUTIONS
Author contributions categories utilized here were devised by the CRediT initiative (98,99).
Mathias Hammer: Conceptualization, Methodology, Investigation, Data Curation, Writing - Review & Editing; Maximiliaan Huisman: Conceptualization, Methodology, Investigation, Writing - Original Draft, Writing - Review & Editing, Visualization; Alessandro Rigano: Conceptualization, Methodology, Software; Ulrike Boehm, James J. Chambers, Nathalie Gaudreault, Alison J. North, Jaime A. Pimentel, and Damir Sudar: Validation, Investigation, Data Curation, Writing - Review & Editing; Peter Bajcsy, Claire M. Brown, Alexander D. Corbett, Orestis Faklaris, Judith Lacoste, Alex Laude, Glyn Nelson, and Roland Nitschke: Validation, Writing - Review & Editing; Farzin Farzam: Investigation; Carlas Smith: Conceptualization, Methodology; David Grunwald: Conceptualization, Methodology, Investigation, Resources, Writing - Original Draft, Supervision, Project administration, Funding acquisition; Caterina Strambio-De-Castillia: Conceptualization, Methodology, Software, Validation, Investigation, Resources, Data Curation, Writing - Original Draft, Writing - Review & Editing, Visualization, Supervision, Project administration, Funding acquisition.
6 - ACKNOWLEDGEMENTS
We would like to thank Kevin Fogarty, Lawrence Lifshitz and Karl Bellve at the Biomedical Imaging Group of the Program in Molecular Medicine at the University of Massachusetts Medical School for invaluable intellectual input and countless fruitful discussions and for their friendship, advice, and steadfast support throughout the development of this project.
This project could never have been carried out without the leadership, insightful discussions, support and friendship of all OME consortium members with particular reference to Jason Swedlow, Josh Moore, Chris Allan, Jean Marie Burel, and Will Moore. We are massively indebted to the RIKEN community for their fantastic work to bring open science into biology. We would like to particularly acknowledge Norio Kobayashi and Shuichi Onami for their friendship and support.
We thank all members of BioImaging North America (in particular Lisa Cameron, Michelle Itano, and Paula Montero-Llopis), German Bioimaging (in particular Susanne Kunis and Stephanie Wiedkamp Peters), Euro-Bioimaging (in particular Antje Keppler and Federica Paina) and QUAREP-LiMi (in particular all members of the Working Group 7 - Metadata; quarep.org) for invaluable intellectual input, fruitful discussions and advice. We are also indebted to the following individuals for their continued and steadfast support: Jeremy Luban, Roger Davis, and Thoru Pederson at the University of Massachusetts Medical School; Burak Alver, Joan Ritland, Rob Singer, and Warren Zipfel at the 4D Nucleome Project; Ian Fingerman, John Satterlee, Judy Mietz, Richard Conroy, and Olivier Blondel at the NIH. We would like to thank Dr. Darryl Conte for the critical reading of the manuscript. We are deeply indebted to Thao P. Do, Cell visualization and web specialist at the Allen Institute for Cell Science, for her expert advice and skilled work which was invaluable for the production of Figures 1, 2, 3 and 4.
This work was supported by NIH grant #1U01EB021238 and NSF grant #1917206 to D.G., NIH grant #5U01CA200059-03 to C.S.D.C. and D.G., and by grant #2019-198155 (5022) awarded to C.S.D.C. by the Chan Zuckerberg Initiative DAF, an advised fund of Silicon Valley Community Foundation, as part of their Imaging Scientist Program. D.S. was funded in part by NIH/NCI grants U54CA209988 and U2CCA23380. C.M.B. was funded in part by grant #2020-225398 from the Chan Zuckerberg Initiative DAF, an advised fund of Silicon Valley Community Foundation. R.N. was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) grant number Ni 451/9-1 MIAP-Freiburg. C.S.S. was supported by the Netherlands Organisation for Scientific Research (NWO), under NWO START-UP project no. 740.018.015 and NWO Veni project no. 16761.
Footnotes
# Members of the BioImaging North America Quality Control and Data Management Working Group
Abbreviation list
- OME: Open Microscopy Environment.