ABSTRACT
Tumors are complex assemblies of cellular and acellular structures patterned on spatial scales from microns to centimeters. Study of these assemblies has advanced dramatically with the introduction of high-plex spatial profiling. Image-based profiling methods reveal the intensities and spatial distributions of 20–100 proteins at subcellular resolution in 10³–10⁷ cells per specimen. Despite extensive work on methods for extracting single-cell data from these images, all tissue images contain artefacts such as folds, debris, antibody aggregates, optical aberrations and image processing errors that arise from imperfections in specimen preparation, data acquisition, image assembly, and feature extraction. We show that these artefacts dramatically impact single-cell data analysis, obscuring meaningful biological interpretation. We describe an interactive quality control software tool, CyLinter, that identifies and removes data associated with imaging artefacts. CyLinter greatly improves single-cell analysis, especially for archival specimens sectioned many years prior to data collection, such as those from clinical trials.
Competing Interest Statement
P.K.S. is a cofounder and member of the Board of Directors of Glencoe Software, a member of the Board of Directors for Applied Biomath and a member of the Scientific Advisory Board for RareCyte, NanoString and Montai Health; he holds equity in Glencoe and RareCyte. P.K.S. is a consultant for Merck. P.K.S. declares that none of these relationships have influenced the content of this manuscript. E.A.M. reports compensated service on Scientific Advisory Boards for AstraZeneca, BioNTech and Merck; uncompensated service on Steering Committees for Bristol Myers Squibb and Roche/Genentech; speakers' honoraria and travel support from Merck Sharp & Dohme; and institutional research support from Roche/Genentech (via an SU2C grant) and Gilead. She also reports research funding from Susan Komen for the Cure, for which she serves as a Scientific Advisor, and uncompensated participation as a member of the American Society of Clinical Oncology Board of Directors. J.L.G. serves or has previously served on advisory boards and/or as a scientific advisory board member for Array BioPharma/Pfizer, AstraZeneca, BD Biosciences, Carisma, Codagenix, Duke Street Bio, GlaxoSmithKline, Kowa, Kymera, OncoOne and Verseau Therapeutics, and has research grants from Array BioPharma/Pfizer, Duke Street Bio, Eli Lilly, GlaxoSmithKline and Merck. The other authors declare no competing interests.
Footnotes
This version reflects substantial textual and figure edits prompted by peer review. These include additional data analysis, a description of a deep-learning model for automated artefact detection, and multiple associated modifications to the CyLinter codebase, project website, and supporting Wiki page. We believe that these revisions substantially improve the content, readability, and potential impact of our manuscript.