Abstract
Clinical diagnoses rely on a wide variety of laboratory tests and imaging studies, interpreted alongside physical examination findings and the patient’s history and symptoms. Currently, the tools of diagnosis make limited use of the immune system’s internal record of specific disease exposures encoded by the antigen-specific receptors of memory B cells and T cells, and there has been little integration of the combined information from B cell and T cell receptor sequences. Here, we analyze extensive receptor sequence datasets with three different machine learning representations of immune receptor repertoires to develop an interpretive framework, MAchine Learning for Immunological Diagnosis (Mal-ID), that screens for multiple illnesses simultaneously. This approach is effective in identifying a variety of disease states, including acute and chronic infections and autoimmune disorders. It does so even when other differences are present in the immune repertoires, such as those between pediatric and adult patient groups. Importantly, many features of the model of immune receptor sequences are human-interpretable. They independently recapitulate known biology of the responses to infection by SARS-CoV-2 and HIV, provide evidence of receptor antigen specificity, and reveal common features of autoreactive immune receptor repertoires, indicating that machine learning on immune repertoires can yield new immunological knowledge. This framework could be useful in identifying immune responses to new infectious diseases as they emerge.
Competing Interest Statement
M.E.Z., R.T., A.K., and S.D.B. are co-inventors on a patent application related to this manuscript. S.D.B. has consulted for Regeneron, Sanofi, Novartis, and Janssen on topics unrelated to this study and owns stock in AbCellera Biologics. A.K. is scientific co-founder of Ravel Biotechnology Inc., is on the scientific advisory board of PatchBio Inc., SerImmune Inc., AINovo Inc., TensorBio Inc. and OpenTargets, was a consultant with Illumina Inc. and owns shares in DeepGenomics Inc., Immunai Inc., and Freenome Inc. C.A.B. reports compensation for consulting and/or SAB membership from Catamaran Bio, DeepCell Inc., Immunebridge, Sangamo Therapeutics, and Revelation Biosciences on topics unrelated to this study. J.D.G. has consulted for Eli Lilly, Gilead, GSK, and Karius, and reports research support from Eli Lilly, Gilead, Regeneron, and Merck, and collaborative services agreements with Adaptive Biotechnologies, Monogram Biosciences, and Labcorp (outside of this study). R.T. is a consultant for Genentech. J.A.J. has served as a consultant for AbbVie, Janssen, Novartis, and GlaxoSmithKline. J.A.J. also has unrelated patents through the Oklahoma Medical Research Foundation which the foundation has licensed to Progentec Biosciences, LLC. J.T.M. has served as a consultant for AbbVie, Alexion, Alumis, Amgen, AstraZeneca, Aurinia, Bristol Myers Squibb, EMD Serono, Genentech, Gilead, GlaxoSmithKline, Lilly, Merck, Pfizer, Provention, Remegen, Sanofi, UCB, and Zenas, and reports research support from AstraZeneca, Bristol Myers Squibb, and GlaxoSmithKline (outside of this study). Other co-authors declare that they have no competing interests.
Footnotes
Updated data availability and code availability statements.