RT Journal Article
SR Electronic
T1 Somatic hypermutation analysis for improved identification of B cell clonal families from next-generation sequencing data
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 788620
DO 10.1101/788620
A1 Nima Nouri
A1 Steven H. Kleinstein
YR 2019
UL http://biorxiv.org/content/early/2019/10/01/788620.abstract
AB Motivation: Adaptive immune receptor repertoire sequencing (AIRR-Seq) offers the possibility of identifying and tracking B cell clonal expansions during adaptive immune responses. Members of a B cell clone are descended from a common ancestor and share the same initial V(D)J rearrangement, but their B cell receptor (BCR) sequences may differ due to the accumulation of somatic hypermutations (SHMs). Clonal relationships are learned from AIRR-Seq data by analyzing the BCR sequence, with the most common methods focused on the highly diverse CDR3 region. However, clonally related cells often share SHMs that accumulated during affinity maturation. Here, we investigate whether shared SHMs in the V and J segments of the BCR can be leveraged along with the CDR3 sequence to improve the ability to identify clonally related sequences. We develop independent distance functions that capture shared mutations and CDR3 similarity, and combine these in a spectral clustering framework. Using simulated data, we show that this model improves both the sensitivity and specificity for identifying clonal relationships. Availability: Source code for this method is freely available under the CC BY-SA 4.0 license in the SCOPer (Spectral Clustering for clOne Partitioning) R package (version 0.2 or newer), part of the Immcantation framework: www.immcantation.org. Contact: steven.kleinstein{at}yale.edu