Abstract
Motivation: For over 10 years, allele-level HLA matching for bone marrow registries has been performed in a probabilistic context. HLA typing technologies provide ambiguous results in that they cannot distinguish among all known HLA allele sequences; therefore, registries have implemented matching algorithms that provide lists of donors and cord blood units ordered by the likelihood of allele-level matching at specific HLA loci. With the growth of registry sizes, current match algorithm implementations are unable to provide match results in real time.
Results: We present here a novel, computationally efficient, open source implementation of an HLA imputation and match algorithm using a graph database platform. Because it relies on graph traversal, the algorithm's runtime grows slowly with registry size. This implementation generates results that agree with consensus output on a publicly available match algorithm cross-validation dataset.
Availability: The Python, Perl and Neo4j code is available at https://git.com/nmdp-bioinformatics/grimm
Supplementary information: Supplementary data are available at Bioinformatics online.