Abstract
Identification of proteins is one of the most computationally intensive steps in genomics studies. It usually relies on aligners that don’t accommodate rich information on proteins and require additional pipelining steps for protein identification. We introduce kAAmer, a protein database engine based on amino-acid k-mers, that supports fast identification of proteins with complementary annotations. Moreover, the databases can be hosted and queried remotely.
Footnotes
1 Go programming language (https://golang.org/)
2 Non-Volatile Memory Express
Copyright
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.