RT Journal Article
SR Electronic
T1 Multitask learning for Transformers with application to large-scale single-cell transcriptomes
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.02.05.935239
DO 10.1101/2020.02.05.935239
A1 Minxing Pang
A1 Jesper Tegnér
YR 2020
UL http://biorxiv.org/content/early/2020/02/11/2020.02.05.935239.abstract
AB Recent progress in machine learning provides competitive methods for many traditional bioinformatics topics, such as transcriptome sequencing and single-cell analysis. However, discovering biomedical correlations among cells across large-scale data sets remains challenging. Our attention-based neural network module, with 300 million parameters, captures biological knowledge in a data-driven way. The module provides high-quality embeddings, taxonomy analysis, and similarity measurement. We tested the model on the Mouse Brain Atlas, which consists of 160,000 cells and 25,000 genes. The module yielded findings that have been verified by biologists and achieved better performance when benchmarked against an autoencoder and principal component analysis.