Abstract
Motivation Many computational tools attempt to infer gene regulatory networks (GRNs) from single-cell RNA sequencing (scRNA-seq) data. One recent advance is DeepSEM, a deep generative model generalizing the Linear Structural Equation Model (SEM) that improves benchmark performance over popular GRN inference methods. While DeepSEM is promising, its results are not stable across multiple runs. To overcome this instability and address concerns about how dropout is handled, we propose GRN-VAE.
Results GRN-VAE improves stability and efficiency while maintaining accuracy by delaying the introduction of the sparse loss term. To minimize the negative impact of dropout in single-cell data, GRN-VAE trains on non-zero data. Most importantly, we introduce a novel idea, Dropout Augmentation, which improves model robustness by adding a small amount of simulated dropout to the data. GRN-VAE compares favorably to other methods on the BEELINE benchmark data sets, using several collections of “ground truth” regulatory relationships, and on a real-world data set, where it efficiently provides stable results consistent with literature-based findings.
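As an illustration only, the three training ideas named above (simulated dropout, a loss restricted to non-zero entries, and a delayed sparse penalty) can be sketched in Python with NumPy. This is not the GRN-VAE implementation: the function names, the default rate of 0.1, the hard on/off schedule, and the choice to mask on the target matrix are all assumptions made for the sketch.

```python
import numpy as np

def dropout_augment(expr, rate=0.1, rng=None):
    """Zero out a random fraction `rate` of entries (hypothetical helper).

    Returns the augmented copy and the boolean mask of entries set to zero.
    The input matrix is left unchanged.
    """
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(expr.shape) < rate      # True where dropout is simulated
    augmented = np.where(mask, 0.0, expr)     # new array; expr is not modified
    return augmented, mask

def nonzero_mse(recon, target):
    """Mean squared error over entries where `target` is non-zero (assumed masking rule)."""
    nz = target != 0
    if not nz.any():
        return 0.0
    return float(np.mean((recon[nz] - target[nz]) ** 2))

def sparse_weight(epoch, delay_epochs, alpha):
    """Weight on the sparse (L1) penalty: 0 before `delay_epochs`, then `alpha` (assumed schedule)."""
    return alpha if epoch >= delay_epochs else 0.0
```

In a training loop, one would typically call `dropout_augment` on each batch, compute the reconstruction loss with `nonzero_mse`, and add `sparse_weight(epoch, ...)` times the L1 norm of the adjacency matrix; the exact schedule and rate used by the authors are not stated in this abstract.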
Conclusions The stability and robustness of GRN-VAE make it a practical and valuable addition to the toolkit for GRN inference from single-cell data. Dropout Augmentation may have wider applications beyond the GRN-inference problem.
Availability and implementation Source code is available at https://bcb.cs.tufts.edu/GRN-VAE
Contact hao.zhu{at}tufts.edu; donna.slonim{at}tufts.edu
Competing Interest Statement
The authors have declared no competing interest.