Abstract
Cells regulate their functions through gene expression, driven by a complex interplay of transcription factors and other regulatory mechanisms that together can be modeled as gene regulatory networks (GRNs). The emergence of single-cell multi-omics technologies has driven the development of several methods that integrate transcriptomics and chromatin accessibility data to infer GRNs. While these methods have demonstrated their utility in discovering new regulatory interactions, a comprehensive benchmark evaluating their mechanistic and predictive properties, as well as their ability to recover known interactions, is lacking. To address this, we built a comprehensive framework, Gene Regulatory nETwork Analysis (GRETA), available as a Snakemake pipeline, that implements state-of-the-art methods and decomposes them into modular steps. Using GRETA, we found that the inferred GRNs were highly sensitive to methodological choices, such as changing the random seed or replacing individual steps in the inference pipeline, as well as to whether paired or unpaired multimodal data were used. Although the resulting networks performed well in predictive evaluation tasks and partially recovered known interactions, they struggled to capture causal relationships from perturbation assays. Our work draws attention to the challenges of inferring GRNs from single-cell omics data, offers guidelines, and presents a flexible framework for developing and testing new approaches.
Competing Interest Statement
PBM is partially supported by funding from GSK. JSR reports funding from GSK, Pfizer, AstraZeneca and Sanofi and fees/honoraria from Travere Therapeutics, Stadapharm, Astex, Pfizer, Grunenthal, Moderna, Tempus, and Owkin.
Footnotes
# These two authors jointly supervised this work.
This revision adds one more method, one mechanistic evaluation metric, and two more databases for the TF-TF and CRE tasks.