Abstract
Gene co-expression analysis is an effective method for detecting groups (or modules) of genes that display similar expression patterns and may therefore function in the same biological processes. Here, we present ‘Simple Tidy GeneCoEx’, a gene co-expression analysis workflow written in the R programming language. The workflow is highly customizable at multiple stages of the pipeline, including gene selection, edge selection, clustering resolution, and data visualization. Powered by the tidyverse package ecosystem and the network analysis functions of the igraph package, the workflow detects gene co-expression modules whose members are highly interconnected. Step-by-step instructions with two use case examples, as well as source code, are available at https://github.com/cxli233/SimpleTidy_GeneCoEx.
Core Ideas
An R-based workflow that performs gene co-expression analysis was developed.
The workflow is based on tidyverse packages and graph theory.
The workflow is highly customizable, detects tight gene co-expression modules, and generates publication-quality figures.
Two plant gene expression datasets were used to benchmark the workflow.
Competing Interest Statement
The authors have declared no competing interest.
Abbreviations
- ANCOVA: analysis of covariance
- ANOVA: analysis of variance
- FPKM: fragments per kilobase exon model per million mapped fragments
- LCM: laser capture micro-dissection
- msq: mean sum of squares
- PCA: principal component analysis
- sd: standard deviation
- TPM: transcripts per million
- WGCNA: weighted gene co-expression network analysis