ABSTRACT
Motivation Long non-coding RNA (lncRNA) expression data have been increasingly used to find diagnostic and prognostic biomarkers in cancer studies. Existing differential analysis tools for RNA sequencing do not effectively accommodate low-abundance genes, which are common among lncRNAs. We propose a novel and robust statistical method, lncDIFF, to detect differentially expressed (DE) genes without assuming the true density of normalized counts.
Results lncDIFF adopts a generalized linear model with a zero-inflated exponential quasi-likelihood to estimate the group effect on normalized counts, and employs the likelihood ratio test to detect differentially expressed genes. The proposed method and tool are suitable for data processed with standard RNA-Seq preprocessing and normalization pipelines. Simulation results illustrate that lncDIFF detects differentially expressed genes with higher power and a lower false discovery rate regardless of the data pattern. Analysis of a head and neck squamous cell carcinoma study also confirms that lncDIFF has better sensitivity in identifying novel lncRNA genes with relatively large fold change and prognostic value.
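To make the idea of a likelihood ratio test under a zero-inflated exponential model concrete, the following is a minimal two-group sketch in Python. It is not the lncDIFF implementation: the function names `exp_loglik` and `zie_lrt` are hypothetical, the zero-inflation probability is assumed to be shared by both groups (so it cancels from the test statistic), and the p-value relies on the usual chi-square approximation with one degree of freedom.

```python
import math

def exp_loglik(values):
    """Maximized exponential log-likelihood of positive values.

    The MLE of the mean is the sample mean, so
    sum(-log(mu) - y / mu) evaluated at mu = mean is -n * log(mean) - n.
    """
    n = len(values)
    if n == 0:
        return 0.0
    mu = sum(values) / n
    return -n * math.log(mu) - n

def zie_lrt(group_a, group_b):
    """Illustrative LRT for a group effect on the exponential mean.

    Assumption (not taken from the paper): zeros occur with the same
    probability in both groups, so only nonzero counts enter the statistic.
    """
    pos_a = [y for y in group_a if y > 0]
    pos_b = [y for y in group_b if y > 0]
    ll_alt = exp_loglik(pos_a) + exp_loglik(pos_b)   # group-specific means
    ll_null = exp_loglik(pos_a + pos_b)              # one common mean
    stat = max(0.0, 2.0 * (ll_alt - ll_null))
    # Survival function of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical normalized counts for one gene in two groups.
stat, p_value = zie_lrt([0, 1.0, 1.0, 1.0], [0, 10.0, 10.0, 10.0])
print(f"LRT statistic = {stat:.3f}, p-value = {p_value:.4f}")
```

In a genome-wide setting, a test of this kind would be applied gene by gene and the resulting p-values adjusted for multiple testing before calling genes differentially expressed.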
Availability and Implementation lncDIFF is an R package available at https://github.com/qianli10000/lncDIFF.
Supplementary Information Supplementary Data are available at Bioinformatics online.