Abstract
Background Batch effects are unwanted variations in data that may obscure biological signals, leading to bias or errors in subsequent analyses. Effective evaluation and elimination of batch effects is therefore necessary for omics data analysis, especially for large cohorts of thousands of samples profiled on different experimental platforms. Existing batch effect correction tools focus mainly on algorithm development, and their reliance on programming skills and knowledge of the data distribution limits their use by many researchers. To facilitate the evaluation and correction of batch effects, we provide a user-friendly graphical web platform for batch effect analysis.
Results We developed BatchServer, an open-source R/Shiny-based web server that allows users to interactively evaluate, visualize, and correct batch effects in high-throughput data sets. BatchServer includes a modified version of ComBat, a popular batch effect adjustment tool, to correct batch effects, together with PVCA (Principal Variance Component Analysis) and UMAP (Uniform Manifold Approximation and Projection) to evaluate and visualize them. BatchServer is an efficient batch effect processing platform, as shown by its application to three publicly available data sets.
Conclusion BatchServer is a user-friendly, open-source web server that supports comprehensive batch effect analysis, making batch effect evaluation and correction easier for biologists. BatchServer is deployed as a web server at https://lifeinfo.shinyapps.io/batchserver/. The source code is freely available at https://github.com/zhutiansheng/batch_server.
List of abbreviations
- PCA: Principal Component Analysis
- PVCA: Principal Variance Component Analysis
- UMAP: Uniform Manifold Approximation and Projection
- SVD: Singular Value Decomposition
- t-SNE: t-distributed Stochastic Neighbor Embedding
- K-S test: Kolmogorov–Smirnov test