Abstract
Shallow whole-genome sequencing (sWGS) offers a cost-effective approach to detecting copy number alterations (CNAs). However, a standardized workflow designed specifically for sWGS analysis is still lacking. To address this need, we present SAMURAI, a bioinformatics pipeline for analyzing CNAs from sWGS data in a standardized and reproducible manner.
SAMURAI is built using established community standards, ensuring portability, scalability, and reproducibility. The pipeline features a modular design with independent blocks for data pre-processing, copy number analysis, and customized reporting. Users can select workflows tailored for either solid or liquid biopsy analysis (e.g., circulating tumor DNA), with specific tools integrated for each sample type. The final report generated by SAMURAI provides detailed results to facilitate data interpretation and potential downstream analyses. To demonstrate its robustness, SAMURAI was validated using simulated and real-world datasets. The pipeline achieved high concordance with ground truth data and maintained consistent performance across various scenarios.
By promoting standardization and offering a versatile workflow, SAMURAI empowers researchers in diverse environments to reliably analyze CNAs from sWGS data. This, in turn, holds promise for advancements in precision medicine.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
* Co-last authors