PT - JOURNAL ARTICLE
AU - Jianbo Zhang
AU - Dilip R. Panthee
TI - PyBSASeq: a novel, simple, and effective algorithm for BSA-Seq data analysis
AID - 10.1101/654137
DP - 2019 Jan 01
TA - bioRxiv
PG - 654137
4099 - http://biorxiv.org/content/early/2019/11/27/654137.short
4100 - http://biorxiv.org/content/early/2019/11/27/654137.full
AB - Bulked segregant analysis (BSA), coupled with next-generation sequencing (NGS), allows the rapid identification of both qualitative and quantitative trait loci (QTL); this technique is referred to here as BSA-Seq. The current SNP index method and G-statistic method for BSA-Seq data analysis require relatively high sequencing coverage to detect major single nucleotide polymorphism (SNP)-trait associations, which leads to high sequencing cost. Here we developed a simple and effective algorithm for BSA-Seq data analysis and implemented it in Python; the program was named PyBSASeq. Using PyBSASeq, the likely trait-associated SNPs (ltaSNPs) were identified via Fisher’s exact test, and the ratio of ltaSNPs to total SNPs in a chromosomal interval was then used to identify the genomic regions that condition the trait of interest. The results obtained this way are similar to those generated by the current methods, but with more than five times higher sensitivity, which can reduce the sequencing cost by ~80% and makes BSA-Seq more applicable to species with a large genome. Significance Statement: BSA-Seq can be utilized to rapidly identify DNA polymorphism-trait associations, and PyBSASeq allows the detection of such associations at much lower sequencing coverage than the current methods, leading to lower sequencing cost and making BSA-Seq more accessible to the research community and more applicable to species with a large genome.
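The two steps named in the abstract (a per-SNP Fisher’s exact test, then the ratio of ltaSNPs to total SNPs per chromosomal interval) can be illustrated with a minimal sketch. This is not the PyBSASeq code. The function names, the input tuple layout, the 0.01 significance cutoff, the 2 Mb interval size, and the use of non-overlapping bins are all assumptions made for illustration, and the toy allele depths are invented.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def lta_ratio_by_window(snps, window=2_000_000, alpha=0.01):
    """Ratio of likely trait-associated SNPs to total SNPs per interval.

    snps: iterable of (position, ref1, alt1, ref2, alt2), i.e. the REF/ALT
    read depths of one SNP in bulk 1 and bulk 2 (hypothetical layout).
    window and alpha are illustrative defaults, not values from the source.
    """
    totals, hits = {}, {}
    for pos, ref1, alt1, ref2, alt2 in snps:
        w = pos // window  # non-overlapping bin index (a simplification)
        totals[w] = totals.get(w, 0) + 1
        if fisher_exact_two_sided(ref1, alt1, ref2, alt2) < alpha:
            hits[w] = hits.get(w, 0) + 1
    return {w: hits.get(w, 0) / totals[w] for w in sorted(totals)}


# Invented toy data: one strongly skewed SNP and two balanced ones.
toy_snps = [
    (100, 20, 0, 0, 20),
    (200, 10, 10, 10, 10),
    (2_500_000, 10, 10, 11, 9),
]
ratios = lta_ratio_by_window(toy_snps)
print(ratios)  # {0: 0.5, 1: 0.0}
```

In this sketch, an interval whose ratio stands well above the genome-wide background would be a candidate trait-associated region; how that background threshold is set is not specified in the abstract and is left out here.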