PT - JOURNAL ARTICLE
AU - Latorre, Sergio M.
AU - Langner, Thorsten
AU - Malmgren, Angus
AU - Win, Joe
AU - Kamoun, Sophien
AU - Burbano, Hernán A.
TI - SNP calling parameters have minimal impact on population structure and divergence time estimates for the rice blast fungus
AID - 10.1101/2022.03.06.482794
DP - 2022 Jan 01
TA - bioRxiv
PG - 2022.03.06.482794
4099 - http://biorxiv.org/content/early/2022/03/07/2022.03.06.482794.short
4100 - http://biorxiv.org/content/early/2022/03/07/2022.03.06.482794.full
AB - Objectives: Accurate single-nucleotide polymorphism (SNP) calls are crucial for robust evolutionary and population genetic inferences in genomic analyses. Such inferences can reveal the time-scales and processes associated with the emergence and spread of pandemic plant pathogens, such as the rice blast fungus Magnaporthe oryzae (Syn. Pyricularia oryzae). However, the specificity and sensitivity of SNP calls depend on the filtering parameters applied to the data. Here, we used a benchmarking approach to evaluate the impact of SNP calling on different population genetic analyses of the rice blast fungus, namely genetic clustering, topology of phylogenetic reconstructions, and estimation of evolutionary rates.
Results: To benchmark SNP calling parameters, we generated a gold standard set of validated SNPs by sequencing nine M. oryzae genomes with both Illumina short reads and Oxford Nanopore Technologies (ONT). We used the gold standard set of SNPs to identify the SNP calling parameter configuration that maximizes sensitivity and specificity. We found that the choice of parameter configuration can substantially change the number of ascertained SNPs, preferentially affecting SNPs segregating at low population frequency. However, SNP calling parameter configurations did not significantly affect the clustering of isolates into clonal lineages, the monophyly of each clonal lineage, or the estimation of evolutionary rates.
We leveraged the evolutionary rates obtained from each SNP calling parameter configuration to generate divergence time estimates that take into account the uncertainty associated with both the estimation of evolutionary rates and SNP calling. Our analysis indicates that M. oryzae clonal lineage expansions took place ~300 years ago.

Competing Interest Statement: The authors have declared no competing interest.

Abbreviations:
M. oryzae: Magnaporthe oryzae
ONT: Oxford Nanopore Technologies
VQSR: Variant Quality Score Recalibration
GATK: Genome Analysis Toolkit
SNP: Single Nucleotide Polymorphism
GSVD: Gold Standard Variants Dataset
QD: Quality by Depth
REF: Reference
ALT: Alternative
ReadPosRankSum: Rank sum test for relative positioning of REF versus ALT alleles within reads
MQRankSum: Rank sum test for mapping qualities of REF versus ALT reads
BaseQRankSum: Rank sum test of REF versus ALT base quality scores
FS: Strand bias estimated using Fisher’s exact test
VCF: Variant Call Format
PCA: Principal Component Analysis
SFS: Site Frequency Spectrum
MCMC: Markov Chain Monte Carlo
MRCA: Most Recent Common Ancestor
TMRCA: Time to the Most Recent Common Ancestor
HPD: Highest Posterior Density
ESS: Effective Sample Size
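
The benchmarking idea described in the abstract, scoring each SNP calling parameter configuration against the gold standard set, can be illustrated with a minimal Python sketch. Everything here is hypothetical and for illustration only: the function `benchmark_calls`, the configuration names, the toy positions, and the number of evaluated sites are not taken from the study. The selection rule (largest sum of sensitivity and specificity) is likewise an assumption, since the abstract does not state the exact criterion used.

```python
def benchmark_calls(called, gold, n_sites):
    """Return (sensitivity, specificity) for one set of called SNP positions.

    called, gold: sets of (chromosome, position) tuples.
    n_sites: total number of evaluated sites (assumed known), needed to
             count true negatives.
    """
    tp = len(called & gold)   # called and present in the gold standard
    fp = len(called - gold)   # called but absent from the gold standard
    fn = len(gold - called)   # gold-standard SNPs that were missed
    tn = n_sites - tp - fp - fn
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy gold standard and two hypothetical filtering configurations.
gold = {("chr1", 10), ("chr1", 25), ("chr2", 7), ("chr2", 40)}
configs = {
    "lenient": {("chr1", 10), ("chr1", 25), ("chr2", 7), ("chr2", 40), ("chr2", 99)},
    "strict": {("chr1", 10), ("chr2", 7)},
}
n_sites = 1000  # hypothetical number of evaluated sites

results = {
    name: benchmark_calls(calls, gold, n_sites)
    for name, calls in configs.items()
}

# Assumed selection rule: maximize sensitivity + specificity.
best = max(results, key=lambda name: sum(results[name]))

for name, (sens, spec) in results.items():
    print(f"{name}: sensitivity={sens:.3f} specificity={spec:.3f}")
print("selected configuration:", best)
```

In this toy case the lenient configuration recovers every gold standard SNP at the cost of one false positive, while the strict configuration calls no false positives but misses half of the true SNPs. The same trade-off motivates comparing downstream analyses across configurations rather than relying on a single one.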