TY  - JOUR
T1  - Optimizing the identification of causal variants across varying genetic architectures in crops
JF  - bioRxiv
DO  - 10.1101/310391
SP  - 310391
AU  - Chenyong Miao
AU  - Jinliang Yang
AU  - James C. Schnable
Y1  - 2018/01/01
UR  - http://biorxiv.org/content/early/2018/04/29/310391.abstract
N2  - Background: Association studies use statistical links between genetic markers and variation in a phenotype’s value across many individuals to identify genes controlling variation in the target phenotype. However, this approach, particularly when conducted on a genome-wide scale (GWAS), has limited power to identify the genes responsible for variation in traits controlled by complex genetic architectures. Results: Here we employ simulation studies utilizing real-world genotype datasets from association populations in four species with distinct minor allele frequency distributions, population structures, and patterns of linkage disequilibrium to evaluate the impact of variation in both heritability and trait complexity on both conventional mixed linear model-based GWAS and two new approaches specifically developed for complex traits. Mixed linear model-based GWAS rapidly loses power for more complex traits. FarmCPU, a method based on multi-locus mixed linear models, provides the greatest statistical power for moderately complex traits.
A Bayesian approach adopted from genomic prediction provides the greatest statistical power to identify causal genetic loci for extremely complex traits. Conclusions: Using estimates of the complexity of the genetic architecture of target traits can guide the selection of appropriate statistical methods and improve the overall accuracy and power of GWAS. Abbreviations: GWAS: Genome-Wide Association Study; GBS: Genotyping-By-Sequencing; PCA: Principal Component Analysis; LD: Linkage Disequilibrium; SNP: Single Nucleotide Polymorphism; MAF: Minor Allele Frequency; QTN: Quantitative Trait Nucleotide; GEMMA: Genomic Association and Prediction Integrated Tool; GLM: General Linear Model; MLM: Mixed Linear Model; MLMM: Multi-Locus Mixed-Model; FDR: False Discovery Rate; HDRA: High-Density Rice Array; HCC: the Holland Computing Center
ER  - 