RT Journal Article SR Electronic T1 Subset-based genomic prediction provides insights into the genetic architecture of free amino acid levels in dry Arabidopsis thaliana seeds JF bioRxiv FD Cold Spring Harbor Laboratory SP 272047 DO 10.1101/272047 A1 Kevin A. Bird A1 Sarah D. Turner A1 Timothy M. Beissinger A1 Ruthie Angelovici YR 2018 UL http://biorxiv.org/content/early/2018/02/26/272047.abstract AB Amino acids play a central role in plant growth, development, and human nutrition. A better understanding of the genetic architecture of amino acid traits will enable researchers to integrate this information for plant breeding and biological discovery. Despite a collection of successfully mapped genes, a fundamental understanding of the types of genes driving the genetic architecture of amino acid-related traits in crop seeds and model systems such as Arabidopsis has remained elusive. To address this issue, we applied genomic prediction using distinct subsets of genes, including those belonging to the known amino acid biochemical pathways, to quantify their contributions to the genetic variation of free amino acid levels in dry seeds. First, we demonstrate that genomic prediction of free amino acid levels is moderately accurate in Arabidopsis seeds. Then, we explore whether specific subsets of SNPs corresponding to amino acid pathways exhibit enhanced predictability for amino acid traits. Surprisingly, for several of the traits we studied, SNPs within the amino acid pathways were no more predictive than randomly generated sets of control SNPs. This may imply a complex genetic architecture that includes other genes related to cellular processes or development. Conversely, a subset of amino acid traits did exhibit enhanced predictability based on pathway SNPs compared to control SNPs. We propose that this latter set of traits may correspond to a simpler genetic architecture. 
Ultimately, this study provides a potential strategy to assess the involvement of candidate genes in the genetic architecture of a trait using subset-based genomic prediction.