PT - JOURNAL ARTICLE
AU - Maryam Ayat
AU - Michael Domaratzki
TI - Sparse Bayesian Learning for Predicting Phenotypes and Ranking Influential Markers in Yeast
AID - 10.1101/489245
DP - 2018 Jan 01
TA - bioRxiv
PG - 489245
4099 - http://biorxiv.org/content/early/2018/12/06/489245.short
4100 - http://biorxiv.org/content/early/2018/12/06/489245.full
AB - Genomic selection and genome-wide association studies are two related problems with applications in the plant breeding industry. Genomic selection is a method to predict phenotypes (i.e., traits) such as yield and drought resistance in crops from high-density markers positioned throughout the genome of the varieties. In this paper, we employ sparse Bayesian learning as a technique for genomic selection and for ranking markers based on their relevance to a trait, which can aid in genome-wide association studies. We define and explore two different forms of sparse Bayesian learning, one for predicting phenotypes and one for identifying the most influential markers of a trait. In particular, we introduce a new framework based on sparse Bayesian and ensemble learning for ranking influential markers of a trait. We then apply our methods to a real-world Saccharomyces cerevisiae dataset and analyse our results with respect to existing related works, trait heritability, and the accuracies obtained with different kernel functions, including linear, Gaussian, and string kernels. We find that sparse Bayesian methods are not only as good as other machine learning methods at predicting yeast growth in different environments, but are also capable of identifying the most important markers, including those with positive and those with negative effects on growth, from which biologists can gain insight. This attribute can make our proposed ensemble of sparse Bayesian learners favourable for ranking markers based on their relevance to a trait.