RT Journal Article
SR Electronic
T1 Evidence-based gene models for structural and functional annotations of the oil palm genome
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 111120
DO 10.1101/111120
A1 Chan Kuang Lim
A1 Tatiana V. Tatarinova
A1 Rozana Rosli
A1 Nadzirah Amiruddin
A1 Norazah Azizi
A1 Mohd Amin Ab Halim
A1 Nik Shazana Nik Mohd Sanusi
A1 Jayanthi Nagappan
A1 Petr Ponomarenko
A1 Martin Triska
A1 Victor Solovyev
A1 Mohd Firdaus-Raih
A1 Ravigadevi Sambanthamurthi
A1 Denis Murphy
A1 Leslie Low Eng Ti
YR 2017
UL http://biorxiv.org/content/early/2017/02/25/111120.abstract
AB The advent of rapid and inexpensive DNA sequencing has led to an explosion of data waiting to be transformed into knowledge about genome organization and function. Gene prediction is customarily the starting point for genome analysis. This paper presents a bioinformatics study of the oil palm genome, including comparative genomics analysis, database and tools development, and mining of biological data for genes of interest. We have annotated 26,059 oil palm genes integrated from two independent gene-prediction pipelines, Fgenesh++ and Seqping. This integrated annotation constitutes a significant improvement over the preliminary annotation published in 2013. We conducted a comprehensive analysis of intronless, resistance and fatty acid biosynthesis genes, and demonstrated the high quality of the current genome annotation. We identified 3,658 intronless genes in the oil palm genome, an important resource for evolutionary studies. Further analysis of the oil palm genes revealed 210 candidate resistance genes involved in pathogen defense. Fatty acids have diverse applications ranging from food to industrial feedstocks, and we identified 42 key genes involved in fatty acid biosynthesis in oil palm.
These results provide an important resource for studies of plant genomes and a theoretical foundation for marker-assisted breeding of oil palm and related crops.
Abbreviations: CDS, coding sequence; GO, Gene Ontology; IG, intronless gene; R, resistance; CC, coiled-coil; NBS, nucleotide binding site; LRR, leucine-rich repeat; CNL, CC-NBS-LRR; FA, fatty acid; FAD2, oleoyl-phosphatidylcholine desaturase; FAD3, linoleoyl-phosphatidylcholine desaturase; ACP, acyl carrier protein; FATB, acyl-ACP thioesterase; TNL, Toll/interleukin-1 NBS-LRR; Avr, avirulence; STK, serine/threonine protein kinase; ACCase, acetyl-CoA carboxylase; FABF, β-ketoacyl-ACP synthase II; FAB2, stearoyl-ACP desaturase; FATA, oleoyl-ACP thioesterase.