PT - JOURNAL ARTICLE
AU - Chan Kuang Lim
AU - Tatiana V. Tatarinova
AU - Rozana Rosli
AU - Nadzirah Amiruddin
AU - Norazah Azizi
AU - Mohd Amin Ab Halim
AU - Nik Shazana Nik Mohd Sanusi
AU - Jayanthi Nagappan
AU - Petr Ponomarenko
AU - Martin Triska
AU - Victor Solovyev
AU - Mohd Firdaus-Raih
AU - Ravigadevi Sambanthamurthi
AU - Denis Murphy
AU - Leslie Low Eng Ti
TI - Evidence-based gene models for structural and functional annotations of the oil palm genome
AID - 10.1101/111120
DP - 2017 Jan 01
TA - bioRxiv
PG - 111120
4099 - http://biorxiv.org/content/early/2017/04/05/111120.short
4100 - http://biorxiv.org/content/early/2017/04/05/111120.full
AB - The advent of rapid and inexpensive DNA sequencing has led to an explosion of data that must be transformed into knowledge about genome organization and function. Gene prediction is customarily the starting point for genome analysis. This paper presents a bioinformatics study of the oil palm genome, including a comparative genomics analysis, database and tools development, and mining of biological data for genes of interest. We annotated 26,087 oil palm genes integrated from two gene-prediction pipelines, Fgenesh++ and Seqping. As case studies, we conducted comprehensive investigations on intronless, resistance and fatty acid biosynthesis genes, and demonstrated that the current gene prediction set is of high quality. 3,672 intronless genes were identified in the oil palm genome, an important resource for evolutionary study. Further scrutiny of the oil palm genes revealed 210 candidate resistance genes involved in pathogen defense. Fatty acids have diverse applications ranging from food to industrial feedstock, and we identified 42 key genes involved in fatty-acid biosynthesis in oil palm mesocarp and kernel. These results provide an important resource for studies on plant genomes and a theoretical foundation for marker-assisted breeding of oil palm and related crops.