Full-length enriched cDNA libraries and ORFeome analysis of sugarcane hybrid and ancestor genotypes

PLoS One. 2014 Sep 15;9(9):e107351. doi: 10.1371/journal.pone.0107351. eCollection 2014.

Abstract

Sugarcane is a major crop used for food and bioenergy production. Modern cultivars are hybrids derived from crosses between Saccharum officinarum and Saccharum spontaneum. Hybrid cultivars combine favorable characteristics from the ancestral species and have a highly polyploid and aneuploid genome of 100-130 chromosomes. These complex genomes represent a huge challenge for molecular studies and for the development of biotechnological tools that can facilitate sugarcane improvement. Here, we describe full-length enriched cDNA libraries for Saccharum officinarum, Saccharum spontaneum, and one hybrid genotype (SP803280) and analyze the set of open reading frames (ORFs) in their genomes (i.e., their ORFeomes). We found 38,195 (19%) sugarcane-specific transcripts that did not match transcripts from other databases. Less than 1.6% of all transcripts were ancestor-specific (i.e., not expressed in SP803280). We also found 78,008 putative new sugarcane transcripts that were absent from the largest sugarcane expressed sequence tag database (SUCEST). Functional annotation showed a high frequency of protein kinases and stress-related proteins. We also detected natural antisense transcript expression, which mapped to 94% of all plant KEGG pathways; however, each genotype showed different pathways enriched in antisense transcripts. Our data appeared to cover 53.2% (17,563 genes) and 46.8% (937 transcription factors) of all sugarcane full-length genes and transcription factors, respectively. This work represents a significant advancement in defining the sugarcane ORFeome and will be useful for protein characterization, single nucleotide polymorphism and splicing variant identification, evolutionary and comparative studies, and sugarcane genome assembly and annotation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromosomes, Plant / genetics
  • Expressed Sequence Tags
  • Gene Library*
  • Genome, Plant / genetics
  • Genotype
  • Open Reading Frames / genetics*
  • Saccharum / genetics*
  • Saccharum / metabolism

Grants and funding

Dr. SSF was the recipient of a CNPq Fellowship, and MYNJ is the recipient of a CAPES Fellowship. Dr. GMS is the recipient of a CNPq Productivity Fellowship. This work was supported by Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) under the FAPESP Bioenergy Research Program BIOEN (project number 08/52146-0). Co-author Pei-zhong Tang is an employee of Thermo Fisher Scientific. Co-authors Antje Taliana and Scott Backer were employees of Life Technologies (now Thermo Fisher Scientific). Thermo Fisher Scientific provided support in the form of salaries for authors PZ, AT and SB, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.