The repetitive landscape of the chicken genome

Thomas Wicker [1,5], Jon S. Robertson [1,5], Stefan R. Schulze [1], F. Alex Feltus [1], Vincent Magrini [3], Jason A. Morrison [3], Elaine R. Mardis [3], Richard K. Wilson [3], Daniel G. Peterson [1,4], Andrew H. Paterson [1,2,6], and Robert Ivarie [2,6]

  1 Plant Genome Mapping Laboratory, University of Georgia, Athens, Georgia 30602, USA
  2 Department of Genetics, University of Georgia, Athens, Georgia 30602, USA
  3 Genome Sequencing Center, Washington University Medical Center, Washington University, St. Louis, Missouri 63108, USA
  4 Mississippi Genome Exploration Laboratory, Mississippi State University, Starkville, Mississippi 39762, USA

Abstract

Cot-based cloning and sequencing (CBCS) is a powerful tool for isolating and characterizing the various repetitive components of any genome, combining the established principles of DNA reassociation kinetics with high-throughput sequencing. CBCS was used to generate sequence libraries representing the high-, middle-, and low-copy fractions of the chicken genome. Sequencing the high-copy DNA of chicken to about 2.7× coverage of its estimated sequence complexity led to the initial identification of several new repeat families, which were then used for a survey of the newly released first draft of the complete chicken genome. The analysis provided insight into the diversity and biology of known repeat structures such as CR1 and CNM, for which only limited sequence data had previously been available. Cot sequence data also resulted in the identification of four novel repeats (Birddawg, Hitchcock, Kronos, and Soprano), two new subfamilies of CR1 repeats, and many elements absent from the chicken genome assembly. Multiple autonomous elements were found for a novel Mariner-like transposon, Galluhop, in addition to nonautonomous deletion derivatives. Phylogenetic analysis of the high-copy repeats CR1, Galluhop, and Birddawg provided insight into two distinct genome dispersion strategies. This study also exemplifies the power of the CBCS method to create representative databases for the repetitive fractions of genomes for which only limited sequence data are available.

Footnotes

  • [Supplemental material is available online at www.genome.org and http://plantgenome.agtec.uga.edu/g4g. The sequence data described in this study have been submitted to GenBank under accession nos. CL266240–CL281342. Consensus sequences for the novel repeat families and their major subfamilies were submitted to RepBase.]

  • Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.2438005. Article published online before print in July 2004.

  • 5 These two authors contributed equally to this work.

  • 6 Corresponding authors. E-mail ivarie@uga.edu; fax (706) 542-3910. E-mail paterson@uga.edu; fax (706) 583-0160.

    • Accepted June 2, 2004.
    • Received February 9, 2004.