bioRxiv
New Results

Transcript- and annotation-guided genome assembly of the European starling

Katarina C. Stuart, Richard J. Edwards, Yuanyuan Cheng, Wesley C. Warren, David W. Burt, William B. Sherwin, Natalie R. Hofmeister, Scott J. Werner, Gregory F. Ball, Melissa Bateson, Matthew C. Brandley, Katherine L. Buchanan, Phillip Cassey, David F. Clayton, Tim De Meyer, Simone L. Meddle, Lee A. Rollins
doi: https://doi.org/10.1101/2021.04.07.438753
Katarina C. Stuart
1Evolution & Ecology Research Centre, School of Biological, Earth and Environmental Sciences, UNSW Sydney, Sydney, New South Wales, Australia
  • For correspondence: katarina.stuart@unsw.edu.au
Richard J. Edwards
2School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, New South Wales, Australia
Yuanyuan Cheng
3School of Life and Environmental Sciences, The University of Sydney, Sydney, New South Wales, Australia
Wesley C. Warren
4Department of Animal Sciences, Institute for Data Science and Informatics, The University of Missouri, Columbia, Missouri, USA
David W. Burt
5Office of the Deputy Vice-Chancellor (Research and Innovation), The University of Queensland, Brisbane, Australia
William B. Sherwin
1Evolution & Ecology Research Centre, School of Biological, Earth and Environmental Sciences, UNSW Sydney, Sydney, New South Wales, Australia
Natalie R. Hofmeister
6Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY 14850
7Fuller Evolutionary Biology Program, Cornell Lab of Ornithology, Ithaca, NY 14850
Scott J. Werner
8United States Department of Agriculture, Animal and Plant Health Inspection Service, Wildlife Services, National Wildlife Research Center, Fort Collins, Colorado, USA
Gregory F. Ball
9Department of Psychology, University of Maryland, College Park, MD 20742 USA
Melissa Bateson
10Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, UK
Matthew C. Brandley
11Carnegie Museum of Natural History, Pittsburgh, Pennsylvania, USA
Katherine L. Buchanan
12School of Life and Environmental Sciences, Deakin University, Waurn Ponds, VIC, 3228, Australia
Phillip Cassey
13Invasion Science & Wildlife Ecology Lab, University of Adelaide, Adelaide SA 5005, Australia
David F. Clayton
14Department of Genetics & Biochemistry, Clemson University, South Carolina 29634
Tim De Meyer
15Dept. of Data Analysis & Mathematical Modelling, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium
Simone L. Meddle
16The Roslin Institute, The Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, EH25 9RG, UK
Lee A. Rollins
1Evolution & Ecology Research Centre, School of Biological, Earth and Environmental Sciences, UNSW Sydney, Sydney, New South Wales, Australia

Abstract

The European starling, Sturnus vulgaris, is an ecologically significant, globally invasive avian species that is also suffering from a major decline in its native range. Here, we present the genome assembly and long-read transcriptome of an Australian-sourced European starling (S. vulgaris vAU), and a second North American genome (S. vulgaris vNA), as complementary reference genomes for population genetic and evolutionary characterisation. S. vulgaris vAU combined 10x Genomics linked-reads, low-coverage Nanopore sequencing, and PacBio Iso-Seq full-length transcript scaffolding to generate a 1050 Mb assembly on 1,628 scaffolds (72.5 Mb scaffold N50). Species-specific transcript mapping and gene annotation revealed high structural and functional completeness (94.6% BUSCO completeness). Further scaffolding against the high-quality zebra finch (Taeniopygia guttata) genome assigned 98.6% of the assembly to 32 putative nuclear chromosome scaffolds. Rapid, recent advances in sequencing technologies and bioinformatics software have highlighted the need for evidence-based assessment of assembly decisions on a case-by-case basis. Using S. vulgaris vAU, we demonstrate how the multifunctional use of PacBio Iso-Seq transcript data and complementary homology-based annotation of sequential assembly steps (assessed using a new tool, SAAGA) can be used to assess, inform, and validate assembly workflow decisions. We also highlight some counter-intuitive behaviour in traditional BUSCO metrics, and present BUSCOMP, a complementary tool for assembly comparison designed to be robust to differences in assembly size and base-calling quality. Finally, we present a second starling assembly, S. vulgaris vNA, to facilitate comparative analysis and global genomic research on this ecologically important species.
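
For readers unfamiliar with the scaffold N50 statistic quoted above, the following is a minimal sketch of the standard definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. It is not the authors' pipeline; the function name `n50` and the toy lengths are hypothetical and chosen only for illustration.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold lengths.

    Scaffolds are sorted longest first and accumulated until the
    running total reaches half of the summed length; the length of
    the scaffold that crosses that threshold is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths supplied")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    # Not reachable for non-empty input with positive lengths.
    raise AssertionError("threshold was never reached")


# Hypothetical toy assembly (arbitrary units), not data from the preprint.
toy_lengths = [80, 70, 50, 30, 20, 10]
print(n50(toy_lengths))
```

In practice the lengths would be read from the assembly FASTA or its index; that step is omitted here to keep the sketch self-contained.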

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • † Joint first authors

  • Update of reference from personal communications to preprint.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted May 26, 2021.

Subject Area

  • Genomics