PT - JOURNAL ARTICLE
AU - G. Keeble-Gagnère
AU - P. Rigault
AU - J. Tibbits
AU - R. Pasam
AU - M. Hayden
AU - K. Forrest
AU - Z. Frenkel
AU - A. Korol
AU - E. Huang
AU - C. Cavanagh
AU - J. Taylor
AU - M. Abrouk
AU - A. Sharpe
AU - D. Konkin
AU - P. Sourdille
AU - B. Darrier
AU - F. Choulet
AU - A. Bernard
AU - S. Rochfort
AU - AM. Dimech
AU - N. Watson-Haigh
AU - U. Baumann
AU - P. Eckermann
AU - D. Fleury
AU - A. Juhasz
AU - S. Boisvert
AU - M-A. Nolin
AU - J. Doležel
AU - H. Šimková
AU - H. Toegelová
AU - Jan Šafář
AU - M-C. Luo
AU - F. Camara
AU - M. Pfeifer
AU - D. Isdale
AU - J. Nystrom-Persson
AU - IWGSC
AU - D-H Koo
AU - M. Tinning
AU - D. Cui
AU - Z. Ru
AU - R. Appels
TI - Optical and physical mapping with local finishing enables megabase-scale resolution of agronomically important regions in the wheat genome
AID - 10.1101/363465
DP - 2018 Jan 01
TA - bioRxiv
PG - 363465
4099 - http://biorxiv.org/content/early/2018/07/09/363465.short
4100 - http://biorxiv.org/content/early/2018/07/09/363465.full
AB - Background: Numerous scaffold-level sequences for wheat are now being released and, in this context, we report on a strategy for improving the overall assembly to a level comparable to that of the human genome. Results: Using chromosome 7A of wheat as a model, sequence-finished megabase-scale sections of this chromosome were established by combining a new independent assembly, based on a BAC-based physical map, BAC pool paired-end sequencing, chromosome-arm-specific mate-pair sequencing and Bionano optical mapping, with the IWGSC RefSeq v1.0 sequence and its underlying raw data. The combined assembly results in 18 super-scaffolds across the chromosome. The value of finished genome regions is demonstrated for two regions of approximately 2.5 Mb each, associated with yield and with the grain quality phenotype of fructan carbohydrate grain levels.
In addition, the analysis of the 50 Mb centromere region incorporates cytological data, highlighting the importance of non-sequence data in the assembly of this complex genome region. Conclusions: Sufficient genome sequence information is shown to be now available for the wheat community to produce sequence-finished releases of each chromosome of the reference genome. The high-level completion identified that an array of seven fructosyl transferase genes underpins grain quality, and that yield attributes are affected by five F-box-only-protein ubiquitin ligase domain genes and four root-specific lipid transfer domain genes. The completed sequence also includes the centromere.
Contig: consensus region of DNA sequence represented by overlapping sequence reads. Can have unresolved bases (N), but no gaps.
Scaffold: consensus region of DNA sequence represented by ordered (but not necessarily oriented) contigs, separated by gaps of known (estimated) length.
Island: genomic region represented by overlapping sets of DNA sequences (scaffolds), physical entities (optical map or molecule, physical clone), or both.
Super-scaffold: a portion of the genome sequence where scaffolds have been ordered and oriented relative to each other.