TY - JOUR T1 - matOptimize: A parallel tree optimization method enables online phylogenetics for SARS-CoV-2 JF - bioRxiv DO - 10.1101/2022.01.12.475688 SP - 2022.01.12.475688 AU - Cheng Ye AU - Bryan Thornlow AU - Angie Hinrichs AU - Devika Torvi AU - Robert Lanfear AU - Russell Corbett-Detig AU - Yatish Turakhia Y1 - 2022/01/01 UR - http://biorxiv.org/content/early/2022/01/13/2022.01.12.475688.abstract N2 - Phylogenetic tree optimization is necessary for precise analysis of evolutionary and transmission dynamics, but existing tools are inadequate for handling the scale and pace of data produced during the COVID-19 pandemic. One transformative approach, online phylogenetics, aims to incrementally add samples to an ever-growing phylogeny, but there are no previously-existing approaches that can efficiently optimize this vast phylogeny under the time constraints of the pandemic. Here, we present matOptimize, a fast and memory-efficient phylogenetic tree optimization tool based on parsimony that can be parallelized across multiple CPU threads and nodes, and provides orders of magnitude improvement in runtime and peak memory usage compared to existing state-of-the-art methods. We have developed this method particularly to address the pressing need during the COVID-19 pandemic for daily maintenance and optimization of a comprehensive SARS-CoV-2 phylogeny. Thus, our approach addresses an important need for daily maintenance and refinement of a comprehensive SARS-CoV-2 phylogeny.Significance Statement Phylogenetic trees have been central to genomic surveillance, epidemiology, and contact tracing efforts during the COVD-19 pandemic. With over 6 million SARS-CoV-2 genome sequences now available, maintaining an accurate, comprehensive phylogenetic tree of all available SARS-CoV-2 sequences is becoming computationally infeasible with existing software, but is essential for getting a detailed picture of the virus’ evolution and transmission. Our novel phylogenetic software, matOptimize, is helping refine possibly the largest-ever phylogenetic tree, containing millions of SARS-CoV-2 sequences, thus providing an unprecedented resolution for studying the pathogen’s evolutionary and transmission dynamics.Competing Interest StatementThe authors have declared no competing interest. ER -