PT - JOURNAL ARTICLE
AU - S. Wesley Long
AU - Randall J. Olsen
AU - Paul A. Christensen
AU - David W. Bernard
AU - James R. Davis
AU - Maulik Shukla
AU - Marcus Nguyen
AU - Matthew Ojeda Saavedra
AU - Concepcion C. Cantu
AU - Prasanti Yerramilli
AU - Layne Pruitt
AU - Sishir Subedi
AU - Heather Hendrickson
AU - Ghazaleh Eskandari
AU - Muthiah Kumaraswami
AU - Jason S. McLellan
AU - James M. Musser
TI - Molecular Architecture of Early Dissemination and Evolution of the SARS-CoV-2 Virus in Metropolitan Houston, Texas
AID - 10.1101/2020.05.01.072652
DP - 2020 Jan 01
TA - bioRxiv
PG - 2020.05.01.072652
4099 - http://biorxiv.org/content/early/2020/05/03/2020.05.01.072652.short
4100 - http://biorxiv.org/content/early/2020/05/03/2020.05.01.072652.full
AB - We sequenced the genomes of 320 SARS-CoV-2 strains from COVID-19 patients in metropolitan Houston, Texas, an ethnically diverse region with seven million residents. These genomes were from the viruses causing infections in the earliest recognized phase of the pandemic affecting Houston. Substantial viral genomic diversity was identified, which we interpret to mean that the virus was introduced into Houston many times independently by individuals who had traveled from different parts of the country and the world. The majority of viruses are apparent progeny of strains derived from Europe and Asia. We found no significant evidence of more virulent viral types, stressing the linkage between severe disease, underlying medical conditions, and perhaps host genetics. We discovered a signal of selection acting on the spike protein, the primary target of massive vaccine efforts worldwide. The data provide a critical resource for assessing virus evolution, the origin of new outbreaks, and the effect of host immune response.
Significance: COVID-19, the disease caused by the SARS-CoV-2 virus, is a global pandemic.
To better understand the first phase of virus spread in metropolitan Houston, Texas, we sequenced the genomes of 320 SARS-CoV-2 strains recovered from COVID-19 patients early in the Houston viral arc. We identified no evidence that a particular strain or its progeny causes more severe disease, underscoring the connection between severe disease, underlying health conditions, and host genetics. Some amino acid replacements in the spike protein suggest positive immune selection is at work in shaping variation in this protein. Our analysis traces the early molecular architecture of SARS-CoV-2 in Houston, and will help us to understand the origin and trajectory of future infection spikes.
Competing Interest Statement: The authors have declared no competing interest.