ABSTRACT
The role of the Huanan Seafood Market in the early SARS-CoV-2 outbreak remains unclear. Recently, the Chinese CDC released data from deep sequencing of environmental samples collected from the market after it was closed on January 1, 2020 (Liu et al. 2023a). Prior to this release, Crits-Christoph et al. (2023) analyzed data from a subset of the samples. Both studies concurred that the samples contained genetic material from a variety of species, including some, like raccoon dogs, that are susceptible to SARS-CoV-2. However, neither study systematically analyzed the relationship between the amount of genetic material from SARS-CoV-2 and that from different animal species. Here I implement a fully reproducible computational pipeline that jointly analyzes the number of reads mapping to SARS-CoV-2 and the mitochondrial genomes of chordate species across the full set of samples. I validate the presence of genetic material from numerous species, and calculate mammalian mitochondrial compositions similar to those reported by Crits-Christoph et al. (2023). However, the number of SARS-CoV-2 reads is not consistently correlated with reads mapping to non-human susceptible species. For instance, 14 samples have >20% of their chordate mitochondrial material from raccoon dogs, but only one of these samples contains any SARS-CoV-2 reads, and that sample has only 1 of ∼200,000,000 reads mapping to SARS-CoV-2. Instead, SARS-CoV-2 reads are most correlated with reads mapping to various fish, such as catfish and largemouth bass. These results suggest that while metagenomic analysis of the environmental samples is useful for identifying animals or animal products sold at the market, co-mingling of animal and viral genetic material is unlikely to reliably indicate whether any animals were infected by SARS-CoV-2.
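The kind of per-sample comparison summarized above can be illustrated with a minimal sketch. This is not the author's pipeline: the sample names, read counts, the pseudocount of 1, and the helper names (`log_fraction`, `pearson`, `correlation_with_virus`) are all hypothetical, chosen only to show how a correlation between SARS-CoV-2 reads and species mitochondrial reads might be computed across samples.

```python
import math

# Hypothetical toy counts for illustration only; NOT data from the study.
samples = {
    "S1": {"total": 1_000_000, "sars2": 0,  "mito": {"raccoon dog": 800, "catfish": 0}},
    "S2": {"total": 2_000_000, "sars2": 5,  "mito": {"raccoon dog": 10,  "catfish": 200}},
    "S3": {"total": 1_000_000, "sars2": 50, "mito": {"raccoon dog": 0,   "catfish": 900}},
    "S4": {"total": 500_000,   "sars2": 1,  "mito": {"raccoon dog": 300, "catfish": 20}},
}


def log_fraction(count, total, pseudocount=1):
    """log10 of the read fraction; the pseudocount (an assumption) avoids log(0)."""
    return math.log10((count + pseudocount) / total)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient; returns NaN if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def correlation_with_virus(samples, species):
    """Correlate log SARS-CoV-2 read fraction with log mitochondrial read fraction."""
    virus = [log_fraction(s["sars2"], s["total"]) for s in samples.values()]
    host = [log_fraction(s["mito"].get(species, 0), s["total"]) for s in samples.values()]
    return pearson(virus, host)


for species in ("raccoon dog", "catfish"):
    print(species, round(correlation_with_virus(samples, species), 2))
```

In this toy data the catfish reads rise with the viral reads and the raccoon dog reads fall, so the two correlations have opposite signs. That only mirrors the direction of the pattern described above; it says nothing about its magnitude in the real samples.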
Competing Interest Statement
JDB is on the scientific advisory boards of Apriori Bio, Aerium Therapeutics, Invivyd, the Vaccine Company, and Oncorus; consults for GSK; and receives royalty payments as an inventor on Fred Hutch licensed patents related to deep mutational scanning of viral proteins.
Footnotes
This revision makes some minor updates that are summarized in the new supplementary appendix. None of the changes substantially alter the results or conclusions. Briefly: (1) Adds metagenomic sequencing analysis for a few samples (e.g., A20) that were sequenced both metagenomically and with viral enrichment. Previously all such samples were discarded; now the metagenomic data are separated from the viral enrichment data. (2) More consistent preprocessing of single- and paired-end reads by fastp. (3) Addition of tables (e.g., Tables S8 and S9) that show mammalian in addition to chordate composition, or have no cutoff on the samples shown. (4) Supplementary appendix replying to Debarre and Crits-Christoph comments.