PT - JOURNAL ARTICLE
AU - Rabadan, Raul
AU - Bhanot, Gyan
AU - Marsilio, Sonia
AU - Chiorazzi, Nicholas
AU - Pasqualucci, Laura
AU - Khiabanian, Hossein
TI - On statistical modeling of sequencing noise in high depth data to assess tumor evolution
AID - 10.1101/128587
DP - 2017 Jan 01
TA - bioRxiv
PG - 128587
4099 - http://biorxiv.org/content/early/2017/09/04/128587.short
4100 - http://biorxiv.org/content/early/2017/09/04/128587.full
AB - One cause of cancer mortality is tumor evolution to therapy-resistant disease. First-line therapy often targets the dominant clone, and drug resistance can emerge from preexisting clones that gain fitness through therapy-induced natural selection. Mutations carried by such clones may be identified with targeted sequencing assays by analyzing noise in high-depth data. Here, we develop a comprehensive, unbiased model for sequencing error background. We find that noise in sufficiently deep DNA sequencing data can be approximated by aggregating negative binomial distributions. Mutations with frequencies above noise may have prognostic value. We evaluate our model with simulated exponentially expanded populations as well as data from cell line and patient sample dilution experiments, demonstrating its utility in prognosticating tumor progression. Our results may help identify significant mutations that can cause recurrence. These results are relevant in the pretreatment clinical setting for determining appropriate therapy and preparing for potential recurrence.
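
The abstract's idea of flagging mutations whose frequency rises above a negative binomial noise background can be illustrated with a minimal sketch. This is not the authors' model: it uses a single negative binomial rather than an aggregate of them, and the function names, the per-base `error_rate`, the dispersion parameter `size`, and the threshold `alpha` are all assumptions introduced here for illustration.

```python
import math


def nb_pmf(k, mean, size):
    """Negative binomial PMF parameterized by mean (> 0) and dispersion `size` (> 0)."""
    p = size / (size + mean)  # success probability in the standard (size, p) form
    log_pmf = (
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(p) + k * math.log1p(-p)
    )
    return math.exp(log_pmf)


def nb_tail(k, mean, size):
    """P(K >= k): probability that noise alone yields at least k variant reads."""
    below = sum(nb_pmf(i, mean, size) for i in range(k))
    return max(0.0, 1.0 - below)  # clamp tiny negative values from rounding


def above_noise(alt_reads, depth, error_rate, size, alpha=1e-6):
    """Hypothetical caller: True if alt_reads is unlikely under the noise model.

    The expected number of erroneous reads is assumed to be depth * error_rate.
    """
    expected_errors = depth * error_rate
    return nb_tail(alt_reads, expected_errors, size) < alpha


if __name__ == "__main__":
    # Illustrative values only: 10,000x depth, assumed error rate of 0.1%.
    for alt in (5, 40, 200):
        print(alt, above_noise(alt, depth=10_000, error_rate=1e-3, size=2.0))
```

In practice the noise parameters would be estimated from the sequencing data itself, and the significance threshold would need correction for the number of positions tested; both are left out of this sketch.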