TY  - JOUR
T1  - Establishing reference samples for detection of somatic mutations and germline variants with NGS technologies
JF  - bioRxiv
DO  - 10.1101/625624
SP  - 625624
AU  - Li Tai Fang
AU  - Bin Zhu
AU  - Yongmei Zhao
AU  - Wanqiu Chen
AU  - Zhaowei Yang
AU  - Liz Kerrigan
AU  - Kurt Langenbach
AU  - Maryellen de Mars
AU  - Charles Lu
AU  - Kenneth Idler
AU  - Howard Jacob
AU  - Ying Yu
AU  - Luyao Ren
AU  - Yuanting Zheng
AU  - Erich Jaeger
AU  - Gary Schroth
AU  - Ogan D. Abaan
AU  - Justin Lack
AU  - Tsai-Wei Shen
AU  - Keyur Talsania
AU  - Zhong Chen
AU  - Seta Stanbouly
AU  - Jyoti Shetty
AU  - Bao Tran
AU  - Daoud Meerzaman
AU  - Cu Nguyen
AU  - Virginie Petitjean
AU  - Marc Sultan
AU  - Margaret Cam
AU  - Tiffany Hung
AU  - Eric Peters
AU  - Rasika Kalamegham
AU  - Sayed Mohammad Ebrahim Sahraeian
AU  - Marghoob Mohiyuddin
AU  - Yunfei Guo
AU  - Lijing Yao
AU  - Lei Song
AU  - Hugo YK Lam
AU  - Jiri Drabek
AU  - Roberta Maestro
AU  - Daniela Gasparotto
AU  - Sulev Kõks
AU  - Ene Reimann
AU  - Andreas Scherer
AU  - Jessica Nordlund
AU  - Ulrika Liljedahl
AU  - Roderick V Jensen
AU  - Mehdi Pirooznia
AU  - Zhipan Li
AU  - Chunlin Xiao
AU  - Stephen Sherry
AU  - Rebecca Kusko
AU  - Malcolm Moos
AU  - Eric Donaldson
AU  - Zivana Tezak
AU  - Baitang Ning
AU  - Jing Li
AU  - Penelope Duerken-Hughes
AU  - Huixiao Hong
AU  - Leming Shi
AU  - Charles Wang
AU  - Wenming Xiao
AU  - The Somatic Working Group of SEQC-II Consortium
Y1  - 2019/01/01
UR  - http://biorxiv.org/content/early/2019/05/13/625624.abstract
N2  - We characterized two reference samples for NGS technologies: a human triple-negative breast cancer cell line and a matched normal cell line. Leveraging several whole-genome sequencing (WGS) platforms, multiple sequencing replicates, and orthogonal mutation detection bioinformatics pipelines, we minimized the potential biases from sequencing technologies, assays, and informatics. Thus, our “truth sets” were defined using evidence from 21 repeats of WGS runs with coverages ranging from 50X to 100X (a total of 140 billion reads). These “truth sets” present many relevant variants/mutations including 193 COSMIC mutations and 9,016 germline variants from the ClinVar database, nonsense mutations in BRCA1/2 and missense mutations in TP53 and FGFR1. Independent validation in three orthogonal experiments demonstrated a successful stress test of the truth set. We expect these reference materials and “truth sets” to facilitate assay development, qualification, validation, and proficiency testing. In addition, our methods can be extended to establish new fully characterized reference samples for the community.
ER  -