RT Journal Article
SR Electronic
T1 Association Analysis and Meta-Analysis of Multi-allelic Variants for Large Scale Sequence Data
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 197913
DO 10.1101/197913
A1 Xiaowei Zhan
A1 Sai Chen
A1 Yu Jiang
A1 Mengzhen Liu
A1 William G. Iacono
A1 John K. Hewitt
A1 John E Hokanson
A1 Kenneth Krauter
A1 Markku Laakso
A1 Kevin W. Li
A1 Sharon M Lutz
A1 Matthew McGue
A1 Anita Pandit
A1 Gregory JM Zajac
A1 Michael Boehnke
A1 Goncalo R. Abecasis
A1 Bibo Jiang
A1 Scott I. Vrieze
A1 Dajiang J. Liu
YR 2017
UL http://biorxiv.org/content/early/2017/10/03/197913.abstract
AB Motivation: There is great interest in understanding the impact of rare variants on human diseases using large sequence datasets. In deep sequence datasets of >10,000 samples, ∼10% of the variant sites are observed to be multi-allelic. Many multi-allelic variants have been shown to be functional and disease relevant. Proper analysis of multi-allelic variants is critical to the success of a sequencing study, but existing methods do not handle them properly and can produce highly misleading association results. Results: We propose novel methods to encode multi-allelic sites, conduct single variant and gene-level association analyses, and perform meta-analysis for multi-allelic variants. We evaluated these methods through extensive simulations and a large meta-analysis of ∼18,000 samples on the cigarettes-per-day phenotype. We showed that our joint modeling approach provided an unbiased estimate of genetic effects, greatly improved the power of single variant association tests, and enhanced gene-level tests over existing approaches. Availability: Software packages implementing these methods are available at https://github.com/zhanxw/rvtests and http://genome.sph.umich.edu/wiki/RareMETAL. Contact: xiaowei.zhan@utsouthwestern.edu; dajiang.liu@psu.edu