RT Journal Article
SR Electronic
T1 Compositional knockoff filter for high-dimensional regression analysis of microbiome data
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 851337
DO 10.1101/851337
A1 Srinivasan, Arun
A1 Xue, Lingzhou
A1 Zhan, Xiang
YR 2020
UL http://biorxiv.org/content/early/2020/04/28/851337.abstract
AB A critical task in microbiome data analysis is to explore the association between a scalar response of interest and a large number of microbial taxa that are summarized as compositional data at different taxonomic levels. Motivated by fine-mapping of the microbiome, we propose a two-step compositional knockoff filter (CKF) to provide effective finite-sample false discovery rate (FDR) control in high-dimensional linear log-contrast regression analysis of microbiome compositional data. In the first step, we employ a compositional screening procedure to remove insignificant microbial taxa while retaining the essential sum-to-zero constraint. In the second step, we extend the knockoff filter to identify the significant microbial taxa in the sparse regression model for compositional data. Thereby, a subset of microbes is selected from the high-dimensional microbial taxa as related to the response at a pre-specified FDR threshold. We study the asymptotic properties of the proposed two-step procedure, including both sure screening and effective false discovery control. We demonstrate the finite-sample properties in simulation studies, which show a gain in empirical power while controlling the nominal FDR. The potential usefulness of the proposed method is also illustrated with an application to an inflammatory bowel disease dataset to identify microbial taxa that influence host gene expression. Competing Interest Statement: The authors have declared no competing interest.