PT - JOURNAL ARTICLE
AU - Yunxiao Li
AU - Yi-Juan Hu
AU - Glen A. Satten
TI - MERIT: controlling Monte-Carlo error rate in large-scale Monte-Carlo hypothesis testing
AID - 10.1101/2022.01.15.476485
DP - 2022 Jan 01
TA - bioRxiv
PG - 2022.01.15.476485
4099 - http://biorxiv.org/content/early/2022/01/18/2022.01.15.476485.short
4100 - http://biorxiv.org/content/early/2022/01/18/2022.01.15.476485.full
AB - Background: The use of Monte-Carlo (MC) p-values when testing the significance of a large number of hypotheses is now commonplace. It is known that calculating a p-value near a significance threshold requires more MC replicates than calculating one that is far from the threshold. However, it is not well appreciated that large-scale hypothesis testing will typically produce at least some p-values near the threshold. As a result, the list of rejected null hypotheses (detections) from large-scale MC testing can vary when different MC replicates are used, resulting in a lack of reproducibility. The method of Gandy and Hahn (GH) [1–3] is the only method that has directly addressed this problem. It defines a Monte-Carlo error rate (MCER) as the probability that any decision to accept or reject a hypothesis based on MC p-values differs from the decision based on ideal p-values, and then makes decisions that control the MCER. Unfortunately, GH is frequently very conservative, often making no rejections at all and leaving a large number of hypotheses “undecided”. Methods: In this article, we propose MERIT (Monte-Carlo Error Rate control In large-scale MC hypothesis Testing), a method for large-scale MC hypothesis testing that also controls the MCER but is more statistically efficient than the GH method. MERIT assumes a pre-determined number of MC replicates.
Because GH is a sequential procedure, we also develop a version of GH that is optimized for a pre-determined number of MC replicates. Results: Through extensive simulation studies, we demonstrated that MERIT controlled the MCER and substantially improved the sensitivity and specificity of detections compared to our pre-determined-replicate version of GH. We also illustrated our method by an analysis of gene expression data from a prostate cancer study. Competing Interest Statement: The authors have declared no competing interest.
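
The abstract's point that MC decisions are unstable near a significance threshold can be illustrated with a small simulation. The sketch below is not MERIT or the GH procedure. It only assumes a single hypothesis with a known ideal p-value, a fixed threshold `alpha`, and the common `(exceedances + 1) / (replicates + 1)` MC p-value estimate, and it reports how often the MC decision disagrees with the ideal decision. All function names are hypothetical. Note that the MCER described in the abstract concerns any disagreement across all hypotheses, whereas this toy example looks at one hypothesis at a time.

```python
import random


def mc_p_value(n_exceed, n_replicates):
    """Common MC p-value estimate: (exceedances + 1) / (replicates + 1)."""
    return (n_exceed + 1) / (n_replicates + 1)


def simulate_exceedances(ideal_p, n_replicates, rng):
    """Count MC replicates at least as extreme as the observed statistic.

    Assumption: each replicate exceeds independently with probability ideal_p.
    """
    return sum(1 for _ in range(n_replicates) if rng.random() < ideal_p)


def disagreement_rate(ideal_p, alpha, n_replicates, n_runs, seed=0):
    """Fraction of independent MC runs whose decision differs from the ideal one."""
    rng = random.Random(seed)
    ideal_reject = ideal_p <= alpha
    flips = 0
    for _ in range(n_runs):
        n_exceed = simulate_exceedances(ideal_p, n_replicates, rng)
        mc_reject = mc_p_value(n_exceed, n_replicates) <= alpha
        if mc_reject != ideal_reject:
            flips += 1
    return flips / n_runs


# Illustrative settings (assumptions, not values taken from the article).
near = disagreement_rate(ideal_p=0.048, alpha=0.05, n_replicates=1000, n_runs=500)
far = disagreement_rate(ideal_p=0.001, alpha=0.05, n_replicates=1000, n_runs=500)
print(f"near threshold: {near:.3f}, far from threshold: {far:.3f}")
```

With these illustrative settings, the hypothesis whose ideal p-value sits just below the threshold has its decision flipped in a sizeable share of runs, while the one far from the threshold is decided consistently. This is the reproducibility problem that motivates controlling an error rate over the whole list of detections.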