Abstract
Target-decoy competition (TDC) has long been used in the analysis of tandem mass spectrometry data [9]. Recently, this approach to competition-based false discovery rate (FDR) control has gained significant popularity in other fields after [2] laid its theoretical foundation in a more general setting that includes the feature selection problem. In both cases, the competition is based on a head-to-head comparison between an (observed) target score and a corresponding decoy (knockoff) score. The effectiveness of TDC depends on whether the data is homogeneous, which is often not the case: the data might consist of groups with different score profiles or different proportions of true nulls. In such cases, TDC applied without regard to the group structure might report imbalanced lists of discoveries, in which some groups include too many false discoveries. The alternative approach of applying TDC separately to each group does not rigorously control the FDR. We developed Group-walk, a procedure that controls the FDR in the target-decoy / knockoff setting while taking into account a given group structure. We show using simulated and real datasets that Group-walk can deliver substantial power gains when the data naturally divides into groups with different characteristics. Group-walk will be made available at publication.
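For readers unfamiliar with the head-to-head competition described above, the following is a minimal sketch of plain, ungrouped TDC. It is not Group-walk and not the authors' implementation. The function name `tdc_discoveries`, the treatment of ties as decoy wins, and the estimate (number of decoy wins + 1) / (number of target wins) are assumptions chosen for illustration, following the form commonly used in the competition-based FDR literature.

```python
import numpy as np

def tdc_discoveries(target_scores, decoy_scores, alpha=0.05):
    """Illustrative ungrouped TDC; returns indices of reported target wins."""
    target = np.asarray(target_scores, dtype=float)
    decoy = np.asarray(decoy_scores, dtype=float)

    # Head-to-head competition: keep the larger score of each pair and
    # record which side won. Ties count as decoy wins (an assumption
    # made here for simplicity; it is the conservative choice).
    winning = np.maximum(target, decoy)
    target_wins = target > decoy

    # Rank all hypotheses by winning score, best first.
    order = np.argsort(-winning, kind="stable")
    wins_sorted = target_wins[order]

    # Running counts of target and decoy wins among the top k hypotheses.
    n_target = np.cumsum(wins_sorted)
    n_decoy = np.cumsum(~wins_sorted)

    # Estimated FDR among target wins in the top k (with the +1 correction).
    est_fdr = (n_decoy + 1) / np.maximum(n_target, 1)

    # Largest k whose estimated FDR does not exceed alpha.
    ok = np.nonzero(est_fdr <= alpha)[0]
    if ok.size == 0:
        return np.array([], dtype=int)
    k = ok[-1] + 1

    # Report only the target wins within the top k.
    top = order[:k]
    return np.sort(top[wins_sorted[:k]])
```

Because the sketch ranks all hypotheses together, it applies a single score cutoff to every group, which is the behavior the abstract identifies as problematic when groups differ in score profile or in proportion of true nulls.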
Competing Interest Statement
The authors have declared no competing interest.