Abstract
Present-day single-cell studies comprise complex data sets affected by nested batch effects arising from technical and biological factors, and their analysis relies on advanced integration methods. Silhouette is an established metric for assessing clustering results that compares within-cluster cohesion to between-cluster separation, and adaptations of it have emerged as the dominant choice for evaluating the success of these integration methods. However, silhouette’s assumptions are often violated in single-cell data integration scenarios. We demonstrate that silhouette-based metrics can reliably assess neither batch effect removal nor biological signal conservation and are thus inherently unsuitable for data with (nested) batch effects. We propose alternative, robust evaluation strategies that enable accurate assessment of integration methods, and we call for an update of benchmarking practices.
Competing Interest Statement
The authors have declared no competing interest.