Abstract
Single-particle cryo-electron microscopy (cryoEM) is going through a phase of rapid optimization focused on increasing the efficiency, accuracy, and automation of every step in the data pipeline. Machine learning models in particular have made substantial advances in cryoEM; however, their impact has been limited, in part by the scarcity of realistic ground-truth datasets for training and evaluating such models. To address this limitation and accelerate this phase of optimization, we introduce VirtualIce, which generates half-synthetic micrographs by projecting proteins onto real, curated micrographs of vitrified buffer. VirtualIce provides configurable features including noise simulation, realistic particle distributions, particle overlap, particle aggregation, filtering of obscured regions, and multiple structures per micrograph. VirtualIce may be a valuable resource to help visualize unknown proteins, accelerate the development of automated data collection and processing pipelines, and develop cryoEM algorithms.
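As a rough illustration of the half-synthetic idea described above, the following minimal sketch projects a 3D density onto 2D and adds copies of that projection to a background image at random positions, keeping the coordinates as ground truth. It is not VirtualIce's implementation or API: the function names, the single fixed projection axis, the Gaussian stand-in for a protein density, the simulated background, and the choice to subtract the projection (so particles appear darker than the buffer) are all assumptions made for illustration.

```python
import numpy as np


def project_volume(volume):
    """Hypothetical helper: sum a 3D density along its first axis to get a 2D projection."""
    return volume.sum(axis=0)


def add_particles(background, projection, n_particles, scale=1.0, seed=0):
    """Hypothetical helper: place copies of a projection at random positions.

    Returns the modified image and the top-left (row, col) of each particle,
    which serve as ground-truth coordinates. Overlaps are allowed.
    """
    rng = np.random.default_rng(seed)
    out = background.astype(np.float64).copy()
    ph, pw = projection.shape
    height, width = out.shape
    coords = []
    for _ in range(n_particles):
        # Upper bound is exclusive, so the particle always fits inside the image.
        y = int(rng.integers(0, height - ph + 1))
        x = int(rng.integers(0, width - pw + 1))
        # Assumption: particles reduce intensity relative to the buffer.
        out[y:y + ph, x:x + pw] -= scale * projection
        coords.append((y, x))
    return out, coords


# Stand-in inputs: a simulated noisy background and a Gaussian blob as the "protein".
# A real workflow would load a curated buffer micrograph and a density map instead.
background = np.random.default_rng(1).normal(100.0, 5.0, size=(256, 256))
grid = np.indices((32, 32, 32)) - 15.5
volume = np.exp(-(grid ** 2).sum(axis=0) / (2 * 5.0 ** 2))

projection = project_volume(volume)
micrograph, coords = add_particles(background, projection, n_particles=5)
```

Because the particle positions are chosen by the generator, the returned coordinates give exact labels for each synthetic micrograph, which is what makes such images usable as ground truth for training and evaluation.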
Competing Interest Statement
The authors have declared no competing interest.