Abstract
We develop a data harmonization approach for C. elegans volumetric microscopy data, still or video, consisting of a standardized format, data pre-processing techniques, and a set of human-in-the-loop, machine-learning-based analysis software tools. We unify a diverse collection of 118 whole-brain neural activity imaging datasets from 5 labs, storing these and the accompanying tools in an online repository called WormID (wormid.org). We use this repository to train three existing automated cell identification algorithms, enabling, for the first time, neuron identification accuracy that generalizes across labs and approaches human performance in some cases. We also mine this repository to identify factors that influence the developmental positioning of neurons. To facilitate communal use of this repository, we created open-source software, code, web-based tools, and tutorials to explore and curate datasets for contribution to the scientific community. This repository provides a growing resource for experimentalists, theorists, and toolmakers to (a) study neuroanatomical organization and neural activity across diverse experimental paradigms, (b) develop and benchmark algorithms for automated neuron detection, segmentation, cell identification, tracking, and activity extraction, and (c) inform models of neurobiological development and function.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
We incorporated two additional leading neuron-identification algorithms into our work and found similarly large improvements in performance. We also added a hold-one-lab-out analysis and found that the performance improvement did not depend on having training data from the lab whose data were being tested, further supporting the generality of the system.
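
For readers unfamiliar with the hold-one-lab-out protocol mentioned above, the following is a minimal Python sketch of how such splits can be constructed. It is illustrative only and is not the code used in this work: the function name `hold_one_lab_out`, the lab names, and the dataset identifiers are all hypothetical, and the training and scoring of an identification algorithm are omitted.

```python
from collections import defaultdict


def hold_one_lab_out(datasets):
    """Yield (held_out_lab, train_ids, test_ids), one split per lab.

    `datasets` is a list of (lab_name, dataset_id) pairs. Both fields are
    hypothetical placeholders, not the WormID schema.
    """
    by_lab = defaultdict(list)
    for lab, dataset_id in datasets:
        by_lab[lab].append(dataset_id)

    for held_out in sorted(by_lab):
        # Test only on the held-out lab's datasets.
        test_ids = list(by_lab[held_out])
        # Train on every dataset from all remaining labs.
        train_ids = [
            d for lab in sorted(by_lab) if lab != held_out for d in by_lab[lab]
        ]
        yield held_out, train_ids, test_ids


# Toy example with made-up lab names and dataset IDs.
toy = [("lab_A", "a1"), ("lab_A", "a2"), ("lab_B", "b1"), ("lab_C", "c1")]
splits = list(hold_one_lab_out(toy))
for held_out, train_ids, test_ids in splits:
    print(held_out, "train:", train_ids, "test:", test_ids)
```

In each split the model never sees data from the lab it is evaluated on, so accuracy on the held-out lab reflects generalization across labs rather than familiarity with one lab's imaging conditions.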