Summary
Imaging-based spatial transcriptomics technologies, such as MERFISH and seqFISH, use combinatorial barcoding and imaging to simultaneously detect individual RNA molecules from tens to tens of thousands of genes. These technologies require decoding the location and gene identity of individual RNA molecules from stacks of images. However, beyond the use of ‘blank’ code-words as negative controls, the assay embeds no ground-truth information with which to experimentally measure the accuracy and sensitivity of the decoding algorithm. We introduce Guidestar, a system of spike-in controls that is integrated within a combinatorial FISH assay and labels a subset of RNA transcripts with additional probes. These probes are imaged separately as ‘guide bits’, which serve as ground-truth data to assess decoding accuracy at the level of individual RNA molecules. Evaluating an existing decoding method with Guidestar suggested alternative parameter settings that increased sensitivity with minimal impact on accuracy. We also used the Guidestar dataset to train a machine learning-based classifier to distinguish true from false RNA calls, yielding 9% and 40% higher F1 scores in cell line and tissue samples, respectively.
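As an illustration of how guide bits could be turned into an accuracy measure, the following minimal sketch computes precision, recall, and F1 for the decoded calls of one guide-labeled gene. It is not the authors' implementation. The function name, the input format, and the rule that a decoded call counts as true when a guide-bit spot co-localizes with it are all assumptions made for this example.

```python
def f1_from_guide_bits(call_has_guide_spot, n_guide_spots_missed):
    """Score decoded RNA calls against guide-bit ground truth.

    call_has_guide_spot: list of bools, one per decoded call of a
        guide-labeled gene; True if a guide-bit spot co-localizes with
        the call (hypothetical matching rule, assumed for illustration).
    n_guide_spots_missed: number of guide-bit spots with no decoded call
        nearby, counted as false negatives.
    """
    tp = sum(1 for hit in call_has_guide_spot if hit)  # calls confirmed by a guide spot
    fp = len(call_has_guide_spot) - tp                 # calls with no guide spot
    fn = n_guide_spots_missed                          # guide spots the decoder missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: four decoded calls, three confirmed by a guide spot,
# and one guide spot that the decoder missed.
precision, recall, f1 = f1_from_guide_bits([True, True, True, False], 1)
print(precision, recall, f1)
```

Here recall plays the role of sensitivity and precision the role of accuracy of individual calls, so a change in decoding parameters can be judged by how it moves both quantities rather than the blank-code-word rate alone.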
Competing Interest Statement
The Guidestar method described in the manuscript is the subject of Singapore Patent Application No. 10202403372Q, filed on 30 Oct 2024. The Agency for Science, Technology and Research (A*STAR) is the patent applicant, and the inventors are K.H.C., N.C., J.T., L.W., W.Y.S., J.B., and M.H. The remaining author declares no competing interest.