PT - JOURNAL ARTICLE
AU - Simone Avesani
AU - Eva Viesi
AU - Luca Alessandrì
AU - Giovanni Motterle
AU - Vincenzo Bonnici
AU - Marco Beccuti
AU - Raffaele Calogero
AU - Rosalba Giugno
TI - Stardust: improving spatial transcriptomics data analysis through space aware modularity optimization based clustering
AID - 10.1101/2022.04.27.489655
DP - 2022 Jan 01
TA - bioRxiv
PG - 2022.04.27.489655
4099 - http://biorxiv.org/content/early/2022/05/10/2022.04.27.489655.short
4100 - http://biorxiv.org/content/early/2022/05/10/2022.04.27.489655.full
AB - Background: Spatial transcriptomics (ST) combines stained tissue images with spatially resolved high-throughput RNA sequencing. Spatial transcriptomic analysis includes challenging tasks such as clustering, where a partition of the data points (spots) is defined by means of a similarity measure. Improving clustering results is important because clustering affects downstream analysis. State-of-the-art approaches group data by transcriptional similarity, and some also exploit spatial information. However, it is not yet clear how much spatial information, combined with transcriptomics, improves the clustering result. Results: We propose a new clustering method, Stardust, that exploits the combination of spatial and transcriptomic information in the clustering procedure through manual or fully automatic tuning of the algorithm parameters. A parameter-free version of the method is also provided, in which the spatial contribution depends dynamically on the distribution of expression distances in space. We evaluated the proposed methods by analysing ST datasets available on the 10x Genomics website and comparing clustering performance with state-of-the-art approaches, measuring the stability of spots within clusters and their biological coherence.
Stability is defined as the tendency of each point to remain clustered with the same neighbours when perturbations are applied. Conclusions: Stardust is an easy-to-use method that lets users define how much spatial information should influence clustering on different tissues, and it achieves more stable results than state-of-the-art approaches. Competing Interest Statement: The authors have declared no competing interest.
CSS: Cell Stability Score; DCIS: Ductal Carcinoma In Situ; DLPFC: Dorsolateral Prefrontal Cortex; GenSA: Generalized Simulated Annealing; HBC1: Human Breast Cancer 1; HBC2: Human Breast Cancer; HH: Human Heart; HLN: Human Lymph Node; HMRF: Hidden Markov Random Field; H&E: Hematoxylin & Eosin; IC: Invasive Carcinoma; KNN: K-Nearest Neighbor; MCMC: Markov chain Monte Carlo; MK: Mouse Kidney; MRF: Markov Random Field; PC: Principal Component; PCA: Principal Component Analysis; rCASC: reproducible Classification Analysis of Single Cell Sequencing Data; RDS: R Data Serialized; scRNA-seq: Single-cell RNA sequencing; ST: Spatial Transcriptomics