Statistical Transfer Learning with Generative Encoding for Spatial Transcriptomics

A new method is proposed to align single cell reference datasets to spatial transcriptomic gene expression and tissue images. The technique transfers single cells to unmeasured histology tissue by first aligning a single cell reference dataset to known Spatial Transcriptomic tissue, and then learning from the alignment to predict gene expression on new histology. The model can invert the alignment transformation to generate new histology images from gene expression vectors, allowing for in-silico perturbation analyses by dynamically altering the levels of gene expression. Leveraging the cell atlas can lead to annotation of pathology and clinical specimens, enabling a mapping from the cellular and transcriptomic level to imaging tissue.


Introduction
One application of an interpretable statistical transfer learning technique is to Spatial Transcriptomic technologies (ST-tech), Method of the Year (2021) in Nature Methods [1]. The platform measures gene expression spatially on a histological tissue slide, linking gene expression to a tissue in a spatial context. Prior to the advent of ST-tech, there were single cell RNA sequencing technologies (scRNA-seq). The ability to sequence transcripts and quantify their abundance at the single cell level has improved the understanding of biological heterogeneity at a cellular level [2] [3] [4].
An important question is how to align single cells to the region of interest (ROI) on ST-tech that already contains gene transcripts within a spatial context. Another question is how to align single cells to regions outside of the ROI (outROI), so as to transfer information from the scRNA-seq to the outROI, thus predicting gene expression on regions without prior measurements. One benefit of alignment using parametric models is the generation of new tissue images from the modelling of the gene expression vector. This would allow biologists to assess the perturbation of a gene by computationally adjusting its expression and then, via a generative transformation, converting the perturbed gene expression vector into a histology image for spatial context.

Figure 1: Aligning single cell data from a high-quality reference to Spatial Transcriptomic technologies (here the image used is from NanoString GeoMx data). Through learning from the alignment, the transfer of single cell data to histology images (without measured gene expression) is realisable.
The contributions of this paper on Generative Modelling are made through two aims. By modelling scRNA-seq and ST-tech to transfer learn gene expression from ROI to outROI, the aims are to:
• Predict gene expression in histology that does not have any previous transcript measurements
• Generate new histology images based on gene expression vectors via generative modelling
The GitHub code can be found at: https://github.com/AskExplain/gcproc
The GitHub analysis code can be found at: https://github.com/AskExplain/gcproc_analysis
An interactive dashboard can be found at: https://board.askexplain.com/genecode

Data
The dataset consists of four main parts: 1) the scRNA-seq reference dataset containing gene expression, the ST-tech dataset containing 2) gene expression in conjunction with 3) the spatial histology tissue corresponding to the gene expression, and 4) the spatial histology tissue without gene expression to be evaluated.

Main Data
The scRNA-seq data is sourced from a single cell kidney reference atlas [5]. The dataset consists of 12,423 single cells, each with 17,682 genes that overlap with the spatial transcriptomic dataset. There are 33 cell types from the adult non-PT parenchyma and immune kidney datasets. There are also 44 cell types from the same atlas, covering fetal nephron, stroma and immune kidney tissue (only the fetal nephron tissue is used).
The ST-tech data is sourced from the NanoString Spatial Omics Hackathon at DevPost. The data used was a subset of what was provided, namely the gene expression data and the stained histology tissue. The gene expression data consists of 18,504 genes with 231 ROI. The ROI are tiled into 32 by 32 pixel regions, each with the 3 standard RGB colour channels. These are flattened into vectors for further analysis. The outROI consists of a larger image of size 1280 by 1280 pixels, tiled into 1,600 tiles of 32 by 32 pixels, again with the 3 standard RGB channels.
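The tiling and flattening step described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the paper: the function name `tile_and_flatten` is hypothetical, the image is a zero-filled placeholder, and non-overlapping tiles are assumed.

```python
import numpy as np

TILE = 32  # tile edge in pixels, as stated in the text


def tile_and_flatten(image: np.ndarray, tile: int = TILE) -> np.ndarray:
    """Split an (H, W, C) image into non-overlapping tile x tile patches
    and flatten each patch into one row vector (hypothetical helper)."""
    h, w, c = image.shape
    if h % tile or w % tile:
        raise ValueError("image sides must be multiples of the tile size")
    rows, cols = h // tile, w // tile
    # (rows, tile, cols, tile, C) -> (rows, cols, tile, tile, C)
    patches = image.reshape(rows, tile, cols, tile, c).swapaxes(1, 2)
    return patches.reshape(rows * cols, tile * tile * c)


# Placeholder image with the outROI dimensions (not real data).
out_roi = np.zeros((1280, 1280, 3), dtype=np.uint8)
tiles = tile_and_flatten(out_roi)
print(tiles.shape)  # (1600, 3072)
```

Each row holds 32 x 32 x 3 = 3,072 values, and a 1280 by 1280 image yields 40 x 40 = 1,600 such rows, which matches the tile count given for the outROI.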
Importantly, the ST-tech provided is stained with 3 different biomarkers: magenta or pink for WT1, a marker of the podocytes in the glomeruli, which are the filtration units of the kidney; yellow for Pan-cytokeratin, a marker of renal tubules, which are the tubes that connect the glomeruli to the excretory regions of the kidney; and blue for CD45, a marker for immune cell types.

Perturbation datasets
The perturbation dataset involves only the fetal nephron scRNA-seq dataset from the Kidney Cell Atlas, and does not involve Spatial Transcriptomic data (although model parameters learned using the ST-tech are used). All images are generated from gene expression data, with added perturbations from simulated continuous-time gene regulatory dynamics.
In the analysis, only a single cell vector is taken, composed of all 17,682 genes. Five genes are selected (WT1, PODXL, SIX2, NPHS1, NPHS2) to be perturbed simultaneously by complementing the observed expression with an increase at every time step via a logistic sigmoid curve (parameters for the curves are randomly sampled).
The logistic sigmoid function is given as:
$$x(t) = \frac{L}{1 + e^{-k(t - t_0)}}$$
where L determines the maximum gene expression level, t_0 defines the time point of maximum change (the midpoint of the sigmoid curve), and k determines the overall rate at which the gradient changes. Here t is the time step, and x(t) is the gene expression as a function of time.
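A small sketch of such a perturbation is given below. It assumes the standard logistic form with parameters L, k and t_0, and it assumes that "complementing" the observed expression means adding the sigmoid value to a baseline; the function names and the parameter values are illustrative only, since the paper samples its parameters at random.

```python
import math


def logistic(t: float, L: float, k: float, t0: float) -> float:
    """Standard logistic curve: L / (1 + exp(-k * (t - t0)))."""
    return L / (1.0 + math.exp(-k * (t - t0)))


def perturb(baseline: float, steps: int, L: float, k: float, t0: float) -> list:
    """Add the logistic increase to an observed expression value at each
    time step (additive perturbation is an assumption of this sketch)."""
    return [baseline + logistic(t, L, k, t0) for t in range(steps)]


# Hypothetical parameters for one gene; not values from the paper.
trajectory = perturb(baseline=2.0, steps=11, L=4.0, k=1.0, t0=5.0)
print(trajectory[5])  # 4.0, i.e. baseline + L / 2 at the midpoint t = t0
```

The trajectory rises monotonically from just above the baseline towards baseline + L, with the steepest change at t = t_0.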

Simulation datasets
The simulation dataset uses Splatter, and does not contain any prior gene expression information other than that needed to estimate the parameters for Splatter [6]. The simulated samples are transformed via the learned parameters to fit within the structure of a kidney scRNA-seq dataset, but do not involve measured observations from the Kidney Cell Atlas.
In the analysis, four cell vectors are taken, each composed of all 17,682 genes. Eight genes are selected (WT1, PODXL, NPHS1, NPHS2, VEGFA, VIM, CDH1, AQP2) to be perturbed simultaneously by complementing the observed expression with an increase at every time step via a continuous function.
The four histology image tiles obtained from the inverse transformation of the cell vectors are combined by adjusting the colour and pixel gradient between image tiles, using the expression only on overlapping pixels: Here, α ranges over [0, 1] at discrete intervals. The full-sized image tiles are 32 by 32 pixels, and overlap by 8 pixels at each intersecting border. See Figure 3 for more details. To map between tiles that are inverse transformed from scRNA-seq data, overlapping pixels at intersecting tiles are filled in using a probabilistic gradient. Note that all images are generated from gene expression, and do not affect the genes via a feedback loop.
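One plausible way to stitch overlapping tiles is sketched below. It assumes a simple linear cross-fade, (1 − α)·A + α·B, over the 8 shared pixels, with α stepping from 0 to 1; the blending expression actually used in the paper is not reproduced here, and `blend_pair` and `stitch_four` are hypothetical names.

```python
import numpy as np

TILE, OVERLAP = 32, 8  # tile size and border overlap, as stated in the text


def blend_pair(a, b, overlap=OVERLAP, axis=1):
    """Join two (H, W, C) tiles along `axis`, cross-fading the shared pixels.
    The linear cross-fade is an assumption of this sketch."""
    a = np.moveaxis(np.asarray(a, dtype=float), axis, 0)
    b = np.moveaxis(np.asarray(b, dtype=float), axis, 0)
    # alpha steps from 0 to 1 at discrete intervals across the overlap
    alpha = np.linspace(0.0, 1.0, overlap)[:, None, None]
    seam = (1.0 - alpha) * a[-overlap:] + alpha * b[:overlap]
    joined = np.concatenate([a[:-overlap], seam, b[overlap:]], axis=0)
    return np.moveaxis(joined, 0, axis)


def stitch_four(tl, tr, bl, br):
    """Stitch four corner tiles into one larger tile."""
    top = blend_pair(tl, tr, axis=1)
    bottom = blend_pair(bl, br, axis=1)
    return blend_pair(top, bottom, axis=0)


# Four constant-colour placeholder tiles (not generated histology).
corners = [np.full((TILE, TILE, 3), v, dtype=float) for v in (0.0, 0.25, 0.5, 1.0)]
canvas = stitch_four(*corners)
print(canvas.shape)  # (56, 56, 3)
```

With 32 pixel tiles and an 8 pixel overlap, two tiles span 32 + 32 − 8 = 56 pixels, so four corner tiles produce a 56 by 56 image.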

Transfer Learning method
The aim is to integrate data from one set of observations with another in an aligned, reduced dimensional space for both the samples and the features, via a transformation with matrix projections.
The K and J parameter matrices transform the samples into the same reduced dimensional plane along the sample space. This gives the benefit that JX and KY are now ready to be analysed together along the same dimensions.
Furthermore, the v and u parameters transform the features across the datasets into the same space, allowing Yv and Xu to be analysed along the same gene dimensions.
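The role of the four projection matrices can be illustrated with their shapes alone. The sketch below uses random placeholder matrices and assumes that rows of X and Y are samples and columns are features; all dimensions, and the reduced sizes k and d, are arbitrary choices for illustration rather than values from the paper, and no fitting is performed.

```python
import numpy as np

rng = np.random.default_rng(0)

n_x, n_y = 50, 20   # samples in each dataset (placeholders)
g_x, g_y = 30, 40   # features in each dataset (placeholders)
k, d = 5, 4         # reduced sample and feature dimensions (assumed)

X = rng.normal(size=(n_x, g_x))
Y = rng.normal(size=(n_y, g_y))

# Sample-side projections: map both datasets onto k shared rows.
J = rng.normal(size=(k, n_x))
K = rng.normal(size=(k, n_y))
# Feature-side projections: map both datasets onto d shared columns.
u = rng.normal(size=(g_x, d))
v = rng.normal(size=(g_y, d))

JX, KY = J @ X, K @ Y              # (k, g_x) and (k, g_y)
Xu, Yv = X @ u, Y @ v              # (n_x, d) and (n_y, d)
JXu, KYv = J @ X @ u, K @ Y @ v    # both (k, d): fully aligned space

print(JX.shape[0] == KY.shape[0], Xu.shape[1] == Yv.shape[1], JXu.shape == KYv.shape)
```

After both projections, the two datasets occupy matrices of identical shape, which is what allows them to be compared entry by entry once the parameters have been learned.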

Aligning the gene expression of the tiles with the pixel information of the corresponding tiles
The model for this step begins with the ST-tech gene expression tiles $Y_{i,l}$ of the ROI and the ST-tech pixels from the expression tiles $P_{a,l}$, where a indexes the pixels, l the ROI tiles, and i the genes:

Transfer Learning the encoding via shared spaces
Using the pixels as anchors, the parameters learned in the encoded alignment can be transferred to X, the scRNA-seq reference, via R, the ST-tech pixels from histology tissue without measured gene expression:
Notice that only the parameter $B_{k,a}$ for R is fixed as learned in the first stage. It is used as an anchor to pivot the model, so that the relevant parameters can be learned to enable a similar encoding for the scRNA-seq data, X.

Transferred alignment enables prediction of gene expression on new histology tissue
Given the relation between the scRNA-seq dataset X, and R the pixel information from histology images without measured transcripts, a simple solution to aligning gene expression information to the pixel tiles is via:

Generative modelling to transform cells into new tissue images
Given the relation between X, the scRNA-seq dataset, and R, the pixels from the outROI, it is now possible to generate images based on gene expression vectors alone. The model to generate new samples of tissue images is given by:
This is notable given the benefit of interpretability for biologists to better understand how genes affect the histology in a spatial context.
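The idea of encoding genes and pixels into a shared space and then inverting the pixel encoder can be illustrated with a purely linear stand-in. This is not the model of the paper: the data are synthetic and paired by tile, rows are tiles, the pixel encoder B is taken from an SVD, and the names G, H, `generate_pixels` and `predict_genes` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, g, p, k = 60, 20, 48, 5  # tiles, genes, pixels per tile, code size (placeholders)

# Synthetic paired data driven by a shared latent code (stand-in for real tiles).
Z = rng.normal(size=(n, k))
A = rng.normal(size=(k, g))
C = rng.normal(size=(k, p))
Y = Z @ A  # gene expression per tile
P = Z @ C  # flattened pixels per tile

# Pixel encoder: top-k right singular vectors of P, kept fixed as the anchor.
_, _, Vt = np.linalg.svd(P, full_matrices=False)
B = Vt[:k].T  # (p, k)
codes = P @ B  # (n, k) pixel codes

# Gene encoder fitted by least squares so that gene codes match pixel codes.
G, *_ = np.linalg.lstsq(Y, codes, rcond=1e-10)  # (g, k)
# Decoder from the shared code back to gene expression.
H, *_ = np.linalg.lstsq(codes, Y, rcond=1e-10)  # (k, g)


def generate_pixels(x):
    """Gene vector -> shared code -> pixel vector (inverting the pixel encoder)."""
    return x @ G @ B.T


def predict_genes(r):
    """Pixel vector -> shared code -> gene vector."""
    return r @ B @ H


z_new = rng.normal(size=(1, k))
x_new, r_new = z_new @ A, z_new @ C
print(np.allclose(generate_pixels(x_new), r_new, atol=1e-6))
print(np.allclose(predict_genes(r_new), x_new, atol=1e-6))
```

Because the synthetic data are exactly linear and noise free, both directions recover the held-out tile; with real data the fit would only be approximate, and the sample-side alignment between unpaired cells and tiles would also need to be learned.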

Results
There are numerous papers on kidney development, as well as on kidney organoid and stem cell research, which will aid in validating the computational results.

Modelling gene regulation from alignment of scRNAseq to histology
The first set of results in Figure 4 involves

Modelling dynamic gene regulatory networks
Modelling gene expression as changing continuously over time using logistic sigmoidal curves can improve the interpretability of scRNA-seq data. The main aim would be to benchmark continuous-time dynamically coupled gene expression networks to visualise how gene perturbations affect the histology in Figure 5.
Given the novelty of the method, one limitation is that each gene has a one-to-one correspondence with an image tile, and thus dynamic network perturbations are essentially a linear interpolation of these gene-based image tiles. This means that the method is more focused on interpretation to generate new hypotheses for biologists, rather than on discovering new biology.
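The linear-interpolation remark can be made concrete with a short worked equation. The notation is assumed for illustration only: W is a hypothetical linear map from expression to pixels whose columns w_i are the gene-specific image tiles, x is the expression vector, e_g is the unit vector for gene g, and δ is the size of the perturbation.

```latex
\begin{aligned}
p(x) &= W x = \sum_{i} x_i \, w_i, \\
p(x + \delta\, e_g) &= p(x) + \delta\, w_g .
\end{aligned}
```

Under this assumption, perturbing gene g shifts the generated image along the fixed direction w_g in proportion to δ, so a time course of perturbations traces a weighted combination of the same gene-based tiles.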

Simulating multiple cells as image tissues spatially and temporally
Multiple cells inverse transformed to image tiles can be stitched together to overlap and create a new image tile from the smaller sub-tiles. The new larger image tile can be used to model gene expression spatially. By dynamically altering gene expression through time, it is possible to assess this temporally.
In Figure 6, four cells are inverse transformed to generate the corner image tiles, which are then stitched together to generate a larger image tile. The four cells containing genes are simulated from Splatter [6], and transformed to fit the shape of data from the Kidney Cell Atlas (in particular, fetal nephron tissue). Notice that no real observations are used to generate the simulated Spatial Transcriptomics data.

Extensions to develop an open-source toolbox
With the creation of the Gene Ontology database, computational methods gained a resource against which models can be benchmarked to assess how well they capture biological variation and signals. Extending this approach to integrate with current methods such as Splatter (see Figure 6), as in SplatPop [7], can give computational scientists a working tool as Spatial Transcriptomic modelling grows in the coming years.
Furthermore, it can also be used as a tool to benchmark dynamically coupled gene expression regulatory networks (see Figure 5). This new tool can improve upon the interpretability of these models to help life-scientists and biologists draw more precise and detailed hypotheses.
Given the small and limited size of the current dataset (n = 231), it would be of great benefit to assess how well the model can capture inherent gene expression signal on larger datasets. It is hypothesised that it could provide clues to more biological knowledge on how two similar cells with similar gene

Related work

Alignment of datasets
The projection of a dataset into the same space can occur by transforming one dataset into the space of another, similar to Procrustes analysis [8], or into the same reduced dimensional space, akin to Generalised Principal Component Analysis [9].

Alignment of cells to spatial transcriptomics
The alignment of cells to ST-tech to impute spatial gene expression involves maximising the aligned gene correlation between cell and ST-tech regions. Several methods optimise the correspondence between cell and ST-tech regions, such as Tangram, which uses cosine similarity [10], or cell2location [11], which assigns cell type similarity based on a negative binomial regression weight.

Prediction of gene expression
There are a number of methods that aim to predict gene expression on histology tissue without prior measurements, using ST-tech data for training. For example, ST-Net uses a CNN to extract features and predict the expression of around 200 genes in a fully integrated deep learning model [12]. Another method uses data from The Cancer Genome Atlas Program (TCGA), namely tissue RNA-seq data measured on patients with histology samples, and predicts gene expression at a bulk level [13].

Generative image modelling
From a review of the current literature, the generation of tissue images from gene expression vectors appears to be novel for ST-tech and scRNA-seq data. It is possible to sample from the manifold of likely activations based on the weights in a deep learning method to visualise the features learned in a Convolutional Neural Network [14]. The deep mixtures of factor analysers by Tang et al., from Geoffrey Hinton's group, also produced generative models that created sample images, although using face data [15].

Conclusion
The notion of communicating information across disciplines is important, and the same idea can be applied to datasets. Aligning datasets together to make use of both in a synergistic way can improve modelling.
In this computational work, Single Cell RNA-sequencing is aligned to Spatial Transcriptomic technologies. This allows both the alignment of gene expression to histology tissue and the inverse through transforming histology tissue to gene expression.
This modelling technique of encoding both datasets into the same space (similar to an Autoencoder) can learn interpretable parametric weights that have biological meaning. The correspondence between gene expression and image tiles would allow the benchmarking of computational methods that either simulate or generate gene expression, giving improved interpretations for biologists.
An open-source community-based tool can be developed and integrated into existing methods to benefit the single cell field.

Acknowledgements
Thanks to Ryan Deslandes for preparing the interactive dashboard and providing advice on the database and software platforms.