## Abstract

**Objective** The objective of this research is to unify the molecular representations of spatial transcriptomics and cellular scale histology with the tissue scales of Computational Anatomy for brain mapping.

**Impact statement** We present a unified representation theory for brain mapping of the micro-scale phenotypes of molecular disease simultaneously with the connectomic scales of complex interacting brain circuits.

**Introduction** Mapping across coordinate systems in computational anatomy allows us to understand structural and functional properties of the brain at the millimeter scale. New measurement technologies such as spatial transcriptomics allow us to measure the brain cell by cell based on transcriptomic identity. We currently have no mathematical representations for integrating consistently the tissue limits with the molecular particle descriptions. The formalism derived here demonstrates the methodology for transitioning consistently from the molecular scale of quantized particles - as first introduced by Dirac as a generalized function - to the continuum and fluid mechanics scales appropriate for tissue.

**Methods** We introduce two methods based on notions of generalized functions and statistical mechanics. We use generalized functions expanded to include functional descriptions (electrophysiology, transcriptomic, molecular histology) to represent the molecular biology scale, integrated with a Boltzmann-like procedure to pass from the sparse particles to empirical probability laws on the functional state of the tissue.

**Results** We demonstrate a unified mapping methodology for transferring molecular information in the transcriptome and histological scales to the human atlas scales for understanding Alzheimer’s disease.

**Conclusion** We demonstrate a unified brain mapping theory for molecular and tissue scales.

## 1 Introduction

One of the striking aspects of the study of the brain in modern neurobiology is the fact that the distributions of discrete structures that make up physical tissue, from neural cells to synapses to genes and molecules, exist across nearly ten orders of magnitude in spatial scale. This paper focusses on the challenge of building multi-scale representations that simultaneously connect the quantum nano-scales of modern molecular biology for characterizing neural circuit architecture in the brain with the classical continuum representations at the anatomical gross and meso scales.

We have been highly motivated by the Cell Census Network project (BICCN [1]), which couples the nano and micron scales of single cell measures of RNA via spatial transcriptomics [2–4] to the tissue scales of mouse atlases. The recent review on bridging scales from cells to physiology [5] motivates the mathematical framework presented herein. The recent emergence of spatial transcriptomics as the Nature Methods method of the year highlights the importance and ascendance of such approaches for understanding the dense metric structure of the brain, in which the coarse physiological atlas scales are built up from dense imaging measurements at the cellular scales.

Specifically, in our work on Alzheimer’s disease in the BIOCARD study [6] we are examining pathological Tau at both the micro histological and macro atlas scales of Tau particle detections, from 10-100 μm [7, 8] to human magnetic resonance millimeter scales for examining entire circuits in the medial temporal lobe. In the mouse cell counting project we are examining single-cell spatial transcriptomics using modern RNA sequencing in dense tissue at the micron scale and its representations in the Allen atlas coordinates [9].

Most noteworthy for any representation is that at the finest micro scales nothing is smooth; the distributions of cells and molecules are better described as random quantum counting processes in space [10]. In contrast, information associated to atlasing methods at the gross anatomical tissue and organ scales of Computational Anatomy extends smoothly [11–16]. Cross-sectionally and even cross-species, gross anatomical labelling is largely repeatable, implying that information transfers and changes from one coordinate system to another smoothly. This is built into the representation theory of diffeomorphisms and soft matter tissue models for which advection and transport hold [17–21], principles upon which continuum mechanics and its analogues are based.

The focus of this paper is to build a coherent representation theory across scales. For this we view the micron to millimeter scales via the same representation theory called *mathematical measures*, building the finest micron scales from discrete units termed particle measures which represent molecules, synapses and cells. As they aggregate they form tissues. This measure representation allows us to understand subsets of tissues that contain discretely positioned and placed functional objects at the finest quantized scales and simultaneously pass smoothly to the classical continuum scales at which stable functional and anatomical representations exist. Since the study of the function of the brain on its geometric submanifolds (the gyri, sulci, subnuclei and laminae of cortex) is so important, we extend our general framework to exploit varifold measures [22] arising in the modern discipline of geometric measure theory. To be able to compare brains we use diffeomorphisms as the comparator tool, with their action representing 3D varifold action which we formulate as “copy and paste” so that basic particle quantities that are conserved biologically are combined with greater multiplicity and not geometrically distorted as would be the case for measure transport.

The functional features are represented via singular Dirac deltas at the finest micro structure scales. The functional feature is abstracted into a function space rich enough to accommodate the molecular machinery as represented by RNA or Tau particles, as well as electrophysiology associated to spiking neurons, or at the tissue scales the dense contrasts of medical imaging such as magnetic resonance images (MRIs). We pass to the classical function continuum via introduction of a scale-space that extends the descriptions of cortical micro-circuits to the meso and anatomical scales. This passage from the quantized features to the stochastic laws is in fact akin to the Boltzmann program transferring the view from the Newtonian particles to the stable distributions describing them. For this we introduce a scale-space of kernel density transformations which allows us to retrieve the empirical averages represented by the determinism of the stochastic law consistent with our views of the macro tissue scales.

The representation provides a recipe for scale traversal in terms of a cascade of linear space scaling composed with non-linear functional feature mapping. Following the cascade implies every scale is a measure, so that a universal family of measure norms can be introduced which simultaneously measure the disparity between brains in the orbit independent of the probing technology: RNA identities, Tau or amyloid histology, spike trains, or dense MR imagery.

Our brain measure model implies the existence of a sequence. This scale-space of pairs, the measure representation of the brain and the associated probing measurement technologies, we call Brainspace. To formulate a consistent measurement and comparison technology on Brainspace we construct a natural metric upon it allowing us to study its geometry and connectedness. The metric between brains is constructed via a Hamiltonian which defines the geodesic connections throughout scale space, providing for the first time a hierarchical representation that unifies micro to millimeter representation in the brain and makes Brainspace into a metric space. Examples of representation and comparison are given for Alzheimer’s histology integrated to magnetic resonance imaging scales, and spatial transcriptomics.

## 2 Results

### 2.1 Measure Model of Brain Structures

To build a coherent theory we view the micron to anatomical scales via the same representation theory building upon discrete units termed particles or atoms. As they aggregate they form tissues. This is depicted in Figure 1, in which the top left panel shows mouse imaging of CUX1 labelling of the inner layers of mouse cortex (white) and CTP2 imaging of the outer layers (green) at 2.5 micron in-plane resolution. Notice the discrete nature of the clearly resolved cells, which form the layers of tissue that are the global macro scale features: layers 2, 3, 4, which stain more prolifically in white, and the outer layers 5, 6, which stain more prolifically in green.

Our representation exists simultaneously at both the micro and tissue millimeter scales. A key aspect of anatomy is that at a micro or nano scale, information is encoded as a huge collection of pairs $(x_i, f_i)$ where $x_i \in \mathbb{R}^d$ ($d = 2, 3$) describes the position of a “particle” and $f_i$ is a functional state in a given set $\mathcal{F}$ attached to it (protein or RNA signature or Tau tangle, or for single cell neurophysiology the dynamics of neural spiking). Basically everything is deterministic, with every particle attached to its own functional state among the possible functional states in $\mathcal{F}$. But zooming out, the tissue level, say mm scale, appears through the statistical distribution of its constituents with two key quantities, *the local density* of particles $\rho$ and the *conditional probability distribution* of the functional features $\mu_x(df)$ at any location $x$. At position $x$, we no longer have a deterministic functional state but a probability distribution $\mu_x$ on functional states.

The integration of both descriptions into a common mathematical framework can be done quite naturally in the setting of mathematical measures, which are mathematical constructs able to represent both the discrete and continuous worlds as well as natural levels of approximation between both. Indeed the set of finite positive measures on $\mathbb{R}^d \times \mathcal{F}$ contains discrete measures

$$\mu = \sum_{i \in I} w_i \, \delta_{x_i} \otimes \delta_{f_i},$$

where $w_i$ is a positive weight, that can encode the collection $(x_i, f_i)$ at micro scale.

As in Boltzmann modelling we describe the features statistically at a fixed spatial scale, transferring our attention to their stochastic laws modelled as conditional probabilities on $\mathcal{F}$ with integral 1. For this we factor the measures into the marginal $\rho$ on $\mathbb{R}^d$ with $\rho(A) = \mu(A \times \mathcal{F})$, and the field of probability distributions on $\mathcal{F}$ conditioned on $x$:

$$\mu(dx, df) = \rho(dx) \, \mu_x(df),$$

with field of conditionals $\mu_x(df)$, $x \in \mathbb{R}^d$.

Continuous tissues we abstract as brain measures $\mu$ with marginal $\rho$ having a tissue density $\rho(dx) = \rho_c(x)\,dx$ with respect to the Lebesgue measure on $\mathbb{R}^d$. A fundamental link between the molecular and continuum tissue can be addressed through the law of large numbers, since if $(x_i, f_i)_{i \geqslant 0}$ is an independent and identically distributed sample drawn from the law $\mu/M$ on $\mathbb{R}^d \times \mathcal{F}$, where $M = \mu(\mathbb{R}^d \times \mathcal{F})$ is the total mass of such $\mu$, then we have almost surely the weak convergence

$$\frac{M}{N} \sum_{i=0}^{N-1} \delta_{x_i} \otimes \delta_{f_i} \;\longrightarrow\; \mu \quad \text{as } N \to \infty.$$

Passing from the tissue scales to the finest molecular and cellular scales behooves us to introduce a scale-space so that empirical averages which govern it are repeatable. As depicted in the top center panel of Figure 1, our multi-scale model of a brain is as a sequence of measures:

Our idealization of Brainspace as a sequence of measures as depicted in Figure 1 descends from the coarse tissue scale (top) to the finest particle representation (bottom), with color representing function and radius representing space-scale. Throughout, the range of scales is denoted shorthand $\ell < \ell_{max}$ to mean $0 \leqslant \ell < \ell_{max}$, with lowest scale $\ell = 0$ and upper $\ell_{max}$ not attained.

### 2.2 Model for Crossing Scales

The brain being a multi-scale collection of measures requires us to be able to transform from one scale to another. We do this by associating a scale-space to each particle feature by pairing to each measure a kernel function transforming it from a generalized function to a classical function $\delta_z : h \mapsto h(z)$. The kernels carry a resolution scale $\sigma$ or reciprocally a bandwidth, analogous to Planck’s scale.

To define this we introduce the abstract representation of our system as a collection of descriptive elements $z \in \mathcal{Z}$ made from spatial and functional features. We transform our mathematical measure $\mu(\cdot)$ on $\mathcal{Z}$, generating new measures $\mu'(\cdot)$ on $\mathcal{Z}'$, by defining correspondences via kernels $z \mapsto k(z, dz')$, with the kernel acting on the particles $\delta_z \mapsto k(z, \cdot)$; the measures transform as

$$\mu'(dz') = \int_{\mathcal{Z}} \mu(dz) \, k(z, dz').$$

Depicted in the bottom row of Figure 1 is the construction of our formalism for transformations of scale, linear spatial smoothing followed by non-linear transformation of the feature laws. The first operator transforms via a linear kernel *k*_{1}((*x*, *f*), ·) leaving the feature space unchanged. The second transforms nonlinearly to new features via the kernel *k*_{2}((*x*, *α*), ·) transforming any feature probability *α*; decomposing it smooths the conditionals:

At every scale *μ ^{ℓ}* remains a measure allowing the renormalization (5a), (5b) to be executed recursively.

We use computational lattices to interpolate between the multi-scale continuum. The core spatial resampling kernel projects the particles $(x_i \in \mathbb{R}^d)_{i \in I}$ to the rescaling lattice $(y_j \in Y_j \subset \mathbb{R}^d)_{j \in J}$, defined via $\pi$ apportioning the fraction that particle $x_i$ shares with site $Y_j$. The second kernel uses maps $\phi$ from machine learning transforming features:

The renormalized measures become

The indexing across scales is defined disjointly *I* ⋂ *J* = ∅. Methods 4.2 examines these models.
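As a minimal sketch of the two-step rescaling described in this section, the Python below projects labelled particles onto a lattice and forms the empirical feature law at each lattice site. The Gaussian form of π, its normalization so that each particle's fractions sum to one, the use of categorical labels as features, and the name `resample_to_lattice` are illustrative assumptions, not the kernels of the paper.

```python
import numpy as np

def resample_to_lattice(x, w, labels, y, sigma, n_labels):
    """Project particles (x_i, w_i, label_i) onto lattice sites y_j.

    Assumption: pi[i, j] is a Gaussian weight normalized over j, so each
    particle distributes all of its mass across the lattice sites.
    """
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)   # (I, J)
    d2 -= d2.min(axis=1, keepdims=True)        # stabilizes exp; cancels on normalization
    pi = np.exp(-d2 / (2.0 * sigma ** 2))
    pi /= pi.sum(axis=1, keepdims=True)        # sum_j pi[i, j] = 1

    onehot = np.eye(n_labels)[labels]          # (I, n_labels) indicator features
    mass = (w[:, None] * pi).T @ onehot        # (J, n_labels) mass per site and label
    w_new = mass.sum(axis=1)                   # lattice weights (local density)
    law = mass / np.maximum(w_new[:, None], np.finfo(float).tiny)
    return w_new, law                          # law[j] is an empirical feature law

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(500, 2))       # particle positions
w = np.ones(500)                               # particle weights
labels = rng.integers(0, 3, size=500)          # hypothetical categorical features
gx, gy = np.meshgrid(np.linspace(0, 1, 5), np.linspace(0, 1, 5))
y = np.column_stack([gx.ravel(), gy.ravel()])  # 5 x 5 lattice
w_new, law = resample_to_lattice(x, w, labels, y, sigma=0.2, n_labels=3)
```

With this normalization the total mass is unchanged by the resampling, and each row of `law` is a probability vector over the feature labels at that lattice site.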

### 2.3 The Dynamical Systems Model via Varifold Action of Multi-scale Diffeomorphisms

We want to measure and cluster brains by building a metric space structure. We do this by following the original program of D’Arcy Thompson, building bijective correspondence. In this setting this must be done at every scale, with each scale having different numbers of particles and resolutions.

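We build correspondence between sample brains via dense connections of the discrete particles to the continuum at all scales using the diffeomorphism group and diffeomorphic transport. For this define the group of $k$-times continuously differentiable diffeomorphisms $\varphi \in G_k$ with group operation function composition $\varphi \circ \varphi'$. For any brain $\mu = \sum_{i \in I} w_i \, \delta_{x_i} \otimes \delta_{f_i}$, the diffeomorphisms act

$$\varphi \cdot \mu = \sum_{i \in I} w_i \, |d\varphi(x_i)| \, \delta_{\varphi(x_i)} \otimes \delta_{f_i}.$$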

Space scales are represented as the group product, , acting component-wise with action

The |*dφ*(*x*)| term in the action enables the crucial property that when a tissue is extended to a larger area, the total number of its basic constituents increases accordingly and is not conserved, in contrast to classic measure or probability transport. We call this the “copy and paste” varifold action.
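The effect of the Jacobian term can be illustrated with a small Python sketch. It assumes, following the description above, that the action moves each particle to its image position and multiplies its weight by the Jacobian determinant; an affine map is used so that the determinant is constant. The function name `copy_paste_action` and the example values are hypothetical.

```python
import numpy as np

def copy_paste_action(x, w, A, b):
    """Act on particles by the affine map phi(x) = A x + b.

    Positions move to phi(x_i); weights are multiplied by |det d phi|,
    which is the constant |det A| for an affine map.
    """
    jac = abs(np.linalg.det(A))
    return x @ A.T + b, w * jac

x = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # three particles in the plane
w = np.array([1.0, 1.0, 1.0])
A = 2.0 * np.eye(2)                                  # uniform dilation by 2
b = np.zeros(2)
x_new, w_new = copy_paste_action(x, w, A, b)
```

Doubling lengths in the plane quadruples the area, so each weight is scaled by 4 and the total mass grows with the tissue. Classical measure transport would instead leave the weights unchanged.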

Dynamics occurs through the group action generated as a dynamical system in which we introduce time-indexed vector fields which act as controls on the diffeomorphisms: the multi-scale flow is defined

Geodesic mapping flows under a control process along paths of minimum energy respecting the boundary conditions. The controls are coupled across scales by successive refinements $v^\ell$, $\ell < \ell_{max}$, with $u^{-1} = 0$:

Figure 1 right shows the multi-scale control hierarchy.

The dynamical system is an observer and dynamics equation:

Dynamics translates into a navigation in the orbit of brains and provides a metric distance between brains. Paths of minimum energy connecting the identity *φ*_{0} = **Id** to any fixed boundary condition (BC) *φ*_{1}, where *φ*_{1} is accessible, define the distance, extending LDDMM [23] to a hierarchy of diffeomorphisms, and are geodesics for an associated Riemannian metric [24]; see Eqn. (22) in Methods 4.4. The metric from *μ*_{0} to *μ*_{1} in the orbit accessible from *μ*_{0} via diffeomorphisms is the length of the shortest geodesic path with BCs *φ*_{0} · *μ*_{0} = *μ*_{0} and *φ*_{1} · *μ*_{0} = *μ*_{1}.

Methods 4.4 discusses the spaces and the smoothness required for the geodesics to define a metric.

### 2.4 Model for Optimal Control for Brain Mapping via the Varifold Measure Norm

The BC for matching two brains is defined using measure norms. Brains with 0 norm difference are equal; brains with small normed difference are similar. Every brain has a variable number of particles, without correspondence between particles. Measure norms accommodate these variabilities. Our norm is modelled as a varifold norm constructed by integrating against a smooth kernel. In Methods section 4.3 we define separable Gaussian kernels in space and function, each with a scale.
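A minimal Python sketch of such a kernel norm at a single scale is given here, assuming a product of Gaussian kernels in space and in feature and the standard expansion of the squared norm of a difference of measures. The function names and the bandwidths `sigma_x`, `sigma_f` are illustrative assumptions; the exact kernels are those defined in Methods 4.3.

```python
import numpy as np

def gauss_gram(a, b, sigma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def measure_inner(x1, f1, w1, x2, f2, w2, sigma_x, sigma_f):
    """Inner product of two particle measures under a separable kernel."""
    K = gauss_gram(x1, x2, sigma_x) * gauss_gram(f1, f2, sigma_f)
    return float(w1 @ K @ w2)

def measure_dist2(m1, m2, sigma_x, sigma_f):
    """Squared norm of the difference m1 - m2; m = (positions, features, weights)."""
    return (measure_inner(*m1, *m1, sigma_x, sigma_f)
            - 2.0 * measure_inner(*m1, *m2, sigma_x, sigma_f)
            + measure_inner(*m2, *m2, sigma_x, sigma_f))

# One particle of weight 2 versus two coincident particles of weight 1.
m_a = (np.array([[0.0, 0.0]]), np.array([[1.0]]), np.array([2.0]))
m_b = (np.zeros((2, 2)), np.ones((2, 1)), np.array([1.0, 1.0]))
# The same single particle displaced by one unit in space.
m_c = (np.array([[1.0, 0.0]]), np.array([[1.0]]), np.array([2.0]))

d_ab = measure_dist2(m_a, m_b, sigma_x=1.0, sigma_f=1.0)
d_ac = measure_dist2(m_a, m_c, sigma_x=1.0, sigma_f=1.0)
```

Here `d_ab` vanishes even though the two measures have different numbers of particles, while `d_ac` is strictly positive. No particle-to-particle correspondence is needed in either case.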

Geodesic mapping solves for the control minimizing the energy with the boundary condition the measure norm. We model the controls to be smooth via reproducing kernel Hilbert spaces (RKHSs) $V_\ell$, with norms $\|\cdot\|_{V_\ell}$.

The optimal control $u_\cdot := (u_t, t \in [0, 1])$ is square-integrable under the $V$-norms, satisfying for $\alpha > 0$:

For modelling, each RKHS is taken to have diagonal kernels $K^\ell(\cdot, \cdot) = g^\ell(\cdot, \cdot)\,\mathrm{id}_d$, with $g^\ell$ the Green’s functions and $\mathrm{id}_d$ the $d \times d$ identity (see [24] for non-diagonal kernels). Easing notation, we remove explicit indexing of the optimal controls by scales $\ell$ when implied. Optimal control reparameterizes the flows by introducing the state process indexing the measures and endpoint condition:

Hamiltonian control reparameterizes (12) in “co-states” and Green’s functions, for $\ell < \ell_{max}$,
then is continuously differentiable implying each co-state is absolutely integrable:
for $\varphi_{t,s} = \varphi_s \circ (\varphi_t)^{-1}$, with and the $x$-gradient. The gradients are given by the smooth field $h_q$:

Methods 4.5 establishes the smoothness conditions for the Hamiltonian equations, Methods 4.6 the gradients of the norms.

#### 2.4.1 Gradients of the norm endpoints unifying the molecular and tissue models

Imaging at the tissue continuum scales has the measures as dense limits. Calculating the variations on dense voxel images unifies the tissue scales with the sparse molecular scales. The state $q_t := (\varphi_t, W_t = w_0 |d\varphi_t|)$ with action gives

The average of $h_q$ over the feature space determines the boundary gradient:

The continuum unifies with LDDMM [23]; taking $I(x) \in \mathbb{R}^+$ with $\mu_x = \delta_{I(x)}$, gives

### 2.5 MRI and Tau Histology Scales for Alzheimer’s

#### 2.5.1 Bayes Segmentation of MRI

Figure 2 shows the multi-scale data from the clinical BIOCARD study [6] of Alzheimer’s disease within the medial temporal lobe [7, 8, 25]. The top (left) panel shows clinical magnetic resonance imaging (MRI), with the high-field 200 μm MRI scale (right) depicting the medial temporal lobe including the collateral sulcus and lateral bank of the entorhinal cortex. Bayes classifiers for brain parcellation perform feature reduction as a key step for segmentation at tissue scales [26]. Feature reduction maps the distribution on gray levels to probabilities on *N* tissue types, defined by the integration over the decision regions *θ*_{n} ⊂ [0, 255]:

Figure 2 (top row, right) depicts a Bayes classifier for gray, white and cerebrospinal fluid compartments generated from the temporal lobe high-field MRI section corresponding to the Mai-Paxinos section (panel 3).
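The feature reduction by integration over decision regions can be sketched in Python as below. The sketch assumes that the decision regions are contiguous gray-level intervals separated by thresholds; the function name `tissue_probabilities`, the threshold values and the toy gray-level distribution are hypothetical and are not taken from the study.

```python
import numpy as np

def tissue_probabilities(gray_pmf, thresholds):
    """Integrate a gray-level distribution over interval decision regions.

    gray_pmf   : nonnegative mass on gray levels 0..255 at one location
    thresholds : increasing gray levels splitting [0, 255] into N intervals
    """
    p = np.asarray(gray_pmf, dtype=float)
    p = p / p.sum()                                   # normalize to a probability
    edges = [0, *thresholds, p.size]
    return np.array([p[lo:hi].sum() for lo, hi in zip(edges[:-1], edges[1:])])

gray_pmf = np.zeros(256)
gray_pmf[[40, 120, 200]] = [2.0, 1.0, 1.0]            # toy local gray-level law
probs = tissue_probabilities(gray_pmf, thresholds=[85, 170])
```

For this toy input `probs` holds the probabilities of three tissue classes, one per interval, and sums to one.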

#### 2.5.2 Gaussian Scale-Space Resampling of Tau Histology

For histology at the molecular scales the measure encodes the detected Tau and amyloid particles, with the function given by the geometric features of the fine scale particles. Crossing from the histology accumulates the Tau particles $(x_i \in \mathbb{R}^d)_{i \in I}$ to the tissue lattice $(y_j \in Y_j \subset \mathbb{R}^d)_{j \in J}$ using Gaussian scale-space with normal reweighting in $\mathbb{R}^2$. Figure 2 (bottom row) shows the detected Tau particles as red dots at 4 μm, with feature reduction done via a moment expansion of the Tau distribution:

The bottom row (right two panels) shows the two moments at the tissue scale depicting the mean and variance of the particle size reconstructed from the 4 μm scale using Gaussian resampling onto the tissue lattice. The mm scale depicts the global folding property of the tissue. The color codes the mean tissue Tau area as a function of position at the tissue scales, with deep red color denoting 80 μm^{2} maximum Tau area for the detected particles.
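A small Python sketch of this kind of moment reduction follows. It assumes that the weights are unnormalized Gaussians of the distance between lattice sites and particles, and that the mean and variance are the first two moments of particle area under those weights; the normalization, the function name `gaussian_moments` and the toy particle data are assumptions for illustration only.

```python
import numpy as np

def gaussian_moments(x, area, y, sigma):
    """Gaussian-weighted density, mean and variance of particle area per site."""
    d2 = ((y[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)   # (J, I)
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    density = k.sum(axis=1)                                    # local particle density
    norm = np.maximum(density, np.finfo(float).tiny)           # guard empty sites
    mean = (k @ area) / norm
    var = (k @ area ** 2) / norm - mean ** 2
    return density, mean, np.maximum(var, 0.0)

# Two toy clusters of particles with areas 20 and 80 (arbitrary units).
x = np.array([[0.0, 0.0], [0.5, 0.0], [0.0, 0.5], [10.0, 0.0], [10.5, 0.0]])
area = np.array([20.0, 20.0, 20.0, 80.0, 80.0])
y = np.array([[0.0, 0.0], [10.0, 0.0]])                        # two lattice sites
density, mean, var = gaussian_moments(x, area, y, sigma=1.0)
```

Each lattice site recovers the mean area of its nearby cluster with near-zero variance, since the particles in each cluster share one area and the far cluster contributes negligible weight.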

### 2.6 Cellular Neurophysiology: Neural Network Temporal Models

Single unit neurophysiology uses temporal models of spiking neurons with a “neural network” taking each neuron $x_i$ modelled as a counting measure in time $N_i(t)$, $t \geqslant 0$, with the spike times the feature:

The Poisson model with intensity λ(*t*), *t* ⩾ 0 [10] has probabilities .

Post-stimulus time (PST) [27] and interval histograms are used to examine the instantaneous discharge rates and inter-spike interval statistics [28]. The interval histogram abandons the requirement of maintaining the absolute phase of the signal for measuring temporal periodicity and phase locking. Synchrony in the PST is measured using binning $[b_i, b_{i+1})$, $i = 1, \cdots, B$, and Fourier transforms:

The *n* = 0 frequency computes integrated rate; each phase-locked feature is complex, *ϕ*_{n} ∈ ℂ.
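The binning and Fourier step can be sketched in Python as follows. The sketch assumes that spike times are folded modulo a known stimulus period before binning and that the features are the discrete Fourier coefficients of the bin counts; the function name `pst_fourier_features` and all parameter values are hypothetical.

```python
import numpy as np

def pst_fourier_features(spike_times, period, n_bins, n_harmonics):
    """Period histogram of spike phases and its first Fourier coefficients."""
    phases = np.mod(spike_times, period)                     # fold onto one cycle
    counts, _ = np.histogram(phases, bins=n_bins, range=(0.0, period))
    j = np.arange(n_bins)
    feats = np.array([np.sum(counts * np.exp(-2j * np.pi * n * j / n_bins))
                      for n in range(n_harmonics)])          # feats[n] is complex
    return counts, feats

# Perfectly phase-locked train: one spike per cycle at a fixed phase.
locked = np.arange(20) * 1.0 + 0.25
c_locked, f_locked = pst_fourier_features(locked, period=1.0, n_bins=10, n_harmonics=3)

# Spikes spread evenly over the cycle: no phase preference.
spread = (np.arange(40) + 0.5) * 0.1
c_spread, f_spread = pst_fourier_features(spread, period=1.0, n_bins=10, n_harmonics=3)
```

In both cases the zeroth coefficient equals the spike count. For the locked train the first harmonic has the same magnitude as the count, and for the evenly spread train it vanishes up to rounding.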

### 2.7 Scale Space Resampling of RNA to Cell and Tissue Scales

Methods in spatial transcriptomics which have emerged for localizing and identifying cell-types via marker genes and across different cellular resolutions [4, 29–32] present the opportunity of localizing in spatial coordinates the transcriptionally distinct cell-types. Depicted in Figure 3 are the molecular measurements at the micron scales with MERFISH [33]. The molecular measures represent RNA locations with sparse RNA features. Crossing to cells $(Y_j \subset \mathbb{R}^2)_{j \in J}$ partitions into closest particle subsets defined by the distance $d(x_i, Y_j)$ of particle $x_i$ to cell $Y_j$. The RNA particles $(x_i \in \mathbb{R}^d)_{i \in I}$ are resampled to the cell centers via indicator functions, accumulating to nonsparse mixtures of RNA within the closest cell. The new reduced feature vector becomes the conditional probability on 17 cell-types. The conditional probabilities on the RNA vectors are calculated via principal components $E_n$, $n = 1, 2, \dots$, modelling features as independent, Gaussian distributed:

Resampling to tissue uses normal rescaling. The new feature vector becomes the probability of the cell at any position being one of 10 tissue types. The probability of tissue type is calculated using 10-means clustering on the cell type probabilities. The distance for 10-means clustering is computed using the Fisher-Rao metric [34] between the empirical feature laws. The output of 10-means is a partition of feature space giving new features:
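The distance used in the clustering step can be sketched in Python. For discrete probability vectors the Fisher-Rao geodesic distance has the closed form of twice the arccosine of the Bhattacharyya coefficient. Only the distance and the nearest-center assignment are shown; the update of the cluster centers is not spelled out in the text and is omitted. The function names and the toy laws are hypothetical.

```python
import numpy as np

def fisher_rao(p, q):
    """Fisher-Rao distance between discrete laws (last axis sums to 1)."""
    bc = np.sqrt(p * q).sum(axis=-1)                 # Bhattacharyya coefficient
    return 2.0 * np.arccos(np.clip(bc, 0.0, 1.0))

def assign_to_centers(laws, centers):
    """Index of the nearest center for each law under the Fisher-Rao distance."""
    d = fisher_rao(laws[:, None, :], centers[None, :, :])    # (n_laws, n_centers)
    return d.argmin(axis=1)

# Toy cell-type laws on three classes and two hypothetical cluster centers.
laws = np.array([[0.9, 0.1, 0.0],
                 [0.2, 0.7, 0.1],
                 [0.0, 0.2, 0.8]])
centers = np.array([[1.0, 0.0, 0.0],
                    [0.0, 0.0, 1.0]])
labels = assign_to_centers(laws, centers)
d_max = fisher_rao(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
d_same = fisher_rao(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
```

Laws with disjoint support sit at the maximal distance π, and identical laws are at distance zero.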

Figure 3 (bottom left panels) shows the RNA forming *μ*^{ℓ+1} depicted as colored markers corresponding to the different gene species (bar scale 1, 10 microns). Panel 2 shows the cell type feature space of 17 cell types making up *μ*^{ℓ}, associated to the maximal probability in the PCA projection from a classifier on the PCA dimensions based on the mixtures of RNA at each cell location. Bottom right shows the 10 tissue features associated to the 10-means procedure. In both scales the probabilities are concentrated on one class with probability 1 and the others 0.

### 2.8 Geodesic Mapping for Spatial Transcriptomics, Histology and MRI

#### 2.8.1 Navigation Between Sections of Cell Identity in Spatial Transcriptomics

Figure 4 (top panels 1, 2) shows sections from [4] depicting neuronal cell types via colors including excitatory cells eL2/3 (yellow), eL4 (orange), eL5 (red), inhibitory cells ST (green), VIP (light blue), each classified via high dimensional gene expression feature vectors via spatial transcriptomics. The measure crosses to atlas scales using $\pi_\rho$ in $\mathbb{R}^2$ as above Eqn. (19), with feature reduction given by expectations of moments:

The right panel shows the tissue scale features associated to the cell identity given by the entropy, a measure of dispersion across the cell identities given by the negative expectation of the log probability function, with zero entropy meaning the feature distribution at that spatial location has all its mass on 1 cell type.
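The entropy feature can be computed from the cell-type probabilities at each location as in the short Python sketch below. It uses the natural logarithm and the convention that zero-probability classes contribute nothing; the function name `entropy` and the toy laws are illustrative.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in nats) of discrete laws along the last axis."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum(axis=-1, keepdims=True)
    # log is evaluated only where p > 0; other entries stay at 0
    logp = np.log(p, out=np.zeros_like(p), where=p > 0)
    return -(p * logp).sum(axis=-1)

laws = np.array([[1.0, 0.0, 0.0, 0.0],       # all mass on one cell type
                 [0.25, 0.25, 0.25, 0.25]])  # evenly dispersed over four types
h = entropy(laws)
```

The first location has zero entropy, and the evenly dispersed location attains the maximum log 4 for four classes.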

Figure 4 shows results of transforming the neuronal cell types (row 2) and the entropy (row 3) at the scales of the cell and tissue. For all of our geodesic mapping examples shown we enforce vector field smoothness via differential operators specifying the norms in the RKHS with $L^\ell := \left( (1 - (\alpha^\ell)^2 \nabla^2)\,\mathrm{id}_d \right)^2$ for $\ell < \ell_{max}$.

#### 2.8.2 Navigation Between Sections of Histology

Figure 5 (rows 1 and 2) shows navigation between the cortical folds of the 4 μm histology. Shown in panel 1 is a section showing the machine learning detection of the Tau particles. Columns 2, 3, and 4 depict the template, mapped template and target, showing the mathematical measure representation of the perirhinal cortex constructed from the positions and sizes at the 4 μm scale (top row) and reconstruction using Gaussian resampling onto the tissue scale (bottom row). The color codes the mean of *μ*_{x} representing Tau area as a function of position at the tissue scales, with deep red of 80 μm^{2} of Tau area the maximum value. The gradient in tau tangle area between superficial and deep layers is apparent, with the deep red as high as 80 μm^{2} for the innermost tissue fold. The bottom left panel shows the vector field encoding of the geodesic transformation of the perirhinal cortex mapping between the two sections in column 1. The narrowing of the banks of the perirhinal cortex is exhibited at the tissue scale for motions of order 1000 μm (brightness on scale bar).

Figure 5 (row 3) shows the collateral sulcus fold at the boundary of the trans-entorhinal cortex region transforming based on the normed distances between sections, with deformation motions 1000 μm in size. Shown is the micron scale depicting the transformation of the gyrus, with the color representing the entropy of the particle identity distribution.

#### 2.8.3 Mapping between MRI and Histology Simultaneously

All of the examples thus far have created the multi-scale data generated using the resampling kernels from the finest scales. As illustrated in our early figures much of the data is inherently multi-scale, with the measurement technologies generating the coarse scale representations. Shown in Figure 6 is data illustrating our Alzheimer’s study of post mortem MR images that are simultaneously collected with amyloid and Tau pathology sections. MR images have a resolution of approximately 100 *μm*, while pathology images have a resolution of approximately 1 *μm*. For computational purposes the MRI template and target images were downsampled to 759 and 693 particles, respectively with the tau tangles downsampled to 1038 and 1028 particles, respectively. We treated every pixel in the MR image as a coarse scale particle with image intensity as its feature value Eqn. (18), and every detected tau tangle as a fine scale particle with a constant feature value, and performed varifold matching to align to neighboring sections. The endpoint representing the two scales is . For each scale norm we use a varifold kernel given by the products of Gaussian distributions with the varifold measure norm Eqn. (21) at each scale. For the MRI scale, the weights are identically *w* = 1 with the function component given by the MRI image value; for the tau particles there is no function component making *f* = *g* with the kernel 1 for all values.

The top two rows of Figure 6 show the imaging data for both sections. The bottom row shows the transformed template image at the fine scale. The high resolution mapping carries the kernels across all the scales as indicated by the geodesic equation (14a). Notice the global motions of the fine particles at high resolution.

## 3 Discussion

Computational anatomy was originally formulated as a mathematical orbit model for representing medical images at the tissue scales. The model generalizes linear algebra to the group action on images by the diffeomorphism group, a non-linear algebra, but one that inherits a metric structure from the group of diffeomorphisms. The formulation relies on principles of continuity of medical images as classical functions, generalizing optical flow and advection of material to diffeomorphic flow of material, the material represented by the contrast seen in the medical imaging modality, such as fiber orientation for diffusion tensor imaging or BOLD contrast for gray matter content. Unifying this representation to images built at the particle and molecular biological scale has required us to move away from classical functions to the more modern 20th century theory of non-classical generalized functions. Mathematical measures are the proper representation as they generally reflect the property that probes from molecular biology associated to disjoint sets are additive, the basic starting point of measure theory. Changing the model from a focus on groups acting on functions to groups acting on measures allows for a unified representation that has both a metric structure at the finest scales and a unification with the tissue imaging scales.

The brain measure formulation carries with it implicitly the notion of scale-space, i.e. the existence of a sequence of pairs across scales: the measure representation of the brain and the associated scale-space reproducing kernel Hilbert space of functions which correspond to the probing measurement technologies. As such, part of the prescription of the theory is a method for crossing scales and carrying information from one scale to the other. Important to this approach is that at every scale we generate a new measure; therefore the recipe of introducing “measure norms” built from RKHSs for measuring brain disparity is universal across the hierarchy, allowing us to work simultaneously with common data structures and a common formalism. Interestingly, the measure norms do not require identical particle numbers across brains in Brainspace at the molecular scales.

The key modelling element of brain function is that the conditional feature probability is manipulated from the quantized features to the stochastic laws. These are the analogues of the Boltzmann distributions generalized to the complex feature spaces representing function. As they correspond to arbitrary feature spaces, not necessarily Newtonian particles, we represent them simply as empirical distributions on the feature space, with the empirical measure constructed from the collapse of the fine scale to the resampled coarse scale. To model rescaling through scale-space explicitly, the two kernel transformations are used, allowing us to retrieve the empirical averages represented by the determinism of the stochastic law consistent with our views of the macro tissue scales. This solves the dilemma that for the quantized atomic and micro scales cell occurrence will never repeat, i.e. there is zero probability of finding a particular cell at a particular location, and conditioned on finding it once it will never be found again in the exact same location in another preparation. The properties that are stable are the probability laws with associated statistics that may transfer across organisms and species.

Importantly, our introduction of the |*dφ*(*x*)| term in the action enables the crucial property that when a tissue is extended to a larger area, the total number of its basic constituents should increase accordingly and not be conserved. This is not traditional measure transport, which is mass preserving, a feature that is not desirable for biological samples. Rather we have defined a new action on measures that is reminiscent of the action on *d*-dimensional varifolds [35, 36]. We call this property “copy and paste”, the notion being that the brain is built on basic structuring elements that are conserved.

Successive refinement for the small deformation setting has been introduced in many areas associated with multigrid and basis expansions. The notion of building multi-scale representations in the large deformation LDDMM setting was originally explored by F. Riesser et al. [37], in which the kernel is represented as a sum of kernels, and by Sommer et al. [38], in which the kernel is represented as vector bundles. In their multi-scale setting there is a post-optimization decomposition in which the contributions of the different components of the velocity field can each be integrated. In that multi-scale setting the basic Euler-Lagrange equation, termed EPDIFF, remains that of LDDMM [39]. In the setting proposed here we separate the scales before optimisation via the hierarchy of layered diffeomorphisms and use a multi-scale representation of the brain hierarchy itself, which is directly associated with the diffeomorphism at that scale. This gives the fundamental setting of the product group of diffeomorphisms, with the Euler-Lagrange equation corresponding to the sequence of layered diffeomorphisms [24].

The aggregation across scales from particle to tissue scales on lattices provides the essential link to inference on graphs. It is natural for these aggregated features, with their associated conditional probability laws, to become the nodes in Markov random field modelling for spatial inference; see examples in spatial transcriptomics and tissue segmentation [40]. Building neighborhood relations as conditional probabilities between lattice sites, from which global probability laws are constructed via the Hammersley-Clifford theorem, links us to Grenander’s metric pattern theory formalisms, with the atoms and conditional laws at any scale playing the roles of the generators.

## 4 Materials and Methods

### 4.1 Experimental and Technical Design

The objective of this research is to unify the molecular representations of spatial transcriptomics and cellular scale histology with the tissue scales of Computational Anatomy for brain mapping. To accomplish this we designed a mathematical framework for representing data at multiple scales using generalized functions, and mapping data using geodesic flows of multiple diffeomorphisms. We illustrate the method using several examples from human MRI and digital pathology, as well as mouse spatial transcriptomics.

### 4.2 Method of Resampling Across Lattice Scales

Taking the measure *μ* with kernels *k*_{1}, *k*_{2} of (6a), (6b) gives our two-step transformation. The spatial transformation (5a), *T*_{1} : *μ* ↦ *μ′*^{ℓ}, gives (6c) with the resampled weights *w*_{j} = Σ_{i∈I} *w*_{i}π(*x*_{i}, *Y*_{j}). The feature transformation (5b), *T*_{2} : *μ′*^{ℓ} ↦ *μ*^{ℓ}, then gives (6d).
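
The two-step resampling can be sketched numerically. The spatial step below uses the weights *w*_{j} = Σ_{i∈I} *w*_{i}π(*x*_{i}, *Y*_{j}) from the text. Everything else is an assumption made for illustration: the Gaussian form of π, its normalization into a partition of unity over the coarse sites, and the reading of the feature step as a weighted empirical distribution over discrete cell types. The kernels (6a), (6b) themselves are defined elsewhere in the paper and may differ.

```python
import math


def partition_weights(x, sites, bandwidth):
    """pi(x, Y_j): Gaussian weights over coarse sites, normalized to sum to 1.

    Assumes x lies close enough to some site that the weights do not all
    underflow to zero.
    """
    raw = [math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2)) for y in sites]
    total = sum(raw)
    return [r / total for r in raw]


def resample(particles, sites, bandwidth):
    """Collapse fine-scale particles onto coarse lattice sites.

    particles: list of (weight, position, cell_type).
    Returns (coarse_weights, feature_laws), where feature_laws[j] maps each
    cell type to its empirical probability at site j.
    """
    coarse_weights = [0.0] * len(sites)
    type_mass = [dict() for _ in sites]
    for weight, x, cell_type in particles:
        for j, pi in enumerate(partition_weights(x, sites, bandwidth)):
            # Spatial step: w_j = sum_i w_i * pi(x_i, Y_j).
            coarse_weights[j] += weight * pi
            # Feature step (assumed form): accumulate mass per cell type.
            type_mass[j][cell_type] = type_mass[j].get(cell_type, 0.0) + weight * pi
    feature_laws = [
        {t: m / coarse_weights[j] for t, m in type_mass[j].items()}
        for j in range(len(sites))
    ]
    return coarse_weights, feature_laws


# Hypothetical 1D example: five cells of two types, two coarse sites.
particles = [
    (1.0, 0.1, "A"),
    (1.0, 0.2, "A"),
    (1.0, 0.3, "B"),
    (1.0, 1.9, "B"),
    (1.0, 2.1, "B"),
]
sites = [0.0, 2.0]
coarse_weights, feature_laws = resample(particles, sites, bandwidth=0.5)
```

With a partition of unity the coarse weights sum to the fine-scale total, and each coarse site carries a probability law on cell types rather than the identity of any individual cell.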

### 4.3 Gaussian Kernel Varifold Norm

Our varifold norm construction models the measures as elements of a Hilbert space *W** which is dual to an RKHS *W* with kernel *K*_{w}. We introduce the dual bracket notation pairing *h* ∈ *W* with *μ* ∈ *W**. The norms are generated by integrating against the kernel according to (11), written with the dual bracket; the multi-scale norm is assembled from the norms at each scale.

To ensure the brain measures are elements of *W** dual to the RKHS *W*, the kernel *K*_{w} is chosen so that *W* embeds densely and continuously in the bounded continuous functions; the signed measure spaces of brains are then continuously embedded in the dual space *W**. The Gaussian kernel (21) satisfies this condition, with the kernel taken as separable Gaussians and |·| the Euclidean distance.

For measures *μ*_{0}, *μ*_{1} with non-overlapping particle indices *i* and *j*, the norm-square of the difference expands into sums of kernel evaluations over pairs of particles.

For data carrying position information but no feature values (such as tau tangle locations), each *f*_{i}, *f*_{j} is constant and the resulting exponential terms are all 1.
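
A minimal sketch of the norm computation follows, assuming an unnormalized separable kernel of the form exp(−|*x* − *x′*|²/2σ_x²) · exp(−|*f* − *f′*|²/2σ_f²). The bandwidth names `sigma_x`, `sigma_f` and the example data are hypothetical, and the exact normalization of (21) may differ. The sketch also illustrates that the two measures need not have the same number of particles.

```python
import math


def gaussian(u, v, sigma):
    """exp(-|u - v|^2 / (2 sigma^2)) for vectors given as tuples."""
    squared = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-squared / (2.0 * sigma ** 2))


def inner_product(mu, nu, sigma_x, sigma_f):
    """Pairing of two particle measures: double sum of weights times kernel."""
    return sum(
        w * v * gaussian(x, y, sigma_x) * gaussian(f, g, sigma_f)
        for w, x, f in mu
        for v, y, g in nu
    )


def norm_square_difference(mu0, mu1, sigma_x, sigma_f):
    """||mu0 - mu1||^2 = <mu0, mu0> - 2 <mu0, mu1> + <mu1, mu1>."""
    return (
        inner_product(mu0, mu0, sigma_x, sigma_f)
        - 2.0 * inner_product(mu0, mu1, sigma_x, sigma_f)
        + inner_product(mu1, mu1, sigma_x, sigma_f)
    )


# Two hypothetical brains with different particle counts.
# Each particle is (weight, position, feature); the constant feature (0.0,)
# models position-only data, so the feature kernel always evaluates to 1.
mu0 = [
    (1.0, (0.0, 0.0), (0.0,)),
    (1.0, (1.0, 0.0), (0.0,)),
]
mu1 = [
    (0.5, (0.0, 0.0), (0.0,)),
    (0.5, (0.0, 0.1), (0.0,)),
    (1.0, (1.0, 0.0), (0.0,)),
]

distance_to_self = norm_square_difference(mu0, mu0, sigma_x=0.5, sigma_f=1.0)
distance_to_other = norm_square_difference(mu0, mu1, sigma_x=0.5, sigma_f=1.0)
```

Because the features are constant, changing `sigma_f` leaves both values unchanged, which is the position-only case described above.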

### 4.4 Geodesics on the Group and the Riemannian Metric

The diffeomorphism group acts on the hierarchy *φ* · *μ* component-wise, Eqn. (7b), with the multi-scale group taken as the product group, whose elements *φ* ∈ *G*_{k} satisfy the law of composition component-wise. The group *G*_{k} supporting *k*-derivatives of the diffeomorphisms builds from a space of *k*-times continuously differentiable vector fields vanishing at infinity, together with their partial derivatives of order *p* ⩽ *k*, intersected with the diffeomorphisms with 1-derivative.

Dynamics occurs via the group action generated as a dynamical system in which the multi-scale control flows the hierarchy *t* ↦ *φ*_{t} satisfying (8a). The control is in the product of the spaces *V*_{ℓ}, each an RKHS with norm-square selected to control the smoothness of the vector fields. The hierarchy of spaces is organized as a sequence of continuous embeddings, with an additional layer containing the others, defined as a space of *m*-times continuously differentiable vector fields vanishing at infinity, as well as all their partial derivatives of order *p* ⩽ *m*.

The hierarchy is connected via successive refinements *u*^{ℓ} = *u*^{ℓ−1} + *v*^{ℓ}, *u*^{0} = *v*^{0}, expressed via the continuous linear operator *A* : *V* → *V* with *v* = *Au*. The control process *u*. = (*u*_{t}, *t* ∈ [0, 1]) ∈ *L*^{2}([0, 1], *V*) has finite square-integral, giving the total energy of the control.

Optimal curves which minimize the integrated energy between any two fixed boundary conditions (BC) *φ*_{0} = **Id** and *φ*_{1}, the latter accessible with a path of finite energy, extend the LDDMM setting [23] to a hierarchy of diffeomorphisms and describe a geodesic for an associated Riemannian metric [24] on *G*_{k}; this is the variational problem (22).

Existence of solutions for minimizers over *u* of (22), when the energy is finite, can be established when *m* ⩾ *k* ⩾ 1.
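
The successive refinement relation and the operator *A* can be illustrated on sampled values. In this sketch each field is replaced by its values at three sample points, which is an illustrative simplification (the fields in the text are elements of RKHSs), and the function names are hypothetical.

```python
def refine(increments):
    """Build controls from increments: u^0 = v^0, u^l = u^(l-1) + v^l."""
    controls = []
    running = [0.0] * len(increments[0])
    for v in increments:
        running = [r + c for r, c in zip(running, v)]
        controls.append(running)
    return controls


def difference_operator(controls):
    """A: u -> v, with v^0 = u^0 and v^l = u^l - u^(l-1)."""
    increments = [list(controls[0])]
    for coarse, fine in zip(controls, controls[1:]):
        increments.append([f - c for f, c in zip(fine, coarse)])
    return increments


# Hypothetical increments at three scales, sampled at three points:
# a coarse uniform motion plus two finer corrections.
v = [
    [1.0, 1.0, 1.0],
    [0.25, -0.25, 0.0],
    [0.0, 0.125, -0.125],
]

u = refine(v)                      # controls at each scale
recovered = difference_operator(u)  # applying A returns the increments
```

The coarse layer carries the bulk motion and each finer layer adds a correction; applying *A* to the controls returns the per-scale increments.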

### 4.5 Hamiltonian Control Generating the Geodesics

The Hamiltonian method reduces the parameterization of the vector field to the dynamics of the particles that encode the flow of states (13a). We write the dynamics explicitly as a linear function of the control:

We write the flow of the measure with .

The control problem satisfying (12) reparameterized in the states becomes, for *α* > 0:

Hamiltonian control introduces the co-states with Hamiltonian

Under the smoothness assumption of Appendix A, the Pontryagin maximum principle [21] gives the optimal control, satisfying for every *ℓ*:

The first specifies the state dynamics of Eqn. (23). The second gives the differential equation that the co-state satisfies (see Eqn. (28a)). The third gives the optimal control equation (14a). See Appendix A for the proof.

The co-state integral equation (14b) follows from the fact that it satisfies the co-state differential equation of Eqn. (28a) (see Statement 2 of Appendix B). The fact that the endpoint term *q* ↦ *U*(*q*) added to the Hamiltonian is continuously differentiable implies that the solution to the differential equation is absolutely integrable.
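
To make the state and co-state structure concrete, the following sketch integrates the classical single-scale landmark Hamiltonian *H*(*x*, *p*) = ½ Σ_{i,j} *p*_{i}*p*_{j}*K*(*x*_{i} − *x*_{j}) in one dimension. This is the standard LDDMM landmark case, swapped in for illustration; it is not the multi-scale measure Hamiltonian of this section, which also carries weights and features. The Gaussian kernel, its width, the Runge-Kutta integrator, and the initial data are all assumptions.

```python
import math

SIGMA = 1.0  # kernel width, an illustrative choice


def kernel(r):
    """Gaussian kernel K(r) = exp(-r^2 / (2 sigma^2))."""
    return math.exp(-r * r / (2.0 * SIGMA ** 2))


def hamiltonian(x, p):
    """H(x, p) = 1/2 sum_ij p_i p_j K(x_i - x_j)."""
    n = len(x)
    return 0.5 * sum(
        p[i] * p[j] * kernel(x[i] - x[j]) for i in range(n) for j in range(n)
    )


def vector_field(x, p):
    """Hamilton's equations: dx/dt = dH/dp and dp/dt = -dH/dx."""
    n = len(x)
    dx = [sum(kernel(x[i] - x[j]) * p[j] for j in range(n)) for i in range(n)]
    dp = [
        sum(
            p[i] * p[j] * (x[i] - x[j]) / SIGMA ** 2 * kernel(x[i] - x[j])
            for j in range(n)
        )
        for i in range(n)
    ]
    return dx, dp


def rk4_step(x, p, h):
    """One classical Runge-Kutta step for the coupled (x, p) system."""

    def advance(base, slope, scale):
        return [b + scale * s for b, s in zip(base, slope)]

    k1x, k1p = vector_field(x, p)
    k2x, k2p = vector_field(advance(x, k1x, h / 2), advance(p, k1p, h / 2))
    k3x, k3p = vector_field(advance(x, k2x, h / 2), advance(p, k2p, h / 2))
    k4x, k4p = vector_field(advance(x, k3x, h), advance(p, k3p, h))
    new_x = [
        a + h / 6 * (b + 2 * c + 2 * d + e)
        for a, b, c, d, e in zip(x, k1x, k2x, k3x, k4x)
    ]
    new_p = [
        a + h / 6 * (b + 2 * c + 2 * d + e)
        for a, b, c, d, e in zip(p, k1p, k2p, k3p, k4p)
    ]
    return new_x, new_p


def shoot(x0, p0, steps=100):
    """Integrate states and co-states from t = 0 to t = 1."""
    x, p = list(x0), list(p0)
    h = 1.0 / steps
    for _ in range(steps):
        x, p = rk4_step(x, p, h)
    return x, p


# Hypothetical initial particle positions and co-states.
x0 = [0.0, 1.0, 2.5]
p0 = [1.0, -0.5, 0.25]
x1, p1 = shoot(x0, p0)
energy_start = hamiltonian(x0, p0)
energy_end = hamiltonian(x1, p1)
```

In this reduced setting the velocity field is determined entirely by the particle positions and co-states, and the Hamiltonian stays constant along the flow up to integration error, which makes a convenient sanity check.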

### 4.6 Gradients of the Endpoint Varifold Matching Norm

The gradients of (15b) are efficiently rewritten using the state *q*_{t} = (*x*_{i,t}, *W*_{i,t})_{i∈I} to define the norm-square in terms of a function *h*_{q} that is continuously differentiable in *x* and bounded.

We take the variation, varying each term in turn, with the dependence on scale implied.

These represent the gradients of (15b). The matching condition has smooth gradients.

The tissue continuum of Computational Anatomy has state *q*_{t} ≔ (*φ*_{t}, *W*_{t} = *W*_{0}|*dφ*_{t}|), which parameterizes the measures.

The average of *h*_{q} over the feature space determines the boundary term variation.

With *q*_{1} = (*φ*_{1}, *w*_{1}), take the variation *φ*_{1} → *φ*_{1}(*ε*) = *φ*_{1} + *εψ*^{φ}, *w*_{1} → *w*_{1}(*ε*) = *w*_{1} + *εψ*^{w}. This gives the gradients of the matching endpoint, Eqn. (17). We note that computing the variation requires *U*, as a function of *φ*, to be continuously differentiable, which holds when *k* ⩾ 2.
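
As a hedged illustration of these endpoint gradients, the sketch below treats position-only data in one dimension, so that the feature kernel is identically 1, and uses an unnormalized Gaussian spatial kernel. It computes the gradients of the endpoint norm-square with respect to particle positions and weights and compares them against central finite differences. The kernel width, the example data, and the function names are assumptions; the feature terms of (15b) and (17) are not shown.

```python
import math

SIGMA = 0.5  # spatial kernel width, an illustrative choice


def kernel(r):
    """Gaussian kernel K(r) = exp(-r^2 / (2 sigma^2))."""
    return math.exp(-r * r / (2.0 * SIGMA ** 2))


def kernel_derivative(r):
    """Derivative of the Gaussian kernel with respect to r."""
    return -r / SIGMA ** 2 * kernel(r)


def endpoint_cost(x, w, y, v):
    """U = || sum_i w_i delta_{x_i} - sum_j v_j delta_{y_j} ||^2."""
    source = sum(
        wi * wk * kernel(xi - xk) for xi, wi in zip(x, w) for xk, wk in zip(x, w)
    )
    cross = sum(
        wi * vj * kernel(xi - yj) for xi, wi in zip(x, w) for yj, vj in zip(y, v)
    )
    target = sum(
        vj * vk * kernel(yj - yk) for yj, vj in zip(y, v) for yk, vk in zip(y, v)
    )
    return source - 2.0 * cross + target


def endpoint_gradients(x, w, y, v):
    """Analytic gradients of U with respect to positions x and weights w."""
    grad_x, grad_w = [], []
    for xi, wi in zip(x, w):
        # Difference of the two kernel-smoothed densities, evaluated at x_i.
        field = sum(wk * kernel(xi - xk) for xk, wk in zip(x, w)) - sum(
            vj * kernel(xi - yj) for yj, vj in zip(y, v)
        )
        # Spatial derivative of that difference at x_i.
        slope = sum(wk * kernel_derivative(xi - xk) for xk, wk in zip(x, w)) - sum(
            vj * kernel_derivative(xi - yj) for yj, vj in zip(y, v)
        )
        grad_w.append(2.0 * field)
        grad_x.append(2.0 * wi * slope)
    return grad_x, grad_w


def finite_difference(function, values, index, step=1e-6):
    """Central finite difference of function in coordinate index."""
    up = list(values)
    up[index] += step
    down = list(values)
    down[index] -= step
    return (function(up) - function(down)) / (2.0 * step)


# Hypothetical deformed template (x, w) and target (y, v).
x = [0.0, 0.4, 1.1]
w = [1.0, 2.0, 0.5]
y = [0.1, 0.9]
v = [1.5, 1.5]

grad_x, grad_w = endpoint_gradients(x, w, y, v)
numeric_x = [
    finite_difference(lambda xs: endpoint_cost(xs, w, y, v), x, i)
    for i in range(len(x))
]
numeric_w = [
    finite_difference(lambda ws: endpoint_cost(x, ws, y, v), w, i)
    for i in range(len(w))
]
```

The weight gradient is twice the difference between the two kernel-smoothed densities evaluated at the particle, and the position gradient is the particle weight times twice the spatial derivative of that same difference.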

## Author Contributions

MM and AT designed the theoretical framework and DT built the computational implementation. All authors contributed to writing and editing the manuscript.

## Funding

This work was supported by the National Institutes of Health (NIH) (www.nih.gov) grants R01EB020062 (MM), R01NS102670 (MM), U19AG033655 (MM), and R01MH105660 (MM), the National Science Foundation (NSF) (www.nsf.gov) 16-569 NeuroNex contract 1707298 (MM), the Computational Anatomy Science Gateway (DT and MM) as part of the Extreme Science and Engineering Discovery Environment (XSEDE Towns et al., 2014), which is supported by the NSF grant ACI1548562, Johns Hopkins University Alzheimer’s Disease Research Center with NIH grant P50AG05146, the Dana Foundation’s (www.dana.org) clinical neuroscience research program, and the Kavli Neuroscience Discovery Institute (kavlijhu.org) supported by the Kavli Foundation (www.kavlifoundation.org) (DT, MM, and JT).

## Conflict of Interest

MM owns a founder share of Anatomy Works with the arrangement being managed by Johns Hopkins University in accordance with its conflict of interest policies. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

## Data Availability

The major contribution of this work is a mathematical and computational framework for modeling hierarchical neuroimaging data. Specific datasets shown in examples are for illustrative purposes, but can be made available upon request.

## A Hamiltonian Control Statement

We take our RKHSs *V*_{ℓ}, *ℓ* < *ℓ*_{max}, with the operators *L*^{ℓ} : *V*_{ℓ} → *V*_{ℓ}* defining the isometries that determine the inner products 〈·,·〉_{*V*_{ℓ}}.

*We assume the smoothness hypothesis holds with m ⩾ k + 2. If u. is a solution of the optimal control problem (24), then there exists a time-dependent co-state (t ↦ P_{t}) such that (28a) holds for any i ∈ I_{ℓ} and for all ℓ.*

*The optimal control satisfies (28b) with v = Au, where ∇_{1}g(a, b) denotes the gradient of g^{ℓ} with respect to the first variable a*.

*Proof*. Under the assumption, the map (*u*, *q*) ↦ *ξ*_{q}(*u*) is sufficiently regular for standard results of optimal control theory to apply, and the Pontryagin maximum principle [21] gives the optimality conditions.

Taking the variation of the Hamiltonian with respect to the co-state at scale *ℓ* gives the first equation for the state velocity (28a).

Taking the variation of the Hamiltonian with respect to the state at scale *ℓ* gives the second two equations of (28a). To calculate the variation with respect to the control, define *u*^{ℓ}(*ε*) = *u*^{ℓ} + *εψ*^{u}, implying *v*^{ℓ+1}(*ε*) = *v*^{ℓ+1} − *εψ*^{u} and *v*^{ℓ}(*ε*) = *v*^{ℓ} + *εψ*^{u} for *ψ*^{u} ∈ *V*_{ℓ}. This yields (31).

After summation of (31) for *ℓ* ⩾ *ℓ*_{0}, we obtain (32).

Now, for any *x*, *α* ∈ ℝ^{d}, consider the element of the RKHS that represents evaluation at *x* in the direction *α*. The reproducing property, together with (32), then gives the first equality of (28b).

Now since *v* = *Au*, we deduce the equality for *v* given in (28b).

## B Hamiltonian co-state momentum integrable dynamics

We omit the superscripts *ℓ* below since co-states and states and flows are at the same scale.

*Assume q ↦ U(q) is continuously differentiable in q; then the co-state integral equations (14b) flowing from t = 1 solve the Hamiltonian differential equations of* (28a); *the integral equations flowing from t = 0 satisfy*:

*Proof*. First take one of the co-state integral equations and show that it satisfies (28a), using *W*_{i,t} = *w*_{i}|*dφ*_{t}|(*x*_{i}) and *x*_{i,t} = *φ*_{t}(*x*_{i}).

The other co-state also satisfies (28a); to see this, rewrite the integral solution using *φ*_{t,s} ≔ *φ*_{s} ∘ *φ*_{t}^{−1}.

Differentiating requires the identity (see below): where the last equality uses (34), . The remaining identity follows :

We point out that the optimal control (14a) is written in the endpoint with the boundary conditions (14b). Using the form of gives in the boundary *t* = 1 satisfies the differential equation. For then constancy with gives the endpoint:

## Footnotes

We have updated the manuscript to improve the clarity of the mathematical theory and to include analysis of additional datasets. We have also formatted the paper to meet a specific journal's requirements.