## Abstract

We demonstrate a statistical modeling technique to recognize T cell responses to different external environmental conditions using membrane distributions of T cell receptors. We transformed fluorescence images of T cell receptors from each T cell into estimated model parameters of a partial differential equation. The model parameters enabled the construction of an accurate classification model using linear discrimination techniques. We further demonstrated that the technique successfully differentiated immobilized T cells on non-activating and activating surfaces. Compared to machine learning techniques, our statistical technique relies upon robust image-derived statistics and achieves effective classification with a limited sample size and a minimal computational footprint. The technique provides an effective strategy to quantitatively characterize the global distribution of membrane receptors and other intracellular proteins under various physiological and pathological conditions.

## Introduction

The plasma membrane has a specific protein composition that plays pivotal roles in a wide variety of cellular processes, including receptor-mediated signaling, drug interactions, endocytosis and transport, and cellular communication. Molecular clustering of plasma membrane proteins provides a means to modulate intracellular signal transduction.^{1–4} Recent studies utilizing advanced microscopy have revealed new insights into clusters of membrane proteins and their distinct roles in signaling.^{1,5–15} To extract the nanoscopic structural information, various statistical clustering algorithms have been developed to help circumvent the artifacts that arise from single-molecule localization microscopy. Conventional grouping strategies, such as Ripley’s K-function and density-based spatial clustering techniques, can be biased toward densely labeled regions or the repeated appearance of single molecules across multiple frames.^{16} Pair correlation analysis overcomes the stochastic variations of the fluorophores.^{17,18} Although these techniques have advanced the quantitative characterization of the spatial organization of membrane proteins on the nanometer scale, there remains a lack of strategies for characterizing the global protein distribution.

Increasing evidence suggests that the mesoscale organization of intracellular proteins contains “fingerprint information” about the cellular states.^{19} The spatial organization of membrane proteins may provide a means to infer the initial cellular response to the external environment. In this study, we developed a classification technique by using T cells whose membrane receptors have been known to correlate with the immune response. T cell receptor (TCR) membrane domains^{9,10,20} and monomers^{21} on quiescent T cells, as well as the aggregation of TCRs on activated T cells *in vitro*^{22–24} and *in vivo*^{25} have been observed. Single-molecule tracking experiments have revealed the role of signal dispersion and amplification of TCR at the plasma membrane.^{26} In addition, multi-dimensional analysis of TCR dynamics using advanced lattice light-sheet microscopy has also enabled the prediction of T-cell signaling states.^{27}

Specifically, we sought to evaluate whether steady-state TCR images from standard fluorescence microscopy can be used to differentiate T cells exposed to different external environments. A popular solution to achieve this capability is through data-driven artificial intelligence (AI) techniques, such as deep learning.^{28,29} Computer-aided assessments of bioimages have enhanced our understanding of non-visual image differences for diagnostic and prognostic purposes.^{30} Despite these new promises, AI techniques face formidable challenges. Controversies arise regarding a lack of transparency in the “black box” of AI algorithms.^{31} Current AI-based image class discrimination techniques rarely result in model parameters whose statistical significance can rigorously support the image model hypothesis.^{32} AI techniques also demand long processing time and operator attention to achieve quantitative image modeling.^{33}

To address these limitations, we have developed an analytical image analysis technique based on partial differential equation (PDE) image models, linear class discrimination of the estimated model parameters, and logistic regression estimation of individual cell class probabilities. We termed the technique Statistical Classification Analyses of Membrane Protein Images (SCAMPI). We demonstrate that non-visual cues from diffraction-limited fluorescence images can be harnessed to extract the characteristic information pertaining to the specific surface condition a T lymphocyte interacts with. We realized SCAMPI using two discrimination models: Fisher Linear Discrimination (FLD) and Logistic Regression (LR). SCAMPI eliminates the need for computationally expensive algorithms. Moreover, the models generated by SCAMPI carry image-derived information and can be used to investigate a wide range of membrane protein organizations.

## Results

### Image model construction

Previous studies have validated PDE models of images.^{34–39} We implemented a general ordinary least squares (OLS) modeling strategy: *Image* = *Model* + *Residuals*, where we identified the best *Model* by minimizing the variance of *Residuals*. Specifically, we estimated digital image models as vectorized partial difference equations (PdEs) subject to OLS parameter estimation. The rationale for this strategy is that non-degenerate images have statistically significant, sample-based, two-dimensional autocorrelation functions, as do linear, stationary partial difference equations.^{30,36} **Fig. 1a** illustrates that a fluorescence image can be modeled as a combination of pixel-shifted images in coordinates *x, y*, and *x & y*. Similar properties also hold for stationary PdEs. Though such models are mathematically rigorous, their parameters have typically required machine learning techniques for estimation.^{36} We overcame this disadvantage with a simple image matrix-to-vector transformation (in column-major order) that results in an analytic PdE parameter estimation procedure (**Fig. 1b**). Such a strategy was initially successful in medical imaging, such as a dementia discrimination application using MRI brain scans.^{36} For modeling a typical fluorescence image, we constructed an image model as a general linear PdE with constant coefficients (**Supplementary Note 1**). The OLS estimates of the model parameters are obtained and their significance is evaluated by Student-*t* scores. If significant, the corresponding model parameters are retained. If not, the spatial lags in pixels are increased to reconstruct an alternative image model (**Fig. 1b**). **Fig. 1c** shows the intensity profile of a typical fluorescence image of the TCRs in a T cell (170-by-170 pixels). **Fig. 1d** demonstrates the corresponding image model constructed from four spatial lags (166-by-166 pixels).
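The vectorize-then-regress step can be illustrated with a minimal Python sketch of the single-lag, three-parameter model. This is not the MATLAB implementation used in the study: the function name `fit_one_lag_model`, the synthetic stand-in image, and the choice to trim the first row and column are all assumptions made for illustration.

```python
import numpy as np

def fit_one_lag_model(image):
    """OLS fit of v[i,j] = b01*v[i,j-1] + b10*v[i-1,j] + b11*v[i-1,j-1] + e[i,j].

    Hypothetical helper: drops the first row and column so that every
    retained pixel has all three lagged neighbours available.
    """
    v = np.asarray(image, dtype=float)
    q = v[1:, 1:].ravel(order="F")      # response: column-major vec of the image
    z1 = v[1:, :-1].ravel(order="F")    # image shifted by one pixel in j
    z2 = v[:-1, 1:].ravel(order="F")    # image shifted by one pixel in i
    z3 = v[:-1, :-1].ravel(order="F")   # image shifted by one pixel in both
    Z = np.column_stack([z1, z2, z3])   # design matrix of lagged images
    b, *_ = np.linalg.lstsq(Z, q, rcond=None)
    residuals = q - Z @ b
    return b, residuals

# Synthetic stand-in for a 170-by-170 fluorescence image (not real data).
rng = np.random.default_rng(0)
demo = rng.random((170, 170))
b, res = fit_one_lag_model(demo)
print(b.shape, res.size)  # (3,) 28561
```

Under this trimming assumption a 170-by-170 image leaves 169-by-169 pixels for the regression, which is why the residual vector has 28,561 entries.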

### Linear discrimination using model parameters

Model parameters obtained from the PDE image models are used to achieve class discrimination by the Fisher linear discriminant method.^{40} FLD projects individual parameter vectors onto a line so as to maximize the separation between projected parameter vectors while minimizing the variation within the projected vectors (**Fig. 2a** and **Supplementary Note 2**). We applied FLD to model parameters from a training image data set by grouping images into two classes: *class 0* (activation) and *class 1* (non-activation). Test images, not seen by the classification model, were used to evaluate the precision of class discrimination (**Fig. 2b**).
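A minimal Python sketch of a two-class Fisher projection is given here for orientation. It uses the textbook closed form, in which the projection direction is proportional to the inverse within-class scatter applied to the difference of class means, rather than the eigenvector formulation of Supplementary Note 2. The function names, the midpoint threshold, and the synthetic parameter vectors are assumptions for illustration only.

```python
import numpy as np

def fisher_direction(B0, B1):
    """Two-class Fisher direction (unit length) and a midpoint threshold.

    B0, B1: arrays of shape (images, parameters), one row per image.
    """
    B0 = np.asarray(B0, dtype=float)
    B1 = np.asarray(B1, dtype=float)
    mu0, mu1 = B0.mean(axis=0), B1.mean(axis=0)
    # Pooled within-class scatter matrix.
    Sw = (B0 - mu0).T @ (B0 - mu0) + (B1 - mu1).T @ (B1 - mu1)
    w = np.linalg.solve(Sw, mu1 - mu0)
    w /= np.linalg.norm(w)
    # Assumed decision rule: midpoint between the projected class means.
    threshold = 0.5 * (mu0 @ w + mu1 @ w)
    return w, threshold

def classify(B, w, threshold):
    """Return 1 where the projection exceeds the threshold, else 0."""
    return (np.asarray(B, dtype=float) @ w > threshold).astype(int)

# Synthetic parameter vectors standing in for OLS image parameters.
rng = np.random.default_rng(0)
train0 = rng.normal(0.0, 1.0, size=(40, 3))
train1 = rng.normal(3.0, 1.0, size=(40, 3))
w, thr = fisher_direction(train0, train1)
test1 = rng.normal(3.0, 1.0, size=(10, 3))
print(classify(test1, w, thr))
```

The projection vector is learned from training rows only and then applied unchanged to held-out rows, mirroring the train/test split described above.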

### Development of SCAMPI

To collect membrane protein images for developing SCAMPI, we obtained total internal reflection fluorescence (TIRF) images of TCRs from CD3-EGFP Jurkat T cells on two types of glass surfaces (**Fig. 2c**). *Class 1* represented TIRF images acquired from T cells on a non-activating surface coated with poly-L-lysine (PLL). The electrostatic interactions between positively charged PLL and negatively charged cell membranes facilitate cell attachment to the glass surface for imaging. *Class 0* represented TIRF images acquired from T cells on an activating surface coated with the OKT-3 antibody. OKT-3 cross-linked the CD3 molecule of the TCR and induced T cell activation. Images were collected with a 100×/1.49 TIRF objective and a Photometrics Prime 95B sCMOS camera with an image pixel size of 110 nm. **Fig. 2d** shows representative TCR images from the non-activating PLL surface, while **Fig. 2e** shows representative images from the OKT-3-coated activating surface.

For the FLD-based SCAMPI, the 97 active cell images and 100 non-active cell images were randomly divided into training data sets of 80 and 78 images and prediction (test) data sets of 20 and 19 images, representing *class 0* and *class 1*, respectively. For each image, we obtained model parameters and computed their Student *t* statistics using the White^{41} parameter covariance matrix estimate corrected for heteroskedastic and autocorrelated residuals; see **Fig. 1b**. The training and testing regression models used spatial lags of six pixels, which provided 49 OLS model parameters for each image. This represented going around the loop in **Fig. 1b** six times (**Supplementary Note 1**).

With six spatial lags, each OLS regression draws on 26,896 (164×164) pixels. Each of the 49 OLS model parameters is therefore supported by roughly 549 pixels (26,896/49), so overfitting is not a concern. For the 80-78 cell training FLD, classification accuracy was 88.6%. The 49-element projection vector from that FLD projected the test image parameters as shown in **Fig. 3a** with a 94.95% classification accuracy. The total runtime for this experiment was 21.89 seconds on a laptop computer.
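The pixel counts quoted for the lagged models can be checked with a short calculation. The sketch below assumes that a model with maximum lag *r* discards *r* rows and *r* columns of the image; this trimming rule is an assumption chosen because it reproduces the 166-by-166 and 164-by-164 sizes stated in the text, and `retained_pixels` is a hypothetical helper name.

```python
def retained_pixels(rows, cols, max_lag):
    """Pixels left for the regression after trimming max_lag rows and columns."""
    return (rows - max_lag) * (cols - max_lag)

# 170-by-170 image with one, four, and six spatial lags.
for lag in (1, 4, 6):
    print(lag, retained_pixels(170, 170, lag))
# 1 28561
# 4 27556
# 6 26896
```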

### Estimating cell class probabilities and linear discriminants

The effectiveness of FLD-based SCAMPI is attributed to the fact that model discrimination parameters depend on both the spatial distribution and fluorescent intensity of individual TCR clusters. The FLD projection of an individual cell represents an optimal weighted average (FLD eigenvector weights) of that cell’s OLS parameters. Nevertheless, the FLD projection does not represent the probability that a particular cell belongs to a particular class. Such information becomes useful for evaluating the T cell response when an environmental variable changes, such as the ligand concentration and composition. Of note is that TCRs recognize a single agonist peptide embedded in the major histocompatibility complex and T cells can be activated by a few peptides.^{42} The mechanism of signal dispersion and amplification remains a focus of current imaging investigations. To develop a discrimination technique by which an individual cell class probability can be estimated, we further developed SCAMPI to include logistic regression (LR).

LR as a discrimination tool in this application uses an image’s OLS parameters to estimate the probability that an image belongs to a specific class.^{40} LR validation constraints differ from those of FLD because LR is a nonlinear regression. In addition, LR provides class probabilities for individual test subjects, a capability that FLD and machine learning classification techniques lack. **Fig. 3b** shows 97.5% accurate FLD projections using a 20-20 small sample size, and **Fig. 3c** indicates LR class probability estimates for the same 20 random images chosen from each class of the 100-97 image dataset (**Supplementary Note 3**). Of note is that, in the small sample regime, FLD and LR achieved similar classification results. **Fig. 3d** captures the classification consistency of FLD projections and LR probabilities, thereby independently validating the FLD discrimination results.
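To make the class-probability step concrete, the following Python sketch fits a logistic regression by Newton-Raphson iterations on synthetic three-parameter vectors. It is an illustration under stated assumptions, not the procedure used in the study: the small ridge penalty (added so the estimates stay finite if the two classes happen to be perfectly separable), the fixed iteration count, and the names `fit_logistic` and `predict_proba` are all choices made here.

```python
import numpy as np

def fit_logistic(X, y, ridge=1e-2, n_iter=50):
    """Newton-Raphson fit of P(y=1|x) = sigmoid(phi0 + x @ phi).

    The ridge penalty (applied to all coefficients, including the
    constant) is an assumption of this sketch, not part of the paper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])  # prepend the constant term
    phi = np.zeros(A.shape[1])
    penalty = ridge * np.eye(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(A @ phi)))
        grad = A.T @ (y - p) - penalty @ phi           # penalized score
        hess = A.T @ (A * (p * (1.0 - p))[:, None]) + penalty
        phi = phi + np.linalg.solve(hess, grad)        # Newton update
    return phi

def predict_proba(X, phi):
    """Estimated probability that each row of X belongs to class 1."""
    A = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])
    return 1.0 / (1.0 + np.exp(-(A @ phi)))

# Synthetic 20-20 sample with three parameters per image (not real data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(20, 3)),
               rng.normal(1.0, 1.0, size=(20, 3))])
y = np.r_[np.zeros(20), np.ones(20)]
phi = fit_logistic(X, y)
print(np.round(predict_proba(X, phi), 2))
```

Each output value is a per-image class probability, which is the quantity an FLD projection alone does not provide.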

## Discussion

The successful class discrimination using 20 training images from each class demonstrates that SCAMPI is capable of accurate class discrimination using a small sample size. We attribute this capability to two salient factors. First, the vector transformation provides OLS regressions with a very large number of samples (the three-parameter model of **Fig. 3** has 9,520 samples per parameter), resulting in robust image-derived statistics. Second, OLS estimation optimally minimizes inter-system noise. Unlike machine learning techniques, the PDE image model carries information not only about the number of TCR clusters and their size and shape but also about the detailed image spatial structure. The latter contains characteristics of the spatial distribution of TCR clusters. Each training image positively enhances the class discrimination model. In our demonstration, as few as 20 images per class were found to be sufficient for achieving accurate class separation and probabilistic corroboration. Importantly, over-fitting to a specific classification model may occur and degrade class separation by SCAMPI. For example, a test evaluation of a 40-40 image sample yielded a 72.5% FLD accuracy against a 97.5% accuracy for the 20-20 image sample. Such statistical limits on fitting are not available in machine learning methods. As such, machine learning techniques require big data to construct an empirical discrimination model, in which image-derived statistics are quickly lost during model optimization.

In SCAMPI, we removed inter-system variations by using imaging data acquired from the same cell line, by the same imaging system, and under the same imaging conditions. Such a requirement is necessary because image model parameters are sensitive to the image format and quality. Unique characteristics related to the optical system, sample preparation, and data acquisition have been normalized within these imaging data. These characteristics include the point spread function of the optical system, higher-order optical aberrations, sample labeling densities, photophysical properties of different fluorescence labels, pixel size, and quantum efficiency of the detector camera, all of which play critical roles in the fluorescence imaging data. These inter-system variations represent a major risk of misclassification, and the noise they present can be minimized in variance by OLS estimation.

Through SCAMPI, we also revealed that fluorescence images of TCR contained “signature information” about the T-cell response. The clustering of TCRs is well known to correlate with the early signaling events during T cell activation.^{1,3,25,43} SCAMPI shows that non-visual cues from images of membrane receptors can be captured by statistical techniques and effectively utilized to characterize the cell response to the external environment. More importantly, such information can be extracted from standard fluorescence images with a relatively small test sample size. Combined with its small computational footprint, SCAMPI may find its way into clinical settings where potential treatment benefit can be evaluated based on the discrimination image model constructed from cells derived from responders and non-responders. SCAMPI may pave the way for improving the treatment response rate targeting membrane receptors, such as in immunotherapy.

SCAMPI can be readily applied to other membrane receptors and fluorescent labeling techniques. Moreover, SCAMPI is amenable to multiplexed imaging data. In this regard, SCAMPI can be developed using high-dimensional statistics. Coupled with the development of automated and highly multiplexed super-resolution imaging techniques,^{43} SCAMPI has the potential to reveal more complex and global protein interrelationships beyond colocalization and correlation analysis.

In summary, we report a linear discrimination technique, SCAMPI, to discriminate activated from non-activated T cells based on the spatial organization of T cell receptors. SCAMPI harnesses non-visual cues of fluorescence images and rapidly classifies cellular states with sample-derived statistics. Most importantly, SCAMPI is immune to the drawbacks of AI techniques. It represents a fresh approach to the “big data” challenge and potentiates the fluorescence image-based discovery of structural features related to the cellular states.

## Materials and Methods

### Cells and reagents

Jurkat E6–1 T cells that express CD3-EGFP were cultured in RPMI 1640 Medium (from Gibco, USA, CAT#: 11875119) supplemented with 10% Fetal Calf Serum (FCS) (from Gibco, USA, CAT#: 14190-149) in a humidified atmosphere at 37°C. Cells were incubated in an imaging buffer consisting of HBSS (from Life Technologies, USA, CAT#: 14175-095) supplemented with 1% FCS before fixation. Monoclonal antibody against CD3ε (clone: OKT3, CAT#: BE0001-2-25MG) was purchased from Bio X Cell, USA. Fixation buffer consisting of 4% paraformaldehyde (Alfa Aesar, USA, CAT#: 43368) and 0.1% glutaraldehyde (Electron Microscopy Sciences, USA, CAT#: 16100) was used to fix cells on the coated surfaces.

### Surface Preparation

Eight-well chambered cover glasses (Borosilicate sterile No 1.5, CAT# 155409, Lab-Tek) were cleaned with absolute ethanol and dH_{2}O, then incubated overnight at room temperature. Activating surfaces were produced by adding OKT3 antibody (200 μL) at a concentration of 1 μg/ml in PBS (from Gibco, USA) into a well. Poly-L-lysine (PLL) surfaces were produced by adding PLL (200 μL) at a concentration of 0.01% in H_{2}O (P8920 from Sigma-Aldrich, CAS#: 25988-63-0) into another well. Eight-well chamber slides containing OKT3 and PLL were incubated overnight at 37°C.

### Imaging TCR clusters

Supernatants of the wells containing OKT3 and PLL were decanted and cells (100k) were added to each well. The cells were incubated at 37°C for 8 minutes. After the incubation, cells were observed under a conventional microscope to confirm that they had attached to the surface. Supernatants of the OKT3- and PLL-coated wells were decanted and fixation buffer (250 μL) was added to the wells. The samples were incubated for 15 minutes at room temperature and then rinsed thoroughly with PBS.

### Total internal reflection fluorescence (TIRF) microscopy

TIRF microscopy experiments were performed on a Nikon Eclipse Ti2 inverted microscope equipped with a 100×/1.49 oil-immersion objective. For TIRF imaging, a 488 nm laser was used. Emission light was filtered using appropriate filter sets and recorded on a Prime 95B sCMOS camera with a pixel size of 110 nm in the image plane. Images of TCR clusters were acquired with 2.15 mW (15%, 488 nm) laser power at a 400 ms exposure time.

### SCAMPI Standard Model Statistics

FLD-based SCAMPI is an effective discriminator because model discrimination parameters depend on both the spatial distribution and fluorescent intensity of individual TCR clusters. However, the T cell response to environmental variations, such as the ligand concentration and composition, is of interest beyond discrimination, particularly if one attempts to regulate this response. In particular, T cells can recognize a single agonist peptide embedded in a major histocompatibility complex and can also be activated by a small number of peptides.^{42} Such multidimensional excitation sensitivity requires a standardized T cell model whose parameters capture cell response nuances to multiple excitations, are comparable across experiments, discriminate cell classes, and estimate individual cell probabilities of class membership.

The standardization proposed in the following model is based on the diffusive and advective cell structures found in the cell literature. For this purpose, we propose the PDE model in (1a), which is a temporal equilibrium form of a nonhomogeneous, hyperbolic PDE (**Supplementary Note 1**). Its digital, estimable form in (2a) clearly illustrates the model dependence on protein advection, parameters *β*_{0,1} and *β*_{1,0}, and diffusion, parameter *β*_{1,1}.

To meet the demands placed on (2a) as a T cell protein membrane model, it is necessary to restrict the number of images for estimating the *β*_{k,l} so that it remains an accurate discriminator and, as discussed below, simultaneously an accurate predictor of individual class probabilities.

Further, for comparative testing of large image sets for response homogeneity, the image support for (2a) should be a small fraction of the classes being tested.

A random selection of 20 active cell and 20 non-active cell images was used to estimate the *β*_{k,l} in (2a) as *b*_{k,l}. All 20-element *b* parameter vectors passed a Kolmogorov-Smirnov test for normality at 0.05 or better (**Table 1**).

**Fig. S1** is a typical normal distribution check for these estimates. The Student-*t* statistics in **Table 1** were computed using the White asymptotic parameter covariance matrix since OLS image residuals are frequently autocorrelated and heteroskedastic. Each image regression had 28,561 (169-by-169) pixels, so the White asymptotic matrix almost surely applied.
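A normality check of this kind can be sketched in Python by computing the Kolmogorov-Smirnov distance between the empirical distribution of a 20-element parameter vector and a normal distribution. The sketch below is illustrative only: the helper name `ks_normal_statistic` and the synthetic sample are assumptions, and because the mean and standard deviation are estimated from the same sample, standard Kolmogorov-Smirnov critical values are conservative for this statistic.

```python
import math
import numpy as np

def ks_normal_statistic(sample):
    """KS distance between the sample ECDF and a normal CDF fitted
    with the sample mean and (ddof=1) standard deviation."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    z = (x - x.mean()) / x.std(ddof=1)
    cdf = np.array([0.5 * (1.0 + math.erf(t / math.sqrt(2.0))) for t in z])
    upper = np.arange(1, n + 1) / n - cdf   # ECDF just after each point
    lower = cdf - np.arange(0, n) / n       # ECDF just before each point
    return float(max(upper.max(), lower.max()))

# Synthetic 20-element vector standing in for one estimated parameter.
rng = np.random.default_rng(0)
b_estimates = rng.normal(0.4, 0.05, size=20)
print(round(ks_normal_statistic(b_estimates), 3))
```

A smaller distance indicates closer agreement with normality; the decision threshold depends on the sample size and on the test variant used.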

## Author contributions

YSH and WDO conceived of the study. RM conducted fluorescence imaging. WDO performed image model construction and linear discrimination analyses. All authors contributed to the preparation of the manuscript.

## Competing interests

The authors declare no competing interests.

## Data availability

Image data and MATLAB codes are available upon request to the corresponding authors.

## Supplementary Information

### Note 1: Image modeling method, an example

We hypothesize the gray-scale pixel intensity value, *v*(*x, y*), of an image in Cartesian coordinates satisfies the PDE:
In (1a), *u*(*x, y*) is a zero-mean random noise variable to be minimized in variance to estimate the α parameters. This equation has a long history as a model for a wide range of images^{31,33} but also as an advective and diffusion model of particles coagulating over space and time.^{44} To estimate the α parameters, we approximate (1a) with a partial difference equation (PdE) on a grid indexed by *x* = *i*Δ*x*, *y* = *j*Δ*y* and approximate derivatives by backward differences. In our imaging experiments, Δ*x* = Δ*y* = 110 nm. For discrete images of unit-width pixels, (1a) becomes the matrix equation

$$v_{i,j} = \beta_{0,1}\, v_{i,j-1} + \beta_{1,0}\, v_{i-1,j} + \beta_{1,1}\, v_{i-1,j-1} + e_{i,j} \qquad (2a)$$

in which *e*_{i,j} is the spatially discrete version of *u*(*x, y*). The vector transform of a matrix sum is the sum of vector transforms. With **q** = *vec*(*v*_{i,j}), **z**_{1} = *vec*(*v*_{i,j−1}), **z**_{2} = *vec*(*v*_{i−1,j}), **z**_{3} = *vec*(*v*_{i−1,j−1}), and *vec*(*e*_{i,j}) = ε, (2a) becomes

$$\mathbf{q} = Z\boldsymbol{\beta} + \boldsymbol{\varepsilon}, \qquad Z = [\,\mathbf{z}_1 \;\; \mathbf{z}_2 \;\; \mathbf{z}_3\,] \qquad (3a)$$

in which the *q* vector represents the image to be modeled, *Z* is a design matrix of spatially lagged versions of **q**, *β*^{T} = [*β*_{0,1} *β*_{1,0} *β*_{1,1}], and ε is a zero-mean residual error vector whose variance is minimized by the OLS estimate *b* of *β*,

$$\mathbf{b} = (Z^{T}Z)^{-1}Z^{T}\mathbf{q} \qquad (4a)$$

*Zb* is the OLS estimate of the image and **q** − *Zb* is the estimated image model error.

For the model in (2a), image pixels must be sacrificed to make **q** and *Z* compatible for addition. This data loss is not usually significant; for example, the images in **Fig. 2** have the *samples per parameter estimated* in the OLS regression decrease from 9,633 to 9,408.

The Student *t* statistics are computed using the White parameter covariance matrix estimate corrected for heteroskedastic and autocorrelated residuals.^{41} We found it is common for OLS image models defined by *vec* transformations to exhibit heteroskedastic and autocorrelated residuals, that is, the random error terms are, in all probability, from different distributions and correlated. The White parameter covariance estimates are asymptotic results. With 28,224 degrees of freedom in this regression such asymptotic conditions surely prevail. The extraordinarily large Student *t* values reflect the large degrees of freedom per estimated parameter; typical of OLS image models.
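For readers unfamiliar with robust covariance estimates, the following Python sketch computes OLS coefficients and Student *t* scores from the basic White heteroskedasticity-consistent (HC0) sandwich estimator. It is a simplified illustration: HC0 corrects for heteroskedasticity only, so the autocorrelation correction described above is not included, and the function name `ols_with_white_t` and the synthetic data are assumptions.

```python
import numpy as np

def ols_with_white_t(Z, q):
    """OLS estimate with White (HC0) t scores.

    cov(b) = (Z'Z)^-1 Z' diag(e^2) Z (Z'Z)^-1, where e are OLS residuals.
    """
    Z = np.asarray(Z, dtype=float)
    q = np.asarray(q, dtype=float)
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    b = ZtZ_inv @ Z.T @ q
    e = q - Z @ b
    meat = Z.T @ (Z * (e ** 2)[:, None])   # Z' diag(e^2) Z without forming diag
    cov = ZtZ_inv @ meat @ ZtZ_inv
    t = b / np.sqrt(np.diag(cov))
    return b, t

# Synthetic regression with noise whose scale varies across samples.
rng = np.random.default_rng(0)
Z = rng.normal(size=(5000, 3))
noise = rng.normal(size=5000) * (0.5 + np.abs(Z[:, 2]))
q = Z @ np.array([0.6, 0.3, -0.2]) + noise
b, t = ols_with_white_t(Z, q)
print(np.round(b, 2), np.round(t, 1))
```

With tens of thousands of pixels per regression, even modest coefficients produce large *t* scores, which is the behavior noted above for OLS image models.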

The general linear, constant coefficient PdE of 2 independent variables has a discrete PdE representation:
A regression of the vector transform of (5a) can be shown to require (*p* + 1)(*q* + 1) OLS parameters. This general model of order *r* = *max*(*p, q*) of an *m* × *n* image requires (*r* + 1)(*m* + *n* − *r* − 1) pixels to be sacrificed as in (2a) above. Note that (2a) is (5a) for *p* = *q* = 1.

### Note 2: Fisher Linear Discriminant

Each image has an OLS-estimated vector of parameters. Recall that the Fisher Linear Discriminant (FLD) projects individual vectors onto a line so that the variation between the projected samples is maximized relative to the variation within the projected samples. To see how this is executed with OLS image parameters, let *B*_{1} and *B*_{2} be the class matrices of parameter vectors of *m*_{1} and *m*_{2} respective images. Each vector is of parameter size *n*, so the class matrices are *m*_{1} by *n* and *m*_{2} by *n*. Let *G*_{1} and *G*_{2} be the estimated covariance matrices of the *B* matrices and *G*_{p} the estimated covariance matrix of *B*_{p} = [*B*_{1} *B*_{2}]^{T}. Then the eigenvector **v**_{c} satisfying (6a) for the unique eigenvalue *λ* ≠ 0 is the optimal projection vector for discriminating activated from non-activated cells. (6a) is solved for **v**_{c}, which is then used to project the test class parameter vectors. The regression models were *pde(6)*, which amounted to 49 parameters per image, so **v**_{c} is a 49-element vector for projecting the test images.

The test result of **Fig. 3a** required 158 training set OLS models and 39 test set OLS models, each with 26,896 sample points, plus the solution of (6a) for **v**_{c}. This required 21.89 seconds of CPU time.

Discrimination efficacy increases with the number of parameters per image. However, constraints become tight for experiments with a relatively small number of class members. Equation (6a) only has a non-trivial solution for **v**_{c} if the number of parameters is less than or equal to the total number of class members less two.^{44} A second constraint is that, as *p* + *q* in (5a) increases, *G*_{1}, *G*_{2}, and *G*_{p} fail to be positive definite, implying the eigenvector solution of (6a) is no longer valid.^{44} For the training classes of 158 total cell image samples, 156 or fewer parameters per image is a generous constraint. But the *G* matrices in (6a) are not positive definite for *p* + *q* > 7. If there are 20 images per class, then 38 parameters per image becomes a tight constraint, in addition to that of *G* matrix positive definiteness. For such cases, logistic regression becomes a plausible alternative classification approach (**SI Note 3**).
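The two constraints described above can be checked numerically before attempting an FLD. The Python sketch below is a hypothetical pre-check, not part of the published workflow: it tests the parameter-count rule and uses the smallest eigenvalue of each sample covariance matrix, with an assumed relative tolerance, as a positive-definiteness test.

```python
import numpy as np

def is_positive_definite(G, rel_tol=1e-10):
    """True if the symmetric matrix G has all eigenvalues clearly above zero."""
    eig = np.linalg.eigvalsh(np.atleast_2d(G))
    return bool(eig.min() > rel_tol * max(eig.max(), 1.0))

def fld_constraints_ok(B1, B2):
    """Return (count_ok, pd_ok) for two class matrices of parameter vectors.

    count_ok: parameters per image <= total class members minus two.
    pd_ok: covariance matrices of B1, B2, and the stacked data are all
    positive definite.
    """
    B1 = np.asarray(B1, dtype=float)
    B2 = np.asarray(B2, dtype=float)
    n_params = B1.shape[1]
    n_members = B1.shape[0] + B2.shape[0]
    count_ok = n_params <= n_members - 2
    covariances = (np.cov(B1, rowvar=False),
                   np.cov(B2, rowvar=False),
                   np.cov(np.vstack([B1, B2]), rowvar=False))
    pd_ok = all(is_positive_definite(G) for G in covariances)
    return count_ok, pd_ok

# Synthetic 20-20 sample: 3 parameters pass, 49 parameters fail both checks.
rng = np.random.default_rng(0)
print(fld_constraints_ok(rng.normal(size=(20, 3)), rng.normal(size=(20, 3))))
print(fld_constraints_ok(rng.normal(size=(20, 49)), rng.normal(size=(20, 49))))
```

With 20 images per class, any covariance matrix estimated from more parameters than images is rank deficient, which is one way the positive-definiteness requirement fails in the small-sample regime.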

### Note 3: Class probability discrimination using Logistic Regression (LR)

Logistic regression as a discrimination tool estimates the probability that a given image belongs to a specific class and does so without FLD-type data constraints.^{40} Let *i* = 1, 2, …, 40 be an image index for 20 inactive cells and 20 active cells. The Bernoulli random variable *Y*_{i} is assumed to take the value *y*_{i} = 1 if image *i* is in the active class and 0 if in the non-active class. *p*_{i} is the probability that image *i* is in the *active cell class* conditioned on explanatory (independent) variables hypothesized to control class membership. Let the design matrix of explanatory variables for the LR be **X**, composed of *rows x*_{i}. Then the hypothesized LR model uses data on *y*_{i} and **X** to estimate *a parameter vector φ* in the logistic regression:

In (7a) we use for the *x*_{i} *rows*, one for each cell image, the estimated row vectors of the *B* parameter matrices from OLS regressions of the 40 cell images. The class identities are known for all images from the experiment producing the cell images: for non-active cells 1 through 20, the *y*_{i} data is 0, and active cells 21 through 40 have *y*_{i} = 1. *φ* is estimated by maximizing the likelihood function of the independent Bernoulli distribution for 40 samples; this is a nonlinear optimization.^{40}

A goodness of fit measure, analogous to R^{2} for OLS, has been, and is, controversial regarding the number of samples per parameter estimated in the *φ* vector.^{45} Further, unlike OLS, where there are numerous testable statistics available, in LR statistical significance is still mostly reliant on Monte Carlo simulation, so there are only a few robust tests to decide which variables to include in a regression. The Tjur^{46} Coefficient of Discrimination for the images used in **Fig. 3c** produced the statistics given in **Table S2**.

The parameter *φ*_{0} is the mandatory constant term required of LR. The columns of the 40 by 3 **X** matrix passed the Kolmogorov-Smirnov test for normality at the 0.05 level for each class, which is to be expected of parameters estimated by an OLS regression of 28,224 degrees of freedom. This is a distinct advantage of LR-based SCAMPI since it is known that logistic regressions with normal independent variables yield robust Wald statistics.^{47} With normally distributed independent variables (see **Figure S1**), the Wald statistics are distributed Student-*t* with 17 degrees of freedom, so p < 0.0398 for all of them. The coefficient of discrimination, COD, for a perfect fit is 1.00 and the ideal Deviance is 0.0.
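The Tjur coefficient of discrimination is the difference between the mean predicted probability in the class labeled 1 and the mean predicted probability in the class labeled 0. A minimal Python sketch follows; the function name `tjur_cod` and the toy probabilities are hypothetical and are not values from Table S2.

```python
import numpy as np

def tjur_cod(y, p):
    """Tjur coefficient of discrimination for labels y (0/1) and
    predicted probabilities p of class 1."""
    y = np.asarray(y)
    p = np.asarray(p, dtype=float)
    return float(p[y == 1].mean() - p[y == 0].mean())

# Toy example: four images with known classes and fitted probabilities.
labels = [0, 0, 1, 1]
probabilities = [0.1, 0.3, 0.8, 0.6]
print(round(tjur_cod(labels, probabilities), 3))  # 0.5
```

A perfect fit assigns probability 1 to every class-1 image and 0 to every class-0 image, which gives a coefficient of 1.00.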

**Tables S1** and **S2** present highly significant OLS image parameters yielding highly significant logistic regression probability estimates for the same images. The consistency in the three stages of optimization, OLS, FLD, LR, to achieve these results is captured in **Fig. 3d**.

## Acknowledgements

The authors acknowledge support from the Department of Chemistry at the University of Illinois at Chicago.