A Fully Automated, Faster Noise Reduction Approach to Increasing the Analytical Capability of Chemical Imaging for Digital Histopathology

High dimensional data, for example from infrared spectral imaging, involves an inherent trade-off in the acquisition time and quality of spatial-spectral data. Minimum Noise Fraction (MNF) developed by Green et al. [1] has been extensively studied as an algorithm for noise removal in HSI (Hyper-Spectral Imaging) data. However, there is a speed-accuracy trade-off in the process of manually deciding the relevant bands in the MNF space, which by current methods could become a person month time for analyzing an entire TMA (Tissue Micro Array). We propose three approaches termed ‘Fast MNF’, ‘Approx MNF’ and ‘Rand MNF’ where the computational time of the algorithm is reduced, as well as the entire process of band selection is fully automated. This automated approach is shown to perform at the same level of reconstruction accuracy as MNF with large speedup factors, resulting in the same task to be accomplished in hours. The different approximations of the algorithm, show the reconstruction accuracy vs storage (50×) and runtime speed (60×) trade-off. We apply the approach for automating the denoising of different tissue histology samples, in which the accuracy of classification (differentiating between the different histologic and pathologic classes) strongly depends on the SNR (signal to noise ratio) of recovered data. Therefore, we also compare the effect of the proposed denoising algorithms on classification accuracy. Since denoising HSI data is done without any ground truth, we also use a metric that assesses the quality of denoising in the image domain between the noisy and denoised image in absence of ground truth.

Chemical imaging is an emerging technology in which every pixel or voxel of an image 2 contains hyperspectral data, often consisting of hundreds or thousands of data points. 3 The spectrum at each pixel resolves the chemical components at that point and, thus, 4 provides the molecular profile of the sample [2][3][4]. Computer algorithms that can 5 process the data to information useful for a particular problem often require a specific 6

PLOS
1/11 data quality, at that spectral resolution, that often determines scanning (signal 7 averaging) time. In addition to the chemical signature of the data, another benefit of 8 these technologies is that workflows can be automated with fully digital analysis of the 9 data [5][6][7]. For example, Fourier Transform Infrared (FT-IR) spectroscopic imaging is 10 emerging as an automated alternative to human examination in studying disease 11 development and progression by using statistical pattern recognition [8][9][10][11][12][13]. For a 12 practical protocol for tissue imaging, as demonstrated in at least one instance of tissue 13 histopathology, the signal-to-noise ratio (SNR) of 4cm −1 resolution spectral data needs 14 to be more than 1000 : 1 [12]. To achieve this SNR, especially for the emerging high 15 definition IR imaging [14][15][16], extensive signal averaging is required. The need for signal 16 averaging increases acquisition time (SNR ∼ √ t), in turn, increasing acquisition 17 time [17] to the extent that clinical translation becomes impractical. Signal processing 18 approaches to reduce noise has previously been suggested to mitigate this crippling 19 increase in integration time by mathematical methods to utilize correlations in data to 20 reduce noise but suffer from two major drawbacks. First, given the large size of the 21 data, the mathematical operations require computer processing often comparable to the 22 acquisition time itself [18]. Second, such methods invariably try to separate data into 23 informative and noisy components; subsequently, a manual selection step is required to 24 identify the information-bearing components thus compromising the automation 25 benefits of using spectroscopic imaging for tissue analysis [19].

26
One class of mathematical transform techniques for noise reduction utilize the 27 property that noise is uncorrelated whereas spectra (signals) have a high degree of 28 correlation. In a transform domain, hence, the signal becomes largely confined to a few 29 eigenvalues whereas the noise is spread across all. Noise reduction can be achieved by 30 retaining eigenvalue images that correspond to high signal content and computing the 31 inverse transform. All the eigenvalue data contain signal and noise but the relative 32 proportion of the signal to noise which forms a threshold criterion for inclusion of 33 specific eigenimages in the inverse transform. Inclusion of too many will not allow for 34 significant noise rejection, while inclusion of too few would result in loss of fine spectral 35 features. Hence, identifying eigenvalues corresponding to high signal content is an 36 important step in the noise reduction process.

37
One widely used algorithm was provided by Green  the latter consists of a sequence of two principal component transforms: First to whiten 47 the data (de-correlate noise from data); Second to perform eigen decomposition on the 48 modified covariance matrix, to order the underlying data by SNR.

49
Our goals in this work are to address the major challenges in noise rejection using 50 mathematical methods. Specifically, first, we aim to provide criteria for unsupervised 51 band selection, thereby maintaining objectivity of the data analysis protocol and 52 reducing analysis time within the data processing pipeline by dispensing with the need 53 for manual intervention. Second, we aim to re-examine the mathematical formulation of 54 the MNF approach to speed up the process of computation of the forward and inverse application. Finally, we seek to improve the signal processing methods to provide a 60 higher confidence in the consistency and accuracy of the noise rejection pipeline. We 61 propose a comparison between acquired and denoised biomedical images using a robust 62 metric, thereby providing better denoising guarantees in terms of both root mean 63 square error (RMSE) and structural similarity.

83
In practical situations, noiseless data d j is recorded by instruments as a noisy signal 84 estimate y j , due to fluctuations in detector current, backgrounds and source/instrument 85 factors, which is modeled as: where y j ∈ R S is the actual data collected by the apparatus and δ j ∈ R S is the noise in 87 the same pixel. The goal is to estimate δ j given y j , so we can best estimate the true 88 signal d j such that the relevant features (peak positions, peak heights, relative peak 89 spacing etc.) in the spectrum are preserved. where au is arbitrary unit.

100
Assuming additive noise only, raw data can be represented as Y = D + δ, where Y = {Y 1 , ..., Y S } and D and δ are the uncorrelated signal (actual spectral data with baseline included) and noise components of the raw data Y .
where Σ D and Σ δ are the covariance matrices of D and δ respectively. Noise Fraction (NF) for the i th band is defined as the ratio of noise variance to the total variance for that band. Similarly Signal to Noise ratio (SNR) for the i th band is defined as the ratio of signal variance to the noise variance for that band.
Maximization of the noise fraction leads to a numbering of 104 bands that gives decreasing image quality with increasing component number. The SNR 105 for (Y M N F ) i in MNF space can be formulated: The Noise fraction itself can then be re-factored as follows: The vectors φ i are thus the real, symmetric eigenvectors of the eigenvalue problem: Hence φ i are the eigenvectors of Σ Y Σ −1 δ , and λ i , eigenvalue corresponding to φ i , equals 109 to the noise fraction in (Y M N F ) i . Also, λ 1 ≥ λ 2 ≥ . . . ≥ λ S , so that the components 110 show decreasing image quality. Thus SNR is given by λ i − 1.

112
As the data D and noise δ are additive and uncorrelated, we can write: Let the spectral decomposition of Σ δ be Σ δ = EΛ δ E T . Rotating Σ Y in Eq.7 with eigenvector matrix E and rescaling it using the inverse-square root of the noise singular values Λ δ , results in new covariance matrix where the contribution of noise component has been turned into an identity matrix.
Let the eigen decomposition of Σ W be Σ W = GΛ M N F G T . Rotating Σ W in Eq.8 with eigenvector matrix G results in: The series of transforms that we used to change the original data covariance matrix Σ Y 113 into Λ M N F is given by the transformation vector Φ = EΛ −1/2 δ G which we call the MNF 114 projection vectors and Λ M N F estimates of SNR of the data. 115 Optimizing MNF 116 Expanding the entire MNF transform, we can infer the following: Since R is a block identity matrix R = I K 0 0 0 , introducing an extra R T term keeps 117 the value of the expression unaltered as R * R T = R. Writing the MNF transform in 118 this way, we ensure that we skip the costly matrix inversions of the MNF vectors. Also 119 G * R is effectively choosing the top K eigenvalues of G. Therefore we can reduce the 120 computation cost by finding the reduced rank-K SVD of Σ Y . The Λ δ matrix inversion 121 can also be replaced by more efficient versions due to its diagonal structure.

122
Automatic Band Selection

123
The optimal value of K can be determined by inspecting the entries of 124 Λ M N F = SN R + 1 which is a diagonal matrix. The Rose criteria [21] states that an 125 SNR of at least 5.0 is needed to be able to distinguish image features at 100% certainty. 126 We select the top K bands in the MNF space for which SN R = Λ M N F − 1 ≥ 5.0.

127
Automating this process is the main computational speed factor that brings down the 128 processing time from days down to few hours.  Since power iteration is unstable at times due to the structure of the singular values, we 144 use a version of the Block Lanczos method [22]. We setK = 0.03 × S and compute the 145 rankK-SVD, then let the band selection criteria to decide the optimal K.

147
Although the block Lanczos algorithm can attain machine precision, it inevitably goes 148 many passes through Σ W , and it is thus slow when Σ W is large or does not fit in 149 memory. To circumvent this scenario, we use a faster randomized and memory efficient 150 version which computes theK-SVD of Σ W up to 1 + Frobenius norm relative 151 error [23].

152
Error Metric

153
In the absence of ground truth images of denoised data, we use a non-reference image 154 quality metric this is simple and easy to use. The Method Noise Image (MNI) [24] 155 metric aims at maximizing the structure similarity between the input noisy image and 156 the estimated image noise around homogeneous regions and the structure similarity 157 between the input noisy image and the denoised image around highly-structured regions, 158 and is computed as the linear correlation coefficient of the two corresponding structure 159 similarity maps.   Table 1 shows the algorithmic time and space complexity in terms of memory usage 167 for the different MNF versions. The best algorithm in terms of time-space-accuracy is 168 Approx MNF. This is because it computes the best rank-K SVD with little loss in its   N K)). For the FTIR data with 207 S ∼ 1500 bands and K ∼ 30, we achieve a RAM space saving of ∼ 50×, allowing us to 208 process more data simultaneously in one go.

210
The improvements offered by the different versions of the MNF presented in this study 211 are illustrated in Fig. 2.The extent of denoising both in the spectral and spatial domain 212 is approximately the same for all the different MNF algorithms. Fig. 2.A. depicts the 213 spatial detail offered by different MNF versions with zoomed in sections in Fig. 2.C. and 214 Fig. 2.B. shows the horizontal signal profile across the sample. Next, spectral profiles 215 are compared across the different algorithms with reference spectrum (without MNF) to 216 illustrate the extent of noise removal in each case (Fig. 2.D.). It can be seen that even 217 with a speed up factor of ∼ 10 there is no significant reduction in the spectral and 218 spatial image quality. Along with evaluating the performance of the presented MNF versions by examining the 221 tissue profile, we utilize the MNI (method noise image) metric [24] aiming to maximize 222 the structural similarity between the input noisy image and the denoised image around 223 highly-structured regions, in the absence of ground truth. Fig. 3  offers the same performance as compared to standard MNF with a factor of 60× 252 reduction in the processing time and 50× reduction in memory space required for 253 computation. In this paper, we demonstrate how to automate the band selection process in the MNF 256 space, which drastically reduces the workflow duration of MNF denoising of TMA's 257 from almost a month down to a matter of hours. We introduced three different 258 optimizations of the MNF algorithm depending on the speed-memory-accuracy trade-off, 259 resulting in a 60× runtime improvement and 50× memory efficiency. A well established 260 error metric is also used which helps us decide the quality of denoising, in the absence of 261 ground truth images. Similar classification performance of the suggested approaches as 262 compared to conventional techniques suggesting the potential of the developed methods 263 for computationally efficient analysis of big datasets for diagnostic applications. As a 264 future work, we would like to make better approximations of the noise model itself, so 265 that we can apply approximations for the eigen decomposition of the noise covariance 266 matrix, hence further reducing the computational time of the process.