## Abstract

In this manuscript, we introduce a linear approximation of the forward model of soft x-ray tomography (SXT), such that the reconstruction is solvable by standard iterative schemes. This linear model takes into account the three-dimensional point spread function (PSF) of the optical system, which consequently enhances the reconstruction data. The feasibility of the model is demonstrated on both simulated and experimental data, based on theoretically estimated and experimentally measured PSFs.

## 1. Introduction

Soft x-ray tomography (SXT) refers to the x-ray microscopy technique in which tomographic imaging is done using low-energy x-rays. In particular, the x-ray energy range lies within the “water window”, i.e., between the K-absorption edges of oxygen (2.34 nm; 530 eV) and carbon (4.4 nm; 280 eV) [1]. As the name suggests, water is relatively transparent to the x-rays within this region. In biological samples, the contrast comes from the natural variation of bio-organic molecules making this region especially suitable for imaging of these kinds of samples. In the past decades, soft x-ray microscopy has emerged as a unique tool to study 3D organization of single cells [3, 6, 28]. From simple yeast to complex eukaryotic cells. SXT grants an unprecedented contrast and spatial resolution, filling the information gap between light and electron microscopy [33, 30, 10, 5, 17, 4].

The three-dimensional reconstruction of a sample is obtained, by solving for its spatial distribution of absorption from a series of projection from many different viewing angles around a central rotation axis. A key assumption has traditionally been that these images can be regarded as images of classical projections [37], meaning that the resolution would be limited by that of the optical system, *r* ∝ λ/*NA*, where λ is the wavelength of the illuminating light and *NA* is the numerical aperture of the objective lens. Diffraction limited optics are, however, also characterized by a maximum depth of field as DOF *∝* λ/*NA*^{2}, introducing a limit of sample size where the image formation can be approximated by this kind of parallel projection [24], and where the effect of the optics can be modeled by a depth independent point spread function (PSF).

In principle, both a sufficient depth of field and a good resolution can be obtained by increasing the energy (decreasing λ) and decreasing the *NA* (see Fig. 1). However, natural contrast in bio-organic samples is limited to the energy region of the water window, making such optimization impossible. This means that, especially for larger samples, SXT resolution suffers from the depth-dependent optical PSF.

Limited DOF is a known problem in electron microscopy, where tilting of the sample may extend part of the sample out of focus, and approximate solutions exist to correct for it in the tomographic reconstruction. The so-called defocus-gradient correction, which was first introduced by Jensen and Kornberg [11], involves computational correction of the effects of a depth-dependent defocus and was incorporated to the well-known back-projection algorithm. The problem was revisited by Kazantsev et al. [13], in which the method received rigorous mathematical justification. This kind of correction has also been applied in Fourier space [36, 35], which can reduce the computational times up to two orders of magnitude. However, these corrections are not directly applicable in SXT due to the difference in image formation [16].

In SXT the problem of limited DOF has only recently been addressed, and thus far only experimentally. This has been done by acquiring multiple images at different focus and using a depth-dependent weight on their back-projection to account for their different foci [32], by using through-focus imaging to then computationally extract the ideal projection [22] or by a wavelet based fusion of reconstructions with different foci [19].

Recently, a forward model of image formation in SXT was proposed [24, 25], including rigorous mathematical work [15, 14]. Based on this model, the feasibility and practical application of PSF corrections was shown by Otón et al. [23], where a depth independent correction was applied to SXT data, leading to higher contrast in the obtained reconstruction.

In this manuscript we further generalize the image formation in soft x-ray tomography developed by Otoón et al. [24]. By applying a linear approximation on the model, the effects of a depth dependent PSF can be incorporated into existing iterative reconstruction methods. We present numerical results that show the method is applicable even if the sample is out of focus, or larger than the depth of field. Finally, experimental results show an increase in contrast of a reconstructed image of mouse lymphocytes.

## 2. Image formation in soft x-ray tomography

The understanding of image formation is an important step in any imaging techniques. It allows to choose the best image acquisition strategies and most suitable reconstruction methods. In tomography, the model of image formation has been traditionally based on the Radon transform [29], which is the ideal linear transform (projection) of the specimens attenuation coefficients onto a plane. The projection is linked to the experimental image formation through the Beer-Lambert law, such that the attenuation of the intensity along the ray-paths, *L _{i}*, through the sample can be given by

Here *μ _{Li}(t)* is the linear absorption coefficient (LAC) of the specimen along the ray-path, and

*I*and

_{i}*I*

_{i0}are the attenuated and un-attenuated intensities along the ray-path, respectively.

Although this is a convenient model and a good approximation for highly elongated point spread functions, the image formation in x-ray tomography based on, e.g., diffraction lenses may differ substantially from such ideal model of parallel projections. Despite of this difference, it is beneficial to link the model of image formation in SXT to Eq. (1), so that many available reconstruction methods [21, 12, 9] become available.

Recently, the specifics of the optical system in SXT were integrated into a model of image formation [24, 25]. The model assumes, that propagation of the field, at the vicinity of the sample, can be done by parallel propagation, which is valid if the *NA* of the system is small enough. On the other hand, the model assumes incoherent image formation on the detector, which is reasonable if the *NA* of the condenser and objective lens are matched, so that the image *I _{im}* formed at the detector plane

*x*

_{2}= (

*x*

_{2},

*y*

_{2}) is given by where

*I*(

*x*

_{1},

*z*

_{0}) is the local field intensity at some position

*z*

_{0}before the sample, **:

**R**

^{2}→

**R**

^{2}is the convolution operator in two dimensions, and

*h*(

_{z}

*x*_{1}) is the impulse response of the optical system.

*I*_{im}(

*x*_{2}) and

*I*^{0}

_{im}(

*x*_{2}) denotes the recorded images, the latter a reference image without a sample (see Fig. A.7 for details). Depending on the optical setup of the system, these assumptions might not necessarily hold true [31, 20].

Otón et al. [24] describes two cases in which a known solutions exist. In one case, when the impulse response *h _{z}*(

*x*_{1}) is a delta function, the model coincides with Eq. (1). In the other case, the impulse response

*h*(

_{z}

*x*_{1}) is independent of the axial position

*z*. As a result, the ideal BeerLambert projections can be recovered by deconvolution of the transmission images, after which conventional reconstruction schemes can be applied.

For a more general solution, in our work, we seek a linear approximation to the forward model Eq. (2), so the image formation in SXT can be expressed as
where ** A_{h}** is a linear projection matrix incorporating the PSF of the system,

**is here a discretized, vector representation of the LAC.**

*μ*By following the steps of Otoón et al. [24], and building up the left hand side of Eq. (3). The finite difference (see Appendix for details), yields a linear approximation of the form
where ** A_{h}** is a projection matrix but now incorporating the depth dependent PSF.

## 3. Numerical results

To validate our, now linear, image formation model, we first performed numerical simulations on a phantom sample, Fig. 2, consisting of binary circles of different sizes.

Projection images through the phantom sample were calculated using the non-linear model of Eq. (2) on a larger (*L* = 997^{1}) discrete grid. The sinograms were down-sampled to size *L* = 256 before tomographic reconstructions. To avoid inversion crimes, the size of high-resolution grid was chosen as a prime number, so that neither the reconstruction or deconvolution was done in the same discrete grid as the calculated phantom.

The theoretical PSF was calculated according to the dimensions of the reconstruction grid and was determined as in Ref. [37], by the converging illumination emerging from a circular lens aperture based on the Huygens-Fresnel principle [2]. The LAC of the phantom was scaled so that the minimum transmission was 0.5*I*_{0}.

As a proof of concept, a PSF with a DOF of ± 128, i.e., enclosing the whole sample, and a Rayleigh resolution of 8 units was considered. Two measurements were done, one in-focus case, where the PSF was centered on the center of rotation and one out-of-focus, where the focal spot was shifted to the edge of the image. The sampling criteria of *n* > *L*(π/4) was used for reference and the reconstructions were made using 201 projections^{2}. The final projections were distorted by adding Poisson noise.

Three different reconstructions (as shown in Fig. 3) were obtained by solving Eq. (3) by using the Conjugate Gradient Method on the Normal Equations (CGNE): a conventional minimization using the Beer-Lambert approximation, the depth-independent correction of [24], and the linear PSF model.

In CGNE, the iteration number can be viewed as a reg-ularization parameter for the solution [8] and running too many iterations will result in amplifying high frequency signals and result in noisy reconstructions. The optimal stopping iterations will of course be data dependent. So as to ensure a fair comparison of the phantom images, we used the oracle knowledge of the phantom to find the highest peak-signal-to-noise-ratio (PSNR)

This was done by keeping track of the best solution within the reconstruction scheme and halting when no improvement over the best solution had been recorded within 5 iteration. For the reconstructions shown in Fig. 3 with *I _{0}* = 10

^{6}the stopping iteration numbers for the BeerLambert, deconvolution and PSF reconstructions, respectively were 20,21, and 138 for the in focus PSF, and 20, 21, and 136 for the out of focus PSF.

The optimal deconvolution for the depth-independent correction of [24] was done in two steps – first, noiseless Beer-Lambert projection images were as a convolution kernel to solve for the optimal depth-independent PSF. This 2D kernel was then used to deconvolve the projection images, using the oracle knowledge of the same Beer-Lambert projections as a stopping criteria by minimizing the *l*_{2}-norm of the residual.

As seen in Fig. 3, for a centered PSF, the result is as expected. The Beer-Lambert approximation shows smoothing of the edges, characteristic to the PSF. In this case, as the assumption of a depth independent PSF is valid, a proper solution to the inversion exist [24] and global filtering of the projection images yields a good reconstruction result. The linear PSF inversion is similar in quality, but in the case of centered PSF, the depth independent PSF reconstructions have the benefit of much faster convergence.

In the case where the sample is not positioned in the center of PSF, the Beer-Lambert reconstruction clearly shows the artifacts of the defocus. In this case, no suitable global filter exists and the reconstructions based on projection image deconvolution does not improve on the results with respect to BL. Such artifacts are not present in the linear PSF reconstruction.

In numerical simulation, where the projection matrix corresponding to the PSF is known, the linear approximation performs well for the tomographic inversion, as shown in Fig. 3. In practice, however, the image quality is often limited by noise, which makes the inversion problem highly unstable. To investigate in the stability of the inversion, the numerical experiment was repeated using different noise levels. Shown in Fig. 4 the PSNR of the recovered image as a function of the intensity count. As all images were distorted with Poisson noise, a lower count corresponds to a higher noise level. For all cases, the linear PSF inversion provides the highest PSNR and is most resilient to the effects of this noise. For methods when the model of image formation is insufficient, the quality seems to saturate as image quality increases. In this regime, the image model error dominates over the error caused by the noise in the images.

For out of focus PSF, one can stabilize the reconstructions by taking images over a full, i.e 360°, rotation range. In this fashion, though a sample is not fully in focus at one rotation angle, the sampling at a 180^{°} shift provides additional information, as a different part of the sample will be in focus. Such acquisition can be seen as a modification to the method suggested by Selin et al. [32], where a depth-dependent weight was introduced in the reconstruction scheme to account for projections acquired at different foci. In our case, the projection matrix ** A_{h}** servers a similar function. Such acquisition scheme leads to better reconstruction results in comparison to the 180

^{°}rotation, using the same number of projections, with all methods. As seen from the right side image of Fig. 4, the results are even more substantial for the depth independent reconstruction schemes.

## 4. Experimental results

In order to test the method on experimental data, the PSF of the optical system first has to be measured. The measured PSF can then be used to calculate the necessary weights for the projection (and back-projection) operators used in the reconstruction. The experimental work was performed at the National Center for X-ray Tomography at the Advanced Light Source of the Lawrence Berkeley National Laboratory (LBNL). The data was acquired on the XM-2 soft X-ray microscope [18]. The XM-2 is equipped with a cryogenic rotation stage with full 360^{°} range to enable tomographic data collection from cryo-preserved samples. The optical setup consists of two aperture matched Fresnel Zone Plates, with a relatively low NA (0.234), thus the assumptions needed for Eq. (2) should be fairly well met. However, although the theoretical framework rests on the assumption of incoherent image formation, it puts no restrictions into the actual linear operator ** A_{h}** and when measuring the PSF, it is straightforward to incorporate also other effects in the projection matrix, such as various aberrations, distortions, or a spatially varying PSF.

### 4.1. Measuring the system PSF

To measure PSF of the soft x-ray microscope optics, we prepared the phantom sample composed of gold nanoparti-cles. Spherical gold nanoparticles with diameter of 100 nm (Nanopartz, Cat.No. AR11-100-NB-50) were deposited with a microloader on a 100 nm thick silicone nitride membrane (Silson LtD, Ref.10402101) and then spread by a gentle flow of warm air. The distribution of nanoparticles was confirmed by optical microscopy in dark field mode. The areas with single isolated particles were selected for imaging with x-ray microscope. The regions of interest were imaged with a set of 30 through-focus images with 1 μm step size and 150 ms exposure time each. For each set of radiographs, 20 reference images were acquired with the same exposure time. The radiographs were recorded by a charge-coupled-device camera (Andor IKon-L) with an effective pixel size of 16 nm. A reference profile was obtained by fitting the theoretical intensity function,
to the experimental data of the in focus image. Here, *I*_{0} is the initial intensity, *ì* is scaled LAC, and ** x_{c}** is center of a sphere.

The PSF was determined by maximum-likelihood (ML) deconvolution [34, 38]. Essentially, we assume that the image of a single bead can be written in form ** y** =

******

*p***+**

*h**, where the measured signal*

**ε****is composed of a convolution between the bead profile,**

*y***, the PSF,**

*p***, and an additional noise term e. The ML solution for the PSF can now be found iteratively, using the fitted theoretical bead profile Eq. (6), with the Richardson-Lucy algorithm:**

*h*A full 3D PSF was obtained by bilinear interpolation, an example of a PSF extracted in this way is shown in Fig. 5, from which transverse slices give the needed 2D convolution kernels, ** h_{z}** for Eq. (4). The full-width-at-half-maximum of the gaussian peak in focus corresponds to 57 nm, and a maximum loss of intensity of 20% [2, p. 441] along the axial direction gives a DOF of ± 4.7 pm.

### 4.2. Experimental validation: imaging of mouse B cell

To test the applicability of the method for experimental data, the method was used to reconstruct tomographic images of mouse lymphocytes. To have full angle rotation during the acquisition, the cells were loaded into thin-wall glass capillaries with a micro loader [26]. For cryo-fixation, the capillaries were rapidly plunged into a liquid propane cooled by liquid N_{2}. The data acquisition was done by sequentially rotating the capillary with 2° increments for 180° with an exposure time of 300 ms per image. A series of 10 reference images were taken before and after the scan, that were used to normalize the data. The projection images were aligned using an updated version of the previously developed automatic registration software AREC3D [27].

The tomographic image was reconstructed in three different ways using CGNE, viz.:, A reference image using the Beer-Lambert approximation, a reconstruction where the intensity images are deconvolved with a 2D PSF, i.e., the solution to Eq. (2) if the PSF is not depth-dependent, and using the linear PSF approximation presented in this paper. The 2D kernel for the deconvolution was obtained by taking the average of all the measured 2D kernels within the DOF of the PSF. The deconvolution was performed on the transmission images using RL deconvolution, and stopped before significant ringing was observed^{3}.

In Fig. 6 we show details of the reconstructions showing endoplasmic reticulum for the three different reconstructions. The linear PSF approximation performs as well as the deconvolution method, which can, in this case, be considered to be a proper solution to the inversion as the DOF of the PSF spans the whole sample. Both PSF corrected reconstructions show an increase of contrast with respect to the Beer-Lambert approximation. The effect is less pronounced in the in-plane reconstruction, where image quality is limited by the relatively sparse sampling.

## 5. Conclusions

In this work we have introduced a linear approximation to the reconstruction of tomographic x-ray data, including the effect of the point spread function of the optics. We show numerically, that the approximation is well suited for the inversion of the SXT data both when the PSF acts as depth-independent blur as well as when the sample is partially out of focus.

For experimental measurements, the PSF inversion increases contrast in the image, especially of edge features, and provides similar results as compared to deconvolution. Numerical results show that the introduced inversion using a projection matrix including the PSF is more resilient to noise than its deconvolution-based counterpart. This, however, comes at a price, since the reconstruction using the PSF incorporated projection matrix is both computationally more expensive, and converges significantly slower.

The presented model tackles current limitations of SXT and provides new ways of data acquisition. For a properly sampled and focused PSF, the linear PSF approximation enables use of higher NA optics, as the limited depth of field can be computationally amended. The results in Fig. 4 predict a possible way to extend the depth of field experimentally by shifting the focus to either side of the sample and taking a full 360^{°} rotation data set. This would allow one to circumvent the traditional limitations of diffraction limited optics Fig. 1 and extend the effective depth of field with respect to the resolution. In other words, it would either enable imaging current samples with a higher NA, thus a higher resolution, or extending the DOF of current imaging systems, allowing for full 3D reconstructions of larger samples.

We expect the method to be generally applicable also for other tomographic systems, as log as the assumption of a linear image formation can be met.

## Acknowledgment

Research reported in this publication was conducted at The National Center for X-ray Tomography, located at the Advanced Light Source of Lawrence Berkeley National Laboratory, and was supported by grants from NIH (P41GM103445), DOE's Office of Biological and Environmental Research (DE-AC02-5CH11231), and Chan Zuckerberg Initiative Human Cell Atlas program. Venera Weinhardt was funded by German Research Foundation fellowship WE6221/1-1.

## Appendix A Linear approximation of the image formation

The model of Otón et al. [24] is based on two assumptions, in which the measured projection is a result of the attenuated light passing through the sample, blurred by the PSF of the objective lens. We assume firstly, that there exists an impulse response function ** h_{z}**(

*x*_{1},

*x*_{2}) (see e.g. Refs [2] and [7] for details), such that the intensity field at the image plane,

*I*_{im}(

*x*_{2}), can be expressed by linear transport of an (un-attenuated) field intensity

*U*at position

*z*, where we define

*x*_{1}and

*x*_{2}as two sets of coordinates, corresponding to the planes perpendicular to the optical axis at

*z*

_{1}and

*z*

_{2}and **:

**R**

^{2}→

**R**

^{2}is the convolution operator in two dimensions. The second assumption is, that the local field propagation within teh sample can be done by parallel wave approximation.

For cleaner notation, we construct *h _{z}*(

*x*_{1},

*x*_{2}) so that it includes all linear scaling (magnification, inversion), thus we can express the field transfer simply as a 2D convolution of the local field plane and a 2D PSF [

*h*(

_{z}**)]**

*x*^{2}. Using these assumptions Otoón et al. [24] construct the derivative of

*I*

_{im}

*with respect to*

^{z}*z*by considering a partially cut sample and adding to it slices of thickness Δ

*z*(see Fig. A.7).

Following the derivation by Otoón et al. [24], we construct the image formation by finite difference of the image, but applying this directly on the logarithm of the normalized image, that is we seek an approximation to the normalized image
where we drop the explicit notation of ** x** for cleaner notation. Constructing the finite difference and substituting for the linear transfer in Eq. (A.1) yields

For the left hand side, we note that from the series expansion
we can neglect higher order terms as, *x* ≈ 1. For the right hand side, making the same approximation as Otóon et al. [24] we propagate the field coherently (to linear accuracy)
and as the PSF can be assumed smooth, we can also assume that
yielding

To get rid of the final non-linear terms, we assume that *U(z)*^{2} is smooth enough, such that
yielding

Integrating on both sides of this finite difference approximation of the derivative, we get

With suitable discretization, we can now express the right hand side as a linear projection matrix ** A_{h}μ**, yielding our approximation, Eq. (3).

### Appendix A.1. Validity of the linear approximation

To test the assumptions for the forward model, we calculated the difference on the three forward models of a reconstruction of a biological sample taken with XM2 on beamline 2.1 at the ALS in LBNL [18]. In Fig. A.8 we show the error of a Beer-Lambert projection and our linear PSF approximation, respectively, as compared to the non-linear projection model of Eq. (2). It is evident that the error of the linear approximation is negligible except for at the edges of the samples. This is expected, as the samples in XM2 are mounted in capillary tubes. The capillary tubes produce large gradients in the LAC, where the assumption of a relatively smooth local field in Eq. (A.8) is expected to fail.

## Footnotes

↵1 All lengths describing the numerical validation, are in units of pixels.

↵2 The odd number of projections is convenient, as it ensures that the same sampling could be used for both 180 and 360°. If

*n*is even, and the sampling is done over 360°, the angles will be at a π shift and do not provide any additional sampling if the PSF is in focus and symmetric with respect to the defocus.↵3 Of course, as the experimental data lacks oracle information this could not be done in as rigorous manner as for the phantom sample, so the “best” result had to be qualitatively determined by visual inspection.