Automated image analysis method to detect and quantify fat cell infiltration in hematoxylin and eosin stained human pancreas histology images

Fatty infiltration of the pancreas leading to steatosis is a major risk factor in pancreas transplantation. Hematoxylin and eosin (H and E) staining is one of the common histological staining techniques and provides information on tissue cytoarchitecture. Accumulation of adipose (fat) cells in the pancreas has been shown to impact beta cell survival, endocrine function and pancreatic steatosis, and can cause non-alcoholic fatty pancreas disease (NAFPD). The current automated tools (e.g. Adiposoft) available for fat analysis are suited to white adipose tissue, which is homogeneous and easier to segment, unlike heterogeneous tissues such as pancreas, where fat cells continue to play critical physiopathological functions. The currently available pancreas segmentation tool focuses on endocrine islet segmentation based on cell nuclei detection for diagnosis of pancreatic cancer. In the current study, we present a fat quantifying tool, Fatquant, which identifies fat cells in heterogeneous H and E tissue sections with reference to the diameter of a fat cell. Using histological images of pancreas from a publicly available database, we observed an intersection over union of 0.797 to 0.966 for manual versus Fatquant-based machine analysis.

Author Summary
We have developed an automated tool, Fatquant, that identifies fat cells based on their diameter in complex hematoxylin and eosin tissue sections such as pancreas, and which can aid the pathologist in the diagnosis of fatty pancreas and related metabolic conditions. Fatquant is unique in that current automated fat analysis tools (Adiposoft, AdipoCount) work well for homogeneous white adipose tissue but not for other tissue samples. The currently available pancreas analysis tools are mostly suited to segmentation of endocrine β-cells based on cell nuclei detection and extraction of colour features, and cannot estimate fat cell infiltration in the pancreas.
Graphical Abstract
Currently available fat quantification tools like Adiposoft can analyze homogeneous adipose tissue (left), with an intersection over union (IoU) of 0.935 and 0.954 for Adiposoft and Fatquant, respectively. In heterogeneous tissue (e.g. pancreas, right), which contains adipose (fat) cells and acinar cells, Adiposoft fails to detect fat cells (IoU = 0), while Fatquant achieved IoU = 0.797.


Introduction
The accumulation of fats, especially in the abdominal area, causes insulin resistance, leading to deposition of fats (steatosis) in the pancreas, inflammation and finally fibrosis, resulting in nonalcoholic fatty pancreatic disease (NAFPD). Accumulation of pancreatic fat may lead to pancreatitis, diabetes mellitus or pancreatic cancer (Paul and Shihaz 2020). Pancreatic steatosis can be diagnosed on ultrasound, computed tomography (CT) scan or magnetic resonance imaging (MRI), but pancreatic biopsy remains the best method to detect pancreatic fat concentration (Tariq et al 2016, Paul and Shihaz 2020). Pancreatic fat infiltration might provoke a decrease in endocrine (β-cell) number and function, leading to more rapid progression to diabetes (Yu and Wang 2017). Studies suggest NAFPD as an early marker of glucometabolic disturbance (Yu and Wang 2017). Pancreas transplantation is the only way to treat type 1 diabetes (T1D), but fat infiltration in the pancreas remains a risk factor that can affect the clinical outcome (Verma and Papalois 2011, Dholakia et al 2017). Fatty pancreas has a prevalence of 35% and may be a contributing factor for malignancy and the metabolic syndrome (Lesmana et al 2015). The histological and MRI tools exhibit good agreement in detecting fat in pancreas (Fukui et [...]

Usually, these algorithms involve splitting the image into various color channels, with the red channel binarized using an automatic thresholding method to separate the bright pink fat areas from dark purple-bluish cell nuclei. Subsequently, a watershed algorithm is applied to fill in missing fat cell membrane and improve the cell count (Galarraga et al 2012, Zhi et al 2018). The output of these processes includes the labels and statistical analysis of individual cells. These tools have been successfully applied to white adipose tissue; however, other organs like liver, pancreas and lungs have been challenging due to their heterogeneous cell types (Figure 2).

Materials and Methods
The H and E images for the analysis were downloaded from a publicly accessible Genotype-

Data and code availability
All the images and annotations, along with the relevant data and code, can be found at the following GitHub repository: https://github.com/anniedhempe/Fatquant. The procedure to run this tool is described in the Readme file. The procedure for image processing is briefly demonstrated in the flow chart shown in Figure 3.

Image processing
The steps used are elaborated below.

The grayscale values of fat cells in images from the GTEx portal range approximately from 225 to 255. Other parts of the pancreas can fall in the same range, but applying a binary threshold to an image helps remove many unwanted parts. Pixels whose values are at least equal to the threshold input parameter (e.g. 227) are kept for further processing and are assigned a new grayscale value of 255. All other pixels are assigned the value 0. Figure 6 is a thresholded image of Figure 4 with parameter value 230. Figure 7 is also a thresholded image, but of Figure 5, where pixels representing the tagged fat cells are assigned grayscale value 255 and the rest are assigned value 0.
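The thresholding rule described above can be illustrated with a minimal sketch. It assumes the image is already available as a nested list of grayscale values; the function name `binary_threshold` is hypothetical and is not taken from the Fatquant code.

```python
def binary_threshold(gray, threshold=227):
    """Return a binary image: 255 where a pixel is >= threshold, else 0.

    `gray` is a list of rows, each row a list of grayscale values (0-255).
    The default of 227 is the example value quoted in the text.
    """
    return [[255 if px >= threshold else 0 for px in row] for row in gray]


# Tiny example: only pixels at or above the threshold become white.
image = [
    [120, 226, 227],
    [230, 255, 90],
]
binary = binary_threshold(image, threshold=227)
```

Raising the threshold (for example to 230, as used for Figure 6) keeps fewer pixels and so removes more of the non-fat background, at the risk of eroding fat cell interiors.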

Segmentation of white pixels from thresholded image
White pixels are initially segmented by combining tile rendering with scanline rendering, and then identifying possible merges of segments in a tile with those in its immediate neighbors. Tile rendering has been implemented in this system as it helps reduce time complexity for segments covering a large area.
Processing time for segmentation was tested with four sizes of square tiles, which were of length [...] Figure 11 is a diagrammatic representation of the method discussed, where the red rectangle represents the smallest dimension.
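The scanline part of this step can be sketched as follows. This is an illustration only, not the Fatquant implementation: it omits the tiling and tile-merge stage, assumes 4-connectivity (the text does not state which connectivity is used), and the name `segment_white_pixels` is hypothetical. Runs of white pixels are found on each scanline and merged with overlapping runs on the previous line using a union-find structure.

```python
def segment_white_pixels(binary):
    """Group white (255) pixels into 4-connected segments.

    Returns a list of segments, each a list of (row, col) pixel coordinates.
    """
    parent = []  # union-find parent index for every run

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    runs = []      # (row, col_start, col_end_exclusive) for every white run
    prev_row = []  # indices of the runs found on the previous scanline
    for y, row in enumerate(binary):
        cur_row = []
        x, width = 0, len(row)
        while x < width:
            if row[x] != 255:
                x += 1
                continue
            start = x
            while x < width and row[x] == 255:
                x += 1
            idx = len(runs)
            runs.append((y, start, x))
            parent.append(idx)
            # Merge with every run on the previous line that shares a column.
            for p in prev_row:
                _, p_start, p_end = runs[p]
                if p_start < x and start < p_end:
                    union(p, idx)
            cur_row.append(idx)
        prev_row = cur_row

    segments = {}
    for idx, (y, start, end) in enumerate(runs):
        segments.setdefault(find(idx), []).extend((y, c) for c in range(start, end))
    return list(segments.values())
```

In a tiled variant, the same routine would presumably run per tile, followed by a pass that unions segments whose pixels are adjacent across tile borders.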

Figure 11
The system uses a square kernel which must fit inside the boundaries of valid segments. The side length of the square is determined by the input fat diameter values (minimum or maximum), since the diameter equals the diagonal of the square. The side length of the square can be calculated as:

$s = d / \sqrt{2}$

where s = square side length and d = diameter. This square is the largest square that can be inscribed in a circle of the given diameter. An elliptical kernel might be a more precise choice for identifying fat cells, but a square kernel was chosen because it is easier to handle.
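As a worked illustration of this relation, the sketch below converts a diameter in pixels to a kernel side length. Rounding down to a whole number of pixels is an assumption made here so that the square never exceeds the circle; the text does not say how Fatquant rounds, and both function names are hypothetical.

```python
import math


def kernel_side(diameter_px):
    """Side of the largest square inscribed in a circle of the given diameter."""
    return diameter_px / math.sqrt(2)


def kernel_side_pixels(diameter_px):
    """Whole-pixel kernel side (assumption: round down)."""
    return int(math.floor(kernel_side(diameter_px)))


# A 10-pixel diameter gives a side of about 7.07 pixels, i.e. a 7-pixel kernel.
side = kernel_side(10)
side_px = kernel_side_pixels(10)
```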
The system first refers to the minimum diameter value to select segments in which a square kernel with the corresponding side length can fit somewhere in the region. It then refers to the maximum diameter value to discard, from the selected list, segments in which a kernel with the corresponding side length can fit somewhere in the region. This means that a segment which is narrow in many of its portions but very wide in one portion can also be discarded if a square kernel as per the maximum diameter fits in that portion. Figure 12 is a diagrammatic representation of this process performed on an image of dimension 15 x 15 pixels. Figure 12 (a) has seven segments of white pixels, out of which valid segments are to be selected. Figure 12 (b) has three segments marked in cyan, which denotes the segments selected as per the minimum diameter. The kernel side length is 3 pixels, so four segments are not selected because the square kernel fails to fit inside the boundary of any of them. Figure 12 (c) has only two segments marked in cyan, because one previously selected segment is discarded as per the maximum diameter. Here the kernel side length is 4 pixels.
So, a segment which can fit a kernel of length greater than 4 pixels is to be discarded. The previously selected segment which gets discarded could fit a kernel of length 5 pixels.
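The selection rule can be sketched by computing, for each segment, the side of the largest axis-aligned square that fits inside it and comparing that with the two kernel sizes. This is a minimal sketch under stated assumptions, not the Fatquant code: segments are given as lists of (row, col) pixels, the function names are hypothetical, and a segment is discarded when the maximum-size kernel itself fits (the text is not explicit on whether the upper comparison is strict).

```python
def largest_square_side(pixels):
    """Side of the largest axis-aligned square made only of the segment's pixels."""
    side_at = {}  # largest square whose bottom-right corner is at (row, col)
    best = 0
    for (y, x) in sorted(set(pixels)):  # row-major order
        s = 1 + min(
            side_at.get((y - 1, x), 0),
            side_at.get((y, x - 1), 0),
            side_at.get((y - 1, x - 1), 0),
        )
        side_at[(y, x)] = s
        best = max(best, s)
    return best


def select_by_kernel(segments, min_side, max_side):
    """Keep segments that fit the minimum kernel but not the maximum kernel."""
    kept = []
    for seg in segments:
        side = largest_square_side(seg)
        if min_side <= side < max_side:
            kept.append(seg)
    return kept


# Example mirroring the kernel sizes quoted for Figure 12 (3 and 4 pixels).
block3 = [(y, x) for y in range(3) for x in range(3)]            # fits 3, not 4
block5 = [(y, x) for y in range(5) for x in range(10, 15)]       # fits 5: too large
thin_line = [(8, x) for x in range(6)]                           # fits only 1
selected = select_by_kernel([block3, block5, thin_line], min_side=3, max_side=4)
```

Because only the single widest spot matters, an otherwise narrow segment with one wide bulge is rejected, which matches the behaviour described in the text.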

Removal of fats from boundary
Segments which are identified as fats but also contain pixels from the image boundary are discarded, because their entire size is not known within the dimensions of the input image (Figure 14). Comparing Figure 14 with Figure 13 (b), it can be seen that some segments which contain pixels from the bottom boundary are discarded.
Some of these segments may already be discarded while selecting segments as per diameter. For example, in Figure 13 (b), one big segment which has pixels on the boundary is discarded after considering the maximum diameter, whereas the segment is present in Figure 13 (a). Removing fats from the boundaries is the final step for tagging fats by machine. If users do not have manually tagged data of fat cells, they can conclude the experiment after this step.
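The boundary rule can be expressed in a few lines. The sketch below assumes segments are lists of (row, col) pixels and that "boundary" means the outermost row or column of the image; the function name `remove_boundary_segments` is hypothetical.

```python
def remove_boundary_segments(segments, height, width):
    """Drop every segment that contains at least one pixel on the image border."""
    def touches_border(segment):
        return any(
            y == 0 or x == 0 or y == height - 1 or x == width - 1
            for (y, x) in segment
        )

    return [seg for seg in segments if not touches_border(seg)]


# Example on a 5 x 5 image: the second segment touches the bottom row.
inner = [(1, 1), (1, 2), (2, 1)]
at_bottom = [(3, 3), (4, 3)]
remaining = remove_boundary_segments([inner, at_bottom], height=5, width=5)
```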

Analysis of valid fats
The

Fat cell identification on ten sample images of dimension 1716 x 905 pixels was performed using the Adiposoft and Fatquant tools. The outputs are shown below. The threshold value and the minimum and maximum diameters were chosen to get an optimal output. Pixels that are part of fat in (b) are marked in yellow, whereas pixels identified as fat in (c) and (d) are marked in cyan.
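Agreement between manual and machine tagging is reported in this paper as intersection over union (IoU). A minimal sketch of that metric on two same-sized binary masks is given below; it assumes nonzero pixels mark fat, treats two empty masks as perfect agreement (an assumption made here, not stated in the text), and uses the hypothetical name `intersection_over_union`.

```python
def intersection_over_union(manual, machine):
    """IoU of two same-sized binary masks given as nested lists (nonzero = fat)."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(manual, machine):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Assumption: if neither mask tags any pixel, report perfect agreement.
    return intersection / union if union else 1.0


manual_mask = [[255, 255, 0], [0, 255, 0]]
machine_mask = [[255, 0, 0], [0, 255, 255]]
iou = intersection_over_union(manual_mask, machine_mask)
```

A tool that tags no fat pixels in an image containing manually tagged fat gets IoU = 0, which is the situation reported for Adiposoft on the heterogeneous pancreas example.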
From the outputs it can be noted that Adiposoft only produces reasonable output when adipocytes cover most of the area of a sample image (e.g. Figures 18 and 20). In a heterogeneous sample image, this tool can tag many non-fat areas as valid fats (e.g. Figures 21, 22 and 24). Moreover, it can even fail to identify the presence of any fat in an image (e.g. Figures 16 and 23). [...] represents machine tagged areas using Fatquant of images in (a). As per the data mentioned in

Discussion
Pancreatic fat accumulation has been associated with obesity and impaired β-cell function, and may be an early sign in the development of metabolic syndrome (Dite et al 2020, Sequeira et al 2022, Rugivarodom et al 2022). Pancreas is a heterogeneous tissue and current tools for fat cell analysis