Abstract
Prostate cancer (PCa) is associated with several genetic alterations that play an important role in disease heterogeneity and clinical outcome, including gene fusion between TMPRSS2 and members of the ETS family of transcription factors, especially ERG. The expanding wealth of pathology whole slide images (WSIs) and the increasing adoption of deep learning (DL) approaches offer a unique opportunity for pathologists to streamline the detection of ERG:TMPRSS2 fusion status. Here, we used two large cohorts of digitized H&E-stained slides from radical prostatectomy specimens to train and evaluate a DL system capable of detecting ERG fusion status and identifying tissue regions of high diagnostic and prognostic relevance. Slides from the PCa TCGA dataset were split into training (n=318), validation (n=59), and testing (n=59) sets. The training set was used to train the model, the validation set to optimize its hyperparameters, and the testing set to evaluate its performance. Additionally, we used an internal testing cohort of 314 WSIs for independent assessment of the model’s performance. The ERG prediction model achieved an area under the receiver operating characteristic curve (AUC) of 0.72 in the TCGA testing set and 0.73 in the internal testing cohort. In addition to slide-level classification, we identified highly attended patches for the cases predicted as either ERG-positive or ERG-negative; these patches had distinct morphological features associated with ERG status. We subsequently characterized the cellular composition of these patches using a HoVer-Net model trained on the PanNuke dataset to segment and classify the nuclei into five main categories.
Notably, a high ratio of neoplastic cells in the highly attended regions was significantly associated with shorter overall and progression-free survival, while high ratios of immune cells, stromal cells, and stromal-to-neoplastic cells were all associated with longer overall and metastasis-free survival. Our work highlights the utility of deploying deep learning systems on digitized histopathology slides to predict key molecular alterations in cancer together with their associated morphological features, which could streamline the diagnostic process.
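As an illustration of the cell-composition ratios mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes each highly attended patch yields a list of per-nucleus type labels, that the five categories are the PanNuke ones (neoplastic, inflammatory, connective, dead, non-neoplastic epithelial), that "immune" corresponds to inflammatory nuclei and "stromal" to connective nuclei, and that each ratio is taken over all classified nuclei. The function name `composition_ratios` and the label strings are hypothetical.

```python
from collections import Counter

# Assumed label names for the five PanNuke nucleus categories.
PANNUKE_TYPES = ("neoplastic", "inflammatory", "connective", "dead", "epithelial")


def composition_ratios(nucleus_labels):
    """Return cell-type ratios for one patch, or None if no nuclei were classified.

    Assumption: "immune" maps to inflammatory nuclei and "stromal" to
    connective nuclei; ratios use all classified nuclei as the denominator.
    """
    counts = Counter(nucleus_labels)
    total = sum(counts[t] for t in PANNUKE_TYPES)
    if total == 0:
        return None

    neoplastic = counts["neoplastic"]
    stromal = counts["connective"]
    return {
        "neoplastic": neoplastic / total,
        "immune": counts["inflammatory"] / total,
        "stromal": stromal / total,
        # Undefined when the patch contains no neoplastic nuclei.
        "stromal_to_neoplastic": stromal / neoplastic if neoplastic else None,
    }


# Toy patch: 6 neoplastic, 2 inflammatory, 2 connective nuclei.
example = ["neoplastic"] * 6 + ["inflammatory"] * 2 + ["connective"] * 2
ratios = composition_ratios(example)
```

Under these assumptions, per-patch ratios could be aggregated per slide (for example by averaging) and dichotomized into high and low groups before survival analysis; the abstract does not specify how the aggregation or cutoffs were chosen.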
Competing Interest Statement
The authors have declared no competing interest.