RT Journal Article
SR Electronic
T1 Deep learning-based auto-segmentation of swallowing and chewing structures
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 772178
DO 10.1101/772178
A1 Aditi Iyer
A1 Maria Thor
A1 Rabia Haq
A1 Joseph O. Deasy
A1 Aditya P. Apte
YR 2020
UL http://biorxiv.org/content/early/2020/10/28/772178.abstract
AB Purpose: Delineating the swallowing and chewing structures in Head and Neck (H&N) CT scans is necessary for radiotherapy (RT) treatment planning to reduce the incidence of radiation-induced dysphagia, trismus, and speech dysfunction. Automating this process would decrease the manual input required and yield reproducible segmentations, but generating accurate segmentations is challenging due to the complex morphology of swallowing and chewing structures and the limited soft-tissue contrast in CT images.
Methods: We trained deep learning models using 194 H&N CT scans from our institution to segment the masseters (left and right), medial pterygoids (left and right), larynx, and pharyngeal constrictor muscle, using DeepLabV3+ with the ResNet-101 backbone. Models were trained sequentially, so that the localization of each structure group was guided by prior segmentations. Additionally, an ensemble of models was developed using contextual information from three different views (axial, coronal, and sagittal) for robustness to occasional failures of the individual models. Output probability maps were averaged, and each voxel was assigned the label of the class with the highest combined probability.
Results: The median Dice similarity coefficients (DSC) computed on a hold-out set of 24 CT scans were 0.87±0.02 for the masseters, 0.80±0.03 for the medial pterygoids, 0.81±0.04 for the larynx, and 0.69±0.07 for the constrictor muscle. The corresponding 95th percentile Hausdorff distances were 0.32±0.08 cm (masseters), 0.42±0.2 cm (medial pterygoids), 0.53±0.3 cm (larynx), and 0.36±0.15 cm (constrictor muscle).
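The multi-view ensemble step described in the Methods (average the per-view class-probability maps, then label each voxel with the highest combined probability) can be sketched as follows. This is an illustrative NumPy sketch, not the authors' code; the function name, the assumption that the three views have already been resampled to a common grid, and the array layout `(num_classes, D, H, W)` are all assumptions.

```python
import numpy as np

def ensemble_label_map(prob_maps):
    """Average class-probability maps from multiple views and assign
    each voxel the label of the class with the highest mean probability.

    prob_maps: list of arrays (one per view: axial, coronal, sagittal),
    each of shape (num_classes, D, H, W), assumed already resampled
    to a common voxel grid.
    """
    mean_probs = np.mean(np.stack(prob_maps, axis=0), axis=0)
    return np.argmax(mean_probs, axis=0)  # label volume, shape (D, H, W)

# Toy example: 2 classes over a 1x1x2 volume, three "views"
view_probs = [
    np.array([[[[0.6, 0.2]]], [[[0.4, 0.8]]]]),  # axial
    np.array([[[[0.7, 0.4]]], [[[0.3, 0.6]]]]),  # coronal
    np.array([[[[0.5, 0.1]]], [[[0.5, 0.9]]]]),  # sagittal
]
labels = ensemble_label_map(view_probs)  # → [[[0, 1]]]
```

Averaging probabilities (rather than majority-voting hard labels) lets a confident view outweigh two uncertain ones, which is what provides the robustness to occasional single-model failures mentioned in the abstract.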
Dose-volume histogram (DVH) metrics previously found to correlate with each toxicity were extracted from the manual and auto-generated contours and compared between the two sets of contours to assess clinical utility. Differences in DVH metrics were not statistically significant (p>0.05) for any of the structures. Further, inter-observer variability in contouring was studied in 10 CT scans. Automated segmentations agreed better with each observer, as measured by DSC, than the observers agreed with one another.
Conclusions: We developed deep learning-based auto-segmentation models for swallowing and chewing structures in CT. The resulting segmentations can be included in treatment planning to limit complications following RT for H&N cancer. The segmentation models developed in this work are distributed for research use through the open-source platform CERR, accessible at https://github.com/cerr/CERR.
Competing Interest Statement
ACKNOWLEDGEMENTS This work was partially funded by NIH grant 1R01CA198121 and NIH/NCI Cancer Center Support grant P30 CA008748.
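The agreement metric used throughout the abstract, the Dice similarity coefficient, is DSC = 2|A∩B| / (|A| + |B|) for two binary masks A and B. A minimal sketch (not the authors' implementation; the function name and the empty-mask convention are assumptions):

```python
import numpy as np

def dice_similarity(a, b):
    """Dice similarity coefficient between two binary masks.
    Returns 2|A∩B| / (|A| + |B|); 1.0 means perfect overlap."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * np.logical_and(a, b).sum() / denom

# Toy example: auto contour covers 2 voxels, manual covers 3, overlap is 2
auto_mask   = np.array([[1, 1, 0, 0]])
manual_mask = np.array([[1, 1, 1, 0]])
dsc = dice_similarity(auto_mask, manual_mask)  # → 0.8
```

Computed pairwise, the same function supports the inter-observer comparison: DSC between the automated contour and each observer can be compared against DSC between observer pairs.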