RT Journal Article
SR Electronic
T1 A methodology for morphological feature extraction and unsupervised cell classification
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 623793
DO 10.1101/623793
A1 Bhaskar, Dhananjay
A1 Lee, Darrick
A1 Knútsdóttir, Hildur
A1 Tan, Cindy
A1 Zhang, MoHan
A1 Dean, Pamela
A1 Roskelley, Calvin
A1 Edelstein-Keshet, Leah
YR 2019
UL http://biorxiv.org/content/early/2019/04/30/623793.abstract
AB Cell morphology is an important indicator of cell state, function, stage of development, and fate in both normal and pathological conditions. Cell shape is among key indicators used by pathologists to identify abnormalities or malignancies. With rapid advancements in the speed and amount of biological data acquisition, including images and movies of cells, computer-assisted identification and analysis of images becomes essential. Here, we report on techniques for recognition of cells in microscopic images and automated cell shape classification. We illustrate how our unsupervised machine-learning-based approach can be used to classify distinct cell shapes from a large number of microscopic images.
Technical Abstract: We develop a methodology to segment cells from microscopy images and compute quantitative descriptors that characterize their morphology. Using unsupervised techniques for dimensionality reduction and density-based clustering, we perform label-free cell shape classification. Cells are identified with minimal user input using mathematical morphology and region-growing segmentation methods. Physical quantities describing cell shape and size (including area, perimeter, Feret diameters, etc.) are computed along with other features including shape factors and Hu’s image moments. Correlated features are combined to obtain a low-dimensional (2-D or 3-D) embedding of data points corresponding to individual segmented cell shapes. Finally, a hierarchical density-based clustering algorithm (HDBSCAN) is used to classify cells.
We compare cell classification results obtained from different combinations of features to identify a feature set that delivers optimum classification performance for our test data consisting of phase-contrast microscopy images of a pancreatic-cancer cell line, MIA PaCa-2.
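
As an illustration of the kind of shape descriptor the abstract mentions (area, perimeter, and a shape factor), the following minimal sketch computes them for a cell outline given as a polygon. It is not the authors' pipeline: the function names, the polygon representation of a segmented cell, and the choice of circularity (4*pi*area / perimeter^2) as the shape factor are all assumptions made for this example.

```python
import math


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula.

    `points` is a list of (x, y) vertices in boundary order
    (a hypothetical stand-in for a segmented cell outline).
    """
    n = len(points)
    total = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of the edge lengths around the closed outline."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def circularity(points):
    """Shape factor 4*pi*A / P**2: 1 for a circle, smaller for elongated shapes."""
    area = polygon_area(points)
    perimeter = polygon_perimeter(points)
    return 4.0 * math.pi * area / perimeter ** 2


def regular_polygon(n_sides, radius=1.0):
    """Vertices of a regular polygon, used here to approximate a round cell."""
    return [
        (radius * math.cos(2.0 * math.pi * k / n_sides),
         radius * math.sin(2.0 * math.pi * k / n_sides))
        for k in range(n_sides)
    ]


# Three toy outlines: a square, a thin rectangle, and a nearly round shape.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
elongated = [(0, 0), (8, 0), (8, 0.5), (0, 0.5)]
near_circle = regular_polygon(100)

for name, outline in [("square", square),
                      ("elongated", elongated),
                      ("near_circle", near_circle)]:
    print(name, polygon_area(outline), polygon_perimeter(outline),
          circularity(outline))
```

A dimensionless factor such as circularity separates round from elongated outlines independently of size, which is why descriptors of this type are commonly paired with size measures such as area and perimeter before dimensionality reduction and clustering.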