PT - JOURNAL ARTICLE
AU - Bhaskar, Dhananjay
AU - Lee, Darrick
AU - Knútsdóttir, Hildur
AU - Tan, Cindy
AU - Zhang, MoHan
AU - Dean, Pamela
AU - Roskelley, Calvin
AU - Edelstein-Keshet, Leah
TI - A methodology for morphological feature extraction and unsupervised cell classification
AID - 10.1101/623793
DP - 2019 Jan 01
TA - bioRxiv
PG - 623793
4099 - http://biorxiv.org/content/early/2019/04/30/623793.short
4100 - http://biorxiv.org/content/early/2019/04/30/623793.full
AB - Cell morphology is an important indicator of cell state, function, stage of development, and fate in both normal and pathological conditions. Cell shape is among key indicators used by pathologists to identify abnormalities or malignancies. With rapid advancements in the speed and amount of biological data acquisition, including images and movies of cells, computer-assisted identification and analysis of images becomes essential. Here, we report on techniques for recognition of cells in microscopic images and automated cell shape classification. We illustrate how our unsupervised machine-learning-based approach can be used to classify distinct cell shapes from a large number of microscopic images. Technical Abstract: We develop a methodology to segment cells from microscopy images and compute quantitative descriptors that characterize their morphology. Using unsupervised techniques for dimensionality reduction and density-based clustering, we perform label-free cell shape classification. Cells are identified with minimal user input using mathematical morphology and region-growing segmentation methods. Physical quantities describing cell shape and size (including area, perimeter, Feret diameters, etc.) are computed along with other features including shape factors and Hu’s image moments. Correlated features are combined to obtain a low-dimensional (2-D or 3-D) embedding of data points corresponding to individual segmented cell shapes.
Finally, a hierarchical density-based clustering algorithm (HDBSCAN) is used to classify cells. We compare cell classification results obtained from different combinations of features to identify a feature set that delivers optimum classification performance for our test data consisting of phase-contrast microscopy images of a pancreatic-cancer cell line, MIA PaCa-2.
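
As a rough illustration of the size and shape descriptors named in the abstract (area, perimeter, Feret diameters, and a shape factor), the sketch below computes them for a single cell outline. It is not the authors' implementation. It assumes the outline has already been segmented and is given as a list of polygon vertices; the function names, the dictionary keys, the choice of circularity as the shape factor, and the 1-degree angular sampling used to approximate the minimum Feret diameter are all assumptions made for this example. Hu's image moments, the low-dimensional embedding, and the HDBSCAN clustering step are not covered.

```python
import math
from itertools import combinations


def polygon_area(pts):
    """Area enclosed by a closed polygon (shoelace formula)."""
    n = len(pts)
    total = 0.0
    for i in range(n):
        x0, y0 = pts[i]
        x1, y1 = pts[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def polygon_perimeter(pts):
    """Sum of edge lengths, including the closing edge."""
    n = len(pts)
    return sum(math.dist(pts[i], pts[(i + 1) % n]) for i in range(n))


def feret_max(pts):
    """Maximum Feret diameter: largest distance between any two vertices."""
    return max(math.dist(p, q) for p, q in combinations(pts, 2))


def feret_min(pts, n_angles=180):
    """Approximate minimum Feret diameter.

    The Feret diameter along a direction is the extent of the outline's
    projection onto that direction. Directions are sampled at
    180 / n_angles degree steps (an assumption of this sketch), so the
    result is an approximation for general outlines.
    """
    best = math.inf
    for k in range(n_angles):
        theta = math.pi * k / n_angles
        c, s = math.cos(theta), math.sin(theta)
        proj = [x * c + y * s for x, y in pts]
        best = min(best, max(proj) - min(proj))
    return best


def shape_features(pts):
    """Collect the descriptors for one outline into a feature dictionary."""
    area = polygon_area(pts)
    perimeter = polygon_perimeter(pts)
    f_max = feret_max(pts)
    f_min = feret_min(pts)
    return {
        "area": area,
        "perimeter": perimeter,
        "feret_max": f_max,
        "feret_min": f_min,
        # Circularity: 1 for a perfect circle, smaller for elongated shapes.
        "circularity": 4.0 * math.pi * area / perimeter ** 2,
        # Ratio of minimum to maximum Feret diameter.
        "feret_ratio": f_min / f_max,
    }


# Hypothetical outline: a 4 x 2 rectangle standing in for a segmented cell.
example_outline = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
example_features = shape_features(example_outline)
```

Computing such a dictionary for every segmented cell would give one feature vector per cell, which is the kind of input the dimensionality reduction and clustering steps described above operate on.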