A method for morphological feature extraction based on variational auto-encoder: an application to mandible shape

Masato Tsutsumi, Nen Saito, Daisuke Koyabu, Chikara Furusawa
doi: https://doi.org/10.1101/2022.05.18.492406
Masato Tsutsumi
1Department of Physics, Graduate School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
Nen Saito
2Graduate School of Integrated Sciences for Life, Hiroshima University, Higashihiroshima, Hiroshima 739-8511, Japan, Exploratory Research Center on Life and Living Systems (ExCELLS), National Institutes of Natural Sciences, Okazaki, Aichi 444-8585, Japan and Universal Biology Institute, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
Daisuke Koyabu
3Research and Development Center for Precision Medicine, University of Tsukuba, 1-2 Kasuga, Tsukuba 305-8550, Japan and Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, To Yuen Building, Tat Chee Avenue, Kowloon 999077, Hong Kong
Chikara Furusawa
4Universal Biology Institute, Graduate School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan and Center for Biosystem Dynamics Research, RIKEN, 6-2-3 Furuedai, Suita, Osaka 565-0874, Japan
  • For correspondence: chikara.furusawa@riken.jp

ABSTRACT

Shape analysis of biological data is crucial for investigating morphological variation during development and evolution. However, conventional approaches to quantifying shape have difficulties, as exemplified by the ambiguity of the landmark-based method, in which anatomically prominent "landmarks" are manually annotated. In this study, a morphological regulated variational autoencoder (Morpho-VAE) is proposed that conducts image-based shape analysis through a deep-learning framework, thereby removing the need for defining landmarks. The proposed architecture comprises a VAE combined with a classifier module. This integration of unsupervised and supervised learning models (i.e., the VAE and classifier modules) is designed to reduce dimensionality by focusing on the morphological features by which data with different labels are best distinguished. The proposed method is applied to an image dataset of primate mandibles to extract morphological features, which allows different families to be distinguished in a low-dimensional latent space. Furthermore, visualization analysis of the decision-making of the Morpho-VAE clarifies that the area of the mandibular joint is important for family-level classification. The generative nature of the proposed model is also demonstrated by complementing a missing image segment based on the remaining structure. Therefore, the proposed method, which flexibly performs landmark-free feature extraction from complete and incomplete image data, is a promising tool for analyzing morphological datasets in biology.

AUTHOR SUMMARY Shape is the most intuitive visual characteristic; however, it is generally difficult to measure using a small number of variables. Biological shapes in particular are often highly diverse, having been acquired through long evolutionary processes, adaptation to environmental factors, and so on, which limits straightforward approaches to shape measurement. Therefore, a systematic method for quantifying such a variety of shapes using a low-dimensional quantity is needed. To this end, we propose a novel method that uses machine learning to extract low-dimensional features describing shapes from image data. The proposed method is applied to primate mandible image data to extract morphological features that reflect the characteristics of the groups to which the organisms belong, and those features are then visualized. The method also reconstructs a missing image segment of an incomplete image based on the remaining structure. In summary, this method is applicable to the shape analysis of various organisms and is a useful tool for analyzing a wide variety of image data, even those with missing segments.

INTRODUCTION

Morphology is one of the most intuitively recognizable phenotypes of all organisms and is thought to result from a variety of complex factors, including adaptation to environmental factors and speciation through neutral evolution. Therefore, comparing morphology among species and individuals is expected to provide insight into the functional role of shape and its developmental and evolutionary history. To decipher such factors from morphology, quantification and characterization of shape are critical, as they allow us to describe, interpret, and visualize variations in shape. A great deal of effort has been devoted to shape analysis, and various methods have been proposed. The most widely used is landmark-based geometric morphometrics, in which landmarks are defined by anatomically homologous points on multiple samples, and the shape of a given sample is characterized by the coordinates of these landmarks [1–5]. The applications of this landmark-based method are wide-ranging, including vertebrates [6–12], arthropods [13–16], mollusks [17–19], and plants [20, 21]; however, several difficulties and ambiguities are intrinsic to this method despite its prevalence. First, the landmark-based method is unsuitable for comparisons between species or developmental stages for which biologically homologous landmarks cannot be defined. Second, both too many and too few landmarks can cause a loss of information about the morphology of a sample [3, 5, 22–24]. In addition, errors can be problematic, such as those introduced by measurement devices [25] and landmark configurations set inadequately by researchers with different skill levels [26]. Besides the landmark-based method, elliptic Fourier analysis (EFA) has also been proposed [27, 28] and applied to characterize the shapes of cells [29, 30], bivalves [31], fish [7, 32, 33], and plant organs [34–36].

Typically, the landmark-based method or EFA is combined with principal component analysis (PCA) to reduce high-dimensional morphological data into an easily visualizable low-dimensional space [1, 6, 7]. Linear dimensionality-reduction methods, such as PCA and linear discriminant analysis (LDA), are straightforward and easy to implement, but a nonlinear approach, such as a deep neural network (DNN), may be more suitable for capturing complex features with fewer dimensions. In fact, nonlinear methods based on DNNs have become standard analysis tools in the fields of image classification [37, 38] and medical diagnostic imaging [39, 40]; however, their application to morphological analysis, specifically to feature extraction of morphology, has still been limited to a few cases [41, 42]. A possible drawback of the DNN approach is that the analysis is often black-boxed and difficult to interpret, but many attempts have been made to solve this issue [43–45].

In this paper, a novel landmark-free method based on a variational autoencoder (VAE) is proposed that analyzes shape from image data without manual landmark annotation. A VAE is a class of DNN consisting of an encoder and a decoder, in which the encoder embeds high-dimensional image data into low-dimensional latent variables and the decoder reconstructs the input image from the compressed latent variables [46]. The nonlinear data compressibility of the encoder has allowed the VAE to be used for feature extraction from image data [47–49]. The reconstruction capability of the decoder also ensures that the input image is compressed while maintaining its information, rather than being compressed in an irreversible way. Herein, the original VAE is modified by integrating a classifier module into the VAE, which allows us to extract the morphological features that best distinguish data with different labeled classes.

The modified VAE model is demonstrated to be superior to the original VAE and PCA-based methods in capturing morphological features by analyzing mandibular image data of primates. The mandible varies widely in morphology depending on its function and diet [50–53]. For instance, the size and morphology of the mandibular joint and its position relative to the biting surface differ between carnivorous and herbivorous mammals due to differences in their masticatory functions [54, 55]. The proposed method provides a landmark-free and nonlinear feature extraction analysis for the morphological data of a 3D object, as exemplified by the mandible. Additionally, an interpretation of the extracted features is presented, as well as an application to mandibular image data with a missing bone segment. The proposed model is a useful and flexible tool for investigating morphological datasets.

RESULTS

The study aims to develop a landmark-free method for extracting morphological features from images to distinguish different groups. A total of 147 mandible samples from seven different families (i.e., 7 labels) are prepared for verifying the method. These samples comprise 141 samples of primate mandibles (Cercopithecidae, Cebidae, Lemuridae, Atelidae, Hylobatidae, and Hominidae) and six samples of carnivoran mandibles (Phocidae) as an outgroup. The corresponding 3D mandible data are projected from three directions to produce three 2D images, as shown in Fig 1A (see Method section). These three projections of each mandible are used as the input images for the following analysis. The proposed architecture, the morphological regulated variational autoencoder (Morpho-VAE), is illustrated in Fig 1B. Note that the VAE module is combined with the classifier module through the latent variable ζ. Learning is performed based on the loss function Etotal = (1 − α)EVAE + αEC, where EVAE is the loss associated with the VAE (i.e., the reconstruction + regularization losses), EC is the classification loss for the classifier module, and α is a hyperparameter that dictates the ratio between EVAE and EC in Etotal. Using the mandible sample images, the hyperparameter α is determined as 0.1 through cross-validation (Fig 1C, see also Method section). This choice of α ensures a low EC with a negligible increase in EVAE from α = 0, indicating that the classification ability can be incorporated into the VAE without lowering the performance of the VAE module. Other hyperparameters, such as the number of layers, number of filters, type of activation function, and optimization function, are also tuned; moreover, the number of dimensions of the latent variable is set to three (see Methods section).

Fig 1. Machine learning pipeline for prediction.

(A) Schematic of data preprocessing. (B) Schematic of the Morpho-VAE that comprises the encoder, decoder, and classifier. (C) Plot showing the changes in EC and EVAE as α is varied: Blue points and red points indicate the values of EC and EVAE, respectively, in the optimal model for each of the 10 combinations of training and test data. EC and EVAE are normalized such that the maximum value is 1. The left panel shows the range from 0 to 1, and the right panel shows the expanded range from 0 to 0.3.

Cluster Separation

After the 100-epoch training described in the Method section, a trained model is obtained that can classify the input image into seven class labels with a high validation accuracy (90% as median, SFig 2B), compress the image into the 3D latent space ζ, and reconstruct the image from the latent space. The distribution of the training and validation datasets in the latent space (Fig 2A) illustrates that the data points of each label form clusters that are well separated from the data with different labels, compared to the results from the dimensionality reduction performed using PCA (Fig 2B) and VAE (Fig 2C). Herein, PCA is performed by transforming the image into a vector of 16,384 (= 128 × 128) dimensions and extracting the top three components; the VAE is trained using the same procedure and the same training, validation, and test datasets as the Morpho-VAE, as described in the Methods section, while ignoring the classification loss (i.e., α = 0). To quantify the extent to which the data points with different class labels are separated in each method, the cluster separation index (CSI) is defined as CSIij = (si + sj)/dij, where dij = ‖ci − cj‖ is the Euclidean distance between the centroid ci of the i-th cluster Ci and the centroid cj of the j-th cluster Cj, and si is the mean distance between a point in Ci and the i-th cluster centroid ci. When the clusters i and j are separated, CSIij < 1, and CSIij > 1 when one of the clusters encompasses or partially overlaps the other. By taking the average over i of the maximum of CSIij for j ≠ i (i.e., (1/N) Σi maxj≠i CSIij for N clusters), this index corresponds to the Davies–Bouldin index with p = q = 2 [56], which is widely used to evaluate the degree of cluster separation. Fig 2D shows the CSIs for all pairs of the seven clusters obtained in the reduced feature space of the Morpho-VAE, PCA, and VAE, in which a single circle indicates a pair of different classes. For the Morpho-VAE, all points are less than one, which indicates that all pairs of clusters are well separated; however, for PCA and VAE, almost half of all points are greater than one, suggesting that data points with different family labels cannot be distinguished in the PCA or VAE space. For further verification, the evaluated Davies–Bouldin indices (a score of less than one represents well-separated clusters) are 0.74 (Morpho-VAE), 1.85 (PCA), and 2.09 (VAE).
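As a concrete illustration, the CSI and the resulting Davies–Bouldin index can be computed from labeled latent coordinates in a few lines of NumPy; the following is a sketch with illustrative function and variable names, not the authors' code:

```python
import numpy as np

def cluster_separation(z, labels):
    """Compute pairwise CSI_ij and the Davies-Bouldin index (p = q = 2).

    z      : (n_samples, n_dims) points in the latent space
    labels : (n_samples,) integer cluster labels
    """
    classes = np.unique(labels)
    centroids = np.array([z[labels == c].mean(axis=0) for c in classes])
    # s_i: mean Euclidean distance of the points in cluster i to its centroid
    spread = np.array([
        np.linalg.norm(z[labels == c] - centroids[k], axis=1).mean()
        for k, c in enumerate(classes)
    ])
    n = len(classes)
    csi = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                d_ij = np.linalg.norm(centroids[i] - centroids[j])
                csi[i, j] = (spread[i] + spread[j]) / d_ij
    # Davies-Bouldin index: average over i of max_{j != i} CSI_ij
    db_index = np.mean([csi[i, np.arange(n) != i].max() for i in range(n)])
    return csi, db_index
```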

Fig 2. Distribution of data in latent space.

(A–C) Data distributions in latent space for the Morpho-VAE, PCA, and VAE, respectively; the input images are identical and are compressed into a 3D latent space by each method. (D) Dot plot of CSI; a point below 1 represents a pair of well-separated clusters. Blue dots represent the Davies–Bouldin indices for the different models. (E) Classification accuracy of families by SVM as a measure of cluster separation: The error bars indicate the mean and standard deviations in the accuracy for each of the 10 tuned models.

Additionally, the classification accuracy calculated by a support vector machine (SVM) from the data distribution in the latent space is quantified as another measure of the degree of cluster separation. Since an SVM can solve a classification problem with high validation accuracy when the clusters of data with different labels are well separated in the latent space, this SVM-based accuracy is expected to reflect the degree of cluster separation. After the proposed Morpho-VAE is trained using the training data (for PCA, the top three PC vectors are computed from the training data), the same training data are used for training the SVM, and the SVM accuracy in the latent space is then calculated using the test data. The average test accuracy estimated from 10 different combinations of training and test data is shown in Fig 2E, in which the Morpho-VAE model achieves a much higher test accuracy than the other methods, indicating that the proposed model embeds the data of different families as well-separated clusters in the latent space.
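A minimal sketch of this evaluation follows, with random placeholders standing in for the latent coordinates (or top-3 PC scores) and family labels of the actual train/test splits; the RBF kernel is our assumption, since the text does not specify the SVM settings:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Placeholders: in practice z_* are encoder outputs (or top-3 PC scores for
# the PCA baseline) and y_* are the family labels of the train/test split.
rng = np.random.default_rng(0)
z_train, z_test = rng.normal(size=(98, 3)), rng.normal(size=(49, 3))
y_train, y_test = rng.integers(0, 7, 98), rng.integers(0, 7, 49)

svm = SVC(kernel="rbf")  # kernel not specified in the text; RBF is an assumption
svm.fit(z_train, y_train)
print("SVM test accuracy:", accuracy_score(y_test, svm.predict(z_test)))
```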

Reconstructing and Generating Images from Latent Space

The proposed Morpho-VAE model can reconstruct an image from the low-dimensional latent variable ζ through the decoder, as well as compress the input image into ζ through the encoder. This ability guarantees that the compressed latent variable ζ preserves the information about the morphology of the input data, rather than compressing it in an irreversible manner. A representative example of an input image and its reconstruction is shown in Fig 3A; the overall morphological information of the input image is preserved in the reconstructed image, although some detailed differences are recognizable. The reconstruction loss ERec, which reflects the accuracy of reconstructing the input image, reaches a plateau during training (SFig 2C), indicating that learning is successful. To further confirm the extent of morphological information preserved in the reconstructed image, the reconstructed image is re-inputted into the Morpho-VAE; the predicted label is then obtained through the classifier module, and the prediction accuracy is calculated by comparison with the true label. This prediction accuracy can be used as an indicator of the extent of preserved morphological information, as precisely reconstructed images should be correctly classified, whereas poorly reconstructed images should result in a significant accuracy drop. Fig 3B shows this prediction accuracy of the reconstructed images in comparison to the accuracy calculated from the original data, with a drop of only a few percentage points observed. Therefore, the reconstruction is demonstrably successful.

Fig 3. Image reconstruction by the proposed Morpho-VAE.

(A) Comparison between the original and reconstructed image. (B) Classification accuracy of reconstructed images: The error bars indicate the mean and standard deviations in the accuracy for each of the 10 tuned models. (C) Generating images from latent space: The left figure shows the images reconstructed from the grid points in the PC plane at PC3 = 0 in the Morpho-VAE space, and each point is a data point projected from the Morpho-VAE space onto the PC plane. The size of each point reflects its distance from the PC3 = 0 plane: the larger the point, the closer it is to the plane. The right figure shows the positions of the points with respect to the PC plane (gray colored).

Similar to the VAE, the Morpho-VAE model is categorized as a generative model that can generate an image from an arbitrary point in the latent space ζ, even when no input data corresponds to that point. This property enables visualization of the latent space; Fig 3C illustrates the images generated from ζ uniformly sampled on a 2D square lattice in the 3D latent space (right panel in Fig 3C), in which the choice of the 2D plane in the 3D latent space is determined by PCA based on the data distribution in the latent space. The background colors in the left panel of Fig 3C represent the labels predicted from ζ by the classifier module; circles indicate the input data points mapped into ζ, with their sizes corresponding to the distance from the PC1–PC2 plane. The generated morphology changes gradually in the latent space (left panel in Fig 3C), indicating that a smooth embedding of the morphological information into the latent space is achieved. In addition, both PC1 and PC2 seem to reflect anatomically meaningful features, since the angle between the condylar and coronoid processes approaches 90 degrees as PC1 becomes larger (left panel of Fig 3C), and the angular process becomes larger as PC2 increases.
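A sketch of how such a lattice could be generated and decoded; here `z_data` is a random placeholder for the latent codes of the dataset, and `decoder` is a hypothetical handle to the trained decoder network:

```python
import numpy as np
from sklearn.decomposition import PCA

# z_data: (n, 3) latent codes from the trained encoder (placeholder here);
# `decoder` is a hypothetical handle to the trained decoder network.
rng = np.random.default_rng(0)
z_data = rng.normal(size=(147, 3))

pca = PCA(n_components=3).fit(z_data)
scores = pca.transform(z_data)

# Uniform lattice on the PC1-PC2 plane at PC3 = 0, spanning the data range.
g1 = np.linspace(scores[:, 0].min(), scores[:, 0].max(), 8)
g2 = np.linspace(scores[:, 1].min(), scores[:, 1].max(), 8)
grid_pc = np.array([[a, b, 0.0] for b in g2 for a in g1])

# Map the lattice back into the latent space and decode it into images.
grid_z = pca.inverse_transform(grid_pc)
# images = decoder.predict(grid_z)  # (64, 128, 128, channels), decoder assumed
```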

Visual Explanation of the Basis for Class Decisions

It is possible to interpret which parts of the image the Morpho-VAE focuses on in the classification task. Herein, a post hoc visual explanation method, Score-CAM [45], is used to visualize the areas in the input image that are important for classification. A schematic overview of Score-CAM is given in SFig 3 (see Method section for detailed procedures). The outcomes of this analysis are the "saliency maps" for each family shown in Fig 4A, in which darker colors represent areas judged more important for classification by the Morpho-VAE. These maps emphasize essential bone processes: the area around the coronoid process (Fig 1A) for Phocidae, and the condylar process for Cercopithecidae, Hylobatidae, and Hominidae. Furthermore, the angular processes, except for Hylobatidae, are highlighted in the x and y projections. These processes serve as attachment sites for the temporalis and pterygoid muscles and are crucial in the opening and closing of the jaw; therefore, it is reasonable that they are highlighted for classification.

Fig 4. Visualization of the saliency map by Score-CAM.

(A) Saliency map for each family calculated by the Score-CAM method: The stronger the color, the more intensively the area is highlighted for classification. (B) The horizontal axis is the projection direction used for the input image (e.g., xy indicates that the input image in the z direction is a blank image). The vertical axis is the classification accuracy for that input. The error bars indicate the mean and standard deviations in the accuracy for each of the 10 tuned models.

The Score-CAM analysis also clarifies that the images of the z projection do not contribute to the classification task, as the colormaps of the z projection are all blank (Fig 4A). This result is further confirmed by calculating the classification accuracy from inputs of single-direction data only (e.g., the x projection only) and of double-direction data only (e.g., the x and y projections only), rather than the full dataset of x, y, and z projections (Fig 4B). Both results indicate that the x projection image is the most informative. In addition, the site around the teeth in the x projection (bottom half of the image) tends to be ignored by the map, which likely reflects that the position of the teeth and their presence or absence vary greatly among samples and are thus less informative.
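This blanking experiment can be sketched as follows, under the assumption that the three projections are stored as image channels and that `classify` is a hypothetical wrapper around the trained encoder and classifier modules:

```python
import numpy as np

# x_test: (n, 128, 128, 3) inputs with the x, y, z projections as channels
# (the channel layout is our assumption); `classify` is a hypothetical
# wrapper, e.g. classify = lambda x: morpho_vae.predict(x)[1].argmax(axis=-1)
def accuracy_with_projections(x, y_true, keep):
    """Blank every projection channel not listed in `keep`, then re-classify."""
    x_masked = np.zeros_like(x)
    for ch in keep:
        x_masked[..., ch] = x[..., ch]
    return float(np.mean(classify(x_masked) == y_true))

# acc_x_only = accuracy_with_projections(x_test, y_test, keep=[0])     # x only
# acc_xy     = accuracy_with_projections(x_test, y_test, keep=[0, 1])  # x and y
```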

Reconstruction from Cropped Data

Bone samples, especially fossil samples, are sometimes missing a part. A possible application of the generative ability of the proposed model is to reconstruct such missing bone parts based on the remaining parts. Herein, the proposed model is demonstrated to achieve this reconstruction from a partially cropped image. Artificially cropped 3D data, cropped from the y and z directions (Figs 5G and 5J), are prepared, and their x, y, and z projections are used as the dataset to be reconstructed. Figs 5A, 5C, and 5D show representative examples of the original, vertically cropped, and horizontally cropped data, respectively, and their reconstructions by the proposed Morpho-VAE are presented in Figs 5E (vertical crop) and 5F (horizontal crop). The images reconstructed from the cropped data (Figs 5C and 5D) show that the cropped area of the mandible in the original image (Fig 5A) is reconstructed well, although not perfectly. They closely resemble the image reconstructed from the original (Fig 5B), indicating that the cropped region is less informative than the remaining region.
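The artificial cropping can be sketched as a simple masking of the binary 3D volume before projection; which end of the axis is removed, and the axis conventions, are assumptions of this illustration:

```python
import numpy as np

def crop_volume(volume, crop_rate, axis):
    """Zero out the trailing `crop_rate` fraction of a binary 3D volume along
    `axis`, mimicking a missing bone segment (which end is removed is an
    assumption of this sketch)."""
    v = volume.copy()
    n = v.shape[axis]
    cut = int(round(n * crop_rate))
    if cut > 0:
        index = [slice(None)] * v.ndim
        index[axis] = slice(n - cut, n)
        v[tuple(index)] = 0
    return v

volume = np.ones((128, 128, 128), dtype=np.uint8)       # placeholder volume
cropped_vertical = crop_volume(volume, 0.40, axis=2)    # cf. 40% vertical crop
cropped_horizontal = crop_volume(volume, 0.32, axis=1)  # cf. 32% horizontal crop
```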

Fig 5. Reconstruction of cropped image.

(A–F) Procedure of image cropping and reconstruction: The figures show mandibles of Homo sapiens. Panels (C) and (D) represent 40% vertical and 32% horizontal cropping, respectively. (G–L) Reconstruction loss and accuracy after vertical and horizontal image cropping: The crop rate is the percentage of the mandible missing relative to its vertical or horizontal length. The figures in the first column exemplify the cropping of the mandible data. The graphs in the second column show the reconstruction loss between the reconstructed and original images. The graphs in the third column show the classification accuracy of the reconstructed images.

Furthermore, the robustness of this reconstruction is evaluated by calculating the dependence on the cropped region of the reconstruction loss, i.e., the binary cross-entropy between the image reconstructed from the cropped data and the original image (Figs 5H and 5K), as well as of the prediction accuracy (Figs 5I and 5L). Within about 60% and 25% crop rates for the vertical (Figs 5H and 5I) and horizontal (Figs 5K and 5L) crops, respectively, only a slight increase in the loss and drop in the accuracy are observed, indicating that the reconstruction quality is maintained. The loss then starts to increase and the accuracy drops for further increases in the crop size. For the vertical crop, an image with a cropping size just before the loss starts to increase is shown in Fig 5D, in which the shapes of the coronoid and condylar processes are just barely preserved. When these processes are completely removed, the reconstruction and classification fail (SFig 4). For the horizontal crop, an image just before the loss increase (Fig 5C) shows that the reconstruction is robust against cropping of the region around the teeth and the tip of the mandible (i.e., the region around the body of the mandible). Both of these results indicate that the shapes of the coronoid and condylar processes contain relevant information about the overall shape of the mandible, which is consistent with the results of the Score-CAM analysis (Fig 4A).

DISCUSSION

In this study, a method based on a VAE combined with a classifier module is proposed for morphological feature extraction and applied to image datasets of mandibles. The proposed method compresses the 128 × 128 pixel input image data into a 3D latent space in which the data points of different families form well-separated clusters, and the degree of cluster separation outperforms those obtained by VAE and PCA (Fig 2). Since the label information of the image data is used as the supervisory signal for the classifier module, the proposed model incorporates the essence of supervised learning as well as that of the unsupervised learning of a VAE module. This architecture is designed to reduce dimensionality by focusing on the morphological features through which the differences between predefined labels (i.e., family classes) are best distinguished. This is in contrast to the VAE, in which the latent variables are selected by focusing on the features common to the entire dataset. Consequently, the proposed Morpho-VAE can be interpreted as a nonlinear version of LDA, which is designed to find a linear combination of features that separates data of different classes. While hybrid architectures of VAE-based unsupervised learning and a classifier module have been investigated for solving classification tasks with a small amount of labeled data and a large amount of unlabeled data [57–59], the proposed method provides a novel framework in terms of dimensionality reduction and feature extraction.

The results in Fig 1C also indicate that the reconstruction loss increases only negligibly when the classification loss is taken into account (compare α = 0, reconstruction only, with α = 0.1, reconstruction and classification), suggesting that the reconstruction performance can be largely maintained by adding the classification function; moreover, this ensures the cluster separation of data with different labels in the latent space.

The characteristic of this model of selecting the latent space that distinguishes predefined labels can be described as extracting morphological features by focusing on synapomorphic traits through which a clade is well distinguished from others. The distance in the latent space is then considered to be a measure that contains information about these traits. Our naive expectation was that the latent space distance corresponds to the evolutionary distance. However, no clear correlation is noted between the latent space distance and the phylogenetic distance estimated in [60] (SFig 5). This may be accounted for by the lack of datasets for families across the entire primate order, as well as by the insufficient data within families. Even with sufficient data, the relationship between morphological and evolutionary distance is not straightforward, as other factors, such as diet (carnivory or herbivory) and the presence of predators, can affect morphology more strongly than evolutionary distance, as exemplified by convergent evolution.

Furthermore, the Score-CAM method, which provides an interpretable visualization of the parts of an image that are important for classification (Fig 4), is applied to overcome the difficulty of interpreting DNN-based analyses. The first notable result of this analysis is that the x projection of the mandible image data is the most important for classification among the x, y, and z projections. This result is likely attributable to the fact that the area of the x projection is the largest; moreover, the focus of Score-CAM on the lateral view of the mandible is consistent with previous studies in which the landmarks visible from the lateral view of the mandible were important in detecting sexual dimorphism [61–63] and inter-period variation [64]. A closer look at the x projection shows that the anatomically distinguishable projections of bone, i.e., the angular, condylar, and coronoid processes, are highlighted. For all groups except Hylobatidae, the angular process is highlighted, while the condylar process is emphasized for Cercopithecidae, Hylobatidae, and Hominidae. The angular and coronoid processes provide insertion sites for the medial pterygoid and temporalis, respectively, both of which are critical for producing bite force [55]. The condylar process provides the temporomandibular joint, which works as the fulcrum during biting. The highlighted parts essentially correspond to key regions related to mastication; thus, it seems reasonable that they are highlighted. For Phocidae, the area around the coronoid process is emphasized. This is reasonable because a well-developed temporalis is a key feature of Carnivora, and the coronoid process, onto which the temporalis inserts, is notably enlarged compared with the other two processes.

As an application of the generative aspect of the model, the proposed model is demonstrated to complement a missing bone segment from an artificially cropped image (Fig 5) based on the remaining structure. The reconstruction is robust against cropping of the region around the teeth and the tip of the mandible (Figs 5C, 5H, and 5I) but sensitive to the lack of the mandibular joint, i.e., the coronoid and condylar processes (Figs 5D, 5K, and 5L). Both of these results are consistent with the results of the Score-CAM analysis (Fig 4A), in which the shapes of the bone processes contain relevant information about the overall shape of the mandible. The proposed model can reconstruct a missing segment from data with defects, i.e., data in which a part of the sample is missing or damaged, as is often the case with fossils, and can also classify and reduce the dimensionality of such datasets for visualization. This flexibility is in contrast to the landmark-based method, whose application to data with defects is difficult because the landmarks of the missing parts cannot be defined. A generative model based on a VAE has also been applied in jaw reconstructive surgery for completing missing segments of bone based on the remaining healthy structure [65]. The proposed architecture, which combines a VAE and a classifier module, provides a new framework for reconstructing missing bone segments while performing dimensionality reduction for visualization and classification. In summary, the proposed model enables dimensionality reduction and feature extraction by which data with different labels are well separated, providing a promising approach for analyzing morphological datasets in biology. Although the model is designed for image input data, a combination with the landmark-based method is possible; for instance, the model output through the Score-CAM analysis (Fig 4) can be used to define landmark positions in a systematic manner. The model can also be extended to color image input instead of mask images, which would allow advanced analyses that integrate texture information, such as bone density and fine bone depressions, into shape information. The current model is based on 2D projected images of a 3D object and can be extended in the future to 3D input data, which would provide deeper analysis and a higher resolution of the reconstructed image, but would also require a huge dataset.

METHODS

Data Sets and Data Preprocessing

3D computed tomography (CT) morphological data of primate mandibles are collected from the Primate Research Institute, Kyoto University (KUPRI) and MorphoSource.org. Phocidae, available from MorphoSource.org, is used as an outgroup. Additionally, datasets consisting of three images of the mandible captured from three orthogonal directions (i.e., top, front, and side views) are collected from the Mammalian Crania Photographic Archive Second Edition (MCPA2). A total of 147 mandible datasets (87 Cercopithecidae, 6 Cebidae, 6 Lemuridae, 6 Hylobatidae, 6 Atelidae, 30 Hominidae, and 6 Phocidae) are collected (Table 1). Samples are restricted to full adults with no abnormalities in appearance.

Table 1. Family distribution

Since more than 10^4 datasets are typically required for machine learning on 3D images [66, 67], 2D machine learning is applied instead of 3D by converting the 3D mandible image data into three 2D images (i.e., top, front, and side views). As shown in SFig 1A, the mandible is aligned such that its teeth face downward, and the xy plane is defined as the plane to which the base of the mandible is parallel. Next, the position of the mandible is adjusted such that the line connecting the center of the two medial tips of the condylar head and the mandible tip is parallel to the y-axis. Since the mandibles of all animals collected in this study are left-right symmetrical, each mandible is divided into two pieces at the center of the mandible tip to increase the number of datasets; moreover, one part is mirror-image inverted. The divided mandible, placed in the xyz space, is then converted into a set of three 2D images with a size of 128 × 128 pixels by projection onto the yz (x projection), xz (y projection), and xy (z projection) planes. In addition, the length along the y-axis in the x projection of all mandibles is normalized to be the same to avoid size dependency of the data (SFig 1).
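A sketch of this preprocessing step, with assumed axis conventions and with the resizing to 128 × 128 pixels and the length normalization omitted:

```python
import numpy as np

def split_and_project(volume):
    """Sketch of the preprocessing: split a binary mandible volume at the
    mid-sagittal plane, mirror one half, and project each half onto the yz
    (x projection), xz (y projection), and xy (z projection) planes.
    Axis conventions are assumptions; resizing each projection to 128 x 128
    pixels and the length normalization are omitted here."""
    mid = volume.shape[0] // 2
    halves = [volume[:mid], volume[mid:][::-1]]  # second half mirror-inverted
    projections = []
    for half in halves:
        proj_x = half.max(axis=0)  # onto the yz plane
        proj_y = half.max(axis=1)  # onto the xz plane
        proj_z = half.max(axis=2)  # onto the xy plane
        projections.append((proj_x, proj_y, proj_z))
    return projections
```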

Model Description

This study aims to extract low-dimensional image features while ensuring the ability to classify the mandible images into families. To this end, Morpho-VAE (Fig 1B), a novel VAE-based model, is proposed in which a VAE module is combined with a classifier module through the latent variable ζ. Similar to a conventional VAE, the VAE module of the Morpho-VAE comprises an l-layer convolutional neural network as the encoder and an l-layer deconvolutional neural network as the decoder. The encoder reduces the input data into the low-dimensional latent variable ζ, converting the input image into the mean μ and variance σ of a multidimensional normal distribution; the latent variable ζ is then sampled from the distribution N(μ, σ). The decoder reconstructs the low-dimensional latent variable ζ into an output image with the same resolution as the input image. The network is trained such that the output image is as close as possible to the input data by optimizing the reconstruction loss ERec (see below). The distinct feature of the Morpho-VAE is that the VAE module is combined with a classifier module, in which a single-layer network converts the low-dimensional latent variable ζ into an output vector for classification through the softmax activation function (Fig 1B). Therefore, the Morpho-VAE has two outputs: the output image for reconstruction and the output vector for classification. The classifier module is trained to predict the label from the input data via the latent variable ζ in a supervised manner. Herein, family-level classification from the input image is considered; thus, the training labels are Cercopithecidae, Hominidae, Cebidae, Lemuridae, Hylobatidae, Phocidae, and Atelidae. A more detailed architecture of the Morpho-VAE is shown in SFig 2D.
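The following Keras sketch illustrates this architecture under several assumptions (the three projections entering as image channels, stride-2 convolutions, and the filter counts reported in the Hyperparameter Tuning section); it is an illustration, not the authors' implementation:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

LATENT_DIM, N_CLASSES = 3, 7

# Encoder: stacked convolutions -> (mu, log_var) of the latent distribution.
# Filter counts follow the tuned values reported below (128, 128, 32, 32, 64).
inp = layers.Input(shape=(128, 128, 3))  # projections as channels (assumption)
h = inp
for f in (128, 128, 32, 32, 64):
    h = layers.Conv2D(f, 3, strides=2, padding="same", activation="relu")(h)
h = layers.Flatten()(h)
mu = layers.Dense(LATENT_DIM)(h)
log_var = layers.Dense(LATENT_DIM)(h)

# Reparameterization: zeta = mu + sigma * eps with eps ~ N(0, I).
zeta = layers.Lambda(
    lambda p: p[0] + tf.exp(0.5 * p[1]) * tf.random.normal(tf.shape(p[0]))
)([mu, log_var])

# Decoder: transposed convolutions back to a 128 x 128 image; sigmoid
# activations as reported in the Hyperparameter Tuning section.
h = layers.Dense(4 * 4 * 64, activation="sigmoid")(zeta)
h = layers.Reshape((4, 4, 64))(h)
for f in (64, 32, 32, 128, 128):
    h = layers.Conv2DTranspose(f, 3, strides=2, padding="same",
                               activation="sigmoid")(h)
out_img = layers.Conv2D(3, 3, padding="same", activation="sigmoid")(h)

# Classifier: a single softmax layer on the latent variable zeta.
out_cls = layers.Dense(N_CLASSES, activation="softmax")(zeta)

morpho_vae = Model(inp, [out_img, out_cls])
```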

The loss function Etotal required to train the proposed Morpho-VAE comprises the following terms:

  1. Reconstruction loss (ERec): the binary cross entropy between the input and output images, expressed as ERec = −Σi [pi log qi + (1 − pi) log(1 − qi)], where p and q are the input and output image vectors, respectively.

  2. Regularization loss (EReg): the Kullback–Leibler divergence DKL(q(ζ|X)‖p(ζ)) between the data distribution in the latent space q(ζ|X), encoded by the encoder from the data X, and the predefined reference distribution p(ζ), which is fixed as a Gaussian distribution with mean 0 and variance 1.

  3. Classification loss (EC): the cross entropy between the predicted label vector y′ and the true label vector y, obtained from the latent variable ζ through the classifier module, expressed as EC = −Σi yi log y′i.

From these three loss functions, the VAE loss function is defined as EVAE = ERec + EReg, and the total loss function as Etotal = (1 − α)EVAE + αEC, where α = 0.1 is selected by cross-validation (Fig 1C). The Morpho-VAE is trained to minimize Etotal by backpropagation.
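A sketch of how the three terms could be combined, using the closed-form KL divergence of a diagonal Gaussian against N(0, 1); `mu` and `log_var` are assumed to be the encoder outputs:

```python
import tensorflow as tf

ALPHA = 0.1  # ratio selected by cross-validation (Fig 1C)

def morpho_vae_loss(x, y_true, x_rec, y_pred, mu, log_var):
    """Sketch of E_total = (1 - alpha) * (E_Rec + E_Reg) + alpha * E_C."""
    # E_Rec: binary cross entropy between input and reconstructed pixels.
    e_rec = tf.reduce_sum(tf.keras.losses.binary_crossentropy(x, x_rec))
    # E_Reg: KL divergence between N(mu, sigma^2) and N(0, 1) in closed form.
    e_reg = -0.5 * tf.reduce_sum(1.0 + log_var - tf.square(mu) - tf.exp(log_var))
    # E_C: cross entropy between true and predicted (one-hot) label vectors.
    e_c = tf.reduce_sum(tf.keras.losses.categorical_crossentropy(y_true, y_pred))
    e_vae = e_rec + e_reg
    return (1.0 - ALPHA) * e_vae + ALPHA * e_c
```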

Hyperparameter Tuning

The structural hyperparameters of Morpho-VAE, such as the number of layers, number of filters in each layer, type of activation function, and type of optimization function, are tuned by Optuna [68].

The number of layers is optimized within the range of one to five, and the number of filters in each layer within the range of 16 to 128. The activation functions are selected from ReLU, sigmoid, and tanh, and the optimization function from stochastic gradient descent, adaptive momentum estimation (Adam), and RMSprop. Note that the latent space dimension is fixed to three in these processes. These optimizations are performed by searching 500 different conditions, each with 100 epochs of training, and the following parameters are identified as the optimal hyperparameters minimizing the loss function Etotal (the others are listed in STable 2). The number of layers in the encoder is five, and the numbers of filters in each layer are 128, 128, 32, 32, and 64 in order from the layer nearest the input layer. The selected activation and optimization functions are ReLU and RMSprop, respectively. The number of layers in the decoder is also five, with 64, 32, 32, 128, and 128 filters in order from the layer nearest the latent variable, and the optimization function is again RMSprop. Note that sigmoid is adopted instead of ReLU as the activation function of the decoder because the input image of this model is a binary image in the range [0, 1], and the output image needs to be in the same range.
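The search described above maps naturally onto Optuna's API, as in the following sketch; `build_and_train` is a hypothetical helper that builds the model, trains it for 100 epochs, and returns the minimum Etotal:

```python
import optuna

def objective(trial):
    # Search space described in the text; build_and_train is a hypothetical
    # helper returning the minimum E_total after 100 epochs of training.
    n_layers = trial.suggest_int("n_layers", 1, 5)
    filters = [trial.suggest_int(f"filters_{i}", 16, 128) for i in range(n_layers)]
    activation = trial.suggest_categorical("activation", ["relu", "sigmoid", "tanh"])
    optimizer = trial.suggest_categorical("optimizer", ["sgd", "adam", "rmsprop"])
    return build_and_train(n_layers, filters, activation, optimizer, epochs=100)

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=500)  # 500 conditions, as in the text
print(study.best_params)
```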

After tuning the structural hyperparameters, the dimension of the latent variable ζ is also explored. The number of dimensions of the latent variable is examined from 2 to 10 through 100 independent 100-epoch training runs with different training-validation datasets for each dimension. SFig 2A shows that the mean and median of the minimum of Etotal in each 100-epoch training run decrease as the dimension increases from two to eight. Since our aim is to select a low-dimensional feature ζ that yields a low Etotal, a dimension of three is adopted, for which only a slight increase in the loss value appears compared with dimensions ≥ 4, whereas a clear drop is observed between dimensions two and three (SFigs 2A and 2B).

A double cross-validation procedure [69] is used to separate the data into training, validation, and test data. One-third of the total data is used as test data to evaluate the generalization performance of the Morpho-VAE. Of the remaining data, 75% is used as training data for tuning the hyperparameters of the Morpho-VAE and the remaining 25% as validation data for verifying the hyperparameters. Since the dataset collected in this study has a class imbalance, as listed in Table 1, it is divided into training and test data using the proportional extraction method, which divides the data in proportion to the sample size of each label. Note that two datasets are obtained from one mandible sample (see Datasets section), but the data are distributed such that the same sample is never included in both the test and training data, to avoid leakage.
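A sketch of this two-stage, group-aware split using scikit-learn, with random placeholders for the images, labels, and specimen IDs; the stratified proportional extraction, which this sketch omits, could be added with, e.g., StratifiedGroupKFold:

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Placeholders: X images, y family labels, groups specimen IDs pairing the
# two mirrored halves of each mandible so they never straddle a boundary.
rng = np.random.default_rng(0)
X = rng.random((294, 128, 128, 3))
y = rng.integers(0, 7, 294)
groups = np.repeat(np.arange(147), 2)  # two halves per specimen

# Outer split: one-third of the data held out as test data.
gss_outer = GroupShuffleSplit(n_splits=1, test_size=1 / 3, random_state=0)
(train_val_idx, test_idx), = gss_outer.split(X, y, groups=groups)

# Inner split: 75% training / 25% validation, again grouped by specimen.
gss_inner = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
(train_idx, val_idx), = gss_inner.split(
    X[train_val_idx], y[train_val_idx], groups=groups[train_val_idx]
)
```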

Visualization of the Saliency map (Fig 4A) by Score-CAM

The Score-CAM [45] method is applied to visualize how the Morpho-VAE makes its decisions. A schematic overview of Score-CAM is given in SFig 3. First, each 8 × 8 pixel activation map, obtained from the last layer of the convolution layers of the encoder, is upsampled to a 128 × 128 pixel image and normalized such that the maximum and minimum pixel intensities are 1 and 0, respectively. Each pixel intensity of this image is then multiplied by the intensity of the corresponding pixel in the 128 × 128 original input image to create a masking image. This masking image is re-inputted into the Morpho-VAE, and the prediction probability for the label of the input image is calculated through the classifier module. Since the calculated prediction probability can be interpreted as the importance of the masking image, this probability is multiplied by the activation map, and the final outcome of Score-CAM (Fig 4), the "saliency map", is obtained by taking the sum over the number of filters (here, 64).
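The procedure can be sketched as follows, assuming `model_act` is a sub-model exposing the last convolution activations and `model_full` is the full Morpho-VAE whose second output is the classifier probabilities; the final ReLU follows the original Score-CAM and is our assumption here:

```python
import numpy as np
import tensorflow as tf

def score_cam(model_act, model_full, x, class_idx):
    """Minimal Score-CAM sketch following the steps described above.

    model_act  : sub-model returning the last conv activations, (1, 8, 8, 64)
    model_full : full Morpho-VAE whose classifier output is the 2nd output
    x          : one input image, shape (1, 128, 128, channels)
    """
    acts = model_act.predict(x)[0]                     # (8, 8, 64)
    n_filters = acts.shape[-1]
    weights = np.zeros(n_filters)
    maps = np.zeros((128, 128, n_filters))
    for k in range(n_filters):
        # Upsample the k-th 8x8 activation map to 128x128.
        up = tf.image.resize(acts[..., k:k + 1][None], (128, 128)).numpy()[0, ..., 0]
        # Normalize to [0, 1].
        if up.max() > up.min():
            up = (up - up.min()) / (up.max() - up.min())
        maps[..., k] = up
        # Mask the input and read off the prediction probability of the class.
        masked = x * up[None, ..., None]
        weights[k] = model_full.predict(masked)[1][0, class_idx]
    # Weighted sum over filters gives the saliency map; final ReLU as in the
    # original Score-CAM (an assumption of this sketch).
    return np.maximum((maps * weights).sum(axis=-1), 0)
```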

SUPPLEMENTAL FIGURES

SFig 1. Detailed description of image preprocessing.

(A) The xy plane is defined as a plane to which the base of the mandible is horizontal. (B) After placing the mandible in the xy plane, it is rotated so that the mandible tip and the mid-point of the line connecting the condylar head’s left and right medial tips have the same y-coordinate. (C) The mandible (arranged as shown) is projected from three orthogonal directions. (D) The size of the data is normalized such that the length from the angular process to the tip of the mandible is constant.

SFig 2. Latent dimension tuning and detailed architecture of the proposed Morpho-VAE

(A) Mean, standard deviation, and median values of Etotal for dimensions 2–10. These values are calculated from the 10 independent architectures listed in STable 2. (B) Classification accuracy for dimensions 2–10, calculated from the 10 independent architectures. The selected number of dimensions is 3. (C and D) Trajectories of the training and validation losses for ERec and EC during training. (E) Detailed architecture of the Morpho-VAE: The upper and lower sides of the figure are the input and output layers, respectively.

SFig 3. Schematic image for creating saliency map.

First, the output images with 64 channels obtained from the last convolution layer of the Morpho-VAE for a set of input images are prepared. Then, the output images are upsampled to the same size as the input images. The resulting 64 outputs are normalized using the maximum and minimum values of each image. These 64 normalized output images are multiplied by the original input images to create 64 masked images. Each of these is then inputted to the Morpho-VAE to calculate the probabilities of the seven classes. Each of the 64 probability values is considered as the importance of the corresponding output. Finally, the importances and outputs are multiplied and summed to obtain the saliency map.

SFig 4. Example of reconstruction failure when crop rate is significantly large.

A significant defect in the coronoid process results in the failure of the reconstruction, where the reconstructed image differs greatly from the original image.

SFig 5. Correlation between the distance in the Morpho-VAE latent space and the phylogenetic tree

The vertical axis represents the Euclidean distance between clusters in the latent space of the Morpho-VAE, and the horizontal axis represents the distance within the phylogenetic tree. The error bars indicate the mean and standard deviations for each of the 10 tuned models listed in STable 2.

TABLES

STable 1: List of the mandible data and references
STable 2: All model hyperparameters

ACKNOWLEDGMENTS

We thank Y. Kondo, K. Aoki, Y. Himeoka, J. Iwasawa, Y. Uchida, H. Higashiyama, A. Tokuhisa, K. Terayama, and Y. Okuno for meaningful discussions, as well as M. Hasebe, T. Fujimori, T. Ueda, S. Yoshida, S. Takada, and K. Agata for supportive comments. We are grateful to T.D. Nishimura and the Primate Research Institute of Kyoto University for their help in collecting CT data. This research was supported by the Joint Research of the Exploratory Research Center on Life and Living Systems (ExCELLS) (ExCELLS program no. 21-319 and no. 21-102 to N.S.). This research was also supported in part by JSPS KAKENHI (17H06389 to C.F.) and JST ERATO (JPMJER1902 to C.F.). Masato Tsutsumi was supported by the RIKEN Junior Research Associate Program.

Footnotes

  • * nensaito@hiroshima-u.ac.jp

  • † furusawa@ubi.s.u-tokyo.ac.jp

References

[1] F. L. Bookstein, Morphometric tools for landmark data: geometry and biology (Cambridge University Press, 1991).
[2] D. C. Adams, F. J. Rohlf, and D. E. Slice, A field comes of age: geometric morphometrics in the 21st century, Hystrix, the Italian Journal of Mammalogy 24, 7 (2013).
[3] P. Mitteroecker and P. Gunz, Advances in geometric morphometrics, Evolutionary Biology 36, 235 (2009).
[4] F. James Rohlf and L. F. Marcus, A revolution in morphometrics, Trends in Ecology & Evolution 8, 129 (1993).
[5] M. L. Zelditch, D. L. Swiderski, H. D. Sheets, and W. L. Fink, Geometric morphometrics for biologists: a primer (Academic Press, San Diego, 2004) pp. 1–20.
[6] N. M. Young, D. Hu, A. J. Lainoff, F. J. Smith, R. Diaz, A. S. Tucker, P. A. Trainor, R. A. Schneider, B. Hallgrímsson, and R. S. Marcucio, Embryonic bauplans and the developmental origins of facial diversity and constraint, Development (Cambridge) 141, 1059 (2014).
[7] A. Loy, S. Busilacchi, C. Costa, L. Ferlin, and S. Cataudella, Comparing geometric morphometrics and outline fitting methods to monitor fish shape variability of Diplodus puntazzo (Teleostea: Sparidae), Aquacultural Engineering 21, 271 (2000).
[8] E. Sherratt, D. J. Gower, C. P. Klingenberg, and M. Wilkinson, Evolution of cranial shape in caecilians (Amphibia: Gymnophiona), Evolutionary Biology 41, 528 (2014).
[9] S. B. Cooke and C. E. Terhune, Form, function, and geometric morphometrics, The Anatomical Record 298, 5 (2015).
[10] R. Ledevin and D. Koyabu, Patterns and constraints of craniofacial variation in colobine monkeys: disentangling the effects of phylogeny, allometry and diet, Evolutionary Biology 46 (2019).
[11] D. Koyabu, M. Hosojima, and H. Endo, Into the dark: patterns of middle ear adaptations in subterranean eulipotyphlan mammals, Royal Society Open Science 4, 170608 (2017).
[12] T. Ito and D. Koyabu, Biogeographic variation in skull morphology across the Kra Isthmus in dusky leaf monkeys, Journal of Zoological Systematics and Evolutionary Research 56, 599 (2018).
[13] A. Tofilski, Using geometric morphometrics and standard morphometry to discriminate three honeybee subspecies, Apidologie 39, 558 (2008).
[14] T. K. Suzuki, Modularity of a leaf moth-wing pattern and a versatile characteristic of the wing-pattern ground plan, BMC Evolutionary Biology 13, 158 (2013).
[15] C. Fernández-Montraveta and J. Marugán-Lobón, Geometric morphometrics reveals sex-differential shape allometry in a spider, PeerJ 5, e3617 (2017).
[16] J. Ren, M. Bai, X.-K. Yang, R.-Z. Zhang, and S.-Q. Ge, Geometric morphometrics analysis of the hind wing of leaf beetles: proximal and distal parts are separate modules, ZooKeys, 131 (2017).
[17] L. C. Anderson and P. D. Roopnarine, Role of constraint and selection in the morphologic evolution of Caryocorbula (Mollusca: Corbulidae) from the Caribbean Neogene, Palaeontologia Electronica 8, 1 (2005).
[18] J. M. Serb, A. Alejandrino, E. Otárola-Castillo, and D. C. Adams, Morphological convergence of shell shape in distantly related scallop species (Mollusca: Pectinidae), Zoological Journal of the Linnean Society 163, 571 (2011).
[19] I. Leyva-Valencia, S. T. Álvarez-Castañeda, D. B. Lluch-Cota, S. González-Peláez, S. Pérez-Valencia, B. Vadopalas, S. Ramírez-Pérez, and P. Cruz-Hernández, Shell shape differences between two Panopea species and phenotypic variation among P. globosa at different sites using two geometric morphometrics approaches, Malacologia 55, 1 (2012).
[20] T. van der Niet, C. P. Zollikofer, M. S. P. de León, S. D. Johnson, and H. P. Linder, Three-dimensional geometric morphometrics for studying floral shape variation, Trends in Plant Science 15, 423 (2010).
[21] V. Viscosi and A. Cardini, Leaf morphology, taxonomy and geometric morphometrics: a simplified protocol for beginners, PLOS ONE 6, 1 (2011).
[22] P. Gunz and P. Mitteroecker, Semilandmarks: a method for quantifying curves and surfaces, Hystrix, the Italian Journal of Mammalogy 24, 103 (2013).
[23] D. C. Adams, F. J. Rohlf, and D. E. Slice, Geometric morphometrics: ten years of progress following the 'revolution', Italian Journal of Zoology 71, 5 (2004).
[24] A. Watanabe, How many landmarks are enough to characterize shape and size variation?, PLOS ONE 13, 1 (2018).
[25] C. Fruciano, M. A. Celik, K. Butler, T. Dooley, V. Weisbecker, and M. J. Phillips, Sharing is caring? Measurement error and the issues arising from combining 3D morphometric datasets, Ecology and Evolution 7, 7034 (2017).
[26] B. M. Shearer, S. B. Cooke, L. B. Halenar, S. L. Reber, J. E. Plummer, E. Delson, and M. Tallman, Evaluating causes of error in landmark-based data collection using scanners, PLOS ONE 12, 1 (2017).
[27] F. P. Kuhl and C. R. Giardina, Elliptic Fourier features of a closed contour, Computer Graphics and Image Processing 18, 236 (1982).
[28] P. E. Lestrel, Fourier descriptors and their applications in biology (Cambridge University Press, 1997).
[29] G. Diaz, A. Zuccarelli, I. Pelligra, and A. Ghiani, Elliptic Fourier analysis of cell and nuclear shapes, Computers and Biomedical Research 22, 405 (1989).
[30] L. Tweedy, B. Meier, J. Stephan, D. Heinrich, and R. G. Endres, Distinct cell shapes determine accurate chemotaxis, Scientific Reports 3, 1 (2013).
[31] J. S. Crampton, Elliptic Fourier shape analysis of fossil bivalves: some practical considerations, Lethaia 28, 179 (1995).
[32] S. R. Tracey, J. M. Lyle, and G. Duhamel, Application of elliptical Fourier analysis of otolith form as a tool for stock identification, Fisheries Research 77, 138 (2006).
[33] C. Costa, F. Antonucci, C. Boglione, P. Menesatti, M. Vandeputte, and B. Chatain, Automated sorting for size, sex and skeletal anomalies of cultured seabass using external shape analysis, Aquacultural Engineering 52, 58 (2013).
[34] R. J. White, H. C. Prentice, and T. Verwijst, Automated image acquisition and morphometric description, Canadian Journal of Botany 66, 450 (1988).
[35] J. C. Neto, G. E. Meyer, D. D. Jones, and A. K. Samal, Plant species identification using elliptic Fourier leaf shape analysis, Computers and Electronics in Agriculture 50, 121 (2006).
[36] H. Iwata, K. Ebana, Y. Uga, T. Hayashi, and J.-L. Jannink, Genome-wide association study of grain shape variation among Oryza sativa L. germplasms based on elliptic Fourier analysis, Molecular Breeding 25, 203 (2010).
[37] D. Ciregan, U. Meier, and J. Schmidhuber, Multi-column deep neural networks for image classification, in 2012 IEEE Conference on Computer Vision and Pattern Recognition (2012) pp. 3642–3649.
[38] A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, in Advances in Neural Information Processing Systems, Vol. 25, edited by F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger (Curran Associates, Inc., 2012).
[39] S. Wang and R. M. Summers, Machine learning and radiology, Medical Image Analysis 16, 933 (2012).
[40] A. S. Lundervold and A. Lundervold, An overview of deep learning in medical imaging focusing on MRI, Zeitschrift für Medizinische Physik 29, 102 (2019).
[41] J. F. H. Cuthill, N. Guttenberg, S. Ledger, R. Crowther, and B. Huertas, Deep learning on butterfly phenotypes tests evolution's oldest mathematical model, Science Advances 5, eaaw4967 (2019).
[42] M. Quenu, S. A. Trewick, F. Brescia, and M. Morgan-Richards, Geometric morphometrics and machine learning challenge currently accepted species limits of the land snail Placostylus (Pulmonata: Bothriembryontidae) on the Isle of Pines, New Caledonia, Journal of Molluscan Studies 86, 35 (2020).
[43] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, Grad-CAM: visual explanations from deep networks via gradient-based localization, in Proceedings of the IEEE International Conference on Computer Vision (2017) pp. 618–626.
[44] A. Chattopadhay, A. Sarkar, P. Howlader, and V. N. Balasubramanian, Grad-CAM++: generalized gradient-based visual explanations for deep convolutional networks, in 2018 IEEE Winter Conference on Applications of Computer Vision (WACV) (IEEE, 2018) pp. 839–847.
[45] H. Wang, Z. Wang, M. Du, F. Yang, Z. Zhang, S. Ding, P. Mardziel, and X. Hu, Score-CAM: score-weighted visual explanations for convolutional neural networks, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2020) pp. 24–25.
[46] D. P. Kingma and M. Welling, Auto-encoding variational Bayes, CoRR abs/1312.6114 (2013).
[47] R. T. Chen, X. Li, R. B. Grosse, and D. K. Duvenaud, Isolating sources of disentanglement in variational autoencoders, Advances in Neural Information Processing Systems 31 (2018).
[48] T. Bepler, E. Zhong, K. Kelley, E. Brignole, and B. Berger, Explicitly disentangling image content from translation and rotation with spatial-VAE, Advances in Neural Information Processing Systems 32 (2019).
[49] D. Bank, N. Koenigstein, and R. Giryes, Autoencoders, arXiv preprint arXiv:2003.05991 (2020).
[50] W. L. Hylander, The functional significance of primate mandibular form, Journal of Morphology 160, 223 (1979).
[51] W. L. Hylander, Mandibular function in Galago crassicaudatus and Macaca fascicularis: an in vivo approach to stress analysis of the mandible, Journal of Morphology 159, 253 (1979).
[52] D. J. Daegling, Mandibular morphology and diet in the genus Cebus, International Journal of Primatology 13, 545 (1992).
[53] D. J. Daegling and W. S. McGraw, Functional morphology of the mangabey mandibular corpus: relationship to dental specializations and feeding behavior, American Journal of Physical Anthropology 134, 50 (2007).
[54] W. S. Greaves, The mammalian jaw mechanism–the high glenoid cavity, The American Naturalist 116, 432 (1980).
[55] S. W. Herring, Functional morphology of mammalian mastication, American Zoologist 33, 289 (1993).
[56] D. L. Davies and D. W. Bouldin, A cluster separation measure, IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-1, 224 (1979).
[57] D. P. Kingma, D. J. Rezende, S. Mohamed, and M. Welling, Semi-supervised learning with deep generative models, in Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, NIPS'14 (MIT Press, Cambridge, MA, USA, 2014) pp. 3581–3589.
[58] X. Zhu and A. B. Goldberg, Introduction to semi-supervised learning, Synthesis Lectures on Artificial Intelligence and Machine Learning 3, 1 (2009).
[59] J. E. van Engelen and H. H. Hoos, A survey on semi-supervised learning, Machine Learning 109, 373 (2020).
[60] N. S. Upham, J. A. Esselstyn, and W. Jetz, Inferring the mammal tree: species-level sets of phylogenies for questions in ecology, evolution, and conservation, PLOS Biology 17, 1 (2019).
[61] S. R. Loth and M. Henneberg, Sexually dimorphic mandibular morphology in the first few years of life, American Journal of Physical Anthropology 115, 179 (2001).
[62] M. Schmittbuhl, J.-M. Le Minor, A. Schaaf, and P. Mangin, The human mandible in lateral view: elliptical Fourier descriptors of the outline and their morphological analysis, Annals of Anatomy - Anatomischer Anzeiger 184, 199 (2002).
[63] M. Coquerelle, F. L. Bookstein, J. Braga, D. J. Halazonetis, G. W. Weber, and P. Mitteroecker, Sexual dimorphism of the human mandible and its association with dental development, American Journal of Physical Anthropology 145, 192 (2011), https://onlinelibrary.wiley.com/doi/pdf/10.1002/ajpa.21485.
[64] A. Pokhojaev, H. Avni, T. Sella-Tunis, R. Sarig, and H. May, Changes in human mandibular shape during the terminal Pleistocene-Holocene Levant, Scientific Reports 9, 1 (2019).
[65] A. H. Abdi, M. Pesteie, E. Prisman, P. Abolmaesumi, and S. Fels, Variational shape completion for virtual planning of jaw reconstructive surgery, in Medical Image Computing and Computer Assisted Intervention – MICCAI 2019, edited by D. Shen, T. Liu, T. M. Peters, L. H. Staib, C. Essert, S. Zhou, P.-T. Yap, and A. Khan (Springer International Publishing, Cham, 2019) pp. 227–235.
[66] Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang, and J. Xiao, 3D ShapeNets: a deep representation for volumetric shapes, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2015) pp. 1912–1920.
[67] J. Wu, C. Zhang, T. Xue, W. T. Freeman, and J. B. Tenenbaum, Learning a probabilistic latent space of object shapes via 3D generative-adversarial modeling, in Advances in Neural Information Processing Systems (2016) pp. 82–90.
  68. [68].↵
    T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama, Optuna: A next-generation hyper-parameter optimization framework, in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (2019) pp. 2623–2631.
  69. [69].↵
    M. Stone, Cross-validatory choice and assessment of statistical predictions, Journal of the royal statistical society: Series B (Methodological) 36, 111 (1974).
    OpenUrlCrossRefWeb of Science
  70. [70].
    C. P. Klingenberg, L. J. Leamy, E. J. Routman, and J. M. Cheverud, Genetic architecture of mandible shape in mice: effects of quantitative trait loci analyzed by geometric morphometrics, Genetics 157, 785 (2001).
    OpenUrlAbstract/FREE Full Text
  71. [71].
    W. R. Atchley and B. K. Hall, A model for development and evolution of complex morphological structures, Biological Reviews 66, 101 (1991).
    OpenUrlCrossRefPubMedWeb of Science
  72. [72].
    M. Schmittbuhl, J. Le Minor, F. Taroni, and P. Mangin, Sexual dimorphism of the human mandible: demonstration by elliptical fourier analysis, International journal of legal medicine 115, 100 (2001).
    OpenUrlCrossRefPubMed
  73. [73].
    J. Caple, J. Byrd, and C. N. Stephan, Elliptical fourier analysis: fundamentals, applications, and value for forensic anthropology, International Journal of Legal Medicine 131, 1675 (2017).
    OpenUrl
Posted May 19, 2022.