Machine-learning-based predictions of caloric restriction associations across ageing-related genes

Caloric restriction (CR) is the most studied pro-longevity intervention; however, a complete understanding of its underlying mechanisms remains elusive, and new research directions may emerge from the identification of novel CR-related genes and CR-related genetic features. This work used a Machine Learning (ML) approach to classify ageing-related genes as CR-related or NotCR-related using 9 different types of predictive features: PathDIP pathways, two types of features based on KEGG pathways, two types of Protein-Protein Interactions (PPI) features, Gene Ontology (GO) terms, Genotype-Tissue Expression (GTEx) expression features, Gene-Friends co-expression features and protein sequence descriptors. Our findings suggested that features biased towards curated knowledge (i.e. GO terms and biological pathways), had the greatest predictive power, while unbiased features (mainly gene expression and co-expression data) have the least predictive power. Moreover, a combination of all the feature types diminished the predictive power compared to predictions based on curated knowledge. Feature importance analysis on the two most predictive classifiers mostly corroborated existing knowledge and supported recent findings linking CR to the Nuclear Factor Erythroid 2-Related Factor 2 (NRF2) signalling pathway and G protein-coupled receptors (GPCR). We then used the two strongest combinations of feature type and ML algorithm to predict CR-relatedness among ageing-related genes currently lacking CR-related annotations in the data, resulting in a set of promising candidate CR-related genes (GOT2, GOT1, TSC1, CTH, GCLM, IRS2 and SESN2) whose predicted CR-relatedness remain to be validated in future wet-lab experiments.


Introduction
Ageing increases the risk of disease and death as it declines homeostasis and decreases the capacity to respond to environmental stimuli (MacNee, 2016). Given the widespread interest in reversing and ultimately preventing the detrimental effects of ageing, considerable effort has been devoted to understanding its underlying biochemical mechanisms (de Magalhães et al., 2012). It is known that ageing-related changes are multifactorial and involve a variety of processes, including genomic instability, telomere attrition, epigenetic alterations, loss of proteostasis, impaired nutrient sensing, mitochondrial dysfunction, cellular senescence, stem cell exhaustion, and altered intracellular communication (López-Otín et al., 2013).
Caloric Restriction (CR), which involves reducing total dietary energy intake while maintaining adequate vitamin and mineral levels, is currently the most promising intervention for increasing both lifespan and healthspan, as experiments with a variety of species have shown that CR not only induces longevity but also retards the ageing process (de Magalhães et al., 2012). Although the mechanism underlying these pro-longevity effects is unknown, evidence suggests that CR: 1) reduces oxidative damage by reducing the production of Reactive Oxygen Species (ROS); 2) decreases circulating insulin and glucose levels, resulting in decreased cell growth and division and a shift toward maintenance and repair; and 3) decreases growth hormone and insulin-like growth factor levels (Gillespie et al., 2016).
Additionally, a wealth of publicly available omics data on ageing has emerged thanks to new high-throughput sequencing technologies (Wieser et al., 2011). Hence, a relatively recent approach for studying the ageing process is based on Machine Learning (ML) techniques that learn patterns about gene or protein functions by analyzing gene or protein features such as Gene Ontology (GO) terms, metabolic pathways, and protein-protein interactions, to name a few (Fabris et al., 2017). Examples include the association of human genes with ageing-related diseases (Fabris et al., 2020); prediction of gene deletion effects on yeast longevity (Huang et al., 2012); and the determination of blood age (Weidner et al., 2014); among others (Zhavoronkov et al., 2019).
This work aims to identify novel CR-related candidate genes from ageingrelated genes while also identifying genetic features which increase the likelihood that certain ageing-related genes become CR-related. To accomplish this, we created 11 datasets based on 9 different types of predictive features and two approaches to combine all those features into an integrated dataset. The 9 used feature types were: PathDIP pathways, two types of features based on KEGG pathways, two types of Protein-Protein Interactions (PPI) features, Gene Ontology (GO) terms, Genotype-Tissue Expression (GTEx) expression features, GeneFriends co-expression features and protein sequence descriptors. These datasets provide a wealth of information for representing each of the ageingrelated genes under study (i.e., genes retrieved from the GenAge database, as explained in Section 2.1.1). Then, a ML approach was used to predict whether or not an ageing-related gene is CR-related (i.e., using information from the thologs, only one gene from each set of repeated genes was retained, resulting in 1137 human ageing-related genes. GenDR  is a database of Dietary Restriction (DR)related genes. DR-essential genes are defined in GenDR as those which, if genetically modified, impair DR-mediated lifespan extension. In this work, 215 CR-associated genes from the aforementioned four model organisms were retrieved from the "Gene Manipulation" section of the GenDR database, build 4 (24/06/2017), and used as input for another OMA human orthologs query, which led to 152 human CR genes after keeping only one of each repeated ortholog gene coming from different organisms. The ageing-and CR-related human genes retrieval processes are summarised in Figure 1.A.
Finally, the overlap of the retrieved ageing-and CR-related human orthologs resulted in 115 genes (Figure 1.B) which were labelled as Ageing CR -related, while the remaining 1022 ageing-related genes that didn't overlap with CRrelated orthologs were labelled as Ageing N otCR -related. Ageing-and CR-related genes from mice, fruit fles, nematodes, and budding yeast were retrieved from GenAge and GenDR. Then, an OMA database query was performed to retrieve their corresponding human orthologs, which are the ageing-related and CR-related genes used in this work. B. Human ageing-and CR-related genes' overlap.

KEGG-Pertinence Dataset
This dataset consists of binary features, where each feature indicates whether or not an ageing-related gene (instance) belongs to a specific KEGG pathway. To construct this dataset, a data frame relating KEGG pathways with the genes they contain was retrieved by using the getGeneKEGGLinks command of the R's Lima package (Ritchie et al., 2015). Then, only the KEGG pathways that were associated with at least one of the ageing-related genes were retained, yielding 312 KEGG pathways. The resulting dataset contained 799 ageing-related genes: 94 labelled as Ageing CR -related and 705 labelled as Ageing N otCR -related.

KEGG-Influence Dataset
This approach was inspired by the feature-creation method proposed in (Fabris and Freitas, 2016). Instead of producing binary features like KEGG pertinence, that method examines the internal contents of each KEGG pathway to produce numerical features, where each feature value measures the extent to which each protein influenced all the other proteins in the pathway. Figure 2.A illustrates the influence that a reference protein (red node in the graph) exerts on other proteins (the remaining graph's nodes) within a given pathway. In essence, proteins coloured in dark blue can be "reached" only via upstream paths passing through the reference protein and thus are highly influenced by it. As the proteins in the pathway can be reached via more upstream paths not involving the reference protein, they become less influenced by it, and are represented by lighter blue colours. Proteins that receive no influence are coloured in white. Supplementary materials (S.1.1) contain a complete description of this dataset which contains 1770 features and 799 ageing-related genes, 94 of which are labelled as Ageing CR -related and 705 labelled as Ageing N otCR -related.

Protein-Protein Interaction (PPI) Adjacency Dataset
The human physical PPI network was downloaded from BioGrid (Release 3.5.185): BIOGRID-MV-Physical-3.5.181.tab2.zip (Stark et al., 2006). Interactions were then processed using the R's igraph library to create a graph object with a total of 25,292 gene products and 324,892 interactions. After removing loops and repeated edges, the graph consisted of 25,292 nodes and 92,237 edges. This graph contained 850 ageing-related genes, 86 of which were labelled as Ageing CRrelated while the remaining 764 as Ageing N otCR -related. Then, a dataset with binary features was extracted from the PPI-graph's adjacency matrix. The ijth element of this matrix takes the value 1 if the gene-products G i and G j are adjacent in the PPI-graph (i.e., if there is an edge connecting them), or 0 otherwise. We retained only ageing-related genes of the adjacency matrix as instances, whereas for features, we kept not only ageing-related genes but also genes not associated with ageing that directly interact with at least one of the ageing-related genes, yielding 5718 features. Figure 2.B depicts a representation of this dataset where an adjacency matrix is constructed from a graph.

PPI Graph Measures Dataset
This dataset was created from the same base PPI-graph used to create the PPIadjacency dataset, but now using as features only 18 graph measures applied to the 850 ageing-related genes within the PPI graph. These measures are classified as centrality-and non-centrality-based and were computed using the R's igraph and Centricerve libraries (Csardi and Nepusz, 2006;Jalili et al., 2015Jalili et al., , 2016. Following these libraries' documentation, the centrality measures used in this work were: Leverage, Markov, Maximum neighborhood component, Closeness, Betweenness, Laplacian, Diffusion, Semilocal, Subgraph, Geokpat, Eigenvalue, Eccentricity , Degree and Lobb centralities. In addition, we used the Kcore, CR-ratio, Clustering Coefficient and Topological Coefficient as non-centralitybased measures. A detailed description of all these measures, except CR-ratio, is available on the Centiserve webpage (Jalili et al., 2015). The CR-ratio is the ratio of the number of CR-related direct neighbours of the queried gene over the total number of neighbours of the queried gene (i.e., it describes what percentage of the queried-gene's direct neighbours are CR-related). Figure 2.C represents this dataset by displaying a graph whose nodes are coloured based on their degree centrality (the larger the number of neighbours, the higher the degree centrality and the darker the nodes' colour).

GO terms Dataset
In this dataset, each binary feature represents a specific GO term with which each ageing-related gene may or may not be associated. To accomplish this, a BioMart query retrieved a list of the Biological Process GO terms associated with the ageing-related genes, yielding 8640 different GO terms. The retrieved GO terms form a hierarchical structure with two properties: 1) if a gene is associated to a specific GO term, it will also be associated with all its ancestors (i.e., more general GO terms); and 2) if a gene is not associated to a given GO term, then the gene is not associated to any of its descendants. The GO term GO:0008150 (named 'Biological Process') is the root of the hierarchy, i.e., it is an ancestor of all other GO terms in the dataset. The R's GO.db package [Carlson, 2019] was used to retrieve all the ancestors of the originally retrieved GO terms. Those ancestors were then merged to this dataset as new predictive features. This process increased the number of GO terms across all the ageing-related genes from 4877 (i.e., without ancestors) to 8640 (i.e., containing both the original retrieved GO terms and their ancestors). Each GO term (considering both originally retrieved and ancestor GO terms) is represented as a binary feature, indicating whether or not a gene (instance) is annotated with that GO term. Out of the 1137 ageing-related genes, 13 genes were not associated with any GO term in BioMart and were removed. This produced a dataset composed of 1124 ageing-related genes: 114 labelled as Ageing CR and 1010 labelled as Ageing N otCR . Finally, due to the hierarchical structure of GO terms, any gene associated with a fixed GO term is also associated with all of the term's ancestors. Figure. 2.D illustrates this phenomenon by indicating that association with a fixed GO term (red node) implies association with all of its ancestors (orange nodes).

GTEx Dataset
The median expression levels of human genes across 55 different anatomical tissues were retrieved from the GTEx database (Carithers et al., 2015) (Analysis V8 database) by downloading the file corresponding to the median gene-level Transcripts Per Million (TPM) by tissue. Then, only ageing-related genes were retained, resulting in 1111 ageing-related genes: 114 labelled as Ageing CRrelated and 997 labelled as Ageing N otCR -related. The tissues' median TPM scores were then used as predictive features. A graphical representation of this dataset is illustrated in Figure 2.E through a heatmap of the mean expression of each single gene across different tissues.

Co-expression Dataset
The GeneFriends database (Dam et al., 2015) was used to generate co-expression profiles for 1048 ageing-related genes across a set of 44,946 genes that included both the 1048 ageing-related and other genes. The goal was to determine whether the co-expression profile of key genes across the ageing-related genes contributes to the association of certain age-related genes with CR. This dataset contained 106 and 942 Ageing CR -and Ageing N otCR -related genes, respectively. Figure 2.F illustrates a representation of this dataset, where the correlation between the expression of two genes across different samples is depicted.

Protein-Descriptor Dataset
This dataset contains numeric features associated with the proteins encoded by the ageing-related genes. Since each gene may code for either one or more pro-teins, this dataset differs from others in the sense that it provides information on ageing-related proteins rather than genes. The names and sequences of human proteins were obtained using the proteins command in R's ensembldb library, with the database EnsDb.Hsapiens.v86 (Rainer, 2017;Rainer et al., 2019) as the source. 1109 ageing-related genes encoded 6180 ageing-related proteins, from where 115 genes and 514 proteins were designated as Ageing CR -related, while the remaining 994 genes and 5666 proteins as Ageing N otCR -related. Additionally, features were computed from the amino acid sequences of the proteins using the R's protr and Peptides packages (Xiao et al., 2015;Osorio et al., 2015), which resulted in the features discussed in Supplementary materials (S.1.2).

Whole-Datasets
Two datasets were created that combine features from PathDIP, KEGG-Pertinence, KEGG-Influence, PPI adjacency matrix, PPI graph measures, GO terms, GTEx expression data, and Co-expression datasets. The protein-descriptors dataset was not included as it only provides information on proteins rather than genes, which complicates the gene-based merging process as proteins do not always have a one-to-one relationship with genes. We coined the term 'WholeDataset' to refer to the resulting dataset that combines all of the aforementioned features, yielding a total of 63,099 features and 1,137 ageing-related genes: 115 labelled as Ageing CR and 1022 labelled as Ageing N otCR .
Since the merged datasets had different numbers of ageing-related genes, the WholeDataset contained data gaps for ageing-related genes whose features were not annotated across all the datasets. We addressed this issue using two approaches, namely, imputation and intersection, which are described next and illustrated in Figure 2.G, where the combination of datasets results in genes with missing data (purple cells), which are then imputed (green cells) or removed to leave only genes containing features from all the datasets (intersected genes).
Whole-Dataset-imputation In this approach we used a 5 Nearest Neighbors (5NN) imputation method. For each ageing-related gene G that is missing a value for feature F, this method first determines the top five ageing-related genes that have a known value for F and are most similar (have the smallest Euclidean distance) to G in the training set. The Euclidean distance is computed using all the features for which the values of G are known. If F is a continuous feature, its missing value in G is imputed using the mean value of F across the 5NN of G in the training set. If F is binary, the missing value in G is imputed using the mode of F (i.e., its most frequent value) across the 5NN of G in the training set.
Whole-Dataset-intersection In this approach we only retained the ageingrelated genes that are present across every single dataset, except the proteindescriptors dataset. This guarantees the absence of any missing feature values. However, information is lost since only about half of the ageing-related genes had known values for all the features. The resulting dataset contained 628 ageing-related genes: 72 labelled as Ageing CR -related and 556 labelled as Ageing N otCR -related.

Machine Learning
This work focuses on decision tree-based ensembles, which are a type of powerful ML technique that combine the predictions of several base learners (decision trees) in order to improve predictive accuracy over a single base learner. This type of ensembles is usually categorised into two broad groups: (1) Bagging methods, where each base learner is trained independently from the others -so, the base learners are conceptually trained in parallel. In bagging methods, the predictive accuracy is usually improved due to the reduction of the variance in the ensemble's predictions, by comparison with the variance in the predictions of a single base learner.
(2) Boosting methods, where the base learners are trained sequentially, and each base learner is trained with instance weights that are determined in order to correct the errors of previous base learners in the sequence. This tends to reduce the bias in the predictions (Pedregosa et al., 2011). One challenge in this work is that Ageing N otCR -related genes are roughly tenfold more numerous than Ageing CR -related genes, resulting in an imbalanced data that biases ML predictions towards Ageing N otCR -related genes. To address this, under-sampling of the majority class (Ageing N otCR -related genes) was performed for each of their base learners' training set. Hence, after under-sampling each training set (for each base learner) has the same number of Ageing CR -related and Ageing N otCR -related genes. Two of the ML algorithms we used, Balanced Random Forests (BRF) and Easy Ensemble Classifier (EEC), are bagging and boosting methods, respectively. Both BRF and EEC were implemented using the Python package Imbalanced-learn (Lemaître et al., 2017). Additionally, XGBoost (XGB) (Chen and Guestrin, 2016) and CatBoost (CAT) (Dorogush et al., 2018) were used, as they are both high-performance opensource libraries for gradient boosting in decision trees. Figure 3.A illustrates these algorithms graphically. The four techniques were run with random state set to 42. Further details on data preprocessing are available in Supplementary materials (S.2.1).

Predictive accuracy calculation
Predictive accuracy was calculated by using a nested cross-validation (CV) procedure (a common approach in ML), as follows. To implement the outer 10-fold cross-validation, the dataset instances (ageing-related genes) are randomly divided into 10 outer folds of roughly the same size. Then each ML algorithm is run 10 times, each using a different outer fold as the testing set and all the other 9 outer folds as the training set. Before each run of an algorithm, however, its hyperparameters are tuned by an inner 5-fold cross-validation. To implement this, each training set of the outer CV is randomly divided into 5 inner folds of roughly the same size. Then, for each candidate configuration of hyperparameter settings of the algorithm, the algorithm is run 5 times, each time using a different inner fold as the validation set (to measure predictive accuracy) and the other 4 inner folds as a reduced training set. The predictive accuracy of each candidate algorithm configuration is computed as the average accuracy over the 5 validations sets, and then the algorithm configuration with the highest average predictive accuracy is chosen as the best configuration for the current iteration of the outer CV. Next, the algorithm with that best configuration is applied to the full training set of the current iteration of the outer CV, and the learned classifier is evaluated on the current testing set. Finally, this whole process is repeated for the 10 iterations of the outer CV, and the predictive accuracy of the algorithm is computed as the average of the 10 values of predictive accuracy over the 10 testing sets of the outer CV. Note that hyperparameter optimization is performed by the inner CV using only the training set (i.e., not using the  testing set), which is always reserved for measuring generalisation performance. A graphical representation of the nested CV procedure is depicted in in Figure 3.B. (which only displays five outer folds for ease of visualization). In this work, the inner loop was implemented using the scikit-learn's GridSearchCV command (Pedregosa et al., 2011), with random state = 42 and hyperparameters as stated in Supplementary materials (S.2.2).
The performance metric used for hyperparameter tuning during the innerloop of the nested CV was the Geometric Mean (Gmean), defined in equation (1): where sensitivity is the percentage of Ageing CR -related genes (i.e., the minority class) that were correctly predicted as Ageing CR -related, and specificity is the percentage of Ageing N otCR -related genes (i.e., the majority class) that were correctly predicted as Ageing N otCR -related. Since sensitivity and specificity take values in the [0, 1] interval, so does Gmean. Gmean is suited for classunbalanced problems as this metric measures the balance between classification performances on both the majority and minority classes (Ri and Kim, 2020). The closer Gmean is to 1, the better is the classification. The predictive performance of the final models on the testing sets of the outer CV was evaluated by two measures (Figure 3.C): (a) Gmean of sensitivity and specificity, and (b) Area Under the Receiver Operating Characteristic Curve (AUC), which is an overall summary of predictive accuracy. AUC also takes values in [0, 1], where 1 is the ideal value (indicating that all predictions were correct), and an AUC value of 0.5 corresponds to random predictions.
A special form of nested CV procedure was applied to the protein-descriptors dataset, as follows. The sequences of ageing-related proteins encoded by a single ageing-related gene are highly correlated. As a result, the testing and training sets of the proteins dataset are also likely to be highly correlated, impeding a fair measure of predictive accuracy. To address this issue, the inner and outer folds of the nested-CV were performed at the gene level rather than at the protein level. This was accomplished by directly applying the nested-CV splitting to all the ageing-related genes containing proteins in the EnsDb.Hsapiens.v86 database (Rainer, 2017;Rainer et al., 2019). Following that, the corresponding proteins for each of these ageing-related genes were retrieved. Proteins encoded by the genes in the outer training, outer testing, inner training and inner validation sets were then used to create their corresponding data subsets of the proteindescriptors dataset.

Feature Importance Calculation
We calculated the feature importance for the best learned models (Figure 3.D). To accomplish this, we used 100% of the ageing-related genes (instances) of the dataset under study as the training set, which ensured that the features importance were calculated using all of the data available, maximising the quality of the feature importance calculation. No instances were withheld for testing purposes, as this task's objective was not to determine predictive accuracy (already determined by the nested-CV procedure) but to compute the features' predictive relevance.
The importance of BRF's features is determined by using the default Gini index of class impurity, which calculates how well a split separates the samples of the two classes in each node of a decision tree. A feature's importance is basically given by the weighted average of the reduction of the Gini index across the tree nodes labelled with that feature, with weights proportional to the number of instances split by that feature (Menze et al., 2009). On the other hand, EEC, XGB, and CAT calculate a feature's relevance through the permutation method, which compares the model's predictive accuracy on the original data vs the model's accuracy on a dataset with a random permutation of that feature's values, so that the extent of the drop in the model's accuracy after the random permutation indicates how much the model is dependent on the feature. Finally, in order to compare the results of the two feature importance methods, the resulting feature rankings were scaled from their original values to the [0, 100] interval, where 100 represents the most important feature and 0 represents no relevance.

New CR-related genes inference
Each learned classification model outputs a probability that a gene belongs to the Ageing CR -related class. For converting a predicted probability into a class label we use a classification threshold of 0.5, i.e., any gene with a predicted Ageing CR -related probability less than 0.5 is predicted to be Ageing N otCRrelated, whereas any gene with a predicted probability greater than 0.5 is predicted to be Ageing CR -related. Although CR-related predictions are binarized by the threshold, the CR-related probabilities remain informative as a measure of the prediction's certainty, from the model's viewpoint. For instance, if two given genes, A and B, have CR-related probabilities of 0.6 and 0.9, respectively, both are classified as CR-related; but from the model's perspective, gene B is more reliably related with CR .
Hence, it is possible to infer novel Ageing CR -related genes by identifying, among all the genes annotated in the dataset as Ageing N otCR -related genes, which ones have the greatest predicted Ageing CR -related probabilities, which are the strongest false positives (FP) genes. After all, the Ageing N otCR -related class label annotation in the dataset is not very reliable in general, because it basically means that there is no evidence for Ageing CR -relatedness in the literature, and absence of evidence is not the same as evidence for absence of Ageing CR -relatedness. Hence, the strongest FP genes are good candidate targets for future experiments to determine Ageing CR -related genes.
Keeping in mind that the nested-CV's outer testing sets do not overlap and that they collectively include all the genes (instances) in the dataset, we created, for each of the best performing Dataset, ML algorithm combinations (i.e., the best models), two vectors: a CR-probability vector combining the predicted CR-related probabilities from all the outer testing splits, and a CR-annotations vector by combining the original annotations of class labels in the data from all the outer splits (Figure 3.E). Next, based on these two vectors, we retained only Ageing N otCR -related genes with CR-probability equal to or greater than 0.5 (i.e., FP genes), meaning that they were classified as CR-related by the model but lack a CR-related annotation in the dataset (based on the literature). We identified the top 10 FP genes with greater CR-probability for each of the best performing models and discussed their potential as candidate CR-related genes for confirmation in further wet-lab experiments. (Figure 3.F).
Finally, we looked for common top CR-related genes candidates among the shared ageing-related genes between the two strongest models, namely {GO terms, BRF} and {PathDIP, CAT}, as described in the Results section. To do so, we originally computed a common-ranking by averaging the CR-probability scores of common genes in both models and then sorted the genes in descendent order based on the averaged score. This approach had, however, one issue as the CR-probability density distributions of both strongest models lied in different intervals ([0.35,0.75] in {GO terms, BRF} while [0,1] in {PathDIP, CAT}, as described in Results). Consequently, each computed average was biased towards the distribution with the most extreme values (i.e., genes with the greatest/lowest CR-probability scores in {PathDIP, CAT} will have stronger influence when averaging than genes with greatest/lowest CR-membership score in {GO terms, BRF}). With the aim to provide a similar comparison scale for the top CR-related candidates in both models while considering the distribution shape, we mapped the CR-membership scores of all genes in both {GO terms, BRF} and PathDIP, CAT models to the [0,1] interval and then retrieved common genes in both models to compute CR-probability arithmetic averages, highlighting Ageing N otCR -related genes with the highest probabilities of being Ageing CR -related genes from both best models' perspectives.

Statistical analysis
To report on the most important features, two statistical analysis tests were used (with a significance level of 0.01): a Two-Proportions Z-Test for binary features and a T-test for continuous features. The use of the test for binary features is based on the concept of a feature's positive value, which is defined as the presence of annotation (e.g., a GO term annotation) for a gene, as opposed to the absence of that annotation (the negative value of the feature).
The test for binary features was designed to determine whether the proportion of Ageing CR -related genes associated with a particular relevant feature's positive value (i.e., the ratio of Ageing CR -related genes associated with the relevant feature's positive value over the total number of Ageing CR -related genes in the dataset) is significantly different than the proportion of Ageing N otCRrelated genes associated with the same relevant feature's positive value. For continuous features, the test determined whether the mean value of the feature across all Ageing CR -related genes was significantly different than the mean value of the feature across all Ageing N otCR -related genes.

Results
This section first highlights the best performing combinations of ML algorithms and datasets. Top relevant features are then presented. Finally, the predicted top CR-associated genes candidates are reported.

Predictive Accuracy Results
The AUC and Gmean values obtained by BRF, EEC, XGB, CAT are compared in Table 1. For each dataset (feature type) and predictive accuracy measure, the best result from the four algorithms is highlighted in boldface. GTEx and coexpression were the least predictive feature types, yielding results comparable to random predictions (AUC close to 0.5), whereas GO terms and PathDIP were the most predictive: GO terms had the highest average AUC (0.83) and the second highest Gmean (0.75) across the four algorithms; whilst PathDIP had the highest average Gmean (0.76) and the second highest average AUC (0.81). BRF got the highest AUC values overall, whereas the highest Gmean values were more distributed across all algorithms, with higher means for EEC and CAT. By defining a model as a combination of a dataset and the classification algorithm that runs over it Dataset, ML algorithm, the model with the highest Gmean (0.77) was {PathDIP, CAT}, closely followed by {GO Terms, BRF} and {PathDIP, BRF}, both with a Gmean of 0.76. Regarding AUC results, the best model was {GO Terms, BRF} with an AUC of 0.84, closely followed by {GO terms, EEC} and {PathDIP, CAT}, both with an AUC of 0.83. Since {PathDIP, CAT} and {GO Terms, BRF} achieved complementary and notably close first and second places regarding Gmean and AUC, so they are both the most predictive models overall. Table 2 reports sensitivity and specificity results, as measures of predictive accuracy for Ageing CR -related genes and Ageing N otCR -related genes, respectively. The highest mean sensitivity and specificity values across all four algorithms were obtained by PathDIP and Proteins-Descriptors, respectively, as shown in the last two columns of the table. There was no strong winner algorithm in terms of sensitivity, but XGB obtained by far the worst sensitivity values overall, as shown in the last row of the table. On the other hand, XGB achieved in general the highest specificity values, implying more accurate predictions for Ageing N otCR -related genes, the majority class. ROC curves of all the ML algorithms in the two strongest datasets, as well as confusion matrices of the two strongest models, {GO Terms, BRF} and {PathDIP, CAT}, are depicted in supplementary materials ( Figures S1 and S2, respectively).

Feature Importance Results
The top-5 most relevant features in each of the two most predictive models, {GO terms, BRF} and {PathDIP, CAT}, are shown in Tables 3 and 4, respectively. The column Score in these tables indicates the relative importance of the features, in the range from 0 (no relevance) to 100 (maximal relevance). The columns Ageing CR and Ageing N otCR denote the percentage of Ageing CRand Ageing N otCR -related genes with the GO term or pathway in the corresponding row (i.e., the percentage of genes with the feature's positive value). In addition, a proportion is provided in brackets where the numerator indicates the number of Ageing CR -or Ageing N otCR -related genes with the positive feature value, while the denominator indicates the total number of Ageing CR -or Ageing N otCR -related genes, in the dataset.
The p-value columns in Tables 3 and 4 indicate the results of the statistical tests applied to detect whether the values in the Ageing CR column are significantly different from those in the Ageing N otCR column as explained in Section 2.3. Significant p-values are denoted by bold text.
Interestingly, among the GO terms in Table 3, the most relevant one, oxidationreduction process, was the only one that failed to achieve a significant difference in terms of the proportion of Ageing CR -and Ageing N otCR -related genes. Nonetheless, this term is worth highlighting due to it having the highest proportion of occurrence (19.3%) in Ageing CR -related genes among all 5 GO terms in this table. The remaining GO terms, which are related to GPCRs and carbohydrate transport, clearly have a stronger association with Ageing CR -related genes, as each of them occurred in about 10%-16% of the Ageing CR -related genes while occurring in less than 3% of the Ageing N otCR -related genes. Table 4 reports the top PathDIP features. Note that the score for the second best pathway, longevity regulating pathway, is much lower than the score for the best pathway, autophagy. The only feature with no significant difference in its percentage of occurrence in Ageing CR -and Ageing N otCR -related genes was brain-derived neurotrophic factor (BDNF ). The cellular responses to external stimuli pathway contained the greatest proportion of occurrence (20.9%) in Ageing CR -genes.    4.B, indicates that, for both models, the top CR-probabilities across Ageing N otCRrelated genes achieved similar scores that top CR-probabilities in Ageing CRrelated genes, but are much less numerous. Figures 4.C and 4.D depict the top ten CR-candidate genes in the {GO terms, BRF} and {PathDIP, CAT} models, respectively. Even if some of the top genes predicted by both models may have a proclivity for not detecting unknown CR-relationship, it is remarkable that among their top ten CR-candidate genes, only GOT2 overlapped.
Hypothesising that pertinence to the common set of top-10 CR-candidate genes in both models increases likelihood of accurate CR-relatedness prediction, we performed a joint analysis of the top -10 CR-candidate genes (Methods 2.2.3). Briefly, we normalised the range of both models' CR-probability distributions and then retained common ageing-related genes (976 genes, 872 of which are Ageing N otCR -related) in order to compute, for each common ageing-related gene, an arithmetic average of both normalised CR-probabilities. Then, we sort genes under a criteria that considers both a similar CR-membership range for the two models and their distribution shapes. The correlation between the normalised CR-probabilities assigned by the {GO terms, BRF} and {PathDIP, CAT} models was only moderated, being smaller across Ageing N otCR -related genes, increasing throughout Ageing CR -related genes and yielding the highest score when using all ageing-related genes ( Figure 5). Table 5 depicts both models' common Ageing N otCR -related genes whose averaged normalised CR-probability exceed 0.8. Note that all of these genes, namely, GOT2, GOT1, TSC1, CTH, GCLM, IRS2, and SENS2 ; appeared in the top-10 CR-related candidates of at least one of the two most predictive models. The top gene in 5, Glutamic-Oxaloacetic Transaminase 2 (GOT2 ), appeared within the set of top six ranking genes of both models, indicating that its possible CR-relatedness could be similarly inferred from either biological pathways or biological processes GO terms. Insulin Receptor Substrate 2 (IRS2 ) and especially Glutamic-Oxaloacetic Transaminase 1 (GOT1 ) got high CRprobabilities in the {GO terms, BRF} model, and a moderately high probability in {PathDIP, CAT}. Moreover, Sestrin 2 (SENS2 ), the gene with the highest  CR-probability in the {GO terms, BRF} model, also reached the list but it was, by far, the gene with lowest CR-probability in {PathDIP, CAT}, among all genes in Table 5. Finally, TSC Complex Subunit 1 (TSC1 ), glutamate-cysteine ligase regulatory subunit (GCLM ) and Cystathionine Gamma-Lyase (CTH ) got the three highest CR-probability scores in {PathDIP, CAT}, and also relatively high probabilities in {GO terms, BRF}.

Discusssion
This study demonstrated that the most powerful predictors of CR-relatedness across ageing-related genes rely on features heavily based on curated biological knowledge (from the literature), hereafter called 'knowledge-based features', such as GO terms and biological pathways. However, one caveat of these features is the difficulty to directly produce new findings, as they are based on existing knowledge. Nonetheless, the predictive power of these types of features enabled the extraction of additional biological insights from ageing-related genes strongly predicted to be CR-related but lacking current annotation of CR-relatedness.
Features not so heavily based on curated biological knowledge, particularly those based on gene expression and co-expression, were the least predictive, predicting almost randomly and implying that CR-relatedness is unlikely to be explained by gene expression analysis across tissues or by co-expression of ageing-related genes with other genes.
Upon merging the 9 datasets with specific feature types into two "whole" datasets (using two merging approaches), the predictive power was found to diminish compared to the strongest feature type-specific datasets. This occurred despite the use of a simple, univariate feature filtering algorithm. This suggests the importance of exploring the use of more sophisticated feature selection tech- niques, which is out of the scope of this work and left for future research. The joint analysis of the top CR-candidate genes in {PathDIP, CAT} and {GO terms, BRF} models highlighted genes whose CR-relatedness can be explored from both GO terms and biological pathways perspectives. The strengths of both models were complementary, as {PathDIP, CAT} was more suited for classifying Ageing N otCR -related genes while {GO terms, BRF} performed better with Ageing CR -related genes. It is also noticeable that both models achieved similar predictive accuracies despite exhibiting only a moderate correlation of CR-related class predictions. In addition, top-ranked Ageing N otCR -related genes with high CR-probabilities were mostly surrounded by Ageing CR -related genes in the plane defined by the normalised CR-probabilities in {PathDIP, CAT} and {GO terms, BRF} models, suggesting biological process and pathway similarities between the top CR-related candidates and currently established CR-related genes.

GO term Features
The oxidation-reduction process was the most significant GO term for discriminating between the presence or absence of CR-relatedness across ageing-related genes. This term was also notable as it was associated with the greatest number of ageing-related genes across all top features in both PathDIP and GO term feature sets. Nonetheless, it did not demonstrate any significant preference for Ageing CR -nor AgeingNotCR-related genes, indicating that its effects on ageing could be mediated in both CR and NotCR-dependent ways. Evidence indicates that CR improves redox state (Kowaltowski, 2011), though this may not be the mechanism by which CR prolongs life (Lennicke and Cochemé, 2020). It has been observed, however, that low levels of ROS may actually be beneficial as mediators of redox signalling (Lennicke and Cochemé, 2020).
Notably, ageing-related genes associated with glucose and carbohydrate transport were almost exclusively Ageing CR -related, which at first sight may partially suggest a relationship between ageing, CR and glucose transport. Neverthe-less, this relationship, if existing, is not straightforward as abnormal glucose metabolism is a common but not necessary feature of ageing (Kalyani and Egan, 2013). The most significant changes in glucose metabolism are due to ageingrelated insulin dysfunction (Boemi et al., 2016). This phenomenon appears, however, to be a consequence rather than a cause of ageing, as the improvement in insulin sensitivity induced by CR was not required for the effects of CR on fitness and longevity (Dommerholt et al., 2018).
Ageing-related genes within the GPCR signalling pathway were also significantly related with CR. Some GPCRs have emerged as promising targets for reversing senescence and thus ageing (Santos-Otte et al., 2019). In this regard, one of the few studies discussing the relationship between a GPCR, namely TGR5, and CR (Wang et al., 2017) demonstrated that CR benefits on renal function ageing can be partially explained by up-regulation of TGR5 ; as a result, the authors of that study proposed up-regulation of TGR5 as a possible CR-mimetic candidate for renal function.

PathDIP Features
The strongest predictive feature was autophagy, which is responsible for the disposal and recycling of metabolic macromolecules and damaged organelles (Chung and Chung, 2019). The second most significant feature, the longevity regulating pathway, is also associated with autophagy via a well-characterized signalling cascade (Chung and Chung, 2019). The fact that the Ageing CRrelated genes in this study were significantly strongly associated with these autophagy-related pathways supports current hypotheses that some of the CRrelated anti-aging effects are mediated by autophagy (Donati et al., 2008;Manco and Mingrone, 2005).
The BDNF pathway is presumably involved in brain ageing. Moreover, it is well established that CR enhances BDNF in a currently unknown manner (Manchishi et al., 2018;Budni et al., 2015;García-Prieto and Fernández-Alfonso, 2016), and that BDNF declines with age (Erickson et al., 2010). The possible relationship between this gene and CR could be explored through the wellcharacterized CR-related protein kinase B (Akt) pathway, as BNDF and Akt indirectly interact (Chen, 2019). However, in the context of our ML algorithms, it is possible that the BDNF pathway favoured NotCR-related predictions, as only one of its 11 ageing-related genes was predicted as CR-related.
NRF2 is absent from GenDR. Nevertheless, our results suggest a strong association between the NRF2's pathway and CR because, while the ratio of Ageing CR -related to Ageing N otCR -related genes is approximately 1/10 in the overall dataset, this pathway demonstrated the much greater 16/10 ratio; and this pathway has by far the greatest proportion of Ageing CR -related genes, compared to Ageing N otCR -related genes across all top PathDIP pathways. This finding could be supported by recent evidence linking NRF2 and CR (Pomatto et al., 2020).
The "cellular responses to external stimuli" pathway is notable for the large number of ageing-related genes associated with it, only outnumbered, across all the most relevant features discussed in this work, by the "oxidation-reduction process" GO term. However, unlike the redox GO term, the relative distribution of ageing-related genes in this pathway did achieve statistical significance in direction of CR. External stimuli responses include responses to metal ions (Kultz, 2005), from where a metal ion theory of ageing was constructed. This theory has been poorly explored and opens opportunities for novel CR-research directions as it has been shown that CR decreases the level of certain metal ions in cells (Sharma et al., 2011). 4.2 CR-related candidate genes GOT1 and GOT2 are genes whose products are involved in the amino acid metabolism that exist in cytoplasmic and mitochondrial forms, respectively (Stelzer et al., 2016). GOT1 's expression has been shown to change with age , but evidence linking it to CR is far scarcer. To our knowledge, only one study (Song et al., 2015) has demonstrated this relationship and proposed the role of GOT1 as a significant metabolic feature associated with hepatic response to CR that is representative of differences in mediating amino acid influx into the gluconeogenic pathway. While there is lack of evidence linking GOT2 with CR, one study (Miller et al., 2005) suggested that either GOT1 or GOT2 may impact H2S homeostasis, opening a window for further CR-related insights, as H2S signalling cascade has been observed to promote CR-like pro-longevity effects (Ng et al., 2018).
The TSC complex is a critical negative regulator of mTORC1 (Stelzer et al., 2016), the inhibition of which is associated with CR-like benefits (Madeo et al., 2019). In this regard, even if the TSC complex's role in regulating mTORC1 in vivo remains under-explored, one study (Harputlugil et al., 2014) provides insights that indirectly link this gene with CR as it demonstrated that improved insulin sensitivity following short-term protein restriction (PR) required TSC1 for facilitating increased pro-survival signalling after injury, and contributed to PR-mediated resistance to clinically significant hepatic ischemia reperfusion injury.
CTH is a gene that produces endogenous hydrogen sulphide (H2S) as a signalling molecule (Stelzer et al., 2016). Evidence linking CTH to CR is scarce. To our knowledge, only one recent study (Derous et al., 2017) reveals a positive correlation between CTH expression and CR application. This increased expression may have potential contributions to CR pro-longevity effects as inhibition of CTH is associated with about 15% lifespan reduction in worms. Moreover, CTH is a gene that promotes production of H2S, a potential CR-mimetic candidate, suggesting an approach for further studies linking CTH with CR.
The Glutamate-Cysteine Ligase Regulatory subunit (GCLM ) is a gene that regulates the expression of antioxidant enzymes (Stelzer et al., 2016). Its role in CR is not explicitly stated in literature. However, a recent study (Lettieri-Barbato et al., 2020) showed its increased expression during fasting in PASKdeficient mice. Since fasting and intermittent fasting are associated to CR-like benefits, a link between GCLM and CR can be investigated from this per-spective. If such a link does not exist, the outcome may remain informative by providing insights on differences between CR-and fasting-related beneficial signalling cascades.
IRS2 is a cytoplasmic signaling molecule that mediates the effects of insulin, insulin-like growth factor 1, and other cytokines. A homolog of this gene is present within GenDR (Gene Manipulations) , as "chico" in fruit fly, however, it was not detected as an ortholog by the OMA database (Train et al., 2017) and thus was not considered a CR-related gene. This gene has a further independent entry within GenDR (Gene Expression) where 174 mice genes that significantly change their expression during CR are reported Plank et al., 2012). Out of our 7 top CR-candidate genes, IRS2 was the only one overlapping these 174 genes as further explained in Supplementary materials (S.1.3). This suggests that IRS2 does may not only be deferentially expressed during CR but also could have the potential to regulate CR-associated pro-longevity effects.
Sestrin 2 (SESN2) is an intracellular leucine sensor that negatively regulates the TORC1 signaling pathway. This gene was out of the scope of CR-relatedness until a recent work highlighted its role as a novel molecular link that mediates the effects of dietary amino acid restriction on TORC1 activity in stem cells of the fly gut, thereby maintaining gut health and ensuring longevity (Lu et al., 2021). Hence, although the CR-probability of this gene was relatively low in the {PathDIP, CAT} model, it was the highest in the {GO terms, BRF} model; and the averaged-based CR-related prediction of this gene is supported by recent evidence.

Conclusions
To our knowledge, this is one of the pioneering studies applying ML algorithms to CR research in the context of ageing. This work demonstrated the strong potential of ML-based techniques to identify CR-associated features as our findings are consistent with literature and recent discoveries. GO terms and PathDIP pathways were the most predictive types of features. Due to their curated knowledge-driven (literature-based) nature, the use of these feature types in the most predictive models has on one hand mostly corroborated existing knowledge (rather than directly generating new knowledge), but has on the other hand provided statistical support associating CR with the NRF2 pathway and GPCRs, which have been recently accumulating evidence towards CR in the literature, and so are worth further exploring.
Inference on novel CR-related features may be easier to accomplish from feature types not biased by curated knowledge. However, our work found an obstacle to this inference due to the low or even null predictive power of such feature types, implying that either 1) their features did not contain relevant information for predicting CR-relatedness; or 2) the used tree-based ensemble algorithms were not suitable for our classification problem with the unbiased feature types used in this work, especially expression and co-expression data; or 3) the number of currently known ageing-related genes was not large enough for our ML algorithms to find complex patterns in our unbiased data, leading to poor predictive accuracy in such datasets. In future work, the application of deep learning techniques could potentially increase the predictive power of unbiased feature types, which could provide novel insights on possible CR-related protein properties and interactions as well as CR-related gene expression and co-expression signatures.
Further insights were taken from genes annotated as Ageing N otCR -related genes in the dataset but strongly predicted as CR-related genes based on GO terms and PathDIP pathways. This analysis revealed a list of genes outside GenDR that are prone to be related with CR despite lacking such annotation. Most of these genes were consistent with some preliminary CR-related experiments, which makes them worth exploring for further wet-lab experiments to get a deeper understanding of their relationship with CR. Among these genes, GOT2 was the only Ageing N otCR -related gene present within the top six stronger false positives in models learned with both PathDIP and GO term features. Other CR-related gene candidates strongly predicted by both most predictive ML models were GOT1, TSC1, CTH, GCLM, IRS2 and SENS2, which, together with GOT2, remain to be validated in further lab-based experiments.