Abstract
Background Juvenile-Onset Huntington’s disease (JOHD) is a rare form of Huntington’s disease (HD) in which chronic neurodegeneration begins before the age of 25. Like HD, JOHD is triggered by a large expansion of CAG nucleotides in the HTT gene, which leads to neurotoxicity and aberrant gene expression. However, unlike HD, in JOHD the relationship between the length of the CAG expansion and the age of disease onset is non-linear. It would therefore be of interest to identify molecular biomarkers that indicate predisposition to the development of JOHD; because microRNAs (miRNAs) circulate in bio-fluids, they would be particularly useful biomarkers.
Methods We explored a large JOHD miRNA-mRNA expression dataset (GSE65776) to establish appropriate questions that could be addressed using machine learning (ML). We sought sets of features (mRNAs/miRNAs) that could distinguish JOHD from WT samples in aged or young mouse cortex, and we asked whether a set of features could predict predisposition to JOHD or WT genotypes by training models on aged samples and testing them on young samples. We used a k-best feature selection strategy that included oversampling for unbalanced classes, min-max scaling and 5-fold cross-validation. Several models were created using AdaBoost, ExtraTrees, GaussianNB and Random Forest, and the best-performing models were further analysed using receiver operating characteristic (ROC) curves, area under the curve (AUC) scores and PCA plots. Finally, the miRNA genes used to train our miRNA-based predisposition model were compared against HD patient bio-fluid samples.
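The workflow described in the Methods can be illustrated with a minimal Python sketch using scikit-learn. It is not the authors' code. The synthetic data from `make_classification`, the random split standing in for the aged (training) and young (testing) sets, the `f_classif` scoring function, the value `k_features=20`, the order of the pipeline steps and the helper names `oversample_minority`, `build_pipeline` and `cv_accuracy` are all assumptions made for illustration.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, ExtraTreesClassifier,
                              RandomForestClassifier)
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler


def oversample_minority(X, y, seed=0):
    """Keep every sample and randomly duplicate rows of the smaller
    classes until each class is as large as the largest one."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    idx = []
    for c in classes:
        members = np.flatnonzero(y == c)
        extra = rng.choice(members, size=target - members.size, replace=True)
        idx.append(np.concatenate([members, extra]))
    idx = np.concatenate(idx)
    return X[idx], y[idx]


def build_pipeline(model, k_features=20):
    """Min-max scaling, then k-best feature selection, then the classifier."""
    return make_pipeline(MinMaxScaler(),
                         SelectKBest(f_classif, k=k_features),
                         clone(model))


def cv_accuracy(model, X, y, n_splits=5, seed=0):
    """Mean accuracy over stratified k-fold CV; only the training part of
    each fold is oversampled, so duplicates never reach the validation part."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, val_idx in cv.split(X, y):
        X_tr, y_tr = oversample_minority(X[train_idx], y[train_idx], seed)
        pipe = build_pipeline(model).fit(X_tr, y_tr)
        scores.append(accuracy_score(y[val_idx], pipe.predict(X[val_idx])))
    return float(np.mean(scores))


# Hypothetical stand-in data: rows are samples, columns are expression features.
X, y = make_classification(n_samples=150, n_features=200, n_informative=10,
                           weights=[0.7, 0.3], random_state=0)
# The random split below only mimics "train on aged, test on young".
X_aged, X_young, y_aged, y_young = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

models = {
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
    "ExtraTrees": ExtraTreesClassifier(n_estimators=100, random_state=0),
    "GaussianNB": GaussianNB(),
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Compare the candidate models by 5-fold cross-validation on the training set.
cv_scores = {name: cv_accuracy(m, X_aged, y_aged) for name, m in models.items()}
best_name = max(cv_scores, key=cv_scores.get)

# Refit the best model on the balanced training set and score the held-out set.
X_bal, y_bal = oversample_minority(X_aged, y_aged)
best = build_pipeline(models[best_name]).fit(X_bal, y_bal)
test_acc = accuracy_score(y_young, best.predict(X_young))
test_auc = roc_auc_score(y_young, best.predict_proba(X_young)[:, 1])
print(best_name, round(test_acc, 3), round(test_auc, 3))
```

Oversampling is applied inside each fold rather than before splitting, so that duplicated samples cannot appear in both the training and validation parts; whether the original analysis did the same is not stated in the abstract.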
Results Our testing accuracies ranged from 66% to 100% and our AUC scores from 31% to 100%. We generated several high-performing models with testing accuracies >80% and AUC scores >90%. We also identified homologues of mmu-miR-154-5p, mmu-miR-181a-5p, mmu-miR-212-3p, mmu-miR-378b, mmu-miR-382-5p and mmu-miR-770-5p circulating in HD patient blood samples at p-values <0.05.
Conclusions We generated several age-based models that could differentiate between JOHD and WT samples, including an aged mRNA-based model with a 100% AUC score. We also found that several miRNAs used to train our miRNA-based predisposition model were detectable in HD patient blood samples, suggesting that they are potential candidates for use as non-invasive biomarkers in JOHD/HD research.
Competing Interest Statement
The authors have declared no competing interest.