RT Journal Article
SR Electronic
T1 Tumor Origin Detection with Tissue-Specific miRNA and DNA methylation Markers
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 090746
DO 10.1101/090746
A1 Wei Tang
A1 Shixiang Wan
A1 Quan Zou
YR 2016
UL http://biorxiv.org/content/early/2016/12/01/090746.abstract
AB Motivation: Cancer of unknown primary origin constitutes 3-5% of all human malignancies. Patients with these carcinomas present with metastases without an established primary site, which may not be found even by thorough histological search methods. Patients with cancer of unknown primary origin generally have a poor prognosis and rarely receive effective treatment, since most cancers respond well only to specific chemotherapy or hormone drugs. Many studies have proposed classifiers based on miRNAs or mRNAs to predict tumor origins, but few studies focus on high-dimensional DNA methylation profiles. Results: We introduce three classifiers that combine a novel feature selection algorithm with random forest to effectively identify highly tissue-specific epigenetic biomarkers, such as microRNAs and CpG sites, which can help predict the origin site of tumors. This algorithm, which incorporates differential analysis and dimensionality reduction, was applied to 14 histological tissues and over 5000 samples, using miRNA expression and DNA methylation profiles to assign a given primary tumor to its tissue of origin. Our study shows that the three classifiers achieve an overall accuracy of 87.78% (72.55%-97.54%) on miRNA datasets and an accuracy of 96.43% (MRMD: 87.85%-99.76%) or 97.06% (PCA: 92.44%-100%) on DNA methylation datasets in predicting the origin of tumors. This suggests that the selected biomarkers can efficiently predict the origin of tumors and allow clinicians to avoid adjuvant systemic therapy or to choose less aggressive therapeutic options.
We also developed a user-friendly webserver that enables users to predict the origin site of tumors by uploading the miRNA expression or DNA methylation profiles of those cancers. Availability: The webserver, data, and code are accessible free of charge at http://server.malab.cn/MMCOP/. Contact: zouquan@nclab.net. Supplementary information: Supplementary data are available at Bioinformatics online.