New Results

ANFIS-based fuzzy systems for searching DNA-protein binding sites

Dianhui Wang, Monther Alhamdoosh, Witold Pedrycz
doi: https://doi.org/10.1101/058800
Dianhui Wang
1Department of Computer Science and Computer Engineering, La Trobe University, Melbourne, Victoria 3083, Australia
Monther Alhamdoosh
1Department of Computer Science and Computer Engineering, La Trobe University, Melbourne, Victoria 3083, Australia
2Bio21 Institute, The University of Melbourne, 30 Flemington Road, Parkville, Victoria 3010, Australia
Witold Pedrycz
2Bio21 Institute, The University of Melbourne, 30 Flemington Road, Parkville, Victoria 3010, Australia
3Department of Electrical & Computer Engineering, University of Alberta, Edmonton T6R 2V4, AB, Canada
4Department of Electrical and Computer Engineering, Faculty of Engineering, King Abdulaziz University, Jeddah 21589, Saudi Arabia

Abstract

Transcriptional regulation largely controls how genes are expressed and how cells behave, and it depends on the transcription factor (TF) proteins that bind upstream of the transcription start sites (TSSs) of genes. These TF binding sites (TFBSs) are usually short (5-15 base pairs) and degenerate (some positions admit several alternative nucleotides). Traditionally, computational methods scan DNA sequences using the position weight matrix (PWM) of a given TF, calculate a binding score for each K-mer against the PWM, and finally classify each K-mer as either a putative TFBS or a background sequence based on a cut-off threshold. The FSCAN system proposed in this paper employs machine learning techniques to build a learner model that identifies TFBSs in a set of bound sequences without the need for a cut-off threshold. Our method combines fuzzy inference techniques with a distribution-based filtering algorithm to predict the binding sites of a TF, given its PWM model and the phastCons scores of the input DNA sequences. Data imbalance reduction techniques are also used to ease the learning of the adaptive neuro-fuzzy inference system (ANFIS) algorithm. The proposed system is tested on 22 ChIP-chip sequence sets from the Saccharomyces cerevisiae genome. Our results show that FSCAN outperforms other approaches such as MatInspector and MATCH and is quite robust. As more transcriptional data become available, our proposed framework encourages the use of fuzzy logic techniques in the prediction of TFBSs.
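
To make the traditional approach described in the abstract concrete, the following is a minimal sketch of threshold-based PWM scanning. It illustrates the baseline that FSCAN aims to improve on, not FSCAN itself. The matrix `TOY_PWM`, the uniform `BACKGROUND`, the log-odds scoring, the function names, and the threshold value are all illustrative assumptions and are not taken from the paper; only the forward strand is scanned.

```python
import math

# Hypothetical 4-bp motif with consensus ACGT (illustrative values, not from the paper).
# Each entry gives the assumed probability of A, C, G, T at one motif position.
TOY_PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]

# Assumed uniform background nucleotide distribution.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_odds_score(kmer, pwm, background=BACKGROUND):
    """Sum of per-position log2(motif probability / background probability)."""
    return sum(math.log2(pwm[i][base] / background[base])
               for i, base in enumerate(kmer))


def scan(sequence, pwm, threshold):
    """Slide a window of motif length over the sequence and keep every
    K-mer whose score reaches the cut-off threshold."""
    k = len(pwm)
    hits = []
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        score = log_odds_score(kmer, pwm)
        if score >= threshold:
            hits.append((start, kmer, score))
    return hits


if __name__ == "__main__":
    # The threshold of 3.0 is arbitrary; choosing it is exactly the step
    # that the abstract says FSCAN avoids.
    for start, kmer, score in scan("TTACGTAA", TOY_PWM, threshold=3.0):
        print(start, kmer, round(score, 2))
```

In this sketch the classification of each K-mer depends entirely on the chosen cut-off, which is the sensitivity that motivates a learned classifier in place of a fixed threshold.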

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Posted June 15, 2016.
Subject Area

  • Bioinformatics