Trends in Genetics
Volume 32, Issue 10, October 2016, Pages 620-637

Review
Computational Approaches for Functional Prediction and Characterisation of Long Noncoding RNAs

https://doi.org/10.1016/j.tig.2016.08.004

Trends

lncRNAs represent a large proportion of the transcriptome that is currently sparsely annotated.

Expression-based experiments often yield a large number of lncRNAs cosegregating with the biological system being studied.

The ability to effectively enrich candidate pools for lncRNAs most likely to be involved in the phenotype under study is crucial.

Powerful computational methods for investigating lncRNA function and biology from experimental and sequence information are emerging.

Combining several computational methods is an effective approach to maximise research findings and effectively deploy laboratory resources.

Although a considerable portion of eukaryotic genomes is transcribed as long noncoding RNAs (lncRNAs), the vast majority are functionally uncharacterised. The rapidly expanding catalogue of mechanistically investigated lncRNAs has provided evidence for distinct functional subclasses, which are now ripe for exploitation as a general model to predict functions for uncharacterised lncRNAs. By utilising publicly-available genome-wide datasets and computational methods, we present several developed and emerging in silico approaches to characterise and predict the functions of lncRNAs. We propose that the application of these techniques provides valuable functional and mechanistic insight into lncRNAs, and is a crucial step for informing subsequent functional studies.

Section snippets

The Emerging Need for Computational Methodologies to Discern Functional lncRNAs

Over the past decade, advances in sequencing methodologies have revealed the transcriptional complexity of the genome. Early use of genome tiling arrays and CAGE sequencing showed that a much greater portion of the genome is transcribed than previously expected, with the majority of transcription producing non-protein-coding RNAs [1,2]. Initially hampered by characteristic low expression, biological specificity, and lack of sequence conservation [3], the functions of the group of

Computational Techniques to Impute lncRNA Function

Core features of functional lncRNAs can be probed via an array of computational methods strengthened by publicly-available datasets. Gene expression information is commonly utilised to detect potential regulatory targets (regulation of target genes currently being the most commonly reported mechanism of action) or involvement in biological processes. Expression-based approaches depend on experimental data, which continue to be generated across a diverse repertoire of biological contexts and made publicly available through repositories
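As an illustration of the expression-based idea described above, the following is a minimal sketch of a co-expression screen: an lncRNA's expression profile is correlated against each candidate gene across the same samples, and strongly correlated (or anti-correlated) genes are kept as potential targets. The function names, the Pearson correlation measure, the 0.9 threshold, and the toy data are all assumptions made for this example; they are not taken from the review.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpressed_genes(lncrna_expr, gene_expr, min_abs_r=0.9):
    """Return (gene, r) pairs with |r| >= min_abs_r, strongest first.

    lncrna_expr: expression values of one lncRNA across samples.
    gene_expr:   dict mapping gene name to values across the same samples.
    min_abs_r:   illustrative cut-off, not a recommended value.
    """
    hits = []
    for gene, values in gene_expr.items():
        r = pearson(lncrna_expr, values)
        if abs(r) >= min_abs_r:
            hits.append((gene, r))
    return sorted(hits, key=lambda hit: abs(hit[1]), reverse=True)


# Toy data (hypothetical gene names and values, five samples).
lnc = [1.0, 2.0, 3.0, 4.0, 5.0]
genes = {
    "GENE_A": [2.0, 4.1, 5.9, 8.0, 10.1],  # rises with the lncRNA
    "GENE_B": [5.0, 4.0, 3.1, 2.0, 0.9],   # falls as the lncRNA rises
    "GENE_C": [3.0, 1.0, 4.0, 1.0, 5.0],   # weakly related
}
hits = coexpressed_genes(lnc, genes)
```

A correlation of this kind only suggests an association; it does not by itself show that the lncRNA regulates the gene, which is why such hits are best treated as candidates for further evidence.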

Integrative Approaches for Functional Candidate Selection

Each of the approaches outlined above can extract potential candidates for characterisation from an initial pool of lncRNAs. However, used in isolation, each produces insufficient evidence to generate detailed hypotheses of function. This is typical of differential expression: a multitude of lncRNAs can be differentially expressed between conditions, but this change tells us little about whether the expression pattern is functional and, if so, how. Instead, integrative approaches that test

Testing lncRNA Functionality

Computational approaches provide a way to test multiple avenues of functionality on a large cohort of potentially important transcripts. Because they are relatively easy to reuse on multiple candidates, computational analyses have been extensively used in identifying lncRNAs on a genome-wide scale [15,17,18,45]. While these approaches can provide insight into the general trends of lncRNA biology, the known specificity of lncRNA expression and function requires that experimental methods (reviewed

Concluding Remarks and Future Directions

Within the past decade, lncRNAs have emerged as important RNA species, capable of fulfilling previously unascertained biological roles. Through the increasing application of high-throughput sequencing methods, lncRNAs continue to be discovered. However, the gap between identified and functionally characterised molecules remains considerably larger than that of protein-coding genes. By utilising a spectrum of freely-available software and publicly-available datasets, it is possible to

Acknowledgements

The authors would like to thank Daniel Thompson for helpful discussion on earlier drafts of this manuscript.

Glossary

Cis-regulatory
a type of regulatory relationship defined by the close genomic proximity between the regulator and target genes.
Enhancer RNA (eRNA)
a type of lncRNA transcribed from a genomic region possessing chromatin modifications typical of enhancer DNA. Enhancer lncRNAs may be non-functional, with the DNA being responsible for enhancer activity.
Functional enrichment
process by which functional annotations of groups of genes are tested for statistical enrichment in particular groups above
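
To make the statistical test behind functional enrichment concrete, the sketch below computes a one-sided hypergeometric p-value: the probability of seeing at least the observed number of annotated genes in a gene group if the group were drawn at random from the background. The hypergeometric test is one commonly used choice and is an assumption here, since the glossary entry does not name a specific test; the function name and all numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """One-sided enrichment p-value, P(X >= overlap).

    population: number of genes in the background set
    annotated:  background genes carrying the annotation of interest
    sample:     number of genes in the group being tested
    overlap:    genes in the group that carry the annotation
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie in [0, population]")
    start = max(overlap, 0)
    max_k = min(annotated, sample)
    total = comb(population, sample)
    # Sum the upper tail of the hypergeometric distribution;
    # comb() returns 0 for impossible draws, so no extra bounds are needed.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(start, max_k + 1)
    )
    return tail / total


# Hypothetical example: 20,000 background genes, 200 with the annotation,
# and a group of 50 genes co-expressed with an lncRNA, 5 of them annotated.
p_value = hypergeom_enrichment_p(20000, 200, 50, 5)
```

In practice many annotations are tested at once, so p-values from a test like this would normally be corrected for multiple testing before being interpreted.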

References (166)

  • M. Ge

    A bipartite network-based method for prediction of long non-coding RNA–protein interactions

    Genomics Proteomics Bioinformatics

    (2016)
  • B. Yan

    The research strategies for probing the function of long noncoding RNAs

    Genomics

    (2012)
  • K. Kashi

    Discovery and functional analysis of lncRNAs: Methodologies to investigate an uncharacterized transcriptome

    Biochim. Biophys. Acta

    (2016)
  • A.J. Blythe

    The ins and outs of lncRNA structure: how, why and what comes next?

    Biochim. Biophys. Acta

    (2016)
  • S. Somarowthu

    HOTAIR forms an intricate and modular secondary structure

    Mol. Cell

    (2015)
  • Z. Lu

    RNA duplex map in living cells reveals higher-order transcriptome structure

    Cell

    (2016)
  • C. Davidovich

    Toward a consensus on the binding specificity and promiscuity of PRC2 for RNA

    Mol. Cell

    (2015)
  • Y. Wang

    The long noncoding RNA lncTCF7 promotes self-renewal of human liver cancer stem cells through activation of Wnt signaling

    Cell Stem Cell

    (2015)
  • P. Kapranov

    Examples of the complex architecture of the human transcriptome revealed by RACE and high-density tiling arrays

    Genome Res.

    (2005)
  • P. Carninci

    The transcriptional landscape of the mammalian genome

    Science

    (2005)
  • T. Derrien

    The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression

    Genome Res.

    (2012)
  • Y. Zhao

    NONCODE 2016: an informative and valuable data source of long non-coding RNAs

    Nucleic Acids Res.

    (2016)
  • X.C. Quek

    lncRNAdb v2.0: expanding the reference database for functional long noncoding RNAs

    Nucleic Acids Res.

    (2015)
  • T.R. Mercer

    Structure and function of long noncoding RNAs in epigenetic regulation

    Nat. Struct. Mol. Biol.

    (2013)
  • A. Gabory

    The H19 locus: role of an imprinted non-coding RNA in growth and development

    Bioessays

    (2010)
  • L. Ma

    On the classification of long non-coding RNAs

    RNA Biol.

    (2013)
  • G. St Laurent

    The landscape of long noncoding RNA classification

    Trends Genet.

    (2015)
  • Y. Zhang

    Long noncoding RNA LINP1 regulates repair of DNA double-strand breaks in triple-negative breast cancer

    Nat. Struct. Mol. Biol.

    (2016)
  • D. Chakraborty

    Combined RNAi and localization for functionally dissecting long noncoding RNAs

    Nat. Methods

    (2012)
  • M. Guttman

    lincRNAs act in the circuitry controlling pluripotency and differentiation

    Nature

    (2011)
  • M.K. Iyer

    The landscape of long noncoding RNAs in the human transcriptome

    Nat. Genet.

    (2015)
  • G. St Laurent

    Functional annotation of the vlinc class of non-coding RNAs using systems biology approach

    Nucleic Acids Res.

    (2016)
  • G. St Laurent

    VlincRNAs controlled by retroviral elements are a hallmark of pluripotency and cancer

    Genome Biol.

    (2013)
  • M.N. Cabili

    Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses

    Genes Dev.

    (2011)
  • A.E. Kornienko

    Long non-coding RNAs display higher natural expression variation than protein-coding genes in healthy humans

    Genome Biol.

    (2016)
  • G. Bussotti

    Improved definition of the mouse transcriptome via targeted RNA sequencing

    Genome Res.

    (2016)
  • J.M. Stuart

    A gene-coexpression network for global discovery of conserved genetic modules

    Science

    (2003)
  • M.E. Dinger

    Long noncoding RNAs in mouse embryonic stem cell pluripotency and differentiation

    Genome Res.

    (2008)
  • Y. Xiao

    Predicting the functions of long noncoding RNAs using RNA-seq based on Bayesian network

    Biomed Res. Int.

    (2015)
  • M. Zhou

    Prioritizing candidate disease-related long non-coding RNAs by walking on the heterogeneous lncRNA and disease network

    Mol. Biosyst.

    (2015)
  • P. Yao

    Coexpression networks identify brain region-specific enhancer RNAs in the human brain

    Nat. Neurosci.

    (2015)
  • X. Guo

    Long non-coding RNAs function annotation: a global prediction method based on bi-colored networks

    Nucleic Acids Res.

    (2013)
  • Q. Liao

    Large-scale prediction of long non-coding RNA functions in a coding–non-coding gene co-expression network

    Nucleic Acids Res.

    (2011)
  • J. Sun

    Inferring novel lncRNA–disease associations based on a random walk model of a lncRNA functional similarity network

    Mol. Biosyst.

    (2014)
  • X. Yang

    A network based method for analysis of lncRNA–disease associations and prediction of lncRNAs implicated in diseases

    PLoS One

    (2014)
  • X. Chen

    Constructing lncRNA functional similarity network based on lncRNA–disease associations and disease semantic similarity

    Sci. Rep.

    (2015)
  • P. Langfelder

    WGCNA: an R package for weighted correlation network analysis

    BMC Bioinformatics

    (2008)
  • M. Ashburner

    Gene Ontology: tool for the unification of biology

    Nat. Genet.

    (2000)
  • M. Kanehisa

    Data, information, knowledge and principle: back to metabolism in KEGG

    Nucleic Acids Res.

    (2014)
  • M. Milacic

    Annotating cancer variants and anti-cancer therapeutics in Reactome

    Cancers

    (2012)

These authors contributed equally to this manuscript.