supraHex: an R/Bioconductor package for tabular omics data analysis using a supra-hexagonal map

Biochem Biophys Res Commun. 2014 Jan 3;443(1):285-9. doi: 10.1016/j.bbrc.2013.11.103. Epub 2013 Dec 2.

Abstract

Biologists are increasingly confronted with the challenge of quickly understanding genome-wide biological data, which usually involve a large number of genomic coordinates (e.g. genes) but a much smaller number of samples. To meet the need for data of this shape, we present an open-source package called 'supraHex' for training, analysing and visualising omics data. This package devises a supra-hexagonal map to self-organise the input data, offers scalable functionalities for post-analysing the map and, more importantly, allows additional data to be overlaid for multilayer omics data comparisons. By applying it to DNA replication timing data of mouse embryogenesis, we demonstrate that supraHex is capable of simultaneously carrying out gene clustering and sample correlation, providing intuitive visualisation at each step of the analysis. By overlaying CpG and expression data onto the trained replication-timing map, we also show that supraHex is able to intuitively capture an inherent relationship between late replication, low CpG density promoters and low expression levels. As part of the Bioconductor project, supraHex makes accessible to a wide community, in a simple way, what would otherwise be a complex framework for the ultrafast understanding of any tabular omics data, both scientifically and artistically. This package runs on Windows, Mac and Linux, and is freely available, together with many tutorials featuring real examples, at http://supfam.org/supraHex.
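To make the idea of self-organising tabular data on a hexagon-shaped grid concrete, the following is a minimal Python sketch. It is not the supraHex implementation or its API: the function names (`supra_hex_grid`, `hex_distance`, `best_node`, `train_som`), the toy data and all parameter values are assumptions chosen for illustration, and the training loop is a generic online self-organising map with a Gaussian neighbourhood, which may differ from what the package does.

```python
import math
import random


def supra_hex_grid(radius):
    """Axial (q, s) coordinates of a hexagon-shaped grid of hexagonal nodes.

    radius=1 gives a single central node; ring k (k >= 1) adds 6*k nodes.
    """
    n = radius - 1
    return [(q, s) for q in range(-n, n + 1) for s in range(-n, n + 1)
            if abs(q + s) <= n]


def hex_distance(a, b):
    """Number of hexagon steps between two nodes in axial coordinates."""
    dq, ds = a[0] - b[0], a[1] - b[1]
    return max(abs(dq), abs(ds), abs(dq + ds))


def best_node(codebook, row):
    """Index of the node whose codebook vector is closest to the row."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - x) ** 2 for c, x in zip(codebook[i], row)))


def train_som(data, radius=3, epochs=20, seed=0):
    """Generic online SOM training (illustrative parameter schedule)."""
    rng = random.Random(seed)
    grid = supra_hex_grid(radius)
    # Initialise each node from a randomly chosen data row.
    codebook = [list(rng.choice(data)) for _ in grid]
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        for row in rng.sample(data, len(data)):
            frac = step / total
            alpha = 0.5 * (1 - frac)                      # learning rate decays
            sigma = max(0.5, (radius - 1) * (1 - frac))   # neighbourhood shrinks
            win = grid[best_node(codebook, row)]
            for i, node in enumerate(grid):
                h = math.exp(-hex_distance(node, win) ** 2 / (2 * sigma ** 2))
                codebook[i] = [c + alpha * h * (x - c)
                               for c, x in zip(codebook[i], row)]
            step += 1
    return grid, codebook


# Toy table: rows play the role of genes, columns the role of samples.
data = [[0.0, 0.1, 0.2], [0.1, 0.0, 0.1], [5.0, 5.1, 4.9], [4.9, 5.0, 5.2]]
grid, codebook = train_som(data, radius=2, epochs=10)
assignments = [best_node(codebook, row) for row in data]
```

In this sketch each row is assigned to its best-matching node, which corresponds loosely to the gene clustering described above, while each column of `codebook` gives one value per node and could be drawn as a per-sample map for comparison across samples.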

Keywords: Bioinformatics; Clustering; DNA replication timing; Gene expression; Sample correlation; Visualisation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Computational Biology / methods*
  • Computational Biology / statistics & numerical data*
  • DNA Replication
  • Data Interpretation, Statistical
  • Embryonic Development / genetics
  • Genomics / statistics & numerical data*
  • Mice
  • Multigene Family
  • Software*