BioViz Connect: Web Application Linking CyVerse Cloud Resources to Genomic Visualization in the Integrated Genome Browser

Front Bioinform. 2022 May 23:2:764619. doi: 10.3389/fbinf.2022.764619. eCollection 2022.

Abstract

Genomics researchers do better work when they can interactively explore and visualize data. Due to the vast size of experimental datasets, researchers are increasingly using powerful, cloud-based systems to process and analyze data. These remote systems, called science gateways, offer user-friendly, Web-based access to high performance computing and storage resources, but typically lack interactive visualization capability. In this paper, we present BioViz Connect, a middleware Web application that links CyVerse science gateway resources to the Integrated Genome Browser (IGB), a highly interactive native application implemented in Java that runs on the user's personal computer. Using BioViz Connect, users can 1) stream data from the CyVerse data store into IGB for visualization, 2) improve the IGB user experience for themselves and others by adding IGB specific metadata to CyVerse data files, including genome version and track appearance, and 3) run compute-intensive visual analytics functions on CyVerse infrastructure to create new datasets for visualization in IGB or other applications. To demonstrate how BioViz Connect facilitates interactive data visualization, we describe an example RNA-Seq data analysis investigating how heat and desiccation stresses affect gene expression in the model plant Arabidopsis thaliana. The RNA-Seq use case illustrates how interactive visualization with IGB can help a user identify problematic experimental samples, sanity-check results using a positive control, and create new data files for interactive visualization in IGB (or other tools) using a Docker image deployed to CyVerse via the Terrain API. Lastly, we discuss limitations of the technologies used and suggest opportunities for future work. BioViz Connect is available from https://bioviz.org.

Keywords: AT1G07350; Arabidopsis; CyVerse; Integrated Genome Browser; SR45a; Terrain API; abiotic stress; visualization.