Summary
More than 40% of the germline variants in ClinVar as of May 2021 are designated as Variants of Uncertain Significance (VUS): there is insufficient evidence to determine their clinical impact, which confounds the clinical management of the individuals who carry them. These variants remain unclassified in part because the patient-level data needed for their interpretation are sensitive and therefore largely siloed. Federated analysis offers the potential to overcome this problem by “bringing the code to the data”: the sensitive patient-level data are analyzed computationally within their secure home institution, and researchers receive valuable insights from data that would not otherwise be accessible. We tested this principle with a federated analysis of clinical data from breast cancer patients and controls at RIKEN, derived from the BioBank Japan repository. As exemplars, we used variants in BRCA1 and BRCA2, genes in which variants designated as pathogenic confer significant risk of breast, ovarian, and other cancers. By sharing analysis workflows, we were able to analyze these data within RIKEN’s secure computational framework without transferring the data, and gathered evidence for the interpretation of several variants. This exercise serves as a proof of concept and represents an approach to help realize the core charter of the Global Alliance for Genomics and Health (GA4GH): to responsibly share genomic data for the benefit of human health. The workflows are available on Dockstore at https://dockstore.org/workflows/github.com/BRCAChallenge/federated-analysis/cooccurrence:master, and the source code is available on GitHub at https://github.com/BRCAChallenge/federated-analysis.
Competing Interest Statement
The authors have declared no competing interest.