bioRxiv
New Results

Data Representation in the DARPA SD2 Program

Nicholas Roehner, Jacob Beal, Bryan Bartley, Richard Markeloff, Tom Mitchell, Tramy Nguyen, Daniel Sumorok, Nicholas Walczak, Chris Myers, Zach Zundel, James Scholz, Benjamin Hatch, Mark Weston, John Colonna-Romano
doi: https://doi.org/10.1101/2021.09.17.460644
Author affiliations
  • 1 Raytheon BBN Technologies: Nicholas Roehner, Jacob Beal, Bryan Bartley, Richard Markeloff, Tom Mitchell, Tramy Nguyen, Daniel Sumorok, Nicholas Walczak
  • 2 University of Colorado Boulder: Chris Myers
  • 3 University of Utah: Zach Zundel, James Scholz, Benjamin Hatch
  • 4 Netrias: Mark Weston
  • 5 Aptima: John Colonna-Romano
  • For correspondence: jakebeal@ieee.org

1 SUMMARY

Modern scientific enterprises are often highly complex and multidisciplinary, particularly in areas like synthetic biology where the subject at hand is itself inherently complex and multidisciplinary. Collaboration across many organizations is necessary to efficiently tackle such problems [6, 15], but remains difficult. The challenge is further amplified by automation that increases the pace at which new information can be produced, and particularly so for matters of fundamental research, where concepts and definitions are inherently fluid and may rapidly change as an investigation evolves [7].

The DARPA program Synergistic Discovery and Design (SD2) aimed to address these challenges by organizing the development of data-driven methods to accelerate discovery and improve design robustness, with synthetic biology as one of the key domains under study. The program was deliberately organized so that teams provided complementary types of expertise and resources and no team held a dominant organizational position, so that subject-matter investigations necessarily required peer-level collaboration across multiple team boundaries. With more than 100 researchers across more than 20 organizations, several of which ran experimental facilities with high-throughput automation, participants were forced to confront the challenges of effective data sharing.

The default architecture for scientific collaboration is essentially one of anarchy, with ad hoc bilateral relations between pairs of collaborators or experimental phases (Figure 1(a)). This was by necessity also the case during the early phases of the SD2 program, when incorporating new tools into pipelines was ad hoc and time-consuming and data was generally disconnected from genetic designs and experimental plans. The other typical approach to collaboration is “command and control”, in which a dominant organization determines the data sharing content and format for all participants (Figure 1(b)). This can be efficient, but it tends to be limited in flexibility and extensibility, rendering it unsuitable for research collaboration, as we indeed found when we attempted this approach during the first year of the SD2 program. We addressed these problems by applying distributed standards to create a “flexible rendezvous” model of collaboration (Figure 1(c)), which enables information flow to track evolving collaborative relationships, improves the sharing and utility of information across the community, and supports accelerated rates of experimentation.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • nicholas.roehner{at}raytheon.com, jakebeal{at}ieee.org, chmy5075{at}colorado.edu, weston{at}netrias.com

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted September 18, 2021.

Subject Area

  • Synthetic Biology