Abstract
To take full advantage of next-generation genomics data, informatics methods need to be based on agreed-upon, formally specified standards that can be implemented easily, uniformly, and without ambiguity. These standards should be encoded as logical formulae, so that provably correct and efficient decision procedures can be used for query answering and validation.
In this paper I present the core of such a standard for sequence data: a collection of definitions of relations that hold between genomic intervals, and an algebra for performing operations upon these intervals. I show how these relations can be used to extend and formalize concepts in the Sequence Ontology (SO).
Copyright
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.