ABSTRACT
Human genetic variants are usually represented by four values of variable length: chromosome, position, reference allele and alternate allele. There is no guarantee that these components are represented consistently across different data sources, and processing variant-based data can be inefficient because four comparison operations are needed for each variant, three of which are string comparisons. Working with strings, in contrast to numbers, poses extra challenges for computer memory allocation and data representation. Existing variant identifiers do not typically represent every possible variant we may be interested in, nor are they directly reversible. To overcome these limitations, VariantKey, a novel reversible numerical encoding schema for human genetic variants, is presented here alongside a multi-language open-source software implementation (http://github.com/genomicspls/variantkey). VariantKey represents each variant as a single 64-bit numeric entity, while preserving the ability to be searched and sorted by chromosome and position. The individual components of short variants can be read back directly from the VariantKey, while long variants are supported with a fast lookup table.
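The idea of packing a variant into one sortable 64-bit integer can be illustrated with a minimal Python sketch. It is not the published implementation. The field widths (5 bits for chromosome, 28 for position, 31 for the allele pair), the integer chromosome codes, the 11-base limit for reversible alleles and all function names are assumptions made for illustration. Long or non-ACGT alleles, which the abstract says are handled through a lookup table, are simply rejected here.

```python
# Illustrative sketch only: field widths and names are assumptions,
# not taken from the abstract.
CHROM_BITS, POS_BITS, REFALT_BITS = 5, 28, 31  # assumed layout, 64 bits total
MAX_BASES = 11                                 # assumed limit for reversible alleles
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = "ACGT"


def encode_refalt(ref: str, alt: str) -> int:
    """Pack short alleles reversibly: 4-bit ref length, 4-bit alt length,
    then 2 bits per base, left-aligned in 30 bits (top flag bit stays 0)."""
    bases = ref + alt
    if len(bases) > MAX_BASES or any(b not in BASE_CODE for b in bases):
        raise ValueError("long or non-ACGT alleles would need a hash and lookup table")
    code = (len(ref) << 4) | len(alt)
    for b in bases:
        code = (code << 2) | BASE_CODE[b]
    return code << (2 * (MAX_BASES - len(bases)))


def decode_refalt(code: int) -> tuple[str, str]:
    """Recover (ref, alt) from a code produced by encode_refalt."""
    ref_len = (code >> 26) & 0xF
    alt_len = (code >> 22) & 0xF
    bases = "".join(
        CODE_BASE[(code >> (20 - 2 * i)) & 0x3] for i in range(ref_len + alt_len)
    )
    return bases[:ref_len], bases[ref_len:]


def encode_variantkey(chrom: int, pos: int, ref: str, alt: str) -> int:
    """Combine chromosome code, position and allele code into one integer.
    Chromosome occupies the highest bits, so numeric order follows
    chromosome first and position second."""
    if not 0 <= chrom < (1 << CHROM_BITS):
        raise ValueError("chromosome code out of range")
    if not 0 <= pos < (1 << POS_BITS):
        raise ValueError("position out of range")
    return (
        (chrom << (POS_BITS + REFALT_BITS))
        | (pos << REFALT_BITS)
        | encode_refalt(ref, alt)
    )


def decode_variantkey(vk: int) -> tuple[int, int, str, str]:
    """Split a key back into (chrom, pos, ref, alt)."""
    chrom = vk >> (POS_BITS + REFALT_BITS)
    pos = (vk >> REFALT_BITS) & ((1 << POS_BITS) - 1)
    ref, alt = decode_refalt(vk & ((1 << REFALT_BITS) - 1))
    return chrom, pos, ref, alt
```

Under these assumptions, sorting a list of keys as plain integers orders the variants by chromosome and then by position, and a single integer comparison replaces the four per-component comparisons described above.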
Highlights
~100 compounds identified by high-content screen inhibit SGs in HEK293, NPCs and iPS-MNs.
ALS-associated RBPs are recruited to SGs in an RNA-dependent manner.
Molecules with planar moieties prevent recruitment of ALS-associated RBPs to SGs.
Compounds inhibit TDP-43 accumulation in SGs and in TARDBP mutant iPS-MNs.
Footnotes
11 Lead Contact