%0 Journal Article
%A Arli A. Parikesit
%A Peter F. Stadler
%A Sonja J. Prohaska
%T Large-Scale Evolutionary Patterns of Protein Domain Distributions in Eukaryotes
%D 2017
%R 10.1101/142182
%J bioRxiv
%P 142182
%X The genomic inventory of protein domains is an important indicator of an organism’s regulatory and metabolic capabilities. Existing gene annotations, however, can be plagued by substantial ascertainment biases that make it difficult to obtain and compare quantitative domain data. We find that quantitative trends across the Eukarya can be investigated based on a combination of gene prediction and standard domain annotation pipelines. Species-specific training is required, however, to account for the genomic peculiarities in many lineages. In contrast to earlier studies, we find widespread statistically significant avoidance of protein domains associated with distinct functional high-level gene-ontology terms. 1998 ACM Subject Classification: J.3 Life and Medical Sciences
%U https://www.biorxiv.org/content/biorxiv/early/2017/05/27/142182.full.pdf