ABSTRACT
Summary Changes to regulatory sequences account for important phenotypic differences between species and populations. In heterozygous individuals, regulatory polymorphism typically manifests as allele-specific expression (ASE) of transcripts. ASE data from inter-species and inter-population hybrids, in conjunction with expression data from the parents, can be used to infer regulatory changes in cis and trans throughout the genome. Improper data handling, however, can introduce mapping bias and excessive loss of information, problems that arise easily in the cumbersome, multi-dependency pipelines common among current methods. Here, we introduce CompMap, a new, self-contained method implemented in Python that generates allele-specific expression counts from genotype-specific alignments. Rather than assessing individual SNPs, our approach sorts and counts reads within a given homologous region by comparing individual read-mapping statistics from each parental alignment. Reads that align ambiguously to both references are assigned either proportionally to the allele-specific read counts or statistically using a binomial distribution. Using simulations, we show that CompMap has low error rates in assessing regulatory divergence.
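The counting logic described above can be illustrated with a minimal Python sketch. It is not the CompMap implementation: the function names are hypothetical, and the per-read statistic is assumed here to be a mismatch count against each parental reference (lower is better); the actual tool may compare other alignment statistics. Reads that match one parent better are counted for that allele, and tied reads are then split either proportionally or by binomial sampling.

```python
import random


def classify_read(mismatches_p1, mismatches_p2):
    """Assign one read by comparing an assumed mismatch count per parent."""
    if mismatches_p1 < mismatches_p2:
        return "p1"
    if mismatches_p2 < mismatches_p1:
        return "p2"
    return "ambiguous"  # equally good alignment to both references


def resolve_ambiguous(n1, n2, n_amb, method="proportional", rng=None):
    """Distribute ambiguous reads between alleles; returns (count1, count2)."""
    total = n1 + n2
    if method == "proportional":
        # Split according to the ratio of unambiguous reads (even split if none).
        extra1 = round(n_amb * n1 / total) if total else n_amb // 2
    elif method == "binomial":
        # Each ambiguous read is one Bernoulli trial with p = n1 / (n1 + n2).
        p = n1 / total if total else 0.5
        rng = rng or random.Random()
        extra1 = sum(rng.random() < p for _ in range(n_amb))
    else:
        raise ValueError(f"unknown method: {method}")
    return n1 + extra1, n2 + (n_amb - extra1)


def count_region(read_stats, method="proportional", rng=None):
    """Count allele-specific reads for one homologous region.

    read_stats: iterable of (mismatches_p1, mismatches_p2), one pair per read.
    """
    labels = [classify_read(a, b) for a, b in read_stats]
    return resolve_ambiguous(
        labels.count("p1"),
        labels.count("p2"),
        labels.count("ambiguous"),
        method=method,
        rng=rng,
    )


# Toy region: two reads favour parent 1, one favours parent 2, three are tied.
toy_region = [(0, 2), (0, 1), (1, 0), (0, 0), (1, 1), (2, 2)]
proportional_counts = count_region(toy_region)
binomial_counts = count_region(toy_region, "binomial", rng=random.Random(1))
```

Both strategies preserve the total read count for the region. The proportional split is deterministic, whereas the binomial draw keeps sampling variance in the resulting counts.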
Availability The Python code, with examples and installation instructions, is available in the GitHub repository at https://github.com/santiagosnchez/CompMap
Contact santiago.snchez{at}gmail.com
Competing Interest Statement
The authors have declared no competing interest.