Abstract
The recent availability of large-scale sequence data for the human Y chromosome has revolutionized analyses of, and insights gained from, this non-recombining, paternally inherited chromosome. However, studies to date have focused on Eurasian variation, and hence the diversity of early-diverging branches found in Africa has not been adequately documented. Here we analyze over 900 kb of Y chromosome sequence obtained from 547 individuals from southern African Khoisan and Bantu-speaking populations, identifying 232 new sequences from basal haplogroups A and B. We find new branches within haplogroups A2 and A3b1 and suggest that the prehistory of haplogroup B2a is more complex than previously suspected; this haplogroup is likely to have existed in Khoisan groups before the arrival of Bantu-speakers, who brought additional B2a lineages to southern Africa. Furthermore, we estimate older dates than obtained previously both for the A2-T node within the human Y chromosome phylogeny and for some individual haplogroups. Finally, there is pronounced variation in branch length between major haplogroups; haplogroups associated with Bantu-speakers have significantly longer branches. This likely reflects a combination of biases in the SNP calling process and demographic factors, such as an older average paternal age (hence a higher mutation rate), a higher effective population size, and/or a stronger effect of population expansion for Bantu-speakers than for Khoisan groups.