TY  - JOUR
T1  - Procedures for enumerating and uniformly sampling transmission trees for a known phylogeny
JF  - bioRxiv
DO  - 10.1101/160812
SP  - 160812
AU  - Matthew Hall
Y1  - 2017/01/01
UR  - http://biorxiv.org/content/early/2017/07/10/160812.abstract
N2  - One approach to the reconstruction of infectious disease transmission trees from pathogen genomic data has been to annotate the internal nodes of a phylogeny with information about the host that each ancestral lineage was infecting. If the transmission bottleneck is complete, the set of all possible ways of making this annotation is equivalent to the set of partitions of the nodes of the phylogeny such that the nodes in each partition element induce a connected subgraph of the tree. However, the mathematical properties of this space remain largely unexplored. Here, I describe a procedure for calculating the cardinality of the set of partitions for a given phylogeny, and show how to sample uniformly from that set. The procedure is first outlined for situations where one sample is available from each host and trees do not have branch lengths. It is then extended to allow incomplete sampling, multiple sampling, and application to time trees in situations where limits on the period during which each host could have been infected are known. The sampling algorithm is available as an R script.
ER  -