Summary
The plastid genome retains several features from its cyanobacterial-like ancestor, one being the co-transcriptional organization of genes into operon-like structures. Some plastid operons have been identified, but many more undoubtedly remain undiscovered. Here we use the highly variable plastome structure found in certain legumes of the inverted repeat lost clade (IRLC), whose plastomes exhibit an unusually high frequency of translocations and inversions, to find conserved gene clusters. We analysed the plastomes of 23 legume species and identified 32 locally collinear blocks (LCBs): regions that occur in different orientations and/or orders among the plastid genomes but are themselves free from internal rearrangements. The number of LCBs appears to have reached saturation in our data set, suggesting that these LCBs are not random but likely represent legume plastid operons protected from internal rearrangement by functional constraint. Some of the LCBs we identify, such as psbD/C/Z, are previously known plastid operons. Others, such as rpl32-ndhF-psbA-matK-rbcL-atpB-atpE, may represent novel polycistronic operons in legumes.
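
The LCB concept can be illustrated with a minimal sketch on signed gene orders. This is not presented as the method used in this study; it is a simplified adjacency-based illustration that assumes each genome is a linear list of (gene, strand) pairs, that every gene occurs exactly once per genome, and that circularity can be ignored. The function names and the two toy gene orders are hypothetical: the gene names are borrowed from the text above, but the orders and strands are invented for the example.

```python
def adjacencies(genome):
    """Return the oriented adjacencies of a linear signed gene order.

    Each adjacency is stored in both reading directions, so a block that
    has been inverted as a whole still shares its adjacencies.
    """
    adj = set()
    for (g1, s1), (g2, s2) in zip(genome, genome[1:]):
        adj.add(((g1, s1), (g2, s2)))
        # Same adjacency read from the opposite strand
        adj.add(((g2, -s2), (g1, -s1)))
    return adj


def collinear_blocks(genomes):
    """Split the first genome into runs of genes whose adjacencies are
    conserved in every genome (a simplified stand-in for LCBs)."""
    reference = genomes[0]
    shared = set.intersection(*(adjacencies(g) for g in genomes))
    blocks = [[reference[0][0]]]
    for left, right in zip(reference, reference[1:]):
        if (left, right) in shared:
            blocks[-1].append(right[0])   # adjacency conserved: extend block
        else:
            blocks.append([right[0]])     # breakpoint: start a new block
    return blocks


# Hypothetical toy gene orders (invented; +1 / -1 denote strand).
genome_a = [("psbD", 1), ("psbC", 1), ("psbZ", 1),
            ("rbcL", 1), ("atpB", -1), ("atpE", -1)]
# Same genes, with the second cluster inverted and moved to the front.
genome_b = [("atpE", 1), ("atpB", 1), ("rbcL", -1),
            ("psbD", 1), ("psbC", 1), ("psbZ", 1)]

blocks = collinear_blocks([genome_a, genome_b])
print(blocks)
```

In this toy case the two clusters differ in order and orientation between the genomes, yet each remains internally intact, so each is recovered as one block. Adding further genomes can only split blocks, never merge them, which is the intuition behind asking whether the block count has saturated.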