Introduction

In a footnote to a letter written in 1860 from Alfred Russel Wallace to Charles Darwin, Wallace (1860) drew attention to a phenomenon that he simply could not understand: “P.S. ‘Natural Selection’ explains almost everything in Nature, but there is one class of phenomena I cannot bring under it,—the repetition of the forms and colours of animals in distinct groups, but the two always occurring in the same country and generally on the very same spot”. Within months, his long-time friend and former traveling companion Henry Walter Bates had proposed a compelling evolutionary explanation for a significant subset of the observations that Wallace had alluded to. Inspired by his observations of Amazonian butterflies, Bates argued that a palatable species might gain protection from predators by resembling an unpalatable or otherwise unprofitable species (Bates 1862), a phenomenon later referred to as “Batesian mimicry”. Bates’ observations and perceptive explanation were quickly interpreted as outstanding examples of the power of natural selection, and they were included by Darwin from the fourth edition onwards of “The Origin of Species”.

Yet Bates (1862) had also noted examples of apparently unpalatable butterfly species that appeared to resemble one another: (p. 507) “Not only, however, are the Heliconidae the objects selected for imitation; some of them are themselves the imitators”. Although Bates suggested that rare species, independent of their palatability, may gain protection from predators by resembling more common species—anticipating Müller’s arguments by more than a decade (Mallet and Joron 1999)—he also proposed that (1862, p. 513): “Some of the mutual resemblances of the Heliconidae already mentioned seem not to be due to the adaptation of the one to the other, but rather...to the similar adaptation of all to the same local, probably inorganic, conditions”, a phenomenon now referred to as “pseudo-mimicry” (Grobman 1978).

The gifted and iconoclastic German naturalist Johannes Friedrich (“Fritz”) Müller, who had emigrated to Brazil in 1852, was struck by the same puzzling phenomenon of mimicry among unpalatable butterflies. Müller proposed several explanations for the trend, many of which he communicated through private letters to Darwin. For example, one early explanation was inspired by Darwin’s theory of sexual selection, specifically that individuals may develop a preference for mates with certain colour patterns after seeing other forms (of the same and different species) with similar appearances (Poulton 1909). However, if this particular argument was correct then one would expect that the process would tend to produce closer resemblances among male co-mimics than females (assuming females were the choosier sex)—in fact, where sexual dimorphism occurs, it is the females that tend to be mimetic (Turner 1978). In 1878, Müller gave a brief description of a different explanation for the phenomenon; this time, like Bates’ original theory, it assumed that mimicry arose as a consequence of selection imposed by predation (Müller 1878). Müller’s argument was based on “strength in numbers”: two or more unpalatable species may evolve a similar appearance simply because they share the mortality costs involved in teaching naïve predators to avoid them. A fuller description of the theory was subsequently published by Müller in the journal Kosmos in 1879 and quickly translated by Raphael Meldola for the Entomological Society of London (Müller 1879). It is a theory that continues to resonate, although, as we will see, it is not without controversy.

In presenting his case, Müller provided what could well be the first formal mathematical model to support an evolutionary hypothesis. Following Müller’s original argument (1879), let \(a_1\) and \(a_2\) be the numbers of two approximately equally unpalatable (or otherwise defended) species in some definite district during one summer, and let \(n\) be the number of individuals of each distinct unpalatable species that are killed by predators during a season before their distastefulness is generally known. If the species are distinct in appearance, then each species would lose \(n\) individuals in the course of educating predators. If, however, the two species were exactly alike in appearance, then the first species would lose only \(a_1 n/(a_1 + a_2)\) and the second would lose only \(a_2 n/(a_1 + a_2)\). Under these conditions, a mimetic mutant of species 1 that perfectly resembled species 2 would tend to spread from extreme rarity (in that it will have a higher mean survivorship than its more common conspecifics) so long as \(a_2 > a_1\). Following the same logic as Müller’s original model, Mallet (1999) showed that if highly unpalatable species are avoided more rapidly than weakly unpalatable species (see Lindström et al. 2006; Skelhorn and Rowe 2006a for recent demonstrations of this effect) then an analogous set of conclusions applies. In this case, however, the relative benefits of mimicry are a function of both relative unpalatability and relative abundance (the two are in effect traded off against one another), so that even if two unpalatable species had identical densities then we could still see selection on the less defended species to resemble the better defended species (Mallet 1999).
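The arithmetic of Müller’s argument can be sketched in a few lines of Python. This is only an illustration of the model as described above: the function names are hypothetical, and the abundances and the value of n in the usage lines are assumed values chosen for illustration, not data from any study cited here.

```python
def losses_when_distinct(a1, a2, n):
    """Each distinct-looking species pays the full toll of n individuals."""
    return n, n


def losses_when_alike(a1, a2, n):
    """Identical-looking species share a single toll of n in proportion to abundance."""
    total = a1 + a2
    return n * a1 / total, n * a2 / total


def mimetic_mutant_favoured(a1, a2, n):
    """Compare per-capita risk of a rare species-1 mutant that looks like species 2
    with that of its ordinary conspecifics (the mutant is assumed rare enough
    that it adds negligibly to a2)."""
    resident_risk = n / a1  # ordinary members of species 1
    mutant_risk = n / a2    # mutant shares the toll paid by species 2
    return mutant_risk < resident_risk


# Illustrative (assumed) values: a rare and a common species, and a toll of n.
a1, a2, n = 2000, 10000, 1200
print(losses_when_distinct(a1, a2, n))     # both species lose n
print(losses_when_alike(a1, a2, n))        # the toll is shared
print(mimetic_mutant_favoured(a1, a2, n))  # rarer species gains by resembling the commoner
print(mimetic_mutant_favoured(a2, a1, n))  # the reverse mutant does not
```

Because n cancels in the comparison of per-capita risks, the direction of selection in this simple version depends only on which species is the more abundant.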

In contrast to Batesian mimicry, Müller’s explanation for similarity among unpalatable species (a phenomenon that has become known as “Müllerian mimicry”, see below) was initially greeted with general scepticism. Bates himself “could not see that Dr Müller’s explanations and calculations cleared up all the difficulties” and that mimicry among unpalatable species remained, “a great stumbling block” (HW Bates in Müller 1879). Wallace, while accepting the logic of Müller’s theory, argued that many such “difficult cases of mimicry” may arise either due to the rarity of one of the unpalatable species (following Bates’ proposals above) or because some predators might find one of the unpalatable species palatable, suggesting an evolutionary dynamic that is more Batesian in character than Müllerian (Wallace 1882).

In this review, I discuss the evidence that has accumulated supporting and opposing Müller’s specific model and his arguments in general. At the outset we must recognise a semantic issue: should “Müllerian mimicry” describe Müller’s specific mechanism through which unpalatable species evolve a common appearance, or the general phenomenon of a similarity between two or more unpalatable species? Hereafter (in recognition of Müller’s key contribution and widespread use of the term in this way), I use “Müllerian mimicry” to refer to the general phenomenon of evolved similarity among unpalatable species and “Müller’s hypothesis” to describe his specific theory.

Taxonomic distribution of Müllerian mimicry

An informal survey of the scientific journal literature since 1960 might suggest that mimicry among well-defended species is predominantly a lepidopteran phenomenon. For example, in September 2007, I conducted a Web of Science® survey of experimental–observational papers since 1960 with keywords “Müllerian” (in various guises) and “mimicry”. This simple survey indicated that 52 of 87 (60%) empirical (as opposed to theoretical) articles on Müllerian mimicry dealt with butterflies, the majority of which were neotropical heliconids. While heliconid butterflies are indeed among the most celebrated examples (Turner 1981; Sheppard et al. 1985; Brower 1996), many other fascinating cases have been uncovered including burnet moths (Sbordini et al. 1979; Niehuis et al. 2007), bumblebees (Plowright and Owen 1980; Williams 2007), heteropteran bugs (Zrzavy and Nedved 1999), poison arrow frogs (Symula et al. 2001; Chiari et al. 2004), vipers (Sanders et al. 2006), fish (Springer and Smith-Vaniz 1972; in Randall 2005) and perhaps even birds (Dumbacher and Fleischer 2001; see also the footnote in Bates 1862, p. 507, referring to avian examples of mimicry observed by AR Wallace). A form of Müllerian mimicry may also arise in rewarding flower species which gain a mutual advantage from evolving a common advertising display (Chittka 1997; Roy and Widmer 1999; Benitez-Vieyra et al. 2007). Of course, this phenomenon is based on different profitable types evolving the same signals to enhance the rate at which they are exploited by consumers, but the underlying principles are the same. Taking the analogy even further, the Appendix briefly outlines some potential examples of Müllerian mimicry in the world of marketing.

Rings rather than pairs

Although Müllerian mimicry is often presented as involving resemblance between just two species, it is important to recognise at the outset that it frequently involves larger collections of similar-looking species—“mimicry rings” (Mallet and Gilbert 1995; Gilbert 2005). Linsley et al. (1961) for example describe a series of “lycid complexes” that include collections of unpalatable lycid beetles, arctiid moths, parasitic hymenoptera and flies, all of which are orange in coloration with black tips (see Fig. 1a,b). Similarly, tarantula hawk wasps in the genera Pepsis and Hemipepsis have some of the most painful stings known to man and form Müllerian (and Batesian) mimicry complexes with many other species of stinging tarantula hawks, as well as numerous flies, beetles and moths (Schmidt 2004). Cross-order examples of Müllerian mimicry are not uncommon and provide some of the most spectacular examples of adaptive resemblance, including the co-mimicry of tiger beetles and wasps (Schultz 2001) and the co-mimicry of moths and wasps (Weller et al. 2000, Fig. 1c,d).

Fig. 1

a–d Examples of Müllerian mimicry. a An unpalatable lycid beetle (Coleoptera) and b an unpalatable arctiid moth (Lepidoptera), both with highly contrasting orange and black colours. The two species were photographed on goldenrod in southern Ontario at the same time of year (photo credit: Henri Goulet). c, d Two arctiid moths (Lepidoptera) with different hymenopteran models, namely c black pompilid wasps and d yellow Pepsis wasps (photo credit: Rebecca Simmons)

Evidence for Müller’s hypothesis

I begin by examining evidence for the specific assumptions of the mathematical model proposed by Müller, as well as evidence for its general predictions. Where specific assumptions of Müller’s model are unsupported, I have attempted to suggest why and propose alternative approaches. I highlight field experiments in particular because I believe the insights they deliver are frequently more directly applicable. However, understanding the psychology of predation often requires experiments conducted in a more controlled laboratory setting, and for this reason these investigations have also played an important role in shaping our understanding of Müllerian mimicry.

Do predators need to learn to avoid unpalatable prey?

As Müller immediately recognised (Müller 1878), his theory rests on there being a degree of learnt avoidance of unpalatable prey—if all avoidance were innate then there would be no predators to educate. While there does appear to be some evidence of innate avoidance, notably for highly dangerous prey such as snakes (e.g. Caldwell and Rubinoff 1983), and naïve predators frequently show heightened wariness when presented with novel and/or conspicuously patterned prey (e.g. Schuler and Roper 1992), there are now numerous studies demonstrating learnt avoidance of natural prey by co-occurring predators. For example, Mostler (1935) found that, in their first encounters, young birds showed the same behaviour towards noxious insects as towards harmless ones. Likewise, when exposed to butterflies for the first time, naïve young jacamars tended to attack unpalatable butterflies without inhibition yet subsequently rejected them (Chai 1996). In other experiments, inexperienced lizards readily attacked unpalatable butterflies (cast using a fishing rod), but rapidly learned to avoid them (Boyden 1976). In a more recent experiment, Pinheiro (2003) released a variety of palatable and unpalatable butterfly species into different habitats and recorded the responses of avian predators. The unpalatable butterflies were often sight-rejected by birds, but this avoidance behaviour was much more evident in habitats in which both predators and prey co-occurred, emphasizing how avoidance is typically learned. So, there seems ample evidence that many predators need to sample unpalatable or otherwise defended prey before they learn to avoid them.

Do predators take a “fixed n” of unpalatable prey, independent of their density?

One of the specific assumptions made by Müller was that predators consume a fixed number of each distinct unpalatable prey species over the course of a particular period of time, independent of their density, before learning to reject them. Of course, all models are simplifications of the real world and it is arguably unfair to interpret Müller’s hypothesis too strictly. Nevertheless, addressing this assumption provides an important starting point for enquiry.

To date, there have been no data to support the assumption of a “fixed n” of each unpalatable prey type attacked, independent of their density, over the time course of an experiment. Indeed, several experiments have reported a significant positive correlation between the number of unpalatable prey of a given type that were attacked (n) by the end of the study and the number of that prey type originally presented (a) (Greenwood et al. 1989; Lindström et al. 2001; Beatty et al. 2004). This correlation has previously been explained as a consequence of predators not seeing enough of the rarest forms to complete their learning (Greenwood et al. 1989; Mallet 2001). However, an alternative explanation for the phenomenon may be that predators occasionally return to attacking the more common unprofitable type (Beatty et al. 2004) since the opportunity cost of not attacking them, should some prove to be palatable, is higher. Lynn (2005) has since presented an alternative explanation, supported by a simple signal detection model in which palatable and unpalatable prey types are occasionally confused, but it is hard to envisage predators having much difficulty in discriminating the palatable from unpalatable prey types in any of the above experiments.

Of course, we do not need a fixed n in a given time period for Müller’s general mechanism to work (Mallet 2001) but a positive relationship between n and a will inevitably make the phenomenon harder to detect. For example, if the number of each unpalatable prey taken in the course of learning is always directly proportional to their abundance and both phenotypes have identical coefficients such that \(n_1 = k a_1\) and \(n_2 = k a_2\) then the proportion of each prey type attacked will be constant (\(k\)) whatever the relative densities of the prey types and there will be no selection for mimicry. A more general yet more relevant representation of Müller’s model is that, all else being equal, predators will take disproportionately more of the rarer distinct unpalatable type of prey (relative to abundance) in the course of their learning, that is \((n_1/a_1) > (n_2/a_2)\) for \(a_1 < a_2\). If this condition holds, then there will still be “strength in numbers” and (“anti-apostatic”) selection for uniformity.
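The condition for “strength in numbers” can be checked numerically for different assumed relationships between the number of prey attacked and prey abundance. In the sketch below, the three rules (fixed, proportional and a square-root rule) and all numerical values are hypothetical illustrations; only the fixed and proportional cases are discussed in the text, and the square-root rule simply stands in for any relationship in which attacks rise with abundance but less than proportionally.

```python
import math


def fixed_rule(a):
    """Müller's assumption: the same number attacked whatever the abundance."""
    return 50


def proportional_rule(a):
    """Attacks directly proportional to abundance (n = k * a, here k = 0.25)."""
    return 0.25 * a


def sqrt_rule(a):
    """Hypothetical intermediate case: attacks rise with abundance, but
    less than proportionally."""
    return 5 * math.sqrt(a)


def favours_common_form(rule, a_rare, a_common):
    """True if the rarer type suffers the higher per-capita attack rate,
    i.e. (n_rare / a_rare) > (n_common / a_common)."""
    return rule(a_rare) / a_rare > rule(a_common) / a_common


for name, rule in [("fixed", fixed_rule),
                   ("proportional", proportional_rule),
                   ("square root", sqrt_rule)]:
    print(name, favours_common_form(rule, a_rare=100, a_common=900))
```

Under the proportional rule the per-capita risk is identical for both types, so the comparison returns False and there is no selection for uniformity; under the other two rules the rarer type is at a disadvantage.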

Field experiments have tended to confirm that the above inequality holds under natural conditions, such that common forms of unpalatable Heliconius butterflies have a selective advantage compared to rare forms of the same unpalatable species (Benson 1972; Mallet and Barton 1989; Kapan 2001). Likewise, experiments with avian predators feeding on two distinct forms of artificial unpalatable prey (occurring at frequencies of 1:9 and 9:1, Greenwood et al. 1989) and just one form of unpalatable prey (4%, 12% and 32% of total prey, Lindström et al. 2001) have found that unpalatable phenotypes experience greater “mortality” when they are rare (see also Ihalainen et al. 2008). So, although there is no evidence that predators consume a “fixed n” when learning to avoid unpalatable prey types, it seems that common forms of unpalatable prey frequently have an advantage over rarer forms and that predation continues to explain why.

Nevertheless, we should be cautious. Some experiments have failed to find any significant anti-apostatic selection when different types of unpalatable prey have been presented. In particular, experiments involving garden birds (Greenwood et al. 1981) and domestic chicks (Greenwood et al. 1981) have found little or no evidence to indicate that the common form is at a selective advantage over the rarer form. Likewise, in more recent experiments using captive great tits (Rowe et al. 2004; Lindström et al. 2006; Ihalainen et al. 2007), the combined mortality of unpalatable prey was no higher when they comprised two different signals compared to one, indicating there was no “strength in numbers”. These experiments, which have repeatedly found no evidence for anti-apostatic selection, suggest that strong selection for Müllerian mimicry among unpalatable prey is not as inevitable as one might first expect, and it is important to ask why. One reason may be that the captive birds in aviary experiments are typically required to continue foraging for a fixed period of time or until a particular number of food items are attacked, whereas in more natural systems the predators can simply move on if they taste something unpleasant—as such, the relative rates of mortality involved in avoidance learning may be qualitatively different in an experimental setting (I am grateful to MP Speed for this observation).

Generalization and the importance of palatable prey

Another potential explanation for the mixed experimental evidence for anti-apostatic selection (invoked in several of the above studies) is that the learning environments in many of the aviary experiments with artificial prey may not have been particularly challenging. In effect, when predators are presented with two unpalatable prey types and one palatable one, then all they need to do is to remember the characteristics of the one palatable prey type to meet their dietary requirements, not the characteristics of the two unpalatable ones. Indeed, the ability of predators to differentiate between rewarding and non-rewarding types is likely to be higher when both types are presented simultaneously, so that predators receive direct differential conditioning (Dyer and Chittka 2004).

Müller did not consider the appearance of profitable prey at all when making his arguments, and in effect he assumed that predators learn to recognise the characteristic features of each and every distinct unpalatable prey type independently before they are avoided. Yet, as Fisher (1930) observed, “being recognized as unpalatable is equivalent to avoiding confusion with palatable prey”. Therefore, while predators may well learn to associate a given species’ defensive attributes with particular stimuli, it is also likely that predators will adopt rules to help distinguish palatable from unpalatable prey and thereby maximize their rate of reward, particularly in the early stages of foraging (MacDougall and Dawkins 1998; Sherratt and Beatty 2003). Such rules are likely to involve a combination of both generalisation (attributing common properties to distinguishable objects) and categorisation (using stimuli to classify into discrete groups, see Chittka and Osorio 2007), and it is clear that both processes may play an important role in influencing the way Müllerian mimicry evolves. For example, if predators seek to place prey into categories (good vs bad to eat, say), then even unpalatable prey species that are readily distinguishable may mutually reduce one another’s attack rate, so long as they share a common appearance property (Chittka and Osorio 2007). Likewise, recent work has suggested that generalisation may facilitate the gradual evolution of Müllerian mimicry (Balogh and Leimar 2005), mediated by a form of “peak shift”, in which a predator’s maximum aversive response arises around a negative stimulus in the direction of the other negative stimulus. Given the prevalence of generalisation, it is important to note that a predator’s perceived failure to discriminate between two different prey species does not necessarily mean it has no means of telling the prey types apart; such behaviour may also arise from predators generalising their experiences, with wider generalisation typically expected for more aversive prey (Sherratt 2001; Balogh and Leimar 2005).

Animals do not have a limitless capacity for processing (and storing) information, so one would predict that rules for differentiating profitable from unprofitable prey will be of particular importance to predators when they are faced with a diverse array of potential prey (MacDougall and Dawkins 1998). Beatty et al. (2004) put the above ideas to the test using a computer “game” in which humans foraged for artificial computer-generated profitable and unprofitable prey. When there was just one common profitable and one unprofitable prey type, then selection on a rare second species of unprofitable prey to become an imperfect mimic of the more common unprofitable prey type was relatively weak. By contrast, when there were six profitable and six unprofitable prey (all of which shared a common appearance feature making them distinct from profitable prey), then there was intense selection on a seventh unprofitable prey to share some of the characteristics with the common unprofitable prey. Collectively, these results indicate that Müllerian mimicry is disproportionately more likely to evolve in relatively diverse communities with a range of prey types, compared to simple communities. Of course, humans are well known for their high intelligence and strategising, and there is a need to test these ideas with “real” predators.

How intense is the selection for uniformity in natural systems?

If avoidance learning were quick, there were few naïve predators, and/or the densities of both of the potential unpalatable mimics were high, then one would expect selection for Müllerian mimicry to be relatively weak. Indeed, Beatty et al. (2004) found that two dissimilar-looking unprofitable prey types only had significantly different survivorship when there was a marked disparity in their densities. Here, I ask just how intense is the selection on unpalatable prey species to evolve a resemblance to one another. Put another way, is the predator community really that naïve, and unpalatable prey populations so low in density, that predator education has a significant impact on fitness?

From mark-recapture experiments on Lepidoptera, it appears that relatively few novel-looking unpalatable Heliconius butterflies tend to be attacked before the local predator community learns to avoid them. For example, Mallet and Barton (1989) found that novel experimental and familiar control butterflies (approximately 20 of each type released per site) differed primarily in their probability of establishment and that most selection (in terms of differences in survival) occurred very soon after release. Similarly, Kapan (2001) released novel experimental and familiar control butterflies at “low” and “high” densities (one pair every 150–200 m [46 experimental, 34 controls] and one pair every 40 m [21 experimental, 16 controls]) and found that life expectancies only differed significantly between treatments and controls at low-density sites (2 days vs 12 days).

Nevertheless, despite apparent rapid learning, selection on appropriate signals can be intense. The selection coefficient s for mimicry in the above cases (and similar examples) can be estimated by comparing the fitness of the rare experimental morphs with familiar control morphs. Assuming a constant daily reproductive output, then the selection against unfamiliar colour patterns can be estimated as \(s = 1 - [\text{life expectancy}]_{\text{experimental}} / [\text{life expectancy}]_{\text{control}}\). Despite the fact that differences in mortality appear to be restricted to the establishment phase and only significant when prey are at low overall densities, the estimates of s tend to be relatively high, of the order 0.22 (Benson 1972) to 0.52 (Mallet and Barton 1989) and 0.83 (Kapan 2001, low-density releases combined; yet 0.06 for high-density releases), indicating that there can indeed be intense selection against the rare form. It seems that the estimated coefficients of selection are so high precisely because predation acts so quickly in reducing the chances of novel unpalatable forms establishing. Of course, like Müller’s initial model, these experiments compare the success of a common mimic with a rare non-mimic rather than an incipient imperfect mimic, but it is clear that the anti-apostatic selection is frequently strong enough to maintain any such mimicry if it arises. Similar effects have been seen in aviary experiments. For example, Ihalainen et al. (2008) found that great tits (Parus major) with prior experience of a warningly coloured unpalatable model avoided these models at much higher rates than entirely novel unpalatable forms, even when these novel forms were somewhat similar in appearance to the original model.
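As a worked example of the estimate described above (one minus the ratio of experimental to control life expectancy), the short Python sketch below uses the low-density life expectancies quoted earlier for Kapan (2001), namely 2 days for experimental and 12 days for control butterflies. The function name is hypothetical, and the calculation assumes, as in the text, a constant daily reproductive output.

```python
def selection_coefficient(life_expectancy_experimental, life_expectancy_control):
    """Selection against the unfamiliar morph, estimated as
    1 - (experimental life expectancy / control life expectancy)."""
    if life_expectancy_control <= 0:
        raise ValueError("control life expectancy must be positive")
    return 1 - life_expectancy_experimental / life_expectancy_control


# Low-density releases quoted above: 2 days (experimental) vs 12 days (control).
s = selection_coefficient(2, 12)
print(round(s, 2))  # 0.83
```

The result matches the value of 0.83 reported above for the low-density releases.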

The above field estimates of life expectancy are based on a series of assumptions relating to the probability of re-sighting if alive. For example, we must control for the fact that, to a human field researcher, novel-looking forms may be easier to find than experimental controls (Kapan 2001), which would act to enhance the apparent survival advantage of novel types. Furthermore, disappearances of released butterflies can happen for reasons other than death. Nevertheless, there is considerable indirect evidence that predation may play an important role in mediating the success of the different colour forms. For example, the fact that novel morphs of butterflies were particularly disadvantaged in release sites with predatory birds such as jacamars (Galbula spp) and that the novel forms showed a higher incidence of beak marks (Mallet and Barton 1989), all indicate that predators play a key role in generating the observed selection.

Spatial polymorphisms

The observation of selection against rarity (anti-apostatic selection) leads to an apparent paradox: the processes that generate Müllerian mimicry appear to promote uniformity in appearance, yet many Müllerian mimics are notorious for their spatial variation in form. For example, the neotropical Müllerian mimics Heliconius erato and Heliconius melpomene (Turner 1981; Sheppard et al. 1985) are locally monomorphic and resemble one another closely in any given area, but both species co-vary in appearance from area to area “as if by touch of an enchanter’s wand” (comments of H.W. Bates in Müller 1879, p. xxix), exhibiting up to 30 different colour patterns at different locations throughout their ranges (Brower 1996; Jiggins and McMillan 1997; Flanagan et al. 2004). Analogous spatial variation has been reported in a number of other Müllerian mimics, such as the burnet moth Zygaena ephialtes, which exists in four different colour forms in different areas throughout Europe, with each form a member of a different Müllerian mimicry complex (Turner 1971), cotton stainer bugs (genus Dysdercus) which show widespread coincident intra-specific variation in colour pattern (Zrzavy and Nedved 1999) and Eulaema bees which likewise co-vary in colour pattern over South America (Dresler 1979). Spatial mosaic formation is not exclusively a feature of Müllerian mimicry systems; even Batesian mimics show localised polymorphisms, evolving to resemble the defended models in their area (Ruxton et al. 2004). For example, colubrid snakes of the genus Pliocercus are rear-fanged (non-venomous) and while they comprise at most two species in Central America, they occur in a variety of phenotypic forms resembling the front-fanged venomous Micrurus coral snakes in the area (Greene and McDiarmid 1981).

Although spatial mosaics are seen as something of a paradox in Müllerian mimicry systems (e.g. Langham 2004), their formation is a direct consequence of the frequency dependence favouring common forms, imposed at limited spatial scales (Joron and Iwasa 2005). For example, simulations confirm that all that is required to facilitate the mosaic structure is localised anti-apostatic selection, low predator and prey dispersal rates and chance effects which help generate the initial heterogeneity (Sherratt 2006, Fig. 2). While the zones at which different phenotypic forms meet often coincide with a barrier to dispersal such as a major river, several mathematical and simulation models suggest that the hybrid zones between different phenotypic forms can form even without such barriers, in which case they can move in a manner influenced by the curvature of the boundary (Sasaki et al. 2002; Sherratt 2006; Kawaguchi and Sasaki 2006). Indeed, a well-known juncture between two forms of H. erato has been documented to move about 47 km in 17 years (Blum 2002).

Fig. 2

Spatial mosaic formation by Müllerian mimics. Here, populations of two species (1 and 2) are distributed in a regular grid (50 × 50 cells). Individuals of each species can occur in any one of ten different morphs (the dominant colour morph in each cell is shown for species 1 and 2). In any given time step, individuals disperse (at low rates) to neighbouring cells, the local predator community continues to forage on a given phenotype in a given cell until a fixed number is consumed, and the prey reproduce. Starting with a random distribution of morphs to cells (an extreme assumption generating high heterogeneity), the system soon reduces to one in which the two species in the same cell share the same appearance, with narrow ‘hybrid zones’ between races of the different phenotypes
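The ingredients listed for mosaic formation (localised anti-apostatic selection, low dispersal and chance effects) can be explored with a much-reduced sketch of this kind of grid simulation. The Python code below is not the model of Sherratt (2006): it follows a single species rather than two, uses a small grid with wrap-around edges, and every function name and parameter value is an assumption made purely for illustration.

```python
import random

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def random_grid(size, n_morphs, capacity, rng):
    """Fill every cell with `capacity` individuals of randomly chosen morphs."""
    grid = []
    for _ in range(size):
        row = []
        for _ in range(size):
            draws = rng.choices(range(n_morphs), k=capacity)
            row.append([draws.count(k) for k in range(n_morphs)])
        grid.append(row)
    return grid


def step(grid, n_fixed, capacity, dispersal, rng):
    """One generation: fixed-number predation, then dispersal and reproduction."""
    size = len(grid)
    n_morphs = len(grid[0][0])
    # Predation: every morph in every cell loses up to n_fixed individuals,
    # so locally rare morphs suffer the higher per-capita mortality.
    survivors = [[[max(c - n_fixed, 0) for c in cell] for cell in row]
                 for row in grid]
    new_grid = []
    for i in range(size):
        new_row = []
        for j in range(size):
            weights = []
            for k in range(n_morphs):
                # Mean number of morph-k survivors in the four neighbouring
                # cells (edges wrap around: a simplifying assumption).
                incoming = sum(survivors[(i + di) % size][(j + dj) % size][k]
                               for di, dj in NEIGHBOURS) / len(NEIGHBOURS)
                weights.append((1 - dispersal) * survivors[i][j][k]
                               + dispersal * incoming)
            if sum(weights) == 0:
                new_row.append([0] * n_morphs)  # local extinction
                continue
            # Reproduction: refill the cell in proportion to the parent pool.
            offspring = rng.choices(range(n_morphs), weights=weights, k=capacity)
            new_row.append([offspring.count(k) for k in range(n_morphs)])
        new_grid.append(new_row)
    return new_grid


def dominant_morphs(grid):
    """Index of the most abundant morph in each cell."""
    return [[cell.index(max(cell)) for cell in row] for row in grid]


rng = random.Random(1)
grid = random_grid(size=10, n_morphs=4, capacity=40, rng=rng)
for _ in range(30):
    grid = step(grid, n_fixed=5, capacity=40, dispersal=0.05, rng=rng)
for row in dominant_morphs(grid):
    print("".join(str(m) for m in row))
```

Printing the dominant morph per cell gives a crude text map of the grid, which can be compared across runs with different dispersal rates or predation tolls. Extending the sketch to two species that share the predation toll within each cell would bring it closer to the scenario described in the caption.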

Rings again

Given that the proposed selective benefits of Müllerian mimicry centre on reducing the burden of predator learning, one may ask a related question to the one posed above—why do not all unpalatable species in the same area evolve the same warning pattern? There are several inter-related general explanations, which are not mutually exclusive. First, the different mimicry rings may contain members that are not completely overlapping in spatio-temporal distribution, so there is little or no selection pressure for phenotypes to converge. Second, the different mimicry rings may contain forms that are so distinct from one another that any intermediate “jack-of-all-trades” phenotypes are at a selective disadvantage. Third, there may be developmental constraints on morphology in some species, limiting the range of phenotypes that they can mimic (although I am not aware of any specific examples).

While neotropical ithomiine butterflies appear to show vertical stratification as a consequence of differences in host plant height (e.g. Beccaloni 1997), the evidence is rather equivocal for taxonomic groups such as Heliconius (Mallet and Gilbert 1995). However, other candidate mechanisms that may enhance local stratification in butterfly mimicry rings include different nocturnal roosting heights (Mallet and Gilbert 1995; Mallet and Joron 1999), different larval host plants (Willmot and Mallet 2004) and a small degree of temporal separation in flight activity (DeVries et al. 1999).

Turner (1984) introduced a gravitational analogy to describe the formation of mimicry rings, suggesting that rings can remain stable because intermediate mutational forms are at a selective disadvantage. Nevertheless, the more species that join a ring, the more powerful is its “gravitational attraction”, so although highly stable, such configurations may not be entirely permanent. Using individual-based models, Franks and Noble (2004) demonstrated that Müllerian mimicry rings can indeed readily evolve for precisely the reasons that Turner (1984) had articulated. They also found that Batesian mimics can “chase” their respective mimicry rings through cycles of colourations, increasing the chance that two mimicry rings might move within convergence range of each other. One consequence of this is that the more Batesian mimics there were in the system, the fewer the mean number of distinct rings that formed. Another implication is that Batesian mimics may serve to enhance the mimetic similarity among Müllerian mimics, although such predictions have yet to be tested.

Advergence or convergence?

If we were to strictly interpret Müller’s model with only two discrete phenotypes, then it predicts there should be selection on one unpalatable species (all else being equal, the rare one) to resemble the other unpalatable species, but the reverse should not be true. In this case, the commoner species benefits from having unpalatable mimics due to “strength in numbers”, yet the predicted evolutionary dynamic is one of unilateral “advergence” (Brower and Brower 1972) rather than “convergence”. So, just because mimicry benefits both species, it does not mean that there should necessarily be convergence.
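The number-dependent logic behind this prediction can be made concrete with a minimal sketch. It assumes, as a simplification, that local predators must attack a fixed number of individuals bearing a given pattern before they learn to avoid it, that these attacks fall on individuals at random, and that a mutant is too rare to alter either species' abundance. The parameter `n_learn`, the function names and the abundances are hypothetical and chosen purely for illustration.

```python
def per_capita_loss(n_learn, pattern_abundance):
    """Per-capita mortality due to predator education for one warning pattern."""
    return n_learn / pattern_abundance


def mutant_advantage(n_learn, own_abundance, other_abundance):
    """Change in per-capita loss for a mutant adopting the other species' pattern.

    Negative values mean the mutant suffers less than its conspecifics,
    so selection favours the switch.
    """
    return (per_capita_loss(n_learn, other_abundance)
            - per_capita_loss(n_learn, own_abundance))


def benefit_of_sharing(n_learn, own_abundance, other_abundance):
    """Reduction in per-capita loss once both species share a single pattern."""
    return (per_capita_loss(n_learn, own_abundance)
            - per_capita_loss(n_learn, own_abundance + other_abundance))


n_learn = 100                 # hypothetical attacks needed per pattern
rare, common = 1_000, 5_000   # hypothetical abundances

# Two discrete phenotypes: who gains by switching pattern?
rare_switch = mutant_advantage(n_learn, rare, common)      # negative: favoured
common_switch = mutant_advantage(n_learn, common, rare)    # positive: disfavoured

# Once the pattern is shared, both species gain, but unequally.
rare_gain = benefit_of_sharing(n_learn, rare, common)
common_gain = benefit_of_sharing(n_learn, common, rare)
ratio = rare_gain / common_gain
```

Under these assumptions a mutant of the rarer species that switches pattern reduces its risk, whereas a mutant of the commoner species that switches increases its risk, which is the unilateral dynamic described above. Both species nonetheless benefit once the pattern is shared, and with a 5:1 abundance ratio the per-capita gain of the rarer species is 25 times that of the commoner one, the square of the abundance ratio.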

Despite the predictions of Müller’s original two-phenotype model, all bets are off when we allow for intermediate forms with imperfect mimicry. For example, it is possible that a mutant of a common species which more closely resembled a rarer species would lose little or none of its protection from resembling the common species and under these conditions one might expect a degree of evolutionary convergence. This was essentially Müller’s (1878) belief (English translation): “the question of which one of two species is the original and which one is the copy is an irrelevant question; each had an advantage from becoming similar to the other; they could have converged to each other”. Dixey (1919) was an early champion of this argument, referring to the phenomenon of reciprocal advantage as “diaposematism” and sharply criticising contemporaries (notably Marshall 1908) who questioned Müller’s insight by making a case for advergence.

The advergent–convergent dichotomy may well be too simplistic. For example, it is possible that even when two species experience selection to resemble one another, differences in the mutational space available and intensity of selection produce an outcome which is predominantly, but not exclusively, advergent in nature (Balogh and Leimar 2005, see Fig. 3). Likewise, Sheppard et al. (1985) noted that while Müllerian mimicry might involve an initial stage of advergence (which is particularly likely if one of the species was cryptic and the other conspicuous), mutual convergence in both species might subsequently be selected for (Franks and Sherratt 2007).

Fig. 3

Although the evolution of mimicry is frequently portrayed as a two-step process (involving a mutation producing a large phenotypic change followed by finer-scale adjustments), recent theory (Balogh and Leimar 2005) confirms that it can evolve by gradual means through predator generalisation and associated peak shift. In this particular simulation, we start with two relatively dissimilar yet equally unpalatable prey species (species A [dotted line] and B [smooth line]), with species B five times more common than species A. In this case, the species evolve mimicry primarily through advergence, in that the rarer species (A) changes more in phenotype than the common species B. Redrawn from Franks and Sherratt (2007).

So far, the empirical evidence available has tended to support advergence (see Mallet 1999 for a detailed discussion). For example, one putative advergent Müllerian species, the poison arrow frog Dendrobates imitator, appears to have evolved resemblance to different model species in different geographical areas (Symula et al. 2001). Similarly, ecological and genetic arguments (Mallet 1999; Flanagan et al. 2004) suggest that Müllerian mimicry among H. melpomene and H. erato may have arisen mostly via advergence of H. melpomene towards H. erato. The rewarding perennial herb Turnera sidoides (Turneraceae) appears to have adverged in floral characteristics to resemble a variety of species of mallow (Malvaceae) on which bees tend to specialise (Benitez-Vieyra et al. 2007). If advergence were widespread, this would make the evolutionary dynamics of Müllerian mimicry much more similar to Batesian mimicry than generally believed (Mallet 1999), which makes distinguishing the two phenomena even harder.

Tasting the difference

It has long been appreciated that the defensive qualities of Müllerian co-mimics are unlikely to be equal (Wallace 1882), and there has been a great deal of debate as to the nature of the relationship between co-mimics when one species is much better defended than the other (see next section). However, what if the defensive attributes of co-mimics were approximately equal in terms of their deterrence but different in mode of action? For example, the monarch butterfly (Danaus plexippus) may possess different defence chemicals than its mimic, the viceroy (Limenitis archippus; Nishida 2002), while the Müllerian co-mimics two-spot ladybird (Adalia bipunctata) and the seven-spot ladybird (Coccinella septempunctata) employ different alkaloids for use in the reflex bleeding process (Gilsan King and Meinwald 1996). Using domestic chicks (Gallus gallus domesticus) as predators and coloured crumbs flavoured with either the same or different unpalatable chemicals as prey, Skelhorn and Rowe (2005) demonstrated that different deterrent chemicals can interact synergistically in similar-looking prey to enhance predator learning and memory. By contrast, follow-up experiments found that chicks learned to avoid two distinct unpalatable prey at a similar rate, whether or not they contained different unpalatable chemicals (Skelhorn and Rowe 2006b). It is unclear precisely why this synergistic effect arose among mimetic prey with different defensive chemicals (although the element of surprise may facilitate learning), so it is difficult to make generalisations about the likelihood of such interactions occurring in natural systems. If such synergy is widespread, it raises the possibility that Müllerian mimicry also serves to enhance the absolute rate of avoidance learning in prey with different defensive chemicals, rather than simply sharing out the costs of initial education (Skelhorn and Rowe 2005; Ruxton and Speed 2005).

The mimicry spectrum

One of the most long-standing, controversial and interesting aspects of Müllerian mimicry is its relationship to Batesian mimicry. The vast majority of textbooks continue to treat Batesian and Müllerian mimicry as distinct phenomena, with parasitic “Batesian mimics” exploiting the signals of unpalatable models and honest “Müllerian mimics” mutually reinforcing the meaning of their shared signals. However, over the years (starting with Dixey, Marshall and Wallace), generations of researchers have questioned the nature of the relationship between these two forms of mimicry, with many workers suggesting that the two forms lie at the extreme ends of a continuum, and some questioning whether Müllerian mimicry actually occurs at all. For example, in a detailed review of mimicry with particular reference to Australian fauna, Nicholson (1927, p. 89) followed Wallace (1882) in proposing that many alleged Müllerian mimics were in fact Batesian in nature: “The incipient mimic need not therefore be palatable; it need only be less distasteful than its model, other things being equal”. DeRuiter (1959, p. 353) was even more explicit, arguing that the Müllerian mechanism “is very unlikely to be realized except when predators live in the presence of such a superabundance of food that they never have to resort to relatively distasteful prey”.

In the previous section, I noted that Müllerian co-mimics with approximately equal levels of defence that differ in their mode of action can sometimes subtly complement one another in facilitating more rapid avoidance learning. However, what happens when the co-mimic species differ in their absolute level of defence? It seems likely that many Müllerian co-mimics will vary in this way. To quote Dixey (1919, p. 564) “every upholder of Müllerian mimicry, so far as I am aware, is not only ready to admit, but is prepared positively to assert that distastefulness is relative; that it exists, like other means of defence, in degrees that may vary indefinitely from species to species”. One potential example is seen in adult tiger moths, many of which employ plant secondary compounds sequestered in their larval stage for defence against vertebrate predators (Weller et al. 2000; Simmons and Weller 2002). Adult moths of the tiger moth genus Sphecosoma mimic different genera and species of polybiine wasps (Hymenoptera: Vespidae), yet it seems reasonable to assume that they are not as well protected as the stinging wasps they resemble. Another potential example of variation in defence among Müllerian mimics is seen in a pair of geometrid moth species of the genus Arichanna, as described by Nishida (1994). These species are visually similar and presumed Müllerian mimics. However, Arichanna gaschkevitchii is a monophagous specialist that collects relatively large quantities of grayanotoxins from its host plant. By contrast, Arichanna melanaria is an oligophagous generalist that collects smaller quantities of grayanotoxins and is presumably rather less harmful to its predators.

What are the implications of such variation? Of course, even if two defended mimetic species vary in their defensive attributes, then both could be unprofitable to attack. Under these conditions, Müller’s mechanism could still lead to selection for a mutually beneficial shared warning pattern. Indeed, in a recent aviary experiment, Ihalainen et al. (2007) found that mixing artificial mimetic prey of high and moderate unpalatability did not increase overall predation rates on these prey compared to prey with uniformly high unpalatability. However, other researchers have postulated that a parasitic form of Müllerian mimicry (“quasi-Batesian mimicry”, Speed 1999; Speed and Turner 1999) can arise among similar-looking unpalatable species when they vary in unpalatability (or any other form of defence), with the less unpalatable mimics actually undermining the effectiveness of the signals of their more unpalatable models. There has been considerable interest in this possibility for several reasons. First and foremost, if a moderately unpalatable species has a tendency to corrupt the protection of the signal used by better defended prey, then rarer unpalatable mimics that resemble different model species can be favoured, leading to the co-occurrence of several distinct forms of the same weakly defended species (Müllerian polymorphism). More generally, if this type of relationship were common, then many alleged examples of Müllerian mimicry would really be Batesian. To take an extreme perspective, if quasi-Batesian mimicry were rife, then Müller’s mechanism may explain few if any examples of Müllerian mimicry in the natural world.

Three main mechanisms are thought to be capable of facilitating quasi-Batesian effects, in theory at least. All three mechanisms emphasise the nature of selection after an initial period of avoidance learning is complete, or learning has reached some form of equilibrium. First, it is possible that predators may sometimes be prepared to consume weakly unpalatable prey in times of nutritive need, so that, in effect, distasteful prey temporarily become tolerated. If such nutritional crises are commonplace, then it is clear that it would pay a weakly defended prey to evolve a similarity to a better defended prey (Sherratt et al. 2004) and that in some cases this mimicry would increase the attack rates on the better defended model (Speed 1993a). Given the recent experimental and theoretical developments in this area, I consider the phenomenon in detail in the next section.

The second mechanism thought to be capable of generating Müllerian mimicry through advergent (and potentially parasitic) means follows Wallace in recognising variation among predator species. It is widely acknowledged that certain predator species may find some “unpalatable” prey types (not necessarily the species one might class as moderately unpalatable) acceptable because they readily deal with their defence (Endler and Mappes 2004). For example, greater pewee (Contopus fumigatus), golden-crowned (Myiodynastes chrysocephalus) and dusky-capped flycatchers (Myiarchus tuberculifer) are relatively tolerant of arctiid moths, which are generally considered to be unacceptable to most vertebrate predators (Collins and Watson 1983). Likewise, tanagers (Pipraeidea melanonota) have been observed to attack large numbers of warningly coloured ithomiine butterflies (squeezing out their abdomen contents), yet no other insectivorous birds in the region had been observed attacking these butterflies (Brown and Vasconcellos Neto 1976). So, just as prey acceptability can vary over time due to variation in nutritional state, it can also vary among predator species on a community level due to their ability to deal with particular prey defences. Under these conditions, an unpalatable species that is palatable to some predator species may face selection to resemble a more universally unpalatable species, generating a potential hybrid between Batesian and Müllerian mimicry. Despite its plausibility as a mechanism, I am not aware of any empirical or theoretical work that has attempted to assess the effects of inter-specific (or intra-specific) variability in predators’ abilities to overcome prey defences, on the nature of mimetic relationships among defended prey.

Finally, and most controversially, it is possible that predators adopt a higher long-term attack rate on weakly defended prey compared to better defended prey due to a different equilibrium between learning and forgetting. In essence, if avoidance learning were more effective for highly defended prey, then weakly defended mimics may parasitise the more effective educational properties of the better defended model. Following the work of numerous authors (Owen and Owen 1984; Huheey 1988; Speed 1993b; Speed and Turner 1999), it is now clear that such psychological processes can in theory increase the long-term attack rate on a better defended species when mimicked by a less well-defended species. What is not clear is whether the psychological models assumed are accurate reflections of the means by which predators learn to avoid unpalatable prey (Skelhorn and Rowe 2006a). Indeed, from an optimisation perspective, one might expect that all unpalatable prey would eventually be avoided by well-fed predators (Turner and Speed 1999), unless the predators were exceedingly dim-witted, or the unpalatable prey looked like palatable prey.
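The parasitic outcome envisaged under this third mechanism can be illustrated with a deliberately crude sketch, which is not a reconstruction of any of the published psychological models. It assumes that predators settle on a fixed long-term probability of attacking each prey type on encounter when the types are visually distinct (`p_model` for the better defended prey, `p_mimic` for the weaker one), and that when the two are indistinguishable the attack probability on the shared pattern is simply the frequency-weighted average of the two. All names and values are hypothetical.

```python
def shared_attack_probability(p_model, p_mimic, mimic_fraction):
    """Long-term attack probability on a shared pattern (toy assumption).

    p_model, p_mimic: equilibrium attack probabilities on each prey type
        when it is encountered alone (hypothetical values).
    mimic_fraction: proportion of pattern-bearing prey that are mimics.
    """
    if not 0.0 <= mimic_fraction <= 1.0:
        raise ValueError("mimic_fraction must lie between 0 and 1")
    return (1.0 - mimic_fraction) * p_model + mimic_fraction * p_mimic


p_model, p_mimic = 0.05, 0.30   # weaker defence -> higher equilibrium attack rate

# Attack probability experienced by the model as mimics become more common
risk_by_fraction = {
    f: shared_attack_probability(p_model, p_mimic, f)
    for f in (0.0, 0.25, 0.5)
}
```

Under these assumptions the attack probability on the shared pattern rises from `p_model` towards `p_mimic` as the weakly defended species becomes more common, so the mimic gains (its risk is below `p_mimic`) while the model loses, a (+/−) relationship. Whether real predators behave in this averaging manner is precisely the point in dispute.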

State (and density) dependent foraging behaviour as a route to mimicry

Here, I briefly discuss the phenomenon of state-dependent foraging behaviour in relation to unpalatable prey and raise the question as to how important these processes are in generating mimicry among unpalatable prey, compared to Müller’s learning-based theory. I also discuss how density effects may come to influence the evolutionary dynamics of mimicry.

There is now a substantial body of work to support the contention that predators are indeed more prepared to attack defended prey items in times of nutritive need (Poulton 1890; Gelperin 1968; Chai 1986). For example, European starlings (Sturnus vulgaris) increased their attack rates on quinine-injected mealworms when their body masses and fat stores were experimentally reduced, while choice trials clearly indicated the response was not due to any hunger-based reduction in the discriminatory abilities of the birds (Barnett et al. 2007). Moreover, Srygley and Kingsolver (2000) showed that when the (presumed) demand for resources increased at the height of red-winged blackbirds’ breeding season, then more individuals of a moderately distasteful species of butterfly (tethered on platforms) were attacked by adults.

Of course, it is difficult to know how frequently energetic shortfalls occur in a natural setting. The availability of alternative palatable prey will clearly play a key role (Kokko et al. 2003; Sherratt 2003), and this parameter is likely to vary seasonally. One might wonder why, if weakly defended prey are occasionally attacked, they do not simply evolve greater defence, although it is equally clear that defences are often costly (Ruxton et al. 2004). It is also important to note that while state-dependent foraging behaviour can cause weakly defended prey to evolve mimicry of more defended prey, this does not necessarily mean that the relationship is parasitic. In particular, if mimicry somehow allows higher overall prey densities (the system is “open”, Sherratt et al. 2004), then the extra availability of potential food in the system can simultaneously (1) enhance the deterrent effect of the common signal (since there is more guaranteed food if the predator should ever find itself critically short of energy) and (2) reduce the per capita rate of encounter with any prey (a saturation effect). Under these conditions, even the presence of a palatable Batesian mimic could improve the survivorship of the unpalatable model it resembles (Speed 1999)!

Density-dependent feedbacks complicate the evolution of traits ranging from senescence (Abrams 1993) to clutch size (Ricklefs 2000), and it is not surprising to note that they may also play a role in the evolution of mimicry. A dramatic example of the above saturation effect (termed “ochlosis” by Carpenter 1948, and “arithmetic mimicry” by van Someren and Jackson 1959) was seen in a recent experiment by Rowland et al. (2007) who allowed captive great tits (P. major) to forage for a fixed number of food items (palatable and unpalatable nuts wrapped in marked paper). When moderately unpalatable mimics were added to the system, this reduced the overall proportion of highly unpalatable models that were attacked simply because more food was available. Moreover, despite the addition of moderately unpalatable prey, the great tits did not significantly change their probability of attacking the highly unpalatable models on encounter. Intriguingly, in an earlier experiment to investigate quasi-Batesian mimicry, Speed et al. (2000) found that increasing the density of moderately unpalatable mimics actually increased the “mortality” of the more unpalatable model. Yet, in this particular experiment, the overall prey densities were kept constant, so that the increased tendency of predators to attack models on encounter would not have been masked by increases in food availability.
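The arithmetic behind the saturation effect can be sketched as follows. This is an illustrative toy rather than a description of the design or analysis of either experiment. It assumes that a predator stops after a fixed number of attacks (`meals`), encounters prey at random in proportion to their abundance, always takes palatable prey, attacks any prey bearing the shared warning pattern with the same probability (`p_attack`), and that prey are not appreciably depleted during foraging. All names and numbers are hypothetical.

```python
def proportion_models_attacked(meals, n_palatable, n_models, n_mimics, p_attack):
    """Expected fraction of models attacked by a predator with a fixed quota.

    Each encounter ends in an attack with probability
    (n_palatable + p_attack * (n_models + n_mimics)) / total prey,
    so the quota of `meals` attacks is spread over that "effective food".
    Assumes sampling with replacement (no depletion of prey).
    """
    effective_food = n_palatable + p_attack * (n_models + n_mimics)
    expected_model_attacks = meals * p_attack * n_models / effective_food
    return expected_model_attacks / n_models


meals, n_palatable, n_models, p_attack = 20, 50, 25, 0.2

without_mimics = proportion_models_attacked(meals, n_palatable, n_models, 0, p_attack)
with_mimics = proportion_models_attacked(meals, n_palatable, n_models, 25, p_attack)
```

With `p_attack` held constant, adding mimics enlarges the pool of items over which the fixed quota is spread, so `with_mimics` is lower than `without_mimics` even though the predator is no more reluctant to attack a model on encounter. If total prey density is instead held constant, this dilution disappears and any rise in `p_attack` translates directly into higher model mortality.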

What is the field and comparative evidence for quasi-Batesian mimicry?

Quasi-Batesian (parasitic) mimicry (+/−) is only one of three plausible relationships that may hold between unpalatable (or otherwise defended) mimetic prey, the alternatives being relationships that are commensal (+/0) or mutualistic (+/+) in nature. Indeed, if the defended co-mimics differ greatly in density, then I strongly suspect that it will pay the rare species to resemble the common species but that this mimicry will not markedly affect the efficacy of the signal already adopted by the more common prey species, such that the relationship will be commensal (Mallet 1999).

If quasi-Batesian mimicry were widespread, then one might expect there to be selection on common weakly defended species to resemble several different well-defended models. That said, one might expect far less intense selection for polymorphism in quasi-Batesian mimics than bona fide Batesian mimics because it will take more weakly defended mimics to reduce the protection to a better defended model than a classical Batesian mimic (Ruxton et al. 2004). So, the fact that local polymorphisms are more typical (although by no means common, Joron and Mallet 1998) in Batesian than Müllerian systems does not allow one to rule out quasi-Batesian parasitism. It turns out that co-occurring morphs do occasionally occur in Müllerian systems—for example, the African monarch Danaus chrysippus and its co-mimic Acraea spp. occur in several different yet sympatric forms within East Africa (Edmunds 1974), while two to ten mimetic colour forms of Heliconius numata are often seen flying together in the Amazon basin (Joron et al. 1999; Joron 2005). Yet in these cases, there are several alternative plausible explanations for the polymorphism (based primarily on localised selection combined with a degree of population inter-mixing), leading one to conclude that we cannot use the incidence of polymorphism in unpalatable mimetic prey as an acid test for or against quasi-Batesian mimicry.

I suspect that a more fruitful way of gauging the prevalence of parasitic relationships among unpalatable prey is to examine the influence of the relative density of the mimics; that is, we remove the “all else being equal” clause. The classical Müllerian learning-based hypothesis suggests that rare (or late-emerging) unpalatable prey can evolve to mimic a more common (or early-emerging) unpalatable prey even if this more common prey is less well defended. By contrast, the mimicry of a less well-defended prey by a better defended prey cannot be explained on the basis of state-dependent predatory behaviour alone, and it may be challenging to explain on the basis of long-term differential rates of learning and forgetting. Not many studies have evaluated both the relative level of defence in mimetic species (which, to add to the complication, may vary from area to area) and the way in which the Müllerian mimicry evolved. As noted above, Mallet (1999) and others (e.g. Flanagan et al. 2004) have argued that H. melpomene has adverged to mimic the erato group, yet these two species appear to have similarly high levels of unpalatability to birds (Brower et al. 1963; Srygley and Chai 1990). Nevertheless, the viceroy butterfly L. archippus has clearly adverged in appearance to match the monarch D. plexippus and queen Danaus gilippus (see Mallet 1999), yet it appears the viceroy is about as unpalatable as the monarch and is more unpalatable than the queen (Ritland 1991a, b; Ritland and Brower 1991; but see Brower 1958). Symula et al. (2001) similarly argued that D. imitator mimics three different poison frogs in different geographical regions, and yet this species appears more toxic than any of its models. If confirmed, perhaps the frogs have evolved this way due to their relative rarity or because the local predators were already trained to avoid other frogs with particular warning signals at the time D. imitator initially spread (see for example Ihalainen et al. 2008). Whatever the reason, it looks like we have specific cases of advergence by better defended species to resemble more weakly defended species that Müller’s learning hypothesis can readily explain, yet state-dependent theories (in which the weakly defended species becomes a mimic of the better defended species to avoid predation by hungry predators) cannot.

Future work

Much of the work conducted on Müllerian mimicry has so far centred on neotropical butterflies, despite the fact that it is a taxonomically widespread phenomenon. More experimental field studies, especially with non-lepidopteran groups, would therefore be of considerable value in elucidating the full range and mode of action of the selective processes at work. For example, despite the obvious parallels, the evolution of Müllerian mimicry among insect-pollinated plants has so far received scant attention (but see Chittka 1997; Roy and Widmer 1999).

There is also a need to consider how avoidance learning and predators’ response to this information might differ between aviary experiments and more natural settings. This research is important because it is becoming increasingly clear that the ways in which predators learn to avoid unpalatable prey may differ qualitatively between simple and complex communities and because predators in natural conditions have options (such as flying away) that are not available to them in caged experiments. Indeed, under natural conditions, it may be days before a predator sees another individual with the same rare colour pattern, but the understandable constraints on experimental design mean that rare unpalatable forms of prey are often presented simultaneously and/or within a relatively short time period.

On a more mechanistic level, we still need to know precisely why the number of unpalatable prey attacked tends to increase with their density and why different defences in co-mimics can interact to enhance avoidance learning. This work is needed, not least because both phenomena can markedly affect the intensity and nature of anti-apostatic selection. In particular, there is an intriguing possibility that Müllerian mimics may benefit one another not only through classical Müllerian mechanisms but also, for example, by adding uncertainty as to the precise type of defence an organism has. Such phenomena clearly merit further investigation.

Finally, while the subject of mimicry has already seen a great deal of theorising, one potentially fruitful line of enquiry for theorists is addressing the implications of variation for Müllerian mimicry, including between-species variation in predators’ abilities to deal with prey defences. Thus, there is evidence that certain prey species may be unprofitable to some predator species yet profitable to others, and it is not entirely obvious what the selective consequences of this variation are for the evolution of mimicry. Perhaps more importantly, there is an increasing realisation that defences can evolve too, such that a Müllerian system might evolve into a Batesian system or vice-versa, although I am aware of no evidence of this from a phylogenetic perspective. The easy part will be developing the theory to understand such transitions; the hard part will be putting these theories to the test.