Abstract
Inferring species interactions from observational data is one of the most controversial tasks in community ecology. One difficulty is that a single pairwise interaction can ripple through an ecological network and produce surprising indirect consequences. For example, two competing species would ordinarily correlate negatively in space, but this effect can be reversed in the presence of a third species that is capable of outcompeting both of them when it is present. Here, I apply models from statistical physics, called Markov networks or Markov random fields, that can predict the direct and indirect consequences of any possible species interaction matrix. Interactions in these models can be estimated from observational data via maximum likelihood. Using simulated landscapes with known pairwise interaction strengths, I evaluated Markov networks and several existing approaches. The Markov networks consistently outperformed other methods, correctly isolating direct interactions between species pairs even when indirect interactions or abiotic environmental effects largely overpowered them. A linear approximation, based on partial covariances, also performed well as long as the number of sampled locations exceeded the number of species in the data. Indirect effects reliably caused a common null modeling approach to produce incorrect inferences, however.
Key words: Ecological interactions; Occurrence data; Species associations; Markov network; Markov random field; Ising model; Biogeography; Presence–absence matrix; Null model
Introduction
If nontrophic species interactions, such as competition, are important drivers of community assembly, then ecologists might expect to see their influence in our data sets (MacArthur 1958, Diamond 1975). Despite decades of work and several major controversies, however (Lewin 1983, Strong et al. 1984, Gotelli and Entsminger 2003, Connor et al. 2013), existing methods for detecting competition’s effects on community structure are unreliable (Gotelli and Ulrich 2009), and thus important ecological processes remain poorly understood. More generally, it can be difficult to reason about the complex web of direct and indirect species interactions in real assemblages, especially when these interactions occur against a background of other ecological processes such as dispersal limitation and environmental filtering (Connor et al. 2013). For this reason, it isn’t always clear what kinds of patterns would even constitute evidence of competition, as opposed to some other biological process or random sampling error (Lewin 1983, Roughgarden 1983).
Most existing methods in this field compare the frequency with which two putative competitors are observed to co-occur, versus the frequency that would be expected if all species on the landscape were independent (Strong et al. 1984, Gotelli and Ulrich 2009). Examining a species pair against such a “null” background, however, rules out the possibility that the overall association between two species could be driven by an outside force. For example, even though the two shrub species in Figure 1 compete with one another for resources at a mechanistic level, they end up clustering together on the landscape because they both grow best in areas that are not overshadowed by trees. If this sort of effect is common, then significant deviations from independence will not generally provide convincing evidence of species’ direct effects on one another.
While competition between the two shrubs in the previous example does not leave the commonly-expected pattern in community structure (negative association at the landscape level), it nevertheless does leave a signal in the data (Figure 1C). Specifically, among shaded sites, there will be a deficit of co-occurrences, and among unshaded sites, there will also be such a deficit.
In this paper, I introduce Markov networks (undirected graphical models also known as Markov random fields; Murphy 2012) as a framework for understanding the landscape-level consequences of pairwise species interactions, and for detecting them with observational data. Markov networks, which generalize partial correlations to non-Gaussian data (Lee and Hastie 2012, Loh and Wainwright 2013), have been used in many scientific fields for decades to model associations between various kinds of “particles”. For example, a well-studied network named the Ising model has played an important role in our understanding of physics (where nearby particles tend to align magnetically with one another; Cipra 1987). In spatial contexts, these models have been used to describe interactions between adjacent grid cells (Harris 1974, Gelfand et al. 2005). In neurobiology, they have helped researchers determine which neurons are connected to one another by modeling the structure in their firing patterns (Schneidman et al. 2006). Following recent work by Azaele et al. (2010) and Fort (2013), I suggest that ecologists could similarly treat species as the interacting particles in the same modeling framework. Doing so would allow ecologists to simulate and study the landscape-level consequences of arbitrary species interaction matrices, even when our observations are not Gaussian. While ecologists explored some related approaches in the 1980’s (Whittam and Siegel-Causey 1981), computational limitations had previously imposed severe approximations that produced unintelligible results (e.g. “probabilities” greater than one; Gilpin and Diamond 1982). Now that it is computationally feasible to fit these models exactly, the approach has become worth a second look.
The rest of the paper proceeds as follows. First, I discuss how Markov networks operate and how they can be used to simulate landscape-level data or to predict the direct and indirect consequences of possible interaction matrices. Then, using simulated data sets where the “true” ecological structure is known, I compare this approach with several existing methods for detecting species interactions. Finally, I discuss opportunities for extending the approach presented here to larger problems in community ecology.
Methods
Conditional relationships and Markov networks
Ecologists are often interested in inferring direct interactions between species, controlling for the indirect influence of other species. In statistical terms, this implies that ecologists want to estimate conditional (“all-else-equal”) relationships, rather than marginal (“overall”) relationships. The most familiar conditional relationship is the partial correlation, which indicates the portion of the sample correlation between two species that remains after controlling for other variables in the data set (Albrecht and Gotelli 2001). The example with the shrubs and trees in Figure 1 shows how the two correlation measures can have opposite signs, and suggests that the partial correlation is more relevant for drawing inferences about species interactions (e.g. competition). Markov networks extend this approach to non-Gaussian data, much as generalized linear models do for linear regression (Lee and Hastie 2012).
Markov networks give a probability value for every possible combination of presences and absences in communities. For example, given a network with binary outcomes (i.e. 0 for absence and 1 for presence), the relative probability of observing a given presence-absence vector, $\vec{y}$, is given by

$$p(\vec{y}; \alpha, \beta) \propto \exp\Big(\sum_{i} \alpha_i y_i + \sum_{i<j} \beta_{ij} y_i y_j\Big). \qquad (1)$$
Here, $\alpha_i$ is an intercept term determining the amount that the presence of species i contributes to the log-probability of the presence-absence vector $\vec{y}$; it directly controls the prevalence of species i. Similarly, $\beta_{ij}$ is the amount that the co-occurrence of species i and species j contributes to the log-probability; it controls the probability that the two species will be found together (Figure 2A, Figure 2B). β thus acts as an analog of the partial covariance, but for non-Gaussian networks. Because the relative probability of a presence-absence vector increases when positively-associated species co-occur and decreases when negatively-associated species co-occur, the model tends to produce assemblages that have many pairs of positively-associated species and relatively few pairs of negatively-associated species (exactly as an ecologist might expect).
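As a small illustration of Equation 1, the relative probability of an assemblage can be computed by adding the α terms for the species that are present and the β terms for the pairs that co-occur, then exponentiating. The sketch below is not the code used for the analyses in this paper; the function name, the dictionary layout for β, and the coefficient values are all assumptions chosen for readability.

```python
import math

def relative_probability(y, alpha, beta):
    """Unnormalized probability of a presence-absence vector y.

    y     : list of 0/1 values, one per species
    alpha : list of intercepts, one per species
    beta  : dict mapping a pair (i, j), with i < j, to its pairwise coefficient
    """
    log_p = sum(a * yi for a, yi in zip(alpha, y))
    log_p += sum(b * y[i] * y[j] for (i, j), b in beta.items())
    return math.exp(log_p)

# Hypothetical values: species 0 and 1 are positively associated,
# species 0 and 2 are negatively associated.
alpha = [0.5, -1.0, 0.0]
beta = {(0, 1): 2.0, (0, 2): -1.5}

both_mutualists = relative_probability([1, 1, 0], alpha, beta)
both_competitors = relative_probability([1, 0, 1], alpha, beta)
```

Only ratios of these values are meaningful until they are normalized; here the first assemblage is favored over the second by a factor of exp(2.5).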
A major benefit of Markov networks is the fact that the conditional relationships between species can be read directly off the matrix of β coefficients (Murphy 2012). For example, if the coefficient linking two mutualist species is +2, then, all else equal, the odds of observing either species increase by a factor of $e^2$ when its partner is present (Murphy 2012). Of course, if all else is not equal (e.g. Figure 1, where the presence of one competitor is associated with release from another competitor), then species’ marginal association rates can differ from this expectation. For this reason, it is important to consider how coefficients’ effects propagate through the network, as discussed below.
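The factor quoted above follows from Equation 1 in one step. Holding every other species fixed, all terms that do not involve species i cancel from the ratio of the two relative probabilities, leaving the conditional odds below. This is a short derivation rather than a new result; $\vec{y}_{-i}$, the states of all species other than i, is a symbol introduced here for convenience, and β is treated as symmetric ($\beta_{ij} = \beta_{ji}$).

```latex
\frac{p(y_i = 1 \mid \vec{y}_{-i})}{p(y_i = 0 \mid \vec{y}_{-i})}
  = \exp\Big(\alpha_i + \sum_{j \neq i} \beta_{ij} y_j\Big)
```

Switching a partner species k from absent to present therefore multiplies these odds by $e^{\beta_{ik}}$, which is $e^2 \approx 7.4$ when $\beta_{ik} = 2$.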
Estimating the marginal relationships predicted by a Markov network is more difficult than estimating conditional relationships, because doing so requires absolute probability estimates. Turning the relative probability given by Equation 1 into an absolute probability entails scaling by a partition function, Z(α, β), which is defined so that the probabilities of all possible assemblages that could be produced by the model sum to one (bottom of Figure 2B). Calculating Z(α, β) exactly, as is done in this paper, quickly becomes infeasible as the number of species increases: with $2^N$ possible assemblages of N species, the number of bookkeeping operations required for exact inference grows exponentially into the billions and beyond. Numerous techniques are available for working with Markov networks that keep the computations tractable, e.g. via analytic approximations (Lee and Hastie 2012) or Monte Carlo sampling (Salakhutdinov 2008), but they are beyond the scope of this paper.
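For a handful of species, the partition function can be computed by brute force, which makes the gap between conditional and marginal relationships easy to see. The sketch below enumerates all $2^N$ assemblages for a three-species network in the spirit of the tree-and-shrub example. The coefficient values are hypothetical (the values behind Figure 1 are not reproduced here), and the helper names are assumptions.

```python
import itertools
import math

def assemblage_probabilities(alpha, beta):
    """Exact probability of every presence-absence vector, by enumeration."""
    n = len(alpha)
    states = list(itertools.product([0, 1], repeat=n))  # all 2^N assemblages
    weights = []
    for y in states:
        log_p = sum(a * yi for a, yi in zip(alpha, y))
        log_p += sum(b * y[i] * y[j] for (i, j), b in beta.items())
        weights.append(math.exp(log_p))
    z = sum(weights)  # partition function Z(alpha, beta)
    return {y: w / z for y, w in zip(states, weights)}

def covariance(probs, i, j, keep=lambda y: True):
    """Covariance of species i and j among assemblages where keep(y) is true."""
    subset = {y: p for y, p in probs.items() if keep(y)}
    total = sum(subset.values())
    p_i = sum(p for y, p in subset.items() if y[i]) / total
    p_j = sum(p for y, p in subset.items() if y[j]) / total
    p_ij = sum(p for y, p in subset.items() if y[i] and y[j]) / total
    return p_ij - p_i * p_j

# Hypothetical coefficients: species 0 is a tree, species 1 and 2 are shrubs.
alpha = [3.0, 2.0, 2.0]
beta = {(0, 1): -3.0, (0, 2): -3.0, (1, 2): -1.0}
probs = assemblage_probabilities(alpha, beta)

marginal = covariance(probs, 1, 2)                       # all sites pooled
shaded = covariance(probs, 1, 2, lambda y: y[0] == 1)    # tree present
unshaded = covariance(probs, 1, 2, lambda y: y[0] == 0)  # tree absent
```

With these made-up values, the shrubs covary positively across the whole landscape (about +0.04) even though their β coefficient is negative, while the covariance is negative (about -0.03) within both the shaded and the unshaded sites.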
Simulations
In order to compare different methods for drawing inferences from observational data, I simulated two sets of landscapes using known parameters.
The first set of simulated landscapes included the three competing species shown in Figure 1. For each of 1000 replicates, I generated a landscape with 100 sites by sampling exactly from a probability distribution defined by the interaction coefficients in that figure (Appendix A). Each of the methods described below (a Markov network, two correlation-based methods and a null model) was then evaluated on its ability to correctly infer that the two shrub species competed with one another, despite their frequent co-occurrence.
I also simulated a second set of landscapes with five, ten, or twenty potentially-interacting species on landscapes composed of 20, 100, 500, or 2500 observed communities (24 replicate simulations for each combination; Appendix B). These simulated data sets span the range from small, single-observer data sets to large collaborative efforts such as the North American Breeding Bird Survey. As described in Appendix B, I randomly drew the “true” coefficient values for each replicate so that most species pairs interacted negligibly, a few pairs interacted very strongly, and competition was three times more common than facilitation. I then used Gibbs sampling to randomly generate replicate landscapes with varying numbers of species and sites (Appendix B). For half of the simulated landscapes, I treated each species’ α coefficient as a constant, as described above. For the other half, I treated the α coefficients as linear functions of two abiotic environmental factors that varied from location to location across the landscape (Appendix B). The latter set of simulated landscapes provides an important test of the methods’ ability to distinguish co-occurrence patterns that were generated by pairwise biotic interactions from those that were generated by external forces like abiotic environmental filtering. This task was made especially difficult because, as with most analyses of presence-absence data for co-occurrence patterns, the inference procedure did not have access to any information about the environmental or spatial variables that helped shape the landscape (cf Connor et al. 2013, Blois et al. 2014).
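Gibbs sampling for a binary Markov network can be sketched in a few lines: each species is repeatedly redrawn from its conditional distribution given the current states of all the others, whose log-odds are $\alpha_i + \sum_j \beta_{ij} y_j$. The code below is a minimal illustration, not the procedure from Appendix B. The function name, the use of one independent chain per site, the number of sweeps, and the coefficient values are all assumptions.

```python
import math
import random

def gibbs_sample_landscape(alpha, beta, n_sites, n_sweeps=200, seed=0):
    """Draw n_sites presence-absence vectors, one independent chain per site."""
    rng = random.Random(seed)
    n = len(alpha)
    # Symmetric lookup table so that b[i][j] == b[j][i].
    b = [[0.0] * n for _ in range(n)]
    for (i, j), value in beta.items():
        b[i][j] = b[j][i] = value
    landscape = []
    for _ in range(n_sites):
        y = [rng.randint(0, 1) for _ in range(n)]  # arbitrary starting state
        for _ in range(n_sweeps):
            for i in range(n):
                # Conditional log-odds of species i given all other species.
                eta = alpha[i] + sum(b[i][j] * y[j] for j in range(n) if j != i)
                p_present = 1.0 / (1.0 + math.exp(-eta))
                y[i] = 1 if rng.random() < p_present else 0
        landscape.append(tuple(y))
    return landscape

# Hypothetical five-species network with one competing and one facilitating pair.
alpha = [0.0, 0.0, 0.0, 0.0, 0.0]
beta = {(0, 1): -2.0, (2, 3): 1.5}
landscape = gibbs_sample_landscape(alpha, beta, n_sites=100)
```

Environmental filtering could be mimicked in the same loop by replacing the constant alpha[i] with a value computed from each site's environmental covariates.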
Inferring α and β coefficients from presence-absence data
In the previous two sections, the values of α and β were known. In practice, however, ecologists will often need to estimate these parameters from co-occurrence data. When the number of species is reasonably small, one can compute exact maximum likelihood estimates for all of the α and β coefficients by optimizing the likelihood of the observed presence-absence vectors under Equation 1. Fully-observed Markov networks like the ones considered here have unimodal likelihood surfaces (Murphy 2012), ensuring that this procedure will always converge on the global maximum. This maximum is the unique combination of α and β coefficients that would be expected to produce exactly the observed co-occurrence frequencies. For the analyses in this paper, I used the rosalia package (Harris 2015a) for the R programming language (R Core Team 2015) to define the objective function and gradient as R code. The rosalia package then uses the BFGS method in R’s optim function to find the best values for α and β.
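The moment-matching property described above suggests a very simple fitting loop: the gradient of the mean log-likelihood is the difference between observed and expected prevalences (for α) and between observed and expected co-occurrence frequencies (for β). The sketch below uses plain gradient ascent in place of BFGS and exact enumeration for the expectations, so it is only practical for a few species. It is not the rosalia implementation; the function names, step-size rule, iteration count, and the example counts are assumptions.

```python
import itertools
import math

def expected_statistics(alpha, beta, pairs):
    """Expected prevalences and co-occurrence frequencies, by enumeration."""
    n = len(alpha)
    states = list(itertools.product([0, 1], repeat=n))
    weights = [
        math.exp(sum(a * yi for a, yi in zip(alpha, y))
                 + sum(b * y[i] * y[j] for b, (i, j) in zip(beta, pairs)))
        for y in states
    ]
    z = sum(weights)
    prev = [sum(w * y[i] for y, w in zip(states, weights)) / z for i in range(n)]
    cooc = [sum(w * y[i] * y[j] for y, w in zip(states, weights)) / z
            for (i, j) in pairs]
    return prev, cooc

def fit_markov_network(sites, n_iter=10000):
    """Maximum likelihood estimates of alpha and beta by gradient ascent."""
    n, m = len(sites[0]), len(sites)
    pairs = list(itertools.combinations(range(n), 2))
    obs_prev = [sum(y[i] for y in sites) / m for i in range(n)]
    obs_cooc = [sum(y[i] * y[j] for y in sites) / m for (i, j) in pairs]
    alpha, beta = [0.0] * n, [0.0] * len(pairs)
    step = 4.0 / (n + len(pairs))  # conservative step for this concave objective
    for _ in range(n_iter):
        prev, cooc = expected_statistics(alpha, beta, pairs)
        # Gradient of the mean log-likelihood: observed minus expected.
        alpha = [a + step * (o - e) for a, o, e in zip(alpha, obs_prev, prev)]
        beta = [b + step * (o - e) for b, o, e in zip(beta, obs_cooc, cooc)]
    return alpha, dict(zip(pairs, beta))

# Hypothetical landscape of 1000 sites (species 0 = tree, 1 and 2 = shrubs).
counts = {(0, 0, 0): 14, (0, 1, 0): 103, (0, 0, 1): 103, (0, 1, 1): 280,
          (1, 0, 0): 280, (1, 1, 0): 103, (1, 0, 1): 103, (1, 1, 1): 14}
sites = [state for state, count in counts.items() for _ in range(count)]
alpha_hat, beta_hat = fit_markov_network(sites)
```

For these made-up counts, the estimates should land close to α = (3, 2, 2) with β near -3 for each tree-shrub pair and near -1 for the shrub-shrub pair. If any species or pair never varies in the data, the maximum lies at infinity and the loop will not settle.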
For analyses with 5 or more species, I made a small modification to the maximum likelihood procedure described above. Given the large number of parameters associated with some of the networks to be estimated, I regularized the likelihood using a logistic prior distribution (Gelman et al. 2008) with a scale of 1 on the α and β terms.
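One way to read this regularization is as a penalty added to the log-likelihood: the log-density of a logistic distribution, evaluated at each α and β coefficient and summed. The sketch below shows that log-density and its gradient for a prior centered on zero. How the rosalia package applies the prior internally is not shown here, and the function names are assumptions.

```python
import math

def logistic_log_prior(x, scale=1.0):
    """Log-density of a logistic distribution centered on zero."""
    u = abs(x) / scale  # the density is symmetric, so |x| is numerically safer
    return -u - math.log(scale) - 2.0 * math.log1p(math.exp(-u))

def logistic_log_prior_gradient(x, scale=1.0):
    """Derivative of the log-density with respect to x."""
    return -math.tanh(x / (2.0 * scale)) / scale

# The gradient always points back toward zero, and its size never exceeds
# 1 / scale, so very strong interactions are shrunk only mildly.
pull_on_large_value = logistic_log_prior_gradient(5.0)
pull_on_small_value = logistic_log_prior_gradient(0.1)
```

In a gradient-based fit, these derivatives would be added to the gradient of the total (not per-site) log-likelihood, so the prior matters less as the number of sites grows.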
Other inference techniques for comparison
After fitting Markov networks to the simulated landscapes described above, I used several other techniques for inferring the sign and strength of marginal associations between pairs of species (Appendix B).
The first two alternative interaction measures were the sample covariances and the partial covariances between each pair of species’ data vectors on the landscape (Albrecht and Gotelli 2001). Because partial covariances are undefined for landscapes with perfectly-correlated species pairs, I used a regularized estimate based on ridge regression (i.e. linear regression with a Gaussian prior; Wieringen and Peeters 2014). For these analyses, I set the ridge parameter to 0.2 divided by the number of sites on the landscape.
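The contrast between sample and partial covariances can be sketched with a small amount of linear algebra. The code below adds a ridge term to the diagonal of the sample covariance matrix, inverts it, and converts the resulting precision matrix into pairwise partial covariances. This simple diagonal ridge is an assumption and is not necessarily the estimator of Wieringen and Peeters (2014); the helper names and the example counts are also hypothetical.

```python
def sample_covariance(sites):
    m, n = len(sites), len(sites[0])
    means = [sum(y[i] for y in sites) / m for i in range(n)]
    return [[sum((y[i] - means[i]) * (y[j] - means[j]) for y in sites) / (m - 1)
             for j in range(n)] for i in range(n)]

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def partial_covariances(sites, ridge_numerator=0.2):
    """Return (sample covariances, ridge-regularized partial covariances)."""
    cov = sample_covariance(sites)
    ridge = ridge_numerator / len(sites)
    n = len(cov)
    regularized = [[cov[i][j] + (ridge if i == j else 0.0) for j in range(n)]
                   for i in range(n)]
    prec = invert(regularized)
    partial = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Off-diagonal of the inverse of the 2x2 block of the precision matrix.
                block_det = prec[i][i] * prec[j][j] - prec[i][j] ** 2
                partial[i][j] = -prec[i][j] / block_det
    return cov, partial

# Hypothetical landscape of 1000 sites (species 0 = tree, 1 and 2 = shrubs).
counts = {(0, 0, 0): 14, (0, 1, 0): 103, (0, 0, 1): 103, (0, 1, 1): 280,
          (1, 0, 0): 280, (1, 1, 0): 103, (1, 0, 1): 103, (1, 1, 1): 14}
sites = [state for state, count in counts.items() for _ in range(count)]
cov, partial = partial_covariances(sites)
```

For these made-up counts, the sample covariance between the two shrubs is positive while their partial covariance is negative, matching the sign reversal described for Figure 1.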
The third alternative method, described in Gotelli and Ulrich (2009), involved simulating possible landscapes from a null model that retains the row and column sums of the original matrix (Strong et al. 1984). Using the default options in the Pairs software described in Gotelli and Ulrich (2009), I simulated the null distribution of scaled C-scores (a test statistic describing the number of non-co-occurrences between two species). The software then calculated a Z statistic for each species pair using this null distribution. After multiplying this statistic by –1 so that positive values corresponded to facilitation and negative values corresponded to competition, I used it as another estimate of species interactions.
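The logic of this null model can be illustrated with a short sketch: compute a scaled C-score for the observed pair, rebuild the null distribution by randomly swapping "checkerboard" submatrices (which leaves every row and column sum unchanged), and standardize. This is not the Pairs software and makes no claim to match its defaults. The scaled C-score formula used here, (R_i - S)(R_j - S) / (R_i R_j) with R the occurrence totals and S the number of shared sites, is assumed, as are the function names, swap counts, and the example matrix.

```python
import random

def scaled_c_score(row_i, row_j):
    """Scaled C-score for two species' occurrence rows (both must be non-empty)."""
    r_i, r_j = sum(row_i), sum(row_j)
    shared = sum(a and b for a, b in zip(row_i, row_j))
    return (r_i - shared) * (r_j - shared) / (r_i * r_j)

def swap_randomize(matrix, n_swaps, rng):
    """Shuffle a species-by-site matrix while keeping row and column sums."""
    m = [list(row) for row in matrix]
    for _ in range(n_swaps):
        r1, r2 = rng.sample(range(len(m)), 2)
        c1, c2 = rng.sample(range(len(m[0])), 2)
        # Swap only if the 2x2 submatrix is a checkerboard (10/01 or 01/10).
        if (m[r1][c1] == m[r2][c2] and m[r1][c2] == m[r2][c1]
                and m[r1][c1] != m[r1][c2]):
            m[r1][c1], m[r1][c2] = m[r1][c2], m[r1][c1]
            m[r2][c1], m[r2][c2] = m[r2][c2], m[r2][c1]
    return m

def null_model_z(matrix, i, j, n_null=100, n_swaps=200, seed=0):
    """Sign-flipped Z statistic: positive suggests aggregation, negative segregation."""
    rng = random.Random(seed)
    observed = scaled_c_score(matrix[i], matrix[j])
    null = []
    for _ in range(n_null):
        sim = swap_randomize(matrix, n_swaps, rng)
        null.append(scaled_c_score(sim[i], sim[j]))
    mean = sum(null) / len(null)
    sd = (sum((x - mean) ** 2 for x in null) / (len(null) - 1)) ** 0.5
    # If the null distribution has no spread, report no evidence either way.
    return -(observed - mean) / sd if sd > 0 else 0.0

# Hypothetical matrix: 3 species (rows) by 8 sites (columns).
matrix = [[1, 1, 1, 1, 0, 0, 0, 0],
          [0, 0, 0, 1, 1, 1, 1, 0],
          [1, 0, 1, 0, 1, 0, 1, 1]]
z = null_model_z(matrix, 0, 1)
```

Because the randomization conditions only on row and column totals, it has no way to account for a third species that drives the association, which is the failure mode examined in this paper.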
Method evaluation
For the simulated landscapes based on Figure 1, method evaluation was fairly qualitative: any method whose test statistic for the two shrubs indicated a negative relationship passed; other methods failed.
For the larger landscapes, I rescaled the four methods’ estimates using linear regression through the origin so that they all had a consistent interpretation. For each method, I regressed the “true” β coefficient for each species pair against the model’s estimate, re-weighting the pairs so that each landscape contributed equally to the rescaled estimate. These regressions yielded squared errors for each of the 23,140 simulated species pairs, across all conditions and replicates. I then partitioned these errors into groups defined by the properties of the simulated landscapes (i.e. species richness, number of observed communities, and the presence/absence of environmental filtering). Within each partition, the mean squared error was used to calculate the proportion of variance explained (compared with a baseline model that assumed all interaction strengths to be zero).
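The rescaling and scoring steps can be written compactly. The sketch below fits a weighted regression through the origin, computes the squared error for each species pair, and compares the total against a baseline that predicts zero for every pair. The function name, the argument layout, and the example numbers are assumptions; in the analysis described above, the weights would be chosen so that each landscape contributes equally, and the final ratio would be computed within each partition.

```python
def rescale_and_score(true_beta, estimates, weights=None):
    """Return (slope, per-pair squared errors, proportion of variance explained)."""
    if weights is None:
        weights = [1.0] * len(true_beta)
    # Weighted least-squares slope for a regression through the origin.
    numerator = sum(w * x * y for w, x, y in zip(weights, estimates, true_beta))
    denominator = sum(w * x * x for w, x in zip(weights, estimates))
    slope = numerator / denominator
    squared_errors = [(y - slope * x) ** 2 for x, y in zip(estimates, true_beta)]
    # Baseline model: every interaction strength is assumed to be zero.
    baseline = sum(y ** 2 for y in true_beta)
    explained = 1.0 - sum(squared_errors) / baseline
    return slope, squared_errors, explained

# Hypothetical true coefficients and raw method scores on an arbitrary scale.
true_beta = [1.0, -2.0, 0.0, 0.5]
raw_scores = [2.0, -4.5, 0.5, 1.0]
slope, errors, explained = rescale_and_score(true_beta, raw_scores)
```

Regressing through the origin means a method is rewarded only for getting the sign and relative size of each interaction right, not for any constant offset.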
Results
Three species
As shown in Figure 1, the marginal relationship between the two shrub species was positive—despite their competition for space at a mechanistic level—due to indirect effects of the dominant tree species. As a result, the covariance method falsely reported positive associations for 94% of the simulated landscapes, and the randomization-based null model falsely reported such associations 100% of the time. The two methods for evaluating conditional relationships (Markov networks and partial covariances), however, successfully controlled for the indirect pathway via the tree species and each correctly identified the direct negative interaction between the shrubs 94% of the time.
Larger landscapes
The accuracy of the four evaluated methods varied substantially, depending on the parameters that produced the simulated communities (Figure 3). In general, however, there was a consistent ordering: overall, the Markov network explained 54% of the “true” parameters’ squared deviations from zero, followed by partial covariances (33%), and sample covariances (22%). The null model scores initially explained only 12%. After manually reducing the value of one especially strong outlier (Z = 1004, implying p < $10^{-1000000}$), this increased to 17% (Appendix B). Figure 3 reflects the adjusted version of the results.
Discussion
The results presented above show that Markov networks can reliably recover species’ pairwise interactions from observational data, even for cases where a common null modeling technique reliably fails. Specifically, Markov networks were successful even when direct interactions were largely overwhelmed by indirect effects (Figure 1) or environmental effects (lower panels of Figure 3). For cases where fitting a Markov network is computationally infeasible, these results also indicate that partial covariances, which can be computed straightforwardly by linear regression, can often provide a surprisingly useful approximation. The partial correlations’ success on simulated data may not carry over to real data sets, however; Loh and Wainwright (2013) show that the linear approximations can be less reliable in cases where the true interaction matrix contains more structure (e.g. guilds or trophic levels). On the other hand, if ecologists are familiar enough with the natural history of their study systems to describe this kind of structure as a prior distribution on the parameters or as a penalty on the likelihood, then this information could reduce the effective degrees of freedom to estimate and real-world results might be even better than those shown in Figure 3.
Ecologists will also need natural history to pin down the exact nature of the interactions identified by a network model (e.g. which species in a positively-associated pair is facilitating the other), particularly when real pairs of species can reciprocally influence one another in multiple ways simultaneously (Bruno et al. 2003); the β coefficients in Markov networks have to reduce this complexity to a single number. In short, partial correlations and Markov networks both help prevent us from mistaking marginal associations for conditional ones, but they can’t tell us the underlying biological mechanisms at work.
Despite these limitations, Markov networks have enormous potential to improve ecological inferences. For example, Markov networks provide a simple answer to the question of how competition should affect a species’ overall prevalence, which was a major flash point for the null model debates in the 1980’s (Roughgarden 1983, Strong et al. 1984). Equation 1 can be used to calculate the expected prevalence of a species in the absence of biotic influences ($1/(1 + e^{-\alpha_i})$ for species i; Lee and Hastie 2012). Competition’s effect on prevalence in a Markov network can then be calculated by subtracting this value from the observed prevalence (cf Figure 2D).
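As a worked illustration of this calculation, the sketch below compares a species' prevalence with all β coefficients set aside (the logistic function of its α) against its prevalence in the full network, computed here by exact enumeration in place of an observed value. The three-species coefficients are hypothetical and the function names are assumptions.

```python
import itertools
import math

def prevalence_without_interactions(alpha_i):
    """Expected prevalence when no other species influences species i."""
    return 1.0 / (1.0 + math.exp(-alpha_i))

def prevalence_with_interactions(alpha, beta, species):
    """Expected prevalence of one species in the full network, by enumeration."""
    total = 0.0
    present = 0.0
    for y in itertools.product([0, 1], repeat=len(alpha)):
        log_p = sum(a * yi for a, yi in zip(alpha, y))
        log_p += sum(b * y[i] * y[j] for (i, j), b in beta.items())
        weight = math.exp(log_p)
        total += weight
        if y[species]:
            present += weight
    return present / total

# Hypothetical coefficients: species 0 competes strongly with species 1 and 2.
alpha = [3.0, 2.0, 2.0]
beta = {(0, 1): -3.0, (0, 2): -3.0, (1, 2): -1.0}

alone = prevalence_without_interactions(alpha[0])
in_network = prevalence_with_interactions(alpha, beta, 0)
effect_of_interactions = in_network - alone
```

With these made-up values, species 0 would occupy about 95% of sites on its own but only half of them in the full network, so the net effect of its competitors is a drop of roughly 45 percentage points.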
Markov networks, particularly the Ising model for binary networks, have been studied in statistical physics for nearly a century (Cipra 1987), and the models’ properties, capabilities, and limits are well-understood in a huge range of applications, from spatial statistics (Gelfand et al. 2005) to neuroscience (Schneidman et al. 2006) to models of human behavior (Lee et al. 2013). Modeling species interactions using the same framework would thus allow ecologists to tap into an enormous set of existing discoveries and techniques for dealing with indirect effects, stability, and alternative stable states (i.e. phase transitions; Cipra 1987).
This modeling approach is also highly extensible, even when it is inconvenient to compute the likelihood exactly. For example, the mistnet software package for joint species distribution modeling (Harris 2015b) can fit approximate Markov networks to large species assemblages (>300 species) while simultaneously modeling each species’ nonlinear response to the abiotic environment. Combining multiple ecological processes into a common model could help ecologists to disentangle different factors that can confound simpler co-occurrence analyses (cf Connor et al. 2013). Numerous other extensions are possible: Markov networks can be fit with a mix of discrete and continuous variables, for example (Lee and Hastie 2012). There are even methods (Whittam and Siegel-Causey 1981, Tjelmeland and Besag 1998) that would allow the coefficient linking two species in an interaction matrix to vary as a function of the abiotic environment or of third-party species that tip the balance between facilitation and exploitation (Bruno et al. 2003).
Finally, the results presented here have important implications for ecologists’ continued use of null models to draw inferences about species interactions. Null and neutral models can be very useful for clarifying our thinking about the numerical consequences of species’ richness and abundance patterns (Harris et al. 2011, Xiao et al. 2015), but deviations from a null model must be interpreted with care (Roughgarden 1983). In complex networks of ecological interactions (and even in small networks with three species), it may simply not be possible to implicate individual species pairs or specific ecological processes like competition by rejecting a general-purpose null (Gotelli and Ulrich 2009). Estimating pairwise coefficients directly seems like a much more promising approach: to the extent that the models’ relative performance on real data sets is similar to the range of results shown in Figure 3, scientists in this field could easily double their explanatory power by switching from null models to linear regression and partial covariances, or triple it by switching to a Markov network.
Acknowledgements
This research was funded by a Graduate Research Fellowship from the US National Science Foundation and benefited greatly from discussions with A. Sih, M. L. Baskett, R. McElreath, R. J. Hijmans, A. C. Perry, and C. S. Tysor. Additionally, A. K. Barner, E. Baldridge, E. P. White, D. Li, D. L. Miller, N. Golding, and N. J. Gotelli provided useful feedback on an earlier draft of this work.