Influence of climate forecasts, data assimilation, and uncertainty propagation on the performance of near-term phenology forecasts

Evaluation of ecological forecasts is a vital step in the continuous improvement of near-term ecological forecasts. Here we performed a thorough evaluation of a near-term phenological forecast system that has been operating for several years. We evaluated point forecast accuracy and the reliability of the prediction intervals. We also tested the contribution of upstream climate forecasts to phenology forecast proficiency. We found that 9-month climate forecasts contributed little skill overall, though some species did benefit from them. The assimilation of observed winter and spring temperature provided the largest improvement in forecast skill throughout the spring. We also found that phenology forecast prediction intervals were most robust when uncertainty was propagated from climate, phenological model, and model parameters, as opposed to using climate uncertainty alone. Our analysis points the way toward several potential improvements to the forecasting system, which can be re-evaluated at a future date in a continuous cycle of forecast refinement.


Predicting the future state of ecological systems is essential for conservation, management, stakeholder engagement, and the evaluation of ecological models. Near-term iterative forecasting, where forecasts are regularly updated based on the newest available data, can improve the accuracy of future predictions by using up-to-date information about the system and the newest forecasts for the drivers of the system [...] are also more accurate at shorter lead times, and thus should provide more accurate phenology forecasts as dormancy release draws nearer. Through these two processes, improved weather forecasts at shorter lead times and assimilation of lagged variables, phenological forecasts should improve as spring and the growing season progress. However, there has been little exploration of the relative influence of data assimilation and improved weather forecasts on ecological predictions.

Integrating climate forecasts and assimilating climate data into ecological forecast systems both come with development and computational costs (Taylor and White, 2020; Thomas et al., 2020; Welch et al., 2019), and they should be rigorously tested to justify inclusion over simpler methods. Comparison against a baseline model, such as one based on the long-term climatological average, allows the assessment of "forecast skill" (Jolliffe and Stephenson, 2003; Harris et al., 2018), which indicates how much the model improves by including detailed forecasts and data on climate.
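
To make the idea of skill relative to a baseline concrete, the sketch below uses one common formulation, 1 - RMSE(forecast) / RMSE(baseline). This is an illustrative assumption rather than the exact metric used in this study; the function names and the day-of-year values are hypothetical.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between paired predictions and observations."""
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def skill_score(forecast, baseline, observed):
    """Assumed skill definition: 1 - RMSE_forecast / RMSE_baseline.

    Positive values mean the forecast beats the baseline, zero means no
    improvement, and negative values mean the baseline is better.
    """
    return 1.0 - rmse(forecast, observed) / rmse(baseline, observed)


# Hypothetical observed and predicted event dates (day of year).
observed = [100, 110, 95, 120, 105]
forecast = [102, 108, 97, 118, 107]   # every error is 2 days
baseline = [104, 106, 99, 116, 109]   # every error is 4 days (e.g. climatology)

skill = skill_score(forecast, baseline, observed)
```

With these made-up values the forecast halves the baseline error, so the skill is 0.5.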

Climate forecasts also introduce additional uncertainty into the resulting ecological forecasts, and understanding this uncertainty is essential for using forecasts for decision making (Clark, 2001; Dietze, 2017; Dietze et al., 2018). Focusing on point estimates alone causes end-users to apply their own, potentially [...] and White (2020), with the exception that the ensemble is unweighted and each of the four models was fit 50 times using bootstrapping of the data. This allows us to estimate both model and parameter uncertainty.

We evaluated the uncertainty in our forecast system by calculating the coverage, which is the fraction of observations that fall within the prediction interval. For a forecast with uncertainty based on a 95% prediction interval, perfect coverage is obtained when 95% of the observations fall within those intervals.
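
The coverage calculation described above can be sketched as follows. This is a minimal illustration, not the forecasting system's code; the function name, the inclusive interval bounds, and the example values are assumptions.

```python
def coverage(observed, lower, upper):
    """Fraction of observations that fall inside their prediction interval.

    `lower` and `upper` hold the interval bounds for each observation.
    Bounds are treated as inclusive, which is an assumption of this sketch.
    """
    if not (len(observed) == len(lower) == len(upper)) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    inside = sum(lo <= obs <= hi for obs, lo, hi in zip(observed, lower, upper))
    return inside / len(observed)


# Hypothetical observed event dates (day of year) and 95% interval bounds.
observed = [100, 110, 95, 120]
lower = [95, 111, 90, 110]
upper = [105, 120, 99, 125]

cov = coverage(observed, lower, upper)  # 3 of 4 observations fall inside
```

Here the coverage is 0.75, well below the nominal 0.95, which would indicate intervals that are too narrow.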

A coverage below 95% signifies the forecast is too confident, and above 95% not confident enough (Harris [...] data assimilation of observed temperatures, we compared three distinct approaches to generating future temperatures for the phenological forecasts (Fig. 1). The first method combines both data assimilation of observed temperature (from PRISM) up to the issue date and near-term temperature forecasts from a five-member climate forecast ensemble (Method 1, Fig. 1A). For each phenological model this produces five predictions (one for each climate ensemble member), which are used to produce an average prediction date along with associated climate uncertainty from the variance among the five climate ensemble members.
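
As a rough illustration of how the five per-member predictions in Method 1 could be summarized, the sketch below computes their mean and variance for a single phenological model. The use of the sample variance (n - 1 denominator), the function name, and the example dates are assumptions made for illustration only.

```python
from statistics import mean, variance


def summarize_climate_ensemble(predicted_doy):
    """Return (mean, sample variance) of predicted dates across ensemble members.

    `predicted_doy` holds one predicted day of year per climate ensemble
    member for a single phenological model. The n - 1 denominator used by
    `statistics.variance` is an assumption of this sketch.
    """
    if len(predicted_doy) < 2:
        raise ValueError("need at least two ensemble members")
    return mean(predicted_doy), variance(predicted_doy)


# Hypothetical predictions from five climate ensemble members (day of year).
members = [118, 121, 119, 124, 123]
avg_doy, climate_var = summarize_climate_ensemble(members)
```

For these values the average predicted date is day 121 and the variance among members is 6.5 days squared.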

This is the method currently implemented in the automated forecasting system. The second method uses data assimilation but replaces near-term forecasts with historical information on temperature to provide a baseline to assess the value of near-term forecasts. This is done by assimilating observed

For early issue dates (predictions for spring events made in December through March), average point estimate predictions (across species and locations) made using both data assimilation of observed temperature and near-term climate forecasts (Method 1) had larger errors than those based on data assimilation of observed temperature alone (Method 2) and those using only climatology (no data assimilation or near-term forecasts, Method 3) (Fig. 2A). Climatology and data-assimilation-only forecasts had very similar point estimate errors during this period (Fig. 2A). All three methods underestimated uncertainty during this period, with the assimilation+forecast method generally having the worst uncertainty estimates (Fig. 2B).

Beginning around April 1, average point estimates based on assimilation-only and assimilation+forecast methods produced similar errors that were lower than the errors from climatology alone (Fig. 2A). However, uncertainty estimates were better (higher coverage) for the climatology-based predictions during this period (Fig. 2B). For a short period from the beginning of March to mid-April, coverage from observed temperature and climatology was the worst overall.

We also explored patterns within individual species by focusing on the four taxa with the most data in the National Phenology Network dataset (Fig. 3). Acer rubrum showed similar patterns to the

The minimal gains from including temperature forecasts are due in part to the high uncertainty present in near-term temperature forecasts more than a week into the future (Dietze, 2017). Our system uses temperature forecasts obtained from the CFSv2 global climate model, which makes better predictions than climatological temperature only 20% of the time from January to June (Saha et al., 2014). This uncertainty in the climate forecasts places a hard limit on how much better the resulting phenological forecasts can perform than a climatology-only method. However, disregarding climate forecasts entirely is likely not desirable, as assimilating observed temperature alone exhibited increased errors midway through spring. This increase in error was even more pronounced at the species level. This was likely due to improbable temperature profiles from combining observed and historic temperature (Fig. 1C, Fig. S2), resulting in increasingly inaccurate predictions. In these cases the integration of climate forecasts is [...]

The root mean square error (RMSE) and coverage of all forecasts using the three methodologies for data assimilation. Horizontal orange lines indicate the RMSE from using the long-term climatological average. Coverage was calculated using all three sources of uncertainty (climate, model, and parameter).