Detecting and validating influential organisms for 2 rice growth: An ecological network approach

Abstract


Introduction
advances in empirical and statistical methods provide a practical way to achieve this goal. Oomycetes species, Globisporangium nunn (syn. Pythium nunn), was added, and a midge 30 species, Chironomus kiiensis, was removed from small rice plots. The rice responses (the 31 growth rate and gene expression patterns) were measured before and after the manipulation. 32 We confirmed that the two species, especially G. nunn, indeed had statistically clear effects 33 on the rice performance, which demonstrated that integration of the eDNA-based monitoring 34 and time series analysis would be effective for detecting previously overlooked influential 35 organisms in agricultural systems. In five rice plots established in 2017 in an experimental field at Kyoto University, 42 Japan, daily rice growth rate (cm/day in height) was monitored during the growing season by 43 measuring rice leaf height of target individuals (red points in the right panel of Figure 1a) 44 every day using a ruler, which showed consistent patterns among the plots (Figure 1b). Daily 45 mean air temperature also showed consistent patterns among the five plots ( Figure 1c). The 46 daily growth rate reached the maximum during late June to early July, and the height of rice individuals did not increase after the middle of August (first headings appeared on 12 or 13 48 August in the five plots). During the monitoring period, we occasionally observed decreases Potential causal relationships among the daily rice growth rates, eDNA time series, and 23 climate data were detected using unified information-theoretic causality (UIC) analysis, virtually no effects of the rice growth on air temperature, which is biologically convincing. In 33 total, 718 ASVs were identified as potentially causal (i.e., statistically clear information flow; 34 P < 0.05; we use the term "statistical clarity" instead of "statistical significance", according these two species in 2019 are multifold. First, they can be relatively easily manipulated; G. 46 nunn had already been isolated and cultivated (Kobayashi et al., 2010;Tojo et al., 1993) and C. kiiensis is an insect species and relatively large (several mm to cm). Second, the two 48 species had not been reported to be pathogens; using pathogen species under field conditions 49 is not recommended in the experimental field because they might have adverse effects on we could not identify with lower-level taxonomic information partially because of the 23 conservative assignment algorithm of the sequence analysis pipeline, Claident (Tanabe and 24 Toju, 2013). We described them as "putative Globisporangium spp."and regarded them as G. 25 nunn that we added because the eDNA concentrations of these Pythiaceae sequences were 26 increased after we added G. nunn isolates. 27 The increase in the eDNA concentrations of putative Globisporangium spp. in the GN 28 treatment after the manipulation was statistically clear (P < 2.0 × 10 -16 ; Figure 3b). In

Rice performance after the field manipulation experiments in 2019
We measured the rice growth as in 2017 and compared those of the GN and CK 48 treatments with that of the CT treatment before and after the manipulation experiments. In 49 general, we did not find statistically clear differences among the treatment in the daily rice growth rate before and after manipulation (P > 0.05; Figure 4a) (see Figure 4- figure   1 supplement 1 for the rice growth trajectory during the growing season). However, we found 2 that the cumulative rice growth (during 10 days after the manipulation) increased in the GN 3 and CK treatments after the manipulation, and the effect was statistically clear (P = 0.016 and 4 P = 0.021 for the GN and CK treatments, respectively; Figure 4b). Alternative statistical 5 modeling that included the treatments (the control versus GN or CK treatments) and 6 manipulation timing (i.e., before or after the manipulation), which simultaneously took the 7 temporal changes of all the treatments into account, also showed qualitatively similar results 8 (Materials and Methods and Supplementary file 5). We did not find any statistically clear 9 difference in the rice yields (e.g., the number of rice grains per head, the grain weights, or the 10 proportion of fertile grains) among the treatments (see Supplementary file 6 for the raw rice 11 yield data).

13
Changes in rice gene expression pattern in 2019 14 We further explored changes in the rice performance before and after the manipulation 15 experiments using RNA-seq. We collected rice leaf samples four times (one day before the 16 first manipulation and one, 14, and 38 days after the third manipulation). As a result of the 17 RNA-seq analysis, 875,534,703 reads were generated (Supplementary file 7), and 8,105,902 18 reads per sample were used for the sequence analysis. DESeq2 analysis was performed, and 19 here we focus on the results of the second sampling event (i.e., one day after the third 20 manipulation). The analysis suggested that, when all three rice pot locations within each plot 21 were merged (see Figure 3a for the pot locations), there were differentially expressed genes  In the present study, we demonstrated a novel research framework to integrate the 42 ecological network concept and agricultural practices to screen potentially influential 43 organisms. We performed intensive field monitoring in 2017, obtained intensive and 44 extensive time series of rice and ecological communities, detected potentially influential 45 organisms using time series-based causality analysis, and based on the results, we performed 46 field manipulation experiments and evaluated the effect of the manipulation. Throughout the study, we detected statistically clear effects of the manipulation on the rice growth and gene 48 expression patterns especially for the GN treatment, proving that our proposed framework might work as a novel approach to comprehensively detect potentially influential, previously 1 overlooked ecological community members for an agricultural system (in our case, rice).
2 Below, we discuss the results, advantages, disadvantages, potential limitations, and future 3 perspectives of our approach.

5
Monitoring of ecological community using eDNA and rice growth 6 Our framework started from monitoring ecological community members using 7 quantitative eDNA metabarcoding (Ushio, 2022). We successfully detected more than 1000 8 ASVs and described detailed temporal dynamics of the ASVs (Figure 1d, e; Ushio 2022). We 9 included spike-in standard DNAs whose concentrations were known a priori, and our 10 previous study had shown that, in almost all cases, there were clear positive linear 11 correlations between the sequence reads and the copy numbers of the spike-in standard DNAs 12 (e.g., Supplementary figures in Ushio 2022). This suggested that our quantitative eDNA 13 metabarcoding reasonably quantified the concentrations of eDNA, which can be a "rough" 14 index of species abundance in the artificial rice plots. Some time-series-based causal 15 inference methods required quantitative data (i.e., relative abundance data that is typical for 16 amplicon sequencing is not suitable) (Osada et al., 2023;Schreiber, 2000;Sugihara et al., 17 2012), and our method justifies the use of the advanced time series method to infer the causal 18 relationship between rice growth and ecological community members.

19
Rice growth was monitored by a manual measurement, e.g., rice height measurements 20 made using rulers (Figure 1b). Although the measurements successfully captured the seasonal 21 dynamics of the rice growth, one might speculate whether only a single variable is sufficient 22 to capture the rice performance. Indeed, it is true that assessing more variables, for example, 23 gene expression (Nagano et al., 2019(Nagano et al., , 2012 and multispectral image monitoring during the 24 whole growth season (Candiago et al., 2015), would be potentially useful to fully capture the 25 trajectory of the rice growth. Nonetheless, we suggest that very frequent monitoring of a 26 single variable is sufficient to reconstruct and predict the dynamics of the whole system 27 dynamics (Takens, 1981). Thus, we suggest that our monitoring approach can resolve the 28 dynamics at least as a proof-of-concept study, but taking more variables at fine time 29 resolution is certainly beneficial for more accurate delineation of the system dynamics. rice (which was necessary in order not to spread potential pathogens to nearby farms), and C. 46 kiiensis (midge) is a common freshwater insect species in the study region. C. kiiensis may have effects on water quality (e.g., the concentration of phosphorus) through their feeding 48 behavior (Kawai et al., 2003). However, they have never been focused on in terms of rice 49 growth management, and thus, they provided an opportunity to test the validity of our approach in the present study. Globisporangium spp. clearly increased (Figure 3b), suggesting that the introduction of G. 7 nunn was successful. However, we also observed increases in the putative Globisporangium 8 spp. abundance in some of the CT and CK samples. The rice plots were located close to each 9 other (50-60 cm apart; Figure 3a) and thus immigrations might have occurred between the 10 GN and the other plots. As for C. kiiensis, we did not observe a statistically clear decrease in   Stress response genes such as Os11g0184900 might also have contributed to the changes in 40 the rice performance, but further studies are required to elucidate the mechanisms by which 41 ecological community members affect the rice performance under field conditions.

42
In the present study, we could not test the effect of other species on rice growth due to 43 the limitations of time, labor, and money, but the species listed in Supplementary file 1 could 44 be candidates of novel biological agents that influence the rice performance. Thus, validating 45 the effects of these species on the rice performance is important, and targeting ASVs with a 46 large TE value would be a promising direction to find previously overlooked beneficial/detrimental organisms for rice under field conditions. In addition, we could not test 48 the effects of species that had no clear effects on the rice performance according to the time 49 series analysis. It will also be important to validate that such species do not have clear influence on the rice performance in the field, which would further validate the effectiveness 1 of our research framework. Although the present study has shown that our framework is potentially useful for 5 detecting potentially influential, previously overlooked organisms for rice growth under field 6 conditions, there are potential limitations that should be acknowledged.

7
One limitation is the quantitative capacity and species identification accuracy of the issue, short-read sequencing has dominated current eDNA studies, but it is often not 21 sufficient for lower-level taxonomic identification (i.e., species or subspecies identification).

22
Using long-read sequencing techniques (e.g., Oxford Nanopore MinION) for eDNA studies is 23 a promising approach to overcome the second issue (Baloğlu et al., 2021). 24 Another limitation is that the monitoring of rice and the introduced organisms in this 25 study was not comprehensive. While intensive monitoring was conducted during the rice 26 growing season, only a limited number of variables were measured for a limited number of 27 rice individuals and the detailed dynamics of the two introduced species was unclear (i.e., the 28 fate of the introduced species). This is particularly important for understanding how the 29 introduced organisms affected rice performance. At the molecular level, acquiring larger-30 scale "omics" data for rice and ecological community members, for example long-term gene   In this study, we demonstrated that intensive monitoring of an agricultural system and 48 the application of nonlinear time series analysis are helpful for identifying potentially influential organisms under field conditions. Although the effects of the two species were 1 relatively small and how the two species influenced the rice performance (i.e., the fate of the 2 introduced species) was still unclear, the research framework presented here has future 3 potential. The intensive monitoring, nonlinear time series analysis, and in situ validation of 4 the statistical results may be applicable for other agricultural and aquatic systems. For 5 example, the application of our framework to fishery aquaculture systems may be helpful for 6 detecting potential pathogens of fish. Also, if the continuous monitoring of soil ecological 7 communities becomes possible, our approach can be extended to other crops and vegetables.

8
An automated, real-time monitoring system combined with advanced machine learning 9 methods would provide a promising tool for increasing the accuracy and applicability of our 10 approach. In conclusion, we proposed a novel framework to integrate the ecological network 11 concept and agricultural practices to explore and detect potentially influential organisms. Our 12 proof-of-concept study would be an important basis for the further development of field-13 based system management. Yoshinami for maintenance of the experimental field and assistance in the field monitoring. This   (122 days). The containers ("plots") were filled with well water. This 9 experimental system was previously reported in Ushio (2022), which investigated the mechanism of ecological 10 community dynamics. In the present study, we focused on the relationship between rice performance and 11 ecological community members.

12
Daily rice growth was monitored by measuring rice leaf height of target individuals every day using a 13 ruler (the highest leaf heights were measured). We selected four rice individuals at the center of each plot as the 14 target individuals (indicated by the four red points in the right panel of Figure 1a). The average height of the 15 four individuals were used as a representative rice height of each plot (a time-lapse movie that shows rice 16 growth is available here: https://doi.org/10.6084/m9.figshare.19029650.v1). Leaf SPAD was also measured 17 every day using a SPAD meter (SPAD-502Plus, KONICA-MINOLTA, Inc., Tokyo, Japan) for the same leaf 18 whose height was measured, but not analyzed intensively in the present study. Climate variables (temperature, 19 light intensity, and humidity) were monitored using a portable, automated climate logger (Logbee, Chitose 20 Industries, Osaka, Japan). The Logbee data was corrected and refined using the climate station data provided by 21 the Center for Ecological Research, Kyoto University. For all analyses after this subsection, the free statistical environment R was used (R Core Team, 2022).

2
The procedure used to estimate DNA copy numbers consisted of two parts, following previous studies (Ushio, 3 2019; Ushio et al., 2018b). Briefly, we performed (i) linear regression analysis to examine the relationship 4 between sequence reads and the copy numbers of the internal standard DNAs for each sample, and (ii) 5 conversion of sequence reads of non-standard DNAs to estimate the copy numbers using the result of the linear 6 regression for each sample. The regression equation was: MiSeq sequence reads = sample-specific regression 7 slope × the number of standard DNA copies [/µl]. Then, the estimated copy numbers per µl extracted DNA 8 (copies/µl) were converted to DNA copy numbers per ml water in the rice plot (copies/ml water). The 9 quantitative capacity of this method was thoroughly evaluated by comparing with quantitative PCR,  Detections of potential causal relationships between rice growth and ecological community members 17 We detected potentially influential organisms for rice growth analyzing the quantitative, 1197-species where -(< ) is the optimal embedding dimension of lower dimensional models. Eqn.
[2] is a TE version of 34 simplex projection (Sugihara and May, 1990), and we set tp = 1 in Eqn.
[2] to determines the optimal E (i.e., the 35 optimal E based on one-step forward prediction By using UIC, we quantified TE from eDNA time series to rice growth. We standardized the eDNA time 40 series (copies/ml water) and rice growth rates (cm/day) to have zero means and a unit of variance before the 41 analysis. We tested time-lag up to 14 days (i.e., tp in Eqn. [1] was from 0 to -14; effects from up to 14 days ago 42 were considered). We interpret a finding that TE from eDNA time series to rice growth rate is statistically 43 greater than zero to mean that the ecological community member statistically clearly influences rice growth rate. We selected one rice individual in the middle of each of the Wagner pots as the target individuals (indicated by 23 the three red points in the inlet panel of Figure 3a; labelled as locations "1", "2", and "3" from east to west).

24
Leaf SPAD was also measured using a SPAD meter as in 2017. The ecological communities in the rice plots . For rice growth rates, the effect of the treatment was tested with the random effects of 42 rice individual and rice plot (in R, this is lme4::lmer(growth_rate ~ treatment + (1|ind/plot), data = 43 data)), and the effects were separately analyzed for before/after the manipulation. Rice cumulative growth (i.e., 44 cumulative growth before the third manipulation or 10 days after the third manipulation) was analyzed similarly, 45 but the random effect of rice individuals was not included because there was only one cumulative value for each 46 rice individual. Also, we tested alternative models including the timing and manipulation treatments (in R, these are lme4::lmer(growth_rate ~ before_or_after_manipulation*treatment + (1|ind/plot), data = data) and lme4::lmer(cumulative_growth ~ before_or_after_manipulation*treatment + (1|plot), 49 data = data)), and obtained general agreement with the results of the above-mentioned analysis. For 50 ecological community, we analyzed the eDNA data taken during the intensive monitoring period before and 51 after the field manipulation experiments (18 June to 12 July; Supplementary file 3), and the effects of the 52 manipulation (i.e., the addition of G. nunn or the removal of C. kiiensis) were separately tested for each 53 treatment with the random effect of rice plot assuming a gamma error distribution (in R, this is 54 lme4::glmer(DNA_conc ~ before_or_after_manipulation + (1|plot), data = data, family = 55 Gamma(link="log"))). Also, as in the rice growth analysis, we tested alternative models including the timing 56 and manipulation treatments (in R, these are lme4::glmer(DNA_conc ~ Differences in the ecological community compositions were visualized using t-distributed stochastic neighbor 59 embedding (t-SNE) (Van der Maaten and Hinton, 2008). In the present study, we used the term "statistical clarity" instead of "statistical significance" to avoid misinterpretations, according to the recommendations by were fitted, and the differences were tested. These analyses were performed using DESeq2::DESeq() function 31 implemented in the package. In the DESeq2 analysis, false discovery rate was set as 0.05, and genes with an 32 adjusted P < 0.05 found by DESeq2 were assigned as differentially expressed. We measured rice gene 33 expression four times and we detected almost no DEGs at sampling events 1, 3, and 4 except for two DEGs at 34 sampling event 4. Therefore, we report only the results of sampling event 2 in the present study (one day after 35 the third field manipulation). In Supplementary files 8 and 9, we report a list of nuclear DEGs.     The left-top inset shows daily mean air temperature (thick line) and daily maximum and minimum air temperature (dashed lines). The right-bottom inset shows three individuals (red and green points) in each Wagner pot, and the number in each red individual indicate the pot location number. Heights and SPAD of the red individuals were measured. Total eDNA copy numbers of (b) putative Globisporangium spp. and (c) midge (Chironomus kiiensis) in the rice plots. "before" and "after" indicate "from 18 June to 24 June" and "from 25 June to 12 July," respectively. (d) Overall community compositions after the manipulation. Gray, red, and blue indicate CT (control), GN (Globisporangium nunn added), and CK (Chironomus kiiensis removed) treatments, respectively. Each ellipse indicates the overall distribution of each treatment data.