Partitioned Local Depth analysis of time series transcript abundance data reveals elaborate community structure

Transcriptome studies that provide temporal information are valuable for identifying groups of similarly-behaving transcripts, giving insight into overarching gene regulatory networks. Nevertheless, inferring transcriptional networks from time series data is challenging, in part because it is difficult to holistically consider both local relationships and global structure of these complex and overlapping transcriptional responses. To address this need, we employed the Partitioned Local Depth (PaLD) method to examine four time series transcriptomic datasets generated using the model plant Arabidopsis thaliana. Here, we provide a self-contained description of the method and demonstrate how it can be used to make predictions about gene regulatory networks based on time series data. The analysis provides a global network representation of the data from which graph partitioning methods and neighborhood analysis can reveal smaller, more well-defined groups of like-responding transcripts. These groups of transcripts that change in response to hormone treatment (e.g., auxin or ethylene) or high salinity were demonstrated to be enriched in common biological function and/or binding of transcription factors that were not identified with prior analyses of this data using other clustering and inference methodologies. These results reveal the ability of PaLD to generate predictions about gene regulatory networks using time series transcriptomic data, which can be of value to the systems biology community.


39
Transcriptome remodeling is a highly dynamic, tightly controlled process during which 40 transcript abundance is altered in response to stimuli (e.g., hormones and stress 41 responses) [1][2][3][4][5][6][7][8][9][10] with precise temporal kinetics. Insight about the transcriptional 42 regulatory networks downstream of stimuli can be extracted from time series data, which 43 provides information about changes in transcript abundance over time [11][12][13][14][15][16]. 44 Determining not only which genes have altered transcript abundance in response to stimuli 45 but also inferring which of these genes might operate in the same transcriptional networks 46 based on common temporal patterns provides greater depth to our understanding of 47 signaling pathways driven by transcriptional cascades. Genes that have the same temporal 48 dynamics form subnetworks that may be turned on by the same upstream transcription 49 factors (TFs) or turned off by the same repression mechanisms. Additionally, common 50 temporal kinetics for transcript synthesis may allow coordinated responses if subnetworks 51 include transcripts encoding proteins that function in the same biological processes [17-52 22]. Time series data can also be used to infer the sequential activation of genes in a 53 network using next-state modeling, which combines information about TF-target 54 interactions with information about the temporal dynamics of gene expression [23-27]; a 55 rapid increase in the synthesis of TFs suggests that these proteins may control the 56 expression of downstream target genes [28][29][30]. Overall, transcriptome studies that 57 measure changes in transcript abundance over time provide valuable information about 58 genes that are co-expressed or are sequentially activated, information that can be used to 59 predict gene regulatory networks [31][32][33][34]. 60 Inferring transcriptional networks from time series transcriptomic data remains a 61 challenge for biologists. Commonly employed clustering and labelling methods, such as 62 hierarchical clustering and k-means clustering [35,36], often require additional user input 63 (e.g., defining the number of clusters) that might not fit the naturally occurring pattern of 64 clusters in the data. In addition to these tools, other methods have been developed to infer 65 gene networks from time series data [37][38][39][40][41][42][43][44][45]. At their core, these techniques group 66 entities (or objects) together into classes (or collections of objects). While clustering 67 techniques often provide class labels at a local level, they do not provide information on the 68 precise connections between the underlying entities (transcripts) and between classes 69 (clusters). Network inference methods that can reveal information not only about local 70 connections but also global relationships with minimal user input are needed to develop 71 better models for predicting gene networks. Recently, Berenhaut, Moore, and Melvin 72 introduced a new technique, Partitioned Local Depth (PaLD), that provides both local-and 73 global-level information when examining social science datasets [46]. This study is the first 74 application of PaLD to transcriptomic data and this approach to reveal connections 75 between genes in the form of a weighted network, in which edge weights (and resulting 76 proximity in the network) convey relationship strengths between data points. This 77 approach is motivated by a community-level social perspective in which an underlying 78 assumption is that an object (transcript) is relatively similar to its nearby neighbors. Here, 79 the concept of "nearby" reflects local perspective and adapts to varying density without the 80 need for extraneous parameters that restrict the number of groups, such as the 81 optimization that is required to determine the number of K-means clusters [47]. 82 Community detection methods can then be applied to the weighted networks obtained 83 from PaLD to reveal smaller groups of transcripts with the most similar responses, but that 84 maintain connectivity to the overall structure, thereby providing a holistic view of the 85 patterns and connections within and between community groups [48,49]. 86 In this study, we applied PaLD to search for relationships between transcripts whose 87 abundance changes in several time series microarray or RNA-Seq datasets from the model 88 organism Arabidopsis thaliana. We then evaluated the efficacy of PaLD in this context by 89 asking whether groups of like-responding transcripts had conserved function using Gene 90 Ontology (GO) annotations, and whether these groups had conserved regulation by 91 uncovering TFs binding to promoter regions of genes in these groups. Embedding in a latent (social) space in which local comparisons (or "conflicts") arise, we 126 define a measure of pairwise relationship strength by considering alignment in induced 127 local regions (or foci). This perspective allows for consideration of interactions occurring 128 across a variety of scales and gives a meaning of local that does not require the user to 129 provide additional inputs or search an underlying parameter space. We present the 130 algorithm in a concrete computational form, but in the note below, we briefly describe the 131 probabilistic social perspective from which the PaLD approach arises; for further details 132 see [46]. 133 For fixed , ∈ , define the local focus, , , to be the set of points which are as close to 134 as is to , or as close to as is to , that is , 135 (1) 136 We consider a point ∈ to be aligned with in opposition to if is in the local focus 137 (i.e., ∈ , ) and is closer to than to (i.e., ( , ) < ( , )). The strength of the 138 alignment is then inversely proportional to the number of points in the local set (i.e., � , �). 139 The cohesion of to is defined via 140

142
(2) 141 i.e., it is the average strength of alignment, where | , | is the number of elements in the set 143 , and ( ) = 1 if statement is true and ( ) = 0, otherwise. In the case that ( , ) = 144 ( , ), we resolve the tie by setting the numerator to 1/2; for clarity of presentation this 145 case is suppressed. 146 To identify important relationships between points, we will emphasize those pairs for 147 which mutual cohesion is greater than what is typical for a point in a local focus. Leveraging 148 the symmetry of the local foci, we consider the relationship between and to be 149 (particularly) strong whenever 150

152
(3) 151 Further context and justification for the value of this threshold from a probabilistic 153 perspective are given in (Berenhaut et al. 2022). 154 For = { 1 , 2 , . . . , }, we obtain an × matrix of cohesion values, = � , �. The 155 cohesion matrix conveys strength of relationship between points and can be visualized as a 156 network (or graph). The adjacency matrix defining the weighted undirected graph, , is 157 obtained by replacing each value, ,s , with the minimum of ,s and ,s . Thresholding 158 the symmetrized cohesion matrix at the value in Eq. (3), leads to an adjacency matrix 159 defining the graph, ⋆ , whose (weighted) edges are only those for which the cohesion is 160 particularly strong. 161 For visualization purposes, we apply the Kamada-Kawai graph drawing algorithm [56] to 162 the graph ⋆ . When further partitioning is desired (for instance, as in Section 3.2), we may 163 employ a community detection algorithm, such as the Louvain algorithm, on the largest 164 connected component. Louvain is one of several algorithms available to find a partition of a 165 network into groups which maximizes modularity. 166 Note: (Probabilistic perspective and local depth). Aggregating the support for from all 167 points ∈ , we obtain a measure of local centrality, referred to as local (community) 168 depth, and defined by 169 (4) 170 From the probabilistic perspective considered in [46], the local depth of , ℓ( ), described 171 above is the probability that, in opposition with a point selected uniformly at random 172 from the remaining points in , a focus-point, ∈ , selected uniformly at random is 173 closer to than it is to . We then have the following, (5) 175 In particular, cohesion, as equivalently defined in Eq. (2), is obtained from partitioning the 176 probability which defines local depth; thus, this approach is entitled Partitioned Local 177

Depth. 178
Example (Example 1, revisited). Consider the distances in Fig. 1A. To compute the cohesion 179 of 4 to 2 , we consider the local foci for = 1,3,4,5,6,7. Among these 6 induced sets, 4 is 180 contained in 2 , 1 = , 2 , 4 = 2 , 5 = { 2 , 3 , 4 , 5 }, 2 , 6 = { 1 , 2 , 3 , 4 , 5 , 6 } and 181 2 , 7 = . Across these sets, 4 is closer to 2 than to the opposing point, , for = 1,6,7. As 182 a result, as in Eq. (2), 183 The remaining pairwise cohesion values are computed similarly and are displayed in measure the distance between the fitted time-curves, and , (associated to two different 202 genes), we consider similarity in functional value and first derivative (capturing the 203 expression level and rate of change, respectively). In particular, we define the associated 204 distance between the 2 curves via 205 where ′ is the first derivative of and is the maximal time point. Note that the first term 207 is the total (absolute) area between the curves; and the second corresponds to the 208 (absolute) area between the derivatives. Throughout, the distance between transcript 209 abundance profiles is understood to be the dissimilarity of the associated fitted cubic 210 polynomials as described in Eq. (6). The distance function given in Equation (6), defines the 211 distance between 2 curves as the area between the graphs of the 2 curves plus the area 212 between the graphs of their derivatives. There are many possible distance metrics, 213 depending on the application at hand. For instance, standardizing the coefficient of 214  Table S1. 247 Identification of enriched binding of transcription factors in the clusters and groups was 248 achieved using TF DEACoN , a program that uses DAP-Seq data [57] to report targets of a 249 subset of transcription factors in Arabidopsis [58]. The size of some groups is larger in the 250 enrichment analysis due to microarray probes recognizing transcripts from more than one 251 gene or smaller due to probes that were not associated with a gene identifier being 252 removed. The output of this analysis uses a log fold change cutoff greater than 1 and a -253 value cutoff of less than 0.05 and is such that the abundance in the group is compared to 254 the whole genome. (Note that adjusted -values were used here, yielding slightly different 255 results than those obtained using raw values.) 256 GO enrichment in PaLD groups, k-means clusters, or STEM groups, relative to the whole 257 genome, was performed in the Singular Enrichment Analysis tool in AgriGO [59]. In AgriGo, 258 "statistical test method" was set to hypergeometric and "multi-test adjustment method" 259 We began by applying PaLD to generate a network using 447 transcripts whose abundance 275 changed in response to treatment with 1-aminocyclopropane-1-carboxylic acid (ACC), 276 which is the immediate precursor to the plant hormone ethylene. This treatment increases 277 ethylene synthesis and signaling [50]. The resulting network structure illustrated local and 278 global relationships among the transcripts, which is shown with each node representing a 279 gene and each edge connecting the transcripts with a locally similar temporal response 280 (Fig. 2). The network contained 2 large groups of transcripts that were largely 281 downregulated (at the top of the diagram) or upregulated (in the lower part of the 282 diagram), while a smaller group in the middle of the network contained transcripts with 283 mixed responses. We selected 7 transcripts (the same transcripts used in the illustrative 284 example) to represent different parts of the network. For each transcript, we generated a 285 fiber plot, which contains fitted curves reporting the abundance of those 7 transcripts as a 286 function of time after ACC treatment (Fig. 2). Although visually similar, these fiber plots are 287 not to be confused with Andrews plots for displaying multivariate structure (see [61]). 288 Additionally, smaller subnetworks, or neighborhoods, were determined for each of these 7 289 transcripts. These neighborhoods consisted of a transcript and all of its direct connections 290 in the network, which are shown in gray on the fiber plots. These results illustrated the 291 strength of PaLD in grouping genes with similar behavior, as the fibers in each of these 292 plots followed the same trends. The varying neighborhood sizes reflecting local structure is 293 a key feature of PaLD not exhibited by naïve nearest neighbor methods (see [46] To divide this network into larger neighborhoods of like-responding genes, we used the 317 Louvain method for community detection [15]. This method maximizes modularity, a 318 quantity that is larger for partitions with stronger within-group ties relative to that which 319 would be expected by chance. When used in combination with PaLD, this approach 320 One goal of this work is to understand how PaLD groups containing transcripts with similar 340 kinetic response compare to methods used to sort transcripts into like-responding groups 341 or clusters. To evaluate the ability of PaLD to organize transcripts as compared to the 342 frequently used k-means clustering, we compared the PaLD groups to previously generated 343 k-means clusters using this same set of data. The k-means clustering method was 344 previously applied to this group of 447 transcripts yielding 72 clusters ranging in size from 345 1 to 111 genes, with 48 clusters of size 1 and 24 clusters of size greater than or equal to 2 346 [50]. The transformed gene expression data for transcripts in each of the top ten k-means 347 clusters prior to fitting the cubic polynomials are shown in Fig. S2. To visually compare the 348 k-means clusters to PaLD groups, we colored the nodes in the PaLD network to correspond 349 to each of the 10 largest k-means clusters, which we refer to as A-J (corresponding to 350 Clusters 1-10 in [50]), with this network shown in Fig. 4 We also quantitatively compared the k-means clusters to the groups previously defined for 364 the PaLD network using a table to illustrate overlaps between PaLD groups and k-means 365 clusters ( Table 1). The k-means clusters A, D, and E had the least spread across the PaLD 366 groups, which is consistent with the diagram in Fig. 4. The matrix illustrates that 100% of 367 the genes in cluster E were in PaLD group 6, 72% of the genes in cluster A (80/111 368 transcripts) were in group 7, and 67% of the genes in cluster D were in group 1, while 369 other PaLD groups showed broader distribution between k-means clusters, with cluster J 370 containing genes in 5 PaLD groups with no more than 33% of those genes in any single 371  We also compared the unfitted curves of PaLD groups 1 and 7 to clusters D and A, 382 respectively, to illustrate the more temporally linked gene expression patterns of the 383 transcripts within the PaLD groups (Fig. S3). This analysis revealed that some sets of 384 transcripts have consistent behavior and are grouped together by both k-means clustering 385 and PaLD, while for other transcripts these methods detect relationships in different ways. 386

Enrichment analysis of PaLD groups in the ACC microarray dataset revealed 387 conserved biological functions and transcriptional regulators 388
We hypothesized that transcripts which respond similarly in time may work together to 389 drive the same biological processes and that their synthesis may be regulated by the same 390 TFs. Since genes often work together in smaller subnetworks, we reasoned that the groups 391 revealed by the community detection algorithm would be large enough to yield enrichment 392 results yet small enough to represent biologically meaningful subnetworks. Therefore, to 393 assess the ability of PaLD to predict genes that function, and are regulated, in the same 394 biological network, we examined whether there were enriched functions and TF targets 395 within the PaLD groups (Table 2). Using AgriGO v2.0, we identified enriched function in 396 these groups, finding that 6 out of 7 PaLD groups were significantly enriched in GO 397 annotations. To examine enriched targets of TFs in each of the groups, we used TF DEACoN 398 [58], which identifies significant enrichment in genes that have been shown to be targets of 399 TFs defined by a publicly available DNA Affinity Purification (DAP)-Seq dataset [57]. This 400 analysis revealed that 5 out of 7 groups were enriched in targets of known TFs. Of the 7 401 groups, 4 had enrichment in both annotated function and TF binding. Interestingly, the 402 group that was enriched in transcripts annotated with "negative regulation of ethylene 403 signaling" was also enriched in targets of the ethylene signaling master transcriptional 404 regulator ETHYLENE INSENSITIVE 3 (EIN3), which supported our hypothesis that PaLD 405 could reveal structure that represented biologically-supported subnetworks. 406 To further confirm that PaLD could effectively define groups of genes that functioned in the 414 same biological processes, we compared the PaLD groups to a set of core ethylene response 415 genes, called the gold standards, which were identified in a meta-analysis of root ethylene 416 responses examining a single time point after treatments that elevated levels of ethylene in 417 roots across three transcriptional datasets [62]. This analysis is shown in the supplemental 418 data, which revealed the patterns of response of this conserved set of ethylene regulated 419 transcripts (Fig. S4). We noted in Table 2 how those consistently ethylene regulated  420 transcripts partitioned between the PaLD groups. 421 We also examined functional annotation and TF binding in the k-means clusters to ask 422 whether these groups were similarly enriched in function and regulation (Table 3)

PaLD analysis of transcripts with altered expression in response to elevated auxin 442
We also examined a second microarray dataset that was simultaneously generated with the 443 ACC dataset described above, which generated a larger group of DE transcripts. PaLD was 444 used to generate a network using the 1246 transcripts whose transcript abundance was 445 significantly changed in roots in response to treatment with the auxin indole-3-acetic acid 446 (IAA). Interestingly, the general structure forms a torus (or doughnut) shape, which is quite 447 different from the "hourglass" shaped ACC network. The Louvain community detection 448 algorithm was applied to this data, yielding 8 groups which are shown in distinct colors in 449 upregulation. Group 2, which was positioned oppositely to groups 1, 4 and 5, also had a 453 unimodal curve, but with decreased transcript abundance in response to auxin, followed by 454 a return to baseline. Fiber plots of each group revealed that the patterns of transcript 455 abundance over time were slightly shifted from one group to the next. We reasoned that 456 these results might be indicative of waves of transcripts that were activated at various 457 times from the start of hormone treatment. 458 2). From these videos we produced snapshots of the networks from representative angles 474 (Fig. 6). From these images it is clear that many of the groups that appear to overlap with 475 one another in the two-dimensional networks were spatially, and thus temporally, distinct. 476 It is interesting to contrast the local and global structural information provided by PaLD 477 over standard node-labelling methods. We compared the distribution of transcripts between the PaLD groups and the top ten k-489 means clusters for the IAA dataset, in a matrix (Table S2). The PaLD and k-means methods 490 appeared to sort temporally-linked transcripts in different ways, providing a basis for 491 comparing enrichment between the PaLD groups and k-means clusters. We looked for 492 enriched function and enriched TF binding to genes in these PaLD groups and compared 493 them to the k-means clusters. Enriched GO annotations were found in all 8 PaLD groups 494 and 8 out of 10 (80%) k-means clusters (Table 4 and Table S3). On the other hand, 5 out of 495 8 (63%) and 7 out of 10 (70%) PaLD groups and k-means clusters, respectively, were 496 enriched in targets of TFs. As predicted, each set of groups/clusters had groups enriched in 497 auxin signaling, with the PaLD IAA group 5 and 6 and k-means cluster G having this 498 annotation. Nevertheless, examination of each of the summary tables revealed a number of 499 differences in the qualitative output. For example, the GO annotation "covalent chromatin 500 modification" was enriched in PaLD group 4 but was not enriched in any of the k-means 501 clusters (Table 4 and Table S3). Similarly, genes functioning in "root morphogenesis" and 502 "root meristem growth" were enriched in PaLD group 5 but were not enriched in any of the 503 k-means clusters (Table 4 and Table S3). These results highlight the ability of PaLD to 504 reveal new functional information regarding time series data that is biologically relevant. 505 To demonstrate the application of PaLD in the analysis of time series gene expression data 506 derived using RNA-seq, we applied this same analysis pipeline to two additional datasets 507 that examined gene expression over time in response to ethylene gas [51] (Fig. S5 and 508 We asked whether PaLD could reveal smaller subnetworks of interest in the IAA 519 microarray dataset. We used a second approach to look for neighborhoods of like-520 responding genes in the PaLD network, by generating lists of nearest neighbors for each 521 gene. It is important to note that neighborhoods can be of varying size reflecting 522 heterogeneity in the data. To illustrate this approach, we selected the rapidly upregulated 523 gene encoding Auxin Response Factor 19 (ARF19), which drives auxin gene regulatory 524 networks in roots. The ARF19 neighborhood was highlighted on the PaLD network (Fig. 7A) 525 and the temporal responses of the genes in this neighborhood were illustrated in the fiber 526 plot in Fig. 7B and are listed in Supplemental Data Set 1. 527 We next asked whether PaLD could resolve functional differences in subnetworks with 528 high degrees of overlap (e.g., neighborhoods of the genes directly connected to ARF19). PaLD groups as displayed in Fig. 5.  Neighborhoods for each transcript were queried for conserved function using Metascape and 550 transcripts annotated with the GO term "auxin" or "root" are highlighted on the network with 551 colors corresponding to PaLD groups as displayed in Fig. 5. (E and G) The fiber plots for these 552 annotated groups are shown with the same color coding. 553 We also sought to reveal subnetworks by searching for specific GO annotations (a top-554 down approach). For contrast, here, we employed a different enrichment analysis 555 software, Metascape [67], which employs a greater number of GO knowledgebases 556 compared to AgriGO. We automated a process using Metascape to identify all the GO terms 557 enriched in each of the 1246 neighborhoods in the IAA network. We then used this data to 558 search for neighborhoods enriched in GO terms that included "auxin" or "root" and 559 highlighted these neighborhoods on the PaLD network in Fig. 7D and F, respectively. The 560 transcripts involved in root-related processes formed a localized niche in the network (Fig.  561 7F), which overlapped significantly with the ARF19 neighborhood, while transcripts 562 associated with auxin-related processes were more dispersed throughout a larger region of 563 the network (Fig. 7D). Likewise, the fiber plot of the auxin-associated neighbors was 564 revealed to be variable (Fig. 7E), while the fiber plot of the root-associated transcripts was 565 more consistent in temporal responses (Fig. 7G). Taken together, these findings revealed 566 that PaLD neighborhoods could be used to predict subnetworks, and that a top-down 567 approach could be employed to search for these networks. 568

569
A central challenge in systems biology is the inference of gene relationships from patterns 570 found in transcriptomic data. In particular, time series transcript abundance data include a 571 temporal dimension that allows us to make predictions about like-responding transcripts, 572 due to common transcriptional controls or biological function of the proteins that they 573 encode. These transcriptomic datasets can be used to address questions including how 574 hormones or physiological stress alter the transcriptional landscape in plants. Such 575 experiments reveal the set of genes that have altered expression under elevated levels of 576 plant hormones or salinity, for instance, providing new information about the pathways 577 and mechanisms that are activated in response to these factors. In this paper, we analyzed 578 several time series transcriptomic datasets, generated for the model plant Arabidopsis, to 579 delve more deeply into the mechanisms driving these complex regulatory processes, by 580 applying new methods to query the data for predictive patterns. The datasets used in this 581 analysis were selected to focus on hormone and stress response gene regulatory networks 582 in Arabidopsis. 583 The wide variety of methods which can be employed in bioinformatics can make it 584 challenging to determine which methods to apply and, beyond this, may require the user to 585 make decisions that influence the results. Specifically, widely employed methods such as k-586 means and hierarchical clustering require the user to provide additional inputs (e.g., the 587 number of clusters) or select values from a parameter space (e.g., to minimize some cost 588 function). Many methods used to cluster time series data provide limited information 589 about the global structure of the data, lacking a way to visualize how transcripts are both 590 directly and indirectly connected to one another within a larger network. Dimension 591 reduction techniques, such as principal component analysis, are often utilized to provide a 592 flat 2-dimensional display of the data. Since such methods can compress valuable structural 593 features, network-based methods can also be used to reveal informative structure. Yet, 594 again, methods for creating such networks may force the user to select parameters (e.g., 595 number of neighbors, or a threshold for the maximum distance between neighbors) and 596 these choices can influence the results obtained. In light of these challenges, the need to 597 share straightforward modeling algorithms for holistically reflecting local and global 598 structure (including delineated groups or clusters), which can be easily applied to 599 biological datasets, is more critical than ever before. 600 In this paper, we present the first application of Partitioned Local Depth (PaLD) to time 601 series transcript abundance data and provide a self-contained description of the approach. 602 Our focus here is on the shape of the curve for transcript abundance over time We also applied the PaLD method to analyze a microarray dataset in which roots were 635 treated with IAA. Both the PaLD and k-means clustering methods resulted in similar 636 numbers of groups that were enriched in functional annotations and targets of TFs but 637 differ in distribution of genes between these groups and thus the content of these 638 enrichments. A benefit of using the PaLD approach is that the number of like-responding 639 transcripts can vary across the set and, in contrast to common nearest-neighbor methods, 640 is not a single quantity that must be specified by the user. This allows us to identify 641 meaningful sets of temporally similar transcripts that are directly linked to given genes of 642 interest. For illustrative purposes, we examined the neighborhood of Auxin Response Factor 643 19 (ARF19) revealing transcripts that were directly connected to ARF19 included genes 644 encoding the auxin transporters PIN3 and PIN7. This finding is consistent with these 645 transcripts having similar kinetic responses to auxin, which may be due to ARF19 driving 646 PIN3 and PIN7 transcription or with all 3 transcripts being controlled by a common TF. We 647 also found significant overlap between the ARF19 neighborhood and transcripts that were 648 associated with GO annotations linked to "root." These results review the utility of the 649 localized neighborhoods that PaLD generates as a way to reveal transcriptional networks. 650 The PaLD approach provides network models visualizations that are rich and informative 651 spatial representations of the relationships among temporally-linked transcripts. The 652 resulting local and global structure provides insight into the relationships between genes. 653 We showed how PaLD can be used to model biological networks and reveal groups of genes 654 with enriched function and regulation, which can then be employed to predict subnetworks 655 supported by functional and TF-target information. The PaLD networks can also be 656 implications that can then be tested experimentally. In particular, Arabidopsis T-DNA 664 insertional mutants are widely available to test which genes might be misregulated 665 downstream of knockout mutant in a TF, thereby revealing transcriptional networks. 666 Taken together, the PaLD approach allows time series transcriptomic data, which is often 667 rich and complex, to become more interpretable and manageable in a research setting. In 668 light of the increased specialization of researchers across the sciences, the need to create 669 straightforward modeling algorithms that can be easily applied to datasets by biologists 670 with limited training in bioinformatics is more critical than ever before. Ultimately, we 671 demonstrated here that PaLD is a straightforward method that can be used to process 672 transcriptomic data to reveal enrichment in function and regulation, which is the first step 673 in modeling complex biological processes that can then be validated experimentally in the 674 laboratory.