Visual integration of omics data to improve 3D models of fungal chromosomes

The functions of eukaryotic chromosomes and their spatial architecture in the nucleus are reciprocally dependent. Hi-C experiments are routinely used to study chromosome 3D organization by probing chromatin interactions. Standard representation of the data has relied on contact maps that show the frequency of interactions between parts of the genome. In parallel, it has become easier to build 3D models of the entire genome based on the same Hi-C data, and thus benefit from the methodology and visualization tools developed for structural biology. 3D modeling of entire genomes advances the understanding of their spatial organization. However, this opportunity for original and insightful modeling is underexploited. In this paper, we show how seeing the spatial organization of chromosomes can bring new perspectives to Hi-C data analysis. We assembled state-of-the-art tools into a workflow that goes from Hi-C raw data to fully annotated 3D models and we reanalyzed public Hi-C datasets available for three fungal species. Besides the well-described properties of the spatial organization of their chromosomes (Rabl conformation, hypercoiling and chromosome territories), our 3D models highlighted i) in Saccharomyces cerevisiae, the backbones of the cohesin anchor regions, which were aligned all along the chromosomes, ii) in Schizosaccharomyces pombe, the oscillations of the coiling of chromosome arms throughout the cell cycle and iii) in Neurospora crassa, the massive relocalization of histone marks in mutants of heterochromatin regulators. 3D modeling of the chromosomes brings new opportunities for visual integration. This holistic perspective supports intuition and lays the foundation for building new concepts.


Introduction
What if it were possible to see all the details of chromosomes inside the nucleus of a cell? In eukaryotic cells, the nucleus is a dynamic organelle which is highly organized and characterized by extensive compartmentalization of structural components in its three-dimensional space (see (Razin et al. 2014; Dundr and Misteli 2001; Cremer and Cremer 2001; Arifulin et al. 2018) for reviews). In such a crowded environment, the arrangement of chromosomes is constrained and requires the formation of multiple chromatin domains to limit gene positions to preferred locations within the nuclear space (Misteli 2020). The spatial organization of chromosomes is of great interest, helping molecular biologists represent the objects they work with, and understand their interactions. Even if immense progress has been made in cell imaging, biological molecules (e.g. DNA, RNA, proteins) are too small to be individualized with optical microscopes (Wong and Eleftheriades 2013; Jensen 2013) and consequently, the interior of the cell (and the interior of a nucleus even more) remains largely invisible to the human eye. Alternative solutions are based on molecular-scale techniques like X-ray crystallography (Smyth 2000), NMR microscopy (Reckel et al. 2005) or electron microscopy (Radulović et al. 2022). By analyzing atom arrangements in molecules, these techniques produce informative views of macromolecular complexity (Nogales and Scheres 2015; Baumeister 2022), but they require complex technical skills and expensive equipment. Furthermore, it is important to keep in mind that in the end, these images are still artificial representations of reality. In other words, they are "models".

The use of models is widespread in biology. From model organisms to model systems, their interest is to understand a phenomenon in a simplified context, in order to, later, generalize to more complex situations.
In cell biology for instance, structural models are used to represent cell components (membranes, nucleus, cytoplasm, etc.), to understand their organization, and to describe their constituent molecules (Im et al. 2016). The work of David Goodsell provides an emblematic example (Goodsell 2009). His drawings representing cellular compartments and their molecular actors are so striking because of the unexpected density of molecules and the complexity of their organization. Goodsell's illustrations have been featured as "Molecule of the Month" on the Protein Data Bank website for over twenty years and the scientific journal Nature chose one of his paintings to make the cover of a special issue on COVID-19 (August 20, 2020 issue). Models make it possible to represent and summarize, in an intuitive but still scientifically rigorous way, the massive knowledge of cell molecular structures. Creating them thus represents a stimulating challenge, at the crossroads of multiple disciplines (biology, physics, computer science, art) (O'Donoghue 2021).

Modeling chromosomes is challenging because they belong to the mesoscale, i.e. a length-scale that is larger than discrete molecular complexes yet still remains intracellular (Sear et al. 2015). As a consequence, they are both too small to be observed with precision under optical microscopes, and too large to be fully modeled at the atomic scale. In eukaryotes, chromosomes are long DNA molecules, tightly packed to fit within the nucleus, a space only a fraction of their length. To this end, DNA is wrapped around histone proteins to form nucleosomes, stacked to form chromatin fibers, themselves arranged into higher-order chromatin architecture (Misteli 2020). In this study, we are specifically interested in seeing the spatial organization of fungal chromosomes. Our laboratory has long-standing expertise in functional genomics projects in  Rodriguez et al. 2022).
Notably, the nuclear architecture of N. crassa, a multicellular fungus that grows as a mycelium with a network of hyphae (Galagan et al. 2003), has structural homology (thanks to the existence of heterochromatin and euchromatin) with the human genome (Rodriguez et al. 2022). This makes the N. crassa genome a cost-efficient model to study chromosome conformation. S. cerevisiae and S. pombe are distantly related yeast species (evolutionary distance of at least 400 Mya) that represent very different models of unicellular
study, we explored the potential of using 3D models as well as Hi-C contact maps to better understand the spatial organization of fungal chromosomes. For this purpose, we created a workflow called 3DGB to simplify the creation of 3D models from Hi-C data. Open source and freely available on GitHub, 3DGB generates contact maps, builds 3D models and adds further processing of the 3D model output as PDB files, suitable for advanced visualization with molecular viewer software. Using 3DGB, we created several models of the S. cerevisiae, S. pombe and N. crassa genomes, starting from Hi-C data available in public databases. Different strains were analyzed (wild type and mutants for heterochromatin organization or structural proteins), at different stages of the life cycle. Our models showcase known characteristics of the chromosomal organization of these genomes. But they also reveal, thanks to the visual integration of omics data, important properties of regulatory proteins with critical functions for the maintenance of spatial organization. We thus demonstrate the interest of 3D modeling of Hi-C contacts for studies of genome organization.
To simplify the creation of 3D models for researchers interested in Hi-C data analysis, we created 3D Genome Builder (3DGB), a bioinformatics workflow that streamlines the generation of 3D models to visualize and explore the spatial organization of chromosomes, based on Hi-C experimental results. A general overview of the workflow is presented in Figure 1. Starting from Hi-C raw FASTQ files, 3DGB automatically performs the critical bioinformatics steps required to i) compute Hi-C contact frequencies, ii) infer associated 3D models of the chromatin organization and iii) annotate and control the quality of the 3D models. These 3D models are stored in standard PDB files, so they can be further investigated with complementary visualization tools (see below). 3D models can optionally be enriched with supplementary quantitative omics data, such as ChIP-seq or RNA-seq signals (see next section). 3DGB has been designed to remain as simple as possible. Therefore, it requires only four basic inputs to be specified by the user (Figure 1, green boxes): FASTQ file identifiers, a FASTA file with the sequence of the reference genome, the list of restriction sites for the enzymes used during the Hi-C experiments, and the targeted resolutions for the Hi-C data analysis. This last parameter has an impact on the final 3D models, i.e. the smaller the value (specified in bp), the more detailed the 3D model.

The eight main steps required for Hi-C data processing are represented in blue boxes in Figure 1 and rely on two state-of-the-art software packages. The first step uses HiC-Pro (Servant et al. 2015), a reference in Hi-C data processing, cited more than 800 times. It processes raw FASTQ files, performs quality control and generates normalized contact counts and associated figures (presenting important statistics to evaluate the quality of read mapping and justify potential read filtering). Then, Pastis (Varoquaux et al.
2021) iteratively computes 3D models of the organization of chromosomes, through an original negative binomial model of contact counts (referred to as Pastis-NB). Interestingly, it outputs a consensus model, which, by its uniqueness, greatly simplifies downstream analyses and interpretations. Around those two main components (HiC-Pro and Pastis), 3DGB centralizes the configuration, performs contact calculations, generates contact heatmaps, builds 3D models and adds further processing of the 3D model output as PDB files. 3DGB is open source, available on GitHub (https://github.com/data-fun/3d-genome-builder) and archived in Software Heritage (swh:1:dir:26b6504724952e6d0d7db34c394e052217523754).

Enriched output PDB files for advanced visualization with molecular viewer software

The main 3DGB outputs are 3D models of genome organization. These 3D models are composed of beads for which 3D coordinates (x, y, z) were inferred from contact information (see Methods). One bead represents several thousand base pairs corresponding to the chosen resolution during the Hi-C data analysis (see previous section). To facilitate the study of these models, we enriched the structures produced by Pastis as PDB files with a four-step procedure. First, 3DGB formats and enriches the 3D model output by Pastis for visualization and data integration by annotating each bead with the chromosome number. This is useful to distinguish chromosomes when viewing complete structures (see Figure 2 for illustrations). Second, it automatically reconstructs beads for which no coordinates could be calculated by Pastis, by interpolating missing coordinates from the existing ones (see Methods). Note that we do not extrapolate missing coordinates, meaning that beads with missing coordinates located at the extremities of the model
Altogether, these enhancements provide convenient 3D models to be viewed and explored by Hi-C data analysts.
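The reconstruction of beads without coordinates can be illustrated with a minimal sketch. It assumes linear interpolation along the bead index within a single chromosome, with missing beads encoded as rows of NaN; the text above only states that missing coordinates are interpolated from existing ones and never extrapolated, so the linear scheme, the NaN encoding and the function name `interpolate_missing_beads` are illustrative assumptions rather than the actual 3DGB implementation.

```python
import numpy as np


def interpolate_missing_beads(coords):
    """Fill interior gaps in the bead coordinates of one chromosome.

    coords: array of shape (n_beads, 3); a row of NaN marks a bead for
    which no coordinates were inferred (assumed encoding).
    Interior gaps are filled by linear interpolation along the bead index
    (assumed scheme). Gaps at either extremity are left as NaN, because
    filling them would be extrapolation.
    """
    coords = np.asarray(coords, dtype=float)
    filled = coords.copy()
    index = np.arange(len(coords))
    known = ~np.isnan(coords).any(axis=1)
    if known.sum() < 2:
        return filled  # nothing to interpolate between
    first, last = index[known][0], index[known][-1]
    interior = (~known) & (index > first) & (index < last)
    for axis in range(3):
        filled[interior, axis] = np.interp(
            index[interior], index[known], coords[known, axis]
        )
    return filled
```

With this sketch, a chromosome whose first or last beads are missing keeps those beads as NaN, so a downstream writer could simply skip them when producing the PDB file.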
Regarding file formats, we provide the 3D model structures in the PDB format, a traditional file format widely used to store coordinates of molecular structures. Most software used in structural biology to view and manipulate structures can handle the PDB format. In this paper, we preferentially used Mol* (Sehnal et al. 2021), a ubiquitous viewer for large-scale molecular structures, with a user-friendly web interface allowing visualization and customization of 3D models with only a few mouse clicks. We also used HiC3D-Viewer (Djekidel et al. 2017), a viewer especially developed for chromatin structures. Note that HiC3D-Viewer requires conversion of the PDB file format into the G3D format. 3DGB also provides 3D models in the G3D file format, allowing users to choose the viewing software they prefer. As illustrations, images produced by Mol* and HiC3D-Viewer are presented in Figure 2, for three different 3DGB models inferred in yeasts and a filamentous fungus (see next section).

Part 2. Seeing the spatial organization of fungal chromosomes with 3D Genome Builder (3DGB)

Creation of wild type models and visual consistency with the literature

To test the performance and the relevance of 3DGB, we chose three emblematic fungal species: Saccharomyces cerevisiae, Schizosaccharomyces pombe and Neurospora crassa. We wanted to create 3D models of their genome organization and confront the models with current
Detailed information regarding the source of the data is given in Table 1. As a result, we obtained new models of S. cerevisiae, S. pombe and N. crassa genomes. Three of them, which correspond to wild type situations, are shown in Figure 2. As expected, the key features of the spatial organization of the three species' genomes were observed, i.e. the Rabl conformation, the hypercoiling of chromatin fibers and the chromosome territories.
Finding these well-described characteristics was an important step in validating the relevance of the 3DGB workflow, especially considering that 3DGB uses a probabilistic strategy to infer 3D models (Pastis-NB), which is free of initial constraints, such as the relative position of chromosomes.

ChIP-seq data, in order to better understand the relationship between the identified patterns of chromatin domains (information derived from Hi-C data analysis) and the cohesin residency regions (information derived from ChIP-seq data analysis). They used cells arrested in mitosis, and studied both the wild type strain and mutants altered for cohesin or its regulators. Notably, to finely describe chromatin loop formation, they used a recent improvement of the Hi-C technique,
To go further in this and enrich visualization, we collected their Micro-C XL data and reanalyzed them with 3DGB, to create an updated 3D model of the spatial organization of the S. cerevisiae chromosomes (see Methods). Considering the very high precision of the data, we expected to observe both the general configuration (chromosome territories and Rabl conformation) and the fine chromatin domains (coiling), by zooming in and out with visualization software. Our results are presented in Figure 3. Note that the overall model (Figure 3A) is the same as the one shown in Figure 2B, but with different (color-blind friendly) colors used to identify the 16 chromosomes of the S. cerevisiae genome. ChIP-seq data were also reanalyzed (see Methods) to locate the CARs on the new 3D model. They are shown in yellow in
is shown Figure 3B. Notably, we could observe that CARs, which are still represented by yellow beads on the 3D models (Figure 3D), were aligned in space, and this, independently of the length of the chromatin loops whose boundaries they represent (Figure 3C).
This observation was true for all chromosomes (another example can be found in Supplementary Figure S3). 3D modeling of Hi-C contacts is therefore of significant interest here, revealing the existence of a cohesin skeleton, ensuring structural stability of the spatial organization of the S. cerevisiae

The yellow beads correspond to the cohesive backbone as it was originally identified in the wild type (B).

(F) Juxtaposition of the 3D model and the associated contact map, for the beads which are located between 430 and 580 kb. This is the same genomic region exposed in (C).

To verify the relevance of this observation, we also reanalyzed the Hi-C data obtained with an S. cerevisiae strain in which the gene encoding the subunit Mcd1 of the cohesin complex was depleted (Figure 3D, black star). Results are presented in Figure 3E. This time, the 3D models showed more irregular coiling of chromosomes and globule-like structures. Again, this observation is consistent with the original article of Costantino et al. in which the authors found that in the absence of cohesin, the mitotic chromatin is not entirely disorganized, but rather structured into globular domains, dependent on histone modifications (Costantino et al. 2020). An important question then was what happens, in this context, to the cohesin skeleton previously observed in the wild type strain. We therefore mapped the genomic positions of CARs, as they were defined based on the ChIP-seq data in the wild type strain, on the 3D "mutant" model (Figure 3E
exclusively on their interpretation of Hi-C contact maps, they showed that while the mitotic organization into large domains gradually dissolves across the cell cycle, small domains remain relatively stable. They also hypothesized that the Rabl conformation was stable in interphase but disrupted during mitosis. With this dataset, we assessed our ability to observe these structural features in the 3D models obtained with 3DGB, at different stages of the S. pombe cell cycle. We collected Hi-C data from the original article (Table 1)
Table 1 and Methods section for technical details).
the GEO database under the accession "assembly nc14". When we started using 3DGB to create a 3D model of the wild-type N. crassa chromosome (Figure 2), we used as reference genome the "nc12" assembly, available from the GenBank database.
Surprisingly, we observed in our inferred 3D model an inconsistency in the order of the genomic regions, with respect to the succession of our beads (Figure 6). Our 3D model thus directly highlighted the inverted contig 12.304 on chromosome 6, originally found by Galazka et al. At 10 kb resolution, the 95 beads used to represent this region were placed in the 3D space according to the information of Hi-C contacts only, independently of the reference genome assembly. This explains why we were able to highlight a mismatch between the chaining of beads in the 3D models and the numbering of these beads according to the genome sequence (Figure 6A). Note that when using the nc14 assembly of the genome as reference, the 3D model of chromosome 6 we obtained was coherent both spatially (order of the beads) and sequentially (order of the genomic regions). From this experience, we developed an automated procedure (integrated into 3DGB as an option) to detect and correct inverted contigs by comparing the chaining of beads in the 3D model (based on the distance between adjacent beads, Figure 6B) and the numbering from the genomic sequence written in the FASTA file. This method also produces a corrected version of the 3D model and of the genome assembly (Figure 6C).
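The idea of spotting an inverted contig from the distances between adjacent beads can be sketched as follows. This is only an illustration under stated assumptions: the jump threshold (a multiple of the median adjacent distance, parameter `factor`), the function names and the acceptance test (reversal must shorten the two boundary distances) are hypothetical choices, not the procedure actually implemented in 3DGB.

```python
import numpy as np


def adjacent_distances(coords):
    """Euclidean distance between each bead and the next one."""
    return np.linalg.norm(np.diff(coords, axis=0), axis=1)


def find_inverted_segments(coords, factor=3.0):
    """Return (start, end) inclusive bead indices of candidate inversions.

    A 'jump' is a pair of consecutive beads that are unusually far apart
    (more than `factor` times the median adjacent distance; assumed rule).
    A segment bounded by two consecutive jumps is reported when reversing
    it would bring both of its ends closer to their neighbours.
    """
    coords = np.asarray(coords, dtype=float)
    dist = adjacent_distances(coords)
    jumps = np.flatnonzero(dist > factor * np.median(dist))
    segments = []
    for left, right in zip(jumps[:-1], jumps[1:]):
        start, end = left + 1, right
        before = dist[left] + dist[right]
        after = (np.linalg.norm(coords[left] - coords[end])
                 + np.linalg.norm(coords[start] - coords[right + 1]))
        if after < before:
            segments.append((int(start), int(end)))
    return segments


def flip_segments(coords, segments):
    """Return a copy of coords with each listed segment put back in order."""
    fixed = np.array(coords, dtype=float)
    for start, end in segments:
        fixed[start:end + 1] = fixed[start:end + 1][::-1].copy()
    return fixed
```

The same (start, end) bead indices, multiplied by the resolution, would give the genomic interval to reverse-complement in the FASTA file. Overlapping or nested inversions are not handled by this sketch.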

Quantification of the stability of 3D models to random noise with multiple uses of 3DGB

The advantage of using a workflow like 3DGB is the possibility of automating tasks. In this part, we assess the stability of the 3D organization of the chromosomes with respect to potential inaccuracies in the contact frequencies measured with Hi-C (see Methods). Our strategy was to alter the original values of contact frequencies (issued from the Hi-C data) by adding "shot noise" to the contact frequencies, and then run 3DGB to create an associated 3D model. If the intensity of the shot noise was low, the output model was expected to be close to the original model (obtained from the original data). The resulting RMSD score, calculated by comparing the 3D structure inferred from the reference N. crassa contact map and the 3D structure inferred from the altered N. crassa map, was also expected to be low. Altogether, we assessed 23 levels of intensity of the shot noise (see Methods) and generated, automatically, a total of 1150 models. Our results are summarized in Supplementary Figure S5. As expected, we observed an increase in the RMSD scores when the intensity of the shot noise increased. This was more striking for the global structure (Supplementary Figure 5B

The objective of this work was to explore the interest of creating 3D visualizations of chromosome organization to complement the classic contact maps produced during Hi-C data analysis. We found that, although methods to create 3D models exist, they are not often exploited, and this is particularly true of fungal genome studies. The emblematic 3D representation of the complete genome of S. cerevisiae had been obtained more than 10 years ago, but only partial structures were generated for S. pombe and N. crassa (see Supplementary Figure S1).
Our first intention was therefore to create, from existing bioinformatics tools and Hi-C datasets, complete models for these three species. Unlike what has been done for S. cerevisiae by Duan et al. (2010), we wanted to build models without specifying initial physical constraints regarding chromosome positioning. The Pastis-NB method was particularly suitable for this task. To automate the generation of 3D models, we built 3DGB, a high-throughput Hi-C data processing workflow (Figure 1). With this workflow, we provided enriched PDB files for advanced visualization with molecular viewer software (Figure 2) and we created several hundred models in reasonable computing times (Supplementary Figure S5). The 3DGB workflow is available to other scientists who would like to add 3D models to their analyses of Hi-C data.

The relevance of 3D models of the spatial organization of chromosomes is an important question of our study. Beyond the production of a "beautiful image", what is the added value of these models for Hi-C data analysts? It is indeed important to keep in mind that these models are not "pictures" of the interior of cell nuclei. They are based on experimental data, whose quality may vary and significantly impact the inferred model. Ultimately, they represent measurements of contact frequencies observed at the scale of cell populations and therefore, they are simplified representations of reality. This work on fungal genomes taught us that the visualization of 3D models is profitable for several reasons. First, it gives an overall representation of the organization of the chromosomes (Figure 2)
(Figures 3 and 4) and can provide insight at the level of the biological compaction of chromatin, compensating for technical limitations (resulting in chromosomes which are more spaced).
As stated previously, another inherent limitation of 3D models is that they are "static", "population averaged" pictures of a biological system (chromatin organization) known to be highly dynamic and variable from cell to cell. These artifacts will exist as long as the Hi-C data are snapshots obtained from cell populations. Still, it is important to note that Rabl conformations and chromosome territories were observed in all species (Figure 2), even with the N. crassa model, which is an average representation derived from Hi-C experiments performed on an asynchronous cell population. Therefore, with synchronous cells, the situation can be expected to be even better. This is what we observed with S. pombe models. Because the cells were initially synchronized, the Hi-C datasets produced at different stages of the cell cycle revealed several other interesting genome organizations in the inferred 3D models (Figure 4) and we thus managed to render the dynamics of chromatin. As for the "population average" limit, single cell Hi-C strategies are developing (Stevens et al. 2017), opening the promising perspective of creating 3D genome models of single cells. A final limit of current 3D models is the missing regions of the fungal genomes. As illustrations, the models presented here are built using genomic sequences in which the rDNA repeats were deleted, thus preventing the correct representation of structural features in the nucleolus (explaining the hole we can observe on chromosome 12 in S. cerevisiae models, Figures 2 and 3).

Overall, the limits underline the dependence of 3D modeling on the quality of experimental Hi-C raw data and generation methods. We are confident, however, that the technical improvements that are being introduced at a rapid pace will progressively minimize these limitations.
Importantly, even though the technical ceiling is not yet reached, we have already managed to highlight important fungal genome characteristics in a novel way by reanalyzing public datasets. For N. crassa, the six 3D models produced by 3DGB condensed the information of five research articles and about 20 raw FASTQ data files into one rich large-scale illustration (Figure 5). They bring new opportunities for visual integration of omics data: while Hi-C analysis often implies a "zoom-in" logic, focusing on precise regions of a contact map, 3D modeling completes and enhances this logic with a "zoomed-out" vision.

In conclusion, we have presented a holistic approach that is favorable to intuition and new hypotheses. We explored large-scale integration of epigenetic ChIP-seq data into the 3D context of chromatin, but any suitable combination of omics datasets can be made using the genome model as a visual support. For example, we are currently exploring the interpolation of genes on the 3D genome structure. 3DGB could be adapted to provide a model at the gene (one bead
Pastis and all custom Python scripts) and a Singularity container (for HiC-Pro).

3DGB inputs
3DGB requires the following inputs: Hi-C FASTQ files (provided as SRA IDs or as local FASTQ files), a reference genome sequence provided in a FASTA file (without mitochondrial DNA), one or several enzyme restriction site motifs, and finally, values for the output resolution to draw the contact map and generate 3D models.

Main steps in the analysis

The first steps of the workflow format the necessary information for the read mapping step:
structures. This is the case for Mol*, used to create images presented in this article.

Analyses of experimental datasets in fungal species

Access to raw data from SRA database

All the Hi-C seq, Micro-C XL seq and ChIP seq raw FASTQ files were downloaded from the SRA database, see Table 1 and Supplementary Table 1 (second tab for ChIP seq) for all SRA ids. The reference genomes of S. cerevisiae and S. pombe were obtained from the NCBI Genome database, respectively version R64 (S288C) and ASM294v2. 19. The nc14 version of the N.

Application of 3DGB to create 3D models

3DGB was applied with default parameters (FASTQ files, see Table 1
The hpo mutant N. crassa contact frequency matrix was used as a "reference" to generate "noisy contact frequency matrices". The hpo mutant was selected because it has the lowest number of valid read pairs (2,922,526) and this sample is the most likely to be sensitive to noise. Let $c_{ij}$ be the number of counts in the contact frequency matrix of this sample, with $i$ and $j$ the indices of the beads ranging from 1 to $n = 825$, with $n$ the total number of beads for this sample. To create an "altered" or "noisy" N. crassa contact frequency matrix, we set $c'_{ij} = c_{ij} + \epsilon_{ij}$ as the number of counts in the noisy contact frequency matrix, where $\epsilon_{ij}$ is a value sampled from a Poisson distribution with parameter $\lambda$, independently for each pair of beads $(i, j)$. Note that the contact frequency matrices are symmetric and $c'_{ij} = c'_{ji}$ for each pair of beads. The Poisson distribution was used because it models shot noise, i.e. the variability in read counts that is due to the read sampling process rather than to biological variations (Anders and Huber 2010). The level of noise (i.e. the mean parameter $\lambda$ of the Poisson distribution used to generate shot noise) varies among the following values: 0.1, 5, 10, 15, 20 up to 110. For each value of $\lambda$, 50 replicates were generated, i.e. 50 noisy contact frequency matrices. To assess the stability of 3D structures, the RMSD was computed between the 3D structure inferred from the reference N. crassa contact frequency matrix (the raw contact matrix with counts $c_{ij}$) and the 3D structure inferred from the altered N. crassa maps (the noisy contact frequency matrix with counts $c'_{ij}$). This procedure leads to 50 values of RMSD for each value of $\lambda$. Supplementary Figure S5 represents the RMSD values for $\lambda$ ranging from 0.1 to 50. Above the value $\lambda = 50$, the RMSD values reach a plateau and do not increase with the value of $\lambda$. RMSD values were also computed for each of the seven chromosomes of N. crassa, for each value of $\lambda$ and each of the 50 replicates.

Visual integration of omics data

The same pipeline was used for integration of ChIP-seq data from N. crassa and S. cerevisiae. To perform alignment, Bowtie2 was used with default parameters (Supplementary Table 1, tab 2). The program samtools was used to convert the SAM files into sorted, indexed BAM files, and the program bamCoverage was used to generate 5 kb or 50 kb .bedgraph files with RPKM normalization (for S. cerevisiae and N. crassa respectively). Each bin in the bedgraph corresponds to one bead of the 3D model. Note that the resolution of the bedgraph and of the 3D model are the same, allowing the integration of the data into the PDB file. For S.
cerevisiae, the threshold was set to 80 counts after normalization, converting the continuous ChIP-seq signal into a binary signal highlighting the high residency CARs (Costantino et al. 2020). For N. crassa, the ChIP-seq data was kept continuous to highlight the epigenetic mark distribution. The values below 20 counts are discarded for visualization.

Data access

The 3DGB workflow is available on GitHub: https://github.com/data-fun/3d-genome-builder. The
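Returning to the noise-stability analysis described in the Methods above, the generation of noisy contact matrices and the comparison of structures can be sketched in a few lines. This is a minimal sketch, not the authors' scripts: the function names are hypothetical, the noise is drawn on the upper triangle and mirrored to keep the matrix symmetric, and the RMSD is computed after optimal rigid superposition (Kabsch algorithm), which is an assumption since the text does not state whether structures were superposed before comparison.

```python
import numpy as np

# Noise levels listed in the Methods: 0.1, then 5 to 110 in steps of 5.
NOISE_LEVELS = [0.1] + [float(v) for v in range(5, 111, 5)]
N_REPLICATES = 50


def add_shot_noise(counts, lam, rng):
    """Add Poisson(lam) noise to a symmetric contact count matrix.

    One value is drawn per unordered pair of beads (upper triangle,
    diagonal included) and mirrored, so the result stays symmetric.
    """
    counts = np.asarray(counts)
    n = counts.shape[0]
    noise = np.zeros((n, n), dtype=np.int64)
    upper = np.triu_indices(n)
    noise[upper] = rng.poisson(lam, size=len(upper[0]))
    noise = noise + np.triu(noise, k=1).T  # mirror onto the lower triangle
    return counts + noise


def superposed_rmsd(a, b):
    """RMSD between two bead sets after Kabsch superposition (assumed)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    u, _, vt = np.linalg.svd(a.T @ b)
    # Guard against improper rotations (reflections).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    diff = a @ rot.T - b
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))
```

In such a setup, each of the `len(NOISE_LEVELS) * N_REPLICATES` noisy matrices would be passed to the 3D inference step, and `superposed_rmsd` would then compare each resulting structure with the reference one.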