A giant virus genome is densely packaged by stable nucleosomes within virions

The two doublet histones of Marseillevirus are distantly related to the four eukaryotic core histones and wrap 121 basepairs of DNA to form remarkably similar nucleosomes. By permeabilizing Marseillevirus virions and performing genome-wide nuclease digestion, chemical cleavage and mass spectrometry assays, we find that the higher-order organization of Marseillevirus chromatin fundamentally differs from that of eukaryotes. Marseillevirus nucleosomes fully protect DNA within virions as closely abutted 121-bp DNA wrapped cores without linker DNA or phasing along genes. Likewise, we observed that a large fraction of the nucleosomes reconstituted onto multi-copy tandem repeats of a nucleosome positioning sequence are tightly packed. Dense promiscuous packing of fully wrapped nucleosomes rather than “beads-on-a-string” with genic punctuation represents a new mode of DNA packaging by histones. We suggest that doublet histones have evolved for viral genome protection and may resemble an early stage of histone differentiation leading to the eukaryotic octameric nucleosome.


Introduction
The association of histones with DNA in the eukaryotic nucleus was known by the late 19th century (Luck, 1965), but it was the revolutionary discovery of nucleosomes in the early 1970s (Hewish and Burgoyne, 1973; Kornberg and Thomas, 1974; Noll, 1974) that established the fundamental subunit structure of eukaryotic chromatin. Other nucleosome configurations were described for homotetrameric archaeal histones, which were found to wrap ~60 bp of DNA (Pereira et al., 1997) and form higher-order "slinkies" (Mattiroli et al., 2017). Some archaeal histones are doublets with two histone fold domains, a differentiated form hypothesized to predate the evolution of eukaryotic nucleosomes (Malik and Henikoff, 2003).
Although the DNA wrap of archaeal nucleosomes resembles that of eukaryotic nucleosomes, the histones are too dissimilar in sequence from the four core eukaryotic histones to identify correspondences with specific domains of histone doublets. However, sequencing of the giant Marseillevirus discovered in 2007 led to the realization that its genome encodes two histone doublets that are paired homologs of the four eukaryotic core histones (Boyer et al., 2009). In Marseillevirus, Hβ-Hα is homologous to eukaryotic H2B and H2A, and Hδ-Hγ is homologous to H4 and H3. Related viruses of the family Marseilleviridae have since been discovered infecting Acanthamoeba species worldwide.
We and others have previously used biochemical reconstitution and cryoEM imaging to solve the high-resolution structure of Marseillevirus nucleosomes, which show a striking resemblance to their eukaryotic counterparts (Liu et al., 2021; Valencia-Sanchez et al., 2021). Unlike octameric eukaryotic nucleosomes, which wrap 147 bp of DNA, tetrameric Marseillevirus nucleosomes wrap only 121 bp of DNA, despite being reconstituted on the Widom 601 artificial positioning sequence, selected for 147-bp wrapping of nucleosome cores. Because Marseillevirus doublets assemble into nucleosomes that are not fully wrapped by DNA, it was proposed that they are inherently unstable, perhaps to facilitate expression during early stages of infection or for gene regulation (Liu et al., 2021). However, without isolation of viral chromatin in its native form, the extent to which reconstituted Marseillevirus nucleosomes are representative of their conformation in virions is debatable (Vannini and Marazzi, 2021). By mapping the chromatin landscape of the viral genome we aimed to understand the functional and evolutionary basis for viral nucleosomal packaging.
Here we show that chromatin from Marseillevirus can be released from viral particles by breaching the capsid and permeabilizing the lipid membrane followed by either enzymatic or chemical cleavage. Using mass spectrometry, MNase-seq and MPE-seq, we confirmed that the 121-bp wrap observed by cryoEM for reconstituted nucleosomes is also observed for endogenous histone doublets in the virion. However, in contrast to eukaryotic nucleosomes, which are 147-bp particles separated by ~50-bp linkers, 121-bp Marseillevirus nucleosomes are tightly packed within the virion. Unlike eukaryotic nucleosomes, which are depleted from transcription start sites and phased downstream, we observed no depletion or phasing at genes. To determine whether tight packaging seen in the virion is inherent to Marseillevirus histone doublets, we reconstituted histones on multi-copy arrays with a strong 147-bp positioning sequence. We found that most of the arrays were fully occupied by histone doublets and were resistant to chemical cleavage, although a fraction of the nucleosomes were wrapped to form 147-bp particles, suggesting that neighboring nucleosomes stabilize wrapping. Our findings indicate that Marseillevirus histone doublets have evolved for tight packing of 121-bp particles without sequence or genic preference, consistent with a viral packaging function, thus representing a previously unknown mode of genome packaging by histones.

Release of Marseillevirus chromatin from viral particles by nuclease digestion
The capsids of giant viruses that infect amoeba are resistant to treatments that disrupt other viruses. Previous attempts to release chromatin intact from Marseilleviridae capsids were reported to have been unsuccessful (Liu et al., 2021). However, Schrad and co-workers (Schrad et al., 2020) had shown that giant Mimivirus capsids can be breached by dialyzing in low pH conditions, although without releasing their contents. We followed their protocol to open the Marseillevirus capsid, then dialyzed into a neutral buffer for Micrococcal Nuclease (MNase) digestion. Capillary gel electrophoresis revealed a striking ladder of protected particles with mono- and di-nucleosomes dominating and discernible tri- and tetra-nucleosome peaks (Figure 1a). Under the same permeabilization and digestion conditions, Drosophila mononucleosomes dominate, with only a minor di-nucleosomal peak (Figure 1b).
High yields of digested chromatin were obtained from samples treated with 0.5% NP-40 over a time-course ranging from 1 minute to 18 hours (Figure 1a, S1a). In contrast, 0.1% digitonin had no effect (Figure 1c, S1b-c). As NP-40 permeabilizes membranes by sequestering lipids, whereas digitonin permeabilizes cell membranes by displacing membrane sterols, our results suggest that the chromatin of Marseilleviridae, like that of giant Mimiviridae (Kuznetsov et al., 2010; Suzan-Monti et al., 2007; Xiao et al., 2009; Zauberman et al., 2008), is enclosed within an impermeable lipid membrane that would have protected it from low pH conditions, and that subsequent permeabilization allowed MNase to gain access to chromatin. Under our permeabilization and digestion conditions, 85-90% of the DNA in virions was recovered as intact MNase-protected particles (Figure S1).
To ascertain the protein composition of Marseillevirus chromatin, we centrifuged viral suspensions following NP-40 permeabilization and MNase treatment, then extracted total protein from the pellet, supernatant and washes. We performed SDS-PAGE and excised bands from silver-stained gels as indicated (Figure 2a), including the ~25 kDa bands corresponding to the predicted MWs of the two doublet histones ([...]949 and [...]218) and neighboring bands. We also excised bands from total protein extracted from reconstituted chromatin using histones produced in Escherichia coli and from a blank gel lane. Protein samples were digested with trypsin and subjected to mass spectrometry followed by a comparison of peptide MWs to those predicted for trypsinized Marseillevirus ORFs and likely contaminants. Only Hβ-Hα and Hδ-Hγ peptides were found at consistently high levels in both viral chromatin and reconstituted chromatin samples (Figure 2b and Supplemental Item 1).
A distantly related variant of Hβ-Hα, Hz-He (MW = 19,033), was detected in the chromatin fraction at a much lower level in the expected size range, consistent with previous mass spectrometry of viral extracts (Boyer et al., 2009). [...]
[...] A ~20-kb region centered over Position 27,000 is ~3-fold over-represented, and a ~2-kb region centered over Position 318,400 is ~10-fold over-represented similarly in all ten MNase datasets regardless of digestion level. To determine whether these striking over-representation features were present in the initial sample from 2007 used to assemble the original T19 map, we mapped the primary T19 reads from the original fastq files, and observed that over-representation at Position 27,000 was already conspicuously present in this sample (Figure 3a, left, top track).
To confirm the validity of the Genbank assembly of the T19 circular genome, we performed de novo genome assembly for a representative T19 sample. We ran the SPAdes program on the 50-bp end reads, which provided >2000x genomic coverage per sample. We obtained >99.9% coverage by the 8 largest contigs, where regions of over-representation are spanned in our assembly (Table S1). This demonstrates that the over-represented regions seen in T19 and G648 are inherent to the genome used for profiling chromatin, and do not reflect a mis-assembly artifact.
To quantify over-representation, we randomly sampled fragments of similar lengths to each over-represented feature from the rest of the genome 1000 times and plotted the abundance of each sample on a log-log cumulative plot for the raw reads from the original 2007 virus culture and the No-MNase paired-end reads from the 2020 virus culture used in this study (Figure 3c, left panel). This revealed that the 20-kb region over Position 27,000 was ~2-fold over-represented in the original 2007 viral culture and ~2.7-fold over-represented in the 2020 culture relative to their respective genome-wide median abundances.
Likewise, the 2-kb region over Position 318,400 was ~1.1-fold over-represented in the original 2007 viral culture and ~10-fold over-represented in the 2020 culture. Although the magnitude of over-representation of the 2-kb region in the original 2007 viral culture is small, it is ranked between #1 and #2 of the 1000 randomly sampled regions, and so likely represents incipient over-representation of some virions in that culture. It is evident that over-representation of the 20-kb region continued to be maintained at 2-3-fold over-abundance during successive passages in Acanthamoeba polyphaga culture at similar levels within the cultured Marseillevirus populations, while the 2-kb region expanded ~12-fold during successive passages in culture.
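The sampling comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the helper name `over_representation`, the per-base coverage list, the use of mean coverage as the abundance measure, the median of the sampled windows as the reference, and the treatment of the genome as a linear sequence are all choices made here for illustration.

```python
import random
from statistics import mean, median

def over_representation(coverage, start, end, n_samples=1000, seed=0):
    """Compare mean coverage of [start, end) with random same-length windows.

    Hypothetical helper: 'coverage' is a per-base list of read depth.
    Returns (fold over the median sampled window, rank of the feature).
    """
    rng = random.Random(seed)
    length = end - start
    feature = mean(coverage[start:end])
    background = []
    while len(background) < n_samples:
        s = rng.randrange(0, len(coverage) - length + 1)
        # Reject windows that overlap the feature itself.
        if s < end and s + length > start:
            continue
        background.append(mean(coverage[s:s + length]))
    fold = feature / median(background)
    # Rank 1 means the feature is more abundant than every sampled window.
    rank = 1 + sum(b >= feature for b in background)
    return fold, rank
```

A feature that ranks at or near the top of the sampled windows, even with a modest fold value, is the situation described for the 2-kb region in the 2007 culture.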
We next asked if regional over-representation and expansion during culturing is a general feature of the Marseilleviridae. G648 is a much more recent isolate and therefore has been maintained for a shorter time in culture than T19. For G648 we observed a different region of over-representation, in which a ~50-kb segment approximately centered over Position 319,000 is present in all ten samples in the MNase concentration series (Figure 3b). Although the primary reads from the fastq files used for the original assembly of G648 were relatively sparse, quantitative sampling analysis showed that this ~50-kb region is also strongly over-represented, with a 1.9-fold excess, compared to a 2.6-fold excess for the No MNase sample (Figure 3c, right panel). As for T19, de novo assembly yielded >99.9% coverage by the 8 largest contigs, confirming the validity of the original assembly (Table S1).
Mapping of unpaired fragments from paired-end sequencing libraries shows them to be in large excess immediately outside of the over-represented regions, as expected for the presence of novel tandem repeat junctions (Figure 3d). Our observations of different over-represented regions in two Marseillevirus isolates and increases in the abundance of over-represented regions in T19 during passage indicate that regional over-representation occurs during culturing of Marseillevirus in A. polyphaga. Analysis of long repeat-spanning fragments indicated that regional over-representation is accounted for by intrachromosomal tandem repeats rather than extrachromosomal circles (Figure S2).

The Marseillevirus genome is densely packaged by 121-bp nucleosomes without linkers
Marseillevirus mono-nucleosomal DNA fragments released by MNase are smaller than those of Drosophila, and show much less spacing between nucleosomes (Figure 1d). Comparison of a Marseillevirus ladder to a lightly digested Drosophila ladder revealed an average difference of ~55 bp for di-nucleosomes (two nucleosomes separated by one linker) and ~120 bp for tri-nucleosomes (three nucleosomes separated by two linkers). Given that the average eukaryotic nucleosome repeat length is ~200 bp and a nucleosome wraps 147 bp, our MNase digestion results suggest that there are no linkers separating Marseillevirus nucleosomes, but rather that they are abutted against one another (Figure S3a). Such tight packing of Marseillevirus nucleosomes might explain why they remain insoluble even after MNase digestion, in contrast to Drosophila nucleosomes, which are quantitatively released by MNase into solution (Figure 1c, Figure S1), as if tight packaging within the permeabilized virion prevents release of mononucleosomes.
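The arithmetic behind this inference can be made explicit. The sketch below assumes, as the comparison in the text does, that the ladder-size differences are attributable to linkers alone; the function name is illustrative.

```python
NUCLEOSOME_REPEAT_BP = 200  # average eukaryotic repeat length cited in the text
CORE_WRAP_BP = 147          # DNA wrapped by a eukaryotic core particle

def per_linker_difference(size_difference_bp, n_nucleosomes):
    """Spread an observed ladder-size difference over the n-1 linkers."""
    return size_difference_bp / (n_nucleosomes - 1)

# Linker length expected from the eukaryotic repeat length.
expected_linker = NUCLEOSOME_REPEAT_BP - CORE_WRAP_BP

# Observed Drosophila-minus-Marseillevirus differences from the ladders.
di = per_linker_difference(55, 2)    # one linker
tri = per_linker_difference(120, 3)  # two linkers

print(expected_linker, di, tri)  # 53 55.0 60.0
```

The per-linker differences of 55-60 bp are close to the ~53-bp linker implied by a 200-bp repeat, which is what would be expected if the Drosophila ladder contains linkers and the Marseillevirus ladder does not.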
At the highest MNase digestion levels, control Drosophila nucleosomes showed a dominant 147-bp peak, but also a smear of sub-nucleosome-sized digestion products (Figure 4a). Using the highest digestion level for Marseillevirus we observed a dominant 121-bp MNase-protected peak, a small ~70-bp peak, but no smear, confirming cryoEM observations of 121-bp wrapping of reconstituted nucleosome cores (Liu et al., 2021;Valencia-Sanchez et al., 2021). These observations indicate that Marseillevirus nucleosomes are less sensitive to intranucleosomal cleavages than are eukaryotic nucleosomes.

MPE-seq confirms the dense packing of Marseillevirus nucleosomes.
MNase is an endo/exo-nuclease that is known to preferentially digest AT-rich regions (Chung et al., 2010; McGhee and Felsenfeld, 1983). Consistent with these observations, we found that ~90% of the cleavage sites occur between A/T base pairs (Figure S4) and long AT-rich regions are preferentially digested (Figure S5). Therefore, we wondered whether the unexpected ~70 bp peak (Figure 4a) [...] (Cartwright et al., 1983). MPE-seq is performed similarly to MNase-seq, but without exonuclease activity and without sequence bias (Ishii et al., 2015) (Figure S5). When we treated Drosophila nuclei and permeabilized Marseillevirus particles with MPE-Fe(II), we observed mostly mononucleosome-sized [...]
[Figure 5 caption, partially recovered] [...] His2Av and ball, chr3R:26,862,872,500) was selected for display. For T19 an arbitrarily chosen representative 10-kb region was selected (53,001-63,000) and the region with the same coordinates in G648 (but not orthologous) was also selected. Fragment size classes represent subnucleosomes (1-100 bp), nucleosomes (101-200 bp) and mixtures of mono- and di-nucleosomes (201-300 bp). Nearly uniform occupancy is seen for MPE-generated Marseillevirus nucleosomes, whereas under the same conditions Drosophila nucleosomes display regions of conspicuous phasing characteristic of nucleosomes separated by linker regions. Tracks are group-autoscaled within sets of four. Percent G+C (green) and open reading frames (ORFs, black boxes) are plotted at 10-bp resolution for each 10-kb span. (c) Autocorrelation illustrates periodicities over representative regions. To sensitively detect nucleosome phasing, we plotted autocorrelations over the first 600 bp of each representative region shown above for the three fragment size classes as indicated using MNase-seq and MPE-seq data. Autocorrelations in 1-bp lag steps over a 5'-aligned 10-kb span are plotted for MNase-seq (blue) and MPE-seq (magenta).
[...] was seen for all three size classes, but only for MNase-seq, with little if any distinctness in the chromatin landscape for MPE-seq. We also examined these representative 10-kb regions for evidence of nucleosome phasing by performing autocorrelation analysis (Figure 5c). This revealed consistent periodicities for both the MNase-seq and MPE-seq Drosophila data, most prominently for the 101-200 subset, as expected for nucleosome phasing. In contrast, the slight periodicities we observed for Marseillevirus T19 and G648 were mostly inconsistent between MNase-seq and MPE-seq. This lack of consistent phasing in Marseillevirus T19 and G648 nucleosomes suggests that they are densely packed without intervening linkers characteristic of eukaryotic nucleosomes.
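For readers unfamiliar with the approach, an autocorrelation in 1-bp lag steps can be sketched as follows. This is a generic, normalized autocorrelation written for illustration; the function name and the synthetic 200-bp periodic signal are assumptions, and the study's own implementation may differ in normalization and preprocessing.

```python
import math
from statistics import mean

def autocorrelation(signal, max_lag):
    """Normalized autocorrelation of an occupancy signal at lags 0..max_lag.

    'signal' is a per-base occupancy list; max_lag must be below len(signal).
    """
    n = len(signal)
    mu = mean(signal)
    centered = [x - mu for x in signal]
    variance = sum(c * c for c in centered)
    if variance == 0:
        return [0.0] * (max_lag + 1)  # flat signal: no periodicity to report
    result = []
    for lag in range(max_lag + 1):
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        result.append(cov / variance)
    return result

# Synthetic phased landscape with a 200-bp repeat (illustrative only).
phased = [math.cos(2 * math.pi * i / 200) for i in range(2000)]
ac = autocorrelation(phased, 300)
```

For a phased landscape the autocorrelation peaks again near the repeat length and dips at half of it, whereas uniformly occupied chromatin gives no consistent peak at any lag.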

The Marseillevirus chromatin landscape lacks genic differentiation.
We wondered whether the minor regularities seen in the MNase-digested Marseillevirus nucleosome landscapes and by autocorrelation analysis (Figure S6) correspond to genic regions, which are annotated as [...] (Figure 6a), although when the scale was expanded a positioned +1 nucleosome was observed. However, no nucleosome positioning was seen for MPE-generated T19 and G648 5'-aligned ORFs (Figure 6a, c), indicating that the minor +1 nucleosome positioning that was seen with MNase is likely attributable to its aggressive endo/exonuclease activity (Figure S3a). The lack of 5' phasing, a universal characteristic of eukaryotic genes, indicates that the packaged Marseillevirus genome is likely to be transcriptionally inactive upon release from the capsid during infection.
Alignment of the 191 ORFs at the stop codon of their 3' ends showed extreme sensitivity of Drosophila nucleosomes to MNase levels not seen for MPE-generated fragments (Figure 6b), which is consistent with partial unwrapping of AT-rich ORF 3'-end DNA from nucleosome cores and sensitivity to MNase exonucleolytic activity. A similar MNase sensitivity and MPE insensitivity was seen for T19 and G648 nucleosomes at the very 3' ends of ORFs (Figure 6b), which is likewise attributable to AT-rich regions at Marseillevirus 3' ends. However, unlike chromatin at the 3' ends of Drosophila ORFs, both T19 and G648 chromatin displayed an average peak of MNase resistance just upstream of the 3' end. As this peak was absent from MPE-digested average profiles, we attribute it to internal MNase cleavage of a subset of nucleosomes that are relatively excluded from neighboring AT-rich regions rather than to 3' nucleosome phasing. The lack of any chromatin accessibility features punctuating genic regions implies that Marseillevirus nucleosomes have evolved exclusively for packaging within the virion.
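The ORF-aligned average profiles discussed in this section can be illustrated with a small strand-aware aggregation. This is a sketch under assumptions: the helper name, the per-base occupancy list, the (position, strand) anchor format and the linear treatment of the genome are choices made here, not details taken from the study.

```python
def aligned_average(occupancy, anchors, flank):
    """Average occupancy in a window of +/- flank bp around each anchor.

    anchors: list of (position, strand) tuples, strand being '+' or '-'.
    Minus-strand windows are reversed so that 5'->3' always runs left to right.
    """
    width = 2 * flank + 1
    totals = [0.0] * width
    used = 0
    for pos, strand in anchors:
        lo, hi = pos - flank, pos + flank
        if lo < 0 or hi >= len(occupancy):
            continue  # skip anchors too close to the ends (linear map assumed)
        window = occupancy[lo:hi + 1]
        if strand == '-':
            window = window[::-1]
        for i, value in enumerate(window):
            totals[i] += value
        used += 1
    return [t / used for t in totals] if used else totals
```

Anchoring on start codons gives a 5'-aligned profile and anchoring on stop codons gives a 3'-aligned profile; a phased +1 nucleosome would appear as a peak at a fixed offset from the anchor, while uniform packing gives a flat average.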

Dense packing of reconstituted Marseillevirus chromatin on Widom 601 arrays.
We wondered whether previous reconstitutions of Marseillevirus nucleosomes on single 601 sequences (Liu et al., 2021; Valencia-Sanchez et al., 2021) failed to show fuller wrapping because of the lack of neighboring nucleosomes, which we find are closely abutted in virions. Accordingly, we performed MPE and MNase digestion on native and cross-linked reconstituted Marseillevirus chromatin that had been assembled at 60 µg/ml or 300 µg/ml concentrations onto a three-copy 601 array and onto a 12-copy 601 array. Following DNA extraction, we prepared sequencing libraries and performed paired-end sequencing, aligning fragments from native or cross-linked 3-copy and 12-copy chromatin digests to the 3-copy 601 array. For the 3-copy native chromatin, MPE-seq revealed similar chromatin landscapes for the 60 µg/ml and 300 µg/ml samples, with higher occupancy over the three 147-bp 601 positioning sequences than over the intervening 40-bp linkers, although not as high as for Xenopus control core histones assembled on the same 3-copy array (Figure 7a, left). The transitions between 601 and linker sequence were sharply defined, indicating precise positioning of 147-bp 601 particles over the 601 sequence. A much smaller fraction of 147-bp 601 particles was observed for MPE digestion of cross-linked 3-copy chromatin (Figure 7a, right).
[Figure 7 caption] (a) Three-copy (521 bp) or 12-copy (2084 bp) Widom 601 arrays and 60 µg/ml or 300 µg/ml of Marseillevirus histones and control Xenopus core histones were assembled into nucleosomes. Native (left) and cross-linked (right) assemblies were subjected to MPE-seq (cyan) and MNase-seq (magenta). The higher abundance of fragments towards the middle of each array is a consequence of mapping to a linear fragment, where fragments that span an end can only align with internal copies. MPE 12A and MPE 12B were assembled onto the 12-copy array, cross-linked, digested with MPE and aligned to the 3-copy array. Profiles were group autoscaled. (b) Length distributions of fragments produced by MPE from the 3-copy array reconstituted samples show that up to 39% of the total occupancy is accounted for by precise cleavages on either or both ends of a 147-bp particle. Panel annotations: 3 1-bp sites = 39%, 8%, 73%, 70%, 27% and 2% of total.
To quantify the relative abundance of cleavages precisely at 147-bp 601 particle ends, we plotted the length distributions for each 3-copy sample (Figure 7b). In each case, MPE digestions of 3-copy array chromatin resulted in 1-bp wide fragment length peaks at 147 bp, 334 bp and 521 bp. As the entire 3-copy array is 521 bp and 334 bp is exactly the size expected for a 601-linker-601 spanning fragment, it is evident that MPE endonucleolytically digests to completion without detectable encroachment into the nucleosome-wrapped particles. Nineteen percent of the nucleosomes in the 60 µg/ml sample were 147 bp and precisely phased over the 601 positioning sequence, and 147 bp, 334 bp and 521 bp fragment ends accounted for 39% of the total (Figure S3b). By comparison, control Xenopus nucleosomes assembled on the same 3-copy 601 array and subjected to MPE-seq yielded 38% 147-bp particles, and 147 bp, 334 bp and 521 bp fragment ends accounted for 73% of the total. This confirms the inherent tendency of adjacent Marseillevirus nucleosomes to interact and to partially overcome DNA sequence-directed positioning more effectively than eukaryotic nucleosomes do. High histone occupancy over the linkers between 601 positioning sequences contrasts with the 121-bp wrap seen for Marseillevirus nucleosomes reconstituted onto single 601 sequences (Liu et al., 2021; Valencia-Sanchez et al., 2021) and suggests that closely abutted Marseillevirus nucleosomes stabilize one another and prevent unwrapping (Figure S3a).
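The three discrete fragment lengths follow directly from the array geometry, and the share of fragments falling exactly on them can be tallied as in the sketch below. The helper names are illustrative, and counting exact fragment lengths is a simplification assumed here; the percentages reported in the text may have been computed differently (for example from fragment ends).

```python
from collections import Counter

POSITIONING_BP = 147  # Widom 601 positioning sequence
LINKER_BP = 40        # linker between copies in the 3-copy array

def span_length(n_copies):
    """Length of a fragment spanning n adjacent 601 copies plus the linkers between them."""
    return n_copies * POSITIONING_BP + (n_copies - 1) * LINKER_BP

def precise_fraction(fragment_lengths, max_copies=3):
    """Fraction of fragments whose length exactly matches a 1-, 2- or 3-copy span."""
    expected = {span_length(n) for n in range(1, max_copies + 1)}
    counts = Counter(fragment_lengths)
    hits = sum(counts[length] for length in expected)
    return hits / len(fragment_lengths)

print([span_length(n) for n in (1, 2, 3)])  # [147, 334, 521]
```

A population dominated by sequence-positioned particles gives a high fraction at these three lengths, whereas dense packing without sequence preference spreads fragments across a broad length distribution.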
After cross-linking, only ~1% of 3-copy Marseillevirus chromatin arrays corresponded to 147 bp particles, and 147 bp, 334 bp and 521 bp fragment ends accounted for only ~2% of the total, superimposed over a broad distribution of fragment lengths (Figure 7b and S3b). For 12-copy cross-linked reconstituted arrays, MPE-seq resulted in the total absence of 1-bp peaks. By comparison, control Xenopus nucleosomes reconstituted on 3-copy arrays yielded 14% 147-bp particles, and 147 bp, 334 bp and 521 bp fragment ends accounted for 26% of the total (Figure S3b). This indicates that a larger fraction of the MPE cleavages are uniformly distributed on reconstituted Marseillevirus chromatin than on reconstituted eukaryotic chromatin.
In contrast to the results with MPE, MNase digestion of the same reconstituted chromatin showed a distribution of fragment lengths ~20-25 bp smaller than the three discrete fragment lengths produced by MPE digestion of Marseillevirus chromatin. Most notably, MNase digestion of native Marseillevirus chromatin produced fewer than 1% precisely positioned cleavages and resulted in a rough profile, whereas for Xenopus nucleosomes assembled on the same 3-copy Widom 601 array, 70% of the cleavages were precisely positioned and resulted in a clean sawtooth pattern (Figure 7a and S3b). Reduced nucleosome positioning on 601 arrays measured by both MPE-seq and MNase-seq distinguishes Marseillevirus doublet histones from eukaryotic core histones and recapitulates the situation in virio.

Discussion
We have shown that Marseillevirus nucleosomes can be recovered intact from virions and used to elucidate nucleosome organizational features using mass spectrometry, MNase-seq and MPE-seq. These methods reveal particles that differ from eukaryotic nucleosomes in being refractory to internal cleavages and tightly packed into a landscape without linker DNA or phasing around genes. Taken together, our findings reveal a dense chromatin landscape that may have evolved to maximize protection of viral DNA for survival during infection in amoeba cytoplasm. This mode of chromatin organization differs drastically from that of eukaryotes, where nucleosomes not only protect DNA, but also have evolved for gene regulation by limiting access to regulatory elements (Kornberg and Lorch, 2020). Given the close structural superimposition of the Marseillevirus nucleosome with the eukaryotic nucleosome (Liu et al., 2021; Valencia-Sanchez et al., 2021), our finding that Marseillevirus chromatin lacks linkers or genic punctuation is especially remarkable. Marseillevirus chromatin is also unlike well-studied archaeal chromatin, in which 60-bp single-wrapped units are thought to form long "slinkies" that dynamically open and close (Mattiroli et al., 2017). Rather, tight packing without linkers implies an inherently stiff fiber. Therefore, Marseillevirus nucleosomes represent a previously unknown mode of chromatin packaging.
In addition to traditional MNase-seq, we applied MPE-seq, a chemical cleavage mapping method with high penetrability and without sequence bias (Ishii et al., 2015). Because MPE-seq lacks "nibbling" activity, it revealed fully wrapped 147-bp reconstituted particles on Widom 601 DNA. This artificial nucleosome positioning sequence was based on SELEX selection from chemically synthesized random DNA sequences using eukaryotic histone cores (Lowary and Widom, 1998). Therefore, our results using reconstituted 601 arrays imply that Marseillevirus nucleosomes follow the same rules for nucleosome wrapping as eukaryotic nucleosomes, but doublet histones have evolved to wrap less DNA to facilitate dense nucleosome packing without intervening linkers. The fact that wrapping to the edges of the 147-bp Widom positioning sequence on 3-copy arrays accounts for only ~40% of the total and just 2% when cross-linked implies that the large majority of MPE-generated cleavages result from dense nucleosome packing without sequence preference on arrays, just as we observed in virions.
The lack of linkers or genic punctuation of chromatin in virions raises questions as to how the chromatin landscape becomes accessible following infection, when the virion is transformed into a cytoplasmic "viral factory" within Acanthamoeba cytoplasm (Liu et al., 2021;Suzan-Monti et al., 2007). In addition to the abundant doublet Hβ-Hα and Hδ-Hγ histones that Marseillevirus encodes on divergent transcription units, a separately encoded histone doublet variant, Hz-He, is distantly related to Hβ-Hα and is present at very low levels in the chromatin fraction of the virion (Figure 2b). It is possible that Hz-He acts as a replacement variant, analogous to the eukaryotic histone variant H2A.Z, which replaces canonical H2A around regulatory elements where it facilitates accessibility to the transcriptional machinery (Luk et al., 2010). Hβ-Hα has a very highly charged C-terminal tail with 20 lysines, and we speculate that replacement by the tailless Hz-He variant might be facilitated by lysine acetylation, which would neutralize the basic charge of the Hβ-Hα C-terminal tail and detach it from its electrostatic contact with the acidic DNA wrapping around the core.
Doublet histones are found in some archaeal clades, which led to the proposal that histone doublets made possible the differentiation from homotypic nucleosomes typical of most archaea to heterotypic nucleosomes that later evolved into the four eukaryotic core histones (Malik and Henikoff, 2003). Our evidence that Marseillevirus doublet histones are well-suited for viral packaging is consistent with the possibility that eukaryotic histones have evolved from virus-encoded histone doublets that infected a host proto-eukaryote just as present-day Marseilleviridae infect Acanthamoeba in oceans around the globe.
There has been considerable recent interest in the hypothesis that the eukaryotic nucleus evolved from a viral factory (Liu and Krupovic, 2022;Talbert and Henikoff, 2021), and if so, the as-yet unknown mechanism whereby a tightly packed viral particle transitions to a fully functional viral factory may shed light on the earliest stages of eukaryotic evolution.

Drosophila cells
Drosophila S2 cells were grown in HyClone SFX Insect Cell Culture Media (Cytiva SH30278.02) supplemented with 18 mM L-Glutamine, seeded at 2x10^6/mL three times per week, and harvested with >95% viability at mid-log phase. A total of 1x10^7 cells were centrifuged in a swinging-bucket rotor for 4 min at 700xg at 25°C and washed twice in cold 1x PBS. The cell pellet was resuspended in 1 mL TM2+PI (10 mM Tris pH 8, 2 mM MgCl2 + Protease Inhibitor, Sigma 11836170001) and chilled in ice water for 1 min. NP-40 was added to 0.5%, and the suspension was vortexed gently at half maximum speed for ~3 sec and returned to the ice water slurry. Release of nuclei was ascertained by microscopic observation of aliquots until ≥80% of cellular membranes were disrupted (~3 min). Nuclei were centrifuged 10 min at 150xg at 4°C, washed twice in 1.5 mL TM2+PI and finally resuspended in 200 µL TM2+PI. Each digestion reaction contained either 50K or 150K nuclei per timepoint.

Viral capsid opening and permeabilization
We followed the viral opening procedure described by Schrad et al. (Schrad et al., 2020). [...] To improve recovery, we included the non-ionic detergent NP-40, which is widely used for chromatin release from cells. Recovery from Marseillevirus T19 particles was vastly improved, to ~85-90% in 0.5% NP-40, using the maximum MNase digestion conditions that had resulted in mostly mononucleosomes in the previous Drosophila experiments. High yields and fragment size distributions were obtained from samples treated with 0.5% NP-40 over a time course ranging from 1 minute to 18 hours. In contrast, 0.1% of the non-ionic detergent digitonin had no effect. As NP-40 permeabilizes membranes by sequestering lipids, whereas digitonin permeabilizes cell membranes by displacing membrane sterols, our results suggest that viral chromatin is enclosed within a lipid membrane that must be permeabilized for MNase to access chromatin.
We performed digestions over a time course and concentration range that we had previously found to be sufficient for digesting chromatin from Drosophila S2 cells into mono- and oligo-nucleosomes (Chereji et al., 2019). In that study, 1 minute digestions at 2.5 U/million cells had yielded an electrophoretic 'ladder' [...] (Figure 1c), which reflects the selection for smaller fragments during end-polishing and PCR.

MPE-seq
MPE-seq was performed as described by Ishii et al. (Ishii et al., 2015). Briefly, the opened Marseillevirus and S2 nuclei were treated with hydrogen peroxide across the range of 0.001-1 mM and cleavage was induced by addition of methidiumpropyl-EDTA-Fe(II) (MPE, a generous gift from Jim Kadonaga, complexed with ammonium iron(II) sulfate) at 10 µM or 40 µM for 5 min at room temperature. The reaction was quenched with 6 mM bathophenanthroline (Sigma 133159), followed by 2xSTOP buffer (10 mM Tris, 2 mM MgCl2, 340 mM NaCl, 20 mM EDTA, 4 mM EGTA, 100 µg/mL RNase, DNase-free) at a volume equal to the sample, and total DNA was extracted as described below. For reconstituted nucleosomes, we digested with 40 µM MPE at a ratio of 1 mM H2O2/650 ng DNA for 5 min at room temperature.

DNA extraction
Sample volumes were adjusted to ~340 µL with TM2. [...] was rinsed twice with 1 mL 80% ethanol, air dried and dissolved in 30 µL 10 mM Tris pH 8.0. Two µL was analyzed using a HDS1000 ScreenTape on an Agilent 4200 TapeStation.

Proteomics
Viral and reconstituted protein extracts were resolved by 18% SDS-PAGE, and gels were silver-stained using the Pierce kit (Thermo cat. no. 24600). Protein bands were excised, destained, and proteolytically digested as described (Shevchenko et al., 1996). Proteolytic peptides were desalted and analyzed by LC-
After reaching optical density = 0.6, the cultures were induced with 0.5 mM IPTG at 37°C for 3 hr and harvested by centrifugation. Cells were resuspended in 20 mM Tris pH 7.5, 2 M NaCl, 5 mM imidazole, 1 mM β-mercaptoethanol (βme) and lysed on a cell disruptor (Avestin EmulsiFlex-C3). The extract was clarified by centrifugation, applied to Ni-NTA agarose beads (Qiagen) and eluted with 20 mM Tris pH 7.5, 2 M NaCl, 300 mM imidazole, 1 mM βme. The protein sample was further purified using a Superdex 200 26/600 size-exclusion chromatography column (GE Healthcare), and fractions were collected and concentrated to 1.8 mg/ml in 10 mM Tris pH 7.5, 2 M NaCl, 1 mM EDTA, 5 mM βme.

Purification of Widom 601 DNA array
A plasmid (3_187_widom_601) containing three tandem copies of the Widom 601 nucleosome positioning sequence, each with a 40-bp linker, flanked by EcoRV restriction sites was transformed into DH5α competent cells (ThermoFisher) and cultured in 2xYT-Ampicillin medium overnight. The 3_187_widom_601 DNA insert fragment was excised using EcoRV and purified using a previously published protocol (Dyer et al., 2004).
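The geometry of this array can be sketched with a little arithmetic. The snippet below is illustrative only: it assumes a 147-bp 601 core (the size quoted for positioned nucleosomes in the figure legends) plus the 40-bp linker stated above, and it reads the "187" in the plasmid name as the resulting repeat length, which is an inference rather than something the text states. The function name `protected_span` is hypothetical.

```python
# Assumptions (not stated explicitly in the Methods): a 147-bp Widom 601 core,
# and "187" in the plasmid name being the repeat length (core + 40-bp linker).
CORE_BP = 147
LINKER_BP = 40
REPEAT_BP = CORE_BP + LINKER_BP  # 187


def protected_span(n_nucleosomes: int) -> int:
    """Length in bp from the start of the first core to the end of the n-th core
    for nucleosomes sitting on consecutive 601 positions."""
    if n_nucleosomes < 1:
        raise ValueError("n_nucleosomes must be at least 1")
    return CORE_BP + (n_nucleosomes - 1) * REPEAT_BP


# Mono-, di- and tri-nucleosome spans on a three-copy array.
spans = [protected_span(n) for n in (1, 2, 3)]
print(spans)  # [147, 334, 521]
```

Under these assumptions the three spans coincide with the 147-, 334- and 521-bp size classes reported in the array cleavage summary.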

Reconstitution of nucleosomes on Widom 601 arrays
Xenopus laevis and Marseillevirus nucleosome array reconstitutions were performed as described (Grau et al., 2021). Briefly, nucleosome arrays were assembled by mixing purified 3_187_widom_601 DNA fragment and histone octamers, followed by overnight gradient salt dialysis in 10 mM Tris-HCl pH 7.5, 1 mM EDTA, 1 mM dithiothreitol (DTT) and 2 M to 0.25 M KCl using a peristaltic pump. Nucleosome arrays were dialyzed into TCS-50 buffer (20 mM Tris-HCl pH 7.5, 50 mM NaCl, 1 mM EDTA, 1 mM DTT), concentrated, and stored at 4°C until use. Octamer/DNA ratios were optimized using small-scale reactions, and products were verified by native PAGE. The ratio used for Marseillevirus tetramer:DNA was 3.1, and for X. laevis octamer:DNA was 3.9.

Data processing and analysis
Barcoded libraries were mixed to achieve equimolar representation as desired

and MPE, with partial accessibility preferentially but not exclusively where the wrap shifts from one nucleosome to the next (red arrows). Double-strand cleavages on neighboring nucleosomes release the particle in between, leaving unwrapped ends. Because of its exonuclease activity, MNase "nibbles" on the ends down to 121 bp, and when dinucleosomes are released, nibbling results in two adjacent 121-bp particles, and so on. Cleavages within adjacent particles separated by a single DNA wrap (not shown) will release fragments of variable size averaging ~70 bp. Upon nucleosome release by MPE, the unwrapped DNA ends immediately snap onto the exposed basic surface of the histone core, resulting in a broad distribution of fragment sizes with a peak at ~147 bp. (b) Percentages of precisely positioned 147-bp nucleosomes (%147 bp) and cleavages (%147+334+521 bp).
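As a rough illustration of how size-class percentages such as %147 bp and %147+334+521 bp could be tallied from a list of sequenced fragment lengths, consider the sketch below. It assumes exact length matches and ignores fragment position, which may differ from the criteria actually used for the figure; the function name `percent_in_sizes` and the toy lengths are invented for this example and are not data from the study.

```python
from collections import Counter


def percent_in_sizes(fragment_lengths, sizes):
    """Percentage of fragments whose length exactly equals one of `sizes`.

    Exact matching is an assumption of this sketch; a tolerance window or a
    positional criterion could be substituted.
    """
    if not fragment_lengths:
        return 0.0
    counts = Counter(fragment_lengths)
    hits = sum(counts[size] for size in set(sizes))
    return 100.0 * hits / len(fragment_lengths)


# Toy fragment lengths in bp (made up for illustration only).
toy_lengths = [147, 147, 121, 334, 150, 521, 147, 242, 146, 334]

pct_147 = percent_in_sizes(toy_lengths, [147])
pct_array = percent_in_sizes(toy_lengths, [147, 334, 521])
print(pct_147, pct_array)  # 30.0 60.0
```

In practice the lengths would come from paired-end alignments rather than a hand-written list.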

Supplementary Figure 4: Preferential cleavage between A/T base pairs in Marseillevirus and Drosophila. MNase is an endo-exonuclease that cleaves preferentially at A/T-rich DNA then 'nibbles' on ends until it reaches a G/C-rich 'clamp', as observed in our MNase-seq data for both Marseillevirus and Drosophila, whose genomes are both 44-45% G+C overall.

Figure 6: Autocorrelation illustrates periodicities over individual Marseillevirus gene bodies. To sensitively detect genes most likely to be phased, we plotted autocorrelations over the 600-bp span of each ORF for the 101-200 bp fragment size class using MNase-seq and MPE-seq data. Autocorrelations (-1 to +1) in 1-bp lag steps over a 5'-aligned 300-bp span are plotted for each of the 10 ORFs ≥600 bp with the highest and lowest variance in amplitude for each size class for (a) MNase-seq and (b) MPE-seq.
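The autocorrelation analysis described in this legend can be sketched as follows. This is a minimal illustration using the standard sample autocorrelation estimator; the software and normalization actually used for the figure are not given here, so the function `autocorrelation`, the 600-bp window and the synthetic 121-bp periodic signal are all assumptions chosen for the example.

```python
import math


def autocorrelation(signal, max_lag):
    """Sample autocorrelation of `signal` at lags 0..max_lag in 1-bp steps.

    Values lie between -1 and +1; lag 0 is 1 for any non-constant signal.
    """
    n = len(signal)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the signal length")
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    total = sum(c * c for c in centered)
    if total == 0:
        return [0.0] * (max_lag + 1)  # constant signal: no periodicity
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / total
        for lag in range(max_lag + 1)
    ]


# Synthetic per-base occupancy over a 600-bp ORF with a 121-bp period
# (toy signal for illustration, not data from the study).
toy_signal = [math.cos(2 * math.pi * i / 121) for i in range(600)]
acf = autocorrelation(toy_signal, 300)

# A phased array shows a positive peak near the repeat length and a trough
# near half of it; an unphased gene would give values close to zero.
print(round(acf[0], 3), acf[121] > 0, acf[60] < 0)
```

Ranking ORFs by the variance of such a curve, as the legend describes, would separate genes with strong periodicity from those with essentially flat autocorrelation.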