Abstract
Cryo electron tomography with subsequent subtomogram averaging is a powerful technique to structurally analyze macromolecular complexes in their native context. Although close-to-atomic resolution can, in principle, be obtained, it is not clear how individual experimental parameters contribute to the attainable resolution. Here, we have used the immature HIV-1 lattice as a benchmarking sample to optimize the attainable resolution for subtomogram averaging. We systematically tested various experimental parameters such as the order of projections, different angular increments and the use of the Volta phase plate. We find that although any of the prominently used acquisition schemes is sufficient to obtain subnanometer resolution, dose-symmetric acquisition provides a considerably better outcome. We discuss our findings in order to provide guidance for data acquisition. Our data are publicly available at EMPIAR-10277 as well as EMD-10207 and might be used to further develop processing routines.
Introduction
Cryo electron tomography (CryoET) is a powerful imaging technique to structurally analyze pleomorphic biological objects such as cells, organelles and subcellular architecture [1] [2]. In combination with subtomogram averaging (SA) structures of repetitive objects within such tomograms, such as e.g. macromolecular complexes, can be resolved [3] [4]. In principle close to atomic resolution can be obtained. In practice, however, although this technique is being used by many laboratories, the vast majority of structures are not resolved into the subnanometer regime. The biological properties of the object of interest are a prerequisite for obtaining high resolution. The most important of those properties are (i) specimen thickness, which is particularly critical for larger biological objects because it limits the attainable signal to noise ratio (SNR) at a given dose [5]. (ii) The abundance of the structure of interest within the pleomorphic objects that determines the number of repetitive subtomograms that can be obtained. (iii) The consistency of the structure across the repetitive objects, namely low structural dynamics. And (iv) the structural preservation after embedding into vitrified ice [6].
Not only these biological properties but also technical parameters limit the attainable resolution. Unlike image acquisition for single particle analysis (SPA), tomographic data collection requires the specimen to be imaged at different tilt angles. This results in a number of complications that must be considered prior to the image acquisition. The total electron dose has to be distributed among the acquired projections leading to lower SNR when compared to SPA projections. The SNR decreases even more at high tilt angles due to increased effective thickness of the sample. Moreover, the continued exposure results in an accumulation of dose and consequently the gradual deterioration of the specimen. As such, the information content decreases with projection number whereby high-resolution information is lost at first [7]. In order to obtain the best possible resolution during the subsequent subtomogram averaging, one has to optimize the tilt range and the angular increment, thus defining the number of projections and the order in which they are acquired. Jointly, these parameters are referred to as a ‘tilt-scheme’. Several previous studies have discussed how to choose the angular increment in order to obtain the best possible sampling of tomographic reconstructions in Fourier space [8] [9]. The deductions from these studies are however not directly transferable to SA. In SA, the sampling of Fourier space is a result of averaging many subtomograms with different orientations within the tomogram of origin. Therefore, increasing the number of subtomograms (i.e. acquiring more tomograms) should be more important than uniform sampling of high-frequencies on the individual tomogram level.
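The thickness effect mentioned above can be illustrated with a simple geometric estimate. The following is a minimal sketch that assumes an idealized plane-parallel slab, for which the path length of the beam grows with the inverse cosine of the tilt angle; the function name and the example thickness of 200 nm are chosen for illustration only.

```python
import math

def effective_thickness(thickness_nm: float, tilt_deg: float) -> float:
    """Beam path length through a flat slab tilted by tilt_deg (idealized geometry)."""
    return thickness_nm / math.cos(math.radians(tilt_deg))

# Path length at a few tilt angles for a 200 nm slab (illustrative value).
for tilt in (0, 30, 60):
    print(tilt, round(effective_thickness(200.0, tilt), 1))
```

Under this assumption the effective thickness doubles at 60 degrees, which is why the high-tilt projections carry the lowest SNR.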
Also for the order in which the projections are acquired, different tilt-schemes have been proposed. Traditionally, continuous acquisition schemes have been used. Here, the projections are collected by tilting strictly into one direction from a minimum tilt angle to the maximum tilt angle. The advantages of this scheme are the minimal mechanical interference during tilting and the relatively rapid data collection. However, the projections acquired at first and at the lowest accumulated dose, have a low SNR as they are collected at high tilts with large effective specimen thickness. One would predict that this caveat leads to a poor preservation of high-resolution information within the entire tomogram, although the impact of which has to the best of our knowledge not yet been systematically tested. To better deal with the trade-off of effective specimen thickness and accumulated dose, alternative schemes have been introduced. The bidirectional scheme starts at 0 degrees and first proceeds towards the minimum tilt angle. Subsequently, it returns to 0 degrees in order to continue to collect in positive direction until the maximum angle. This way the least dose-exposed projections are acquired where the effective specimen thickness is minimal, albeit in only one direction, which leads to better preservation of high-resolution information. The disadvantage of this approach is the difference in projection quality and resemblance between the first and the second half of the tilt-series, because the latter is only acquired after the specimen has already been exposed with half of the total dose. This can complicate the subsequent processing of the projections, especially in terms of tilt-series alignment [10]. 
To avoid any sharp decline of information content between adjacent projections, and in order to preserve as much high-resolution information as possible, the electron dose should be systematically accumulated from lower to higher tilt angles, and as such distributed symmetrically in both directions. The respective dose-symmetric tilt-scheme has been coined the ‘Hagen scheme’ [11]. It starts the acquisition at 0 degrees and then alternates positive and negative tilt angles until it reaches the specified range. In this way, the first projections containing the best-preserved high-resolution information, are acquired at low tilts and thus with the best possible SNR. In comparison to the aforementioned dose-asymmetric schemes, the dose-symmetric scheme requires more acquisition time. How these different tilt-schemes affect the attainable resolution of subtomogram averaging has not yet been systematically tested.
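The acquisition order of the three tilt-schemes described above can be sketched as follows. This is a simplified illustration and not the acquisition script used on the microscope: the function names are hypothetical, a constant angular increment is assumed, and the dose-symmetric order is written as a strict alternation of positive and negative angles, whereas practical implementations may group tilts differently to reduce stage movements.

```python
def continuous(max_tilt=60, step=3):
    """Tilt strictly in one direction, from the minimum to the maximum angle."""
    return list(range(-max_tilt, max_tilt + 1, step))

def bidirectional(max_tilt=60, step=3):
    """Start at 0, go to the minimum angle, then continue from +step to the maximum."""
    negative_branch = list(range(0, -max_tilt - 1, -step))   # 0, -3, ..., -60
    positive_branch = list(range(step, max_tilt + 1, step))  # 3, 6, ..., 60
    return negative_branch + positive_branch

def dose_symmetric(max_tilt=60, step=3):
    """Start at 0 and alternate positive and negative angles of growing magnitude."""
    order = [0]
    for tilt in range(step, max_tilt + 1, step):
        order += [tilt, -tilt]
    return order
```

All three functions return the same set of angles and differ only in the order, and therefore in the accumulated dose at which each angle is recorded.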
Tilt-series are generally collected out of focus to generate phase contrast that facilitates particle detection but also leads to the signal modulation described by the contrast transfer function (CTF). CTF correction is required to properly interpret high-resolution structural features. The quality of the correction depends on the precision with which one is able to estimate the defocus for each projection. The high-tilt projections with rather low SNR are typically more difficult to correct, which is another argument for dose-symmetric acquisition schemes. Alternatively, the Volta phase plate (VPP) allows contrast-rich imaging in focus without the need for CTF correction [12]. If a defocus is applied or observed because parts of the tilted projections are above or below the focal plane, both defocus and phase-shift need to be determined prior to the CTF correction. Whether VPP projections are compatible with high-resolution SA has not yet been systematically tested.
As both the biological properties of a sample and the choice of acquisition parameters play a key role in the attainable resolution, it is difficult to assess in practice why structural analysis by SA is limited to a given resolution. Thus far, not many structures with subnanometer resolution have been obtained by SA and only 7 of those have reached a resolution below 5 Å (as of July 2019). The first one to breach the 5 Å barrier was a structure of the immature HIV-1 CA-SP1 lattice assembled in the presence of the maturation inhibitor Bevirimat (BVM), which was resolved to 3.9 Å [13]. The purified HIV-1 derived protein ΔMACANCSP2 forms virus-like particles (VLPs) in vitro, which exhibit a lattice identical to that of the immature HIV-1 capsid. These VLPs are well suited for SA. The specimen scores high on each of the four above-introduced biological parameters and thus represents an excellent object for the technical benchmarking of acquisition and processing routines. The particles have a diameter of 120 nm and are usually embedded in ice that is around 200 nm thick. The VLPs contain a large copy number of the lattice-forming protein and the CA-SP1 layer of the protein forms a locally ordered shell with C6 symmetry. In the study reporting the 3.9 Å resolution, the dose-symmetric scheme was used for the data collection [13], and it has been assumed that this scheme was critical for achieving the high resolution. Accordingly, it has been routinely used for samples with high resolution potential and currently all structures resolved below 5 Å were collected using this scheme. However, no systematic benchmarking study has been performed to assess the advantage of the dose-symmetric scheme over the other tilt-schemes. Neither have angular increment variations or the VPP been systematically tested in combination with dose-symmetric acquisition.
Here, we use the immature HIV-1 lattice as a benchmarking object to systematically study the effect of different acquisition parameters on the resolution attainable by SA. We compare continuous, bidirectional and dose-symmetric schemes, each with a constant 3 degree angular increment; dose-symmetric schemes with increasing and decreasing angular increment; and dose-symmetric schemes without and with VPP, both in focus and with defocus. We found that although each of the schemes is suitable to obtain subnanometer resolution, the dose-symmetric scheme is indeed the most efficient data collection strategy for obtaining higher resolution that might even be sufficient to build atomic models de novo.
Results
Optimal Image Acquisition comes at the cost of throughput
We chose the in vitro assembled immature HIV-1 lattice in the absence of BVM as a benchmarking sample, which was originally resolved to 4.5 Å (EMD-4016, [13]). We acquired 20-30 tilt-series using 7 different acquisition schemes, namely the (i) continuous, (ii) bidirectional and (iii) dose-symmetric (DS) schemes with even angular increment. To assess the importance of additional acquisition parameters, we further varied the dose-symmetric scheme with (iv) decreasing (DS dec) and (v) increasing (DS inc) angular increment as well as with VPP correction both (vi) in focus (DS VPP foc) and (vii) with defocus (DS VPP def). The zero-tilt projections together with their periodograms with fitted CTF model from CTFFind4 [14] are shown in Figure 1. The plots indicate successful CTF fitting and already show the reduced high-resolution information content at zero degrees in the case of the continuous scheme. The VPP projections have high contrast and show the characteristic features in the respective power spectra.
Depending on the specimen, the number of tomograms that can be acquired in a given time frame might be yet another important acquisition parameter because it influences the number of particles in the dataset. The practically achieved average acquisition time of one tilt-series with 41 projections is shown for each tilt-scheme in Table 2. The continuous scheme is about twice as fast as the dose-symmetric scheme with VPP in focus. However, the continuous scheme suffers on average from a 30% loss of the field of view; i.e. the position initially selected for acquisition overlaps with the projection acquired at zero-degree tilt by only 70%. This might be a disadvantage especially for specimens of limited availability, with fewer particles or fiducials. In the case of all other acquisition schemes, similar average acquisition times of ~0.5 h per tilt-series were observed. Secondary parameters such as the number of required focusing or image tracking iterations might have influenced this observation.
A Comparative Benchmarking Workflow
All datasets were subjected to a consistent subtomogram averaging workflow including 3D-CTF correction [15], with some deviations that take into account their different nature, i.e. no CTF correction was applied to the VPP dataset acquired in focus (see M&M for detail). Since individual tomograms might still differ even in critical properties such as specimen thickness, we implemented a workflow that allows selecting the objectively 5 best tomograms for each scheme, which were then used for benchmarking. Briefly, it uses a multiple sampling approach to find the ideal sub-dataset constellation by optimizing the SA resolution (see M&M for detail). To account for variations in VLP content per tomogram, the number of subtomograms contributing to the structural analysis from the 5 selected tomograms was set to ~15,000. For a detailed overview of the parameters and software (SW) used in each step, see Supplementary Table 2.
Dose-symmetric acquisition is superior already at small data set sizes
We aligned each of the structures in multiple iterative rounds of SA (see M&M). Since the CA-SP1 is C6 symmetric, we used C1, C2, C3 and C6 symmetry alignment to systematically assess how dataset size impacts the attainable resolution. A matrix with the final resolution achieved vs. symmetry is shown in Figure 2. The overall best resolution of 4.2 Å was obtained with the dose-symmetric scheme with constant angular increment using C6 symmetry (see Figure 2 and Supplementary Figure 1). This was measured rather conservatively, with gold standard FSC computed by averaging 5 phase-randomized FSC curves [16]. FSC calculation of our averages against the previously deposited reference structure (EMD-3782) of 3.9 Å resulted in a resolution estimate of 4.4 Å (see Figure 2C).
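For readers less familiar with the measure, the Fourier shell correlation between two half-maps can be sketched in a few lines. This is a minimal NumPy illustration that assumes two cubic volumes of equal size and integer-radius shells; it does not include masking or the phase-randomization procedure of [16], and the function name is hypothetical.

```python
import numpy as np

def fourier_shell_correlation(half_a, half_b):
    """FSC per integer-radius shell for two cubic volumes of equal size."""
    n = half_a.shape[0]
    fa = np.fft.fftn(half_a)
    fb = np.fft.fftn(half_b)
    # Integer frequency indices along each axis (fftfreq with d=1/n returns k itself).
    idx = np.fft.fftfreq(n, d=1.0 / n)
    kx, ky, kz = np.meshgrid(idx, idx, idx, indexing="ij")
    shell = np.rint(np.sqrt(kx**2 + ky**2 + kz**2)).astype(int).ravel()
    n_shells = n // 2
    keep = shell < n_shells  # ignore Fourier-space corners beyond Nyquist

    def shell_sum(values):
        return np.bincount(shell[keep], weights=values.ravel()[keep], minlength=n_shells)

    cross = shell_sum((fa * np.conj(fb)).real)
    power = shell_sum(np.abs(fa) ** 2) * shell_sum(np.abs(fb) ** 2)
    return cross / np.sqrt(power)
```

In such a sketch, shell index s of an n-voxel box with pixel size p corresponds to a resolution of n·p/s, and the reported resolution is the value at which the curve drops below the chosen criterion (0.5 or 0.143).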
Although any of the tested schemes was sufficient to achieve subnanometer resolution, there are considerable differences. While the bidirectional scheme also led to a resolution below 5 Å, the continuous scheme achieved only 7.0 Å – the worst resolution amongst all schemes. Interestingly, the dose-symmetric scheme performs very well already at smaller dataset size. The achieved resolution almost plateaus already at C2 symmetry analysis, while in case of the other schemes it more gradually increases towards C6 symmetry (Figure 2). In case of the continuous scheme, only a very minor increase in resolution is observed. This is further underscored by B-factor analysis (see M&M for details). At FSC 0.5 criterion the resolution increases nearly linearly with the logarithm of the number of particles for all schemes except for the continuous one which starts to flatten already at ~3000 particles (Figure 2D). This becomes even more apparent at 0.143 criterion (Figure 2E). While the resolution of the dose-symmetric schemes without VPP still increases almost linearly, the other schemes plateau at ~8000 particles.
In case of the dose-symmetric scheme, we can assess the impact of the angular increment. Although a decreasing angular increment might be beneficial for the resolution of the tomographic datasets [17], these previous considerations were not intended for SA, where the final averages are sampled differently than the initial tomograms. Alternatively, one could argue that an increasing angular increment will distribute less dose towards the high-tilt and large-thickness projections and thus might be superior. At last, uniform angular sampling might be beneficial during the averaging procedure because it simplifies weighting of the angular sampling. We empirically found that indeed the latter is more important. The dose-symmetric tilt-series with varying angular increments resulted in worse resolution than the tilt-series with the constant angular increment. This finding suggests that the increased sampling of high-frequencies on the tomogram level is less important for high-resolution SA than uniform sampling of angles.
Overall, those results are consistent with the observed structural features of respective averages as shown in Figure 3. The structure obtained from the dose-symmetric tilt-series recovers even more high-resolution features than the equivalent 4.5 Å structure (EMD-4016) from [13] while the 4.8 Å structure corresponding to the bidirectional tilt-series is slightly worse. In all cases, large side chains are very clearly observed. In case of the continuous scheme, even the helical pitch is not discernible very clearly.
Datasets with VPP are challenging to correct for spatial frequency weighting
FSC analysis of the VPP data suggests an overall relatively good performance, similar to the bidirectional scheme (Figure 2). This is particularly remarkable for the dataset acquired in focus, because it had not been CTF corrected. CTF correction is not possible in this case, because the actual function is rather featureless in focus and cannot be reliably fitted. However, 3D-CTF correction is still important for high-resolution SA because it compensates for defocus variations resulting from the different positions of the individual particles.
The visual inspection of the respective structures does not credibly support the estimated resolution (Figure 4). The typical high-resolution features are not observed, suggesting inaccurate spatial frequency weighting. A variation of averaging parameters such as high-pass filtering or sharpening with different arbitrarily chosen B-factors did not recover the respective structural features (not shown). Amplitude matching using the 4.2 Å structure from the dose-symmetric scheme as a reference only partially resolved these issues (see Supplementary Figure 2). We conclude that although the respective high-resolution information might be contained in the average, it is non-trivial to recover it de novo. The defocused VPP dataset, although 3D-CTF corrected, suffers from the same problem. One might thus speculate that high-pass filtering at the SA level is insufficient and different filters might be rather used already during the tomogram reconstruction in order to suppress the very pronounced low frequencies.
Tomogram Alignment Accuracy Impacts On the Attainable Resolution
The importance of the accuracy of the alignment of the projections for SA has often been argued but, to the best of our knowledge, has not yet been systematically quantified. To test the influence of the tilt-series alignment precision on the attainable resolution for the given benchmarking dataset, we introduced errors to the fiducial-based alignment models by artificially adding shifts in a random direction in silico (only for the dose-symmetric scheme). We reconstructed the respective tomograms and proceeded with the SA workflow starting with 4x binned tomograms (assuming that the errors would not have a significant impact on 8x binned data) with C1 and C6 symmetry. The results are summarized in Table 3. While an error of 0.5 pixels is negligible for both C1 and C6 symmetry, a displacement by 2.0 pixels has a more significant impact, worsening the resolution by 0.9 Å and 1.4 Å, respectively. This is in line with the residual error and its standard deviation reported by eTomo [18] during the alignment routine: a shift by 0.5 pixels does not significantly increase the residual error or the standard deviation, while a shift by 2.0 pixels increases the residual error by a factor of 2.
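The kind of perturbation described above can be sketched as follows. This illustration assumes that each projection receives a displacement of fixed length (e.g. 0.5 or 2.0 pixels) in an independently drawn random in-plane direction; the exact way the shifts were applied to the alignment models is not specified here, so the function and its arguments are hypothetical.

```python
import numpy as np

def add_random_shifts(shifts_xy, magnitude_px, seed=0):
    """Add a displacement of fixed length and random direction to each (x, y) shift."""
    rng = np.random.default_rng(seed)
    shifts_xy = np.asarray(shifts_xy, dtype=float)
    # One random in-plane direction per projection.
    angles = rng.uniform(0.0, 2.0 * np.pi, size=len(shifts_xy))
    offsets = magnitude_px * np.column_stack((np.cos(angles), np.sin(angles)))
    return shifts_xy + offsets

# Example: perturb the alignment shifts of a 41-projection tilt-series by 2.0 pixels.
perturbed = add_random_shifts(np.zeros((41, 2)), magnitude_px=2.0)
```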
Discussion and Conclusions
During the last decade, cryoET has gained enormous momentum and has become an important method to structurally analyze macromolecules in their native context. However, the question of how to optimally acquire the data has not been addressed systematically. Here, we have systematically compared different tomographic tilt-schemes in order to lay down a path towards high resolution in SA. Under the experimental conditions we chose for our benchmarking study, the dose-symmetric scheme with the constant angular increment outperformed all other tested schemes in terms of ultimately obtained resolution. While the bidirectional scheme provides a reasonable alternative in terms of acquisition time to resolution ratio, the continuous scheme has clear limitations. Despite superior acquisition speed, our results clearly suggest that even with twice the number of particles the resolution does not further improve beyond the 7 Å regime. Variations of the angular increment were not beneficial either. However, the question of the optimal angular increment (together with non-constant dose distribution within the tilt-series) was not addressed in this study and most likely will be sample dependent. Although the differences in the final resolution attained might not seem tremendous, they can be of critical importance if a structure is determined de novo.
The acquisition and analysis of VPP datasets come with additional challenges, such as VPP conditioning, stability, increased acquisition time as well as phase-shift and defocus determination, heavily oversampled low frequencies and others. As far as we can see, there is no clearly defined way to recover the high-resolution features for a given structure de novo, and even where features could be partially recovered, the resolution was comparably lower. The better contrast however might be beneficial for the identification of particles in cases where high resolution is not required. The high contrast of VPP imaging is highly beneficial for cellular, biological and ultrastructural investigations; however, further work is required to unlock its full potential for SA.
The 3D maps of final structures with C6 symmetry as well as their corresponding half-maps (both raw and CTF-reweighted) are publicly available at EMDB (EMD-10207) and the raw tilt-series are available at EMPIAR (EMPIAR-10277) and can be used to further develop and/or benchmark processing routines for SA. In addition to the presented data the EMPIAR deposition also contains 8 tilt-series acquired at regions without any gold fiducials. We hope these tilt-series will be used to test and improve current fiducial-less alignment techniques.
We believe that our conclusions are generic for projects where particle number, specimen thickness and available fiducials are not limiting. To what extent they are applicable to thicker or fiducial-less specimens, such as those obtained during FIB-SEM projects, remains to be tested in the future.
Methods
Sample preparation
The sample of HIV-1 ΔMACANCSP2 VLPs was prepared as described in [13]. Degassed 2/1-3C C-flat grids were glow discharged for 45 seconds at 20 mA. The VLP solution was diluted with 10 nm colloidal gold in VLP sample buffer, and 2.5 μl of the solution was applied to the grids, which were plunge frozen in liquid ethane using an FEI Vitrobot Mark III at a temperature of 15°C and a relative humidity of ~90% (blotting time 1.0 s).
Image Acquisition
To minimize biological variations of the sample, all datasets were collected on the same grid. All datasets were collected on FEI Titan Krios TEM at 300 keV, with Gatan Quantum K2xp direct electron detector using LS energy filter with slit width of 20 eV. Projections were acquired using SerialEM SW [19] as 8K x 8K super-resolution movies of 10-20 frames at the magnification of 105,000x which corresponds to 4K pixel size of 1.33 Å. The frames were aligned using MotionCorr [20]. For all datasets the tilt-range was ±60° with 41 projections per tilt-series and target total dose of ~140 e/Å2 (corresponds to an incident dose of ~3.5 e/Å2 per projection). The overview of parameters that differ among the schemes is shown in Table 1. The continuous scheme was collected using tiltcontroller function in SerialEM using parameters shown in Supplementary Table 2. All other tested schemes were collected using drift measurements and backlash as described in [11]. We collected 20-30 tilt-series for each scheme.
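The dose bookkeeping implied by these settings can be made explicit with a short calculation. The sketch below assumes that the target total dose of ~140 e/Å2 is split evenly over the 41 projections and uses simplified tilt orders (a strict alternation for the dose-symmetric scheme); the names are hypothetical and the numbers are nominal values rather than measured doses.

```python
TOTAL_DOSE = 140.0        # e/A^2, target total dose quoted in the text
N_PROJECTIONS = 41
STEP, MAX_TILT = 3, 60    # degrees

dose_per_projection = TOTAL_DOSE / N_PROJECTIONS  # close to the quoted ~3.5 e/A^2

# Simplified acquisition orders (assumptions for illustration).
continuous = list(range(-MAX_TILT, MAX_TILT + 1, STEP))
dose_symmetric = [0] + [sign * t
                        for t in range(STEP, MAX_TILT + 1, STEP)
                        for sign in (1, -1)]

def dose_after(order, tilt):
    """Accumulated dose once the projection at `tilt` has been recorded."""
    return (order.index(tilt) + 1) * dose_per_projection

print(round(dose_after(continuous, 0), 1))      # zero-tilt image, continuous scheme
print(round(dose_after(dose_symmetric, 0), 1))  # zero-tilt image, dose-symmetric scheme
```

Under these assumptions the zero-tilt projection of the continuous scheme is recorded only after roughly half of the total dose has been delivered, whereas in the dose-symmetric scheme it is the first exposure.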
Image Processing
1. Initial pre-processing
For all tilt-series, we performed CTF estimation using CTFFind4 and corrected for dose-exposure as described in [21] using a Matlab implementation that was adapted for tomographic tilt-series [22]. Tilt-series that contained one or more inadequate projections (i.e. not properly tracked or with failed CTF estimation) were discarded. For the following steps eTomo [18] was used. The pixels with outlier intensities were removed and a preliminary alignment was computed based on cross-correlation (CC). The automatic seeding procedure was used to find the gold fiducials for alignment and the seeding model was manually corrected such that it contained only fiducials that are present in the field of view in all projections (on average 4 to 5 fiducials per tilt-series fulfilled this constraint). The tilt-series with fewer than 3 fiducials were eliminated from further processing. The fiducials were automatically tracked and in cases where tracking failed the model was corrected manually. The fiducial centers were manually refined prior to the final alignment. Tomograms were reconstructed 8x binned using a SIRT-like filter (except for the DS VPP foc and DS VPP def datasets, as their contrast was sufficient using radial filtering). The tomograms were used to position the center of mass into the center of the tomogram along the z axis as well as to assess tomogram thickness and the quality of the alignment; all tilt-series where the fiducials showed strong movement in the tomograms were removed from further processing. From the remaining tilt-series, the most suited 8-10 tilt-series per dataset were chosen for further processing based on the alignment residuals, defocus range and specimen thickness.
2. Tomogram reconstruction
Tomograms were reconstructed with 3D-CTF correction using novaCTF [15]. Multiplication was used as the correction method, with 15 nm slab size and astigmatism correction. The DS VPP foc dataset was also reconstructed using novaCTF with the CTF-correction turned off. To ensure accurate phase-shift estimation, the DS VPP def tomograms were reconstructed both with and without 3D-CTF correction. The uncorrected tomograms were used until step 5. Tomograms were subsequently binned 2x, 4x, and 8x using Fourier cropping.
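Binning by Fourier cropping, as mentioned above, can be illustrated with a short NumPy sketch. It assumes a cubic volume and an integer binning factor and is not the implementation used in this workflow; the function name is hypothetical and edge cases such as the treatment of the Nyquist frequency are handled only approximately.

```python
import numpy as np

def fourier_crop(volume, factor):
    """Downsample a cubic volume by cropping the central part of its Fourier transform."""
    n = volume.shape[0]
    m = n // factor
    # Move the zero frequency to the center of the array (index n // 2).
    spectrum = np.fft.fftshift(np.fft.fftn(volume))
    # Keep the central m^3 block so that the zero frequency ends up at index m // 2.
    start = n // 2 - m // 2
    cropped = spectrum[start:start + m, start:start + m, start:start + m]
    binned = np.fft.ifftn(np.fft.ifftshift(cropped)).real
    # Rescale so that the mean density of the volume is preserved.
    return binned * (m / n) ** 3
```

Compared with averaging neighboring voxels in real space, cropping in Fourier space discards the frequencies above the new Nyquist limit without attenuating the retained ones.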
3. Particle picking
Similar to [13], the centers of the VLPs were picked manually and their spherical shape was used to generate initial positions and orientations on the lattice [23]. The lattice was oversampled, i.e. on average 10x more positions were created than assumed number of subunits. The center picking was done in IMOD on the 8x binned tomograms from step 1, i.e. reconstructed using SIRT-like filter. These tomograms were used only to generate list of positions, for SA itself the tomograms reconstructed using novaCTF, as described in step 2, were used. The particles were picked not only from perfectly preserved VLPs (or VLPs that were fully in the field of view), but also from the incomplete VLPs. The precision of the center picking is not crucial for the quality of the final structure - already in the first two iterations of alignment, the initial positions shift to the lattice.
4. Reference creation
For each dataset one tomogram was chosen (typically the one with the lowest defocus and thus strong low-frequency information) that was used to create the initial reference. For DS VPP foc dataset a reference was created from each tomogram and the one visually closest to the references from other datasets was chosen. Twenty iterations of alignment were run to obtain a reference for each dataset. All starting references were shifted and rotated to have the same position and orientation within the box to facilitate further processing (e.g. same masks could be used for all the datasets) as well as structural analysis.
5. Subtomogram averaging
Two iterations of alignment were run on particles from 8x binned tomograms using the references obtained in the previous step. At this stage misaligned particles were discarded. This was done fully automatically, using ellipsoid fitting and removing particles that deviated above the standard deviation either in angle or in radius. So-called distance cleaning was performed - particles that shifted to the same position were also discarded (the criterion for choosing the better particle was angular distance based on the ellipsoid fitting). Approximately 8% of particles were left for each dataset. The subsequent SA workflow exactly followed the protocol from [13].
For DS VPP def dataset this step was still done using particles from the tomograms without 3D-CTF correction and the final positions and orientations were subsequently used to generate an average using particles from the 3D-CTF corrected tomograms. The improvement in resolution w.r.t. the uncorrected structure confirmed an accurate phase-shift estimation and the corrected tomograms were thus used for all subsequent processing steps.
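The distance cleaning used in step 5 can be sketched as a greedy selection. This is a simplified illustration, not the code used here: it assumes that each particle has a position and an angular deviation from the ellipsoid fit (smaller is better), and the distance threshold `min_distance` is a hypothetical parameter.

```python
import numpy as np

def distance_clean(positions, angular_deviation, min_distance):
    """Keep at most one particle per location, preferring small angular deviation."""
    positions = np.asarray(positions, dtype=float)
    order = np.argsort(angular_deviation)  # best particles are considered first
    kept = []
    for idx in order:
        # Accept a particle only if no already accepted particle is too close.
        if all(np.linalg.norm(positions[idx] - positions[k]) >= min_distance
               for k in kept):
            kept.append(int(idx))
    return sorted(kept)

# Example: the first two particles converged to nearly the same position.
kept = distance_clean([[0, 0, 0], [0.5, 0, 0], [10, 0, 0]],
                      angular_deviation=[5.0, 1.0, 2.0], min_distance=2.0)
```

The pairwise loop is quadratic in the number of particles; for large datasets a spatial index would be the more practical choice.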
6. Selection of 5 best tomograms from each dataset
For each dataset, all possible combinations of 5 tomograms were generated and an average structure for each of the combinations was computed using the orientations and positions from the final alignment of unbinned particles. Each tomogram within the dataset contributed the same number of particles (particles were randomly removed from each tomogram to match the tomogram with the smallest number of particles). The resolution at the FSC 0.143 criterion was computed and the subset with the best resolution for each dataset was chosen for further processing.
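The exhaustive subset search of step 6 can be sketched as follows. With 8 to 10 tomograms per dataset this amounts to between 56 and 252 combinations. The sketch is illustrative only: `resolution_of` stands for a hypothetical user-supplied function that averages the given particles and returns the resolution at the 0.143 criterion, and the particle lists are placeholders for whatever particle records are in use.

```python
import random
from itertools import combinations

def equalize(particles_per_tomogram, seed=0):
    """Randomly reduce every tomogram to the particle count of the smallest one."""
    rng = random.Random(seed)
    n_min = min(len(p) for p in particles_per_tomogram.values())
    return {name: rng.sample(list(p), n_min)
            for name, p in particles_per_tomogram.items()}

def best_subset(particles_per_tomogram, resolution_of, subset_size=5):
    """Return the combination of tomograms whose average has the best (smallest) resolution."""
    balanced = equalize(particles_per_tomogram)
    candidates = combinations(sorted(balanced), subset_size)
    return min(candidates,
               key=lambda names: resolution_of([balanced[n] for n in names]))
```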
7. Reconstruction of 5 tomograms subsets
The final positions from the SA alignment (step 5) were used to compute the center of mass for each tomogram and all tomograms from the chosen subsets were reconstructed using novaCTF with the refined defocus shift. For DS VPP foc dataset this step was omitted.
8. SA workflow of 5 tomograms subsets
For each dataset the step 4 was repeated, creating a reference using one of the tomograms from the subset. Two iterations of alignment were run on particles from 8x binned tomograms followed by ellipsoid-based removal of misaligned particles and distance cleaning. All VLPs with more than 50% of particles removed during the ellipsoid-based cleaning were discarded from further processing. From the remaining particles a random selection was removed and the alignment continued with ~15000 particles. The subsequent SA workflow at lower binning exactly followed the protocol from [13]. For unbinned particles 4 iterations of alignment were run (see Supplementary Table 1).
9. Testing the influence of the number of particles
Two approaches were used to assess the influence of the number of particles on the final structure and attainable resolution. First, we exploited the symmetry of the structure. Step 8 was repeated for each dataset using C1, C2 and C3 symmetry, effectively reducing the number of particles 6x, 3x and 2x, respectively. Second, we used B-factor analysis as proposed in [24]. For each dataset, 3 logarithmically smaller subsets of particles were randomly selected from the final set of particles (i.e. 1100, 2980 and 8100). For each of the subsets, 3 iterations of alignment were run on unbinned data using the positions and orientations obtained in step 8 as a starting point. This analysis was done using C6 symmetry.
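The subset sizes and the fit behind a B-factor analysis can be sketched as follows. Two assumptions are made here and should be checked against [24]: first, that the subset sizes correspond to integer steps of the natural logarithm (e^7, e^8 and e^9 are close to the quoted 1100, 2980 and 8100); second, that the B-factor is taken as twice the slope of ln(N) plotted against 1/d², where N is the number of particles and d the resolution in Å. The function names are hypothetical.

```python
import numpy as np

def log_spaced_subset_sizes(exponents=(7, 8, 9)):
    """Particle counts spaced by one unit of the natural logarithm."""
    return [int(round(np.exp(e))) for e in exponents]

def estimate_b_factor(n_particles, resolutions_angstrom):
    """Fit ln(N) = intercept + (B / 2) * (1 / d^2) and return B in A^2."""
    x = 1.0 / np.asarray(resolutions_angstrom, dtype=float) ** 2
    y = np.log(np.asarray(n_particles, dtype=float))
    slope, _intercept = np.polyfit(x, y, 1)
    return 2.0 * slope
```

A curve that flattens, i.e. a resolution that no longer improves with the logarithm of the particle number, indicates that the dataset is limited by something other than the number of particles.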
Author Contributions
B.T., W.J.H.H. and M.B. conceived the project. M.B. and H.G.K. supervised the project. M.O. performed the sample preparation and prepared cryo-EM grids. W.J.H.H. and B.T. collected the data. B.T. developed and performed the pipeline for the data analysis. B.T. and M.B. wrote the manuscript with input from all authors.
Competing Interests
The authors declare no competing interests.
Acknowledgements
We thank Drs. Matteo Allegretti, Julia Mahamid, Jürgen Plitzko, Florian Schur and William Wan for discussions and Shyamal Mosalaganti for critical reading of the manuscript. H.G.K. and M.B. acknowledge funding by the German Research Association (DFG, project number 240245660; project 5). M.B. acknowledges funding by EMBL, the Max Planck Society and the European Research Council (#724349 ComplexAssembly).