Abstract
Endocytosis is a fundamental process by which eukaryotic cells transport molecules into the cell. To understand the molecular mechanisms behind it, researchers have obtained abundant biochemical information about the protein dynamics involved in endocytosis via fluorescence microscopy and geometric information about membrane shapes via electron tomography. However, measuring biophysical quantities, such as the osmotic pressure and the membrane tension, remains difficult because of the small dimensions of the endocytic invagination. In this work, we combine machine learning (ML) with the Helfrich model of the membrane and a dataset of membrane shapes extracted from electron tomography to infer biophysical information about endocytosis. Our results show that ML is able to find solutions that both match the experimental profiles and fulfill the membrane shape equations. Furthermore, we show that at the early stage of endocytosis the inferred membrane tension is negative, which implies strong compressive forces acting at the boundary of the endocytic invagination. This method provides a generic framework to extract membrane information from super-resolution imaging.
Introduction
Clathrin-mediated endocytosis (CME) is an essential process by which eukaryotic cells take up nutrients, regulate signal transduction, and control the membrane composition [1–6]. When CME occurs, a small patch of the plasma membrane is internalized into the cytoplasm to form an endocytic pit, which is later pinched off to form a vesicle. In the past decades, tremendous efforts have been devoted to understanding the molecular mechanisms that drive CME. Fluorescence microscopy is widely used to obtain dynamic information about protein concentrations during CME [7–9]. Electron microscopy serves as a powerful tool to resolve the morphology of the membrane during CME [10–14]. Although advanced imaging technologies have helped accumulate an enormous amount of data, the extraction of useful information from these data remains insufficient. For instance, the deformation of the membrane during CME is shaped by many factors, including the force generated by actin polymerization, the curvature induced by clathrin proteins, and the osmotic pressure that results from the difference in solute concentration between the inside and the outside of the cell. Geometric profiles of the membrane obtained via electron tomography therefore contain abundant information about the mechanical properties of the membrane. However, the analysis of the profile data is often limited to a few geometric features derived from the original high-dimensional profile data. How to use the data to gain a mechanistic understanding of the underlying physics remains elusive.
Measuring the physical quantities involved in endocytosis is important for understanding how endocytosis happens under the orchestrated action of many proteins. Yeast cells have been used as a model system to study endocytosis due to their fast proliferation and easy genetic manipulation [15–17]. Unlike mammalian cells, yeast cells have a cell wall, which enables them to maintain a high osmotic pressure (Fig. 1a). Minc et al. used the deformation of a PDMS chamber to infer the osmotic pressure by assuming that the growth of fission yeast cells is powered by the osmotic pressure, and estimated a value of 0.85 ± 0.15 MPa [18]. Atilgan et al. obtained a value of 1.5 ± 0.2 MPa in fission yeast cells by comparing the geometric difference between the natural state, in which the osmotic pressure expands the cell wall, and the lysed state, in which the cell wall shrinks in response to the released osmotic pressure [19]. In budding yeast, the osmotic pressure is estimated to be 0.6 ± 0.2 MPa by analyzing the volume change upon variations in osmolarity [20]. Membrane tension is another important quantity that is relevant for membrane morphology [21–23]. Its value might vary significantly among different organisms and is generally reported to be in the range of 10⁻² to 10¹ pN/nm [18, 19, 24–26].
(a) Simplified illustration of endocytosis in yeast cells. The osmotic pressure inside the cell pushes the plasma membrane against the cell wall. Internalization of the membrane therefore needs a pulling force f to overcome the resistance from the pressure p, as well as the membrane tension σ. (b) Parametrization of the axisymmetric membrane shape [r(u), z(u)], where u = 0 corresponds to the membrane tip and u = 1 to the membrane base where the membrane is in contact with the cell wall. (c) The structure of the neural network and the loss functions for the forward and the inverse problem.
Various theoretical models have been proposed to account for the membrane shape evolution during endocytosis [27–32]. All of these models are based on the classical Helfrich theory [33, 34], which calculates the membrane shape by minimizing the total energy of the membrane. Physical parameters, such as the bending rigidity, the membrane tension, and the osmotic pressure, are needed to characterize the membrane properties. Among the theoretical studies, some assume a very small osmotic pressure, so that the membrane shape is mainly determined by the membrane tension [28, 30], while others assume a large osmotic pressure [29, 35]. The ambiguity in the choice of these physical parameters might come from differences between organisms, but it also reflects the difficulty of obtaining these physical parameters from experiments.
In the past ten years, Machine Learning (ML) [36, 37] has achieved remarkable success in areas including pattern recognition [38, 39], computer vision [40, 41], data mining [42, 43], natural language processing [44, 45] and autonomous driving [46]. Recently, as one of the ML applications in the field of scientific computing, physics-informed neural networks (PINNs) developed in Ref. [47] have been shown to be a powerful tool for solving forward and inverse problems of partial differential equations (PDEs) [48–51]. In the framework of PINNs, both the data and the physics contribute to the loss in the training process. PINNs are easy to implement and much more flexible than classical numerical methods. In particular, it has been shown that PINNs are more effective than classical numerical methods in solving inverse problems, for instance, learning parameterized PDEs [48, 52] and using the concentration field to learn velocity fields with hidden fluid dynamics [52–54].
In this paper, we combine the Helfrich model of the membrane with PINNs to learn the model parameters, including the osmotic pressure and the membrane tension, from the membrane profile data obtained via electron tomography for yeast cells [10]. The method demonstrates stable convergence of the estimated parameters, and the learned osmotic pressure is consistent with experimental measurements. Furthermore, we find negative membrane tensions at the early stage of endocytosis, which implies strong compressive forces applied at the boundary of the endocytic patch.
Results
Overview of the Helfrich model and the PINNs
The idea of the classical Helfrich model is to find membrane shapes that minimize the total energy of the membrane, which is written as [33, 34]
The first term describes the bending energy of the internalized membrane patch with a bending rigidity κ. The two principal curvatures of the membrane surface are denoted as c1 and c2. The second term describes the energy contribution from the membrane tension σ with the conjugated variable A being the total surface area of the internalized membrane patch. The third term describes the energy contribution from the osmotic pressure p with the conjugated variable V being the volume enclosed between the membrane patch and the cell wall (Fig. 1a).
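For reference, a form of the energy consistent with the three terms described above can be written as follows. This is a reconstruction from the verbal description only, assuming zero spontaneous curvature and a bending term that is quadratic in the sum of the two principal curvatures; the exact prefactors and sign conventions should be checked against Refs. [33, 34].

```latex
E \;=\; \frac{\kappa}{2}\int (c_1 + c_2)^2 \,\mathrm{d}A \;+\; \sigma A \;+\; p V
```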
We use PINNs to infer the model parameters in the Helfrich model (1). The PINN is a fully connected neural network with one input u, which is defined on the fixed interval [0, 1], and two outputs r and z, which are the meridian coordinates of the membrane profile (Fig. 1b). Note that we assume axisymmetry of the membrane profile in the Helfrich model and apply a symmetrization procedure to the experimental profile for comparison. In the forward problem, the model parameters are given, and the loss function Lfor = Leqs + Lbc + Lcon of the network includes the evaluation of the variational equations on a number of points (Leqs), the boundary conditions (Lbc) and the coordinate constraints (Lcon). In the inverse problem, the model parameters become internal variables of the neural network. The loss function Ltot = Lfor + Ldata additionally incorporates the difference between the symmetrized experimental profile and the ML outputs (Ldata) (Fig. 1c). A detailed description of the neural network and the symmetrization procedure can be found in the Supplemental Material.
After learning the model parameters, we further calculate the parameter f, which is a Lagrange multiplier derived from the learned p, σ and Rb to impose the membrane height z0. It represents the minimum force needed to pull the membrane to the corresponding height z0. A detailed description of the force calculation can be found in the Supplemental Material.
PINNs are able to solve the nonlinear membrane shape equations
We first verify whether PINNs can solve the fourth-order nonlinear membrane shape equations (7) and (8) provided the parameters are given, i.e., the forward problem, by comparing the ML solution that minimizes the loss function Lfor with the solution obtained by a finite difference (FD) method.
The parameters p/κ and σ/κ define two characteristic length scales, Lp and Lσ. We choose two sets of parameters (listed in Table 1), one with Lp < Lσ (pressure-dominant) and the other with Lp > Lσ (tension-dominant). Using a fully connected network with 3 hidden layers, each containing 32 nodes, we see excellent agreement between the ML solutions and the FD solutions for a series of membrane heights for both sets of parameters (Fig. 2). We stress that although the ML solutions are comparable with the FD solutions in accuracy, the ML method is much more time-consuming than the FD method when solving the forward problem. The advantage of the ML method over the FD method lies mainly in the inverse problem, which is discussed in the next section.
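The network architecture described above (one input u, 3 hidden layers of 32 nodes, two outputs r and z) can be sketched as a plain forward pass. This is a minimal illustration and not the authors' implementation: the tanh activation, the Glorot-style initialization, and the function names `init_mlp` and `forward` are assumptions, and the loss terms and the training loop are omitted.

```python
import numpy as np

def init_mlp(layer_sizes, seed=0):
    """Create (weight, bias) pairs for a fully connected network (assumed Glorot-style init)."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        scale = np.sqrt(2.0 / (n_in + n_out))
        params.append((rng.normal(0.0, scale, size=(n_in, n_out)), np.zeros(n_out)))
    return params

def forward(params, u):
    """Map meridional coordinates u in [0, 1] to profile coordinates (r, z)."""
    h = np.asarray(u, dtype=float).reshape(-1, 1)
    for w, b in params[:-1]:
        h = np.tanh(h @ w + b)      # hidden layers; tanh is an assumed activation
    w, b = params[-1]
    out = h @ w + b                 # linear output layer
    return out[:, 0], out[:, 1]

# 1 input (u), 3 hidden layers of 32 nodes, 2 outputs (r, z)
params = init_mlp([1, 32, 32, 32, 2])
u = np.linspace(0.0, 1.0, 101)
r, z = forward(params, u)
```

In a PINN, the derivatives of r and z with respect to u that enter the shape equations would be obtained by automatic differentiation of this map, which is why a smooth activation is the natural choice here.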
Two sets of parameters used in Fig. 2.
(a, b) ML solutions (dashed lines) for a series of membrane heights z0 = 10, 30, 50, 100 nm are shown in different colors. The corresponding FD solutions are shown as solid lines. The tension-dominant regime is shown in (a) and the pressure-dominant regime is shown in (b). The two sets of parameters are listed in Table 1.
PINNs show stable convergence of the model parameters for the inverse problem
In this section, we use the PINNs to infer the model parameters, which include the re-scaled pressure p/κ, the re-scaled tension σ/κ, and the base radius Rb. The idea is to train the network parameters, which contain the model parameters, such that the ML outputs r(u) and z(u) minimize the total loss function Ltot. In this way, the ML outputs not only fulfil the variational equations, but also match the membrane profile data obtained via electron tomography. We stress that it is the re-scaled parameters p/κ and σ/κ that can be inferred from the experimental data. In order to have the absolute value of the membrane tension σ and the osmotic pressure p, we specify κ = 2000 kBT from Ref. [29] for the rest of the paper.
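Converting the re-scaled parameters to absolute values is a matter of unit bookkeeping. The sketch below assumes kBT ≈ 4.11 pN·nm (room temperature, which is not stated in the text) and uses the identity 1 MPa = 1 pN/nm². The function name `to_absolute` and the example inputs are illustrative only and are not values from this study.

```python
KBT_PN_NM = 4.11          # assumed thermal energy near 298 K, in pN*nm
KAPPA_KBT = 2000.0        # bending rigidity used in the text, in units of kBT

def to_absolute(p_over_kappa, sigma_over_kappa, kappa_kbt=KAPPA_KBT, kbt=KBT_PN_NM):
    """Convert p/kappa [nm^-3] and sigma/kappa [nm^-2] to p [MPa] and sigma [pN/nm]."""
    kappa = kappa_kbt * kbt                 # bending rigidity in pN*nm
    p_mpa = p_over_kappa * kappa            # pN/nm^2, numerically equal to MPa
    sigma = sigma_over_kappa * kappa        # pN/nm
    return p_mpa, sigma

# Hypothetical re-scaled values, chosen only to illustrate the conversion
p_mpa, sigma = to_absolute(p_over_kappa=1.0e-4, sigma_over_kappa=1.0e-3)
print(f"p = {p_mpa:.3f} MPa, sigma = {sigma:.2f} pN/nm")
```

Because only the ratios p/κ and σ/κ are constrained by the shapes, a different choice of κ rescales both absolute values by the same factor.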
During the training of the neural network, we switch the model parameters between trainable and untrainable states. In particular, we fix σ/κ and p/κ and tune Rb in the first 10⁵ training epochs, then fix Rb and vary σ/κ and p/κ for another 10⁵ training epochs. After executing this training procedure twice, the loss functions Ltot and Ldata can both be reduced to 10⁻² (Fig. 3a), which implies that the membrane shape equations are satisfied and the ML-learned shape fits well with the experimental data. All three parameters show a trend of convergence at the end of the training (Fig. 3b). Upon switching of the training states, their values show a jump accompanied by a drop of the loss function (Fig. 3b).
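The alternating schedule described above, in which one group of parameters is frozen while the other is trained, can be sketched on a toy problem. Everything in this example is an assumption made for illustration: the quadratic loss stands in for Ltot, plain gradient descent stands in for the optimizer, and the step counts and learning rate are not the values used in this study.

```python
import numpy as np

def alternating_descent(grad_fn, params, groups, steps_per_phase, n_cycles, lr):
    """Gradient descent in which only one parameter group is trainable per phase."""
    params = np.array(params, dtype=float)
    for _ in range(n_cycles):
        for group in groups:
            mask = np.zeros_like(params)
            mask[list(group)] = 1.0          # 1 = trainable, 0 = frozen
            for _ in range(steps_per_phase):
                params -= lr * mask * grad_fn(params)
    return params

# Toy quadratic loss 0.5 * (x - t)^T H (x - t) standing in for L_tot (assumption)
H = np.array([[2.0, 0.5, 0.0],
              [0.5, 2.0, 0.5],
              [0.0, 0.5, 2.0]])
target = np.array([1.0, 2.0, 3.0])
grad = lambda x: H @ (x - target)

# Index 0 plays the role of R_b; indices 1 and 2 play the roles of sigma/kappa and p/kappa
fitted = alternating_descent(grad, params=[0.0, 0.0, 0.0],
                             groups=[(0,), (1, 2)],
                             steps_per_phase=200, n_cycles=10, lr=0.2)
```

Each switch of the trainable group moves the parameters toward the minimum of the loss restricted to that group, which is consistent with the jumps in parameter values seen when the training state changes.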
(a) Evolution of the total loss function Ltot and the experimental loss function Ldata during a training process. (b) Evolution of the three model parameters tuned by the neural network to minimize the total loss function during a training process. (c) Illustration of the original membrane profile (solid cyan), the symmetrized membrane profile (dash-dotted green), the ML-learned membrane profile (dashed orange), and the FD-solved membrane profile.
As a verification of the ML-learned parameters, we substitute them into the membrane shape equations and solve the equations with an FD method. We overlay the original and symmetrized experimental membrane profiles, the ML-learned membrane profile, and the FD-solved membrane profile, and observe good agreement among all the profiles (Fig. 3c).
Due to the non-linearity of the membrane shape equations (7) and (8), the solutions might depend on the model parameters with different sensitivity. The possibility that different parameters lead to similar membrane shapes limits the precision of the estimated parameters. In order to estimate the precision of the parameters for a particular experimental profile, we repeat the learning procedure 10 times and use the standard deviation of the parameters over the 10 repetitions to measure the uncertainty of the estimation. For most of the experimental profiles, the repeated learning yields a standard deviation that is small compared with the average value. In particular, for the profile shown in Fig. 3c, the learned osmotic pressure is p = 0.71 ± 0.06 MPa and the membrane tension is σ = 13.67 ± 0.18 pN/nm (pstd/pavg = 0.09, σstd/σavg = 0.01; see Fig. S2 in the Supplemental Material for the comparison between the ML-learned shape and the experimental profile for the other 9 trainings).
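The uncertainty measure described above, the standard deviation over repeated learnings relative to the average, can be computed as in the following sketch. The function name `summarize_repeats` and the input values are placeholders and not data from this study; whether the standard deviation is normalized by N or N − 1 is not stated in the text, and the sample form (N − 1) is assumed here.

```python
import numpy as np

def summarize_repeats(values):
    """Return mean, sample standard deviation, and relative uncertainty std/|mean|."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    std = values.std(ddof=1)     # sample std (assumption; the text does not specify ddof)
    return mean, std, std / abs(mean)

# Placeholder values standing in for 10 repeated estimates of one parameter (not real data)
repeats = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.1, 1.0, 0.9, 1.0]
mean, std, rel = summarize_repeats(repeats)
```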
ML-learned parameters are consistent with experimental measurements
We perform the ML learning procedure on 79 membrane profiles extracted from the time-resolved electron tomography data provided by Ref. [10]. For each profile, the learning procedure was repeated 10 times, so that we have 790 learning results in total for the osmotic pressure p and the membrane tension σ. In most of the learnings, the total loss function can be reduced to below 10⁻², with the distribution concentrated around 10⁻³ (Fig. 4), which demonstrates the effectiveness of the ML method. A more direct visualization of the one-time learning results for 9 randomly selected membrane profiles is shown in Fig. 5. All of the learnings show good agreement among the ML-learned shape, the symmetrized experimental profile, and the FD-solved shape. We stress that what the ML method learns is the symmetrized experimental profile, not the original profile. For original profiles that are highly asymmetric, the membrane might be subject to asymmetric force distributions or local variations of the model parameters. The ML-learned parameters therefore represent an average of the model parameters in which the local variations are evened out.
Histograms of the loss function Ltot in (a), Lfor in (b) and Ldata in (c).
The learned parameters σ, p and the derived parameter f are indicated on the top of each box. In each panel, the original membrane profile (solid cyan), the symmetrized membrane profile (dash-dotted green), the ML-learned membrane profile (dashed orange), and the FD-solved membrane profile (dotted red), are shown respectively.
To demonstrate the precision of the learned parameters, for each of the 79 experimental profiles we calculate the average and the standard deviation of the learned parameters over the 10 repetitions and show their joint distribution in Fig. 6. We find that: (i) the distribution of the standard deviations is concentrated in a small range near 0 and is much narrower than the distribution of the average values, which indicates good precision of the learned parameters; (ii) the average values of the osmotic pressure p are mostly positive and the distribution is peaked at 0.45 MPa, which is consistent with experimental measurements [18, 19]; (iii) the distribution of the average values of the membrane tension σ exhibits two peaks, a large one centered at −100 pN/nm and a small one centered at 12 pN/nm; (iv) the average values of the force f are mostly positive and peaked at 2750 pN.
The distribution for the osmotic pressure p in (a), for the membrane tension σ in (b), and for the force f in (c).
Negative membrane tensions occur at early stage of endocytosis
We have shown that the average values of the model parameters have a much wider distribution than the standard deviations over the 10 repeated learnings (Fig. 6), which suggests that the parameters might vary at different stages of endocytosis. To test the stage dependence of the model parameters, we use the membrane height as an indicator of the timeline of endocytosis and plot the ML-learned parameters as a function of the corresponding membrane height z0. We find that the osmotic pressure p shows no dependence on the membrane height (Fig. 7a), but the membrane tension σ exhibits a strong height dependence (Fig. 7b). Large and negative membrane tensions are found for z0 < 25 nm. Above 50 nm, the membrane tensions are almost independent of the membrane height z0 and remain small and positive (Fig. 7b). The forces f stay around 0 pN for z0 < 25 nm and reach a plateau of about 3000 pN above 50 nm (Fig. 7c). A strong positive correlation (R = 0.87) between the force f and the membrane tension σ is observed (Fig. 7d left). However, the force f is only weakly correlated with the osmotic pressure p (R = 0.25) (Fig. 7d right).
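A correlation coefficient such as the R values quoted above can be computed from paired parameter estimates as sketched below. The text does not state which coefficient is used; the Pearson coefficient is assumed here, and the function name `pearson_r` and the input arrays are illustrative.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long 1D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()            # centered samples
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Illustrative inputs only: a perfectly linear and a perfectly anti-linear relation
r_pos = pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
r_neg = pearson_r([1.0, 2.0, 3.0, 4.0], [8.0, 6.0, 4.0, 2.0])
```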
(a-c) Model parameters as a function of the membrane height, with the osmotic pressure p in (a), the membrane tension σ in (b), and the force f in (c). Each point represents the average value for the same experimental profile over 10 repeated learnings, and the error bar represents the standard deviation. (d) Scatter plots of (f, σ) on the left and (f, p) on the right. The horizontal and vertical error bars of each point represent the standard deviations of the corresponding parameters over 10 repeated learnings.
Discussion
The ML method has a natural advantage in learning model parameters
We have shown that the ML method is able to solve the nonlinear membrane shape equations to an accuracy that is comparable with the FD method. However, in terms of speed, the FD method is faster than the ML method by orders of magnitude. The ML method takes at least a few minutes to solve the equations, and even hours when the membrane height is large. In contrast, it takes less than a second for the FD method to solve the same equations. When it comes to the inverse problem, i.e., learning the model parameters from given experimental data, the FD method is less flexible than the ML method. We use the bvp5c solver in MATLAB, which is an iterative algorithm based on the finite difference scheme of Runge-Kutta methods [55]. It requires a proper initial guess of the solution to solve the membrane shape equations for given parameters. In practice, we always choose a flat shape as the initial guess. If the membrane height is large, direct use of the bvp5c solver often fails. By contrast, the ML method does not require a proper guess of the solution, and it extends naturally to the inverse problem of the membrane shape equations by incorporating Ldata into the loss function. Solving the inverse problem with the ML method has almost the same computational cost as the forward problem. Therefore, the ML method is well suited to the task of parameter learning.
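The role of the initial guess in an iterative FD solver can be illustrated on a much simpler nonlinear boundary value problem. The sketch below is not the membrane shape equations and not the bvp5c algorithm: it applies a second-order finite difference discretization with Newton iteration to the Bratu problem y'' + λ exp(y) = 0, y(0) = y(1) = 0, chosen here only as a stand-in. The function name and all numerical settings are assumptions.

```python
import numpy as np

def solve_bratu_fd(lam=1.0, n=101, guess=None, tol=1e-8, max_iter=50):
    """Solve y'' + lam*exp(y) = 0, y(0) = y(1) = 0, by finite differences and Newton iteration."""
    x = np.linspace(0.0, 1.0, n)
    h = x[1] - x[0]
    y = np.zeros(n) if guess is None else np.array(guess, dtype=float)
    y[0] = y[-1] = 0.0                          # enforce the boundary conditions
    m = n - 2                                   # number of interior unknowns
    # Constant part of the Jacobian: the second-difference operator
    lap = (np.diag(-2.0 * np.ones(m)) + np.diag(np.ones(m - 1), 1)
           + np.diag(np.ones(m - 1), -1)) / h**2
    for _ in range(max_iter):
        yi = y[1:-1]
        residual = (y[:-2] - 2.0 * yi + y[2:]) / h**2 + lam * np.exp(yi)
        if np.max(np.abs(residual)) < tol:
            return x, y
        jac = lap + np.diag(lam * np.exp(yi))   # Jacobian of the residual
        y[1:-1] = yi - np.linalg.solve(jac, residual)
    raise RuntimeError("Newton iteration did not converge from this initial guess")

x, y = solve_bratu_fd()      # flat initial guess, analogous to the flat shape used in the text
```

For strongly nonlinear cases, a common remedy is continuation: solve for a mild parameter value first and pass the result as `guess` for the next, harder value. This mirrors the difficulty described above for large membrane heights.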
Our results imply that the initiation of endocytosis is facilitated by negative membrane tension
The presence of a large osmotic pressure inside yeast cells imposes a large force barrier to the completion of endocytosis [35]. In this paper, we have shown that the ML-learned osmotic pressure has an average value of about 0.66 MPa (Fig. 7a), and the force f needed to pull the membrane to a height of 100 nm is about 3000 pN (Fig. 7c). These findings are consistent with previous studies. However, at the early stage of endocytosis when the membrane height is low, our results suggest that large and negative membrane tensions are present at the boundary of the endocytic invagination (Fig. 7b and c). The negative tension implies an inward force that pumps lipids into the endocytic invagination through the boundary and might induce membrane buckling, thus reducing the force required to pull the membrane up. Myosin-1 motors have been reported to form a ring structure around the endocytic invagination [56] and might play such a role in generating negative tensions. Investigating the effect of the negative tension is left for future work.
Sensitivity of the model parameters limits the precision of the ML estimation
There are two possible factors that limit the precision of the learned parameters. One is that the neural network may be unable to converge in a limited number of iterations, which manifests as a large loss function. The other is that different model parameters could give similar membrane shapes. In our results, we find that for some experimental profiles the repeated learnings indeed give quite different parameters. The large error bars observed for some of the points in Fig. 7 mainly arise from the insensitivity of the membrane shapes to the model parameters, not from a failure of the neural network. To support this, we overlay the FD-solved membrane shapes of the 10 repeated learnings for the experimental profiles that have the largest error bars. Most of the shapes overlap with each other, although the parameters are quite different (Fig. 8). In addition, for the 79 experimental profiles, we observe no correlation between the average loss function and the standard deviation of the model parameters over the 10 repeated learnings (Fig. 9), which implies that the large error bars are not due to a failure of the neural network to converge to a small loss function. Improving the parameter precision for those profiles therefore requires more information than the geometric shapes alone.
Each panel shows the FD-solved membrane shapes for the same experimental profile over 10 repeated learnings. The top, middle and bottom rows are for parameters p, σ and f, respectively. From left to right, the standard deviations go from large to small.
(a, b) The average loss functions ⟨Lfor⟩ vs. the standard deviation of the osmotic pressure pstd in (a) and ⟨Ldata⟩ vs. pstd in (b). (c, d) The average loss functions ⟨Lfor⟩ vs. the standard deviation of the membrane tension σstd in (c) and ⟨Ldata⟩ vs. σstd in (d).
Methods
Helfrich Model of the axisymmetric membrane
We assume rotational symmetry of the membrane shape such that the surface of the membrane can be parameterized as
The parameter ϕ ∈ [0, 2π] is the azimuthal angle, and the parameter u ∈ [0, 1] is the meridional coordinate with u = 0 at the membrane tip and u = 1 at the membrane base where the membrane is in contact with the cell wall. The functions r(u) and z(u) depict the membrane profile as shown in Fig. 1b. The total energy (1) then becomes a functional of r(u), z(u) and their derivatives up to second order,
in which
and
By performing variations against the functions r(u) and z(u), with a constraint that a(u) is a constant, we obtain two variational equations
and
Here we do not give the explicit expressions of the equations which are quite lengthy and contain little information. Note that Eqs. (7) and (8) are fourth-order ordinary differential equations about r(u) and z(u). In order to solve the equations, we also need to specify 8 boundary conditions, which include setting the base radius r(1) = Rb, and the membrane height z(0) = z0. A more detailed description of the boundary conditions can be found in the Supplemental Material.
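The geometric quantities that enter the energy, the surface area A and the volume V, follow directly from the profile functions r(u) and z(u) of a surface of revolution. The sketch below evaluates A = 2π∫ r (r'² + z'²)^{1/2} du and V = π∫ r² |z'| du numerically and checks them on a hemisphere. The function names, the hemispherical test shape, and the sign convention for V are assumptions made for illustration.

```python
import numpy as np

def trapezoid(f, u):
    """Composite trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(u)))

def area_and_volume(r, z, u):
    """Surface area and enclosed volume of the surface of revolution [r(u), z(u)]."""
    dr = np.gradient(r, u, edge_order=2)
    dz = np.gradient(z, u, edge_order=2)
    area = 2.0 * np.pi * trapezoid(r * np.sqrt(dr**2 + dz**2), u)
    volume = np.pi * trapezoid(r**2 * np.abs(dz), u)   # assumes z(u) is monotonic
    return area, volume

# Check on a hemisphere of radius R: tip at u = 0 (r = 0), base at u = 1 (z = 0)
R = 50.0                                   # nm, illustrative value
u = np.linspace(0.0, 1.0, 2001)
r = R * np.sin(0.5 * np.pi * u)
z = R * np.cos(0.5 * np.pi * u)
area, volume = area_and_volume(r, z, u)
```

For the hemisphere the exact values are A = 2πR² and V = (2/3)πR³, which gives a quick consistency check before the same routine is applied to a learned or experimental profile.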
Author contributions
R.M. conceived the project; R.M. and Z.M. designed the study and Z.L. performed the computational work; R.M. and Z.M. supervised the project; Z.L., R.M. and Z.M. wrote the paper.
Additional information
Supplementary Information accompanies this paper at “Supplementary information.pdf”.
Data availability
The database of membrane profiles obtained by electron microscopy is available at https://www.embl.de/download/briggs/endocytosis.html.
Code availability
The custom code generated during the current study is available on GitHub.
Acknowledgments
We thank Prof. Julien Berro for critical reading of the manuscript. RM acknowledges financial support from National Natural Science Foundation of China under Grants No. 12004317, Fundamental Research Funds for Central Universities of China under Grant No. 20720200072, and 111 project No. B16029. ZM acknowledges financial support from National Natural Science Foundation of China under Grants No. 12171404, Fundamental Research Funds for Central Universities of China under Grant No. 20720210037.