Computational redesign of a PETase for plastic biodegradation by the GRAPE strategy

The excessive use of plastics has been accompanied by severe ecologically damaging effects. The recent discovery of a PETase from Ideonella sakaiensis that decomposes poly(ethylene terephthalate) (PET) under mild conditions provides an attractive avenue for the biodegradation of plastics. However, the inherent instability of the enzyme limits its practical utilization. Here, we devised a novel computational strategy (greedy accumulated strategy for protein engineering, GRAPE). A systematic clustering analysis combined with greedy accumulation of beneficial mutations in a computationally derived library enabled the design of a variant, DuraPETase, which exhibits an apparent melting temperature that is drastically elevated by 31 °C and strikingly enhanced degradation performance toward semicrystalline PET films (23%) at mild temperatures (over two orders of magnitude improvement). The mechanism underlying the robust promotion of enzyme performance has been demonstrated via a crystal structure and molecular dynamics simulations. This work shows the capabilities of computational enzyme design to circumvent antagonistic epistatic effects and provides a valuable tool for further understanding and advancing polyester hydrolysis in the natural environment.

utilization. Here, we devised a novel computational strategy (greedy accumulated strategy for protein engineering, GRAPE). A systematic clustering analysis combined with greedy accumulation of beneficial mutations in a computationally derived library enabled the design of a variant, DuraPETase, which exhibits an apparent melting temperature that is drastically elevated by 31 °C and strikingly enhanced degradation performance toward semicrystalline PET films 25 (23%) at mild temperatures (over two orders of magnitude improvement). The mechanism underlying the robust promotion of enzyme performance has been demonstrated via a crystal structure and molecular dynamics simulations. This work shows the capabilities of computational enzyme design to circumvent antagonistic epistatic effects and provides a valuable tool for further understanding and advancing polyester hydrolysis in the natural 30 environment.

Main Text:
Earth's ecosystem is approaching a planetary-scale transformation as a result of human influence 1 . Currently, a wide variety of petroleum-based synthetic polymers are produced worldwide to the extent of a 20-fold increase in annual production over the five decades since the 35 1960s, and a slowdown of this trend is not expected 1 . Accumulated marine and terrestrial pollution problems and the negative impacts of microplastic exposure on human health have caused escalating public and governmental concerns 3, 4 . With the excessive use of plastics and increasing pressure on the capacity for plastic waste disposal, the demand for eco-friendly waste management has assumed dramatic importance in the last few years 5, 6 .
Poly(ethylene terephthalate) (PET) is one of the most widely used man-made synthetic plastics worldwide, with an annual manufacturing capacity of over 30 million tons 7 . Its excellent durability, however, has now become an environmental detriment 8 . The aromatic terephthalate building blocks in crystalline PET largely reduce the chain mobility and lead to restricted accessibility of the ester linkages 9 . Therefore, the degradation activities of natural hydrolases 5 have been limited by extremely low turnover rates towards PET 10 . In the past decade, there have been several foundational studies identifying enzymes from thermophilic organisms that exhibit detectable PET degrading activity [11][12][13][14] . However, these enzymes require considerably high operational temperatures near the PET glass transition temperature (~76 °C), severely hampering their utility in environmental remediation. Recently, a major breakthrough was achieved by 10 Yoshida et al., who reported a new PETase from Ideonella sakaiensis 201-F6 (IsPETase) with the highest activity among PET-degrading enzymes at ambient conditions 15 , which indicates a promising application for in situ disposal of hard-to-collect plastic disintegrates with low energy consumption. Although IsPETase has caught much attention for its intriguing capability, the enzyme exhibits poor durability: most of its activity is lost within 24 h of incubation at 37 °C 16 . 15 Apart from this, IsPETase exhibits only reasonable activity toward low-crystallinity PET (1.9%), whereas most PET applications, including commercial containers, employ high-crystallinity PET with 30%-50% crystallinity 17 .
Not content with the finite repertoire of naturally occurring enzymes, a number of research groups have studied the engineering of IsPETase, which has been recently summarized by 20 Taniguchi 18 . Lately reported crystal structures of PET-degrading enzymes allow for methods that exploit rational design to improve the PET degradation activity [19][20][21][22][23][24][25] . Successful single point mutations afford 1.2-to 3.1-fold higher affinity for PET [22][23][24][25][26] , and very recently, Kim's group has succeeded in creating an IsPETase S121E/D186H/R280A variant with enhanced thermal stability (by 8.81 °C) and 14-fold higher PET degradation at 40 °C 16 . Despite these attempts, there remains a 25 key challenge to whether protein engineering achieves improved enzymatic stability and effective catalytic conversion of high-crystallinity PET by allowing large jumps along the sequence fitness landscape. There is usually a pathway whereby some new functions could be acquired by individually beneficial mutations; however, when the desired function is beyond what a single mutation or double mutations can accomplish, the number of possible paths grows 30 exponentially as the mutations accumulate, and most paths result in downhill or even unfolded proteins 27 . Since a majority of protein engineering studies involve simple uphill walks, the main demand lies in identifying an efficient path of accumulated mutations to achieve the desired protein performance 28 .
The last few years have witnessed impressive progresses in dealing with multidimensional 35 space by a number of metaheuristic methods 29 . Inspired by the widespread greedy algorithm applications in artificial intelligence, we introduce a novel strategy, termed GRAPE, to effectively tackle the evolutionary hard problem by computational design that enhances the probability of discovering the adaptive routes to improved fitness. The computationally redesigned IsPETase (DuraPETase) derived from this campaign exhibited an apparent melting 40 temperature increased by 31 °C with good performance toward semicrystalline PET and vastly improved long-term survival under mild conditions. The enzymatic PET degradation was enhanced by over 300-fold at 37 °C for 10 days. Ameliorated biodegradation of other plastics, such as poly(butylene terephthalate) (PBT) and poly(ethylene 2,6-naphthalenedicarboxylate) (PEN), suggested DuraPETase a broad ability to degrade semiaromatic polyesters. To evaluate 45 the underlying molecular mechanism for enhanced performance, a three-dimensional crystal structure of the variant was determined; this structure highlighted a fine-tuned 'aromatic tunnel' flanked by synergistic hydrophobic interactions.

Computational redesign of a robust DuraPETase using the GRAPE strategy
The GRAPE strategy introduced here uses a greedy strategy for global optimization of 5 mutations in each cluster to create functional variations and select the fittest variants to direct the search to higher elevations on the fitness landscape. The initial step in the GRAPE strategy (Scheme 1) consists of computational predictions of potentially stabilizing mutations along the whole protein sequence. We previously developed the ABACUS 30 software for de novo protein design based on a statistical energy function. Here, this algorithm was applied together with three 10 complementary algorithms, FoldX (force field-based energy function) 31 , Rosetta_ddg (force field-based energy function) 32 and Consensus Analysis (phylogeny-based method) 33 , to improve the protein stability. Mutations with an ABACUS energy < -3 A.e.u., a folding free energy (∆∆G fold ) < -5 kJ/mol or a consensus score > 2.0 were selected (Table S1 and Figure S1). A total of 253 unique predicted mutations were obtained as potentially stabilized candidates after 15 filtering, and ABACUS, FoldX, Rosetta_ddg, and consensus analysis algorithms provided 85, 63, 65, and 54 of these 253 mutations, respectively, with some overlap. In the second step, the potentially stabilized candidates were inspected for biophysical pitfalls, such as the introduction of internal cavities, loss of hydrogen-bonding interactions, or exposure of hydrophobic residues at the surface of the enzyme, providing a sublibrary of 85 candidates. After experimental 20 validation, 21 well-expressed mutants displayed increased stability (> 1.5 °C increase in the apparent T m ) ( Figures S2 and S3). However, owing to the ubiquity of epistatic effects 34-36 , these positive mutations may not cooperate to reach the target function. Indeed, an IsPETase variant containing all 21 beneficial mutations is completely inactive.
Hence, the next step, which systematically accumulates the beneficial variants in the well-25 defined library, is crucial for the GRAPE strategy. Searching in the high-dimensional space involves an incomprehensibly large number of possible pathways among which only an infinitesimal fraction can escape from one local optima trap to reach a better solution. The proposed GRAPE strategy combines the advantages of greedy and clustering algorithms to provide a viable solution to minimize experimental efforts but maximize the exploration of 30 epistatic effects in terms of additivity and/or synergism between sets of mutations at each branching point. In the GRAPE strategy, the K-means algorithm 37 , which has been proven to be a very powerful tool for data mining problems and has been adopted to perform knowledge discovery in bioinformatics research 38 , was applied to cluster the stabilizing mutations into several groups. Variants were characterized based on their positions, efficacies, and presumed 35 effects. Subsequently, the best individual of each population served as the parent to attract offspring to its region of the fitness landscape. Each individual in the cluster was crossed with the current global best one. The greedy algorithm accepts the newly generated individual only when its fitness is better than that of the parent. Figure 1A shows the stipulation that each individual stays in its historically best position and moves forward toward the global best 40 position.
Accordingly, the 21 identified stabilizing mutations were clustered into three groups (Table  1). S214H in the 1 st cluster was considered the starting point for the initial round of greedy accumulation. The exploitation process was then continued in a multihierarchical manner until all variants in the cluster were traversed or no further improvements were found ( Figure 1B). Whether a combination stage was positive or deleterious was conditional on the basis of two parameters, i.e., thermostability and enzyme activity. If the combined variant showed high thermostability but seriously reduced activity, it was advisable to make a compromise. After five cycles of the accumulations, the best hit (S214H-I168R-W159H-S188Q-R280A, M5) in the 1 st 5 cluster was obtained, which had a T m value of 71 °C. Further combination of the remaining mutations in the 1 st cluster in M5 failed to increase the T m value. Subsequently, the M5 variant was used as the template for crossover of mutations in the 2 nd cluster, leading to the best hit variant S214H-I168R-W159H-S188Q-R280A-A180I-G165A-Q119Y-L117F (IsPETase-M9) with a T m value of 75 °C. Although the combination of the remaining mutations in the 2 nd cluster 10 slightly increased the T m values, degradation activities were largely reduced compared to that of M9 ( Figure S4). In addition, the introduction of the S187W mutation to the best hit in each round resulted in extremely decreased expression. Therefore, M9 was set as the parent for the accumulation of other sites in the 3 rd cluster. The most thermostable variant, S214H-I168R-W159H-S188Q-R280A-A180I-G165A-Q119Y-L117F-T140D (IsPETase-M10, referred to as 15 DuraPETase), originating from the 3 rd cluster, exhibited dramatically enhanced thermostability (ΔT m = 31 °C). Before addressing the above challenge, we also explored a stepwise combination of the most stabilizing variants (ΔT m > 7 °C). The stability of the S214H-D186H-I168R mutant was dramatically decreased, and the introduction of the P181A mutation largely reduced the enzymatic activity. Progressively addressing these multiple sites in the GRAPE scheme with 20 multilayer upward branches allows negative trade-offs to be determined and quantified and thus finds effective ways to circumvent apparent dead ends.

Enhanced PET degradation activity toward semicrystalline PET film
With the significantly enhanced performance of DuraPETase, we can explore the anticipated PET digestion of semicrystalline PET film (23%) ( Figure S5). DuraPETase and the wild-type 25 enzyme were incubated at various temperatures ranging from 37 °C to 60 °C ( Figure 2A). The degradation activity profiles largely reflect the difference in inherent robustness between the designed enzyme and the wild type. IsPETase lost complete activity within 24 h at 37 °C, and the concentration of the degradation product reached 12 μ M and did not increase thereafter, which is consistent with the results in Kim's report 16 . When the temperature increased, the activity of 30 IsPETase dropped sharply within 1 h due to thermal-induced denaturation. As expected, the enhanced thermostability of the DuraPETase coincided with remarkable improvement in longterm survival at moderate temperatures, and the enzyme even withstood elevated temperatures up to 60 °C for 3 days of incubation. The total released compound concentrations reached up to 3.1 mM when the incubation time was extended to 10 days at 37 o C, which was a 300-fold 35 increase over the concentrations of compounds released by IsPETase. (Figure 2B).
The high performance of DuraPETase suggested that its tolerance to organic solvents, especially ethylene glycol (EG) would be substantially enhanced, which can decrease product inhibition during the depolymerizing process. As shown in Figure 2C, DuraPETase exhibited a marked increase in tolerance to methanol (MeOH) and EG. The degradation activity was 40 enhanced by 1.6-fold and 1.1-fold at 5% and 10% (v/v) MeOH, respectively, and approximately 30% of residual activity was maintained at 20% (v/v) MeOH. When incubated with EG solvent, DuraPETase maintained most of its activity at 5%, 20% and 50% (v/v) EG and even increased its degradation activity up to 1.5-fold when incubated with 10% (v/v) EG. This promoted performance of DuraPETase in co-solvents enhances the potential of this enzyme for auxiliary biodegradation in glycolysis and alcoholysis applications.
Polymer hydrolysis was also confirmed by scanning electron microscopy (SEM) images of the film surfaces following enzymatic hydrolysis for 10 days ( Figure 2D). The buffer-only control sample exhibited a smooth and uniform surface without characteristic surface defects. 5 The digested PET sample treated with IsPETase showed visual modifications with some irregular grooves of different sizes. Remarkably, severe erosion occurred with the formation of highly porous foam structures on the film surface after 10 days of degradation by DuraPETase, indicating that the biodegradation behavior was not limited to the amorphous area. The hole sizes after degradation were increased when 5% MeOH was added, and there were many spiny 10 fragments in the corroded area. The film surface visually became nonuniform and partially broke into pieces when touched ( Figure S6).

Complete degradation of microplastics by DuraPETase
Pollution of the marine environment by micro-and even nanoplastics has become ubiquitous. Their persistence continues to increase, as they are extremely difficult to recognize 15 and collect due to their reduced visibility. The high degradation ability of DuraPETase allows us to explore whether complete degradation of nanoplastics or even microplastics can be achieved. The total released product determined by HPLC indicated complete degradation of nanoplastics ( = 50-100 nm, 0.21 g/L) by DuraPETase within 1 h at 37 °C ( Figure 3C). While this observation is encouraging, much slower degradation of PET microplastic was observed during 20 the incubation period. Figure 3E shows photographs of microplastic samples ( = 10-50 μ m, 0.29 g/L) biodegraded over a period of 20 days. The light transmittance of microplastic solvent exhibited a slow increase. After 20 days of incubation, the microplastics degraded by DuraPETase almost disappeared with respect to microplastics added to a buffer-only control, which maintained a white color. Although the IsPETase-degraded sample was slightly clarified, 25 its optical transparency was significantly lower than that of samples incubated with DuraPETase. In addition, HPLC analysis of the degradation product concentration reached 1.26 mM, which also indicated > 80% degradation of microplastics by DuraPETase.

Extending the potential applications of DuraPETase
To extend the potential applications of DuraPETase, we also examined the use of this 30 enzyme for the degradation of other semiaromatic polyesters, including PEN and PBT. PEN is similar to PET in chemical structure but contains naphthyl rings, which produce much stiffer molecular chains than the phenyl rings of PET. Thus, PEN exhibits a high T g (120 °C) and good tensile and barrier properties 39 . PBT exhibits higher tensile, flexural and dielectric strengths and faster, more economical molding characteristics than many thermosets. Although the degradation 35 product of PBT is TPA, the longer aliphatic chain in butanediol than in EG endows the PBT's excellent resistance to a broad range of chemicals at room temperature (T g ≈ 50 °C) 40 .
Biodegradation of PBT was barely accomplished by IsPETase due to the notably high specificity of the enzyme toward aromatic groups rather than aliphatic chains (Figure 4). However, a gradual increase in degradation product release was found for DuraPETase at 37 °C, 40 although the degradation activity was substantially reduced with respect to the degradation of PET. SEM analysis also supported this conclusion. In contrast to the slow degradation of PBT, significant enhancement of PEN degradation by DuraPETase was observed. Over 10 days of incubation, DuraPETase produced a maximum product concentration of 48 μ M with PEN film, presumably reflecting the high hydrolysis specificity of DuraPETase toward semiaromatic polyesters due to the redesigned hydrophobic active cleft, even though the degradation activity was still lower than that targeting PET. As the introduction of the naphthalene ring into the main chain stiffens the polymer chains and largely enhances their dielectric and mechanical properties, PEN holds potential for food packaging, high-performance industrial fiber, and flexible printed 5 circuit applications. Different types of such high-performance plastic would eventually be dispersed into the environment. These results therefore inspire further engineering to improve the depolymerization of new classes of semiaromatic polyesters.

Mechanism underlying the improved properties of DuraPETase
To further evaluate the molecular basis for the improved properties of the obtained variant, 10 we determined the crystal structure of DuraPETase at 1.63 Å resolution (PDB ID: 6KY5), which revealed a conserved α /β hydrolase fold, with a core consisting of seven α -helices and nine β sheets. The DuraPETase-PET binding mode predicted by molecular docking was similar to the mode observed in the crystal structure of IsPETase bound to 1-(2-hydroxyethyl) 4-methyl terephthalate (HEMT) 21 . The distance between the carbonyl carbon of the PET polymer and the 15 C α atom of Ser160 was 2.75 Å, which was suggested to be within the relevant reactive distance for nucleophilic attack. The oxyanion of the tetrahedral intermediate is stabilized by an oxyanion hole involving backbone NH groups of Tyr87 and Met161. In addition to the interaction information observed crystallographically, we explored MD simulations to analyze potential effects of the mutations on global structure dynamics. 20 According to the crystal structure and MD simulations, key aspects of the molecular mechanism by which the mutations improve the protein robustness were proposed as follows: introduction of new electrostatic interactions (T140D, W159H, I168R and S188Q), improvement of hydrophobic packing in the protein surface and interior (L117F, Q119Y, A180I, S214H and R280A), and reduction in the conformational entropy of a local coil region (G165A) ( Figure S7). 25 For the T140D mutation, the substitution of aspartate for threonine in T140D mutation conferred new hydrogen bonds with the hydroxyl group of Ser142 in the crystal structure, which maintained for 93.18% of the simulation time. The N ε atom of W159H formed new hydrogenbonding interactions with the backbone oxygen atom of H237 in the crystal structure. However, this W159H mutation was observed to restore the active site by forming a new hydrogen bond 30 with Ser160 in the simulations. The original catalytic residue His237 was concomitantly flipped "up" out of the catalytic triad to play an aromatic stabilization role with PET ( Figure 5B). This reconstruction of the catalytic triad was also suggested by Austin et al. 20 . For I168R and S188Q mutations, new salt bridge interactions were formed between I168R and D186 in the crystal structure. The guanidine group is also suggested to donate new hydrogen bonds to the amide 35 oxygen atom of S188Q in the MD simulations for 10.36% of trajectories to stabilize the enzyme structure, which may explain the negative epistatic interactions when D186H was added to S214H-I168R-W159H (M3) and S214H-I168R-W159H-S188Q (M4).
In addition to electrostatic effects, hydrophobic interactions have a marked influence on enzymatic performance ( Figure S7). It is noteworthy that the benzene ring of the PET molecule 40 stands at an active-site crevice that consists of five mutually contacting phenylalanine, tyrosine, tryptophan, and histidine side chains in DuraPETase-PET binding mode. These key residues may provide an "induced-fit" arrangement of the active site cleft. Residues Tyr87 and Trp185 were observed to form continuous π -π interactions with the aromatic motif of the PET polymer for 54.50% and 31.96% of the entire trajectory, respectively. Simultaneously, L117F and Q119Y were located in proximity to the active site and promoted "T-shaped" π -π interactions with Tyr87 (the oxyanion hole-forming residue) for 71.42% and 68.62% of the trajectory, respectively. The stability of the oxyanion hole may be enhanced, and the tetrahedral intermediate can be further stabilized. Another substitution in the active site, S214H was suggested to prevent the wobbling 5 of Trp185, and it formed an offset parallel π -π stacking interaction with Trp185 for 60.48% of the simulation time. Therefore, the crystal structure and MD simulations suggested a wellorganized hydrophobic domain involving S214H, Trp185, Tyr87, L117F, and Q119Y, which we termed as 'aromatic tunnel' flanker, to finely tune the active-site cleft suitable for binding the aromatic motif of PET. Although the flexibility of the active site region has been largely reduced 10 ( Figure 5B), the hydrophobic substitutions may significantly enhance the binding affinity for PET polymer (Figure 5C), and further enable DuraPETase to hydrolyze ester bonds along semicrystalline PET polymer chains via targeting neighboring aromatic motifs in the polymer chain backbone. 15 Durability, the greatest asset of plastic, has now become a lingering curse that results in plastic remaining in our environment for hundreds of years 8 . Even when physically broken, plastics never truly leave the environment but are present as micro-and nanoplastics that are choking marine life and propagating up the food chain. The biodegradation of plastics under ambient conditions, especially for uncollectable microplastics, is highly desirable to enhance 20 changes in this scenario. To this end, the seminal discovery of IsPETase immediately aroused immense interest from numerous research groups in investigating the mechanistic basis of its catalytic mechanism and in increasing the efficiency and stability of this exciting enzyme. Although combining random mutagenesis with high-throughput screening has proven to be a successful strategy for the modification of enzyme properties, a long-sought alternative to 25 screening-based approaches is reliable in silico design of performance-enhancing mutations, especially for the degradation of insoluble solid synthetic polymers. Over the last 20 years, in silico design based on energy calculations has advanced greatly from fairly simple to increasingly accurate and versatile methods 46-47 and has had a particularly positive impact in the area of protein skeleton optimization. However, the accuracy based on energy functions is still 30 suboptimal because of several factors, including insufficient conformational sampling of the static structure, imbalances in the force fields, and intrinsic problems with existing data sets 28 . Although the drawbacks can be mitigated by using hybrid methods that incorporate complementary statistical-based approaches such as ABACUS, most stability strategies focus on single-point-mutation or simple stepwise combination processes, resulting in increased 35 prediction errors upon application to multiple-point mutants. Whenever epistatic effects are present, the predictions are prone to frustrate optimization processes.

Discussion
The GRAPE strategy proposed in this study represents a step forward in the computational enzyme design because of its capability to reduce the risk of combining mutations with antagonistic effects and independency of specific structure. In most cases the combination 40 process fails to immediately find efficient pathways when coupled mutations have negative epistatic interactions. However, simultaneously making all possible paths creates an exponentially larger library. A systematic clustering and greedy combination strategy that has been successfully utilized in machine learning therefore provides a way to tap into the possible beneficial pathways in a well-defined library by reducing the dimensionality of the data. The computationally redesigned DuraPETase enzyme described herein has substantially improved operational robustness and semicrystalline PET degradation activity, which are important properties for its potential use in real-case applications such as surface modification of PET fibers in the textile industry and in situ disposal of microplastics. Both the crystal structure and MD simulations suggest an underlying 'aromatic tunnel' mechanism to explain the promoted 5 enzymatic performance via a structural view, indicating a significant effect of the synergy between various beneficial mutations on stability and activity enhancement. Furthermore, Zimmermann et al. has hypothesized combinatorial exo-and endo-type or endo-type-only degradation mechanisms for different PET microstructures when they are incubated with TfCut2 near the PET glass transition temperature 48 . Encouraged by the impressive performance of 10 DuraPETase, researchers are expected to investigate the degradation mechanism of scrap PET based on this robust enzyme at mild temperatures.
In summary, this work shows that a collection of subtle variations identified by in silico approaches provides clues regarding how to design a PET hydrolase to improve its incorporation of semicrystalline aromatic polyesters. We believe that the proposed GRAPE strategy constitutes 15 a significant advance in enzyme design methodology that is complementary to traditional computational design strategies, which often focus on exquisitely tuned energy evaluations of single mutations. The variant DuraPETase, designed here by the GRAPE strategy, serves as a useful catalyst for efficient PET degradation at moderate temperatures and opens up avenues for research in decreasing environmental microplastic accumulation and understanding PET 20 microstructure alterations upon enzymatic erosion. Despite the aforementioned achievements, complete degradation of plastic waste still presents a number of challenges. There is an urgent need to further research this topic via studies of issues such as coupling DuraPETase with downstream enzymes, degrading the TPA and EG product, and scaling up the combination degradation system for industrial applications.

Materials and Methods
Tables S1-S3 Figures S1-S8 References (S1-S16) Scheme 1 Schematic representation of the GRAPE strategy. In step 1, stabilizing mutations are generated with multiple algorithms. The computational designs with typical known pitfalls are eliminated. Then, the remaining designs are selected for experimental validation.
Step 2 characterizes the variants according to their positions, efficacies and presumed effects.