Abstract
The yeast Komagataella phaffii is widely used as a microbial host for heterologous protein production. However, molecular tools for this yeast are largely restricted to a few integrative and replicative plasmids. Four sequences that have recently been proposed as the K. phaffii centromeres could be used to develop a new class of mitotically stable vectors. In this work we designed a color-based genetic assay to investigate genetic stability in K. phaffii. Plasmids bearing each centromere and the ADE3 marker were evaluated in terms of mitotic stability in an ade2/ade3 auxotrophic strain which allows plasmid screening through colony color. Plasmid copy number was verified through qPCR. Our results confirmed that the centromeric plasmids were maintained at low copy number as a result of typical chromosome-like segregation during cell division. These features, combined with high transformation efficiency and in vivo assembly possibilities, make these plasmids a promising new addition to the K. phaffii genetic toolbox.
Author summary
The methylotrophic yeast Komagataella phaffii is considered one of the most important platforms for the production of proteins and metabolites. In this study we sought to adapt a color-based genetic system widely used in other yeasts to assess the mitotic stability of vectors carrying the proposed K. phaffii centromeres. First, we constructed a K. phaffii strain (LA3) mutant for ADE2 and ADE3; this strain forms white colonies and turns red when transformed with a vector (pPICH-ADE3) carrying ADE3. Next, the four K. phaffii centromeres were cloned into pPICH-ADE3 and tested in LA3 for copy number and plasmid stability. Centromeres are responsible for proper chromosome segregation during cell division, hence guaranteeing that both daughter cells receive one copy of the duplicated DNA. Our results show that three K. phaffii centromeres behaved as expected, conferring extra stability to the replicative plasmids and maintaining them at low copy number. Once characterized, centromeres can be used as parts in the construction of advanced genetic manipulation tools, thus allowing the construction of strains capable of expressing large metabolic pathways for the production of complex biochemicals.
Introduction
Komagataella phaffii is a methylotrophic yeast of great industrial importance which has been used for more than 30 years as a heterologous protein production platform [1]. Its genome was first published in 2009 and has since then been refined and thoroughly studied [2,3]. As a result, in addition to a protein factory, K. phaffii has also been widely considered as a platform for the production of chemicals, biopharmaceuticals, vitamins and other molecules. However, the construction and regulation of new pathways demand complex molecular biology tools which are not readily available for this yeast [4].
K. phaffii genetic manipulation traditionally involves the use of shuttle vectors assembled in Escherichia coli and subsequently integrated into the yeast’s genome [5]. Recent studies have described the development of a wide range of genetic parts for use in this yeast, as well as new methods of plasmid assembly and transformation [6]. An alternative to integrative strategies is the use of replicative plasmids, which are usually based on the well-known ARS1 sequence [1]. These plasmids may overcome some drawbacks such as genetic instability in multi-copy strains and non-specific integration [7,8]. In addition, they present higher transformation efficiency when compared to integrative vectors and can be assembled by in vivo recombination, which eliminates the need for bacterial transformation [9,10]. However, replicative plasmids show low mitotic stability when compared to integrative vectors and few vector options are available for use [11]. Stability problems can be circumvented by the creation of centromeric plasmids, which may provide proper segregation during mitosis. A greater mitotic stability as well as low copy number allow stable and constant protein expression [12]. Centromeric plasmids can be constructed in vivo, allowing the assembly and cloning of large sequences including whole metabolic pathways and regulatory regions [13]. Therefore, the construction of such vectors would be of great value for K. phaffii strain development in the context of synthetic biology.
Centromeres are typically surrounded by large heterochromatin sections in most organisms [14]. Their structure ranges from simple “point” centromeres of only ~125 bp in Saccharomyces cerevisiae to epigenetic, sequence-independent centromeres, such as those present in plants and animals. The reason for this diversity is that, for most eukaryotes, centromeres are maintained epigenetically and not genetically. Sequence homologies are rare within and between species, hampering the definition of a consensus sequence. In addition, some DNA regions can be centromeric or not depending on their function in previous cell cycles, which highlights the epigenetic nature of the centromere [15].
As for non-conventional yeasts there are wide variations in centromere size and structure. Candida glabrata has centromeres that show some homology to the CDEI and CDEIII regions of S. cerevisiae while Kuraishia capsulata centromeres have 200-bp conserved sequences [16,17]. On the other hand, Candida tropicalis, Schizosaccharomyces pombe and Candida albicans have regional centromeres named after their sizes which range from 3 to 110 kb [18–20].
K. phaffii centromeres have recently been identified, bearing no sequence similarities to those of any other yeast [3]. Since centromere function relies strongly on its structure rather than on its sequence, a centromere-specific histone H3 variant (CSE4) was used in the search for centromeric regions in K. phaffii. A CSE4 homolog was identified in chromosome 2 and tagged with a fluorescence marker. The corresponding nuclear localization of the histone-DNA complex indicated a centromere pattern typical of budding yeasts [3]. Tridimensional conformation analysis followed the centromere clustering pattern observed in yeasts and narrowed down all four K. phaffii centromere locations to 20 kb windows [21].
Considering that a low transcription rate is typical of centromeric regions, RNA-seq analysis made it possible to pinpoint the putative locations of all four K. phaffii centromeres [3]. Similarly to C. tropicalis and S. pombe, K. phaffii centromeres are formed by inverted repeats. All four sequences have two inverted repeats of ~2.5 kb, separated by a central segment of 800 to 1300 bp. Chromatin immunoprecipitation sequencing analysis showed that the CSE4 histone binds preferentially to the central region, but also along the inverted repeats [22].
K. phaffii centromeric sequences contain early replication peaks with autonomously replicating sequences, characteristics that are also observed in centromeres of other yeasts [23,24]. According to recently published studies, there are native ARS sequences contained within centromeres 2, 3 and 4. These comprise regions within the inverted repeats, as well as unique adjacent sequences [22,25].
In order to expand the functional analysis of the K. phaffii centromeres we sought in this study to develop a genetic system based on an ade2/ade3 auxotrophic strain and a replicative vector carrying the wild-type ADE3. Vectors carrying each individual centromere were used to assess plasmid copy number and mitotic stability.
Results and Discussion
In yeasts, the adenine synthesis pathway is used as a tool for auxotrophic selection, as a gene copy number indicator and for plasmid stability analysis [26]. Many genes from this pathway have been deleted in S. cerevisiae in order to create auxotrophic strains, while in K. phaffii studies have only focused on ADE1 and ADE2 [27,28]. K. phaffii LA2, a strain mutant for ADE2 [29], was used as a starting point for the construction of a strain that would allow plasmid stability verification. Deletion of ADE2 results in cells auxotrophic for adenine which accumulate a red pigment [26], while deletion of genes located upstream, such as ADE1 or ADE3, should prevent the formation of such pigment [27]. As expected, the deletion of ADE3 in the LA3 strain results in white colonies (Fig 1). Deletion of ADE3 in S. cerevisiae has regulatory effects on the histidine synthesis pathway [30]. Consequently, ade2 ade3 strains are not only auxotrophic for adenine, but also for histidine. In order to verify whether this phenotype also occurs in K. phaffii, we plated strains X-33, LA2 and LA3 on MD medium without supplementation, comparing growth and colony color to cells plated on MD medium with adenine and histidine (Fig 1). The LA3 strain displayed the expected histidine auxotrophy phenotype, showing that the adenine-histidine pathways in K. phaffii and S. cerevisiae have common characteristics.
In order to assess plasmid stability, we first constructed plasmid pPICH-ADE3 bearing the ADE3 gene (Fig 2). When transformed with pPICH-ADE3, LA3 cells should return to being red, and any change in colony color would allow a simple screening of plasmid loss [26]. Although adenine auxotrophy has been explored for other purposes in K. phaffii [28], this particular color-based system has not yet been used for measuring plasmid stability in this yeast.
pPICH-ADE3 was used for cloning all four K. phaffii centromeres. Since it proved extremely difficult to amplify entire centromeric regions, we designed a strategy to amplify centromeres in halves in order to reduce fragment size and to avoid primer annealing inside the inverted repeats (Fig 3). Amplified fragments carried overlapping regions at their ends that would allow recombination with each other and with vector pPICH-ADE3. Centromeric primers were designed using the K. phaffii GS115 genome sequence as reference [2]. The amplified regions corresponded to the following chromosomal coordinates: chromosome 1 position 1401429-1406917 (5488 bp); chromosome 2 position 1543739-1550657 (6918 bp); chromosome 3 position 2204800-2211493 (6693 bp) and chromosome 4 position 1703369-1709958 (6589 bp).
Centromeric sequences are known as early replication regions and according to recently published studies there are native ARS sequences contained within centromeres 2, 3 and 4 [11, 22]. Fig 4 shows the relative positions of the ARS sequences within and around the K. phaffii centromeres. In chromosome 2, ARS are located on coordinates 1543374-1543971 (597 bp) and 1549967-1551156 (1189 bp). These sequences were partially amplified in this work, containing 232 and 690 bp, respectively. As for chromosome 3, there is an ARS located on coordinates 2204369-2205185 (816 bp) which was also partially amplified (385 bp). Chromosome 4 has an ARS on coordinates 1703466-1704103 (637 bp) which was fully amplified, as well as a partially amplified ARS (840 bp) located on coordinates 1709118-1710114.
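The amplified portion of each ARS follows directly from the overlap between the ARS coordinates and the amplified centromeric region. The sketch below is only an illustration of that arithmetic, not part of the original analysis; the function names are hypothetical, and lengths are computed as end minus start, which is the convention the lengths quoted in the text appear to follow.

```python
# Amplified centromeric regions (GS115 coordinates quoted in the text).
AMPLIFIED = {
    "chr2": (1543739, 1550657),
    "chr3": (2204800, 2211493),
    "chr4": (1703369, 1709958),
}

# Native ARS coordinates quoted in the text.
ARS = [
    ("chr2", 1543374, 1543971),
    ("chr2", 1549967, 1551156),
    ("chr3", 2204369, 2205185),
    ("chr4", 1703466, 1704103),
    ("chr4", 1709118, 1710114),
]


def overlap_bp(region, start, end):
    """Overlap between region and (start, end), measured as end - start."""
    lo = max(region[0], start)
    hi = min(region[1], end)
    return max(0, hi - lo)


def ars_coverage():
    """Return (chrom, start, end, ars_length, amplified_bp, fully_amplified)."""
    rows = []
    for chrom, start, end in ARS:
        length = end - start
        covered = overlap_bp(AMPLIFIED[chrom], start, end)
        rows.append((chrom, start, end, length, covered, covered == length))
    return rows


if __name__ == "__main__":
    for row in ars_coverage():
        print(row)
```

Under these assumptions the computed overlaps reproduce the partially amplified lengths given above (232, 690, 385 and 840 bp) and flag the first chromosome 4 ARS as fully contained in the amplified region.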
LA3 strain was individually transformed with pPICH-ADE3 and all four centromeric plasmids (pPICH-CEN1-4). Plasmids pPICH-CEN1, 2 and 4 were verified for autonomous replication by plasmid rescue in E. coli, while the circular structure of pPICH-CEN3 was confirmed through a set of overlapping PCRs since CEN3 was the only centromere that could not be cloned directly in bacteria.
Plasmid stability was first verified through colony color in non-selective medium (Fig 5). When plated on YPD non-selective medium, colonies transformed with pPICH-ADE3 lost their color rapidly and presented a red center with large white edges, a result consistent with plasmid instability. In contrast, strains transformed with any of the centromeric plasmids presented a uniform red coloration throughout the colony.
Further stability examination of the centromeric plasmids was performed by growing cells in liquid YPD medium for 144 hours. After diluting and plating cultures on non-selective medium, red and white colonies were counted and compared between constructions (Fig 6). The LA3 strain transformed with pPICH-ADE3 did not yield red colonies in any growth period, indicating that the plasmid was mitotically unstable. Conversely, centromeric plasmids presented a higher mitotic stability than pPICH-ADE3. After 96 hours of growth, cells with pPICH-CEN1 started to present white colonies, while all other centromeric plasmids remained stable. After 144 hours, pPICH-CEN1 was lost in most colonies while the other centromeric plasmids were lost in <10% of cells. The instability of pPICH-CEN1 could be related to the absence of an autonomously replicating sequence within the centromere, since all other centromeres were cloned with at least a partially amplified ARS. The original replicating sequence in the pPICH-ADE3 plasmid, ARS1, has been shown to be less efficient than more recently described ARS sequences; therefore, the additional ARS contained in the centromeres could have enhanced the mitotic stability of the centromeric plasmids [11].
Yeast centromeric plasmids are known to have a higher mitotic stability under non-selective conditions than common replicative vectors, since they are equally segregated between daughter cells and therefore provide a uniform culture of cells containing the plasmid [26]. A centromeric vector containing K. phaffii CEN2 has been constructed and it presented an enhanced stability when compared to a replicative plasmid [25]. In addition to K. phaffii and S. cerevisiae, centromeric plasmids have been developed for other yeasts such as S. pombe, C. glabrata and Scheffersomyces stipitis and in all cases enhanced plasmid stability under non-selective conditions was verified [12,17].
Yeast replicative plasmids are normally replicated but are unevenly distributed between daughter cells, which creates both multi-copy and plasmidless cells [26]. Under selective conditions, cells lacking the plasmid are unable to survive and the result is a population of multi-copy plasmid-containing cells. The construction of centromeric plasmids should provide better plasmid segregation and stability and cells should maintain a low and stable plasmid copy number during yeast growth [31]. Plasmid copy number was assessed by qPCR after strains were grown in YPD medium containing zeocin in order to ensure that all cells assayed were harboring the centromeric plasmids. The results were compared to LA3 strain transformed with pPICH-ADE3 also grown in selective medium and to the LA3 control strain, grown in YPD medium. Results from qPCR (Fig 7) indicate that the strain transformed with centromeric plasmids carried 1-2 copies per cell while the replicative plasmid was present at approximately 25 copies per cell. The difference between plasmid copy number for the replicative vector and all centromeric vectors was significant according to a t-test (p<0.05). This result illustrates the expected segregation pattern described above for growth in selective conditions and, together with the mitotic stability analysis, provides a clear picture of K. phaffii genetic manipulation using centromeric plasmids.
S. cerevisiae centromeric plasmids, in comparison to plasmids bearing the 2 µm sequence, presented the same difference in copy number when auxotrophic markers were used. However, when the kanMX G418 resistance marker was used, plasmid copy number did not differ between centromeric and replicative plasmids [36]. This indicates that factors other than the type of replication origin can influence plasmid copy number. In a previous study, K. phaffii was transformed with replicative and integrative vectors bearing centromere 2 and, unlike our results, a low copy number of vector sequences was observed in both cases [25]. The reason for this discrepancy is unclear but it could be related to strain variation or plasmid constructions used since pPICH-CEN1-4 contained the ADE3 gene in addition to ARS1 and the zeocin-resistance marker.
Overall, our results indicate that centromeric plasmids could be employed as a new tool for genetic manipulation of K. phaffii. Plasmids were maintained for long periods in non-selective medium, indicating that growth can be performed without the addition of antibiotics or any form of selective pressure. The centromeric plasmids’ low copy numbers per cell characterize a stable and homogeneous culture that can provide reliable expression results. Finally, their structure as a circular molecule allows in vivo plasmid assembly with relatively short homologous sequences when compared to genomic integration techniques where sequences have to be much longer for directed homologous recombination. Simpler assembly may also facilitate the construction of larger and more sophisticated vectors such as yeast artificial chromosomes (YAC) whose stability features may be also analysed by the color-based assay described in this work.
Methods
Strains and Media
DNA cloning was performed using chemically competent Escherichia coli XL-10 Gold (Agilent Technologies) grown in LB medium (5 g L−1 yeast extract, 10 g L−1 peptone and 10 g L−1 NaCl, pH 7.2). When needed, agar was added to a final concentration of 1.5%. When zeocin (25 μg mL−1) was used for bacterial antibiotic selection, NaCl concentration was reduced to 5 g L−1.
K. phaffii strains were derived from X-33 (Invitrogen). LA2 strain (amd2 ade2) was described in a previous work [29]. Yeast was routinely grown in YPD medium (10 g L−1 yeast extract, 20 g L−1 peptone and 20 g L−1 glucose). Solid medium used 2% agar. Zeocin and G418, when used, were added at 100 μg mL−1 and 500 μg mL−1, respectively. Hygromycin B was used to a final concentration of 50 μg mL−1. Minimal medium (MD) used 0.34% Yeast Nitrogen Base, 1% (NH4)2SO4, 2% glucose, 0.00004% biotin and 0.0002% adenine or 0.004% histidine, when needed.
PCR
DNA was amplified using Invitrogen Platinum Taq DNA Polymerase (High Fidelity), Promega GoTaq Colorless Master Mix or Sigma-Aldrich Accutaq LA DNA Polymerase. All primers used in this work are shown in Table 1.
DNA manipulation
All basic DNA manipulation and analysis were performed as previously described [32]. Restriction digestion was performed in accordance to the manufacturer instructions (New England Biolabs), as well as vector dephosphorylation with Shrimp Alkaline Phosphatase (Promega) and ligation with T4 DNA ligase (USB). In-Fusion Cloning Kit (Clontech) was used for in vitro assembly of plasmids. Site-directed mutagenesis was performed using the Transformer Site-Directed Mutagenesis kit (Clontech). PCR and gel purification used Promega Wizard SV Gel and PCR Clean-Up System.
Quantitative PCR (qPCR)
Strains harboring the zeocin resistance plasmids were grown to an OD (optical density measured at 600 nm) of 1 in 10 mL YPD containing zeocin, while LA3 was grown in 10 mL YPD. Cells were collected by centrifugation at 2000 × g for 5 minutes. The cell pellet was resuspended in 1 mL 0.25% SDS and incubated at 98°C for 8 minutes according to a previous work [25]. Finally, cell debris was removed by centrifugation and DNA was diluted 10-fold in water before qPCR reactions.
Quantitative PCR reactions used primers qZEO-F and qZEO-R for plasmid quantification and qHIS-F and qHIS-R as an internal single-copy control. Assays were carried out with iTaq Universal SYBR Green Supermix (Bio-Rad) in a Rotor-Gene Q (Qiagen) thermal cycler. Analysis used the absolute quantification method and standard curves that ranged from 1×10^4 to 1×10^8 copies of the gene of interest. pPIC9 (Invitrogen) and pPICH linearized plasmids were used for construction of the standard curves.
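As an illustration of how absolute quantification with standard curves can yield a copy number per cell, the following minimal sketch fits a linear standard curve (Ct against log10 of copies) and divides the plasmid copies by the copies of the single-copy reference. It is not the analysis pipeline used in this work: the function names and all Ct values are hypothetical, and it assumes that one copy of the reference gene corresponds to one haploid genome.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate template copies from a Ct."""
    return 10 ** ((ct - intercept) / slope)


def plasmid_copies_per_cell(ct_plasmid, ct_reference, plasmid_curve, reference_curve):
    """Ratio of plasmid copies to single-copy reference copies."""
    plasmid = copies_from_ct(ct_plasmid, *plasmid_curve)
    reference = copies_from_ct(ct_reference, *reference_curve)
    return plasmid / reference


# Hypothetical standards spanning 1e4 to 1e8 copies, as in the range above.
STANDARD_COPIES = [1e4, 1e5, 1e6, 1e7, 1e8]
HYPOTHETICAL_CT = [24.8, 21.5, 18.2, 14.9, 11.6]  # illustrative values only

if __name__ == "__main__":
    curve = fit_standard_curve(STANDARD_COPIES, HYPOTHETICAL_CT)
    # The same curve is reused for both targets purely for illustration.
    print(plasmid_copies_per_cell(18.2, 21.5, curve, curve))
```

In practice each target would have its own standard curve, and replicate Ct values would be averaged before conversion.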
Yeast transformation
K. phaffii was electroporated following two different protocols. For integrative cassettes, we followed the Pichia Expression Kit protocol (Invitrogen) and when using replicative plasmids, we proceeded as described previously [33].
Construction of an ade2 ade3 strain for color-based stability assays
Strain LA2 [29] was transformed with an ADE3 deletion cassette and had the marker recycled before moving on with the centromeric plasmid transformations.
Construction of the deletion cassette used PCR reactions assembled by an “In-Fusion” cloning reaction. Briefly, primers ADE3up-F and R; ADE3dw-F and R were used for PCR amplification of 491 bp and 582 bp, respectively, from K. phaffii genome. These reactions amplified sequences used for directing homologous recombination and substitution of the complete ADE3 coding sequence. Meanwhile, primers ADE3lox-F and ADE3lox-R amplified the kanR G418 resistance cassette from plasmid pGKL [34]. PCR fragments were assembled and cloned into pBluescript II SK+ linearized with SmaI. A final PCR reaction using primers ADE3up-F and ADE3dw-R amplified the whole deletion cassette which was used for transformation of K. phaffii LA2. Cells were selected in YPD containing G418.
The resulting ade2/ade3 strain was later transformed with pYRCre2 [35] and selected in YPD supplied with hygromycin B. This step promoted a Cre-mediated excision of the kan cassette thus eliminating G418 resistance. After PCR confirmation of marker recycling using primers ADE3conF and ADE3conR, the resulting strain was plated in non-selective YPD medium, causing loss of the pYRCre2 plasmid. The resulting strain was named LA3.
Construction of centromeric plasmids containing ADE3
Plasmid pPICH [29], which is derived from pPICHOLI (MoBiTec), contains the ARS1 replicating sequence [1]. This sequence is originally located on K. phaffii GS115 chromosome 2, coordinates 413701-413856 [2]. The plasmid was digested with NotI for cloning of the K. phaffii native ADE3 gene. The complete gene was amplified from X-33 DNA using primers ADE3up-F and ADE3dw-R and then digested with NotI. After vector dephosphorylation, fragments were ligated and transformed into E. coli XL-10 Gold. One positive clone was then subjected to site-directed mutagenesis using primers Mut1(Hpa) and Mut2(Bam) for removal of the BamHI restriction site present within the ADE3 coding sequence. The final plasmid containing ARS1, the Sh ble resistance marker and ADE3 was named pPICH-ADE3.
pPICH-ADE3 was digested with BamHI for cloning of all four K. phaffii centromeres. These were amplified from K. phaffii X-33 genomic DNA using two PCR reactions for each centromeric sequence. Primers Cen1/2/3/4-F and Cen1/2/3/4c-R amplified the first inverted repeat of each centromere while primers Cen1/2/3/4c-F and Cen1/2/3/4-R amplified the other half of the sequences. In order to promote in vitro/in vivo assembly amplicons had approximately 80 bp homology between each other and 15 bp with pPICH-ADE3. Firstly, we attempted an “In-Fusion” cloning reaction for each of the four centromeres using linearized pPICH-ADE3 and the two PCR fragments. Plasmids were extracted and analyzed by restriction digestion. Centromeres 1, 2 and 4 were successfully assembled and cloned into the plasmid through this strategy.
Centromere 3 did not yield any E. coli clones following the “In-Fusion” reaction; therefore, we proceeded to an in vivo assembly strategy. Primers Cen370-F and Cen3c-R; Cen3c-F and Cen370-R amplified both inverted repeats adding 70 bp of homologous sequences between the fragments and pPICH-ADE3. Finally, we transformed K. phaffii LA3 using the linearized vector and both centromeric fragments, using 85 bp of homology for directing recombination. Clones were selected in YPD supplied with zeocin.
The resulting plasmids were named pPICH-CEN1, pPICH-CEN2, pPICH-CEN3 and pPICH-CEN4. All plasmids were transformed into K. phaffii LA3 for subsequent stability and quantification assays.
Stability analysis
LA3 strain transformed with each of the four centromeric plasmids was grown in 20 mL YPD for 16 hours at 28°C and 200 rpm. This culture was inoculated into 20 mL YPD to an initial OD of 0.1. After 24 h of growth under the same conditions the culture was used as inoculum for another flask containing 20 mL YPD to an OD of 0.1. This procedure was repeated every 24 h until a total of 144 hours. At 96 and 144 hours of growth, a culture sample was diluted 10^6-fold and 100 μL of this dilution were plated on YPD. Plates were incubated at 30°C for 72 h.
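The colony counts from this assay can be converted into a plasmid retention percentage and a viable cell estimate. The sketch below is an illustration only, not the scoring procedure of this work: the function names and the example counts are hypothetical. It assumes that red colonies retained the ADE3 plasmid and white colonies lost it, and it uses the 10^6-fold dilution and 100 μL plating volume described above as defaults.

```python
def plasmid_retention(red, white):
    """Percentage of colonies that retained the ADE3 plasmid (red colonies)."""
    total = red + white
    if total == 0:
        raise ValueError("no colonies counted")
    return 100.0 * red / total


def cfu_per_ml(colonies, dilution_factor=1e6, plated_volume_ml=0.1):
    """Viable cells per mL of the undiluted culture."""
    return colonies * dilution_factor / plated_volume_ml


if __name__ == "__main__":
    # Hypothetical plate: 138 red and 12 white colonies.
    print(plasmid_retention(138, 12))
    print(cfu_per_ml(138 + 12))
```

Sectored colonies (red center, white edge) would need a separate scoring rule, which is not covered here.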