Single-cell characterization of transcriptomic heterogeneity in lymphoblastoid cell lines

Lymphoblastoid Cell Lines (LCLs) are generated by transforming primary B cells with Epstein-Barr Virus (EBV) and are used extensively as model systems in viral oncology, immunology, and human genetics research. In this study, we characterized single-cell transcriptomic profiles of five LCLs and present a simple discrete-time simulation to explore the influence of stochasticity on LCL clonal evolution. Single-cell RNA sequencing revealed substantial phenotypic heterogeneity within and across LCLs with respect to immunoglobulin isotype; virus-modulated host pathways involved in survival, proliferation, and differentiation; viral replication state; and oxidative stress. This heterogeneity is likely attributable to intrinsic variance in primary B cells and host-pathogen dynamics. Stochastic simulations demonstrate that initial primary cell heterogeneity, random sampling, time in culture, and even mild differences in phenotype-specific fitness can contribute substantially to dynamic diversity in populations of nominally clonal cells.


40
Lymphoblastoid Cell Lines (LCLs) are immortalized cells prepared by in vitro transformation 41 of resting primary B cells from peripheral blood with Epstein-Barr Virus (EBV). 1,2 LCLs are used growth arrest versus immortalization of early-infected cells remains elusive. 27  immortalization. Indeed, we recently described a gene expression program having low expression 84 of LMP1 and NFB targets which was unique to early infection (Latency IIb) relative to an 85 otherwise identical population of LCLs. 30 The wide distribution in LMP1 and NFB target 86 expression levels within an LCL has been characterized and ascribed to the dynamic sampling of 87 a distribution of immune evasive states, at the fringes of which growth and survival can be 88 compromised. 31-33 5 In this study, we characterize the transcriptomic profiles of five different LCLs with single-cell 90 resolution to assess inter-and intra-sample heterogeneity. Four of the sampled LCLs (two in-91 house and two commercial cell lines) were transformed with the prototypical B95-8 strain of EBV 92 derived from an infectious mononucleosis patient, 34 while a fifth sample (in-house) was prepared 93 from cells transformed with the M81 strain isolated from a human nasopharyngeal carcinoma 94 (NPC) sample. 35,36 Primary cells used in establishing the five LCLs were isolated and transformed 95 from a total of four donors; cells from one donor were transformed concomitantly to establish 96 LCLs with each of the tested EBV strains. We observe B cell genetic heterogeneity in the form of 97 differential heavy chain isotype expression across LCLs and, in three instances, within a sample.

209
Finally, the viral EBNA2 and EBNA3 proteins are responsible for transcriptional regulation that 210 we specifically interrogated within the single cell data. The direct EBNA2 targets RUNX3 and 211 FCER2/CD23 correlated with NFB expression (Figure 2 -figure supplement 6). 44

312
Mean phenotype proportions are generally conserved, whereas trial standard deviation 313 decreases as the sample size increases (trials = 25, rounds = 300, n = 100, 500, 1000, or 5000 314 cells). This is generally expected, since undersampling increases the likelihood that phenotype 315 frequencies in the drawn sample will deviate from those of the population, even in the case of 316 replacement.

317
It is notable that minor differences in relative fitness (1-2%) can lead to dramatic changes in 318 isotype distributions over time ( Figure 5C). The rate of such change is proportional to the 319 magnitude(s) of fitness differences (n = 1000 cells, rounds = 300). Four randomly selected clonal 320 evolution trajectories realized with a modest fitness advantage (2%) for class-switched cells (IgA, proportions ( 1-10%, mean  5% and  0.5-4.5%, mean  2%, respectively). 29 55 The disparity in 368 isotype frequencies is notable since these samples were transformed, cultured, prepared, and 369 sequenced in parallel (i.e., under equivalent conditions and within the same interval).

370
The polyclonality exhibited within LCL 777 B95-8 and LCL 777 M81 contrast with the 371 dominance of a single isotype in LCL 461 B95-8 and GM18502 samples (in each case, IgG). The 372 only notable difference between LCL 461 B95-8 and LCL 777 B95-8 is that the former sample 373 was in culture substantially longer prior to single-cell library preparation. Given that the GM18502 374 line was derived more than a decade ago, these observations implicate the influence of culture 375 period in altering significantly the isotype proportions present within LCLs, which is altogether 376 consistent with known (and profound) challenges associated with cell culture. [56][57][58] In this regard, 377 the data from GM12878 merit remark. The finding of polyclonality in this sample is surprising, 378 given that GM12878 has been in culture over a timescale comparable to GM18502. 59  within lytic cells. 73 The presence of NFATC1 is particularly notable considering the recent report of this factor contributing to the spontaneous lytic phenotype of type 2 EBV by upregulating 436 expression of BZLF1 to promote the lytic gene expression cascade. 74

437
Although PC loadings reveal substantial upregulation of more than a dozen EBV lytic genes, 438 cells within the lytic clusters curiously lack expression of BZLF1, which plays a role in the latent-439 to-lytic transition. 18 The absence of BZLF1 reads (and low mRNA counts generally) ostensibly

582
The simulation was implemented in Python, and the code used to generate the simulated data is 583 provided as a supporting file. The code is also available at (add as public GitHub repo) and may 584 be freely implemented and modified.