Comprehensive analysis of HIV elite controllers’ and progressors’ transcriptional profiles from CD8+ T lymphocytes demonstrates heterogeneity of pathways and master regulators, which may be essential for disease nonprogression

Cytotoxic and noncytotoxic CD8+ T lymphocyte responses are essential for the control of HIV infection. Understanding the mechanisms underlying HIV control in elite controllers (ECs), who do not progress to AIDS for many years without treatment, may facilitate the development of new effective therapeutic strategies. We performed a comprehensive analysis of the transcriptional profiles of CD8+ cells from ECs and treated and untreated progressors using an original pipeline. Cluster analysis enabled the identification of five distinct groups (EC groups 1-5) of ECs based on their transcription profiles. Profiles of EC groups 2-4 were associated with different numbers of differentially expressed genes, but the corresponding genes shared the same cellular processes. These three groups were associated with increased metabolism, survival, proliferation, and the absence of an “exhausted” phenotype of CD8+ lymphocytes, compared to untreated progressors. The transcriptional profiles of EC group 1 were opposite to those of EC groups 2-4 and similar to those of the treated progressors. This group may be associated with residual dysfunction of T lymphocytes. The EC group 5 was indistinguishable from normal. EC groups 1 and 5 can have mechanisms of nonprogression unrelated to CD8+ cells. The transcription changes in CD8+ lymphocytes from ECs may be attributable to the receptors that modulate the functional states of CD8+ cells. We identified 22 receptors, which are potential master regulators and may be responsible for the observed expression changes of CD8+ cells in ECs. These receptors can be considered potential targets of therapeutic intervention, which may decelerate disease progression. IMPORTANCE A small group of HIV-infected individuals, known as elite controllers (ECs), do not develop AIDS for many years, despite being untreated. Understanding the mechanisms governing HIV control in ECs may help develop new effective therapeutic strategies. Since CD8+ T lymphocytes seem to play a significant role in HIV control, we analyzed the large number of transcriptional profiles in total CD8+ cells from ECs, treated, and untreated progressors. We found distinct groups of ECs, which may have different mechanisms governing HIV nonprogression, and we identified related pathways and cellular processes that are dissimilar from those in progressors. We also identified master regulators, key proteins in the signaling network that can be responsible for observed transcriptional changes in CD8+ cells from ECs and may be essential for disease nonprogression. These proteins may represent potential targets for therapeutic interventions.

cART must be administered throughout life because interruption of therapy leads to viral rebound 60 and disease progression (4, 5). Thus, new approaches to treat HIV infection are being developed (5-61 9). One of the most promising approaches is developing therapeutic and preventive vaccines; 62 however, no effective vaccines exist at present (6). This lack of a vaccine can be explained by the 63 fact that people do not develop natural, protective immunity to HIV infection, whereas almost all 64 successful vaccines were created for diseases for which natural immunity exists (8). Since some 4 seems to have a significant role (11). The cellular response causes a decline in viral load in the acute 73 phase of infection but cannot eliminate the virus from the organism because a high mutation rate 74 during replication allows HIV to escape from the immune response. Persistent infection causes 75 chronic activation of HIV-specific CD4 + and CD8 + cells, which leads to their apoptosis (12, 13). A 76 phenomenon called "bystander activation" was described in chronically infected patients, in which 77 significant numbers of CD4 + and CD8 + non-HIV-specific T lymphocytes are activated in a T-cell 78 receptor-independent and cytokine-dependent manner that leads to their apoptosis (11,(14)(15)(16). 79 Chronic infection also causes the dysfunction or "exhaustion" of CD8 + lymphocytes, which is patients. Most infected patients, called progressors, usually develop AIDS after 8-10 years, but a 86 small group of people, known as long-term nonprogressors (LTNPs), remains asymptomatic for 87 more than ten years and are characterized by higher CD4 + cell counts and lower viral load (13, 19-88 21). LTNPs can be divided into several groups. The group, called elite controllers (ECs), 89 demonstrates the best control of viral replication with a viral load less than 50 copies/ml while 90 maintaining CD4 cell counts from 200 to 1000/ml. Little is known about the mechanisms of viral 91 control in ECs (11, 13, 18-24), but it seems that host and viral factors, as well as various cell types, 92 may be involved in control, and the cellular response of CD8 + lymphocytes has a major role.  (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint this version posted October 20, 2020. ; https://doi.org/10.1101/2020.10.19.346528 doi: bioRxiv preprint 5 To date, most of the studies related to the investigation of mechanisms of HIV suppression in 97 ECs have focused on particular molecules or pathways, whereas analysis of HIV-related genome- 98 wide OMIC data provides an opportunity to reveal novel mechanisms that were not known or 99 previously hypothesized (26). Transcriptomics studies are the most frequent in HIV research and 100 include those investigating CD8 + lymphocytes from ECs (27-30). Most of the corresponding studies 101 were focused on total CD8 + lymphocytes, rather than HIV-specific lymphocytes (27,29,30). This 102 focus is important because non-HIV-specific CD8 + and CD4 + lymphocytes are involved in the 103 pathogenesis of disorder by the "bystander activation" effect (see above), which leads to their 104 dysfunction and apoptosis (11, 14-16). To identify specific HIV control mechanisms, the 105 comparison of gene transcription in CD8 + lymphocytes from ECs to both progressors and healthy 106 controls is required. For instance, Hyrcza and colleagues compared transcription profiles from total 107 CD4 + and CD8 + lymphocytes between ECs, progressors (acute and chronic phases), and uninfected 108 people (27). They found differentially expressed genes (DEGs) between ECs and progressors in both 109 the acute and chronic phases but did not find differences between ECs and healthy controls. This can 110 potentially be explained by the fact that CD8 + ECs have heterogeneous transcription profiles, and 111 some of them are indistinguishable from healthy controls. Chowdhury and colleagues applied cluster 112 analysis to CD8 + cell transcriptional profiles from 51 ECs and found five distinct groups (30). Some 113 of the groups were distinguishable from both control samples and samples from cART-treated 114 patients. The authors found that the pathways governed by mTOR and eIF2 proteins are potentially 115 the most important for the functions of CD8 + lymphocytes in ECs and that these pathways are 116 dominant in three out of five EC groups.

117
In the present study, we performed a comprehensive analysis of the transcription profiles of 118 CD8 lymphocytes from a large cohort of ECs, cART-treated and untreated progressors using the 119 pipeline, which includes the following steps: (1) identification of distinct groups of ECs based on the 120 . CC-BY 4.0 International license available under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint this version posted October 20, 2020. ; https://doi.org/10.1101/2020.10.19.346528 doi: bioRxiv preprint 6 corresponding transcription profiles; (2) identification of differential expression for each EC group,  values of more than 0.95, and two clusters have p-values of more than 0.65 (Table 1), which is still 144 acceptable for further analysis. 145 Hereafter, we will refer to EC groups by numbers in order as in Table 1 (EC groups 1-5).

147
Identification of DEGs and their comparison between EC groups and cART-treated and 148 untreated progressors. We identified DEGs between each of the EC groups and healthy controls as 149 well as between cART-treated progressors and controls (Table 2). We also retrieved from GEO two  Table 2. The corresponding thresholds were chosen empirically 158 to balance the number of DEGs and statistical significance of differential expression (for details, 159 please see the Materials and Methods section). Only EC groups 2 and 3, as well as untreated 160 progressors, were associated with a high number of DEGs, whereas other groups containing the 161 most samples were only slightly different from healthy controls.

162
To compare transcriptional profiles between EC groups together with cART-treated progressors 163 and healthy controls, we performed cluster analysis in the space of genes that were differentially 164 expressed in at least one EC group or cART group compared to the healthy control (Fig 2). 165 Analysis of the content of Table 2 and Fig 2 allows the following conclusions to be drawn. 166 First, all transcription profiles from EC group 5 and some profiles from EC group 4, as well as 167 . CC-BY 4.0 International license available under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint this version posted October 20, 2020. ; https://doi.org/10.1101/2020.10.19.346528 doi: bioRxiv preprint 8 cART-treated progressors, are indistinguishable from healthy controls (right part of Fig 2). This 168 finding means that total CD8 + lymphocytes from corresponding ECs are potentially not involved in 169 control under viremia. On the other hand, considerable control under viral replication was achieved 170 in corresponding cART-treated progressors. Since samples from cART-treated progressors are 171 divided into two groups on the dendrogram (Fig 2), we performed bootstrap resampling analysis in a 172 similar manner to the case of ECs but did not find any stable clusters. Second, transcriptional 173 profiles from EC groups 2, 3, and, partially, group 4 (cyan, green, and red color in the left part of  The transcription profiles from ECs and cART-treated progressors cannot be directly compared 187 with untreated progressors' profiles, since they were measured on different microarray platforms.    The key DEGs from pathways related to activation, survival, and immune-specific CD8 + 214 lymphocyte functions are presented in Table 3. Here, we considered genes as differentially 215 . CC-BY 4.0 International license available under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint this version posted October 20, 2020. ; https://doi.org/10.1101/2020.10.19.346528 doi: bioRxiv preprint expressed if the log fold change was more than |0.5| and the unadjusted p-value was less than 0.05.

216
The corresponding thresholds were chosen empirically to balance the number of key genes and 217 differential expression with statistical significance. and cART-treated progressors was opposite to EC groups 2-4 at the level of DEGs (Fig 2 and 3).

223
EC groups 2 and 3 are associated with the same differentially regulated pathways, whereas 224 group 4 is associated with fewer pathways which, however, may be observed because some 225 transcription profiles from EC group 4 are indistinguishable from healthy controls (Fig 2), but other 226 profiles from this group are similar to those from groups 2 and 3. The t-scores obtained for each 227 pathway by gene set enrichment analysis (see Materials and Methods) decreased from group 2 to 4, 228 which indicates the different magnitudes of the pathways' differential expression. We performed 229 gene set overrepresentation analysis (see Materials and Methods) to find KEGG pathways associated 230 with genes that are differentially expressed in EC group 2 (EC group 3) but not differentially 231 expressed in EC group 3 (EC group 4) (see above). The obtained pathways (Table S2)  (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint this version posted October 20, 2020. ; https://doi.org/10.1101/2020.10.19.346528 doi: bioRxiv preprint metabolism and various steps of protein synthesis: from gene transcription to translation, folding, 240 transport, and degradation (Fig 4) (for more details see Discussion).  We also calculated MRs for cART-treated and untreated progressors for comparison. Most MRs 255 obtained for each group of ECs, cART-treated and untreated progressors are intracellular "hubs," 256 such as kinases, phosphatases, ubiquitin ligases, GTPases, and transcription factors (Table S3). We  (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint this version posted October 20, 2020. ; https://doi.org/10.1101/2020.10.19.346528 doi: bioRxiv preprint 13 hand, non-HIV-specific CD8 + cells may still contribute to the control under HIV replication, e.g., by 287 secretion of cytokines that compete with HIV particles for the corresponding coreceptors.

288
In our study, we analyzed the transcriptional profiles of total CD8 + lymphocytes (including 289 HIV-and non-HIV-specific lymphocytes) from ECs, cART-treated, and untreated progressors using progression. 295 We identified three large clusters (displayed in Fig 3 and Fig 4) of transcription profiles from 296 the total CD8 + cells: (1) profiles from ECs which, in turn, formed four distinct groups (EC groups 2-  highest magnitude, whereas group 5 is indistinguishable from the healthy control. Nevertheless, EC 304 groups 2 to 4 were similar at the level of differentially regulated pathways (Fig 4). On the other 305 hand, these groups are different in terms of MRs (Fig 5, Table S3), which regulate the transcription  Given the information on DEGs (Table 2), differentially expressed pathways (Fig 4), and MRs (Fig   310   . CC-BY 4.0 International license available under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint this version posted October 20, 2020. ; https://doi.org/10.1101/2020.10.19.346528 doi: bioRxiv preprint 5, Table S3), we can conclude that the same cellular functions are changed in CD8 + lymphocytes 311 from EC groups 2-4 but with different magnitudes, and the initial causes may be different, which is 312 reflected by different MRs.

313
The identified pathways may indicate preserved functions, survival, and proliferation of CD8 + 314 lymphocytes from EC groups 2-4 compared to untreated progressors. Many pathways related to 315 metabolism and protein synthesis were upregulated in EC groups 2-4, whereas fewer corresponding 316 pathways were upregulated in untreated progressors, which may indicate a lower degree of increase 317 in metabolism (Fig 4). The anti-apoptotic genes were upregulated in CD8 + lymphocytes from EC 318 groups 2-4 and downregulated in untreated progressors (Table 3). The pro-apoptotic genes were 319 upregulated in untreated progressors, whereas they have a mixed pattern of expression in EC groups 320 2-4. Thus, CD8 + lymphocytes from EC groups 2-4 have a higher ability to survive than untreated 321 progressors; these observations are in accordance with literature data (13, 19). The transcription of 322 cyclins is also dissimilar in EC groups 2-4 and untreated progressors. The only gene coding cyclin 323 D2 is upregulated in ECs, whereas genes coding cyclins E1, E2, B1, B2, and A2 are upregulated in 324 untreated progressors (see Table 3). Since CD8 + cells from the blood are asynchronized in cell cycle  (Table   332 3). The absence of dysfunction of CD8 + lymphocytes in ECs is supported by unchanged 333 transcription of immune checkpoints, whereas the corresponding genes are upregulated in untreated 334 . CC-BY 4.0 International license available under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint this version posted October 20, 2020. ; https://doi.org/10.1101/2020.10.19.346528 doi: bioRxiv preprint progressors (17) (Table 3). Interestingly, the transcription of genes coding perforin and granzyme B 335 was upregulated in both groups, but the gene coding Fas ligand was downregulated in ECs but 336 upregulated in untreated progressors. CD4 + and CD8 + T lymphocytes in LTNPs have lower 337 frequencies of apoptosis than progressors, which correlates with a lower frequency of cells 338 expressing Fas and FasL (35).

339
In contrast, the corresponding pathways associated with cART-treated progressors are 340 differentially regulated in the opposite direction: they are mostly downregulated, whereas the same 341 pathways from EC groups 2-4 are upregulated (Fig 4). This finding may indicate the downregulation 342 of metabolism, activation, growth, proliferation, and other essential processes in CD8+ T 343 lymphocytes from cART-treated progressors. Some corresponding pathways, e.g., JAK-STAT and 344 FoxO pathways, are changed in the same direction in cART-treated and untreated progressors (Fig   345   4). However, the transcription of many important genes, e.g., pro-and anti-apoptotic genes, is not 346 changed, unlike in untreated progressors (see Table 3). This finding may indicate that many of the 347 considered processes are preserved in CD8+ T lymphocytes from cART-treated compared to 348 untreated progressors, which is in accordance with literature data (24). It is known that low-level 349 viremia (~ 1 copy/mL of plasma) is detectable in most individuals under cART, which leads to 350 residual activation and dysfunction of CD8+ T lymphocytes.
351 Surprisingly, EC group 1 is associated with transcription changes and pathways similar to those 352 in cART-treated progressors. Thus, EC group 1 may also be associated with some degree of CD8 + T 353 lymphocyte dysfunction and may have other viremic control mechanisms that are unrelated to CD8 + 354 cells.

355
The observed transcription changes in CD8 + lymphocytes from ECs may be a consequence of  Some of the identified receptors, e.g., androgen and G-coupled estrogen receptors, are not 378 typically associated with CD8 + lymphocytes; however, they may also contribute to expression 379 changes in ECs. For instance, the androgen receptor gene is downregulated in EC groups 2 and 4, 380 but its expression is not changed in progressors. It was shown that women are significantly 381 overrepresented in the EC population compared to men (22). This finding can be explained by 382 . CC-BY 4.0 International license available under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint this version posted October 20, 2020. ; https://doi.org/10.1101/2020.10.19.346528 doi: bioRxiv preprint androgens' influence on the immune system, including CD8 + lymphocytes (40). Thus, the androgen 383 receptor may play a significant role in the observed phenomenon.  Background correction and quantile normalization of transcription data were performed using 401 different functions depending on the microarray platform: function "rma" from the "affy" package 402 for Affymetrix microarray (GSE6740 dataset) and the "neqc" function from the "limma" package 403 for Illumina beadchips (GSE87620 and GSE25669 datasets).

404
Next, we removed probes that were unexpressed in all samples of the dataset. To filter out 405 corresponding probes on the Affymetrix microarray, we used the "mas5calls" function from the 406 . CC-BY 4.0 International license available under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint this version posted October 20, 2020. ; https://doi.org/10.1101/2020.10.19.346528 doi: bioRxiv preprint "affy" package and removed probes having an "absent" score across all samples. To do this on 407 Illumina beadchips, we removed probes that have detection p-values greater than 0.05 in all samples 408 of the dataset.

409
For comparison of DEGs between datasets, we selected only probes having Entrez IDs and only 410 one probe per gene with the highest variance among samples using the "nsFilter" function from the 411 "genefilter" package.

412
To find potential clusters on heterogenic transcriptional profiles from ECs, we selected 51 413 corresponding samples and filtered out 50 percent of genes with the lowest variance using the 414 "nsFilter" function. This step was employed to remove genes whose expression is not changed 415 significantly across samples and cannot be useful to find potential EC groups. To find clusters, we 416 used a hierarchical agglomerative clustering approach implemented in the "hclust" basic R function. 417 We choose 1 -Pearson correlation coefficient between pairs of samples as a distance measure, and 418 the "ward.D2" clustering method (41). To estimate the uncertainty in obtained clusters, we 419 performed multiscale bootstrap resampling using the "pvclust" function from the "pvclust" package.

420
For each cluster, "pvclust" calculates p-values, which indicate how strongly the cluster is supported 421 by data: the higher the p-value, the more probable the cluster's existence. Function "pvclust"  To visualize clusters, we used the "heatmap.2" function from the "gplots" R package. This 426 function creates a heatmap with two dendrograms: one for rows, which are genes, and another for 427 columns, which are samples. To create these dendrograms, the abovementioned clustering method  (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint this version posted October 20, 2020. ; https://doi.org/10.1101/2020.10.19.346528 doi: bioRxiv preprint 20 background gene set, e.g., all genome genes. To perform analysis, we used the function "enrichr" 455 from the "enrichR" R package. We selected pathways with an adjusted p-value less than 0.1, as in 456 the gene set enrichment analysis.    Table S1. Groups of CD8 + lymphocytes' transcription profiles from ECs. 475 Table S2. KEGG pathways associated with genes, which are differentially expressed in EC group 2 476 (EC group 3), but not differentially expressed in EC group 3 (EC group 4).

477
. CC-BY 4.0 International license available under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint this version posted October 20, 2020. ; https://doi.org/10.1101/2020.10.19.346528 doi: bioRxiv preprint Table S3. Master regulators and their transcription changes in EC groups, cART-treated and 478 untreated progressors. The master regulator was selected if the corresponding gene was 479 differentially expressed with log fold changes more than |0.5| and p-value less than 0.5 in at least 480 one of five EC groups. EC 1-5 are groups of ECs; cART is cART-treated progressors, AI1 and AI2 481 are untreated progressors in the acute phase from two GEO datasets, CI is untreated progressors in 482 the chronic phase.   (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made   (Table 2), were used to create the 655 heatmap. Row Z-Score is the number of standard deviations by which the value of log fold change 656 in particular column is above or below the mean value of all groups.  and Methods). The positive value and red color mean that pathway is up-regulated, whereas the 660 negative value and blue color mean that pathway is down-regulated compared to healthy control. EC 661 1-5 are groups of ECs; cART is cART-treated progressors, AI1 and AI2 are untreated progressors in 662 the acute phase from two GEO datasets, CI is untreated progressors in the chronic phase.

663
. CC-BY 4.0 International license available under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint this version posted October 20, 2020. ; https://doi.org/10.1101/2020.10.19.346528 doi: bioRxiv preprint    (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint this version posted October 20, 2020. ; https://doi.org/10.1101/2020.10.19.346528 doi: bioRxiv preprint

GADD45B↓
. CC-BY 4.0 International license available under a (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint this version posted October 20, 2020. ; https://doi.org/10.1101/2020.10.19.346528 doi: bioRxiv preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made The copyright holder for this preprint this version posted October 20, 2020. ; https://doi.org/10.1101/2020.10.19.346528 doi: bioRxiv preprint