In silico analysis of single nucleotide polymorphisms (SNPs) in human C-C chemokine receptor type five (CCR5) gene

Introduction: Chemokines are small transmembrane proteins with immune surveillance and immune cell recruitment functions. The expression of the CCR5 gene affects virus production and viral load (1). The CCR5 gene contains two introns, three exons, and two promoters, and its product is necessary as a co-receptor for the entry of macrophage-tropic HIV strains. Mutations in the coding region of CCR5 affect the protein structure, which in turn affects production, chemokine binding, transport, signaling and expression of the CCR5 receptor.
Methods: SNPs within the CCR5 gene were retrieved from the Ensembl database. Coding SNPs were analyzed using SNPnexus. Coding non-synonymous SNPs in the CCR5 domains that bind viral gp120 were analyzed using SIFT, PolyPhen and I-Mutant. Project HOPE was then used to model the 3D structure of the protein resulting from these SNPs. Non-coding SNPs that affect miRNAs in the 3′ region were analyzed using PolymiRTS. SNPs that affect transcription factor binding were analyzed using RegulomeDB.
Results: A total of 178 non-synonymous missense SNPs were predicted to have deleterious and damaging effects on the structure and function of the protein. In the CCR5 domains that bind viral gp120, three SNPs (rs145061115, rs199824195 and rs201797884) were found to affect the structure, function and stability of the chemokine protein. Two SNPs, rs185691679 and rs199722070, have a role in the disruption and creation of target sites in miRNA seeds, owing to their high conservation scores.
Conclusion: Mutations in the CCR5 gene may explain and represent the molecular basis of resistance to HIV infection.


1. Introduction and literature review:
Chemokines are small transmembrane proteins with immune surveillance and immune cell recruitment functions. Their effects are mediated by G protein-coupled receptors (GPCRs), which, upon binding the relevant ligand, release the αi and βγ G-protein subunits. This, in turn, mediates an effector response (1, 3). Other chemokine receptors work together with CCR5 to stimulate T-cell functions (2). Depending on the structure and number of their cysteine residues, chemokines are classified as C, CC, CXC, and CX3C. The C-C receptors often share a significant degree of homology; for example, 'C-C chemokine receptor type five' (CCR5) and 'C-C chemokine receptor type two' (CCR2) share 75% homology (4). CCR5 is expressed on various cell populations, including macrophages, dendritic cells and memory T cells in the immune system; endothelium, epithelium, vascular smooth muscle and fibroblasts; and microglia, neurons, and astrocytes in the central nervous system (5). In 1996, it was discovered that CCR5 is necessary as a coreceptor for the entry of macrophage-tropic HIV strains (6,7). In the structure of the CCR5 coreceptor, the HIV-binding domains are the N-terminal and ECL2 domains; the ECL1 domain is indicated. Viral gp120 has been shown to bind sulfate moieties on the cell surface (8,9). During initial infection, the virus uses CCR5, whereas the alternative coreceptor 'C-X-C chemokine receptor type four' (CXCR4) is used in later HIV infection, when the infected individual is progressing to AIDS.
The CCR5 protein consists of 352 amino acids and has a molecular weight of 40.6 kDa (10). It comprises an amino-terminal domain (N-terminus), three extracellular loops (ECL), three intracellular loops (ICL), a cytoplasmic carboxyl-terminal tail (C-terminal tail) and seven transmembrane domains (TMD) made up of hydrophobic residues. These hydrophobic regions are important for chemokine ligand binding, HIV coreceptor activity and the functional response of the receptor. The N-terminus is rich in tyrosine and acidic amino acids, which facilitate interaction with ligands and HIV (11). The I12T, C20S and A29S variants are all located in the N-terminus. These variants markedly reduce cell surface expression and ligand binding with HIV co-receptors (12). The C20S variant prevents formation of the disulfide bond between the N-terminus and ECL3, a bond that is important for chemokine binding (13).
The CCR5 gene was localized to chromosome 3p21 (14), within a cluster of genes encoding other chemokine receptors, including CCR1, CCR2, CCR3, XCR1 and CCBP2 (10,15). The CCR5 gene is composed of three exons, two introns, and two promoters. The two CCR5 promoters, Pu and Pd, contain several ATG transcription sites prior to the start codon of exon 3, leading to the generation of different CCR5 transcripts that vary in their 5′ UTR regions (16). Mutations in the coding region of CCR5 affect the protein structure, which in turn affects production, chemokine binding, transport, signaling and expression of the CCR5 receptor. Mutations in the promoter region affect transcription factor (TF) binding or regulatory sites, leading to changes in CCR5 mRNA production.
The CCR5Δ32 mutation was discovered as a genetic mutation that protects cells from infection by HIV (7,14,17). The deletion causes a frameshift, so the mutant allele encodes 215 amino acids, compared with the 352 amino acids of full-length wild-type CCR5. The affected region is the second extracellular loop (17). The resulting protein lacks the last three transmembrane domains as well as regions important for G-protein interaction and signal transduction. The Δ32 mutant allele is confined mostly to individuals of European descent, at gene frequencies of approximately 10%, and its frequency declines from north to south (18).
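The frameshift reasoning above can be checked with a few lines of arithmetic. The sketch below assumes that Δ32 denotes a 32-base-pair deletion; the function name `deletion_effect` is hypothetical and introduced only for illustration. A deletion preserves the reading frame only when its length is a multiple of three.

```python
def deletion_effect(deleted_bp: int) -> str:
    """Classify a coding-region deletion by its effect on the reading frame."""
    # Codons are 3 bases long, so only multiples of 3 keep the frame intact.
    return "in-frame" if deleted_bp % 3 == 0 else "frameshift"


WILD_TYPE_LENGTH = 352  # amino acids, as stated in the text
MUTANT_LENGTH = 215     # amino acids, as stated in the text

# Assumption: the delta-32 allele removes 32 base pairs.
effect = deletion_effect(32)

# Fraction of the wild-type length covered by the mutant product.
fraction_of_wild_type = MUTANT_LENGTH / WILD_TYPE_LENGTH

print(effect, round(fraction_of_wild_type, 2))
```

Because 32 is not divisible by three, the deletion is classified as a frameshift, and the 215-residue product is roughly 61% of the wild-type length.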
In this study, we performed a computational analysis of all SNPs in the CCR5 gene found in the Ensembl database in order to identify coding and non-coding SNPs that may modify the structure and function of the chemokine receptor.

2. Materials and methods:
All SNPs and their related protein sequences within the CCR5 gene were retrieved from
Figure: Flowchart of the analysis processes.
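The abstract names the Ensembl database as the source of the SNPs. The sketch below shows one way such a retrieval could be scripted against the public Ensembl REST service; it is not necessarily how the authors obtained their data. The gene identifier `ENSG00000160791` is assumed to be the Ensembl stable ID for human CCR5 and should be verified before use. The helper names and the reliance on a `consequence_type` field in the returned records are likewise assumptions made for illustration.

```python
import json
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"  # public Ensembl REST server
CCR5_GENE_ID = "ENSG00000160791"           # assumed stable ID for human CCR5; verify before use


def variation_overlap_url(gene_id: str) -> str:
    """Build the REST URL that lists variants overlapping a gene."""
    return f"{ENSEMBL_REST}/overlap/id/{gene_id}?feature=variation"


def fetch_variants(gene_id: str) -> list:
    """Download overlapping variants as a list of dicts (requires network access)."""
    request = urllib.request.Request(
        variation_overlap_url(gene_id),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def missense_only(variants: list) -> list:
    """Keep records whose reported consequence is a missense change.

    Assumes each record carries a 'consequence_type' field.
    """
    return [v for v in variants if v.get("consequence_type") == "missense_variant"]
```

Calling `missense_only(fetch_variants(CCR5_GENE_ID))` would then give a starting list of candidate coding variants to pass to the prediction tools named in the abstract.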

3. Results:
Among two thousand five hundred and ninety (2590)
The effects of SNPs in the 3′ region of the CCR5 gene, as predicted by PolymiRTS, are shown as the target sites disrupted and created by SNPs in miRNA seeds, along with the conservation score (CS) and the context+ score change. The higher the CS, the more profound the predicted effect; the higher the context+ score change, the greater the likelihood of a change in the miRNA target site.
One miRNA, hsa-miR-3692-5p (CS 6), which is affected by the SNP rs185691679, is worth noting because its target site in the miRNA seed is disrupted.
Two miRNAs hsa-miR-488-5p (with CS 9 and 7), which are affected by the SNPs
The aim of this study was to analyze SNPs identified in the CCR5 gene using different
initially discovered in the Vietnamese population (23), affects a highly conserved cysteine involved in disulfide bonding between ECL-1 and ECL-2, which is important for CCR5 structure and HIV binding (24). The structural information for this variant from the Project HOPE web server shows that the mutant residue is bigger than the wild-type residue; that the wild-type residue is neutral whereas the mutant residue is positively charged, which can lead to protein folding problems; and that the wild-type residue is more hydrophobic than the mutant residue, which will cause a loss of hydrophobic interactions in the core of the protein. This SNP is found only in the East Asian population, with a minor allele frequency of 0.003%. The change of a tryptophan into an arginine at position 190 means that the mutant residue is smaller than the wild-type residue and that the wild-type residue is more hydrophobic than the mutant residue. The mutated residue is not in direct contact with a ligand; however, the mutation could affect local stability, which in turn could affect the ligand contacts made by one of the neighboring residues. This may cause a loss of external interactions.
The other SNPs found in this study affect transcription factor and miRNA binding. Some are predicted to have a significant effect on miRNA binding, such as those affecting hsa-miR-3692-5p and hsa-miR-488-5p, owing to their high conservation scores, and others, such as the SNPs at chr3:46414196, chr3:46414176 and chr3:6414175, are likely to affect transcription factor binding.
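The residue-level reasoning given above for the tryptophan-to-arginine change at position 190 can be reproduced with a minimal sketch. It does not use Project HOPE itself. As stand-ins for that server's own measures, it assumes average residue mass as a proxy for size, the Kyte-Doolittle hydropathy index as a proxy for hydrophobicity, and the side-chain charge near neutral pH. The names `RESIDUE_PROPERTIES` and `compare_substitution` are hypothetical.

```python
# Properties for the two residues discussed in the text.
# hydropathy: Kyte-Doolittle index (higher = more hydrophobic)
# charge: side-chain charge near neutral pH
# mass: average residue mass in daltons (used here as a rough size proxy)
RESIDUE_PROPERTIES = {
    "W": {"name": "tryptophan", "hydropathy": -0.9, "charge": 0, "mass": 186.21},
    "R": {"name": "arginine", "hydropathy": -4.5, "charge": 1, "mass": 156.19},
}


def compare_substitution(wild_type: str, mutant: str) -> dict:
    """Summarise how a substitution changes size, charge and hydrophobicity."""
    wt = RESIDUE_PROPERTIES[wild_type]
    mt = RESIDUE_PROPERTIES[mutant]
    return {
        "mutant_is_smaller": mt["mass"] < wt["mass"],
        "charge_change": mt["charge"] - wt["charge"],
        "wild_type_more_hydrophobic": wt["hydropathy"] > mt["hydropathy"],
    }


# Tryptophan to arginine at position 190, as described in the text.
w190r = compare_substitution("W", "R")
print(w190r)
```

Under these assumptions the comparison agrees with the description in the text: the mutant residue is smaller, it gains a positive charge, and the wild-type residue is the more hydrophobic of the two.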