Journal of Molecular Biology
Volume 282, Issue 1, 11 September 1998, Pages 71-97

Regular article
Large-scale sequence comparisons reveal unusually high levels of variation in the HLA-DQB1 locus in the class II region of the human MHC

https://doi.org/10.1006/jmbi.1998.2018

Abstract

Comparison of genomic sequences flanking the HLA-DQB1 locus in the human MHC class II region reveals local sequence variation of up to 10%, which is the highest level of sequence variation found in the human genome so far. The variation is haplotype-specific and extends far beyond the transcriptional unit of the DQB1 gene, suggesting hitch-hiking along with functionally selected alleles as the most likely mechanism. All major insertions/deletions (indels) were found to be of retroviral origin and in the immediate upstream region of DQB1. Possible cis-acting effects of these indels on the transcriptional regulation of DQB1 are discussed.

Introduction

The human major histocompatibility complex (MHC) spans about 4 Mb on the short arm of chromosome 6 (6p21.3). Many of its gene products are highly polymorphic and have been shown to be associated with the immune system, and with susceptibility and resistance to various diseases (Klein, 1986; Trowsdale, 1995). Because of this special interest, the MHC was targeted and subdivided for early sequencing at the last chromosome 6 workshop (Beck et al., 1997). We have been focusing on the MHC class II subregion, where we previously reported higher than average levels of variation in the non-coding sequences of two loci flanking HLA-Z1 and HLA-DOB (Beck et al., 1996). Furthermore, the availability of the genomic sequence has contributed towards the identification of 127 novel polymorphic markers within the class II region that were used to characterise the recombination pattern and several recombination hotspots in this region (Cullen et al., 1997). The identification of so-called single-nucleotide polymorphisms (SNPs), which are a specific type of sequence variation, is expected to contribute towards the understanding of complex diseases and disorders such as diabetes, asthma and others (Collins et al., 1997; Schafer and Hawkins, 1998). To facilitate this, it is important that such data are made available in public databases. As part of the human chromosome 6 mapping and sequencing project, we have created such a database (6ace) in which we maintain and curate a variety of chromosome 6-associated data, including sequence variation (Theaker et al., 1997). The long-established association with diseases such as diabetes and narcolepsy has made the DQB1 locus a particularly competitive target for genomic sequencing (Morel et al., 1988; Mignot et al., 1994).
Although about 8 kb of the DQB1 gene were sequenced 15 years ago (Larhammar et al., 1983; Boss and Strominger, 1984), the entire locus, including the surrounding regions, was sequenced only recently by us (this work) and by Ellis and co-workers during a study of predisposing factors to narcolepsy (Ellis et al., 1997). Variations between the early DQB1 sequences have been described (Radley et al., 1994). By analysing over 220 kb of overlapping sequences, we report here the level of genomic sequence variation in the MHC DQ locus and, for comparison, that of the non-MHC-linked spinocerebellar ataxia type 1 (SCA1) locus.
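As a rough illustration of what a "local" level of sequence variation means, the following minimal Python sketch computes the percentage of mismatched positions in fixed-size windows along two sequences that are assumed to be already aligned. The function name `percent_variation`, the window size, the choice to count gap characters as differences, and the toy sequences are all assumptions made for illustration; this excerpt does not describe the alignment or counting procedure actually used in the study.

```python
def percent_variation(seq_a, seq_b, window=1000):
    """Return (window_start, percent_difference) pairs for two aligned sequences.

    Assumptions (not taken from the article):
      - seq_a and seq_b are pre-aligned strings of equal length;
      - '-' marks an alignment gap and is counted as a difference;
      - positions where either sequence has 'N' are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    results = []
    for start in range(0, len(seq_a), window):
        compared = 0
        differences = 0
        for a, b in zip(seq_a[start:start + window], seq_b[start:start + window]):
            if a == "N" or b == "N":
                continue  # unknown base: not informative
            compared += 1
            if a != b:
                differences += 1
        # None signals a window with no comparable positions
        percent = 100.0 * differences / compared if compared else None
        results.append((start, percent))
    return results


# Toy example with a hypothetical 10-base alignment and 5-base windows
haplotype_1 = "ACGTACGTAC"
haplotype_2 = "ACGTTCGTAC"
print(percent_variation(haplotype_1, haplotype_2, window=5))
```

In the toy example the first window contains one mismatch in five compared positions and the second contains none. On real data the window size strongly affects the reported local maximum, so any figure obtained this way is only comparable between analyses that use the same windowing.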

Results and discussion

Figure 1 shows the chromosomal locations of the DQB1 and SCA1 loci that have been analysed for sequence variation. Clones p797a11 and F1121, E1448, and 93N13 of the DQB1 locus (6p21.3) overlap by 86.2 kb and clones SGII and 467D16 of the SCA1 locus (6p23) overlap by 138.5 kb. In addition to providing a source for variation data, the availability of clone p797a11 (Ellis et al., 1997) indicated to us that the two gaps (between 93N13 and E1448, and between E1448 and F1121) in our clone contig map had been

DNA sequencing and analysis

Cosmids F1121 and E1448 were isolated from the ICRF chromosome 6 library (RPETO1 cell line: Nizetic et al., 1994), whereas PACs 93N13 and 467D16 were isolated from the RPCI-1 PAC library (HSF7 cell line: Ioannou et al., 1994). The entire clones were randomly subcloned into M13mp18 and pUC18 (Bankier et al., 1987). Recombinant clones (80% M13s, 20% pUCs) were picked, amplified and purified in 96-well microtitre plates (Beck and Alderton, 1993; Mardis, 1994; A. Smith & L. Baron, unpublished

Acknowledgements

We thank E. Mignot and D. Ruddy for providing sequence and associated data of clone p797a11 prior to publication; and S. Dear, I. Dunham, J. Kaufman and members of the chromosome 6 project group for comments and contributions (http://www.sanger.ac.uk/HGP/Chr6/). This work was funded by the Wellcome Trust.

References (44)

  • J.M Boss et al.

    Cloning and sequence analysis of the human major histocompatibility complex gene DC-3β

    Proc. Natl Acad. Sci. USA

    (1984)
  • F.S Collins et al.

    Variations on a theme: cataloging human DNA sequence variation

    Science

    (1997)
  • M Cullen et al.

    Characterization of recombination in the HLA class II region

    Am. J. Hum. Genet.

    (1997)
  • S D’Alfonso et al.

    The natural history of an HLA haplotype and its recombinants

    Immunogenetics

    (1998)
  • A.W Dangel et al.

    The dichotomous size variation of human complement C4 genes is mediated by a novel family of endogenous retroviruses, which also establishes species-specific genomic patterns among Old World primates

    Immunogenetics

    (1994)
  • R.L Dawkins et al.

    Disease associations with complotypes, supratypes and haplotypes

    Immunol. Rev.

    (1983)
  • M.C Ellis et al.

    HLA class II haplotype and sequence analysis support a role for DQ in narcolepsy

    Immunogenetics

    (1997)
  • D.H.A Fitch et al.

    Duplication of the γ-globin gene mediated by L1 long interspersed repetitive elements in an early ancestor of simian primates

    Proc. Natl Acad. Sci. USA

    (1990)
  • S Gaudieri et al.

    Duplication and polymorphism in the MHC: Alu generated diversity and polymorphism within the PERB11 gene family

    Hereditas

    (1997)
  • A.L Hughes et al.

    Nucleotide substitution at major histocompatibility complex class II loci: evidence for overdominant selection

    Proc. Natl Acad. Sci. USA

    (1989)
  • P.A Ioannou et al.

    A new bacteriophage P1-derived vector for the propagation of large human DNA fragments

    Nature Genet.

    (1994)
  • S Kambhu et al.

    Endogenous retroviral long terminal repeats within the HLA-DQ locus

    Proc. Natl Acad. Sci. USA

    (1990)

    Edited by J. Karn
