Abstract
DHFR gene amplification is present in methotrexate (MTX)-resistant colon cancer cells and acute lymphoblastic leukemia. However, little is known about this amplification because it is difficult to quantify the size of the amplified region and to resolve the repetitive rearrangements involved in the process. In this study, we propose an integrative framework that characterizes the amplified region by combining single-molecule real-time sequencing, next-generation optical mapping, and chromosome conformation capture (Hi-C). Amplification of the DHFR gene was optimized to generate homogeneously amplified patterns. An amplification unit of 11 genes, spanning from DHFR to ATP6AP1L on chromosome 5 (~2.2 Mbp), and a twenty-fold tandemly amplified region were verified using long-range genome and RNA sequencing data. In doing so, we detected a novel inversion at the start and end positions of the amplified region, as well as frameshift insertions in most of the MSH and MLH genes. These variants might stimulate chromosomal breakage and dysregulate mismatch repair pathways. Using Hi-C, we detected high adjusted interaction frequencies on the amplified unit and at an unsuspected position on 5q, suggesting a complex network of spatial contacts that may harbor gene amplification. Characterizing the tandemly amplified unit, its genomic variants, and the intra-chromosomal interactions on chromosome 5 can be critical for identifying the mechanisms behind genomic rearrangements. These findings may give new insight into the mechanisms underlying the amplification process and the evolution of drug resistance.
Footnotes
ahreumkim88{at}snu.ac.kr +82-31-600-3001, jongyeon.anna{at}gmail.com +82-31-600-3001, jeongsun{at}snu.ac.kr +82-31-600-3001
Abbreviations
- A3SS
- Alternative 3’ splice site
- A5SS
- Alternative 5’ splice site
- BAM
- Binary alignment map
- BFB
- Breakage-fusion-bridge
- CNV
- Copy number variation
- DEG
- Differentially expressed gene
- DEL
- Deletion
- DHFR
- Dihydrofolate reductase
- DUP
- Duplication
- FDR
- False discovery rate
- FISH
- Fluorescence in situ hybridization
- FPKM
- Fragments per kilobase per million
- GATK
- Genome analysis toolkit
- GSEA
- Gene set enrichment analysis
- Hi-C
- High-throughput chromosome conformation capture
- IGV
- Integrative genomics viewer
- INV
- Inversion
- INVDUP
- Inverted duplication
- KEGG
- Kyoto encyclopedia of genes and genomes
- Lsign
- Sum of the sign of the entries in the lower triangle
- Lvar
- The variance of the lower triangle
- MTX
- Methotrexate
- MXE
- Mutually exclusive exon
- ncRNA
- Non-coding RNA
- NGS
- Next-generation sequencing
- PacBio
- Pacific Biosciences
- RI
- Retained intron
- RNA-seq
- RNA sequencing
- SE
- Skipped exon
- SMRT
- Single-molecule real-time
- SNV
- Single nucleotide variation
- Score
- Corner score
- STAR
- Spliced transcripts alignment to a reference
- SV
- Structural variation
- TAD
- Topologically associating domain
- Usign
- Sum of the sign of the entries in the upper triangle
- Uvar
- The variance of the upper triangle
- VSD
- Variance-stabilized data
- WGS
- Whole-genome sequencing