MGMT expression: insights into its regulation. 2. Single nucleotide polymorphisms

High intra- and interindividual variations in the expression levels of the human O6-methylguanine-DNA methyltransferase (MGMT) gene have been observed. This DNA repair enzyme can cause resistance of cancer cells to alkylating chemotherapy. Single nucleotide polymorphisms (SNPs) of MGMT have been studied for their association with the risk of different types of cancer and with progression-free survival in cancer patients treated with alkylating chemotherapy, as well as for their effect on MGMT gene expression and enzyme activity. SNPs have been suggested to be among the factors that influence the interindividual variability of MGMT expression. Therefore, the aim of this paper was to review the experimental data on SNPs of the human MGMT gene that are associated with cancer, as well as on the location of MGMT SNPs in regulatory and protein-coding regions of the gene in relation to its regulation. Many MGMT SNPs that could affect gene expression and result in interindividual MGMT variability or in resistance of the enzyme to pseudosubstrate inhibitors have been found within the promoter and enhancer regions, the 5'- and 3'-UTRs and introns of the MGMT gene, as well as within the protein-coding region. Many of them may have a regulatory effect.

Introduction. The expression of O6-methylguanine-DNA methyltransferase (MGMT), the DNA repair enzyme responsible for removing alkylation adducts from the O6 position of guanine in DNA, and its activity determine the cell response to alkylating agents, including anticancer chemotherapy, by preventing mutations and cell death. This enzyme can make cancer cells resistant to such therapy. High intra- and interindividual variations in MGMT expression levels have been observed, pointing to a complex regulation of this gene [1,2].
It has been suggested that single nucleotide polymorphisms (SNPs) are among the factors that influence the interindividual variability of MGMT expression [3,4]. The association of some MGMT polymorphisms with the risk of different types of cancer, as well as the effects of polymorphic variations on gene expression and on the activity of the enzyme, are discussed in [3]. Many known SNPs of the human MGMT gene that could affect expression and result in interindividual MGMT variability, or in resistance of the enzyme to pseudosubstrate inhibitors, have been found within the promoter and enhancer regions, the 5'- and 3'-UTRs and introns of the MGMT gene, as well as within the protein-coding region [3,4]. At least two intragenic SNPs have been shown to influence the interindividual variation of MGMT activity in peripheral blood mononuclear cells [5], as well as in normal human lung tissue [6].
Currently, disease- and trait-associated SNPs are rapidly being identified in genome-wide association studies and by related strategies [7]. The majority (~93 %) of these variants lie within non-coding sequences and are concentrated in regulatory DNA marked by deoxyribonuclease I (DNaseI) hypersensitive sites (DHSs) [8]. Thus, SNPs in functionally important non-coding DNA regions could make a significant contribution to phenotypic variation and disease susceptibility among individuals [9]. The extent and number of SNPs detected within the human MGMT gene can be evaluated by inspecting the track of all SNPs in the protein-coding and non-coding regions, including the promoter, in the UCSC Genome Browser (University of California Santa Cruz; Figure in Supplement). Since disease-associated SNPs systematically perturb transcription factor (TF) recognition sequences and frequently alter allelic chromatin states, some SNPs are thought to be involved in transcriptional regulatory mechanisms, including modulation of promoter and enhancer elements and enrichment within expression quantitative trait loci [8]. In particular, SNPs in regulatory non-coding regions can influence DNA helical conformation, protein binding, the methylation status of CpG dinucleotides, transcript splicing, etc.
The aim of this paper is to review the experimental data on SNPs of the human MGMT gene that are associated with cancer, as well as on the location of MGMT SNPs in regulatory and protein-coding regions of the gene in relation to its regulation. This article is the second part of a thematic series on the regulation of MGMT expression.
SNPs in regulatory regions of the human MGMT gene. Several SNPs are known in the promoter, enhancer region and introns of the human MGMT gene. They include G135T, G290A, C485A, C575A, G666A, C777A, G795C, A1034G and C1099T (numbered according to X61657) [5,10-12]. X61657 is a genomic clone, 1157 base pairs in length, taken from the NIH genetic sequence database GenBank; it contains the sequence of the 5'-upstream region of the human MGMT gene with promoter activity and the first untranslated exon [13,14]. We aligned X61657 to chromosome 10 using the Nucleotide BLAST program (http://blast.ncbi.nlm.nih.gov/Blast.cgi) and the BLAT alignment tool [15] (Fig. 1). Changes within the promoter and enhancer regions can potentially affect gene expression, but this is difficult to verify because many other factors can influence MGMT expression. In particular, it has been suggested that the C575A variant found in melanoma patients could influence gene transcription [10], as it is located close to the binding site (BS) for the eukaryotic heat shock factor [13]. The C1099T polymorphism, located within the enhancer region in exon 1 [16], has been shown to increase MGMT promoter-enhancer activity in a luciferase assay [11].
Since numerous recent studies have identified many disease- and trait-related genetic variants [7], we analysed the non-coding and coding regions of the human MGMT gene for SNP enrichment. Fig. 1 presents a track that contains information about SNPs located within an open chromatin area marked by DHS and overlapping with TF ChIP-seq BSs [17,18]. They include rs1625649, rs35322871, rs113813075, rs79442343, rs34180180, rs189357135, rs112837630, rs34138162, rs1623007, rs2782888, rs181536588, rs16906252, rs113327489, rs186050433, rs16906255 and rs149452540 (Table 1). The location and data on SNPs were taken from the UCSC Genome Browser, the SNP database (dbSNP, build 137) at NCBI [17] and the SNPedia resource, which is focused on the medical, phenotypic and genealogical associations of SNPs [19]. Some of these SNPs are associated with cancer. In particular, the TT genotype of rs1625649, located 523 bases upstream of the transcription start site (TSS) of MGMT, was found to correlate with worse progression-free survival in patients with metastatic colorectal cancer treated with oxaliplatin-based chemotherapy than the combined GG + GT genotypes [20]. The coding-synonymous polymorphism R (CGC) → R (CGT) in NM_002412 (rs16906252, Table 1) was found to have a strong association with promoter methylation (silencing) in colorectal cancer [21,22]. It has also been shown that the C485A polymorphism in the promoter was associated with an increased risk of lung cancer in a Korean population, although this SNP had no effect on promoter activity [12]. Many other SNPs have also been found in introns (Suppl. Fig.). It has been shown that the A allele of the rs7087131 variant of the MGMT gene (chr10:131474474, plus DNA strand; ref. allele G) was associated with a decreased risk of esophageal squamous cell carcinoma [23], whereas rs12268840 (chr10:131325299, in intron 1, plus DNA strand) was associated with an increased risk of adenocarcinoma of the esophagus [24].
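As a minimal illustration of why the CGC to CGT change at rs16906252 is classed as coding-synonymous, the sketch below translates both codons with a small hand-written excerpt of the standard genetic code. The function name `is_synonymous` and the partial table are illustrative assumptions and are not taken from the reviewed studies.

```python
# Excerpt of the standard genetic code; only the codons used here are listed.
CODON_TABLE = {
    "CGT": "R", "CGC": "R", "CGA": "R", "CGG": "R",  # arginine
    "CTT": "L", "CTC": "L",                          # leucine
    "TTT": "F", "TTC": "F",                          # phenylalanine
}


def is_synonymous(ref_codon: str, alt_codon: str) -> bool:
    """Return True when both codons encode the same amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    return ref_aa == alt_aa


# rs16906252 as described in the text: R (CGC) -> R (CGT)
print(is_synonymous("CGC", "CGT"))
# Generic contrast case (not a specific MGMT variant): Leu codon -> Phe codon
print(is_synonymous("CTC", "TTC"))
```

A synonymous change leaves the protein sequence intact, so any effect of such a SNP would have to act at the DNA or RNA level, which is consistent with the reported association with promoter methylation rather than with altered enzyme structure.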
As discussed in our previous paper [25], the promoter of the human MGMT gene from GenBank (X61657) contains the TSS, which is located within the CpG island (CGI), DHS and exon 1 of the gene (Fig. 1). The overlap of CGI and DHS marks the open chromatin region and the location of active cis-regulatory elements. ENCODE studies of different cell lines have demonstrated by ChIP-seq (chromatin immunoprecipitation with antibodies specific to the TF, followed by sequencing of the precipitated DNA) that the MGMT promoter can be targeted in the open chromatin region by several TFs, including c-Myc, E2F6, Egr-1, ELF1, HEY1, Nrf1, Oct-2, Pol2, POU2F2 and TAF1 (Fig. 1, Table 2). The TFBSs form two clusters located close to each other. A 59 bp enhancer has been identified within the second cluster; it increased transcriptional activity in a reporter gene assay and was required for efficient promoter function [16]. This enhancer sequence is located at the first exon/intron boundary, at position +144 to +202 with respect to the TSS of the MGMT promoter X61657 [16]. This sequence overlaps with BSs for c-Myc, Oct-2, Pol2, POU2F2 and TAF1 found in different cell types (Fig. 1).
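The enhancer coordinates quoted above can be checked with simple interval arithmetic. The sketch below confirms that +144 to +202 spans 59 bp and shows how a clone-numbered position could be tested against that interval. The TSS index used in the example (`HYPOTHETICAL_TSS_CLONE_POS = 1000`) is a made-up placeholder and not the real TSS position in X61657; the helper names are likewise assumptions introduced for illustration.

```python
ENHANCER = (144, 202)  # TSS-relative, inclusive, as reported in [16]

# Placeholder only: NOT the real TSS index within clone X61657.
HYPOTHETICAL_TSS_CLONE_POS = 1000


def interval_length(start: int, end: int) -> int:
    """Length in bp of an inclusive interval."""
    return end - start + 1


def to_tss_relative(clone_pos: int, tss_clone_pos: int) -> int:
    """Convert a 1-based clone position to a TSS-relative coordinate.

    Convention assumed here: the TSS itself is +1 and there is no position 0,
    so the base immediately upstream of the TSS is -1.
    """
    offset = clone_pos - tss_clone_pos
    return offset + 1 if offset >= 0 else offset


def in_enhancer(clone_pos: int, tss_clone_pos: int) -> bool:
    """True when the clone position falls inside the enhancer interval."""
    rel = to_tss_relative(clone_pos, tss_clone_pos)
    return ENHANCER[0] <= rel <= ENHANCER[1]


print(interval_length(*ENHANCER))                     # 59
print(in_enhancer(1150, HYPOTHETICAL_TSS_CLONE_POS))  # position +151
```

With the real TSS index of the clone substituted for the placeholder, the same check could be applied to clone-numbered variants such as C1099T.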
Since SNPs can perturb TF recognition sites, we analyzed the DNA sequences within ± 20 bases of the polymorphic nucleotides for any changes (Fig. 2). For example, in the open chromatin region of this gene there are three SNPs (rs79442343, rs113327489, rs186050433) whose reference allele forms a CpG dinucleotide (Table 1). The rs2782888 SNP creates such a dinucleotide, increasing the total number of CpG dinucleotides within the CGI; their methylation can potentially attract methyl-CpG binding proteins, which in turn recruit histone deacetylase complexes and cause chromatin condensation [26].
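The gain or loss of a CpG dinucleotide at a polymorphic site can be sketched as follows. This is a minimal illustration under stated assumptions: the flanking sequences are toy strings rather than MGMT sequence, and the function names `count_cpg` and `cpg_change` are hypothetical helpers, not the tools used for Fig. 2. In practice the flanks would be the 20 bases on each side of the SNP.

```python
def count_cpg(seq: str) -> int:
    """Count CpG dinucleotides (a C immediately followed by a G)."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")


def cpg_change(flank5: str, ref: str, alt: str, flank3: str) -> int:
    """Net change in CpG count when the reference allele is replaced.

    A positive value means the alternative allele creates a CpG,
    a negative value means it destroys one, zero means no change.
    """
    ref_seq = flank5 + ref + flank3
    alt_seq = flank5 + alt + flank3
    return count_cpg(alt_seq) - count_cpg(ref_seq)


# Toy examples (not MGMT sequence):
print(cpg_change("AAC", "A", "G", "TT"))   # A>G after a C creates a CpG
print(cpg_change("AAT", "C", "T", "GAA"))  # C>T before a G destroys a CpG
```

Because a SNP touches at most two overlapping dinucleotides, the net change for a single-base substitution is small, but inside a CGI even one added or removed CpG alters the set of sites available for methylation.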
We also classified the SNPs from the open chromatin area of the human MGMT promoter using the Regulome database [27] to identify the putative regulatory potential and functional role of the variants (Table 3). According to this database, only three of the analyzed SNPs (rs112837630, rs2782888, rs16906255) belong to the group of variants that are likely to affect TF binding, since these SNPs are located within a TF binding motif and a DNase hypersensitive site [27]. However, Fig. 1 shows a larger number of SNPs that are located within the open chromatin area and may affect TF binding. This difference is probably due to the fact that RegulomeDB contains data from the older build 132 of dbSNP.
SNPs in the protein-coding region of the human MGMT gene. SNPs in the protein-coding region of the MGMT gene have also been found in populations of healthy donors and of patients with different types of cancer. For instance, the SNP at codon L53L (silent coding effect) in exon 3 was detected in melanoma patients and healthy Swedish individuals [10,28] and in individuals from the UK [5]. L84F (missense) in exon 3 was detected in melanoma patients and healthy individuals in Sweden [10,28], in lung cancer patients and healthy controls from Poland [11], in a Caucasian population [29] and in individuals from the UK [5]. I143V (rs2308321, missense) in exon 5 was detected in Swedish [10,28], Polish [11], English [5] and Caucasian [29] populations. The SNP at codon G160R (missense) in exon 5 was detected in young patients with adult-type cancers and in the control group [30]. K178R (rs2308327, missense) in exon 5 was identified in melanoma patients and healthy individuals in Sweden [10,28], and in lung cancer patients and healthy donors from Poland [11] and the UK [5]. A silent SNP, A197A, in exon 5 was found in melanoma patients and healthy Swedish individuals [10].
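The variant labels used above follow the pattern "reference amino acid, codon number, variant amino acid", so the silent or missense status can be read directly from the label. The short sketch below makes that rule explicit; `classify_variant` is a hypothetical helper, and it deliberately ignores other consequence types (for example stop codons), which do not occur among the labels listed here.

```python
import re

# One-letter amino acid, codon number, one-letter amino acid (e.g. "L84F").
VARIANT_PATTERN = re.compile(r"([A-Z])(\d+)([A-Z])")


def classify_variant(label: str) -> str:
    """Classify a protein-level label as 'silent' or 'missense'."""
    match = VARIANT_PATTERN.fullmatch(label)
    if match is None:
        raise ValueError(f"Unrecognized variant label: {label!r}")
    ref_aa, _codon, alt_aa = match.groups()
    return "silent" if ref_aa == alt_aa else "missense"


for label in ["L53L", "L84F", "I143V", "G160R", "K178R", "A197A"]:
    print(label, classify_variant(label))
```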
The polymorphic variants L84F, I143V and K178R, which are frequent in the human population, have been shown to have the same DNA repair efficiency as the wild-type MGMT enzyme, but they are more resistant to the pseudosubstrate inhibitors [3,4]. In particular, the I143V and I143V/K178R variants were found to have no effect on DNA repair activity compared with wild-type MGMT [28]. Although I143V is located close to the C145 residue at the active site of MGMT, this variant has been shown to have no effect on the activity of the enzyme, but it was more resistant to inactivation by PaTrin-2 [5]. A higher sensitivity of the protein to inactivation by PaTrin-2 has also been observed for the I143-K178 and I143-R178 variants compared with the V143-K178 and V143-R178 alleles [5]. The L84F and I143V SNPs were associated with a statistically significant increase in the risk of lung cancer in a Caucasian population, especially in smoking women with non-small cell lung cancer [29], and of adenocarcinoma of the esophagus [24]. In another study, the distributions of the L53L and L84F polymorphisms did not differ significantly between lung cancer patients and healthy controls [12]. I143V and K178R have recently been shown to correlate with an increased risk of temozolomide-induced myelosuppression [31]. K178R has been shown to be associated with an increased risk of adenocarcinoma of the esophagus [24], with lung cancer risk [32] and with colorectal cancer risk [33]. The association of MGMT polymorphisms with the risk of lung, breast, colorectal and endometrial cancer is reviewed in [3].
Conclusions. Many SNPs have been identified within the coding and non-coding regions of the human MGMT gene, in particular within the promoter region, the 5'- and 3'-UTRs and introns. Some SNPs detected in the open chromatin area of the promoter region marked by DHS overlap with TF BSs that have been experimentally identified in different cell lines. Consequently, they can perturb these recognition sites or change the methylation pattern by creating new CpG dinucleotides or destroying existing ones, and as a result such SNPs may affect gene expression. The association of MGMT SNPs with the risk of different types of cancer and with progression-free survival in cancer patients treated with alkylating chemotherapy, as well as the effect of SNPs on gene expression, on the activity of the enzyme and on its resistance to pseudosubstrate inhibitors, were discussed. Thus, SNPs can be among the factors that influence the interindividual variability of MGMT expression.
Acknowledgements. The study was supported by a grant for Young Scientists from the National Academy of Sciences of Ukraine (0111U008220).

Fig. 1. Single nucleotide polymorphisms (SNPs) from ENCODE [18] within the promoter region of the human MGMT gene. The open chromatin region of the MGMT promoter, marked by DHS and overlapping with TF ChIP-seq binding sites, is delimited by straight lines: 1 - the human MGMT promoter X61657 from the BLAT search; 2 - RefSeq gene and TSS location; 3 - DNaseI hypersensitive regions, shown as gray and dark boxes whose darkness is proportional to the maximum signal strength observed in any cell line (the number to the left of the box shows how many cell lines are hypersensitive in the region); 4 - the TF ChIP-seq track, which shows regions where TFs bind to DNA as assayed in different cell lines (the darkness of the box is proportional to the maximum signal strength observed in any cell line); 5 - CpG island; 6 - repeating elements (the open chromatin region contains a C-rich repeat (Low complexity Family and Class), 156 bp in length, at position chr10:131265319-131265474 on the plus strand of DNA); 7 - track of single nucleotide polymorphisms from dbSNP build 137 that have a minor allele frequency of at least 1 % and map only once to the reference assembly

Fig. 2. Examples of the effect of a polymorphic allele on the recognition motif for a TF. Bioinformatic analysis was done using the Match (A) and TFSEARCH (B) programs. Notes: *matrix identifier, position (strand), core match, matrix match, sequence (the plus strand is always shown), factor name; **TFMATRIX entries with High-scoring


Table 1
SNPs from dbSNP build 137 within open chromatin area of the human MGMT gene