Polymorphisms in microRNA target sites modulate risk of lymphoblastic and myeloid leukemias and affect microRNA binding

Background: MicroRNA dysregulation is a common event in leukemia. Polymorphisms in microRNA-binding sites (miRSNPs) in target genes may alter the strength of microRNA interaction with target transcripts, thereby affecting protein levels. In this study we aimed at identifying miRSNPs associated with leukemia risk and assessing the impact of these miRSNPs on miRNA binding to target transcripts.
Methods: Using specialized algorithms, we analyzed the 3′ untranslated regions of 137 leukemia-associated genes and identified 111 putative miRSNPs, of which 10 were chosen for further investigation. We genotyped patients with acute myeloid leukemia (AML, n = 87), chronic myeloid leukemia (CML, n = 140), childhood acute lymphoblastic leukemia (ALL, n = 101) and healthy controls (n = 471). Association between SNPs and leukemia risk was calculated by estimating odds ratios in multivariate logistic regression analysis. For miRSNPs that were associated with leukemia risk we performed luciferase reporter assays to examine whether they influence miRNA binding.
Results: Here we show that variant alleles of TLX1_rs2742038 and ETV6_rs1573613 were associated with increased risk of childhood ALL (OR (95% CI) = 3.97 (1.43-11.02) and 1.9 (1.16-3.11), respectively), while PML_rs9479 was associated with decreased ALL risk (OR = 0.55 (0.36-0.86)). In adult myeloid leukemias we found significant associations between the variant allele of PML_rs9479 and decreased AML risk (OR = 0.61 (0.38-0.97)), and between variant alleles of IRF8_rs10514611 and ARHGAP26_rs187729 and increased CML risk (OR = 2.4 (1.12-5.15) and 1.63 (1.07-2.47), respectively). Moreover, we observed a significant trend for an increasing ALL and CML risk with the growing number of risk genotypes, with OR = 13.91 (4.38-44.11) for carriers of ≥3 risk genotypes in ALL and OR = 4.9 (1.27-18.85) for carriers of 2 risk genotypes in CML. Luciferase reporter assays revealed that the C allele of ARHGAP26_rs187729 creates an illegitimate binding site for miR-18a-3p, while the A allele of PML_rs9479 enhances binding of miR-510-5p and the C allele of ETV6_rs1573613 weakens binding of miR-34c-5p and miR-449b-5p.
Conclusions: Our study indicates that microRNA-binding site polymorphisms modulate leukemia risk by interfering with miRNA-mediated regulation. Our findings underscore the significance of variability in 3′ untranslated regions in leukemia.


Background
MicroRNAs (miRNAs) are a class of small (~22 nucleotide) noncoding RNAs that are potent regulators of gene expression in animals and plants. In animals, miRNAs bind to target sequences (usually located in the 3′ untranslated region [3′UTR]) in messenger RNAs (mRNAs) and act by negatively regulating gene expression. This binding requires complementarity between nucleotides 2-8 of the miRNA (the so-called "seed" region) and the target mRNA [1]. To date more than 2500 mature human miRNAs have been identified [2] and they are predicted to regulate over 60% of human protein-coding genes [3]. This regulatory network can be very complex, as one miRNA may potentially regulate several mRNAs, and a given mRNA may possess binding sites for several miRNAs in its sequence.
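The seed requirement described above can be illustrated with a minimal Python sketch. It assumes strict Watson-Crick pairing only (no G:U wobble), and the miRNA and 3′UTR sequences are made-up placeholders, not sequences from this study; the function names are likewise hypothetical.

```python
# Watson-Crick complements for RNA bases (G:U wobble pairs are ignored here)
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna):
    """Return the mRNA motif (5'->3') complementary to miRNA nucleotides 2-8."""
    seed = mirna[1:8]  # nucleotides 2-8 in 1-based numbering
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def has_seed_match(mirna, utr):
    """True if the 3'UTR contains a perfect match to the miRNA seed."""
    return seed_site(mirna) in utr


# Placeholder sequences chosen only for illustration
mirna = "UACGGAUCCGAUUGCAAGCUAG"
utr_wild_type = "AAGAUCCGUAA"  # contains the seed-complementary motif
utr_variant = "AAGAUCUGUAA"    # a single-base change disrupts the motif

print(seed_site(mirna))                      # GAUCCGU
print(has_seed_match(mirna, utr_wild_type))  # True
print(has_seed_match(mirna, utr_variant))    # False
```

A single nucleotide change inside the motif is enough to abolish the perfect seed match in this toy model, which is the basic rationale for studying SNPs in miRNA target sites.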
Since miRNAs control a wide variety of biological processes, including proliferation, apoptosis and differentiation, dysfunctions of the miRNA regulatory network may contribute to tumorigenesis. MiRNAs can act as both oncogenes and tumor suppressors. Examples of oncogenic miRNAs that are amplified or overexpressed in cancers include the miR-17-92 cluster, miR-21, miR-155 and miR-372/373, while tumor suppressor miRNAs commonly deleted or with reduced expression in cancers are represented by miR-15a and miR-16-1, the miR-34 family, the let-7 family and miR-29 [4]. MiRNAs play a crucial role in normal hematopoiesis by controlling the differentiation of hematopoietic stem cells into different types of mature blood cells, while deregulation of miRNA networks has been linked to hematological malignancies [5]. Aberrant miRNA expression profiles have been observed in leukemias and lymphomas, and for several miRNAs there is experimental evidence for their functional involvement in leukemogenesis [6,7]. Specific miRNA expression signatures can accurately discriminate different leukemia subtypes and are often of great prognostic relevance [8-12]. Changes in miRNA expression may result from genomic and epigenetic alterations or the impairment of the miRNA biogenesis pathway [13]. In addition, polymorphisms in miRNA genes or miRNA target sites (miRSNPs) can modify miRNA action. While polymorphisms in miRNA genes are relatively rare, SNPs in miRNA-binding sites in target genes are more frequent. Several studies have shown that SNPs in miRNA target sites enhance or weaken the interaction between miRNA and its target transcripts and are associated with cancers and other diseases [14,15]. In leukemia, however, so far only one study associated a SNP in the 3′UTR of the NPM1 gene with adverse outcome in acute myeloid leukemia [16].
Considering that miRNAs have been shown to play an essential role in leukemogenesis and that SNPs in miRNA-binding sites in target genes have been associated with various cancers, in this study we aimed at identifying miRSNPs associated with leukemia risk and assessing the impact of these miRSNPs on miRNA binding to target transcripts.

Characteristics of the study groups
Mean age at diagnosis was 7.2 years (range 0-17) for ALL patients, 51.6 (16-90) for AML patients, 51.5 for CML patients and 51.4 for controls. The proportion of males was 53% for ALL patients, 45% for AML patients, 54% for CML patients and 55% for controls. Clinical characteristics of leukemia patients are presented in Tables 1, 2 and 3.

Identification of putative SNPs affecting miRNA binding
To identify putative miRSNPs we analyzed SNPs located in the 3′UTRs of genes with reported relevance for leukemias (according to Entrez Gene). Of the 137 analyzed genes, the 3′UTRs of 88 did not harbor any SNP, or any SNP with a minor allele frequency (MAF) greater than 0.05, in the Caucasian population. The remaining 49 genes possessed 160 SNPs in their 3′UTRs (Table 4). These SNPs were analyzed using miRanda [17], PITA [18], Patrocles [19] and PolymiRTS [20] regarding their potential impact on miRNA binding. We identified 111 putative miRSNPs, of which 10 were chosen for further studies. The criteria for inclusion were: 1) concordance of at least two applied algorithms regarding the effect of the SNP on miRNA binding (except for rs2735383 G > C in NBN, which was earlier associated with lung cancer and shown to affect binding of miR-629 [21]), and 2) expression of predicted miRNAs in bone marrow, white blood cells or leukemias and lymphomas according to the mimiRNA database [22] (Table 5).
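The two inclusion criteria can be sketched as a simple filter. The sketch below assumes a hypothetical record format in which each prediction is an (algorithm, miRNA, effect) tuple, and it reads "concordance" as at least two algorithms predicting the same direction of effect for the same miRNA. Both the format and that reading are assumptions made for illustration, the rs2735383 exception is not modeled, and the SNP and miRNA names are placeholders.

```python
def passes_selection(predictions, expressed_mirnas, min_algorithms=2):
    """Return True if some miRNA is expressed in the relevant tissues and
    at least `min_algorithms` tools agree on the effect of the SNP on it.

    predictions: iterable of (algorithm, mirna, effect) tuples, where
    effect is "enhanced" or "weakened" (hypothetical record format).
    """
    support = {}
    for algorithm, mirna, effect in predictions:
        # Collect the set of tools backing each (miRNA, direction) pair
        support.setdefault((mirna, effect), set()).add(algorithm)
    return any(
        len(algorithms) >= min_algorithms and mirna in expressed_mirnas
        for (mirna, _effect), algorithms in support.items()
    )


# Placeholder predictions for two made-up SNPs
snp_a = [
    ("miRanda", "miR-X", "weakened"),
    ("PITA", "miR-X", "weakened"),
    ("PolymiRTS", "miR-Y", "enhanced"),
]
snp_b = [
    ("miRanda", "miR-X", "weakened"),
    ("PITA", "miR-X", "enhanced"),  # the tools disagree on the direction
]
expressed = {"miR-X"}

print(passes_selection(snp_a, expressed))  # True
print(passes_selection(snp_b, expressed))  # False
```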

Effects of miRSNPs on leukemia risk
The selected miRSNPs were genotyped in leukemia patients and healthy individuals to identify miRSNPs associated with leukemia risk. Table 6 summarizes the genotype frequencies in the control and leukemia groups.
All SNPs were in Hardy-Weinberg equilibrium in controls except for ABL1_rs7457 C > T, but for this SNP we did not detect significant associations with any leukemia type. However, associations in AML and CML were significant only when the uncorrected p-value was considered. In ALL and CML more than one SNP was found to be associated with leukemia risk, so we assessed the effects of combined risk genotypes. In ALL, risk genotypes were defined as ETV6_rs1573613 CC, PML_rs9479 GG, TLX1_rs2742038 TT, ATM_rs227091 CC and CT, and IRF8_rs10514611 TT (the latter two reached borderline significance in the analysis of individual SNPs). In CML, risk genotypes were defined as ARHGAP26_rs187729 CC and IRF8_rs10514611 TT. In both leukemia types we observed a trend for an increasing leukemia risk as the number of risk genotypes rose, with OR for carriers of 3 or 4 risk genotypes in ALL reaching 13.91 (4.38-44.11) and for carriers of 2 risk genotypes in CML amounting to 4.9 (1.27-18.85) (Table 7).
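Counting risk genotypes per individual, using the definitions given above, might look like the following Python sketch. The risk genotype sets are taken from the text; the dictionary layout, the function name and the example individual are assumptions made for illustration.

```python
# Risk genotypes as defined in the text (alleles written in alphabetical order)
ALL_RISK_GENOTYPES = {
    "ETV6_rs1573613": {"CC"},
    "PML_rs9479": {"GG"},
    "TLX1_rs2742038": {"TT"},
    "ATM_rs227091": {"CC", "CT"},
    "IRF8_rs10514611": {"TT"},
}
CML_RISK_GENOTYPES = {
    "ARHGAP26_rs187729": {"CC"},
    "IRF8_rs10514611": {"TT"},
}


def count_risk_genotypes(genotypes, risk_definition):
    """Count how many SNPs in `risk_definition` carry a risk genotype.

    Individuals with any missing genotype are rejected, mirroring the
    complete-case rule described in the statistical analysis.
    """
    missing = [snp for snp in risk_definition if snp not in genotypes]
    if missing:
        raise ValueError(f"missing genotypes: {missing}")
    count = 0
    for snp, risk_set in risk_definition.items():
        genotype = "".join(sorted(genotypes[snp]))  # "TC" and "CT" are the same genotype
        if genotype in risk_set:
            count += 1
    return count


# Hypothetical individual, not a study participant
person = {
    "ETV6_rs1573613": "CC",
    "PML_rs9479": "AG",
    "TLX1_rs2742038": "CT",
    "ATM_rs227091": "TC",
    "IRF8_rs10514611": "TT",
    "ARHGAP26_rs187729": "CT",
}

print(count_risk_genotypes(person, ALL_RISK_GENOTYPES))  # 3
print(count_risk_genotypes(person, CML_RISK_GENOTYPES))  # 1
```

The resulting count is the variable that would then enter the logistic regression as the number of risk genotypes.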

Impact of SNPs on miRNA binding
For the miRSNPs that were associated with leukemia risk we performed luciferase assays to examine whether they influence binding of miRNAs predicted by the applied algorithms. No significant differences in luciferase levels were observed between the 3′UTR-wild type and 3′UTR-variant constructs for each 3′UTR tested in the presence of the miRNA negative control. The C allele of ARHGAP26_rs187729 T > C replaces a G:U wobble with a canonical G-C pair (Figure 1B). This created a new binding site for miR-18a-3p (luciferase activity increased by 7% for the T allele and decreased by 34% for the C allele) (Figure 1A). MiR-18a-3p inhibitor only slightly increased luciferase level for the C allele (by 10%), suggesting that endogenous miR-18a-3p levels in Jurkat cells are too low to exert a significant regulatory effect on the ARHGAP26 3′UTR. The C allele of ETV6_rs1573613 T > C introduces a mismatch in the centre of the target site for miR-34c-5p and miR-449b-5p (these two miRNAs have the same seed sequence) (Figure 1D). This weakened binding of miR-34c-5p (luciferase activity reduced by 27% for the T allele and 10% for the C allele) and also decreased binding of miR-449b-5p (luciferase activity reduced by 47% for the T allele and 14% for the C allele) (Figure 1C).
The A allele of PML_rs9479 G > A introduces a mismatch within the binding site for miR-510-5p. The effect was, however, opposite to the prediction, as the A allele enhanced binding of miR-510-5p (luciferase activity reduced by 10% for the G allele and 29% for the A allele) (Figure 1E). A detailed analysis of the PML 3′UTR sequence showed that only 7 bp downstream of the miR-510-5p binding site for the G allele of rs9479 (position 2621 of the 3′UTR) there is a second binding site for miR-510-5p (position 2628) (Figure 1F). Thus, in the 3′UTR with the G allele there are two sites competing for miR-510-5p binding: one at position 2621 of the 3′UTR and a second at position 2628. Their close proximity may result in less efficient miR-510-5p binding to either site and decrease the regulatory effect on PML. In contrast, in the 3′UTR with the A allele miR-510-5p may bind only to position 2628, causing PML downregulation. The A allele of PML_rs9479 G > A also introduces a mismatch within the binding site for miR-589-3p (Figure 1F), but no significant effect on luciferase activity was observed for either variant of the PML 3′UTR (Figure 1E). The T allele of TLX1_rs2742038 C > T was predicted to enhance binding of miR-492 and the T allele of IRF8_rs10514611 C > T was predicted to increase binding of miR-330-3p, but we did not observe any noticeable effects on luciferase levels for either 3′UTR variant of those SNPs.
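The allele-specific effects reported in this section can be summarized with a small worked calculation. The percentages below are the ones stated in the text; the dictionary layout and the function name are illustrative assumptions. The sketch converts each reported change into activity relative to the miRNA negative control (set to 100%) and takes the difference between the variant and wild-type alleles, which is expressed in percentage points of the negative control.

```python
# Change in luciferase activity (%) relative to the miRNA negative control,
# as reported in the text; negative values are reductions.
# Each entry: (wild-type allele, variant allele, {allele: change}).
REPORTED = {
    ("ARHGAP26_rs187729", "miR-18a-3p"): ("T", "C", {"T": +7, "C": -34}),
    ("ETV6_rs1573613", "miR-34c-5p"): ("T", "C", {"T": -27, "C": -10}),
    ("ETV6_rs1573613", "miR-449b-5p"): ("T", "C", {"T": -47, "C": -14}),
    ("PML_rs9479", "miR-510-5p"): ("G", "A", {"G": -10, "A": -29}),
}


def allele_difference(wild_type, variant, changes):
    """Variant minus wild-type activity, in percentage points of the control."""
    activity_wild_type = 100 + changes[wild_type]
    activity_variant = 100 + changes[variant]
    return activity_variant - activity_wild_type


for (snp, mirna), (wild_type, variant, changes) in REPORTED.items():
    diff = allele_difference(wild_type, variant, changes)
    print(f"{snp} / {mirna}: {variant} vs {wild_type} = {diff:+d} points")
```

Under this reading, the ETV6 C allele retains 17 and 33 points more activity than the T allele for miR-34c-5p and miR-449b-5p, and the PML A allele retains 19 points less than the G allele, which matches the figures quoted in the Discussion.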

Discussion
In this study we show that polymorphisms in microRNA-binding sites (miRSNPs) modulate leukemia risk and influence binding of miRNAs to target transcripts. To our knowledge this is only the second study reporting a relevance of miRSNPs in leukemia. Recently, Cheng et al. [16] identified a SNP in the 3′UTR of the NPM1 gene that, although present with similar frequency in the control group, was associated with adverse outcome and shorter survival in patients with AML. They further showed that this SNP created an illegitimate binding site for miR-337-5p, which reduced levels of NPM1 mRNA and protein.
MiRSNPs in PLA2G2A, IL-16 and NOD2 were also studied in acute leukemia but no significant differences in the genotype frequencies between leukemia patients and control group were detected [23].
Our study identified five miRSNPs associated with risk of different leukemia types, which is in line with findings in other tumors showing that polymorphisms in miRNA-binding sites may predispose to cancer [21,24-26]. Moreover, we observed a significant trend for an increasing ALL and CML risk with the growing number of risk genotypes, indicative of a possible additive effect of the identified miRSNPs. This finding suggests that a panel of miRNA-binding site polymorphisms could be of clinical utility as markers of leukemia risk. To verify this possibility, miRSNPs reported in this study should be tested in prospective studies on independent patient groups.
Table legend: Predicted impact of the variant allele on miRNA binding by different algorithms: ↑ binding enhanced, ↓ binding weakened; miRNAs expressed in bone marrow, white blood cells or leukemias and lymphomas are in bold; miRNAs predicted by more than one algorithm are in italics.
In chronic myeloid leukemia, SNPs in ARHGAP26 and IRF8 showed significant association with an increased risk of CML, although only for the uncorrected p-value. ARHGAP26 (GRAF) belongs to a family of Rho GTPase activating proteins and is a tumor suppressor acting by negatively regulating RhoA, a small GTP-binding protein with a growth-promoting effect in RAS-mediated malignant transformation [27,28]. Abnormal methylation of the ARHGAP26 promoter and downregulation of the ARHGAP26 mRNA were observed in acute myeloid leukemia and myelodysplastic syndrome [29,30]. In our study, the C allele of ARHGAP26_rs187729 T > C, associated with an increased risk of CML, created a new binding site for miR-18a-3p which decreased the protein expression by 34%. MiR-18a is a component of the oncogenic miR-17-92 cluster, which is often amplified in aggressive B-cell lymphomas [31], and increased expression of miR-18a was correlated with a shorter overall survival in diffuse large B-cell lymphoma patients [32]. Our results demonstrate that the functional polymorphism in the 3′UTR of the tumor suppressor ARHGAP26 creates an illegitimate binding site for miR-18a-3p, which may affect protein levels and contribute to an increased risk of CML.
A significant association with increased CML risk was also observed for a SNP in IRF8. IRF8 (ICSBP) is a transcription factor that regulates expression of genes stimulated by interferons and is essential for the differentiation of myeloid, dendritic and B-lymphoid lineages [33]. Its tumor suppressor activity is supported by mice with a null mutation of IRF8 developing a CML-like syndrome [34] and by the lack of IRF8 expression in patients with chronic and acute myeloid leukemia [35]. IRF8 downregulation is thought to contribute to the pathogenesis of CML through the loss of IRF8-mediated regulation of several apoptosis-related genes [36,37]. We did not observe any effect of the IRF8 3′UTR on the protein expression in the presence of miR-330-3p, irrespective of the allele. However, it is possible that this SNP affects binding of another miRNA or influences other regulatory functions of the 3′UTR. The potential functional significance of IRF8_rs10514611 C > T remains to be elucidated.
In pediatric acute lymphoblastic leukemia we identified three miRSNPs modulating the ALL risk, in ETV6, PML and TLX1. The tumor suppressor ETV6 (TEL) is a transcription factor (repressor of transcription) with a crucial role in embryonic development and hematopoietic regulation [38,39]. Translocations involving the ETV6 locus (12p13) are a frequent event in leukemia and myelodysplastic syndrome. They result in fusions with numerous partners and contribute to leukemogenesis by several pathogenic mechanisms [40]. In our study, the C allele of ETV6_rs1573613 T > C, associated with an increased risk of ALL, weakened binding of miR-34c-5p and miR-449b-5p, which resulted in higher protein levels (respectively 17% and 33% higher than for the T allele). The result is in accordance with the effect predicted by the in silico analysis; however, it does not fit the model of ETV6 as a tumor suppressor. A polymorphism associated with an increased leukemia risk would be expected to decrease rather than increase expression of a protein with a tumor suppressor function. Two cases of ETV6 amplification have been described: one in B lymphoblastic leukemia [41] and the other in myelodysplastic syndrome [42]. Also in our study one patient with common-ALL had an additional chromosome 12 in her blast cell karyotype, and this girl was also a homozygote for the C allele of ETV6_rs1573613 T > C. It is possible then that ETV6 can play a dual role of both a tumor suppressor gene and an oncogene, although a mechanism underlying the oncogenic activity of ETV6 remains to be revealed. Among the known rearrangements involving ETV6 there are a few in which the 3′UTR of ETV6 is preserved. For example, the MN1-ETV6 fusion protein acts as a transcriptional activator whereas ETV6 is a repressor, and the PAX5-ETV6 fusion protein is an aberrant transcription factor affecting both the PAX5 and the ETV6 pathways [40].
Increased expression of those fusion proteins resulting from the weaker binding of miR-34c-5p and miR-449b-5p to the 3′UTR with the C allele of ETV6_rs1573613 T > C could intensify their aberrant action. It is also feasible that the miRSNP in ETV6 could act in trans. Recently a concept of "competing endogenous RNAs" (ceRNA) has been developed which assumes that messenger RNAs, transcribed pseudogenes and long non-coding RNAs compete for miRNA binding, intertwined in a large regulatory network [43]. Thus, presence of the C allele would release the miRNAs otherwise bound by the wild-type 3′UTR and increase the pool of miR-34c-5p and miR-449b-5p available for other targets. Indeed, members of the miR-34 family have been reported to be overexpressed in childhood ALL [44,45].
A significant association with elevated ALL risk was also observed for a SNP in TLX1. TLX1 (HOX11) is a transcription factor that plays a crucial role in embryonic development and in the genesis of the spleen [46]. Translocations involving the TLX1 locus (10q24) and its increased expression occur in a significant proportion of T-ALL, supporting the oncogenic role of TLX1 [47] and suggesting that TLX1 expression in adult tissues is tightly controlled. We did not observe any effect of the TLX1 3′UTR on the protein expression in the presence of miR-492 for either allelic variant. Hence, the potential functional role of TLX1_rs2742038 C > T remains to be revealed.
Figure 1 Effect of miRSNPs on miRNA binding and protein expression. A-C) Jurkat cells were transfected in triplicate with 1 µg either empty psiCheck2 vector or psiCheck2 constructs containing 3′UTRs with wild-type and variant alleles, with miRNA mimics, inhibitors or miRNA negative control (50 pmol/well). 24 h post transfection luciferase activity was measured. Data show relative Renilla luciferase levels normalized to firefly luciferase and corrected for the effect of miRNA mimics on the empty psiCheck2 vector. Values for the miRNA negative control were set as 100%. All transfections were done three times. * p < 0.05, ** p < 0.01, *** p < 0.001. Sequence alignments of D) miR-18a-3p with the ARHGAP26 3′UTR, E) miR-34c-5p and miR-449b-5p with the ETV6 3′UTR and F) miR-510-5p and miR-589-3p with the PML 3′UTR. MiRNA seed sequences are underlined. Allelic variants in each 3′UTR are underlined and in bold. For miR-510-5p its seed sequence repeated twice is shown (one in bold, the other underlined) instead of the entire mature miRNA sequence to show its complementarity with two adjacent miR-510 binding sites in the PML 3′UTR.
The A allele of PML_rs9479 G > A was associated with reduced risk of both pediatric ALL and acute myeloid leukemia in adults. PML is a transcription factor and tumor suppressor that controls cell growth and apoptosis [48]. Translocation involving the PML locus (15q22) resulting in the expression of a fusion protein PML-RARα is found in the majority of acute promyelocytic leukemia cases [49]. Moreover, a partial or complete loss of the PML protein expression has been observed in several solid tumors [48], highlighting its role in carcinogenesis. In this study the A allele of PML_rs9479 G > A enhanced binding of miR-510-5p, resulting in a 19% decrease in the protein expression as compared to the G allele. However, lower expression caused by the A allele, which had a protective effect in our study, does not fit the model of PML as a tumor suppressor. Increased expression of PML was observed in tumor cells of Hodgkin lymphoma [50] and hepatocellular carcinoma [51,52], suggesting an oncogenic role of PML, but its functional significance is unknown. It is more plausible that the protective role of the A allele could be attributed to sequestering miR-510-5p from its other targets. Little is known about miR-510-5p function and its validated target genes, but its oncogenic role was shown in breast cancer, where overexpression of miR-510 increased tumor growth in vivo [53].
We are aware of some limitations of our study. The groups of leukemia patients were small, so the results should be treated as preliminary and need to be replicated in larger cohorts. Also, although we restricted our analysis to the miRNAs that are expressed in bone marrow, white blood cells or leukemias and lymphomas, the actual interaction of the analyzed miRNA-mRNA pairs in leukemic cells should be confirmed. Nonetheless, our study demonstrates that microRNA-binding site polymorphisms influence leukemia risk by interfering with the miRNA-mediated regulation of gene expression and underscores the significance of the variability in the 3′UTRs in leukemia.

Study groups
Since there were no reports on the association of miRNA-binding site polymorphisms with leukemia at the time this study was conceived, we decided to perform a pilot study in various leukemia types. The study comprised children diagnosed with acute lymphoblastic leukemia (ALL, n = 101) and adults with acute (AML, n = 87) and chronic myeloid leukemia (CML, n = 140) from oncology departments in Poznan, Poland. The control group (n = 471) was recruited among healthy blood donors with no history of cancer from Poznan Blood Centre. All protocols were carried out according to the Declaration of Helsinki. The study was approved by the Ethics Committee at the Poznan University of Medical Sciences (decision no 803/10) and informed consent was obtained from all subjects or their legal guardians.

Selection of polymorphisms
We analyzed 137 genes associated with leukemia according to Entrez Gene [54] as of April 2011. We defined these as genes that were linked with susceptibility to any type of leukemia in the 'Phenotypes' field of the gene report. The 3′UTR sequences were obtained from the University of California Santa Cruz genome browser [55] in April 2011. SNPs residing in the 3′UTRs of these genes were identified by searching dbSNP [56]. SNPs with a minor allele frequency (MAF) greater than 0.05 in the Caucasian population were analyzed for their potential impact on miRNA binding using specialized algorithms and databases: miRanda [17], PITA [18], the Patrocles database [19] and the PolymiRTS database [20].

Reporter constructs
SNPs for which we found statistically significant differences between the control and study groups were tested for their impact on miRNA binding by luciferase assay. Fragments of 3′UTRs (1000-1300 bp) were amplified from DNA of homozygotes for the major allele. The forward primer contained an XhoI restriction site for convenient cloning. Purified PCR products were cloned into the pGEM®-T Easy vector (Promega, Fitchburg, WI, USA). Inserts containing the variant allele for each SNP were obtained by site-directed mutagenesis using the QuikChange II Site-Directed Mutagenesis Kit (Stratagene, La Jolla, CA, USA). Wild-type and variant inserts were then cleaved out using XhoI and NotI (Promega) and subcloned into the psiCheck2 vector (Promega) downstream of the Renilla luciferase reporter gene. This vector also contains a secondary reporter, firefly luciferase, used for transfection normalization. Each construct was sequenced to confirm the sequence and orientation of the insert.

Luciferase assay
Jurkat cells were plated in 24-well plates (2 × 10⁵ cells per well) in RPMI with 10% fetal bovine serum without antibiotics and transfected in triplicate with 1 µg of either empty psiCheck2 vector or psiCheck2 constructs containing 3′UTRs with wild-type and variant alleles, together with miRNA mimics, inhibitors or miRNA negative control (50 pmol/well) (Ambion, Austin, TX, USA), using Lipofectamine 2000 (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's protocol. Among the studied miRNAs, only miR-18a-3p is expressed in Jurkat cells according to the mimiRNA database [22], and for this miRNA we also included its inhibitor in the luciferase assay. Luciferase activity was measured 24 h post transfection using the Dual-Luciferase® Reporter Assay System (Promega) according to the manufacturer's protocol. All transfections were carried out three times.
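One way to turn raw dual-luciferase readings into the relative values reported in Figure 1 is sketched below in Python. The exact formula is an assumption: it normalizes Renilla to firefly, expresses each construct's signal with the miRNA mimic relative to the miRNA negative control, and divides by the same quantity for the empty psiCheck2 vector, so that the negative control equals 100%. The function name and the readings are hypothetical.

```python
def relative_activity(construct_mimic, construct_nc, empty_mimic, empty_nc):
    """Relative Renilla luciferase activity (%) with negative control = 100%.

    Each argument is a (renilla, firefly) pair of raw readings:
      construct_mimic: 3'UTR construct + miRNA mimic
      construct_nc:    3'UTR construct + miRNA negative control
      empty_mimic:     empty vector + miRNA mimic
      empty_nc:        empty vector + miRNA negative control
    """
    def ratio(reading):
        renilla, firefly = reading
        return renilla / firefly  # normalize for transfection efficiency

    construct_effect = ratio(construct_mimic) / ratio(construct_nc)
    vector_effect = ratio(empty_mimic) / ratio(empty_nc)  # mimic effect on empty vector
    return 100.0 * construct_effect / vector_effect


# Hypothetical raw readings (arbitrary luminescence units)
activity = relative_activity(
    construct_mimic=(330.0, 1000.0),
    construct_nc=(500.0, 1000.0),
    empty_mimic=(900.0, 1000.0),
    empty_nc=(1000.0, 1000.0),
)
print(f"{activity:.1f}% of negative control")
```

With these made-up readings the construct retains about 73% of the control activity, i.e. a reduction of roughly 27% after correcting for the mimic's effect on the empty vector.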

Statistical analysis
Hardy-Weinberg equilibrium of the genotypes in the study groups was verified by the Chi-square test. The association between SNPs and leukemia risk was calculated by estimating the odds ratio (OR) and its 95% confidence interval (CI) in multivariate logistic regression analysis, adjusted for sex and age. To detect the best genetic model (dominant, additive or recessive), we first performed the Cochran-Armitage test for trend, and the model with the highest likelihood was chosen for the logistic regression analysis. The Benjamini-Hochberg false discovery rate (FDR) control method was used to correct for multiple comparisons in each leukemia group. The SNPs showing a significant or borderline significant (before applying FDR) association with leukemia risk were then analyzed for the effect of the total number of risk genotypes on leukemia risk in multivariate logistic regression analysis, adjusted for sex and age. Only samples without any missing genotype were included in the analysis: n = 468 for controls, n = 93 for ALL and n = 140 for CML. Normalized Renilla luciferase reporter gene expression levels were compared by Student's t test. P < 0.05 was considered statistically significant.
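Two of the procedures named above, the Chi-square test for Hardy-Weinberg equilibrium and the Benjamini-Hochberg correction, can be sketched with the Python standard library alone. This is an illustrative sketch, not the software used in the study; the function names, genotype counts and p-values are hypothetical, and the Hardy-Weinberg function assumes that both alleles are observed in the sample.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 degree of freedom) for Hardy-Weinberg equilibrium.

    Assumes both alleles are present, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function for 1 df
    return chi2, p_value


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical genotype counts (AA, AB, BB) and raw p-values
chi2, p_hwe = hwe_chi_square(25, 50, 25)
print(chi2, p_hwe)
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```

In the second example the raw p-values 0.03 and 0.04 both receive an adjusted value of about 0.053, which shows how an association that is nominally significant can lose significance after FDR correction.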