Microarray-based analysis and clinical validation identify ubiquitin-conjugating enzyme E2E1 (UBE2E1) as a prognostic factor in acute myeloid leukemia

Background Previous research suggested that the expression of single genes might correlate with survival in acute myeloid leukemia (AML). Therefore, we conducted a systematic analysis of prognostic gene expression in AML. Methods We performed a microarray-based analysis of correlations between gene expression and overall survival (OS) in adult AML using datasets GSE12417 and GSE8970. Positive findings were validated in an independent cohort of 50 newly diagnosed, non-acute promyelocytic leukemia (APL) AML patients by quantitative RT-PCR and survival analysis. Results Microarray-based analysis suggested that the expression of each of eight genes was associated with 1-year and 3-year AML OS in both the GSE12417 and GSE8970 datasets (p < 0.05). Next, we validated our findings in an independent cohort of AML samples collected in our hospital. We found that ubiquitin-conjugating enzyme E2E1 (UBE2E1) expression was adversely correlated with AML survival (p = 0.04). Multivariable analysis showed that UBE2E1 high patients had significantly shorter OS and shorter progression-free survival after adjusting for other known prognostic factors (p = 0.03). Finally, we found that UBE2E1 expression was negatively correlated with patients’ response to induction chemotherapy (p < 0.05). Conclusions In summary, we demonstrated that UBE2E1 expression is a novel prognostic factor in adult, non-APL AML patients. Electronic supplementary material The online version of this article (doi:10.1186/s13045-016-0356-0) contains supplementary material, which is available to authorized users.


Background
Acute myeloid leukemia (AML), characterized by expansion of malignant myeloid precursor cells in peripheral blood and bone marrow, is the most prevalent acute leukemia in adults [1]. Several AML prognostic factors have been reported, including patient age and cytogenetic features [2,3]. Interestingly, Metzeler et al. showed that high expression of lymphoid enhancer binding factor-1 (LEF1) is a favorable prognostic factor in non-acute promyelocytic leukemia (APL) AML [4]. That study provided insight into the prognostic value of single gene expression in AML. Therefore, we performed a systematic microarray-based analysis to search for genes whose expression correlates with AML overall survival (OS).

Microarray datasets download and analysis
We selected AML microarray datasets from Oncomine (www.oncomine.com). Our selection criteria were that (i) the microarray examined adult AML patient samples; (ii) both the array data and the patient survival data were published; (iii) the microarray data were of adequate quality; and (iv) the microarray used an Affymetrix array platform. Based on these criteria, we used the GSE12417 and GSE8970 datasets for our analysis [5,6]. GSE12417 comprised two independent cohorts of samples, which were examined on Affymetrix platforms GPL570 and GPL96, respectively. Specifically, the GSE12417-GPL96 dataset included gene expression profiles of 163 adult AML patients, while the GSE12417-GPL570 dataset included gene expression profiles of 79 adult AML patients. The patients were previously untreated and received cytarabine-based intensive induction and consolidation chemotherapy in the trial [4,5]. The GSE8970 dataset used Affymetrix platform GPL96 and included 34 adult AML patient samples. These patients were pretreated with tipifarnib [6]. The stem cell transplantation status of those patients was not available. The same probe ID system was used in all of the above datasets, enabling results to be compared across datasets. Gene expression profiles of the above datasets were downloaded from the NCBI Gene Expression Omnibus database. Clinical information for those patients was downloaded from Oncomine.
Our approach was to identify prognostic genes in each microarray dataset separately and then find the prognostic genes common to all tested datasets, to avoid the bias associated with a single microarray dataset. Within one dataset, the expression of a single gene in each AML patient sample was represented by probe intensity. Patients with a probe intensity value above or below the median of all samples were assigned to the probe high and probe low groups, respectively. Survival (1 year and 3 years) of the two groups was compared by the Mantel-Cox test, and p < 0.05 was considered significant. This calculation was repeated for all genes (probes) in the dataset with a program written in R to generate a list of prognostic genes. Genes common to the datasets were identified using the shared probe IDs (Fig. 1a).
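The per-probe screen described above can be sketched as follows. This is a minimal illustration in Python, not the authors' R program, which is not shown in the source. The function names, the toy probe IDs (`probe_A`, `probe_B`), the toy follow-up data, and the choice to place values equal to the median in the low group are all assumptions made for the example.

```python
import math
from statistics import median

def censor_at(times, events, horizon):
    """Administratively censor follow-up at `horizon` (e.g. 12 or 36 months)."""
    t = [min(x, horizon) for x in times]
    e = [ev if x <= horizon else 0 for x, ev in zip(times, events)]
    return t, e

def logrank(times, events, high):
    """Two-group log-rank (Mantel-Cox) test; returns (chi_square, p_value)."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    obs_minus_exp, var = 0.0, 0.0
    for t in event_times:
        n = sum(1 for x in times if x >= t)                       # at risk, both groups
        n1 = sum(1 for x, g in zip(times, high) if x >= t and g)  # at risk, high group
        d = sum(1 for x, e in zip(times, events) if x == t and e)  # deaths at t
        d1 = sum(1 for x, e, g in zip(times, events, high) if x == t and e and g)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if var == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / var
    # Upper tail of the chi-square distribution with 1 degree of freedom
    return chi2, math.erfc(math.sqrt(chi2 / 2))

def prognostic_probes(expr, times, events, horizon, alpha=0.05):
    """expr maps probe ID -> list of intensities, one per patient."""
    t, e = censor_at(times, events, horizon)
    hits = set()
    for probe, values in expr.items():
        cut = median(values)
        high = [v > cut for v in values]  # assumption: ties go to the low group
        _, p = logrank(t, e, high)
        if p < alpha:
            hits.add(probe)
    return hits

# Toy data (hypothetical): 8 patients, follow-up in months, 1 = death observed
toy_times = [2, 3, 4, 5, 20, 22, 25, 30]
toy_events = [1, 1, 1, 1, 0, 0, 0, 0]
toy_expr = {
    "probe_A": [9, 9, 8, 8, 1, 2, 1, 2],  # high intensity tracks early death
    "probe_B": [1, 9, 2, 8, 1, 9, 2, 8],  # unrelated to survival
}
hits_1y = prognostic_probes(toy_expr, toy_times, toy_events, horizon=12)
```

Running the same screen on each dataset and intersecting the resulting sets of probe IDs (for example `hits_dataset_1 & hits_dataset_2`) would yield the common candidates.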

Patient samples
Our validation cohort comprised 50 newly diagnosed AML patient samples collected at West China Hospital, Sichuan University from 2010 to 2011. The inclusion criteria were (1) adult patients (age > 18); (2) patients with newly diagnosed AML, excluding the APL subtype; and (3) no chemotherapy administered before the study. Bone marrow cell samples of the patients were collected as described previously [7]. All patients were treated based on the standard protocol including anthracyclines plus cytarabine. The study was reviewed and approved by the Central Ethics Committees of Institute of Hematology/Blood Diseases Hospital, Chinese Academy of Medical Sciences, and was filed with and permitted by the Ethics Committees of West China Hospital, Sichuan University.

Fig. 1 a Diagram showing the principle of microarray-based analysis. b OS according to high gene expression (gene high, red) and low gene expression (gene low, blue), analyzed using microarray datasets GSE12417 and GSE8970. GSE12417 had two independent cohorts, examined by two microarray platforms, GPL570 and GPL96. c OS according to UBE2E1 high (red) and UBE2E1 low (blue) in the validation cohort. d Relative UBE2E1 expression in patients with different performance status. e Relative UBE2E1 expression in patients with different responses to induction chemotherapy, no response (NR) vs. complete response (CR). (*p < 0.05, **p < 0.01)

Quantitative RT-PCR
Total RNA was extracted from patient bone marrow cells with the RNeasy Mini Kit (QIAGEN) according to the manufacturer's instructions. The expression of target genes was analyzed by qPCR using the SYBR green real-time PCR system (Bio-Rad). The expression of the housekeeping gene GAPDH was used as an internal control. The primers used are listed in Table 1.
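The source does not state how relative expression was computed from the qPCR readout. A common convention is the 2^-ΔCt method with the internal control (here GAPDH) as the reference. The sketch below assumes that convention, and the Ct values and function names are hypothetical. It also shows the median split into high and low expression groups used in the validation cohort.

```python
from statistics import median

def relative_expression(ct_target, ct_reference):
    """Relative expression of a target gene versus a reference gene (2^-dCt)."""
    return 2.0 ** -(ct_target - ct_reference)

def split_by_median(values):
    """Label each sample 'high' or 'low' relative to the cohort median.

    Assumption: values equal to the median are labeled 'low'.
    """
    cut = median(values)
    return ["high" if v > cut else "low" for v in values]

# Hypothetical Ct values for four samples
ct_ube2e1 = [24.0, 26.0, 23.0, 27.0]
ct_gapdh = [20.0, 20.0, 20.0, 20.0]

rel = [relative_expression(t, r) for t, r in zip(ct_ube2e1, ct_gapdh)]
groups = split_by_median(rel)
```

A lower Ct means more template, so the sample with the smallest ΔCt has the highest relative expression.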

Statistical analysis
Patient characteristics in the UBE2E1 high and UBE2E1 low groups were compared using Fisher's exact test. The association of UBE2E1 expression and other prognostic factors with patient survival was investigated using univariable Cox regression and multivariable logistic regression analysis. All of the above statistical analyses were performed in SPSS version 22. Patient survival was graphed and analyzed in GraphPad Prism 5 using its Mantel-Cox test. A p < 0.05 was considered statistically significant.
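For readers unfamiliar with the group comparison, a two-sided Fisher's exact test on a 2x2 table can be sketched in a few lines. This is an illustration only: the authors used SPSS, the function name is an assumption, and the counts below are made up rather than taken from Table 4.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical counts: rows = UBE2E1 high / low, columns = feature present / absent
p_value = fisher_exact_2x2(3, 1, 1, 3)
```

With these toy counts the p value is 34/70, so such a table would not indicate a difference between the groups.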

Microarray-based analysis for AML prognostic gene expression
The microarray-based analysis showed that the expression of each of eight probes (genes) was associated with AML OS in all datasets, for both 1-year and 3-year OS (Fig. 1b). These genes were ACOT11, FAXDC2, FECH, HBD, KLF1, LEF1, SLC25A37, and UBE2E1 (see Table 2 for chi-square and p values). As shown in Fig. 1b, high expression of FAXDC2, FECH, HBD, KLF1, LEF1, and SLC25A37 was a favorable prognostic factor for AML, while high expression of UBE2E1 and ACOT11 was associated with poor OS (p < 0.05). Furthermore, we compared the expression of the target genes in normal BM vs. AML BM. In at least two tested microarray datasets, GSE13159 and GSE1159, all target genes were aberrantly expressed in AML: on average, AML patients had higher ACOT11 and UBE2E1 expression and lower expression of the other genes (Additional file 1: Figure S1). In addition, we conducted multivariable analyses of the microarray datasets. The results revealed that only UBE2E1, LEF1, and FECH were independent prognostic factors in AML, despite the impact of the    (Table 3). Interestingly, among these three prognostic genes, high expression of LEF1 has already been reported as a favorable prognostic factor in cytogenetically normal adult AML [4].

High expression of UBE2E1 is a poor prognostic factor in AML
We validated our findings in an independent cohort of 50 AML patients (median age 43). Target gene expression was analyzed by quantitative RT-PCR. Based on median gene expression, we divided our patients into two study groups, gene high and gene low. The survival analysis showed that, of the eight genes identified by the microarray studies, only one, UBE2E1 (ubiquitin-conjugating enzyme E2E1), was associated with AML OS in our validation cohort. This gene was also one of the three genes with independent prognostic value on multivariable analysis in the training set. The UBE2E1 high group had a markedly shorter OS than the UBE2E1 low group (p = 0.02; Fig. 1c). Expression of the other seven genes was not associated with AML prognosis in our study (p > 0.05; Additional file 1: Figure S2). We could not detect KLF1 expression in AML patient samples, although the qPCR primers for this gene were validated. Next, we performed multivariable analysis to verify the prognostic significance of UBE2E1 expression in our validation cohort. The patient characteristics of the UBE2E1 high and UBE2E1 low groups are shown in Table 4. No significant difference in patient characteristics, such as age, FAB subtype, WBC count, BM blast percentage, and gene mutations, was found between the two groups. We also found no difference in treatment between the two groups (Table 5). UBE2E1 high patients had a shorter OS (p = 0.04) as well as a shorter progression-free survival (p = 0.03) than UBE2E1 low patients after adjusting for the impact of other prognostic factors, including patient age, gender, performance status, and response to induction chemotherapy (Table 6).

UBE2E1 expression and its association with chemotherapy response
Finally, in our validation cohort, low UBE2E1 expression was associated with a better performance status (Fig. 1d; p < 0.05). We also found that UBE2E1 expression was associated with response to induction chemotherapy. Patients with relatively higher UBE2E1 expression were more likely to show no response (NR) to chemotherapy, while patients with lower UBE2E1 expression were more likely to enter complete remission (CR) (Fig. 1e; p < 0.05). This result suggests that UBE2E1 expression may be a predictor of chemotherapy response in AML patients.

Discussion
In this study, we performed a genome-wide screen to identify genes whose expression correlates with adult AML OS. The gene expression profiles (GEPs) from 2 independent datasets of patient samples were used in our analysis. The correlation between the expression of each gene and AML OS was calculated by a program written in R. Only genes that reached statistical significance in both datasets were considered positive results for further testing. With this strategy, we identified 8 AML prognostic genes. Next, we tested our findings using an independent cohort of 50 AML samples. Our results suggested that although several genes, such as HBD and ACOT11, showed a trend toward correlation, only one gene, UBE2E1, was statistically correlated with AML OS in our validation cohort. The negative findings for the other 7 genes in our validation cohort might be caused by the relatively small number of patients. In addition, we noticed that the patients in the microarray-testing cohorts and our validation cohort had different ethnic backgrounds. Further studies might be necessary to draw a firmer conclusion.
Mounting evidence has shown that AML is highly heterogeneous and dynamic [8]. This heterogeneity is seen in the genetic basis, leukemogenesis, pathophysiology, and prognosis of the disease. Nevertheless, clusters of gene expression signatures [5], and even the expression of single genes [4], have been shown to correlate with AML prognosis. How, then, should prognostic single gene expression, such as that of UBE2E1 and LEF1, be interpreted in AML? We hypothesized that different subgroups of AML, with discrete driver mutations, might share upregulation or downregulation of similar epigenetic effectors, which correlates with patient survival. We also hypothesized that these effectors have patterned expression across prognosis-relevant AML subgroups. To test these hypotheses, we performed another microarray-based analysis of UBE2E1 expression in AML with complex karyotype vs. normal karyotype, FLT3 mutation vs. wild-type FLT3, and NPM1 mutation vs. wild-type NPM1. We selected these genetic abnormalities because they occur frequently in AML and correlate with clinical outcome: patients with complex karyotype or FLT3 mutation had poor treatment outcomes, while patients with NPM1 mutation had good treatment outcomes [8]. As shown in Additional file 1: Table S1, complex karyotype or FLT3-mutated AML had relatively high UBE2E1 expression compared with normal karyotype or wild-type FLT3 AML, respectively. NPM1-mutated AML had relatively low UBE2E1 expression. These preliminary findings might indicate that UBE2E1 has a patterned expression that matches AML classification well despite the different genetic bases.
Protein ubiquitination is accomplished by the sequential action of enzymes E1, E2, and E3. Specifically, E2 transfers E1-activated ubiquitin to E3, a ubiquitin ligase, and an isopeptide bond is formed between ubiquitin and the protein substrate. UBE2E1 is a member of the ubiquitin-conjugating enzyme E2 class. Zhu et al. showed that UBE2E1 regulates HOX gene expression by ubiquitinating histones [9]. Thus, UBE2E1 might play a regulatory role in cells by selectively ubiquitinating target proteins. The function of UBE2E1 in cell signaling is still largely unknown. However, the regulation of HOX genes by UBE2E1 might be a key to understanding the prognostic role of UBE2E1 in AML. The HOX genes are a family of highly conserved homeodomain transcription factor genes [10]. There are 39 HOX genes, belonging to 4 gene clusters, in humans. Previous work has shown that HOX genes are aberrantly expressed in AML [11,12]. Animal studies indicated that overexpression of the HOX genes HOXA10 and HOXA9 promoted AML leukemogenesis [13,14]. To identify potential HOX genes downstream of UBE2E1, we started with microarray datasets. We found co-expression of UBE2E1 with HOXA11 in AML. We also examined HOXA11 and UBE2E1 co-expression in our validation cohort (Additional file 1: Figure S3). Interestingly, a recent publication suggested that HOXA11 expression correlated with treatment response and prognosis in glioblastoma patients [15]. Thus, it is possible that UBE2E1 regulates AML chemoresistance through HOXA11.
In our study, we found co-expression of UBE2E1 with the HOX family gene HOXA11 in AML. Therefore, we hypothesized that UBE2E1 regulates HOXA11 expression in AML, and that the level of the HOXA11 transcription factor might be relevant to AML treatment resistance. We are actively conducting further mechanistic studies to clarify the role of UBE2E1 in AML.

Conclusions
In conclusion, we performed a genome-wide, microarray-based analysis of genes whose expression correlated with AML survival and found 8 candidate genes. We further tested these genes in an independent validation cohort of 50 AML samples and found that UBE2E1 expression correlated adversely with AML prognosis.

Additional file
Additional file 1: Figure S1. Aberrant target gene expression in AML. Microarray datasets of AML patient samples, GSE13159 and GSE1159, were downloaded from NCBI. Normal samples in those datasets were bone marrow cells or peripheral blood mononuclear cells (PBMCs) from healthy donors. Target gene expression in normal samples vs. AML patient samples was plotted and compared by Student's t test (**p < 0.01). Figure S2. Target gene correlation with patient survival in the validation cohort. Figure S3. Co-expression of UBE2E1 and HOXA11 in AML. Table S1.