Expression levels of long non-coding RNAs are prognostic for AML outcome
- Arvind Singh Mer†1,
- Johan Lindberg†2,
- Christer Nilsson3,
- Daniel Klevebring1,
- Mei Wang1,
- Henrik Grönberg1,
- Soren Lehmann3, 4 and
- Mattias Rantalainen1
© The Author(s). 2018
Received: 16 January 2018
Accepted: 21 March 2018
Published: 7 April 2018
Long non-coding RNA (lncRNA) expression has been implicated in a range of molecular mechanisms that are central in cancer. However, lncRNA expression has not yet been comprehensively characterized in acute myeloid leukemia (AML). Here, we assess to what extent lncRNA expression is prognostic of AML patient overall survival (OS) and determine if there are indications of lncRNA-based molecular subtypes of AML.
We performed RNA sequencing of 274 intensively treated AML patients in a Swedish cohort and quantified lncRNA expression. Univariate and multivariate time-to-event analyses were applied to determine the association of individual lncRNAs with OS. Unsupervised statistical learning was applied to ascertain whether lncRNA-based molecular subtypes exist and are prognostic.
Thirty-three individual lncRNAs were found to be associated with OS (adjusted p value < 0.05). We established four distinct molecular subtypes based on lncRNA expression using a consensus clustering approach. LncRNA-based subtypes were found to stratify patients into groups with prognostic information (p value < 0.05). Subsequently, lncRNA expression-based subtypes were validated in an independent patient cohort (TCGA-AML). LncRNA subtypes could not be directly explained by any of the recurrent cytogenetic or mutational aberrations, although associations with some of the established genetic and clinical factors were found, including mutations in NPM1, TP53, and FLT3.
The four lncRNA expression-based subtypes discovered in this study are reproducible and can effectively stratify AML patients. LncRNA expression profiling can provide valuable information for improved risk stratification of AML patients.
Acute myeloid leukemia (AML) is a heterogeneous disease at both the molecular and phenotypic levels, caused by malignant transformation of hematopoietic progenitor cells. During pre-leukemic evolution and disease progression, affected hematopoietic cells gradually accumulate a range of molecular alterations, including somatic mutations, cytogenetic abnormalities, epigenetic alterations, and transcriptomic changes [1, 2]. Numerous recurrent point mutations, epigenetic changes, and cytogenetic abnormalities have been identified through next-generation sequencing [1, 3]. Cytogenetics, together with the mutation status of NPM1 and CEBPA and the presence of FLT3 internal tandem duplications (FLT3-ITD), forms the basis of the European LeukemiaNet (ELN) risk classification system, which provides a means for risk stratification of AML patients. However, almost half of patients are classified into the intermediate-risk group. Further improvements in the risk stratification of AML patients could support improved therapy decisions.
LncRNAs are defined as RNA molecules longer than 200 nucleotides that are transcribed but do not code for protein. It has been estimated that more than 58,000 lncRNAs are encoded in the human genome [5, 6]. LncRNAs are involved in a multitude of biological processes that are central to tumorigenesis and cancer progression, including cell cycle regulation, proliferation, apoptosis, migration, and genomic stability [5, 7]. LncRNAs have multiple modes of action, including control of chromatin condensation, regulation of transcription, regulation of RNA splicing, control of RNA stability, and promotion or inhibition of the translation of mRNAs to proteins.
Most large-scale genomic analyses of cancer patient data have focused on the protein-coding region of the genome. However, estimates from the ENCODE study suggest that up to 75% of the human genome is transcribed into RNA, whereas only about 3% of the human genome is protein coding [9, 10]. LncRNAs are a group of non-coding RNAs that several recent discoveries have linked to cancer [11–13]. For example, HOX transcript antisense intergenic RNA (HOTAIR) is known to act as an epigenetic regulator in breast and colorectal cancer [14–16]. Several other lncRNAs are known to play a functional role as oncogenes or tumor suppressors and have clear prognostic potential [14, 17]. Multiple studies have highlighted the role of lncRNAs in hematopoietic cellular development and malignancies. In T cell acute lymphoblastic leukemia (T-ALL), the lncRNA LUNAR1 (leukemia-induced non-coding activator RNA) promotes cell growth via enhanced IGF1R expression. The IRAIN lncRNA, located within the IGF1R locus, directly interacts with the IGF1R promoter. IRAIN has been shown to be downregulated in leukemia cell lines and in high-risk AML patients. Garzon et al. have previously reported lncRNA expression results from a study of cytogenetically normal acute myeloid leukemia (CN-AML) patients using a custom microarray platform for lncRNA expression profiling, with a focus on assessing association with routine clinical phenotypes and mutations. In that study, lncRNAs were reported to be associated with recurrent mutations in several genes in CN-AML patients, including NPM1, CEBPA, IDH2, ASXL1, and RUNX1, and with FLT3-ITD [7, 20]. LncRNA expression has also been shown to be associated with treatment response and survival in several other cancer types [5, 21–23].
Despite growing evidence for the potential importance of lncRNAs as prognostic and diagnostic markers across a multitude of cancers, including AML, lncRNA expression in AML has not been comprehensively characterized to date with a focus on ascertaining the potential presence of prognostic lncRNA-based AML subtypes. In this study, we applied whole-transcriptome RNA sequencing (RNA-seq) with the aim of identifying prognostic lncRNAs, defining novel lncRNA-based AML subtypes, and ascertaining their prognostic value and relevance for risk stratification of AML patients. Furthermore, the novel lncRNA expression-based subtypes were validated in an independent patient cohort.
Table 1 Description of the Clinseq-AML cohort. Rows: number of patients; sex, no. of patients (%); no. of patients aged < 60; de novo AML; median follow-up (days); bone marrow blasts, median (range, %); WBC count, median (range, per mm³); cytogenetic aberrations, N (%); mutations, N (%).
Individual lncRNAs are prognostic of overall survival in AML
Novel lncRNA-based molecular subtypes of AML
LncRNA AML subtypes are prognostic
We validated the lncRNA expression-based subtypes in the independent TCGA-AML cohort (Fig. 4c, d). In the TCGA-AML cohort, the prognostic value of the lncRNA-based subtypes is significant (Fig. 4c, n = 172, p = 0.01, log-rank test). However, for cytogenetically normal patients in the TCGA-AML cohort, the association with survival is not significant (Fig. 4d, p = 0.2, log-rank test), which might be due to the small sample size (n = 78). Details of the validation using the TCGA-AML cohort are provided in the following section.
Nested cross-validation and independent validation of the lncRNA subtype
To determine the consistency of the subtype discovery, we implemented a nested cross-validation procedure that is analogous to repeatedly splitting our cohort into a training set (for model fitting, including parameter estimation) and an independent subset of patients (test set) for evaluating the model with respect to prognostic value. The misclassification rate of test set samples was low, with an overall classification accuracy of 85% in the nested cross-validation procedure (Additional file 1: Figure S7), using the class labels assigned in the primary subtype discovery phase as reference. Cross-validation of the lncRNA subtypes also revealed significant prognostic value for overall survival (p = 0.012; Additional file 1: Figure S8). These results indicate that the lncRNA subtypes, the prediction model, and the prognostic value of the subtypes are robust.
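As an illustration of the splitting step described above, the following is a minimal sketch of a single level of k-fold partitioning together with the accuracy measure. It is not the authors' implementation: the function names `k_fold_indices` and `accuracy`, the seed, and the toy labels are hypothetical, and a full nested procedure would repeat the subtype discovery and classifier training inside each training set.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k roughly equal folds.

    Hypothetical helper: shuffles sample indices once, then uses each
    fold in turn as the test set and the remaining folds for training.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def accuracy(reference, predicted):
    """Fraction of samples whose predicted label matches the reference label."""
    matches = sum(r == p for r, p in zip(reference, predicted))
    return matches / len(reference)


# Toy usage: 274 samples (the Clinseq-AML cohort size) split into 10 folds.
for train, test in k_fold_indices(274, k=10, seed=1):
    # In the real analysis, clustering and the classifier would be fitted on
    # `train` only, and subtype labels would then be predicted for `test`.
    pass

# Made-up labels: three of four predictions agree with the reference.
print(accuracy(["G1", "G2", "G3", "G4"], ["G1", "G2", "G3", "G1"]))
```

Keeping the test fold out of both the clustering and the classifier fitting is what makes the resulting accuracy an estimate of performance on unseen patients.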
Next, we assessed the reproducibility of the newly discovered AML subtypes using the independent TCGA-AML cohort. To handle intrinsic batch differences between the Clinseq and TCGA studies, we applied batch correction to the Clinseq and TCGA lncRNA expression data. We trained a random forest model for subtype classification based on the Clinseq data and subsequently predicted subtypes in the TCGA-AML cohort. A list of lncRNAs selected using random forest models can be found in Additional file 2. Based on the predicted subtype labels in the TCGA cohort, we then assessed the prognostic information with respect to overall survival (Fig. 4c and Additional file 1: Figure S5B). In the TCGA-AML cohort, the prognostic value of the four lncRNA-based subtypes was found to be significant (n = 172, p = 0.01, log-rank test). In concordance with the Clinseq-AML cohort, subtype G1 (n = 30) has the best survival outcome in the TCGA-AML cohort, with overall survival at 60 months (5 years) of 40 ± 12% (mean ± SE). Similarly, subtypes G2 (n = 43) and G3 (n = 69) show intermediate survival, with overall survival at 60 months of 31 ± 8% and 22 ± 87% (mean ± SE), respectively. As in Clinseq, subtype G4 (n = 30) has the worst survival outcome in the TCGA cohort, with no patient surviving at 60 months (Fig. 4c). We also fitted a multivariable Cox proportional hazards model, adjusting for age, sex, and established prognostic markers using TCGA clinical and mutation data (Fig. 5b). The prognostic value of this model was also found to be significant (p = 1.25 × 10⁻⁹). Compared with the reference group G1 in this model, subtype G4 differed significantly in overall survival (p = 4.48 × 10⁻³). We also evaluated the prognostic performance in the subset of cytogenetically normal patients in the TCGA-AML cohort. In this subset of patients, the association with overall survival was not significant (Fig. 4d, p = 0.2, log-rank test). However, this might be due to the small sample size (n = 78).
LncRNA expression subtypes are partially associated with clinicopathological factors
Table: Association analysis of lncRNA-derived molecular subtypes with established somatic aberrations and other risk factors (reported as adjusted p values).
Patients in subtype G1 are enriched for CEBPA mutations (2.56% single and 1.46% double mutations). CEBPA double mutations have been associated with favorable outcome in AML [27, 28]. Subtype G2 is enriched for NPM1 mutations (6.93%) but has a low percentage of TP53 mutations (1.09%). Subtype G3 contains a substantial number of FLT3-ITD cases and is also enriched for CEBPA single and double mutations. Subtype G4 harbors a high percentage of TP53 mutations (4.75%) and also contains the highest percentage (8.08%) of patients classified as high risk by the ELN risk classification system.
We found that the lncRNA expression-based subtypes were independent of the European LeukemiaNet (ELN) risk classification system, and the distribution of the ELN risk score is fairly even across all four groups (Fig. 1 and Additional file 1: Table S19). We further stratified each ELN risk category using the lncRNA subtypes (Additional file 1: Figure S9). These results indicate that, within each ELN risk category, lncRNA subtypes can provide further stratification of patients. Although the lncRNA-based subtypes were not found to be highly concordant with any specific mutations, cytogenetics, or clinical factors, we found that mutations in NPM1 and TP53 were associated with the lncRNA-based subtypes (chi-square test p values of 1.09 × 10⁻⁵ and 2.99 × 10⁻³, respectively; see Additional file 1: Tables S1 to S20 for details).
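To make the association test above concrete, the following is a minimal sketch of a Pearson chi-square test of independence on a subtype-by-mutation contingency table, using only the Python standard library. The function names and the counts in the usage example are hypothetical and are not taken from the study; no continuity correction is applied, and the tail probability uses the closed-form recursion valid for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        shape, q = 1.0, math.exp(-half)              # exact value for df = 2
    else:
        shape, q = 0.5, math.erfc(math.sqrt(half))   # exact value for df = 1
    # Recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while shape < df / 2.0 - 1e-9:
        q += half ** shape * math.exp(-half) / math.gamma(shape + 1.0)
        shape += 1.0
    return q


def chi_square_test(table):
    """Return (statistic, degrees of freedom, p value) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df, chi2_sf(stat, df)


# Made-up counts: rows are subtypes G1..G4, columns are (mutated, wild type).
toy_table = [[12, 40], [30, 50], [20, 70], [5, 47]]
stat, df, p = chi_square_test(toy_table)
print(f"chi2 = {stat:.2f}, df = {df}, p = {p:.4g}")
```

With four subtypes and a binary mutation status, the test has three degrees of freedom; a small p value indicates only that mutation frequency differs between subtypes, not that the subtypes are defined by the mutation.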
Pathway analysis of genes associated with lncRNA-based subtypes
LncRNA-based subtypes are not concordant with mRNA-based subtypes
The present study is the most comprehensive lncRNA expression study in AML to date. We characterized lncRNA expression using RNA sequencing in a cohort of 274 AML patients (data included in Additional file 6) with the aim to determine if individual lncRNAs were associated with AML outcome and if lncRNA-based prognostic subtypes of AML could be defined. The findings were subsequently validated in the independent TCGA-AML cohort (Additional file 7).
In the Clinseq-AML cohort, 33 individual lncRNAs were found to carry independent prognostic information, and four robust lncRNA-based subtypes of AML were discovered that are prognostic of overall survival. Some of the established clinical and genetic factors of AML were found to be associated with the lncRNA expression subtypes, although the subtypes did not display a high degree of concordance with any of the clinical or genetic factors. Similarly, lncRNA-based subtypes were not found to be concordant with mRNA-based subtypes, suggesting that lncRNA expression represents an independent source of molecular information. Subtype G1 showed the longest overall survival. This group is dominated by intermediate ELN risk and normal karyotypes, and it harbors a high frequency of CEBPA double mutations. In de novo AML, CEBPA double mutations are known to have favorable prognostic significance [27, 28]. Subtypes G2 and G3 represent prognostically poorer AML subtypes. Both have a high frequency of patients with intermediate risk based on the ELN risk classification, and compared with subtype G1, they carry more cytogenetic abnormalities. Subtype G4 represents a group of AML patients with poor prognosis and the highest frequency of TP53 single and double mutations. When the independent prognostic value of the lncRNA subtypes was assessed given the ELN risk classification (which includes cytogenetic classification) and genetic mutations, the lncRNA subtype model was confirmed to provide significant prognostic value. We have also developed a subtype prediction biomarker panel consisting of 35 lncRNAs (Additional file 2), which provided classification equivalent to that of the full set of lncRNA features considered in this study and could be seen as a candidate biomarker panel for lncRNA-based subtyping in AML.
We have validated our lncRNA expression-based subtype model in the independent TCGA-AML cohort. Our results show that, as in the Clinseq-AML cohort, the lncRNA-based subtypes in the TCGA-AML cohort are significantly associated with overall survival. In particular, subtype G1 is associated with a more favorable outcome and subtype G4 with a worse outcome. These associations are evident in both cohorts even after adjusting for known prognostic factors through multivariate analysis.
The Clinseq-AML and TCGA-AML cohorts have similar percentages of cytogenetically normal patients, 47.4% and 45.1% respectively. Cytogenetic abnormalities such as del7 (9.9% in Clinseq-AML, 9.9% in TCGA-AML) and del5 (6.2% in Clinseq-AML, 5.6% in TCGA-AML) have very similar distributions in the two cohorts. However, the frequencies of recurrent genetic abnormalities such as inv(16) (3.3% in Clinseq-AML, 7.7% in TCGA-AML) and inv(3) (1.8% in Clinseq-AML, 0% in TCGA-AML) differ. Notably, the Clinseq-AML cohort contains both de novo and non-de novo AML patients, whereas the TCGA-AML cohort consists entirely of de novo AML cases. We performed differential gene expression analysis between de novo and non-de novo samples in the Clinseq-AML cohort (Additional file 8). However, we did not find any significant difference in lncRNA expression pattern between de novo and non-de novo AML, as no lncRNA was significantly differentially expressed (FDR < 0.05).
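The FDR threshold mentioned above refers to p values adjusted for multiple testing. The text does not state which adjustment procedure was used, so the sketch below assumes the widely used Benjamini-Hochberg method purely for illustration; the function name and the p values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the original order.

    Assumption: BH is used here only as a common example of FDR control;
    it is not confirmed to be the procedure applied in the study.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw p values for four hypothetical lncRNAs.
raw = [0.01, 0.04, 0.03, 0.5]
adjusted = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adjusted) if q < 0.05]
print(adjusted, significant)
```

A feature is called significant at FDR < 0.05 only if its adjusted value falls below the threshold, which is a stricter criterion than a raw p value below 0.05 when many lncRNAs are tested at once.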
We would like to stress that there are several differences between the Clinseq and TCGA cohorts, such as differences in sequencing protocol, batch effects, and the frequency of recurrent genetic abnormalities, as discussed above. Our analysis shows that despite the various sources of heterogeneity and cohort differences, lncRNA expression-based subtypes are consistent and have a significant association with survival. Previously, Garzon et al. studied lncRNA expression in cytogenetically normal acute myeloid leukemia (CN-AML) patients using a custom microarray platform, with a focus on assessing the association of lncRNAs with routine clinical phenotypes and mutations. In contrast, the present study contains a more representative set of AML patients and ascertains the presence of lncRNA-based molecular subtypes in AML. Furthermore, the present study is almost twice the size of the previously published study, which included only CN-AML patients. We also note that RNA sequencing, which is employed here, provides an unbiased and comprehensive approach to lncRNA profiling compared with targeted microarray-based expression profiling, which may be limited by selection bias during design of the array. Despite such differences, similar to Garzon et al., our results show that pathways such as mRNA processing, immune system process, and chromosome organization are enriched in lncRNA subtypes G1, G3, and G4 respectively (Fig. 6 and Additional file 3).
We have also compared lncRNA expression-based subtypes with mRNA expression-based subtypes (C1 to C7). The mRNA subtypes were generated using the same methodology as lncRNA expression-based subtypes (for details, see Additional file 5). Our analysis shows that lncRNA-based subtypes are not directly correlated with mRNA-based subtypes and lncRNA subtypes provide independent prognostic information.
Although the present study is the largest lncRNA expression study reported to date, the sample size in this study might represent a limiting factor to establishing potential additional lncRNA subtypes that are rare (i.e., present in a low proportion of AML patients), since there would be too few principal examples present in this cohort. Furthermore, the RNAseq-based lncRNA profiling method applied in this study has limitations in quantifying lncRNA molecules at very low abundances. These limitations can be overcome by using a larger sample size and deeper sequencing technology.
Expression profiles of lncRNAs have previously been studied in several cancer types, including proposed lncRNA subtypes [30–33]. However, in the context of hematological malignancies, only a few studies have focused on the role of lncRNA expression. Moreover, these studies have focused on risk prediction and were limited to a specific subset of AML. Our analysis is the first to provide lncRNA-based stratification of AML patients by means of lncRNA subtypes. The proposed subtypes are characterized by distinct molecular profiles defined by lncRNA expression, which also provide prognostic information. LncRNA expression and the related molecular subtypes provide a promising avenue for improved patient stratification in the future, and the information about lncRNA expression offers a starting point for functional studies.
For detailed materials and methods, refer to the supplementary information provided in Additional file 5. A brief description is as follows:
We used the Clinseq-AML cohort, consisting of 274 AML patients treated according to the national guidelines in Sweden. The study was approved by the regional ethical review board in Stockholm, Sweden. All samples from the Clinseq-AML cohort were collected prior to the initiation of treatment. For detailed characteristics of patients in the Clinseq-AML cohort, see Table 1. In this study, we used data from 142 patients of the TCGA-AML study who received intensive induction treatment (chemotherapy) analogous to the Clinseq-AML cohort. Clinical and mutational data were retrieved from the TCGA data portal (https://gdc.cancer.gov) and the TCGA-AML study publication. Detailed characteristics of the TCGA-AML cohort can be found in Additional file 7.
Sequencing and bioinformatics processing
Transcriptomic RNA and a somatic mutation gene panel were sequenced using the Illumina HiSeq-2500 platform. Ribosomal RNA depletion was performed using the Ribo-Zero gold kit. HTSeq count version 0.6.1 was used for gene expression estimation. RNA-seq count data were normalized using the TMM method. A total of 3030 lncRNAs were annotated using the MiTranscriptome database.
Subtype discovery and validation
Consensus clustering-based unsupervised learning was applied for subtype discovery. The optimal number of clusters (k = 4) was determined using the weighted silhouette index. For validation, we first performed 10-fold cross-validation on the Clinseq-AML data. At each cross-validation round, the data were randomly divided into a training set and a test set. Unsupervised learning was performed on the training set, and the resulting labels were used to train a random forest model. Labels for the test set were predicted using this model.
For independent validation, lncRNAs common to the Clinseq and TCGA datasets were selected as features and batch correction was applied. We trained a random forest classifier on the batch-corrected Clinseq-AML data, and subtype labels were predicted for the TCGA-AML data.
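The silhouette criterion used above to choose the number of clusters can be illustrated with a short sketch. The weighting scheme behind the "weighted silhouette index" is not described in this section, so the code below computes the plain (unweighted) mean silhouette with Euclidean distance as a stand-in; the function name and the toy points are hypothetical.

```python
import math


def mean_silhouette(points, labels):
    """Mean silhouette width over all samples (requires at least two clusters).

    For each sample, a is the mean distance to the other members of its own
    cluster and b is the smallest mean distance to any other cluster; the
    silhouette is (b - a) / max(a, b). Singleton clusters score 0 by convention.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy data: two well-separated pairs of points in two dimensions.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(mean_silhouette(toy_points, [0, 0, 1, 1]))  # sensible grouping
print(mean_silhouette(toy_points, [0, 1, 0, 1]))  # poor grouping
```

In a model selection setting, this score would be computed for each candidate k and the value of k with the highest score preferred, which is the general logic behind selecting k = 4.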
Clinical association and survival analysis
For association analyses, the chi-square test was used. Overall survival was measured from the date of diagnosis to the date of death. Kaplan-Meier curves and the non-parametric log-rank statistic were used for comparison. Univariable and multivariable Cox proportional hazards regression models were fitted to the survival data. In the multivariable analysis, we adjusted for age, sex, etiology, ELN score, and mutational status of genes. Analysis was carried out using R (version 3.1.1).
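The survival comparison described above can be sketched with a Kaplan-Meier estimator and a log-rank test in plain Python. The study reports using R, so this is only an illustrative re-expression: the function names and the toy follow-up times are hypothetical, and the log-rank function handles two groups, whereas comparing four subtypes would require the k-group generalization.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    `events` holds 1 for an observed death and 0 for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p value, 1 df).

    Assumes both groups contribute to the risk set at some event time,
    otherwise the variance is zero and the statistic is undefined.
    """
    obs = [(t, e, 0) for t, e in zip(times_a, events_a)]
    obs += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in obs if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in obs if u >= t)
        n_a = sum(1 for u, _, g in obs if u >= t and g == 0)
        d = sum(1 for u, e, _ in obs if u == t and e)
        d_a = sum(1 for u, e, g in obs if u == t and e and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    stat = (observed_a - expected_a) ** 2 / variance
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Made-up follow-up times in months (1 = death observed, 0 = censored).
print(kaplan_meier([5, 10, 10, 20, 30], [1, 1, 0, 1, 0]))
print(logrank_two_groups([1, 2, 3, 4], [1, 1, 1, 1], [10, 11, 12, 13], [1, 1, 1, 1]))
```

Censored patients leave the risk set without lowering the survival estimate, which is why the Kaplan-Meier curve steps down only at times where a death is observed.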
This study is supported by grants from the Swedish research council (Vetenskapsrådet - Unga forskare), CRisP (Vetenskapsrådet Linnaeus grant), Swedish Cancer Society (Cancerfonden), Swedish e-Science Research Centre (SERC)—“e-Science for Cancer Prevention and Control (ecpc)”, the Strategic Research Programme in Cancer (StratCan) at Karolinska Institutet, and the Stockholm County Council.
Availability of data and materials
The dataset supporting the conclusions of this article is included within the article (and its additional files).
ASM performed analyses and drafted the manuscript. JL designed and carried out the molecular profiling and mutation analysis. DK carried out bioinformatics pre-processing. MW contributed to statistical analyses. CN collected and assembled clinical data. HG and SL designed and collected material for the Clinseq-AML cohort and provided clinical expertise. All authors contributed to writing the manuscript. MR conceived and supervised the study and wrote the manuscript. All authors reviewed and approved the final manuscript.
Ethics approval and consent to participate
Ethical approval for this study was given by the regional ethical review board in Stockholm, Sweden.
Consent for publication
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
1. Cancer Genome Atlas Research Network. Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia. N Engl J Med. 2013;368:2059–74.
2. Marcucci G, Haferlach T, Dohner H. Molecular genetics of adult acute myeloid leukemia: prognostic and therapeutic implications. J Clin Oncol. 2011;29:475–86.
3. Ley TJ, Mardis ER, Ding L, Fulton B, McLellan MD, Chen K, Dooling D, Dunford-Shore BH, McGrath S, Hickenbotham M, et al. DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome. Nature. 2008;456:66–72.
4. Döhner H, Estey EH, Amadori S, Appelbaum FR, Büchner T, Burnett AK, Dombret H, Fenaux P, Grimwade D, Larson RA, et al. Diagnosis and management of acute myeloid leukemia in adults: recommendations from an international expert panel, on behalf of the European LeukemiaNet. Blood. 2010;115:453–74.
5. Huarte M. The emerging role of lncRNAs in cancer. Nat Med. 2015;21:1253–61.
6. Iyer MK, Niknafs YS, Malik R, Singhal U, Sahu A, Hosono Y, Barrette TR, Prensner JR, Evans JR, Zhao S, et al. The landscape of long noncoding RNAs in the human transcriptome. Nat Genet. 2015;47:199–208.
7. Garzon R, Volinia S, Papaioannou D, Nicolet D, Kohlschmidt J, Yan PS, Mrózek K, Bucci D, Carroll AJ, Baer MR, et al. Expression and prognostic impact of lncRNAs in acute myeloid leukemia. Proc Natl Acad Sci. 2014;111:18679–84.
8. Silva A, Bullock M, Calin G. The clinical relevance of long non-coding RNAs in cancer. Cancers. 2015;7:2169–82.
9. Ling H, Fabbri M, Calin GA. MicroRNAs and other non-coding RNAs as targets for anticancer drug development. Nat Rev Drug Discov. 2013;12:847–65.
10. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74.
11. Sahu A, Singhal U, Chinnaiyan AM. Long noncoding RNAs in cancer: from function to translation. Trends Cancer. 2015;1:93–109.
12. Yang G, Lu X, Yuan L. LncRNA: a link between RNA and cancer. Biochim Biophys Acta Gene Regul Mech. 2014;1839:1097–109.
13. Cheetham SW, Gruhl F, Mattick JS, Dinger ME. Long noncoding RNAs and the genetics of cancer. Br J Cancer. 2013;108:2419–25.
14. Qi P, Du X. The long non-coding RNAs, a new cancer diagnostic and therapeutic gold mine. Mod Pathol. 2013;26:155–65.
15. Xue Y, Gu DY, Ma GX, Zhu LJ, Hua QH, Chu HY, Tong N, Chen JF, Zhang ZD, Wang ML. Genetic variants in lncRNA HOTAIR are associated with risk of colorectal cancer. Mutagenesis. 2015;30:303–10.
16. Kogo R, Shimamura T, Mimori K, Kawahara K, Imoto S, Sudo T, Tanaka F, Shibata K, Suzuki A, Komune S, et al. Long non-coding RNA HOTAIR regulates Polycomb-dependent chromatin modification and is associated with poor prognosis in colorectal cancers. Cancer Res. 2011;71:6320–6.
17. Bhan A, Soleimani M, Mandal SS. Long noncoding RNA and cancer: a new paradigm. Cancer Res. 2017;77:3965–81.
18. Trimarchi T, Bilal E, Ntziachristos P, Fabbri G, Dalla-Favera R, Tsirigos A, Aifantis I. Genome-wide mapping and characterization of notch-regulated long noncoding RNAs in acute leukemia. Cell. 2014;158:593–606.
19. Sun JN, Li W, Sun YP, Yu DH, Wen X, Wang H, Cui JW, Wang GJ, Hoffman AR, Hu JF. A novel antisense long noncoding RNA within the IGF1R gene locus is imprinted in hematopoietic malignancies. Nucleic Acids Res. 2014;42:9588–601.
20. Hughes JM, Salvatori B, Giorgi FM, Bozzoni I, Fatica A. CEBPA-regulated lncRNAs, new players in the study of acute myeloid leukemia. J Hematol Oncol. 2014;7:69.
21. Malek E, Jagannathan S, Driscoll JJ. Correlation of long non-coding RNA expression with metastasis, drug resistance and clinical outcome in cancer. Oncotarget. 2014;5:8027–38.
22. Sun J, Chen X, Wang Z, Guo M, Shi H, Wang X, Cheng L, Zhou M. A potential prognostic long non-coding RNA signature to predict metastasis-free survival of breast cancer patients. Sci Rep. 2015;5:16553.
23. Zhou M, Guo MN, He DF, Wang XJ, Cui YQ, Yang HX, Hao DP, Sun J. A potential signature of eight long non-coding RNAs predicts survival in patients with non-small cell lung cancer. J Transl Med. 2015;13:231.
24. Wilkerson MD, Hayes DN. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics. 2010;26:1572–3.
25. Johnson WE, Li C, Rabinovic A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics. 2007;8:118–27.
26. Breiman L. Random forests. Mach Learn. 2001;45:5–32.
27. Preudhomme C, Sagot C, Boissel N, Cayuela JM, Tigaud I, de Botton S, Thomas X, Raffoux E, Lamandin C, Castaigne S, et al. Favorable prognostic significance of CEBPA mutations in patients with de novo acute myeloid leukemia: a study from the Acute Leukemia French Association (ALFA). Blood. 2002;100:2717–23.
28. Wouters BJ, Lowenberg B, Erpelinck-Verschueren CA, van Putten WL, Valk PJ, Delwel R. Double CEBPA mutations, but not single CEBPA mutations, define a subgroup of acute myeloid leukemia with a distinctive gene expression profile that is uniquely associated with a favorable outcome. Blood. 2009;113:3088–91.
29. Ebisuya M, Yamamoto T, Nakajima M, Nishida E. Ripples from neighbouring transcription. Nat Cell Biol. 2008;10:1106–13.
30. Chen H, Xu J, Hong J, Tang R, Zhang X, Fang JY. Long noncoding RNA profiles identify five distinct molecular subtypes of colorectal cancer with clinical relevance. Mol Oncol. 2014;8:1393–403.
31. Yang J, Lin J, Liu T, Chen T, Pan S, Huang W, Li S. Analysis of lncRNA expression profiles in non-small cell lung cancers (NSCLC) and their clinical subtypes. Lung Cancer. 2014;85:110–5.
32. Su X, Malouf GG, Chen Y, Zhang J, Yao H, Valero V, Weinstein JN, Spano JP, Meric-Bernstam F, Khayat D, Esteva FJ. Comprehensive analysis of long non-coding RNAs in human breast cancer clinical subtypes. Oncotarget. 2014;5:9864–76.
33. Du Z, Fei T, Verhaak RG, Su Z, Zhang Y, Brown M, Chen Y, Liu XS. Integrative genomic analyses reveal clinically relevant long noncoding RNAs in human cancer. Nat Struct Mol Biol. 2013;20:908–13.
34. Anders S, Pyl PT, Huber W. HTSeq - a Python framework to work with high-throughput sequencing data. Bioinformatics. 2015;31:166–9.
35. Robinson MD, Oshlack A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 2010;11:R25.