
Precision oncology in AML: validation of the prognostic value of the knowledge bank approach and suggestions for improvement


Recently, a novel knowledge bank (KB) approach to predict outcomes of individual patients with acute myeloid leukemia (AML) was developed using unbiased machine learning. To validate its prognostic value, we analyzed 1612 adults with de novo AML treated on Cancer and Leukemia Group B front-line trials who had pretreatment clinical, cytogenetics, and mutation data on 81 leukemia/cancer-associated genes available. We used receiver operating characteristic (ROC) curves and the area under the curve (AUC) to evaluate the predictive values of the KB algorithm and other risk classifications. The KB algorithm predicted 3-year overall survival (OS) probability in the entire patient cohort (AUCKB = 0.799), and both younger (< 60 years) (AUCKB = 0.747) and older patients (AUCKB = 0.770). The KB algorithm predicted non-remission death (AUCKB = 0.860) well but was less accurate in predicting relapse death (AUCKB = 0.695) and death in first complete remission (AUCKB = 0.603). The KB algorithm’s 3-year OS predictive value was higher than that of the 2017 European LeukemiaNet (ELN) classification (AUC2017ELN = 0.707, p < 0.001) and 2010 ELN classification (AUC2010ELN = 0.721, p < 0.001) but did not differ significantly from that of the 17-gene stemness score (AUC17-gene = 0.732, p = 0.10). Analysis of additional cytogenetic and molecular markers not included in the KB algorithm revealed that taking into account atypical complex karyotype, infrequent recurrent balanced chromosome rearrangements and mutational status of the SAMHD1, AXL and NOTCH1 genes may improve the KB algorithm. We conclude that the KB algorithm has a high predictive value that is higher than those of the 2017 and 2010 ELN classifications. Inclusion of additional genetic features might refine the KB algorithm.

To the Editor,

Risk-stratification schemas based on cytogenetic data and mutational status of selected genes, such as the 2010 and 2017 ELN genetic-risk classifications [1, 2], are widely used to predict the outcomes of patients with AML and to guide therapeutic decisions. To increase the accuracy of outcome prediction for individual patients, Gerstung et al. [3] developed a novel knowledge bank (KB) algorithm, which combined data on pretreatment clinical, cytogenetic, and gene mutation characteristics, treatment received, and outcomes from 1540 German AML patients. Testing of several machine learning models revealed that inclusive, multistage statistical models scored best in predicting OS and the probabilities of non-remission death, relapse death, and death in first complete remission (CR1). Although a relatively small study [4] confirmed the prognostic usefulness of the KB approach, to our knowledge it has not yet been validated in a large, independent patient cohort. Therefore, we applied the KB algorithm to 1612 adults with de novo AML and investigated whether additional cytogenetic and molecular alterations might improve its accuracy. Patients who received an allogeneic stem-cell transplantation in CR1 were excluded from the analyses (Additional file 1).

We used ROC curves and the AUC to assess the ability of the KB approach to predict 3-year OS probability in comparison with the actual patient outcomes. The KB algorithm had a high AUC for the entire patient cohort (AUCKB = 0.799, 95% CI 0.777–0.821), for younger (< 60 years) patients (AUCKB = 0.747, 95% CI 0.717–0.776), and for older (≥ 60 years) patients (AUCKB = 0.770, 95% CI 0.716–0.824), for whom risk stratification is more difficult because their prognosis is generally poor (Fig. 1a–c).
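The AUC reported here can be read as the probability that a randomly chosen patient with the event received a higher predicted risk than a randomly chosen patient without it. The following is a minimal sketch of that calculation with a percentile bootstrap confidence interval. The letter does not state how its confidence intervals were obtained, so the bootstrap is an assumption, and the function names and toy data are hypothetical.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    scores: predicted risk of the event (higher = more likely)
    labels: 1 if the event occurred (e.g. death within 3 years), else 0
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    # A pair counts 1 if the event case is ranked higher, 0.5 if tied
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # keep only resamples containing both classes
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data, not taken from the study
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_auc = auc(toy_scores, toy_labels)  # 8 of 9 pairs ranked correctly
toy_ci = bootstrap_ci(toy_scores, toy_labels, n_boot=200)
```

The pairwise loop is quadratic in the number of patients, which is fine for illustration but slow for repeated resampling of a cohort of this size; a rank-based implementation would be preferable in practice.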

Fig. 1 The receiver operating characteristic (ROC) curves illustrating the ability of the knowledge bank (KB) algorithm to predict 3-year overall survival rates in (a) the whole AML patient cohort, (b) younger adults with AML and (c) older adults with AML. The ROC curves illustrating the ability of the KB algorithm to predict additional outcome endpoints: (d) non-remission death, (e) relapse death and (f) death in first complete remission. The ROC curves illustrating the abilities of the KB algorithm (blue line), 2017 European LeukemiaNet (ELN) genetic-risk classification (gray line) and 2010 ELN genetic-risk classification (magenta line) to predict 3-year overall survival rates in (g) the whole cohort of patients with AML and (h) patients who did not die early. (i) The ROC curves showing the abilities of the KB algorithm (blue line) and the 17-gene stemness score (magenta line) to predict 3-year overall survival rates in 863 patients with RNA expression data available

Concerning other outcome endpoints, the KB algorithm was excellent at predicting non-remission death (i.e., death within 3 years after diagnosis without achievement of CR1), with an AUCKB of 0.860 (95% CI 0.838–0.882). For relapse death (i.e., death of patients who achieved CR1, relapsed, and died within the first 3 years), the predictive ability of the KB approach was lower (AUCKB = 0.695, 95% CI 0.662–0.727). It was lower still for death in CR1, with a poor AUCKB of 0.603 (95% CI 0.537–0.670; Fig. 1d–f).
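The three death endpoints defined above are mutually exclusive, which a short sketch can make explicit. The record layout, field names, and labels below are hypothetical, the treatment of a death at exactly 3 years as "within 3 years" is an assumption, and censoring (patients with less than 3 years of follow-up) is deliberately ignored.

```python
from dataclasses import dataclass
from typing import Optional

HORIZON_YEARS = 3.0  # prediction horizon used in the letter


@dataclass
class Patient:
    achieved_cr1: bool                 # reached first complete remission
    relapsed: bool                     # relapsed after CR1
    death_time_years: Optional[float]  # None if alive at last follow-up


def endpoint(p: Patient) -> str:
    """Assign one of the outcome categories at the 3-year horizon."""
    died = p.death_time_years is not None and p.death_time_years <= HORIZON_YEARS
    if not died:
        return "alive_at_3_years"
    if not p.achieved_cr1:
        return "non_remission_death"   # died without ever achieving CR1
    if p.relapsed:
        return "relapse_death"         # achieved CR1, relapsed, then died
    return "death_in_cr1"              # died while still in CR1


# Toy records, not taken from the study
examples = [
    Patient(achieved_cr1=False, relapsed=False, death_time_years=0.4),
    Patient(achieved_cr1=True, relapsed=True, death_time_years=1.8),
    Patient(achieved_cr1=True, relapsed=False, death_time_years=2.5),
    Patient(achieved_cr1=True, relapsed=True, death_time_years=4.2),
]
categories = [endpoint(p) for p in examples]
```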

Next, we compared the predictive values of the KB approach and of two well-established genetic-risk classifications, the 2010 [1, 5, 6] and 2017 ELN [2, 7, 8] classifications. Among all patients, the KB approach had the highest predictive value with AUCKB = 0.799 (95% CI 0.777–0.821), followed by the 2010 ELN classification (AUC2010ELN = 0.721, 95% CI 0.696–0.746) and the 2017 ELN classification (AUC2017ELN = 0.707, 95% CI 0.682–0.732; Fig. 1g). Compared directly, the KB approach was significantly better than both the 2017 (p < 0.001) and 2010 (p < 0.001) ELN classifications.
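A comparison of two AUCs measured on the same patients has to account for the pairing of the predictions. The letter does not name the test behind its p values, so the sketch below uses one common option as an assumption: a paired bootstrap of the AUC difference with a normal approximation. A continuous score stands in for the KB prediction and an ordinal risk group (0, 1, 2) stands in for an ELN-style category; all names and data are hypothetical.

```python
import random
from statistics import NormalDist, stdev


def auc(scores, labels):
    """Pairwise AUC; tied scores (common for risk categories) count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def compare_aucs(scores_a, scores_b, labels, n_boot=2000, seed=0):
    """Return (AUC_a - AUC_b, two-sided p value) from a paired bootstrap."""
    rng = random.Random(seed)
    n = len(labels)
    observed = auc(scores_a, labels) - auc(scores_b, labels)
    diffs = []
    while len(diffs) < n_boot:
        # Resample patients, keeping both predictions of each patient together
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:
            a = auc([scores_a[i] for i in idx], ys)
            b = auc([scores_b[i] for i in idx], ys)
            diffs.append(a - b)
    # Assumes the bootstrap differences are not all identical (stdev > 0)
    z = observed / stdev(diffs)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return observed, p_value


# Toy data, not taken from the study
labels = [1, 1, 1, 1, 0, 0, 0, 0]
continuous_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
risk_group = [2, 1, 1, 0, 1, 0, 2, 0]  # 0 = favorable ... 2 = adverse
difference, p_value = compare_aucs(continuous_score, risk_group, labels)
```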

When we performed the aforementioned comparisons after excluding early death patients, the KB approach still outperformed both the 2010 and 2017 ELN classifications, but the differences among classifications were smaller than in the entire patient cohort (Fig. 1h; Additional file 1).

We also compared the predictive value of the KB approach [3] with another AML risk classification, the 17-gene stemness score [9, 10], which is calculated as the weighted sum of the normalized expression values of 17 genes whose expression differs between leukemia stem cells and leukemic bulk blasts [9]. Among our 863 patients with RNA expression data available, the predictive values of the KB approach (AUCKB = 0.764, 95% CI 0.733–0.800) and of the 17-gene stemness score (AUC17-gene = 0.732, 95% CI 0.700–0.765) did not differ significantly (p = 0.10; Fig. 1i).
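The weighted-sum construction of an expression score such as the 17-gene stemness score can be sketched in a few lines. The gene names and weights below are placeholders chosen for easy arithmetic, not the published 17 genes or their coefficients, and the function name is hypothetical.

```python
def weighted_expression_score(expression, weights):
    """Weighted sum of normalized expression values over the scored genes.

    expression: gene -> normalized expression value for one patient
    weights:    gene -> coefficient for each gene in the score
    """
    missing = [gene for gene in weights if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    # Genes measured but absent from `weights` do not contribute
    return sum(weights[gene] * expression[gene] for gene in weights)


# Placeholder genes and coefficients (hypothetical, for illustration only)
weights = {"GENE_A": 0.5, "GENE_B": -0.25, "GENE_C": 0.125}
expression = {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 8.0, "GENE_D": 1.0}

score = weighted_expression_score(expression, weights)
# 0.5*2.0 - 0.25*4.0 + 0.125*8.0 = 1.0
```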

To determine whether genetic alterations not included in the KB algorithm might improve its performance, we compared the frequencies of 44 gene mutations and eight cytogenetic categories (listed in Additional file 1) between patients alive 3 years after diagnosis who were correctly predicted alive and patients falsely predicted to be dead. Three molecular and two cytogenetic markers were significantly different between the patient groups (Table 1).

Table 1 Predicted and observed frequencies of additional genetic markers in AML patients alive and those who were dead 3 years after diagnosis

To cross-validate these findings, we compared these markers’ frequencies between patients who died within the first 3 years and were correctly predicted to be dead and those falsely predicted to be alive. The frequencies of SAMHD1 mutations and atypical complex karyotype (i.e., without 5q, 7q and 17p abnormalities) [11] were significantly different in both comparisons. Frequencies of AXL and NOTCH1 mutations and of infrequent recurrent balanced chromosome rearrangements [12] were significantly different among patients alive and tended to be different among patients who died (Table 1).
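Each marker comparison described above reduces to a 2x2 table: marker present or absent, in correctly versus falsely predicted patients. The letter does not name the test used, so the sketch below assumes a two-sided Fisher's exact test, a common choice for infrequent markers. The function name and counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: patient groups (e.g. correctly vs. falsely predicted).
    Columns: marker present vs. marker absent.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x marker-positive patients in row 1
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    x_min, x_max = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are as likely or less likely than the observed one
    p = sum(
        prob(x)
        for x in range(x_min, x_max + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p)


# Toy counts, not taken from Table 1
p_value = fisher_exact_two_sided(3, 1, 1, 3)  # 34/70
```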

Summarizing, we show that the KB algorithm has a high predictive value, higher than the 2017 and 2010 ELN classifications, and identify additional genetic factors that might improve it.

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding authors on reasonable request.



Abbreviations

AML: Acute myeloid leukemia
OS: Overall survival
ELN: European LeukemiaNet
KB: Knowledge bank
CR1: First complete remission
CALGB: Cancer and Leukemia Group B
Alliance: Alliance for Clinical Trials in Oncology
ROC: Receiver operating characteristic
AUC: Area under the curve


1. Döhner H, Estey EH, Amadori S, Appelbaum FR, Büchner T, Burnett AK, et al. Diagnosis and management of acute myeloid leukemia in adults: recommendations from an international expert panel, on behalf of the European LeukemiaNet. Blood. 2010;115(3):453–74.

2. Döhner H, Estey E, Grimwade D, Amadori S, Appelbaum FR, Büchner T, et al. Diagnosis and management of AML in adults: 2017 ELN recommendations from an international expert panel. Blood. 2017;129(4):424–47.

3. Gerstung M, Papaemmanuil E, Martincorena I, Bullinger L, Gaidzik VI, Paschka P, et al. Precision oncology for acute myeloid leukemia using a knowledge bank approach. Nat Genet. 2017;49(3):332–40.

4. Huet S, Paubelle E, Lours C, Grange B, Courtois L, Chabane K, et al. Validation of the prognostic value of the knowledge bank approach to determine AML prognosis in real life. Blood. 2018;132(8):865–7.

5. Röllig C, Bornhäuser M, Thiede C, Taube F, Kramer M, Mohr B, et al. Long-term prognosis of acute myeloid leukemia according to the new genetic risk classification of the European LeukemiaNet recommendations: evaluation of the proposed reporting system. J Clin Oncol. 2011;29(20):2758–65.

6. Mrózek K, Marcucci G, Nicolet D, Maharry KS, Becker H, Whitman SP, et al. Prognostic significance of the European LeukemiaNet standardized system for reporting cytogenetic and molecular alterations in adults with acute myeloid leukemia. J Clin Oncol. 2012;30(36):4515–23.

7. Herold T, Rothenberg-Thurley M, Grunwald VV, Janke H, Goerlich D, Sauerland MC, et al. Validation and refinement of the revised 2017 European LeukemiaNet genetic risk stratification of acute myeloid leukemia. Leukemia. 2020;34(12):3161–72.

8. Eisfeld A-K, Kohlschmidt J, Mims A, Nicolet D, Walker CJ, Blachly JS, et al. Additional gene mutations may refine the 2017 European LeukemiaNet classification in adult patients with de novo acute myeloid leukemia aged <60 years. Leukemia. 2020;34(12):3215–27.

9. Ng SWK, Mitchell A, Kennedy JA, Chen WC, McLeod J, Ibrahimova N, et al. A 17-gene stemness score for rapid determination of risk in acute leukaemia. Nature. 2016;540(7633):433–7.

10. Bill M, Nicolet D, Kohlschmidt J, Walker CJ, Mrózek K, Eisfeld A-K, et al. Mutations associated with a 17-gene leukemia stem cell score and its prognostic relevance in the context of the European LeukemiaNet classification for acute myeloid leukemia. Haematologica. 2020;105(3):721–9.

11. Mrózek K, Eisfeld A-K, Kohlschmidt J, Carroll AJ, Walker CJ, Nicolet D, et al. Complex karyotype in de novo acute myeloid leukemia: typical and atypical subtypes differ molecularly and clinically. Leukemia. 2019;33(7):1620–34.

12. Eisfeld A-K, Mrózek K, Kohlschmidt J, Nicolet D, Orwick S, Walker CJ, et al. The mutational oncoprint of recurrent cytogenetic abnormalities in adult patients with de novo acute myeloid leukemia. Leukemia. 2017;31(10):2211–8.



We thank the patients who participated in clinical trials and their families supporting them; Donna Bucci and Christopher Manring and the CALGB/Alliance Leukemia Tissue Bank at The Ohio State University Comprehensive Cancer Center, Columbus, OH, for sample processing and storage services; and Lisa J. Sterling for data management. The content of this article is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. This article is dedicated to the memory of Clara D. Bloomfield, M.D., who died on 1 March 2020.


Research reported in this publication was supported in part by the National Cancer Institute of the National Institutes of Health under Award Numbers U10CA180821, U10CA180882, and U24CA196171, (to the Alliance for Clinical Trials in Oncology) UG1CA233180, UG1CA233331, UG1CA233338, U10CA101140, U10CA180861, P30CA016058, and RCA197734, the Leukemia Clinical Research Foundation, the Warren D. Brown Foundation, and by an allocation of computing resources from The Ohio Supercomputer Center. Support to Alliance for Clinical Trials in Oncology and Alliance Foundation Trials programs is listed at

Author information




MB, KM, JK, DN, CDB, and CCO designed the study; MB, KM, BG, JK, DN, DP, A-KE, RG, and CCO analyzed the data; JK and DN performed the statistical analyses; MB, KM, and JK wrote the manuscript; MB, KM, JK, BG, JCB, CDB, and CCO edited the manuscript; JEK, BLP, AJC, RMS, JCB, and CDB provided study materials or patients; JCB, CDB, and CCO provided administrative support; JCB, and CDB provided financial support. All authors, except CDB, who died before the completion of the final draft, read and approved the final manuscript.

Corresponding authors

Correspondence to Marius Bill or Krzysztof Mrózek or Christopher C. Oakes.

Ethics declarations

Ethics approval and consent to participate

All study protocols were approved by the Institutional Review Boards at each participating center in accordance with the Declaration of Helsinki. Each patient provided written informed consent for the research use of their specimens before enrollment.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Presented in part in abstract form at the 61st Annual Meeting of the American Society of Hematology, Orlando, FL, December 7-10, 2019.

Supplementary Information

Additional file 1. Supplementary Material.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Bill, M., Mrózek, K., Giacopelli, B. et al. Precision oncology in AML: validation of the prognostic value of the knowledge bank approach and suggestions for improvement. J Hematol Oncol 14, 107 (2021).



  • Acute myeloid leukemia
  • Knowledge bank
  • Next-generation sequencing
  • Gene mutations
  • Clinical outcome