Comparative efficacy of tandem autologous versus autologous followed by allogeneic hematopoietic cell transplantation in patients with newly diagnosed multiple myeloma: a systematic review and meta-analysis of randomized controlled trials

Background: Despite advances in understanding of the clinical, genetic, and molecular aspects of multiple myeloma (MM) and the availability of more effective therapies, MM remains incurable. The autologous-allogeneic (auto-allo) hematopoietic cell transplantation (HCT) strategy is based on combining cytoreduction from high-dose (chemo- or chemoradio)-therapy with adoptive immunotherapy. However, conflicting results have been reported when an auto-allo HCT approach is compared to tandem autologous (auto-auto) HCT. A previously published meta-analysis addressed this comparison; however, it suffers from serious methodological flaws.
Methods: A systematic search identified 152 publications, of which five studies (enrolling 1538 patients) met inclusion criteria. All studies eligible for inclusion utilized biologic randomization.
Results: Response rates, assessed as achievement of at least a very good partial response, did not differ between the treatment arms [risk ratio (RR) (95% CI) = 0.97 (0.87-1.09), p = 0.66], but complete remission was higher in the auto-allo HCT arm [RR = 1.65 (1.25-2.19), p = 0.0005]. Event-free survival did not differ between the auto-allo HCT group and the auto-auto HCT group using per-protocol analysis [hazard ratio (HR) = 0.78 (0.58-1.05), p = 0.11] or using intention-to-treat analysis [HR = 0.83 (0.60-1.15), p = 0.26]. Overall survival (OS) did not differ between these treatment arms whether analyzed per protocol [HR = 0.88 (0.33-2.35), p = 0.79] or by intention-to-treat [HR = 0.80 (0.48-1.32), p = 0.39]. Non-relapse mortality (NRM) was significantly worse with auto-allo HCT [RR (95% CI) = 3.55 (2.17-5.80), p < 0.00001].
Conclusion: Despite higher complete remission rates, there is no improvement in OS with auto-allo HCT, and this approach results in higher NRM in patients with newly diagnosed MM. At present, the totality of evidence suggests that an auto-allo HCT approach for patients with newly diagnosed myeloma should not be offered outside the setting of a clinical trial.

Keywords: Autologous hematopoietic stem cell transplantation, Allogeneic hematopoietic stem cell transplantation, Multiple myeloma, Systematic review

Background
The past two decades witnessed major advances in treatment of multiple myeloma (MM), including introduction of high-dose therapy (HDT) (chemotherapy or chemoradiotherapy), autologous hematopoietic cell transplantation (auto-HCT), and other effective therapies including immunomodulatory drugs or proteasome inhibitors, namely bortezomib [1][2][3][4][5]. These new chemotherapeutic agents, when used in combination, have led to improved survival and a higher frequency and better quality of response, but have not translated into a cure for this disease [3,4].
The concept of a "total therapy" treatment approach for patients with newly diagnosed MM, using multi-agent induction regimens, tandem auto-auto HCT, and post-transplantation maintenance, resulted in a progressive increase in the proportion of patients achieving complete remission (CR) [6]. The Intergroupe Francophone du Myelome (IFM) demonstrated that tandem auto-auto HCT improves overall survival (OS) among patients with myeloma, particularly if a very good partial response (VGPR) is not achieved after the first auto-HCT [7]. A meta-analysis by our group showed that tandem auto-auto HCT versus single auto-HCT in previously untreated MM results in improved response rates, but not improved OS [8].
Badros et al. demonstrated the feasibility of offering reduced-intensity conditioning (RIC) allogeneic (allo)-HCT as a salvage strategy in 31 patients with relapsed MM [9]. Seventeen (55%) of 31 cases had received at least two auto-HCT and 17 (55%) had progressive disease at time of allografting [9]. Despite these adverse clinical features, 19 (61%) patients achieved CR or a near CR, with the 100-day and overall non-relapse mortality (NRM) of 10% and 29%, respectively [9]. This suggests a beneficial graft-versus-myeloma (GVM) effect mediated by alloreactive donor T-cells is capable of disease control, even in MM refractory to HDT. Gahrton et al. compared outcomes of patients who received allo-HCT for relapsed MM during 1983-1993 and 1994-1998, showing improvement in NRM and OS for patients allografted during the later time period [10]. The authors speculate that earlier time to allografting (10 months versus 14 months), for patients transplanted during the later time period, probably contributed to this beneficial effect [10]. Similar results were recently reported by Kumar et al., where 1-year OS post allo-HCT improved in three successive eras (1989-1994, 1995-2000, and 2001-2005) and increased interval between time of MM diagnosis and allografting was found to be an independent adverse prognostic factor for OS [11].
Combining the benefits of cytoreductive therapy from HDT and auto-HCT with adoptive immunotherapy (from allo-HCT) forms the basis of the auto-allo HCT treatment strategy in patients with MM. Conflicting results, however, have been noted when an auto-allo HCT approach has been compared to an auto-auto HCT strategy. A recent systematic review on the same issue was performed by Armeson et al. [12]. However, in our opinion, this systematic review is limited by inclusion of an inappropriate study: it included the study by Garban et al., which was not a true randomized controlled trial but rather represents comparisons from two parallel trials (IFM99-03 and IFM99-04) that enrolled allograft and autograft recipients separately. Most importantly, the systematic review by Armeson et al. did not attempt to evaluate the methodological quality of included studies, which is one of the key reasons to conduct a systematic review. Assessment of risk of bias in the systematic review process helps to explain whether the observed findings are indeed the effect of the intervention or a result of bias. Accordingly, we performed a systematic review of published studies comparing auto-auto HCT with auto-allo HCT in patients with newly diagnosed MM that addresses the issues that were not addressed in the systematic review by Armeson et al.

Results
Initial search yielded 152 references and 2 abstracts, of which 149 were excluded for various reasons as shown in Figure 1. Five studies (four full-manuscripts and one abstract) enrolling a total of 1538 patients were eligible for inclusion into this meta-analysis [13][14][15][16][17]. In one case [15], we identified a complementary publication [18] which provided longer follow-up on the originally published study. Additionally, we excluded one manuscript [19] because it was an indirect comparison (i.e. patients were enrolled separately into two parallel trials, IFM99-03 and IFM99-04, with different primary endpoints and subsequently compared to each other). Finally, we excluded one abstract, HOVON50/54, because patients on the control arm received only a single auto HCT [20].
Patient, disease and treatment characteristics
Table 1 summarizes extracted data pertinent to patients' disease and treatment characteristics. All studies allocated patients to auto-allo HCT if an HLA-matched sibling donor was available, except one [16] where matched volunteer unrelated donors were permitted. For patients undergoing tandem auto-auto HCT, high-dose melphalan 200 mg/m² (MEL200) was the preferred regimen for the first autograft in three studies [13,14,16], melphalan dose ranging from 100 to 200 mg/m² was used in one study [15], and melphalan dose ranging from 140 (with total body irradiation) to 200 mg/m² was used in another study [17]. For the second autograft, MEL200 was the preferred regimen in two studies [13,16]   autograft using MEL200 or not to undergo a second autograft [14]. For the purpose of this meta-analysis, only patients who received a second autograft were included in analysis.
For patients who received an auto-allo HCT approach, MEL200 was the preferred regimen for autografting in four studies [13][14][15][16]. RIC regimen of 2 Gy TBI was the preparative regimen in two studies [13,15]. Bjorkstrand et al. combined fludarabine with 2 Gy TBI [14], while the two remaining studies employed a RIC regimen with fludarabine/melphalan for allo-HCT [16,17]. No specific disease-risk eligibility criteria were required except in one study which limited enrollment to patients with deletion of chromosome 13q [16].

Methodological quality
Methodological quality of included studies is summarized in Table 2. Briefly, all five studies utilized biologic randomization. Four studies reported data on prognostic factors and groups were balanced for presence of associated prognostic risk factors [13][14][15][17,18] while one study did not report data on prognostic factors [16]. None of the studies reported whether all consecutive patients were enrolled. Four studies had at least 1:2 ratio of auto-allo HCT to auto-auto HCT patients while one study [17] had a 1:3.4 ratio. None of the five studies reported blinding of any study personnel. Four studies [13][14][15][17,18] reported using the same reference time for assessing time-dependent outcomes while one study [16] did not report a reference time. Three studies [13][14][15][18] reported outcomes according to intention-to-treat (ITT) and three studies [14,15,17,18] reported harms for patients treated per protocol. One study reported a priori expected difference, pre-specified α and β error, and sample size calculation [13].

Benefits
Summary of all evidence is presented in Table 3.

Response rates
Response data were reported per protocol in four studies, and one study reported all outcomes according to both ITT and per protocol [14]. Two studies [14,17] used European Bone Marrow Transplantation (EBMT) criteria [21] for response assessment and one study [13] used International Uniform Response (IUR) Criteria [22], while Bruno et al. described their own (more stringent CR and PR) criteria [15]. One study did not report how response was assessed [16].

Sensitivity analysis/subgroup analysis
To assess robustness of the pooled results and explore possible reasons for heterogeneity, additional sensitivity and subgroup analyses were performed (see Table 4). To evaluate robustness of response outcomes, sensitivity analysis was performed according to response criteria (EBMT [21], IUR [22], non-EBMT/IUR [15], and not reported). There was no significant difference in ORR or CR regardless of criteria used. Sensitivity analysis for the primary outcome of OS was performed according to all elements of risk of bias. Significant differences in pooled results were only detected when per-protocol analysis of OS in a study (104 patients) which included at least 1:2 ratio of auto-allo HCT versus auto-auto HCT [HR (95% CI) = 0.55 (0.32-0.94), p = 0.03] was compared with per-protocol analysis of OS in a study (110 patients) which did not

Discussion
Auto-HCT has been regarded as the standard of care for younger myeloma patients [1,23]. However, much controversy exists about the role and timing of allo-HCT in newly diagnosed MM. Our meta-analysis indicates that, despite higher CR rates following an auto-allo HCT approach, there is no apparent improvement in OS, whether comparative analysis is performed per protocol or on an ITT basis. This is likely explained by significantly higher NRM associated with RIC allo-HCT versus a second auto-HCT [RR (95% CI) = 3.55 (2.17-5.80), p < 0.00001]. Accordingly, further improvements in the auto-allo HCT approach will require strategies to significantly reduce NRM and augment anti-myeloma effects. Not surprisingly, a significant cause of NRM in the auto-allo HCT arm was development of acute and/or chronic GVHD in these patients. For instance, in the study by Krishnan et al., eight (13%) of 60 deaths were attributed to GVHD [13]. Similarly, in the study by Rosiñol et al., three (75%) of four cases of NRM were from complications of acute GVHD [17]. This suggests that future treatment strategies aimed at exploiting GVM effects in the auto-allo HCT approach should avoid exacerbating GVHD. It is noteworthy that OS benefit with an auto-allo HCT approach is limited to studies using 2 Gy TBI-based conditioning regimens [14,15], which has led to speculation [14] that the lack of survival benefit in other studies might relate to use of more intense conditioning, which is associated with increased regimen-related toxicity and mortality in those studies [16,17]. It is important to note that the largest trial, by Krishnan et al. [13], used 2 Gy TBI conditioning but was also subject to referral bias, and to date has not reported any survival benefit.
Conceptually, auto-allo HCT approach combines the advantage of cytoreduction from HDT from the first autograft with the benefit of adoptive immunotherapy resulting from the donor T cell alloreactivity. Notwithstanding, in the study by Krishnan et al. 22 (37%) of 60 deaths in the auto-allo HCT arm were still due to MM [13]. As a result, future strategies should aim at achieving deeper remissions, namely molecular remissions, or a state of minimal residual disease, prior to moving forward with allografting. This might entail evaluating novel potent therapies during the peri-allografting phase. Moreover, designing more effective regimens for allo-HCT, beyond 2 Gy TBI, is likely necessary to improve outcomes.
With regard to using auto-auto HCT as the control arm for comparison in these studies, one could argue that this approach is not yet considered the standard of care in all patients with newly diagnosed MM. In fact, outcomes from various studies comparing single auto-HCT versus a tandem auto-auto approach have been discrepant [7,24,25] and a published meta-analysis failed to show OS benefit with tandem autografts [8].
A major limitation of all studies comparing auto-auto HCT to auto-allo HCT is lack of detailed information about disease/genetic risk stratification. Only one study limited accrual to patients with deletion 13q detectable by FISH [16]. However, the prognostic significance of 13q deletion detected by FISH as opposed to conventional cytogenetics remains questionable [26]. Whether an auto-allo HCT approach might be beneficial for high-risk MM is not known, and should be further assessed in future trials [27][28][29]. We were not able to assess whether an auto-allo HCT approach might be beneficial for high-risk myeloma patients, as included studies did not report results according to risk categories for all outcomes. An individual patient data meta-analysis would be suitable to answer this question. Furthermore, the results are prone to outcome reporting bias as only three studies reported OS data according to ITT [13,14,18] and another study reported data using per-protocol analysis only [17].   [12]. The objectives of the IFM99-03 trial were to evaluate the feasibility and NRM of RIC allografting [19], whereas the primary end point of IFM99-04 was to compare CR rates achieved after the second auto HCT (with or without anti-IL-6 monoclonal antibody BE-8). Additionally, we excluded a cohort of high-risk patients reported in the study by Krishnan et al. because the original aim of this study was to assess progression-free survival among standard-risk patients [13]. The investigators reported only partial data on a smaller cohort of high-risk patients.

Conclusions
Efforts to identify particular subgroups of patients with MM who are likely to benefit from an auto-allo HCT approach, based on prognostic clinical, biological, cytogenetic and genetic risk factors, are necessary to help refine the role of this approach in MM. At present, the totality of evidence suggests that an auto-allo HCT approach for patients with newly diagnosed myeloma should not be offered outside the setting of a clinical trial.

Study selection and data extraction
Two authors (M.A.K-D and M.H.) appraised the list of references and selected studies in consultation with other authors (T.R. and A.K.). Disagreements were resolved by consensus. Dual data extraction on clinical outcomes, treatment benefits and harms, and methodological quality of included studies was undertaken. Since biologic randomization differs from randomization in traditional randomized controlled trials, not all elements of risk of bias were applicable. For methodological quality, we extracted data on the following elements: comparability of the two groups on all aspects except the intervention (e.g. disease stage, age, gender, etc.), enrollment of consecutive patients, enrollment of patients in the auto-allo and auto-auto groups in at least 1:2 ratio, description of withdrawals and dropouts (if any), blinding of study personnel and who was blinded (e.g. data collectors, outcome assessors etc.), comparability of reference time used for time-dependent outcomes between treatment groups, and analysis according to the ITT principle for benefits and per protocol for adverse events. Clinical outcomes analyzed included: response rates (ORR, CR and VGPR), OS, EFS, NRM and GVHD. For purposes of this review, OS was considered the primary outcome; response rates, EFS, NRM and GVHD were considered secondary outcomes.

Statistical analysis
Dichotomous data were summarized using RR based on the number of events and total number of patients and pooled under a random-effects model. For time-to-event data, HR and 95% CI were extracted when reported. When authors did not report time-to-event estimates, we extracted data from the publication using methods described by Tierney et al. [30]. Time-to-event data were pooled using generic inverse variance under a random-effects model. For analysis of proportional data, methods by Stuart et al. [31] were used to transform proportions according to the Freeman-Tukey variant of the arcsine square root transformation [31]. The pooled proportion was calculated as a back-transform of the weighted mean of the transformed proportions, using a random-effects model [31]. All data are reported with 95% CI. The I² statistic was calculated to assess heterogeneity. An I² > 50% was considered to indicate significant heterogeneity [32]. To assess robustness of the pooled results and explore possible reasons for heterogeneity, additional sensitivity analyses/subgroup analyses were performed according to publication type, patient and disease characteristics, and methodological quality of included studies (risk of bias and random error). All analyses were performed using RevMan 5.1 [33] and StatsDirect [34] software. This work is reported according to the PRISMA guidelines [35].
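As an illustration of the pooling step described above, the following Python sketch shows generic inverse-variance pooling of log hazard ratios under a random-effects model, together with the I² statistic. It is not the RevMan or StatsDirect implementation. It assumes the DerSimonian-Laird estimator of between-study variance, which the text does not name, and it recovers standard errors from reported 95% CIs under a normal approximation on the log scale. The function names are hypothetical. Apart from the first entry (HR 0.55, 95% CI 0.32-0.94, the per-protocol OS estimate quoted in the sensitivity analysis), the input values are invented for illustration and are not data from the included studies.

```python
import math

Z95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def log_effect_and_se(estimate, lower, upper):
    """Convert a ratio (HR or RR) and its 95% CI to a log effect and its SE.

    Assumes the CI is symmetric on the log scale (normal approximation).
    """
    log_est = math.log(estimate)
    se = (math.log(upper) - math.log(lower)) / (2 * Z95)
    return log_est, se


def pool_random_effects(log_effects, ses):
    """Generic inverse-variance pooling with a DerSimonian-Laird tau^2.

    The DerSimonian-Laird choice is an assumption of this sketch.
    Returns the pooled log effect, its SE, tau^2, Cochran's Q and I^2 (%).
    """
    k = len(log_effects)
    w = [1.0 / se ** 2 for se in ses]  # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"log_effect": pooled, "se": se_pooled,
            "tau2": tau2, "q": q, "i2": i2}


def to_ratio_with_ci(log_effect, se):
    """Back-transform a pooled log effect to a ratio with its 95% CI."""
    return (math.exp(log_effect),
            math.exp(log_effect - Z95 * se),
            math.exp(log_effect + Z95 * se))


if __name__ == "__main__":
    # First tuple is quoted in the text; the other two are hypothetical.
    studies = [(0.55, 0.32, 0.94), (1.10, 0.70, 1.73), (0.90, 0.60, 1.35)]
    effects, ses = zip(*(log_effect_and_se(*s) for s in studies))
    result = pool_random_effects(effects, ses)
    hr, lo, hi = to_ratio_with_ci(result["log_effect"], result["se"])
    print(f"Pooled HR = {hr:.2f} ({lo:.2f}-{hi:.2f}), I2 = {result['i2']:.0f}%")
```

Pooling is done on the log scale because ratio measures are approximately normally distributed there; the pooled estimate and its interval are exponentiated only at the end. The same routine would apply to log risk ratios computed from event counts.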