The functional epigenetic landscape of aberrant gene expression in molecular subgroups of newly diagnosed multiple myeloma

Background Multiple Myeloma (MM) is a hematological malignancy with genomic heterogeneity and poor survival outcome. Apart from the central role of genetic lesions, epigenetic anomalies have been identified as drivers in the development of the disease. Methods Alterations in the DNA methylome were mapped in 52 newly diagnosed MM (NDMM) patients of six molecular subgroups and matched with loci-specific chromatin marks to define their impact on gene expression. Differential DNA methylation analysis was performed using DMAP with a ≥10% increase (hypermethylation) or decrease (hypomethylation) in NDMM subgroups, compared to control samples, considered significant for all the subsequent analyses with p<0.05 after adjusting for a false discovery rate. Results We identified differentially methylated regions (DMRs) within the etiological cytogenetic subgroups of myeloma, compared to control plasma cells. Using gene expression data we identified genes that are dysregulated and correlate with DNA methylation levels, indicating a role for DNA methylation in their transcriptional control. We demonstrated that 70% of DMRs in the MM epigenome were hypomethylated and overlapped with repressive H3K27me3. In contrast, differentially expressed genes containing hypermethylated DMRs within the gene body or hypomethylated DMRs at the promoters overlapped with H3K4me1, H3K4me3, or H3K36me3 marks. Additionally, enrichment of BRD4 or MED1 at the H3K27ac enriched DMRs functioned as super-enhancers (SE), controlling the overexpression of genes or gene-cassettes. Conclusions Therefore, this study presents the underlying epigenetic regulatory networks of gene expression dysregulation in NDMM patients and identifies potential targets for future therapies.


Infinium 850K array was used for determining differentially methylated CpG sites in NDMM subgroups
The distribution of Differentially Methylated CpGs (DMCs) within or beyond DMRs were determined using 850K Infinium methylation arrays in selected samples. The samples from each MM subgroup, analyzed by eRRBS or methylation array, showed high reproducibility between the two methods for these most variable sites (correlation coefficient >91%). PCA analysis based on 850K array showed some heterogeneity between the samples within t(14;16) subgroup ( Supplementary   Fig. 3).
Likewise eRRBS, the highest number of most variable DMCs seen on the arrays were obtained at intergenic regions (42%) and gene bodies (39%) followed by transcription start site (TSS) (10%), 5'-UTRs (6%), 3'-UTRs (2%), and first exons (1%)  Table 4). This strengthens the eRRBS data and confirms prevalent distribution of hypomethylated CpGs along the gene body and intergenic regions. This suggest existence of significant epigenetic regulatory elements at gene body that might contribute to the transcription machinery in MM.
The top 5% (n=43,223 unique sites) of most variable DMCs from the methylation array have different hits of CpGs than the eRRBS DMRs, showed similarity in eRRBS DMR-methylation pattern at the major annotated regions (Supplementary Fig.   4B). It also showed that the t(4;14) subgroup was relatively hypermethylated, with the t(11;14) subgroup being the most hypomethylated compared to the other NDMM subgroups. We also noticed formation of neighboring clusters among the t(14;16) and t(14;20) or HRD-D1 and HRD-D2 subgroups. Based on the bootstrapping analysis from eRRBS, we observed that the t(11;14) subgroup formed the farthest cluster to the control, as obtained from eRRBS, while in 850 array, we found isolated clustering of t(11;14) from other MM subgroups, but not essentially the most distant cluster from the control.

Supplementary Figure 3
Unsupervised principle component analysis (PCA) of 850k methylation array data of all the samples from different NDMM subgroups, used in the present study. The proportion pf variance for the principle component (PC)-1 is 17%, PC2 is 7%, and PC3 is 5%. The red circles demarcates distinct segregation of majority of the control and t(4;14) samples from the remaining MM subgroups, based on their DNA-methylation profile. PC stands for principle component. The key genes per MM subgroup were marked. (G) Based on bootstrapping of the differential expression homology of top 5% variable genes, we observed that the t(14;16) and t(14;20) subgroups formed the closest cluster to control, followed by t(11;14), t(4;14) and HY subgroups. AU: Approximately unbiased p-value; BP: Bootstrap probability. We measured Euclidian distance of the average clusters (pvclust program of the R package). AU: Approximately unbiased p-value; BP: Bootstrap probability. We measured Euclidian distance of the average clusters (pvclust program of the R package).

Supplementary Figure 6
Difference in the distribution of hypomethylated epitranscript-DMRs at the promoter (A) and body (B) or hypermethylated epitranscript-DMRs at the promoter (C) and gene body (D) across the NDMM subgroups were evaluated using ANOVA *The P value (Pr) is larger than the F ratio and the inter-group DMR-methylation values for degrees of freedom in the ANOVA table.

Hypo-DMRs at promoter Hypo-DMRs at body Hyper-DMRs at promoter Hyper-DMRs at body Pr (>F) value
1.55e-09 *** 6.11e-14 *** 0. Depiction of overlap between the hyper/hypo-methylated DMRs in t(4;14) NDMM patients and histone marks from KMS11 cells. Differentially expressed genes in the t(4;14) subgroup, which span these DMR-histone overlapped sites, are shown at the promoter and gene body as both counts and as a percentage.

Supplementary Figure 7B
Depiction of overlap between the hyper/hypo-methylated DMRs in t(11;14) NDMM patients and histone marks from U266 cells. Differentially expressed genes in the t(11;14) subgroup, which spans these DMR-histone overlapped sites are shown at the promoter and gene body as both counts and as a percentage.

Supplementary Figure 7C
Depiction of overlap between the hyper/hypo-methylated DMRs in t(14;16) NDMM patients and histone marks from MM.1S cells. Differentially expressed genes in the t(14;16) subgroup, which spans these DMR-histone overlapped sites are shown at the promoter and gene body as both counts and as a percentage.

Supplementary Figure 8
Schematic representation of putative super-enhancer (SE) loops. SE-loops are generally demarcated by CTCF binding sites. SE-loops influence over-expression of an individual gene or a gene cassette, where promoter or upstream promoter regions and gene body could be completely overlapped with the SE elements. The gene expression are also impacted, where SE are not fully overlapped to the gene or the gene-promoter but exist proximally to the gene regulatory elements.

Supplementary Figure 9
The overlap of DMRs (>50 bp) with super-enhancers (SE) (n=677) or CTCF (n=4,791) binding sites was determined at the promoters (a+d), gene bodies (b+e) or intergenic regions (c+f). Based on average linkage of the methylation values, SE-DMRs were clustered into three major modules [hypermethylated (in blue), hypermethylated (in red) and combination of both, while DMR-CTCFs were clustered into two major modules (hyper-and hypomethylated) across the genomic regions.

Supplementary Figure 10
MAF overexpression is a signature for the t(14;16) subgroup of MM. We observed an overlap of SE elements at the promoter, upstream enhancer and portion of MAF gene body in MM1.s cells.

Supplementary Figure 11
Expression of GEMs in NDMM patients were validated using RNA sequencing, based on their correlation (r 2 ) coefficient. Expression are represented as signal intensity (log2 scale