were topologically important for connecting the DM genes on the PPI network. Finally, a random-walk-based machine learning method was developed to propagate the DNA methylation scores from the DM genes to ECs, and the derived scores for the EC genes were used to construct a support vector machine for classifying tumor recurrence.
2.2. Raw data collection and processing
Endometrial tissue specimens were obtained as part of our ongoing work on characterizing molecular alterations in endometrioid endometrial carcinomas and were described in a previous report. Global DNA methylation patterns of the 60 tumors and 12 controls were surveyed using methyl-CpG binding domain-based capture coupled with massively parallel sequencing (MBDCap-seq). Briefly, methylated DNA was eluted with the MethylMiner Methylated DNA Enrichment Kit (Invitrogen) according to the manufacturer's instructions. Eluted DNA was used to generate sequencing libraries following the standard Illumina protocols. MBDCap-seq libraries were sequenced on the Illumina Genome Analyzer II as per the manufacturer's instructions. Image analysis and base calling were performed with the standard Illumina pipeline. Sequencing reads were mapped with the ELAND algorithm. Unique reads of up to 36 base pairs were mapped to the human reference genome (hg18), allowing up to two mismatches. Reads in satellite regions were excluded due to the large number of amplifications. Biological reproducibility, technical repeat, and validation analyses were conducted, and the results suggest that MBDCap-seq can reliably identify differentially methylated regions in the genome. The methylation level of each sample was normalized by a linear method based on its number of unique reads.
The tumor differential methylation (TDM) score was calculated for each of the known promoter CpG islands for each cancer patient by comparing the average methylation level in an 8-kb window covering the CpG island in the tumor relative to normal controls using a one-sample t-test. Let p be the p-value resulting from the t-test. For a CpG island significantly hypermethylated (over-methylated) in the tumor, a positive TDM score was calculated as −log10(p); similarly, for a hypomethylated (under-methylated) CpG island, a negative TDM score was calculated as log10(p). In both cases, p-values > 0.01 were converted to 1, and as a result the corresponding TDM scores became zero. These CpG island-level TDM scores were then mapped to gene-level scores by assigning to each gene the highest TDM score among the CpG islands associated with the gene. This resulted in 4214 genes that had non-zero TDM scores for at least one patient. A detailed description and analysis of the complete DNA methylome for these patients has been published elsewhere.
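The scoring rule above can be illustrated with a minimal Python sketch. It assumes the one-sample t-test p-value and the direction of change are already available for each CpG island. The function names, the dictionary layout, and the literal reading of "highest" as the maximum signed score are illustrative assumptions, not the code used in the study.

```python
import math

P_CUTOFF = 0.01  # p-values above this are treated as 1 (score 0), as stated in the text


def tdm_score(p, hypermethylated):
    """Signed TDM score for one CpG island (hypothetical helper).

    p is the one-sample t-test p-value; hypermethylated is True when the
    tumor is more methylated than the normal controls.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if p > P_CUTOFF:
        return 0.0  # p converted to 1, so -log10(1) = 0
    magnitude = -math.log10(p)
    return magnitude if hypermethylated else -magnitude


def gene_level_scores(island_scores, island_to_genes):
    """Assign each gene the highest TDM score among its CpG islands.

    Assumption: "highest" is taken literally as the maximum signed score.
    """
    gene_scores = {}
    for island, score in island_scores.items():
        for gene in island_to_genes.get(island, ()):
            if gene not in gene_scores or score > gene_scores[gene]:
                gene_scores[gene] = score
    return gene_scores
```

For example, an island with p = 0.001 receives a score of about 3 if hypermethylated and about −3 if hypomethylated, while any island with p = 0.05 receives 0.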
2.3. Epigenetic marker and epigenetic connector subnetwork selection
Among the 60 patients available for analysis, 16 had recurrence within 3 years and were designated as recurrent, and the remaining patients were designated as non-recurrent. Because our objective is to classify tumor recurrence, patients who had persistent tumors, or who had non-recurrent tumors but whose last follow-up was within three years after surgery, were excluded beforehand. To identify potential epigenetic markers for recurrence, we compared the TDM score of each gene between the recurrent and non-recurrent tumors using a two-sample t-test. Genes with a p-value < 0.02 were termed differentially methylated (DM) genes. Next, we mapped the DM genes to the human protein-protein interaction network obtained from HPRD (Release 9). We used the largest connected component of the network, which contained 9205 unique genes (official gene symbols) and 36,720 interactions.
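The network step can be sketched as follows: extract the largest connected component from a list of interactions, then keep only the DM genes that fall inside it. This is a minimal illustration using a breadth-first search. The function names and the edge-list input format are assumptions made for the example, not the HPRD file format or the authors' implementation.

```python
from collections import deque


def largest_connected_component(edges):
    """Return the node set of the largest connected component.

    edges is an iterable of (gene_a, gene_b) pairs treated as undirected.
    """
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    seen = set()
    best = set()
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from start.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        if len(component) > len(best):
            best = component
    return best


def map_dm_genes(dm_genes, component):
    """Keep the DM genes present in the network component, sorted by symbol."""
    return sorted(set(dm_genes) & component)
```

With the toy edges `[("A", "B"), ("B", "C"), ("D", "E")]`, the largest component is `{"A", "B", "C"}`, and only DM genes within that set are retained for the subsequent subnetwork analysis.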