preAssociationProbeFiltering {ELMER} | R Documentation |
This function applies filters to the DNA methylation data, selecting probes in order to avoid correlations due to non-cancer contamination and to add stringency.
Filter 1: We usually call a locus unmethylated when the methylation value is < 0.3 and methylated when the methylation value is > 0.3. Meth_B is therefore the percentage of samples with a methylation value > K. This step makes sure that at least the given percentage of beta values are less than K and at least the given percentage of beta values are greater than K. For example, with a percentage of 5%, 100 samples and K = 0.3, this filter selects probes for which at least 5 samples (5% of 100) have beta values > 0.3 and at least 5 samples have beta values < 0.3. This filter is important because true promoters and enhancers usually have a pretty low value (although sample purity can distort that). We often see many partially methylated domain (PMD) probes across the genome with intermediate values such as 0.4. Choosing a value of 0.3 will certainly give some false negatives, but few compared with the number of false positives we expected without this filter.
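The criterion described in Filter 1 can be sketched in base R. This is a minimal illustration under stated assumptions, not the ELMER implementation: the helper name filter_probes_sketch is hypothetical, the input is assumed to be a plain numeric matrix with probes in rows and samples in columns, and "at least" is read as an inclusive comparison (>=).

```r
# Hypothetical helper (not part of ELMER): keep probes (rows) that have enough
# samples on both sides of the cutoff K.
filter_probes_sketch <- function(met, K = 0.3, percentage = 0.05) {
  frac_above <- rowMeans(met > K, na.rm = TRUE)  # share of samples with beta > K
  frac_below <- rowMeans(met < K, na.rm = TRUE)  # share of samples with beta < K
  # Assumption: "at least" means an inclusive threshold on both sides
  keep <- frac_above >= percentage & frac_below >= percentage
  met[keep, , drop = FALSE]
}

# Toy matrix: 4 probes x 100 samples with known behaviour
met <- rbind(
  mixed = c(rep(0.1, 50), rep(0.8, 50)),  # 50 samples on each side of K
  low   = rep(0.1, 100),                  # never above K
  high  = rep(0.8, 100),                  # never below K
  rare  = c(rep(0.1, 97), rep(0.8, 3))    # only 3 samples above K
)

filtered <- filter_probes_sketch(met, K = 0.3, percentage = 0.05)
rownames(filtered)
```

In this toy input only the probe named mixed has at least 5 of 100 samples on each side of 0.3, so it is the only row retained. The probes low, high and rare are dropped because they show too little variation around the cutoff.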
preAssociationProbeFiltering(data, K = 0.3, percentage = 0.05)
data |
A MultiAssayExperiment with a DNA methylation matrix, or a DNA methylation matrix |
K |
Cutoff used to consider probes as methylated or unmethylated. Default: 0.3 |
percentage |
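The minimum percentage of samples that must be considered methylated, and the minimum percentage that must be considered unmethylated, for a probe to be kept. Default: 0.05 |
An object of the same class as data, with the probes that do not pass the filter removed.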
Yao, Lijing, et al. "Inferring regulatory element landscapes and transcription factor networks from cancer methylomes." Genome biology 16.1 (2015): 1. Method section (Linking enhancer probes with methylation changes to target genes with expression changes).
random.probe <- runif(100, 0, 1)
bias_l.probe <- runif(100, 0, 0.3)
bias_g.probe <- runif(100, 0.3, 1)
met <- rbind(random.probe,bias_l.probe,bias_g.probe)
met <- preAssociationProbeFiltering(data = met, K = 0.3, percentage = 0.05)
met <- rbind(random.probe,random.probe,random.probe)
met <- preAssociationProbeFiltering(met, K = 0.3, percentage = 0.05)
data <- ELMER:::getdata("elmer.data.example") # Get data from ELMER.data
data <- preAssociationProbeFiltering(data, K = 0.3, percentage = 0.05)
cg24741609 <- runif(100, 0, 1)
cg17468663 <- runif(100, 0, 0.3)
cg14036402 <- runif(100, 0.3, 1)
met <- rbind(cg24741609,cg14036402,cg17468663)
colnames(met) <- paste("sample",1:100)
exp <- met
rownames(exp) <- c("ENSG00000141510","ENSG00000171862","ENSG00000171863")
sample.info <- S4Vectors::DataFrame(sample.type = rep(c("Normal", "Tumor"),50))
rownames(sample.info) <- colnames(exp)
mae <- createMAE(exp = exp, met = met, colData = sample.info, genome = "hg38")
mae <- preAssociationProbeFiltering(mae, K = 0.3, percentage = 0.05)