pcot2 {pcot2} | R Documentation |
The pcot2
function implements the PCOT2 testing method, which is a
two-stage permutation-based approach for testing changes in activity in
pre-specified gene sets.
pcot2(emat, class = NULL, imat, permu = "ByColumn", iter = 1000, alpha = 0.05, adjP.method = "BY", var.equal = TRUE, ncomp = 2, dist.method = "euclidean")
emat |
A gene expression matrix with no missing values; Each row represents a gene and each column represents a sample. |
class |
Class labels representing two distinct experimental conditions (e.g., normal and disease). |
imat |
The gene category indicator matrix indicates presence or absence of genes in pre-defined gene sets (e.g., gene pathways). The indicator matrix contains rows representing gene identifiers of genes present in the expression data and columns representing pre-defined group names. A value of 1 indicates the presence of a gene and 0 indicates the absence for the gene in a particular group. |
permu |
Specifies whether genes or samples are permuted. By default, permutations are performed by sample ("ByColumn"). |
iter |
The number indicates how many permutations will be performed in the analysis. |
alpha |
alpha determines the significance threshold for the permutation p-values. |
adjP.method |
Specifies that p-values be adjusted by one of the following methods: "bonferroni", "holm", "hochberg", "hommel", "BH" (Benjamini and Hochberg), or "BY" (Benjamini and Yekutieli). |
var.equal |
Specifies the use of either a pooled estimate of correlation for the two classes or an unpooled estimate for calculating each T-squared statistic. By default, the pooled estimate is used. |
ncomp |
The dimensionality to which the data matrix is reduced
via principal coordinates. The default dimensionality is set as
|
dist.method |
Specifies the method for calculating
distance in the PCO procedure. The available distance methods are
"euclidean", "maximum", "manhattan", "canberra", "binary",
"pearson","correlation" or "spearman". For additional details see the
|
The raw permutation p-values are adjusted for multiple testing by a call to 'p.adjust'.
res.all |
A data frame which prints information for all pathways |
res.sig |
A data frame which prints information for significant pathways at a given alpha level |
comparison |
Print the contrast used in the analysis |
...
Sarah Song and Mik Black
ns <- 40 ## 40 samples cla <- rep(c("Trt","Ctr"),each=ns/2) ngene <- 10 ## 10 genes per group npath <- 10 ## 10 groups nreal <- 3 ## alter groups ## nnull <- npath-nreal ## null groups ## pname <- c(paste("RealP",1:nreal, sep=""), paste("NullP",1:nnull, sep="")) ## Three main inputs in the function ## ## [1] Simulate (gene) expression matrix (emat) ## rmv <- function(mn, covm, nr, nc){ sigma <- diag(nr) sigma[sigma==0] <- covm x1 <- rmvnorm(nc/2, mean=mn, sigma=sigma) x0 <- rmvnorm(nc/2, mean=rep(0,nr), sigma=sigma) mat <- t(rbind(x1,x0)) return(mat) } covm <- 0.9 ##covariance ct <- c(6,8,10) ##mean library(mvtnorm) emat <- c() for (i in 1:nreal) emat <- rbind(emat, rmv(rep(ct[i],ngene),covm=covm, ngene, ns)) # for alt pathways for (i in 1:(npath-nreal)) emat <- rbind(emat, rmv(mn=rep(0,ngene),covm=covm, nr=ngene, nc=ns)) dimnames(emat) <- list(paste("Gene", 1:(ngene*npath),sep=""), cla) ## [2] class label ## cla ## [3] indicator matrix (row: genes and col: pathways) imat <- kronecker(diag(npath),rep(1,ngene)) dimnames(imat) <- list(paste("Gene",1:(ngene*npath), sep=""), pname) results.pcot2 <- pcot2(emat, cla, imat) results.pcot2$res.sig results.pcot2$res.all