ihw.default {IHW} | R Documentation |
Given a vector of p-values, a vector of covariates that are independent of the p-values under the null hypothesis, and a nominal significance level alpha, IHW learns multiple testing weights and then applies the weighted Benjamini-Hochberg (or Bonferroni) procedure.
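To make the weighted Benjamini-Hochberg step concrete, here is a minimal base-R sketch. It is not IHW's internal implementation: the function name weighted_bh and the toy p-values and weights are hypothetical, and the weights are fixed by hand rather than learned from a covariate. It assumes non-negative weights that average to 1.

```r
# Weighted Benjamini-Hochberg, illustrative sketch only (hypothetical helper,
# not part of the IHW package). Assumes weights >= 0 with mean 1.
weighted_bh <- function(pvalues, weights, alpha) {
  stopifnot(length(pvalues) == length(weights), all(weights >= 0))
  # Divide each p-value by its weight; a zero weight means "never reject".
  weighted_p <- ifelse(weights > 0, pmin(pvalues / weights, 1), 1)
  # Apply the ordinary BH adjustment to the weighted p-values.
  p.adjust(weighted_p, method = "BH") <= alpha
}

# Toy input: the first three hypotheses are up-weighted, the rest down-weighted.
pvals <- c(0.001, 0.02, 0.03, 0.2, 0.6, 0.9)
w <- c(1.5, 1.5, 1.5, 0.5, 0.5, 0.5)

rejected_weighted <- weighted_bh(pvals, w, alpha = 0.05)
rejected_plain <- weighted_bh(pvals, rep(1, length(pvals)), alpha = 0.05)

# Compare the number of rejections with and without weights.
print(c(weighted = sum(rejected_weighted), unweighted = sum(rejected_plain)))
```

With all weights equal to 1 the sketch reduces to the ordinary BH procedure, which is why the same function serves as the unweighted baseline.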
## Default S3 method:
ihw(pvalues, covariates, alpha,
    covariate_type = "ordinal", nbins = "auto", m_groups = NULL,
    quiet = TRUE, nfolds = 5L, nfolds_internal = 5L,
    nsplits_internal = 1L, lambdas = "auto", seed = 1L,
    distrib_estimator = "grenander", lp_solver = "lpsymphony",
    adjustment_type = "BH", return_internal = FALSE, ...)

## S3 method for class 'formula'
ihw(formula, data = parent.frame(), ...)
pvalues |
Numeric vector of unadjusted p-values. |
covariates |
Vector containing the one-dimensional covariate for each test; the covariate must be independent of the p-value under the null hypothesis. Can be numeric or a factor. (A numeric vector is converted into a factor by binning.) |
alpha |
Numeric, sets the nominal level for FDR control. |
covariate_type |
"ordinal" or "nominal" (i.e. whether covariates can be sorted in increasing order or not) |
nbins |
Integer, number of groups into which p-values will be split based on covariate. Use "auto" for automatic selection of the number of bins. Only applicable when covariates is not a factor. |
m_groups |
Integer vector of length equal to the number of levels of the covariates (only to be specified when the latter is a factor/categorical). Each entry corresponds to the number of hypotheses to be tested in each group (stratum). This argument needs to be given when the complete vector of p-values is not available, but only p-values below a given threshold, for example because of memory reasons. See the vignette for additional details and an example of how this principle can be applied with numerical covariates. |
quiet |
Logical; if FALSE, many messages are printed during the fitting stages. |
nfolds |
Number of folds into which the p-values will be split for the pre-validation procedure |
nfolds_internal |
Within each fold, a second (nested) layer of cross-validation can be conducted to choose a good regularization parameter. This parameter controls the number of nested folds. |
nsplits_internal |
Integer, how many times to repeat the nfolds_internal splitting. Can lead to better regularization parameter selection but makes ihw a lot slower. |
lambdas |
Numeric vector which defines the grid of possible regularization parameters. Use "auto" for automatic selection. |
seed |
Integer or NULL. Hypotheses are split into folds at random, so a seed is set to make the output of the function reproducible. Use NULL if you don't want a seed. |
distrib_estimator |
Character ("grenander" or "ECDF"). Only use this if you know what you are doing. ECDF with nfolds > 1 or lp_solver == "lpsymphony" will in general be excessively slow, except for very small problems. |
lp_solver |
Character ("lpsymphony" or "gurobi"). Internally, IHW solves a sequence of linear programs, which can be solved with either of these solvers. |
adjustment_type |
Character ("BH" or "bonferroni") depending on whether you want to control FDR or FWER. |
return_internal |
Logical; if TRUE, returns a lower-level representation of the output (only useful for debugging purposes). |
... |
Arguments passed to internal functions. |
formula |
|
data |
data.frame from which the variables in formula should be taken |
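The following sketch illustrates how the covariates, nbins and m_groups arguments relate when only p-values below a threshold are kept. It is a hedged illustration under stated assumptions: the rank-based binning is a base-R stand-in and may differ in detail from the binning IHW performs internally, the simulated data and the 0.1 filtering threshold are arbitrary, and the call to ihw is skipped if the IHW package is not installed.

```r
set.seed(1)
m <- 20000
covariate <- runif(m, min = 0, max = 2.5)
is_alt <- rbinom(m, 1, 0.1)
pvalue <- 1 - pnorm(rnorm(m, is_alt * covariate))

# Rank-based binning into nbins groups of equal size (illustrative stand-in
# for the conversion of a numeric covariate into a factor).
nbins <- 4
groups <- factor(ceiling(rank(covariate, ties.method = "first") * nbins / m))

# Number of hypotheses per group, counted BEFORE any p-values are discarded.
m_groups <- table(groups)

# Keep only small p-values, e.g. because the full vector is too large to store.
keep <- pvalue <= 0.1
pvalue_filt <- pvalue[keep]
groups_filt <- groups[keep]

# Pass the original group sizes so that the multiplicity is still accounted for.
if (requireNamespace("IHW", quietly = TRUE)) {
  ihw_filt <- IHW::ihw(pvalue_filt, groups_filt, alpha = 0.1, m_groups = m_groups)
  print(IHW::rejections(ihw_filt))
}
```

The key design point is that m_groups is computed from the complete set of hypotheses, while ihw itself only sees the filtered p-values and their group labels.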
An ihwResult object.
ihwResult, plot,ihwResult-method, ihw.DESeqResults
save.seed <- .Random.seed; set.seed(1)
X <- runif(20000, min=0, max=2.5)   # covariate
H <- rbinom(20000,1,0.1)            # hypothesis true or false
Z <- rnorm(20000, H*X)              # Z-score
.Random.seed <- save.seed
pvalue <- 1-pnorm(Z)                # pvalue

ihw_fdr <- ihw(pvalue, X, .1)       # Standard IHW for FDR control
ihw_fwer <- ihw(pvalue, X, .1, adjustment_type = "bonferroni")   # FWER control

table(H[adj_pvalues(ihw_fdr) <= 0.1] == 0) #how many false rejections?
table(H[adj_pvalues(ihw_fwer) <= 0.1] == 0)
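The formula method listed in the usage can be exercised on the same kind of simulated data. The sketch below is an assumption-laden illustration: the data frame name sim and its column names are made up for this example, alpha is forwarded through the ... argument, and the IHW calls are skipped when the package is not installed.

```r
set.seed(1)
n <- 20000
sim <- data.frame(X = runif(n, min = 0, max = 2.5))  # covariate
sim$H <- rbinom(n, 1, 0.1)                           # hypothesis true or false
sim$pvalue <- 1 - pnorm(rnorm(n, sim$H * sim$X))     # one-sided p-value

if (requireNamespace("IHW", quietly = TRUE)) {
  # Left-hand side: p-values; right-hand side: the covariate.
  ihw_formula <- IHW::ihw(pvalue ~ X, data = sim, alpha = 0.1)
  # Total number of rejections at level alpha.
  print(IHW::rejections(ihw_formula))
  # False versus true rejections, mirroring the tables in the example above.
  print(table(sim$H[IHW::adj_pvalues(ihw_formula) <= 0.1] == 0))
}
```

Keeping the p-values and the covariate in one data frame avoids mismatched vector lengths when the inputs are subset or reordered before testing.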