dream {variancePartition} | R Documentation |
Fit linear mixed model for differential expression and preform hypothesis test on fixed effects as specified in the contrast matrix L
dream(exprObj, formula, data, L, ddf = c("Satterthwaite", "Kenward-Roger"), REML = TRUE, useWeights = TRUE, weightsMatrix = NULL, control = lme4::lmerControl(calc.derivs = FALSE, check.rankX = "stop.deficient"), suppressWarnings = FALSE, BPPARAM = bpparam(), ...)
exprObj |
matrix of expression data (g genes x n samples), or ExpressionSet, or EList returned by voom() from the limma package |
formula |
specifies variables for the linear (mixed) model. Must only specify covariates, since the rows of exprObj are automatically used a a response. e.g.: ~ a + b + (1|c) Formulas with only fixed effects also work, and lmFit() followed by contrasts.fit() are run. |
data |
data.frame with columns corresponding to formula |
L |
contrast matrix specifying a linear combination of fixed effects to test |
ddf |
Specifiy "Satterthwaite" or "Kenward-Roger" method to estimate effective degress of freedom for hypothesis testing in the linear mixed model. Note that Kenward-Roger is more accurate, but is *much* slower. Satterthwaite is a good enough exproximation for most datasets. |
REML |
use restricted maximum likelihood to fit linear mixed model. default is TRUE. Strongly discourage against changing this option |
useWeights |
if TRUE, analysis uses heteroskedastic error estimates from voom(). Value is ignored unless exprObj is an EList() from voom() or weightsMatrix is specified |
weightsMatrix |
matrix the same dimension as exprObj with observation-level weights from voom(). Used only if useWeights is TRUE |
control |
control settings for lmer() |
suppressWarnings |
if TRUE, do not stop because of warnings or errors in model fit |
BPPARAM |
parameters for parallel evaluation |
... |
Additional arguments for lmer() or lm() |
A linear (mixed) model is fit for each gene in exprObj, using formula to specify variables in the regression. If categorical variables are modeled as random effects (as is recommended), then a linear mixed model us used. For example if formula is ~ a + b + (1|c), then to model is
fit <- lmer( exprObj[j,] ~ a + b + (1|c), data=data)
useWeights=TRUE causes weightsMatrix[j,] to be included as weights in the regression model.
Note: Fitting the model for 20,000 genes can be computationally intensive. To accelerate computation, models can be fit in parallel using foreach/dopar to run loops in parallel. Parallel processing must be enabled before calling this function. See below.
The regression model is fit for each gene separately. Samples with missing values in either gene expression or metadata are omitted by the underlying call to lmer.
Hypothesis tests and degrees of freedom are producted by lmerTest and pbkrtest pacakges
MArrayLM2 object (just like MArrayLM from limma), and the directly estimated p-value (without eBayes)
# load library # library(variancePartition) library(BiocParallel) # load simulated data: # geneExpr: matrix of gene expression values # info: information/metadata about each sample data(varPartData) form <- ~ Batch + (1|Individual) + (1|Tissue) # Fit linear mixed model for each gene # run on just 10 genes for time fit = dream( geneExpr[1:10,], form, info) # view top genes topTable( fit ) # get contrast matrix testing if the coefficient for Batch2 is # different from coefficient for Batch3 # The variable of interest must be a fixed effect L = getContrast( geneExpr, form, info, c("Batch2", "Batch3")) # plot contrasts plotContrasts( L ) # Fit linear mixed model for each gene # run on just 10 genes for time fit2 = dream( geneExpr[1:10,], form, info, L) # view top genes topTable( fit2 )