ResidualMatrix {BiocSingular} | R Documentation |
Definitions of the ResidualMatrixSeed and ResidualMatrix classes and their associated methods. These classes are designed to support delayed calculation of the residuals from a linear model fit, usually prior to a principal components analysis. The aim is to perform matrix multiplication without explicitly calculating the residuals, allowing efficient computation based on features of the original matrix (e.g., sparsity).
ResidualMatrixSeed(x, design=NULL)
ResidualMatrix(x, design=NULL)
x |
A matrix-like object. This can alternatively be a ResidualMatrixSeed, in which case |
design |
A numeric matrix containing the experimental design,
to be used for linear model fitting on each column of |
The ResidualMatrixSeed constructor will return a ResidualMatrixSeed object.
The ResidualMatrix constructor will return a ResidualMatrix object, containing values equivalent to lm.fit(x=design, y=x)$residuals.
ResidualMatrixSeed objects are implemented as DelayedMatrix backends. They support standard operations like dim, dimnames and extract_array.
Passing a ResidualMatrixSeed object to the DelayedArray constructor will create a ResidualMatrix object.
ResidualMatrix objects are derived from DelayedMatrix objects and support all valid operations on the latter. Several functions are specialized for greater efficiency when operating on ResidualMatrix instances, including:
Subsetting, transposition and replacement of row/column names. These will return a new ResidualMatrix rather than a DelayedMatrix.
Matrix multiplication via %*%, crossprod and tcrossprod. These functions will return a DelayedMatrix.
Calculation of row and column sums and means by colSums, rowSums, etc.
All other operations applied to a ResidualMatrix will use the underlying DelayedArray machinery. Unary or binary operations will generally create a new DelayedMatrix instance containing a ResidualMatrixSeed.
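To illustrate why column sums can be specialized, the following base R sketch computes the column sums of a residual matrix without forming the residuals. It relies only on the identity that the residuals equal x minus its projection onto the column space of the design. This is a minimal sketch of the algebra under that assumption, not necessarily how BiocSingular implements it; the variable names (Q, delayed_sums, etc.) are illustrative.

```r
set.seed(3)
n <- 100

# A design without an intercept, so the residual column sums are not trivially zero.
design <- cbind(a = seq_len(n) / n, b = rnorm(n))
x <- matrix(rnorm(n * 8), ncol = 8)

# Orthonormal basis for the column space of the design.
Q <- qr.Q(qr(design))

# residuals = x - Q %*% t(Q) %*% x, hence
# colSums(residuals) = colSums(x) - colSums(Q) %*% (t(Q) %*% x).
delayed_sums <- colSums(x) - drop(colSums(Q) %*% crossprod(Q, x))

# Reference: compute the residuals explicitly and then sum them.
explicit_sums <- colSums(lm.fit(x = design, y = x)$residuals)

max_diff_sums <- max(abs(delayed_sums - explicit_sums))
```

Only products involving x and the small matrix Q are needed, so any structure in x (e.g., sparsity) can be exploited.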
runPCA(x, rank, center=TRUE, scale=FALSE, get.rotation=TRUE, get.pcs=TRUE, ...) will perform a PCA on a ResidualMatrix object x. All other arguments are as described in runPCA.
This method has the special behaviour that center=TRUE is ignored if:
x was generated using a design that can be parameterized with an intercept. This means that the residuals on each column of x are centered at zero already.
No subsetting by row was performed on x, i.e., the zero-centering of the residuals is preserved.
This improves efficiency by avoiding an unnecessary additional centering step, which would otherwise require block processing or deferred centering - see ?"BiocSingular-options".
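The two conditions above can be checked numerically with base R alone. The sketch below uses lm.fit directly rather than a ResidualMatrix, and the data and variable names are made up for illustration. It shows that the residuals have zero column means when the design includes an intercept, and that this no longer holds in general after row subsetting or when the design cannot be parameterized with an intercept.

```r
set.seed(2)
x <- matrix(rnorm(250 * 10, mean = 5), ncol = 10)

# Design with an intercept: each column of residuals is centered at zero.
with_intercept <- model.matrix(~gl(5, 50))
res <- lm.fit(x = with_intercept, y = x)$residuals
max_mean_full <- max(abs(colMeans(res)))  # effectively zero

# Subsetting by row breaks the zero-centering in general.
keep <- 1:10
max_mean_subset <- max(abs(colMeans(res[keep, , drop = FALSE])))

# A design whose column space does not contain the intercept:
# the residuals are not centered, so explicit centering would be needed.
no_intercept <- cbind(covariate = seq_len(250) / 250)
res2 <- lm.fit(x = no_intercept, y = x)$residuals
max_mean_no_int <- max(abs(colMeans(res2)))
```

In the first case an extra centering step would change nothing, which is why it can be skipped.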
The ResidualMatrix is particularly efficient when combined with approximate PCA strategies based on matrix multiplication.
This is achieved by setting BSPARAM to values like IrlbaParam or RandomParam. The matrix product used in each algorithm can be computed efficiently without actually computing the residuals.
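As a rough illustration of how such a product can avoid the residuals, the base R sketch below multiplies the residual matrix by a vector using only x and an orthonormal basis of the design. It is a sketch of the underlying algebra, assuming a full-rank design, and is not taken from the BiocSingular source; all variable names are illustrative.

```r
set.seed(1)
design <- model.matrix(~gl(5, 50))  # 250 x 5 design
x <- matrix(rnorm(nrow(design) * 20), ncol = 20)
v <- rnorm(ncol(x))

# Orthonormal basis for the column space of the design.
Q <- qr.Q(qr(design))

# residuals = x - Q %*% t(Q) %*% x, hence
# residuals %*% v = x %*% v - Q %*% (t(Q) %*% (x %*% v)).
xv <- x %*% v
delayed <- xv - Q %*% crossprod(Q, xv)

# Reference: form the residuals explicitly, then multiply.
explicit <- lm.fit(x = design, y = x)$residuals %*% v

max_diff <- max(abs(delayed - explicit))
```

The only operation touching x is x %*% v, which stays cheap if x is sparse; the remaining steps involve matrices with as many columns as the design.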
Aaron Lun
design <- model.matrix(~gl(5, 50))

library(Matrix)
y0 <- rsparsematrix(nrow(design), 200, 0.1)
y <- ResidualMatrix(y0, design)
y

# For comparison.
fit <- lm.fit(x=design, y=as.matrix(y0))
DelayedArray(fit$residuals)

crossprod(y)
tcrossprod(y)
y %*% rnorm(200)

# PCA can be performed very quickly on ResidualMatrix
# instances as the underlying representation can be
# used directly, e.g., without loss of sparsity.
pc.out <- runPCA(y, 10, BSPARAM=IrlbaParam())