ResidualMatrix {BiocSingular}    R Documentation

The ResidualMatrix class

Description

Definitions of the ResidualMatrixSeed and ResidualMatrix classes and their associated methods. These classes are designed to support delayed calculation of the residuals from a linear model fit, usually prior to a principal components analysis. The aim is to perform matrix multiplication without explicitly calculating the residuals, allowing efficient computation based on features of the original matrix (e.g., sparsity).
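The idea of multiplying by the residuals without forming them can be illustrated in base R. This is only a sketch of the underlying algebra, not the package's internal implementation; the variable names (Q, implicit, explicit) are chosen here for illustration. If Q is an orthonormal basis for the column space of design, the residuals are (I - Q Q^T) x, so their product with a vector v is x v - Q (Q^T (x v)), which only ever multiplies the original sparse x.

```r
library(Matrix)

set.seed(100)
design <- model.matrix(~gl(5, 50))            # 250 x 5 design matrix
x <- rsparsematrix(nrow(design), 200, 0.1)    # sparse 250 x 200 matrix
v <- rnorm(ncol(x))

# Orthonormal basis for the column space of the design.
Q <- qr.Q(qr(design))

# Residuals are (I - Q Q^T) x, so residuals %*% v = x v - Q (Q^T (x v)).
xv <- drop(as.matrix(x %*% v))                # uses the sparse matrix directly
implicit <- xv - drop(Q %*% crossprod(Q, xv))

# Reference: form the dense residuals explicitly, then multiply.
explicit <- drop(lm.fit(x=design, y=as.matrix(x))$residuals %*% v)

all.equal(implicit, explicit)
```

The implicit route never creates the dense 250 x 200 residual matrix, which is what makes the delayed representation attractive for large sparse inputs.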

Usage

ResidualMatrixSeed(x, design=NULL)

ResidualMatrix(x, design=NULL)

Arguments

x

A matrix-like object.

This can alternatively be a ResidualMatrixSeed, in which case design is ignored.

design

A numeric matrix containing the experimental design, to be used for linear model fitting on each column of x.

Value

The ResidualMatrixSeed constructor will return a ResidualMatrixSeed object.

The ResidualMatrix constructor will return a ResidualMatrix object, containing values equivalent to lm.fit(x=design, y=x)$residuals.

Methods for ResidualMatrixSeed objects

ResidualMatrixSeed objects are implemented as DelayedMatrix backends. They support standard operations like dim, dimnames and extract_array.

Passing a ResidualMatrixSeed object to the DelayedArray constructor will create a ResidualMatrix object.

Methods for ResidualMatrix objects

ResidualMatrix objects are derived from DelayedMatrix objects and support all valid operations on the latter. Several functions are specialized for greater efficiency when operating on ResidualMatrix instances, including the matrix multiplication functions used in the Examples (%*%, crossprod and tcrossprod).

All other operations applied to a ResidualMatrix will use the underlying DelayedArray machinery. Unary or binary operations will generally create a new DelayedMatrix instance containing a ResidualMatrixSeed.

PCA with ResidualMatrix objects

runPCA(x, rank, center=TRUE, scale=FALSE, get.rotation=TRUE, get.pcs=TRUE, ...) will perform a PCA on a ResidualMatrix object x. All other arguments are as described in runPCA.

This method has the special behaviour that center=TRUE is ignored if:

This improves efficiency by avoiding an unnecessary additional centering step, which would otherwise require block processing or deferred centering - see ?"BiocSingular-options".
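As a general illustration of why an extra centering step can be redundant (this is a property of least-squares residuals, and is not a statement of the exact conditions checked by the method): when the design matrix contains an intercept column, the residuals of every column already have a mean of zero. A minimal base R check, using simulated data:

```r
set.seed(1)
design <- model.matrix(~gl(5, 50))   # first column is the intercept
y0 <- matrix(rnorm(nrow(design) * 20), ncol=20)

# Residuals from fitting the design to each column of y0.
res <- lm.fit(x=design, y=y0)$residuals

# Column means are zero up to floating-point error,
# so centering the residuals again would change nothing.
max(abs(colMeans(res)))
```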

The ResidualMatrix is particularly efficient when combined with approximate PCA strategies based on matrix multiplication. This is achieved by setting BSPARAM to values like IrlbaParam or RandomParam. The matrix products used in each of these algorithms can be computed efficiently without actually computing the residuals.
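To show why multiplication-based algorithms pair well with delayed residuals, the sketch below runs a plain power iteration for the leading singular value. This is a deliberately simple stand-in and is not the IRLBA or randomized SVD algorithm; the helpers mult and tmult are hypothetical names introduced here, and none of this reflects the package's internals. The point is only that the iteration needs products with the residual matrix and its transpose, both of which can be computed from the sparse input.

```r
library(Matrix)

set.seed(42)
design <- model.matrix(~gl(5, 50))
x <- rsparsematrix(nrow(design), 200, 0.1)
Q <- qr.Q(qr(design))

# Hypothetical helpers: products with R = (I - Q Q^T) x without forming R.
mult <- function(v) {            # R %*% v
    xv <- drop(as.matrix(x %*% v))
    xv - drop(Q %*% crossprod(Q, xv))
}
tmult <- function(u) {           # t(R) %*% u
    pu <- u - drop(Q %*% crossprod(Q, u))
    drop(as.matrix(crossprod(x, pu)))
}

# Power iteration on t(R) %*% R for the leading right singular vector.
v <- rnorm(ncol(x))
v <- v / sqrt(sum(v^2))
for (i in seq_len(500)) {
    v <- tmult(mult(v))
    v <- v / sqrt(sum(v^2))
}
sigma <- sqrt(sum(mult(v)^2))    # estimate of the largest singular value

# Reference value from a dense SVD of the explicit residuals.
ref <- svd(lm.fit(x=design, y=as.matrix(x))$residuals)$d[1]
c(power=sigma, svd=ref)
```

Power iteration converges slowly when the top singular values are close together, so the two numbers should be expected to agree only approximately; practical methods such as those selected via BSPARAM converge much faster.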

Author(s)

Aaron Lun

Examples

design <- model.matrix(~gl(5, 50))

library(Matrix)
y0 <- rsparsematrix(nrow(design), 200, 0.1)
y <- ResidualMatrix(y0, design)
y

# For comparison.
fit <- lm.fit(x=design, y=as.matrix(y0))
DelayedArray(fit$residuals)

crossprod(y)
tcrossprod(y)
y %*% rnorm(200)

# PCA can be performed very quickly on ResidualMatrix 
# instances as the underlying representation can be 
# used directly, e.g., without loss of sparsity.
pc.out <- runPCA(y, 10, BSPARAM=IrlbaParam())

[Package BiocSingular version 1.4.0 Index]