scRUVIII {scMerge}    R Documentation

scRUVIII: RUVIII algorithm optimised for single cell data

Description

Performs a location/scale adjustment on the data before passing it to RUVIII, and optionally selects the optimal number of unwanted factors (RUVk) according to the silhouette coefficient.
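As a rough illustration of what a location/scale adjustment means, the sketch below centres and scales each feature (column) of a small hypothetical matrix using base R. It is a simplification for intuition only: the exact adjustment applied inside scRUVIII is not described on this page, and the matrix here is simulated.

```r
# Hypothetical data: 5 observations (rows) by 4 features (columns)
set.seed(1)
Y <- matrix(rnorm(20, mean = 5, sd = 2), nrow = 5)

# Location/scale adjustment (assumed form): subtract each column mean
# and divide by each column standard deviation
Y_adj <- scale(Y, center = TRUE, scale = TRUE)

# After the adjustment, every feature has mean 0 and standard deviation 1
round(colMeans(Y_adj), 10)
apply(Y_adj, 2, sd)
```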

Usage

scRUVIII(Y = Y, M = M, ctl = ctl, fullalpha = NULL, k = k,
  cell_type = NULL, batch = NULL, return_all_RUV = TRUE,
  fast_svd = FALSE, rsvd_prop = 0.1)

Arguments

Y

The unnormalised single-cell data. An m by n matrix, where m is the number of observations and n is the number of features.

M

The replicate mapping matrix. The mapping matrix has m rows (one for each observation), and each column represents a set of replicates. The (i, j)-th entry of the mapping matrix is 1 if the i-th observation is in replicate set j, and 0 otherwise. See ruv::RUVIII for more details.
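The definition of the mapping matrix can be sketched in base R as follows. The replicate-set labels are hypothetical, and this is only one way to build such a matrix by hand, not necessarily how scMerge constructs it internally.

```r
# Hypothetical replicate-set labels, one per observation (m = 6)
replicate_set <- c("A", "A", "B", "C", "B", "C")
sets <- unique(replicate_set)

# M[i, j] is 1 if observation i is in replicate set j, and 0 otherwise
M <- 1 * outer(replicate_set, sets, "==")
colnames(M) <- sets

# Each observation belongs to exactly one replicate set
stopifnot(all(rowSums(M) == 1))
dim(M)
```

The result has one row per observation and one column per replicate set, matching the shape described above.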

ctl

An index vector to specify the negative controls. Either a logical vector of length n or a vector of integers.
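The two accepted forms of ctl describe the same set of features. A minimal sketch, using a hypothetical n = 10 and made-up control positions:

```r
n <- 10                                    # hypothetical number of features
ctl_index <- c(2L, 5L, 9L)                 # integer form: positions of the controls
ctl_logical <- seq_len(n) %in% ctl_index   # logical form: TRUE at control positions

# Both forms select the same features
stopifnot(length(ctl_logical) == n)
stopifnot(identical(which(ctl_logical), ctl_index))
```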

fullalpha

Not used. Please ignore.

k

The number of unwanted factors to remove. This is inherited from the ruvK argument of the scMerge::scMerge function.

cell_type

An optional vector indicating the cell type of each cell in the batch-combined matrix. If it is NULL, the pseudo-replicate procedure will be run to identify cell types.

batch

Batch information inherited from the scMerge::scMerge function.

return_all_RUV

Whether to return extra information on the RUV function. Inherited from the scMerge::scMerge function.

fast_svd

If TRUE, fast algorithms will be used for singular value decomposition calculation via the irlba and rsvd packages. We recommend using this option when the number of cells is large (e.g. more than 1000 cells).

rsvd_prop

If fast_svd = TRUE, rsvd_prop is used to reduce the computational cost of the randomised singular value decomposition. We recommend setting this number to less than 0.25 to achieve a balance between numerical accuracy and computational cost. Using a lower value on lower-dimensional data (say, fewer than 1000 cells) will potentially yield a less accurate result, but with a gain in speed. The default of 0.1 tends to achieve a balance between speed and accuracy.

Value

A list consisting of:

Author(s)

Yingxin Lin, Kevin Wang

Examples

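## Simulate data: 200 observations, 1000 features, 100 negative controls
L = ruvSimulate(m = 200, n = 1000, nc = 100, nCelltypes = 3, nBatch = 2, lambda = 0.1, sce = FALSE)
## Log-transform and extract the replicate matrix, negative controls and batch labels
Y = log2(L$Y + 1L); M = L$M; ctl = L$ctl; batch = L$batch
## Run scRUVIII with several candidate values of k
res = scRUVIII(Y = Y, M = M, ctl = ctl, k = c(5, 10, 15, 20), batch = batch)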

[Package scMerge version 1.0.0 Index]