normalize {scater} | R Documentation |
Compute normalised expression values from count data in a SingleCellExperiment object, using the size factors stored in the object.
normalizeSCE(object, exprs_values = "counts", return_log = TRUE,
    log_exprs_offset = NULL, centre_size_factors = TRUE,
    preserve_zeroes = FALSE)

## S4 method for signature 'SingleCellExperiment'
normalize(object, exprs_values = "counts", return_log = TRUE,
    log_exprs_offset = NULL, centre_size_factors = TRUE,
    preserve_zeroes = FALSE)

normalise(...)
object |
A SingleCellExperiment object. |
exprs_values |
String indicating which assay contains the count data that should be used to compute log-transformed expression values. |
return_log |
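Logical scalar indicating whether normalized values should be returned on the log2 scale.
If TRUE, output is stored in "logcounts"; if FALSE, output is stored in "normcounts". |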
log_exprs_offset |
Numeric scalar specifying the pseudo-count to add when log-transforming expression values.
If |
centre_size_factors |
Logical scalar indicating whether size factors should be centred. |
preserve_zeroes |
Logical scalar indicating whether zeroes should be preserved when dealing with non-unity offsets. |
... |
Arguments passed to normalize. |
Normalized expression values are computed by dividing the counts for each cell by the size factor for that cell.
This aims to remove cell-specific scaling biases, e.g., due to differences in sequencing coverage or capture efficiency.
If return_log=TRUE, log-normalized values are calculated by adding log_exprs_offset to the normalized count and performing a log2 transformation.
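The calculation described above can be sketched in base R. This is a minimal illustration rather than the package's implementation: the toy count matrix and the size factors are made up, and only the argument name log_exprs_offset is taken from this page.

```r
# Toy count matrix: 3 features (rows) by 4 cells (columns); values are invented.
counts <- matrix(c(0, 10,  4,
                   2, 20,  8,
                   0,  5,  2,
                   6, 40, 16),
                 nrow = 3,
                 dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4)))

# Hypothetical per-cell size factors (their mean is 1).
size_factors <- c(0.5, 1.0, 0.5, 2.0)

# Divide the counts in each column (cell) by that cell's size factor.
norm_counts <- sweep(counts, 2, size_factors, "/")

# Add the pseudo-count and take log2 to obtain log-normalized values.
log_exprs_offset <- 1
log_counts <- log2(norm_counts + log_exprs_offset)
```

With a pseudo-count of 1, a count of zero stays at zero after the transformation, since log2(0 + 1) = 0.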
Features marked as spike-in controls will be normalized with control-specific size factors, if these are available. This reflects the fact that spike-in controls are subject to different biases than those that are removed by gene-specific size factors (namely, total RNA content). If size factors for a particular spike-in set are not available, a warning will be raised.
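As a rough base R sketch of this behaviour, the rows for spike-in features can be divided by one set of size factors and the remaining rows by another. The is_spike flag, the feature names and both sets of size factors are hypothetical values chosen for illustration; the package's own bookkeeping of spike-in sets is not shown.

```r
# Toy counts: two endogenous genes and one spike-in feature, across two cells.
counts <- matrix(c(4, 8, 100,
                   2, 6, 300),
                 nrow = 3,
                 dimnames = list(c("gene1", "gene2", "ERCC-1"),
                                 c("cell1", "cell2")))

# Hypothetical flag marking which features are spike-in controls.
is_spike <- c(FALSE, FALSE, TRUE)

# Hypothetical size factors: one set for genes, one for the spike-in set.
gene_sf  <- c(2.0, 1.0)
spike_sf <- c(0.5, 1.5)

# Normalize each group of rows with its own per-cell size factors.
norm_counts <- counts
norm_counts[!is_spike, ] <- sweep(counts[!is_spike, , drop = FALSE], 2, gene_sf, "/")
norm_counts[is_spike, ]  <- sweep(counts[is_spike, , drop = FALSE], 2, spike_sf, "/")
```

In this toy example the spike-in counts differ threefold between the cells, but the spike-in size factors absorb that difference, while the gene rows are scaled independently.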
If centre_size_factors=TRUE, all sets of size factors will be centred to have the same mean prior to calculation of normalized expression values.
This ensures that abundances are roughly comparable between features normalized with different sets of size factors.
By default, the centre mean is unity, which means that the computed exprs can be interpreted as being on the same scale as log-counts.
It also means that the added log_exprs_offset can be interpreted as a pseudo-count (i.e., on the same scale as the counts).
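One simple way to centre a set of size factors is to divide it by its own mean, as in the base R sketch below. The helper centre() and the two sets of size factors are hypothetical; the page does not state exactly how the package performs the centring, so treat this as an assumption.

```r
# Hypothetical size factors for endogenous genes and for one spike-in set.
gene_sf  <- c(0.8, 1.6, 2.4, 3.2)
spike_sf <- c(10, 30, 20, 20)

# Hypothetical helper: rescale a set of size factors so its mean is centre_value.
centre <- function(sf, centre_value = 1) {
    sf / mean(sf) * centre_value
}

# After centring, both sets have a mean of 1, so features normalized with
# either set end up on a comparable scale.
gene_sf_centred  <- centre(gene_sf)
spike_sf_centred <- centre(spike_sf)
```

Centring changes only the overall scale of each set; the ratios between cells within a set are unchanged.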
If preserve_zeroes=TRUE and the pseudo-count is not unity, size factors are instead centred at the specified value of log_exprs_offset.
The log-transformation is then performed on the normalized expression values with a pseudo-count of 1, which ensures that zeroes remain so in the output matrix.
This yields the same results as preserve_zeroes=FALSE minus a matrix-wide constant of log2(log_exprs_offset).
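The relationship between the two settings can be checked numerically in base R. The sketch assumes that centring at log_exprs_offset amounts to multiplying unit-centred size factors by the offset; the counts, size factors and offset are invented, and the code imitates the described behaviour without calling the package.

```r
# Toy counts (3 features by 3 cells) and size factors with mean 1.
counts <- matrix(c(0, 10, 4,
                   2, 20, 8,
                   0,  5, 2),
                 nrow = 3)
size_factors <- c(0.5, 2.0, 0.5)
log_exprs_offset <- 0.5   # a non-unity pseudo-count

# preserve_zeroes=FALSE: size factors centred at 1, offset added before log2.
log_default <- log2(sweep(counts, 2, size_factors, "/") + log_exprs_offset)

# preserve_zeroes=TRUE: size factors centred at the offset, pseudo-count of 1.
sf_at_offset <- size_factors * log_exprs_offset
log_preserved <- log2(sweep(counts, 2, sf_at_offset, "/") + 1)

# The two results differ only by the constant log2(log_exprs_offset).
difference_is_constant <- isTRUE(all.equal(log_preserved,
                                           log_default - log2(log_exprs_offset)))

# Zero counts remain exactly zero when zeroes are preserved.
zeroes_kept <- all(log_preserved[counts == 0] == 0)
```

Keeping zeroes at zero is what allows the output to stay sparse when the input counts are sparse, whereas the default setting shifts every zero to log2(log_exprs_offset).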
Note that normalize is exactly the same as normalise.
A SingleCellExperiment object containing normalized expression values in "normcounts" if return_log=FALSE, and log-normalized expression values in "logcounts" if return_log=TRUE.
All size factors will also be centred in the output object if centre_size_factors=TRUE.
Davis McCarthy and Aaron Lun
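library(scater)

## Load the example counts and cell annotation shipped with the package.
data("sc_example_counts")
data("sc_example_cell_info")

## Build a SingleCellExperiment holding the raw counts.
example_sce <- SingleCellExperiment(
    assays = list(counts = sc_example_counts),
    colData = sc_example_cell_info
)

## Compute log-normalized expression values with the default settings.
example_sce <- normalize(example_sce)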