calculateCPM {scater} | R Documentation |
Calculate count-per-million (CPM) values from the count data.
calculateCPM(object, exprs_values = "counts", use_size_factors = TRUE, size_factor_grouping = NULL, subset_row = NULL)
object |
A SingleCellExperiment object or count matrix. |
exprs_values |
A string specifying the assay of |
use_size_factors |
A logical scalar indicating whether size factors in |
size_factor_grouping |
A factor to be passed to |
subset_row |
A vector specifying whether the rows of |
If requested, size factors are used to define the effective library sizes. This is done by scaling all size factors such that the mean scaled size factor is equal to the mean sum of counts across all features. The effective library sizes are then used in the denominator of the CPM calculation.
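The rescaling described above can be sketched in base R. This is an illustrative sketch only, not scater's internal implementation; the toy matrix `counts` and the size factor vector `sf` are made-up values introduced for the example.

```r
# Toy count matrix: 3 features (rows) x 2 cells (columns). Values are made up.
counts <- matrix(c(10, 30, 60,
                   20, 80, 100),
                 nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("cell1", "cell2")))

# Hypothetical size factors, one per cell.
sf <- c(cell1 = 0.5, cell2 = 1.5)

# Sum of counts across all features for each cell.
lib_sizes <- colSums(counts)

# Scale the size factors so that their mean equals the mean library size.
eff_lib_sizes <- sf / mean(sf) * mean(lib_sizes)

# Divide each column by its effective library size, expressed in millions.
cpm_vals <- sweep(counts, 2, eff_lib_sizes / 1e6, "/")
```

Without size factors, the same calculation would use `lib_sizes` directly in place of `eff_lib_sizes`, so that each column of the result sums to one million.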
Assuming that object is a SingleCellExperiment:
If use_size_factors=TRUE, size factors are automatically extracted from the object. Note that effective library sizes may be computed differently for features marked as spike-in controls. This is due to the presence of control-specific size factors in object; see normalizeSCE for more details.
If use_size_factors=FALSE, all size factors in object are ignored. The total count for each cell will be used as the library size for all features (endogenous genes and spike-in controls).
If use_size_factors is a numeric vector, it will override any size factors for non-spike-in features in object. The spike-in size factors will still be used for the spike-in transcripts.
If no size factors are available, the library sizes will be used.
If object is a matrix or matrix-like object, size factors will only be used if use_size_factors is a numeric vector. Otherwise, the sum of counts for each cell is directly used as the library size.
Note that the rescaling is performed to the mean sum of counts for all features, regardless of whether subset_row is specified. This ensures that the output of the function with subset_row is equivalent to (but more efficient than) subsetting the output of the function without subset_row.
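The equivalence noted above can be checked with a small base R sketch. It assumes no size factors are in use, so library sizes are plain column sums; the matrix `counts` and the row selection `keep` are hypothetical values, and the code illustrates the principle rather than scater's own implementation.

```r
# Toy count matrix: 3 features (rows) x 2 cells (columns). Values are made up.
counts <- matrix(c(10, 30, 60,
                   20, 80, 100),
                 nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("cell1", "cell2")))

# Hypothetical subset of rows of interest.
keep <- c("g1", "g3")

# Library sizes are always computed from all features.
lib_sizes <- colSums(counts)

# Route 1: compute CPMs for every row, then subset the result.
full_cpm <- sweep(counts, 2, lib_sizes / 1e6, "/")
route1 <- full_cpm[keep, , drop = FALSE]

# Route 2: subset the counts first, but keep the full library sizes.
route2 <- sweep(counts[keep, , drop = FALSE], 2, lib_sizes / 1e6, "/")

# For contrast: recomputing library sizes from the subset gives other values.
naive <- sweep(counts[keep, , drop = FALSE], 2,
               colSums(counts[keep, , drop = FALSE]) / 1e6, "/")

same_result <- isTRUE(all.equal(route1, route2))
naive_differs <- !isTRUE(all.equal(route1, naive))
```

Route 2 avoids dividing rows that are later discarded, which is where the efficiency gain comes from.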
Matrix of CPM values.
data("sc_example_counts")
data("sc_example_cell_info")
example_sce <- SingleCellExperiment(
    list(counts = sc_example_counts),
    colData = sc_example_cell_info)
cpm(example_sce) <- calculateCPM(example_sce, use_size_factors = FALSE)