isOutlier {scater}    R Documentation

Identify outlier values

Description

Convenience function to determine which values in a numeric vector are outliers based on the median absolute deviation (MAD).

Usage

isOutlier(metric, nmads = 5, type = c("both", "lower", "higher"),
  log = FALSE, subset = NULL, batch = NULL, min_diff = NA)

Arguments

metric

Numeric vector of values.

nmads

A numeric scalar specifying the minimum number of MADs away from the median required for a value to be called an outlier.

type

String indicating whether outliers should be looked for at both tails ("both"), only at the lower tail ("lower") or the upper tail ("higher").

log

Logical scalar, should the values of the metric be transformed to the log10 scale before computing MADs?

subset

Logical or integer vector, which subset of values should be used to calculate the median/MAD? If NULL, all values are used. Missing values will trigger a warning and will be automatically ignored.

batch

Factor of length equal to metric, specifying the batch to which each observation belongs. A median/MAD is calculated for each batch, and outliers are then identified within each batch.

min_diff

A numeric scalar indicating the minimum difference from the median to consider as an outlier. The outlier threshold is defined from the larger of nmads MADs and min_diff, to avoid calling many outliers when the MAD is very small. If NA, it is ignored.
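
To make the interaction between nmads, type, log, subset and min_diff concrete, here is a minimal base-R sketch of the thresholding logic described by the arguments above. It is not the scater implementation: the helper name mad_outlier_sketch is hypothetical, and the use of stats::mad() with its default scaling constant is an assumption.

```r
# Hypothetical helper, not part of scater: a simplified MAD-based outlier call.
mad_outlier_sketch <- function(metric, nmads = 5,
                               type = c("both", "lower", "higher"),
                               log = FALSE, subset = NULL, min_diff = NA) {
    type <- match.arg(type)
    if (log) {
        metric <- log10(metric)
    }
    # Median and MAD come from the subset (if given); all values are then tested.
    ref <- if (is.null(subset)) metric else metric[subset]
    med <- median(ref, na.rm = TRUE)
    # Assumption: stats::mad() with its default constant (1.4826) is the MAD in use.
    spread <- mad(ref, center = med, na.rm = TRUE)
    # Take the larger of nmads MADs and min_diff; an NA min_diff is ignored.
    diff_val <- max(nmads * spread, min_diff, na.rm = TRUE)
    lower <- if (type == "higher") -Inf else med - diff_val
    upper <- if (type == "lower") Inf else med + diff_val
    metric < lower | metric > upper
}

x <- c(10, 11, 9, 10, 12, 10, 11, 100)
flag_default <- mad_outlier_sketch(x, nmads = 3)                  # both tails
flag_lower   <- mad_outlier_sketch(x, nmads = 3, type = "lower")  # lower tail only
flag_wide    <- mad_outlier_sketch(x, nmads = 3, min_diff = 200)  # min_diff dominates
flag_subset  <- mad_outlier_sketch(x, nmads = 3, subset = 1:7)    # median/MAD from first 7
```

In this toy vector only the value 100 is flagged with the default settings. Restricting to the lower tail, or setting a min_diff larger than any deviation, flags nothing.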

Details

Lower and upper thresholds are stored in the "threshold" attribute of the returned vector. When batch=NULL, this is a numeric vector of length 2 holding the lower and upper thresholds. Otherwise, it is a matrix with one named column per level of batch and two rows (one per threshold).
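
The shape of the per-batch thresholds can be illustrated with a short base-R sketch. This is not the scater code: the helper name batch_thresholds_sketch and the row names "lower" and "higher" are assumptions made for illustration, and the min_diff, log and subset handling is left out.

```r
# Hypothetical helper, not part of scater: median +/- nmads * MAD for each batch.
batch_thresholds_sketch <- function(metric, batch, nmads = 5) {
    batch <- as.factor(batch)
    # One column per batch level, two rows (lower and upper threshold).
    vapply(levels(batch), function(b) {
        vals <- metric[batch == b]
        med <- median(vals, na.rm = TRUE)
        spread <- mad(vals, center = med, na.rm = TRUE)
        c(lower = med - nmads * spread, higher = med + nmads * spread)
    }, c(lower = 0, higher = 0))
}

metric <- c(1, 2, 3, 2, 50, 101, 102, 103, 102, 10)
batch <- factor(rep(c("a", "b"), each = 5))
thr <- batch_thresholds_sketch(metric, batch, nmads = 3)

# Compare each observation against the thresholds of its own batch.
idx <- as.character(batch)
outlier <- metric < unname(thr["lower", idx]) | metric > unname(thr["higher", idx])
attr(outlier, "threshold") <- thr
```

Here the value 50 is an outlier within batch "a" and the value 10 within batch "b", although neither would stand out against a single median/MAD computed across both batches. With the real function, the corresponding matrix would be retrieved with attr(result, "threshold").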

Value

A logical vector of the same length as the metric argument, specifying the observations that are considered as outliers.

Author(s)

Aaron Lun

Examples

library(scater)

data("sc_example_counts")
data("sc_example_cell_info")
example_sce <- SingleCellExperiment(
    assays = list(counts = sc_example_counts), 
    colData = sc_example_cell_info
)
example_sce <- calculateQCMetrics(example_sce)

## with a set of feature controls defined
example_sce <- calculateQCMetrics(example_sce,
    feature_controls = list(set1 = 1:40))
isOutlier(example_sce$total_counts, nmads = 3)


[Package scater version 1.12.2 Index]