filterTrajFeaturesByFF {CellTrails} | R Documentation
Filters trajectory features that exhibit a significantly high fano factor (index of dispersion) by considering average expression levels.
filterTrajFeaturesByFF(sce, threshold = 1.7, min_expr = 0, design = NULL, show_plot = TRUE)
sce | A SingleCellExperiment object
threshold | A Z-score cutoff (default: 1.7)
min_expr | Minimum average expression of a feature to be considered (default: 0)
design | A numeric matrix describing the factors that should be blocked (default: NULL)
show_plot | Indicates whether the plot should be shown (default: TRUE)
To identify the most variable features, an unsupervised strategy
that controls for the relationship between a feature's average expression
intensity and its expression variability is applied. Features are placed
into 20 bins based on their mean expression. For each bin the fano factor
(a windowed version of the index of dispersion, IOD = variance / mean)
distribution is computed and standardized
(Z-score(x) = x/sd(x) - mean(x)/sd(x)).
Features with a Z-score greater than threshold remain labeled as
trajectory features in the SingleCellExperiment object. The parameter min_expr
defines the minimum average expression level of a feature to be
considered for this filter method. Please note that spike-in controls are
ignored and are not listed as trajectory features.
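The binning and standardization described above can be sketched in base R. This is a minimal illustration, not the CellTrails implementation: the helper fano_zscore_filter is hypothetical, and both the equal-frequency binning and the strict comparison against min_expr are assumptions, since the text does not specify either. Blocking by a design matrix and spike-in handling are omitted.

```r
# Hypothetical helper (not part of CellTrails): returns the names of features
# whose Fano factor Z-score, computed within their mean-expression bin,
# exceeds the threshold.
fano_zscore_filter <- function(x, threshold = 1.7, min_expr = 0, n_bins = 20) {
  stopifnot(is.matrix(x), !is.null(rownames(x)))
  mu <- rowMeans(x)
  keep <- mu > min_expr               # assumption: strict inequality
  ff <- apply(x, 1, var) / mu         # Fano factor = variance / mean
  # Assumption: equal-frequency bins over the mean expression
  brks <- unique(quantile(mu[keep], probs = seq(0, 1, length.out = n_bins + 1)))
  bin <- cut(mu[keep], breaks = brks, include.lowest = TRUE)
  # Standardize the Fano factors within each bin
  z <- rep(NA_real_, nrow(x))
  z[keep] <- ave(ff[keep], bin, FUN = function(v) (v - mean(v)) / sd(v))
  rownames(x)[which(z > threshold)]
}

# Toy data: 2000 features x 50 samples of Poisson counts
set.seed(1)
lambda <- runif(2000, min = 0.5, max = 20)
x <- matrix(rpois(2000 * 50, lambda = rep(lambda, 50)), nrow = 2000)
rownames(x) <- paste0("feature", seq_len(nrow(x)))

selected <- fano_zscore_filter(x, threshold = 1.7)
length(selected)  # number of features passing the cutoff
```

Standardizing within bins rather than across all features keeps highly expressed features from dominating the selection simply because their dispersion scales with the mean.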
To account for systematic bias in the expression data
(e.g., cell cycle effects), a design matrix can be provided for the
learning process. It should list the factors that should be blocked and
their values per sample. It is suggested to construct a design matrix
with model.matrix.
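As a rough sketch of how such a design matrix might be built, the snippet below encodes a per-sample cell cycle phase with model.matrix. The cell_cycle annotation is made up for illustration, and whether an intercept column should be included is an assumption that should be checked against the package vignette.

```r
# Hypothetical per-sample annotation: one cell cycle phase per sample
cell_cycle <- factor(rep(c("G1", "S", "G2M"), length.out = 100))

# One row per sample; intercept plus one indicator column per non-reference phase
design <- model.matrix(~ cell_cycle)
dim(design)

# The matrix would then be passed via the design argument, e.g.:
# trajFeatureNames(sce) <- filterTrajFeaturesByFF(sce, threshold = 1.7,
#                                                 design = design)
```

The number of rows in the design matrix has to match the number of samples in the SingleCellExperiment object.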
A character vector
Daniel C. Ellwanger
trajFeatureNames
isSpike
model.matrix
# Simulate example data
set.seed(1101)
dat <- simulate_exprs(n_features=15000, n_samples=100)

# Create container
alist <- list(logcounts=dat)
sce <- SingleCellExperiment(assays=alist)

# Filter incrementally
trajFeatureNames(sce) <- filterTrajFeaturesByDL(sce, threshold=2)
trajFeatureNames(sce) <- filterTrajFeaturesByCOV(sce, threshold=0.5)
trajFeatureNames(sce) <- filterTrajFeaturesByFF(sce, threshold=1.7)

# Number of features
length(trajFeatureNames(sce)) # filtered
nrow(sce) # total