makePerCellDF {scater}    R Documentation

Create a per-cell data.frame from a SingleCellExperiment

Description

Create a per-cell data.frame (i.e., where each row represents a cell) from a SingleCellExperiment, most typically for creating custom ggplot2 plots.

Usage

makePerCellDF(
  x,
  features = NULL,
  exprs_values = "logcounts",
  use_dimred = TRUE,
  use_altexps = FALSE,
  prefix_altexps = FALSE,
  check_names = FALSE
)

Arguments

x

A SingleCellExperiment object. This is expected to have non-NULL row names.

features

Character vector specifying the features for which to extract expression profiles across cells. May also include features in alternative Experiments if permitted by use_altexps.

exprs_values

String or integer scalar indicating the assay to use to obtain expression values. Must refer to a matrix-like object with integer or numeric values.

use_dimred

Logical scalar indicating whether data should be extracted for dimensionality reduction results in x. Alternatively, a character or integer vector specifying the dimensionality reduction results to use.

use_altexps

Logical scalar indicating whether (meta)data should be extracted for alternative experiments in x. Alternatively, a character or integer vector specifying the alternative experiments to use.

prefix_altexps

Logical scalar indicating whether altExp-derived fields should be prefixed with the name of the alternative Experiment.

check_names

Logical scalar indicating whether the column names of the output data.frame should be made syntactically valid and unique.

Details

This function conveniently creates a per-cell data.frame from a SingleCellExperiment. Each row of the returned data.frame corresponds to a column (i.e., cell) in x, while each column of the data.frame corresponds to one aspect of the (meta)data in x. Columns are provided in the following order:

  1. Columns named according to the values in features represent the expression values across cells for the specified feature in the exprs_values assay.

  2. Columns named according to the columns of colData(x) represent the column metadata variables.

  3. If use_dimred=TRUE, columns named in the format of <DIM>.<NUM> represent the <NUM>th dimension of the dimensionality reduction result <DIM>.

  4. If use_altexps=TRUE, columns are named according to the row names and column metadata fields of successive alternative Experiments, representing the assay data and metadata respectively in these objects. The names of these columns are prefixed with the name of the alternative Experiment if prefix_altexps=TRUE. Note that alternative Experiment rows will only be present if they are specified in features.
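The column ordering described above can be checked on a mock dataset. The following is a minimal sketch, assuming that mockSCE() (as used in the Examples) provides genes named like Gene_0001 and that runPCA() stores its result under the name "PCA"; the object name sce is purely illustrative.

```r
library(scater)

set.seed(100)
sce <- mockSCE()
sce <- logNormCounts(sce)
sce <- runPCA(sce)

# Request two features and only the "PCA" dimensionality reduction result.
df <- makePerCellDF(sce,
    features = c("Gene_0001", "Gene_0002"),
    use_dimred = "PCA")

# One row per cell, i.e., per column of the input object.
nrow(df) == ncol(sce)

# Feature columns come first, followed by the colData fields
# and then the PCA.<NUM> columns.
head(colnames(df), 2)
colnames(colData(sce))
grep("^PCA\\.", colnames(df), value = TRUE)
```

Passing a character vector to use_dimred, as done here, restricts the output to the named results rather than extracting every reduced dimension result in x.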

By default, nothing is done to resolve syntactically invalid or duplicated column names; this will often lead (correctly) to an error in downstream functions like ggplot. If check_names=TRUE, this is resolved by passing the column names through make.names. Of course, as a result, some columns may not have the same names as the original fields in x.
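To illustrate what check_names=TRUE is likely to do to problematic names, the sketch below applies base R's make.names directly to a few hypothetical column names. It assumes that uniqueness is enforced via unique=TRUE, which the description of check_names suggests but which is not stated explicitly here.

```r
# Hypothetical column names: a duplicated name containing a hyphen,
# a name starting with a digit, and an already valid name.
raw_names <- c("CD3-E", "CD3-E", "1st_marker", "PCA.1")

# Invalid characters become ".", a leading digit gains an "X" prefix,
# and duplicates are disambiguated with a numeric suffix.
fixed_names <- make.names(raw_names, unique = TRUE)
fixed_names
# "CD3.E" "CD3.E.1" "X1st_marker" "PCA.1"
```

Because the sanitized names can differ from the original fields, it may be worth inspecting colnames() of the output before referring to columns by name.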

Value

A data.frame containing one field per aspect of data in x - see Details. Each row corresponds to a cell (i.e., column) of x.
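As a rough illustration of the intended use with ggplot2, the sketch below plots the first two principal components and colours the cells by the expression of one gene. It assumes the mock dataset from the Examples and the column names PCA.1, PCA.2 and Mutation_Status that follow from it; other datasets will have different names.

```r
library(scater)
library(ggplot2)

set.seed(100)
sce <- mockSCE()
sce <- logNormCounts(sce)
sce <- runPCA(sce)

df <- makePerCellDF(sce, features = "Gene_0001")

# Each row of df is a cell, so columns can be mapped directly to aesthetics.
p <- ggplot(df, aes(x = PCA.1, y = PCA.2, colour = Gene_0001)) +
    geom_point() +
    facet_wrap(~Mutation_Status)
p
```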

Author(s)

Aaron Lun

See Also

ggcells, which uses this function under the hood.

Examples

library(scater)

example_sce <- mockSCE()
example_sce <- logNormCounts(example_sce)
example_sce <- runPCA(example_sce)

df <- makePerCellDF(example_sce, features="Gene_0001")
head(colnames(df))
tail(colnames(df))

df$Gene_0001
df$Mutation_Status
df$PCA.1


[Package scater version 1.16.2 Index]