plotScoreDistribution {SingleR}    R Documentation

Plot score distributions of labels.

Description

Plot score distributions of labels.

Usage

plotScoreDistribution(
  results,
  show = c("scores", "delta.med", "delta.next"),
  labels.use = colnames(results$scores),
  scores.use = NULL,
  calls.use = 0,
  pruned.use = 0,
  size = 0.5,
  ncol = 5,
  dots.on.top = TRUE,
  this.color = "#F0E442",
  pruned.color = "#E69F00",
  other.color = "gray60",
  show.nmads = 3,
  show.min.diff = NULL,
  grid.vars = list()
)

Arguments

results

A DataFrame containing the output from SingleR, classifySingleR, combineCommonResults, or combineRecomputedResults.

show

String specifying whether to show the scores ("scores"), the difference from the median ("delta.med") or the difference from the next-best score ("delta.next").

labels.use

String vector indicating one or more labels to show. If NULL, all labels available in results are presented.

scores.use

Integer scalar specifying which scores to use. This can be any column index of results$orig.results, to use scores from an individual reference of a combined prediction (see ?combine-predictions), or 0, to use the top-level results.

Alternatively, scores.use can be an integer vector containing multiple such column indices (or zero). In such cases, multiple plots will be created showing multiple sets of scores.

The default, scores.use=NULL, creates plots for all targets that make sense: when show="scores", plots are created for the top-level results and all individual references; when show="delta.med" or "delta.next", plots are created for all individual references.

calls.use, pruned.use

Integer scalar specifying which chosen labels or pruning calls to use, defaulting to those from the top-level results which is indicated by the value 0. However, this can also refer to any column index of results$orig.results to use labels from individual references of a combined prediction (see ?combine-predictions).

Alternatively, an integer vector of the same length as scores.use, specifying the labels to use in each plot generated by scores.use.

size

Numeric scalar to set the size of the dots.

ncol

Integer scalar to set the number of labels to display per row.

dots.on.top

Logical specifying whether cell dots should be plotted on top of the violin plots.

this.color

String specifying the color for cells that were assigned to the label.

pruned.color

String specifying the color for cells that were assigned to the label but pruned.

other.color

String specifying the color for other cells not assigned to the label.

show.nmads

Numeric scalar specifying the nmads threshold to display, i.e., the threshold that would be used for pruning with pruneScores. Only used when show="delta.med". Set to NULL to hide the threshold.

show.min.diff

Numeric scalar specifying the min.diff threshold to display, i.e., the threshold that would be used for pruning with pruneScores. Only used when show="delta.med" or "delta.next".

grid.vars

Named list of extra variables to pass to grid.arrange, used when scores.use is of length greater than 1. If NULL, the function will not arrange plots in a grid and will instead return them as a list. Printing that list outputs the plots one after another on the graphics device.

Details

This function creates jitter and violin plots showing assignment scores or related values for all cells across one or more labels. It is intended for visualizing and adjusting the nmads, min.diff.med, and min.diff.next cutoffs of the pruneScores function, or for comparing scores across predictions when multiple references were used (see ?combine-predictions).

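The show argument determines what values to show on the y-axis. Options are:

"scores", the assignment scores.

"delta.med", the difference from the median score for each cell.

"delta.next", the difference from the next-best score for each cell.

For a given label X, cell distributions in several categories are shown:

Cells that were assigned to label X (this.color).

Cells that were assigned to label X but pruned (pruned.color).

Other cells that were not assigned to label X (other.color).

Each category is grouped and colored separately based on this.color and related parameters.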

Values are stratified according to the assigned labels in results$labels. If any fine-tuning was performed, the highest scoring label for an individual cell may not be its final label. This may manifest as negative values when show="delta.med".
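As a rough illustration of how such negative values can arise, the following base R sketch computes a delta from the median on a made-up score matrix. The matrix, the labels and all variable names are hypothetical, and the calculation (the assigned label's score minus the per-cell median) is an assumption about what "delta.med" represents, not the package's actual implementation.

```r
# Toy score matrix: 3 cells (rows) by 3 labels (columns). Values are made up.
scores <- matrix(
    c(0.60, 0.40, 0.30,
      0.50, 0.52, 0.51,
      0.10, 0.30, 0.35),
    nrow = 3, byrow = TRUE,
    dimnames = list(paste0("cell", 1:3), c("A", "B", "C"))
)

# Hypothetical final labels. For cell2 the final label ("A") is not the
# top-scoring label ("B"), as might happen after fine-tuning.
labels <- c("A", "A", "C")

# Score of each cell for its own assigned label.
idx <- cbind(seq_len(nrow(scores)), match(labels, colnames(scores)))
assigned <- scores[idx]

# Assumed definition: assigned score minus the median score of that cell.
per.cell.median <- apply(scores, 1, median)
delta.med <- assigned - per.cell.median

print(delta.med)
```

In this toy input, cell2 is the only cell whose assigned label scores below its own median, so it is the only cell with a negative delta.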

Also note that pruneScores trims based on the min.diff.med and min.diff.next cutoffs first, before calculating the first labels' delta medians. Thus, the actual nmads cutoff used in pruneScores may differ from the one portrayed in the plot.
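The effect of this ordering can be sketched in base R. The example below assumes an outlier-style rule in which the cutoff for a label sits nmads median absolute deviations below the median delta of its cells; this rule, the delta values and the variable names are illustrative assumptions and are not taken from the pruneScores source.

```r
# Made-up delta values for cells assigned to a single label.
delta <- c(0.30, 0.28, 0.32, 0.29, 0.31, 0.05)
nmads <- 3
min.diff.med <- 0.1  # hypothetical minimum-difference cutoff

# Assumed rule: cutoff = median - nmads * MAD, computed on all cells.
# This corresponds to what a plot of every cell would suggest.
threshold.all <- median(delta) - nmads * mad(delta)

# If cells failing the minimum-difference cutoff are removed first,
# the median and MAD change, and so does the resulting cutoff.
kept <- delta[delta >= min.diff.med]
threshold.kept <- median(kept) - nmads * mad(kept)

# Cells falling below the cutoff computed on all cells.
below <- delta < threshold.all

print(c(all = threshold.all, after.min.diff = threshold.kept))
```

With these values the two cutoffs differ, which is why the line drawn in the plot should be read as a guide and not as the exact value applied during pruning.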

Value

One or more ggplot objects showing assignment scores in violin plots are generated on the current graphics device. If scores.use is of length greater than 1 and grid.vars is set to NULL, the ggplot objects are instead returned as a list.

Working with combined results

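When results is the output of a combined prediction (see ?combine-predictions), scores.use, calls.use, and pruned.use indicate which prediction's scores, chosen labels, and pruning calls should be used.

Values of these inputs can be:

0, to use the top-level results of the combined prediction.

Any column index of results$orig.results, to use the results from the corresponding individual reference.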

Author(s)

Daniel Bunis and Aaron Lun

See Also

SingleR, to generate scores.

pruneScores, to remove low-quality labels based on the scores, and to see more about the quality cutoffs.

grid.arrange, for tweaks to how plots are arranged when multiple are output together.

Examples

example(SingleR, echo=FALSE)

# To show the distribution of scores grouped by label:
plotScoreDistribution(results = pred)
# We can display a particular label using 'labels.use':
plotScoreDistribution(results = pred,
    labels.use = "B")

# To show the distribution of deltas between cells' maximum and median scores,
#   grouped by label, change 'show' to "delta.med":
#   This is useful for checking/adjusting nmads and min.diff.med cutoffs
plotScoreDistribution(results = pred,
    show = "delta.med")

# To show the distribution of deltas between cells' top 2 fine-tuning scores,
#   grouped by label, change 'show' to "delta.next":
#   This is useful for checking/adjusting min.diff.next cutoffs
plotScoreDistribution(results = pred, show = "delta.next")


### Visualizing and adjusting pruning cutoffs ###

# The default nmads cutoff of 3 is displayed when 'show = "delta.med"', but
# this can be adjusted or turned off with 'show.nmads'
plotScoreDistribution(results = pred,
    show = "delta.med", show.nmads = 2)
plotScoreDistribution(results = pred,
    show = "delta.med", show.nmads = NULL)

# A min.diff cutoff can be shown using 'show.min.diff' when
# 'show = "delta.med"' or 'show = "delta.next"'
plotScoreDistribution(results = pred,
    show = "delta.med", show.min.diff = 0.03)
plotScoreDistribution(results = pred,
    show = "delta.next", show.min.diff = 0.03)


### Multi-Reference Compatibility ###

# When SingleR is run with multiple references, default output will contain
# separate plots for each original reference, as well as for the combined
# set when 'show = "scores"'.
example(combineRecomputedResults, echo = FALSE)
plotScoreDistribution(results = combined)

# 'scores.use' sets which original results to plot distributions for, and can
# be multiple or NULL (default)
plotScoreDistribution(results = combined, show = "scores",
    scores.use = 0)
plotScoreDistribution(results = combined, show = "scores",
    scores.use = 1:2)

# To color and group cells by non-final label and pruned calls,
# use 'calls.use' and 'pruned.use'
plotScoreDistribution(results = combined, show = "scores",
    calls.use = 1, pruned.use = 1)

# To have plots output in a grid rather than as separate pages, provide
# a list of inputs for gridExtra::grid.arrange() to 'grid.vars'.
plotScoreDistribution(combined,
    grid.vars = list(ncol = 1))

# An empty list will use grid.arrange defaults
plotScoreDistribution(combined,
    grid.vars = list())


[Package SingleR version 1.2.4 Index]