combineCommonResults {SingleR}    R Documentation
Combine results from multiple runs of classifySingleR (usually against different references) into a single DataFrame.
For each cell, the label from the result with the highest score is used as that cell's combined label.
This assumes that each run of classifySingleR was performed using a common set of marker genes, hence the "Common" in the function name.
combineCommonResults(results)
results    A list of DataFrame prediction results, as returned by classifySingleR.
For each cell, we identify the reference with the highest score across all of its labels.
The “combined label” is then defined as the label assigned to that cell in the highest-scoring reference.
(The same logic is also applied to the first and pruned labels, if available.)
See comments in ?"combine-predictions" for the overall rationale.
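The selection rule described above can be sketched in base R. This is a minimal illustration rather than the package's implementation: the two mock results, their score values and the variable names (res1, res2, best.per.ref, chosen.ref) are all made up for the example, and plain lists stand in for the DataFrame objects.

```r
# Hypothetical per-reference results (illustrative values only).
# Each holds a cells-by-labels score matrix and the label assigned to each cell.
res1 <- list(
    scores = matrix(c(0.9, 0.2,
                      0.3, 0.4,
                      0.1, 0.8),
                    nrow = 3, byrow = TRUE,
                    dimnames = list(paste0("cell", 1:3), c("A", "B"))),
    labels = c("A", "B", "B")
)
res2 <- list(
    scores = matrix(c(0.5, 0.6,
                      0.7, 0.1,
                      0.2, 0.3),
                    nrow = 3, byrow = TRUE,
                    dimnames = list(paste0("cell", 1:3), c("a", "b"))),
    labels = c("b", "a", "b")
)
results <- list(res1, res2)
ncells <- nrow(res1$scores)

# Highest score across all labels for each cell, one column per reference.
best.per.ref <- vapply(results, function(r) apply(r$scores, 1, max), numeric(ncells))

# Index of the reference holding the highest score for each cell.
chosen.ref <- max.col(best.per.ref, ties.method = "first")

# Combined label: the label assigned to the cell in the chosen reference.
all.labels <- vapply(results, function(r) r$labels, character(ncells))
combined <- all.labels[cbind(seq_len(ncells), chosen.ref)]
```

Here chosen.ref plays the role of the references field and combined the role of the labels field in the returned DataFrame.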
Each result should be generated from training sets that use a common set of genes during classification, i.e., common.genes should be the same in the trained argument to each classifySingleR call.
This is because the scores are not comparable across results if they were generated from different sets of genes.
It is also for this reason that we use the highest score prior to fine-tuning,
even if it does not correspond to the score of the fine-tuned label.
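As a rough illustration of this precondition, the following base R sketch checks whether several gene sets are identical before their scores are compared. The helper same.genes and the gene names are hypothetical; how the gene set is obtained from each trained object is not shown here and is left as an assumption.

```r
# Hypothetical gene sets used for classification against each reference.
genes.used <- list(
    ref1 = c("GeneA", "GeneB", "GeneC"),
    ref2 = c("GeneC", "GeneA", "GeneB"),
    ref3 = c("GeneA", "GeneB", "GeneD")
)

# Hypothetical helper: TRUE only if every gene set matches the first,
# ignoring order and duplicates.
same.genes <- function(gene.sets) {
    first <- sort(unique(gene.sets[[1]]))
    all(vapply(gene.sets, function(g) identical(sort(unique(g)), first), logical(1)))
}

same.genes(genes.used[1:2])  # ref1 and ref2 share the same genes
same.genes(genes.used)       # ref3 differs, so scores would not be comparable
```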
It is highly unlikely that this function will be called directly by the end-user.
Users are advised to use the multi-reference mode of SingleR and related functions, which will take care of the use of a common set of genes before calling this function to combine results across references.
A DataFrame is returned containing the annotation statistics for each cell or cluster (row).
This mimics the output of classifySingleR and contains the following fields:
scores, a numeric matrix of correlations formed by combining the equivalent matrices from results.
labels, a character vector containing the per-cell combined label across references.
references, an integer vector specifying the reference from which the combined label was derived.
orig.results, a DataFrame containing results.
It may also contain first.labels and pruned.labels if these were also present in results.
The metadata contains common.genes, a character vector of the common genes that were used across all references in results.
Jared Andrews, Aaron Lun
SingleR and classifySingleR, for generating predictions to use in results.
combineRecomputedResults, for another approach to combining predictions.
# Making up data (using one reference to seed another).
ref <- .mockRefData(nreps=8)
ref1 <- ref[,1:2%%2==0]
ref2 <- ref[,1:2%%2==1]
ref2$label <- tolower(ref2$label)

test <- .mockTestData(ref1)

# Applying classification with SingleR's multi-reference mode.
ref1 <- scater::logNormCounts(ref1)
ref2 <- scater::logNormCounts(ref2)
test <- scater::logNormCounts(test)

pred <- SingleR(test, list(ref1, ref2), labels=list(ref1$label, ref2$label))
pred[,1:5] # Only viewing the first 5 columns for visibility.