1 Introduction

Comprehensive quality control (QC) of single-cell RNA-seq data was performed with the singleCellTK package. This report contains information about each QC tool and visualization of the QC metrics for each sample. For more information on running this pipeline and performing quality control, see the documentation. If you use the singleCellTK package for quality control, please include a reference in your publication.

2 Summary Statistics

Metric                                  Total
Number of Cells                         2700
Mean counts                             2366.9
Median counts                           2197
Mean features detected                  846.99
Median features detected                817
scDblFinder - Number of doublets        87
scDblFinder - Percentage of doublets    3.22
DecontX - Mean contamination            0.0808
DecontX - Median contamination          0.055

The summary statistics table summarizes QC metrics of the cell matrix. It reports the mean and median UMI counts per cell, the mean and median number of features detected per cell, the number and percentage of doublets, and the mean and median estimated ambient RNA contamination for the dataset.
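As a quick consistency check, the doublet percentage in the table can be recomputed from the two counts it is derived from. The snippet below is only an illustrative sketch in Python; the variable names are hypothetical and it is not part of the singleCellTK pipeline.

```python
# Values copied from the summary statistics table.
n_cells = 2700
n_doublets = 87

# Percentage of cells called as doublets, rounded to two decimals as in the table.
pct_doublets = round(100 * n_doublets / n_cells, 2)
print(pct_doublets)  # 3.22
```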

3 General quality control metrics

singleCellTK uses the scater package to compute cell-level QC metrics. The wrapper function runPerCellQC can be used to compute these metrics on its own, and the wrapper function plotRunPerCellQCResults can be used to plot the general QC outputs. The QC outputs are sum, detected, and percent_top_X. sum contains the total number of counts for each cell. detected contains the number of features detected in each cell. percent_top_X contains the percentage of a cell's total counts that is accounted for by its top X most highly expressed genes. The subsets_ columns contain the corresponding information for the specific gene list that was used. For instance, if a gene list containing mitochondrial genes named mito was used, subsets_mito_sum would contain the total number of mitochondrial counts for each cell.
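To make these definitions concrete, the following Python sketch computes the same kinds of metrics for a single toy cell. It is a simplified illustration, not the scater or singleCellTK implementation: the function name per_cell_qc, the toy gene names and counts, and the small top-N values are all assumptions chosen for readability. The subsets_mito_detected and subsets_mito_percent names follow the same naming pattern as subsets_mito_sum.

```python
def per_cell_qc(counts, gene_names, subset_genes, top_n=(2, 3), detection_limit=0):
    """Compute scater-style QC metrics for one cell (illustrative only)."""
    total = sum(counts)
    metrics = {
        "sum": total,
        # A feature counts as detected when its count exceeds the limit.
        "detected": sum(1 for c in counts if c > detection_limit),
    }

    # percent_top_X: share of the library taken up by the X largest counts.
    ranked = sorted(counts, reverse=True)
    for n in top_n:
        metrics[f"percent_top_{n}"] = 100 * sum(ranked[:n]) / total

    # subsets_* columns: the same metrics restricted to one gene list.
    subset_counts = [c for c, g in zip(counts, gene_names) if g in subset_genes]
    metrics["subsets_mito_sum"] = sum(subset_counts)
    metrics["subsets_mito_detected"] = sum(
        1 for c in subset_counts if c > detection_limit
    )
    metrics["subsets_mito_percent"] = 100 * sum(subset_counts) / total
    return metrics


# Hypothetical toy cell with five genes, two of them mitochondrial.
genes = ["MT-CO1", "MT-ND1", "CD3E", "LYZ", "GAPDH"]
counts = [10, 0, 5, 25, 10]
qc = per_cell_qc(counts, genes, subset_genes={"MT-CO1", "MT-ND1"})
print(qc)
```

In this toy cell the library size is 50 and four of the five genes are detected, so the two largest counts (25 and 10) make up 70% of the library and the mitochondrial genes contribute 20%.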

3.1 Total Counts

3.2 Total Features

3.3 Percentage of Library Size Occupied by Top 50 Expressed Features

3.4 Total Mitochondrial Counts

3.5 Total Mitochondrial Features

3.6 Percentage of Mitochondrial Counts

3.7 Parameters

useAssay counts
collectionName mito
geneSetList NULL
geneSetListLocation rownames
percent_top 50 100 200 500
use_altexps FALSE
flatten TRUE
detectionLimit 0
packageVersion 1.20.1

In this function, the inSCE parameter is the input SingleCellExperiment object, while the useAssay parameter names the assay within that SingleCellExperiment object that the user wishes to use.

4 Doublet Detection

4.1 ScDblFinder

scDblFinder is a doublet detection algorithm in the scDblFinder package. scDblFinder aims to detect doublets by creating simulated doublets from existing cells and projecting them into the same PCA space as the cells. The wrapper function runScDblFinder can be used to run the scDblFinder algorithm on its own. The wrapper function plotScDblFinderResults can be used to plot the QC outputs from the scDblFinder algorithm. The outputs of scDblFinder are scDblFinder_doublet_score and scDblFinder_doublet_call. The doublet score of a droplet will be higher if it is deemed likely to be a doublet.
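The idea of building simulated doublets from existing cells can be sketched as follows. This is a minimal illustration in Python that simply sums the counts of two randomly chosen cells; it is an assumption about the general approach, not the scDblFinder implementation, and the function name simulate_doublets and the toy count vectors are hypothetical.

```python
import random


def simulate_doublets(cells, n_doublets, seed=12345):
    """Create artificial doublets by adding the counts of two distinct cells."""
    rng = random.Random(seed)  # fixed seed so the simulation is reproducible
    doublets = []
    for _ in range(n_doublets):
        a, b = rng.sample(range(len(cells)), 2)  # two different cells
        doublets.append([x + y for x, y in zip(cells[a], cells[b])])
    return doublets


# Hypothetical count vectors for four cells over three genes.
cells = [[5, 0, 1], [0, 7, 2], [3, 3, 0], [1, 0, 9]]
doublets = simulate_doublets(cells, n_doublets=6)
print(doublets)
```

Each simulated doublet has the combined library size of its two parent cells, which is why real droplets that resemble these artificial profiles tend to receive higher doublet scores.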

4.1.1 pbmc3k

4.1.1.1 ScDblFinder Doublet Score

4.1.1.2 Density Score

4.1.1.3 Violin Score

4.1.1.4 Parameters

useAssay counts
nNeighbors 50
simDoublets 10000
seed 12345
packageVersion 1.6.0

The nNeighbors parameter is the number of nearest neighbors used to calculate the density for doublet detection. The simDoublets parameter sets the number of simulated doublets used for doublet detection.
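One simple way to picture the role of the neighborhood size is a score equal to the fraction of a cell's nearest neighbors that are simulated doublets. The Python sketch below implements only that simplified idea on hypothetical 2D coordinates; the function name knn_doublet_fraction is an assumption, and the actual scDblFinder scoring is more involved than this.

```python
import math


def knn_doublet_fraction(real, simulated, k):
    """For each real point, fraction of its k nearest neighbours that are simulated."""
    labelled = [(p, False) for p in real] + [(p, True) for p in simulated]
    scores = []
    for i, p in enumerate(real):
        # Distances to every other point, keeping track of the simulated flag.
        others = [
            (math.dist(p, q), is_sim)
            for j, (q, is_sim) in enumerate(labelled)
            if j != i
        ]
        others.sort(key=lambda pair: pair[0])
        nearest = others[:k]
        scores.append(sum(is_sim for _, is_sim in nearest) / k)
    return scores


# Hypothetical reduced-dimension coordinates: three cells in one cluster,
# one cell sitting among the simulated doublets.
real = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
simulated = [(5.1, 5.0), (5.0, 5.1), (4.9, 5.0)]
scores = knn_doublet_fraction(real, simulated, k=2)
print(scores)  # [0.0, 0.0, 0.0, 1.0]
```

A larger k averages over a wider neighborhood, which smooths the score at the cost of blurring small clusters.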

5 Ambient RNA Detection

5.1 DecontX

In droplet-based single-cell technologies, ambient RNA released from apoptotic or damaged cells may be incorporated into another droplet and lead to contamination. decontX, available from the celda package, is a Bayesian method that estimates the level of contamination in each cell. The wrapper function runDecontX can be used to run the DecontX algorithm on its own. The wrapper function plotDecontXResults can be used to plot the QC outputs from the DecontX algorithm. The outputs of runDecontX are decontX_contamination and decontX_clusters. decontX_contamination is a numeric vector that characterizes the level of contamination in each cell. Clustering is performed as part of the runDecontX algorithm. decontX_clusters is the resulting cluster assignment, which can also be labeled on the plot.
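The mean and median contamination values reported in the summary statistics are simple summaries of the per-cell decontX_contamination vector. The Python sketch below shows that summary step on hypothetical values; the numbers and variable names are invented for illustration and are not taken from this dataset.

```python
import statistics

# Hypothetical per-cell contamination fractions (each between 0 and 1).
contamination = [0.02, 0.05, 0.06, 0.30]

mean_contamination = statistics.mean(contamination)
median_contamination = statistics.median(contamination)

print(round(mean_contamination, 4), round(median_contamination, 4))
```

A mean well above the median, as in this toy vector, suggests that a small number of heavily contaminated cells are pulling the average up.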

5.1.1 pbmc3k

5.1.1.1 DecontX Contamination Score

5.1.1.2 Density Score

5.1.1.3 Violin Score

5.1.1.4 DecontX Clusters

5.1.1.5 Parameters

useAssay counts
z NULL
maxIter 500
delta 10 10
estimateDelta TRUE
convergence 0.001
iterLogLik 10
varGenes 5000
dbscanEps 1
seed 12345
logfile NULL
verbose TRUE
packageVersion 1.8.1

6 Session Information

## R version 4.1.2 (2021-11-01)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.3 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] dplyr_1.0.7                 ggplot2_3.3.5               TENxPBMCData_1.10.0         HDF5Array_1.20.0           
##  [5] rhdf5_2.36.0                singleCellTK_2.4.0          DelayedArray_0.18.0         Matrix_1.4-0               
##  [9] SingleCellExperiment_1.14.1 SummarizedExperiment_1.22.0 Biobase_2.52.0              GenomicRanges_1.44.0       
## [13] GenomeInfoDb_1.28.1         IRanges_2.26.0              S4Vectors_0.30.0            BiocGenerics_0.38.0        
## [17] MatrixGenerics_1.4.0        matrixStats_0.59.0         
## 
## loaded via a namespace (and not attached):
##   [1] rappdirs_0.3.3                MCMCprecision_0.4.0           scattermore_0.7              
##   [4] R.methodsS3_1.8.1             SeuratObject_4.0.2            tidyr_1.1.3                  
##   [7] bit64_4.0.5                   knitr_1.33                    irlba_2.3.3                  
##  [10] R.utils_2.10.1                data.table_1.14.0             rpart_4.1-15                 
##  [13] KEGGREST_1.32.0               RCurl_1.98-1.3                doParallel_1.0.16            
##  [16] generics_0.1.0                ScaledMatrix_1.0.0            cowplot_1.1.1                
##  [19] RSQLite_2.2.7                 RANN_2.6.1                    combinat_0.0-8               
##  [22] future_1.21.0                 bit_4.0.4                     webshot_0.5.2                
##  [25] xml2_1.3.2                    spatstat.data_2.1-0           httpuv_1.6.1                 
##  [28] assertthat_0.2.1              viridis_0.6.1                 xfun_0.24                    
##  [31] jquerylib_0.1.4               evaluate_0.14                 promises_1.2.0.1             
##  [34] fansi_0.5.0                   assertive.files_0.0-2         dbplyr_2.1.1                 
##  [37] igraph_1.2.6                  DBI_1.1.1                     htmlwidgets_1.5.3            
##  [40] spatstat.geom_2.2-2           purrr_0.3.4                   ellipsis_0.3.2               
##  [43] RSpectra_0.16-0               annotate_1.70.0               deldir_0.2-10                
##  [46] sparseMatrixStats_1.4.0       vctrs_0.3.8                   ROCR_1.0-11                  
##  [49] abind_1.4-5                   cachem_1.0.5                  RcppEigen_0.3.3.9.1          
##  [52] withr_2.4.2                   GSVAdata_1.28.0               sctransform_0.3.2            
##  [55] scran_1.20.1                  goftest_1.2-2                 svglite_2.0.0                
##  [58] cluster_2.1.2                 ExperimentHub_2.0.0           lazyeval_0.2.2               
##  [61] crayon_1.4.1                  labeling_0.4.2                edgeR_3.34.0                 
##  [64] pkgconfig_2.0.3               nlme_3.1-152                  vipor_0.4.5                  
##  [67] rlang_0.4.11                  globals_0.14.0                lifecycle_1.0.0              
##  [70] miniUI_0.1.1.1                colourpicker_1.1.0            filelock_1.0.2               
##  [73] dbscan_1.1-8                  BiocFileCache_2.0.0           enrichR_3.0                  
##  [76] rsvd_1.0.5                    AnnotationHub_3.0.1           polyclip_1.10-0              
##  [79] lmtest_0.9-38                 graph_1.70.0                  Rhdf5lib_1.14.2              
##  [82] zoo_1.8-9                     beeswarm_0.4.0                ggridges_0.5.3               
##  [85] GlobalOptions_0.1.2           png_0.1-7                     viridisLite_0.4.0            
##  [88] rjson_0.2.20                  bitops_1.0-7                  R.oo_1.24.0                  
##  [91] KernSmooth_2.23-20            rhdf5filters_1.4.0            Biostrings_2.60.1            
##  [94] blob_1.2.1                    DelayedMatrixStats_1.14.0     shape_1.4.6                  
##  [97] stringr_1.4.0                 parallelly_1.26.1             gridGraphics_0.5-1           
## [100] shinyjqui_0.4.0               beachmat_2.8.0                scales_1.1.1                 
## [103] memoise_2.0.0                 GSEABase_1.54.0               magrittr_2.0.1               
## [106] plyr_1.8.6                    ica_1.0-2                     zlibbioc_1.38.0              
## [109] compiler_4.1.2                kableExtra_1.3.4              dqrng_0.3.0                  
## [112] RColorBrewer_1.1-2            fitdistrplus_1.1-5            XVector_0.32.0               
## [115] listenv_0.8.0                 patchwork_1.1.1               pbapply_1.4-3                
## [118] MASS_7.3-54                   mgcv_1.8-38                   tidyselect_1.1.1             
## [121] stringi_1.7.2                 shinyBS_0.61                  highr_0.9                    
## [124] yaml_2.2.1                    assertive.numbers_0.0-2       BiocSingular_1.8.1           
## [127] locfit_1.5-9.4                ggrepel_0.9.1                 grid_4.1.2                   
## [130] sass_0.4.0                    tools_4.1.2                   future.apply_1.7.0           
## [133] circlize_0.4.13               rstudioapi_0.13               bluster_1.2.1                
## [136] foreach_1.5.1                 celda_1.8.1                   metapod_1.0.0                
## [139] gridExtra_2.3                 farver_2.1.0                  assertive.types_0.0-3        
## [142] Rtsne_0.15                    DropletUtils_1.12.1           digest_0.6.27                
## [145] BiocManager_1.30.16           FNN_1.1.3                     shiny_1.6.0                  
## [148] Rcpp_1.0.7                    scuttle_1.2.0                 BiocVersion_3.13.1           
## [151] later_1.2.0                   RcppAnnoy_0.0.18              httr_1.4.2                   
## [154] AnnotationDbi_1.54.1          assertive.properties_0.0-4    colorspace_2.0-2             
## [157] rvest_1.0.0                   XML_3.99-0.6                  tensor_1.5                   
## [160] reticulate_1.20               splines_4.1.2                 uwot_0.1.10                  
## [163] statmod_1.4.36                spatstat.utils_2.2-0          scater_1.20.1                
## [166] xgboost_1.4.1.1               systemfonts_1.0.2             plotly_4.9.4.1               
## [169] shinyalert_2.0.0              xtable_1.8-4                  assertive.base_0.0-9         
## [172] jsonlite_1.7.2                R6_2.5.0                      pillar_1.6.1                 
## [175] htmltools_0.5.1.1             mime_0.11                     glue_1.4.2                   
## [178] fastmap_1.1.0                 DT_0.18                       BiocParallel_1.26.1          
## [181] BiocNeighbors_1.10.0          interactiveDisplayBase_1.30.0 codetools_0.2-18             
## [184] fishpond_1.8.0                utf8_1.2.1                    bslib_0.2.5.1                
## [187] lattice_0.20-45               spatstat.sparse_2.0-0         tibble_3.1.2                 
## [190] multipanelfigure_2.1.2        curl_4.3.2                    ggbeeswarm_0.6.0             
## [193] leiden_0.3.8                  scDblFinder_1.6.0             gtools_3.9.2                 
## [196] magick_2.7.2                  shinyjs_2.0.0                 survival_3.2-13              
## [199] limma_3.48.1                  rmarkdown_2.9                 munsell_0.5.0                
## [202] GenomeInfoDbData_1.2.6        iterators_1.0.13              reshape2_1.4.4               
## [205] gtable_0.3.0                  spatstat.core_2.2-0           Seurat_4.0.3