scLang is a suite of tools for developing scRNA-seq analysis
packages. It offers functions that operate on both
Seurat and SingleCellExperiment objects. These
functions are primarily intended to help developers build tools compatible
with both types of input.
To install scLang, run the following commands in an R
session:
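A minimal sketch of the installation step, assuming scLang is distributed through Bioconductor (the repository is an assumption here; adjust the call if you install from another source):

```r
# Assumption: scLang is available from Bioconductor.
# BiocManager itself is installed from CRAN if it is missing.
if (!requireNamespace('BiocManager', quietly=TRUE))
    install.packages('BiocManager')
BiocManager::install('scLang')
```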
This tutorial uses an scRNA-seq human pancreas dataset. After loading
the required packages, download the dataset using the
BaronPancreasData function from scRNAseq. The
dataset will be stored as a SingleCellExperiment object.
library(scLang)
library(scRNAseq)
library(scater)
library(Seurat)
sceObj <- BaronPancreasData('human')
Next, we will normalize and log-transform the data using the
logNormCounts function from scuttle (loaded
automatically with scater):
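A minimal sketch of this step, assuming the default arguments of logNormCounts are adequate for this dataset:

```r
# logNormCounts adds a 'logcounts' assay next to the raw 'counts' assay
sceObj <- logNormCounts(sceObj)

# List the assays now stored in the object
assayNames(sceObj)
```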
We will also need PCA and UMAP dimensions. These can be computed
using the runPCA and runUMAP functions from
scater:
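A minimal sketch of this step, assuming default arguments. UMAP is stochastic, so the seed below is only an illustrative choice to make the embedding reproducible:

```r
# Illustrative seed; any fixed value makes the UMAP layout reproducible
set.seed(1)

# Both functions store their results in the reducedDims slot of the object
sceObj <- runPCA(sceObj)
sceObj <- runUMAP(sceObj)

# List the dimensionality reductions now stored in the object
reducedDimNames(sceObj)
```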
Now we will convert the dataset to a Seurat object:
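A minimal sketch of the conversion, assuming the as.Seurat function from Seurat with its default arguments (the counts and logcounts assays of the SingleCellExperiment object are carried over):

```r
# Convert the SingleCellExperiment object to a Seurat object
seuratObj <- as.Seurat(sceObj)

# Both objects should describe the same cells
ncol(seuratObj) == ncol(sceObj)
```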
The scExpMat function extracts the expression matrix
from a Seurat or SingleCellExperiment
object:
mat1 <- scExpMat(sceObj)
dim(mat1)
#> [1] 20125 8569
mat2 <- scExpMat(seuratObj)
identical(mat1, mat2)
#> [1] TRUE
Note: By default, this function extracts normalized
and log-transformed data, looking for the data assay for a
Seurat object, and for the logcounts assay for
a SingleCellExperiment object. This behavior can be changed
using the dataType parameter.
scExpMat can also take a matrix as an argument. This
option is useful when building functions allowing users to use either a
single-cell expression matrix or an object of a dedicated class
(Seurat, SingleCellExperiment) as input:
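As a rough sketch of this use case, a developer-facing function can simply call scExpMat on whatever input it receives. Here topMeanGenes is a hypothetical helper written for illustration; it is not part of scLang:

```r
library(scLang)

# Hypothetical helper (not part of scLang): returns the names of the
# n genes with the highest mean expression.
topMeanGenes <- function(scObj, n=5) {
    # scExpMat accepts a matrix, a Seurat object or a
    # SingleCellExperiment object
    mat <- scExpMat(scObj)
    geneMeans <- rowMeans(mat)
    nTop <- min(n, length(geneMeans))
    names(sort(geneMeans, decreasing=TRUE))[seq_len(nTop)]
}

# The same call works on all three input types
topMeanGenes(mat1)
topMeanGenes(sceObj)
topMeanGenes(seuratObj)
```

Because the helper never inspects the class of its input, supporting a further input type would only require scExpMat to handle it.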
By default, scExpMat converts the expression matrix to a
dense matrix. If this behavior is not desired, conversion can be skipped
by setting densify to FALSE:
is(mat1)[1]
#> [1] "matrix"
mat2 <- scExpMat(sceObj, densify=FALSE)
is(mat2)[2]
#> [1] "CsparseMatrix"
scExpMat can also extract the expression data only for
selected genes:
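The sketch below obtains a gene-restricted matrix by subsetting the rows of the full matrix, which relies only on standard matrix indexing. The name of the dedicated gene-selection argument of scExpMat is an assumption here (see ?scExpMat), so it appears only in a comment. The gene list is illustrative and is intersected with the row names so that the code does not fail if a gene is absent:

```r
# Illustrative gene list; keep only genes present in the dataset
selGenes <- intersect(c('SOX4', 'INS', 'GCG'), rownames(mat1))

# Restrict the extracted matrix to the selected genes
mat3 <- scExpMat(sceObj)[selGenes, , drop=FALSE]
dim(mat3)

# Assumed equivalent, if the selection argument is named 'genes'
# (not verified here):
# mat3 <- scExpMat(sceObj, genes=selGenes)
```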
The scCol function extracts a column from the metadata
of a Seurat object or the colData of a
SingleCellExperiment object:
col1 <- scCol(seuratObj, 'label')
col2 <- scCol(sceObj, 'label')
identical(col1, col2)
#> [1] TRUE
head(col1)
#> [1] "acinar" "acinar" "acinar" "acinar" "acinar" "acinar"
It can also be used to insert a new column. Here, we just make a
modified copy of the label column for the
Seurat object:
scCol(seuratObj, 'labelCopy') <- paste0(scCol(seuratObj, 'label'), '_copy')
head(seuratObj[['labelCopy']])
#> labelCopy
#> human1_lib1.final_cell_0001 acinar_copy
#> human1_lib1.final_cell_0002 acinar_copy
#> human1_lib1.final_cell_0003 acinar_copy
#> human1_lib1.final_cell_0004 acinar_copy
#> human1_lib1.final_cell_0005 acinar_copy
#> human1_lib1.final_cell_0006 acinar_copy
The metadataDF function extracts the metadata/colData
data frame from a Seurat or
SingleCellExperiment object:
df1 <- metadataDF(seuratObj)
df2 <- metadataDF(sceObj)
identical(df1, df2)
#> [1] FALSE
head(df1)[, c(1, 2)]
#> orig.ident nCount_originalexp
#> human1_lib1.final_cell_0001 human1 22412
#> human1_lib1.final_cell_0002 human1 27953
#> human1_lib1.final_cell_0003 human1 16895
#> human1_lib1.final_cell_0004 human1 19300
#> human1_lib1.final_cell_0005 human1 15067
#> human1_lib1.final_cell_0006 human1 15747
The metadataNames function extracts the column names of
the metadata/colData data frame from a Seurat or SingleCellExperiment
object:
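A minimal sketch of this call, assuming metadataNames takes the object as its only required argument, like metadataDF above:

```r
# Column names of the Seurat metadata and of the SingleCellExperiment colData
names1 <- metadataNames(seuratObj)
names2 <- metadataNames(sceObj)

# Columns present in both objects
intersect(names1, names2)
```

The two vectors need not be identical, since the conversion to a Seurat object adds columns such as orig.ident and nCount_originalexp.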
The scColCounts and scColPairCounts
functions are wrappers around dplyr::count and enable
counting frequencies of elements from one or two categorical columns in
a Seurat or SingleCellExperiment object:
freq1 <- scColCounts(sceObj, 'donor')
freq2 <- scColCounts(seuratObj, 'donor')
identical(freq1, freq2)
#> [1] TRUE
head(freq1)
#> GSM2230757 GSM2230758 GSM2230759 GSM2230760
#> 1937 1724 3605 1303
freq1 <- scColPairCounts(sceObj, 'donor', 'label')
freq2 <- scColPairCounts(seuratObj, 'donor', 'label')
identical(freq1, freq2)
#> [1] TRUE
head(freq1)
#> donor label n
#> 1 GSM2230757 acinar 110
#> 2 GSM2230757 activated_stellate 51
#> 3 GSM2230757 alpha 236
#> 4 GSM2230757 beta 872
#> 5 GSM2230757 delta 214
#> 6 GSM2230757 ductal 120
scLang includes three visualization functions that adapt Seurat
visualization tools, extending their usage to
SingleCellExperiment objects in addition to
Seurat objects.
The dimPlot function mimics the essential behavior of
the DimPlot function from Seurat:
dimPlot(sceObj, groupBy='label')
#> Found more than one class "package_version" in cache; using the first, from namespace 'SeuratObject'
#> Also defined by 'alabaster.base'
The featurePlot function mimics the
FeaturePlot function from Seurat (though using a different
color scheme):
featurePlot(sceObj, 'SOX4')
#> Found more than one class "package_version" in cache; using the first, from namespace 'SeuratObject'
#> Also defined by 'alabaster.base'
The violinPlot function mimics the
VlnPlot function from Seurat:
sessionInfo()
#> R version 4.5.3 (2026-03-11)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] Seurat_5.4.0 SeuratObject_5.3.0
#> [3] sp_2.2-1 scater_1.39.4
#> [5] ggplot2_4.0.2 scuttle_1.21.5
#> [7] scRNAseq_2.25.0 SingleCellExperiment_1.33.2
#> [9] SummarizedExperiment_1.41.1 Biobase_2.71.0
#> [11] GenomicRanges_1.63.2 Seqinfo_1.1.0
#> [13] IRanges_2.45.0 S4Vectors_0.49.1
#> [15] BiocGenerics_0.57.0 generics_0.1.4
#> [17] MatrixGenerics_1.23.0 matrixStats_1.5.0
#> [19] scLang_0.99.3 BiocStyle_2.39.0
#>
#> loaded via a namespace (and not attached):
#> [1] spatstat.sparse_3.1-0 ProtGenerics_1.43.0 bitops_1.0-9
#> [4] httr_1.4.8 RColorBrewer_1.1-3 sctransform_0.4.3
#> [7] tools_4.5.3 alabaster.base_1.11.4 R6_2.6.1
#> [10] HDF5Array_1.39.1 uwot_0.2.4 lazyeval_0.2.3
#> [13] rhdf5filters_1.23.3 withr_3.0.2 gridExtra_2.3
#> [16] progressr_0.19.0 cli_3.6.6 spatstat.explore_3.8-0
#> [19] fastDummies_1.7.5 labeling_0.4.3 prismatic_1.1.2
#> [22] alabaster.se_1.11.0 sass_0.4.10 spatstat.data_3.1-9
#> [25] S7_0.2.1 ggridges_0.5.7 pbapply_1.7-4
#> [28] Rsamtools_2.27.2 parallelly_1.46.1 RSQLite_2.4.6
#> [31] BiocIO_1.21.0 spatstat.random_3.4-5 ica_1.0-3
#> [34] dplyr_1.2.1 Matrix_1.7-5 ggbeeswarm_0.7.3
#> [37] abind_1.4-8 lifecycle_1.0.5 yaml_2.3.12
#> [40] rhdf5_2.55.16 SparseArray_1.11.13 BiocFileCache_3.1.0
#> [43] Rtsne_0.17 paletteer_1.7.0 grid_4.5.3
#> [46] blob_1.3.0 promises_1.5.0 ExperimentHub_3.1.0
#> [49] crayon_1.5.3 miniUI_0.1.2 lattice_0.22-9
#> [52] beachmat_2.27.5 cowplot_1.2.0 GenomicFeatures_1.63.2
#> [55] cigarillo_1.1.0 KEGGREST_1.51.1 sys_3.4.3
#> [58] maketools_1.3.2 pillar_1.11.1 knitr_1.51
#> [61] abdiv_0.2.0 rjson_0.2.23 future.apply_1.20.2
#> [64] codetools_0.2-20 glue_1.8.0 spatstat.univar_3.1-7
#> [67] data.table_1.18.2.1 vctrs_0.7.2 png_0.1-9
#> [70] gypsum_1.7.0 spam_2.11-3 gtable_0.3.6
#> [73] rematch2_2.1.2 cachem_1.1.0 xfun_0.57
#> [76] S4Arrays_1.11.1 mime_0.13 tidygraph_1.3.1
#> [79] survival_3.8-6 fitdistrplus_1.2-6 ROCR_1.0-12
#> [82] liver_1.28 nlme_3.1-169 bit64_4.6.0-1
#> [85] alabaster.ranges_1.11.0 filelock_1.0.3 RcppAnnoy_0.0.23
#> [88] GenomeInfoDb_1.47.2 bslib_0.10.0 irlba_2.3.7
#> [91] vipor_0.4.7 KernSmooth_2.23-26 otel_0.2.0
#> [94] DBI_1.3.0 tidyselect_1.2.1 bit_4.6.0
#> [97] compiler_4.5.3 curl_7.0.0 httr2_1.2.2
#> [100] BiocNeighbors_2.5.4 h5mread_1.3.3 DelayedArray_0.37.1
#> [103] plotly_4.12.0 rtracklayer_1.71.3 scales_1.4.0
#> [106] lmtest_0.9-40 ggeasy_0.1.6 rappdirs_0.3.4
#> [109] goftest_1.2-3 stringr_1.6.0 digest_0.6.39
#> [112] spatstat.utils_3.2-2 alabaster.matrix_1.11.0 rmarkdown_2.31
#> [115] XVector_0.51.0 htmltools_0.5.9 pkgconfig_2.0.3
#> [118] dbplyr_2.5.2 fastmap_1.2.0 ensembldb_2.35.0
#> [121] rlang_1.2.0 htmlwidgets_1.6.4 UCSC.utils_1.7.1
#> [124] shiny_1.13.0 farver_2.1.2 jquerylib_0.1.4
#> [127] zoo_1.8-15 jsonlite_2.0.0 BiocParallel_1.45.0
#> [130] BiocSingular_1.27.1 RCurl_1.98-1.18 magrittr_2.0.5
#> [133] dotCall64_1.2 patchwork_1.3.2 Rhdf5lib_1.33.6
#> [136] Rcpp_1.1.1 ggnewscale_0.5.2 viridis_0.6.5
#> [139] reticulate_1.46.0 stringi_1.8.7 alabaster.schemas_1.11.0
#> [142] ggalluvial_0.12.6 ggraph_2.2.2 MASS_7.3-65
#> [145] AnnotationHub_4.1.0 plyr_1.8.9 parallel_4.5.3
#> [148] listenv_0.10.1 ggrepel_0.9.8 deldir_2.0-4
#> [151] Biostrings_2.79.5 graphlayouts_1.2.3 splines_4.5.3
#> [154] tensor_1.5.1 igraph_2.2.3 spatstat.geom_3.7-3
#> [157] RcppHNSW_0.6.0 buildtools_1.0.0 reshape2_1.4.5
#> [160] ScaledMatrix_1.19.0 BiocVersion_3.23.1 XML_3.99-0.23
#> [163] evaluate_1.0.5 henna_0.7.5 BiocManager_1.30.27
#> [166] tweenr_2.0.3 httpuv_1.6.17 RANN_2.6.2
#> [169] tidyr_1.3.2 purrr_1.2.1 polyclip_1.10-7
#> [172] scattermore_1.2 future_1.70.0 alabaster.sce_1.11.0
#> [175] ggforce_0.5.0 rsvd_1.0.5 xtable_1.8-8
#> [178] restfulr_0.0.16 AnnotationFilter_1.35.0 RSpectra_0.16-2
#> [181] later_1.4.8 viridisLite_0.4.3 tibble_3.3.1
#> [184] memoise_2.0.1 beeswarm_0.4.0 AnnotationDbi_1.73.1
#> [187] GenomicAlignments_1.47.0 cluster_2.1.8.2 globals_0.19.1