RCSL is an R toolkit for single-cell clustering and trajectory analysis using single-cell RNA-seq data.
RCSL can be installed directly from GitHub with ‘devtools’.
library(devtools)
devtools::install_github("QinglinMei/RCSL")

Now we can load RCSL. We also load the SingleCellExperiment, ggplot2, igraph and umap packages.
library(RCSL)
library(SingleCellExperiment)
library(ggplot2)
library(igraph)
library(umap)

We illustrate the usage of RCSL on a dataset of human preimplantation embryos and embryonic stem cells (Yan et al., 2013). The yan data are distributed with the RCSL package and contain 90 cells and 20,214 genes:
data(yan, package = "RCSL")
head(ann)
##                 cell_type1
## Oocyte..1.RPKM.     zygote
## Oocyte..2.RPKM.     zygote
## Oocyte..3.RPKM.     zygote
## Zygote..1.RPKM.     zygote
## Zygote..2.RPKM.     zygote
## Zygote..3.RPKM.     zygote
yan[1:3, 1:3]
##          Oocyte..1.RPKM. Oocyte..2.RPKM. Oocyte..3.RPKM.
## C9orf152             0.0             0.0             0.0
## RPS11             1219.9          1021.1           931.6
## ELMO2                7.0            12.2             9.3
origData <- yan
label <- ann$cell_type1

In practice, we find it beneficial to pre-process single-cell RNA-seq datasets before clustering. Pre-processing includes:
1. Log transformation.
2. Gene filtering.
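The two pre-processing steps can be sketched on a small toy matrix. This is only an illustration: the matrix `toy`, the helper `filter_genes` and its thresholds `minFrac` and `maxFrac` are hypothetical, and the filtering rule shown here (keep genes detected in an intermediate fraction of cells) is an assumption that may differ from what `GenesFilter` does internally.

```r
# Toy expression matrix with genes in rows and cells in columns,
# the same orientation as the yan data.
toy <- rbind(
  gene1 = c(0, 0, 0, 0, 0, 0),   # never detected
  gene2 = c(5, 3, 8, 2, 9, 4),   # detected in every cell
  gene3 = c(0, 7, 0, 3, 0, 1),   # detected in half of the cells
  gene4 = c(12, 0, 0, 0, 4, 0)   # detected in a third of the cells
)
colnames(toy) <- paste0("cell", 1:6)

# 1. Log transformation with a pseudocount of 1.
logToy <- log2(toy + 1)

# 2. Hypothetical gene filter (an assumption, not GenesFilter's exact rule):
#    keep genes detected in more than minFrac and fewer than maxFrac of cells.
filter_genes <- function(x, minFrac = 0.05, maxFrac = 0.95) {
  fracDetected <- rowMeans(x > 0)
  x[fracDetected > minFrac & fracDetected < maxFrac, , drop = FALSE]
}

filtered <- filter_genes(logToy)
rownames(filtered)
```

Genes that are never detected or detected in every cell carry little information for separating cell groups, which is the motivation for this kind of filter. The corresponding calls on the yan data follow.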
data <- log2(as.matrix(origData) + 1)
gfData <- GenesFilter(data)
# Compute the initial cell-to-cell similarity matrix S
resSimS <- SimS(gfData)
## Calculate the Spearman correlation
## Calculate the Nerighbor Representation 
## Find neighbors by KNN(Euclidean)
# Estimate the number of clusters
Estimated_C <- EstClusters(resSimS$drData, resSimS$S)
## ======== Calculate maximal strongly connected components ========
## ======== Calculate maximal strongly connected components ======== 
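## ======== Calculate maximal strongly connected components ========
# Cluster the cells from S using the estimated number of clusters
resBDSM <- BDSM(resSimS$S, Estimated_C)
## ======== Calculate maximal strongly connected components ========
# Adjusted Rand index between the predicted clusters and the known cell types
ARI_RCSL <- igraph::compare(resBDSM$y, label, method = "adjusted.rand")
DataName <- "Yan"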
res_TrajecAnalysis <- TrajectoryAnalysis(gfData, resSimS$drData, resSimS$S,
                                         clustRes = resBDSM$y, TrueLabel = label, 
                                         startPoint = 1, dataName = DataName)
res_TrajecAnalysis$MSTPlot
res_TrajecAnalysis$PseudoTimePlot
res_TrajecAnalysis$TrajectoryPlot
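
The clustering above is scored with the adjusted Rand index (ARI) through `igraph::compare`. To make that score easier to interpret, here is a minimal base-R sketch of the ARI computed from a contingency table. The function `adjusted_rand` and the toy label vectors are hypothetical and only illustrate the measure; this is not igraph's implementation.

```r
# Adjusted Rand index between two labelings of the same cells.
adjusted_rand <- function(a, b) {
  stopifnot(length(a) == length(b))
  tab <- table(a, b)                       # contingency table of the two labelings
  pairs <- function(n) n * (n - 1) / 2     # number of unordered pairs, choose(n, 2)
  sumCells <- sum(pairs(tab))              # pairs grouped together in both labelings
  sumRows  <- sum(pairs(rowSums(tab)))     # pairs grouped together in a
  sumCols  <- sum(pairs(colSums(tab)))     # pairs grouped together in b
  expected <- sumRows * sumCols / pairs(length(a))  # agreement expected by chance
  maxIndex <- (sumRows + sumCols) / 2
  (sumCells - expected) / (maxIndex - expected)
}

# Toy labels: three known cell types, two cells each.
truth <- c("typeA", "typeA", "typeB", "typeB", "typeC", "typeC")

# Same grouping with different cluster ids: perfect agreement.
sameGroups <- c(2, 2, 3, 3, 1, 1)
adjusted_rand(sameGroups, truth)

# A grouping that splits every true type: worse than chance.
mixed <- c(1, 2, 1, 2, 1, 2)
adjusted_rand(mixed, truth)
```

The index depends only on which cells are grouped together, not on the cluster ids themselves. A value of 1 means the two partitions are identical, and values near or below 0 mean the agreement is no better than chance.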